Hierarchical low-rank approximation for high dimensional approximation
Nouy, Anthony
2016-01-07
Tensor methods are among the most prominent tools for the numerical solution of high-dimensional problems where functions of multiple variables have to be approximated. Such high-dimensional approximation problems naturally arise in stochastic analysis and uncertainty quantification. In many practical situations, the approximation of high-dimensional functions is made computationally tractable by using rank-structured approximations. In this talk, we present algorithms for approximation in the hierarchical tensor format using statistical methods. Sparse representations in a given tensor format are obtained with adaptive or convex relaxation methods, with parameters selected using cross-validation.
Non-intrusive low-rank separated approximation of high-dimensional stochastic models
Doostan, Alireza
2013-08-01
This work proposes a sampling-based (non-intrusive) approach within the context of low-rank separated representations to tackle the issue of curse-of-dimensionality associated with the solution of models, e.g., PDEs/ODEs, with high-dimensional random inputs. Under some conditions discussed in detail, the number of random realizations of the solution, required for a successful approximation, grows linearly with respect to the number of random inputs. The construction of the separated representation is achieved via a regularized alternating least-squares regression, together with an error indicator to estimate model parameters. The computational complexity of such a construction is quadratic in the number of random inputs. The performance of the method is investigated through its application to three numerical examples including two ODE problems with high-dimensional random inputs.
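The regularized alternating least-squares regression described above can be sketched in a simplified two-variable setting: fix one factor, solve a ridge-regularized least-squares problem for the other, and alternate. This is an illustrative sketch, not the authors' implementation; the polynomial basis, rank, and regularization weight are assumptions chosen for the example.

```python
import numpy as np

def separated_als(x, y, degree=3, rank=2, lam=1e-8, iters=50, seed=0):
    """Fit f(x1, x2) ~ sum_r u_r(x1) * v_r(x2) from scattered samples by
    regularized alternating least squares (illustrative 2D sketch)."""
    A = np.vander(x[:, 0], degree + 1)          # polynomial basis in x1
    B = np.vander(x[:, 1], degree + 1)          # polynomial basis in x2
    p = degree + 1
    rng = np.random.default_rng(seed)
    U = rng.standard_normal((p, rank))          # coefficients of the u_r
    V = rng.standard_normal((p, rank))          # coefficients of the v_r
    for _ in range(iters):
        # Fix V, solve a ridge-regularized least-squares problem for U.
        Phi = np.einsum('np,nr->npr', A, B @ V).reshape(len(y), -1)
        U = np.linalg.solve(Phi.T @ Phi + lam * np.eye(p * rank),
                            Phi.T @ y).reshape(p, rank)
        # Fix U, solve for V.
        Psi = np.einsum('np,nr->npr', B, A @ U).reshape(len(y), -1)
        V = np.linalg.solve(Psi.T @ Psi + lam * np.eye(p * rank),
                            Psi.T @ y).reshape(p, rank)
    return U, V

def predict(U, V, x):
    """Evaluate the separated representation at new points."""
    A = np.vander(x[:, 0], U.shape[0])
    B = np.vander(x[:, 1], V.shape[0])
    return np.sum((A @ U) * (B @ V), axis=1)
```

Each alternating step is a linear least-squares solve, which is what keeps the sample requirement modest relative to the dimension.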
On matrices with low-rank-plus-shift structure: Partial SVD and latent semantic indexing
Energy Technology Data Exchange (ETDEWEB)
Zha, H.; Zhang, Z.
1998-08-01
The authors present a detailed analysis of matrices satisfying the so-called low-rank-plus-shift property in connection with the computation of their partial singular value decomposition. The application they have in mind is Latent Semantic Indexing for information retrieval where the term-document matrices generated from a text corpus approximately satisfy this property. The analysis is motivated by developing more efficient methods for computing and updating partial SVD of large term-document matrices and gaining deeper understanding of the behavior of the methods in the presence of noise.
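The partial SVD computation at the heart of LSI can be illustrated with a minimal sketch (not the authors' updating algorithm): truncate the SVD of the term-document matrix and fold queries into the latent space. The Eckart-Young theorem guarantees this truncation is the best rank-k approximation in Frobenius norm.

```python
import numpy as np

def partial_svd(A, k):
    """Rank-k partial SVD: the low-rank core of Latent Semantic Indexing."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U[:, :k], s[:k], Vt[:k, :]

def fold_in_query(q, Uk, sk):
    """Map a query vector into the k-dimensional latent space."""
    return (q @ Uk) / sk
```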
Dutta, Aritra
2017-07-02
Principal component pursuit (PCP) is a state-of-the-art approach for background estimation problems. Due to their higher computational cost, PCP algorithms, such as robust principal component analysis (RPCA) and its variants, are not feasible in processing high definition videos. To avoid the curse of dimensionality in those algorithms, several methods have been proposed to solve the background estimation problem in an incremental manner. We propose a batch-incremental background estimation model using a special weighted low-rank approximation of matrices. Through experiments with real and synthetic video sequences, we demonstrate that our method is superior to the state-of-the-art background estimation algorithms such as GRASTA, ReProCS, incPCP, and GFL.
Fairbanks, Hillary R.; Doostan, Alireza; Ketelsen, Christian; Iaccarino, Gianluca
2017-07-01
Multilevel Monte Carlo (MLMC) is a recently proposed variation of Monte Carlo (MC) simulation that achieves variance reduction by simulating the governing equations on a series of spatial (or temporal) grids with increasing resolution. Instead of directly employing the fine grid solutions, MLMC estimates the expectation of the quantity of interest from the coarsest grid solutions as well as differences between each two consecutive grid solutions. When the differences corresponding to finer grids become smaller, hence less variable, fewer MC realizations of finer grid solutions are needed to compute the difference expectations, thus leading to a reduction in the overall work. This paper presents an extension of MLMC, referred to as multilevel control variates (MLCV), where a low-rank approximation to the solution on each grid, obtained primarily based on coarser grid solutions, is used as a control variate for estimating the expectations involved in MLMC. Cost estimates as well as numerical examples are presented to demonstrate the advantage of this new MLCV approach over the standard MLMC when the solution of interest admits a low-rank approximation and the cost of simulating finer grids grows fast.
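The telescoping MLMC estimator described in the abstract can be sketched as follows. The key point is that each level's difference is computed with coupled (shared) random inputs, so the differences shrink on finer grids and need fewer samples. The toy quantity of interest in the test is an assumption for illustration only.

```python
import numpy as np

def mlmc_estimate(q, n_per_level, rng):
    """Telescoping MLMC estimator of E[Q_L]:
    E[Q_0] + sum_{l=1}^{L} E[Q_l - Q_{l-1}], with coupled samples per level."""
    est = 0.0
    for level, n in enumerate(n_per_level):
        diffs = np.empty(n)
        for i in range(n):
            u = rng.standard_normal()   # common random input couples the two levels
            coarse = q(level - 1, u) if level > 0 else 0.0
            diffs[i] = q(level, u) - coarse
        est += diffs.mean()
    return est
```

Because the level-l differences have small variance, most of the sampling budget goes to the cheap coarse grid, which is the source of the overall cost reduction.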
Inference for High-dimensional Differential Correlation Matrices.
Cai, T Tony; Zhang, Anru
2016-01-01
Motivated by differential co-expression analysis in genomics, we consider in this paper estimation and testing of high-dimensional differential correlation matrices. An adaptive thresholding procedure is introduced and theoretical guarantees are given. Minimax rate of convergence is established and the proposed estimator is shown to be adaptively rate-optimal over collections of paired correlation matrices with approximately sparse differences. Simulation results show that the procedure significantly outperforms two other natural methods that are based on separate estimation of the individual correlation matrices. The procedure is also illustrated through an analysis of a breast cancer dataset, which provides evidence at the gene co-expression level that several genes, of which a subset has been previously verified, are associated with breast cancer. Hypothesis testing on the differential correlation matrices is also considered. A test, which is particularly well suited for testing against sparse alternatives, is introduced. In addition, other related problems, including estimation of a single sparse correlation matrix, estimation of the differential covariance matrices, and estimation of the differential cross-correlation matrices, are also discussed.
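The thresholding idea above can be sketched in a few lines. The paper's estimator chooses the threshold adaptively per entry; this simplified version uses one global threshold, which is an assumption made for illustration.

```python
import numpy as np

def diff_corr_threshold(X1, X2, tau):
    """Entrywise-thresholded estimate of D = corr(X2) - corr(X1).
    Simplified sketch with a single global threshold tau (the paper
    adapts the threshold entry by entry)."""
    D = np.corrcoef(X2, rowvar=False) - np.corrcoef(X1, rowvar=False)
    return np.where(np.abs(D) > tau, D, 0.0)
```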
On spectral distribution of high dimensional covariation matrices
DEFF Research Database (Denmark)
Heinrich, Claudio; Podolskij, Mark
In this paper we present the asymptotic theory for spectral distributions of high dimensional covariation matrices of Brownian diffusions. More specifically, we consider N-dimensional Itô integrals with time varying matrix-valued integrands. We observe n equidistant high frequency data points of the underlying Brownian diffusion and we assume that N/n -> c in (0,oo). We show that under a certain mixed spectral moment condition the spectral distribution of the empirical covariation matrix converges in distribution almost surely. Our proof relies on the method of moments and applications of graph theory.
Energy Technology Data Exchange (ETDEWEB)
Peng, Bo; Kowalski, Karol
2017-03-01
In this letter, we introduce the reverse Cuthill-McKee (RCM) algorithm, which is often used for the bandwidth reduction of sparse tensors, to transform the two-electron integral tensors to their block diagonal forms. By further applying the pivoted Cholesky decomposition (CD) on each of the diagonal blocks, we are able to represent the high-dimensional two-electron integral tensors in terms of permutation matrices and low-rank Cholesky vectors. This representation facilitates the low-rank factorization of the high-dimensional tensor contractions that are usually encountered in post-Hartree-Fock calculations. In this letter, we discuss the second-order Møller-Plesset (MP2) method and linear coupled-cluster model with doubles (L-CCD) as two simple examples to demonstrate the efficiency of the RCM-CD technique in representing two-electron integrals in a compact form.
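The two building blocks named above, RCM reordering and low-rank pivoted Cholesky, can be sketched generically (this is not the authors' quantum-chemistry code; the tolerance and test matrix are illustrative assumptions).

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import reverse_cuthill_mckee

def rcm_order(A):
    """Bandwidth-reducing permutation of a sparse symmetric matrix."""
    return reverse_cuthill_mckee(csr_matrix(A), symmetric_mode=True)

def pivoted_cholesky(A, tol=1e-10, max_rank=None):
    """Low-rank pivoted Cholesky of an SPSD matrix: A ~ L @ L.T.
    Stops when the largest remaining diagonal entry falls below tol."""
    n = A.shape[0]
    max_rank = max_rank or n
    d = np.diag(A).astype(float).copy()
    L = np.zeros((n, max_rank))
    for k in range(max_rank):
        i = int(np.argmax(d))                 # greedy diagonal pivot
        if d[i] <= tol:
            return L[:, :k]                   # truncate: remaining energy is tiny
        L[:, k] = (A[:, i] - L @ L[i, :]) / np.sqrt(d[i])
        d -= L[:, k] ** 2                     # update the Schur-complement diagonal
    return L
```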
Energy Technology Data Exchange (ETDEWEB)
Weber, G. F.; Laudal, D. L.
1989-01-01
This work is a compilation of reports on ongoing research at the University of North Dakota. Topics include: Control Technology and Coal Preparation Research (SOx/NOx control, waste management), Advanced Research and Technology Development (turbine combustion phenomena, combustion inorganic transformation, coal/char reactivity, liquefaction reactivity of low-rank coals, gasification ash and slag characterization, fine particulate emissions), Combustion Research (fluidized bed combustion, beneficiation of low-rank coals, combustion characterization of low-rank coal fuels, diesel utilization of low-rank coals), Liquefaction Research (low-rank coal direct liquefaction), and Gasification Research (hydrogen production from low-rank coals, advanced wastewater treatment, mild gasification, color and residual COD removal from Synfuel wastewaters, Great Plains Gasification Plant, gasifier optimization).
Kim, Eunwoo; Lee, Minsik; Choi, Chong-Ho; Kwak, Nojun; Oh, Songhwai
2015-02-01
Low-rank matrix approximation plays an important role in the area of computer vision and image processing. Most of the conventional low-rank matrix approximation methods are based on the l2-norm (Frobenius norm) with principal component analysis (PCA) being the most popular among them. However, this can give a poor approximation for data contaminated by outliers (including missing data), because the l2-norm exaggerates the negative effect of outliers. Recently, to overcome this problem, various methods based on the l1-norm, such as robust PCA methods, have been proposed for low-rank matrix approximation. Despite the robustness of the methods, they require heavy computational effort and substantial memory for high-dimensional data, which is impractical for real-world problems. In this paper, we propose two efficient low-rank factorization methods based on the l1-norm that find proper projection and coefficient matrices using the alternating rectified gradient method. The proposed methods are applied to a number of low-rank matrix approximation problems to demonstrate their efficiency and robustness. The experimental results show that our proposals are efficient in both execution time and reconstruction performance unlike other state-of-the-art methods.
Low-Rank Affinity Based Local-Driven Multilabel Propagation
Directory of Open Access Journals (Sweden)
Teng Li
2013-01-01
This paper presents a novel low-rank affinity based local-driven algorithm to robustly propagate the multilabels from training images to test images. A graph is constructed over the segmented local image regions. The labels for vertices from the training data are derived based on the context among different training images, and the derived vertex labels are propagated to the unlabeled vertices via the graph. The multitask low-rank affinity, which jointly seeks the sparsity-consistent low-rank affinities from multiple feature matrices, is applied to compute the edge weights between graph vertices. The inference process of multitask low-rank affinity is formulated as a constrained nuclear norm and ℓ2,1-norm minimization problem. The optimization is conducted efficiently with the augmented Lagrange multiplier method. Based on the learned local patch labels we can predict the multilabels for the test images. Experiments on multilabel image annotation demonstrate the encouraging results from the proposed framework.
Low Rank Approximation Algorithms, Implementation, Applications
Markovsky, Ivan
2012-01-01
Matrix low-rank approximation is intimately related to data modelling; a problem that arises frequently in many different fields. Low Rank Approximation: Algorithms, Implementation, Applications is a comprehensive exposition of the theory, algorithms, and applications of structured low-rank approximation. Local optimization methods and effective suboptimal convex relaxations for Toeplitz, Hankel, and Sylvester structured problems are presented. A major part of the text is devoted to application of the theory. Applications described include: system and control theory: approximate realization, model reduction, output error, and errors-in-variables identification; signal processing: harmonic retrieval, sum-of-damped exponentials, finite impulse response modeling, and array processing; machine learning: multidimensional scaling and recommender system; computer vision: algebraic curve fitting and fundamental matrix estimation; bioinformatics for microarray data analysis; chemometrics for multivariate calibration; ...
High-dimensional covariance estimation with high-dimensional data
Pourahmadi, Mohsen
2013-01-01
Methods for estimating sparse and large covariance matrices. Covariance and correlation matrices play fundamental roles in every aspect of the analysis of multivariate data collected from a variety of fields including business and economics, health care, engineering, and environmental and physical sciences. High-Dimensional Covariance Estimation provides accessible and comprehensive coverage of the classical and modern approaches for estimating covariance matrices as well as their applications to the rapidly developing areas lying at the intersection of statistics and machine learning.
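One classical estimator for large sparse covariance matrices of the kind this book covers is entrywise thresholding of the sample covariance. The sketch below is a generic illustration (the choice of a hard threshold and its value are assumptions, not the book's prescription).

```python
import numpy as np

def threshold_cov(X, tau):
    """Hard-threshold the off-diagonal entries of the sample covariance,
    a classical estimator for large sparse covariance matrices."""
    S = np.cov(X, rowvar=False)
    T = np.where(np.abs(S) > tau, S, 0.0)
    np.fill_diagonal(T, np.diag(S))        # never threshold the variances
    return T
```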
Low-rank coal oil agglomeration
Knudson, C.L.; Timpe, R.C.
1991-07-16
A low-rank coal oil agglomeration process is described. High-mineral-content, high-ash subbituminous coals are effectively agglomerated with a bridging oil that is partially water soluble, capable of entering the pore structure, and usually coal-derived.
Low-rank quadratic semidefinite programming
Yuan, Ganzhao
2013-04-01
Low rank matrix approximation is an attractive model in large scale machine learning problems, because it can not only reduce the memory and runtime complexity, but also provide a natural way to regularize parameters while preserving learning accuracy. In this paper, we address a special class of nonconvex quadratic matrix optimization problems, which require a low rank positive semidefinite solution. Despite their non-convexity, we exploit the structure of these problems to derive an efficient solver that converges to their local optima. Furthermore, we show that the proposed solution is capable of dramatically enhancing the efficiency and scalability of a variety of concrete problems, which are of significant interest to the machine learning community. These problems include the Top-k Eigenvalue problem, Distance learning and Kernel learning. Extensive experiments on UCI benchmarks have shown the effectiveness and efficiency of our proposed method.
Probabilistic Low-Rank Multitask Learning.
Kong, Yu; Shao, Ming; Li, Kang; Fu, Yun
2017-01-04
In this paper, we consider the problem of learning multiple related tasks simultaneously with the goal of improving the generalization performance of individual tasks. The key challenge is to effectively exploit the shared information across multiple tasks as well as preserve the discriminative information for each individual task. To address this, we propose a novel probabilistic model for multitask learning (MTL) that can automatically balance between low-rank and sparsity constraints. The former assumes a low-rank structure of the underlying predictive hypothesis space to explicitly capture the relationship of different tasks and the latter learns the incoherent sparse patterns private to each task. We derive and perform inference via variational Bayesian methods. Experimental results on both regression and classification tasks on real-world applications demonstrate the effectiveness of the proposed method in dealing with the MTL problems.
Low-Rank Kalman Filtering in Subsurface Contaminant Transport Models
El Gharamti, Mohamad
2010-12-01
Understanding the geology and the hydrology of the subsurface is important to model the fluid flow and the behavior of the contaminant. It is essential to have an accurate knowledge of the movement of the contaminants in the porous media in order to track them and later extract them from the aquifer. A two-dimensional flow model is studied and then applied on a linear contaminant transport model in the same porous medium. Because of possible different sources of uncertainties, the deterministic model by itself cannot give exact estimations for the future contaminant state. Incorporating observations in the model can guide it to the true state. This is usually done using the Kalman filter (KF) when the system is linear and the extended Kalman filter (EKF) when the system is nonlinear. To overcome the high computational cost required by the KF, we use the singular evolutive Kalman filter (SEKF) and the singular evolutive extended Kalman filter (SEEKF) approximations of the KF operating with low-rank covariance matrices. The SEKF can be implemented on large dimensional contaminant problems for which the full KF is not feasible. Experimental results show that with perfect and imperfect models, the low rank filters can provide estimates as accurate as the full KF but at much lower computational cost. Localization can help the filter analysis as long as there are enough neighborhood data to the point being analyzed. Estimating the permeabilities of the aquifer is successfully tackled using both the EKF and the SEEKF.
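The computational saving of the low-rank filters comes from propagating only an n x r covariance factor instead of the full n x n covariance. The sketch below shows one Kalman step in that factored form; it captures the flavor of the SEKF but is a simplification, not the exact SEKF update.

```python
import numpy as np

def lowrank_kf_step(x, L, A, H, R, y):
    """One Kalman forecast/analysis step with the error covariance kept in
    factored low-rank form P = L @ L.T (simplified sketch of the idea
    behind the singular evolutive Kalman filter)."""
    x = A @ x                                   # forecast mean
    L = A @ L                                   # forecast factor (model noise omitted)
    S = H @ L                                   # factor projected to observation space
    K = L @ S.T @ np.linalg.inv(S @ S.T + R)    # Kalman gain via the rank-r factor
    x = x + K @ (y - H @ x)                     # analysis mean
    L = L - K @ S                               # analysis factor: (I - K H) L
    return x, L
```

Note that the innovation covariance H P H^T + R is formed as S S^T + R, so no n x n matrix is ever built.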
Tensor Factorization for Low-Rank Tensor Completion.
Zhou, Pan; Lu, Canyi; Lin, Zhouchen; Zhang, Chao
2018-03-01
Recently, a tensor nuclear norm (TNN) based method was proposed to solve the tensor completion problem, achieving state-of-the-art performance on image and video inpainting tasks. However, it requires computing the tensor singular value decomposition (t-SVD), which is computationally expensive and thus cannot efficiently handle tensor data, which are naturally large scale. Motivated by TNN, we propose a novel low-rank tensor factorization method for efficiently solving the 3-way tensor completion problem. Our method preserves the low-rank structure of a tensor by factorizing it into the product of two tensors of smaller sizes. In the optimization process, our method only needs to update two smaller tensors, which can be done more efficiently than computing the t-SVD. Furthermore, we prove that the proposed alternating minimization algorithm converges to a Karush-Kuhn-Tucker point. Experimental results on synthetic data recovery and on image and video inpainting tasks clearly demonstrate the superior performance and efficiency of our method over state-of-the-art approaches, including the TNN and matricization methods.
Akbudak, Kadir
2017-05-11
Covariance matrices are ubiquitous in computational science and engineering. In particular, large covariance matrices arise from multivariate spatial data sets, for instance, in climate/weather modeling applications to improve prediction using statistical methods and spatial data. One of the most time-consuming computational steps consists in calculating the Cholesky factorization of the symmetric, positive-definite covariance matrix problem. The structure of such covariance matrices is also often data-sparse, in other words, effectively of low rank, though formally dense. While not typically globally of low rank, covariance matrices in which correlation decays with distance are nearly always hierarchically of low rank. While symmetry and positive definiteness should be, and nearly always are, exploited for performance purposes, exploiting low rank character in this context is very recent, and will be a key to solving these challenging problems at large-scale dimensions. The authors design a new and flexible tile low-rank (TLR) Cholesky factorization and propose a high performance implementation using the OpenMP task-based programming model on various leading-edge manycore architectures. Performance comparisons and memory footprint savings on up to 200K×200K covariance matrix sizes show a gain of more than an order of magnitude for both metrics, against state-of-the-art open-source and vendor optimized numerical libraries, while preserving the numerical accuracy fidelity of the original model. This research represents an important milestone in enabling large-scale simulations for covariance-based scientific applications.
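The compression step underlying a tile low-rank format can be sketched generically: keep diagonal tiles dense and replace each off-diagonal tile by truncated-SVD factors. This is an illustration of the storage format only, not the authors' factorization; tile size and tolerance are assumptions.

```python
import numpy as np

def compress_tile(T, tol=1e-6):
    """Replace a tile by low-rank factors (U, V) with T ~ U @ V.T."""
    U, s, Vt = np.linalg.svd(T, full_matrices=False)
    r = max(1, int(np.sum(s > tol * s[0])))
    return U[:, :r] * s[:r], Vt[:r, :].T

def tlr_compress(A, nb, tol=1e-6):
    """Tile low-rank (TLR) compression sketch: dense diagonal tiles,
    low-rank off-diagonal tiles."""
    b = A.shape[0] // nb
    tiles = {}
    for i in range(nb):
        for j in range(nb):
            T = A[i*b:(i+1)*b, j*b:(j+1)*b]
            tiles[i, j] = T.copy() if i == j else compress_tile(T, tol)
    return tiles
```

For covariance kernels whose correlation decays with distance, the off-diagonal tiles have rapidly decaying singular values, which is exactly the structure the abstract exploits.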
Implicit Block Diagonal Low-Rank Representation.
Xie, Xingyu; Guo, Xianglin; Liu, Guangcan; Wang, Jun
2017-10-17
While current block diagonal constrained subspace clustering methods are performed explicitly on the original data space, in practice it is often more desirable to embed the block diagonal prior into the reproducing kernel Hilbert feature space by kernelization techniques, as the underlying data structure in reality is usually nonlinear. However, it is still unknown how to carry out the embedding and kernelization in models with block diagonal constraints. In this work, we take a step in this direction. First, we establish a novel model termed Implicit Block Diagonal Low-Rank Representation (IBDLR), by incorporating the implicit feature representation and block diagonal prior into the prevalent Low-Rank Representation (LRR) method. Second, and most importantly, we show that the model in IBDLR can be kernelized by making use of a smoothed dual representation and the specifics of a proximal gradient based optimization algorithm. Finally, we provide some theoretical analyses for the convergence of our optimization algorithm. Comprehensive experiments on synthetic and real-world datasets demonstrate the superiority of our IBDLR over state-of-the-art methods.
Tao, Molei; Owhadi, Houman; Marsden, Jerrold E.
2010-01-01
We present a multiscale integrator for Hamiltonian systems with slowly varying quadratic stiff potentials that uses coarse time steps (analogous to what the impulse method uses for constant quadratic stiff potentials). This method is based on the highly non-trivial introduction of two efficient symplectic schemes for exponentiations of matrices that require only O(n) matrix multiplication operations at each coarse time step for a preset small number n. The proposed integrator is shown to be (…)
Energy Technology Data Exchange (ETDEWEB)
Zhang, Zhenyue [Zhejiang Univ., Hangzhou (People's Republic of China); Zha, Hongyuan [Pennsylvania State Univ., University Park, PA (United States); Simon, Horst [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)
2006-07-31
In this paper, we developed numerical algorithms for computing sparse low-rank approximations of matrices, and we also provided a detailed error analysis of the proposed algorithms together with some numerical experiments. The low-rank approximations are constructed in a certain factored form with the degree of sparsity of the factors controlled by some user-specified parameters. In this paper, we cast the sparse low-rank approximation problem in the framework of penalized optimization problems. We discuss various approximation schemes for the penalized optimization problem which are more amenable to numerical computations. We also include some analysis to show the relations between the original optimization problem and the reduced one. We then develop a globally convergent discrete Newton-like iterative method for solving the approximate penalized optimization problems. We also compare the reconstruction errors of the sparse low-rank approximations computed by our new methods with those obtained using the methods in the earlier paper and several other existing methods for computing sparse low-rank approximations. Numerical examples show that the penalized methods are more robust and produce approximations with factors which have fewer columns and are sparser.
Global Low-Rank Image Restoration With Gaussian Mixture Model.
Zhang, Sibo; Jiao, Licheng; Liu, Fang; Wang, Shuang
2017-06-27
Low-rank restoration has recently attracted a lot of attention in computer vision research. Empirical studies show that exploiting the low-rank property of patch groups can lead to superior restoration performance; however, progress on global low-rank restoration has been limited because rank minimization at the image level is too strong a constraint for natural images, which seldom satisfy the low-rank condition. In this paper, we describe a flexible global low-rank restoration model which introduces local statistical properties into the rank minimization. The proposed model can effectively recover the latent global low-rank structure via the nuclear norm, as well as the fine details via a Gaussian mixture model. An alternating scheme is developed to estimate the Gaussian parameters and the restored image, and it shows excellent convergence and stability. Experiments on image and video sequence datasets show the effectiveness of the proposed method for image inpainting problems.
De-biasing low-rank projection for matrix completion
Foucart, Simon; Needell, Deanna; Plan, Yaniv; Wootters, Mary
2017-08-01
We study matrix completion with non-uniform, deterministic sampling patterns. We introduce a computable parameter, which is a function of the sampling pattern, and show that if this parameter is small, then we may recover missing entries of the matrix, with appropriate weights. We theoretically analyze a simple and well-known recovery method, which simply projects the (zero-padded) subsampled matrix onto the set of low-rank matrices. We show that under non-uniform deterministic sampling, this method yields a biased solution, and we propose an algorithm to de-bias it. Numerical simulations demonstrate that de-biasing significantly improves the estimate. However, when the observations are noisy, the error of this method can be sub-optimal when the sampling is highly non-uniform. To remedy this, we suggest an alternative which is based on projection onto the max-norm ball whose robustness to noise tolerates arbitrarily non-uniform sampling. Finally, we analyze convex optimization in this framework.
Proceedings of the sixteenth biennial low-rank fuels symposium
Energy Technology Data Exchange (ETDEWEB)
1991-01-01
Low-rank coals represent a major energy resource for the world. The Low-Rank Fuels Symposium, building on the traditions established by the Lignite Symposium, focuses on the key opportunities for this resource. This conference offers a forum for leaders from industry, government, and academia to gather to share current information on the opportunities represented by low-rank coals. In the United States and throughout the world, the utility industry is the primary user of low-rank coals. As such, current experiences and future opportunities for new technologies in this industry were the primary focuses of the symposium.
Information Theoretic Bounds for Low-Rank Matrix Completion
Vishwanath, Sriram
2010-01-01
This paper studies the low-rank matrix completion problem from an information theoretic perspective. The completion problem is rephrased as a communication problem of an (uncoded) low-rank matrix source over an erasure channel. The paper then uses achievability and converse arguments to present order-wise optimal bounds for the completion problem.
Robust Visual Tracking via Online Discriminative and Low-Rank Dictionary Learning.
Zhou, Tao; Liu, Fanghui; Bhaskar, Harish; Yang, Jie
2017-09-12
In this paper, we propose a novel and robust tracking framework based on online discriminative and low-rank dictionary learning. The primary aim of this paper is to obtain compact and low-rank dictionaries that can provide good discriminative representations of both target and background. We accomplish this by exploiting the recovery ability of low-rank matrices. That is, if we assume that the data from the same class are linearly correlated, then the corresponding basis vectors learned from the training set of each class shall render the dictionary approximately low-rank. The proposed dictionary learning technique incorporates a reconstruction error that improves the reliability of classification. Also, a multiconstraint objective function is designed to enable active learning of a discriminative and robust dictionary. Further, an optimal solution is obtained by iteratively computing the dictionary and coefficients, and by simultaneously learning the classifier parameters. Finally, a simple yet effective likelihood function is implemented to estimate the optimal state of the target during tracking. Moreover, to make the dictionary adaptive to the variations of the target and background during tracking, an online update criterion is employed while learning the new dictionary. Experimental results on a publicly available benchmark dataset have demonstrated that the proposed tracking algorithm performs better than other state-of-the-art trackers.
An Approach to Streaming Video Segmentation With Sub-Optimal Low-Rank Decomposition.
Li, Chenglong; Lin, Liang; Zuo, Wangmeng; Wang, Wenzhong; Tang, Jin
2016-05-01
This paper investigates how to perform robust and efficient video segmentation while suppressing the effects of data noises and/or corruptions, and an effective approach is introduced to this end. First, a general algorithm, called sub-optimal low-rank decomposition (SOLD), is proposed to pursue the low-rank representation for video segmentation. Given the data matrix formed by supervoxel features of an observed video sequence, SOLD seeks a sub-optimal solution by making the matrix rank explicitly determined. In particular, the representation coefficient matrix with the fixed rank can be decomposed into two sub-matrices of low rank, and then we iteratively optimize them with closed-form solutions. Moreover, we incorporate a discriminative replication prior into SOLD based on the observation that small-size video patterns tend to recur frequently within the same object. Second, based on SOLD, we present an efficient inference algorithm to perform streaming video segmentation in both unsupervised and interactive scenarios. More specifically, the constrained normalized-cut algorithm is adopted by incorporating the low-rank representation with other low level cues and temporal consistent constraints for spatio-temporal segmentation. Extensive experiments on two public challenging data sets VSB100 and SegTrack suggest that our approach outperforms other video segmentation approaches in both accuracy and efficiency.
Efficient completion for corrupted low-rank images via alternating direction method
Li, Wei; Zhao, Lei; Xu, Duanqing; Lu, Dongming
2014-05-01
We propose an efficient and easy-to-implement method to settle the inpainting problem for low-rank images following the recent studies about low-rank matrix completion. In general, our method has three steps: first, corresponding to the three channels of RGB color space, an incomplete image is split into three incomplete matrices; second, each matrix is restored by solving a convex problem derived from the nuclear norm relaxation; finally, the three recovered matrices are merged to produce the final output. During the process, in order to efficiently solve the nuclear norm minimization problem, we employ the alternating direction method. Except for the basic image inpainting problem, we also enable our method to handle cases where corrupted images not only have missing values but also have noisy entries. Our experiments show that our method outperforms the existing inpainting techniques both quantitatively and qualitatively. We also demonstrate that our method is capable of processing many other situations, including block-wise low-rank image completion, large-scale image restoration, and object removal.
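The per-channel step, nuclear-norm minimization solved by an alternating direction method, can be sketched as follows. The core primitive is singular value thresholding; the solver below is a basic alternating-direction scheme in the spirit of the approach, not the paper's exact algorithm, and its parameters are illustrative assumptions.

```python
import numpy as np

def svt(M, tau):
    """Singular value thresholding: proximal operator of tau * nuclear norm."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def complete_channel(M, mask, beta=1.0, iters=500):
    """Inpaint one channel by nuclear-norm minimization subject to agreement
    with the observed entries, via a basic alternating direction scheme."""
    X = np.where(mask, M, 0.0)
    Z = X.copy()
    Y = np.zeros_like(X)                       # scaled dual variable
    for _ in range(iters):
        Z = svt(X + Y, 1.0 / beta)             # low-rank update
        X = np.where(mask, M, Z - Y)           # keep observed entries fixed
        Y = Y + X - Z                          # dual ascent
    return Z
```

Running this independently on the R, G, and B matrices and merging the results gives the three-step pipeline described in the abstract.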
On low-rank updates to the singular value and Tucker decompositions
Energy Technology Data Exchange (ETDEWEB)
O'Hara, M J
2009-10-06
The singular value decomposition is widely used in signal processing and data mining. Since the data often arrives in a stream, the problem of updating matrix decompositions under low-rank modification has been widely studied. Brand developed a technique in 2006 that has many advantages. However, the technique does not directly approximate the updated matrix, but rather its previous low-rank approximation added to the new update, which needs justification. Further, the technique is still too slow for large information processing problems. We show that the technique minimizes the change in error per update, so if the error is small initially it remains small. We show that an updating algorithm for large sparse matrices should be sub-linear in the matrix dimension in order to be practical for large problems, and demonstrate a simple modification to the original technique that meets the requirements.
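The kind of rank-one SVD update discussed here (following Brand) can be sketched in a few lines: append the components of the modification that lie outside the current subspaces, re-diagonalize a small core matrix, and rotate. This is a hedged illustration of the general idea, not O'Hara's modified algorithm.

```python
import numpy as np

def svd_rank1_update(U, s, Vt, a, b):
    """Update the thin SVD of A to that of A + a b^T (Brand-style sketch)."""
    V = Vt.T
    m = U.T @ a;  p = a - U @ m;  ra = np.linalg.norm(p)
    n = V.T @ b;  q = b - V @ n;  rb = np.linalg.norm(q)
    P = p / ra if ra > 1e-12 else np.zeros_like(p)
    Q = q / rb if rb > 1e-12 else np.zeros_like(q)
    # Small (r+1) x (r+1) core matrix: old spectrum plus the rank-1 term
    K = np.zeros((len(s) + 1, len(s) + 1))
    K[:len(s), :len(s)] = np.diag(s)
    K += np.outer(np.append(m, ra), np.append(n, rb))
    Uk, sk, Vkt = np.linalg.svd(K)
    U2 = np.column_stack([U, P]) @ Uk
    V2 = np.column_stack([V, Q]) @ Vkt.T
    return U2, sk, V2.T

# Check the update against a direct SVD of the modified matrix
rng = np.random.default_rng(1)
A = rng.standard_normal((8, 6))
U, s, Vt = np.linalg.svd(A, full_matrices=False)
a, b = rng.standard_normal(8), rng.standard_normal(6)
U2, s2, V2t = svd_rank1_update(U, s, Vt, a, b)
err = np.linalg.norm((U2 * s2) @ V2t - (A + np.outer(a, b)))
```

The expensive work is the SVD of the small core K, not of the full matrix, which is what makes streaming updates attractive; the sub-linear variants the abstract mentions avoid even the full-height rotations.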
Weighted Discriminative Dictionary Learning based on Low-rank Representation
Chang, Heyou; Zheng, Hao
2017-01-01
Low-rank representation has been widely used in the field of pattern classification, especially when both training and testing images are corrupted with large noise. The dictionary plays an important role in low-rank representation. With respect to a semantic dictionary, the optimal representation matrix should be block-diagonal. However, traditional low-rank representation based dictionary learning methods cannot effectively exploit the discriminative information between data and dictionary. To address this problem, this paper proposes weighted discriminative dictionary learning based on low-rank representation, where a weighted representation regularization term is constructed. The regularization associates label information of both training samples and dictionary atoms, and encourages the generation of a discriminative representation with class-wise block-diagonal structure, which can further improve the classification performance when both training and testing images are corrupted with large noise. Experimental results demonstrate the advantages of the proposed method over state-of-the-art methods.
El Gharamti, Mohamad
2012-04-01
Accurate knowledge of the movement of contaminants in porous media is essential to track their trajectory and later extract them from the aquifer. A two-dimensional flow model is implemented and then applied on a linear contaminant transport model in the same porous medium. Because of different sources of uncertainties, this coupled model might not be able to accurately track the contaminant state. Incorporating observations through the process of data assimilation can guide the model toward the true trajectory of the system. The Kalman filter (KF), or its nonlinear variants, can be used to tackle this problem. To overcome the prohibitive computational cost of the KF, the singular evolutive Kalman filter (SEKF) and the singular fixed Kalman filter (SFKF) are used, which are variants of the KF operating with low-rank covariance matrices. Experimental results suggest that under perfect and imperfect model setups, the low-rank filters can provide estimates as accurate as the full KF but at much lower computational effort; the low-rank filters are demonstrated to reduce the computational effort of the KF to almost 3%. © 2012 American Society of Civil Engineers.
Low-Rank Sparse Coding for Image Classification
Zhang, Tianzhu
2013-12-01
In this paper, we propose a low-rank sparse coding (LRSC) method that exploits local structure information among features in an image for the purpose of image-level classification. LRSC represents densely sampled SIFT descriptors, in a spatial neighborhood, collectively as low-rank, sparse linear combinations of code words. As such, it casts the feature coding problem as a low-rank matrix learning problem, which is different from previous methods that encode features independently. This LRSC has a number of attractive properties. (1) It encourages sparsity in feature codes, locality in codebook construction, and low-rankness for spatial consistency. (2) LRSC encodes local features jointly by considering their low-rank structure information, and is computationally attractive. We evaluate the LRSC by comparing its performance on a set of challenging benchmarks with that of 7 popular coding and other state-of-the-art methods. Our experiments show that by representing local features jointly, LRSC not only outperforms the state-of-the-art in classification accuracy but also improves the time complexity of methods that use a similar sparse linear representation model for feature coding.
Utilization of low rank coal and agricultural by-products
Energy Technology Data Exchange (ETDEWEB)
Ekinci, E.; Yardim, M.F.; Petrova, B.; Budinova, T.; Petrov, N. [Istanbul Technical University, Maslak-Istanbul (Turkey). Department of Chemical Engineering
2007-07-01
The present investigation deals with alternative utilization processes to convert low rank coal and agricultural by-products into solid, liquid and gaseous products for a more efficient exploitation of these materials. Low rank coals and different agricultural by-products were subjected to different thermochemical treatments. The composition and physico-chemical properties of liquid products obtained from agricultural by-products were investigated. The identified compounds are predominantly oxygen derivatives of phenol, dihydroxybenzenes, guaiacol, syringol, vanillin, veratrol, furan and acids. Liquids from low rank coals contain a higher content of aromatic compounds, predominantly mono- and bicyclic; the amount of oxygen-containing structures is high as well. By thermochemical treatment of the liquid products from agricultural by-products, low rank coals and their mixtures with H{sub 2}SO{sub 4}, carbon adsorbents with very low ash and sulfur content are obtained. Using different activation reagents, carbon adsorbents are prepared from agricultural by-products and coals on a large scale. The results of the investigations open up possibilities for utilization of low rank coals and agricultural by-products. 18 refs., 5 figs., 7 tabs.
Low-rank and sparse modeling for visual analysis
Fu, Yun
2014-01-01
This book provides a view of low-rank and sparse computing, especially approximation, recovery, representation, scaling, coding, embedding and learning among unconstrained visual data. The book includes chapters covering multiple emerging topics in this new field. It links multiple popular research fields in Human-Centered Computing, Social Media, Image Classification, Pattern Recognition, Computer Vision, Big Data, and Human-Computer Interaction. It contains an overview of the low-rank and sparse modeling techniques for visual analysis by examining both theoretical analysis and real-world applications.
Large-scale 3-D EM modelling with a Block Low-Rank multifrontal direct solver
Shantsev, Daniil V.; Jaysaval, Piyoosh; de la Kethulle de Ryhove, Sébastien; Amestoy, Patrick R.; Buttari, Alfredo; L'Excellent, Jean-Yves; Mary, Theo
2017-06-01
We put forward the idea of using a Block Low-Rank (BLR) multifrontal direct solver to efficiently solve the linear systems of equations arising from a finite-difference discretization of the frequency-domain Maxwell equations for 3-D electromagnetic (EM) problems. The solver uses a low-rank representation for the off-diagonal blocks of the intermediate dense matrices arising in the multifrontal method to reduce the computational load. A numerical threshold, the so-called BLR threshold, controlling the accuracy of low-rank representations was optimized by balancing errors in the computed EM fields against savings in floating point operations (flops). Simulations were carried out over large-scale 3-D resistivity models representing typical scenarios for marine controlled-source EM surveys, and in particular the SEG SEAM model which contains an irregular salt body. The flop count, size of factor matrices and elapsed run time for matrix factorization are reduced dramatically by using BLR representations and can go down to, respectively, 10, 30 and 40 per cent of their full-rank values for our largest system with N = 20.6 million unknowns. The reductions are almost independent of the number of MPI tasks and threads at least up to 90 × 10 = 900 cores. The BLR savings increase for larger systems, which reduces the factorization flop complexity from O(N^2) for the full-rank solver to O(N^m) with m = 1.4-1.6. The BLR savings are significantly larger for deep-water environments that exclude the highly resistive air layer from the computational domain. A study in a scenario where simulations are required at multiple source locations shows that the BLR solver can become competitive in comparison to iterative solvers as an engine for 3-D controlled-source electromagnetic Gauss-Newton inversion that requires forward modelling for a few thousand right-hand sides.
Hybrid reconstruction of quantum density matrix: when low-rank meets sparsity
Li, Kezhi; Zheng, Kai; Yang, Jingbei; Cong, Shuang; Liu, Xiaomei; Li, Zhaokai
2017-12-01
Both mathematical theory and experiments have verified that quantum state tomography based on compressive sensing is an efficient framework for the reconstruction of quantum density states. In recent physical experiments, we found that many unknown density matrices in which people are interested are low-rank as well as sparse. Bearing this information in mind, in this paper we propose a reconstruction algorithm that combines the low-rank and sparsity properties of density matrices, and further theoretically prove that the solution of the optimization function can be, and only be, the true density matrix satisfying the model with overwhelming probability, as long as a necessary number of measurements are allowed. The solver leverages the fixed-point equation technique, in which a step-by-step strategy is developed by utilizing an extended soft-threshold operator that copes with complex values. Numerical experiments on density matrix estimation for real nuclear magnetic resonance devices reveal that the proposed method achieves better accuracy compared to some existing methods. We believe that the proposed method could be leveraged as a generalized approach and widely implemented in quantum state estimation.
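An extended soft-threshold operator for complex values, of the kind the solver relies on, can be sketched as follows (a minimal illustration under our own conventions, not the authors' code): shrink each entry's magnitude by the threshold and keep its phase.

```python
import numpy as np

def soft_complex(Z, tau):
    """Soft thresholding for complex entries: shrink the magnitude by tau,
    keep the phase; entries with magnitude below tau collapse to zero."""
    mag = np.abs(Z)
    scale = np.maximum(mag - tau, 0.0) / np.where(mag > 0, mag, 1.0)
    return Z * scale

z = np.array([3 + 4j, 0.1 + 0.1j, -2j])
out = soft_complex(z, 1.0)   # → [2.4+3.2j, 0, -1j]
```

Applied to the singular values of a Hermitian matrix, the same shrinkage yields the low-rank proximal step for complex-valued density matrices.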
Low-rank sparse learning for robust visual tracking
Zhang, Tianzhu
2012-01-01
In this paper, we propose a new particle-filter based tracking algorithm that exploits the relationship between particles (candidate targets). By representing particles as sparse linear combinations of dictionary templates, this algorithm capitalizes on the inherent low-rank structure of particle representations that are learned jointly. As such, it casts the tracking problem as a low-rank matrix learning problem. This low-rank sparse tracker (LRST) has a number of attractive properties. (1) Since LRST adaptively updates dictionary templates, it can handle significant changes in appearance due to variations in illumination, pose, scale, etc. (2) The linear representation in LRST explicitly incorporates background templates in the dictionary and a sparse error term, which enables LRST to address the tracking drift problem and to be robust against occlusion respectively. (3) LRST is computationally attractive, since the low-rank learning problem can be efficiently solved as a sequence of closed form update operations, which yield a time complexity that is linear in the number of particles and the template size. We evaluate the performance of LRST by applying it to a set of challenging video sequences and comparing it to 6 popular tracking methods. Our experiments show that by representing particles jointly, LRST not only outperforms the state-of-the-art in tracking accuracy but also significantly improves the time complexity of methods that use a similar sparse linear representation model for particles [1]. © 2012 Springer-Verlag.
Combinatorial conditions for low rank solutions in semidefinite programming
A. Varvitsiotis (Antonios)
2013-01-01
In this thesis we investigate combinatorial conditions that guarantee the existence of low-rank optimal solutions to semidefinite programs. Results of this type are important for approximation algorithms and for the study of geometric representations of graphs. The structure of the thesis is as...
Sampling and Low-Rank Tensor Approximation of the Response Surface
Litvinenko, Alexander
2013-01-01
Most (quasi-)Monte Carlo procedures can be seen as computing some integral over an often high-dimensional domain. If the integrand is expensive to evaluate (we are thinking of a stochastic PDE (SPDE) where the coefficients are random fields and the integrand is some functional of the PDE solution), there is the desire to keep all the samples for possible later computations of similar integrals. This obviously means a lot of data. To keep the storage demands low, and to allow evaluation of the integrand at points which were not sampled, we construct a low-rank tensor approximation of the integrand over the whole integration domain. This can also be viewed as a representation in some problem-dependent basis which allows a sparse representation. What one obtains is sometimes called a "surrogate" or "proxy" model, or a "response surface". This representation is built step by step, or sample by sample, and can already be used for each new sample. In case we are sampling a solution of an SPDE, this allows us to reduce the number of necessary samples, namely in case the solution is already well represented by the low-rank tensor approximation. This can be easily checked by evaluating the residuum of the PDE with the approximate solution. The procedure will be demonstrated in the computation of a compressible transonic Reynolds-averaged Navier-Stokes flow around an airfoil with random/uncertain data. © Springer-Verlag Berlin Heidelberg 2013.
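The stopping idea (a new sample that is already well represented by the current low-rank approximation adds little information) can be sketched in the matrix, rather than tensor, case; the synthetic "sampled solutions" below are our own assumption:

```python
import numpy as np

rng = np.random.default_rng(2)
# Pretend each column is one sampled "solution" over 50 grid points;
# the underlying response depends on only 3 latent modes.
modes = rng.standard_normal((50, 3))
samples = modes @ rng.standard_normal((3, 40))

U, s, Vt = np.linalg.svd(samples, full_matrices=False)
r = 3
basis = U[:, :r]                 # low-rank basis spanning the response

# A new sample is "already well represented" if its residual outside
# the basis is small -- a cheap criterion for when to stop sampling.
new = modes @ rng.standard_normal(3)
residual = np.linalg.norm(new - basis @ (basis.T @ new))
```

In the hierarchical tensor setting the basis would be replaced by a low-rank tensor format, but the residual check plays the same role.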
Low-rank approximation pursuit for matrix completion
Xu, An-Bao; Xie, Dongxiu
2017-10-01
We consider the matrix completion problem that aims to construct a low rank matrix X that approximates a given large matrix Y from partially known sample data in Y. In this paper we introduce an efficient greedy algorithm for such matrix completions. The greedy algorithm generalizes the orthogonal rank-one matrix pursuit method (OR1MP) by creating s ⩾ 1 candidates per iteration by low-rank matrix approximation. Due to selecting s ⩾ 1 candidates in each iteration step, our approach uses fewer iterations than OR1MP to achieve the same results. Our algorithm is a randomized low-rank approximation method, which makes it computationally inexpensive. The algorithm comes in two forms: the standard one, which uses the Lanczos algorithm to find partial SVDs, and another that uses a randomized approach for this part of its work. The storage complexity of this algorithm can be reduced by using a weight updating rule as an economic version of the algorithm. We prove that all our algorithms are linearly convergent. Numerical experiments on image reconstruction and recommendation problems are included that illustrate the accuracy and efficiency of our algorithms.
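The randomized variant can be illustrated with the standard range-finder sketch (our own minimal example, not the paper's exact algorithm): multiply Y by a thin Gaussian matrix, orthogonalize, and take a small SVD in the reduced space.

```python
import numpy as np

def randomized_lowrank(Y, r, oversample=5, rng=None):
    """Randomized rank-r approximation via the range-finder sketch."""
    rng = rng or np.random.default_rng()
    Omega = rng.standard_normal((Y.shape[1], r + oversample))
    Q, _ = np.linalg.qr(Y @ Omega)       # orthonormal basis for approx range of Y
    U_small, s, Vt = np.linalg.svd(Q.T @ Y, full_matrices=False)
    U = Q @ U_small
    return U[:, :r], s[:r], Vt[:r]

rng = np.random.default_rng(3)
Y = rng.standard_normal((100, 4)) @ rng.standard_normal((4, 80))  # exactly rank 4
U, s, Vt = randomized_lowrank(Y, 4, rng=rng)
err = np.linalg.norm((U * s) @ Vt - Y) / np.linalg.norm(Y)
```

Only SVDs of small matrices are ever computed, which is what makes the randomized form computationally inexpensive compared with a full Lanczos-based partial SVD.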
Robust Visual Tracking Via Consistent Low-Rank Sparse Learning
Zhang, Tianzhu
2014-06-19
Object tracking is the process of determining the states of a target in consecutive video frames based on properties of motion and appearance consistency. In this paper, we propose a consistent low-rank sparse tracker (CLRST) that builds upon the particle filter framework for tracking. By exploiting temporal consistency, the proposed CLRST algorithm adaptively prunes and selects candidate particles. By using linear sparse combinations of dictionary templates, the proposed method learns the sparse representations of image regions corresponding to candidate particles jointly by exploiting the underlying low-rank constraints. In addition, the proposed CLRST algorithm is computationally attractive since temporal consistency property helps prune particles and the low-rank minimization problem for learning joint sparse representations can be efficiently solved by a sequence of closed form update operations. We evaluate the proposed CLRST algorithm against 14 state-of-the-art tracking methods on a set of 25 challenging image sequences. Experimental results show that the CLRST algorithm performs favorably against state-of-the-art tracking methods in terms of accuracy and execution time.
Constrained low-rank gamut completion for robust illumination estimation
Zhou, Jianshen; Yuan, Jiazheng; Liu, Hongzhe
2017-02-01
Illumination estimation is an important component of color constancy and automatic white balancing. According to recent survey and evaluation work, supervised methods with a learning phase are competitive for illumination estimation. However, the robustness and performance of any supervised algorithm suffer from an incomplete gamut in training image sets because of the limited reflectance surfaces in a scene. In order to address this problem, we present a constrained low-rank gamut completion algorithm, which can replenish the gamut from limited surfaces in an image, for robust illumination estimation. In the proposed algorithm, we first discuss why gamut completion is actually a low-rank matrix completion problem. Then a constrained low-rank matrix completion framework is proposed by adding illumination similarities among the training images as an additional constraint. An optimization algorithm is also derived by extending the augmented Lagrange multipliers method. Finally, the completed gamut based on the proposed algorithm is fed into the support vector regression (SVR)-based illumination estimation method to evaluate the effect of gamut completion. The experimental results on both synthetic and real-world image sets show that the proposed gamut completion model not only can effectively improve the performance of the original SVR method but is also robust to the surface insufficiency in training samples.
Low-rank regularization for learning gene expression programs.
Ye, Guibo; Tang, Mengfan; Cai, Jian-Feng; Nie, Qing; Xie, Xiaohui
2013-01-01
Learning gene expression programs directly from a set of observations is challenging due to the complexity of gene regulation, high noise of experimental measurements, and insufficient number of experimental measurements. Imposing additional constraints with strong and biologically motivated regularizations is critical in developing reliable and effective algorithms for inferring gene expression programs. Here we propose a new form of regularization that constrains the number of independent connectivity patterns between regulators and targets, motivated by the modular design of gene regulatory programs and the belief that the total number of independent regulatory modules should be small. We formulate a multi-target linear regression framework to incorporate this type of regularization, in which the number of independent connectivity patterns is expressed as the rank of the connectivity matrix between regulators and targets. We then generalize the linear framework to nonlinear cases, and prove that the generalized low-rank regularization model is still convex. Efficient algorithms are derived to solve both the linear and nonlinear low-rank regularized problems. Finally, we test the algorithms on three gene expression datasets, and show that the low-rank regularization improves the accuracy of gene expression prediction in these three datasets.
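A convex low-rank regularized multi-target regression of this kind can be sketched with a proximal-gradient loop (a generic illustration on synthetic data, not the authors' algorithm): gradient steps on the squared loss, followed by singular value soft-thresholding for the nuclear norm, the standard convex surrogate for the rank of the connectivity matrix.

```python
import numpy as np

def svt(M, tau):
    """Singular value soft-thresholding (proximal operator of the nuclear norm)."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

def lowrank_regression(X, Y, lam=0.01, n_iter=500):
    """Proximal gradient for  min_W 0.5*||X W - Y||_F^2 + lam*||W||_*."""
    step = 1.0 / np.linalg.norm(X, 2) ** 2    # 1/L with L = ||X||_2^2
    W = np.zeros((X.shape[1], Y.shape[1]))
    for _ in range(n_iter):
        W = svt(W - step * X.T @ (X @ W - Y), step * lam)
    return W

# Synthetic "regulators -> targets" data with a rank-2 connectivity matrix
rng = np.random.default_rng(4)
X = rng.standard_normal((60, 10))                                  # 60 experiments, 10 regulators
W_true = rng.standard_normal((10, 2)) @ rng.standard_normal((2, 8))  # rank 2, 8 targets
Y = X @ W_true
W = lowrank_regression(X, Y)
err = np.linalg.norm(W - W_true) / np.linalg.norm(W_true)
```

The nuclear norm drives the estimated connectivity matrix toward a small number of independent regulatory modules, matching the modularity assumption in the abstract.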
Schwerdtfeger, Christine A; Mazziotti, David A
2012-12-28
Treatment of two-electron excitations is a fundamental but computationally expensive part of ab initio calculations of many-electron correlation. In this paper we develop a low-rank spectral expansion of two-electron excitations for accelerated electronic-structure calculations. The spectral expansion differs from previous approaches by relying upon both (i) a sum of three expansions to increase the rank reduction of the tensor and (ii) a factorization of the tensor into geminal (rank-two) tensors rather than orbital (rank-one) tensors. We combine three spectral expansions from the three distinct forms of the two-electron reduced density matrix (2-RDM), (i) the two-particle (2)D, (ii) the two-hole (2)Q, and (iii) the particle-hole (2)G matrices, to produce a single spectral expansion with significantly accelerated convergence. While the resulting expansion is applicable to any quantum-chemistry calculation with two-particle excitation amplitudes, it is employed here in the parametric 2-RDM method [D. A. Mazziotti, Phys. Rev. Lett. 101, 253002 (2008)]. The low-rank parametric 2-RDM method scales quartically with the basis-set size, but like its full-rank version it can capture multi-reference correlation effects that are difficult to treat efficiently by traditional single-reference wavefunction methods. Applications are made to computing potential energy curves of HF and triplet OH(+), equilibrium bond distances and frequencies, the HCN-HNC isomerization, and the energies of hydrocarbon chains. Computed 2-RDMs nearly satisfy necessary N-representability conditions. The low-rank spectral expansion has the potential to expand the applicability of the parametric 2-RDM method as well as other ab initio methods to large-scale molecular systems that are often only treatable by mean-field or density functional theories.
Moving object detection via low-rank total variation regularization
Wang, Pengcheng; Chen, Qian; Shao, Na
2016-09-01
Moving object detection is a challenging task in video surveillance. The recently proposed Robust Principal Component Analysis (RPCA) can recover the outlier patterns from low-rank data under some mild conditions. However, the l1-penalty in RPCA doesn't work well in moving object detection because the irrepresentable condition is often not satisfied. In this paper, a method based on a total variation (TV) regularization scheme is proposed. In our model, image sequences captured with a static camera are highly related, which can be described using a low-rank matrix. Meanwhile, the low-rank matrix can absorb background motion, e.g. periodic and random perturbation. The foreground objects in the sequence are usually sparsely distributed and drifting continuously, and can be treated as group outliers from the highly-related background scenes. Instead of the l1-penalty, we exploit the total variation of the foreground. By minimizing the total variation energy, the outliers tend to collapse and finally converge to be the exact moving objects. The TV-penalty is superior to the l1-penalty especially when the outlier is in the majority for some pixels, and our method can estimate the outlier explicitly with less bias but higher variance. To solve the problem, a joint optimization function is formulated and can be effectively solved through the inexact Augmented Lagrange Multiplier (ALM) method. We evaluate our method along with several state-of-the-art approaches in MATLAB. Both qualitative and quantitative results demonstrate that our proposed method works effectively on a large range of complex scenarios.
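The low-rank-plus-outlier split that this model builds on can be sketched with the classical principal component pursuit ALM iteration (a hedged, generic illustration with the standard parameter choices from the robust PCA literature; the paper's TV-regularized model is more elaborate):

```python
import numpy as np

def soft(X, tau):
    """Entrywise soft thresholding (l1 proximal step)."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def svt(X, tau):
    """Singular value soft thresholding (nuclear-norm proximal step)."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

def rpca(M, n_iter=200):
    """Inexact augmented-Lagrange iteration for M ≈ L (low rank) + S (sparse)."""
    m, n = M.shape
    lam = 1.0 / np.sqrt(max(m, n))        # standard sparsity weight
    mu = 0.25 * m * n / np.abs(M).sum()   # standard penalty parameter
    L = np.zeros_like(M); S = np.zeros_like(M); Z = np.zeros_like(M)
    for _ in range(n_iter):
        L = svt(M - S + Z / mu, 1.0 / mu)     # low-rank "background"
        S = soft(M - L + Z / mu, lam / mu)    # sparse "foreground"
        Z += mu * (M - L - S)                 # dual ascent on the constraint
    return L, S

# Static rank-1 "background" plus 5% sparse "moving object" outliers
rng = np.random.default_rng(5)
bg = np.outer(rng.standard_normal(30), np.ones(40))   # columns = identical frames
fg = np.where(rng.random(bg.shape) < 0.05, 5.0, 0.0)
L, S = rpca(bg + fg)
rel_err = np.linalg.norm(L - bg) / np.linalg.norm(bg)
```

The TV-regularized model in the paper replaces the entrywise l1 step on S with a total-variation proximal step, which couples neighboring pixels of the foreground.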
Domain Generalization and Adaptation using Low Rank Exemplar SVMs.
Li, Wen; Xu, Zheng; Xu, Dong; Dai, Dengxin; Van Gool, Luc
2017-05-16
Domain adaptation between diverse source and target domains is a challenging research problem, especially in real-world visual recognition tasks where the images and videos exhibit significant variations in viewpoints, illuminations, qualities, etc. In this paper, we propose a new approach for domain generalization and domain adaptation based on exemplar SVMs. Specifically, we decompose the source domain into many subdomains, each of which contains only one positive training sample and all negative samples. Each subdomain is relatively less diverse, and is expected to have a simpler distribution. By training one exemplar SVM for each subdomain, we obtain a set of exemplar SVMs. To further exploit the inherent structure of the source domain, we introduce a nuclear-norm based regularizer into the objective function in order to enforce the exemplar SVMs to produce a low-rank output on training samples. In the prediction process, the confident exemplar SVM classifiers are selected and reweighted according to the distribution mismatch between each subdomain and the test sample in the target domain. We formulate our approach based on the logistic regression and least square SVM algorithms, which are referred to as low rank exemplar SVMs (LRE-SVMs) and low rank exemplar least square SVMs (LRE-LSSVMs), respectively. A fast algorithm is also developed for accelerating the training of LRE-LSSVMs. We further extend the Domain Adaptation Machine (DAM) to learn an optimal target classifier for domain adaptation, and show that our approach can also be applied to domain adaptation with an evolving target domain, where the target data distribution is gradually changing. The comprehensive experiments for object recognition and action recognition demonstrate the effectiveness of our approach for domain generalization and domain adaptation with fixed and evolving target domains.
Analysis of linear dynamic systems of low rank
DEFF Research Database (Denmark)
Høskuldsson, Agnar
2003-01-01
We present procedures for obtaining stable solutions to linear dynamic systems. Different types of models are considered. The basic idea is to use the H-principle to develop low rank approximations to solutions. The approximations stop when the prediction ability of the model cannot be improved for the present data. Therefore, the present methods give better prediction results than traditional methods that give exact solutions. The vectors used in the approximations can be used to carry out graphic analysis of the dynamic systems. We show how score vectors can display the low...
Low-Rank Coal Grinding Performance Versus Power Plant Performance
Energy Technology Data Exchange (ETDEWEB)
Rajive Ganguli; Sukumar Bandopadhyay
2008-12-31
The intent of this project was to demonstrate that Alaskan low-rank coal, which is high in volatile content, need not be ground as fine as bituminous coal (typically low in volatile content) for optimum combustion in power plants. The grind or particle size distribution (PSD), which is quantified by percentage of pulverized coal passing 74 microns (200 mesh), affects the pulverizer throughput in power plants. The finer the grind, the lower the throughput. For a power plant to maintain combustion levels, throughput needs to be high. The problem of particle size is compounded for Alaskan coal since it has a low Hardgrove grindability index (HGI); that is, it is difficult to grind. If the thesis of this project is demonstrated, then Alaskan coal need not be ground to the industry standard, thereby alleviating somewhat the low HGI issue (and, hopefully, furthering the salability of Alaskan coal). This project studied the relationship between PSD and power plant efficiency, emissions, and mill power consumption for low-rank high-volatile-content Alaskan coal. The emissions studied were CO, CO{sub 2}, NO{sub x}, SO{sub 2}, and Hg (only two tests). The tested PSD range was 42 to 81 percent passing 74 microns. Within the tested range, there was very little correlation between PSD and power plant efficiency, CO, NO{sub x}, and SO{sub 2}. Hg emissions were very low and, therefore, did not allow comparison between grind sizes. Mill power consumption was lower for coarser grinds.
Low-Rank Linear Dynamical Systems for Motor Imagery EEG
Tan, Chuanqi; Liu, Shaobo
2016-01-01
The common spatial pattern (CSP) and other spatiospectral feature extraction methods have become the most effective and successful approaches to the problem of motor imagery electroencephalography (MI-EEG) pattern recognition from multichannel neural activity in recent years. However, these methods need a lot of preprocessing and postprocessing, such as filtering, demeaning, and spatiospectral feature fusion, which easily influence the classification accuracy. In this paper, we utilize linear dynamical systems (LDSs) for EEG signal feature extraction and classification. The LDS model has many advantages, such as simultaneous spatial and temporal feature matrix generation, freedom from preprocessing or postprocessing, and low cost. Furthermore, a low-rank matrix decomposition approach is introduced to remove noise and resting-state components in order to improve the robustness of the system. Then, we propose a low-rank LDSs algorithm to decompose the feature subspace of LDSs on a finite Grassmannian and obtain a better performance. Extensive experiments are carried out on public datasets from “BCI Competition III Dataset IVa” and “BCI Competition IV Database 2a.” The results show that our proposed three methods yield higher accuracies compared with prevailing approaches such as CSP and CSSP. PMID:28096809
CO2 Sequestration Potential of Texas Low-Rank Coals
Energy Technology Data Exchange (ETDEWEB)
Duane McVay; Walter Ayers, Jr.; Jerry Jensen; Jorge Garduno; Gonzola Hernandez; Rasheed Bello; Rahila Ramazanova
2006-08-31
Injection of CO{sub 2} in coalbeds is a plausible method of reducing atmospheric emissions of CO{sub 2}, and it can have the additional benefit of enhancing methane recovery from coal. Most previous studies have evaluated the merits of CO{sub 2} disposal in high-rank coals. The objective of this research was to determine the technical and economic feasibility of CO{sub 2} sequestration in, and enhanced coalbed methane (ECBM) recovery from, low-rank coals in the Texas Gulf Coast area. Our research included an extensive coal characterization program, including acquisition and analysis of coal core samples and well transient test data. We conducted deterministic and probabilistic reservoir simulation and economic studies to evaluate the effects of injectant fluid composition (pure CO{sub 2} and flue gas), well spacing, injection rate, and dewatering on CO{sub 2} sequestration and ECBM recovery in low-rank coals of the Calvert Bluff formation of the Texas Wilcox Group. Shallow and deep Calvert Bluff coals occur in two distinct coalbed gas petroleum systems that are separated by a transition zone. Calvert Bluff coals < 3,500 ft deep are part of a biogenic coalbed gas system. They have low gas content and are part of a freshwater aquifer. In contrast, Wilcox coals deeper than 3,500 ft are part of a thermogenic coalbed gas system. They have high gas content and are part of a saline aquifer. CO{sub 2} sequestration and ECBM projects in Calvert Bluff low-rank coals of East-Central Texas must be located in the deeper, unmineable coals, because shallow Wilcox coals are part of a protected freshwater aquifer. Probabilistic simulation of 100% CO{sub 2} injection into 20 feet of Calvert Bluff coal in an 80-acre 5-spot pattern indicates that these coals can store 1.27 to 2.25 Bcf of CO{sub 2} at depths of 6,200 ft, with an ECBM recovery of 0.48 to 0.85 Bcf. Simulation results of flue gas injection (87% N{sub 2}-13% CO{sub 2}) indicate that these same coals can store 0.34 to 0
Clustering high dimensional data
DEFF Research Database (Denmark)
Assent, Ira
2012-01-01
High-dimensional data, i.e., data described by a large number of attributes, pose specific challenges to clustering. The so-called ‘curse of dimensionality’, coined originally to describe the general increase in complexity of various computational problems as dimensionality increases, is known ... for clustering are required. Consequently, recent research has focused on developing techniques and clustering algorithms specifically for high-dimensional data. Still, open research issues remain. Clustering is a data mining task devoted to the automatic grouping of data based on mutual similarity. Each cluster ... groups objects that are similar to one another, whereas dissimilar objects are assigned to different clusters, possibly separating out noise. In this manner, clusters describe the data structure in an unsupervised manner, i.e., without the need for class labels. A number of clustering paradigms exist...
Direct liquefaction of low-rank coals under mild conditions
Energy Technology Data Exchange (ETDEWEB)
Braun, N.; Rinaldi, R. [Max-Planck-Institut fuer Kohlenforschung, Muelheim an der Ruhr (Germany)
2013-11-01
Due to decreasing petroleum reserves, direct coal liquefaction is attracting renewed interest as an alternative process to produce liquid fuels. The combination of hydrogen peroxide and coal is not a new one. In the early 1980s, Vasilakos and Clinton described a procedure for desulfurization by leaching coal with solutions of sulphuric acid/H{sub 2}O{sub 2}. But so far, H{sub 2}O{sub 2} has never been ascribed a major role in coal liquefaction. Herein, we describe a novel approach for liquefying low-rank coals using a solution of H{sub 2}O{sub 2} in the presence of a soluble non-transition metal catalyst. (orig.)
Robust Principal Component Analysis: Exact Recovery of Corrupted Low-Rank Matrices
Wright, John; Ganesh, Arvind; Rao, Shankar; Ma, Yi
2009-01-01
This paper has been withdrawn due to a critical error near equation (71). This error causes the entire argument of the paper to collapse. Emmanuel Candes of Stanford discovered the error, and has suggested a correct analysis, which will be reported in a separate publication.
Low-rank coal research: Volume 3, Combustion research: Final report. [Great Plains
Energy Technology Data Exchange (ETDEWEB)
Mann, M. D.; Hajicek, D. R.; Zobeck, B. J.; Kalmanovitch, D. P.; Potas, T. A.; Maas, D. J.; Malterer, T. J.; DeWall, R. A.; Miller, B. G.; Johnson, M. D.
1987-04-01
Volume III, Combustion Research, contains articles on fluidized bed combustion, advanced processes for low-rank coal slurry production, low-rank coal slurry combustion, heat engine utilization of low-rank coals, and Great Plains Gasification Plant. These articles have been entered individually into EDB and ERA. (LTN)
High dimensional entanglement
CSIR Research Space (South Africa)
McLaren, M.
2012-07-01
Full Text Available High dimensional entanglement M. McLAREN1,2, F.S. ROUX1 & A. FORBES1,2,3 1. CSIR National Laser Centre, PO Box 395, Pretoria 0001 2. School of Physics, University of the Stellenbosch, Private Bag X1, 7602, Matieland 3. School of Physics, University of Kwazulu...
Fast Low-Rank Shared Dictionary Learning for Image Classification
Vu, Tiep Huu; Monga, Vishal
2017-11-01
Despite the fact that different objects possess distinct class-specific features, they also usually share common patterns. This observation has been exploited partially in a recently proposed dictionary learning framework that separates the particularity and the commonality (COPAR). Inspired by this, we propose a novel method to explicitly and simultaneously learn a set of common patterns as well as class-specific features for classification, with more intuitive constraints. Our dictionary learning framework is hence characterized by both a shared dictionary and particular (class-specific) dictionaries. For the shared dictionary, we enforce a low-rank constraint, i.e., we require that its spanning subspace have low dimension and that the coefficients corresponding to this dictionary be similar. For the particular dictionaries, we impose on them the well-known constraints stated in Fisher discrimination dictionary learning (FDDL). Further, we develop new fast and accurate algorithms to solve the subproblems in the learning step, accelerating its convergence. These algorithms can also be applied to FDDL and its extensions. The efficiency of these algorithms is theoretically and experimentally verified by comparing their complexities and running times with those of other well-known dictionary learning methods. Experimental results on widely used image datasets establish the advantages of our method over state-of-the-art dictionary learning methods.
Moving Bed Gasification of Low Rank Alaska Coal
Directory of Open Access Journals (Sweden)
Mandar Kulkarni
2012-01-01
Full Text Available This paper presents a process simulation of a moving bed gasifier using low rank, subbituminous Usibelli coal from Alaska. All the processes occurring in a moving bed gasifier (drying, devolatilization, gasification, and combustion) are included in this model. The model, developed in Aspen Plus, is used to predict the effect of various operating parameters, including pressure, oxygen-to-coal ratio, and steam-to-coal ratio, on the product gas composition. The results obtained from the simulation were compared with experimental data in the literature. The predicted composition of the product gas was in general agreement with the established results. Carbon conversion increased with increasing oxygen-to-coal ratio and decreased with increasing steam-to-coal ratio. The steam-to-coal and oxygen-to-coal ratios affected the produced syngas composition, while pressure did not have a large impact on the product syngas composition. A nonslagging moving bed gasifier would have to be limited to an oxygen-to-coal ratio of 0.26 to operate below the ash softening temperature. Slagging moving bed gasifiers, not limited by operating temperature, could achieve a carbon conversion efficiency of 99.5% at an oxygen-to-coal ratio of 0.33. The model is useful for predicting the performance of Usibelli coal in a moving bed gasifier under different operating parameters.
The solubilization of low-ranked coals by microorganisms
Energy Technology Data Exchange (ETDEWEB)
Strandberg, G.W.
1987-07-09
Late in 1984, our Laboratory was funded by the Pittsburgh Energy Technology Center, US Department of Energy, to investigate the potential utility of microorganisms for the solubilization of low-ranked coals. Our approach has been multifaceted, including studies of the types of microorganisms involved, appropriate conditions for their growth and coal solubilization, the susceptibility of different coals to microbial action, the chemical and physical nature of the product, and potential bioprocess designs. A substantial number of fungal species have been shown to be able to solubilize coal. Cohen and Gabriele reported that two lignin-degrading fungi, Polyporus (Trametes) versicolor and Poria monticola, could solubilize lignite. Ward has isolated several diverse fungi from nature which are capable of degrading different lignites, and our Laboratory has isolated three coal-solubilizing fungi which were found growing on a sample of Texas lignite. The organisms we studied are shown in Table 1. The perceived significance of lignin degradation led us to examine two lignin-degrading strains of the genus Streptomyces. As discussed later, these bacteria were capable of solubilizing coal; but, in the case of at least one, the mechanism was non-enzymatic. The coal-solubilizing ability of other strains of Streptomyces was recently reported. Fakoussa and Trueper found evidence that a strain of Pseudomonas was capable of solubilizing coal. It would thus appear that a diverse array of microorganisms possess the ability to solubilize coal. 16 refs.
7th High Dimensional Probability Meeting
Mason, David; Reynaud-Bouret, Patricia; Rosinski, Jan
2016-01-01
This volume collects selected papers from the 7th High Dimensional Probability meeting held at the Institut d'Études Scientifiques de Cargèse (IESC) in Corsica, France. High Dimensional Probability (HDP) is an area of mathematics that includes the study of probability distributions and limit theorems in infinite-dimensional spaces such as Hilbert spaces and Banach spaces. The most remarkable feature of this area is that it has resulted in the creation of powerful new tools and perspectives, whose range of application has led to interactions with other subfields of mathematics, statistics, and computer science. These include random matrices, nonparametric statistics, empirical processes, statistical learning theory, concentration of measure phenomena, strong and weak approximations, functional estimation, combinatorial optimization, and random graphs. The contributions in this volume show that HDP theory continues to thrive and develop new tools, methods, techniques and perspectives to analyze random phenome...
Statistical analysis of compressive low rank tomography with random measurements
Acharya, Anirudh; Guţă, Mădălin
2017-05-01
We consider the statistical problem of ‘compressive’ estimation of low rank states (r ≪ d) with random basis measurements, where r and d are the rank and dimension of the state, respectively. We investigate whether, for a fixed sample size N, the estimation error associated with a ‘compressive’ measurement setup is ‘close’ to that of the setting where a large number of bases are measured. We generalise and extend previous results, and show that the mean square error (MSE) associated with the Frobenius norm attains the optimal rate rd/N with only O(r log d) random basis measurements for all states. An important tool in the analysis is the concentration of the Fisher information matrix (FIM). We demonstrate that although a concentration of the MSE follows from a concentration of the FIM for most states, the FIM fails to concentrate for states with eigenvalues close to zero. We analyse this phenomenon in the case of a single qubit and demonstrate a concentration of the MSE about its optimal rate despite a lack of concentration of the FIM for states close to the boundary of the Bloch sphere. We also consider the estimation error in terms of a different metric, the quantum infidelity. We show that a concentration in the mean infidelity (MINF) does not exist uniformly over all states, highlighting the importance of loss function choice. Specifically, we show that for states that are nearly pure, the MINF scales as 1/√N but the constant converges to zero as the number of settings is increased. This demonstrates a lack of ‘compressive’ recovery for nearly pure states in this metric.
Low-rank coal research. Quarterly report, January--March 1990
Energy Technology Data Exchange (ETDEWEB)
1990-08-01
This document contains several quarterly progress reports for low-rank coal research that was performed from January-March 1990. Reports in Control Technology and Coal Preparation Research are in Flue Gas Cleanup, Waste Management, and Regional Energy Policy Program for the Northern Great Plains. Reports in Advanced Research and Technology Development are presented in Turbine Combustion Phenomena, Combustion Inorganic Transformation (two sections), Liquefaction Reactivity of Low-Rank Coals, Gasification Ash and Slag Characterization, and Coal Science. Reports in Combustion Research cover Fluidized-Bed Combustion, Beneficiation of Low-Rank Coals, Combustion Characterization of Low-Rank Coal Fuels, Diesel Utilization of Low-Rank Coals, and Produce and Characterize HWD (hot-water drying) Fuels for Heat Engine Applications. Liquefaction Research is reported in Low-Rank Coal Direct Liquefaction. Gasification Research progress is discussed for Production of Hydrogen and By-Products from Coal and for Chemistry of Sulfur Removal in Mild Gas.
A multi-platform evaluation of the randomized CX low-rank matrix factorization in Spark
Energy Technology Data Exchange (ETDEWEB)
Gittens, Alex; Kottalam, Jey; Yang, Jiyan; Ringenburg, Michael F.; Chhugani, Jatin; Racah, Evan; Singh, Mohitdeep; Yao, Yushu; Fischer, Curt; Ruebel, Oliver; Bowen, Benjamin; Lewis, Norman G.; Mahoney, Michael W.; Krishnamurthy, Venkat; Prabhat
2017-07-27
We investigate the performance and scalability of the randomized CX low-rank matrix factorization and demonstrate its applicability through the analysis of a 1TB mass spectrometry imaging (MSI) dataset, using Apache Spark on an Amazon EC2 cluster, a Cray XC40 system, and an experimental Cray cluster. We implemented this factorization both as a parallelized C implementation with hand-tuned optimizations and in Scala using the Apache Spark high-level cluster computing framework. We obtained consistent performance across the three platforms: using Spark we were able to process the 1TB size dataset in under 30 minutes with 960 cores on all systems, with the fastest times obtained on the experimental Cray cluster. In comparison, the C implementation was 21X faster on the Amazon EC2 system, due to careful cache optimizations, bandwidth-friendly access of matrices and vector computation using SIMD units. We report these results and their implications on the hardware and software issues arising in supporting data-centric workloads in parallel and distributed environments.
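Though the abstract above concerns distributed Spark and C implementations, the randomized CX factorization itself is compact. The following serial NumPy sketch, written for this summary (function names and parameters are illustrative, not taken from the paper's code), samples columns with probability proportional to their rank-k leverage scores and then solves a least-squares fit for the coefficient matrix X:

```python
import numpy as np

def cx_decomposition(A, k, c, seed=None):
    """Randomized CX sketch: A ~ C @ X, where C holds c columns of A
    sampled by rank-k leverage scores and X is the least-squares fit."""
    rng = np.random.default_rng(seed)
    # Leverage scores of the columns, from the top-k right singular vectors.
    _, _, Vt = np.linalg.svd(A, full_matrices=False)
    lev = np.sum(Vt[:k, :] ** 2, axis=0)
    p = lev / lev.sum()
    cols = rng.choice(A.shape[1], size=c, replace=False, p=p)
    C = A[:, cols]
    # X = argmin_X ||A - C X||_F, computed column-block least squares.
    X, *_ = np.linalg.lstsq(C, A, rcond=None)
    return C, X, cols

# Toy usage: an exactly rank-3 matrix approximated from 6 sampled columns.
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 3)) @ rng.standard_normal((3, 40))
C, X, cols = cx_decomposition(A, k=3, c=6)
err = np.linalg.norm(A - C @ X) / np.linalg.norm(A)
```

For a matrix of exact rank k, a handful of leverage-sampled columns generically spans the column space, so the relative error `err` comes out near machine precision; on noisy data the error instead tracks the best rank-k approximation up to the usual sampling factors.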
Likelihood ratio based verification in high dimensional spaces
Hendrikse, A.J.; Veldhuis, Raymond N.J.; Spreeuwers, Lieuwe Jan
The increase of the dimensionality of data sets often leads to problems during estimation, which are denoted as the curse of dimensionality. One of the problems of Second Order Statistics (SOS) estimation in high dimensional data is that the resulting covariance matrices are not full rank, so their
Image Inpainting Algorithm Based on Low-Rank Approximation and Texture Direction
Directory of Open Access Journals (Sweden)
Jinjiang Li
2014-01-01
Full Text Available Existing image inpainting algorithms based on low-rank matrix approximation are not suitable for complex, large-scale damaged texture images. An inpainting algorithm based on low-rank approximation and texture direction is proposed in this paper. First, we decompose the image using a low-rank approximation method. Then the area to be repaired is interpolated by a level set algorithm, and we reconstruct a new image from the boundary values of the level set. To obtain a better restoration effect, we iterate the low-rank decomposition and level set interpolation. Taking into account the impact of texture direction, we segment the texture and perform low-rank decomposition along the texture direction. Experimental results show that the new algorithm is suitable for texture recovery while maintaining the overall consistency of the structure, and can be used to repair large-scale damaged images.
Fan, Jicong; Tian, Zhaoyang; Zhao, Mingbo; Chow, Tommy W S
2018-02-02
The scalability of low-rank representation (LRR) to large-scale data is still a major research issue, because it is extremely time-consuming to solve the singular value decomposition (SVD) in each optimization iteration, especially for large matrices. Several methods have been proposed to speed up LRR, but they are still computationally heavy, and the overall representation results were also found to degrade. In this paper, a novel method, called accelerated LRR (ALRR), is proposed for large-scale data. The proposed accelerated method integrates matrix factorization with nuclear-norm minimization to find a low-rank representation. In our proposed method, the large square matrix of representation coefficients is transformed into a significantly smaller square matrix, on which the SVD can be efficiently implemented. The size of the transformed matrix is not related to the number of data points, and the optimization of ALRR is linear in the number of data points. The proposed ALRR is convex, accurate, robust, and efficient for large-scale data. In this paper, ALRR is compared with the state-of-the-art in subspace clustering and semi-supervised classification on real image datasets. The obtained results verify the effectiveness and superiority of the proposed ALRR method. Copyright © 2018 Elsevier Ltd. All rights reserved.
Benner, Peter; Dolgov, Sergey; Khoromskaia, Venera; Khoromskij, Boris N.
2017-04-01
In this paper, we propose and study two approaches to approximate the solution of the Bethe-Salpeter equation (BSE) by using structured iterative eigenvalue solvers. Both approaches are based on the reduced basis method and low-rank factorizations of the generating matrices. We also propose to represent the static screen interaction part in the BSE matrix by a small active sub-block, with a size balancing the storage for rank-structured representations of other matrix blocks. We demonstrate by various numerical tests that the combination of the diagonal plus low-rank plus reduced-block approximation exhibits higher precision with low numerical cost, providing as well a distinct two-sided error estimate for the smallest eigenvalues of the Bethe-Salpeter operator. The complexity is reduced to O(Nb^2) in the size of the atomic orbitals basis set, Nb, instead of the practically intractable O(Nb^6) scaling for the direct diagonalization. In the second approach, we apply the quantized-TT (QTT) tensor representation to both the long eigenvectors and the column vectors in the rank-structured BSE matrix blocks, and combine this with the ALS-type iteration in block QTT format. The QTT-rank of the matrix entities possesses almost the same magnitude as the number of occupied orbitals, No, for chain-type molecules, while supporting sufficient accuracy.
Understanding high-dimensional spaces
Skillicorn, David B
2012-01-01
High-dimensional spaces arise as a way of modelling datasets with many attributes. Such a dataset can be directly represented in a space spanned by its attributes, with each record represented as a point in the space with its position depending on its attribute values. Such spaces are not easy to work with because of their high dimensionality: our intuition about space is not reliable, and measures such as distance do not provide as clear information as we might expect. There are three main areas where complex high dimensionality and large datasets arise naturally: data collected by online ret
Acceleration of MR parameter mapping using annihilating filter‐based low rank hankel matrix (ALOHA)
National Research Council Canada - National Science Library
Lee, Dongwook; Jin, Kyong Hwan; Kim, Eung Yeop; Park, Sung‐Hong; Ye, Jong Chul
2016-01-01
... However, increased scan time makes it difficult for routine clinical use. This article aims at developing an accelerated MR parameter mapping technique using an annihilating filter based low-rank Hankel matrix approach (ALOHA...
Reduced basis ANOVA methods for partial differential equations with high-dimensional random inputs
Energy Technology Data Exchange (ETDEWEB)
Liao, Qifeng, E-mail: liaoqf@shanghaitech.edu.cn [School of Information Science and Technology, ShanghaiTech University, Shanghai 200031 (China); Lin, Guang, E-mail: guanglin@purdue.edu [Department of Mathematics & School of Mechanical Engineering, Purdue University, West Lafayette, IN 47907 (United States)
2016-07-15
In this paper we present a reduced basis ANOVA approach for partial differential equations (PDEs) with random inputs. The ANOVA method combined with stochastic collocation methods provides model reduction in high-dimensional parameter space through decomposing high-dimensional inputs into unions of low-dimensional inputs. In this work, to further reduce the computational cost, we investigate spatial low-rank structures in the ANOVA-collocation method, and develop efficient spatial model reduction techniques using hierarchically generated reduced bases. We present a general mathematical framework of the methodology, validate its accuracy and demonstrate its efficiency with numerical experiments.
Low-rank coal study. Volume 4. Regulatory, environmental, and market analyses
Energy Technology Data Exchange (ETDEWEB)
1980-11-01
The regulatory, environmental, and market constraints to development of US low-rank coal resources are analyzed. Government-imposed environmental and regulatory requirements are among the most important factors that determine the markets for low-rank coal and the technology used in the extraction, delivery, and utilization systems. Both state and federal controls are examined, in light of available data on impacts and effluents associated with major low-rank coal development efforts. The market analysis examines both the penetration of existing markets by low-rank coal and the evolution of potential markets in the future. The electric utility industry consumes about 99 percent of the total low-rank coal production. This use in utility boilers rose dramatically in the 1970s and is expected to continue to grow rapidly. In the late 1980s and 1990s, industrial direct use of low-rank coal and the production of synthetic fuels are expected to start growing as major new markets.
Thermolysis of phenethyl phenyl ether: A model of ether linkages in low rank coal
Energy Technology Data Exchange (ETDEWEB)
Britt, P.F.; Buchanan, A.C. III; Malcolm, E.A.
1994-09-01
Currently, an area of interest and frustration for coal chemists is the direct liquefaction of low rank coal. Although low rank coals are more reactive than bituminous coals, they are more difficult to liquefy and offer lower liquefaction yields under conditions optimized for bituminous coals. Solomon, Serio, and co-workers have shown that, in the pyrolysis and liquefaction of low rank coals, a low temperature cross-linking reaction associated with oxygen functional groups occurs before tar evolution. A variety of pretreatments (demineralization, alkylation, and ion-exchange) have been shown to reduce these retrogressive reactions and increase tar yields, but the actual chemical reactions responsible for these processes have not been defined. In order to gain insight into the thermochemical reactions leading to cross-linking in low rank coal, we have undertaken a study of the pyrolysis of oxygen-containing coal model compounds. Solid state NMR studies suggest that the alkyl aryl ether linkage may be present in modest amounts in low rank coal. Therefore, in this paper, we will investigate the thermolysis of phenethyl phenyl ether (PPE) as a model of O-aryl ether linkages found in low rank coal, lignites, and lignin, an evolutionary precursor of coal. Our results have uncovered a new reaction channel that can account for 25% of the products formed. The impact of reaction conditions, including restricted mass transport, on this new reaction pathway and the role of oxygen functional groups in cross-linking reactions will be investigated.
GoDec+: Fast and Robust Low-Rank Matrix Decomposition Based on Maximum Correntropy.
Guo, Kailing; Liu, Liu; Xu, Xiangmin; Xu, Dong; Tao, Dacheng
2017-04-24
GoDec is an efficient low-rank matrix decomposition algorithm. However, its optimal performance relies on the assumptions of sparse errors and Gaussian noise. This paper aims to address the problem that a matrix is composed of a low-rank component and unknown corruptions. We introduce a robust local similarity measure called correntropy to describe the corruptions and, in doing so, obtain a more robust and faster low-rank decomposition algorithm: GoDec+. Based on half-quadratic optimization and the greedy bilateral paradigm, we deliver a solution to the maximum correntropy criterion (MCC)-based low-rank decomposition problem. Experimental results show that GoDec+ is efficient and robust to different corruptions including Gaussian noise, Laplacian noise, salt & pepper noise, and occlusion on both synthetic and real vision data. We further apply GoDec+ to more general applications including classification and subspace clustering. For classification, we construct an ensemble subspace from the low-rank GoDec+ matrix and introduce an MCC-based classifier. For subspace clustering, we utilize the low-rank matrix produced by GoDec+ for MCC-based self-expression and combine it with spectral clustering. Face recognition, motion segmentation, and face clustering experiments show that the proposed methods are effective and robust. In particular, we achieve the state-of-the-art performance on the Hopkins 155 data set and the first 10 subjects of extended Yale B for subspace clustering.
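For orientation, the greedy alternation that GoDec builds on can be sketched in a few lines. The NumPy toy below implements only the plain GoDec-style split into a rank-bounded term and a cardinality-bounded term; the correntropy measure and half-quadratic optimization that distinguish GoDec+ are not reproduced here, and all names are illustrative:

```python
import numpy as np

def godec_sketch(A, rank, card, iters=25):
    """Alternate a hard rank constraint and a hard sparsity constraint:
    A ~ L + S with rank(L) <= rank and nnz(S) <= card."""
    L = np.zeros_like(A)
    S = np.zeros_like(A)
    for _ in range(iters):
        # L-step: best rank-`rank` approximation of the residual A - S.
        U, s, Vt = np.linalg.svd(A - S, full_matrices=False)
        L = (U[:, :rank] * s[:rank]) @ Vt[:rank, :]
        # S-step: keep only the `card` largest-magnitude entries of A - L.
        R = A - L
        S = np.zeros_like(A)
        keep = np.unravel_index(np.argsort(np.abs(R), axis=None)[-card:], A.shape)
        S[keep] = R[keep]
    return L, S

# Toy usage: a rank-2 matrix corrupted by 10 large spikes.
rng = np.random.default_rng(1)
L0 = rng.standard_normal((30, 2)) @ rng.standard_normal((2, 20))
S0 = np.zeros((30, 20))
S0.flat[rng.choice(S0.size, size=10, replace=False)] = 10.0
A = L0 + S0
L, S = godec_sketch(A, rank=2, card=10)
```

Each half-step minimizes the residual norm over one variable with the other fixed, so the objective is non-increasing across iterations; in particular the final residual is never worse than attributing all of the corruption to the sparse term.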
Adaptive low-rank subspace learning with online optimization for robust visual tracking.
Liu, Risheng; Wang, Di; Han, Yuzhuo; Fan, Xin; Luo, Zhongxuan
2017-04-01
In recent years, sparse and low-rank models have been widely used to formulate appearance subspaces for visual tracking. However, most existing methods only consider the sparsity or low-rankness of the coefficients, which is not sufficient for appearance subspace learning on complex video sequences. Moreover, as both the low-rank and the column sparse measures are tightly related to all the samples in the sequences, it is challenging to incrementally solve optimization problems with both nuclear norm and column sparse norm on sequentially obtained video data. To address the above limitations, this paper develops a novel low-rank subspace learning with adaptive penalization (LSAP) framework for subspace-based robust visual tracking. Different from previous work, which often simply decomposes observations as low-rank features and sparse errors, LSAP simultaneously learns the subspace basis, low-rank coefficients and column sparse errors to formulate the appearance subspace. Within the LSAP framework, we introduce a Hadamard product based regularization to incorporate rich generative/discriminative structure constraints to adaptively penalize the coefficients for subspace learning. It is shown that such adaptive penalization can significantly improve the robustness of LSAP on severely corrupted datasets. To utilize LSAP for online visual tracking, we also develop an efficient incremental optimization scheme for nuclear norm and column sparse norm minimizations. Experiments on 50 challenging video sequences demonstrate that our tracker outperforms other state-of-the-art methods. Copyright © 2017 Elsevier Ltd. All rights reserved.
Low-rank coal research, Task 5.1. Topical report, April 1986--December 1992
Energy Technology Data Exchange (ETDEWEB)
1993-02-01
This document is a topical progress report for Low-Rank Coal Research performed April 1986 - December 1992. Control Technology and Coal Preparation Research is described for Flue Gas Cleanup, Waste Management, Regional Energy Policy Program for the Northern Great Plains, and Hot-Gas Cleanup. Advanced Research and Technology Development was conducted on Turbine Combustion Phenomena, Combustion Inorganic Transformation (two sections), Liquefaction Reactivity of Low-Rank Coals, Gasification Ash and Slag Characterization, and Coal Science. Combustion Research is described for Atmospheric Fluidized-Bed Combustion, Beneficiation of Low-Rank Coals, Combustion Characterization of Low-Rank Fuels (completed 10/31/90), Diesel Utilization of Low-Rank Coals (completed 12/31/90), Produce and Characterize HWD (hot-water drying) Fuels for Heat Engine Applications (completed 10/31/90), Nitrous Oxide Emission, and Pressurized Fluidized-Bed Combustion. Liquefaction Research in Low-Rank Coal Direct Liquefaction is discussed. Gasification Research was conducted in Production of Hydrogen and By-Products from Coals and in Sulfur Forms in Coal.
Low-rank coal study : national needs for resource development. Volume 2. Resource characterization
Energy Technology Data Exchange (ETDEWEB)
1980-11-01
Comprehensive data are presented on the quantity, quality, and distribution of low-rank coal (subbituminous and lignite) deposits in the United States. The major lignite-bearing areas are the Fort Union Region and the Gulf Lignite Region, with the predominant strippable reserves being in the states of North Dakota, Montana, and Texas. The largest subbituminous coal deposits are in the Powder River Region of Montana and Wyoming, the San Juan Basin of New Mexico, and Northern Alaska. For each of the low-rank coal-bearing regions, descriptions are provided of the geology; strippable reserves; active and planned mines; classification of identified resources by depth, seam thickness, sulfur content, and ash content; overburden characteristics; aquifers; and coal properties and characteristics. Low-rank coals are distinguished from bituminous coals by unique chemical and physical properties that affect their behavior in extraction, utilization, or conversion processes. The most characteristic properties of the organic fraction of low-rank coals are the high inherent moisture and oxygen contents, and the correspondingly low heating value. Mineral matter (ash) contents and compositions of all coals are highly variable; however, low-rank coals tend to have a higher proportion of the alkali components CaO, MgO, and Na{sub 2}O. About 90% of the reserve base of US low-rank coal has less than one percent sulfur. Water resources in the major low-rank coal-bearing regions tend to have highly seasonal availabilities. Some areas appear to have ample water resources to support major new coal projects; in other areas, such as Texas, water supplies may be a constraining factor on development.
Low-Rank Representation-Based Object Tracking Using Multitask Feature Learning with Joint Sparsity
Directory of Open Access Journals (Sweden)
Hyuncheol Kim
2014-01-01
Full Text Available We address the object tracking problem as a multitask feature learning process based on low-rank representation of features with joint sparsity. We first select features with low-rank representation within a number of initial frames to obtain the subspace basis. Next, the features represented by the low-rank and sparse property are learned using a modified joint sparsity-based multitask feature learning framework. Both the features and sparse errors are then optimally updated using a novel incremental alternating direction method. The low-rank minimization problem for learning multitask features can be achieved by a few sequences of efficient closed-form update processes. Since the proposed method performs feature learning in both a multitask and a low-rank manner, it can not only reduce the dimension but also improve the tracking performance without drift. Experimental results demonstrate that the proposed method outperforms existing state-of-the-art tracking methods for tracking objects in challenging image sequences.
CT image sequence restoration based on sparse and low-rank decomposition.
Directory of Open Access Journals (Sweden)
Shuiping Gou
Full Text Available Blurry organ boundaries and soft tissue structures present a major challenge in biomedical image restoration. In this paper, we propose a low-rank decomposition-based method for computed tomography (CT) image sequence restoration, where the CT image sequence is decomposed into a sparse component and a low-rank component. A new point spread function of the Wiener filter is employed to efficiently remove blur in the sparse component; Wiener filtering with a Gaussian PSF is used to recover the average image of the low-rank component. We then obtain the recovered CT image sequence by combining the recovered low-rank image with the recovered sparse image sequence. Our method achieves restoration results with higher contrast, sharper organ boundaries and richer soft tissue structure information, compared with existing CT image restoration methods. The robustness of our method was assessed with numerical experiments using three different low-rank models: Robust Principal Component Analysis (RPCA), Linearized Alternating Direction Method with Adaptive Penalty (LADMAP) and Go Decomposition (GoDec). Experimental results demonstrated that the RPCA model was the most suitable for small-noise CT images whereas the GoDec model was the best for large-noise CT images.
Patch-Based Image Inpainting via Two-Stage Low Rank Approximation.
Guo, Qiang; Gao, Shanshan; Zhang, Xiaofeng; Yin, Yilong; Zhang, Caiming
2017-05-09
To recover corrupted pixels, traditional inpainting methods based on low-rank priors generally solve a convex optimization problem with an iterative singular value shrinkage algorithm. In this paper, we propose a simple method for image inpainting using low-rank approximation, which avoids the time-consuming iterative shrinkage. Specifically, if similar patches of a corrupted image are identified and reshaped as vectors, a patch matrix can be constructed by collecting these patch vectors. Because its columns are highly linearly correlated, this patch matrix is low-rank. Instead of an iterative singular value shrinkage scheme, the proposed method uses low-rank approximation with truncated singular values to derive a closed-form estimate for each patch matrix. Based on the observation that there is a distinct gap in the singular spectrum of a patch matrix, the rank of each patch matrix is determined empirically by a heuristic procedure. Inspired by inpainting algorithms with component decomposition, a two-stage low rank approximation (TSLRA) scheme is designed to recover image structures and refine texture details of corrupted images. Experimental results on various inpainting tasks demonstrate that the proposed method is comparable and even superior to some state-of-the-art inpainting algorithms.
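The closed-form estimate described above can be sketched in a few lines of NumPy. The gap-based rank rule and the toy patch matrix below are illustrative assumptions, not the authors' exact procedure:

```python
import numpy as np

def truncated_lowrank(patch_matrix):
    """Closed-form low-rank estimate via truncated SVD, with the rank
    picked at the largest gap in the singular spectrum (a heuristic
    stand-in for the paper's gap-based rule)."""
    U, s, Vt = np.linalg.svd(patch_matrix, full_matrices=False)
    gaps = s[:-1] - s[1:]            # successive singular-value gaps
    r = int(np.argmax(gaps)) + 1     # keep the components before the gap
    return U[:, :r] * s[:r] @ Vt[:r, :], r

# toy "patch matrix": exact rank 2 plus small noise
rng = np.random.default_rng(0)
U0, _ = np.linalg.qr(rng.standard_normal((64, 2)))
V0, _ = np.linalg.qr(rng.standard_normal((20, 2)))
M = U0 @ np.diag([10.0, 9.0]) @ V0.T
M_hat, r = truncated_lowrank(M + 0.01 * rng.standard_normal(M.shape))
```

On this toy example the detected rank is 2 and the reconstruction is close to the clean matrix, with no iterative shrinkage involved.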
CT image sequence restoration based on sparse and low-rank decomposition.
Gou, Shuiping; Wang, Yueyue; Wang, Zhilong; Peng, Yong; Zhang, Xiaopeng; Jiao, Licheng; Wu, Jianshe
2013-01-01
Blurry organ boundaries and soft tissue structures present a major challenge in biomedical image restoration. In this paper, we propose a low-rank decomposition-based method for computed tomography (CT) image sequence restoration, where the CT image sequence is decomposed into a sparse component and a low-rank component. A new point spread function with a Wiener filter is employed to efficiently remove blur in the sparse component, and Wiener filtering with a Gaussian PSF is used to recover the average image of the low-rank component. The restored CT image sequence is then obtained by combining the recovered low-rank image with the recovered sparse image sequence. Our method achieves restoration results with higher contrast, sharper organ boundaries and richer soft tissue structure information compared with existing CT image restoration methods. The robustness of our method was assessed in numerical experiments using three different low-rank models: Robust Principal Component Analysis (RPCA), the Linearized Alternating Direction Method with Adaptive Penalty (LADMAP) and Go Decomposition (GoDec). Experimental results demonstrated that the RPCA model was the most suitable for CT images with low noise, whereas the GoDec model was the best for heavily noisy CT images.
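The sparse-plus-low-rank split underlying such methods can be sketched with a GoDec-style alternation: a truncated SVD for the low-rank part, hard thresholding of the residual for the sparse part. This is a generic sketch of the decomposition idea, not the exact solvers benchmarked in the paper; the target rank and sparsity level are assumed known:

```python
import numpy as np

def lowrank_sparse_split(X, rank, card, n_iter=20):
    """GoDec-style alternation: X ~ L (low-rank) + S (sparse).
    rank: target rank of L; card: number of nonzeros kept in S."""
    S = np.zeros_like(X)
    for _ in range(n_iter):
        # low-rank step: truncated SVD of the residual X - S
        U, s, Vt = np.linalg.svd(X - S, full_matrices=False)
        L = U[:, :rank] * s[:rank] @ Vt[:rank, :]
        # sparse step: keep the `card` largest-magnitude residual entries
        R = X - L
        S = np.zeros_like(X)
        idx = np.unravel_index(np.argsort(np.abs(R), axis=None)[-card:], X.shape)
        S[idx] = R[idx]
    return L, S

# toy data: rank-1 background plus 5 large "spikes"
L0 = np.outer(np.linspace(1.0, 2.0, 30), np.linspace(1.0, 2.0, 20))
S0 = np.zeros_like(L0)
S0[[0, 7, 14, 21, 28], [0, 5, 10, 15, 19]] = 10.0
L, S = lowrank_sparse_split(L0 + S0, rank=1, card=5)
```

On this toy input the alternation separates the smooth rank-1 background from the spikes within a few iterations.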
Low-rank coal study. Volume 5. RD and D program evaluation
Energy Technology Data Exchange (ETDEWEB)
1980-11-01
A national program is recommended for research, development, and demonstration (RD and D) of improved technologies for the environmentally acceptable use of low-rank coals. RD and D project recommendations are outlined in all applicable technology areas, including extraction, transportation, preparation, handling and storage, conventional combustion and environmental control technology, fluidized bed combustion, gasification, liquefaction, and pyrolysis. Basic research topics are identified separately, as well as a series of crosscutting research activities addressing environmental, economic, and regulatory issues. The recommended RD and D activities are classified into Priority I and Priority II categories, reflecting their relative urgency and potential impact on the advancement of low-rank coal development. Summaries of ongoing research projects on low-rank coals in the US are presented in an Appendix, and the relationships of these ongoing efforts to the recommended RD and D program are discussed.
Image restoration via patch orientation-based low-rank matrix approximation and nonlocal means
Zhang, Di; He, Jiazhong; Du, Minghui
2016-03-01
Low-rank matrix approximation and nonlocal means (NLM) are two popular techniques for image restoration. Although the basic principle for applying these two techniques is the same, i.e., similar image patches are abundant in the image, previously published related algorithms use either low-rank matrix approximation or NLM because they manipulate the information of similar patches in different ways. We propose a method for image restoration by jointly using low-rank matrix approximation and NLM in a unified minimization framework. To improve the accuracy of determining similar patches, we also propose a patch similarity measurement based on curvelet transform. Extensive experiments on image deblurring and compressive sensing image recovery validate that the proposed method achieves better results than many state-of-the-art algorithms in terms of both quantitative measures and visual perception.
Low-rank coal study: national needs for resource development. Volume 3. Technology evaluation
Energy Technology Data Exchange (ETDEWEB)
1980-11-01
Technologies applicable to the development and use of low-rank coals are analyzed in order to identify specific needs for research, development, and demonstration (RD and D). Major sections of the report address the following technologies: extraction; transportation; preparation, handling and storage; conventional combustion and environmental control technology; gasification; liquefaction; and pyrolysis. Each of these sections contains an introduction and summary of the key issues with regard to subbituminous coal and lignite; description of all relevant technology, both existing and under development; a description of related environmental control technology; an evaluation of the effects of low-rank coal properties on the technology; and summaries of current commercial status of the technology and/or current RD and D projects relevant to low-rank coals.
[Study on Microwave Co-Pyrolysis of Low Rank Coal and Circulating Coal Gas].
Zhou, Jun; Yang, Zhe; Liu, Xiao-feng; Wu, Lei; Tian, Yu-hong; Zhao, Xi-cheng
2016-02-01
The pyrolysis of low rank coal to produce bluecoke, coal tar and gas is considered the optimal route to its clean and efficient utilization. However, current mainstream pyrolysis production technology generally imposes particle size requirements on the raw coal, resulting in lower yield and poorer quality of coal tar and lower content of effective gas components such as H₂, CH₄ and CO. To further improve the coal tar yield from the pyrolysis of low rank coal, and to systematically explore the effect of microwave power, pyrolysis time and coal sample particle size on the yield and composition of the microwave pyrolysis products (analyzed and characterized by FTIR and GC-MS), a co-pyrolysis experiment was carried out in which the coal gas generated by the pyrolysis of low rank coal was recirculated into the microwave pyrolysis reactor. The results indicated that the yields of bluecoke and liquid products reached 62.2% and 26.8%, respectively, under the optimal process conditions: microwave power of 800 W, pyrolysis time of 40 min, coal sample particle size of 5-10 mm and circulating coal gas flow rate of 0.4 L·min⁻¹. The infrared spectra of the bluecoke obtained under different microwave powers and pyrolysis times largely overlapped, whereas the content of -OH, C=O, C=C and C-O functional groups in bluecoke pyrolyzed from coal samples of different particle sizes differed considerably. Increasing the microwave power, prolonging the pyrolysis time and reducing the particle size of the coal samples were all conducive to converting heavy components of the coal tar into light ones.
Low-rank Atlas Image Analyses in the Presence of Pathologies
Liu, Xiaoxiao; Niethammer, Marc; Kwitt, Roland; Singh, Nikhil; McCormick, Matt; Aylward, Stephen
2015-01-01
We present a common framework, for registering images to an atlas and for forming an unbiased atlas, that tolerates the presence of pathologies such as tumors and traumatic brain injury lesions. This common framework is particularly useful when a sufficient number of protocol-matched scans from healthy subjects cannot be easily acquired for atlas formation and when the pathologies in a patient cause large appearance changes. Our framework combines a low-rank-plus-sparse image decomposition technique with an iterative, diffeomorphic, group-wise image registration method. At each iteration of image registration, the decomposition technique estimates a “healthy” version of each image as its low-rank component and estimates the pathologies in each image as its sparse component. The healthy version of each image is used for the next iteration of image registration. The low-rank and sparse estimates are refined as the image registrations iteratively improve. When that framework is applied to image-to-atlas registration, the low-rank image is registered to a pre-defined atlas, to establish correspondence that is independent of the pathologies in the sparse component of each image. Ultimately, image-to-atlas registrations can be used to define spatial priors for tissue segmentation and to map information across subjects. When that framework is applied to unbiased atlas formation, at each iteration, the average of the low-rank images from the patients is used as the atlas image for the next iteration, until convergence. Since each iteration’s atlas is comprised of low-rank components, it provides a population-consistent, pathology-free appearance. Evaluations of the proposed methodology are presented using synthetic data as well as simulated and clinical tumor MRI images from the brain tumor segmentation (BRATS) challenge from MICCAI 2012. PMID:26111390
Speech Denoising in White Noise Based on Signal Subspace Low-rank Plus Sparse Decomposition
Directory of Open Access Journals (Sweden)
yuan Shuai
2017-01-01
In this paper, a new subspace speech enhancement method using low-rank and sparse decomposition is presented. In the proposed method, we first structure the corrupted data as a Toeplitz matrix and estimate its effective rank for the underlying speech signal. The low-rank and sparse decomposition is then performed, guided by the estimated speech rank, to remove the noise. Extensive experiments have been carried out under white Gaussian noise, and the results show that the proposed method outperforms conventional speech enhancement methods, yielding less residual noise and lower speech distortion.
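The structuring step described above can be sketched as follows. The window length m=32 and the threshold-based rank rule are assumptions for illustration, not the paper's estimator; a clean two-sinusoid signal is used so the subspace rank is known exactly:

```python
import numpy as np
from scipy.linalg import toeplitz

def toeplitz_embed(x, m):
    """Structure a 1-D signal as a Toeplitz data matrix whose row i holds
    x[i+m-1], ..., x[i] (m is an assumed analysis window length)."""
    return toeplitz(x[m - 1:], x[m - 1::-1])

def effective_rank(T, rel_tol=1e-8):
    """Count singular values above a relative threshold."""
    s = np.linalg.svd(T, compute_uv=False)
    return int(np.sum(s > rel_tol * s[0]))

t = np.arange(200) / 200.0
x = np.sin(2 * np.pi * 7 * t) + 0.5 * np.sin(2 * np.pi * 19 * t)
T = toeplitz_embed(x, m=32)
print(effective_rank(T))   # → 4: two real sinusoids span a rank-4 subspace
```

With noise added, the rank estimate would instead look for a gap between large "signal" singular values and a small "noise" floor.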
30 CFR 870.20 - How to calculate excess moisture in LOW-rank coals.
2010-07-01
... coals. 870.20 Section 870.20 Mineral Resources OFFICE OF SURFACE MINING RECLAMATION AND ENFORCEMENT, DEPARTMENT OF THE INTERIOR ABANDONED MINE LAND RECLAMATION ABANDONED MINE RECLAMATION FUND-FEE COLLECTION AND COAL PRODUCTION REPORTING § 870.20 How to calculate excess moisture in LOW-rank coals. Here are the...
Weakly intrusive low-rank approximation method for nonlinear parameter-dependent equations
Giraldi, Loic
2017-06-30
This paper presents a weakly intrusive strategy for computing a low-rank approximation of the solution of a system of nonlinear parameter-dependent equations. The proposed strategy relies on a Newton-like iterative solver which only requires evaluations of the residual of the parameter-dependent equation and of a preconditioner (such as the differential of the residual) for instances of the parameters independently. The algorithm provides an approximation of the set of solutions associated with a possibly large number of instances of the parameters, with a computational complexity which can be orders of magnitude lower than when using the same Newton-like solver for all instances of the parameters. The reduction of complexity requires efficient strategies for obtaining low-rank approximations of the residual, of the preconditioner, and of the increment at each iteration of the algorithm. For the approximation of the residual and the preconditioner, weakly intrusive variants of the empirical interpolation method are introduced, which require evaluations of entries of the residual and the preconditioner. Then, an approximation of the increment is obtained by using a greedy algorithm for low-rank approximation, and a low-rank approximation of the iterate is finally obtained by using a truncated singular value decomposition. When the preconditioner is the differential of the residual, the proposed algorithm is interpreted as an inexact Newton solver for which a detailed convergence analysis is provided. Numerical examples illustrate the efficiency of the method.
Kriging accelerated by orders of magnitude: combining low-rank with FFT techniques
Litvinenko, Alexander
2014-05-04
Kriging algorithms based on FFT, the separability of certain covariance functions, and low-rank representations of covariance functions have been investigated. The current study combines these ideas, and so combines the individual speedup factors of all of them. The reduced computational complexity is O(d L log L), where L := max_i n_i, i = 1, ..., d.
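The FFT ingredient can be illustrated in one dimension: a stationary covariance matrix is Toeplitz, so its matrix-vector product costs O(L log L) after circulant embedding. The exponential covariance and grid below are illustrative assumptions; the paper combines this trick with separability and low-rank ideas across d dimensions:

```python
import numpy as np

def toeplitz_matvec_fft(c, v):
    """Multiply a symmetric Toeplitz (stationary) covariance matrix,
    given by its first row c, with a vector v in O(L log L) by embedding
    it in a circulant matrix and using the FFT."""
    n = len(v)
    col = np.concatenate([c, c[-2:0:-1]])      # circulant first column, length 2n-2
    pad = np.concatenate([v, np.zeros(n - 2)]) # zero-pad v to the same length
    y = np.fft.ifft(np.fft.fft(col) * np.fft.fft(pad))
    return y[:n].real                          # first n entries = Toeplitz matvec

# exponential covariance on a regular grid, checked against the dense product
n = 50
c = np.exp(-np.arange(n) / 10.0)
v = np.sin(np.arange(n, dtype=float))
C = np.array([[c[abs(i - j)] for j in range(n)] for i in range(n)])
print(np.allclose(toeplitz_matvec_fft(c, v), C @ v))   # → True
```

The dense product here is only for verification; in practice the O(L²) matrix C is never formed.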
Low-rank coal research: Volume 2, Advanced research and technology development: Final report
Energy Technology Data Exchange (ETDEWEB)
Mann, M.D.; Swanson, M.L.; Benson, S.A.; Radonovich, L.; Steadman, E.N.; Sweeny, P.G.; McCollor, D.P.; Kleesattel, D.; Grow, D.; Falcone, S.K.
1987-04-01
Volume II contains articles on advanced combustion phenomena, combustion inorganic transformation; coal/char reactivity; liquefaction reactivity of low-rank coals, gasification ash and slag characterization, and fine particulate emissions. These articles have been entered individually into EDB and ERA. (LTN)
Depth Image Inpainting: Improving Low Rank Matrix Completion With Low Gradient Regularization
Xue, Hongyang; Zhang, Shengming; Cai, Deng
2017-09-01
We consider the case of inpainting single depth images. Without corresponding color images or previous and next frames, depth image inpainting is quite challenging. One natural solution is to regard the image as a matrix and adopt low rank regularization, as in color image inpainting. However, the low rank assumption does not make full use of the properties of depth images. A first observation might suggest penalizing all non-zero gradients by sparse gradient regularization. However, statistics show that although most pixels have zero gradients, a non-negligible fraction of pixels have gradients equal to 1. Based on this specific property of depth images, we propose a low gradient regularization method in which we reduce the penalty for gradient 1 while still penalizing other non-zero gradients, to allow for gradual depth changes. The proposed low gradient regularization is integrated with the low rank regularization into a low rank low gradient approach for depth image inpainting. We compare our proposed low gradient regularization with sparse gradient regularization. The experimental results show the effectiveness of our proposed approach.
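The difference between the two penalties above can be made concrete with a small sketch. The reduced cost eps for unit gradients is an assumed value, not the paper's calibrated weight:

```python
import numpy as np

def sparse_grad_penalty(g):
    """Classic sparse-gradient (l0-style) penalty: every non-zero gradient costs 1."""
    return int(np.count_nonzero(g))

def low_grad_penalty(g, eps=0.5):
    """Low-gradient variant sketched from the description above: gradients of
    magnitude 1 get a reduced cost eps (an assumed value), other non-zero
    gradients still cost 1, so gradual depth ramps stay cheap."""
    a = np.abs(np.asarray(g))
    return float(np.sum(np.where(a == 0, 0.0, np.where(a == 1, eps, 1.0))))

g = np.array([0, 0, 1, 1, 1, 5])   # a gradual ramp plus one sharp jump
print(sparse_grad_penalty(g))      # → 4
print(low_grad_penalty(g))         # → 2.5 (three unit steps at 0.5 each + 1 jump)
```

Under the sparse-gradient penalty the ramp is as expensive as three jumps; under the low-gradient penalty it is half as expensive, which is exactly the bias toward gradual depth changes motivated above.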
Lesieur, Thibault; Krzakala, Florent; Zdeborová, Lenka
2017-07-01
This article is an extended version of previous work of Lesieur et al (2015 IEEE Int. Symp. on Information Theory Proc. pp 1635-9 and 2015 53rd Annual Allerton Conf. on Communication, Control and Computing (IEEE) pp 680-7) on low-rank matrix estimation in the presence of constraints on the factors into which the matrix is factorized. Low-rank matrix factorization is one of the basic methods used in data analysis for unsupervised learning of relevant features and other types of dimensionality reduction. We present a framework to study the constrained low-rank matrix estimation for a general prior on the factors, and a general output channel through which the matrix is observed. We draw a parallel with the study of vector-spin glass models—presenting a unifying way to study a number of problems considered previously in separate statistical physics works. We present a number of applications for the problem in data analysis. We derive in detail a general form of the low-rank approximate message passing (Low-RAMP) algorithm, that is known in statistical physics as the TAP equations. We thus unify the derivation of the TAP equations for models as different as the Sherrington-Kirkpatrick model, the restricted Boltzmann machine, the Hopfield model or vector (xy, Heisenberg and other) spin glasses. The state evolution of the Low-RAMP algorithm is also derived, and is equivalent to the replica symmetric solution for the large class of vector-spin glass models. In the section devoted to results we study in detail phase diagrams and phase transitions for the Bayes-optimal inference in low-rank matrix estimation. We present a typology of phase transitions and their relation to the performance of algorithms such as the Low-RAMP or commonly used spectral methods.
Jiang, Tiefeng; Yang, Fan
2013-01-01
For random samples of size $n$ obtained from $p$-variate normal distributions, we consider the classical likelihood ratio tests (LRT) for their means and covariance matrices in the high-dimensional setting. These test statistics have been extensively studied in multivariate analysis, and their limiting distributions under the null hypothesis were proved to be chi-square distributions as $n$ goes to infinity and $p$ remains fixed. In this paper, we consider the high-dimensional case where both...
A low-rank approach to off-the-grid sparse deconvolution
Catala, Paul; Duval, Vincent; Peyré, Gabriel
2017-10-01
We propose a new solver for the sparse spikes deconvolution problem over the space of Radon measures. A common approach to off-the-grid deconvolution considers semidefinite programming (SDP) relaxations of the total variation (the total mass of the measure) minimization problem. The direct resolution of this SDP is however intractable for large scale settings, since the problem size grows as fc^(2d), where fc is the cutoff frequency of the filter. Our first contribution introduces a penalized formulation of this semidefinite lifting, which has low-rank solutions. Our second contribution is a conditional gradient optimization scheme with non-convex updates. This algorithm leverages both the low-rank and the convolutive structure of the problem, resulting in an O(fc^d log fc) complexity per iteration. Numerical simulations are promising and show that the algorithm converges in exactly k steps, k being the number of Diracs composing the solution.
Non-Convex Sparse and Low-Rank Based Robust Subspace Segmentation for Data Mining.
Cheng, Wenlong; Zhao, Mingbo; Xiong, Naixue; Chui, Kwok Tai
2017-07-15
Parsimony, including sparsity and low rank, has shown great importance for data mining in social networks, particularly in tasks such as segmentation and recognition. Traditionally, such modeling approaches rely on an iterative algorithm that minimizes an objective function with convex l₁-norm or nuclear norm constraints. However, the results obtained by convex optimization are usually suboptimal relative to solutions of the original sparse or low-rank problems. In this paper, a novel robust subspace segmentation algorithm is proposed by integrating lp-norm and Schatten p-norm constraints. The affinity graph so obtained can better capture the local geometric structure and global information of the data. As a consequence, our algorithm is more generative, discriminative and robust. An efficient linearized alternating direction method is derived to realize our model. Extensive segmentation experiments are conducted on public datasets. The proposed algorithm is shown to be more effective and robust than five existing algorithms.
Lyu, Jingyuan; Nakarmi, Ukash; Zhang, Chaoyi; Ying, Leslie
2016-05-01
This paper presents a new approach to highly accelerated dynamic parallel MRI using low-rank matrix completion and the partial separability (PS) model. In data acquisition, k-space data is moderately randomly undersampled at the central k-space navigator locations but highly undersampled in the outer k-space for each temporal frame. In reconstruction, the navigator data is recovered from the undersampled data using structured low-rank matrix completion. After all the unacquired navigator data is estimated, the partial separability model is used to obtain partial k-t data, and the parallel imaging method is then used to recover the entire dynamic image series from the highly undersampled data. The proposed method has been shown to achieve high-quality reconstructions with reduction factors up to 31 and a temporal resolution of 29 ms, where the conventional PS method fails.
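The matrix completion step can be illustrated with a minimal, unstructured sketch: alternate singular value thresholding with re-imposing the observed entries (a generic soft-impute-style loop, not the structured solver of the paper; the threshold and iteration count are assumed values):

```python
import numpy as np

def svt(Y, tau):
    """Singular value thresholding: the proximal operator of the nuclear norm."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    return U * np.maximum(s - tau, 0.0) @ Vt

def complete(X_obs, mask, tau=0.2, n_iter=300):
    """Alternate nuclear-norm shrinkage with re-imposing observed entries."""
    X = X_obs.copy()
    for _ in range(n_iter):
        X = svt(X, tau)          # push toward low rank
        X[mask] = X_obs[mask]    # keep acquired samples exact
    return X

# toy rank-1 "navigator" matrix with ~70% of entries observed
rng = np.random.default_rng(1)
M = np.outer(rng.standard_normal(20), rng.standard_normal(15))
mask = rng.random(M.shape) < 0.7
X = complete(np.where(mask, M, 0.0), mask)
```

On this toy problem the missing entries are filled in far more accurately than the zero-filled initialization, while the observed entries are preserved exactly.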
A Class of Weighted Low Rank Approximation of the Positive Semidefinite Hankel Matrix
Directory of Open Access Journals (Sweden)
Jianchao Bai
2015-01-01
We consider the weighted low rank approximation of the positive semidefinite Hankel matrix problem arising in signal processing. Using the Vandermonde representation, we first transform the problem into an unconstrained optimization problem and then use the nonlinear conjugate gradient algorithm with Armijo line search to solve the equivalent unconstrained problem. Numerical examples illustrate that the new method is feasible and effective.
Enhanced low-rank representation via sparse manifold adaption for semi-supervised learning.
Peng, Yong; Lu, Bao-Liang; Wang, Suhang
2015-05-01
Constructing an informative and discriminative graph plays an important role in various pattern recognition tasks such as clustering and classification. Among existing graph-based learning models, low-rank representation (LRR) is a very competitive one, which has been extensively employed in spectral clustering and semi-supervised learning (SSL). In SSL, the graph is composed of both labeled and unlabeled samples, where the edge weights are calculated from the LRR coefficients. However, most existing LRR-related approaches fail to consider the geometric structure of the data, which has been shown to be beneficial for discriminative tasks. In this paper, we propose an enhanced LRR via sparse manifold adaption, termed manifold low-rank representation (MLRR), to learn low-rank data representations. MLRR explicitly takes the local manifold structure of the data into consideration, which can be identified by the geometric sparsity idea; specifically, the local tangent space of each data point is sought by solving a sparse representation objective. Therefore, the graph depicting the relationships among data points can be built once the manifold information is obtained. We incorporate a regularizer into LRR to make the learned coefficients preserve the geometric constraints revealed in the data space. As a result, MLRR combines the global information emphasized by the low-rank property with the local information emphasized by the identified manifold structure. Extensive experimental results on semi-supervised classification tasks demonstrate that MLRR is an excellent method in comparison with several state-of-the-art graph construction approaches. Copyright © 2015 Elsevier Ltd. All rights reserved.
Jain, Riddhika
2013-01-01
This thesis pertains to the processing of ultra-fine mineral particles and low rank coal using the hydrophobic--hydrophilic separation (HHS) method. Several explorative experimental tests have been carried out to study the effect of the various physical and chemical parameters on the HHS process. In this study, the HHS process has been employed to upgrade a chalcopyrite ore. A systematic experimental study on the effects of various physical and chemical parameters such as particle size, re...
Nonlocal low-rank and sparse matrix decomposition for spectral CT reconstruction
Niu, Shanzhou; Yu, Gaohang; Ma, Jianhua; Wang, Jing
2018-02-01
Spectral computed tomography (CT) has been a promising technique in research and clinics because of its ability to produce improved energy resolution images with narrow energy bins. However, the narrow energy bin image is often affected by serious quantum noise because of the limited number of photons used in the corresponding energy bin. To address this problem, we present an iterative reconstruction method for spectral CT using nonlocal low-rank and sparse matrix decomposition (NLSMD), which exploits the self-similarity of patches that are collected in multi-energy images. Specifically, each set of patches can be decomposed into a low-rank component and a sparse component, and the low-rank component represents the stationary background over different energy bins, while the sparse component represents the rest of the different spectral features in individual energy bins. Subsequently, an effective alternating optimization algorithm was developed to minimize the associated objective function. To validate and evaluate the NLSMD method, qualitative and quantitative studies were conducted by using simulated and real spectral CT data. Experimental results show that the NLSMD method improves spectral CT images in terms of noise reduction, artifact suppression and resolution preservation.
Directory of Open Access Journals (Sweden)
Hongyang Lu
2016-06-01
Because of the contradiction between the spatial and temporal resolution of remote sensing images (RSI) and the quality loss in the process of acquisition, it is of great significance to reconstruct RSI in remote sensing applications. Recent studies have demonstrated that reference image-based reconstruction methods have great potential for higher reconstruction performance, while still lacking accuracy and quality of reconstruction. For this application, a new compressed sensing objective function incorporating a reference image as prior information is developed. We resort to the reference prior information inherent in interior and exterior data simultaneously to build a new generalized nonconvex low-rank approximation framework for RSI reconstruction. Specifically, the innovation of this paper consists of the following three respects: (1) we propose a nonconvex low-rank approximation for reconstructing RSI; (2) we inject reference prior information to overcome over-smoothed edges and texture detail losses; (3) on this basis, we combine conjugate gradient algorithms and singular value thresholding (SVT) to solve the proposed algorithm. The performance of the algorithm is evaluated both qualitatively and quantitatively. Experimental results demonstrate that the proposed algorithm improves the peak signal to noise ratio (PSNR) by several dBs and preserves image details significantly better than most current approaches without reference images as priors. In addition, the generalized nonconvex low-rank approximation of our approach is naturally robust to noise, and therefore the proposed algorithm can handle low-resolution, noisy inputs in a more unified framework.
Robust Alternating Low-Rank Representation by joint Lp- and L2,p-norm minimization.
Zhang, Zhao; Zhao, Mingbo; Li, Fanzhang; Zhang, Li; Yan, Shuicheng
2017-12-01
We propose a robust Alternating Low-Rank Representation (ALRR) model formed by an alternating forward-backward representation process. For the forward representation, ALRR first recovers the low-rank principal components and random corruptions by an adaptive local Robust PCA (RPCA). Then, ALRR performs a joint Lp-norm and L2,p-norm minimization (0 < p < 1), where the Lp-norm on the coefficients encourages sparse representation, while the L2,p-norm on the reconstruction error handles outlier pursuit. After that, ALRR returns the coefficients as adaptive weights to the local RPCA for updating the principal components and dictionary in the backward representation process. Thus, ALRR can be regarded as an integration of local RPCA with adaptive weights plus sparse LRR with a self-expressive low-rank dictionary. To enable ALRR to handle out-of-sample data efficiently, a projective ALRR that can extract features from data directly by embedding is also derived. To solve the L2,p-norm based minimization problem, a new iterative scheme based on the Iterative Shrinkage/Thresholding (IST) approach is presented. A relationship analysis with other related criteria shows that our methods are more general. Visual and numerical results demonstrate the effectiveness of our algorithms for representation. Copyright © 2017 Elsevier Ltd. All rights reserved.
OCT despeckling via weighted nuclear norm constrained non-local low-rank representation
Tang, Chang; Zheng, Xiao; Cao, Lijuan
2017-10-01
As a non-invasive imaging modality, optical coherence tomography (OCT) plays an important role in medical sciences. However, OCT images are always corrupted by speckle noise, which can mask image features and pose significant challenges for medical analysis. In this work, we propose an OCT despeckling method by using non-local, low-rank representation with weighted nuclear norm constraint. Unlike previous non-local low-rank representation based OCT despeckling methods, we first generate a guidance image to improve the non-local group patches selection quality, then a low-rank optimization model with a weighted nuclear norm constraint is formulated to process the selected group patches. The corrupted probability of each pixel is also integrated into the model as a weight to regularize the representation error term. Note that each single patch might belong to several groups, hence different estimates of each patch are aggregated to obtain its final despeckled result. Both qualitative and quantitative experimental results on real OCT images show the superior performance of the proposed method compared with other state-of-the-art speckle removal techniques.
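The weighted nuclear norm step at the core of such methods can be sketched as a per-singular-value shrinkage, where smaller weights on larger singular values preserve dominant structure. This is an illustrative operator under assumed weights, not the full despeckling pipeline:

```python
import numpy as np

def weighted_svt(Y, weights):
    """Shrink each singular value by its own weight and reconstruct
    (WNNM-style weighted singular value thresholding)."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    w = np.asarray(weights, dtype=float)[: len(s)]
    return U * np.maximum(s - w, 0.0) @ Vt

# singular values 5, 3, 1 shrunk by weights 0.5, 1, 2 -> 4.5, 2, 0:
# the dominant component is barely touched, the weakest is removed
Y = np.diag([5.0, 3.0, 1.0])
print(weighted_svt(Y, [0.5, 1.0, 2.0]))
```

With non-decreasing weights this shrinkage is no longer the exact proximal operator of a convex norm, but it is the practical update used in weighted nuclear norm minimization schemes.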
Color correction with blind image restoration based on multiple images using a low-rank model
Li, Dong; Xie, Xudong; Lam, Kin-Man
2014-03-01
We present a method that can handle the color correction of multiple photographs together with blind image restoration, simultaneously and automatically. We prove that the local colors of a set of images of the same scene exhibit the low-rank property locally both before and after a color-correction operation. This property allows us to correct all kinds of errors in an image under a low-rank matrix model without particular priors or assumptions. The possible errors may be caused by changes of viewpoint, large illumination variations, gross pixel corruptions, partial occlusions, etc. Furthermore, a new iterative soft-segmentation method is proposed for local color transfer using color influence maps. Because the correct color information and the spatial information of images can be recovered using the low-rank model, more precise color correction and many other image restoration tasks, including image denoising, image deblurring, and gray-scale image colorization, can be performed simultaneously. Experiments have verified that our method achieves consistent and promising results on uncontrolled real photographs acquired from the Internet and that it outperforms current state-of-the-art methods.
Task 27 -- Alaskan low-rank coal-water fuel demonstration project
Energy Technology Data Exchange (ETDEWEB)
NONE
1995-10-01
Development of coal-water-fuel (CWF) technology has to date been predicated on the use of high-rank bituminous coal only, and until now the high inherent moisture content of low-rank coal has precluded its use for CWF production. The unique feature of the Alaskan project is the integration of hot-water drying (HWD) into CWF technology as a beneficiation process. Hot-water drying is an EERC-developed technology, unavailable to the competition, that allows the range of CWF feedstocks to be extended to low-rank coals. The primary objective of the Alaskan project is to promote interest in the CWF marketplace by demonstrating the commercial viability of low-rank coal-water fuel (LRCWF). While commercialization plans cannot be finalized until the implementation and results of the Alaskan LRCWF project are known and evaluated, this report has been prepared to specifically address issues concerning business objectives for the project and to outline a market development plan for meeting those objectives.
Energy Technology Data Exchange (ETDEWEB)
Wiltsee, Jr., G. A.
1983-01-01
Progress reports are presented for the following tasks: (1) gasification wastewater treatment and reuse; (2) fine coal cleaning; (3) coal-water slurry preparation; (4) low-rank coal liquefaction; (5) combined flue gas cleanup/simultaneous SOx-NOx control; (6) particulate control and hydrocarbon and trace element emissions from low-rank coals; (7) waste characterization; (8) combustion research and ash fouling; (9) fluidized-bed combustion of low-rank coals; (10) ash and slag characterization; (11) organic structure of coal; (12) distribution of inorganics in low-rank coals; (13) physical properties and moisture of low-rank coals; (14) supercritical solvent extraction; and (15) pyrolysis and devolatilization.
Sparse High Dimensional Models in Economics.
Fan, Jianqing; Lv, Jinchi; Qi, Lei
2011-09-01
This paper reviews the literature on sparse high dimensional models and discusses some applications in economics and finance. Recent developments of theory, methods, and implementations in penalized least squares and penalized likelihood methods are highlighted. These variable selection methods are proved to be effective in high dimensional sparse modeling. The limits of dimensionality that regularization methods can handle, the role of penalty functions, and their statistical properties are detailed. Some recent advances in ultra-high dimensional sparse modeling are also briefly discussed.
The optimized expansion based low-rank method for wavefield extrapolation
Wu, Zedong
2014-03-01
Spectral methods are fast becoming an indispensable tool for wavefield extrapolation, especially in anisotropic media, because they tend to be dispersion- and artifact-free as well as highly accurate when solving the wave equation. However, for inhomogeneous media, we face difficulties in dealing with the mixed space-wavenumber domain extrapolation operator efficiently. To solve this problem, we evaluated an optimized expansion method that can approximate this operator with a low-rank variable separation representation. The rank defines the number of inverse Fourier transforms for each time extrapolation step; thus, the lower the rank, the faster the extrapolation. The method uses optimization instead of matrix decomposition to find the optimal wavenumbers and velocities needed to approximate the full operator with its explicit low-rank representation. As a result, we obtain lower-rank representations than the standard low-rank method within reasonable accuracy, and thus cheaper extrapolations. Additional bounds set on the range of propagated wavenumbers, to adhere to the physical wave limits, yield unconditionally stable extrapolations regardless of the time step. An application on the BP model provided superior results compared to those obtained using the decomposition approach. For transversely isotropic media, because we used the pure P-wave dispersion relation, we obtained solutions that were free of shear-wave artifacts, and the algorithm does not require that n > 0. In addition, the rank required by the optimization approach to obtain high accuracy in anisotropic media was lower than that of the decomposition approach, and thus it was more efficient. A reverse time migration result for the BP tilted transverse isotropy model using this method as a wave propagator demonstrated the ability of the algorithm.
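The premise of the abstract above, that the mixed space-wavenumber extrapolation operator admits an accurate low-rank separated representation, can be checked numerically. The sketch below (grid sizes, velocity range, and time step are illustrative assumptions, not the paper's optimization scheme) samples a phase-shift operator W(x, k) = cos(v(x) k dt) and shows that its singular values decay rapidly, so a handful of separated terms, i.e., inverse FFTs per time step, suffice.

```python
import numpy as np

# Sample the mixed-domain phase-shift operator on an illustrative grid.
dt = 0.002
v = np.linspace(1500.0, 4500.0, 120)   # smoothly varying velocity per space sample
k = np.linspace(0.0, 0.6, 100)         # wavenumber samples
W = np.cos(np.outer(v, k) * dt)        # W[x, k] = cos(v(x) * k * dt)

# Rapid singular value decay => a low-rank separated approximation is accurate.
s = np.linalg.svd(W, compute_uv=False)
```

A rank-r truncation of this SVD is exactly the kind of separated representation the rank defines in the abstract: r terms means r inverse Fourier transforms per extrapolation step.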
Aviles, Angelica I.; Widlak, Thomas; Casals, Alicia; Nillesen, Maartje M.; Ammari, Habib
2017-06-01
Cardiac motion estimation is an important diagnostic tool for detecting heart diseases, and it has been explored with modalities such as MRI and conventional ultrasound (US) sequences. US cardiac motion estimation still presents challenges because of complex motion patterns and the presence of noise. In this work, we propose a novel approach to estimate cardiac motion using ultrafast ultrasound data. Our solution is based on a variational formulation characterized by the L2-regularized class. Displacement is represented by a lattice of B-splines, and we ensure robustness, in the sense of eliminating outliers, by applying a maximum likelihood type estimator. While this is an important part of our solution, the main objective of this work is to combine low-rank data representation with topology preservation. Low-rank data representation (achieved by finding the k-dominant singular values of a Casorati matrix arranged from the data sequence) speeds up the global solution and achieves noise reduction. On the other hand, topology preservation (achieved by monitoring the Jacobian determinant) allows one to radically rule out distortions while carefully controlling the size of allowed expansions and contractions. Our variational approach is carried out on a realistic dataset as well as on a simulated one. We demonstrate how our proposed variational solution deals with complex deformations through careful numerical experiments. The low-rank constraint speeds up the convergence of the optimization problem, while topology preservation ensures a more accurate displacement. Beyond cardiac motion estimation, our approach is promising for the analysis of other organs that exhibit motion.
Effect of Water Invasion on Outburst Predictive Index of Low Rank Coals in Dalong Mine
Jiang, Jingyu; Cheng, Yuanping; Mou, Junhui; Jin, Kan; Cui, Jie
2015-01-01
To improve coal permeability and outburst prevention, coal seam water injection and a series of outburst prevention measures were tested in outburst coal mines. These methods have become important technologies for coal and gas outburst prevention and control, by increasing the external moisture of coal or decreasing the stress of the coal seam and changing the coal pore structure and gas desorption speed. In addition, these techniques have had a significant impact on the gas extraction and outburst prevention indicators of coal seams. Globally, low-rank coal reservoirs account for nearly half of hidden coal reserves, and the most obvious feature of low-rank coal is its high natural moisture content. Moisture restrains gas desorption and affects gas extraction and the accuracy of outburst prediction for coals. To study the influence of injected water on methane desorption dynamic characteristics and the outburst predictive index of coal, coal samples were collected from the Dalong Mine. The methane adsorption/desorption test was conducted on coal samples under conditions of different injected water contents. Selective analysis assessed the variations of the gas desorption quantities and the outburst prediction index (coal cutting desorption index). Adsorption tests indicated that the Langmuir volume of the Dalong coal sample is ~40.26 m3/t, indicating a strong gas adsorption ability. With the increase of injected water content, the gas desorption amount of the coal samples decreased under the same pressure and temperature. Higher moisture content lowered the accumulated desorption quantity after 120 minutes. The gas desorption volumes and moisture content conformed to a logarithmic relationship. After moisture correction, we obtained the long-flame coal outburst prediction (cutting desorption) index critical value. This value can provide a theoretical basis for outburst prediction and prevention in low-rank coal mines and similar occurrence conditions.
Two-Step Proximal Gradient Algorithm for Low-Rank Matrix Completion
Directory of Open Access Journals (Sweden)
Qiuyu Wang
2016-06-01
In this paper, we propose a two-step proximal gradient algorithm to solve nuclear norm regularized least squares for the purpose of recovering a low-rank data matrix from sampling of its entries. Each iteration generated by the proposed algorithm is a combination of the latest three points, namely, the previous point, the current iterate, and its proximal gradient point. This algorithm preserves the computational simplicity of the classical proximal gradient algorithm, in which a singular value decomposition is involved in the proximal operator. Global convergence follows directly from results in the literature. Numerical results are reported to show the efficiency of the algorithm.
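The base algorithm the abstract above builds on can be sketched compactly. Below is the classical one-step proximal gradient method for nuclear norm regularized least squares (the paper's two-step variant additionally combines the latest three points); the proximal operator is singular value soft-thresholding. Function names and parameter values are mine, for illustration only.

```python
import numpy as np

def svt(Y, tau):
    """Singular value soft-thresholding: the proximal operator of tau * ||.||_*."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def complete(M_obs, mask, lam=0.01, step=1.0, iters=500):
    """Proximal gradient for  min_X 0.5 * ||P_Omega(X - M)||_F^2 + lam * ||X||_*.

    M_obs holds the observed entries (zeros elsewhere); mask is 1 on Omega.
    """
    X = np.zeros_like(M_obs)
    for _ in range(iters):
        grad = mask * (X - M_obs)              # gradient of the smooth data term
        X = svt(X - step * grad, lam * step)   # prox step: one SVD per iteration
    return X
```

On a synthetic rank-1 matrix with 70% of entries observed, this recovers the missing entries to within a few percent, which is the behavior the paper's two-step scheme accelerates.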
Robust subspace estimation using low-rank optimization theory and applications
Oreifej, Omar
2014-01-01
Various fundamental applications in computer vision and machine learning require finding the basis of a certain subspace. Examples of such applications include face detection, motion estimation, and activity recognition. Increasing interest has recently been placed on this area as a result of significant advances in the mathematics of matrix rank optimization. Interestingly, robust subspace estimation can be posed as a low-rank optimization problem, which can be solved efficiently using techniques such as the method of Augmented Lagrange Multipliers. In this book, the authors discuss fundamental ...
Kriging accelerated by orders of magnitude: combining low-rank with FFT techniques
Litvinenko, Alexander
2014-01-08
Kriging algorithms based on FFT, the separability of certain covariance functions, and low-rank representations of covariance functions have been investigated. The current study combines these ideas, and so combines the individual speedup factors of all of them. For separable covariance functions, the results are exact, and non-separable covariance functions can be approximated through sums of separable components. The speedup factor is about 1e+8, with problem sizes of 1.5e+13 and 2e+15 estimation points for Kriging and spatial design, respectively.
Kriging accelerated by orders of magnitude: combining low-rank with FFT techniques
Litvinenko, Alexander
2014-01-06
Kriging algorithms based on FFT, the separability of certain covariance functions, and low-rank representations of covariance functions have been investigated. The current study combines these ideas, and so combines the individual speedup factors of all of them. The reduced computational complexity is O(d L log L), where L := max_i n_i, i = 1..d. For separable covariance functions, the results are exact, and non-separable covariance functions can be approximated through sums of separable components. The speedup factor is about 1e+8, with problem sizes of 1.5e+13 and 2e+15 estimation points for Kriging and spatial design, respectively.
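The computational leverage of separability claimed above comes from a standard identity: if a covariance matrix factors as a Kronecker product C = C1 ⊗ C2, a matrix-vector product never needs the full matrix. A minimal sketch (the helper name is mine; the actual algorithms additionally exploit FFTs and low-rank factors):

```python
import numpy as np

def kron_matvec(C1, C2, x):
    """Compute (C1 kron C2) @ x without forming the Kronecker product.

    Uses kron(C1, C2) @ vec(X) = vec(C1 @ X @ C2.T) for a row-major
    vec of X with shape (C1.shape[1], C2.shape[1]).
    Cost O(n1*n2*(n1+n2)) instead of O((n1*n2)^2) for the explicit product.
    """
    X = x.reshape(C1.shape[1], C2.shape[1])
    return (C1 @ X @ C2.T).ravel()
```

For a d-dimensional separable covariance this nesting repeats d times, which is where the O(d L log L) style complexity in the abstract originates once each factor matvec is itself FFT-accelerated.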
The role of IGCC technology in power generation using low-rank coal
Energy Technology Data Exchange (ETDEWEB)
Juangjandee, Pipat
2010-09-15
Based on basic test results on the gasification rate of Mae Moh lignite coal, it was found that an IDGCC power plant is the most suitable option for Mae Moh lignite. In conclusion, the future of an IDGCC power plant using low-rank coal from the Mae Moh mine would hinge on the strictness of future air pollution control regulations, including greenhouse gas emissions, and on the constraint of Thailand's foreign currency reserves needed to import fuels, in addition to economic considerations. If and when it is necessary to overcome these obstacles, IGCC is one viable alternative that power generation planners must consider.
Low-rank Quasi-Newton updates for Robust Jacobian lagging in Newton methods
Energy Technology Data Exchange (ETDEWEB)
Brown, J.; Brune, P. [Mathematics and Computer Science Division, Argonne National Laboratory, 9700 S. Cass Ave., Argonne, IL 60439 (United States)
2013-07-01
Newton-Krylov methods are standard tools for solving nonlinear problems. A common approach is to 'lag' the Jacobian when assembly or preconditioner setup is computationally expensive, in exchange for some degradation in the convergence rate and robustness. We show that this degradation may be partially mitigated by using the lagged Jacobian as an initial operator in a quasi-Newton method, which applies unassembled low-rank updates to the Jacobian until the next full reassembly. We demonstrate the effectiveness of this technique on problems in glaciology and elasticity. (authors)
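The technique described above, starting from a lagged Jacobian and applying rank-one quasi-Newton corrections instead of reassembling, can be sketched with Broyden's update. This toy version keeps the Jacobian dense for clarity (the paper applies the low-rank updates unassembled), and the test problem and starting guess are illustrative assumptions.

```python
import numpy as np

def broyden(F, x0, J0, iters=60, tol=1e-10):
    """Quasi-Newton iteration seeded with a lagged (outdated) Jacobian J0.

    Rather than reassembling the Jacobian each step, apply Broyden's
    rank-one secant update until the next full reassembly would occur.
    """
    x, J = x0.astype(float).copy(), J0.astype(float).copy()
    for _ in range(iters):
        f = F(x)
        if np.linalg.norm(f) < tol:
            break
        s = np.linalg.solve(J, -f)               # quasi-Newton step
        y = F(x + s) - f                         # secant pair (s, y)
        J += np.outer(y - J @ s, s) / (s @ s)    # Broyden rank-one update
        x = x + s
    return x
```

On a small nonlinear system with a deliberately inexact initial Jacobian, the rank-one updates restore fast convergence without any reassembly, which mirrors the mitigation the abstract reports.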
Multi-shot multi-channel diffusion data recovery using structured low-rank matrix completion
Mani, Merry; Kelley, Douglas; Magnotta, Vincent
2016-01-01
Purpose: To introduce a novel method for the recovery of multi-shot diffusion-weighted (MS-DW) images from echo-planar imaging (EPI) acquisitions. Methods: Current EPI-based MS-DW reconstruction methods rely on the explicit estimation of the motion-induced phase maps to recover the unaliased images. In the new formulation, the k-space data of the unaliased DWI are recovered using a structured low-rank matrix completion scheme, which does not require explicit estimation of the phase maps. The structured matrix is obtained as the lifting of the multi-shot data. The smooth phase modulations between shots manifest as null-space vectors of this matrix, which implies that the structured matrix is low-rank. The missing entries of the structured matrix are filled in using a nuclear-norm minimization algorithm subject to data consistency. The formulation enables the natural introduction of smoothness regularization, thus enabling implicit motion-compensated recovery of fully-sampled as well as under-sampled MS-DW ...
Accurate low-rank matrix recovery from a small number of linear measurements
Candes, Emmanuel J
2009-01-01
We consider the problem of recovering a low-rank matrix M from a small number of random linear measurements. A popular and useful example of this problem is matrix completion, in which the measurements reveal the values of a subset of the entries, and we wish to fill in the missing entries (this is the famous Netflix problem). When M is believed to have low rank, one would ideally try to recover M by finding the minimum-rank matrix that is consistent with the data; this is, however, problematic since this is a nonconvex problem that is, generally, intractable. Nuclear-norm minimization has been proposed as a tractable approach, and past papers have delved into the theoretical properties of nuclear-norm minimization algorithms, establishing conditions under which minimizing the nuclear norm yields the minimum rank solution. We review this spring of emerging literature and extend and refine previous theoretical results. Our focus is on providing error bounds when M is well approximated by a low-rank matrix, and ...
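The abstract's central contrast, rank minimization is nonconvex and intractable while the nuclear norm is a tractable convex surrogate, can be made concrete in a few lines. The helper below is a plain illustration, not the paper's method:

```python
import numpy as np

def nuclear_norm(A):
    """Sum of singular values of A: the convex surrogate for rank."""
    return float(np.linalg.svd(A, compute_uv=False).sum())

# Rank is not convex: the midpoint of two rank-1 matrices can have rank 2.
A = np.outer([1.0, 0.0], [1.0, 0.0])   # rank 1
B = np.outer([0.0, 1.0], [0.0, 1.0])   # rank 1
C = 0.5 * A + 0.5 * B                  # rank 2
```

The nuclear norm, by contrast, satisfies the convexity (triangle) inequality along the same segment, which is what lets nuclear-norm minimization be solved with convex optimization.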
Directory of Open Access Journals (Sweden)
Meiting Yu
2018-02-01
The extraction of a valuable set of features and the design of a discriminative classifier are crucial for target recognition in SAR images. Although various features and classifiers have been proposed over the years, target recognition under extended operating conditions (EOCs) is still a challenging problem, e.g., targets with configuration variation, different capture orientations, and articulation. To address these problems, this paper presents a new strategy for target recognition. We first propose a low-dimensional representation model by incorporating a multi-manifold regularization term into the low-rank matrix factorization framework. Two rules, pairwise similarity and local linearity, are employed for constructing the multiple manifold regularization. By alternately optimizing the matrix factorization and manifold selection, the feature representation model can not only acquire the optimal low-rank approximation of the original samples, but also capture the intrinsic manifold structure information. Then, to take full advantage of the local structure property of features and further improve the discriminative ability, local sparse representation is proposed for classification. Finally, extensive experiments on the moving and stationary target acquisition and recognition (MSTAR) database demonstrate the effectiveness of the proposed strategy, including target recognition under EOCs, as well as its performance with small training sizes.
Zhou, Xiaowei; Liu, Jiming; Wan, Xiang; Yu, Weichuan
2014-07-15
The post-genome era sees an urgent need for more novel approaches to extracting useful information from the huge amount of genetic data. The identification of recurrent copy number variations (CNVs) from array-based comparative genomic hybridization (aCGH) data can help in understanding complex diseases, such as cancer. Most previous computational methods focused on single-sample analysis or on statistical testing based on the results of single-sample analysis. Finding recurrent CNVs from multi-sample data remains a challenging topic worth further study. We present a general and robust method to identify recurrent CNVs from multi-sample aCGH profiles. We express the raw dataset as a matrix and demonstrate that recurrent CNVs form a low-rank matrix. Hence, we formulate the problem as a matrix recovery problem, where we aim to find a piecewise-constant and low-rank approximation (PLA) to the input matrix. We propose a convex formulation for matrix recovery and an efficient algorithm to globally solve the problem. We demonstrate the advantages of PLA compared with alternative methods using synthesized datasets and two breast cancer datasets. The experimental results show that PLA can successfully reconstruct the recurrent CNV patterns from raw data and achieve better performance than alternative methods under a wide range of scenarios. The MATLAB code is available at http://bioinformatics.ust.hk/pla.zip. © The Author 2014. Published by Oxford University Press. All rights reserved.
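The claim that recurrent CNVs across samples form a low-rank matrix can be illustrated with a toy simulation (the probe counts, aberration position, and noise level below are invented for illustration): a piecewise-constant aberration shared by all samples, with sample-specific amplitudes, yields a matrix whose leading singular value dominates.

```python
import numpy as np

rng = np.random.default_rng(2)
probes, samples = 200, 30

# A recurrent copy-number gain: the same piecewise-constant probe profile
# appears in every sample, scaled by a per-sample amplitude.
profile = np.zeros(probes)
profile[60:120] = 1.0
gains = 0.5 + rng.random(samples)

# Samples-by-probes matrix: rank-1 recurrent signal plus measurement noise.
M = np.outer(gains, profile) + 0.05 * rng.standard_normal((samples, probes))
s = np.linalg.svd(M, compute_uv=False)
```

The dominant singular value carries the shared aberration, which is exactly the structure PLA's low-rank term is designed to recover, while its piecewise-constant term captures the profile's shape along the genome.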
Acceleration of MR parameter mapping using annihilating filter-based low rank hankel matrix (ALOHA).
Lee, Dongwook; Jin, Kyong Hwan; Kim, Eung Yeop; Park, Sung-Hong; Ye, Jong Chul
2016-12-01
MR parameter mapping is a clinically valuable MR imaging technique. However, increased scan time makes it difficult for routine clinical use. This article aims at developing an accelerated MR parameter mapping technique using the annihilating filter based low-rank Hankel matrix approach (ALOHA). When a dynamic sequence can be sparsified using a spatial wavelet and temporal Fourier transform, the result is a rank-deficient Hankel structured matrix constructed from weighted k-t measurements. ALOHA then utilizes a low-rank matrix completion algorithm combined with a multiscale pyramidal decomposition to estimate the missing k-space data. Spin-echo inversion recovery and multi-echo spin-echo pulse sequences for T1 and T2 mapping, respectively, were redesigned to perform undersampling along the phase-encoding direction according to a Gaussian distribution. The missing k-space is reconstructed using ALOHA, and the parameter maps are then constructed using nonlinear regression. Experimental results confirmed that ALOHA outperformed the existing compressed sensing algorithms. Compared with the existing methods, the reconstruction errors appeared scattered throughout the entire images rather than exhibiting systematic distortion along the edges of the parameter maps. Given that many diagnostic errors are caused by systematic distortion of images, ALOHA may have great potential for clinical applications. Magn Reson Med 76:1848-1864, 2016. © 2016 International Society for Magnetic Resonance in Medicine.
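The rank deficiency ALOHA exploits can be demonstrated in one dimension (signal length, frequencies, and window size below are illustrative assumptions): a signal that is sparse in Fourier, here a sum of two complex exponentials, produces a Hankel "lifting" whose rank equals the sparsity level, so missing samples can be filled in by low-rank matrix completion.

```python
import numpy as np

def hankel_lift(x, p):
    """p x (len(x) - p + 1) Hankel (lifted) matrix of a 1-D signal."""
    n = len(x)
    return np.array([x[i:i + n - p + 1] for i in range(p)])

t = np.arange(64)
# 2-sparse in Fourier => the Hankel lifting has rank exactly 2.
x = np.exp(2j * np.pi * 0.11 * t) + 0.7 * np.exp(2j * np.pi * 0.31 * t)
s = np.linalg.svd(hankel_lift(x, 16), compute_uv=False)
```

In ALOHA the same lifting is applied to weighted k-t data and the annihilating filter plays the role of the null-space vectors of this matrix.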
Shape-Constrained Sparse and Low-Rank Decomposition for Auroral Substorm Detection.
Yang, Xi; Gao, Xinbo; Tao, Dacheng; Li, Xuelong; Han, Bing; Li, Jie
2016-01-01
An auroral substorm is an important geophysical phenomenon that reflects the interaction between the solar wind and the Earth's magnetosphere. Detecting substorms is of practical significance in order to prevent disruption to communication and global positioning systems. However, existing detection methods can be inaccurate or require time-consuming manual analysis and are therefore impractical for large-scale data sets. In this paper, we propose an automatic auroral substorm detection method based on a shape-constrained sparse and low-rank decomposition (SCSLD) framework. Our method automatically detects real substorm onsets in large-scale aurora sequences, which overcomes the limitations of manual detection. To reduce noise interference inherent in current SLD methods, we introduce a shape constraint to force the noise to be assigned to the low-rank part (stationary background), thus ensuring the accuracy of the sparse part (moving object) and improving the performance. Experiments conducted on aurora sequences in solar cycle 23 (1996-2008) show that the proposed SCSLD method achieves good performance for motion analysis of aurora sequences. Moreover, the obtained results are highly consistent with manual analysis, suggesting that the proposed automatic method is useful and effective in practice.
Hyperspectral Anomaly Detection Based on Low-Rank Representation and Learned Dictionary
Directory of Open Access Journals (Sweden)
Yubin Niu
2016-03-01
In this paper, a novel hyperspectral anomaly detector based on low-rank representation (LRR) and a learned dictionary (LD) is proposed. This method assumes that a two-dimensional matrix transformed from a three-dimensional hyperspectral image can be decomposed into two parts: a low-rank matrix representing the background and a sparse matrix standing for the anomalies. The direct application of the LRR model is sensitive to a tradeoff parameter that balances the two parts. To mitigate this problem, a learned dictionary is introduced into the decomposition process. The dictionary is learned from the whole image with a random selection process and can therefore be viewed as the spectra of the background only. The learned dictionary also reduces the computational cost. The statistical characteristics of the sparse matrix allow a basic anomaly detection method to be applied to obtain detection results. Experimental results demonstrate that, compared to other anomaly detection methods, the proposed method based on LRR and LD is robust and achieves satisfactory anomaly detection results.
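The background-plus-anomaly split described above is, in its simplest form, robust PCA: decompose M into a low-rank part L and a sparse part S. The sketch below implements the standard inexact augmented Lagrangian (IALM) iteration for robust PCA, a simpler relative of the dictionary-based LRR model in the abstract; all parameter choices are common defaults, not the paper's.

```python
import numpy as np

def rpca(M, lam=None, iters=300, tol=1e-7):
    """Split M into low-rank L plus sparse S:  min ||L||_* + lam*||S||_1  s.t. L+S=M."""
    m, n = M.shape
    lam = lam if lam is not None else 1.0 / np.sqrt(max(m, n))
    norm_M = np.linalg.norm(M)
    mu, rho = 1.25 / np.linalg.norm(M, 2), 1.2      # penalty and its growth rate
    Y = M / max(np.linalg.norm(M, 2), np.abs(M).max() / lam)  # dual initialization
    L, S = np.zeros_like(M), np.zeros_like(M)
    for _ in range(iters):
        # Low-rank step: singular value soft-thresholding (prox of nuclear norm).
        U, s, Vt = np.linalg.svd(M - S + Y / mu, full_matrices=False)
        L = (U * np.maximum(s - 1.0 / mu, 0.0)) @ Vt
        # Sparse step: entrywise soft-thresholding (prox of the l1 norm).
        T = M - L + Y / mu
        S = np.sign(T) * np.maximum(np.abs(T) - lam / mu, 0.0)
        Z = M - L - S
        Y += mu * Z
        mu *= rho
        if np.linalg.norm(Z) < tol * norm_M:
            break
    return L, S
```

In the hyperspectral setting, L would model the background spectra and the support of S flags anomalous pixels; the learned dictionary in the paper replaces the implicit background basis used here.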
Modeling of pseudoacoustic P-waves in orthorhombic media with a low-rank approximation
Song, Xiaolei
2013-06-04
Wavefield extrapolation in pseudoacoustic orthorhombic anisotropic media suffers from wave-mode coupling and stability limitations in the parameter range. We use the dispersion relation for scalar wave propagation in pseudoacoustic orthorhombic media to model acoustic wavefields. The wavenumber-domain application of the Laplacian operator allows us to propagate the P-waves exclusively, without imposing any conditions on the parameter range of stability. It also allows us to avoid dispersion artifacts commonly associated with evaluating the Laplacian operator in the space domain using practical finite-difference stencils. To handle the corresponding space-wavenumber mixed-domain operator, we apply the low-rank approximation approach. Considering the number of parameters necessary to describe orthorhombic anisotropy, the low-rank approach yields a space-wavenumber decomposition of the extrapolator operator that is dependent on space location regardless of the parameters, a feature necessary for orthorhombic anisotropy. Numerical experiments show that the proposed wavefield extrapolator is accurate and practically free of dispersion. Furthermore, there is no coupling of qSV and qP waves, because we use the analytical dispersion solution corresponding to the P-wave.
Directory of Open Access Journals (Sweden)
Rajive Ganguli
2012-01-01
The impact of the particle size distribution (PSD) of pulverized, low-rank, high-volatile-content Alaska coal on combustion-related power plant performance was studied in a series of field-scale tests. Performance was gauged through efficiency (ratio of megawatts generated to energy consumed as coal), emissions (SO2, NOx, CO), and carbon content of ash (fly ash and bottom ash). The study revealed that the tested coal could be burned at a grind as coarse as 50% passing 76 microns with no deleterious impact on power generation and emissions. The PSDs tested in this study were in the range of 41 to 81 percent passing 76 microns. There was negligible correlation between PSD and the following factors: efficiency, SO2, NOx, and CO. Additionally, two tests in which stack mercury (Hg) data were collected did not demonstrate any real difference in Hg emissions with PSD. The results from the field tests positively impact pulverized coal power plants that burn low-rank, high-volatile-content coals (such as Powder River Basin coal). These plants can potentially reduce in-plant load by grinding the coal less (without impacting plant emissions and efficiency) and thereby increase their marketability.
Studies of the relationship between mineral matter and grinding properties for low-rank coals
Energy Technology Data Exchange (ETDEWEB)
Ural, Suphi [Department of Mining Engineering, Cukurova University, 01330 Adana (Turkey); Akildiz, Mustafa [Department of Geological Engineering, Cukurova University, 01330, Adana (Turkey)
2004-10-22
Investigations into the effects of mineral matter content on Hardgrove Grindability Index (HGI) were carried out on some low-rank Turkish coals. Quantitative X-ray diffraction (XRD) analyses were carried out using an interactive data processing system (SIROQUANT(TM)) based on Rietveld interpretation methods. Selective leaching processes were used to determine the water and acid-soluble contents of coal samples. Among the coal seams tested, the HGI values of Elbistan coal samples presented a large range from 39 to 83, whereas Tufanbeyli coal samples ranged from 48 to 69. Treatment of the coal with water, ammonium acetate, and hydrochloric acid showed that a considerable part of the ash-forming inorganic matter occurs in water-soluble, acid-soluble, or ion-exchangeable form. Grindability tests on samples of varied water and acid-soluble content showed a significant effect of water and acid-soluble contents on HGI.
Fast Multipole Method as a Matrix-Free Hierarchical Low-Rank Approximation
Yokota, Rio
2018-01-03
There has been a large increase in the amount of work on hierarchical low-rank approximation methods, where the interest is shared by multiple communities that previously did not intersect. The objective of this article is twofold: to provide a thorough review of the recent advancements in this field from both analytical and algebraic perspectives, and to present a comparative benchmark of two highly optimized implementations of contrasting methods for some simple yet representative test cases. The first half of this paper has the form of a survey, to achieve the former objective. We categorize the recent advances in this field from the perspective of the compute-memory tradeoff, which has not previously been considered in much detail in this area. Benchmark tests reveal that there is a large difference in memory consumption and performance between the different methods.
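The property all of these hierarchical methods share, including the FMM, is that interaction blocks between well-separated clusters are numerically low rank. A minimal demonstration (cluster geometry and the 1/r kernel are illustrative choices): the singular values of such a block decay geometrically, so it compresses to a few rank-one terms.

```python
import numpy as np

# Interaction block of the 1/r kernel between two well-separated 1-D clusters.
x = np.linspace(0.0, 1.0, 50)     # source cluster
y = np.linspace(3.0, 4.0, 50)     # target cluster, separated from the sources
K = 1.0 / np.abs(x[:, None] - y[None, :])

# Geometric singular value decay: the block is numerically low rank.
s = np.linalg.svd(K, compute_uv=False)
```

Analytical methods (FMM) obtain this compression from kernel expansions, while algebraic methods (H-matrices and relatives) obtain it from truncated factorizations of exactly such blocks; the compute-memory tradeoff surveyed in the paper is about how and when each block is compressed.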
OXIDATION OF MERCURY ACROSS SCR CATALYSTS IN COAL-FIRED POWER PLANTS BURNING LOW RANK FUELS
Energy Technology Data Exchange (ETDEWEB)
Constance Senior
2004-04-30
This is the fifth Quarterly Technical Report for DOE Cooperative Agreement No: DE-FC26-03NT41728. The objective of this program is to measure the oxidation of mercury in flue gas across SCR catalyst in a coal-fired power plant burning low rank fuels using a slipstream reactor containing multiple commercial catalysts in parallel. The Electric Power Research Institute (EPRI) and Argillon GmbH are providing co-funding for this program. This program contains multiple tasks and good progress is being made on all fronts. During this quarter, the available data from laboratory, pilot and full-scale SCR units was reviewed, leading to hypotheses about the mechanism for mercury oxidation by SCR catalysts.
OXIDATION OF MERCURY ACROSS SCR CATALYSTS IN COAL-FIRED POWER PLANTS BURNING LOW RANK FUELS
Energy Technology Data Exchange (ETDEWEB)
Constance Senior
2004-10-29
This is the seventh Quarterly Technical Report for DOE Cooperative Agreement No: DE-FC26-03NT41728. The objective of this program is to measure the oxidation of mercury in flue gas across SCR catalyst in a coal-fired power plant burning low rank fuels using a slipstream reactor containing multiple commercial catalysts in parallel. The Electric Power Research Institute (EPRI) and Argillon GmbH are providing co-funding for this program. This program contains multiple tasks and good progress is being made on all fronts. During this quarter, a model of Hg oxidation across SCRs was formulated based on full-scale data. The model took into account the effects of temperature, space velocity, catalyst type and HCl concentration in the flue gas.
Image Registration based on Low Rank Matrix: Rank-Regularized SSD.
Ghaffari, Aboozar; Fatemizadeh, Emad
2017-08-25
The similarity measure is the core of an image registration algorithm. Spatially varying intensity distortion is an important challenge that affects the performance of similarity measures. Correlation among pixels is the main characteristic of this distortion. Similarity measures such as sum-of-squared-differences (SSD) and mutual information (MI) ignore this correlation; hence, perfect registration cannot be achieved in the presence of this distortion. In this paper, we model this correlation with the aid of low-rank matrix theory. Based on this model, we compensate for this distortion analytically and introduce Rank-Regularized SSD (RRSSD). This new similarity measure is a modified SSD based on the singular values of the difference image in mono-modal imaging. In fact, image registration and distortion correction are performed simultaneously in the proposed model. Based on our experiments, the RRSSD similarity measure achieves clinically acceptable registration results and outperforms other state-of-the-art similarity measures, such as the well-known method of residual complexity.
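One way to read "a modified SSD based on the singular values of the difference image" is sketched below: discount the dominant singular components of the difference image, which absorb smooth, spatially varying intensity distortion (well modeled as low rank), and penalize only the remainder. This is a hedged illustration of the idea; the exact weighting used in the paper may differ.

```python
import numpy as np

def rrssd_sketch(I, J, k=1):
    """SSD on the residual left after removing the k dominant singular
    components of the difference image I - J (hypothetical variant)."""
    s = np.linalg.svd(I - J, compute_uv=False)
    return float(np.sum(s[k:] ** 2))
```

For two images identical up to a rank-1 multiplicative-style bias field, plain SSD is large while this rank-discounted measure is essentially zero, which is why such a measure can register images despite the distortion.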
Solving block linear systems with low-rank off-diagonal blocks is easily parallelizable
Energy Technology Data Exchange (ETDEWEB)
Menkov, V. [Indiana Univ., Bloomington, IN (United States)
1996-12-31
An easily and efficiently parallelizable direct method is given for solving a block linear system Bx = y, where B = D + Q is the sum of a non-singular block diagonal matrix D and a matrix Q with low-rank blocks. This implicitly defines a new preconditioning method with an operation count close to the cost of calculating a matrix-vector product Qw for some w, plus at most twice the cost of calculating Qw for some w. When implemented on a parallel machine the processor utilization can be as good as that of those operations. Order estimates are given for the general case, and an implementation is compared to block SSOR preconditioning.
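The structure the abstract describes can be sketched with the Woodbury identity. The toy below assumes a plain diagonal D (rather than block-diagonal) and an explicit low-rank factor Q = U Vᵀ, which are simplifications of the paper's setting; the point is that every D⁻¹ application is independent per block, which is what makes the method easy to parallelize.

```python
import numpy as np

def solve_diag_plus_lowrank(d, U, V, y):
    """Solve (D + U V^T) x = y with D = diag(d) nonsingular and U, V of
    shape (n, r), r << n, via the Woodbury identity:
      x = D^{-1}y - D^{-1}U (I + V^T D^{-1} U)^{-1} V^T D^{-1} y.
    Each D^{-1} application is embarrassingly parallel across blocks."""
    Dinv_y = y / d
    Dinv_U = U / d[:, None]
    core = np.eye(U.shape[1]) + V.T @ Dinv_U    # small r-by-r system
    return Dinv_y - Dinv_U @ np.linalg.solve(core, V.T @ Dinv_y)

rng = np.random.default_rng(1)
n, r = 50, 3
d = rng.uniform(1, 2, n)
U, V = rng.standard_normal((n, r)), rng.standard_normal((n, r))
y = rng.standard_normal(n)
x = solve_diag_plus_lowrank(d, U, V, y)
B = np.diag(d) + U @ V.T
print(np.allclose(B @ x, y))   # True
```

Only the small r-by-r core system is solved serially; everything else is vectorized (or distributed) work.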
Energy Technology Data Exchange (ETDEWEB)
Elbeyli, I.Y.; Palantoken, A.; Piskin, S.; Kuzu, H.; Peksel, A. [Yildiz Technical University, Istanbul (Turkey). Dept. of Chemical Engineering
2006-08-15
Microbial coal liquefaction/solubilization of three low-rank Turkish coals (Bursa-Kestelek, Kutahya-Seyitomer and Mugla-Yatagan lignite) was attempted by using a white-rot fungus (Phanerochaete chrysosporium DSM No. 6909); chemical compositions of the products were investigated. The lignite samples were oxidized by nitric acid under moderate conditions and then oxidized samples were placed on the agar medium of Phanerochaete chrysosporium. FTIR spectra of raw lignites, oxidized lignites and liquid products were recorded, and the acetone-soluble fractions of these samples were identified by GC-MS technique. Results show that the fungus affects the nitro and carboxyl/carbonyl groups in oxidized lignite sample, the liquid products obtained by microbial effects are the mixture of water-soluble compounds, and show limited organic solubility.
Quantifying Photonic High-Dimensional Entanglement
Martin, Anthony; Guerreiro, Thiago; Tiranov, Alexey; Designolle, Sébastien; Fröwis, Florian; Brunner, Nicolas; Huber, Marcus; Gisin, Nicolas
2017-03-01
High-dimensional entanglement offers promising perspectives in quantum information science. In practice, however, the main challenge is to devise efficient methods to characterize high-dimensional entanglement, based on the available experimental data which is usually rather limited. Here we report the characterization and certification of high-dimensional entanglement in photon pairs, encoded in temporal modes. Building upon recently developed theoretical methods, we certify an entanglement of formation of 2.09(7) ebits in a time-bin implementation, and 4.1(1) ebits in an energy-time implementation. These results are based on very limited sets of local measurements, which illustrates the practical relevance of these methods.
Extracellular oxidases and the transformation of solubilised low-rank coal by wood-rot fungi
Energy Technology Data Exchange (ETDEWEB)
Ralph, J.P. [Flinders Univ. of South Australia, Bedford Park (Australia). School of Biological Sciences; Graham, L.A. [Flinders Univ. of South Australia, Bedford Park (Australia). School of Biological Sciences; Catcheside, D.E.A. [Flinders Univ. of South Australia, Bedford Park (Australia). School of Biological Sciences
1996-12-31
The involvement of extracellular oxidases in biotransformation of low-rank coal was assessed by correlating the ability of nine white-rot and brown-rot fungi to alter macromolecular material in alkali-solubilised brown coal with the spectrum of oxidases they produce when grown on low-nitrogen medium. The coal fraction used was that soluble at 3.0 ≤ pH ≤ 6.0 (SWC6 coal). In 15-ml cultures, Gloeophyllum trabeum, Lentinus lepideus and Trametes versicolor produced little or no lignin peroxidase, manganese (Mn) peroxidase or laccase activity and caused no change to SWC6 coal. Ganoderma applanatum and Pycnoporus cinnabarinus also produced no detectable lignin or Mn peroxidases or laccase yet increased the absorbance at 400 nm of SWC6 coal. G. applanatum, which produced veratryl alcohol oxidase, also increased the modal apparent molecular mass. SWC6 coal exposed to Merulius tremellosus and Perenniporia tephropora, which secreted Mn peroxidases and laccase, and Phanerochaete chrysosporium, which produced Mn and lignin peroxidases, was polymerised but had unchanged or decreased absorbance. In the case of both P. chrysosporium and M. tremellosus, polymerisation of SWC6 coal was most extensive, leading to the formation of a complex insoluble in 100 mM NaOH. Rigidoporus ulmarius, which produced only laccase, both polymerised and reduced the A₄₀₀ of SWC6 coal. P. chrysosporium, M. tremellosus and P. tephropora grown in 10-ml cultures produced a spectrum of oxidases similar to that in 15-ml cultures but, in each case, caused more extensive loss of A₄₀₀, and P. chrysosporium depolymerised SWC6 coal. It is concluded that the extracellular oxidases of white-rot fungi can transform low-rank coal macromolecules and that increased oxygen availability in the shallower 10-ml cultures favours catabolism over polymerisation. (orig.)
Asymptotically Honest Confidence Regions for High Dimensional
DEFF Research Database (Denmark)
Caner, Mehmet; Kock, Anders Bredahl
While variable selection and oracle inequalities for the estimation and prediction error have received considerable attention in the literature on high-dimensional models, very little work has been done in the area of testing and construction of confidence bands in high-dimensional models. However […] of the asymptotic covariance matrix of an increasing number of parameters which is robust against conditional heteroskedasticity. To our knowledge we are the first to do so. Next, we show that our confidence bands are honest over sparse high-dimensional subvectors of the parameter space and that they contract […] at the optimal rate. All our results are valid in high-dimensional models. Our simulations reveal that the desparsified conservative Lasso estimates the parameters much more precisely than the desparsified Lasso, has much better size properties and produces confidence bands with markedly superior coverage rates.
Low rank factorization of the Coulomb integrals for periodic coupled cluster theory
Hummel, Felix; Grüneis, Andreas
2016-01-01
We study the decomposition of the Coulomb integrals of periodic systems into a tensor contraction of six matrices of which only two are distinct. We find that the Coulomb integrals can be well approximated in this form already with small matrices compared to the number of real space grid points. The cost of computing the matrices scales as O(N^4) using a regularized form of the alternating least squares algorithm. The studied factorization of the Coulomb integrals can be exploited to reduce the scaling of the computational cost of expensive tensor contractions appearing in the amplitude equations of coupled cluster methods with respect to system size. We apply the developed methodologies to calculate the adsorption energy of a single water molecule on a hexagonal boron nitride monolayer in a plane wave basis set and periodic boundary conditions.
Low rank factorization of the Coulomb integrals for periodic coupled cluster theory
Hummel, Felix; Tsatsoulis, Theodoros; Grüneis, Andreas
2017-03-01
We study a tensor hypercontraction decomposition of the Coulomb integrals of periodic systems where the integrals are factorized into a contraction of six matrices of which only two are distinct. We find that the Coulomb integrals can be well approximated in this form already with small matrices compared to the number of real space grid points. The cost of computing the matrices scales as O(N^4) using a regularized form of the alternating least squares algorithm. The studied factorization of the Coulomb integrals can be exploited to reduce the scaling of the computational cost of expensive tensor contractions appearing in the amplitude equations of coupled cluster methods with respect to system size. We apply the developed methodologies to calculate the adsorption energy of a single water molecule on a hexagonal boron nitride monolayer in a plane wave basis set and periodic boundary conditions.
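The regularized alternating least squares step used in these factorizations can be sketched in miniature for an ordinary matrix (a toy stand-in for the Coulomb-integral tensor); each half-step is a ridge-regularized linear least-squares solve with the other factor held fixed.

```python
import numpy as np

def regularized_als(M, rank, lam=1e-3, iters=50):
    """Low-rank factorization M ~ A @ B.T by regularized alternating
    least squares: fix one factor, solve a ridge-regularized
    least-squares problem for the other, and alternate."""
    rng = np.random.default_rng(0)
    n, m = M.shape
    A = rng.standard_normal((n, rank))
    B = rng.standard_normal((m, rank))
    I = lam * np.eye(rank)
    for _ in range(iters):
        A = M @ B @ np.linalg.inv(B.T @ B + I)     # fix B, solve for A
        B = M.T @ A @ np.linalg.inv(A.T @ A + I)   # fix A, solve for B
    return A, B

M = np.outer(np.arange(1, 7), np.arange(1, 5)).astype(float)  # rank-1 target
A, B = regularized_als(M, rank=1)
print(np.allclose(A @ B.T, M, atol=1e-2))   # True
```

The regularization term `lam` stabilizes the small rank-by-rank solves, at the price of a slight shrinkage of the recovered singular values.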
Mehta, Madan Lal
1990-01-01
Since the publication of Random Matrices (Academic Press, 1967) so many new results have emerged both in theory and in applications, that this edition is almost completely revised to reflect the developments. For example, the theory of matrices with quaternion elements was developed to compute certain multiple integrals, and the inverse scattering theory was used to derive asymptotic results. The discovery of Selberg's 1944 paper on a multiple integral also gave rise to hundreds of recent publications. This book presents a coherent and detailed analytical treatment of random matrices, leading
Weighted low-rank sparse model via nuclear norm minimization for bearing fault detection
Du, Zhaohui; Chen, Xuefeng; Zhang, Han; Yang, Boyuan; Zhai, Zhi; Yan, Ruqiang
2017-07-01
It is a fundamental task in the machine fault diagnosis community to detect impulsive signatures generated by localized faults of bearings. The main goal of this paper is to exploit the low-rank physical structure of periodic impulsive features and establish a weighted low-rank sparse model for bearing fault detection. The proposed model consists of three basic components: an adaptive partition window, a nuclear norm regularization and a weighted sequence. Firstly, owing to the periodic repetition of the impulsive feature, an adaptive partition window can be designed to transform the impulsive feature into a data matrix. The highlight of the partition window is that it accumulates all local feature information and aligns it. Then all columns of the data matrix share similar waveforms and a core physical phenomenon arises: the singular values of the data matrix exhibit a sparse distribution pattern. Therefore, a nuclear norm regularization is enforced to capture that sparse prior. However, the nuclear norm regularization treats all singular values equally and thus ignores a basic fact: larger singular values carry more information about the impulsive features and should be preserved as much as possible. Therefore, a weighted sequence with adaptively tuned weights inversely proportional to singular amplitude is adopted to guarantee the distribution consistency of large singular values. On the other hand, the proposed model is difficult to solve due to its non-convexity, and thus a new algorithm is developed to search for a satisfying stationary solution by alternately applying a proximal operator step and a least-squares fitting step. Moreover, the sensitivity and selection principles of the algorithmic parameters are comprehensively investigated through a set of numerical experiments, which show that the proposed method is robust and has only a few adjustable parameters. Lastly, the proposed model is applied to the
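A toy version of the weighted shrinkage step might look like the following; weights inversely proportional to singular amplitude mean the large, information-bearing singular values are barely shrunk while small (noise) ones are driven to zero. The data matrix here mimics the partition-window output: columns sharing one waveform. This is a single-shot sketch, not the paper's full alternating algorithm.

```python
import numpy as np

def weighted_svt(Y, tau, eps=1e-6):
    """Weighted singular-value thresholding: shrink singular value s by
    tau/(s+eps), so large values are nearly preserved and small ones vanish."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    w = 1.0 / (s + eps)                       # weights inversely proportional to amplitude
    return U @ np.diag(np.maximum(s - tau * w, 0.0)) @ Vt

# columns share one impulsive waveform -> strongly low-rank data matrix
t = np.linspace(0, 1, 64)
pulse = np.exp(-40 * (t - 0.3) ** 2)
Y = np.outer(pulse, np.ones(8)) + 0.05 * np.random.default_rng(2).standard_normal((64, 8))
D = weighted_svt(Y, tau=0.5)
print(np.linalg.matrix_rank(D))   # 1: noise singular values eliminated
```

The dominant singular value (carrying the aligned impulses) survives essentially intact; everything at the noise level falls below the adaptive threshold.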
Energy Technology Data Exchange (ETDEWEB)
Akcelik, Volkan [ORNL; Flath, Pearl [University of Texas, Austin; Ghattas, Omar [University of Texas, Austin; Hill, Judith C [ORNL; Van Bloemen Waanders, Bart [Sandia National Laboratories (SNL); Wilcox, Lucas [University of Texas, Austin
2011-01-01
We consider the problem of estimating the uncertainty in large-scale linear statistical inverse problems with high-dimensional parameter spaces within the framework of Bayesian inference. When the noise and prior probability densities are Gaussian, the solution to the inverse problem is also Gaussian, and is thus characterized by the mean and covariance matrix of the posterior probability density. Unfortunately, explicitly computing the posterior covariance matrix requires as many forward solutions as there are parameters, and is thus prohibitive when the forward problem is expensive and the parameter dimension is large. However, for many ill-posed inverse problems, the Hessian matrix of the data misfit term has a spectrum that collapses rapidly to zero. We present a fast method for computation of an approximation to the posterior covariance that exploits the low-rank structure of the preconditioned (by the prior covariance) Hessian of the data misfit. Analysis of an infinite-dimensional model convection-diffusion problem, and numerical experiments on large-scale 3D convection-diffusion inverse problems with up to 1.5 million parameters, demonstrate that the number of forward PDE solves required for an accurate low-rank approximation is independent of the problem dimension. This permits scalable estimation of the uncertainty in large-scale ill-posed linear inverse problems at a small multiple (independent of the problem dimension) of the cost of solving the forward problem.
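Under a unit prior covariance, the construction reduces to a low-rank update of the identity built from the dominant eigenpairs of the data-misfit Hessian: (I + H)⁻¹ = I − V diag(λᵢ/(1+λᵢ)) Vᵀ. The dense sketch below illustrates that identity; the actual method is matrix-free (Lanczos/randomized eigensolvers acting through forward PDE solves), never forming H explicitly.

```python
import numpy as np

rng = np.random.default_rng(3)
n, r = 200, 5
# data-misfit Hessian with a rapidly collapsing spectrum (rank ~ r)
G = rng.standard_normal((n, r)) * np.array([10, 5, 2, 1, 0.5])
H = G @ G.T

lam, V = np.linalg.eigh(H)
lam, V = lam[::-1][:r], V[:, ::-1][:, :r]          # r dominant eigenpairs
# posterior covariance under a unit prior: (I + H)^{-1} via low-rank update
post_lr = np.eye(n) - V @ np.diag(lam / (1 + lam)) @ V.T
post_exact = np.linalg.inv(np.eye(n) + H)
print(np.allclose(post_lr, post_exact, atol=1e-6))   # True: rank-r update suffices
```

Only r eigenpairs are needed regardless of n, which is exactly the dimension-independence the abstract reports.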
Yano, Ken; Suyama, Takayuki
2016-01-01
This paper proposes a novel fixed low-rank spatial filter estimation for brain computer interface (BCI) systems, with an application to recognizing emotions elicited by movies. The proposed approach unifies tasks such as feature extraction, feature selection, and classification, which are often tackled independently in a "bottom-up" manner, under a regularized loss minimization problem. The loss function is explicitly derived from the conventional BCI approach and its minimization is solved by optimization with a nonconvex fixed low-rank constraint. For evaluation, an experiment was conducted in which emotions were induced by movies for dozens of young adult subjects, and the emotional states were estimated using the proposed method. The advantage of the proposed method is that it combines feature selection, feature extraction, and classification into a monolithic optimization problem with a fixed low-rank regularization, which implicitly estimates optimal spatial filters. The proposed method shows competitive performance against the best CSP-based alternatives.
Energy Technology Data Exchange (ETDEWEB)
Gauntt, R. O.; DeOtte, R. E.; Slowey, J. F.; McFarland, A. R.
1984-01-01
In parallel with pursuing the goal of increased utilization of low-rank solid fuels, the US Department of Energy is investigating various aspects associated with the disposal of coal-combustion solid wastes. Concern has been expressed relative to the potential hazards presented by leachates from fly ash, bottom ash and scrubber wastes. This is of particular interest in some regions where disposal areas overlap aquifer recharge regions. The western regions of the United States are characterized by relatively dry alkaline soils which may effect substantial attenuation of contaminants in the leachates thereby reducing the pollution potential. A project has been initiated to study the contaminant uptake of western soils. This effort consists of two phases: (1) preparation of a state-of-the-art document on soil attenuation; and (2) laboratory experimental studies to characterize attenuation of a western soil. The state-of-the-art document, represented herein, presents the results of studies on the characteristics of selected wastes, reviews the suggested models which account for the uptake, discusses the specialized columnar laboratory studies on the interaction of leachates and soils, and gives an overview of characteristics of Texas and Wyoming soils. 116 references, 10 figures, 29 tables.
L1-norm low-rank matrix factorization by variational Bayesian method.
Zhao, Qian; Meng, Deyu; Xu, Zongben; Zuo, Wangmeng; Yan, Yan
2015-04-01
The L1-norm low-rank matrix factorization (LRMF) has been attracting much attention due to its wide applications to computer vision and pattern recognition. In this paper, we construct a new hierarchical Bayesian generative model for the L1-norm LRMF problem and design a mean-field variational method to automatically infer all the parameters involved in the model by closed-form equations. The variational Bayesian inference in the proposed method can be understood as solving a weighted LRMF problem with different weights on matrix elements based on their significance and with L2-regularization penalties on parameters. Throughout the inference process of our method, the weights imposed on the matrix elements can be adaptively fitted so that the adverse influence of noises and outliers embedded in data can be largely suppressed, and the parameters can be appropriately regularized so that the generalization capability of the problem can be statistically guaranteed. The robustness and the efficiency of the proposed method are substantiated by a series of synthetic and real data experiments, as compared with the state-of-the-art L1-norm LRMF methods. Especially, attributed to the intrinsic generalization capability of the Bayesian methodology, our method can always predict better on the unobserved ground truth data than existing methods.
Efficient anisotropic quasi-P wavefield extrapolation using an isotropic low-rank approximation
Zhang, Zhendong
2017-12-17
The computational cost of quasi-P wave extrapolation depends on the complexity of the medium, and specifically the anisotropy. Our effective-model method splits the anisotropic dispersion relation into an isotropic background and a correction factor to handle this dependency. The correction term depends on the slope (measured using the gradient) of current wavefields and the anisotropy. As a result, the computational cost is independent of the nature of anisotropy, which makes the extrapolation efficient. A dynamic implementation of this approach decomposes the original pseudo-differential operator into a Laplacian, handled using the low-rank approximation of the spectral operator, plus an angular dependent correction factor applied in the space domain to correct for anisotropy. We analyze the role played by the correction factor and propose a new spherical decomposition of the dispersion relation. The proposed method provides accurate wavefields in phase and more balanced amplitudes than a previous spherical decomposition. Also, it is free of SV-wave artifacts. Applications to a simple homogeneous transverse isotropic medium with a vertical symmetry axis (VTI) and a modified Hess VTI model demonstrate the effectiveness of the approach. The Reverse Time Migration (RTM) applied to a modified BP VTI model reveals that the anisotropic migration using the proposed modeling engine performs better than an isotropic migration.
OXIDATION OF MERCURY ACROSS SCR CATALYSTS IN COAL-FIRED POWER PLANTS BURNING LOW RANK FUELS
Energy Technology Data Exchange (ETDEWEB)
Constance Senior
2004-07-30
This is the sixth Quarterly Technical Report for DOE Cooperative Agreement No: DE-FC26-03NT41728. The objective of this program is to measure the oxidation of mercury in flue gas across SCR catalyst in a coal-fired power plant burning low rank fuels using a slipstream reactor containing multiple commercial catalysts in parallel. The Electric Power Research Institute (EPRI) and Argillon GmbH are providing co-funding for this program. This program contains multiple tasks and good progress is being made on all fronts. During this quarter, a review of the available data on mercury oxidation across SCR catalysts from small, laboratory-scale experiments, pilot-scale slipstream reactors and full-scale power plants was carried out. Data from small-scale reactors obtained with both simulated flue gas and actual coal combustion flue gas demonstrated the importance of temperature, ammonia, space velocity and chlorine on mercury oxidation across SCR catalyst. SCR catalysts are, under certain circumstances, capable of driving mercury speciation toward the gas-phase equilibrium values at SCR temperatures. Evidence suggests that mercury does not always reach equilibrium at the outlet. There may be other factors that become apparent as more data become available.
Conflict-cost based random sampling design for parallel MRI with low rank constraints
Kim, Wan; Zhou, Yihang; Lyu, Jingyuan; Ying, Leslie
2015-05-01
In compressed sensing MRI, it is very important to design the sampling pattern for random sampling. For example, SAKE (simultaneous auto-calibrating and k-space estimation) is a parallel MRI reconstruction method using random undersampling. It formulates image reconstruction as a structured low-rank matrix completion problem. Variable-density (VD) Poisson discs are typically adopted for 2D random sampling. The basic concept of Poisson disc generation is to guarantee that samples are neither too close to nor too far away from each other. However, it is difficult to meet such a condition, especially in the high-density region, and the sampling becomes inefficient. In this paper, we present an improved random sampling pattern for SAKE reconstruction. The pattern is generated based on a conflict cost with a probability model. The conflict cost measures how many previously assigned samples lie around a target location, while the probability model adopts the generalized Gaussian distribution, which includes uniform and Gaussian-like distributions as special cases. Our method preferentially assigns a sample to the k-space location with the least conflict cost on the circle of highest probability. To evaluate the effectiveness of the proposed random pattern, we compare the performance of SAKE reconstructions using both VD Poisson discs and the proposed pattern. Experimental results for brain data show that the proposed pattern yields lower normalized mean square error (NMSE) than VD Poisson discs.
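A greatly simplified sketch of the conflict-cost idea follows. It replaces the paper's generalized-Gaussian model and circle-of-highest-probability search with plain Gaussian-density candidate draws, and scores each candidate by how many samples already sit in its neighbourhood; all parameter values are illustrative.

```python
import numpy as np

def conflict_cost_sampling(n, m, n_samples, sigma=16.0, radius=2, seed=0):
    """Greedy sketch: draw candidate k-space locations from a Gaussian-like
    variable-density model and keep, at each step, the candidate whose
    neighbourhood holds the fewest already-assigned samples ("conflict cost")."""
    rng = np.random.default_rng(seed)
    mask = np.zeros((n, m), dtype=bool)
    yy, xx = np.mgrid[:n, :m]
    prob = np.exp(-((yy - n / 2) ** 2 + (xx - m / 2) ** 2) / (2 * sigma ** 2)).ravel()
    idx = np.arange(n * m)
    for _ in range(n_samples):
        p = np.where(mask.ravel(), 0.0, prob)          # never re-sample a location
        cand = rng.choice(idx, size=16, p=p / p.sum(), replace=False)
        costs = []
        for c in cand:
            i, j = divmod(c, m)
            costs.append(mask[max(i - radius, 0):i + radius + 1,
                              max(j - radius, 0):j + radius + 1].sum())
        i, j = divmod(cand[int(np.argmin(costs))], m)
        mask[i, j] = True
    return mask

mask = conflict_cost_sampling(64, 64, 400)
print(mask.sum())   # 400 sampled k-space locations, dense at the centre
```

The conflict cost discourages clustering (the "too close" failure of naive random draws) while the density model keeps the centre of k-space well covered.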
An environmentally friendly technology for the carbonisation of low ranked coal and biomass
Energy Technology Data Exchange (ETDEWEB)
Wirtgen, G.; Weigandt, J.; Heil, J.; Thoste, V. [Aachen University of Technology, Aachen (Germany)
2002-07-01
Between 1997 and 2001 the Federal Institute for Geosciences and Natural Resources, in connection with the Coking Group of the Aachen University of Technology, developed an environmentally friendly process for the carbonisation of low ranked coals and biomass. The main tasks of the investigations were to produce an economically competitive carbonisate to substitute wood and charcoal on local markets and to protect local forests. So far the project covered examinations of the pyrolysis behaviour of brown coals and biomass in a shaft reactor at Kuching, Malaysia, and a pilot rotary kiln reactor at Aachen. During test runs, burning and briquetting tests were carried out with selected coals and biomass from Brazil, Thailand, Malaysia, Indonesia and the Philippines. Some coals from Near East countries were also tested. To ensure thermally autarkic operation, the appropriate moisture and ash contents of the feed material were determined and a temperature-based control system was developed. Finally, all tested materials allowed the production of a smokeless carbonisate under thermally autarkic operation. After finishing the tests with a shaft reactor (feed up to 100 kg/h), the building of a rotary kiln pilot plant (feed 300 kg/h) as a preproduction phase for commercial use (feed 3-5 t/h) was scheduled for 2002. First economic calculations on rotary kiln operation demonstrated that the carbonisate is competitive with local fuels such as kerosene, petroleum and gas. Additionally, some carbonisates meet the quality standards for direct activation. 3 refs., 11 figs., 2 tabs.
Data Compression for the Tomo-e Gozen Using Low-rank Matrix Approximation
Morii, Mikio; Ikeda, Shiro; Sako, Shigeyuki; Ohsawa, Ryou
2017-01-01
Optical wide-field surveys with a high cadence are expected to create a new field of astronomy, so-called "movie astronomy," in the near future. The amount of data from the observations will be huge, and hence efficient data compression will be indispensable. Here we propose a low-rank matrix approximation with sparse matrix decomposition as a promising solution to reduce the data size effectively while preserving sufficient scientific information. We apply one of the methods to the movie data obtained with the prototype model of the Tomo-e Gozen mounted on the 1.0 m Schmidt telescope of Kiso Observatory. Once full-scale observation with the Tomo-e Gozen commences, it will generate ~30 TB of data per night. We demonstrate that the data are compressed by a factor of about 10 in size without losing transient events like optical short transient point sources and meteors. The intensity of point sources can be recovered from the compressed data. The processing runs sufficiently fast, compared with the expected data-acquisition rate in the actual observing runs.
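One way to see why a low-rank-plus-sparse split preserves transients: the static sky across frames is (nearly) rank-one, while a transient is an isolated sparse residual. The one-shot toy below (a hypothetical simplification; the actual pipeline uses an iterative sparse decomposition) keeps a truncated SVD as the background plus a hard-thresholded residual holding the events, and discards the rest.

```python
import numpy as np

def low_rank_plus_sparse(M, rank, thresh):
    """Keep a rank-`rank` truncated SVD as the smooth background L and a
    hard-thresholded sparse residual S holding transient events."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    L = U[:, :rank] @ np.diag(s[:rank]) @ Vt[:rank]
    R = M - L
    S = np.where(np.abs(R) > thresh, R, 0.0)       # sparse transients
    return L, S

rng = np.random.default_rng(4)
frames = np.outer(np.ones(100), rng.uniform(1, 2, 256))  # static sky, 100 frames
frames[50, 7] += 10.0                                    # one transient event
L, S = low_rank_plus_sparse(frames, rank=1, thresh=1.0)
print(np.count_nonzero(S))   # 1: only the transient pixel survives in S
```

Storing L as its factors plus S in sparse form is what yields the large compression factor while transients survive intact.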
Multichannel myopic deconvolution in underwater acoustic channels via low-rank recovery.
Tian, Ning; Byun, Sung-Hoon; Sabra, Karim; Romberg, Justin
2017-05-01
This paper presents a technique for solving the multichannel blind deconvolution problem. The authors observe the convolution of a single (unknown) source with K different (unknown) channel responses; from these channel outputs, the authors want to estimate both the source and the channel responses. The authors show how this classical signal processing problem can be viewed as solving a system of bilinear equations, and in turn can be recast as recovering a rank-1 matrix from a set of linear observations. Results of prior studies in the area of low-rank matrix recovery have identified effective convex relaxations for problems of this type and efficient, scalable heuristic solvers that enable these techniques to work with thousands of unknown variables. The authors show how a priori information about the channels can be used to build a linear model for the channels, which in turn makes solving these systems of equations well-posed. This study demonstrates the robustness of this methodology to measurement noises and parametrization errors of the channel impulse responses with several stylized and shallow water acoustic channel simulations. The performance of this methodology is also verified experimentally using shipping noise recorded on short bottom-mounted vertical line arrays.
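The bilinear structure can be made concrete with the classical two-channel cross-relation: since y1 = h1*s and y2 = h2*s, we have y1*h2 = y2*h1, a linear system whose one-dimensional null space reveals the channels up to scale (the paper's rank-1 lifting is a robust, convex generalization of this idea). A noiseless sketch, with illustrative sizes:

```python
import numpy as np

def conv_matrix(y, L):
    """Tall Toeplitz matrix: conv_matrix(y, L) @ h == np.convolve(y, h)."""
    C = np.zeros((len(y) + L - 1, L))
    for k in range(L):
        C[k:k + len(y), k] = y
    return C

rng = np.random.default_rng(5)
L = 4
s = rng.standard_normal(50)                       # unknown source
h1, h2 = rng.standard_normal(L), rng.standard_normal(L)
y1, y2 = np.convolve(s, h1), np.convolve(s, h2)   # observed channel outputs

# cross-relation: conv(y1, h2) - conv(y2, h1) = 0
M = np.hstack([conv_matrix(y1, L), -conv_matrix(y2, L)])
v = np.linalg.svd(M)[2][-1]                       # null-space vector ~ [h2, h1]
tru = np.concatenate([h2, h1])
cos = abs(np.dot(tru, v)) / np.linalg.norm(tru)
print(cos)   # close to 1.0: channels recovered up to scale
```

With noise, the null space is only approximate, which is where the convex rank-1 recovery formulation and the prior channel model described above earn their keep.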
Energy Technology Data Exchange (ETDEWEB)
Jain, M.K.; Narayan, R.
1993-08-05
Coal solubilization under aerobic conditions results in an oxygenated coal product which, in turn, makes the coal a poorer fuel than the starting material. The novel approach taken in this project is to remove oxygen from coal by reductive decarboxylation. In Wyodak subbituminous coal the major oxygen functionality is carboxylic groups, which exist predominantly as carboxylate anions strongly chelating metal cations like Ca²⁺ and forming strong macromolecular crosslinks which contribute in large measure to the network polymer structure. Removal of the carboxylic groups at ambient temperature by anaerobic organisms would unravel the macromolecular network, resulting in smaller coal macromolecules with an increased H/C ratio, better fuel value and better processing prospects. The studies described here sought to find biological methods to remove carboxylic functionalities from low rank coals under ambient conditions and to assess the properties of these modified coals for coal liquefaction. Efforts were made to establish anaerobic microbial consortia having decarboxylating ability, decarboxylate coal with the adapted microbial consortia, isolate the organisms, and characterize the biotreated coal products. Production of CO₂ was used as the primary indicator of possible coal decarboxylation.
Ultra low radiation dose digital subtraction angiography (DSA) imaging using low rank constraint
Niu, Kai; Li, Yinsheng; Schafer, Sebastian; Royalty, Kevin; Wu, Yijing; Strother, Charles; Chen, Guang-Hong
2015-03-01
In this work we developed a novel denoising algorithm for DSA image series. This algorithm takes advantage of the low rank nature of the DSA image sequences to enable a dramatic reduction in radiation and/or contrast doses in DSA imaging. Both spatial and temporal regularizers were introduced in the optimization algorithm to further reduce noise. To validate the method, in vivo animal studies were conducted with a Siemens Artis Zee biplane system using different radiation dose levels and contrast concentrations. Both conventionally processed DSA images and the DSA images generated using the novel denoising method were compared using absolute noise standard deviation and the contrast to noise ratio (CNR). With the application of the novel denoising algorithm for DSA, image quality can be maintained with a radiation dose reduction by a factor of 20 and/or a factor of 2 reduction in contrast dose. Image processing is completed on a GPU within a second for a 10 s DSA data acquisition.
Bio-liquefaction/solubilization of low-rank Turkish lignites and characterization of the products
Energy Technology Data Exchange (ETDEWEB)
Yesim Basaran; Adil Denizli; Billur Sakintuna; Alpay Taralp; Yuda Yurum [Hacettepe University, Ankara (Turkey). Department of Environmental Sciences
2003-08-01
The effect of some white-rot fungi on the bio-liquefaction/solubilization of two low-rank Turkish coals and the chemical composition of the liquid products and the microbial mechanisms of coal conversion were investigated. Turkish Elbistan and Beypazari lignites were used in this study. The white-rot fungi received from various laboratories used in the bio-liquefaction/solubilization of the lignites were Pleurotus sajor-caju, Pleurotus sapidus, Pleurotus florida, Pleurotus ostreatus, Phanerochaete chrysosporium, and Coriolus versicolor. FT-IR spectra of raw and treated coal samples were measured, and bio-liquefied/solubilized coal samples were investigated by FT-IR and LC-MS techniques. The Coriolus versicolor fungus was determined to be most effective in bio-liquefying/solubilizing nitric acid-treated Elbistan lignite. In contrast, raw and nitric acid-treated Beypazari lignite seemed to be unaffected by the action of any kind of white-rot fungi. The liquid chromatogram of the water-soluble bio-liquefied/solubilized product contained four major peaks. Corresponding mass spectra of each peak indicated the presence of very complicated structures. 17 refs., 9 figs., 2 tabs.
Low-Rank Latent Pattern Approximation With Applications to Robust Image Classification.
Shuo Chen; Jian Yang; Lei Luo; Yang Wei; Kaihua Zhang; Ying Tai
2017-11-01
This paper develops a novel method to address structural noise in samples for image classification. Recently, regression-related classification methods have shown promising results when facing pixelwise noise. However, they become weak in coping with structural noise because they ignore the relationships between pixels of the noise image. Meanwhile, most of them need an iterative process for computing representation coefficients, which leads to high time consumption. To overcome these problems, we exploit a latent pattern model called low-rank latent pattern approximation (LLPA) to reconstruct a test image having structural noise. The rank function is applied to characterize the structure of the reconstruction residual between the test image and the corresponding latent pattern. Simultaneously, the error between the latent pattern and the reference image is constrained by the Frobenius norm to prevent overfitting. LLPA admits a closed-form solution by virtue of a singular value thresholding operator. The provided theoretical analysis demonstrates that LLPA indeed removes structural noise during the classification task. Additionally, LLPA is further extended to a matrix regression form by connecting multiple training samples, and the alternating direction method of multipliers with a Gaussian back-substitution algorithm is used to solve the extended LLPA. Experimental results on several popular data sets validate that the proposed methods are more robust for image classification with occlusion and illumination changes, as compared to some existing state-of-the-art reconstruction-based methods and one deep neural network-based method.
Accelerated cardiac cine MRI using locally low rank and finite difference constraints.
Miao, Xin; Lingala, Sajan Goud; Guo, Yi; Jao, Terrence; Usman, Muhammad; Prieto, Claudia; Nayak, Krishna S
2016-07-01
To evaluate the potential value of combining multiple constraints for highly accelerated cardiac cine MRI. A locally low rank (LLR) constraint and a temporal finite difference (FD) constraint were combined to reconstruct cardiac cine data from highly undersampled measurements. Retrospectively undersampled 2D Cartesian reconstructions were quantitatively evaluated against fully sampled data using normalized root mean square error, structural similarity index (SSIM) and high frequency error norm (HFEN). This method was also applied to 2D golden-angle radial real-time imaging to facilitate single breath-hold whole-heart cine (12 short-axis slices, 9-13 s single breath hold). Reconstruction was compared against state-of-the-art constrained reconstruction methods: LLR, FD, and k-t SLR. At 10 to 60 spokes/frame, LLR+FD better preserved fine structures and depicted myocardial motion with reduced spatio-temporal blurring in comparison to existing methods. LLR yielded a higher SSIM ranking than FD; FD had a higher HFEN ranking than LLR. LLR+FD combined the complementary advantages of the two, and ranked highest in all metrics for all retrospectively undersampled cases. Single breath-hold multi-slice cardiac cine with prospective undersampling was enabled with in-plane spatio-temporal resolutions of 2×2 mm² and 40 ms. Highly accelerated cardiac cine is enabled by the combination of 2D undersampling and the synergistic use of LLR and FD constraints. Copyright © 2016 Elsevier Inc. All rights reserved.
High dimensional neurocomputing growth, appraisal and applications
Tripathi, Bipin Kumar
2015-01-01
The book presents a coherent understanding of computational intelligence from the perspective of what is known as "intelligent computing" with high-dimensional parameters. It critically discusses the central issues of high-dimensional neurocomputing, such as quantitative representation of signals, extending the dimensionality of neurons, supervised and unsupervised learning, and the design of higher order neurons. The strong point of the book is its clarity and the ability of the underlying theory to unify our understanding of high-dimensional computing where conventional methods fail. Plenty of application-oriented problems are presented for evaluating, monitoring and maintaining the stability of adaptive learning machines. The author has taken care to cover the breadth and depth of the subject, in both qualitative and quantitative ways. The book is intended to enlighten the scientific community, ranging from advanced undergraduates to engineers, scientists and seasoned researchers in computational intelligence...
Advanced CO_{2} Capture Technology for Low Rank Coal IGCC System
Energy Technology Data Exchange (ETDEWEB)
Alptekin, Gokhan [Tda Research, Inc., Wheat Ridge, CO (United States)
2013-09-30
The overall objective of the project is to demonstrate the technical and economic viability of a new Integrated Gasification Combined Cycle (IGCC) power plant designed to efficiently process low rank coals. The plant uses an integrated CO_{2} scrubber/Water Gas Shift (WGS) catalyst to capture over 90 percent of the CO_{2} emissions, while providing a significantly lower cost of electricity (COE) than a similar plant with a conventional cold gas cleanup system based on Selexol^{TM} technology and 90 percent carbon capture. TDA's system uses a high temperature physical adsorbent capable of removing CO_{2} above the dew point of the synthesis gas and a commercial WGS catalyst that can effectively convert CO. With bituminous coal, the net plant efficiency is about 2.4 percentage points higher than an Integrated Gasification Combined Cycle (IGCC) plant equipped with Selexol^{TM} to capture CO_{2}. We also previously completed two successful field demonstrations: one at the National Carbon Capture Center (Southern - Wilsonville, AL) in 2011, and a second demonstration in fall of 2012 at the Wabash River IGCC plant (Terre Haute, IN). In this project, we first optimized the sorbent
CO{sub 2} SEQUESTRATION POTENTIAL OF TEXAS LOW-RANK COALS
Energy Technology Data Exchange (ETDEWEB)
Duane A. McVay; Walter B. Ayers Jr; Jerry L. Jensen
2005-02-01
The objectives of this project are to evaluate the feasibility of carbon dioxide (CO{sub 2}) sequestration in Texas low-rank coals and to determine the potential for enhanced coalbed methane (CBM) recovery as an added benefit of sequestration. There were three main objectives for this reporting period, which related to obtaining accurate parameters for reservoir model description and modeling reservoir performance of CO{sub 2} sequestration and enhanced coalbed methane recovery. The first objective was to collect and desorb gas from 10 sidewall core coal samples from an Anadarko Petroleum Corporation well (APCL2 well) at approximately 6,200-ft depth in the Lower Calvert Bluff Formation of the Wilcox Group in east-central Texas. The second objective was to measure sorptive capacities of these Wilcox coal samples for CO{sub 2}, CH{sub 4}, and N{sub 2}. The final objective was to contract a service company to perform pressure transient testing in Wilcox coal beds in a shut-in well, to determine permeability of deep Wilcox coal. Bulk density of the APCL2 well sidewall core samples averaged 1.332 g/cc. The 10 sidewall core samples were placed in 4 sidewall core canisters and desorbed. Total gas content of the coal (including lost gas and projected residual gas) averaged 395 scf/ton on an as-received basis. The average lost gas estimates were approximately 45% of the bulk sample total gas. Projected residual gas was 5% of in-situ gas content. Six gas samples desorbed from the sidewall cores were analyzed to determine gas composition. Average gas composition was approximately 94.3% methane, 3.0% ethane, and 0.7% propane, with traces of heavier hydrocarbon gases. Carbon dioxide averaged 1.7%. Coal from the 4 canisters was mixed to form one composite sample that was used for pure CO{sub 2}, CH{sub 4}, and N{sub 2} isotherm analyses. The composite sample was 4.53% moisture, 37.48% volatile matter, 9.86% ash, and 48.12% fixed carbon. Mean vitrinite reflectance was 0
Scoping Studies to Evaluate the Benefits of an Advanced Dry Feed System on the Use of Low-Rank Coal
Energy Technology Data Exchange (ETDEWEB)
Rader, Jeff; Aguilar, Kelly; Aldred, Derek; Chadwick, Ronald; Conchieri, John; Dara, Satyadileep; Henson, Victor; Leininger, Tom; Liber, Pawel; Liber, Pawel; Lopez-Nakazono, Benito; Pan, Edward; Ramirez, Jennifer; Stevenson, John; Venkatraman, Vignesh
2012-03-30
The purpose of this project was to evaluate the ability of advanced low rank coal gasification technology to cause a significant reduction in the COE for IGCC power plants with 90% carbon capture and sequestration compared with the COE for similarly configured IGCC plants using conventional low rank coal gasification technology. GE’s advanced low rank coal gasification technology uses the Posimetric Feed System, a new dry coal feed system based on GE’s proprietary Posimetric Feeder. In order to demonstrate the performance and economic benefits of the Posimetric Feeder in lowering the cost of low rank coal-fired IGCC power with carbon capture, two case studies were completed. In the Base Case, the gasifier was fed a dilute slurry of Montana Rosebud PRB coal using GE’s conventional slurry feed system. In the Advanced Technology Case, the slurry feed system was replaced with the Posimetric Feed system. The process configurations of both cases were kept the same, to the extent possible, in order to highlight the benefit of substituting the Posimetric Feed System for the slurry feed system.
Liu, Yuanyuan; Jiao, L C; Shang, Fanhua; Yin, Fei; Liu, F
2013-12-01
In recent years, matrix rank minimization problems have aroused considerable interest in the machine learning, data mining and computer vision communities. All of these problems can be solved via convex relaxations which minimize the trace norm instead of the rank of the matrix, but these relaxations have to be solved iteratively and involve a singular value decomposition (SVD) at each iteration. Therefore, algorithms for trace norm minimization problems suffer from the high computation cost of multiple SVDs. In this paper, we propose an efficient Matrix Bi-Factorization (MBF) method to approximate the original trace norm minimization problem and mitigate the cost of performing SVDs. The proposed MBF method can be used to address a wide range of low-rank matrix recovery and completion problems, such as low-rank and sparse matrix decomposition (LRSD), low-rank representation (LRR) and low-rank matrix completion (MC). We also present three small-scale matrix trace norm models for the LRSD, LRR and MC problems, respectively. Moreover, we develop two concrete linearized proximal alternating optimization algorithms for solving the above three problems. Experimental results on a variety of synthetic and real-world data sets validate the efficiency, robustness and effectiveness of our MBF method compared with state-of-the-art trace norm minimization algorithms. Copyright © 2013 Elsevier Ltd. All rights reserved.
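The trick that lets bi-factorization approaches avoid repeated SVDs rests on a variational identity for the trace norm, which a short sketch can verify numerically (an illustration of the identity under made-up data, not the paper's MBF algorithm):

```python
import numpy as np

# The identity exploited by bi-factorization approaches:
#   ||X||_* = min over all factorizations X = U @ V of
#             (||U||_F**2 + ||V||_F**2) / 2,
# attained at U = P sqrt(S), V = sqrt(S) Q^T, where X = P S Q^T is the SVD.
# Optimizing over U and V lets an algorithm avoid full SVDs of X itself.
rng = np.random.default_rng(1)
X = rng.standard_normal((30, 4)) @ rng.standard_normal((4, 20))  # rank 4

P, s, Qt = np.linalg.svd(X, full_matrices=False)
U = P * np.sqrt(s)              # scale the left singular vectors
V = np.sqrt(s)[:, None] * Qt    # scale the right singular vectors

trace_norm = s.sum()
bifactor_value = 0.5 * (np.linalg.norm(U, "fro") ** 2
                        + np.linalg.norm(V, "fro") ** 2)
print(trace_norm, bifactor_value)
```

The two printed values agree to machine precision, confirming that the minimizing factorization attains the trace norm exactly.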
Enhancement of dynamic myocardial perfusion PET images based on low-rank plus sparse decomposition.
Lu, Lijun; Ma, Xiaomian; Mohy-Ud-Din, Hassan; Ma, Jianhua; Feng, Qianjin; Rahmim, Arman; Chen, Wufan
2018-02-01
The absolute quantification of dynamic myocardial perfusion (MP) PET imaging is challenged by the limited spatial resolution of individual frame images due to the division of the data into shorter frames. This study aims to develop a method for the restoration and enhancement of dynamic PET images. We propose that the image restoration model should be based on multiple constraints rather than a single constraint, given that the image characteristics are hardly described by a single constraint alone. At the same time, it may be possible, but not optimal, to regularize the image with multiple constraints simultaneously. Fortunately, MP PET images can be decomposed into a superposition of background and dynamic components via low-rank plus sparse (L + S) decomposition. Thus, we propose an L + S decomposition based MP PET image restoration model and express it as a convex optimization problem. An iterative soft thresholding algorithm was developed to solve the problem. Using realistic dynamic 82Rb MP PET scan data, we optimized the method and compared its performance with other restoration methods. The proposed method resulted in substantial visual as well as quantitative accuracy improvements in terms of noise versus bias performance, as demonstrated in extensive 82Rb MP PET simulations. In particular, the myocardium defect in the MP PET images had improved visual quality as well as contrast versus noise tradeoff. The proposed algorithm was also applied to an 8-min clinical cardiac 82Rb MP PET study performed on the GE Discovery PET/CT, and demonstrated improved quantitative accuracy (CNR and SNR) compared to other algorithms. The proposed method is effective for the restoration and enhancement of dynamic PET images. Copyright © 2017 Elsevier B.V. All rights reserved.
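As an illustrative sketch (not the authors' algorithm; data and regularization weights are made up), an L + S decomposition can be computed by alternating the two proximal operators that iterative soft thresholding schemes are built from:

```python
import numpy as np

def soft_threshold(X, t):
    """Proximal operator of the l1 norm (promotes sparsity)."""
    return np.sign(X) * np.maximum(np.abs(X) - t, 0.0)

def svt(X, t):
    """Proximal operator of the nuclear norm (promotes low rank)."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U @ np.diag(np.maximum(s - t, 0.0)) @ Vt

def l_plus_s(M, lam_l, lam_s, n_iter=50):
    """Block coordinate descent on
    0.5*||M - L - S||_F^2 + lam_l*||L||_* + lam_s*||S||_1."""
    L = np.zeros_like(M)
    S = np.zeros_like(M)
    for _ in range(n_iter):
        L = svt(M - S, lam_l)             # low-rank "background" part
        S = soft_threshold(M - L, lam_s)  # sparse "dynamic" part
    return L, S

# Synthetic data: a rank-2 background plus a few large sparse spikes.
rng = np.random.default_rng(0)
M = rng.standard_normal((40, 2)) @ rng.standard_normal((2, 30))
M[rng.integers(0, 40, 25), rng.integers(0, 30, 25)] += 5.0
L_hat, S_hat = l_plus_s(M, lam_l=1.0, lam_s=0.5)
```

Each update is the exact minimizer of the convex objective in one block, so the iteration monotonically decreases it; after the final sparse update, the entrywise residual is bounded by the sparsity weight.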
Co-pyrolysis of low rank coals and biomass: Product distributions
Energy Technology Data Exchange (ETDEWEB)
Soncini, Ryan M.; Means, Nicholas C.; Weiland, Nathan T.
2013-10-01
Pyrolysis and gasification of combined low rank coal and biomass feeds are the subject of much study in an effort to mitigate the production of greenhouse gases from integrated gasification combined cycle (IGCC) systems. While co-feeding has the potential to reduce the net carbon footprint of commercial gasification operations, the effects of co-feeding on kinetics and product distributions require study to ensure the success of this strategy. Southern yellow pine was pyrolyzed in a semi-batch type drop tube reactor with either Powder River Basin sub-bituminous coal or Mississippi lignite at several temperatures and feed ratios. Product gas composition of expected primary constituents (CO, CO{sub 2}, CH{sub 4}, H{sub 2}, H{sub 2}O, and C{sub 2}H{sub 4}) was determined by in-situ mass spectrometry, while minor gaseous constituents were determined using a GC-MS. Product distributions are fit to linear functions of temperature, and quadratic functions of biomass fraction, for use in computational co-pyrolysis simulations. The results are shown to exhibit significant nonlinearities, particularly at higher temperatures and for lower-rank coals. The co-pyrolysis product distributions evolve more tar, and less char, CH{sub 4}, and C{sub 2}H{sub 4}, than an additive pyrolysis process would suggest. For lignite co-pyrolysis, CO and H{sub 2} production are also reduced. The data suggest that evolution of hydrogen from rapid pyrolysis of biomass prevents the crosslinking of fragmented aromatic structures during coal pyrolysis, so that these fragments are released as tar rather than forming secondary char and light gases. Finally, it is shown that, for the two coal types tested, co-pyrolysis synergies are more significant as coal rank decreases, likely because the initial structure in these coals contains larger pores and smaller clusters of aromatic structures, which are more readily retained as tar in rapid co-pyrolysis.
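A correlation of the kind described, linear in temperature and quadratic in biomass fraction, can be sketched as an ordinary least-squares fit on a polynomial basis. All data, coefficient values, basis terms, and the temperature range below are hypothetical, for illustration only:

```python
import numpy as np

# Hypothetical example: fit a product yield y(T, x) that is linear in
# temperature T and quadratic in biomass fraction x, via the basis
#   y ≈ c0 + c1*T + c2*x + c3*x**2 + c4*T*x.
rng = np.random.default_rng(2)
T = rng.uniform(700.0, 1100.0, 40)   # assumed pyrolysis temperatures, K
x = rng.uniform(0.0, 1.0, 40)        # biomass mass fraction of the feed
true_c = np.array([1.0, 0.002, -0.5, 0.8, 0.001])
A = np.column_stack([np.ones_like(T), T, x, x ** 2, T * x])
y = A @ true_c + 0.01 * rng.standard_normal(40)  # yields with noise

coef, *_ = np.linalg.lstsq(A, y, rcond=None)     # least-squares surface fit
print(coef)
```

The cross term T*x is one way such a fit can capture the reported nonlinear (non-additive) interaction between temperature and feed composition.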
OXIDATION OF MERCURY ACROSS SCR CATALYSTS IN COAL-FIRED POWER PLANTS BURNING LOW RANK FUELS
Energy Technology Data Exchange (ETDEWEB)
Constance Senior; Temi Linjewile
2003-07-25
This is the first Quarterly Technical Report for DOE Cooperative Agreement No: DE-FC26-03NT41728. The objective of this program is to measure the oxidation of mercury in flue gas across SCR catalyst in a coal-fired power plant burning low rank fuels, using a slipstream reactor containing multiple commercial catalysts in parallel. The Electric Power Research Institute (EPRI) and Argillon GmbH are providing co-funding for this program. This program contains multiple tasks, and good progress is being made on all fronts. During this quarter, analysis of the coal, ash and mercury speciation data from the first test series was completed. Good agreement was shown between different methods of measuring mercury in the flue gas: Ontario Hydro, semi-continuous emission monitor (SCEM) and coal composition. There was a loss of total mercury across the commercial catalysts, but not across the blank monolith. The blank monolith showed no oxidation. The data from the first test series show the same trend in mercury oxidation as a function of space velocity that has been seen elsewhere. At space velocities in the range of 6,000-7,000 hr{sup -1}, the blank monolith did not show any mercury oxidation, with or without ammonia present. Two of the commercial catalysts clearly showed an effect of ammonia. Two other commercial catalysts showed an effect of ammonia, although the error bars for the no-ammonia case are large. A test plan was written for the second test series and is being reviewed.
OXIDATION OF MERCURY ACROSS SCR CATALYSTS IN COAL-FIRED POWER PLANTS BURNING LOW RANK FUELS
Energy Technology Data Exchange (ETDEWEB)
Constance Senior; Temi Linjewile
2003-10-31
This is the third Quarterly Technical Report for DOE Cooperative Agreement No: DE-FC26-03NT41728. The objective of this program is to measure the oxidation of mercury in flue gas across SCR catalyst in a coal-fired power plant burning low rank fuels using a slipstream reactor containing multiple commercial catalysts in parallel. The Electric Power Research Institute (EPRI) and Argillon GmbH are providing co-funding for this program. This program contains multiple tasks and good progress is being made on all fronts. During this quarter, the second set of mercury measurements was made after the catalysts had been exposed to flue gas for about 2,000 hours. There was good agreement between the Ontario Hydro measurements and the SCEM measurements. Carbon trap measurements of total mercury agreed fairly well with the SCEM. There did appear to be some loss of mercury in the sampling system toward the end of the sampling campaign. NO{sub x} reductions across the catalysts ranged from 60% to 88%. Loss of total mercury across the commercial catalysts was not observed, as it had been in the March/April test series. It is not clear whether this was due to aging of the catalyst or to changes in the sampling system made between March/April and August. In the presence of ammonia, the blank monolith showed no oxidation. Two of the commercial catalysts showed mercury oxidation that was comparable to that in the March/April series. The other three commercial catalysts showed a decrease in mercury oxidation relative to the March/April series. Oxidation of mercury increased without ammonia present. Transient experiments showed that when ammonia was turned on, mercury appeared to desorb from the catalyst, suggesting displacement of adsorbed mercury by the ammonia.
A comparison between alkaline and decomplexing reagents to extract humic acids from low rank coals
Energy Technology Data Exchange (ETDEWEB)
Garcia, D.; Cegarra, J.; Abad, M. [CSIC, Madrid (Spain). Centro de Edafologia y Biologia Aplicada del Segura
1996-07-01
Humic acids (HAs) were obtained from two low rank coals (lignite and leonardite) by using either alkali extractants (0.1 M NaOH, 0.1 M KOH or 0.25 M KOH) or solutions containing Na{sub 4}P{sub 2}O{sub 7} (0.1 M Na{sub 4}P{sub 2}O{sub 7} or 0.1 M NaOH/Na{sub 4}P{sub 2}O{sub 7}). In both coals, the greatest yields were obtained with 0.25 M KOH and the lowest with the 0.1 M alkalis, whereas the extractions based on Na{sub 4}P{sub 2}O{sub 7} yielded intermediate values and were more effective on the lignite. Chemical analysis showed that the leonardite HAs consisted of molecules that were less oxidized and had fewer functional groups than the HAs released from the lignite. Moreover, the HAs extracted by reagents containing Na{sub 4}P{sub 2}O{sub 7} exhibited more functional groups than those extracted with alkali, this effect being more apparent in the lignite because of its greater cation exchange capacity. Gel permeation chromatography indicated that the leonardite HAs contained a greater proportion of higher molecular size compounds than the lignite HAs, and that both solutions containing Na{sub 4}P{sub 2}O{sub 7} released HAs with a greater proportion of smaller molecular compounds from the lignite than did the alkali extractants. 16 refs., 3 figs., 2 tabs.
Thermolysis of phenethyl phenyl ether: a model for ether linkages in lignin and low rank coal
Energy Technology Data Exchange (ETDEWEB)
Britt, P.F.; Buchanan, A.C.; Malcolm, E.A. [Oak Ridge National Laboratory, Oak Ridge, TN (United States). Division of Chemistry and Analytical Science
1995-10-06
The thermolysis of phenethyl phenyl ether (PPE) was studied at 330-425{degree}C to resolve the discrepancies in the reported mechanisms of this important model of the beta-ether linkage found in lignin and low rank coal. Cracking of PPE proceeded by two competitive pathways that produced styrene plus phenol, and two previously undetected products, benzaldehyde plus toluene. The ratio of these pathways, defined as the alpha/beta selectivity, was 3.1 +/- 0.3 at 375{degree}C and independent of the PPE concentration. Thermolysis of PPE in tetralin, a model hydrogen donor solvent, increased the alpha/beta selectivity to 7 and accelerated the formation of secondary products. All the data were consistent with a free-radical chain mechanism for the decomposition of PPE. Styrene and phenol are produced by hydrogen abstraction at the alpha-carbon and beta-scission to form styrene and the phenoxy radical, followed by hydrogen abstraction. Benzaldehyde and toluene are formed by hydrogen abstraction at the beta-carbon, 1,2-phenyl migration from oxygen to carbon, and beta-scission to form benzaldehyde and the benzyl radical, followed by hydrogen abstraction. Thermochemical kinetic estimates indicate that product formation is controlled by the relative rate of hydrogen abstraction at the alpha- and beta-carbons by the phenoxy radical (dominant) and benzyl radical (minor), since beta-scission and 1,2-phenyl migration are fast relative to hydrogen abstraction. Thermolysis of PhCD{sub 2}CH{sub 2}OPh and PhCH{sub 2}CD{sub 2}OPh was consistent with the previous results, indicating that there was no significant contribution of a concerted retro-ene pathway to the thermolysis of PPE.
High dimensional classifiers in the imbalanced case
DEFF Research Database (Denmark)
Bak, Britta Anker; Jensen, Jens Ledet
We consider the binary classification problem in the imbalanced case where the number of samples from the two groups differ. The classification problem is considered in the high dimensional case where the number of variables is much larger than the number of samples, and where the imbalance leads...
Tang, Xin; Feng, Guo-Can; Li, Xiao-Xin; Cai, Jia-Xin
2015-01-01
Face recognition is challenging, especially when the images from different persons are similar to each other due to variations in illumination, expression, and occlusion. If we have sufficient training images of each person, spanning the facial variations of that person under testing conditions, sparse representation based classification (SRC) achieves very promising results. However, in many applications, face recognition often encounters the small sample size problem arising from the small number of available training images per person. In this paper, we present a novel face recognition framework utilizing low-rank and sparse error matrix decomposition together with sparse coding techniques (LRSE+SC). Firstly, the low-rank matrix recovery technique is applied to decompose the face images of each class into a low-rank matrix and a sparse error matrix. The low-rank matrix of each individual is a class-specific dictionary that captures the discriminative features of this individual. The sparse error matrix represents the intra-class variations, such as illumination and expression changes. Secondly, we combine the low-rank part (representative basis) of each person into a supervised dictionary and integrate the sparse error matrices of all individuals into a within-individual variant dictionary, which can be applied to represent the possible variations between the testing and training images. These two dictionaries are then used to code the query image. The within-individual variant dictionary can be shared by all the subjects and contributes only to explaining the lighting conditions, expressions, and occlusions of the query image rather than to discrimination. Finally, a reconstruction-based scheme is adopted for face recognition. Since the within-individual dictionary is introduced, LRSE+SC can handle corrupted training data and the situation in which not all subjects have enough samples for training. Experimental results show that our method achieves the
Energy Technology Data Exchange (ETDEWEB)
1989-12-31
This work is a compilation of reports on ongoing research at the University of North Dakota. Topics include: Control Technology and Coal Preparation Research (SO{sub x}/NO{sub x} control, waste management), Advanced Research and Technology Development (turbine combustion phenomena, combustion inorganic transformation, coal/char reactivity, liquefaction reactivity of low-rank coals, gasification ash and slag characterization, fine particulate emissions), Combustion Research (fluidized bed combustion, beneficiation of low-rank coals, combustion characterization of low-rank coal fuels, diesel utilization of low-rank coals), Liquefaction Research (low-rank coal direct liquefaction), and Gasification Research (hydrogen production from low-rank coals, advanced wastewater treatment, mild gasification, color and residual COD removal from Synfuel wastewaters, Great Plains Gasification Plant, gasifier optimization).
OXIDATION OF MERCURY ACROSS SCR CATALYSTS IN COAL-FIRED POWER PLANTS BURNING LOW RANK FUELS
Energy Technology Data Exchange (ETDEWEB)
Constance Senior
2004-12-31
The objectives of this program were to measure the oxidation of mercury in flue gas across SCR catalyst in a coal-fired power plant burning low rank fuels using a slipstream reactor containing multiple commercial catalysts in parallel and to develop a greater understanding of mercury oxidation across SCR catalysts in the form of a simple model. The Electric Power Research Institute (EPRI) and Argillon GmbH provided co-funding for this program. REI used a multicatalyst slipstream reactor to determine oxidation of mercury across five commercial SCR catalysts at a power plant that burned a blend of 87% subbituminous coal and 13% bituminous coal. The chlorine content of the blend was 100 to 240 {micro}g/g on a dry basis. Mercury measurements were carried out when the catalysts were relatively new, corresponding to about 300 hours of operation and again after 2,200 hours of operation. NO{sub x}, O{sub 2} and gaseous mercury speciation at the inlet and at the outlet of each catalyst chamber were measured. In general, the catalysts all appeared capable of achieving about 90% NO{sub x} reduction at a space velocity of 3,000 hr{sup -1} when new, which is typical of full-scale installations; after 2,200 hours exposure to flue gas, some of the catalysts appeared to lose NO{sub x} activity. For the fresh commercial catalysts, oxidation of mercury was in the range of 25% to 65% at typical full-scale space velocities. A blank monolith showed no oxidation of mercury under any conditions. All catalysts showed higher mercury oxidation without ammonia, consistent with full-scale measurements. After exposure to flue gas for 2,200 hours, some of the catalysts showed reduced levels of mercury oxidation relative to the initial levels of oxidation. A model of Hg oxidation across SCRs was formulated based on full-scale data. The model took into account the effects of temperature, space velocity, catalyst type and HCl concentration in the flue gas.
Bayesian Analysis of High Dimensional Classification
Mukhopadhyay, Subhadeep; Liang, Faming
2009-12-01
Modern data mining and bioinformatics have presented an important playground for statistical learning techniques, where the number of input variables may be much larger than the sample size of the training data. In supervised learning, logistic regression or probit regression can be used to model a binary output and form perceptron classification rules based on Bayesian inference. In these cases, there is much interest in searching for sparse models in the high-dimensional regression or classification setup. We first discuss two common challenges for analyzing high-dimensional data. The first is the curse of dimensionality: the complexity of many existing algorithms scales exponentially with the dimensionality of the space, so that the algorithms soon become computationally intractable and therefore inapplicable in many real applications. The second is multicollinearity among the predictors, which severely slows down the algorithms. In order to make Bayesian analysis operational in high dimensions, we propose a novel Hierarchical Stochastic Approximation Monte Carlo algorithm (HSAMC), which overcomes the curse of dimensionality and the multicollinearity of predictors in high dimensions, and which possesses a self-adjusting mechanism to avoid local minima separated by high energy barriers. Models and methods are illustrated by simulations inspired by the field of genomics. Numerical results indicate that HSAMC can work as a general model selection sampler in a high-dimensional complex model space.
A Direct Elliptic Solver Based on Hierarchically Low-Rank Schur Complements
Chávez, Gustavo
2017-03-17
A parallel fast direct solver for rank-compressible block tridiagonal linear systems is presented. Algorithmic synergies between Cyclic Reduction and Hierarchical matrix arithmetic operations result in a solver with O(N log² N) arithmetic complexity and O(N log N) memory footprint. We provide a baseline for performance and applicability by comparing with well-known implementations of the $\mathcal{H}$-LU factorization and algebraic multigrid within a shared-memory parallel environment that leverages the concurrency features of the method. Numerical experiments reveal that this method is comparable with other fast direct solvers based on Hierarchical Matrices such as $\mathcal{H}$-LU, and that it can tackle problems where algebraic multigrid fails to converge.
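The elimination step underlying such solvers, block forward elimination with Schur complements, can be sketched with dense blocks (here without the hierarchical low-rank compression that gives the paper's solver its quasilinear complexity; the test system is made up):

```python
import numpy as np

def block_tridiagonal_solve(D, L, U, b):
    """Solve a block tridiagonal system by forward block elimination.

    D[i] are diagonal blocks, L[i] subdiagonal blocks, U[i] superdiagonal
    blocks. Each step forms the dense Schur complement
        S[i] = D[i] - L[i-1] @ inv(S[i-1]) @ U[i-1];
    hierarchical solvers instead keep these complements in compressed
    low-rank form, which is where the savings come from.
    """
    n = len(D)
    S = [None] * n   # Schur complements
    y = [None] * n   # transformed right-hand sides
    S[0], y[0] = D[0], b[0]
    for i in range(1, n):
        S[i] = D[i] - L[i - 1] @ np.linalg.solve(S[i - 1], U[i - 1])
        y[i] = b[i] - L[i - 1] @ np.linalg.solve(S[i - 1], y[i - 1])
    x = [None] * n
    x[-1] = np.linalg.solve(S[-1], y[-1])
    for i in range(n - 2, -1, -1):       # block back substitution
        x[i] = np.linalg.solve(S[i], y[i] - U[i] @ x[i + 1])
    return np.concatenate(x)

# Random diagonally dominant test system with 5 blocks of size 4.
rng = np.random.default_rng(3)
m, n = 4, 5
D = [rng.standard_normal((m, m)) + 10 * np.eye(m) for _ in range(n)]
L = [rng.standard_normal((m, m)) for _ in range(n - 1)]
U = [rng.standard_normal((m, m)) for _ in range(n - 1)]
b = [rng.standard_normal(m) for _ in range(n)]
x = block_tridiagonal_solve(D, L, U, b)
```

This is plain block LU (a block Thomas algorithm); diagonal dominance keeps the Schur complements well conditioned in the sketch.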
Krylov, Piotr
2017-01-01
This monograph is a comprehensive account of formal matrices, examining homological properties of modules over formal matrix rings and summarising the interplay between Morita contexts and K theory. While various special types of formal matrix rings have been studied for a long time from several points of view and appear in various textbooks, for instance to examine equivalences of module categories and to illustrate rings with one-sided non-symmetric properties, this particular class of rings has, so far, not been treated systematically. Exploring formal matrix rings of order 2 and introducing the notion of the determinant of a formal matrix over a commutative ring, this monograph further covers the Grothendieck and Whitehead groups of rings. Graduate students and researchers interested in ring theory, module theory and operator algebras will find this book particularly valuable. Containing numerous examples, Formal Matrices is a largely self-contained and accessible introduction to the topic, assuming a sol...
Cheng, Jiubing
2016-03-15
In elastic imaging, the extrapolated vector fields are decoupled into pure wave modes, such that the imaging condition produces interpretable images. Conventionally, mode decoupling in anisotropic media is costly because the operators involved are dependent on the velocity, and thus they are not stationary. We have developed an efficient pseudospectral approach to directly extrapolate the decoupled elastic waves using low-rank approximate mixed-domain integral operators on the basis of the elastic displacement wave equation. We have applied k-space adjustment to the pseudospectral solution to allow for a relatively large extrapolation time step. The low-rank approximation was, thus, applied to the spectral operators that simultaneously extrapolate and decompose the elastic wavefields. Synthetic examples on transversely isotropic and orthorhombic models showed that our approach has the potential to efficiently and accurately simulate the propagations of the decoupled quasi-P and quasi-S modes as well as the total wavefields for elastic wave modeling, imaging, and inversion.
Lewis, Cannada A; Calvin, Justus A; Valeev, Edward F
2016-12-13
We describe the clustered low-rank (CLR) framework for block-sparse and block-low-rank tensor representation and computation. The CLR framework exploits the tensor structure revealed by basis clustering; computational savings arise from low-rank compression of tensor blocks and performing block arithmetic in the low-rank form whenever beneficial. The precision is rigorously controlled by two parameters, avoiding ad-hoc heuristics such as domains: one controls the CLR block rank truncation, and the other controls screening of small contributions in arithmetic operations on CLR tensors to propagate sparsity through expressions. As these parameters approach zero, the CLR representation and arithmetic become exact. As a pilot application, we considered the use of the CLR format for the order-2 and order-3 tensors in the context of the density fitting (DF) evaluation of the Hartree-Fock (exact) exchange (DF-K). Even for small systems and realistic basis sets, CLR-DF-K becomes more efficient than the standard DF-K approach, and it has significantly reduced asymptotic storage and computational complexities relative to the standard [Formula: see text] and [Formula: see text] DF-K figures. CLR-DF-K is also significantly more efficient than the conventional (non-DF) [Formula: see text] exchange algorithm, while negligibly affecting molecular energies and properties, for applications to medium-sized systems (on the order of 100 atoms) with diffuse Gaussian basis sets, a necessity for applications to negatively charged species, molecular properties, and high-accuracy correlated wave functions.
Tang, Chang; Cao, Lijuan; Chen, Jiajia; Zheng, Xiao
2017-05-01
In this work, a non-local weighted group low-rank representation (WGLRR) model is proposed for speckle noise reduction in optical coherence tomography (OCT) images. It is based on the observation that the similarity between patches within the noise-free OCT image leads to a high correlation between them, which means that the data matrix grouped from these similar patches is low-rank. Thus, low-rank representation (LRR) is used to recover the noise-free group data matrix. In order to maintain the fidelity of the recovered image, the corruption probability of each pixel is integrated into the LRR model as a weight to regularize the error term. Considering that each single patch might belong to several groups, so that multiple estimates of the patch can be obtained, the different estimates of each patch are aggregated to obtain its denoised result. The aggregation weights are chosen depending on the rank of each group data matrix, which assigns higher weights to the better estimates. Both qualitative and quantitative experimental results on real OCT images show the superior performance of the WGLRR model compared with other state-of-the-art speckle removal techniques.
Introduction to high-dimensional statistics
Giraud, Christophe
2015-01-01
Ever-greater computing technologies have given rise to an exponentially growing volume of data. Today massive data sets (with potentially thousands of variables) play an important role in almost every branch of modern human activity, including networks, finance, and genetics. However, analyzing such data has presented a challenge for statisticians and data analysts and has required the development of new statistical methods capable of separating the signal from the noise.Introduction to High-Dimensional Statistics is a concise guide to state-of-the-art models, techniques, and approaches for ha
ESTIMATION OF FUNCTIONALS OF SPARSE COVARIANCE MATRICES.
Fan, Jianqing; Rigollet, Philippe; Wang, Weichen
High-dimensional statistical tests often ignore correlations to gain simplicity and stability, leading to null distributions that depend on functionals of correlation matrices such as their Frobenius norm and other ℓr norms. Motivated by the computation of critical values of such tests, we investigate the difficulty of estimating functionals of sparse correlation matrices. Specifically, we show that simple plug-in procedures based on thresholded estimators of correlation matrices are sparsity-adaptive and minimax optimal over a large class of correlation matrices. Akin to previous results on functional estimation, the minimax rates exhibit an elbow phenomenon. Our results are further illustrated on simulated data as well as in an empirical study of data arising in financial econometrics.
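The plug-in idea can be sketched in a few lines. The toy example below (dimensions and threshold level are hypothetical, not taken from the paper) thresholds a sample correlation matrix and plugs it into the squared Frobenius norm functional:

```python
import numpy as np

# Toy sketch of a thresholded plug-in estimator: entrywise thresholding of
# the sample correlations, then evaluating the Frobenius-norm functional.
rng = np.random.default_rng(0)
p, n = 20, 500
X = rng.standard_normal((n, p))                 # independent columns: true R = I
R_hat = np.corrcoef(X, rowvar=False)
tau = 2.0 * np.sqrt(np.log(p) / n)              # universal threshold level
off = R_hat - np.eye(p)
R_thr = np.eye(p) + off * (np.abs(off) > tau)   # kill small spurious correlations
frob_sq = float(np.sum(R_thr ** 2))             # plug-in estimate of ||R||_F^2
print(frob_sq)                                  # close to ||I||_F^2 = p = 20
```

With independent columns the true correlation matrix is the identity, so the thresholded plug-in estimate should land near p; the unthresholded estimate would be inflated by the roughly p²/2 noisy off-diagonal entries.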
Effects of dependence in high-dimensional multiple testing problems
Directory of Open Access Journals (Sweden)
van de Wiel Mark A
2008-02-01
Full Text Available Abstract Background We consider effects of dependence among variables of high-dimensional data in multiple hypothesis testing problems, in particular the False Discovery Rate (FDR) control procedures. Recent simulation studies consider only simple correlation structures among variables, which is hardly inspired by real data features. Our aim is to systematically study effects of several network features like sparsity and correlation strength by imposing dependence structures among variables using random correlation matrices. Results We study the robustness against dependence of several FDR procedures that are popular in microarray studies, such as Benjamini-Hochberg FDR, Storey's q-value, SAM and resampling-based FDR procedures. False Non-discovery Rates and estimates of the number of null hypotheses are computed from those methods and compared. Our simulation study shows that methods such as SAM and the q-value do not adequately control the FDR to the level claimed under dependence conditions. On the other hand, the adaptive Benjamini-Hochberg procedure seems to be most robust while remaining conservative. Finally, the estimates of the number of true null hypotheses under various dependence conditions are variable. Conclusion We discuss a new method for efficient guided simulation of dependent data, which satisfy imposed network constraints as conditional independence structures. Our simulation set-up allows for a structural study of the effect of dependencies on multiple testing criteria and is useful for testing a potentially new method on π0 or FDR estimation in a dependency context.
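Of the procedures compared, the (non-adaptive) Benjamini-Hochberg step-up rule is the simplest to state; a minimal sketch, with made-up p-values for illustration:

```python
import numpy as np

# Minimal sketch of the step-up Benjamini-Hochberg FDR procedure.
# The p-values below are hypothetical, chosen only to exercise the rule.
def benjamini_hochberg(pvals, alpha=0.05):
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)
    below = p[order] <= alpha * np.arange(1, m + 1) / m   # step-up comparison
    k = int(np.max(np.nonzero(below)[0])) + 1 if below.any() else 0
    rejected = np.zeros(m, dtype=bool)
    rejected[order[:k]] = True                  # reject the k smallest p-values
    return rejected

pv = [0.001, 0.008, 0.039, 0.041, 0.20, 0.74]
print(benjamini_hochberg(pv).sum())             # 2 rejections at alpha = 0.05
```

Note the step-up character: a p-value is rejected whenever some larger p-value clears its own threshold, which is what distinguishes BH from a per-test Bonferroni cut.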
Modeling high dimensional multichannel brain signals
Hu, Lechuan
2017-03-27
In this paper, our goal is to model functional and effective (directional) connectivity in networks of multichannel brain physiological signals (e.g., electroencephalograms, local field potentials). The primary challenges here are twofold: first, there are major statistical and computational difficulties in modeling and analyzing high dimensional multichannel brain signals; second, there is no set of universally-agreed measures for characterizing connectivity. To model multichannel brain signals, our approach is to fit a vector autoregressive (VAR) model with sufficiently high order so that complex lead-lag temporal dynamics between the channels can be accurately characterized. However, such a model contains a large number of parameters. Thus, we will estimate the high dimensional VAR parameter space by our proposed hybrid LASSLE method (LASSO+LSE), which imposes regularization in the first step (to control sparsity) and constrained least squares estimation in the second step (to improve the bias and mean-squared error of the estimator). Then, to characterize connectivity between channels in a brain network, we will use various measures but put an emphasis on partial directed coherence (PDC) in order to capture directional connectivity between channels. PDC is a directed frequency-specific measure that explains the extent to which the present oscillatory activity in a sender channel influences the future oscillatory activity in a specific receiver channel relative to all possible receivers in the network. Using the proposed modeling approach, we have achieved some insights on learning in a rat engaged in a non-spatial memory task.
Yang, Yongchao; Nagarajaiah, Satish
2016-06-01
Randomly missing data in structural vibration response time histories often occur in structural dynamics and health monitoring. For example, structural vibration responses are often corrupted by outliers or erroneous measurements due to sensor malfunction; in wireless sensing platforms, data loss during wireless communication is a common issue. Besides, to alleviate the wireless data sampling or communication burden, certain amounts of data are often discarded during sampling or before transmission. In these and other applications, recovery of the randomly missing structural vibration responses from the available, incomplete data is essential for system identification and structural health monitoring; it is an ill-posed inverse problem, however. This paper explicitly harnesses the structure of the data itself, that of the structural vibration responses, to address this inverse problem. What is relevant is an empirical, but often practically true, observation: typically only a few modes are active in the structural vibration responses; hence there is a sparse representation (in the frequency domain) of the single-channel data vector, or a low-rank structure (by singular value decomposition) of the multi-channel data matrix. Exploiting such prior knowledge of the data structure (intra-channel sparse or inter-channel low-rank), the new theories of ℓ1-minimization sparse recovery and nuclear-norm-minimization low-rank matrix completion enable recovery of the randomly missing or corrupted structural vibration response data. The performance of these two alternatives, in terms of recovery accuracy and computational time under different data missing rates, is investigated on several structural vibration response data sets: the seismic responses of the super high-rise Canton Tower and the structural health monitoring accelerations of a real large-scale cable-stayed bridge. Encouraging results are obtained, and the applicability and limitations of the presented methods are discussed.
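The inter-channel low-rank route can be illustrated with a toy singular-value-thresholding completion, a standard surrogate for nuclear-norm minimization; the signal model and parameters below are hypothetical, not taken from the paper:

```python
import numpy as np

# Toy illustration: multichannel responses dominated by few modes form a
# low-rank matrix, so randomly missing entries can be recovered by iterative
# singular-value shrinkage with a data-consistency step.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 200)
# 8 channels, all phase-shifted copies of one 3 Hz mode -> exact rank 2
M = np.stack([np.sin(2 * np.pi * 3 * t + 0.4 * c) for c in range(8)], axis=1)
mask = rng.random(M.shape) > 0.2                # observe 80% of the entries
X = np.where(mask, M, 0.0)
for _ in range(200):                            # iterative singular-value shrinkage
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    X = (U * np.maximum(s - 0.05, 0.0)) @ Vt    # shrink singular values
    X[mask] = M[mask]                           # re-impose the observed data
err = np.linalg.norm((X - M)[~mask]) / np.linalg.norm(M[~mask])
print(round(err, 4))                            # small recovery error
```

Each channel is a phase-shifted sinusoid, so the 200x8 matrix has rank 2 and the missing 20% of entries are recoverable from the low-rank structure alone.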
Geogenic organic contaminants in the low-rank coal-bearing Carrizo-Wilcox aquifer of East Texas, USA
Chakraborty, Jayeeta; Varonka, Matthew; Orem, William; Finkelman, Robert B.; Manton, William
2017-06-01
The organic composition of groundwater along the Carrizo-Wilcox aquifer in East Texas (USA), sampled from rural wells in May and September 2015, was examined as part of a larger study of the potential health and environmental effects of organic compounds derived from low-rank coals. The quality of water from the low-rank coal-bearing Carrizo-Wilcox aquifer is a potential environmental concern and no detailed studies of the organic compounds in this aquifer have been published. Organic compounds identified in the water samples included: aliphatics and their fatty acid derivatives, phenols, biphenyls, N-, O-, and S-containing heterocyclic compounds, polycyclic aromatic hydrocarbons (PAHs), aromatic amines, and phthalates. Many of the identified organic compounds (aliphatics, phenols, heterocyclic compounds, PAHs) are geogenic and originated from groundwater leaching of young and unmetamorphosed low-rank coals. Estimated concentrations of individual compounds ranged from about 3.9 to 0.01 μg/L. In many rural areas in East Texas, coal strata provide aquifers for drinking water wells. Organic compounds observed in groundwater are likely to be present in drinking water supplied from wells that penetrate the coal. Some of the organic compounds identified in the water samples are potentially toxic to humans, but at the estimated levels in these samples, the compounds are unlikely to cause acute health problems. The human health effects of low-level chronic exposure to coal-derived organic compounds in drinking water in East Texas are currently unknown, and continuing studies will evaluate possible toxicity.
Kohn, Lucas; Tschirsich, Ferdinand; Keck, Maximilian; Plenio, Martin B.; Tamascelli, Dario; Montangero, Simone
2018-01-01
We provide evidence that randomized low-rank factorization is a powerful tool for the determination of the ground-state properties of low-dimensional lattice Hamiltonians through tensor network techniques. In particular, we show that randomized matrix factorization outperforms truncated singular value decomposition based on state-of-the-art deterministic routines in time-evolving block decimation (TEBD)- and density matrix renormalization group (DMRG)-style simulations, even when the system under study gets close to a phase transition: We report linear speedups in the bond or local dimension of up to 24 times in quasi-two-dimensional cylindrical systems.
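The randomized factorization in question follows the familiar sketch-then-solve pattern; a minimal matrix version (not the authors' tensor-network code) is:

```python
import numpy as np

# Minimal randomized low-rank factorization in the Halko-Martinsson-Tropp
# style the abstract refers to: sketch the range with a random matrix,
# orthonormalize, then take an SVD of the small projected problem.
def randomized_svd(A, rank, oversample=10, seed=0):
    rng = np.random.default_rng(seed)
    Omega = rng.standard_normal((A.shape[1], rank + oversample))
    Q, _ = np.linalg.qr(A @ Omega)              # orthonormal range sketch
    U_small, s, Vt = np.linalg.svd(Q.T @ A, full_matrices=False)
    return (Q @ U_small)[:, :rank], s[:rank], Vt[:rank]

rng = np.random.default_rng(1)
A = rng.standard_normal((300, 8)) @ rng.standard_normal((8, 300))  # exact rank 8
U, s, Vt = randomized_svd(A, rank=8)
rel_err = np.linalg.norm(A - (U * s) @ Vt) / np.linalg.norm(A)
print(rel_err)                                  # near machine precision
```

The cost is dominated by the two thin matrix products, which is why such sketches outperform deterministic truncated SVDs inside TEBD/DMRG-style loops.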
Sipahutar Riman; Bizzy Irwin; Faizal Muhammad; Maussa Olistiyo
2017-01-01
The objective of this study is to blend the South Sumatera low rank coal and palm shell charcoal for producing bio-coal briquettes which have better fuel properties. The experimental study for making bio-coal briquettes was carried out to examine the effect of raw material composition and binder type on the quality of the briquettes produced. A screw conveyor machine equipped with a three-blade stirrer and designed with a length of 40 cm, a mixing process diameter of 10 cm and the capacity of...
Statistical Analysis for High-Dimensional Data : The Abel Symposium 2014
Bühlmann, Peter; Glad, Ingrid; Langaas, Mette; Richardson, Sylvia; Vannucci, Marina
2016-01-01
This book features research contributions from The Abel Symposium on Statistical Analysis for High Dimensional Data, held in Nyvågar, Lofoten, Norway, in May 2014. The focus of the symposium was on statistical and machine learning methodologies specifically developed for inference in “big data” situations, with particular reference to genomic applications. The contributors, who are among the most prominent researchers on the theory of statistics for high dimensional inference, present new theories and methods, as well as challenging applications and computational solutions. Specific themes include, among others, variable selection and screening, penalised regression, sparsity, thresholding, low dimensional structures, computational challenges, non-convex situations, learning graphical models, sparse covariance and precision matrices, semi- and non-parametric formulations, multiple testing, classification, factor models, clustering, and preselection. Highlighting cutting-edge research and casting light on...
Super-resolution reconstruction of 4D-CT lung data via patch-based low-rank matrix reconstruction
Fang, Shiting; Wang, Huafeng; Liu, Yueliang; Zhang, Minghui; Yang, Wei; Feng, Qianjin; Chen, Wufan; Zhang, Yu
2017-10-01
Lung 4D computed tomography (4D-CT), which is a time-resolved CT data acquisition, plays an important role in explicitly including respiratory motion in treatment planning and delivery. However, the radiation dose is usually reduced at the expense of inter-slice spatial resolution to minimize radiation-related health risk. Therefore, resolution enhancement along the superior-inferior direction is necessary. In this paper, a super-resolution (SR) reconstruction method based on patch low-rank matrix reconstruction is proposed to improve the resolution of lung 4D-CT images. Specifically, a low-rank matrix related to every patch is constructed by using a patch searching strategy. Thereafter, singular value shrinkage is employed to recover the high-resolution patch under the constraints of the image degradation model. The output high-resolution patches are finally assembled to form the entire image. This method is extensively evaluated using two public data sets. Quantitative analysis shows that the proposed algorithm decreases the root mean square error by 9.7%-33.4% and the edge width by 11.4%-24.3%, relative to linear interpolation, back projection (BP) and Zhang et al's algorithm. A new algorithm has been developed to improve the resolution of 4D-CT. In all experiments, the proposed method outperforms various interpolation methods, as well as BP and Zhang et al's method, thus indicating the effectiveness and competitiveness of the proposed algorithm.
Energy Technology Data Exchange (ETDEWEB)
Sugiyama, T. [Center for Coal Utilization, Japan, Tokyo (Japan); Tsurui, M.; Suto, Y.; Asakura, M. [JGC Corp., Tokyo (Japan); Ogawa, J.; Yui, M.; Takano, S. [Japan COM Co. Ltd., Japan, Tokyo (Japan)
1996-09-01
A CWM manufacturing technology was developed by means of upgrading low rank coals. Even though some low rank coals have such advantages as low ash, low sulfur and high volatile matter content, many of them are merely used on a small scale in areas near the mine-mouths because of high moisture content, low calorific value and high ignitability. Therefore, discussions were given on a coal fuel manufacturing technology by which coal is irreversibly dehydrated with as much volatile matter as possible remaining in the coal, and the coal is made into high-concentration CWM, so that the coal can be safely transported and stored. The technology uses a method to treat coal with hot water under high pressure and dry it with hot water. The method performs not only removal of water, but also irreversible dehydration without losing volatile matter, by decomposing hydrophilic groups on the surface and blocking micro pores with volatile matter in the coal (wax and tar). The upgrading effect was verified by processing coals in a pilot plant, which yielded greater calorific value and higher-concentration CWM than the conventional processes. A CWM combustion test proved lower NOx, lower SOx and a higher combustion rate than for bituminous coal. The ash content was also found to be lower. This process suits a Texaco-type gasification furnace. For a production scale of three million tons a year, the production cost is lower by 2 yen per 10³ kcal than for heavy oil with the same sulfur content. 11 figs., 15 tabs.
Directory of Open Access Journals (Sweden)
Sipahutar Riman
2017-01-01
Full Text Available The objective of this study is to blend the South Sumatera low rank coal and palm shell charcoal for producing bio-coal briquettes which have better fuel properties. The experimental study for making bio-coal briquettes was carried out to examine the effect of raw material composition and binder type on the quality of the briquettes produced. A screw conveyor machine equipped with a three-blade stirrer and designed with a length of 40 cm, a mixing process diameter of 10 cm and a capacity of 2 kg of bio-coal briquettes per hour was used to produce bio-coal briquettes ready for use in small industries. Proximate analyses of the South Sumatera low rank coal, palm shell charcoals and bio-coal briquettes were conducted in accordance with American Society for Testing and Materials (ASTM) standards, and the calorific value was determined by using a bomb calorimeter. The experimental results showed that the calorific value of the bio-coal briquettes was greatly influenced by the raw material composition and the binder type. The highest calorific value was 6438 cal/g for the sample SSC65-PSC20-B15(2.
Modeling High-Dimensional Multichannel Brain Signals
Hu, Lechuan
2017-12-12
Our goal is to model and measure functional and effective (directional) connectivity in multichannel brain physiological signals (e.g., electroencephalograms, local field potentials). The difficulties from analyzing these data mainly come from two aspects: first, there are major statistical and computational challenges for modeling and analyzing high-dimensional multichannel brain signals; second, there is no set of universally agreed measures for characterizing connectivity. To model multichannel brain signals, our approach is to fit a vector autoregressive (VAR) model with potentially high lag order so that complex lead-lag temporal dynamics between the channels can be captured. Estimates of the VAR model will be obtained by our proposed hybrid LASSLE (LASSO + LSE) method which combines regularization (to control for sparsity) and least squares estimation (to improve bias and mean-squared error). Then we employ some measures of connectivity but put an emphasis on partial directed coherence (PDC) which can capture the directional connectivity between channels. PDC is a frequency-specific measure that explains the extent to which the present oscillatory activity in a sender channel influences the future oscillatory activity in a specific receiver channel relative to all possible receivers in the network. The proposed modeling approach provided key insights into potential functional relationships among simultaneously recorded sites during performance of a complex memory task. Specifically, this novel method was successful in quantifying patterns of effective connectivity across electrode locations, and in capturing how these patterns varied across trial epochs and trial types.
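The two-stage LASSLE idea (regularize for support, then refit by least squares) can be sketched on a synthetic sparse VAR(1). For brevity the LASSO stage is replaced below by hard-thresholding an OLS estimate, which is only a stand-in for the actual method; dimensions and thresholds are hypothetical:

```python
import numpy as np

# Stage 1: dense estimate, sparsified (stand-in for LASSO).
# Stage 2: refit each equation by least squares restricted to its support.
rng = np.random.default_rng(1)
d, T = 5, 2000
A = np.zeros((d, d))
A[0, 0], A[1, 0], A[2, 2] = 0.5, 0.4, -0.5       # sparse VAR(1) transition matrix
X = np.zeros((T, d))
for t in range(1, T):
    X[t] = X[t - 1] @ A.T + 0.1 * rng.standard_normal(d)

Y, Z = X[1:], X[:-1]
A_stage1 = np.linalg.lstsq(Z, Y, rcond=None)[0].T   # dense first-stage estimate
support = np.abs(A_stage1) > 0.1                    # sparsify the estimate
A_hat = np.zeros_like(A)
for i in range(d):                                  # second-stage refit
    idx = np.flatnonzero(support[i])
    if idx.size:
        A_hat[i, idx] = np.linalg.lstsq(Z[:, idx], Y[:, i], rcond=None)[0]
print(np.round(A_hat, 2))
```

The refit step is what reduces the shrinkage bias of the first stage, which is the motivation the abstract gives for combining regularization with least squares estimation.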
Chen, L. X.; Wu, Q. P.
2012-10-01
Recently, Dada et al. reported on the experimental entanglement concentration and violation of generalized Bell inequalities with orbital angular momentum (OAM) [Nat. Phys. 7, 677 (2011)]. Here we demonstrate that high-dimensional entanglement concentration can be performed in arbitrary OAM subspaces with selectivity. Instead of violating the generalized Bell inequalities, the working principle of the present entanglement concentration is visualized by the biphoton OAM Klyshko picture, and its good performance is confirmed and quantified through the experimental Shannon dimensionalities after concentration.
BLANCHARD, Pierre
2017-01-01
Advanced techniques for the low-rank approximation of matrices are crucial dimension reduction tools in many domains of modern scientific computing. Hierarchical approaches like H2-matrices, in particular the Fast Multipole Method (FMM), benefit from the block low-rank structure of certain matrices to reduce the cost of computing n-body problems to O(n) operations instead of O(n²). In order to better deal with kernels of various kinds, kernel-independent FMM formulations have recently arisen ...
A Durbin-Levinson Regularized Estimator of High Dimensional Autocovariance Matrices
DEFF Research Database (Denmark)
Proietti, Tommaso; Giovannelli, Alessandro
on banding and tapering the sample autocovariance matrix. This paper proposes and evaluates an alternative approach, based on regularizing the sample partial autocorrelation function, via a modified Durbin-Levinson algorithm that receives as input the banded and tapered partial autocorrelations and returns...
Inverse m-matrices and ultrametric matrices
Dellacherie, Claude; San Martin, Jaime
2014-01-01
The study of M-matrices, their inverses and discrete potential theory is now a well-established part of linear algebra and the theory of Markov chains. The main focus of this monograph is the so-called inverse M-matrix problem, which asks for a characterization of nonnegative matrices whose inverses are M-matrices. We present an answer in terms of discrete potential theory based on the Choquet-Deny Theorem. A distinguished subclass of inverse M-matrices is ultrametric matrices, which are important in applications such as taxonomy. Ultrametricity is revealed to be a relevant concept in linear algebra and discrete potential theory because of its relation with trees in graph theory and mean expected value matrices in probability theory. Remarkable properties of Hadamard functions and products for the class of inverse M-matrices are developed and probabilistic insights are provided throughout the monograph.
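The monograph's central objects can be probed numerically. The small check below (example matrix chosen by hand) verifies that the inverse of a strictly ultrametric matrix has nonpositive off-diagonal entries, i.e. is an M-matrix:

```python
import numpy as np

# U is strictly ultrametric: symmetric, nonnegative, U[i,j] >= min(U[i,k],
# U[k,j]) for all i, j, k, with strictly dominant diagonal. Its inverse
# should then be an M-matrix (nonpositive off-diagonal entries).
U = np.array([[3.0, 1.0, 1.0],
              [1.0, 3.0, 2.0],
              [1.0, 2.0, 3.0]])
A = np.linalg.inv(U)
off_diagonal = A - np.diag(np.diag(A))
print(np.all(off_diagonal <= 1e-12))  # True
```

This is a 3x3 instance of the Martinez-Michon-San Martin theorem that the monograph builds on; the diagonal of the inverse is positive, consistent with the Stieltjes structure.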
Gao, Shibo; Cheng, Yongmei; Song, Chunhua
2013-09-01
The technology of vision-based probe-and-drogue autonomous aerial refueling is an amazing task in modern aviation for both manned and unmanned aircraft. A key issue is to determine the relative orientation and position of the drogue and the probe accurately for relative navigation system during the approach phase, which requires locating the drogue precisely. Drogue detection is a challenging task due to disorderly motion of drogue caused by both the tanker wake vortex and atmospheric turbulence. In this paper, the problem of drogue detection is considered as a problem of moving object detection. A drogue detection algorithm based on low rank and sparse decomposition with local multiple features is proposed. The global and local information of drogue is introduced into the detection model in a unified way. The experimental results on real autonomous aerial refueling videos show that the proposed drogue detection algorithm is effective.
Energy Technology Data Exchange (ETDEWEB)
Takarada, Y.; Kato, K.; Kuroda, M.; Nakagawa, N. [Gunma University, Gunma (Japan). Faculty of Engineering; Roman, M. [New Energy and Industrial Technology Development Organization, Tokyo, (Japan)
1997-02-01
Experiment reveals the characteristics of low rank coal serving as a desulfurizing material in fluidized coal bed reactor with oxygen-containing functional groups exchanged with Ca ions. This effort aims at identifying inexpensive Ca materials and determining the desulfurizing characteristics of Ca-carrying brown coal. A slurry of cement sludge serving as a Ca source and low rank coal is agitated for the exchange of functional groups and Ca ions, and the desulfurizing characteristics of the Ca-carrying brown coal is determined. The Ca-carrying brown coal and high-sulfur coal char is mixed and incinerated in a fluidized bed reactor, and it is found that a desulfurization rate of 75% is achieved when the Ca/S ratio is 1 in the desulfurization of SO2. This rate is far higher than the rate obtained when limestone or cement sludge without preliminary treatment is used as a desulfurizer. Next, Ca-carrying brown coal and H2S are caused to react upon each other in a fixed bed reactor, and then it is found that desulfurization characteristics are not dependent on the diameter of the Ca-carrying brown coal grain, that the coal is different from limestone in that it stays quite active against H2S for long 40 minutes after the start of the reaction, and that CaO small in crystal diameter is dispersed in quantities into the char upon thermal disintegration of Ca-carrying brown coal to cause the coal to say quite active. 5 figs.
Fractal generalized Pascal matrices
Burlachenko, E.
2016-01-01
The set of generalized Pascal matrices, whose elements are generalized binomial coefficients, is considered as an integral object. A special system of generalized Pascal matrices, on the basis of which we build fractal generalized Pascal matrices, is introduced. The Pascal matrix (Pascal triangle) is the Hadamard product of the fractal generalized Pascal matrices. The concept of zero generalized Pascal matrices, an example of which is the Pascal triangle modulo 2, arises in connection with the system ...
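The motivating example of a zero generalized Pascal matrix, the Pascal triangle modulo 2, is easy to reproduce; the sketch below also checks its self-similar block structure, a consequence of Lucas' theorem:

```python
import numpy as np

# Build the lower-triangular Pascal matrix and reduce it modulo 2,
# yielding the Sierpinski-like pattern the abstract uses as its example.
n = 8
P = np.zeros((n, n), dtype=int)
P[:, 0] = 1
for i in range(1, n):
    for j in range(1, i + 1):
        P[i, j] = P[i - 1, j - 1] + P[i - 1, j]   # binomial recurrence
P2 = P % 2                                         # Pascal triangle mod 2
print(P2)
```

By Lucas' theorem, C(i+4, j+4) and C(i+4, j) agree with C(i, j) modulo 2 for i, j < 4, so the top-left 4x4 block tiles the lower-left and lower-right blocks while the upper-right block is zero, which is exactly the fractal self-similarity the paper develops.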
High dimensional data driven statistical mechanics.
Adachi, Yoshitaka; Sadamatsu, Sunao
2014-11-01
In "3D4D materials science", there are five categories: (a) image acquisition, (b) processing, (c) analysis, (d) modelling, and (e) data sharing. This presentation highlights the core of these categories [1]. Analysis and modelling: A three-dimensional (3D) microstructure image contains topological features such as connectivity in addition to metric features. Such additional microstructural information seems to be useful for more precise property prediction. There are two ways for microstructure-based property prediction (Fig. 1a). One is 3D image-data-based modelling such as micromechanics or the crystal plasticity finite element method. The other is a machine learning approach driven by numerical microstructural features, such as an artificial neural network or a Bayesian estimation method. The key is to convert the 3D image data into numerical values in order to apply the dataset to property prediction. As numerical features of microstructures, grain size, number density of particles, connectivity of particles, grain boundary connectivity, stacking degree, clustering, etc. should be taken into consideration. These microstructural features are the so-called "materials genome". Among the materials genome, we have to find the dominant factors that determine a focused property. The dominant factors are defined as "descriptor(s)" in high dimensional data driven statistical mechanics. Fig. 1. (a) A concept of 3D4D materials science. (b) Fully-automated serial sectioning 3D microscope "Genus_3D". (c) Materials Genome Archive (JSPS). Image acquisition: It is important for researchers to choose a 3D microscope from among various microscopes depending on the length scale of the focused microstructure. There has been a long-standing request to acquire 3D microstructure images more conveniently. Therefore a fully automated serial sectioning 3D optical microscope, "Genus_3D" (Fig. 1b), has been developed and it is now commercially available. A user can get a good
Directory of Open Access Journals (Sweden)
Xinglin Piao
2014-12-01
Full Text Available The emerging low rank matrix approximation (LRMA) method provides an energy-efficient scheme for data collection in wireless sensor networks (WSNs) by randomly sampling a subset of sensor nodes for data sensing. However, the existing LRMA-based methods generally underutilize the spatial or temporal correlation of the sensing data, resulting in uneven energy consumption and thus shortening the network lifetime. In this paper, we propose a correlated spatio-temporal data collection method for WSNs based on LRMA. In the proposed method, both the temporal consistency and the spatial correlation of the sensing data are simultaneously integrated under a new LRMA model. Moreover, the network energy consumption issue is considered in the node sampling procedure. We use the Gini index to measure both the spatial distribution of the selected nodes and the evenness of the network energy status, and then formulate and solve an optimization problem to achieve optimized node sampling. The proposed method is evaluated on both simulated and real wireless networks and compared with state-of-the-art methods. The experimental results show that the proposed method efficiently reduces the energy consumption of the network and prolongs the network lifetime with high data recovery accuracy and good stability.
Min, Junhong; Carlini, Lina; Unser, Michael; Manley, Suliana; Ye, Jong Chul
2015-09-01
Localization microscopy such as STORM/PALM can achieve a nanometer-scale spatial resolution by iteratively localizing fluorescence molecules. It was shown that imaging of densely activated molecules can improve the temporal resolution, which had been considered a major limitation of localization microscopy. However, this higher-density imaging needs to incorporate advanced localization algorithms to deal with overlapping point spread functions (PSFs). In order to address these technical challenges, we previously developed a localization algorithm called FALCON [1, 2] using a quasi-continuous localization model with a sparsity prior on image space. It was demonstrated in both 2D and 3D live cell imaging. However, it has several aspects that can be further improved. Here, we propose a new localization algorithm using an annihilating filter-based low rank Hankel structured matrix approach (ALOHA). According to the ALOHA principle, sparsity in the image domain implies the existence of a rank-deficient Hankel structured matrix in Fourier space. Thanks to this fundamental duality, our new algorithm can perform data-adaptive PSF estimation and deconvolution of the Fourier spectrum, followed by truly grid-free localization using spectral estimation techniques. Furthermore, all these optimizations are conducted in Fourier space only. We validated the performance of the new method with numerical experiments and a live cell imaging experiment. The results confirmed that it has higher localization performance in both experiments in terms of accuracy and detection rate.
Lee, Juyoung; Jin, Kyong Hwan; Ye, Jong Chul
2016-12-01
MR measurements from an echo-planar imaging (EPI) sequence produce Nyquist ghost artifacts that originate from inconsistencies between odd and even echoes. Several reconstruction algorithms have been proposed to reduce such artifacts, but most of these methods require either additional reference scans or multipass EPI acquisition. This article proposes a novel and accurate single-pass EPI ghost artifact correction method that does not require any additional reference data. After converting the ghost correction problem into separate k-space data interpolation problems for even and odd phase encoding, our algorithm exploits the observation that the differential k-space data between the even and odd echoes is the Fourier transform of an underlying sparse image. Accordingly, we can construct a rank-deficient Hankel structured matrix, whose missing data can be recovered using an annihilating filter-based low-rank Hankel structured matrix completion approach. The proposed method was applied to EPI data for both single- and multicoil acquisitions. Experimental results using in vivo data confirmed that, owing to the discovery of the annihilating filter relationship from the intrinsic EPI image property, the proposed method can remove ghost artifacts successfully without prescan echoes. Magn Reson Med 76:1775-1789, 2016. © 2016 International Society for Magnetic Resonance in Medicine.
Liu, Ryan Wen; Shi, Lin; Yu, Simon Chun Ho; Xiong, Naixue; Wang, Defeng
2017-03-03
Dynamic magnetic resonance imaging (MRI) has been extensively utilized for enhancing medical living environment visualization; however, in clinical practice it often suffers from long data acquisition times. Dynamic imaging essentially reconstructs the visual image from raw (k,t)-space measurements, commonly referred to as big data. The purpose of this work is to accelerate big medical data acquisition in dynamic MRI by developing a non-convex minimization framework. In particular, to overcome the inherent speed limitation, non-convex low-rank and sparsity constraints were combined to accelerate the dynamic imaging. However, the non-convex constraints make the dynamic reconstruction problem difficult to solve directly with commonly used numerical methods. To guarantee solution efficiency and stability, a numerical algorithm based on the Alternating Direction Method of Multipliers (ADMM) is proposed to solve the resulting non-convex optimization problem. ADMM decomposes the original complex optimization problem into several simple sub-problems, each of which has a closed-form solution or can be solved efficiently using existing numerical methods. It has been shown that the quality of images reconstructed from fewer measurements can be significantly improved using non-convex minimization. Numerous experiments were conducted on two in vivo cardiac datasets to compare the proposed method with several state-of-the-art imaging methods. Experimental results illustrate that the proposed method delivers superior imaging performance in terms of quantitative and visual image quality assessments.
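The abstract does not spell out the ADMM sub-problems; in low-rank-plus-sparse formulations they typically reduce to the two proximal operators sketched below: soft thresholding for the sparsity term and singular value thresholding (SVT) for the low-rank term. This is a minimal sketch of those building blocks under that assumption, not the authors' algorithm:

```python
import numpy as np

def soft_threshold(x, tau):
    """Proximal operator of the l1-norm term (the sparsity sub-problem)."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def svt(X, tau):
    """Singular value thresholding: proximal operator of the nuclear norm
    (the convex surrogate for the low-rank sub-problem)."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * soft_threshold(s, tau)) @ Vt

rng = np.random.default_rng(0)
X = rng.standard_normal((20, 15))

# SVT shrinks every singular value by exactly tau (clipping at zero)
s_before = np.linalg.svd(X, compute_uv=False)
s_after = np.linalg.svd(svt(X, 1.0), compute_uv=False)
assert np.allclose(s_after, np.maximum(s_before - 1.0, 0.0), atol=1e-8)
assert np.allclose(soft_threshold(np.array([3.0, -0.2]), 0.5), [2.5, 0.0])
```

An ADMM loop would alternate these proximal updates with a dual-variable (multiplier) update; non-convex variants replace the shrinkage with harder thresholding rules.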
Energy Technology Data Exchange (ETDEWEB)
Aytar, Pinar; Gedikli, Serap [Graduate School of Natural and Applied Sciences, Eskisehir Osmangazi University (Turkey); Sam, Mesut [Department of Biology, Faculty of Arts and Science, Aksaray University (Turkey); Uenal, Arzu [Ministry of Agriculture and Rural Affairs, General Directorate of Agricultural Research, Ankara (Turkey); Cabuk, Ahmet [Department of Biology, Faculty of Arts and Science, Eskisehir Osmangazi University (Turkey); Kolankaya, Nazif [Department of Biology, Division of Biotechnology, Faculty of Science, Hacettepe University, Ankara (Turkey); Yueruem, Alp [Grand Water Research Institute, Technion Israel Institute of Technology, Haifa (Israel)
2011-01-15
In this paper, data obtained during the oxidative desulphurization of some low-rank Turkish lignites with crude laccase enzyme produced from Trametes versicolor ATCC 200801 are presented. In order to optimize desulphurization conditions, the effects of incubation time, pulp density, incubation temperature, medium pH, and lignite source on desulphurization were examined. The optimum values for incubation period, pulp density, temperature and pH were found to be 30 min, 5%, 35 °C, and pH 5.0, respectively. Under optimum conditions, treatment of coal samples with crude laccase caused a nearly 29% reduction in their total sulphur content. The rate of desulphurization of the coal sample from Tuncbilek with crude laccase was found to be relatively higher than that of the other examined coal samples. Analytical assays indicated that the treatment of coals with crude laccase caused no change in their calorific values but reduced their sulphur emissions. Reductions of 35%, 13%, and 25% in pyritic sulphur, sulphate and organic sulphur, respectively, were achieved in a period of 30 min for a particle size of 200 μm under optimal conditions with enzymatic desulphurization. Statistical analyses such as Tukey multiple comparison tests and ANOVA were also performed. (author)
Directory of Open Access Journals (Sweden)
Mahidin Mahidin
2016-08-01
Calcium oxide-based material is available abundantly and naturally. A potential resource of this material is marine mollusk shells such as clams, scallops, mussels, oysters, winkles and nerites. CaO-based material has exhibited good performance as a desulfurizer or adsorbent in coal combustion to reduce SO2 emission. In this study, pulverized green mussel shell, without calcination, was utilized as the desulfurizer in a briquette produced from a mixture of low-rank coal and palm kernel shell (PKS), also known as bio-briquette. The ratio of coal to PKS in the briquette was 90:10 (wt/wt). The influence of green mussel shell content and combustion temperature was examined to prove the possible use of this material as a desulfurizer. The ratios of Ca to S (Ca = calcium content in the desulfurizer; S = sulfur content in the briquette) were fixed at 1:1, 1.25:1, 1.5:1, 1.75:1, and 2:1 (mole/mole). The burning (or desulfurization) temperature range was 300-500 °C, the reaction time was 720 seconds and the air flow rate was 1.2 L/min. The results showed that green mussel shell can be used as a desulfurizer in coal briquette or bio-briquette combustion. The desulfurization process using this desulfurizer exhibited first-order reaction behavior and a highest average efficiency of 84.5%.
Scoping Studies to Evaluate the Benefits of an Advanced Dry Feed System on the Use of Low-Rank Coal
Energy Technology Data Exchange (ETDEWEB)
Rader, Jeff; Aguilar, Kelly; Aldred, Derek; Chadwick, Ronald; Conchieri; Dara, Satyadileep; Henson, Victor; Leininger, Tom; Liber, Pawel; Nakazono, Benito; Pan, Edward; Ramirez, Jennifer; Stevenson, John; Venkatraman, Vignesh
2012-11-30
This report describes the development of the design of an advanced dry feed system that was carried out under Task 4.0 of Cooperative Agreement DE-FE0007902 with the US DOE, “Scoping Studies to Evaluate the Benefits of an Advanced Dry Feed System on the use of Low-Rank Coal.” The resulting design will be used for the advanced technology IGCC case with 90% carbon capture for sequestration to be developed under Task 5.0 of the same agreement. The scope of work covered coal preparation and feeding up through the gasifier injector. Subcomponents have been broken down into feed preparation (including grinding and drying), low pressure conveyance, pressurization, high pressure conveyance, and injection. Pressurization of the coal feed is done using Posimetric Feeders sized for the application. In addition, a secondary feed system is described for preparing and feeding slag additive and recycle fines to the gasifier injector. This report includes information on the basis for the design, requirements for down selection of the key technologies used, the down selection methodology and the final, down selected design for the Posimetric Feed System, or PFS.
Energy Technology Data Exchange (ETDEWEB)
Grasset, L.; Vlckova, Z.; Kucerik, J.; Ambles, A. [University of Poitiers, Poitiers (France)
2010-09-15
Traditional CuO oxidation and thermochemolysis with tetramethylammonium hydroxide are the two main methods for lignin characterization in gymnosperm wood, and in soils and sediments formed from degraded gymnosperm wood, or for assessing the supply of terrestrial organic matter to marine sediments. In some cases, the overall lignin yield and the compound ratios used as plant source proxies have been found to be considerably different, depending on the method used. Thus, there is a need for finding efficient and more selective methods for lignin alpha- and beta-aryl ether cleavage. Derivatization followed by reductive cleavage (the DFRC method) is suitable for lignocellulose material. Results from the DFRC method applied to the characterization of humic acids of a lignite (low rank coal) from the Czech Republic show that they contain intact lignin monomers with a dominance of coniferyl units, in accord with the gymnosperm origin of the lignite. It is expected that DFRC will be suitable also for tracing lignin in other sediments.
Accelerating Matrix-Vector Multiplication on Hierarchical Matrices Using Graphical Processing Units
Boukaram, W.
2015-03-25
Large dense matrices arise from the discretization of many physical phenomena in computational sciences. In statistics, very large dense covariance matrices are used for describing random fields and processes. One can, for instance, describe the distribution of dust particles in the atmosphere, the concentration of mineral resources in the earth's crust or an uncertain permeability coefficient in reservoir modeling. When the problem size grows, storing and computing with the full dense matrix becomes prohibitively expensive both in terms of computational complexity and physical memory requirements. Fortunately, these matrices can often be approximated by a class of data-sparse matrices called hierarchical matrices (H-matrices), where various sub-blocks of the matrix are approximated by low-rank matrices. These matrices can be stored in memory that grows linearly with the problem size. In addition, arithmetic operations on these H-matrices, such as matrix-vector multiplication, can be completed in almost linear time. Originally the H-matrix technique was developed for the approximation of stiffness matrices coming from partial differential and integral equations. Parallelizing these arithmetic operations on the GPU has been the focus of this work, and we present work done on the matrix-vector operation on the GPU using the KSPARSE library.
Introduction into Hierarchical Matrices
Litvinenko, Alexander
2013-12-05
Hierarchical matrices allow us to reduce computational storage and cost from cubic to almost linear. This technique can be applied for solving PDEs, integral equations, matrix equations and approximation of large covariance and precision matrices.
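The storage reduction that hierarchical matrices exploit comes from off-diagonal blocks being numerically low rank. A minimal illustration with a smooth kernel evaluated on two well-separated point clusters (the kernel, cluster geometry and tolerance are arbitrary choices for this sketch, not taken from the text):

```python
import numpy as np

# Kernel block K[i, j] = 1 / (1 + |x_i - y_j|) between two separated clusters;
# such admissible off-diagonal blocks are numerically low rank.
x = np.linspace(0.0, 1.0, 200)
y = np.linspace(10.0, 11.0, 200)
K = 1.0 / (1.0 + np.abs(x[:, None] - y[None, :]))

U, s, Vt = np.linalg.svd(K)
k = int(np.sum(s > 1e-10 * s[0]))          # numerical rank at tolerance 1e-10
K_approx = (U[:, :k] * s[:k]) @ Vt[:k]     # rank-k factorized form

# Storage drops from 200*200 entries to roughly k*(200 + 200) entries
assert k < 20
assert np.linalg.norm(K - K_approx, 2) <= 1e-9 * s[0]
```

An H-matrix stores only such factors for admissible blocks (and dense blocks near the diagonal), which is where the near-linear storage and matvec costs come from.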
Multivariate statistics high-dimensional and large-sample approximations
Fujikoshi, Yasunori; Shimizu, Ryoichi
2010-01-01
A comprehensive examination of high-dimensional analysis of multivariate methods and their real-world applications Multivariate Statistics: High-Dimensional and Large-Sample Approximations is the first book of its kind to explore how classical multivariate methods can be revised and used in place of conventional statistical tools. Written by prominent researchers in the field, the book focuses on high-dimensional and large-scale approximations and details the many basic multivariate methods used to achieve high levels of accuracy. The authors begin with a fundamental presentation of the basic
Circulant conference matrices for new complex Hadamard matrices
Dita, Petre
2011-01-01
The circulant real and complex matrices are used to find new real and complex conference matrices. With them we construct Sylvester inverse orthogonal matrices by doubling the size of inverse complex conference matrices. When the free parameters take values on the unit circle the inverse orthogonal matrices transform into complex Hadamard matrices. The method is used for $n=6$ conference matrices and in this way we find new parametrisations of Hadamard matrices for dimension $ n=12$.
Directory of Open Access Journals (Sweden)
Mahidin Mahidin
2012-12-01
NOx and N2O emissions from coal combustion are claimed to be major contributors to acid rain, photochemical smog, greenhouse effects and ozone depletion. Based on these facts, the study of the formation of those emissions is a topic of interest in the combustion area. In this paper, a theoretical study by modeling and simulation of NOx and N2O formation in co-combustion of low-rank coal and palm kernel shell has been carried out. The combustion model was developed using the principle of chemical-reaction equilibrium. Simulation of the model to evaluate the composition of the flue gas was performed by minimizing the Gibbs free energy. The results showed that introducing biomass into coal combustion can reduce the NOx concentration considerably. The maximum NO level in co-combustion of low-rank coal and palm kernel shell with a fuel composition of 1:1 is 2,350 ppm, low compared to single low-rank coal combustion at up to 3,150 ppm. Moreover, N2O is less than 0.25 ppm in all cases. Keywords: low-rank coal, N2O emission, NOx emission, palm kernel shell
Chen, Xiao; Salerno, Michael; Yang, Yang; Epstein, Frederick H.
2014-01-01
Purpose: Dynamic contrast-enhanced MRI of the heart is well-suited for acceleration with compressed sensing (CS) due to its spatiotemporal sparsity; however, respiratory motion can degrade sparsity and lead to image artifacts. We sought to develop a motion-compensated CS method for this application. Methods: A new method, Block LOw-rank Sparsity with Motion-guidance (BLOSM), was developed to accelerate first-pass cardiac MRI, even in the presence of respiratory motion. This method divides the images into regions, tracks the regions through time, and applies matrix low-rank sparsity to the tracked regions. BLOSM was evaluated using computer simulations and first-pass cardiac datasets from human subjects. Using rate-4 acceleration, BLOSM was compared to other CS methods such as k-t SLR, which employs matrix low-rank sparsity applied to the whole image dataset, with and without motion tracking, and to k-t FOCUSS with motion estimation and compensation, which employs spatial and temporal-frequency sparsity. Results: BLOSM was qualitatively shown to reduce respiratory artifact compared to other methods. Quantitatively, using root mean squared error and the structural similarity index, BLOSM was superior to other methods. Conclusion: BLOSM, which exploits regional low-rank structure and uses region tracking for motion compensation, provides improved image quality for CS-accelerated first-pass cardiac MRI. PMID:24243528
Bayesian Variable Selection in High-dimensional Applications
V. Rockova (Veronika)
2013-01-01
Advances in research technologies over the past few decades have encouraged the proliferation of massive datasets, revolutionizing statistical perspectives on high-dimensionality. High-throughput technologies have become pervasive in diverse scientific disciplines
Effects of dependence in high-dimensional multiple testing problems
Kim, K.I.; van de Wiel, M.A.
2008-01-01
Background: We consider effects of dependence among variables of high-dimensional data in multiple hypothesis testing problems, in particular the False Discovery Rate (FDR) control procedures. Recent simulation studies consider only simple correlation structures among variables, which is hardly
El Gharamti, Mohamad
2014-02-01
The accuracy of groundwater flow and transport model predictions highly depends on our knowledge of subsurface physical parameters. Assimilation of contaminant concentration data from shallow dug wells could help improving model behavior, eventually resulting in better forecasts. In this paper, we propose a joint state-parameter estimation scheme which efficiently integrates a low-rank extended Kalman filtering technique, namely the Singular Evolutive Extended Kalman (SEEK) filter, with the prominent complex-step method (CSM). The SEEK filter avoids the prohibitive computational burden of the Extended Kalman filter by updating the forecast along the directions of error growth only, called filter correction directions. CSM is used within the SEEK filter to efficiently compute model derivatives with respect to the state and parameters along the filter correction directions. CSM is derived using complex Taylor expansion and is second order accurate. It is proven to guarantee accurate gradient computations with zero numerical round-off errors, but requires complexifying the numerical code. We perform twin-experiments to test the performance of the CSM-based SEEK for estimating the state and parameters of a subsurface contaminant transport model. We compare the efficiency and the accuracy of the proposed scheme with two standard finite difference-based SEEK filters as well as with the ensemble Kalman filter (EnKF). Assimilation results suggest that the use of the CSM in the context of the SEEK filter may provide up to 80% more accurate solutions when compared to standard finite difference schemes and is competitive with the EnKF, even providing more accurate results in certain situations. We analyze the results based on two different observation strategies. We also discuss the complexification of the numerical code and show that this could be efficiently implemented in the context of subsurface flow models. © 2013 Elsevier B.V.
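The complex-step method mentioned above is easy to demonstrate: evaluating f(x + ih) for a tiny imaginary step h gives the derivative from the imaginary part, with zero subtractive cancellation. A minimal sketch (the test function is an arbitrary stand-in for a model output, not the subsurface transport model):

```python
import numpy as np

def complex_step_derivative(f, x, h=1e-20):
    """Second-order accurate derivative via a complex Taylor expansion:
    f(x + ih) = f(x) + ih f'(x) - h^2 f''(x)/2 + ..., so Im(f(x+ih))/h ~ f'(x).
    No difference of nearby numbers is formed, hence no round-off error."""
    return np.imag(f(x + 1j * h)) / h

f = lambda x: np.exp(x) * np.sin(x)   # illustrative smooth "model output"
x0 = 0.7
exact = np.exp(x0) * (np.sin(x0) + np.cos(x0))
assert abs(complex_step_derivative(f, x0) - exact) < 1e-12
```

Because the step can be taken absurdly small (here 1e-20) without cancellation, the gradient is accurate to machine precision; the price, as the abstract notes, is that the numerical code must accept complex arguments.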
Tensor Dictionary Learning for Positive Definite Matrices.
Sivalingam, Ravishankar; Boley, Daniel; Morellas, Vassilios; Papanikolopoulos, Nikolaos
2015-11-01
Sparse models have proven to be extremely successful in image processing and computer vision. However, a majority of the effort has been focused on sparse representation of vectors and low-rank models for general matrices. The success of sparse modeling, along with popularity of region covariances, has inspired the development of sparse coding approaches for these positive definite descriptors. While in earlier work, the dictionary was formed from all, or a random subset of, the training signals, it is clearly advantageous to learn a concise dictionary from the entire training set. In this paper, we propose a novel approach for dictionary learning over positive definite matrices. The dictionary is learned by alternating minimization between sparse coding and dictionary update stages, and different atom update methods are described. A discriminative version of the dictionary learning approach is also proposed, which simultaneously learns dictionaries for different classes in classification or clustering. Experimental results demonstrate the advantage of learning dictionaries from data both from reconstruction and classification viewpoints. Finally, a software library is presented comprising C++ binaries for all the positive definite sparse coding and dictionary learning approaches presented here.
Directory of Open Access Journals (Sweden)
Zutao Zhang
2016-06-01
Environmental perception and information processing are two key steps of active safety for vehicle reversing. Single-sensor environmental perception cannot meet the needs of vehicle reversing safety due to its low reliability. In this paper, we present a novel multi-sensor environmental perception method using low-rank representation and a particle filter for vehicle reversing safety. The proposed system consists of four main modules, namely multi-sensor environmental perception, information fusion, target recognition and tracking using low-rank representation and a particle filter, and vehicle reversing speed control. First, the multi-sensor environmental perception module, based on a binocular-camera system and ultrasonic range finders, obtains the distance data for obstacles behind the vehicle while reversing. Second, an information fusion algorithm using an adaptive Kalman filter processes the data obtained by the multi-sensor environmental perception module, which greatly improves the robustness of the sensors. Then the framework of a particle filter and low-rank representation is used to track the main obstacles; the low-rank representation is used to optimize an objective particle template that has the smallest L1 norm. Finally, the electronic throttle opening and automatic braking are controlled by the proposed vehicle reversing control strategy prior to any potential collision, making reversing control safer and more reliable. System simulation and practical testing results demonstrate the validity of the proposed method.
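The adaptive Kalman filter used for fusion is not specified in detail; the core of any such fusion step is the scalar Kalman update below, which combines two noisy range readings by inverse-variance weighting. The sensor values and variances are made-up numbers for illustration:

```python
def fuse(z1, var1, z2, var2):
    """One scalar Kalman update: treat z1 as the prior, z2 as the measurement.
    Equivalent to inverse-variance weighting: var = 1/(1/var1 + 1/var2)."""
    gain = var1 / (var1 + var2)
    z = z1 + gain * (z2 - z1)
    var = (1.0 - gain) * var1
    return z, var

# Hypothetical ultrasonic (coarse) and stereo-camera (finer) ranges, in metres
z, var = fuse(2.30, 0.04, 2.18, 0.01)
assert 2.18 < z < 2.30          # fused estimate lies between the readings
assert var < min(0.04, 0.01)    # and is more certain than either sensor alone
```

An adaptive variant would additionally re-estimate `var1`/`var2` online from innovation statistics, which is presumably what gives the robustness claimed above.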
Lee, Dong-Wook; Bae, Jong-Soo; Lee, Young-Joo; Park, Se-Joon; Hong, Jai-Chang; Lee, Byoung-Hwa; Jeon, Chung-Hwan; Choi, Young-Chan
2013-02-05
Coal-fired power plants are facing two major independent problems, namely the burden of reducing CO2 emissions to comply with renewable portfolio standards (RPS) and cap-and-trade systems, and the need to use low-rank coal due to the instability of the high-rank coal supply. To address these unresolved issues, integrated gasification combined cycle (IGCC) with carbon capture and storage (CCS) has been suggested, and low-rank coal has been upgraded by high-pressure and high-temperature processes. However, IGCC incurs huge construction costs, and the coal upgrading processes require fossil-fuel-derived additives and harsh operating conditions. Here, we first show a hybrid coal that can solve these two problems simultaneously while using existing power plants. Hybrid coal is defined as a two-in-one fuel combining low-rank coal with a sugar-cane-derived bioliquid, such as molasses or sugar cane juice, by bioliquid diffusion into coal intrapores and precarbonization of the bioliquid. Unlike a simple blend of biomass and coal, which shows dual combustion behavior, hybrid coal exhibits a single coal combustion pattern. If hybrid coal (biomass/coal ratio = 28 wt %) is used as a fuel for 500 MW power generation, the net CO2 emission is 21.2-33.1% and 12.5-25.7% lower than those for low-rank coal and design coal, respectively, and the required coal supply can be reduced by 33% compared with low-rank coal. Considering high oil prices and the time required before a stable renewable energy supply can be established, hybrid coal can be recognized as an innovative low-carbon-emission energy technology that bridges the gulf between fossil fuels and renewable energy, because various water-soluble biomasses could be used as additives for hybrid coal through proper modification of preparation conditions.
Matrices and linear transformations
Cullen, Charles G
1990-01-01
"Comprehensive . . . an excellent introduction to the subject." - Electronic Engineer's Design Magazine. This introductory textbook, aimed at sophomore- and junior-level undergraduates in mathematics, engineering, and the physical sciences, offers a smooth, in-depth treatment of linear algebra and matrix theory. The major objects of study are matrices over an arbitrary field. Contents include Matrices and Linear Systems; Vector Spaces; Determinants; Linear Transformations; Similarity: Part I and Part II; Polynomials and Polynomial Matrices; Matrix Analysis; and Numerical Methods. The first
High Dimensional Modulation and MIMO Techniques for Access Networks
DEFF Research Database (Denmark)
Binti Othman, Maisara
Exploration of advanced modulation formats and multiplexing techniques for next-generation optical access networks is of interest as a promising solution for delivering multiple services to end-users. This thesis addresses this from two different angles: high-dimensionality carrierless amplitude phase (CAP) modulation … wired-wireless access networks … the capacity per wavelength of the femto-cell network. A bit rate of up to 1.59 Gbps with fiber-wireless transmission over a 1 m air distance is demonstrated. The results presented in this thesis demonstrate the feasibility of high-dimensionality CAP in increasing the number of dimensions and their potential … to be utilized for multiple service allocation to different users. MIMO multiplexing techniques with OFDM provide scalability in increasing spectral efficiency and bit rates for RoF systems. High-dimensional CAP and MIMO multiplexing techniques are two promising solutions for supporting wired and hybrid …
Modified Cheeger and Ratio Cut Methods Using the Ginzburg-Landau Functional for Classification of High-Dimensional Data
Merkurjev, Ekaterina; Andrea …
2016-02-01
A common choice for the weight function is the Zelnik-Manor and Perona function [31] for sparse matrices: w(x, y) = exp(−M(x, y)² / √(τ(x)τ(y))) (Eq. 49). The related Ginzburg-Landau functional is used in the derivation of the methods. The graph framework discussed in this paper is undirected.
Cheng, Jiubing
2014-08-05
In elastic imaging, the extrapolated vector fields are decomposed into pure wave modes, such that the imaging condition produces interpretable images that characterize the reflectivity of different reflection types. Conventionally, wavefield decomposition in anisotropic media is costly because the operators involved depend on the velocity and are thus not stationary. In this abstract, we propose an efficient approach to directly extrapolate the decomposed elastic waves using low-rank approximate mixed space/wavenumber domain integral operators for heterogeneous transverse isotropic (TI) media. The low-rank approximation is thus applied to the pseudo-spectral extrapolation and decomposition at the same time. The pseudo-spectral implementation also allows for relatively large time steps in which the low-rank approximation is applied. Synthetic examples show that it can yield dispersion-free extrapolation of the decomposed quasi-P (qP) and quasi-SV (qSV) modes, which can be used for imaging, as well as the total elastic wavefields.
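The low-rank idea can be illustrated on a simple mixed space/wavenumber symbol: for a smoothly varying velocity, the matrix W[x, k] = cos(c(x) k Δt) (a basic scalar two-way pseudo-spectral extrapolator, chosen here purely for illustration; not the authors' anisotropic operators) is numerically low rank, so it can be applied with a handful of FFT passes instead of a dense matrix product:

```python
import numpy as np

nx, nk = 64, 64
x = np.linspace(0.0, 1.0, nx)
k = np.linspace(0.0, 10.0, nk)
dt = 0.01
# Mildly varying velocity model (made up) -> smooth mixed-domain symbol
c = 1.0 + 0.1 * np.sin(2 * np.pi * x)
W = np.cos(np.outer(c, k) * dt)            # W[x, k] = cos(c(x) k dt)

U, s, Vt = np.linalg.svd(W)
r = int(np.sum(s > 1e-10 * s[0]))          # numerical rank of the symbol
W_r = (U[:, :r] * s[:r]) @ Vt[:r]

assert r < 10                              # far smaller than nx or nk
assert np.linalg.norm(W - W_r, 2) <= 1e-9 * s[0]
```

With r separated terms, one time step costs r FFT-based applications rather than an O(nx·nk) dense product, which is the efficiency the abstract refers to.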
High-dimensional model estimation and model selection
CERN. Geneva
2015-01-01
I will review concepts and algorithms from high-dimensional statistics for linear model estimation and model selection. I will particularly focus on the so-called p>>n setting where the number of variables p is much larger than the number of samples n. I will focus mostly on regularized statistical estimators that produce sparse models. Important examples include the LASSO and its matrix extension, the Graphical LASSO, and more recent non-convex methods such as the TREX. I will show the applicability of these estimators in a diverse range of scientific applications, such as sparse interaction graph recovery and high-dimensional classification and regression problems in genomics.
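As a concrete example of a sparse regularized estimator in the p >> n setting, the LASSO can be solved by iterative soft thresholding (ISTA). This is a minimal sketch with made-up dimensions and a noiseless synthetic design, not the speaker's material:

```python
import numpy as np

def ista_lasso(X, y, lam, n_iter=500):
    """ISTA for min_b 0.5*||y - Xb||^2 + lam*||b||_1."""
    L = np.linalg.norm(X, 2) ** 2          # Lipschitz constant of the gradient
    b = np.zeros(X.shape[1])
    for _ in range(n_iter):
        g = X.T @ (X @ b - y)              # gradient of the smooth term
        z = b - g / L                      # gradient step
        b = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # prox step
    return b

rng = np.random.default_rng(1)
n, p = 50, 200                             # p >> n regime
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:3] = [4.0, -3.0, 2.0]                # 3-sparse ground truth
y = X @ beta

b_hat = ista_lasso(X, y, lam=2.0)
assert np.count_nonzero(b_hat) < p         # estimate is sparse
assert np.linalg.norm(y - X @ b_hat) < np.linalg.norm(y)
```

The soft-threshold prox is what produces exact zeros, i.e., the model selection behavior that distinguishes the LASSO from ridge regression.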
Diagonal Likelihood Ratio Test for Equality of Mean Vectors in High-Dimensional Data
Hu, Zongliang
2017-10-27
We propose a likelihood ratio test framework for testing normal mean vectors in high-dimensional data under two common scenarios: the one-sample test and the two-sample test with equal covariance matrices. We derive the test statistics under the assumption that the covariance matrices follow a diagonal matrix structure. In comparison with the diagonal Hotelling's tests, our proposed test statistics display some interesting characteristics. In particular, they are a summation of the log-transformed squared t-statistics rather than a direct summation of those components. More importantly, to derive the asymptotic normality of our test statistics under the null and local alternative hypotheses, we do not require the assumption that the covariance matrix follows a diagonal matrix structure. As a consequence, our proposed test methods are very flexible and can be widely applied in practice. Finally, simulation studies and a real data analysis are also conducted to demonstrate the advantages of our likelihood ratio test method.
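The "summation of log-transformed squared t-statistics" idea can be sketched as follows for the one-sample case; the exact normalization used in the paper may differ, so treat this as an illustrative assumption rather than the paper's statistic:

```python
import numpy as np

def diag_lrt_one_sample(X):
    """Sum over coordinates of log(1 + t_j^2/(n-1)), where t_j is the ordinary
    one-sample t-statistic for coordinate j (diagonal-covariance sketch;
    the paper's normalization may differ)."""
    n, p = X.shape
    t = np.sqrt(n) * X.mean(axis=0) / X.std(axis=0, ddof=1)
    return np.sum(np.log1p(t ** 2 / (n - 1)))

rng = np.random.default_rng(2)
null_stat = diag_lrt_one_sample(rng.standard_normal((30, 100)))      # mean 0
alt_stat = diag_lrt_one_sample(rng.standard_normal((30, 100)) + 1.0) # shifted
assert alt_stat > null_stat   # the statistic grows under the alternative
```

The log transform damps the influence of any single extreme coordinate, which is one motivation for summing log(1 + t²/(n-1)) instead of the raw t² values.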
Directory of Open Access Journals (Sweden)
Ichitaro Yamazaki
2015-01-01
of their low-rank properties. To compute a low-rank approximation of a dense matrix, in this paper, we study the performance of QR factorization with column pivoting or with restricted pivoting on multicore CPUs with a GPU. We first propose several techniques to reduce the postprocessing time, which is required for restricted pivoting, on a modern CPU. We then examine the potential of using a GPU to accelerate the factorization process with both column and restricted pivoting. Our performance results on two eight-core Intel Sandy Bridge CPUs with one NVIDIA Kepler GPU demonstrate that using the GPU, the factorization time can be reduced by a factor of more than two. In addition, to study the performance of our implementations in practice, we integrate them into a recently developed software StruMF which algebraically exploits such low-rank structures for solving a general sparse linear system of equations. Our performance results for solving Poisson's equations demonstrate that the proposed techniques can significantly reduce the preconditioner construction time of StruMF on the CPUs, and the construction time can be further reduced by 10%–50% using the GPU.
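The described approach, QR factorization with column pivoting as a low-rank compression kernel, can be sketched with SciPy's pivoted QR (plain column pivoting on the CPU only; the restricted-pivoting and GPU aspects are beyond this sketch, and the matrix is synthetic):

```python
import numpy as np
from scipy.linalg import qr

rng = np.random.default_rng(3)
# Numerically rank-5 matrix: random rank-5 factors plus tiny noise
A = rng.standard_normal((120, 5)) @ rng.standard_normal((5, 80))
A += 1e-10 * rng.standard_normal((120, 80))

Q, R, piv = qr(A, pivoting=True, mode='economic')
# |R[j, j]| is non-increasing under column pivoting and drops sharply
# past the numerical rank; truncate there
k = int(np.sum(np.abs(np.diag(R)) > 1e-8 * np.abs(R[0, 0])))
A_k = Q[:, :k] @ R[:k]                     # rank-k approximation of A[:, piv]
err = np.linalg.norm(A[:, piv] - A_k) / np.linalg.norm(A)

assert k == 5
assert err < 1e-6
```

The same truncated Q and R factors are what a structured solver like StruMF would store for each compressible block when building its preconditioner.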
On greedy and submodular matrices
Faigle, U.; Kern, Walter; Peis, Britta; Marchetti-Spaccamela, Alberto; Segal, Michael
2011-01-01
We characterize non-negative greedy matrices, i.e., 0-1 matrices $A$ such that $\max\{c^Tx \mid Ax \le b,\, x \ge 0\}$ can be solved greedily. We identify submodular matrices as a special subclass of greedy matrices. Finally, we extend the notion of greediness to $\{-1,0,+1\}$-matrices. We present
Justino, Júlia
2017-06-01
Matrices with coefficients having uncertainties of type o(·) or O(·), called flexible matrices, are studied from the point of view of nonstandard analysis. The uncertainties of the aforementioned kind are given in the form of so-called neutrices, for instance the set of all infinitesimals. Since flexible matrices have uncertainties in their coefficients, it is not possible to define the identity matrix in a unique way, and so the notion of a spectral identity matrix arises. Not all nonsingular flexible matrices can be turned into a spectral identity matrix using the Gauss-Jordan elimination method, implying that not all nonsingular flexible matrices have an inverse matrix. Under certain conditions on the size of the uncertainties appearing in a nonsingular flexible matrix, a general theorem concerning the boundaries of its minors is presented, which guarantees the existence of the inverse of a nonsingular flexible matrix.
High-dimensional quantum channel estimation using classical light
CSIR Research Space (South Africa)
Mabena, Chemist M
2017-11-01
Full Text Available A method is proposed to characterize a high-dimensional quantum channel with the aid of classical light. It uses a single nonseparable input optical field that contains correlations between spatial modes and wavelength to determine the effect...
A hybridized K-means clustering approach for high dimensional ...
African Journals Online (AJOL)
Due to the incredible growth of high-dimensional datasets, conventional database querying methods are inadequate for extracting useful information, so researchers are now forced to develop new techniques to meet the raised requirements. Such large expression data give rise to a number of new computational challenges ...
Inference in High-dimensional Dynamic Panel Data Models
DEFF Research Database (Denmark)
Kock, Anders Bredahl; Tang, Haihan
error variance may be non-constant over time and depend on the covariates. Furthermore, our procedure allows for inference on high-dimensional subsets of the parameter vector of an increasing cardinality. We show that the confidence bands resulting from our procedure are asymptotically honest...
High-Dimensional Statistical Learning: Roots, Justifications, and Potential Machineries.
Zollanvari, Amin
2015-01-01
High-dimensional data generally refer to data in which the number of variables is larger than the sample size. Analyzing such datasets poses great challenges for classical statistical learning because the finite-sample performance of methods developed within classical statistical learning does not live up to classical asymptotic premises in which the sample size unboundedly grows for a fixed dimensionality of observations. Much work has been done in developing mathematical-statistical techniques for analyzing high-dimensional data. Despite remarkable progress in this field, many practitioners still utilize classical methods for analyzing such datasets. This state of affairs can be attributed, in part, to a lack of knowledge and, in part, to the ready-to-use computational and statistical software packages that are well developed for classical techniques. Moreover, many scientists working in a specific field of high-dimensional statistical learning are either not aware of other existing machineries in the field or are not willing to try them out. The primary goal in this work is to bring together various machineries of high-dimensional analysis, give an overview of the important results, and present the operating conditions upon which they are grounded. When appropriate, readers are referred to relevant review articles for more information on a specific subject.
The additive hazards model with high-dimensional regressors
DEFF Research Database (Denmark)
Martinussen, Torben
2009-01-01
This paper considers estimation and prediction in the Aalen additive hazards model in the case where the covariate vector is high-dimensional such as gene expression measurements. Some form of dimension reduction of the covariate space is needed to obtain useful statistical analyses. We study...
Irregular grid methods for pricing high-dimensional American options
Berridge, S.J.
2004-01-01
This thesis proposes and studies numerical methods for pricing high-dimensional American options; important examples being basket options, Bermudan swaptions and real options. Four new methods are presented and analysed, both in terms of their application to various test problems, and in terms of
High-dimensional multispectral image fusion: classification by neural network
He, Mingyi; Xia, Jiantao
2003-06-01
Advances in sensor technology for Earth observation make it possible to collect multispectral data of much higher dimensionality. Such high-dimensional data will make it possible to classify more classes. However, they will also have several impacts on processing technology. First, because of the huge data volume, more processing power will be needed to process such high-dimensional data. Second, because of the high dimensionality and the limited training samples, it is very difficult for the Bayes method to estimate the parameters accurately, so the classification accuracy cannot be high enough. A neural network is an intelligent signal-processing method. An MLFNN (Multi-Layer Feedforward Neural Network) learns directly from training samples and the probability model need not be estimated, so classification may be conducted through neural-network fusion of multispectral images. The latent information about different classes can be extracted from training samples by an MLFNN. However, because of the huge data volume and high dimensionality, an MLFNN faces some serious difficulties: (1) there are many local minima in the error surface of an MLFNN; (2) over-fitting phenomena. These two difficulties depress the classification accuracy and generalization performance of an MLFNN. In order to overcome these difficulties, the authors propose DPFNN (Double Parallel Feedforward Neural Networks) to classify the high-dimensional multispectral images. The model and learning algorithm of DPFNN with strong generalization performance are proposed, with emphasis on the regularization of output weights and improvement of the generalization performance of DPFNN. As DPFNN is composed of an MLFNN and an SLFNN (Single-Layer Feedforward Neural Network), it has the advantages of both: (1) good nonlinear mapping capability; (2) high learning speed for linear-like problems. Experimental results with generated data, 64-band practical multispectral images and 220-band multispectral images show that the new
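Assuming the architecture described above (an MLFNN and an SLFNN in parallel, sharing input and output), a forward pass of such a double-parallel network might look as follows; the weight names and the tanh activation are illustrative assumptions, not details from the paper:

```python
import numpy as np

def dpfnn_forward(x, W_hidden, W_out, W_direct):
    """Double Parallel FNN forward pass: a multi-layer path through a hidden
    layer plus a direct single-layer linear path, summed at the output."""
    h = np.tanh(W_hidden @ x)        # MLFNN branch (one hidden layer here)
    return W_out @ h + W_direct @ x  # add the SLFNN (direct) branch
```

The direct branch handles linear-like structure quickly, while the hidden-layer branch supplies the nonlinear mapping capability.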
Fallat, Shaun M
2011-01-01
Totally nonnegative matrices arise in a remarkable variety of mathematical applications. This book is a comprehensive and self-contained study of the essential theory of totally nonnegative matrices, defined by the nonnegativity of all subdeterminants. It explores methodological background, historical highlights of key ideas, and specialized topics. The book uses classical and ad hoc tools, but a unifying theme is the elementary bidiagonal factorization, which has emerged as the single most important tool for this particular class of matrices. Recent work has shown that bidiagonal factorization
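The defining property, nonnegativity of all subdeterminants, can be checked by brute force for small matrices; a sketch (exponential in the matrix size, so for illustration only):

```python
import numpy as np
from itertools import combinations

def is_totally_nonnegative(A, tol=1e-12):
    """Check that every minor (determinant of a square submatrix) is >= 0."""
    m, n = A.shape
    for k in range(1, min(m, n) + 1):
        for rows in combinations(range(m), k):
            for cols in combinations(range(n), k):
                if np.linalg.det(A[np.ix_(rows, cols)]) < -tol:
                    return False
    return True
```

Echoing the bidiagonal-factorization theme, one easy way to generate examples is to multiply entrywise-nonnegative bidiagonal factors: such products are totally nonnegative.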
DEFF Research Database (Denmark)
Britz, Thomas
Bipartite graphs and digraphs are used to describe algebraic operations on a free matrix, including Moore-Penrose inversion, finding Schur complements, and normalized LU factorization. A description of the structural properties of a free matrix and its Moore-Penrose inverse is proved, and necessa...... and sufficient conditions are given for the Moore-Penrose inverse of a free matrix to be free. Several of these results are generalized with respect to a family of matrices that contains both the free matrices and the nearly reducible matrices....
Bolton, W
1995-01-01
This book is concerned with linear equations and matrices, with emphasis on the solution of simultaneous linear equations. The solution of simultaneous linear equations is applied to electric circuit analysis and structural analysis.
Barcucci, E.; Bernini, A.; Bilotta, S.; Pinzani, R.
2016-01-01
Two matrices are said to be non-overlapping if one of them cannot be placed on the other in such a way that the corresponding entries coincide. We provide a set of non-overlapping binary matrices and a formula, involving the k-generalized Fibonacci numbers, to enumerate it. Moreover, the generating function of the enumerating sequence is easily seen to be rational.
Structural analysis of high-dimensional basins of attraction
Martiniani, Stefano; Schrenk, K. Julian; Stevenson, Jacob D.; Wales, David J.; Frenkel, Daan
2016-09-01
We propose an efficient Monte Carlo method for the computation of the volumes of high-dimensional bodies with arbitrary shape. We start with a region of known volume within the interior of the manifold and then use the multistate Bennett acceptance-ratio method to compute the dimensionless free-energy difference between a series of equilibrium simulations performed within this object. The method produces results that are in excellent agreement with thermodynamic integration, as well as a direct estimate of the associated statistical uncertainties. The histogram method also allows us to directly obtain an estimate of the interior radial probability density profile, thus yielding useful insight into the structural properties of such a high-dimensional body. We illustrate the method by analyzing the effect of structural disorder on the basins of attraction of mechanically stable packings of soft repulsive spheres.
High dimensional multiclass classification with applications to cancer diagnosis
DEFF Research Database (Denmark)
Vincent, Martin
Probabilistic classifiers are introduced and it is shown that the only regular linear probabilistic classifier with convex risk is multinomial regression. Penalized empirical risk minimization is introduced and used to construct supervised learning methods for probabilistic classifiers. A sparse...... group lasso penalized approach to high dimensional multinomial classification is presented. On different real data examples it is found that this approach clearly outperforms multinomial lasso in terms of error rate and features included in the model. An efficient coordinate descent algorithm...... is developed and the convergence is established. This algorithm is implemented in the msgl R package. Examples of high dimensional multiclass problems are studied, in particular examples of multiclass classification based on gene expression measurements. One such example is the clinically important - problem...
Dimensionality reduction for registration of high-dimensional data sets.
Xu, Min; Chen, Hao; Varshney, Pramod K
2013-08-01
Registration of two high-dimensional data sets often involves dimensionality reduction to yield a single-band image from each data set, followed by pairwise image registration. We develop a new application-specific algorithm for dimensionality reduction of high-dimensional data sets such that the weighted harmonic mean of the Cramér-Rao lower bounds for the estimation of the transformation parameters for registration is minimized. The performance of the proposed dimensionality reduction algorithm is evaluated using three remote sensing data sets. The experimental results using a mutual information-based pairwise registration technique demonstrate that our proposed dimensionality reduction algorithm combines the original data sets to obtain the image pair with more texture, resulting in improved image registration.
HSM: Heterogeneous Subspace Mining in High Dimensional Data
DEFF Research Database (Denmark)
Müller, Emmanuel; Assent, Ira; Seidl, Thomas
2009-01-01
Heterogeneous data, i.e. data with both categorical and continuous values, is common in many databases. However, most data mining algorithms assume either continuous or categorical attributes, but not both. In high dimensional data, phenomena due to the "curse of dimensionality" pose additional...... challenges. Usually, due to locally varying relevance of attributes, patterns do not show across the full set of attributes. In this paper we propose HSM, which defines a new pattern model for heterogeneous high dimensional data. It allows data mining in arbitrary subsets of the attributes that are relevant...... for the respective patterns. Based on this model we propose an efficient algorithm, which is aware of the heterogeneity of the attributes. We extend an indexing structure for continuous attributes such that HSM indexing adapts to different attribute types. In our experiments we show that HSM efficiently mines...
Analysis of chaos in high-dimensional wind power system
Wang, Cong; Zhang, Hongli; Fan, Wenhui; Ma, Ping
2018-01-01
A comprehensive analysis of chaos in a high-dimensional wind power system is performed in this study. A high-dimensional wind power system is more complex than most power systems. An 11-dimensional wind power system proposed by Huang, which has not been analyzed in previous studies, is investigated. When the system is affected by external disturbances, including single-parameter and periodic disturbances, or when its parameters change, the chaotic dynamics of the wind power system are analyzed and the parameter ranges producing chaos are obtained. The existence of chaos is confirmed by calculating and analyzing the Lyapunov exponents of all state variables and the state-variable sequence diagrams. Theoretical analysis and numerical simulations show that chaos occurs in the wind power system when parameter variations or external disturbances reach a certain degree.
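The Lyapunov-exponent check mentioned above can be illustrated on a simple one-dimensional map (not the 11-dimensional wind model itself); this is the standard two-trajectory divergence estimate with renormalization, and all names are illustrative:

```python
import numpy as np

def largest_lyapunov(f, x0, n=10000, d0=1e-9):
    """Estimate the largest Lyapunov exponent of a map f by tracking the
    divergence of two nearby trajectories, renormalizing the separation
    back to d0 after every step."""
    x, y = x0, x0 + d0
    s = 0.0
    for _ in range(n):
        x, y = f(x), f(y)
        d = abs(y - x)
        s += np.log(d / d0)          # accumulate the local stretching rate
        y = x + d0 * (y - x) / d     # renormalize the separation
    return s / n
```

A positive estimate indicates chaos; for the logistic map at r = 4 the exact value is ln 2 ≈ 0.693.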
Machine-learned cluster identification in high-dimensional data.
Ultsch, Alfred; Lötsch, Jörn
2017-02-01
High-dimensional biomedical data are frequently clustered to identify subgroup structures pointing at distinct disease subtypes. It is crucial that the cluster algorithm used works correctly. However, by imposing a predefined shape on the clusters, classical algorithms occasionally suggest a cluster structure in homogeneously distributed data or assign data points to incorrect clusters. We analyzed whether this can be avoided by using emergent self-organizing feature maps (ESOM). Data sets with different degrees of complexity were submitted to ESOM analysis with large numbers of neurons, using an interactive R-based bioinformatics tool. On top of the trained ESOM, the distance structure in the high-dimensional feature space was visualized in the form of a so-called U-matrix. Clustering results were compared with those provided by classical common cluster algorithms including single linkage, Ward and k-means. Ward clustering imposed cluster structures on cluster-less "golf ball", "cuboid" and "S-shaped" data sets that contained no structure at all (random data). Ward clustering also imposed structures on permuted real-world data sets. By contrast, the ESOM/U-matrix approach correctly found that these data contain no cluster structure. However, ESOM/U-matrix was correct in identifying clusters in biomedical data truly containing subgroups. It was always correct in cluster structure identification in further canonical artificial data. Using intentionally simple data sets, it is shown that popular clustering algorithms typically used for biomedical data sets may fail to cluster data correctly, suggesting that they are also likely to perform erroneously on high-dimensional biomedical data. The present analyses emphasized that generally established classical hierarchical clustering algorithms carry a considerable tendency to produce erroneous results. By contrast, unsupervised machine-learned analysis of cluster structures, applied using the ESOM/U-matrix method, is a
Parsimonious description for predicting high-dimensional dynamics
Yoshito Hirata; Tomoya Takeuchi; Shunsuke Horai; Hideyuki Suzuki; Kazuyuki Aihara
2015-01-01
When we observe a system, we often cannot observe all of its variables and may have only limited measurements of it. Under such circumstances, delay coordinates, vectors made of successive measurements, are useful for reconstructing the states of the whole system. Although the method of delay coordinates is theoretically supported for high-dimensional dynamical systems, in practice there is a limitation because the calculation for higher-dimensional delay coordinates becomes more expensive. Here, ...
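The delay-coordinate construction, stacking successive measurements of a scalar time series into vectors, is a few lines of NumPy (the function name and default lag are illustrative):

```python
import numpy as np

def delay_embed(x, dim, tau=1):
    """Build delay-coordinate vectors [x(t), x(t+tau), ..., x(t+(dim-1)tau)]
    from a scalar time series x; returns an (n_vectors, dim) array."""
    n = len(x) - (dim - 1) * tau
    return np.column_stack([x[i * tau : i * tau + n] for i in range(dim)])
```

Each row is one reconstructed state; by Takens-type embedding results, for suitable dim and tau these vectors recover the geometry of the underlying attractor.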
Bayesian Visual Analytics: Interactive Visualization for High Dimensional Data
Han, Chao
2012-01-01
In light of advancements made in data collection techniques over the past two decades, data mining has become common practice to summarize large, high dimensional datasets, in hopes of discovering noteworthy data structures. However, one concern is that most data mining approaches rely upon strict criteria that may mask information in data that analysts may find useful. We propose a new approach called Bayesian Visual Analytics (BaVA) which merges Bayesian Statistics with Visual Analytics to ...
RRT+ : Fast Planning for High-Dimensional Configuration Spaces
Xanthidis, Marios; Rekleitis, Ioannis; O'Kane, Jason M.
2016-01-01
In this paper we propose a new family of RRT based algorithms, named RRT+ , that are able to find faster solutions in high-dimensional configuration spaces compared to other existing RRT variants by finding paths in lower dimensional subspaces of the configuration space. The method can be easily applied to complex hyper-redundant systems and can be adapted by other RRT based planners. We introduce RRT+ and develop some variants, called PrioritizedRRT+ , PrioritizedRRT+-Connect, and Prioritize...
Evaluating Clustering in Subspace Projections of High Dimensional Data
DEFF Research Database (Denmark)
Müller, Emmanuel; Günnemann, Stephan; Assent, Ira
2009-01-01
Clustering high dimensional data is an emerging research field. Subspace clustering or projected clustering group similar objects in subspaces, i.e. projections, of the full space. In the past decade, several clustering paradigms have been developed in parallel, without thorough evaluation and co...... and create a common baseline for future developments and comparable evaluations in the field. For repeatability, all implementations, data sets and evaluation measures are available on our website....
Oracle inequalities for high-dimensional panel data models
DEFF Research Database (Denmark)
Kock, Anders Bredahl
This paper is concerned with high-dimensional panel data models where the number of regressors can be much larger than the sample size. Under the assumption that the true parameter vector is sparse we establish finite sample upper bounds on the estimation error of the Lasso under two different se...... results by simulations and apply the methods to search for covariates explaining growth in the G8 countries....
Spectral Regularization Algorithms for Learning Large Incomplete Matrices.
Mazumder, Rahul; Hastie, Trevor; Tibshirani, Robert
2010-03-01
We use convex relaxation techniques to provide a sequence of regularized low-rank solutions for large-scale matrix completion problems. Using the nuclear norm as a regularizer, we provide a simple and very efficient convex algorithm for minimizing the reconstruction error subject to a bound on the nuclear norm. Our algorithm Soft-Impute iteratively replaces the missing elements with those obtained from a soft-thresholded SVD. With warm starts this allows us to efficiently compute an entire regularization path of solutions on a grid of values of the regularization parameter. The computationally intensive part of our algorithm is in computing a low-rank SVD of a dense matrix. Exploiting the problem structure, we show that the task can be performed with a complexity linear in the matrix dimensions. Our semidefinite-programming algorithm is readily scalable to large matrices: for example it can obtain a rank-80 approximation of a 10^6 × 10^6 incomplete matrix with 10^5 observed entries in 2.5 hours, and can fit a rank-40 approximation to the full Netflix training set in 6.6 hours. Our methods show very good performance in both training and test error when compared to other competitive state-of-the-art techniques.
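The core Soft-Impute iteration, fill the missing entries with the current estimate, then soft-threshold the singular values, can be sketched with a dense SVD; the paper's scalability comes from exploiting structure to compute a low-rank SVD cheaply, which this toy version omits:

```python
import numpy as np

def soft_impute(X, mask, lam, n_iter=200):
    """Soft-Impute: repeatedly complete the matrix with the current estimate Z
    on the missing entries, then shrink the singular values of the completed
    matrix by lam (nuclear-norm regularization)."""
    Z = np.zeros_like(X)
    for _ in range(n_iter):
        filled = np.where(mask, X, Z)      # observed entries from X, rest from Z
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        s = np.maximum(s - lam, 0.0)       # soft-threshold the singular values
        Z = (U * s) @ Vt
    return Z
```

Warm-starting Z across a decreasing grid of lam values yields the regularization path described in the abstract.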
Energy Technology Data Exchange (ETDEWEB)
Hernandez, G.A.; Bello, R.O.; McVay, D.A.; Ayers, W.B.; Ramazanova, R.I. [Society of Petroleum Engineers, Richardson, TX (United States)]|[Texas A and M Univ., Austin, TX (United States); Rushing, J.A. [Society of Petroleum Engineers, Richardson, TX (United States)]|[Anadarko Petroleum Corp., Spring, TX (United States); Ruhl, S.K.; Hoffmann, M.F. [Anadarko Petroleum Corp., Spring, TX (United States)
2006-07-01
Texas emits about 10 per cent of the total carbon dioxide (CO₂) emitted in the United States. Any method that reduces net CO₂ emissions would help mitigate the global greenhouse effect. The sequestration of carbon dioxide in coals is one method that could help achieve this goal. Carbon dioxide injection in coal beds also has the added benefit of enhanced coalbed methane (ECBM) recovery. It can also help maintain reservoir pressure, thereby lowering operational costs. Low-rank coals in the Texas Gulf Coast area could be potential targets for CO₂ sequestration and ECBM recovery. The area is well suited for testing the viability of CO₂ sequestration in low-rank coals because of the proximity of Texas power plants to abundant, well-characterized coal deposits. As such, the area is well suited to test whether the technology can be transferred to other low-rank coals around the world. This study focused on the CO₂ sequestration potential of low-rank coals of the Wilcox Group in east-central Texas. The study involved an extensive coal characterization program, deterministic and probabilistic simulation studies, and economic evaluations. Both CO₂ and flue gas injection scenarios were evaluated. It was concluded that the methane resources and CO₂ sequestration potential of the Wilcox coals in east-central Texas are significant. Based on the results of this field study, average volumes of CO₂ sequestered range from 1.55 to 1.75 Bcf and average volumes of methane produced range between 0.54 and 0.67 Bcf. Sequestration projects will be most viable when gas prices and carbon market prices are at the higher ends of the ranges investigated. With increasing nitrogen content in the injected gas, CO₂ sequestration volumes decrease and ECBM production increases. The total volumes of CO₂ sequestered and methane produced on a unit-area basis do not change much with spacings up to 240 acres per well. The economic viability of a
High-dimensional quantum cloning and applications to quantum hacking.
Bouchard, Frédéric; Fickler, Robert; Boyd, Robert W; Karimi, Ebrahim
2017-02-01
Attempts at cloning a quantum system result in the introduction of imperfections in the state of the copies. This is a consequence of the no-cloning theorem, which is a fundamental law of quantum physics and the backbone of security for quantum communications. Although perfect copies are prohibited, a quantum state may be copied with maximal accuracy via various optimal cloning schemes. Optimal quantum cloning, which lies at the border of the physical limit imposed by the no-signaling theorem and the Heisenberg uncertainty principle, has been experimentally realized for low-dimensional photonic states. However, an increase in the dimensionality of quantum systems is greatly beneficial to quantum computation and communication protocols. Nonetheless, no experimental demonstration of optimal cloning machines has hitherto been shown for high-dimensional quantum systems. We perform optimal cloning of high-dimensional photonic states by means of the symmetrization method. We show the universality of our technique by conducting cloning of numerous arbitrary input states and fully characterize our cloning machine by performing quantum state tomography on cloned photons. In addition, a cloning attack on a Bennett and Brassard (BB84) quantum key distribution protocol is experimentally demonstrated to reveal the robustness of high-dimensional states in quantum cryptography.
Optimal Feature Selection in High-Dimensional Discriminant Analysis.
Kolar, Mladen; Liu, Han
2015-02-01
We consider the high-dimensional discriminant analysis problem. For this problem, different methods have been proposed and justified by establishing exact convergence rates for the classification risk, as well as ℓ2 convergence results for the discriminative rule. However, a sharp theoretical analysis of the variable selection performance of these procedures has not been established, even though model interpretation is of fundamental importance in scientific data analysis. This paper bridges the gap by providing sharp sufficient conditions for consistent variable selection using the sparse discriminant analysis (Mai et al., 2012). Through careful analysis, we establish rates of convergence that are significantly faster than the best known results and admit an optimal scaling of the sample size n, dimensionality p, and sparsity level s in the high-dimensional setting. The sufficient conditions are complemented by necessary information-theoretic limits on the variable selection problem in the context of high-dimensional discriminant analysis. Exploiting a numerical equivalence result, our analysis also establishes the optimal results for the ROAD estimator (Fan et al., 2012) and the sparse optimal scaling estimator (Clemmensen et al., 2011). Furthermore, we analyze an exhaustive search procedure, whose performance serves as a benchmark, and show that it is variable selection consistent under weaker conditions. Extensive simulations demonstrating the sharpness of the bounds are also provided.
Energy Technology Data Exchange (ETDEWEB)
Oki, A.; Xie, X.; Nakajima, T.; Maeda, S. [Kagoshima University, Kagoshima (Japan). Faculty of Engineering
1996-10-28
With the objective of learning the mechanisms of low-rank coal reformation processes, changes in coal surface properties were discussed. The difficulty in handling low-rank coal is attributed to its large intrinsic water content. Since it contains highly volatile components, it carries a danger of spontaneous ignition. The hot water drying (HWD) method was used for reformation. Coal that had been dry-pulverized to a grain size of 1 mm or smaller was mixed with water to make a slurry, heated in an autoclave, cooled, filtered, and dried in vacuum. The HWD applied to Loy Yang and Yallourn coals resulted in a rapid rise in pressure starting from about 250°C. The water content (ANA value) absorbed into the coal decreased greatly, with the surface made effectively hydrophobic due to the high temperature and pressure. Hydroxyl group and carbonyl group contents in the coal decreased greatly with rising reformation treatment temperature (according to FT-IR measurement). The specific surface area of the original Loy Yang coal was 138 m²/g, while it decreased greatly to 73 m²/g when the reformation temperature was raised to 350°C. This is because volatile components dissolve from the coal as tar and block the surface pores. 2 refs., 4 figs.
Wu, Zhiqiang; Yang, Wangcai; Yang, Bolun
2018-02-01
In this work, the influence of Nannochloropsis and Chlorella on the thermal behavior and surface morphology of char during the co-pyrolysis process was explored. Thermogravimetric and iso-conversional methods were applied to analyze the pyrolytic and kinetic characteristics for different mass ratios of microalgae and low-rank coal (0, 3:1, 1:1, 1:3 and 1). Fractal theory was used to quantitatively determine the effect of microalgae on the morphological texture of the co-pyrolysis char. The results indicated that both Nannochloropsis and Chlorella promoted the release of volatiles from the low-rank coal. Different synergistic effects on the thermal parameters and volatile yield were observed, which could be attributed to the different compositions of Nannochloropsis and Chlorella and the operating conditions. The distribution of activation energies shows nonadditive characteristics. Fractal dimensions of the co-pyrolysis char were higher than those of the individual chars, indicating an increased degree of disorder due to the addition of microalgae. Copyright © 2017 Elsevier Ltd. All rights reserved.
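A fractal dimension of a binary char image can be estimated generically by box counting, the slope of log N(s) against log(1/s), where N(s) counts occupied s-by-s boxes. This sketch is a standard illustration, not the specific procedure used in the paper:

```python
import numpy as np

def box_counting_dimension(img, sizes=(1, 2, 4, 8, 16)):
    """Box-counting (fractal) dimension of a square binary image: fit the
    slope of log N(s) versus log(1/s) over a range of box sizes s."""
    counts = []
    for s in sizes:
        m = (img.shape[0] // s) * s                      # trim so s divides the side
        blocks = img[:m, :m].reshape(m // s, s, m // s, s)
        counts.append(blocks.any(axis=(1, 3)).sum())     # occupied s-by-s boxes
    slope, _ = np.polyfit(np.log(1.0 / np.asarray(sizes)), np.log(counts), 1)
    return slope
```

A fully filled image gives a dimension near 2, while rougher, more disordered textures give intermediate values.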
Directory of Open Access Journals (Sweden)
Kunlun Qi
2016-12-01
Full Text Available Scene classification plays an important role in the intelligent processing of High-Resolution Satellite (HRS) remotely sensed images. In HRS image classification, multiple features, e.g., shape, color, and texture features, are employed to represent scenes from different perspectives. Accordingly, effective integration of multiple features always results in better performance compared to methods based on a single feature in the interpretation of HRS images. In this paper, we introduce a multi-task joint sparse and low-rank representation model to combine the strength of multiple features for HRS image interpretation. Specifically, a multi-task learning formulation is applied to simultaneously consider sparse and low-rank structures across multiple tasks. The proposed model is optimized as a non-smooth convex optimization problem using an accelerated proximal gradient method. Experiments on two public scene classification datasets demonstrate that the proposed method achieves remarkable performance and improves upon the state-of-the-art methods in the respective applications.
Li, Hailong; Wu, Chang-Yu; Li, Ying; Zhang, Junying
2011-09-01
CeO₂-TiO₂ (CeTi) catalysts synthesized by an ultrasound-assisted impregnation method were employed to oxidize elemental mercury (Hg(0)) in simulated low-rank (sub-bituminous and lignite) coal combustion flue gas. The CeTi catalysts with a CeO₂/TiO₂ weight ratio of 1-2 exhibited high Hg(0) oxidation activity from 150 to 250 °C. The high concentrations of surface cerium and oxygen were responsible for their superior performance. Hg(0) oxidation over CeTi catalysts was proposed to follow the Langmuir-Hinshelwood mechanism whereby reactive species from adsorbed flue gas components react with adjacently adsorbed Hg(0). In the presence of O₂, a promotional effect of HCl, NO, and SO₂ on Hg(0) oxidation was observed. Without O₂, HCl and NO still promoted Hg(0) oxidation due to the surface oxygen, while SO₂ inhibited Hg(0) adsorption and subsequent oxidation. Water vapor also inhibited Hg(0) oxidation. HCl was the most effective flue gas component responsible for Hg(0) oxidation. However, the combination of SO₂ and NO without HCl also resulted in high Hg(0) oxidation efficiency. This superior oxidation capability is advantageous to Hg(0) oxidation in low-rank coal combustion flue gas with low HCl concentration.
Variable kernel density estimation in high-dimensional feature spaces
CSIR Research Space (South Africa)
Van der Walt, Christiaan M
2017-02-01
Full Text Available with the KDE is non-parametric, since no parametric distribution is imposed on the estimate; instead the estimated distribution is defined by the sum of the kernel functions centred on the data points. KDEs thus require the selection of two design parameters... has become feasible – understanding and modelling high-dimensional data has thus become a crucial activity, especially in the field of machine learning. Since non-parametric density estimators are data-driven and do not require or impose a pre...
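The "sum of kernel functions centred on the data points" definition can be written out directly for the one-dimensional Gaussian case; the function name and fixed bandwidth are illustrative (the abstract concerns *variable* bandwidths, which this minimal sketch does not implement):

```python
import numpy as np

def gaussian_kde_1d(data, x, bandwidth):
    """Kernel density estimate at points x: one Gaussian kernel centred on
    each data point, averaged and scaled by the bandwidth."""
    u = (x[:, None] - data[None, :]) / bandwidth
    k = np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)   # standard normal kernel
    return k.sum(axis=1) / (len(data) * bandwidth)
```

The two design parameters mentioned in the abstract are the kernel shape and the bandwidth; a variable-bandwidth KDE would let the bandwidth depend on the local data density.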
The additive hazards model with high-dimensional regressors
DEFF Research Database (Denmark)
Martinussen, Torben; Scheike, Thomas
2009-01-01
This paper considers estimation and prediction in the Aalen additive hazards model in the case where the covariate vector is high-dimensional such as gene expression measurements. Some form of dimension reduction of the covariate space is needed to obtain useful statistical analyses. We study...... the partial least squares regression method. It turns out that it is naturally adapted to this setting via the so-called Krylov sequence. The resulting PLS estimator is shown to be consistent provided that the number of terms included is taken to be equal to the number of relevant components in the regression...
Matrices in Engineering Problems
Tobias, Marvin
2011-01-01
This book is intended as an undergraduate text introducing matrix methods as they relate to engineering problems. It begins with the fundamentals of mathematics of matrices and determinants. Matrix inversion is discussed, with an introduction of the well known reduction methods. Equation sets are viewed as vector transformations, and the conditions of their solvability are explored. Orthogonal matrices are introduced with examples showing application to many problems requiring three dimensional thinking. The angular velocity matrix is shown to emerge from the differentiation of the 3-D orthogo
Simulation-based hypothesis testing of high dimensional means under covariance heterogeneity.
Chang, Jinyuan; Zheng, Chao; Zhou, Wen-Xin; Zhou, Wen
2017-12-01
In this article, we study the problem of testing the mean vectors of high dimensional data in both one-sample and two-sample cases. The proposed testing procedures employ maximum-type statistics and parametric bootstrap techniques to compute the critical values. Different from the existing tests that heavily rely on structural conditions on the unknown covariance matrices, the proposed tests allow general covariance structures of the data and therefore enjoy a wide scope of applicability in practice. To enhance powers of the tests against sparse alternatives, we further propose two-step procedures with a preliminary feature screening step. Theoretical properties of the proposed tests are investigated. Through extensive numerical experiments on synthetic data sets and a human acute lymphoblastic leukemia gene expression data set, we illustrate the performance of the new tests and how they may provide assistance in detecting disease-associated gene-sets. The proposed methods have been implemented in an R-package HDtest and are available on CRAN. © 2017, The International Biometric Society.
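The one-sample max-type test can be sketched as follows. The Gaussian-multiplier bootstrap used here is a stand-in for the paper's parametric bootstrap (and, like it, imposes no structural conditions on the covariance); the sample sizes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 60, 500                        # far more variables than samples
X = rng.normal(size=(n, p))           # one-sample data; H0: mean = 0 holds

# Max-type statistic: largest standardized coordinate of the sample mean.
s = X.std(axis=0, ddof=1)
T = np.max(np.abs(np.sqrt(n) * X.mean(axis=0) / s))

# Critical value via a Gaussian-multiplier bootstrap on the centred data.
Xc = X - X.mean(axis=0)
B = 500
boot = np.empty(B)
for b in range(B):
    g = rng.normal(size=n)            # multiplier weights
    boot[b] = np.max(np.abs(g @ Xc) / (np.sqrt(n) * s))
crit = float(np.quantile(boot, 0.95))
reject = bool(T > crit)               # level-0.05 test decision
```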
Sample size requirements for training high-dimensional risk predictors.
Dobbin, Kevin K; Song, Xiao
2013-09-01
A common objective of biomarker studies is to develop a predictor of patient survival outcome. Determining the number of samples required to train a predictor from survival data is important for designing such studies. Existing sample size methods for training studies use parametric models for the high-dimensional data and cannot handle a right-censored dependent variable. We present a new training sample size method that is non-parametric with respect to the high-dimensional vectors, and is developed for a right-censored response. The method can be applied to any prediction algorithm that satisfies a set of conditions. The sample size is chosen so that the expected performance of the predictor is within a user-defined tolerance of optimal. The central method is based on a pilot dataset. To quantify uncertainty, a method to construct a confidence interval for the tolerance is developed. Adequacy of the size of the pilot dataset is discussed. An alternative model-based version of our method for estimating the tolerance when no adequate pilot dataset is available is presented. The model-based method requires a covariance matrix be specified, but we show that the identity covariance matrix provides adequate sample size when the user specifies three key quantities. Application of the sample size method to two microarray datasets is discussed.
Mapping morphological shape as a high-dimensional functional curve.
Fu, Guifang; Huang, Mian; Bo, Wenhao; Hao, Han; Wu, Rongling
2017-01-06
Detecting how genes regulate biological shape has become a multidisciplinary research interest because of its wide application in many disciplines. Despite its fundamental importance, the challenges of accurately extracting information from an image, statistically modeling the high-dimensional shape and meticulously locating shape quantitative trait loci (QTL) affect the progress of this research. In this article, we propose a novel integrated framework that incorporates shape analysis, statistical curve modeling and genetic mapping to detect significant QTLs regulating variation of biological shape traits. After quantifying morphological shape via a radius centroid contour approach, each shape, as a phenotype, was characterized as a high-dimensional curve, varying as angle θ runs clockwise with the first point starting from angle zero. We then modeled the dynamic trajectories of three mean curves and variation patterns as functions of θ. Our framework led to the detection of a few significant QTLs regulating the variation of leaf shape collected from a natural population of poplar, Populus szechuanica var. tibetica. This population, distributed at altitudes 2000-4500 m above sea level, is an evolutionarily important plant species. This is the first work in the quantitative genetic shape mapping area that emphasizes a sense of 'function' instead of decomposing the shape into a few discrete principal components, as the majority of shape studies do. © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Scalable Nearest Neighbor Algorithms for High Dimensional Data.
Muja, Marius; Lowe, David G
2014-11-01
For many computer vision and machine learning problems, large training sets are key for good performance. However, the most computationally expensive part of many computer vision and machine learning algorithms consists of finding nearest neighbor matches to high dimensional vectors that represent the training data. We propose new algorithms for approximate nearest neighbor matching and evaluate and compare them with previous algorithms. For matching high dimensional features, we find two algorithms to be the most efficient: the randomized k-d forest and a new algorithm proposed in this paper, the priority search k-means tree. We also propose a new algorithm for matching binary features by searching multiple hierarchical clustering trees and show it outperforms methods typically used in the literature. We show that the optimal nearest neighbor algorithm and its parameters depend on the data set characteristics and describe an automated configuration procedure for finding the best algorithm to search a particular data set. In order to scale to very large data sets that would otherwise not fit in the memory of a single machine, we propose a distributed nearest neighbor matching framework that can be used with any of the algorithms described in the paper. All this research has been released as an open source library called fast library for approximate nearest neighbors (FLANN), which has been incorporated into OpenCV and is now one of the most popular libraries for nearest neighbor matching.
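The exact-versus-approximate trade-off this abstract discusses can be illustrated in a few lines. The candidate filter below is a sign-random-projection (LSH-flavoured) toy, not the paper's randomized k-d forest or priority search k-means tree; it only shows the accuracy/speed trade-off that FLANN's automated configuration navigates.

```python
import numpy as np

rng = np.random.default_rng(0)
train = rng.normal(size=(2000, 64))      # high-dimensional training vectors
query = rng.normal(size=(64,))

# Exact (brute-force) nearest neighbour: O(n*d) work per query.
d2 = np.sum((train - query) ** 2, axis=1)
exact = int(np.argmin(d2))

# Toy approximate search: keep only points agreeing with the query on
# most of a few random sign projections, then rank those exactly.
proj = rng.normal(size=(64, 10))
codes = train @ proj > 0.0
qcode = query @ proj > 0.0
cand = np.where((codes == qcode).sum(axis=1) >= 8)[0]
if cand.size == 0:                       # fall back to full search
    cand = np.arange(len(train))
approx = int(cand[np.argmin(d2[cand])])
```

The approximate answer is never better than the exact one (its distance is at least the minimum), but it inspects far fewer candidates.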
High-dimensional camera shake removal with given depth map.
Yue, Tao; Suo, Jinli; Dai, Qionghai
2014-06-01
Camera motion blur is drastically nonuniform for large depth-range scenes; the nonuniformity caused by camera translation is depth dependent, whereas that caused by camera rotation is not. To restore the blurry images of large-depth-range scenes deteriorated by arbitrary camera motion, we build an image blur model considering 6 degrees of freedom (DoF) of camera motion with a given scene depth map. To make this 6D depth-aware model tractable, we propose a novel parametrization strategy to reduce the number of variables and an effective method to estimate high-dimensional camera motion as well. The number of variables is reduced by a temporal sampling motion function, which describes the 6-DoF camera motion by sampling the camera trajectory uniformly in the time domain. To effectively estimate the high-dimensional camera motion parameters, we construct the probabilistic motion density function (PMDF) to describe the probability distribution of camera poses during exposure, and apply it as a unified constraint to guide the convergence of the iterative deblurring algorithm. Specifically, PMDF is computed through a back projection from 2D local blur kernels to 6D camera motion parameter space and robust voting. We conduct a series of experiments on both synthetic and real captured data, and validate that our method achieves better performance than existing uniform methods and nonuniform methods on large-depth-range scenes.
Elucidating high-dimensional cancer hallmark annotation via enriched ontology.
Yan, Shankai; Wong, Ka-Chun
2017-09-01
Cancer hallmark annotation is a promising technique that could discover novel knowledge about cancer from the biomedical literature. The automated annotation of cancer hallmarks could reveal relevant cancer transformation processes in the literature or extract the articles that correspond to the cancer hallmark of interest. It acts as a complementary approach that can retrieve knowledge from massive text information, advancing numerous focused studies in cancer research. Nonetheless, the high-dimensional nature of cancer hallmark annotation imposes a unique challenge. To address the curse of dimensionality, we compared multiple cancer hallmark annotation methods on 1580 PubMed abstracts. Based on the insights, a novel approach, UDT-RF, which makes use of ontological features is proposed. It expands the feature space via the Medical Subject Headings (MeSH) ontology graph and utilizes novel feature selections for elucidating the high-dimensional cancer hallmark annotation space. To demonstrate its effectiveness, state-of-the-art methods are compared and evaluated by a multitude of performance metrics, revealing the full performance spectrum on the full set of cancer hallmarks. Several case studies are conducted, demonstrating how the proposed approach could reveal novel insights into cancers. https://github.com/cskyan/chmannot. Copyright © 2017 Elsevier Inc. All rights reserved.
Infinite matrices and sequence spaces
Cooke, Richard G
2014-01-01
This clear and correct summation of basic results from a specialized field focuses on the behavior of infinite matrices in general, rather than on properties of special matrices. Three introductory chapters guide students to the manipulation of infinite matrices, covering definitions and preliminary ideas, reciprocals of infinite matrices, and linear equations involving infinite matrices.From the fourth chapter onward, the author treats the application of infinite matrices to the summability of divergent sequences and series from various points of view. Topics include consistency, mutual consi
Indian Academy of Sciences (India)
The way to think about matrices and understand matrix multiplication is geometrically. When viewed properly, the reason for the validity of the example of the previous paragraph is this: if TA denotes the operation of 'counterclockwise rotation of the plane by 90°', and if TB denotes 'projection onto the x-axis', then TA∘TB, ...
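The composition described in this snippet is easy to verify numerically: matrix multiplication realizes composition of the two geometric operations.

```python
import numpy as np

# T_A: counterclockwise rotation of the plane by 90 degrees.
TA = np.array([[0.0, -1.0],
               [1.0,  0.0]])
# T_B: orthogonal projection onto the x-axis.
TB = np.array([[1.0, 0.0],
               [0.0, 0.0]])

# (TA o TB)(v): first project v onto the x-axis, then rotate the result.
v = np.array([3.0, 4.0])
composed = TA @ (TB @ v)          # apply the operations one at a time
via_product = (TA @ TB) @ v       # apply the single product matrix
```

Projecting (3, 4) gives (3, 0); rotating that by 90° gives (0, 3), and the product matrix TA·TB produces the same result in one step.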
Indian Academy of Sciences (India)
Abstract. Assuming a relation between the quark mass matrices of the two sectors, a unique solution can be obtained for the CKM flavor mixing matrix. A numerical example is worked out which is in excellent agreement with experimental data.
Introduction to matrices and vectors
Schwartz, Jacob T
2001-01-01
In this concise undergraduate text, the first three chapters present the basics of matrices - in later chapters the author shows how to use vectors and matrices to solve systems of linear equations. 1961 edition.
Class prediction for high-dimensional class-imbalanced data
Directory of Open Access Journals (Sweden)
Lusa Lara
2010-10-01
Full Text Available Abstract Background The goal of class prediction studies is to develop rules to accurately predict the class membership of new samples. The rules are derived using the values of the variables available for each subject: the main characteristic of high-dimensional data is that the number of variables greatly exceeds the number of samples. Frequently the classifiers are developed using class-imbalanced data, i.e., data sets where the number of samples in each class is not equal. Standard classification methods used on class-imbalanced data often produce classifiers that do not accurately predict the minority class; the prediction is biased towards the majority class. In this paper we investigate if the high-dimensionality poses additional challenges when dealing with class-imbalanced prediction. We evaluate the performance of six types of classifiers on class-imbalanced data, using simulated data and a publicly available data set from a breast cancer gene-expression microarray study. We also investigate the effectiveness of some strategies that are available to overcome the effect of class imbalance. Results Our results show that the evaluated classifiers are highly sensitive to class imbalance and that variable selection introduces an additional bias towards classification into the majority class. Most new samples are assigned to the majority class from the training set, unless the difference between the classes is very large. As a consequence, the class-specific predictive accuracies differ considerably. When the class imbalance is not too severe, down-sizing and asymmetric bagging embedding variable selection work well, while over-sampling does not. Variable normalization can further worsen the performance of the classifiers. Conclusions Our results show that matching the prevalence of the classes in training and test set does not guarantee good performance of classifiers and that the problems related to classification with class
Applying recursive numerical integration techniques for solving high dimensional integrals
Energy Technology Data Exchange (ETDEWEB)
Ammon, Andreas [IVU Traffic Technologies AG, Berlin (Germany); Genz, Alan [Washington State Univ., Pullman, WA (United States). Dept. of Mathematics; Hartung, Tobias [King' s College, London (United Kingdom). Dept. of Mathematics; Jansen, Karl; Volmer, Julia [Deutsches Elektronen-Synchrotron (DESY), Zeuthen (Germany). John von Neumann-Inst. fuer Computing NIC; Leoevey, Hernan [Humboldt Univ. Berlin (Germany). Inst. fuer Mathematik
2016-11-15
The error scaling for Markov-Chain Monte Carlo (MCMC) techniques with N samples behaves like 1/√N. This scaling often makes it very time-intensive to reduce the error of computed observables, in particular for applications in lattice QCD. It is therefore highly desirable to have alternative methods at hand which show an improved error scaling. One candidate for such an alternative integration technique is the method of recursive numerical integration (RNI). The basic idea of this method is to use an efficient low-dimensional quadrature rule (usually of Gaussian type) and apply it iteratively to integrate over high-dimensional observables and Boltzmann weights. We present the application of such an algorithm to the topological rotor and the anharmonic oscillator and compare the error scaling to MCMC results. In particular, we demonstrate that the RNI technique shows an error scaling in the number of integration points m that is at least exponential.
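The core idea, applying a 1-D Gaussian quadrature rule coordinate by coordinate, can be sketched as below. This is the naive recursion, whose cost grows like m^d; the paper's efficiency comes from exploiting the structure of the Boltzmann weight (e.g. nearest-neighbour factorization), which this sketch does not attempt.

```python
import numpy as np

# 8-point Gauss-Legendre rule, mapped from [-1, 1] to [0, 1].
nodes, weights = np.polynomial.legendre.leggauss(8)
nodes = 0.5 * (nodes + 1.0)
weights = 0.5 * weights

def integrate_recursive(f, dim, point=()):
    """Apply the 1-D rule to one coordinate at a time over [0,1]^dim.

    Naive cost: 8**dim integrand evaluations."""
    if dim == 0:
        return f(np.array(point))
    return sum(w * integrate_recursive(f, dim - 1, point + (x,))
               for x, w in zip(nodes, weights))

# Smooth test integrand with a known value: ∫_{[0,1]^3} e^{x+y+z} dV = (e-1)^3.
val = integrate_recursive(lambda p: np.exp(p.sum()), 3)
```

For a smooth integrand like this, the Gaussian rule converges exponentially in the number of points per dimension, which is the error behaviour the report demonstrates.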
Testing the mean matrix in high-dimensional transposable data.
Touloumis, Anestis; Tavaré, Simon; Marioni, John C
2015-03-01
The structural information in high-dimensional transposable data allows us to write the data recorded for each subject in a matrix such that both the rows and the columns correspond to variables of interest. One important problem is to test the null hypothesis that the mean matrix has a particular structure without ignoring the dependence structure among and/or between the row and column variables. To address this, we develop a generic and computationally inexpensive nonparametric testing procedure to assess the hypothesis that, in each predefined subset of columns (rows), the column (row) mean vector remains constant. In simulation studies, the proposed testing procedure seems to have good performance and, unlike simple practical approaches, it preserves the nominal size and remains powerful even if the row and/or column variables are not independent. Finally, we illustrate the use of the proposed methodology via two empirical examples from gene expression microarrays. © 2015, The International Biometric Society.
Building high dimensional imaging database for content based image search
Sun, Qinpei; Sun, Jianyong; Ling, Tonghui; Wang, Mingqing; Yang, Yuanyuan; Zhang, Jianguo
2016-03-01
In medical imaging informatics, content-based image retrieval (CBIR) techniques are employed to aid radiologists in the retrieval of images with similar image contents. CBIR uses visual contents, normally called as image features, to search images from large scale image databases according to users' requests in the form of a query image. However, most of current CBIR systems require a distance computation of image character feature vectors to perform query, and the distance computations can be time consuming when the number of image character features grows large, and thus this limits the usability of the systems. In this presentation, we propose a novel framework which uses a high dimensional database to index the image character features to improve the accuracy and retrieval speed of a CBIR in integrated RIS/PACS.
Experimental High-Dimensional Einstein-Podolsky-Rosen Steering
Zeng, Qiang; Wang, Bo; Li, Pengyun; Zhang, Xiangdong
2018-01-01
Steering nonlocality is a fundamental property of quantum mechanics, which has been widely demonstrated in qubit systems. Recently, theoretical works have shown that the high-dimensional (HD) steering effect exhibits novel and important features, such as noise suppression, which appear promising for potential application in quantum information processing (QIP). However, experimental observation of these HD properties remains a great challenge to date. In this work, we demonstrate the HD steering effect for the first time by encoding with orbital angular momentum photons. More importantly, we have quantitatively certified the noise-suppression phenomenon in the HD steering effect by introducing a tunable isotropic noise. We believe our results represent a significant advance in the study of nonlocal steering and have direct benefits for QIP applications with superior capacity and reliability.
Technical Report: Scalable Parallel Algorithms for High Dimensional Numerical Integration
Energy Technology Data Exchange (ETDEWEB)
Masalma, Yahya [Universidad del Turabo; Jiao, Yu [ORNL
2010-10-01
We implemented a scalable parallel quasi-Monte Carlo numerical high-dimensional integration for tera-scale data points. The implemented algorithm uses Sobol's quasi-sequences to generate random samples. Sobol's sequence was used to avoid clustering effects in the generated random samples and to produce low-discrepancy random samples which cover the entire integration domain. The performance of the algorithm was tested. Obtained results prove the scalability and accuracy of the implemented algorithms. The implemented algorithm could be used in different applications where a huge data volume is generated and numerical integration is required. We suggest using the hybrid MPI and OpenMP programming model to improve the performance of the algorithms. If the mixed model is used, attention should be paid to the scalability and accuracy.
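A minimal serial illustration of Sobol'-based quasi-Monte Carlo integration, assuming SciPy's `stats.qmc` module is available (this is not the report's MPI/OpenMP implementation):

```python
import numpy as np
from scipy.stats import qmc

# Scrambled Sobol' points cover [0,1]^d evenly (low discrepancy),
# avoiding the clustering effects that plague pseudo-random samples.
d = 6
sampler = qmc.Sobol(d=d, scramble=True, seed=0)
samples = sampler.random_base2(m=12)      # 2**12 = 4096 points

# Quasi-Monte Carlo estimate of a test integral with known value:
# ∫_{[0,1]^d} Π_i (2 x_i) dx = 1.
qmc_estimate = np.prod(2.0 * samples, axis=1).mean()
```

The same estimator parallelizes naturally: each MPI rank can consume a disjoint block of the sequence and the partial means are combined at the end.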
Interactive Java Tools for Exploring High-dimensional Data
Directory of Open Access Journals (Sweden)
James W. Bradley
2000-12-01
Full Text Available The World Wide Web (WWW) is a new mechanism for providing information. At this point, the majority of the information on the WWW is static, which means it is incapable of responding to user input. Text, images, and video are examples of static information that can easily be included in a WWW page. With the advent of the Java programming language, it is now possible to embed dynamic information in the form of interactive programs called applets. Therefore, it is not only possible to transfer raw data over the WWW, but we can also now provide interactive graphics for displaying and exploring data in the context of a WWW page. In this paper, we will describe the use of Java applets that have been developed for the interactive display of high dimensional data on the WWW.
Invertibility of some circulant matrices
Bustomi; Barra, A.
2017-10-01
We consider several cases of circulant matrices and determine the invertibility of these matrices. We give a criterion of the invertibility of matrices of the form circ(a, b, c, …, c) and circ(a, b, c, …, c, a) where a; b and c are real numbers. By using a different approach, we are able to generalize the result of Carmona.
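The invertibility of a circulant matrix rests on the fact that the DFT diagonalizes it: it is invertible exactly when no entry of the DFT of its first row vanishes. A quick numerical check of the circ(a, b, c, …, c) pattern from the abstract (the values a = 3, b = c = 1 are illustrative, not taken from the paper):

```python
import numpy as np

def circulant(first_row):
    """Build circ(c0, c1, ..., c_{n-1}): each row is the previous one
    shifted cyclically one place to the right."""
    c = np.asarray(first_row, dtype=float)
    return np.array([np.roll(c, k) for k in range(len(c))])

def is_invertible_circulant(first_row):
    """Invertible iff no eigenvalue, i.e. no entry of the DFT of the
    first row, is zero."""
    return bool(np.all(np.abs(np.fft.fft(first_row)) > 1e-12))

row = [3.0, 1.0, 1.0, 1.0, 1.0]   # circ(a, b, c, c, c) with a=3, b=c=1
C = circulant(row)
```

For this row the eigenvalues are 7, 2, 2, 2, 2 (the DFT values), so the matrix is invertible with determinant 7·2⁴ = 112; by contrast circ(1, 1) has DFT (2, 0) and is singular.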
Bapat, Ravindra B
2014-01-01
This new edition illustrates the power of linear algebra in the study of graphs. The emphasis on matrix techniques is greater than in other texts on algebraic graph theory. Important matrices associated with graphs (for example, incidence, adjacency and Laplacian matrices) are treated in detail. Presenting a useful overview of selected topics in algebraic graph theory, early chapters of the text focus on regular graphs, algebraic connectivity, the distance matrix of a tree, and its generalized version for arbitrary graphs, known as the resistance matrix. Coverage of later topics include Laplacian eigenvalues of threshold graphs, the positive definite completion problem and matrix games based on a graph. Such an extensive coverage of the subject area provides a welcome prompt for further exploration. The inclusion of exercises enables practical learning throughout the book. In the new edition, a new chapter is added on the line graph of a tree, while some results in Chapter 6 on Perron-Frobenius theory are reo...
Indian Academy of Sciences (India)
IAS Admin
Her areas of interest are Lie groups and Lie algebras and their representation theory, harmonic analysis and complex analysis, in particular, Clifford analysis. ... Its Jordan block structure can be expressed as either J0,3 ⊕ Ji,2 ⊕ Ji,2 ⊕ J5,3 or diag(J0,3, Ji,2, Ji,2, J5,3). 3. Nilpotent and Unipotent Matrices. DEFINITION 3.1.
Averaging operations on matrices
Indian Academy of Sciences (India)
2014-07-03
Jul 3, 2014 ... Arithmetic mean of objects in a space need not lie in the space. [Fréchet; 1948] Finding the mean of right-angled triangles: S = {(x, y, z) ∈ ℝ₊³ : x² + y² = z²} = { [ z, x − iy ; x + iy, z ] : x, y, z > 0, z² = x² + y² }. Surface of right triangles: the arithmetic mean is not on S. Tanvi Jain. Averaging operations on matrices ...
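The slide's point, that the arithmetic mean of two right triangles (viewed as points of S) leaves the surface, can be checked directly with two Pythagorean triples:

```python
import numpy as np

# Two Pythagorean triples, viewed as points of
# S = {(x, y, z) : x^2 + y^2 = z^2, x, y, z > 0}.
p = np.array([3.0, 4.0, 5.0])
q = np.array([5.0, 12.0, 13.0])

# Their arithmetic mean (4, 8, 9) violates the defining constraint:
m = (p + q) / 2.0
residual = m[0] ** 2 + m[1] ** 2 - m[2] ** 2   # 16 + 64 - 81 = -1, not 0
```

This is exactly Fréchet's observation: averaging must be adapted to the geometry of the space, which is what matrix means on structured sets of matrices address.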
A qualitative numerical study of high dimensional dynamical systems
Albers, David James
Since Poincare, the father of modern mathematical dynamical systems, much effort has been exerted to achieve a qualitative understanding of the physical world via a qualitative understanding of the functions we use to model the physical world. In this thesis, we construct a numerical framework suitable for a qualitative, statistical study of dynamical systems using the space of artificial neural networks. We analyze the dynamics along intervals in parameter space, separating the set of neural networks into roughly four regions: the fixed point to the first bifurcation; the route to chaos; the chaotic region; and a transition region between chaos and finite-state neural networks. The study is primarily with respect to high-dimensional dynamical systems. We make the following general conclusions as the dimension of the dynamical system is increased: the probability of the first bifurcation being of type Neimark-Sacker is greater than ninety-percent; the most probable route to chaos is via a cascade of bifurcations of high-period periodic orbits, quasi-periodic orbits, and 2-tori; there exists an interval of parameter space such that hyperbolicity is violated on a countable, Lebesgue measure 0, "increasingly dense" subset; chaos is much more likely to persist with respect to parameter perturbation in the chaotic region of parameter space as the dimension is increased; moreover, as the number of positive Lyapunov exponents is increased, the likelihood that any significant portion of these positive exponents can be perturbed away decreases with increasing dimension. The maximum Kaplan-Yorke dimension and the maximum number of positive Lyapunov exponents increases linearly with dimension. The probability of a dynamical system being chaotic increases exponentially with dimension. The results with respect to the first bifurcation and the route to chaos comment on previous results of Newhouse, Ruelle, Takens, Broer, Chenciner, and Iooss. 
Moreover, results regarding the high-dimensional
Bayesian Subset Modeling for High-Dimensional Generalized Linear Models
Liang, Faming
2013-06-01
This article presents a new prior setting for high-dimensional generalized linear models, which leads to a Bayesian subset regression (BSR) with the maximum a posteriori model approximately equivalent to the minimum extended Bayesian information criterion model. The consistency of the resulting posterior is established under mild conditions. Further, a variable screening procedure is proposed based on the marginal inclusion probability, which shares the same properties of sure screening and consistency with the existing sure independence screening (SIS) and iterative sure independence screening (ISIS) procedures. However, since the proposed procedure makes use of joint information from all predictors, it generally outperforms SIS and ISIS in real applications. This article also makes extensive comparisons of BSR with the popular penalized likelihood methods, including Lasso, elastic net, SIS, and ISIS. The numerical results indicate that BSR can generally outperform the penalized likelihood methods. The models selected by BSR tend to be sparser and, more importantly, of higher prediction ability. In addition, the performance of the penalized likelihood methods tends to deteriorate as the number of predictors increases, while this is not significant for BSR. Supplementary materials for this article are available online. © 2013 American Statistical Association.
Experimental Design of Formulations Utilizing High Dimensional Model Representation.
Li, Genyuan; Bastian, Caleb; Welsh, William; Rabitz, Herschel
2015-07-23
Many applications involve formulations or mixtures where large numbers of components are possible to choose from, but a final composition with only a few components is sought. Finding suitable binary or ternary mixtures from all the permissible components often relies on simplex-lattice sampling in traditional design of experiments (DoE), which requires performing a large number of experiments even for just tens of permissible components. The effect rises very rapidly with increasing numbers of components and can readily become impractical. This paper proposes constructing a single model for a mixture containing all permissible components from just a modest number of experiments. Yet the model is capable of satisfactorily predicting the performance for full as well as all possible binary and ternary component mixtures. To achieve this goal, we utilize biased random sampling combined with high dimensional model representation (HDMR) to replace DoE simplex-lattice design. Compared with DoE, the required number of experiments is significantly reduced, especially when the number of permissible components is large. This study is illustrated with a solubility model for solvent mixture screening.
The literary uses of high-dimensional space
Directory of Open Access Journals (Sweden)
Ted Underwood
2015-12-01
Full Text Available Debates over “Big Data” shed more heat than light in the humanities, because the term ascribes new importance to statistical methods without explaining how those methods have changed. What we badly need instead is a conversation about the substantive innovations that have made statistical modeling useful for disciplines where, in the past, it truly wasn’t. These innovations are partly technical, but more fundamentally expressed in what Leo Breiman calls a new “culture” of statistical modeling. Where 20th-century methods often required humanists to squeeze our unstructured texts, sounds, or images into some special-purpose data model, new methods can handle unstructured evidence more directly by modeling it in a high-dimensional space. This opens a range of research opportunities that humanists have barely begun to discuss. To date, topic modeling has received most attention, but in the long run, supervised predictive models may be even more important. I sketch their potential by describing how Jordan Sellers and I have begun to model poetic distinction in the long 19th century—revealing an arc of gradual change much longer than received literary histories would lead us to expect.
Efficient Smoothed Concomitant Lasso Estimation for High Dimensional Regression
Ndiaye, Eugene; Fercoq, Olivier; Gramfort, Alexandre; Leclère, Vincent; Salmon, Joseph
2017-10-01
In high dimensional settings, sparse structures are crucial for efficiency, both in terms of memory, computation, and performance. It is customary to consider an ℓ1 penalty to enforce sparsity in such scenarios. Sparsity-enforcing methods, the Lasso being a canonical example, are popular candidates to address high dimension. For efficiency, they rely on tuning a parameter trading data fitting versus sparsity. For the Lasso theory to hold, this tuning parameter should be proportional to the noise level, yet the latter is often unknown in practice. A possible remedy is to jointly optimize over the regression parameter as well as over the noise level. This has been considered under several names in the literature: Scaled Lasso, Square-root Lasso, and Concomitant Lasso estimation, for instance, and could be of interest for uncertainty quantification. In this work, after illustrating numerical difficulties for the Concomitant Lasso formulation, we propose a modification we coined the Smoothed Concomitant Lasso, aimed at increasing numerical stability. We propose an efficient and accurate solver leading to a computational cost no more expensive than the one for the Lasso. We leverage standard ingredients behind the success of fast Lasso solvers: a coordinate descent algorithm, combined with safe screening rules, to achieve speed efficiency by eliminating irrelevant features early.
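The coordinate-descent ingredient mentioned in the abstract can be sketched for the plain Lasso; the Smoothed Concomitant Lasso adds a jointly optimized noise level and safe screening rules, both omitted here. The data and the value of `lam` are illustrative.

```python
import numpy as np

def soft_threshold(u, t):
    return np.sign(u) * np.maximum(np.abs(u) - t, 0.0)

def lasso_cd(X, y, lam, n_iter=200):
    """Coordinate descent for (1/(2n))||y - Xb||^2 + lam * ||b||_1."""
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    r = y.copy()                      # residual y - X b
    for _ in range(n_iter):
        for j in range(p):
            r += X[:, j] * b[j]       # partial residual excluding j
            rho = X[:, j] @ r / n
            b[j] = soft_threshold(rho, lam) / col_sq[j]
            r -= X[:, j] * b[j]
    return b

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 20))
beta = np.zeros(20)
beta[:3] = [2.0, -1.5, 1.0]           # sparse ground truth
y = X @ beta + 0.1 * rng.normal(size=100)
b_hat = lasso_cd(X, y, lam=0.1)
```

The soft-threshold step is what produces exact zeros, and screening rules accelerate exactly this loop by skipping coordinates that are provably zero at the optimum.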
Progress in high-dimensional percolation and random graphs
Heydenreich, Markus
2017-01-01
This text presents an engaging exposition of the active field of high-dimensional percolation that will likely provide an impetus for future work. With over 90 exercises designed to enhance the reader’s understanding of the material, as well as many open problems, the book is aimed at graduate students and researchers who wish to enter the world of this rich topic. The text may also be useful in advanced courses and seminars, as well as for reference and individual study. Part I, consisting of 3 chapters, presents a general introduction to percolation, stating the main results, defining the central objects, and proving its main properties. No prior knowledge of percolation is assumed. Part II, consisting of Chapters 4–9, discusses mean-field critical behavior by describing the two main techniques used, namely, differential inequalities and the lace expansion. In Parts I and II, all results are proved, making this the first self-contained text discussing high-dimensional percolation. Part III, consist...
Indian Academy of Sciences (India)
λj(BB∗) = λj(I − AA∗) = 1 − λj(AA∗) = 1 − λj(A∗A) = λj(I − A∗A) = λj(C∗C). Thus B and C have the same singular values, and again |||B||| = |||C||| for all unitarily invariant norms. This equality of norms does not persist when we go to arbitrary normal matrices, as we will soon see. From (2) and (4) we get a simple inequality.
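Assuming the usual setting for this chain of equalities, with B and C the off-diagonal blocks of a partitioned unitary matrix, a quick numerical check is:

```python
import numpy as np

# Partition a random unitary Q into blocks [[A, B], [C, D]] and check that
# the off-diagonal blocks B and C share the same singular values, via
# lambda_j(BB*) = 1 - lambda_j(AA*) = 1 - lambda_j(A*A) = lambda_j(C*C).
rng = np.random.default_rng(1)
n, k = 6, 2  # matrix size and partition size (illustrative)
M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
Q, _ = np.linalg.qr(M)          # Q is unitary
A, B = Q[:k, :k], Q[:k, k:]
C = Q[k:, :k]
sA = np.linalg.svd(A, compute_uv=False)
sB = np.linalg.svd(B, compute_uv=False)
sC = np.linalg.svd(C, compute_uv=False)
```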
Schneider, Hans
1989-01-01
Linear algebra is one of the central disciplines in mathematics. A student of pure mathematics must know linear algebra if he is to continue with modern algebra or functional analysis. Much of the mathematics now taught to engineers and physicists requires it.This well-known and highly regarded text makes the subject accessible to undergraduates with little mathematical experience. Written mainly for students in physics, engineering, economics, and other fields outside mathematics, the book gives the theory of matrices and applications to systems of linear equations, as well as many related t
Wedderburn, J H M
1934-01-01
It is the organization and presentation of the material, however, which make the peculiar appeal of the book. This is no mere compendium of results-the subject has been completely reworked and the proofs recast with the skill and elegance which come only from years of devotion. -Bulletin of the American Mathematical Society The very clear and simple presentation gives the reader easy access to the more difficult parts of the theory. -Jahrbuch über die Fortschritte der Mathematik In 1937, the theory of matrices was seventy-five years old. However, many results had only recently evolved from sp
Zhang, Tingting; Pham, Minh; Sun, Jianhui; Yan, Guofen; Li, Huazhang; Sun, Yinge; Gonzalez, Marlen Z; Coan, James A
2017-12-26
The focus of this paper is on evaluating brain responses to different stimuli and identifying brain regions with different responses using multi-subject, stimulus-evoked functional magnetic resonance imaging (fMRI) data. To jointly model many brain voxels' responses to designed stimuli, we present a new low-rank multivariate general linear model (LRMGLM) for stimulus-evoked fMRI data. The new model not only is flexible enough to characterize variation in hemodynamic response functions (HRFs) across different regions and stimulus types, but also enables information "borrowing" across voxels and uses far fewer parameters than typical nonparametric models for HRFs. To estimate the proposed LRMGLM, we introduce a new penalized optimization function, which leads to temporally and spatially smooth HRF estimates. We develop an efficient optimization algorithm to minimize this function and identify the voxels with different responses to stimuli. We show that the proposed method can outperform several existing voxel-wise methods by achieving both high sensitivity and specificity. We apply the proposed method to the fMRI data collected in an emotion study, and identify the anterior dACC as having different responses to designed threat and control stimuli. Copyright © 2017. Published by Elsevier Inc.
Approximation of High-Dimensional Rank One Tensors
Bachmayr, Markus
2013-11-12
Many real world problems are high-dimensional in that their solution is a function which depends on many variables or parameters. This presents a computational challenge since traditional numerical techniques are built on model classes for functions based solely on smoothness. It is known that the approximation of smoothness classes of functions suffers from the so-called 'curse of dimensionality'. Avoiding this curse requires new model classes for real world functions that match applications. This has led to the introduction of notions such as sparsity, variable reduction, and reduced modeling. One theme that is particularly common is to assume a tensor structure for the target function. This paper investigates how well a rank one function f(x_1,...,x_d) = f_1(x_1)⋯f_d(x_d), defined on Ω = [0,1]^d, can be captured through point queries. It is shown that such a rank one function with component functions f_j in W^r_∞([0,1]) can be captured (in L_∞) to accuracy O(C(d,r)N^(−r)) from N well-chosen point evaluations. The constant C(d,r) scales like d^(dr). The queries in our algorithms have two ingredients, a set of points built on the results from discrepancy theory and a second adaptive set of queries dependent on the information drawn from the first set. Under the assumption that a point z ∈ Ω with nonvanishing f(z) is known, the accuracy improves to O(dN^(−r)). © 2013 Springer Science+Business Media New York.
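The paper's two-stage query scheme is more refined, but the core identity behind the improved rate, that a rank-one function is determined by its axis-aligned fibers through any point z with f(z) ≠ 0, can be sketched as follows (the test function, grids, and names are illustrative):

```python
import numpy as np

def reconstruct_rank_one(f, z, grids):
    # Recover f(x) = f1(x1)*...*fd(xd) from point queries on the d
    # axis-aligned fibers through z, using the identity
    #   prod_j f(z with x_j swapped in) = f(x) * f(z)^(d-1).
    z = np.asarray(z, dtype=float)
    d = len(z)
    fz = f(z)  # must be nonzero
    fibers = []
    for j, grid in enumerate(grids):
        pts = np.tile(z, (len(grid), 1))
        pts[:, j] = grid                       # vary only coordinate j
        fibers.append(np.array([f(p) for p in pts]))
    def f_hat(x):
        vals = [np.interp(x[j], grids[j], fibers[j]) for j in range(d)]
        return np.prod(vals) / fz ** (d - 1)
    return f_hat

# demo: a 3-dimensional rank-one function (components are illustrative)
f = lambda x: np.sin(x[0] + 0.5) * np.exp(x[1]) * (x[2] + 1.0)
grids = [np.linspace(0.0, 1.0, 21)] * 3
f_hat = reconstruct_rank_one(f, [0.3, 0.3, 0.3], grids)
```

Only d·N point queries are needed here, in line with the O(dN^(−r)) regime when a nonvanishing point is known.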
Quality and efficiency in high dimensional Nearest neighbor search
Tao, Yufei
2009-01-01
Nearest neighbor (NN) search in high dimensional space is an important problem in many applications. Ideally, a practical solution (i) should be implementable in a relational database, and (ii) its query cost should grow sub-linearly with the dataset size, regardless of the data and query distributions. Despite the bulk of NN literature, no solution fulfills both requirements, except locality sensitive hashing (LSH). The existing LSH implementations are either rigorous or adhoc. Rigorous-LSH ensures good quality of query results, but requires expensive space and query cost. Although adhoc-LSH is more efficient, it abandons quality control, i.e., the neighbor it outputs can be arbitrarily bad. As a result, currently no method is able to ensure both quality and efficiency simultaneously in practice. Motivated by this, we propose a new access method called the locality sensitive B-tree (LSB-tree) that enables fast high-dimensional NN search with excellent quality. The combination of several LSB-trees leads to a structure called the LSB-forest that ensures the same result quality as rigorous-LSH, but reduces its space and query cost dramatically. The LSB-forest also outperforms adhoc-LSH, even though the latter has no quality guarantee. Besides its appealing theoretical properties, the LSB-tree itself also serves as an effective index that consumes linear space, and supports efficient updates. Our extensive experiments confirm that the LSB-tree is faster than (i) the state of the art of exact NN search by two orders of magnitude, and (ii) the best (linear-space) method of approximate retrieval by an order of magnitude, and at the same time, returns neighbors with much better quality. © 2009 ACM.
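The LSB-tree itself layers a B-tree over Z-order codes of LSH projections; as a hedged sketch of just the underlying p-stable (Euclidean) LSH primitive, with arbitrary hash count and bucket width and no quality guarantees:

```python
import numpy as np

class SimpleE2LSH:
    # Sketch of the p-stable LSH primitive that structures like the
    # LSB-tree build on. This is NOT the LSB-tree: no Z-order codes,
    # no B-tree, and only a single hash table.

    def __init__(self, dim, n_hashes=8, w=4.0, seed=0):
        rng = np.random.default_rng(seed)
        self.a = rng.standard_normal((n_hashes, dim))  # random projections
        self.b = rng.uniform(0.0, w, n_hashes)         # random offsets
        self.w = w                                     # bucket width

    def key(self, x):
        return tuple(np.floor((self.a @ x + self.b) / self.w).astype(int))

    def fit(self, data):
        self.data = np.asarray(data)
        self.buckets = {}
        for i, x in enumerate(self.data):
            self.buckets.setdefault(self.key(x), []).append(i)

    def query(self, q):
        cand = self.buckets.get(self.key(q), [])
        if not cand:                 # empty bucket: fall back to a scan
            cand = list(range(len(self.data)))
        d = np.linalg.norm(self.data[cand] - q, axis=1)
        return cand[int(np.argmin(d))]

# demo: querying with a database point returns that point
rng = np.random.default_rng(2)
data = rng.standard_normal((200, 16))
index = SimpleE2LSH(dim=16)
index.fit(data)
hit = index.query(data[5])
```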
Optimally splitting cases for training and testing high dimensional classifiers
Directory of Open Access Journals (Sweden)
Simon Richard M
2011-04-01
Full Text Available Abstract Background We consider the problem of designing a study to develop a predictive classifier from high dimensional data. A common study design is to split the sample into a training set and an independent test set, where the former is used to develop the classifier and the latter to evaluate its performance. In this paper we address the question of what proportion of the samples should be devoted to the training set. How does this proportion impact the mean squared error (MSE) of the prediction accuracy estimate? Results We develop a non-parametric algorithm for determining an optimal splitting proportion that can be applied with a specific dataset and classifier algorithm. We also perform a broad simulation study for the purpose of better understanding the factors that determine the best split proportions and to evaluate commonly used splitting strategies (1/2 training or 2/3 training) under a wide variety of conditions. These methods are based on a decomposition of the MSE into three intuitive component parts. Conclusions By applying these approaches to a number of synthetic and real microarray datasets we show that for linear classifiers the optimal proportion depends on the overall number of samples available and the degree of differential expression between the classes. The optimal proportion was found to depend on the full dataset size (n) and classification accuracy, with higher accuracy and smaller n resulting in more cases assigned to the training set. The commonly used strategy of allocating 2/3 of cases for training was close to optimal for reasonably sized datasets (n ≥ 100) with strong signals (i.e. 85% or greater full dataset accuracy). In general, we recommend use of our nonparametric resampling approach for determining the optimal split. This approach can be applied to any dataset, using any predictor development method, to determine the best split.
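The paper's algorithm decomposes the MSE of the accuracy estimate; a simplified resampling sketch of the same idea, scoring candidate split proportions by the mean and spread of the resulting accuracy estimates (the toy classifier and data are illustrative, not the paper's method):

```python
import numpy as np

def nearest_centroid(X_tr, y_tr, X_te):
    # toy two-class linear classifier: assign to the nearest class centroid
    c0 = X_tr[y_tr == 0].mean(axis=0)
    c1 = X_tr[y_tr == 1].mean(axis=0)
    d0 = np.linalg.norm(X_te - c0, axis=1)
    d1 = np.linalg.norm(X_te - c1, axis=1)
    return (d1 < d0).astype(int)

def split_quality(X, y, props, n_rep=50, seed=0):
    # For each training proportion, repeatedly split, train, and record
    # test accuracy; mean and variance reflect the bias/variance trade-off
    # that the optimal-split analysis formalizes.
    rng = np.random.default_rng(seed)
    n = len(y)
    out = {}
    for p in props:
        accs = []
        for _ in range(n_rep):
            idx = rng.permutation(n)
            k = int(p * n)
            tr, te = idx[:k], idx[k:]
            yhat = nearest_centroid(X[tr], y[tr], X[te])
            accs.append(np.mean(yhat == y[te]))
        out[p] = (np.mean(accs), np.var(accs))
    return out

# demo: well-separated classes, comparing the two common strategies
rng = np.random.default_rng(3)
X = np.r_[rng.standard_normal((60, 5)) + 2.0,
          rng.standard_normal((60, 5)) - 2.0]
y = np.r_[np.zeros(60, dtype=int), np.ones(60, dtype=int)]
quality = split_quality(X, y, props=[0.5, 2 / 3])
```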
An Effective Parameter Screening Strategy for High Dimensional Watershed Models
Khare, Y. P.; Martinez, C. J.; Munoz-Carpena, R.
2014-12-01
Watershed simulation models can assess the impacts of natural and anthropogenic disturbances on natural systems. These models have become important tools for tackling a range of water resources problems through their implementation in the formulation and evaluation of Best Management Practices, Total Maximum Daily Loads, and Basin Management Action Plans. For accurate application, watershed models need to be thoroughly evaluated through global uncertainty and sensitivity analyses (UA/SA). However, due to the high dimensionality of these models, such evaluation becomes extremely time- and resource-consuming. Parameter screening, the qualitative separation of important parameters, has been suggested as an essential step before applying rigorous evaluation techniques such as the Sobol' and Fourier Amplitude Sensitivity Test (FAST) methods in the UA/SA framework. The method of elementary effects (EE) (Morris, 1991) is one of the most widely used screening methodologies. Some of the common parameter sampling strategies for EE, e.g. Optimized Trajectories [OT] (Campolongo et al., 2007) and Modified Optimized Trajectories [MOT] (Ruano et al., 2012), suffer from inconsistencies in the generated parameter distributions, infeasible sample generation time, etc. In this work, we have formulated a new parameter sampling strategy - Sampling for Uniformity (SU) - for parameter screening which is based on the principles of the uniformity of the generated parameter distributions and the spread of the parameter sample. A rigorous multi-criteria evaluation (time, distribution, spread and screening efficiency) of OT, MOT, and SU indicated that SU is superior to other sampling strategies. Comparison of the EE-based parameter importance rankings with those of Sobol' helped to quantify the qualitativeness of the EE parameter screening approach, reinforcing the fact that one should use EE only to reduce the resource burden required by FAST/Sobol' analyses but not to replace them.
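A minimal sketch of the elementary-effects screening step itself, using simple one-at-a-time random baselines rather than the optimized-trajectory samplers the abstract compares (the test function and step size are illustrative):

```python
import numpy as np

def elementary_effects(f, k, r=20, delta=0.5, seed=0):
    # Morris (1991) screening on [0,1]^k: mu* (mean |EE|) ranks parameter
    # importance; sigma (std of EE) flags nonlinearity or interactions.
    rng = np.random.default_rng(seed)
    ee = np.zeros((r, k))
    for i in range(r):
        x = rng.uniform(0.0, 0.5, k)  # baseline so x + delta stays in [0,1]
        fx = f(x)
        for j in range(k):
            xp = x.copy()
            xp[j] += delta
            ee[i, j] = (f(xp) - fx) / delta
    return np.abs(ee).mean(axis=0), ee.std(axis=0)

# demo: x0 dominates; x2 and x3 contribute only through an interaction
f = lambda x: 10.0 * x[0] + 0.1 * x[1] + x[2] * x[3]
mu_star, sigma = elementary_effects(f, k=4)
```

Parameters with both small mu* and small sigma can be fixed before running the more expensive FAST/Sobol' analyses.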
Numerically Stable Evaluation of Moments of Random Gram Matrices With Applications
Elkhalil, Khalil
2017-07-31
This paper focuses on the computation of the positive moments of one-sided correlated random Gram matrices. Closed-form expressions for the moments can be obtained easily, but numerical evaluation thereof is prone to numerical instability, especially in high-dimensional settings. This letter provides a numerically stable method that efficiently computes the positive moments in closed-form. The developed expressions are more accurate and can lead to higher accuracy levels when fed to moment-based approaches. As an application, we show how the obtained moments can be used to approximate the marginal distribution of the eigenvalues of random Gram matrices.
A Novel High Dimensional and High Speed Data Streams Algorithm: HSDStream
Irshad Ahmed; Irfan Ahmed; Waseem Shahzad
2016-01-01
This paper presents a novel high speed clustering scheme for high-dimensional data streams. Data stream clustering has gained importance in different applications, for example, network monitoring, intrusion detection, and real-time sensing. High-dimensional stream data is inherently more complex to cluster, since its evolving nature and high dimensionality make the task non-trivial. In order to tackle this problem, projected subspace within the high dimensions and l...
Multivariate statistical analysis a high-dimensional approach
Serdobolskii, V
2000-01-01
In the last few decades the accumulation of large amounts of information in numerous applications has stimulated an increased interest in multivariate analysis. Computer technologies allow one to use multi-dimensional and multi-parametric models successfully. At the same time, an interest arose in statistical analysis with a deficiency of sample data. Nevertheless, it is difficult to describe the recent state of affairs in applied multivariate methods as satisfactory. Unimprovable (dominating) statistical procedures are still unknown except for a few specific cases. The simplest problem of estimating the mean vector with minimum quadratic risk is unsolved, even for normal distributions. Commonly used standard linear multivariate procedures based on the inversion of sample covariance matrices can lead to unstable results or provide no solution depending on the data. Programs included in standard statistical packages cannot process 'multi-collinear data' and there are no theoretical recommen ...
Generalisations of Fisher Matrices
Directory of Open Access Journals (Sweden)
Alan Heavens
2016-06-01
Full Text Available Fisher matrices play an important role in experimental design and in data analysis. Their primary role is to make predictions for the inference of model parameters, both their errors and covariances. In this short review, I outline a number of extensions to the simple Fisher matrix formalism, covering a number of recent developments in the field. These are: (a) situations where the data (in the form of (x, y) pairs) have errors in both x and y; (b) modifications to parameter inference in the presence of systematic errors, or through fixing the values of some model parameters; (c) Derivative Approximation for LIkelihoods (DALI), higher-order expansions of the likelihood surface, going beyond the Gaussian shape approximation; (d) extensions of the Fisher-like formalism to treat model selection problems with Bayesian evidence.
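For the simplest case the review builds on, a Gaussian likelihood with parameter-independent covariance, the Fisher matrix reduces to F = Jᵀ C⁻¹ J. A short sketch (the straight-line model is illustrative):

```python
import numpy as np

def fisher_matrix(jacobian, cov):
    # F_ab = sum_ij (dmu_i/dtheta_a) C^{-1}_ij (dmu_j/dtheta_b);
    # marginal 1-sigma forecast errors are sqrt(diag(F^{-1})).
    Ci = np.linalg.inv(cov)
    F = jacobian.T @ Ci @ jacobian
    errors = np.sqrt(np.diag(np.linalg.inv(F)))
    return F, errors

# demo: straight-line model mu_i = a + b * x_i with unit noise,
# for which F = [[n, sum x], [sum x, sum x^2]] analytically
x = np.linspace(0.0, 1.0, 11)
J = np.c_[np.ones_like(x), x]   # columns: dmu/da, dmu/db
F, errs = fisher_matrix(J, np.eye(len(x)))
```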
VanderLaan Circulant Type Matrices
Directory of Open Access Journals (Sweden)
Hongyan Pan
2015-01-01
Full Text Available Circulant matrices have become satisfactory tools in control methods for modern complex systems. In this paper, VanderLaan circulant type matrices are presented, which include VanderLaan circulant, left circulant, and g-circulant matrices. The nonsingularity of these special matrices is discussed via the surprising properties of VanderLaan numbers. The exact determinants of VanderLaan circulant type matrices are given by structuring transformation matrices, determinants of well-known tridiagonal matrices, and tridiagonal-like matrices. The explicit inverse matrices of these special matrices are obtained by structuring transformation matrices, inverses of known tridiagonal matrices, and quasi-tridiagonal matrices. Three kinds of norms and a lower bound for the spread of VanderLaan circulant and left circulant matrices are given separately, and the spectral norm of the VanderLaan g-circulant matrix is obtained.
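The closed forms derived in such papers rest on the fact that every circulant is diagonalized by the DFT, so its eigenvalues are the FFT of its first row; determinant and spectral norm follow directly. A quick numerical illustration (the first row is arbitrary, not the VanderLaan sequence):

```python
import numpy as np

def circulant_spectrum(first_row):
    # Eigenvalues of a circulant = FFT of its first row; determinant is
    # their product, and (circulants being normal) the spectral norm is
    # the largest eigenvalue modulus.
    lam = np.fft.fft(np.asarray(first_row, dtype=float))
    return lam, np.prod(lam), float(np.abs(lam).max())

# demo: compare against the explicitly built circulant matrix
r = np.array([4.0, 1.0, 2.0, 3.0])
C = np.array([np.roll(r, i) for i in range(len(r))])  # circulant with first row r
lam, det_fft, spec_norm = circulant_spectrum(r)
```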
Reduced nonlinear prognostic model construction from high-dimensional data
Gavrilov, Andrey; Mukhin, Dmitry; Loskutov, Evgeny; Feigin, Alexander
2017-04-01
Construction of a data-driven model of an evolution operator using universal approximating functions can only be statistically justified when the dimension of its phase space is small enough, especially in the case of short time series. At the same time, in many applications real-measured data is high-dimensional, e.g. it is space-distributed and multivariate in climate science. Therefore it is necessary to use efficient dimensionality reduction methods which are also able to capture key dynamical properties of the system from observed data. To address this problem we present a Bayesian approach to evolution operator construction which incorporates two key reduction steps. First, the data is decomposed into a set of certain empirical modes, such as standard empirical orthogonal functions or recently suggested nonlinear dynamical modes (NDMs) [1], and the reduced space of corresponding principal components (PCs) is obtained. Then, the model of the evolution operator for PCs is constructed which maps a number of states in the past to the current state. The second step is to reduce this time-extended space in the past using appropriate decomposition methods. Such a reduction allows us to capture only the most significant spatio-temporal couplings. The functional form of the evolution operator includes separately linear, nonlinear (based on artificial neural networks) and stochastic terms. Explicit separation of the linear term from the nonlinear one allows us to more easily interpret the degree of nonlinearity, as well as to deal better with smooth PCs which can naturally occur in decompositions like NDM, as they provide a time scale separation. Results of application of the proposed method to climate data are demonstrated and discussed. The study is supported by Government of Russian Federation (agreement #14.Z50.31.0033 with the Institute of Applied Physics of RAS). 1. Mukhin, D., Gavrilov, A., Feigin, A., Loskutov, E., & Kurths, J. (2015). Principal nonlinear dynamical
Two representations of a high-dimensional perceptual space.
Victor, Jonathan D; Rizvi, Syed M; Conte, Mary M
2017-08-01
A perceptual space is a mental workspace of points in a sensory domain that supports similarity and difference judgments and enables further processing such as classification and naming. Perceptual spaces are present across sensory modalities; examples include colors, faces, auditory textures, and odors. Color is perhaps the best-studied perceptual space, but it is atypical in two respects. First, the dimensions of color space are directly linked to the three cone absorption spectra, but the dimensions of generic perceptual spaces are not as readily traceable to single-neuron properties. Second, generic perceptual spaces have more than three dimensions. This is important because representing each distinguishable point in a high-dimensional space by a separate neuron or population is unwieldy; combinatorial strategies may be needed to overcome this hurdle. To study the representation of a complex perceptual space, we focused on a well-characterized 10-dimensional domain of visual textures. Within this domain, we determine perceptual distances in a threshold task (segmentation) and a suprathreshold task (border salience comparison). In N=4 human observers, we find both quantitative and qualitative differences between these sets of measurements. Quantitatively, observers' segmentation thresholds were inconsistent with their uncertainty determined from border salience comparisons. Qualitatively, segmentation thresholds suggested that distances are determined by a coordinate representation with Euclidean geometry. Border salience comparisons, in contrast, indicated a global curvature of the space, and that distances are determined by activity patterns across broadly tuned elements. Thus, our results indicate two representations of this perceptual space, and suggest that they use differing combinatorial strategies. To move from sensory signals to decisions and actions, the brain carries out a sequence of transformations. An important stage in this process is the
Enhancing Understanding of Transformation Matrices
Dick, Jonathan; Childrey, Maria
2012-01-01
With the Common Core State Standards' emphasis on transformations, teachers need a variety of approaches to increase student understanding. Teaching matrix transformations by focusing on row vectors gives students tools to create matrices to perform transformations. This empowerment opens many doors: Students are able to create the matrices for…
Hierarchical matrices algorithms and analysis
Hackbusch, Wolfgang
2015-01-01
This self-contained monograph presents matrix algorithms and their analysis. The new technique enables not only the solution of linear systems but also the approximation of matrix functions, e.g., the matrix exponential. Other applications include the solution of matrix equations, e.g., the Lyapunov or Riccati equation. The required mathematical background can be found in the appendix. The numerical treatment of fully populated large-scale matrices is usually rather costly. However, the technique of hierarchical matrices makes it possible to store matrices and to perform matrix operations approximately with almost linear cost and a controllable degree of approximation error. For important classes of matrices, the computational cost increases only logarithmically with the approximation error. The operations provided include the matrix inversion and LU decomposition. Since large-scale linear algebra problems are standard in scientific computing, the subject of hierarchical matrices is of interest to scientists ...
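The central primitive behind the almost-linear cost described above, compressing an admissible (well-separated) block to low rank with a controlled error, can be sketched with a truncated SVD; real H-matrix codes use cheaper factorizations (e.g. adaptive cross approximation), and the kernel and cluster geometry below are illustrative:

```python
import numpy as np

def lowrank_block(M, eps):
    # Compress a matrix block to the smallest rank whose spectral-norm
    # error is below eps; storage drops from m*n to k*(m+n) numbers.
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    k = int(np.sum(s > eps))                 # keep singular values above eps
    approx = (U[:, :k] * s[:k]) @ Vt[:k]
    return approx, k

# demo: off-diagonal block of the kernel 1/(x - y) for two separated
# point clusters -- dense, but numerically of very low rank
x = np.linspace(0.0, 1.0, 50)
y = np.linspace(2.0, 3.0, 50)
M = 1.0 / (x[:, None] - y[None, :])
approx, k = lowrank_block(M, eps=1e-8)
```

The rank k grows only logarithmically with 1/eps for such admissible blocks, which is what makes the storage and arithmetic of hierarchical matrices almost linear.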
Manin matrices and Talalaev's formula
Chervov, A.; Falqui, G.
2008-05-01
In this paper we study properties of Lax and transfer matrices associated with quantum integrable systems. Our point of view stems from the fact that their elements satisfy special commutation properties, considered by Yu I Manin some 20 years ago at the beginning of quantum group theory. These are the commutation properties of matrix elements of linear homomorphisms between polynomial rings; more explicitly these read: (1) elements of the same column commute; (2) commutators of the cross terms are equal: [Mij, Mkl] = [Mkj, Mil] (e.g. [M11, M22] = [M21, M12]). The main aim of this paper is twofold: on the one hand we observe and prove that such matrices (which we call Manin matrices in short) behave almost as well as matrices with commutative elements. Namely, the theorems of linear algebra (e.g., a natural definition of the determinant, the Cayley-Hamilton theorem, the Newton identities and so on and so forth) have a straightforward counterpart in the case of Manin matrices. On the other hand, we remark that such matrices are somewhat ubiquitous in the theory of quantum integrability. For instance, Manin matrices (and their q-analogs) include matrices satisfying the Yang-Baxter relation 'RTT=TTR' and the so-called Cartier-Foata matrices. Also, they enter Talalaev's remarkable formulae: det(∂_z − L_Gaudin(z)), det(1 − e^(−∂_z) T_Yangian(z)) for the 'quantum spectral curve', and appear in the separation of variables problem and Capelli identities. We show that theorems of linear algebra, after being established for such matrices, have various applications to quantum integrable systems and Lie algebras, e.g. in the construction of new generators in Z(U_crit(ĝl_n)) (and, in general, in the construction of quantum conservation laws), in the Knizhnik-Zamolodchikov equation, and in the problem of Wick ordering. We propose, in the appendix, a construction of quantum separated variables for the XXX-Heisenberg system.
Visualizing High-Dimensional Structures by Dimension Ordering and Filtering using Subspace Analysis
Ferdosi, Bilkis J.; Roerdink, Jos B. T. M.
2011-01-01
High-dimensional data visualization is receiving increasing interest because of the growing abundance of high-dimensional datasets. To understand such datasets, visualization of the structures present in the data, such as clusters, can be an invaluable tool. Structures may be present in the full
Engineering two-photon high-dimensional states through quantum interference
CSIR Research Space (South Africa)
Zhang, YI
2016-02-01
Full Text Available problems. An alternative strategy is to consider a smaller number of particles that exist in high-dimensional states. The spatial modes of light are one such candidate that provides access to high-dimensional quantum states, and thus they increase...
Energy Technology Data Exchange (ETDEWEB)
Tripathy, Rohit, E-mail: rtripath@purdue.edu; Bilionis, Ilias, E-mail: ibilion@purdue.edu; Gonzalez, Marcial, E-mail: marcial-gonzalez@purdue.edu
2016-09-15
Uncertainty quantification (UQ) tasks, such as model calibration, uncertainty propagation, and optimization under uncertainty, typically require several thousand evaluations of the underlying computer codes. To cope with the cost of simulations, one replaces the real response surface with a cheap surrogate based, e.g., on polynomial chaos expansions, neural networks, support vector machines, or Gaussian processes (GP). However, the number of simulations required to learn a generic multivariate response grows exponentially as the input dimension increases. This curse of dimensionality can only be addressed if the response exhibits some special structure that can be discovered and exploited. A wide range of physical responses exhibit a special structure known as an active subspace (AS). An AS is a linear manifold of the stochastic space characterized by maximal response variation. The idea is that one should first identify this low dimensional manifold, project the high-dimensional input onto it, and then link the projection to the output. If the dimensionality of the AS is low enough, then learning the link function is a much easier problem than the original problem of learning a high-dimensional function. The classic approach to discovering the AS requires gradient information, a fact that severely limits its applicability. Furthermore, and partly because of its reliance on gradients, it is not able to handle noisy observations. The latter is an essential trait if one wants to be able to propagate uncertainty through stochastic simulators, e.g., through molecular dynamics codes. In this work, we develop a probabilistic version of AS which is gradient-free and robust to observational noise. Our approach relies on a novel Gaussian process regression with built-in dimensionality reduction. In particular, the AS is represented as an orthogonal projection matrix that serves as yet another covariance function hyper-parameter to be estimated from the data. To train the
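For contrast, the classic gradient-based construction the passage refers to fits in a few lines: eigendecompose C = E[∇f ∇fᵀ] and keep the dominant eigenvectors. This is the baseline the paper's gradient-free GP approach replaces, and the test function here is illustrative:

```python
import numpy as np

def active_subspace(grads):
    # Eigendecompose C = E[grad f grad f^T]; dominant eigenvectors span
    # the active subspace of maximal response variation.
    G = np.asarray(grads)
    C = G.T @ G / len(G)
    w, V = np.linalg.eigh(C)
    order = np.argsort(w)[::-1]          # sort eigenvalues descending
    return w[order], V[:, order]

# demo: f(x) = sin(a.x) varies only along the single direction a,
# so the active subspace is one-dimensional
rng = np.random.default_rng(4)
a = np.array([1.0, 2.0, 0.0, -1.0, 0.5])
X = rng.standard_normal((500, 5))
grads = np.cos(X @ a)[:, None] * a       # analytic gradient of sin(a.x)
w, V = active_subspace(grads)
```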
Directory of Open Access Journals (Sweden)
Ivan Gregor
2016-02-01
Full Text Available Background. Metagenomics is an approach for characterizing environmental microbial communities in situ; it allows their functional and taxonomic characterization and the recovery of sequences from uncultured taxa. This is often achieved by a combination of sequence assembly and binning, where sequences are grouped into ‘bins’ representing taxa of the underlying microbial community. Assignment to low-ranking taxonomic bins is an important challenge for binning methods, as is scalability to Gb-sized datasets generated with deep sequencing techniques. One of the best available methods for species bin recovery from deep-branching phyla is the expert-trained PhyloPythiaS package, where a human expert decides on the taxa to incorporate in the model and identifies ‘training’ sequences based on marker genes directly from the sample. Due to the manual effort involved, this approach does not scale to multiple metagenome samples and requires substantial expertise, which researchers who are new to the area do not have. Results. We have developed PhyloPythiaS+, a successor to our PhyloPythiaS software. The new (+) component performs the work previously done by the human expert. PhyloPythiaS+ also includes a new k-mer counting algorithm, which accelerated the simultaneous counting of 4–6-mers used for taxonomic binning 100-fold and reduced the overall execution time of the software by a factor of three. Our software makes it possible to analyze Gb-sized metagenomes on inexpensive hardware and to recover species- or genus-level bins with low error rates in a fully automated fashion. PhyloPythiaS+ was compared to MEGAN, taxator-tk, Kraken and the generic PhyloPythiaS model. The results showed that PhyloPythiaS+ performs especially well for samples originating from novel environments in comparison to the other methods. Availability. PhyloPythiaS+ in a virtual machine is available for installation under Windows, Unix systems or OS X on: https://github.com/algbioi/ppsp/wiki.
Genton, Marc G.
2017-09-07
We present a hierarchical decomposition scheme for computing the n-dimensional integral of multivariate normal probabilities that appear frequently in statistics. The scheme exploits the fact that the formally dense covariance matrix can be approximated by a matrix with a hierarchical low-rank structure. It allows the reduction of the computational complexity per Monte Carlo sample from O(n²) to O(mn + kn log(n/m)), where k is the numerical rank of off-diagonal matrix blocks and m is the size of small diagonal blocks in the matrix that are not well-approximated by low-rank factorizations and treated as dense submatrices. This hierarchical decomposition leads to substantial efficiencies in multivariate normal probability computations and allows integrations in thousands of dimensions to be practical on modern workstations.
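The O(n²)-per-sample baseline that the hierarchical factorization improves on is plain Monte Carlo with a dense Cholesky factor; a minimal sketch (without the low-rank structure, and with an independent-components demo whose answer factorizes):

```python
import numpy as np

def mvn_prob(cov, upper, n_samples=200_000, seed=0):
    # Monte Carlo estimate of P(X <= upper) for X ~ N(0, cov).
    # Applying the dense Cholesky factor costs O(n^2) per sample,
    # which is the term the hierarchical low-rank scheme reduces.
    rng = np.random.default_rng(seed)
    L = np.linalg.cholesky(np.asarray(cov))
    Z = rng.standard_normal((n_samples, len(upper)))
    X = Z @ L.T
    return float(np.mean(np.all(X <= np.asarray(upper), axis=1)))

# demo: identity covariance in 3 dimensions, so P(X <= 0) = 0.5^3
p = mvn_prob(np.eye(3), np.zeros(3))
```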
Hadamard Matrices and Their Applications
Horadam, K J
2011-01-01
In Hadamard Matrices and Their Applications, K. J. Horadam provides the first unified account of cocyclic Hadamard matrices and their applications in signal and data processing. This original work is based on the development of an algebraic link between Hadamard matrices and the cohomology of finite groups that was discovered fifteen years ago. The book translates physical applications into terms a pure mathematician will appreciate, and theoretical structures into ones an applied mathematician, computer scientist, or communications engineer can adapt and use. The first half of the book expl
Special matrices of mathematical physics stochastic, circulant and Bell matrices
Aldrovandi, R
2001-01-01
This book expounds three special kinds of matrices that are of physical interest, centering on physical examples. Stochastic matrices describe dynamical systems of many different types, involving (or not) phenomena like transience, dissipation, ergodicity, nonequilibrium, and hypersensitivity to initial conditions. The main characteristic is growth by agglomeration, as in glass formation. Circulants are the building blocks of elementary Fourier analysis and provide a natural gateway to quantum mechanics and noncommutative geometry. Bell polynomials offer closed expressions for many formulas co
The invariant theory of matrices
Concini, Corrado De
2017-01-01
This book gives a unified, complete, and self-contained exposition of the main algebraic theorems of invariant theory for matrices in a characteristic free approach. More precisely, it contains the description of polynomial functions in several variables on the set of m × m matrices with coefficients in an infinite field or even the ring of integers, invariant under simultaneous conjugation. Following Hermann Weyl's classical approach, the ring of invariants is described by formulating and proving the first fundamental theorem that describes a set of generators in the ring of invariants, and the second fundamental theorem that describes relations between these generators. The authors study both the case of matrices over a field of characteristic 0 and the case of matrices over a field of positive characteristic. While the case of characteristic 0 can be treated following a classical approach, the case of positive characteristic (developed by Donkin and Zubkov) is much harder. A presentation of this case...
Mapping the human DC lineage through the integration of high-dimensional techniques
See, Peter; Dutertre, Charles-Antoine; Chen, Jinmiao; Günther, Patrick; McGovern, Naomi; Irac, Sergio Erdal; Gunawan, Merry; Beyer, Marc; Händler, Kristian; Duan, Kaibo; Sumatoh, Hermi Rizal Bin; Ruffin, Nicolas; Jouve, Mabel; Gea-Mallorquí, Ester; Hennekam, Raoul C. M.; Lim, Tony; Yip, Chan Chung; Wen, Ming; Malleret, Benoit; Low, Ivy; Shadan, Nurhidaya Binte; Fen, Charlene Foong Shu; Tay, Alicia; Lum, Josephine; Zolezzi, Francesca; Larbi, Anis; Poidinger, Michael; Chan, Jerry K. Y.; Chen, Qingfeng; Rénia, Laurent; Haniffa, Muzlifah; Benaroch, Philippe; Schlitzer, Andreas; Schultze, Joachim L.; Newell, Evan W.; Ginhoux, Florent
2017-01-01
Dendritic cells (DC) are professional antigen-presenting cells that orchestrate immune responses. The human DC population comprises two main functionally specialized lineages, whose origins and differentiation pathways remain incompletely defined. Here, we combine two high-dimensional
Approximating high-dimensional dynamics by barycentric coordinates with linear programming.
Hirata, Yoshito; Shiro, Masanori; Takahashi, Nozomu; Aihara, Kazuyuki; Suzuki, Hideyuki; Mas, Paloma
2015-01-01
The increasing development of novel methods and techniques facilitates the measurement of high-dimensional time series but challenges our ability to model and predict them accurately. A general mathematical model requires many parameters, which are difficult to fit from the relatively short high-dimensional time series typically observed. Here, we propose a novel method to accurately model high-dimensional time series. Our method extends barycentric coordinates to high-dimensional phase space by employing linear programming, allowing for the approximation errors explicitly. The extension helps to produce free-running time-series predictions that preserve typical topological, dynamical, and/or geometric characteristics of the underlying attractors more accurately than the widely used radial basis function model. The method can be broadly applied, from helping to improve weather forecasting, to creating electronic instruments that sound more natural, and to comprehensively understanding complex biological data.
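The core step described in this abstract, expressing a target point as a convex combination of stored phase-space points with an explicit L1 approximation error, can be posed directly as a linear program. A minimal sketch with scipy.optimize.linprog; the variable layout and the toy neighbor set are illustrative assumptions, not the authors' implementation:

```python
import numpy as np
from scipy.optimize import linprog

def barycentric_weights(neighbors, target):
    """Convex-combination weights w (w >= 0, sum w = 1) minimizing the
    L1 error || neighbors.T @ w - target ||_1, solved as a linear program."""
    K, d = neighbors.shape
    c = np.concatenate([np.zeros(K), np.ones(d)])   # minimize sum of slacks e
    A = neighbors.T                                 # d x K
    I = np.eye(d)
    A_ub = np.block([[A, -I], [-A, -I]])            # encodes |A w - y| <= e
    b_ub = np.concatenate([target, -target])
    A_eq = np.concatenate([np.ones(K), np.zeros(d)])[None, :]   # sum w = 1
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0],
                  bounds=(0, None), method="highs")
    return res.x[:K], res.fun                       # weights, total L1 error

# A target inside the simplex of the neighbors is exactly representable
neighbors = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
w, err = barycentric_weights(neighbors, np.array([0.25, 0.25]))
```

Minimizing the sum of the slack variables bounds the componentwise error; a free-running predictor would presumably reapply the learned weights to the neighbors' successors in the time series.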
Experimental ladder proof of Hardy's nonlocality for high-dimensional quantum systems
Chen, Lixiang; Zhang, Wuhong; Wu, Ziwen; Wang, Jikang; Fickler, Robert; Karimi, Ebrahim
2017-08-01
Recent years have witnessed a rapidly growing interest in high-dimensional quantum entanglement for fundamental studies as well as towards novel applications. Therefore, the ability to verify entanglement between physical qudits, d -dimensional quantum systems, is of crucial importance. To show nonclassicality, Hardy's paradox represents "the best version of Bell's theorem" without using inequalities. However, so far it has only been tested experimentally for bidimensional vector spaces. Here, we formulate a theoretical framework to demonstrate the ladder proof of Hardy's paradox for arbitrary high-dimensional systems. Furthermore, we experimentally demonstrate the ladder proof by taking advantage of the orbital angular momentum of high-dimensionally entangled photon pairs. We perform the ladder proof of Hardy's paradox for dimensions 3 and 4, both with the ladder up to the third step. Our paper paves the way towards a deeper understanding of the nature of high-dimensionally entangled quantum states and may find applications in quantum information science.
Characterization of high-dimensional entangled systems via mutually unbiased measurements
CSIR Research Space (South Africa)
Giovannini, D
2013-04-01
Mutually unbiased bases (MUBs) play a key role in many protocols in quantum science, such as quantum key distribution. However, defining MUBs for arbitrary high-dimensional systems is theoretically difficult, and measurements in such bases can...
Mitigating the Insider Threat Using High-Dimensional Search and Modeling
National Research Council Canada - National Science Library
Van Den Berg, Eric; Uphadyaya, Shambhu; Ngo, Phi H; Muthukrishnan, Muthu; Palan, Rajago
2006-01-01
In this project a system was built aimed at mitigating insider attacks centered around a high-dimensional search engine for correlating the large number of monitoring streams necessary for detecting insider attacks...
Projection Bank: From High-dimensional Data to Medium-length Binary Codes
Liu, Li; Yu, Mengyang; Shao, Ling
2015-01-01
Recently, very high-dimensional feature representations, e.g., Fisher Vector, have achieved excellent performance for visual recognition and retrieval. However, these lengthy representations always cause extremely heavy computational and storage costs and even become unfeasible in some large-scale applications. A few existing techniques can transfer very high-dimensional data into binary codes, but they still require the reduced code length to be relatively long to maintain acceptable accurac...
Zhu, Yinchu
2017-01-01
Economic modeling in a data-rich environment is often challenging. To allow for enough flexibility and to model heterogeneity, models might have parameters with dimensionality growing with (or even much larger than) the sample size of the data. Learning these high-dimensional parameters requires new methodologies and theories. We consider three important high-dimensional models and propose novel methods for estimation and inference. Empirical applications in economics and finance are also stu...
Extreme eigenvalues of sample covariance and correlation matrices
DEFF Research Database (Denmark)
Heiny, Johannes
2017-01-01
This thesis is concerned with asymptotic properties of the eigenvalues of high-dimensional sample covariance and correlation matrices under an infinite fourth moment of the entries. In the first part, we study the joint distributional convergence of the largest eigenvalues of the sample covariance matrix of a p-dimensional heavy-tailed time series when p converges to infinity together with the sample size n. We generalize the growth rates of p existing in the literature. Assuming a regular variation condition with tail index α...... the extreme eigenvalues are essentially determined by the extreme order statistics from an array of iid random variables. The asymptotic behavior of the extreme eigenvalues is then derived routinely from classical extreme value theory. The resulting approximations are strikingly simple considering the high dimension of the problem at hand. We develop a theory for the point process of the normalized eigenvalues of the sample covariance matrix in the case where rows and columns of the data are linearly dependent. Based on the weak convergence of this point process we derive the limit laws of various functionals...
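The heavy-tail effect described in this thesis abstract is easy to observe numerically: with an infinite fourth moment, the largest eigenvalue of the sample covariance matrix is governed by extreme order statistics of the entries. A small illustrative simulation; the entry distribution, dimensions, and seed are arbitrary choices, not taken from the thesis:

```python
import numpy as np

rng = np.random.default_rng(4)
p, n = 200, 1000
alpha = 1.5                                    # tail index < 4: infinite fourth moment

# iid symmetrized Pareto entries, regularly varying with index alpha
X = rng.pareto(alpha, size=(p, n)) * rng.choice([-1.0, 1.0], size=(p, n))

S = X @ X.T                                    # unnormalized sample covariance matrix
top_eig = np.linalg.eigvalsh(S)[-1]
largest_diag = S.diagonal().max()              # extreme order statistic of the row norms

# Under the regular variation conditions studied in the thesis, the largest
# eigenvalue is asymptotically determined by such extreme order statistics,
# so this ratio tends to be close to 1 for tails this heavy.
ratio = top_eig / largest_diag
```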
Conservation constraints on random matrices
Ma Wen Jong; Hsieh, J
2003-01-01
We study the random matrices constrained by the summation rules that are present in the Hessian of the potential energy surface in the instantaneous normal mode calculations, as a result of momentum conservation. In this paper, we analyse the properties related to such conservation constraints in two classes of real symmetric matrices: one with purely row-wise summation rules and the other with the constraints on the blocks of each matrix, which underscores partially the spatial dimension. We show explicitly that the constraints are removable by separating the degrees of freedom of the zero-eigenvalue modes. The non-spectral degrees of freedom under the constraints can be realized in terms of the ordinary constraint-free orthogonal symmetry but with the rank deducted by the block dimension. We propose that the ensemble of real symmetric matrices with full randomness, constrained by the summation rules, is equivalent to the Gaussian orthogonal ensemble (GOE) with lowered rank. Independent of the joint probabil...
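The removable-constraint observation in this abstract can be reproduced in a few lines: conjugating a GOE-like matrix by the projector orthogonal to the uniform vector imposes the row-wise summation rules and deducts one from the rank. A sketch under arbitrary choices of size and seed:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6
A = rng.standard_normal((n, n))
A = (A + A.T) / 2                        # GOE-like real symmetric matrix

# Projector orthogonal to the uniform vector; conjugating by it imposes the
# row-wise summation rules (momentum conservation) discussed in the abstract
P = np.eye(n) - np.ones((n, n)) / n
H = P @ A @ P

row_sums = H.sum(axis=1)                 # all zero: the constraint holds exactly
eigvals = np.linalg.eigvalsh(H)          # one zero mode; the rank drops to n - 1
```

Separating the zero-eigenvalue mode in this way leaves an unconstrained (n-1)-dimensional symmetric ensemble, matching the "GOE with lowered rank" picture proposed in the paper.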
Free probability and random matrices
Mingo, James A
2017-01-01
This volume opens the world of free probability to a wide variety of readers. From its roots in the theory of operator algebras, free probability has intertwined with non-crossing partitions, random matrices, applications in wireless communications, representation theory of large groups, quantum groups, the invariant subspace problem, large deviations, subfactors, and beyond. This book puts a special emphasis on the relation of free probability to random matrices, but also touches upon the operator algebraic, combinatorial, and analytic aspects of the theory. The book serves as a combination textbook/research monograph, with self-contained chapters, exercises scattered throughout the text, and coverage of important ongoing progress of the theory. It will appeal to graduate students and all mathematicians interested in random matrices and free probability from the point of view of operator algebras, combinatorics, analytic functions, or applications in engineering and statistical physics.
Immanant Conversion on Symmetric Matrices
Directory of Open Access Journals (Sweden)
Purificação Coelho M.
2014-01-01
Let Σn(ℂ) denote the space of all n × n symmetric matrices over the complex field ℂ. The main objective of this paper is to prove that the maps Φ : Σn(ℂ) → Σn(ℂ) satisfying, for any fixed irreducible characters χ, χ′ of the symmetric group, the condition dχ(A + αB) = dχ′(Φ(A) + αΦ(B)) for all matrices A, B ∈ Σn(ℂ) and all scalars α ∈ ℂ are automatically linear and bijective. As a corollary of the above result we characterize all such maps Φ acting on Σn(ℂ).
Derivatives of triangular, Toeplitz, circulant matrices and matrices of other forms over semirings
Vladeva, Dimitrinka
2017-01-01
In this article we construct examples of derivations in matrix semirings. We study hereditary and inner derivations, derivatives of diagonal, triangular, Toeplitz, circulant matrices and of matrices of other forms and prove theorems for derivatives of matrices of these forms.
Iterative methods for Toeplitz-like matrices
Energy Technology Data Exchange (ETDEWEB)
Huckle, T. [Universitaet Wurzburg (Germany)
1994-12-31
In this paper the author will give a survey on iterative methods for solving linear equations with Toeplitz matrices, block Toeplitz matrices, Toeplitz-plus-Hankel matrices, and matrices with low displacement rank. He will treat the following subjects: (1) optimal (w)-circulant preconditioners as a generalization of circulant preconditioners; (2) optimal implementation of circulant-like preconditioners in the complex and real case; (3) preconditioning of near-singular matrices, and what kind of preconditioners can be used in this case; (4) circulant preconditioning for more general classes of Toeplitz matrices, and what can be said about matrices with coefficients that are not l₁-sequences; (5) preconditioners for Toeplitz least squares problems, for block Toeplitz matrices, and for Toeplitz-plus-Hankel matrices.
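As a concrete instance of circulant preconditioning, the following sketch builds Strang's circulant preconditioner for a symmetric positive definite Toeplitz matrix and applies its inverse in O(n log n) via the FFT inside conjugate gradients. The test matrix (a Kac-Murdock-Szegő Toeplitz matrix) is an illustrative choice, not one from the survey:

```python
import numpy as np
from scipy.linalg import toeplitz
from scipy.sparse.linalg import cg, LinearOperator

n = 256
t = 0.5 ** np.arange(n)                      # SPD Toeplitz (Kac-Murdock-Szego)
T = toeplitz(t)
b = np.ones(n)

# Strang's circulant preconditioner C: copy the central diagonals of T,
# i.e. c[k] = t[k] for k <= n/2 and c[k] = t[n-k] beyond
c = np.concatenate([t[: n // 2 + 1], t[1 : n - n // 2][::-1]])
c_eig = np.fft.fft(c).real                   # eigenvalues of the circulant

def apply_Cinv(v):                           # C^{-1} v via two FFTs, O(n log n)
    return np.fft.ifft(np.fft.fft(v) / c_eig).real

M = LinearOperator((n, n), matvec=apply_Cinv, dtype=float)
x, info = cg(T, b, M=M)                      # preconditioned conjugate gradients
```

Because the circulant diagonalizes in the Fourier basis, the preconditioner solve costs only two FFTs per iteration while clustering the spectrum of C⁻¹T near 1.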
Sign pattern matrices that admit M-, N-, P- or inverse M-matrices
Araújo, C. Mendes; Juan R. Torregrosa
2009-01-01
In this paper we identify the sign pattern matrices that occur among the N-matrices, the P-matrices and the M-matrices. We also address the class of inverse M-matrices and the related admissibility problem for sign pattern matrices. Fundação para a Ciência e a Tecnologia (FCT); Spanish DGI grant number MTM2007-64477.
CHARACTERISTIC RADICALS OF STOCHASTIC MATRICES,
The paper investigates the distribution on a complex plane of characteristic radicals of stochastic matrices of the n-th order. The results obtained...can be interpreted as theorems on the relative distribution of characteristic radicals of an arbitrary matrix with non-negative elements. (Author)
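The statement that the characteristic roots of a stochastic matrix lie in the closed unit disk of the complex plane, with 1 always among them, can be checked numerically for a random row-stochastic matrix (size and seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 8
M = rng.random((n, n))
M /= M.sum(axis=1, keepdims=True)        # row-stochastic: rows are probability vectors

eigvals = np.linalg.eigvals(M)
# Perron-Frobenius: 1 is always an eigenvalue (right eigenvector = ones),
# and every characteristic root lies in the closed unit disk
```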
Functional Brain Network Classification With Compact Representation of SICE Matrices.
Zhang, Jianjia; Zhou, Luping; Wang, Lei; Li, Wanqing
2015-06-01
Recently, a sparse inverse covariance estimation (SICE) technique has been employed to model functional brain connectivity. The inverse covariance matrix (SICE matrix in short) estimated for each subject is used as a representation of brain connectivity to discriminate Alzheimer's disease from normal controls. However, we observed that direct use of the SICE matrix does not necessarily give satisfying discrimination, due to its high dimensionality and the scarcity of training subjects. Looking into this problem, we argue that the intrinsic dimensionality of these SICE matrices shall be much lower, considering 1) an SICE matrix resides on a Riemannian manifold of symmetric positive definite matrices, and 2) human brains share common patterns of connectivity across subjects. Therefore, we propose to employ manifold-based similarity measures and kernel-based PCA to extract principal connectivity components as a compact representation of brain network. Moreover, to cater for the requirement of both discrimination and interpretation in neuroimage analysis, we develop a novel preimage estimation algorithm to make the obtained connectivity components anatomically interpretable. To verify the efficacy of our method and gain insights into SICE-based brain networks, we conduct extensive experimental study on synthetic data and real rs-fMRI data from the ADNI dataset. Our method outperforms the comparable methods and improves the classification accuracy significantly.
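One standard manifold-based similarity for symmetric positive definite matrices such as SICE matrices is the log-Euclidean distance, from which a kernel for kernel PCA can be built. The sketch below is a generic illustration of that idea and may differ from the paper's exact similarity measure:

```python
import numpy as np

def spd_logm(S):
    """Matrix logarithm of a symmetric positive definite matrix via eigendecomposition."""
    w, V = np.linalg.eigh(S)
    return (V * np.log(w)) @ V.T             # V diag(log w) V^T

def log_euclidean_distance(S1, S2):
    """Distance on the SPD manifold: Frobenius norm of the difference of logs."""
    return np.linalg.norm(spd_logm(S1) - spd_logm(S2))

# Two toy SPD matrices (random, well-conditioned)
rng = np.random.default_rng(5)
A = rng.standard_normal((4, 4))
B = rng.standard_normal((4, 4))
S1 = A @ A.T + 4 * np.eye(4)
S2 = B @ B.T + 4 * np.eye(4)
d12 = log_euclidean_distance(S1, S2)
```

A Gaussian kernel k(S1, S2) = exp(-d(S1, S2)² / 2σ²) over this distance would then feed standard kernel PCA to extract principal connectivity components.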
Engineering two-photon high-dimensional states through quantum interference.
Zhang, Yingwen; Roux, Filippus S; Konrad, Thomas; Agnew, Megan; Leach, Jonathan; Forbes, Andrew
2016-02-01
Many protocols in quantum science, for example, linear optical quantum computing, require access to large-scale entangled quantum states. Such systems can be realized through many-particle qubits, but this approach often suffers from scalability problems. An alternative strategy is to consider a lesser number of particles that exist in high-dimensional states. The spatial modes of light are one such candidate that provides access to high-dimensional quantum states, and thus they increase the storage and processing potential of quantum information systems. We demonstrate the controlled engineering of two-photon high-dimensional states entangled in their orbital angular momentum through Hong-Ou-Mandel interference. We prepare a large range of high-dimensional entangled states and implement precise quantum state filtering. We characterize the full quantum state before and after the filter, and are thus able to determine that only the antisymmetric component of the initial state remains. This work paves the way for high-dimensional processing and communication of multiphoton quantum states, for example, in teleportation beyond qubits.
A Comparison of Methods for Estimating the Determinant of High-Dimensional Covariance Matrix.
Hu, Zongliang; Dong, Kai; Dai, Wenlin; Tong, Tiejun
2017-09-21
The determinant of the covariance matrix for high-dimensional data plays an important role in statistical inference and decision. It has many real applications including statistical tests and information theory. Due to the statistical and computational challenges with high dimensionality, little work has been proposed in the literature for estimating the determinant of high-dimensional covariance matrix. In this paper, we estimate the determinant of the covariance matrix using some recent proposals for estimating high-dimensional covariance matrix. Specifically, we consider a total of eight covariance matrix estimation methods for comparison. Through extensive simulation studies, we explore and summarize some interesting comparison results among all compared methods. We also provide practical guidelines based on the sample size, the dimension, and the correlation of the data set for estimating the determinant of high-dimensional covariance matrix. Finally, from a perspective of the loss function, the comparison study in this paper may also serve as a proxy to assess the performance of the covariance matrix estimation.
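A minimal example of one estimator family compared in such studies: a ridge-type shrinkage covariance estimate whose log-determinant stays finite even when p > n makes the plain sample covariance singular. The fixed shrinkage weight here is a hypothetical choice for illustration, not a method from the paper (in practice it would be set by cross-validation or a Ledoit-Wolf-type formula):

```python
import numpy as np

def shrinkage_logdet(X, lam=0.2):
    """Log-determinant of the shrinkage estimate (1 - lam) * S + lam * mu * I,
    computed stably via slogdet. lam is a hypothetical fixed weight."""
    n, p = X.shape
    S = np.cov(X, rowvar=False)
    mu = np.trace(S) / p                      # preserves the average eigenvalue
    sigma = (1 - lam) * S + lam * mu * np.eye(p)
    sign, logdet = np.linalg.slogdet(sigma)
    return logdet

rng = np.random.default_rng(2)
X = rng.standard_normal((50, 100))            # p > n: sample covariance is rank-deficient
val = shrinkage_logdet(X)

# For comparison: the plain sample covariance is (numerically) singular,
# so its log-determinant collapses toward -infinity
_, logdet_plain = np.linalg.slogdet(np.cov(X, rowvar=False))
```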
A Hybrid Semi-Supervised Anomaly Detection Model for High-Dimensional Data
Directory of Open Access Journals (Sweden)
Hongchao Song
2017-01-01
Anomaly detection, which aims to identify observations that deviate from a nominal sample, is a challenging task for high-dimensional data. Traditional distance-based anomaly detection methods compute the neighborhood distance between each observation and suffer from the curse of dimensionality in high-dimensional space; for example, the distances between any pair of samples are similar and each sample may perform like an outlier. In this paper, we propose a hybrid semi-supervised anomaly detection model for high-dimensional data that consists of two parts: a deep autoencoder (DAE) and an ensemble k-nearest neighbor graphs- (K-NNG-) based anomaly detector. Benefiting from the ability of nonlinear mapping, the DAE is first trained to learn the intrinsic features of a high-dimensional dataset to represent the high-dimensional data in a more compact subspace. Several nonparametric KNN-based anomaly detectors are then built from different subsets that are randomly sampled from the whole dataset. The final prediction is made by all the anomaly detectors. The performance of the proposed method is evaluated on several real-life datasets, and the results confirm that the proposed hybrid model improves the detection accuracy and reduces the computational complexity.
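The two-stage architecture (compact representation, then KNN-distance scoring) can be sketched without deep-learning dependencies by substituting a linear PCA projection for the deep autoencoder; everything below is an illustrative stand-in, not the authors' model:

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.standard_normal((200, 50))               # nominal sample
mean = X.mean(axis=0)

# Stage 1 -- compact representation. The paper trains a deep autoencoder (DAE);
# as a dependency-free stand-in we use a linear PCA projection to 5 dimensions.
_, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
project = lambda Z: (Z - mean) @ Vt[:5].T

# Stage 2 -- k-nearest-neighbour distance in the subspace as the anomaly score
train = project(X)
def knn_score(z, k=5):
    d = np.linalg.norm(train - project(z[None, :])[0], axis=1)
    return np.sort(d)[:k].mean()                 # self-distance included; fine for a sketch

scores_nominal = np.array([knn_score(x) for x in X[:20]])
outlier = mean + 30.0 * Vt[0]                    # synthetic point far along the first PC
score_outlier = knn_score(outlier)
```

The paper's ensemble builds several such detectors from random subsets and combines their predictions; a single detector suffices to show the scoring principle.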
Fickler, Robert; Lapkiewicz, Radek; Huber, Marcus; Lavery, Martin P J; Padgett, Miles J; Zeilinger, Anton
2014-07-30
Photonics has become a mature field of quantum information science, where integrated optical circuits offer a way to scale the complexity of the set-up as well as the dimensionality of the quantum state. On photonic chips, paths are the natural way to encode information. To distribute those high-dimensional quantum states over large distances, transverse spatial modes, like orbital angular momentum possessing Laguerre Gauss modes, are favourable as flying information carriers. Here we demonstrate a quantum interface between these two vibrant photonic fields. We create three-dimensional path entanglement between two photons in a nonlinear crystal and use a mode sorter as the quantum interface to transfer the entanglement to the orbital angular momentum degree of freedom. Thus our results show a flexible way to create high-dimensional spatial mode entanglement. Moreover, they pave the way to implement broad complex quantum networks where high-dimensionally entangled states could be distributed over distant photonic chips.
Some basic properties of block operator matrices
Jin, Guohai; Chen, Alatancang
2014-01-01
A general approach to the multiplication and adjoint operation of $2\times 2$ block operator matrices with unbounded entries is developed. Furthermore, criteria for the self-adjointness of block operator matrices based on their entry operators are established.
The validation and assessment of machine learning: a game of prediction from high-dimensional data
DEFF Research Database (Denmark)
Pers, Tune Hannes; Albrechtsen, A; Holst, C
2009-01-01
In applied statistics, tools from machine learning are popular for analyzing complex and high-dimensional data. However, few theoretical results are available that could guide to the appropriate machine learning tool in a new application. Initial development of an overall strategy thus often...... the ideas, the game is applied to data from the Nugenob Study where the aim is to predict the fat oxidation capacity based on conventional factors and high-dimensional metabolomics data. Three players have chosen to use support vector machines, LASSO, and random forests, respectively....
Convertible Subspaces of Hessenberg-Type Matrices
Directory of Open Access Journals (Sweden)
Henrique F. da Cruz
2017-12-01
We describe subspaces of generalized Hessenberg matrices where the determinant is convertible into the permanent by affixing ± signs. An explicit characterization of convertible Hessenberg-type matrices is presented. We conclude that convertible matrices with the maximum number of nonzero entries can be reduced to a basic set.
Directory of Open Access Journals (Sweden)
Guo Wenge
2012-07-01
Background: Based on available biological information, genomic data can often be partitioned into pre-defined sets (e.g. pathways) and subsets within sets. Biologists are often interested in determining whether some pre-defined sets of variables (e.g. genes) are differentially expressed under varying experimental conditions. Several procedures are available in the literature for making such determinations; however, they do not take into account information regarding the subsets within each set. Secondly, variables (e.g. genes) belonging to a set or a subset are potentially correlated, yet such information is often ignored and univariate methods are used. This may result in loss of power and/or an inflated false positive rate. Results: We introduce a multiple testing-based methodology which makes use of available information regarding biologically relevant subsets within each pre-defined set of variables while exploiting the underlying dependence structure among the variables. Using this methodology, a biologist may not only determine whether a set of variables are differentially expressed between two experimental conditions, but may also test whether specific subsets within a significant set are also significant. Conclusions: The proposed methodology (a) is easy to implement, (b) does not require inverting potentially singular covariance matrices, (c) controls the family-wise error rate (FWER) at the desired nominal level, and (d) is robust to the underlying distribution and covariance structures. Although for simplicity of exposition the methodology is described for microarray gene expression data, it is also applicable to any high-dimensional data, such as mRNA-seq data, CpG methylation data, etc.
Fast multipole preconditioners for sparse matrices arising from elliptic equations
Ibeid, Huda
2017-11-09
Among optimal hierarchical algorithms for the computational solution of elliptic problems, the fast multipole method (FMM) stands out for its adaptability to emerging architectures, having high arithmetic intensity, tunable accuracy, and relaxable global synchronization requirements. We demonstrate that, beyond its traditional use as a solver in problems for which explicit free-space kernel representations are available, the FMM has applicability as a preconditioner in finite domain elliptic boundary value problems, by equipping it with boundary integral capability for satisfying conditions at finite boundaries and by wrapping it in a Krylov method for extensibility to more general operators. Here, we do not discuss the well developed applications of FMM to implement matrix-vector multiplications within Krylov solvers of boundary element methods. Instead, we propose using FMM for the volume-to-volume contribution of inhomogeneous Poisson-like problems, where the boundary integral is a small part of the overall computation. Our method may be used to precondition sparse matrices arising from finite difference/element discretizations, and can handle a broader range of scientific applications. It is capable of algebraic convergence rates down to the truncation error of the discretized PDE comparable to those of multigrid methods, and it offers potentially superior multicore and distributed memory scalability properties on commodity architecture supercomputers. Compared with other methods exploiting the low-rank character of off-diagonal blocks of the dense resolvent operator, FMM-preconditioned Krylov iteration may reduce the amount of communication because it is matrix-free and exploits the tree structure of FMM. We describe our tests in reproducible detail with freely available codes and outline directions for further extensibility.
A Cure for Variance Inflation in High Dimensional Kernel Principal Component Analysis
DEFF Research Database (Denmark)
Abrahamsen, Trine Julie; Hansen, Lars Kai
2011-01-01
Small sample high-dimensional principal component analysis (PCA) suffers from variance inflation and lack of generalizability. It has earlier been pointed out that a simple leave-one-out variance renormalization scheme can cure the problem. In this paper we generalize the cure in two directions...
Global communication schemes for the numerical solution of high-dimensional PDEs
DEFF Research Database (Denmark)
Hupp, Philipp; Heene, Mario; Jacob, Riko
2016-01-01
The numerical treatment of high-dimensional partial differential equations is among the most compute-hungry problems and in urgent need for current and future high-performance computing (HPC) systems. It is thus also facing the grand challenges of exascale computing such as the requirement to red...
Generation of high-dimensional energy-time-entangled photon pairs
Zhang, Da; Zhang, Yiqi; Li, Xinghua; Zhang, Dan; Cheng, Lin; Li, Changbiao; Zhang, Yanpeng
2017-11-01
High-dimensional entangled photon pairs have many excellent properties compared to two-dimensional entangled two-photon states, such as greater information capacity, stronger nonlocality, and higher security. Traditionally, the degrees of freedom that can produce high-dimensional entanglement have mainly been angular momentum and energy-time. In this paper, we propose a type of high-dimensional energy-time-entangled qudit that differs from the traditional model with an extended propagation path. In addition, our method mainly focuses on generation with multiple frequency modes, and two- and three-dimensional frequency-entangled qudits are examined as examples in detail through the linear or nonlinear optical response of the medium. The generation of high-dimensional energy-time-entangled states can be verified by coincidence counts in the damped Rabi oscillation regime, where the paired Stokes-anti-Stokes wave packet is determined by the structure of resonances in the third-order nonlinearity. Finally, we extend the dimension to N in the sequential-cascade mode. Our results have potential applications in quantum communication and quantum computation.
Characterization of differentially expressed genes using high-dimensional co-expression networks
DEFF Research Database (Denmark)
Coelho Goncalves de Abreu, Gabriel; Labouriau, Rodrigo S.
2010-01-01
We present a technique to characterize differentially expressed genes in terms of their position in a high-dimensional co-expression network. The set-up of Gaussian graphical models is used to construct representations of the co-expression network in such a way that redundancy and the propagation...
Penalized estimation for competing risks regression with applications to high-dimensional covariates
DEFF Research Database (Denmark)
Ambrogi, Federico; Scheike, Thomas H.
2016-01-01
High-dimensional regression has become an increasingly important topic for many research fields. For example, biomedical research generates an increasing amount of data to characterize patients' bio-profiles (e.g. from a genomic high-throughput assay). The increasing complexity in the characterization of patients' bio-profiles is added to the complexity related to the prolonged follow-up of patients with the registration of the occurrence of possible adverse events. This information may offer useful insight into disease dynamics and in identifying subsets of patients with worse prognosis and better response to the therapy. Although in the last years the number of contributions for coping with high and ultra-high-dimensional data in standard survival analysis have increased (Witten and Tibshirani, 2010. Survival analysis with high-dimensional covariates. Statistical Methods in Medical...
Ferdosi, Bilkis J.; Buddelmeijer, Hugo; Trager, Scott; Wilkinson, Michael H.F.; Roerdink, Jos B.T.M.
2010-01-01
Data sets in astronomy are growing to enormous sizes. Modern astronomical surveys provide not only image data but also catalogues of millions of objects (stars, galaxies), each object with hundreds of associated parameters. Exploration of this very high-dimensional data space poses a huge challenge.
High-Dimensional Exploratory Item Factor Analysis by a Metropolis-Hastings Robbins-Monro Algorithm
Cai, Li
2010-01-01
A Metropolis-Hastings Robbins-Monro (MH-RM) algorithm for high-dimensional maximum marginal likelihood exploratory item factor analysis is proposed. The sequence of estimates from the MH-RM algorithm converges with probability one to the maximum likelihood solution. Details on the computer implementation of this algorithm are provided. The…
Rotationally invariant ensembles of integrable matrices.
Yuzbashyan, Emil A; Shastry, B Sriram; Scaramazza, Jasen A
2016-05-01
We construct ensembles of random integrable matrices with any prescribed number of nontrivial integrals and formulate integrable matrix theory (IMT)-a counterpart of random matrix theory (RMT) for quantum integrable models. A type-M family of integrable matrices consists of exactly N-M independent commuting N×N matrices linear in a real parameter. We first develop a rotationally invariant parametrization of such matrices, previously only constructed in a preferred basis. For example, an arbitrary choice of a vector and two commuting Hermitian matrices defines a type-1 family and vice versa. Higher types similarly involve a random vector and two matrices. The basis-independent formulation allows us to derive the joint probability density for integrable matrices, similar to the construction of Gaussian ensembles in the RMT.
Genetic code, hamming distance and stochastic matrices.
He, Matthew X; Petoukhov, Sergei V; Ricci, Paolo E
2004-09-01
In this paper we use the Gray code representation of the genetic code C=00, U=10, G=11 and A=01 (C pairs with G, A pairs with U) to generate a sequence of genetic code-based matrices. In connection with these code-based matrices, we use the Hamming distance to generate a sequence of numerical matrices. We then further investigate the properties of the numerical matrices and show that they are doubly stochastic and symmetric. We determine the frequency distributions of the Hamming distances, building blocks of the matrices, decomposition and iterations of matrices. We present an explicit decomposition formula for the genetic code-based matrix in terms of permutation matrices, which provides a hypercube representation of the genetic code. It is also observed that there is a Hamiltonian cycle in a genetic code-based hypercube.
Energy Technology Data Exchange (ETDEWEB)
Li, Weixuan; Lin, Guang; Li, Bing
2016-09-01
A well-known challenge in uncertainty quantification (UQ) is the "curse of dimensionality". However, many high-dimensional UQ problems are essentially low-dimensional, because the randomness of the quantity of interest (QoI) is caused only by uncertain parameters varying within a low-dimensional subspace, known as the sufficient dimension reduction (SDR) subspace. Motivated by this observation, we propose and demonstrate in this paper an inverse regression-based UQ approach (IRUQ) for high-dimensional problems. Specifically, we use an inverse regression procedure to estimate the SDR subspace and then convert the original problem to a low-dimensional one, which can be efficiently solved by building a response surface model such as a polynomial chaos expansion. The novelty and advantages of the proposed approach is seen in its computational efficiency and practicality. Comparing with Monte Carlo, the traditionally preferred approach for high-dimensional UQ, IRUQ with a comparable cost generally gives much more accurate solutions even for high-dimensional problems, and even when the dimension reduction is not exactly sufficient. Theoretically, IRUQ is proved to converge twice as fast as the approach it uses seeking the SDR subspace. For example, while a sliced inverse regression method converges to the SDR subspace at the rate of $O(n^{-1/2})$, the corresponding IRUQ converges at $O(n^{-1})$. IRUQ also provides several desired conveniences in practice. It is non-intrusive, requiring only a simulator to generate realizations of the QoI, and there is no need to compute the high-dimensional gradient of the QoI. Finally, error bars can be derived for the estimation results reported by IRUQ.
Comparison of clustering methods for high-dimensional single-cell flow and mass cytometry data.
Weber, Lukas M; Robinson, Mark D
2016-12-01
Recent technological developments in high-dimensional flow cytometry and mass cytometry (CyTOF) have made it possible to detect expression levels of dozens of protein markers in thousands of cells per second, allowing cell populations to be characterized in unprecedented detail. Traditional data analysis by "manual gating" can be inefficient and unreliable in these high-dimensional settings, which has led to the development of a large number of automated analysis methods. Methods designed for unsupervised analysis use specialized clustering algorithms to detect and define cell populations for further downstream analysis. Here, we have performed an up-to-date, extensible performance comparison of clustering methods for high-dimensional flow and mass cytometry data. We evaluated methods using several publicly available data sets from experiments in immunology, containing both major and rare cell populations, with cell population identities from expert manual gating as the reference standard. Several methods performed well, including FlowSOM, X-shift, PhenoGraph, Rclusterpp, and flowMeans. Among these, FlowSOM had extremely fast runtimes, making this method well-suited for interactive, exploratory analysis of large, high-dimensional data sets on a standard laptop or desktop computer. These results extend previously published comparisons by focusing on high-dimensional data and including new methods developed for CyTOF data. R scripts to reproduce all analyses are available from GitHub (https://github.com/lmweber/cytometry-clustering-comparison), and pre-processed data files are available from FlowRepository (FR-FCM-ZZPH), allowing our comparisons to be extended to include new clustering methods and reference data sets. © 2016 The Authors. Cytometry Part A published by Wiley Periodicals, Inc. on behalf of ISAC. © 2016 The Authors. Cytometry Part A Published by Wiley Periodicals, Inc. on behalf of ISAC.
SNP interaction detection with Random Forests in high-dimensional genetic data.
Winham, Stacey J; Colby, Colin L; Freimuth, Robert R; Wang, Xin; de Andrade, Mariza; Huebner, Marianne; Biernacka, Joanna M
2012-07-15
Identifying variants associated with complex human traits in high-dimensional data is a central goal of genome-wide association studies. However, complicated etiologies such as gene-gene interactions are ignored by the univariate analysis usually applied in these studies. Random Forests (RF) are a popular data-mining technique that can accommodate a large number of predictor variables and allow for complex models with interactions. RF analysis produces measures of variable importance that can be used to rank the predictor variables. Thus, single nucleotide polymorphism (SNP) analysis using RFs is gaining popularity as a potential filter approach that considers interactions in high-dimensional data. However, the impact of data dimensionality on the power of RF to identify interactions has not been thoroughly explored. We investigate the ability of rankings from variable importance measures to detect gene-gene interaction effects and their potential effectiveness as filters compared to p-values from univariate logistic regression, particularly as the data becomes increasingly high-dimensional. RF effectively identifies interactions in low dimensional data. As the total number of predictor variables increases, probability of detection declines more rapidly for interacting SNPs than for non-interacting SNPs, indicating that in high-dimensional data the RF variable importance measures are capturing marginal effects rather than capturing the effects of interactions. While RF remains a promising data-mining technique that extends univariate methods to condition on multiple variables simultaneously, RF variable importance measures fail to detect interaction effects in high-dimensional data in the absence of a strong marginal component, and therefore may not be useful as a filter technique that allows for interaction effects in genome-wide data.
Sparse Matrices in Frame Theory
DEFF Research Database (Denmark)
Lemvig, Jakob; Krahmer, Felix; Kutyniok, Gitta
2014-01-01
Frame theory is closely intertwined with signal processing through a canon of methodologies for the analysis of signals using (redundant) linear measurements. The canonical dual frame associated with a frame provides a means for reconstruction by a least squares approach, but other dual frames...... yield alternative reconstruction procedures. The novel paradigm of sparsity has recently entered the area of frame theory in various ways. Of those different sparsity perspectives, we will focus on the situations where frames and (not necessarily canonical) dual frames can be written as sparse matrices...
Compressively Characterizing High-Dimensional Entangled States with Complementary, Random Filtering
Directory of Open Access Journals (Sweden)
Gregory A. Howland
2016-05-01
Full Text Available The resources needed to conventionally characterize a quantum system are overwhelmingly large for high-dimensional systems. This obstacle may be overcome by abandoning traditional cornerstones of quantum measurement, such as general quantum states, strong projective measurement, and assumption-free characterization. Following this reasoning, we demonstrate an efficient technique for characterizing high-dimensional, spatial entanglement with one set of measurements. We recover sharp distributions with local, random filtering of the same ensemble in momentum followed by position—something the uncertainty principle forbids for projective measurements. Exploiting the expectation that entangled signals are highly correlated, we use fewer than 5000 measurements to characterize a 65,536-dimensional state. Finally, we use entropic inequalities to witness entanglement without a density matrix. Our method represents the sea change unfolding in quantum measurement, where methods influenced by the information theory and signal-processing communities replace unscalable, brute-force techniques—a progression previously followed by classical sensing.
Distribution of high-dimensional entanglement via an intra-city free-space link
Steinlechner, Fabian; Ecker, Sebastian; Fink, Matthias; Liu, Bo; Bavaresco, Jessica; Huber, Marcus; Scheidl, Thomas; Ursin, Rupert
2017-07-01
Quantum entanglement is a fundamental resource in quantum information processing and its distribution between distant parties is a key challenge in quantum communications. Increasing the dimensionality of entanglement has been shown to improve robustness and channel capacities in secure quantum communications. Here we report on the distribution of genuine high-dimensional entanglement via a 1.2-km-long free-space link across Vienna. We exploit hyperentanglement, that is, simultaneous entanglement in polarization and energy-time bases, to encode quantum information, and observe high-visibility interference for successive correlation measurements in each degree of freedom. These visibilities impose lower bounds on entanglement in each subspace individually and certify four-dimensional entanglement for the hyperentangled system. The high-fidelity transmission of high-dimensional entanglement under real-world atmospheric link conditions represents an important step towards long-distance quantum communications with more complex quantum systems and the implementation of advanced quantum experiments with satellite links.
High-dimensional quantum state transfer through a quantum spin chain
Qin, Wei; Wang, Chuan; Long, Gui Lu
2013-01-01
In this paper, we investigate a high-dimensional quantum state transfer protocol. An arbitrary unknown high-dimensional state can be transferred with high fidelity between two remote registers through an XX coupling spin chain of arbitrary length. The evolution of the state transfer is determined by the natural dynamics of the chain without external modulation and coupling strength engineering. As a consequence, entanglement distribution with a high efficiency can be achieved. Also the strong field and high spin quantum number can partly counteract the effect of finite temperature to ensure the high fidelity of the protocol when the quantum data bus is in the thermal equilibrium state under an external magnetic field.
Distribution of high-dimensional entanglement via an intra-city free-space link.
Steinlechner, Fabian; Ecker, Sebastian; Fink, Matthias; Liu, Bo; Bavaresco, Jessica; Huber, Marcus; Scheidl, Thomas; Ursin, Rupert
2017-07-24
Quantum entanglement is a fundamental resource in quantum information processing and its distribution between distant parties is a key challenge in quantum communications. Increasing the dimensionality of entanglement has been shown to improve robustness and channel capacities in secure quantum communications. Here we report on the distribution of genuine high-dimensional entanglement via a 1.2-km-long free-space link across Vienna. We exploit hyperentanglement, that is, simultaneous entanglement in polarization and energy-time bases, to encode quantum information, and observe high-visibility interference for successive correlation measurements in each degree of freedom. These visibilities impose lower bounds on entanglement in each subspace individually and certify four-dimensional entanglement for the hyperentangled system. The high-fidelity transmission of high-dimensional entanglement under real-world atmospheric link conditions represents an important step towards long-distance quantum communications with more complex quantum systems and the implementation of advanced quantum experiments with satellite links.
Su, Yapeng; Shi, Qihui; Wei, Wei
2017-02-01
New insights on cellular heterogeneity in the last decade provoke the development of a variety of single cell omics tools at a lightning pace. The resultant high-dimensional single cell data generated by these tools require new theoretical approaches and analytical algorithms for effective visualization and interpretation. In this review, we briefly survey the state-of-the-art single cell proteomic tools with a particular focus on data acquisition and quantification, followed by an elaboration of a number of statistical and computational approaches developed to date for dissecting the high-dimensional single cell data. The underlying assumptions, unique features, and limitations of the analytical methods with the designated biological questions they seek to answer will be discussed. Particular attention will be given to those information theoretical approaches that are anchored in a set of first principles of physics and can yield detailed (and often surprising) predictions. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Estimating and testing high-dimensional mediation effects in epigenetic studies.
Zhang, Haixiang; Zheng, Yinan; Zhang, Zhou; Gao, Tao; Joyce, Brian; Yoon, Grace; Zhang, Wei; Schwartz, Joel; Just, Allan; Colicino, Elena; Vokonas, Pantel; Zhao, Lihui; Lv, Jinchi; Baccarelli, Andrea; Hou, Lifang; Liu, Lei
2016-10-15
High-dimensional DNA methylation markers may mediate pathways linking environmental exposures with health outcomes. However, there is a lack of analytical methods to identify significant mediators for high-dimensional mediation analysis. Based on sure independent screening and minimax concave penalty techniques, we use a joint significance test for mediation effect. We demonstrate its practical performance using Monte Carlo simulation studies and apply this method to investigate the extent to which DNA methylation markers mediate the causal pathway from smoking to reduced lung function in the Normative Aging Study. We identify 2 CpGs with significant mediation effects. R package, source code, and simulation study are available at https://github.com/YinanZheng/HIMA CONTACT: lei.liu@northwestern.edu. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Distributed Computation of the knn Graph for Large High-Dimensional Point Sets.
Plaku, Erion; Kavraki, Lydia E
2007-03-01
High-dimensional problems arising from robot motion planning, biology, data mining, and geographic information systems often require the computation of k nearest neighbor (knn) graphs. The knn graph of a data set is obtained by connecting each point to its k closest points. As the research in the above-mentioned fields progressively addresses problems of unprecedented complexity, the demand for computing knn graphs based on arbitrary distance metrics and large high-dimensional data sets increases, exceeding resources available to a single machine. In this work we efficiently distribute the computation of knn graphs for clusters of processors with message passing. Extensions to our distributed framework include the computation of graphs based on other proximity queries, such as approximate knn or range queries. Our experiments show nearly linear speedup with over one hundred processors and indicate that similar speedup can be obtained with several hundred processors.
Bit-Table Based Biclustering and Frequent Closed Itemset Mining in High-Dimensional Binary Data
Directory of Open Access Journals (Sweden)
András Király
2014-01-01
Full Text Available During the last decade various algorithms have been developed and proposed for discovering overlapping clusters in high-dimensional data. The two most prominent application fields in this research, proposed independently, are frequent itemset mining (developed for market basket data and biclustering (applied to gene expression data analysis. The common limitation of both methodologies is the limited applicability for very large binary data sets. In this paper we propose a novel and efficient method to find both frequent closed itemsets and biclusters in high-dimensional binary data. The method is based on simple but very powerful matrix and vector multiplication approaches that ensure that all patterns can be discovered in a fast manner. The proposed algorithm has been implemented in the commonly used MATLAB environment and freely available for researchers.
Efficient uncertainty quantification methodologies for high-dimensional climate land models
Energy Technology Data Exchange (ETDEWEB)
Sargsyan, Khachik [Sandia National Lab. (SNL-CA), Livermore, CA (United States); Safta, Cosmin [Sandia National Lab. (SNL-CA), Livermore, CA (United States); Berry, Robert Dan [Sandia National Lab. (SNL-CA), Livermore, CA (United States); Ray, Jaideep [Sandia National Lab. (SNL-CA), Livermore, CA (United States); Debusschere, Bert J. [Sandia National Lab. (SNL-CA), Livermore, CA (United States); Najm, Habib N. [Sandia National Lab. (SNL-CA), Livermore, CA (United States)
2011-11-01
In this report, we proposed, examined and implemented approaches for performing efficient uncertainty quantification (UQ) in climate land models. Specifically, we applied Bayesian compressive sensing framework to a polynomial chaos spectral expansions, enhanced it with an iterative algorithm of basis reduction, and investigated the results on test models as well as on the community land model (CLM). Furthermore, we discussed construction of efficient quadrature rules for forward propagation of uncertainties from high-dimensional, constrained input space to output quantities of interest. The work lays grounds for efficient forward UQ for high-dimensional, strongly non-linear and computationally costly climate models. Moreover, to investigate parameter inference approaches, we have applied two variants of the Markov chain Monte Carlo (MCMC) method to a soil moisture dynamics submodel of the CLM. The evaluation of these algorithms gave us a good foundation for further building out the Bayesian calibration framework towards the goal of robust component-wise calibration.
Extreme Learning Machines on High Dimensional and Large Data Applications: A Survey
Directory of Open Access Journals (Sweden)
Jiuwen Cao
2015-01-01
Full Text Available Extreme learning machine (ELM has been developed for single hidden layer feedforward neural networks (SLFNs. In ELM algorithm, the connections between the input layer and the hidden neurons are randomly assigned and remain unchanged during the learning process. The output connections are then tuned via minimizing the cost function through a linear system. The computational burden of ELM has been significantly reduced as the only cost is solving a linear system. The low computational complexity attracted a great deal of attention from the research community, especially for high dimensional and large data applications. This paper provides an up-to-date survey on the recent developments of ELM and its applications in high dimensional and large data. Comprehensive reviews on image processing, video processing, medical signal processing, and other popular large data applications with ELM are presented in the paper.
Atom-centered symmetry functions for constructing high-dimensional neural network potentials
Behler, Jörg
2011-02-01
Neural networks offer an unbiased and numerically very accurate approach to represent high-dimensional ab initio potential-energy surfaces. Once constructed, neural network potentials can provide the energies and forces many orders of magnitude faster than electronic structure calculations, and thus enable molecular dynamics simulations of large systems. However, Cartesian coordinates are not a good choice to represent the atomic positions, and a transformation to symmetry functions is required. Using simple benchmark systems, the properties of several types of symmetry functions suitable for the construction of high-dimensional neural network potential-energy surfaces are discussed in detail. The symmetry functions are general and can be applied to all types of systems such as molecules, crystalline and amorphous solids, and liquids.
An Unbiased Distance-based Outlier Detection Approach for High-dimensional Data
DEFF Research Database (Denmark)
Nguyen, Hoang Vu; Gopalkrishnan, Vivekanand; Assent, Ira
2011-01-01
than a global property. Different from existing approaches, it is not grid-based and dimensionality unbiased. Thus, its performance is impervious to grid resolution as well as the curse of dimensionality. In addition, our approach ranks the outliers, allowing users to select the number of desired...... outliers, thus mitigating the issue of high false alarm rate. Extensive empirical studies on real datasets show that our approach efficiently and effectively detects outliers, even in high-dimensional spaces....
Xu, Chao; Fang, Jian; Shen, Hui; Wang, Yu-Ping; Deng, Hong-Wen
2018-01-25
Extreme phenotype sampling (EPS) is a broadly-used design to identify candidate genetic factors contributing to the variation of quantitative traits. By enriching the signals in extreme phenotypic samples, EPS can boost the association power compared to random sampling. Most existing statistical methods for EPS examine the genetic factors individually, despite many quantitative traits have multiple genetic factors underlying their variation. It is desirable to model the joint effects of genetic factors, which may increase the power and identify novel quantitative trait loci under EPS. The joint analysis of genetic data in high-dimensional situations requires specialized techniques, e.g., the least absolute shrinkage and selection operator (LASSO). Although there are extensive research and application related to LASSO, the statistical inference and testing for the sparse model under EPS remain unknown. We propose a novel sparse model (EPS-LASSO) with hypothesis test for high-dimensional regression under EPS based on a decorrelated score function. The comprehensive simulation shows EPS-LASSO outperforms existing methods with stable type I error and FDR control. EPS-LASSO can provide a consistent power for both low- and high-dimensional situations compared with the other methods dealing with high-dimensional situations. The power of EPS-LASSO is close to other low-dimensional methods when the causal effect sizes are small and is superior when the effects are large. Applying EPS-LASSO to a transcriptome-wide gene expression study for obesity reveals 10 significant body mass index associated genes. Our results indicate that EPS-LASSO is an effective method for EPS data analysis, which can account for correlated predictors. The source code is available at https://github.com/xu1912/EPSLASSO. hdeng2@tulane.edu. Supplementary data are available at Bioinformatics online.
Nikooienejad, Amir; Wang, Wenyi; Johnson, Valen E.
2017-01-01
Variable selection in high dimensional cancer genomic studies has become very popular in the past decade, due to the interest in discovering significant genes pertinent to a specific cancer type. Censored survival data is the main data structure in such studies and performing variable selection for such data type requires certain methodology. With recent developments in computational power, Bayesian methods have become more attractive in the context of variable selection. In this article we i...
Spectral properties of random triangular matrices
Basu, Riddhipratim; Bose, Arup; Ganguly, Shirshendu; Hazra, Rajat Subhra
2011-01-01
We prove the existence of the limiting spectral distribution (LSD) of symmetric triangular patterned matrices and also establish the joint convergence of sequences of such matrices. For the particular case of the symmetric triangular Wigner matrix, we derive expression for the moments of the LSD using properties of Catalan words. The problem of deriving explicit formulae for the moments of the LSD does not seem to be easy to solve for other patterned matrices. The LSD of the non-symmetric tri...
Transformation of a high-dimensional color space for material classification.
Liu, Huajian; Lee, Sang-Heon; Chahl, Javaan Singh
2017-04-01
Images in red-green-blue (RGB) color space need to be transformed to other color spaces for image processing or analysis. For example, the well-known hue-saturation-intensity (HSI) color space, which separates hue from saturation and intensity and is similar to the color perception of humans, can aid many computer vision applications. For high-dimensional images, such as multispectral or hyperspectral images, transformation images to a color space that can separate hue from saturation and intensity would be useful; however, the related works are limited. Some methods could interpret a set of high-dimensional images to hue, saturation, and intensity, but these methods need to reduce the dimension of original images to three images and then map them to the trichromatic color space of RGB. Generally, dimension reduction could cause loss or distortion of original data, and, therefore, the transformed color spaces could not be suitable for material classification in critical conditions. This paper describes a method that can transform high-dimensional images to a color space called hyper-hue-saturation-intensity (HHSI), which is analogous to HSI in high dimensions. The transformation does not need dimension reduction, and, therefore, it can preserve the original information. Experimental results indicate that the hyper-hue is independent of saturation and intensity and it is more suitable for material classification of proximal or remote sensing images captured in a natural environment where illumination usually cannot be controlled.
Designing Progressive and Interactive Analytics Processes for High-Dimensional Data Analysis.
Turkay, Cagatay; Kaya, Erdem; Balcisoy, Selim; Hauser, Helwig
2017-01-01
In interactive data analysis processes, the dialogue between the human and the computer is the enabling mechanism that can lead to actionable observations about the phenomena being investigated. It is of paramount importance that this dialogue is not interrupted by slow computational mechanisms that do not consider any known temporal human-computer interaction characteristics that prioritize the perceptual and cognitive capabilities of the users. In cases where the analysis involves an integrated computational method, for instance to reduce the dimensionality of the data or to perform clustering, such non-optimal processes are often likely. To remedy this, progressive computations, where results are iteratively improved, are getting increasing interest in visual analytics. In this paper, we present techniques and design considerations to incorporate progressive methods within interactive analysis processes that involve high-dimensional data. We define methodologies to facilitate processes that adhere to the perceptual characteristics of users and describe how online algorithms can be incorporated within these. A set of design recommendations and according methods to support analysts in accomplishing high-dimensional data analysis tasks are then presented. Our arguments and decisions here are informed by observations gathered over a series of analysis sessions with analysts from finance. We document observations and recommendations from this study and present evidence on how our approach contribute to the efficiency and productivity of interactive visual analysis sessions involving high-dimensional data.
Inferring gene regulatory relationships with a high-dimensional robust approach.
Zang, Yangguang; Zhao, Qing; Zhang, Qingzhao; Li, Yang; Zhang, Sanguo; Ma, Shuangge
2017-07-01
Gene expression (GE) levels have important biological and clinical implications. They are regulated by copy number alterations (CNAs). Modeling the regulatory relationships between GEs and CNAs facilitates understanding disease biology and can also have values in translational medicine. The expression level of a gene can be regulated by its cis-acting as well as trans-acting CNAs, and the set of trans-acting CNAs is usually not known, which poses a high-dimensional selection and estimation problem. Most of the existing studies share a common limitation in that they cannot accommodate long-tailed distributions or contamination of GE data. In this study, we develop a high-dimensional robust regression approach to infer the regulatory relationships between GEs and CNAs. A high-dimensional regression model is used to accommodate the effects of both cis-acting and trans-acting CNAs. A density power divergence loss function is used to accommodate long-tailed GE distributions and contamination. Penalization is adopted for regularized estimation and selection of relevant CNAs. The proposed approach is effectively realized using a coordinate descent algorithm. Simulation shows that it has competitive performance compared to the nonrobust benchmark and the robust LAD (least absolute deviation) approach. We analyze TCGA (The Cancer Genome Atlas) data on cutaneous melanoma and study GE-CNA regulations in the RAP (regulation of apoptosis) pathway, which further demonstrates the satisfactory performance of the proposed approach. © 2017 WILEY PERIODICALS, INC.
High-Dimensional Function Approximation With Neural Networks for Large Volumes of Data.
Andras, Peter
2018-02-01
Approximation of high-dimensional functions is a challenge for neural networks due to the curse of dimensionality. Often the data for which the approximated function is defined resides on a low-dimensional manifold and in principle the approximation of the function over this manifold should improve the approximation performance. It has been show that projecting the data manifold into a lower dimensional space, followed by the neural network approximation of the function over this space, provides a more precise approximation of the function than the approximation of the function with neural networks in the original data space. However, if the data volume is very large, the projection into the low-dimensional space has to be based on a limited sample of the data. Here, we investigate the nature of the approximation error of neural networks trained over the projection space. We show that such neural networks should have better approximation performance than neural networks trained on high-dimensional data even if the projection is based on a relatively sparse sample of the data manifold. We also find that it is preferable to use a uniformly distributed sparse sample of the data for the purpose of the generation of the low-dimensional projection. We illustrate these results considering the practical neural network approximation of a set of functions defined on high-dimensional data including real world data as well.
Mixture drug-count response model for the high-dimensional drug combinatory effect on myopathy.
Wang, Xueying; Zhang, Pengyue; Chiang, Chien-Wei; Wu, Hengyi; Shen, Li; Ning, Xia; Zeng, Donglin; Wang, Lei; Quinney, Sara K; Feng, Weixing; Li, Lang
2018-02-20
Drug-drug interactions (DDIs) are a common cause of adverse drug events (ADEs). The electronic medical record (EMR) database and the FDA's adverse event reporting system (FAERS) database are the major data sources for mining and testing the ADE-associated DDI signals. Most DDI data mining methods focus on pair-wise drug interactions, and methods to detect high-dimensional DDIs in medical databases are lacking. In this paper, we propose 2 novel mixture drug-count response models for detecting high-dimensional drug combinations that induce myopathy. The "count" indicates the number of drugs in a combination. One model is called the fixed probability mixture drug-count response model with a maximum risk threshold (FMDRM-MRT). The other model is called the count-dependent probability mixture drug-count response model with a maximum risk threshold (CMDRM-MRT), in which the mixture probability is count dependent. Compared with the previous mixture drug-count response model (MDRM) developed by our group, these 2 new models show a better likelihood in detecting high-dimensional drug combinatory effects on myopathy. CMDRM-MRT identified and validated (54; 374; 637; 442; 131) 2-way to 6-way drug interactions, respectively, which induce myopathy in both EMR and FAERS databases. We further demonstrate that FAERS data capture a much higher maximum myopathy risk than EMR data do. The consistency of the 2 mixture models' parameters and the local false discovery rate estimates is evaluated through statistical simulation studies. Copyright © 2017 John Wiley & Sons, Ltd.
Lee, Jenny Hyunjung; McDonnell, Kevin T; Zelenyuk, Alla; Imre, Dan; Mueller, Klaus
2014-03-01
Although the Euclidean distance does well in measuring data distances within high-dimensional clusters, it does poorly when it comes to gauging intercluster distances. This significantly impacts the quality of global, low-dimensional space embedding procedures such as the popular multidimensional scaling (MDS), where one can often observe nonintuitive layouts. We were inspired by the perceptual processes evoked in the method of parallel coordinates, which enables users to visually aggregate the data by the patterns the polylines exhibit across the dimension axes. We call the path of such a polyline its structure and suggest a metric that captures this structure directly in high-dimensional space. This allows us to better gauge the distances of spatially distant data constellations and so achieve data aggregations in MDS plots that are more cognizant of existing high-dimensional structure similarities. Our biscale framework distinguishes far-distances from near-distances. The coarser scale uses the structural similarity metric to separate data aggregates obtained by prior classification or clustering, while the finer scale employs the appropriate Euclidean distance.
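A minimal version of the "structure of a polyline" idea is to compare the axis-to-axis slopes of two parallel-coordinates polylines instead of their raw positions; the paper's actual metric is richer, so this sketch only illustrates the contrast with the Euclidean distance.

```python
import numpy as np

def structure_distance(x, y):
    """Compare the 'shapes' of two parallel-coordinates polylines via
    their sequences of axis-to-axis slopes (first differences)."""
    return np.linalg.norm(np.diff(x) - np.diff(y))

a = np.array([0.1, 0.5, 0.2, 0.8])
b = a + 0.6                           # same zig-zag shape, shifted upward
c = np.array([0.8, 0.2, 0.7, 0.1])    # opposite zig-zag pattern

# The Euclidean distance sees b as far from a; the structure metric does not.
euclid_ab = np.linalg.norm(a - b)
```

Here `a` and `b` trace the same pattern across the axes, so their structure distance is zero even though they are Euclidean-far apart, while `c` with an inverted pattern is structure-far from both.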
High-dimensional decoy-state quantum key distribution over multicore telecommunication fibers
Cañas, G.; Vera, N.; Cariñe, J.; González, P.; Cardenas, J.; Connolly, P. W. R.; Przysiezna, A.; Gómez, E. S.; Figueroa, M.; Vallone, G.; Villoresi, P.; da Silva, T. Ferreira; Xavier, G. B.; Lima, G.
2017-08-01
Multiplexing is a strategy to augment the transmission capacity of a communication system. It consists of combining multiple signals over the same data channel and it has been very successful in classical communications. However, the use of enhanced channels has only reached limited practicality in quantum communications (QC) as it requires the manipulation of quantum systems of higher dimensions. Considerable effort is being made towards QC using high-dimensional quantum systems encoded into the transverse momentum of single photons, but so far no approach has been proven to be fully compatible with the existing telecommunication fibers. Here we overcome such a challenge and demonstrate a secure high-dimensional decoy-state quantum key distribution session over a 300-m-long multicore optical fiber. The high-dimensional quantum states are defined in terms of the transverse core modes available for the photon transmission over the fiber, and theoretical analyses show that positive secret key rates can be achieved through metropolitan distances.
Lambda-matrices and vibrating systems
Lancaster, Peter; Stark, M; Kahane, J P
1966-01-01
Lambda-Matrices and Vibrating Systems presents aspects of and solutions to problems concerned with linear vibrating systems with a finite number of degrees of freedom and the theory of matrices. The book discusses some parts of the theory of matrices that account for the solutions of the problems. The text starts with an outline of matrix theory, and some theorems are proved. The Jordan canonical form is also applied to understand the structure of square matrices. Classical theorems are discussed further by applying the Jordan canonical form, the Rayleigh quotient, and simple matrix pencils with late
Matrices with totally positive powers and their generalizations
Kushel, Olga Y.
2013-01-01
In this paper, eventually totally positive matrices (i.e. matrices all of whose powers, starting from some point, are totally positive) are studied. We present a new approach to eventual total positivity which is based on the theory of eventually positive matrices. We mainly focus on the spectral properties of such matrices. We also study eventually J-sign-symmetric matrices and matrices whose powers are P-matrices.
The lower bounds for the rank of matrices and some sufficient conditions for nonsingular matrices.
Wang, Dafei; Zhang, Xumei
2017-01-01
The paper mainly discusses the lower bounds for the rank of matrices and sufficient conditions for nonsingular matrices. We first present a new estimation for [Formula: see text] ([Formula: see text] is an eigenvalue of a matrix) by using the partitioned matrices. By using this estimation and inequality theory, the new and more accurate estimations for the lower bounds for the rank are deduced. Furthermore, based on the estimation for the rank, some sufficient conditions for nonsingular matrices are obtained.
Pathological rate matrices: from primates to pathogens
Directory of Open Access Journals (Sweden)
Knight Rob
2008-12-01
Full Text Available Abstract Background Continuous-time Markov models allow flexible, parametrically succinct descriptions of sequence divergence. Non-reversible forms of these models are more biologically realistic but are challenging to develop. The instantaneous rate matrices defined for these models are typically transformed into substitution probability matrices using a matrix exponentiation algorithm that employs eigendecomposition, but this algorithm has characteristic vulnerabilities that lead to significant errors when a rate matrix possesses certain 'pathological' properties. Here we tested whether pathological rate matrices exist in nature, and considered the suitability of different algorithms for their computation. Results We used concatenated protein coding gene alignments from microbial genomes, primate genomes and independent intron alignments from primate genomes. The Taylor series expansion and eigendecomposition matrix exponentiation algorithms were compared to the less widely employed, but more robust, Padé with scaling and squaring algorithm for nucleotide, dinucleotide, codon and trinucleotide rate matrices. Pathological dinucleotide and trinucleotide matrices were evident in the microbial data set, affecting the eigendecomposition and Taylor algorithms respectively. Even using a conservative estimate of matrix error (occurrence of an invalid probability), both Taylor and eigendecomposition algorithms exhibited substantial error rates: ~100% of all exonic trinucleotide matrices were pathological to the Taylor algorithm while ~10% of codon positions 1 and 2 dinucleotide matrices and intronic trinucleotide matrices, and ~30% of codon matrices were pathological to eigendecomposition. The majority of Taylor algorithm errors derived from occurrence of multiple unobserved states. A small number of negative probabilities were detected from the Padé algorithm on trinucleotide matrices that were attributable to machine precision. Although the Pad
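The comparison above is between a naive truncated Taylor series and Padé with scaling and squaring (which is what `scipy.linalg.expm` implements). A small well-conditioned rate matrix, where the two agree and the resulting substitution matrix is a valid stochastic matrix, can be checked directly; the rate values below are made up for illustration.

```python
import numpy as np
from scipy.linalg import expm  # Padé approximation with scaling and squaring

def expm_taylor(Q, terms=25):
    """Naive truncated Taylor series for the matrix exponential."""
    P, term = np.eye(len(Q)), np.eye(len(Q))
    for k in range(1, terms):
        term = term @ Q / k
        P = P + term
    return P

# A toy nucleotide rate matrix: positive off-diagonal rates, rows sum to zero.
Q = np.array([[-1.1,  0.3,  0.5,  0.3],
              [ 0.2, -0.9,  0.3,  0.4],
              [ 0.4,  0.2, -1.0,  0.4],
              [ 0.3,  0.5,  0.2, -1.0]])
P_pade = expm(Q * 0.5)
P_taylor = expm_taylor(Q * 0.5)
```

For a benign matrix like this both algorithms produce a valid transition matrix; the pathologies reported in the abstract arise for rate matrices that are far less well behaved.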
A partial classification of primes in the positive matrices and in the doubly stochastic matrices
G. Picci; J.M. van den Hof; J.H. van Schuppen (Jan)
1995-01-01
The algebraic structure of the set of square positive matrices is that of a semi-ring. The concept of a prime in the positive matrices has been introduced. A few examples of primes in the positive matrices are known but there is no general classification. In this paper a partial
Performance of Low-rank STAP detectors
Anitori, L.; Srinivasan, R.; Rangaswamy, M.
2008-01-01
In this paper the STAP detector based on the low-rank approximation of the normalized adaptive matched filter (LRNAMF) is investigated for its false alarm probability (FAP) performance. An exact formula for the FAP of the LRNAMF detector is derived using the g-method estimator [4]. The non CFAR
Low Rank Sparse Coding for Image Classification
2013-12-08
Singapore; Institute of Automation, Chinese Academy of Sciences, P. R. China; University of Illinois at Urbana-Champaign, Urbana, IL, USA. Abstract: In this...which contain 200 to 400 images each. The categories vary from outdoor scenes like mountain and forest to indoor environments like living room
Quantum Hilbert matrices and orthogonal polynomials
DEFF Research Database (Denmark)
Andersen, Jørgen Ellegaard; Berg, Christian
2009-01-01
Using the notion of quantum integers associated with a complex number q ≠ 0, we define the quantum Hilbert matrix and various extensions. They are Hankel matrices corresponding to certain little q-Jacobi polynomials when |q|... of reciprocal Fibonacci numbers called Filbert matrices. We find a formula for the entries of the inverse quantum Hilbert matrix....
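The Filbert matrix mentioned above is the Hankel matrix of reciprocal Fibonacci numbers, H[i, j] = 1/F(i + j + 1) (0-based indices). A known result (Richardson) states that its inverse has integer entries, which can be checked numerically; the sketch below is an illustration, not the paper's quantum construction.

```python
import numpy as np

def fib(n):
    """n-th Fibonacci number, F(1) = F(2) = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

n = 4
# Filbert matrix: Hankel matrix of reciprocal Fibonacci numbers.
H = np.array([[1.0 / fib(i + j + 1) for j in range(n)] for i in range(n)])
Hinv = np.linalg.inv(H)
R = np.round(Hinv)   # nearest integer matrix
```

The Hankel structure (constant anti-diagonals) and the integrality of the inverse both hold for this small example.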
Products of Generalized Stochastic Sarymsakov Matrices
Xia, Weiguo; Liu, Ji; Cao, Ming; Johansson, Karl; Basar, Tamer
2015-01-01
In the set of stochastic, indecomposable, aperiodic (SIA) matrices, the class of stochastic Sarymsakov matrices is the largest known subset (i) that is closed under matrix multiplication and (ii) for which the infinitely long left-product of elements from a compact subset converges to a rank-one matrix. In
Chang, Jinyuan; Zhou, Wen; Zhou, Wen-Xin; Wang, Lan
2017-03-01
Comparing large covariance matrices has important applications in modern genomics, where scientists are often interested in understanding whether relationships (e.g., dependencies or co-regulations) among a large number of genes vary between different biological states. We propose a computationally fast procedure for testing the equality of two large covariance matrices when the dimensions of the covariance matrices are much larger than the sample sizes. A distinguishing feature of the new procedure is that it imposes no structural assumptions on the unknown covariance matrices. Hence, the test is robust with respect to various complex dependence structures that frequently arise in genomics. We prove that the proposed procedure is asymptotically valid under weak moment conditions. As an interesting application, we derive a new gene clustering algorithm which shares the same nice property of avoiding restrictive structural assumptions for high-dimensional genomics data. Using an asthma gene expression dataset, we illustrate how the new test helps compare the covariance matrices of the genes across different gene sets/pathways between the disease group and the control group, and how the gene clustering algorithm provides new insights on the way gene clustering patterns differ between the two groups. The proposed methods have been implemented in an R-package HDtest and are available on CRAN. © 2016, The International Biometric Society.
Approximate Joint Diagonalization and Geometric Mean of Symmetric Positive Definite Matrices
Congedo, Marco; Afsari, Bijan; Barachant, Alexandre; Moakher, Maher
2015-01-01
We explore the connection between two problems that have arisen independently in the signal processing and related fields: the estimation of the geometric mean of a set of symmetric positive definite (SPD) matrices and their approximate joint diagonalization (AJD). Today there is a considerable interest in estimating the geometric mean of a SPD matrix set in the manifold of SPD matrices endowed with the Fisher information metric. The resulting mean has several important invariance properties and has proven very useful in diverse engineering applications such as biomedical and image data processing. While for two SPD matrices the mean has an algebraic closed form solution, for a set of more than two SPD matrices it can only be estimated by iterative algorithms. However, none of the existing iterative algorithms feature at the same time fast convergence, low computational complexity per iteration and guarantee of convergence. For this reason, recently other definitions of geometric mean based on symmetric divergence measures, such as the Bhattacharyya divergence, have been considered. The resulting means, although possibly useful in practice, do not satisfy all desirable invariance properties. In this paper we consider geometric means of covariance matrices estimated on high-dimensional time-series, assuming that the data is generated according to an instantaneous mixing model, which is very common in signal processing. We show that in these circumstances we can approximate the Fisher information geometric mean by employing an efficient AJD algorithm. Our approximation is in general much closer to the Fisher information geometric mean as compared to its competitors and verifies many invariance properties. Furthermore, convergence is guaranteed, the computational complexity is low and the convergence rate is quadratic. The accuracy of this new geometric mean approximation is demonstrated by means of simulations. PMID:25919667
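The closed-form two-matrix case mentioned in the abstract, A # B = A^{1/2} (A^{-1/2} B A^{-1/2})^{1/2} A^{1/2}, is easy to implement and to check against its known invariance properties (symmetry in A and B, det(A # B)^2 = det(A) det(B), and A # A = A). The matrices below are arbitrary SPD examples.

```python
import numpy as np

def spd_sqrt(A):
    """Symmetric square root of an SPD matrix via eigendecomposition."""
    w, V = np.linalg.eigh(A)
    return (V * np.sqrt(w)) @ V.T

def geometric_mean(A, B):
    """Fisher-metric geometric mean of two SPD matrices:
    A # B = A^{1/2} (A^{-1/2} B A^{-1/2})^{1/2} A^{1/2}."""
    As = spd_sqrt(A)
    Asi = np.linalg.inv(As)
    return As @ spd_sqrt(Asi @ B @ Asi) @ As

A = np.array([[2.0, 0.5], [0.5, 1.0]])
B = np.array([[1.0, -0.2], [-0.2, 3.0]])
G = geometric_mean(A, B)
```

For more than two matrices no such closed form exists, which is where the iterative algorithms and the AJD-based approximation of the paper come in.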
On-chip generation of high-dimensional entangled quantum states and their coherent control.
Kues, Michael; Reimer, Christian; Roztocki, Piotr; Cortés, Luis Romero; Sciara, Stefania; Wetzel, Benjamin; Zhang, Yanbing; Cino, Alfonso; Chu, Sai T; Little, Brent E; Moss, David J; Caspani, Lucia; Azaña, José; Morandotti, Roberto
2017-06-28
Optical quantum states based on entangled photons are essential for solving questions in fundamental physics and are at the heart of quantum information science. Specifically, the realization of high-dimensional states (D-level quantum systems, that is, qudits, with D > 2) and their control are necessary for fundamental investigations of quantum mechanics, for increasing the sensitivity of quantum imaging schemes, for improving the robustness and key rate of quantum communication protocols, for enabling a richer variety of quantum simulations, and for achieving more efficient and error-tolerant quantum computation. Integrated photonics has recently become a leading platform for the compact, cost-efficient, and stable generation and processing of non-classical optical states. However, so far, integrated entangled quantum sources have been limited to qubits (D = 2). Here we demonstrate on-chip generation of entangled qudit states, where the photons are created in a coherent superposition of multiple high-purity frequency modes. In particular, we confirm the realization of a quantum system with at least one hundred dimensions, formed by two entangled qudits with D = 10. Furthermore, using state-of-the-art, yet off-the-shelf telecommunications components, we introduce a coherent manipulation platform with which to control frequency-entangled states, capable of performing deterministic high-dimensional gate operations. We validate this platform by measuring Bell inequality violations and performing quantum state tomography. Our work enables the generation and processing of high-dimensional quantum states in a single spatial mode.
Sparse redundancy analysis of high-dimensional genetic and genomic data.
Csala, Attila; Voorbraak, Frans P J M; Zwinderman, Aeilko H; Hof, Michel H
2017-10-15
Recent technological developments have enabled the possibility of genetic and genomic integrated data analysis approaches, where multiple omics datasets from various biological levels are combined and used to describe (disease) phenotypic variations. The main goal is to explain and ultimately predict phenotypic variations by understanding their genetic basis and the interaction of the associated genetic factors. Therefore, understanding the underlying genetic mechanisms of phenotypic variations is an ever increasing research interest in biomedical sciences. In many situations, we have a set of variables that can be considered to be the outcome variables and a set that can be considered to be explanatory variables. Redundancy analysis (RDA) is an analytic method to deal with this type of directionality. Unfortunately, current implementations of RDA cannot deal optimally with the high dimensionality of omics data (p≫n). The existing theoretical framework, based on Ridge penalization, is suboptimal, since it includes all variables in the analysis. As a solution, we propose to use Elastic Net penalization in an iterative RDA framework to obtain a sparse solution, yielding sparse redundancy analysis (sRDA) for high-dimensional omics data analysis. We conducted simulation studies with our software implementation of sRDA to assess its reliability. Both the analysis of simulated data, and the analysis of 485,512 methylation markers and 18,424 gene-expression values measured in a set of 55 patients with Marfan syndrome show that sRDA is able to deal with the usual high dimensionality of omics data. Availability: http://uva.csala.me/rda. Contact: a.csala@amc.uva.nl. Supplementary data are available at Bioinformatics online.
Shiqing Wang; Limin Su
2013-01-01
During the last few years, a great deal of attention has been focused on Lasso and Dantzig selector in high-dimensional linear regression when the number of variables can be much larger than the sample size. Under a sparsity scenario, the authors (see, e.g., Bickel et al., 2009, Bunea et al., 2007, Candes and Tao, 2007, Candès and Tao, 2007, Donoho et al., 2006, Koltchinskii, 2009, Koltchinskii, 2009, Meinshausen and Yu, 2009, Rosenbaum and Tsybakov, 2010, Tsybakov, 2006, van de Geer, 2008, a...
Efficient Estimation of first Passage Probability of high-Dimensional Nonlinear Systems
DEFF Research Database (Denmark)
Sichani, Mahdi Teimouri; Nielsen, Søren R.K.; Bucher, Christian
2011-01-01
An efficient method for estimating low first passage probabilities of high-dimensional nonlinear systems based on asymptotic estimation of low probabilities is presented. The method does not require any a priori knowledge of the system, i.e. it is a black-box method, and has very low requirements......, the failure probabilities of three well-known nonlinear systems are estimated. Next, a reduced degree-of-freedom model of a wind turbine is developed and is exposed to a turbulent wind field. The model incorporates very high dimensions and strong nonlinearities simultaneously. The failure probability...
CSIR Research Space (South Africa)
Giovannini, D
2013-06-01
Full Text Available : QELS_Fundamental Science, San Jose, California United States, 9-14 June 2013 Reconstruction of High-Dimensional States Entangled in Orbital Angular Momentum Using Mutually Unbiased Measurements D. Giovannini1, ⇤, J. Romero1, 2, J. Leach3, A.... Dudley4, A. Forbes4, 5 and M. J. Padgett1 1 School of Physics and Astronomy, SUPA, University of Glasgow, Glasgow G12 8QQ, United Kingdom 2 Department of Physics, SUPA, University of Strathclyde, Glasgow G4 ONG, United Kingdom 3 School of Engineering...
Computing and visualizing time-varying merge trees for high-dimensional data
Energy Technology Data Exchange (ETDEWEB)
Oesterling, Patrick [Univ. of Leipzig (Germany); Heine, Christian [Univ. of Kaiserslautern (Germany); Weber, Gunther H. [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Morozov, Dmitry [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Scheuermann, Gerik [Univ. of Leipzig (Germany)
2017-06-03
We introduce a new method that identifies and tracks features in arbitrary dimensions using the merge tree -- a structure for identifying topological features based on thresholding in scalar fields. This method analyzes the evolution of features of the function by tracking changes in the merge tree and relates features by matching subtrees between consecutive time steps. Using the time-varying merge tree, we present a structural visualization of the changing function that illustrates both features and their temporal evolution. We demonstrate the utility of our approach by applying it to temporal cluster analysis of high-dimensional point clouds.
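The thresholding-plus-merge-tracking step described above can be sketched with a union-find sweep: visit vertices of a graph from the highest scalar value down, start a new component at each local maximum, and record a merge event whenever a vertex connects two previously separate components. This toy version handles a single time step on a 1-D field; the function name and event representation are illustrative.

```python
import numpy as np

def merge_events(values, edges):
    """Sweep the threshold downward over a scalar field on a graph and
    record the field value at which two superlevel-set components merge."""
    order = np.argsort(values)[::-1]          # visit vertices high-to-low
    parent = {}
    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]     # path halving
            v = parent[v]
        return v
    adj = {i: [] for i in range(len(values))}
    for a, b in edges:
        adj[a].append(b)
        adj[b].append(a)
    events = []
    for v in order:
        parent[v] = v                          # new component (a maximum)
        roots = {find(u) for u in adj[v] if u in parent}
        # joining one existing component just extends it; each extra
        # component reached through v is a merge (a saddle of the field)
        events.extend([float(values[v])] * max(0, len(roots) - 1))
        for r in roots:
            parent[r] = v
    return events

# 1-D field with two peaks (3.0 and 4.0) separated by a valley at height 1.0.
field = np.array([0.0, 3.0, 1.0, 4.0, 0.5])
edges = [(i, i + 1) for i in range(4)]
ev = merge_events(field, edges)
```

The two peak branches merge exactly at the valley height, giving one merge event at value 1.0; tracking these trees across time steps is the paper's contribution.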
High-dimensional, massive sample-size Cox proportional hazards regression for survival analysis.
Mittal, Sushil; Madigan, David; Burd, Randall S; Suchard, Marc A
2014-04-01
Survival analysis endures as an old, yet active research field with applications that spread across many domains. Continuing improvements in data acquisition techniques pose constant challenges in applying existing survival analysis methods to these emerging data sets. In this paper, we present tools for fitting regularized Cox survival analysis models on high-dimensional, massive sample-size (HDMSS) data using a variant of the cyclic coordinate descent optimization technique tailored for the sparsity that HDMSS data often present. Experiments on two real data examples demonstrate that efficient analyses of HDMSS data using these tools result in improved predictive performance and calibration.
High-dimensional nonlinear diffusion stochastic processes modelling for engineering applications
Mamontov, Yevgeny
2001-01-01
This book is the first one devoted to high-dimensional (or large-scale) diffusion stochastic processes (DSPs) with nonlinear coefficients. These processes are closely associated with nonlinear Ito's stochastic ordinary differential equations (ISODEs) and with the space-discretized versions of nonlinear Ito's stochastic partial integro-differential equations. The latter models include Ito's stochastic partial differential equations (ISPDEs). The book presents the new analytical treatment which can serve as the basis of a combined, analytical-numerical approach to greater computational efficienc
Efficient and accurate nearest neighbor and closest pair search in high-dimensional space
Tao, Yufei
2010-07-01
Nearest Neighbor (NN) search in high-dimensional space is an important problem in many applications. From the database perspective, a good solution needs to have two properties: (i) it can be easily incorporated in a relational database, and (ii) its query cost should increase sublinearly with the dataset size, regardless of the data and query distributions. Locality-Sensitive Hashing (LSH) is a well-known methodology fulfilling both requirements, but its current implementations either incur expensive space and query cost, or abandon its theoretical guarantee on the quality of query results. Motivated by this, we improve LSH by proposing an access method called the Locality-Sensitive B-tree (LSB-tree) to enable fast, accurate, high-dimensional NN search in relational databases. The combination of several LSB-trees forms a LSB-forest that has strong quality guarantees, but improves dramatically the efficiency of the previous LSH implementation having the same guarantees. In practice, the LSB-tree itself is also an effective index which consumes linear space, supports efficient updates, and provides accurate query results. In our experiments, the LSB-tree was faster than: (i) iDistance (a famous technique for exact NN search) by two orders of magnitude, and (ii) MedRank (a recent approximate method with nontrivial quality guarantees) by one order of magnitude, and meanwhile returned much better results. As a second step, we extend our LSB technique to solve another classic problem, called Closest Pair (CP) search, in high-dimensional space. The long-term challenge for this problem has been to achieve subquadratic running time at very high dimensionalities, which most of the existing solutions fail to achieve. We show that, using a LSB-forest, CP search can be accomplished in (worst-case) time significantly lower than the quadratic complexity, yet still ensuring very good quality. In practice, accurate answers can be found using just two LSB-trees, thus giving a substantial
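The baseline that the LSB-tree improves on is classic Euclidean LSH with hash functions h(x) = floor((a·x + b) / w). A minimal in-memory sketch of that baseline (not the LSB-tree itself; class name and parameter values are illustrative) shows the query pattern: hash into several tables, union the colliding buckets, then scan only those candidates.

```python
import numpy as np

rng = np.random.default_rng(7)

class L2LSH:
    """E2LSH-style index: several hashes concatenated per table,
    several tables queried in parallel."""
    def __init__(self, dim, n_tables=8, n_hashes=4, w=1.0):
        self.a = rng.normal(size=(n_tables, n_hashes, dim))
        self.b = rng.uniform(0, w, size=(n_tables, n_hashes))
        self.w = w
        self.tables = [dict() for _ in range(n_tables)]
    def _keys(self, x):
        return [tuple(np.floor((A @ x + b) / self.w).astype(int))
                for A, b in zip(self.a, self.b)]
    def insert(self, idx, x):
        for table, key in zip(self.tables, self._keys(x)):
            table.setdefault(key, []).append(idx)
    def query(self, x):
        cand = set()
        for table, key in zip(self.tables, self._keys(x)):
            cand.update(table.get(key, ()))
        return cand

data = rng.normal(size=(500, 10))
index = L2LSH(dim=10)
for i, x in enumerate(data):
    index.insert(i, x)

q = data[42] + 0.01 * rng.normal(size=10)   # query very near point 42
cands = index.query(q)
```

The candidate set is a small fraction of the 500 points yet contains the true nearest neighbor; the LSB-tree's contribution is to realize this idea inside a relational B-tree with provable space and quality bounds.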
Stochastic Neural Network Approach for Learning High-Dimensional Free Energy Surfaces
Schneider, Elia; Dai, Luke; Topper, Robert Q.; Drechsel-Grau, Christof; Tuckerman, Mark E.
2017-10-01
The generation of free energy landscapes corresponding to conformational equilibria in complex molecular systems remains a significant computational challenge. Adding to this challenge is the need to represent, store, and manipulate the often high-dimensional surfaces that result from rare-event sampling approaches employed to compute them. In this Letter, we propose the use of artificial neural networks as a solution to these issues. Using specific examples, we discuss network training using enhanced-sampling methods and the use of the networks in the calculation of ensemble averages.
Non-Asymptotic Oracle Inequalities for the High-Dimensional Cox Regression via Lasso.
Kong, Shengchun; Nan, Bin
2014-01-01
We consider finite sample properties of the regularized high-dimensional Cox regression via lasso. Existing literature focuses on linear models or generalized linear models with Lipschitz loss functions, where the empirical risk functions are the summations of independent and identically distributed (iid) losses. The summands in the negative log partial likelihood function for censored survival data, however, are neither iid nor Lipschitz. We first approximate the negative log partial likelihood function by a sum of iid non-Lipschitz terms, then derive the non-asymptotic oracle inequalities for the lasso-penalized Cox regression using pointwise arguments to tackle the difficulties caused by the lack of iid Lipschitz losses.
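The non-iid structure discussed above comes from the risk sets in the partial likelihood: each event's term involves all subjects still at risk. A minimal evaluation of the negative log partial likelihood (Breslow form, assuming no tied event times; the lasso objective simply adds a λ‖β‖₁ penalty to this) makes that coupling explicit. The toy data are illustrative.

```python
import numpy as np

def neg_log_partial_likelihood(beta, X, time, event):
    """Cox negative log partial likelihood, no tied event times:
    each observed event contributes the log-sum of exp(linear predictor)
    over its risk set minus its own linear predictor."""
    eta = X @ beta
    nll = 0.0
    for i in np.where(event)[0]:
        at_risk = time >= time[i]               # subjects still at risk
        nll += np.log(np.exp(eta[at_risk]).sum()) - eta[i]
    return nll

# Toy data: 4 subjects, events observed at times 1, 2 and 4 (time 3 censored).
X = np.array([[0.5], [-0.2], [0.1], [0.3]])
time = np.array([1.0, 2.0, 3.0, 4.0])
event = np.array([True, True, False, True])
nll0 = neg_log_partial_likelihood(np.zeros(1), X, time, event)
```

At β = 0 each event term reduces to the log of its risk-set size (here 4, 3 and 1), which gives a simple closed-form check.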
Inferring biological tasks using Pareto analysis of high-dimensional data.
Hart, Yuval; Sheftel, Hila; Hausser, Jean; Szekely, Pablo; Ben-Moshe, Noa Bossel; Korem, Yael; Tendler, Avichai; Mayo, Avraham E; Alon, Uri
2015-03-01
We present the Pareto task inference method (ParTI; http://www.weizmann.ac.il/mcb/UriAlon/download/ParTI) for inferring biological tasks from high-dimensional biological data. Data are described as a polytope, and features maximally enriched closest to the vertices (or archetypes) allow identification of the tasks the vertices represent. We demonstrate that human breast tumors and mouse tissues are well described by tetrahedrons in gene expression space, with specific tumor types and biological functions enriched at each of the vertices, suggesting four key tasks.
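The polytope description used by ParTI can be illustrated with synthetic data: if expression profiles are convex mixtures of a few archetypal programs, every point has non-negative barycentric coordinates relative to the archetypes that sum to one. The setup below is a hypothetical stand-in for the method (it assumes the archetypes are known, whereas ParTI infers them).

```python
import numpy as np

rng = np.random.default_rng(3)

# Profiles that are convex mixtures of three archetypal programs,
# so the data fill a triangle in 50-dimensional "gene expression" space.
archetypes = rng.normal(size=(3, 50))            # 3 archetypes, 50 genes
weights = rng.dirichlet(np.ones(3), size=400)    # barycentric weights
data = weights @ archetypes

# Recover each point's barycentric coordinates relative to the archetypes;
# appending a row of ones enforces the sum-to-one constraint in least squares.
A = np.vstack([archetypes.T, np.ones((1, 3))])
b = np.hstack([data, np.ones((len(data), 1))])
recovered = np.linalg.lstsq(A, b.T, rcond=None)[0].T
```

Because the data lie exactly in the simplex spanned by the archetypes, the recovered coordinates match the generating weights and stay in [0, 1]; real data only approximately satisfy this, which is what the enrichment analysis at the vertices exploits.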
Ceotto, Michele; Di Liberto, Giovanni; Conte, Riccardo
2017-07-01
A new semiclassical "divide-and-conquer" method is presented with the aim of demonstrating that quantum dynamics simulations of high dimensional molecular systems are doable. The method is first tested by calculating the quantum vibrational power spectra of water, methane, and benzene - three molecules of increasing dimensionality for which benchmark quantum results are available - and then applied to C60, a system characterized by 174 vibrational degrees of freedom. Results show that the approach can accurately account for quantum anharmonicities, purely quantum features like overtones, and the removal of degeneracy when the molecular symmetry is broken.
Talib, Imran; Belgacem, Fethi Bin Muhammad; Asif, Naseer Ahmad; Khalil, Hammad
2017-01-01
In this research article, we derive and analyze an efficient spectral method based on the operational matrices of three-dimensional orthogonal Jacobi polynomials to numerically solve a generalized class of high-dimensional, multi-term fractional-order partial differential equations with mixed partial derivatives. With the aid of the operational matrices, we transform the considered fractional-order problem into an easily solvable system of algebraic equations; solving this system yields the solution of the problem. Some test problems are considered to confirm the accuracy and validity of the proposed numerical method. The convergence of the method is verified by comparing our Matlab simulation results with the exact solutions in the literature, yielding negligible errors. Moreover, comparative results discussed in the literature are extended and improved in this study.
Nagarajan, Mahesh B; Coan, Paola; Huber, Markus B; Diemoz, Paul C; Glaser, Christian; Wismüller, Axel
2014-02-01
Phase-contrast computed tomography (PCI-CT) has shown tremendous potential as an imaging modality for visualizing human cartilage with high spatial resolution. Previous studies have demonstrated the ability of PCI-CT to visualize (1) structural details of the human patellar cartilage matrix and (2) changes to chondrocyte organization induced by osteoarthritis. This study investigates the use of high-dimensional geometric features in characterizing such chondrocyte patterns in the presence or absence of osteoarthritic damage. Geometrical features derived from the scaling index method (SIM) and statistical features derived from gray-level co-occurrence matrices were extracted from 842 regions of interest (ROI) annotated on PCI-CT images of ex vivo human patellar cartilage specimens. These features were subsequently used in a machine learning task with support vector regression to classify ROIs as healthy or osteoarthritic; classification performance was evaluated using the area under the receiver-operating characteristic curve (AUC). SIM-derived geometrical features exhibited the best classification performance (AUC, 0.95 ± 0.06) and were most robust to changes in ROI size. These results suggest that such geometrical features can provide a detailed characterization of the chondrocyte organization in the cartilage matrix in an automated and non-subjective manner, while also enabling classification of cartilage as healthy or osteoarthritic with high accuracy. Such features could potentially serve as imaging markers for evaluating osteoarthritis progression and its response to different therapeutic intervention strategies.
Community Detection for Correlation Matrices
Directory of Open Access Journals (Sweden)
Mel MacMahon
2015-04-01
Full Text Available A challenging problem in the study of complex systems is that of resolving, without prior information, the emergent, mesoscopic organization determined by groups of units whose dynamical activity is more strongly correlated internally than with the rest of the system. The existing techniques to filter correlations are not explicitly oriented towards identifying such modules and can suffer from an unavoidable information loss. A promising alternative is that of employing community detection techniques developed in network theory. Unfortunately, this approach has focused predominantly on replacing network data with correlation matrices, a procedure that we show to be intrinsically biased because of its inconsistency with the null hypotheses underlying the existing algorithms. Here, we introduce, via a consistent redefinition of null models based on random matrix theory, the appropriate correlation-based counterparts of the most popular community detection techniques. Our methods can filter out both unit-specific noise and system-wide dependencies, and the resulting communities are internally correlated and mutually anticorrelated. We also implement multiresolution and multifrequency approaches revealing hierarchically nested subcommunities with “hard” cores and “soft” peripheries. We apply our techniques to several financial time series and identify mesoscopic groups of stocks which are irreducible to a standard, sectorial taxonomy; detect “soft stocks” that alternate between communities; and discuss implications for portfolio optimization and risk management.
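The redefined null models above rest on random matrix theory. As a minimal, hedged sketch (not the authors' actual algorithm), the snippet below computes the Marchenko-Pastur bulk edges for the eigenvalue spectrum of an empirical correlation matrix and flags eigenvalues escaping the bulk as candidates for genuine group structure; the example eigenvalues are invented for illustration.

```python
import math

def mp_bulk_edges(n_series, n_obs):
    """Marchenko-Pastur bulk edges for the eigenvalues of the empirical
    correlation matrix of n_series independent series of length n_obs."""
    q = n_series / n_obs
    return (1 - math.sqrt(q)) ** 2, (1 + math.sqrt(q)) ** 2

def split_spectrum(eigenvalues, n_series, n_obs):
    """Separate eigenvalues into 'noise' (inside the random bulk) and
    'structure' (outside the bulk, candidates for community signal)."""
    lo, hi = mp_bulk_edges(n_series, n_obs)
    noise = [e for e in eigenvalues if lo <= e <= hi]
    structure = [e for e in eigenvalues if e < lo or e > hi]
    return noise, structure

# Example: 100 series, 500 observations, one dominant "market mode".
noise, structure = split_spectrum([25.0, 1.9, 1.1, 0.8, 0.5], 100, 500)
print(structure)  # [25.0] -- only the market mode escapes the bulk
```

The paper goes further, subtracting both the random bulk and the system-wide mode before running community detection; this sketch only shows the spectral filtering ingredient.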
Smart sampling and incremental function learning for very large high dimensional data.
Loyola R, Diego G; Pedergnana, Mattia; Gimeno García, Sebastián
2016-06-01
Very large high dimensional data are common nowadays and they impose new challenges to data-driven and data-intensive algorithms. Computational Intelligence techniques have the potential to provide powerful tools for addressing these challenges, but the current literature focuses mainly on handling scalability issues related to data volume in terms of sample size for classification tasks. This work presents a systematic and comprehensive approach for optimally handling regression tasks with very large high dimensional data. The proposed approach is based on smart sampling techniques for minimizing the number of samples to be generated, using an iterative procedure that creates new sample sets until the input and output space of the function to be approximated are optimally covered. Incremental function learning takes place in each sampling iteration: the new samples are used to fine-tune the regression results of the function learning algorithm. The accuracy and confidence levels of the resulting approximation function are assessed using the probably approximately correct computation framework. The smart sampling and incremental function learning techniques can be easily used in practical applications and scale well in the case of extremely large data. The feasibility and good results of the proposed techniques are demonstrated using benchmark functions as well as functions from real-world problems. Copyright © 2015 The Authors. Published by Elsevier Ltd. All rights reserved.
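As a hedged toy illustration of the smart-sampling loop (a one-dimensional stand-in, not the paper's actual high-dimensional procedure), the function below repeatedly tests a linear interpolant of the target at segment midpoints and adds a sample only where the interpolation error is largest, stopping once the tolerance is met everywhere.

```python
def incremental_sample(f, lo, hi, tol=1e-3, max_iter=50):
    """Iteratively refine a sample set of f on [lo, hi]: at each step,
    probe the linear interpolant at segment midpoints and add a sample
    where the error is largest (a toy stand-in for the paper's
    smart-sampling / incremental-learning iteration)."""
    xs = [lo, hi]
    for _ in range(max_iter):
        worst_x, worst_err = None, tol
        for a, b in zip(xs, xs[1:]):
            mid = (a + b) / 2
            interp = (f(a) + f(b)) / 2  # linear interpolant at the midpoint
            err = abs(f(mid) - interp)
            if err > worst_err:
                worst_x, worst_err = mid, err
        if worst_x is None:  # the interpolant is accurate everywhere
            break
        xs.append(worst_x)
        xs.sort()
    return xs

samples = incremental_sample(lambda x: x * x, 0.0, 1.0, tol=0.01)
print(len(samples))  # far fewer points than a dense uniform grid would need
```

The real method covers both the input and output spaces of a multivariate function and fine-tunes a learned regressor at each iteration; this sketch keeps only the "sample where needed" idea.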
Tikhonov, Mikhail; Monasson, Remi
2018-01-01
Much of our understanding of ecological and evolutionary mechanisms derives from analysis of low-dimensional models: with few interacting species, or few axes defining "fitness". It is not always clear to what extent the intuition derived from low-dimensional models applies to the complex, high-dimensional reality. For instance, most naturally occurring microbial communities are strikingly diverse, harboring a large number of coexisting species, each of which contributes to shaping the environment of others. Understanding the eco-evolutionary interplay in these systems is an important challenge, and an exciting new domain for statistical physics. Recent work identified a promising new platform for investigating highly diverse ecosystems, based on the classic resource competition model of MacArthur. Here, we describe how the same analytical framework can be used to study evolutionary questions. Our analysis illustrates how, at high dimension, the intuition promoted by a one-dimensional (scalar) notion of fitness can become misleading. Specifically, while the low-dimensional picture emphasizes organism cost or efficiency, we exhibit a regime where cost becomes irrelevant for survival, and link this observation to generic properties of high-dimensional geometry.
Nguyen, Lan Huong; Holmes, Susan
2017-09-13
Detecting patterns in high-dimensional multivariate datasets is non-trivial. Clustering and dimensionality reduction techniques often help in discerning inherent structures. In biological datasets such as microbial community composition or gene expression data, observations can be generated from a continuous process, often unknown. Estimating data points' 'natural ordering' and their corresponding uncertainties can help researchers draw insights about the mechanisms involved. We introduce a Bayesian Unidimensional Scaling (BUDS) technique which extracts dominant sources of variation in high dimensional datasets and produces their visual data summaries, facilitating the exploration of a hidden continuum. The method maps multivariate data points to latent one-dimensional coordinates along their underlying trajectory, and provides estimated uncertainty bounds. By statistically modeling dissimilarities and applying a DiSTATIS registration method to their posterior samples, we are able to incorporate visualizations of uncertainties in the estimated data trajectory across different regions using confidence contours for individual data points. We also illustrate the estimated overall data density across different areas by including density clouds. One-dimensional coordinates recovered by BUDS help researchers discover sample attributes or covariates that are factors driving the main variability in a dataset. We demonstrated the usefulness and accuracy of BUDS on a set of published microbiome 16S, RNA-seq, and roll call data. Our method effectively recovers and visualizes natural orderings present in datasets. Automatic visualization tools for data exploration and analysis are available at https://nlhuong.shinyapps.io/visTrajectory/.
Selecting Optimal Feature Set in High-Dimensional Data by Swarm Search
Directory of Open Access Journals (Sweden)
Simon Fong
2013-01-01
Full Text Available Selecting the right set of features from data of high dimensionality for inducing an accurate classification model is a tough computational challenge. It is almost an NP-hard problem, as the combinations of features escalate exponentially as the number of features increases. Unfortunately, in data mining, as well as in other engineering applications and in bioinformatics, some data are described by a long array of features. Many feature subset selection algorithms have been proposed in the past, but not all of them are effective. Since it would take seemingly forever to exhaustively try every possible combination of features by brute force, stochastic optimization may be a solution. In this paper, we propose a new feature selection scheme called Swarm Search to find an optimal feature set by using metaheuristics. The advantage of Swarm Search is its flexibility in integrating any classifier into its fitness function and plugging in any metaheuristic algorithm to facilitate heuristic search. Simulation experiments are carried out by testing the Swarm Search over some high-dimensional datasets, with different classification algorithms and various metaheuristic algorithms. The comparative experiment results show that Swarm Search is able to attain relatively low error rates in classification without shrinking the size of the feature subset to its minimum.
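To make the wrapper idea concrete, here is a hedged sketch: `swarm_search` below is a stripped-down stand-in for the paper's metaheuristics (not its actual swarm variants), evolving bit-mask feature subsets scored by a pluggable fitness function, here leave-one-out 1-NN accuracy on an invented toy dataset whose first feature is informative and whose second is noise.

```python
import random

def loo_1nn_accuracy(data, labels, mask):
    """Leave-one-out 1-nearest-neighbour accuracy on the selected features."""
    idx = [j for j, m in enumerate(mask) if m]
    correct = 0
    for i, x in enumerate(data):
        best_d, best_lab = float("inf"), None
        for k, y in enumerate(data):
            if k == i:
                continue
            d = sum((x[j] - y[j]) ** 2 for j in idx)
            if d < best_d:
                best_d, best_lab = d, labels[k]
        correct += best_lab == labels[i]
    return correct / len(data)

def swarm_search(data, labels, n_features, fitness, n_agents=8, n_iter=30, seed=0):
    """Stochastic wrapper feature selection: a population of candidate
    bit masks drifts toward the best subset seen so far, with random
    mutation -- a minimal stand-in for a plugged-in metaheuristic."""
    rng = random.Random(seed)
    agents = [[rng.random() < 0.5 for _ in range(n_features)] for _ in range(n_agents)]
    best_mask, best_fit = None, float("-inf")
    for _ in range(n_iter):
        for mask in agents:
            if not any(mask):                      # never evaluate an empty subset
                mask[rng.randrange(n_features)] = True
            fit = fitness(data, labels, mask)
            if fit > best_fit:
                best_mask, best_fit = list(mask), fit
        for mask in agents:                        # move toward the best mask
            for j in range(n_features):
                if rng.random() < 0.3:
                    mask[j] = best_mask[j]
                elif rng.random() < 0.1:
                    mask[j] = not mask[j]
    return best_mask, best_fit

data = [[0.0, 5.0], [0.1, 1.0], [0.2, 9.0], [1.0, 2.0], [1.1, 8.0], [0.9, 0.0]]
labels = [0, 0, 0, 1, 1, 1]
best_mask, best_fit = swarm_search(data, labels, 2, loo_1nn_accuracy)
print(best_mask, best_fit)
```

The fitness argument is the point of the design: any classifier and any metaheuristic can be swapped in, exactly the flexibility the abstract highlights.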
Using High-Dimensional Image Models to Perform Highly Undetectable Steganography
Pevný, Tomáš; Filler, Tomáš; Bas, Patrick
This paper presents a complete methodology for designing practical and highly undetectable stegosystems for real digital media. The main design principle is to minimize a suitably defined distortion by means of an efficient coding algorithm. The distortion is defined as a weighted difference of extended state-of-the-art feature vectors already used in steganalysis. This allows us to "preserve" the model used by the steganalyst and thus remain undetectable even for large payloads. This framework can be efficiently implemented even when the dimensionality of the feature set used by the embedder is larger than 10^7. The high-dimensional model is necessary to avoid known security weaknesses. Although high-dimensional models can be a problem in steganalysis, we explain why they are acceptable in steganography. As an example, we introduce HUGO, a new embedding algorithm for spatial-domain digital images, and we contrast its performance with LSB matching. On the BOWS2 image database, HUGO allows the embedder to hide a 7× longer message than LSB matching at the same level of security.
A New Ensemble Method with Feature Space Partitioning for High-Dimensional Data Classification
Directory of Open Access Journals (Sweden)
Yongjun Piao
2015-01-01
Full Text Available Ensemble data mining methods, also known as classifier combination, are often used to improve the performance of classification. Various classifier combination methods such as bagging, boosting, and random forest have been devised and have received considerable attention in the past. However, data dimensionality is increasing rapidly, and these methods are not directly applicable to high-dimensional datasets. In this paper, we propose an ensemble method for classification of high-dimensional data, with each classifier constructed from a different set of features determined by a partitioning of the redundant features. In our method, the redundancy of features is used to divide the original feature space. Then, each generated feature subset is trained by a support vector machine, and the results of each classifier are combined by majority voting. The efficiency and effectiveness of our method are demonstrated through comparisons with other ensemble techniques, and the results show that our method outperforms the other methods.
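The partition-then-vote pipeline can be sketched as follows; this is an assumed simplification in which a round-robin split replaces the paper's redundancy-based partitioning and a nearest-centroid classifier replaces the per-subset SVMs.

```python
def partition_features(n_features, n_groups):
    """Round-robin split of feature indices (the paper instead groups
    redundant features together; this split is purely illustrative)."""
    groups = [[] for _ in range(n_groups)]
    for j in range(n_features):
        groups[j % n_groups].append(j)
    return groups

def centroid_classifier(train, labels, feats):
    """Nearest-centroid classifier restricted to a feature subset
    (a lightweight stand-in for the per-subset SVMs)."""
    by_label = {}
    for x, y in zip(train, labels):
        by_label.setdefault(y, []).append(x)
    cents = {y: [sum(x[j] for x in xs) / len(xs) for j in feats]
             for y, xs in by_label.items()}
    def predict(x):
        return min(cents, key=lambda y: sum((x[feats[i]] - c) ** 2
                                            for i, c in enumerate(cents[y])))
    return predict

def ensemble_predict(train, labels, test_point, n_groups=3):
    """Majority vote over classifiers trained on disjoint feature subsets."""
    votes = [centroid_classifier(train, labels, feats)(test_point)
             for feats in partition_features(len(train[0]), n_groups)]
    return max(set(votes), key=votes.count)

train = [[0.0] * 6, [1.0] * 6]
labels = ["a", "b"]
print(ensemble_predict(train, labels, [0.1] * 6))  # "a": all three sub-classifiers agree
```

Because each base classifier sees a disjoint slice of the feature space, no single classifier faces the full dimensionality, which is the point of the method.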
EigenPrism: inference for high dimensional signal-to-noise ratios.
Janson, Lucas; Barber, Rina Foygel; Candès, Emmanuel
2017-09-01
Consider the following three important problems in statistical inference, namely, constructing confidence intervals for (1) the error of a high-dimensional (p > n) regression estimator, (2) the linear regression noise level, and (3) the genetic signal-to-noise ratio of a continuous-valued trait (related to the heritability). All three problems turn out to be closely related to the little-studied problem of performing inference on the [Formula: see text]-norm of the signal in high-dimensional linear regression. We derive a novel procedure for this, which is asymptotically correct when the covariates are multivariate Gaussian and produces valid confidence intervals in finite samples as well. The procedure, called EigenPrism, is computationally fast and makes no assumptions on coefficient sparsity or knowledge of the noise level. We investigate the width of the EigenPrism confidence intervals, including a comparison with a Bayesian setting in which our interval is just 5% wider than the Bayes credible interval. We are then able to unify the three aforementioned problems by showing that the EigenPrism procedure with only minor modifications is able to make important contributions to all three. We also investigate the robustness of coverage and find that the method applies in practice and in finite samples much more widely than just the case of multivariate Gaussian covariates. Finally, we apply EigenPrism to a genetic dataset to estimate the genetic signal-to-noise ratio for a number of continuous phenotypes.
An Efficient High Dimensional Cluster Method and its Application in Global Climate Sets
Directory of Open Access Journals (Sweden)
Ke Li
2007-10-01
Full Text Available Because of the development of modern-day satellites and other data acquisition systems, global climate research often involves overwhelming volume and complexity of high dimensional datasets. As a data preprocessing and analysis method, the clustering method is playing an increasingly important role in this research. In this paper, we propose a spatial clustering algorithm that, to some extent, alleviates the curse of dimensionality in high dimensional clustering. The similarity measure of our algorithm is based on the number of top-k nearest neighbors that two grids share. The neighbors of each grid are computed based on the time series associated with each grid, and computing the nearest neighbors of an object is the most time-consuming step. According to Tobler's "First Law of Geography," we add a spatial window constraint upon each grid to restrict the number of grids considered, which greatly improves the efficiency of our algorithm. We apply this algorithm to a 100-year global climate dataset and partition the global surface into subareas under various spatial granularities. Experiments indicate that our spatial clustering algorithm works well.
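The shared-nearest-neighbour similarity with a spatial window can be sketched in a few lines; the grid coordinates and time series below are invented, and the real algorithm of course runs on far larger grids.

```python
def top_k_neighbors(series, grids, k, center, window):
    """Indices of the k grids whose time series are most similar to
    `center`, restricted to a spatial window around it (Tobler's law)."""
    def dist(a, b):
        return sum((u - v) ** 2 for u, v in zip(series[a], series[b]))
    candidates = [g for g in grids
                  if g != center
                  and abs(g[0] - center[0]) <= window
                  and abs(g[1] - center[1]) <= window]
    return set(sorted(candidates, key=lambda g: dist(g, center))[:k])

def snn_similarity(series, grids, k, g1, g2, window):
    """Shared-nearest-neighbour similarity: the number of top-k
    neighbours the two grids have in common."""
    return len(top_k_neighbors(series, grids, k, g1, window)
               & top_k_neighbors(series, grids, k, g2, window))

# Toy example: two pairs of grids with similar time series.
series = {(0, 0): [1.0, 1.0], (0, 1): [1.1, 1.0],
          (1, 0): [5.0, 5.0], (1, 1): [5.1, 5.0]}
grids = list(series)
print(snn_similarity(series, grids, 2, (0, 0), (0, 1), 2))  # 1
```

The window constraint is what makes the neighbour search tractable: only nearby grids are ever compared, trading a little similarity coverage for a large speedup.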
High-Dimensional Single-Photon Quantum Gates: Concepts and Experiments
Babazadeh, Amin; Erhard, Manuel; Wang, Feiran; Malik, Mehul; Nouroozi, Rahman; Krenn, Mario; Zeilinger, Anton
2017-11-01
Transformations on quantum states form a basic building block of every quantum information system. From photonic polarization to two-level atoms, complete sets of quantum gates for a variety of qubit systems are well known. For multilevel quantum systems beyond qubits, the situation is more challenging. The orbital angular momentum modes of photons comprise one such high-dimensional system for which generation and measurement techniques are well studied. However, arbitrary transformations for such quantum states are not known. Here we experimentally demonstrate a four-dimensional generalization of the Pauli X gate and all of its integer powers on single photons carrying orbital angular momentum. Together with the well-known Z gate, this forms the first complete set of high-dimensional quantum gates implemented experimentally. The concept of the X gate is based on independent access to quantum states with different parities and can thus be generalized to other photonic degrees of freedom and potentially also to other quantum systems.
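The generalized X gate described here acts as a cyclic shift on the d computational basis states, X|j⟩ = |j+1 mod d⟩. The sketch below writes it and its integer powers as plain permutation matrices (classical matrices only, not a simulation of the optical experiment).

```python
def x_gate(d):
    """d-dimensional generalization of the Pauli X gate: the cyclic
    shift X|j> = |j+1 mod d>, as a d x d permutation matrix."""
    return [[1 if r == (c + 1) % d else 0 for c in range(d)] for r in range(d)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def gate_power(d, n):
    """Integer powers X^n, as demonstrated for d = 4 in the experiment."""
    m = [[int(i == j) for j in range(d)] for i in range(d)]
    for _ in range(n):
        m = matmul(m, x_gate(d))
    return m

# X^4 on a four-level (ququart) system returns to the identity.
ident = [[int(i == j) for j in range(4)] for i in range(4)]
print(gate_power(4, 4) == ident)  # True
```

Together with the diagonal Z gate (phases that are d-th roots of unity), these powers generate the full set of generalized Pauli operators on a d-level system.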
High-dimensional quantum key distribution with the entangled single-photon-added coherent state
Energy Technology Data Exchange (ETDEWEB)
Wang, Yang [Zhengzhou Information Science and Technology Institute, Zhengzhou, 450001 (China); Synergetic Innovation Center of Quantum Information and Quantum Physics, University of Science and Technology of China, Hefei, Anhui 230026 (China); Bao, Wan-Su, E-mail: 2010thzz@sina.com [Zhengzhou Information Science and Technology Institute, Zhengzhou, 450001 (China); Synergetic Innovation Center of Quantum Information and Quantum Physics, University of Science and Technology of China, Hefei, Anhui 230026 (China); Bao, Hai-Ze; Zhou, Chun; Jiang, Mu-Sheng; Li, Hong-Wei [Zhengzhou Information Science and Technology Institute, Zhengzhou, 450001 (China); Synergetic Innovation Center of Quantum Information and Quantum Physics, University of Science and Technology of China, Hefei, Anhui 230026 (China)
2017-04-25
High-dimensional quantum key distribution (HD-QKD) can generate more secure bits per detection event, so it can achieve long-distance key distribution with a high secret key capacity. In this Letter, we present a decoy state HD-QKD scheme with the entangled single-photon-added coherent state (ESPACS) source. We present two tight formulas to estimate the single-photon fraction of postselected events and Eve's Holevo information, and derive lower bounds on the secret key capacity and the secret key rate of our protocol. We also present a finite-key analysis for our protocol using the Chernoff bound. Our numerical results show that our protocol using one decoy state performs better than the previous HD-QKD protocol based on spontaneous parametric down-conversion (SPDC) using two decoy states. Moreover, when considering finite resources, the advantage is more obvious. - Highlights: • Incorporate the single-photon-added coherent state source into high-dimensional quantum key distribution. • Enhance both the secret key capacity and the secret key rate compared with previous schemes. • Show an excellent performance in view of statistical fluctuations.
Energy Efficient MAC Scheme for Wireless Sensor Networks with High-Dimensional Data Aggregate
Directory of Open Access Journals (Sweden)
Seokhoon Kim
2015-01-01
Full Text Available This paper presents a novel and sustainable medium access control (MAC) scheme for wireless sensor network (WSN) systems that process high-dimensional aggregated data. Based on a preamble signal and buffer threshold analysis, it maximizes the energy efficiency of the wireless sensor devices, which have limited energy resources. The proposed group management MAC (GM-MAC) approach not only sets the buffer threshold value of a sensor device to be reciprocal to the preamble signal but also sets a transmittable group value to each sensor device by using the preamble signal of the sink node. The primary difference between the previous and the proposed approach is that existing state-of-the-art schemes use duty cycle and sleep mode to save energy consumption of individual sensor devices, whereas the proposed scheme employs the group management MAC scheme for sensor devices to maximize the overall energy efficiency of the whole WSN system by minimizing the energy consumption of sensor devices located near the sink node. Performance evaluations show that the proposed scheme outperforms the previous schemes in terms of active time of sensor devices, transmission delay, control overhead, and energy consumption. Therefore, the proposed scheme is suitable for sensor devices in a variety of wireless sensor networking environments with high-dimensional data aggregate.
Directory of Open Access Journals (Sweden)
Sergio Rojas-Galeano
2008-03-01
Full Text Available The analysis of complex proteomic and genomic profiles involves the identification of significant markers within a set of hundreds or even thousands of variables that represent a high-dimensional problem space. The occurrence of noise, redundancy, or combinatorial interactions in the profile makes the selection of relevant variables harder. Here we propose a method to select variables based on estimated relevance to hidden patterns. Our method combines a weighted-kernel discriminant with an iterative stochastic probability estimation algorithm to discover the relevance distribution over the set of variables. We verified the ability of our method to select predefined relevant variables in synthetic proteome-like data and then assessed its performance on biological high-dimensional problems. Experiments were run on serum proteomic datasets of infectious diseases. The resulting variable subsets achieved classification accuracies of 99% on Human African Trypanosomiasis, 91% on Tuberculosis, and 91% on Malaria serum proteomic profiles with fewer than 20% of variables selected. Our method scaled up to dimensionalities of much higher orders of magnitude, as shown with gene expression microarray datasets on which we obtained classification accuracies close to 90% with fewer than 1% of the total number of variables. Our method consistently found relevant variables attaining high classification accuracies across synthetic and biological datasets. Notably, it yielded very compact subsets compared to the original number of variables, which should simplify downstream biological experimentation.
The Subspace Voyager: Exploring High-Dimensional Data along a Continuum of Salient 3D Subspaces.
Wang, Bing; Mueller, Klaus
2018-02-01
Analyzing high-dimensional data and finding hidden patterns is a difficult problem and has attracted numerous research efforts. Automated methods can be useful to some extent but bringing the data analyst into the loop via interactive visual tools can help the discovery process tremendously. An inherent problem in this effort is that humans lack the mental capacity to truly understand spaces exceeding three spatial dimensions. To keep within this limitation, we describe a framework that decomposes a high-dimensional data space into a continuum of generalized 3D subspaces. Analysts can then explore these 3D subspaces individually via the familiar trackball interface while using additional facilities to smoothly transition to adjacent subspaces for expanded space comprehension. Since the number of such subspaces suffers from combinatorial explosion, we provide a set of data-driven subspace selection and navigation tools which can guide users to interesting subspaces and views. A subspace trail map allows users to manage the explored subspaces, keep their bearings, and return to interesting subspaces and views. Both trackball and trail map are each embedded into a word cloud of attribute labels which aid in navigation. We demonstrate our system via several use cases in a diverse set of application areas: cluster analysis and refinement, information discovery, and supervised training of classifiers. We also report on a user study that evaluates the usability of the various interactions our system provides.
Pang, Herbert; Jung, Sin-Ho
2013-01-01
A variety of prediction methods are used to relate high-dimensional genome data with a clinical outcome using a prediction model. Once a prediction model is developed from a data set, it should be validated using a resampling method or an independent data set. Although the existing prediction methods have been intensively evaluated by many investigators, there has not been a comprehensive study investigating the performance of the validation methods, especially with a survival clinical outcome. Understanding the properties of the various validation methods can allow researchers to perform more powerful validations while controlling for type I error. In addition, a sample size calculation strategy based on these validation methods is lacking. We conduct extensive simulations to examine the statistical properties of these validation strategies. In both simulations and a real data example, we have found that 10-fold cross-validation with permutation gave the best power while controlling type I error close to the nominal level. Based on this, we have also developed a sample size calculation method that can be used to design a validation study with a user-chosen combination of prediction methods. Microarray and genome-wide association studies data are used as illustrations. The power calculation method in this presentation can be used for the design of any biomedical studies involving high-dimensional data and survival outcomes. PMID:23471879
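The permutation component of the recommended validation strategy can be sketched generically; this hedged example uses a made-up classification score in place of the paper's survival-outcome setting, and simply asks how often shuffled outcomes reproduce the observed validated score.

```python
import random

def permutation_pvalue(score_fn, data, labels, n_perm=200, seed=0):
    """Permutation check of a validated prediction score: re-run the
    validation with shuffled outcomes to estimate how often chance
    alone matches the observed score (guards against optimism and
    helps control type I error)."""
    rng = random.Random(seed)
    observed = score_fn(data, labels)
    exceed = 0
    for _ in range(n_perm):
        shuffled = labels[:]
        rng.shuffle(shuffled)
        if score_fn(data, shuffled) >= observed:
            exceed += 1
    return (exceed + 1) / (n_perm + 1)   # add-one to avoid a zero p-value

def sign_score(data, labels):
    """Toy stand-in for a cross-validated score: fraction of points
    whose feature sign matches the binary label."""
    return sum((x[0] > 0) == bool(y) for x, y in zip(data, labels)) / len(data)

data = [[-1.0], [-2.0], [1.0], [2.0]]
labels = [0, 0, 1, 1]
p = permutation_pvalue(sign_score, data, labels)
print(p)  # roughly 1/6 here: only exact relabelings reproduce the perfect score
```

In the paper's setting `score_fn` would be the full 10-fold cross-validation of a survival prediction model; the shuffling logic is unchanged.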
Yue, Mu; Li, Jialiang
2017-05-18
Motivated by risk prediction studies with ultra-high dimensional biomarkers, we propose a novel improvement screening methodology. Accurate risk prediction can be quite useful for patient treatment selection, prevention strategy, or disease management in evidence-based medicine. The question of how to choose new markers in addition to the conventional ones is especially important. In the past decade, a number of new measures for quantifying the added value of new markers were proposed, among which the integrated discrimination improvement (IDI) and net reclassification improvement (NRI) stand out. Meanwhile, C-statistics are routinely used to quantify the capacity of the estimated risk score in discriminating among subjects with different event times. In this paper, we examine these improvement statistics as well as the norm-based approach for evaluating the incremental values of new markers, and compare these four measures by analyzing ultra-high dimensional censored survival data. In particular, we consider Cox proportional hazards models with varying coefficients. All measures perform very well in simulations, and we illustrate our methods in an application to a lung cancer study.
Sauerbrei, Willi; Boulesteix, Anne-Laure; Binder, Harald
2011-11-01
Multivariable regression models can link a potentially large number of variables to various kinds of outcomes, such as continuous, binary, or time-to-event endpoints. Selection of important variables and selection of the functional form for continuous covariates are key parts of building such models but are notoriously difficult for several reasons. Owing to multicollinearity between predictors and a limited amount of information in the data, the (in)stability of selected models can be a serious issue. For applications with a moderate number of variables, resampling-based techniques have been developed for diagnosing and improving multivariable regression models. Deriving models for high-dimensional molecular data has led to the need for adapting these techniques to settings where the number of variables is much larger than the number of observations. Three studies with a time-to-event outcome, of which one has high-dimensional data, are used to illustrate several techniques. Investigations at the covariate level and at the predictor level are seen to provide considerable insight into model stability and performance. While some areas are indicated where resampling techniques for model building still need further refinement, our case studies illustrate that these techniques can already be recommended for wider use.
Hierarchical classification of microorganisms based on high-dimensional phenotypic data.
Tafintseva, Valeria; Vigneau, Evelyne; Shapaval, Volha; Cariou, Véronique; Qannari, El Mostafa; Kohler, Achim
2017-11-09
The classification of microorganisms by high-dimensional phenotyping methods such as FTIR spectroscopy is often a complicated process due to the complexity of microbial phylogenetic taxonomy. A hierarchical structure developed for such data can often facilitate the classification analysis. The hierarchical tree structure can either be imposed on a given set of phenotypic data by integrating the phylogenetic taxonomic structure or set up by revealing the inherent clusters in the phenotypic data. In this study, we wanted to compare different approaches to hierarchical classification of microorganisms based on high-dimensional phenotypic data. A set of 19 different species of moulds (filamentous fungi) obtained from the mycological strain collection of the Norwegian Veterinary Institute (Oslo, Norway) is used for the study. Hierarchical cluster analysis is performed for setting up the classification trees. Classification algorithms such as Artificial Neural Networks (ANN), Partial Least Squares Discriminant Analysis (PLSDA), and Random Forest (RF) are used and compared. The two methods ANN and RF outperformed all the other approaches even though they did not utilize the predefined hierarchical structure. To our knowledge, the Random Forest approach is used here for the first time to classify microorganisms by FTIR spectroscopy. This article is protected by copyright. All rights reserved.
Quality metrics in high-dimensional data visualization: an overview and systematization.
Bertini, Enrico; Tatu, Andrada; Keim, Daniel
2011-12-01
In this paper, we present a systematization of techniques that use quality metrics to help in the visual exploration of meaningful patterns in high-dimensional data. In a number of recent papers, different quality metrics are proposed to automate the demanding search through large spaces of alternative visualizations (e.g., alternative projections or ordering), allowing the user to concentrate on the most promising visualizations suggested by the quality metrics. Over the last decade, this approach has witnessed a remarkable development but few reflections exist on how these methods are related to each other and how the approach can be developed further. For this purpose, we provide an overview of approaches that use quality metrics in high-dimensional data visualization and propose a systematization based on a thorough literature review. We carefully analyze the papers and derive a set of factors for discriminating the quality metrics, visualization techniques, and the process itself. The process is described through a reworked version of the well-known information visualization pipeline. We demonstrate the usefulness of our model by applying it to several existing approaches that use quality metrics, and we provide reflections on implications of our model for future research. © 2010 IEEE
Prediction of Incident Diabetes in the Jackson Heart Study Using High-Dimensional Machine Learning.
Directory of Open Access Journals (Sweden)
Ramon Casanova
Full Text Available Statistical models to predict incident diabetes are often based on limited variables. Here we pursued two main goals: (1) investigate the relative performance of a machine learning method such as Random Forests (RF) for detecting incident diabetes in a high-dimensional setting defined by a large set of observational data, and (2) uncover potential predictors of diabetes. The Jackson Heart Study collected data at baseline and in two follow-up visits from 5,301 African Americans. We excluded those with baseline diabetes and no follow-up, leaving 3,633 individuals for analyses. Over a mean 8-year follow-up, 584 participants developed diabetes. The full RF model evaluated 93 variables including demographic, anthropometric, blood biomarker, medical history, and echocardiogram data. We also used RF metrics of variable importance to rank variables according to their contribution to diabetes prediction. We implemented other models based on logistic regression and RF where features were preselected. The RF full model performance was similar (AUC = 0.82) to that of the more parsimonious models. The top-ranked variables according to RF included hemoglobin A1C, fasting plasma glucose, waist circumference, adiponectin, c-reactive protein, triglycerides, leptin, left ventricular mass, high-density lipoprotein cholesterol, and aldosterone. This work shows the potential of RF for incident diabetes prediction while dealing with high-dimensional data.
Multiple Group Testing Procedures for Analysis of High-Dimensional Genomic Data
Directory of Open Access Journals (Sweden)
Hyoseok Ko
2016-12-01
Full Text Available In genetic association studies with high-dimensional genomic data, multiple group testing procedures are often required in order to identify disease/trait-related genes or genetic regions, where multiple genetic sites or variants are located within the same gene or genetic region. However, statistical testing procedures based on individual tests suffer from multiple testing issues such as control of the family-wise error rate and dependence among tests. Moreover, detecting only a few genes associated with a phenotype outcome among tens of thousands of genes is of main interest in genetic association studies. For this reason, regularization procedures, in which a phenotype outcome is regressed on all genomic markers and the regression coefficients are then estimated from a penalized likelihood, have been considered a good alternative approach for the analysis of high-dimensional genomic data. However, the selection performance of regularization procedures has rarely been compared with that of statistical group testing procedures. In this article, we performed extensive simulation studies in which commonly used group testing procedures such as principal component analysis, Hotelling's T2 test, and the permutation test are compared with the group lasso (least absolute shrinkage and selection operator) in terms of true positive selection. We also applied all methods considered in the simulation studies to identify genes associated with ovarian cancer from over 20,000 genetic sites generated by the Illumina Infinium HumanMethylation27K BeadChip. We found a large discrepancy between the genes selected by the multiple group testing procedures and those selected by the group lasso.
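The "regularization as selection" idea contrasted with group testing above can be illustrated on synthetic data. scikit-learn has no group lasso, so this sketch uses the plain Lasso as a stand-in; the marker count, signal strength, and penalty are all invented for illustration.

```python
# Penalized regression as variable selection in the p >> n regime.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(1)
n, p = 200, 1000                      # far more markers than samples
X = rng.normal(size=(n, p))
beta = np.zeros(p)
beta[:5] = 2.0                        # only the first 5 markers are causal
y = X @ beta + rng.normal(size=n)

lasso = Lasso(alpha=0.2).fit(X, y)
selected = np.flatnonzero(lasso.coef_)        # markers with nonzero coefficients
true_pos = len(set(selected) & set(range(5)))
print(f"selected {selected.size} markers, {true_pos}/5 true positives")
```

The penalty drives most coefficients exactly to zero, so "selection" falls out of a single joint fit rather than thousands of individual tests, which is the alternative the abstract evaluates.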
The Antitriangular Factorization of Saddle Point Matrices
Pestana, J.
2014-01-01
Mastronardi and Van Dooren [SIAM J. Matrix Anal. Appl., 34 (2013), pp. 173-196] recently introduced the block antitriangular ("Batman") decomposition for symmetric indefinite matrices. Here we show the simplification of this factorization for saddle point matrices and demonstrate how it represents the common nullspace method. We show that rank-1 updates to the saddle point matrix can be easily incorporated into the factorization and give bounds on the eigenvalues of matrices important in saddle point theory. We show the relation of this factorization to constraint preconditioning and how it transforms but preserves the structure of block diagonal and block triangular preconditioners. © 2014 Society for Industrial and Applied Mathematics.
Revisiting the texture zero neutrino mass matrices
Singh, Madan; Ahuja, Gulsheen; Gupta, Manmohan
2016-12-01
In the light of refined and large measurements of the reactor mixing angle θ, we have revisited the texture three- and two-zero neutrino mass matrices in the flavor basis. For Majorana neutrinos, it has been explicitly shown that all the texture three-zero mass matrices remain ruled out. Further, for both normal and inverted mass ordering, for the texture two-zero neutrino mass matrices one finds interesting constraints on the Dirac-like CP-violating phase δ and Majorana phases ρ and σ.
High-Dimensional Analysis of Convex Optimization-Based Massive MIMO Decoders
Ben Atitallah, Ismail
2017-04-01
A wide range of modern large-scale systems relies on recovering a signal from noisy linear measurements. In many applications, the useful signal has inherent properties, such as sparsity, low-rankness, or boundedness, and making use of these properties and structures allows a more efficient recovery. Hence, a significant amount of work has been dedicated to developing and analyzing algorithms that can take advantage of the signal structure. In particular, since the advent of Compressed Sensing (CS) there has been significant progress in this direction. Generally speaking, the signal structure can be harnessed by solving an appropriate regularized or constrained M-estimator. In modern Multi-input Multi-output (MIMO) communication systems, all transmitted signals are drawn from finite constellations and are thus bounded. Moreover, recent modulation schemes such as Generalized Space Shift Keying (GSSK) or Generalized Spatial Modulation (GSM) yield signals that are inherently sparse. In the recovery procedure, sparsity and boundedness can be promoted by using ℓ1-norm regularization and by imposing an ℓ∞-norm constraint, respectively. In this thesis, we propose novel optimization algorithms to recover certain classes of structured signals, with emphasis on MIMO communication systems. The exact analysis permits a clear characterization of how well these systems perform and allows an automatic tuning of the parameters. In each context, we define the appropriate performance metrics and analyze them exactly in the High Dimensional Regime (HDR). The framework we use for the analysis is based on Gaussian process inequalities; in particular, on a new strong and tight version of a classical comparison inequality (due to Gordon, 1988) in the presence of additional convexity assumptions. The new framework that emerged from this inequality is coined the Convex Gaussian Min-max Theorem (CGMT).
High dimensional biological data retrieval optimization with NoSQL technology.
Wang, Shicai; Pandis, Ioannis; Wu, Chao; He, Sijin; Johnson, David; Emam, Ibrahim; Guitton, Florian; Guo, Yike
2014-01-01
High-throughput transcriptomic data generated by microarray experiments is the most abundant and frequently stored kind of data currently used in translational medicine studies. Although microarray data is supported in data warehouses such as tranSMART, queries that retrieve hundreds of different patient gene expression records from relational databases perform poorly. Non-relational data models, such as the key-value model implemented in NoSQL databases, hold promise as more performant solutions. Our motivation is to improve the performance of the tranSMART data warehouse with a view to supporting Next Generation Sequencing data. In this paper we introduce a new data model better suited for high-dimensional data storage and querying, optimized for database scalability and performance. We have designed a key-value pair data model to support faster queries over large-scale microarray data and implemented the model using HBase, an implementation of Google's BigTable storage system. An experimental performance comparison was carried out against the traditional relational data model implemented in both MySQL Cluster and MongoDB, using a large publicly available transcriptomic data set taken from NCBI GEO concerning Multiple Myeloma. Our new key-value data model implemented on HBase exhibits an average 5.24-fold increase in high-dimensional biological data query performance compared to the relational model implemented on MySQL Cluster, and an average 6.47-fold increase over MongoDB. The performance evaluation found that the new key-value data model, in particular its implementation in HBase, outperforms the relational model currently implemented in tranSMART. We propose that NoSQL technology holds great promise for large-scale data management, in particular for high-dimensional biological data such as that demonstrated in the performance evaluation described in this paper. We aim to use this new data model as a basis for migrating
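The essence of the key-value design the abstract argues for is a composite row key per (patient, gene) pair, so a patient's expression values come back from a cheap prefix scan instead of a relational join. The in-memory dict below only mimics that access pattern (a real HBase table is key-sorted, which is what makes prefix scans fast); the key format is a hypothetical illustration, not the paper's schema.

```python
# Toy stand-in for an HBase table with composite row keys.
table = {}

def put(patient_id, gene, value):
    table[f"{patient_id}|{gene}"] = value     # composite row key

def scan_patient(patient_id):
    """Fetch all expression values for one patient by key prefix."""
    prefix = f"{patient_id}|"
    return {k.split("|", 1)[1]: v
            for k, v in table.items() if k.startswith(prefix)}

put("P001", "TP53", 7.2)
put("P001", "BRCA1", 3.4)
put("P002", "TP53", 6.8)
print(scan_patient("P001"))    # {'TP53': 7.2, 'BRCA1': 3.4}
```

Placing the access-pattern key (patient) first in the composite key is the design choice that turns "all genes for one patient" into a single contiguous range of rows.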
Decorrelation of the True and Estimated Classifier Errors in High-Dimensional Settings
Directory of Open Access Journals (Sweden)
Hua Jianping
2007-01-01
Full Text Available The aim of many microarray experiments is to build discriminatory diagnosis and prognosis models. Given the huge number of features and the small number of examples, model validity, which refers to the precision of error estimation, is a critical issue. Previous studies have addressed this issue via the deviation distribution (estimated error minus true error), in particular the deterioration of cross-validation precision in high-dimensional settings where feature selection is used to mitigate the peaking phenomenon (overfitting). Because classifier design is based upon random samples, both the true and estimated errors are sample-dependent random variables, and one would expect a loss of precision if the estimated and true errors are not well correlated; natural questions thus arise as to the degree of correlation and the manner in which lack of correlation impacts error estimation. We demonstrate the effect of correlation on error precision via a decomposition of the variance of the deviation distribution, observe that the correlation is often severely decreased in high-dimensional settings, and show that the effect of high dimensionality on error estimation tends to result more from its decorrelating effects than from its impact on the variance of the estimated error. We consider the correlation between the true and estimated errors under different experimental conditions using both synthetic and real data, several feature-selection methods, different classification rules, and three commonly used error estimators (leave-one-out cross-validation, k-fold cross-validation, and the .632 bootstrap). Moreover, three scenarios are considered: (1) feature selection, (2) a known feature set, and (3) all features. Only the first is of practical interest; however, the other two are needed for comparison purposes. We will observe that the true and estimated errors tend to be much more correlated in the case of a known feature set than with either feature selection
A Euclidean algorithm for integer matrices
DEFF Research Database (Denmark)
Lauritzen, Niels; Thomsen, Jesper Funch
2015-01-01
We present a Euclidean algorithm for computing a greatest common right divisor of two integer matrices. The algorithm is derived from elementary properties of finitely generated modules over the ring of integers.
An unsupervised feature extraction method for high dimensional image data compaction
Ghassemian, Hassan; Landgrebe, David
1987-01-01
A new on-line unsupervised feature extraction method for high-dimensional remotely sensed image data compaction is presented. This method can be utilized to solve the problem of data redundancy in scene representation by satellite-borne high resolution multispectral sensors. The algorithm first partitions the observation space into an exhaustive set of disjoint objects. Then, pixels that belong to an object are characterized by an object feature. Finally, the set of object features is used for data transmission and classification. Example results show that the compacted features yield a slight improvement in classification accuracy rather than any degradation. Also, the information extraction method does not need to be preceded by a data decompaction step.
Inference for feature selection using the Lasso with high-dimensional data
DEFF Research Database (Denmark)
Brink-Jensen, Kasper; Ekstrøm, Claus Thorn
2014-01-01
Penalized regression models such as the Lasso have proved useful for variable selection in many fields, especially in situations with high-dimensional data where the number of predictors far exceeds the number of observations. These methods identify and rank variables of importance but do not generally provide any inference for the selected variables. Thus, the variables selected might be the "most important" but need not be significant. We propose a significance test for the selection found by the Lasso. We introduce a procedure that computes inference and p-values for features chosen by the Lasso, evaluated in simulation studies that involve various effect strengths and correlation between predictors. The algorithm is also applied to a prostate cancer dataset that has been analyzed in recent papers on the subject. The proposed method is found to provide a powerful way to make inference for feature selection even for small samples.
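The gap this abstract targets (selection without significance) can be illustrated by a crude permutation scheme: select with the Lasso, then compare a feature's coefficient against its distribution under permuted responses. This is a generic illustration of the idea, not the authors' actual test; all sizes and the penalty are invented.

```python
# Permutation-based "significance" for a Lasso-selected feature (illustrative).
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(2)
n, p = 100, 50
X = rng.normal(size=(n, p))
y = 3.0 * X[:, 0] + rng.normal(size=n)       # feature 0 is truly relevant

def lasso_coef(X, y, j, alpha=0.1):
    """Absolute Lasso coefficient of feature j."""
    return abs(Lasso(alpha=alpha).fit(X, y).coef_[j])

obs = lasso_coef(X, y, 0)
# null distribution: refit with the response randomly permuted
null = [lasso_coef(X, rng.permutation(y), 0) for _ in range(200)]
p_value = (1 + sum(c >= obs for c in null)) / 201
print(f"permutation p-value for feature 0: {p_value:.3f}")
```

A feature whose coefficient survives only because of overfitting would look unremarkable against the permutation null, which is the distinction between "selected" and "significant" that the abstract draws.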
Bilionis, Ilias; Gonzalez, Marcial
2016-01-01
The prohibitive cost of performing Uncertainty Quantification (UQ) tasks with a very large number of input parameters can be addressed, if the response exhibits some special structure that can be discovered and exploited. Several physical responses exhibit a special structure known as an active subspace (AS), a linear manifold of the stochastic space characterized by maximal response variation. The idea is that one should first identify this low dimensional manifold, project the high-dimensional input onto it, and then link the projection to the output. In this work, we develop a probabilistic version of AS which is gradient-free and robust to observational noise. Our approach relies on a novel Gaussian process regression with built-in dimensionality reduction with the AS represented as an orthogonal projection matrix that serves as yet another covariance function hyper-parameter to be estimated from the data. To train the model, we design a two-step maximum likelihood optimization procedure that ensures the ...
PyDREAM: High-dimensional parameter inference for biological models in Python.
Shockley, Erin M; Vrugt, Jasper A; Lopez, Carlos F
2017-10-04
Biological models contain many parameters whose values are difficult to measure directly via experimentation and therefore require calibration against experimental data. Markov chain Monte Carlo (MCMC) methods are suitable for estimating multivariate posterior model parameter distributions, but these methods may exhibit slow or premature convergence in high-dimensional search spaces. Here, we present PyDREAM, a Python implementation of the (Multiple-Try) Differential Evolution Adaptive Metropolis (DREAM(ZS)) algorithm developed by Vrugt and ter Braak (2008) and Laloy and Vrugt (2012). PyDREAM achieves excellent performance for complex, parameter-rich models and takes full advantage of distributed computing resources, facilitating parameter inference and uncertainty estimation of CPU-intensive biological models. PyDREAM is freely available under the GNU GPLv3 license from the Lopez lab GitHub repository at http://github.com/LoLab-VU/PyDREAM. c.lopez@vanderbilt.edu. Supplementary data are available at Bioinformatics online.
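PyDREAM's own API is not reproduced here; instead, a minimal random-walk Metropolis sampler illustrates the underlying task the abstract describes: drawing samples from a posterior over model parameters given data. DREAM(ZS) replaces the fixed-width proposal below with adaptive differential-evolution proposals, which is what makes it viable in high dimensions.

```python
# Minimal MCMC parameter inference: estimate the mean of noisy observations.
import numpy as np

rng = np.random.default_rng(3)
data = rng.normal(loc=2.0, scale=1.0, size=50)   # synthetic observations

def log_post(theta):
    """Flat prior + Gaussian likelihood (up to a constant)."""
    return -0.5 * np.sum((data - theta) ** 2)

theta, chain = 0.0, []
for _ in range(5000):
    prop = theta + 0.3 * rng.normal()            # random-walk proposal
    if np.log(rng.random()) < log_post(prop) - log_post(theta):
        theta = prop                             # Metropolis accept
    chain.append(theta)

posterior_mean = np.mean(chain[1000:])           # discard burn-in
print(f"posterior mean ≈ {posterior_mean:.2f}")
```

The chain concentrates around the sample mean of the data, and its spread quantifies parameter uncertainty, exactly the output a modeler calibrating a biological model needs.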
DEFF Research Database (Denmark)
Ding, Yunhong; Bacco, Davide; Dalgaard, Kjeld
2017-01-01
Quantum key distribution provides an efficient means to exchange information in an unconditionally secure way. Historically, quantum key distribution protocols have been based on binary signal formats, such as two polarization states, and the transmitted information efficiency of the quantum key is intrinsically limited to 1 bit/photon. Here we propose and experimentally demonstrate, for the first time, a high-dimensional quantum key distribution protocol based on space division multiplexing in multicore fiber using silicon photonic integrated lightwave circuits. We successfully realized three mutually... In addition, the silicon photonic circuits used in our work integrate variable optical attenuators, highly efficient multicore fiber couplers, and Mach-Zehnder interferometers, enabling the preparation of high-dimensional quantum states and breaking the information efficiency limit of traditional quantum key distribution protocols.
High-dimensional single-cell analysis reveals the immune signature of narcolepsy.
Hartmann, Felix J; Bernard-Valnet, Raphaël; Quériault, Clémence; Mrdjen, Dunja; Weber, Lukas M; Galli, Edoardo; Krieg, Carsten; Robinson, Mark D; Nguyen, Xuan-Hung; Dauvilliers, Yves; Liblau, Roland S; Becher, Burkhard
2016-11-14
Narcolepsy type 1 is a devastating neurological sleep disorder resulting from the destruction of orexin-producing neurons in the central nervous system (CNS). Despite its striking association with the HLA-DQB1*06:02 allele, the autoimmune etiology of narcolepsy has remained largely hypothetical. Here, we compared peripheral mononucleated cells from narcolepsy patients with HLA-DQB1*06:02-matched healthy controls using high-dimensional mass cytometry in combination with algorithm-guided data analysis. Narcolepsy patients displayed multifaceted immune activation in CD4+ and CD8+ T cells dominated by elevated levels of B cell-supporting cytokines. Additionally, T cells from narcolepsy patients showed increased production of the proinflammatory cytokines IL-2 and TNF. Although it remains to be established whether these changes are primary to an autoimmune process in narcolepsy or secondary to orexin deficiency, these findings are indicative of inflammatory processes in the pathogenesis of this enigmatic disease. © 2016 Hartmann et al.
High-Dimensional Disorder-Driven Phenomena in Weyl Semimetals, Semiconductors and Related Systems
Syzranov, S V
2016-01-01
It is commonly believed that a non-interacting disordered electronic system can undergo only the Anderson metal-insulator transition. It has been suggested, however, that a broad class of systems can display disorder-driven transitions distinct from Anderson localisation that have manifestations in the disorder-averaged density of states, conductivity and other observables. Such transitions have received particular attention in the context of recently discovered 3D Weyl and Dirac materials but have also been predicted in cold-atom systems with long-range interactions, quantum kicked rotors and all sufficiently high-dimensional systems. Moreover, such systems exhibit unconventional behaviour of Lifshitz tails, energy-level statistics and ballistic-transport properties. Here we review recent progress and the status of results on non-Anderson disorder-driven transitions and related phenomena.
The Effects of Feature Optimization on High-Dimensional Essay Data
Directory of Open Access Journals (Sweden)
Bong-Jun Yi
2015-01-01
Full Text Available Current machine learning (ML) based automated essay scoring (AES) systems employ large and varied sets of features, which have proven useful in improving the performance of AES. However, the high-dimensional feature space is not properly represented, owing to the large volume of features extracted from the limited training data. This problem gives rise to poor performance and increased training time for the system. In this paper, we experiment with and analyze the effects of feature optimization, including normalization, discretization, and feature selection techniques, for different ML algorithms, taking into consideration the size of the feature space and the performance of the AES. We show that appropriate feature optimization techniques can reduce the dimensionality of the feature space, contributing to efficient training and improved performance of AES.
Parabolic Theory as a High-Dimensional Limit of Elliptic Theory
Davey, Blair
2017-10-01
The aim of this article is to show how certain parabolic theorems follow from their elliptic counterparts. This technique is demonstrated through new proofs of five important theorems in parabolic unique continuation and the regularity theory of parabolic equations and geometric flows. Specifically, we give new proofs of an L² Carleman estimate for the heat operator, and the monotonicity formulas for the frequency function associated to the heat operator, the two-phase free boundary problem, the flow of harmonic maps, and the mean curvature flow. The proofs rely only on the underlying elliptic theorems and limiting procedures belonging essentially to probability theory. In particular, each parabolic theorem is proved by taking a high-dimensional limit of the related elliptic result.
Robust Hessian Locally Linear Embedding Techniques for High-Dimensional Data
Directory of Open Access Journals (Sweden)
Xianglei Xing
2016-05-01
Full Text Available Recently, manifold learning has received extensive interest in the pattern recognition community. Despite their appealing properties, most manifold learning algorithms are not robust in practical applications. In this paper, we address this problem in the context of the Hessian locally linear embedding (HLLE) algorithm and propose a more robust method, called RHLLE, which aims to be robust against both outliers and noise in the data. Specifically, we first propose a fast outlier detection method for high-dimensional datasets. Then, we employ a local smoothing method to reduce noise. Furthermore, we reformulate the original HLLE algorithm by using the truncation function from differentiable manifolds. In the reformulated framework, we explicitly introduce a weighted global functional to further reduce the undesirable effect of outliers and noise on the embedding result. Experiments on synthetic as well as real datasets demonstrate the effectiveness of our proposed algorithm.
Wang, Zhiping; Chen, Jinyu; Yu, Benli
2017-02-20
We investigate the two-dimensional (2D) and three-dimensional (3D) atom localization behaviors via spontaneously generated coherence in a microwave-driven four-level atomic system. Owing to the space-dependent atom-field interaction, it is found that the detecting probability and precision of 2D and 3D atom localization behaviors can be significantly improved via adjusting the system parameters, the phase, amplitude, and initial population distribution. Interestingly, the atom can be localized in volumes that are substantially smaller than a cubic optical wavelength. Our scheme opens a promising way to achieve high-precision and high-efficiency atom localization, which provides some potential applications in high-dimensional atom nanolithography.
GD-RDA: A New Regularized Discriminant Analysis for High-Dimensional Data.
Zhou, Yan; Zhang, Baoxue; Li, Gaorong; Tong, Tiejun; Wan, Xiang
2017-11-01
High-throughput techniques bring novel tools and also statistical challenges to genomic research. Identifying which type of disease a new patient has is recognized as an important problem. For high-dimensional, small-sample-size data, the classical discriminant methods suffer from the singularity problem and are therefore no longer applicable in practice. In this article, we propose a geometric diagonalization method for regularized discriminant analysis. We then consider a bias correction to further improve the proposed method. Simulation studies show that the proposed method performs better than, or at least as well as, the existing methods in a wide range of settings. A microarray dataset and an RNA-seq dataset are also analyzed, and they demonstrate the superiority of the proposed method over the existing competitors, especially when the number of samples is small or the number of genes is large. Finally, we have developed an R package called "GDRDA" which is available upon request.
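The singularity problem the abstract mentions is concrete: with more genes than samples, the pooled covariance matrix is not invertible, so classical discriminant analysis fails. Regularizing the covariance restores a usable classifier; the sketch below uses scikit-learn's shrinkage LDA as a generic stand-in, not the authors' GD-RDA, on invented data.

```python
# Shrinkage-regularized discriminant analysis in the p > n regime.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(4)
n, p = 40, 200                         # far fewer samples than "genes"
X = rng.normal(size=(n, p))
y = np.arange(n) % 2                   # two classes
X[y == 1, :5] += 2.0                   # class signal in the first 5 features

# lsqr + Ledoit-Wolf shrinkage avoids inverting the singular covariance
lda = LinearDiscriminantAnalysis(solver="lsqr", shrinkage="auto").fit(X, y)
acc = lda.score(X, y)
print(f"training accuracy: {acc:.2f}")
```

Without shrinkage (`solver="svd"` on the raw covariance would effectively pseudo-invert), the estimator in this regime is unstable; the shrinkage term is the regularization that papers such as this one refine.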
Rupp, Matthias; Schneider, Petra; Schneider, Gisbert
2009-11-15
Measuring the (dis)similarity of molecules is important for many cheminformatics applications like compound ranking, clustering, and property prediction. In this work, we focus on real-valued vector representations of molecules (as opposed to the binary spaces of fingerprints). We demonstrate the influence which the choice of (dis)similarity measure can have on results, and provide recommendations for such choices. We review the mathematical concepts used to measure (dis)similarity in vector spaces, namely norms, metrics, inner products, and similarity coefficients, as well as the relationships between them, employing (dis)similarity measures commonly used in cheminformatics as examples. We present several phenomena (the empty space phenomenon, sphere-volume-related phenomena, distance concentration) in high-dimensional descriptor spaces which are not encountered in two and three dimensions. These phenomena are theoretically characterized and illustrated on both artificial and real (bioactivity) data. © 2009 Wiley Periodicals, Inc.
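Distance concentration, one of the high-dimensional phenomena listed above, is easy to demonstrate numerically: as the dimensionality of a descriptor space grows, the relative gap between the nearest and farthest neighbour of a query point shrinks, undermining distance-based similarity ranking. The dimensions and sample size below are arbitrary.

```python
# Relative contrast (max - min)/min of distances to a query point.
import numpy as np

rng = np.random.default_rng(5)

def relative_contrast(d, n=500):
    """Contrast of Euclidean distances from point 0 to n-1 random points."""
    X = rng.random((n, d))                       # uniform points in [0, 1]^d
    dist = np.linalg.norm(X[1:] - X[0], axis=1)
    return (dist.max() - dist.min()) / dist.min()

low, high = relative_contrast(2), relative_contrast(1000)
print(f"relative contrast: d=2 -> {low:.2f}, d=1000 -> {high:.2f}")
```

In two dimensions nearest and farthest neighbours differ by orders of magnitude; in a thousand dimensions all distances cluster tightly around their mean, which is why the choice of (dis)similarity measure matters so much for high-dimensional descriptors.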
High-dimensional neural-network potentials for multicomponent systems: Applications to zinc oxide
Artrith, Nongnuch; Morawietz, Tobias; Behler, Jörg
2011-04-01
Artificial neural networks represent an accurate and efficient tool to construct high-dimensional potential-energy surfaces based on first-principles data. However, so far the main drawback of this method has been the limitation to a single atomic species. We present a generalization to compounds of arbitrary chemical composition, which now enables simulations of a wide range of systems containing large numbers of atoms. The required incorporation of long-range interactions is achieved by combining the numerical accuracy of neural networks with an electrostatic term based on environment-dependent charges. Using zinc oxide as a benchmark system we show that the neural network potential-energy surface is in excellent agreement with density-functional theory reference calculations, while the evaluation is many orders of magnitude faster.
High-dimensional neural network potentials for metal surfaces: A prototype study for copper
Artrith, Nongnuch; Behler, Jörg
2012-01-01
The atomic environments at metal surfaces differ strongly from the bulk, and, in particular, in case of reconstructions or imperfections at “real surfaces,” very complicated atomic configurations can be present. This structural complexity poses a significant challenge for the development of accurate interatomic potentials suitable for large-scale molecular dynamics simulations. In recent years, artificial neural networks (NN) have become a promising new method for the construction of potential-energy surfaces for difficult systems. In the present work, we explore the applicability of such high-dimensional NN potentials to metal surfaces using copper as a benchmark system. A detailed analysis of the properties of bulk copper and of a wide range of surface structures shows that NN potentials can provide results of almost density functional theory (DFT) quality at a small fraction of the computational costs.
A fast PC algorithm for high dimensional causal discovery with multi-core PCs.
Le, Thuc; Hoang, Tao; Li, Jiuyong; Liu, Lin; Liu, Huawen; Hu, Shu
2016-07-14
Discovering causal relationships from observational data is a crucial problem with applications in many research areas. The PC algorithm is the state-of-the-art constraint-based method for causal discovery. However, the runtime of the PC algorithm is, in the worst case, exponential in the number of nodes (variables), and thus it is inefficient when applied to high-dimensional data, e.g. gene expression datasets. Meanwhile, the advancement of computer hardware in the last decade has resulted in the widespread availability of multi-core personal computers. There is a significant motivation for designing a parallelised PC algorithm that is suitable for personal computers and does not require end users' parallel computing knowledge beyond their competency in using the PC algorithm. In this paper, we develop parallel-PC, a fast and memory-efficient PC algorithm using parallel computing techniques. We apply our method to a range of synthetic and real-world high-dimensional datasets. Experimental results on a dataset from the DREAM 5 challenge show that the original PC algorithm could not produce any results after running for more than 24 hours; meanwhile, our parallel-PC algorithm finished within around 12 hours on a 4-core CPU computer, and in less than 6 hours on an 8-core CPU computer. Furthermore, we integrate parallel-PC into a causal inference method for inferring miRNA-mRNA regulatory relationships. The experimental results show that parallel-PC helps improve both the efficiency and accuracy of the causal inference algorithm.
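The speed-up in parallel-PC comes from distributing the many independence tests of one level of the algorithm across cores. The sketch below parallelises simple marginal correlation tests over variable pairs with a thread pool; it is an illustration of that scheduling idea on synthetic data, not the full PC algorithm (which also conditions on separating sets).

```python
# Parallel independence testing over variable pairs (PC-style level 0).
import itertools
from concurrent.futures import ThreadPoolExecutor

import numpy as np

rng = np.random.default_rng(6)
n, p = 500, 20
X = rng.normal(size=(n, p))
X[:, 1] = X[:, 0] + 0.1 * rng.normal(size=n)     # make variables 0 and 1 dependent

def dependent(pair, thresh=0.5):
    """Keep the edge if the sample correlation is strong."""
    i, j = pair
    r = np.corrcoef(X[:, i], X[:, j])[0, 1]
    return pair if abs(r) > thresh else None

pairs = list(itertools.combinations(range(p), 2))
with ThreadPoolExecutor() as pool:               # tests run concurrently
    edges = [e for e in pool.map(dependent, pairs) if e is not None]
print(f"edges found: {edges}")
```

Because the tests within a level are mutually independent computations, they parallelise with no coordination, which is why the wall-clock gains the abstract reports scale with the core count.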
Robust and sparse correlation matrix estimation for the analysis of high-dimensional genomics data.
Serra, Angela; Coretto, Pietro; Fratello, Michele; Tagliaferri, Roberto
2017-10-12
Microarray technology can be used to study the expression of thousands of genes across a number of different experimental conditions, usually hundreds. The underlying principle is that genes sharing similar expression patterns across different samples can be part of the same co-expression system, or they may share the same biological functions. Groups of genes are usually identified based on cluster analysis. Clustering methods rely on the similarity matrix between genes, and a common choice is the sample correlation matrix. Dimensionality reduction is another popular data analysis task that is also based on covariance/correlation matrix estimates. Unfortunately, covariance/correlation matrix estimation suffers from the intrinsic noise present in high-dimensional data. Sources of noise are sampling variation, the presence of outlying sample units, and the fact that in most cases the number of genes is much larger than the number of sample units. In this paper we propose a robust correlation matrix estimator that is regularized based on adaptive thresholding. The resulting method jointly tames the effects of high dimensionality and data contamination. Computations are easy to implement and do not require hand tuning. Both simulated and real data are analysed. A Monte Carlo experiment shows that the proposed method achieves remarkable performance. Our correlation metric is more robust to outliers than the existing alternatives on two gene expression data sets. It is also shown how the regularization automatically detects and filters spurious correlations. The same regularization is also extended to other, less robust correlation measures. Finally, we apply the ARACNE algorithm to the SynTReN gene expression data. The sensitivity and specificity of the reconstructed network are compared with the gold standard. We show that ARACNE performs better when it takes the proposed correlation matrix estimator as input. The R
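The thresholding idea behind such estimators can be sketched simply: entries of the sample correlation matrix whose magnitude falls below a data-driven cut-off are set to zero, filtering spurious correlations in the p >> n regime. The universal threshold below (with an illustrative constant) is a common textbook choice, not the authors' adaptive, robust rule.

```python
# Thresholded correlation matrix estimation on synthetic p >> n data.
import numpy as np

rng = np.random.default_rng(7)
n, p = 50, 200
X = rng.normal(size=(n, p))
X[:, 1] = X[:, 0] + 0.2 * rng.normal(size=n)      # one genuine correlation

R = np.corrcoef(X, rowvar=False)                  # p x p sample correlations
thresh = 2.0 * np.sqrt(np.log(p) / n)             # constant 2 chosen for illustration
R_hat = np.where(np.abs(R) >= thresh, R, 0.0)     # kill sub-threshold entries

off_diag = R_hat - np.diag(np.diag(R_hat))
print(f"surviving off-diagonal entries: {np.count_nonzero(off_diag)}")
```

Of the nearly 20,000 gene pairs, essentially only the genuinely correlated pair survives the cut-off, which is the "automatic filtering of spurious correlations" the abstract describes (the paper additionally makes the threshold adaptive and the correlations robust to outliers).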
An Adaptive ANOVA-based PCKF for High-Dimensional Nonlinear Inverse Modeling
Energy Technology Data Exchange (ETDEWEB)
LI, Weixuan; Lin, Guang; Zhang, Dongxiao
2014-02-01
The probabilistic collocation-based Kalman filter (PCKF) is a recently developed approach for solving inverse problems. It resembles the ensemble Kalman filter (EnKF) in every aspect—except that it represents and propagates model uncertainty by polynomial chaos expansion (PCE) instead of an ensemble of model realizations. Previous studies have shown PCKF is a more efficient alternative to EnKF for many data assimilation problems. However, the accuracy and efficiency of PCKF depends on an appropriate truncation of the PCE series. Having more polynomial chaos bases in the expansion helps to capture uncertainty more accurately but increases computational cost. Bases selection is particularly important for high-dimensional stochastic problems because the number of polynomial chaos bases required to represent model uncertainty grows dramatically as the number of input parameters (random dimensions) increases. In classic PCKF algorithms, the PCE bases are pre-set based on users’ experience. Also, for sequential data assimilation problems, the bases kept in PCE expression remain unchanged in different Kalman filter loops, which could limit the accuracy and computational efficiency of classic PCKF algorithms. To address this issue, we present a new algorithm that adaptively selects PCE bases for different problems and automatically adjusts the number of bases in different Kalman filter loops. The algorithm is based on adaptive functional ANOVA (analysis of variance) decomposition, which approximates a high-dimensional function with the summation of a set of low-dimensional functions. Thus, instead of expanding the original model into PCE, we implement the PCE expansion on these low-dimensional functions, which is much less costly. We also propose a new adaptive criterion for ANOVA that is more suited for solving inverse problems. The new algorithm is tested with different examples and demonstrated great effectiveness in comparison with non-adaptive PCKF and En
2D-EM clustering approach for high-dimensional data through folding feature vectors.
Sharma, Alok; Kamola, Piotr J; Tsunoda, Tatsuhiko
2017-12-28
Clustering methods are becoming widely utilized in biomedical research, where the volume and complexity of data are rapidly increasing. Unsupervised clustering of patient information can reveal distinct phenotype groups with different underlying mechanisms, risk prognoses and treatment responses. However, biological datasets are usually characterized by a combination of low sample number and very high dimensionality, something that is not adequately addressed by current algorithms. While the performance of the methods is satisfactory for low-dimensional data, an increasing number of features results in either deterioration of accuracy or inability to cluster. To tackle these challenges, new methodologies designed specifically for such data are needed. We present 2D-EM, a clustering algorithm designed for small-sample-size, high-dimensional datasets. To employ information corresponding to the data distribution and to facilitate visualization, each sample is folded into its two-dimensional (2D) matrix form (or feature matrix). The maximum likelihood estimate is then obtained using a modified expectation-maximization (EM) algorithm. The 2D-EM methodology was benchmarked against several existing clustering methods using 6 medically relevant transcriptome datasets. The percentage improvement of Rand score and adjusted Rand index compared to the best performing alternative method is up to 21.9% and 155.6%, respectively. To demonstrate the general utility of the 2D-EM method we also employed 2 methylome datasets, again showing superior performance relative to established methods. The 2D-EM algorithm was able to reproduce the groups in transcriptome and methylome data with high accuracy. This builds confidence in the method's ability to uncover novel disease subtypes in new datasets. The design of the 2D-EM algorithm enables it to handle a diverse set of challenging biomedical datasets and to cluster with higher accuracy than established methods. MATLAB implementation of the tool can be
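The folding step can be illustrated with a minimal stand-in: reshape each sample vector into a small feature matrix, compress it, and cluster with a basic two-component EM. Everything here (the synthetic data, the row-mean summary, the spherical-Gaussian likelihood) is a simplified assumption, not the paper's modified EM:

```python
import numpy as np

rng = np.random.default_rng(1)

# Two synthetic "expression profiles" of 24 features each (illustrative).
n_per, d = 40, 24
A = rng.normal(0.0, 0.3, (n_per, d))
B = rng.normal(1.0, 0.3, (n_per, d))
X = np.vstack([A, B])

# Fold each sample vector into a 4 x 6 feature matrix and summarise it by
# its row means (a crude stand-in for the paper's matrix-variate model).
F = X.reshape(-1, 4, 6).mean(axis=2)              # (80, 4)

def em_two_spherical(F, iters=50):
    """Minimal two-component EM with spherical Gaussian components."""
    mu = F[[0, -1]].astype(float).copy()          # crude initialisation
    var = np.array([1.0, 1.0])
    pi = np.array([0.5, 0.5])
    for _ in range(iters):
        # E-step: responsibilities under isotropic Gaussians
        d2 = ((F[:, None, :] - mu[None]) ** 2).sum(-1)
        logp = -0.5 * d2 / var - 0.5 * F.shape[1] * np.log(var) + np.log(pi)
        logp -= logp.max(axis=1, keepdims=True)
        r = np.exp(logp)
        r /= r.sum(axis=1, keepdims=True)
        # M-step: update weights, means and per-component variances
        nk = r.sum(axis=0)
        mu = (r.T @ F) / nk[:, None]
        var = np.array([(r[:, k] * ((F - mu[k]) ** 2).sum(-1)).sum()
                        / (nk[k] * F.shape[1]) for k in range(2)])
        pi = nk / len(F)
    return r.argmax(axis=1)

labels = em_two_spherical(F)
```

Averaging within the folded matrix reduces per-feature noise before the EM step, which is the intuition behind operating on the 2D form rather than the raw vector.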
Representing potential energy surfaces by high-dimensional neural network potentials.
Behler, J
2014-05-07
The development of interatomic potentials employing artificial neural networks has seen tremendous progress in recent years. While until recently the applicability of neural network potentials (NNPs) has been restricted to low-dimensional systems, this limitation has now been overcome and high-dimensional NNPs can be used in large-scale molecular dynamics simulations of thousands of atoms. NNPs are constructed by adjusting a set of parameters using data from electronic structure calculations, and in many cases energies and forces can be obtained with very high accuracy. Therefore, NNP-based simulation results are often very close to those gained by a direct application of first-principles methods. In this review, the basic methodology of high-dimensional NNPs will be presented with a special focus on the scope and the remaining limitations of this approach. The development of NNPs requires substantial computational effort as typically thousands of reference calculations are required. Still, if the problem to be studied involves very large systems or long simulation times this overhead is regained quickly. Further, the method is still limited to systems containing about three or four chemical elements due to the rapidly increasing complexity of the configuration space, although many atoms of each species can be present. Due to the ability of NNPs to describe even extremely complex atomic configurations with excellent accuracy irrespective of the nature of the atomic interactions, they represent a general and therefore widely applicable technique, e.g. for addressing problems in materials science, for investigating properties of interfaces, and for studying solvation processes.
Lee, Okkyun; Kappler, Steffen; Polster, Christoph; Taguchi, Katsuyuki
2017-11-01
Photon counting detectors (PCDs) provide multiple energy-dependent measurements for estimating basis line-integrals. However, the measured spectrum is distorted by the spectral response effect (SRE) via charge sharing, K-fluorescence emission, and so on. Thus, in order to avoid bias and artifacts in images, the SRE needs to be compensated. For this purpose, we recently developed a computationally efficient three-step algorithm for PCD-CT without contrast agents by approximating the smooth X-ray transmittance using low-order polynomial bases. It compensated the SRE by incorporating the SRE model in a linearized estimation process and achieved nearly the minimum variance unbiased (MVU) estimator. In this paper, we extend the three-step algorithm to K-edge imaging applications by designing optimal bases using a low-rank approximation to model X-ray transmittances with arbitrary shapes (i.e., smooth without the K-edge or discontinuous with the K-edge). The bases can be used to approximate the X-ray transmittance and to linearize the PCD measurement modeling, and then the three-step estimator can be derived as in the previous approach: estimating the X-ray transmittance in the first step, estimating basis line-integrals including that of the contrast agent in the second step, and correcting for a bias in the third step. We demonstrate that the proposed method is more accurate and stable than the low-order polynomial-based approaches in extensive simulation studies using gadolinium for the K-edge imaging application. We also demonstrate that the proposed method achieves a nearly MVU estimator and is more stable than the conventional maximum likelihood estimator in high-attenuation cases with fewer photon counts.
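The idea of replacing low-order polynomials with an optimal basis can be sketched by taking the SVD of a sampled family of transmittance curves; a few singular vectors then approximate both smooth and K-edge shapes. The attenuation model and the K-edge location below are illustrative stand-ins, not the paper's calibration:

```python
import numpy as np

E = np.linspace(20.0, 120.0, 101)     # keV grid (illustrative)

# Hypothetical transmittance family: smooth photoelectric-like term plus a
# gadolinium-like K-edge step near 50.2 keV (all numbers illustrative).
def transmittance(a, b, c):
    mu = a * (30.0 / E) ** 3 + b + c * (30.0 / E) ** 3 * (E >= 50.2)
    return np.exp(-mu)

rng = np.random.default_rng(2)
curves = np.array([transmittance(*rng.uniform(0.1, 2.0, 3))
                   for _ in range(500)])          # (500, 101)

# Least-squares-optimal low-rank basis from the SVD of the sampled family.
U, s, Vt = np.linalg.svd(curves, full_matrices=False)
r = 6
approx = (U[:, :r] * s[:r]) @ Vt[:r]              # rank-r reconstruction
approx1 = np.outer(U[:, 0] * s[0], Vt[0])         # rank-1 baseline

rel_err = np.linalg.norm(curves - approx) / np.linalg.norm(curves)
rel_err1 = np.linalg.norm(curves - approx1) / np.linalg.norm(curves)
```

Because the basis is learned from curves that already contain the discontinuity, a small rank suffices where a low-order polynomial basis would struggle at the edge.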
Siren, J; Ovaskainen, O; Merilä, J
2017-10-01
The genetic variance-covariance matrix (G) is a quantity of central importance in evolutionary biology due to its influence on the rate and direction of multivariate evolution. However, the predictive power of empirically estimated G-matrices is limited for two reasons. First, phenotypes are high-dimensional, whereas traditional statistical methods are tuned to estimate and analyse low-dimensional matrices. Second, the stability of G to environmental effects and over time remains poorly understood. Using Bayesian sparse factor analysis (BSFG) designed to estimate high-dimensional G-matrices, we analysed levels of variation and covariation in 10,527 expressed genes in a large (n = 563) half-sib breeding design of three-spined sticklebacks subject to two temperature treatments. We found significant differences in the structure of G between the treatments: heritabilities and evolvabilities were higher in the warm than in the low-temperature treatment, suggesting greater and faster opportunity to evolve in warm (stressful) conditions. Furthermore, comparison of G and its phenotypic equivalent P revealed that the latter is a poor substitute for the former. Most strikingly, the results suggest that the expected impact of G on evolvability, as well as the similarity among G-matrices, may depend strongly on the number of traits included in the analyses. In our results, the inclusion of only a few traits in the analyses leads to underestimation of the differences between the G-matrices and their predicted impacts on evolution. While the results highlight the challenges involved in estimating G, they also illustrate that, by enabling the estimation of large G-matrices, the BSFG method can improve predicted evolutionary responses to selection. © 2017 John Wiley & Sons Ltd.
MERSENNE AND HADAMARD MATRICES CALCULATION BY SCARPIS METHOD
Directory of Open Access Journals (Sweden)
N. A. Balonin
2014-05-01
Full Text Available Purpose. The paper deals with the problem of basic generalizations of Hadamard matrices, associated with maximum determinant matrices or with matrices with orthogonal columns that are not determinant-optimal (weighing matrices, Mersenne and Euler matrices, etc.); calculation methods for the quasi-orthogonal local-maximum-determinant Mersenne matrices have not been studied sufficiently. The goal of this paper is to develop the theory of Mersenne and Hadamard matrices on the basis of the generalized Scarpis method. Methods. Extreme solutions are found, in general, by minimizing the maximum of the absolute values of the elements of the studied matrices, followed by their classification according to the number of levels and their values depending on the order. Less universal but more effective methods are based on structural invariants of quasi-orthogonal matrices (the Sylvester, Paley, and Scarpis methods, etc.). Results. Generalizations of Hadamard and Belevitch matrices as a family of quasi-orthogonal matrices of odd orders are considered; they include, in particular, two-level Mersenne matrices. Definitions of a section and a layer on the set of generalized matrices are proposed. Calculation algorithms for matrices of adjacent layers and sections from matrices of lower orders are described. Approximation examples of the Belevitch matrix structures up to the 22nd critical order by the Mersenne matrix of the third order are given. A new formulation of the modified Scarpis method for approximating Hadamard matrices of high orders by lower-order Mersenne matrices is proposed. The Williamson method is described by the example of approximating one-modular-level matrices by matrices with a small number of levels. Practical relevance. The efficiency of the developed direction for band-pass filter creation is justified. Algorithms for Mersenne matrix design by the Scarpis method are used in the software of the research program complex. Mersenne filters are based on the suboptimal by
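Of the structural-invariant constructions mentioned (Sylvester, Paley, Scarpis), Sylvester's doubling is the simplest to state; a short sketch:

```python
import numpy as np

def sylvester_hadamard(k):
    """Hadamard matrix of order 2**k by Sylvester's doubling:
    H_{2n} = [[H, H], [H, -H]]."""
    H = np.array([[1]])
    for _ in range(k):
        H = np.block([[H, H], [H, -H]])
    return H

H8 = sylvester_hadamard(3)
gram = H8 @ H8.T   # defining property: H H^T = n I
```

The orthogonality check `H H^T = n I` is the maximum-determinant property the abstract's generalizations relax for odd orders.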
A Brief Historical Introduction to Matrices and Their Applications
Debnath, L.
2014-01-01
This paper deals with the ancient origin of matrices, and the system of linear equations. Included are algebraic properties of matrices, determinants, linear transformations, and Cramer's Rule for solving the system of algebraic equations. Special attention is given to some special matrices, including matrices in graph theory and electrical…
Protein matrices for wound dressings =
Vasconcelos, Andreia Joana Costa
Fibrous proteins such as silk fibroin (SF), keratin (K) and elastin (EL) are able to mimic the extracellular matrix (ECM), which allows their recognition under physiological conditions. Their impressive mechanical properties and environmental stability, in combination with their biocompatibility and control of morphology, provide an important basis for using these proteins in biomedical applications such as protein-based wound dressings. Over time, the concept of wound dressings has changed from traditional dressings such as honey or natural fibres, used just to protect the wound from external factors, to the interactive dressings of the present. Wounds can be classified as acute, which heal in the expected time frame, or chronic, which fail to heal because the orderly sequence of events is disrupted at one or more stages of the healing process. Moreover, chronic wound exudates contain high levels of tissue-destructive proteolytic enzymes, such as human neutrophil elastase (HNE), that need to be controlled for proper healing. The aim of this work is to exploit the self-assembly properties of silk fibroin, keratin and elastin for the development of new protein materials to be used as wound dressings: i) evaluation of the blending effect on the physical and chemical properties of the materials; ii) development of materials with different morphologies; iii) assessment of the cytocompatibility of the protein matrices; iv) ultimately, study of the ability of the developed protein matrices to act as wound dressings through the use of human chronic wound exudate; v) use of innovative short peptide sequences that allow targeted control of the high levels of HNE found in chronic wounds. Chapter III reports the preparation of silk fibroin/keratin (SF/K) blend films by solvent casting evaporation. Two solvent systems, aqueous and acidic, were used for the preparation of films from fibroin and keratin extracted from the respective silk and wool fibres. The effect of the solvent system used was
Condition number estimation of preconditioned matrices.
Kushida, Noriyuki
2015-01-01
The present paper introduces a condition number estimation method for preconditioned matrices. The newly developed method provides reasonable results, while the conventional method, which is based on the Lanczos connection, gives meaningless results. The Lanczos connection based method provides the condition numbers of coefficient matrices of systems of linear equations with information obtained through the preconditioned conjugate gradient method. Estimating the condition number of preconditioned matrices is sometimes important when describing the effectiveness of new preconditioners or selecting adequate preconditioners. Operating a preconditioner on a coefficient matrix is the simplest method of estimation. However, this is not possible for large-scale computing, especially if computation is performed on distributed-memory parallel computers, because the preconditioned matrices become dense even if the original matrices are sparse. Although the Lanczos connection method can be used to calculate the condition number of preconditioned matrices, it is not considered to be applicable to large-scale problems because of its weakness with respect to numerical errors. Therefore, we have developed a robust and parallelizable method based on Hager's method. Feasibility studies are carried out for the diagonal scaling preconditioner and the SSOR preconditioner with a diagonal matrix, a tridiagonal matrix and Pei's matrix. As a result, the Lanczos connection method contains around 10% error in the results even for a simple problem, while the new method contains negligible errors. In addition, the newly developed method returns reasonable solutions when the Lanczos connection method fails with Pei's matrix and with matrices generated by the finite element method.
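Hager's method estimates a matrix 1-norm from a few matrix-vector products or solves, which is what makes it usable when the preconditioned matrix is never formed. A dense-algebra sketch of the idea, with `numpy.linalg.solve` standing in for the sparse or preconditioned solves used in practice:

```python
import numpy as np

def cond1_estimate(A, max_iter=5):
    """Hager-style estimate of the 1-norm condition number of A.

    Only solves with A and A^T are needed, so the same idea applies when
    the (preconditioned) operator is never formed explicitly."""
    n = A.shape[0]
    x = np.full(n, 1.0 / n)
    est = 0.0
    for _ in range(max_iter):
        y = np.linalg.solve(A, x)            # y = A^{-1} x
        est = np.abs(y).sum()                # lower bound on ||A^{-1}||_1
        xi = np.sign(y)
        xi[xi == 0] = 1.0
        z = np.linalg.solve(A.T, xi)         # z = A^{-T} xi (subgradient info)
        j = int(np.argmax(np.abs(z)))
        if np.abs(z[j]) <= z @ x:            # no ascent direction left
            break
        x = np.zeros(n)
        x[j] = 1.0                           # move to the best unit vector
    norm_A = np.abs(A).sum(axis=0).max()     # exact ||A||_1
    return norm_A * est

# Tridiagonal 1-D Laplacian test matrix.
n = 50
A = (np.diag(np.full(n, 2.0))
     + np.diag(np.full(n - 1, -1.0), 1)
     + np.diag(np.full(n - 1, -1.0), -1))
est = cond1_estimate(A)
true = np.abs(A).sum(axis=0).max() * np.abs(np.linalg.inv(A)).sum(axis=0).max()
```

For this M-matrix the iteration locates the worst column in one step, so the estimate matches the exact 1-norm condition number.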
A simple new filter for nonlinear high-dimensional data assimilation
Tödter, Julian; Kirchgessner, Paul; Ahrens, Bodo
2015-04-01
performance with a realistic ensemble size. The results confirm that, in principle, it can be applied successfully and as simply as the ETKF in high-dimensional problems without further modifications of the algorithm, even though it is only based on the particle weights. This proves that the suggested method constitutes a useful filter for nonlinear, high-dimensional data assimilation, and is able to overcome the curse of dimensionality even in deterministic systems.
Infinite matrices and their recent applications
Shivakumar, P N; Zhang, Yang
2016-01-01
This monograph covers the theory of finite and infinite matrices over the fields of real numbers, complex numbers and over quaternions. Emphasizing topics such as sections or truncations and their relationship to the linear operator theory on certain specific separable and sequence spaces, the authors explore techniques like conformal mapping, iterations and truncations that are used to derive precise estimates in some cases and explicit lower and upper bounds for solutions in the other cases. Most of the matrices considered in this monograph typically have special structures: they are diagonally dominant or tridiagonal, possess certain sign distributions, and are frequently nonsingular. Such matrices arise, for instance, from solution methods for elliptic partial differential equations. The authors focus on both theoretical and computational aspects concerning infinite linear algebraic equations, differential systems and infinite linear programming, among others. Additionally, the authors cover topics such ...
Advanced incomplete factorization algorithms for Stieltjes matrices
Energy Technology Data Exchange (ETDEWEB)
Il`in, V.P. [Siberian Division RAS, Novosibirsk (Russian Federation)
1996-12-31
The modern numerical methods for solving linear algebraic systems Au = f with high-order sparse matrices A, which arise in grid approximations of multidimensional boundary value problems, are based mainly on accelerated iterative processes with easily invertible preconditioning matrices presented in the form of approximate (incomplete) factorizations of the original matrix A. We consider some recent algorithmic approaches, theoretical foundations, experimental data and open questions for incomplete factorization of Stieltjes matrices, which are "the best" ones in the sense that they have the most advanced results. Special attention is given to solving elliptic differential equations with strongly variable coefficients, singularly perturbed diffusion-convection equations and parabolic equations.
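A minimal sketch of incomplete factorization with zero fill-in (IC(0)) applied to a 1-D Laplacian, which is a Stieltjes matrix; the zero-fill pattern choice is the textbook variant, illustrative rather than the paper's advanced algorithms:

```python
import numpy as np

def ic0(A):
    """Incomplete Cholesky with zero fill-in: the factor L keeps the
    sparsity pattern of the lower triangle of A, a common easily
    invertible preconditioner for Stieltjes matrices."""
    n = A.shape[0]
    L = np.tril(A).astype(float)
    pattern = L != 0                       # allowed nonzero positions
    for k in range(n):
        L[k, k] = np.sqrt(L[k, k])
        for i in range(k + 1, n):
            if pattern[i, k]:
                L[i, k] /= L[k, k]
        for j in range(k + 1, n):          # restricted trailing update
            for i in range(j, n):
                if pattern[i, j]:
                    L[i, j] -= L[i, k] * L[j, k]
    return L

# 1-D Laplacian (a Stieltjes matrix). Tridiagonal matrices produce no
# fill-in, so here IC(0) coincides with the exact Cholesky factor.
n = 8
A = (np.diag(np.full(n, 2.0))
     + np.diag(np.full(n - 1, -1.0), 1)
     + np.diag(np.full(n - 1, -1.0), -1))
L = ic0(A)
resid = np.linalg.norm(L @ L.T - A)
```

For matrices with wider bandwidth the discarded fill-in makes `L L^T` only an approximation of `A`, which is exactly what makes it cheap to apply as a preconditioner.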
Directory of Open Access Journals (Sweden)
NELSON VALERO VALERO
2012-05-01
Full Text Available Bacteria with low-rank coal (LRC) biotransforming activity were isolated from environmental samples containing coal residues at the "El Cerrejón" mine. Seventy-five bacterial morphotypes were isolated, of which 32 grew on minimal salts solid medium with 5 % coal. A protocol was designed to select the morphotypes with the highest LRC-biotransforming activity; the protocol includes isolation on a selective medium with powdered LRC and qualitative and quantitative tests of LRC solubilization in solid and liquid media. The solubilization mechanism in the strains producing the highest values of humic substances (HS) was associated with pH changes in the medium, probably due to the production of extracellular alkaline substances. The largest number of isolates, and the isolates with the highest solubilizing activity on LRC, came from sludge with a high content of coal residues and from the rhizospheres of Typha domingensis and Cenchrus ciliaris growing on sediments mixed with coal particles; this result suggests that the recovery of bacteria capable of solubilizing LRC may be related to the microhabitat where the populations develop.
Forecasting Covariance Matrices: A Mixed Frequency Approach
DEFF Research Database (Denmark)
Halbleib, Roxana; Voev, Valeri
This paper proposes a new method for forecasting covariance matrices of financial returns. The model mixes volatility forecasts from a dynamic model of daily realized volatilities estimated with high-frequency data with correlation forecasts based on daily data. This new approach allows for flexible … matrix dynamics. Our empirical results show that the new mixing approach provides superior forecasts compared to multivariate volatility specifications using single sources of information.
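The mixing idea can be sketched as Σ = D R D: a diagonal matrix D of per-asset volatility forecasts (here a simple EWMA stands in for a realized-volatility model fitted to high-frequency data) combined with a correlation matrix R estimated from daily data. All data below are simulated:

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy daily returns for 3 assets (stand-in for real data).
T, k = 500, 3
C = np.array([[1.0, 0.5, 0.2],
              [0.5, 1.0, 0.4],
              [0.2, 0.4, 1.0]])
ret = rng.multivariate_normal(np.zeros(k), C, size=T)

# (1) Volatility forecasts: EWMA of squared returns, a stand-in for a
#     dynamic model of daily realized volatilities.
lam = 0.94
var_f = ret[0] ** 2
for t in range(1, T):
    var_f = lam * var_f + (1 - lam) * ret[t] ** 2
D = np.diag(np.sqrt(var_f))

# (2) Correlation forecast from the daily sample correlation matrix.
R = np.corrcoef(ret.T)

# Mixed-frequency covariance forecast: Sigma = D R D.
Sigma = D @ R @ D
```

The decomposition lets the two blocks evolve on their natural frequencies: volatilities react quickly via high-frequency information while correlations are estimated from the slower daily sample.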
Directory of Open Access Journals (Sweden)
Shiqing Wang
2013-01-01
Full Text Available During the last few years, a great deal of attention has been focused on the Lasso and the Dantzig selector in high-dimensional linear regression when the number of variables can be much larger than the sample size. Under a sparsity scenario, the authors (see, e.g., Bickel et al., 2009, Bunea et al., 2007, Candès and Tao, 2007, Donoho et al., 2006, Koltchinskii, 2009, Meinshausen and Yu, 2009, Rosenbaum and Tsybakov, 2010, Tsybakov, 2006, van de Geer, 2008, and Zhang and Huang, 2008) discussed the relations between the Lasso and the Dantzig selector and derived sparsity oracle inequalities for the prediction risk and bounds on the estimation loss. In this paper, we point out that some of the authors overemphasize the role of certain sparsity conditions, and that assumptions based on such a sparsity condition may lead to poor results. We give better assumptions and methods that avoid using the sparsity condition. As a comparison with the results of Bickel et al., 2009, more precise oracle inequalities for the prediction risk and bounds on the estimation loss are derived when the number of variables can be much larger than the sample size.
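For concreteness, the Lasso estimator discussed above can be computed by cyclic coordinate descent with soft-thresholding; a self-contained sketch on a p >> n sparse problem (synthetic data, illustrative penalty level):

```python
import numpy as np

def soft(x, t):
    """Soft-thresholding operator, the proximal map of the l1 penalty."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def lasso_cd(X, y, lam, iters=200):
    """Cyclic coordinate descent for
    (1/(2n)) ||y - X b||^2 + lam ||b||_1."""
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    r = y.copy()                             # current residual y - X b
    for _ in range(iters):
        for j in range(p):
            r += X[:, j] * b[j]              # remove coordinate j
            rho = X[:, j] @ r / n
            b[j] = soft(rho, lam) / col_sq[j]
            r -= X[:, j] * b[j]              # add updated coordinate back
    return b

rng = np.random.default_rng(5)
n, p, s = 100, 200, 3                        # p >> n, sparse truth
X = rng.normal(size=(n, p))
beta = np.zeros(p)
beta[:s] = [3.0, -2.0, 1.5]
y = X @ beta + 0.1 * rng.normal(size=n)
b_hat = lasso_cd(X, y, lam=0.1)
```

In this well-conditioned random-design setting the support is recovered; the sparsity (restricted eigenvalue) conditions debated in the paper govern exactly when such recovery and the oracle bounds can be guaranteed.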
Mapping the human DC lineage through the integration of high-dimensional techniques.
See, Peter; Dutertre, Charles-Antoine; Chen, Jinmiao; Günther, Patrick; McGovern, Naomi; Irac, Sergio Erdal; Gunawan, Merry; Beyer, Marc; Händler, Kristian; Duan, Kaibo; Sumatoh, Hermi Rizal Bin; Ruffin, Nicolas; Jouve, Mabel; Gea-Mallorquí, Ester; Hennekam, Raoul C M; Lim, Tony; Yip, Chan Chung; Wen, Ming; Malleret, Benoit; Low, Ivy; Shadan, Nurhidaya Binte; Fen, Charlene Foong Shu; Tay, Alicia; Lum, Josephine; Zolezzi, Francesca; Larbi, Anis; Poidinger, Michael; Chan, Jerry K Y; Chen, Qingfeng; Rénia, Laurent; Haniffa, Muzlifah; Benaroch, Philippe; Schlitzer, Andreas; Schultze, Joachim L; Newell, Evan W; Ginhoux, Florent
2017-06-09
Dendritic cells (DC) are professional antigen-presenting cells that orchestrate immune responses. The human DC population comprises two main functionally specialized lineages, whose origins and differentiation pathways remain incompletely defined. Here, we combine two high-dimensional technologies-single-cell messenger RNA sequencing (scmRNAseq) and cytometry by time-of-flight (CyTOF)-to identify human blood CD123+CD33+CD45RA+ DC precursors (pre-DC). Pre-DC share surface markers with plasmacytoid DC (pDC) but have distinct functional properties that were previously attributed to pDC. Tracing the differentiation of DC from the bone marrow to the peripheral blood revealed that the pre-DC compartment contains distinct lineage-committed subpopulations, including one early uncommitted CD123high pre-DC subset and two CD45RA+CD123low lineage-committed subsets exhibiting functional differences. The discovery of multiple committed pre-DC populations opens promising new avenues for the therapeutic exploitation of DC subset-specific targeting. Copyright © 2017, American Association for the Advancement of Science.
Huang, Yen-Tsung; Pan, Wen-Chi
2016-06-01
Causal mediation modeling has become a popular approach for studying the effect of an exposure on an outcome through a mediator. However, current methods are not applicable to the setting with a large number of mediators. We propose a testing procedure for mediation effects of high-dimensional continuous mediators. We characterize the marginal mediation effect, the multivariate component-wise mediation effects, and the L2 norm of the component-wise effects, and develop a Monte-Carlo procedure for evaluating their statistical significance. To accommodate the setting with a large number of mediators and a small sample size, we further propose a transformation model using the spectral decomposition. Under the transformation model, mediation effects can be estimated using a series of regression models with a univariate transformed mediator, and examined by our proposed testing procedure. Extensive simulation studies are conducted to assess the performance of our methods for continuous and dichotomous outcomes. We apply the methods to analyze genomic data investigating the effect of microRNA miR-223 on a dichotomous survival status of patients with glioblastoma multiforme (GBM). We identify nine gene ontology sets with expression values that significantly mediate the effect of miR-223 on GBM survival. © 2015, The International Biometric Society.
Multi-Scale Factor Analysis of High-Dimensional Brain Signals
Ting, Chee-Ming
2017-05-18
In this paper, we develop an approach to modeling high-dimensional networks with a large number of nodes arranged in a hierarchical and modular structure. We propose a novel multi-scale factor analysis (MSFA) model which partitions the massive spatio-temporal data defined over the complex networks into a finite set of regional clusters. To achieve further dimension reduction, we represent the signals in each cluster by a small number of latent factors. The correlation matrix for all nodes in the network is approximated by lower-dimensional sub-structures derived from the cluster-specific factors. To estimate regional connectivity between numerous nodes (within each cluster), we apply principal components analysis (PCA) to produce factors which are derived as the optimal reconstruction of the observed signals under the squared loss. Then, we estimate global connectivity (between clusters or sub-networks) based on the factors across regions using the RV-coefficient as the cross-dependence measure. This gives a reliable and computationally efficient multi-scale analysis of both regional and global dependencies of the large networks. The proposed novel approach is applied to estimate brain connectivity networks using functional magnetic resonance imaging (fMRI) data. Results on resting-state fMRI reveal interesting modular and hierarchical organization of human brain networks during rest.
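The RV-coefficient used as the cross-dependence measure has a closed form in terms of cross-covariance matrices; a small sketch with simulated cluster factor scores (dimensions and noise levels are illustrative):

```python
import numpy as np

def rv_coefficient(X, Y):
    """RV coefficient between two centred data blocks (n x p, n x q):
    a matrix-level analogue of a squared correlation, in [0, 1]."""
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    Sxy = X.T @ Y
    Sxx = X.T @ X
    Syy = Y.T @ Y
    num = np.trace(Sxy @ Sxy.T)
    den = np.sqrt(np.trace(Sxx @ Sxx) * np.trace(Syy @ Syy))
    return num / den

rng = np.random.default_rng(4)
Z = rng.normal(size=(200, 2))                  # shared latent factors
X = Z @ rng.normal(size=(2, 10)) + 0.1 * rng.normal(size=(200, 10))
Y = Z @ rng.normal(size=(2, 8)) + 0.1 * rng.normal(size=(200, 8))
W = rng.normal(size=(200, 8))                  # independent block

rv_dep = rv_coefficient(X, Y)    # driven by the same factors: high
rv_ind = rv_coefficient(X, W)    # unrelated blocks: near zero
```

In the MSFA setting X and Y would be the PCA factor scores of two clusters, so the RV value summarizes between-cluster (global) connectivity with a single number.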
High-Dimensional Neural Network Potentials for Organic Reactions and an Improved Training Algorithm.
Gastegger, Michael; Marquetand, Philipp
2015-05-12
Artificial neural networks (NNs) represent a relatively recent approach for the prediction of molecular potential energies, suitable for simulations of large molecules and long time scales. By using NNs to fit electronic structure data, it is possible to obtain empirical potentials of high accuracy combined with the computational efficiency of conventional force fields. However, as opposed to the latter, changing bonding patterns and unusual coordination geometries can be described due to the underlying flexible functional form of the NNs. One of the most promising approaches in this field is the high-dimensional neural network (HDNN) method, which is especially adapted to the prediction of molecular properties. While HDNNs have been mostly used to model solid state systems and surface interactions, we present here the first application of the HDNN approach to an organic reaction, the Claisen rearrangement of allyl vinyl ether to 4-pentenal. To construct the corresponding HDNN potential, a new training algorithm is introduced. This algorithm is termed "element-decoupled" global extended Kalman filter (ED-GEKF) and is based on the decoupled Kalman filter. Using a metadynamics trajectory computed with density functional theory as reference data, we show that the ED-GEKF exhibits superior performance, both in terms of accuracy and training speed, compared to other variants of the Kalman filter hitherto employed in HDNN training. In addition, the effect of including forces during ED-GEKF training on the resulting potentials was studied.
Spanning high-dimensional expression space using ribosome-binding site combinatorics.
Zelcbuch, Lior; Antonovsky, Niv; Bar-Even, Arren; Levin-Karp, Ayelet; Barenholz, Uri; Dayagi, Michal; Liebermeister, Wolfram; Flamholz, Avi; Noor, Elad; Amram, Shira; Brandis, Alexander; Bareia, Tasneem; Yofe, Ido; Jubran, Halim; Milo, Ron
2013-05-01
Protein levels are a dominant factor shaping natural and synthetic biological systems. Although proper functioning of metabolic pathways relies on precise control of enzyme levels, the experimental ability to balance the levels of many genes in parallel is a major outstanding challenge. Here, we introduce a rapid and modular method to span the expression space of several proteins in parallel. By combinatorially pairing genes with a compact set of ribosome-binding sites, we modulate protein abundance by several orders of magnitude. We demonstrate our strategy by using a synthetic operon containing fluorescent proteins to span a 3D color space. Using the same approach, we modulate a recombinant carotenoid biosynthesis pathway in Escherichia coli to reveal a diversity of phenotypes, each characterized by a distinct carotenoid accumulation profile. In a single combinatorial assembly, we achieve a yield of the industrially valuable compound astaxanthin 4-fold higher than previously reported. The methodology presented here provides an efficient tool for exploring a high-dimensional expression space to locate desirable phenotypes.
A novel algorithm for simultaneous SNP selection in high-dimensional genome-wide association studies
Directory of Open Access Journals (Sweden)
Zuber Verena
2012-10-01
Background: Identification of causal SNPs in most genome-wide association studies relies on approaches that consider each SNP individually. However, there is a strong correlation structure among SNPs that needs to be taken into account. Hence, increasingly modern computationally expensive regression methods are employed for SNP selection that consider all markers simultaneously and thus incorporate dependencies among SNPs. Results: We develop a novel multivariate algorithm for large-scale SNP selection using CAR score regression, a promising new approach for prioritizing biomarkers. Specifically, we propose a computationally efficient procedure for shrinkage estimation of CAR scores from high-dimensional data. Subsequently, we conduct a comprehensive comparison study including five advanced regression approaches (boosting, lasso, NEG, MCP, and CAR score) and a univariate approach (marginal correlation) to determine the effectiveness in finding true causal SNPs. Conclusions: Simultaneous SNP selection is a challenging task. We demonstrate that our CAR score-based algorithm consistently outperforms all competing approaches, both uni- and multivariate, in terms of correctly recovered causal SNPs and SNP ranking. An R package implementing the approach as well as R code to reproduce the complete study presented here is available from http://strimmerlab.org/software/care/.
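The CAR score above is defined as the vector of marginal predictor-response correlations decorrelated by the inverse square root of the predictor correlation matrix. A plain (unshrunk) numpy sketch, assuming the standard definition; the shrinkage estimation the paper proposes is omitted here:

```python
import numpy as np

def car_scores(X, y):
    """Correlation-Adjusted (marginal) coRrelation scores:
    theta = R^{-1/2} r, with R the predictor correlation matrix
    and r the marginal predictor-response correlations."""
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    ys = (y - y.mean()) / y.std()
    n = len(y)
    R = (Xs.T @ Xs) / n
    r = (Xs.T @ ys) / n
    # inverse matrix square root via symmetric eigendecomposition
    w, V = np.linalg.eigh(R)
    R_inv_sqrt = V @ np.diag(1.0 / np.sqrt(w)) @ V.T
    return R_inv_sqrt @ r
```

When predictors are nearly uncorrelated, the CAR scores coincide with the marginal correlations; the decorrelation step is what lets the score account for the strong dependence structure among SNPs.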
Xia, Yin; Cai, Tianxi; Cai, T Tony
2018-01-01
Motivated by applications in genomics, we consider in this paper global and multiple testing for the comparisons of two high-dimensional linear regression models. A procedure for testing the equality of the two regression vectors globally is proposed and shown to be particularly powerful against sparse alternatives. We then introduce a multiple testing procedure for identifying unequal coordinates while controlling the false discovery rate and false discovery proportion. Theoretical justifications are provided to guarantee the validity of the proposed tests and optimality results are established under sparsity assumptions on the regression coefficients. The proposed testing procedures are easy to implement. Numerical properties of the procedures are investigated through simulation and data analysis. The results show that the proposed tests maintain the desired error rates under the null and have good power under the alternative at moderate sample sizes. The procedures are applied to the Framingham Offspring study to investigate the interactions between smoking and cardiovascular related genetic mutations important for an inflammation marker.
A Comparison of Machine Learning Methods in a High-Dimensional Classification Problem
Directory of Open Access Journals (Sweden)
Zekić-Sušac Marijana
2014-09-01
Background: Large-dimensional data modelling often relies on variable reduction methods in the pre-processing and in the post-processing stage. However, such a reduction usually provides less information and yields a lower accuracy of the model. Objectives: The aim of this paper is to assess the high-dimensional classification problem of recognizing entrepreneurial intentions of students by machine learning methods. Methods/Approach: Four methods were tested: artificial neural networks, CART classification trees, support vector machines, and k-nearest neighbour on the same dataset in order to compare their efficiency in the sense of classification accuracy. The performance of each method was compared on ten subsamples in a 10-fold cross-validation procedure in order to assess computing sensitivity and specificity of each model. Results: The artificial neural network model based on multilayer perceptron yielded a higher classification rate than the models produced by other methods. The pairwise t-test showed a statistical significance between the artificial neural network and the k-nearest neighbour model, while the difference among other methods was not statistically significant. Conclusions: Tested machine learning methods are able to learn fast and achieve high classification accuracy. However, further advancement can be assured by testing a few additional methodological refinements in machine learning methods.
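The 10-fold cross-validation protocol described above can be sketched generically; here with a simple nearest-centroid classifier standing in for the four methods compared in the paper (the split logic is the point, and all names are illustrative):

```python
import numpy as np

def k_fold_indices(n, k=10, seed=0):
    """Shuffled k-fold split: yields (train_idx, test_idx) pairs covering all n items."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n), k)
    for i in range(k):
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train, folds[i]

def cv_accuracy_nearest_centroid(X, y, k=10):
    """Mean held-out accuracy of a nearest-centroid classifier over k folds."""
    accs = []
    for tr, te in k_fold_indices(len(y), k):
        classes = np.unique(y[tr])
        C = np.stack([X[tr][y[tr] == c].mean(axis=0) for c in classes])
        pred = classes[np.argmin(((X[te][:, None, :] - C[None, :, :]) ** 2).sum(-1), axis=1)]
        accs.append(np.mean(pred == y[te]))
    return float(np.mean(accs))
```

Averaging accuracy over the ten held-out folds, as done here, is what allows the paper's pairwise t-tests between methods: each method yields ten per-fold accuracies on identical splits.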
Biomarker identification and effect estimation on schizophrenia –a high dimensional data analysis
Directory of Open Access Journals (Sweden)
Yuanzhang eLi
2015-05-01
Biomarkers have been examined in schizophrenia research for decades. Schizophrenia is associated with elevated medical morbidity and mortality rates, as well as substantial personal and societal costs. The identification of biomarkers and alleles, which often have a small effect individually, may help to develop new diagnostic tests for early identification and treatment. Currently, there is not a commonly accepted statistical approach to identify predictive biomarkers from high-dimensional data. We used the space Decomposition-Gradient-Regression (DGR) method to select biomarkers which are associated with the risk of schizophrenia. Then, we used the gradient scores, generated from the selected biomarkers, as the prediction factor in regression to estimate their effects. We also used an alternative approach, classification and regression trees (CART), to compare with the biomarkers selected by DGR and found that about 70% of the selected biomarkers were the same. However, the advantage of DGR is that it can evaluate individual effects for each biomarker from their combined effect. In a DGR analysis of serum specimens of US military service members with a diagnosis of schizophrenia from 1992 to 2005 and their controls, Alpha-1-Antitrypsin (AAT), Interleukin-6 receptor (IL-6r) and Connective Tissue Growth Factor (CTGF) were selected to identify schizophrenia for males; and Alpha-1-Antitrypsin (AAT), Apolipoprotein B (Apo B) and Sortilin were selected for females. If these findings from military subjects are replicated by other studies, they suggest the possibility of a novel biomarker panel as an adjunct to earlier diagnosis and initiation of treatment.
Städler, Nicolas; Dondelinger, Frank; Hill, Steven M; Akbani, Rehan; Lu, Yiling; Mills, Gordon B; Mukherjee, Sach
2017-09-15
Molecular pathways and networks play a key role in basic and disease biology. An emerging notion is that networks encoding patterns of molecular interplay may themselves differ between contexts, such as cell type, tissue or disease (sub)type. However, while statistical testing of differences in mean expression levels has been extensively studied, testing of network differences remains challenging. Furthermore, since network differences could provide important and biologically interpretable information to identify molecular subgroups, there is a need to consider the unsupervised task of learning subgroups and networks that define them. This is a nontrivial clustering problem, with neither subgroups nor subgroup-specific networks known at the outset. We leverage recent ideas from high-dimensional statistics for testing and clustering in the network biology setting. The methods we describe can be applied directly to most continuous molecular measurements and networks do not need to be specified beforehand. We illustrate the ideas and methods in a case study using protein data from The Cancer Genome Atlas (TCGA). This provides evidence that patterns of interplay between signalling proteins differ significantly between cancer types. Furthermore, we show how the proposed approaches can be used to learn subtypes and the molecular networks that define them. The methods are available as the Bioconductor package nethet. Contact: staedler.n@gmail.com or sach.mukherjee@dzne.de. Supplementary data are available at Bioinformatics online.
SPRING: a kinetic interface for visualizing high dimensional single-cell expression data.
Weinreb, Caleb; Wolock, Samuel; Klein, Allon
2017-12-07
Single-cell gene expression profiling technologies can map the cell states in a tissue or organism. As these technologies become more common, there is a need for computational tools to explore the data they produce. In particular, visualizing continuous gene expression topologies can be improved, since current tools tend to fragment gene expression continua or capture only limited features of complex population topologies. Force-directed layouts of k-nearest-neighbor graphs can visualize continuous gene expression topologies in a manner that preserves high-dimensional relationships and captures complex population topologies. We describe SPRING, a pipeline for data filtering, normalization and visualization using force-directed layouts, and show that it reveals more detailed biological relationships than existing approaches when applied to branching gene expression trajectories from hematopoietic progenitor cells and cells of the upper airway epithelium. Visualizations from SPRING are also more reproducible than those of stochastic visualization methods such as tSNE, a state-of-the-art tool. We provide SPRING as an interactive web-tool with an easy to use GUI. https://kleintools.hms.harvard.edu/tools/spring.html, https://github.com/AllonKleinLab/SPRING/. calebsw@gmail.com, allon_klein@hms.harvard.edu.
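The core of the SPRING visualization, a force-directed layout of a k-nearest-neighbor graph, can be sketched in numpy. This is a deliberately minimal illustration (symmetrized kNN adjacency plus a simplified Fruchterman-Reingold iteration), not the SPRING pipeline itself, which adds filtering and normalization stages:

```python
import numpy as np

def knn_graph(X, k=5):
    """Symmetrized k-nearest-neighbour adjacency matrix (0/1 entries)."""
    D = np.linalg.norm(X[:, None] - X[None, :], axis=2)
    np.fill_diagonal(D, np.inf)
    A = np.zeros_like(D)
    for i, js in enumerate(np.argsort(D, axis=1)[:, :k]):
        A[i, js] = 1.0
    return np.maximum(A, A.T)          # make the graph undirected

def spring_layout(A, iters=100, seed=0):
    """Minimal Fruchterman-Reingold-style 2D layout of adjacency matrix A."""
    n = A.shape[0]
    rng = np.random.default_rng(seed)
    pos = rng.normal(size=(n, 2))
    kk = 1.0 / np.sqrt(n)                              # ideal edge length
    for t in range(iters):
        delta = pos[:, None] - pos[None, :]            # (n, n, 2) displacement
        dist = np.linalg.norm(delta, axis=2) + 1e-9
        rep = kk**2 / dist**2                          # repulsion, all pairs
        att = A * dist / kk                            # attraction, edges only
        force = ((rep - att)[:, :, None] * delta / dist[:, :, None]).sum(axis=1)
        step = 0.1 * (1 - t / iters)                   # cooling schedule
        pos += step * force / (np.linalg.norm(force, axis=1, keepdims=True) + 1e-9)
    return pos
```

Because only graph neighbors attract, nearby cells in expression space stay nearby in the plot, which is how such layouts preserve continuous trajectories that stochastic methods like tSNE tend to fragment.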
Yuan, Xiaoru; Ren, Donghao; Wang, Zuchao; Guo, Cong
2013-12-01
For high-dimensional data, this work proposes two novel visual exploration methods to gain insights into the data aspect and the dimension aspect of the data. The first is a Dimension Projection Matrix, as an extension of a scatterplot matrix. In the matrix, each row or column represents a group of dimensions, and each cell shows a dimension projection (such as MDS) of the data with the corresponding dimensions. The second is a Dimension Projection Tree, where every node is either a dimension projection plot or a Dimension Projection Matrix. Nodes are connected with links and each child node in the tree covers a subset of the parent node's dimensions or a subset of the parent node's data items. While the tree nodes visualize the subspaces of dimensions or subsets of the data items under exploration, the matrix nodes enable cross-comparison between different combinations of subspaces. Both the Dimension Projection Matrix and the Dimension Projection Tree can be constructed algorithmically through automation, or manually through user interaction. Our implementation enables interactions such as drilling down to explore different levels of the data, merging or splitting the subspaces to adjust the matrix, and applying brushing to select data clusters. Our method enables simultaneously exploring data correlation and dimension correlation for data with high dimensions.
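The dimension projection each matrix cell displays could be any embedding; classical (Torgerson) MDS, one option the abstract names, fits in a few lines of numpy. A sketch of that standard algorithm, not the authors' implementation:

```python
import numpy as np

def classical_mds(D, dim=2):
    """Classical (Torgerson) MDS: embed points from a pairwise
    distance matrix D into `dim` dimensions."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (D ** 2) @ J                 # double-centered Gram matrix
    w, V = np.linalg.eigh(B)
    order = np.argsort(w)[::-1][:dim]           # largest eigenvalues first
    scale = np.sqrt(np.maximum(w[order], 0.0))
    return V[:, order] * scale
```

For data that truly lies in a `dim`-dimensional Euclidean space, the embedding reproduces the input distances exactly (up to rotation and reflection), which makes it a natural default projection for each cell of the matrix.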
Energy Technology Data Exchange (ETDEWEB)
Snyder, Abigail C. [University of Pittsburgh; Jiao, Yu [ORNL
2010-10-01
Neutron experiments at the Spallation Neutron Source (SNS) at Oak Ridge National Laboratory (ORNL) frequently generate large amounts of data (on the order of 10^6-10^12 data points). Hence, traditional data analysis tools run on a single CPU take too long to be practical and scientists are unable to efficiently analyze all data generated by experiments. Our goal is to develop a scalable algorithm to efficiently compute high-dimensional integrals of arbitrary functions. This algorithm can then be used to evaluate the four-dimensional integrals that arise as part of modeling intensity from the experiments at the SNS. Here, three different one-dimensional numerical integration solvers from the GNU Scientific Library were modified and implemented to solve four-dimensional integrals. The results of these solvers on a final integrand provided by scientists at the SNS can be compared to the results of other methods, such as quasi-Monte Carlo methods, computing the same integral. A parallelized version of the most efficient method can allow scientists the opportunity to more effectively analyze all experimental data.
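The idea of building a four-dimensional integral from nested one-dimensional solvers can be illustrated with a tensor-product Gauss-Legendre rule in numpy (the report used modified GSL 1D solvers; this sketch only shows the nesting, and the function name is ours):

```python
import numpy as np

def integrate_4d(f, bounds, n=16):
    """Tensor-product Gauss-Legendre quadrature for a 4-D integral.
    bounds: four (a, b) pairs; exact for polynomials of degree
    up to 2n-1 in each variable."""
    x, w = np.polynomial.legendre.leggauss(n)          # nodes/weights on [-1, 1]
    nodes, weights = [], []
    for a, b in bounds:                                # affine map to [a, b]
        nodes.append(0.5 * (b - a) * x + 0.5 * (b + a))
        weights.append(0.5 * (b - a) * w)
    total = 0.0
    for i, wi in enumerate(weights[0]):                # four nested 1-D rules
        for j, wj in enumerate(weights[1]):
            for k, wk in enumerate(weights[2]):
                for l, wl in enumerate(weights[3]):
                    total += wi * wj * wk * wl * f(
                        nodes[0][i], nodes[1][j], nodes[2][k], nodes[3][l])
    return total
```

The nesting makes the cost n^4 function evaluations, which is exactly the scaling pressure that motivates parallelization or quasi-Monte Carlo alternatives in higher dimensions.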
Mwangi, Benson; Soares, Jair C; Hasan, Khader M
2014-10-30
Neuroimaging machine learning studies have largely utilized supervised algorithms - meaning they require both neuroimaging scan data and corresponding target variables (e.g. healthy vs. diseased) to be successfully 'trained' for a prediction task. Noticeably, this approach may not be optimal or possible when the global structure of the data is not well known and the researcher does not have an a priori model to fit the data. We set out to investigate the utility of an unsupervised machine learning technique; t-distributed stochastic neighbour embedding (t-SNE) in identifying 'unseen' sample population patterns that may exist in high-dimensional neuroimaging data. Multimodal neuroimaging scans from 92 healthy subjects were pre-processed using atlas-based methods, integrated and input into the t-SNE algorithm. Patterns and clusters discovered by the algorithm were visualized using a 2D scatter plot and further analyzed using the K-means clustering algorithm. t-SNE was evaluated against classical principal component analysis. Remarkably, based on unlabelled multimodal scan data, t-SNE separated study subjects into two very distinct clusters which corresponded to subjects' gender labels (cluster silhouette index value=0.79). The resulting clusters were used to develop an unsupervised minimum distance clustering model which identified 93.5% of subjects' gender. Notably, from a neuropsychiatric perspective this method may allow discovery of data-driven disease phenotypes or sub-types of treatment responders. Copyright © 2014 Elsevier B.V. All rights reserved.
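The post-embedding step described above, clustering the 2D coordinates and scoring the result with a silhouette index, can be sketched with plain numpy (the t-SNE embedding itself is omitted; this illustrates the standard K-means and silhouette definitions, not the study's code):

```python
import numpy as np

def kmeans(X, k=2, iters=50, seed=0):
    """Plain Lloyd's algorithm; a centroid stays put if its cluster empties."""
    rng = np.random.default_rng(seed)
    C = X[rng.choice(len(X), size=k, replace=False)].copy()
    lab = np.zeros(len(X), dtype=int)
    for _ in range(iters):
        lab = np.argmin(((X[:, None, :] - C[None, :, :]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(lab == j):
                C[j] = X[lab == j].mean(axis=0)
    return lab

def silhouette(X, lab):
    """Mean silhouette index: (b - a) / max(a, b) averaged over points."""
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    n = len(X)
    scores = []
    for i in range(n):
        own = (lab == lab[i]) & (np.arange(n) != i)
        if not own.any():
            scores.append(0.0)                 # singleton-cluster convention
            continue
        a = D[i, own].mean()                   # mean intra-cluster distance
        b = min(D[i, lab == c].mean() for c in np.unique(lab) if c != lab[i])
        scores.append((b - a) / max(a, b))
    return float(np.mean(scores))
```

A silhouette near 1 (the study reports 0.79 for the gender split) means points sit far closer to their own cluster than to the nearest other one.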
A multistage mathematical approach to automated clustering of high-dimensional noisy data
Friedman, Alexander; Keselman, Michael D.; Gibb, Leif G.; Graybiel, Ann M.
2015-01-01
A critical problem faced in many scientific fields is the adequate separation of data derived from individual sources. Often, such datasets require analysis of multiple features in a highly multidimensional space, with overlap of features and sources. The datasets generated by simultaneous recording from hundreds of neurons emitting phasic action potentials have produced the challenge of separating the recorded signals into independent data subsets (clusters) corresponding to individual signal-generating neurons. Mathematical methods have been developed over the past three decades to achieve such spike clustering, but a complete solution with fully automated cluster identification has not been achieved. We propose here a fully automated mathematical approach that identifies clusters in multidimensional space through recursion, which combats the multidimensionality of the data. Recursion is paired with an approach to dimensional evaluation, in which each dimension of a dataset is examined for its informational importance for clustering. The dimensions offering greater informational importance are given added weight during recursive clustering. To combat strong background activity, our algorithm takes an iterative approach of data filtering according to a signal-to-noise ratio metric. The algorithm finds cluster cores, which are thereafter expanded to include complete clusters. This mathematical approach can be extended from its prototype context of spike sorting to other datasets that suffer from high dimensionality and background activity. PMID:25831512
Evolutionary fields can explain patterns of high-dimensional complexity in ecology.
Wilsenach, James; Landi, Pietro; Hui, Cang
2017-04-01
One of the properties that make ecological systems so unique is the range of complex behavioral patterns that can be exhibited by even the simplest communities with only a few species. Much of this complexity is commonly attributed to stochastic factors that have very high degrees of freedom. Orthodox study of the evolution of these simple networks has generally been limited in its ability to explain complexity, since it restricts evolutionary adaptation to an inertia-free process with few degrees of freedom in which only gradual, moderately complex behaviors are possible. We propose a model inspired by particle-mediated field phenomena in classical physics in combination with fundamental concepts in adaptation, which suggests that small but high-dimensional chaotic dynamics near to the adaptive trait optimum could help explain complex properties shared by most ecological datasets, such as aperiodicity and pink, fractal noise spectra. By examining a simple predator-prey model and appealing to real ecological data, we show that this type of complexity could be easily confused for or confounded by stochasticity, especially when spurred on or amplified by stochastic factors that share variational and spectral properties with the underlying dynamics.
Schran, Christoph; Uhl, Felix; Behler, Jörg; Marx, Dominik
2018-03-01
The design of accurate helium-solute interaction potentials for the simulation of chemically complex molecules solvated in superfluid helium has long been a cumbersome task due to the rather weak but strongly anisotropic nature of the interactions. We show that this challenge can be met by using a combination of an effective pair potential for the He-He interactions and a flexible high-dimensional neural network potential (NNP) for describing the complex interaction between helium and the solute in a pairwise additive manner. This approach yields an excellent agreement with a mean absolute deviation as small as 0.04 kJ/mol for the interaction energy between helium and both hydronium and Zundel cations compared with coupled cluster reference calculations with an energetically converged basis set. The construction and improvement of the potential can be performed in a highly automated way, which opens the door for applications to a variety of reactive molecules to study the effect of solvation on the solute as well as the solute-induced structuring of the solvent. Furthermore, we show that this NNP approach yields very convincing agreement with the coupled cluster reference for properties like many-body spatial and radial distribution functions. This holds for the microsolvation of the protonated water monomer and dimer by a few helium atoms up to their solvation in bulk helium as obtained from path integral simulations at about 1 K.
Anomaly Detection in Large Sets of High-Dimensional Symbol Sequences
Budalakoti, Suratna; Srivastava, Ashok N.; Akella, Ram; Turkov, Eugene
2006-01-01
This paper addresses the problem of detecting and describing anomalies in large sets of high-dimensional symbol sequences. The approach taken uses unsupervised clustering of sequences using the normalized longest common subsequence (LCS) as a similarity measure, followed by detailed analysis of outliers to detect anomalies. As the LCS measure is expensive to compute, the first part of the paper discusses existing algorithms, such as the Hunt-Szymanski algorithm, that have low time-complexity. We then discuss why these algorithms often do not work well in practice and present a new hybrid algorithm for computing the LCS that, in our tests, outperforms the Hunt-Szymanski algorithm by a factor of five. The second part of the paper presents new algorithms for outlier analysis that provide comprehensible indicators as to why a particular sequence was deemed to be an outlier. The algorithms provide a coherent description to an analyst of the anomalies in the sequence, compared to more normal sequences. The algorithms we present are general and domain-independent, so we discuss applications in related areas such as anomaly detection.
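The normalized LCS similarity at the heart of the clustering step can be illustrated with the textbook dynamic-programming recurrence (quadratic time, which is why the paper pursues faster variants such as Hunt-Szymanski and its hybrid). The normalization shown here, 2L/(m+n), is one common convention and may differ from the paper's exact choice:

```python
def lcs_length(a, b):
    """Longest common subsequence length via dynamic programming,
    O(len(a) * len(b)) time."""
    m, n = len(a), len(b)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    return dp[m][n]

def nlcs(a, b):
    """Length-normalized LCS similarity in [0, 1]."""
    return 2.0 * lcs_length(a, b) / (len(a) + len(b)) if (a or b) else 1.0
```

Because the similarity is symmetric and bounded, it can feed directly into standard clustering, after which sequences with unusually low similarity to every cluster fall out as candidate anomalies.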
CMA – a comprehensive Bioconductor package for supervised classification with high dimensional data
Slawski, M; Daumer, M; Boulesteix, A-L
2008-01-01
Background: For the last eight years, microarray-based classification has been a major topic in statistics, bioinformatics and biomedicine research. Traditional methods often yield unsatisfactory results or may even be inapplicable in the so-called "p ≫ n" setting where the number of predictors p by far exceeds the number of observations n, hence the term "ill-posed problem". Careful model selection and evaluation satisfying accepted good-practice standards is a very complex task for statisticians without experience in this area or for scientists with limited statistical background. The multiplicity of available methods for class prediction based on high-dimensional data is an additional practical challenge for inexperienced researchers. Results: In this article, we introduce a new Bioconductor package called CMA (standing for "Classification for MicroArrays") for automatically performing variable selection, parameter tuning, classifier construction, and unbiased evaluation of the constructed classifiers using a large number of usual methods. Without much time and effort, users are provided with an overview of the unbiased accuracy of most top-performing classifiers. Furthermore, the standardized evaluation framework underlying CMA can also be beneficial in statistical research for comparison purposes, for instance if a new classifier has to be compared to existing approaches. Conclusion: CMA is a user-friendly comprehensive package for classifier construction and evaluation implementing most usual approaches. It is freely available from the Bioconductor website. PMID:18925941
Hou, Jiayi; Archer, Kellie J
2015-02-01
An ordinal scale is commonly used to measure health status and disease-related outcomes in hospital settings as well as in translational medical research. In addition, repeated measurements are common in clinical practice for tracking and monitoring the progression of complex diseases. Classical methodology based on statistical inference, in particular ordinal modeling, has contributed to the analysis of data in which the response categories are ordered and the number of covariates (p) remains smaller than the sample size (n). With the emergence of genomic technologies being increasingly applied for more accurate diagnosis and prognosis, high-dimensional data, where the number of covariates (p) is much larger than the number of samples (n), are generated. To meet the emerging needs, we introduce our proposed model, which is a two-stage algorithm: first, we extend the generalized monotone incremental forward stagewise (GMIFS) method to the cumulative logit ordinal model; second, we combine the GMIFS procedure with the classical mixed-effects model for classifying disease status over the course of disease progression. We demonstrate the efficiency and accuracy of the proposed models in classification using a time-course microarray dataset collected from the Inflammation and the Host Response to Injury study.
PCA leverage: outlier detection for high-dimensional functional magnetic resonance imaging data.
Mejia, Amanda F; Nebel, Mary Beth; Eloyan, Ani; Caffo, Brian; Lindquist, Martin A
2017-07-01
Outlier detection for high-dimensional (HD) data is a popular topic in modern statistical research. However, one source of HD data that has received relatively little attention is functional magnetic resonance images (fMRI), which consist of hundreds of thousands of measurements sampled at hundreds of time points. At a time when the availability of fMRI data is rapidly growing (primarily through large, publicly available grassroots datasets), automated quality control and outlier detection methods are greatly needed. We propose principal components analysis (PCA) leverage and demonstrate how it can be used to identify outlying time points in an fMRI run. Furthermore, PCA leverage is a measure of the influence of each observation on the estimation of principal components, which are often of interest in fMRI data. We also propose an alternative measure, PCA robust distance, which is less sensitive to outliers and has controllable statistical properties. The proposed methods are validated through simulation studies and are shown to be highly accurate. We also conduct a reliability study using resting-state fMRI data from the Autism Brain Imaging Data Exchange and find that removal of outliers using the proposed methods results in more reliable estimation of subject-level resting-state networks using independent components analysis. © The Author 2017. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
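Leverage with respect to a rank-k PCA has a compact form: the row sums of squares of the first k left singular vectors of the centered data. A numpy sketch of that quantity, assuming this standard formulation (the paper's full procedure, including thresholding and the robust-distance alternative, is not reproduced here):

```python
import numpy as np

def pca_leverage(Y, k):
    """Leverage of each observation (row, e.g. an fMRI time point)
    on the top-k principal components: row sums of squares of U_k
    from the SVD of the column-centered data."""
    Yc = Y - Y.mean(axis=0)
    U, s, Vt = np.linalg.svd(Yc, full_matrices=False)
    return (U[:, :k] ** 2).sum(axis=1)
```

Each leverage lies in [0, 1] and the values sum to k, so time points whose leverage is far above the average k/n are disproportionately driving the estimated components, which is the intuition behind flagging them as outliers.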
Defining and evaluating classification algorithm for high-dimensional data based on latent topics.
Directory of Open Access Journals (Sweden)
Le Luo
Automatic text categorization is one of the key techniques in information retrieval and the data mining field. The classification is usually time-consuming when the training dataset is large and high-dimensional. Many methods have been proposed to solve this problem, but few can achieve satisfactory efficiency. In this paper, we present a method which combines the Latent Dirichlet Allocation (LDA) algorithm and the Support Vector Machine (SVM). LDA is first used to generate a reduced-dimensional representation of topics as features in the vector space model (VSM). It is able to reduce features dramatically but keeps the necessary semantic information. The SVM is then employed to classify the data based on the generated features. We evaluate the algorithm on the 20 Newsgroups and Reuters-21578 datasets, respectively. The experimental results show that the classification based on our proposed LDA+SVM model achieves high performance in terms of precision, recall and F1 measure. Further, it can achieve this within a much shorter time-frame. Our process improves greatly upon the previous work in this field and displays strong potential to achieve a streamlined classification process for a wide range of applications.
A hyper-spherical adaptive sparse-grid method for high-dimensional discontinuity detection
Energy Technology Data Exchange (ETDEWEB)
Zhang, Guannan [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Webster, Clayton G. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Gunzburger, Max D. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Burkardt, John V. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)
2014-03-01
This work proposes and analyzes a hyper-spherical adaptive hierarchical sparse-grid method for detecting jump discontinuities of functions in high-dimensional spaces. The method is motivated by the theoretical and computational inefficiencies of well-known adaptive sparse-grid methods for discontinuity detection. Our novel approach constructs a function representation of the discontinuity hyper-surface of an N-dimensional discontinuous quantity of interest, by virtue of a hyper-spherical transformation. Then, a sparse-grid approximation of the transformed function is built in the hyper-spherical coordinate system, whose value at each point is estimated by solving a one-dimensional discontinuity detection problem. Due to the smoothness of the hyper-surface, the new technique can identify jump discontinuities with significantly reduced computational cost, compared to existing methods. Moreover, hierarchical acceleration techniques are also incorporated to further reduce the overall complexity. Rigorous error estimates and complexity analyses of the new method are provided as are several numerical examples that illustrate the effectiveness of the approach.
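The hyper-spherical transformation underlying the method maps Cartesian coordinates to a radius plus N-1 angles, so a discontinuity hyper-surface becomes a function r(angles). A numpy sketch of the standard coordinate change only (the sparse-grid machinery built on top of it is not shown):

```python
import numpy as np

def to_hyperspherical(x):
    """Map a point in R^N to (r, phi_1, ..., phi_{N-1}):
    phi_i in [0, pi] for i < N-1, phi_{N-1} in (-pi, pi]."""
    r = np.linalg.norm(x)
    phi = []
    for i in range(len(x) - 2):
        phi.append(np.arctan2(np.linalg.norm(x[i + 1:]), x[i]))
    phi.append(np.arctan2(x[-1], x[-2]))
    return r, np.array(phi)

def to_cartesian(r, phi):
    """Inverse map: x_i = r * cos(phi_i) * prod_{j<i} sin(phi_j)."""
    x = np.empty(len(phi) + 1)
    s = r
    for i, p in enumerate(phi):
        x[i] = s * np.cos(p)
        s *= np.sin(p)
    x[-1] = s
    return x
```

In these coordinates, detecting the jump along each angular direction reduces to a one-dimensional search in r, which is the step the paper performs at each sparse-grid point.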
Free-energy calculations along a high-dimensional fragmented path with constrained dynamics.
Chen, Changjun; Huang, Yanzhao; Xiao, Yi
2012-09-01
Free-energy calculations for high-dimensional systems, such as peptides or proteins, always suffer from a serious sampling problem in a huge conformational space. For such systems, path-based free-energy methods, such as thermodynamic integration or free-energy perturbation, are good choices. However, both of them need sufficient sampling along a predefined transition path, which can only be controlled using restrained or constrained dynamics. Constrained simulations produce more reasonable free-energy profiles than restrained simulations. But calculations of standard constrained dynamics require an explicit expression of reaction coordinates as a function of Cartesian coordinates of all related atoms, which may be difficult to find for the complex transition of biomolecules. In this paper, we propose a practical solution: (1) We use restrained dynamics to define an optimized transition path, divide it into small fragments, and define a virtual reaction coordinate to denote a position along the path. (2) We use constrained dynamics to perform a formal free-energy calculation for each fragment and collect the values together to provide the entire free-energy profile. This method avoids the requirement to explicitly define reaction coordinates in Cartesian coordinates and provides a novel strategy to perform free-energy calculations for biomolecules along any complex transition path.
Construction of high-dimensional neural network potentials using environment-dependent atom pairs.
Jose, K V Jovan; Artrith, Nongnuch; Behler, Jörg
2012-05-21
An accurate determination of the potential energy is the crucial step in computer simulations of chemical processes, but using electronic structure methods on-the-fly in molecular dynamics (MD) is computationally too demanding for many systems. Constructing more efficient interatomic potentials becomes intricate with increasing dimensionality of the potential-energy surface (PES), and for numerous systems the accuracy that can be achieved is still not satisfying and far from the reliability of first-principles calculations. Feed-forward neural networks (NNs) have a very flexible functional form, and in recent years they have been shown to be an accurate tool to construct efficient PESs. High-dimensional NN potentials based on environment-dependent atomic energy contributions have been presented for a number of materials. Still, these potentials may be improved by a more detailed structural description, e.g., in the form of atom pairs, which directly reflect the atomic interactions and take the chemical environment into account. We present an implementation of an NN method based on atom pairs, and its accuracy and performance are compared to the atom-based NN approach using two very different systems, the methanol molecule and metallic copper. We find that both types of NN potentials provide an excellent description of both PESs, with the pair-based method yielding a slightly higher accuracy, making it a competitive alternative for addressing complex systems in MD simulations.
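The atom-based scheme that the pair-based method is compared against can be sketched as follows: each atom receives an environment descriptor (here a toy radial-Gaussian stand-in for Behler-Parrinello symmetry functions), a small feed-forward NN shared by all atoms of an element maps the descriptor to an atomic energy contribution, and the total energy is their sum. The descriptor choice, network size, and random weights below are illustrative assumptions, not the published potentials.

```python
import numpy as np

def descriptors(positions, cutoff=3.0):
    """Toy per-atom environment descriptor: radial Gaussians over
    neighbours within a cutoff, with a smooth cutoff function."""
    etas = np.array([0.5, 1.0, 2.0])
    G = np.zeros((len(positions), len(etas)))
    for i, ri in enumerate(positions):
        for j, rj in enumerate(positions):
            if i == j:
                continue
            d = np.linalg.norm(ri - rj)
            if d < cutoff:
                fc = 0.5 * (np.cos(np.pi * d / cutoff) + 1.0)
                G[i] += np.exp(-etas * d ** 2) * fc
    return G

def atomic_nn_energy(G, W1, b1, w2, b2):
    """Small feed-forward NN mapping descriptors to atomic energies."""
    h = np.tanh(G @ W1 + b1)
    return h @ w2 + b2          # one energy contribution per atom

def total_energy(positions, params):
    """Total energy as the sum of environment-dependent atomic energies."""
    return atomic_nn_energy(descriptors(positions), *params).sum()

# Hypothetical random weights for a 3-descriptor, 4-hidden-unit atomic NN
rng = np.random.default_rng(0)
params = (rng.normal(size=(3, 4)), rng.normal(size=4), rng.normal(size=4), 0.0)
positions = rng.normal(size=(5, 3))
e_total = total_energy(positions, params)
e_permuted = total_energy(positions[::-1], params)  # atom order must not matter
```

Because the descriptors permute with the atoms and the energy is a sum over atoms, the total energy is invariant under atom reordering, a key property of this class of high-dimensional NN potentials. The pair-based variant discussed in the paper would instead assign energy contributions to environment-dependent atom pairs.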
Relating high dimensional stochastic complex systems to low-dimensional intermittency
Diaz-Ruelas, Alvaro; Jensen, Henrik Jeldtoft; Piovani, Duccio; Robledo, Alberto
2017-02-01
We evaluate the implication and outlook of an unanticipated simplification in the macroscopic behavior of two high-dimensional stochastic models: the Replicator Model with Mutations and the Tangled Nature Model (TaNa) of evolutionary ecology. This simplification consists of the apparent display of low-dimensional dynamics in the non-stationary intermittent time evolution of the model on a coarse-grained scale. Evolution on this time scale spans generations of individuals, rather than single reproduction, death or mutation events. While a local one-dimensional map close to a tangent bifurcation can be derived from a mean-field version of the TaNa model, a nonlinear dynamical model consisting of successive tangent bifurcations generates time evolution patterns resembling those of the full TaNa model. To advance the interpretation of this finding, here we consider parallel results on a game-theoretic version of the TaNa model that in discrete time yields a coupled map lattice. This in turn is represented, à la Langevin, by a one-dimensional nonlinear map. Among various kinds of behaviours we obtain intermittent evolution associated with tangent bifurcations. We discuss our results.
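The tangent-bifurcation mechanism behind such intermittency can be sketched with the standard local map x_{n+1} = ε + x_n + u·x_n² (type-I intermittency): for small ε > 0 the orbit drifts slowly through a narrow channel near the former fixed point, producing long laminar phases punctuated by bursts. The reinjection rule below is an ad hoc illustrative assumption, not part of the TaNa-derived map.

```python
import numpy as np

def type_i_intermittency(eps, u=1.0, x0=-0.2, n=2000):
    """Iterate x_{n+1} = eps + x_n + u*x_n^2, the local normal form of a
    map close to a tangent bifurcation, with a crude reinjection rule.
    Returns the trajectory and the number of bursts (reinjections)."""
    xs = np.empty(n)
    bursts = 0
    x = x0
    for i in range(n):
        xs[i] = x
        x = eps + x + u * x * x
        if x > 1.0:            # orbit escaped the channel: burst, reinject
            x = -0.2
            bursts += 1
    return xs, bursts

# Laminar phases lengthen (fewer bursts per fixed time span) as eps -> 0+,
# since the channel passage time scales like 1/sqrt(eps).
traj_slow, bursts_slow = type_i_intermittency(eps=1e-4)
traj_fast, bursts_fast = type_i_intermittency(eps=1e-2)
```

The coarse-grained TaNa dynamics described above resemble exactly this alternation: long quasi-stable epochs (laminar phases) interrupted by rapid transitions (bursts) when the system passes near successive tangent bifurcations.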