Parallel Sparse Matrix - Vector Product
DEFF Research Database (Denmark)
Alexandersen, Joe; Lazarov, Boyan Stefanov; Dammann, Bernd
This technical report contains a case study of a sparse matrix-vector product routine, implemented for parallel execution on a compute cluster with both pure MPI and hybrid MPI-OpenMP solutions. C++ classes for sparse data types were developed and the report shows how these class can be used...
Massive Asynchronous Parallelization of Sparse Matrix Factorizations
Energy Technology Data Exchange (ETDEWEB)
Chow, Edmond [Georgia Inst. of Technology, Atlanta, GA (United States)
2018-01-08
Solving sparse problems is at the core of many DOE computational science applications. We focus on the challenge of developing sparse algorithms that can fully exploit the parallelism in extreme-scale computing systems, in particular systems with massive numbers of cores per node. Our approach is to express a sparse matrix factorization as a large number of bilinear constraint equations, and then solving these equations via an asynchronous iterative method. The unknowns in these equations are the matrix entries of the factorization that is desired.
Parallel transposition of sparse data structures
DEFF Research Database (Denmark)
Wang, Hao; Liu, Weifeng; Hou, Kaixi
2016-01-01
Many applications in computational sciences and social sciences exploit sparsity and connectivity of acquired data. Even though many parallel sparse primitives such as sparse matrix-vector (SpMV) multiplication have been extensively studied, some other important building blocks, e.g., parallel tr...... transposition in the latest vendor-supplied library on an Intel multicore CPU platform, and the MergeTrans approach achieves on average of 3.4-fold (up to 11.7-fold) speedup on an Intel Xeon Phi many-core processor....
Parallel preconditioning techniques for sparse CG solvers
Energy Technology Data Exchange (ETDEWEB)
Basermann, A.; Reichel, B.; Schelthoff, C. [Central Institute for Applied Mathematics, Juelich (Germany)
1996-12-31
Conjugate gradient (CG) methods to solve sparse systems of linear equations play an important role in numerical methods for solving discretized partial differential equations. The large size and the condition of many technical or physical applications in this area result in the need for efficient parallelization and preconditioning techniques of the CG method. In particular for very ill-conditioned matrices, sophisticated preconditioner are necessary to obtain both acceptable convergence and accuracy of CG. Here, we investigate variants of polynomial and incomplete Cholesky preconditioners that markedly reduce the iterations of the simply diagonally scaled CG and are shown to be well suited for massively parallel machines.
Partitioning sparse rectangular matrices for parallel processing
Energy Technology Data Exchange (ETDEWEB)
Kolda, T.G.
1998-05-01
The authors are interested in partitioning sparse rectangular matrices for parallel processing. The partitioning problem has been well-studied in the square symmetric case, but the rectangular problem has received very little attention. They will formalize the rectangular matrix partitioning problem and discuss several methods for solving it. They will extend the spectral partitioning method for symmetric matrices to the rectangular case and compare this method to three new methods -- the alternating partitioning method and two hybrid methods. The hybrid methods will be shown to be best.
Parallel sparse direct solver for integrated circuit simulation
Chen, Xiaoming; Yang, Huazhong
2017-01-01
This book describes algorithmic methods and parallelization techniques to design a parallel sparse direct solver which is specifically targeted at integrated circuit simulation problems. The authors describe a complete flow and detailed parallel algorithms of the sparse direct solver. They also show how to improve the performance by simple but effective numerical techniques. The sparse direct solver techniques described can be applied to any SPICE-like integrated circuit simulator and have been proven to be high-performance in actual circuit simulation. Readers will benefit from the state-of-the-art parallel integrated circuit simulation techniques described in this book, especially the latest parallel sparse matrix solution techniques. · Introduces complicated algorithms of sparse linear solvers, using concise principles and simple examples, without complex theory or lengthy derivations; · Describes a parallel sparse direct solver that can be adopted to accelerate any SPICE-like integrated circuit simulato...
On the Automatic Parallelization of Sparse and Irregular Fortran Programs
Directory of Open Access Journals (Sweden)
Yuan Lin
1999-01-01
Full Text Available Automatic parallelization is usually believed to be less effective at exploiting implicit parallelism in sparse/irregular programs than in their dense/regular counterparts. However, not much is really known because there have been few research reports on this topic. In this work, we have studied the possibility of using an automatic parallelizing compiler to detect the parallelism in sparse/irregular programs. The study with a collection of sparse/irregular programs led us to some common loop patterns. Based on these patterns new techniques were derived that produced good speedups when manually applied to our benchmark codes. More importantly, these parallelization methods can be implemented in a parallelizing compiler and can be applied automatically.
Sparse BLIP: BLind Iterative Parallel imaging reconstruction using compressed sensing.
She, Huajun; Chen, Rong-Rong; Liang, Dong; DiBella, Edward V R; Ying, Leslie
2014-02-01
To develop a sensitivity-based parallel imaging reconstruction method to reconstruct iteratively both the coil sensitivities and MR image simultaneously based on their prior information. Parallel magnetic resonance imaging reconstruction problem can be formulated as a multichannel sampling problem where solutions are sought analytically. However, the channel functions given by the coil sensitivities in parallel imaging are not known exactly and the estimation error usually leads to artifacts. In this study, we propose a new reconstruction algorithm, termed Sparse BLind Iterative Parallel, for blind iterative parallel imaging reconstruction using compressed sensing. The proposed algorithm reconstructs both the sensitivity functions and the image simultaneously from undersampled data. It enforces the sparseness constraint in the image as done in compressed sensing, but is different from compressed sensing in that the sensing matrix is unknown and additional constraint is enforced on the sensitivities as well. Both phantom and in vivo imaging experiments were carried out with retrospective undersampling to evaluate the performance of the proposed method. Experiments show improvement in Sparse BLind Iterative Parallel reconstruction when compared with Sparse SENSE, JSENSE, IRGN-TV, and L1-SPIRiT reconstructions with the same number of measurements. The proposed Sparse BLind Iterative Parallel algorithm reduces the reconstruction errors when compared to the state-of-the-art parallel imaging methods. Copyright © 2013 Wiley Periodicals, Inc.
Iterative algorithms for large sparse linear systems on parallel computers
Adams, L. M.
1982-01-01
Algorithms for assembling in parallel the sparse system of linear equations that result from finite difference or finite element discretizations of elliptic partial differential equations, such as those that arise in structural engineering are developed. Parallel linear stationary iterative algorithms and parallel preconditioned conjugate gradient algorithms are developed for solving these systems. In addition, a model for comparing parallel algorithms on array architectures is developed and results of this model for the algorithms are given.
Massively parallel sparse matrix function calculations with NTPoly
Dawson, William; Nakajima, Takahito
2018-04-01
We present NTPoly, a massively parallel library for computing the functions of sparse, symmetric matrices. The theory of matrix functions is a well developed framework with a wide range of applications including differential equations, graph theory, and electronic structure calculations. One particularly important application area is diagonalization free methods in quantum chemistry. When the input and output of the matrix function are sparse, methods based on polynomial expansions can be used to compute matrix functions in linear time. We present a library based on these methods that can compute a variety of matrix functions. Distributed memory parallelization is based on a communication avoiding sparse matrix multiplication algorithm. OpenMP task parallellization is utilized to implement hybrid parallelization. We describe NTPoly's interface and show how it can be integrated with programs written in many different programming languages. We demonstrate the merits of NTPoly by performing large scale calculations on the K computer.
P-SPARSLIB: A parallel sparse iterative solution package
Energy Technology Data Exchange (ETDEWEB)
Saad, Y. [Univ. of Minnesota, Minneapolis, MN (United States)
1994-12-31
Iterative methods are gaining popularity in engineering and sciences at a time where the computational environment is changing rapidly. P-SPARSLIB is a project to build a software library for sparse matrix computations on parallel computers. The emphasis is on iterative methods and the use of distributed sparse matrices, an extension of the domain decomposition approach to general sparse matrices. One of the goals of this project is to develop a software package geared towards specific applications. For example, the author will test the performance and usefulness of P-SPARSLIB modules on linear systems arising from CFD applications. Equally important is the goal of portability. In the long run, the author wishes to ensure that this package is portable on a variety of platforms, including SIMD environments and shared memory environments.
Sparse Parallel MRI Based on Accelerated Operator Splitting Schemes.
Cai, Nian; Xie, Weisi; Su, Zhenghang; Wang, Shanshan; Liang, Dong
2016-01-01
Recently, the sparsity which is implicit in MR images has been successfully exploited for fast MR imaging with incomplete acquisitions. In this paper, two novel algorithms are proposed to solve the sparse parallel MR imaging problem, which consists of l 1 regularization and fidelity terms. The two algorithms combine forward-backward operator splitting and Barzilai-Borwein schemes. Theoretically, the presented algorithms overcome the nondifferentiable property in l 1 regularization term. Meanwhile, they are able to treat a general matrix operator that may not be diagonalized by fast Fourier transform and to ensure that a well-conditioned optimization system of equations is simply solved. In addition, we build connections between the proposed algorithms and the state-of-the-art existing methods and prove their convergence with a constant stepsize in Appendix. Numerical results and comparisons with the advanced methods demonstrate the efficiency of proposed algorithms.
Storage of sparse files using parallel log-structured file system
Bent, John M.; Faibish, Sorin; Grider, Gary; Torres, Aaron
2017-11-07
A sparse file is stored without holes by storing a data portion of the sparse file using a parallel log-structured file system; and generating an index entry for the data portion, the index entry comprising a logical offset, physical offset and length of the data portion. The holes can be restored to the sparse file upon a reading of the sparse file. The data portion can be stored at a logical end of the sparse file. Additional storage efficiency can optionally be achieved by (i) detecting a write pattern for a plurality of the data portions and generating a single patterned index entry for the plurality of the patterned data portions; and/or (ii) storing the patterned index entries for a plurality of the sparse files in a single directory, wherein each entry in the single directory comprises an identifier of a corresponding sparse file.
Parallel and Scalable Sparse Basic Linear Algebra Subprograms
DEFF Research Database (Denmark)
Liu, Weifeng
and heterogeneous processors. The thesis compares the proposed methods with state-of-the-art approaches on six homogeneous and five heterogeneous processors from Intel, AMD and nVidia. Using in total 38 sparse matrices as a benchmark suite, the experimental results show that the proposed methods obtain significant...
User's Manual for PCSMS (Parallel Complex Sparse Matrix Solver). Version 1.
Reddy, C. J.
2000-01-01
PCSMS (Parallel Complex Sparse Matrix Solver) is a computer code written to make use of the existing real sparse direct solvers to solve complex, sparse matrix linear equations. PCSMS converts complex matrices into real matrices and use real, sparse direct matrix solvers to factor and solve the real matrices. The solution vector is reconverted to complex numbers. Though, this utility is written for Silicon Graphics (SGI) real sparse matrix solution routines, it is general in nature and can be easily modified to work with any real sparse matrix solver. The User's Manual is written to make the user acquainted with the installation and operation of the code. Driver routines are given to aid the users to integrate PCSMS routines in their own codes.
Parallel sparse direct solvers for Poisson's equation in streamer discharges
M. Nool (Margreet); M. Genseberger (Menno); U. M. Ebert (Ute)
2017-01-01
textabstractThe aim of this paper is to examine whether a hybrid approach of parallel computing, a combination of the message passing model (MPI) with the threads model (OpenMP) can deliver good performance in streamer discharge simulations. Since one of the bottlenecks of almost all streamer
Parallelized preconditioned BiCGStab solution of sparse linear system equations in F-COBRA-TF
International Nuclear Information System (INIS)
Geemert, Rene van; Glück, Markus; Riedmann, Michael; Gabriel, Harry
2011-01-01
Recently, the in-house development of a preconditioned and parallelized BiCGStab solver has been pursued successfully in AREVA’s advanced sub-channel code F-COBRA-TF. This solver can be run either in a sequential computation mode on a single CPU, or in a parallel computation mode on multiple parallel CPUs. The developed procedure enables the computation of several thousands of successive sparse linear system solutions in F-COBRA-TF with acceptable wall clock run times. The current paper provides general information about F-COBRA-TF in terms of modeling capabilities and application areas, and points out where the relevance arises for the efficient iterative solution of sparse linear systems. Furthermore, the preconditioning and parallelization strategies in the developed BiCGStab iterative solution approach are discussed. The paper is concluded with a number of verification examples. (author)
Sparse Probabilistic Parallel Factor Analysis for the Modeling of PET and Task-fMRI Data
DEFF Research Database (Denmark)
Beliveau, Vincent; Papoutsakis, Georgios; Hinrich, Jesper Løve
2017-01-01
Modern datasets are often multiway in nature and can contain patterns common to a mode of the data (e.g. space, time, and subjects). Multiway decomposition such as parallel factor analysis (PARAFAC) take into account the intrinsic structure of the data, and sparse versions of these methods improv...
Graph Transformation and Designing Parallel Sparse Matrix Algorithms beyond Data Dependence Analysis
Directory of Open Access Journals (Sweden)
H.X. Lin
2004-01-01
Full Text Available Algorithms are often parallelized based on data dependence analysis manually or by means of parallel compilers. Some vector/matrix computations such as the matrix-vector products with simple data dependence structures (data parallelism can be easily parallelized. For problems with more complicated data dependence structures, parallelization is less straightforward. The data dependence graph is a powerful means for designing and analyzing parallel algorithms. However, for sparse matrix computations, parallelization based on solely exploiting the existing parallelism in an algorithm does not always give satisfactory results. For example, the conventional Gaussian elimination algorithm for the solution of a tri-diagonal system is inherently sequential, so algorithms specially for parallel computation has to be designed. After briefly reviewing different parallelization approaches, a powerful graph formalism for designing parallel algorithms is introduced. This formalism will be discussed using a tri-diagonal system as an example. Its application to general matrix computations is also discussed. Its power in designing parallel algorithms beyond the ability of data dependence analysis is shown by means of a new algorithm called ACER (Alternating Cyclic Elimination and Reduction algorithm.
Building Input Adaptive Parallel Applications: A Case Study of Sparse Grid Interpolation
Murarasu, Alin
2012-12-01
The well-known power wall resulting in multi-cores requires special techniques for speeding up applications. In this sense, parallelization plays a crucial role. Besides standard serial optimizations, techniques such as input specialization can also bring a substantial contribution to the speedup. By identifying common patterns in the input data, we propose new algorithms for sparse grid interpolation that accelerate the state-of-the-art non-specialized version. Sparse grid interpolation is an inherently hierarchical method of interpolation employed for example in computational steering applications for decompressing highdimensional simulation data. In this context, improving the speedup is essential for real-time visualization. Using input specialization, we report a speedup of up to 9x over the nonspecialized version. The paper covers the steps we took to reach this speedup by means of input adaptivity. Our algorithms will be integrated in fastsg, a library for fast sparse grid interpolation. © 2012 IEEE.
Learning Joint-Sparse Codes for Calibration-Free Parallel MR Imaging.
Wang, Shanshan; Tan, Sha; Gao, Yuan; Liu, Qiegen; Ying, Leslie; Xiao, Taohui; Liu, Yuanyuan; Liu, Xin; Zheng, Hairong; Liang, Dong
2018-01-01
The integration of compressed sensing and parallel imaging (CS-PI) has shown an increased popularity in recent years to accelerate magnetic resonance (MR) imaging. Among them, calibration-free techniques have presented encouraging performances due to its capability in robustly handling the sensitivity information. Unfortunately, existing calibration-free methods have only explored joint-sparsity with direct analysis transform projections. To further exploit joint-sparsity and improve reconstruction accuracy, this paper proposes to Learn joINt-sparse coDes for caliBration-free parallEl mR imaGing (LINDBERG) by modeling the parallel MR imaging problem as an - - minimization objective with an norm constraining data fidelity, Frobenius norm enforcing sparse representation error and the mixed norm triggering joint sparsity across multichannels. A corresponding algorithm has been developed to alternatively update the sparse representation, sensitivity encoded images and K-space data. Then, the final image is produced as the square root of sum of squares of all channel images. Experimental results on both physical phantom and in vivo data sets show that the proposed method is comparable and even superior to state-of-the-art CS-PI reconstruction approaches. Specifically, LINDBERG has presented strong capability in suppressing noise and artifacts while reconstructing MR images from highly undersampled multichannel measurements.
A new scheduling algorithm for parallel sparse LU factorization with static pivoting
Energy Technology Data Exchange (ETDEWEB)
Grigori, Laura; Li, Xiaoye S.
2002-08-20
In this paper we present a static scheduling algorithm for parallel sparse LU factorization with static pivoting. The algorithm is divided into mapping and scheduling phases, using the symmetric pruned graphs of L' and U to represent dependencies. The scheduling algorithm is designed for driving the parallel execution of the factorization on a distributed-memory architecture. Experimental results and comparisons with SuperLU{_}DIST are reported after applying this algorithm on real world application matrices on an IBM SP RS/6000 distributed memory machine.
Parallel Reservoir Simulations with Sparse Grid Techniques and Applications to Wormhole Propagation
Wu, Yuanqing
2015-09-08
In this work, two topics of reservoir simulations are discussed. The first topic is the two-phase compositional flow simulation in hydrocarbon reservoir. The major obstacle that impedes the applicability of the simulation code is the long run time of the simulation procedure, and thus speeding up the simulation code is necessary. Two means are demonstrated to address the problem: parallelism in physical space and the application of sparse grids in parameter space. The parallel code can gain satisfactory scalability, and the sparse grids can remove the bottleneck of flash calculations. Instead of carrying out the flash calculation in each time step of the simulation, a sparse grid approximation of all possible results of the flash calculation is generated before the simulation. Then the constructed surrogate model is evaluated to approximate the flash calculation results during the simulation. The second topic is the wormhole propagation simulation in carbonate reservoir. In this work, different from the traditional simulation technique relying on the Darcy framework, we propose a new framework called Darcy-Brinkman-Forchheimer framework to simulate wormhole propagation. Furthermore, to process the large quantity of cells in the simulation grid and shorten the long simulation time of the traditional serial code, standard domain-based parallelism is employed, using the Hypre multigrid library. In addition to that, a new technique called “experimenting field approach” to set coefficients in the model equations is introduced. In the 2D dissolution experiments, different configurations of wormholes and a series of properties simulated by both frameworks are compared. We conclude that the numerical results of the DBF framework are more like wormholes and more stable than the Darcy framework, which is a demonstration of the advantages of the DBF framework. The scalability of the parallel code is also evaluated, and good scalability can be achieved. Finally, a mixed
Turbo-SMT: Parallel Coupled Sparse Matrix-Tensor Factorizations and Applications
Papalexakis, Evangelos E.; Faloutsos, Christos; Mitchell, Tom M.; Talukdar, Partha Pratim; Sidiropoulos, Nicholas D.; Murphy, Brian
2016-01-01
How can we correlate the neural activity in the human brain as it responds to typed words, with properties of these terms (like ’edible’, ’fits in hand’)? In short, we want to find latent variables, that jointly explain both the brain activity, as well as the behavioral responses. This is one of many settings of the Coupled Matrix-Tensor Factorization (CMTF) problem. Can we enhance any CMTF solver, so that it can operate on potentially very large datasets that may not fit in main memory? We introduce Turbo-SMT, a meta-method capable of doing exactly that: it boosts the performance of any CMTF algorithm, produces sparse and interpretable solutions, and parallelizes any CMTF algorithm, producing sparse and interpretable solutions (up to 65 fold). Additionally, we improve upon ALS, the work-horse algorithm for CMTF, with respect to efficiency and robustness to missing values. We apply Turbo-SMT to BrainQ, a dataset consisting of a (nouns, brain voxels, human subjects) tensor and a (nouns, properties) matrix, with coupling along the nouns dimension. Turbo-SMT is able to find meaningful latent variables, as well as to predict brain activity with competitive accuracy. Finally, we demonstrate the generality of Turbo-SMT, by applying it on a Facebook dataset (users, ’friends’, wall-postings); there, Turbo-SMT spots spammer-like anomalies. PMID:27672406
Building Input Adaptive Parallel Applications: A Case Study of Sparse Grid Interpolation
Murarasu, Alin; Weidendorfer, Josef
2012-01-01
bring a substantial contribution to the speedup. By identifying common patterns in the input data, we propose new algorithms for sparse grid interpolation that accelerate the state-of-the-art non-specialized version. Sparse grid interpolation
A Non-static Data Layout Enhancing Parallelism and Vectorization in Sparse Grid Algorithms
Buse, Gerrit; Pfluger, Dirk; Murarasu, Alin; Jacob, Riko
2012-01-01
performance and facilitate the use of vector registers for our sparse grid benchmark problem hierarchization. Based on the compact data structure proposed for regular sparse grids in [2], we developed a new algorithm that outperforms existing implementations
A Non-static Data Layout Enhancing Parallelism and Vectorization in Sparse Grid Algorithms
Buse, Gerrit
2012-06-01
The name sparse grids denotes a highly space-efficient, grid-based numerical technique to approximate high-dimensional functions. Although employed in a broad spectrum of applications from different fields, there have only been few tries to use it in real time visualization (e.g. [1]), due to complex data structures and long algorithm runtime. In this work we present a novel approach inspired by principles of I/0-efficient algorithms. Locally applied coefficient permutations lead to improved cache performance and facilitate the use of vector registers for our sparse grid benchmark problem hierarchization. Based on the compact data structure proposed for regular sparse grids in [2], we developed a new algorithm that outperforms existing implementations on modern multi-core systems by a factor of 37 for a grid size of 127 million points. For larger problems the speedup is even increasing, and with execution times below 1 s, sparse grids are well-suited for visualization applications. Furthermore, we point out how a broad class of sparse grid algorithms can benefit from our approach. © 2012 IEEE.
Communications oriented programming of parallel iterative solutions of sparse linear systems
Patrick, M. L.; Pratt, T. W.
1986-01-01
Parallel algorithms are developed for a class of scientific computational problems by partitioning the problems into smaller problems which may be solved concurrently. The effectiveness of the resulting parallel solutions is determined by the amount and frequency of communication and synchronization and the extent to which communication can be overlapped with computation. Three different parallel algorithms for solving the same class of problems are presented, and their effectiveness is analyzed from this point of view. The algorithms are programmed using a new programming environment. Run-time statistics and experience obtained from the execution of these programs assist in measuring the effectiveness of these algorithms.
International Nuclear Information System (INIS)
Lin Lin; Chao Yang; Jiangfeng Lu; Lexing Ying; Weinan, E.
2009-01-01
We present an efficient parallel algorithm and its implementation for computing the diagonal of H -1 where H is a 2D Kohn-Sham Hamiltonian discretized on a rectangular domain using a standard second order finite difference scheme. This type of calculation can be used to obtain an accurate approximation to the diagonal of a Fermi-Dirac function of H through a recently developed pole-expansion technique LinLuYingE2009. The diagonal elements are needed in electronic structure calculations for quantum mechanical systems HohenbergKohn1964, KohnSham 1965,DreizlerGross1990. We show how elimination tree is used to organize the parallel computation and how synchronization overhead is reduced by passing data level by level along this tree using the technique of local buffers and relative indices. We analyze the performance of our implementation by examining its load balance and communication overhead. We show that our implementation exhibits an excellent weak scaling on a large-scale high performance distributed parallel machine. When compared with standard approach for evaluating the diagonal a Fermi-Dirac function of a Kohn-Sham Hamiltonian associated a 2D electron quantum dot, the new pole-expansion technique that uses our algorithm to compute the diagonal of (H-z i I) -1 for a small number of poles z i is much faster, especially when the quantum dot contains many electrons.
Energy Technology Data Exchange (ETDEWEB)
Lin, Lin; Yang, Chao; Lu, Jiangfeng; Ying, Lexing; E, Weinan
2009-09-25
We present an efficient parallel algorithm and its implementation for computing the diagonal of $H^-1$ where $H$ is a 2D Kohn-Sham Hamiltonian discretized on a rectangular domain using a standard second order finite difference scheme. This type of calculation can be used to obtain an accurate approximation to the diagonal of a Fermi-Dirac function of $H$ through a recently developed pole-expansion technique \\cite{LinLuYingE2009}. The diagonal elements are needed in electronic structure calculations for quantum mechanical systems \\citeHohenbergKohn1964, KohnSham 1965,DreizlerGross1990. We show how elimination tree is used to organize the parallel computation and how synchronization overhead is reduced by passing data level by level along this tree using the technique of local buffers and relative indices. We analyze the performance of our implementation by examining its load balance and communication overhead. We show that our implementation exhibits an excellent weak scaling on a large-scale high performance distributed parallel machine. When compared with standard approach for evaluating the diagonal a Fermi-Dirac function of a Kohn-Sham Hamiltonian associated a 2D electron quantum dot, the new pole-expansion technique that uses our algorithm to compute the diagonal of $(H-z_i I)^-1$ for a small number of poles $z_i$ is much faster, especially when the quantum dot contains many electrons.
Scemama, Anthony; Renon, Nicolas; Rapacioli, Mathias
2014-06-10
We present an algorithm and its parallel implementation for solving a self-consistent problem as encountered in Hartree-Fock or density functional theory. The algorithm takes advantage of the sparsity of matrices through the use of local molecular orbitals. The implementation allows one to exploit efficiently modern symmetric multiprocessing (SMP) computer architectures. As a first application, the algorithm is used within the density-functional-based tight binding method, for which most of the computational time is spent in the linear algebra routines (diagonalization of the Fock/Kohn-Sham matrix). We show that with this algorithm (i) single point calculations on very large systems (millions of atoms) can be performed on large SMP machines, (ii) calculations involving intermediate size systems (1000-100 000 atoms) are also strongly accelerated and can run efficiently on standard servers, and (iii) the error on the total energy due to the use of a cutoff in the molecular orbital coefficients can be controlled such that it remains smaller than the SCF convergence criterion.
Yoon, Jeong Hee; Lee, Jeong Min; Yu, Mi Hye; Hur, Bo Yun; Grimm, Robert; Block, Kai Tobias; Chandarana, Hersh; Kiefer, Berthold; Son, Yohan
2018-01-01
The aims of this study were to observe the pattern of transient motion after gadoxetic acid administration including incidence, onset, and duration, and to evaluate the clinical feasibility of free-breathing gadoxetic acid-enhanced liver magnetic resonance imaging using golden-angle radial sparse parallel (GRASP) imaging with respiratory gating. In this institutional review board-approved prospective study, 59 patients who provided informed consents were analyzed. Free-breathing dynamic T1-weighted images (T1WIs) were obtained using GRASP at 3 T after a standard dose of gadoxetic acid (0.025 mmol/kg) administration at a rate of 1 mL/s, and development of transient motion was monitored, which is defined as a distinctive respiratory frequency alteration of the self-gating MR signals. Early arterial, late arterial, and portal venous phases retrospectively reconstructed with and without respiratory gating and with different temporal resolutions (nongated 13.3-second, gated 13.3-second, gated 6-second T1WI) were evaluated for image quality and motion artifacts. Diagnostic performance in detecting focal liver lesions was compared among the 3 data sets. Transient motion (mean duration, 21.5 ± 13.0 seconds) was observed in 40.0% (23/59) of patients, 73.9% (17/23) of which developed within 15 seconds after gadoxetic acid administration. On late arterial phase, motion artifacts were significantly reduced on gated 13.3-second and 6-second T1WI (3.64 ± 0.34, 3.61 ± 0.36, respectively), compared with nongated 13.3-second T1WI (3.12 ± 0.51, P < 0.0001). Overall, image quality was the highest on gated 13.3-second T1WI (3.76 ± 0.39) followed by gated 6-second and nongated 13.3-second T1WI (3.39 ± 0.55, 2.57 ± 0.57, P < 0.0001). Only gated 6-second T1WI showed significantly higher detection performance than nongated 13.3-second T1WI (figure of merit, 0.69 [0.63-0.76]) vs 0.60 [0.56-0.65], P = 0.004). Transient motion developed in 40% (23/59) of patients shortly after
Sparse distributed memory overview
Raugh, Mike
1990-01-01
The Sparse Distributed Memory (SDM) project is investigating the theory and applications of massively parallel computing architecture, called sparse distributed memory, that will support the storage and retrieval of sensory and motor patterns characteristic of autonomous systems. The immediate objectives of the project are centered in studies of the memory itself and in the use of the memory to solve problems in speech, vision, and robotics. Investigation of methods for encoding sensory data is an important part of the research. Examples of NASA missions that may benefit from this work are Space Station, planetary rovers, and solar exploration. Sparse distributed memory offers promising technology for systems that must learn through experience and be capable of adapting to new circumstances, and for operating any large complex system requiring automatic monitoring and control. Sparse distributed memory is a massively parallel architecture motivated by efforts to understand how the human brain works. Sparse distributed memory is an associative memory, able to retrieve information from cues that only partially match patterns stored in the memory. It is able to store long temporal sequences derived from the behavior of a complex system, such as progressive records of the system's sensory data and correlated records of the system's motor controls.
Chen, Lihua; Liu, Daihong; Zhang, Jiuquan; Xie, Bing; Zhou, Xiaoyue; Grimm, Robert; Huang, Xuequan; Wang, Jian; Feng, Li
2018-02-13
Dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) has been shown to be a promising technique for assessing lung lesions. However, DCE-MRI often suffers from motion artifacts and insufficient imaging speed. Therefore, highly accelerated free-breathing DCE-MRI is of clinical interest for lung exams. To test the performance of rapid free-breathing DCE-MRI for simultaneous qualitative and quantitative assessment of pulmonary lesions using Golden-angle RAdial Sparse Parallel (GRASP) imaging. Prospective. Twenty-six patients (17 males, mean age = 55.1 ± 14.4) with known pulmonary lesions. 3T MR scanner; a prototype fat-saturated, T 1 -weighted stack-of-stars golden-angle radial sequence for data acquisition and a Cartesian breath-hold volumetric-interpolated examination (BH-VIBE) sequence for comparison. After a dual-mode GRASP reconstruction, one with 3-second temporal resolution (3s-GRASP) and the other with 15-second temporal resolution (15s-GRASP), all GRASP and BH-VIBE images were pooled together for blind assessment by two experienced radiologists, who independently scored the overall image quality, lesion delineation, overall artifact level, and diagnostic confidence of each case. Perfusion analysis was performed for the 3s-GRASP images using a Tofts model to generate the volume transfer coefficient (K trans ) and interstitial volume (V e ). Nonparametric paired two-tailed Wilcoxon signed-rank test; Cohen's kappa; unpaired Student's t-test. 15s-GRASP achieved comparable image quality with conventional BH-VIBE (P > 0.05), except for the higher overall artifact level in the precontrast phase (P = 0.018). The K trans and V e in inflammation were higher than those in malignant lesions (K trans : 0.78 ± 0.52 min -1 vs. 0.37 ± 0.22 min -1 , P = 0.020; V e : 0.36 ± 0.16 vs. 0.26 ± 0.1, P = 0.177). Also, the K trans and V e in malignant lesions were also higher than those in benign lesions (K trans : 0.37
Enhancing Scalability of Sparse Direct Methods
International Nuclear Information System (INIS)
Li, Xiaoye S.; Demmel, James; Grigori, Laura; Gu, Ming; Xia, Jianlin; Jardin, Steve; Sovinec, Carl; Lee, Lie-Quan
2007-01-01
TOPS is providing high-performance, scalable sparse direct solvers, which have had significant impacts on the SciDAC applications, including fusion simulation (CEMM), accelerator modeling (COMPASS), as well as many other mission-critical applications in DOE and elsewhere. Our recent developments have been focusing on new techniques to overcome scalability bottleneck of direct methods, in both time and memory. These include parallelizing symbolic analysis phase and developing linear-complexity sparse factorization methods. The new techniques will make sparse direct methods more widely usable in large 3D simulations on highly-parallel petascale computers
Parallelism in matrix computations
Gallopoulos, Efstratios; Sameh, Ahmed H
2016-01-01
This book is primarily intended as a research monograph that could also be used in graduate courses for the design of parallel algorithms in matrix computations. It assumes general but not extensive knowledge of numerical linear algebra, parallel architectures, and parallel programming paradigms. The book consists of four parts: (I) Basics; (II) Dense and Special Matrix Computations; (III) Sparse Matrix Computations; and (IV) Matrix functions and characteristics. Part I deals with parallel programming paradigms and fundamental kernels, including reordering schemes for sparse matrices. Part II is devoted to dense matrix computations such as parallel algorithms for solving linear systems, linear least squares, the symmetric algebraic eigenvalue problem, and the singular-value decomposition. It also deals with the development of parallel algorithms for special linear systems such as banded ,Vandermonde ,Toeplitz ,and block Toeplitz systems. Part III addresses sparse matrix computations: (a) the development of pa...
Sparse structure regularized ranking
Wang, Jim Jing-Yan; Sun, Yijun; Gao, Xin
2014-01-01
Learning ranking scores is critical for the multimedia database retrieval problem. In this paper, we propose a novel ranking score learning algorithm by exploring the sparse structure and using it to regularize ranking scores. To explore the sparse
Zhang, Tianzhu
2015-06-01
Sparse representation has been applied to visual tracking by finding the best target candidate with minimal reconstruction error by use of target templates. However, most sparse representation based trackers only consider holistic or local representations and do not make full use of the intrinsic structure among and inside target candidates, thereby making the representation less effective when similar objects appear or under occlusion. In this paper, we propose a novel Structural Sparse Tracking (SST) algorithm, which not only exploits the intrinsic relationship among target candidates and their local patches to learn their sparse representations jointly, but also preserves the spatial layout structure among the local patches inside each target candidate. We show that our SST algorithm accommodates most existing sparse trackers with the respective merits. Both qualitative and quantitative evaluations on challenging benchmark image sequences demonstrate that the proposed SST algorithm performs favorably against several state-of-the-art methods.
Analog system for computing sparse codes
Rozell, Christopher John; Johnson, Don Herrick; Baraniuk, Richard Gordon; Olshausen, Bruno A.; Ortman, Robert Lowell
2010-08-24
A parallel dynamical system for computing sparse representations of data, i.e., where the data can be fully represented in terms of a small number of non-zero code elements, and for reconstructing compressively sensed images. The system is based on the principles of thresholding and local competition that solves a family of sparse approximation problems corresponding to various sparsity metrics. The system utilizes Locally Competitive Algorithms (LCAs), nodes in a population continually compete with neighboring units using (usually one-way) lateral inhibition to calculate coefficients representing an input in an over complete dictionary.
Sparse structure regularized ranking
Wang, Jim Jing-Yan
2014-04-17
Learning ranking scores is critical for the multimedia database retrieval problem. In this paper, we propose a novel ranking score learning algorithm by exploring the sparse structure and using it to regularize ranking scores. To explore the sparse structure, we assume that each multimedia object could be represented as a sparse linear combination of all other objects, and combination coefficients are regarded as a similarity measure between objects and used to regularize their ranking scores. Moreover, we propose to learn the sparse combination coefficients and the ranking scores simultaneously. A unified objective function is constructed with regard to both the combination coefficients and the ranking scores, and is optimized by an iterative algorithm. Experiments on two multimedia database retrieval data sets demonstrate the significant improvements of the propose algorithm over state-of-the-art ranking score learning algorithms.
Zhang, Tianzhu; Yang, Ming-Hsuan; Ahuja, Narendra; Ghanem, Bernard; Yan, Shuicheng; Xu, Changsheng; Liu, Si
2015-01-01
candidate. We show that our SST algorithm accommodates most existing sparse trackers with the respective merits. Both qualitative and quantitative evaluations on challenging benchmark image sequences demonstrate that the proposed SST algorithm performs
SparseM: A Sparse Matrix Package for R *
Directory of Open Access Journals (Sweden)
Roger Koenker
2003-02-01
Full Text Available SparseM provides some basic R functionality for linear algebra with sparse matrices. Use of the package is illustrated by a family of linear model fitting functions that implement least squares methods for problems with sparse design matrices. Significant performance improvements in memory utilization and computational speed are possible for applications involving large sparse matrices.
Fast wavelet based sparse approximate inverse preconditioner
Energy Technology Data Exchange (ETDEWEB)
Wan, W.L. [Univ. of California, Los Angeles, CA (United States)
1996-12-31
Incomplete LU factorization is a robust preconditioner for both general and PDE problems but unfortunately not easy to parallelize. Recent study of Huckle and Grote and Chow and Saad showed that sparse approximate inverse could be a potential alternative while readily parallelizable. However, for special class of matrix A that comes from elliptic PDE problems, their preconditioners are not optimal in the sense that independent of mesh size. A reason may be that no good sparse approximate inverse exists for the dense inverse matrix. Our observation is that for this kind of matrices, its inverse entries typically have piecewise smooth changes. We can take advantage of this fact and use wavelet compression techniques to construct a better sparse approximate inverse preconditioner. We shall show numerically that our approach is effective for this kind of matrices.
Efficient convolutional sparse coding
Wohlberg, Brendt
2017-06-20
Computationally efficient algorithms may be applied for fast dictionary learning solving the convolutional sparse coding problem in the Fourier domain. More specifically, efficient convolutional sparse coding may be derived within an alternating direction method of multipliers (ADMM) framework that utilizes fast Fourier transforms (FFT) to solve the main linear system in the frequency domain. Such algorithms may enable a significant reduction in computational cost over conventional approaches by implementing a linear solver for the most critical and computationally expensive component of the conventional iterative algorithm. The theoretical computational cost of the algorithm may be reduced from O(M.sup.3N) to O(MN log N), where N is the dimensionality of the data and M is the number of elements in the dictionary. This significant improvement in efficiency may greatly increase the range of problems that can practically be addressed via convolutional sparse representations.
Sparse approximation with bases
2015-01-01
This book systematically presents recent fundamental results on greedy approximation with respect to bases. Motivated by numerous applications, the last decade has seen great successes in studying nonlinear sparse approximation. Recent findings have established that greedy-type algorithms are suitable methods of nonlinear approximation in both sparse approximation with respect to bases and sparse approximation with respect to redundant systems. These insights, combined with some previous fundamental results, form the basis for constructing the theory of greedy approximation. Taking into account the theoretical and practical demand for this kind of theory, the book systematically elaborates a theoretical framework for greedy approximation and its applications. The book addresses the needs of researchers working in numerical mathematics, harmonic analysis, and functional analysis. It quickly takes the reader from classical results to the latest frontier, but is written at the level of a graduate course and do...
Supervised Convolutional Sparse Coding
Affara, Lama Ahmed
2018-04-08
Convolutional Sparse Coding (CSC) is a well-established image representation model especially suited for image restoration tasks. In this work, we extend the applicability of this model by proposing a supervised approach to convolutional sparse coding, which aims at learning discriminative dictionaries instead of purely reconstructive ones. We incorporate a supervised regularization term into the traditional unsupervised CSC objective to encourage the final dictionary elements to be discriminative. Experimental results show that using supervised convolutional learning results in two key advantages. First, we learn more semantically relevant filters in the dictionary and second, we achieve improved image reconstruction on unseen data.
Supervised Transfer Sparse Coding
Al-Shedivat, Maruan
2014-07-27
A combination of the sparse coding and transfer learn- ing techniques was shown to be accurate and robust in classification tasks where training and testing objects have a shared feature space but are sampled from differ- ent underlying distributions, i.e., belong to different do- mains. The key assumption in such case is that in spite of the domain disparity, samples from different domains share some common hidden factors. Previous methods often assumed that all the objects in the target domain are unlabeled, and thus the training set solely comprised objects from the source domain. However, in real world applications, the target domain often has some labeled objects, or one can always manually label a small num- ber of them. In this paper, we explore such possibil- ity and show how a small number of labeled data in the target domain can significantly leverage classifica- tion accuracy of the state-of-the-art transfer sparse cod- ing methods. We further propose a unified framework named supervised transfer sparse coding (STSC) which simultaneously optimizes sparse representation, domain transfer and classification. Experimental results on three applications demonstrate that a little manual labeling and then learning the model in a supervised fashion can significantly improve classification accuracy.
Exarchakis, Georgios; Lücke, Jörg
2017-11-01
Sparse coding algorithms with continuous latent variables have been the subject of a large number of studies. However, discrete latent spaces for sparse coding have been largely ignored. In this work, we study sparse coding with latents described by discrete instead of continuous prior distributions. We consider the general case in which the latents (while being sparse) can take on any value of a finite set of possible values and in which we learn the prior probability of any value from data. This approach can be applied to any data generated by discrete causes, and it can be applied as an approximation of continuous causes. As the prior probabilities are learned, the approach then allows for estimating the prior shape without assuming specific functional forms. To efficiently train the parameters of our probabilistic generative model, we apply a truncated expectation-maximization approach (expectation truncation) that we modify to work with a general discrete prior. We evaluate the performance of the algorithm by applying it to a variety of tasks: (1) we use artificial data to verify that the algorithm can recover the generating parameters from a random initialization, (2) use image patches of natural images and discuss the role of the prior for the extraction of image components, (3) use extracellular recordings of neurons to present a novel method of analysis for spiking neurons that includes an intuitive discretization strategy, and (4) apply the algorithm on the task of encoding audio waveforms of human speech. The diverse set of numerical experiments presented in this letter suggests that discrete sparse coding algorithms can scale efficiently to work with realistic data sets and provide novel statistical quantities to describe the structure of the data.
Multi-threaded Sparse Matrix Sparse Matrix Multiplication for Many-Core and GPU Architectures.
Energy Technology Data Exchange (ETDEWEB)
Deveci, Mehmet [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Trott, Christian Robert [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Rajamanickam, Sivasankaran [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)
2018-01-01
Sparse Matrix-Matrix multiplication is a key kernel that has applications in several domains such as scientific computing and graph analysis. Several algorithms have been studied in the past for this foundational kernel. In this paper, we develop parallel algorithms for sparse matrix- matrix multiplication with a focus on performance portability across different high performance computing architectures. The performance of these algorithms depend on the data structures used in them. We compare different types of accumulators in these algorithms and demonstrate the performance difference between these data structures. Furthermore, we develop a meta-algorithm, kkSpGEMM, to choose the right algorithm and data structure based on the characteristics of the problem. We show performance comparisons on three architectures and demonstrate the need for the community to develop two phase sparse matrix-matrix multiplication implementations for efficient reuse of the data structures involved.
Totally parallel multilevel algorithms
Frederickson, Paul O.
1988-01-01
Four totally parallel algorithms for the solution of a sparse linear system have common characteristics which become quite apparent when they are implemented on a highly parallel hypercube such as the CM2. These four algorithms are Parallel Superconvergent Multigrid (PSMG) of Frederickson and McBryan, Robust Multigrid (RMG) of Hackbusch, the FFT based Spectral Algorithm, and Parallel Cyclic Reduction. In fact, all four can be formulated as particular cases of the same totally parallel multilevel algorithm, which are referred to as TPMA. In certain cases the spectral radius of TPMA is zero, and it is recognized to be a direct algorithm. In many other cases the spectral radius, although not zero, is small enough that a single iteration per timestep keeps the local error within the required tolerance.
A sparse-grid isogeometric solver
Beck, Joakim
2018-02-28
Isogeometric Analysis (IGA) typically adopts tensor-product splines and NURBS as a basis for the approximation of the solution of PDEs. In this work, we investigate to which extent IGA solvers can benefit from the so-called sparse-grids construction in its combination technique form, which was first introduced in the early 90’s in the context of the approximation of high-dimensional PDEs.The tests that we report show that, in accordance to the literature, a sparse-grid construction can indeed be useful if the solution of the PDE at hand is sufficiently smooth. Sparse grids can also be useful in the case of non-smooth solutions when some a-priori knowledge on the location of the singularities of the solution can be exploited to devise suitable non-equispaced meshes. Finally, we remark that sparse grids can be seen as a simple way to parallelize pre-existing serial IGA solvers in a straightforward fashion, which can be beneficial in many practical situations.
A sparse version of IGA solvers
Beck, Joakim
2017-07-30
Isogeometric Analysis (IGA) typically adopts tensor-product splines and NURBS as a basis for the approximation of the solution of PDEs. In this work, we investigate to which extent IGA solvers can benefit from the so-called sparse-grids construction in its combination technique form, which was first introduced in the early 90s in the context of the approximation of high-dimensional PDEs. The tests that we report show that, in accordance to the literature, a sparse grids construction can indeed be useful if the solution of the PDE at hand is sufficiently smooth. Sparse grids can also be useful in the case of non-smooth solutions when some a-priori knowledge on the location of the singularities of the solution can be exploited to devise suitable non-equispaced meshes. Finally, we remark that sparse grids can be seen as a simple way to parallelize pre-existing serial IGA solvers in a straightforward fashion, which can be beneficial in many practical situations.
A sparse-grid isogeometric solver
Beck, Joakim; Sangalli, Giancarlo; Tamellini, Lorenzo
2018-01-01
Isogeometric Analysis (IGA) typically adopts tensor-product splines and NURBS as a basis for the approximation of the solution of PDEs. In this work, we investigate to which extent IGA solvers can benefit from the so-called sparse-grids construction in its combination technique form, which was first introduced in the early 90’s in the context of the approximation of high-dimensional PDEs.The tests that we report show that, in accordance to the literature, a sparse-grid construction can indeed be useful if the solution of the PDE at hand is sufficiently smooth. Sparse grids can also be useful in the case of non-smooth solutions when some a-priori knowledge on the location of the singularities of the solution can be exploited to devise suitable non-equispaced meshes. Finally, we remark that sparse grids can be seen as a simple way to parallelize pre-existing serial IGA solvers in a straightforward fashion, which can be beneficial in many practical situations.
A sparse version of IGA solvers
Beck, Joakim; Sangalli, Giancarlo; Tamellini, Lorenzo
2017-01-01
Isogeometric Analysis (IGA) typically adopts tensor-product splines and NURBS as a basis for the approximation of the solution of PDEs. In this work, we investigate to which extent IGA solvers can benefit from the so-called sparse-grids construction in its combination technique form, which was first introduced in the early 90s in the context of the approximation of high-dimensional PDEs. The tests that we report show that, in accordance to the literature, a sparse grids construction can indeed be useful if the solution of the PDE at hand is sufficiently smooth. Sparse grids can also be useful in the case of non-smooth solutions when some a-priori knowledge on the location of the singularities of the solution can be exploited to devise suitable non-equispaced meshes. Finally, we remark that sparse grids can be seen as a simple way to parallelize pre-existing serial IGA solvers in a straightforward fashion, which can be beneficial in many practical situations.
Sparse inpainting and isotropy
Energy Technology Data Exchange (ETDEWEB)
Feeney, Stephen M.; McEwen, Jason D.; Peiris, Hiranya V. [Department of Physics and Astronomy, University College London, Gower Street, London, WC1E 6BT (United Kingdom); Marinucci, Domenico; Cammarota, Valentina [Department of Mathematics, University of Rome Tor Vergata, via della Ricerca Scientifica 1, Roma, 00133 (Italy); Wandelt, Benjamin D., E-mail: s.feeney@imperial.ac.uk, E-mail: marinucc@axp.mat.uniroma2.it, E-mail: jason.mcewen@ucl.ac.uk, E-mail: h.peiris@ucl.ac.uk, E-mail: wandelt@iap.fr, E-mail: cammarot@axp.mat.uniroma2.it [Kavli Institute for Theoretical Physics, Kohn Hall, University of California, 552 University Road, Santa Barbara, CA, 93106 (United States)
2014-01-01
Sparse inpainting techniques are gaining in popularity as a tool for cosmological data analysis, in particular for handling data which present masked regions and missing observations. We investigate here the relationship between sparse inpainting techniques using the spherical harmonic basis as a dictionary and the isotropy properties of cosmological maps, as for instance those arising from cosmic microwave background (CMB) experiments. In particular, we investigate the possibility that inpainted maps may exhibit anisotropies in the behaviour of higher-order angular polyspectra. We provide analytic computations and simulations of inpainted maps for a Gaussian isotropic model of CMB data, suggesting that the resulting angular trispectrum may exhibit small but non-negligible deviations from isotropy.
Sparse matrix test collections
Energy Technology Data Exchange (ETDEWEB)
Duff, I.
1996-12-31
This workshop will discuss plans for coordinating and developing sets of test matrices for the comparison and testing of sparse linear algebra software. We will talk of plans for the next release (Release 2) of the Harwell-Boeing Collection and recent work on improving the accessibility of this Collection and others through the World Wide Web. There will only be three talks of about 15 to 20 minutes followed by a discussion from the floor.
Compressed sensing & sparse filtering
Carmi, Avishy Y; Godsill, Simon J
2013-01-01
This book is aimed at presenting concepts, methods and algorithms ableto cope with undersampled and limited data. One such trend that recently gained popularity and to some extent revolutionised signal processing is compressed sensing. Compressed sensing builds upon the observation that many signals in nature are nearly sparse (or compressible, as they are normally referred to) in some domain, and consequently they can be reconstructed to within high accuracy from far fewer observations than traditionally held to be necessary.Â Apart from compressed sensing this book contains other related app
Wang, Jim Jing-Yan; Gao, Xin
2014-01-01
Sparse coding approximates the data sample as a sparse linear combination of some basic codewords and uses the sparse codes as new presentations. In this paper, we investigate learning discriminative sparse codes by sparse coding in a semi-supervised manner, where only a few training samples are labeled. By using the manifold structure spanned by the data set of both labeled and unlabeled samples and the constraints provided by the labels of the labeled samples, we learn the variable class labels for all the samples. Furthermore, to improve the discriminative ability of the learned sparse codes, we assume that the class labels could be predicted from the sparse codes directly using a linear classifier. By solving the codebook, sparse codes, class labels and classifier parameters simultaneously in a unified objective function, we develop a semi-supervised sparse coding algorithm. Experiments on two real-world pattern recognition problems demonstrate the advantage of the proposed methods over supervised sparse coding methods on partially labeled data sets.
Wang, Jim Jing-Yan
2014-07-06
Sparse coding approximates the data sample as a sparse linear combination of some basic codewords and uses the sparse codes as new presentations. In this paper, we investigate learning discriminative sparse codes by sparse coding in a semi-supervised manner, where only a few training samples are labeled. By using the manifold structure spanned by the data set of both labeled and unlabeled samples and the constraints provided by the labels of the labeled samples, we learn the variable class labels for all the samples. Furthermore, to improve the discriminative ability of the learned sparse codes, we assume that the class labels could be predicted from the sparse codes directly using a linear classifier. By solving the codebook, sparse codes, class labels and classifier parameters simultaneously in a unified objective function, we develop a semi-supervised sparse coding algorithm. Experiments on two real-world pattern recognition problems demonstrate the advantage of the proposed methods over supervised sparse coding methods on partially labeled data sets.
A framework for general sparse matrix-matrix multiplication on GPUs and heterogeneous processors
DEFF Research Database (Denmark)
Liu, Weifeng; Vinter, Brian
2015-01-01
General sparse matrix-matrix multiplication (SpGEMM) is a fundamental building block for numerous applications such as algebraic multigrid method (AMG), breadth first search and shortest path problem. Compared to other sparse BLAS routines, an efficient parallel SpGEMM implementation has to handle...... extra irregularity from three aspects: (1) the number of nonzero entries in the resulting sparse matrix is unknown in advance, (2) very expensive parallel insert operations at random positions in the resulting sparse matrix dominate the execution time, and (3) load balancing must account for sparse data...... memory space and efficiently utilizes the very limited on-chip scratchpad memory. Parallel insert operations of the nonzero entries are implemented through the GPU merge path algorithm that is experimentally found to be the fastest GPU merge approach. Load balancing builds on the number of necessary...
Stiffness Analysis and Comparison of 3-PPR Planar Parallel Manipulators with Actuation Compliance
DEFF Research Database (Denmark)
Wu, Guanglei; Bai, Shaoping; Kepler, Jørgen Asbøl
2012-01-01
In this paper, the stiffness of 3-PPR planar parallel manipulator (PPM) is analyzed with the consideration of nonlinear actuation compliance. The characteristics of the stiffness matrix pertaining to the planar parallel manipulators are analyzed and discussed. Graphic representation of the stiffn...... of the stiffness characteristics by means of translational and rotational stiffness mapping is developed. The developed method is illustrated with an unsymmetrical 3-PPR PPM, being compared with its structure-symmetrical counterpart....
Denning, Peter J.
1989-01-01
Sparse distributed memory was proposed be Pentti Kanerva as a realizable architecture that could store large patterns and retrieve them based on partial matches with patterns representing current sensory inputs. This memory exhibits behaviors, both in theory and in experiment, that resemble those previously unapproached by machines - e.g., rapid recognition of faces or odors, discovery of new connections between seemingly unrelated ideas, continuation of a sequence of events when given a cue from the middle, knowing that one doesn't know, or getting stuck with an answer on the tip of one's tongue. These behaviors are now within reach of machines that can be incorporated into the computing systems of robots capable of seeing, talking, and manipulating. Kanerva's theory is a break with the Western rationalistic tradition, allowing a new interpretation of learning and cognition that respects biology and the mysteries of individual human beings.
Sparse decompositions in 'incoherent' dictionaries
DEFF Research Database (Denmark)
Gribonval, R.; Nielsen, Morten
2003-01-01
a unique sparse representation in such a dictionary. In particular, it is proved that the result of Donoho and Huo, concerning the replacement of a combinatorial optimization problem with a linear programming problem when searching for sparse representations, has an analog for dictionaries that may...
Consensus Convolutional Sparse Coding
Choudhury, Biswarup
2017-12-01
Convolutional sparse coding (CSC) is a promising direction for unsupervised learning in computer vision. In contrast to recent supervised methods, CSC allows for convolutional image representations to be learned that are equally useful for high-level vision tasks and low-level image reconstruction and can be applied to a wide range of tasks without problem-specific retraining. Due to their extreme memory requirements, however, existing CSC solvers have so far been limited to low-dimensional problems and datasets using a handful of low-resolution example images at a time. In this paper, we propose a new approach to solving CSC as a consensus optimization problem, which lifts these limitations. By learning CSC features from large-scale image datasets for the first time, we achieve significant quality improvements in a number of imaging tasks. Moreover, the proposed method enables new applications in high-dimensional feature learning that has been intractable using existing CSC methods. This is demonstrated for a variety of reconstruction problems across diverse problem domains, including 3D multispectral demosaicing and 4D light field view synthesis.
Consensus Convolutional Sparse Coding
Choudhury, Biswarup
2017-04-11
Convolutional sparse coding (CSC) is a promising direction for unsupervised learning in computer vision. In contrast to recent supervised methods, CSC allows for convolutional image representations to be learned that are equally useful for high-level vision tasks and low-level image reconstruction and can be applied to a wide range of tasks without problem-specific retraining. Due to their extreme memory requirements, however, existing CSC solvers have so far been limited to low-dimensional problems and datasets using a handful of low-resolution example images at a time. In this paper, we propose a new approach to solving CSC as a consensus optimization problem, which lifts these limitations. By learning CSC features from large-scale image datasets for the first time, we achieve significant quality improvements in a number of imaging tasks. Moreover, the proposed method enables new applications in high dimensional feature learning that has been intractable using existing CSC methods. This is demonstrated for a variety of reconstruction problems across diverse problem domains, including 3D multispectral demosaickingand 4D light field view synthesis.
Consensus Convolutional Sparse Coding
Choudhury, Biswarup; Swanson, Robin; Heide, Felix; Wetzstein, Gordon; Heidrich, Wolfgang
2017-01-01
Convolutional sparse coding (CSC) is a promising direction for unsupervised learning in computer vision. In contrast to recent supervised methods, CSC allows for convolutional image representations to be learned that are equally useful for high-level vision tasks and low-level image reconstruction and can be applied to a wide range of tasks without problem-specific retraining. Due to their extreme memory requirements, however, existing CSC solvers have so far been limited to low-dimensional problems and datasets using a handful of low-resolution example images at a time. In this paper, we propose a new approach to solving CSC as a consensus optimization problem, which lifts these limitations. By learning CSC features from large-scale image datasets for the first time, we achieve significant quality improvements in a number of imaging tasks. Moreover, the proposed method enables new applications in high-dimensional feature learning that has been intractable using existing CSC methods. This is demonstrated for a variety of reconstruction problems across diverse problem domains, including 3D multispectral demosaicing and 4D light field view synthesis.
Turbulent flows over sparse canopies
Sharma, Akshath; García-Mayoral, Ricardo
2018-04-01
Turbulent flows over sparse and dense canopies exerting a similar drag force on the flow are investigated using Direct Numerical Simulations. The dense canopies are modelled using a homogeneous drag force, while for the sparse canopy, the geometry of the canopy elements is represented. It is found that on using the friction velocity based on the local shear at each height, the streamwise velocity fluctuations and the Reynolds stress within the sparse canopy are similar to those from a comparable smooth-wall case. In addition, when scaled with the local friction velocity, the intensity of the off-wall peak in the streamwise vorticity for sparse canopies also recovers a value similar to a smooth-wall. This indicates that the sparse canopy does not significantly disturb the near-wall turbulence cycle, but causes its rescaling to an intensity consistent with a lower friction velocity within the canopy. In comparison, the dense canopy is found to have a higher damping effect on the turbulent fluctuations. For the case of the sparse canopy, a peak in the spectral energy density of the wall-normal velocity, and Reynolds stress is observed, which may indicate the formation of Kelvin-Helmholtz-like instabilities. It is also found that a sparse canopy is better modelled by a homogeneous drag applied on the mean flow alone, and not the turbulent fluctuations.
Sparse Regression by Projection and Sparse Discriminant Analysis
Qi, Xin; Luo, Ruiyan; Carroll, Raymond J.; Zhao, Hongyu
2015-01-01
predictions. We introduce a new framework, regression by projection, and its sparse version to analyze high-dimensional data. The unique nature of this framework is that the directions of the regression coefficients are inferred first, and the lengths
In Defense of Sparse Tracking: Circulant Sparse Tracker
Zhang, Tianzhu; Bibi, Adel Aamer; Ghanem, Bernard
2016-01-01
Sparse representation has been introduced to visual tracking by finding the best target candidate with minimal reconstruction error within the particle filter framework. However, most sparse representation based trackers have high computational cost, less than promising tracking performance, and limited feature representation. To deal with the above issues, we propose a novel circulant sparse tracker (CST), which exploits circulant target templates. Because of the circulant structure property, CST has the following advantages: (1) It can refine and reduce particles using circular shifts of target templates. (2) The optimization can be efficiently solved entirely in the Fourier domain. (3) High dimensional features can be embedded into CST to significantly improve tracking performance without sacrificing much computation time. Both qualitative and quantitative evaluations on challenging benchmark sequences demonstrate that CST performs better than all other sparse trackers and favorably against state-of-the-art methods.
In Defense of Sparse Tracking: Circulant Sparse Tracker
Zhang, Tianzhu
2016-12-13
Sparse representation has been introduced to visual tracking by finding the best target candidate with minimal reconstruction error within the particle filter framework. However, most sparse representation based trackers have high computational cost, less than promising tracking performance, and limited feature representation. To deal with the above issues, we propose a novel circulant sparse tracker (CST), which exploits circulant target templates. Because of the circulant structure property, CST has the following advantages: (1) It can refine and reduce particles using circular shifts of target templates. (2) The optimization can be efficiently solved entirely in the Fourier domain. (3) High dimensional features can be embedded into CST to significantly improve tracking performance without sacrificing much computation time. Both qualitative and quantitative evaluations on challenging benchmark sequences demonstrate that CST performs better than all other sparse trackers and favorably against state-of-the-art methods.
A performance study of sparse Cholesky factorization on INTEL iPSC/860
Zubair, M.; Ghose, M.
1992-01-01
The problem of Cholesky factorization of a sparse matrix has been very well investigated on sequential machines. A number of efficient codes exist for factorizing large unstructured sparse matrices. However, there is a lack of such efficient codes on parallel machines in general, and distributed machines in particular. Some of the issues that are critical to the implementation of sparse Cholesky factorization on a distributed memory parallel machine are ordering, partitioning and mapping, load balancing, and ordering of various tasks within a processor. Here, we focus on the effect of various partitioning schemes on the performance of sparse Cholesky factorization on the Intel iPSC/860. Also, a new partitioning heuristic for structured as well as unstructured sparse matrices is proposed, and its performance is compared with other schemes.
An Efficient GPU General Sparse Matrix-Matrix Multiplication for Irregular Data
DEFF Research Database (Denmark)
Liu, Weifeng; Vinter, Brian
2014-01-01
General sparse matrix-matrix multiplication (SpGEMM) is a fundamental building block for numerous applications such as algebraic multigrid method, breadth first search and shortest path problem. Compared to other sparse BLAS routines, an efficient parallel SpGEMM algorithm has to handle extra...... irregularity from three aspects: (1) the number of the nonzero entries in the result sparse matrix is unknown in advance, (2) very expensive parallel insert operations at random positions in the result sparse matrix dominate the execution time, and (3) load balancing must account for sparse data in both input....... Load balancing builds on the number of the necessary arithmetic operations on the nonzero entries and is guaranteed in all stages. Compared with the state-of-the-art GPU SpGEMM methods in the CUSPARSE library and the CUSP library and the latest CPU SpGEMM method in the Intel Math Kernel Library, our...
Efficient implementations of block sparse matrix operations on shared memory vector machines
International Nuclear Information System (INIS)
Washio, T.; Maruyama, K.; Osoda, T.; Doi, S.; Shimizu, F.
2000-01-01
In this paper, we propose vectorization and shared memory-parallelization techniques for block-type random sparse matrix operations in finite element (FEM) applications. Here, a block corresponds to unknowns on one node in the FEM mesh and we assume that the block size is constant over the mesh. First, we discuss some basic vectorization ideas (the jagged diagonal (JAD) format and the segmented scan algorithm) for the sparse matrix-vector product. Then, we extend these ideas to the shared memory parallelization. After that, we show that the techniques can be applied not only to the sparse matrix-vector product but also to the sparse matrix-matrix product, the incomplete or complete sparse LU factorization and preconditioning. Finally, we report the performance evaluation results obtained on an NEC SX-4 shared memory vector machine for linear systems in some FEM applications. (author)
Evaluating Sparse Linear System Solvers on Scalable Parallel Architectures
National Research Council Canada - National Science Library
Grama, Ananth; Manguoglu, Murat; Koyuturk, Mehmet; Naumov, Maxim; Sameh, Ahmed
2008-01-01
.... The study was motivated primarily by the lack of robustness of Krylov subspace iterative schemes with generic, black-box, pre-conditioners such as approximate (or incomplete) LU-factorizations...
A Computing Platform for Parallel Sparse Matrix Computations
2016-01-05
REPORT NUMBER 19a. NAME OF RESPONSIBLE PERSON 19b. TELEPHONE NUMBER Ahmed Sameh Ahmed H. Sameh, Alicia Klinvex, Yao Zhu 611103 c. THIS PAGE The...PERCENT_SUPPORTEDNAME FTE Equivalent: Total Number: Discipline Yao Zhu 0.50 Alicia Klinvex 0.10 0.60 2 Names of Post Doctorates Names of Faculty Supported...PERCENT_SUPPORTEDNAME FTE Equivalent: Total Number: NAME Total Number: NAME Total Number: Yao Zhu Alicia Klinvex 2 ...... ...... Sub Contractors (DD882) Names of other
Language Recognition via Sparse Coding
2016-09-08
explanation is that sparse coding can achieve a near-optimal approximation of much complicated nonlinear relationship through local and piecewise linear...training examples, where x(i) ∈ RN is the ith example in the batch. Optionally, X can be normalized and whitened before sparse coding for better result...normalized input vectors are then ZCA- whitened [20]. Em- pirically, we choose ZCA- whitening over PCA- whitening , and there is no dimensionality reduction
Automatic Management of Parallel and Distributed System Resources
Yan, Jerry; Ngai, Tin Fook; Lundstrom, Stephen F.
1990-01-01
Viewgraphs on automatic management of parallel and distributed system resources are presented. Topics covered include: parallel applications; intelligent management of multiprocessing systems; performance evaluation of parallel architecture; dynamic concurrent programs; compiler-directed system approach; lattice gaseous cellular automata; and sparse matrix Cholesky factorization.
Shearlets and Optimally Sparse Approximations
DEFF Research Database (Denmark)
Kutyniok, Gitta; Lemvig, Jakob; Lim, Wang-Q
2012-01-01
Multivariate functions are typically governed by anisotropic features such as edges in images or shock fronts in solutions of transport-dominated equations. One major goal both for the purpose of compression as well as for an efficient analysis is the provision of optimally sparse approximations...... optimally sparse approximations of this model class in 2D as well as 3D. Even more, in contrast to all other directional representation systems, a theory for compactly supported shearlet frames was derived which moreover also satisfy this optimality benchmark. This chapter shall serve as an introduction...... to and a survey about sparse approximations of cartoon-like images by band-limited and also compactly supported shearlet frames as well as a reference for the state-of-the-art of this research field....
Sparse Representations of Hyperspectral Images
Swanson, Robin J.
2015-01-01
Hyperspectral image data has long been an important tool for many areas of sci- ence. The addition of spectral data yields significant improvements in areas such as object and image classification, chemical and mineral composition detection, and astronomy. Traditional capture methods for hyperspectral data often require each wavelength to be captured individually, or by sacrificing spatial resolution. Recently there have been significant improvements in snapshot hyperspectral captures using, in particular, compressed sensing methods. As we move to a compressed sensing image formation model the need for strong image priors to shape our reconstruction, as well as sparse basis become more important. Here we compare several several methods for representing hyperspectral images including learned three dimensional dictionaries, sparse convolutional coding, and decomposable nonlocal tensor dictionaries. Addi- tionally, we further explore their parameter space to identify which parameters provide the most faithful and sparse representations.
Sparse Representations of Hyperspectral Images
Swanson, Robin J.
2015-11-23
Hyperspectral image data has long been an important tool for many areas of sci- ence. The addition of spectral data yields significant improvements in areas such as object and image classification, chemical and mineral composition detection, and astronomy. Traditional capture methods for hyperspectral data often require each wavelength to be captured individually, or by sacrificing spatial resolution. Recently there have been significant improvements in snapshot hyperspectral captures using, in particular, compressed sensing methods. As we move to a compressed sensing image formation model the need for strong image priors to shape our reconstruction, as well as sparse basis become more important. Here we compare several several methods for representing hyperspectral images including learned three dimensional dictionaries, sparse convolutional coding, and decomposable nonlocal tensor dictionaries. Addi- tionally, we further explore their parameter space to identify which parameters provide the most faithful and sparse representations.
Image understanding using sparse representations
Thiagarajan, Jayaraman J; Turaga, Pavan; Spanias, Andreas
2014-01-01
Image understanding has been playing an increasingly crucial role in several inverse problems and computer vision. Sparse models form an important component in image understanding, since they emulate the activity of neural receptors in the primary visual cortex of the human brain. Sparse methods have been utilized in several learning problems because of their ability to provide parsimonious, interpretable, and efficient models. Exploiting the sparsity of natural signals has led to advances in several application areas including image compression, denoising, inpainting, compressed sensing, blin
Sparse PCA with Oracle Property.
Gu, Quanquan; Wang, Zhaoran; Liu, Han
In this paper, we study the estimation of the k -dimensional sparse principal subspace of covariance matrix Σ in the high-dimensional setting. We aim to recover the oracle principal subspace solution, i.e., the principal subspace estimator obtained assuming the true support is known a priori. To this end, we propose a family of estimators based on the semidefinite relaxation of sparse PCA with novel regularizations. In particular, under a weak assumption on the magnitude of the population projection matrix, one estimator within this family exactly recovers the true support with high probability, has exact rank- k , and attains a [Formula: see text] statistical rate of convergence with s being the subspace sparsity level and n the sample size. Compared to existing support recovery results for sparse PCA, our approach does not hinge on the spiked covariance model or the limited correlation condition. As a complement to the first estimator that enjoys the oracle property, we prove that, another estimator within the family achieves a sharper statistical rate of convergence than the standard semidefinite relaxation of sparse PCA, even when the previous assumption on the magnitude of the projection matrix is violated. We validate the theoretical results by numerical experiments on synthetic datasets.
Eighth SIAM conference on parallel processing for scientific computing: Final program and abstracts
Energy Technology Data Exchange (ETDEWEB)
NONE
1997-12-31
This SIAM conference is the premier forum for developments in parallel numerical algorithms, a field that has seen very lively and fruitful developments over the past decade, and whose health is still robust. Themes for this conference were: combinatorial optimization; data-parallel languages; large-scale parallel applications; message-passing; molecular modeling; parallel I/O; parallel libraries; parallel software tools; parallel compilers; particle simulations; problem-solving environments; and sparse matrix computations.
Crockett, Thomas W.
1995-01-01
This article provides a broad introduction to the subject of parallel rendering, encompassing both hardware and software systems. The focus is on the underlying concepts and the issues which arise in the design of parallel rendering algorithms and systems. We examine the different types of parallelism and how they can be applied in rendering applications. Concepts from parallel computing, such as data decomposition, task granularity, scalability, and load balancing, are considered in relation to the rendering problem. We also explore concepts from computer graphics, such as coherence and projection, which have a significant impact on the structure of parallel rendering algorithms. Our survey covers a number of practical considerations as well, including the choice of architectural platform, communication and memory requirements, and the problem of image assembly and display. We illustrate the discussion with numerous examples from the parallel rendering literature, representing most of the principal rendering methods currently used in computer graphics.
Sparse Regression by Projection and Sparse Discriminant Analysis
Qi, Xin
2015-04-03
© 2015, © American Statistical Association, Institute of Mathematical Statistics, and Interface Foundation of North America. Recent years have seen active developments of various penalized regression methods, such as LASSO and elastic net, to analyze high-dimensional data. In these approaches, the direction and length of the regression coefficients are determined simultaneously. Due to the introduction of penalties, the length of the estimates can be far from being optimal for accurate predictions. We introduce a new framework, regression by projection, and its sparse version to analyze high-dimensional data. The unique nature of this framework is that the directions of the regression coefficients are inferred first, and the lengths and the tuning parameters are determined by a cross-validation procedure to achieve the largest prediction accuracy. We provide a theoretical result for simultaneous model selection consistency and parameter estimation consistency of our method in high dimension. This new framework is then generalized such that it can be applied to principal components analysis, partial least squares, and canonical correlation analysis. We also adapt this framework for discriminant analysis. Compared with the existing methods, where there is relatively little control of the dependency among the sparse components, our method can control the relationships among the components. We present efficient algorithms and related theory for solving the sparse regression by projection problem. Based on extensive simulations and real data analysis, we demonstrate that our method achieves good predictive performance and variable selection in the regression setting, and the ability to control relationships between the sparse components leads to more accurate classification. In supplementary materials available online, the details of the algorithms and theoretical proofs, and R codes for all simulation studies are provided.
1982-01-01
Parallel Computations focuses on parallel computation, with emphasis on algorithms used in a variety of numerical and physical applications and for many different types of parallel computers. Topics covered range from vectorization of fast Fourier transforms (FFTs) and of the incomplete Cholesky conjugate gradient (ICCG) algorithm on the Cray-1 to calculation of table lookups and piecewise functions. Single tridiagonal linear systems and vectorized computation of reactive flow are also discussed.Comprised of 13 chapters, this volume begins by classifying parallel computers and describing techn
SuperLU{_}DIST: A scalable distributed-memory sparse direct solver for unsymmetric linear systems
Energy Technology Data Exchange (ETDEWEB)
Li, Xiaoye S.; Demmel, James W.
2002-03-27
In this paper, we present the main algorithmic features in the software package SuperLU{_}DIST, a distributed-memory sparse direct solver for large sets of linear equations. We give in detail our parallelization strategies, with focus on scalability issues, and demonstrate the parallel performance and scalability on current machines. The solver is based on sparse Gaussian elimination, with an innovative static pivoting strategy proposed earlier by the authors. The main advantage of static pivoting over classical partial pivoting is that it permits a priori determination of data structures and communication pattern for sparse Gaussian elimination, which makes it more scalable on distributed memory machines. Based on this a priori knowledge, we designed highly parallel and scalable algorithms for both LU decomposition and triangular solve and we show that they are suitable for large-scale distributed memory machines.
Casanova, Henri; Robert, Yves
2008-01-01
""…The authors of the present book, who have extensive credentials in both research and instruction in the area of parallelism, present a sound, principled treatment of parallel algorithms. … This book is very well written and extremely well designed from an instructional point of view. … The authors have created an instructive and fascinating text. The book will serve researchers as well as instructors who need a solid, readable text for a course on parallelism in computing. Indeed, for anyone who wants an understandable text from which to acquire a current, rigorous, and broad vi
Sparse Matrices in Frame Theory
DEFF Research Database (Denmark)
Lemvig, Jakob; Krahmer, Felix; Kutyniok, Gitta
2014-01-01
Frame theory is closely intertwined with signal processing through a canon of methodologies for the analysis of signals using (redundant) linear measurements. The canonical dual frame associated with a frame provides a means for reconstruction by a least squares approach, but other dual frames...... yield alternative reconstruction procedures. The novel paradigm of sparsity has recently entered the area of frame theory in various ways. Of those different sparsity perspectives, we will focus on the situations where frames and (not necessarily canonical) dual frames can be written as sparse matrices...
Diffusion Indexes with Sparse Loadings
DEFF Research Database (Denmark)
Kristensen, Johannes Tang
The use of large-dimensional factor models in forecasting has received much attention in the literature with the consensus being that improvements on forecasts can be achieved when comparing with standard models. However, recent contributions in the literature have demonstrated that care needs...... to the problem by using the LASSO as a variable selection method to choose between the possible variables and thus obtain sparse loadings from which factors or diffusion indexes can be formed. This allows us to build a more parsimonious factor model which is better suited for forecasting compared...... it to be an important alternative to PC....
Sparse Linear Identifiable Multivariate Modeling
DEFF Research Database (Denmark)
Henao, Ricardo; Winther, Ole
2011-01-01
and bench-marked on artificial and real biological data sets. SLIM is closest in spirit to LiNGAM (Shimizu et al., 2006), but differs substantially in inference, Bayesian network structure learning and model comparison. Experimentally, SLIM performs equally well or better than LiNGAM with comparable......In this paper we consider sparse and identifiable linear latent variable (factor) and linear Bayesian network models for parsimonious analysis of multivariate data. We propose a computationally efficient method for joint parameter and model inference, and model comparison. It consists of a fully...
Programming for Sparse Minimax Optimization
DEFF Research Database (Denmark)
Jonasson, K.; Madsen, Kaj
1994-01-01
We present an algorithm for nonlinear minimax optimization which is well suited for large and sparse problems. The method is based on trust regions and sequential linear programming. On each iteration, a linear minimax problem is solved for a basic step. If necessary, this is followed...... by the determination of a minimum norm corrective step based on a first-order Taylor approximation. No Hessian information needs to be stored. Global convergence is proved. This new method has been extensively tested and compared with other methods, including two well known codes for nonlinear programming...
Dynamic Representations of Sparse Graphs
DEFF Research Database (Denmark)
Brodal, Gerth Stølting; Fagerberg, Rolf
1999-01-01
We present a linear space data structure for maintaining graphs with bounded arboricity—a large class of sparse graphs containing e.g. planar graphs and graphs of bounded treewidth—under edge insertions, edge deletions, and adjacency queries. The data structure supports adjacency queries in worst...... case O(c) time, and edge insertions and edge deletions in amortized O(1) and O(c+log n) time, respectively, where n is the number of nodes in the graph, and c is the bound on the arboricity....
A distributed-memory hierarchical solver for general sparse linear systems
Energy Technology Data Exchange (ETDEWEB)
Chen, Chao [Stanford Univ., CA (United States). Inst. for Computational and Mathematical Engineering; Pouransari, Hadi [Stanford Univ., CA (United States). Dept. of Mechanical Engineering; Rajamanickam, Sivasankaran [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States). Center for Computing Research; Boman, Erik G. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States). Center for Computing Research; Darve, Eric [Stanford Univ., CA (United States). Inst. for Computational and Mathematical Engineering and Dept. of Mechanical Engineering
2017-12-20
We present a parallel hierarchical solver for general sparse linear systems on distributed-memory machines. For large-scale problems, this fully algebraic algorithm is faster and more memory-efficient than sparse direct solvers because it exploits the low-rank structure of fill-in blocks. Depending on the accuracy of low-rank approximations, the hierarchical solver can be used either as a direct solver or as a preconditioner. The parallel algorithm is based on data decomposition and requires only local communication for updating boundary data on every processor. Moreover, the computation-to-communication ratio of the parallel algorithm is approximately the volume-to-surface-area ratio of the subdomain owned by every processor. We also provide various numerical results to demonstrate the versatility and scalability of the parallel algorithm.
Iterative solution of general sparse linear systems on clusters of workstations
Energy Technology Data Exchange (ETDEWEB)
Lo, Gen-Ching; Saad, Y. [Univ. of Minnesota, Minneapolis, MN (United States)
1996-12-31
Solving sparse irregularly structured linear systems on parallel platforms poses several challenges. First, sparsity makes it difficult to exploit data locality, whether in a distributed or shared memory environment. A second, perhaps more serious challenge, is to find efficient ways to precondition the system. Preconditioning techniques which have a large degree of parallelism, such as multicolor SSOR, often have a slower rate of convergence than their sequential counterparts. Finally, a number of other computational kernels such as inner products could ruin any gains gained from parallel speed-ups, and this is especially true on workstation clusters where start-up times may be high. In this paper we discuss these issues and report on our experience with PSPARSLIB, an on-going project for building a library of parallel iterative sparse matrix solvers.
Bayesian Inference Methods for Sparse Channel Estimation
DEFF Research Database (Denmark)
Pedersen, Niels Lovmand
2013-01-01
This thesis deals with sparse Bayesian learning (SBL) with application to radio channel estimation. As opposed to the classical approach for sparse signal representation, we focus on the problem of inferring complex signals. Our investigations within SBL constitute the basis for the development...... of Bayesian inference algorithms for sparse channel estimation. Sparse inference methods aim at finding the sparse representation of a signal given in some overcomplete dictionary of basis vectors. Within this context, one of our main contributions to the field of SBL is a hierarchical representation...... analysis of the complex prior representation, where we show that the ability to induce sparse estimates of a given prior heavily depends on the inference method used and, interestingly, whether real or complex variables are inferred. We also show that the Bayesian estimators derived from the proposed...
Image fusion using sparse overcomplete feature dictionaries
Brumby, Steven P.; Bettencourt, Luis; Kenyon, Garrett T.; Chartrand, Rick; Wohlberg, Brendt
2015-10-06
Approaches for deciding what individuals in a population of visual system "neurons" are looking for using sparse overcomplete feature dictionaries are provided. A sparse overcomplete feature dictionary may be learned for an image dataset and a local sparse representation of the image dataset may be built using the learned feature dictionary. A local maximum pooling operation may be applied on the local sparse representation to produce a translation-tolerant representation of the image dataset. An object may then be classified and/or clustered within the translation-tolerant representation of the image dataset using a supervised classification algorithm and/or an unsupervised clustering algorithm.
Sparse Image Reconstruction in Computed Tomography
DEFF Research Database (Denmark)
Jørgensen, Jakob Sauer
In recent years, increased focus on the potentially harmful effects of x-ray computed tomography (CT) scans, such as radiation-induced cancer, has motivated research on new low-dose imaging techniques. Sparse image reconstruction methods, as studied for instance in the field of compressed sensing...... applications. This thesis takes a systematic approach toward establishing quantitative understanding of conditions for sparse reconstruction to work well in CT. A general framework for analyzing sparse reconstruction methods in CT is introduced and two sets of computational tools are proposed: 1...... contributions to a general set of computational characterization tools. Thus, the thesis contributions help advance sparse reconstruction methods toward routine use in...
Parallel computation of rotating flows
DEFF Research Database (Denmark)
Lundin, Lars Kristian; Barker, Vincent A.; Sørensen, Jens Nørkær
1999-01-01
This paper deals with the simulation of 3‐D rotating flows based on the velocity‐vorticity formulation of the Navier‐Stokes equations in cylindrical coordinates. The governing equations are discretized by a finite difference method. The solution is advanced to a new time level by a two‐step process...... is that of solving a singular, large, sparse, over‐determined linear system of equations, and the iterative method CGLS is applied for this purpose. We discuss some of the mathematical and numerical aspects of this procedure and report on the performance of our software on a wide range of parallel computers. Darbe...
When sparse coding meets ranking: a joint framework for learning sparse codes and ranking scores
Wang, Jim Jing-Yan; Cui, Xuefeng; Yu, Ge; Guo, Lili; Gao, Xin
2017-01-01
Sparse coding, which represents a data point as a sparse reconstruction code with regard to a dictionary, has been a popular data representation method. Meanwhile, in database retrieval problems, learning the ranking scores from data points plays
Neural Network for Sparse Reconstruction
Directory of Open Access Journals (Sweden)
Qingfa Li
2014-01-01
Full Text Available We construct a neural network based on smoothing approximation techniques and projected gradient method to solve a kind of sparse reconstruction problems. Neural network can be implemented by circuits and can be seen as an important method for solving optimization problems, especially large scale problems. Smoothing approximation is an efficient technique for solving nonsmooth optimization problems. We combine these two techniques to overcome the difficulties of the choices of the step size in discrete algorithms and the item in the set-valued map of differential inclusion. In theory, the proposed network can converge to the optimal solution set of the given problem. Furthermore, some numerical experiments show the effectiveness of the proposed network in this paper.
Diffusion Indexes With Sparse Loadings
DEFF Research Database (Denmark)
Kristensen, Johannes Tang
2017-01-01
The use of large-dimensional factor models in forecasting has received much attention in the literature with the consensus being that improvements on forecasts can be achieved when comparing with standard models. However, recent contributions in the literature have demonstrated that care needs...... to the problem by using the least absolute shrinkage and selection operator (LASSO) as a variable selection method to choose between the possible variables and thus obtain sparse loadings from which factors or diffusion indexes can be formed. This allows us to build a more parsimonious factor model...... in forecasting accuracy and thus find it to be an important alternative to PC. Supplementary materials for this article are available online....
Sparse and stable Markowitz portfolios.
Brodie, Joshua; Daubechies, Ingrid; De Mol, Christine; Giannone, Domenico; Loris, Ignace
2009-07-28
We consider the problem of portfolio selection within the classical Markowitz mean-variance framework, reformulated as a constrained least-squares regression problem. We propose to add to the objective function a penalty proportional to the sum of the absolute values of the portfolio weights. This penalty regularizes (stabilizes) the optimization problem, encourages sparse portfolios (i.e., portfolios with only few active positions), and allows accounting for transaction costs. Our approach recovers as special cases the no-short-positions portfolios, but does allow for short positions in limited number. We implement this methodology on two benchmark data sets constructed by Fama and French. Using only a modest amount of training data, we construct portfolios whose out-of-sample performance, as measured by Sharpe ratio, is consistently and significantly better than that of the naïve evenly weighted portfolio.
SPARSE FARADAY ROTATION MEASURE SYNTHESIS
International Nuclear Information System (INIS)
Andrecut, M.; Stil, J. M.; Taylor, A. R.
2012-01-01
Faraday rotation measure synthesis is a method for analyzing multichannel polarized radio emissions, and it has emerged as an important tool in the study of Galactic and extragalactic magnetic fields. The method requires the recovery of the Faraday dispersion function from measurements restricted to limited wavelength ranges, which is an ill-conditioned deconvolution problem. Here, we discuss a recovery method that assumes a sparse approximation of the Faraday dispersion function in an overcomplete dictionary of functions. We discuss the general case when both thin and thick components are included in the model, and we present the implementation of a greedy deconvolution algorithm. We illustrate the method with several numerical simulations that emphasize the effect of the covered range and sampling resolution in the Faraday depth space, and the effect of noise on the observed data.
Efficient Pseudorecursive Evaluation Schemes for Non-adaptive Sparse Grids
Buse, Gerrit
2014-01-01
In this work we propose novel algorithms for storing and evaluating sparse grid functions, operating on regular (not spatially adaptive), yet potentially dimensionally adaptive grid types. Besides regular sparse grids our approach includes truncated grids, both with and without boundary grid points. Similar to the implicit data structures proposed in Feuersänger (Dünngitterverfahren für hochdimensionale elliptische partielle Differntialgleichungen. Diploma Thesis, Institut für Numerische Simulation, Universität Bonn, 2005) and Murarasu et al. (Proceedings of the 16th ACM Symposium on Principles and Practice of Parallel Programming. Cambridge University Press, New York, 2011, pp. 25–34) we also define a bijective mapping from the multi-dimensional space of grid points to a contiguous index, such that the grid data can be stored in a simple array without overhead. Our approach is especially well-suited to exploit all levels of current commodity hardware, including cache-levels and vector extensions. Furthermore, this kind of data structure is extremely attractive for today’s real-time applications, as it gives direct access to the hierarchical structure of the grids, while outperforming other common sparse grid structures (hash maps, etc.) which do not match with modern compute platforms that well. For dimensionality d ≤ 10 we achieve good speedups on a 12 core Intel Westmere-EP NUMA platform compared to the results presented in Murarasu et al. (Proceedings of the International Conference on Computational Science—ICCS 2012. Procedia Computer Science, 2012). As we show, this also holds for the results obtained on Nvidia Fermi GPUs, for which we observe speedups over our own CPU implementation of up to 4.5 when dealing with moderate dimensionality. In high-dimensional settings, in the order of tens to hundreds of dimensions, our sparse grid evaluation kernels on the CPU outperform any other known implementation.
Numerical solution of large sparse linear systems
International Nuclear Information System (INIS)
Meurant, Gerard; Golub, Gene.
1982-02-01
This note is based on one of the lectures given at the 1980 CEA-EDF-INRIA Numerical Analysis Summer School whose aim is the study of large sparse linear systems. The main topics are solving least squares problems by orthogonal transformation, fast Poisson solvers and solution of sparse linear system by iterative methods with a special emphasis on preconditioned conjuguate gradient method [fr
Sparse seismic imaging using variable projection
Aravkin, Aleksandr Y.; Tu, Ning; van Leeuwen, Tristan
2013-01-01
We consider an important class of signal processing problems where the signal of interest is known to be sparse, and can be recovered from data given auxiliary information about how the data was generated. For example, a sparse Green's function may be recovered from seismic experimental data using
International Nuclear Information System (INIS)
Jejcic, A.; Maillard, J.; Maurel, G.; Silva, J.; Wolff-Bacha, F.
1997-01-01
The work in the field of parallel processing has developed as research activities using several numerical Monte Carlo simulations related to basic or applied current problems of nuclear and particle physics. For the applications utilizing the GEANT code development or improvement works were done on parts simulating low energy physical phenomena like radiation, transport and interaction. The problem of actinide burning by means of accelerators was approached using a simulation with the GEANT code. A program of neutron tracking in the range of low energies up to the thermal region has been developed. It is coupled to the GEANT code and permits in a single pass the simulation of a hybrid reactor core receiving a proton burst. Other works in this field refers to simulations for nuclear medicine applications like, for instance, development of biological probes, evaluation and characterization of the gamma cameras (collimators, crystal thickness) as well as the method for dosimetric calculations. Particularly, these calculations are suited for a geometrical parallelization approach especially adapted to parallel machines of the TN310 type. Other works mentioned in the same field refer to simulation of the electron channelling in crystals and simulation of the beam-beam interaction effect in colliders. The GEANT code was also used to simulate the operation of germanium detectors designed for natural and artificial radioactivity monitoring of environment
Parallel Implicit Algorithms for CFD
Keyes, David E.
1998-01-01
The main goal of this project was efficient distributed parallel and workstation cluster implementations of Newton-Krylov-Schwarz (NKS) solvers for implicit Computational Fluid Dynamics (CFD.) "Newton" refers to a quadratically convergent nonlinear iteration using gradient information based on the true residual, "Krylov" to an inner linear iteration that accesses the Jacobian matrix only through highly parallelizable sparse matrix-vector products, and "Schwarz" to a domain decomposition form of preconditioning the inner Krylov iterations with primarily neighbor-only exchange of data between the processors. Prior experience has established that Newton-Krylov methods are competitive solvers in the CFD context and that Krylov-Schwarz methods port well to distributed memory computers. The combination of the techniques into Newton-Krylov-Schwarz was implemented on 2D and 3D unstructured Euler codes on the parallel testbeds that used to be at LaRC and on several other parallel computers operated by other agencies or made available by the vendors. Early implementations were made directly in Massively Parallel Integration (MPI) with parallel solvers we adapted from legacy NASA codes and enhanced for full NKS functionality. Later implementations were made in the framework of the PETSC library from Argonne National Laboratory, which now includes pseudo-transient continuation Newton-Krylov-Schwarz solver capability (as a result of demands we made upon PETSC during our early porting experiences). A secondary project pursued with funding from this contract was parallel implicit solvers in acoustics, specifically in the Helmholtz formulation. A 2D acoustic inverse problem has been solved in parallel within the PETSC framework.
Orthogonal sparse linear discriminant analysis
Liu, Zhonghua; Liu, Gang; Pu, Jiexin; Wang, Xiaohong; Wang, Haijun
2018-03-01
Linear discriminant analysis (LDA) is a linear feature extraction approach, and it has received much attention. On the basis of LDA, researchers have done a lot of research work on it, and many variant versions of LDA were proposed. However, the inherent problem of LDA cannot be solved very well by the variant methods. The major disadvantages of the classical LDA are as follows. First, it is sensitive to outliers and noises. Second, only the global discriminant structure is preserved, while the local discriminant information is ignored. In this paper, we present a new orthogonal sparse linear discriminant analysis (OSLDA) algorithm. The k nearest neighbour graph is first constructed to preserve the locality discriminant information of sample points. Then, L2,1-norm constraint on the projection matrix is used to act as loss function, which can make the proposed method robust to outliers in data points. Extensive experiments have been performed on several standard public image databases, and the experiment results demonstrate the performance of the proposed OSLDA algorithm.
Spectra of sparse random matrices
International Nuclear Information System (INIS)
Kuehn, Reimer
2008-01-01
We compute the spectral density for ensembles of sparse symmetric random matrices using replica. Our formulation of the replica-symmetric ansatz shares the symmetries of that suggested in a seminal paper by Rodgers and Bray (symmetry with respect to permutation of replica and rotation symmetry in the space of replica), but uses a different representation in terms of superpositions of Gaussians. It gives rise to a pair of integral equations which can be solved by a stochastic population-dynamics algorithm. Remarkably our representation allows us to identify pure-point contributions to the spectral density related to the existence of normalizable eigenstates. Our approach is not restricted to matrices defined on graphs with Poissonian degree distribution. Matrices defined on regular random graphs or on scale-free graphs, are easily handled. We also look at matrices with row constraints such as discrete graph Laplacians. Our approach naturally allows us to unfold the total density of states into contributions coming from vertices of different local coordinations and an example of such an unfolding is presented. Our results are well corroborated by numerical diagonalization studies of large finite random matrices
McCallum, Ethan
2011-01-01
It's tough to argue with R as a high-quality, cross-platform, open source statistical software product-unless you're in the business of crunching Big Data. This concise book introduces you to several strategies for using R to analyze large datasets. You'll learn the basics of Snow, Multicore, Parallel, and some Hadoop-related tools, including how to find them, how to use them, when they work well, and when they don't. With these packages, you can overcome R's single-threaded nature by spreading work across multiple CPUs, or offloading work to multiple machines to address R's memory barrier.
Galiatsatos, P. G.; Tennyson, J.
2012-11-01
The most time consuming step within the framework of the UK R-matrix molecular codes is that of the diagonalization of the inner region Hamiltonian matrix (IRHM). Here we present the method that we follow to speed up this step. We use shared memory machines (SMM), distributed memory machines (DMM), the OpenMP directive based parallel language, the MPI function based parallel language, the sparse matrix diagonalizers ARPACK and PARPACK, a variation for real symmetric matrices of the official coordinate sparse matrix format and finally a parallel sparse matrix-vector product (PSMV). The efficient application of the previous techniques rely on two important facts: the sparsity of the matrix is large enough (more than 98%) and in order to get back converged results we need a small only part of the matrix spectrum.
Discriminative sparse coding on multi-manifolds
Wang, J.J.-Y.; Bensmail, H.; Yao, N.; Gao, Xin
2013-01-01
Sparse coding has been popularly used as an effective data representation method in various applications, such as computer vision, medical imaging and bioinformatics. However, the conventional sparse coding algorithms and their manifold-regularized variants (graph sparse coding and Laplacian sparse coding), learn codebooks and codes in an unsupervised manner and neglect class information that is available in the training set. To address this problem, we propose a novel discriminative sparse coding method based on multi-manifolds, that learns discriminative class-conditioned codebooks and sparse codes from both data feature spaces and class labels. First, the entire training set is partitioned into multiple manifolds according to the class labels. Then, we formulate the sparse coding as a manifold-manifold matching problem and learn class-conditioned codebooks and codes to maximize the manifold margins of different classes. Lastly, we present a data sample-manifold matching-based strategy to classify the unlabeled data samples. Experimental results on somatic mutations identification and breast tumor classification based on ultrasonic images demonstrate the efficacy of the proposed data representation and classification approach. 2013 The Authors. All rights reserved.
Discriminative sparse coding on multi-manifolds
Wang, J.J.-Y.
2013-09-26
Sparse coding has been popularly used as an effective data representation method in various applications, such as computer vision, medical imaging and bioinformatics. However, the conventional sparse coding algorithms and their manifold-regularized variants (graph sparse coding and Laplacian sparse coding), learn codebooks and codes in an unsupervised manner and neglect class information that is available in the training set. To address this problem, we propose a novel discriminative sparse coding method based on multi-manifolds, that learns discriminative class-conditioned codebooks and sparse codes from both data feature spaces and class labels. First, the entire training set is partitioned into multiple manifolds according to the class labels. Then, we formulate the sparse coding as a manifold-manifold matching problem and learn class-conditioned codebooks and codes to maximize the manifold margins of different classes. Lastly, we present a data sample-manifold matching-based strategy to classify the unlabeled data samples. Experimental results on somatic mutations identification and breast tumor classification based on ultrasonic images demonstrate the efficacy of the proposed data representation and classification approach. 2013 The Authors. All rights reserved.
Regression with Sparse Approximations of Data
DEFF Research Database (Denmark)
Noorzad, Pardis; Sturm, Bob L.
2012-01-01
We propose sparse approximation weighted regression (SPARROW), a method for local estimation of the regression function that uses sparse approximation with a dictionary of measurements. SPARROW estimates the regression function at a point with a linear combination of a few regressands selected...... by a sparse approximation of the point in terms of the regressors. We show SPARROW can be considered a variant of \\(k\\)-nearest neighbors regression (\\(k\\)-NNR), and more generally, local polynomial kernel regression. Unlike \\(k\\)-NNR, however, SPARROW can adapt the number of regressors to use based...
Sparse adaptive filters for echo cancellation
Paleologu, Constantin
2011-01-01
Adaptive filters with a large number of coefficients are usually involved in both network and acoustic echo cancellation. Consequently, it is important to improve the convergence rate and tracking of the conventional algorithms used for these applications. This can be achieved by exploiting the sparseness character of the echo paths. Identification of sparse impulse responses was addressed mainly in the last decade with the development of the so-called ``proportionate''-type algorithms. The goal of this book is to present the most important sparse adaptive filters developed for echo cancellati
Technique detection software for Sparse Matrices
Directory of Open Access Journals (Sweden)
KHAN Muhammad Taimoor
2009-12-01
Full Text Available Sparse storage formats are techniques for storing and processing the sparse matrix data efficiently. The performance of these storage formats depend upon the distribution of non-zeros, within the matrix in different dimensions. In order to have better results we need a technique that suits best the organization of data in a particular matrix. So the decision of selecting a better technique is the main step towards improving the system's results otherwise the efficiency can be decreased. The purpose of this research is to help identify the best storage format in case of reduced storage size and high processing efficiency for a sparse matrix.
Directory of Open Access Journals (Sweden)
James G. Worner
2017-05-01
Full Text Available James Worner is an Australian-based writer and scholar currently pursuing a PhD at the University of Technology Sydney. His research seeks to expose masculinities lost in the shadow of Australia’s Anzac hegemony while exploring new opportunities for contemporary historiography. He is the recipient of the Doctoral Scholarship in Historical Consciousness at the university’s Australian Centre of Public History and will be hosted by the University of Bologna during 2017 on a doctoral research writing scholarship. ‘Parallel Lines’ is one of a collection of stories, The Shapes of Us, exploring liminal spaces of modern life: class, gender, sexuality, race, religion and education. It looks at lives, like lines, that do not meet but which travel in proximity, simultaneously attracted and repelled. James’ short stories have been published in various journals and anthologies.
Structure-based bayesian sparse reconstruction
Quadeer, Ahmed Abdul; Al-Naffouri, Tareq Y.
2012-01-01
Sparse signal reconstruction algorithms have attracted research attention due to their wide applications in various fields. In this paper, we present a simple Bayesian approach that utilizes the sparsity constraint and a priori statistical
Biclustering via Sparse Singular Value Decomposition
Lee, Mihee
2010-02-16
Sparse singular value decomposition (SSVD) is proposed as a new exploratory analysis tool for biclustering or identifying interpretable row-column associations within high-dimensional data matrices. SSVD seeks a low-rank, checkerboard structured matrix approximation to data matrices. The desired checkerboard structure is achieved by forcing both the left- and right-singular vectors to be sparse, that is, having many zero entries. By interpreting singular vectors as regression coefficient vectors for certain linear regressions, sparsity-inducing regularization penalties are imposed to the least squares regression to produce sparse singular vectors. An efficient iterative algorithm is proposed for computing the sparse singular vectors, along with some discussion of penalty parameter selection. A lung cancer microarray dataset and a food nutrition dataset are used to illustrate SSVD as a biclustering method. SSVD is also compared with some existing biclustering methods using simulated datasets. © 2010, The International Biometric Society.
Tunable Sparse Network Coding for Multicast Networks
DEFF Research Database (Denmark)
Feizi, Soheil; Roetter, Daniel Enrique Lucani; Sørensen, Chres Wiant
2014-01-01
This paper shows the potential and key enabling mechanisms for tunable sparse network coding, a scheme in which the density of network coded packets varies during a transmission session. At the beginning of a transmission session, sparsely coded packets are transmitted, which benefits decoding...... complexity. At the end of a transmission, when receivers have accumulated degrees of freedom, coding density is increased. We propose a family of tunable sparse network codes (TSNCs) for multicast erasure networks with a controllable trade-off between completion time performance to decoding complexity...... a mechanism to perform efficient Gaussian elimination over sparse matrices going beyond belief propagation but maintaining low decoding complexity. Supporting simulation results are provided showing the trade-off between decoding complexity and completion time....
SPARSE ELECTROMAGNETIC IMAGING USING NONLINEAR LANDWEBER ITERATIONS
Desmal, Abdulla; Bagci, Hakan
2015-01-01
minimization problem is solved using nonlinear Landweber iterations, where at each iteration a thresholding function is applied to enforce the sparseness-promoting L0/L1-norm constraint. The thresholded nonlinear Landweber iterations are applied to several two
Learning sparse generative models of audiovisual signals
Monaci, Gianluca; Sommer, Friedrich T.; Vandergheynst, Pierre
2008-01-01
This paper presents a novel framework to learn sparse represen- tations for audiovisual signals. An audiovisual signal is modeled as a sparse sum of audiovisual kernels. The kernels are bimodal functions made of synchronous audio and video components that can be positioned independently and arbitrarily in space and time. We design an algorithm capable of learning sets of such audiovi- sual, synchronous, shift-invariant functions by alternatingly solving a coding and a learning pr...
Hyperspectral Unmixing with Robust Collaborative Sparse Regression
Directory of Open Access Journals (Sweden)
Chang Li
2016-07-01
Full Text Available Recently, sparse unmixing (SU of hyperspectral data has received particular attention for analyzing remote sensing images. However, most SU methods are based on the commonly admitted linear mixing model (LMM, which ignores the possible nonlinear effects (i.e., nonlinearity. In this paper, we propose a new method named robust collaborative sparse regression (RCSR based on the robust LMM (rLMM for hyperspectral unmixing. The rLMM takes the nonlinearity into consideration, and the nonlinearity is merely treated as outlier, which has the underlying sparse property. The RCSR simultaneously takes the collaborative sparse property of the abundance and sparsely distributed additive property of the outlier into consideration, which can be formed as a robust joint sparse regression problem. The inexact augmented Lagrangian method (IALM is used to optimize the proposed RCSR. The qualitative and quantitative experiments on synthetic datasets and real hyperspectral images demonstrate that the proposed RCSR is efficient for solving the hyperspectral SU problem compared with the other four state-of-the-art algorithms.
Multi-threaded Sparse Matrix-Matrix Multiplication for Many-Core and GPU Architectures.
Energy Technology Data Exchange (ETDEWEB)
Deveci, Mehmet [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Rajamanickam, Sivasankaran [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Trott, Christian Robert [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)
2017-12-01
Sparse Matrix-Matrix multiplication is a key kernel that has applications in several domains such as scienti c computing and graph analysis. Several algorithms have been studied in the past for this foundational kernel. In this paper, we develop parallel algorithms for sparse matrix-matrix multiplication with a focus on performance portability across different high performance computing architectures. The performance of these algorithms depend on the data structures used in them. We compare different types of accumulators in these algorithms and demonstrate the performance difference between these data structures. Furthermore, we develop a meta-algorithm, kkSpGEMM, to choose the right algorithm and data structure based on the characteristics of the problem. We show performance comparisons on three architectures and demonstrate the need for the community to develop two phase sparse matrix-matrix multiplication implementations for efficient reuse of the data structures involved.
Parallel Computing Strategies for Irregular Algorithms
Biswas, Rupak; Oliker, Leonid; Shan, Hongzhang; Biegel, Bryan (Technical Monitor)
2002-01-01
Parallel computing promises several orders of magnitude increase in our ability to solve realistic computationally-intensive problems, but relies on their efficient mapping and execution on large-scale multiprocessor architectures. Unfortunately, many important applications are irregular and dynamic in nature, making their effective parallel implementation a daunting task. Moreover, with the proliferation of parallel architectures and programming paradigms, the typical scientist is faced with a plethora of questions that must be answered in order to obtain an acceptable parallel implementation of the solution algorithm. In this paper, we consider three representative irregular applications: unstructured remeshing, sparse matrix computations, and N-body problems, and parallelize them using various popular programming paradigms on a wide spectrum of computer platforms ranging from state-of-the-art supercomputers to PC clusters. We present the underlying problems, the solution algorithms, and the parallel implementation strategies. Smart load-balancing, partitioning, and ordering techniques are used to enhance parallel performance. Overall results demonstrate the complexity of efficiently parallelizing irregular algorithms.
Permuting sparse rectangular matrices into block-diagonal form
Energy Technology Data Exchange (ETDEWEB)
Aykanat, Cevdet; Pinar, Ali; Catalyurek, Umit V.
2002-12-09
This work investigates the problem of permuting a sparse rectangular matrix into block diagonal form. Block diagonal form of a matrix grants an inherent parallelism for the solution of the deriving problem, as recently investigated in the context of mathematical programming, LU factorization and QR factorization. We propose graph and hypergraph models to represent the nonzero structure of a matrix, which reduce the permutation problem to those of graph partitioning by vertex separator and hypergraph partitioning, respectively. Besides proposing the models to represent sparse matrices and investigating related combinatorial problems, we provide a detailed survey of relevant literature to bridge the gap between different societies, investigate existing techniques for partitioning and propose new ones, and finally present a thorough empirical study of these techniques. Our experiments on a wide range of matrices, using state-of-the-art graph and hypergraph partitioning tools MeTiS and PaT oH, revealed that the proposed methods yield very effective solutions both in terms of solution quality and run time.
Slowness and sparseness have diverging effects on complex cell learning.
Directory of Open Access Journals (Sweden)
Jörn-Philipp Lies
2014-03-01
Full Text Available Following earlier studies which showed that a sparse coding principle may explain the receptive field properties of complex cells in primary visual cortex, it has been concluded that the same properties may be equally derived from a slowness principle. In contrast to this claim, we here show that slowness and sparsity drive the representations towards substantially different receptive field properties. To do so, we present complete sets of basis functions learned with slow subspace analysis (SSA in case of natural movies as well as translations, rotations, and scalings of natural images. SSA directly parallels independent subspace analysis (ISA with the only difference that SSA maximizes slowness instead of sparsity. We find a large discrepancy between the filter shapes learned with SSA and ISA. We argue that SSA can be understood as a generalization of the Fourier transform where the power spectrum corresponds to the maximally slow subspace energies in SSA. Finally, we investigate the trade-off between slowness and sparseness when combined in one objective function.
When sparse coding meets ranking: a joint framework for learning sparse codes and ranking scores
Wang, Jim Jing-Yan
2017-06-28
Sparse coding, which represents a data point as a sparse reconstruction code with regard to a dictionary, has been a popular data representation method. Meanwhile, in database retrieval problems, learning the ranking scores from data points plays an important role. Up to now, these two problems have always been considered separately, assuming that data coding and ranking are two independent and irrelevant problems. However, is there any internal relationship between sparse coding and ranking score learning? If yes, how to explore and make use of this internal relationship? In this paper, we try to answer these questions by developing the first joint sparse coding and ranking score learning algorithm. To explore the local distribution in the sparse code space, and also to bridge coding and ranking problems, we assume that in the neighborhood of each data point, the ranking scores can be approximated from the corresponding sparse codes by a local linear function. By considering the local approximation error of ranking scores, the reconstruction error and sparsity of sparse coding, and the query information provided by the user, we construct a unified objective function for learning of sparse codes, the dictionary and ranking scores. We further develop an iterative algorithm to solve this optimization problem.
An NoC Traffic Compiler for Efficient FPGA Implementation of Sparse Graph-Oriented Workloads
Directory of Open Access Journals (Sweden)
Nachiket Kapre
2011-01-01
synchronization to optimize our workloads for large networks up to 2025 parallel elements for BSP model and 25 parallel elements for Token Dataflow. This allows us to demonstrate speedups between 1.2× and 22× (3.5× mean, area reductions (number of Processing Elements between 3× and 15× (9× mean and dynamic energy savings between 2× and 3.5× (2.7× mean over a range of real-world graph applications in the BSP compute model. We deliver speedups of 0.5–13× (geomean 3.6× for Sparse Direct Matrix Solve (Token Dataflow compute model applied to a range of sparse matrices when using a high-quality placement algorithm. We expect such traffic optimization tools and techniques to become an essential part of the NoC application-mapping flow.
Sparse Learning with Stochastic Composite Optimization.
Zhang, Weizhong; Zhang, Lijun; Jin, Zhongming; Jin, Rong; Cai, Deng; Li, Xuelong; Liang, Ronghua; He, Xiaofei
2017-06-01
In this paper, we study Stochastic Composite Optimization (SCO) for sparse learning that aims to learn a sparse solution from a composite function. Most of the recent SCO algorithms have already reached the optimal expected convergence rate O(1/λT), but they often fail to deliver sparse solutions at the end either due to the limited sparsity regularization during stochastic optimization (SO) or due to the limitation in online-to-batch conversion. Even when the objective function is strongly convex, their high probability bounds can only attain O(√{log(1/δ)/T}) with δ is the failure probability, which is much worse than the expected convergence rate. To address these limitations, we propose a simple yet effective two-phase Stochastic Composite Optimization scheme by adding a novel powerful sparse online-to-batch conversion to the general Stochastic Optimization algorithms. We further develop three concrete algorithms, OptimalSL, LastSL and AverageSL, directly under our scheme to prove the effectiveness of the proposed scheme. Both the theoretical analysis and the experiment results show that our methods can really outperform the existing methods at the ability of sparse learning and at the meantime we can improve the high probability bound to approximately O(log(log(T)/δ)/λT).
In-place sparse suffix sorting
DEFF Research Database (Denmark)
Prezza, Nicola
2018-01-01
information regarding the lexicographical order of a size-b subset of all n text suffixes is often needed. Such information can be stored space-efficiently (in b words) in the sparse suffix array (SSA). The SSA and its relative sparse LCP array (SLCP) can be used as a space-efficient substitute of the sparse...... suffix tree. Very recently, Gawrychowski and Kociumaka [11] showed that the sparse suffix tree (and therefore SSA and SLCP) can be built in asymptotically optimal O(b) space with a Monte Carlo algorithm running in O(n) time. The main reason for using the SSA and SLCP arrays in place of the sparse suffix...... tree is, however, their reduced space of b words each. This leads naturally to the quest for in-place algorithms building these arrays. Franceschini and Muthukrishnan [8] showed that the full suffix array can be built in-place and in optimal running time. On the other hand, finding sub-quadratic in...
Scalable group level probabilistic sparse factor analysis
DEFF Research Database (Denmark)
Hinrich, Jesper Løve; Nielsen, Søren Føns Vind; Riis, Nicolai Andre Brogaard
2017-01-01
Many data-driven approaches exist to extract neural representations of functional magnetic resonance imaging (fMRI) data, but most of them lack a proper probabilistic formulation. We propose a scalable group level probabilistic sparse factor analysis (psFA) allowing spatially sparse maps, component...... pruning using automatic relevance determination (ARD) and subject specific heteroscedastic spatial noise modeling. For task-based and resting state fMRI, we show that the sparsity constraint gives rise to components similar to those obtained by group independent component analysis. The noise modeling...... shows that noise is reduced in areas typically associated with activation by the experimental design. The psFA model identifies sparse components and the probabilistic setting provides a natural way to handle parameter uncertainties. The variational Bayesian framework easily extends to more complex...
SPARSE ELECTROMAGNETIC IMAGING USING NONLINEAR LANDWEBER ITERATIONS
Desmal, Abdulla
2015-07-29
A scheme for efficiently solving the nonlinear electromagnetic inverse scattering problem on sparse investigation domains is described. The proposed scheme reconstructs the (complex) dielectric permittivity of an investigation domain from fields measured away from the domain itself. Least-squares data misfit between the computed scattered fields, which are expressed as a nonlinear function of the permittivity, and the measured fields is constrained by the L0/L1-norm of the solution. The resulting minimization problem is solved using nonlinear Landweber iterations, where at each iteration a thresholding function is applied to enforce the sparseness-promoting L0/L1-norm constraint. The thresholded nonlinear Landweber iterations are applied to several two-dimensional problems, where the ``measured\\'\\' fields are synthetically generated or obtained from actual experiments. These numerical experiments demonstrate the accuracy, efficiency, and applicability of the proposed scheme in reconstructing sparse profiles with high permittivity values.
Sparse regularization for force identification using dictionaries
Qiao, Baijie; Zhang, Xingwu; Wang, Chenxi; Zhang, Hang; Chen, Xuefeng
2016-04-01
The classical function expansion method based on minimizing l2-norm of the response residual employs various basis functions to represent the unknown force. Its difficulty lies in determining the optimum number of basis functions. Considering the sparsity of force in the time domain or in other basis space, we develop a general sparse regularization method based on minimizing l1-norm of the coefficient vector of basis functions. The number of basis functions is adaptively determined by minimizing the number of nonzero components in the coefficient vector during the sparse regularization process. First, according to the profile of the unknown force, the dictionary composed of basis functions is determined. Second, a sparsity convex optimization model for force identification is constructed. Third, given the transfer function and the operational response, Sparse reconstruction by separable approximation (SpaRSA) is developed to solve the sparse regularization problem of force identification. Finally, experiments including identification of impact and harmonic forces are conducted on a cantilever thin plate structure to illustrate the effectiveness and applicability of SpaRSA. Besides the Dirac dictionary, other three sparse dictionaries including Db6 wavelets, Sym4 wavelets and cubic B-spline functions can also accurately identify both the single and double impact forces from highly noisy responses in a sparse representation frame. The discrete cosine functions can also successfully reconstruct the harmonic forces including the sinusoidal, square and triangular forces. Conversely, the traditional Tikhonov regularization method with the L-curve criterion fails to identify both the impact and harmonic forces in these cases.
Structure-based bayesian sparse reconstruction
Quadeer, Ahmed Abdul
2012-12-01
Sparse signal reconstruction algorithms have attracted research attention due to their wide applications in various fields. In this paper, we present a simple Bayesian approach that utilizes the sparsity constraint and a priori statistical information (Gaussian or otherwise) to obtain near optimal estimates. In addition, we make use of the rich structure of the sensing matrix encountered in many signal processing applications to develop a fast sparse recovery algorithm. The computational complexity of the proposed algorithm is very low compared with the widely used convex relaxation methods as well as greedy matching pursuit techniques, especially at high sparsity. © 1991-2012 IEEE.
Binary Sparse Phase Retrieval via Simulated Annealing
Directory of Open Access Journals (Sweden)
Wei Peng
2016-01-01
Full Text Available This paper presents the Simulated Annealing Sparse PhAse Recovery (SASPAR algorithm for reconstructing sparse binary signals from their phaseless magnitudes of the Fourier transform. The greedy strategy version is also proposed for a comparison, which is a parameter-free algorithm. Sufficient numeric simulations indicate that our method is quite effective and suggest the binary model is robust. The SASPAR algorithm seems competitive to the existing methods for its efficiency and high recovery rate even with fewer Fourier measurements.
Subspace Based Blind Sparse Channel Estimation
DEFF Research Database (Denmark)
Hayashi, Kazunori; Matsushima, Hiroki; Sakai, Hideaki
2012-01-01
The paper proposes a subspace based blind sparse channel estimation method using 1–2 optimization by replacing the 2–norm minimization in the conventional subspace based method by the 1–norm minimization problem. Numerical results confirm that the proposed method can significantly improve...
Multilevel sparse functional principal component analysis.
Di, Chongzhi; Crainiceanu, Ciprian M; Jank, Wolfgang S
2014-01-29
We consider analysis of sparsely sampled multilevel functional data, where the basic observational unit is a function and data have a natural hierarchy of basic units. An example is when functions are recorded at multiple visits for each subject. Multilevel functional principal component analysis (MFPCA; Di et al. 2009) was proposed for such data when functions are densely recorded. Here we consider the case when functions are sparsely sampled and may contain only a few observations per function. We exploit the multilevel structure of covariance operators and achieve data reduction by principal component decompositions at both between and within subject levels. We address inherent methodological differences in the sparse sampling context to: 1) estimate the covariance operators; 2) estimate the functional principal component scores; 3) predict the underlying curves. Through simulations the proposed method is able to discover dominating modes of variations and reconstruct underlying curves well even in sparse settings. Our approach is illustrated by two applications, the Sleep Heart Health Study and eBay auctions.
Continuous speech recognition with sparse coding
CSIR Research Space (South Africa)
Smit, WJ
2009-04-01
Full Text Available generative model. The spike train is classified by making use of a spike train model and dynamic programming. It is computationally expensive to find a sparse code. We use an iterative subset selection algorithm with quadratic programming for this process...
Multisnapshot Sparse Bayesian Learning for DOA
DEFF Research Database (Denmark)
Gerstoft, Peter; Mecklenbrauker, Christoph F.; Xenaki, Angeliki
2016-01-01
The directions of arrival (DOA) of plane waves are estimated from multisnapshot sensor array data using sparse Bayesian learning (SBL). The prior for the source amplitudes is assumed independent zero-mean complex Gaussian distributed with hyperparameters, the unknown variances (i.e., the source...
Better Size Estimation for Sparse Matrix Products
DEFF Research Database (Denmark)
Amossen, Rasmus Resen; Campagna, Andrea; Pagh, Rasmus
2010-01-01
We consider the problem of doing fast and reliable estimation of the number of non-zero entries in a sparse Boolean matrix product. Let n denote the total number of non-zero entries in the input matrices. We show how to compute a 1 ± ε approximation (with small probability of error) in expected t...
Rotational image deblurring with sparse matrices
DEFF Research Database (Denmark)
Hansen, Per Christian; Nagy, James G.; Tigkos, Konstantinos
2014-01-01
We describe iterative deblurring algorithms that can handle blur caused by a rotation along an arbitrary axis (including the common case of pure rotation). Our algorithms use a sparse-matrix representation of the blurring operation, which allows us to easily handle several different boundary...
Feature based omnidirectional sparse visual path following
Goedemé, Toon; Tuytelaars, Tinne; Van Gool, Luc; Vanacker, Gerolf; Nuttin, Marnix
2005-01-01
Goedemé T., Tuytelaars T., Van Gool L., Vanacker G., Nuttin M., ''Feature based omnidirectional sparse visual path following'', Proceedings IEEE/RSJ international conference on intelligent robots and systems - IROS2005, pp. 1003-1008, August 2-6, 2005, Edmonton, Alberta, Canada.
Comparison of sparse point distribution models
DEFF Research Database (Denmark)
Erbou, Søren Gylling Hemmingsen; Vester-Christensen, Martin; Larsen, Rasmus
2010-01-01
This paper compares several methods for obtaining sparse and compact point distribution models suited for data sets containing many variables. These are evaluated on a database consisting of 3D surfaces of a section of the pelvic bone obtained from CT scans of 33 porcine carcasses. The superior m...
New methods for sampling sparse populations
Anna Ringvall
2007-01-01
To improve surveys of sparse objects, methods that use auxiliary information have been suggested. Guided transect sampling uses prior information, e.g., from aerial photographs, for the layout of survey strips. Instead of being laid out straight, the strips will wind between potentially more interesting areas. 3P sampling (probability proportional to prediction) uses...
Fast Sparse Coding for Range Data Denoising with Sparse Ridges Constraint.
Gao, Zhi; Lao, Mingjie; Sang, Yongsheng; Wen, Fei; Ramesh, Bharath; Zhai, Ruifang
2018-05-06
Light detection and ranging (LiDAR) sensors have been widely deployed on intelligent systems such as unmanned ground vehicles (UGVs) and unmanned aerial vehicles (UAVs) to perform localization, obstacle detection, and navigation tasks. Thus, research into range data processing with competitive performance in terms of both accuracy and efficiency has attracted increasing attention. Sparse coding has revolutionized signal processing and led to state-of-the-art performance in a variety of applications. However, dictionary learning, which plays the central role in sparse coding techniques, is computationally demanding, resulting in its limited applicability in real-time systems. In this study, we propose sparse coding algorithms with a fixed pre-learned ridge dictionary to realize range data denoising via leveraging the regularity of laser range measurements in man-made environments. Experiments on both synthesized data and real data demonstrate that our method obtains accuracy comparable to that of sophisticated sparse coding methods, but with much higher computational efficiency.
A sparse electromagnetic imaging scheme using nonlinear landweber iterations
Desmal, Abdulla; Bagci, Hakan
2015-01-01
Development and use of electromagnetic inverse scattering techniques for imagining sparse domains have been on the rise following the recent advancements in solving sparse optimization problems. Existing techniques rely on iteratively converting
Efficient Pseudorecursive Evaluation Schemes for Non-adaptive Sparse Grids
Buse, Gerrit; Pflü ger, Dirk; Jacob, Riko
2014-01-01
In this work we propose novel algorithms for storing and evaluating sparse grid functions, operating on regular (not spatially adaptive), yet potentially dimensionally adaptive grid types. Besides regular sparse grids our approach includes truncated
Improved Sparse Channel Estimation for Cooperative Communication Systems
Directory of Open Access Journals (Sweden)
Guan Gui
2012-01-01
Full Text Available Accurate channel state information (CSI is necessary at receiver for coherent detection in amplify-and-forward (AF cooperative communication systems. To estimate the channel, traditional methods, that is, least squares (LS and least absolute shrinkage and selection operator (LASSO, are based on assumptions of either dense channel or global sparse channel. However, LS-based linear method neglects the inherent sparse structure information while LASSO-based sparse channel method cannot take full advantage of the prior information. Based on the partial sparse assumption of the cooperative channel model, we propose an improved channel estimation method with partial sparse constraint. At first, by using sparse decomposition theory, channel estimation is formulated as a compressive sensing problem. Secondly, the cooperative channel is reconstructed by LASSO with partial sparse constraint. Finally, numerical simulations are carried out to confirm the superiority of proposed methods over global sparse channel estimation methods.
Sparse reconstruction using distribution agnostic bayesian matching pursuit
Masood, Mudassir; Al-Naffouri, Tareq Y.
2013-01-01
A fast matching pursuit method using a Bayesian approach is introduced for sparse signal recovery. This method performs Bayesian estimates of sparse signals even when the signal prior is non-Gaussian or unknown. It is agnostic on signal statistics
Sparse DOA estimation with polynomial rooting
DEFF Research Database (Denmark)
Xenaki, Angeliki; Gerstoft, Peter; Fernandez Grande, Efren
2015-01-01
Direction-of-arrival (DOA) estimation involves the localization of a few sources from a limited number of observations on an array of sensors. Thus, DOA estimation can be formulated as a sparse signal reconstruction problem and solved efficiently with compressive sensing (CS) to achieve highresol......Direction-of-arrival (DOA) estimation involves the localization of a few sources from a limited number of observations on an array of sensors. Thus, DOA estimation can be formulated as a sparse signal reconstruction problem and solved efficiently with compressive sensing (CS) to achieve...... highresolution imaging. Utilizing the dual optimal variables of the CS optimization problem, it is shown with Monte Carlo simulations that the DOAs are accurately reconstructed through polynomial rooting (Root-CS). Polynomial rooting is known to improve the resolution in several other DOA estimation methods...
Sparse learning of stochastic dynamical equations
Boninsegna, Lorenzo; Nüske, Feliks; Clementi, Cecilia
2018-06-01
With the rapid increase of available data for complex systems, there is great interest in the extraction of physically relevant information from massive datasets. Recently, a framework called Sparse Identification of Nonlinear Dynamics (SINDy) has been introduced to identify the governing equations of dynamical systems from simulation data. In this study, we extend SINDy to stochastic dynamical systems which are frequently used to model biophysical processes. We prove the asymptotic correctness of stochastic SINDy in the infinite data limit, both in the original and projected variables. We discuss algorithms to solve the sparse regression problem arising from the practical implementation of SINDy and show that cross validation is an essential tool to determine the right level of sparsity. We demonstrate the proposed methodology on two test systems, namely, the diffusion in a one-dimensional potential and the projected dynamics of a two-dimensional diffusion process.
Sparseness- and continuity-constrained seismic imaging
Herrmann, Felix J.
2005-04-01
Non-linear solution strategies to the least-squares seismic inverse-scattering problem with sparseness and continuity constraints are proposed. Our approach is designed to (i) deal with substantial amounts of additive noise (SNR formulating the solution of the seismic inverse problem in terms of an optimization problem. During the optimization, sparseness on the basis and continuity along the reflectors are imposed by jointly minimizing the l1- and anisotropic diffusion/total-variation norms on the coefficients and reflectivity, respectively. [Joint work with Peyman P. Moghaddam was carried out as part of the SINBAD project, with financial support secured through ITF (the Industry Technology Facilitator) from the following organizations: BG Group, BP, ExxonMobil, and SHELL. Additional funding came from the NSERC Discovery Grants 22R81254.
A density functional for sparse matter
DEFF Research Database (Denmark)
Langreth, D.C.; Lundqvist, Bengt; Chakarova-Kack, S.D.
2009-01-01
forces in molecules, to adsorbed molecules, like benzene, naphthalene, phenol and adenine on graphite, alumina and metals, to polymer and carbon nanotube (CNT) crystals, and hydrogen storage in graphite and metal-organic frameworks (MOFs), and to the structure of DNA and of DNA with intercalators......Sparse matter is abundant and has both strong local bonds and weak nonbonding forces, in particular nonlocal van der Waals (vdW) forces between atoms separated by empty space. It encompasses a broad spectrum of systems, like soft matter, adsorption systems and biostructures. Density-functional...... theory (DFT), long since proven successful for dense matter, seems now to have come to a point, where useful extensions to sparse matter are available. In particular, a functional form, vdW-DF (Dion et al 2004 Phys. Rev. Lett. 92 246401; Thonhauser et al 2007 Phys. Rev. B 76 125112), has been proposed...
Parallel Programming with Intel Parallel Studio XE
Blair-Chappell , Stephen
2012-01-01
Optimize code for multi-core processors with Intel's Parallel Studio Parallel programming is rapidly becoming a "must-know" skill for developers. Yet, where to start? This teach-yourself tutorial is an ideal starting point for developers who already know Windows C and C++ and are eager to add parallelism to their code. With a focus on applying tools, techniques, and language extensions to implement parallelism, this essential resource teaches you how to write programs for multicore and leverage the power of multicore in your programs. Sharing hands-on case studies and real-world examples, the
Robust Fringe Projection Profilometry via Sparse Representation.
Budianto; Lun, Daniel P K
2016-04-01
In this paper, a robust fringe projection profilometry (FPP) algorithm using the sparse dictionary learning and sparse coding techniques is proposed. When reconstructing the 3D model of objects, traditional FPP systems often fail to perform if the captured fringe images have a complex scene, such as having multiple and occluded objects. It introduces great difficulty to the phase unwrapping process of an FPP system that can result in serious distortion in the final reconstructed 3D model. For the proposed algorithm, it encodes the period order information, which is essential to phase unwrapping, into some texture patterns and embeds them to the projected fringe patterns. When the encoded fringe image is captured, a modified morphological component analysis and a sparse classification procedure are performed to decode and identify the embedded period order information. It is then used to assist the phase unwrapping process to deal with the different artifacts in the fringe images. Experimental results show that the proposed algorithm can significantly improve the robustness of an FPP system. It performs equally well no matter the fringe images have a simple or complex scene, or are affected due to the ambient lighting of the working environment.
Marandi, Ahmadreza; de Klerk, Etienne; Dahl, Joachim
The sparse bounded degree sum-of-squares (sparse-BSOS) hierarchy of Weisser, Lasserre and Toh [arXiv:1607.01151,2016] constructs a sequence of lower bounds for a sparse polynomial optimization problem. Under some assumptions, it is proven by the authors that the sequence converges to the optimal
Morse, H Stephen
1994-01-01
Practical Parallel Computing provides information pertinent to the fundamental aspects of high-performance parallel processing. This book discusses the development of parallel applications on a variety of equipment.Organized into three parts encompassing 12 chapters, this book begins with an overview of the technology trends that converge to favor massively parallel hardware over traditional mainframes and vector machines. This text then gives a tutorial introduction to parallel hardware architectures. Other chapters provide worked-out examples of programs using several parallel languages. Thi
Akl, Selim G
1985-01-01
Parallel Sorting Algorithms explains how to use parallel algorithms to sort a sequence of items on a variety of parallel computers. The book reviews the sorting problem, the parallel models of computation, parallel algorithms, and the lower bounds on the parallel sorting problems. The text also presents twenty different algorithms, such as linear arrays, mesh-connected computers, cube-connected computers. Another example where algorithm can be applied is on the shared-memory SIMD (single instruction stream multiple data stream) computers in which the whole sequence to be sorted can fit in the
Noniterative MAP reconstruction using sparse matrix representations.
Cao, Guangzhi; Bouman, Charles A; Webb, Kevin J
2009-09-01
We present a method for noniterative maximum a posteriori (MAP) tomographic reconstruction which is based on the use of sparse matrix representations. Our approach is to precompute and store the inverse matrix required for MAP reconstruction. This approach has generally not been used in the past because the inverse matrix is typically large and fully populated (i.e., not sparse). In order to overcome this problem, we introduce two new ideas. The first idea is a novel theory for the lossy source coding of matrix transformations which we refer to as matrix source coding. This theory is based on a distortion metric that reflects the distortions produced in the final matrix-vector product, rather than the distortions in the coded matrix itself. The resulting algorithms are shown to require orthonormal transformations of both the measurement data and the matrix rows and columns before quantization and coding. The second idea is a method for efficiently storing and computing the required orthonormal transformations, which we call a sparse-matrix transform (SMT). The SMT is a generalization of the classical FFT in that it uses butterflies to compute an orthonormal transform; but unlike an FFT, the SMT uses the butterflies in an irregular pattern, and is numerically designed to best approximate the desired transforms. We demonstrate the potential of the noniterative MAP reconstruction with examples from optical tomography. The method requires offline computation to encode the inverse transform. However, once these offline computations are completed, the noniterative MAP algorithm is shown to reduce both storage and computation by well over two orders of magnitude, as compared to a linear iterative reconstruction methods.
Galaxy redshift surveys with sparse sampling
International Nuclear Information System (INIS)
Chiang, Chi-Ting; Wullstein, Philipp; Komatsu, Eiichiro; Jee, Inh; Jeong, Donghui; Blanc, Guillermo A.; Ciardullo, Robin; Gronwall, Caryl; Hagen, Alex; Schneider, Donald P.; Drory, Niv; Fabricius, Maximilian; Landriau, Martin; Finkelstein, Steven; Jogee, Shardha; Cooper, Erin Mentuch; Tuttle, Sarah; Gebhardt, Karl; Hill, Gary J.
2013-01-01
Survey observations of the three-dimensional locations of galaxies are a powerful approach to measure the distribution of matter in the universe, which can be used to learn about the nature of dark energy, physics of inflation, neutrino masses, etc. A competitive survey, however, requires a large volume (e.g., V survey ∼ 10Gpc 3 ) to be covered, and thus tends to be expensive. A ''sparse sampling'' method offers a more affordable solution to this problem: within a survey footprint covering a given survey volume, V survey , we observe only a fraction of the volume. The distribution of observed regions should be chosen such that their separation is smaller than the length scale corresponding to the wavenumber of interest. Then one can recover the power spectrum of galaxies with precision expected for a survey covering a volume of V survey (rather than the volume of the sum of observed regions) with the number density of galaxies given by the total number of observed galaxies divided by V survey (rather than the number density of galaxies within an observed region). We find that regularly-spaced sampling yields an unbiased power spectrum with no window function effect, and deviations from regularly-spaced sampling, which are unavoidable in realistic surveys, introduce calculable window function effects and increase the uncertainties of the recovered power spectrum. On the other hand, we show that the two-point correlation function (pair counting) is not affected by sparse sampling. While we discuss the sparse sampling method within the context of the forthcoming Hobby-Eberly Telescope Dark Energy Experiment, the method is general and can be applied to other galaxy surveys
A view of Kanerva's sparse distributed memory
Denning, P. J.
1986-01-01
Pentti Kanerva is working on a new class of computers, which are called pattern computers. Pattern computers may close the gap between capabilities of biological organisms to recognize and act on patterns (visual, auditory, tactile, or olfactory) and capabilities of modern computers. Combinations of numeric, symbolic, and pattern computers may one day be capable of sustaining robots. The overview of the requirements for a pattern computer, a summary of Kanerva's Sparse Distributed Memory (SDM), and examples of tasks this computer can be expected to perform well are given.
Wavelets for Sparse Representation of Music
DEFF Research Database (Denmark)
Endelt, Line Ørtoft; Harbo, Anders La-Cour
2004-01-01
We are interested in obtaining a sparse representation of music signals by means of a discrete wavelet transform (DWT). That means we want the energy in the representation to be concentrated in few DWT coefficients. It is well-known that the decay of the DWT coefficients is strongly related...... to the number of vanishing moments of the mother wavelet, and to the smoothness of the signal. In this paper we present the result of applying two classical families of wavelets to a series of musical signals. The purpose is to determine a general relation between the number of vanishing moments of the wavelet...
Sparse dynamics for partial differential equations.
Schaeffer, Hayden; Caflisch, Russel; Hauck, Cory D; Osher, Stanley
2013-04-23
We investigate the approximate dynamics of several differential equations when the solutions are restricted to a sparse subset of a given basis. The restriction is enforced at every time step by simply applying soft thresholding to the coefficients of the basis approximation. By reducing or compressing the information needed to represent the solution at every step, only the essential dynamics are represented. In many cases, there are natural bases derived from the differential equations, which promote sparsity. We find that our method successfully reduces the dynamics of convection equations, diffusion equations, weak shocks, and vorticity equations with high-frequency source terms.
Abnormal Event Detection Using Local Sparse Representation
DEFF Research Database (Denmark)
Ren, Huamin; Moeslund, Thomas B.
2014-01-01
We propose to detect abnormal events via a sparse subspace clustering algorithm. Unlike most existing approaches, which search for optimized normal bases and detect abnormality based on least square error or reconstruction error from the learned normal patterns, we propose an abnormality measurem...... is found that satisfies: the distance between its local space and the normal space is large. We evaluate our method on two public benchmark datasets: UCSD and Subway Entrance datasets. The comparison to the state-of-the-art methods validate our method's effectiveness....
Functional fixedness in a technologically sparse culture.
German, Tim P; Barrett, H Clark
2005-01-01
Problem solving can be inefficient when the solution requires subjects to generate an atypical function for an object and the object's typical function has been primed. Subjects become "fixed" on the design function of the object, and problem solving suffers relative to control conditions in which the object's function is not demonstrated. In the current study, such functional fixedness was demonstrated in a sample of adolescents (mean age of 16 years) among the Shuar of Ecuadorian Amazonia, whose technologically sparse culture provides limited access to large numbers of artifacts with highly specialized functions. This result suggests that design function may universally be the core property of artifact concepts in human semantic memory.
Ordering schemes for sparse matrices using modern programming paradigms
International Nuclear Information System (INIS)
Oliker, Leonid; Li, Xiaoye; Husbands, Parry; Biswas, Rupak
2000-01-01
The Conjugate Gradient (CG) algorithm is perhaps the best-known iterative technique to solve sparse linear systems that are symmetric and positive definite. In previous work, we investigated the effects of various ordering and partitioning strategies on the performance of CG using different programming paradigms and architectures. This paper makes several extensions to our prior research. First, we present a hybrid(MPI+OpenMP) implementation of the CG algorithm on the IBM SP and show that the hybrid paradigm increases programming complexity with little performance gains compared to a pure MPI implementation. For ill-conditioned linear systems, it is often necessary to use a preconditioning technique. We present MPI results for ILU(0) preconditioned CG (PCG) using the BlockSolve95 library, and show that the initial ordering of the input matrix dramatically affect PCG's performance. Finally, a multithreaded version of the PCG is developed on the Cray (Tera) MTA. Unlike the message-passing version, this implementation did not require the complexities of special orderings or graph dependency analysis. However, only limited scalability was achieved due to the lack of available thread level parallelism
Introduction to parallel programming
Brawer, Steven
1989-01-01
Introduction to Parallel Programming focuses on the techniques, processes, methodologies, and approaches involved in parallel programming. The book first offers information on Fortran, hardware and operating system models, and processes, shared memory, and simple parallel programs. Discussions focus on processes and processors, joining processes, shared memory, time-sharing with multiple processors, hardware, loops, passing arguments in function/subroutine calls, program structure, and arithmetic expressions. The text then elaborates on basic parallel programming techniques, barriers and race
Fox, Geoffrey C; Messina, Guiseppe C
2014-01-01
A clear illustration of how parallel computers can be successfully appliedto large-scale scientific computations. This book demonstrates how avariety of applications in physics, biology, mathematics and other scienceswere implemented on real parallel computers to produce new scientificresults. It investigates issues of fine-grained parallelism relevant forfuture supercomputers with particular emphasis on hypercube architecture. The authors describe how they used an experimental approach to configuredifferent massively parallel machines, design and implement basic systemsoftware, and develop
DEFF Research Database (Denmark)
Han, Xixuan; Clemmensen, Line Katrine Harder
2015-01-01
We propose a general technique for obtaining sparse solutions to generalized eigenvalue problems, and call it Regularized Generalized Eigen-Decomposition (RGED). For decades, Fisher's discriminant criterion has been applied in supervised feature extraction and discriminant analysis, and it is for...
Fast Sparse Coding for Range Data Denoising with Sparse Ridges Constraint
Directory of Open Access Journals (Sweden)
Zhi Gao
2018-05-01
Full Text Available Light detection and ranging (LiDAR sensors have been widely deployed on intelligent systems such as unmanned ground vehicles (UGVs and unmanned aerial vehicles (UAVs to perform localization, obstacle detection, and navigation tasks. Thus, research into range data processing with competitive performance in terms of both accuracy and efficiency has attracted increasing attention. Sparse coding has revolutionized signal processing and led to state-of-the-art performance in a variety of applications. However, dictionary learning, which plays the central role in sparse coding techniques, is computationally demanding, resulting in its limited applicability in real-time systems. In this study, we propose sparse coding algorithms with a fixed pre-learned ridge dictionary to realize range data denoising via leveraging the regularity of laser range measurements in man-made environments. Experiments on both synthesized data and real data demonstrate that our method obtains accuracy comparable to that of sophisticated sparse coding methods, but with much higher computational efficiency.
Interferometric interpolation of sparse marine data
Hanafy, Sherif M.
2013-10-11
We present the theory and numerical results for interferometrically interpolating 2D and 3D marine surface seismic profiles data. For the interpolation of seismic data we use the combination of a recorded Green\\'s function and a model-based Green\\'s function for a water-layer model. Synthetic (2D and 3D) and field (2D) results show that the seismic data with sparse receiver intervals can be accurately interpolated to smaller intervals using multiples in the data. An up- and downgoing separation of both recorded and model-based Green\\'s functions can help in minimizing artefacts in a virtual shot gather. If the up- and downgoing separation is not possible, noticeable artefacts will be generated in the virtual shot gather. As a partial remedy we iteratively use a non-stationary 1D multi-channel matching filter with the interpolated data. Results suggest that a sparse marine seismic survey can yield more information about reflectors if traces are interpolated by interferometry. Comparing our results to those of f-k interpolation shows that the synthetic example gives comparable results while the field example shows better interpolation quality for the interferometric method. © 2013 European Association of Geoscientists & Engineers.
Balanced and sparse Tamo-Barg codes
Halbawi, Wael; Duursma, Iwan; Dau, Hoang; Hassibi, Babak
2017-01-01
We construct balanced and sparse generator matrices for Tamo and Barg's Locally Recoverable Codes (LRCs). More specifically, for a cyclic Tamo-Barg code of length n, dimension k and locality r, we show how to deterministically construct a generator matrix where the number of nonzeros in any two columns differs by at most one, and where the weight of every row is d + r - 1, where d is the minimum distance of the code. Since LRCs are designed mainly for distributed storage systems, the results presented in this work provide a computationally balanced and efficient encoding scheme for these codes. The balanced property ensures that the computational effort exerted by any storage node is essentially the same, whilst the sparse property ensures that this effort is minimal. The work presented in this paper extends a similar result previously established for Reed-Solomon (RS) codes, where it is now known that any cyclic RS code possesses a generator matrix that is balanced as described, but is sparsest, meaning that each row has d nonzeros.
Atmospheric inverse modeling via sparse reconstruction
Hase, Nils; Miller, Scot M.; Maaß, Peter; Notholt, Justus; Palm, Mathias; Warneke, Thorsten
2017-10-01
Many applications in atmospheric science involve ill-posed inverse problems. A crucial component of many inverse problems is the proper formulation of a priori knowledge about the unknown parameters. In most cases, this knowledge is expressed as a Gaussian prior. This formulation often performs well at capturing smoothed, large-scale processes but is often ill equipped to capture localized structures like large point sources or localized hot spots. Over the last decade, scientists from a diverse array of applied mathematics and engineering fields have developed sparse reconstruction techniques to identify localized structures. In this study, we present a new regularization approach for ill-posed inverse problems in atmospheric science. It is based on Tikhonov regularization with sparsity constraint and allows bounds on the parameters. We enforce sparsity using a dictionary representation system. We analyze its performance in an atmospheric inverse modeling scenario by estimating anthropogenic US methane (CH4) emissions from simulated atmospheric measurements. Different measures indicate that our sparse reconstruction approach is better able to capture large point sources or localized hot spots than other methods commonly used in atmospheric inversions. It captures the overall signal equally well but adds details on the grid scale. This feature can be of value for any inverse problem with point or spatially discrete sources. We show an example for source estimation of synthetic methane emissions from the Barnett shale formation.
Balanced and sparse Tamo-Barg codes
Halbawi, Wael
2017-08-29
We construct balanced and sparse generator matrices for Tamo and Barg\\'s Locally Recoverable Codes (LRCs). More specifically, for a cyclic Tamo-Barg code of length n, dimension k and locality r, we show how to deterministically construct a generator matrix where the number of nonzeros in any two columns differs by at most one, and where the weight of every row is d + r - 1, where d is the minimum distance of the code. Since LRCs are designed mainly for distributed storage systems, the results presented in this work provide a computationally balanced and efficient encoding scheme for these codes. The balanced property ensures that the computational effort exerted by any storage node is essentially the same, whilst the sparse property ensures that this effort is minimal. The work presented in this paper extends a similar result previously established for Reed-Solomon (RS) codes, where it is now known that any cyclic RS code possesses a generator matrix that is balanced as described, but is sparsest, meaning that each row has d nonzeros.
Non-parametric co-clustering of large scale sparse bipartite networks on the GPU
DEFF Research Database (Denmark)
Hansen, Toke Jansen; Mørup, Morten; Hansen, Lars Kai
2011-01-01
of row and column clusters from a hypothesis space of an infinite number of clusters. To reach large scale applications of co-clustering we exploit that parameter inference for co-clustering is well suited for parallel computing. We develop a generic GPU framework for efficient inference on large scale...... sparse bipartite networks and achieve a speedup of two orders of magnitude compared to estimation based on conventional CPUs. In terms of scalability we find for networks with more than 100 million links that reliable inference can be achieved in less than an hour on a single GPU. To efficiently manage...
Workload Balancing on Heterogeneous Systems: A Case Study of Sparse Grid Interpolation
Muraraşu, Alin
2012-01-01
Multi-core parallelism and accelerators are becoming common features of today’s computer systems, as they allow for computational power without sacrificing energy efficiency. Due to heterogeneity, tuning for each type of compute unit and adequate load balancing is essential. This paper proposes static and dynamic solutions for load balancing in the context of an application for visualizing high-dimensional simulation data. The application relies on the sparse grid technique for data compression. Its performance critical part is the interpolation routine used for decompression. Results show that our load balancing scheme allows for an efficient acceleration of interpolation on heterogeneous systems containing multi-core CPUs and GPUs.
Parallel Reservoir Simulations with Sparse Grid Techniques and Applications to Wormhole Propagation
Wu, Yuanqing
2015-01-01
the traditional simulation technique relying on the Darcy framework, we propose a new framework called Darcy-Brinkman-Forchheimer framework to simulate wormhole propagation. Furthermore, to process the large quantity of cells in the simulation grid and shorten
A portable implementation of ARPACK for distributed memory parallel architectures
Energy Technology Data Exchange (ETDEWEB)
Maschhoff, K.J.; Sorensen, D.C.
1996-12-31
ARPACK is a package of Fortran 77 subroutines which implement the Implicitly Restarted Arnoldi Method used for solving large sparse eigenvalue problems. A parallel implementation of ARPACK is presented which is portable across a wide range of distributed memory platforms and requires minimal changes to the serial code. The communication layers used for message passing are the Basic Linear Algebra Communication Subprograms (BLACS) developed for the ScaLAPACK project and Message Passing Interface(MPI).
JTpack90: A parallel, object-based, Fortran 90 linear algebra package
Energy Technology Data Exchange (ETDEWEB)
Turner, J.A.; Kothe, D.B. [Los Alamos National Lab., NM (United States); Ferrell, R.C. [Cambridge Power Computing Associates, Ltd., Brookline, MA (United States)
1997-03-01
The authors have developed an object-based linear algebra package, currently with emphasis on sparse Krylov methods, driven primarily by needs of the Los Alamos National Laboratory parallel unstructured-mesh casting simulation tool Telluride. Support for a number of sparse storage formats, methods, and preconditioners have been implemented, driven primarily by application needs. They describe the object-based Fortran 90 approach, which enhances maintainability, performance, and extensibility, the parallelization approach using a new portable gather/scatter library (PGSLib), current capabilities and future plans, and present preliminary performance results on a variety of platforms.
Parallel Atomistic Simulations
Energy Technology Data Exchange (ETDEWEB)
HEFFELFINGER,GRANT S.
2000-01-18
Algorithms developed to enable the use of atomistic molecular simulation methods with parallel computers are reviewed. Methods appropriate for bonded as well as non-bonded (and charged) interactions are included. While strategies for obtaining parallel molecular simulations have been developed for the full variety of atomistic simulation methods, molecular dynamics and Monte Carlo have received the most attention. Three main types of parallel molecular dynamics simulations have been developed, the replicated data decomposition, the spatial decomposition, and the force decomposition. For Monte Carlo simulations, parallel algorithms have been developed which can be divided into two categories, those which require a modified Markov chain and those which do not. Parallel algorithms developed for other simulation methods such as Gibbs ensemble Monte Carlo, grand canonical molecular dynamics, and Monte Carlo methods for protein structure determination are also reviewed and issues such as how to measure parallel efficiency, especially in the case of parallel Monte Carlo algorithms with modified Markov chains are discussed.
Hine, N D M; Haynes, P D; Mostofi, A A; Payne, M C
2010-09-21
We present calculations of formation energies of defects in an ionic solid (Al(2)O(3)) extrapolated to the dilute limit, corresponding to a simulation cell of infinite size. The large-scale calculations required for this extrapolation are enabled by developments in the approach to parallel sparse matrix algebra operations, which are central to linear-scaling density-functional theory calculations. The computational cost of manipulating sparse matrices, whose sizes are determined by the large number of basis functions present, is greatly improved with this new approach. We present details of the sparse algebra scheme implemented in the ONETEP code using hierarchical sparsity patterns, and demonstrate its use in calculations on a wide range of systems, involving thousands of atoms on hundreds to thousands of parallel processes.
A Sparse Approximate Inverse Preconditioner for Nonsymmetric Linear Systems
Czech Academy of Sciences Publication Activity Database
Benzi, M.; Tůma, Miroslav
1998-01-01
Roč. 19, č. 3 (1998), s. 968-994 ISSN 1064-8275 R&D Projects: GA ČR GA201/93/0067; GA AV ČR IAA230401 Keywords : large sparse systems * interative methods * preconditioning * approximate inverse * sparse linear systems * sparse matrices * incomplete factorizations * conjugate gradient -type methods Subject RIV: BA - General Mathematics Impact factor: 1.378, year: 1998
Parallel algorithms for network routing problems and recurrences
International Nuclear Information System (INIS)
Wisniewski, J.A.; Sameh, A.H.
1982-01-01
In this paper, we consider the parallel solution of recurrences, and linear systems in the regular algebra of Carre. These problems are equivalent to solving the shortest path problem in graph theory, and they also arise in the analysis of Fortran programs. Our methods for solving linear systems in the regular algebra are analogues of well-known methods for solving systems of linear algebraic equations. A parallel version of Dijkstra's method, which has no linear algebraic analogue, is presented. Considerations for choosing an algorithm when the problem is large and sparse are also discussed
A Parallel Algebraic Multigrid Solver on Graphics Processing Units
Haase, Gundolf
2010-01-01
The paper presents a multi-GPU implementation of the preconditioned conjugate gradient algorithm with an algebraic multigrid preconditioner (PCG-AMG) for an elliptic model problem on a 3D unstructured grid. An efficient parallel sparse matrix-vector multiplication scheme underlying the PCG-AMG algorithm is presented for the many-core GPU architecture. A performance comparison of the parallel solver shows that a singe Nvidia Tesla C1060 GPU board delivers the performance of a sixteen node Infiniband cluster and a multi-GPU configuration with eight GPUs is about 100 times faster than a typical server CPU core. © 2010 Springer-Verlag.
Dose-shaping using targeted sparse optimization
Energy Technology Data Exchange (ETDEWEB)
Sayre, George A.; Ruan, Dan [Department of Radiation Oncology, University of California - Los Angeles School of Medicine, 200 Medical Plaza, Los Angeles, California 90095 (United States)
2013-07-15
Purpose: Dose volume histograms (DVHs) are common tools in radiation therapy treatment planning to characterize plan quality. As statistical metrics, DVHs provide a compact summary of the underlying plan at the cost of losing spatial information: the same or similar dose-volume histograms can arise from substantially different spatial dose maps. This is exactly the reason why physicians and physicists scrutinize dose maps even after they satisfy all DVH endpoints numerically. However, up to this point, little has been done to control spatial phenomena, such as the spatial distribution of hot spots, which has significant clinical implications. To this end, the authors propose a novel objective function that enables a more direct tradeoff between target coverage, organ-sparing, and planning target volume (PTV) homogeneity, and presents our findings from four prostate cases, a pancreas case, and a head-and-neck case to illustrate the advantages and general applicability of our method.Methods: In designing the energy minimization objective (E{sub tot}{sup sparse}), the authors utilized the following robust cost functions: (1) an asymmetric linear well function to allow differential penalties for underdose, relaxation of prescription dose, and overdose in the PTV; (2) a two-piece linear function to heavily penalize high dose and mildly penalize low and intermediate dose in organs-at risk (OARs); and (3) a total variation energy, i.e., the L{sub 1} norm applied to the first-order approximation of the dose gradient in the PTV. By minimizing a weighted sum of these robust costs, general conformity to dose prescription and dose-gradient prescription is achieved while encouraging prescription violations to follow a Laplace distribution. In contrast, conventional quadratic objectives are associated with a Gaussian distribution of violations, which is less forgiving to large violations of prescription than the Laplace distribution. As a result, the proposed objective E{sub tot
Dose-shaping using targeted sparse optimization
International Nuclear Information System (INIS)
Sayre, George A.; Ruan, Dan
2013-01-01
Purpose: Dose volume histograms (DVHs) are common tools in radiation therapy treatment planning to characterize plan quality. As statistical metrics, DVHs provide a compact summary of the underlying plan at the cost of losing spatial information: the same or similar dose-volume histograms can arise from substantially different spatial dose maps. This is exactly the reason why physicians and physicists scrutinize dose maps even after they satisfy all DVH endpoints numerically. However, up to this point, little has been done to control spatial phenomena, such as the spatial distribution of hot spots, which has significant clinical implications. To this end, the authors propose a novel objective function that enables a more direct tradeoff between target coverage, organ-sparing, and planning target volume (PTV) homogeneity, and presents our findings from four prostate cases, a pancreas case, and a head-and-neck case to illustrate the advantages and general applicability of our method.Methods: In designing the energy minimization objective (E tot sparse ), the authors utilized the following robust cost functions: (1) an asymmetric linear well function to allow differential penalties for underdose, relaxation of prescription dose, and overdose in the PTV; (2) a two-piece linear function to heavily penalize high dose and mildly penalize low and intermediate dose in organs-at risk (OARs); and (3) a total variation energy, i.e., the L 1 norm applied to the first-order approximation of the dose gradient in the PTV. By minimizing a weighted sum of these robust costs, general conformity to dose prescription and dose-gradient prescription is achieved while encouraging prescription violations to follow a Laplace distribution. In contrast, conventional quadratic objectives are associated with a Gaussian distribution of violations, which is less forgiving to large violations of prescription than the Laplace distribution. As a result, the proposed objective E tot sparse improves
Dose-shaping using targeted sparse optimization.
Sayre, George A; Ruan, Dan
2013-07-01
Dose volume histograms (DVHs) are common tools in radiation therapy treatment planning to characterize plan quality. As statistical metrics, DVHs provide a compact summary of the underlying plan at the cost of losing spatial information: the same or similar dose-volume histograms can arise from substantially different spatial dose maps. This is exactly the reason why physicians and physicists scrutinize dose maps even after they satisfy all DVH endpoints numerically. However, up to this point, little has been done to control spatial phenomena, such as the spatial distribution of hot spots, which has significant clinical implications. To this end, the authors propose a novel objective function that enables a more direct tradeoff between target coverage, organ-sparing, and planning target volume (PTV) homogeneity, and presents our findings from four prostate cases, a pancreas case, and a head-and-neck case to illustrate the advantages and general applicability of our method. In designing the energy minimization objective (E tot (sparse)), the authors utilized the following robust cost functions: (1) an asymmetric linear well function to allow differential penalties for underdose, relaxation of prescription dose, and overdose in the PTV; (2) a two-piece linear function to heavily penalize high dose and mildly penalize low and intermediate dose in organs-at risk (OARs); and (3) a total variation energy, i.e., the L1 norm applied to the first-order approximation of the dose gradient in the PTV. By minimizing a weighted sum of these robust costs, general conformity to dose prescription and dose-gradient prescription is achieved while encouraging prescription violations to follow a Laplace distribution. In contrast, conventional quadratic objectives are associated with a Gaussian distribution of violations, which is less forgiving to large violations of prescription than the Laplace distribution. As a result, the proposed objective E tot (sparse) improves tradeoff between
Data analysis in high-dimensional sparse spaces
DEFF Research Database (Denmark)
Clemmensen, Line Katrine Harder
classification techniques for high-dimensional problems are presented: Sparse discriminant analysis, sparse mixture discriminant analysis and orthogonality constrained support vector machines. The first two introduces sparseness to the well known linear and mixture discriminant analysis and thereby provide low...... are applied to classifications of fish species, ear canal impressions used in the hearing aid industry, microbiological fungi species, and various cancerous tissues and healthy tissues. In addition, novel applications of sparse regressions (also called the elastic net) to the medical, concrete, and food...
Greedy vs. L1 convex optimization in sparse coding
DEFF Research Database (Denmark)
Ren, Huamin; Pan, Hong; Olsen, Søren Ingvor
2015-01-01
Sparse representation has been applied successfully in many image analysis applications, including abnormal event detection, in which a baseline is to learn a dictionary from the training data and detect anomalies from its sparse codes. During this procedure, sparse codes which can be achieved...... solutions. Considering the property of abnormal event detection, i.e., only normal videos are used as training data due to practical reasons, effective codes in classification application may not perform well in abnormality detection. Therefore, we compare the sparse codes and comprehensively evaluate...... their performance from various aspects to better understand their applicability, including computation time, reconstruction error, sparsity, detection...
General upper bounds on the runtime of parallel evolutionary algorithms.
Lässig, Jörg; Sudholt, Dirk
2014-01-01
We present a general method for analyzing the runtime of parallel evolutionary algorithms with spatially structured populations. Based on the fitness-level method, it yields upper bounds on the expected parallel runtime. This allows for a rigorous estimate of the speedup gained by parallelization. Tailored results are given for common migration topologies: ring graphs, torus graphs, hypercubes, and the complete graph. Example applications for pseudo-Boolean optimization show that our method is easy to apply and that it gives powerful results. In our examples the performance guarantees improve with the density of the topology. Surprisingly, even sparse topologies such as ring graphs lead to a significant speedup for many functions while not increasing the total number of function evaluations by more than a constant factor. We also identify which number of processors lead to the best guaranteed speedups, thus giving hints on how to parameterize parallel evolutionary algorithms.
A Novel CSR-Based Sparse Matrix-Vector Multiplication on GPUs
Directory of Open Access Journals (Sweden)
Guixia He
2016-01-01
Full Text Available Sparse matrix-vector multiplication (SpMV is an important operation in scientific computations. Compressed sparse row (CSR is the most frequently used format to store sparse matrices. However, CSR-based SpMVs on graphic processing units (GPUs, for example, CSR-scalar and CSR-vector, usually have poor performance due to irregular memory access patterns. This motivates us to propose a perfect CSR-based SpMV on the GPU that is called PCSR. PCSR involves two kernels and accesses CSR arrays in a fully coalesced manner by introducing a middle array, which greatly alleviates the deficiencies of CSR-scalar (rare coalescing and CSR-vector (partial coalescing. Test results on a single C2050 GPU show that PCSR fully outperforms CSR-scalar, CSR-vector, and CSRMV and HYBMV in the vendor-tuned CUSPARSE library and is comparable with a most recently proposed CSR-based algorithm, CSR-Adaptive. Furthermore, we extend PCSR on a single GPU to multiple GPUs. Experimental results on four C2050 GPUs show that no matter whether the communication between GPUs is considered or not PCSR on multiple GPUs achieves good performance and has high parallel efficiency.
Sparse Bayesian Learning for Nonstationary Data Sources
Fujimaki, Ryohei; Yairi, Takehisa; Machida, Kazuo
This paper proposes an online Sparse Bayesian Learning (SBL) algorithm for modeling nonstationary data sources. Although most learning algorithms implicitly assume that a data source does not change over time (stationary), one in the real world usually does due to such various factors as dynamically changing environments, device degradation, sudden failures, etc (nonstationary). The proposed algorithm can be made useable for stationary online SBL by setting time decay parameters to zero, and as such it can be interpreted as a single unified framework for online SBL for use with stationary and nonstationary data sources. Tests both on four types of benchmark problems and on actual stock price data have shown it to perform well.
Narrowband interference parameterization for sparse Bayesian recovery
Ali, Anum
2015-09-11
This paper addresses the problem of narrowband interference (NBI) in SC-FDMA systems by using tools from compressed sensing and stochastic geometry. The proposed NBI cancellation scheme exploits the frequency domain sparsity of the unknown signal and adopts a Bayesian sparse recovery procedure. This is done by keeping a few randomly chosen sub-carriers data free to sense the NBI signal at the receiver. As Bayesian recovery requires knowledge of some NBI parameters (i.e., mean, variance and sparsity rate), we use tools from stochastic geometry to obtain analytical expressions for the required parameters. Our simulation results validate the analysis and depict suitability of the proposed recovery method for NBI mitigation. © 2015 IEEE.
Modern algorithms for large sparse eigenvalue problems
International Nuclear Information System (INIS)
Meyer, A.
1987-01-01
The volume is written for mathematicians interested in (numerical) linear algebra and in the solution of large sparse eigenvalue problems, as well as for specialists in engineering, who use the considered algorithms in the investigation of eigenoscillations of structures, in reactor physics, etc. Some variants of the algorithms based on the idea of a gradient-type direction of movement are presented and their convergence properties are discussed. From this, a general strategy for the direct use of preconditionings for the eigenvalue problem is derived. In this new approach the necessity of the solution of large linear systems is entirely avoided. Hence, these methods represent a new alternative to some other modern eigenvalue algorithms, as they show a slightly slower convergence on the one hand but essentially lower numerical and data processing problems on the other hand. A brief description and comparison of some well-known methods (i.e. simultaneous iteration, Lanczos algorithm) completes this volume. (author)
Sparse random matrices: The eigenvalue spectrum revisited
International Nuclear Information System (INIS)
Semerjian, Guilhem; Cugliandolo, Leticia F.
2003-08-01
We revisit the derivation of the density of states of sparse random matrices. We derive a recursion relation that allows one to compute the spectrum of the matrix of incidence for finite trees that determines completely the low concentration limit. Using the iterative scheme introduced by Biroli and Monasson [J. Phys. A 32, L255 (1999)] we find an approximate expression for the density of states expected to hold exactly in the opposite limit of large but finite concentration. The combination of the two methods yields a very simple geometric interpretation of the tails of the spectrum. We test the analytic results with numerical simulations and we suggest an indirect numerical method to explore the tails of the spectrum. (author)
ESTIMATION OF FUNCTIONALS OF SPARSE COVARIANCE MATRICES.
Fan, Jianqing; Rigollet, Philippe; Wang, Weichen
High-dimensional statistical tests often ignore correlations to gain simplicity and stability leading to null distributions that depend on functionals of correlation matrices such as their Frobenius norm and other ℓ r norms. Motivated by the computation of critical values of such tests, we investigate the difficulty of estimation the functionals of sparse correlation matrices. Specifically, we show that simple plug-in procedures based on thresholded estimators of correlation matrices are sparsity-adaptive and minimax optimal over a large class of correlation matrices. Akin to previous results on functional estimation, the minimax rates exhibit an elbow phenomenon. Our results are further illustrated in simulated data as well as an empirical study of data arising in financial econometrics.
Miniature Laboratory for Detecting Sparse Biomolecules
Lin, Ying; Yu, Nan
2005-01-01
A miniature laboratory system has been proposed for use in the field to detect sparsely distributed biomolecules. By emphasizing concentration and sorting of specimens prior to detection, the underlying system concept would make it possible to attain high detection sensitivities without the need to develop ever more sensitive biosensors. The original purpose of the proposal is to aid the search for signs of life on a remote planet by enabling the detection of specimens as sparse as a few molecules or microbes in a large amount of soil, dust, rocks, water/ice, or other raw sample material. Some version of the system could prove useful on Earth for remote sensing of biological contamination, including agents of biological warfare. Processing in this system would begin with dissolution of the raw sample material in a sample-separation vessel. The solution in the vessel would contain floating microscopic magnetic beads coated with substances that could engage in chemical reactions with various target functional groups that are parts of target molecules. The chemical reactions would cause the targeted molecules to be captured on the surfaces of the beads. By use of a controlled magnetic field, the beads would be concentrated in a specified location in the vessel. Once the beads were thus concentrated, the rest of the solution would be discarded. This procedure would obviate the filtration steps and thereby also eliminate the filter-clogging difficulties of typical prior sample-concentration schemes. For ferrous dust/soil samples, the dissolution would be done first in a separate vessel before the solution is transferred to the microbead-containing vessel.
CERN. Geneva
2016-01-01
The traditionally used and well established parallel programming models OpenMP and MPI are both targeting lower level parallelism and are meant to be as language agnostic as possible. For a long time, those models were the only widely available portable options for developing parallel C++ applications beyond using plain threads. This has strongly limited the optimization capabilities of compilers, has inhibited extensibility and genericity, and has restricted the use of those models together with other, modern higher level abstractions introduced by the C++11 and C++14 standards. The recent revival of interest in the industry and wider community for the C++ language has also spurred a remarkable amount of standardization proposals and technical specifications being developed. Those efforts however have so far failed to build a vision on how to seamlessly integrate various types of parallelism, such as iterative parallel execution, task-based parallelism, asynchronous many-task execution flows, continuation s...
DEFF Research Database (Denmark)
Sitchinava, Nodar; Zeh, Norbert
2012-01-01
We present the parallel buffer tree, a parallel external memory (PEM) data structure for batched search problems. This data structure is a non-trivial extension of Arge's sequential buffer tree to a private-cache multiprocessor environment and reduces the number of I/O operations by the number of...... in the optimal OhOf(psortN + K/PB) parallel I/O complexity, where K is the size of the output reported in the process and psortN is the parallel I/O complexity of sorting N elements using P processors....
Deshmane, Anagha; Gulani, Vikas; Griswold, Mark A; Seiberlich, Nicole
2012-07-01
Parallel imaging is a robust method for accelerating the acquisition of magnetic resonance imaging (MRI) data, and has made possible many new applications of MR imaging. Parallel imaging works by acquiring a reduced amount of k-space data with an array of receiver coils. These undersampled data can be acquired more quickly, but the undersampling leads to aliased images. One of several parallel imaging algorithms can then be used to reconstruct artifact-free images from either the aliased images (SENSE-type reconstruction) or from the undersampled data (GRAPPA-type reconstruction). The advantages of parallel imaging in a clinical setting include faster image acquisition, which can be used, for instance, to shorten breath-hold times resulting in fewer motion-corrupted examinations. In this article the basic concepts behind parallel imaging are introduced. The relationship between undersampling and aliasing is discussed and two commonly used parallel imaging methods, SENSE and GRAPPA, are explained in detail. Examples of artifacts arising from parallel imaging are shown and ways to detect and mitigate these artifacts are described. Finally, several current applications of parallel imaging are presented and recent advancements and promising research in parallel imaging are briefly reviewed. Copyright © 2012 Wiley Periodicals, Inc.
Parallel Algorithms and Patterns
Energy Technology Data Exchange (ETDEWEB)
Robey, Robert W. [Los Alamos National Lab. (LANL), Los Alamos, NM (United States)
2016-06-16
This is a powerpoint presentation on parallel algorithms and patterns. A parallel algorithm is a well-defined, step-by-step computational procedure that emphasizes concurrency to solve a problem. Examples of problems include: Sorting, searching, optimization, matrix operations. A parallel pattern is a computational step in a sequence of independent, potentially concurrent operations that occurs in diverse scenarios with some frequency. Examples are: Reductions, prefix scans, ghost cell updates. We only touch on parallel patterns in this presentation. It really deserves its own detailed discussion which Gabe Rockefeller would like to develop.
Application Portable Parallel Library
Cole, Gary L.; Blech, Richard A.; Quealy, Angela; Townsend, Scott
1995-01-01
Application Portable Parallel Library (APPL) computer program is subroutine-based message-passing software library intended to provide consistent interface to variety of multiprocessor computers on market today. Minimizes effort needed to move application program from one computer to another. User develops application program once and then easily moves application program from parallel computer on which created to another parallel computer. ("Parallel computer" also include heterogeneous collection of networked computers). Written in C language with one FORTRAN 77 subroutine for UNIX-based computers and callable from application programs written in C language or FORTRAN 77.
Moody, Daniela; Wohlberg, Brendt
2018-01-02
An approach for land cover classification, seasonal and yearly change detection and monitoring, and identification of changes in man-made features may use a clustering of sparse approximations (CoSA) on sparse representations in learned dictionaries. The learned dictionaries may be derived using efficient convolutional sparse coding to build multispectral or hyperspectral, multiresolution dictionaries that are adapted to regional satellite image data. Sparse image representations of images over the learned dictionaries may be used to perform unsupervised k-means clustering into land cover categories. The clustering process behaves as a classifier in detecting real variability. This approach may combine spectral and spatial textural characteristics to detect geologic, vegetative, hydrologic, and man-made features, as well as changes in these features over time.
Sparse Source EEG Imaging with the Variational Garrote
DEFF Research Database (Denmark)
Hansen, Sofie Therese; Stahlhut, Carsten; Hansen, Lars Kai
2013-01-01
EEG imaging, the estimation of the cortical source distribution from scalp electrode measurements, poses an extremely ill-posed inverse problem. Recent work by Delorme et al. (2012) supports the hypothesis that distributed source solutions are sparse. We show that direct search for sparse solutions...
Support agnostic Bayesian matching pursuit for block sparse signals
Masood, Mudassir; Al-Naffouri, Tareq Y.
2013-01-01
priori knowledge of block partition and utilizes a greedy approach and order-recursive updates of its metrics to find the most dominant sparse supports to determine the approximate minimum mean square error (MMSE) estimate of the block-sparse signal
Local posterior concentration rate for multilevel sparse sequences
Belitser, E.N.; Nurushev, N.
2017-01-01
We consider empirical Bayesian inference in the many normal means model in the situation when the high-dimensional mean vector is multilevel sparse, that is,most of the entries of the parameter vector are some fixed values. For instance, the traditional sparse signal is a particular case (with one
Joint Group Sparse PCA for Compressed Hyperspectral Imaging.
Khan, Zohaib; Shafait, Faisal; Mian, Ajmal
2015-12-01
A sparse principal component analysis (PCA) seeks a sparse linear combination of input features (variables), so that the derived features still explain most of the variations in the data. A group sparse PCA introduces structural constraints on the features in seeking such a linear combination. Collectively, the derived principal components may still require measuring all the input features. We present a joint group sparse PCA (JGSPCA) algorithm, which forces the basic coefficients corresponding to a group of features to be jointly sparse. Joint sparsity ensures that the complete basis involves only a sparse set of input features, whereas the group sparsity ensures that the structural integrity of the features is maximally preserved. We evaluate the JGSPCA algorithm on the problems of compressed hyperspectral imaging and face recognition. Compressed sensing results show that the proposed method consistently outperforms sparse PCA and group sparse PCA in reconstructing the hyperspectral scenes of natural and man-made objects. The efficacy of the proposed compressed sensing method is further demonstrated in band selection for face recognition.
Confidence of model based shape reconstruction from sparse data
DEFF Research Database (Denmark)
Baka, N.; de Bruijne, Marleen; Reiber, J. H. C.
2010-01-01
Statistical shape models (SSM) are commonly applied for plausible interpolation of missing data in medical imaging. However, when fitting a shape model to sparse information, many solutions may fit the available data. In this paper we derive a constrained SSM to fit noisy sparse input landmarks...
Comparison of Methods for Sparse Representation of Musical Signals
DEFF Research Database (Denmark)
Endelt, Line Ørtoft; la Cour-Harbo, Anders
2005-01-01
by a number of sparseness measures and results are shown on the ℓ1 norm of the coefficients, using a dictionary containing a Dirac basis, a Discrete Cosine Transform, and a Wavelet Packet. Evaluated only on the sparseness Matching Pursuit is the best method, and it is also relatively fast....
Robust Face Recognition Via Gabor Feature and Sparse Representation
Directory of Open Access Journals (Sweden)
Hao Yu-Juan
2016-01-01
Full Text Available Sparse representation based on compressed sensing theory has been widely used in the field of face recognition, and has achieved good recognition results. but the face feature extraction based on sparse representation is too simple, and the sparse coefficient is not sparse. In this paper, we improve the classification algorithm based on the fusion of sparse representation and Gabor feature, and then improved algorithm for Gabor feature which overcomes the problem of large dimension of the vector dimension, reduces the computation and storage cost, and enhances the robustness of the algorithm to the changes of the environment.The classification efficiency of sparse representation is determined by the collaborative representation,we simplify the sparse constraint based on L1 norm to the least square constraint, which makes the sparse coefficients both positive and reduce the complexity of the algorithm. Experimental results show that the proposed method is robust to illumination, facial expression and pose variations of face recognition, and the recognition rate of the algorithm is improved.
Highly-Accelerated Real-Time Cardiac Cine MRI Using k-t SPARSE-SENSE
Feng, Li; Srichai, Monvadi B.; Lim, Ruth P.; Harrison, Alexis; King, Wilson; Adluru, Ganesh; Dibella, Edward VR.; Sodickson, Daniel K.; Otazo, Ricardo; Kim, Daniel
2012-01-01
For patients with impaired breath-hold capacity and/or arrhythmias, real-time cine MRI may be more clinically useful than breath-hold cine MRI. However, commercially available real-time cine MRI methods using parallel imaging typically yield relatively poor spatio-temporal resolution due to their low image acquisition speed. We sought to achieve relatively high spatial resolution (~2.5mm × 2.5mm) and temporal resolution (~40ms), to produce high-quality real-time cine MR images that could be applied clinically for wall motion assessment and measurement of left ventricular (LV) function. In this work, we present an 8-fold accelerated real-time cardiac cine MRI pulse sequence using a combination of compressed sensing and parallel imaging (k-t SPARSE-SENSE). Compared with reference, breath-hold cine MRI, our 8-fold accelerated real-time cine MRI produced significantly worse qualitative grades (1–5 scale), but its image quality and temporal fidelity scores were above 3.0 (adequate) and artifacts and noise scores were below 3.0 (moderate), suggesting that acceptable diagnostic image quality can be achieved. Additionally, both 8-fold accelerated real-time cine and breath-hold cine MRI yielded comparable LV function measurements, with coefficient of variation cine MRI with k-t SPARSE-SENSE is a promising modality for rapid imaging of myocardial function. PMID:22887290
Highly accelerated real-time cardiac cine MRI using k-t SPARSE-SENSE.
Feng, Li; Srichai, Monvadi B; Lim, Ruth P; Harrison, Alexis; King, Wilson; Adluru, Ganesh; Dibella, Edward V R; Sodickson, Daniel K; Otazo, Ricardo; Kim, Daniel
2013-07-01
For patients with impaired breath-hold capacity and/or arrhythmias, real-time cine MRI may be more clinically useful than breath-hold cine MRI. However, commercially available real-time cine MRI methods using parallel imaging typically yield relatively poor spatio-temporal resolution due to their low image acquisition speed. We sought to achieve relatively high spatial resolution (∼2.5 × 2.5 mm(2)) and temporal resolution (∼40 ms), to produce high-quality real-time cine MR images that could be applied clinically for wall motion assessment and measurement of left ventricular function. In this work, we present an eightfold accelerated real-time cardiac cine MRI pulse sequence using a combination of compressed sensing and parallel imaging (k-t SPARSE-SENSE). Compared with reference, breath-hold cine MRI, our eightfold accelerated real-time cine MRI produced significantly worse qualitative grades (1-5 scale), but its image quality and temporal fidelity scores were above 3.0 (adequate) and artifacts and noise scores were below 3.0 (moderate), suggesting that acceptable diagnostic image quality can be achieved. Additionally, both eightfold accelerated real-time cine and breath-hold cine MRI yielded comparable left ventricular function measurements, with coefficient of variation cine MRI with k-t SPARSE-SENSE is a promising modality for rapid imaging of myocardial function. Copyright © 2012 Wiley Periodicals, Inc.
Directory of Open Access Journals (Sweden)
Sapan eAgarwal
2016-01-01
Full Text Available The exponential increase in data over the last decade presents a significant challenge to analytics efforts that seek to process and interpret such data for various applications. Neural-inspired computing approaches are being developed in order to leverage the computational advantages of the analog, low-power data processing observed in biological systems. Analog resistive memory crossbars can perform a parallel read or a vector-matrix multiplication as well as a parallel write or a rank-1 update with high computational efficiency. For an NxN crossbar, these two kernels are at a minimum O(N more energy efficient than a digital memory-based architecture. If the read operation is noise limited, the energy to read a column can be independent of the crossbar size (O(1. These two kernels form the basis of many neuromorphic algorithms such as image, text, and speech recognition. For instance, these kernels can be applied to a neural sparse coding algorithm to give an O(N reduction in energy for the entire algorithm. Sparse coding is a rich problem with a host of applications including computer vision, object tracking, and more generally unsupervised learning.
Sparse Frequency Waveform Design for Radar-Embedded Communication
Directory of Open Access Journals (Sweden)
Chaoyun Mai
2016-01-01
Full Text Available According to the Tag application with function of covert communication, a method for sparse frequency waveform design based on radar-embedded communication is proposed. Firstly, sparse frequency waveforms are designed based on power spectral density fitting and quasi-Newton method. Secondly, the eigenvalue decomposition of the sparse frequency waveform sequence is used to get the dominant space. Finally the communication waveforms are designed through the projection of orthogonal pseudorandom vectors in the vertical subspace. Compared with the linear frequency modulation waveform, the sparse frequency waveform can further improve the bandwidth occupation of communication signals, thus achieving higher communication rate. A certain correlation exists between the reciprocally orthogonal communication signals samples and the sparse frequency waveform, which guarantees the low SER (signal error rate and LPI (low probability of intercept. The simulation results verify the effectiveness of this method.
Parallel discrete event simulation
Overeinder, B.J.; Hertzberger, L.O.; Sloot, P.M.A.; Withagen, W.J.
1991-01-01
In simulating applications for execution on specific computing systems, the simulation performance figures must be known in a short period of time. One basic approach to the problem of reducing the required simulation time is the exploitation of parallelism. However, in parallelizing the simulation
Parallel reservoir simulator computations
International Nuclear Information System (INIS)
Hemanth-Kumar, K.; Young, L.C.
1995-01-01
The adaptation of a reservoir simulator for parallel computations is described. The simulator was originally designed for vector processors. It performs approximately 99% of its calculations in vector/parallel mode and relative to scalar calculations it achieves speedups of 65 and 81 for black oil and EOS simulations, respectively on the CRAY C-90
Relaxations to Sparse Optimization Problems and Applications
Skau, Erik West
Parsimony is a fundamental property that is applied to many characteristics in a variety of fields. Of particular interest are optimization problems that apply rank, dimensionality, or support in a parsimonious manner. In this thesis we study some optimization problems and their relaxations, and focus on properties and qualities of the solutions of these problems. The Gramian tensor decomposition problem attempts to decompose a symmetric tensor as a sum of rank one tensors.We approach the Gramian tensor decomposition problem with a relaxation to a semidefinite program. We study conditions which ensure that the solution of the relaxed semidefinite problem gives the minimal Gramian rank decomposition. Sparse representations with learned dictionaries are one of the leading image modeling techniques for image restoration. When learning these dictionaries from a set of training images, the sparsity parameter of the dictionary learning algorithm strongly influences the content of the dictionary atoms.We describe geometrically the content of trained dictionaries and how it changes with the sparsity parameter.We use statistical analysis to characterize how the different content is used in sparse representations. Finally, a method to control the structure of the dictionaries is demonstrated, allowing us to learn a dictionary which can later be tailored for specific applications. Variations of dictionary learning can be broadly applied to a variety of applications.We explore a pansharpening problem with a triple factorization variant of coupled dictionary learning. Another application of dictionary learning is computer vision. Computer vision relies heavily on object detection, which we explore with a hierarchical convolutional dictionary learning model. Data fusion of disparate modalities is a growing topic of interest.We do a case study to demonstrate the benefit of using social media data with satellite imagery to estimate hazard extents. In this case study analysis we
Energy Technology Data Exchange (ETDEWEB)
1991-10-23
An account of the Caltech Concurrent Computation Program (C{sup 3}P), a five year project that focused on answering the question: Can parallel computers be used to do large-scale scientific computations '' As the title indicates, the question is answered in the affirmative, by implementing numerous scientific applications on real parallel computers and doing computations that produced new scientific results. In the process of doing so, C{sup 3}P helped design and build several new computers, designed and implemented basic system software, developed algorithms for frequently used mathematical computations on massively parallel machines, devised performance models and measured the performance of many computers, and created a high performance computing facility based exclusively on parallel computers. While the initial focus of C{sup 3}P was the hypercube architecture developed by C. Seitz, many of the methods developed and lessons learned have been applied successfully on other massively parallel architectures.
Massively parallel mathematical sieves
Energy Technology Data Exchange (ETDEWEB)
Montry, G.R.
1989-01-01
The Sieve of Eratosthenes is a well-known algorithm for finding all prime numbers in a given subset of integers. A parallel version of the Sieve is described that produces computational speedups over 800 on a hypercube with 1,024 processing elements for problems of fixed size. Computational speedups as high as 980 are achieved when the problem size per processor is fixed. The method of parallelization generalizes to other sieves and will be efficient on any ensemble architecture. We investigate two highly parallel sieves using scattered decomposition and compare their performance on a hypercube multiprocessor. A comparison of different parallelization techniques for the sieve illustrates the trade-offs necessary in the design and implementation of massively parallel algorithms for large ensemble computers.
Sparse alignment for robust tensor learning.
Lai, Zhihui; Wong, Wai Keung; Xu, Yong; Zhao, Cairong; Sun, Mingming
2014-10-01
Multilinear/tensor extensions of manifold learning based algorithms have been widely used in computer vision and pattern recognition. This paper first provides a systematic analysis of the multilinear extensions for the most popular methods by using alignment techniques, thereby obtaining a general tensor alignment framework. From this framework, it is easy to show that the manifold learning based tensor learning methods are intrinsically different from the alignment techniques. Based on the alignment framework, a robust tensor learning method called sparse tensor alignment (STA) is then proposed for unsupervised tensor feature extraction. Different from the existing tensor learning methods, L1- and L2-norms are introduced to enhance the robustness in the alignment step of the STA. The advantage of the proposed technique is that the difficulty in selecting the size of the local neighborhood can be avoided in the manifold learning based tensor feature extraction algorithms. Although STA is an unsupervised learning method, the sparsity encodes the discriminative information in the alignment step and provides the robustness of STA. Extensive experiments on the well-known image databases as well as action and hand gesture databases by encoding object images as tensors demonstrate that the proposed STA algorithm gives the most competitive performance when compared with the tensor-based unsupervised learning methods.
Regression analysis of sparse asynchronous longitudinal data.
Cao, Hongyuan; Zeng, Donglin; Fine, Jason P
2015-09-01
We consider estimation of regression models for sparse asynchronous longitudinal observations, where time-dependent responses and covariates are observed intermittently within subjects. Unlike with synchronous data, where the response and covariates are observed at the same time point, with asynchronous data, the observation times are mismatched. Simple kernel-weighted estimating equations are proposed for generalized linear models with either time invariant or time-dependent coefficients under smoothness assumptions for the covariate processes which are similar to those for synchronous data. For models with either time invariant or time-dependent coefficients, the estimators are consistent and asymptotically normal but converge at slower rates than those achieved with synchronous data. Simulation studies evidence that the methods perform well with realistic sample sizes and may be superior to a naive application of methods for synchronous data based on an ad hoc last value carried forward approach. The practical utility of the methods is illustrated on data from a study on human immunodeficiency virus.
Duplex scanning using sparse data sequences
DEFF Research Database (Denmark)
Møllenbach, S. K.; Jensen, Jørgen Arendt
2008-01-01
reconstruction of the missing samples possible. The periodic pattern has the length T = M + A samples, where M are for B-mode and A for velocity estimation. The missing samples can now be reconstructed using a filter bank. One filter bank reconstructs one missing sample, so the number of filter banks corresponds...... to M. The number of sub filters in every filter bank is the same as A. Every sub filter contains fractional delay (FD) filter and an interpolation function. Many different sequences can be selected to adapt the B-mode frame rate needed. The drawback of the method is that the maximum velocity detectable......, the fprf and the resolution are 15 MHz, 3.5 kHz, and 12 bit sample (8 kHz and 16 bit for the Carotid artery). The resulting data contains 8000 RF lines with 128 samples at a depth of 45 mm for the vein and 50 mm for Aorta. Sparse sequences are constructed from the full data sequences to have both...
Transformer fault diagnosis using continuous sparse autoencoder.
Wang, Lukun; Zhao, Xiaoying; Pei, Jiangnan; Tang, Gongyou
2016-01-01
This paper proposes a novel continuous sparse autoencoder (CSAE) which can be used in unsupervised feature learning. The CSAE adds Gaussian stochastic unit into activation function to extract features of nonlinear data. In this paper, CSAE is applied to solve the problem of transformer fault recognition. Firstly, based on dissolved gas analysis method, IEC three ratios are calculated by the concentrations of dissolved gases. Then IEC three ratios data is normalized to reduce data singularity and improve training speed. Secondly, deep belief network is established by two layers of CSAE and one layer of back propagation (BP) network. Thirdly, CSAE is adopted to unsupervised training and getting features. Then BP network is used for supervised training and getting transformer fault. Finally, the experimental data from IEC TC 10 dataset aims to illustrate the effectiveness of the presented approach. Comparative experiments clearly show that CSAE can extract features from the original data, and achieve a superior correct differentiation rate on transformer fault diagnosis.
Joint Sparse Recovery With Semisupervised MUSIC
Wen, Zaidao; Hou, Biao; Jiao, Licheng
2017-05-01
Discrete multiple signal classification (MUSIC) with its low computational cost and mild condition requirement becomes a significant noniterative algorithm for joint sparse recovery (JSR). However, it fails in rank defective problem caused by coherent or limited amount of multiple measurement vectors (MMVs). In this letter, we provide a novel sight to address this problem by interpreting JSR as a binary classification problem with respect to atoms. Meanwhile, MUSIC essentially constructs a supervised classifier based on the labeled MMVs so that its performance will heavily depend on the quality and quantity of these training samples. From this viewpoint, we develop a semisupervised MUSIC (SS-MUSIC) in the spirit of machine learning, which declares that the insufficient supervised information in the training samples can be compensated from those unlabeled atoms. Instead of constructing a classifier in a fully supervised manner, we iteratively refine a semisupervised classifier by exploiting the labeled MMVs and some reliable unlabeled atoms simultaneously. Through this way, the required conditions and iterations can be greatly relaxed and reduced. Numerical experimental results demonstrate that SS-MUSIC can achieve much better recovery performances than other MUSIC extended algorithms as well as some typical greedy algorithms for JSR in terms of iterations and recovery probability.
SLAP, Large Sparse Linear System Solution Package
International Nuclear Information System (INIS)
Greenbaum, A.
1987-01-01
1 - Description of program or function: SLAP is a set of routines for solving large sparse systems of linear equations. One need not store the entire matrix - only the nonzero elements and their row and column numbers. Any nonzero structure is acceptable, so the linear system solver need not be modified when the structure of the matrix changes. Auxiliary storage space is acquired and released within the routines themselves by use of the LRLTRAN POINTER statement. 2 - Method of solution: SLAP contains one direct solver, a band matrix factorization and solution routine, BAND, and several interactive solvers. The iterative routines are as follows: JACOBI, Jacobi iteration; GS, Gauss-Seidel Iteration; ILUIR, incomplete LU decomposition with iterative refinement; DSCG and ICCG, diagonal scaling and incomplete Cholesky decomposition with conjugate gradient iteration (for symmetric positive definite matrices only); DSCGN and ILUGGN, diagonal scaling and incomplete LU decomposition with conjugate gradient interaction on the normal equations; DSBCG and ILUBCG, diagonal scaling and incomplete LU decomposition with bi-conjugate gradient iteration; and DSOMN and ILUOMN, diagonal scaling and incomplete LU decomposition with ORTHOMIN iteration
A Hybrid FPGA/Coarse Parallel Processing Architecture for Multi-modal Visual Feature Descriptors
DEFF Research Database (Denmark)
Jensen, Lars Baunegaard With; Kjær-Nielsen, Anders; Alonso, Javier Díaz
2008-01-01
This paper describes the hybrid architecture developed for speeding up the processing of so-called multi-modal visual primitives which are sparse image descriptors extracted along contours. In the system, the first stages of visual processing are implemented on FPGAs due to their highly parallel...
An M-step preconditioned conjugate gradient method for parallel computation
Adams, L.
1983-01-01
This paper describes a preconditioned conjugate gradient method that can be effectively implemented on both vector machines and parallel arrays to solve sparse symmetric and positive definite systems of linear equations. The implementation on the CYBER 203/205 and on the Finite Element Machine is discussed and results obtained using the method on these machines are given.
Object tracking by occlusion detection via structured sparse learning
Zhang, Tianzhu
2013-06-01
Sparse representation based methods have recently drawn much attention in visual tracking due to good performance against illumination variation and occlusion. They assume the errors caused by image variations can be modeled as pixel-wise sparse. However, in many practical scenarios these errors are not truly pixel-wise sparse but rather sparsely distributed in a structured way. In fact, pixels in error constitute contiguous regions within the object\\'s track. This is the case when significant occlusion occurs. To accommodate for non-sparse occlusion in a given frame, we assume that occlusion detected in previous frames can be propagated to the current one. This propagated information determines which pixels will contribute to the sparse representation of the current track. In other words, pixels that were detected as part of an occlusion in the previous frame will be removed from the target representation process. As such, this paper proposes a novel tracking algorithm that models and detects occlusion through structured sparse learning. We test our tracker on challenging benchmark sequences, such as sports videos, which involve heavy occlusion, drastic illumination changes, and large pose variations. Experimental results show that our tracker consistently outperforms the state-of-the-art. © 2013 IEEE.
Manifold regularization for sparse unmixing of hyperspectral images.
Liu, Junmin; Zhang, Chunxia; Zhang, Jiangshe; Li, Huirong; Gao, Yuelin
2016-01-01
Recently, sparse unmixing has been successfully applied to spectral mixture analysis of remotely sensed hyperspectral images. Based on the assumption that the observed image signatures can be expressed in the form of linear combinations of a number of pure spectral signatures known in advance, unmixing of each mixed pixel in the scene is to find an optimal subset of signatures in a very large spectral library, which is cast into the framework of sparse regression. However, traditional sparse regression models, such as collaborative sparse regression , ignore the intrinsic geometric structure in the hyperspectral data. In this paper, we propose a novel model, called manifold regularized collaborative sparse regression , by introducing a manifold regularization to the collaborative sparse regression model. The manifold regularization utilizes a graph Laplacian to incorporate the locally geometrical structure of the hyperspectral data. An algorithm based on alternating direction method of multipliers has been developed for the manifold regularized collaborative sparse regression model. Experimental results on both the simulated and real hyperspectral data sets have demonstrated the effectiveness of our proposed model.
A comprehensive study of sparse codes on abnormality detection
DEFF Research Database (Denmark)
Ren, Huamin; Pan, Hong; Olsen, Søren Ingvor
2017-01-01
Sparse representation has been applied successfully in abnor-mal event detection, in which the baseline is to learn a dic-tionary accompanied by sparse codes. While much empha-sis is put on discriminative dictionary construction, there areno comparative studies of sparse codes regarding abnormal-ity...... detection. We comprehensively study two types of sparsecodes solutions - greedy algorithms and convex L1-norm so-lutions - and their impact on abnormality detection perfor-mance. We also propose our framework of combining sparsecodes with different detection methods. Our comparative ex-periments are carried...
Electromagnetic Formation Flight (EMFF) for Sparse Aperture Arrays
Kwon, Daniel W.; Miller, David W.; Sedwick, Raymond J.
2004-01-01
Traditional methods of actuating spacecraft in sparse aperture arrays use propellant as a reaction mass. For formation flying systems, propellant becomes a critical consumable which can be quickly exhausted while maintaining relative orientation. Additional problems posed by propellant include optical contamination, plume impingement, thermal emission, and vibration excitation. For these missions where control of relative degrees of freedom is important, we consider using a system of electromagnets, in concert with reaction wheels, to replace the consumables. Electromagnetic Formation Flight sparse apertures, powered by solar energy, are designed differently from traditional propulsion systems, which are based on V. This paper investigates the design of sparse apertures both inside and outside the Earth's gravity field.
Sparse Principal Component Analysis in Medical Shape Modeling
DEFF Research Database (Denmark)
Sjöstrand, Karl; Stegmann, Mikkel Bille; Larsen, Rasmus
2006-01-01
Principal component analysis (PCA) is a widely used tool in medical image analysis for data reduction, model building, and data understanding and exploration. While PCA is a holistic approach where each new variable is a linear combination of all original variables, sparse PCA (SPCA) aims...... analysis in medicine. Results for three different data sets are given in relation to standard PCA and sparse PCA by simple thresholding of sufficiently small loadings. Focus is on a recent algorithm for computing sparse principal components, but a review of other approaches is supplied as well. The SPCA...
Algorithms for parallel computers
International Nuclear Information System (INIS)
Churchhouse, R.F.
1985-01-01
Until relatively recently almost all the algorithms for use on computers had been designed on the (usually unstated) assumption that they were to be run on single processor, serial machines. With the introduction of vector processors, array processors and interconnected systems of mainframes, minis and micros, however, various forms of parallelism have become available. The advantage of parallelism is that it offers increased overall processing speed but it also raises some fundamental questions, including: (i) which, if any, of the existing 'serial' algorithms can be adapted for use in the parallel mode. (ii) How close to optimal can such adapted algorithms be and, where relevant, what are the convergence criteria. (iii) How can we design new algorithms specifically for parallel systems. (iv) For multi-processor systems how can we handle the software aspects of the interprocessor communications. Aspects of these questions illustrated by examples are considered in these lectures. (orig.)
Parallelism and array processing
International Nuclear Information System (INIS)
Zacharov, V.
1983-01-01
Modern computing, as well as the historical development of computing, has been dominated by sequential monoprocessing. Yet there is the alternative of parallelism, where several processes may be in concurrent execution. This alternative is discussed in a series of lectures, in which the main developments involving parallelism are considered, both from the standpoint of computing systems and that of applications that can exploit such systems. The lectures seek to discuss parallelism in a historical context, and to identify all the main aspects of concurrency in computation right up to the present time. Included will be consideration of the important question as to what use parallelism might be in the field of data processing. (orig.)
Parallel magnetic resonance imaging
International Nuclear Information System (INIS)
Larkman, David J; Nunes, Rita G
2007-01-01
Parallel imaging has been the single biggest innovation in magnetic resonance imaging in the last decade. The use of multiple receiver coils to augment the time consuming Fourier encoding has reduced acquisition times significantly. This increase in speed comes at a time when other approaches to acquisition time reduction were reaching engineering and human limits. A brief summary of spatial encoding in MRI is followed by an introduction to the problem parallel imaging is designed to solve. There are a large number of parallel reconstruction algorithms; this article reviews a cross-section, SENSE, SMASH, g-SMASH and GRAPPA, selected to demonstrate the different approaches. Theoretical (the g-factor) and practical (coil design) limits to acquisition speed are reviewed. The practical implementation of parallel imaging is also discussed, in particular coil calibration. How to recognize potential failure modes and their associated artefacts are shown. Well-established applications including angiography, cardiac imaging and applications using echo planar imaging are reviewed and we discuss what makes a good application for parallel imaging. Finally, active research areas where parallel imaging is being used to improve data quality by repairing artefacted images are also reviewed. (invited topical review)
High Order Tensor Formulation for Convolutional Sparse Coding
Bibi, Adel Aamer; Ghanem, Bernard
2017-01-01
Convolutional sparse coding (CSC) has gained attention for its successful role as a reconstruction and a classification tool in the computer vision and machine learning community. Current CSC methods can only reconstruct singlefeature 2D images
Preconditioned Inexact Newton for Nonlinear Sparse Electromagnetic Imaging
Desmal, Abdulla; Bagci, Hakan
2014-01-01
with smoothness promoting optimization/regularization schemes. However, this type of regularization schemes are known to perform poorly when applied in imagining domains with sparse content or sharp variations. In this work, an inexact Newton algorithm
Multiple instance learning tracking method with local sparse representation
Xie, Chengjun; Tan, Jieqing; Chen, Peng; Zhang, Jie; Helg, Lei
2013-01-01
as training data for the MIL framework. First, local image patches of a target object are represented as sparse codes with an overcomplete dictionary, where the adaptive representation can be helpful in overcoming partial occlusion in object tracking. Then MIL
Low-rank sparse learning for robust visual tracking
Zhang, Tianzhu; Ghanem, Bernard; Liu, Si; Ahuja, Narendra
2012-01-01
In this paper, we propose a new particle-filter based tracking algorithm that exploits the relationship between particles (candidate targets). By representing particles as sparse linear combinations of dictionary templates, this algorithm
Robust visual tracking via multi-task sparse learning
Zhang, Tianzhu; Ghanem, Bernard; Liu, Si; Ahuja, Narendra
2012-01-01
In this paper, we formulate object tracking in a particle filter framework as a multi-task sparse learning problem, which we denote as Multi-Task Tracking (MTT). Since we model particles as linear combinations of dictionary templates
Sparse Machine Learning Methods for Understanding Large Text Corpora
National Aeronautics and Space Administration — Sparse machine learning has recently emerged as powerful tool to obtain models of high-dimensional data with high degree of interpretability, at low computational...
Sparse PDF Volumes for Consistent Multi-Resolution Volume Rendering
Sicat, Ronell Barrera; Kruger, Jens; Moller, Torsten; Hadwiger, Markus
2014-01-01
This paper presents a new multi-resolution volume representation called sparse pdf volumes, which enables consistent multi-resolution volume rendering based on probability density functions (pdfs) of voxel neighborhoods. These pdfs are defined
Sparse Linear Solver for Power System Analysis Using FPGA
National Research Council Canada - National Science Library
Johnson, J. R; Nagvajara, P; Nwankpa, C
2005-01-01
.... Numerical solution to load flow equations are typically computed using Newton-Raphson iteration, and the most time consuming component of the computation is the solution of a sparse linear system...
Support agnostic Bayesian matching pursuit for block sparse signals
Masood, Mudassir
2013-05-01
A fast matching pursuit method using a Bayesian approach is introduced for block-sparse signal recovery. This method performs Bayesian estimates of block-sparse signals even when the distribution of active blocks is non-Gaussian or unknown. It is agnostic to the distribution of active blocks in the signal and utilizes a priori statistics of additive noise and the sparsity rate of the signal, which are shown to be easily estimated from data and no user intervention is required. The method requires a priori knowledge of block partition and utilizes a greedy approach and order-recursive updates of its metrics to find the most dominant sparse supports to determine the approximate minimum mean square error (MMSE) estimate of the block-sparse signal. Simulation results demonstrate the power and robustness of our proposed estimator. © 2013 IEEE.
Detection of Pitting in Gears Using a Deep Sparse Autoencoder
Directory of Open Access Journals (Sweden)
Yongzhi Qu
2017-05-01
Full Text Available In this paper; a new method for gear pitting fault detection is presented. The presented method is developed based on a deep sparse autoencoder. The method integrates dictionary learning in sparse coding into a stacked autoencoder network. Sparse coding with dictionary learning is viewed as an adaptive feature extraction method for machinery fault diagnosis. An autoencoder is an unsupervised machine learning technique. A stacked autoencoder network with multiple hidden layers is considered to be a deep learning network. The presented method uses a stacked autoencoder network to perform the dictionary learning in sparse coding and extract features from raw vibration data automatically. These features are then used to perform gear pitting fault detection. The presented method is validated with vibration data collected from gear tests with pitting faults in a gearbox test rig and compared with an existing deep learning-based approach.
Sparse logistic principal components analysis for binary data
Lee, Seokho; Huang, Jianhua Z.; Hu, Jianhua
2010-01-01
with a criterion function motivated from a penalized Bernoulli likelihood. A Majorization-Minimization algorithm is developed to efficiently solve the optimization problem. The effectiveness of the proposed sparse logistic PCA method is illustrated
Sparse reconstruction using distribution agnostic bayesian matching pursuit
Masood, Mudassir
2013-11-01
A fast matching pursuit method using a Bayesian approach is introduced for sparse signal recovery. This method performs Bayesian estimates of sparse signals even when the signal prior is non-Gaussian or unknown. It is agnostic on signal statistics and utilizes a priori statistics of additive noise and the sparsity rate of the signal, which are shown to be easily estimated from data if not available. The method utilizes a greedy approach and order-recursive updates of its metrics to find the most dominant sparse supports to determine the approximate minimum mean-square error (MMSE) estimate of the sparse signal. Simulation results demonstrate the power and robustness of our proposed estimator. © 2013 IEEE.
Occlusion detection via structured sparse learning for robust object tracking
Zhang, Tianzhu; Ghanem, Bernard; Xu, Changsheng; Ahuja, Narendra
2014-01-01
occlusion through structured sparse learning. We test our tracker on challenging benchmark sequences, such as sports videos, which involve heavy occlusion, drastic illumination changes, and large pose variations. Extensive experimental results show that our
Object tracking by occlusion detection via structured sparse learning
Zhang, Tianzhu; Ghanem, Bernard; Xu, Changsheng; Ahuja, Narendra
2013-01-01
occlusion through structured sparse learning. We test our tracker on challenging benchmark sequences, such as sports videos, which involve heavy occlusion, drastic illumination changes, and large pose variations. Experimental results show that our tracker
Sparse Vector Distributions and Recovery from Compressed Sensing
DEFF Research Database (Denmark)
Sturm, Bob L.
It is well known that the performance of sparse vector recovery algorithms from compressive measurements can depend on the distribution underlying the non-zero elements of a sparse vector. However, the extent of these effects has yet to be explored, and formally presented. In this paper, I...... empirically investigate this dependence for seven distributions and fifteen recovery algorithms. The two morals of this work are: 1) any judgement of the recovery performance of one algorithm over that of another must be prefaced by the conditions for which this is observed to be true, including sparse vector...... distributions, and the criterion for exact recovery; and 2) a recovery algorithm must be selected carefully based on what distribution one expects to underlie the sensed sparse signal....
Sparse encoding of automatic visual association in hippocampal networks
DEFF Research Database (Denmark)
Hulme, Oliver J; Skov, Martin; Chadwick, Martin J
2014-01-01
Intelligent action entails exploiting predictions about associations between elements of ones environment. The hippocampus and mediotemporal cortex are endowed with the network topology, physiology, and neurochemistry to automatically and sparsely code sensori-cognitive associations that can...
Efficient collaborative sparse channel estimation in massive MIMO
Masood, Mudassir; Afify, Laila H.; Al-Naffouri, Tareq Y.
2015-01-01
We propose a method for estimation of sparse frequency selective channels within MIMO-OFDM systems. These channels are independently sparse and share a common support. The method estimates the impulse response for each channel observed by the antennas at the receiver. Estimation is performed in a coordinated manner by sharing minimal information among neighboring antennas to achieve results better than many contemporary methods. Simulations demonstrate the superior performance of the proposed method.
Fast convolutional sparse coding using matrix inversion lemma
Czech Academy of Sciences Publication Activity Database
Šorel, Michal; Šroubek, Filip
2016-01-01
Roč. 55, č. 1 (2016), s. 44-51 ISSN 1051-2004 R&D Projects: GA ČR GA13-29225S Institutional support: RVO:67985556 Keywords : Convolutional sparse coding * Feature learning * Deconvolution networks * Shift-invariant sparse coding Subject RIV: JD - Computer Applications, Robotics Impact factor: 2.337, year: 2016 http://library.utia.cas.cz/separaty/2016/ZOI/sorel-0459332.pdf
Discussion of CoSA: Clustering of Sparse Approximations
Energy Technology Data Exchange (ETDEWEB)
Armstrong, Derek Elswick [Los Alamos National Lab. (LANL), Los Alamos, NM (United States)
2017-03-07
The purpose of this talk is to discuss the possible applications of CoSA (Clustering of Sparse Approximations) to the exploitation of HSI (HyperSpectral Imagery) data. CoSA is presented by Moody et al. in the Journal of Applied Remote Sensing (“Land cover classification in multispectral imagery using clustering of sparse approximations over learned feature dictionaries”, Vol. 8, 2014) and is based on machine learning techniques.
Efficient collaborative sparse channel estimation in massive MIMO
Masood, Mudassir
2015-08-12
We propose a method for estimation of sparse frequency selective channels within MIMO-OFDM systems. These channels are independently sparse and share a common support. The method estimates the impulse response for each channel observed by the antennas at the receiver. Estimation is performed in a coordinated manner by sharing minimal information among neighboring antennas to achieve results better than many contemporary methods. Simulations demonstrate the superior performance of the proposed method.
A flexible framework for sparse simultaneous component based data integration
Directory of Open Access Journals (Sweden)
Van Deun Katrijn
2011-11-01
Full Text Available Abstract 1 Background High throughput data are complex and methods that reveal structure underlying the data are most useful. Principal component analysis, frequently implemented as a singular value decomposition, is a popular technique in this respect. Nowadays often the challenge is to reveal structure in several sources of information (e.g., transcriptomics, proteomics that are available for the same biological entities under study. Simultaneous component methods are most promising in this respect. However, the interpretation of the principal and simultaneous components is often daunting because contributions of each of the biomolecules (transcripts, proteins have to be taken into account. 2 Results We propose a sparse simultaneous component method that makes many of the parameters redundant by shrinking them to zero. It includes principal component analysis, sparse principal component analysis, and ordinary simultaneous component analysis as special cases. Several penalties can be tuned that account in different ways for the block structure present in the integrated data. This yields known sparse approaches as the lasso, the ridge penalty, the elastic net, the group lasso, sparse group lasso, and elitist lasso. In addition, the algorithmic results can be easily transposed to the context of regression. Metabolomics data obtained with two measurement platforms for the same set of Escherichia coli samples are used to illustrate the proposed methodology and the properties of different penalties with respect to sparseness across and within data blocks. 3 Conclusion Sparse simultaneous component analysis is a useful method for data integration: First, simultaneous analyses of multiple blocks offer advantages over sequential and separate analyses and second, interpretation of the results is highly facilitated by their sparseness. The approach offered is flexible and allows to take the block structure in different ways into account. As such
A flexible framework for sparse simultaneous component based data integration.
Van Deun, Katrijn; Wilderjans, Tom F; van den Berg, Robert A; Antoniadis, Anestis; Van Mechelen, Iven
2011-11-15
High throughput data are complex and methods that reveal structure underlying the data are most useful. Principal component analysis, frequently implemented as a singular value decomposition, is a popular technique in this respect. Nowadays often the challenge is to reveal structure in several sources of information (e.g., transcriptomics, proteomics) that are available for the same biological entities under study. Simultaneous component methods are most promising in this respect. However, the interpretation of the principal and simultaneous components is often daunting because contributions of each of the biomolecules (transcripts, proteins) have to be taken into account. We propose a sparse simultaneous component method that makes many of the parameters redundant by shrinking them to zero. It includes principal component analysis, sparse principal component analysis, and ordinary simultaneous component analysis as special cases. Several penalties can be tuned that account in different ways for the block structure present in the integrated data. This yields known sparse approaches as the lasso, the ridge penalty, the elastic net, the group lasso, sparse group lasso, and elitist lasso. In addition, the algorithmic results can be easily transposed to the context of regression. Metabolomics data obtained with two measurement platforms for the same set of Escherichia coli samples are used to illustrate the proposed methodology and the properties of different penalties with respect to sparseness across and within data blocks. Sparse simultaneous component analysis is a useful method for data integration: First, simultaneous analyses of multiple blocks offer advantages over sequential and separate analyses and second, interpretation of the results is highly facilitated by their sparseness. The approach offered is flexible and allows to take the block structure in different ways into account. As such, structures can be found that are exclusively tied to one data platform
In-Storage Embedded Accelerator for Sparse Pattern Processing
Jun, Sang-Woo; Nguyen, Huy T.; Gadepally, Vijay N.; Arvind
2016-01-01
We present a novel architecture for sparse pattern processing, using flash storage with embedded accelerators. Sparse pattern processing on large data sets is the essence of applications such as document search, natural language processing, bioinformatics, subgraph matching, machine learning, and graph processing. One slice of our prototype accelerator is capable of handling up to 1TB of data, and experiments show that it can outperform C/C++ software solutions on a 16-core system at a fracti...
Process Knowledge Discovery Using Sparse Principal Component Analysis
DEFF Research Database (Denmark)
Gao, Huihui; Gajjar, Shriram; Kulahci, Murat
2016-01-01
As the goals of ensuring process safety and energy efficiency become ever more challenging, engineers increasingly rely on data collected from such processes for informed decision making. During recent decades, extracting and interpreting valuable process information from large historical data sets...... SPCA approach that helps uncover the underlying process knowledge regarding variable relations. This approach systematically determines the optimal sparse loadings for each sparse PC while improving interpretability and minimizing information loss. The salient features of the proposed approach...
The STAPL Parallel Graph Library
Harshvardhan,; Fidel, Adam; Amato, Nancy M.; Rauchwerger, Lawrence
2013-01-01
This paper describes the stapl Parallel Graph Library, a high-level framework that abstracts the user from data-distribution and parallelism details and allows them to concentrate on parallel graph algorithm development. It includes a customizable
Occlusion detection via structured sparse learning for robust object tracking
Zhang, Tianzhu
2014-01-01
Sparse representation based methods have recently drawn much attention in visual tracking due to good performance against illumination variation and occlusion. They assume the errors caused by image variations can be modeled as pixel-wise sparse. However, in many practical scenarios, these errors are not truly pixel-wise sparse but rather sparsely distributed in a structured way. In fact, pixels in error constitute contiguous regions within the object’s track. This is the case when significant occlusion occurs. To accommodate for nonsparse occlusion in a given frame, we assume that occlusion detected in previous frames can be propagated to the current one. This propagated information determines which pixels will contribute to the sparse representation of the current track. In other words, pixels that were detected as part of an occlusion in the previous frame will be removed from the target representation process. As such, this paper proposes a novel tracking algorithm that models and detects occlusion through structured sparse learning. We test our tracker on challenging benchmark sequences, such as sports videos, which involve heavy occlusion, drastic illumination changes, and large pose variations. Extensive experimental results show that our proposed tracker consistently outperforms the state-of-the-art trackers.
Exhaustive Search for Sparse Variable Selection in Linear Regression
Igarashi, Yasuhiko; Takenaka, Hikaru; Nakanishi-Ohno, Yoshinori; Uemura, Makoto; Ikeda, Shiro; Okada, Masato
2018-04-01
We propose a K-sparse exhaustive search (ES-K) method and a K-sparse approximate exhaustive search method (AES-K) for selecting variables in linear regression. With these methods, K-sparse combinations of variables are tested exhaustively assuming that the optimal combination of explanatory variables is K-sparse. By collecting the results of exhaustively computing ES-K, various approximate methods for selecting sparse variables can be summarized as density of states. With this density of states, we can compare different methods for selecting sparse variables such as relaxation and sampling. For large problems where the combinatorial explosion of explanatory variables is crucial, the AES-K method enables density of states to be effectively reconstructed by using the replica-exchange Monte Carlo method and the multiple histogram method. Applying the ES-K and AES-K methods to type Ia supernova data, we confirmed the conventional understanding in astronomy when an appropriate K is given beforehand. However, we found the difficulty to determine K from the data. Using virtual measurement and analysis, we argue that this is caused by data shortage.
Sparse Representation Based SAR Vehicle Recognition along with Aspect Angle
Directory of Open Access Journals (Sweden)
Xiangwei Xing
2014-01-01
Full Text Available As a method of representing the test sample with few training samples from an overcomplete dictionary, sparse representation classification (SRC has attracted much attention in synthetic aperture radar (SAR automatic target recognition (ATR recently. In this paper, we develop a novel SAR vehicle recognition method based on sparse representation classification along with aspect information (SRCA, in which the correlation between the vehicle’s aspect angle and the sparse representation vector is exploited. The detailed procedure presented in this paper can be summarized as follows. Initially, the sparse representation vector of a test sample is solved by sparse representation algorithm with a principle component analysis (PCA feature-based dictionary. Then, the coefficient vector is projected onto a sparser one within a certain range of the vehicle’s aspect angle. Finally, the vehicle is classified into a certain category that minimizes the reconstruction error with the novel sparse representation vector. Extensive experiments are conducted on the moving and stationary target acquisition and recognition (MSTAR dataset and the results demonstrate that the proposed method performs robustly under the variations of depression angle and target configurations, as well as incomplete observation.
Structure-aware Local Sparse Coding for Visual Tracking
Qi, Yuankai
2018-01-24
Sparse coding has been applied to visual tracking and related vision problems with demonstrated success in recent years. Existing tracking methods based on local sparse coding sample patches from a target candidate and sparsely encode these using a dictionary consisting of patches sampled from target template images. The discriminative strength of existing methods based on local sparse coding is limited as spatial structure constraints among the template patches are not exploited. To address this problem, we propose a structure-aware local sparse coding algorithm which encodes a target candidate using templates with both global and local sparsity constraints. For robust tracking, we show local regions of a candidate region should be encoded only with the corresponding local regions of the target templates that are the most similar from the global view. Thus, a more precise and discriminative sparse representation is obtained to account for appearance changes. To alleviate the issues with tracking drifts, we design an effective template update scheme. Extensive experiments on challenging image sequences demonstrate the effectiveness of the proposed algorithm against numerous stateof- the-art methods.
Vector sparse representation of color image using quaternion matrix analysis.
Xu, Yi; Yu, Licheng; Xu, Hongteng; Zhang, Hao; Nguyen, Truong
2015-04-01
Traditional sparse image models treat color image pixel as a scalar, which represents color channels separately or concatenate color channels as a monochrome image. In this paper, we propose a vector sparse representation model for color images using quaternion matrix analysis. As a new tool for color image representation, its potential applications in several image-processing tasks are presented, including color image reconstruction, denoising, inpainting, and super-resolution. The proposed model represents the color image as a quaternion matrix, where a quaternion-based dictionary learning algorithm is presented using the K-quaternion singular value decomposition (QSVD) (generalized K-means clustering for QSVD) method. It conducts the sparse basis selection in quaternion space, which uniformly transforms the channel images to an orthogonal color space. In this new color space, it is significant that the inherent color structures can be completely preserved during vector reconstruction. Moreover, the proposed sparse model is more efficient comparing with the current sparse models for image restoration tasks due to lower redundancy between the atoms of different color channels. The experimental results demonstrate that the proposed sparse image model avoids the hue bias issue successfully and shows its potential as a general and powerful tool in color image analysis and processing domain.
Sparse Reconstruction Schemes for Nonlinear Electromagnetic Imaging
Desmal, Abdulla
2016-03-01
synthetically generated or actually measured scattered fields, show that the images recovered by these sparsity-regularized methods are sharper and more accurate than those produced by existing methods. The methods developed in this work have potential application areas ranging from oil/gas reservoir engineering to biological imaging where sparse domains naturally exist.
Parallel adaptation of a vectorised quantumchemical program system
International Nuclear Information System (INIS)
Van Corler, L.C.H.; Van Lenthe, J.H.
1987-01-01
Supercomputers, like the CRAY 1 or the Cyber 205, have had, and still have, a marked influence on Quantum Chemistry. Vectorization has led to a considerable increase in the performance of Quantum Chemistry programs. However, clockcycle times more than a factor 10 smaller than those of the present supercomputers are not to be expected. Therefore future supercomputers will have to depend on parallel structures. Recently, the first examples of such supercomputers have been installed. To be prepared for this new generation of (parallel) supercomputers one should consider the concepts one wants to use and the kind of problems one will encounter during implementation of existing vectorized programs on those parallel systems. The authors implemented four important parts of a large quantumchemical program system (ATMOL), i.e. integrals, SCF, 4-index and Direct-CI in the parallel environment at ECSEC (Rome, Italy). This system offers simulated parallellism on the host computer (IBM 4381) and real parallellism on at most 10 attached processors (FPS-164). Quantumchemical programs usually handle large amounts of data and very large, often sparse matrices. The transfer of that many data can cause problems concerning communication and overhead, in view of which shared memory and shared disks must be considered. The strategy and the tools that were used to parallellise the programs are shown. Also, some examples are presented to illustrate effectiveness and performance of the system in Rome for these type of calculations
Scalability of Parallel Scientific Applications on the Cloud
Directory of Open Access Journals (Sweden)
Satish Narayana Srirama
2011-01-01
Full Text Available Cloud computing, with its promise of virtually infinite resources, seems to suit well in solving resource greedy scientific computing problems. To study the effects of moving parallel scientific applications onto the cloud, we deployed several benchmark applications like matrix–vector operations and NAS parallel benchmarks, and DOUG (Domain decomposition On Unstructured Grids on the cloud. DOUG is an open source software package for parallel iterative solution of very large sparse systems of linear equations. The detailed analysis of DOUG on the cloud showed that parallel applications benefit a lot and scale reasonable on the cloud. We could also observe the limitations of the cloud and its comparison with cluster in terms of performance. However, for efficiently running the scientific applications on the cloud infrastructure, the applications must be reduced to frameworks that can successfully exploit the cloud resources, like the MapReduce framework. Several iterative and embarrassingly parallel algorithms are reduced to the MapReduce model and their performance is measured and analyzed. The analysis showed that Hadoop MapReduce has significant problems with iterative methods, while it suits well for embarrassingly parallel algorithms. Scientific computing often uses iterative methods to solve large problems. Thus, for scientific computing on the cloud, this paper raises the necessity for better frameworks or optimizations for MapReduce.
Massively parallel multicanonical simulations
Gross, Jonathan; Zierenberg, Johannes; Weigel, Martin; Janke, Wolfhard
2018-03-01
Generalized-ensemble Monte Carlo simulations such as the multicanonical method and similar techniques are among the most efficient approaches for simulations of systems undergoing discontinuous phase transitions or with rugged free-energy landscapes. As Markov chain methods, they are inherently serial computationally. It was demonstrated recently, however, that a combination of independent simulations that communicate weight updates at variable intervals allows for the efficient utilization of parallel computational resources for multicanonical simulations. Implementing this approach for the many-thread architecture provided by current generations of graphics processing units (GPUs), we show how it can be efficiently employed with of the order of 104 parallel walkers and beyond, thus constituting a versatile tool for Monte Carlo simulations in the era of massively parallel computing. We provide the fully documented source code for the approach applied to the paradigmatic example of the two-dimensional Ising model as starting point and reference for practitioners in the field.
SPINning parallel systems software
International Nuclear Information System (INIS)
Matlin, O.S.; Lusk, E.; McCune, W.
2002-01-01
We describe our experiences in using Spin to verify parts of the Multi Purpose Daemon (MPD) parallel process management system. MPD is a distributed collection of processes connected by Unix network sockets. MPD is dynamic processes and connections among them are created and destroyed as MPD is initialized, runs user processes, recovers from faults, and terminates. This dynamic nature is easily expressible in the Spin/Promela framework but poses performance and scalability challenges. We present here the results of expressing some of the parallel algorithms of MPD and executing both simulation and verification runs with Spin
Parallel programming with Python
Palach, Jan
2014-01-01
A fast, easy-to-follow and clear tutorial to help you develop Parallel computing systems using Python. Along with explaining the fundamentals, the book will also introduce you to slightly advanced concepts and will help you in implementing these techniques in the real world. If you are an experienced Python programmer and are willing to utilize the available computing resources by parallelizing applications in a simple way, then this book is for you. You are required to have a basic knowledge of Python development to get the most of this book.
Sparse modeling of spatial environmental variables associated with asthma.
Chang, Timothy S; Gangnon, Ronald E; David Page, C; Buckingham, William R; Tandias, Aman; Cowan, Kelly J; Tomasallo, Carrie D; Arndt, Brian G; Hanrahan, Lawrence P; Guilbert, Theresa W
2015-02-01
Geographically distributed environmental factors influence the burden of diseases such as asthma. Our objective was to identify sparse environmental variables associated with asthma diagnosis gathered from a large electronic health record (EHR) dataset while controlling for spatial variation. An EHR dataset from the University of Wisconsin's Family Medicine, Internal Medicine and Pediatrics Departments was obtained for 199,220 patients aged 5-50years over a three-year period. Each patient's home address was geocoded to one of 3456 geographic census block groups. Over one thousand block group variables were obtained from a commercial database. We developed a Sparse Spatial Environmental Analysis (SASEA). Using this method, the environmental variables were first dimensionally reduced with sparse principal component analysis. Logistic thin plate regression spline modeling was then used to identify block group variables associated with asthma from sparse principal components. The addresses of patients from the EHR dataset were distributed throughout the majority of Wisconsin's geography. Logistic thin plate regression spline modeling captured spatial variation of asthma. Four sparse principal components identified via model selection consisted of food at home, dog ownership, household size, and disposable income variables. In rural areas, dog ownership and renter occupied housing units from significant sparse principal components were associated with asthma. Our main contribution is the incorporation of sparsity in spatial modeling. SASEA sequentially added sparse principal components to Logistic thin plate regression spline modeling. This method allowed association of geographically distributed environmental factors with asthma using EHR and environmental datasets. SASEA can be applied to other diseases with environmental risk factors. Copyright © 2014 Elsevier Inc. All rights reserved.
International Nuclear Information System (INIS)
Yang, C L; Wei, H Y; Soleimani, M; Adler, A
2013-01-01
Electrical impedance tomography (EIT) is a fast and cost-effective technique to provide a tomographic conductivity image of a subject from boundary current–voltage data. This paper proposes a time and memory efficient method for solving a large scale 3D EIT inverse problem using a parallel conjugate gradient (CG) algorithm. The 3D EIT system with a large number of measurement data can produce a large size of Jacobian matrix; this could cause difficulties in computer storage and the inversion process. One of challenges in 3D EIT is to decrease the reconstruction time and memory usage, at the same time retaining the image quality. Firstly, a sparse matrix reduction technique is proposed using thresholding to set very small values of the Jacobian matrix to zero. By adjusting the Jacobian matrix into a sparse format, the element with zeros would be eliminated, which results in a saving of memory requirement. Secondly, a block-wise CG method for parallel reconstruction has been developed. The proposed method has been tested using simulated data as well as experimental test samples. Sparse Jacobian with a block-wise CG enables the large scale EIT problem to be solved efficiently. Image quality measures are presented to quantify the effect of sparse matrix reduction in reconstruction results. (paper)
Yang, C L; Wei, H Y; Adler, A; Soleimani, M
2013-06-01
Electrical impedance tomography (EIT) is a fast and cost-effective technique to provide a tomographic conductivity image of a subject from boundary current-voltage data. This paper proposes a time and memory efficient method for solving a large scale 3D EIT inverse problem using a parallel conjugate gradient (CG) algorithm. The 3D EIT system with a large number of measurement data can produce a large size of Jacobian matrix; this could cause difficulties in computer storage and the inversion process. One of challenges in 3D EIT is to decrease the reconstruction time and memory usage, at the same time retaining the image quality. Firstly, a sparse matrix reduction technique is proposed using thresholding to set very small values of the Jacobian matrix to zero. By adjusting the Jacobian matrix into a sparse format, the element with zeros would be eliminated, which results in a saving of memory requirement. Secondly, a block-wise CG method for parallel reconstruction has been developed. The proposed method has been tested using simulated data as well as experimental test samples. Sparse Jacobian with a block-wise CG enables the large scale EIT problem to be solved efficiently. Image quality measures are presented to quantify the effect of sparse matrix reduction in reconstruction results.
Directory of Open Access Journals (Sweden)
Bo Kong
2017-01-01
Full Text Available In an integrated radar-communication network, multiuser access techniques with minimal performance degradation and without range-Doppler ambiguities are required, especially in a dense user environment. In this paper, a multiuser access scheme with random subcarrier allocation mechanism is proposed for orthogonal frequency division multiplexing (OFDM based integrated radar-communication networks. The expression of modulation Symbol-Domain method combined with sparse representation (SR for range-Doppler estimation is introduced and a parallel reconstruction algorithm is employed. The radar target detection performance is improved with less spectrum occupation. Additionally, a Doppler frequency detector is exploited to decrease the computational complexity. Numerical simulations show that the proposed method outperforms the traditional modulation Symbol-Domain method under ideal and realistic nonideal scenarios.
Expressing Parallelism with ROOT
Energy Technology Data Exchange (ETDEWEB)
Piparo, D. [CERN; Tejedor, E. [CERN; Guiraud, E. [CERN; Ganis, G. [CERN; Mato, P. [CERN; Moneta, L. [CERN; Valls Pla, X. [CERN; Canal, P. [Fermilab
2017-11-22
The need for processing the ever-increasing amount of data generated by the LHC experiments in a more efficient way has motivated ROOT to further develop its support for parallelism. Such support is being tackled both for shared-memory and distributed-memory environments. The incarnations of the aforementioned parallelism are multi-threading, multi-processing and cluster-wide executions. In the area of multi-threading, we discuss the new implicit parallelism and related interfaces, as well as the new building blocks to safely operate with ROOT objects in a multi-threaded environment. Regarding multi-processing, we review the new MultiProc framework, comparing it with similar tools (e.g. multiprocessing module in Python). Finally, as an alternative to PROOF for cluster-wide executions, we introduce the efforts on integrating ROOT with state-of-the-art distributed data processing technologies like Spark, both in terms of programming model and runtime design (with EOS as one of the main components). For all the levels of parallelism, we discuss, based on real-life examples and measurements, how our proposals can increase the productivity of scientists.
Expressing Parallelism with ROOT
Piparo, D.; Tejedor, E.; Guiraud, E.; Ganis, G.; Mato, P.; Moneta, L.; Valls Pla, X.; Canal, P.
2017-10-01
The need for processing the ever-increasing amount of data generated by the LHC experiments in a more efficient way has motivated ROOT to further develop its support for parallelism. Such support is being tackled both for shared-memory and distributed-memory environments. The incarnations of the aforementioned parallelism are multi-threading, multi-processing and cluster-wide executions. In the area of multi-threading, we discuss the new implicit parallelism and related interfaces, as well as the new building blocks to safely operate with ROOT objects in a multi-threaded environment. Regarding multi-processing, we review the new MultiProc framework, comparing it with similar tools (e.g. multiprocessing module in Python). Finally, as an alternative to PROOF for cluster-wide executions, we introduce the efforts on integrating ROOT with state-of-the-art distributed data processing technologies like Spark, both in terms of programming model and runtime design (with EOS as one of the main components). For all the levels of parallelism, we discuss, based on real-life examples and measurements, how our proposals can increase the productivity of scientists.
Parallel Fast Legendre Transform
Alves de Inda, M.; Bisseling, R.H.; Maslen, D.K.
1998-01-01
We discuss a parallel implementation of a fast algorithm for the discrete polynomial Legendre transform We give an introduction to the DriscollHealy algorithm using polynomial arithmetic and present experimental results on the eciency and accuracy of our implementation The algorithms were
Practical parallel programming
Bauer, Barr E
2014-01-01
This is the book that will teach programmers to write faster, more efficient code for parallel processors. The reader is introduced to a vast array of procedures and paradigms on which actual coding may be based. Examples and real-life simulations using these devices are presented in C and FORTRAN.
Parallel hierarchical radiosity rendering
Energy Technology Data Exchange (ETDEWEB)
Carter, Michael [Iowa State Univ., Ames, IA (United States)
1993-07-01
In this dissertation, the step-by-step development of a scalable parallel hierarchical radiosity renderer is documented. First, a new look is taken at the traditional radiosity equation, and a new form is presented in which the matrix of linear system coefficients is transformed into a symmetric matrix, thereby simplifying the problem and enabling a new solution technique to be applied. Next, the state-of-the-art hierarchical radiosity methods are examined for their suitability to parallel implementation, and scalability. Significant enhancements are also discovered which both improve their theoretical foundations and improve the images they generate. The resultant hierarchical radiosity algorithm is then examined for sources of parallelism, and for an architectural mapping. Several architectural mappings are discussed. A few key algorithmic changes are suggested during the process of making the algorithm parallel. Next, the performance, efficiency, and scalability of the algorithm are analyzed. The dissertation closes with a discussion of several ideas which have the potential to further enhance the hierarchical radiosity method, or provide an entirely new forum for the application of hierarchical methods.
Parallel universes beguile science
2007-01-01
A staple of mind-bending science fiction, the possibility of multiple universes has long intrigued hard-nosed physicists, mathematicians and cosmologists too. We may not be able -- as least not yet -- to prove they exist, many serious scientists say, but there are plenty of reasons to think that parallel dimensions are more than figments of eggheaded imagination.
Energy Technology Data Exchange (ETDEWEB)
2017-04-04
A parallelization of the k-means++ seed selection algorithm on three distinct hardware platforms: GPU, multicore CPU, and multithreaded architecture. K-means++ was developed by David Arthur and Sergei Vassilvitskii in 2007 as an extension of the k-means data clustering technique. These algorithms allow people to cluster multidimensional data, by attempting to minimize the mean distance of data points within a cluster. K-means++ improved upon traditional k-means by using a more intelligent approach to selecting the initial seeds for the clustering process. While k-means++ has become a popular alternative to traditional k-means clustering, little work has been done to parallelize this technique. We have developed original C++ code for parallelizing the algorithm on three unique hardware architectures: GPU using NVidia's CUDA/Thrust framework, multicore CPU using OpenMP, and the Cray XMT multithreaded architecture. By parallelizing the process for these platforms, we are able to perform k-means++ clustering much more quickly than it could be done before.
International Nuclear Information System (INIS)
Gardes, D.; Volkov, P.
1981-01-01
A 5x3cm 2 (timing only) and a 15x5cm 2 (timing and position) parallel plate avalanche counters (PPAC) are considered. The theory of operation and timing resolution is given. The measurement set-up and the curves of experimental results illustrate the possibilities of the two counters [fr
Parallel hierarchical global illumination
Energy Technology Data Exchange (ETDEWEB)
Snell, Quinn O. [Iowa State Univ., Ames, IA (United States)
1997-10-08
Solving the global illumination problem is equivalent to determining the intensity of every wavelength of light in all directions at every point in a given scene. The complexity of the problem has led researchers to use approximation methods for solving the problem on serial computers. Rather than using an approximation method, such as backward ray tracing or radiosity, the authors have chosen to solve the Rendering Equation by direct simulation of light transport from the light sources. This paper presents an algorithm that solves the Rendering Equation to any desired accuracy, and can be run in parallel on distributed memory or shared memory computer systems with excellent scaling properties. It appears superior in both speed and physical correctness to recent published methods involving bidirectional ray tracing or hybrid treatments of diffuse and specular surfaces. Like progressive radiosity methods, it dynamically refines the geometry decomposition where required, but does so without the excessive storage requirements for ray histories. The algorithm, called Photon, produces a scene which converges to the global illumination solution. This amounts to a huge task for a 1997-vintage serial computer, but using the power of a parallel supercomputer significantly reduces the time required to generate a solution. Currently, Photon can be run on most parallel environments from a shared memory multiprocessor to a parallel supercomputer, as well as on clusters of heterogeneous workstations.
Image fusion via nonlocal sparse K-SVD dictionary learning.
Li, Ying; Li, Fangyi; Bai, Bendu; Shen, Qiang
2016-03-01
Image fusion aims to merge two or more images captured via various sensors of the same scene to construct a more informative image by integrating their details. Generally, such integration is achieved through the manipulation of the representations of the images concerned. Sparse representation plays an important role in the effective description of images, offering a great potential in a variety of image processing tasks, including image fusion. Supported by sparse representation, in this paper, an approach for image fusion by the use of a novel dictionary learning scheme is proposed. The nonlocal self-similarity property of the images is exploited, not only at the stage of learning the underlying description dictionary but during the process of image fusion. In particular, the property of nonlocal self-similarity is combined with the traditional sparse dictionary. This results in an improved learned dictionary, hereafter referred to as the nonlocal sparse K-SVD dictionary (where K-SVD stands for the K times singular value decomposition that is commonly used in the literature), and abbreviated to NL_SK_SVD. The performance of the NL_SK_SVD dictionary is applied for image fusion using simultaneous orthogonal matching pursuit. The proposed approach is evaluated with different types of images, and compared with a number of alternative image fusion techniques. The resultant superior fused images using the present approach demonstrates the efficacy of the NL_SK_SVD dictionary in sparse image representation.
Sparse dictionary for synthetic transmit aperture medical ultrasound imaging.
Wang, Ping; Jiang, Jin-Yang; Li, Na; Luo, Han-Wu; Li, Fang; Cui, Shi-Gang
2017-07-01
It is possible to recover a signal below the Nyquist sampling limit using a compressive sensing technique in ultrasound imaging. However, the reconstruction enabled by common sparse transform approaches does not achieve satisfactory results. Considering the ultrasound echo signal's features of attenuation, repetition, and superposition, a sparse dictionary with the emission pulse signal is proposed. Sparse coefficients in the proposed dictionary have high sparsity. Images reconstructed with this dictionary were compared with those obtained with the three other common transforms, namely, discrete Fourier transform, discrete cosine transform, and discrete wavelet transform. The performance of the proposed dictionary was analyzed via a simulation and experimental data. The mean absolute error (MAE) was used to quantify the quality of the reconstructions. Experimental results indicate that the MAE associated with the proposed dictionary was always the smallest, the reconstruction time required was the shortest, and the lateral resolution and contrast of the reconstructed images were also the closest to the original images. The proposed sparse dictionary performed better than the other three sparse transforms. With the same sampling rate, the proposed dictionary achieved excellent reconstruction quality.
A sparse matrix based full-configuration interaction algorithm
International Nuclear Information System (INIS)
Rolik, Zoltan; Szabados, Agnes; Surjan, Peter R.
2008-01-01
We present an algorithm related to the full-configuration interaction (FCI) method that makes complete use of the sparse nature of the coefficient vector representing the many-electron wave function in a determinantal basis. Main achievements of the presented sparse FCI (SFCI) algorithm are (i) development of an iteration procedure that avoids the storage of FCI size vectors; (ii) development of an efficient algorithm to evaluate the effect of the Hamiltonian when both the initial and the product vectors are sparse. As a result of point (i) large disk operations can be skipped which otherwise may be a bottleneck of the procedure. At point (ii) we progress by adopting the implementation of the linear transformation by Olsen et al. [J. Chem Phys. 89, 2185 (1988)] for the sparse case, getting the algorithm applicable to larger systems and faster at the same time. The error of a SFCI calculation depends only on the dropout thresholds for the sparse vectors, and can be tuned by controlling the amount of system memory passed to the procedure. The algorithm permits to perform FCI calculations on single node workstations for systems previously accessible only by supercomputers
X-ray computed tomography using curvelet sparse regularization.
Wieczorek, Matthias; Frikel, Jürgen; Vogel, Jakob; Eggl, Elena; Kopp, Felix; Noël, Peter B; Pfeiffer, Franz; Demaret, Laurent; Lasser, Tobias
2015-04-01
Reconstruction of x-ray computed tomography (CT) data remains a mathematically challenging problem in medical imaging. Complementing the standard analytical reconstruction methods, sparse regularization is growing in importance, as it allows inclusion of prior knowledge. The paper presents a method for sparse regularization based on the curvelet frame for the application to iterative reconstruction in x-ray computed tomography. In this work, the authors present an iterative reconstruction approach based on the alternating direction method of multipliers using curvelet sparse regularization. Evaluation of the method is performed on a specifically crafted numerical phantom dataset to highlight the method's strengths. Additional evaluation is performed on two real datasets from commercial scanners with different noise characteristics, a clinical bone sample acquired in a micro-CT and a human abdomen scanned in a diagnostic CT. The results clearly illustrate that curvelet sparse regularization has characteristic strengths. In particular, it improves the restoration and resolution of highly directional, high contrast features with smooth contrast variations. The authors also compare this approach to the popular technique of total variation and to traditional filtered backprojection. The authors conclude that curvelet sparse regularization is able to improve reconstruction quality by reducing noise while preserving highly directional features.
Selectivity and sparseness in randomly connected balanced networks.
Directory of Open Access Journals (Sweden)
Cengiz Pehlevan
Full Text Available Neurons in sensory cortex show stimulus selectivity and sparse population response, even in cases where no strong functionally specific structure in connectivity can be detected. This raises the question whether selectivity and sparseness can be generated and maintained in randomly connected networks. We consider a recurrent network of excitatory and inhibitory spiking neurons with random connectivity, driven by random projections from an input layer of stimulus selective neurons. In this architecture, the stimulus-to-stimulus and neuron-to-neuron modulation of total synaptic input is weak compared to the mean input. Surprisingly, we show that in the balanced state the network can still support high stimulus selectivity and sparse population response. In the balanced state, strong synapses amplify the variation in synaptic input and recurrent inhibition cancels the mean. Functional specificity in connectivity emerges due to the inhomogeneity caused by the generative statistical rule used to build the network. We further elucidate the mechanism behind and evaluate the effects of model parameters on population sparseness and stimulus selectivity. Network response to mixtures of stimuli is investigated. It is shown that a balanced state with unselective inhibition can be achieved with densely connected input to inhibitory population. Balanced networks exhibit the "paradoxical" effect: an increase in excitatory drive to inhibition leads to decreased inhibitory population firing rate. We compare and contrast selectivity and sparseness generated by the balanced network to randomly connected unbalanced networks. Finally, we discuss our results in light of experiments.
Low-count PET image restoration using sparse representation
Li, Tao; Jiang, Changhui; Gao, Juan; Yang, Yongfeng; Liang, Dong; Liu, Xin; Zheng, Hairong; Hu, Zhanli
2018-04-01
In the field of positron emission tomography (PET), reconstructed images are often blurry and contain noise. These problems are primarily caused by the low resolution of projection data. Solving this problem by improving hardware is an expensive solution, and therefore, we attempted to develop a solution based on optimizing several related algorithms in both the reconstruction and image post-processing domains. As sparse technology is widely used, sparse prediction is increasingly applied to solve this problem. In this paper, we propose a new sparse method to process low-resolution PET images. Two dictionaries (D1 for low-resolution PET images and D2 for high-resolution PET images) are learned from a group real PET image data sets. Among these two dictionaries, D1 is used to obtain a sparse representation for each patch of the input PET image. Then, a high-resolution PET image is generated from this sparse representation using D2. Experimental results indicate that the proposed method exhibits a stable and superior ability to enhance image resolution and recover image details. Quantitatively, this method achieves better performance than traditional methods. This proposed strategy is a new and efficient approach for improving the quality of PET images.
Simulation of sparse matrix array designs
Boehm, Rainer; Heckel, Thomas
2018-04-01
Matrix phased array probes are becoming more prominently used in industrial applications. The main drawbacks, using probes incorporating a very large number of transducer elements, are needed for an appropriate cabling and an ultrasonic device offering many parallel channels. Matrix arrays designed for extended functionality feature at least 64 or more elements. Typical arrangements are square matrices, e.g., 8 by 8 or 11 by 11 or rectangular matrixes, e.g., 8 by 16 or 10 by 12 to fit a 128-channel phased array system. In some phased array systems, the number of simultaneous active elements is limited to a certain number, e.g., 32 or 64. Those setups do not allow running the probe with all elements active, which may cause a significant change in the directivity pattern of the resulting sound beam. When only a subset of elements can be used during a single acquisition, different strategies may be applied to collect enough data for rebuilding the missing information from the echo signal. Omission of certain elements may be one approach, overlay of subsequent shots with different active areas may be another one. This paper presents the influence of a decreased number of active elements on the sound field and their distribution on the array. Solutions using subsets with different element activity patterns on matrix arrays and their advantages and disadvantages concerning the sound field are evaluated using semi-analytical simulation tools. Sound field criteria are discussed, which are significant for non-destructive testing results and for the system setup.
Joint sparse representation for robust multimodal biometrics recognition.
Shekhar, Sumit; Patel, Vishal M; Nasrabadi, Nasser M; Chellappa, Rama
2014-01-01
Traditional biometric recognition systems rely on a single biometric signature for authentication. While the advantage of using multiple sources of information for establishing the identity has been widely recognized, computational models for multimodal biometrics recognition have only recently received attention. We propose a multimodal sparse representation method, which represents the test data by a sparse linear combination of training data, while constraining the observations from different modalities of the test subject to share their sparse representations. Thus, we simultaneously take into account correlations as well as coupling information among biometric modalities. A multimodal quality measure is also proposed to weigh each modality as it gets fused. Furthermore, we also kernelize the algorithm to handle nonlinearity in data. The optimization problem is solved using an efficient alternative direction method. Various experiments show that the proposed method compares favorably with competing fusion-based methods.
Sparse Representation Denoising for Radar High Resolution Range Profiling
Directory of Open Access Journals (Sweden)
Min Li
2014-01-01
Full Text Available Radar high resolution range profile has attracted considerable attention in radar automatic target recognition. In practice, radar return is usually contaminated by noise, which results in profile distortion and recognition performance degradation. To deal with this problem, in this paper, a novel denoising method based on sparse representation is proposed to remove the Gaussian white additive noise. The return is sparsely described in the Fourier redundant dictionary and the denoising problem is described as a sparse representation model. Noise level of the return, which is crucial to the denoising performance but often unknown, is estimated by performing subspace method on the sliding subsequence correlation matrix. Sliding window process enables noise level estimation using only one observation sequence, not only guaranteeing estimation efficiency but also avoiding the influence of profile time-shift sensitivity. Experimental results show that the proposed method can effectively improve the signal-to-noise ratio of the return, leading to a high-quality profile.
A Projected Conjugate Gradient Method for Sparse Minimax Problems
DEFF Research Database (Denmark)
Madsen, Kaj; Jonasson, Kristjan
1993-01-01
A new method for nonlinear minimax problems is presented. The method is of the trust region type and based on sequential linear programming. It is a first order method that only uses first derivatives and does not approximate Hessians. The new method is well suited for large sparse problems...... as it only requires that software for sparse linear programming and a sparse symmetric positive definite equation solver are available. On each iteration a special linear/quadratic model of the function is minimized, but contrary to the usual practice in trust region methods the quadratic model is only...... with the method are presented. In fact, we find that the number of iterations required is comparable to that of state-of-the-art quasi-Newton codes....
A Multiobjective Sparse Feature Learning Model for Deep Neural Networks.
Gong, Maoguo; Liu, Jia; Li, Hao; Cai, Qing; Su, Linzhi
2015-12-01
Hierarchical deep neural networks are currently popular learning models for imitating the hierarchical architecture of human brain. Single-layer feature extractors are the bricks to build deep networks. Sparse feature learning models are popular models that can learn useful representations. But most of those models need a user-defined constant to control the sparsity of representations. In this paper, we propose a multiobjective sparse feature learning model based on the autoencoder. The parameters of the model are learnt by optimizing two objectives, reconstruction error and the sparsity of hidden units simultaneously to find a reasonable compromise between them automatically. We design a multiobjective induced learning procedure for this model based on a multiobjective evolutionary algorithm. In the experiments, we demonstrate that the learning procedure is effective, and the proposed multiobjective model can learn useful sparse features.
Preconditioned Inexact Newton for Nonlinear Sparse Electromagnetic Imaging
Desmal, Abdulla; Bagci, Hakan
2014-01-01
Newton-type algorithms have been extensively studied in nonlinear microwave imaging due to their quadratic convergence rate and ability to recover images with high contrast values. In the past, Newton methods have been implemented in conjunction with smoothness promoting optimization/regularization schemes. However, this type of regularization schemes are known to perform poorly when applied in imagining domains with sparse content or sharp variations. In this work, an inexact Newton algorithm is formulated and implemented in conjunction with a linear sparse optimization scheme. A novel preconditioning technique is proposed to increase the convergence rate of the optimization problem. Numerical results demonstrate that the proposed framework produces sharper and more accurate images when applied in sparse/sparsified domains.
Preconditioned Inexact Newton for Nonlinear Sparse Electromagnetic Imaging
Desmal, Abdulla
2014-05-04
Newton-type algorithms have been extensively studied in nonlinear microwave imaging due to their quadratic convergence rate and ability to recover images with high contrast values. In the past, Newton methods have been implemented in conjunction with smoothness promoting optimization/regularization schemes. However, this type of regularization schemes are known to perform poorly when applied in imagining domains with sparse content or sharp variations. In this work, an inexact Newton algorithm is formulated and implemented in conjunction with a linear sparse optimization scheme. A novel preconditioning technique is proposed to increase the convergence rate of the optimization problem. Numerical results demonstrate that the proposed framework produces sharper and more accurate images when applied in sparse/sparsified domains.
Preconditioned Inexact Newton for Nonlinear Sparse Electromagnetic Imaging
Desmal, Abdulla
2014-01-06
Newton-type algorithms have been extensively studied in nonlinear microwave imaging due to their quadratic convergence rate and ability to recover images with high contrast values. In the past, Newton methods have been implemented in conjunction with smoothness promoting optimization/regularization schemes. However, this type of regularization schemes are known to perform poorly when applied in imagining domains with sparse content or sharp variations. In this work, an inexact Newton algorithm is formulated and implemented in conjunction with a linear sparse optimization scheme. A novel preconditioning technique is proposed to increase the convergence rate of the optimization problem. Numerical results demonstrate that the proposed framework produces sharper and more accurate images when applied in sparse/sparsified domains.
Identification of MIMO systems with sparse transfer function coefficients
Qiu, Wanzhi; Saleem, Syed Khusro; Skafidas, Efstratios
2012-12-01
We study the problem of estimating transfer functions of multivariable (multiple-input multiple-output--MIMO) systems with sparse coefficients. We note that subspace identification methods are powerful and convenient tools in dealing with MIMO systems since they neither require nonlinear optimization nor impose any canonical form on the systems. However, subspace-based methods are inefficient for systems with sparse transfer function coefficients since they work on state space models. We propose a two-step algorithm where the first step identifies the system order using the subspace principle in a state space format, while the second step estimates coefficients of the transfer functions via L1-norm convex optimization. The proposed algorithm retains good features of subspace methods with improved noise-robustness for sparse systems.
A General Sparse Tensor Framework for Electronic Structure Theory.
Manzer, Samuel; Epifanovsky, Evgeny; Krylov, Anna I; Head-Gordon, Martin
2017-03-14
Linear-scaling algorithms must be developed in order to extend the domain of applicability of electronic structure theory to molecules of any desired size. However, the increasing complexity of modern linear-scaling methods makes code development and maintenance a significant challenge. A major contributor to this difficulty is the lack of robust software abstractions for handling block-sparse tensor operations. We therefore report the development of a highly efficient symbolic block-sparse tensor library in order to provide access to high-level software constructs to treat such problems. Our implementation supports arbitrary multi-dimensional sparsity in all input and output tensors. We avoid cumbersome machine-generated code by implementing all functionality as a high-level symbolic C++ language library and demonstrate that our implementation attains very high performance for linear-scaling sparse tensor contractions.
Sparse dictionary learning of resting state fMRI networks.
Eavani, Harini; Filipovych, Roman; Davatzikos, Christos; Satterthwaite, Theodore D; Gur, Raquel E; Gur, Ruben C
2012-07-02
Research in resting state fMRI (rsfMRI) has revealed the presence of stable, anti-correlated functional subnetworks in the brain. Task-positive networks are active during a cognitive process and are anti-correlated with task-negative networks, which are active during rest. In this paper, based on the assumption that the structure of the resting state functional brain connectivity is sparse, we utilize sparse dictionary modeling to identify distinct functional sub-networks. We propose two ways of formulating the sparse functional network learning problem that characterize the underlying functional connectivity from different perspectives. Our results show that the whole-brain functional connectivity can be concisely represented with highly modular, overlapping task-positive/negative pairs of sub-networks.
Uncovering Transcriptional Regulatory Networks by Sparse Bayesian Factor Model
Directory of Open Access Journals (Sweden)
Qi Yuan(Alan
2010-01-01
Full Text Available Abstract The problem of uncovering transcriptional regulation by transcription factors (TFs based on microarray data is considered. A novel Bayesian sparse correlated rectified factor model (BSCRFM is proposed that models the unknown TF protein level activity, the correlated regulations between TFs, and the sparse nature of TF-regulated genes. The model admits prior knowledge from existing database regarding TF-regulated target genes based on a sparse prior and through a developed Gibbs sampling algorithm, a context-specific transcriptional regulatory network specific to the experimental condition of the microarray data can be obtained. The proposed model and the Gibbs sampling algorithm were evaluated on the simulated systems, and results demonstrated the validity and effectiveness of the proposed approach. The proposed model was then applied to the breast cancer microarray data of patients with Estrogen Receptor positive ( status and Estrogen Receptor negative ( status, respectively.
MULTISCALE SPARSE APPEARANCE MODELING AND SIMULATION OF PATHOLOGICAL DEFORMATIONS
Directory of Open Access Journals (Sweden)
Rami Zewail
2017-08-01
Full Text Available Machine learning and statistical modeling techniques has drawn much interest within the medical imaging research community. However, clinically-relevant modeling of anatomical structures continues to be a challenging task. This paper presents a novel method for multiscale sparse appearance modeling in medical images with application to simulation of pathological deformations in X-ray images of human spine. The proposed appearance model benefits from the non-linear approximation power of Contourlets and its ability to capture higher order singularities to achieve a sparse representation while preserving the accuracy of the statistical model. Independent Component Analysis is used to extract statistical independent modes of variations from the sparse Contourlet-based domain. The new model is then used to simulate clinically-relevant pathological deformations in radiographic images.
Parallel Breadth-First Search on Distributed Memory Systems
Energy Technology Data Exchange (ETDEWEB)
Computational Research Division; Buluc, Aydin; Madduri, Kamesh
2011-04-15
Data-intensive, graph-based computations are pervasive in several scientific applications, and are known to to be quite challenging to implement on distributed memory systems. In this work, we explore the design space of parallel algorithms for Breadth-First Search (BFS), a key subroutine in several graph algorithms. We present two highly-tuned par- allel approaches for BFS on large parallel systems: a level-synchronous strategy that relies on a simple vertex-based partitioning of the graph, and a two-dimensional sparse matrix- partitioning-based approach that mitigates parallel commu- nication overhead. For both approaches, we also present hybrid versions with intra-node multithreading. Our novel hybrid two-dimensional algorithm reduces communication times by up to a factor of 3.5, relative to a common vertex based approach. Our experimental study identifies execu- tion regimes in which these approaches will be competitive, and we demonstrate extremely high performance on lead- ing distributed-memory parallel systems. For instance, for a 40,000-core parallel execution on Hopper, an AMD Magny- Cours based system, we achieve a BFS performance rate of 17.8 billion edge visits per second on an undirected graph of 4.3 billion vertices and 68.7 billion edges with skewed degree distribution.
Wald, Ingo; Ize, Santiago
2015-07-28
Parallel population of a grid with a plurality of objects using a plurality of processors. One example embodiment is a method for parallel population of a grid with a plurality of objects using a plurality of processors. The method includes a first act of dividing a grid into n distinct grid portions, where n is the number of processors available for populating the grid. The method also includes acts of dividing a plurality of objects into n distinct sets of objects, assigning a distinct set of objects to each processor such that each processor determines by which distinct grid portion(s) each object in its distinct set of objects is at least partially bounded, and assigning a distinct grid portion to each processor such that each processor populates its distinct grid portion with any objects that were previously determined to be at least partially bounded by its distinct grid portion.
Ultrascalable petaflop parallel supercomputer
Blumrich, Matthias A [Ridgefield, CT; Chen, Dong [Croton On Hudson, NY; Chiu, George [Cross River, NY; Cipolla, Thomas M [Katonah, NY; Coteus, Paul W [Yorktown Heights, NY; Gara, Alan G [Mount Kisco, NY; Giampapa, Mark E [Irvington, NY; Hall, Shawn [Pleasantville, NY; Haring, Rudolf A [Cortlandt Manor, NY; Heidelberger, Philip [Cortlandt Manor, NY; Kopcsay, Gerard V [Yorktown Heights, NY; Ohmacht, Martin [Yorktown Heights, NY; Salapura, Valentina [Chappaqua, NY; Sugavanam, Krishnan [Mahopac, NY; Takken, Todd [Brewster, NY
2010-07-20
A massively parallel supercomputer of petaOPS-scale includes node architectures based upon System-On-a-Chip technology, where each processing node comprises a single Application Specific Integrated Circuit (ASIC) having up to four processing elements. The ASIC nodes are interconnected by multiple independent networks that optimally maximize the throughput of packet communications between nodes with minimal latency. The multiple networks may include three high-speed networks for parallel algorithm message passing including a Torus, collective network, and a Global Asynchronous network that provides global barrier and notification functions. These multiple independent networks may be collaboratively or independently utilized according to the needs or phases of an algorithm for optimizing algorithm processing performance. The use of a DMA engine is provided to facilitate message passing among the nodes without the expenditure of processing resources at the node.
DEFF Research Database (Denmark)
Gregersen, Frans; Josephson, Olle; Kristoffersen, Gjert
of departure that English may be used in parallel with the various local, in this case Nordic, languages. As such, the book integrates the challenge of internationalization faced by any university with the wish to improve quality in research, education and administration based on the local language......Abstract [en] More parallel, please is the result of the work of an Inter-Nordic group of experts on language policy financed by the Nordic Council of Ministers 2014-17. The book presents all that is needed to plan, practice and revise a university language policy which takes as its point......(s). There are three layers in the text: First, you may read the extremely brief version of the in total 11 recommendations for best practice. Second, you may acquaint yourself with the extended version of the recommendations and finally, you may study the reasoning behind each of them. At the end of the text, we give...
PARALLEL MOVING MECHANICAL SYSTEMS
Directory of Open Access Journals (Sweden)
Florian Ion Tiberius Petrescu
2014-09-01
Full Text Available Normal 0 false false false EN-US X-NONE X-NONE MicrosoftInternetExplorer4 Moving mechanical systems parallel structures are solid, fast, and accurate. Between parallel systems it is to be noticed Stewart platforms, as the oldest systems, fast, solid and precise. The work outlines a few main elements of Stewart platforms. Begin with the geometry platform, kinematic elements of it, and presented then and a few items of dynamics. Dynamic primary element on it means the determination mechanism kinetic energy of the entire Stewart platforms. It is then in a record tail cinematic mobile by a method dot matrix of rotation. If a structural mottoelement consists of two moving elements which translates relative, drive train and especially dynamic it is more convenient to represent the mottoelement as a single moving components. We have thus seven moving parts (the six motoelements or feet to which is added mobile platform 7 and one fixed.
Xyce parallel electronic simulator.
Energy Technology Data Exchange (ETDEWEB)
Keiter, Eric R; Mei, Ting; Russo, Thomas V.; Rankin, Eric Lamont; Schiek, Richard Louis; Thornquist, Heidi K.; Fixel, Deborah A.; Coffey, Todd S; Pawlowski, Roger P; Santarelli, Keith R.
2010-05-01
This document is a reference guide to the Xyce Parallel Electronic Simulator, and is a companion document to the Xyce Users Guide. The focus of this document is (to the extent possible) exhaustively list device parameters, solver options, parser options, and other usage details of Xyce. This document is not intended to be a tutorial. Users who are new to circuit simulation are better served by the Xyce Users Guide.
Betchov, R
2012-01-01
Stability of Parallel Flows provides information pertinent to hydrodynamical stability. This book explores the stability problems that occur in various fields, including electronics, mechanics, oceanography, administration, economics, as well as naval and aeronautical engineering. Organized into two parts encompassing 10 chapters, this book starts with an overview of the general equations of a two-dimensional incompressible flow. This text then explores the stability of a laminar boundary layer and presents the equation of the inviscid approximation. Other chapters present the general equation
Algorithmically specialized parallel computers
Snyder, Lawrence; Gannon, Dennis B
1985-01-01
Algorithmically Specialized Parallel Computers focuses on the concept and characteristics of an algorithmically specialized computer.This book discusses the algorithmically specialized computers, algorithmic specialization using VLSI, and innovative architectures. The architectures and algorithms for digital signal, speech, and image processing and specialized architectures for numerical computations are also elaborated. Other topics include the model for analyzing generalized inter-processor, pipelined architecture for search tree maintenance, and specialized computer organization for raster
Universal Regularizers For Robust Sparse Coding and Modeling
Ramirez, Ignacio; Sapiro, Guillermo
2010-01-01
Sparse data models, where data is assumed to be well represented as a linear combination of a few elements from a dictionary, have gained considerable attention in recent years, and their use has led to state-of-the-art results in many signal and image processing tasks. It is now well understood that the choice of the sparsity regularization term is critical in the success of such models. Based on a codelength minimization interpretation of sparse coding, and using tools from universal coding...
Uniform sparse bounds for discrete quadratic phase Hilbert transforms
Kesler, Robert; Arias, Darío Mena
2017-09-01
For each α \\in T consider the discrete quadratic phase Hilbert transform acting on finitely supported functions f : Z → C according to H^{α }f(n):= \\sum _{m ≠ 0} e^{iα m^2} f(n - m)/m. We prove that, uniformly in α \\in T , there is a sparse bound for the bilinear form for every pair of finitely supported functions f,g : Z→ C . The sparse bound implies several mapping properties such as weighted inequalities in an intersection of Muckenhoupt and reverse Hölder classes.
Sparse reconstruction by means of the standard Tikhonov regularization
International Nuclear Information System (INIS)
Lu Shuai; Pereverzev, Sergei V
2008-01-01
It is a common belief that Tikhonov scheme with || · ||L 2 -penalty fails in sparse reconstruction. We are going to show, however, that this standard regularization can help if the stability measured in L 1 -norm will be properly taken into account in the choice of the regularization parameter. The crucial point is that now a stability bound may depend on the bases with respect to which the solution of the problem is assumed to be sparse. We discuss how such a stability can be estimated numerically and present the results of computational experiments giving the evidence of the reliability of our approach.
Sparse electromagnetic imaging using nonlinear iterative shrinkage thresholding
Desmal, Abdulla; Bagci, Hakan
2015-01-01
A sparse nonlinear electromagnetic imaging scheme is proposed for reconstructing dielectric contrast of investigation domains from measured fields. The proposed approach constructs the optimization problem by introducing the sparsity constraint to the data misfit between the scattered fields expressed as a nonlinear function of the contrast and the measured fields and solves it using the nonlinear iterative shrinkage thresholding algorithm. The thresholding is applied to the result of every nonlinear Landweber iteration to enforce the sparsity constraint. Numerical results demonstrate the accuracy and efficiency of the proposed method in reconstructing sparse dielectric profiles.
Sparse electromagnetic imaging using nonlinear iterative shrinkage thresholding
Desmal, Abdulla
2015-04-13
A sparse nonlinear electromagnetic imaging scheme is proposed for reconstructing dielectric contrast of investigation domains from measured fields. The proposed approach constructs the optimization problem by introducing the sparsity constraint to the data misfit between the scattered fields expressed as a nonlinear function of the contrast and the measured fields and solves it using the nonlinear iterative shrinkage thresholding algorithm. The thresholding is applied to the result of every nonlinear Landweber iteration to enforce the sparsity constraint. Numerical results demonstrate the accuracy and efficiency of the proposed method in reconstructing sparse dielectric profiles.
Sparse grid techniques for particle-in-cell schemes
Ricketson, L. F.; Cerfon, A. J.
2017-02-01
We propose the use of sparse grids to accelerate particle-in-cell (PIC) schemes. By using the so-called ‘combination technique’ from the sparse grids literature, we are able to dramatically increase the size of the spatial cells in multi-dimensional PIC schemes while paying only a slight penalty in grid-based error. The resulting increase in cell size allows us to reduce the statistical noise in the simulation without increasing total particle number. We present initial proof-of-principle results from test cases in two and three dimensions that demonstrate the new scheme’s efficiency, both in terms of computation time and memory usage.
Ordering sparse matrices for cache-based systems
International Nuclear Information System (INIS)
Biswas, Rupak; Oliker, Leonid
2001-01-01
The Conjugate Gradient (CG) algorithm is the oldest and best-known Krylov subspace method used to solve sparse linear systems. Most of the coating-point operations within each CG iteration is spent performing sparse matrix-vector multiplication (SPMV). We examine how various ordering and partitioning strategies affect the performance of CG and SPMV when different programming paradigms are used on current commercial cache-based computers. However, a multithreaded implementation on the cacheless Cray MTA demonstrates high efficiency and scalability without any special ordering or partitioning
Sparse Matrix for ECG Identification with Two-Lead Features
Directory of Open Access Journals (Sweden)
Kuo-Kun Tseng
2015-01-01
Full Text Available Electrocardiograph (ECG human identification has the potential to improve biometric security. However, improvements in ECG identification and feature extraction are required. Previous work has focused on single lead ECG signals. Our work proposes a new algorithm for human identification by mapping two-lead ECG signals onto a two-dimensional matrix then employing a sparse matrix method to process the matrix. And that is the first application of sparse matrix techniques for ECG identification. Moreover, the results of our experiments demonstrate the benefits of our approach over existing methods.
Low-rank and sparse modeling for visual analysis
Fu, Yun
2014-01-01
This book provides a view of low-rank and sparse computing, especially approximation, recovery, representation, scaling, coding, embedding and learning among unconstrained visual data. The book includes chapters covering multiple emerging topics in this new field. It links multiple popular research fields in Human-Centered Computing, Social Media, Image Classification, Pattern Recognition, Computer Vision, Big Data, and Human-Computer Interaction. Contains an overview of the low-rank and sparse modeling techniques for visual analysis by examining both theoretical analysis and real-world applic
Directory of Open Access Journals (Sweden)
Bin Yan
2015-01-01
Full Text Available Sparse-view imaging is a promising scanning method which can reduce the radiation dose in X-ray computed tomography (CT. Reconstruction algorithm for sparse-view imaging system is of significant importance. The adoption of the spatial iterative algorithm for CT image reconstruction has a low operation efficiency and high computation requirement. A novel Fourier-based iterative reconstruction technique that utilizes nonuniform fast Fourier transform is presented in this study along with the advanced total variation (TV regularization for sparse-view CT. Combined with the alternating direction method, the proposed approach shows excellent efficiency and rapid convergence property. Numerical simulations and real data experiments are performed on a parallel beam CT. Experimental results validate that the proposed method has higher computational efficiency and better reconstruction quality than the conventional algorithms, such as simultaneous algebraic reconstruction technique using TV method and the alternating direction total variation minimization approach, with the same time duration. The proposed method appears to have extensive applications in X-ray CT imaging.
A Parallel Sweeping Preconditioner for Heterogeneous 3D Helmholtz Equations
Poulson, Jack
2013-05-02
A parallelization of a sweeping preconditioner for three-dimensional Helmholtz equations without large cavities is introduced and benchmarked for several challenging velocity models. The setup and application costs of the sequential preconditioner are shown to be O(γ2N4/3) and O(γN logN), where γ(ω) denotes the modestly frequency-dependent number of grid points per perfectly matched layer. Several computational and memory improvements are introduced relative to using black-box sparse-direct solvers for the auxiliary problems, and competitive runtimes and iteration counts are reported for high-frequency problems distributed over thousands of cores. Two open-source packages are released along with this paper: Parallel Sweeping Preconditioner (PSP) and the underlying distributed multifrontal solver, Clique. © 2013 Society for Industrial and Applied Mathematics.
A hybrid method for the parallel computation of Green's functions
DEFF Research Database (Denmark)
Petersen, Dan Erik; Li, Song; Stokbro, Kurt
2009-01-01
of the large number of times this calculation needs to be performed, this is computationally very expensive even on supercomputers. The classical approach is based on recurrence formulas which cannot be efficiently parallelized. This practically prevents the solution of large problems with hundreds...... of thousands of atoms. We propose new recurrences for a general class of sparse matrices to calculate Green's and lesser Green's function matrices which extend formulas derived by Takahashi and others. We show that these recurrences may lead to a dramatically reduced computational cost because they only...... require computing a small number of entries of the inverse matrix. Then. we propose a parallelization strategy for block tridiagonal matrices which involves a combination of Schur complement calculations and cyclic reduction. It achieves good scalability even on problems of modest size....
Directory of Open Access Journals (Sweden)
Tianpeng Feng
2016-01-01
Full Text Available An auto fabric defect detection system via computer vision is used to replace manual inspection. In this paper, we propose a hardware accelerated algorithm based on a small-scale over-completed dictionary (SSOCD via sparse coding (SC method, which is realized on a parallel hardware platform (TMS320C6678. In order to reduce computation, the image patches projections in the training SSOCD are taken as features and the proposed features are more robust, and exhibit obvious advantages in detection results and computational cost. Furthermore, we introduce detection ratio and false ratio in order to measure the performance and reliability of the hardware accelerated algorithm. The experiments show that the proposed algorithm can run with high parallel efficiency and that the detection speed meets the real-time requirements of industrial inspection.
A hybrid method for the parallel computation of Green's functions
International Nuclear Information System (INIS)
Petersen, Dan Erik; Li Song; Stokbro, Kurt; Sorensen, Hans Henrik B.; Hansen, Per Christian; Skelboe, Stig; Darve, Eric
2009-01-01
Quantum transport models for nanodevices using the non-equilibrium Green's function method require the repeated calculation of the block tridiagonal part of the Green's and lesser Green's function matrices. This problem is related to the calculation of the inverse of a sparse matrix. Because of the large number of times this calculation needs to be performed, this is computationally very expensive even on supercomputers. The classical approach is based on recurrence formulas which cannot be efficiently parallelized. This practically prevents the solution of large problems with hundreds of thousands of atoms. We propose new recurrences for a general class of sparse matrices to calculate Green's and lesser Green's function matrices which extend formulas derived by Takahashi and others. We show that these recurrences may lead to a dramatically reduced computational cost because they only require computing a small number of entries of the inverse matrix. Then, we propose a parallelization strategy for block tridiagonal matrices which involves a combination of Schur complement calculations and cyclic reduction. It achieves good scalability even on problems of modest size.
Efficient coordinated recovery of sparse channels in massive MIMO
Masood, Mudassir
2015-01-01
This paper addresses the problem of estimating sparse channels in massive MIMO-OFDM systems. Most wireless channels are sparse in nature with large delay spread. In addition, these channels as observed by multiple antennas in a neighborhood have approximately common support. The sparsity and common support properties are attractive when it comes to the efficient estimation of large number of channels in massive MIMO systems. Moreover, to avoid pilot contamination and to achieve better spectral efficiency, it is important to use a small number of pilots. We present a novel channel estimation approach which utilizes the sparsity and common support properties to estimate sparse channels and requires a small number of pilots. Two algorithms based on this approach have been developed that perform Bayesian estimates of sparse channels even when the prior is non-Gaussian or unknown. Neighboring antennas share among each other their beliefs about the locations of active channel taps to perform estimation. The coordinated approach improves channel estimates and also reduces the required number of pilots. Further improvement is achieved by the data-aided version of the algorithm. Extensive simulation results are provided to demonstrate the performance of the proposed algorithms.
Sparse Generalized Fourier Series via Collocation-based Optimization
2014-11-01
Theory 51, 12 (2005) 4203– 4215. [6] P. CONSTANTINE , M. ELDRED AND E. PHIPPS, Sparse pseu- dospectral approximation method. Comput. Methods Appl. Mech...Visition XVI: Algorithms, Techniques, Active Vision , and Materials Handling, 224 (1997). [15] J. SHEN AND L. WANG, Some recent advances on spectral methods
A Sparse Bayesian Learning Algorithm With Dictionary Parameter Estimation
DEFF Research Database (Denmark)
Hansen, Thomas Lundgaard; Badiu, Mihai Alin; Fleury, Bernard Henri
2014-01-01
This paper concerns sparse decomposition of a noisy signal into atoms which are specified by unknown continuous-valued parameters. An example could be estimation of the model order, frequencies and amplitudes of a superposition of complex sinusoids. The common approach is to reduce the continuous...
Robust visual tracking via structured multi-task sparse learning
Zhang, Tianzhu; Ghanem, Bernard; Liu, Si; Ahuja, Narendra
2012-01-01
In this paper, we formulate object tracking in a particle filter framework as a structured multi-task sparse learning problem, which we denote as Structured Multi-Task Tracking (S-MTT). Since we model particles as linear combinations of dictionary
Behavior of greedy sparse representation algorithms on nested supports
DEFF Research Database (Denmark)
Mailhé, Boris; Sturm, Bob L.; Plumbley, Mark
2013-01-01
is not locally nested: there is a dictionary and supports Γ ⊃ Γ′ such that OMP can recover all signals with support Γ, but not all signals with support Γ′. We also show that the support recovery optimality of OMP is globally nested: if OMP can recover all s-sparse signals, then it can recover all s...
Sparse linear models: Variational approximate inference and Bayesian experimental design
International Nuclear Information System (INIS)
Seeger, Matthias W
2009-01-01
A wide range of problems such as signal reconstruction, denoising, source separation, feature selection, and graphical model search are addressed today by posterior maximization for linear models with sparsity-favouring prior distributions. The Bayesian posterior contains useful information far beyond its mode, which can be used to drive methods for sampling optimization (active learning), feature relevance ranking, or hyperparameter estimation, if only this representation of uncertainty can be approximated in a tractable manner. In this paper, we review recent results for variational sparse inference, and show that they share underlying computational primitives. We discuss how sampling optimization can be implemented as sequential Bayesian experimental design. While there has been tremendous recent activity to develop sparse estimation, little attendance has been given to sparse approximate inference. In this paper, we argue that many problems in practice, such as compressive sensing for real-world image reconstruction, are served much better by proper uncertainty approximations than by ever more aggressive sparse estimation algorithms. Moreover, since some variational inference methods have been given strong convex optimization characterizations recently, theoretical analysis may become possible, promising new insights into nonlinear experimental design.
Inference algorithms and learning theory for Bayesian sparse factor analysis
International Nuclear Information System (INIS)
Rattray, Magnus; Sharp, Kevin; Stegle, Oliver; Winn, John
2009-01-01
Bayesian sparse factor analysis has many applications; for example, it has been applied to the problem of inferring a sparse regulatory network from gene expression data. We describe a number of inference algorithms for Bayesian sparse factor analysis using a slab and spike mixture prior. These include well-established Markov chain Monte Carlo (MCMC) and variational Bayes (VB) algorithms as well as a novel hybrid of VB and Expectation Propagation (EP). For the case of a single latent factor we derive a theory for learning performance using the replica method. We compare the MCMC and VB/EP algorithm results with simulated data to the theoretical prediction. The results for MCMC agree closely with the theory as expected. Results for VB/EP are slightly sub-optimal but show that the new algorithm is effective for sparse inference. In large-scale problems MCMC is infeasible due to computational limitations and the VB/EP algorithm then provides a very useful computationally efficient alternative.
Sparse linear models: Variational approximate inference and Bayesian experimental design
Energy Technology Data Exchange (ETDEWEB)
Seeger, Matthias W [Saarland University and Max Planck Institute for Informatics, Campus E1.4, 66123 Saarbruecken (Germany)
2009-12-01
A wide range of problems such as signal reconstruction, denoising, source separation, feature selection, and graphical model search are addressed today by posterior maximization for linear models with sparsity-favouring prior distributions. The Bayesian posterior contains useful information far beyond its mode, which can be used to drive methods for sampling optimization (active learning), feature relevance ranking, or hyperparameter estimation, if only this representation of uncertainty can be approximated in a tractable manner. In this paper, we review recent results for variational sparse inference, and show that they share underlying computational primitives. We discuss how sampling optimization can be implemented as sequential Bayesian experimental design. While there has been tremendous recent activity to develop sparse estimation, little attendance has been given to sparse approximate inference. In this paper, we argue that many problems in practice, such as compressive sensing for real-world image reconstruction, are served much better by proper uncertainty approximations than by ever more aggressive sparse estimation algorithms. Moreover, since some variational inference methods have been given strong convex optimization characterizations recently, theoretical analysis may become possible, promising new insights into nonlinear experimental design.
Inference algorithms and learning theory for Bayesian sparse factor analysis
Energy Technology Data Exchange (ETDEWEB)
Rattray, Magnus; Sharp, Kevin [School of Computer Science, University of Manchester, Manchester M13 9PL (United Kingdom); Stegle, Oliver [Max-Planck-Institute for Biological Cybernetics, Tuebingen (Germany); Winn, John, E-mail: magnus.rattray@manchester.ac.u [Microsoft Research Cambridge, Roger Needham Building, Cambridge, CB3 0FB (United Kingdom)
2009-12-01
Bayesian sparse factor analysis has many applications; for example, it has been applied to the problem of inferring a sparse regulatory network from gene expression data. We describe a number of inference algorithms for Bayesian sparse factor analysis using a slab and spike mixture prior. These include well-established Markov chain Monte Carlo (MCMC) and variational Bayes (VB) algorithms as well as a novel hybrid of VB and Expectation Propagation (EP). For the case of a single latent factor we derive a theory for learning performance using the replica method. We compare the MCMC and VB/EP algorithm results with simulated data to the theoretical prediction. The results for MCMC agree closely with the theory as expected. Results for VB/EP are slightly sub-optimal but show that the new algorithm is effective for sparse inference. In large-scale problems MCMC is infeasible due to computational limitations and the VB/EP algorithm then provides a very useful computationally efficient alternative.
Efficient coordinated recovery of sparse channels in massive MIMO
Masood, Mudassir; Afify, Laila H.; Al-Naffouri, Tareq Y.
2015-01-01
on this approach have been developed that perform Bayesian estimates of sparse channels even when the prior is non-Gaussian or unknown. Neighboring antennas share among each other their beliefs about the locations of active channel taps to perform estimation
Low-rank sparse learning for robust visual tracking
Zhang, Tianzhu
2012-01-01
In this paper, we propose a new particle-filter based tracking algorithm that exploits the relationship between particles (candidate targets). By representing particles as sparse linear combinations of dictionary templates, this algorithm capitalizes on the inherent low-rank structure of particle representations that are learned jointly. As such, it casts the tracking problem as a low-rank matrix learning problem. This low-rank sparse tracker (LRST) has a number of attractive properties. (1) Since LRST adaptively updates dictionary templates, it can handle significant changes in appearance due to variations in illumination, pose, scale, etc. (2) The linear representation in LRST explicitly incorporates background templates in the dictionary and a sparse error term, which enables LRST to address the tracking drift problem and to be robust against occlusion respectively. (3) LRST is computationally attractive, since the low-rank learning problem can be efficiently solved as a sequence of closed form update operations, which yield a time complexity that is linear in the number of particles and the template size. We evaluate the performance of LRST by applying it to a set of challenging video sequences and comparing it to 6 popular tracking methods. Our experiments show that by representing particles jointly, LRST not only outperforms the state-of-the-art in tracking accuracy but also significantly improves the time complexity of methods that use a similar sparse linear representation model for particles [1]. © 2012 Springer-Verlag.
Structure-aware Local Sparse Coding for Visual Tracking
Qi, Yuankai; Qin, Lei; Zhang, Jian; Zhang, Shengping; Huang, Qingming; Yang, Ming-Hsuan
2018-01-01
with the corresponding local regions of the target templates that are the most similar from the global view. Thus, a more precise and discriminative sparse representation is obtained to account for appearance changes. To alleviate the issues with tracking drifts, we
A Practical View on Tunable Sparse Network Coding
DEFF Research Database (Denmark)
Sørensen, Chres Wiant; Shahbaz Badr, Arash; Cabrera Guerrero, Juan Alberto
2015-01-01
Tunable sparse network coding (TSNC) constitutes a promising concept for trading off computational complexity and delay performance. This paper advocates for the use of judicious feedback as a key not only to make TSNC practical, but also to deliver a highly consistent and controlled delay perfor...
SparseBeads data: benchmarking sparsity-regularized computed tomography
DEFF Research Database (Denmark)
Jørgensen, Jakob Sauer; Coban, Sophia B.; Lionheart, William R. B.
2017-01-01
-regularized reconstruction. A collection of 48 x-ray CT datasets called SparseBeads was designed for benchmarking SR reconstruction algorithms. Beadpacks comprising glass beads of five different sizes as well as mixtures were scanned in a micro-CT scanner to provide structured datasets with variable image sparsity levels...
Hierarchical Bayesian sparse image reconstruction with application to MRFM.
Dobigeon, Nicolas; Hero, Alfred O; Tourneret, Jean-Yves
2009-09-01
This paper presents a hierarchical Bayesian model to reconstruct sparse images when the observations are obtained from linear transformations and corrupted by an additive white Gaussian noise. Our hierarchical Bayes model is well suited to such naturally sparse image applications as it seamlessly accounts for properties such as sparsity and positivity of the image via appropriate Bayes priors. We propose a prior that is based on a weighted mixture of a positive exponential distribution and a mass at zero. The prior has hyperparameters that are tuned automatically by marginalization over the hierarchical Bayesian model. To overcome the complexity of the posterior distribution, a Gibbs sampling strategy is proposed. The Gibbs samples can be used to estimate the image to be recovered, e.g., by maximizing the estimated posterior distribution. In our fully Bayesian approach, the posteriors of all the parameters are available. Thus, our algorithm provides more information than other previously proposed sparse reconstruction methods that only give a point estimate. The performance of the proposed hierarchical Bayesian sparse reconstruction method is illustrated on synthetic data and real data collected from a tobacco virus sample using a prototype MRFM instrument.
Multiple instance learning tracking method with local sparse representation
Xie, Chengjun
2013-10-01
When objects undergo large pose change, illumination variation or partial occlusion, most existed visual tracking algorithms tend to drift away from targets and even fail in tracking them. To address this issue, in this study, the authors propose an online algorithm by combining multiple instance learning (MIL) and local sparse representation for tracking an object in a video system. The key idea in our method is to model the appearance of an object by local sparse codes that can be formed as training data for the MIL framework. First, local image patches of a target object are represented as sparse codes with an overcomplete dictionary, where the adaptive representation can be helpful in overcoming partial occlusion in object tracking. Then MIL learns the sparse codes by a classifier to discriminate the target from the background. Finally, results from the trained classifier are input into a particle filter framework to sequentially estimate the target state over time in visual tracking. In addition, to decrease the visual drift because of the accumulative errors when updating the dictionary and classifier, a two-step object tracking method combining a static MIL classifier with a dynamical MIL classifier is proposed. Experiments on some publicly available benchmarks of video sequences show that our proposed tracker is more robust and effective than others. © The Institution of Engineering and Technology 2013.
Fast sparse matrix-vector multiplication by partitioning and reordering
Yzelman, A.N.
2011-01-01
The thesis introduces a cache-oblivious method for the sparse matrix-vector (SpMV) multiplication, which is an important computational kernel in many applications. The method works by permuting rows and columns of the input matrix so that the resulting reordered matrix induces cache-friendly
Sobol indices for dimension adaptivity in sparse grids
Dwight, R.P.; Desmedt, S.G.L.; Shoeibi Omrani, P.
2016-01-01
Propagation of random variables through computer codes of many inputs is primarily limited by computational expense. The use of sparse grids mitigates these costs somewhat; here we show how Sobol indices can be used to perform dimension adaptivity to mitigate them further. The method is compared to
Discriminative object tracking via sparse representation and online dictionary learning.
Xie, Yuan; Zhang, Wensheng; Li, Cuihua; Lin, Shuyang; Qu, Yanyun; Zhang, Yinghua
2014-04-01
We propose a robust tracking algorithm based on local sparse coding with discriminative dictionary learning and new keypoint matching schema. This algorithm consists of two parts: the local sparse coding with online updated discriminative dictionary for tracking (SOD part), and the keypoint matching refinement for enhancing the tracking performance (KP part). In the SOD part, the local image patches of the target object and background are represented by their sparse codes using an over-complete discriminative dictionary. Such discriminative dictionary, which encodes the information of both the foreground and the background, may provide more discriminative power. Furthermore, in order to adapt the dictionary to the variation of the foreground and background during the tracking, an online learning method is employed to update the dictionary. The KP part utilizes refined keypoint matching schema to improve the performance of the SOD. With the help of sparse representation and online updated discriminative dictionary, the KP part are more robust than the traditional method to reject the incorrect matches and eliminate the outliers. The proposed method is embedded into a Bayesian inference framework for visual tracking. Experimental results on several challenging video sequences demonstrate the effectiveness and robustness of our approach.
Fast Estimation of Optimal Sparseness of Music Signals
DEFF Research Database (Denmark)
la Cour-Harbo, Anders
2006-01-01
We want to use a variety of sparseness measured applied to ‘the minimal L1 norm representation' of a music signal in an overcomplete dictionary as features for automatic classification of music. Unfortunately, the process of computing the optimal L1 norm representation is rather slow, and we...
Sparse principal component analysis in medical shape modeling
Sjöstrand, Karl; Stegmann, Mikkel B.; Larsen, Rasmus
2006-03-01
Principal component analysis (PCA) is a widely used tool in medical image analysis for data reduction, model building, and data understanding and exploration. While PCA is a holistic approach where each new variable is a linear combination of all original variables, sparse PCA (SPCA) aims at producing easily interpreted models through sparse loadings, i.e. each new variable is a linear combination of a subset of the original variables. One of the aims of using SPCA is the possible separation of the results into isolated and easily identifiable effects. This article introduces SPCA for shape analysis in medicine. Results for three different data sets are given in relation to standard PCA and sparse PCA by simple thresholding of small loadings. Focus is on a recent algorithm for computing sparse principal components, but a review of other approaches is supplied as well. The SPCA algorithm has been implemented using Matlab and is available for download. The general behavior of the algorithm is investigated, and strengths and weaknesses are discussed. The original report on the SPCA algorithm argues that the ordering of modes is not an issue. We disagree on this point and propose several approaches to establish sensible orderings. A method that orders modes by decreasing variance and maximizes the sum of variances for all modes is presented and investigated in detail.
Robust Visual Tracking Via Consistent Low-Rank Sparse Learning
Zhang, Tianzhu; Liu, Si; Ahuja, Narendra; Yang, Ming-Hsuan; Ghanem, Bernard
2014-01-01
and the low-rank minimization problem for learning joint sparse representations can be efficiently solved by a sequence of closed form update operations. We evaluate the proposed CLRST algorithm against 14 state-of-the-art tracking methods on a set of 25
Non-Cartesian MRI scan time reduction through sparse sampling
Wajer, F.T.A.W.
2001-01-01
Non-Cartesian MRI Scan-Time Reduction through Sparse Sampling Magnetic resonance imaging (MRI) signals are measured in the Fourier domain, also called k-space. Samples of the MRI signal can not be taken at will, but lie along k-space trajectories determined by the magnetic field gradients. MRI
Sparsely-Packetized Predictive Control by Orthogonal Matching Pursuit
DEFF Research Database (Denmark)
Nagahara, Masaaki; Quevedo, Daniel; Østergaard, Jan
2012-01-01
We study packetized predictive control, known to be robust against packet dropouts in networked systems. To obtain sparse packets for rate-limited networks, we design control packets via an ℓ0 optimization, which can be eectively solved by orthogonal matching pursuit. Our formulation ensures...
Proportionate Minimum Error Entropy Algorithm for Sparse System Identification
Directory of Open Access Journals (Sweden)
Zongze Wu
2015-08-01
Full Text Available Sparse system identification has received a great deal of attention due to its broad applicability. The proportionate normalized least mean square (PNLMS algorithm, as a popular tool, achieves excellent performance for sparse system identification. In previous studies, most of the cost functions used in proportionate-type sparse adaptive algorithms are based on the mean square error (MSE criterion, which is optimal only when the measurement noise is Gaussian. However, this condition does not hold in most real-world environments. In this work, we use the minimum error entropy (MEE criterion, an alternative to the conventional MSE criterion, to develop the proportionate minimum error entropy (PMEE algorithm for sparse system identification, which may achieve much better performance than the MSE based methods especially in heavy-tailed non-Gaussian situations. Moreover, we analyze the convergence of the proposed algorithm and derive a sufficient condition that ensures the mean square convergence. Simulation results confirm the excellent performance of the new algorithm.
Robust Visual Tracking Via Consistent Low-Rank Sparse Learning
Zhang, Tianzhu
2014-06-19
Object tracking is the process of determining the states of a target in consecutive video frames based on properties of motion and appearance consistency. In this paper, we propose a consistent low-rank sparse tracker (CLRST) that builds upon the particle filter framework for tracking. By exploiting temporal consistency, the proposed CLRST algorithm adaptively prunes and selects candidate particles. By using linear sparse combinations of dictionary templates, the proposed method learns the sparse representations of image regions corresponding to candidate particles jointly by exploiting the underlying low-rank constraints. In addition, the proposed CLRST algorithm is computationally attractive since temporal consistency property helps prune particles and the low-rank minimization problem for learning joint sparse representations can be efficiently solved by a sequence of closed form update operations. We evaluate the proposed CLRST algorithm against 14 state-of-the-art tracking methods on a set of 25 challenging image sequences. Experimental results show that the CLRST algorithm performs favorably against state-of-the-art tracking methods in terms of accuracy and execution time.
Sparse PDF Volumes for Consistent Multi-Resolution Volume Rendering
Sicat, Ronell Barrera
2014-12-31
This paper presents a new multi-resolution volume representation called sparse pdf volumes, which enables consistent multi-resolution volume rendering based on probability density functions (pdfs) of voxel neighborhoods. These pdfs are defined in the 4D domain jointly comprising the 3D volume and its 1D intensity range. Crucially, the computation of sparse pdf volumes exploits data coherence in 4D, resulting in a sparse representation with surprisingly low storage requirements. At run time, we dynamically apply transfer functions to the pdfs using simple and fast convolutions. Whereas standard low-pass filtering and down-sampling incur visible differences between resolution levels, the use of pdfs facilitates consistent results independent of the resolution level used. We describe the efficient out-of-core computation of large-scale sparse pdf volumes, using a novel iterative simplification procedure of a mixture of 4D Gaussians. Finally, our data structure is optimized to facilitate interactive multi-resolution volume rendering on GPUs.
EEG Source Reconstruction using Sparse Basis Function Representations
DEFF Research Database (Denmark)
Hansen, Sofie Therese; Hansen, Lars Kai
2014-01-01
-validation this approach is more automated than competing approaches such as Multiple Sparse Priors (Friston et al., 2008) or Champagne (Wipf et al., 2010) that require manual selection of noise level and auxiliary signal free data, respectively. Finally, we propose an unbiased estimator of the reproducibility...
Aliasing-free wideband beamforming using sparse signal representation
Tang, Z.; Blacquière, G.; Leus, G.
2011-01-01
Sparse signal representation (SSR) is considered to be an appealing alternative to classical beamforming for direction-of-arrival (DOA) estimation. For wideband signals, the SSR-based approach constructs steering matrices, referred to as dictionaries in this paper, corresponding to different
Resistor Combinations for Parallel Circuits.
McTernan, James P.
1978-01-01
To help simplify both teaching and learning of parallel circuits, a high school electricity/electronics teacher presents and illustrates the use of tables of values for parallel resistive circuits in which total resistances are whole numbers. (MF)
SOFTWARE FOR DESIGNING PARALLEL APPLICATIONS
Directory of Open Access Journals (Sweden)
M. K. Bouza
2017-01-01
Full Text Available The object of research is the tools to support the development of parallel programs in C/C ++. The methods and software which automates the process of designing parallel applications are proposed.
Effects of Ordering Strategies and Programming Paradigms on Sparse Matrix Computations
Oliker, Leonid; Li, Xiaoye; Husbands, Parry; Biswas, Rupak; Biegel, Bryan (Technical Monitor)
2002-01-01
The Conjugate Gradient (CG) algorithm is perhaps the best-known iterative technique to solve sparse linear systems that are symmetric and positive definite. For systems that are ill-conditioned, it is often necessary to use a preconditioning technique. In this paper, we investigate the effects of various ordering and partitioning strategies on the performance of parallel CG and ILU(O) preconditioned CG (PCG) using different programming paradigms and architectures. Results show that for this class of applications: ordering significantly improves overall performance on both distributed and distributed shared-memory systems, that cache reuse may be more important than reducing communication, that it is possible to achieve message-passing performance using shared-memory constructs through careful data ordering and distribution, and that a hybrid MPI+OpenMP paradigm increases programming complexity with little performance gains. A implementation of CG on the Cray MTA does not require special ordering or partitioning to obtain high efficiency and scalability, giving it a distinct advantage for adaptive applications; however, it shows limited scalability for PCG due to a lack of thread level parallelism.
Optimization of sparse matrix-vector multiplication on emerging multicore platforms
Energy Technology Data Exchange (ETDEWEB)
Williams, Samuel [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Univ. of California, Berkeley, CA (United States); Oliker, Leonid [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Vuduc, Richard [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Shalf, John [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Yelick, Katherine [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Univ. of California, Berkeley, CA (United States); Demmel, James [Univ. of California, Berkeley, CA (United States)
2007-01-01
We are witnessing a dramatic change in computer architecture due to the multicore paradigm shift, as every electronic device from cell phones to supercomputers confronts parallelism of unprecedented scale. To fully unleash the potential of these systems, the HPC community must develop multicore specific optimization methodologies for important scientific computations. In this work, we examine sparse matrix-vector multiply (SpMV) - one of the most heavily used kernels in scientific computing - across a broad spectrum of multicore designs. Our experimental platform includes the homogeneous AMD dual-core and Intel quad-core designs, the heterogeneous STI Cell, as well as the first scientific study of the highly multithreaded Sun Niagara2. We present several optimization strategies especially effective for the multicore environment, and demonstrate significant performance improvements compared to existing state-of-the-art serial and parallel SpMV implementations. Additionally, we present key insights into the architectural tradeoffs of leading multicore design strategies, in the context of demanding memory-bound numerical algorithms.
Optimization of Sparse Matrix-Vector Multiplication on Emerging Multicore Platforms
Energy Technology Data Exchange (ETDEWEB)
Williams, Samuel; Oliker, Leonid; Vuduc, Richard; Shalf, John; Yelick, Katherine; Demmel, James
2008-10-16
We are witnessing a dramatic change in computer architecture due to the multicore paradigm shift, as every electronic device from cell phones to supercomputers confronts parallelism of unprecedented scale. To fully unleash the potential of these systems, the HPC community must develop multicore specific-optimization methodologies for important scientific computations. In this work, we examine sparse matrix-vector multiply (SpMV) - one of the most heavily used kernels in scientific computing - across a broad spectrum of multicore designs. Our experimental platform includes the homogeneous AMD quad-core, AMD dual-core, and Intel quad-core designs, the heterogeneous STI Cell, as well as one of the first scientific studies of the highly multithreaded Sun Victoria Falls (a Niagara2 SMP). We present several optimization strategies especially effective for the multicore environment, and demonstrate significant performance improvements compared to existing state-of-the-art serial and parallel SpMV implementations. Additionally, we present key insights into the architectural trade-offs of leading multicore design strategies, in the context of demanding memory-bound numerical algorithms.
Parallel External Memory Graph Algorithms
DEFF Research Database (Denmark)
Arge, Lars Allan; Goodrich, Michael T.; Sitchinava, Nodari
2010-01-01
In this paper, we study parallel I/O efficient graph algorithms in the Parallel External Memory (PEM) model, one o f the private-cache chip multiprocessor (CMP) models. We study the fundamental problem of list ranking which leads to efficient solutions to problems on trees, such as computing lowest...... an optimal speedup of Â¿(P) in parallel I/O complexity and parallel computation time, compared to the single-processor external memory counterparts....
Deformable segmentation via sparse representation and dictionary learning.
Zhang, Shaoting; Zhan, Yiqiang; Metaxas, Dimitris N
2012-10-01
"Shape" and "appearance", the two pillars of a deformable model, complement each other in object segmentation. In many medical imaging applications, while the low-level appearance information is weak or mis-leading, shape priors play a more important role to guide a correct segmentation, thanks to the strong shape characteristics of biological structures. Recently a novel shape prior modeling method has been proposed based on sparse learning theory. Instead of learning a generative shape model, shape priors are incorporated on-the-fly through the sparse shape composition (SSC). SSC is robust to non-Gaussian errors and still preserves individual shape characteristics even when such characteristics is not statistically significant. Although it seems straightforward to incorporate SSC into a deformable segmentation framework as shape priors, the large-scale sparse optimization of SSC has low runtime efficiency, which cannot satisfy clinical requirements. In this paper, we design two strategies to decrease the computational complexity of SSC, making a robust, accurate and efficient deformable segmentation system. (1) When the shape repository contains a large number of instances, which is often the case in 2D problems, K-SVD is used to learn a more compact but still informative shape dictionary. (2) If the derived shape instance has a large number of vertices, which often appears in 3D problems, an affinity propagation method is used to partition the surface into small sub-regions, on which the sparse shape composition is performed locally. Both strategies dramatically decrease the scale of the sparse optimization problem and hence speed up the algorithm. Our method is applied on a diverse set of biomedical image analysis problems. Compared to the original SSC, these two newly-proposed modules not only significant reduce the computational complexity, but also improve the overall accuracy. Copyright © 2012 Elsevier B.V. All rights reserved.
Parallel inter channel interaction mechanisms
International Nuclear Information System (INIS)
Jovic, V.; Afgan, N.; Jovic, L.
1995-01-01
Parallel channels interactions are examined. For experimental researches of nonstationary regimes flow in three parallel vertical channels results of phenomenon analysis and mechanisms of parallel channel interaction for adiabatic condition of one-phase fluid and two-phase mixture flow are shown. (author)
International Nuclear Information System (INIS)
Soltz, R; Vranas, P; Blumrich, M; Chen, D; Gara, A; Giampap, M; Heidelberger, P; Salapura, V; Sexton, J; Bhanot, G
2007-01-01
The theory of the strong nuclear force, Quantum Chromodynamics (QCD), can be numerically simulated from first principles on massively-parallel supercomputers using the method of Lattice Gauge Theory. We describe the special programming requirements of lattice QCD (LQCD) as well as the optimal supercomputer hardware architectures that it suggests. We demonstrate these methods on the BlueGene massively-parallel supercomputer and argue that LQCD and the BlueGene architecture are a natural match. This can be traced to the simple fact that LQCD is a regular lattice discretization of space into lattice sites while the BlueGene supercomputer is a discretization of space into compute nodes, and that both are constrained by requirements of locality. This simple relation is both technologically important and theoretically intriguing. The main result of this paper is the speedup of LQCD using up to 131,072 CPUs on the largest BlueGene/L supercomputer. The speedup is perfect with sustained performance of about 20% of peak. This corresponds to a maximum of 70.5 sustained TFlop/s. At these speeds LQCD and BlueGene are poised to produce the next generation of strong interaction physics theoretical results
A Parallel Butterfly Algorithm
Poulson, Jack; Demanet, Laurent; Maxwell, Nicholas; Ying, Lexing
2014-01-01
The butterfly algorithm is a fast algorithm which approximately evaluates a discrete analogue of the integral transform (Equation Presented.) at large numbers of target points when the kernel, K(x, y), is approximately low-rank when restricted to subdomains satisfying a certain simple geometric condition. In d dimensions with O(Nd) quasi-uniformly distributed source and target points, when each appropriate submatrix of K is approximately rank-r, the running time of the algorithm is at most O(r2Nd logN). A parallelization of the butterfly algorithm is introduced which, assuming a message latency of α and per-process inverse bandwidth of β, executes in at most (Equation Presented.) time using p processes. This parallel algorithm was then instantiated in the form of the open-source DistButterfly library for the special case where K(x, y) = exp(iΦ(x, y)), where Φ(x, y) is a black-box, sufficiently smooth, real-valued phase function. Experiments on Blue Gene/Q demonstrate impressive strong-scaling results for important classes of phase functions. Using quasi-uniform sources, hyperbolic Radon transforms, and an analogue of a three-dimensional generalized Radon transform were, respectively, observed to strong-scale from 1-node/16-cores up to 1024-nodes/16,384-cores with greater than 90% and 82% efficiency, respectively. © 2014 Society for Industrial and Applied Mathematics.
A Parallel Butterfly Algorithm
Poulson, Jack
2014-02-04
The butterfly algorithm is a fast algorithm which approximately evaluates a discrete analogue of the integral transform (Equation Presented.) at large numbers of target points when the kernel, K(x, y), is approximately low-rank when restricted to subdomains satisfying a certain simple geometric condition. In d dimensions with O(Nd) quasi-uniformly distributed source and target points, when each appropriate submatrix of K is approximately rank-r, the running time of the algorithm is at most O(r2Nd logN). A parallelization of the butterfly algorithm is introduced which, assuming a message latency of α and per-process inverse bandwidth of β, executes in at most (Equation Presented.) time using p processes. This parallel algorithm was then instantiated in the form of the open-source DistButterfly library for the special case where K(x, y) = exp(iΦ(x, y)), where Φ(x, y) is a black-box, sufficiently smooth, real-valued phase function. Experiments on Blue Gene/Q demonstrate impressive strong-scaling results for important classes of phase functions. Using quasi-uniform sources, hyperbolic Radon transforms, and an analogue of a three-dimensional generalized Radon transform were, respectively, observed to strong-scale from 1-node/16-cores up to 1024-nodes/16,384-cores with greater than 90% and 82% efficiency, respectively. © 2014 Society for Industrial and Applied Mathematics.
Fast parallel event reconstruction
CERN. Geneva
2010-01-01
On-line processing of large data volumes produced in modern HEP experiments requires using maximum capabilities of modern and future many-core CPU and GPU architectures.One of such powerful feature is a SIMD instruction set, which allows packing several data items in one register and to operate on all of them, thus achievingmore operations per clock cycle. Motivated by the idea of using the SIMD unit ofmodern processors, the KF based track fit has been adapted for parallelism, including memory optimization, numerical analysis, vectorization with inline operator overloading, and optimization using SDKs. The speed of the algorithm has been increased in 120000 times with 0.1 ms/track, running in parallel on 16 SPEs of a Cell Blade computer. Running on a Nehalem CPU with 8 cores it shows the processing speed of 52 ns/track using the Intel Threading Building Blocks. The same KF algorithm running on an Nvidia GTX 280 in the CUDA frameworkprovi...
A new parallelization algorithm of ocean model with explicit scheme
Fu, X. D.
2017-08-01
This paper will focus on the parallelization of ocean model with explicit scheme which is one of the most commonly used schemes in the discretization of governing equation of ocean model. The characteristic of explicit schema is that calculation is simple, and that the value of the given grid point of ocean model depends on the grid point at the previous time step, which means that one doesn’t need to solve sparse linear equations in the process of solving the governing equation of the ocean model. Aiming at characteristics of the explicit scheme, this paper designs a parallel algorithm named halo cells update with tiny modification of original ocean model and little change of space step and time step of the original ocean model, which can parallelize ocean model by designing transmission module between sub-domains. This paper takes the GRGO for an example to implement the parallelization of GRGO (Global Reduced Gravity Ocean model) with halo update. The result demonstrates that the higher speedup can be achieved at different problem size.
International Nuclear Information System (INIS)
DeHart, Mark D.; Williams, Mark L.; Bowman, Stephen M.
2010-01-01
The SCALE computational architecture has remained basically the same since its inception 30 years ago, although constituent modules and capabilities have changed significantly. This SCALE concept was intended to provide a framework whereby independent codes can be linked to provide a more comprehensive capability than possible with the individual programs - allowing flexibility to address a wide variety of applications. However, the current system was designed originally for mainframe computers with a single CPU and with significantly less memory than today's personal computers. It has been recognized that the present SCALE computation system could be restructured to take advantage of modern hardware and software capabilities, while retaining many of the modular features of the present system. Preliminary work is being done to define specifications and capabilities for a more advanced computational architecture. This paper describes the state of current SCALE development activities and plans for future development. With the release of SCALE 6.1 in 2010, a new phase of evolutionary development will be available to SCALE users within the TRITON and NEWT modules. The SCALE (Standardized Computer Analyses for Licensing Evaluation) code system developed by Oak Ridge National Laboratory (ORNL) provides a comprehensive and integrated package of codes and nuclear data for a wide range of applications in criticality safety, reactor physics, shielding, isotopic depletion and decay, and sensitivity/uncertainty (S/U) analysis. Over the last three years, since the release of version 5.1 in 2006, several important new codes have been introduced within SCALE, and significant advances applied to existing codes. Many of these new features became available with the release of SCALE 6.0 in early 2009. However, beginning with SCALE 6.1, a first generation of parallel computing is being introduced. In addition to near-term improvements, a plan for longer term SCALE enhancement
Feature selection and multi-kernel learning for sparse representation on a manifold
Wang, Jim Jing-Yan; Bensmail, Halima; Gao, Xin
2014-01-01
combination of some basic items in a dictionary. Gao etal. (2013) recently proposed Laplacian sparse coding by regularizing the sparse codes with an affinity graph. However, due to the noisy features and nonlinear distribution of the data samples, the affinity
Group sparse canonical correlation analysis for genomic data integration.
Lin, Dongdong; Zhang, Jigang; Li, Jingyao; Calhoun, Vince D; Deng, Hong-Wen; Wang, Yu-Ping
2013-08-12
The emergence of high-throughput genomic datasets from different sources and platforms (e.g., gene expression, single nucleotide polymorphisms (SNP), and copy number variation (CNV)) has greatly enhanced our understandings of the interplay of these genomic factors as well as their influences on the complex diseases. It is challenging to explore the relationship between these different types of genomic data sets. In this paper, we focus on a multivariate statistical method, canonical correlation analysis (CCA) method for this problem. Conventional CCA method does not work effectively if the number of data samples is significantly less than that of biomarkers, which is a typical case for genomic data (e.g., SNPs). Sparse CCA (sCCA) methods were introduced to overcome such difficulty, mostly using penalizations with l-1 norm (CCA-l1) or the combination of l-1and l-2 norm (CCA-elastic net). However, they overlook the structural or group effect within genomic data in the analysis, which often exist and are important (e.g., SNPs spanning a gene interact and work together as a group). We propose a new group sparse CCA method (CCA-sparse group) along with an effective numerical algorithm to study the mutual relationship between two different types of genomic data (i.e., SNP and gene expression). We then extend the model to a more general formulation that can include the existing sCCA models. We apply the model to feature/variable selection from two data sets and compare our group sparse CCA method with existing sCCA methods on both simulation and two real datasets (human gliomas data and NCI60 data). We use a graphical representation of the samples with a pair of canonical variates to demonstrate the discriminating characteristic of the selected features. Pathway analysis is further performed for biological interpretation of those features. The CCA-sparse group method incorporates group effects of features into the correlation analysis while performs individual feature
Parallel Polarization State Generation.
She, Alan; Capasso, Federico
2016-05-17
The control of polarization, an essential property of light, is of wide scientific and technological interest. The general problem of generating arbitrary time-varying states of polarization (SOP) has always been mathematically formulated by a series of linear transformations, i.e. a product of matrices, imposing a serial architecture. Here we show a parallel architecture described by a sum of matrices. The theory is experimentally demonstrated by modulating spatially-separated polarization components of a laser using a digital micromirror device that are subsequently beam combined. This method greatly expands the parameter space for engineering devices that control polarization. Consequently, performance characteristics, such as speed, stability, and spectral range, are entirely dictated by the technologies of optical intensity modulation, including absorption, reflection, emission, and scattering. This opens up important prospects for polarization state generation (PSG) with unique performance characteristics with applications in spectroscopic ellipsometry, spectropolarimetry, communications, imaging, and security.
Parallel imaging microfluidic cytometer.
Ehrlich, Daniel J; McKenna, Brian K; Evans, James G; Belkina, Anna C; Denis, Gerald V; Sherr, David H; Cheung, Man Ching
2011-01-01
By adding an additional degree of freedom from multichannel flow, the parallel microfluidic cytometer (PMC) combines some of the best features of fluorescence-activated flow cytometry (FCM) and microscope-based high-content screening (HCS). The PMC (i) lends itself to fast processing of large numbers of samples, (ii) adds a 1D imaging capability for intracellular localization assays (HCS), (iii) has a high rare-cell sensitivity, and (iv) has an unusual capability for time-synchronized sampling. An inability to practically handle large sample numbers has restricted applications of conventional flow cytometers and microscopes in combinatorial cell assays, network biology, and drug discovery. The PMC promises to relieve a bottleneck in these previously constrained applications. The PMC may also be a powerful tool for finding rare primary cells in the clinic. The multichannel architecture of current PMC prototypes allows 384 unique samples for a cell-based screen to be read out in ∼6-10 min, about 30 times the speed of most current FCM systems. In 1D intracellular imaging, the PMC can obtain protein localization using HCS marker strategies at many times for the sample throughput of charge-coupled device (CCD)-based microscopes or CCD-based single-channel flow cytometers. The PMC also permits the signal integration time to be varied over a larger range than is practical in conventional flow cytometers. The signal-to-noise advantages are useful, for example, in counting rare positive cells in the most difficult early stages of genome-wide screening. We review the status of parallel microfluidic cytometry and discuss some of the directions the new technology may take. Copyright © 2011 Elsevier Inc. All rights reserved.
An in-depth study of sparse codes on abnormality detection
DEFF Research Database (Denmark)
Ren, Huamin; Pan, Hong; Olsen, Søren Ingvor
2016-01-01
Sparse representation has been applied successfully in abnormal event detection, in which the baseline is to learn a dictionary accompanied by sparse codes. While much emphasis is put on discriminative dictionary construction, there are no comparative studies of sparse codes regarding abnormality...... are carried out from various angles to better understand the applicability of sparse codes, including computation time, reconstruction error, sparsity, detection accuracy, and their performance combining various detection methods. The experiment results show that combining OMP codes with maximum coordinate...
Cheng, Hong
2015-01-01
This unique text/reference presents a comprehensive review of the state of the art in sparse representations, modeling and learning. The book examines both the theoretical foundations and details of algorithm implementation, highlighting the practical application of compressed sensing research in visual recognition and computer vision. Topics and features: provides a thorough introduction to the fundamentals of sparse representation, modeling and learning, and the application of these techniques in visual recognition; describes sparse recovery approaches, robust and efficient sparse represen
Porting of the DBCSR library for Sparse Matrix-Matrix Multiplications to Intel Xeon Phi systems
Bethune, Iain; Gloess, Andeas; Hutter, Juerg; Lazzaro, Alfio; Pabst, Hans; Reid, Fiona
2017-01-01
Multiplication of two sparse matrices is a key operation in the simulation of the electronic structure of systems containing thousands of atoms and electrons. The highly optimized sparse linear algebra library DBCSR (Distributed Block Compressed Sparse Row) has been specifically designed to efficiently perform such sparse matrix-matrix multiplications. This library is the basic building block for linear scaling electronic structure theory and low scaling correlated methods in CP2K. It is para...
About Parallel Programming: Paradigms, Parallel Execution and Collaborative Systems
Directory of Open Access Journals (Sweden)
Loredana MOCEAN
2009-01-01
Full Text Available In the last years, there were made efforts for delineation of a stabile and unitary frame, where the problems of logical parallel processing must find solutions at least at the level of imperative languages. The results obtained by now are not at the level of the made efforts. This paper wants to be a little contribution at these efforts. We propose an overview in parallel programming, parallel execution and collaborative systems.
High-Order Sparse Linear Predictors for Audio Processing
DEFF Research Database (Denmark)
Giacobello, Daniele; van Waterschoot, Toon; Christensen, Mads Græsbøll
2010-01-01
Linear prediction has generally failed to make a breakthrough in audio processing, as it has done in speech processing. This is mostly due to its poor modeling performance, since an audio signal is usually an ensemble of different sources. Nevertheless, linear prediction comes with a whole set...... of interesting features that make the idea of using it in audio processing not far fetched, e.g., the strong ability of modeling the spectral peaks that play a dominant role in perception. In this paper, we provide some preliminary conjectures and experiments on the use of high-order sparse linear predictors...... in audio processing. These predictors, successfully implemented in modeling the short-term and long-term redundancies present in speech signals, will be used to model tonal audio signals, both monophonic and polyphonic. We will show how the sparse predictors are able to model efﬁciently the different...
Sparse Covariance Matrix Estimation by DCA-Based Algorithms.
Phan, Duy Nhat; Le Thi, Hoai An; Dinh, Tao Pham
2017-11-01
This letter proposes a novel approach using the [Formula: see text]-norm regularization for the sparse covariance matrix estimation (SCME) problem. The objective function of SCME problem is composed of a nonconvex part and the [Formula: see text] term, which is discontinuous and difficult to tackle. Appropriate DC (difference of convex functions) approximations of [Formula: see text]-norm are used that result in approximation SCME problems that are still nonconvex. DC programming and DCA (DC algorithm), powerful tools in nonconvex programming framework, are investigated. Two DC formulations are proposed and corresponding DCA schemes developed. Two applications of the SCME problem that are considered are classification via sparse quadratic discriminant analysis and portfolio optimization. A careful empirical experiment is performed through simulated and real data sets to study the performance of the proposed algorithms. Numerical results showed their efficiency and their superiority compared with seven state-of-the-art methods.
Sample size reduction in groundwater surveys via sparse data assimilation
Hussain, Z.
2013-04-01
In this paper, we focus on sparse signal recovery methods for data assimilation in groundwater models. The objective of this work is to exploit the commonly understood spatial sparsity in hydrodynamic models and thereby reduce the number of measurements to image a dynamic groundwater profile. To achieve this we employ a Bayesian compressive sensing framework that lets us adaptively select the next measurement to reduce the estimation error. An extension to the Bayesian compressive sensing framework is also proposed which incorporates the additional model information to estimate system states from even lesser measurements. Instead of using cumulative imaging-like measurements, such as those used in standard compressive sensing, we use sparse binary matrices. This choice of measurements can be interpreted as randomly sampling only a small subset of dug wells at each time step, instead of sampling the entire grid. Therefore, this framework offers groundwater surveyors a significant reduction in surveying effort without compromising the quality of the survey. © 2013 IEEE.
High Order Tensor Formulation for Convolutional Sparse Coding
Bibi, Adel Aamer
2017-12-25
Convolutional sparse coding (CSC) has gained attention for its successful role as a reconstruction and a classification tool in the computer vision and machine learning community. Current CSC methods can only reconstruct singlefeature 2D images independently. However, learning multidimensional dictionaries and sparse codes for the reconstruction of multi-dimensional data is very important, as it examines correlations among all the data jointly. This provides more capacity for the learned dictionaries to better reconstruct data. In this paper, we propose a generic and novel formulation for the CSC problem that can handle an arbitrary order tensor of data. Backed with experimental results, our proposed formulation can not only tackle applications that are not possible with standard CSC solvers, including colored video reconstruction (5D- tensors), but it also performs favorably in reconstruction with much fewer parameters as compared to naive extensions of standard CSC to multiple features/channels.
Sample size reduction in groundwater surveys via sparse data assimilation
Hussain, Z.; Muhammad, A.
2013-01-01
In this paper, we focus on sparse signal recovery methods for data assimilation in groundwater models. The objective of this work is to exploit the commonly understood spatial sparsity in hydrodynamic models and thereby reduce the number of measurements to image a dynamic groundwater profile. To achieve this we employ a Bayesian compressive sensing framework that lets us adaptively select the next measurement to reduce the estimation error. An extension to the Bayesian compressive sensing framework is also proposed which incorporates the additional model information to estimate system states from even lesser measurements. Instead of using cumulative imaging-like measurements, such as those used in standard compressive sensing, we use sparse binary matrices. This choice of measurements can be interpreted as randomly sampling only a small subset of dug wells at each time step, instead of sampling the entire grid. Therefore, this framework offers groundwater surveyors a significant reduction in surveying effort without compromising the quality of the survey. © 2013 IEEE.
Sparse logistic principal components analysis for binary data
Lee, Seokho
2010-09-01
We develop a new principal components analysis (PCA) type dimension reduction method for binary data. Different from the standard PCA which is defined on the observed data, the proposed PCA is defined on the logit transform of the success probabilities of the binary observations. Sparsity is introduced to the principal component (PC) loading vectors for enhanced interpretability and more stable extraction of the principal components. Our sparse PCA is formulated as solving an optimization problem with a criterion function motivated from a penalized Bernoulli likelihood. A Majorization-Minimization algorithm is developed to efficiently solve the optimization problem. The effectiveness of the proposed sparse logistic PCA method is illustrated by application to a single nucleotide polymorphism data set and a simulation study. © Institute ol Mathematical Statistics, 2010.
Statistical mechanics of semi-supervised clustering in sparse graphs
International Nuclear Information System (INIS)
Ver Steeg, Greg; Galstyan, Aram; Allahverdyan, Armen E
2011-01-01
We theoretically study semi-supervised clustering in sparse graphs in the presence of pair-wise constraints on the cluster assignments of nodes. We focus on bi-cluster graphs and study the impact of semi-supervision for varying constraint density and overlap between the clusters. Recent results for unsupervised clustering in sparse graphs indicate that there is a critical ratio of within-cluster and between-cluster connectivities below which clusters cannot be recovered with better than random accuracy. The goal of this paper is to examine the impact of pair-wise constraints on the clustering accuracy. Our results suggest that the addition of constraints does not provide automatic improvement over the unsupervised case. When the density of the constraints is sufficiently small, their only impact is to shift the detection threshold while preserving the criticality. Conversely, if the density of (hard) constraints is above the percolation threshold, the criticality is suppressed and the detection threshold disappears
A sparse electromagnetic imaging scheme using nonlinear landweber iterations
Desmal, Abdulla
2015-10-26
Development and use of electromagnetic inverse scattering techniques for imagining sparse domains have been on the rise following the recent advancements in solving sparse optimization problems. Existing techniques rely on iteratively converting the nonlinear forward scattering operator into a sequence of linear ill-posed operations (for example using the Born iterative method) and applying sparsity constraints to the linear minimization problem of each iteration through the use of L0/L1-norm penalty term (A. Desmal and H. Bagci, IEEE Trans. Antennas Propag, 7, 3878–3884, 2014, and IEEE Trans. Geosci. Remote Sens., 3, 532–536, 2015). It has been shown that these techniques produce more accurate and sharper images than their counterparts which solve a minimization problem constrained with smoothness promoting L2-norm penalty term. But these existing techniques are only applicable to investigation domains involving weak scatterers because the linearization process breaks down for high values of dielectric permittivity.
Semi-blind sparse image reconstruction with application to MRFM.
Park, Se Un; Dobigeon, Nicolas; Hero, Alfred O
2012-09-01
We propose a solution to the image deconvolution problem where the convolution kernel or point spread function (PSF) is assumed to be only partially known. Small perturbations generated from the model are exploited to produce a few principal components explaining the PSF uncertainty in a high-dimensional space. Unlike recent developments on blind deconvolution of natural images, we assume the image is sparse in the pixel basis, a natural sparsity arising in magnetic resonance force microscopy (MRFM). Our approach adopts a Bayesian Metropolis-within-Gibbs sampling framework. The performance of our Bayesian semi-blind algorithm for sparse images is superior to previously proposed semi-blind algorithms such as the alternating minimization algorithm and blind algorithms developed for natural images. We illustrate our myopic algorithm on real MRFM tobacco virus data.
Sparse data structure design for wavelet-based methods
Directory of Open Access Journals (Sweden)
Latu Guillaume
2011-12-01
Full Text Available This course gives an introduction to the design of efficient datatypes for adaptive wavelet-based applications. It presents some code fragments and benchmark technics useful to learn about the design of sparse data structures and adaptive algorithms. Material and practical examples are given, and they provide good introduction for anyone involved in the development of adaptive applications. An answer will be given to the question: how to implement and efficiently use the discrete wavelet transform in computer applications? A focus will be made on time-evolution problems, and use of wavelet-based scheme for adaptively solving partial differential equations (PDE. One crucial issue is that the benefits of the adaptive method in term of algorithmic cost reduction can not be wasted by overheads associated to sparse data management.
Split-Bregman-based sparse-view CT reconstruction
Energy Technology Data Exchange (ETDEWEB)
Vandeghinste, Bert; Vandenberghe, Stefaan [Ghent Univ. (Belgium). Medical Image and Signal Processing (MEDISIP); Goossens, Bart; Pizurica, Aleksandra; Philips, Wilfried [Ghent Univ. (Belgium). Image Processing and Interpretation Research Group (IPI); Beenhouwer, Jan de [Ghent Univ. (Belgium). Medical Image and Signal Processing (MEDISIP); Antwerp Univ., Wilrijk (Belgium). The Vision Lab; Staelens, Steven [Ghent Univ. (Belgium). Medical Image and Signal Processing (MEDISIP); Antwerp Univ., Edegem (Belgium). Molecular Imaging Centre Antwerp
2011-07-01
Total variation minimization has been extensively researched for image denoising and sparse view reconstruction. These methods show superior denoising performance for simple images with little texture, but result in texture information loss when applied to more complex images. It could thus be beneficial to use other regularizers within medical imaging. We propose a general regularization method, based on a split-Bregman approach. We show results for this framework combined with a total variation denoising operator, in comparison to ASD-POCS. We show that sparse-view reconstruction and noise regularization is possible. This general method will allow us to investigate other regularizers in the context of regularized CT reconstruction, and decrease the acquisition times in {mu}CT. (orig.)
Sparse canonical correlation analysis: new formulation and algorithm.
Chu, Delin; Liao, Li-Zhi; Ng, Michael K; Zhang, Xiaowei
2013-12-01
In this paper, we study canonical correlation analysis (CCA), which is a powerful tool in multivariate data analysis for finding the correlation between two sets of multidimensional variables. The main contributions of the paper are: 1) to reveal the equivalent relationship between a recursive formula and a trace formula for the multiple CCA problem, 2) to obtain the explicit characterization for all solutions of the multiple CCA problem even when the corresponding covariance matrices are singular, 3) to develop a new sparse CCA algorithm, and 4) to establish the equivalent relationship between the uncorrelated linear discriminant analysis and the CCA problem. We test several simulated and real-world datasets in gene classification and cross-language document retrieval to demonstrate the effectiveness of the proposed algorithm. The performance of the proposed method is competitive with the state-of-the-art sparse CCA algorithms.
Sparse Nonlinear Electromagnetic Imaging Accelerated With Projected Steepest Descent Algorithm
Desmal, Abdulla
2017-04-03
An efficient electromagnetic inversion scheme for imaging sparse 3-D domains is proposed. The scheme achieves its efficiency and accuracy by integrating two concepts. First, the nonlinear optimization problem is constrained using L₀ or L₁-norm of the solution as the penalty term to alleviate the ill-posedness of the inverse problem. The resulting Tikhonov minimization problem is solved using nonlinear Landweber iterations (NLW). Second, the efficiency of the NLW is significantly increased using a steepest descent algorithm. The algorithm uses a projection operator to enforce the sparsity constraint by thresholding the solution at every iteration. Thresholding level and iteration step are selected carefully to increase the efficiency without sacrificing the convergence of the algorithm. Numerical results demonstrate the efficiency and accuracy of the proposed imaging scheme in reconstructing sparse 3-D dielectric profiles.
Multi scales based sparse matrix spectral clustering image segmentation
Liu, Zhongmin; Chen, Zhicai; Li, Zhanming; Hu, Wenjin
2018-04-01
In image segmentation, spectral clustering algorithms have to adopt the appropriate scaling parameter to calculate the similarity matrix between the pixels, which may have a great impact on the clustering result. Moreover, when the number of data instance is large, computational complexity and memory use of the algorithm will greatly increase. To solve these two problems, we proposed a new spectral clustering image segmentation algorithm based on multi scales and sparse matrix. We devised a new feature extraction method at first, then extracted the features of image on different scales, at last, using the feature information to construct sparse similarity matrix which can improve the operation efficiency. Compared with traditional spectral clustering algorithm, image segmentation experimental results show our algorithm have better degree of accuracy and robustness.
Nonredundant sparse feature extraction using autoencoders with receptive fields clustering.
Ayinde, Babajide O; Zurada, Jacek M
2017-09-01
This paper proposes new techniques for data representation in the context of deep learning using agglomerative clustering. Existing autoencoder-based data representation techniques tend to produce a number of encoding and decoding receptive fields of layered autoencoders that are duplicative, thereby leading to extraction of similar features, thus resulting in filtering redundancy. We propose a way to address this problem and show that such redundancy can be eliminated. This yields smaller networks and produces unique receptive fields that extract distinct features. It is also shown that autoencoders with nonnegativity constraints on weights are capable of extracting fewer redundant features than conventional sparse autoencoders. The concept is illustrated using conventional sparse autoencoder and nonnegativity-constrained autoencoders with MNIST digits recognition, NORB normalized-uniform object data and Yale face dataset. Copyright © 2017 Elsevier Ltd. All rights reserved.
Greedy Algorithms for Nonnegativity-Constrained Simultaneous Sparse Recovery
Kim, Daeun; Haldar, Justin P.
2016-01-01
This work proposes a family of greedy algorithms to jointly reconstruct a set of vectors that are (i) nonnegative and (ii) simultaneously sparse with a shared support set. The proposed algorithms generalize previous approaches that were designed to impose these constraints individually. Similar to previous greedy algorithms for sparse recovery, the proposed algorithms iteratively identify promising support indices. In contrast to previous approaches, the support index selection procedure has been adapted to prioritize indices that are consistent with both the nonnegativity and shared support constraints. Empirical results demonstrate for the first time that the combined use of simultaneous sparsity and nonnegativity constraints can substantially improve recovery performance relative to existing greedy algorithms that impose less signal structure. PMID:26973368
Limited-memory trust-region methods for sparse relaxation
Adhikari, Lasith; DeGuchy, Omar; Erway, Jennifer B.; Lockhart, Shelby; Marcia, Roummel F.
2017-08-01
In this paper, we solve the l2-l1 sparse recovery problem by transforming the objective function of this problem into an unconstrained differentiable function and applying a limited-memory trust-region method. Unlike gradient projection-type methods, which uses only the current gradient, our approach uses gradients from previous iterations to obtain a more accurate Hessian approximation. Numerical experiments show that our proposed approach eliminates spurious solutions more effectively while improving computational time.
Obtaining sparse distributions in 2D inverse problems
Reci, A; Sederman, Andrew John; Gladden, Lynn Faith
2017-01-01
The mathematics of inverse problems has relevance across numerous estimation problems in science and engineering. L1 regularization has attracted recent attention in reconstructing the system properties in the case of sparse inverse problems; i.e., when the true property sought is not adequately described by a continuous distribution, in particular in Compressed Sensing image reconstruction. In this work, we focus on the application of L1 regularization to a class of inverse problems; relaxat...
Sparse optimization for inverse problems in atmospheric modelling
Czech Academy of Sciences Publication Activity Database
Adam, Lukáš; Branda, Martin
2016-01-01
Roč. 79, č. 3 (2016), s. 256-266 ISSN 1364-8152 R&D Projects: GA MŠk(CZ) 7F14287 Institutional support: RVO:67985556 Keywords : Inverse modelling * Sparse optimization * Integer optimization * Least squares * European tracer experiment * Free Matlab codes Subject RIV: BB - Applied Statistics, Operational Research Impact factor: 4.404, year: 2016 http://library.utia.cas.cz/separaty/2016/MTR/adam-0457037.pdf
Example-Based Image Colorization Using Locality Consistent Sparse Representation.
Bo Li; Fuchen Zhao; Zhuo Su; Xiangguo Liang; Yu-Kun Lai; Rosin, Paul L
2017-11-01
Image colorization aims to produce a natural looking color image from a given gray-scale image, which remains a challenging problem. In this paper, we propose a novel example-based image colorization method exploiting a new locality consistent sparse representation. Given a single reference color image, our method automatically colorizes the target gray-scale image by sparse pursuit. For efficiency and robustness, our method operates at the superpixel level. We extract low-level intensity features, mid-level texture features, and high-level semantic features for each superpixel, which are then concatenated to form its descriptor. The collection of feature vectors for all the superpixels from the reference image composes the dictionary. We formulate colorization of target superpixels as a dictionary-based sparse reconstruction problem. Inspired by the observation that superpixels with similar spatial location and/or feature representation are likely to match spatially close regions from the reference image, we further introduce a locality promoting regularization term into the energy formulation, which substantially improves the matching consistency and subsequent colorization results. Target superpixels are colorized based on the chrominance information from the dominant reference superpixels. Finally, to further improve coherence while preserving sharpness, we develop a new edge-preserving filter for chrominance channels with the guidance from the target gray-scale image. To the best of our knowledge, this is the first work on sparse pursuit image colorization from single reference images. Experimental results demonstrate that our colorization method outperforms the state-of-the-art methods, both visually and quantitatively using a user study.
Reliability of Broadcast Communications Under Sparse Random Linear Network Coding
Brown, Suzie; Johnson, Oliver; Tassi, Andrea
2018-01-01
Ultra-reliable Point-to-Multipoint (PtM) communications are expected to become pivotal in networks offering future dependable services for smart cities. In this regard, sparse Random Linear Network Coding (RLNC) techniques have been widely employed to provide an efficient way to improve the reliability of broadcast and multicast data streams. This paper addresses the pressing concern of providing a tight approximation to the probability of a user recovering a data stream protected by this kin...
Distributed coding of multiview sparse sources with joint recovery
DEFF Research Database (Denmark)
Luong, Huynh Van; Deligiannis, Nikos; Forchhammer, Søren
2016-01-01
In support of applications involving multiview sources in distributed object recognition using lightweight cameras, we propose a new method for the distributed coding of sparse sources as visual descriptor histograms extracted from multiview images. The problem is challenging due to the computati...... transform (SIFT) descriptors extracted from multiview images shows that our method leads to bit-rate saving of up to 43% compared to the state-of-the-art distributed compressed sensing method with independent encoding of the sources....
Optimal deep neural networks for sparse recovery via Laplace techniques
Limmer, Steffen; Stanczak, Slawomir
2017-01-01
This paper introduces Laplace techniques for designing a neural network, with the goal of estimating simplex-constraint sparse vectors from compressed measurements. To this end, we recast the problem of MMSE estimation (w.r.t. a pre-defined uniform input distribution) as the problem of computing the centroid of some polytope that results from the intersection of the simplex and an affine subspace determined by the measurements. Owing to the specific structure, it is shown that the centroid ca...
Shape prior modeling using sparse representation and online dictionary learning.
Zhang, Shaoting; Zhan, Yiqiang; Zhou, Yan; Uzunbas, Mustafa; Metaxas, Dimitris N
2012-01-01
The recently proposed sparse shape composition (SSC) opens a new avenue for shape prior modeling. Instead of assuming any parametric model of shape statistics, SSC incorporates shape priors on-the-fly by approximating a shape instance (usually derived from appearance cues) by a sparse combination of shapes in a training repository. Theoretically, one can increase the modeling capability of SSC by including as many training shapes in the repository. However, this strategy confronts two limitations in practice. First, since SSC involves an iterative sparse optimization at run-time, the more shape instances contained in the repository, the less run-time efficiency SSC has. Therefore, a compact and informative shape dictionary is preferred to a large shape repository. Second, in medical imaging applications, training shapes seldom come in one batch. It is very time consuming and sometimes infeasible to reconstruct the shape dictionary every time new training shapes appear. In this paper, we propose an online learning method to address these two limitations. Our method starts from constructing an initial shape dictionary using the K-SVD algorithm. When new training shapes come, instead of re-constructing the dictionary from the ground up, we update the existing one using a block-coordinates descent approach. Using the dynamically updated dictionary, sparse shape composition can be gracefully scaled up to model shape priors from a large number of training shapes without sacrificing run-time efficiency. Our method is validated on lung localization in X-Ray and cardiac segmentation in MRI time series. Compared to the original SSC, it shows comparable performance while being significantly more efficient.
Color normalization of histology slides using graph regularized sparse NMF
Sha, Lingdao; Schonfeld, Dan; Sethi, Amit
2017-03-01
Computer based automatic medical image processing and quantification are becoming popular in digital pathology. However, preparation of histology slides can vary widely due to differences in staining equipment, procedures and reagents, which can reduce the accuracy of algorithms that analyze their color and texture information. To re- duce the unwanted color variations, various supervised and unsupervised color normalization methods have been proposed. Compared with supervised color normalization methods, unsupervised color normalization methods have advantages of time and cost efficient and universal applicability. Most of the unsupervised color normaliza- tion methods for histology are based on stain separation. Based on the fact that stain concentration cannot be negative and different parts of the tissue absorb different stains, nonnegative matrix factorization (NMF), and particular its sparse version (SNMF), are good candidates for stain separation. However, most of the existing unsupervised color normalization method like PCA, ICA, NMF and SNMF fail to consider important information about sparse manifolds that its pixels occupy, which could potentially result in loss of texture information during color normalization. Manifold learning methods like Graph Laplacian have proven to be very effective in interpreting high-dimensional data. In this paper, we propose a novel unsupervised stain separation method called graph regularized sparse nonnegative matrix factorization (GSNMF). By considering the sparse prior of stain concentration together with manifold information from high-dimensional image data, our method shows better performance in stain color deconvolution than existing unsupervised color deconvolution methods, especially in keeping connected texture information. To utilized the texture information, we construct a nearest neighbor graph between pixels within a spatial area of an image based on their distances using heat kernal in lαβ space. The
Dentate Gyrus circuitry features improve performance of sparse approximation algorithms.
Directory of Open Access Journals (Sweden)
Panagiotis C Petrantonakis
Full Text Available Memory-related activity in the Dentate Gyrus (DG is characterized by sparsity. Memory representations are seen as activated neuronal populations of granule cells, the main encoding cells in DG, which are estimated to engage 2-4% of the total population. This sparsity is assumed to enhance the ability of DG to perform pattern separation, one of the most valuable contributions of DG during memory formation. In this work, we investigate how features of the DG such as its excitatory and inhibitory connectivity diagram can be used to develop theoretical algorithms performing Sparse Approximation, a widely used strategy in the Signal Processing field. Sparse approximation stands for the algorithmic identification of few components from a dictionary that approximate a certain signal. The ability of DG to achieve pattern separation by sparsifing its representations is exploited here to improve the performance of the state of the art sparse approximation algorithm "Iterative Soft Thresholding" (IST by adding new algorithmic features inspired by the DG circuitry. Lateral inhibition of granule cells, either direct or indirect, via mossy cells, is shown to enhance the performance of the IST. Apart from revealing the potential of DG-inspired theoretical algorithms, this work presents new insights regarding the function of particular cell types in the pattern separation task of the DG.
Superresolving Black Hole Images with Full-Closure Sparse Modeling
Crowley, Chelsea; Akiyama, Kazunori; Fish, Vincent
2018-01-01
It is believed that almost all galaxies have black holes at their centers. Imaging a black hole is a primary objective to answer scientific questions relating to relativistic accretion and jet formation. The Event Horizon Telescope (EHT) is set to capture images of two nearby black holes, Sagittarius A* at the center of the Milky Way galaxy roughly 26,000 light years away and the other M87 which is in Virgo A, a large elliptical galaxy that is 50 million light years away. Sparse imaging techniques have shown great promise for reconstructing high-fidelity superresolved images of black holes from simulated data. Previous work has included the effects of atmospheric phase errors and thermal noise, but not systematic amplitude errors that arise due to miscalibration. We explore a full-closure imaging technique with sparse modeling that uses closure amplitudes and closure phases to improve the imaging process. This new technique can successfully handle data with systematic amplitude errors. Applying our technique to synthetic EHT data of M87, we find that full-closure sparse modeling can reconstruct images better than traditional methods and recover key structural information on the source, such as the shape and size of the predicted photon ring. These results suggest that our new approach will provide superior imaging performance for data from the EHT and other interferometric arrays.
Robust visual tracking via multiscale deep sparse networks
Wang, Xin; Hou, Zhiqiang; Yu, Wangsheng; Xue, Yang; Jin, Zefenfen; Dai, Bo
2017-04-01
In visual tracking, deep learning with offline pretraining can extract more intrinsic and robust features. It has significant success solving the tracking drift in a complicated environment. However, offline pretraining requires numerous auxiliary training datasets and is considerably time-consuming for tracking tasks. To solve these problems, a multiscale sparse networks-based tracker (MSNT) under the particle filter framework is proposed. Based on the stacked sparse autoencoders and rectifier linear unit, the tracker has a flexible and adjustable architecture without the offline pretraining process and exploits the robust and powerful features effectively only through online training of limited labeled data. Meanwhile, the tracker builds four deep sparse networks of different scales, according to the target's profile type. During tracking, the tracker selects the matched tracking network adaptively in accordance with the initial target's profile type. It preserves the inherent structural information more efficiently than the single-scale networks. Additionally, a corresponding update strategy is proposed to improve the robustness of the tracker. Extensive experimental results on a large scale benchmark dataset show that the proposed method performs favorably against state-of-the-art methods in challenging environments.
An Improved Information Hiding Method Based on Sparse Representation
Directory of Open Access Journals (Sweden)
Minghai Yao
2015-01-01
Full Text Available A novel biometric authentication information hiding method based on the sparse representation is proposed for enhancing the security of biometric information transmitted in the network. In order to make good use of abundant information of the cover image, the sparse representation method is adopted to exploit the correlation between the cover and biometric images. Thus, the biometric image is divided into two parts. The first part is the reconstructed image, and the other part is the residual image. The biometric authentication image cannot be restored by any one part. The residual image and sparse representation coefficients are embedded into the cover image. Then, for the sake of causing much less attention of attackers, the visual attention mechanism is employed to select embedding location and embedding sequence of secret information. Finally, the reversible watermarking algorithm based on histogram is utilized for embedding the secret information. For verifying the validity of the algorithm, the PolyU multispectral palmprint and the CASIA iris databases are used as biometric information. The experimental results show that the proposed method exhibits good security, invisibility, and high capacity.
Two-dimensional sparse wavenumber recovery for guided wavefields
Sabeti, Soroosh; Harley, Joel B.
2018-04-01
The multi-modal and dispersive behavior of guided waves is often characterized by their dispersion curves, which describe their frequency-wavenumber behavior. In prior work, compressive sensing based techniques, such as sparse wavenumber analysis (SWA), have been capable of recovering dispersion curves from limited data samples. A major limitation of SWA, however, is the assumption that the structure is isotropic. As a result, SWA fails when applied to composites and other anisotropic structures. There have been efforts to address this issue in the literature, but they either are not easily generalizable or do not sufficiently express the data. In this paper, we enhance the existing approaches by employing a two-dimensional wavenumber model to account for direction-dependent velocities in anisotropic media. We integrate this model with tools from compressive sensing to reconstruct a wavefield from incomplete data. Specifically, we create a modified two-dimensional orthogonal matching pursuit algorithm that takes an undersampled wavefield image, with specified unknown elements, and determines its sparse wavenumber characteristics. We then recover the entire wavefield from the sparse representations obtained with our small number of data samples.
Low-Rank Sparse Coding for Image Classification
Zhang, Tianzhu; Ghanem, Bernard; Liu, Si; Xu, Changsheng; Ahuja, Narendra
2013-01-01
In this paper, we propose a low-rank sparse coding (LRSC) method that exploits local structure information among features in an image for the purpose of image-level classification. LRSC represents densely sampled SIFT descriptors, in a spatial neighborhood, collectively as low-rank, sparse linear combinations of code words. As such, it casts the feature coding problem as a low-rank matrix learning problem, which is different from previous methods that encode features independently. This LRSC has a number of attractive properties. (1) It encourages sparsity in feature codes, locality in codebook construction, and low-rankness for spatial consistency. (2) LRSC encodes local features jointly by considering their low-rank structure information, and is computationally attractive. We evaluate the LRSC by comparing its performance on a set of challenging benchmarks with that of 7 popular coding and other state-of-the-art methods. Our experiments show that by representing local features jointly, LRSC not only outperforms the state-of-the-art in classification accuracy but also improves the time complexity of methods that use a similar sparse linear representation model for feature coding.
On A Nonlinear Generalization of Sparse Coding and Dictionary Learning.
Xie, Yuchen; Ho, Jeffrey; Vemuri, Baba
2013-01-01
Existing dictionary learning algorithms are based on the assumption that the data are vectors in an Euclidean vector space ℝ d , and the dictionary is learned from the training data using the vector space structure of ℝ d and its Euclidean L 2 -metric. However, in many applications, features and data often originated from a Riemannian manifold that does not support a global linear (vector space) structure. Furthermore, the extrinsic viewpoint of existing dictionary learning algorithms becomes inappropriate for modeling and incorporating the intrinsic geometry of the manifold that is potentially important and critical to the application. This paper proposes a novel framework for sparse coding and dictionary learning for data on a Riemannian manifold, and it shows that the existing sparse coding and dictionary learning methods can be considered as special (Euclidean) cases of the more general framework proposed here. We show that both the dictionary and sparse coding can be effectively computed for several important classes of Riemannian manifolds, and we validate the proposed method using two well-known classification problems in computer vision and medical imaging analysis.
Low-Rank Sparse Coding for Image Classification
Zhang, Tianzhu
2013-12-01
In this paper, we propose a low-rank sparse coding (LRSC) method that exploits local structure information among features in an image for the purpose of image-level classification. LRSC represents densely sampled SIFT descriptors, in a spatial neighborhood, collectively as low-rank, sparse linear combinations of code words. As such, it casts the feature coding problem as a low-rank matrix learning problem, which is different from previous methods that encode features independently. This LRSC has a number of attractive properties. (1) It encourages sparsity in feature codes, locality in codebook construction, and low-rankness for spatial consistency. (2) LRSC encodes local features jointly by considering their low-rank structure information, and is computationally attractive. We evaluate the LRSC by comparing its performance on a set of challenging benchmarks with that of 7 popular coding and other state-of-the-art methods. Our experiments show that by representing local features jointly, LRSC not only outperforms the state-of-the-art in classification accuracy but also improves the time complexity of methods that use a similar sparse linear representation model for feature coding.
Efficient MATLAB computations with sparse and factored tensors.
Energy Technology Data Exchange (ETDEWEB)
Bader, Brett William; Kolda, Tamara Gibson (Sandia National Lab, Livermore, CA)
2006-12-01
In this paper, the term tensor refers simply to a multidimensional or N-way array, and we consider how specially structured tensors allow for efficient storage and computation. First, we study sparse tensors, which have the property that the vast majority of the elements are zero. We propose storing sparse tensors using coordinate format and describe the computational efficiency of this scheme for various mathematical operations, including those typical to tensor decomposition algorithms. Second, we study factored tensors, which have the property that they can be assembled from more basic components. We consider two specific types: a Tucker tensor can be expressed as the product of a core tensor (which itself may be dense, sparse, or factored) and a matrix along each mode, and a Kruskal tensor can be expressed as the sum of rank-1 tensors. We are interested in the case where the storage of the components is less than the storage of the full tensor, and we demonstrate that many elementary operations can be computed using only the components. All of the efficiencies described in this paper are implemented in the Tensor Toolbox for MATLAB.
Parallel Framework for Cooperative Processes
Directory of Open Access Journals (Sweden)
Mitică Craus
2005-01-01
Full Text Available This paper describes the work of an object oriented framework designed to be used in the parallelization of a set of related algorithms. The idea behind the system we are describing is to have a re-usable framework for running several sequential algorithms in a parallel environment. The algorithms that the framework can be used with have several things in common: they have to run in cycles and the work should be possible to be split between several "processing units". The parallel framework uses the message-passing communication paradigm and is organized as a master-slave system. Two applications are presented: an Ant Colony Optimization (ACO parallel algorithm for the Travelling Salesman Problem (TSP and an Image Processing (IP parallel algorithm for the Symmetrical Neighborhood Filter (SNF. The implementations of these applications by means of the parallel framework prove to have good performances: approximatively linear speedup and low communication cost.
Parallel Monte Carlo reactor neutronics
International Nuclear Information System (INIS)
Blomquist, R.N.; Brown, F.B.
1994-01-01
The issues affecting implementation of parallel algorithms for large-scale engineering Monte Carlo neutron transport simulations are discussed. For nuclear reactor calculations, these include load balancing, recoding effort, reproducibility, domain decomposition techniques, I/O minimization, and strategies for different parallel architectures. Two codes were parallelized and tested for performance. The architectures employed include SIMD, MIMD-distributed memory, and workstation network with uneven interactive load. Speedups linear with the number of nodes were achieved
Abdelfattah, Ahmad
2015-01-15
High performance computing (HPC) platforms are evolving to more heterogeneous configurations to support the workloads of various applications. The current hardware landscape is composed of traditional multicore CPUs equipped with hardware accelerators that can handle high levels of parallelism. Graphical Processing Units (GPUs) are popular high performance hardware accelerators in modern supercomputers. GPU programming has a different model than that for CPUs, which means that many numerical kernels have to be redesigned and optimized specifically for this architecture. GPUs usually outperform multicore CPUs in some compute intensive and massively parallel applications that have regular processing patterns. However, most scientific applications rely on crucial memory-bound kernels and may witness bottlenecks due to the overhead of the memory bus latency. They can still take advantage of the GPU compute power capabilities, provided that an efficient architecture-aware design is achieved. This dissertation presents a uniform design strategy for optimizing critical memory-bound kernels on GPUs. Based on hierarchical register blocking, double buffering and latency hiding techniques, this strategy leverages the performance of a wide range of standard numerical kernels found in dense and sparse linear algebra libraries. The work presented here focuses on matrix-vector multiplication kernels (MVM) as repre- sentative and most important memory-bound operations in this context. Each kernel inherits the benefits of the proposed strategies. By exposing a proper set of tuning parameters, the strategy is flexible enough to suit different types of matrices, ranging from large dense matrices, to sparse matrices with dense block structures, while high performance is maintained. Furthermore, the tuning parameters are used to maintain the relative performance across different GPU architectures. Multi-GPU acceleration is proposed to scale the performance on several devices. The
DEFF Research Database (Denmark)
Kosbar, Tamer R.; Sofan, Mamdouh A.; Waly, Mohamed A.
2015-01-01
about 6.1 °C when the TFO strand was modified with Z and the Watson-Crick strand with adenine-LNA (AL). The molecular modeling results showed that, in case of nucleobases Y and Z a hydrogen bond (1.69 and 1.72 Å, respectively) was formed between the protonated 3-aminopropyn-1-yl chain and one...... of the phosphate groups in Watson-Crick strand. Also, it was shown that the nucleobase Y made a good stacking and binding with the other nucleobases in the TFO and Watson-Crick duplex, respectively. In contrast, the nucleobase Z with LNA moiety was forced to twist out of plane of Watson-Crick base pair which......The phosphoramidites of DNA monomers of 7-(3-aminopropyn-1-yl)-8-aza-7-deazaadenine (Y) and 7-(3-aminopropyn-1-yl)-8-aza-7-deazaadenine LNA (Z) are synthesized, and the thermal stability at pH 7.2 and 8.2 of anti-parallel triplexes modified with these two monomers is determined. When, the anti...
Parallel consensual neural networks.
Benediktsson, J A; Sveinsson, J R; Ersoy, O K; Swain, P H
1997-01-01
A new type of a neural-network architecture, the parallel consensual neural network (PCNN), is introduced and applied in classification/data fusion of multisource remote sensing and geographic data. The PCNN architecture is based on statistical consensus theory and involves using stage neural networks with transformed input data. The input data are transformed several times and the different transformed data are used as if they were independent inputs. The independent inputs are first classified using the stage neural networks. The output responses from the stage networks are then weighted and combined to make a consensual decision. In this paper, optimization methods are used in order to weight the outputs from the stage networks. Two approaches are proposed to compute the data transforms for the PCNN, one for binary data and another for analog data. The analog approach uses wavelet packets. The experimental results obtained with the proposed approach show that the PCNN outperforms both a conjugate-gradient backpropagation neural network and conventional statistical methods in terms of overall classification accuracy of test data.
Fast Solution in Sparse LDA for Binary Classification
Moghaddam, Baback
2010-01-01
An algorithm that performs sparse linear discriminant analysis (Sparse-LDA) finds near-optimal solutions in far less time than the prior art when specialized to binary classification (of 2 classes). Sparse-LDA is a type of feature- or variable- selection problem with numerous applications in statistics, machine learning, computer vision, computational finance, operations research, and bio-informatics. Because of its combinatorial nature, feature- or variable-selection problems are NP-hard or computationally intractable in cases involving more than 30 variables or features. Therefore, one typically seeks approximate solutions by means of greedy search algorithms. The prior Sparse-LDA algorithm was a greedy algorithm that considered the best variable or feature to add/ delete to/ from its subsets in order to maximally discriminate between multiple classes of data. The present algorithm is designed for the special but prevalent case of 2-class or binary classification (e.g. 1 vs. 0, functioning vs. malfunctioning, or change versus no change). The present algorithm provides near-optimal solutions on large real-world datasets having hundreds or even thousands of variables or features (e.g. selecting the fewest wavelength bands in a hyperspectral sensor to do terrain classification) and does so in typical computation times of minutes as compared to days or weeks as taken by the prior art. Sparse LDA requires solving generalized eigenvalue problems for a large number of variable subsets (represented by the submatrices of the input within-class and between-class covariance matrices). In the general (fullrank) case, the amount of computation scales at least cubically with the number of variables and thus the size of the problems that can be solved is limited accordingly. However, in binary classification, the principal eigenvalues can be found using a special analytic formula, without resorting to costly iterative techniques. The present algorithm exploits this analytic
Directory of Open Access Journals (Sweden)
Daniel Marcsa
2015-01-01
Full Text Available The analysis and design of electromechanical devices involve the solution of large sparse linear systems, and require therefore high performance algorithms. In this paper, the primal Domain Decomposition Method (DDM with parallel forward-backward and with parallel Preconditioned Conjugate Gradient (PCG solvers are introduced in two-dimensional parallel time-stepping finite element formulation to analyze rotating machine considering the electromagnetic field, external circuit and rotor movement. The proposed parallel direct and the iterative solver with two preconditioners are analyzed concerning its computational efficiency and number of iterations of the solver with different preconditioners. Simulation results of a rotating machine is also presented.
A Parallel Particle Swarm Optimizer
National Research Council Canada - National Science Library
Schutte, J. F; Fregly, B .J; Haftka, R. T; George, A. D
2003-01-01
.... Motivated by a computationally demanding biomechanical system identification problem, we introduce a parallel implementation of a stochastic population based global optimizer, the Particle Swarm...
Patterns for Parallel Software Design
Ortega-Arjona, Jorge Luis
2010-01-01
Essential reading to understand patterns for parallel programming Software patterns have revolutionized the way we think about how software is designed, built, and documented, and the design of parallel software requires you to consider other particular design aspects and special skills. From clusters to supercomputers, success heavily depends on the design skills of software developers. Patterns for Parallel Software Design presents a pattern-oriented software architecture approach to parallel software design. This approach is not a design method in the classic sense, but a new way of managin
DEFF Research Database (Denmark)
Christensen, Mark Schram; Ehrsson, H Henrik; Nielsen, Jens Bo
2013-01-01
a different network, involving bilateral dorsal premotor cortex (PMd), primary motor cortex, and SMA, was more active when subjects viewed parallel movements while performing either symmetrical or parallel movements. Correlations between behavioral instability and brain activity were present in right lateral...... adduction-abduction movements symmetrically or in parallel with real-time congruent or incongruent visual feedback of the movements. One network, consisting of bilateral superior and middle frontal gyrus and supplementary motor area (SMA), was more active when subjects performed parallel movements, whereas...
Energy Technology Data Exchange (ETDEWEB)
Mehrotra, Sanjay [Northwestern Univ., Evanston, IL (United States)
2016-09-07
The support from this grant resulted in seven published papers and a technical report. Two papers are published in SIAM J. on Optimization [87, 88]; two papers are published in IEEE Transactions on Power Systems [77, 78]; one paper is published in Smart Grid [79]; one paper is published in Computational Optimization and Applications [44] and one in INFORMS J. on Computing [67]). The works in [44, 67, 87, 88] were funded primarily by this DOE grant. The applied papers in [77, 78, 79] were also supported through a subcontract from the Argonne National Lab. We start by presenting our main research results on the scenario generation problem in Sections 1–2. We present our algorithmic results on interior point methods for convex optimization problems in Section 3. We describe a new ‘central’ cutting surface algorithm developed for solving large scale convex programming problems (as is the case with our proposed research) with semi-infinite number of constraints in Section 4. In Sections 5–6 we present our work on two application problems of interest to DOE.
PARALLEL IMPORT: REALITY FOR RUSSIA
Directory of Open Access Journals (Sweden)
Т. А. Сухопарова
2014-01-01
Full Text Available Problem of parallel import is urgent question at now. Parallel import legalization in Russia is expedient. Such statement based on opposite experts opinion analysis. At the same time it’s necessary to negative consequences consider of this decision and to apply remedies to its minimization.Purchase on Elibrary.ru > Buy now
Parallel time domain solvers for electrically large transient scattering problems
Liu, Yang
2014-09-26
Marching on in time (MOT)-based integral equation solvers represent an increasingly appealing avenue for analyzing transient electromagnetic interactions with large and complex structures. MOT integral equation solvers for analyzing electromagnetic scattering from perfect electrically conducting objects are obtained by enforcing electric field boundary conditions and implicitly time advance electric surface current densities by iteratively solving sparse systems of equations at all time steps. Contrary to finite difference and element competitors, these solvers apply to nonlinear and multi-scale structures comprising geometrically intricate and deep sub-wavelength features residing atop electrically large platforms. Moreover, they are high-order accurate, stable in the low- and high-frequency limits, and applicable to conducting and penetrable structures represented by highly irregular meshes. This presentation reviews some recent advances in the parallel implementations of time domain integral equation solvers, specifically those that leverage multilevel plane-wave time-domain algorithm (PWTD) on modern manycore computer architectures including graphics processing units (GPUs) and distributed memory supercomputers. The GPU-based implementation achieves at least one order of magnitude speedups compared to serial implementations while the distributed parallel implementation are highly scalable to thousands of compute-nodes. A distributed parallel PWTD kernel has been adopted to solve time domain surface/volume integral equations (TDSIE/TDVIE) for analyzing transient scattering from large and complex-shaped perfectly electrically conducting (PEC)/dielectric objects involving ten million/tens of millions of spatial unknowns.
BCYCLIC: A parallel block tridiagonal matrix cyclic solver
Hirshman, S. P.; Perumalla, K. S.; Lynch, V. E.; Sanchez, R.
2010-09-01
A block tridiagonal matrix is factored with minimal fill-in using a cyclic reduction algorithm that is easily parallelized. Storage of the factored blocks allows the application of the inverse to multiple right-hand sides which may not be known at factorization time. Scalability with the number of block rows is achieved with cyclic reduction, while scalability with the block size is achieved using multithreaded routines (OpenMP, GotoBLAS) for block matrix manipulation. This dual scalability is a noteworthy feature of this new solver, as well as its ability to efficiently handle arbitrary (non-powers-of-2) block row and processor numbers. Comparison with a state-of-the art parallel sparse solver is presented. It is expected that this new solver will allow many physical applications to optimally use the parallel resources on current supercomputers. Example usage of the solver in magneto-hydrodynamic (MHD), three-dimensional equilibrium solvers for high-temperature fusion plasmas is cited.
The Galley Parallel File System
Nieuwejaar, Nils; Kotz, David
1996-01-01
Most current multiprocessor file systems are designed to use multiple disks in parallel, using the high aggregate bandwidth to meet the growing I/0 requirements of parallel scientific applications. Many multiprocessor file systems provide applications with a conventional Unix-like interface, allowing the application to access multiple disks transparently. This interface conceals the parallelism within the file system, increasing the ease of programmability, but making it difficult or impossible for sophisticated programmers and libraries to use knowledge about their I/O needs to exploit that parallelism. In addition to providing an insufficient interface, most current multiprocessor file systems are optimized for a different workload than they are being asked to support. We introduce Galley, a new parallel file system that is intended to efficiently support realistic scientific multiprocessor workloads. We discuss Galley's file structure and application interface, as well as the performance advantages offered by that interface.
Parallelization of the FLAPW method
International Nuclear Information System (INIS)
Canning, A.; Mannstadt, W.; Freeman, A.J.
1999-01-01
The FLAPW (full-potential linearized-augmented plane-wave) method is one of the most accurate first-principles methods for determining electronic and magnetic properties of crystals and surfaces. Until the present work, the FLAPW method has been limited to systems of less than about one hundred atoms due to a lack of an efficient parallel implementation to exploit the power and memory of parallel computers. In this work we present an efficient parallelization of the method by division among the processors of the plane-wave components for each state. The code is also optimized for RISC (reduced instruction set computer) architectures, such as those found on most parallel computers, making full use of BLAS (basic linear algebra subprograms) wherever possible. Scaling results are presented for systems of up to 686 silicon atoms and 343 palladium atoms per unit cell, running on up to 512 processors on a CRAY T3E parallel computer
Parallelization of the FLAPW method
Canning, A.; Mannstadt, W.; Freeman, A. J.
2000-08-01
The FLAPW (full-potential linearized-augmented plane-wave) method is one of the most accurate first-principles methods for determining structural, electronic and magnetic properties of crystals and surfaces. Until the present work, the FLAPW method has been limited to systems of less than about a hundred atoms due to the lack of an efficient parallel implementation to exploit the power and memory of parallel computers. In this work, we present an efficient parallelization of the method by division among the processors of the plane-wave components for each state. The code is also optimized for RISC (reduced instruction set computer) architectures, such as those found on most parallel computers, making full use of BLAS (basic linear algebra subprograms) wherever possible. Scaling results are presented for systems of up to 686 silicon atoms and 343 palladium atoms per unit cell, running on up to 512 processors on a CRAY T3E parallel supercomputer.
Social biases determine spatiotemporal sparseness of ciliate mating heuristics.
Clark, Kevin B
2012-01-01
Ciliates become highly social, even displaying animal-like qualities, in the joint presence of aroused conspecifics and nonself mating pheromones. Pheromone detection putatively helps trigger instinctual and learned courtship and dominance displays from which social judgments are made about the availability, compatibility, and fitness representativeness or likelihood of prospective mates and rivals. In earlier studies, I demonstrated the heterotrich Spirostomum ambiguum improves mating competence by effecting preconjugal strategies and inferences in mock social trials via behavioral heuristics built from Hebbian-like associative learning. Heuristics embody serial patterns of socially relevant action that evolve into ordered, topologically invariant computational networks supporting intra- and intermate selection. S. ambiguum employs heuristics to acquire, store, plan, compare, modify, select, and execute sets of mating propaganda. One major adaptive constraint over formation and use of heuristics involves a ciliate's initial subjective bias, responsiveness, or preparedness, as defined by Stevens' Law of subjective stimulus intensity, for perceiving the meaningfulness of mechanical pressures accompanying cell-cell contacts and additional perimating events. This bias controls durations and valences of nonassociative learning, search rates for appropriate mating strategies, potential net reproductive payoffs, levels of social honesty and deception, successful error diagnosis and correction of mating signals, use of insight or analysis to solve mating dilemmas, bioenergetics expenditures, and governance of mating decisions by classical or quantum statistical mechanics. I now report this same social bias also differentially affects the spatiotemporal sparseness, as measured with metric entropy, of ciliate heuristics. Sparseness plays an important role in neural systems through optimizing the specificity, efficiency, and capacity of memory representations. The present
Deploying temporary networks for upscaling of sparse network stations
Coopersmith, Evan J.; Cosh, Michael H.; Bell, Jesse E.; Kelly, Victoria; Hall, Mark; Palecki, Michael A.; Temimi, Marouane
2016-10-01
Soil observations networks at the national scale play an integral role in hydrologic modeling, drought assessment, agricultural decision support, and our ability to understand climate change. Understanding soil moisture variability is necessary to apply these measurements to model calibration, business and consumer applications, or even human health issues. The installation of soil moisture sensors as sparse, national networks is necessitated by limited financial resources. However, this results in the incomplete sampling of the local heterogeneity of soil type, vegetation cover, topography, and the fine spatial distribution of precipitation events. To this end, temporary networks can be installed in the areas surrounding a permanent installation within a sparse network. The temporary networks deployed in this study provide a more representative average at the 3 km and 9 km scales, localized about the permanent gauge. The value of such temporary networks is demonstrated at test sites in Millbrook, New York and Crossville, Tennessee. The capacity of a single U.S. Climate Reference Network (USCRN) sensor set to approximate the average of a temporary network at the 3 km and 9 km scales using a simple linear scaling function is tested. The capacity of a temporary network to provide reliable estimates with diminishing numbers of sensors, the temporal stability of those networks, and ultimately, the relationship of the variability of those networks to soil moisture conditions at the permanent sensor are investigated. In this manner, this work demonstrates the single-season installation of a temporary network as a mechanism to characterize the soil moisture variability at a permanent gauge within a sparse network.
Social biases determine spatiotemporal sparseness of ciliate mating heuristics
2012-01-01
Ciliates become highly social, even displaying animal-like qualities, in the joint presence of aroused conspecifics and nonself mating pheromones. Pheromone detection putatively helps trigger instinctual and learned courtship and dominance displays from which social judgments are made about the availability, compatibility, and fitness representativeness or likelihood of prospective mates and rivals. In earlier studies, I demonstrated the heterotrich Spirostomum ambiguum improves mating competence by effecting preconjugal strategies and inferences in mock social trials via behavioral heuristics built from Hebbian-like associative learning. Heuristics embody serial patterns of socially relevant action that evolve into ordered, topologically invariant computational networks supporting intra- and intermate selection. S. ambiguum employs heuristics to acquire, store, plan, compare, modify, select, and execute sets of mating propaganda. One major adaptive constraint over formation and use of heuristics involves a ciliate’s initial subjective bias, responsiveness, or preparedness, as defined by Stevens’ Law of subjective stimulus intensity, for perceiving the meaningfulness of mechanical pressures accompanying cell-cell contacts and additional perimating events. This bias controls durations and valences of nonassociative learning, search rates for appropriate mating strategies, potential net reproductive payoffs, levels of social honesty and deception, successful error diagnosis and correction of mating signals, use of insight or analysis to solve mating dilemmas, bioenergetics expenditures, and governance of mating decisions by classical or quantum statistical mechanics. I now report this same social bias also differentially affects the spatiotemporal sparseness, as measured with metric entropy, of ciliate heuristics. Sparseness plays an important role in neural systems through optimizing the specificity, efficiency, and capacity of memory representations. The
Fast Convolutional Sparse Coding in the Dual Domain
Affara, Lama Ahmed
2017-09-27
Convolutional sparse coding (CSC) is an important building block of many computer vision applications ranging from image and video compression to deep learning. We present two contributions to the state of the art in CSC. First, we significantly speed up the computation by proposing a new optimization framework that tackles the problem in the dual domain. Second, we extend the original formulation to higher dimensions in order to process a wider range of inputs, such as color inputs, or HOG features. Our results show a significant speedup compared to the current state of the art in CSC.
Oscillator Neural Network Retrieving Sparsely Coded Phase Patterns
Aoyagi, Toshio; Nomura, Masaki
1999-08-01
Little is known theoretically about the associative memory capabilities of neural networks in which information is encoded not only in the mean firing rate but also in the timing of firings. Particularly, in the case of sparsely coded patterns, it is biologically important to consider the timings of firings and to study how such consideration influences storage capacities and quality of recalled patterns. For this purpose, we propose a simple extended model of oscillator neural networks to allow for expression of a nonfiring state. Analyzing both equilibrium states and dynamical properties in recalling processes, we find that the system possesses good associative memory.
Sparse inverse covariance estimation with the graphical lasso.
Friedman, Jerome; Hastie, Trevor; Tibshirani, Robert
2008-07-01
We consider the problem of estimating sparse graphs by a lasso penalty applied to the inverse covariance matrix. Using a coordinate descent procedure for the lasso, we develop a simple algorithm--the graphical lasso--that is remarkably fast: It solves a 1000-node problem ( approximately 500,000 parameters) in at most a minute and is 30-4000 times faster than competing methods. It also provides a conceptual link between the exact problem and the approximation suggested by Meinshausen and Bühlmann (2006). We illustrate the method on some cell-signaling data from proteomics.
Fast Convolutional Sparse Coding in the Dual Domain
Affara, Lama Ahmed; Ghanem, Bernard; Wonka, Peter
2017-01-01
Convolutional sparse coding (CSC) is an important building block of many computer vision applications ranging from image and video compression to deep learning. We present two contributions to the state of the art in CSC. First, we significantly speed up the computation by proposing a new optimization framework that tackles the problem in the dual domain. Second, we extend the original formulation to higher dimensions in order to process a wider range of inputs, such as color inputs, or HOG features. Our results show a significant speedup compared to the current state of the art in CSC.
Blind spectrum reconstruction algorithm with L0-sparse representation
International Nuclear Information System (INIS)
Liu, Hai; Zhang, Zhaoli; Liu, Sanyan; Shu, Jiangbo; Liu, Tingting; Zhang, Tianxu
2015-01-01
Raman spectrum often suffers from band overlap and Poisson noise. This paper presents a new blind Poissonian Raman spectrum reconstruction method, which incorporates the L 0 -sparse prior together with the total variation constraint into the maximum a posteriori framework. Furthermore, the greedy analysis pursuit algorithm is adopted to solve the L 0 -based minimization problem. Simulated and real spectrum experimental results show that the proposed method can effectively preserve spectral structure and suppress noise. The reconstructed Raman spectra are easily used for interpreting unknown chemical mixtures. (paper)
Calculation of the inverse data space via sparse inversion
Saragiotis, Christos
2011-01-01
The inverse data space provides a natural separation of primaries and surface-related multiples, as the surface multiples map onto the area around the origin while the primaries map elsewhere. However, the calculation of the inverse data is far from trivial as theory requires infinite time and offset recording. Furthermore regularization issues arise during inversion. We perform the inversion by minimizing the least-squares norm of the misfit function by constraining the $ell_1$ norm of the solution, being the inverse data space. In this way a sparse inversion approach is obtained. We show results on field data with an application to surface multiple removal.
Dynamic Stochastic Superresolution of sparsely observed turbulent systems
International Nuclear Information System (INIS)
Branicki, M.; Majda, A.J.
2013-01-01
Real-time capture of the relevant features of the unresolved turbulent dynamics of complex natural systems from sparse noisy observations and imperfect models is a notoriously difficult problem. The resulting lack of observational resolution and statistical accuracy in estimating the important turbulent processes, which intermittently send significant energy to the large-scale fluctuations, hinders efficient parameterization and real-time prediction using discretized PDE models. This issue is particularly subtle and important when dealing with turbulent geophysical systems with an vast range of interacting spatio-temporal scales and rough energy spectra near the mesh scale of numerical models. Here, we introduce and study a suite of general Dynamic Stochastic Superresolution (DSS) algorithms and show that, by appropriately filtering sparse regular observations with the help of cheap stochastic exactly solvable models, one can derive stochastically ‘superresolved’ velocity fields and gain insight into the important characteristics of the unresolved dynamics, including the detection of the so-called black swans. The DSS algorithms operate in Fourier domain and exploit the fact that the coarse observation network aliases high-wavenumber information into the resolved waveband. It is shown that these cheap algorithms are robust and have significant skill on a test bed of turbulent solutions from realistic nonlinear turbulent spatially extended systems in the presence of a significant model error. In particular, the DSS algorithms are capable of successfully capturing time-localized extreme events in the unresolved modes, and they provide good and robust skill for recovery of the unresolved processes in terms of pattern correlation. Moreover, we show that DSS improves the skill for recovering the primary modes associated with the sparse observation mesh which is equally important in applications. The skill of the various DSS algorithms depends on the energy spectrum
Wind Noise Reduction using Non-negative Sparse Coding
DEFF Research Database (Denmark)
Schmidt, Mikkel N.; Larsen, Jan; Hsiao, Fu-Tien
2007-01-01
We introduce a new speaker independent method for reducing wind noise in single-channel recordings of noisy speech. The method is based on non-negative sparse coding and relies on a wind noise dictionary which is estimated from an isolated noise recording. We estimate the parameters of the model ...... and discuss their sensitivity. We then compare the algorithm with the classical spectral subtraction method and the Qualcomm-ICSI-OGI noise reduction method. We optimize the sound quality in terms of signal-to-noise ratio and provide results on a noisy speech recognition task....
Sparse least-squares reverse time migration using seislets
Dutta, Gaurav
2015-08-19
We propose sparse least-squares reverse time migration (LSRTM) using seislets as a basis for the reflectivity distribution. This basis is used along with a dip-constrained preconditioner that emphasizes image updates only along prominent dips during the iterations. These dips can be estimated from the standard migration image or from the gradient using plane-wave destruction filters or structural tensors. Numerical tests on synthetic datasets demonstrate the benefits of this method for mitigation of aliasing artifacts and crosstalk noise in multisource least-squares migration.
Predicting E-commerce Consumer Behaviour Using Sparse Session Data
Thorrud, Thorstein Kaldahl; Myklatun, Øyvind
2015-01-01
This thesis research consumer behavior in an e-commerce domain by using a data set of sparse session data collected from an anonymous European e-commerce site. The goal is to predict whether a consumer session results in a purchase, and if so, which items are purchased. The data is supplied by the ACM Recommender System Challenge, which is a yearly challenge held by the ACM Recommender System Conference. Classification is used for predicting whether or not a session made a purchase, as w...
Anisotropic Third-Order Regularization for Sparse Digital Elevation Models
Lellmann, Jan
2013-01-01
We consider the problem of interpolating a surface based on sparse data such as individual points or level lines. We derive interpolators satisfying a list of desirable properties with an emphasis on preserving the geometry and characteristic features of the contours while ensuring smoothness across level lines. We propose an anisotropic third-order model and an efficient method to adaptively estimate both the surface and the anisotropy. Our experiments show that the approach outperforms AMLE and higher-order total variation methods qualitatively and quantitatively on real-world digital elevation data. © 2013 Springer-Verlag.
Algorithms for sparse, symmetric, definite quadratic lambda-matrix eigenproblems
International Nuclear Information System (INIS)
Scott, D.S.; Ward, R.C.
1981-01-01
Methods are presented for computing eigenpairs of the quadratic lambda-matrix, M lambda 2 + C lambda + K, where M, C, and K are large and sparse, and have special symmetry-type properties. These properties are sufficient to insure that all the eigenvalues are real and that theory analogous to the standard symmetric eigenproblem exists. The methods employ some standard techniques such as partial tri-diagonalization via the Lanczos Method and subsequent eigenpair calculation, shift-and- invert strategy and subspace iteration. The methods also employ some new techniques such as Rayleigh-Ritz quadratic roots and the inertia of symmetric, definite, quadratic lambda-matrices
Iterative Sparse Channel Estimation and Decoding for Underwater MIMO-OFDM
Directory of Open Access Journals (Sweden)
Berger ChristianR
2010-01-01
Full Text Available We propose a block-by-block iterative receiver for underwater MIMO-OFDM that couples channel estimation with multiple-input multiple-output (MIMO detection and low-density parity-check (LDPC channel decoding. In particular, the channel estimator is based on a compressive sensing technique to exploit the channel sparsity, the MIMO detector consists of a hybrid use of successive interference cancellation and soft minimum mean-square error (MMSE equalization, and channel coding uses nonbinary LDPC codes. Various feedback strategies from the channel decoder to the channel estimator are studied, including full feedback of hard or soft symbol decisions, as well as their threshold-controlled versions. We study the receiver performance using numerical simulation and experimental data collected from the RACE08 and SPACE08 experiments. We find that iterative receiver processing including sparse channel estimation leads to impressive performance gains. These gains are more pronounced when the number of available pilots to estimate the channel is decreased, for example, when a fixed number of pilots is split between an increasing number of parallel data streams in MIMO transmission. For the various feedback strategies for iterative channel estimation, we observe that soft decision feedback slightly outperforms hard decision feedback.
Directory of Open Access Journals (Sweden)
Bérenger Bramas
2018-04-01
Full Text Available The sparse matrix-vector product (SpMV is a fundamental operation in many scientific applications from various fields. The High Performance Computing (HPC community has therefore continuously invested a lot of effort to provide an efficient SpMV kernel on modern CPU architectures. Although it has been shown that block-based kernels help to achieve high performance, they are difficult to use in practice because of the zero padding they require. In the current paper, we propose new kernels using the AVX-512 instruction set, which makes it possible to use a blocking scheme without any zero padding in the matrix memory storage. We describe mask-based sparse matrix formats and their corresponding SpMV kernels highly optimized in assembly language. Considering that the optimal blocking size depends on the matrix, we also provide a method to predict the best kernel to be used utilizing a simple interpolation of results from previous executions. We compare the performance of our approach to that of the Intel MKL CSR kernel and the CSR5 open-source package on a set of standard benchmark matrices. We show that we can achieve significant improvements in many cases, both for sequential and for parallel executions. Finally, we provide the corresponding code in an open source library, called SPC5.
Is Monte Carlo embarrassingly parallel?
Energy Technology Data Exchange (ETDEWEB)
Hoogenboom, J. E. [Delft Univ. of Technology, Mekelweg 15, 2629 JB Delft (Netherlands); Delft Nuclear Consultancy, IJsselzoom 2, 2902 LB Capelle aan den IJssel (Netherlands)
2012-07-01
Monte Carlo is often stated as being embarrassingly parallel. However, running a Monte Carlo calculation, especially a reactor criticality calculation, in parallel using tens of processors shows a serious limitation in speedup and the execution time may even increase beyond a certain number of processors. In this paper the main causes of the loss of efficiency when using many processors are analyzed using a simple Monte Carlo program for criticality. The basic mechanism for parallel execution is MPI. One of the bottlenecks turn out to be the rendez-vous points in the parallel calculation used for synchronization and exchange of data between processors. This happens at least at the end of each cycle for fission source generation in order to collect the full fission source distribution for the next cycle and to estimate the effective multiplication factor, which is not only part of the requested results, but also input to the next cycle for population control. Basic improvements to overcome this limitation are suggested and tested. Also other time losses in the parallel calculation are identified. Moreover, the threading mechanism, which allows the parallel execution of tasks based on shared memory using OpenMP, is analyzed in detail. Recommendations are given to get the maximum efficiency out of a parallel Monte Carlo calculation. (authors)
Is Monte Carlo embarrassingly parallel?
International Nuclear Information System (INIS)
Hoogenboom, J. E.
2012-01-01
Monte Carlo is often stated as being embarrassingly parallel. However, running a Monte Carlo calculation, especially a reactor criticality calculation, in parallel using tens of processors shows a serious limitation in speedup and the execution time may even increase beyond a certain number of processors. In this paper the main causes of the loss of efficiency when using many processors are analyzed using a simple Monte Carlo program for criticality. The basic mechanism for parallel execution is MPI. One of the bottlenecks turn out to be the rendez-vous points in the parallel calculation used for synchronization and exchange of data between processors. This happens at least at the end of each cycle for fission source generation in order to collect the full fission source distribution for the next cycle and to estimate the effective multiplication factor, which is not only part of the requested results, but also input to the next cycle for population control. Basic improvements to overcome this limitation are suggested and tested. Also other time losses in the parallel calculation are identified. Moreover, the threading mechanism, which allows the parallel execution of tasks based on shared memory using OpenMP, is analyzed in detail. Recommendations are given to get the maximum efficiency out of a parallel Monte Carlo calculation. (authors)
Parallel integer sorting with medium and fine-scale parallelism
Dagum, Leonardo
1993-01-01
Two new parallel integer sorting algorithms, queue-sort and barrel-sort, are presented and analyzed in detail. These algorithms do not have optimal parallel complexity, yet they show very good performance in practice. Queue-sort designed for fine-scale parallel architectures which allow the queueing of multiple messages to the same destination. Barrel-sort is designed for medium-scale parallel architectures with a high message passing overhead. The performance results from the implementation of queue-sort on a Connection Machine CM-2 and barrel-sort on a 128 processor iPSC/860 are given. The two implementations are found to be comparable in performance but not as good as a fully vectorized bucket sort on the Cray YMP.
Template based parallel checkpointing in a massively parallel computer system
Archer, Charles Jens [Rochester, MN; Inglett, Todd Alan [Rochester, MN
2009-01-13
A method and apparatus for a template based parallel checkpoint save for a massively parallel super computer system using a parallel variation of the rsync protocol, and network broadcast. In preferred embodiments, the checkpoint data for each node is compared to a template checkpoint file that resides in the storage and that was previously produced. Embodiments herein greatly decrease the amount of data that must be transmitted and stored for faster checkpointing and increased efficiency of the computer system. Embodiments are directed to a parallel computer system with nodes arranged in a cluster with a high speed interconnect that can perform broadcast communication. The checkpoint contains a set of actual small data blocks with their corresponding checksums from all nodes in the system. The data blocks may be compressed using conventional non-lossy data compression algorithms to further reduce the overall checkpoint size.
Energy Technology Data Exchange (ETDEWEB)
McLay, R.T.; Carey, G.F.
1996-12-31
In this study we consider parallel solution of sparse linear systems arising from discretized PDE`s. As part of our continuing work on our parallel PCG Solver package, we have made improvements in two areas. The first is improving the performance of the matrix-vector product. Here on regular finite-difference grids, we are able to use the cache memory more efficiently for smaller domains or where there are multiple degrees of freedom. The second problem of interest in the present work is the construction of preconditioners in the context of the parallel PCG solver we are developing. Here the problem is partitioned over a set of processors subdomains and the matrix-vector product for PCG is carried out in parallel for overlapping grid subblocks. For problems of scaled speedup, the actual rate of convergence of the unpreconditioned system deteriorates as the mesh is refined. Multigrid and subdomain strategies provide a logical approach to resolving the problem. We consider the parallel trade-offs between communication and computation and provide a complexity analysis of a representative algorithm. Some preliminary calculations using the parallel package and comparisons with other preconditioners are provided together with parallel performance results.
Parallel education: what is it?
Amos, Michelle Peta
2017-01-01
In the history of education it has long been discussed that single-sex and coeducation are the two models of education present in schools. With the introduction of parallel schools over the last 15 years, there has been very little research into this 'new model'. Many people do not understand what it means for a school to be parallel or they confuse a parallel model with co-education, due to the presence of both boys and girls within the one institution. Therefore, the main obj...
Balanced, parallel operation of flashlamps
International Nuclear Information System (INIS)
Carder, B.M.; Merritt, B.T.
1979-01-01
A new energy store, the Compensated Pulsed Alternator (CPA), promises to be a cost effective substitute for capacitors to drive flashlamps that pump large Nd:glass lasers. Because the CPA is large and discrete, it will be necessary that it drive many parallel flashlamp circuits, presenting a problem in equal current distribution. Current division to +- 20% between parallel flashlamps has been achieved, but this is marginal for laser pumping. A method is presented here that provides equal current sharing to about 1%, and it includes fused protection against short circuit faults. The method was tested with eight parallel circuits, including both open-circuit and short-circuit fault tests
A fast sparse reconstruction algorithm for electrical tomography
International Nuclear Information System (INIS)
Zhao, Jia; Xu, Yanbin; Tan, Chao; Dong, Feng
2014-01-01
Electrical tomography (ET) has been widely investigated due to its advantages of being non-radiative, low-cost and high-speed. However, the image reconstruction of ET is a nonlinear and ill-posed inverse problem and the imaging results are easily affected by measurement noise. A sparse reconstruction algorithm based on L 1 regularization is robust to noise and consequently provides a high quality of reconstructed images. In this paper, a sparse reconstruction by separable approximation algorithm (SpaRSA) is extended to solve the ET inverse problem. The algorithm is competitive with the fastest state-of-the-art algorithms in solving the standard L 2 −L 1 problem. However, it is computationally expensive when the dimension of the matrix is large. To further improve the calculation speed of solving inverse problems, a projection method based on the Krylov subspace is employed and combined with the SpaRSA algorithm. The proposed algorithm is tested with image reconstruction of electrical resistance tomography (ERT). Both simulation and experimental results demonstrate that the proposed method can reduce the computational time and improve the noise robustness for the image reconstruction. (paper)
A Modified Sparse Representation Method for Facial Expression Recognition
Directory of Open Access Journals (Sweden)
Wei Wang
2016-01-01
Full Text Available In this paper, we carry on research on a facial expression recognition method, which is based on modified sparse representation recognition (MSRR method. On the first stage, we use Haar-like+LPP to extract feature and reduce dimension. On the second stage, we adopt LC-K-SVD (Label Consistent K-SVD method to train the dictionary, instead of adopting directly the dictionary from samples, and add block dictionary training into the training process. On the third stage, stOMP (stagewise orthogonal matching pursuit method is used to speed up the convergence of OMP (orthogonal matching pursuit. Besides, a dynamic regularization factor is added to iteration process to suppress noises and enhance accuracy. We verify the proposed method from the aspect of training samples, dimension, feature extraction and dimension reduction methods and noises in self-built database and Japan’s JAFFE and CMU’s CK database. Further, we compare this sparse method with classic SVM and RVM and analyze the recognition effect and time efficiency. The result of simulation experiment has shown that the coefficient of MSRR method contains classifying information, which is capable of improving the computing speed and achieving a satisfying recognition result.
Sparse estimation of model-based diffuse thermal dust emission
Irfan, Melis O.; Bobin, Jérôme
2018-03-01
Component separation for the Planck High Frequency Instrument (HFI) data is primarily concerned with the estimation of thermal dust emission, which requires the separation of thermal dust from the cosmic infrared background (CIB). For that purpose, current estimation methods rely on filtering techniques to decouple thermal dust emission from CIB anisotropies, which tend to yield a smooth, low-resolution, estimation of the dust emission. In this paper, we present a new parameter estimation method, premise: Parameter Recovery Exploiting Model Informed Sparse Estimates. This method exploits the sparse nature of thermal dust emission to calculate all-sky maps of thermal dust temperature, spectral index, and optical depth at 353 GHz. premise is evaluated and validated on full-sky simulated data. We find the percentage difference between the premise results and the true values to be 2.8, 5.7, and 7.2 per cent at the 1σ level across the full sky for thermal dust temperature, spectral index, and optical depth at 353 GHz, respectively. A comparison between premise and a GNILC-like method over selected regions of our sky simulation reveals that both methods perform comparably within high signal-to-noise regions. However, outside of the Galactic plane, premise is seen to outperform the GNILC-like method with increasing success as the signal-to-noise ratio worsens.
Integrative sparse principal component analysis of gene expression data.
Liu, Mengque; Fan, Xinyan; Fang, Kuangnan; Zhang, Qingzhao; Ma, Shuangge
2017-12-01
In the analysis of gene expression data, dimension reduction techniques have been extensively adopted. The most popular one is perhaps the PCA (principal component analysis). To generate more reliable and more interpretable results, the SPCA (sparse PCA) technique has been developed. With the "small sample size, high dimensionality" characteristic of gene expression data, the analysis results generated from a single dataset are often unsatisfactory. Under contexts other than dimension reduction, integrative analysis techniques, which jointly analyze the raw data of multiple independent datasets, have been developed and shown to outperform "classic" meta-analysis and other multidatasets techniques and single-dataset analysis. In this study, we conduct integrative analysis by developing the iSPCA (integrative SPCA) method. iSPCA achieves the selection and estimation of sparse loadings using a group penalty. To take advantage of the similarity across datasets and generate more accurate results, we further impose contrasted penalties. Different penalties are proposed to accommodate different data conditions. Extensive simulations show that iSPCA outperforms the alternatives under a wide spectrum of settings. The analysis of breast cancer and pancreatic cancer data further shows iSPCA's satisfactory performance. © 2017 WILEY PERIODICALS, INC.
Visual recognition and inference using dynamic overcomplete sparse learning.
Murray, Joseph F; Kreutz-Delgado, Kenneth
2007-09-01
We present a hierarchical architecture and learning algorithm for visual recognition and other visual inference tasks such as imagination, reconstruction of occluded images, and expectation-driven segmentation. Using properties of biological vision for guidance, we posit a stochastic generative world model and from it develop a simplified world model (SWM) based on a tractable variational approximation that is designed to enforce sparse coding. Recent developments in computational methods for learning overcomplete representations (Lewicki & Sejnowski, 2000; Teh, Welling, Osindero, & Hinton, 2003) suggest that overcompleteness can be useful for visual tasks, and we use an overcomplete dictionary learning algorithm (Kreutz-Delgado, et al., 2003) as a preprocessing stage to produce accurate, sparse codings of images. Inference is performed by constructing a dynamic multilayer network with feedforward, feedback, and lateral connections, which is trained to approximate the SWM. Learning is done with a variant of the back-propagation-through-time algorithm, which encourages convergence to desired states within a fixed number of iterations. Vision tasks require large networks, and to make learning efficient, we take advantage of the sparsity of each layer to update only a small subset of elements in a large weight matrix at each iteration. Experiments on a set of rotated objects demonstrate various types of visual inference and show that increasing the degree of overcompleteness improves recognition performance in difficult scenes with occluded objects in clutter.
A preconditioned inexact newton method for nonlinear sparse electromagnetic imaging
Desmal, Abdulla
2015-03-01
A nonlinear inversion scheme for the electromagnetic microwave imaging of domains with sparse content is proposed. Scattering equations are constructed using a contrast-source (CS) formulation. The proposed method uses an inexact Newton (IN) scheme to tackle the nonlinearity of these equations. At every IN iteration, a system of equations, which involves the Frechet derivative (FD) matrix of the CS operator, is solved for the IN step. A sparsity constraint is enforced on the solution via thresholded Landweber iterations, and the convergence is significantly increased using a preconditioner that levels the FD matrix\\'s singular values associated with contrast and equivalent currents. To increase the accuracy, the weight of the regularization\\'s penalty term is reduced during the IN iterations consistently with the scheme\\'s quadratic convergence. At the end of each IN iteration, an additional thresholding, which removes small \\'ripples\\' that are produced by the IN step, is applied to maintain the solution\\'s sparsity. Numerical results demonstrate the applicability of the proposed method in recovering sparse and discontinuous dielectric profiles with high contrast values.
Synthesizing spatiotemporally sparse smartphone sensor data for bridge modal identification
Ozer, Ekin; Feng, Maria Q.
2016-08-01
Smartphones as vibration measurement instruments form a large-scale, citizen-induced, and mobile wireless sensor network (WSN) for system identification and structural health monitoring (SHM) applications. Crowdsourcing-based SHM is possible with a decentralized system granting citizens with operational responsibility and control. Yet, citizen initiatives introduce device mobility, drastically changing SHM results due to uncertainties in the time and the space domains. This paper proposes a modal identification strategy that fuses spatiotemporally sparse SHM data collected by smartphone-based WSNs. Multichannel data sampled with the time and the space independence is used to compose the modal identification parameters such as frequencies and mode shapes. Structural response time history can be gathered by smartphone accelerometers and converted into Fourier spectra by the processor units. Timestamp, data length, energy to power conversion address temporal variation, whereas spatial uncertainties are reduced by geolocation services or determining node identity via QR code labels. Then, parameters collected from each distributed network component can be extended to global behavior to deduce modal parameters without the need of a centralized and synchronous data acquisition system. The proposed method is tested on a pedestrian bridge and compared with a conventional reference monitoring system. The results show that the spatiotemporally sparse mobile WSN data can be used to infer modal parameters despite non-overlapping sensor operation schedule.
Ab initio nuclear structure - the large sparse matrix eigenvalue problem
Energy Technology Data Exchange (ETDEWEB)
Vary, James P; Maris, Pieter [Department of Physics, Iowa State University, Ames, IA, 50011 (United States); Ng, Esmond; Yang, Chao [Computational Research Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720 (United States); Sosonkina, Masha, E-mail: jvary@iastate.ed [Scalable Computing Laboratory, Ames Laboratory, Iowa State University, Ames, IA, 50011 (United States)
2009-07-01
The structure and reactions of light nuclei represent fundamental and formidable challenges for microscopic theory based on realistic strong interaction potentials. Several ab initio methods have now emerged that provide nearly exact solutions for some nuclear properties. The ab initio no core shell model (NCSM) and the no core full configuration (NCFC) method, frame this quantum many-particle problem as a large sparse matrix eigenvalue problem where one evaluates the Hamiltonian matrix in a basis space consisting of many-fermion Slater determinants and then solves for a set of the lowest eigenvalues and their associated eigenvectors. The resulting eigenvectors are employed to evaluate a set of experimental quantities to test the underlying potential. For fundamental problems of interest, the matrix dimension often exceeds 10{sup 10} and the number of nonzero matrix elements may saturate available storage on present-day leadership class facilities. We survey recent results and advances in solving this large sparse matrix eigenvalue problem. We also outline the challenges that lie ahead for achieving further breakthroughs in fundamental nuclear theory using these ab initio approaches.
Ab initio nuclear structure - the large sparse matrix eigenvalue problem
International Nuclear Information System (INIS)
Vary, James P; Maris, Pieter; Ng, Esmond; Yang, Chao; Sosonkina, Masha
2009-01-01
The structure and reactions of light nuclei represent fundamental and formidable challenges for microscopic theory based on realistic strong interaction potentials. Several ab initio methods have now emerged that provide nearly exact solutions for some nuclear properties. The ab initio no core shell model (NCSM) and the no core full configuration (NCFC) method, frame this quantum many-particle problem as a large sparse matrix eigenvalue problem where one evaluates the Hamiltonian matrix in a basis space consisting of many-fermion Slater determinants and then solves for a set of the lowest eigenvalues and their associated eigenvectors. The resulting eigenvectors are employed to evaluate a set of experimental quantities to test the underlying potential. For fundamental problems of interest, the matrix dimension often exceeds 10 10 and the number of nonzero matrix elements may saturate available storage on present-day leadership class facilities. We survey recent results and advances in solving this large sparse matrix eigenvalue problem. We also outline the challenges that lie ahead for achieving further breakthroughs in fundamental nuclear theory using these ab initio approaches.
Population coding in sparsely connected networks of noisy neurons.
Tripp, Bryan P; Orchard, Jeff
2012-01-01
This study examines the relationship between population coding and spatial connection statistics in networks of noisy neurons. Encoding of sensory information in the neocortex is thought to require coordinated neural populations, because individual cortical neurons respond to a wide range of stimuli, and exhibit highly variable spiking in response to repeated stimuli. Population coding is rooted in network structure, because cortical neurons receive information only from other neurons, and because the information they encode must be decoded by other neurons, if it is to affect behavior. However, population coding theory has often ignored network structure, or assumed discrete, fully connected populations (in contrast with the sparsely connected, continuous sheet of the cortex). In this study, we modeled a sheet of cortical neurons with sparse, primarily local connections, and found that a network with this structure could encode multiple internal state variables with high signal-to-noise ratio. However, we were unable to create high-fidelity networks by instantiating connections at random according to spatial connection probabilities. In our models, high-fidelity networks required additional structure, with higher cluster factors and correlations between the inputs to nearby neurons.
Population Coding in Sparsely Connected Networks of Noisy Neurons
Directory of Open Access Journals (Sweden)
Bryan Patrick Tripp
2012-05-01
Full Text Available This study examines the relationship between population coding and spatial connection statistics in networks of noisy neurons. Encoding of sensory information in the neocortex is thought to require coordinated neural populations, because individual cortical neurons respond to a wide range of stimuli, and exhibit highly variable spiking in response to repeated stimuli. Population coding is rooted in network structure, because cortical neurons receive information only from other neurons, and because the information they encode must be decoded by other neurons, if it is to affect behaviour. However, population coding theory has often ignored network structure, or assumed discrete, fully-connected populations (in contrast with the sparsely connected, continuous sheet of the cortex. In this study, we model a sheet of cortical neurons with sparse, primarily local connections, and find that a network with this structure can encode multiple internal state variables with high signal-to-noise ratio. However, in our model, although connection probability varies with the distance between neurons, we find that the connections cannot be instantiated at random according to these probabilities, but must have additional structure if information is to be encoded with high fidelity.
Pairwise Constraint-Guided Sparse Learning for Feature Selection.
Liu, Mingxia; Zhang, Daoqiang
2016-01-01
Feature selection aims to identify the most informative features for a compact and accurate data representation. As typical supervised feature selection methods, Lasso and its variants using L1-norm-based regularization terms have received much attention in recent studies, most of which use class labels as supervised information. Besides class labels, there are other types of supervised information, e.g., pairwise constraints that specify whether a pair of data samples belong to the same class (must-link constraint) or different classes (cannot-link constraint). However, most of existing L1-norm-based sparse learning methods do not take advantage of the pairwise constraints that provide us weak and more general supervised information. For addressing that problem, we propose a pairwise constraint-guided sparse (CGS) learning method for feature selection, where the must-link and the cannot-link constraints are used as discriminative regularization terms that directly concentrate on the local discriminative structure of data. Furthermore, we develop two variants of CGS, including: 1) semi-supervised CGS that utilizes labeled data, pairwise constraints, and unlabeled data and 2) ensemble CGS that uses the ensemble of pairwise constraint sets. We conduct a series of experiments on a number of data sets from University of California-Irvine machine learning repository, a gene expression data set, two real-world neuroimaging-based classification tasks, and two large-scale attribute classification tasks. Experimental results demonstrate the efficacy of our proposed methods, compared with several established feature selection methods.
Sparse tensor spherical harmonics approximation in radiative transfer
International Nuclear Information System (INIS)
Grella, K.; Schwab, Ch.
2011-01-01
The stationary monochromatic radiative transfer equation is a partial differential transport equation stated on a five-dimensional phase space. To obtain a well-posed problem, boundary conditions have to be prescribed on the inflow part of the domain boundary. We solve the equation with a multi-level Galerkin FEM in physical space and a spectral discretization with harmonics in solid angle and show that the benefits of the concept of sparse tensor products, known from the context of sparse grids, can also be leveraged in combination with a spectral discretization. Our method allows us to include high spectral orders without incurring the 'curse of dimension' of a five-dimensional computational domain. Neglecting boundary conditions, we find analytically that for smooth solutions, the convergence rate of the full tensor product method is retained in our method up to a logarithmic factor, while the number of degrees of freedom grows essentially only as fast as for the purely spatial problem. For the case with boundary conditions, we propose a splitting of the physical function space and a conforming tensorization. Numerical experiments in two physical and one angular dimension show evidence for the theoretical convergence rates to hold in the latter case as well.
Robust Pedestrian Classification Based on Hierarchical Kernel Sparse Representation
Directory of Open Access Journals (Sweden)
Rui Sun
2016-08-01
Full Text Available Vision-based pedestrian detection has become an active topic in computer vision and autonomous vehicles. It aims at detecting pedestrians appearing ahead of the vehicle using a camera so that autonomous vehicles can assess the danger and take action. Due to varied illumination and appearance, complex background and occlusion pedestrian detection in outdoor environments is a difficult problem. In this paper, we propose a novel hierarchical feature extraction and weighted kernel sparse representation model for pedestrian classification. Initially, hierarchical feature extraction based on a CENTRIST descriptor is used to capture discriminative structures. A max pooling operation is used to enhance the invariance of varying appearance. Then, a kernel sparse representation model is proposed to fully exploit the discrimination information embedded in the hierarchical local features, and a Gaussian weight function as the measure to effectively handle the occlusion in pedestrian images. Extensive experiments are conducted on benchmark databases, including INRIA, Daimler, an artificially generated dataset and a real occluded dataset, demonstrating the more robust performance of the proposed method compared to state-of-the-art pedestrian classification methods.
Sparse Representation Based Binary Hypothesis Model for Hyperspectral Image Classification
Directory of Open Access Journals (Sweden)
Yidong Tang
2016-01-01
Full Text Available The sparse representation based classifier (SRC and its kernel version (KSRC have been employed for hyperspectral image (HSI classification. However, the state-of-the-art SRC often aims at extended surface objects with linear mixture in smooth scene and assumes that the number of classes is given. Considering the small target with complex background, a sparse representation based binary hypothesis (SRBBH model is established in this paper. In this model, a query pixel is represented in two ways, which are, respectively, by background dictionary and by union dictionary. The background dictionary is composed of samples selected from the local dual concentric window centered at the query pixel. Thus, for each pixel the classification issue becomes an adaptive multiclass classification problem, where only the number of desired classes is required. Furthermore, the kernel method is employed to improve the interclass separability. In kernel space, the coding vector is obtained by using kernel-based orthogonal matching pursuit (KOMP algorithm. Then the query pixel can be labeled by the characteristics of the coding vectors. Instead of directly using the reconstruction residuals, the different impacts the background dictionary and union dictionary have on reconstruction are used for validation and classification. It enhances the discrimination and hence improves the performance.
Global sensitivity analysis using sparse grid interpolation and polynomial chaos
International Nuclear Information System (INIS)
Buzzard, Gregery T.
2012-01-01
Sparse grid interpolation is widely used to provide good approximations to smooth functions in high dimensions based on relatively few function evaluations. By using an efficient conversion from the interpolating polynomial provided by evaluations on a sparse grid to a representation in terms of orthogonal polynomials (gPC representation), we show how to use these relatively few function evaluations to estimate several types of sensitivity coefficients and to provide estimates on local minima and maxima. First, we provide a good estimate of the variance-based sensitivity coefficients of Sobol' (1990) [1] and then use the gradient of the gPC representation to give good approximations to the derivative-based sensitivity coefficients described by Kucherenko and Sobol' (2009) [2]. Finally, we use the package HOM4PS-2.0 given in Lee et al. (2008) [3] to determine the critical points of the interpolating polynomial and use these to determine the local minima and maxima of this polynomial. - Highlights: ► Efficient estimation of variance-based sensitivity coefficients. ► Efficient estimation of derivative-based sensitivity coefficients. ► Use of homotopy methods for approximation of local maxima and minima.
Sparse representation based image interpolation with nonlocal autoregressive modeling.
Dong, Weisheng; Zhang, Lei; Lukac, Rastislav; Shi, Guangming
2013-04-01
Sparse representation is proven to be a promising approach to image super-resolution, where the low-resolution (LR) image is usually modeled as the down-sampled version of its high-resolution (HR) counterpart after blurring. When the blurring kernel is the Dirac delta function, i.e., the LR image is directly down-sampled from its HR counterpart without blurring, the super-resolution problem becomes an image interpolation problem. In such cases, however, the conventional sparse representation models (SRM) become less effective, because the data fidelity term fails to constrain the image local structures. In natural images, fortunately, many nonlocal similar patches to a given patch could provide nonlocal constraint to the local structure. In this paper, we incorporate the image nonlocal self-similarity into SRM for image interpolation. More specifically, a nonlocal autoregressive model (NARM) is proposed and taken as the data fidelity term in SRM. We show that the NARM-induced sampling matrix is less coherent with the representation dictionary, and consequently makes SRM more effective for image interpolation. Our extensive experimental results demonstrate that the proposed NARM-based image interpolation method can effectively reconstruct the edge structures and suppress the jaggy/ringing artifacts, achieving the best image interpolation results so far in terms of PSNR as well as perceptual quality metrics such as SSIM and FSIM.
Robust sparse image reconstruction of radio interferometric observations with PURIFY
Pratley, Luke; McEwen, Jason D.; d'Avezac, Mayeul; Carrillo, Rafael E.; Onose, Alexandru; Wiaux, Yves
2018-01-01
Next-generation radio interferometers, such as the Square Kilometre Array, will revolutionize our understanding of the Universe through their unprecedented sensitivity and resolution. However, to realize these goals significant challenges in image and data processing need to be overcome. The standard methods in radio interferometry for reconstructing images, such as CLEAN, have served the community well over the last few decades and have survived largely because they are pragmatic. However, they produce reconstructed interferometric images that are limited in quality and scalability for big data. In this work, we apply and evaluate alternative interferometric reconstruction methods that make use of state-of-the-art sparse image reconstruction algorithms motivated by compressive sensing, which have been implemented in the PURIFY software package. In particular, we implement and apply the proximal alternating direction method of multipliers algorithm presented in a recent article. First, we assess the impact of the interpolation kernel used to perform gridding and degridding on sparse image reconstruction. We find that the Kaiser-Bessel interpolation kernel performs as well as prolate spheroidal wave functions while providing a computational saving and an analytic form. Secondly, we apply PURIFY to real interferometric observations from the Very Large Array and the Australia Telescope Compact Array and find that images recovered by PURIFY are of higher quality than those recovered by CLEAN. Thirdly, we discuss how PURIFY reconstructions exhibit additional advantages over those recovered by CLEAN. The latest version of PURIFY, with developments presented in this work, is made publicly available.
Sparse Detector Imaging Sensor with Two-Class Silhouette Classification
Directory of Open Access Journals (Sweden)
David Russomanno
2008-12-01
Full Text Available This paper presents the design and test of a simple active near-infrared sparse detector imaging sensor. The prototype of the sensor is novel in that it can capture remarkable silhouettes or profiles of a wide-variety of moving objects, including humans, animals, and vehicles using a sparse detector array comprised of only sixteen sensing elements deployed in a vertical configuration. The prototype sensor was built to collect silhouettes for a variety of objects and to evaluate several algorithms for classifying the data obtained from the sensor into two classes: human versus non-human. Initial tests show that the classification of individually sensed objects into two classes can be achieved with accuracy greater than ninety-nine percent (99% with a subset of the sixteen detectors using a representative dataset consisting of 512 signatures. The prototype also includes a Webservice interface such that the sensor can be tasked in a network-centric environment. The sensor appears to be a low-cost alternative to traditional, high-resolution focal plane array imaging sensors for some applications. After a power optimization study, appropriate packaging, and testing with more extensive datasets, the sensor may be a good candidate for deployment in vast geographic regions for a myriad of intelligent electronic fence and persistent surveillance applications, including perimeter security scenarios.
Image Classification Based on Convolutional Denoising Sparse Autoencoder
Directory of Open Access Journals (Sweden)
Shuangshuang Chen
2017-01-01
Full Text Available Image classification aims to group images into corresponding semantic categories. Due to the difficulties of interclass similarity and intraclass variability, it is a challenging issue in computer vision. In this paper, an unsupervised feature learning approach called convolutional denoising sparse autoencoder (CDSAE is proposed based on the theory of visual attention mechanism and deep learning methods. Firstly, saliency detection method is utilized to get training samples for unsupervised feature learning. Next, these samples are sent to the denoising sparse autoencoder (DSAE, followed by convolutional layer and local contrast normalization layer. Generally, prior in a specific task is helpful for the task solution. Therefore, a new pooling strategy—spatial pyramid pooling (SPP fused with center-bias prior—is introduced into our approach. Experimental results on the common two image datasets (STL-10 and CIFAR-10 demonstrate that our approach is effective in image classification. They also demonstrate that none of these three components: local contrast normalization, SPP fused with center-prior, and l2 vector normalization can be excluded from our proposed approach. They jointly improve image representation and classification performance.
Sparse image representation for jet neutron and gamma tomography
Energy Technology Data Exchange (ETDEWEB)
Craciunescu, T. [EURATOM-MEdC Association, Institute for Laser, Plasma and Radiation Physics, Bucharest (Romania); Kiptily, V. [EURATOM/CCFE Association, Culham Science Centre, Abingdon (United Kingdom); Murari, A. [Consorzio RFX, Associazione EURATOM-ENEA per la Fusione, Padova (Italy); Tiseanu, I.; Zoita, V. [EURATOM-MEdC Association, Institute for Laser, Plasma and Radiation Physics, Bucharest (Romania)
2013-10-15
Highlights: •A new tomographic method for the reconstruction of the 2-D neutron and gamma emissivity on JET. •The method is based on the sparse representation of the reconstructed image in an over-complete dictionary. •Several techniques, based on a priori information are used to regularize this highly limited data set tomographic problem. •The proposed method provides good reconstructions in terms of shapes and resolution. -- Abstract: The JET gamma/neutron profile monitor plasma coverage of the emissive region enables tomographic reconstruction. However, due to the availability of only two projection angles and to the coarse sampling, tomography is a highly limited data set problem. A new reconstruction method, based on the sparse representation of the reconstructed image in an over-complete dictionary, has been developed and applied to JET neutron/gamma tomography. The method has been tested on JET experimental data and significant results are presented. The proposed method provides good reconstructions in terms of shapes and resolution.
Pedestrian detection from thermal images: A sparse representation based approach
Qi, Bin; John, Vijay; Liu, Zheng; Mita, Seiichi
2016-05-01
Pedestrian detection, a key technology in computer vision, plays a paramount role in the applications of advanced driver assistant systems (ADASs) and autonomous vehicles. The objective of pedestrian detection is to identify and locate people in a dynamic environment so that accidents can be avoided. With significant variations introduced by illumination, occlusion, articulated pose, and complex background, pedestrian detection is a challenging task for visual perception. Different from visible images, thermal images are captured and presented with intensity maps based objects' emissivity, and thus have an enhanced spectral range to make human beings perceptible from the cool background. In this study, a sparse representation based approach is proposed for pedestrian detection from thermal images. We first adopted the histogram of sparse code to represent image features and then detect pedestrian with the extracted features in an unimodal and a multimodal framework respectively. In the unimodal framework, two types of dictionaries, i.e. joint dictionary and individual dictionary, are built by learning from prepared training samples. In the multimodal framework, a weighted fusion scheme is proposed to further highlight the contributions from features with higher separability. To validate the proposed approach, experiments were conducted to compare with three widely used features: Haar wavelets (HWs), histogram of oriented gradients (HOG), and histogram of phase congruency (HPC) as well as two classification methods, i.e. AdaBoost and support vector machine (SVM). Experimental results on a publicly available data set demonstrate the superiority of the proposed approach.
Sparse approximation of currents for statistics on curves and surfaces.
Durrleman, Stanley; Pennec, Xavier; Trouvé, Alain; Ayache, Nicholas
2008-01-01
Computing, processing, visualizing statistics on shapes like curves or surfaces is a real challenge with many applications ranging from medical image analysis to computational geometry. Modelling such geometrical primitives with currents avoids feature-based approach as well as point-correspondence method. This framework has been proved to be powerful to register brain surfaces or to measure geometrical invariants. However, if the state-of-the-art methods perform efficiently pairwise registrations, new numerical schemes are required to process groupwise statistics due to an increasing complexity when the size of the database is growing. Statistics such as mean and principal modes of a set of shapes often have a heavy and highly redundant representation. We propose therefore to find an adapted basis on which mean and principal modes have a sparse decomposition. Besides the computational improvement, this sparse representation offers a way to visualize and interpret statistics on currents. Experiments show the relevance of the approach on 34 sets of 70 sulcal lines and on 50 sets of 10 meshes of deep brain structures.
iSAP: Interactive Sparse Astronomical Data Analysis Packages
Fourt, O.; Starck, J.-L.; Sureau, F.; Bobin, J.; Moudden, Y.; Abrial, P.; Schmitt, J.
2013-03-01
iSAP consists of three programs, written in IDL, which together are useful for spherical data analysis. MR/S (MultiResolution on the Sphere) contains routines for wavelet, ridgelet and curvelet transform on the sphere, and applications such denoising on the sphere using wavelets and/or curvelets, Gaussianity tests and Independent Component Analysis on the Sphere. MR/S has been designed for the PLANCK project, but can be used for many other applications. SparsePol (Polarized Spherical Wavelets and Curvelets) has routines for polarized wavelet, polarized ridgelet and polarized curvelet transform on the sphere, and applications such denoising on the sphere using wavelets and/or curvelets, Gaussianity tests and blind source separation on the Sphere. SparsePol has been designed for the PLANCK project. MS-VSTS (Multi-Scale Variance Stabilizing Transform on the Sphere), designed initially for the FERMI project, is useful for spherical mono-channel and multi-channel data analysis when the data are contaminated by a Poisson noise. It contains routines for wavelet/curvelet denoising, wavelet deconvolution, multichannel wavelet denoising and deconvolution.
Application of alternating decision trees in selecting sparse linear solvers
Bhowmick, Sanjukta; Eijkhout, Victor; Freund, Yoav; Fuentes, Erika; Keyes, David E.
2010-01-01
The solution of sparse linear systems, a fundamental and resource-intensive task in scientific computing, can be approached through multiple algorithms. Using an algorithm well adapted to characteristics of the task can significantly enhance the performance, such as reducing the time required for the operation, without compromising the quality of the result. However, the best solution method can vary even across linear systems generated in course of the same PDE-based simulation, thereby making solver selection a very challenging problem. In this paper, we use a machine learning technique, Alternating Decision Trees (ADT), to select efficient solvers based on the properties of sparse linear systems and runtime-dependent features, such as the stages of simulation. We demonstrate the effectiveness of this method through empirical results over linear systems drawn from computational fluid dynamics and magnetohydrodynamics applications. The results also demonstrate that using ADT can resolve the problem of over-fitting, which occurs when limited amount of data is available. © 2010 Springer Science+Business Media LLC.
BOES: Building Occupancy Estimation System using sparse ambient vibration monitoring
Pan, Shijia; Bonde, Amelie; Jing, Jie; Zhang, Lin; Zhang, Pei; Noh, Hae Young
2014-04-01
In this paper, we present a room-level building occupancy estimation system (BOES) utilizing low-resolution vibration sensors that are sparsely distributed. Many ubiquitous computing and building maintenance systems require fine-grained occupancy knowledge to enable occupant centric services and optimize space and energy utilization. The sensing infrastructure support for current occupancy estimation systems often requires multiple intrusive sensors per room, resulting in systems that are both costly to deploy and difficult to maintain. To address these shortcomings, we developed BOES. BOES utilizes sparse vibration sensors to track occupancy levels and activities. Our system has three major components. 1) It extracts features that distinguish occupant activities from noise prone ambient vibrations and detects human footsteps. 2) Using a sequence of footsteps, the system localizes and tracks individuals by observing changes in the sequences. It uses this tracking information to identify when an occupant leaves or enters a room. 3) The entering and leaving room information are combined with detected individual location information to update the room-level occupancy state of the building. Through validation experiments in two different buildings, our system was able to achieve 99.55% accuracy for event detection, less than three feet average error for localization, and 85% accuracy in occupancy counting.
Decentralized modal identification using sparse blind source separation
International Nuclear Information System (INIS)
Sadhu, A; Hazra, B; Narasimhan, S; Pandey, M D
2011-01-01
Popular ambient vibration-based system identification methods process information collected from a dense array of sensors centrally to yield the modal properties. In such methods, the need for a centralized processing unit capable of satisfying large memory and processing demands is unavoidable. With the advent of wireless smart sensor networks, it is now possible to process information locally at the sensor level, instead. The information at the individual sensor level can then be concatenated to obtain the global structure characteristics. A novel decentralized algorithm based on wavelet transforms to infer global structure mode information using measurements obtained using a small group of sensors at a time is proposed in this paper. The focus of the paper is on algorithmic development, while the actual hardware and software implementation is not pursued here. The problem of identification is cast within the framework of under-determined blind source separation invoking transformations of measurements to the time–frequency domain resulting in a sparse representation. The partial mode shape coefficients so identified are then combined to yield complete modal information. The transformations are undertaken using stationary wavelet packet transform (SWPT), yielding a sparse representation in the wavelet domain. Principal component analysis (PCA) is then performed on the resulting wavelet coefficients, yielding the partial mixing matrix coefficients from a few measurement channels at a time. This process is repeated using measurements obtained from multiple sensor groups, and the results so obtained from each group are concatenated to obtain the global modal characteristics of the structure
Decentralized modal identification using sparse blind source separation
Sadhu, A.; Hazra, B.; Narasimhan, S.; Pandey, M. D.
2011-12-01
Popular ambient vibration-based system identification methods process information collected from a dense array of sensors centrally to yield the modal properties. In such methods, the need for a centralized processing unit capable of satisfying large memory and processing demands is unavoidable. With the advent of wireless smart sensor networks, it is now possible to process information locally at the sensor level, instead. The information at the individual sensor level can then be concatenated to obtain the global structure characteristics. A novel decentralized algorithm based on wavelet transforms to infer global structure mode information using measurements obtained using a small group of sensors at a time is proposed in this paper. The focus of the paper is on algorithmic development, while the actual hardware and software implementation is not pursued here. The problem of identification is cast within the framework of under-determined blind source separation invoking transformations of measurements to the time-frequency domain resulting in a sparse representation. The partial mode shape coefficients so identified are then combined to yield complete modal information. The transformations are undertaken using stationary wavelet packet transform (SWPT), yielding a sparse representation in the wavelet domain. Principal component analysis (PCA) is then performed on the resulting wavelet coefficients, yielding the partial mixing matrix coefficients from a few measurement channels at a time. This process is repeated using measurements obtained from multiple sensor groups, and the results so obtained from each group are concatenated to obtain the global modal characteristics of the structure.
Sparse/Low Rank Constrained Reconstruction for Dynamic PET Imaging.
Directory of Open Access Journals (Sweden)
Xingjian Yu
Full Text Available In dynamic Positron Emission Tomography (PET, an estimate of the radio activity concentration is obtained from a series of frames of sinogram data taken at ranging in duration from 10 seconds to minutes under some criteria. So far, all the well-known reconstruction algorithms require known data statistical properties. It limits the speed of data acquisition, besides, it is unable to afford the separated information about the structure and the variation of shape and rate of metabolism which play a major role in improving the visualization of contrast for some requirement of the diagnosing in application. This paper presents a novel low rank-based activity map reconstruction scheme from emission sinograms of dynamic PET, termed as SLCR representing Sparse/Low Rank Constrained Reconstruction for Dynamic PET Imaging. In this method, the stationary background is formulated as a low rank component while variations between successive frames are abstracted to the sparse. The resulting nuclear norm and l1 norm related minimization problem can also be efficiently solved by many recently developed numerical methods. In this paper, the linearized alternating direction method is applied. The effectiveness of the proposed scheme is illustrated on three data sets.
Threshold partitioning of sparse matrices and applications to Markov chains
Energy Technology Data Exchange (ETDEWEB)
Choi, Hwajeong; Szyld, D.B. [Temple Univ., Philadelphia, PA (United States)
1996-12-31
It is well known that the order of the variables and equations of a large, sparse linear system influences the performance of classical iterative methods. In particular if, after a symmetric permutation, the blocks in the diagonal have more nonzeros, classical block methods have a faster asymptotic rate of convergence. In this paper, different ordering and partitioning algorithms for sparse matrices are presented. They are modifications of PABLO. In the new algorithms, in addition to the location of the nonzeros, the values of the entries are taken into account. The matrix resulting after the symmetric permutation has dense blocks along the diagonal, and small entries in the off-diagonal blocks. Parameters can be easily adjusted to obtain, for example, denser blocks, or blocks with elements of larger magnitude. In particular, when the matrices represent Markov chains, the permuted matrices are well suited for block iterative methods that find the corresponding probability distribution. Applications to three types of methods are explored: (1) Classical block methods, such as Block Gauss Seidel. (2) Preconditioned GMRES, where a block diagonal preconditioner is used. (3) Iterative aggregation method (also called aggregation/disaggregation) where the partition obtained from the ordering algorithm with certain parameters is used as an aggregation scheme. In all three cases, experiments are presented which illustrate the performance of the methods with the new orderings. The complexity of the new algorithms is linear in the number of nonzeros and the order of the matrix, and thus adding little computational effort to the overall solution.
Parallelization of the preconditioned IDR solver for modern multicore computer systems
Bessonov, O. A.; Fedoseyev, A. I.
2012-10-01
This paper present the analysis, parallelization and optimization approach for the large sparse matrix solver CNSPACK for modern multicore microprocessors. CNSPACK is an advanced solver successfully used for coupled solution of stiff problems arising in multiphysics applications such as CFD, semiconductor transport, kinetic and quantum problems. It employs iterative IDR algorithm with ILU preconditioning (user chosen ILU preconditioning order). CNSPACK has been successfully used during last decade for solving problems in several application areas, including fluid dynamics and semiconductor device simulation. However, there was a dramatic change in processor architectures and computer system organization in recent years. Due to this, performance criteria and methods have been revisited, together with involving the parallelization of the solver and preconditioner using Open MP environment. Results of the successful implementation for efficient parallelization are presented for the most advances computer system (Intel Core i7-9xx or two-processor Xeon 55xx/56xx).
Workspace Analysis for Parallel Robot
Directory of Open Access Journals (Sweden)
Ying Sun
2013-05-01
Full Text Available As a completely new-type of robot, the parallel robot possesses a lot of advantages that the serial robot does not, such as high rigidity, great load-carrying capacity, small error, high precision, small self-weight/load ratio, good dynamic behavior and easy control, hence its range is extended in using domain. In order to find workspace of parallel mechanism, the numerical boundary-searching algorithm based on the reverse solution of kinematics and limitation of link length has been introduced. This paper analyses position workspace, orientation workspace of parallel robot of the six degrees of freedom. The result shows: It is a main means to increase and decrease its workspace to change the length of branch of parallel mechanism; The radius of the movement platform has no effect on the size of workspace, but will change position of workspace.
"Feeling" Series and Parallel Resistances.
Morse, Robert A.
1993-01-01
Equipped with drinking straws and stirring straws, a teacher can help students understand how resistances in electric circuits combine in series and in parallel. Follow-up suggestions are provided. (ZWH)
Parallel encoders for pixel detectors
International Nuclear Information System (INIS)
Nikityuk, N.M.
1991-01-01
A new method of fast encoding and determining the multiplicity and coordinates of fired pixels is described. A specific example construction of parallel encodes and MCC for n=49 and t=2 is given. 16 refs.; 6 figs.; 2 tabs
Massively Parallel Finite Element Programming
Heister, Timo
2010-01-01
Today\\'s large finite element simulations require parallel algorithms to scale on clusters with thousands or tens of thousands of processor cores. We present data structures and algorithms to take advantage of the power of high performance computers in generic finite element codes. Existing generic finite element libraries often restrict the parallelization to parallel linear algebra routines. This is a limiting factor when solving on more than a few hundreds of cores. We describe routines for distributed storage of all major components coupled with efficient, scalable algorithms. We give an overview of our effort to enable the modern and generic finite element library deal.II to take advantage of the power of large clusters. In particular, we describe the construction of a distributed mesh and develop algorithms to fully parallelize the finite element calculation. Numerical results demonstrate good scalability. © 2010 Springer-Verlag.
Event monitoring of parallel computations
Directory of Open Access Journals (Sweden)
Gruzlikov Alexander M.
2015-06-01
Full Text Available The paper considers the monitoring of parallel computations for detection of abnormal events. It is assumed that computations are organized according to an event model, and monitoring is based on specific test sequences
Massively Parallel Finite Element Programming
Heister, Timo; Kronbichler, Martin; Bangerth, Wolfgang
2010-01-01
Today's large finite element simulations require parallel algorithms to scale on clusters with thousands or tens of thousands of processor cores. We present data structures and algorithms to take advantage of the power of high performance computers in generic finite element codes. Existing generic finite element libraries often restrict the parallelization to parallel linear algebra routines. This is a limiting factor when solving on more than a few hundreds of cores. We describe routines for distributed storage of all major components coupled with efficient, scalable algorithms. We give an overview of our effort to enable the modern and generic finite element library deal.II to take advantage of the power of large clusters. In particular, we describe the construction of a distributed mesh and develop algorithms to fully parallelize the finite element calculation. Numerical results demonstrate good scalability. © 2010 Springer-Verlag.
The STAPL Parallel Graph Library
Harshvardhan,
2013-01-01
This paper describes the stapl Parallel Graph Library, a high-level framework that abstracts the user from data-distribution and parallelism details and allows them to concentrate on parallel graph algorithm development. It includes a customizable distributed graph container and a collection of commonly used parallel graph algorithms. The library introduces pGraph pViews that separate algorithm design from the container implementation. It supports three graph processing algorithmic paradigms, level-synchronous, asynchronous and coarse-grained, and provides common graph algorithms based on them. Experimental results demonstrate improved scalability in performance and data size over existing graph libraries on more than 16,000 cores and on internet-scale graphs containing over 16 billion vertices and 250 billion edges. © Springer-Verlag Berlin Heidelberg 2013.
Writing parallel programs that work
CERN. Geneva
2012-01-01
Serial algorithms typically run inefficiently on parallel machines. This may sound like an obvious statement, but it is the root cause of why parallel programming is considered to be difficult. The current state of the computer industry is still that almost all programs in existence are serial. This talk will describe the techniques used in the Intel Parallel Studio to provide a developer with the tools necessary to understand the behaviors and limitations of the existing serial programs. Once the limitations are known the developer can refactor the algorithms and reanalyze the resulting programs with the tools in the Intel Parallel Studio to create parallel programs that work. About the speaker Paul Petersen is a Sr. Principal Engineer in the Software and Solutions Group (SSG) at Intel. He received a Ph.D. degree in Computer Science from the University of Illinois in 1993. After UIUC, he was employed at Kuck and Associates, Inc. (KAI) working on auto-parallelizing compiler (KAP), and was involved in th...
Exploiting Symmetry on Parallel Architectures.
Stiller, Lewis Benjamin
1995-01-01
This thesis describes techniques for the design of parallel programs that solve well-structured problems with inherent symmetry. Part I demonstrates the reduction of such problems to generalized matrix multiplication by a group-equivariant matrix. Fast techniques for this multiplication are described, including factorization, orbit decomposition, and Fourier transforms over finite groups. Our algorithms entail interaction between two symmetry groups: one arising at the software level from the problem's symmetry and the other arising at the hardware level from the processors' communication network. Part II illustrates the applicability of our symmetry -exploitation techniques by presenting a series of case studies of the design and implementation of parallel programs. First, a parallel program that solves chess endgames by factorization of an associated dihedral group-equivariant matrix is described. This code runs faster than previous serial programs, and discovered it a number of results. Second, parallel algorithms for Fourier transforms for finite groups are developed, and preliminary parallel implementations for group transforms of dihedral and of symmetric groups are described. Applications in learning, vision, pattern recognition, and statistics are proposed. Third, parallel implementations solving several computational science problems are described, including the direct n-body problem, convolutions arising from molecular biology, and some communication primitives such as broadcast and reduce. Some of our implementations ran orders of magnitude faster than previous techniques, and were used in the investigation of various physical phenomena.
Parallel algorithms for continuum dynamics
International Nuclear Information System (INIS)
Hicks, D.L.; Liebrock, L.M.
1987-01-01
Simply porting existing parallel programs to a new parallel processor may not achieve the full speedup possible; to achieve the maximum efficiency may require redesigning the parallel algorithms for the specific architecture. The authors discuss here parallel algorithms that were developed first for the HEP processor and then ported to the CRAY X-MP/4, the ELXSI/10, and the Intel iPSC/32. Focus is mainly on the most recent parallel processing results produced, i.e., those on the Intel Hypercube. The applications are simulations of continuum dynamics in which the momentum and stress gradients are important. Examples of these are inertial confinement fusion experiments, severe breaks in the coolant system of a reactor, weapons physics, shock-wave physics. Speedup efficiencies on the Intel iPSC Hypercube are very sensitive to the ratio of communication to computation. Great care must be taken in designing algorithms for this machine to avoid global communication. This is much more critical on the iPSC than it was on the three previous parallel processors
Parallel computation of multigroup reactivity coefficient using iterative method
Susmikanti, Mike; Dewayatna, Winter
2013-09-01
One of the research activities to support the commercial radioisotope production program is a safety research target irradiation FPM (Fission Product Molybdenum). FPM targets form a tube made of stainless steel in which the nuclear degrees of superimposed high-enriched uranium. FPM irradiation tube is intended to obtain fission. The fission material widely used in the form of kits in the world of nuclear medicine. Irradiation FPM tube reactor core would interfere with performance. One of the disorders comes from changes in flux or reactivity. It is necessary to study a method for calculating safety terrace ongoing configuration changes during the life of the reactor, making the code faster became an absolute necessity. Neutron safety margin for the research reactor can be reused without modification to the calculation of the reactivity of the reactor, so that is an advantage of using perturbation method. The criticality and flux in multigroup diffusion model was calculate at various irradiation positions in some uranium content. This model has a complex computation. Several parallel algorithms with iterative method have been developed for the sparse and big matrix solution. The Black-Red Gauss Seidel Iteration and the power iteration parallel method can be used to solve multigroup diffusion equation system and calculated the criticality and reactivity coeficient. This research was developed code for reactivity calculation which used one of safety analysis with parallel processing. It can be done more quickly and efficiently by utilizing the parallel processing in the multicore computer. This code was applied for the safety limits calculation of irradiated targets FPM with increment Uranium.
Oliker, Leonid; Heber, Gerd; Biswas, Rupak
2000-01-01
The Conjugate Gradient (CG) algorithm is perhaps the best-known iterative technique to solve sparse linear systems that are symmetric and positive definite. A sparse matrix-vector multiply (SPMV) usually accounts for most of the floating-point operations within a CG iteration. In this paper, we investigate the effects of various ordering and partitioning strategies on the performance of parallel CG and SPMV using different programming paradigms and architectures. Results show that for this class of applications, ordering significantly improves overall performance, that cache reuse may be more important than reducing communication, and that it is possible to achieve message passing performance using shared memory constructs through careful data ordering and distribution. However, a multi-threaded implementation of CG on the Tera MTA does not require special ordering or partitioning to obtain high efficiency and scalability.
Domain decomposition parallel computing for transient two-phase flow of nuclear reactors
Energy Technology Data Exchange (ETDEWEB)
Lee, Jae Ryong; Yoon, Han Young [KAERI, Daejeon (Korea, Republic of); Choi, Hyoung Gwon [Seoul National University, Seoul (Korea, Republic of)
2016-05-15
KAERI (Korea Atomic Energy Research Institute) has been developing a multi-dimensional two-phase flow code named CUPID for multi-physics and multi-scale thermal hydraulics analysis of Light water reactors (LWRs). The CUPID code has been validated against a set of conceptual problems and experimental data. In this work, the CUPID code has been parallelized based on the domain decomposition method with Message passing interface (MPI) library. For domain decomposition, the CUPID code provides both manual and automatic methods with METIS library. For the effective memory management, the Compressed sparse row (CSR) format is adopted, which is one of the methods to represent the sparse asymmetric matrix. CSR format saves only non-zero value and its position (row and column). By performing the verification for the fundamental problem set, the parallelization of the CUPID has been successfully confirmed. Since the scalability of a parallel simulation is generally known to be better for fine mesh system, three different scales of mesh system are considered: 40000 meshes for coarse mesh system, 320000 meshes for mid-size mesh system, and 2560000 meshes for fine mesh system. In the given geometry, both single- and two-phase calculations were conducted. In addition, two types of preconditioners for a matrix solver were compared: Diagonal and incomplete LU preconditioner. In terms of enhancement of the parallel performance, the OpenMP and MPI hybrid parallel computing for a pressure solver was examined. It is revealed that the scalability of hybrid calculation was enhanced for the multi-core parallel computation.
Joint-2D-SL0 Algorithm for Joint Sparse Matrix Reconstruction
Directory of Open Access Journals (Sweden)
Dong Zhang
2017-01-01
Full Text Available Sparse matrix reconstruction has a wide application such as DOA estimation and STAP. However, its performance is usually restricted by the grid mismatch problem. In this paper, we revise the sparse matrix reconstruction model and propose the joint sparse matrix reconstruction model based on one-order Taylor expansion. And it can overcome the grid mismatch problem. Then, we put forward the Joint-2D-SL0 algorithm which can solve the joint sparse matrix reconstruction problem efficiently. Compared with the Kronecker compressive sensing method, our proposed method has a higher computational efficiency and acceptable reconstruction accuracy. Finally, simulation results validate the superiority of the proposed method.
Improved image registration by sparse patch-based deformation estimation.
Kim, Minjeong; Wu, Guorong; Wang, Qian; Lee, Seong-Whan; Shen, Dinggang
2015-01-15
Despite intensive efforts for decades, deformable image registration is still a challenging problem due to the potential large anatomical differences across individual images, which limits the registration performance. Fortunately, this issue could be alleviated if a good initial deformation can be provided for the two images under registration, which are often termed as the moving subject and the fixed template, respectively. In this work, we present a novel patch-based initial deformation prediction framework for improving the performance of existing registration algorithms. Our main idea is to estimate the initial deformation between subject and template in a patch-wise fashion by using the sparse representation technique. We argue that two image patches should follow the same deformation toward the template image if their patch-wise appearance patterns are similar. To this end, our framework consists of two stages, i.e., the training stage and the application stage. In the training stage, we register all training images to the pre-selected template, such that the deformation of each training image with respect to the template is known. In the application stage, we apply the following four steps to efficiently calculate the initial deformation field for the new test subject: (1) We pick a small number of key points in the distinctive regions of the test subject; (2) for each key point, we extract a local patch and form a coupled appearance-deformation dictionary from training images where each dictionary atom consists of the image intensity patch as well as their respective local deformations; (3) a small set of training image patches in the coupled dictionary are selected to represent the image patch of each subject key point by sparse representation. Then, we can predict the initial deformation for each subject key point by propagating the pre-estimated deformations on the selected training patches with the same sparse representation coefficients; and (4) we