WorldWideScience

Sample records for mcae sparse solver

  1. Parallel sparse direct solver for integrated circuit simulation

    CERN Document Server

    Chen, Xiaoming; Yang, Huazhong

    2017-01-01

    This book describes algorithmic methods and parallelization techniques to design a parallel sparse direct solver which is specifically targeted at integrated circuit simulation problems. The authors describe a complete flow and detailed parallel algorithms of the sparse direct solver. They also show how to improve the performance by simple but effective numerical techniques. The sparse direct solver techniques described can be applied to any SPICE-like integrated circuit simulator and have been proven to be high-performance in actual circuit simulation. Readers will benefit from the state-of-the-art parallel integrated circuit simulation techniques described in this book, especially the latest parallel sparse matrix solution techniques. · Introduces complicated algorithms of sparse linear solvers, using concise principles and simple examples, without complex theory or lengthy derivations; · Describes a parallel sparse direct solver that can be adopted to accelerate any SPICE-like integrated circuit simulato...

  2. A sparse-grid isogeometric solver

    KAUST Repository

    Beck, Joakim; Sangalli, Giancarlo; Tamellini, Lorenzo

    2018-01-01

    Isogeometric Analysis (IGA) typically adopts tensor-product splines and NURBS as a basis for the approximation of the solution of PDEs. In this work, we investigate to which extent IGA solvers can benefit from the so-called sparse-grids construction in its combination technique form, which was first introduced in the early 90’s in the context of the approximation of high-dimensional PDEs.The tests that we report show that, in accordance to the literature, a sparse-grid construction can indeed be useful if the solution of the PDE at hand is sufficiently smooth. Sparse grids can also be useful in the case of non-smooth solutions when some a-priori knowledge on the location of the singularities of the solution can be exploited to devise suitable non-equispaced meshes. Finally, we remark that sparse grids can be seen as a simple way to parallelize pre-existing serial IGA solvers in a straightforward fashion, which can be beneficial in many practical situations.

  3. A sparse-grid isogeometric solver

    KAUST Repository

    Beck, Joakim

    2018-02-28

    Isogeometric Analysis (IGA) typically adopts tensor-product splines and NURBS as a basis for the approximation of the solution of PDEs. In this work, we investigate to which extent IGA solvers can benefit from the so-called sparse-grids construction in its combination technique form, which was first introduced in the early 90’s in the context of the approximation of high-dimensional PDEs.The tests that we report show that, in accordance to the literature, a sparse-grid construction can indeed be useful if the solution of the PDE at hand is sufficiently smooth. Sparse grids can also be useful in the case of non-smooth solutions when some a-priori knowledge on the location of the singularities of the solution can be exploited to devise suitable non-equispaced meshes. Finally, we remark that sparse grids can be seen as a simple way to parallelize pre-existing serial IGA solvers in a straightforward fashion, which can be beneficial in many practical situations.

  4. User's Manual for PCSMS (Parallel Complex Sparse Matrix Solver). Version 1.

    Science.gov (United States)

    Reddy, C. J.

    2000-01-01

    PCSMS (Parallel Complex Sparse Matrix Solver) is a computer code written to make use of the existing real sparse direct solvers to solve complex, sparse matrix linear equations. PCSMS converts complex matrices into real matrices and use real, sparse direct matrix solvers to factor and solve the real matrices. The solution vector is reconverted to complex numbers. Though, this utility is written for Silicon Graphics (SGI) real sparse matrix solution routines, it is general in nature and can be easily modified to work with any real sparse matrix solver. The User's Manual is written to make the user acquainted with the installation and operation of the code. Driver routines are given to aid the users to integrate PCSMS routines in their own codes.

  5. A sparse version of IGA solvers

    KAUST Repository

    Beck, Joakim; Sangalli, Giancarlo; Tamellini, Lorenzo

    2017-01-01

    Isogeometric Analysis (IGA) typically adopts tensor-product splines and NURBS as a basis for the approximation of the solution of PDEs. In this work, we investigate to which extent IGA solvers can benefit from the so-called sparse-grids construction in its combination technique form, which was first introduced in the early 90s in the context of the approximation of high-dimensional PDEs. The tests that we report show that, in accordance to the literature, a sparse grids construction can indeed be useful if the solution of the PDE at hand is sufficiently smooth. Sparse grids can also be useful in the case of non-smooth solutions when some a-priori knowledge on the location of the singularities of the solution can be exploited to devise suitable non-equispaced meshes. Finally, we remark that sparse grids can be seen as a simple way to parallelize pre-existing serial IGA solvers in a straightforward fashion, which can be beneficial in many practical situations.

  6. A sparse version of IGA solvers

    KAUST Repository

    Beck, Joakim

    2017-07-30

    Isogeometric Analysis (IGA) typically adopts tensor-product splines and NURBS as a basis for the approximation of the solution of PDEs. In this work, we investigate to which extent IGA solvers can benefit from the so-called sparse-grids construction in its combination technique form, which was first introduced in the early 90s in the context of the approximation of high-dimensional PDEs. The tests that we report show that, in accordance to the literature, a sparse grids construction can indeed be useful if the solution of the PDE at hand is sufficiently smooth. Sparse grids can also be useful in the case of non-smooth solutions when some a-priori knowledge on the location of the singularities of the solution can be exploited to devise suitable non-equispaced meshes. Finally, we remark that sparse grids can be seen as a simple way to parallelize pre-existing serial IGA solvers in a straightforward fashion, which can be beneficial in many practical situations.

  7. The impact of improved sparse linear solvers on industrial engineering applications

    Energy Technology Data Exchange (ETDEWEB)

    Heroux, M. [Cray Research, Inc., Eagan, MN (United States); Baddourah, M.; Poole, E.L.; Yang, Chao Wu

    1996-12-31

    There are usually many factors that ultimately determine the quality of computer simulation for engineering applications. Some of the most important are the quality of the analytical model and approximation scheme, the accuracy of the input data and the capability of the computing resources. However, in many engineering applications the characteristics of the sparse linear solver are the key factors in determining how complex a problem a given application code can solve. Therefore, the advent of a dramatically improved solver often brings with it dramatic improvements in our ability to do accurate and cost effective computer simulations. In this presentation we discuss the current status of sparse iterative and direct solvers in several key industrial CFD and structures codes, and show the impact that recent advances in linear solvers have made on both our ability to perform challenging simulations and the cost of those simulations. We also present some of the current challenges we have and the constraints we face in trying to improve these solvers. Finally, we discuss future requirements for sparse linear solvers on high performance architectures and try to indicate the opportunities that exist if we can develop even more improvements in linear solver capabilities.

  8. A distributed-memory hierarchical solver for general sparse linear systems

    Energy Technology Data Exchange (ETDEWEB)

    Chen, Chao [Stanford Univ., CA (United States). Inst. for Computational and Mathematical Engineering; Pouransari, Hadi [Stanford Univ., CA (United States). Dept. of Mechanical Engineering; Rajamanickam, Sivasankaran [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States). Center for Computing Research; Boman, Erik G. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States). Center for Computing Research; Darve, Eric [Stanford Univ., CA (United States). Inst. for Computational and Mathematical Engineering and Dept. of Mechanical Engineering

    2017-12-20

    We present a parallel hierarchical solver for general sparse linear systems on distributed-memory machines. For large-scale problems, this fully algebraic algorithm is faster and more memory-efficient than sparse direct solvers because it exploits the low-rank structure of fill-in blocks. Depending on the accuracy of low-rank approximations, the hierarchical solver can be used either as a direct solver or as a preconditioner. The parallel algorithm is based on data decomposition and requires only local communication for updating boundary data on every processor. Moreover, the computation-to-communication ratio of the parallel algorithm is approximately the volume-to-surface-area ratio of the subdomain owned by every processor. We also provide various numerical results to demonstrate the versatility and scalability of the parallel algorithm.

  9. The Use of Sparse Direct Solver in Vector Finite Element Modeling for Calculating Two Dimensional (2-D) Magnetotelluric Responses in Transverse Electric (TE) Mode

    Science.gov (United States)

    Yihaa Roodhiyah, Lisa’; Tjong, Tiffany; Nurhasan; Sutarno, D.

    2018-04-01

    The late research, linear matrices of vector finite element in two dimensional(2-D) magnetotelluric (MT) responses modeling was solved by non-sparse direct solver in TE mode. Nevertheless, there is some weakness which have to be improved especially accuracy in the low frequency (10-3 Hz-10-5 Hz) which is not achieved yet and high cost computation in dense mesh. In this work, the solver which is used is sparse direct solver instead of non-sparse direct solverto overcome the weaknesses of solving linear matrices of vector finite element metod using non-sparse direct solver. Sparse direct solver will be advantageous in solving linear matrices of vector finite element method because of the matrix properties which is symmetrical and sparse. The validation of sparse direct solver in solving linear matrices of vector finite element has been done for a homogen half-space model and vertical contact model by analytical solution. Thevalidation result of sparse direct solver in solving linear matrices of vector finite element shows that sparse direct solver is more stable than non-sparse direct solver in computing linear problem of vector finite element method especially in low frequency. In the end, the accuracy of 2D MT responses modelling in low frequency (10-3 Hz-10-5 Hz) has been reached out under the efficient allocation memory of array and less computational time consuming.

  10. Application of alternating decision trees in selecting sparse linear solvers

    KAUST Repository

    Bhowmick, Sanjukta; Eijkhout, Victor; Freund, Yoav; Fuentes, Erika; Keyes, David E.

    2010-01-01

    The solution of sparse linear systems, a fundamental and resource-intensive task in scientific computing, can be approached through multiple algorithms. Using an algorithm well adapted to characteristics of the task can significantly enhance the performance, such as reducing the time required for the operation, without compromising the quality of the result. However, the best solution method can vary even across linear systems generated in course of the same PDE-based simulation, thereby making solver selection a very challenging problem. In this paper, we use a machine learning technique, Alternating Decision Trees (ADT), to select efficient solvers based on the properties of sparse linear systems and runtime-dependent features, such as the stages of simulation. We demonstrate the effectiveness of this method through empirical results over linear systems drawn from computational fluid dynamics and magnetohydrodynamics applications. The results also demonstrate that using ADT can resolve the problem of over-fitting, which occurs when limited amount of data is available. © 2010 Springer Science+Business Media LLC.

  11. SuperLU{_}DIST: A scalable distributed-memory sparse direct solver for unsymmetric linear systems

    Energy Technology Data Exchange (ETDEWEB)

    Li, Xiaoye S.; Demmel, James W.

    2002-03-27

    In this paper, we present the main algorithmic features in the software package SuperLU{_}DIST, a distributed-memory sparse direct solver for large sets of linear equations. We give in detail our parallelization strategies, with focus on scalability issues, and demonstrate the parallel performance and scalability on current machines. The solver is based on sparse Gaussian elimination, with an innovative static pivoting strategy proposed earlier by the authors. The main advantage of static pivoting over classical partial pivoting is that it permits a priori determination of data structures and communication pattern for sparse Gaussian elimination, which makes it more scalable on distributed memory machines. Based on this a priori knowledge, we designed highly parallel and scalable algorithms for both LU decomposition and triangular solve and we show that they are suitable for large-scale distributed memory machines.

  12. Improving the energy efficiency of sparse linear system solvers on multicore and manycore systems.

    Science.gov (United States)

    Anzt, H; Quintana-Ortí, E S

    2014-06-28

    While most recent breakthroughs in scientific research rely on complex simulations carried out in large-scale supercomputers, the power draft and energy spent for this purpose is increasingly becoming a limiting factor to this trend. In this paper, we provide an overview of the current status in energy-efficient scientific computing by reviewing different technologies used to monitor power draft as well as power- and energy-saving mechanisms available in commodity hardware. For the particular domain of sparse linear algebra, we analyse the energy efficiency of a broad collection of hardware architectures and investigate how algorithmic and implementation modifications can improve the energy performance of sparse linear system solvers, without negatively impacting their performance. © 2014 The Author(s) Published by the Royal Society. All rights reserved.

  13. New sparse matrix solver in the KIKO3D 3-dimensional reactor dynamics code

    International Nuclear Information System (INIS)

    Panka, I.; Kereszturi, A.; Hegedus, C.

    2005-01-01

    The goal of this paper is to present a more effective method Bi-CGSTAB for accelerating the large sparse matrix equation solution in the KIKO3D code. This equation system is obtained by using the factorization of the improved quasi static (IQS) method for the time dependent nodal kinetic equations. In the old methodology standard large sparse matrix techniques were considered, where Gauss-Seidel preconditioning and a GMRES-type solver were applied. The validation of KIKO3D using Bi-CGSTAB has been performed by solving of a VVER-1000 kinetic benchmark problem. Additionally, the convergence characteristics were investigated in given macro time steps of Control Rod Ejection transients. The results have been obtained by the old GMRES and new Bi-CGSTAB methods are compared. (author)

  14. New iterative solvers for the NAG Libraries

    Energy Technology Data Exchange (ETDEWEB)

    Salvini, S.; Shaw, G. [Numerical Algorithms Group Ltd., Oxford (United Kingdom)

    1996-12-31

    The purpose of this paper is to introduce the work which has been carried out at NAG Ltd to update the iterative solvers for sparse systems of linear equations, both symmetric and unsymmetric, in the NAG Fortran 77 Library. Our current plans to extend this work and include it in our other numerical libraries in our range are also briefly mentioned. We have added to the Library the new Chapter F11, entirely dedicated to sparse linear algebra. At Mark 17, the F11 Chapter includes sparse iterative solvers, preconditioners, utilities and black-box routines for sparse symmetric (both positive-definite and indefinite) linear systems. Mark 18 will add solvers, preconditioners, utilities and black-boxes for sparse unsymmetric systems: the development of these has already been completed.

  15. Extending the Finite Domain Solver of GNU Prolog

    NARCIS (Netherlands)

    Bloemen, Vincent; Diaz, Daniel; van der Bijl, Machiel; Abreu, Salvador; Ströder, Thomas; Swift, Terrance

    This paper describes three significant extensions for the Finite Domain solver of GNU Prolog. First, the solver now supports negative integers. Second, the solver detects and prevents integer overflows from occurring. Third, the internal representation of sparse domains has been redesigned to

  16. High performance simplex solver

    OpenAIRE

    Huangfu, Qi

    2013-01-01

    The dual simplex method is frequently the most efficient technique for solving linear programming (LP) problems. This thesis describes an efficient implementation of the sequential dual simplex method and the design and development of two parallel dual simplex solvers. In serial, many advanced techniques for the (dual) simplex method are implemented, including sparse LU factorization, hyper-sparse linear system solution technique, efficient approaches to updating LU factors and...

  17. Sampling of finite elements for sparse recovery in large scale 3D electrical impedance tomography

    International Nuclear Information System (INIS)

    Javaherian, Ashkan; Moeller, Knut; Soleimani, Manuchehr

    2015-01-01

    This study proposes a method to improve performance of sparse recovery inverse solvers in 3D electrical impedance tomography (3D EIT), especially when the volume under study contains small-sized inclusions, e.g. 3D imaging of breast tumours. Initially, a quadratic regularized inverse solver is applied in a fast manner with a stopping threshold much greater than the optimum. Based on assuming a fixed level of sparsity for the conductivity field, finite elements are then sampled via applying a compressive sensing (CS) algorithm to the rough blurred estimation previously made by the quadratic solver. Finally, a sparse inverse solver is applied solely to the sampled finite elements, with the solution to the CS as its initial guess. The results show the great potential of the proposed CS-based sparse recovery in improving accuracy of sparse solution to the large-size 3D EIT. (paper)

  18. Comparing direct and iterative equation solvers in a large structural analysis software system

    Science.gov (United States)

    Poole, E. L.

    1991-01-01

    Two direct Choleski equation solvers and two iterative preconditioned conjugate gradient (PCG) equation solvers used in a large structural analysis software system are described. The two direct solvers are implementations of the Choleski method for variable-band matrix storage and sparse matrix storage. The two iterative PCG solvers include the Jacobi conjugate gradient method and an incomplete Choleski conjugate gradient method. The performance of the direct and iterative solvers is compared by solving several representative structural analysis problems. Some key factors affecting the performance of the iterative solvers relative to the direct solvers are identified.

  19. Scalable Newton-Krylov solver for very large power flow problems

    NARCIS (Netherlands)

    Idema, R.; Lahaye, D.J.P.; Vuik, C.; Van der Sluis, L.

    2010-01-01

    The power flow problem is generally solved by the Newton-Raphson method with a sparse direct solver for the linear system of equations in each iteration. While this works fine for small power flow problems, we will show that for very large problems the direct solver is very slow and we present

  20. Enhancing Scalability of Sparse Direct Methods

    International Nuclear Information System (INIS)

    Li, Xiaoye S.; Demmel, James; Grigori, Laura; Gu, Ming; Xia, Jianlin; Jardin, Steve; Sovinec, Carl; Lee, Lie-Quan

    2007-01-01

    TOPS is providing high-performance, scalable sparse direct solvers, which have had significant impacts on the SciDAC applications, including fusion simulation (CEMM), accelerator modeling (COMPASS), as well as many other mission-critical applications in DOE and elsewhere. Our recent developments have been focusing on new techniques to overcome scalability bottleneck of direct methods, in both time and memory. These include parallelizing symbolic analysis phase and developing linear-complexity sparse factorization methods. The new techniques will make sparse direct methods more widely usable in large 3D simulations on highly-parallel petascale computers

  1. Numerical solution of large sparse linear systems

    International Nuclear Information System (INIS)

    Meurant, Gerard; Golub, Gene.

    1982-02-01

    This note is based on one of the lectures given at the 1980 CEA-EDF-INRIA Numerical Analysis Summer School whose aim is the study of large sparse linear systems. The main topics are solving least squares problems by orthogonal transformation, fast Poisson solvers and solution of sparse linear system by iterative methods with a special emphasis on preconditioned conjuguate gradient method [fr

  2. BCYCLIC: A parallel block tridiagonal matrix cyclic solver

    Science.gov (United States)

    Hirshman, S. P.; Perumalla, K. S.; Lynch, V. E.; Sanchez, R.

    2010-09-01

    A block tridiagonal matrix is factored with minimal fill-in using a cyclic reduction algorithm that is easily parallelized. Storage of the factored blocks allows the application of the inverse to multiple right-hand sides which may not be known at factorization time. Scalability with the number of block rows is achieved with cyclic reduction, while scalability with the block size is achieved using multithreaded routines (OpenMP, GotoBLAS) for block matrix manipulation. This dual scalability is a noteworthy feature of this new solver, as well as its ability to efficiently handle arbitrary (non-powers-of-2) block row and processor numbers. Comparison with a state-of-the art parallel sparse solver is presented. It is expected that this new solver will allow many physical applications to optimally use the parallel resources on current supercomputers. Example usage of the solver in magneto-hydrodynamic (MHD), three-dimensional equilibrium solvers for high-temperature fusion plasmas is cited.

  3. Advanced Algebraic Multigrid Solvers for Subsurface Flow Simulation

    KAUST Repository

    Chen, Meng-Huo

    2015-09-13

    In this research we are particularly interested in extending the robustness of multigrid solvers to encounter complex systems related to subsurface reservoir applications for flow problems in porous media. In many cases, the step for solving the pressure filed in subsurface flow simulation becomes a bottleneck for the performance of the simulator. For solving large sparse linear system arising from MPFA discretization, we choose multigrid methods as the linear solver. The possible difficulties and issues will be addressed and the corresponding remedies will be studied. As the multigrid methods are used as the linear solver, the simulator can be parallelized (although not trivial) and the high-resolution simulation become feasible, the ultimately goal which we desire to achieve.

  4. A comparison of SuperLU solvers on the intel MIC architecture

    Science.gov (United States)

    Tuncel, Mehmet; Duran, Ahmet; Celebi, M. Serdar; Akaydin, Bora; Topkaya, Figen O.

    2016-10-01

    In many science and engineering applications, problems may result in solving a sparse linear system AX=B. For example, SuperLU_MCDT, a linear solver, was used for the large penta-diagonal matrices for 2D problems and hepta-diagonal matrices for 3D problems, coming from the incompressible blood flow simulation (see [1]). It is important to test the status and potential improvements of state-of-the-art solvers on new technologies. In this work, sequential, multithreaded and distributed versions of SuperLU solvers (see [2]) are examined on the Intel Xeon Phi coprocessors using offload programming model at the EURORA cluster of CINECA in Italy. We consider a portfolio of test matrices containing patterned matrices from UFMM ([3]) and randomly located matrices. This architecture can benefit from high parallelism and large vectors. We find that the sequential SuperLU benefited up to 45 % performance improvement from the offload programming depending on the sparse matrix type and the size of transferred and processed data.

  5. Scalable domain decomposition solvers for stochastic PDEs in high performance computing

    International Nuclear Information System (INIS)

    Desai, Ajit; Pettit, Chris; Poirel, Dominique; Sarkar, Abhijit

    2017-01-01

    Stochastic spectral finite element models of practical engineering systems may involve solutions of linear systems or linearized systems for non-linear problems with billions of unknowns. For stochastic modeling, it is therefore essential to design robust, parallel and scalable algorithms that can efficiently utilize high-performance computing to tackle such large-scale systems. Domain decomposition based iterative solvers can handle such systems. And though these algorithms exhibit excellent scalabilities, significant algorithmic and implementational challenges exist to extend them to solve extreme-scale stochastic systems using emerging computing platforms. Intrusive polynomial chaos expansion based domain decomposition algorithms are extended here to concurrently handle high resolution in both spatial and stochastic domains using an in-house implementation. Sparse iterative solvers with efficient preconditioners are employed to solve the resulting global and subdomain level local systems through multi-level iterative solvers. We also use parallel sparse matrix–vector operations to reduce the floating-point operations and memory requirements. Numerical and parallel scalabilities of these algorithms are presented for the diffusion equation having spatially varying diffusion coefficient modeled by a non-Gaussian stochastic process. Scalability of the solvers with respect to the number of random variables is also investigated.

  6. Three-Dimensional Inverse Transport Solver Based on Compressive Sensing Technique

    Science.gov (United States)

    Cheng, Yuxiong; Wu, Hongchun; Cao, Liangzhi; Zheng, Youqi

    2013-09-01

    According to the direct exposure measurements from flash radiographic image, a compressive sensing-based method for three-dimensional inverse transport problem is presented. The linear absorption coefficients and interface locations of objects are reconstructed directly at the same time. It is always very expensive to obtain enough measurements. With limited measurements, compressive sensing sparse reconstruction technique orthogonal matching pursuit is applied to obtain the sparse coefficients by solving an optimization problem. A three-dimensional inverse transport solver is developed based on a compressive sensing-based technique. There are three features in this solver: (1) AutoCAD is employed as a geometry preprocessor due to its powerful capacity in graphic. (2) The forward projection matrix rather than Gauss matrix is constructed by the visualization tool generator. (3) Fourier transform and Daubechies wavelet transform are adopted to convert an underdetermined system to a well-posed system in the algorithm. Simulations are performed and numerical results in pseudo-sine absorption problem, two-cube problem and two-cylinder problem when using compressive sensing-based solver agree well with the reference value.

  7. Migration of vectorized iterative solvers to distributed memory architectures

    Energy Technology Data Exchange (ETDEWEB)

    Pommerell, C. [AT& T Bell Labs., Murray Hill, NJ (United States); Ruehl, R. [CSCS-ETH, Manno (Switzerland)

    1994-12-31

    Both necessity and opportunity motivate the use of high-performance computers for iterative linear solvers. Necessity results from the size of the problems being solved-smaller problems are often better handled by direct methods. Opportunity arises from the formulation of the iterative methods in terms of simple linear algebra operations, even if this {open_quote}natural{close_quotes} parallelism is not easy to exploit in irregularly structured sparse matrices and with good preconditioners. As a result, high-performance implementations of iterative solvers have attracted a lot of interest in recent years. Most efforts are geared to vectorize or parallelize the dominating operation-structured or unstructured sparse matrix-vector multiplication, or to increase locality and parallelism by reformulating the algorithm-reducing global synchronization in inner products or local data exchange in preconditioners. Target architectures for iterative solvers currently include mostly vector supercomputers and architectures with one or few optimized (e.g., super-scalar and/or super-pipelined RISC) processors and hierarchical memory systems. More recently, parallel computers with physically distributed memory and a better price/performance ratio have been offered by vendors as a very interesting alternative to vector supercomputers. However, programming comfort on such distributed memory parallel processors (DMPPs) still lags behind. Here the authors are concerned with iterative solvers and their changing computing environment. In particular, they are considering migration from traditional vector supercomputers to DMPPs. Application requirements force one to use flexible and portable libraries. They want to extend the portability of iterative solvers rather than reimplementing everything for each new machine, or even for each new architecture.

  8. GPU-Accelerated Sparse Matrix Solvers for Large-Scale Simulations, Phase II

    Data.gov (United States)

    National Aeronautics and Space Administration — At the heart of scientific computing and numerical analysis are linear algebra solvers. In scientific computing, the focus is on the partial differential equations...

  9. Sparse Linear Solver for Power System Analysis Using FPGA

    National Research Council Canada - National Science Library

    Johnson, J. R; Nagvajara, P; Nwankpa, C

    2005-01-01

    .... Numerical solution to load flow equations are typically computed using Newton-Raphson iteration, and the most time consuming component of the computation is the solution of a sparse linear system...

  10. LAPACKrc: Fast linear algebra kernels/solvers for FPGA accelerators

    International Nuclear Information System (INIS)

    Gonzalez, Juan; Nunez, Rafael C

    2009-01-01

    We present LAPACKrc, a family of FPGA-based linear algebra solvers able to achieve more than 100x speedup per commodity processor on certain problems. LAPACKrc subsumes some of the LAPACK and ScaLAPACK functionalities, and it also incorporates sparse direct and iterative matrix solvers. Current LAPACKrc prototypes demonstrate between 40x-150x speedup compared against top-of-the-line hardware/software systems. A technology roadmap is in place to validate current performance of LAPACKrc in HPC applications, and to increase the computational throughput by factors of hundreds within the next few years.

  11. Efficient convolutional sparse coding

    Science.gov (United States)

    Wohlberg, Brendt

    2017-06-20

    Computationally efficient algorithms may be applied for fast dictionary learning solving the convolutional sparse coding problem in the Fourier domain. More specifically, efficient convolutional sparse coding may be derived within an alternating direction method of multipliers (ADMM) framework that utilizes fast Fourier transforms (FFT) to solve the main linear system in the frequency domain. Such algorithms may enable a significant reduction in computational cost over conventional approaches by implementing a linear solver for the most critical and computationally expensive component of the conventional iterative algorithm. The theoretical computational cost of the algorithm may be reduced from O(M.sup.3N) to O(MN log N), where N is the dimensionality of the data and M is the number of elements in the dictionary. This significant improvement in efficiency may greatly increase the range of problems that can practically be addressed via convolutional sparse representations.

  12. Parallel preconditioning techniques for sparse CG solvers

    Energy Technology Data Exchange (ETDEWEB)

    Basermann, A.; Reichel, B.; Schelthoff, C. [Central Institute for Applied Mathematics, Juelich (Germany)

    1996-12-31

    Conjugate gradient (CG) methods to solve sparse systems of linear equations play an important role in numerical methods for solving discretized partial differential equations. The large size and the condition of many technical or physical applications in this area result in the need for efficient parallelization and preconditioning techniques of the CG method. In particular for very ill-conditioned matrices, sophisticated preconditioner are necessary to obtain both acceptable convergence and accuracy of CG. Here, we investigate variants of polynomial and incomplete Cholesky preconditioners that markedly reduce the iterations of the simply diagonally scaled CG and are shown to be well suited for massively parallel machines.

  13. Fast Multipole-Based Preconditioner for Sparse Iterative Solvers

    KAUST Repository

    Ibeid, Huda; Yokota, Rio; Keyes, David E.

    2014-01-01

    Among optimal hierarchical algorithms for the computational solution of elliptic problems, the Fast Multipole Method (FMM) stands out for its adaptability to emerging architectures, having high arithmetic intensity, tunable accuracy, and relaxed global synchronization requirements. We demonstrate that, beyond its traditional use as a solver in problems for which explicit free-space kernel representations are available, the FMM has applicability as a preconditioner in finite domain elliptic boundary value problems, by equipping it with boundary integral capability for finite boundaries and by wrapping it in a Krylov method for extensibility to more general operators. Compared with multilevel methods, it is capable of comparable algebraic convergence rates down to the truncation error of the discretized PDE, and it has superior multicore and distributed memory scalability properties on commodity architecture supercomputers.

  14. Fast Multipole-Based Preconditioner for Sparse Iterative Solvers

    KAUST Repository

    Ibeid, Huda

    2014-05-04

    Among optimal hierarchical algorithms for the computational solution of elliptic problems, the Fast Multipole Method (FMM) stands out for its adaptability to emerging architectures, having high arithmetic intensity, tunable accuracy, and relaxed global synchronization requirements. We demonstrate that, beyond its traditional use as a solver in problems for which explicit free-space kernel representations are available, the FMM has applicability as a preconditioner in finite domain elliptic boundary value problems, by equipping it with boundary integral capability for finite boundaries and by wrapping it in a Krylov method for extensibility to more general operators. Compared with multilevel methods, it is capable of comparable algebraic convergence rates down to the truncation error of the discretized PDE, and it has superior multicore and distributed memory scalability properties on commodity architecture supercomputers.

  15. A high-performance Riccati based solver for tree-structured quadratic programs

    DEFF Research Database (Denmark)

    Frison, Gianluca; Kouzoupis, Dimitris; Diehl, Moritz

    2017-01-01

    the online solution of such problems challenging and the development of tailored solvers crucial. In this paper, an interior point method is presented that can solve Quadratic Programs (QPs) arising in multi-stage MPC efficiently by means of a tree-structured Riccati recursion and a high-performance linear...... algebra library. A performance comparison with code-generated and general purpose sparse QP solvers shows that the computation times can be significantly reduced for all problem sizes that are practically relevant in embedded MPC applications. The presented implementation is freely available as part...

  16. An Optimized Multicolor Point-Implicit Solver for Unstructured Grid Applications on Graphics Processing Units

    Science.gov (United States)

    Zubair, Mohammad; Nielsen, Eric; Luitjens, Justin; Hammond, Dana

    2016-01-01

    In the field of computational fluid dynamics, the Navier-Stokes equations are often solved using an unstructuredgrid approach to accommodate geometric complexity. Implicit solution methodologies for such spatial discretizations generally require frequent solution of large tightly-coupled systems of block-sparse linear equations. The multicolor point-implicit solver used in the current work typically requires a significant fraction of the overall application run time. In this work, an efficient implementation of the solver for graphics processing units is proposed. Several factors present unique challenges to achieving an efficient implementation in this environment. These include the variable amount of parallelism available in different kernel calls, indirect memory access patterns, low arithmetic intensity, and the requirement to support variable block sizes. In this work, the solver is reformulated to use standard sparse and dense Basic Linear Algebra Subprograms (BLAS) functions. However, numerical experiments show that the performance of the BLAS functions available in existing CUDA libraries is suboptimal for matrices representative of those encountered in actual simulations. Instead, optimized versions of these functions are developed. Depending on block size, the new implementations show performance gains of up to 7x over the existing CUDA library functions.

  17. PCX, Interior-Point Linear Programming Solver

    International Nuclear Information System (INIS)

    Czyzyk, J.

    2004-01-01

    1 - Description of program or function: PCX solves linear programming problems using the Mehrota predictor-corrector interior-point algorithm. PCX can be called as a subroutine or used in stand-alone mode, with data supplied from an MPS file. The software incorporates modules that can be used separately from the linear programming solver, including a pre-solve routine and data structure definitions. 2 - Methods: The Mehrota predictor-corrector method is a primal-dual interior-point method for linear programming. The starting point is determined from a modified least squares heuristic. Linear systems of equations are solved at each interior-point iteration via a sparse Cholesky algorithm native to the code. A pre-solver is incorporated in the code to eliminate inefficiencies in the user's formulation of the problem. 3 - Restriction on the complexity of the problem: There are no size limitations built into the program. The size of problem solved is limited by RAM and swap space on the user's computer

  18. A Projected Conjugate Gradient Method for Sparse Minimax Problems

    DEFF Research Database (Denmark)

    Madsen, Kaj; Jonasson, Kristjan

    1993-01-01

    A new method for nonlinear minimax problems is presented. The method is of the trust region type and based on sequential linear programming. It is a first order method that only uses first derivatives and does not approximate Hessians. The new method is well suited for large sparse problems...... as it only requires that software for sparse linear programming and a sparse symmetric positive definite equation solver are available. On each iteration a special linear/quadratic model of the function is minimized, but contrary to the usual practice in trust region methods the quadratic model is only...... with the method are presented. In fact, we find that the number of iterations required is comparable to that of state-of-the-art quasi-Newton codes....

  19. Parallelization of the preconditioned IDR solver for modern multicore computer systems

    Science.gov (United States)

    Bessonov, O. A.; Fedoseyev, A. I.

    2012-10-01

    This paper present the analysis, parallelization and optimization approach for the large sparse matrix solver CNSPACK for modern multicore microprocessors. CNSPACK is an advanced solver successfully used for coupled solution of stiff problems arising in multiphysics applications such as CFD, semiconductor transport, kinetic and quantum problems. It employs iterative IDR algorithm with ILU preconditioning (user chosen ILU preconditioning order). CNSPACK has been successfully used during last decade for solving problems in several application areas, including fluid dynamics and semiconductor device simulation. However, there was a dramatic change in processor architectures and computer system organization in recent years. Due to this, performance criteria and methods have been revisited, together with involving the parallelization of the solver and preconditioner using Open MP environment. Results of the successful implementation for efficient parallelization are presented for the most advances computer system (Intel Core i7-9xx or two-processor Xeon 55xx/56xx).

  20. SLAP, Large Sparse Linear System Solution Package

    International Nuclear Information System (INIS)

    Greenbaum, A.

    1987-01-01

    1 - Description of program or function: SLAP is a set of routines for solving large sparse systems of linear equations. One need not store the entire matrix - only the nonzero elements and their row and column numbers. Any nonzero structure is acceptable, so the linear system solver need not be modified when the structure of the matrix changes. Auxiliary storage space is acquired and released within the routines themselves by use of the LRLTRAN POINTER statement. 2 - Method of solution: SLAP contains one direct solver, a band matrix factorization and solution routine, BAND, and several interactive solvers. The iterative routines are as follows: JACOBI, Jacobi iteration; GS, Gauss-Seidel Iteration; ILUIR, incomplete LU decomposition with iterative refinement; DSCG and ICCG, diagonal scaling and incomplete Cholesky decomposition with conjugate gradient iteration (for symmetric positive definite matrices only); DSCGN and ILUGGN, diagonal scaling and incomplete LU decomposition with conjugate gradient interaction on the normal equations; DSBCG and ILUBCG, diagonal scaling and incomplete LU decomposition with bi-conjugate gradient iteration; and DSOMN and ILUOMN, diagonal scaling and incomplete LU decomposition with ORTHOMIN iteration

  1. Optimizing Sparse Matrix-Multiple Vectors Multiplication for Nuclear Configuration Interaction Calculations

    Energy Technology Data Exchange (ETDEWEB)

    Aktulga, Hasan Metin [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Buluc, Aydin [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Williams, Samuel [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Yang, Chao [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)

    2014-08-14

    Obtaining highly accurate predictions on the properties of light atomic nuclei using the configuration interaction (CI) approach requires computing a few extremal Eigen pairs of the many-body nuclear Hamiltonian matrix. In the Many-body Fermion Dynamics for nuclei (MFDn) code, a block Eigen solver is used for this purpose. Due to the large size of the sparse matrices involved, a significant fraction of the time spent on the Eigen value computations is associated with the multiplication of a sparse matrix (and the transpose of that matrix) with multiple vectors (SpMM and SpMM-T). Existing implementations of SpMM and SpMM-T significantly underperform expectations. Thus, in this paper, we present and analyze optimized implementations of SpMM and SpMM-T. We base our implementation on the compressed sparse blocks (CSB) matrix format and target systems with multi-core architectures. We develop a performance model that allows us to understand and estimate the performance characteristics of our SpMM kernel implementations, and demonstrate the efficiency of our implementation on a series of real-world matrices extracted from MFDn. In particular, we obtain 3-4 speedup on the requisite operations over good implementations based on the commonly used compressed sparse row (CSR) matrix format. The improvements in the SpMM kernel suggest we may attain roughly a 40% speed up in the overall execution time of the block Eigen solver used in MFDn.

  2. Performance optimization of Sparse Matrix-Vector Multiplication for multi-component PDE-based applications using GPUs

    KAUST Repository

    Abdelfattah, Ahmad

    2016-05-23

    Simulations of many multi-component PDE-based applications, such as petroleum reservoirs or reacting flows, are dominated by the solution, on each time step and within each Newton step, of large sparse linear systems. The standard solver is a preconditioned Krylov method. Along with application of the preconditioner, memory-bound Sparse Matrix-Vector Multiplication (SpMV) is the most time-consuming operation in such solvers. Multi-species models produce Jacobians with a dense block structure, where the block size can be as large as a few dozen. Failing to exploit this dense block structure vastly underutilizes hardware capable of delivering high performance on dense BLAS operations. This paper presents a GPU-accelerated SpMV kernel for block-sparse matrices. Dense matrix-vector multiplications within the sparse-block structure leverage optimization techniques from the KBLAS library, a high performance library for dense BLAS kernels. The design ideas of KBLAS can be applied to block-sparse matrices. Furthermore, a technique is proposed to balance the workload among thread blocks when there are large variations in the lengths of nonzero rows. Multi-GPU performance is highlighted. The proposed SpMV kernel outperforms existing state-of-the-art implementations using matrices with real structures from different applications. Copyright © 2016 John Wiley & Sons, Ltd. Copyright © 2016 John Wiley & Sons, Ltd.

  3. Performance optimization of Sparse Matrix-Vector Multiplication for multi-component PDE-based applications using GPUs

    KAUST Repository

    Abdelfattah, Ahmad; Ltaief, Hatem; Keyes, David E.; Dongarra, Jack

    2016-01-01

    Simulations of many multi-component PDE-based applications, such as petroleum reservoirs or reacting flows, are dominated by the solution, on each time step and within each Newton step, of large sparse linear systems. The standard solver is a preconditioned Krylov method. Along with application of the preconditioner, memory-bound Sparse Matrix-Vector Multiplication (SpMV) is the most time-consuming operation in such solvers. Multi-species models produce Jacobians with a dense block structure, where the block size can be as large as a few dozen. Failing to exploit this dense block structure vastly underutilizes hardware capable of delivering high performance on dense BLAS operations. This paper presents a GPU-accelerated SpMV kernel for block-sparse matrices. Dense matrix-vector multiplications within the sparse-block structure leverage optimization techniques from the KBLAS library, a high performance library for dense BLAS kernels. The design ideas of KBLAS can be applied to block-sparse matrices. Furthermore, a technique is proposed to balance the workload among thread blocks when there are large variations in the lengths of nonzero rows. Multi-GPU performance is highlighted. The proposed SpMV kernel outperforms existing state-of-the-art implementations using matrices with real structures from different applications. Copyright © 2016 John Wiley & Sons, Ltd. Copyright © 2016 John Wiley & Sons, Ltd.

  4. Development of an international matrix-solver prediction system on a French-Japanese international grid computing environment

    International Nuclear Information System (INIS)

    Suzuki, Yoshio; Kushida, Noriyuki; Tatekawa, Takayuki; Teshima, Naoya; Caniou, Yves; Guivarch, Ronan; Dayde, Michel; Ramet, Pierre

    2010-01-01

    The 'Research and Development of International Matrix-Solver Prediction System (REDIMPS)' project aimed at improving the TLSE sparse linear algebra expert website by establishing an international grid computing environment between Japan and France. To help users in identifying the best solver or sparse linear algebra tool for their problems, we have developed an interoperable environment between French and Japanese grid infrastructures (respectively managed by DIET and AEGIS). Two main issues were considered. The first issue is how to submit a job from DIET to AEGIS. The second issue is how to bridge the difference of security between DIET and AEGIS. To overcome these issues, we developed APIs to communicate between different grid infrastructures by improving the client API of AEGIS. By developing a server deamon program (SeD) of DIET which behaves like an AEGIS user, DIET can call functions in AEGIS: authentication, file transfer, job submission, and so on. To intensify the security, we also developed functionalities to authenticate DIET sites and DIET users in order to access AEGIS computing resources. By this study, the set of software and computers available within TLSE to find an appropriate solver is enlarged over France (DIET) and Japan (AEGIS). (author)

  5. Parallelized preconditioned BiCGStab solution of sparse linear system equations in F-COBRA-TF

    International Nuclear Information System (INIS)

    Geemert, Rene van; Glück, Markus; Riedmann, Michael; Gabriel, Harry

    2011-01-01

    Recently, the in-house development of a preconditioned and parallelized BiCGStab solver has been pursued successfully in AREVA’s advanced sub-channel code F-COBRA-TF. This solver can be run either in a sequential computation mode on a single CPU, or in a parallel computation mode on multiple parallel CPUs. The developed procedure enables the computation of several thousands of successive sparse linear system solutions in F-COBRA-TF with acceptable wall clock run times. The current paper provides general information about F-COBRA-TF in terms of modeling capabilities and application areas, and points out where the relevance arises for the efficient iterative solution of sparse linear systems. Furthermore, the preconditioning and parallelization strategies in the developed BiCGStab iterative solution approach are discussed. The paper is concluded with a number of verification examples. (author)

  6. Sparse direct solver for large finite element problems based on the minimum degree algorithm

    Czech Academy of Sciences Publication Activity Database

    Pařík, Petr; Plešek, Jiří

    2017-01-01

    Roč. 113, November (2017), s. 2-6 ISSN 0965-9978 R&D Projects: GA ČR(CZ) GA15-20666S; GA MŠk(CZ) EF15_003/0000493 Institutional support: RVO:61388998 Keywords : sparse direct solution * finite element method * large sparse Linear systems Subject RIV: JR - Other Machinery OBOR OECD: Mechanical engineering Impact factor: 3.000, year: 2016 https://www.sciencedirect.com/science/article/pii/S0965997817302582

  7. An automatic way of finding robust elimination trees for a multi-frontal sparse solver for radical 2D hierarchical meshes

    KAUST Repository

    AbouEisha, Hassan M.

    2014-01-01

    In this paper we present a dynamic programming algorithm for finding optimal elimination trees for the multi-frontal direct solver algorithm executed over two dimensional meshes with point singularities. The elimination tree found by the optimization algorithm results in a linear computational cost of sequential direct solver. Based on the optimal elimination tree found by the optimization algorithm we construct heuristic sequential multi-frontal direct solver algorithm resulting in a linear computational cost as well as heuristic parallel multi-frontal direct solver algorithm resulting in a logarithmic computational cost. The resulting parallel algorithm is implemented on NVIDIA CUDA GPU architecture based on our graph-grammar approach. © 2014 Springer-Verlag.

  8. Parallel time domain solvers for electrically large transient scattering problems

    KAUST Repository

    Liu, Yang

    2014-09-26

    Marching on in time (MOT)-based integral equation solvers represent an increasingly appealing avenue for analyzing transient electromagnetic interactions with large and complex structures. MOT integral equation solvers for analyzing electromagnetic scattering from perfect electrically conducting objects are obtained by enforcing electric field boundary conditions and implicitly time advance electric surface current densities by iteratively solving sparse systems of equations at all time steps. Contrary to finite difference and element competitors, these solvers apply to nonlinear and multi-scale structures comprising geometrically intricate and deep sub-wavelength features residing atop electrically large platforms. Moreover, they are high-order accurate, stable in the low- and high-frequency limits, and applicable to conducting and penetrable structures represented by highly irregular meshes. This presentation reviews some recent advances in the parallel implementations of time domain integral equation solvers, specifically those that leverage multilevel plane-wave time-domain algorithm (PWTD) on modern manycore computer architectures including graphics processing units (GPUs) and distributed memory supercomputers. The GPU-based implementation achieves at least one order of magnitude speedups compared to serial implementations while the distributed parallel implementation are highly scalable to thousands of compute-nodes. A distributed parallel PWTD kernel has been adopted to solve time domain surface/volume integral equations (TDSIE/TDVIE) for analyzing transient scattering from large and complex-shaped perfectly electrically conducting (PEC)/dielectric objects involving ten million/tens of millions of spatial unknowns.

  9. Sparse matrix-vector multiplication on GPGPU clusters: A new storage format and a scalable implementation

    OpenAIRE

    Kreutzer, Moritz; Hager, Georg; Wellein, Gerhard; Fehske, Holger; Basermann, Achim; Bishop, Alan R.

    2011-01-01

    Sparse matrix-vector multiplication (spMVM) is the dominant operation in many sparse solvers. We investigate performance properties of spMVM with matrices of various sparsity patterns on the nVidia “Fermi” class of GPGPUs. A new “padded jagged diagonals storage” (pJDS) format is proposed which may substantially reduce the memory overhead intrinsic to the widespread ELLPACK-R scheme while making no assumptions about the matrix structure. In our test scenarios the pJDS format cuts the ...

  10. COMPARATIVE STUDY OF THREE LINEAR SYSTEM SOLVER APPLIED TO FAST DECOUPLED LOAD FLOW METHOD FOR CONTINGENCY ANALYSIS

    Directory of Open Access Journals (Sweden)

    Syafii

    2017-03-01

    Full Text Available This paper presents the assessment of fast decoupled load flow computation using three linear system solver scheme. The full matrix version of the fast decoupled load flow based on XB methods used in this study. The numerical investigations are carried out on the small and large test systems. The execution time of small system such as IEEE 14, 30, and 57 are very fast, therefore the computation time can not be compared for these cases. Another cases IEEE 118, 300 and TNB 664 produced significant execution speedup. The superLU factorization sparse matrix solver has best performance and speedup of load flow solution as well as in contigency analysis. The invers full matrix solver can solved only for IEEE 118 bus test system in 3.715 second and for another cases take too long time. However for superLU factorization linear solver can solved all of test system in 7.832 second for a largest of test system. Therefore the superLU factorization linear solver can be a viable alternative applied in contingency analysis.

  11. MGLab3D: An interactive environment for iterative solvers for elliptic PDEs in two and three dimensions

    Energy Technology Data Exchange (ETDEWEB)

    Bordner, J.; Saied, F. [Univ. of Illinois, Urbana, IL (United States)

    1996-12-31

    GLab3D is an enhancement of an interactive environment (MGLab) for experimenting with iterative solvers and multigrid algorithms. It is implemented in MATLAB. The new version has built-in 3D elliptic pde`s and several iterative methods and preconditioners that were not available in the original version. A sparse direct solver option has also been included. The multigrid solvers have also been extended to 3D. The discretization and pde domains are restricted to standard finite differences on the unit square/cube. The power of this software studies in the fact that no programming is needed to solve, for example, the convection-diffusion equation in 3D with TFQMR and a customized V-cycle preconditioner, for a variety of problem sizes and mesh Reynolds, numbers. In addition to the graphical user interface, some sample drivers are included to show how experiments can be composed using the underlying suite of problems and solvers.

  12. The Application Strategy of Iterative Solution Methodology to Matrix Equations in Hydraulic Solver Package, SPACE

    International Nuclear Information System (INIS)

    Na, Y. W.; Park, C. E.; Lee, S. Y.

    2009-01-01

    As a part of the Ministry of Knowledge Economy (MKE) project, 'Development of safety analysis codes for nuclear power plants', KOPEC has been developing the hydraulic solver code package applicable to the safety analyses of nuclear power plants (NPP's). The matrices of the hydraulic solver are usually sparse and may be asymmetric. In the earlier stage of this project, typical direct matrix solver packages MA48 and MA28 had been tested as matrix solver for the hydraulic solver code, SPACE. The selection was based on the reasonably reliable performance experience from their former version MA18 in RELAP computer code. In the later stage of this project, the iterative methodologies have been being tested in the SPACE code. Among a few candidate iterative solution methodologies tested so far, the biconjugate gradient stabilization methodology (BICGSTAB) has shown the best performance in the applicability test and in the application to the SPACE code. Regardless of all the merits of using the direct solver packages, there are some other aspects of tackling the iterative solution methodologies. The algorithm is much simpler and easier to handle. The potential problems related to the robustness of the iterative solution methodologies have been resolved by applying pre-conditioning methods adjusted and modified as appropriate to the application in the SPACE code. The application strategy of conjugate gradient method was introduced in detail by Schewchuk, Golub and Saad in the middle of 1990's. The application of his methodology to nuclear engineering in Korea started about the same time and is still going on and there are quite a few examples of application to neutronics. Besides, Yang introduced a conjugate gradient method programmed in C++ language. The purpose of this study is to assess the performance and behavior of the iterative solution methodology compared to those of the direct solution methodology still being preferred due to its robustness and reliability. The

  13. The Roles of Sparse Direct Methods in Large-scale Simulations

    Energy Technology Data Exchange (ETDEWEB)

    Li, Xiaoye S.; Gao, Weiguo; Husbands, Parry J.R.; Yang, Chao; Ng, Esmond G.

    2005-06-27

    Sparse systems of linear equations and eigen-equations arise at the heart of many large-scale, vital simulations in DOE. Examples include the Accelerator Science and Technology SciDAC (Omega3P code, electromagnetic problem), the Center for Extended Magnetohydrodynamic Modeling SciDAC(NIMROD and M3D-C1 codes, fusion plasma simulation). The Terascale Optimal PDE Simulations (TOPS)is providing high-performance sparse direct solvers, which have had significant impacts on these applications. Over the past several years, we have been working closely with the other SciDAC teams to solve their large, sparse matrix problems arising from discretization of the partial differential equations. Most of these systems are very ill-conditioned, resulting in extremely poor convergence deployed our direct methods techniques in these applications, which achieved significant scientific results as well as performance gains. These successes were made possible through the SciDAC model of computer scientists and application scientists working together to take full advantage of terascale computing systems and new algorithms research.

  14. The Roles of Sparse Direct Methods in Large-scale Simulations

    International Nuclear Information System (INIS)

    Li, Xiaoye S.; Gao, Weiguo; Husbands, Parry J.R.; Yang, Chao; Ng, Esmond G.

    2005-01-01

    Sparse systems of linear equations and eigen-equations arise at the heart of many large-scale, vital simulations in DOE. Examples include the Accelerator Science and Technology SciDAC (Omega3P code, electromagnetic problem), the Center for Extended Magnetohydrodynamic Modeling SciDAC(NIMROD and M3D-C1 codes, fusion plasma simulation). The Terascale Optimal PDE Simulations (TOPS)is providing high-performance sparse direct solvers, which have had significant impacts on these applications. Over the past several years, we have been working closely with the other SciDAC teams to solve their large, sparse matrix problems arising from discretization of the partial differential equations. Most of these systems are very ill-conditioned, resulting in extremely poor convergence deployed our direct methods techniques in these applications, which achieved significant scientific results as well as performance gains. These successes were made possible through the SciDAC model of computer scientists and application scientists working together to take full advantage of terascale computing systems and new algorithms research

  15. Primal Domain Decomposition Method with Direct and Iterative Solver for Circuit-Field-Torque Coupled Parallel Finite Element Method to Electric Machine Modelling

    Directory of Open Access Journals (Sweden)

    Daniel Marcsa

    2015-01-01

    Full Text Available The analysis and design of electromechanical devices involve the solution of large sparse linear systems, and require therefore high performance algorithms. In this paper, the primal Domain Decomposition Method (DDM with parallel forward-backward and with parallel Preconditioned Conjugate Gradient (PCG solvers are introduced in two-dimensional parallel time-stepping finite element formulation to analyze rotating machine considering the electromagnetic field, external circuit and rotor movement. The proposed parallel direct and the iterative solver with two preconditioners are analyzed concerning its computational efficiency and number of iterations of the solver with different preconditioners. Simulation results of a rotating machine is also presented.

  16. A Parallel Algebraic Multigrid Solver on Graphics Processing Units

    KAUST Repository

    Haase, Gundolf

    2010-01-01

    The paper presents a multi-GPU implementation of the preconditioned conjugate gradient algorithm with an algebraic multigrid preconditioner (PCG-AMG) for an elliptic model problem on a 3D unstructured grid. An efficient parallel sparse matrix-vector multiplication scheme underlying the PCG-AMG algorithm is presented for the many-core GPU architecture. A performance comparison of the parallel solver shows that a singe Nvidia Tesla C1060 GPU board delivers the performance of a sixteen node Infiniband cluster and a multi-GPU configuration with eight GPUs is about 100 times faster than a typical server CPU core. © 2010 Springer-Verlag.

  17. High Order Tensor Formulation for Convolutional Sparse Coding

    KAUST Repository

    Bibi, Adel Aamer

    2017-12-25

    Convolutional sparse coding (CSC) has gained attention for its successful role as a reconstruction and a classification tool in the computer vision and machine learning community. Current CSC methods can only reconstruct singlefeature 2D images independently. However, learning multidimensional dictionaries and sparse codes for the reconstruction of multi-dimensional data is very important, as it examines correlations among all the data jointly. This provides more capacity for the learned dictionaries to better reconstruct data. In this paper, we propose a generic and novel formulation for the CSC problem that can handle an arbitrary order tensor of data. Backed with experimental results, our proposed formulation can not only tackle applications that are not possible with standard CSC solvers, including colored video reconstruction (5D- tensors), but it also performs favorably in reconstruction with much fewer parameters as compared to naive extensions of standard CSC to multiple features/channels.

  18. Iterative solution of general sparse linear systems on clusters of workstations

    Energy Technology Data Exchange (ETDEWEB)

    Lo, Gen-Ching; Saad, Y. [Univ. of Minnesota, Minneapolis, MN (United States)

    1996-12-31

    Solving sparse irregularly structured linear systems on parallel platforms poses several challenges. First, sparsity makes it difficult to exploit data locality, whether in a distributed or shared memory environment. A second, perhaps more serious challenge, is to find efficient ways to precondition the system. Preconditioning techniques which have a large degree of parallelism, such as multicolor SSOR, often have a slower rate of convergence than their sequential counterparts. Finally, a number of other computational kernels such as inner products could ruin any gains gained from parallel speed-ups, and this is especially true on workstation clusters where start-up times may be high. In this paper we discuss these issues and report on our experience with PSPARSLIB, an on-going project for building a library of parallel iterative sparse matrix solvers.

  19. Compressive Sensing with Cross-Validation and Stop-Sampling for Sparse Polynomial Chaos Expansions

    Energy Technology Data Exchange (ETDEWEB)

    Huan, Xun; Safta, Cosmin; Sargsyan, Khachik; Vane, Zachary Phillips; Lacaze, Guilhem; Oefelein, Joseph C.; Najm, Habib N.

    2017-07-01

    Compressive sensing is a powerful technique for recovering sparse solutions of underdetermined linear systems, which is often encountered in uncertainty quanti cation analysis of expensive and high-dimensional physical models. We perform numerical investigations employing several com- pressive sensing solvers that target the unconstrained LASSO formulation, with a focus on linear systems that arise in the construction of polynomial chaos expansions. With core solvers of l1 ls, SpaRSA, CGIST, FPC AS, and ADMM, we develop techniques to mitigate over tting through an automated selection of regularization constant based on cross-validation, and a heuristic strategy to guide the stop-sampling decision. Practical recommendations on parameter settings for these tech- niques are provided and discussed. The overall method is applied to a series of numerical examples of increasing complexity, including large eddy simulations of supersonic turbulent jet-in-cross flow involving a 24-dimensional input. Through empirical phase-transition diagrams and convergence plots, we illustrate sparse recovery performance under structures induced by polynomial chaos, accuracy and computational tradeoffs between polynomial bases of different degrees, and practi- cability of conducting compressive sensing for a realistic, high-dimensional physical application. Across test cases studied in this paper, we find ADMM to have demonstrated empirical advantages through consistent lower errors and faster computational times.

  20. Efficient CUDA Polynomial Preconditioned Conjugate Gradient Solver for Finite Element Computation of Elasticity Problems

    Directory of Open Access Journals (Sweden)

    Jianfei Zhang

    2013-01-01

    Full Text Available Graphics processing unit (GPU has obtained great success in scientific computations for its tremendous computational horsepower and very high memory bandwidth. This paper discusses the efficient way to implement polynomial preconditioned conjugate gradient solver for the finite element computation of elasticity on NVIDIA GPUs using compute unified device architecture (CUDA. Sliced block ELLPACK (SBELL format is introduced to store sparse matrix arising from finite element discretization of elasticity with fewer padding zeros than traditional ELLPACK-based formats. Polynomial preconditioning methods have been investigated both in convergence and running time. From the overall performance, the least-squares (L-S polynomial method is chosen as a preconditioner in PCG solver to finite element equations derived from elasticity for its best results on different example meshes. In the PCG solver, mixed precision algorithm is used not only to reduce the overall computational, storage requirements and bandwidth but to make full use of the capacity of the GPU devices. With SBELL format and mixed precision algorithm, the GPU-based L-S preconditioned CG can get a speedup of about 7–9 to CPU-implementation.

  1. One-shot 3D scanning by combining sparse landmarks with dense gradient information

    Science.gov (United States)

    Di Martino, Matías; Flores, Jorge; Ferrari, José A.

    2018-06-01

    Scene understanding is one of the most challenging and popular problems in the field of robotics and computer vision and the estimation of 3D information is at the core of most of these applications. In order to retrieve the 3D structure of a test surface we propose a single shot approach that combines dense gradient information with sparse absolute measurements. To that end, we designed a colored pattern that codes fine horizontal and vertical fringes, with sparse corners landmarks. By measuring the deformation (bending) of horizontal and vertical fringes, we are able to estimate surface local variations (i.e. its gradient field). Then corner sparse landmarks are detected and matched to infer spare absolute information about the test surface height. Local gradient information is combined with the sparse absolute values which work as anchors to guide the integration process. We show that this can be mathematically done in a very compact and intuitive way by properly defining a Poisson-like partial differential equation. Then we address in detail how the problem can be formulated in a discrete domain and how it can be practically solved by straight forward linear numerical solvers. Finally, validation experiment are presented.

  2. Extending the QUDA Library with the eigCG Solver

    Energy Technology Data Exchange (ETDEWEB)

    Strelchenko, Alexei [Fermilab; Stathopoulos, Andreas [William-Mary Coll.

    2014-12-12

    While the incremental eigCG algorithm [ 1 ] is included in many LQCD software packages, its realization on GPU micro-architectures was still missing. In this session we report our experi- ence of the eigCG implementation in the QUDA library. In particular, we will focus on how to employ the mixed precision technique to accelerate solutions of large sparse linear systems with multiple right-hand sides on GPUs. Although application of mixed precision techniques is a well-known optimization approach for linear solvers, its utilization for the eigenvector com- puting within eigCG requires special consideration. We will discuss implementation aspects of the mixed precision deflation and illustrate its numerical behavior on the example of the Wilson twisted mass fermion matrix inversions

  3. Differential equations problem solver

    CERN Document Server

    Arterburn, David R

    2012-01-01

    REA's Problem Solvers is a series of useful, practical, and informative study guides. Each title in the series is complete step-by-step solution guide. The Differential Equations Problem Solver enables students to solve difficult problems by showing them step-by-step solutions to Differential Equations problems. The Problem Solvers cover material ranging from the elementary to the advanced and make excellent review books and textbook companions. They're perfect for undergraduate and graduate studies.The Differential Equations Problem Solver is the perfect resource for any class, any exam, and

  4. Survey on efficient linear solvers for porous media flow models on recent hardware architectures

    International Nuclear Information System (INIS)

    Anciaux-Sedrakian, Ani; Gratien, Jean-Marc; Guignon, Thomas; Gottschling, Peter

    2014-01-01

    In the past few years, High Performance Computing (HPC) technologies led to General Purpose Processing on Graphics Processing Units (GPGPU) and many-core architectures. These emerging technologies offer massive processing units and are interesting for porous media flow simulators may used for CO 2 geological sequestration or Enhanced Oil Recovery (EOR) simulation. However the crucial point is 'are current algorithms and software able to use these new technologies efficiently?' The resolution of large sparse linear systems, almost ill-conditioned, constitutes the most CPU-consuming part of such simulators. This paper proposes a survey on various solver and pre-conditioner algorithms, analyzes their efficiency and performance regarding these distinct architectures. Furthermore it proposes a novel approach based on a hybrid programming model for both GPU and many-core clusters. The proposed optimization techniques are validated through a Krylov subspace solver; BiCGStab and some pre-conditioners like ILU0 on GPU, multi-core and many-core architectures, on various large real study cases in EOR simulation. (authors)

  5. Comparative Performance Analysis of Coarse Solvers for Algebraic Multigrid on Multicore and Manycore Architectures

    Energy Technology Data Exchange (ETDEWEB)

    Druinsky, A; Ghysels, P; Li, XS; Marques, O; Williams, S; Barker, A; Kalchev, D; Vassilevski, P

    2016-04-02

    In this paper, we study the performance of a two-level algebraic-multigrid algorithm, with a focus on the impact of the coarse-grid solver on performance. We consider two algorithms for solving the coarse-space systems: the preconditioned conjugate gradient method and a new robust HSS-embedded low-rank sparse-factorization algorithm. Our test data comes from the SPE Comparative Solution Project for oil-reservoir simulations. We contrast the performance of our code on one 12-core socket of a Cray XC30 machine with performance on a 60-core Intel Xeon Phi coprocessor. To obtain top performance, we optimized the code to take full advantage of fine-grained parallelism and made it thread-friendly for high thread count. We also developed a bounds-and-bottlenecks performance model of the solver which we used to guide us through the optimization effort, and also carried out performance tuning in the solver’s large parameter space. Finally, as a result, significant speedups were obtained on both machines.

  6. Consensus Convolutional Sparse Coding

    KAUST Repository

    Choudhury, Biswarup

    2017-12-01

    Convolutional sparse coding (CSC) is a promising direction for unsupervised learning in computer vision. In contrast to recent supervised methods, CSC allows for convolutional image representations to be learned that are equally useful for high-level vision tasks and low-level image reconstruction and can be applied to a wide range of tasks without problem-specific retraining. Due to their extreme memory requirements, however, existing CSC solvers have so far been limited to low-dimensional problems and datasets using a handful of low-resolution example images at a time. In this paper, we propose a new approach to solving CSC as a consensus optimization problem, which lifts these limitations. By learning CSC features from large-scale image datasets for the first time, we achieve significant quality improvements in a number of imaging tasks. Moreover, the proposed method enables new applications in high-dimensional feature learning that has been intractable using existing CSC methods. This is demonstrated for a variety of reconstruction problems across diverse problem domains, including 3D multispectral demosaicing and 4D light field view synthesis.

  7. Consensus Convolutional Sparse Coding

    KAUST Repository

    Choudhury, Biswarup

    2017-04-11

    Convolutional sparse coding (CSC) is a promising direction for unsupervised learning in computer vision. In contrast to recent supervised methods, CSC allows for convolutional image representations to be learned that are equally useful for high-level vision tasks and low-level image reconstruction and can be applied to a wide range of tasks without problem-specific retraining. Due to their extreme memory requirements, however, existing CSC solvers have so far been limited to low-dimensional problems and datasets using a handful of low-resolution example images at a time. In this paper, we propose a new approach to solving CSC as a consensus optimization problem, which lifts these limitations. By learning CSC features from large-scale image datasets for the first time, we achieve significant quality improvements in a number of imaging tasks. Moreover, the proposed method enables new applications in high dimensional feature learning that has been intractable using existing CSC methods. This is demonstrated for a variety of reconstruction problems across diverse problem domains, including 3D multispectral demosaickingand 4D light field view synthesis.

  8. Consensus Convolutional Sparse Coding

    KAUST Repository

    Choudhury, Biswarup; Swanson, Robin; Heide, Felix; Wetzstein, Gordon; Heidrich, Wolfgang

    2017-01-01

    Convolutional sparse coding (CSC) is a promising direction for unsupervised learning in computer vision. In contrast to recent supervised methods, CSC allows for convolutional image representations to be learned that are equally useful for high-level vision tasks and low-level image reconstruction and can be applied to a wide range of tasks without problem-specific retraining. Due to their extreme memory requirements, however, existing CSC solvers have so far been limited to low-dimensional problems and datasets using a handful of low-resolution example images at a time. In this paper, we propose a new approach to solving CSC as a consensus optimization problem, which lifts these limitations. By learning CSC features from large-scale image datasets for the first time, we achieve significant quality improvements in a number of imaging tasks. Moreover, the proposed method enables new applications in high-dimensional feature learning that has been intractable using existing CSC methods. This is demonstrated for a variety of reconstruction problems across diverse problem domains, including 3D multispectral demosaicing and 4D light field view synthesis.

  9. An automatic way of finding robust elimination trees for a multi-frontal sparse solver for radical 2D hierarchical meshes

    KAUST Repository

    AbouEisha, Hassan M.; Gurgul, Piotr; Paszyńska, Anna; Paszyński, Maciej R.; Kuźnik, Krzysztof M.; Moshkov, Mikhail

    2014-01-01

    computational cost as well as heuristic parallel multi-frontal direct solver algorithm resulting in a logarithmic computational cost. The resulting parallel algorithm is implemented on NVIDIA CUDA GPU architecture based on our graph-grammar approach. © 2014

  10. SparseM: A Sparse Matrix Package for R *

    Directory of Open Access Journals (Sweden)

    Roger Koenker

    2003-02-01

    Full Text Available SparseM provides some basic R functionality for linear algebra with sparse matrices. Use of the package is illustrated by a family of linear model fitting functions that implement least squares methods for problems with sparse design matrices. Significant performance improvements in memory utilization and computational speed are possible for applications involving large sparse matrices.

  11. Iterative solvers in forming process simulations

    NARCIS (Netherlands)

    van den Boogaard, Antonius H.; Rietman, Bert; Huetink, Han

    1998-01-01

    The use of iterative solvers in implicit forming process simulations is studied. The time and memory requirements are compared with direct solvers and assessed in relation with the rest of the Newton-Raphson iteration process. It is shown that conjugate gradient{like solvers with a proper

  12. Hybrid Direct and Iterative Solver with Library of Multi-criteria Optimal Orderings for h Adaptive Finite Element Method Computations

    KAUST Repository

    AbouEisha, Hassan M.

    2016-06-02

    In this paper we present a multi-criteria optimization of element partition trees and resulting orderings for multi-frontal solver algorithms executed for two dimensional h adaptive finite element method. In particular, the problem of optimal ordering of elimination of rows in the sparse matrices resulting from adaptive finite element method computations is reduced to the problem of finding of optimal element partition trees. Given a two dimensional h refined mesh, we find all optimal element partition trees by using the dynamic programming approach. An element partition tree defines a prescribed order of elimination of degrees of freedom over the mesh. We utilize three different metrics to estimate the quality of the element partition tree. As the first criterion we consider the number of floating point operations(FLOPs) performed by the multi-frontal solver. As the second criterion we consider the number of memory transfers (MEMOPS) performed by the multi-frontal solver algorithm. As the third criterion we consider memory usage (NONZEROS) of the multi-frontal direct solver. We show the optimization results for FLOPs vs MEMOPS as well as for the execution time estimated as FLOPs+100MEMOPS vs NONZEROS. We obtain Pareto fronts with multiple optimal trees, for each mesh, and for each refinement level. We generate a library of optimal elimination trees for small grids with local singularities. We also propose an algorithm that for a given large mesh with identified local sub-grids, each one with local singularity. We compute Schur complements over the sub-grids using the optimal trees from the library, and we submit the sequence of Schur complements into the iterative solver ILUPCG.

  13. Comparison of open-source linear programming solvers.

    Energy Technology Data Exchange (ETDEWEB)

    Gearhart, Jared Lee; Adair, Kristin Lynn; Durfee, Justin David.; Jones, Katherine A.; Martin, Nathaniel; Detry, Richard Joseph

    2013-10-01

    When developing linear programming models, issues such as budget limitations, customer requirements, or licensing may preclude the use of commercial linear programming solvers. In such cases, one option is to use an open-source linear programming solver. A survey of linear programming tools was conducted to identify potential open-source solvers. From this survey, four open-source solvers were tested using a collection of linear programming test problems and the results were compared to IBM ILOG CPLEX Optimizer (CPLEX) [1], an industry standard. The solvers considered were: COIN-OR Linear Programming (CLP) [2], [3], GNU Linear Programming Kit (GLPK) [4], lp_solve [5] and Modular In-core Nonlinear Optimization System (MINOS) [6]. As no open-source solver outperforms CPLEX, this study demonstrates the power of commercial linear programming software. CLP was found to be the top performing open-source solver considered in terms of capability and speed. GLPK also performed well but cannot match the speed of CLP or CPLEX. lp_solve and MINOS were considerably slower and encountered issues when solving several test problems.

  14. A Matlab-based finite-difference solver for the Poisson problem with mixed Dirichlet-Neumann boundary conditions

    Science.gov (United States)

    Reimer, Ashton S.; Cheviakov, Alexei F.

    2013-03-01

    A Matlab-based finite-difference numerical solver for the Poisson equation for a rectangle and a disk in two dimensions, and a spherical domain in three dimensions, is presented. The solver is optimized for handling an arbitrary combination of Dirichlet and Neumann boundary conditions, and allows for full user control of mesh refinement. The solver routines utilize effective and parallelized sparse vector and matrix operations. Computations exhibit high speeds, numerical stability with respect to mesh size and mesh refinement, and acceptable error values even on desktop computers. Catalogue identifier: AENQ_v1_0 Program summary URL:http://cpc.cs.qub.ac.uk/summaries/AENQ_v1_0.html Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland Licensing provisions: GNU General Public License v3.0 No. of lines in distributed program, including test data, etc.: 102793 No. of bytes in distributed program, including test data, etc.: 369378 Distribution format: tar.gz Programming language: Matlab 2010a. Computer: PC, Macintosh. Operating system: Windows, OSX, Linux. RAM: 8 GB (8, 589, 934, 592 bytes) Classification: 4.3. Nature of problem: To solve the Poisson problem in a standard domain with “patchy surface”-type (strongly heterogeneous) Neumann/Dirichlet boundary conditions. Solution method: Finite difference with mesh refinement. Restrictions: Spherical domain in 3D; rectangular domain or a disk in 2D. Unusual features: Choice between mldivide/iterative solver for the solution of large system of linear algebraic equations that arise. Full user control of Neumann/Dirichlet boundary conditions and mesh refinement. Running time: Depending on the number of points taken and the geometry of the domain, the routine may take from less than a second to several hours to execute.

  15. Equivalent charge source model based iterative maximum neighbor weight for sparse EEG source localization.

    Science.gov (United States)

    Xu, Peng; Tian, Yin; Lei, Xu; Hu, Xiao; Yao, Dezhong

    2008-12-01

    How to localize the neural electric activities within brain effectively and precisely from the scalp electroencephalogram (EEG) recordings is a critical issue for current study in clinical neurology and cognitive neuroscience. In this paper, based on the charge source model and the iterative re-weighted strategy, proposed is a new maximum neighbor weight based iterative sparse source imaging method, termed as CMOSS (Charge source model based Maximum neighbOr weight Sparse Solution). Different from the weight used in focal underdetermined system solver (FOCUSS) where the weight for each point in the discrete solution space is independently updated in iterations, the new designed weight for each point in each iteration is determined by the source solution of the last iteration at both the point and its neighbors. Using such a new weight, the next iteration may have a bigger chance to rectify the local source location bias existed in the previous iteration solution. The simulation studies with comparison to FOCUSS and LORETA for various source configurations were conducted on a realistic 3-shell head model, and the results confirmed the validation of CMOSS for sparse EEG source localization. Finally, CMOSS was applied to localize sources elicited in a visual stimuli experiment, and the result was consistent with those source areas involved in visual processing reported in previous studies.

  16. Semi-supervised sparse coding

    KAUST Repository

    Wang, Jim Jing-Yan; Gao, Xin

    2014-01-01

    Sparse coding approximates the data sample as a sparse linear combination of some basic codewords and uses the sparse codes as new presentations. In this paper, we investigate learning discriminative sparse codes by sparse coding in a semi-supervised manner, where only a few training samples are labeled. By using the manifold structure spanned by the data set of both labeled and unlabeled samples and the constraints provided by the labels of the labeled samples, we learn the variable class labels for all the samples. Furthermore, to improve the discriminative ability of the learned sparse codes, we assume that the class labels could be predicted from the sparse codes directly using a linear classifier. By solving the codebook, sparse codes, class labels and classifier parameters simultaneously in a unified objective function, we develop a semi-supervised sparse coding algorithm. Experiments on two real-world pattern recognition problems demonstrate the advantage of the proposed methods over supervised sparse coding methods on partially labeled data sets.

  17. Semi-supervised sparse coding

    KAUST Repository

    Wang, Jim Jing-Yan

    2014-07-06

    Sparse coding approximates the data sample as a sparse linear combination of some basic codewords and uses the sparse codes as new presentations. In this paper, we investigate learning discriminative sparse codes by sparse coding in a semi-supervised manner, where only a few training samples are labeled. By using the manifold structure spanned by the data set of both labeled and unlabeled samples and the constraints provided by the labels of the labeled samples, we learn the variable class labels for all the samples. Furthermore, to improve the discriminative ability of the learned sparse codes, we assume that the class labels could be predicted from the sparse codes directly using a linear classifier. By solving the codebook, sparse codes, class labels and classifier parameters simultaneously in a unified objective function, we develop a semi-supervised sparse coding algorithm. Experiments on two real-world pattern recognition problems demonstrate the advantage of the proposed methods over supervised sparse coding methods on partially labeled data sets.

  18. Self-correcting Multigrid Solver

    International Nuclear Information System (INIS)

    Lewandowski, Jerome L.V.

    2004-01-01

    A new multigrid algorithm based on the method of self-correction for the solution of elliptic problems is described. The method exploits information contained in the residual to dynamically modify the source term (right-hand side) of the elliptic problem. It is shown that the self-correcting solver is more efficient at damping the short wavelength modes of the algebraic error than its standard equivalent. When used in conjunction with a multigrid method, the resulting solver displays an improved convergence rate with no additional computational work

  19. In Defense of Sparse Tracking: Circulant Sparse Tracker

    KAUST Repository

    Zhang, Tianzhu; Bibi, Adel Aamer; Ghanem, Bernard

    2016-01-01

    Sparse representation has been introduced to visual tracking by finding the best target candidate with minimal reconstruction error within the particle filter framework. However, most sparse representation based trackers have high computational cost, less than promising tracking performance, and limited feature representation. To deal with the above issues, we propose a novel circulant sparse tracker (CST), which exploits circulant target templates. Because of the circulant structure property, CST has the following advantages: (1) It can refine and reduce particles using circular shifts of target templates. (2) The optimization can be efficiently solved entirely in the Fourier domain. (3) High dimensional features can be embedded into CST to significantly improve tracking performance without sacrificing much computation time. Both qualitative and quantitative evaluations on challenging benchmark sequences demonstrate that CST performs better than all other sparse trackers and favorably against state-of-the-art methods.

  20. In Defense of Sparse Tracking: Circulant Sparse Tracker

    KAUST Repository

    Zhang, Tianzhu

    2016-12-13

    Sparse representation has been introduced to visual tracking by finding the best target candidate with minimal reconstruction error within the particle filter framework. However, most sparse representation based trackers have high computational cost, less than promising tracking performance, and limited feature representation. To deal with the above issues, we propose a novel circulant sparse tracker (CST), which exploits circulant target templates. Because of the circulant structure property, CST has the following advantages: (1) It can refine and reduce particles using circular shifts of target templates. (2) The optimization can be efficiently solved entirely in the Fourier domain. (3) High dimensional features can be embedded into CST to significantly improve tracking performance without sacrificing much computation time. Both qualitative and quantitative evaluations on challenging benchmark sequences demonstrate that CST performs better than all other sparse trackers and favorably against state-of-the-art methods.

  1. Parallel Solver for H(div) Problems Using Hybridization and AMG

    Energy Technology Data Exchange (ETDEWEB)

    Lee, Chak S. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Vassilevski, Panayot S. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)

    2016-01-15

    In this paper, a scalable parallel solver is proposed for H(div) problems discretized by arbitrary order finite elements on general unstructured meshes. The solver is based on hybridization and algebraic multigrid (AMG). Unlike some previously studied H(div) solvers, the hybridization solver does not require discrete curl and gradient operators as additional input from the user. Instead, only some element information is needed in the construction of the solver. The hybridization results in a H1-equivalent symmetric positive definite system, which is then rescaled and solved by AMG solvers designed for H1 problems. Weak and strong scaling of the method are examined through several numerical tests. Our numerical results show that the proposed solver provides a promising alternative to ADS, a state-of-the-art solver [12], for H(div) problems. In fact, it outperforms ADS for higher order elements.

  2. Turbo-SMT: Parallel Coupled Sparse Matrix-Tensor Factorizations and Applications

    Science.gov (United States)

    Papalexakis, Evangelos E.; Faloutsos, Christos; Mitchell, Tom M.; Talukdar, Partha Pratim; Sidiropoulos, Nicholas D.; Murphy, Brian

    2016-01-01

    How can we correlate the neural activity in the human brain as it responds to typed words, with properties of these terms (like ’edible’, ’fits in hand’)? In short, we want to find latent variables, that jointly explain both the brain activity, as well as the behavioral responses. This is one of many settings of the Coupled Matrix-Tensor Factorization (CMTF) problem. Can we enhance any CMTF solver, so that it can operate on potentially very large datasets that may not fit in main memory? We introduce Turbo-SMT, a meta-method capable of doing exactly that: it boosts the performance of any CMTF algorithm, produces sparse and interpretable solutions, and parallelizes any CMTF algorithm, producing sparse and interpretable solutions (up to 65 fold). Additionally, we improve upon ALS, the work-horse algorithm for CMTF, with respect to efficiency and robustness to missing values. We apply Turbo-SMT to BrainQ, a dataset consisting of a (nouns, brain voxels, human subjects) tensor and a (nouns, properties) matrix, with coupling along the nouns dimension. Turbo-SMT is able to find meaningful latent variables, as well as to predict brain activity with competitive accuracy. Finally, we demonstrate the generality of Turbo-SMT, by applying it on a Facebook dataset (users, ’friends’, wall-postings); there, Turbo-SMT spots spammer-like anomalies. PMID:27672406

  3. A new optimization method using a compressed sensing inspired solver for real-time LDR-brachytherapy treatment planning

    International Nuclear Information System (INIS)

    Guthier, C; Aschenbrenner, K P; Buergy, D; Ehmann, M; Wenz, F; Hesser, J W

    2015-01-01

    This work discusses a novel strategy for inverse planning in low dose rate brachytherapy. It applies the idea of compressed sensing to the problem of inverse treatment planning and a new solver for this formulation is developed. An inverse planning algorithm was developed incorporating brachytherapy dose calculation methods as recommended by AAPM TG-43. For optimization of the functional a new variant of a matching pursuit type solver is presented. The results are compared with current state-of-the-art inverse treatment planning algorithms by means of real prostate cancer patient data. The novel strategy outperforms the best state-of-the-art methods in speed, while achieving comparable quality. It is able to find solutions with comparable values for the objective function and it achieves these results within a few microseconds, being up to 542 times faster than competing state-of-the-art strategies, allowing real-time treatment planning. The sparse solution of inverse brachytherapy planning achieved with methods from compressed sensing is a new paradigm for optimization in medical physics. Through the sparsity of required needles and seeds identified by this method, the cost of intervention may be reduced. (paper)

  4. A new optimization method using a compressed sensing inspired solver for real-time LDR-brachytherapy treatment planning

    Science.gov (United States)

    Guthier, C.; Aschenbrenner, K. P.; Buergy, D.; Ehmann, M.; Wenz, F.; Hesser, J. W.

    2015-03-01

    This work discusses a novel strategy for inverse planning in low dose rate brachytherapy. It applies the idea of compressed sensing to the problem of inverse treatment planning and a new solver for this formulation is developed. An inverse planning algorithm was developed incorporating brachytherapy dose calculation methods as recommended by AAPM TG-43. For optimization of the functional a new variant of a matching pursuit type solver is presented. The results are compared with current state-of-the-art inverse treatment planning algorithms by means of real prostate cancer patient data. The novel strategy outperforms the best state-of-the-art methods in speed, while achieving comparable quality. It is able to find solutions with comparable values for the objective function and it achieves these results within a few microseconds, being up to 542 times faster than competing state-of-the-art strategies, allowing real-time treatment planning. The sparse solution of inverse brachytherapy planning achieved with methods from compressed sensing is a new paradigm for optimization in medical physics. Through the sparsity of required needles and seeds identified by this method, the cost of intervention may be reduced.

  5. Modern solvers for Helmholtz problems

    CERN Document Server

    Tang, Jok; Vuik, Kees

    2017-01-01

    This edited volume offers a state of the art overview of fast and robust solvers for the Helmholtz equation. The book consists of three parts: new developments and analysis in Helmholtz solvers, practical methods and implementations of Helmholtz solvers, and industrial applications. The Helmholtz equation appears in a wide range of science and engineering disciplines in which wave propagation is modeled. Examples are: seismic inversion, ultrasone medical imaging, sonar detection of submarines, waves in harbours and many more. The partial differential equation looks simple but is hard to solve. In order to approximate the solution of the problem numerical methods are needed. First a discretization is done. Various methods can be used: (high order) Finite Difference Method, Finite Element Method, Discontinuous Galerkin Method and Boundary Element Method. The resulting linear system is large, where the size of the problem increases with increasing frequency. Due to higher frequencies the seismic images need to b...

  6. Structural Sparse Tracking

    KAUST Repository

    Zhang, Tianzhu

    2015-06-01

    Sparse representation has been applied to visual tracking by finding the best target candidate with minimal reconstruction error by use of target templates. However, most sparse representation based trackers only consider holistic or local representations and do not make full use of the intrinsic structure among and inside target candidates, thereby making the representation less effective when similar objects appear or under occlusion. In this paper, we propose a novel Structural Sparse Tracking (SST) algorithm, which not only exploits the intrinsic relationship among target candidates and their local patches to learn their sparse representations jointly, but also preserves the spatial layout structure among the local patches inside each target candidate. We show that our SST algorithm accommodates most existing sparse trackers with the respective merits. Both qualitative and quantitative evaluations on challenging benchmark image sequences demonstrate that the proposed SST algorithm performs favorably against several state-of-the-art methods.

  7. Differences in the Processes of Solving Physics Problems between Good Physics Problem Solvers and Poor Physics Problem Solvers.

    Science.gov (United States)

    Finegold, M.; Mass, R.

    1985-01-01

    Good problem solvers and poor problem solvers in advanced physics (N=8) were significantly different in their ability in translating, planning, and physical reasoning, as well as in problem solving time; no differences in reliance on algebraic solutions and checking problems were noted. Implications for physics teaching are discussed. (DH)

  8. A finite different field solver for dipole modes

    International Nuclear Information System (INIS)

    Nelson, E.M.

    1992-08-01

    A finite element field solver for dipole modes in axisymmetric structures has been written. The second-order elements used in this formulation yield accurate mode frequencies with no spurious modes. Quasi-periodic boundaries are included to allow travelling waves in periodic structures. The solver is useful in applications requiring precise frequency calculations such as detuned accelerator structures for linear colliders. Comparisons are made with measurements and with the popular but less accurate field solver URMEL

  9. Telescopic Hybrid Fast Solver for 3D Elliptic Problems with Point Singularities

    KAUST Repository

    Paszyńska, Anna; Jopek, Konrad; Banaś, Krzysztof; Paszyński, Maciej; Gurgul, Piotr; Lenerth, Andrew; Nguyen, Donald; Pingali, Keshav; Dalcind, Lisandro; Calo, Victor M.

    2015-01-01

    This paper describes a telescopic solver for two dimensional h adaptive grids with point singularities. The input for the telescopic solver is an h refined two dimensional computational mesh with rectangular finite elements. The candidates for point singularities are first localized over the mesh by using a greedy algorithm. Having the candidates for point singularities, we execute either a direct solver, that performs multiple refinements towards selected point singularities and executes a parallel direct solver algorithm which has logarithmic cost with respect to refinement level. The direct solvers executed over each candidate for point singularity return local Schur complement matrices that can be merged together and submitted to iterative solver. In this paper we utilize a parallel multi-thread GALOIS solver as a direct solver. We use Incomplete LU Preconditioned Conjugated Gradients (ILUPCG) as an iterative solver. We also show that elimination of point singularities from the refined mesh reduces significantly the number of iterations to be performed by the ILUPCG iterative solver.

  10. Telescopic Hybrid Fast Solver for 3D Elliptic Problems with Point Singularities

    KAUST Repository

    Paszyńska, Anna

    2015-06-01

    This paper describes a telescopic solver for two dimensional h adaptive grids with point singularities. The input for the telescopic solver is an h refined two dimensional computational mesh with rectangular finite elements. The candidates for point singularities are first localized over the mesh by using a greedy algorithm. Having the candidates for point singularities, we execute either a direct solver, that performs multiple refinements towards selected point singularities and executes a parallel direct solver algorithm which has logarithmic cost with respect to refinement level. The direct solvers executed over each candidate for point singularity return local Schur complement matrices that can be merged together and submitted to iterative solver. In this paper we utilize a parallel multi-thread GALOIS solver as a direct solver. We use Incomplete LU Preconditioned Conjugated Gradients (ILUPCG) as an iterative solver. We also show that elimination of point singularities from the refined mesh reduces significantly the number of iterations to be performed by the ILUPCG iterative solver.

  11. Summer Proceedings 2016: The Center for Computing Research at Sandia National Laboratories

    Energy Technology Data Exchange (ETDEWEB)

    Carleton, James Brian [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Parks, Michael L. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)

    2017-02-01

    Solving sparse linear systems from the discretization of elliptic partial differential equations (PDEs) is an important building block in many engineering applications. Sparse direct solvers can solve general linear systems, but are usually slower and use much more memory than effective iterative solvers. To overcome these two disadvantages, a hierarchical solver (LoRaSp) based on H2-matrices was introduced in [22]. Here, we have developed a parallel version of the algorithm in LoRaSp to solve large sparse matrices on distributed memory machines. On a single processor, the factorization time of our parallel solver scales almost linearly with the problem size for three-dimensional problems, as opposed to the quadratic scalability of many existing sparse direct solvers. Moreover, our solver leads to almost constant numbers of iterations, when used as a preconditioner for Poisson problems. On more than one processor, our algorithm has significant speedups compared to sequential runs. With this parallel algorithm, we are able to solve large problems much faster than many existing packages as demonstrated by the numerical experiments.

  12. MINOS: A simplified Pn solver for core calculation

    International Nuclear Information System (INIS)

    Baudron, A.M.; Lautard, J.J.

    2007-01-01

    This paper describes a new generation of the neutronic core solver MINOS resulting from developments done in the DESCARTES project. For performance reasons, the numerical method of the existing MINOS solver in the SAPHYR system has been reused in the new system. It is based on the mixed-dual finite element approximation of the simplified transport equation. We have extended the previous method to the treatment of unstructured geometries composed by quadrilaterals, allowing us to treat geometries where fuel pins are exactly represented. For Cartesian geometries, the solver takes into account assembly discontinuity coefficients in the simplified P n context. The solver has been rewritten in C + + programming language using an object-oriented design. Its general architecture was reconsidered in order to improve its capability of evolution and its maintainability. Moreover, the performance of the previous version has been improved mainly regarding the matrix construction time; this result improves significantly the performance of the solver in the context of industrial application requiring thermal-hydraulic feedback and depletion calculations. (authors)

  13. Test set for initial value problem solvers

    NARCIS (Netherlands)

    W.M. Lioen (Walter); J.J.B. de Swart (Jacques)

    1998-01-01

    textabstractThe CWI test set for IVP solvers presents a collection of Initial Value Problems to test solvers for implicit differential equations. This test set can both decrease the effort for the code developer to test his software in a reliable way, and cross the bridge between the application

  14. A finite element field solver for dipole modes

    International Nuclear Information System (INIS)

    Nelson, E.M.

    1992-01-01

    A finite element field solver for dipole modes in axisymmetric structures has been written. The second-order elements used in this formulation yield accurate mode frequencies with no spurious modes. Quasi-periodic boundaries are included to allow travelling waves in periodic structures. The solver is useful in applications requiring precise frequency calculations such as detuned accelerator structures for linear colliders. Comparisons are made with measurements and with the popular but less accurate field solver URMEL. (author). 7 refs., 4 figs

  15. Learning Domain-Specific Heuristics for Answer Set Solvers

    OpenAIRE

    Balduccini, Marcello

    2010-01-01

    In spite of the recent improvements in the performance of Answer Set Programming (ASP) solvers, when the search space is sufficiently large, it is still possible for the search algorithm to mistakenly focus on areas of the search space that contain no solutions or very few. When that happens, performance degrades substantially, even to the point that the solver may need to be terminated before returning an answer. This prospect is a concern when one is considering using such a solver in an in...

  16. Acceleration of FDTD mode solver by high-performance computing techniques.

    Science.gov (United States)

    Han, Lin; Xi, Yanping; Huang, Wei-Ping

    2010-06-21

    A two-dimensional (2D) compact finite-difference time-domain (FDTD) mode solver is developed based on wave equation formalism in combination with the matrix pencil method (MPM). The method is validated for calculation of both real guided and complex leaky modes of typical optical waveguides against the bench-mark finite-difference (FD) eigen mode solver. By taking advantage of the inherent parallel nature of the FDTD algorithm, the mode solver is implemented on graphics processing units (GPUs) using the compute unified device architecture (CUDA). It is demonstrated that the high-performance computing technique leads to significant acceleration of the FDTD mode solver with more than 30 times improvement in computational efficiency in comparison with the conventional FDTD mode solver running on CPU of a standard desktop computer. The computational efficiency of the accelerated FDTD method is in the same order of magnitude of the standard finite-difference eigen mode solver and yet require much less memory (e.g., less than 10%). Therefore, the new method may serve as an efficient, accurate and robust tool for mode calculation of optical waveguides even when the conventional eigen value mode solvers are no longer applicable due to memory limitation.

  17. Fast multipole preconditioners for sparse matrices arising from elliptic equations

    KAUST Repository

    Ibeid, Huda

    2017-11-09

    Among optimal hierarchical algorithms for the computational solution of elliptic problems, the fast multipole method (FMM) stands out for its adaptability to emerging architectures, having high arithmetic intensity, tunable accuracy, and relaxable global synchronization requirements. We demonstrate that, beyond its traditional use as a solver in problems for which explicit free-space kernel representations are available, the FMM has applicability as a preconditioner in finite domain elliptic boundary value problems, by equipping it with boundary integral capability for satisfying conditions at finite boundaries and by wrapping it in a Krylov method for extensibility to more general operators. Here, we do not discuss the well developed applications of FMM to implement matrix-vector multiplications within Krylov solvers of boundary element methods. Instead, we propose using FMM for the volume-to-volume contribution of inhomogeneous Poisson-like problems, where the boundary integral is a small part of the overall computation. Our method may be used to precondition sparse matrices arising from finite difference/element discretizations, and can handle a broader range of scientific applications. It is capable of algebraic convergence rates down to the truncation error of the discretized PDE comparable to those of multigrid methods, and it offers potentially superior multicore and distributed memory scalability properties on commodity architecture supercomputers. Compared with other methods exploiting the low-rank character of off-diagonal blocks of the dense resolvent operator, FMM-preconditioned Krylov iteration may reduce the amount of communication because it is matrix-free and exploits the tree structure of FMM. We describe our tests in reproducible detail with freely available codes and outline directions for further extensibility.

  18. Fast multipole preconditioners for sparse matrices arising from elliptic equations

    KAUST Repository

    Ibeid, Huda; Yokota, Rio; Pestana, Jennifer; Keyes, David E.

    2017-01-01

    Among optimal hierarchical algorithms for the computational solution of elliptic problems, the fast multipole method (FMM) stands out for its adaptability to emerging architectures, having high arithmetic intensity, tunable accuracy, and relaxable global synchronization requirements. We demonstrate that, beyond its traditional use as a solver in problems for which explicit free-space kernel representations are available, the FMM has applicability as a preconditioner in finite domain elliptic boundary value problems, by equipping it with boundary integral capability for satisfying conditions at finite boundaries and by wrapping it in a Krylov method for extensibility to more general operators. Here, we do not discuss the well developed applications of FMM to implement matrix-vector multiplications within Krylov solvers of boundary element methods. Instead, we propose using FMM for the volume-to-volume contribution of inhomogeneous Poisson-like problems, where the boundary integral is a small part of the overall computation. Our method may be used to precondition sparse matrices arising from finite difference/element discretizations, and can handle a broader range of scientific applications. It is capable of algebraic convergence rates down to the truncation error of the discretized PDE comparable to those of multigrid methods, and it offers potentially superior multicore and distributed memory scalability properties on commodity architecture supercomputers. Compared with other methods exploiting the low-rank character of off-diagonal blocks of the dense resolvent operator, FMM-preconditioned Krylov iteration may reduce the amount of communication because it is matrix-free and exploits the tree structure of FMM. We describe our tests in reproducible detail with freely available codes and outline directions for further extensibility.

  19. A Novel Interactive MINLP Solver for CAPE Applications

    DEFF Research Database (Denmark)

    Henriksen, Jens Peter; Støy, S.; Russel, Boris Mariboe

    2000-01-01

    This paper presents an interactive MINLP solver that is particularly suitable for solution of process synthesis, design and analysis problems. The interactive MINLP solver is based on the decomposition based MINLP algorithms, where a NLP sub-problem is solved in the innerloop and a MILP master pr...

  20. Two-dimensional time dependent Riemann solvers for neutron transport

    International Nuclear Information System (INIS)

    Brunner, Thomas A.; Holloway, James Paul

    2005-01-01

    A two-dimensional Riemann solver is developed for the spherical harmonics approximation to the time dependent neutron transport equation. The eigenstructure of the resulting equations is explored, giving insight into both the spherical harmonics approximation and the Riemann solver. The classic Roe-type Riemann solver used here was developed for one-dimensional problems, but can be used in multidimensional problems by treating each face of a two-dimensional computation cell in a locally one-dimensional way. Several test problems are used to explore the capabilities of both the Riemann solver and the spherical harmonics approximation. The numerical solution for a simple line source problem is compared to the analytic solution to both the P 1 equation and the full transport solution. A lattice problem is used to test the method on a more challenging problem

  1. Parallel iterative solvers and preconditioners using approximate hierarchical methods

    Energy Technology Data Exchange (ETDEWEB)

    Grama, A.; Kumar, V.; Sameh, A. [Univ. of Minnesota, Minneapolis, MN (United States)

    1996-12-31

    In this paper, we report results of the performance, convergence, and accuracy of a parallel GMRES solver for Boundary Element Methods. The solver uses a hierarchical approximate matrix-vector product based on a hybrid Barnes-Hut / Fast Multipole Method. We study the impact of various accuracy parameters on the convergence and show that with minimal loss in accuracy, our solver yields significant speedups. We demonstrate the excellent parallel efficiency and scalability of our solver. The combined speedups from approximation and parallelism represent an improvement of several orders in solution time. We also develop fast and paralellizable preconditioners for this problem. We report on the performance of an inner-outer scheme and a preconditioner based on truncated Green`s function. Experimental results on a 256 processor Cray T3D are presented.

  2. Fast Sparse Coding for Range Data Denoising with Sparse Ridges Constraint

    Directory of Open Access Journals (Sweden)

    Zhi Gao

    2018-05-01

    Full Text Available Light detection and ranging (LiDAR sensors have been widely deployed on intelligent systems such as unmanned ground vehicles (UGVs and unmanned aerial vehicles (UAVs to perform localization, obstacle detection, and navigation tasks. Thus, research into range data processing with competitive performance in terms of both accuracy and efficiency has attracted increasing attention. Sparse coding has revolutionized signal processing and led to state-of-the-art performance in a variety of applications. However, dictionary learning, which plays the central role in sparse coding techniques, is computationally demanding, resulting in its limited applicability in real-time systems. In this study, we propose sparse coding algorithms with a fixed pre-learned ridge dictionary to realize range data denoising via leveraging the regularity of laser range measurements in man-made environments. Experiments on both synthesized data and real data demonstrate that our method obtains accuracy comparable to that of sophisticated sparse coding methods, but with much higher computational efficiency.

  3. Fast Sparse Coding for Range Data Denoising with Sparse Ridges Constraint.

    Science.gov (United States)

    Gao, Zhi; Lao, Mingjie; Sang, Yongsheng; Wen, Fei; Ramesh, Bharath; Zhai, Ruifang

    2018-05-06

    Light detection and ranging (LiDAR) sensors have been widely deployed on intelligent systems such as unmanned ground vehicles (UGVs) and unmanned aerial vehicles (UAVs) to perform localization, obstacle detection, and navigation tasks. Thus, research into range data processing with competitive performance in terms of both accuracy and efficiency has attracted increasing attention. Sparse coding has revolutionized signal processing and led to state-of-the-art performance in a variety of applications. However, dictionary learning, which plays the central role in sparse coding techniques, is computationally demanding, resulting in its limited applicability in real-time systems. In this study, we propose sparse coding algorithms with a fixed pre-learned ridge dictionary to realize range data denoising via leveraging the regularity of laser range measurements in man-made environments. Experiments on both synthesized data and real data demonstrate that our method obtains accuracy comparable to that of sophisticated sparse coding methods, but with much higher computational efficiency.

  4. When sparse coding meets ranking: a joint framework for learning sparse codes and ranking scores

    KAUST Repository

    Wang, Jim Jing-Yan

    2017-06-28

    Sparse coding, which represents a data point as a sparse reconstruction code with regard to a dictionary, has been a popular data representation method. Meanwhile, in database retrieval problems, learning the ranking scores from data points plays an important role. Up to now, these two problems have always been considered separately, assuming that data coding and ranking are two independent and irrelevant problems. However, is there any internal relationship between sparse coding and ranking score learning? If yes, how to explore and make use of this internal relationship? In this paper, we try to answer these questions by developing the first joint sparse coding and ranking score learning algorithm. To explore the local distribution in the sparse code space, and also to bridge coding and ranking problems, we assume that in the neighborhood of each data point, the ranking scores can be approximated from the corresponding sparse codes by a local linear function. By considering the local approximation error of ranking scores, the reconstruction error and sparsity of sparse coding, and the query information provided by the user, we construct a unified objective function for learning of sparse codes, the dictionary and ranking scores. We further develop an iterative algorithm to solve this optimization problem.

  5. Development of axisymmetric lattice Boltzmann flux solver for complex multiphase flows

    Science.gov (United States)

    Wang, Yan; Shu, Chang; Yang, Li-Ming; Yuan, Hai-Zhuan

    2018-05-01

    This paper presents an axisymmetric lattice Boltzmann flux solver (LBFS) for simulating axisymmetric multiphase flows. In the solver, the two-dimensional (2D) multiphase LBFS is applied to reconstruct macroscopic fluxes excluding axisymmetric effects. Source terms accounting for axisymmetric effects are introduced directly into the governing equations. As compared to conventional axisymmetric multiphase lattice Boltzmann (LB) method, the present solver has the kinetic feature for flux evaluation and avoids complex derivations of external forcing terms. In addition, the present solver also saves considerable computational efforts in comparison with three-dimensional (3D) computations. The capability of the proposed solver in simulating complex multiphase flows is demonstrated by studying single bubble rising in a circular tube. The obtained results compare well with the published data.

  6. Refined isogeometric analysis for a preconditioned conjugate gradient solver

    KAUST Repository

    Garcia, Daniel

    2018-02-12

    Starting from a highly continuous Isogeometric Analysis (IGA) discretization, refined Isogeometric Analysis (rIGA) introduces C0 hyperplanes that act as separators for the direct LU factorization solver. As a result, the total computational cost required to solve the corresponding system of equations using a direct LU factorization solver dramatically reduces (up to a factor of 55) Garcia et al. (2017). At the same time, rIGA enriches the IGA spaces, thus improving the best approximation error. In this work, we extend the complexity analysis of rIGA to the case of iterative solvers. We build an iterative solver as follows: we first construct the Schur complements using a direct solver over small subdomains (macro-elements). We then assemble those Schur complements into a global skeleton system. Subsequently, we solve this system iteratively using Conjugate Gradients (CG) with an incomplete LU (ILU) preconditioner. For a 2D Poisson model problem with a structured mesh and a uniform polynomial degree of approximation, rIGA achieves moderate savings with respect to IGA in terms of the number of Floating Point Operations (FLOPs) and computational time (in seconds) required to solve the resulting system of linear equations. For instance, for a mesh with four million elements and polynomial degree p=3, the iterative solver is approximately 2.6 times faster (in time) when applied to the rIGA system than to the IGA one. These savings occur because the skeleton rIGA system contains fewer non-zero entries than the IGA one. The opposite situation occurs for 3D problems, and as a result, 3D rIGA discretizations provide no gains with respect to their IGA counterparts when considering iterative solvers.

  7. Parallel linear solvers for simulations of reactor thermal hydraulics

    International Nuclear Information System (INIS)

    Yan, Y.; Antal, S.P.; Edge, B.; Keyes, D.E.; Shaver, D.; Bolotnov, I.A.; Podowski, M.Z.

    2011-01-01

    The state-of-the-art multiphase fluid dynamics code, NPHASE-CMFD, performs multiphase flow simulations in complex domains using implicit nonlinear treatment of the governing equations and in parallel, which is a very challenging environment for the linear solver. The present work illustrates how the Portable, Extensible Toolkit for Scientific Computation (PETSc) and scalable Algebraic Multigrid (AMG) preconditioner from Hypre can be utilized to construct robust and scalable linear solvers for the Newton correction equation obtained from the discretized system of governing conservation equations in NPHASE-CMFD. The overall long-tem objective of this work is to extend the NPHASE-CMFD code into a fully-scalable solver of multiphase flow and heat transfer problems, applicable to both steady-state and stiff time-dependent phenomena in complete fuel assemblies of nuclear reactors and, eventually, the entire reactor core (such as the Virtual Reactor concept envisioned by CASL). This campaign appropriately begins with the linear algebraic equation solver, which is traditionally a bottleneck to scalability in PDE-based codes. The computational complexity of the solver is usually superlinear in problem size, whereas the rest of the code, the “physics” portion, usually has its complexity linear in the problem size. (author)

  8. Using SPARK as a Solver for Modelica

    Energy Technology Data Exchange (ETDEWEB)

    Wetter, Michael; Wetter, Michael; Haves, Philip; Moshier, Michael A.; Sowell, Edward F.

    2008-06-30

    Modelica is an object-oriented acausal modeling language that is well positioned to become a de-facto standard for expressing models of complex physical systems. To simulate a model expressed in Modelica, it needs to be translated into executable code. For generating run-time efficient code, such a translation needs to employ algebraic formula manipulations. As the SPARK solver has been shown to be competitive for generating such code but currently cannot be used with the Modelica language, we report in this paper how SPARK's symbolic and numerical algorithms can be implemented in OpenModelica, an open-source implementation of a Modelica modeling and simulation environment. We also report benchmark results that show that for our air flow network simulation benchmark, the SPARK solver is competitive with Dymola, which is believed to provide the best solver for Modelica.

  9. Implementing the conjugate gradient algorithm on multi-core systems

    NARCIS (Netherlands)

    Wiggers, W.A.; Bakker, Vincent; Kokkeler, Andre B.J.; Smit, Gerardus Johannes Maria; Nurmi, J.; Takala, J.; Vainio, O.

    2007-01-01

    In linear solvers, like the conjugate gradient algorithm, sparse-matrix vector multiplication is an important kernel. Due to the sparseness of the matrices, the solver runs relatively slow. For digital optical tomography (DOT), a large set of linear equations have to be solved which currently takes

  10. A General Symbolic PDE Solver Generator: Explicit Schemes

    Directory of Open Access Journals (Sweden)

    K. Sheshadri

    2003-01-01

    Full Text Available A symbolic solver generator to deal with a system of partial differential equations (PDEs in functions of an arbitrary number of variables is presented; it can also handle arbitrary domains (geometries of the independent variables. Given a system of PDEs, the solver generates a set of explicit finite-difference methods to any specified order, and a Fourier stability criterion for each method. For a method that is stable, an iteration function is generated symbolically using the PDE and its initial and boundary conditions. This iteration function is dynamically generated for every PDE problem, and its evaluation provides a solution to the PDE problem. A C++/Fortran 90 code for the iteration function is generated using the MathCode system, which results in a performance gain of the order of a thousand over Mathematica, the language that has been used to code the solver generator. Examples of stability criteria are presented that agree with known criteria; examples that demonstrate the generality of the solver and the speed enhancement of the generated C++ and Fortran 90 codes are also presented.

  11. High-Performance Small-Scale Solvers for Moving Horizon Estimation

    DEFF Research Database (Denmark)

    Frison, Gianluca; Vukov, Milan; Poulsen, Niels Kjølstad

    2015-01-01

    implementation techniques focusing on small-scale problems. The proposed MHE solver is implemented using custom linear algebra routines and is compared against implementations using BLAS libraries. Additionally, the MHE solver is interfaced to a code generation tool for nonlinear model predictive control (NMPC...

  12. Users are problem solvers!

    NARCIS (Netherlands)

    Brouwer-Janse, M.D.

    1991-01-01

    Most formal problem-solving studies use verbal protocol and observational data of problem solvers working on a task. In user-centred product-design projects, observational studies of users are frequently used too. In the latter case, however, systematic control of conditions, indepth analysis and

  13. A non-conforming 3D spherical harmonic transport solver

    Energy Technology Data Exchange (ETDEWEB)

    Van Criekingen, S. [Commissariat a l' Energie Atomique CEA-Saclay, DEN/DM2S/SERMA/LENR Bat 470, 91191 Gif-sur-Yvette, Cedex (France)

    2006-07-01

    A new 3D transport solver for the time-independent Boltzmann transport equation has been developed. This solver is based on the second-order even-parity form of the transport equation. The angular discretization is performed through the expansion of the angular neutron flux in spherical harmonics (PN method). The novelty of this solver is the use of non-conforming finite elements for the spatial discretization. Such elements lead to a discontinuous flux approximation. This interface continuity requirement relaxation property is shared with mixed-dual formulations such as the ones based on Raviart-Thomas finite elements. Encouraging numerical results are presented. (authors)

  14. A non-conforming 3D spherical harmonic transport solver

    International Nuclear Information System (INIS)

    Van Criekingen, S.

    2006-01-01

    A new 3D transport solver for the time-independent Boltzmann transport equation has been developed. This solver is based on the second-order even-parity form of the transport equation. The angular discretization is performed through the expansion of the angular neutron flux in spherical harmonics (PN method). The novelty of this solver is the use of non-conforming finite elements for the spatial discretization. Such elements lead to a discontinuous flux approximation. This interface continuity requirement relaxation property is shared with mixed-dual formulations such as the ones based on Raviart-Thomas finite elements. Encouraging numerical results are presented. (authors)

  15. A multi-solver quasi-Newton method for the partitioned simulation of fluid-structure interaction

    International Nuclear Information System (INIS)

    Degroote, J; Annerel, S; Vierendeels, J

    2010-01-01

    In partitioned fluid-structure interaction simulations, the flow equations and the structural equations are solved separately. Consequently, the stresses and displacements on both sides of the fluid-structure interface are not automatically in equilibrium. Coupling techniques like Aitken relaxation and the Interface Block Quasi-Newton method with approximate Jacobians from Least-Squares models (IBQN-LS) enforce this equilibrium, even with black-box solvers. However, all existing coupling techniques use only one flow solver and one structural solver. To benefit from the large number of multi-core processors in modern clusters, a new Multi-Solver Interface Block Quasi-Newton (MS-IBQN-LS) algorithm has been developed. This algorithm uses more than one flow solver and structural solver, each running in parallel on a number of cores. One-dimensional and three-dimensional numerical experiments demonstrate that the run time of a simulation decreases as the number of solvers increases, albeit at a slower pace. Hence, the presented multi-solver algorithm accelerates fluid-structure interaction calculations by increasing the number of solvers, especially when the run time does not decrease further if more cores are used per solver.

  16. Hypersonic simulations using open-source CFD and DSMC solvers

    Science.gov (United States)

    Casseau, V.; Scanlon, T. J.; John, B.; Emerson, D. R.; Brown, R. E.

    2016-11-01

    Hypersonic hybrid hydrodynamic-molecular gas flow solvers are required to satisfy the two essential requirements of any high-speed reacting code, these being physical accuracy and computational efficiency. The James Weir Fluids Laboratory at the University of Strathclyde is currently developing an open-source hybrid code which will eventually reconcile the direct simulation Monte-Carlo method, making use of the OpenFOAM application called dsmcFoam, and the newly coded open-source two-temperature computational fluid dynamics solver named hy2Foam. In conjunction with employing the CVDV chemistry-vibration model in hy2Foam, novel use is made of the QK rates in a CFD solver. In this paper, further testing is performed, in particular with the CFD solver, to ensure its efficacy before considering more advanced test cases. The hy2Foam and dsmcFoam codes have shown to compare reasonably well, thus providing a useful basis for other codes to compare against.

  17. Cafesat: A modern sat solver for scala

    OpenAIRE

    Blanc Régis

    2013-01-01

    We present CafeSat a SAT solver written in the Scala programming language. CafeSat is a modern solver based on DPLL and featuring many state of the art techniques and heuristics. It uses two watched literals for Boolean constraint propagation conict driven learning along with clause deletion a restarting strategy and the VSIDS heuristics for choosing the branching literal. CafeSat is both sound and complete. In order to achieve reasonable performance low level and hand tuned data structures a...

  18. Sparse structure regularized ranking

    KAUST Repository

    Wang, Jim Jing-Yan; Sun, Yijun; Gao, Xin

    2014-01-01

    Learning ranking scores is critical for the multimedia database retrieval problem. In this paper, we propose a novel ranking score learning algorithm by exploring the sparse structure and using it to regularize ranking scores. To explore the sparse

  19. Simplified Eigen-structure decomposition solver for the simulation of two-phase flow systems

    International Nuclear Information System (INIS)

    Kumbaro, Anela

    2012-01-01

    This paper discusses the development of a new solver for a system of first-order non-linear differential equations that model the dynamics of compressible two-phase flow. The solver presents a lower-complexity alternative to Roe-type solvers because it only makes use of a partial Eigen-structure information while maintaining its accuracy: the outcome is hence a good complexity-tractability trade-off to consider as relevant in a large number of situations in the scope of two-phase flow numerical simulation. A number of numerical and physical benchmarks are presented to assess the solver. Comparison between the computational results from the simplified Eigen-structure decomposition solver and the conventional Roe-type solver gives insight upon the issues of accuracy, robustness and efficiency. (authors)

  20. Sparse structure regularized ranking

    KAUST Repository

    Wang, Jim Jing-Yan

    2014-04-17

    Learning ranking scores is critical for the multimedia database retrieval problem. In this paper, we propose a novel ranking score learning algorithm by exploring the sparse structure and using it to regularize ranking scores. To explore the sparse structure, we assume that each multimedia object could be represented as a sparse linear combination of all other objects, and combination coefficients are regarded as a similarity measure between objects and used to regularize their ranking scores. Moreover, we propose to learn the sparse combination coefficients and the ranking scores simultaneously. A unified objective function is constructed with regard to both the combination coefficients and the ranking scores, and is optimized by an iterative algorithm. Experiments on two multimedia database retrieval data sets demonstrate the significant improvements of the propose algorithm over state-of-the-art ranking score learning algorithms.

  1. VCODE, Ordinary Differential Equation Solver for Stiff and Non-Stiff Problems

    International Nuclear Information System (INIS)

    Cohen, Scott D.; Hindmarsh, Alan C.

    2001-01-01

    1 - Description of program or function: CVODE is a package written in ANSI standard C for solving initial value problems for ordinary differential equations. It solves both stiff and non stiff systems. In the stiff case, it includes a variety of options for treating the Jacobian of the system, including dense and band matrix solvers, and a preconditioned Krylov (iterative) solver. 2 - Method of solution: Integration is by Adams or BDF (Backward Differentiation Formula) methods, at user option. Corrector iteration is by functional iteration or Newton iteration. For the solution of linear systems within Newton iteration, users can select a dense solver, a band solver, a diagonal approximation, or a preconditioned Generalized Minimal Residual (GMRES) solver. In the dense and band cases, the user can supply a Jacobian approximation or let CVODE generate it internally. In the GMRES case, the pre-conditioner is user-supplied

  2. Minos: a SPN solver for core calculation in the DESCARTES system

    International Nuclear Information System (INIS)

    Baudron, A.M.; Lautard, J.J.

    2005-01-01

    This paper describes a new development of a neutronic core solver done in the context of a new generation neutronic reactor computational system, named DESCARTES. For performance reasons, the numerical method of the existing MINOS solver in the SAPHYR system has been reused in the new system. It is based on the mixed dual finite element approximation of the simplified transport equation. The solver takes into account assembly discontinuity coefficients (ADF) in the simplified transport equation (SPN) context. The solver has been rewritten in C++ programming language using an object oriented design. Its general architecture was reconsidered in order to improve its capability of evolution and its maintainability. Moreover, the performances of the old version have been improved mainly regarding the matrix construction time; this result improves significantly the performance of the solver in the context of industrial application requiring thermal hydraulic feedback and depletion calculations. (authors)

  3. Turbulent flows over sparse canopies

    Science.gov (United States)

    Sharma, Akshath; García-Mayoral, Ricardo

    2018-04-01

    Turbulent flows over sparse and dense canopies exerting a similar drag force on the flow are investigated using Direct Numerical Simulations. The dense canopies are modelled using a homogeneous drag force, while for the sparse canopy, the geometry of the canopy elements is represented. It is found that on using the friction velocity based on the local shear at each height, the streamwise velocity fluctuations and the Reynolds stress within the sparse canopy are similar to those from a comparable smooth-wall case. In addition, when scaled with the local friction velocity, the intensity of the off-wall peak in the streamwise vorticity for sparse canopies also recovers a value similar to a smooth-wall. This indicates that the sparse canopy does not significantly disturb the near-wall turbulence cycle, but causes its rescaling to an intensity consistent with a lower friction velocity within the canopy. In comparison, the dense canopy is found to have a higher damping effect on the turbulent fluctuations. For the case of the sparse canopy, a peak in the spectral energy density of the wall-normal velocity, and Reynolds stress is observed, which may indicate the formation of Kelvin-Helmholtz-like instabilities. It is also found that a sparse canopy is better modelled by a homogeneous drag applied on the mean flow alone, and not the turbulent fluctuations.

  4. Sherlock Holmes, Master Problem Solver.

    Science.gov (United States)

    Ballew, Hunter

    1994-01-01

    Shows the connections between Sherlock Holmes's investigative methods and mathematical problem solving, including observations, characteristics of the problem solver, importance of data, questioning the obvious, learning from experience, learning from errors, and indirect proof. (MKR)

  5. Experiences with linear solvers for oil reservoir simulation problems

    Energy Technology Data Exchange (ETDEWEB)

    Joubert, W.; Janardhan, R. [Los Alamos National Lab., NM (United States); Biswas, D.; Carey, G.

    1996-12-31

    This talk will focus on practical experiences with iterative linear solver algorithms used in conjunction with Amoco Production Company`s Falcon oil reservoir simulation code. The goal of this study is to determine the best linear solver algorithms for these types of problems. The results of numerical experiments will be presented.

  6. Experimental validation of GADRAS's coupled neutron-photon inverse radiation transport solver

    International Nuclear Information System (INIS)

    Mattingly, John K.; Mitchell, Dean James; Harding, Lee T.

    2010-01-01

    Sandia National Laboratories has developed an inverse radiation transport solver that applies nonlinear regression to coupled neutron-photon deterministic transport models. The inverse solver uses nonlinear regression to fit a radiation transport model to gamma spectrometry and neutron multiplicity counting measurements. The subject of this paper is the experimental validation of that solver. This paper describes a series of experiments conducted with a 4.5 kg sphere of α-phase, weapons-grade plutonium. The source was measured bare and reflected by high-density polyethylene (HDPE) spherical shells with total thicknesses between 1.27 and 15.24 cm. Neutron and photon emissions from the source were measured using three instruments: a gross neutron counter, a portable neutron multiplicity counter, and a high-resolution gamma spectrometer. These measurements were used as input to the inverse radiation transport solver to evaluate the solver's ability to correctly infer the configuration of the source from its measured radiation signatures.

  7. RELATIVISTIC MAGNETOHYDRODYNAMICS: RENORMALIZED EIGENVECTORS AND FULL WAVE DECOMPOSITION RIEMANN SOLVER

    International Nuclear Information System (INIS)

    Anton, Luis; MartI, Jose M; Ibanez, Jose M; Aloy, Miguel A.; Mimica, Petar; Miralles, Juan A.

    2010-01-01

    We obtain renormalized sets of right and left eigenvectors of the flux vector Jacobians of the relativistic MHD equations, which are regular and span a complete basis in any physical state including degenerate ones. The renormalization procedure relies on the characterization of the degeneracy types in terms of the normal and tangential components of the magnetic field to the wave front in the fluid rest frame. Proper expressions of the renormalized eigenvectors in conserved variables are obtained through the corresponding matrix transformations. Our work completes previous analysis that present different sets of right eigenvectors for non-degenerate and degenerate states, and can be seen as a relativistic generalization of earlier work performed in classical MHD. Based on the full wave decomposition (FWD) provided by the renormalized set of eigenvectors in conserved variables, we have also developed a linearized (Roe-type) Riemann solver. Extensive testing against one- and two-dimensional standard numerical problems allows us to conclude that our solver is very robust. When compared with a family of simpler solvers that avoid the knowledge of the full characteristic structure of the equations in the computation of the numerical fluxes, our solver turns out to be less diffusive than HLL and HLLC, and comparable in accuracy to the HLLD solver. The amount of operations needed by the FWD solver makes it less efficient computationally than those of the HLL family in one-dimensional problems. However, its relative efficiency increases in multidimensional simulations.

  8. Discriminative sparse coding on multi-manifolds

    KAUST Repository

    Wang, J.J.-Y.; Bensmail, H.; Yao, N.; Gao, Xin

    2013-01-01

    Sparse coding has been popularly used as an effective data representation method in various applications, such as computer vision, medical imaging and bioinformatics. However, the conventional sparse coding algorithms and their manifold-regularized variants (graph sparse coding and Laplacian sparse coding), learn codebooks and codes in an unsupervised manner and neglect class information that is available in the training set. To address this problem, we propose a novel discriminative sparse coding method based on multi-manifolds, that learns discriminative class-conditioned codebooks and sparse codes from both data feature spaces and class labels. First, the entire training set is partitioned into multiple manifolds according to the class labels. Then, we formulate the sparse coding as a manifold-manifold matching problem and learn class-conditioned codebooks and codes to maximize the manifold margins of different classes. Lastly, we present a data sample-manifold matching-based strategy to classify the unlabeled data samples. Experimental results on somatic mutations identification and breast tumor classification based on ultrasonic images demonstrate the efficacy of the proposed data representation and classification approach. 2013 The Authors. All rights reserved.

  9. Discriminative sparse coding on multi-manifolds

    KAUST Repository

    Wang, J.J.-Y.

    2013-09-26

    Sparse coding has been popularly used as an effective data representation method in various applications, such as computer vision, medical imaging and bioinformatics. However, the conventional sparse coding algorithms and their manifold-regularized variants (graph sparse coding and Laplacian sparse coding), learn codebooks and codes in an unsupervised manner and neglect class information that is available in the training set. To address this problem, we propose a novel discriminative sparse coding method based on multi-manifolds, that learns discriminative class-conditioned codebooks and sparse codes from both data feature spaces and class labels. First, the entire training set is partitioned into multiple manifolds according to the class labels. Then, we formulate the sparse coding as a manifold-manifold matching problem and learn class-conditioned codebooks and codes to maximize the manifold margins of different classes. Lastly, we present a data sample-manifold matching-based strategy to classify the unlabeled data samples. Experimental results on somatic mutations identification and breast tumor classification based on ultrasonic images demonstrate the efficacy of the proposed data representation and classification approach. 2013 The Authors. All rights reserved.

  10. Sparse Regression by Projection and Sparse Discriminant Analysis

    KAUST Repository

    Qi, Xin

    2015-04-03

    © 2015, © American Statistical Association, Institute of Mathematical Statistics, and Interface Foundation of North America. Recent years have seen active developments of various penalized regression methods, such as LASSO and elastic net, to analyze high-dimensional data. In these approaches, the direction and length of the regression coefficients are determined simultaneously. Due to the introduction of penalties, the length of the estimates can be far from being optimal for accurate predictions. We introduce a new framework, regression by projection, and its sparse version to analyze high-dimensional data. The unique nature of this framework is that the directions of the regression coefficients are inferred first, and the lengths and the tuning parameters are determined by a cross-validation procedure to achieve the largest prediction accuracy. We provide a theoretical result for simultaneous model selection consistency and parameter estimation consistency of our method in high dimension. This new framework is then generalized such that it can be applied to principal components analysis, partial least squares, and canonical correlation analysis. We also adapt this framework for discriminant analysis. Compared with the existing methods, where there is relatively little control of the dependency among the sparse components, our method can control the relationships among the components. We present efficient algorithms and related theory for solving the sparse regression by projection problem. Based on extensive simulations and real data analysis, we demonstrate that our method achieves good predictive performance and variable selection in the regression setting, and the ability to control relationships between the sparse components leads to more accurate classification. In supplementary materials available online, the details of the algorithms and theoretical proofs, and R codes for all simulation studies are provided.

  11. Sparse distributed memory overview

    Science.gov (United States)

    Raugh, Mike

    1990-01-01

    The Sparse Distributed Memory (SDM) project is investigating the theory and applications of massively parallel computing architecture, called sparse distributed memory, that will support the storage and retrieval of sensory and motor patterns characteristic of autonomous systems. The immediate objectives of the project are centered in studies of the memory itself and in the use of the memory to solve problems in speech, vision, and robotics. Investigation of methods for encoding sensory data is an important part of the research. Examples of NASA missions that may benefit from this work are Space Station, planetary rovers, and solar exploration. Sparse distributed memory offers promising technology for systems that must learn through experience and be capable of adapting to new circumstances, and for operating any large complex system requiring automatic monitoring and control. Sparse distributed memory is a massively parallel architecture motivated by efforts to understand how the human brain works. Sparse distributed memory is an associative memory, able to retrieve information from cues that only partially match patterns stored in the memory. It is able to store long temporal sequences derived from the behavior of a complex system, such as progressive records of the system's sensory data and correlated records of the system's motor controls.

  12. A parallel direct solver for the self-adaptive hp Finite Element Method

    KAUST Repository

    Paszyński, Maciej R.

    2010-03-01

    In this paper we present a new parallel multi-frontal direct solver, dedicated for the hp Finite Element Method (hp-FEM). The self-adaptive hp-FEM generates in a fully automatic mode, a sequence of hp-meshes delivering exponential convergence of the error with respect to the number of degrees of freedom (d.o.f.) as well as the CPU time, by performing a sequence of hp refinements starting from an arbitrary initial mesh. The solver constructs an initial elimination tree for an arbitrary initial mesh, and expands the elimination tree each time the mesh is refined. This allows us to keep track of the order of elimination for the solver. The solver also minimizes the memory usage, by de-allocating partial LU factorizations computed during the elimination stage of the solver, and recomputes them for the backward substitution stage, by utilizing only about 10% of the computational time necessary for the original computations. The solver has been tested on 3D Direct Current (DC) borehole resistivity measurement simulations problems. We measure the execution time and memory usage of the solver over a large regular mesh with 1.5 million degrees of freedom as well as on the highly non-regular mesh, generated by the self-adaptive h p-FEM, with finite elements of various sizes and polynomial orders of approximation varying from p = 1 to p = 9. From the presented experiments it follows that the parallel solver scales well up to the maximum number of utilized processors. The limit for the solver scalability is the maximum sequential part of the algorithm: the computations of the partial LU factorizations over the longest path, coming from the root of the elimination tree down to the deepest leaf. © 2009 Elsevier Inc. All rights reserved.

  13. Implementation of Generalized Adjoint Equation Solver for DeCART

    International Nuclear Information System (INIS)

    Han, Tae Young; Cho, Jin Young; Lee, Hyun Chul; Noh, Jae Man

    2013-01-01

    In this paper, the generalized adjoint solver based on the generalized perturbation theory is implemented on DeCART and the verification calculations were carried out. As the results, the adjoint flux for the general response coincides with the reference solution and it is expected that the solver could produce the parameters for the sensitivity and uncertainty analysis. Recently, MUSAD (Modules of Uncertainty and Sensitivity Analysis for DeCART) was developed for the uncertainty analysis of PMR200 core and the fundamental adjoint solver was implemented into DeCART. However, the application of the code was limited to the uncertainty to the multiplication factor, k eff , because it was based on the classical perturbation theory. For the uncertainty analysis to the general response as like the power density, it is necessary to develop the analysis module based on the generalized perturbation theory and it needs the generalized adjoint solutions from DeCART. In this paper, the generalized adjoint solver is implemented on DeCART and the calculation results are compared with the results by TSUNAMI of SCALE 6.1

  14. In-place sparse suffix sorting

    DEFF Research Database (Denmark)

    Prezza, Nicola

    2018-01-01

    information regarding the lexicographical order of a size-b subset of all n text suffixes is often needed. Such information can be stored space-efficiently (in b words) in the sparse suffix array (SSA). The SSA and its relative sparse LCP array (SLCP) can be used as a space-efficient substitute of the sparse...... suffix tree. Very recently, Gawrychowski and Kociumaka [11] showed that the sparse suffix tree (and therefore SSA and SLCP) can be built in asymptotically optimal O(b) space with a Monte Carlo algorithm running in O(n) time. The main reason for using the SSA and SLCP arrays in place of the sparse suffix...... tree is, however, their reduced space of b words each. This leads naturally to the quest for in-place algorithms building these arrays. Franceschini and Muthukrishnan [8] showed that the full suffix array can be built in-place and in optimal running time. On the other hand, finding sub-quadratic in...

  15. s-Step Krylov Subspace Methods as Bottom Solvers for Geometric Multigrid

    Energy Technology Data Exchange (ETDEWEB)

    Williams, Samuel [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Lijewski, Mike [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Almgren, Ann [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Straalen, Brian Van [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Carson, Erin [Univ. of California, Berkeley, CA (United States); Knight, Nicholas [Univ. of California, Berkeley, CA (United States); Demmel, James [Univ. of California, Berkeley, CA (United States)

    2014-08-14

    Geometric multigrid solvers within adaptive mesh refinement (AMR) applications often reach a point where further coarsening of the grid becomes impractical as individual sub domain sizes approach unity. At this point the most common solution is to use a bottom solver, such as BiCGStab, to reduce the residual by a fixed factor at the coarsest level. Each iteration of BiCGStab requires multiple global reductions (MPI collectives). As the number of BiCGStab iterations required for convergence grows with problem size, and the time for each collective operation increases with machine scale, bottom solves in large-scale applications can constitute a significant fraction of the overall multigrid solve time. In this paper, we implement, evaluate, and optimize a communication-avoiding s-step formulation of BiCGStab (CABiCGStab for short) as a high-performance, distributed-memory bottom solver for geometric multigrid solvers. This is the first time s-step Krylov subspace methods have been leveraged to improve multigrid bottom solver performance. We use a synthetic benchmark for detailed analysis and integrate the best implementation into BoxLib in order to evaluate the benefit of a s-step Krylov subspace method on the multigrid solves found in the applications LMC and Nyx on up to 32,768 cores on the Cray XE6 at NERSC. Overall, we see bottom solver improvements of up to 4.2x on synthetic problems and up to 2.7x in real applications. This results in as much as a 1.5x improvement in solver performance in real applications.

  16. Classification of multispectral or hyperspectral satellite imagery using clustering of sparse approximations on sparse representations in learned dictionaries obtained using efficient convolutional sparse coding

    Science.gov (United States)

    Moody, Daniela; Wohlberg, Brendt

    2018-01-02

    An approach for land cover classification, seasonal and yearly change detection and monitoring, and identification of changes in man-made features may use a clustering of sparse approximations (CoSA) on sparse representations in learned dictionaries. The learned dictionaries may be derived using efficient convolutional sparse coding to build multispectral or hyperspectral, multiresolution dictionaries that are adapted to regional satellite image data. Sparse image representations of images over the learned dictionaries may be used to perform unsupervised k-means clustering into land cover categories. The clustering process behaves as a classifier in detecting real variability. This approach may combine spectral and spatial textural characteristics to detect geologic, vegetative, hydrologic, and man-made features, as well as changes in these features over time.

  17. Advanced field-solver techniques for RC extraction of integrated circuits

    CERN Document Server

    Yu, Wenjian

    2014-01-01

    Resistance and capacitance (RC) extraction is an essential step in modeling the interconnection wires and substrate coupling effect in nanometer-technology integrated circuits (IC). The field-solver techniques for RC extraction guarantee the accuracy of modeling, and are becoming increasingly important in meeting the demand for accurate modeling and simulation of VLSI designs. Advanced Field-Solver Techniques for RC Extraction of Integrated Circuits presents a systematic introduction to, and treatment of, the key field-solver methods for RC extraction of VLSI interconnects and substrate coupling in mixed-signal ICs. Various field-solver techniques are explained in detail, with real-world examples to illustrate the advantages and disadvantages of each algorithm. This book will benefit graduate students and researchers in the field of electrical and computer engineering, as well as engineers working in the IC design and design automation industries. Dr. Wenjian Yu is an Associate Professor at the Department of ...

  18. On the implicit density based OpenFOAM solver for turbulent compressible flows

    Science.gov (United States)

    Fürst, Jiří

    The contribution deals with the development of coupled implicit density based solver for compressible flows in the framework of open source package OpenFOAM. However the standard distribution of OpenFOAM contains several ready-made segregated solvers for compressible flows, the performance of those solvers is rather week in the case of transonic flows. Therefore we extend the work of Shen [15] and we develop an implicit semi-coupled solver. The main flow field variables are updated using lower-upper symmetric Gauss-Seidel method (LU-SGS) whereas the turbulence model variables are updated using implicit Euler method.

  19. On Cafesat: A Modern SAT Solver for Scala

    OpenAIRE

    Blanc, Régis William

    2013-01-01

    We present CafeSat, a SAT solver written in the Scala programming language. CafeSat is a modern solver based on DPLL and featuring many state-of-the-art techniques and heuristics. It uses two-watched literals for Boolean constraint propagation, conflict-driven learning along with clause deletion, a restarting strategy, and the VSIDS heuristics for choosing the branching literal. CafeSat is both sound and complete. In order to achieve reasonnable performances, low level and hand-tuned data ...

  20. Discrete Sparse Coding.

    Science.gov (United States)

    Exarchakis, Georgios; Lücke, Jörg

    2017-11-01

    Sparse coding algorithms with continuous latent variables have been the subject of a large number of studies. However, discrete latent spaces for sparse coding have been largely ignored. In this work, we study sparse coding with latents described by discrete instead of continuous prior distributions. We consider the general case in which the latents (while being sparse) can take on any value of a finite set of possible values and in which we learn the prior probability of any value from data. This approach can be applied to any data generated by discrete causes, and it can be applied as an approximation of continuous causes. As the prior probabilities are learned, the approach then allows for estimating the prior shape without assuming specific functional forms. To efficiently train the parameters of our probabilistic generative model, we apply a truncated expectation-maximization approach (expectation truncation) that we modify to work with a general discrete prior. We evaluate the performance of the algorithm by applying it to a variety of tasks: (1) we use artificial data to verify that the algorithm can recover the generating parameters from a random initialization, (2) use image patches of natural images and discuss the role of the prior for the extraction of image components, (3) use extracellular recordings of neurons to present a novel method of analysis for spiking neurons that includes an intuitive discretization strategy, and (4) apply the algorithm on the task of encoding audio waveforms of human speech. The diverse set of numerical experiments presented in this letter suggests that discrete sparse coding algorithms can scale efficiently to work with realistic data sets and provide novel statistical quantities to describe the structure of the data.

  1. Development of RBDGG Solver and Its Application to System Reliability Analysis

    International Nuclear Information System (INIS)

    Kim, Man Cheol

    2010-01-01

    For the purpose of making system reliability analysis easier and more intuitive, RBDGG (Reliability Block diagram with General Gates) methodology was introduced as an extension of the conventional reliability block diagram. The advantage of the RBDGG methodology is that the structure of a RBDGG model is very similar to the actual structure of the analyzed system, and therefore the modeling of a system for system reliability and unavailability analysis becomes very intuitive and easy. The main idea of the development of the RBDGG methodology is similar with that of the development of the RGGG (Reliability Graph with General Gates) methodology, which is an extension of a conventional reliability graph. The newly proposed methodology is now implemented into a software tool, RBDGG Solver. RBDGG Solver was developed as a WIN32 console application. RBDGG Solver receives information on the failure modes and failure probabilities of each component in the system, along with the connection structure and connection logics among the components in the system. Based on the received information, RBDGG Solver automatically generates a system reliability analysis model for the system, and then provides the analysis results. In this paper, application of RBDGG Solver to the reliability analysis of an example system, and verification of the calculation results are provided for the purpose of demonstrating how RBDGG Solver is used for system reliability analysis

  2. Solving Sparse Polynomial Optimization Problems with Chordal Structure Using the Sparse, Bounded-Degree Sum-of-Squares Hierarchy

    NARCIS (Netherlands)

    Marandi, Ahmadreza; de Klerk, Etienne; Dahl, Joachim

    The sparse bounded degree sum-of-squares (sparse-BSOS) hierarchy of Weisser, Lasserre and Toh [arXiv:1607.01151,2016] constructs a sequence of lower bounds for a sparse polynomial optimization problem. Under some assumptions, it is proven by the authors that the sequence converges to the optimal

  3. IGA-ADS: Isogeometric analysis FEM using ADS solver

    Science.gov (United States)

    Łoś, Marcin M.; Woźniak, Maciej; Paszyński, Maciej; Lenharth, Andrew; Hassaan, Muhamm Amber; Pingali, Keshav

    2017-08-01

    In this paper we present a fast explicit solver for solution of non-stationary problems using L2 projections with isogeometric finite element method. The solver has been implemented within GALOIS framework. It enables parallel multi-core simulations of different time-dependent problems, in 1D, 2D, or 3D. We have prepared the solver framework in a way that enables direct implementation of the selected PDE and corresponding boundary conditions. In this paper we describe the installation, implementation of exemplary three PDEs, and execution of the simulations on multi-core Linux cluster nodes. We consider three case studies, including heat transfer, linear elasticity, as well as non-linear flow in heterogeneous media. The presented package generates output suitable for interfacing with Gnuplot and ParaView visualization software. The exemplary simulations show near perfect scalability on Gilbert shared-memory node with four Intel® Xeon® CPU E7-4860 processors, each possessing 10 physical cores (for a total of 40 cores).

  4. An efficient spectral crystal plasticity solver for GPU architectures

    Science.gov (United States)

    Malahe, Michael

    2018-03-01

    We present a spectral crystal plasticity (CP) solver for graphics processing unit (GPU) architectures that achieves a tenfold increase in efficiency over prior GPU solvers. The approach makes use of a database containing a spectral decomposition of CP simulations performed using a conventional iterative solver over a parameter space of crystal orientations and applied velocity gradients. The key improvements in efficiency come from reducing global memory transactions, exposing more instruction-level parallelism, reducing integer instructions and performing fast range reductions on trigonometric arguments. The scheme also makes more efficient use of memory than prior work, allowing for larger problems to be solved on a single GPU. We illustrate these improvements with a simulation of 390 million crystal grains on a consumer-grade GPU, which executes at a rate of 2.72 s per strain step.

  5. Optimización con Solver

    Directory of Open Access Journals (Sweden)

    Sánchez Álvarez , I.

    1998-01-01

    Full Text Available La relevancia de los problemas de optimización en el mundo empresarial ha generado la introducción de herramientas de optimización cada vez más sofisticadas en las últimas versiones de las hojas de cálculo de utilización generalizada. Estas utilidades, conocidas habitualmente como «solvers», constituyen una alternativa a los programas especializados de optimización cuando no se trata de problemas de gran escala, presentado la ventaja de su facilidad de uso y de comunicación con el usuario final. Frontline Systems Inc es la empresa que desarrolla el «solver» de Excel, si bien existen asimismo versiones para Lotus y Quattro Pro con ligeras diferencias de uso. En su dirección de internet (www.frontsys.com se puede obtener información técnica sobre las diferentes versiones de dicha utilidad y diversos aspectos operativos del programa, algunos de los cuales se comentan en este trabajo.

  6. Bayesian Inference Methods for Sparse Channel Estimation

    DEFF Research Database (Denmark)

    Pedersen, Niels Lovmand

    2013-01-01

    This thesis deals with sparse Bayesian learning (SBL) with application to radio channel estimation. As opposed to the classical approach for sparse signal representation, we focus on the problem of inferring complex signals. Our investigations within SBL constitute the basis for the development...... of Bayesian inference algorithms for sparse channel estimation. Sparse inference methods aim at finding the sparse representation of a signal given in some overcomplete dictionary of basis vectors. Within this context, one of our main contributions to the field of SBL is a hierarchical representation...... analysis of the complex prior representation, where we show that the ability to induce sparse estimates of a given prior heavily depends on the inference method used and, interestingly, whether real or complex variables are inferred. We also show that the Bayesian estimators derived from the proposed...

  7. When sparse coding meets ranking: a joint framework for learning sparse codes and ranking scores

    KAUST Repository

    Wang, Jim Jing-Yan; Cui, Xuefeng; Yu, Ge; Guo, Lili; Gao, Xin

    2017-01-01

    Sparse coding, which represents a data point as a sparse reconstruction code with regard to a dictionary, has been a popular data representation method. Meanwhile, in database retrieval problems, learning the ranking scores from data points plays

  8. Implementation and testing of a multivariate inverse radiation transport solver

    International Nuclear Information System (INIS)

    Mattingly, John; Mitchell, Dean J.

    2012-01-01

    Detection, identification, and characterization of special nuclear materials (SNM) all face the same basic challenge: to varying degrees, each must infer the presence, composition, and configuration of the SNM by analyzing a set of measured radiation signatures. Solutions to this problem implement inverse radiation transport methods. Given a set of measured radiation signatures, inverse radiation transport estimates properties of the source terms and transport media that are consistent with those signatures. This paper describes one implementation of a multivariate inverse radiation transport solver. The solver simultaneously analyzes gamma spectrometry and neutron multiplicity measurements to fit a one-dimensional radiation transport model with variable layer thicknesses using nonlinear regression. The solver's essential components are described, and its performance is illustrated by application to benchmark experiments conducted with plutonium metal. - Highlights: ► Inverse problems, specifically applied to identifying and characterizing radiation sources . ► Radiation transport. ► Analysis of gamma spectroscopy and neutron multiplicity counting measurements. ► Experimental testing of the inverse solver against measurements of plutonium.

  9. Improved Sparse Channel Estimation for Cooperative Communication Systems

    Directory of Open Access Journals (Sweden)

    Guan Gui

    2012-01-01

    Full Text Available Accurate channel state information (CSI is necessary at receiver for coherent detection in amplify-and-forward (AF cooperative communication systems. To estimate the channel, traditional methods, that is, least squares (LS and least absolute shrinkage and selection operator (LASSO, are based on assumptions of either dense channel or global sparse channel. However, LS-based linear method neglects the inherent sparse structure information while LASSO-based sparse channel method cannot take full advantage of the prior information. Based on the partial sparse assumption of the cooperative channel model, we propose an improved channel estimation method with partial sparse constraint. At first, by using sparse decomposition theory, channel estimation is formulated as a compressive sensing problem. Secondly, the cooperative channel is reconstructed by LASSO with partial sparse constraint. Finally, numerical simulations are carried out to confirm the superiority of proposed methods over global sparse channel estimation methods.

  10. A High Performance QDWH-SVD Solver using Hardware Accelerators

    KAUST Repository

    Sukkari, Dalal E.

    2015-04-08

    This paper describes a new high performance implementation of the QR-based Dynamically Weighted Halley Singular Value Decomposition (QDWH-SVD) solver on multicore architecture enhanced with multiple GPUs. The standard QDWH-SVD algorithm was introduced by Nakatsukasa and Higham (SIAM SISC, 2013) and combines three successive computational stages: (1) the polar decomposition calculation of the original matrix using the QDWH algorithm, (2) the symmetric eigendecomposition of the resulting polar factor to obtain the singular values and the right singular vectors and (3) the matrix-matrix multiplication to get the associated left singular vectors. A comprehensive test suite highlights the numerical robustness of the QDWH-SVD solver. Although it performs up to two times more flops when computing all singular vectors compared to the standard SVD solver algorithm, our new high performance implementation on single GPU results in up to 3.8x improvements for asymptotic matrix sizes, compared to the equivalent routines from existing state-of-the-art open-source and commercial libraries. However, when only singular values are needed, QDWH-SVD is penalized by performing up to 14 times more flops. The singular value only implementation of QDWH-SVD on single GPU can still run up to 18% faster than the best existing equivalent routines. Integrating mixed precision techniques in the solver can additionally provide up to 40% improvement at the price of losing few digits of accuracy, compared to the full double precision floating point arithmetic. We further leverage the single GPU QDWH-SVD implementation by introducing the first multi-GPU SVD solver to study the scalability of the QDWH-SVD framework.

  11. T2CG1, a package of preconditioned conjugate gradient solvers for TOUGH2

    International Nuclear Information System (INIS)

    Moridis, G.; Pruess, K.; Antunez, E.

    1994-03-01

    Most of the computational work in the numerical simulation of fluid and heat flows in permeable media arises in the solution of large systems of linear equations. The simplest technique for solving such equations is by direct methods. However, because of large storage requirements and accumulation of roundoff errors, the application of direct solution techniques is limited, depending on matrix bandwidth, to systems of a few hundred to at most a few thousand simultaneous equations. T2CG1, a package of preconditioned conjugate gradient solvers, has been added to TOUGH2 to complement its direct solver and significantly increase the size of problems tractable on PCs. T2CG1 includes three different solvers: a Bi-Conjugate Gradient (BCG) solver, a Bi-Conjugate Gradient Squared (BCGS) solver, and a Generalized Minimum Residual (GMRES) solver. Results from six test problems with up to 30,000 equations show that T2CG1 (1) is significantly (and invariably) faster and requires far less memory than the MA28 direct solver, (2) it makes possible the solution of very large three-dimensional problems on PCs, and (3) that the BCGS solver is the fastest of the three in the tested problems. Sample problems are presented related to heat and fluid flow at Yucca Mountain and WIPP, environmental remediation by the Thermal Enhanced Vapor Extraction System, and geothermal resources

  12. VDJSeq-Solver: in silico V(DJ recombination detection tool.

    Directory of Open Access Journals (Sweden)

    Giulia Paciello

    Full Text Available In this paper we present VDJSeq-Solver, a methodology and tool to identify clonal lymphocyte populations from paired-end RNA Sequencing reads derived from the sequencing of mRNA neoplastic cells. The tool detects the main clone that characterises the tissue of interest by recognizing the most abundant V(DJ rearrangement among the existing ones in the sample under study. The exact sequence of the clone identified is capable of accounting for the modifications introduced by the enzymatic processes. The proposed tool overcomes limitations of currently available lymphocyte rearrangements recognition methods, working on a single sequence at a time, that are not applicable to high-throughput sequencing data. In this work, VDJSeq-Solver has been applied to correctly detect the main clone and identify its sequence on five Mantle Cell Lymphoma samples; then the tool has been tested on twelve Diffuse Large B-Cell Lymphoma samples. In order to comply with the privacy, ethics and intellectual property policies of the University Hospital and the University of Verona, data is available upon request to supporto.utenti@ateneo.univr.it after signing a mandatory Materials Transfer Agreement. VDJSeq-Solver JAVA/Perl/Bash software implementation is free and available at http://eda.polito.it/VDJSeq-Solver/.

  13. Sparse Image Reconstruction in Computed Tomography

    DEFF Research Database (Denmark)

    Jørgensen, Jakob Sauer

    In recent years, increased focus on the potentially harmful effects of x-ray computed tomography (CT) scans, such as radiation-induced cancer, has motivated research on new low-dose imaging techniques. Sparse image reconstruction methods, as studied for instance in the field of compressed sensing...... applications. This thesis takes a systematic approach toward establishing quantitative understanding of conditions for sparse reconstruction to work well in CT. A general framework for analyzing sparse reconstruction methods in CT is introduced and two sets of computational tools are proposed: 1...... contributions to a general set of computational characterization tools. Thus, the thesis contributions help advance sparse reconstruction methods toward routine use in...

  14. Sparse Regression by Projection and Sparse Discriminant Analysis

    KAUST Repository

    Qi, Xin; Luo, Ruiyan; Carroll, Raymond J.; Zhao, Hongyu

    2015-01-01

    predictions. We introduce a new framework, regression by projection, and its sparse version to analyze high-dimensional data. The unique nature of this framework is that the directions of the regression coefficients are inferred first, and the lengths

  15. Hybrid direct and iterative solvers for h refined grids with singularities

    KAUST Repository

    Paszyński, Maciej R.

    2015-04-27

    This paper describes a hybrid direct and iterative solver for two and three dimensional h adaptive grids with point singularities. The point singularities are eliminated by using a sequential linear computational cost solver O(N) on CPU [1]. The remaining Schur complements are submitted to incomplete LU preconditioned conjugated gradient (ILUPCG) iterative solver. The approach is compared to the standard algorithm performing static condensation over the entire mesh and executing the ILUPCG algorithm on top of it. The hybrid solver is applied for two or three dimensional grids automatically h refined towards point or edge singularities. The automatic refinement is based on the relative error estimations between the coarse and fine mesh solutions [2], and the optimal refinements are selected using the projection based interpolation. The computational mesh is partitioned into sub-meshes with local point and edge singularities separated. This is done by using the following greedy algorithm.

  16. Simplified Linear Equation Solvers users manual

    Energy Technology Data Exchange (ETDEWEB)

    Gropp, W. [Argonne National Lab., IL (United States); Smith, B. [California Univ., Los Angeles, CA (United States)

    1993-02-01

    The solution of large sparse systems of linear equations is at the heart of many algorithms in scientific computing. The SLES package is a set of easy-to-use yet powerful and extensible routines for solving large sparse linear systems. The design of the package allows new techniques to be used in existing applications without any source code changes in the applications.

  17. Sparse decompositions in 'incoherent' dictionaries

    DEFF Research Database (Denmark)

    Gribonval, R.; Nielsen, Morten

    2003-01-01

    a unique sparse representation in such a dictionary. In particular, it is proved that the result of Donoho and Huo, concerning the replacement of a combinatorial optimization problem with a linear programming problem when searching for sparse representations, has an analog for dictionaries that may...

  18. A CFD Heterogeneous Parallel Solver Based on Collaborating CPU and GPU

    Science.gov (United States)

    Lai, Jianqi; Tian, Zhengyu; Li, Hua; Pan, Sha

    2018-03-01

    Since Graphic Processing Unit (GPU) has a strong ability of floating-point computation and memory bandwidth for data parallelism, it has been widely used in the areas of common computing such as molecular dynamics (MD), computational fluid dynamics (CFD) and so on. The emergence of compute unified device architecture (CUDA), which reduces the complexity of compiling program, brings the great opportunities to CFD. There are three different modes for parallel solution of NS equations: parallel solver based on CPU, parallel solver based on GPU and heterogeneous parallel solver based on collaborating CPU and GPU. As we can see, GPUs are relatively rich in compute capacity but poor in memory capacity and the CPUs do the opposite. We need to make full use of the GPUs and CPUs, so a CFD heterogeneous parallel solver based on collaborating CPU and GPU has been established. Three cases are presented to analyse the solver’s computational accuracy and heterogeneous parallel efficiency. The numerical results agree well with experiment results, which demonstrate that the heterogeneous parallel solver has high computational precision. The speedup on a single GPU is more than 40 for laminar flow, it decreases for turbulent flow, but it still can reach more than 20. What’s more, the speedup increases as the grid size becomes larger.

  19. A fast Linear Complementarity Problem (LCP) solver for separating fluid-solid wall boundary Conditions

    DEFF Research Database (Denmark)

    Andersen, Michael; Abel, Sarah Maria Niebe; Erleben, Kenny

    2017-01-01

    We address the task of computing solutions for a separating fluid-solid wall boundary condition model. We present an embarrassingly parallel, easy to implement, fluid LCP solver.We are able to use greater domain sizes than previous works have shown, due to our new solver. The solver exploits matr...

  20. Aleph Field Solver Challenge Problem Results Summary

    Energy Technology Data Exchange (ETDEWEB)

    Hooper, Russell [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Moore, Stan Gerald [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)

    2015-01-01

    Aleph models continuum electrostatic and steady and transient thermal fields using a finite-element method. Much work has gone into expanding the core solver capability to support enriched modeling consisting of multiple interacting fields, special boundary conditions and two-way interfacial coupling with particles modeled using Aleph's complementary particle-in-cell capability. This report provides quantitative evidence for correct implementation of Aleph's field solver via order- of-convergence assessments on a collection of problems of increasing complexity. It is intended to provide Aleph with a pedigree and to establish a basis for confidence in results for more challenging problems important to Sandia's mission that Aleph was specifically designed to address.

  1. NITSOL: A Newton iterative solver for nonlinear systems

    Energy Technology Data Exchange (ETDEWEB)

    Pernice, M. [Univ. of Utah, Salt Lake City, UT (United States); Walker, H.F. [Utah State Univ., Logan, UT (United States)

    1996-12-31

    Newton iterative methods, also known as truncated Newton methods, are implementations of Newton`s method in which the linear systems that characterize Newton steps are solved approximately using iterative linear algebra methods. Here, we outline a well-developed Newton iterative algorithm together with a Fortran implementation called NITSOL. The basic algorithm is an inexact Newton method globalized by backtracking, in which each initial trial step is determined by applying an iterative linear solver until an inexact Newton criterion is satisfied. In the implementation, the user can specify inexact Newton criteria in several ways and select an iterative linear solver from among several popular {open_quotes}transpose-free{close_quotes} Krylov subspace methods. Jacobian-vector products used by the Krylov solver can be either evaluated analytically with a user-supplied routine or approximated using finite differences of function values. A flexible interface permits a wide variety of preconditioning strategies and allows the user to define a preconditioner and optionally update it periodically. We give details of these and other features and demonstrate the performance of the implementation on a representative set of test problems.

  2. Data analysis in high-dimensional sparse spaces

    DEFF Research Database (Denmark)

    Clemmensen, Line Katrine Harder

    classification techniques for high-dimensional problems are presented: Sparse discriminant analysis, sparse mixture discriminant analysis and orthogonality constrained support vector machines. The first two introduces sparseness to the well known linear and mixture discriminant analysis and thereby provide low...... are applied to classifications of fish species, ear canal impressions used in the hearing aid industry, microbiological fungi species, and various cancerous tissues and healthy tissues. In addition, novel applications of sparse regressions (also called the elastic net) to the medical, concrete, and food...

  3. Towards Green Multi-frontal Solver for Adaptive Finite Element Method

    KAUST Repository

    AbbouEisha, H.

    2015-06-01

    In this paper we present the optimization of the energy consumption for the multi-frontal solver algorithm executed over two dimensional grids with point singularities. The multi-frontal solver algorithm is controlled by so-called elimination tree, defining the order of elimination of rows from particular frontal matrices, as well as order of memory transfers for Schur complement matrices. For a given mesh there are many possible elimination trees resulting in different number of floating point operations (FLOPs) of the solver or different amount of data trans- ferred via memory transfers. In this paper we utilize the dynamic programming optimization procedure and we compare elimination trees optimized with respect to FLOPs with elimination trees optimized with respect to energy consumption.

  4. Towards Green Multi-frontal Solver for Adaptive Finite Element Method

    KAUST Repository

    AbbouEisha, H.; Moshkov, Mikhail; Jopek, K.; Gepner, P.; Kitowski, J.; Paszyn'ski, M.

    2015-01-01

    In this paper we present the optimization of the energy consumption for the multi-frontal solver algorithm executed over two dimensional grids with point singularities. The multi-frontal solver algorithm is controlled by so-called elimination tree, defining the order of elimination of rows from particular frontal matrices, as well as order of memory transfers for Schur complement matrices. For a given mesh there are many possible elimination trees resulting in different number of floating point operations (FLOPs) of the solver or different amount of data trans- ferred via memory transfers. In this paper we utilize the dynamic programming optimization procedure and we compare elimination trees optimized with respect to FLOPs with elimination trees optimized with respect to energy consumption.

  5. An immersed interface vortex particle-mesh solver

    Science.gov (United States)

    Marichal, Yves; Chatelain, Philippe; Winckelmans, Gregoire

    2014-11-01

    An immersed interface-enabled vortex particle-mesh (VPM) solver is presented for the simulation of 2-D incompressible viscous flows, in the framework of external aerodynamics. Considering the simulation of free vortical flows, such as wakes and jets, vortex particle-mesh methods already provide a valuable alternative to standard CFD methods, thanks to the interesting numerical properties arising from its Lagrangian nature. Yet, accounting for solid bodies remains challenging, despite the extensive research efforts that have been made for several decades. The present immersed interface approach aims at improving the consistency and the accuracy of one very common technique (based on Lighthill's model) for the enforcement of the no-slip condition at the wall in vortex methods. Targeting a sharp treatment of the wall calls for substantial modifications at all computational levels of the VPM solver. More specifically, the solution of the underlying Poisson equation, the computation of the diffusion term and the particle-mesh interpolation are adapted accordingly and the spatial accuracy is assessed. The immersed interface VPM solver is subsequently validated on the simulation of some challenging impulsively started flows, such as the flow past a cylinder and that past an airfoil. Research Fellow (PhD student) of the F.R.S.-FNRS of Belgium.

  6. Direct solvers performance on h-adapted grids

    KAUST Repository

    Paszynski, Maciej; Pardo, David; Calo, Victor M.

    2015-01-01

    We analyse the performance of direct solvers when applied to a system of linear equations arising from an hh-adapted, C0C0 finite element space. Theoretical estimates are derived for typical hh-refinement patterns arising as a result of a point, edge, or face singularity as well as boundary layers. They are based on the elimination trees constructed specifically for the considered grids. Theoretical estimates are compared with experiments performed with MUMPS using the nested-dissection algorithm for construction of the elimination tree from METIS library. The numerical experiments provide the same performance for the cases where our trees are identical with those constructed by the nested-dissection algorithm, and worse performance for some cases where our trees are different. We also present numerical experiments for the cases with mixed singularities, where how to construct optimal elimination trees is unknown. In all analysed cases, the use of hh-adaptive grids significantly reduces the cost of the direct solver algorithm per unknown as compared to uniform grids. The theoretical estimates predict and the experimental data confirm that the computational complexity is linear for various refinement patterns. In most cases, the cost of the direct solver per unknown is lower when employing anisotropic refinements as opposed to isotropic ones.

  7. Direct solvers performance on h-adapted grids

    KAUST Repository

    Paszynski, Maciej

    2015-05-27

    We analyse the performance of direct solvers when applied to a system of linear equations arising from an hh-adapted, C0C0 finite element space. Theoretical estimates are derived for typical hh-refinement patterns arising as a result of a point, edge, or face singularity as well as boundary layers. They are based on the elimination trees constructed specifically for the considered grids. Theoretical estimates are compared with experiments performed with MUMPS using the nested-dissection algorithm for construction of the elimination tree from METIS library. The numerical experiments provide the same performance for the cases where our trees are identical with those constructed by the nested-dissection algorithm, and worse performance for some cases where our trees are different. We also present numerical experiments for the cases with mixed singularities, where how to construct optimal elimination trees is unknown. In all analysed cases, the use of hh-adaptive grids significantly reduces the cost of the direct solver algorithm per unknown as compared to uniform grids. The theoretical estimates predict and the experimental data confirm that the computational complexity is linear for various refinement patterns. In most cases, the cost of the direct solver per unknown is lower when employing anisotropic refinements as opposed to isotropic ones.

  8. Analysis of transient plasmonic interactions using an MOT-PMCHWT integral equation solver

    KAUST Repository

    Uysal, Ismail Enes; Ulku, Huseyin Arda; Bagci, Hakan

    2014-01-01

    that discretize only on the interfaces. Additionally, IE solvers implicitly enforce the radiation condition and consequently do not need (approximate) absorbing boundary conditions. Despite these advantages, IE solvers, especially in time domain, have not been

  9. Supervised Transfer Sparse Coding

    KAUST Repository

    Al-Shedivat, Maruan

    2014-07-27

    A combination of the sparse coding and transfer learn- ing techniques was shown to be accurate and robust in classification tasks where training and testing objects have a shared feature space but are sampled from differ- ent underlying distributions, i.e., belong to different do- mains. The key assumption in such case is that in spite of the domain disparity, samples from different domains share some common hidden factors. Previous methods often assumed that all the objects in the target domain are unlabeled, and thus the training set solely comprised objects from the source domain. However, in real world applications, the target domain often has some labeled objects, or one can always manually label a small num- ber of them. In this paper, we explore such possibil- ity and show how a small number of labeled data in the target domain can significantly leverage classifica- tion accuracy of the state-of-the-art transfer sparse cod- ing methods. We further propose a unified framework named supervised transfer sparse coding (STSC) which simultaneously optimizes sparse representation, domain transfer and classification. Experimental results on three applications demonstrate that a little manual labeling and then learning the model in a supervised fashion can significantly improve classification accuracy.

  10. Joint Group Sparse PCA for Compressed Hyperspectral Imaging.

    Science.gov (United States)

    Khan, Zohaib; Shafait, Faisal; Mian, Ajmal

    2015-12-01

    A sparse principal component analysis (PCA) seeks a sparse linear combination of input features (variables), so that the derived features still explain most of the variations in the data. A group sparse PCA introduces structural constraints on the features in seeking such a linear combination. Collectively, the derived principal components may still require measuring all the input features. We present a joint group sparse PCA (JGSPCA) algorithm, which forces the basic coefficients corresponding to a group of features to be jointly sparse. Joint sparsity ensures that the complete basis involves only a sparse set of input features, whereas the group sparsity ensures that the structural integrity of the features is maximally preserved. We evaluate the JGSPCA algorithm on the problems of compressed hyperspectral imaging and face recognition. Compressed sensing results show that the proposed method consistently outperforms sparse PCA and group sparse PCA in reconstructing the hyperspectral scenes of natural and man-made objects. The efficacy of the proposed compressed sensing method is further demonstrated in band selection for face recognition.

  11. A generalized Poisson and Poisson-Boltzmann solver for electrostatic environments

    International Nuclear Information System (INIS)

    Fisicaro, G.; Goedecker, S.; Genovese, L.; Andreussi, O.; Marzari, N.

    2016-01-01

    The computational study of chemical reactions in complex, wet environments is critical for applications in many fields. It is often essential to study chemical reactions in the presence of applied electrochemical potentials, taking into account the non-trivial electrostatic screening coming from the solvent and the electrolytes. As a consequence, the electrostatic potential has to be found by solving the generalized Poisson and the Poisson-Boltzmann equations for neutral and ionic solutions, respectively. In the present work, solvers for both problems have been developed. A preconditioned conjugate gradient method has been implemented for the solution of the generalized Poisson equation and the linear regime of the Poisson-Boltzmann, allowing to solve iteratively the minimization problem with some ten iterations of the ordinary Poisson equation solver. In addition, a self-consistent procedure enables us to solve the non-linear Poisson-Boltzmann problem. Both solvers exhibit very high accuracy and parallel efficiency and allow for the treatment of periodic, free, and slab boundary conditions. The solver has been integrated into the BigDFT and Quantum-ESPRESSO electronic-structure packages and will be released as an independent program, suitable for integration in other codes

  12. A generalized Poisson and Poisson-Boltzmann solver for electrostatic environments.

    Science.gov (United States)

    Fisicaro, G; Genovese, L; Andreussi, O; Marzari, N; Goedecker, S

    2016-01-07

    The computational study of chemical reactions in complex, wet environments is critical for applications in many fields. It is often essential to study chemical reactions in the presence of applied electrochemical potentials, taking into account the non-trivial electrostatic screening coming from the solvent and the electrolytes. As a consequence, the electrostatic potential has to be found by solving the generalized Poisson and the Poisson-Boltzmann equations for neutral and ionic solutions, respectively. In the present work, solvers for both problems have been developed. A preconditioned conjugate gradient method has been implemented for the solution of the generalized Poisson equation and the linear regime of the Poisson-Boltzmann, allowing to solve iteratively the minimization problem with some ten iterations of the ordinary Poisson equation solver. In addition, a self-consistent procedure enables us to solve the non-linear Poisson-Boltzmann problem. Both solvers exhibit very high accuracy and parallel efficiency and allow for the treatment of periodic, free, and slab boundary conditions. The solver has been integrated into the BigDFT and Quantum-ESPRESSO electronic-structure packages and will be released as an independent program, suitable for integration in other codes.

  13. A generalized Poisson and Poisson-Boltzmann solver for electrostatic environments

    Energy Technology Data Exchange (ETDEWEB)

    Fisicaro, G., E-mail: giuseppe.fisicaro@unibas.ch; Goedecker, S. [Department of Physics, University of Basel, Klingelbergstrasse 82, 4056 Basel (Switzerland); Genovese, L. [University of Grenoble Alpes, CEA, INAC-SP2M, L-Sim, F-38000 Grenoble (France); Andreussi, O. [Institute of Computational Science, Università della Svizzera Italiana, Via Giuseppe Buffi 13, CH-6904 Lugano (Switzerland); Theory and Simulations of Materials (THEOS) and National Centre for Computational Design and Discovery of Novel Materials (MARVEL), École Polytechnique Fédérale de Lausanne, Station 12, CH-1015 Lausanne (Switzerland); Marzari, N. [Theory and Simulations of Materials (THEOS) and National Centre for Computational Design and Discovery of Novel Materials (MARVEL), École Polytechnique Fédérale de Lausanne, Station 12, CH-1015 Lausanne (Switzerland)

    2016-01-07

    The computational study of chemical reactions in complex, wet environments is critical for applications in many fields. It is often essential to study chemical reactions in the presence of applied electrochemical potentials, taking into account the non-trivial electrostatic screening coming from the solvent and the electrolytes. As a consequence, the electrostatic potential has to be found by solving the generalized Poisson and the Poisson-Boltzmann equations for neutral and ionic solutions, respectively. In the present work, solvers for both problems have been developed. A preconditioned conjugate gradient method has been implemented for the solution of the generalized Poisson equation and the linear regime of the Poisson-Boltzmann, allowing to solve iteratively the minimization problem with some ten iterations of the ordinary Poisson equation solver. In addition, a self-consistent procedure enables us to solve the non-linear Poisson-Boltzmann problem. Both solvers exhibit very high accuracy and parallel efficiency and allow for the treatment of periodic, free, and slab boundary conditions. The solver has been integrated into the BigDFT and Quantum-ESPRESSO electronic-structure packages and will be released as an independent program, suitable for integration in other codes.

  14. Parallel Sparse Matrix - Vector Product

    DEFF Research Database (Denmark)

    Alexandersen, Joe; Lazarov, Boyan Stefanov; Dammann, Bernd

    This technical report contains a case study of a sparse matrix-vector product routine, implemented for parallel execution on a compute cluster with both pure MPI and hybrid MPI-OpenMP solutions. C++ classes for sparse data types were developed and the report shows how these class can be used...

  15. Comparative study of incompressible and isothermal compressible flow solvers for cavitating flow dynamics

    Energy Technology Data Exchange (ETDEWEB)

    Park, Sun Ho [Korea Maritime and Ocean University, Busan (Korea, Republic of); Rhee, Shin Hyung [Seoul National University, Seoul (Korea, Republic of)

    2015-08-15

    Incompressible flow solvers are generally used for numerical analysis of cavitating flows, but with limitations in handling compressibility effects on vapor phase. To study compressibility effects on vapor phase and cavity interface, pressure-based incompressible and isothermal compressible flow solvers based on a cell-centered finite volume method were developed using the OpenFOAM libraries. To validate the solvers, cavitating flow around a hemispherical head-form body was simulated and validated against the experimental data. The cavity shedding behavior, length of a re-entrant jet, drag history, and the Strouhal number were compared between the two solvers. The results confirmed that computations of the cavitating flow including compressibility effects improved the reproduction of cavitation dynamics.

  16. Multiscale Universal Interface: A concurrent framework for coupling heterogeneous solvers

    Energy Technology Data Exchange (ETDEWEB)

    Tang, Yu-Hang, E-mail: yuhang_tang@brown.edu [Division of Applied Mathematics, Brown University, Providence, RI (United States); Kudo, Shuhei, E-mail: shuhei-kudo@outlook.jp [Graduate School of System Informatics, Kobe University, 1-1 Rokkodai-cho, Nada-ku, Kobe, 657-8501 (Japan); Bian, Xin, E-mail: xin_bian@brown.edu [Division of Applied Mathematics, Brown University, Providence, RI (United States); Li, Zhen, E-mail: zhen_li@brown.edu [Division of Applied Mathematics, Brown University, Providence, RI (United States); Karniadakis, George Em, E-mail: george_karniadakis@brown.edu [Division of Applied Mathematics, Brown University, Providence, RI (United States); Collaboratory on Mathematics for Mesoscopic Modeling of Materials, Pacific Northwest National Laboratory, Richland, WA 99354 (United States)

    2015-09-15

    Graphical abstract: - Abstract: Concurrently coupled numerical simulations using heterogeneous solvers are powerful tools for modeling multiscale phenomena. However, major modifications to existing codes are often required to enable such simulations, posing significant difficulties in practice. In this paper we present a C++ library, i.e. the Multiscale Universal Interface (MUI), which is capable of facilitating the coupling effort for a wide range of multiscale simulations. The library adopts a header-only form with minimal external dependency and hence can be easily dropped into existing codes. A data sampler concept is introduced, combined with a hybrid dynamic/static typing mechanism, to create an easily customizable framework for solver-independent data interpretation. The library integrates MPI MPMD support and an asynchronous communication protocol to handle inter-solver information exchange irrespective of the solvers' own MPI awareness. Template metaprogramming is heavily employed to simultaneously improve runtime performance and code flexibility. We validated the library by solving three different multiscale problems, which also serve to demonstrate the flexibility of the framework in handling heterogeneous models and solvers. In the first example, a Couette flow was simulated using two concurrently coupled Smoothed Particle Hydrodynamics (SPH) simulations of different spatial resolutions. In the second example, we coupled the deterministic SPH method with the stochastic Dissipative Particle Dynamics (DPD) method to study the effect of surface grafting on the hydrodynamics properties on the surface. In the third example, we consider conjugate heat transfer between a solid domain and a fluid domain by coupling the particle-based energy-conserving DPD (eDPD) method with the Finite Element Method (FEM)

  17. Accelerated Cyclic Reduction: A Distributed-Memory Fast Solver for Structured Linear Systems

    KAUST Repository

    Chávez, Gustavo

    2017-12-15

    We present Accelerated Cyclic Reduction (ACR), a distributed-memory fast solver for rank-compressible block tridiagonal linear systems arising from the discretization of elliptic operators, developed here for three dimensions. Algorithmic synergies between Cyclic Reduction and hierarchical matrix arithmetic operations result in a solver that has O(kNlogN(logN+k2)) arithmetic complexity and O(k Nlog N) memory footprint, where N is the number of degrees of freedom and k is the rank of a block in the hierarchical approximation, and which exhibits substantial concurrency. We provide a baseline for performance and applicability by comparing with the multifrontal method with and without hierarchical semi-separable matrices, with algebraic multigrid and with the classic cyclic reduction method. Over a set of large-scale elliptic systems with features of nonsymmetry and indefiniteness, the robustness of the direct solvers extends beyond that of the multigrid solver, and relative to the multifrontal approach ACR has lower or comparable execution time and size of the factors, with substantially lower numerical ranks. ACR exhibits good strong and weak scaling in a distributed context and, as with any direct solver, is advantageous for problems that require the solution of multiple right-hand sides. Numerical experiments show that the rank k patterns are of O(1) for the Poisson equation and of O(n) for the indefinite Helmholtz equation. The solver is ideal in situations where low-accuracy solutions are sufficient, or otherwise as a preconditioner within an iterative method.

  18. Accelerated Cyclic Reduction: A Distributed-Memory Fast Solver for Structured Linear Systems

    KAUST Repository

    Chá vez, Gustavo; Turkiyyah, George; Zampini, Stefano; Ltaief, Hatem; Keyes, David E.

    2017-01-01

    We present Accelerated Cyclic Reduction (ACR), a distributed-memory fast solver for rank-compressible block tridiagonal linear systems arising from the discretization of elliptic operators, developed here for three dimensions. Algorithmic synergies between Cyclic Reduction and hierarchical matrix arithmetic operations result in a solver that has O(kNlogN(logN+k2)) arithmetic complexity and O(k Nlog N) memory footprint, where N is the number of degrees of freedom and k is the rank of a block in the hierarchical approximation, and which exhibits substantial concurrency. We provide a baseline for performance and applicability by comparing with the multifrontal method with and without hierarchical semi-separable matrices, with algebraic multigrid and with the classic cyclic reduction method. Over a set of large-scale elliptic systems with features of nonsymmetry and indefiniteness, the robustness of the direct solvers extends beyond that of the multigrid solver, and relative to the multifrontal approach ACR has lower or comparable execution time and size of the factors, with substantially lower numerical ranks. ACR exhibits good strong and weak scaling in a distributed context and, as with any direct solver, is advantageous for problems that require the solution of multiple right-hand sides. Numerical experiments show that the rank k patterns are of O(1) for the Poisson equation and of O(n) for the indefinite Helmholtz equation. The solver is ideal in situations where low-accuracy solutions are sufficient, or otherwise as a preconditioner within an iterative method.

  19. A Python interface to Diffpack-based classes and solvers

    OpenAIRE

    Munthe-Kaas, Heidi Vikki

    2013-01-01

    Python is a programming language that has gained a lot of popularity during the last 15 years, and as a very easy-to-learn and flexible scripting language it is very well suited for computa- tional science, both in mathematics and in physics. Diffpack is a PDE library written in C++, made for easier implementation of both smaller PDE solvers and for larger libraries of simu- lators. It contains large class hierarchies for different solvers, grids, arrays, parallel computing and almost everyth...

  20. Multi-threaded Sparse Matrix Sparse Matrix Multiplication for Many-Core and GPU Architectures.

    Energy Technology Data Exchange (ETDEWEB)

    Deveci, Mehmet [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Trott, Christian Robert [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Rajamanickam, Sivasankaran [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)

    2018-01-01

    Sparse Matrix-Matrix multiplication is a key kernel that has applications in several domains such as scientific computing and graph analysis. Several algorithms have been studied in the past for this foundational kernel. In this paper, we develop parallel algorithms for sparse matrix- matrix multiplication with a focus on performance portability across different high performance computing architectures. The performance of these algorithms depend on the data structures used in them. We compare different types of accumulators in these algorithms and demonstrate the performance difference between these data structures. Furthermore, we develop a meta-algorithm, kkSpGEMM, to choose the right algorithm and data structure based on the characteristics of the problem. We show performance comparisons on three architectures and demonstrate the need for the community to develop two phase sparse matrix-matrix multiplication implementations for efficient reuse of the data structures involved.

  1. A Kohn–Sham equation solver based on hexahedral finite elements

    International Nuclear Information System (INIS)

    Fang Jun; Gao Xingyu; Zhou Aihui

    2012-01-01

    We design a Kohn–Sham equation solver based on hexahedral finite element discretizations. The solver integrates three schemes proposed in this paper. The first scheme arranges one a priori locally-refined hexahedral mesh with appropriate multiresolution. The second one is a modified mass-lumping procedure which accelerates the diagonalization in the self-consistent field iteration. The third one is a finite element recovery method which enhances the eigenpair approximations with small extra work. We carry out numerical tests on each scheme to investigate the validity and efficiency, and then apply them to calculate the ground state total energies of nanosystems C 60 , C 120 , and C 275 H 172 . It is shown that our solver appears to be computationally attractive for finite element applications in electronic structure study.

  2. Sparse approximation with bases

    CERN Document Server

    2015-01-01

    This book systematically presents recent fundamental results on greedy approximation with respect to bases. Motivated by numerous applications, the last decade has seen great successes in studying nonlinear sparse approximation. Recent findings have established that greedy-type algorithms are suitable methods of nonlinear approximation in both sparse approximation with respect to bases and sparse approximation with respect to redundant systems. These insights, combined with some previous fundamental results, form the basis for constructing the theory of greedy approximation. Taking into account the theoretical and practical demand for this kind of theory, the book systematically elaborates a theoretical framework for greedy approximation and its applications.  The book addresses the needs of researchers working in numerical mathematics, harmonic analysis, and functional analysis. It quickly takes the reader from classical results to the latest frontier, but is written at the level of a graduate course and do...

  3. Hyperspectral Unmixing with Robust Collaborative Sparse Regression

    Directory of Open Access Journals (Sweden)

    Chang Li

    2016-07-01

    Full Text Available Recently, sparse unmixing (SU of hyperspectral data has received particular attention for analyzing remote sensing images. However, most SU methods are based on the commonly admitted linear mixing model (LMM, which ignores the possible nonlinear effects (i.e., nonlinearity. In this paper, we propose a new method named robust collaborative sparse regression (RCSR based on the robust LMM (rLMM for hyperspectral unmixing. The rLMM takes the nonlinearity into consideration, and the nonlinearity is merely treated as outlier, which has the underlying sparse property. The RCSR simultaneously takes the collaborative sparse property of the abundance and sparsely distributed additive property of the outlier into consideration, which can be formed as a robust joint sparse regression problem. The inexact augmented Lagrangian method (IALM is used to optimize the proposed RCSR. The qualitative and quantitative experiments on synthetic datasets and real hyperspectral images demonstrate that the proposed RCSR is efficient for solving the hyperspectral SU problem compared with the other four state-of-the-art algorithms.

  4. MINARET: Towards a time-dependent neutron transport parallel solver

    International Nuclear Information System (INIS)

    Baudron, A.M.; Lautard, J.J.; Maday, Y.; Mula, O.

    2013-01-01

    We present the newly developed time-dependent 3D multigroup discrete ordinates neutron transport solver that has recently been implemented in the MINARET code. The solver is the support for a study about computing acceleration techniques that involve parallel architectures. In this work, we will focus on the parallelization of two of the variables involved in our equation: the angular directions and the time. This last variable has been parallelized by a (time) domain decomposition method called the para-real in time algorithm. (authors)

  5. Image fusion using sparse overcomplete feature dictionaries

    Science.gov (United States)

    Brumby, Steven P.; Bettencourt, Luis; Kenyon, Garrett T.; Chartrand, Rick; Wohlberg, Brendt

    2015-10-06

    Approaches for deciding what individuals in a population of visual system "neurons" are looking for using sparse overcomplete feature dictionaries are provided. A sparse overcomplete feature dictionary may be learned for an image dataset and a local sparse representation of the image dataset may be built using the learned feature dictionary. A local maximum pooling operation may be applied on the local sparse representation to produce a translation-tolerant representation of the image dataset. An object may then be classified and/or clustered within the translation-tolerant representation of the image dataset using a supervised classification algorithm and/or an unsupervised clustering algorithm.

  6. Development of a CANDU Moderator Analysis Model; Based on Coupled Solver

    International Nuclear Information System (INIS)

    Yoon, Churl; Park, Joo Hwan

    2006-01-01

    A CFD model for predicting the CANDU-6 moderator temperature has been developed for several years in KAERI, which is based on CFX-4. This analytic model(CFX4-CAMO) has some strength in the modeling of hydraulic resistance in the core region and in the treatment of heat source term in the energy equations. But the convergence difficulties and slow computing speed reveal to be the limitations of this model, because the CFX-4 code adapts a segregated solver to solve the governing equations with strong coupled-effect. Compared to CFX-4 using segregated solver, CFX-10 adapts high efficient and robust coupled-solver. Before December 2005 when CFX-10 was distributed, the previous version of CFX-10(CFX-5. series) also adapted coupled solver but didn't have any capability to apply porous media approaches correctly. In this study, the developed moderator analysis model based on CFX- 4 (CFX4-CAMO) is transformed into a new moderator analysis model based on CFX-10. The new model is examined and the results are compared to the former

  7. Computational cost estimates for parallel shared memory isogeometric multi-frontal solvers

    KAUST Repository

    Woźniak, Maciej; Kuźnik, Krzysztof M.; Paszyński, Maciej R.; Calo, Victor M.; Pardo, D.

    2014-01-01

    In this paper we present computational cost estimates for parallel shared memory isogeometric multi-frontal solvers. The estimates show that the ideal isogeometric shared memory parallel direct solver scales as O( p2log(N/p)) for one dimensional problems, O(Np2) for two dimensional problems, and O(N4/3p2) for three dimensional problems, where N is the number of degrees of freedom, and p is the polynomial order of approximation. The computational costs of the shared memory parallel isogeometric direct solver are compared with those corresponding to the sequential isogeometric direct solver, being the latest equal to O(N p2) for the one dimensional case, O(N1.5p3) for the two dimensional case, and O(N2p3) for the three dimensional case. The shared memory version significantly reduces both the scalability in terms of N and p. Theoretical estimates are compared with numerical experiments performed with linear, quadratic, cubic, quartic, and quintic B-splines, in one and two spatial dimensions. © 2014 Elsevier Ltd. All rights reserved.

  8. Computational cost estimates for parallel shared memory isogeometric multi-frontal solvers

    KAUST Repository

    Woźniak, Maciej

    2014-06-01

    In this paper we present computational cost estimates for parallel shared memory isogeometric multi-frontal solvers. The estimates show that the ideal isogeometric shared memory parallel direct solver scales as O( p2log(N/p)) for one dimensional problems, O(Np2) for two dimensional problems, and O(N4/3p2) for three dimensional problems, where N is the number of degrees of freedom, and p is the polynomial order of approximation. The computational costs of the shared memory parallel isogeometric direct solver are compared with those corresponding to the sequential isogeometric direct solver, being the latest equal to O(N p2) for the one dimensional case, O(N1.5p3) for the two dimensional case, and O(N2p3) for the three dimensional case. The shared memory version significantly reduces both the scalability in terms of N and p. Theoretical estimates are compared with numerical experiments performed with linear, quadratic, cubic, quartic, and quintic B-splines, in one and two spatial dimensions. © 2014 Elsevier Ltd. All rights reserved.

  9. Performance of uncertainty quantification methodologies and linear solvers in cardiovascular simulations

    Science.gov (United States)

    Seo, Jongmin; Schiavazzi, Daniele; Marsden, Alison

    2017-11-01

    Cardiovascular simulations are increasingly used in clinical decision making, surgical planning, and disease diagnostics. Patient-specific modeling and simulation typically proceeds through a pipeline from anatomic model construction using medical image data to blood flow simulation and analysis. To provide confidence intervals on simulation predictions, we use an uncertainty quantification (UQ) framework to analyze the effects of numerous uncertainties that stem from clinical data acquisition, modeling, material properties, and boundary condition selection. However, UQ poses a computational challenge requiring multiple evaluations of the Navier-Stokes equations in complex 3-D models. To achieve efficiency in UQ problems with many function evaluations, we implement and compare a range of iterative linear solver and preconditioning techniques in our flow solver. We then discuss applications to patient-specific cardiovascular simulation and how the problem/boundary condition formulation in the solver affects the selection of the most efficient linear solver. Finally, we discuss performance improvements in the context of uncertainty propagation. Support from National Institute of Health (R01 EB018302) is greatly appreciated.

  10. A Direct Elliptic Solver Based on Hierarchically Low-Rank Schur Complements

    KAUST Repository

    Chávez, Gustavo

    2017-03-17

    A parallel fast direct solver for rank-compressible block tridiagonal linear systems is presented. Algorithmic synergies between Cyclic Reduction and Hierarchical matrix arithmetic operations result in a solver with O(Nlog2N) arithmetic complexity and O(NlogN) memory footprint. We provide a baseline for performance and applicability by comparing with well-known implementations of the $$\\\\mathcal{H}$$ -LU factorization and algebraic multigrid within a shared-memory parallel environment that leverages the concurrency features of the method. Numerical experiments reveal that this method is comparable with other fast direct solvers based on Hierarchical Matrices such as $$\\\\mathcal{H}$$ -LU and that it can tackle problems where algebraic multigrid fails to converge.

  11. Decision Engines for Software Analysis Using Satisfiability Modulo Theories Solvers

    Science.gov (United States)

    Bjorner, Nikolaj

    2010-01-01

    The area of software analysis, testing and verification is now undergoing a revolution thanks to the use of automated and scalable support for logical methods. A well-recognized premise is that at the core of software analysis engines is invariably a component using logical formulas for describing states and transformations between system states. The process of using this information for discovering and checking program properties (including such important properties as safety and security) amounts to automatic theorem proving. In particular, theorem provers that directly support common software constructs offer a compelling basis. Such provers are commonly called satisfiability modulo theories (SMT) solvers. Z3 is a state-of-the-art SMT solver. It is developed at Microsoft Research. It can be used to check the satisfiability of logical formulas over one or more theories such as arithmetic, bit-vectors, lists, records and arrays. The talk describes some of the technology behind modern SMT solvers, including the solver Z3. Z3 is currently mainly targeted at solving problems that arise in software analysis and verification. It has been applied to various contexts, such as systems for dynamic symbolic simulation (Pex, SAGE, Vigilante), for program verification and extended static checking (Spec#/Boggie, VCC, HAVOC), for software model checking (Yogi, SLAM), model-based design (FORMULA), security protocol code (F7), program run-time analysis and invariant generation (VS3). We will describe how it integrates support for a variety of theories that arise naturally in the context of the applications. There are several new promising avenues and the talk will touch on some of these and the challenges related to SMT solvers. Proceedings

  12. AQUASOL: An efficient solver for the dipolar Poisson-Boltzmann-Langevin equation.

    Science.gov (United States)

    Koehl, Patrice; Delarue, Marc

    2010-02-14

    The Poisson-Boltzmann (PB) formalism is among the most popular approaches to modeling the solvation of molecules. It assumes a continuum model for water, leading to a dielectric permittivity that only depends on position in space. In contrast, the dipolar Poisson-Boltzmann-Langevin (DPBL) formalism represents the solvent as a collection of orientable dipoles with nonuniform concentration; this leads to a nonlinear permittivity function that depends both on the position and on the local electric field at that position. The differences in the assumptions underlying these two models lead to significant differences in the equations they generate. The PB equation is a second order, elliptic, nonlinear partial differential equation (PDE). Its response coefficients correspond to the dielectric permittivity and are therefore constant within each subdomain of the system considered (i.e., inside and outside of the molecules considered). While the DPBL equation is also a second order, elliptic, nonlinear PDE, its response coefficients are nonlinear functions of the electrostatic potential. Many solvers have been developed for the PB equation; to our knowledge, none of these can be directly applied to the DPBL equation. The methods they use may adapt to the difference; their implementations however are PBE specific. We adapted the PBE solver originally developed by Holst and Saied [J. Comput. Chem. 16, 337 (1995)] to the problem of solving the DPBL equation. This solver uses a truncated Newton method with a multigrid preconditioner. Numerical evidences suggest that it converges for the DPBL equation and that the convergence is superlinear. It is found however to be slow and greedy in memory requirement for problems commonly encountered in computational biology and computational chemistry. To circumvent these problems, we propose two variants, a quasi-Newton solver based on a simplified, inexact Jacobian and an iterative self-consistent solver that is based directly on the PBE

  13. Integrating Problem Solvers from Analogous Markets in New Product Ideation

    DEFF Research Database (Denmark)

    Franke, Nikolaus; Poetz, Marion; Schreier, Martin

    2014-01-01

    Who provides better inputs to new product ideation tasks, problem solvers with expertise in the area for which new products are to be developed or problem solvers from “analogous” markets that are distant but share an analogous problem or need? Conventional wisdom appears to suggest that target...... market expertise is indispensable, which is why most managers searching for new ideas tend to stay within their own market context even when they do search outside their firms' boundaries. However, in a unique symmetric experiment that isolates the effect of market origin, we find evidence...... for the opposite: Although solutions provided by problem solvers from analogous markets show lower potential for immediate use, they demonstrate substantially higher levels of novelty. Also, compared to established novelty drivers, this effect appears highly relevant from a managerial perspective: we find...

  14. Manifold regularization for sparse unmixing of hyperspectral images.

    Science.gov (United States)

    Liu, Junmin; Zhang, Chunxia; Zhang, Jiangshe; Li, Huirong; Gao, Yuelin

    2016-01-01

    Recently, sparse unmixing has been successfully applied to spectral mixture analysis of remotely sensed hyperspectral images. Based on the assumption that the observed image signatures can be expressed in the form of linear combinations of a number of pure spectral signatures known in advance, unmixing of each mixed pixel in the scene is to find an optimal subset of signatures in a very large spectral library, which is cast into the framework of sparse regression. However, traditional sparse regression models, such as collaborative sparse regression , ignore the intrinsic geometric structure in the hyperspectral data. In this paper, we propose a novel model, called manifold regularized collaborative sparse regression , by introducing a manifold regularization to the collaborative sparse regression model. The manifold regularization utilizes a graph Laplacian to incorporate the locally geometrical structure of the hyperspectral data. An algorithm based on alternating direction method of multipliers has been developed for the manifold regularized collaborative sparse regression model. Experimental results on both the simulated and real hyperspectral data sets have demonstrated the effectiveness of our proposed model.

  15. Regression with Sparse Approximations of Data

    DEFF Research Database (Denmark)

    Noorzad, Pardis; Sturm, Bob L.

    2012-01-01

    We propose sparse approximation weighted regression (SPARROW), a method for local estimation of the regression function that uses sparse approximation with a dictionary of measurements. SPARROW estimates the regression function at a point with a linear combination of a few regressands selected...... by a sparse approximation of the point in terms of the regressors. We show SPARROW can be considered a variant of \\(k\\)-nearest neighbors regression (\\(k\\)-NNR), and more generally, local polynomial kernel regression. Unlike \\(k\\)-NNR, however, SPARROW can adapt the number of regressors to use based...

  16. Sparse adaptive filters for echo cancellation

    CERN Document Server

    Paleologu, Constantin

    2011-01-01

    Adaptive filters with a large number of coefficients are usually involved in both network and acoustic echo cancellation. Consequently, it is important to improve the convergence rate and tracking of the conventional algorithms used for these applications. This can be achieved by exploiting the sparseness character of the echo paths. Identification of sparse impulse responses was addressed mainly in the last decade with the development of the so-called ``proportionate''-type algorithms. The goal of this book is to present the most important sparse adaptive filters developed for echo cancellati

  17. Fostering Creative Problem Solvers in Higher Education

    DEFF Research Database (Denmark)

    Zhou, Chunfang

    2016-01-01

    to meet such challenges. This chapter aims to illustrate how to understand: 1) complexity as the nature of professional practice; 2) creative problem solving as the core skill in professional practice; 3) creativity as interplay between persons and their environment; 4) higher education as the context......Recent studies have emphasized issues of social emergence based on thinking of societies as complex systems. The complexity of professional practice has been recognized as the root of challenges for higher education. To foster creative problem solvers is a key response of higher education in order...... of fostering creative problem solvers; and 5) some innovative strategies such as Problem-Based Learning (PBL) and building a learning environment by Information Communication Technology (ICT) as potential strategies of creativity development. Accordingly, this chapter contributes to bridge the complexity...

  18. Implementation of density-based solver for all speeds in the framework of OpenFOAM

    Science.gov (United States)

    Shen, Chun; Sun, Fengxian; Xia, Xinlin

    2014-10-01

    In the framework of open source CFD code OpenFOAM, a density-based solver for all speeds flow field is developed. In this solver the preconditioned all speeds AUSM+(P) scheme is adopted and the dual time scheme is implemented to complete the unsteady process. Parallel computation could be implemented to accelerate the solving process. Different interface reconstruction algorithms are implemented, and their accuracy with respect to convection is compared. Three benchmark tests of lid-driven cavity flow, flow crossing over a bump, and flow over a forward-facing step are presented to show the accuracy of the AUSM+(P) solver for low-speed incompressible flow, transonic flow, and supersonic/hypersonic flow. Firstly, for the lid driven cavity flow, the computational results obtained by different interface reconstruction algorithms are compared. It is indicated that the one dimensional reconstruction scheme adopted in this solver possesses high accuracy and the solver developed in this paper can effectively catch the features of low incompressible flow. Then via the test cases regarding the flow crossing over bump and over forward step, the ability to capture characteristics of the transonic and supersonic/hypersonic flows are confirmed. The forward-facing step proves to be the most challenging for the preconditioned solvers with and without the dual time scheme. Nonetheless, the solvers described in this paper reproduce the main features of this flow, including the evolution of the initial transient.

  19. Java Based Symbolic Circuit Solver For Electrical Engineering Curriculum

    Directory of Open Access Journals (Sweden)

    Ruba Akram Amarin

    2012-11-01

    Full Text Available The interactive technical electronic book, TechEBook, currently under development at the University of Central Florida (UCF, introduces a paradigm shift by replacing the traditional electrical engineering course with topic-driven modules that provide a useful tool for engineers and scientists. The TechEBook comprises the two worlds of classical circuit books and interactive operating platforms such as iPads, laptops and desktops. The TechEBook provides an interactive applets screen that holds many modules, each of which has a specific application in the self learning process. This paper describes one of the interactive techniques in the TechEBook known as Symbolic Circuit Solver (SymCirc. The SymCirc develops a versatile symbolic based linear circuit with a switches solver. The solver works by accepting a Netlist and the element that the user wants to find the voltage across or current on, as input parameters. Then it either produces the plot or the time domain expression of the output. Frequency domain plots or Symbolic Transfer Functions are also produced. The solver gets its input from a Web-based GUI circuit drawer developed at UCF. Typical simulation tools that electrical engineers encounter are numerical in nature, that is, when presented with an input circuit they iteratively solve the circuit across a set of small time steps. The result is represented as a data set of output versus time, which can be plotted for further inspection. Such results do not help users understand the ultimate nature of circuits as Linear Time Invariant systems with a finite dimensional basis in the solution space. SymCirc provides all simulation results as time domain expressions composed of the basic functions that exclusively include exponentials, sines, cosines and/or t raised to any power. This paper explains the motivation behind SymCirc, the Graphical User Interface front end and how the solver actually works. The paper also presents some examples and

  20. Efficiency optimization of a fast Poisson solver in beam dynamics simulation

    Science.gov (United States)

    Zheng, Dawei; Pöplau, Gisela; van Rienen, Ursula

    2016-01-01

    Calculating the solution of Poisson's equation relating to space charge force is still the major time consumption in beam dynamics simulations and calls for further improvement. In this paper, we summarize a classical fast Poisson solver in beam dynamics simulations: the integrated Green's function method. We introduce three optimization steps of the classical Poisson solver routine: using the reduced integrated Green's function instead of the integrated Green's function; using the discrete cosine transform instead of discrete Fourier transform for the Green's function; using a novel fast convolution routine instead of an explicitly zero-padded convolution. The new Poisson solver routine preserves the advantages of fast computation and high accuracy. This provides a fast routine for high performance calculation of the space charge effect in accelerators.

  1. Wavelet-Based Poisson Solver for Use in Particle-in-Cell Simulations

    CERN Document Server

    Terzic, Balsa; Mihalcea, Daniel; Pogorelov, Ilya V

    2005-01-01

    We report on a successful implementation of a wavelet-based Poisson solver for use in 3D particle-in-cell simulations. One new aspect of our algorithm is its ability to treat the general (inhomogeneous) Dirichlet boundary conditions. The solver harnesses advantages afforded by the wavelet formulation, such as sparsity of operators and data sets, existence of effective preconditioners, and the ability simultaneously to remove numerical noise and further compress relevant data sets. Having tested our method as a stand-alone solver on two model problems, we merged it into IMPACT-T to obtain a fully functional serial PIC code. We present and discuss preliminary results of application of the new code to the modelling of the Fermilab/NICADD and AES/JLab photoinjectors.

  2. Wavelet-based Poisson Solver for use in Particle-In-Cell Simulations

    International Nuclear Information System (INIS)

    Terzic, B.; Mihalcea, D.; Bohn, C.L.; Pogorelov, I.V.

    2005-01-01

    We report on a successful implementation of a wavelet based Poisson solver for use in 3D particle-in-cell (PIC) simulations. One new aspect of our algorithm is its ability to treat the general(inhomogeneous) Dirichlet boundary conditions (BCs). The solver harnesses advantages afforded by the wavelet formulation, such as sparsity of operators and data sets, existence of effective preconditioners, and the ability simultaneously to remove numerical noise and further compress relevant data sets. Having tested our method as a stand-alone solver on two model problems, we merged it into IMPACT-T to obtain a fully functional serial PIC code. We present and discuss preliminary results of application of the new code to the modeling of the Fermilab/NICADD and AES/JLab photoinjectors

  3. Identification of severe wind conditions using a Reynolds averaged Navier-Stokes solver

    DEFF Research Database (Denmark)

    Sørensen, Niels N.; Bechmann, Andreas; Johansen, Jeppe

    2007-01-01

    The present paper describes the application of a Navier-Stokes solver to predict the presence of severe flow conditions in complex terrain, capturing conditions that may be critical to the siting of wind turbines in the terrain. First it is documented that the flow solver is capable of predicting...

  4. Biclustering via Sparse Singular Value Decomposition

    KAUST Repository

    Lee, Mihee

    2010-02-16

    Sparse singular value decomposition (SSVD) is proposed as a new exploratory analysis tool for biclustering or identifying interpretable row-column associations within high-dimensional data matrices. SSVD seeks a low-rank, checkerboard structured matrix approximation to data matrices. The desired checkerboard structure is achieved by forcing both the left- and right-singular vectors to be sparse, that is, having many zero entries. By interpreting singular vectors as regression coefficient vectors for certain linear regressions, sparsity-inducing regularization penalties are imposed to the least squares regression to produce sparse singular vectors. An efficient iterative algorithm is proposed for computing the sparse singular vectors, along with some discussion of penalty parameter selection. A lung cancer microarray dataset and a food nutrition dataset are used to illustrate SSVD as a biclustering method. SSVD is also compared with some existing biclustering methods using simulated datasets. © 2010, The International Biometric Society.

  5. Fast Laplace solver approach to pore-scale permeability

    Science.gov (United States)

    Arns, C. H.; Adler, P. M.

    2018-02-01

    We introduce a powerful and easily implemented method to calculate the permeability of porous media at the pore scale using an approximation based on the Poiseulle equation to calculate permeability to fluid flow with a Laplace solver. The method consists of calculating the Euclidean distance map of the fluid phase to assign local conductivities and lends itself naturally to the treatment of multiscale problems. We compare with analytical solutions as well as experimental measurements and lattice Boltzmann calculations of permeability for Fontainebleau sandstone. The solver is significantly more stable than the lattice Boltzmann approach, uses less memory, and is significantly faster. Permeabilities are in excellent agreement over a wide range of porosities.

  6. Influence of an SN solver in a fine-mesh neutronics/thermal-hydraulics framework

    International Nuclear Information System (INIS)

    Jareteg, Klas; Vinai, Paolo; Demaziere, Christophe; Sasic, Srdjan

    2015-01-01

    In this paper a study on the influence of a neutron discrete ordinates (S N ) solver within a fine-mesh neutronic/thermal-hydraulic methodology is presented. The methodology consists of coupling a neutronic solver with a single-phase fluid solver, and it is aimed at computing the two fields on a three-dimensional (3D) sub-pin level. The cross-sections needed for the neutron transport equations are pre-generated using a Monte Carlo approach. The coupling is resolved in an iterative manner with full convergence of both fields. A conservative transfer of the full 3D information is achieved, allowing for a proper coupling between the neutronic and the thermal-hydraulic meshes on the finest calculated scales. The discrete ordinates solver is benchmarked against a Monte Carlo reference solution for a two-dimensional (2D) system. The results confirm the need of a high number of ordinates, giving a satisfactory accuracy in k eff and scalar flux profile applying S 16 for 16 energy groups. The coupled framework is used to compare the S N implementation and a solver based on the neutron diffusion approximation for a full 3D system of a quarter of a symmetric, 7x7 array in an infinite lattice setup. In this case, the impact of the discrete ordinates solver shows to be significant for the coupled system, as demonstrated in the calculations of the temperature distributions. (author)

  7. A fast mass spring model solver for high-resolution elastic objects

    Science.gov (United States)

    Zheng, Mianlun; Yuan, Zhiyong; Zhu, Weixu; Zhang, Guian

    2017-03-01

    Real-time simulation of elastic objects is of great importance for computer graphics and virtual reality applications. The fast mass spring model solver can achieve visually realistic simulation in an efficient way. Unfortunately, this method suffers from resolution limitations and lack of mechanical realism for a surface geometry model, which greatly restricts its application. To tackle these problems, in this paper we propose a fast mass spring model solver for high-resolution elastic objects. First, we project the complex surface geometry model into a set of uniform grid cells as cages through *cages mean value coordinate method to reflect its internal structure and mechanics properties. Then, we replace the original Cholesky decomposition method in the fast mass spring model solver with a conjugate gradient method, which can make the fast mass spring model solver more efficient for detailed surface geometry models. Finally, we propose a graphics processing unit accelerated parallel algorithm for the conjugate gradient method. Experimental results show that our method can realize efficient deformation simulation of 3D elastic objects with visual reality and physical fidelity, which has a great potential for applications in computer animation.

  8. Robust Face Recognition Via Gabor Feature and Sparse Representation

    Directory of Open Access Journals (Sweden)

    Hao Yu-Juan

    2016-01-01

    Full Text Available Sparse representation based on compressed sensing theory has been widely used in the field of face recognition, and has achieved good recognition results. but the face feature extraction based on sparse representation is too simple, and the sparse coefficient is not sparse. In this paper, we improve the classification algorithm based on the fusion of sparse representation and Gabor feature, and then improved algorithm for Gabor feature which overcomes the problem of large dimension of the vector dimension, reduces the computation and storage cost, and enhances the robustness of the algorithm to the changes of the environment.The classification efficiency of sparse representation is determined by the collaborative representation,we simplify the sparse constraint based on L1 norm to the least square constraint, which makes the sparse coefficients both positive and reduce the complexity of the algorithm. Experimental results show that the proposed method is robust to illumination, facial expression and pose variations of face recognition, and the recognition rate of the algorithm is improved.

  9. Sparse Learning with Stochastic Composite Optimization.

    Science.gov (United States)

    Zhang, Weizhong; Zhang, Lijun; Jin, Zhongming; Jin, Rong; Cai, Deng; Li, Xuelong; Liang, Ronghua; He, Xiaofei

    2017-06-01

    In this paper, we study Stochastic Composite Optimization (SCO) for sparse learning that aims to learn a sparse solution from a composite function. Most of the recent SCO algorithms have already reached the optimal expected convergence rate O(1/λT), but they often fail to deliver sparse solutions at the end either due to the limited sparsity regularization during stochastic optimization (SO) or due to the limitation in online-to-batch conversion. Even when the objective function is strongly convex, their high probability bounds can only attain O(√{log(1/δ)/T}) with δ is the failure probability, which is much worse than the expected convergence rate. To address these limitations, we propose a simple yet effective two-phase Stochastic Composite Optimization scheme by adding a novel powerful sparse online-to-batch conversion to the general Stochastic Optimization algorithms. We further develop three concrete algorithms, OptimalSL, LastSL and AverageSL, directly under our scheme to prove the effectiveness of the proposed scheme. Both the theoretical analysis and the experiment results show that our methods can really outperform the existing methods at the ability of sparse learning and at the meantime we can improve the high probability bound to approximately O(log(log(T)/δ)/λT).

  10. Shearlets and Optimally Sparse Approximations

    DEFF Research Database (Denmark)

    Kutyniok, Gitta; Lemvig, Jakob; Lim, Wang-Q

    2012-01-01

    Multivariate functions are typically governed by anisotropic features such as edges in images or shock fronts in solutions of transport-dominated equations. One major goal both for the purpose of compression as well as for an efficient analysis is the provision of optimally sparse approximations...... optimally sparse approximations of this model class in 2D as well as 3D. Even more, in contrast to all other directional representation systems, a theory for compactly supported shearlet frames was derived which moreover also satisfy this optimality benchmark. This chapter shall serve as an introduction...... to and a survey about sparse approximations of cartoon-like images by band-limited and also compactly supported shearlet frames as well as a reference for the state-of-the-art of this research field....

  11. Newton-Krylov-BDDC solvers for nonlinear cardiac mechanics

    KAUST Repository

    Pavarino, L.F.; Scacchi, S.; Zampini, Stefano

    2015-01-01

    The aim of this work is to design and study a Balancing Domain Decomposition by Constraints (BDDC) solver for the nonlinear elasticity system modeling the mechanical deformation of cardiac tissue. The contraction–relaxation process in the myocardium is induced by the generation and spread of the bioelectrical excitation throughout the tissue and it is mathematically described by the coupling of cardiac electro-mechanical models consisting of systems of partial and ordinary differential equations. In this study, the discretization of the electro-mechanical models is performed by Q1 finite elements in space and semi-implicit finite difference schemes in time, leading to the solution of a large-scale linear system for the bioelectrical potentials and a nonlinear system for the mechanical deformation at each time step of the simulation. The parallel mechanical solver proposed in this paper consists in solving the nonlinear system with a Newton-Krylov-BDDC method, based on the parallel solution of local mechanical problems and a coarse problem for the so-called primal unknowns. Three-dimensional parallel numerical tests on different machines show that the proposed parallel solver is scalable in the number of subdomains, quasi-optimal in the ratio of subdomain to mesh sizes, and robust with respect to tissue anisotropy.

  12. Newton-Krylov-BDDC solvers for nonlinear cardiac mechanics

    KAUST Repository

    Pavarino, L.F.

    2015-07-18

    The aim of this work is to design and study a Balancing Domain Decomposition by Constraints (BDDC) solver for the nonlinear elasticity system modeling the mechanical deformation of cardiac tissue. The contraction–relaxation process in the myocardium is induced by the generation and spread of the bioelectrical excitation throughout the tissue and it is mathematically described by the coupling of cardiac electro-mechanical models consisting of systems of partial and ordinary differential equations. In this study, the discretization of the electro-mechanical models is performed by Q1 finite elements in space and semi-implicit finite difference schemes in time, leading to the solution of a large-scale linear system for the bioelectrical potentials and a nonlinear system for the mechanical deformation at each time step of the simulation. The parallel mechanical solver proposed in this paper consists in solving the nonlinear system with a Newton-Krylov-BDDC method, based on the parallel solution of local mechanical problems and a coarse problem for the so-called primal unknowns. Three-dimensional parallel numerical tests on different machines show that the proposed parallel solver is scalable in the number of subdomains, quasi-optimal in the ratio of subdomain to mesh sizes, and robust with respect to tissue anisotropy.

  13. Motivation, Challenge, and Opportunity of Successful Solvers on an Innovation Platform

    DEFF Research Database (Denmark)

    Hossain, Mokter

    2017-01-01

    . The main motivational factors of successful solvers engaged in problem solving are money, learning, fun, sense of achievement, passion, and networking. Major challenges solvers face include unclear or insufficient problem description, lack of option for communication, language barrier, time zone...... other experts, the ability to work in a diverse environment, options of work after retirement and from distant locations, and a new source of income....

  14. Development and validation of a local time stepping-based PaSR solver for combustion and radiation modeling

    DEFF Research Database (Denmark)

    Pang, Kar Mun; Ivarsson, Anders; Haider, Sajjad

    2013-01-01

    In the current work, a local time stepping (LTS) solver for the modeling of combustion, radiative heat transfer and soot formation is developed and validated. This is achieved using an open source computational fluid dynamics code, OpenFOAM. Akin to the solver provided in default assembly i...... library in the edcSimpleFoam solver which was introduced during the 6th OpenFOAM workshop is modified and coupled with the current solver. One of the main amendments made is the integration of soot radiation submodel since this is significant in rich flames where soot particles are formed. The new solver...

  15. Benchmarking ICRF Full-wave Solvers for ITER

    International Nuclear Information System (INIS)

    Budny, R.V.; Berry, L.; Bilato, R.; Bonoli, P.; Brambilla, M.; Dumont, R.J.; Fukuyama, A.; Harvey, R.; Jaeger, E.F.; Indireshkumar, K.; Lerche, E.; McCune, D.; Phillips, C.K.; Vdovin, V.; Wright, J.

    2011-01-01

    Benchmarking of full-wave solvers for ICRF simulations is performed using plasma profiles and equilibria obtained from integrated self-consistent modeling predictions of four ITER plasmas. One is for a high performance baseline (5.3 T, 15 MA) DT H-mode. The others are for half-field, half-current plasmas of interest for the pre-activation phase with bulk plasma ion species being either hydrogen or He4. The predicted profiles are used by six full-wave solver groups to simulate the ICRF electromagnetic fields and heating, and by three of these groups to simulate the current-drive. Approximate agreement is achieved for the predicted heating power for the DT and He4 cases. Factor of two disagreements are found for the cases with second harmonic He3 heating in bulk H cases. Approximate agreement is achieved simulating the ICRF current drive.

  16. Multilevel sparse functional principal component analysis.

    Science.gov (United States)

    Di, Chongzhi; Crainiceanu, Ciprian M; Jank, Wolfgang S

    2014-01-29

    We consider analysis of sparsely sampled multilevel functional data, where the basic observational unit is a function and data have a natural hierarchy of basic units. An example is when functions are recorded at multiple visits for each subject. Multilevel functional principal component analysis (MFPCA; Di et al. 2009) was proposed for such data when functions are densely recorded. Here we consider the case when functions are sparsely sampled and may contain only a few observations per function. We exploit the multilevel structure of covariance operators and achieve data reduction by principal component decompositions at both between and within subject levels. We address inherent methodological differences in the sparse sampling context to: 1) estimate the covariance operators; 2) estimate the functional principal component scores; 3) predict the underlying curves. Through simulations the proposed method is able to discover dominating modes of variations and reconstruct underlying curves well even in sparse settings. Our approach is illustrated by two applications, the Sleep Heart Health Study and eBay auctions.

  17. On a construction of fast direct solvers

    Czech Academy of Sciences Publication Activity Database

    Práger, Milan

    2003-01-01

    Roč. 48, č. 3 (2003), s. 225-236 ISSN 0862-7940 Institutional research plan: CEZ:AV0Z1019905; CEZ:AV0Z1019905 Keywords : Poisson equation * boundary value problem * fast direct solver Subject RIV: BA - General Mathematics

  18. Language Recognition via Sparse Coding

    Science.gov (United States)

    2016-09-08

    explanation is that sparse coding can achieve a near-optimal approximation of much complicated nonlinear relationship through local and piecewise linear...training examples, where x(i) ∈ RN is the ith example in the batch. Optionally, X can be normalized and whitened before sparse coding for better result...normalized input vectors are then ZCA- whitened [20]. Em- pirically, we choose ZCA- whitening over PCA- whitening , and there is no dimensionality reduction

  19. Sparse seismic imaging using variable projection

    NARCIS (Netherlands)

    Aravkin, Aleksandr Y.; Tu, Ning; van Leeuwen, Tristan

    2013-01-01

    We consider an important class of signal processing problems where the signal of interest is known to be sparse, and can be recovered from data given auxiliary information about how the data was generated. For example, a sparse Green's function may be recovered from seismic experimental data using

  20. Tunable Sparse Network Coding for Multicast Networks

    DEFF Research Database (Denmark)

    Feizi, Soheil; Roetter, Daniel Enrique Lucani; Sørensen, Chres Wiant

    2014-01-01

    This paper shows the potential and key enabling mechanisms for tunable sparse network coding, a scheme in which the density of network coded packets varies during a transmission session. At the beginning of a transmission session, sparsely coded packets are transmitted, which benefits decoding...... complexity. At the end of a transmission, when receivers have accumulated degrees of freedom, coding density is increased. We propose a family of tunable sparse network codes (TSNCs) for multicast erasure networks with a controllable trade-off between completion time performance to decoding complexity...... a mechanism to perform efficient Gaussian elimination over sparse matrices going beyond belief propagation but maintaining low decoding complexity. Supporting simulation results are provided showing the trade-off between decoding complexity and completion time....

  1. A generalized gyrokinetic Poisson solver

    International Nuclear Information System (INIS)

    Lin, Z.; Lee, W.W.

    1995-03-01

    A generalized gyrokinetic Poisson solver has been developed, which employs local operations in the configuration space to compute the polarization density response. The new technique is based on the actual physical process of gyrophase-averaging. It is useful for nonlocal simulations using general geometry equilibrium. Since it utilizes local operations rather than the global ones such as FFT, the new method is most amenable to massively parallel algorithms

  2. Sparse PCA with Oracle Property.

    Science.gov (United States)

    Gu, Quanquan; Wang, Zhaoran; Liu, Han

    In this paper, we study the estimation of the k -dimensional sparse principal subspace of covariance matrix Σ in the high-dimensional setting. We aim to recover the oracle principal subspace solution, i.e., the principal subspace estimator obtained assuming the true support is known a priori. To this end, we propose a family of estimators based on the semidefinite relaxation of sparse PCA with novel regularizations. In particular, under a weak assumption on the magnitude of the population projection matrix, one estimator within this family exactly recovers the true support with high probability, has exact rank- k , and attains a [Formula: see text] statistical rate of convergence with s being the subspace sparsity level and n the sample size. Compared to existing support recovery results for sparse PCA, our approach does not hinge on the spiked covariance model or the limited correlation condition. As a complement to the first estimator that enjoys the oracle property, we prove that, another estimator within the family achieves a sharper statistical rate of convergence than the standard semidefinite relaxation of sparse PCA, even when the previous assumption on the magnitude of the projection matrix is violated. We validate the theoretical results by numerical experiments on synthetic datasets.

  3. Structural Sparse Tracking

    KAUST Repository

    Zhang, Tianzhu; Yang, Ming-Hsuan; Ahuja, Narendra; Ghanem, Bernard; Yan, Shuicheng; Xu, Changsheng; Liu, Si

    2015-01-01

    candidate. We show that our SST algorithm accommodates most existing sparse trackers with the respective merits. Both qualitative and quantitative evaluations on challenging benchmark image sequences demonstrate that the proposed SST algorithm performs

  4. 3D casing-distributor analysis with a novel block coupled OpenFOAM solver for hydraulic design application

    International Nuclear Information System (INIS)

    Devals, C; Zhang, Y; Dompierre, J; Guibault, F; Vu, T C; Mangani, L

    2014-01-01

    Nowadays, computational fluid dynamics is commonly used by design engineers to evaluate and compare losses in hydraulic components as it is less expensive and less time consuming than model tests. For that purpose, an automatic tool for casing and distributor analysis will be presented in this paper. An in-house mesh generator and a Reynolds Averaged Navier-Stokes equation solver using the standard k-ω SST turbulence model will be used to perform all computations. Two solvers based on the C++ OpenFOAM library will be used and compared to a commercial solver. The performance of the new fully coupled block solver developed by the University of Lucerne and Andritz will be compared to the standard 1.6ext segregated simpleFoam solver and to a commercial solver. In this study, relative comparisons of different geometries of casing and distributor will be performed. The present study is thus aimed at validating the block solver and the tool chain and providing design engineers with a faster and more reliable analysis tool that can be integrated into their design process

  5. Technique detection software for Sparse Matrices

    Directory of Open Access Journals (Sweden)

    KHAN Muhammad Taimoor

    2009-12-01

    Full Text Available Sparse storage formats are techniques for storing and processing the sparse matrix data efficiently. The performance of these storage formats depend upon the distribution of non-zeros, within the matrix in different dimensions. In order to have better results we need a technique that suits best the organization of data in a particular matrix. So the decision of selecting a better technique is the main step towards improving the system's results otherwise the efficiency can be decreased. The purpose of this research is to help identify the best storage format in case of reduced storage size and high processing efficiency for a sparse matrix.

  6. Sparse Representations of Hyperspectral Images

    KAUST Repository

    Swanson, Robin J.

    2015-11-23

    Hyperspectral image data has long been an important tool for many areas of sci- ence. The addition of spectral data yields significant improvements in areas such as object and image classification, chemical and mineral composition detection, and astronomy. Traditional capture methods for hyperspectral data often require each wavelength to be captured individually, or by sacrificing spatial resolution. Recently there have been significant improvements in snapshot hyperspectral captures using, in particular, compressed sensing methods. As we move to a compressed sensing image formation model the need for strong image priors to shape our reconstruction, as well as sparse basis become more important. Here we compare several several methods for representing hyperspectral images including learned three dimensional dictionaries, sparse convolutional coding, and decomposable nonlocal tensor dictionaries. Addi- tionally, we further explore their parameter space to identify which parameters provide the most faithful and sparse representations.

  7. Sparse Representations of Hyperspectral Images

    KAUST Repository

    Swanson, Robin J.

    2015-01-01

    Hyperspectral image data has long been an important tool for many areas of sci- ence. The addition of spectral data yields significant improvements in areas such as object and image classification, chemical and mineral composition detection, and astronomy. Traditional capture methods for hyperspectral data often require each wavelength to be captured individually, or by sacrificing spatial resolution. Recently there have been significant improvements in snapshot hyperspectral captures using, in particular, compressed sensing methods. As we move to a compressed sensing image formation model the need for strong image priors to shape our reconstruction, as well as sparse basis become more important. Here we compare several several methods for representing hyperspectral images including learned three dimensional dictionaries, sparse convolutional coding, and decomposable nonlocal tensor dictionaries. Addi- tionally, we further explore their parameter space to identify which parameters provide the most faithful and sparse representations.

  8. Supervised Convolutional Sparse Coding

    KAUST Repository

    Affara, Lama Ahmed

    2018-04-08

    Convolutional Sparse Coding (CSC) is a well-established image representation model especially suited for image restoration tasks. In this work, we extend the applicability of this model by proposing a supervised approach to convolutional sparse coding, which aims at learning discriminative dictionaries instead of purely reconstructive ones. We incorporate a supervised regularization term into the traditional unsupervised CSC objective to encourage the final dictionary elements to be discriminative. Experimental results show that using supervised convolutional learning results in two key advantages. First, we learn more semantically relevant filters in the dictionary and second, we achieve improved image reconstruction on unseen data.

  9. vZ - An Optimizing SMT Solver

    DEFF Research Database (Denmark)

    Bjørner, Nikolaj; Dung, Phan Anh; Fleckenstein, Lars

    2015-01-01

    vZ is a part of the SMT solver Z3. It allows users to pose and solve optimization problems modulo theories. Many SMT applications use models to provide satisfying assignments, and a growing number of these build on top of Z3 to get optimal assignments with respect to objective functions. vZ provi...

  10. Structure-aware Local Sparse Coding for Visual Tracking

    KAUST Repository

    Qi, Yuankai

    2018-01-24

    Sparse coding has been applied to visual tracking and related vision problems with demonstrated success in recent years. Existing tracking methods based on local sparse coding sample patches from a target candidate and sparsely encode these using a dictionary consisting of patches sampled from target template images. The discriminative strength of existing methods based on local sparse coding is limited as spatial structure constraints among the template patches are not exploited. To address this problem, we propose a structure-aware local sparse coding algorithm which encodes a target candidate using templates with both global and local sparsity constraints. For robust tracking, we show local regions of a candidate region should be encoded only with the corresponding local regions of the target templates that are the most similar from the global view. Thus, a more precise and discriminative sparse representation is obtained to account for appearance changes. To alleviate the issues with tracking drifts, we design an effective template update scheme. Extensive experiments on challenging image sequences demonstrate the effectiveness of the proposed algorithm against numerous stateof- the-art methods.

  11. Robust nonhomogeneous training samples detection method for space-time adaptive processing radar using sparse-recovery with knowledge-aided

    Science.gov (United States)

    Li, Zhihui; Liu, Hanwei; Zhang, Yongshun; Guo, Yiduo

    2017-10-01

    The performance of space-time adaptive processing (STAP) may degrade significantly when some of the training samples are contaminated by the signal-like components (outliers) in nonhomogeneous clutter environments. To remove the training samples contaminated by outliers in nonhomogeneous clutter environments, a robust nonhomogeneous training samples detection method using the sparse-recovery (SR) with knowledge-aided (KA) is proposed. First, the reduced-dimension (RD) overcomplete spatial-temporal steering dictionary is designed with the prior knowledge of system parameters and the possible target region. Then, the clutter covariance matrix (CCM) of cell under test is efficiently estimated using a modified focal underdetermined system solver (FOCUSS) algorithm, where a RD overcomplete spatial-temporal steering dictionary is applied. Third, the proposed statistics are formed by combining the estimated CCM with the generalized inner products (GIP) method, and the contaminated training samples can be detected and removed. Finally, several simulation results validate the effectiveness of the proposed KA-SR-GIP method.

  12. An Empirical Analysis of the Performance of Preconditioners for SPD Systems

    KAUST Repository

    George, Thomas

    2012-08-01

    Preconditioned iterative solvers have the potential to solve very large sparse linear systems with a fraction of the memory used by direct methods. However, the effectiveness and performance of most preconditioners is not only problem dependent, but also fairly sensitive to the choice of their tunable parameters. As a result, a typical practitioner is faced with an overwhelming number of choices of solvers, preconditioners, and their parameters. The diversity of preconditioners makes it difficult to analyze them in a unified theoretical model. A systematic empirical evaluation of existing preconditioned iterative solvers can help in identifying the relative advantages of various implementations. We present the results of a comprehensive experimental study of the most popular preconditioner and iterative solver combinations for symmetric positive-definite systems. We introduce a methodology for a rigorous comparative evaluation of various preconditioners, including the use of some simple but powerful metrics. The detailed comparison of various preconditioner implementations and a state-of-the-art direct solver gives interesting insights into their relative strengths and weaknesses. We believe that these results would be useful to researchers developing preconditioners and iterative solvers as well as practitioners looking for appropriate sparse solvers for their applications. © 2012 ACM.

  13. Advanced calculus problem solver

    CERN Document Server

    REA, Editors of

    2012-01-01

    Each Problem Solver is an insightful and essential study and solution guide chock-full of clear, concise problem-solving gems. All your questions can be found in one convenient source from one of the most trusted names in reference solution guides. More useful, more practical, and more informative, these study aids are the best review books and textbook companions available. Nothing remotely as comprehensive or as helpful exists in their subject anywhere. Perfect for undergraduate and graduate studies.Here in this highly useful reference is the finest overview of advanced calculus currently av

  14. Electric circuits problem solver

    CERN Document Server

    REA, Editors of

    2012-01-01

    Each Problem Solver is an insightful and essential study and solution guide chock-full of clear, concise problem-solving gems. All your questions can be found in one convenient source from one of the most trusted names in reference solution guides. More useful, more practical, and more informative, these study aids are the best review books and textbook companions available. Nothing remotely as comprehensive or as helpful exists in their subject anywhere. Perfect for undergraduate and graduate studies.Here in this highly useful reference is the finest overview of electric circuits currently av

  15. Mathematical programming solver based on local search

    CERN Document Server

    Gardi, Frédéric; Darlay, Julien; Estellon, Bertrand; Megel, Romain

    2014-01-01

    This book covers local search for combinatorial optimization and its extension to mixed-variable optimization. Although not yet understood from the theoretical point of view, local search is the paradigm of choice for tackling large-scale real-life optimization problems. Today's end-users demand interactivity with decision support systems. For optimization software, this means obtaining good-quality solutions quickly. Fast iterative improvement methods, like local search, are suited to satisfying such needs. Here the authors show local search in a new light, in particular presenting a new kind of mathematical programming solver, namely LocalSolver, based on neighborhood search. First, an iconoclast methodology is presented to design and engineer local search algorithms. The authors' concern about industrializing local search approaches is of particular interest for practitioners. This methodology is applied to solve two industrial problems with high economic stakes. Software based on local search induces ex...

  16. Fast linear solver for radiative transport equation with multiple right hand sides in diffuse optical tomography

    International Nuclear Information System (INIS)

    Jia, Jingfei; Kim, Hyun K.; Hielscher, Andreas H.

    2015-01-01

    It is well known that radiative transfer equation (RTE) provides more accurate tomographic results than its diffusion approximation (DA). However, RTE-based tomographic reconstruction codes have limited applicability in practice due to their high computational cost. In this article, we propose a new efficient method for solving the RTE forward problem with multiple light sources in an all-at-once manner instead of solving it for each source separately. To this end, we introduce here a novel linear solver called block biconjugate gradient stabilized method (block BiCGStab) that makes full use of the shared information between different right hand sides to accelerate solution convergence. Two parallelized block BiCGStab methods are proposed for additional acceleration under limited threads situation. We evaluate the performance of this algorithm with numerical simulation studies involving the Delta–Eddington approximation to the scattering phase function. The results show that the single threading block RTE solver proposed here reduces computation time by a factor of 1.5–3 as compared to the traditional sequential solution method and the parallel block solver by a factor of 1.5 as compared to the traditional parallel sequential method. This block linear solver is, moreover, independent of discretization schemes and preconditioners used; thus further acceleration and higher accuracy can be expected when combined with other existing discretization schemes or preconditioners. - Highlights: • We solve the multiple-right-hand-side problem in DOT with a block BiCGStab method. • We examine the CPU times of the block solver and the traditional sequential solver. • The block solver is faster than the sequential solver by a factor of 1.5–3.0. • Multi-threading block solvers give additional speedup under limited threads situation.

  17. A Survey of Solver-Related Geometry and Meshing Issues

    Science.gov (United States)

    Masters, James; Daniel, Derick; Gudenkauf, Jared; Hine, David; Sideroff, Chris

    2016-01-01

    There is a concern in the computational fluid dynamics community that mesh generation is a significant bottleneck in the CFD workflow. This is one of several papers that will help set the stage for a moderated panel discussion addressing this issue. Although certain general "rules of thumb" and a priori mesh metrics can be used to ensure that some base level of mesh quality is achieved, inadequate consideration is often given to the type of solver or particular flow regime on which the mesh will be utilized. This paper explores how an analyst may want to think differently about a mesh based on considerations such as if a flow is compressible vs. incompressible or hypersonic vs. subsonic or if the solver is node-centered vs. cell-centered. This paper is a high-level investigation intended to provide general insight into how considering the nature of the solver or flow when performing mesh generation has the potential to increase the accuracy and/or robustness of the solution and drive the mesh generation process to a state where it is no longer a hindrance to the analysis process.

  18. Sparse Frequency Waveform Design for Radar-Embedded Communication

    Directory of Open Access Journals (Sweden)

    Chaoyun Mai

    2016-01-01

    Full Text Available According to the Tag application with function of covert communication, a method for sparse frequency waveform design based on radar-embedded communication is proposed. Firstly, sparse frequency waveforms are designed based on power spectral density fitting and quasi-Newton method. Secondly, the eigenvalue decomposition of the sparse frequency waveform sequence is used to get the dominant space. Finally the communication waveforms are designed through the projection of orthogonal pseudorandom vectors in the vertical subspace. Compared with the linear frequency modulation waveform, the sparse frequency waveform can further improve the bandwidth occupation of communication signals, thus achieving higher communication rate. A certain correlation exists between the reciprocally orthogonal communication signals samples and the sparse frequency waveform, which guarantees the low SER (signal error rate and LPI (low probability of intercept. The simulation results verify the effectiveness of this method.

  19. Grammar-Based Multi-Frontal Solver for One Dimensional Isogeometric Analysis with Multiple Right-Hand-Sides

    KAUST Repository

    Kuźnik, Krzysztof

    2013-06-01

    This paper introduces a grammar-based model for developing a multi-thread multi-frontal parallel direct solver for one- dimensional isogeometric finite element method. The model includes the integration of B-splines for construction of the element local matrices and the multi-frontal solver algorithm. The integration and the solver algorithm are partitioned into basic indivisible tasks, namely the grammar productions, that can be executed squentially. The partial order of execution of the basic tasks is analyzed to provide the scheduling for the execution of the concurrent integration and multi-frontal solver algo- rithm. This graph grammar analysis allows for optimal concurrent execution of all tasks. The model has been implemented and tested on NVIDIA CUDA GPU, delivering logarithmic execution time for linear, quadratic, cubic and higher order B-splines. Thus, the CUDA implementation delivers the optimal performance predicted by our graph grammar analysis. We utilize the solver for multiple right hand sides related to the solution of non-stationary or inverse problems.

  20. Computational aeroelasticity using a pressure-based solver

    Science.gov (United States)

    Kamakoti, Ramji

    A computational methodology for performing fluid-structure interaction computations for three-dimensional elastic wing geometries is presented. The flow solver used is based on an unsteady Reynolds-Averaged Navier-Stokes (RANS) model. A well validated k-ε turbulence model with wall function treatment for near wall region was used to perform turbulent flow calculations. Relative merits of alternative flow solvers were investigated. The predictor-corrector-based Pressure Implicit Splitting of Operators (PISO) algorithm was found to be computationally economic for unsteady flow computations. Wing structure was modeled using Bernoulli-Euler beam theory. A fully implicit time-marching scheme (using the Newmark integration method) was used to integrate the equations of motion for structure. Bilinear interpolation and linear extrapolation techniques were used to transfer necessary information between fluid and structure solvers. Geometry deformation was accounted for by using a moving boundary module. The moving grid capability was based on a master/slave concept and transfinite interpolation techniques. Since computations were performed on a moving mesh system, the geometric conservation law must be preserved. This is achieved by appropriately evaluating the Jacobian values associated with each cell. Accurate computation of contravariant velocities for unsteady flows using the momentum interpolation method on collocated, curvilinear grids was also addressed. Flutter computations were performed for the AGARD 445.6 wing at subsonic, transonic and supersonic Mach numbers. Unsteady computations were performed at various dynamic pressures to predict the flutter boundary. Results showed favorable agreement of experiment and previous numerical results. The computational methodology exhibited capabilities to predict both qualitative and quantitative features of aeroelasticity.

  1. Massive Asynchronous Parallelization of Sparse Matrix Factorizations

    Energy Technology Data Exchange (ETDEWEB)

    Chow, Edmond [Georgia Inst. of Technology, Atlanta, GA (United States)

    2018-01-08

    Solving sparse problems is at the core of many DOE computational science applications. We focus on the challenge of developing sparse algorithms that can fully exploit the parallelism in extreme-scale computing systems, in particular systems with massive numbers of cores per node. Our approach is to express a sparse matrix factorization as a large number of bilinear constraint equations, and then solving these equations via an asynchronous iterative method. The unknowns in these equations are the matrix entries of the factorization that is desired.

  2. Benchmarking optimization solvers for structural topology optimization

    DEFF Research Database (Denmark)

    Rojas Labanda, Susana; Stolpe, Mathias

    2015-01-01

    solvers in IPOPT and FMINCON, and the sequential quadratic programming method in SNOPT, are benchmarked on the library using performance profiles. Whenever possible the methods are applied to both the nested and the Simultaneous Analysis and Design (SAND) formulations of the problem. The performance...

  3. The SX Solver: A New Computer Program for Analyzing Solvent-Extraction Equilibria

    International Nuclear Information System (INIS)

    McNamara, B.K.; Rapko, B.M.; Lumetta, G.J.

    1999-01-01

    A new computer program, the SX Solver, has been developed to analyze solvent-extraction equilibria. The program operates out of Microsoft Excel and uses the built-in ''Solver'' function to minimize the sum of the square of the residuals between measured and calculated distribution coefficients. The extraction of nitric acid by tributylphosphate has been modeled to illustrate the program's use

  4. A fast direct solver for boundary value problems on locally perturbed geometries

    Science.gov (United States)

    Zhang, Yabin; Gillman, Adrianna

    2018-03-01

    Many applications including optimal design and adaptive discretization techniques involve solving several boundary value problems on geometries that are local perturbations of an original geometry. This manuscript presents a fast direct solver for boundary value problems that are recast as boundary integral equations. The idea is to write the discretized boundary integral equation on a new geometry as a low rank update to the discretized problem on the original geometry. Using the Sherman-Morrison formula, the inverse can be expressed in terms of the inverse of the original system applied to the low rank factors and the right hand side. Numerical results illustrate for problems where perturbation is localized the fast direct solver is three times faster than building a new solver from scratch.

  5. Efficient Computation of Sparse Matrix Functions for Large-Scale Electronic Structure Calculations: The CheSS Library.

    Science.gov (United States)

    Mohr, Stephan; Dawson, William; Wagner, Michael; Caliste, Damien; Nakajima, Takahito; Genovese, Luigi

    2017-10-10

    We present CheSS, the "Chebyshev Sparse Solvers" library, which has been designed to solve typical problems arising in large-scale electronic structure calculations using localized basis sets. The library is based on a flexible and efficient expansion in terms of Chebyshev polynomials and presently features the calculation of the density matrix, the calculation of matrix powers for arbitrary powers, and the extraction of eigenvalues in a selected interval. CheSS is able to exploit the sparsity of the matrices and scales linearly with respect to the number of nonzero entries, making it well-suited for large-scale calculations. The approach is particularly adapted for setups leading to small spectral widths of the involved matrices and outperforms alternative methods in this regime. By coupling CheSS to the DFT code BigDFT, we show that such a favorable setup is indeed possible in practice. In addition, the approach based on Chebyshev polynomials can be massively parallelized, and CheSS exhibits excellent scaling up to thousands of cores even for relatively small matrix sizes.

  6. Balancing Energy and Performance in Dense Linear System Solvers for Hybrid ARM+GPU platforms

    Directory of Open Access Journals (Sweden)

    Juan P. Silva

    2016-04-01

    Full Text Available The high performance computing community has traditionally focused uniquely on the reduction of execution time, though in the last years, the optimization of energy consumption has become a main issue. A reduction of energy usage without a degradation of performance requires the adoption of energy-efficient hardware platforms accompanied by the development of energy-aware algorithms and computational kernels. The solution of linear systems is a key operation for many scientific and engineering problems. Its relevance has motivated an important amount of work, and consequently, it is possible to find high performance solvers for a wide variety of hardware platforms. In this work, we aim to develop a high performance and energy-efficient linear system solver. In particular, we develop two solvers for a low-power CPU-GPU platform, the NVIDIA Jetson TK1. These solvers implement the Gauss-Huard algorithm yielding an efficient usage of the target hardware as well as an efficient memory access. The experimental evaluation shows that the novel proposal reports important savings in both time and energy-consumption when compared with the state-of-the-art solvers of the platform.

  7. Numerical solver for compressible two-fluid flow

    NARCIS (Netherlands)

    J. Naber (Jorick)

    2005-01-01

    textabstractThis report treats the development of a numerical solver for the simulation of flows of two non-mixing fluids described by the two-dimensional Euler equations. A level-set equation in conservative form describes the interface. After each time step the deformed level-set function is

  8. A Parallel Multigrid Solver for Viscous Flows on Anisotropic Structured Grids

    Science.gov (United States)

    Prieto, Manuel; Montero, Ruben S.; Llorente, Ignacio M.; Bushnell, Dennis M. (Technical Monitor)

    2001-01-01

    This paper presents an efficient parallel multigrid solver for speeding up the computation of a 3-D model that treats the flow of a viscous fluid over a flat plate. The main interest of this simulation lies in exhibiting some basic difficulties that prevent optimal multigrid efficiencies from being achieved. As the computing platform, we have used Coral, a Beowulf-class system based on Intel Pentium processors and equipped with GigaNet cLAN and switched Fast Ethernet networks. Our study not only examines the scalability of the solver but also includes a performance evaluation of Coral where the investigated solver has been used to compare several of its design choices, namely, the interconnection network (GigaNet versus switched Fast-Ethernet) and the node configuration (dual nodes versus single nodes). As a reference, the performance results have been compared with those obtained with the NAS-MG benchmark.

  9. Status and Perspective of the Hydraulic Solver development for SPACE code

    International Nuclear Information System (INIS)

    Lee, S. Y.; Oh, M. T.; Park, J. C.; Ahn, S. J.; Park, C. E.; Lee, E. J.; Na, Y. W.

    2008-01-01

    KOPEC has been developing a hydraulic solver for SPACE code. The governing equations for the solver can be obtained through several steps of modeling and approximations from the basic material transport principles. Once the governing equations are fixed, a proper discretization procedure should be followed to get the difference equations that can be solved by well established matrix solvers. Of course, the mesh generation and handling procedures are necessary for the discretization process. At present, the preliminary test version has been constructed and being tested. The selection of the compiler language was debated openly. C++ was chosen as a basis compiler language. But other language such as FORTRAN can be used as it is necessary. The steps mentioned above are explained in the following sections. Test results are presented by other companion papers in this meeting. Future activities will be described in the conclusion section

  10. A comparison of viscous-plastic sea ice solvers with and without replacement pressure

    Science.gov (United States)

    Kimmritz, Madlen; Losch, Martin; Danilov, Sergey

    2017-07-01

    Recent developments of the explicit elastic-viscous-plastic (EVP) solvers call for a new comparison with implicit solvers for the equations of viscous-plastic sea ice dynamics. In Arctic sea ice simulations, the modified and the adaptive EVP solvers, and the implicit Jacobian-free Newton-Krylov (JFNK) solver are compared against each other. The adaptive EVP method shows convergence rates that are generally similar or even better than those of the modified EVP method, but the convergence of the EVP methods is found to depend dramatically on the use of the replacement pressure (RP). Apparently, using the RP can affect the pseudo-elastic waves in the EVP methods by introducing extra non-physical oscillations so that, in the extreme case, convergence to the VP solution can be lost altogether. The JFNK solver also suffers from higher failure rates with RP implying that with RP the momentum equations are stiffer and more difficult to solve. For practical purposes, both EVP methods can be used efficiently with an unexpectedly low number of sub-cycling steps without compromising the solutions. The differences between the RP solutions and the NoRP solutions (when the RP is not being used) can be reduced with lower thresholds of viscous regularization at the cost of increasing stiffness of the equations, and hence the computational costs of solving them.

  11. Motivations, Challenges, and Opportunities of Successful Solvers on an Innovation Intermediary Platform

    DEFF Research Database (Denmark)

    Hossain, Mokter

    2018-01-01

    . The main motivational factors of successful solvers engaged in problem solving are money, learning, fun, sense of achievement, passion, and networking. Major challenges solvers face include unclear or insufficient problem description, lack of option for communication, language barrier, time zone...... other experts, the ability to work in a diverse environment, options of work after retirement and from distant locations, and a new source of income....

  12. A coupled systems code-CFD MHD solver for fusion blanket design

    Energy Technology Data Exchange (ETDEWEB)

    Wolfendale, Michael J., E-mail: m.wolfendale11@imperial.ac.uk; Bluck, Michael J.

    2015-10-15

    Highlights: • A coupled systems code-CFD MHD solver for fusion blanket applications is proposed. • Development of a thermal hydraulic systems code with MHD capabilities is detailed. • A code coupling methodology based on the use of TCP socket communications is detailed. • Validation cases are briefly discussed for the systems code and coupled solver. - Abstract: The network of flow channels in a fusion blanket can be modelled using a 1D thermal hydraulic systems code. For more complex components such as junctions and manifolds, the simplifications employed in such codes can become invalid, requiring more detailed analyses. For magnetic confinement reactor blanket designs using a conducting fluid as coolant/breeder, the difficulties in flow modelling are particularly severe due to MHD effects. Blanket analysis is an ideal candidate for the application of a code coupling methodology, with a thermal hydraulic systems code modelling portions of the blanket amenable to 1D analysis, and CFD providing detail where necessary. A systems code, MHD-SYS, has been developed and validated against existing analyses. The code shows good agreement in the prediction of MHD pressure loss and the temperature profile in the fluid and wall regions of the blanket breeding zone. MHD-SYS has been coupled to an MHD solver developed in OpenFOAM and the coupled solver validated for test geometries in preparation for modelling blanket systems.

  13. Storage of sparse files using parallel log-structured file system

    Science.gov (United States)

    Bent, John M.; Faibish, Sorin; Grider, Gary; Torres, Aaron

    2017-11-07

    A sparse file is stored without holes by storing a data portion of the sparse file using a parallel log-structured file system; and generating an index entry for the data portion, the index entry comprising a logical offset, physical offset and length of the data portion. The holes can be restored to the sparse file upon a reading of the sparse file. The data portion can be stored at a logical end of the sparse file. Additional storage efficiency can optionally be achieved by (i) detecting a write pattern for a plurality of the data portions and generating a single patterned index entry for the plurality of the patterned data portions; and/or (ii) storing the patterned index entries for a plurality of the sparse files in a single directory, wherein each entry in the single directory comprises an identifier of a corresponding sparse file.

  14. Matlab Geochemistry: An open source geochemistry solver based on MRST

    Science.gov (United States)

    McNeece, C. J.; Raynaud, X.; Nilsen, H.; Hesse, M. A.

    2017-12-01

    The study of geological systems often requires the solution of complex geochemical relations. To address this need we present an open source geochemical solver based on the Matlab Reservoir Simulation Toolbox (MRST) developed by SINTEF. The implementation supports non-isothermal multicomponent aqueous complexation, surface complexation, ion exchange, and dissolution/precipitation reactions. The suite of tools available in MRST allows for rapid model development, in particular the incorporation of geochemical calculations into transport simulations of multiple phases, complex domain geometry and geomechanics. Different numerical schemes and additional physics can be easily incorporated into the existing tools through the object-oriented framework employed by MRST. The solver leverages the automatic differentiation tools available in MRST to solve arbitrarily complex geochemical systems with any choice of species or element concentration as input. Four mathematical approaches enable the solver to be quite robust: 1) the choice of chemical elements as the basis components makes all entries in the composition matrix positive thus preserving convexity, 2) a log variable transformation is used which transfers the nonlinearity to the convex composition matrix, 3) a priori bounds on variables are calculated from the structure of the problem, constraining Netwon's path and 4) an initial guess is calculated implicitly by sequentially adding model complexity. As a benchmark we compare the model to experimental and semi-analytic solutions of the coupled salinity-acidity transport system. Together with the reservoir simulation capabilities of MRST the solver offers a promising tool for geochemical simulations in reservoir domains for applications in a diversity of fields from enhanced oil recovery to radionuclide storage.

  15. Sparse reconstruction using distribution agnostic bayesian matching pursuit

    KAUST Repository

    Masood, Mudassir

    2013-11-01

    A fast matching pursuit method using a Bayesian approach is introduced for sparse signal recovery. This method performs Bayesian estimates of sparse signals even when the signal prior is non-Gaussian or unknown. It is agnostic on signal statistics and utilizes a priori statistics of additive noise and the sparsity rate of the signal, which are shown to be easily estimated from data if not available. The method utilizes a greedy approach and order-recursive updates of its metrics to find the most dominant sparse supports to determine the approximate minimum mean-square error (MMSE) estimate of the sparse signal. Simulation results demonstrate the power and robustness of our proposed estimator. © 2013 IEEE.

  16. Image understanding using sparse representations

    CERN Document Server

    Thiagarajan, Jayaraman J; Turaga, Pavan; Spanias, Andreas

    2014-01-01

    Image understanding has been playing an increasingly crucial role in several inverse problems and computer vision. Sparse models form an important component in image understanding, since they emulate the activity of neural receptors in the primary visual cortex of the human brain. Sparse methods have been utilized in several learning problems because of their ability to provide parsimonious, interpretable, and efficient models. Exploiting the sparsity of natural signals has led to advances in several application areas including image compression, denoising, inpainting, compressed sensing, blin

  17. Sparse regularization for force identification using dictionaries

    Science.gov (United States)

    Qiao, Baijie; Zhang, Xingwu; Wang, Chenxi; Zhang, Hang; Chen, Xuefeng

    2016-04-01

    The classical function expansion method based on minimizing l2-norm of the response residual employs various basis functions to represent the unknown force. Its difficulty lies in determining the optimum number of basis functions. Considering the sparsity of force in the time domain or in other basis space, we develop a general sparse regularization method based on minimizing l1-norm of the coefficient vector of basis functions. The number of basis functions is adaptively determined by minimizing the number of nonzero components in the coefficient vector during the sparse regularization process. First, according to the profile of the unknown force, the dictionary composed of basis functions is determined. Second, a sparsity convex optimization model for force identification is constructed. Third, given the transfer function and the operational response, Sparse reconstruction by separable approximation (SpaRSA) is developed to solve the sparse regularization problem of force identification. Finally, experiments including identification of impact and harmonic forces are conducted on a cantilever thin plate structure to illustrate the effectiveness and applicability of SpaRSA. Besides the Dirac dictionary, other three sparse dictionaries including Db6 wavelets, Sym4 wavelets and cubic B-spline functions can also accurately identify both the single and double impact forces from highly noisy responses in a sparse representation frame. The discrete cosine functions can also successfully reconstruct the harmonic forces including the sinusoidal, square and triangular forces. Conversely, the traditional Tikhonov regularization method with the L-curve criterion fails to identify both the impact and harmonic forces in these cases.

  18. Graph Grammar-Based Multi-Frontal Parallel Direct Solver for Two-Dimensional Isogeometric Analysis

    KAUST Repository

    Kuźnik, Krzysztof

    2012-06-02

    This paper introduces the graph grammar based model for developing multi-thread multi-frontal parallel direct solver for two dimensional isogeometric finite element method. Execution of the solver algorithm has been expressed as the sequence of graph grammar productions. At the beginning productions construct the elimination tree with leaves corresponding to finite elements. Following sequence of graph grammar productions generates element frontal matri-ces at leaf nodes, merges matrices at parent nodes and eliminates rows corresponding to fully assembled degrees of freedom. Finally, there are graph grammar productions responsible for root problem solution and recursive backward substitutions. Expressing the solver algorithm by graph grammar productions allows us to explore the concurrency of the algorithm. The graph grammar productions are grouped into sets of independent tasks that can be executed concurrently. The resulting concurrent multi-frontal solver algorithm is implemented and tested on NVIDIA GPU, providing O(NlogN) execution time complexity where N is the number of degrees of freedom. We have confirmed this complexity by solving up to 1 million of degrees of freedom with 448 cores GPU.

  19. Sparse inpainting and isotropy

    Energy Technology Data Exchange (ETDEWEB)

    Feeney, Stephen M.; McEwen, Jason D.; Peiris, Hiranya V. [Department of Physics and Astronomy, University College London, Gower Street, London, WC1E 6BT (United Kingdom); Marinucci, Domenico; Cammarota, Valentina [Department of Mathematics, University of Rome Tor Vergata, via della Ricerca Scientifica 1, Roma, 00133 (Italy); Wandelt, Benjamin D., E-mail: s.feeney@imperial.ac.uk, E-mail: marinucc@axp.mat.uniroma2.it, E-mail: jason.mcewen@ucl.ac.uk, E-mail: h.peiris@ucl.ac.uk, E-mail: wandelt@iap.fr, E-mail: cammarot@axp.mat.uniroma2.it [Kavli Institute for Theoretical Physics, Kohn Hall, University of California, 552 University Road, Santa Barbara, CA, 93106 (United States)

    2014-01-01

    Sparse inpainting techniques are gaining in popularity as a tool for cosmological data analysis, in particular for handling data which present masked regions and missing observations. We investigate here the relationship between sparse inpainting techniques using the spherical harmonic basis as a dictionary and the isotropy properties of cosmological maps, as for instance those arising from cosmic microwave background (CMB) experiments. In particular, we investigate the possibility that inpainted maps may exhibit anisotropies in the behaviour of higher-order angular polyspectra. We provide analytic computations and simulations of inpainted maps for a Gaussian isotropic model of CMB data, suggesting that the resulting angular trispectrum may exhibit small but non-negligible deviations from isotropy.

  20. Design and Optimization of OpenFOAM-based CFD Applications for Modern Hybrid and Heterogeneous HPC Platforms

    KAUST Repository

    AlOnazi, Amani A.

    2014-01-01

    has been designed and implemented to solve the sparse linear algebraic kernel that derives from two CFD solver: icoFoam, which is an incompressible flow solver, and laplacianFoam, which solves the Poisson equation, for e.g., thermal dif- fusion. A load

  1. Scalable parallel prefix solvers for discrete ordinates transport

    International Nuclear Information System (INIS)

    Pautz, S.; Pandya, T.; Adams, M.

    2009-01-01

    The well-known 'sweep' algorithm for inverting the streaming-plus-collision term in first-order deterministic radiation transport calculations has some desirable numerical properties. However, it suffers from parallel scaling issues caused by a lack of concurrency. The maximum degree of concurrency, and thus the maximum parallelism, grows more slowly than the problem size for sweeps-based solvers. We investigate a new class of parallel algorithms that involves recasting the streaming-plus-collision problem in prefix form and solving via cyclic reduction. This method, although computationally more expensive at low levels of parallelism than the sweep algorithm, offers better theoretical scalability properties. Previous work has demonstrated this approach for one-dimensional calculations; we show how to extend it to multidimensional calculations. Notably, for multiple dimensions it appears that this approach is limited to long-characteristics discretizations; other discretizations cannot be cast in prefix form. We implement two variants of the algorithm within the radlib/SCEPTRE transport code library at Sandia National Laboratories and show results on two different massively parallel systems. Both the 'forward' and 'symmetric' solvers behave similarly, scaling well to larger degrees of parallelism then sweeps-based solvers. We do observe some issues at the highest levels of parallelism (relative to the system size) and discuss possible causes. We conclude that this approach shows good potential for future parallel systems, but the parallel scalability will depend heavily on the architecture of the communication networks of these systems. (authors)

  2. Fast Multipole-Based Elliptic PDE Solver and Preconditioner

    KAUST Repository

    Ibeid, Huda

    2016-01-01

    extrapolated scalability. Fast multipole methods (FMM) were originally developed for accelerating N-body problems for particle-based methods in astrophysics and molecular dynamics. FMM is more than an N-body solver, however. Recent efforts to view the FMM

  3. An Investigation of the Performance of the Colored Gauss-Seidel Solver on CPU and GPU

    International Nuclear Information System (INIS)

    Yoon, Jong Seon; Choi, Hyoung Gwon; Jeon, Byoung Jin

    2017-01-01

    The performance of the colored Gauss–Seidel solver on CPU and GPU was investigated for the two- and three-dimensional heat conduction problems by using different mesh sizes. The heat conduction equation was discretized by the finite difference method and finite element method. The CPU yielded good performance for small problems but deteriorated when the total memory required for computing was larger than the cache memory for large problems. In contrast, the GPU performed better as the mesh size increased because of the latency hiding technique. Further, GPU computation by the colored Gauss–Siedel solver was approximately 7 times that by the single CPU. Furthermore, the colored Gauss–Seidel solver was found to be approximately twice that of the Jacobi solver when parallel computing was conducted on the GPU.

  4. An Investigation of the Performance of the Colored Gauss-Seidel Solver on CPU and GPU

    Energy Technology Data Exchange (ETDEWEB)

    Yoon, Jong Seon; Choi, Hyoung Gwon [Seoul Nat’l Univ. of Science and Technology, Seoul (Korea, Republic of); Jeon, Byoung Jin [Yonsei Univ., Seoul (Korea, Republic of)

    2017-02-15

    The performance of the colored Gauss–Seidel solver on CPU and GPU was investigated for the two- and three-dimensional heat conduction problems by using different mesh sizes. The heat conduction equation was discretized by the finite difference method and finite element method. The CPU yielded good performance for small problems but deteriorated when the total memory required for computing was larger than the cache memory for large problems. In contrast, the GPU performed better as the mesh size increased because of the latency hiding technique. Further, GPU computation by the colored Gauss–Siedel solver was approximately 7 times that by the single CPU. Furthermore, the colored Gauss–Seidel solver was found to be approximately twice that of the Jacobi solver when parallel computing was conducted on the GPU.

  5. The Openpipeflow Navier–Stokes solver

    Directory of Open Access Journals (Sweden)

    Ashley P. Willis

    2017-01-01

    Full Text Available Pipelines are used in a huge range of industrial processes involving fluids, and the ability to accurately predict properties of the flow through a pipe is of fundamental engineering importance. Armed with parallel MPI, Arnoldi and Newton–Krylov solvers, the Openpipeflow code can be used in a range of settings, from large-scale simulation of highly turbulent flow, to the detailed analysis of nonlinear invariant solutions (equilibria and periodic orbits and their influence on the dynamics of the flow.

  6. Object tracking by occlusion detection via structured sparse learning

    KAUST Repository

    Zhang, Tianzhu

    2013-06-01

    Sparse representation based methods have recently drawn much attention in visual tracking due to good performance against illumination variation and occlusion. They assume the errors caused by image variations can be modeled as pixel-wise sparse. However, in many practical scenarios these errors are not truly pixel-wise sparse but rather sparsely distributed in a structured way. In fact, pixels in error constitute contiguous regions within the object\\'s track. This is the case when significant occlusion occurs. To accommodate for non-sparse occlusion in a given frame, we assume that occlusion detected in previous frames can be propagated to the current one. This propagated information determines which pixels will contribute to the sparse representation of the current track. In other words, pixels that were detected as part of an occlusion in the previous frame will be removed from the target representation process. As such, this paper proposes a novel tracking algorithm that models and detects occlusion through structured sparse learning. We test our tracker on challenging benchmark sequences, such as sports videos, which involve heavy occlusion, drastic illumination changes, and large pose variations. Experimental results show that our tracker consistently outperforms the state-of-the-art. © 2013 IEEE.

  7. Advanced Algebraic Multigrid Solvers for Subsurface Flow Simulation

    KAUST Repository

    Chen, Meng-Huo; Sun, Shuyu; Salama, Amgad

    2015-01-01

    and issues will be addressed and the corresponding remedies will be studied. As the multigrid methods are used as the linear solver, the simulator can be parallelized (although not trivial) and the high-resolution simulation become feasible, the ultimately

  8. Sparse Vector Distributions and Recovery from Compressed Sensing

    DEFF Research Database (Denmark)

    Sturm, Bob L.

    It is well known that the performance of sparse vector recovery algorithms from compressive measurements can depend on the distribution underlying the non-zero elements of a sparse vector. However, the extent of these effects has yet to be explored, and formally presented. In this paper, I...... empirically investigate this dependence for seven distributions and fifteen recovery algorithms. The two morals of this work are: 1) any judgement of the recovery performance of one algorithm over that of another must be prefaced by the conditions for which this is observed to be true, including sparse vector...... distributions, and the criterion for exact recovery; and 2) a recovery algorithm must be selected carefully based on what distribution one expects to underlie the sensed sparse signal....

  9. Exhaustive Search for Sparse Variable Selection in Linear Regression

    Science.gov (United States)

    Igarashi, Yasuhiko; Takenaka, Hikaru; Nakanishi-Ohno, Yoshinori; Uemura, Makoto; Ikeda, Shiro; Okada, Masato

    2018-04-01

    We propose a K-sparse exhaustive search (ES-K) method and a K-sparse approximate exhaustive search method (AES-K) for selecting variables in linear regression. With these methods, K-sparse combinations of variables are tested exhaustively assuming that the optimal combination of explanatory variables is K-sparse. By collecting the results of exhaustively computing ES-K, various approximate methods for selecting sparse variables can be summarized as density of states. With this density of states, we can compare different methods for selecting sparse variables such as relaxation and sampling. For large problems where the combinatorial explosion of explanatory variables is crucial, the AES-K method enables density of states to be effectively reconstructed by using the replica-exchange Monte Carlo method and the multiple histogram method. Applying the ES-K and AES-K methods to type Ia supernova data, we confirmed the conventional understanding in astronomy when an appropriate K is given beforehand. However, we found the difficulty to determine K from the data. Using virtual measurement and analysis, we argue that this is caused by data shortage.

  10. Time Domain Surface Integral Equation Solvers for Quantum Corrected Electromagnetic Analysis of Plasmonic Nanostructures

    KAUST Repository

    Uysal, Ismail Enes

    2016-10-01

    Plasmonic structures are utilized in many applications ranging from bio-medicine to solar energy generation and transfer. Numerical schemes capable of solving equations of classical electrodynamics have been the method of choice for characterizing scattering properties of such structures. However, as dimensions of these plasmonic structures reduce to nanometer scale, quantum mechanical effects start to appear. These effects cannot be accurately modeled by available classical numerical methods. One of these quantum effects is the tunneling, which is observed when two structures are located within a sub-nanometer distance of each other. At these small distances electrons “jump" from one structure to another and introduce a path for electric current to flow. Classical equations of electrodynamics and the schemes used for solving them do not account for this additional current path. This limitation can be lifted by introducing an auxiliary tunnel with material properties obtained using quantum models and applying a classical solver to the structures connected by this auxiliary tunnel. Early work on this topic focused on quantum models that are generated using a simple one-dimensional wave function to find the tunneling probability and assume a simple Drude model for the permittivity of the tunnel. These tunnel models are then used together with a classical frequency domain solver. In this thesis, a time domain surface integral equation solver for quantum corrected analysis of transient plasmonic interactions is proposed. This solver has several advantages: (i) As opposed to frequency domain solvers, it provides results at a broad band of frequencies with a single simulation. (ii) As opposed to differential equation solvers, it only discretizes surfaces (reducing number of unknowns), enforces the radiation condition implicitly (increasing the accuracy), and allows for time step selection independent of spatial discretization (increasing efficiency). The quantum model

  11. A Sparse Approximate Inverse Preconditioner for Nonsymmetric Linear Systems

    Czech Academy of Sciences Publication Activity Database

    Benzi, M.; Tůma, Miroslav

    1998-01-01

    Roč. 19, č. 3 (1998), s. 968-994 ISSN 1064-8275 R&D Projects: GA ČR GA201/93/0067; GA AV ČR IAA230401 Keywords : large sparse systems * interative methods * preconditioning * approximate inverse * sparse linear systems * sparse matrices * incomplete factorizations * conjugate gradient -type methods Subject RIV: BA - General Mathematics Impact factor: 1.378, year: 1998

  12. Investigation on the Use of a Multiphase Eulerian CFD solver to simulate breaking waves

    DEFF Research Database (Denmark)

    Tomaselli, Pietro D.; Christensen, Erik Damgaard

    2015-01-01

    investigation on a CFD model capable of handling this problem. The model is based on a solver, available in the open-source CFD toolkit OpenFOAM, which combines the Eulerian multi-fluid approach for dispersed flows with a numerical interface sharpening method. The solver, enhanced with additional formulations...

  13. Development and verification of the neutron diffusion solver for the GeN-Foam multi-physics platform

    International Nuclear Information System (INIS)

    Fiorina, Carlo; Kerkar, Nordine; Mikityuk, Konstantin; Rubiolo, Pablo; Pautz, Andreas

    2016-01-01

    Highlights: • Development and verification of a neutron diffusion solver based on OpenFOAM. • Integration in the GeN-Foam multi-physics platform. • Implementation and verification of acceleration techniques. • Implementation of isotropic discontinuity factors. • Automatic adjustment of discontinuity factors. - Abstract: The Laboratory for Reactor Physics and Systems Behaviour at the PSI and the EPFL has been developing in recent years a new code system for reactor analysis based on OpenFOAM®. The objective is to supplement available legacy codes with a modern tool featuring state-of-the-art characteristics in terms of scalability, programming approach and flexibility. As part of this project, a new solver has been developed for the eigenvalue and transient solution of multi-group diffusion equations. Several features distinguish the developed solver from other available codes, in particular: object oriented programming to ease code modification and maintenance; modern parallel computing capabilities; use of general unstructured meshes; possibility of mesh deformation; cell-wise parametrization of cross-sections; and arbitrary energy group structure. In addition, the solver is integrated into the GeN-Foam multi-physics solver. The general features of the solver and its integration with GeN-Foam have already been presented in previous publications. The present paper describes the diffusion solver in more details and provides an overview of new features recently implemented, including the use of acceleration techniques and discontinuity factors. In addition, a code verification is performed through a comparison with Monte Carlo results for both a thermal and a fast reactor system.

  14. Final Report for UC Berkeley Terascale Optimal PDE Solvers TOPS DOE Award Number DE-FC02-01ER25478 9/15/2001-9/14/2006

    International Nuclear Information System (INIS)

    James Demmel

    2007-01-01

    In many areas of science, physical experimentation may be too dangerous, too expensive or even impossible. Instead, large-scale simulations, validated by comparison with related experiments in well-understood laboratory contexts, are used by scientists to gain insight and confirmation of existing theories in such areas, without benefit of full experimental verification. The goal of the TOPS ISIC was to develop and implement algorithms and support scientific investigations performed by DOE-sponsored researchers. A major component of this effort is to provide software for large scale parallel computers capable of efficiently solving the enormous systems of equations arising from the nonlinear PDEs underlying these simulations. Several TOPS supported packages where designed in part (ScaLAPACK) or in whole (SuperLU) at Berkeley, and are widely used beyond SciDAC and DOE. Beyond continuing to develop these codes, our main effort focused on automatic performance tuning of the sparse matrix kernels (eg sparse-matrix-vector-multiply, or SpMV) at the core of many TOPS iterative solvers. Based on the observation that the fastest implementation of SpMV (and other kernels) can depend dramatically both on the computer and the matrix (the latter of which is not known until run-time), we developed and released a system called OSKI (Optimized Sparse Kernel Interface) that will automatically produce optimized version of SpMV (and other kernels), hiding complicated implementation details from the user. OSKI led to a 2x speedup in SpMV in a DOE accelerator design code, a 2x speedup in a commercial lithography simulation, and has been downloaded over 500 times. In addition to a stand-alone version, OSKI was also integrated into the TOPS-supported PETSc system

  15. Fast Multipole-Based Elliptic PDE Solver and Preconditioner

    KAUST Repository

    Ibeid, Huda

    2016-12-07

    Exascale systems are predicted to have approximately one billion cores, assuming Gigahertz cores. Limitations on affordable network topologies for distributed memory systems of such massive scale bring new challenges to the currently dominant parallel programing model. Currently, there are many efforts to evaluate the hardware and software bottlenecks of exascale designs. It is therefore of interest to model application performance and to understand what changes need to be made to ensure extrapolated scalability. Fast multipole methods (FMM) were originally developed for accelerating N-body problems for particle-based methods in astrophysics and molecular dynamics. FMM is more than an N-body solver, however. Recent efforts to view the FMM as an elliptic PDE solver have opened the possibility to use it as a preconditioner for even a broader range of applications. In this thesis, we (i) discuss the challenges for FMM on current parallel computers and future exascale architectures, with a focus on inter-node communication, and develop a performance model that considers the communication patterns of the FMM for spatially quasi-uniform distributions, (ii) employ this performance model to guide performance and scaling improvement of FMM for all-atom molecular dynamics simulations of uniformly distributed particles, and (iii) demonstrate that, beyond its traditional use as a solver in problems for which explicit free-space kernel representations are available, the FMM has applicability as a preconditioner in finite domain elliptic boundary value problems, by equipping it with boundary integral capability for satisfying conditions at finite boundaries and by wrapping it in a Krylov method for extensibility to more general operators. Compared with multilevel methods, FMM is capable of comparable algebraic convergence rates down to the truncation error of the discretized PDE, and it has superior multicore and distributed memory scalability properties on commodity

  16. High-Order Calderón Preconditioned Time Domain Integral Equation Solvers

    KAUST Repository

    Valdes, Felipe; Ghaffari-Miab, Mohsen; Andriulli, Francesco P.; Cools, Kristof; Michielssen,

    2013-01-01

    Two high-order accurate Calderón preconditioned time domain electric field integral equation (TDEFIE) solvers are presented. In contrast to existing Calderón preconditioned time domain solvers, the proposed preconditioner allows for high-order surface representations and current expansions by using a novel set of fully-localized high-order div-and quasi curl-conforming (DQCC) basis functions. Numerical results demonstrate that the linear systems of equations obtained using the proposed basis functions converge rapidly, regardless of the mesh density and of the order of the current expansion. © 1963-2012 IEEE.

  17. High-Order Calderón Preconditioned Time Domain Integral Equation Solvers

    KAUST Repository

    Valdes, Felipe

    2013-05-01

    Two high-order accurate Calderón preconditioned time domain electric field integral equation (TDEFIE) solvers are presented. In contrast to existing Calderón preconditioned time domain solvers, the proposed preconditioner allows for high-order surface representations and current expansions by using a novel set of fully-localized high-order div-and quasi curl-conforming (DQCC) basis functions. Numerical results demonstrate that the linear systems of equations obtained using the proposed basis functions converge rapidly, regardless of the mesh density and of the order of the current expansion. © 1963-2012 IEEE.

  18. Tests of a 3D Self Magnetic Field Solver in the Finite Element Gun Code MICHELLE

    CERN Document Server

    Nelson, Eric M

    2005-01-01

    We have recently implemented a prototype 3d self magnetic field solver in the finite-element gun code MICHELLE. The new solver computes the magnetic vector potential on unstructured grids. The solver employs edge basis functions in the curl-curl formulation of the finite-element method. A novel current accumulation algorithm takes advantage of the unstructured grid particle tracker to produce a compatible source vector, for which the singular matrix equation is easily solved by the conjugate gradient method. We will present some test cases demonstrating the capabilities of the prototype 3d self magnetic field solver. One test case is self magnetic field in a square drift tube. Another is a relativistic axisymmetric beam freely expanding in a round pipe.

  19. Structure-based bayesian sparse reconstruction

    KAUST Repository

    Quadeer, Ahmed Abdul

    2012-12-01

    Sparse signal reconstruction algorithms have attracted research attention due to their wide applications in various fields. In this paper, we present a simple Bayesian approach that utilizes the sparsity constraint and a priori statistical information (Gaussian or otherwise) to obtain near optimal estimates. In addition, we make use of the rich structure of the sensing matrix encountered in many signal processing applications to develop a fast sparse recovery algorithm. The computational complexity of the proposed algorithm is very low compared with the widely used convex relaxation methods as well as greedy matching pursuit techniques, especially at high sparsity. © 1991-2012 IEEE.

  20. Galerkin CFD solvers for use in a multi-disciplinary suite for modeling advanced flight vehicles

    Science.gov (United States)

    Moffitt, Nicholas J.

    This work extends existing Galerkin CFD solvers for use in a multi-disciplinary suite. The suite is proposed as a means of modeling advanced flight vehicles, which exhibit strong coupling between aerodynamics, structural dynamics, controls, rigid body motion, propulsion, and heat transfer. Such applications include aeroelastics, aeroacoustics, stability and control, and other highly coupled applications. The suite uses NASA STARS for modeling structural dynamics and heat transfer. Aerodynamics, propulsion, and rigid body dynamics are modeled in one of the five CFD solvers below. Euler2D and Euler3D are Galerkin CFD solvers created at OSU by Cowan (2003). These solvers are capable of modeling compressible inviscid aerodynamics with modal elastics and rigid body motion. This work reorganized these solvers to improve efficiency during editing and at run time. Simple and efficient propulsion models were added, including rocket, turbojet, and scramjet engines. Viscous terms were added to the previous solvers to create NS2D and NS3D. The viscous contributions were demonstrated in the inertial and non-inertial frames. Variable viscosity (Sutherland's equation) and heat transfer boundary conditions were added to both solvers but not verified in this work. Two turbulence models were implemented in NS2D and NS3D: Spalart-Allmarus (SA) model of Deck, et al. (2002) and Menter's SST model (1994). A rotation correction term (Shur, et al., 2000) was added to the production of turbulence. Local time stepping and artificial dissipation were adapted to each model. CFDsol is a Taylor-Galerkin solver with an SA turbulence model. This work improved the time accuracy, far field stability, viscous terms, Sutherland?s equation, and SA model with NS3D as a guideline and added the propulsion models from Euler3D to CFDsol. Simple geometries were demonstrated to utilize current meshing and processing capabilities. Air-breathing hypersonic flight vehicles (AHFVs) represent the ultimate

  1. Greedy vs. L1 convex optimization in sparse coding

    DEFF Research Database (Denmark)

    Ren, Huamin; Pan, Hong; Olsen, Søren Ingvor

    2015-01-01

    Sparse representation has been applied successfully in many image analysis applications, including abnormal event detection, in which a baseline is to learn a dictionary from the training data and detect anomalies from its sparse codes. During this procedure, sparse codes which can be achieved...... solutions. Considering the property of abnormal event detection, i.e., only normal videos are used as training data due to practical reasons, effective codes in classification application may not perform well in abnormality detection. Therefore, we compare the sparse codes and comprehensively evaluate...... their performance from various aspects to better understand their applicability, including computation time, reconstruction error, sparsity, detection...

  2. Thermal Loss of High-Q Antennas in Time Domain vs. Frequency Domain Solver

    DEFF Research Database (Denmark)

    Bahramzy, Pevand; Pedersen, Gert Frølund

    2014-01-01

    High-Q structures pose great challenges to their loss simulations in Time Domain Solvers (TDS). Therefore, in this work the thermal loss of high-Q antennas is calculated both in TDS and Frequency Domain Solver (FDS), which are then compared with each other and with the actual measurements....... The thermal loss calculation in FDS is shown to be more accurate for high-Q antennas....

  3. A 3D approximate maximum likelihood solver for localization of fish implanted with acoustic transmitters

    Science.gov (United States)

    Li, Xinya; Deng, Z. Daniel; Sun, Yannan; Martinez, Jayson J.; Fu, Tao; McMichael, Geoffrey A.; Carlson, Thomas J.

    2014-11-01

    Better understanding of fish behavior is vital for recovery of many endangered species including salmon. The Juvenile Salmon Acoustic Telemetry System (JSATS) was developed to observe the out-migratory behavior of juvenile salmonids tagged by surgical implantation of acoustic micro-transmitters and to estimate the survival when passing through dams on the Snake and Columbia Rivers. A robust three-dimensional solver was needed to accurately and efficiently estimate the time sequence of locations of fish tagged with JSATS acoustic transmitters, to describe in sufficient detail the information needed to assess the function of dam-passage design alternatives. An approximate maximum likelihood solver was developed using measurements of time difference of arrival from all hydrophones in receiving arrays on which a transmission was detected. Field experiments demonstrated that the developed solver performed significantly better in tracking efficiency and accuracy than other solvers described in the literature.

  4. Algorithms for parallel flow solvers on message passing architectures

    Science.gov (United States)

    Vanderwijngaart, Rob F.

    1995-01-01

    The purpose of this project has been to identify and test suitable technologies for implementation of fluid flow solvers -- possibly coupled with structures and heat equation solvers -- on MIMD parallel computers. In the course of this investigation much attention has been paid to efficient domain decomposition strategies for ADI-type algorithms. Multi-partitioning derives its efficiency from the assignment of several blocks of grid points to each processor in the parallel computer. A coarse-grain parallelism is obtained, and a near-perfect load balance results. In uni-partitioning every processor receives responsibility for exactly one block of grid points instead of several. This necessitates fine-grain pipelined program execution in order to obtain a reasonable load balance. Although fine-grain parallelism is less desirable on many systems, especially high-latency networks of workstations, uni-partition methods are still in wide use in production codes for flow problems. Consequently, it remains important to achieve good efficiency with this technique that has essentially been superseded by multi-partitioning for parallel ADI-type algorithms. Another reason for the concentration on improving the performance of pipeline methods is their applicability in other types of flow solver kernels with stronger implied data dependence. Analytical expressions can be derived for the size of the dynamic load imbalance incurred in traditional pipelines. From these it can be determined what is the optimal first-processor retardation that leads to the shortest total completion time for the pipeline process. Theoretical predictions of pipeline performance with and without optimization match experimental observations on the iPSC/860 very well. Analysis of pipeline performance also highlights the effect of uncareful grid partitioning in flow solvers that employ pipeline algorithms. If grid blocks at boundaries are not at least as large in the wall-normal direction as those

  5. A sparse matrix based full-configuration interaction algorithm

    International Nuclear Information System (INIS)

    Rolik, Zoltan; Szabados, Agnes; Surjan, Peter R.

    2008-01-01

    We present an algorithm related to the full-configuration interaction (FCI) method that makes complete use of the sparse nature of the coefficient vector representing the many-electron wave function in a determinantal basis. Main achievements of the presented sparse FCI (SFCI) algorithm are (i) development of an iteration procedure that avoids the storage of FCI size vectors; (ii) development of an efficient algorithm to evaluate the effect of the Hamiltonian when both the initial and the product vectors are sparse. As a result of point (i) large disk operations can be skipped which otherwise may be a bottleneck of the procedure. At point (ii) we progress by adopting the implementation of the linear transformation by Olsen et al. [J. Chem Phys. 89, 2185 (1988)] for the sparse case, getting the algorithm applicable to larger systems and faster at the same time. The error of a SFCI calculation depends only on the dropout thresholds for the sparse vectors, and can be tuned by controlling the amount of system memory passed to the procedure. The algorithm permits to perform FCI calculations on single node workstations for systems previously accessible only by supercomputers

  6. An in-depth study of sparse codes on abnormality detection

    DEFF Research Database (Denmark)

    Ren, Huamin; Pan, Hong; Olsen, Søren Ingvor

    2016-01-01

    Sparse representation has been applied successfully in abnormal event detection, in which the baseline is to learn a dictionary accompanied by sparse codes. While much emphasis is put on discriminative dictionary construction, there are no comparative studies of sparse codes regarding abnormality...... are carried out from various angles to better understand the applicability of sparse codes, including computation time, reconstruction error, sparsity, detection accuracy, and their performance combining various detection methods. The experiment results show that combining OMP codes with maximum coordinate...

  7. Sparse Principal Component Analysis in Medical Shape Modeling

    DEFF Research Database (Denmark)

    Sjöstrand, Karl; Stegmann, Mikkel Bille; Larsen, Rasmus

    2006-01-01

    Principal component analysis (PCA) is a widely used tool in medical image analysis for data reduction, model building, and data understanding and exploration. While PCA is a holistic approach where each new variable is a linear combination of all original variables, sparse PCA (SPCA) aims...... analysis in medicine. Results for three different data sets are given in relation to standard PCA and sparse PCA by simple thresholding of sufficiently small loadings. Focus is on a recent algorithm for computing sparse principal components, but a review of other approaches is supplied as well. The SPCA...

  8. Sparse reconstruction using distribution agnostic bayesian matching pursuit

    KAUST Repository

    Masood, Mudassir; Al-Naffouri, Tareq Y.

    2013-01-01

    A fast matching pursuit method using a Bayesian approach is introduced for sparse signal recovery. This method performs Bayesian estimates of sparse signals even when the signal prior is non-Gaussian or unknown. It is agnostic on signal statistics

  9. Effects of high-frequency damping on iterative convergence of implicit viscous solver

    Science.gov (United States)

    Nishikawa, Hiroaki; Nakashima, Yoshitaka; Watanabe, Norihiko

    2017-11-01

    This paper discusses effects of high-frequency damping on iterative convergence of an implicit defect-correction solver for viscous problems. The study targets a finite-volume discretization with a one parameter family of damped viscous schemes. The parameter α controls high-frequency damping: zero damping with α = 0, and larger damping for larger α (> 0). Convergence rates are predicted for a model diffusion equation by a Fourier analysis over a practical range of α. It is shown that the convergence rate attains its minimum at α = 1 on regular quadrilateral grids, and deteriorates for larger values of α. A similar behavior is observed for regular triangular grids. In both quadrilateral and triangular grids, the solver is predicted to diverge for α smaller than approximately 0.5. Numerical results are shown for the diffusion equation and the Navier-Stokes equations on regular and irregular grids. The study suggests that α = 1 and 4/3 are suitable values for robust and efficient computations, and α = 4 / 3 is recommended for the diffusion equation, which achieves higher-order accuracy on regular quadrilateral grids. Finally, a Jacobian-Free Newton-Krylov solver with the implicit solver (a low-order Jacobian approximately inverted by a multi-color Gauss-Seidel relaxation scheme) used as a variable preconditioner is recommended for practical computations, which provides robust and efficient convergence for a wide range of α.

  10. A multilevel in space and energy solver for multigroup diffusion eigenvalue problems

    Directory of Open Access Journals (Sweden)

    Ben C. Yee

    2017-09-01

    Full Text Available In this paper, we present a new multilevel in space and energy diffusion (MSED method for solving multigroup diffusion eigenvalue problems. The MSED method can be described as a PI scheme with three additional features: (1 a grey (one-group diffusion equation used to efficiently converge the fission source and eigenvalue, (2 a space-dependent Wielandt shift technique used to reduce the number of PIs required, and (3 a multigrid-in-space linear solver for the linear solves required by each PI step. In MSED, the convergence of the solution of the multigroup diffusion eigenvalue problem is accelerated by performing work on lower-order equations with only one group and/or coarser spatial grids. Results from several Fourier analyses and a one-dimensional test code are provided to verify the efficiency of the MSED method and to justify the incorporation of the grey diffusion equation and the multigrid linear solver. These results highlight the potential efficiency of the MSED method as a solver for multidimensional multigroup diffusion eigenvalue problems, and they serve as a proof of principle for future work. Our ultimate goal is to implement the MSED method as an efficient solver for the two-dimensional/three-dimensional coarse mesh finite difference diffusion system in the Michigan parallel characteristics transport code. The work in this paper represents a necessary step towards that goal.

  11. A wavelet-based PWTD algorithm-accelerated time domain surface integral equation solver

    KAUST Repository

    Liu, Yang

    2015-10-26

    © 2015 IEEE. The multilevel plane-wave time-domain (PWTD) algorithm allows for fast and accurate analysis of transient scattering from, and radiation by, electrically large and complex structures. When used in tandem with marching-on-in-time (MOT)-based surface integral equation (SIE) solvers, it reduces the computational and memory costs of transient analysis from equation and equation to equation and equation, respectively, where Nt and Ns denote the number of temporal and spatial unknowns (Ergin et al., IEEE Trans. Antennas Mag., 41, 39-52, 1999). In the past, PWTD-accelerated MOT-SIE solvers have been applied to transient problems involving half million spatial unknowns (Shanker et al., IEEE Trans. Antennas Propag., 51, 628-641, 2003). Recently, a scalable parallel PWTD-accelerated MOT-SIE solver that leverages a hiearchical parallelization strategy has been developed and successfully applied to the transient problems involving ten million spatial unknowns (Liu et. al., in URSI Digest, 2013). We further enhanced the capabilities of this solver by implementing a compression scheme based on local cosine wavelet bases (LCBs) that exploits the sparsity in the temporal dimension (Liu et. al., in URSI Digest, 2014). Specifically, the LCB compression scheme was used to reduce the memory requirement of the PWTD ray data and computational cost of operations in the PWTD translation stage.

  12. Approximate Riemann solver for the two-fluid plasma model

    International Nuclear Information System (INIS)

    Shumlak, U.; Loverich, J.

    2003-01-01

    An algorithm is presented for the simulation of plasma dynamics using the two-fluid plasma model. The two-fluid plasma model is more general than the magnetohydrodynamic (MHD) model often used for plasma dynamic simulations. The two-fluid equations are derived in divergence form and an approximate Riemann solver is developed to compute the fluxes of the electron and ion fluids at the computational cell interfaces and an upwind characteristic-based solver to compute the electromagnetic fields. The source terms that couple the fluids and fields are treated implicitly to relax the stiffness. The algorithm is validated with the coplanar Riemann problem, Langmuir plasma oscillations, and the electromagnetic shock problem that has been simulated with the MHD plasma model. A numerical dispersion relation is also presented that demonstrates agreement with analytical plasma waves

  13. Asynchronous Parallelization of a CFD Solver

    OpenAIRE

    Abdi, Daniel S.; Bitsuamlak, Girma T.

    2015-01-01

    The article of record as published may be found at http://dx.doi.org/10.1155/2015/295393 A Navier-Stokes equations solver is parallelized to run on a cluster of computers using the domain decomposition method. Two approaches of communication and computation are investigated, namely, synchronous and asynchronous methods. Asynchronous communication between subdomains is not commonly used inCFDcodes; however, it has a potential to alleviate scaling bottlenecks incurred due to process...

  14. Refined isogeometric analysis for a preconditioned conjugate gradient solver

    KAUST Repository

    Garcia, Daniel; Pardo, D.; Dalcin, Lisandro; Calo, Victor M.

    2018-01-01

    Starting from a highly continuous Isogeometric Analysis (IGA) discretization, refined Isogeometric Analysis (rIGA) introduces C0 hyperplanes that act as separators for the direct LU factorization solver. As a result, the total computational cost

  15. A fast, high-order solver for the Grad–Shafranov equation

    International Nuclear Information System (INIS)

    Pataki, Andras; Cerfon, Antoine J.; Freidberg, Jeffrey P.; Greengard, Leslie; O’Neil, Michael

    2013-01-01

    We present a new fast solver to calculate fixed-boundary plasma equilibria in toroidally axisymmetric geometries. By combining conformal mapping with Fourier and integral equation methods on the unit disk, we show that high-order accuracy can be achieved for the solution of the equilibrium equation and its first and second derivatives. Smooth arbitrary plasma cross-sections as well as arbitrary pressure and poloidal current profiles are used as initial data for the solver. Equilibria with large Shafranov shifts can be computed without difficulty. Spectral convergence is demonstrated by comparing the numerical solution with a known exact analytic solution. A fusion-relevant example of an equilibrium with a pressure pedestal is also presented

  16. Multidimensional Riemann problem with self-similar internal structure - part III - a multidimensional analogue of the HLLI Riemann solver for conservative hyperbolic systems

    Science.gov (United States)

    Balsara, Dinshaw S.; Nkonga, Boniface

    2017-10-01

    Just as the quality of a one-dimensional approximate Riemann solver is improved by the inclusion of internal sub-structure, the quality of a multidimensional Riemann solver is also similarly improved. Such multidimensional Riemann problems arise when multiple states come together at the vertex of a mesh. The interaction of the resulting one-dimensional Riemann problems gives rise to a strongly-interacting state. We wish to endow this strongly-interacting state with physically-motivated sub-structure. The fastest way of endowing such sub-structure consists of making a multidimensional extension of the HLLI Riemann solver for hyperbolic conservation laws. Presenting such a multidimensional analogue of the HLLI Riemann solver with linear sub-structure for use on structured meshes is the goal of this work. The multidimensional MuSIC Riemann solver documented here is universal in the sense that it can be applied to any hyperbolic conservation law. The multidimensional Riemann solver is made to be consistent with constraints that emerge naturally from the Galerkin projection of the self-similar states within the wave model. When the full eigenstructure in both directions is used in the present Riemann solver, it becomes a complete Riemann solver in a multidimensional sense. I.e., all the intermediate waves are represented in the multidimensional wave model. The work also presents, for the very first time, an important analysis of the dissipation characteristics of multidimensional Riemann solvers. The present Riemann solver results in the most efficient implementation of a multidimensional Riemann solver with sub-structure. Because it preserves stationary linearly degenerate waves, it might also help with well-balancing. Implementation-related details are presented in pointwise fashion for the one-dimensional HLLI Riemann solver as well as the multidimensional MuSIC Riemann solver.

  17. Parallel transposition of sparse data structures

    DEFF Research Database (Denmark)

    Wang, Hao; Liu, Weifeng; Hou, Kaixi

    2016-01-01

    Many applications in computational sciences and social sciences exploit sparsity and connectivity of acquired data. Even though many parallel sparse primitives such as sparse matrix-vector (SpMV) multiplication have been extensively studied, some other important building blocks, e.g., parallel tr...... transposition in the latest vendor-supplied library on an Intel multicore CPU platform, and the MergeTrans approach achieves on average of 3.4-fold (up to 11.7-fold) speedup on an Intel Xeon Phi many-core processor....

  18. A new solver for granular avalanche simulation: Indoor experiment verification and field scale case study

    Science.gov (United States)

    Wang, XiaoLiang; Li, JiaChun

    2017-12-01

    A new solver based on the high-resolution scheme with novel treatments of source terms and interface capture for the Savage-Hutter model is developed to simulate granular avalanche flows. The capability to simulate flow spread and deposit processes is verified through indoor experiments of a two-dimensional granular avalanche. Parameter studies show that reduction in bed friction enhances runout efficiency, and that lower earth pressure restraints enlarge the deposit spread. The April 9, 2000, Yigong avalanche in Tibet, China, is simulated as a case study by this new solver. The predicted results, including evolution process, deposit spread, and hazard impacts, generally agree with site observations. It is concluded that the new solver for the Savage-Hutter equation provides a comprehensive software platform for granular avalanche simulation at both experimental and field scales. In particular, the solver can be a valuable tool for providing necessary information for hazard forecasts, disaster mitigation, and countermeasure decisions in mountainous areas.

  19. NONLINEAR MULTIGRID SOLVER EXPLOITING AMGe COARSE SPACES WITH APPROXIMATION PROPERTIES

    Energy Technology Data Exchange (ETDEWEB)

    Christensen, Max La Cour [Technical Univ. of Denmark, Lyngby (Denmark); Villa, Umberto E. [Univ. of Texas, Austin, TX (United States); Engsig-Karup, Allan P. [Technical Univ. of Denmark, Lyngby (Denmark); Vassilevski, Panayot S. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)

    2016-01-22

    The paper introduces a nonlinear multigrid solver for mixed nite element discretizations based on the Full Approximation Scheme (FAS) and element-based Algebraic Multigrid (AMGe). The main motivation to use FAS for unstruc- tured problems is the guaranteed approximation property of the AMGe coarse spaces that were developed recently at Lawrence Livermore National Laboratory. These give the ability to derive stable and accurate coarse nonlinear discretization problems. The previous attempts (including ones with the original AMGe method, [5, 11]), were less successful due to lack of such good approximation properties of the coarse spaces. With coarse spaces with approximation properties, our FAS approach on un- structured meshes should be as powerful/successful as FAS on geometrically re ned meshes. For comparison, Newton's method and Picard iterations with an inner state-of-the-art linear solver is compared to FAS on a nonlinear saddle point problem with applications to porous media ow. It is demonstrated that FAS is faster than Newton's method and Picard iterations for the experiments considered here. Due to the guaranteed approximation properties of our AMGe, the coarse spaces are very accurate, providing a solver with the potential for mesh-independent convergence on general unstructured meshes.

  20. Evolving effective incremental SAT solvers with GP

    OpenAIRE

    Bader, Mohamed; Poli, R.

    2008-01-01

    Hyper-Heuristics could simply be defined as heuristics to choose other heuristics, and it is a way of combining existing heuristics to generate new ones. In a Hyper-Heuristic framework, the framework is used for evolving effective incremental (Inc*) solvers for SAT. We test the evolved heuristics (IncHH) against other known local search heuristics on a variety of benchmark SAT problems.

  1. Sparse Source EEG Imaging with the Variational Garrote

    DEFF Research Database (Denmark)

    Hansen, Sofie Therese; Stahlhut, Carsten; Hansen, Lars Kai

    2013-01-01

    EEG imaging, the estimation of the cortical source distribution from scalp electrode measurements, poses an extremely ill-posed inverse problem. Recent work by Delorme et al. (2012) supports the hypothesis that distributed source solutions are sparse. We show that direct search for sparse solutions...

  2. Low-count PET image restoration using sparse representation

    Science.gov (United States)

    Li, Tao; Jiang, Changhui; Gao, Juan; Yang, Yongfeng; Liang, Dong; Liu, Xin; Zheng, Hairong; Hu, Zhanli

    2018-04-01

    In the field of positron emission tomography (PET), reconstructed images are often blurry and contain noise. These problems are primarily caused by the low resolution of projection data. Solving this problem by improving hardware is an expensive solution, and therefore, we attempted to develop a solution based on optimizing several related algorithms in both the reconstruction and image post-processing domains. As sparse technology is widely used, sparse prediction is increasingly applied to solve this problem. In this paper, we propose a new sparse method to process low-resolution PET images. Two dictionaries (D1 for low-resolution PET images and D2 for high-resolution PET images) are learned from a group real PET image data sets. Among these two dictionaries, D1 is used to obtain a sparse representation for each patch of the input PET image. Then, a high-resolution PET image is generated from this sparse representation using D2. Experimental results indicate that the proposed method exhibits a stable and superior ability to enhance image resolution and recover image details. Quantitatively, this method achieves better performance than traditional methods. This proposed strategy is a new and efficient approach for improving the quality of PET images.

  3. X-ray computed tomography using curvelet sparse regularization.

    Science.gov (United States)

    Wieczorek, Matthias; Frikel, Jürgen; Vogel, Jakob; Eggl, Elena; Kopp, Felix; Noël, Peter B; Pfeiffer, Franz; Demaret, Laurent; Lasser, Tobias

    2015-04-01

    Reconstruction of x-ray computed tomography (CT) data remains a mathematically challenging problem in medical imaging. Complementing the standard analytical reconstruction methods, sparse regularization is growing in importance, as it allows inclusion of prior knowledge. The paper presents a method for sparse regularization based on the curvelet frame for the application to iterative reconstruction in x-ray computed tomography. In this work, the authors present an iterative reconstruction approach based on the alternating direction method of multipliers using curvelet sparse regularization. Evaluation of the method is performed on a specifically crafted numerical phantom dataset to highlight the method's strengths. Additional evaluation is performed on two real datasets from commercial scanners with different noise characteristics, a clinical bone sample acquired in a micro-CT and a human abdomen scanned in a diagnostic CT. The results clearly illustrate that curvelet sparse regularization has characteristic strengths. In particular, it improves the restoration and resolution of highly directional, high contrast features with smooth contrast variations. The authors also compare this approach to the popular technique of total variation and to traditional filtered backprojection. The authors conclude that curvelet sparse regularization is able to improve reconstruction quality by reducing noise while preserving highly directional features.

  4. Resolving Neighbourhood Relations in a Parallel Fluid Dynamic Solver

    KAUST Repository

    Frisch, Jerome; Mundani, Ralf-Peter; Rank, Ernst

    2012-01-01

    solver with a special aspect on the hierarchical data structure, unique cell and grid identification, and the neighbourhood relations in-between grids on different processes. A special server concept keeps track of every grid over all processes while

  5. The SX Solver: A Computer Program for Analyzing Solvent-Extraction Equilibria: Version 3.0

    International Nuclear Information System (INIS)

    Lumetta, Gregg J.

    2001-01-01

    A new computer program, the SX Solver, has been developed to analyze solvent-extraction equilibria. The program operates out of Microsoft Excel and uses the built-in Solver function to minimize the sum of the square of the residuals between measured and calculated distribution coefficients. The extraction of nitric acid by tributyl phosphate has been modeled to illustrate the programs use

  6. Comparison of Einstein-Boltzmann solvers for testing general relativity

    Science.gov (United States)

    Bellini, E.; Barreira, A.; Frusciante, N.; Hu, B.; Peirone, S.; Raveri, M.; Zumalacárregui, M.; Avilez-Lopez, A.; Ballardini, M.; Battye, R. A.; Bolliet, B.; Calabrese, E.; Dirian, Y.; Ferreira, P. G.; Finelli, F.; Huang, Z.; Ivanov, M. M.; Lesgourgues, J.; Li, B.; Lima, N. A.; Pace, F.; Paoletti, D.; Sawicki, I.; Silvestri, A.; Skordis, C.; Umiltà, C.; Vernizzi, F.

    2018-01-01

    We compare Einstein-Boltzmann solvers that include modifications to general relativity and find that, for a wide range of models and parameters, they agree to a high level of precision. We look at three general purpose codes that primarily model general scalar-tensor theories, three codes that model Jordan-Brans-Dicke (JBD) gravity, a code that models f (R ) gravity, a code that models covariant Galileons, a code that models Hořava-Lifschitz gravity, and two codes that model nonlocal models of gravity. Comparing predictions of the angular power spectrum of the cosmic microwave background and the power spectrum of dark matter for a suite of different models, we find agreement at the subpercent level. This means that this suite of Einstein-Boltzmann solvers is now sufficiently accurate for precision constraints on cosmological and gravitational parameters.

  7. A High Performance QDWH-SVD Solver using Hardware Accelerators

    KAUST Repository

    Sukkari, Dalal E.; Ltaief, Hatem; Keyes, David E.

    2015-01-01

    few digits of accuracy, compared to the full double precision floating point arithmetic. We further leverage the single GPU QDWH-SVD implementation by introducing the first multi-GPU SVD solver to study the scalability of the QDWH-SVD framework.

  8. Sparse dictionary for synthetic transmit aperture medical ultrasound imaging.

    Science.gov (United States)

    Wang, Ping; Jiang, Jin-Yang; Li, Na; Luo, Han-Wu; Li, Fang; Cui, Shi-Gang

    2017-07-01

    It is possible to recover a signal below the Nyquist sampling limit using a compressive sensing technique in ultrasound imaging. However, the reconstruction enabled by common sparse transform approaches does not achieve satisfactory results. Considering the ultrasound echo signal's features of attenuation, repetition, and superposition, a sparse dictionary with the emission pulse signal is proposed. Sparse coefficients in the proposed dictionary have high sparsity. Images reconstructed with this dictionary were compared with those obtained with the three other common transforms, namely, discrete Fourier transform, discrete cosine transform, and discrete wavelet transform. The performance of the proposed dictionary was analyzed via a simulation and experimental data. The mean absolute error (MAE) was used to quantify the quality of the reconstructions. Experimental results indicate that the MAE associated with the proposed dictionary was always the smallest, the reconstruction time required was the shortest, and the lateral resolution and contrast of the reconstructed images were also the closest to the original images. The proposed sparse dictionary performed better than the other three sparse transforms. With the same sampling rate, the proposed dictionary achieved excellent reconstruction quality.

  9. A sparse electromagnetic imaging scheme using nonlinear landweber iterations

    KAUST Repository

    Desmal, Abdulla; Bagci, Hakan

    2015-01-01

    Development and use of electromagnetic inverse scattering techniques for imagining sparse domains have been on the rise following the recent advancements in solving sparse optimization problems. Existing techniques rely on iteratively converting

  10. Scalable group level probabilistic sparse factor analysis

    DEFF Research Database (Denmark)

    Hinrich, Jesper Løve; Nielsen, Søren Føns Vind; Riis, Nicolai Andre Brogaard

    2017-01-01

    Many data-driven approaches exist to extract neural representations of functional magnetic resonance imaging (fMRI) data, but most of them lack a proper probabilistic formulation. We propose a scalable group level probabilistic sparse factor analysis (psFA) allowing spatially sparse maps, component...... pruning using automatic relevance determination (ARD) and subject specific heteroscedastic spatial noise modeling. For task-based and resting state fMRI, we show that the sparsity constraint gives rise to components similar to those obtained by group independent component analysis. The noise modeling...... shows that noise is reduced in areas typically associated with activation by the experimental design. The psFA model identifies sparse components and the probabilistic setting provides a natural way to handle parameter uncertainties. The variational Bayesian framework easily extends to more complex...

  11. Fast wavelet based sparse approximate inverse preconditioner

    Energy Technology Data Exchange (ETDEWEB)

    Wan, W.L. [Univ. of California, Los Angeles, CA (United States)

    1996-12-31

    Incomplete LU factorization is a robust preconditioner for both general and PDE problems but unfortunately not easy to parallelize. Recent study of Huckle and Grote and Chow and Saad showed that sparse approximate inverse could be a potential alternative while readily parallelizable. However, for special class of matrix A that comes from elliptic PDE problems, their preconditioners are not optimal in the sense that independent of mesh size. A reason may be that no good sparse approximate inverse exists for the dense inverse matrix. Our observation is that for this kind of matrices, its inverse entries typically have piecewise smooth changes. We can take advantage of this fact and use wavelet compression techniques to construct a better sparse approximate inverse preconditioner. We shall show numerically that our approach is effective for this kind of matrices.

  12. Local posterior concentration rate for multilevel sparse sequences

    NARCIS (Netherlands)

    Belitser, E.N.; Nurushev, N.

    2017-01-01

    We consider empirical Bayesian inference in the many normal means model in the situation when the high-dimensional mean vector is multilevel sparse, that is,most of the entries of the parameter vector are some fixed values. For instance, the traditional sparse signal is a particular case (with one

  13. A heterogeneous CPU+GPU Poisson solver for space charge calculations in beam dynamics studies

    Energy Technology Data Exchange (ETDEWEB)

    Zheng, Dawei; Rienen, Ursula van [University of Rostock, Institute of General Electrical Engineering (Germany)

    2016-07-01

    In beam dynamics studies in accelerator physics, space charge plays a central role in the low energy regime of an accelerator. Numerical space charge calculations are required, both, in the design phase and in the operation of the machines as well. Due to its efficiency, mostly the Particle-In-Cell (PIC) method is chosen for the space charge calculation. Then, the solution of Poisson's equation for the charge distribution in the rest frame is the most prominent part within the solution process. The Poisson solver directly affects the accuracy of the self-field applied on the charged particles when the equation of motion is solved in the laboratory frame. As the Poisson solver consumes the major part of the computing time in most simulations it has to be as fast as possible since it has to be carried out once per time step. In this work, we demonstrate a novel heterogeneous CPU+GPU routine for the Poisson solver. The novel solver also benefits from our new research results on the utilization of a discrete cosine transform within the classical Hockney and Eastwood's convolution routine.

  14. ELSI: A unified software interface for Kohn-Sham electronic structure solvers

    Science.gov (United States)

    Yu, Victor Wen-zhe; Corsetti, Fabiano; García, Alberto; Huhn, William P.; Jacquelin, Mathias; Jia, Weile; Lange, Björn; Lin, Lin; Lu, Jianfeng; Mi, Wenhui; Seifitokaldani, Ali; Vázquez-Mayagoitia, Álvaro; Yang, Chao; Yang, Haizhao; Blum, Volker

    2018-01-01

    Solving the electronic structure from a generalized or standard eigenproblem is often the bottleneck in large scale calculations based on Kohn-Sham density-functional theory. This problem must be addressed by essentially all current electronic structure codes, based on similar matrix expressions, and by high-performance computation. We here present a unified software interface, ELSI, to access different strategies that address the Kohn-Sham eigenvalue problem. Currently supported algorithms include the dense generalized eigensolver library ELPA, the orbital minimization method implemented in libOMM, and the pole expansion and selected inversion (PEXSI) approach with lower computational complexity for semilocal density functionals. The ELSI interface aims to simplify the implementation and optimal use of the different strategies, by offering (a) a unified software framework designed for the electronic structure solvers in Kohn-Sham density-functional theory; (b) reasonable default parameters for a chosen solver; (c) automatic conversion between input and internal working matrix formats, and in the future (d) recommendation of the optimal solver depending on the specific problem. Comparative benchmarks are shown for system sizes up to 11,520 atoms (172,800 basis functions) on distributed memory supercomputing architectures.

  15. CASTRO: A NEW COMPRESSIBLE ASTROPHYSICAL SOLVER. II. GRAY RADIATION HYDRODYNAMICS

    International Nuclear Information System (INIS)

    Zhang, W.; Almgren, A.; Bell, J.; Howell, L.; Burrows, A.

    2011-01-01

    We describe the development of a flux-limited gray radiation solver for the compressible astrophysics code, CASTRO. CASTRO uses an Eulerian grid with block-structured adaptive mesh refinement based on a nested hierarchy of logically rectangular variable-sized grids with simultaneous refinement in both space and time. The gray radiation solver is based on a mixed-frame formulation of radiation hydrodynamics. In our approach, the system is split into two parts, one part that couples the radiation and fluid in a hyperbolic subsystem, and another parabolic part that evolves radiation diffusion and source-sink terms. The hyperbolic subsystem is solved explicitly with a high-order Godunov scheme, whereas the parabolic part is solved implicitly with a first-order backward Euler method.

  16. Sparse modeling of spatial environmental variables associated with asthma.

    Science.gov (United States)

    Chang, Timothy S; Gangnon, Ronald E; David Page, C; Buckingham, William R; Tandias, Aman; Cowan, Kelly J; Tomasallo, Carrie D; Arndt, Brian G; Hanrahan, Lawrence P; Guilbert, Theresa W

    2015-02-01

    Geographically distributed environmental factors influence the burden of diseases such as asthma. Our objective was to identify sparse environmental variables associated with asthma diagnosis gathered from a large electronic health record (EHR) dataset while controlling for spatial variation. An EHR dataset from the University of Wisconsin's Family Medicine, Internal Medicine and Pediatrics Departments was obtained for 199,220 patients aged 5-50years over a three-year period. Each patient's home address was geocoded to one of 3456 geographic census block groups. Over one thousand block group variables were obtained from a commercial database. We developed a Sparse Spatial Environmental Analysis (SASEA). Using this method, the environmental variables were first dimensionally reduced with sparse principal component analysis. Logistic thin plate regression spline modeling was then used to identify block group variables associated with asthma from sparse principal components. The addresses of patients from the EHR dataset were distributed throughout the majority of Wisconsin's geography. Logistic thin plate regression spline modeling captured spatial variation of asthma. Four sparse principal components identified via model selection consisted of food at home, dog ownership, household size, and disposable income variables. In rural areas, dog ownership and renter occupied housing units from significant sparse principal components were associated with asthma. Our main contribution is the incorporation of sparsity in spatial modeling. SASEA sequentially added sparse principal components to Logistic thin plate regression spline modeling. This method allowed association of geographically distributed environmental factors with asthma using EHR and environmental datasets. SASEA can be applied to other diseases with environmental risk factors. Copyright © 2014 Elsevier Inc. All rights reserved.

  17. Strongly coupled partitioned six degree-of-freedom rigid body motion solver with Aitken's dynamic under-relaxation

    Directory of Open Access Journals (Sweden)

    Jeng Hei Chow

    2016-07-01

    Full Text Available An implicit method of solving the six degree-of-freedom rigid body motion equations based on the second order Adams-Bashforth-Moulten method was utilised as an improvement over the leapfrog scheme by making modifications to the rigid body motion solver libraries directly. The implementation will depend on predictor-corrector steps still residing within the hybrid Pressure Implicit with Splitting of Operators - Semi-Implicit Method for Pressure Linked Equations (PIMPLE outer corrector loops to ensure strong coupling between fluid and motion. Aitken's under-relaxation is also introduced in this study to optimise the convergence rate and stability of the coupled solver. The resulting coupled solver ran on a free floating object tutorial test case when converged matches the original solver. It further allows a varying 70%–80% reduction in simulation times compared using a fixed under-relaxation to achieve the required stability.

  18. Analog system for computing sparse codes

    Science.gov (United States)

    Rozell, Christopher John; Johnson, Don Herrick; Baraniuk, Richard Gordon; Olshausen, Bruno A.; Ortman, Robert Lowell

    2010-08-24

    A parallel dynamical system for computing sparse representations of data, i.e., where the data can be fully represented in terms of a small number of non-zero code elements, and for reconstructing compressively sensed images. The system is based on the principles of thresholding and local competition that solves a family of sparse approximation problems corresponding to various sparsity metrics. The system utilizes Locally Competitive Algorithms (LCAs), nodes in a population continually compete with neighboring units using (usually one-way) lateral inhibition to calculate coefficients representing an input in an over complete dictionary.

  19. FATCOP: A Fault Tolerant Condor-PVM Mixed Integer Program Solver

    National Research Council Canada - National Science Library

    Chen, Qun

    1999-01-01

    We describe FATCOP, a new parallel mixed integer program solver written in PVM. The implementation uses the Condor resource management system to provide a virtual machine composed of otherwise idle computers...

  20. Efficient Pseudorecursive Evaluation Schemes for Non-adaptive Sparse Grids

    KAUST Repository

    Buse, Gerrit; Pflü ger, Dirk; Jacob, Riko

    2014-01-01

    In this work we propose novel algorithms for storing and evaluating sparse grid functions, operating on regular (not spatially adaptive), yet potentially dimensionally adaptive grid types. Besides regular sparse grids our approach includes truncated

  1. Dynamic Programming Algorithm for Generation of Optimal Elimination Trees for Multi-frontal Direct Solver Over H-refined Grids

    KAUST Repository

    AbouEisha, Hassan M.

    2014-06-06

    In this paper we present a dynamic programming algorithm for finding optimal elimination trees for computational grids refined towards point or edge singularities. The elimination tree is utilized to guide the multi-frontal direct solver algorithm. Thus, the criterion for the optimization of the elimination tree is the computational cost associated with the multi-frontal solver algorithm executed over such tree. We illustrate the paper with several examples of optimal trees found for grids with point, isotropic edge and anisotropic edge mixed with point singularity. We show the comparison of the execution time of the multi-frontal solver algorithm with results of MUMPS solver with METIS library, implementing the nested dissection algorithm.

  2. Occlusion detection via structured sparse learning for robust object tracking

    KAUST Repository

    Zhang, Tianzhu

    2014-01-01

    Sparse representation based methods have recently drawn much attention in visual tracking due to good performance against illumination variation and occlusion. They assume the errors caused by image variations can be modeled as pixel-wise sparse. However, in many practical scenarios, these errors are not truly pixel-wise sparse but rather sparsely distributed in a structured way. In fact, pixels in error constitute contiguous regions within the object’s track. This is the case when significant occlusion occurs. To accommodate for nonsparse occlusion in a given frame, we assume that occlusion detected in previous frames can be propagated to the current one. This propagated information determines which pixels will contribute to the sparse representation of the current track. In other words, pixels that were detected as part of an occlusion in the previous frame will be removed from the target representation process. As such, this paper proposes a novel tracking algorithm that models and detects occlusion through structured sparse learning. We test our tracker on challenging benchmark sequences, such as sports videos, which involve heavy occlusion, drastic illumination changes, and large pose variations. Extensive experimental results show that our proposed tracker consistently outperforms the state-of-the-art trackers.

  3. Sparse Representation Based SAR Vehicle Recognition along with Aspect Angle

    Directory of Open Access Journals (Sweden)

    Xiangwei Xing

    2014-01-01

    Full Text Available As a method of representing the test sample with few training samples from an overcomplete dictionary, sparse representation classification (SRC has attracted much attention in synthetic aperture radar (SAR automatic target recognition (ATR recently. In this paper, we develop a novel SAR vehicle recognition method based on sparse representation classification along with aspect information (SRCA, in which the correlation between the vehicle’s aspect angle and the sparse representation vector is exploited. The detailed procedure presented in this paper can be summarized as follows. Initially, the sparse representation vector of a test sample is solved by sparse representation algorithm with a principle component analysis (PCA feature-based dictionary. Then, the coefficient vector is projected onto a sparser one within a certain range of the vehicle’s aspect angle. Finally, the vehicle is classified into a certain category that minimizes the reconstruction error with the novel sparse representation vector. Extensive experiments are conducted on the moving and stationary target acquisition and recognition (MSTAR dataset and the results demonstrate that the proposed method performs robustly under the variations of depression angle and target configurations, as well as incomplete observation.

  4. A Parallel Algebraic Multigrid Solver on Graphics Processing Units

    KAUST Repository

    Haase, Gundolf; Liebmann, Manfred; Douglas, Craig C.; Plank, Gernot

    2010-01-01

    -vector multiplication scheme underlying the PCG-AMG algorithm is presented for the many-core GPU architecture. A performance comparison of the parallel solver shows that a singe Nvidia Tesla C1060 GPU board delivers the performance of a sixteen node Infiniband cluster

  5. A Nonlinear Modal Aeroelastic Solver for FUN3D

    Science.gov (United States)

    Goldman, Benjamin D.; Bartels, Robert E.; Biedron, Robert T.; Scott, Robert C.

    2016-01-01

    A nonlinear structural solver has been implemented internally within the NASA FUN3D computational fluid dynamics code, allowing for some new aeroelastic capabilities. Using a modal representation of the structure, a set of differential or differential-algebraic equations are derived for general thin structures with geometric nonlinearities. ODEPACK and LAPACK routines are linked with FUN3D, and the nonlinear equations are solved at each CFD time step. The existing predictor-corrector method is retained, whereby the structural solution is updated after mesh deformation. The nonlinear solver is validated using a test case for a flexible aeroshell at transonic, supersonic, and hypersonic flow conditions. Agreement with linear theory is seen for the static aeroelastic solutions at relatively low dynamic pressures, but structural nonlinearities limit deformation amplitudes at high dynamic pressures. No flutter was found at any of the tested trajectory points, though LCO may be possible in the transonic regime.

  6. Cartesian Mesh Linearized Euler Equations Solver for Aeroacoustic Problems around Full Aircraft

    Directory of Open Access Journals (Sweden)

    Yuma Fukushima

    2015-01-01

    Full Text Available The linearized Euler equations (LEEs solver for aeroacoustic problems has been developed on block-structured Cartesian mesh to address complex geometry. Taking advantage of the benefits of Cartesian mesh, we employ high-order schemes for spatial derivatives and for time integration. On the other hand, the difficulty of accommodating curved wall boundaries is addressed by the immersed boundary method. The resulting LEEs solver is robust to complex geometry and numerically efficient in a parallel environment. The accuracy and effectiveness of the present solver are validated by one-dimensional and three-dimensional test cases. Acoustic scattering around a sphere and noise propagation from the JT15D nacelle are computed. The results show good agreement with analytical, computational, and experimental results. Finally, noise propagation around fuselage-wing-nacelle configurations is computed as a practical example. The results show that the sound pressure level below the over-the-wing nacelle (OWN configuration is much lower than that of the conventional DLR-F6 aircraft configuration due to the shielding effect of the OWN configuration.

  7. Learning sparse generative models of audiovisual signals

    OpenAIRE

    Monaci, Gianluca; Sommer, Friedrich T.; Vandergheynst, Pierre

    2008-01-01

    This paper presents a novel framework to learn sparse represen- tations for audiovisual signals. An audiovisual signal is modeled as a sparse sum of audiovisual kernels. The kernels are bimodal functions made of synchronous audio and video components that can be positioned independently and arbitrarily in space and time. We design an algorithm capable of learning sets of such audiovi- sual, synchronous, shift-invariant functions by alternatingly solving a coding and a learning pr...

  8. Support agnostic Bayesian matching pursuit for block sparse signals

    KAUST Repository

    Masood, Mudassir

    2013-05-01

    A fast matching pursuit method using a Bayesian approach is introduced for block-sparse signal recovery. This method performs Bayesian estimates of block-sparse signals even when the distribution of active blocks is non-Gaussian or unknown. It is agnostic to the distribution of active blocks in the signal and utilizes a priori statistics of additive noise and the sparsity rate of the signal, which are shown to be easily estimated from data and no user intervention is required. The method requires a priori knowledge of block partition and utilizes a greedy approach and order-recursive updates of its metrics to find the most dominant sparse supports to determine the approximate minimum mean square error (MMSE) estimate of the block-sparse signal. Simulation results demonstrate the power and robustness of our proposed estimator. © 2013 IEEE.

  9. Preconditioned Inexact Newton for Nonlinear Sparse Electromagnetic Imaging

    KAUST Repository

    Desmal, Abdulla

    2014-05-04

    Newton-type algorithms have been extensively studied in nonlinear microwave imaging due to their quadratic convergence rate and ability to recover images with high contrast values. In the past, Newton methods have been implemented in conjunction with smoothness promoting optimization/regularization schemes. However, this type of regularization schemes are known to perform poorly when applied in imagining domains with sparse content or sharp variations. In this work, an inexact Newton algorithm is formulated and implemented in conjunction with a linear sparse optimization scheme. A novel preconditioning technique is proposed to increase the convergence rate of the optimization problem. Numerical results demonstrate that the proposed framework produces sharper and more accurate images when applied in sparse/sparsified domains.

  10. Preconditioned Inexact Newton for Nonlinear Sparse Electromagnetic Imaging

    KAUST Repository

    Desmal, Abdulla

    2014-01-06

    Newton-type algorithms have been extensively studied in nonlinear microwave imaging due to their quadratic convergence rate and ability to recover images with high contrast values. In the past, Newton methods have been implemented in conjunction with smoothness promoting optimization/regularization schemes. However, this type of regularization schemes are known to perform poorly when applied in imagining domains with sparse content or sharp variations. In this work, an inexact Newton algorithm is formulated and implemented in conjunction with a linear sparse optimization scheme. A novel preconditioning technique is proposed to increase the convergence rate of the optimization problem. Numerical results demonstrate that the proposed framework produces sharper and more accurate images when applied in sparse/sparsified domains.

  11. Electromagnetic Formation Flight (EMFF) for Sparse Aperture Arrays

    Science.gov (United States)

    Kwon, Daniel W.; Miller, David W.; Sedwick, Raymond J.

    2004-01-01

    Traditional methods of actuating spacecraft in sparse aperture arrays use propellant as a reaction mass. For formation flying systems, propellant becomes a critical consumable which can be quickly exhausted while maintaining relative orientation. Additional problems posed by propellant include optical contamination, plume impingement, thermal emission, and vibration excitation. For these missions where control of relative degrees of freedom is important, we consider using a system of electromagnets, in concert with reaction wheels, to replace the consumables. Electromagnetic Formation Flight sparse apertures, powered by solar energy, are designed differently from traditional propulsion systems, which are based on V. This paper investigates the design of sparse apertures both inside and outside the Earth's gravity field.

  12. Preconditioned Inexact Newton for Nonlinear Sparse Electromagnetic Imaging

    KAUST Repository

    Desmal, Abdulla; Bagci, Hakan

    2014-01-01

    Newton-type algorithms have been extensively studied in nonlinear microwave imaging due to their quadratic convergence rate and ability to recover images with high contrast values. In the past, Newton methods have been implemented in conjunction with smoothness promoting optimization/regularization schemes. However, this type of regularization schemes are known to perform poorly when applied in imagining domains with sparse content or sharp variations. In this work, an inexact Newton algorithm is formulated and implemented in conjunction with a linear sparse optimization scheme. A novel preconditioning technique is proposed to increase the convergence rate of the optimization problem. Numerical results demonstrate that the proposed framework produces sharper and more accurate images when applied in sparse/sparsified domains.

  13. SolveDB: Integrating Optimization Problem Solvers Into SQL Databases

    DEFF Research Database (Denmark)

    Siksnys, Laurynas; Pedersen, Torben Bach

    2016-01-01

    for optimization problems, (2) an extensible infrastructure for integrating different solvers, and (3) query optimization techniques to achieve the best execution performance and/or result quality. Extensive experiments with the PostgreSQL-based implementation show that SolveDB is a versatile tool offering much...

  14. A comprehensive study of sparse codes on abnormality detection

    DEFF Research Database (Denmark)

    Ren, Huamin; Pan, Hong; Olsen, Søren Ingvor

    2017-01-01

    Sparse representation has been applied successfully in abnor-mal event detection, in which the baseline is to learn a dic-tionary accompanied by sparse codes. While much empha-sis is put on discriminative dictionary construction, there areno comparative studies of sparse codes regarding abnormal-ity...... detection. We comprehensively study two types of sparsecodes solutions - greedy algorithms and convex L1-norm so-lutions - and their impact on abnormality detection perfor-mance. We also propose our framework of combining sparsecodes with different detection methods. Our comparative ex-periments are carried...

  15. New multigrid solver advances in TOPS

    International Nuclear Information System (INIS)

    Falgout, R D; Brannick, J; Brezina, M; Manteuffel, T; McCormick, S

    2005-01-01

    In this paper, we highlight new multigrid solver advances in the Terascale Optimal PDE Simulations (TOPS) project in the Scientific Discovery Through Advanced Computing (SciDAC) program. We discuss two new algebraic multigrid (AMG) developments in TOPS: the adaptive smoothed aggregation method (αSA) and a coarse-grid selection algorithm based on compatible relaxation (CR). The αSA method is showing promising results in initial studies for Quantum Chromodynamics (QCD) applications. The CR method has the potential to greatly improve the applicability of AMG

  16. Support agnostic Bayesian matching pursuit for block sparse signals

    KAUST Repository

    Masood, Mudassir; Al-Naffouri, Tareq Y.

    2013-01-01

    priori knowledge of block partition and utilizes a greedy approach and order-recursive updates of its metrics to find the most dominant sparse supports to determine the approximate minimum mean square error (MMSE) estimate of the block-sparse signal

  17. An alternative solver for the nodal expansion method equations - 106

    International Nuclear Information System (INIS)

    Carvalho da Silva, F.; Carlos Marques Alvim, A.; Senra Martinez, A.

    2010-01-01

    An automated procedure for nuclear reactor core design is accomplished by using a quick and accurate 3D nodal code, aiming at solving the diffusion equation, which describes the spatial neutron distribution in the reactor. This paper deals with an alternative solver for nodal expansion method (NEM), with only two inner iterations (mesh sweeps) per outer iteration, thus having the potential to reduce the time required to calculate the power distribution in nuclear reactors, but with accuracy similar to the ones found in conventional NEM. The proposed solver was implemented into a computational system which, besides solving the diffusion equation, also solves the burnup equations governing the gradual changes in material compositions of the core due to fuel depletion. Results confirm the effectiveness of the method for practical purposes. (authors)

  18. Ramses-GPU: Second order MUSCL-Handcock finite volume fluid solver

    Science.gov (United States)

    Kestener, Pierre

    2017-10-01

    RamsesGPU is a reimplementation of RAMSES (ascl:1011.007) which drops the adaptive mesh refinement (AMR) features to optimize 3D uniform grid algorithms for modern graphics processor units (GPU) to provide an efficient software package for astrophysics applications that do not need AMR features but do require a very large number of integration time steps. RamsesGPU provides an very efficient C++/CUDA/MPI software implementation of a second order MUSCL-Handcock finite volume fluid solver for compressible hydrodynamics as a magnetohydrodynamics solver based on the constraint transport technique. Other useful modules includes static gravity, dissipative terms (viscosity, resistivity), and forcing source term for turbulence studies, and special care was taken to enhance parallel input/output performance by using state-of-the-art libraries such as HDF5 and parallel-netcdf.

  19. A Generic High-performance GPU-based Library for PDE solvers

    DEFF Research Database (Denmark)

    Glimberg, Stefan Lemvig; Engsig-Karup, Allan Peter

    , the privilege of high-performance parallel computing is now in principle accessible for many scientific users, no matter their economic resources. Though being highly effective units, GPUs and parallel architectures in general, pose challenges for software developers to utilize their efficiency. Sequential...... legacy codes are not always easily parallelized and the time spent on conversion might not pay o in the end. We present a highly generic C++ library for fast assembling of partial differential equation (PDE) solvers, aiming at utilizing the computational resources of GPUs. The library requires a minimum...... of GPU computing knowledge, while still oering the possibility to customize user-specic solvers at kernel level if desired. Spatial dierential operators are based on matrix free exible order nite dierence approximations. These matrix free operators minimize both memory consumption and main memory access...

  20. PUFoam : A novel open-source CFD solver for the simulation of polyurethane foams

    Science.gov (United States)

    Karimi, M.; Droghetti, H.; Marchisio, D. L.

    2017-08-01

    In this work a transient three-dimensional mathematical model is formulated and validated for the simulation of polyurethane (PU) foams. The model is based on computational fluid dynamics (CFD) and is coupled with a population balance equation (PBE) to describe the evolution of the gas bubbles/cells within the PU foam. The front face of the expanding foam is monitored on the basis of the volume-of-fluid (VOF) method using a compressible solver available in OpenFOAM version 3.0.1. The solver is additionally supplemented to include the PBE, solved with the quadrature method of moments (QMOM), the polymerization kinetics, an adequate rheological model and a simple model for the foam thermal conductivity. The new solver is labelled as PUFoam and is, for the first time in this work, validated for 12 different mixing-cup experiments. Comparison of the time evolution of the predicted and experimentally measured density and temperature of the PU foam shows the potentials and limitations of the approach.

  1. Linear optical response of finite systems using multishift linear system solvers

    Energy Technology Data Exchange (ETDEWEB)

    Hübener, Hannes; Giustino, Feliciano [Department of Materials, University of Oxford, Oxford OX1 3PH (United Kingdom)

    2014-07-28

    We discuss the application of multishift linear system solvers to linear-response time-dependent density functional theory. Using this technique the complete frequency-dependent electronic density response of finite systems to an external perturbation can be calculated at the cost of a single solution of a linear system via conjugate gradients. We show that multishift time-dependent density functional theory yields excitation energies and oscillator strengths in perfect agreement with the standard diagonalization of the response matrix (Casida's method), while being computationally advantageous. We present test calculations for benzene, porphin, and chlorophyll molecules. We argue that multishift solvers may find broad applicability in the context of excited-state calculations within density-functional theory and beyond.

  2. Multitasking domain decomposition fast Poisson solvers on the Cray Y-MP

    Science.gov (United States)

    Chan, Tony F.; Fatoohi, Rod A.

    1990-01-01

    The results of multitasking implementation of a domain decomposition fast Poisson solver on eight processors of the Cray Y-MP are presented. The object of this research is to study the performance of domain decomposition methods on a Cray supercomputer and to analyze the performance of different multitasking techniques using highly parallel algorithms. Two implementations of multitasking are considered: macrotasking (parallelism at the subroutine level) and microtasking (parallelism at the do-loop level). A conventional FFT-based fast Poisson solver is also multitasked. The results of different implementations are compared and analyzed. A speedup of over 7.4 on the Cray Y-MP running in a dedicated environment is achieved for all cases.

  3. Integrated tokamak modelling with the fast-ion Fokker–Planck solver adapted for transient analyses

    International Nuclear Information System (INIS)

    Toma, M; Hamamatsu, K; Hayashi, N; Honda, M; Ide, S

    2015-01-01

    Integrated tokamak modelling that enables the simulation of an entire discharge period is indispensable for designing advanced tokamak plasmas. For this purpose, we extend the integrated code TOPICS to make it more suitable for transient analyses in the fast-ion part. The fast-ion Fokker–Planck solver is integrated into TOPICS at the same level as the bulk transport solver so that the time evolutions of the fast ion and the bulk plasma are consistent with each other as well as with the equilibrium magnetic field. The fast-ion solver simultaneously handles neutral beam-injected ions and alpha particles. Parallelisation of the fast-ion solver in addition to its computational lightness owing to a dimensional reduction in the phase space enables transient analyses for long periods in the order of tens of seconds. The fast-ion Fokker–Planck calculation is compared and confirmed to be in good agreement with an orbit following a Monte Carlo calculation. The integrated code is applied to ramp-up simulations for JT-60SA and ITER to confirm its capability and effectiveness in transient analyses. In the integrated simulations, the coupled evolution of the fast ions, plasma profiles, and equilibrium magnetic fields are presented. In addition, the electric acceleration effect on fast ions is shown and discussed. (paper)

  4. Selectivity and sparseness in randomly connected balanced networks.

    Directory of Open Access Journals (Sweden)

    Cengiz Pehlevan

    Full Text Available Neurons in sensory cortex show stimulus selectivity and sparse population response, even in cases where no strong functionally specific structure in connectivity can be detected. This raises the question whether selectivity and sparseness can be generated and maintained in randomly connected networks. We consider a recurrent network of excitatory and inhibitory spiking neurons with random connectivity, driven by random projections from an input layer of stimulus selective neurons. In this architecture, the stimulus-to-stimulus and neuron-to-neuron modulation of total synaptic input is weak compared to the mean input. Surprisingly, we show that in the balanced state the network can still support high stimulus selectivity and sparse population response. In the balanced state, strong synapses amplify the variation in synaptic input and recurrent inhibition cancels the mean. Functional specificity in connectivity emerges due to the inhomogeneity caused by the generative statistical rule used to build the network. We further elucidate the mechanism behind and evaluate the effects of model parameters on population sparseness and stimulus selectivity. Network response to mixtures of stimuli is investigated. It is shown that a balanced state with unselective inhibition can be achieved with densely connected input to inhibitory population. Balanced networks exhibit the "paradoxical" effect: an increase in excitatory drive to inhibition leads to decreased inhibitory population firing rate. We compare and contrast selectivity and sparseness generated by the balanced network to randomly connected unbalanced networks. Finally, we discuss our results in light of experiments.

  5. SPARSE ELECTROMAGNETIC IMAGING USING NONLINEAR LANDWEBER ITERATIONS

    KAUST Repository

    Desmal, Abdulla

    2015-07-29

    A scheme for efficiently solving the nonlinear electromagnetic inverse scattering problem on sparse investigation domains is described. The proposed scheme reconstructs the (complex) dielectric permittivity of an investigation domain from fields measured away from the domain itself. Least-squares data misfit between the computed scattered fields, which are expressed as a nonlinear function of the permittivity, and the measured fields is constrained by the L0/L1-norm of the solution. The resulting minimization problem is solved using nonlinear Landweber iterations, where at each iteration a thresholding function is applied to enforce the sparseness-promoting L0/L1-norm constraint. The thresholded nonlinear Landweber iterations are applied to several two-dimensional problems, where the ``measured\\'\\' fields are synthetically generated or obtained from actual experiments. These numerical experiments demonstrate the accuracy, efficiency, and applicability of the proposed scheme in reconstructing sparse profiles with high permittivity values.

  6. Vector sparse representation of color image using quaternion matrix analysis.

    Science.gov (United States)

    Xu, Yi; Yu, Licheng; Xu, Hongteng; Zhang, Hao; Nguyen, Truong

    2015-04-01

    Traditional sparse image models treat color image pixel as a scalar, which represents color channels separately or concatenate color channels as a monochrome image. In this paper, we propose a vector sparse representation model for color images using quaternion matrix analysis. As a new tool for color image representation, its potential applications in several image-processing tasks are presented, including color image reconstruction, denoising, inpainting, and super-resolution. The proposed model represents the color image as a quaternion matrix, where a quaternion-based dictionary learning algorithm is presented using the K-quaternion singular value decomposition (QSVD) (generalized K-means clustering for QSVD) method. It conducts the sparse basis selection in quaternion space, which uniformly transforms the channel images to an orthogonal color space. In this new color space, it is significant that the inherent color structures can be completely preserved during vector reconstruction. Moreover, the proposed sparse model is more efficient comparing with the current sparse models for image restoration tasks due to lower redundancy between the atoms of different color channels. The experimental results demonstrate that the proposed sparse image model avoids the hue bias issue successfully and shows its potential as a general and powerful tool in color image analysis and processing domain.

  7. An AMR capable finite element diffusion solver for ALE hydrocodes [An AMR capable diffusion solver for ALE-AMR

    Energy Technology Data Exchange (ETDEWEB)

    Fisher, A. C. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Bailey, D. S. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Kaiser, T. B. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Eder, D. C. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Gunney, B. T. N. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Masters, N. D. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Koniges, A. E. [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Anderson, R. W. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)

    2015-02-01

    Here, we present a novel method for the solution of the diffusion equation on a composite AMR mesh. This approach is suitable for including diffusion based physics modules to hydrocodes that support ALE and AMR capabilities. To illustrate, we proffer our implementations of diffusion based radiation transport and heat conduction in a hydrocode called ALE-AMR. Numerical experiments conducted with the diffusion solver and associated physics packages yield 2nd order convergence in the L2 norm.

  8. Modularization and Validation of FUN3D as a CREATE-AV Helios Near-Body Solver

    Science.gov (United States)

    Jain, Rohit; Biedron, Robert T.; Jones, William T.; Lee-Rausch, Elizabeth M.

    2016-01-01

    Under a recent collaborative effort between the US Army Aeroflightdynamics Directorate (AFDD) and NASA Langley, NASA's general unstructured CFD solver, FUN3D, was modularized as a CREATE-AV Helios near-body unstructured grid solver. The strategies adopted in Helios/FUN3D integration effort are described. A validation study of the new capability is performed for rotorcraft cases spanning hover prediction, airloads prediction, coupling with computational structural dynamics, counter-rotating dual-rotor configurations, and free-flight trim. The integration of FUN3D, along with the previously integrated NASA OVERFLOW solver, lays the ground for future interaction opportunities where capabilities of one component could be leveraged with those of others in a relatively seamless fashion within CREATE-AV Helios.

  9. A GPU-based incompressible Navier-Stokes solver on moving overset grids

    Science.gov (United States)

    Chandar, Dominic D. J.; Sitaraman, Jayanarayanan; Mavriplis, Dimitri J.

    2013-07-01

    In pursuit of obtaining high fidelity solutions to the fluid flow equations in a short span of time, graphics processing units (GPUs) which were originally intended for gaming applications are currently being used to accelerate computational fluid dynamics (CFD) codes. With a high peak throughput of about 1 TFLOPS on a PC, GPUs seem to be favourable for many high-resolution computations. One such computation that involves a lot of number crunching is computing time accurate flow solutions past moving bodies. The aim of the present paper is thus to discuss the development of a flow solver on unstructured and overset grids and its implementation on GPUs. In its present form, the flow solver solves the incompressible fluid flow equations on unstructured/hybrid/overset grids using a fully implicit projection method. The resulting discretised equations are solved using a matrix-free Krylov solver using several GPU kernels such as gradient, Laplacian and reduction. Some of the simple arithmetic vector calculations are implemented using the CU++: An Object Oriented Framework for Computational Fluid Dynamics Applications using Graphics Processing Units, Journal of Supercomputing, 2013, doi:10.1007/s11227-013-0985-9 approach where GPU kernels are automatically generated at compile time. Results are presented for two- and three-dimensional computations on static and moving grids.

  10. Anisotropic resonator analysis using the Fourier-Bessel mode solver

    Science.gov (United States)

    Gauthier, Robert C.

    2018-03-01

    A numerical mode solver for optical structures that conform to cylindrical symmetry using Faraday's and Ampere's laws as starting expressions is developed when electric or magnetic anisotropy is present. The technique builds on the existing Fourier-Bessel mode solver which allows resonator states to be computed exploiting the symmetry properties of the resonator and states to reduce the matrix system. The introduction of anisotropy into the theoretical frame work facilitates the inclusion of PML borders permitting the computation of open ended structures and a better estimation of the resonator state quality factor. Matrix populating expressions are provided that can accommodate any material anisotropy with arbitrary orientation in the computation domain. Several example of electrical anisotropic computations are provided for rationally symmetric structures such as standard optical fibers, axial Bragg-ring fibers and bottle resonators. The anisotropy present in the materials introduces off diagonal matrix elements in the permittivity tensor when expressed in cylindrical coordinates. The effects of the anisotropy of computed states are presented and discussed.

  11. Parallel Auxiliary Space AMG Solver for $H(div)$ Problems

    Energy Technology Data Exchange (ETDEWEB)

    Kolev, Tzanio V. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Vassilevski, Panayot S. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)

    2012-12-18

    We present a family of scalable preconditioners for matrices arising in the discretization of $H(div)$ problems using the lowest order Raviart--Thomas finite elements. Our approach belongs to the class of “auxiliary space''--based methods and requires only the finite element stiffness matrix plus some minimal additional discretization information about the topology and orientation of mesh entities. Also, we provide a detailed algebraic description of the theory, parallel implementation, and different variants of this parallel auxiliary space divergence solver (ADS) and discuss its relations to the Hiptmair--Xu (HX) auxiliary space decomposition of $H(div)$ [SIAM J. Numer. Anal., 45 (2007), pp. 2483--2509] and to the auxiliary space Maxwell solver AMS [J. Comput. Math., 27 (2009), pp. 604--623]. Finally, an extensive set of numerical experiments demonstrates the robustness and scalability of our implementation on large-scale $H(div)$ problems with large jumps in the material coefficients.

  12. Evaluating the performance of the two-phase flow solver interFoam

    International Nuclear Information System (INIS)

    Deshpande, Suraj S; Anumolu, Lakshman; Trujillo, Mario F

    2012-01-01

    The performance of the open source multiphase flow solver, interFoam, is evaluated in this work. The solver is based on a modified volume of fluid (VoF) approach, which incorporates an interfacial compression flux term to mitigate the effects of numerical smearing of the interface. It forms a part of the C + + libraries and utilities of OpenFOAM and is gaining popularity in the multiphase flow research community. However, to the best of our knowledge, the evaluation of this solver is confined to the validation tests of specific interest to the users of the code and the extent of its applicability to a wide range of multiphase flow situations remains to be explored. In this work, we have performed a thorough investigation of the solver performance using a variety of verification and validation test cases, which include (i) verification tests for pure advection (kinematics), (ii) dynamics in the high Weber number limit and (iii) dynamics of surface tension-dominated flows. With respect to (i), the kinematics tests show that the performance of interFoam is generally comparable with the recent algebraic VoF algorithms; however, it is noticeably worse than the geometric reconstruction schemes. For (ii), the simulations of inertia-dominated flows with large density ratios ∼O(10 3 ) yielded excellent agreement with analytical and experimental results. In regime (iii), where surface tension is important, consistency of pressure–surface tension formulation and accuracy of curvature are important, as established by Francois et al (2006 J. Comput. Phys. 213 141–73). Several verification tests were performed along these lines and the main findings are: (a) the algorithm of interFoam ensures a consistent formulation of pressure and surface tension; (b) the curvatures computed by the solver converge to a value slightly (10%) different from the analytical value and a scope for improvement exists in this respect. To reduce the disruptive effects of spurious currents, we

  13. Evaluating the performance of the two-phase flow solver interFoam

    Science.gov (United States)

    Deshpande, Suraj S.; Anumolu, Lakshman; Trujillo, Mario F.

    2012-01-01

    The performance of the open source multiphase flow solver, interFoam, is evaluated in this work. The solver is based on a modified volume of fluid (VoF) approach, which incorporates an interfacial compression flux term to mitigate the effects of numerical smearing of the interface. It forms a part of the C + + libraries and utilities of OpenFOAM and is gaining popularity in the multiphase flow research community. However, to the best of our knowledge, the evaluation of this solver is confined to the validation tests of specific interest to the users of the code and the extent of its applicability to a wide range of multiphase flow situations remains to be explored. In this work, we have performed a thorough investigation of the solver performance using a variety of verification and validation test cases, which include (i) verification tests for pure advection (kinematics), (ii) dynamics in the high Weber number limit and (iii) dynamics of surface tension-dominated flows. With respect to (i), the kinematics tests show that the performance of interFoam is generally comparable with the recent algebraic VoF algorithms; however, it is noticeably worse than the geometric reconstruction schemes. For (ii), the simulations of inertia-dominated flows with large density ratios {\\sim }\\mathscr {O}(10^3) yielded excellent agreement with analytical and experimental results. In regime (iii), where surface tension is important, consistency of pressure-surface tension formulation and accuracy of curvature are important, as established by Francois et al (2006 J. Comput. Phys. 213 141-73). Several verification tests were performed along these lines and the main findings are: (a) the algorithm of interFoam ensures a consistent formulation of pressure and surface tension; (b) the curvatures computed by the solver converge to a value slightly (10%) different from the analytical value and a scope for improvement exists in this respect. To reduce the disruptive effects of spurious

  14. Chemical Mechanism Solvers in Air Quality Models

    Directory of Open Access Journals (Sweden)

    John C. Linford

    2011-09-01

    Full Text Available The solution of chemical kinetics is one of the most computationally intensivetasks in atmospheric chemical transport simulations. Due to the stiff nature of the system,implicit time stepping algorithms which repeatedly solve linear systems of equations arenecessary. This paper reviews the issues and challenges associated with the construction ofefficient chemical solvers, discusses several families of algorithms, presents strategies forincreasing computational efficiency, and gives insight into implementing chemical solverson accelerated computer architectures.

  15. Menu-Driven Solver Of Linear-Programming Problems

    Science.gov (United States)

    Viterna, L. A.; Ferencz, D.

    1992-01-01

    Program assists inexperienced user in formulating linear-programming problems. A Linear Program Solver (ALPS) computer program is full-featured LP analysis program. Solves plain linear-programming problems as well as more-complicated mixed-integer and pure-integer programs. Also contains efficient technique for solution of purely binary linear-programming problems. Written entirely in IBM's APL2/PC software, Version 1.01. Packed program contains licensed material, property of IBM (copyright 1988, all rights reserved).

  16. Fast convolutional sparse coding using matrix inversion lemma

    Czech Academy of Sciences Publication Activity Database

    Šorel, Michal; Šroubek, Filip

    2016-01-01

    Roč. 55, č. 1 (2016), s. 44-51 ISSN 1051-2004 R&D Projects: GA ČR GA13-29225S Institutional support: RVO:67985556 Keywords : Convolutional sparse coding * Feature learning * Deconvolution networks * Shift-invariant sparse coding Subject RIV: JD - Computer Applications, Robotics Impact factor: 2.337, year: 2016 http://library.utia.cas.cz/separaty/2016/ZOI/sorel-0459332.pdf

  17. A Comparison of Monte Carlo and Deterministic Solvers for keff and Sensitivity Calculations

    Energy Technology Data Exchange (ETDEWEB)

    Haeck, Wim [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Parsons, Donald Kent [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); White, Morgan Curtis [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Saller, Thomas [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Favorite, Jeffrey A. [Los Alamos National Lab. (LANL), Los Alamos, NM (United States)

    2017-12-12

    Verification and validation of our solutions for calculating the neutron reactivity for nuclear materials is a key issue to address for many applications, including criticality safety, research reactors, power reactors, and nuclear security. Neutronics codes solve variations of the Boltzmann transport equation. The two main variants are Monte Carlo versus deterministic solutions, e.g. the MCNP [1] versus PARTISN [2] codes, respectively. There have been many studies over the decades that examined the accuracy of such solvers and the general conclusion is that when the problems are well-posed, either solver can produce accurate results. However, the devil is always in the details. The current study examines the issue of self-shielding and the stress it puts on deterministic solvers. Most Monte Carlo neutronics codes use continuous-energy descriptions of the neutron interaction data that are not subject to this effect. The issue of self-shielding occurs because of the discretisation of data used by the deterministic solutions. Multigroup data used in these solvers are the average cross section and scattering parameters over an energy range. Resonances in cross sections can occur that change the likelihood of interaction by one to three orders of magnitude over a small energy range. Self-shielding is the numerical effect that the average cross section in groups with strong resonances can be strongly affected as neutrons within that material are preferentially absorbed or scattered out of the resonance energies. This affects both the average cross section and the scattering matrix.

  18. Structure-based bayesian sparse reconstruction

    KAUST Repository

    Quadeer, Ahmed Abdul; Al-Naffouri, Tareq Y.

    2012-01-01

    Sparse signal reconstruction algorithms have attracted research attention due to their wide applications in various fields. In this paper, we present a simple Bayesian approach that utilizes the sparsity constraint and a priori statistical

  19. Binary Sparse Phase Retrieval via Simulated Annealing

    Directory of Open Access Journals (Sweden)

    Wei Peng

    2016-01-01

    Full Text Available This paper presents the Simulated Annealing Sparse PhAse Recovery (SASPAR algorithm for reconstructing sparse binary signals from their phaseless magnitudes of the Fourier transform. The greedy strategy version is also proposed for a comparison, which is a parameter-free algorithm. Sufficient numeric simulations indicate that our method is quite effective and suggest the binary model is robust. The SASPAR algorithm seems competitive to the existing methods for its efficiency and high recovery rate even with fewer Fourier measurements.

  20. Implementing parallel elliptic solver on a Beowulf cluster

    Directory of Open Access Journals (Sweden)

    Marcin Paprzycki

    1999-12-01

    Full Text Available In a recent paper cite{zara} a parallel direct solver for the linear systems arising from elliptic partial differential equations has been proposed. The aim of this note is to present the initial evaluation of the performance characteristics of this algorithm on Beowulf-type cluster. In this context the performance of PVM and MPI based implementations is compared.

  1. Nearly Interactive Parabolized Navier-Stokes Solver for High Speed Forebody and Inlet Flows

    Science.gov (United States)

    Benson, Thomas J.; Liou, May-Fun; Jones, William H.; Trefny, Charles J.

    2009-01-01

    A system of computer programs is being developed for the preliminary design of high speed inlets and forebodies. The system comprises four functions: geometry definition, flow grid generation, flow solver, and graphics post-processor. The system runs on a dedicated personal computer using the Windows operating system and is controlled by graphical user interfaces written in MATLAB (The Mathworks, Inc.). The flow solver uses the Parabolized Navier-Stokes equations to compute millions of mesh points in several minutes. Sample two-dimensional and three-dimensional calculations are demonstrated in the paper.

  2. Nonlinear Multigrid solver exploiting AMGe Coarse Spaces with Approximation Properties

    DEFF Research Database (Denmark)

    Christensen, Max la Cour; Villa, Umberto; Engsig-Karup, Allan Peter

    The paper introduces a nonlinear multigrid solver for mixed finite element discretizations based on the Full Approximation Scheme (FAS) and element-based Algebraic Multigrid (AMGe). The main motivation to use FAS for unstructured problems is the guaranteed approximation property of the AMGe coarse...... properties of the coarse spaces. With coarse spaces with approximation properties, our FAS approach on unstructured meshes has the ability to be as powerful/successful as FAS on geometrically refined meshes. For comparison, Newton’s method and Picard iterations with an inner state-of-the-art linear solver...... are compared to FAS on a nonlinear saddle point problem with applications to porous media flow. It is demonstrated that FAS is faster than Newton’s method and Picard iterations for the experiments considered here. Due to the guaranteed approximation properties of our AMGe, the coarse spaces are very accurate...

  3. High accuracy electromagnetic field solvers for cylindrical waveguides and axisymmetric structures using the finite element method

    International Nuclear Information System (INIS)

    Nelson, E.M.

    1993-12-01

    Some two-dimensional finite element electromagnetic field solvers are described and tested. For TE and TM modes in homogeneous cylindrical waveguides and monopole modes in homogeneous axisymmetric structures, the solvers find approximate solutions to a weak formulation of the wave equation. Second-order isoparametric lagrangian triangular elements represent the field. For multipole modes in axisymmetric structures, the solver finds approximate solutions to a weak form of the curl-curl formulation of Maxwell's equations. Second-order triangular edge elements represent the radial (ρ) and axial (z) components of the field, while a second-order lagrangian basis represents the azimuthal (φ) component of the field weighted by the radius ρ. A reduced set of basis functions is employed for elements touching the axis. With this basis the spurious modes of the curl-curl formulation have zero frequency, so spurious modes are easily distinguished from non-static physical modes. Tests on an annular ring, a pillbox and a sphere indicate the solutions converge rapidly as the mesh is refined. Computed eigenvalues with relative errors of less than a few parts per million are obtained. Boundary conditions for symmetric, periodic and symmetric-periodic structures are discussed and included in the field solver. Boundary conditions for structures with inversion symmetry are also discussed. Special corner elements are described and employed to improve the accuracy of cylindrical waveguide and monopole modes with singular fields at sharp corners. The field solver is applied to three problems: (1) cross-field amplifier slow-wave circuits, (2) a detuned disk-loaded waveguide linear accelerator structure and (3) a 90 degrees overmoded waveguide bend. The detuned accelerator structure is a critical application of this high accuracy field solver. To maintain low long-range wakefields, tight design and manufacturing tolerances are required

  4. Confidence of model based shape reconstruction from sparse data

    DEFF Research Database (Denmark)

    Baka, N.; de Bruijne, Marleen; Reiber, J. H. C.

    2010-01-01

    Statistical shape models (SSM) are commonly applied for plausible interpolation of missing data in medical imaging. However, when fitting a shape model to sparse information, many solutions may fit the available data. In this paper we derive a constrained SSM to fit noisy sparse input landmarks...

  5. Proportionate Minimum Error Entropy Algorithm for Sparse System Identification

    Directory of Open Access Journals (Sweden)

    Zongze Wu

    2015-08-01

    Full Text Available Sparse system identification has received a great deal of attention due to its broad applicability. The proportionate normalized least mean square (PNLMS algorithm, as a popular tool, achieves excellent performance for sparse system identification. In previous studies, most of the cost functions used in proportionate-type sparse adaptive algorithms are based on the mean square error (MSE criterion, which is optimal only when the measurement noise is Gaussian. However, this condition does not hold in most real-world environments. In this work, we use the minimum error entropy (MEE criterion, an alternative to the conventional MSE criterion, to develop the proportionate minimum error entropy (PMEE algorithm for sparse system identification, which may achieve much better performance than the MSE based methods especially in heavy-tailed non-Gaussian situations. Moreover, we analyze the convergence of the proposed algorithm and derive a sufficient condition that ensures the mean square convergence. Simulation results confirm the excellent performance of the new algorithm.

  6. Sparse PDF Volumes for Consistent Multi-Resolution Volume Rendering

    KAUST Repository

    Sicat, Ronell Barrera

    2014-12-31

    This paper presents a new multi-resolution volume representation called sparse pdf volumes, which enables consistent multi-resolution volume rendering based on probability density functions (pdfs) of voxel neighborhoods. These pdfs are defined in the 4D domain jointly comprising the 3D volume and its 1D intensity range. Crucially, the computation of sparse pdf volumes exploits data coherence in 4D, resulting in a sparse representation with surprisingly low storage requirements. At run time, we dynamically apply transfer functions to the pdfs using simple and fast convolutions. Whereas standard low-pass filtering and down-sampling incur visible differences between resolution levels, the use of pdfs facilitates consistent results independent of the resolution level used. We describe the efficient out-of-core computation of large-scale sparse pdf volumes, using a novel iterative simplification procedure of a mixture of 4D Gaussians. Finally, our data structure is optimized to facilitate interactive multi-resolution volume rendering on GPUs.

  7. Solving non-linear Horn clauses using a linear Horn clause solver

    DEFF Research Database (Denmark)

    Kafle, Bishoksan; Gallagher, John Patrick; Ganty, Pierre

    2016-01-01

    In this paper we show that checking satisfiability of a set of non-linear Horn clauses (also called a non-linear Horn clause program) can be achieved using a solver for linear Horn clauses. We achieve this by interleaving a program transformation with a satisfiability checker for linear Horn...... clauses (also called a solver for linear Horn clauses). The program transformation is based on the notion of tree dimension, which we apply to a set of non-linear clauses, yielding a set whose derivation trees have bounded dimension. Such a set of clauses can be linearised. The main algorithm...... dimension. We constructed a prototype implementation of this approach and performed some experiments on a set of verification problems, which shows some promise....

  8. GPU TECHNOLOGIES EMBODIED IN PARALLEL SOLVERS OF LINEAR ALGEBRAIC EQUATION SYSTEMS

    Directory of Open Access Journals (Sweden)

    Sidorov Alexander Vladimirovich

    2012-10-01

    Full Text Available The author reviews existing shareware solvers that are operated by graphical computer devices. The purpose of this review is to explore the opportunities and limitations of the above parallel solvers applicable for resolution of linear algebraic problems that arise at Research and Educational Centre of Computer Modeling at MSUCE, and Research and Engineering Centre STADYO. The author has explored new applications of the GPU in the PETSc suite and compared them with the results generated absent of the GPU. The research is performed within the CUSP library developed to resolve the problems of linear algebra through the application of GPU. The author has also reviewed the new MAGMA project which is analogous to LAPACK for the GPU.

  9. ODEPACK, Initial Value Problems of Ordinary Differential Equation System

    International Nuclear Information System (INIS)

    Hindmarsh, A.C.; Petzold, L.R.

    2005-01-01

    I - Description of program or function: ODEPACK is a collection of Fortran solvers for the initial value problem for ordinary differential equation systems. It consists of nine solvers, namely a basic solver called LSODE and eight variants of it -- LSODES, LSODA, LSODAR, LSODPK, LSODKR, LSODI, LSOIBT, and LSODIS. The collection is suitable for both stiff and non-stiff systems. It includes solvers for systems given in explicit form, dy/dt = f(t,y), and also solvers for systems given in linearly implicit form, A(t,y) dy/dt = g(t,y). Two of the solvers use general sparse matrix solvers for the linear systems that arise. Two others use iterative (preconditioned Krylov) methods instead of direct methods for these linear systems. The most recent addition is LSODIS, which solves implicit problems with general sparse treatment of all matrices involved. The ODEPACK solvers are written in standard Fortran 77, with a few exceptions, and with minimal machine dependencies. There are separate double and single precision versions of ODEPACK. The actual solver names are those given above with a prefix of D- or S- for the double or single precision version, respectively, i.e. DLSODE/SLSODE, etc. Each solver consists of a main driver subroutine having the same name as the solver and some number of subordinate routines. For each solver, there is also a demonstration program, which solves one or two simple problems in a somewhat self-checking manner. A. Solvers for explicitly given systems. For each of the following solvers, it is assumed that the ODEs are given explicitly, so that the system can be written in the form dy/dt = f(t,y), where y is the vector of dependent variables, and t is the independent variable. 1. LSODE (Livermore Solver for Ordinary Differential Equations) is the basic solver of the collection. It solves stiff and non-stiff systems of the form dy/dt = f. In the stiff case, it treats the Jacobian matrix df/dy as either a dense (full) or a banded matrix, and as

  10. Ordering sparse matrices for cache-based systems

    International Nuclear Information System (INIS)

    Biswas, Rupak; Oliker, Leonid

    2001-01-01

    The Conjugate Gradient (CG) algorithm is the oldest and best-known Krylov subspace method used to solve sparse linear systems. Most of the coating-point operations within each CG iteration is spent performing sparse matrix-vector multiplication (SPMV). We examine how various ordering and partitioning strategies affect the performance of CG and SPMV when different programming paradigms are used on current commercial cache-based computers. However, a multithreaded implementation on the cacheless Cray MTA demonstrates high efficiency and scalability without any special ordering or partitioning

  11. Modeling hemodynamics in intracranial aneurysms: Comparing accuracy of CFD solvers based on finite element and finite volume schemes.

    Science.gov (United States)

    Botti, Lorenzo; Paliwal, Nikhil; Conti, Pierangelo; Antiga, Luca; Meng, Hui

    2018-06-01

    Image-based computational fluid dynamics (CFD) has shown potential to aid in the clinical management of intracranial aneurysms (IAs) but its adoption in the clinical practice has been missing, partially due to lack of accuracy assessment and sensitivity analysis. To numerically solve the flow-governing equations CFD solvers generally rely on two spatial discretization schemes: Finite Volume (FV) and Finite Element (FE). Since increasingly accurate numerical solutions are obtained by different means, accuracies and computational costs of FV and FE formulations cannot be compared directly. To this end, in this study we benchmark two representative CFD solvers in simulating flow in a patient-specific IA model: (1) ANSYS Fluent, a commercial FV-based solver and (2) VMTKLab multidGetto, a discontinuous Galerkin (dG) FE-based solver. The FV solver's accuracy is improved by increasing the spatial mesh resolution (134k, 1.1m, 8.6m and 68.5m tetrahedral element meshes). The dGFE solver accuracy is increased by increasing the degree of polynomials (first, second, third and fourth degree) on the base 134k tetrahedral element mesh. Solutions from best FV and dGFE approximations are used as baseline for error quantification. On average, velocity errors for second-best approximations are approximately 1cm/s for a [0,125]cm/s velocity magnitude field. Results show that high-order dGFE provide better accuracy per degree of freedom but worse accuracy per Jacobian non-zero entry as compared to FV. Cross-comparison of velocity errors demonstrates asymptotic convergence of both solvers to the same numerical solution. Nevertheless, the discrepancy between under-resolved velocity fields suggests that mesh independence is reached following different paths. This article is protected by copyright. All rights reserved.

  12. A flexible framework for sparse simultaneous component based data integration

    Directory of Open Access Journals (Sweden)

    Van Deun Katrijn

    2011-11-01

    Full Text Available Abstract 1 Background High throughput data are complex and methods that reveal structure underlying the data are most useful. Principal component analysis, frequently implemented as a singular value decomposition, is a popular technique in this respect. Nowadays often the challenge is to reveal structure in several sources of information (e.g., transcriptomics, proteomics that are available for the same biological entities under study. Simultaneous component methods are most promising in this respect. However, the interpretation of the principal and simultaneous components is often daunting because contributions of each of the biomolecules (transcripts, proteins have to be taken into account. 2 Results We propose a sparse simultaneous component method that makes many of the parameters redundant by shrinking them to zero. It includes principal component analysis, sparse principal component analysis, and ordinary simultaneous component analysis as special cases. Several penalties can be tuned that account in different ways for the block structure present in the integrated data. This yields known sparse approaches as the lasso, the ridge penalty, the elastic net, the group lasso, sparse group lasso, and elitist lasso. In addition, the algorithmic results can be easily transposed to the context of regression. Metabolomics data obtained with two measurement platforms for the same set of Escherichia coli samples are used to illustrate the proposed methodology and the properties of different penalties with respect to sparseness across and within data blocks. 3 Conclusion Sparse simultaneous component analysis is a useful method for data integration: First, simultaneous analyses of multiple blocks offer advantages over sequential and separate analyses and second, interpretation of the results is highly facilitated by their sparseness. The approach offered is flexible and allows to take the block structure in different ways into account. As such

  13. A flexible framework for sparse simultaneous component based data integration.

    Science.gov (United States)

    Van Deun, Katrijn; Wilderjans, Tom F; van den Berg, Robert A; Antoniadis, Anestis; Van Mechelen, Iven

    2011-11-15

    High throughput data are complex and methods that reveal structure underlying the data are most useful. Principal component analysis, frequently implemented as a singular value decomposition, is a popular technique in this respect. Nowadays often the challenge is to reveal structure in several sources of information (e.g., transcriptomics, proteomics) that are available for the same biological entities under study. Simultaneous component methods are most promising in this respect. However, the interpretation of the principal and simultaneous components is often daunting because contributions of each of the biomolecules (transcripts, proteins) have to be taken into account. We propose a sparse simultaneous component method that makes many of the parameters redundant by shrinking them to zero. It includes principal component analysis, sparse principal component analysis, and ordinary simultaneous component analysis as special cases. Several penalties can be tuned that account in different ways for the block structure present in the integrated data. This yields known sparse approaches as the lasso, the ridge penalty, the elastic net, the group lasso, sparse group lasso, and elitist lasso. In addition, the algorithmic results can be easily transposed to the context of regression. Metabolomics data obtained with two measurement platforms for the same set of Escherichia coli samples are used to illustrate the proposed methodology and the properties of different penalties with respect to sparseness across and within data blocks. Sparse simultaneous component analysis is a useful method for data integration: First, simultaneous analyses of multiple blocks offer advantages over sequential and separate analyses and second, interpretation of the results is highly facilitated by their sparseness. The approach offered is flexible and allows to take the block structure in different ways into account. As such, structures can be found that are exclusively tied to one data platform

  14. P-SPARSLIB: A parallel sparse iterative solution package

    Energy Technology Data Exchange (ETDEWEB)

    Saad, Y. [Univ. of Minnesota, Minneapolis, MN (United States)

    1994-12-31

    Iterative methods are gaining popularity in engineering and sciences at a time where the computational environment is changing rapidly. P-SPARSLIB is a project to build a software library for sparse matrix computations on parallel computers. The emphasis is on iterative methods and the use of distributed sparse matrices, an extension of the domain decomposition approach to general sparse matrices. One of the goals of this project is to develop a software package geared towards specific applications. For example, the author will test the performance and usefulness of P-SPARSLIB modules on linear systems arising from CFD applications. Equally important is the goal of portability. In the long run, the author wishes to ensure that this package is portable on a variety of platforms, including SIMD environments and shared memory environments.

  15. Feature selection and multi-kernel learning for sparse representation on a manifold

    KAUST Repository

    Wang, Jim Jing-Yan

    2014-03-01

    Sparse representation has been widely studied as a part-based data representation method and applied in many scientific and engineering fields, such as bioinformatics and medical imaging. It seeks to represent a data sample as a sparse linear combination of some basic items in a dictionary. Gao etal. (2013) recently proposed Laplacian sparse coding by regularizing the sparse codes with an affinity graph. However, due to the noisy features and nonlinear distribution of the data samples, the affinity graph constructed directly from the original feature space is not necessarily a reliable reflection of the intrinsic manifold of the data samples. To overcome this problem, we integrate feature selection and multiple kernel learning into the sparse coding on the manifold. To this end, unified objectives are defined for feature selection, multiple kernel learning, sparse coding, and graph regularization. By optimizing the objective functions iteratively, we develop novel data representation algorithms with feature selection and multiple kernel learning respectively. Experimental results on two challenging tasks, N-linked glycosylation prediction and mammogram retrieval, demonstrate that the proposed algorithms outperform the traditional sparse coding methods. © 2013 Elsevier Ltd.

  16. Feature selection and multi-kernel learning for sparse representation on a manifold.

    Science.gov (United States)

    Wang, Jim Jing-Yan; Bensmail, Halima; Gao, Xin

    2014-03-01

    Sparse representation has been widely studied as a part-based data representation method and applied in many scientific and engineering fields, such as bioinformatics and medical imaging. It seeks to represent a data sample as a sparse linear combination of some basic items in a dictionary. Gao et al. (2013) recently proposed Laplacian sparse coding by regularizing the sparse codes with an affinity graph. However, due to the noisy features and nonlinear distribution of the data samples, the affinity graph constructed directly from the original feature space is not necessarily a reliable reflection of the intrinsic manifold of the data samples. To overcome this problem, we integrate feature selection and multiple kernel learning into the sparse coding on the manifold. To this end, unified objectives are defined for feature selection, multiple kernel learning, sparse coding, and graph regularization. By optimizing the objective functions iteratively, we develop novel data representation algorithms with feature selection and multiple kernel learning respectively. Experimental results on two challenging tasks, N-linked glycosylation prediction and mammogram retrieval, demonstrate that the proposed algorithms outperform the traditional sparse coding methods. Copyright © 2013 Elsevier Ltd. All rights reserved.

  17. Sparse representation, modeling and learning in visual recognition theory, algorithms and applications

    CERN Document Server

    Cheng, Hong

    2015-01-01

    This unique text/reference presents a comprehensive review of the state of the art in sparse representations, modeling and learning. The book examines both the theoretical foundations and details of algorithm implementation, highlighting the practical application of compressed sensing research in visual recognition and computer vision. Topics and features: provides a thorough introduction to the fundamentals of sparse representation, modeling and learning, and the application of these techniques in visual recognition; describes sparse recovery approaches, robust and efficient sparse represen

  18. Design Patterns for Sparse-Matrix Computations on Hybrid CPU/GPU Platforms

    Directory of Open Access Journals (Sweden)

    Valeria Cardellini

    2014-01-01

    Full Text Available We apply object-oriented software design patterns to develop code for scientific software involving sparse matrices. Design patterns arise when multiple independent developments produce similar designs which converge onto a generic solution. We demonstrate how to use design patterns to implement an interface for sparse matrix computations on NVIDIA GPUs starting from PSBLAS, an existing sparse matrix library, and from existing sets of GPU kernels for sparse matrices. We also compare the throughput of the PSBLAS sparse matrix–vector multiplication on two platforms exploiting the GPU with that obtained by a CPU-only PSBLAS implementation. Our experiments exhibit encouraging results regarding the comparison between CPU and GPU executions in double precision, obtaining a speedup of up to 35.35 on NVIDIA GTX 285 with respect to AMD Athlon 7750, and up to 10.15 on NVIDIA Tesla C2050 with respect to Intel Xeon X5650.

  19. An Efficient GPU General Sparse Matrix-Matrix Multiplication for Irregular Data

    DEFF Research Database (Denmark)

    Liu, Weifeng; Vinter, Brian

    2014-01-01

    General sparse matrix-matrix multiplication (SpGEMM) is a fundamental building block for numerous applications such as algebraic multigrid method, breadth first search and shortest path problem. Compared to other sparse BLAS routines, an efficient parallel SpGEMM algorithm has to handle extra...... irregularity from three aspects: (1) the number of the nonzero entries in the result sparse matrix is unknown in advance, (2) very expensive parallel insert operations at random positions in the result sparse matrix dominate the execution time, and (3) load balancing must account for sparse data in both input....... Load balancing builds on the number of the necessary arithmetic operations on the nonzero entries and is guaranteed in all stages. Compared with the state-of-the-art GPU SpGEMM methods in the CUSPARSE library and the CUSP library and the latest CPU SpGEMM method in the Intel Math Kernel Library, our...

  20. Comparison of Methods for Sparse Representation of Musical Signals

    DEFF Research Database (Denmark)

    Endelt, Line Ørtoft; la Cour-Harbo, Anders

    2005-01-01

    by a number of sparseness measures and results are shown on the ℓ1 norm of the coefficients, using a dictionary containing a Dirac basis, a Discrete Cosine Transform, and a Wavelet Packet. Evaluated only on the sparseness Matching Pursuit is the best method, and it is also relatively fast....

  1. Joint-2D-SL0 Algorithm for Joint Sparse Matrix Reconstruction

    Directory of Open Access Journals (Sweden)

    Dong Zhang

    2017-01-01

    Full Text Available Sparse matrix reconstruction has a wide application such as DOA estimation and STAP. However, its performance is usually restricted by the grid mismatch problem. In this paper, we revise the sparse matrix reconstruction model and propose the joint sparse matrix reconstruction model based on one-order Taylor expansion. And it can overcome the grid mismatch problem. Then, we put forward the Joint-2D-SL0 algorithm which can solve the joint sparse matrix reconstruction problem efficiently. Compared with the Kronecker compressive sensing method, our proposed method has a higher computational efficiency and acceptable reconstruction accuracy. Finally, simulation results validate the superiority of the proposed method.

  2. Development of a global toroidal gyrokinetic Vlasov code with new real space field solver

    International Nuclear Information System (INIS)

    Obrejan, Kevin; Imadera, Kenji; Li, Ji-Quan; Kishimoto, Yasuaki

    2015-01-01

    This work introduces a new full-f toroidal gyrokinetic (GK) Vlasov simulation code that uses a real space field solver. This solver enables us to compute the gyro-averaging operators in real space to allow proper treatment of finite Larmor radius (FLR) effects without requiring any particular hypothesis and in any magnetic field configuration (X-point, D-shaped etc). The code was well verified through benchmark tests such as toroidal Ion Temperature Gradient (ITG) instability and collisionless damping of zonal flow. (author)

  3. Discussion of CoSA: Clustering of Sparse Approximations

    Energy Technology Data Exchange (ETDEWEB)

    Armstrong, Derek Elswick [Los Alamos National Lab. (LANL), Los Alamos, NM (United States)

    2017-03-07

    The purpose of this talk is to discuss the possible applications of CoSA (Clustering of Sparse Approximations) to the exploitation of HSI (HyperSpectral Imagery) data. CoSA is presented by Moody et al. in the Journal of Applied Remote Sensing (“Land cover classification in multispectral imagery using clustering of sparse approximations over learned feature dictionaries”, Vol. 8, 2014) and is based on machine learning techniques.

  4. A contribution to the great Riemann solver debate

    Science.gov (United States)

    Quirk, James J.

    1992-01-01

    The aims of this paper are threefold: to increase the level of awareness within the shock capturing community to the fact that many Godunov-type methods contain subtle flaws that can cause spurious solutions to be computed; to identify one mechanism that might thwart attempts to produce very high resolution simulations; and to proffer a simple strategy for overcoming the specific failings of individual Riemann solvers.

  5. Robust Multiscale Iterative Solvers for Nonlinear Flows in Highly Heterogeneous Media

    KAUST Repository

    Efendiev, Y.

    2012-08-01

    In this paper, we study robust iterative solvers for finite element systems resulting in approximation of steady-state Richards\\' equation in porous media with highly heterogeneous conductivity fields. It is known that in such cases the contrast, ratio between the highest and lowest values of the conductivity, can adversely affect the performance of the preconditioners and, consequently, a design of robust preconditioners is important for many practical applications. The proposed iterative solvers consist of two kinds of iterations, outer and inner iterations. Outer iterations are designed to handle nonlinearities by linearizing the equation around the previous solution state. As a result of the linearization, a large-scale linear system needs to be solved. This linear system is solved iteratively (called inner iterations), and since it can have large variations in the coefficients, a robust preconditioner is needed. First, we show that under some assumptions the number of outer iterations is independent of the contrast. Second, based on the recently developed iterative methods, we construct a class of preconditioners that yields convergence rate that is independent of the contrast. Thus, the proposed iterative solvers are optimal with respect to the large variation in the physical parameters. Since the same preconditioner can be reused in every outer iteration, this provides an additional computational savings in the overall solution process. Numerical tests are presented to confirm the theoretical results. © 2012 Global-Science Press.

  6. Using a satisfiability solver to identify deterministic finite state automata

    NARCIS (Netherlands)

    Heule, M.J.H.; Verwer, S.

    2009-01-01

    We present an exact algorithm for identification of deterministic finite automata (DFA) which is based on satisfiability (SAT) solvers. Despite the size of the low level SAT representation, our approach seems to be competitive with alternative techniques. Our contributions are threefold: First, we

  7. Pushing Memory Bandwidth Limitations Through Efficient Implementations of Block-Krylov Space Solvers on GPUs

    Energy Technology Data Exchange (ETDEWEB)

    Clark, M. A. [NVIDIA Corp., Santa Clara; Strelchenko, Alexei [Fermilab; Vaquero, Alejandro [Utah U.; Wagner, Mathias [NVIDIA Corp., Santa Clara; Weinberg, Evan [Boston U.

    2017-10-26

    Lattice quantum chromodynamics simulations in nuclear physics have benefited from a tremendous number of algorithmic advances such as multigrid and eigenvector deflation. These improve the time to solution but do not alleviate the intrinsic memory-bandwidth constraints of the matrix-vector operation dominating iterative solvers. Batching this operation for multiple vectors and exploiting cache and register blocking can yield a super-linear speed up. Block-Krylov solvers can naturally take advantage of such batched matrix-vector operations, further reducing the iterations to solution by sharing the Krylov space between solves. However, practical implementations typically suffer from the quadratic scaling in the number of vector-vector operations. Using the QUDA library, we present an implementation of a block-CG solver on NVIDIA GPUs which reduces the memory-bandwidth complexity of vector-vector operations from quadratic to linear. We present results for the HISQ discretization, showing a 5x speedup compared to highly-optimized independent Krylov solves on NVIDIA's SaturnV cluster.

  8. Essential imposition of Neumann condition in Galerkin-Legendre elliptic solvers

    CERN Document Server

    Auteri, F; Quartapelle, L

    2003-01-01

    A new Galerkin-Legendre direct spectral solver for the Neumann problem associated with Laplace and Helmholtz operators in rectangular domains is presented. The algorithm differs from other Neumann spectral solvers by the high sparsity of the matrices, exploited in conjunction with the direct product structure of the problem. The homogeneous boundary condition is satisfied exactly by expanding the unknown variable into a polynomial basis of functions which are built upon the Legendre polynomials and have a zero slope at the interval extremes. A double diagonalization process is employed pivoting around the eigenstructure of the pentadiagonal mass matrices in both directions, instead of the full stiffness matrices encountered in the classical variational formulation of the problem with a weak natural imposition of the derivative boundary condition. Nonhomogeneous Neumann data are accounted for by means of a lifting. Numerical results are given to illustrate the performance of the proposed spectral elliptic solv...

  9. Metaheuristics progress as real problem solvers

    CERN Document Server

    Nonobe, Koji; Yagiura, Mutsunori

    2005-01-01

    Metaheuristics: Progress as Real Problem Solvers is a peer-reviewed volume of eighteen current, cutting-edge papers by leading researchers in the field. Included are an invited paper by F. Glover and G. Kochenberger, which discusses the concept of Metaheuristic agent processes, and a tutorial paper by M.G.C. Resende and C.C. Ribeiro discussing GRASP with path-relinking. Other papers discuss problem-solving approaches to timetabling, automated planograms, elevators, space allocation, shift design, cutting stock, flexible shop scheduling, colorectal cancer and cartography. A final group of methodology papers clarify various aspects of Metaheuristics from the computational view point.

  10. Facial Expression Recognition via Non-Negative Least-Squares Sparse Coding

    Directory of Open Access Journals (Sweden)

    Ying Chen

    2014-05-01

    Full Text Available Sparse coding is an active research subject in signal processing, computer vision, and pattern recognition. A novel method of facial expression recognition via non-negative least squares (NNLS sparse coding is presented in this paper. The NNLS sparse coding is used to form a facial expression classifier. To testify the performance of the presented method, local binary patterns (LBP and the raw pixels are extracted for facial feature representation. Facial expression recognition experiments are conducted on the Japanese Female Facial Expression (JAFFE database. Compared with other widely used methods such as linear support vector machines (SVM, sparse representation-based classifier (SRC, nearest subspace classifier (NSC, K-nearest neighbor (KNN and radial basis function neural networks (RBFNN, the experiment results indicate that the presented NNLS method performs better than other used methods on facial expression recognition tasks.

  11. Sparse Representation Based Multi-Instance Learning for Breast Ultrasound Image Classification

    Directory of Open Access Journals (Sweden)

    Lu Bing

    2017-01-01

    Full Text Available We propose a novel method based on sparse representation for breast ultrasound image classification under the framework of multi-instance learning (MIL. After image enhancement and segmentation, concentric circle is used to extract the global and local features for improving the accuracy in diagnosis and prediction. The classification problem of ultrasound image is converted to sparse representation based MIL problem. Each instance of a bag is represented as a sparse linear combination of all basis vectors in the dictionary, and then the bag is represented by one feature vector which is obtained via sparse representations of all instances within the bag. The sparse and MIL problem is further converted to a conventional learning problem that is solved by relevance vector machine (RVM. Results of single classifiers are combined to be used for classification. Experimental results on the breast cancer datasets demonstrate the superiority of the proposed method in terms of classification accuracy as compared with state-of-the-art MIL methods.

  12. Sparse Representation Based Multi-Instance Learning for Breast Ultrasound Image Classification.

    Science.gov (United States)

    Bing, Lu; Wang, Wei

    2017-01-01

    We propose a novel method based on sparse representation for breast ultrasound image classification under the framework of multi-instance learning (MIL). After image enhancement and segmentation, concentric circle is used to extract the global and local features for improving the accuracy in diagnosis and prediction. The classification problem of ultrasound image is converted to sparse representation based MIL problem. Each instance of a bag is represented as a sparse linear combination of all basis vectors in the dictionary, and then the bag is represented by one feature vector which is obtained via sparse representations of all instances within the bag. The sparse and MIL problem is further converted to a conventional learning problem that is solved by relevance vector machine (RVM). Results of single classifiers are combined to be used for classification. Experimental results on the breast cancer datasets demonstrate the superiority of the proposed method in terms of classification accuracy as compared with state-of-the-art MIL methods.

  13. On Positive Semidefinite Modification Schemes for Incomplete Cholesky Factorization

    Czech Academy of Sciences Publication Activity Database

    Scott, J.; Tůma, Miroslav

    2014-01-01

    Roč. 36, č. 2 (2014), A609-A633 ISSN 1064-8275 R&D Projects: GA ČR GA13-06684S Institutional support: RVO:67985807 Keywords : sparse matrices * sparse linear systems * positive-definite symmetric systems * iterative solvers * preconditioning * incomplete Cholesky factorization Subject RIV: BA - General Mathematics Impact factor: 1.854, year: 2014

  14. Joint sparse representation for robust multimodal biometrics recognition.

    Science.gov (United States)

    Shekhar, Sumit; Patel, Vishal M; Nasrabadi, Nasser M; Chellappa, Rama

    2014-01-01

    Traditional biometric recognition systems rely on a single biometric signature for authentication. While the advantage of using multiple sources of information for establishing the identity has been widely recognized, computational models for multimodal biometrics recognition have only recently received attention. We propose a multimodal sparse representation method, which represents the test data by a sparse linear combination of training data, while constraining the observations from different modalities of the test subject to share their sparse representations. Thus, we simultaneously take into account correlations as well as coupling information among biometric modalities. A multimodal quality measure is also proposed to weigh each modality as it gets fused. Furthermore, we also kernelize the algorithm to handle nonlinearity in data. The optimization problem is solved using an efficient alternative direction method. Various experiments show that the proposed method compares favorably with competing fusion-based methods.

  15. Using Solver Interfaced Virtual Reality in PEACER Design Process

    International Nuclear Information System (INIS)

    Lee, Hyong Won; Nam, Won Chang; Jeong, Seung Ho; Hwang, Il Soon; Shin, Jong Gye; Kim, Chang Hyo

    2006-01-01

    The recent research progress in the area of plant design and simulation highlighted the importance of integrating design and analysis models on a unified environment. For currently developed advanced reactors, either for power production or research, this effort has embraced impressive state-of-the-art information and automation technology. The PEACER (Proliferation-resistant, Environment friendly, Accident-tolerant, Continual and Economical Reactor) is one of the conceptual fast reactor system cooled by LBE (Lead Bismuth Eutectic) for nuclear waste transmutation. This reactor system is composed of innovative combination between design process and analysis. To establish an integrated design process by coupling design, analysis, and post-processing technology while minimizing the repetitive and costly manual interactions for design changes, a solver interfaced virtual reality simulation system (SIVR) has been developed for a nuclear transmutation energy system as PEACER. The SIVR was developed using Virtual Reality Modeling Language (VRML) in order to interface a commercial 3D CAD tool with various engineering solvers and to implement virtual reality presentation of results in a neutral format. In this paper, we have shown the SIVR approach viable and effective in the life-cycle management of complex nuclear energy systems, including design, construction and operation. For instance, The HELIOS is a down scaled model of the PEACER prototype to demonstrate the operability and safety as well as preliminary test of PEACER PLM (Product Life-cycle Management) with SIVR (Solver Interfaced Virtual Reality) concepts. Most components are designed by CATIA, which is 3D CAD tool. During the construction, 3D drawing by CATIA was effective to handle and arrange the loop configuration, especially when we changed the design. Most of all, This system shows the transparency of design and operational status of an energy complex to operators and inspectors can help ensure accident

  16. Robust Visual Tracking Via Consistent Low-Rank Sparse Learning

    KAUST Repository

    Zhang, Tianzhu

    2014-06-19

    Object tracking is the process of determining the states of a target in consecutive video frames based on properties of motion and appearance consistency. In this paper, we propose a consistent low-rank sparse tracker (CLRST) that builds upon the particle filter framework for tracking. By exploiting temporal consistency, the proposed CLRST algorithm adaptively prunes and selects candidate particles. By using linear sparse combinations of dictionary templates, the proposed method learns the sparse representations of image regions corresponding to candidate particles jointly by exploiting the underlying low-rank constraints. In addition, the proposed CLRST algorithm is computationally attractive since temporal consistency property helps prune particles and the low-rank minimization problem for learning joint sparse representations can be efficiently solved by a sequence of closed form update operations. We evaluate the proposed CLRST algorithm against 14 state-of-the-art tracking methods on a set of 25 challenging image sequences. Experimental results show that the CLRST algorithm performs favorably against state-of-the-art tracking methods in terms of accuracy and execution time.

  17. Constraint Solver Techniques for Implementing Precise and Scalable Static Program Analysis

    DEFF Research Database (Denmark)

    Zhang, Ye

    solver using unification we could make a program analysis easier to design and implement, much more scalable, and still as precise as expected. We present an inclusion constraint language with the explicit equality constructs for specifying program analysis problems, and a parameterized framework...... developers to build reliable software systems more quickly and with fewer bugs or security defects. While designing and implementing a program analysis remains a hard work, making it both scalable and precise is even more challenging. In this dissertation, we show that with a general inclusion constraint...... data flow analyses for C language, we demonstrate a large amount of equivalences could be detected by off-line analyses, and they could then be used by a constraint solver to significantly improve the scalability of an analysis without sacrificing any precision....

  18. Evaluating Sparse Linear System Solvers on Scalable Parallel Architectures

    National Research Council Canada - National Science Library

    Grama, Ananth; Manguoglu, Murat; Koyuturk, Mehmet; Naumov, Maxim; Sameh, Ahmed

    2008-01-01

    .... The study was motivated primarily by the lack of robustness of Krylov subspace iterative schemes with generic, black-box, pre-conditioners such as approximate (or incomplete) LU-factorizations...

  19. Domain decomposition solvers for nonlinear multiharmonic finite element equations

    KAUST Repository

    Copeland, D. M.

    2010-01-01

    In many practical applications, for instance, in computational electromagnetics, the excitation is time-harmonic. Switching from the time domain to the frequency domain allows us to replace the expensive time-integration procedure by the solution of a simple elliptic equation for the amplitude. This is true for linear problems, but not for nonlinear problems. However, due to the periodicity of the solution, we can expand the solution in a Fourier series. Truncating this Fourier series and approximating the Fourier coefficients by finite elements, we arrive at a large-scale coupled nonlinear system for determining the finite element approximation to the Fourier coefficients. The construction of fast solvers for such systems is very crucial for the efficiency of this multiharmonic approach. In this paper we look at nonlinear, time-harmonic potential problems as simple model problems. We construct and analyze almost optimal solvers for the Jacobi systems arising from the Newton linearization of the large-scale coupled nonlinear system that one has to solve instead of performing the expensive time-integration procedure. © 2010 de Gruyter.

  20. PENBURN - A 3-D Zone-Based Depletion/Burnup Solver

    International Nuclear Information System (INIS)

    Manalo, Kevin; Plower, Thomas; Rowe, Mireille; Mock, Travis; Sjoden, Glenn E.

    2008-01-01

    PENBURN (Parallel Environment Burnup) is a general depletion/burnup solver which, when provided with zone-based reaction rates, computes time-dependent isotope concentrations for a set of actinides and fission products. Burnup analysis in PENBURN is performed with a direct Bateman-solver chain solution technique. Specifically, in tandem with PENBURN is the use of PENTRAN, a parallel multi-group anisotropic Sn code for 3-D Cartesian geometries. In PENBURN, the linear chain method is actively used to solve individual isotope chains which are then fully attributed by the burnup code to yield integrated isotope concentrations for each nuclide specified. Included with the discussion of code features, a single PWR fuel pin calculation with the burnup code is performed and detailed with a benchmark comparison to PIE (Post-Irradiation Examination) data within the SFCOMPO (Spent Fuel Composition / NEA) database, and also with burnup codes in SCALE5.1. Conclusions within the paper detail, in PENBURN, the accuracy of major actinides, flux profile behavior as a function of burnup, and criticality calculations for the PWR fuel pin model. (authors)

  1. Analysis of transient plasmonic interactions using an MOT-PMCHWT integral equation solver

    KAUST Repository

    Uysal, Ismail Enes

    2014-07-01

    Device design involving metals and dielectrics at nano-scales and optical frequencies calls for simulation tools capable of analyzing plasmonic interactions. To this end finite difference time domain (FDTD) and finite element methods have been used extensively. Since these methods require volumetric meshes, the discretization size should be very small to accurately resolve fast-decaying fields in the vicinity of metal/dielectric interfaces. This can be avoided using integral equation (IE) techniques that discretize only on the interfaces. Additionally, IE solvers implicitly enforce the radiation condition and consequently do not need (approximate) absorbing boundary conditions. Despite these advantages, IE solvers, especially in time domain, have not been used for analyzing plasmonic interactions.

  2. Identification of severe wind conditions using a Reynolds Averaged Navier-Stokes solver

    International Nuclear Information System (INIS)

    Soerensen, N N; Bechmann, A; Johansen, J; Myllerup, L; Botha, P; Vinther, S; Nielsen, B S

    2007-01-01

    The present paper describes the application of a Navier-Stokes solver to predict the presence of severe flow conditions in complex terrain, capturing conditions that may be critical to the siting of wind turbines in the terrain. First it is documented that the flow solver is capable of predicting the flow in the complex terrain by comparing with measurements from two meteorology masts. Next, it is illustrated how levels of turbulent kinetic energy can be used to easily identify areas with severe flow conditions, relying on a high correlation between high turbulence intensity and severe flow conditions, in the form of high wind shear and directional shear which may seriously lower the lifetime of a wind turbine

  3. Efficient collaborative sparse channel estimation in massive MIMO

    KAUST Repository

    Masood, Mudassir

    2015-08-12

    We propose a method for estimation of sparse frequency selective channels within MIMO-OFDM systems. These channels are independently sparse and share a common support. The method estimates the impulse response for each channel observed by the antennas at the receiver. Estimation is performed in a coordinated manner by sharing minimal information among neighboring antennas to achieve results better than many contemporary methods. Simulations demonstrate the superior performance of the proposed method.

  4. Efficient collaborative sparse channel estimation in massive MIMO

    KAUST Repository

    Masood, Mudassir; Afify, Laila H.; Al-Naffouri, Tareq Y.

    2015-01-01

    We propose a method for estimation of sparse frequency selective channels within MIMO-OFDM systems. These channels are independently sparse and share a common support. The method estimates the impulse response for each channel observed by the antennas at the receiver. Estimation is performed in a coordinated manner by sharing minimal information among neighboring antennas to achieve results better than many contemporary methods. Simulations demonstrate the superior performance of the proposed method.

  5. Sparse dictionary learning of resting state fMRI networks.

    Science.gov (United States)

    Eavani, Harini; Filipovych, Roman; Davatzikos, Christos; Satterthwaite, Theodore D; Gur, Raquel E; Gur, Ruben C

    2012-07-02

    Research in resting state fMRI (rsfMRI) has revealed the presence of stable, anti-correlated functional subnetworks in the brain. Task-positive networks are active during a cognitive process and are anti-correlated with task-negative networks, which are active during rest. In this paper, based on the assumption that the structure of the resting state functional brain connectivity is sparse, we utilize sparse dictionary modeling to identify distinct functional sub-networks. We propose two ways of formulating the sparse functional network learning problem that characterize the underlying functional connectivity from different perspectives. Our results show that the whole-brain functional connectivity can be concisely represented with highly modular, overlapping task-positive/negative pairs of sub-networks.

  6. Low-Rank Sparse Coding for Image Classification

    KAUST Repository

    Zhang, Tianzhu; Ghanem, Bernard; Liu, Si; Xu, Changsheng; Ahuja, Narendra

    2013-01-01

    In this paper, we propose a low-rank sparse coding (LRSC) method that exploits local structure information among features in an image for the purpose of image-level classification. LRSC represents densely sampled SIFT descriptors, in a spatial neighborhood, collectively as low-rank, sparse linear combinations of code words. As such, it casts the feature coding problem as a low-rank matrix learning problem, which is different from previous methods that encode features independently. This LRSC has a number of attractive properties. (1) It encourages sparsity in feature codes, locality in codebook construction, and low-rankness for spatial consistency. (2) LRSC encodes local features jointly by considering their low-rank structure information, and is computationally attractive. We evaluate the LRSC by comparing its performance on a set of challenging benchmarks with that of 7 popular coding and other state-of-the-art methods. Our experiments show that by representing local features jointly, LRSC not only outperforms the state-of-the-art in classification accuracy but also improves the time complexity of methods that use a similar sparse linear representation model for feature coding.

  7. Low-Rank Sparse Coding for Image Classification

    KAUST Repository

    Zhang, Tianzhu

    2013-12-01

    In this paper, we propose a low-rank sparse coding (LRSC) method that exploits local structure information among features in an image for the purpose of image-level classification. LRSC represents densely sampled SIFT descriptors, in a spatial neighborhood, collectively as low-rank, sparse linear combinations of code words. As such, it casts the feature coding problem as a low-rank matrix learning problem, which is different from previous methods that encode features independently. This LRSC has a number of attractive properties. (1) It encourages sparsity in feature codes, locality in codebook construction, and low-rankness for spatial consistency. (2) LRSC encodes local features jointly by considering their low-rank structure information, and is computationally attractive. We evaluate the LRSC by comparing its performance on a set of challenging benchmarks with that of 7 popular coding and other state-of-the-art methods. Our experiments show that by representing local features jointly, LRSC not only outperforms the state-of-the-art in classification accuracy but also improves the time complexity of methods that use a similar sparse linear representation model for feature coding.

  8. Regularized generalized eigen-decomposition with applications to sparse supervised feature extraction and sparse discriminant analysis

    DEFF Research Database (Denmark)

    Han, Xixuan; Clemmensen, Line Katrine Harder

    2015-01-01

    We propose a general technique for obtaining sparse solutions to generalized eigenvalue problems, and call it Regularized Generalized Eigen-Decomposition (RGED). For decades, Fisher's discriminant criterion has been applied in supervised feature extraction and discriminant analysis, and it is for...

  9. A performance study of sparse Cholesky factorization on INTEL iPSC/860

    Science.gov (United States)

    Zubair, M.; Ghose, M.

    1992-01-01

    The problem of Cholesky factorization of a sparse matrix has been very well investigated on sequential machines. A number of efficient codes exist for factorizing large unstructured sparse matrices. However, there is a lack of such efficient codes on parallel machines in general, and distributed machines in particular. Some of the issues that are critical to the implementation of sparse Cholesky factorization on a distributed memory parallel machine are ordering, partitioning and mapping, load balancing, and ordering of various tasks within a processor. Here, we focus on the effect of various partitioning schemes on the performance of sparse Cholesky factorization on the Intel iPSC/860. Also, a new partitioning heuristic for structured as well as unstructured sparse matrices is proposed, and its performance is compared with other schemes.

  10. A Numerical Study of Scalable Cardiac Electro-Mechanical Solvers on HPC Architectures

    Directory of Open Access Journals (Sweden)

    Piero Colli Franzone

    2018-04-01

    Full Text Available We introduce and study some scalable domain decomposition preconditioners for cardiac electro-mechanical 3D simulations on parallel HPC (High Performance Computing architectures. The electro-mechanical model of the cardiac tissue is composed of four coupled sub-models: (1 the static finite elasticity equations for the transversely isotropic deformation of the cardiac tissue; (2 the active tension model describing the dynamics of the intracellular calcium, cross-bridge binding and myofilament tension; (3 the anisotropic Bidomain model describing the evolution of the intra- and extra-cellular potentials in the deforming cardiac tissue; and (4 the ionic membrane model describing the dynamics of ionic currents, gating variables, ionic concentrations and stretch-activated channels. This strongly coupled electro-mechanical model is discretized in time with a splitting semi-implicit technique and in space with isoparametric finite elements. The resulting scalable parallel solver is based on Multilevel Additive Schwarz preconditioners for the solution of the Bidomain system and on BDDC preconditioned Newton-Krylov solvers for the non-linear finite elasticity system. The results of several 3D parallel simulations show the scalability of both linear and non-linear solvers and their application to the study of both physiological excitation-contraction cardiac dynamics and re-entrant waves in the presence of different mechano-electrical feedbacks.

  11. Fast sparsely synchronized brain rhythms in a scale-free neural network.

    Science.gov (United States)

    Kim, Sang-Yoon; Lim, Woochang

    2015-08-01

    We consider a directed version of the Barabási-Albert scale-free network model with symmetric preferential attachment with the same in- and out-degrees and study the emergence of sparsely synchronized rhythms for a fixed attachment degree in an inhibitory population of fast-spiking Izhikevich interneurons. Fast sparsely synchronized rhythms with stochastic and intermittent neuronal discharges are found to appear for large values of J (synaptic inhibition strength) and D (noise intensity). For an intensive study we fix J at a sufficiently large value and investigate the population states by increasing D. For small D, full synchronization with the same population-rhythm frequency fp and mean firing rate (MFR) fi of individual neurons occurs, while for large D partial synchronization with fp>〈fi〉 (〈fi〉: ensemble-averaged MFR) appears due to intermittent discharge of individual neurons; in particular, the case of fp>4〈fi〉 is referred to as sparse synchronization. For the case of partial and sparse synchronization, MFRs of individual neurons vary depending on their degrees. As D passes a critical value D* (which is determined by employing an order parameter), a transition to unsynchronization occurs due to the destructive role of noise to spoil the pacing between sparse spikes. For Dsparse synchronization do contributions of individual neuronal dynamics to population synchronization change depending on their degrees, unlike in the case of full synchronization. Consequently, dynamics of individual neurons reveal the inhomogeneous network structure for the case of partial and sparse synchronization, which is in contrast to the case of

  12. Fast sparsely synchronized brain rhythms in a scale-free neural network

    Science.gov (United States)

    Kim, Sang-Yoon; Lim, Woochang

    2015-08-01

    We consider a directed version of the Barabási-Albert scale-free network model with symmetric preferential attachment with the same in- and out-degrees and study the emergence of sparsely synchronized rhythms for a fixed attachment degree in an inhibitory population of fast-spiking Izhikevich interneurons. Fast sparsely synchronized rhythms with stochastic and intermittent neuronal discharges are found to appear for large values of J (synaptic inhibition strength) and D (noise intensity). For an intensive study we fix J at a sufficiently large value and investigate the population states by increasing D . For small D , full synchronization with the same population-rhythm frequency fp and mean firing rate (MFR) fi of individual neurons occurs, while for large D partial synchronization with fp> ( : ensemble-averaged MFR) appears due to intermittent discharge of individual neurons; in particular, the case of fp>4 is referred to as sparse synchronization. For the case of partial and sparse synchronization, MFRs of individual neurons vary depending on their degrees. As D passes a critical value D* (which is determined by employing an order parameter), a transition to unsynchronization occurs due to the destructive role of noise to spoil the pacing between sparse spikes. For D sparse synchronization do contributions of individual neuronal dynamics to population synchronization change depending on their degrees, unlike in the case of full synchronization. Consequently, dynamics of individual neurons reveal the inhomogeneous network structure for the case of partial and sparse synchronization, which is in contrast to the case of statistically homogeneous

  13. l1- and l2-Norm Joint Regularization Based Sparse Signal Reconstruction Scheme

    Directory of Open Access Journals (Sweden)

    Chanzi Liu

    2016-01-01

    Full Text Available Many problems in signal processing and statistical inference involve finding sparse solution to some underdetermined linear system of equations. This is also the application condition of compressive sensing (CS which can find the sparse solution from the measurements far less than the original signal. In this paper, we propose l1- and l2-norm joint regularization based reconstruction framework to approach the original l0-norm based sparseness-inducing constrained sparse signal reconstruction problem. Firstly, it is shown that, by employing the simple conjugate gradient algorithm, the new formulation provides an effective framework to deduce the solution as the original sparse signal reconstruction problem with l0-norm regularization item. Secondly, the upper reconstruction error limit is presented for the proposed sparse signal reconstruction framework, and it is unveiled that a smaller reconstruction error than l1-norm relaxation approaches can be realized by using the proposed scheme in most cases. Finally, simulation results are presented to validate the proposed sparse signal reconstruction approach.

  14. Capability of State-of-the-Art Navier-Stokes Solvers for the Prediction of Helicopter Fuselage Aerodynamics

    DEFF Research Database (Denmark)

    N., Kroll; P., Renzoni; M., Amato

    1998-01-01

    The purpose of this paper is to describe the influence of different Navier-Stokes solvers and grids on the prediction of the global coefficients for a simplified geometry of a helicopter fuselage.......The purpose of this paper is to describe the influence of different Navier-Stokes solvers and grids on the prediction of the global coefficients for a simplified geometry of a helicopter fuselage....

  15. Image fusion via nonlocal sparse K-SVD dictionary learning.

    Science.gov (United States)

    Li, Ying; Li, Fangyi; Bai, Bendu; Shen, Qiang

    2016-03-01

    Image fusion aims to merge two or more images captured via various sensors of the same scene to construct a more informative image by integrating their details. Generally, such integration is achieved through the manipulation of the representations of the images concerned. Sparse representation plays an important role in the effective description of images, offering a great potential in a variety of image processing tasks, including image fusion. Supported by sparse representation, in this paper, an approach for image fusion by the use of a novel dictionary learning scheme is proposed. The nonlocal self-similarity property of the images is exploited, not only at the stage of learning the underlying description dictionary but during the process of image fusion. In particular, the property of nonlocal self-similarity is combined with the traditional sparse dictionary. This results in an improved learned dictionary, hereafter referred to as the nonlocal sparse K-SVD dictionary (where K-SVD stands for the K times singular value decomposition that is commonly used in the literature), and abbreviated to NL_SK_SVD. The performance of the NL_SK_SVD dictionary is applied for image fusion using simultaneous orthogonal matching pursuit. The proposed approach is evaluated with different types of images, and compared with a number of alternative image fusion techniques. The resultant superior fused images using the present approach demonstrates the efficacy of the NL_SK_SVD dictionary in sparse image representation.

  16. Detection of Pitting in Gears Using a Deep Sparse Autoencoder

    Directory of Open Access Journals (Sweden)

    Yongzhi Qu

    2017-05-01

    Full Text Available In this paper; a new method for gear pitting fault detection is presented. The presented method is developed based on a deep sparse autoencoder. The method integrates dictionary learning in sparse coding into a stacked autoencoder network. Sparse coding with dictionary learning is viewed as an adaptive feature extraction method for machinery fault diagnosis. An autoencoder is an unsupervised machine learning technique. A stacked autoencoder network with multiple hidden layers is considered to be a deep learning network. The presented method uses a stacked autoencoder network to perform the dictionary learning in sparse coding and extract features from raw vibration data automatically. These features are then used to perform gear pitting fault detection. The presented method is validated with vibration data collected from gear tests with pitting faults in a gearbox test rig and compared with an existing deep learning-based approach.

  17. Extension of the GeN-Foam neutronic solver to SP3 analysis and application to the CROCUS experimental reactor

    International Nuclear Information System (INIS)

    Fiorina, Carlo; Hursin, Mathieu; Pautz, Andreas

    2017-01-01

    Highlights: • Development and verification of an SP 3 solver based on OpenFOAM. • Integration into the GeN-Foam multi-physics platform. • Application of the new GeN-Foam SP 3 solver to the CROCUS reactor. - Abstract: The Laboratory for Reactor Physics and Systems Behaviour at the PSI and at the EPFL has been developing since 2013 a multi-physics platform for coupled reactor analysis named GeN-Foam. The developed tool includes a solver for the eigenvalue and transient solution of multi-group neutron diffusion equations. Although frequently used in reactor analysis, the diffusion theory shows some limitations for core configurations involving strong anisotropies, which is the case for the CROCUS research reactor at the EPFL. The use of an SP 3 approximation to neutron transport can often lead to visible improvements in a code predictive capabilities, especially for one-directional anisotropies, with acceptable added computational cost vs diffusion. Following some modelling issues for the CROCUS reactor, and in order to improve the GeN-Foam modelling capabilities, the GeN-Foam diffusion solver has been extended to allow for SP 3 analyses. The present paper describes such extension and a preliminary verification using a mini-core PWR benchmark. The newly developed solver is then applied to the analysis of the CROCUS experimental reactor and results are compared to Monte Carlo calculations, as well as to the results of the diffusion solver.

  18. A fast Poisson solver for unsteady incompressible Navier-Stokes equations on the half-staggered grid

    Science.gov (United States)

    Golub, G. H.; Huang, L. C.; Simon, H.; Tang, W. -P.

    1995-01-01

    In this paper, a fast Poisson solver for unsteady, incompressible Navier-Stokes equations with finite difference methods on the non-uniform, half-staggered grid is presented. To achieve this, new algorithms for diagonalizing a semi-definite pair are developed. Our fast solver can also be extended to the three dimensional case. The motivation and related issues in using this second kind of staggered grid are also discussed. Numerical testing has indicated the effectiveness of this algorithm.

  19. In-Storage Embedded Accelerator for Sparse Pattern Processing

    OpenAIRE

    Jun, Sang-Woo; Nguyen, Huy T.; Gadepally, Vijay N.; Arvind

    2016-01-01

    We present a novel architecture for sparse pattern processing, using flash storage with embedded accelerators. Sparse pattern processing on large data sets is the essence of applications such as document search, natural language processing, bioinformatics, subgraph matching, machine learning, and graph processing. One slice of our prototype accelerator is capable of handling up to 1TB of data, and experiments show that it can outperform C/C++ software solutions on a 16-core system at a fracti...

  20. Process Knowledge Discovery Using Sparse Principal Component Analysis

    DEFF Research Database (Denmark)

    Gao, Huihui; Gajjar, Shriram; Kulahci, Murat

    2016-01-01

    As the goals of ensuring process safety and energy efficiency become ever more challenging, engineers increasingly rely on data collected from such processes for informed decision making. During recent decades, extracting and interpreting valuable process information from large historical data sets...... SPCA approach that helps uncover the underlying process knowledge regarding variable relations. This approach systematically determines the optimal sparse loadings for each sparse PC while improving interpretability and minimizing information loss. The salient features of the proposed approach...

  1. Calm water resistance prediction of a bulk carrier using Reynolds averaged Navier-Stokes based solver

    Science.gov (United States)

    Rahaman, Md. Mashiur; Islam, Hafizul; Islam, Md. Tariqul; Khondoker, Md. Reaz Hasan

    2017-12-01

    Maneuverability and resistance prediction with suitable accuracy is essential for optimum ship design and propulsion power prediction. This paper aims at providing some of the maneuverability characteristics of a Japanese bulk carrier model, JBC in calm water using a computational fluid dynamics solver named SHIP Motion and OpenFOAM. The solvers are based on the Reynolds average Navier-Stokes method (RaNS) and solves structured grid using the Finite Volume Method (FVM). This paper comprises the numerical results of calm water test for the JBC model with available experimental results. The calm water test results include the total drag co-efficient, average sinkage, and trim data. Visualization data for pressure distribution on the hull surface and free water surface have also been included. The paper concludes that the presented solvers predict the resistance and maneuverability characteristics of the bulk carrier with reasonable accuracy utilizing minimum computational resources.

  2. Massively parallel sparse matrix function calculations with NTPoly

    Science.gov (United States)

    Dawson, William; Nakajima, Takahito

    2018-04-01

    We present NTPoly, a massively parallel library for computing the functions of sparse, symmetric matrices. The theory of matrix functions is a well developed framework with a wide range of applications including differential equations, graph theory, and electronic structure calculations. One particularly important application area is diagonalization free methods in quantum chemistry. When the input and output of the matrix function are sparse, methods based on polynomial expansions can be used to compute matrix functions in linear time. We present a library based on these methods that can compute a variety of matrix functions. Distributed memory parallelization is based on a communication avoiding sparse matrix multiplication algorithm. OpenMP task parallellization is utilized to implement hybrid parallelization. We describe NTPoly's interface and show how it can be integrated with programs written in many different programming languages. We demonstrate the merits of NTPoly by performing large scale calculations on the K computer.

  3. Minaret, a deterministic neutron transport solver for nuclear core calculations

    Energy Technology Data Exchange (ETDEWEB)

    Moller, J-Y.; Lautard, J-J., E-mail: jean-yves.moller@cea.fr, E-mail: jean-jacques.lautard@cea.fr [CEA - Centre de Saclay , Gif sur Yvette (France)

    2011-07-01

    We present here MINARET a deterministic transport solver for nuclear core calculations to solve the steady state Boltzmann equation. The code follows the multi-group formalism to discretize the energy variable. It uses discrete ordinate method to deal with the angular variable and a DGFEM to solve spatially the Boltzmann equation. The mesh is unstructured in 2D and semi-unstructured in 3D (cylindrical). Curved triangles can be used to fit the exact geometry. For the curved elements, two different sets of basis functions can be used. Transport solver is accelerated with a DSA method. Diffusion and SPN calculations are made possible by skipping the transport sweep in the source iteration. The transport calculations are parallelized with respect to the angular directions. Numerical results are presented for simple geometries and for the C5G7 Benchmark, JHR reactor and the ESFR (in 2D and 3D). Straight and curved finite element results are compared. (author)

  4. CASTRO: A NEW COMPRESSIBLE ASTROPHYSICAL SOLVER. III. MULTIGROUP RADIATION HYDRODYNAMICS

    International Nuclear Information System (INIS)

    Zhang, W.; Almgren, A.; Bell, J.; Howell, L.; Burrows, A.; Dolence, J.

    2013-01-01

    We present a formulation for multigroup radiation hydrodynamics that is correct to order O(v/c) using the comoving-frame approach and the flux-limited diffusion approximation. We describe a numerical algorithm for solving the system, implemented in the compressible astrophysics code, CASTRO. CASTRO uses a Eulerian grid with block-structured adaptive mesh refinement based on a nested hierarchy of logically rectangular variable-sized grids with simultaneous refinement in both space and time. In our multigroup radiation solver, the system is split into three parts: one part that couples the radiation and fluid in a hyperbolic subsystem, another part that advects the radiation in frequency space, and a parabolic part that evolves radiation diffusion and source-sink terms. The hyperbolic subsystem and the frequency space advection are solved explicitly with high-order Godunov schemes, whereas the parabolic part is solved implicitly with a first-order backward Euler method. Our multigroup radiation solver works for both neutrino and photon radiation.

  5. Minaret, a deterministic neutron transport solver for nuclear core calculations

    International Nuclear Information System (INIS)

    Moller, J-Y.; Lautard, J-J.

    2011-01-01

    We present here MINARET a deterministic transport solver for nuclear core calculations to solve the steady state Boltzmann equation. The code follows the multi-group formalism to discretize the energy variable. It uses discrete ordinate method to deal with the angular variable and a DGFEM to solve spatially the Boltzmann equation. The mesh is unstructured in 2D and semi-unstructured in 3D (cylindrical). Curved triangles can be used to fit the exact geometry. For the curved elements, two different sets of basis functions can be used. Transport solver is accelerated with a DSA method. Diffusion and SPN calculations are made possible by skipping the transport sweep in the source iteration. The transport calculations are parallelized with respect to the angular directions. Numerical results are presented for simple geometries and for the C5G7 Benchmark, JHR reactor and the ESFR (in 2D and 3D). Straight and curved finite element results are compared. (author)

  6. Deformable segmentation via sparse representation and dictionary learning.

    Science.gov (United States)

    Zhang, Shaoting; Zhan, Yiqiang; Metaxas, Dimitris N

    2012-10-01

    "Shape" and "appearance", the two pillars of a deformable model, complement each other in object segmentation. In many medical imaging applications, while the low-level appearance information is weak or mis-leading, shape priors play a more important role to guide a correct segmentation, thanks to the strong shape characteristics of biological structures. Recently a novel shape prior modeling method has been proposed based on sparse learning theory. Instead of learning a generative shape model, shape priors are incorporated on-the-fly through the sparse shape composition (SSC). SSC is robust to non-Gaussian errors and still preserves individual shape characteristics even when such characteristics is not statistically significant. Although it seems straightforward to incorporate SSC into a deformable segmentation framework as shape priors, the large-scale sparse optimization of SSC has low runtime efficiency, which cannot satisfy clinical requirements. In this paper, we design two strategies to decrease the computational complexity of SSC, making a robust, accurate and efficient deformable segmentation system. (1) When the shape repository contains a large number of instances, which is often the case in 2D problems, K-SVD is used to learn a more compact but still informative shape dictionary. (2) If the derived shape instance has a large number of vertices, which often appears in 3D problems, an affinity propagation method is used to partition the surface into small sub-regions, on which the sparse shape composition is performed locally. Both strategies dramatically decrease the scale of the sparse optimization problem and hence speed up the algorithm. Our method is applied on a diverse set of biomedical image analysis problems. Compared to the original SSC, these two newly-proposed modules not only significant reduce the computational complexity, but also improve the overall accuracy. Copyright © 2012 Elsevier B.V. All rights reserved.

  7. Sparseness- and continuity-constrained seismic imaging

    Science.gov (United States)

    Herrmann, Felix J.

    2005-04-01

    Non-linear solution strategies to the least-squares seismic inverse-scattering problem with sparseness and continuity constraints are proposed. Our approach is designed to (i) deal with substantial amounts of additive noise (SNR formulating the solution of the seismic inverse problem in terms of an optimization problem. During the optimization, sparseness on the basis and continuity along the reflectors are imposed by jointly minimizing the l1- and anisotropic diffusion/total-variation norms on the coefficients and reflectivity, respectively. [Joint work with Peyman P. Moghaddam was carried out as part of the SINBAD project, with financial support secured through ITF (the Industry Technology Facilitator) from the following organizations: BG Group, BP, ExxonMobil, and SHELL. Additional funding came from the NSERC Discovery Grants 22R81254.

  8. Combinatorial Algorithms for Computing Column Space Bases ThatHave Sparse Inverses

    Energy Technology Data Exchange (ETDEWEB)

    Pinar, Ali; Chow, Edmond; Pothen, Alex

    2005-03-18

    This paper presents a combinatorial study on the problem ofconstructing a sparse basis forthe null-space of a sparse, underdetermined, full rank matrix, A. Such a null-space is suitable forsolving solving many saddle point problems. Our approach is to form acolumn space basis of A that has a sparse inverse, by selecting suitablecolumns of A. This basis is then used to form a sparse null-space basisin fundamental form. We investigate three different algorithms forcomputing the column space basis: Two greedy approaches that rely onmatching, and a third employing a divide and conquer strategy implementedwith hypergraph partitioning followed by the greedy approach. We alsodiscuss the complexity of selecting a column basis when it is known thata block diagonal basis exists with a small given block size.

  9. Image Super-Resolution Algorithm Based on an Improved Sparse Autoencoder

    Directory of Open Access Journals (Sweden)

    Detian Huang

    2018-01-01

    Full Text Available Due to the limitations of the resolution of the imaging system and the influence of scene changes and other factors, sometimes only low-resolution images can be acquired, which cannot satisfy the practical application’s requirements. To improve the quality of low-resolution images, a novel super-resolution algorithm based on an improved sparse autoencoder is proposed. Firstly, in the training set preprocessing stage, the high- and low-resolution image training sets are constructed, respectively, by using high-frequency information of the training samples as the characterization, and then the zero-phase component analysis whitening technique is utilized to decorrelate the formed joint training set to reduce its redundancy. Secondly, a constructed sparse regularization term is added to the cost function of the traditional sparse autoencoder to further strengthen the sparseness constraint on the hidden layer. Finally, in the dictionary learning stage, the improved sparse autoencoder is adopted to achieve unsupervised dictionary learning to improve the accuracy and stability of the dictionary. Experimental results validate that the proposed algorithm outperforms the existing algorithms both in terms of the subjective visual perception and the objective evaluation indices, including the peak signal-to-noise ratio and the structural similarity measure.

  10. Diagnosis and prognosis of Ostheoarthritis by texture analysis using sparse linear models

    DEFF Research Database (Denmark)

    Marques, Joselene; Clemmensen, Line Katrine Harder; Dam, Erik

    We present a texture analysis methodology that combines uncommitted machine-learning techniques and sparse feature transformation methods in a fully automatic framework. We compare the performances of a partial least squares (PLS) forward feature selection strategy to a hard threshold sparse PLS...... algorithm and a sparse linear discriminant model. The texture analysis framework was applied to diagnosis of knee osteoarthritis (OA) and prognosis of cartilage loss. For this investigation, a generic texture feature bank was extracted from magnetic resonance images of tibial knee bone. The features were...... used as input to the sparse algorithms, which dened the best features to retain in the model. To cope with the limited number of samples, the data was evaluated using 10 fold cross validation (CV). The diagnosis evaluation using sparse PLS reached a generalization area-under-the-ROC curve (AUC) of 0...

  11. Efficient implementations of block sparse matrix operations on shared memory vector machines

    International Nuclear Information System (INIS)

    Washio, T.; Maruyama, K.; Osoda, T.; Doi, S.; Shimizu, F.

    2000-01-01

    In this paper, we propose vectorization and shared memory-parallelization techniques for block-type random sparse matrix operations in finite element (FEM) applications. Here, a block corresponds to unknowns on one node in the FEM mesh and we assume that the block size is constant over the mesh. First, we discuss some basic vectorization ideas (the jagged diagonal (JAD) format and the segmented scan algorithm) for the sparse matrix-vector product. Then, we extend these ideas to the shared memory parallelization. After that, we show that the techniques can be applied not only to the sparse matrix-vector product but also to the sparse matrix-matrix product, the incomplete or complete sparse LU factorization and preconditioning. Finally, we report the performance evaluation results obtained on an NEC SX-4 shared memory vector machine for linear systems in some FEM applications. (author)

  12. Convergence acceleration of Navier-Stokes equation using adaptive wavelet method

    International Nuclear Information System (INIS)

    Kang, Hyung Min; Ghafoor, Imran; Lee, Do Hyung

    2010-01-01

    An efficient adaptive wavelet method is proposed for the enhancement of computational efficiency of the Navier-Stokes equations. The method is based on sparse point representation (SPR), which uses the wavelet decomposition and thresholding to obtain a sparsely distributed dataset. The threshold mechanism is modified in order to maintain the spatial accuracy of a conventional Navier-Stokes solver by adapting the threshold value to the order of spatial truncation error. The computational grid can be dynamically adapted to a transient solution to reflect local changes in the solution. The flux evaluation is then carried out only at the points of the adapted dataset, which reduces the computational effort and memory requirements. A stabilization technique is also implemented to avoid the additional numerical errors introduced by the threshold procedure. The numerical results of the adaptive wavelet method are compared with a conventional solver to validate the enhancement in computational efficiency of Navier-Stokes equations without the degeneration of the numerical accuracy of a conventional solver

  13. Identification of MIMO systems with sparse transfer function coefficients

    Science.gov (United States)

    Qiu, Wanzhi; Saleem, Syed Khusro; Skafidas, Efstratios

    2012-12-01

    We study the problem of estimating transfer functions of multivariable (multiple-input multiple-output--MIMO) systems with sparse coefficients. We note that subspace identification methods are powerful and convenient tools in dealing with MIMO systems since they neither require nonlinear optimization nor impose any canonical form on the systems. However, subspace-based methods are inefficient for systems with sparse transfer function coefficients since they work on state space models. We propose a two-step algorithm where the first step identifies the system order using the subspace principle in a state space format, while the second step estimates coefficients of the transfer functions via L1-norm convex optimization. The proposed algorithm retains good features of subspace methods with improved noise-robustness for sparse systems.

  14. MULTISCALE SPARSE APPEARANCE MODELING AND SIMULATION OF PATHOLOGICAL DEFORMATIONS

    Directory of Open Access Journals (Sweden)

    Rami Zewail

    2017-08-01

    Full Text Available Machine learning and statistical modeling techniques has drawn much interest within the medical imaging research community. However, clinically-relevant modeling of anatomical structures continues to be a challenging task. This paper presents a novel method for multiscale sparse appearance modeling in medical images with application to simulation of pathological deformations in X-ray images of human spine. The proposed appearance model benefits from the non-linear approximation power of Contourlets and its ability to capture higher order singularities to achieve a sparse representation while preserving the accuracy of the statistical model. Independent Component Analysis is used to extract statistical independent modes of variations from the sparse Contourlet-based domain. The new model is then used to simulate clinically-relevant pathological deformations in radiographic images.

  15. An Adaptive Sparse Grid Algorithm for Elliptic PDEs with Lognormal Diffusion Coefficient

    KAUST Repository

    Nobile, Fabio

    2016-03-18

    In this work we build on the classical adaptive sparse grid algorithm (T. Gerstner and M. Griebel, Dimension-adaptive tensor-product quadrature), obtaining an enhanced version capable of using non-nested collocation points, and supporting quadrature and interpolation on unbounded sets. We also consider several profit indicators that are suitable to drive the adaptation process. We then use such algorithm to solve an important test case in Uncertainty Quantification problem, namely the Darcy equation with lognormal permeability random field, and compare the results with those obtained with the quasi-optimal sparse grids based on profit estimates, which we have proposed in our previous works (cf. e.g. Convergence of quasi-optimal sparse grids approximation of Hilbert-valued functions: application to random elliptic PDEs). To treat the case of rough permeability fields, in which a sparse grid approach may not be suitable, we propose to use the adaptive sparse grid quadrature as a control variate in a Monte Carlo simulation. Numerical results show that the adaptive sparse grids have performances similar to those of the quasi-optimal sparse grids and are very effective in the case of smooth permeability fields. Moreover, their use as control variate in a Monte Carlo simulation allows to tackle efficiently also problems with rough coefficients, significantly improving the performances of a standard Monte Carlo scheme.

  16. Sparse principal component analysis in medical shape modeling

    Science.gov (United States)

    Sjöstrand, Karl; Stegmann, Mikkel B.; Larsen, Rasmus

    2006-03-01

    Principal component analysis (PCA) is a widely used tool in medical image analysis for data reduction, model building, and data understanding and exploration. While PCA is a holistic approach where each new variable is a linear combination of all original variables, sparse PCA (SPCA) aims at producing easily interpreted models through sparse loadings, i.e. each new variable is a linear combination of a subset of the original variables. One of the aims of using SPCA is the possible separation of the results into isolated and easily identifiable effects. This article introduces SPCA for shape analysis in medicine. Results for three different data sets are given in relation to standard PCA and sparse PCA by simple thresholding of small loadings. Focus is on a recent algorithm for computing sparse principal components, but a review of other approaches is supplied as well. The SPCA algorithm has been implemented using Matlab and is available for download. The general behavior of the algorithm is investigated, and strengths and weaknesses are discussed. The original report on the SPCA algorithm argues that the ordering of modes is not an issue. We disagree on this point and propose several approaches to establish sensible orderings. A method that orders modes by decreasing variance and maximizes the sum of variances for all modes is presented and investigated in detail.

  17. Inference algorithms and learning theory for Bayesian sparse factor analysis

    International Nuclear Information System (INIS)

    Rattray, Magnus; Sharp, Kevin; Stegle, Oliver; Winn, John

    2009-01-01

    Bayesian sparse factor analysis has many applications; for example, it has been applied to the problem of inferring a sparse regulatory network from gene expression data. We describe a number of inference algorithms for Bayesian sparse factor analysis using a slab and spike mixture prior. These include well-established Markov chain Monte Carlo (MCMC) and variational Bayes (VB) algorithms as well as a novel hybrid of VB and Expectation Propagation (EP). For the case of a single latent factor we derive a theory for learning performance using the replica method. We compare the MCMC and VB/EP algorithm results with simulated data to the theoretical prediction. The results for MCMC agree closely with the theory as expected. Results for VB/EP are slightly sub-optimal but show that the new algorithm is effective for sparse inference. In large-scale problems MCMC is infeasible due to computational limitations and the VB/EP algorithm then provides a very useful computationally efficient alternative.

  18. Inference algorithms and learning theory for Bayesian sparse factor analysis

    Energy Technology Data Exchange (ETDEWEB)

    Rattray, Magnus; Sharp, Kevin [School of Computer Science, University of Manchester, Manchester M13 9PL (United Kingdom); Stegle, Oliver [Max-Planck-Institute for Biological Cybernetics, Tuebingen (Germany); Winn, John, E-mail: magnus.rattray@manchester.ac.u [Microsoft Research Cambridge, Roger Needham Building, Cambridge, CB3 0FB (United Kingdom)

    2009-12-01

    Bayesian sparse factor analysis has many applications; for example, it has been applied to the problem of inferring a sparse regulatory network from gene expression data. We describe a number of inference algorithms for Bayesian sparse factor analysis using a slab and spike mixture prior. These include well-established Markov chain Monte Carlo (MCMC) and variational Bayes (VB) algorithms as well as a novel hybrid of VB and Expectation Propagation (EP). For the case of a single latent factor we derive a theory for learning performance using the replica method. We compare the MCMC and VB/EP algorithm results with simulated data to the theoretical prediction. The results for MCMC agree closely with the theory as expected. Results for VB/EP are slightly sub-optimal but show that the new algorithm is effective for sparse inference. In large-scale problems MCMC is infeasible due to computational limitations and the VB/EP algorithm then provides a very useful computationally efficient alternative.

  19. Sparse Jacobian construction for mapped grid visco-resistive magnetohydrodynamics

    KAUST Repository

    Reynolds, Daniel R.; Samtaney, Ravi

    2012-01-01

    employs a fully implicit formulation in time, and a mapped finite volume spatial discretization. We solve this model using inexact Newton-Krylov methods. Of critical importance in these iterative solvers is the development of an effective preconditioner

  20. Universal Regularizers For Robust Sparse Coding and Modeling

    OpenAIRE

    Ramirez, Ignacio; Sapiro, Guillermo

    2010-01-01

    Sparse data models, where data is assumed to be well represented as a linear combination of a few elements from a dictionary, have gained considerable attention in recent years, and their use has led to state-of-the-art results in many signal and image processing tasks. It is now well understood that the choice of the sparsity regularization term is critical in the success of such models. Based on a codelength minimization interpretation of sparse coding, and using tools from universal coding...

  1. Deep ensemble learning of sparse regression models for brain disease diagnosis.

    Science.gov (United States)

    Suk, Heung-Il; Lee, Seong-Whan; Shen, Dinggang

    2017-04-01

    Recent studies on brain imaging analysis witnessed the core roles of machine learning techniques in computer-assisted intervention for brain disease diagnosis. Of various machine-learning techniques, sparse regression models have proved their effectiveness in handling high-dimensional data but with a small number of training samples, especially in medical problems. In the meantime, deep learning methods have been making great successes by outperforming the state-of-the-art performances in various applications. In this paper, we propose a novel framework that combines the two conceptually different methods of sparse regression and deep learning for Alzheimer's disease/mild cognitive impairment diagnosis and prognosis. Specifically, we first train multiple sparse regression models, each of which is trained with different values of a regularization control parameter. Thus, our multiple sparse regression models potentially select different feature subsets from the original feature set; thereby they have different powers to predict the response values, i.e., clinical label and clinical scores in our work. By regarding the response values from our sparse regression models as target-level representations, we then build a deep convolutional neural network for clinical decision making, which thus we call 'Deep Ensemble Sparse Regression Network.' To our best knowledge, this is the first work that combines sparse regression models with deep neural network. In our experiments with the ADNI cohort, we validated the effectiveness of the proposed method by achieving the highest diagnostic accuracies in three classification tasks. We also rigorously analyzed our results and compared with the previous studies on the ADNI cohort in the literature. Copyright © 2017 Elsevier B.V. All rights reserved.

  2. Hierarchical Bayesian sparse image reconstruction with application to MRFM.

    Science.gov (United States)

    Dobigeon, Nicolas; Hero, Alfred O; Tourneret, Jean-Yves

    2009-09-01

    This paper presents a hierarchical Bayesian model to reconstruct sparse images when the observations are obtained from linear transformations and corrupted by an additive white Gaussian noise. Our hierarchical Bayes model is well suited to such naturally sparse image applications as it seamlessly accounts for properties such as sparsity and positivity of the image via appropriate Bayes priors. We propose a prior that is based on a weighted mixture of a positive exponential distribution and a mass at zero. The prior has hyperparameters that are tuned automatically by marginalization over the hierarchical Bayesian model. To overcome the complexity of the posterior distribution, a Gibbs sampling strategy is proposed. The Gibbs samples can be used to estimate the image to be recovered, e.g., by maximizing the estimated posterior distribution. In our fully Bayesian approach, the posteriors of all the parameters are available. Thus, our algorithm provides more information than other previously proposed sparse reconstruction methods that only give a point estimate. The performance of the proposed hierarchical Bayesian sparse reconstruction method is illustrated on synthetic data and real data collected from a tobacco virus sample using a prototype MRFM instrument.

  3. Efficient coordinated recovery of sparse channels in massive MIMO

    KAUST Repository

    Masood, Mudassir

    2015-01-01

    This paper addresses the problem of estimating sparse channels in massive MIMO-OFDM systems. Most wireless channels are sparse in nature with large delay spread. In addition, these channels as observed by multiple antennas in a neighborhood have approximately common support. The sparsity and common support properties are attractive when it comes to the efficient estimation of large number of channels in massive MIMO systems. Moreover, to avoid pilot contamination and to achieve better spectral efficiency, it is important to use a small number of pilots. We present a novel channel estimation approach which utilizes the sparsity and common support properties to estimate sparse channels and requires a small number of pilots. Two algorithms based on this approach have been developed that perform Bayesian estimates of sparse channels even when the prior is non-Gaussian or unknown. Neighboring antennas share among each other their beliefs about the locations of active channel taps to perform estimation. The coordinated approach improves channel estimates and also reduces the required number of pilots. Further improvement is achieved by the data-aided version of the algorithm. Extensive simulation results are provided to demonstrate the performance of the proposed algorithms.

  4. Boltzmann Solver with Adaptive Mesh in Velocity Space

    International Nuclear Information System (INIS)

    Kolobov, Vladimir I.; Arslanbekov, Robert R.; Frolova, Anna A.

    2011-01-01

    We describe the implementation of direct Boltzmann solver with Adaptive Mesh in Velocity Space (AMVS) using quad/octree data structure. The benefits of the AMVS technique are demonstrated for the charged particle transport in weakly ionized plasmas where the collision integral is linear. We also describe the implementation of AMVS for the nonlinear Boltzmann collision integral. Test computations demonstrate both advantages and deficiencies of the current method for calculations of narrow-kernel distributions.

  5. Application of Nearly Linear Solvers to Electric Power System Computation

    Science.gov (United States)

    Grant, Lisa L.

    To meet the future needs of the electric power system, improvements need to be made in the areas of power system algorithms, simulation, and modeling, specifically to achieve a time frame that is useful to industry. If power system time-domain simulations could run in real-time, then system operators would have situational awareness to implement online control and avoid cascading failures, significantly improving power system reliability. Several power system applications rely on the solution of a very large linear system. As the demands on power systems continue to grow, there is a greater computational complexity involved in solving these large linear systems within reasonable time. This project expands on the current work in fast linear solvers, developed for solving symmetric and diagonally dominant linear systems, in order to produce power system specific methods that can be solved in nearly-linear run times. The work explores a new theoretical method that is based on ideas in graph theory and combinatorics. The technique builds a chain of progressively smaller approximate systems with preconditioners based on the system's low stretch spanning tree. The method is compared to traditional linear solvers and shown to reduce the time and iterations required for an accurate solution, especially as the system size increases. A simulation validation is performed, comparing the solution capabilities of the chain method to LU factorization, which is the standard linear solver for power flow. The chain method was successfully demonstrated to produce accurate solutions for power flow simulation on a number of IEEE test cases, and a discussion on how to further improve the method's speed and accuracy is included.

  6. Source term identification in atmospheric modelling via sparse optimization

    Science.gov (United States)

    Adam, Lukas; Branda, Martin; Hamburger, Thomas

    2015-04-01

    Inverse modelling plays an important role in identifying the amount of harmful substances released into atmosphere during major incidents such as power plant accidents or volcano eruptions. Another possible application of inverse modelling lies in the monitoring the CO2 emission limits where only observations at certain places are available and the task is to estimate the total releases at given locations. This gives rise to minimizing the discrepancy between the observations and the model predictions. There are two standard ways of solving such problems. In the first one, this discrepancy is regularized by adding additional terms. Such terms may include Tikhonov regularization, distance from a priori information or a smoothing term. The resulting, usually quadratic, problem is then solved via standard optimization solvers. The second approach assumes that the error term has a (normal) distribution and makes use of Bayesian modelling to identify the source term. Instead of following the above-mentioned approaches, we utilize techniques from the field of compressive sensing. Such techniques look for a sparsest solution (solution with the smallest number of nonzeros) of a linear system, where a maximal allowed error term may be added to this system. Even though this field is a developed one with many possible solution techniques, most of them do not consider even the simplest constraints which are naturally present in atmospheric modelling. One of such examples is the nonnegativity of release amounts. We believe that the concept of a sparse solution is natural in both problems of identification of the source location and of the time process of the source release. In the first case, it is usually assumed that there are only few release points and the task is to find them. In the second case, the time window is usually much longer than the duration of the actual release. In both cases, the optimal solution should contain a large amount of zeros, giving rise to the

  7. Solving very large scattering problems using a parallel PWTD-enhanced surface integral equation solver

    KAUST Repository

    Liu, Yang

    2013-07-01

    The computational complexity and memory requirements of multilevel plane wave time domain (PWTD)-accelerated marching-on-in-time (MOT)-based surface integral equation (SIE) solvers scale as O(NtNs(log 2)Ns) and O(Ns 1.5); here N t and Ns denote numbers of temporal and spatial basis functions discretizing the current [Shanker et al., IEEE Trans. Antennas Propag., 51, 628-641, 2003]. In the past, serial versions of these solvers have been successfully applied to the analysis of scattering from perfect electrically conducting as well as homogeneous penetrable targets involving up to Ns ≈ 0.5 × 106 and Nt ≈ 10 3. To solve larger problems, parallel PWTD-enhanced MOT solvers are called for. Even though a simple parallelization strategy was demonstrated in the context of electromagnetic compatibility analysis [M. Lu et al., in Proc. IEEE Int. Symp. AP-S, 4, 4212-4215, 2004], by and large, progress in this area has been slow. The lack of progress can be attributed wholesale to difficulties associated with the construction of a scalable PWTD kernel. © 2013 IEEE.

  8. Optical solver for a system of ordinary differential equations based on an external feedback assisted microring resonator.

    Science.gov (United States)

    Hou, Jie; Dong, Jianji; Zhang, Xinliang

    2017-06-15

    Systems of ordinary differential equations (SODEs) are crucial for describing the dynamic behaviors in various systems such as modern control systems which require observability and controllability. In this Letter, we propose and experimentally demonstrate an all-optical SODE solver based on the silicon-on-insulator platform. We use an add/drop microring resonator to construct two different ordinary differential equations (ODEs) and then introduce two external feedback waveguides to realize the coupling between these ODEs, thus forming the SODE solver. A temporal coupled mode theory is used to deduce the expression of the SODE. A system experiment is carried out for further demonstration. For the input 10 GHz NRZ-like pulses, the measured output waveforms of the SODE solver agree well with the calculated results.

  9. Hybrid direct and iterative solvers for h refined grids with singularities

    KAUST Repository

    Paszyński, Maciej R.; Paszyńska, Anna; Dalcin, Lisandro; Calo, Victor M.

    2015-01-01

    on top of it. The hybrid solver is applied for two or three dimensional grids automatically h refined towards point or edge singularities. The automatic refinement is based on the relative error estimations between the coarse and fine mesh solutions [2

  10. ALPS - A LINEAR PROGRAM SOLVER

    Science.gov (United States)

    Viterna, L. A.

    1994-01-01

    Linear programming is a widely-used engineering and management tool. Scheduling, resource allocation, and production planning are all well-known applications of linear programs (LP's). Most LP's are too large to be solved by hand, so over the decades many computer codes for solving LP's have been developed. ALPS, A Linear Program Solver, is a full-featured LP analysis program. ALPS can solve plain linear programs as well as more complicated mixed integer and pure integer programs. ALPS also contains an efficient solution technique for pure binary (0-1 integer) programs. One of the many weaknesses of LP solvers is the lack of interaction with the user. ALPS is a menu-driven program with no special commands or keywords to learn. In addition, ALPS contains a full-screen editor to enter and maintain the LP formulation. These formulations can be written to and read from plain ASCII files for portability. For those less experienced in LP formulation, ALPS contains a problem "parser" which checks the formulation for errors. ALPS creates fully formatted, readable reports that can be sent to a printer or output file. ALPS is written entirely in IBM's APL2/PC product, Version 1.01. The APL2 workspace containing all the ALPS code can be run on any APL2/PC system (AT or 386). On a 32-bit system, this configuration can take advantage of all extended memory. The user can also examine and modify the ALPS code. The APL2 workspace has also been "packed" to be run on any DOS system (without APL2) as a stand-alone "EXE" file, but has limited memory capacity on a 640K system. A numeric coprocessor (80X87) is optional but recommended. The standard distribution medium for ALPS is a 5.25 inch 360K MS-DOS format diskette. IBM, IBM PC and IBM APL2 are registered trademarks of International Business Machines Corporation. MS-DOS is a registered trademark of Microsoft Corporation.

  11. Parallel Solver for Diffuse Optical Tomography on Realistic Head Models With Scattering and Clear Regions.

    Science.gov (United States)

    Placati, Silvio; Guermandi, Marco; Samore, Andrea; Scarselli, Eleonora Franchi; Guerrieri, Roberto

    2016-09-01

    Diffuse optical tomography is an imaging technique, based on evaluation of how light propagates within the human head to obtain the functional information about the brain. Precision in reconstructing such an optical properties map is highly affected by the accuracy of the light propagation model implemented, which needs to take into account the presence of clear and scattering tissues. We present a numerical solver based on the radiosity-diffusion model, integrating the anatomical information provided by a structural MRI. The solver is designed to run on parallel heterogeneous platforms based on multiple GPUs and CPUs. We demonstrate how the solver provides a 7 times speed-up over an isotropic-scattered parallel Monte Carlo engine based on a radiative transport equation for a domain composed of 2 million voxels, along with a significant improvement in accuracy. The speed-up greatly increases for larger domains, allowing us to compute the light distribution of a full human head ( ≈ 3 million voxels) in 116 s for the platform used.

  12. Robust Fringe Projection Profilometry via Sparse Representation.

    Science.gov (United States)

    Budianto; Lun, Daniel P K

    2016-04-01

    In this paper, a robust fringe projection profilometry (FPP) algorithm using the sparse dictionary learning and sparse coding techniques is proposed. When reconstructing the 3D model of objects, traditional FPP systems often fail to perform if the captured fringe images have a complex scene, such as having multiple and occluded objects. It introduces great difficulty to the phase unwrapping process of an FPP system that can result in serious distortion in the final reconstructed 3D model. For the proposed algorithm, it encodes the period order information, which is essential to phase unwrapping, into some texture patterns and embeds them to the projected fringe patterns. When the encoded fringe image is captured, a modified morphological component analysis and a sparse classification procedure are performed to decode and identify the embedded period order information. It is then used to assist the phase unwrapping process to deal with the different artifacts in the fringe images. Experimental results show that the proposed algorithm can significantly improve the robustness of an FPP system. It performs equally well no matter the fringe images have a simple or complex scene, or are affected due to the ambient lighting of the working environment.

  13. High-performance small-scale solvers for linear Model Predictive Control

    DEFF Research Database (Denmark)

    Frison, Gianluca; Sørensen, Hans Henrik Brandenborg; Dammann, Bernd

    2014-01-01

    , with the two main research areas of explicit MPC and tailored on-line MPC. State-of-the-art solvers in this second class can outperform optimized linear-algebra libraries (BLAS) only for very small problems, and do not explicitly exploit the hardware capabilities, relying on compilers for that. This approach...

  14. WIENER-HOPF SOLVER WITH SMOOTH PROBABILITY DISTRIBUTIONS OF ITS COMPONENTS

    Directory of Open Access Journals (Sweden)

    Mr. Vladimir A. Smagin

    2016-12-01

    Full Text Available The Wiener – Hopf solver with smooth probability distributions of its component is presented. The method is based on hyper delta approximations of initial distributions. The use of Fourier series transformation and characteristic function allows working with the random variable method concentrated in transversal axis of absc.

  15. A LAGRANGIAN GAUSS-NEWTON-KRYLOV SOLVER FOR MASS- AND INTENSITY-PRESERVING DIFFEOMORPHIC IMAGE REGISTRATION.

    Science.gov (United States)

    Mang, Andreas; Ruthotto, Lars

    2017-01-01

    We present an efficient solver for diffeomorphic image registration problems in the framework of Large Deformations Diffeomorphic Metric Mappings (LDDMM). We use an optimal control formulation, in which the velocity field of a hyperbolic PDE needs to be found such that the distance between the final state of the system (the transformed/transported template image) and the observation (the reference image) is minimized. Our solver supports both stationary and non-stationary (i.e., transient or time-dependent) velocity fields. As transformation models, we consider both the transport equation (assuming intensities are preserved during the deformation) and the continuity equation (assuming mass-preservation). We consider the reduced form of the optimal control problem and solve the resulting unconstrained optimization problem using a discretize-then-optimize approach. A key contribution is the elimination of the PDE constraint using a Lagrangian hyperbolic PDE solver. Lagrangian methods rely on the concept of characteristic curves. We approximate these curves using a fourth-order Runge-Kutta method. We also present an efficient algorithm for computing the derivatives of the final state of the system with respect to the velocity field. This allows us to use fast Gauss-Newton based methods. We present quickly converging iterative linear solvers using spectral preconditioners that render the overall optimization efficient and scalable. Our method is embedded into the image registration framework FAIR and, thus, supports the most commonly used similarity measures and regularization functionals. We demonstrate the potential of our new approach using several synthetic and real world test problems with up to 14.7 million degrees of freedom.

  16. Addressing the computational cost of large EIT solutions

    International Nuclear Information System (INIS)

    Boyle, Alistair; Adler, Andy; Borsic, Andrea

    2012-01-01

    Electrical impedance tomography (EIT) is a soft field tomography modality based on the application of electric current to a body and measurement of voltages through electrodes at the boundary. The interior conductivity is reconstructed on a discrete representation of the domain using a finite-element method (FEM) mesh and a parametrization of that domain. The reconstruction requires a sequence of numerically intensive calculations. There is strong interest in reducing the cost of these calculations. An improvement in the compute time for current problems would encourage further exploration of computationally challenging problems such as the incorporation of time series data, wide-spread adoption of three-dimensional simulations and correlation of other modalities such as CT and ultrasound. Multicore processors offer an opportunity to reduce EIT computation times but may require some restructuring of the underlying algorithms to maximize the use of available resources. This work profiles two EIT software packages (EIDORS and NDRM) to experimentally determine where the computational costs arise in EIT as problems scale. Sparse matrix solvers, a key component for the FEM forward problem and sensitivity estimates in the inverse problem, are shown to take a considerable portion of the total compute time in these packages. A sparse matrix solver performance measurement tool, Meagre-Crowd, is developed to interface with a variety of solvers and compare their performance over a range of two- and three-dimensional problems of increasing node density. Results show that distributed sparse matrix solvers that operate on multiple cores are advantageous up to a limit that increases as the node density increases. We recommend a selection procedure to find a solver and hardware arrangement matched to the problem and provide guidance and tools to perform that selection. (paper)

  17. Addressing the computational cost of large EIT solutions.

    Science.gov (United States)

    Boyle, Alistair; Borsic, Andrea; Adler, Andy

    2012-05-01

    Electrical impedance tomography (EIT) is a soft field tomography modality based on the application of electric current to a body and measurement of voltages through electrodes at the boundary. The interior conductivity is reconstructed on a discrete representation of the domain using a finite-element method (FEM) mesh and a parametrization of that domain. The reconstruction requires a sequence of numerically intensive calculations. There is strong interest in reducing the cost of these calculations. An improvement in the compute time for current problems would encourage further exploration of computationally challenging problems such as the incorporation of time series data, wide-spread adoption of three-dimensional simulations and correlation of other modalities such as CT and ultrasound. Multicore processors offer an opportunity to reduce EIT computation times but may require some restructuring of the underlying algorithms to maximize the use of available resources. This work profiles two EIT software packages (EIDORS and NDRM) to experimentally determine where the computational costs arise in EIT as problems scale. Sparse matrix solvers, a key component for the FEM forward problem and sensitivity estimates in the inverse problem, are shown to take a considerable portion of the total compute time in these packages. A sparse matrix solver performance measurement tool, Meagre-Crowd, is developed to interface with a variety of solvers and compare their performance over a range of two- and three-dimensional problems of increasing node density. Results show that distributed sparse matrix solvers that operate on multiple cores are advantageous up to a limit that increases as the node density increases. We recommend a selection procedure to find a solver and hardware arrangement matched to the problem and provide guidance and tools to perform that selection.

  18. Sparse DOA estimation with polynomial rooting

    DEFF Research Database (Denmark)

    Xenaki, Angeliki; Gerstoft, Peter; Fernandez Grande, Efren

    2015-01-01

    Direction-of-arrival (DOA) estimation involves the localization of a few sources from a limited number of observations on an array of sensors. Thus, DOA estimation can be formulated as a sparse signal reconstruction problem and solved efficiently with compressive sensing (CS) to achieve highresol......Direction-of-arrival (DOA) estimation involves the localization of a few sources from a limited number of observations on an array of sensors. Thus, DOA estimation can be formulated as a sparse signal reconstruction problem and solved efficiently with compressive sensing (CS) to achieve...... highresolution imaging. Utilizing the dual optimal variables of the CS optimization problem, it is shown with Monte Carlo simulations that the DOAs are accurately reconstructed through polynomial rooting (Root-CS). Polynomial rooting is known to improve the resolution in several other DOA estimation methods...

  19. A General Sparse Tensor Framework for Electronic Structure Theory.

    Science.gov (United States)

    Manzer, Samuel; Epifanovsky, Evgeny; Krylov, Anna I; Head-Gordon, Martin

    2017-03-14

    Linear-scaling algorithms must be developed in order to extend the domain of applicability of electronic structure theory to molecules of any desired size. However, the increasing complexity of modern linear-scaling methods makes code development and maintenance a significant challenge. A major contributor to this difficulty is the lack of robust software abstractions for handling block-sparse tensor operations. We therefore report the development of a highly efficient symbolic block-sparse tensor library in order to provide access to high-level software constructs to treat such problems. Our implementation supports arbitrary multi-dimensional sparsity in all input and output tensors. We avoid cumbersome machine-generated code by implementing all functionality as a high-level symbolic C++ language library and demonstrate that our implementation attains very high performance for linear-scaling sparse tensor contractions.

  20. Low-rank and sparse modeling for visual analysis

    CERN Document Server

    Fu, Yun

    2014-01-01

    This book provides a view of low-rank and sparse computing, especially approximation, recovery, representation, scaling, coding, embedding and learning among unconstrained visual data. The book includes chapters covering multiple emerging topics in this new field. It links multiple popular research fields in Human-Centered Computing, Social Media, Image Classification, Pattern Recognition, Computer Vision, Big Data, and Human-Computer Interaction. Contains an overview of the low-rank and sparse modeling techniques for visual analysis by examining both theoretical analysis and real-world applic

  1. Development and validation of a magneto-hydrodynamic solver for blood flow analysis

    Energy Technology Data Exchange (ETDEWEB)

    Kainz, W; Guag, J; Krauthamer, V; Myklebust, J; Bassen, H; Chang, I [Center for Devices and Radiological Health, FDA, Silver Spring, MD (United States); Benkler, S; Chavannes, N [Schmid and Partner Engineering AG, Zurich (Switzerland); Szczerba, D; Neufeld, E; Kuster, N [Foundation for Research on Information Technology in Society (IT' IS), Zurich (Switzerland); Kim, J H; Sarntinoranont, M, E-mail: wolfgang.kainz@fda.hhs.go [Soft Tissue Mechanics and Drug Delivery Laboratory, Mechanical and Aerospace Engineering, University of Florida, FL (United States)

    2010-12-07

    The objective of this study was to develop a numerical solver to calculate the magneto-hydrodynamic (MHD) signal produced by a moving conductive liquid, i.e. blood flow in the great vessels of the heart, in a static magnetic field. We believe that this MHD signal is able to non-invasively characterize cardiac blood flow in order to supplement the present non-invasive techniques for the assessment of heart failure conditions. The MHD signal can be recorded on the electrocardiogram (ECG) while the subject is exposed to a strong static magnetic field. The MHD signal can only be measured indirectly as a combination of the heart's electrical signal and the MHD signal. The MHD signal itself is caused by induced electrical currents in the blood due to the moving of the blood in the magnetic field. To characterize and eventually optimize MHD measurements, we developed a MHD solver based on a finite element code. This code was validated against literature, experimental and analytical data. The validation of the MHD solver shows good agreement with all three reference values. Future studies will include the calculation of the MHD signals for anatomical models. We will vary the orientation of the static magnetic field to determine an optimized location for the measurement of the MHD blood flow signal.

  2. Sparse BLIP: BLind Iterative Parallel imaging reconstruction using compressed sensing.

    Science.gov (United States)

    She, Huajun; Chen, Rong-Rong; Liang, Dong; DiBella, Edward V R; Ying, Leslie

    2014-02-01

    To develop a sensitivity-based parallel imaging reconstruction method to reconstruct iteratively both the coil sensitivities and MR image simultaneously based on their prior information. Parallel magnetic resonance imaging reconstruction problem can be formulated as a multichannel sampling problem where solutions are sought analytically. However, the channel functions given by the coil sensitivities in parallel imaging are not known exactly and the estimation error usually leads to artifacts. In this study, we propose a new reconstruction algorithm, termed Sparse BLind Iterative Parallel, for blind iterative parallel imaging reconstruction using compressed sensing. The proposed algorithm reconstructs both the sensitivity functions and the image simultaneously from undersampled data. It enforces the sparseness constraint in the image as done in compressed sensing, but is different from compressed sensing in that the sensing matrix is unknown and additional constraint is enforced on the sensitivities as well. Both phantom and in vivo imaging experiments were carried out with retrospective undersampling to evaluate the performance of the proposed method. Experiments show improvement in Sparse BLind Iterative Parallel reconstruction when compared with Sparse SENSE, JSENSE, IRGN-TV, and L1-SPIRiT reconstructions with the same number of measurements. The proposed Sparse BLind Iterative Parallel algorithm reduces the reconstruction errors when compared to the state-of-the-art parallel imaging methods. Copyright © 2013 Wiley Periodicals, Inc.

  3. Real-time SPARSE-SENSE cardiac cine MR imaging: optimization of image reconstruction and sequence validation.

    Science.gov (United States)

    Goebel, Juliane; Nensa, Felix; Bomas, Bettina; Schemuth, Haemi P; Maderwald, Stefan; Gratz, Marcel; Quick, Harald H; Schlosser, Thomas; Nassenstein, Kai

    2016-12-01

    Improved real-time cardiac magnetic resonance (CMR) sequences have currently been introduced, but so far only limited practical experience exists. This study aimed at image reconstruction optimization and clinical validation of a new highly accelerated real-time cine SPARSE-SENSE sequence. Left ventricular (LV) short-axis stacks of a real-time free-breathing SPARSE-SENSE sequence with high spatiotemporal resolution and of a standard segmented cine SSFP sequence were acquired at 1.5 T in 11 volunteers and 15 patients. To determine the optimal iterations, all volunteers' SPARSE-SENSE images were reconstructed using 10-200 iterations, and contrast ratios, image entropies, and reconstruction times were assessed. Subsequently, the patients' SPARSE-SENSE images were reconstructed with the clinically optimal iterations. LV volumetric values were evaluated and compared between both sequences. Sufficient image quality and acceptable reconstruction times were achieved when using 80 iterations. Bland-Altman plots and Passing-Bablok regression showed good agreement for all volumetric parameters. 80 iterations are recommended for iterative SPARSE-SENSE image reconstruction in clinical routine. Real-time cine SPARSE-SENSE yielded comparable volumetric results as the current standard SSFP sequence. Due to its intrinsic low image acquisition times, real-time cine SPARSE-SENSE imaging with iterative image reconstruction seems to be an attractive alternative for LV function analysis. • A highly accelerated real-time CMR sequence using SPARSE-SENSE was evaluated. • SPARSE-SENSE allows free breathing in real-time cardiac cine imaging. • For clinically optimal SPARSE-SENSE image reconstruction, 80 iterations are recommended. • Real-time SPARSE-SENSE imaging yielded comparable volumetric results as the reference SSFP sequence. • The fast SPARSE-SENSE sequence is an attractive alternative to standard SSFP sequences.

  4. Riemann solvers and undercompressive shocks of convex FPU chains

    International Nuclear Information System (INIS)

    Herrmann, Michael; Rademacher, Jens D M

    2010-01-01

    We consider FPU-type atomic chains with general convex potentials. The naive continuum limit in the hyperbolic space–time scaling is the p-system of mass and momentum conservation. We systematically compare Riemann solutions to the p-system with numerical solutions to discrete Riemann problems in FPU chains, and argue that the latter can be described by modified p-system Riemann solvers. We allow the flux to have a turning point, and observe a third type of elementary wave (conservative shocks) in the atomistic simulations. These waves are heteroclinic travelling waves and correspond to non-classical, undercompressive shocks of the p-system. We analyse such shocks for fluxes with one or more turning points. Depending on the convexity properties of the flux we propose FPU-Riemann solvers. Our numerical simulations confirm that Lax shocks are replaced by so-called dispersive shocks. For convex–concave flux we provide numerical evidence that convex FPU chains follow the p-system in generating conservative shocks that are supersonic. For concave–convex flux, however, the conservative shocks of the p-system are subsonic and do not appear in FPU-Riemann solutions

  5. A Comparison Between Mıcrosoft Excel Solver and Ncss, Spss Routines for Nonlinear Regression Models

    Directory of Open Access Journals (Sweden)

    Didem Tetik Küçükelçi

    2018-02-01

    Full Text Available In this study we have tried to compare the results obtained by Microsoft Excel Solver program with those of NCSS and SPSS in some nonlinear regression models. We fit some nonlinear models to data present in http//itl.nist.gov/div898/strd/nls/nls_main.shtml by the three packages. Although EXCEL did not succeed as well as the other packages, we conclude that Microsoft Excel Solver provides us a cheaper and a more interactive way of studying nonlinear models.

  6. Iterative linear solvers in a 2D radiation-hydrodynamics code: Methods and performance

    International Nuclear Information System (INIS)

    Baldwin, C.; Brown, P.N.; Falgout, R.; Graziani, F.; Jones, J.

    1999-01-01

    Computer codes containing both hydrodynamics and radiation play a central role in simulating both astrophysical and inertial confinement fusion (ICF) phenomena. A crucial aspect of these codes is that they require an implicit solution of the radiation diffusion equations. The authors present in this paper the results of a comparison of five different linear solvers on a range of complex radiation and radiation-hydrodynamics problems. The linear solvers used are diagonally scaled conjugate gradient, GMRES with incomplete LU preconditioning, conjugate gradient with incomplete Cholesky preconditioning, multigrid, and multigrid-preconditioned conjugate gradient. These problems involve shock propagation, opacities varying over 5--6 orders of magnitude, tabular equations of state, and dynamic ALE (Arbitrary Lagrangian Eulerian) meshes. They perform a problem size scalability study by comparing linear solver performance over a wide range of problem sizes from 1,000 to 100,000 zones. The fundamental question they address in this paper is: Is it more efficient to invert the matrix in many inexpensive steps (like diagonally scaled conjugate gradient) or in fewer expensive steps (like multigrid)? In addition, what is the answer to this question as a function of problem size and is the answer problem dependent? They find that the diagonally scaled conjugate gradient method performs poorly with the growth of problem size, increasing in both iteration count and overall CPU time with the size of the problem and also increasing for larger time steps. For all problems considered, the multigrid algorithms scale almost perfectly (i.e., the iteration count is approximately independent of problem size and problem time step). For pure radiation flow problems (i.e., no hydrodynamics), they see speedups in CPU time of factors of ∼15--30 for the largest problems, when comparing the multigrid solvers relative to diagonal scaled conjugate gradient

  7. Security-enhanced phase encryption assisted by nonlinear optical correlation via sparse phase

    International Nuclear Information System (INIS)

    Chen, Wen; Chen, Xudong; Wang, Xiaogang

    2015-01-01

    We propose a method for security-enhanced phase encryption assisted by a nonlinear optical correlation via a sparse phase. Optical configurations are established based on a phase retrieval algorithm for embedding an input image and the secret data into phase-only masks. We found that when one or a few phase-only masks generated during data hiding are sparse, it is possible to integrate these sparse masks into those phase-only masks generated during the encoding of the input image. Synthesized phase-only masks are used for the recovery, and sparse distributions (i.e., binary maps) for generating the incomplete phase-only masks are considered as additional parameters for the recovery of secret data. It is difficult for unauthorized receivers to know that a useful phase has been sparsely distributed in the finally generated phase-only masks for secret-data recovery. Only when the secret data are correctly verified can the input image obtained with valid keys be claimed as targeted information. (paper)

  8. Single and Multiple Object Tracking Using a Multi-Feature Joint Sparse Representation.

    Science.gov (United States)

    Hu, Weiming; Li, Wei; Zhang, Xiaoqin; Maybank, Stephen

    2015-04-01

    In this paper, we propose a tracking algorithm based on a multi-feature joint sparse representation. The templates for the sparse representation can include pixel values, textures, and edges. In the multi-feature joint optimization, noise or occlusion is dealt with using a set of trivial templates. A sparse weight constraint is introduced to dynamically select the relevant templates from the full set of templates. A variance ratio measure is adopted to adaptively adjust the weights of different features. The multi-feature template set is updated adaptively. We further propose an algorithm for tracking multi-objects with occlusion handling based on the multi-feature joint sparse reconstruction. The observation model based on sparse reconstruction automatically focuses on the visible parts of an occluded object by using the information in the trivial templates. The multi-object tracking is simplified into a joint Bayesian inference. The experimental results show the superiority of our algorithm over several state-of-the-art tracking algorithms.

  9. The analytic nodal diffusion solver ANDES in multigroups for 3D rectangular geometry: Development and performance analysis

    International Nuclear Information System (INIS)

    Lozano, Juan-Andres; Garcia-Herranz, Nuria; Ahnert, Carol; Aragones, Jose-Maria

    2008-01-01

    In this work we address the development and implementation of the analytic coarse-mesh finite-difference (ACMFD) method in a nodal neutron diffusion solver called ANDES. The first version of the solver is implemented in any number of neutron energy groups, and in 3D Cartesian geometries; thus it mainly addresses PWR and BWR core simulations. The details about the generalization to multigroups and 3D, as well as the implementation of the method are given. The transverse integration procedure is the scheme chosen to extend the ACMFD formulation to multidimensional problems. The role of the transverse leakage treatment in the accuracy of the nodal solutions is analyzed in detail: the involved assumptions, the limitations of the method in terms of nodal width, the alternative approaches to implement the transverse leakage terms in nodal methods - implicit or explicit -, and the error assessment due to transverse integration. A new approach for solving the control rod 'cusping' problem, based on the direct application of the ACMFD method, is also developed and implemented in ANDES. The solver architecture turns ANDES into an user-friendly, modular and easily linkable tool, as required to be integrated into common software platforms for multi-scale and multi-physics simulations. ANDES can be used either as a stand-alone nodal code or as a solver to accelerate the convergence of whole core pin-by-pin code systems. The verification and performance of the solver are demonstrated using both proof-of-principle test cases and well-referenced international benchmarks

  10. Low-rank sparse learning for robust visual tracking

    KAUST Repository

    Zhang, Tianzhu

    2012-01-01

    In this paper, we propose a new particle-filter based tracking algorithm that exploits the relationship between particles (candidate targets). By representing particles as sparse linear combinations of dictionary templates, this algorithm capitalizes on the inherent low-rank structure of particle representations that are learned jointly. As such, it casts the tracking problem as a low-rank matrix learning problem. This low-rank sparse tracker (LRST) has a number of attractive properties. (1) Since LRST adaptively updates dictionary templates, it can handle significant changes in appearance due to variations in illumination, pose, scale, etc. (2) The linear representation in LRST explicitly incorporates background templates in the dictionary and a sparse error term, which enables LRST to address the tracking drift problem and to be robust against occlusion respectively. (3) LRST is computationally attractive, since the low-rank learning problem can be efficiently solved as a sequence of closed form update operations, which yield a time complexity that is linear in the number of particles and the template size. We evaluate the performance of LRST by applying it to a set of challenging video sequences and comparing it to 6 popular tracking methods. Our experiments show that by representing particles jointly, LRST not only outperforms the state-of-the-art in tracking accuracy but also significantly improves the time complexity of methods that use a similar sparse linear representation model for particles [1]. © 2012 Springer-Verlag.

  11. Group-sparse representation with dictionary learning for medical image denoising and fusion.

    Science.gov (United States)

    Li, Shutao; Yin, Haitao; Fang, Leyuan

    2012-12-01

    Recently, sparse representation has attracted a lot of interest in various areas. However, the standard sparse representation does not consider the intrinsic structure, i.e., the nonzero elements occur in clusters, called group sparsity. Furthermore, there is no dictionary learning method for group sparse representation considering the geometrical structure of space spanned by atoms. In this paper, we propose a novel dictionary learning method, called Dictionary Learning with Group Sparsity and Graph Regularization (DL-GSGR). First, the geometrical structure of atoms is modeled as the graph regularization. Then, combining group sparsity and graph regularization, the DL-GSGR is presented, which is solved by alternating the group sparse coding and dictionary updating. In this way, the group coherence of learned dictionary can be enforced small enough such that any signal can be group sparse coded effectively. Finally, group sparse representation with DL-GSGR is applied to 3-D medical image denoising and image fusion. Specifically, in 3-D medical image denoising, a 3-D processing mechanism (using the similarity among nearby slices) and temporal regularization (to perverse the correlations across nearby slices) are exploited. The experimental results on 3-D image denoising and image fusion demonstrate the superiority of our proposed denoising and fusion approaches.

  12. Sparse electromagnetic imaging using nonlinear iterative shrinkage thresholding

    KAUST Repository

    Desmal, Abdulla; Bagci, Hakan

    2015-01-01

    A sparse nonlinear electromagnetic imaging scheme is proposed for reconstructing dielectric contrast of investigation domains from measured fields. The proposed approach constructs the optimization problem by introducing the sparsity constraint to the data misfit between the scattered fields expressed as a nonlinear function of the contrast and the measured fields and solves it using the nonlinear iterative shrinkage thresholding algorithm. The thresholding is applied to the result of every nonlinear Landweber iteration to enforce the sparsity constraint. Numerical results demonstrate the accuracy and efficiency of the proposed method in reconstructing sparse dielectric profiles.

  13. Sparse electromagnetic imaging using nonlinear iterative shrinkage thresholding

    KAUST Repository

    Desmal, Abdulla

    2015-04-13

    A sparse nonlinear electromagnetic imaging scheme is proposed for reconstructing dielectric contrast of investigation domains from measured fields. The proposed approach constructs the optimization problem by introducing the sparsity constraint to the data misfit between the scattered fields expressed as a nonlinear function of the contrast and the measured fields and solves it using the nonlinear iterative shrinkage thresholding algorithm. The thresholding is applied to the result of every nonlinear Landweber iteration to enforce the sparsity constraint. Numerical results demonstrate the accuracy and efficiency of the proposed method in reconstructing sparse dielectric profiles.

  14. Conducting Automated Test Assembly Using the Premium Solver Platform Version 7.0 with Microsoft Excel and the Large-Scale LP/QP Solver Engine Add-In

    Science.gov (United States)

    Cor, Ken; Alves, Cecilia; Gierl, Mark J.

    2008-01-01

    This review describes and evaluates a software add-in created by Frontline Systems, Inc., that can be used with Microsoft Excel 2007 to solve large, complex test assembly problems. The combination of Microsoft Excel 2007 with the Frontline Systems Premium Solver Platform is significant because Microsoft Excel is the most commonly used spreadsheet…

  15. Computational complexity and memory usage for multi-frontal direct solvers used in p finite element analysis

    KAUST Repository

    Calo, Victor M.; Collier, Nathan; Pardo, David; Paszyński, Maciej R.

    2011-01-01

    The multi-frontal direct solver is the state of the art for the direct solution of linear systems. This paper provides computational complexity and memory usage estimates for the application of the multi-frontal direct solver algorithm on linear systems resulting from p finite elements. Specifically we provide the estimates for systems resulting from C0 polynomial spaces spanned by B-splines. The structured grid and uniform polynomial order used in isogeometric meshes simplifies the analysis.

  16. Computational complexity and memory usage for multi-frontal direct solvers used in p finite element analysis

    KAUST Repository

    Calo, Victor M.

    2011-05-14

    The multi-frontal direct solver is the state of the art for the direct solution of linear systems. This paper provides computational complexity and memory usage estimates for the application of the multi-frontal direct solver algorithm on linear systems resulting from p finite elements. Specifically we provide the estimates for systems resulting from C0 polynomial spaces spanned by B-splines. The structured grid and uniform polynomial order used in isogeometric meshes simplifies the analysis.

  17. Applications of an implicit HLLC-based Godunov solver for steady state hypersonic problems

    International Nuclear Information System (INIS)

    Link, R.A.; Sharman, B.

    2005-01-01

    Over the past few years, there has been considerable activity developing research vehicles for studying hypersonic propulsion. Successful launches of the Australian Hyshot and the US Hyper-X vehicles have added a significant amount of flight test data to a field that had previously been limited to numerical simulation. A number of approaches have been proposed for hypersonics propulsion, including attached detonation wave, supersonics combustion, and shock induced combustion. Due to the high cost of developing flight hardware, CFD simulations will continue to be a key tool for investigating the feasibility of these concepts. Capturing the interactions of the vehicle body with the boundary layer and chemical reactions pushes the limits of available modelling tools and computer hardware. Explicit formulations are extremely slow in converging to a steady state; therefore, the use of implicit methods are warranted. An implicit LLC-based Godunov solver has been developed at Martec in collaboration with DRDC Valcartier to solve hypersonic problems with a minimum of CPU time and RAM storage. The solver, Chinook Implicit, is based upon the implicit formulation adopted by Batten et. al. The solver is based on a point implicit Gauss-Seidel method for unstructured grids, and includes fully implicit boundary conditions. Preliminary results for small and large scale inviscid hypersonics problems will be presented. (author)

  18. Computational cost of isogeometric multi-frontal solvers on parallel distributed memory machines

    KAUST Repository

    Woźniak, Maciej; Paszyński, Maciej R.; Pardo, D.; Dalcin, Lisandro; Calo, Victor M.

    2015-01-01

    This paper derives theoretical estimates of the computational cost for isogeometric multi-frontal direct solver executed on parallel distributed memory machines. We show theoretically that for the Cp-1 global continuity of the isogeometric solution

  19. On the Automatic Parallelization of Sparse and Irregular Fortran Programs

    Directory of Open Access Journals (Sweden)

    Yuan Lin

    1999-01-01

    Full Text Available Automatic parallelization is usually believed to be less effective at exploiting implicit parallelism in sparse/irregular programs than in their dense/regular counterparts. However, not much is really known because there have been few research reports on this topic. In this work, we have studied the possibility of using an automatic parallelizing compiler to detect the parallelism in sparse/irregular programs. The study with a collection of sparse/irregular programs led us to some common loop patterns. Based on these patterns new techniques were derived that produced good speedups when manually applied to our benchmark codes. More importantly, these parallelization methods can be implemented in a parallelizing compiler and can be applied automatically.

  20. POWERPLAY: Training an Increasingly General Problem Solver by Continually Searching for the Simplest Still Unsolvable Problem

    Directory of Open Access Journals (Sweden)

    Jürgen eSchmidhuber

    2013-06-01

    Full Text Available Most of computer science focuses on automatically solving given computational problems. I focus on automatically inventing or discovering problems in a way inspired by the playful behavior of animals and humans, to train a more and more general problem solver from scratch in an unsupervised fashion. Consider the infinite set of all computable descriptions of tasks with possibly computable solutions. The novel algorithmic framework POWERPLAY (2011 continually searches the space of possible pairs of new tasks and modifications of the current problem solver, until it finds a more powerful problem solver that provably solves all previously learned tasks plus the new one, while the unmodified predecessor does not. Wow-effects are achieved by continually making previously learned skills more efficient such that they require less time and space. New skills may (partially re-use previously learned skills. POWERPLAY's search orders candidate pairs of tasks and solver modifications by their conditional computational (time & space complexity, given the stored experience so far. The new task and its corresponding task-solving skill are those first found and validated. The computational costs of validating new tasks need not grow with task repertoire size. POWERPLAY's ongoing search for novelty keeps breaking the generalization abilities of its present solver. This is related to Goedel's sequence of increasingly powerful formal theories based on adding formerly unprovable statements to the axioms without affecting previously provable theorems. The continually increasing repertoire of problem solving procedures can be exploited by a parallel search for solutions to additional externally posed tasks. POWERPLAY may be viewed as a greedy but practical implementation of basic principles of creativity. A first experimental analysis can be found in separate papers [58, 56, 57].

  1. Split-Bregman-based sparse-view CT reconstruction

    Energy Technology Data Exchange (ETDEWEB)

    Vandeghinste, Bert; Vandenberghe, Stefaan [Ghent Univ. (Belgium). Medical Image and Signal Processing (MEDISIP); Goossens, Bart; Pizurica, Aleksandra; Philips, Wilfried [Ghent Univ. (Belgium). Image Processing and Interpretation Research Group (IPI); Beenhouwer, Jan de [Ghent Univ. (Belgium). Medical Image and Signal Processing (MEDISIP); Antwerp Univ., Wilrijk (Belgium). The Vision Lab; Staelens, Steven [Ghent Univ. (Belgium). Medical Image and Signal Processing (MEDISIP); Antwerp Univ., Edegem (Belgium). Molecular Imaging Centre Antwerp

    2011-07-01

    Total variation minimization has been extensively researched for image denoising and sparse view reconstruction. These methods show superior denoising performance for simple images with little texture, but result in texture information loss when applied to more complex images. It could thus be beneficial to use other regularizers within medical imaging. We propose a general regularization method, based on a split-Bregman approach. We show results for this framework combined with a total variation denoising operator, in comparison to ASD-POCS. We show that sparse-view reconstruction and noise regularization is possible. This general method will allow us to investigate other regularizers in the context of regularized CT reconstruction, and decrease the acquisition times in {mu}CT. (orig.)

  2. A framework for general sparse matrix-matrix multiplication on GPUs and heterogeneous processors

    DEFF Research Database (Denmark)

    Liu, Weifeng; Vinter, Brian

    2015-01-01

    General sparse matrix-matrix multiplication (SpGEMM) is a fundamental building block for numerous applications such as algebraic multigrid method (AMG), breadth first search and shortest path problem. Compared to other sparse BLAS routines, an efficient parallel SpGEMM implementation has to handle...... extra irregularity from three aspects: (1) the number of nonzero entries in the resulting sparse matrix is unknown in advance, (2) very expensive parallel insert operations at random positions in the resulting sparse matrix dominate the execution time, and (3) load balancing must account for sparse data...... memory space and efficiently utilizes the very limited on-chip scratchpad memory. Parallel insert operations of the nonzero entries are implemented through the GPU merge path algorithm that is experimentally found to be the fastest GPU merge approach. Load balancing builds on the number of necessary...

  3. Optimal Couple Projections for Domain Adaptive Sparse Representation-based Classification.

    Science.gov (United States)

    Zhang, Guoqing; Sun, Huaijiang; Porikli, Fatih; Liu, Yazhou; Sun, Quansen

    2017-08-29

    In recent years, sparse representation based classification (SRC) is one of the most successful methods and has been shown impressive performance in various classification tasks. However, when the training data has a different distribution than the testing data, the learned sparse representation may not be optimal, and the performance of SRC will be degraded significantly. To address this problem, in this paper, we propose an optimal couple projections for domain-adaptive sparse representation-based classification (OCPD-SRC) method, in which the discriminative features of data in the two domains are simultaneously learned with the dictionary that can succinctly represent the training and testing data in the projected space. OCPD-SRC is designed based on the decision rule of SRC, with the objective to learn coupled projection matrices and a common discriminative dictionary such that the between-class sparse reconstruction residuals of data from both domains are maximized, and the within-class sparse reconstruction residuals of data are minimized in the projected low-dimensional space. Thus, the resulting representations can well fit SRC and simultaneously have a better discriminant ability. In addition, our method can be easily extended to multiple domains and can be kernelized to deal with the nonlinear structure of data. The optimal solution for the proposed method can be efficiently obtained following the alternative optimization method. Extensive experimental results on a series of benchmark databases show that our method is better or comparable to many state-of-the-art methods.

  4. Two-dimensional sparse wavenumber recovery for guided wavefields

    Science.gov (United States)

    Sabeti, Soroosh; Harley, Joel B.

    2018-04-01

    The multi-modal and dispersive behavior of guided waves is often characterized by their dispersion curves, which describe their frequency-wavenumber behavior. In prior work, compressive sensing based techniques, such as sparse wavenumber analysis (SWA), have been capable of recovering dispersion curves from limited data samples. A major limitation of SWA, however, is the assumption that the structure is isotropic. As a result, SWA fails when applied to composites and other anisotropic structures. There have been efforts to address this issue in the literature, but they either are not easily generalizable or do not sufficiently express the data. In this paper, we enhance the existing approaches by employing a two-dimensional wavenumber model to account for direction-dependent velocities in anisotropic media. We integrate this model with tools from compressive sensing to reconstruct a wavefield from incomplete data. Specifically, we create a modified two-dimensional orthogonal matching pursuit algorithm that takes an undersampled wavefield image, with specified unknown elements, and determines its sparse wavenumber characteristics. We then recover the entire wavefield from the sparse representations obtained with our small number of data samples.

  5. Sparse matrix test collections

    Energy Technology Data Exchange (ETDEWEB)

    Duff, I.

    1996-12-31

    This workshop will discuss plans for coordinating and developing sets of test matrices for the comparison and testing of sparse linear algebra software. We will talk of plans for the next release (Release 2) of the Harwell-Boeing Collection and recent work on improving the accessibility of this Collection and others through the World Wide Web. There will only be three talks of about 15 to 20 minutes followed by a discussion from the floor.

  6. On the implementation of an accurate and efficient solver for convection-diffusion equations

    Science.gov (United States)

    Wu, Chin-Tien

    In this dissertation, we examine several different aspects of computing the numerical solution of the convection-diffusion equation. The solution of this equation often exhibits sharp gradients due to Dirichlet outflow boundaries or discontinuities in boundary conditions. Because of the singular-perturbed nature of the equation, numerical solutions often have severe oscillations when grid sizes are not small enough to resolve sharp gradients. To overcome such difficulties, the streamline diffusion discretization method can be used to obtain an accurate approximate solution in regions where the solution is smooth. To increase accuracy of the solution in the regions containing layers, adaptive mesh refinement and mesh movement based on a posteriori error estimations can be employed. An error-adapted mesh refinement strategy based on a posteriori error estimations is also proposed to resolve layers. For solving the sparse linear systems that arise from discretization, goemetric multigrid (MG) and algebraic multigrid (AMG) are compared. In addition, both methods are also used as preconditioners for Krylov subspace methods. We derive some convergence results for MG with line Gauss-Seidel smoothers and bilinear interpolation. Finally, while considering adaptive mesh refinement as an integral part of the solution process, it is natural to set a stopping tolerance for the iterative linear solvers on each mesh stage so that the difference between the approximate solution obtained from iterative methods and the finite element solution is bounded by an a posteriori error bound. Here, we present two stopping criteria. The first is based on a residual-type a posteriori error estimator developed by Verfurth. The second is based on an a posteriori error estimator, using local solutions, developed by Kay and Silvester. Our numerical results show the refined mesh obtained from the iterative solution which satisfies the second criteria is similar to the refined mesh obtained from

  7. Shallow-water sloshing in a moving vessel with variable cross-section and wetting-drying using an extension of George's well-balanced finite volume solver

    Science.gov (United States)

    Alemi Ardakani, Hamid; Bridges, Thomas J.; Turner, Matthew R.

    2016-06-01

    A class of augmented approximate Riemann solvers due to George (2008) [12] is extended to solve the shallow-water equations in a moving vessel with variable bottom topography and variable cross-section with wetting and drying. A class of Roe-type upwind solvers for the system of balance laws is derived which respects the steady-state solutions. The numerical solutions of the new adapted augmented f-wave solvers are validated against the Roe-type solvers. The theory is extended to solve the shallow-water flows in moving vessels with arbitrary cross-section with influx-efflux boundary conditions motivated by the shallow-water sloshing in the ocean wave energy converter (WEC) proposed by Offshore Wave Energy Ltd. (OWEL) [1]. A fractional step approach is used to handle the time-dependent forcing functions. The numerical solutions are compared to an extended new Roe-type solver for the system of balance laws with a time-dependent source function. The shallow-water sloshing finite volume solver can be coupled to a Runge-Kutta integrator for the vessel motion.

  8. Modeling of frequency-domain scalar wave equation with the average-derivative optimal scheme based on a multigrid-preconditioned iterative solver

    Science.gov (United States)

    Cao, Jian; Chen, Jing-Bo; Dai, Meng-Xue

    2018-01-01

    An efficient finite-difference frequency-domain modeling of seismic wave propagation relies on the discrete schemes and appropriate solving methods. The average-derivative optimal scheme for the scalar wave modeling is advantageous in terms of the storage saving for the system of linear equations and the flexibility for arbitrary directional sampling intervals. However, using a LU-decomposition-based direct solver to solve its resulting system of linear equations is very costly for both memory and computational requirements. To address this issue, we consider establishing a multigrid-preconditioned BI-CGSTAB iterative solver fit for the average-derivative optimal scheme. The choice of preconditioning matrix and its corresponding multigrid components is made with the help of Fourier spectral analysis and local mode analysis, respectively, which is important for the convergence. Furthermore, we find that for the computation with unequal directional sampling interval, the anisotropic smoothing in the multigrid precondition may affect the convergence rate of this iterative solver. Successful numerical applications of this iterative solver for the homogenous and heterogeneous models in 2D and 3D are presented where the significant reduction of computer memory and the improvement of computational efficiency are demonstrated by comparison with the direct solver. In the numerical experiments, we also show that the unequal directional sampling interval will weaken the advantage of this multigrid-preconditioned iterative solver in the computing speed or, even worse, could reduce its accuracy in some cases, which implies the need for a reasonable control of directional sampling interval in the discretization.

  9. A High Performance Block Eigensolver for Nuclear Configuration Interaction Calculations

    International Nuclear Information System (INIS)

    Aktulga, Hasan Metin; Afibuzzaman, Md.; Williams, Samuel; Buluc, Aydin; Shao, Meiyue

    2017-01-01

    As on-node parallelism increases and the performance gap between the processor and the memory system widens, achieving high performance in large-scale scientific applications requires an architecture-aware design of algorithms and solvers. We focus on the eigenvalue problem arising in nuclear Configuration Interaction (CI) calculations, where a few extreme eigenpairs of a sparse symmetric matrix are needed. Here, we consider a block iterative eigensolver whose main computational kernels are the multiplication of a sparse matrix with multiple vectors (SpMM), and tall-skinny matrix operations. We then present techniques to significantly improve the SpMM and the transpose operation SpMM T by using the compressed sparse blocks (CSB) format. We achieve 3-4× speedup on the requisite operations over good implementations with the commonly used compressed sparse row (CSR) format. We develop a performance model that allows us to correctly estimate the performance of our SpMM kernel implementations, and we identify cache bandwidth as a potential performance bottleneck beyond DRAM. We also analyze and optimize the performance of LOBPCG kernels (inner product and linear combinations on multiple vectors) and show up to 15× speedup over using high performance BLAS libraries for these operations. The resulting high performance LOBPCG solver achieves 1.4× to 1.8× speedup over the existing Lanczos solver on a series of CI computations on high-end multicore architectures (Intel Xeons). We also analyze the performance of our techniques on an Intel Xeon Phi Knights Corner (KNC) processor.

  10. Final Report for 'Implimentation and Evaluation of Multigrid Linear Solvers into Extended Magnetohydrodynamic Codes for Petascale Computing'

    International Nuclear Information System (INIS)

    Vadlamani, Srinath; Kruger, Scott; Austin, Travis

    2008-01-01

    Extended magnetohydrodynamic (MHD) codes are used to model the large, slow-growing instabilities that are projected to limit the performance of International Thermonuclear Experimental Reactor (ITER). The multiscale nature of the extended MHD equations requires an implicit approach. The current linear solvers needed for the implicit algorithm scale poorly because the resultant matrices are so ill-conditioned. A new solver is needed, especially one that scales to the petascale. The most successful scalable parallel processor solvers to date are multigrid solvers. Applying multigrid techniques to a set of equations whose fundamental modes are dispersive waves is a promising solution to CEMM problems. For the Phase 1, we implemented multigrid preconditioners from the HYPRE project of the Center for Applied Scientific Computing at LLNL via PETSc of the DOE SciDAC TOPS for the real matrix systems of the extended MHD code NIMROD which is a one of the primary modeling codes of the OFES-funded Center for Extended Magnetohydrodynamic Modeling (CEMM) SciDAC. We implemented the multigrid solvers on the fusion test problem that allows for real matrix systems with success, and in the process learned about the details of NIMROD data structures and the difficulties of inverting NIMROD operators. The further success of this project will allow for efficient usage of future petascale computers at the National Leadership Facilities: Oak Ridge National Laboratory, Argonne National Laboratory, and National Energy Research Scientific Computing Center. The project will be a collaborative effort between computational plasma physicists and applied mathematicians at Tech-X Corporation, applied mathematicians Front Range Scientific Computations, Inc. (who are collaborators on the HYPRE project), and other computational plasma physicists involved with the CEMM project.

  11. Transient analysis of electromagnetic wave interactions on plasmonic nanostructures using a surface integral equation solver

    KAUST Repository

    Uysal, Ismail Enes

    2016-08-09

    Transient electromagnetic interactions on plasmonic nanostructures are analyzed by solving the Poggio-Miller-Chan-Harrington-Wu-Tsai (PMCHWT) surface integral equation (SIE). Equivalent (unknown) electric and magnetic current densities, which are introduced on the surfaces of the nanostructures, are expanded using Rao-Wilton-Glisson and polynomial basis functions in space and time, respectively. Inserting this expansion into the PMCHWT-SIE and Galerkin testing the resulting equation at discrete times yield a system of equations that is solved for the current expansion coefficients by a marching on-in-time (MOT) scheme. The resulting MOT-PMCHWT-SIE solver calls for computation of additional convolutions between the temporal basis function and the plasmonic medium\\'s permittivity and Green function. This computation is carried out with almost no additional cost and without changing the computational complexity of the solver. Time-domain samples of the permittivity and the Green function required by these convolutions are obtained from their frequency-domain samples using a fast relaxed vector fitting algorithm. Numerical results demonstrate the accuracy and applicability of the proposed MOT-PMCHWT solver. © 2016 Optical Society of America.

  12. JiTTree: A Just-in-Time Compiled Sparse GPU Volume Data Structure

    KAUST Repository

    Labschutz, Matthias

    2015-08-12

    Sparse volume data structures enable the efficient representation of large but sparse volumes in GPU memory for computation and visualization. However, the choice of a specific data structure for a given data set depends on several factors, such as the memory budget, the sparsity of the data, and data access patterns. In general, there is no single optimal sparse data structure, but a set of several candidates with individual strengths and drawbacks. One solution to this problem are hybrid data structures which locally adapt themselves to the sparsity. However, they typically suffer from increased traversal overhead which limits their utility in many applications. This paper presents JiTTree, a novel sparse hybrid volume data structure that uses just-in-time compilation to overcome these problems. By combining multiple sparse data structures and reducing traversal overhead we leverage their individual advantages. We demonstrate that hybrid data structures adapt well to a large range of data sets. They are especially superior to other sparse data structures for data sets that locally vary in sparsity. Possible optimization criteria are memory, performance and a combination thereof. Through just-in-time (JIT) compilation, JiTTree reduces the traversal overhead of the resulting optimal data structure. As a result, our hybrid volume data structure enables efficient computations on the GPU, while being superior in terms of memory usage when compared to non-hybrid data structures.

  13. JiTTree: A Just-in-Time Compiled Sparse GPU Volume Data Structure

    KAUST Repository

    Labschutz, Matthias; Bruckner, Stefan; Groller, M. Eduard; Hadwiger, Markus; Rautek, Peter

    2015-01-01

    Sparse volume data structures enable the efficient representation of large but sparse volumes in GPU memory for computation and visualization. However, the choice of a specific data structure for a given data set depends on several factors, such as the memory budget, the sparsity of the data, and data access patterns. In general, there is no single optimal sparse data structure, but a set of several candidates with individual strengths and drawbacks. One solution to this problem are hybrid data structures which locally adapt themselves to the sparsity. However, they typically suffer from increased traversal overhead which limits their utility in many applications. This paper presents JiTTree, a novel sparse hybrid volume data structure that uses just-in-time compilation to overcome these problems. By combining multiple sparse data structures and reducing traversal overhead we leverage their individual advantages. We demonstrate that hybrid data structures adapt well to a large range of data sets. They are especially superior to other sparse data structures for data sets that locally vary in sparsity. Possible optimization criteria are memory, performance and a combination thereof. Through just-in-time (JIT) compilation, JiTTree reduces the traversal overhead of the resulting optimal data structure. As a result, our hybrid volume data structure enables efficient computations on the GPU, while being superior in terms of memory usage when compared to non-hybrid data structures.

  14. JiTTree: A Just-in-Time Compiled Sparse GPU Volume Data Structure.

    Science.gov (United States)

    Labschütz, Matthias; Bruckner, Stefan; Gröller, M Eduard; Hadwiger, Markus; Rautek, Peter

    2016-01-01

    Sparse volume data structures enable the efficient representation of large but sparse volumes in GPU memory for computation and visualization. However, the choice of a specific data structure for a given data set depends on several factors, such as the memory budget, the sparsity of the data, and data access patterns. In general, there is no single optimal sparse data structure, but a set of several candidates with individual strengths and drawbacks. One solution to this problem are hybrid data structures which locally adapt themselves to the sparsity. However, they typically suffer from increased traversal overhead which limits their utility in many applications. This paper presents JiTTree, a novel sparse hybrid volume data structure that uses just-in-time compilation to overcome these problems. By combining multiple sparse data structures and reducing traversal overhead we leverage their individual advantages. We demonstrate that hybrid data structures adapt well to a large range of data sets. They are especially superior to other sparse data structures for data sets that locally vary in sparsity. Possible optimization criteria are memory, performance and a combination thereof. Through just-in-time (JIT) compilation, JiTTree reduces the traversal overhead of the resulting optimal data structure. As a result, our hybrid volume data structure enables efficient computations on the GPU, while being superior in terms of memory usage when compared to non-hybrid data structures.

  15. Uncovering Transcriptional Regulatory Networks by Sparse Bayesian Factor Model

    Directory of Open Access Journals (Sweden)

    Qi Yuan(Alan

    2010-01-01

    Full Text Available Abstract The problem of uncovering transcriptional regulation by transcription factors (TFs based on microarray data is considered. A novel Bayesian sparse correlated rectified factor model (BSCRFM is proposed that models the unknown TF protein level activity, the correlated regulations between TFs, and the sparse nature of TF-regulated genes. The model admits prior knowledge from existing database regarding TF-regulated target genes based on a sparse prior and through a developed Gibbs sampling algorithm, a context-specific transcriptional regulatory network specific to the experimental condition of the microarray data can be obtained. The proposed model and the Gibbs sampling algorithm were evaluated on the simulated systems, and results demonstrated the validity and effectiveness of the proposed approach. The proposed model was then applied to the breast cancer microarray data of patients with Estrogen Receptor positive ( status and Estrogen Receptor negative ( status, respectively.

  16. Numerical Platon: A unified linear equation solver interface by CEA for solving open foe scientific applications

    International Nuclear Information System (INIS)

    Secher, Bernard; Belliard, Michel; Calvin, Christophe

    2009-01-01

    This paper describes a tool called 'Numerical Platon' developed by the French Atomic Energy Commission (CEA). It provides a freely available (GNU LGPL license) interface for coupling scientific computing applications to various freeware linear solver libraries (essentially PETSc, SuperLU and HyPre), together with some proprietary CEA solvers, for high-performance computers that may be used in industrial software written in various programming languages. This tool was developed as part of considerable efforts by the CEA Nuclear Energy Division in the past years to promote massively parallel software and on-shelf parallel tools to help develop new generation simulation codes. After the presentation of the package architecture and the available algorithms, we show examples of how Numerical Platon is used in sequential and parallel CEA codes. Comparing with in-house solvers, the gain in terms of increases in computation capacities or in terms of parallel performances is notable, without considerable extra development cost

  17. Numerical Platon: A unified linear equation solver interface by CEA for solving open foe scientific applications

    Energy Technology Data Exchange (ETDEWEB)

    Secher, Bernard [French Atomic Energy Commission (CEA), Nuclear Energy Division (DEN) (France); CEA Saclay DM2S/SFME/LGLS, Bat. 454, F-91191 Gif-sur-Yvette Cedex (France)], E-mail: bsecher@cea.fr; Belliard, Michel [French Atomic Energy Commission (CEA), Nuclear Energy Division (DEN) (France); CEA Cadarache DER/SSTH/LMDL, Bat. 238, F-13108 Saint-Paul-lez-Durance Cedex (France); Calvin, Christophe [French Atomic Energy Commission (CEA), Nuclear Energy Division (DEN) (France); CEA Saclay DM2S/SERMA/LLPR, Bat. 470, F-91191 Gif-sur-Yvette Cedex (France)

    2009-01-15

    This paper describes a tool called 'Numerical Platon' developed by the French Atomic Energy Commission (CEA). It provides a freely available (GNU LGPL license) interface for coupling scientific computing applications to various freeware linear solver libraries (essentially PETSc, SuperLU and HyPre), together with some proprietary CEA solvers, for high-performance computers that may be used in industrial software written in various programming languages. This tool was developed as part of considerable efforts by the CEA Nuclear Energy Division in the past years to promote massively parallel software and on-shelf parallel tools to help develop new generation simulation codes. After the presentation of the package architecture and the available algorithms, we show examples of how Numerical Platon is used in sequential and parallel CEA codes. Comparing with in-house solvers, the gain in terms of increases in computation capacities or in terms of parallel performances is notable, without considerable extra development cost.

  18. A Massively Parallel Solver for the Mechanical Harmonic Analysis of Accelerator Cavities

    International Nuclear Information System (INIS)

    2015-01-01

    ACE3P is a 3D massively parallel simulation suite that developed at SLAC National Accelerator Laboratory that can perform coupled electromagnetic, thermal and mechanical study. Effectively utilizing supercomputer resources, ACE3P has become a key simulation tool for particle accelerator R and D. A new frequency domain solver to perform mechanical harmonic response analysis of accelerator components is developed within the existing parallel framework. This solver is designed to determine the frequency response of the mechanical system to external harmonic excitations for time-efficient accurate analysis of the large-scale problems. Coupled with the ACE3P electromagnetic modules, this capability complements a set of multi-physics tools for a comprehensive study of microphonics in superconducting accelerating cavities in order to understand the RF response and feedback requirements for the operational reliability of a particle accelerator. (auth)

  19. Steady-State Anderson Accelerated Coupling of Lattice Boltzmann and Navier–Stokes Solvers

    KAUST Repository

    Atanasov, Atanas

    2016-10-17

    We present an Anderson acceleration-based approach to spatially couple three-dimensional Lattice Boltzmann and Navier–Stokes (LBNS) flow simulations. This allows to locally exploit the computational features of both fluid flow solver approaches to the fullest extent and yields enhanced control to match the LB and NS degrees of freedom within the LBNS overlap layer. Designed for parallel Schwarz coupling, the Anderson acceleration allows for the simultaneous execution of both Lattice Boltzmann and Navier–Stokes solver. We detail our coupling methodology, validate it, and study convergence and accuracy of the Anderson accelerated coupling, considering three steady-state scenarios: plane channel flow, flow around a sphere and channel flow across a porous structure. We find that the Anderson accelerated coupling yields a speed-up (in terms of iteration steps) of up to 40% in the considered scenarios, compared to strictly sequential Schwarz coupling.

  20. Magnetic Resonance Super-resolution Imaging Measurement with Dictionary-optimized Sparse Learning

    Directory of Open Access Journals (Sweden)

    Li Jun-Bao

    2017-06-01

    Full Text Available Magnetic Resonance Super-resolution Imaging Measurement (MRIM is an effective way of measuring materials. MRIM has wide applications in physics, chemistry, biology, geology, medical and material science, especially in medical diagnosis. It is feasible to improve the resolution of MR imaging through increasing radiation intensity, but the high radiation intensity and the longtime of magnetic field harm the human body. Thus, in the practical applications the resolution of hardware imaging reaches the limitation of resolution. Software-based super-resolution technology is effective to improve the resolution of image. This work proposes a framework of dictionary-optimized sparse learning based MR super-resolution method. The framework is to solve the problem of sample selection for dictionary learning of sparse reconstruction. The textural complexity-based image quality representation is proposed to choose the optimal samples for dictionary learning. Comprehensive experiments show that the dictionary-optimized sparse learning improves the performance of sparse representation.