WorldWideScience

Sample records for time-parallel iterative methods

  1. Parallel S/sub n/ iteration schemes

    International Nuclear Information System (INIS)

    Wienke, B.R.; Hiromoto, R.E.

    1986-01-01

    The iterative, multigroup, discrete ordinates (S/sub n/) technique for solving the linear transport equation enjoys widespread usage and appeal. Serial iteration schemes and numerical algorithms developed over the years provide a timely framework for parallel extension. On the Denelcor HEP, the authors investigate three parallel iteration schemes for solving the one-dimensional S/sub n/ transport equation. The multigroup representation and serial iteration methods are also reviewed. This analysis represents a first attempt to extend serial S/sub n/ algorithms to parallel environments and provides good baseline estimates on ease of parallel implementation, relative algorithm efficiency, comparative speedup, and some future directions. The authors examine ordered and chaotic versions of these strategies, with and without concurrent rebalance and diffusion acceleration. Two strategies efficiently support high degrees of parallelization and appear to be robust parallel iteration techniques. The third strategy is a weaker parallel algorithm. Chaotic iteration, difficult to simulate on serial machines, holds promise and converges faster than ordered versions of the schemes. Actual parallel speedup and efficiency are high and payoff appears substantial

  2. Parallel computation of multigroup reactivity coefficient using iterative method

    Science.gov (United States)

    Susmikanti, Mike; Dewayatna, Winter

    2013-09-01

    One of the research activities to support the commercial radioisotope production program is a safety research target irradiation FPM (Fission Product Molybdenum). FPM targets form a tube made of stainless steel in which the nuclear degrees of superimposed high-enriched uranium. FPM irradiation tube is intended to obtain fission. The fission material widely used in the form of kits in the world of nuclear medicine. Irradiation FPM tube reactor core would interfere with performance. One of the disorders comes from changes in flux or reactivity. It is necessary to study a method for calculating safety terrace ongoing configuration changes during the life of the reactor, making the code faster became an absolute necessity. Neutron safety margin for the research reactor can be reused without modification to the calculation of the reactivity of the reactor, so that is an advantage of using perturbation method. The criticality and flux in multigroup diffusion model was calculate at various irradiation positions in some uranium content. This model has a complex computation. Several parallel algorithms with iterative method have been developed for the sparse and big matrix solution. The Black-Red Gauss Seidel Iteration and the power iteration parallel method can be used to solve multigroup diffusion equation system and calculated the criticality and reactivity coeficient. This research was developed code for reactivity calculation which used one of safety analysis with parallel processing. It can be done more quickly and efficiently by utilizing the parallel processing in the multicore computer. This code was applied for the safety limits calculation of irradiated targets FPM with increment Uranium.

  3. Primal Domain Decomposition Method with Direct and Iterative Solver for Circuit-Field-Torque Coupled Parallel Finite Element Method to Electric Machine Modelling

    Directory of Open Access Journals (Sweden)

    Daniel Marcsa

    2015-01-01

    Full Text Available The analysis and design of electromechanical devices involve the solution of large sparse linear systems, and require therefore high performance algorithms. In this paper, the primal Domain Decomposition Method (DDM with parallel forward-backward and with parallel Preconditioned Conjugate Gradient (PCG solvers are introduced in two-dimensional parallel time-stepping finite element formulation to analyze rotating machine considering the electromagnetic field, external circuit and rotor movement. The proposed parallel direct and the iterative solver with two preconditioners are analyzed concerning its computational efficiency and number of iterations of the solver with different preconditioners. Simulation results of a rotating machine is also presented.

  4. Sparse BLIP: BLind Iterative Parallel imaging reconstruction using compressed sensing.

    Science.gov (United States)

    She, Huajun; Chen, Rong-Rong; Liang, Dong; DiBella, Edward V R; Ying, Leslie

    2014-02-01

    To develop a sensitivity-based parallel imaging reconstruction method to reconstruct iteratively both the coil sensitivities and MR image simultaneously based on their prior information. Parallel magnetic resonance imaging reconstruction problem can be formulated as a multichannel sampling problem where solutions are sought analytically. However, the channel functions given by the coil sensitivities in parallel imaging are not known exactly and the estimation error usually leads to artifacts. In this study, we propose a new reconstruction algorithm, termed Sparse BLind Iterative Parallel, for blind iterative parallel imaging reconstruction using compressed sensing. The proposed algorithm reconstructs both the sensitivity functions and the image simultaneously from undersampled data. It enforces the sparseness constraint in the image as done in compressed sensing, but is different from compressed sensing in that the sensing matrix is unknown and additional constraint is enforced on the sensitivities as well. Both phantom and in vivo imaging experiments were carried out with retrospective undersampling to evaluate the performance of the proposed method. Experiments show improvement in Sparse BLind Iterative Parallel reconstruction when compared with Sparse SENSE, JSENSE, IRGN-TV, and L1-SPIRiT reconstructions with the same number of measurements. The proposed Sparse BLind Iterative Parallel algorithm reduces the reconstruction errors when compared to the state-of-the-art parallel imaging methods. Copyright © 2013 Wiley Periodicals, Inc.

  5. Time parallelization of advanced operation scenario simulations of ITER plasma

    International Nuclear Information System (INIS)

    Samaddar, D; Casper, T A; Kim, S H; Houlberg, W A; Berry, L A; Elwasif, W R; Batchelor, D

    2013-01-01

    This work demonstrates that simulations of advanced burning plasma operation scenarios can be successfully parallelized in time using the parareal algorithm. CORSICA -an advanced operation scenario code for tokamak plasmas is used as a test case. This is a unique application since the parareal algorithm has so far been applied to relatively much simpler systems except for the case of turbulence. In the present application, a computational gain of an order of magnitude has been achieved which is extremely promising. A successful implementation of the Parareal algorithm to codes like CORSICA ushers in the possibility of time efficient simulations of ITER plasmas.

  6. Gauss-Seidel Iterative Method as a Real-Time Pile-Up Solver of Scintillation Pulses

    Science.gov (United States)

    Novak, Roman; Vencelj, Matja¿

    2009-12-01

    The pile-up rejection in nuclear spectroscopy has been confronted recently by several pile-up correction schemes that compensate for distortions of the signal and subsequent energy spectra artifacts as the counting rate increases. We study here a real-time capability of the event-by-event correction method, which at the core translates to solving many sets of linear equations. Tight time limits and constrained front-end electronics resources make well-known direct solvers inappropriate. We propose a novel approach based on the Gauss-Seidel iterative method, which turns out to be a stable and cost-efficient solution to improve spectroscopic resolution in the front-end electronics. We show the method convergence properties for a class of matrices that emerge in calorimetric processing of scintillation detector signals and demonstrate the ability of the method to support the relevant resolutions. The sole iteration-based error component can be brought below the sliding window induced errors in a reasonable number of iteration steps, thus allowing real-time operation. An area-efficient hardware implementation is proposed that fully utilizes the method's inherent parallelism.

  7. Parallel iterative solution of the Hermite Collocation equations on GPUs II

    International Nuclear Information System (INIS)

    Vilanakis, N; Mathioudakis, E

    2014-01-01

    Hermite Collocation is a high order finite element method for Boundary Value Problems modelling applications in several fields of science and engineering. Application of this integration free numerical solver for the solution of linear BVPs results in a large and sparse general system of algebraic equations, suggesting the usage of an efficient iterative solver especially for realistic simulations. In part I of this work an efficient parallel algorithm of the Schur complement method coupled with Bi-Conjugate Gradient Stabilized (BiCGSTAB) iterative solver has been designed for multicore computing architectures with a Graphics Processing Unit (GPU). In the present work the proposed algorithm has been extended for high performance computing environments consisting of multiprocessor machines with multiple GPUs. Since this is a distributed GPU and shared CPU memory parallel architecture, a hybrid memory treatment is needed for the development of the parallel algorithm. The realization of the algorithm took place on a multiprocessor machine HP SL390 with Tesla M2070 GPUs using the OpenMP and OpenACC standards. Execution time measurements reveal the efficiency of the parallel implementation

  8. P-SPARSLIB: A parallel sparse iterative solution package

    Energy Technology Data Exchange (ETDEWEB)

    Saad, Y. [Univ. of Minnesota, Minneapolis, MN (United States)

    1994-12-31

    Iterative methods are gaining popularity in engineering and sciences at a time where the computational environment is changing rapidly. P-SPARSLIB is a project to build a software library for sparse matrix computations on parallel computers. The emphasis is on iterative methods and the use of distributed sparse matrices, an extension of the domain decomposition approach to general sparse matrices. One of the goals of this project is to develop a software package geared towards specific applications. For example, the author will test the performance and usefulness of P-SPARSLIB modules on linear systems arising from CFD applications. Equally important is the goal of portability. In the long run, the author wishes to ensure that this package is portable on a variety of platforms, including SIMD environments and shared memory environments.

  9. PRIM: An Efficient Preconditioning Iterative Reweighted Least Squares Method for Parallel Brain MRI Reconstruction.

    Science.gov (United States)

    Xu, Zheng; Wang, Sheng; Li, Yeqing; Zhu, Feiyun; Huang, Junzhou

    2018-02-08

    The most recent history of parallel Magnetic Resonance Imaging (pMRI) has in large part been devoted to finding ways to reduce acquisition time. While joint total variation (JTV) regularized model has been demonstrated as a powerful tool in increasing sampling speed for pMRI, however, the major bottleneck is the inefficiency of the optimization method. While all present state-of-the-art optimizations for the JTV model could only reach a sublinear convergence rate, in this paper, we squeeze the performance by proposing a linear-convergent optimization method for the JTV model. The proposed method is based on the Iterative Reweighted Least Squares algorithm. Due to the complexity of the tangled JTV objective, we design a novel preconditioner to further accelerate the proposed method. Extensive experiments demonstrate the superior performance of the proposed algorithm for pMRI regarding both accuracy and efficiency compared with state-of-the-art methods.

  10. A parallel algorithm for the two-dimensional time fractional diffusion equation with implicit difference method.

    Science.gov (United States)

    Gong, Chunye; Bao, Weimin; Tang, Guojian; Jiang, Yuewen; Liu, Jie

    2014-01-01

    It is very time consuming to solve fractional differential equations. The computational complexity of two-dimensional fractional differential equation (2D-TFDE) with iterative implicit finite difference method is O(M(x)M(y)N(2)). In this paper, we present a parallel algorithm for 2D-TFDE and give an in-depth discussion about this algorithm. A task distribution model and data layout with virtual boundary are designed for this parallel algorithm. The experimental results show that the parallel algorithm compares well with the exact solution. The parallel algorithm on single Intel Xeon X5540 CPU runs 3.16-4.17 times faster than the serial algorithm on single CPU core. The parallel efficiency of 81 processes is up to 88.24% compared with 9 processes on a distributed memory cluster system. We do think that the parallel computing technology will become a very basic method for the computational intensive fractional applications in the near future.

  11. Iterative Refinement Methods for Time-Domain Equalizer Design

    Directory of Open Access Journals (Sweden)

    Evans Brian L

    2006-01-01

    Full Text Available Commonly used time domain equalizer (TEQ design methods have been recently unified as an optimization problem involving an objective function in the form of a Rayleigh quotient. The direct generalized eigenvalue solution relies on matrix decompositions. To reduce implementation complexity, we propose an iterative refinement approach in which the TEQ length starts at two taps and increases by one tap at each iteration. Each iteration involves matrix-vector multiplications and vector additions with matrices and two-element vectors. At each iteration, the optimization of the objective function either improves or the approach terminates. The iterative refinement approach provides a range of communication performance versus implementation complexity tradeoffs for any TEQ method that fits the Rayleigh quotient framework. We apply the proposed approach to three such TEQ design methods: maximum shortening signal-to-noise ratio, minimum intersymbol interference, and minimum delay spread.

  12. Improved parallel solution techniques for the integral transport matrix method

    Energy Technology Data Exchange (ETDEWEB)

    Zerr, R. Joseph, E-mail: rjz116@psu.edu [Department of Mechanical and Nuclear Engineering, The Pennsylvania State University, University Park, PA (United States); Azmy, Yousry Y., E-mail: yyazmy@ncsu.edu [Department of Nuclear Engineering, North Carolina State University, Burlington Engineering Laboratories, Raleigh, NC (United States)

    2011-07-01

    Alternative solution strategies to the parallel block Jacobi (PBJ) method for the solution of the global problem with the integral transport matrix method operators have been designed and tested. The most straightforward improvement to the Jacobi iterative method is the Gauss-Seidel alternative. The parallel red-black Gauss-Seidel (PGS) algorithm can improve on the number of iterations and reduce work per iteration by applying an alternating red-black color-set to the subdomains and assigning multiple sub-domains per processor. A parallel GMRES(m) method was implemented as an alternative to stationary iterations. Computational results show that the PGS method can improve on the PBJ method execution time by up to 10´ when eight sub-domains per processor are used. However, compared to traditional source iterations with diffusion synthetic acceleration, it is still approximately an order of magnitude slower. The best-performing cases are optically thick because sub-domains decouple, yielding faster convergence. Further tests revealed that 64 sub-domains per processor was the best performing level of sub-domain division. An acceleration technique that improves the convergence rate would greatly improve the ITMM. The GMRES(m) method with a diagonal block pre conditioner consumes approximately the same time as the PBJ solver but could be improved by an as yet undeveloped, more efficient pre conditioner. (author)

  13. Improved parallel solution techniques for the integral transport matrix method

    International Nuclear Information System (INIS)

    Zerr, R. Joseph; Azmy, Yousry Y.

    2011-01-01

    Alternative solution strategies to the parallel block Jacobi (PBJ) method for the solution of the global problem with the integral transport matrix method operators have been designed and tested. The most straightforward improvement to the Jacobi iterative method is the Gauss-Seidel alternative. The parallel red-black Gauss-Seidel (PGS) algorithm can improve on the number of iterations and reduce work per iteration by applying an alternating red-black color-set to the subdomains and assigning multiple sub-domains per processor. A parallel GMRES(m) method was implemented as an alternative to stationary iterations. Computational results show that the PGS method can improve on the PBJ method execution time by up to 10´ when eight sub-domains per processor are used. However, compared to traditional source iterations with diffusion synthetic acceleration, it is still approximately an order of magnitude slower. The best-performing cases are optically thick because sub-domains decouple, yielding faster convergence. Further tests revealed that 64 sub-domains per processor was the best performing level of sub-domain division. An acceleration technique that improves the convergence rate would greatly improve the ITMM. The GMRES(m) method with a diagonal block pre conditioner consumes approximately the same time as the PBJ solver but could be improved by an as yet undeveloped, more efficient pre conditioner. (author)

  14. Colorado Conference on iterative methods. Volume 1

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    1994-12-31

    The conference provided a forum on many aspects of iterative methods. Volume I topics were:Session: domain decomposition, nonlinear problems, integral equations and inverse problems, eigenvalue problems, iterative software kernels. Volume II presents nonsymmetric solvers, parallel computation, theory of iterative methods, software and programming environment, ODE solvers, multigrid and multilevel methods, applications, robust iterative methods, preconditioners, Toeplitz and circulation solvers, and saddle point problems. Individual papers are indexed separately on the EDB.

  15. Parallel iterative procedures for approximate solutions of wave propagation by finite element and finite difference methods

    Energy Technology Data Exchange (ETDEWEB)

    Kim, S. [Purdue Univ., West Lafayette, IN (United States)

    1994-12-31

    Parallel iterative procedures based on domain decomposition techniques are defined and analyzed for the numerical solution of wave propagation by finite element and finite difference methods. For finite element methods, in a Lagrangian framework, an efficient way for choosing the algorithm parameter as well as the algorithm convergence are indicated. Some heuristic arguments for finding the algorithm parameter for finite difference schemes are addressed. Numerical results are presented to indicate the effectiveness of the methods.

  16. Parallel iterative solvers and preconditioners using approximate hierarchical methods

    Energy Technology Data Exchange (ETDEWEB)

    Grama, A.; Kumar, V.; Sameh, A. [Univ. of Minnesota, Minneapolis, MN (United States)

    1996-12-31

    In this paper, we report results of the performance, convergence, and accuracy of a parallel GMRES solver for Boundary Element Methods. The solver uses a hierarchical approximate matrix-vector product based on a hybrid Barnes-Hut / Fast Multipole Method. We study the impact of various accuracy parameters on the convergence and show that with minimal loss in accuracy, our solver yields significant speedups. We demonstrate the excellent parallel efficiency and scalability of our solver. The combined speedups from approximation and parallelism represent an improvement of several orders in solution time. We also develop fast and paralellizable preconditioners for this problem. We report on the performance of an inner-outer scheme and a preconditioner based on truncated Green`s function. Experimental results on a 256 processor Cray T3D are presented.

  17. AZTEC: A parallel iterative package for the solving linear systems

    Energy Technology Data Exchange (ETDEWEB)

    Hutchinson, S.A.; Shadid, J.N.; Tuminaro, R.S. [Sandia National Labs., Albuquerque, NM (United States)

    1996-12-31

    We describe a parallel linear system package, AZTEC. The package incorporates a number of parallel iterative methods (e.g. GMRES, biCGSTAB, CGS, TFQMR) and preconditioners (e.g. Jacobi, Gauss-Seidel, polynomial, domain decomposition with LU or ILU within subdomains). Additionally, AZTEC allows for the reuse of previous preconditioning factorizations within Newton schemes for nonlinear methods. Currently, a number of different users are using this package to solve a variety of PDE applications.

  18. An iterative algorithm for solving the multidimensional neutron diffusion nodal method equations on parallel computers

    International Nuclear Information System (INIS)

    Kirk, B.L.; Azmy, Y.Y.

    1992-01-01

    In this paper the one-group, steady-state neutron diffusion equation in two-dimensional Cartesian geometry is solved using the nodal integral method. The discrete variable equations comprise loosely coupled sets of equations representing the nodal balance of neutrons, as well as neutron current continuity along rows or columns of computational cells. An iterative algorithm that is more suitable for solving large problems concurrently is derived based on the decomposition of the spatial domain and is accelerated using successive overrelaxation. This algorithm is very well suited for parallel computers, especially since the spatial domain decomposition occurs naturally, so that the number of iterations required for convergence does not depend on the number of processors participating in the calculation. Implementation of the authors' algorithm on the Intel iPSC/2 hypercube and Sequent Balance 8000 parallel computer is presented, and measured speedup and efficiency for test problems are reported. The results suggest that the efficiency of the hypercube quickly deteriorates when many processors are used, while the Sequent Balance retains very high efficiency for a comparable number of participating processors. This leads to the conjecture that message-passing parallel computers are not as well suited for this algorithm as shared-memory machines

  19. Design of parallel intersector weld/cut robot for machining processes in ITER vacuum vessel

    International Nuclear Information System (INIS)

    Wu Huapeng; Handroos, Heikki; Kovanen, Janne; Rouvinen, Asko; Hannukainen, Petri; Saira, Tanja; Jones, Lawrence

    2003-01-01

    This paper presents a new parallel robot Penta-WH, which has five degrees of freedom driven by hydraulic cylinders. The manipulator has a large, singularity-free workspace and high stiffness and it acts as a transport device for welding, machining and inspection end-effectors inside the ITER vacuum vessel. The presented kinematic structure of a parallel robot is particularly suitable for the ITER environment. Analysis of the machining process for ITER, such as the machining methods and forces are given, and the kinematic analyses, such as workspace and force capacity are discussed

  20. PARALLEL ITERATIVE RECONSTRUCTION OF PHANTOM CATPHAN ON EXPERIMENTAL DATA

    Directory of Open Access Journals (Sweden)

    M. A. Mirzavand

    2016-01-01

    Full Text Available The principles of fast parallel iterative algorithms based on the use of graphics accelerators and OpenGL library are considered in the paper. The proposed approach provides simultaneous minimization of the residuals of the desired solution and total variation of the reconstructed three- dimensional image. The number of necessary input data, i. e. conical X-ray projections, can be reduced several times. It means in a corresponding number of times the possibility to reduce radiation exposure to the patient. At the same time maintain the necessary contrast and spatial resolution of threedimensional image of the patient. Heuristic iterative algorithm can be used as an alternative to the well-known three-dimensional Feldkamp algorithm.

  1. Parallelization of the model-based iterative reconstruction algorithm DIRA

    International Nuclear Information System (INIS)

    Oertenberg, A.; Sandborg, M.; Alm Carlsson, G.; Malusek, A.; Magnusson, M.

    2016-01-01

    New paradigms for parallel programming have been devised to simplify software development on multi-core processors and many-core graphical processing units (GPU). Despite their obvious benefits, the parallelization of existing computer programs is not an easy task. In this work, the use of the Open Multiprocessing (OpenMP) and Open Computing Language (OpenCL) frameworks is considered for the parallelization of the model-based iterative reconstruction algorithm DIRA with the aim to significantly shorten the code's execution time. Selected routines were parallelized using OpenMP and OpenCL libraries; some routines were converted from MATLAB to C and optimised. Parallelization of the code with the OpenMP was easy and resulted in an overall speedup of 15 on a 16-core computer. Parallelization with OpenCL was more difficult owing to differences between the central processing unit and GPU architectures. The resulting speedup was substantially lower than the theoretical peak performance of the GPU; the cause was explained. (authors)

  2. Implementation of the multireference Brillouin-Wigner and Mukherjee’s coupled cluster methods with non-iterative triple excitations utilizing reference-level parallelism

    Energy Technology Data Exchange (ETDEWEB)

    Bhaskaran-Nair, Kiran; Brabec, Jiri; Apra, Edoardo; van Dam, Hubertus JJ; Pittner, Jiri; Kowalski, Karol

    2012-09-07

    In this paper we discuss the performance of the non-iterative State-Specific Mul- tireference Coupled Cluster (SS-MRCC) methods accounting for the effect of triply excited cluster amplitudes. The corrections to the Brillouin-Wigner and Mukherjee MRCC models based on the manifold of singly and doubly excited cluster amplitudes (BW-MRCCSD and Mk-MRCCSD, respectively) are tested and compared with the exact full configuration interaction results (FCI) for small systems (H2O, N2, and Be3). For larger systems (naphthyne isomers and -carotene), the non-iterative BW-MRCCSD(T) and Mk-MRCCSD(T) methods are compared against the results obtained with the single reference coupled cluster methods. We also report on the parallel performance of the non-iterative implementations based on the use of pro- cessor groups.

  3. Parallel Jacobi EVD Methods on Integrated Circuits

    Directory of Open Access Journals (Sweden)

    Chi-Chia Sun

    2014-01-01

    Full Text Available Design strategies for parallel iterative algorithms are presented. In order to further study different tradeoff strategies in design criteria for integrated circuits, A 10 × 10 Jacobi Brent-Luk-EVD array with the simplified μ-CORDIC processor is used as an example. The experimental results show that using the μ-CORDIC processor is beneficial for the design criteria as it yields a smaller area, faster overall computation time, and less energy consumption than the regular CORDIC processor. It is worth to notice that the proposed parallel EVD method can be applied to real-time and low-power array signal processing algorithms performing beamforming or DOA estimation.

  4. Accuracy analysis of hybrid parallel robot for the assembling of ITER

    Energy Technology Data Exchange (ETDEWEB)

    Wang Yongbo [Institute of Mechatronics and Virtual Engineering, Lappeenranta University of Technology, Skinnarilankatu 34, 53850 Lappeenranta (Finland); The State Key Laboratory of Mechanical Transmission, Chongqing University (China); Pessi, Pekka [Institute of Mechatronics and Virtual Engineering, Lappeenranta University of Technology, Skinnarilankatu 34, 53850 Lappeenranta (Finland); Wu Huapeng [Institute of Mechatronics and Virtual Engineering, Lappeenranta University of Technology, Skinnarilankatu 34, 53850 Lappeenranta (Finland)], E-mail: huapeng@lut.fi; Handroos, Heikki [Institute of Mechatronics and Virtual Engineering, Lappeenranta University of Technology, Skinnarilankatu 34, 53850 Lappeenranta (Finland)

    2009-06-15

    This paper presents a novel mobile parallel robot, which is able to carry welding and machining processes from inside the international thermonuclear experimental reactor (ITER) vacuum vessel (VV). The kinematics design of the robot has been optimized for ITER access. To improve the accuracy of the parallel robot, the errors caused by the stiffness and manufacture process have to be compensated or limited to a minimum value. In this paper kinematics errors and stiffness modeling are given. The simulation results are presented.

  5. Accuracy analysis of hybrid parallel robot for the assembling of ITER

    International Nuclear Information System (INIS)

    Wang Yongbo; Pessi, Pekka; Wu Huapeng; Handroos, Heikki

    2009-01-01

    This paper presents a novel mobile parallel robot, which is able to carry welding and machining processes from inside the international thermonuclear experimental reactor (ITER) vacuum vessel (VV). The kinematics design of the robot has been optimized for ITER access. To improve the accuracy of the parallel robot, the errors caused by the stiffness and manufacture process have to be compensated or limited to a minimum value. In this paper kinematics errors and stiffness modeling are given. The simulation results are presented.

  6. Iterative algorithms for large sparse linear systems on parallel computers

    Science.gov (United States)

    Adams, L. M.

    1982-01-01

    Algorithms for assembling in parallel the sparse system of linear equations that result from finite difference or finite element discretizations of elliptic partial differential equations, such as those that arise in structural engineering are developed. Parallel linear stationary iterative algorithms and parallel preconditioned conjugate gradient algorithms are developed for solving these systems. In addition, a model for comparing parallel algorithms on array architectures is developed and results of this model for the algorithms are given.

  7. Copper Mountain conference on iterative methods: Proceedings: Volume 2

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    1996-10-01

    This volume (the second of two) contains information presented during the last two days of the Copper Mountain Conference on Iterative Methods held April 9-13, 1996 at Copper Mountain, Colorado. Topics of the sessions held these two days include domain decomposition, Krylov methods, computational fluid dynamics, Markov chains, sparse and parallel basic linear algebra subprograms, multigrid methods, applications of iterative methods, equation systems with multiple right-hand sides, projection methods, and the Helmholtz equation. Selected papers indexed separately for the Energy Science and Technology Database.

  8. Variable aperture-based ptychographical iterative engine method

    Science.gov (United States)

    Sun, Aihui; Kong, Yan; Meng, Xin; He, Xiaoliang; Du, Ruijun; Jiang, Zhilong; Liu, Fei; Xue, Liang; Wang, Shouyu; Liu, Cheng

    2018-02-01

    A variable aperture-based ptychographical iterative engine (vaPIE) is demonstrated both numerically and experimentally to reconstruct the sample phase and amplitude rapidly. By adjusting the size of a tiny aperture under the illumination of a parallel light beam to change the illumination on the sample step by step and recording the corresponding diffraction patterns sequentially, both the sample phase and amplitude can be faithfully reconstructed with a modified ptychographical iterative engine (PIE) algorithm. Since many fewer diffraction patterns are required than in common PIE and the shape, the size, and the position of the aperture need not to be known exactly, this proposed vaPIE method remarkably reduces the data acquisition time and makes PIE less dependent on the mechanical accuracy of the translation stage; therefore, the proposed technique can be potentially applied for various scientific researches.

  9. Efficient parallel implicit methods for rotary-wing aerodynamics calculations

    Science.gov (United States)

    Wissink, Andrew M.

    Euler/Navier-Stokes Computational Fluid Dynamics (CFD) methods are commonly used for prediction of the aerodynamics and aeroacoustics of modern rotary-wing aircraft. However, their widespread application to large complex problems is limited lack of adequate computing power. Parallel processing offers the potential for dramatic increases in computing power, but most conventional implicit solution methods are inefficient in parallel and new techniques must be adopted to realize its potential. This work proposes alternative implicit schemes for Euler/Navier-Stokes rotary-wing calculations which are robust and efficient in parallel. The first part of this work proposes an efficient parallelizable modification of the Lower Upper-Symmetric Gauss Seidel (LU-SGS) implicit operator used in the well-known Transonic Unsteady Rotor Navier Stokes (TURNS) code. The new hybrid LU-SGS scheme couples a point-relaxation approach of the Data Parallel-Lower Upper Relaxation (DP-LUR) algorithm for inter-processor communication with the Symmetric Gauss Seidel algorithm of LU-SGS for on-processor computations. With the modified operator, TURNS is implemented in parallel using Message Passing Interface (MPI) for communication. Numerical performance and parallel efficiency are evaluated on the IBM SP2 and Thinking Machines CM-5 multi-processors for a variety of steady-state and unsteady test cases. The hybrid LU-SGS scheme maintains the numerical performance of the original LU-SGS algorithm in all cases and shows a good degree of parallel efficiency. It experiences a higher degree of robustness than DP-LUR for third-order upwind solutions. The second part of this work examines use of Krylov subspace iterative solvers for the nonlinear CFD solutions. The hybrid LU-SGS scheme is used as a parallelizable preconditioner. Two iterative methods are tested, Generalized Minimum Residual (GMRES) and Orthogonal s-Step Generalized Conjugate Residual (OSGCR). The Newton method demonstrates good

  10. Parallel iterative decoding of transform domain Wyner-Ziv video using cross bitplane correlation

    DEFF Research Database (Denmark)

    Luong, Huynh Van; Huang, Xin; Forchhammer, Søren

    2011-01-01

    decoding scheme is proposed to improve the coding efficiency of TDWZ video codecs. The proposed parallel iterative LDPC decoding scheme is able to utilize cross bitplane correlation during decoding, by iteratively refining the soft-input, updating a modeled noise distribution and thereafter enhancing......In recent years, Transform Domain Wyner-Ziv (TDWZ) video coding has been proposed as an efficient Distributed Video Coding (DVC) solution, which fully or partly exploits the source statistics at the decoder to reduce the computational burden at the encoder. In this paper, a parallel iterative LDPC...

  11. A discrete ordinate response matrix method for massively parallel computers

    International Nuclear Information System (INIS)

    Hanebutte, U.R.; Lewis, E.E.

    1991-01-01

    A discrete ordinate response matrix method is formulated for the solution of neutron transport problems on massively parallel computers. The response matrix formulation eliminates iteration on the scattering source. The nodal matrices which result from the diamond-differenced equations are utilized in a factored form which minimizes memory requirements and significantly reduces the required number of algorithm utilizes massive parallelism by assigning each spatial node to a processor. The algorithm is accelerated effectively by a synthetic method in which the low-order diffusion equations are also solved by massively parallel red/black iterations. The method has been implemented on a 16k Connection Machine-2, and S 8 and S 16 solutions have been obtained for fixed-source benchmark problems in X--Y geometry

  12. Variable aperture-based ptychographical iterative engine method.

    Science.gov (United States)

    Sun, Aihui; Kong, Yan; Meng, Xin; He, Xiaoliang; Du, Ruijun; Jiang, Zhilong; Liu, Fei; Xue, Liang; Wang, Shouyu; Liu, Cheng

    2018-02-01

    A variable aperture-based ptychographical iterative engine (vaPIE) is demonstrated both numerically and experimentally to reconstruct the sample phase and amplitude rapidly. By adjusting the size of a tiny aperture under the illumination of a parallel light beam to change the illumination on the sample step by step and recording the corresponding diffraction patterns sequentially, both the sample phase and amplitude can be faithfully reconstructed with a modified ptychographical iterative engine (PIE) algorithm. Since many fewer diffraction patterns are required than in common PIE and the shape, the size, and the position of the aperture need not to be known exactly, this proposed vaPIE method remarkably reduces the data acquisition time and makes PIE less dependent on the mechanical accuracy of the translation stage; therefore, the proposed technique can be potentially applied for various scientific researches. (2018) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE).

  13. Angular parallelization of a curvilinear Sn transport theory method

    International Nuclear Information System (INIS)

    Haghighat, A.

    1991-01-01

    In this paper a parallel algorithm for angular domain decomposition (or parallelization) of an r-dependent spherical S n transport theory method is derived. The parallel formulation is incorporated into TWOTRAN-II using the IBM Parallel Fortran compiler and implemented on an IBM 3090/400 (with four processors). The behavior of the parallel algorithm for different physical problems is studied, and it is concluded that the parallel algorithm behaves differently in the presence of a fission source as opposed to the absence of a fission source; this is attributed to the relative contributions of the source and the angular redistribution terms in the S s algorithm. Further, the parallel performance of the algorithm is measured for various problem sizes and different combinations of angular subdomains or processors. Poor parallel efficiencies between ∼35 and 50% are achieved in situations where the relative difference of parallel to serial iterations is ∼50%. High parallel efficiencies between ∼60% and 90% are obtained in situations where the relative difference of parallel to serial iterations is <35%

  14. Implementation of a cell-wise block-Gauss-Seidel iterative method for SN transport on a hybrid parallel computer architecture

    International Nuclear Information System (INIS)

    Rosa, Massimiliano; Warsa, James S.; Perks, Michael

    2011-01-01

    We have implemented a cell-wise, block-Gauss-Seidel (bGS) iterative algorithm, for the solution of the S_n transport equations on the Roadrunner hybrid, parallel computer architecture. A compute node of this massively parallel machine comprises AMD Opteron cores that are linked to a Cell Broadband Engine™ (Cell/B.E.)"1. LAPACK routines have been ported to the Cell/B.E. in order to make use of its parallel Synergistic Processing Elements (SPEs). The bGS algorithm is based on the LU factorization and solution of a linear system that couples the fluxes for all S_n angles and energy groups on a mesh cell. For every cell of a mesh that has been parallel decomposed on the higher-level Opteron processors, a linear system is transferred to the Cell/B.E. and the parallel LAPACK routines are used to compute a solution, which is then transferred back to the Opteron, where the rest of the computations for the S_n transport problem take place. Compared to standard parallel machines, a hundred-fold speedup of the bGS was observed on the hybrid Roadrunner architecture. Numerical experiments with strong and weak parallel scaling demonstrate the bGS method is viable and compares favorably to full parallel sweeps (FPS) on two-dimensional, unstructured meshes when it is applied to optically thick, multi-material problems. As expected, however, it is not as efficient as FPS in optically thin problems. (author)

  15. PCG: A software package for the iterative solution of linear systems on scalar, vector and parallel computers

    Energy Technology Data Exchange (ETDEWEB)

    Joubert, W. [Los Alamos National Lab., NM (United States); Carey, G.F. [Univ. of Texas, Austin, TX (United States)

    1994-12-31

    A great need exists for high performance numerical software libraries transportable across parallel machines. This talk concerns the PCG package, which solves systems of linear equations by iterative methods on parallel computers. The features of the package are discussed, as well as techniques used to obtain high performance as well as transportability across architectures. Representative numerical results are presented for several machines including the Connection Machine CM-5, Intel Paragon and Cray T3D parallel computers.

  16. Block iterative restoration of astronomical images with the massively parallel processor

    International Nuclear Information System (INIS)

    Heap, S.R.; Lindler, D.J.

    1987-01-01

    A method is described for algebraic image restoration capable of treating astronomical images. For a typical 500 x 500 image, direct algebraic restoration would require the solution of a 250,000 x 250,000 linear system. The block iterative approach is used to reduce the problem to solving 4900 121 x 121 linear systems. The algorithm was implemented on the Goddard Massively Parallel Processor, which can solve a 121 x 121 system in approximately 0.06 seconds. Examples are shown of the results for various astronomical images

  17. A massively parallel discrete ordinates response matrix method for neutron transport

    International Nuclear Information System (INIS)

    Hanebutte, U.R.; Lewis, E.E.

    1992-01-01

    In this paper a discrete ordinates response matrix method is formulated with anisotropic scattering for the solution of neutron transport problems on massively parallel computers. The response matrix formulation eliminates iteration on the scattering source. The nodal matrices that result from the diamond-differenced equations are utilized in a factored form that minimizes memory requirements and significantly reduces the number of arithmetic operations required per node. The red-black solution algorithm utilizes massive parallelism by assigning each spatial node to one or more processors. The algorithm is accelerated by a synthetic method in which the low-order diffusion equations are also solved by massively parallel red-black iterations. The method is implemented on a 16K Connection Machine-2, and S 8 and S 16 solutions are obtained for fixed-source benchmark problems in x-y geometry

  18. Multi-Level iterative methods in computational plasma physics

    International Nuclear Information System (INIS)

    Knoll, D.A.; Barnes, D.C.; Brackbill, J.U.; Chacon, L.; Lapenta, G.

    1999-01-01

    Plasma physics phenomena occur on a wide range of spatial scales and on a wide range of time scales. When attempting to model plasma physics problems numerically the authors are inevitably faced with the need for both fine spatial resolution (fine grids) and implicit time integration methods. Fine grids can tax the efficiency of iterative methods and large time steps can challenge the robustness of iterative methods. To meet these challenges they are developing a hybrid approach where multigrid methods are used as preconditioners to Krylov subspace based iterative methods such as conjugate gradients or GMRES. For nonlinear problems they apply multigrid preconditioning to a matrix-few Newton-GMRES method. Results are presented for application of these multilevel iterative methods to the field solves in implicit moment method PIC, multidimensional nonlinear Fokker-Planck problems, and their initial efforts in particle MHD

  19. Iterating skeletons

    DEFF Research Database (Denmark)

    Dieterle, Mischa; Horstmeyer, Thomas; Berthold, Jost

    2012-01-01

    a particular skeleton ad-hoc for repeated execution turns out to be considerably complicated, and raises general questions about introducing state into a stateless parallel computation. In addition, one would strongly prefer an approach which leaves the original skeleton intact, and only uses it as a building...... block inside a bigger structure. In this work, we present a general framework for skeleton iteration and discuss requirements and variations of iteration control and iteration body. Skeleton iteration is expressed by synchronising a parallel iteration body skeleton with a (likewise parallel) state......Skeleton-based programming is an area of increasing relevance with upcoming highly parallel hardware, since it substantially facilitates parallel programming and separates concerns. When parallel algorithms expressed by skeletons involve iterations – applying the same algorithm repeatedly...

  20. Efficient parallel iterative solvers for the solution of large dense linear systems arising from the boundary element method in electromagnetism

    International Nuclear Information System (INIS)

    Alleon, G.; Carpentieri, B.; Du, I.S.; Giraud, L.; Langou, J.; Martin, E.

    2003-01-01

    The boundary element method has become a popular tool for the solution of Maxwell's equations in electromagnetism. It discretizes only the surface of the radiating object and gives rise to linear systems that are smaller in size compared to those arising from finite element or finite difference discretizations. However, these systems are prohibitively demanding in terms of memory for direct methods and challenging to solve by iterative methods. In this paper we address the iterative solution via preconditioned Krylov methods of electromagnetic scattering problems expressed in an integral formulation, with main focus on the design of the pre-conditioner. We consider an approximate inverse method based on the Frobenius-norm minimization with a pattern prescribed in advance. The pre-conditioner is constructed from a sparse approximation of the dense coefficient matrix, and the patterns both for the pre-conditioner and for the coefficient matrix are computed a priori using geometric information from the mesh. We describe the implementation of the approximate inverse in an out-of-core parallel code that uses multipole techniques for the matrix-vector products, and show results on the numerical scalability of our method on systems of size up to one million unknowns. We propose an embedded iterative scheme based on the GMRES method and combined with multipole techniques, aimed at improving the robustness of the approximate inverse for large problems. We prove by numerical experiments that the proposed scheme enables the solution of very large and difficult problems efficiently at reduced computational and memory cost. Finally we perform a preliminary study on a spectral two-level pre-conditioner to enhance the robustness of our method. This numerical technique exploits spectral information of the preconditioned systems to build a low rank-update of the pre-conditioner. (authors)

  1. Efficient parallel iterative solvers for the solution of large dense linear systems arising from the boundary element method in electromagnetism

    Energy Technology Data Exchange (ETDEWEB)

    Alleon, G. [EADS-CCR, 31 - Blagnac (France); Carpentieri, B.; Du, I.S.; Giraud, L.; Langou, J.; Martin, E. [Cerfacs, 31 - Toulouse (France)

    2003-07-01

    The boundary element method has become a popular tool for the solution of Maxwell's equations in electromagnetism. It discretizes only the surface of the radiating object and gives rise to linear systems that are smaller in size compared to those arising from finite element or finite difference discretizations. However, these systems are prohibitively demanding in terms of memory for direct methods and challenging to solve by iterative methods. In this paper we address the iterative solution via preconditioned Krylov methods of electromagnetic scattering problems expressed in an integral formulation, with main focus on the design of the pre-conditioner. We consider an approximate inverse method based on the Frobenius-norm minimization with a pattern prescribed in advance. The pre-conditioner is constructed from a sparse approximation of the dense coefficient matrix, and the patterns both for the pre-conditioner and for the coefficient matrix are computed a priori using geometric information from the mesh. We describe the implementation of the approximate inverse in an out-of-core parallel code that uses multipole techniques for the matrix-vector products, and show results on the numerical scalability of our method on systems of size up to one million unknowns. We propose an embedded iterative scheme based on the GMRES method and combined with multipole techniques, aimed at improving the robustness of the approximate inverse for large problems. We prove by numerical experiments that the proposed scheme enables the solution of very large and difficult problems efficiently at reduced computational and memory cost. Finally we perform a preliminary study on a spectral two-level pre-conditioner to enhance the robustness of our method. This numerical technique exploits spectral information of the preconditioned systems to build a low rank-update of the pre-conditioner. (authors)

  2. Iterative Brinkman penalization for remeshed vortex methods

    DEFF Research Database (Denmark)

    Hejlesen, Mads Mølholm; Koumoutsakos, Petros; Leonard, Anthony

    2015-01-01

    We introduce an iterative Brinkman penalization method for the enforcement of the no-slip boundary condition in remeshed vortex methods. In the proposed method, the Brinkman penalization is applied iteratively only in the neighborhood of the body. This allows for using significantly larger time...

  3. Parallelized implicit propagators for the finite-difference Schrödinger equation

    Science.gov (United States)

    Parker, Jonathan; Taylor, K. T.

    1995-08-01

    We describe the application of block Gauss-Seidel and block Jacobi iterative methods to the design of implicit propagators for finite-difference models of the time-dependent Schrödinger equation. The block-wise iterative methods discussed here are mixed direct-iterative methods for solving simultaneous equations, in the sense that direct methods (e.g. LU decomposition) are used to invert certain block sub-matrices, and iterative methods are used to complete the solution. We describe parallel variants of the basic algorithm that are well suited to the medium- to coarse-grained parallelism of work-station clusters, and MIMD supercomputers, and we show that under a wide range of conditions, fine-grained parallelism of the computation can be achieved. Numerical tests are conducted on a typical one-electron atom Hamiltonian. The methods converge robustly to machine precision (15 significant figures), in some cases in as few as 6 or 7 iterations. The rate of convergence is nearly independent of the finite-difference grid-point separations.

  4. Parallel algorithms for nuclear reactor analysis via domain decomposition method

    International Nuclear Information System (INIS)

    Kim, Yong Hee

    1995-02-01

    In this thesis, the neutron diffusion equation in reactor physics is discretized by the finite difference method and is solved on a parallel computer network which is composed of T-800 transputers. T-800 transputer is a message-passing type MIMD (multiple instruction streams and multiple data streams) architecture. A parallel variant of Schwarz alternating procedure for overlapping subdomains is developed with domain decomposition. The thesis provides convergence analysis and improvement of the convergence of the algorithm. The convergence of the parallel Schwarz algorithms with DN(or ND), DD, NN, and mixed pseudo-boundary conditions(a weighted combination of Dirichlet and Neumann conditions) is analyzed for both continuous and discrete models in two-subdomain case and various underlying features are explored. The analysis shows that the convergence rate of the algorithm highly depends on the pseudo-boundary conditions and the theoretically best one is the mixed boundary conditions(MM conditions). Also it is shown that there may exist a significant discrepancy between continuous model analysis and discrete model analysis. In order to accelerate the convergence of the parallel Schwarz algorithm, relaxation in pseudo-boundary conditions is introduced and the convergence analysis of the algorithm for two-subdomain case is carried out. The analysis shows that under-relaxation of the pseudo-boundary conditions accelerates the convergence of the parallel Schwarz algorithm if the convergence rate without relaxation is negative, and any relaxation(under or over) decelerates convergence if the convergence rate without relaxation is positive. Numerical implementation of the parallel Schwarz algorithm on an MIMD system requires multi-level iterations: two levels for fixed source problems, three levels for eigenvalue problems. Performance of the algorithm turns out to be very sensitive to the iteration strategy. In general, multi-level iterations provide good performance when

  5. Domain decomposition methods and parallel computing

    International Nuclear Information System (INIS)

    Meurant, G.

    1991-01-01

    In this paper, we show how to efficiently solve large linear systems on parallel computers. These linear systems arise from discretization of scientific computing problems described by systems of partial differential equations. We show how to get a discrete finite dimensional system from the continuous problem and the chosen conjugate gradient iterative algorithm is briefly described. Then, the different kinds of parallel architectures are reviewed and their advantages and deficiencies are emphasized. We sketch the problems found in programming the conjugate gradient method on parallel computers. For this algorithm to be efficient on parallel machines, domain decomposition techniques are introduced. We give results of numerical experiments showing that these techniques allow a good rate of convergence for the conjugate gradient algorithm as well as computational speeds in excess of a billion of floating point operations per second. (author). 5 refs., 11 figs., 2 tabs., 1 inset

  6. Performance Analysis of Fission and Surface Source Iteration Method for Domain Decomposed Monte Carlo Whole-Core Calculation

    International Nuclear Information System (INIS)

    Jo, Yu Gwon; Oh, Yoo Min; Park, Hyang Kyu; Park, Kang Soon; Cho, Nam Zin

    2016-01-01

    In this paper, two issues in the FSS iteration method, i.e., the waiting time for surface source data and the variance biases in local tallies are investigated for the domain decomposed, 3-D continuous-energy whole-core calculation. The fission sources are provided as usual, while the surface sources are provided by banking MC particles crossing local domain boundaries. The surface sources serve as boundary conditions for nonoverlapping local problems, so that each local problem can be solved independently. In this paper, two issues in the FSS iteration are investigated. One is quantifying the waiting time of processors to receive surface source data. By using nonblocking communication, 'time penalty' to wait for the arrival of the surface source data is reduced. The other important issue is underestimation of the sample variance of the tally because of additional inter-iteration correlations in surface sources. From the numerical results on a 3-D whole-core test problem, it is observed that the time penalty is negligible in the FSS iteration method and that the real variances of both pin powers and assembly powers are estimated by the HB method. For those purposes, three cases; Case 1 (1 local domain), Case 2 (4 local domains), Case 3 (16 local domains) are tested. For both Cases 2 and 3, the time penalties for waiting are negligible compared to the source-tracking times. However, for finer divisions of local domains, the loss of parallel efficiency caused by the different number of sources for local domains in symmetric locations becomes larger due to the stochastic errors in source distributions. For all test cases, the HB method very well estimates the real variances of local tallies. However, it is also noted that the real variances of local tallies estimated by the HB method show slightly smaller than the real variances obtained from 30 independent batch runs and the deviations become larger for finer divisions of local domains. The batch size used for the HB

  7. Three-Dimensional Induced Polarization Parallel Inversion Using Nonlinear Conjugate Gradients Method

    Directory of Open Access Journals (Sweden)

    Huan Ma

    2015-01-01

    Full Text Available Four kinds of array of induced polarization (IP methods (surface, borehole-surface, surface-borehole, and borehole-borehole are widely used in resource exploration. However, due to the presence of large amounts of the sources, it will take much time to complete the inversion. In the paper, a new parallel algorithm is described which uses message passing interface (MPI and graphics processing unit (GPU to accelerate 3D inversion of these four methods. The forward finite differential equation is solved by ILU0 preconditioner and the conjugate gradient (CG solver. The inverse problem is solved by nonlinear conjugate gradients (NLCG iteration which is used to calculate one forward and two “pseudo-forward” modelings and update the direction, space, and model in turn. Because each source is independent in forward and “pseudo-forward” modelings, multiprocess modes are opened by calling MPI library. The iterative matrix solver within CULA is called in each process. Some tables and synthetic data examples illustrate that this parallel inversion algorithm is effective. Furthermore, we demonstrate that the joint inversion of surface and borehole data produces resistivity and chargeability results are superior to those obtained from inversions of individual surface data.

  8. A mobile robot with parallel kinematics to meet the requirements for assembling and machining the ITER vacuum vessel

    Energy Technology Data Exchange (ETDEWEB)

    Pessi, Pekka [Lappeenranta University of Technology, Lappeenranta (Finland)], E-mail: pessi@lut.fi; Wu, Huapeng; Handroos, Heikki [Lappeenranta University of Technology, Lappeenranta (Finland); Jones, Lawrence [EFDA Close Support Unit, Boltzmannstrasse 2, Garching D-85748 (Germany)

    2007-10-15

    The present paper introduces a mobile parallel robot developed for International Thermonuclear Experimental Reactor (ITER). The task of the robot is to carry out welding and machining processes inside the ITER vacuum vessel. The kinematic design of the robot has been optimized for the ITER access. The kinematic analysis is given in the paper. A virtual prototype of the parallel robot is built. A dynamic behavior of the whole robot is studied by the multi-body system simulation (MBS)

  9. A mobile robot with parallel kinematics to meet the requirements for assembling and machining the ITER vacuum vessel

    International Nuclear Information System (INIS)

    Pessi, Pekka; Wu, Huapeng; Handroos, Heikki; Jones, Lawrence

    2007-01-01

    The present paper introduces a mobile parallel robot developed for International Thermonuclear Experimental Reactor (ITER). The task of the robot is to carry out welding and machining processes inside the ITER vacuum vessel. The kinematic design of the robot has been optimized for the ITER access. The kinematic analysis is given in the paper. A virtual prototype of the parallel robot is built. A dynamic behavior of the whole robot is studied by the multi-body system simulation (MBS)

  10. Estimation of POL-iteration methods in fast running DNBR code

    Energy Technology Data Exchange (ETDEWEB)

    Kwon, Hyuk; Kim, S. J.; Seo, K. W.; Hwang, D. H. [KAERI, Daejeon (Korea, Republic of)

    2016-05-15

    In this study, various root finding methods are applied to the POL-iteration module in SCOMS and POLiteration efficiency is compared with reference method. On the base of these results, optimum algorithm of POL iteration is selected. The POL requires the iteration until present local power reach limit power. The process to search the limiting power is equivalent with a root finding of nonlinear equation. POL iteration process involved in online monitoring system used a variant bisection method that is the most robust algorithm to find the root of nonlinear equation. The method including the interval accelerating factor and escaping routine out of ill-posed condition assured the robustness of SCOMS system. POL iteration module in SCOMS shall satisfy the requirement which is a minimum calculation time. For this requirement of calculation time, non-iterative algorithm, few channel model, simple steam table are implemented into SCOMS to improve the calculation time. MDNBR evaluation at a given operating condition requires the DNBR calculation at all axial locations. An increasing of POL-iteration number increased a calculation load of SCOMS significantly. Therefore, calculation efficiency of SCOMS is strongly dependent on the POL iteration number. In case study, the iterations of the methods have a superlinear convergence for finding limiting power but Brent method shows a quardratic convergence speed. These methods are effective and better than the reference bisection algorithm.

  11. Vector-Parallel processing of the successive overrelaxation method

    International Nuclear Information System (INIS)

    Yokokawa, Mitsuo

    1988-02-01

    Successive overrelaxation method, called SOR method, is one of iterative methods for solving linear system of equations, and it has been calculated in serial with a natural ordering in many nuclear codes. After the appearance of vector processors, this natural SOR method has been changed for the parallel algorithm such as hyperplane or red-black method, in which the calculation order is modified. These methods are suitable for vector processors, and more high-speed calculation can be obtained compared with the natural SOR method on vector processors. In this report, a new scheme named 4-colors SOR method is proposed. We find that the 4-colors SOR method can be executed on vector-parallel processors and it gives the most high-speed calculation among all SOR methods according to results of the vector-parallel execution on the Alliant FX/8 multiprocessor system. It is also shown that the theoretical optimal acceleration parameters are equal among five different ordering SOR methods, and the difference between convergence rates of these SOR methods are examined. (author)

  12. Solution of the within-group multidimensional discrete ordinates transport equations on massively parallel architectures

    Science.gov (United States)

    Zerr, Robert Joseph

    2011-12-01

    The integral transport matrix method (ITMM) has been used as the kernel of new parallel solution methods for the discrete ordinates approximation of the within-group neutron transport equation. The ITMM abandons the repetitive mesh sweeps of the traditional source iterations (SI) scheme in favor of constructing stored operators that account for the direct coupling factors among all the cells and between the cells and boundary surfaces. The main goals of this work were to develop the algorithms that construct these operators and employ them in the solution process, determine the most suitable way to parallelize the entire procedure, and evaluate the behavior and performance of the developed methods for increasing number of processes. This project compares the effectiveness of the ITMM with the SI scheme parallelized with the Koch-Baker-Alcouffe (KBA) method. The primary parallel solution method involves a decomposition of the domain into smaller spatial sub-domains, each with their own transport matrices, and coupled together via interface boundary angular fluxes. Each sub-domain has its own set of ITMM operators and represents an independent transport problem. Multiple iterative parallel solution methods have investigated, including parallel block Jacobi (PBJ), parallel red/black Gauss-Seidel (PGS), and parallel GMRES (PGMRES). The fastest observed parallel solution method, PGS, was used in a weak scaling comparison with the PARTISN code. Compared to the state-of-the-art SI-KBA with diffusion synthetic acceleration (DSA), this new method without acceleration/preconditioning is not competitive for any problem parameters considered. The best comparisons occur for problems that are difficult for SI DSA, namely highly scattering and optically thick. SI DSA execution time curves are generally steeper than the PGS ones. However, until further testing is performed it cannot be concluded that SI DSA does not outperform the ITMM with PGS even on several thousand or tens of

  13. Fast Time and Space Parallel Algorithms for Solution of Parabolic Partial Differential Equations

    Science.gov (United States)

    Fijany, Amir

    1993-01-01

    In this paper, fast time- and Space -Parallel agorithms for solution of linear parabolic PDEs are developed. It is shown that the seemingly strictly serial iterations of the time-stepping procedure for solution of the problem can be completed decoupled.

  14. High-speed parallel solution of the neutron diffusion equation with the hierarchical domain decomposition boundary element method incorporating parallel communications

    International Nuclear Information System (INIS)

    Tsuji, Masashi; Chiba, Gou

    2000-01-01

    A hierarchical domain decomposition boundary element method (HDD-BEM) for solving the multiregion neutron diffusion equation (NDE) has been fully parallelized, both for numerical computations and for data communications, to accomplish a high parallel efficiency on distributed memory message passing parallel computers. Data exchanges between node processors that are repeated during iteration processes of HDD-BEM are implemented, without any intervention of the host processor that was used to supervise parallel processing in the conventional parallelized HDD-BEM (P-HDD-BEM). Thus, the parallel processing can be executed with only cooperative operations of node processors. The communication overhead was even the dominant time consuming part in the conventional P-HDD-BEM, and the parallelization efficiency decreased steeply with the increase of the number of processors. With the parallel data communication, the efficiency is affected only by the number of boundary elements assigned to decomposed subregions, and the communication overhead can be drastically reduced. This feature can be particularly advantageous in the analysis of three-dimensional problems where a large number of processors are required. The proposed P-HDD-BEM offers a promising solution to the deterioration problem of parallel efficiency and opens a new path to parallel computations of NDEs on distributed memory message passing parallel computers. (author)

  15. A Comparison of Iterative 2D-3D Pose Estimation Methods for Real-Time Applications

    DEFF Research Database (Denmark)

    Grest, Daniel; Krüger, Volker; Petersen, Thomas

    2009-01-01

    This work compares iterative 2D-3D Pose Estimation methods for use in real-time applications. The compared methods are available for public as C++ code. One method is part of the openCV library, namely POSIT. Because POSIT is not applicable for planar 3Dpoint congurations, we include the planar P...

  16. Copper Mountain conference on iterative methods: Proceedings: Volume 1

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    1996-10-01

    This volume (one of two) contains information presented during the first three days of the Copper Mountain Conference on Iterative Methods held April 9-13, 1996 at Copper Mountain, Colorado. Topics of the sessions held these three days included nonlinear systems, parallel processing, preconditioning, sparse matrix test collections, first-order system least squares, Arnoldi`s method, integral equations, software, Navier-Stokes equations, Euler equations, Krylov methods, and eigenvalues. The top three papers from a student competition are also included. Selected papers indexed separately for the Energy Science and Technology Database.

  17. III - Template Metaprogramming for massively parallel scientific computing - Templates for Iteration; Thread-level Parallelism

    CERN Multimedia

    CERN. Geneva

    2016-01-01

    Large scale scientific computing raises questions on different levels ranging from the fomulation of the problems to the choice of the best algorithms and their implementation for a specific platform. There are similarities in these different topics that can be exploited by modern-style C++ template metaprogramming techniques to produce readable, maintainable and generic code. Traditional low-level code tend to be fast but platform-dependent, and it obfuscates the meaning of the algorithm. On the other hand, object-oriented approach is nice to read, but may come with an inherent performance penalty. These lectures aim to present he basics of the Expression Template (ET) idiom which allows us to keep the object-oriented approach without sacrificing performance. We will in particular show to to enhance ET to include SIMD vectorization. We will then introduce techniques for abstracting iteration, and introduce thread-level parallelism for use in heavy data-centric loads. We will show to to apply these methods i...

  18. Method of parallel processing in SANPO real time system

    International Nuclear Information System (INIS)

    Ostrovnoj, A.I.; Salamatin, I.M.

    1981-01-01

    A method of parellel processing in SANPO real time system is described. Algorithms of data accumulation and preliminary processing in this system as a parallel processes using a specialized high level programming language are described. Hierarchy of elementary processes are also described. It provides the synchronization of concurrent processes without semaphors. The developed means are applied to the systems of experiment automation using SM-3 minicomputers [ru

  19. Chatter suppression methods of a robot machine for ITER vacuum vessel assembly and maintenance

    International Nuclear Information System (INIS)

    Wu, Huapeng; Wang, Yongbo; Li, Ming; Al-Saedi, Mazin; Handroos, Heikki

    2014-01-01

    Highlights: •A redundant 10-DOF serial-parallel hybrid robot for ITER assembly and maintains is presented. •A dynamic model of the robot is developed. •A feedback and feedforward controller is presented to suppress machining vibration of the robot. -- Abstract: In the process of assembly and maintenance of ITER vacuum vessel (ITER VV), various machining tasks including threading, milling, welding-defects cutting and flexible hose boring are required to be performed from inside of ITER VV by on-site machining tools. Robot machine is a promising option for these tasks, but great chatter (machine vibration) would happen in the machining process. The chatter vibration will deteriorate the robot accuracy and surface quality, and even cause some damages on the end-effector tools and the robot structure itself. This paper introduces two vibration control methods, one is passive and another is active vibration control. For the passive vibration control, a parallel mechanism is presented to increase the stiffness of robot machine; for the active vibration control, a hybrid control method combining feedforward controller and nonlinear feedback controller is introduced for chatter suppression. A dynamic model and its chatter vibration phenomena of a hybrid robot is demonstrated. Simulation results are given based on the proposed hybrid robot machine which is developed for the ITER VV assembly and maintenance

  20. Chatter suppression methods of a robot machine for ITER vacuum vessel assembly and maintenance

    Energy Technology Data Exchange (ETDEWEB)

    Wu, Huapeng; Wang, Yongbo, E-mail: yongbo.wang@lut.fi; Li, Ming; Al-Saedi, Mazin; Handroos, Heikki

    2014-10-15

    Highlights: •A redundant 10-DOF serial-parallel hybrid robot for ITER assembly and maintains is presented. •A dynamic model of the robot is developed. •A feedback and feedforward controller is presented to suppress machining vibration of the robot. -- Abstract: In the process of assembly and maintenance of ITER vacuum vessel (ITER VV), various machining tasks including threading, milling, welding-defects cutting and flexible hose boring are required to be performed from inside of ITER VV by on-site machining tools. Robot machine is a promising option for these tasks, but great chatter (machine vibration) would happen in the machining process. The chatter vibration will deteriorate the robot accuracy and surface quality, and even cause some damages on the end-effector tools and the robot structure itself. This paper introduces two vibration control methods, one is passive and another is active vibration control. For the passive vibration control, a parallel mechanism is presented to increase the stiffness of robot machine; for the active vibration control, a hybrid control method combining feedforward controller and nonlinear feedback controller is introduced for chatter suppression. A dynamic model and its chatter vibration phenomena of a hybrid robot is demonstrated. Simulation results are given based on the proposed hybrid robot machine which is developed for the ITER VV assembly and maintenance.

  1. Iteration of ultrasound aberration correction methods

    Science.gov (United States)

    Maasoey, Svein-Erik; Angelsen, Bjoern; Varslot, Trond

    2004-05-01

    Aberration in ultrasound medical imaging is usually modeled by time-delay and amplitude variations concentrated on the transmitting/receiving array. This filter process is here denoted a TDA filter. The TDA filter is an approximation to the physical aberration process, which occurs over an extended part of the human body wall. Estimation of the TDA filter, and performing correction on transmit and receive, has proven difficult. It has yet to be shown that this method works adequately for severe aberration. Estimation of the TDA filter can be iterated by retransmitting a corrected signal and re-estimate until a convergence criterion is fulfilled (adaptive imaging). Two methods for estimating time-delay and amplitude variations in receive signals from random scatterers have been developed. One method correlates each element signal with a reference signal. The other method use eigenvalue decomposition of the receive cross-spectrum matrix, based upon a receive energy-maximizing criterion. Simulations of iterating aberration correction with a TDA filter have been investigated to study its convergence properties. A weak and strong human-body wall model generated aberration. Both emulated the human abdominal wall. Results after iteration improve aberration correction substantially, and both estimation methods converge, even for the case of strong aberration.

  2. Issues in developing parallel iterative algorithms for solving partial differential equations on a (transputer-based) distributed parallel computing system

    International Nuclear Information System (INIS)

    Rajagopalan, S.; Jethra, A.; Khare, A.N.; Ghodgaonkar, M.D.; Srivenkateshan, R.; Menon, S.V.G.

    1990-01-01

    Issues relating to implementing iterative procedures, for numerical solution of elliptic partial differential equations, on a distributed parallel computing system are discussed. Preliminary investigations show that a speed-up of about 3.85 is achievable on a four transputer pipeline network. (author). 2 figs., 3 a ppendixes., 7 refs

  3. The Davidson Method as an alternative to power iterations for criticality calculations

    International Nuclear Information System (INIS)

    Subramanian, C.; Van Criekingen, S.; Heuveline, V.; Nataf, F.; Have, P.

    2011-01-01

    The Davidson method is implemented within the neutron transport core solver parafish to solve k-eigenvalue criticality transport problems. The parafish solver is based on domain decomposition, uses spherical harmonics (P_N method) for angular discretization, and nonconforming finite elements for spatial discretization. The Davidson method is compared to the traditional power iteration method in that context. Encouraging numerical results are obtained with both sequential and parallel calculations. (author)

  4. A non overlapping parallel domain decomposition method applied to the simplified transport equations

    International Nuclear Information System (INIS)

    Lathuiliere, B.; Barrault, M.; Ramet, P.; Roman, J.

    2009-01-01

    A reactivity computation requires to compute the highest eigenvalue of a generalized eigenvalue problem. An inverse power algorithm is used commonly. Very fine modelizations are difficult to tackle for our sequential solver, based on the simplified transport equations, in terms of memory consumption and computational time. So, we propose a non-overlapping domain decomposition method for the approximate resolution of the linear system to solve at each inverse power iteration. Our method brings to a low development effort as the inner multigroup solver can be re-use without modification, and allows us to adapt locally the numerical resolution (mesh, finite element order). Numerical results are obtained by a parallel implementation of the method on two different cases with a pin by pin discretization. This results are analyzed in terms of memory consumption and parallel efficiency. (authors)

  5. Parallel supercomputing: Advanced methods, algorithms, and software for large-scale linear and nonlinear problems

    Energy Technology Data Exchange (ETDEWEB)

    Carey, G.F.; Young, D.M.

    1993-12-31

    The program outlined here is directed to research on methods, algorithms, and software for distributed parallel supercomputers. Of particular interest are finite element methods and finite difference methods together with sparse iterative solution schemes for scientific and engineering computations of very large-scale systems. Both linear and nonlinear problems will be investigated. In the nonlinear case, applications with bifurcation to multiple solutions will be considered using continuation strategies. The parallelizable numerical methods of particular interest are a family of partitioning schemes embracing domain decomposition, element-by-element strategies, and multi-level techniques. The methods will be further developed incorporating parallel iterative solution algorithms with associated preconditioners in parallel computer software. The schemes will be implemented on distributed memory parallel architectures such as the CRAY MPP, Intel Paragon, the NCUBE3, and the Connection Machine. We will also consider other new architectures such as the Kendall-Square (KSQ) and proposed machines such as the TERA. The applications will focus on large-scale three-dimensional nonlinear flow and reservoir problems with strong convective transport contributions. These are legitimate grand challenge class computational fluid dynamics (CFD) problems of significant practical interest to DOE. The methods developed and algorithms will, however, be of wider interest.

  6. A time-domain decomposition iterative method for the solution of distributed linear quadratic optimal control problems

    Science.gov (United States)

    Heinkenschloss, Matthias

    2005-01-01

    We study a class of time-domain decomposition-based methods for the numerical solution of large-scale linear quadratic optimal control problems. Our methods are based on a multiple shooting reformulation of the linear quadratic optimal control problem as a discrete-time optimal control (DTOC) problem. The optimality conditions for this DTOC problem lead to a linear block tridiagonal system. The diagonal blocks are invertible and are related to the original linear quadratic optimal control problem restricted to smaller time-subintervals. This motivates the application of block Gauss-Seidel (GS)-type methods for the solution of the block tridiagonal systems. Numerical experiments show that the spectral radii of the block GS iteration matrices are larger than one for typical applications, but that the eigenvalues of the iteration matrices decay to zero fast. Hence, while the GS method is not expected to convergence for typical applications, it can be effective as a preconditioner for Krylov-subspace methods. This is confirmed by our numerical tests.A byproduct of this research is the insight that certain instantaneous control techniques can be viewed as the application of one step of the forward block GS method applied to the DTOC optimality system.

  7. Long-time atomistic simulations with the Parallel Replica Dynamics method

    Science.gov (United States)

    Perez, Danny

    Molecular Dynamics (MD) -- the numerical integration of atomistic equations of motion -- is a workhorse of computational materials science. Indeed, MD can in principle be used to obtain any thermodynamic or kinetic quantity, without introducing any approximation or assumptions beyond the adequacy of the interaction potential. It is therefore an extremely powerful and flexible tool to study materials with atomistic spatio-temporal resolution. These enviable qualities however come at a steep computational price, hence limiting the system sizes and simulation times that can be achieved in practice. While the size limitation can be efficiently addressed with massively parallel implementations of MD based on spatial decomposition strategies, allowing for the simulation of trillions of atoms, the same approach usually cannot extend the timescales much beyond microseconds. In this article, we discuss an alternative parallel-in-time approach, the Parallel Replica Dynamics (ParRep) method, that aims at addressing the timescale limitation of MD for systems that evolve through rare state-to-state transitions. We review the formal underpinnings of the method and demonstrate that it can provide arbitrarily accurate results for any definition of the states. When an adequate definition of the states is available, ParRep can simulate trajectories with a parallel speedup approaching the number of replicas used. We demonstrate the usefulness of ParRep by presenting different examples of materials simulations where access to long timescales was essential to access the physical regime of interest and discuss practical considerations that must be addressed to carry out these simulations. Work supported by the United States Department of Energy (U.S. DOE), Office of Science, Office of Basic Energy Sciences, Materials Sciences and Engineering Division.

  8. Novel Parallel Numerical Methods for Radiation and Neutron Transport

    International Nuclear Information System (INIS)

    Brown, P N

    2001-01-01

    In many of the multiphysics simulations performed at LLNL, transport calculations can take up 30 to 50% of the total run time. If Monte Carlo methods are used, the percentage can be as high as 80%. Thus, a significant core competence in the formulation, software implementation, and solution of the numerical problems arising in transport modeling is essential to Laboratory and DOE research. In this project, we worked on developing scalable solution methods for the equations that model the transport of photons and neutrons through materials. Our goal was to reduce the transport solve time in these simulations by means of more advanced numerical methods and their parallel implementations. These methods must be scalable, that is, the time to solution must remain constant as the problem size grows and additional computer resources are used. For iterative methods, scalability requires that (1) the number of iterations to reach convergence is independent of problem size, and (2) that the computational cost grows linearly with problem size. We focused on deterministic approaches to transport, building on our earlier work in which we performed a new, detailed analysis of some existing transport methods and developed new approaches. The Boltzmann equation (the underlying equation to be solved) and various solution methods have been developed over many years. Consequently, many laboratory codes are based on these methods, which are in some cases decades old. For the transport of x-rays through partially ionized plasmas in local thermodynamic equilibrium, the transport equation is coupled to nonlinear diffusion equations for the electron and ion temperatures via the highly nonlinear Planck function. We investigated the suitability of traditional-solution approaches to transport on terascale architectures and also designed new scalable algorithms; in some cases, we investigated hybrid approaches that combined both

  9. A block-iterative nodal integral method for forced convection problems

    International Nuclear Information System (INIS)

    Decker, W.J.; Dorning, J.J.

    1992-01-01

    A new efficient iterative nodal integral method for the time-dependent two- and three-dimensional incompressible Navier-Stokes equations has been developed. Using the approach introduced by Azmy and Droning to develop nodal mehtods with high accuracy on coarse spatial grids for two-dimensional steady-state problems and extended to coarse two-dimensional space-time grids by Wilson et al. for thermal convection problems, we have developed a new iterative nodal integral method for the time-dependent Navier-Stokes equations for mechanically forced convection. A new, extremely efficient block iterative scheme is employed to invert the Jacobian within each of the Newton-Raphson iterations used to solve the final nonlinear discrete-variable equations. By taking advantage of the special structure of the Jacobian, this scheme greatly reduces memory requirements. The accuracy of the overall method is illustrated by appliying it to the time-dependent version of the classic two-dimensional driven cavity problem of computational fluid dynamics

  10. A PARALLEL NONOVERLAPPING DOMAIN DECOMPOSITION METHOD FOR STOKES PROBLEMS

    Institute of Scientific and Technical Information of China (English)

    Mei-qun Jiang; Pei-liang Dai

    2006-01-01

    A nonoverlapping domain decomposition iterative procedure is developed and analyzed for generalized Stokes problems and their finite element approximate problems in RN(N=2,3). The method is based on a mixed-type consistency condition with two parameters as a transmission condition together with a derivative-free transmission data updating technique on the artificial interfaces. The method can be applied to a general multi-subdomain decomposition and implemented on parallel machines with local simple communications naturally.

  11. Iterative schemes for parallel Sn algorithms in a shared-memory computing environment

    International Nuclear Information System (INIS)

    Haghighat, A.; Hunter, M.A.; Mattis, R.E.

    1995-01-01

    Several two-dimensional spatial domain partitioning S n transport theory algorithms are developed on the basis of different iterative schemes. These algorithms are incorporated into TWOTRAN-II and tested on the shared-memory CRAY Y-MP C90 computer. For a series of fixed-source r-z geometry homogeneous problems, it is demonstrated that the concurrent red-black algorithms may result in large parallel efficiencies (>60%) on C90. It is also demonstrated that for a realistic shielding problem, the use of the negative flux fixup causes high load imbalance, which results in a significant loss of parallel efficiency

  12. Advances in iterative methods

    International Nuclear Information System (INIS)

    Beauwens, B.; Arkuszewski, J.; Boryszewicz, M.

    1981-01-01

    Results obtained in the field of linear iterative methods within the Coordinated Research Program on Transport Theory and Advanced Reactor Calculations are summarized. The general convergence theory of linear iterative methods is essentially based on the properties of nonnegative operators on ordered normed spaces. The following aspects of this theory have been improved: new comparison theorems for regular splittings, generalization of the notions of M- and H-matrices, new interpretations of classical convergence theorems for positive-definite operators. The estimation of asymptotic convergence rates was developed with two purposes: the analysis of model problems and the optimization of relaxation parameters. In the framework of factorization iterative methods, model problem analysis is needed to investigate whether the increased computational complexity of higher-order methods does not offset their increased asymptotic convergence rates, as well as to appreciate the effect of standard relaxation techniques (polynomial relaxation). On the other hand, the optimal use of factorization iterative methods requires the development of adequate relaxation techniques and their optimization. The relative performances of a few possibilities have been explored for model problems. Presently, the best results have been obtained with optimal diagonal-Chebyshev relaxation

  13. Iterative approach as alternative to S-matrix in modal methods

    Science.gov (United States)

    Semenikhin, Igor; Zanuccoli, Mauro

    2014-12-01

    The continuously increasing complexity of opto-electronic devices and the rising demands of simulation accuracy lead to the need of solving very large systems of linear equations making iterative methods promising and attractive from the computational point of view with respect to direct methods. In particular, iterative approach potentially enables the reduction of required computational time to solve Maxwell's equations by Eigenmode Expansion algorithms. Regardless of the particular eigenmodes finding method used, the expansion coefficients are computed as a rule by scattering matrix (S-matrix) approach or similar techniques requiring order of M3 operations. In this work we consider alternatives to the S-matrix technique which are based on pure iterative or mixed direct-iterative approaches. The possibility to diminish the impact of M3 -order calculations to overall time and in some cases even to reduce the number of arithmetic operations to M2 by applying iterative techniques are discussed. Numerical results are illustrated to discuss validity and potentiality of the proposed approaches.

  14. An efficient parallel algorithm: Poststack and prestack Kirchhoff 3D depth migration using flexi-depth iterations

    Science.gov (United States)

    Rastogi, Richa; Srivastava, Abhishek; Khonde, Kiran; Sirasala, Kirannmayi M.; Londhe, Ashutosh; Chavhan, Hitesh

    2015-07-01

    This paper presents an efficient parallel 3D Kirchhoff depth migration algorithm suitable for current class of multicore architecture. The fundamental Kirchhoff depth migration algorithm exhibits inherent parallelism however, when it comes to 3D data migration, as the data size increases the resource requirement of the algorithm also increases. This challenges its practical implementation even on current generation high performance computing systems. Therefore a smart parallelization approach is essential to handle 3D data for migration. The most compute intensive part of Kirchhoff depth migration algorithm is the calculation of traveltime tables due to its resource requirements such as memory/storage and I/O. In the current research work, we target this area and develop a competent parallel algorithm for post and prestack 3D Kirchhoff depth migration, using hybrid MPI+OpenMP programming techniques. We introduce a concept of flexi-depth iterations while depth migrating data in parallel imaging space, using optimized traveltime table computations. This concept provides flexibility to the algorithm by migrating data in a number of depth iterations, which depends upon the available node memory and the size of data to be migrated during runtime. Furthermore, it minimizes the requirements of storage, I/O and inter-node communication, thus making it advantageous over the conventional parallelization approaches. The developed parallel algorithm is demonstrated and analysed on Yuva II, a PARAM series of supercomputers. Optimization, performance and scalability experiment results along with the migration outcome show the effectiveness of the parallel algorithm.

  15. Non-Cartesian parallel imaging reconstruction.

    Science.gov (United States)

    Wright, Katherine L; Hamilton, Jesse I; Griswold, Mark A; Gulani, Vikas; Seiberlich, Nicole

    2014-11-01

    Non-Cartesian parallel imaging has played an important role in reducing data acquisition time in MRI. The use of non-Cartesian trajectories can enable more efficient coverage of k-space, which can be leveraged to reduce scan times. These trajectories can be undersampled to achieve even faster scan times, but the resulting images may contain aliasing artifacts. Just as Cartesian parallel imaging can be used to reconstruct images from undersampled Cartesian data, non-Cartesian parallel imaging methods can mitigate aliasing artifacts by using additional spatial encoding information in the form of the nonhomogeneous sensitivities of multi-coil phased arrays. This review will begin with an overview of non-Cartesian k-space trajectories and their sampling properties, followed by an in-depth discussion of several selected non-Cartesian parallel imaging algorithms. Three representative non-Cartesian parallel imaging methods will be described, including Conjugate Gradient SENSE (CG SENSE), non-Cartesian generalized autocalibrating partially parallel acquisition (GRAPPA), and Iterative Self-Consistent Parallel Imaging Reconstruction (SPIRiT). After a discussion of these three techniques, several potential promising clinical applications of non-Cartesian parallel imaging will be covered. © 2014 Wiley Periodicals, Inc.

  16. Time-dependent deterministic transport on parallel architectures using PARTISN

    International Nuclear Information System (INIS)

    Alcouffe, R.E.; Baker, R.S.

    1998-01-01

    In addition to the ability to solve the static transport equation, the authors have also incorporated time dependence into the parallel S N code PARTISN. Using a semi-implicit scheme, PARTISN is capable of performing time-dependent calculations for both fissioning and pure source driven problems. They have applied this to various types of problems such as shielding and prompt fission experiments. This paper describes the form of the time-dependent equations implemented, their solution strategies in PARTISN including iteration acceleration, and the strategies used for time-step control. Results are presented for a iron-water shielding calculation and a criticality excursion in a uranium solution configuration

  17. The ITER Fast Plant System Controller ATCA prototype Real-Time Software Architecture

    International Nuclear Information System (INIS)

    Carvalho, B.B.; Santos, B.; Carvalho, P.F.; Neto, A.; Boncagni, L.; Batista, A.J.N.; Correia, M.; Sousa, J.; Gonçalves, B.

    2013-01-01

    Highlights: ► High performance ATCA systems for fast control and data acquisition. ► IEEE1588 timing system and synchronization. ► Plasma control algorithms. ► Real-time control software frameworks. ► Targeted for nuclear fusion experiments with long duration discharges. -- Abstract: IPFN is developing a prototype Fast Plant System Controller (FPSC) based in ATCA embedded technologies dedicated to ITER CODAC data acquisition and control tasks in the sub-millisecond range. The main goal is to demonstrate the usability of the ATCA standard and its enhanced specifications for the high speed, very high density parallel data acquisition needs of the most demanding ITER tokamak plasma Instrumentation and Control (I and C) systems. This effort included the in-house development of a new family of high performance ATCA I/O and timing boards. The standard ITER software system CODAC Core System (CCS) v3.1, with the control based in the EPICS system does not cover yet the real-time requirements fulfilled by this hardware, so a new set of software components was developed for this specific platform, attempting to integrate and leverage the new features in CSS, for example the Multithreaded Application Real Time executor (MARTe) software framework, the new Data Archiving Network (DAN) solution, an ATCA IEEE-1588-2008 timing interface, and the Intelligent Platform Management Interface (IPMI) for system monitoring and remote management. This paper presents the overall software architecture for the ATCA FPSC, as well a discussion on the ITER constrains and design choices and finally a detailed description of the software components already developed

  18. The ITER Fast Plant System Controller ATCA prototype Real-Time Software Architecture

    Energy Technology Data Exchange (ETDEWEB)

    Carvalho, B.B., E-mail: bernardo@ipfn.ist.utl.pt [Associacao EURATOM/IST Instituto de Plasmas e Fusao Nuclear, Instituto Superior Tecnico, Universidade Tecnica de Lisboa, P-1049-001 Lisboa (Portugal); Santos, B.; Carvalho, P.F.; Neto, A. [Associacao EURATOM/IST Instituto de Plasmas e Fusao Nuclear, Instituto Superior Tecnico, Universidade Tecnica de Lisboa, P-1049-001 Lisboa (Portugal); Boncagni, L. [Associazione Euratom-ENEA sulla Fusione, Frascati Research Centre, Division of Fusion Physics, Frascati, Rome (Italy); Batista, A.J.N.; Correia, M.; Sousa, J.; Gonçalves, B. [Associacao EURATOM/IST Instituto de Plasmas e Fusao Nuclear, Instituto Superior Tecnico, Universidade Tecnica de Lisboa, P-1049-001 Lisboa (Portugal)

    2013-10-15

    Highlights: ► High performance ATCA systems for fast control and data acquisition. ► IEEE1588 timing system and synchronization. ► Plasma control algorithms. ► Real-time control software frameworks. ► Targeted for nuclear fusion experiments with long duration discharges. -- Abstract: IPFN is developing a prototype Fast Plant System Controller (FPSC) based in ATCA embedded technologies dedicated to ITER CODAC data acquisition and control tasks in the sub-millisecond range. The main goal is to demonstrate the usability of the ATCA standard and its enhanced specifications for the high speed, very high density parallel data acquisition needs of the most demanding ITER tokamak plasma Instrumentation and Control (I and C) systems. This effort included the in-house development of a new family of high performance ATCA I/O and timing boards. The standard ITER software system CODAC Core System (CCS) v3.1, with the control based in the EPICS system does not cover yet the real-time requirements fulfilled by this hardware, so a new set of software components was developed for this specific platform, attempting to integrate and leverage the new features in CSS, for example the Multithreaded Application Real Time executor (MARTe) software framework, the new Data Archiving Network (DAN) solution, an ATCA IEEE-1588-2008 timing interface, and the Intelligent Platform Management Interface (IPMI) for system monitoring and remote management. This paper presents the overall software architecture for the ATCA FPSC, as well a discussion on the ITER constrains and design choices and finally a detailed description of the software components already developed.

  19. Reducing dose calculation time for accurate iterative IMRT planning

    International Nuclear Information System (INIS)

    Siebers, Jeffrey V.; Lauterbach, Marc; Tong, Shidong; Wu Qiuwen; Mohan, Radhe

    2002-01-01

    A time-consuming component of IMRT optimization is the dose computation required in each iteration for the evaluation of the objective function. Accurate superposition/convolution (SC) and Monte Carlo (MC) dose calculations are currently considered too time-consuming for iterative IMRT dose calculation. Thus, fast, but less accurate algorithms such as pencil beam (PB) algorithms are typically used in most current IMRT systems. This paper describes two hybrid methods that utilize the speed of fast PB algorithms yet achieve the accuracy of optimizing based upon SC algorithms via the application of dose correction matrices. In one method, the ratio method, an infrequently computed voxel-by-voxel dose ratio matrix (R=D SC /D PB ) is applied for each beam to the dose distributions calculated with the PB method during the optimization. That is, D PB xR is used for the dose calculation during the optimization. The optimization proceeds until both the IMRT beam intensities and the dose correction ratio matrix converge. In the second method, the correction method, a periodically computed voxel-by-voxel correction matrix for each beam, defined to be the difference between the SC and PB dose computations, is used to correct PB dose distributions. To validate the methods, IMRT treatment plans developed with the hybrid methods are compared with those obtained when the SC algorithm is used for all optimization iterations and with those obtained when PB-based optimization is followed by SC-based optimization. In the 12 patient cases studied, no clinically significant differences exist in the final treatment plans developed with each of the dose computation methodologies. However, the number of time-consuming SC iterations is reduced from 6-32 for pure SC optimization to four or less for the ratio matrix method and five or less for the correction method. Because the PB algorithm is faster at computing dose, this reduces the inverse planning optimization time for our implementation

  20. Iterative algorithm for the volume integral method for magnetostatics problems

    International Nuclear Information System (INIS)

    Pasciak, J.E.

    1980-11-01

    Volume integral methods for solving nonlinear magnetostatics problems are considered in this paper. The integral method is discretized by a Galerkin technique. Estimates are given which show that the linearized problems are well conditioned and hence easily solved using iterative techniques. Comparisons of iterative algorithms with the elimination method of GFUN3D shows that the iterative method gives an order of magnitude improvement in computational time as well as memory requirements for large problems. Computational experiments for a test problem as well as a double layer dipole magnet are given. Error estimates for the linearized problem are also derived

  1. Parallel 3-D method of characteristics in MPACT

    International Nuclear Information System (INIS)

    Kochunas, B.; Dovvnar, T. J.; Liu, Z.

    2013-01-01

    A new parallel 3-D MOC kernel has been developed and implemented in MPACT which makes use of the modular ray tracing technique to reduce computational requirements and to facilitate parallel decomposition. The parallel model makes use of both distributed and shared memory parallelism which are implemented with the MPI and OpenMP standards, respectively. The kernel is capable of parallel decomposition of problems in space, angle, and by characteristic rays up to 0(104) processors. Initial verification of the parallel 3-D MOC kernel was performed using the Takeda 3-D transport benchmark problems. The eigenvalues computed by MPACT are within the statistical uncertainty of the benchmark reference and agree well with the averages of other participants. The MPACT k eff differs from the benchmark results for rodded and un-rodded cases by 11 and -40 pcm, respectively. The calculations were performed for various numbers of processors and parallel decompositions up to 15625 processors; all producing the same result at convergence. The parallel efficiency of the worst case was 60%, while very good efficiency (>95%) was observed for cases using 500 processors. The overall run time for the 500 processor case was 231 seconds and 19 seconds for the case with 15625 processors. Ongoing work is focused on developing theoretical performance models and the implementation of acceleration techniques to minimize the number of iterations to converge. (authors)

  2. A finite element solution method for quadrics parallel computer

    International Nuclear Information System (INIS)

    Zucchini, A.

    1996-08-01

    A distributed preconditioned conjugate gradient method for finite element analysis has been developed and implemented on a parallel SIMD Quadrics computer. The main characteristic of the method is that it does not require any actual assembling of all element equations in a global system. The physical domain of the problem is partitioned in cells of n p finite elements and each cell element is assigned to a different node of an n p -processors machine. Element stiffness matrices are stored in the data memory of the assigned processing node and the solution process is completely executed in parallel at element level. Inter-element and therefore inter-processor communications are required once per iteration to perform local sums of vector quantities between neighbouring elements. A prototype implementation has been tested on an 8-nodes Quadrics machine in a simple 2D benchmark problem

  3. Parallel computation for solving the tridiagonal linear system of equations

    International Nuclear Information System (INIS)

    Ishiguro, Misako; Harada, Hiroo; Fujii, Minoru; Fujimura, Toichiro; Nakamura, Yasuhiro; Nanba, Katsumi.

    1981-09-01

    Recently, applications of parallel computation for scientific calculations have increased from the need of the high speed calculation of large scale programs. At the JAERI computing center, an array processor FACOM 230-75 APU has installed to study the applicability of parallel computation for nuclear codes. We made some numerical experiments by using the APU on the methods of solution of tridiagonal linear equation which is an important problem in scientific calculations. Referring to the recent papers with parallel methods, we investigate eight ones. These are Gauss elimination method, Parallel Gauss method, Accelerated parallel Gauss method, Jacobi method, Recursive doubling method, Cyclic reduction method, Chebyshev iteration method, and Conjugate gradient method. The computing time and accuracy were compared among the methods on the basis of the numerical experiments. As the result, it is found that the Cyclic reduction method is best both in computing time and accuracy and the Gauss elimination method is the second one. (author)

  4. Eigenvalues calculation algorithms for {lambda}-modes determination. Parallelization approach

    Energy Technology Data Exchange (ETDEWEB)

    Vidal, V. [Universidad Politecnica de Valencia (Spain). Departamento de Sistemas Informaticos y Computacion; Verdu, G.; Munoz-Cobo, J.L. [Universidad Politecnica de Valencia (Spain). Departamento de Ingenieria Quimica y Nuclear; Ginestart, D. [Universidad Politecnica de Valencia (Spain). Departamento de Matematica Aplicada

    1997-03-01

    In this paper, we review two methods to obtain the {lambda}-modes of a nuclear reactor, Subspace Iteration method and Arnoldi`s method, which are popular methods to solve the partial eigenvalue problem for a given matrix. In the developed application for the neutron diffusion equation we include improved acceleration techniques for both methods. Also, we propose two parallelization approaches for these methods, a coarse grain parallelization and a fine grain one. We have tested the developed algorithms with two realistic problems, focusing on the efficiency of the methods according to the CPU times. (author).

  5. Iterative Splitting Methods for Differential Equations

    CERN Document Server

    Geiser, Juergen

    2011-01-01

    Iterative Splitting Methods for Differential Equations explains how to solve evolution equations via novel iterative-based splitting methods that efficiently use computational and memory resources. It focuses on systems of parabolic and hyperbolic equations, including convection-diffusion-reaction equations, heat equations, and wave equations. In the theoretical part of the book, the author discusses the main theorems and results of the stability and consistency analysis for ordinary differential equations. He then presents extensions of the iterative splitting methods to partial differential

  6. GPU-based parallel algorithm for blind image restoration using midfrequency-based methods

    Science.gov (United States)

    Xie, Lang; Luo, Yi-han; Bao, Qi-liang

    2013-08-01

    GPU-based general-purpose computing is a new branch of modern parallel computing, so the study of parallel algorithms specially designed for GPU hardware architecture is of great significance. In order to solve the problem of high computational complexity and poor real-time performance in blind image restoration, the midfrequency-based algorithm for blind image restoration was analyzed and improved in this paper. Furthermore, a midfrequency-based filtering method is also used to restore the image hardly with any recursion or iteration. Combining the algorithm with data intensiveness, data parallel computing and GPU execution model of single instruction and multiple threads, a new parallel midfrequency-based algorithm for blind image restoration is proposed in this paper, which is suitable for stream computing of GPU. In this algorithm, the GPU is utilized to accelerate the estimation of class-G point spread functions and midfrequency-based filtering. Aiming at better management of the GPU threads, the threads in a grid are scheduled according to the decomposition of the filtering data in frequency domain after the optimization of data access and the communication between the host and the device. The kernel parallelism structure is determined by the decomposition of the filtering data to ensure the transmission rate to get around the memory bandwidth limitation. The results show that, with the new algorithm, the operational speed is significantly increased and the real-time performance of image restoration is effectively improved, especially for high-resolution images.

  7. A parallel algorithm for solving the integral form of the discrete ordinates equations

    International Nuclear Information System (INIS)

    Zerr, R. J.; Azmy, Y. Y.

    2009-01-01

    The integral form of the discrete ordinates equations involves a system of equations that has a large, dense coefficient matrix. The serial construction methodology is presented and properties that affect the execution times to construct and solve the system are evaluated. Two approaches for massively parallel implementation of the solution algorithm are proposed and the current results of one of these are presented. The system of equations May be solved using two parallel solvers-block Jacobi and conjugate gradient. Results indicate that both methods can reduce overall wall-clock time for execution. The conjugate gradient solver exhibits better performance to compete with the traditional source iteration technique in terms of execution time and scalability. The parallel conjugate gradient method is synchronous, hence it does not increase the number of iterations for convergence compared to serial execution, and the efficiency of the algorithm demonstrates an apparent asymptotic decline. (authors)

  8. Migration of vectorized iterative solvers to distributed memory architectures

    Energy Technology Data Exchange (ETDEWEB)

    Pommerell, C. [AT& T Bell Labs., Murray Hill, NJ (United States); Ruehl, R. [CSCS-ETH, Manno (Switzerland)

    1994-12-31

    Both necessity and opportunity motivate the use of high-performance computers for iterative linear solvers. Necessity results from the size of the problems being solved-smaller problems are often better handled by direct methods. Opportunity arises from the formulation of the iterative methods in terms of simple linear algebra operations, even if this {open_quote}natural{close_quotes} parallelism is not easy to exploit in irregularly structured sparse matrices and with good preconditioners. As a result, high-performance implementations of iterative solvers have attracted a lot of interest in recent years. Most efforts are geared to vectorize or parallelize the dominating operation-structured or unstructured sparse matrix-vector multiplication, or to increase locality and parallelism by reformulating the algorithm-reducing global synchronization in inner products or local data exchange in preconditioners. Target architectures for iterative solvers currently include mostly vector supercomputers and architectures with one or few optimized (e.g., super-scalar and/or super-pipelined RISC) processors and hierarchical memory systems. More recently, parallel computers with physically distributed memory and a better price/performance ratio have been offered by vendors as a very interesting alternative to vector supercomputers. However, programming comfort on such distributed memory parallel processors (DMPPs) still lags behind. Here the authors are concerned with iterative solvers and their changing computing environment. In particular, they are considering migration from traditional vector supercomputers to DMPPs. Application requirements force one to use flexible and portable libraries. They want to extend the portability of iterative solvers rather than reimplementing everything for each new machine, or even for each new architecture.

  9. Fast parallel algorithm for three-dimensional distance-driven model in iterative computed tomography reconstruction

    International Nuclear Information System (INIS)

    Chen Jian-Lin; Li Lei; Wang Lin-Yuan; Cai Ai-Long; Xi Xiao-Qi; Zhang Han-Ming; Li Jian-Xin; Yan Bin

    2015-01-01

    The projection matrix model is used to describe the physical relationship between reconstructed object and projection. Such a model has a strong influence on projection and backprojection, two vital operations in iterative computed tomographic reconstruction. The distance-driven model (DDM) is a state-of-the-art technology that simulates forward and back projections. This model has a low computational complexity and a relatively high spatial resolution; however, it includes only a few methods in a parallel operation with a matched model scheme. This study introduces a fast and parallelizable algorithm to improve the traditional DDM for computing the parallel projection and backprojection operations. Our proposed model has been implemented on a GPU (graphic processing unit) platform and has achieved satisfactory computational efficiency with no approximation. The runtime for the projection and backprojection operations with our model is approximately 4.5 s and 10.5 s per loop, respectively, with an image size of 256×256×256 and 360 projections with a size of 512×512. We compare several general algorithms that have been proposed for maximizing GPU efficiency by using the unmatched projection/backprojection models in a parallel computation. The imaging resolution is not sacrificed and remains accurate during computed tomographic reconstruction. (paper)

  10. NUMERICAL WITHOUT ITERATION METHOD OF MODELING OF ELECTROMECHANICAL PROCESSES IN ASYNCHRONOUS ENGINES

    Directory of Open Access Journals (Sweden)

    D. G. Patalakh

    2018-02-01

    Full Text Available Purpose. Development of calculation of electromagnetic and electromechanic transients is in asynchronous engines without iterations. Methodology. Numeral methods of integration of usual differential equations, programming. Findings. As the system of equations, describing the dynamics of asynchronous engine, contents the products of rotor and stator currents and product of rotation frequency of rotor and currents, so this system is nonlinear one. The numeral solution of nonlinear differential equations supposes an iteration process on every step of integration. Time-continuing and badly converging iteration process may be the reason of calculation slowing. The improvement of numeral method by the way of an iteration process removing is offered. As result the modeling time is reduced. The improved numeral method is applied for integration of differential equations, describing the dynamics of asynchronous engine. Originality. The improvement of numeral method allowing to execute numeral integrations of differential equations containing product of functions is offered, that allows to avoid an iteration process on every step of integration and shorten modeling time. Practical value. On the basis of the offered methodology the universal program of modeling of electromechanics processes in asynchronous engines could be developed as taking advantage on fast-acting.

  11. GPU Parallel Bundle Block Adjustment

    Directory of Open Access Journals (Sweden)

    ZHENG Maoteng

    2017-09-01

    Full Text Available To deal with massive data in photogrammetry, we introduce the GPU parallel computing technology. The preconditioned conjugate gradient and inexact Newton method are also applied to decrease the iteration times while solving the normal equation. A brand new workflow of bundle adjustment is developed to utilize GPU parallel computing technology. Our method can avoid the storage and inversion of the big normal matrix, and compute the normal matrix in real time. The proposed method can not only largely decrease the memory requirement of normal matrix, but also largely improve the efficiency of bundle adjustment. It also achieves the same accuracy as the conventional method. Preliminary experiment results show that the bundle adjustment of a dataset with about 4500 images and 9 million image points can be done in only 1.5 minutes while achieving sub-pixel accuracy.

  12. Parallel time domain solvers for electrically large transient scattering problems

    KAUST Repository

    Liu, Yang

    2014-09-26

    Marching on in time (MOT)-based integral equation solvers represent an increasingly appealing avenue for analyzing transient electromagnetic interactions with large and complex structures. MOT integral equation solvers for analyzing electromagnetic scattering from perfect electrically conducting objects are obtained by enforcing electric field boundary conditions and implicitly time advance electric surface current densities by iteratively solving sparse systems of equations at all time steps. Contrary to finite difference and element competitors, these solvers apply to nonlinear and multi-scale structures comprising geometrically intricate and deep sub-wavelength features residing atop electrically large platforms. Moreover, they are high-order accurate, stable in the low- and high-frequency limits, and applicable to conducting and penetrable structures represented by highly irregular meshes. This presentation reviews some recent advances in the parallel implementations of time domain integral equation solvers, specifically those that leverage multilevel plane-wave time-domain algorithm (PWTD) on modern manycore computer architectures including graphics processing units (GPUs) and distributed memory supercomputers. The GPU-based implementation achieves at least one order of magnitude speedups compared to serial implementations while the distributed parallel implementation are highly scalable to thousands of compute-nodes. A distributed parallel PWTD kernel has been adopted to solve time domain surface/volume integral equations (TDSIE/TDVIE) for analyzing transient scattering from large and complex-shaped perfectly electrically conducting (PEC)/dielectric objects involving ten million/tens of millions of spatial unknowns.

  13. Comparison of the deflated preconditioned conjugate gradient method and parallel direct solver for composite materials

    NARCIS (Netherlands)

    Jönsthövel, T.B.; Van Gijzen, M.B.; MacLachlan, S.; Vuik, C.; Scarpas, A.

    2011-01-01

    The demand for large FE meshes increases as parallel computing becomes the standard in FE simulations. Direct and iterative solution methods are used to solve the resulting linear systems. Many applications concern composite materials, which are characterized by large discontinuities in the material

  14. Accuracy improvement of a hybrid robot for ITER application using POE modeling method

    International Nuclear Information System (INIS)

    Wang, Yongbo; Wu, Huapeng; Handroos, Heikki

    2013-01-01

    Highlights: ► The product of exponential (POE) formula for error modeling of hybrid robot. ► Differential Evolution (DE) algorithm for parameter identification. ► Simulation results are given to verify the effectiveness of the method. -- Abstract: This paper focuses on the kinematic calibration of a 10 degree-of-freedom (DOF) redundant serial–parallel hybrid robot to improve its accuracy. The robot was designed to perform the assembling and repairing tasks of the vacuum vessel (VV) of the international thermonuclear experimental reactor (ITER). By employing the product of exponentials (POEs) formula, we extended the POE-based calibration method from serial robot to redundant serial–parallel hybrid robot. The proposed method combines the forward and inverse kinematics together to formulate a hybrid calibration method for serial–parallel hybrid robot. Because of the high nonlinear characteristics of the error model and too many error parameters need to be identified, the traditional iterative linear least-square algorithms cannot be used to identify the parameter errors. This paper employs a global optimization algorithm, Differential Evolution (DE), to identify parameter errors by solving the inverse kinematics of the hybrid robot. Furthermore, after the parameter errors were identified, the DE algorithm was adopted to numerically solve the forward kinematics of the hybrid robot to demonstrate the accuracy improvement of the end-effector. Numerical simulations were carried out by generating random parameter errors at the allowed tolerance limit and generating a number of configuration poses in the robot workspace. Simulation of the real experimental conditions shows that the accuracy of the end-effector can be improved to the same precision level of the given external measurement device

  15. Accuracy improvement of a hybrid robot for ITER application using POE modeling method

    Energy Technology Data Exchange (ETDEWEB)

    Wang, Yongbo, E-mail: yongbo.wang@hotmail.com [Laboratory of Intelligent Machines, Lappeenranta University of Technology, FIN-53851 Lappeenranta (Finland); Wu, Huapeng; Handroos, Heikki [Laboratory of Intelligent Machines, Lappeenranta University of Technology, FIN-53851 Lappeenranta (Finland)

    2013-10-15

    Highlights: ► The product of exponential (POE) formula for error modeling of hybrid robot. ► Differential Evolution (DE) algorithm for parameter identification. ► Simulation results are given to verify the effectiveness of the method. -- Abstract: This paper focuses on the kinematic calibration of a 10 degree-of-freedom (DOF) redundant serial–parallel hybrid robot to improve its accuracy. The robot was designed to perform the assembling and repairing tasks of the vacuum vessel (VV) of the international thermonuclear experimental reactor (ITER). By employing the product of exponentials (POEs) formula, we extended the POE-based calibration method from serial robot to redundant serial–parallel hybrid robot. The proposed method combines the forward and inverse kinematics together to formulate a hybrid calibration method for serial–parallel hybrid robot. Because of the high nonlinear characteristics of the error model and too many error parameters need to be identified, the traditional iterative linear least-square algorithms cannot be used to identify the parameter errors. This paper employs a global optimization algorithm, Differential Evolution (DE), to identify parameter errors by solving the inverse kinematics of the hybrid robot. Furthermore, after the parameter errors were identified, the DE algorithm was adopted to numerically solve the forward kinematics of the hybrid robot to demonstrate the accuracy improvement of the end-effector. Numerical simulations were carried out by generating random parameter errors at the allowed tolerance limit and generating a number of configuration poses in the robot workspace. Simulation of the real experimental conditions shows that the accuracy of the end-effector can be improved to the same precision level of the given external measurement device.

  16. The Iterative Solution to Discrete-Time H∞ Control Problems for Periodic Systems

    Directory of Open Access Journals (Sweden)

    Ivan G. Ivanov

    2016-03-01

    Full Text Available This paper addresses the problem of solving discrete-time H ∞ control problems for periodic systems. The approach for solving such a type of equations is well known in the literature. However, the focus of our research is set on the numerical computation of the stabilizing solution. In particular, two effective methods for practical realization of the known iterative processes are described. Furthermore, a new iterative approach is investigated and applied. On the basis of numerical experiments, we compare the presented methods. A major conclusion is that the new iterative approach is faster than rest of the methods and it uses less RAM memory than other methods.

  17. Iteration schemes for parallelizing models of superconductivity

    Energy Technology Data Exchange (ETDEWEB)

    Gray, P.A. [Michigan State Univ., East Lansing, MI (United States)

    1996-12-31

    The time dependent Lawrence-Doniach model, valid for high fields and high values of the Ginzburg-Landau parameter, is often used for studying vortex dynamics in layered high-T{sub c} superconductors. When solving these equations numerically, the added degrees of complexity due to the coupling and nonlinearity of the model often warrant the use of high-performance computers for their solution. However, the interdependence between the layers can be manipulated so as to allow parallelization of the computations at an individual layer level. The reduced parallel tasks may then be solved independently using a heterogeneous cluster of networked workstations connected together with Parallel Virtual Machine (PVM) software. Here, this parallelization of the model is discussed and several computational implementations of varying degrees of parallelism are presented. Computational results are also given which contrast properties of convergence speed, stability, and consistency of these implementations. Included in these results are models involving the motion of vortices due to an applied current and pinning effects due to various material properties.

  18. Improved Iterative Parallel Interference Cancellation Receiver for Future Wireless DS-CDMA Systems

    Directory of Open Access Journals (Sweden)

    Andrea Bernacchioni

    2005-04-01

    Full Text Available We present a new turbo multiuser detector for turbo-coded direct sequence code division multiple access (DS-CDMA systems. The proposed detector is based on the utilization of a parallel interference cancellation (PIC and a bank of turbo decoders. The PIC is broken up in order to perform interference cancellation after each constituent decoder of the turbo decoding scheme. Moreover, in the paper we propose a new enhanced algorithm that provides a more accurate estimation of the signal-to-noise-plus-interference-ratio used in the tentative decision device and in the MAP decoding algorithm. The performance of the proposed receiver is evaluated by means of computer simulations for medium to very high system loads, in AWGN and multipath fading channel, and compared to recently proposed interference cancellation-based iterative MUD, by taking into account the number of iterations and the complexity involved. We will see that the proposed receiver outperforms the others especially for highly loaded systems.

  19. Numerov iteration method for second order integral-differential equation

    International Nuclear Information System (INIS)

    Zeng Fanan; Zhang Jiaju; Zhao Xuan

    1987-01-01

    In this paper, Numerov iterative method for second order integral-differential equation and system of equations are constructed. Numerical examples show that this method is better than direct method (Gauss elimination method) in CPU time and memoy requireing. Therefore, this method is an efficient method for solving integral-differential equation in nuclear physics

  20. Milestones in the Development of Iterative Solution Methods

    Directory of Open Access Journals (Sweden)

    Owe Axelsson

    2010-01-01

    Full Text Available Iterative solution methods to solve linear systems of equations were originally formulated as basic iteration methods of defect-correction type, commonly referred to as Richardson's iteration method. These methods developed further into various versions of splitting methods, including the successive overrelaxation (SOR method. Later, immensely important developments included convergence acceleration methods, such as the Chebyshev and conjugate gradient iteration methods and preconditioning methods of various forms. A major strive has been to find methods with a total computational complexity of optimal order, that is, proportional to the degrees of freedom involved in the equation. Methods that have turned out to have been particularly important for the further developments of linear equation solvers are surveyed. Some of them are presented in greater detail.

  1. Parallel processing of two-dimensional Sn transport calculations

    International Nuclear Information System (INIS)

    Uematsu, M.

    1997-01-01

    A parallel processing method for the two-dimensional S n transport code DOT3.5 has been developed to achieve a drastic reduction in computation time. In the proposed method, parallelization is achieved with angular domain decomposition and/or space domain decomposition. The calculational speed of parallel processing by angular domain decomposition is largely influenced by frequent communications between processing elements. To assess parallelization efficiency, sample problems with up to 32 x 32 spatial meshes were solved with a Sun workstation using the PVM message-passing library. As a result, parallel calculation using 16 processing elements, for example, was found to be nine times as fast as that with one processing element. As for parallel processing by geometry segmentation, the influence of processing element communications on computation time is small; however, discontinuity at the segment boundary degrades convergence speed. To accelerate the convergence, an alternate sweep of angular flux in conjunction with space domain decomposition and a two-step rescaling method consisting of segmentwise rescaling and ordinary pointwise rescaling have been developed. By applying the developed method, the number of iterations needed to obtain a converged flux solution was reduced by a factor of 2. As a result, parallel calculation using 16 processing elements was found to be 5.98 times as fast as the original DOT3.5 calculation

  2. The new Exponential Directional Iterative (EDI) 3-D Sn scheme for parallel adaptive differencing

    International Nuclear Information System (INIS)

    Sjoden, G.E.

    2005-01-01

    The new Exponential Directional Iterative (EDI) discrete ordinates (Sn) scheme for 3-D Cartesian Coordinates is presented. The EDI scheme is a logical extension of the positive, efficient Exponential Directional Weighted (EDW) Sn scheme currently used as the third level of the adaptive spatial differencing algorithm in the PENTRAN parallel discrete ordinates solver. Here, the derivation and advantages of the EDI scheme are presented; EDI uses EDW-rendered exponential coefficients as initial starting values to begin a fixed point iteration of the exponential coefficients. One issue that required evaluation was an iterative cutoff criterion to prevent the application of an unstable fixed point iteration; although this was needed in some cases, it was readily treated with a default to EDW. Iterative refinement of the exponential coefficients in EDI typically converged in fewer than four fixed point iterations. Moreover, EDI yielded more accurate angular fluxes compared to the other schemes tested, particularly in streaming conditions. Overall, it was found that the EDI scheme was up to an order of magnitude more accurate than the EDW scheme on a given mesh interval in streaming cases, and is potentially a good candidate as a fourth-level differencing scheme in the PENTRAN adaptive differencing sequence. The 3-D Cartesian computational cost of EDI was only about 20% more than the EDW scheme, and about 40% more than Diamond Zero (DZ). More evaluation and testing are required to determine suitable upgrade metrics for EDI to be fully integrated into the current adaptive spatial differencing sequence in PENTRAN. (author)

  3. Value Iteration Adaptive Dynamic Programming for Optimal Control of Discrete-Time Nonlinear Systems.

    Science.gov (United States)

    Wei, Qinglai; Liu, Derong; Lin, Hanquan

    2016-03-01

    In this paper, a value iteration adaptive dynamic programming (ADP) algorithm is developed to solve infinite horizon undiscounted optimal control problems for discrete-time nonlinear systems. The present value iteration ADP algorithm permits an arbitrary positive semi-definite function to initialize the algorithm. A novel convergence analysis is developed to guarantee that the iterative value function converges to the optimal performance index function. Initialized by different initial functions, it is proven that the iterative value function will be monotonically nonincreasing, monotonically nondecreasing, or nonmonotonic and will converge to the optimum. In this paper, for the first time, the admissibility properties of the iterative control laws are developed for value iteration algorithms. It is emphasized that new termination criteria are established to guarantee the effectiveness of the iterative control laws. Neural networks are used to approximate the iterative value function and compute the iterative control law, respectively, for facilitating the implementation of the iterative ADP algorithm. Finally, two simulation examples are given to illustrate the performance of the present method.

  4. Scalability of Parallel Scientific Applications on the Cloud

    Directory of Open Access Journals (Sweden)

    Satish Narayana Srirama

    2011-01-01

    Full Text Available Cloud computing, with its promise of virtually infinite resources, seems to suit well in solving resource greedy scientific computing problems. To study the effects of moving parallel scientific applications onto the cloud, we deployed several benchmark applications like matrix–vector operations and NAS parallel benchmarks, and DOUG (Domain decomposition On Unstructured Grids on the cloud. DOUG is an open source software package for parallel iterative solution of very large sparse systems of linear equations. The detailed analysis of DOUG on the cloud showed that parallel applications benefit a lot and scale reasonable on the cloud. We could also observe the limitations of the cloud and its comparison with cluster in terms of performance. However, for efficiently running the scientific applications on the cloud infrastructure, the applications must be reduced to frameworks that can successfully exploit the cloud resources, like the MapReduce framework. Several iterative and embarrassingly parallel algorithms are reduced to the MapReduce model and their performance is measured and analyzed. The analysis showed that Hadoop MapReduce has significant problems with iterative methods, while it suits well for embarrassingly parallel algorithms. Scientific computing often uses iterative methods to solve large problems. Thus, for scientific computing on the cloud, this paper raises the necessity for better frameworks or optimizations for MapReduce.

  5. LiDAR-IMU Time Delay Calibration Based on Iterative Closest Point and Iterated Sigma Point Kalman Filter.

    Science.gov (United States)

    Liu, Wanli

    2017-03-08

    The time delay calibration between Light Detection and Ranging (LiDAR) and Inertial Measurement Units (IMUs) is an essential prerequisite for its applications. However, the correspondences between LiDAR and IMU measurements are usually unknown, and thus cannot be computed directly for the time delay calibration. In order to solve the problem of LiDAR-IMU time delay calibration, this paper presents a fusion method based on iterative closest point (ICP) and iterated sigma point Kalman filter (ISPKF), which combines the advantages of ICP and ISPKF. The ICP algorithm can precisely determine the unknown transformation between LiDAR-IMU; and the ISPKF algorithm can optimally estimate the time delay calibration parameters. First of all, the coordinate transformation from the LiDAR frame to the IMU frame is realized. Second, the measurement model and time delay error model of LiDAR and IMU are established. Third, the methodology of the ICP and ISPKF procedure is presented for LiDAR-IMU time delay calibration. Experimental results are presented that validate the proposed method and demonstrate the time delay error can be accurately calibrated.

  6. Direct and iterative algorithms for the parallel solution of the one-dimensional macroscopic Navier-Stokes equations

    International Nuclear Information System (INIS)

    Doster, J.M.; Sills, E.D.

    1986-01-01

    Current efforts are under way to develop and evaluate numerical algorithms for the parallel solution of the large sparse matrix equations associated with the finite difference representation of the macroscopic Navier-Stokes equations. Previous work has shown that these equations can be cast into smaller coupled matrix equations suitable for solution utilizing multiple computer processors operating in parallel. The individual processors themselves may exhibit parallelism through the use of vector pipelines. This wor, has concentrated on the one-dimensional drift flux form of the Navier-Stokes equations. Direct and iterative algorithms that may be suitable for implementation on parallel computer architectures are evaluated in terms of accuracy and overall execution speed. This work has application to engineering and training simulations, on-line process control systems, and engineering workstations where increased computational speeds are required

  7. A linear iterative unfolding method

    International Nuclear Information System (INIS)

    László, András

    2012-01-01

    A frequently faced task in experimental physics is to measure the probability distribution of some quantity. Often this quantity to be measured is smeared by a non-ideal detector response or by some physical process. The procedure of removing this smearing effect from the measured distribution is called unfolding, and is a delicate problem in signal processing, due to the well-known numerical ill behavior of this task. Various methods were invented which, given some assumptions on the initial probability distribution, try to regularize the unfolding problem. Most of these methods definitely introduce bias into the estimate of the initial probability distribution. We propose a linear iterative method (motivated by the Neumann series / Landweber iteration known in functional analysis), which has the advantage that no assumptions on the initial probability distribution is needed, and the only regularization parameter is the stopping order of the iteration, which can be used to choose the best compromise between the introduced bias and the propagated statistical and systematic errors. The method is consistent: 'binwise' convergence to the initial probability distribution is proved in absence of measurement errors under a quite general condition on the response function. This condition holds for practical applications such as convolutions, calorimeter response functions, momentum reconstruction response functions based on tracking in magnetic field etc. In presence of measurement errors, explicit formulae for the propagation of the three important error terms is provided: bias error (distance from the unknown to-be-reconstructed initial distribution at a finite iteration order), statistical error, and systematic error. A trade-off between these three error terms can be used to define an optimal iteration stopping criterion, and the errors can be estimated there. We provide a numerical C library for the implementation of the method, which incorporates automatic

  8. Accelerating nuclear configuration interaction calculations through a preconditioned block iterative eigensolver

    Science.gov (United States)

    Shao, Meiyue; Aktulga, H. Metin; Yang, Chao; Ng, Esmond G.; Maris, Pieter; Vary, James P.

    2018-01-01

    We describe a number of recently developed techniques for improving the performance of large-scale nuclear configuration interaction calculations on high performance parallel computers. We show the benefit of using a preconditioned block iterative method to replace the Lanczos algorithm that has traditionally been used to perform this type of computation. The rapid convergence of the block iterative method is achieved by a proper choice of starting guesses of the eigenvectors and the construction of an effective preconditioner. These acceleration techniques take advantage of special structure of the nuclear configuration interaction problem which we discuss in detail. The use of a block method also allows us to improve the concurrency of the computation, and take advantage of the memory hierarchy of modern microprocessors to increase the arithmetic intensity of the computation relative to data movement. We also discuss the implementation details that are critical to achieving high performance on massively parallel multi-core supercomputers, and demonstrate that the new block iterative solver is two to three times faster than the Lanczos based algorithm for problems of moderate sizes on a Cray XC30 system.

  9. Parallel Implicit Algorithms for CFD

    Science.gov (United States)

    Keyes, David E.

    1998-01-01

    The main goal of this project was efficient distributed parallel and workstation cluster implementations of Newton-Krylov-Schwarz (NKS) solvers for implicit Computational Fluid Dynamics (CFD.) "Newton" refers to a quadratically convergent nonlinear iteration using gradient information based on the true residual, "Krylov" to an inner linear iteration that accesses the Jacobian matrix only through highly parallelizable sparse matrix-vector products, and "Schwarz" to a domain decomposition form of preconditioning the inner Krylov iterations with primarily neighbor-only exchange of data between the processors. Prior experience has established that Newton-Krylov methods are competitive solvers in the CFD context and that Krylov-Schwarz methods port well to distributed memory computers. The combination of the techniques into Newton-Krylov-Schwarz was implemented on 2D and 3D unstructured Euler codes on the parallel testbeds that used to be at LaRC and on several other parallel computers operated by other agencies or made available by the vendors. Early implementations were made directly in Massively Parallel Integration (MPI) with parallel solvers we adapted from legacy NASA codes and enhanced for full NKS functionality. Later implementations were made in the framework of the PETSC library from Argonne National Laboratory, which now includes pseudo-transient continuation Newton-Krylov-Schwarz solver capability (as a result of demands we made upon PETSC during our early porting experiences). A secondary project pursued with funding from this contract was parallel implicit solvers in acoustics, specifically in the Helmholtz formulation. A 2D acoustic inverse problem has been solved in parallel within the PETSC framework.

  10. Combining Compile-Time and Run-Time Parallelization

    Directory of Open Access Journals (Sweden)

    Sungdo Moon

    1999-01-01

    Full Text Available This paper demonstrates that significant improvements to automatic parallelization technology require that existing systems be extended in two ways: (1 they must combine high‐quality compile‐time analysis with low‐cost run‐time testing; and (2 they must take control flow into account during analysis. We support this claim with the results of an experiment that measures the safety of parallelization at run time for loops left unparallelized by the Stanford SUIF compiler’s automatic parallelization system. We present results of measurements on programs from two benchmark suites – SPECFP95 and NAS sample benchmarks – which identify inherently parallel loops in these programs that are missed by the compiler. We characterize remaining parallelization opportunities, and find that most of the loops require run‐time testing, analysis of control flow, or some combination of the two. We present a new compile‐time analysis technique that can be used to parallelize most of these remaining loops. This technique is designed to not only improve the results of compile‐time parallelization, but also to produce low‐cost, directed run‐time tests that allow the system to defer binding of parallelization until run‐time when safety cannot be proven statically. We call this approach predicated array data‐flow analysis. We augment array data‐flow analysis, which the compiler uses to identify independent and privatizable arrays, by associating predicates with array data‐flow values. Predicated array data‐flow analysis allows the compiler to derive “optimistic” data‐flow values guarded by predicates; these predicates can be used to derive a run‐time test guaranteeing the safety of parallelization.

  11. Method for simultaneous measurement of borehole and formation neutron decay-times employing iterative fitting

    International Nuclear Information System (INIS)

    Schultz, W.E.

    1982-01-01

    A method is described of making in situ measurements of the thermal neutron decay time of earth formations in the vicinity of a wellbore. The borehole and earth formations in its vicinity are repetitively irradiated with pulsed fast neutrons and, during the intervals between pulses, capture gamma radiation is measured in at least four, non-overlapping, contiguous time intervals. A background radiation measurement is made between successive pulses and used to correct count-rates representative of thermal neutron populations in the borehole and the formations, the count-rates being generated during each of the time intervals. The background-corrected count-rate measurements are iteratively fitted to exponential curves using a least squares technique to simultaneously derive signals representing borehole component and formation component of the thermal neutron decay time. The signals are recorded as a function of borehole depth. (author)

  12. Parallelizing flow-accumulation calculations on graphics processing units—From iterative DEM preprocessing algorithm to recursive multiple-flow-direction algorithm

    Science.gov (United States)

    Qin, Cheng-Zhi; Zhan, Lijun

    2012-06-01

    As one of the important tasks in digital terrain analysis, the calculation of flow accumulations from gridded digital elevation models (DEMs) usually involves two steps in a real application: (1) using an iterative DEM preprocessing algorithm to remove the depressions and flat areas commonly contained in real DEMs, and (2) using a recursive flow-direction algorithm to calculate the flow accumulation for every cell in the DEM. Because both algorithms are computationally intensive, quick calculation of the flow accumulations from a DEM (especially for a large area) presents a practical challenge to personal computer (PC) users. In recent years, rapid increases in hardware capacity of the graphics processing units (GPUs) provided in modern PCs have made it possible to meet this challenge in a PC environment. Parallel computing on GPUs using a compute-unified-device-architecture (CUDA) programming model has been explored to speed up the execution of the single-flow-direction algorithm (SFD). However, the parallel implementation on a GPU of the multiple-flow-direction (MFD) algorithm, which generally performs better than the SFD algorithm, has not been reported. Moreover, GPU-based parallelization of the DEM preprocessing step in the flow-accumulation calculations has not been addressed. This paper proposes a parallel approach to calculate flow accumulations (including both iterative DEM preprocessing and a recursive MFD algorithm) on a CUDA-compatible GPU. For the parallelization of an MFD algorithm (MFD-md), two different parallelization strategies using a GPU are explored. The first parallelization strategy, which has been used in the existing parallel SFD algorithm on GPU, has the problem of computing redundancy. Therefore, we designed a parallelization strategy based on graph theory. The application results show that the proposed parallel approach to calculate flow accumulations on a GPU performs much faster than either sequential algorithms or other parallel GPU

  13. Predictor-Corrector Quasi-Static Method Applied to Nonoverlapping Local/Global Iterations with 2-D/1-D Fusion Transport Kernel and p-CMFD Wrapper for Transient Reactor Analysis

    International Nuclear Information System (INIS)

    Cho, Bumhee; Cho, Nam Zin

    2015-01-01

    In this study, the steady-state p-CMFD adjoint flux is used as the weighting function to obtain PK parameters instead of the computationally expensive transport adjoint angular flux. Several numerical problems are investigated to see the capability of the PCQS method applied to the NLG iteration. CRX-2K adopts the nonoverlapping local/global (NLG) iterative method with the 2-D/1-D fusion transport kernel and the global p-CMFD wrapper. The parallelization of the NLG iteration has been recently implemented in CRX-2K and several numerical results are reported in a companion paper. However, the direct time discretization leads to a fine time step size to acquire an accurate transient solution, and the step size involved in the transport transient calculations is millisecond-order. Therefore, the transient calculations need much longer computing time than the steady-state calculation. To increase the time step size, Predictor-Corrector Quasi-Static (PCQS) method can be one option to apply to the NLG iteration. The PCQS method is a linear algorithm, so the shape function does not need to be updated more than once at a specific time step like a conventional quasi-static (QS) family such as Improved Quasi-Static (IQS) method. Moreover, the shape function in the PCQS method directly comes from the direct transport calculation (with a large time step), so one can easily implement the PCQS method in an existing transient transport code. Any QS method needs to solve the amplitude function in the form of the point kinetics (PK) equations, and accurate PK parameters can be obtained by the transport steady-state adjoint angular flux as a weighting function. The PCQS method is applied to the transient NLG iteration with the 2-D/1-D fusion transport kernel and the global p-CMFD wrapper, and has been implemented in CRX-2K. In the numerical problems, the PCQS method with the NLG iteration shows more accurate solutions compared to the direct transient calculations with large time step

  14. New concurrent iterative methods with monotonic convergence

    Energy Technology Data Exchange (ETDEWEB)

    Yao, Qingchuan [Michigan State Univ., East Lansing, MI (United States)

    1996-12-31

    This paper proposes the new concurrent iterative methods without using any derivatives for finding all zeros of polynomials simultaneously. The new methods are of monotonic convergence for both simple and multiple real-zeros of polynomials and are quadratically convergent. The corresponding accelerated concurrent iterative methods are obtained too. The new methods are good candidates for the application in solving symmetric eigenproblems.

  15. Multicore Performance of Block Algebraic Iterative Reconstruction Methods

    DEFF Research Database (Denmark)

    Sørensen, Hans Henrik B.; Hansen, Per Christian

    2014-01-01

    Algebraic iterative methods are routinely used for solving the ill-posed sparse linear systems arising in tomographic image reconstruction. Here we consider the algebraic reconstruction technique (ART) and the simultaneous iterative reconstruction techniques (SIRT), both of which rely on semiconv......Algebraic iterative methods are routinely used for solving the ill-posed sparse linear systems arising in tomographic image reconstruction. Here we consider the algebraic reconstruction technique (ART) and the simultaneous iterative reconstruction techniques (SIRT), both of which rely...... on semiconvergence. Block versions of these methods, based on a partitioning of the linear system, are able to combine the fast semiconvergence of ART with the better multicore properties of SIRT. These block methods separate into two classes: those that, in each iteration, access the blocks in a sequential manner...... a fixed relaxation parameter in each method, namely, the one that leads to the fastest semiconvergence. Computational results show that for multicore computers, the sequential approach is preferable....

  16. SU-E-I-93: Improved Imaging Quality for Multislice Helical CT Via Sparsity Regularized Iterative Image Reconstruction Method Based On Tensor Framelet

    International Nuclear Information System (INIS)

    Nam, H; Guo, M; Lee, K; Li, R; Xing, L; Gao, H

    2014-01-01

    Purpose: Inspired by compressive sensing, sparsity regularized iterative reconstruction method has been extensively studied. However, its utility pertinent to multislice helical 4D CT for radiotherapy with respect to imaging quality, dose, and time has not been thoroughly addressed. As the beginning of such an investigation, this work carries out the initial comparison of reconstructed imaging quality between sparsity regularized iterative method and analytic method through static phantom studies using a state-of-art 128-channel multi-slice Siemens helical CT scanner. Methods: In our iterative method, tensor framelet (TF) is chosen as the regularization method for its superior performance from total variation regularization in terms of reduced piecewise-constant artifacts and improved imaging quality that has been demonstrated in our prior work. On the other hand, X-ray transforms and its adjoints are computed on-the-fly through GPU implementation using our previous developed fast parallel algorithms with O(1) complexity per computing thread. For comparison, both FDK (approximate analytic method) and Katsevich algorithm (exact analytic method) are used for multislice helical CT image reconstruction. Results: The phantom experimental data with different imaging doses were acquired using a state-of-art 128-channel multi-slice Siemens helical CT scanner. The reconstructed image quality was compared between TF-based iterative method, FDK and Katsevich algorithm with the quantitative analysis for characterizing signal-to-noise ratio, image contrast, and spatial resolution of high-contrast and low-contrast objects. Conclusion: The experimental results suggest that our tensor framelet regularized iterative reconstruction algorithm improves the helical CT imaging quality from FDK and Katsevich algorithm for static experimental phantom studies that have been performed

  17. Various Newton-type iterative methods for solving nonlinear equations

    Directory of Open Access Journals (Sweden)

    Manoj Kumar

    2013-10-01

    Full Text Available The aim of the present paper is to introduce and investigate new ninth and seventh order convergent Newton-type iterative methods for solving nonlinear equations. The ninth order convergent Newton-type iterative method is made derivative free to obtain seventh-order convergent Newton-type iterative method. These new with and without derivative methods have efficiency indices 1.5518 and 1.6266, respectively. The error equations are used to establish the order of convergence of these proposed iterative methods. Finally, various numerical comparisons are implemented by MATLAB to demonstrate the performance of the developed methods.

  18. A mobile robot with parallel kinematics constructed under requirements for assembling and machining of the ITER vacuum vessel

    International Nuclear Information System (INIS)

    Pessi, P.; Huapeng Wu; Handroos, H.; Jones, L.

    2006-01-01

    ITER sectors require more stringent tolerances ± 5 mm than normally expected for the size of structure involved. The walls of ITER sectors are made of 60 mm thick stainless steel and are joined together by high efficiency structural and leak tight welds. In addition to the initial vacuum vessel assembly, sectors may have to be replaced for repair. Since commercially available machines are too heavy for the required machining operations and the lifting of a possible e-beam gun column system, and conventional robots lack the stiffness and accuracy in such machining condition, a new flexible, lightweight and mobile robotic machine is being considered. For the assembly of the ITER vacuum vessel sector, precise positioning of welding end-effectors, at some distance in a confined space from the available supports, will be required, which is not possible using conventional machines or robots. This paper presents a special robot, able to carry out welding and machining processes from inside the ITER vacuum vessel, consisting of a ten-degree-of-freedom parallel robot mounted on a carriage driven by electric motor/gearbox on a track. The robot consists of a Stewart platform based parallel mechanism. Water hydraulic cylinders are used as actuators to reach six degrees of freedom for parallel construction. Two linear and two rotational motions are used for enlargement the workspace of the manipulator. The robot carries both welding gun such as a TIG, hybrid laser or e-beam welding gun to weld the inner and outer walls of the ITER vacuum vessel sectors and machining tools to cut and milling the walls with necessary accuracy, it can also carry other tools and material to a required position inside the vacuum vessel . For assembling an on line six degrees of freedom seam finding algorithm has been developed, which enables the robot to find welding seam automatically in a very complex environment. In the machining multi flexible machining processes carried out automatically by

  19. Parallel solutions of the two-group neutron diffusion equations

    International Nuclear Information System (INIS)

    Zee, K.S.; Turinsky, P.J.

    1987-01-01

    Recent efforts to adapt various numerical solution algorithms to parallel computer architectures have addressed the possibility of substantially reducing the running time of few-group neutron diffusion calculations. The authors have developed an efficient iterative parallel algorithm and an associated computer code for the rapid solution of the finite difference method representation of the two-group neutron diffusion equations on the CRAY X/MP-48 supercomputer having multi-CPUs and vector pipelines. For realistic simulation of light water reactor cores, the code employees a macroscopic depletion model with trace capability for selected fission product transients and critical boron. In addition to this, moderator and fuel temperature feedback models are also incorporated into the code. The validity of the physics models used in the code were benchmarked against qualified codes and proved accurate. This work is an extension of previous work in that various feedback effects are accounted for in the system; the entire code is structured to accommodate extensive vectorization; and an additional parallelism by multitasking is achieved not only for the solution of the matrix equations associated with the inner iterations but also for the other segments of the code, e.g., outer iterations

  20. Development and control towards a parallel water hydraulic weld/cut robot for machining processes in ITER vacuum vessel

    International Nuclear Information System (INIS)

    Wu Huapeng; Handroos, Heikki; Pessi, Pekka; Kilkki, Juha; Jones, Lawrence

    2005-01-01

    This paper presents a special robot, able to carry out welding and machining processes from inside the ITER vacuum vessel (VV), consisting of a five degree-of-freedom parallel mechanism, mounted on a carriage driven by two electric motors on a rack. The kinematic design of the robot has been optimised for ITER access and a hydraulically actuated pre-prototype built. A hybrid controller is designed for the robot, including position, speed and pressure feedback loops to achieve high accuracy and high dynamic performances. Finally, the experimental tests are given and discussed

  1. Two-phase flow steam generator simulations on parallel computers using domain decomposition method

    International Nuclear Information System (INIS)

    Belliard, M.

    2003-01-01

    Within the framework of the Domain Decomposition Method (DDM), we present industrial steady state two-phase flow simulations of PWR Steam Generators (SG) using iteration-by-sub-domain methods: standard and Adaptive Dirichlet/Neumann methods (ADN). The averaged mixture balance equations are solved by a Fractional-Step algorithm, jointly with the Crank-Nicholson scheme and the Finite Element Method. The algorithm works with overlapping or non-overlapping sub-domains and with conforming or nonconforming meshing. Computations are run on PC networks or on massively parallel mainframe computers. A CEA code-linker and the PVM package are used (master-slave context). SG mock-up simulations, involving up to 32 sub-domains, highlight the efficiency (speed-up, scalability) and the robustness of the chosen approach. With the DDM, the computational problem size is easily increased to about 1,000,000 cells and the CPU time is significantly reduced. The difficulties related to industrial use are also discussed. (author)

  2. Discrete-Time Local Value Iteration Adaptive Dynamic Programming: Admissibility and Termination Analysis.

    Science.gov (United States)

    Wei, Qinglai; Liu, Derong; Lin, Qiao

    In this paper, a novel local value iteration adaptive dynamic programming (ADP) algorithm is developed to solve infinite horizon optimal control problems for discrete-time nonlinear systems. The focuses of this paper are to study admissibility properties and the termination criteria of discrete-time local value iteration ADP algorithms. In the discrete-time local value iteration ADP algorithm, the iterative value functions and the iterative control laws are both updated in a given subset of the state space in each iteration, instead of the whole state space. For the first time, admissibility properties of iterative control laws are analyzed for the local value iteration ADP algorithm. New termination criteria are established, which terminate the iterative local ADP algorithm with an admissible approximate optimal control law. Finally, simulation results are given to illustrate the performance of the developed algorithm.In this paper, a novel local value iteration adaptive dynamic programming (ADP) algorithm is developed to solve infinite horizon optimal control problems for discrete-time nonlinear systems. The focuses of this paper are to study admissibility properties and the termination criteria of discrete-time local value iteration ADP algorithms. In the discrete-time local value iteration ADP algorithm, the iterative value functions and the iterative control laws are both updated in a given subset of the state space in each iteration, instead of the whole state space. For the first time, admissibility properties of iterative control laws are analyzed for the local value iteration ADP algorithm. New termination criteria are established, which terminate the iterative local ADP algorithm with an admissible approximate optimal control law. Finally, simulation results are given to illustrate the performance of the developed algorithm.

  3. High-performance blob-based iterative three-dimensional reconstruction in electron tomography using multi-GPUs

    Directory of Open Access Journals (Sweden)

    Wan Xiaohua

    2012-06-01

    Full Text Available Abstract Background Three-dimensional (3D reconstruction in electron tomography (ET has emerged as a leading technique to elucidate the molecular structures of complex biological specimens. Blob-based iterative methods are advantageous reconstruction methods for 3D reconstruction in ET, but demand huge computational costs. Multiple graphic processing units (multi-GPUs offer an affordable platform to meet these demands. However, a synchronous communication scheme between multi-GPUs leads to idle GPU time, and a weighted matrix involved in iterative methods cannot be loaded into GPUs especially for large images due to the limited available memory of GPUs. Results In this paper we propose a multilevel parallel strategy combined with an asynchronous communication scheme and a blob-ELLR data structure to efficiently perform blob-based iterative reconstructions on multi-GPUs. The asynchronous communication scheme is used to minimize the idle GPU time so as to asynchronously overlap communications with computations. The blob-ELLR data structure only needs nearly 1/16 of the storage space in comparison with ELLPACK-R (ELLR data structure and yields significant acceleration. Conclusions Experimental results indicate that the multilevel parallel scheme combined with the asynchronous communication scheme and the blob-ELLR data structure allows efficient implementations of 3D reconstruction in ET on multi-GPUs.

  4. Iterative methods for tomography problems: implementation to a cross-well tomography problem

    Science.gov (United States)

    Karadeniz, M. F.; Weber, G. W.

    2018-01-01

    The velocity distribution between two boreholes is reconstructed by cross-well tomography, which is commonly used in geology. In this paper, iterative methods, Kaczmarz’s algorithm, algebraic reconstruction technique (ART), and simultaneous iterative reconstruction technique (SIRT), are implemented to a specific cross-well tomography problem. Convergence to the solution of these methods and their CPU time for the cross-well tomography problem are compared. Furthermore, these three methods for this problem are compared for different tolerance values.

  5. Conformable variational iteration method

    Directory of Open Access Journals (Sweden)

    Omer Acan

    2017-02-01

    Full Text Available In this study, we introduce the conformable variational iteration method based on new defined fractional derivative called conformable fractional derivative. This new method is applied two fractional order ordinary differential equations. To see how the solutions of this method, linear homogeneous and non-linear non-homogeneous fractional ordinary differential equations are selected. Obtained results are compared the exact solutions and their graphics are plotted to demonstrate efficiency and accuracy of the method.

  6. MINARET: Towards a time-dependent neutron transport parallel solver

    International Nuclear Information System (INIS)

    Baudron, A.M.; Lautard, J.J.; Maday, Y.; Mula, O.

    2013-01-01

    We present the newly developed time-dependent 3D multigroup discrete ordinates neutron transport solver that has recently been implemented in the MINARET code. The solver is the support for a study about computing acceleration techniques that involve parallel architectures. In this work, we will focus on the parallelization of two of the variables involved in our equation: the angular directions and the time. This last variable has been parallelized by a (time) domain decomposition method called the para-real in time algorithm. (authors)

  7. Iterative and range test methods for an inverse source problem for acoustic waves

    International Nuclear Information System (INIS)

    Alves, Carlos; Kress, Rainer; Serranho, Pedro

    2009-01-01

    We propose two methods for solving an inverse source problem for time-harmonic acoustic waves. Based on the reciprocity gap principle a nonlinear equation is presented for the locations and intensities of the point sources that can be solved via Newton iterations. To provide an initial guess for this iteration we suggest a range test algorithm for approximating the source locations. We give a mathematical foundation for the range test and exhibit its feasibility in connection with the iteration method by some numerical examples

  8. Inversion of potential field data using the finite element method on parallel computers

    Science.gov (United States)

    Gross, L.; Altinay, C.; Shaw, S.

    2015-11-01

    In this paper we present a formulation of the joint inversion of potential field anomaly data as an optimization problem with partial differential equation (PDE) constraints. The problem is solved using the iterative Broyden-Fletcher-Goldfarb-Shanno (BFGS) method with the Hessian operator of the regularization and cross-gradient component of the cost function as preconditioner. We will show that each iterative step requires the solution of several PDEs namely for the potential fields, for the adjoint defects and for the application of the preconditioner. In extension to the traditional discrete formulation the BFGS method is applied to continuous descriptions of the unknown physical properties in combination with an appropriate integral form of the dot product. The PDEs can easily be solved using standard conforming finite element methods (FEMs) with potentially different resolutions. For two examples we demonstrate that the number of PDE solutions required to reach a given tolerance in the BFGS iteration is controlled by weighting regularization and cross-gradient but is independent of the resolution of PDE discretization and that as a consequence the method is weakly scalable with the number of cells on parallel computers. We also show a comparison with the UBC-GIF GRAV3D code.

  9. Iterative Methods for the Non-LTE Transfer of Polarized Radiation: Resonance Line Polarization in One-dimensional Atmospheres

    Science.gov (United States)

    Trujillo Bueno, Javier; Manso Sainz, Rafael

    1999-05-01

    This paper shows how to generalize to non-LTE polarization transfer some operator splitting methods that were originally developed for solving unpolarized transfer problems. These are the Jacobi-based accelerated Λ-iteration (ALI) method of Olson, Auer, & Buchler and the iterative schemes based on Gauss-Seidel and successive overrelaxation (SOR) iteration of Trujillo Bueno and Fabiani Bendicho. The theoretical framework chosen for the formulation of polarization transfer problems is the quantum electrodynamics (QED) theory of Landi Degl'Innocenti, which specifies the excitation state of the atoms in terms of the irreducible tensor components of the atomic density matrix. This first paper establishes the grounds of our numerical approach to non-LTE polarization transfer by concentrating on the standard case of scattering line polarization in a gas of two-level atoms, including the Hanle effect due to a weak microturbulent and isotropic magnetic field. We begin demonstrating that the well-known Λ-iteration method leads to the self-consistent solution of this type of problem if one initializes using the ``exact'' solution corresponding to the unpolarized case. We show then how the above-mentioned splitting methods can be easily derived from this simple Λ-iteration scheme. We show that our SOR method is 10 times faster than the Jacobi-based ALI method, while our implementation of the Gauss-Seidel method is 4 times faster. These iterative schemes lead to the self-consistent solution independently of the chosen initialization. The convergence rate of these iterative methods is very high; they do not require either the construction or the inversion of any matrix, and the computing time per iteration is similar to that of the Λ-iteration method.

  10. A Novel Algorithm for Solving the Multidimensional Neutron Transport Equation on Massively Parallel Architectures

    Energy Technology Data Exchange (ETDEWEB)

    Azmy, Yousry

    2014-06-10

    We employ the Integral Transport Matrix Method (ITMM) as the kernel of new parallel solution methods for the discrete ordinates approximation of the within-group neutron transport equation. The ITMM abandons the repetitive mesh sweeps of the traditional source iterations (SI) scheme in favor of constructing stored operators that account for the direct coupling factors among all the cells' fluxes and between the cells' and boundary surfaces' fluxes. The main goals of this work are to develop the algorithms that construct these operators and employ them in the solution process, determine the most suitable way to parallelize the entire procedure, and evaluate the behavior and parallel performance of the developed methods with increasing number of processes, P. The fastest observed parallel solution method, Parallel Gauss-Seidel (PGS), was used in a weak scaling comparison with the PARTISN transport code, which uses the source iteration (SI) scheme parallelized with the Koch-baker-Alcouffe (KBA) method. Compared to the state-of-the-art SI-KBA with diffusion synthetic acceleration (DSA), this new method- even without acceleration/preconditioning-is completitive for optically thick problems as P is increased to the tens of thousands range. For the most optically thick cells tested, PGS reduced execution time by an approximate factor of three for problems with more than 130 million computational cells on P = 32,768. Moreover, the SI-DSA execution times's trend rises generally more steeply with increasing P than the PGS trend. Furthermore, the PGS method outperforms SI for the periodic heterogeneous layers (PHL) configuration problems. The PGS method outperforms SI and SI-DSA on as few as P = 16 for PHL problems and reduces execution time by a factor of ten or more for all problems considered with more than 2 million computational cells on P = 4.096.

  11. A derivation and scalable implementation of the synchronous parallel kinetic Monte Carlo method for simulating long-time dynamics

    Science.gov (United States)

    Byun, Hye Suk; El-Naggar, Mohamed Y.; Kalia, Rajiv K.; Nakano, Aiichiro; Vashishta, Priya

    2017-10-01

    Kinetic Monte Carlo (KMC) simulations are used to study long-time dynamics of a wide variety of systems. Unfortunately, the conventional KMC algorithm is not scalable to larger systems, since its time scale is inversely proportional to the simulated system size. A promising approach to resolving this issue is the synchronous parallel KMC (SPKMC) algorithm, which makes the time scale size-independent. This paper introduces a formal derivation of the SPKMC algorithm based on local transition-state and time-dependent Hartree approximations, as well as its scalable parallel implementation based on a dual linked-list cell method. The resulting algorithm has achieved a weak-scaling parallel efficiency of 0.935 on 1024 Intel Xeon processors for simulating biological electron transfer dynamics in a 4.2 billion-heme system, as well as decent strong-scaling parallel efficiency. The parallel code has been used to simulate a lattice of cytochrome complexes on a bacterial-membrane nanowire, and it is broadly applicable to other problems such as computational synthesis of new materials.

  12. Advances in iterative methods for nonlinear equations

    CERN Document Server

    Busquier, Sonia

    2016-01-01

    This book focuses on the approximation of nonlinear equations using iterative methods. Nine contributions are presented on the construction and analysis of these methods, the coverage encompassing convergence, efficiency, robustness, dynamics, and applications. Many problems are stated in the form of nonlinear equations, using mathematical modeling. In particular, a wide range of problems in Applied Mathematics and in Engineering can be solved by finding the solutions to these equations. The book reveals the importance of studying convergence aspects in iterative methods and shows that selection of the most efficient and robust iterative method for a given problem is crucial to guaranteeing a good approximation. A number of sample criteria for selecting the optimal method are presented, including those regarding the order of convergence, the computational cost, and the stability, including the dynamics. This book will appeal to researchers whose field of interest is related to nonlinear problems and equations...

  13. Iterative methods for weighted least-squares

    Energy Technology Data Exchange (ETDEWEB)

    Bobrovnikova, E.Y.; Vavasis, S.A. [Cornell Univ., Ithaca, NY (United States)

    1996-12-31

    A weighted least-squares problem with a very ill-conditioned weight matrix arises in many applications. Because of round-off errors, the standard conjugate gradient method for solving this system does not give the correct answer even after n iterations. In this paper we propose an iterative algorithm based on a new type of reorthogonalization that converges to the solution.

  14. Laplace transform overcoming principle drawbacks in application of the variational iteration method to fractional heat equations

    Directory of Open Access Journals (Sweden)

    Wu Guo-Cheng

    2012-01-01

    Full Text Available This note presents a Laplace transform approach in the determination of the Lagrange multiplier when the variational iteration method is applied to time fractional heat diffusion equation. The presented approach is more straightforward and allows some simplification in application of the variational iteration method to fractional differential equations, thus improving the convergence of the successive iterations.

  15. Dynamical behaviour of neuronal networks iterated with memory

    International Nuclear Information System (INIS)

    Melatagia, P.M.; Ndoundam, R.; Tchuente, M.

    2005-11-01

    We study memory iteration where the updating consider a longer history of each site and the set of interaction matrices is palindromic. We analyze two different ways of updating the networks: parallel iteration with memory and sequential iteration with memory that we introduce in this paper. For parallel iteration, we define Lyapunov functional which permits us to characterize the periods behaviour and explicitly bounds the transient lengths of neural networks iterated with memory. For sequential iteration, we use an algebraic invariant to characterize the periods behaviour of the studied model of neural computation. (author)

  16. Iterated interactions method. Realistic NN potential

    International Nuclear Information System (INIS)

    Gorbatov, A.M.; Skopich, V.L.; Kolganova, E.A.

    1991-01-01

    The method of iterated potential is tested in the case of realistic fermionic systems. As a base for comparison calculations of the 16 O system (using various versions of realistic NN potentials) by means of the angular potential-function method as well as operators of pairing correlation were used. The convergence of genealogical series is studied for the central Malfliet-Tjon potential. In addition the mathematical technique of microscopical calculations is improved: new equations for correlators in odd states are suggested and the technique of leading terms was applied for the first time to calculations of heavy p-shell nuclei in the basis of angular potential functions

  17. Efficient numerical methods for the large-scale, parallel solution of elastoplastic contact problems

    KAUST Repository

    Frohne, Jö rg; Heister, Timo; Bangerth, Wolfgang

    2015-01-01

    © 2016 John Wiley & Sons, Ltd. Quasi-static elastoplastic contact problems are ubiquitous in many industrial processes and other contexts, and their numerical simulation is consequently of great interest in accurately describing and optimizing production processes. The key component in these simulations is the solution of a single load step of a time iteration. From a mathematical perspective, the problems to be solved in each time step are characterized by the difficulties of variational inequalities for both the plastic behavior and the contact problem. Computationally, they also often lead to very large problems. In this paper, we present and evaluate a complete set of methods that are (1) designed to work well together and (2) allow for the efficient solution of such problems. In particular, we use adaptive finite element meshes with linear and quadratic elements, a Newton linearization of the plasticity, active set methods for the contact problem, and multigrid-preconditioned linear solvers. Through a sequence of numerical experiments, we show the performance of these methods. This includes highly accurate solutions of a three-dimensional benchmark problem and scaling our methods in parallel to 1024 cores and more than a billion unknowns.

  18. Efficient numerical methods for the large-scale, parallel solution of elastoplastic contact problems

    KAUST Repository

    Frohne, Jörg

    2015-08-06

    © 2016 John Wiley & Sons, Ltd. Quasi-static elastoplastic contact problems are ubiquitous in many industrial processes and other contexts, and their numerical simulation is consequently of great interest in accurately describing and optimizing production processes. The key component in these simulations is the solution of a single load step of a time iteration. From a mathematical perspective, the problems to be solved in each time step are characterized by the difficulties of variational inequalities for both the plastic behavior and the contact problem. Computationally, they also often lead to very large problems. In this paper, we present and evaluate a complete set of methods that are (1) designed to work well together and (2) allow for the efficient solution of such problems. In particular, we use adaptive finite element meshes with linear and quadratic elements, a Newton linearization of the plasticity, active set methods for the contact problem, and multigrid-preconditioned linear solvers. Through a sequence of numerical experiments, we show the performance of these methods. This includes highly accurate solutions of a three-dimensional benchmark problem and scaling our methods in parallel to 1024 cores and more than a billion unknowns.

  19. Some contributions towards the parallel simulation of time dependent neutron transport and the integration of observed data in real time

    International Nuclear Information System (INIS)

    Mula-Hernandez, Olga

    2014-01-01

    In this thesis, we have first developed a time dependent 3D neutron transport solver on unstructured meshes with discontinuous Galerkin finite elements spatial discretization. The solver (called MINARET) represents in itself an important contribution in reactor physics thanks to the accuracy that it can provide in the knowledge of the state of the core during severe accidents. It will also play an important role on vessel fluence calculations. From a mathematical point of view, the most important contribution has consisted in the implementation of modern algorithms that are well adapted for modern parallel architectures and that significantly decrease the computing times. A special effort has been done in order to efficiently parallelize the time variable by the use of the parareal in time algorithm. For this, we have first analyzed the performances that the classical scheme of parareal can provide when applied to the resolution of the neutron transport equation in a reactor core. Then, with the purpose of improving these performances, a parareal scheme that takes more efficiently into account the presence of other iterative schemes in the resolution of each time step has been proposed. The main idea consists in limiting the number of internal iterations for each time step and to reach convergence across the parareal iterations. A second phase of our work has been motivated by the following question: given the high degree of accuracy that MINARET can provide in the modeling of the neutron population, could we somehow use it as a tool to monitor in real time the population of neutrons on the purpose of helping in the operation of the reactor? And, what is more, how to make such a tool be coherent in some sense with the measurements taken in situ? One of the main challenges of this problem is the real time aspect of the simulations. Indeed, despite all of our efforts to speed-up the calculations, the discretization methods used in MINARET do not provide simulations

  20. Leapfrog variants of iterative methods for linear algebra equations

    Science.gov (United States)

    Saylor, Paul E.

    1988-01-01

    Two iterative methods are considered, Richardson's method and a general second order method. For both methods, a variant of the method is derived for which only even numbered iterates are computed. The variant is called a leapfrog method. Comparisons between the conventional form of the methods and the leapfrog form are made under the assumption that the number of unknowns is large. In the case of Richardson's method, it is possible to express the final iterate in terms of only the initial approximation, a variant of the iteration called the grand-leap method. In the case of the grand-leap variant, a set of parameters is required. An algorithm is presented to compute these parameters that is related to algorithms to compute the weights and abscissas for Gaussian quadrature. General algorithms to implement the leapfrog and grand-leap methods are presented. Algorithms for the important special case of the Chebyshev method are also given.

  1. Parallel computation of rotating flows

    DEFF Research Database (Denmark)

    Lundin, Lars Kristian; Barker, Vincent A.; Sørensen, Jens Nørkær

    1999-01-01

    This paper deals with the simulation of 3‐D rotating flows based on the velocity‐vorticity formulation of the Navier‐Stokes equations in cylindrical coordinates. The governing equations are discretized by a finite difference method. The solution is advanced to a new time level by a two‐step process...... is that of solving a singular, large, sparse, over‐determined linear system of equations, and the iterative method CGLS is applied for this purpose. We discuss some of the mathematical and numerical aspects of this procedure and report on the performance of our software on a wide range of parallel computers. Darbe...

  2. Parallel GPU implementation of iterative PCA algorithms.

    Science.gov (United States)

    Andrecut, M

    2009-11-01

    Principal component analysis (PCA) is a key statistical technique for multivariate data analysis. For large data sets, the common approach to PCA computation is based on the standard NIPALS-PCA algorithm, which unfortunately suffers from loss of orthogonality, and therefore its applicability is usually limited to the estimation of the first few components. Here we present an algorithm based on Gram-Schmidt orthogonalization (called GS-PCA), which eliminates this shortcoming of NIPALS-PCA. Also, we discuss the GPU (Graphics Processing Unit) parallel implementation of both NIPALS-PCA and GS-PCA algorithms. The numerical results show that the GPU parallel optimized versions, based on CUBLAS (NVIDIA), are substantially faster (up to 12 times) than the CPU optimized versions based on CBLAS (GNU Scientific Library).

  3. Iterative methods for 3D implicit finite-difference migration using the complex Padé approximation

    International Nuclear Information System (INIS)

    Costa, Carlos A N; Campos, Itamara S; Costa, Jessé C; Neto, Francisco A; Schleicher, Jörg; Novais, Amélia

    2013-01-01

    Conventional implementations of 3D finite-difference (FD) migration use splitting techniques to accelerate performance and save computational cost. However, such techniques are plagued with numerical anisotropy that jeopardises the correct positioning of dipping reflectors in the directions not used for the operator splitting. We implement 3D downward continuation FD migration without splitting using a complex Padé approximation. In this way, the numerical anisotropy is eliminated at the expense of a computationally more intensive solution of a large-band linear system. We compare the performance of the iterative stabilized biconjugate gradient (BICGSTAB) and that of the multifrontal massively parallel direct solver (MUMPS). It turns out that the use of the complex Padé approximation not only stabilizes the solution, but also acts as an effective preconditioner for the BICGSTAB algorithm, reducing the number of iterations as compared to the implementation using the real Padé expansion. As a consequence, the iterative BICGSTAB method is more efficient than the direct MUMPS method when solving a single term in the Padé expansion. The results of both algorithms, here evaluated by computing the migration impulse response in the SEG/EAGE salt model, are of comparable quality. (paper)

  4. Variational iteration method for one dimensional nonlinear thermoelasticity

    International Nuclear Information System (INIS)

    Sweilam, N.H.; Khader, M.M.

    2007-01-01

    This paper applies the variational iteration method to solve the Cauchy problem arising in one dimensional nonlinear thermoelasticity. The advantage of this method is to overcome the difficulty of calculation of Adomian's polynomials in the Adomian's decomposition method. The numerical results of this method are compared with the exact solution of an artificial model to show the efficiency of the method. The approximate solutions show that the variational iteration method is a powerful mathematical tool for solving nonlinear problems

  5. A parallel algorithm for the non-symmetric eigenvalue problem

    International Nuclear Information System (INIS)

    Sidani, M.M.

    1991-01-01

    An algorithm is presented for the solution of the non-symmetric eigenvalue problem. The algorithm is based on a divide-and-conquer procedure that provides initial approximations to the eigenpairs, which are then refined using Newton iterations. Since the smaller subproblems can be solved independently, and since Newton iterations with different initial guesses can be started simultaneously, the algorithm - unlike the standard QR method - is ideal for parallel computers. The author also reports on his investigation of deflation methods designed to obtain further eigenpairs if needed. Numerical results from implementations on a host of parallel machines (distributed and shared-memory) are presented

  6. An Iterative Brinkman penalization for particle vortex methods

    DEFF Research Database (Denmark)

    Walther, Jens Honore; Hejlesen, Mads Mølholm; Leonard, A.

    2013-01-01

    We present an iterative Brinkman penalization method for the enforcement of the no-slip boundary condition in vortex particle methods. This is achieved by implementing a penalization of the velocity field using iteration of the penalized vorticity. We show that using the conventional Brinkman...... condition. These are: the impulsively started flow past a cylinder, the impulsively started flow normal to a flat plate, and the uniformly accelerated flow normal to a flat plate. The iterative penalization algorithm is shown to give significantly improved results compared to the conventional penalization...

  7. Application of the perturbation iteration method to boundary layer type problems.

    Science.gov (United States)

    Pakdemirli, Mehmet

    2016-01-01

    The recently developed perturbation iteration method is applied to boundary layer type singular problems for the first time. As a preliminary work on the topic, the simplest algorithm of PIA(1,1) is employed in the calculations. Linear and nonlinear problems are solved to outline the basic ideas of the new solution technique. The inner and outer solutions are determined with the iteration algorithm and matched to construct a composite expansion valid within all parts of the domain. The solutions are contrasted with the available exact or numerical solutions. It is shown that the perturbation-iteration algorithm can be effectively used for solving boundary layer type problems.

  8. Real-time sawtooth control and neoclassical tearing mode preemption in ITER

    Energy Technology Data Exchange (ETDEWEB)

    Kim, D., E-mail: doohyun.kim@epfl.ch; Goodman, T. P.; Sauter, O. [École Polytechnique Fédérale de Lausanne (EPFL), Centre de Recherches en Physique des Plasmas (CRPP), CH-1015 Lausanne (Switzerland)

    2014-06-15

    Real-time control of multiple plasma actuators is a requirement in advanced tokamaks; for example, for burn control, plasma current profile control and MHD stabilization—electron cyclotron (EC) wave absorption is ideally suited especially for the latter. On ITER, 24 EC sources can be switched between 56 inputs at the torus. In the torus, 5 launchers direct the power to various locations across the plasma profile via 11 steerable mirrors. For optimal usage of the available power, the aiming and polarization of the beams must be adapted to the plasma configuration and the needs of the scenario. Since the EC system performs many competing tasks, present day systems should demonstrate the ability of an EC plant to deal with several targets in parallel and/or to switch smoothly between goals to attain overall satisfaction. Based on pacing and locking experiments performed on TCV (Tokamak à Configuration Variable), the real-time sawtooth control of ITER with this complex set of actuators is analyzed, as an example. It is shown that sawtooth locking and pacing are possible with various levels of powers, leading to different time delays between the end of the EC power phase and the next sawtooth crash. This timing is important since it allows use of the same launchers for neoclassical tearing mode (NTM) preemption at the q = 1.5 or 2 surface, avoiding the need to switch power between launchers. These options are presented. It is also demonstrated that increasing the total EC power does not necessarily increase the range of control because of the geometry of the launchers.

  9. New methods of testing nonlinear hypothesis using iterative NLLS estimator

    Science.gov (United States)

    Mahaboob, B.; Venkateswarlu, B.; Mokeshrayalu, G.; Balasiddamuni, P.

    2017-11-01

    This research paper discusses the method of testing nonlinear hypothesis using iterative Nonlinear Least Squares (NLLS) estimator. Takeshi Amemiya [1] explained this method. However in the present research paper, a modified Wald test statistic due to Engle, Robert [6] is proposed to test the nonlinear hypothesis using iterative NLLS estimator. An alternative method for testing nonlinear hypothesis using iterative NLLS estimator based on nonlinear hypothesis using iterative NLLS estimator based on nonlinear studentized residuals has been proposed. In this research article an innovative method of testing nonlinear hypothesis using iterative restricted NLLS estimator is derived. Pesaran and Deaton [10] explained the methods of testing nonlinear hypothesis. This paper uses asymptotic properties of nonlinear least squares estimator proposed by Jenrich [8]. The main purpose of this paper is to provide very innovative methods of testing nonlinear hypothesis using iterative NLLS estimator, iterative NLLS estimator based on nonlinear studentized residuals and iterative restricted NLLS estimator. Eakambaram et al. [12] discussed least absolute deviation estimations versus nonlinear regression model with heteroscedastic errors and also they studied the problem of heteroscedasticity with reference to nonlinear regression models with suitable illustration. William Grene [13] examined the interaction effect in nonlinear models disused by Ai and Norton [14] and suggested ways to examine the effects that do not involve statistical testing. Peter [15] provided guidelines for identifying composite hypothesis and addressing the probability of false rejection for multiple hypotheses.

  10. AIR Tools - A MATLAB package of algebraic iterative reconstruction methods

    DEFF Research Database (Denmark)

    Hansen, Per Christian; Saxild-Hansen, Maria

    2012-01-01

    We present a MATLAB package with implementations of several algebraic iterative reconstruction methods for discretizations of inverse problems. These so-called row action methods rely on semi-convergence for achieving the necessary regularization of the problem. Two classes of methods are impleme......We present a MATLAB package with implementations of several algebraic iterative reconstruction methods for discretizations of inverse problems. These so-called row action methods rely on semi-convergence for achieving the necessary regularization of the problem. Two classes of methods...... are implemented: Algebraic Reconstruction Techniques (ART) and Simultaneous Iterative Reconstruction Techniques (SIRT). In addition we provide a few simplified test problems from medical and seismic tomography. For each iterative method, a number of strategies are available for choosing the relaxation parameter...

  11. A new non-iterative method for fitting Lorentzian to Moessbauer spectra

    International Nuclear Information System (INIS)

    Mukoyama, T.; Vegh, J.

    1980-01-01

    A new method for fitting a Lorentzian function without an iterative procedure is presented. The method is quicker and simpler than the previously proposed method of non-iterative fitting. Comparison with the previous method and with the conventional iterative method has been made. It is shown that the present method gives satisfactory results. (orig.)

  12. A comparison theorem for the SOR iterative method

    Science.gov (United States)

    Sun, Li-Ying

    2005-09-01

    In 1997, Kohno et al. have reported numerically that the improving modified Gauss-Seidel method, which was referred to as the IMGS method, is superior to the SOR iterative method. In this paper, we prove that the spectral radius of the IMGS method is smaller than that of the SOR method and Gauss-Seidel method, if the relaxation parameter [omega][set membership, variant](0,1]. As a result, we prove theoretically that this method is succeeded in improving the convergence of some classical iterative methods. Some recent results are improved.

  13. Preliminary Study on the Enhancement of Reconstruction Speed for Emission Computed Tomography Using Parallel Processing

    International Nuclear Information System (INIS)

    Park, Min Jae; Lee, Jae Sung; Kim, Soo Mee; Kang, Ji Yeon; Lee, Dong Soo; Park, Kwang Suk

    2009-01-01

    Conventional image reconstruction uses simplified physical models of projection. However, real physics, for example 3D reconstruction, takes too long time to process all the data in clinic and is unable in a common reconstruction machine because of the large memory for complex physical models. We suggest the realistic distributed memory model of fast-reconstruction using parallel processing on personal computers to enable large-scale technologies. The preliminary tests for the possibility on virtual machines and various performance test on commercial super computer, Tachyon were performed. Expectation maximization algorithm with common 2D projection and realistic 3D line of response were tested. Since the process time was getting slower (max 6 times) after a certain iteration, optimization for compiler was performed to maximize the efficiency of parallelization. Parallel processing of a program on multiple computers was available on Linux with MPICH and NFS. We verified that differences between parallel processed image and single processed image at the same iterations were under the significant digits of floating point number, about 6 bit. Double processors showed good efficiency (1.96 times) of parallel computing. Delay phenomenon was solved by vectorization method using SSE. Through the study, realistic parallel computing system in clinic was established to be able to reconstruct by plenty of memory using the realistic physical models which was impossible to simplify

  14. Parallel conjugate gradient algorithms for manipulator dynamic simulation

    Science.gov (United States)

    Fijany, Amir; Scheld, Robert E.

    1989-01-01

    Parallel conjugate gradient algorithms for the computation of multibody dynamics are developed for the specialized case of a robot manipulator. For an n-dimensional positive-definite linear system, the Classical Conjugate Gradient (CCG) algorithms are guaranteed to converge in n iterations, each with a computation cost of O(n); this leads to a total computational cost of O(n sq) on a serial processor. A conjugate gradient algorithms is presented that provide greater efficiency using a preconditioner, which reduces the number of iterations required, and by exploiting parallelism, which reduces the cost of each iteration. Two Preconditioned Conjugate Gradient (PCG) algorithms are proposed which respectively use a diagonal and a tridiagonal matrix, composed of the diagonal and tridiagonal elements of the mass matrix, as preconditioners. Parallel algorithms are developed to compute the preconditioners and their inversions in O(log sub 2 n) steps using n processors. A parallel algorithm is also presented which, on the same architecture, achieves the computational time of O(log sub 2 n) for each iteration. Simulation results for a seven degree-of-freedom manipulator are presented. Variants of the proposed algorithms are also developed which can be efficiently implemented on the Robot Mathematics Processor (RMP).

  15. The danger of iteration methods

    International Nuclear Information System (INIS)

    Villain, J.; Semeria, B.

    1983-01-01

    When a Hamiltonian H depends on variables phisub(i), the values of these variables which minimize H satisfy the equations deltaH/deltaphisub(i) = O. If this set of equations is solved by iteration, there is no guarantee that the solution is the one which minimizes H. In the case of a harmonic system with a random potential periodic with respect to the phisub(i)'s, the fluctuations have been calculated by Efetov and Larkin by means of the iteration method. The result is wrong in the case of a strong disorder. Even in the weak disorder case, it is wrong for a one-dimensional system and for a finite system of 2 particles. It is argued that the results obtained by iteration are always wrong, and that between 2 and 4 dimensions, spin-pair correlation functions decay like powers of the distance, as found by Aharony and Pytte for another model

  16. AIR Tools II: algebraic iterative reconstruction methods, improved implementation

    DEFF Research Database (Denmark)

    Hansen, Per Christian; Jørgensen, Jakob Sauer

    2017-01-01

    with algebraic iterative methods and their convergence properties. The present software is a much expanded and improved version of the package AIR Tools from 2012, based on a new modular design. In addition to improved performance and memory use, we provide more flexible iterative methods, a column-action method...

  17. Parallelization of pressure equation solver for incompressible N-S equations

    International Nuclear Information System (INIS)

    Ichihara, Kiyoshi; Yokokawa, Mitsuo; Kaburaki, Hideo.

    1996-03-01

    A pressure equation solver in a code for 3-dimensional incompressible flow analysis has been parallelized by using red-black SOR method and PCG method on Fujitsu VPP500, a vector parallel computer with distributed memory. For the comparison of scalability, the solver using the red-black SOR method has been also parallelized on the Intel Paragon, a scalar parallel computer with a distributed memory. The scalability of the red-black SOR method on both VPP500 and Paragon was lost, when number of processor elements was increased. The reason of non-scalability on both systems is increasing communication time between processor elements. In addition, the parallelization by DO-loop division makes the vectorizing efficiency lower on VPP500. For an effective implementation on VPP500, a large scale problem which holds very long vectorized DO-loops in the parallel program should be solved. PCG method with red-black SOR method applied to incomplete LU factorization (red-black PCG) has more iteration steps than normal PCG method with forward and backward substitution, in spite of same number of the floating point operations in a DO-loop of incomplete LU factorization. The parallelized red-black PCG method has less merits than the parallelized red-black SOR method when the computational region has fewer grids, because the low vectorization efficiency is obtained in red-black PCG method. (author)

  18. Iterative Runge–Kutta-type methods for nonlinear ill-posed problems

    International Nuclear Information System (INIS)

    Böckmann, C; Pornsawad, P

    2008-01-01

    We present a regularization method for solving nonlinear ill-posed problems by applying the family of Runge–Kutta methods to an initial value problem, in particular, to the asymptotical regularization method. We prove that the developed iterative regularization method converges to a solution under certain conditions and with a general stopping rule. Some particular iterative regularization methods are numerically implemented. Numerical results of the examples show that the developed Runge–Kutta-type regularization methods yield stable solutions and that particular implicit methods are very efficient in saving iteration steps

  19. Step by step parallel programming method for molecular dynamics code

    International Nuclear Information System (INIS)

    Orii, Shigeo; Ohta, Toshio

    1996-07-01

    Parallel programming for a numerical simulation program of molecular dynamics is carried out with a step-by-step programming technique using the two phase method. As a result, within the range of a certain computing parameters, it is found to obtain parallel performance by using the level of parallel programming which decomposes the calculation according to indices of do-loops into each processor on the vector parallel computer VPP500 and the scalar parallel computer Paragon. It is also found that VPP500 shows parallel performance in wider range computing parameters. The reason is that the time cost of the program parts, which can not be reduced by the do-loop level of the parallel programming, can be reduced to the negligible level by the vectorization. After that, the time consuming parts of the program are concentrated on less parts that can be accelerated by the do-loop level of the parallel programming. This report shows the step-by-step parallel programming method and the parallel performance of the molecular dynamics code on VPP500 and Paragon. (author)

  20. A transport synthetic acceleration method for transport iterations

    International Nuclear Information System (INIS)

    Ramone, G.L.; Adams, M.L.

    1997-01-01

    A family of transport synthetic acceleration (TSA) methods for iteratively solving within group scattering problems is presented. A single iteration in these schemes consists of a transport sweep followed by a low-order calculation, which itself is a simplified transport problem. The method for isotropic-scattering problems in X-Y geometry is described. The Fourier analysis of a model problem for equations with no spatial discretization shows that a previously proposed TSA method is unstable in two dimensions but that the modifications make it stable and rapidly convergent. The same procedure for discretized transport equations, using the step characteristic and two bilinear discontinuous methods, shows that discretization enhances TSA performance. A conjugate gradient algorithm for the low-order problem is described, a crude quadrature set for the low-order problem is proposed, and the number of low-order iterations per high-order sweep is limited to a relatively small value. These features lead to simple and efficient improvements to the method. TSA is tested on a series of problems, and a set of parameters is proposed for which the method behaves especially well. TSA achieves a substantial reduction in computational cost over source iteration, regardless of discretization parameters or material properties, and this reduction increases with the difficulty of the problem

  1. A Parallel Particle Swarm Optimization Algorithm Accelerated by Asynchronous Evaluations

    Science.gov (United States)

    Venter, Gerhard; Sobieszczanski-Sobieski, Jaroslaw

    2005-01-01

    A parallel Particle Swarm Optimization (PSO) algorithm is presented. Particle swarm optimization is a fairly recent addition to the family of non-gradient based, probabilistic search algorithms that is based on a simplified social model and is closely tied to swarming theory. Although PSO algorithms present several attractive properties to the designer, they are plagued by high computational cost as measured by elapsed time. One approach to reduce the elapsed time is to make use of coarse-grained parallelization to evaluate the design points. Previous parallel PSO algorithms were mostly implemented in a synchronous manner, where all design points within a design iteration are evaluated before the next iteration is started. This approach leads to poor parallel speedup in cases where a heterogeneous parallel environment is used and/or where the analysis time depends on the design point being analyzed. This paper introduces an asynchronous parallel PSO algorithm that greatly improves the parallel e ciency. The asynchronous algorithm is benchmarked on a cluster assembled of Apple Macintosh G5 desktop computers, using the multi-disciplinary optimization of a typical transport aircraft wing as an example.

  2. MICADO: Parallel implementation of a 2D-1D iterative algorithm for the 3D neutron transport problem in prismatic geometries

    International Nuclear Information System (INIS)

    Fevotte, F.; Lathuiliere, B.

    2013-01-01

    The large increase in computing power over the past few years now makes it possible to consider developing 3D full-core heterogeneous deterministic neutron transport solvers for reference calculations. Among all approaches presented in the literature, the method first introduced in [1] seems very promising. It consists in iterating over resolutions of 2D and ID MOC problems by taking advantage of prismatic geometries without introducing approximations of a low order operator such as diffusion. However, before developing a solver with all industrial options at EDF, several points needed to be clarified. In this work, we first prove the convergence of this iterative process, under some assumptions. We then present our high-performance, parallel implementation of this algorithm in the MICADO solver. Benchmarking the solver against the Takeda case shows that the 2D-1D coupling algorithm does not seem to affect the spatial convergence order of the MOC solver. As for performance issues, our study shows that even though the data distribution is suited to the 2D solver part, the efficiency of the ID part is sufficient to ensure a good parallel efficiency of the global algorithm. After this study, the main remaining difficulty implementation-wise is about the memory requirement of a vector used for initialization. An efficient acceleration operator will also need to be developed. (authors)

  3. Iterative method for Amado's model

    International Nuclear Information System (INIS)

    Tomio, L.

    1980-01-01

    A recently proposed iterative method for solving scattering integral equations is applied to the spin doublet and spin quartet neutron-deuteron scattering in the Amado model. The method is tested numerically in the calculation of scattering lengths and phase-shifts and results are found better than those obtained by using the conventional Pade technique. (Author) [pt

  4. An iterative reconstruction method of complex images using expectation maximization for radial parallel MRI

    International Nuclear Information System (INIS)

    Choi, Joonsung; Kim, Dongchan; Oh, Changhyun; Han, Yeji; Park, HyunWook

    2013-01-01

    In MRI (magnetic resonance imaging), signal sampling along a radial k-space trajectory is preferred in certain applications due to its distinct advantages such as robustness to motion, and the radial sampling can be beneficial for reconstruction algorithms such as parallel MRI (pMRI) due to the incoherency. For radial MRI, the image is usually reconstructed from projection data using analytic methods such as filtered back-projection or Fourier reconstruction after gridding. However, the quality of the reconstructed image from these analytic methods can be degraded when the number of acquired projection views is insufficient. In this paper, we propose a novel reconstruction method based on the expectation maximization (EM) method, where the EM algorithm is remodeled for MRI so that complex images can be reconstructed. Then, to optimize the proposed method for radial pMRI, a reconstruction method that uses coil sensitivity information of multichannel RF coils is formulated. Experiment results from synthetic and in vivo data show that the proposed method introduces better reconstructed images than the analytic methods, even from highly subsampled data, and provides monotonic convergence properties compared to the conjugate gradient based reconstruction method. (paper)

  5. DGDFT: A massively parallel method for large scale density functional theory calculations.

    Science.gov (United States)

    Hu, Wei; Lin, Lin; Yang, Chao

    2015-09-28

    We describe a massively parallel implementation of the recently developed discontinuous Galerkin density functional theory (DGDFT) method, for efficient large-scale Kohn-Sham DFT based electronic structure calculations. The DGDFT method uses adaptive local basis (ALB) functions generated on-the-fly during the self-consistent field iteration to represent the solution to the Kohn-Sham equations. The use of the ALB set provides a systematic way to improve the accuracy of the approximation. By using the pole expansion and selected inversion technique to compute electron density, energy, and atomic forces, we can make the computational complexity of DGDFT scale at most quadratically with respect to the number of electrons for both insulating and metallic systems. We show that for the two-dimensional (2D) phosphorene systems studied here, using 37 basis functions per atom allows us to reach an accuracy level of 1.3 × 10(-4) Hartree/atom in terms of the error of energy and 6.2 × 10(-4) Hartree/bohr in terms of the error of atomic force, respectively. DGDFT can achieve 80% parallel efficiency on 128,000 high performance computing cores when it is used to study the electronic structure of 2D phosphorene systems with 3500-14 000 atoms. This high parallel efficiency results from a two-level parallelization scheme that we will describe in detail.

  6. DGDFT: A massively parallel method for large scale density functional theory calculations

    International Nuclear Information System (INIS)

    Hu, Wei; Yang, Chao; Lin, Lin

    2015-01-01

    We describe a massively parallel implementation of the recently developed discontinuous Galerkin density functional theory (DGDFT) method, for efficient large-scale Kohn-Sham DFT based electronic structure calculations. The DGDFT method uses adaptive local basis (ALB) functions generated on-the-fly during the self-consistent field iteration to represent the solution to the Kohn-Sham equations. The use of the ALB set provides a systematic way to improve the accuracy of the approximation. By using the pole expansion and selected inversion technique to compute electron density, energy, and atomic forces, we can make the computational complexity of DGDFT scale at most quadratically with respect to the number of electrons for both insulating and metallic systems. We show that for the two-dimensional (2D) phosphorene systems studied here, using 37 basis functions per atom allows us to reach an accuracy level of 1.3 × 10 −4 Hartree/atom in terms of the error of energy and 6.2 × 10 −4 Hartree/bohr in terms of the error of atomic force, respectively. DGDFT can achieve 80% parallel efficiency on 128,000 high performance computing cores when it is used to study the electronic structure of 2D phosphorene systems with 3500-14 000 atoms. This high parallel efficiency results from a two-level parallelization scheme that we will describe in detail

  7. DGDFT: A massively parallel method for large scale density functional theory calculations

    Energy Technology Data Exchange (ETDEWEB)

    Hu, Wei, E-mail: whu@lbl.gov; Yang, Chao, E-mail: cyang@lbl.gov [Computational Research Division, Lawrence Berkeley National Laboratory, Berkeley, California 94720 (United States); Lin, Lin, E-mail: linlin@math.berkeley.edu [Computational Research Division, Lawrence Berkeley National Laboratory, Berkeley, California 94720 (United States); Department of Mathematics, University of California, Berkeley, California 94720 (United States)

    2015-09-28

    We describe a massively parallel implementation of the recently developed discontinuous Galerkin density functional theory (DGDFT) method, for efficient large-scale Kohn-Sham DFT based electronic structure calculations. The DGDFT method uses adaptive local basis (ALB) functions generated on-the-fly during the self-consistent field iteration to represent the solution to the Kohn-Sham equations. The use of the ALB set provides a systematic way to improve the accuracy of the approximation. By using the pole expansion and selected inversion technique to compute electron density, energy, and atomic forces, we can make the computational complexity of DGDFT scale at most quadratically with respect to the number of electrons for both insulating and metallic systems. We show that for the two-dimensional (2D) phosphorene systems studied here, using 37 basis functions per atom allows us to reach an accuracy level of 1.3 × 10{sup −4} Hartree/atom in terms of the error of energy and 6.2 × 10{sup −4} Hartree/bohr in terms of the error of atomic force, respectively. DGDFT can achieve 80% parallel efficiency on 128,000 high performance computing cores when it is used to study the electronic structure of 2D phosphorene systems with 3500-14 000 atoms. This high parallel efficiency results from a two-level parallelization scheme that we will describe in detail.

  8. Plasma flow to a surface using the iterative Monte Carlo method

    International Nuclear Information System (INIS)

    Pitcher, C.S.

    1994-01-01

    The iterative Monte Carlo (IMC) method is applied to a number of one-dimensional plasma flow problems, which encompass a wide range of conditions typical of those present in the boundary of magnetic fusion devices. The kinetic IMC method of solving plasma flow to a surface consists of launching and following particles within a grid of 'bins' into which weights are left according to the time a particle spends within a bin. The density and potential distributions within the plasma are iterated until the final solution is arrived at. The IMC results are compared with analytical treatments of these problems and, in general, good agreement is obtained. (author)

  9. Test Time Reduction for BIST by Parallel Divide-and-Conquer Method

    Energy Technology Data Exchange (ETDEWEB)

    Choi, Byung Gu; Kim, Dong Wook [Kwangwoon University (Korea)

    2000-06-01

    BIST(Built-in Self Test) has been considered as the most promising DFT(design-for-test) scheme for the present and future test strategy. The most serious problem in applying BIST(Built-in Self Test) into a large circuit is the excessive increase in test time. This paper is focused on this problem. We proposed a new BIST construction scheme which uses a parallel divide-and-conquer method. The circuit division is performed with respect to some internal nodes called test points. The test points are selected by considering the nodal connectivity of the circuit rather than the testability of each node. The test patterns are generated by only one linear feedback shift register(LFSR) and they are shared by all the divided circuits. Thus, the test for each divided circuit is performed in parallel. Test responses are collected from the test point as well as the primary outputs. Even though the divide-and-conquer scheme is used and test patterns are generated in one LFSR, the proposed scheme does not lose its pseudo-exhaustive property. We proposed a selection procedure to find the test points and it was implemented with C/C{sup ++} language. Several example circuits were applied to this procedure and the results showed that test time was reduced upto 1/2{sup 1}51 but the increase in the hardware overhead or the delay increase was not much high. Because the proposed scheme showed a tendency that the increasing rates in hardware overhead and delay overhead were less than that in test time reduction as the size of circuit increases, it is expected to be used efficiently for large circuits as VLSI and ULSI. (author). 15 refs., 7 figs., 5 tabs.

  10. A Study on GPU-based Iterative ML-EM Reconstruction Algorithm for Emission Computed Tomographic Imaging Systems

    Energy Technology Data Exchange (ETDEWEB)

    Ha, Woo Seok; Kim, Soo Mee; Park, Min Jae; Lee, Dong Soo; Lee, Jae Sung [Seoul National University, Seoul (Korea, Republic of)

    2009-10-15

    The maximum likelihood-expectation maximization (ML-EM) is the statistical reconstruction algorithm derived from probabilistic model of the emission and detection processes. Although the ML-EM has many advantages in accuracy and utility, the use of the ML-EM is limited due to the computational burden of iterating processing on a CPU (central processing unit). In this study, we developed a parallel computing technique on GPU (graphic processing unit) for ML-EM algorithm. Using Geforce 9800 GTX+ graphic card and CUDA (compute unified device architecture) the projection and backprojection in ML-EM algorithm were parallelized by NVIDIA's technology. The time delay on computations for projection, errors between measured and estimated data and backprojection in an iteration were measured. Total time included the latency in data transmission between RAM and GPU memory. The total computation time of the CPU- and GPU-based ML-EM with 32 iterations were 3.83 and 0.26 sec, respectively. In this case, the computing speed was improved about 15 times on GPU. When the number of iterations increased into 1024, the CPU- and GPU-based computing took totally 18 min and 8 sec, respectively. The improvement was about 135 times and was caused by delay on CPU-based computing after certain iterations. On the other hand, the GPU-based computation provided very small variation on time delay per iteration due to use of shared memory. The GPU-based parallel computation for ML-EM improved significantly the computing speed and stability. The developed GPU-based ML-EM algorithm could be easily modified for some other imaging geometries

  11. A Study on GPU-based Iterative ML-EM Reconstruction Algorithm for Emission Computed Tomographic Imaging Systems

    International Nuclear Information System (INIS)

    Ha, Woo Seok; Kim, Soo Mee; Park, Min Jae; Lee, Dong Soo; Lee, Jae Sung

    2009-01-01

    The maximum likelihood-expectation maximization (ML-EM) is the statistical reconstruction algorithm derived from probabilistic model of the emission and detection processes. Although the ML-EM has many advantages in accuracy and utility, the use of the ML-EM is limited due to the computational burden of iterating processing on a CPU (central processing unit). In this study, we developed a parallel computing technique on GPU (graphic processing unit) for ML-EM algorithm. Using Geforce 9800 GTX+ graphic card and CUDA (compute unified device architecture) the projection and backprojection in ML-EM algorithm were parallelized by NVIDIA's technology. The time delay on computations for projection, errors between measured and estimated data and backprojection in an iteration were measured. Total time included the latency in data transmission between RAM and GPU memory. The total computation time of the CPU- and GPU-based ML-EM with 32 iterations were 3.83 and 0.26 sec, respectively. In this case, the computing speed was improved about 15 times on GPU. When the number of iterations increased into 1024, the CPU- and GPU-based computing took totally 18 min and 8 sec, respectively. The improvement was about 135 times and was caused by delay on CPU-based computing after certain iterations. On the other hand, the GPU-based computation provided very small variation on time delay per iteration due to use of shared memory. The GPU-based parallel computation for ML-EM improved significantly the computing speed and stability. The developed GPU-based ML-EM algorithm could be easily modified for some other imaging geometries

  12. Automatic Loop Parallelization via Compiler Guided Refactoring

    DEFF Research Database (Denmark)

    Larsen, Per; Ladelsky, Razya; Lidman, Jacob

    For many parallel applications, performance relies not on instruction-level parallelism, but on loop-level parallelism. Unfortunately, many modern applications are written in ways that obstruct automatic loop parallelization. Since we cannot identify sufficient parallelization opportunities...... for these codes in a static, off-line compiler, we developed an interactive compilation feedback system that guides the programmer in iteratively modifying application source, thereby improving the compiler’s ability to generate loop-parallel code. We use this compilation system to modify two sequential...... benchmarks, finding that the code parallelized in this way runs up to 8.3 times faster on an octo-core Intel Xeon 5570 system and up to 12.5 times faster on a quad-core IBM POWER6 system. Benchmark performance varies significantly between the systems. This suggests that semi-automatic parallelization should...

  13. Variation in efficiency of parallel algorithms. [for study of stiffness matrices in planar trusses

    Science.gov (United States)

    Hayashi, A.; Melosh, R. J.; Utku, S.; Salama, M.

    1985-01-01

    The present study has the objective to investigate some iterative parallel-processor linear equation solving algorithms with respect to efficiency for analyses of typical linear engineering systems. Attention is given to a set of n linear equations, Ku = p, where K = an n x n positive definite, sparsely populated, symmetric matrix, u = an n x 1 vector of unknown responses, and p = an n x 1 vector of prescribed constants. This study is concerned with a hybrid method in which iteration is used to solve the problem, while a direct method is used on the local processor level. Variations in the efficiency of parallel algorithms are explored. Measures of the efficiency are based on computer experiments regarding the algorithms. For all the algorithms, the wall clock time is found to decrease as the number of processors increases.

  14. Multistage parallel-serial time averaging filters

    International Nuclear Information System (INIS)

    Theodosiou, G.E.

    1980-01-01

    Here, a new time averaging circuit design, the 'parallel filter' is presented, which can reduce the time jitter, introduced in time measurements using counters of large dimensions. This parallel filter could be considered as a single stage unit circuit which can be repeated an arbitrary number of times in series, thus providing a parallel-serial filter type as a result. The main advantages of such a filter over a serial one are much less electronic gate jitter and time delay for the same amount of total time uncertainty reduction. (orig.)

  15. Adaptive control in multi-threaded iterated integration

    International Nuclear Information System (INIS)

    Doncker, Elise de; Yuasa, Fukuko

    2013-01-01

    In recent years we have developed a technique for the direct computation of Feynman loop-integrals, which are notorious for the occurrence of integrand singularities. Especially for handling singularities in the interior of the domain, we approximate the iterated integral using an adaptive algorithm in the coordinate directions. We present a novel multi-core parallelization scheme for adaptive multivariate integration, by assigning threads to the rule evaluations in the outer dimensions of the iterated integral. The method ensures a large parallel granularity as each function evaluation by itself comprises an integral over the lower dimensions, while the application of the threads is governed by the adaptive control in the outer level. We give computational results for a test set of 3- to 6-dimensional integrals, where several problems exhibit a loop integral behavior.

  16. Parallel-In-Time For Moving Meshes

    Energy Technology Data Exchange (ETDEWEB)

    Falgout, R. D. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Manteuffel, T. A. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Southworth, B. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Schroder, J. B. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)

    2016-02-04

    With steadily growing computational resources available, scientists must develop e ective ways to utilize the increased resources. High performance, highly parallel software has be- come a standard. However until recent years parallelism has focused primarily on the spatial domain. When solving a space-time partial di erential equation (PDE), this leads to a sequential bottleneck in the temporal dimension, particularly when taking a large number of time steps. The XBraid parallel-in-time library was developed as a practical way to add temporal parallelism to existing se- quential codes with only minor modi cations. In this work, a rezoning-type moving mesh is applied to a di usion problem and formulated in a parallel-in-time framework. Tests and scaling studies are run using XBraid and demonstrate excellent results for the simple model problem considered herein.

  17. Development of a parallelization strategy for the VARIANT code

    International Nuclear Information System (INIS)

    Hanebutte, U.R.; Khalil, H.S.; Palmiotti, G.; Tatsumi, M.

    1996-01-01

    The VARIANT code solves the multigroup steady-state neutron diffusion and transport equation in three-dimensional Cartesian and hexagonal geometries using the variational nodal method. VARIANT consists of four major parts that must be executed sequentially: input handling, calculation of response matrices, solution algorithm (i.e. inner-outer iteration), and output of results. The objective of the parallelization effort was to reduce the overall computing time by distributing the work of the two computationally intensive (sequential) tasks, the coupling coefficient calculation and the iterative solver, equally among a group of processors. This report describes the code's calculations and gives performance results on one of the benchmark problems used to test the code. The performance analysis in the IBM SPx system shows good efficiency for well-load-balanced programs. Even for relatively small problem sizes, respectable efficiencies are seen for the SPx. An extension to achieve a higher degree of parallelism will be addressed in future work. 7 refs., 1 tab

  18. The Semianalytical Solutions for Stiff Systems of Ordinary Differential Equations by Using Variational Iteration Method and Modified Variational Iteration Method with Comparison to Exact Solutions

    Directory of Open Access Journals (Sweden)

    Mehmet Tarik Atay

    2013-01-01

    Full Text Available The Variational Iteration Method (VIM and Modified Variational Iteration Method (MVIM are used to find solutions of systems of stiff ordinary differential equations for both linear and nonlinear problems. Some examples are given to illustrate the accuracy and effectiveness of these methods. We compare our results with exact results. In some studies related to stiff ordinary differential equations, problems were solved by Adomian Decomposition Method and VIM and Homotopy Perturbation Method. Comparisons with exact solutions reveal that the Variational Iteration Method (VIM and the Modified Variational Iteration Method (MVIM are easier to implement. In fact, these methods are promising methods for various systems of linear and nonlinear stiff ordinary differential equations. Furthermore, VIM, or in some cases MVIM, is giving exact solutions in linear cases and very satisfactory solutions when compared to exact solutions for nonlinear cases depending on the stiffness ratio of the stiff system to be solved.

  19. Robust iterative learning control for multi-phase batch processes: an average dwell-time method with 2D convergence indexes

    Science.gov (United States)

    Wang, Limin; Shen, Yiteng; Yu, Jingxian; Li, Ping; Zhang, Ridong; Gao, Furong

    2018-01-01

    In order to cope with system disturbances in multi-phase batch processes with different dimensions, a hybrid robust control scheme of iterative learning control combined with feedback control is proposed in this paper. First, with a hybrid iterative learning control law designed by introducing the state error, the tracking error and the extended information, the multi-phase batch process is converted into a two-dimensional Fornasini-Marchesini (2D-FM) switched system with different dimensions. Second, a switching signal is designed using the average dwell-time method integrated with the related switching conditions to give sufficient conditions ensuring stable running for the system. Finally, the minimum running time of the subsystems and the control law gains are calculated by solving the linear matrix inequalities. Meanwhile, a compound 2D controller with robust performance is obtained, which includes a robust extended feedback control for ensuring the steady-state tracking error to converge rapidly. The application on an injection molding process displays the effectiveness and superiority of the proposed strategy.

  20. Nonlinear and parallel algorithms for finite element discretizations of the incompressible Navier-Stokes equations

    Science.gov (United States)

    Arteaga, Santiago Egido

    1998-12-01

    The steady-state Navier-Stokes equations are of considerable interest because they are used to model numerous common physical phenomena. The applications encountered in practice often involve small viscosities and complicated domain geometries, and they result in challenging problems in spite of the vast attention that has been dedicated to them. In this thesis we examine methods for computing the numerical solution of the primitive variable formulation of the incompressible equations on distributed memory parallel computers. We use the Galerkin method to discretize the differential equations, although most results are stated so that they apply also to stabilized methods. We also reformulate some classical results in a single framework and discuss some issues frequently dismissed in the literature, such as the implementation of pressure space basis and non- homogeneous boundary values. We consider three nonlinear methods: Newton's method, Oseen's (or Picard) iteration, and sequences of Stokes problems. All these iterative nonlinear methods require solving a linear system at every step. Newton's method has quadratic convergence while that of the others is only linear; however, we obtain theoretical bounds showing that Oseen's iteration is more robust, and we confirm it experimentally. In addition, although Oseen's iteration usually requires more iterations than Newton's method, the linear systems it generates tend to be simpler and its overall costs (in CPU time) are lower. The Stokes problems result in linear systems which are easier to solve, but its convergence is much slower, so that it is competitive only for large viscosities. Inexact versions of these methods are studied, and we explain why the best timings are obtained using relatively modest error tolerances in solving the corresponding linear systems. We also present a new damping optimization strategy based on the quadratic nature of the Navier-Stokes equations, which improves the robustness of all the

  1. Parallelization of the FLAPW method

    International Nuclear Information System (INIS)

    Canning, A.; Mannstadt, W.; Freeman, A.J.

    1999-01-01

    The FLAPW (full-potential linearized-augmented plane-wave) method is one of the most accurate first-principles methods for determining electronic and magnetic properties of crystals and surfaces. Until the present work, the FLAPW method has been limited to systems of less than about one hundred atoms due to a lack of an efficient parallel implementation to exploit the power and memory of parallel computers. In this work we present an efficient parallelization of the method by division among the processors of the plane-wave components for each state. The code is also optimized for RISC (reduced instruction set computer) architectures, such as those found on most parallel computers, making full use of BLAS (basic linear algebra subprograms) wherever possible. Scaling results are presented for systems of up to 686 silicon atoms and 343 palladium atoms per unit cell, running on up to 512 processors on a CRAY T3E parallel computer

  2. Parallelization of the FLAPW method

    Science.gov (United States)

    Canning, A.; Mannstadt, W.; Freeman, A. J.

    2000-08-01

    The FLAPW (full-potential linearized-augmented plane-wave) method is one of the most accurate first-principles methods for determining structural, electronic and magnetic properties of crystals and surfaces. Until the present work, the FLAPW method has been limited to systems of less than about a hundred atoms due to the lack of an efficient parallel implementation to exploit the power and memory of parallel computers. In this work, we present an efficient parallelization of the method by division among the processors of the plane-wave components for each state. The code is also optimized for RISC (reduced instruction set computer) architectures, such as those found on most parallel computers, making full use of BLAS (basic linear algebra subprograms) wherever possible. Scaling results are presented for systems of up to 686 silicon atoms and 343 palladium atoms per unit cell, running on up to 512 processors on a CRAY T3E parallel supercomputer.

  3. Stopping test of iterative methods for solving PDE

    International Nuclear Information System (INIS)

    Wang Bangrong

    1991-01-01

    In order to assure the accuracy of the numerical solution of the iterative method for solving PDE (partial differential equation), the stopping test is very important. If the coefficient matrix of the system of linear algebraic equations is strictly diagonal dominant or irreducible weakly diagonal dominant, the stopping test formulas of the iterative method for solving PDE is proposed. Several numerical examples are given to illustrate the applications of the stopping test formulas

  4. Parallel Newton-Krylov-Schwarz algorithms for the transonic full potential equation

    Science.gov (United States)

    Cai, Xiao-Chuan; Gropp, William D.; Keyes, David E.; Melvin, Robin G.; Young, David P.

    1996-01-01

    We study parallel two-level overlapping Schwarz algorithms for solving nonlinear finite element problems, in particular, for the full potential equation of aerodynamics discretized in two dimensions with bilinear elements. The overall algorithm, Newton-Krylov-Schwarz (NKS), employs an inexact finite-difference Newton method and a Krylov space iterative method, with a two-level overlapping Schwarz method as a preconditioner. We demonstrate that NKS, combined with a density upwinding continuation strategy for problems with weak shocks, is robust and, economical for this class of mixed elliptic-hyperbolic nonlinear partial differential equations, with proper specification of several parameters. We study upwinding parameters, inner convergence tolerance, coarse grid density, subdomain overlap, and the level of fill-in in the incomplete factorization, and report their effect on numerical convergence rate, overall execution time, and parallel efficiency on a distributed-memory parallel computer.

  5. A Block-Asynchronous Relaxation Method for Graphics Processing Units

    OpenAIRE

    Anzt, H.; Dongarra, J.; Heuveline, Vincent; Tomov, S.

    2011-01-01

    In this paper, we analyze the potential of asynchronous relaxation methods on Graphics Processing Units (GPUs). For this purpose, we developed a set of asynchronous iteration algorithms in CUDA and compared them with a parallel implementation of synchronous relaxation methods on CPU-based systems. For a set of test matrices taken from the University of Florida Matrix Collection we monitor the convergence behavior, the average iteration time and the total time-to-solution time. Analyzing the r...

  6. An iterative method for unfolding time-resolved soft x-ray spectra of laser plasmas

    International Nuclear Information System (INIS)

    Tang Yongjian; Shen Kexi; Xu Hepin

    1991-01-01

    Dante-recorded temporal waveforms have been unfolded by using Fast Fourier transformation (FFT) and the inverted convolution theorem of Fourier analysis. The conversion of the signals to time-dependent soft x-ray spectra is accomplished on the IBM-PC/XT-286 microcomputer system with the code DTSP including SAND II reported by W.N.Mcelory et al.. An amplitude-limited iterative and periodic smoothing technique has been developed in the code DTSP. Time-resolved soft x-ray spectra with sixteen time-cell, and time-dependent radiation, [T R (t)], have been obtained for hohlraum targets irradiated with laser beams (λ = 1.06 μm) on LF-12 in 1989

  7. Convergence of Inner-Iteration GMRES Methods for Rank-Deficient Least Squares Problems

    Czech Academy of Sciences Publication Activity Database

    Morikuni, Keiichi; Hayami, K.

    2015-01-01

    Roč. 36, č. 1 (2015), s. 225-250 ISSN 0895-4798 Institutional support: RVO:67985807 Keywords : least squares problem * iterative methods * preconditioner * inner-outer iteration * GMRES method * stationary iterative method * rank-deficient problem Subject RIV: BA - General Mathematics Impact factor: 1.883, year: 2015

  8. Iterative methods for photoacoustic tomography in attenuating acoustic media

    Science.gov (United States)

    Haltmeier, Markus; Kowar, Richard; Nguyen, Linh V.

    2017-11-01

    The development of efficient and accurate reconstruction methods is an important aspect of tomographic imaging. In this article, we address this issue for photoacoustic tomography. To this aim, we use models for acoustic wave propagation accounting for frequency dependent attenuation according to a wide class of attenuation laws that may include memory. We formulate the inverse problem of photoacoustic tomography in attenuating medium as an ill-posed operator equation in a Hilbert space framework that is tackled by iterative regularization methods. Our approach comes with a clear convergence analysis. For that purpose we derive explicit expressions for the adjoint problem that can efficiently be implemented. In contrast to time reversal, the employed adjoint wave equation is again damping and, thus has a stable solution. This stability property can be clearly seen in our numerical results. Moreover, the presented numerical results clearly demonstrate the efficiency and accuracy of the derived iterative reconstruction algorithms in various situations including the limited view case.

  9. An efficient iterative method for the generalized Stokes problem

    Energy Technology Data Exchange (ETDEWEB)

    Sameh, A. [Univ. of Minnesota, Twin Cities, MN (United States); Sarin, V. [Univ. of Illinois, Urbana, IL (United States)

    1996-12-31

    This paper presents an efficient iterative scheme for the generalized Stokes problem, which arises frequently in the simulation of time-dependent Navier-Stokes equations for incompressible fluid flow. The general form of the linear system is where A = {alpha}M + vT is an n x n symmetric positive definite matrix, in which M is the mass matrix, T is the discrete Laplace operator, {alpha} and {nu} are positive constants proportional to the inverses of the time-step {Delta}t and the Reynolds number Re respectively, and B is the discrete gradient operator of size n x k (k < n). Even though the matrix A is symmetric and positive definite, the system is indefinite due to the incompressibility constraint (B{sup T}u = 0). This causes difficulties both for iterative methods and commonly used preconditioners. Moreover, depending on the ratio {alpha}/{nu}, A behaves like the mass matrix M at one extreme and the Laplace operator T at the other, thus complicating the issue of preconditioning.

  10. SQDFT: Spectral Quadrature method for large-scale parallel O(N) Kohn-Sham calculations at high temperature

    Science.gov (United States)

    Suryanarayana, Phanish; Pratapa, Phanisri P.; Sharma, Abhiraj; Pask, John E.

    2018-03-01

    We present SQDFT: a large-scale parallel implementation of the Spectral Quadrature (SQ) method for O(N) Kohn-Sham Density Functional Theory (DFT) calculations at high temperature. Specifically, we develop an efficient and scalable finite-difference implementation of the infinite-cell Clenshaw-Curtis SQ approach, in which results for the infinite crystal are obtained by expressing quantities of interest as bilinear forms or sums of bilinear forms, that are then approximated by spatially localized Clenshaw-Curtis quadrature rules. We demonstrate the accuracy of SQDFT by showing systematic convergence of energies and atomic forces with respect to SQ parameters to reference diagonalization results, and convergence with discretization to established planewave results, for both metallic and insulating systems. We further demonstrate that SQDFT achieves excellent strong and weak parallel scaling on computer systems consisting of tens of thousands of processors, with near perfect O(N) scaling with system size and wall times as low as a few seconds per self-consistent field iteration. Finally, we verify the accuracy of SQDFT in large-scale quantum molecular dynamics simulations of aluminum at high temperature.

  11. A Comparison of Sequential and GPU Implementations of Iterative Methods to Compute Reachability Probabilities

    Directory of Open Access Journals (Sweden)

    Elise Cormie-Bowins

    2012-10-01

    Full Text Available We consider the problem of computing reachability probabilities: given a Markov chain, an initial state of the Markov chain, and a set of goal states of the Markov chain, what is the probability of reaching any of the goal states from the initial state? This problem can be reduced to solving a linear equation Ax = b for x, where A is a matrix and b is a vector. We consider two iterative methods to solve the linear equation: the Jacobi method and the biconjugate gradient stabilized (BiCGStab method. For both methods, a sequential and a parallel version have been implemented. The parallel versions have been implemented on the compute unified device architecture (CUDA so that they can be run on a NVIDIA graphics processing unit (GPU. From our experiments we conclude that as the size of the matrix increases, the CUDA implementations outperform the sequential implementations. Furthermore, the BiCGStab method performs better than the Jacobi method for dense matrices, whereas the Jacobi method does better for sparse ones. Since the reachability probabilities problem plays a key role in probabilistic model checking, we also compared the implementations for matrices obtained from a probabilistic model checker. Our experiments support the conjecture by Bosnacki et al. that the Jacobi method is superior to Krylov subspace methods, a class to which the BiCGStab method belongs, for probabilistic model checking.

  12. Properties of a class of block-iterative methods

    International Nuclear Information System (INIS)

    Elfving, Tommy; Nikazad, Touraj

    2009-01-01

    We study a class of block-iterative (BI) methods proposed in image reconstruction for solving linear systems. A subclass, symmetric block-iteration (SBI), is derived such that for this subclass both semi-convergence analysis and stopping-rules developed for fully simultaneous iteration apply. Also results on asymptotic convergence are given, e.g., BI exhibit cyclic convergence irrespective of the consistency of the linear system. Further it is shown that the limit points of SBI satisfy a weighted least-squares problem. We also present numerical results obtained using a trained stopping rule on SBI

  13. A parallel algorithm for solving the multidimensional within-group discrete ordinates equations with spatial domain decomposition - 104

    International Nuclear Information System (INIS)

    Zerr, R.J.; Azmy, Y.Y.

    2010-01-01

    A spatial domain decomposition with a parallel block Jacobi solution algorithm has been developed based on the integral transport matrix formulation of the discrete ordinates approximation for solving the within-group transport equation. The new methodology abandons the typical source iteration scheme and solves directly for the fully converged scalar flux. Four matrix operators are constructed based upon the integral form of the discrete ordinates equations. A single differential mesh sweep is performed to construct these operators. The method is parallelized by decomposing the problem domain into several smaller sub-domains, each treated as an independent problem. The scalar flux of each sub-domain is solved exactly given incoming angular flux boundary conditions. Sub-domain boundary conditions are updated iteratively, and convergence is achieved when the scalar flux error in all cells meets a pre-specified convergence criterion. The method has been implemented in a computer code that was then employed for strong scaling studies of the algorithm's parallel performance via a fixed-size problem in tests ranging from one domain up to one cell per sub-domain. Results indicate that the best parallel performance compared to source iterations occurs for optically thick, highly scattering problems, the variety that is most difficult for the traditional SI scheme to solve. Moreover, the minimum execution time occurs when each sub-domain contains a total of four cells. (authors)

  14. A Framework for Generalising the Newton Method and Other Iterative Methods from Euclidean Space to Manifolds

    OpenAIRE

    Manton, Jonathan H.

    2012-01-01

    The Newton iteration is a popular method for minimising a cost function on Euclidean space. Various generalisations to cost functions defined on manifolds appear in the literature. In each case, the convergence rate of the generalised Newton iteration needed establishing from first principles. The present paper presents a framework for generalising iterative methods from Euclidean space to manifolds that ensures local convergence rates are preserved. It applies to any (memoryless) iterative m...

  15. Parallel Motion Simulation of Large-Scale Real-Time Crowd in a Hierarchical Environmental Model

    Directory of Open Access Journals (Sweden)

    Xin Wang

    2012-01-01

    Full Text Available This paper presents a parallel real-time crowd simulation method based on a hierarchical environmental model. A dynamical model of the complex environment should be constructed to simulate the state transition and propagation of individual motions. By modeling of a virtual environment where virtual crowds reside, we employ different parallel methods on a topological layer, a path layer and a perceptual layer. We propose a parallel motion path matching method based on the path layer and a parallel crowd simulation method based on the perceptual layer. The large-scale real-time crowd simulation becomes possible with these methods. Numerical experiments are carried out to demonstrate the methods and results.

  16. A Novel Parallel Algorithm for Edit Distance Computation

    Directory of Open Access Journals (Sweden)

    Muhammad Murtaza Yousaf

    2018-01-01

    Full Text Available The edit distance between two sequences is the minimum number of weighted transformation-operations that are required to transform one string into the other. The weighted transformation-operations are insert, remove, and substitute. Dynamic programming solution to find edit distance exists but it becomes computationally intensive when the lengths of strings become very large. This work presents a novel parallel algorithm to solve edit distance problem of string matching. The algorithm is based on resolving dependencies in the dynamic programming solution of the problem and it is able to compute each row of edit distance table in parallel. In this way, it becomes possible to compute the complete table in min(m,n iterations for strings of size m and n whereas state-of-the-art parallel algorithm solves the problem in max(m,n iterations. The proposed algorithm also increases the amount of parallelism in each of its iteration. The algorithm is also capable of exploiting spatial locality while its implementation. Additionally, the algorithm works in a load balanced way that further improves its performance. The algorithm is implemented for multicore systems having shared memory. Implementation of the algorithm in OpenMP shows linear speedup and better execution time as compared to state-of-the-art parallel approach. Efficiency of the algorithm is also proven better in comparison to its competitor.

  17. Iterative reconstruction methods for Thermo-acoustic Tomography

    International Nuclear Information System (INIS)

    Marinesque, Sebastien

    2012-01-01

    We define, study and implement various iterative reconstruction methods for Thermo-acoustic Tomography (TAT): the Back and Forth Nudging (BFN), easy to implement and to use, a variational technique (VT) and the Back and Forth SEEK (BF-SEEK), more sophisticated, and a coupling method between Kalman filter (KF) and Time Reversal (TR). A unified formulation is explained for the sequential techniques aforementioned that defines a new class of inverse problem methods: the Back and Forth Filters (BFF). In addition to existence and uniqueness (particularly for backward solutions), we study many frameworks that ensure and characterize the convergence of the algorithms. Thus we give a general theoretical framework for which the BFN is a well-posed problem. Then, in application to TAT, existence and uniqueness of its solutions and geometrical convergence of the algorithm are proved, and an explicit convergence rate and a description of its numerical behaviour are given. Next, theoretical and numerical studies of more general and realistic framework are led, namely different objects, speeds (with or without trapping), various sensor configurations and samplings, attenuated equations or external sources. Then optimal control and best estimate tools are used to characterize the BFN convergence and converging feedbacks for BFF, under observability assumptions. Finally, we compare the most flexible and efficient current techniques (TR and an iterative variant) with our various BFF and the VT in several experiments. Thus, robust, with different possible complexities and flexible, the methods that we propose are very interesting reconstruction techniques, particularly in TAT and when observations are degraded. (author) [fr

  18. Adaptive dynamic programming for discrete-time linear quadratic regulation based on multirate generalised policy iteration

    Science.gov (United States)

    Chun, Tae Yoon; Lee, Jae Young; Park, Jin Bae; Choi, Yoon Ho

    2018-06-01

    In this paper, we propose two multirate generalised policy iteration (GPI) algorithms applied to discrete-time linear quadratic regulation problems. The proposed algorithms are extensions of the existing GPI algorithm that consists of the approximate policy evaluation and policy improvement steps. The two proposed schemes, named heuristic dynamic programming (HDP) and dual HDP (DHP), based on multirate GPI, use multi-step estimation (M-step Bellman equation) at the approximate policy evaluation step for estimating the value function and its gradient called costate, respectively. Then, we show that these two methods with the same update horizon can be considered equivalent in the iteration domain. Furthermore, monotonically increasing and decreasing convergences, so called value iteration (VI)-mode and policy iteration (PI)-mode convergences, are proved to hold for the proposed multirate GPIs. Further, general convergence properties in terms of eigenvalues are also studied. The data-driven online implementation methods for the proposed HDP and DHP are demonstrated and finally, we present the results of numerical simulations performed to verify the effectiveness of the proposed methods.

  19. Acceleration of the AFEN method by two-node nonlinear iteration

    Energy Technology Data Exchange (ETDEWEB)

    Moon, Kap Suk; Cho, Nam Zin; Noh, Jae Man; Hong, Ser Gi [Korea Advanced Institute of Science and Technology, Taejon (Korea, Republic of)

    1999-12-31

    A nonlinear iterative scheme developed to reduce the computing time of the AFEN method was tested and applied to two benchmark problems. The new nonlinear method for the AFEN method is based on solving two-node problems and use of two nonlinear correction factors at every interface instead of one factor in the conventional scheme. The use of two correction factors provides higher-order accurate interface fluxes as well as currents which are used as the boundary conditions of the two-node problem. The numerical results show that this new method gives exactly the same solution as that of the original AFEN method and the computing time is significantly reduced in comparison with the original AFEN method. 7 refs., 1 fig., 1 tab. (Author)

  20. Acceleration of the AFEN method by two-node nonlinear iteration

    Energy Technology Data Exchange (ETDEWEB)

    Moon, Kap Suk; Cho, Nam Zin; Noh, Jae Man; Hong, Ser Gi [Korea Advanced Institute of Science and Technology, Taejon (Korea, Republic of)

    1998-12-31

    A nonlinear iterative scheme developed to reduce the computing time of the AFEN method was tested and applied to two benchmark problems. The new nonlinear method for the AFEN method is based on solving two-node problems and use of two nonlinear correction factors at every interface instead of one factor in the conventional scheme. The use of two correction factors provides higher-order accurate interface fluxes as well as currents which are used as the boundary conditions of the two-node problem. The numerical results show that this new method gives exactly the same solution as that of the original AFEN method and the computing time is significantly reduced in comparison with the original AFEN method. 7 refs., 1 fig., 1 tab. (Author)

  1. Parallelization methods study of thermal-hydraulics codes

    International Nuclear Information System (INIS)

    Gaudart, Catherine

    2000-01-01

    The variety of parallelization methods and machines leads to a wide selection for programmers. In this study we suggest, in an industrial context, some solutions from the experience acquired through different parallelization methods. The study is about several scientific codes which simulate a large variety of thermal-hydraulics phenomena. A bibliography on parallelization methods and a first analysis of the codes showed the difficulty of our process on the whole applications to study. Therefore, it would be necessary to identify and extract a representative part of these applications and parallelization methods. The linear solver part of the codes forced itself. On this particular part several parallelization methods had been used. From these developments one could estimate the necessary work for a non initiate programmer to parallelize his application, and the impact of the development constraints. The different methods of parallelization tested are the numerical library PETSc, the parallelizer PAF, the language HPF, the formalism PEI and the communications library MPI and PYM. In order to test several methods on different applications and to follow the constraint of minimization of the modifications in codes, a tool called SPS (Server of Parallel Solvers) had be developed. We propose to describe the different constraints about the optimization of codes in an industrial context, to present the solutions given by the tool SPS, to show the development of the linear solver part with the tested parallelization methods and lastly to compare the results against the imposed criteria. (author) [fr

  2. A homotopy method for solving Riccati equations on a shared memory parallel computer

    International Nuclear Information System (INIS)

    Zigic, D.; Watson, L.T.; Collins, E.G. Jr.; Davis, L.D.

    1993-01-01

    Although there are numerous algorithms for solving Riccati equations, there still remains a need for algorithms which can operate efficiently on large problems and on parallel machines. This paper gives a new homotopy-based algorithm for solving Riccati equations on a shared memory parallel computer. The central part of the algorithm is the computation of the kernel of the Jacobian matrix, which is essential for the corrector iterations along the homotopy zero curve. Using a Schur decomposition the tensor product structure of various matrices can be efficiently exploited. The algorithm allows for efficient parallelization on shared memory machines

  3. Parallelization in Modern C++

    CERN Multimedia

    CERN. Geneva

    2016-01-01

    The traditionally used and well established parallel programming models OpenMP and MPI are both targeting lower level parallelism and are meant to be as language agnostic as possible. For a long time, those models were the only widely available portable options for developing parallel C++ applications beyond using plain threads. This has strongly limited the optimization capabilities of compilers, has inhibited extensibility and genericity, and has restricted the use of those models together with other, modern higher level abstractions introduced by the C++11 and C++14 standards. The recent revival of interest in the industry and wider community for the C++ language has also spurred a remarkable amount of standardization proposals and technical specifications being developed. Those efforts however have so far failed to build a vision on how to seamlessly integrate various types of parallelism, such as iterative parallel execution, task-based parallelism, asynchronous many-task execution flows, continuation s...

  4. A task parallel implementation of fast multipole methods

    KAUST Repository

    Taura, Kenjiro

    2012-11-01

    This paper describes a task parallel implementation of ExaFMM, an open source implementation of fast multipole methods (FMM), using a lightweight task parallel library MassiveThreads. Although there have been many attempts on parallelizing FMM, experiences have almost exclusively been limited to formulation based on flat homogeneous parallel loops. FMM in fact contains operations that cannot be readily expressed in such conventional but restrictive models. We show that task parallelism, or parallel recursions in particular, allows us to parallelize all operations of FMM naturally and scalably. Moreover it allows us to parallelize a \\'\\'mutual interaction\\'\\' for force/potential evaluation, which is roughly twice as efficient as a more conventional, unidirectional force/potential evaluation. The net result is an open source FMM that is clearly among the fastest single node implementations, including those on GPUs; with a million particles on a 32 cores Sandy Bridge 2.20GHz node, it completes a single time step including tree construction and force/potential evaluation in 65 milliseconds. The study clearly showcases both programmability and performance benefits of flexible parallel constructs over more monolithic parallel loops. © 2012 IEEE.

  5. An iterative method for near-field Fresnel region polychromatic phase contrast imaging

    Science.gov (United States)

    Carroll, Aidan J.; van Riessen, Grant A.; Balaur, Eugeniu; Dolbnya, Igor P.; Tran, Giang N.; Peele, Andrew G.

    2017-07-01

    We present an iterative method for polychromatic phase contrast imaging that is suitable for broadband illumination and which allows for the quantitative determination of the thickness of an object given the refractive index of the sample material. Experimental and simulation results suggest the iterative method provides comparable image quality and quantitative object thickness determination when compared to the analytical polychromatic transport of intensity and contrast transfer function methods. The ability of the iterative method to work over a wider range of experimental conditions means the iterative method is a suitable candidate for use with polychromatic illumination and may deliver more utility for laboratory-based x-ray sources, which typically have a broad spectrum.

  6. Reducing the effects of acoustic heterogeneity with an iterative reconstruction method from experimental data in microwave induced thermoacoustic tomography

    International Nuclear Information System (INIS)

    Wang, Jinguo; Zhao, Zhiqin; Song, Jian; Chen, Guoping; Nie, Zaiping; Liu, Qing-Huo

    2015-01-01

    Purpose: An iterative reconstruction method has been previously reported by the authors of this paper. However, the iterative reconstruction method was demonstrated by solely using the numerical simulations. It is essential to apply the iterative reconstruction method to practice conditions. The objective of this work is to validate the capability of the iterative reconstruction method for reducing the effects of acoustic heterogeneity with the experimental data in microwave induced thermoacoustic tomography. Methods: Most existing reconstruction methods need to combine the ultrasonic measurement technology to quantitatively measure the velocity distribution of heterogeneity, which increases the system complexity. Different to existing reconstruction methods, the iterative reconstruction method combines time reversal mirror technique, fast marching method, and simultaneous algebraic reconstruction technique to iteratively estimate the velocity distribution of heterogeneous tissue by solely using the measured data. Then, the estimated velocity distribution is used subsequently to reconstruct the highly accurate image of microwave absorption distribution. Experiments that a target placed in an acoustic heterogeneous environment are performed to validate the iterative reconstruction method. Results: By using the estimated velocity distribution, the target in an acoustic heterogeneous environment can be reconstructed with better shape and higher image contrast than targets that are reconstructed with a homogeneous velocity distribution. Conclusions: The distortions caused by the acoustic heterogeneity can be efficiently corrected by utilizing the velocity distribution estimated by the iterative reconstruction method. The advantage of the iterative reconstruction method over the existing correction methods is that it is successful in improving the quality of the image of microwave absorption distribution without increasing the system complexity

  7. Vector and parallel processors in computational science

    International Nuclear Information System (INIS)

    Duff, I.S.; Reid, J.K.

    1985-01-01

    This book presents the papers given at a conference which reviewed the new developments in parallel and vector processing. Topics considered at the conference included hardware (array processors, supercomputers), programming languages, software aids, numerical methods (e.g., Monte Carlo algorithms, iterative methods, finite elements, optimization), and applications (e.g., neutron transport theory, meteorology, image processing)

  8. Fast parallel algorithms for the x-ray transform and its adjoint.

    Science.gov (United States)

    Gao, Hao

    2012-11-01

    Iterative reconstruction methods often offer better imaging quality and allow for reconstructions with lower imaging dose than classical methods in computed tomography. However, the computational speed is a major concern for these iterative methods, for which the x-ray transform and its adjoint are two most time-consuming components. The speed issue becomes even notable for the 3D imaging such as cone beam scans or helical scans, since the x-ray transform and its adjoint are frequently computed as there is usually not enough computer memory to save the corresponding system matrix. The purpose of this paper is to optimize the algorithm for computing the x-ray transform and its adjoint, and their parallel computation. The fast and highly parallelizable algorithms for the x-ray transform and its adjoint are proposed for the infinitely narrow beam in both 2D and 3D. The extension of these fast algorithms to the finite-size beam is proposed in 2D and discussed in 3D. The CPU and GPU codes are available at https://sites.google.com/site/fastxraytransform. The proposed algorithm is faster than Siddon's algorithm for computing the x-ray transform. In particular, the improvement for the parallel computation can be an order of magnitude. The authors have proposed fast and highly parallelizable algorithms for the x-ray transform and its adjoint, which are extendable for the finite-size beam. The proposed algorithms are suitable for parallel computing in the sense that the computational cost per parallel thread is O(1).

  9. Constructing Frozen Jacobian Iterative Methods for Solving Systems of Nonlinear Equations, Associated with ODEs and PDEs Using the Homotopy Method

    Directory of Open Access Journals (Sweden)

    Uswah Qasim

    2016-03-01

    Full Text Available A homotopy method is presented for the construction of frozen Jacobian iterative methods. The frozen Jacobian iterative methods are attractive because the inversion of the Jacobian is performed in terms of LUfactorization only once, for a single instance of the iterative method. We embedded parameters in the iterative methods with the help of the homotopy method: the values of the parameters are determined in such a way that a better convergence rate is achieved. The proposed homotopy technique is general and has the ability to construct different families of iterative methods, for solving weakly nonlinear systems of equations. Further iterative methods are also proposed for solving general systems of nonlinear equations.

  10. Solving the Stokes problem on a massively parallel computer

    DEFF Research Database (Denmark)

    Axelsson, Owe; Barker, Vincent A.; Neytcheva, Maya

    2001-01-01

    boundary value problem for each velocity component, are solved by the conjugate gradient method with a preconditioning based on the algebraic multi‐level iteration (AMLI) technique. The velocity is found from the computed pressure. The method is optimal in the sense that the computational work...... is proportional to the number of unknowns. Further, it is designed to exploit a massively parallel computer with distributed memory architecture. Numerical experiments on a Cray T3E computer illustrate the parallel performance of the method....

  11. A parallel direct-forcing fictitious domain method for simulating microswimmers

    Science.gov (United States)

    Gao, Tong; Lin, Zhaowu

    2017-11-01

    We present a 3D parallel direct-forcing fictitious domain method for simulating swimming micro-organisms at small Reynolds numbers. We treat the motile micro-swimmers as spherical rigid particles using the ``Squirmer'' model. The particle dynamics are solved on the moving Larangian meshes that overlay upon a fixed Eulerian mesh for solving the fluid motion, and the momentum exchange between the two phases is resolved by distributing pseudo body-forces over the particle interior regions which constrain the background fictitious fluids to follow the particle movement. While the solid and fluid subproblems are solved separately, no inner-iterations are required to enforce numerical convergence. We demonstrate the accuracy and robustness of the method by comparing our results with the existing analytical and numerical studies for various cases of single particle dynamics and particle-particle interactions. We also perform a series of numerical explorations to obtain statistical and rheological measurements to characterize the dynamics and structures of Squirmer suspensions. NSF DMS 1619960.

  12. Iterative solution of high order compact systems

    Energy Technology Data Exchange (ETDEWEB)

    Spotz, W.F.; Carey, G.F. [Univ. of Texas, Austin, TX (United States)

    1996-12-31

    We have recently developed a class of finite difference methods which provide higher accuracy and greater stability than standard central or upwind difference methods, but still reside on a compact patch of grid cells. In the present study we investigate the performance of several gradient-type iterative methods for solving the associated sparse systems. Both serial and parallel performance studies have been made. Representative examples are taken from elliptic PDE`s for diffusion, convection-diffusion, and viscous flow applications.

  13. Multiscale optical simulation settings: challenging applications handled with an iterative ray-tracing FDTD interface method.

    Science.gov (United States)

    Leiner, Claude; Nemitz, Wolfgang; Schweitzer, Susanne; Kuna, Ladislav; Wenzl, Franz P; Hartmann, Paul; Satzinger, Valentin; Sommer, Christian

    2016-03-20

    We show that with an appropriate combination of two optical simulation techniques-classical ray-tracing and the finite difference time domain method-an optical device containing multiple diffractive and refractive optical elements can be accurately simulated in an iterative simulation approach. We compare the simulation results with experimental measurements of the device to discuss the applicability and accuracy of our iterative simulation procedure.

  14. An Efficient Algorithm for Perturbed Orbit Integration Combining Analytical Continuation and Modified Chebyshev Picard Iteration

    Science.gov (United States)

    Elgohary, T.; Kim, D.; Turner, J.; Junkins, J.

    2014-09-01

    Several methods exist for integrating the motion in high order gravity fields. Some recent methods use an approximate starting orbit, and an efficient method is needed for generating warm starts that account for specific low order gravity approximations. By introducing two scalar Lagrange-like invariants and employing Leibniz product rule, the perturbed motion is integrated by a novel recursive formulation. The Lagrange-like invariants allow exact arbitrary order time derivatives. Restricting attention to the perturbations due to the zonal harmonics J2 through J6, we illustrate an idea. The recursively generated vector-valued time derivatives for the trajectory are used to develop a continuation series-based solution for propagating position and velocity. Numerical comparisons indicate performance improvements of ~ 70X over existing explicit Runge-Kutta methods while maintaining mm accuracy for the orbit predictions. The Modified Chebyshev Picard Iteration (MCPI) is an iterative path approximation method to solve nonlinear ordinary differential equations. The MCPI utilizes Picard iteration with orthogonal Chebyshev polynomial basis functions to recursively update the states. The key advantages of the MCPI are as follows: 1) Large segments of a trajectory can be approximated by evaluating the forcing function at multiple nodes along the current approximation during each iteration. 2) It can readily handle general gravity perturbations as well as non-conservative forces. 3) Parallel applications are possible. The Picard sequence converges to the solution over large time intervals when the forces are continuous and differentiable. According to the accuracy of the starting solutions, however, the MCPI may require significant number of iterations and function evaluations compared to other integrators. In this work, we provide an efficient methodology to establish good starting solutions from the continuation series method; this warm start improves the performance of the

  15. COMPARISON OF HOLOGRAPHIC AND ITERATIVE METHODS FOR AMPLITUDE OBJECT RECONSTRUCTION

    Directory of Open Access Journals (Sweden)

    I. A. Shevkunov

    2015-01-01

    Full Text Available Experimental comparison of four methods for the wavefront reconstruction is presented. We considered two iterative and two holographic methods with different mathematical models and algorithms for recovery. The first two of these methods do not use a reference wave recording scheme that reduces requirements for stability of the installation. A major role in phase information reconstruction by such methods is played by a set of spatial intensity distributions, which are recorded as the recording matrix is being moved along the optical axis. The obtained data are used consistently for wavefront reconstruction using an iterative procedure. In the course of this procedure numerical distribution of the wavefront between the planes is performed. Thus, phase information of the wavefront is stored in every plane and calculated amplitude distributions are replaced for the measured ones in these planes. In the first of the compared methods, a two-dimensional Fresnel transform and iterative calculation in the object plane are used as a mathematical model. In the second approach, an angular spectrum method is used for numerical wavefront propagation, and the iterative calculation is carried out only between closely located planes of data registration. Two digital holography methods, based on the usage of the reference wave in the recording scheme and differing from each other by numerical reconstruction algorithm of digital holograms, are compared with the first two methods. The comparison proved that the iterative method based on 2D Fresnel transform gives results comparable with the result of common holographic method with the Fourier-filtering. It is shown that holographic method for reconstructing of the object complex amplitude in the process of the object amplitude reduction is the best among considered ones.

  16. A hyperpower iterative method for computing the generalized Drazin ...

    Indian Academy of Sciences (India)

    A quadratically convergent Newton-type iterative scheme is proposed for approximating the generalized Drazin inverse bd of the Banach algebra element b. Further, its extension into the form of the hyperpower iterative method of arbitrary order p ≤ 2 is presented. Convergence criteria along with the estimation of error ...

  17. A tool for simulating parallel branch-and-bound methods

    Science.gov (United States)

    Golubeva, Yana; Orlov, Yury; Posypkin, Mikhail

    2016-01-01

    The Branch-and-Bound method is known as one of the most powerful but very resource consuming global optimization methods. Parallel and distributed computing can efficiently cope with this issue. The major difficulty in parallel B&B method is the need for dynamic load redistribution. Therefore design and study of load balancing algorithms is a separate and very important research topic. This paper presents a tool for simulating parallel Branchand-Bound method. The simulator allows one to run load balancing algorithms with various numbers of processors, sizes of the search tree, the characteristics of the supercomputer's interconnect thereby fostering deep study of load distribution strategies. The process of resolution of the optimization problem by B&B method is replaced by a stochastic branching process. Data exchanges are modeled using the concept of logical time. The user friendly graphical interface to the simulator provides efficient visualization and convenient performance analysis.

  18. Numeric algorithms for parallel processors computer architectures with applications to the few-groups neutron diffusion equations

    International Nuclear Information System (INIS)

    Zee, S.K.

    1987-01-01

    A numeric algorithm and an associated computer code were developed for the rapid solution of the finite-difference method representation of the few-group neutron-diffusion equations on parallel computers. Applications of the numeric algorithm on both SIMD (vector pipeline) and MIMD/SIMD (multi-CUP/vector pipeline) architectures were explored. The algorithm was successfully implemented in the two-group, 3-D neutron diffusion computer code named DIFPAR3D (DIFfusion PARallel 3-Dimension). Numerical-solution techniques used in the code include the Chebyshev polynomial acceleration technique in conjunction with the power method of outer iteration. For inner iterations, a parallel form of red-black (cyclic) line SOR with automated determination of group dependent relaxation factors and iteration numbers required to achieve specified inner iteration error tolerance is incorporated. The code employs a macroscopic depletion model with trace capability for selected fission products' transients and critical boron. In addition to this, moderator and fuel temperature feedback models are also incorporated into the DIFPAR3D code, for realistic simulation of power reactor cores. The physics models used were proven acceptable in separate benchmarking studies

  19. Communications oriented programming of parallel iterative solutions of sparse linear systems

    Science.gov (United States)

    Patrick, M. L.; Pratt, T. W.

    1986-01-01

    Parallel algorithms are developed for a class of scientific computational problems by partitioning the problems into smaller problems which may be solved concurrently. The effectiveness of the resulting parallel solutions is determined by the amount and frequency of communication and synchronization and the extent to which communication can be overlapped with computation. Three different parallel algorithms for solving the same class of problems are presented, and their effectiveness is analyzed from this point of view. The algorithms are programmed using a new programming environment. Run-time statistics and experience obtained from the execution of these programs assist in measuring the effectiveness of these algorithms.

  20. Iterative and multigrid methods in the finite element solution of incompressible and turbulent fluid flow

    Science.gov (United States)

    Lavery, N.; Taylor, C.

    1999-07-01

    Multigrid and iterative methods are used to reduce the solution time of the matrix equations which arise from the finite element (FE) discretisation of the time-independent equations of motion of the incompressible fluid in turbulent motion. Incompressible flow is solved by using the method of reduce interpolation for the pressure to satisfy the Brezzi-Babuska condition. The k-l model is used to complete the turbulence closure problem. The non-symmetric iterative matrix methods examined are the methods of least squares conjugate gradient (LSCG), biconjugate gradient (BCG), conjugate gradient squared (CGS), and the biconjugate gradient squared stabilised (BCGSTAB). The multigrid algorithm applied is based on the FAS algorithm of Brandt, and uses two and three levels of grids with a V-cycling schedule. These methods are all compared to the non-symmetric frontal solver. Copyright

  1. A tool for simulating parallel branch-and-bound methods

    Directory of Open Access Journals (Sweden)

    Golubeva Yana

    2016-01-01

    Full Text Available The Branch-and-Bound method is known as one of the most powerful but very resource consuming global optimization methods. Parallel and distributed computing can efficiently cope with this issue. The major difficulty in parallel B&B method is the need for dynamic load redistribution. Therefore design and study of load balancing algorithms is a separate and very important research topic. This paper presents a tool for simulating parallel Branchand-Bound method. The simulator allows one to run load balancing algorithms with various numbers of processors, sizes of the search tree, the characteristics of the supercomputer’s interconnect thereby fostering deep study of load distribution strategies. The process of resolution of the optimization problem by B&B method is replaced by a stochastic branching process. Data exchanges are modeled using the concept of logical time. The user friendly graphical interface to the simulator provides efficient visualization and convenient performance analysis.

  2. Iterative Multiparameter Elastic Waveform Inversion Using Prestack Time Imaging and Kirchhoff approximation

    Science.gov (United States)

    Khaniani, Hassan

    This thesis proposes a "standard strategy" for iterative inversion of elastic properties from the seismic reflection data. The term "standard" refers to the current hands-on commercial techniques that are used for the seismic imaging and inverse problem. The method is established to reduce the computation time associated with elastic Full Waveform Inversion (FWI) methods. It makes use of AVO analysis, prestack time migration and corresponding forward modeling in an iterative scheme. The main objective is to describe the iterative inversion procedure used in seismic reflection data using simplified mathematical expression and their numerical applications. The frame work of the inversion is similar to (FWI) method but with less computational costs. The reduction of computational costs depends on the data conditioning (with or without multiple data), the level of the complexity of geological model and acquisition condition such as Signal to Noise Ratio (SNR). Many processing methods consider multiple events as noise and remove it from the data. This is the motivation for reducing the computational cost associated with Finite Difference Time Domain (FDTD) forward modeling and Reverse Time Migration (RTM)-based techniques. Therefore, a one-way solution of the wave equation for inversion is implemented. While less computationally intensive depth imaging methods are available by iterative coupling of ray theory and the Born approximation, it is shown that we can further reduce the cost of inversion by dropping the cost of ray tracing for traveltime estimation in a way similar to standard Prestack Time Migration (PSTM) and the corresponding forward modeling. This requires the model to have smooth lateral variations in elastic properties, so that the traveltime of the scatterpoints can be approximated by a Double Square Root (DSR) equation. To represent a more realistic and stable solution of the inverse problem, while considering the phase of supercritical angles, the

  3. Parallel SOR methods with a parabolic-diffusion acceleration technique for solving an unstructured-grid Poisson equation on 3D arbitrary geometries

    Science.gov (United States)

    Zapata, M. A. Uh; Van Bang, D. Pham; Nguyen, K. D.

    2016-05-01

    This paper presents a parallel algorithm for the finite-volume discretisation of the Poisson equation on three-dimensional arbitrary geometries. The proposed method is formulated by using a 2D horizontal block domain decomposition and interprocessor data communication techniques with message passing interface. The horizontal unstructured-grid cells are reordered according to the neighbouring relations and decomposed into blocks using a load-balanced distribution to give all processors an equal amount of elements. In this algorithm, two parallel successive over-relaxation methods are presented: a multi-colour ordering technique for unstructured grids based on distributed memory and a block method using reordering index following similar ideas of the partitioning for structured grids. In all cases, the parallel algorithms are implemented with a combination of an acceleration iterative solver. This solver is based on a parabolic-diffusion equation introduced to obtain faster solutions of the linear systems arising from the discretisation. Numerical results are given to evaluate the performances of the methods showing speedups better than linear.

  4. Natural Preconditioning and Iterative Methods for Saddle Point Systems

    KAUST Repository

    Pestana, Jennifer

    2015-01-01

    © 2015 Society for Industrial and Applied Mathematics. The solution of quadratic or locally quadratic extremum problems subject to linear(ized) constraints gives rise to linear systems in saddle point form. This is true whether in the continuous or the discrete setting, so saddle point systems arising from the discretization of partial differential equation problems, such as those describing electromagnetic problems or incompressible flow, lead to equations with this structure, as do, for example, interior point methods and the sequential quadratic programming approach to nonlinear optimization. This survey concerns iterative solution methods for these problems and, in particular, shows how the problem formulation leads to natural preconditioners which guarantee a fast rate of convergence of the relevant iterative methods. These preconditioners are related to the original extremum problem and their effectiveness - in terms of rapidity of convergence - is established here via a proof of general bounds on the eigenvalues of the preconditioned saddle point matrix on which iteration convergence depends.

  5. An iterative method for obtaining the optimum lightning location on a spherical surface

    Science.gov (United States)

    Chao, Gao; Qiming, MA

    1991-01-01

    A brief introduction to the basic principles of an eigen method used to obtain the optimum source location of lightning is presented. The location of the optimum source is obtained by using multiple direction finders (DF's) on a spherical surface. An improvement of this method, which takes the distance of source-DF's as a constant, is presented. It is pointed out that using a weight factor of signal strength is not the most ideal method because of the inexact inverse signal strength-distance relation and the inaccurate signal amplitude. An iterative calculation method is presented using the distance from the source to the DF as a weight factor. This improved method has higher accuracy and needs only a little more calculation time. Some computer simulations for a 4DF system are presented to show the improvement of location through use of the iterative method.

  6. Simulation of incompressible flows with heat and mass transfer using parallel finite element method

    Directory of Open Access Journals (Sweden)

    Jalal Abedi

    2003-02-01

    Full Text Available The stabilized finite element formulations based on the SUPG (Stream-line-Upwind/Petrov-Galerkin and PSPG (Pressure-Stabilization/Petrov-Galerkin methods are developed and applied to solve buoyancy-driven incompressible flows with heat and mass transfer. The SUPG stabilization term allows us to solve flow problems at high speeds (advection dominant flows and the PSPG term eliminates instabilities associated with the use of equal order interpolation functions for both pressure and velocity. The finite element formulations are implemented in parallel using MPI. In parallel computations, the finite element mesh is partitioned into contiguous subdomains using METIS, which are then assigned to individual processors. To ensure a balanced load, the number of elements assigned to each processor is approximately equal. To solve nonlinear systems in large-scale applications, we developed a matrix-free GMRES iterative solver. Here we totally eliminate a need to form any matrices, even at the element levels. To measure the accuracy of the method, we solve 2D and 3D example of natural convection flows at moderate to high Rayleigh numbers.

  7. Annual Copper Mountain Conferences on Multigrid and Iterative Methods, Copper Mountain, Colorado

    International Nuclear Information System (INIS)

    McCormick, Stephen F.

    2016-01-01

    This project supported the Copper Mountain Conference on Multigrid and Iterative Methods, held from 2007 to 2015, at Copper Mountain, Colorado. The subject of the Copper Mountain Conference Series alternated between Multigrid Methods in odd-numbered years and Iterative Methods in even-numbered years. Begun in 1983, the Series represents an important forum for the exchange of ideas in these two closely related fields. This report describes the Copper Mountain Conference on Multigrid and Iterative Methods, 2007-2015. Information on the conference series is available at http://grandmaster.colorado.edu/~copper/

  8. Annual Copper Mountain Conferences on Multigrid and Iterative Methods, Copper Mountain, Colorado

    Energy Technology Data Exchange (ETDEWEB)

    McCormick, Stephen F. [Front Range Scientific, Inc., Lake City, CO (United States)

    2016-03-25

    This project supported the Copper Mountain Conference on Multigrid and Iterative Methods, held from 2007 to 2015, at Copper Mountain, Colorado. The subject of the Copper Mountain Conference Series alternated between Multigrid Methods in odd-numbered years and Iterative Methods in even-numbered years. Begun in 1983, the Series represents an important forum for the exchange of ideas in these two closely related fields. This report describes the Copper Mountain Conference on Multigrid and Iterative Methods, 2007-2015. Information on the conference series is available at http://grandmaster.colorado.edu/~copper/.

  9. Fast parallel algorithm for CT image reconstruction.

    Science.gov (United States)

    Flores, Liubov A; Vidal, Vicent; Mayo, Patricia; Rodenas, Francisco; Verdú, Gumersindo

    2012-01-01

    In X-ray computed tomography (CT) the X rays are used to obtain the projection data needed to generate an image of the inside of an object. The image can be generated with different techniques. Iterative methods are more suitable for the reconstruction of images with high contrast and precision in noisy conditions and from a small number of projections. Their use may be important in portable scanners for their functionality in emergency situations. However, in practice, these methods are not widely used due to the high computational cost of their implementation. In this work we analyze iterative parallel image reconstruction with the Portable Extensive Toolkit for Scientific computation (PETSc).

  10. New Iterative Method for Fractional Gas Dynamics and Coupled Burger’s Equations

    Directory of Open Access Journals (Sweden)

    Mohamed S. Al-luhaibi

    2015-01-01

    Full Text Available This paper presents the approximate analytical solutions to solve the nonlinear gas dynamics and coupled Burger’s equations with fractional time derivative. By using initial values, the explicit solutions of the equations are solved by using a reliable algorithm. Numerical results show that the new iterative method is easy to implement and accurate when applied to time-fractional partial differential equations.

  11. On the adequacy of message-passing parallel supercomputers for solving neutron transport problems

    International Nuclear Information System (INIS)

    Azmy, Y.Y.

    1990-01-01

    A coarse-grained, static-scheduling parallelization of the standard iterative scheme used for solving the discrete-ordinates approximation of the neutron transport equation is described. The parallel algorithm is based on a decomposition of the angular domain along the discrete ordinates, thus naturally producing a set of completely uncoupled systems of equations in each iteration. Implementation of the parallel code on Intcl's iPSC/2 hypercube, and solutions to test problems are presented as evidence of the high speedup and efficiency of the parallel code. The performance of the parallel code on the iPSC/2 is analyzed, and a model for the CPU time as a function of the problem size (order of angular quadrature) and the number of participating processors is developed and validated against measured CPU times. The performance model is used to speculate on the potential of massively parallel computers for significantly speeding up real-life transport calculations at acceptable efficiencies. We conclude that parallel computers with a few hundred processors are capable of producing large speedups at very high efficiencies in very large three-dimensional problems. 10 refs., 8 figs

  12. Parallel multigrid smoothing: polynomial versus Gauss-Seidel

    International Nuclear Information System (INIS)

    Adams, Mark; Brezina, Marian; Hu, Jonathan; Tuminaro, Ray

    2003-01-01

    Gauss-Seidel is often the smoother of choice within multigrid applications. In the context of unstructured meshes, however, maintaining good parallel efficiency is difficult with multiplicative iterative methods such as Gauss-Seidel. This leads us to consider alternative smoothers. We discuss the computational advantages of polynomial smoothers within parallel multigrid algorithms for positive definite symmetric systems. Two particular polynomials are considered: Chebyshev and a multilevel specific polynomial. The advantages of polynomial smoothing over traditional smoothers such as Gauss-Seidel are illustrated on several applications: Poisson's equation, thin-body elasticity, and eddy current approximations to Maxwell's equations. While parallelizing the Gauss-Seidel method typically involves a compromise between a scalable convergence rate and maintaining high flop rates, polynomial smoothers achieve parallel scalable multigrid convergence rates without sacrificing flop rates. We show that, although parallel computers are the main motivation, polynomial smoothers are often surprisingly competitive with Gauss-Seidel smoothers on serial machines

  13. Parallel multigrid smoothing: polynomial versus Gauss-Seidel

    Science.gov (United States)

    Adams, Mark; Brezina, Marian; Hu, Jonathan; Tuminaro, Ray

    2003-07-01

    Gauss-Seidel is often the smoother of choice within multigrid applications. In the context of unstructured meshes, however, maintaining good parallel efficiency is difficult with multiplicative iterative methods such as Gauss-Seidel. This leads us to consider alternative smoothers. We discuss the computational advantages of polynomial smoothers within parallel multigrid algorithms for positive definite symmetric systems. Two particular polynomials are considered: Chebyshev and a multilevel specific polynomial. The advantages of polynomial smoothing over traditional smoothers such as Gauss-Seidel are illustrated on several applications: Poisson's equation, thin-body elasticity, and eddy current approximations to Maxwell's equations. While parallelizing the Gauss-Seidel method typically involves a compromise between a scalable convergence rate and maintaining high flop rates, polynomial smoothers achieve parallel scalable multigrid convergence rates without sacrificing flop rates. We show that, although parallel computers are the main motivation, polynomial smoothers are often surprisingly competitive with Gauss-Seidel smoothers on serial machines.

  14. An iterated Radau method for time-dependent PDE's

    NARCIS (Netherlands)

    S. Pérez-Rodríguez; S. González-Pinto; B.P. Sommeijer (Ben)

    2008-01-01

    htmlabstractThis paper is concerned with the time integration of semi-discretized, multi-dimensional PDEs of advection-diffusion-reaction type. To cope with the stiffness of these ODEs, an implicit method has been selected, viz., the two-stage, third-order Radau IIA method. The main topic of this

  15. Parallel preconditioning techniques for sparse CG solvers

    Energy Technology Data Exchange (ETDEWEB)

    Basermann, A.; Reichel, B.; Schelthoff, C. [Central Institute for Applied Mathematics, Juelich (Germany)

    1996-12-31

    Conjugate gradient (CG) methods to solve sparse systems of linear equations play an important role in numerical methods for solving discretized partial differential equations. The large size and the condition of many technical or physical applications in this area result in the need for efficient parallelization and preconditioning techniques of the CG method. In particular for very ill-conditioned matrices, sophisticated preconditioner are necessary to obtain both acceptable convergence and accuracy of CG. Here, we investigate variants of polynomial and incomplete Cholesky preconditioners that markedly reduce the iterations of the simply diagonally scaled CG and are shown to be well suited for massively parallel machines.

  16. Plane-wave electronic structure calculations on a parallel supercomputer

    International Nuclear Information System (INIS)

    Nelson, J.S.; Plimpton, S.J.; Sears, M.P.

    1993-01-01

    The development of iterative solutions of Schrodinger's equation in a plane-wave (pw) basis over the last several years has coincided with great advances in the computational power available for performing the calculations. These dual developments have enabled many new and interesting condensed matter phenomena to be studied from a first-principles approach. The authors present a detailed description of the implementation on a parallel supercomputer (hypercube) of the first-order equation-of-motion solution to Schrodinger's equation, using plane-wave basis functions and ab initio separable pseudopotentials. By distributing the plane-waves across the processors of the hypercube many of the computations can be performed in parallel, resulting in decreases in the overall computation time relative to conventional vector supercomputers. This partitioning also provides ample memory for large Fast Fourier Transform (FFT) meshes and the storage of plane-wave coefficients for many hundreds of energy bands. The usefulness of the parallel techniques is demonstrated by benchmark timings for both the FFT's and iterations of the self-consistent solution of Schrodinger's equation for different sized Si unit cells of up to 512 atoms

  17. Iterative Bayesian Estimation of Travel Times on Urban Arterials: Fusing Loop Detector and Probe Vehicle Data.

    Science.gov (United States)

    Liu, Kai; Cui, Meng-Ying; Cao, Peng; Wang, Jiang-Bo

    2016-01-01

    On urban arterials, travel time estimation is challenging especially from various data sources. Typically, fusing loop detector data and probe vehicle data to estimate travel time is a troublesome issue while considering the data issue of uncertain, imprecise and even conflicting. In this paper, we propose an improved data fusing methodology for link travel time estimation. Link travel times are simultaneously pre-estimated using loop detector data and probe vehicle data, based on which Bayesian fusion is then applied to fuse the estimated travel times. Next, Iterative Bayesian estimation is proposed to improve Bayesian fusion by incorporating two strategies: 1) substitution strategy which replaces the lower accurate travel time estimation from one sensor with the current fused travel time; and 2) specially-designed conditions for convergence which restrict the estimated travel time in a reasonable range. The estimation results show that, the proposed method outperforms probe vehicle data based method, loop detector based method and single Bayesian fusion, and the mean absolute percentage error is reduced to 4.8%. Additionally, iterative Bayesian estimation performs better for lighter traffic flows when the variability of travel time is practically higher than other periods.

  18. Massively Parallel and Scalable Implicit Time Integration Algorithms for Structural Dynamics

    Science.gov (United States)

    Farhat, Charbel

    1997-01-01

    Explicit codes are often used to simulate the nonlinear dynamics of large-scale structural systems, even for low frequency response, because the storage and CPU requirements entailed by the repeated factorizations traditionally found in implicit codes rapidly overwhelm the available computing resources. With the advent of parallel processing, this trend is accelerating because of the following additional facts: (a) explicit schemes are easier to parallelize than implicit ones, and (b) explicit schemes induce short range interprocessor communications that are relatively inexpensive, while the factorization methods used in most implicit schemes induce long range interprocessor communications that often ruin the sought-after speed-up. However, the time step restriction imposed by the Courant stability condition on all explicit schemes cannot yet be offset by the speed of the currently available parallel hardware. Therefore, it is essential to develop efficient alternatives to direct methods that are also amenable to massively parallel processing because implicit codes using unconditionally stable time-integration algorithms are computationally more efficient when simulating the low-frequency dynamics of aerospace structures.

  19. A Parallel Priority Queue with Constant Time Operations

    DEFF Research Database (Denmark)

    Brodal, Gerth Stølting; Träff, Jesper Larsson; Zaroliagis, Christos D.

    1998-01-01

    We present a parallel priority queue that supports the following operations in constant time:parallel insertionof a sequence of elements ordered according to key,parallel decrease keyfor a sequence of elements ordered according to key,deletion of the minimum key element, anddeletion of an arbitrary...... application is a parallel implementation of Dijkstra's algorithm for the single-source shortest path problem, which runs inO(n) time andO(mlogn) work on a CREW PRAM on graphs withnvertices andmedges. This is a logarithmic factor improvement in the running time compared with previous approaches....

  20. Iterative Dipole Moment Method for the Dielectrophoretic Particle-Particle Interaction in a DC Electric Field

    Directory of Open Access Journals (Sweden)

    Qing Zhang

    2018-01-01

    Full Text Available Electric force is the most popular technique for bioparticle transportation and manipulation in microfluidic systems. In this paper, the iterative dipole moment (IDM method was used to calculate the dielectrophoretic (DEP forces of particle-particle interactions in a two-dimensional DC electric field, and the Lagrangian method was used to solve the transportation of particles. It was found that the DEP properties and whether the connection line between initial positions of particles perpendicular or parallel to the electric field greatly affect the chain patterns. In addition, the dependence of the DEP particle interaction upon the particle diameters, initial particle positions, and the DEP properties have been studied in detail. The conclusions are advantageous in elelctrokinetic microfluidic systems where it may be desirable to control, manipulate, and assemble bioparticles.

  1. SPECT reconstruction of combined cone beam and parallel hole collimation with experimental data

    International Nuclear Information System (INIS)

    Li, Jianying; Jaszczak, R.J.; Turkington, T.G.; Greer, K.L.; Coleman, R.E.

    1993-01-01

    The authors have developed three methods to combine parallel and cone bean (P and CB) SPECT data using modified Maximum Likelihood-Expectation Maximization (ML-EM) algorithms. The first combination method applies both parallel and cone beam data sets to reconstruct a single intermediate image after each iteration using the ML-EM algorithm. The other two iterative methods combine the intermediate parallel beam (PB) and cone beam (CB) source estimates to enhance the uniformity of images. These two methods are ad hoc methods. In earlier studies using computer Monte Carlo simulation, they suggested that improved images might be obtained by reconstructing combined P and CB SPECT data. These combined collimation methods are qualitatively evaluated using experimental data. An attenuation compensation is performed by including the effects of attenuation in the transition matrix as a multiplicative factor. The combined P and CB images are compared with CB-only images and the result indicate that the combined P and CB approaches suppress artifacts caused by truncated projections and correct for the distortions of the CB-only images

  2. Milestones in the Development of Iterative Solution Methods

    Czech Academy of Sciences Publication Activity Database

    Axelsson, Owe

    2010-01-01

    Roč. 2010, - (2010), s. 1-33 ISSN 2090-0147 Institutional research plan: CEZ:AV0Z30860518 Keywords : iterative solution methods * convergence acceleration methods * linear systems Subject RIV: JC - Computer Hardware ; Software http://www.hindawi.com/journals/jece/2010/972794.html

  3. A cryogenic system design for the international thermonuclear experimental reactor (ITER)

    International Nuclear Information System (INIS)

    Slack, D.S.

    1991-01-01

    A conceptual design for ITER was completed last year. The author developed a suitable cryogenic system for ITER as part of this conceptual design effort. An overview of the design is reported. Emphasis is on the fact that cryogenics is a mature science, and a system supporting ITER needs can be made from time-proven components without loss of efficiency or reliability. Because of the large size of the ITER cryogenic system, large numbers of compressors and expanders must be used. Very high reliability is assured by arranging these components in parallel banks where servicing of individual components can be done without interruption of operations. This and other ideas based on the author's experience with Mirror Fusion Test Facility (MFTF) operations are described. 5 refs., 3 figs

  4. Iterative Regularization with Minimum-Residual Methods

    DEFF Research Database (Denmark)

    Jensen, Toke Koldborg; Hansen, Per Christian

    2007-01-01

    subspaces. We provide a combination of theory and numerical examples, and our analysis confirms the experience that MINRES and MR-II can work as general regularization methods. We also demonstrate theoretically and experimentally that the same is not true, in general, for GMRES and RRGMRES their success......We study the regularization properties of iterative minimum-residual methods applied to discrete ill-posed problems. In these methods, the projection onto the underlying Krylov subspace acts as a regularizer, and the emphasis of this work is on the role played by the basis vectors of these Krylov...... as regularization methods is highly problem dependent....

  5. Iterative regularization with minimum-residual methods

    DEFF Research Database (Denmark)

    Jensen, Toke Koldborg; Hansen, Per Christian

    2006-01-01

    subspaces. We provide a combination of theory and numerical examples, and our analysis confirms the experience that MINRES and MR-II can work as general regularization methods. We also demonstrate theoretically and experimentally that the same is not true, in general, for GMRES and RRGMRES - their success......We study the regularization properties of iterative minimum-residual methods applied to discrete ill-posed problems. In these methods, the projection onto the underlying Krylov subspace acts as a regularizer, and the emphasis of this work is on the role played by the basis vectors of these Krylov...... as regularization methods is highly problem dependent....

  6. Parallel algorithms for simulating continuous time Markov chains

    Science.gov (United States)

    Nicol, David M.; Heidelberger, Philip

    1992-01-01

    We have previously shown that the mathematical technique of uniformization can serve as the basis of synchronization for the parallel simulation of continuous-time Markov chains. This paper reviews the basic method and compares five different methods based on uniformization, evaluating their strengths and weaknesses as a function of problem characteristics. The methods vary in their use of optimism, logical aggregation, communication management, and adaptivity. Performance evaluation is conducted on the Intel Touchstone Delta multiprocessor, using up to 256 processors.

  7. Fast ℓ1-SPIRiT Compressed Sensing Parallel Imaging MRI: Scalable Parallel Implementation and Clinically Feasible Runtime

    Science.gov (United States)

    Murphy, Mark; Alley, Marcus; Demmel, James; Keutzer, Kurt; Vasanawala, Shreyas; Lustig, Michael

    2012-01-01

    We present ℓ1-SPIRiT, a simple algorithm for auto calibrating parallel imaging (acPI) and compressed sensing (CS) that permits an efficient implementation with clinically-feasible runtimes. We propose a CS objective function that minimizes cross-channel joint sparsity in the Wavelet domain. Our reconstruction minimizes this objective via iterative soft-thresholding, and integrates naturally with iterative Self-Consistent Parallel Imaging (SPIRiT). Like many iterative MRI reconstructions, ℓ1-SPIRiT’s image quality comes at a high computational cost. Excessively long runtimes are a barrier to the clinical use of any reconstruction approach, and thus we discuss our approach to efficiently parallelizing ℓ1-SPIRiT and to achieving clinically-feasible runtimes. We present parallelizations of ℓ1-SPIRiT for both multi-GPU systems and multi-core CPUs, and discuss the software optimization and parallelization decisions made in our implementation. The performance of these alternatives depends on the processor architecture, the size of the image matrix, and the number of parallel imaging channels. Fundamentally, achieving fast runtime requires the correct trade-off between cache usage and parallelization overheads. We demonstrate image quality via a case from our clinical experimentation, using a custom 3DFT Spoiled Gradient Echo (SPGR) sequence with up to 8× acceleration via poisson-disc undersampling in the two phase-encoded directions. PMID:22345529

  8. Parallel ray tracing for one-dimensional discrete ordinate computations

    International Nuclear Information System (INIS)

    Jarvis, R.D.; Nelson, P.

    1996-01-01

    The ray-tracing sweep in discrete-ordinates, spatially discrete numerical approximation methods applied to the linear, steady-state, plane-parallel, mono-energetic, azimuthally symmetric, neutral-particle transport equation can be reduced to a parallel prefix computation. In so doing, the often severe penalty in convergence rate of the source iteration, suffered by most current parallel algorithms using spatial domain decomposition, can be avoided while attaining parallelism in the spatial domain to whatever extent desired. In addition, the reduction implies parallel algorithm complexity limits for the ray-tracing sweep. The reduction applies to all closed, linear, one-cell functional (CLOF) spatial approximation methods, which encompasses most in current popular use. Scalability test results of an implementation of the algorithm on a 64-node nCube-2S hypercube-connected, message-passing, multi-computer are described. (author)

  9. NUFFT-Based Iterative Image Reconstruction via Alternating Direction Total Variation Minimization for Sparse-View CT

    Directory of Open Access Journals (Sweden)

    Bin Yan

    2015-01-01

    Full Text Available Sparse-view imaging is a promising scanning method which can reduce the radiation dose in X-ray computed tomography (CT. Reconstruction algorithm for sparse-view imaging system is of significant importance. The adoption of the spatial iterative algorithm for CT image reconstruction has a low operation efficiency and high computation requirement. A novel Fourier-based iterative reconstruction technique that utilizes nonuniform fast Fourier transform is presented in this study along with the advanced total variation (TV regularization for sparse-view CT. Combined with the alternating direction method, the proposed approach shows excellent efficiency and rapid convergence property. Numerical simulations and real data experiments are performed on a parallel beam CT. Experimental results validate that the proposed method has higher computational efficiency and better reconstruction quality than the conventional algorithms, such as simultaneous algebraic reconstruction technique using TV method and the alternating direction total variation minimization approach, with the same time duration. The proposed method appears to have extensive applications in X-ray CT imaging.

  10. Hybrid parallelization of the XTOR-2F code for the simulation of two-fluid MHD instabilities in tokamaks

    Science.gov (United States)

    Marx, Alain; Lütjens, Hinrich

    2017-03-01

    A hybrid MPI/OpenMP parallel version of the XTOR-2F code [Lütjens and Luciani, J. Comput. Phys. 229 (2010) 8130] solving the two-fluid MHD equations in full tokamak geometry by means of an iterative Newton-Krylov matrix-free method has been developed. The present work shows that the code has been parallelized significantly despite the numerical profile of the problem solved by XTOR-2F, i.e. a discretization with pseudo-spectral representations in all angular directions, the stiffness of the two-fluid stability problem in tokamaks, and the use of a direct LU decomposition to invert the physical pre-conditioner at every Krylov iteration of the solver. The execution time of the parallelized version is an order of magnitude smaller than the sequential one for low resolution cases, with an increasing speedup when the discretization mesh is refined. Moreover, it allows to perform simulations with higher resolutions, previously forbidden because of memory limitations.

  11. Preconditioning of iterative methods - theory and applications

    Czech Academy of Sciences Publication Activity Database

    Axelsson, Owe; Blaheta, Radim; Neytcheva, M.; Pultarová, I.

    2015-01-01

    Roč. 22, č. 6 (2015), s. 901-902 ISSN 1070-5325 Institutional support: RVO:68145535 Keywords : preconditioning * iterative methods * applications Subject RIV: BA - General Mathematics Impact factor: 1.431, year: 2015 http://onlinelibrary.wiley.com/doi/10.1002/nla.2016/epdf

  12. Numerical Solutions for Nonlinear High Damping Rubber Bearing Isolators: Newmark's Method with Netwon-Raphson Iteration Revisited

    Science.gov (United States)

    Markou, A. A.; Manolis, G. D.

    2018-03-01

    Numerical methods for the solution of dynamical problems in engineering go back to 1950. The most famous and widely-used time stepping algorithm was developed by Newmark in 1959. In the present study, for the first time, the Newmark algorithm is developed for the case of the trilinear hysteretic model, a model that was used to describe the shear behaviour of high damping rubber bearings. This model is calibrated against free-vibration field tests implemented on a hybrid base isolated building, namely the Solarino project in Italy, as well as against laboratory experiments. A single-degree-of-freedom system is used to describe the behaviour of a low-rise building isolated with a hybrid system comprising high damping rubber bearings and low friction sliding bearings. The behaviour of the high damping rubber bearings is simulated by the trilinear hysteretic model, while the description of the behaviour of the low friction sliding bearings is modeled by a linear Coulomb friction model. In order to prove the effectiveness of the numerical method we compare the analytically solved trilinear hysteretic model calibrated from free-vibration field tests (Solarino project) against the same model solved with the Newmark method with Netwon-Raphson iteration. Almost perfect agreement is observed between the semi-analytical solution and the fully numerical solution with Newmark's time integration algorithm. This will allow for extension of the trilinear mechanical models to bidirectional horizontal motion, to time-varying vertical loads, to multi-degree-of-freedom-systems, as well to generalized models connected in parallel, where only numerical solutions are possible.

  13. Domain decomposition method using a hybrid parallelism and a low-order acceleration for solving the Sn transport equation on unstructured geometry

    International Nuclear Information System (INIS)

    Odry, Nans

    2016-01-01

    Deterministic calculation schemes are devised to numerically solve the neutron transport equation in nuclear reactors. Dealing with core-sized problems is very challenging for computers, so much that the dedicated core calculations have no choice but to allow simplifying assumptions (assembly- then core scale steps..). The PhD work aims at overcoming some of these approximations: thanks to important changes in computer architecture and capacities (HPC), nowadays one can solve 3D core-sized problems, using both high mesh refinement and the transport operator. It is an essential step forward in order to perform, in the future, reference calculations using deterministic schemes. This work focuses on a spatial domain decomposition method (DDM). Using massive parallelism, DDM allows much more ambitious computations in terms of both memory requirements and calculation time. Developments were performed inside the Sn core solver Minaret, from the new CEA neutronics platform APOLLO3. Only fast reactors (hexagonal periodicity) are considered, even if all kinds of geometries can be dealt with, using Minaret. The work has been divided in four steps: 1) The spatial domain decomposition with no overlap is inserted into the standard algorithmic structure of Minaret. The fundamental idea involves splitting a core-sized problem into smaller, independent, spatial sub-problems. angular flux is exchanged between adjacent sub-domains. In doing so, all combined sub-problems converge to the global solution at the outcome of an iterative process. Various strategies were explored regarding both data management and algorithm design. Results (k eff and flux) are systematically compared to the reference in a numerical verification step. 2) Introducing more parallelism is an unprecedented opportunity to heighten performances of deterministic schemes. Domain decomposition is particularly suited to this. A two-layer hybrid parallelism strategy, suited to HPC, is chosen. It benefits from the

  14. Variational Iteration Method for Fifth-Order Boundary Value Problems Using He's Polynomials

    Directory of Open Access Journals (Sweden)

    Muhammad Aslam Noor

    2008-01-01

    Full Text Available We apply the variational iteration method using He's polynomials (VIMHP for solving the fifth-order boundary value problems. The proposed method is an elegant combination of variational iteration and the homotopy perturbation methods and is mainly due to Ghorbani (2007. The suggested algorithm is quite efficient and is practically well suited for use in these problems. The proposed iterative scheme finds the solution without any discritization, linearization, or restrictive assumptions. Several examples are given to verify the reliability and efficiency of the method. The fact that the proposed technique solves nonlinear problems without using Adomian's polynomials can be considered as a clear advantage of this algorithm over the decomposition method.

  15. Comment on “Variational Iteration Method for Fractional Calculus Using He’s Polynomials”

    Directory of Open Access Journals (Sweden)

    Ji-Huan He

    2012-01-01

    boundary value problems. This note concludes that the method is a modified variational iteration method using He’s polynomials. A standard variational iteration algorithm for fractional differential equations is suggested.

  16. Pseudoinverse preconditioners and iterative methods for large dense linear least-squares problems

    Directory of Open Access Journals (Sweden)

    Oskar Cahueñas

    2013-05-01

    Full Text Available We address the issue of approximating the pseudoinverse of the coefficient matrix for dynamically building preconditioning strategies for the numerical solution of large dense linear least-squares problems. The new preconditioning strategies are embedded into simple and well-known iterative schemes that avoid the use of the, usually ill-conditioned, normal equations. We analyze a scheme to approximate the pseudoinverse, based on Schulz iterative method, and also different iterative schemes, based on extensions of Richardson's method, and the conjugate gradient method, that are suitable for preconditioning strategies. We present preliminary numerical results to illustrate the advantages of the proposed schemes.

  17. Development of parallel algorithms for electrical power management in space applications

    Science.gov (United States)

    Berry, Frederick C.

    1989-01-01

    The application of parallel techniques for electrical power system analysis is discussed. The Newton-Raphson method of load flow analysis was used along with the decomposition-coordination technique to perform load flow analysis. The decomposition-coordination technique enables tasks to be performed in parallel by partitioning the electrical power system into independent local problems. Each independent local problem represents a portion of the total electrical power system on which a loan flow analysis can be performed. The load flow analysis is performed on these partitioned elements by using the Newton-Raphson load flow method. These independent local problems will produce results for voltage and power which can then be passed to the coordinator portion of the solution procedure. The coordinator problem uses the results of the local problems to determine if any correction is needed on the local problems. The coordinator problem is also solved by an iterative method much like the local problem. The iterative method for the coordination problem will also be the Newton-Raphson method. Therefore, each iteration at the coordination level will result in new values for the local problems. The local problems will have to be solved again along with the coordinator problem until some convergence conditions are met.

  18. On varitional iteration method for fractional calculus

    Directory of Open Access Journals (Sweden)

    Wu Hai-Gen

    2017-01-01

    Full Text Available Modification of the Das’ variational iteration method for fractional differential equations is discussed, and its main shortcoming involved in the solution process is pointed out and overcome by using fractional power series. The suggested computational procedure is simple and reliable for fractional calculus.

  19. A High-Performance Parallel FDTD Method Enhanced by Using SSE Instruction Set

    Directory of Open Access Journals (Sweden)

    Dau-Chyrh Chang

    2012-01-01

    Full Text Available We introduce a hardware acceleration technique for the parallel finite difference time domain (FDTD method using the SSE (streaming (single instruction multiple data SIMD extensions instruction set. The implementation of SSE instruction set to parallel FDTD method has achieved the significant improvement on the simulation performance. The benchmarks of the SSE acceleration on both the multi-CPU workstation and computer cluster have demonstrated the advantages of (vector arithmetic logic unit VALU acceleration over GPU acceleration. Several engineering applications are employed to demonstrate the performance of parallel FDTD method enhanced by SSE instruction set.

  20. A HIGH ORDER SOLUTION OF THREE DIMENSIONAL TIME DEPENDENT NONLINEAR CONVECTIVE-DIFFUSIVE PROBLEM USING MODIFIED VARIATIONAL ITERATION METHOD

    Directory of Open Access Journals (Sweden)

    Pratibha Joshi

    2014-12-01

    Full Text Available In this paper, we have achieved high order solution of a three dimensional nonlinear diffusive-convective problem using modified variational iteration method. The efficiency of this approach has been shown by solving two examples. All computational work has been performed in MATHEMATICA.

  1. On iteration-separable method on the multichannel scattering theory

    International Nuclear Information System (INIS)

    Zubarev, A.L.; Ivlieva, I.N.; Podkopaev, A.P.

    1975-01-01

    The iteration-separable method for solving the equations of the Lippman-Schwinger type is suggested. Exponential convergency of the method of proven. Numerical convergency is clarified on the e + H scattering. Application of the method to the theory of multichannel scattering is formulated

  2. Parallel efficient rate control methods for JPEG 2000

    Science.gov (United States)

    Martínez-del-Amor, Miguel Á.; Bruns, Volker; Sparenberg, Heiko

    2017-09-01

    Since the introduction of JPEG 2000, several rate control methods have been proposed. Among them, post-compression rate-distortion optimization (PCRD-Opt) is the most widely used, and the one recommended by the standard. The approach followed by this method is to first compress the entire image split in code blocks, and subsequently, optimally truncate the set of generated bit streams according to the maximum target bit rate constraint. The literature proposes various strategies on how to estimate ahead of time where a block will get truncated in order to stop the execution prematurely and save time. However, none of them have been defined bearing in mind a parallel implementation. Today, multi-core and many-core architectures are becoming popular for JPEG 2000 codecs implementations. Therefore, in this paper, we analyze how some techniques for efficient rate control can be deployed in GPUs. In order to do that, the design of our GPU-based codec is extended, allowing stopping the process at a given point. This extension also harnesses a higher level of parallelism on the GPU, leading to up to 40% of speedup with 4K test material on a Titan X. In a second step, three selected rate control methods are adapted and implemented in our parallel encoder. A comparison is then carried out, and used to select the best candidate to be deployed in a GPU encoder, which gave an extra 40% of speedup in those situations where it was really employed.

  3. A dimension decomposition approach based on iterative observer design for an elliptic Cauchy problem

    KAUST Repository

    Majeed, Muhammad Usman; Laleg-Kirati, Taous-Meriem

    2015-01-01

    A state observer inspired iterative algorithm is presented to solve boundary estimation problem for Laplace equation using one of the space variables as a time-like variable. Three dimensional domain with two congruent parallel surfaces

  4. Comparison results on preconditioned SOR-type iterative method for Z-matrices linear systems

    Science.gov (United States)

    Wang, Xue-Zhong; Huang, Ting-Zhu; Fu, Ying-Ding

    2007-09-01

    In this paper, we present some comparison theorems on preconditioned iterative method for solving Z-matrices linear systems, Comparison results show that the rate of convergence of the Gauss-Seidel-type method is faster than the rate of convergence of the SOR-type iterative method.

  5. A SPECT reconstruction method for extending parallel to non-parallel geometries

    International Nuclear Information System (INIS)

    Wen Junhai; Liang Zhengrong

    2010-01-01

    Due to its simplicity, parallel-beam geometry is usually assumed for the development of image reconstruction algorithms. The established reconstruction methodologies are then extended to fan-beam, cone-beam and other non-parallel geometries for practical application. This situation occurs for quantitative SPECT (single photon emission computed tomography) imaging in inverting the attenuated Radon transform. Novikov reported an explicit parallel-beam formula for the inversion of the attenuated Radon transform in 2000. Thereafter, a formula for fan-beam geometry was reported by Bukhgeim and Kazantsev (2002 Preprint N. 99 Sobolev Institute of Mathematics). At the same time, we presented a formula for varying focal-length fan-beam geometry. Sometimes, the reconstruction formula is so implicit that we cannot obtain the explicit reconstruction formula in the non-parallel geometries. In this work, we propose a unified reconstruction framework for extending parallel-beam geometry to any non-parallel geometry using ray-driven techniques. Studies by computer simulations demonstrated the accuracy of the presented unified reconstruction framework for extending parallel-beam to non-parallel geometries in inverting the attenuated Radon transform.

  6. Discounted Markov games : generalized policy iteration method

    NARCIS (Netherlands)

    Wal, van der J.

    1978-01-01

    In this paper, we consider two-person zero-sum discounted Markov games with finite state and action spaces. We show that the Newton-Raphson or policy iteration method as presented by Pollats-chek and Avi-Itzhak does not necessarily converge, contradicting a proof of Rao, Chandrasekaran, and Nair.

  7. Iterative method of the parameter variation for solution of nonlinear functional equations

    International Nuclear Information System (INIS)

    Davidenko, D.F.

    1975-01-01

    The iteration method of parameter variation is used for solving nonlinear functional equations in Banach spaces. The authors consider some methods for numerical integration of ordinary first-order differential equations and construct the relevant iteration methods of parameter variation, both one- and multifactor. They also discuss problems of mathematical substantiation of the method, study the conditions and rate of convergence, estimate the error. The paper considers the application of the method to specific functional equations

  8. Optimized iterative decoding method for TPC coded CPM

    Science.gov (United States)

    Ma, Yanmin; Lai, Penghui; Wang, Shilian; Xie, Shunqin; Zhang, Wei

    2018-05-01

    Turbo Product Code (TPC) coded Continuous Phase Modulation (CPM) system (TPC-CPM) has been widely used in aeronautical telemetry and satellite communication. This paper mainly investigates the improvement and optimization on the TPC-CPM system. We first add the interleaver and deinterleaver to the TPC-CPM system, and then establish an iterative system to iteratively decode. However, the improved system has a poor convergence ability. To overcome this issue, we use the Extrinsic Information Transfer (EXIT) analysis to find the optimal factors for the system. The experiments show our method is efficient to improve the convergence performance.

  9. Improving the iterative Linear Interaction Energy approach using automated recognition of configurational transitions.

    Science.gov (United States)

    Vosmeer, C Ruben; Kooi, Derk P; Capoferri, Luigi; Terpstra, Margreet M; Vermeulen, Nico P E; Geerke, Daan P

    2016-01-01

    Recently an iterative method was proposed to enhance the accuracy and efficiency of ligand-protein binding affinity prediction through linear interaction energy (LIE) theory. For ligand binding to flexible Cytochrome P450s (CYPs), this method was shown to decrease the root-mean-square error and standard deviation of error prediction by combining interaction energies of simulations starting from different conformations. Thereby, different parts of protein-ligand conformational space are sampled in parallel simulations. The iterative LIE framework relies on the assumption that separate simulations explore different local parts of phase space, and do not show transitions to other parts of configurational space that are already covered in parallel simulations. In this work, a method is proposed to (automatically) detect such transitions during the simulations that are performed to construct LIE models and to predict binding affinities. Using noise-canceling techniques and splines to fit time series of the raw data for the interaction energies, transitions during simulation between different parts of phase space are identified. Boolean selection criteria are then applied to determine which parts of the interaction energy trajectories are to be used as input for the LIE calculations. Here we show that this filtering approach benefits the predictive quality of our previous CYP 2D6-aryloxypropanolamine LIE model. In addition, an analysis is performed of the gain in computational efficiency that can be obtained from monitoring simulations using the proposed filtering method and by prematurely terminating simulations accordingly.

  10. Existence test for asynchronous interval iterations

    DEFF Research Database (Denmark)

    Madsen, Kaj; Caprani, O.; Stauning, Ole

    1997-01-01

    In the search for regions that contain fixed points ofa real function of several variables, tests based on interval calculationscan be used to establish existence ornon-existence of fixed points in regions that are examined in the course ofthe search. The search can e.g. be performed...... as a synchronous (sequential) interval iteration:In each iteration step all components of the iterate are calculatedbased on the previous iterate. In this case it is straight forward to base simple interval existence and non-existencetests on the calculations done in each step of the iteration. The search can also...... on thecomponentwise calculations done in the course of the iteration. These componentwisetests are useful for parallel implementation of the search, sincethe tests can then be performed local to each processor and only when a test issuccessful do a processor communicate this result to other processors....

  11. Deep Time Iterations: Familiarity, Horizons, and Pattern among Finland's Nuclear Waste Safety Experts

    Science.gov (United States)

    Ialenti, Vincent Francis

    This ethnography reconsiders nuclear waste risk's deep time horizons' often-sensationalized aesthetics of horror, sublimity, and awe. It does so by tracking how Finland's nuclear energy and waste experts made visions of distant future Finlands appear more intelligible through mundane corporate, regulatory, financial, and technoscientific practices. Each chapter unpacks how informants iterated and reiterated traces of the very familiar to establish shared grounds of continuity for moving forward in time. Chapter 1 explores how Finland's energy sector's "mankala" cooperative corporate form was iterated and reiterated to give shape to political and financial time horizons. Chapter 2 explores how workplace role distinctions between recruit/retiree and junior/senior were iterated and reiterated to reckon nuclear personnel successions' intergenerational horizons. Chapter 3 explores how input/output and part/whole distinctions were iterated and reiterated to help model distant future worlds in a portfolio of "Safety Case" evidence made to demonstrate the Olkiluoto repository's safety to Finnish nuclear regulator STUK. Chapter 4 explores how Safety Case experts iterated and reiterated memories of a deceased predecessor figure in everyday engagements with deep time. What emerges are three insights about how futures attain discernible features--insights about the "continuity," "thinkability," and "extensibility" of expert thought--that, I argue, can help twenty-first century experts better navigate not only deep time, but also unknown futures of nuclear technologies, planetary environment, and expertise itself.

  12. AIR-MRF: Accelerated iterative reconstruction for magnetic resonance fingerprinting.

    Science.gov (United States)

    Cline, Christopher C; Chen, Xiao; Mailhe, Boris; Wang, Qiu; Pfeuffer, Josef; Nittka, Mathias; Griswold, Mark A; Speier, Peter; Nadar, Mariappan S

    2017-09-01

    Existing approaches for reconstruction of multiparametric maps with magnetic resonance fingerprinting (MRF) are currently limited by their estimation accuracy and reconstruction time. We aimed to address these issues with a novel combination of iterative reconstruction, fingerprint compression, additional regularization, and accelerated dictionary search methods. The pipeline described here, accelerated iterative reconstruction for magnetic resonance fingerprinting (AIR-MRF), was evaluated with simulations as well as phantom and in vivo scans. We found that the AIR-MRF pipeline provided reduced parameter estimation errors compared to non-iterative and other iterative methods, particularly at shorter sequence lengths. Accelerated dictionary search methods incorporated into the iterative pipeline reduced the reconstruction time at little cost of quality. Copyright © 2017 Elsevier Inc. All rights reserved.

  13. Just-in-Time Compilation-Inspired Methodology for Parallelization of Compute Intensive Java Code

    Directory of Open Access Journals (Sweden)

    GHULAM MUSTAFA

    2017-01-01

    Full Text Available Compute intensive programs generally consume significant fraction of execution time in a small amount of repetitive code. Such repetitive code is commonly known as hotspot code. We observed that compute intensive hotspots often possess exploitable loop level parallelism. A JIT (Just-in-Time compiler profiles a running program to identify its hotspots. Hotspots are then translated into native code, for efficient execution. Using similar approach, we propose a methodology to identify hotspots and exploit their parallelization potential on multicore systems. Proposed methodology selects and parallelizes each DOALL loop that is either contained in a hotspot method or calls a hotspot method. The methodology could be integrated in front-end of a JIT compiler to parallelize sequential code, just before native translation. However, compilation to native code is out of scope of this work. As a case study, we analyze eighteen JGF (Java Grande Forum benchmarks to determine parallelization potential of hotspots. Eight benchmarks demonstrate a speedup of up to 7.6x on an 8-core system

  14. Just-in-time compilation-inspired methodology for parallelization of compute intensive java code

    International Nuclear Information System (INIS)

    Mustafa, G.; Ghani, M.U.

    2017-01-01

    Compute intensive programs generally consume significant fraction of execution time in a small amount of repetitive code. Such repetitive code is commonly known as hotspot code. We observed that compute intensive hotspots often possess exploitable loop level parallelism. A JIT (Just-in-Time) compiler profiles a running program to identify its hotspots. Hotspots are then translated into native code, for efficient execution. Using similar approach, we propose a methodology to identify hotspots and exploit their parallelization potential on multicore systems. Proposed methodology selects and parallelizes each DOALL loop that is either contained in a hotspot method or calls a hotspot method. The methodology could be integrated in front-end of a JIT compiler to parallelize sequential code, just before native translation. However, compilation to native code is out of scope of this work. As a case study, we analyze eighteen JGF (Java Grande Forum) benchmarks to determine parallelization potential of hotspots. Eight benchmarks demonstrate a speedup of up to 7.6x on an 8-core system. (author)

  15. Numerical Solutions for Nonlinear High Damping Rubber Bearing Isolators: Newmark’s Method with Netwon-Raphson Iteration Revisited

    Directory of Open Access Journals (Sweden)

    Markou A.A.

    2018-03-01

    Full Text Available Numerical methods for the solution of dynamical problems in engineering go back to 1950. The most famous and widely-used time stepping algorithm was developed by Newmark in 1959. In the present study, for the first time, the Newmark algorithm is developed for the case of the trilinear hysteretic model, a model that was used to describe the shear behaviour of high damping rubber bearings. This model is calibrated against free-vibration field tests implemented on a hybrid base isolated building, namely the Solarino project in Italy, as well as against laboratory experiments. A single-degree-of-freedom system is used to describe the behaviour of a low-rise building isolated with a hybrid system comprising high damping rubber bearings and low friction sliding bearings. The behaviour of the high damping rubber bearings is simulated by the trilinear hysteretic model, while the description of the behaviour of the low friction sliding bearings is modeled by a linear Coulomb friction model. In order to prove the effectiveness of the numerical method we compare the analytically solved trilinear hysteretic model calibrated from free-vibration field tests (Solarino project against the same model solved with the Newmark method with Netwon-Raphson iteration. Almost perfect agreement is observed between the semi-analytical solution and the fully numerical solution with Newmark’s time integration algorithm. This will allow for extension of the trilinear mechanical models to bidirectional horizontal motion, to time-varying vertical loads, to multi-degree-of-freedom-systems, as well to generalized models connected in parallel, where only numerical solutions are possible.

  16. An iterative method for determination of a minimal eigenvalue

    DEFF Research Database (Denmark)

    Kristiansen, G.K.

    1968-01-01

    Kristiansen (1963) has discussed the convergence of a group of iterative methods (denoted the Equipoise methods) for the solution of reactor criticality problems. The main result was that even though the methods are said to work satisfactorily in all practical cases, examples of divergence can be...

  17. Parallelization of a spherical Sn transport theory algorithm

    International Nuclear Information System (INIS)

    Haghighat, A.

    1989-01-01

    The work described in this paper derives a parallel algorithm for an R-dependent spherical S N transport theory algorithm and studies its performance by testing different sample problems. The S N transport method is one of the most accurate techniques used to solve the linear Boltzmann equation. Several studies have been done on the vectorization of the S N algorithms; however, very few studies have been performed on the parallelization of this algorithm. Weinke and Hommoto have looked at the parallel processing of the different energy groups, and Azmy recently studied the parallel processing of the inner iterations of an X-Y S N nodal transport theory method. Both studies have reported very encouraging results, which have prompted us to look at the parallel processing of an R-dependent S N spherical geometry algorithm. This geometry was chosen because, in spite of its simplicity, it contains the complications of the curvilinear geometries (i.e., redistribution of neutrons over the discretized angular bins)

  18. A superlinear iteration method for calculation of finite length journal bearing's static equilibrium position

    Science.gov (United States)

    Zhou, Wenjie; Wei, Xuesong; Wang, Leqin; Wu, Guangkuan

    2017-05-01

    Solving the static equilibrium position is one of the most important parts of dynamic coefficients calculation and further coupled calculation of rotor system. The main contribution of this study is testing the superlinear iteration convergence method-twofold secant method, for the determination of the static equilibrium position of journal bearing with finite length. Essentially, the Reynolds equation for stable motion is solved by the finite difference method and the inner pressure is obtained by the successive over-relaxation iterative method reinforced by the compound Simpson quadrature formula. The accuracy and efficiency of the twofold secant method are higher in comparison with the secant method and dichotomy. The total number of iterative steps required for the twofold secant method are about one-third of the secant method and less than one-eighth of dichotomy for the same equilibrium position. The calculations for equilibrium position and pressure distribution for different bearing length, clearance and rotating speed were done. In the results, the eccentricity presents linear inverse proportional relationship to the attitude angle. The influence of the bearing length, clearance and bearing radius on the load-carrying capacity was also investigated. The results illustrate that larger bearing length, larger radius and smaller clearance are good for the load-carrying capacity of journal bearing. The application of the twofold secant method can greatly reduce the computational time for calculation of the dynamic coefficients and dynamic characteristics of rotor-bearing system with a journal bearing of finite length.

  19. Parallel preconditioned conjugate gradient algorithm applied to neutron diffusion problem

    International Nuclear Information System (INIS)

    Majumdar, A.; Martin, W.R.

    1992-01-01

    Numerical solution of the neutron diffusion problem requires solving a linear system of equations such as Ax = b, where A is an n x n symmetric positive definite (SPD) matrix; x and b are vectors with n components. The preconditioned conjugate gradient (PCG) algorithm is an efficient iterative method for solving such a linear system of equations. In this paper, the authors describe the implementation of a parallel PCG algorithm on a shared memory machine (BBN TC2000) and on a distributed workstation (IBM RS6000) environment created by the parallel virtual machine parallelization software

  20. Parallelized preconditioned BiCGStab solution of sparse linear system equations in F-COBRA-TF

    International Nuclear Information System (INIS)

    Geemert, Rene van; Glück, Markus; Riedmann, Michael; Gabriel, Harry

    2011-01-01

    Recently, the in-house development of a preconditioned and parallelized BiCGStab solver has been pursued successfully in AREVA’s advanced sub-channel code F-COBRA-TF. This solver can be run either in a sequential computation mode on a single CPU, or in a parallel computation mode on multiple parallel CPUs. The developed procedure enables the computation of several thousands of successive sparse linear system solutions in F-COBRA-TF with acceptable wall clock run times. The current paper provides general information about F-COBRA-TF in terms of modeling capabilities and application areas, and points out where the relevance arises for the efficient iterative solution of sparse linear systems. Furthermore, the preconditioning and parallelization strategies in the developed BiCGStab iterative solution approach are discussed. The paper is concluded with a number of verification examples. (author)

  1. Time-dependent density-functional theory in massively parallel computer architectures: the OCTOPUS project.

    Science.gov (United States)

    Andrade, Xavier; Alberdi-Rodriguez, Joseba; Strubbe, David A; Oliveira, Micael J T; Nogueira, Fernando; Castro, Alberto; Muguerza, Javier; Arruabarrena, Agustin; Louie, Steven G; Aspuru-Guzik, Alán; Rubio, Angel; Marques, Miguel A L

    2012-06-13

    Octopus is a general-purpose density-functional theory (DFT) code, with a particular emphasis on the time-dependent version of DFT (TDDFT). In this paper we present the ongoing efforts to achieve the parallelization of octopus. We focus on the real-time variant of TDDFT, where the time-dependent Kohn-Sham equations are directly propagated in time. This approach has great potential for execution in massively parallel systems such as modern supercomputers with thousands of processors and graphics processing units (GPUs). For harvesting the potential of conventional supercomputers, the main strategy is a multi-level parallelization scheme that combines the inherent scalability of real-time TDDFT with a real-space grid domain-partitioning approach. A scalable Poisson solver is critical for the efficiency of this scheme. For GPUs, we show how using blocks of Kohn-Sham states provides the required level of data parallelism and that this strategy is also applicable for code optimization on standard processors. Our results show that real-time TDDFT, as implemented in octopus, can be the method of choice for studying the excited states of large molecular systems in modern parallel architectures.

  2. Time-dependent density-functional theory in massively parallel computer architectures: the octopus project

    Science.gov (United States)

    Andrade, Xavier; Alberdi-Rodriguez, Joseba; Strubbe, David A.; Oliveira, Micael J. T.; Nogueira, Fernando; Castro, Alberto; Muguerza, Javier; Arruabarrena, Agustin; Louie, Steven G.; Aspuru-Guzik, Alán; Rubio, Angel; Marques, Miguel A. L.

    2012-06-01

    Octopus is a general-purpose density-functional theory (DFT) code, with a particular emphasis on the time-dependent version of DFT (TDDFT). In this paper we present the ongoing efforts to achieve the parallelization of octopus. We focus on the real-time variant of TDDFT, where the time-dependent Kohn-Sham equations are directly propagated in time. This approach has great potential for execution in massively parallel systems such as modern supercomputers with thousands of processors and graphics processing units (GPUs). For harvesting the potential of conventional supercomputers, the main strategy is a multi-level parallelization scheme that combines the inherent scalability of real-time TDDFT with a real-space grid domain-partitioning approach. A scalable Poisson solver is critical for the efficiency of this scheme. For GPUs, we show how using blocks of Kohn-Sham states provides the required level of data parallelism and that this strategy is also applicable for code optimization on standard processors. Our results show that real-time TDDFT, as implemented in octopus, can be the method of choice for studying the excited states of large molecular systems in modern parallel architectures.

  3. Time-dependent density-functional theory in massively parallel computer architectures: the octopus project

    International Nuclear Information System (INIS)

    Andrade, Xavier; Aspuru-Guzik, Alán; Alberdi-Rodriguez, Joseba; Rubio, Angel; Strubbe, David A; Louie, Steven G; Oliveira, Micael J T; Nogueira, Fernando; Castro, Alberto; Muguerza, Javier; Arruabarrena, Agustin; Marques, Miguel A L

    2012-01-01

    Octopus is a general-purpose density-functional theory (DFT) code, with a particular emphasis on the time-dependent version of DFT (TDDFT). In this paper we present the ongoing efforts to achieve the parallelization of octopus. We focus on the real-time variant of TDDFT, where the time-dependent Kohn-Sham equations are directly propagated in time. This approach has great potential for execution in massively parallel systems such as modern supercomputers with thousands of processors and graphics processing units (GPUs). For harvesting the potential of conventional supercomputers, the main strategy is a multi-level parallelization scheme that combines the inherent scalability of real-time TDDFT with a real-space grid domain-partitioning approach. A scalable Poisson solver is critical for the efficiency of this scheme. For GPUs, we show how using blocks of Kohn-Sham states provides the required level of data parallelism and that this strategy is also applicable for code optimization on standard processors. Our results show that real-time TDDFT, as implemented in octopus, can be the method of choice for studying the excited states of large molecular systems in modern parallel architectures. (topical review)

  4. The steady performance prediction of propeller-rudder-bulb system based on potential iterative method

    International Nuclear Information System (INIS)

    Liu, Y B; Su, Y M; Ju, L; Huang, S L

    2012-01-01

    A new numerical method was developed for predicting the steady hydrodynamic performance of propeller-rudder-bulb system. In the calculation, the rudder and bulb was taken into account as a whole, the potential based surface panel method was applied both to propeller and rudder-bulb system. The interaction between propeller and rudder-bulb was taken into account by velocity potential iteration in which the influence of propeller rotation was considered by the average influence coefficient. In the influence coefficient computation, the singular value should be found and deleted. Numerical results showed that the method presented is effective for predicting the steady hydrodynamic performance of propeller-rudder system and propeller-rudder-bulb system. Comparing with the induced velocity iterative method, the method presented can save programming and calculation time. Changing dimensions, the principal parameter—bulb size that affect energy-saving effect was studied, the results show that the bulb on rudder have a optimal size at the design advance coefficient.

  5. Determination of quantitative tissue composition by iterative reconstruction on 3D DECT volumes

    Energy Technology Data Exchange (ETDEWEB)

    Magnusson, Maria [Linkoeping Univ. (Sweden). Dept. of Electrical Engineering; Linkoeping Univ. (Sweden). Dept. of Medical and Health Sciences, Radiation Physics; Linkoeping Univ. (Sweden). Center for Medical Image Science and Visualization (CMIV); Malusek, Alexandr [Linkoeping Univ. (Sweden). Dept. of Medical and Health Sciences, Radiation Physics; Linkoeping Univ. (Sweden). Center for Medical Image Science and Visualization (CMIV); Nuclear Physics Institute AS CR, Prague (Czech Republic). Dept. of Radiation Dosimetry; Muhammad, Arif [Linkoeping Univ. (Sweden). Dept. of Medical and Health Sciences, Radiation Physics; Carlsson, Gudrun Alm [Linkoeping Univ. (Sweden). Dept. of Medical and Health Sciences, Radiation Physics; Linkoeping Univ. (Sweden). Center for Medical Image Science and Visualization (CMIV)

    2011-07-01

    Quantitative tissue classification using dual-energy CT has the potential to improve accuracy in radiation therapy dose planning as it provides more information about material composition of scanned objects than the currently used methods based on single-energy CT. One problem that hinders successful application of both single- and dual-energy CT is the presence of beam hardening and scatter artifacts in reconstructed data. Current pre- and post-correction methods used for image reconstruction often bias CT attenuation values and thus limit their applicability for quantitative tissue classification. Here we demonstrate simulation studies with a novel iterative algorithm that decomposes every soft tissue voxel into three base materials: water, protein, and adipose. The results demonstrate that beam hardening artifacts can effectively be removed and accurate estimation of mass fractions of each base material can be achieved. Our iterative algorithm starts with calculating parallel projections on two previously reconstructed DECT volumes reconstructed from fan-beam or helical projections with small conebeam angle. The parallel projections are then used in an iterative loop. Future developments include segmentation of soft and bone tissue and subsequent determination of bone composition. (orig.)

  6. An iterative method for accelerated degradation testing data of smart electricity meter

    Science.gov (United States)

    Wang, Xiaoming; Xie, Jinzhe

    2017-01-01

    In order to evaluate the performance of smart electricity meter (SEM), we must spend a lot of time censoring its status. For example, if we assess to the meter stability of the SEM which needs several years at least according to the standards. So accelerated degradation testing (ADT) is a useful method to assess the performance of the SEM. As we known, the Wiener process is a prevalent method to interpret the performance degradation. This paper proposes an iterative method for ADT data of SEM. The simulation study verifies the application and superiority of the proposed model than other ADT methods.

  7. A Parallel Solver for Large-Scale Markov Chains

    Czech Academy of Sciences Publication Activity Database

    Benzi, M.; Tůma, Miroslav

    2002-01-01

    Roč. 41, - (2002), s. 135-153 ISSN 0168-9274 R&D Projects: GA AV ČR IAA2030801; GA ČR GA101/00/1035 Keywords : parallel preconditioning * iterative methods * discrete Markov chains * generalized inverses * singular matrices * graph partitioning * AINV * Bi-CGSTAB Subject RIV: BA - General Mathematics Impact factor: 0.504, year: 2002

  8. Parallelizing the spectral transform method: A comparison of alternative parallel algorithms

    International Nuclear Information System (INIS)

    Foster, I.; Worley, P.H.

    1993-01-01

    The spectral transform method is a standard numerical technique for solving partial differential equations on the sphere and is widely used in global climate modeling. In this paper, we outline different approaches to parallelizing the method and describe experiments that we are conducting to evaluate the efficiency of these approaches on parallel computers. The experiments are conducted using a testbed code that solves the nonlinear shallow water equations on a sphere, but are designed to permit evaluation in the context of a global model. They allow us to evaluate the relative merits of the approaches as a function of problem size and number of processors. The results of this study are guiding ongoing work on PCCM2, a parallel implementation of the Community Climate Model developed at the National Center for Atmospheric Research

  9. Projection-iteration methods for solving nonlinear operator equations

    International Nuclear Information System (INIS)

    Nguyen Minh Chuong; Tran thi Lan Anh; Tran Quoc Binh

    1989-09-01

    In this paper, the authors investigate a nonlinear operator equation in uniformly convex Banach spaces as in metric spaces by using stationary and nonstationary generalized projection-iteration methods. Convergence theorems in the strong and weak sense were established. (author). 7 refs

  10. An iterative kernel based method for fourth order nonlinear equation with nonlinear boundary condition

    Science.gov (United States)

    Azarnavid, Babak; Parand, Kourosh; Abbasbandy, Saeid

    2018-06-01

    This article discusses an iterative reproducing kernel method with respect to its effectiveness and capability of solving a fourth-order boundary value problem with nonlinear boundary conditions modeling beams on elastic foundations. Since there is no method of obtaining reproducing kernel which satisfies nonlinear boundary conditions, the standard reproducing kernel methods cannot be used directly to solve boundary value problems with nonlinear boundary conditions as there is no knowledge about the existence and uniqueness of the solution. The aim of this paper is, therefore, to construct an iterative method by the use of a combination of reproducing kernel Hilbert space method and a shooting-like technique to solve the mentioned problems. Error estimation for reproducing kernel Hilbert space methods for nonlinear boundary value problems have yet to be discussed in the literature. In this paper, we present error estimation for the reproducing kernel method to solve nonlinear boundary value problems probably for the first time. Some numerical results are given out to demonstrate the applicability of the method.

  11. Parallel Evolutionary Optimization of Multibody Systems with Application to Railway Dynamics

    Energy Technology Data Exchange (ETDEWEB)

    Eberhard, Peter [University of Erlangen-Nuremberg, Institute of Applied Mechanics (Germany)], E-mail: eberhard@ltm.uni-erlangen.de; Dignath, Florian [University of Stuttgart, Institute B of Mechanics (Germany)], E-mail: fd@mechb.uni-stuttgart.de; Kuebler, Lars [University of Erlangen-Nuremberg, Institute of Applied Mechanics (Germany)], E-mail: kuebler@ltm.uni-erlangen.de

    2003-03-15

    The optimization of multibody systems usually requires many costly criteria computations since the equations of motion must be evaluated by numerical time integration for each considered design. For actively controlled or flexible multibody systems additional difficulties arise as the criteria may contain non-differentiable points or many local minima. Therefore, in this paper a stochastic evolution strategy is used in combination with parallel computing in order to reduce the computation times whilst keeping the inherent robustness. For the parallelization a master-slave approach is used in a heterogeneous workstation/PC cluster. The pool-of-tasks concept is applied in order to deal with the frequently changing workloads of different machines in the cluster. In order to analyze the performance of the parallel optimization method, the suspension of an ICE passenger coach, modeled as an elastic multibody system, is optimized simultaneously with regard to several criteria including vibration damping and a criterion related to safety against derailment. The iterative and interactive nature of a typical optimization process for technical systems is emphasized.

  12. Parallel Evolutionary Optimization of Multibody Systems with Application to Railway Dynamics

    International Nuclear Information System (INIS)

    Eberhard, Peter; Dignath, Florian; Kuebler, Lars

    2003-01-01

    The optimization of multibody systems usually requires many costly criteria computations since the equations of motion must be evaluated by numerical time integration for each considered design. For actively controlled or flexible multibody systems additional difficulties arise as the criteria may contain non-differentiable points or many local minima. Therefore, in this paper a stochastic evolution strategy is used in combination with parallel computing in order to reduce the computation times whilst keeping the inherent robustness. For the parallelization a master-slave approach is used in a heterogeneous workstation/PC cluster. The pool-of-tasks concept is applied in order to deal with the frequently changing workloads of different machines in the cluster. In order to analyze the performance of the parallel optimization method, the suspension of an ICE passenger coach, modeled as an elastic multibody system, is optimized simultaneously with regard to several criteria including vibration damping and a criterion related to safety against derailment. The iterative and interactive nature of a typical optimization process for technical systems is emphasized

  13. Development and test of the ITER conductor joints

    Energy Technology Data Exchange (ETDEWEB)

    Martovetsky, N., LLNL

    1998-05-14

    Joints for the ITER superconducting Central Solenoid should perform in rapidly varying magnetic field with low losses and low DC resistance. This paper describes the design of the ITER joint and presents its assembly process. Two joints were built and tested at the PTF facility at MIT. Test results are presented, losses in transverse and parallel field and the DC performance are discussed. The developed joint demonstrates sufficient margin for baseline ITER operating scenarios.

  14. Parallel Computing Characteristics of Two-Phase Thermal-Hydraulics code, CUPID

    International Nuclear Information System (INIS)

    Lee, Jae Ryong; Yoon, Han Young

    2013-01-01

    Parallelized CUPID code has proved to be able to reproduce multi-dimensional thermal hydraulic analysis by validating with various conceptual problems and experimental data. In this paper, the characteristics of the parallelized CUPID code were investigated. Both single- and two phase simulation are taken into account. Since the scalability of a parallel simulation is known to be better for fine mesh system, two types of mesh system are considered. In addition, the dependency of the preconditioner for matrix solver was also compared. The scalability for the single-phase flow is better than that for two-phase flow due to the less numbers of iterations for solving pressure matrix. The CUPID code was investigated the parallel performance in terms of scalability. The CUPID code was parallelized with domain decomposition method. The MPI library was adopted to communicate the information at the interface cells. As increasing the number of mesh, the scalability is improved. For a given mesh, single-phase flow simulation with diagonal preconditioner shows the best speedup. However, for the two-phase flow simulation, the ILU preconditioner is recommended since it reduces the overall simulation time

  15. COMPUTATIONAL EFFICIENCY OF A MODIFIED SCATTERING KERNEL FOR FULL-COUPLED PHOTON-ELECTRON TRANSPORT PARALLEL COMPUTING WITH UNSTRUCTURED TETRAHEDRAL MESHES

    Directory of Open Access Journals (Sweden)

    JONG WOON KIM

    2014-04-01

    In this paper, we introduce a modified scattering kernel approach to avoid the unnecessarily repeated calculations involved with the scattering source calculation, and used it with parallel computing to effectively reduce the computation time. Its computational efficiency was tested for three-dimensional full-coupled photon-electron transport problems using our computer program which solves the multi-group discrete ordinates transport equation by using the discontinuous finite element method with unstructured tetrahedral meshes for complicated geometrical problems. The numerical tests show that we can improve speed up to 17∼42 times for the elapsed time per iteration using the modified scattering kernel, not only in the single CPU calculation but also in the parallel computing with several CPUs.

  16. Linear multifrequency-grey acceleration recast for preconditioned Krylov iterations

    International Nuclear Information System (INIS)

    Morel, Jim E.; Brian Yang, T.-Y.; Warsa, James S.

    2007-01-01

    The linear multifrequency-grey acceleration (LMFGA) technique is used to accelerate the iterative convergence of multigroup thermal radiation diffusion calculations in high energy density simulations. Although it is effective and efficient in one-dimensional calculations, the LMFGA method has recently been observed to significantly degrade under certain conditions in multidimensional calculations with large discontinuities in material properties. To address this deficiency, we recast the LMFGA method in terms of a preconditioned system that is solved with a Krylov method (LMFGK). Results are presented demonstrating that the new LMFGK method always requires fewer iterations than the original LMFGA method. The reduction in iteration count increases with both the size of the time step and the inhomogeneity of the problem. However, for reasons later explained, the LMFGK method can cost more per iteration than the LMFGA method, resulting in lower but comparable efficiency in problems with small time steps and weak inhomogeneities. In problems with large time steps and strong inhomogeneities, the LMFGK method is significantly more efficient than the LMFGA method

  17. Accelerated fast iterative shrinkage thresholding algorithms for sparsity-regularized cone-beam CT image reconstruction

    International Nuclear Information System (INIS)

    Xu, Qiaofeng; Sawatzky, Alex; Anastasio, Mark A.; Yang, Deshan; Tan, Jun

    2016-01-01

    Purpose: The development of iterative image reconstruction algorithms for cone-beam computed tomography (CBCT) remains an active and important research area. Even with hardware acceleration, the overwhelming majority of the available 3D iterative algorithms that implement nonsmooth regularizers remain computationally burdensome and have not been translated for routine use in time-sensitive applications such as image-guided radiation therapy (IGRT). In this work, two variants of the fast iterative shrinkage thresholding algorithm (FISTA) are proposed and investigated for accelerated iterative image reconstruction in CBCT. Methods: Algorithm acceleration was achieved by replacing the original gradient-descent step in the FISTAs by a subproblem that is solved by use of the ordered subset simultaneous algebraic reconstruction technique (OS-SART). Due to the preconditioning matrix adopted in the OS-SART method, two new weighted proximal problems were introduced and corresponding fast gradient projection-type algorithms were developed for solving them. We also provided efficient numerical implementations of the proposed algorithms that exploit the massive data parallelism of multiple graphics processing units. Results: The improved rates of convergence of the proposed algorithms were quantified in computer-simulation studies and by use of clinical projection data corresponding to an IGRT study. The accelerated FISTAs were shown to possess dramatically improved convergence properties as compared to the standard FISTAs. For example, the number of iterations to achieve a specified reconstruction error could be reduced by an order of magnitude. Volumetric images reconstructed from clinical data were produced in under 4 min. Conclusions: The FISTA achieves a quadratic convergence rate and can therefore potentially reduce the number of iterations required to produce an image of a specified image quality as compared to first-order methods. We have proposed and investigated

  18. Accelerated fast iterative shrinkage thresholding algorithms for sparsity-regularized cone-beam CT image reconstruction

    Science.gov (United States)

    Xu, Qiaofeng; Yang, Deshan; Tan, Jun; Sawatzky, Alex; Anastasio, Mark A.

    2016-01-01

    Purpose: The development of iterative image reconstruction algorithms for cone-beam computed tomography (CBCT) remains an active and important research area. Even with hardware acceleration, the overwhelming majority of the available 3D iterative algorithms that implement nonsmooth regularizers remain computationally burdensome and have not been translated for routine use in time-sensitive applications such as image-guided radiation therapy (IGRT). In this work, two variants of the fast iterative shrinkage thresholding algorithm (FISTA) are proposed and investigated for accelerated iterative image reconstruction in CBCT. Methods: Algorithm acceleration was achieved by replacing the original gradient-descent step in the FISTAs by a subproblem that is solved by use of the ordered subset simultaneous algebraic reconstruction technique (OS-SART). Due to the preconditioning matrix adopted in the OS-SART method, two new weighted proximal problems were introduced and corresponding fast gradient projection-type algorithms were developed for solving them. We also provided efficient numerical implementations of the proposed algorithms that exploit the massive data parallelism of multiple graphics processing units. Results: The improved rates of convergence of the proposed algorithms were quantified in computer-simulation studies and by use of clinical projection data corresponding to an IGRT study. The accelerated FISTAs were shown to possess dramatically improved convergence properties as compared to the standard FISTAs. For example, the number of iterations to achieve a specified reconstruction error could be reduced by an order of magnitude. Volumetric images reconstructed from clinical data were produced in under 4 min. Conclusions: The FISTA achieves a quadratic convergence rate and can therefore potentially reduce the number of iterations required to produce an image of a specified image quality as compared to first-order methods. We have proposed and investigated

  19. Two-Level Iteration Penalty Methods for the Navier-Stokes Equations with Friction Boundary Conditions

    Directory of Open Access Journals (Sweden)

    Yuan Li

    2013-01-01

    Full Text Available This paper presents two-level iteration penalty finite element methods to approximate the solution of the Navier-Stokes equations with friction boundary conditions. The basic idea is to solve the Navier-Stokes type variational inequality problem on a coarse mesh with mesh size H in combining with solving a Stokes, Oseen, or linearized Navier-Stokes type variational inequality problem for Stokes, Oseen, or Newton iteration on a fine mesh with mesh size h. The error estimate obtained in this paper shows that if H, h, and ε can be chosen appropriately, then these two-level iteration penalty methods are of the same convergence orders as the usual one-level iteration penalty method.

  20. Krylov iterative methods and synthetic acceleration for transport in binary statistical media

    International Nuclear Information System (INIS)

    Fichtl, Erin D.; Warsa, James S.; Prinja, Anil K.

    2009-01-01

    In particle transport applications there are numerous physical constructs in which heterogeneities are randomly distributed. The quantity of interest in these problems is the ensemble average of the flux, or the average of the flux over all possible material 'realizations.' The Levermore-Pomraning closure assumes Markovian mixing statistics and allows a closed, coupled system of equations to be written for the ensemble averages of the flux in each material. Generally, binary statistical mixtures are considered in which there are two (homogeneous) materials and corresponding coupled equations. The solution process is iterative, but convergence may be slow as either or both materials approach the diffusion and/or atomic mix limits. A three-part acceleration scheme is devised to expedite convergence, particularly in the atomic mix-diffusion limit where computation is extremely slow. The iteration is first divided into a series of 'inner' material and source iterations to attenuate the diffusion and atomic mix error modes separately. Secondly, atomic mix synthetic acceleration is applied to the inner material iteration and S 2 synthetic acceleration to the inner source iterations to offset the cost of doing several inner iterations per outer iteration. Finally, a Krylov iterative solver is wrapped around each iteration, inner and outer, to further expedite convergence. A spectral analysis is conducted and iteration counts and computing cost for the new two-step scheme are compared against those for a simple one-step iteration, to which a Krylov iterative method can also be applied.

  1. Adaptive Mesh Iteration Method for Trajectory Optimization Based on Hermite-Pseudospectral Direct Transcription

    Directory of Open Access Journals (Sweden)

    Humin Lei

    2017-01-01

    Full Text Available An adaptive mesh iteration method based on Hermite-Pseudospectral is described for trajectory optimization. The method uses the Legendre-Gauss-Lobatto points as interpolation points; then the state equations are approximated by Hermite interpolating polynomials. The method allows for changes in both number of mesh points and the number of mesh intervals and produces significantly smaller mesh sizes with a higher accuracy tolerance solution. The derived relative error estimate is then used to trade the number of mesh points with the number of mesh intervals. The adaptive mesh iteration method is applied successfully to the examples of trajectory optimization of Maneuverable Reentry Research Vehicle, and the simulation experiment results show that the adaptive mesh iteration method has many advantages.

  2. Implementation of GPU parallel equilibrium reconstruction for plasma control in EAST

    Energy Technology Data Exchange (ETDEWEB)

    Huang, Yao, E-mail: yaohuang@ipp.ac.cn [Institute of Plasma Physics, Chinese Academy of Sciences, Hefei (China); Xiao, B.J. [Institute of Plasma Physics, Chinese Academy of Sciences, Hefei (China); School of Nuclear Science & Technology, University of Science & Technology of China (China); Luo, Z.P.; Yuan, Q.P.; Pei, X.F. [Institute of Plasma Physics, Chinese Academy of Sciences, Hefei (China); Yue, X.N. [School of Nuclear Science & Technology, University of Science & Technology of China (China)

    2016-11-15

    Highlights: • We described parallel equilibrium reconstruction code P-EFIT running on GPU was integrated with EAST plasma control system. • Compared with RT-EFIT used in EAST, P-EFIT has better spatial resolution and full algorithm of EFIT per iteration. • With the data interface through RFM, 65 × 65 spatial grids P-EFIT can satisfy the accuracy and time feasibility requirements for plasma control. • Successful control using ISOFLUX/P-EFIT was established in the dedicated experiment during the EAST 2014 campaign. • This work is a stepping-stone towards versatile ISOFLUX/P-EFIT control, such as real-time equilibrium reconstruction with more diagnostics. - Abstract: Implementation of P-EFIT code for plasma control in EAST is described. P-EFIT is based on the EFIT framework, but built with the CUDA™ architecture to take advantage of massively parallel Graphical Processing Unit (GPU) cores to significantly accelerate the computation. 65 × 65 grid size P-EFIT can complete one reconstruction iteration in 300 μs, with one iteration strategy, it can satisfy the needs of real-time plasma shape control. Data interface between P-EFIT and PCS is realized and developed by transferring data through RFM. First application of P-EFIT to discharge control in EAST is described.

  3. Study on Parallel Processing for Efficient Flexible Multibody Analysis based on Subsystem Synthesis Method

    Energy Technology Data Exchange (ETDEWEB)

    Han, Jong-Boo; Song, Hajun; Kim, Sung-Soo [Chungnam Nat’l Univ., Daejeon (Korea, Republic of)

    2017-06-15

    Flexible multibody simulations are widely used in the industry to design mechanical systems. In flexible multibody dynamics, deformation coordinates are described either relatively in the body reference frame that is floating in the space or in the inertial reference frame. Moreover, these deformation coordinates are generated based on the discretization of the body according to the finite element approach. Therefore, the formulation of the flexible multibody system always deals with a huge number of degrees of freedom and the numerical solution methods require a substantial amount of computational time. Parallel computational methods are a solution for efficient computation. However, most of the parallel computational methods are focused on the efficient solution of large-sized linear equations. For multibody analysis, we need to develop an efficient formulation that could be suitable for parallel computation. In this paper, we developed a subsystem synthesis method for a flexible multibody system and proposed efficient parallel computational schemes based on the OpenMP API in order to achieve efficient computation. Simulations of a rotating blade system, which consists of three identical blades, were carried out with two different parallel computational schemes. Actual CPU times were measured to investigate the efficiency of the proposed parallel schemes.

  4. A parallel nearly implicit time-stepping scheme

    OpenAIRE

    Botchev, Mike A.; van der Vorst, Henk A.

    2001-01-01

    Across-the-space parallelism still remains the most mature, convenient and natural way to parallelize large scale problems. One of the major problems here is that implicit time stepping is often difficult to parallelize due to the structure of the system. Approximate implicit schemes have been suggested to circumvent the problem. These schemes have attractive stability properties and they are also very well parallelizable. The purpose of this article is to give an overall assessment of the pa...

  5. Discrete-Time Nonzero-Sum Games for Multiplayer Using Policy-Iteration-Based Adaptive Dynamic Programming Algorithms.

    Science.gov (United States)

    Zhang, Huaguang; Jiang, He; Luo, Chaomin; Xiao, Geyang

    2017-10-01

    In this paper, we investigate the nonzero-sum games for a class of discrete-time (DT) nonlinear systems by using a novel policy iteration (PI) adaptive dynamic programming (ADP) method. The main idea of our proposed PI scheme is to utilize the iterative ADP algorithm to obtain the iterative control policies, which not only ensure the system to achieve stability but also minimize the performance index function for each player. This paper integrates game theory, optimal control theory, and reinforcement learning technique to formulate and handle the DT nonzero-sum games for multiplayer. First, we design three actor-critic algorithms, an offline one and two online ones, for the PI scheme. Subsequently, neural networks are employed to implement these algorithms and the corresponding stability analysis is also provided via the Lyapunov theory. Finally, a numerical simulation example is presented to demonstrate the effectiveness of our proposed approach.

  6. Comparison of multihardware parallel implementations for a phase unwrapping algorithm

    Science.gov (United States)

    Hernandez-Lopez, Francisco Javier; Rivera, Mariano; Salazar-Garibay, Adan; Legarda-Sáenz, Ricardo

    2018-04-01

    Phase unwrapping is an important problem in the areas of optical metrology, synthetic aperture radar (SAR) image analysis, and magnetic resonance imaging (MRI) analysis. These images are becoming larger in size and, particularly, the availability and need for processing of SAR and MRI data have increased significantly with the acquisition of remote sensing data and the popularization of magnetic resonators in clinical diagnosis. Therefore, it is important to develop faster and accurate phase unwrapping algorithms. We propose a parallel multigrid algorithm of a phase unwrapping method named accumulation of residual maps, which builds on a serial algorithm that consists of the minimization of a cost function; minimization achieved by means of a serial Gauss-Seidel kind algorithm. Our algorithm also optimizes the original cost function, but unlike the original work, our algorithm is a parallel Jacobi class with alternated minimizations. This strategy is known as the chessboard type, where red pixels can be updated in parallel at same iteration since they are independent. Similarly, black pixels can be updated in parallel in an alternating iteration. We present parallel implementations of our algorithm for different parallel multicore architecture such as CPU-multicore, Xeon Phi coprocessor, and Nvidia graphics processing unit. In all the cases, we obtain a superior performance of our parallel algorithm when compared with the original serial version. In addition, we present a detailed comparative performance of the developed parallel versions.

  7. Backtracking-Based Iterative Regularization Method for Image Compressive Sensing Recovery

    Directory of Open Access Journals (Sweden)

    Lingjun Liu

    2017-01-01

    Full Text Available This paper presents a variant of the iterative shrinkage-thresholding (IST algorithm, called backtracking-based adaptive IST (BAIST, for image compressive sensing (CS reconstruction. For increasing iterations, IST usually yields a smoothing of the solution and runs into prematurity. To add back more details, the BAIST method backtracks to the previous noisy image using L2 norm minimization, i.e., minimizing the Euclidean distance between the current solution and the previous ones. Through this modification, the BAIST method achieves superior performance while maintaining the low complexity of IST-type methods. Also, BAIST takes a nonlocal regularization with an adaptive regularizor to automatically detect the sparsity level of an image. Experimental results show that our algorithm outperforms the original IST method and several excellent CS techniques.

  8. Development of an efficient real-time disruption predictor from scratch on JET and implications for ITER

    International Nuclear Information System (INIS)

    Dormido-Canto, S.; Ramírez, J.M.; Vega, J.; Moreno, R.; Pereira, A.; Murari, A.; López, J.M.

    2013-01-01

    Prediction of disruptions from scratch is an ITER-relevant topic. The first operations with the new ITER-like wall constitute a good opportunity to test the development of new predictors from scratch and the related methodologies. These methodologies have been based on the Advanced Predictor Of DISruptions (APODIS) architecture. APODIS is a real-time disruption predictor that is in operation in the JET real-time network. Balanced and unbalanced datasets are used to develop real-time predictors from scratch. The discharges are used in chronological order. Also, different criteria to decide when to re-train a predictor are discussed. The best results are obtained by applying a hybrid method (balanced/unbalanced datasets) for training and with the criterion of re-training after every missed alarm. The predictors are tested off-line with all the discharges (disruptive/non-disruptive) corresponding to the first three JET ITER-like wall campaigns. The results give a success rate of 93.8% and a false alarm rate of 2.8%. It should be considered that these results are obtained from models trained with no more than 42 disruptive discharges. (paper)

  9. Computational time analysis of the numerical solution of 3D electrostatic Poisson's equation

    Science.gov (United States)

    Kamboh, Shakeel Ahmed; Labadin, Jane; Rigit, Andrew Ragai Henri; Ling, Tech Chaw; Amur, Khuda Bux; Chaudhary, Muhammad Tayyab

    2015-05-01

    3D Poisson's equation is solved numerically to simulate the electric potential in a prototype design of electrohydrodynamic (EHD) ion-drag micropump. Finite difference method (FDM) is employed to discretize the governing equation. The system of linear equations resulting from FDM is solved iteratively by using the sequential Jacobi (SJ) and sequential Gauss-Seidel (SGS) methods, simulation results are also compared to examine the difference between the results. The main objective was to analyze the computational time required by both the methods with respect to different grid sizes and parallelize the Jacobi method to reduce the computational time. In common, the SGS method is faster than the SJ method but the data parallelism of Jacobi method may produce good speedup over SGS method. In this study, the feasibility of using parallel Jacobi (PJ) method is attempted in relation to SGS method. MATLAB Parallel/Distributed computing environment is used and a parallel code for SJ method is implemented. It was found that for small grid size the SGS method remains dominant over SJ method and PJ method while for large grid size both the sequential methods may take nearly too much processing time to converge. Yet, the PJ method reduces computational time to some extent for large grid sizes.

  10. Large-Scale Parallel Finite Element Analysis of the Stress Singular Problems

    International Nuclear Information System (INIS)

    Noriyuki Kushida; Hiroshi Okuda; Genki Yagawa

    2002-01-01

    In this paper, the convergence behavior of large-scale parallel finite element method for the stress singular problems was investigated. The convergence behavior of iterative solvers depends on the efficiency of the pre-conditioners. However, efficiency of pre-conditioners may be influenced by the domain decomposition that is necessary for parallel FEM. In this study the following results were obtained: Conjugate gradient method without preconditioning and the diagonal scaling preconditioned conjugate gradient method were not influenced by the domain decomposition as expected. symmetric successive over relaxation method preconditioned conjugate gradient method converged 6% faster as maximum if the stress singular area was contained in one sub-domain. (authors)

  11. Fast implementations of 3D PET reconstruction using vector and parallel programming techniques

    International Nuclear Information System (INIS)

    Guerrero, T.M.; Cherry, S.R.; Dahlbom, M.; Ricci, A.R.; Hoffman, E.J.

    1993-01-01

    Computationally intensive techniques that offer potential clinical use have arisen in nuclear medicine. Examples include iterative reconstruction, 3D PET data acquisition and reconstruction, and 3D image volume manipulation including image registration. One obstacle in achieving clinical acceptance of these techniques is the computational time required. This study focuses on methods to reduce the computation time for 3D PET reconstruction through the use of fast computer hardware, vector and parallel programming techniques, and algorithm optimization. The strengths and weaknesses of i860 microprocessor based workstation accelerator boards are investigated in implementations of 3D PET reconstruction

  12. Local Fractional Laplace Variational Iteration Method for Solving Linear Partial Differential Equations with Local Fractional Derivative

    Directory of Open Access Journals (Sweden)

    Ai-Min Yang

    2014-01-01

    Full Text Available The local fractional Laplace variational iteration method was applied to solve the linear local fractional partial differential equations. The local fractional Laplace variational iteration method is coupled by the local fractional variational iteration method and Laplace transform. The nondifferentiable approximate solutions are obtained and their graphs are also shown.

  13. Systematic test on fast time resolution parallel plate avalanche counter

    International Nuclear Information System (INIS)

    Chen Yu; Li Guangwu; Gu Xianbao; Chen Yanchao; Zhang Gang; Zhang Wenhui; Yan Guohong

    2011-01-01

    Systematic test on each detect unit of parallel plate avalanche counter (PPAC) used in the fission multi-parameter measurement was performed with a 241 Am α source to get the time resolution and position resolution. The detectors work at 600 Pa flowing isobutane and with-600 V on cathode. The time resolution was got by TOF method and the position resolution was got by delay line method. The time resolution of detect units is better than 400 ps, and the position resolution is 6 mm. The results show that the demand of measurement is fully covered. (authors)

  14. A Note on Using Partitioning Techniques for Solving Unconstrained Optimization Problems on Parallel Systems

    Directory of Open Access Journals (Sweden)

    Mehiddin Al-Baali

    2015-12-01

    Full Text Available We deal with the design of parallel algorithms by using variable partitioning techniques to solve nonlinear optimization problems. We propose an iterative solution method that is very efficient for separable functions, our scope being to discuss its performance for general functions. Experimental results on an illustrative example have suggested some useful modifications that, even though they improve the efficiency of our parallel method, leave some questions open for further investigation.

  15. News from ITER controls - a status report

    International Nuclear Information System (INIS)

    Wallander, A.; Abadie, L.; Di Maio, F.; Evrard, B.; Fourneron, J.M.; Gulati, H.; Hansalia, C.; Journeaux, J.Y.; Kim, C.; Klotz, W.D.; Mahajan, K.; Makijarvi, P; Matsumoto, Y.; Pande, S.; Simrock, S.; Stepanov, D.; Utzel, N.; Vergara, A.; Winter, A.; Yonekawa, I.

    2012-01-01

    Construction of ITER has started at the Cadarache site in southern France. The first buildings are taking shape and more than 60 % of the in-kind procurement has been committed by the seven ITER member states (China, Europe, India, Japan, Korea, Russia and United States). The design and manufacturing of the main components of the machine is now underway all over the world. Each of these components comes with a local control system, which must be integrated in the central control system. The control group at ITER has developed two products to facilitate it; the plant control design handbook (PCDH) and the control, data access and communication (CODAC) core system. PCDH is a document which prescribes the technologies and methods to be used in developing local control systems and sets the rules applicable to the in-kind procurements. CODAC core system is a software package, distributed to all in-kind procurement developers, which implements the PCDH and facilitates the compliance of the local control system. In parallel, the ITER control group is proceeding with the design of the central control system to allow fully integrated and automated operation of ITER. In this paper we report on the progress of the design and technology choices and we discuss justifications of those choices. We also report on the results of some pilot projects aimed at validating the design and technologies. (authors)

  16. Solution of Nonlinear Partial Differential Equations by New Laplace Variational Iteration Method

    Directory of Open Access Journals (Sweden)

    Eman M. A. Hilal

    2014-01-01

    Full Text Available The aim of this study is to give a good strategy for solving some linear and nonlinear partial differential equations in engineering and physics fields, by combining Laplace transform and the modified variational iteration method. This method is based on the variational iteration method, Laplace transforms, and convolution integral, introducing an alternative Laplace correction functional and expressing the integral as a convolution. Some examples in physical engineering are provided to illustrate the simplicity and reliability of this method. The solutions of these examples are contingent only on the initial conditions.

  17. A task parallel implementation of fast multipole methods

    KAUST Repository

    Taura, Kenjiro; Nakashima, Jun; Yokota, Rio; Maruyama, Naoya

    2012-01-01

    This paper describes a task parallel implementation of ExaFMM, an open source implementation of fast multipole methods (FMM), using a lightweight task parallel library MassiveThreads. Although there have been many attempts on parallelizing FMM

  18. Computational acceleration for MR image reconstruction in partially parallel imaging.

    Science.gov (United States)

    Ye, Xiaojing; Chen, Yunmei; Huang, Feng

    2011-05-01

    In this paper, we present a fast numerical algorithm for solving total variation and l(1) (TVL1) based image reconstruction with application in partially parallel magnetic resonance imaging. Our algorithm uses variable splitting method to reduce computational cost. Moreover, the Barzilai-Borwein step size selection method is adopted in our algorithm for much faster convergence. Experimental results on clinical partially parallel imaging data demonstrate that the proposed algorithm requires much fewer iterations and/or less computational cost than recently developed operator splitting and Bregman operator splitting methods, which can deal with a general sensing matrix in reconstruction framework, to get similar or even better quality of reconstructed images.

  19. An Automated Baseline Correction Method Based on Iterative Morphological Operations.

    Science.gov (United States)

    Chen, Yunliang; Dai, Liankui

    2018-05-01

    Raman spectra usually suffer from baseline drift caused by fluorescence or other reasons. Therefore, baseline correction is a necessary and crucial step that must be performed before subsequent processing and analysis of Raman spectra. An automated baseline correction method based on iterative morphological operations is proposed in this work. The method can adaptively determine the structuring element first and then gradually remove the spectral peaks during iteration to get an estimated baseline. Experiments on simulated data and real-world Raman data show that the proposed method is accurate, fast, and flexible for handling different kinds of baselines in various practical situations. The comparison of the proposed method with some state-of-the-art baseline correction methods demonstrates its advantages over the existing methods in terms of accuracy, adaptability, and flexibility. Although only Raman spectra are investigated in this paper, the proposed method is hopefully to be used for the baseline correction of other analytical instrumental signals, such as IR spectra and chromatograms.

  20. Three dimensional iterative beam propagation method for optical waveguide devices

    Science.gov (United States)

    Ma, Changbao; Van Keuren, Edward

    2006-10-01

    The finite difference beam propagation method (FD-BPM) is an effective model for simulating a wide range of optical waveguide structures. The classical FD-BPMs are based on the Crank-Nicholson scheme, and in tridiagonal form can be solved using the Thomas method. We present a different type of algorithm for 3-D structures. In this algorithm, the wave equation is formulated into a large sparse matrix equation which can be solved using iterative methods. The simulation window shifting scheme and threshold technique introduced in our earlier work are utilized to overcome the convergence problem of iterative methods for large sparse matrix equation and wide-angle simulations. This method enables us to develop higher-order 3-D wide-angle (WA-) BPMs based on Pade approximant operators and the multistep method, which are commonly used in WA-BPMs for 2-D structures. Simulations using the new methods will be compared to the analytical results to assure its effectiveness and applicability.

  1. Variation Iteration Method for The Approximate Solution of Nonlinear ...

    African Journals Online (AJOL)

    In this study, we considered the numerical solution of the nonlinear Burgers equation using the Variational Iteration Method (VIM). The method seeks to examine the convergence of solutions of the Burgers equation at the expense of the parameters x and t of which the amount of errors depends. Numerical experimentation ...

  2. Parallel processing method for high-speed real time digital pulse processing for gamma-ray spectroscopy

    International Nuclear Information System (INIS)

    Fernandes, A.M.; Pereira, R.C.; Sousa, J.; Neto, A.; Carvalho, P.; Batista, A.J.N.; Carvalho, B.B.; Varandas, C.A.F.; Tardocchi, M.; Gorini, G.

    2010-01-01

    A new data acquisition (DAQ) system was developed to fulfil the requirements of the gamma-ray spectrometer (GRS) JET-EP2 (joint European Torus enhancement project 2), providing high-resolution spectroscopy at very high-count rate (up to few MHz). The system is based on the Advanced Telecommunications Computing Architecture TM (ATCA TM ) and includes a transient record (TR) module with 8 channels of 14 bits resolution at 400 MSamples/s (MSPS) sampling rate, 4 GB of local memory, and 2 field programmable gate array (FPGA) able to perform real time algorithms for data reduction and digital pulse processing. Although at 400 MSPS only fast programmable devices such as FPGAs can be used either for data processing and data transfer, FPGA resources also present speed limitation at some specific tasks, leading to an unavoidable data lost when demanding algorithms are applied. To overcome this problem and foreseeing an increase of the algorithm complexity, a new digital parallel filter was developed, aiming to perform real time pulse processing in the FPGAs of the TR module at the presented sampling rate. The filter is based on the conventional digital time-invariant trapezoidal shaper operating with parallelized data while performing pulse height analysis (PHA) and pile up rejection (PUR). The incoming sampled data is successively parallelized and fed into the processing algorithm block at one fourth of the sampling rate. The following data processing and data transfer is also performed at one fourth of the sampling rate. The algorithm based on data parallelization technique was implemented and tested at JET facilities, where a spectrum was obtained. Attending to the observed results, the PHA algorithm will be improved by implementing the pulse pile up discrimination.

  3. Efficient methods for time-absorption (α) eigenvalue calculations

    International Nuclear Information System (INIS)

    Hill, T.R.

    1983-01-01

    The time-absorption eigenvalue (α) calculation is one of the options found in most discrete-ordinates transport codes. Several methods have been developed at Los Alamos to improve the efficiency of this calculation. Two procedures, based on coarse-mesh rebalance, to accelerate the α eigenvalue search are derived. A hybrid scheme to automatically choose the more-effective rebalance method is described. The α rebalance scheme permits some simple modifications to the iteration strategy that eliminates many unnecessary calculations required in the standard search procedure. For several fast supercritical test problems, these methods resulted in convergence with one-fifth the number of iterations required for the conventional eigenvalue search procedure

  4. Numerical method for solving the three-dimensional time-dependent neutron diffusion equation

    International Nuclear Information System (INIS)

    Khaled, S.M.; Szatmary, Z.

    2005-01-01

    A numerical time-implicit method has been developed for solving the coupled three-dimensional time-dependent multi-group neutron diffusion and delayed neutron precursor equations. The numerical stability of the implicit computation scheme and the convergence of the iterative associated processes have been evaluated. The computational scheme requires the solution of large linear systems at each time step. For this purpose, the point over-relaxation Gauss-Seidel method was chosen. A new scheme was introduced instead of the usual source iteration scheme. (author)

  5. Virtual fringe projection system with nonparallel illumination based on iteration

    International Nuclear Information System (INIS)

    Zhou, Duo; Wang, Zhangying; Gao, Nan; Zhang, Zonghua; Jiang, Xiangqian

    2017-01-01

    Fringe projection profilometry has been widely applied in many fields. To set up an ideal measuring system, a virtual fringe projection technique has been studied to assist in the design of hardware configurations. However, existing virtual fringe projection systems use parallel illumination and have a fixed optical framework. This paper presents a virtual fringe projection system with nonparallel illumination. Using an iterative method to calculate intersection points between rays and reference planes or object surfaces, the proposed system can simulate projected fringe patterns and captured images. A new explicit calibration method has been presented to validate the precision of the system. Simulated results indicate that the proposed iterative method outperforms previous systems. Our virtual system can be applied to error analysis, algorithm optimization, and help operators to find ideal system parameter settings for actual measurements. (paper)

  6. Iterative Decoding of Concatenated Codes: A Tutorial

    Directory of Open Access Journals (Sweden)

    Phillip A. Regalia

    2005-05-01

    Full Text Available The turbo decoding algorithm of a decade ago constituted a milestone in error-correction coding for digital communications, and has inspired extensions to generalized receiver topologies, including turbo equalization, turbo synchronization, and turbo CDMA, among others. Despite an accrued understanding of iterative decoding over the years, the “turbo principle” remains elusive to master analytically, thereby inciting interest from researchers outside the communications domain. In this spirit, we develop a tutorial presentation of iterative decoding for parallel and serial concatenated codes, in terms hopefully accessible to a broader audience. We motivate iterative decoding as a computationally tractable attempt to approach maximum-likelihood decoding, and characterize fixed points in terms of a “consensus” property between constituent decoders. We review how the decoding algorithm for both parallel and serial concatenated codes coincides with an alternating projection algorithm, which allows one to identify conditions under which the algorithm indeed converges to a maximum-likelihood solution, in terms of particular likelihood functions factoring into the product of their marginals. The presentation emphasizes a common framework applicable to both parallel and serial concatenated codes.

  7. Acceleration of iterative tomographic reconstruction using graphics processors

    International Nuclear Information System (INIS)

    Belzunce, M.A.; Osorio, A.; Verrastro, C.A.

    2009-01-01

    Using iterative algorithms for image reconstruction in 3 D Positron Emission Tomography has shown to produce images with better quality than analytical methods. How ever, these algorithms are computationally expensive. New Graphic Processor Units (GPU) provides high performance at low cost and also programming tools that make possible to execute parallel algorithms easily in scientific applications. In this work, we try to achieve an acceleration of image reconstruction algorithms in 3 D PET by using a GPU. A parallel implementation of the algorithm ML-EM 3 D was developed using Siddon algorithm as Projector and Back-projector. Results show that accelerations of more than one order of magnitude can be achieved, keeping similar image quality. (author)

  8. The General Iterative Methods for Asymptotically Nonexpansive Semigroups in Banach Spaces

    Directory of Open Access Journals (Sweden)

    Rabian Wangkeeree

    2012-01-01

    Full Text Available We introduce the general iterative methods for finding a common fixed point of asymptotically nonexpansive semigroups which is a unique solution of some variational inequalities. We prove the strong convergence theorems of such iterative scheme in a reflexive Banach space which admits a weakly continuous duality mapping. The main result extends various results existing in the current literature.

  9. An iterative method to reconstruct the refractive index of a medium from time-of-flight measurements

    Science.gov (United States)

    Schröder, Udo; Schuster, Thomas

    2016-08-01

    The article deals with a classical inverse problem: the computation of the refractive index of a medium from ultrasound time-of-flight measurements. This problem is very popular in seismics but also for tomographic problems in inhomogeneous media. For example ultrasound vector field tomography needs a priori knowledge of the sound speed. According to Fermat’s principle ultrasound signals travel along geodesic curves of a Riemannian metric which is associated with the refractive index. The inverse problem thus consists of determining the index of refraction from integrals along geodesics curves associated with the integrand leading to a nonlinear problem. In this article we describe a numerical solver for this problem scheme based on an iterative minimization method for an appropriate Tikhonov functional. The outcome of the method is a stable approximation of the sought index of refraction as well as a corresponding set of geodesic curves. We prove some analytical convergence results for this method and demonstrate its performance by means of several numerical experiments. Another novelty in this article is the explicit representation of the backprojection operator for the ray transform in Riemannian geometry and its numerical realization relying on a corresponding phase function that is determined by the metric. This gives a natural extension of the conventional backprojection from 2D computerized tomography to inhomogeneous geometries. The authors dedicate this article to Prof Todd Quinto on the occasion of his 65th birthday.

  10. SU-D-206-03: Segmentation Assisted Fast Iterative Reconstruction Method for Cone-Beam CT

    International Nuclear Information System (INIS)

    Wu, P; Mao, T; Gong, S; Wang, J; Niu, T; Sheng, K; Xie, Y

    2016-01-01

    Purpose: Total Variation (TV) based iterative reconstruction (IR) methods enable accurate CT image reconstruction from low-dose measurements with sparse projection acquisition, due to the sparsifiable feature of most CT images using gradient operator. However, conventional solutions require large amount of iterations to generate a decent reconstructed image. One major reason is that the expected piecewise constant property is not taken into consideration at the optimization starting point. In this work, we propose an iterative reconstruction method for cone-beam CT (CBCT) using image segmentation to guide the optimization path more efficiently on the regularization term at the beginning of the optimization trajectory. Methods: Our method applies general knowledge that one tissue component in the CT image contains relatively uniform distribution of CT number. This general knowledge is incorporated into the proposed reconstruction using image segmentation technique to generate the piecewise constant template on the first-pass low-quality CT image reconstructed using analytical algorithm. The template image is applied as an initial value into the optimization process. Results: The proposed method is evaluated on the Shepp-Logan phantom of low and high noise levels, and a head patient. The number of iterations is reduced by overall 40%. Moreover, our proposed method tends to generate a smoother reconstructed image with the same TV value. Conclusion: We propose a computationally efficient iterative reconstruction method for CBCT imaging. Our method achieves a better optimization trajectory and a faster convergence behavior. It does not rely on prior information and can be readily incorporated into existing iterative reconstruction framework. Our method is thus practical and attractive as a general solution to CBCT iterative reconstruction. This work is supported by the Zhejiang Provincial Natural Science Foundation of China (Grant No. LR16F010001), National High-tech R

  11. SU-D-206-03: Segmentation Assisted Fast Iterative Reconstruction Method for Cone-Beam CT

    Energy Technology Data Exchange (ETDEWEB)

    Wu, P; Mao, T; Gong, S; Wang, J; Niu, T [Sir Run Run Shaw Hospital, Zhejiang University School of Medicine, Institute of Translational Medicine, Zhejiang University, Hangzhou, Zhejiang (China); Sheng, K [Department of Radiation Oncology, University of California, Los Angeles, Los Angeles, CA (United States); Xie, Y [Institute of Biomedical and Health Engineering, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong (China)

    2016-06-15

    Purpose: Total Variation (TV) based iterative reconstruction (IR) methods enable accurate CT image reconstruction from low-dose measurements with sparse projection acquisition, due to the sparsifiable feature of most CT images using gradient operator. However, conventional solutions require large amount of iterations to generate a decent reconstructed image. One major reason is that the expected piecewise constant property is not taken into consideration at the optimization starting point. In this work, we propose an iterative reconstruction method for cone-beam CT (CBCT) using image segmentation to guide the optimization path more efficiently on the regularization term at the beginning of the optimization trajectory. Methods: Our method applies general knowledge that one tissue component in the CT image contains relatively uniform distribution of CT number. This general knowledge is incorporated into the proposed reconstruction using image segmentation technique to generate the piecewise constant template on the first-pass low-quality CT image reconstructed using analytical algorithm. The template image is applied as an initial value into the optimization process. Results: The proposed method is evaluated on the Shepp-Logan phantom of low and high noise levels, and a head patient. The number of iterations is reduced by overall 40%. Moreover, our proposed method tends to generate a smoother reconstructed image with the same TV value. Conclusion: We propose a computationally efficient iterative reconstruction method for CBCT imaging. Our method achieves a better optimization trajectory and a faster convergence behavior. It does not rely on prior information and can be readily incorporated into existing iterative reconstruction framework. Our method is thus practical and attractive as a general solution to CBCT iterative reconstruction. This work is supported by the Zhejiang Provincial Natural Science Foundation of China (Grant No. LR16F010001), National High-tech R

  12. Speeding up predictive electromagnetic simulations for ITER application

    International Nuclear Information System (INIS)

    Alekseev, A.B.; Amoskov, V.M.; Bazarov, A.M.; Belov, A.V.; Belyakov, V.A.; Gapionok, E.I.; Gornikel, I.V.; Gribov, Yu. V.; Kukhtin, V.P.; Lamzin, E.A.; Sytchevsky, S.E.

    2017-01-01

    Highlights: • A general concept of engineering EM simulator for tokamak application is proposed. • An algorithm is based on influence functions and superposition principle. • The software works with extensive databases and offers parallel processing. • The simulator allows us to obtain the solution hundreds times faster. - Abstract: The paper presents an attempt to proceed to a general concept of software environment for fast and consistent multi-task simulation of EM transients (engineering simulator for tokamak applications). As an example, the ITER tokamak is taken to introduce a computational technique. The strategy exploits parallel processing with optimized simulation algorithms based on using of influence functions and superposition principle to take full advantage of parallelism. The software has been tested on a multi-core supercomputer. The results were compared with data obtained in TYPHOON computations. A discrepancy was found to be below 0.4%. The computation cost for the simulator is proportional to the number of observation points. An average computation time with the simulator is found to be by hundreds times less than the time required to solve numerically a relevant system of differential equations for known software tools.

  13. Speeding up predictive electromagnetic simulations for ITER application

    Energy Technology Data Exchange (ETDEWEB)

    Alekseev, A.B. [ITER Organization, Route de Vinon sur Verdon, 13067 St. Paul Lez Durance Cedex (France); Amoskov, V.M. [JSC “NIIEFA”, Doroga na Metallostroy 3, St. Petersburg, 196641 (Russian Federation); Bazarov, A.M., E-mail: alexander.bazarov@gmail.com [JSC “NIIEFA”, Doroga na Metallostroy 3, St. Petersburg, 196641 (Russian Federation); Belov, A.V. [JSC “NIIEFA”, Doroga na Metallostroy 3, St. Petersburg, 196641 (Russian Federation); Belyakov, V.A. [JSC “NIIEFA”, Doroga na Metallostroy 3, St. Petersburg, 196641 (Russian Federation); St. Petersburg State University, 7/9 Universitetskaya Embankment, St. Petersburg, 199034 (Russian Federation); Gapionok, E.I. [JSC “NIIEFA”, Doroga na Metallostroy 3, St. Petersburg, 196641 (Russian Federation); Gornikel, I.V. [Alphysica GmbH, Unterreut, 6, D-76135, Karlsruhe (Germany); Gribov, Yu. V. [ITER Organization, Route de Vinon sur Verdon, 13067 St. Paul Lez Durance Cedex (France); Kukhtin, V.P.; Lamzin, E.A. [JSC “NIIEFA”, Doroga na Metallostroy 3, St. Petersburg, 196641 (Russian Federation); Sytchevsky, S.E. [JSC “NIIEFA”, Doroga na Metallostroy 3, St. Petersburg, 196641 (Russian Federation); St. Petersburg State University, 7/9 Universitetskaya Embankment, St. Petersburg, 199034 (Russian Federation)

    2017-05-15

    Highlights: • A general concept of engineering EM simulator for tokamak application is proposed. • An algorithm is based on influence functions and superposition principle. • The software works with extensive databases and offers parallel processing. • The simulator allows us to obtain the solution hundreds times faster. - Abstract: The paper presents an attempt to proceed to a general concept of software environment for fast and consistent multi-task simulation of EM transients (engineering simulator for tokamak applications). As an example, the ITER tokamak is taken to introduce a computational technique. The strategy exploits parallel processing with optimized simulation algorithms based on using of influence functions and superposition principle to take full advantage of parallelism. The software has been tested on a multi-core supercomputer. The results were compared with data obtained in TYPHOON computations. A discrepancy was found to be below 0.4%. The computation cost for the simulator is proportional to the number of observation points. An average computation time with the simulator is found to be by hundreds times less than the time required to solve numerically a relevant system of differential equations for known software tools.

  14. Characterizing and Mitigating Work Time Inflation in Task Parallel Programs

    Directory of Open Access Journals (Sweden)

    Stephen L. Olivier

    2013-01-01

    Full Text Available Task parallelism raises the level of abstraction in shared memory parallel programming to simplify the development of complex applications. However, task parallel applications can exhibit poor performance due to thread idleness, scheduling overheads, and work time inflation – additional time spent by threads in a multithreaded computation beyond the time required to perform the same work in a sequential computation. We identify the contributions of each factor to lost efficiency in various task parallel OpenMP applications and diagnose the causes of work time inflation in those applications. Increased data access latency can cause significant work time inflation in NUMA systems. Our locality framework for task parallel OpenMP programs mitigates this cause of work time inflation. Our extensions to the Qthreads library demonstrate that locality-aware scheduling can improve performance up to 3X compared to the Intel OpenMP task scheduler.

  15. A New Iterative Method for Equilibrium Problems and Fixed Point Problems

    Directory of Open Access Journals (Sweden)

    Abdul Latif

    2013-01-01

    Full Text Available Introducing a new iterative method, we study the existence of a common element of the set of solutions of equilibrium problems for a family of monotone, Lipschitz-type continuous mappings and the sets of fixed points of two nonexpansive semigroups in a real Hilbert space. We establish strong convergence theorems of the new iterative method for the solution of the variational inequality problem which is the optimality condition for the minimization problem. Our results improve and generalize the corresponding recent results of Anh (2012, Cianciaruso et al. (2010, and many others.

  16. LHCD and coupling experiments with an ITER-like PAM launcher on the FTU tokamak

    International Nuclear Information System (INIS)

    Pericoli Ridolfini, V.; Apicella, M.L.; Barbato, E.; Buratti, P.; Calabro, G.; Cardinali, A.; Mirizzi, F.; Panaccione, L.; Podda, S.; Tuccillo, A.A.; Bibet, Ph.; Granucci, G.; Sozzi, C.

    2005-01-01

    Successful experimental tests on a PAM (passive active multijunction) prototype antenna for the Lower Hybrid (LH) waves similar to that foreseen for ITER have been carried out on FTU. The power level routinely achieved without any fault in the transmission lines for the maximum time allowed by the LH power plant, i.e. 0.9 s, is 250 kW versus a design value of 270. It corresponds to 50 MW/m 2 through the ITER antenna active area if it is scaled for the different LH frequencies (5 GHz in ITER, 8 GHz in FTU) and it is more than 1.4 times the goal of the ITER design (33 MW/m 2 ). The test results validate the main features indicated by the simulation codes, concerning the power handling, the coupling and the launched N parallel spectrum. The power reflection coefficient R c is always ≤ 2.5%, once the PAM launcher has been properly conditioned, even with the grill mouth retracted 2 mm inside the port shadow, with density in front of the launcher very close or even lower than the cut-off value. The current drive efficiency is comparable to a conventional grill in similar conditions, once the lower directivity is taken into account. The flexibility in the N parallel spectrum is confirmed by the HXR and ECE spectra. Conditioning the PAM to operate at the ITER equivalent power level has required only one day of RF operation, without a previous baking of the waveguides. (author)

  17. Approximate inverse preconditioning of iterative methods for nonsymmetric linear systems

    Energy Technology Data Exchange (ETDEWEB)

    Benzi, M. [Universita di Bologna (Italy); Tuma, M. [Inst. of Computer Sciences, Prague (Czech Republic)

    1996-12-31

    A method for computing an incomplete factorization of the inverse of a nonsymmetric matrix A is presented. The resulting factorized sparse approximate inverse is used as a preconditioner in the iterative solution of Ax = b by Krylov subspace methods.

  18. Comments on new iterative methods for solving linear systems

    Directory of Open Access Journals (Sweden)

    Wang Ke

    2017-06-01

    Full Text Available Some new iterative methods were presented by Du, Zheng and Wang for solving linear systems in [3], where it is shown that the new methods, comparing to the classical Jacobi or Gauss-Seidel method, can be applied to more systems and have faster convergence. This note shows that their methods are suitable for more matrices than positive matrices which the authors suggested through further analysis and numerical examples.

  19. A method of paralleling computer calculation for two-dimensional kinetic plasma model

    International Nuclear Information System (INIS)

    Brazhnik, V.A.; Demchenko, V.V.; Dem'yanov, V.G.; D'yakov, V.E.; Ol'shanskij, V.V.; Panchenko, V.I.

    1987-01-01

    A method for parallel computer calculation and OSIRIS program complex realizing it and designed for numerical plasma simulation by the macroparticle method are described. The calculation can be carried out either with one or simultaneously with two computers BESM-6, that is provided by some package of interacting programs functioning in every computer. Program interaction in every computer is based on event techniques realized in OS DISPAK. Parallel computer calculation with two BESM-6 computers allows to accelerate the computation 1.5 times

  20. A fast method to emulate an iterative POCS image reconstruction algorithm.

    Science.gov (United States)

    Zeng, Gengsheng L

    2017-10-01

    Iterative image reconstruction algorithms are commonly used to optimize an objective function, especially when the objective function is nonquadratic. Generally speaking, the iterative algorithms are computationally inefficient. This paper presents a fast algorithm that has one backprojection and no forward projection. This paper derives a new method to solve an optimization problem. The nonquadratic constraint, for example, an edge-preserving denoising constraint is implemented as a nonlinear filter. The algorithm is derived based on the POCS (projections onto projections onto convex sets) approach. A windowed FBP (filtered backprojection) algorithm enforces the data fidelity. An iterative procedure, divided into segments, enforces edge-enhancement denoising. Each segment performs nonlinear filtering. The derived iterative algorithm is computationally efficient. It contains only one backprojection and no forward projection. Low-dose CT data are used for algorithm feasibility studies. The nonlinearity is implemented as an edge-enhancing noise-smoothing filter. The patient studies results demonstrate its effectiveness in processing low-dose x ray CT data. This fast algorithm can be used to replace many iterative algorithms. © 2017 American Association of Physicists in Medicine.

  1. Iterative solution of general sparse linear systems on clusters of workstations

    Energy Technology Data Exchange (ETDEWEB)

    Lo, Gen-Ching; Saad, Y. [Univ. of Minnesota, Minneapolis, MN (United States)

    1996-12-31

    Solving sparse irregularly structured linear systems on parallel platforms poses several challenges. First, sparsity makes it difficult to exploit data locality, whether in a distributed or shared memory environment. A second, perhaps more serious challenge, is to find efficient ways to precondition the system. Preconditioning techniques which have a large degree of parallelism, such as multicolor SSOR, often have a slower rate of convergence than their sequential counterparts. Finally, a number of other computational kernels such as inner products could ruin any gains gained from parallel speed-ups, and this is especially true on workstation clusters where start-up times may be high. In this paper we discuss these issues and report on our experience with PSPARSLIB, an on-going project for building a library of parallel iterative sparse matrix solvers.

  2. Real-time trajectory optimization on parallel processors

    Science.gov (United States)

    Psiaki, Mark L.

    1993-01-01

    A parallel algorithm has been developed for rapidly solving trajectory optimization problems. The goal of the work has been to develop an algorithm that is suitable to do real-time, on-line optimal guidance through repeated solution of a trajectory optimization problem. The algorithm has been developed on an INTEL iPSC/860 message passing parallel processor. It uses a zero-order-hold discretization of a continuous-time problem and solves the resulting nonlinear programming problem using a custom-designed augmented Lagrangian nonlinear programming algorithm. The algorithm achieves parallelism of function, derivative, and search direction calculations through the principle of domain decomposition applied along the time axis. It has been encoded and tested on 3 example problems, the Goddard problem, the acceleration-limited, planar minimum-time to the origin problem, and a National Aerospace Plane minimum-fuel ascent guidance problem. Execution times as fast as 118 sec of wall clock time have been achieved for a 128-stage Goddard problem solved on 32 processors. A 32-stage minimum-time problem has been solved in 151 sec on 32 processors. A 32-stage National Aerospace Plane problem required 2 hours when solved on 32 processors. A speed-up factor of 7.2 has been achieved by using 32-nodes instead of 1-node to solve a 64-stage Goddard problem.

  3. Towards the conceptual design of the ITER real-time plasma control system

    International Nuclear Information System (INIS)

    Winter, A.; Makijarvi, P.; Simrock, S.; Snipes, J.A.; Wallander, A.; Zabeo, L.

    2014-01-01

    Highlights: • We describe the main control areas and interfaces for the ITER real-time plasma control system and the current state of their design. • An overview is given for the implementation strategy for the plasma control system as part of the ITER control, data access and communication system. • Current efforts on the creation of simulation and development tools are presented. - Abstract: ITER will be the world's largest magnetic confinement tokamak fusion device and is currently under construction in southern France. The ITER Plasma Control System (PCS) is a fundamental component of the ITER Control, Data Access and Communication system (CODAC). It will control the evolution of all plasma parameters that are necessary to operate ITER throughout all phases of the discharge. The design and implementation of the PCS poses a number of unique challenges. The timescales of phenomena to be controlled spans three orders of magnitude, ranging from a few milliseconds to seconds. Novel control schemes, which have not been implemented at present-day machines need to be developed, and control schemes that are only done as demonstration experiments today will have to become routine. In addition, advances in computing technology and available physics models make the implementation of real-time or faster-than-real-time predictive calculations to forecast and subsequently to avoid disruptions or undesired plasma regimes feasible. This requires the PCS design to be adaptable in real-time to the results of these forecasting algorithms. A further novel feature is a sophisticated event handling system, which provides a means to deal with plasma related events (such as MHD instabilities or L-H transitions) or component failure. Finally, the schedule for design and implementation poses another challenge. The beginning of ITER operation will be in late 2020, but the conceptual design activity of the PCS has already commenced as required by the on-going development of

  4. Robust Monotonically Convergent Iterative Learning Control for Discrete-Time Systems via Generalized KYP Lemma

    Directory of Open Access Journals (Sweden)

    Jian Ding

    2014-01-01

    Full Text Available This paper addresses the problem of P-type iterative learning control for a class of multiple-input multiple-output linear discrete-time systems, whose aim is to develop robust monotonically convergent control law design over a finite frequency range. It is shown that the 2 D iterative learning control processes can be taken as 1 D state space model regardless of relative degree. With the generalized Kalman-Yakubovich-Popov lemma applied, it is feasible to describe the monotonically convergent conditions with the help of linear matrix inequality technique and to develop formulas for the control gain matrices design. An extension to robust control law design against systems with structured and polytopic-type uncertainties is also considered. Two numerical examples are provided to validate the feasibility and effectiveness of the proposed method.

  5. Parallel time domain solvers for electrically large transient scattering problems

    KAUST Repository

    Liu, Yang; Yucel, Abdulkadir; Bagcý , Hakan; Michielssen, Eric

    2014-01-01

    scattering from perfect electrically conducting objects are obtained by enforcing electric field boundary conditions and implicitly time advance electric surface current densities by iteratively solving sparse systems of equations at all time steps. Contrary

  6. Boosting iterative stochastic ensemble method for nonlinear calibration of subsurface flow models

    KAUST Repository

    Elsheikh, Ahmed H.

    2013-06-01

    A novel parameter estimation algorithm is proposed. The inverse problem is formulated as a sequential data integration problem in which Gaussian process regression (GPR) is used to integrate the prior knowledge (static data). The search space is further parameterized using Karhunen-Loève expansion to build a set of basis functions that spans the search space. Optimal weights of the reduced basis functions are estimated by an iterative stochastic ensemble method (ISEM). ISEM employs directional derivatives within a Gauss-Newton iteration for efficient gradient estimation. The resulting update equation relies on the inverse of the output covariance matrix which is rank deficient.In the proposed algorithm we use an iterative regularization based on the ℓ2 Boosting algorithm. ℓ2 Boosting iteratively fits the residual and the amount of regularization is controlled by the number of iterations. A termination criteria based on Akaike information criterion (AIC) is utilized. This regularization method is very attractive in terms of performance and simplicity of implementation. The proposed algorithm combining ISEM and ℓ2 Boosting is evaluated on several nonlinear subsurface flow parameter estimation problems. The efficiency of the proposed algorithm is demonstrated by the small size of utilized ensembles and in terms of error convergence rates. © 2013 Elsevier B.V.

  7. Algebraic multigrid preconditioning within parallel finite-element solvers for 3-D electromagnetic modelling problems in geophysics

    Science.gov (United States)

    Koldan, Jelena; Puzyrev, Vladimir; de la Puente, Josep; Houzeaux, Guillaume; Cela, José María

    2014-06-01

    We present an elaborate preconditioning scheme for Krylov subspace methods which has been developed to improve the performance and reduce the execution time of parallel node-based finite-element (FE) solvers for 3-D electromagnetic (EM) numerical modelling in exploration geophysics. This new preconditioner is based on algebraic multigrid (AMG) that uses different basic relaxation methods, such as Jacobi, symmetric successive over-relaxation (SSOR) and Gauss-Seidel, as smoothers and the wave front algorithm to create groups, which are used for a coarse-level generation. We have implemented and tested this new preconditioner within our parallel nodal FE solver for 3-D forward problems in EM induction geophysics. We have performed series of experiments for several models with different conductivity structures and characteristics to test the performance of our AMG preconditioning technique when combined with biconjugate gradient stabilized method. The results have shown that, the more challenging the problem is in terms of conductivity contrasts, ratio between the sizes of grid elements and/or frequency, the more benefit is obtained by using this preconditioner. Compared to other preconditioning schemes, such as diagonal, SSOR and truncated approximate inverse, the AMG preconditioner greatly improves the convergence of the iterative solver for all tested models. Also, when it comes to cases in which other preconditioners succeed to converge to a desired precision, AMG is able to considerably reduce the total execution time of the forward-problem code-up to an order of magnitude. Furthermore, the tests have confirmed that our AMG scheme ensures grid-independent rate of convergence, as well as improvement in convergence regardless of how big local mesh refinements are. In addition, AMG is designed to be a black-box preconditioner, which makes it easy to use and combine with different iterative methods. Finally, it has proved to be very practical and efficient in the

  8. Development of GPU Based Parallel Computing Module for Solving Pressure Equation in the CUPID Component Thermo-Fluid Analysis Code

    International Nuclear Information System (INIS)

    Lee, Jin Pyo; Joo, Han Gyu

    2010-01-01

    In the thermo-fluid analysis code named CUPID, the linear system of pressure equations must be solved in each iteration step. The time for repeatedly solving the linear system can be quite significant because large sparse matrices of Rank more than 50,000 are involved and the diagonal dominance of the system is hardly hold. Therefore parallelization of the linear system solver is essential to reduce the computing time. Meanwhile, Graphics Processing Units (GPU) have been developed as highly parallel, multi-core processors for the global demand of high quality 3D graphics. If a suitable interface is provided, parallelization using GPU can be available to engineering computing. NVIDIA provides a Software Development Kit(SDK) named CUDA(Compute Unified Device Architecture) to code developers so that they can manage GPUs for parallelization using the C language. In this research, we implement parallel routines for the linear system solver using CUDA, and examine the performance of the parallelization. In the next section, we will describe the method of CUDA parallelization for the CUPID code, and then the performance of the CUDA parallelization will be discussed

  9. Iterative integral parameter identification of a respiratory mechanics model.

    Science.gov (United States)

    Schranz, Christoph; Docherty, Paul D; Chiew, Yeong Shiong; Möller, Knut; Chase, J Geoffrey

    2012-07-18

    Patient-specific respiratory mechanics models can support the evaluation of optimal lung protective ventilator settings during ventilation therapy. Clinical application requires that the individual's model parameter values must be identified with information available at the bedside. Multiple linear regression or gradient-based parameter identification methods are highly sensitive to noise and initial parameter estimates. Thus, they are difficult to apply at the bedside to support therapeutic decisions. An iterative integral parameter identification method is applied to a second order respiratory mechanics model. The method is compared to the commonly used regression methods and error-mapping approaches using simulated and clinical data. The clinical potential of the method was evaluated on data from 13 Acute Respiratory Distress Syndrome (ARDS) patients. The iterative integral method converged to error minima 350 times faster than the Simplex Search Method using simulation data sets and 50 times faster using clinical data sets. Established regression methods reported erroneous results due to sensitivity to noise. In contrast, the iterative integral method was effective independent of initial parameter estimations, and converged successfully in each case tested. These investigations reveal that the iterative integral method is beneficial with respect to computing time, operator independence and robustness, and thus applicable at the bedside for this clinical application.

  10. Explorations of the implementation of a parallel IDW interpolation algorithm in a Linux cluster-based parallel GIS

    Science.gov (United States)

    Huang, Fang; Liu, Dingsheng; Tan, Xicheng; Wang, Jian; Chen, Yunping; He, Binbin

    2011-04-01

    To design and implement an open-source parallel GIS (OP-GIS) based on a Linux cluster, the parallel inverse distance weighting (IDW) interpolation algorithm has been chosen as an example to explore the working model and the principle of algorithm parallel pattern (APP), one of the parallelization patterns for OP-GIS. Based on an analysis of the serial IDW interpolation algorithm of GRASS GIS, this paper has proposed and designed a specific parallel IDW interpolation algorithm, incorporating both single process, multiple data (SPMD) and master/slave (M/S) programming modes. The main steps of the parallel IDW interpolation algorithm are: (1) the master node packages the related information, and then broadcasts it to the slave nodes; (2) each node calculates its assigned data extent along one row using the serial algorithm; (3) the master node gathers the data from all nodes; and (4) iterations continue until all rows have been processed, after which the results are outputted. According to the experiments performed in the course of this work, the parallel IDW interpolation algorithm can attain an efficiency greater than 0.93 compared with similar algorithms, which indicates that the parallel algorithm can greatly reduce processing time and maximize speed and performance.

  11. ITER Safety and Licensing

    International Nuclear Information System (INIS)

    Girard, J-.P; Taylor, N.; Garin, P.; Uzan-Elbez, J.; GULDEN, W.; Rodriguez-Rodrigo, L.

    2006-01-01

    The site for the construction of ITER has been chosen in June 2005. The facility will be implemented in Europe, south of France close to Marseille. The generic safety scheme is now under revision to adapt the design to the host country regulation. Even though ITER will be an international organization, it will have to comply with the French requirements in the fields of public and occupational health and safety, nuclear safety, radiation protection, licensing, nuclear substances and environmental protection. The organization of the central team together with its partners organized in domestic agencies for the in-kind procurement of components is a key issue for the success of the experimentation. ITER is the first facility that will achieve sustained nuclear fusion. It is both important for the experimental one-of-a-kind device, ITER itself, and for the future of fusion power plants to well understand the key safety issues of this potential new source of energy production. The main safety concern is confinement of the tritium, activated dust in the vacuum vessel and activated corrosion products in the coolant of the plasma-facing components. This is achieved in the design through multiple confinement barriers to implement the defence in depth approach. It will be demonstrated in documents submitted to the French regulator that these barriers maintain their function in all postulated incident and accident conditions. The licensing process started by examination of the safety options. This step has been performed by Europe during the candidature phase in 2002. In parallel to the final design, and taking into account the local regulations, the Preliminary Safety Report (RPrS) will be drafted with support of the European partner and others in the framework of ITER Task Agreements. Together with the license application, the RPrS will be forwarded to the regulatory bodies, which will launch public hearings and a safety review. Both processes must succeed in order to

  12. Shrinkage-thresholding enhanced born iterative method for solving 2D inverse electromagnetic scattering problem

    KAUST Repository

    Desmal, Abdulla; Bagci, Hakan

    2014-01-01

    A numerical framework that incorporates recently developed iterative shrinkage thresholding (IST) algorithms within the Born iterative method (BIM) is proposed for solving the two-dimensional inverse electromagnetic scattering problem. IST

  13. Analysis of Diffusion Problems using Homotopy Perturbation and Variational Iteration Methods

    DEFF Research Database (Denmark)

    Barari, Amin; Poor, A. Tahmasebi; Jorjani, A.

    2010-01-01

    In this paper, variational iteration method and homotopy perturbation method are applied to different forms of diffusion equation. The diffusion equations have found wide applications in heat transfer problems, theory of consolidation and many other problems in engineering. The methods proposed...

  14. Identifying an unknown function in a parabolic equation with overspecified data via He's variational iteration method

    International Nuclear Information System (INIS)

    Dehghan, Mehdi; Tatari, Mehdi

    2008-01-01

    In this research, the He's variational iteration technique is used for computing an unknown time-dependent parameter in an inverse quasilinear parabolic partial differential equation. Parabolic partial differential equations with overspecified data play a crucial role in applied mathematics and physics, as they appear in various engineering models. The He's variational iteration method is an analytical procedure for finding solutions of differential equations, is based on the use of Lagrange multipliers for identification of an optimal value of a parameter in a functional. To show the efficiency of the new approach, several test problems are presented for one-, two- and three-dimensional cases

  15. Iterative resonance self-shielding methods using resonance integral table in heterogeneous transport lattice calculations

    International Nuclear Information System (INIS)

    Hong, Ser Gi; Kim, Kang-Seog

    2011-01-01

    This paper describes the iteration methods using resonance integral tables to estimate the effective resonance cross sections in heterogeneous transport lattice calculations. Basically, these methods have been devised to reduce an effort to convert resonance integral table into subgroup data to be used in the physical subgroup method. Since these methods do not use subgroup data but only use resonance integral tables directly, these methods do not include an error in converting resonance integral into subgroup data. The effective resonance cross sections are estimated iteratively for each resonance nuclide through the heterogeneous fixed source calculations for the whole problem domain to obtain the background cross sections. These methods have been implemented in the transport lattice code KARMA which uses the method of characteristics (MOC) to solve the transport equation. The computational results show that these iteration methods are quite promising in the practical transport lattice calculations.

  16. An iterative method for selecting degenerate multiplex PCR primers.

    Science.gov (United States)

    Souvenir, Richard; Buhler, Jeremy; Stormo, Gary; Zhang, Weixiong

    2007-01-01

    Single-nucleotide polymorphism (SNP) genotyping is an important molecular genetics process, which can produce results that will be useful in the medical field. Because of inherent complexities in DNA manipulation and analysis, many different methods have been proposed for a standard assay. One of the proposed techniques for performing SNP genotyping requires amplifying regions of DNA surrounding a large number of SNP loci. To automate a portion of this particular method, it is necessary to select a set of primers for the experiment. Selecting these primers can be formulated as the Multiple Degenerate Primer Design (MDPD) problem. The Multiple, Iterative Primer Selector (MIPS) is an iterative beam-search algorithm for MDPD. Theoretical and experimental analyses show that this algorithm performs well compared with the limits of degenerate primer design. Furthermore, MIPS outperforms an existing algorithm that was designed for a related degenerate primer selection problem.

  17. Toolkit for high performance Monte Carlo radiation transport and activation calculations for shielding applications in ITER

    International Nuclear Information System (INIS)

    Serikov, A.; Fischer, U.; Grosse, D.; Leichtle, D.; Majerle, M.

    2011-01-01

    The Monte Carlo (MC) method is the most suitable computational technique of radiation transport for shielding applications in fusion neutronics. This paper is intended for sharing the results of long term experience of the fusion neutronics group at Karlsruhe Institute of Technology (KIT) in radiation shielding calculations with the MCNP5 code for the ITER fusion reactor with emphasizing on the use of several ITER project-driven computer programs developed at KIT. Two of them, McCad and R2S, seem to be the most useful in radiation shielding analyses. The McCad computer graphical tool allows to perform automatic conversion of the MCNP models from the underlying CAD (CATIA) data files, while the R2S activation interface couples the MCNP radiation transport with the FISPACT activation allowing to estimate nuclear responses such as dose rate and nuclear heating after the ITER reactor shutdown. The cell-based R2S scheme was applied in shutdown photon dose analysis for the designing of the In-Vessel Viewing System (IVVS) and the Glow Discharge Cleaning (GDC) unit in ITER. Newly developed at KIT mesh-based R2S feature was successfully tested on the shutdown dose rate calculations for the upper port in the Neutral Beam (NB) cell of ITER. The merits of McCad graphical program were broadly acknowledged by the neutronic analysts and its continuous improvement at KIT has introduced its stable and more convenient run with its Graphical User Interface. Detailed 3D ITER neutronic modeling with the MCNP Monte Carlo method requires a lot of computation resources, inevitably leading to parallel calculations on clusters. Performance assessments of the MCNP5 parallel runs on the JUROPA/HPC-FF supercomputer cluster permitted to find the optimal number of processors for ITER-type runs. (author)

  18. Formulations to overcome the divergence of iterative method of fixed-point in nonlinear equations solution

    Directory of Open Access Journals (Sweden)

    Wilson Rodríguez Calderón

    2015-04-01

    Full Text Available When we need to determine the solution of a nonlinear equation there are two options: closed-methods which use intervals that contain the root and during the iterative process reduce the size of natural way, and, open-methods that represent an attractive option as they do not require an initial interval enclosure. In general, we know open-methods are more efficient computationally though they do not always converge. In this paper we are presenting a divergence case analysis when we use the method of fixed point iteration to find the normal height in a rectangular channel using the Manning equation. To solve this problem, we propose applying two strategies (developed by authors that allow to modifying the iteration function making additional formulations of the traditional method and its convergence theorem. Although Manning equation is solved with other methods like Newton when we use the iteration method of fixed-point an interesting divergence situation is presented which can be solved with a convergence higher than quadratic over the initial iterations. The proposed strategies have been tested in two cases; a study of divergence of square root of real numbers was made previously by authors for testing. Results in both cases have been successful. We present comparisons because are important for seeing the advantage of proposed strategies versus the most representative open-methods.

  19. Reducing Design Cycle Time and Cost Through Process Resequencing

    Science.gov (United States)

    Rogers, James L.

    2004-01-01

    In today's competitive environment, companies are under enormous pressure to reduce the time and cost of their design cycle. One method for reducing both time and cost is to develop an understanding of the flow of the design processes and the effects of the iterative subcycles that are found in complex design projects. Once these aspects are understood, the design manager can make decisions that take advantage of decomposition, concurrent engineering, and parallel processing techniques to reduce the total time and the total cost of the design cycle. One software tool that can aid in this decision-making process is the Design Manager's Aid for Intelligent Decomposition (DeMAID). The DeMAID software minimizes the feedback couplings that create iterative subcycles, groups processes into iterative subcycles, and decomposes the subcycles into a hierarchical structure. The real benefits of producing the best design in the least time and at a minimum cost are obtained from sequencing the processes in the subcycles.

  20. Parallel assembling and equation solving via graph algorithms with an application to the FE simulation of metal extrusion processes

    CERN Document Server

    Unterkircher, A

    2005-01-01

    We propose methods for parallel assembling and iterative equation solving based on graph algorithms. The assembling technique is independent of dimension, element type and model shape. As a parallel solving technique we construct a multiplicative symmetric Schwarz preconditioner for the conjugate gradient method. Both methods have been incorporated into a non-linear FE code to simulate 3D metal extrusion processes. We illustrate the efficiency of these methods on shared memory computers by realistic examples.

  1. Contribution to regularizing iterative method development for attenuation correction in gamma emission tomography

    International Nuclear Information System (INIS)

    Cao, A.

    1981-07-01

    This study is concerned with the transverse axial gamma emission tomography. The problem of self-attenuation of radiations in biologic tissues is raised. The regularizing iterative method is developed, as a reconstruction method of 3 dimensional images. The different steps from acquisition to results, necessary to its application, are described. Organigrams relative to each step are explained. Comparison notion between two reconstruction methods is introduced. Some methods used for the comparison or to bring about the characteristics of a reconstruction technique are defined. The studies realized to test the regularizing iterative method are presented and results are analyzed [fr

  2. Parallel computing techniques for rotorcraft aerodynamics

    Science.gov (United States)

    Ekici, Kivanc

    The modification of unsteady three-dimensional Navier-Stokes codes for application on massively parallel and distributed computing environments is investigated. The Euler/Navier-Stokes code TURNS (Transonic Unsteady Rotor Navier-Stokes) was chosen as a test bed because of its wide use by universities and industry. For the efficient implementation of TURNS on parallel computing systems, two algorithmic changes are developed. First, main modifications to the implicit operator, Lower-Upper Symmetric Gauss Seidel (LU-SGS) originally used in TURNS, is performed. Second, application of an inexact Newton method, coupled with a Krylov subspace iterative method (Newton-Krylov method) is carried out. Both techniques have been tried previously for the Euler equations mode of the code. In this work, we have extended the methods to the Navier-Stokes mode. Several new implicit operators were tried because of convergence problems of traditional operators with the high cell aspect ratio (CAR) grids needed for viscous calculations on structured grids. Promising results for both Euler and Navier-Stokes cases are presented for these operators. For the efficient implementation of Newton-Krylov methods to the Navier-Stokes mode of TURNS, efficient preconditioners must be used. The parallel implicit operators used in the previous step are employed as preconditioners and the results are compared. The Message Passing Interface (MPI) protocol has been used because of its portability to various parallel architectures. It should be noted that the proposed methodology is general and can be applied to several other CFD codes (e.g. OVERFLOW).

  3. Iterative integral parameter identification of a respiratory mechanics model

    Directory of Open Access Journals (Sweden)

    Schranz Christoph

    2012-07-01

    Full Text Available Abstract Background Patient-specific respiratory mechanics models can support the evaluation of optimal lung protective ventilator settings during ventilation therapy. Clinical application requires that the individual’s model parameter values must be identified with information available at the bedside. Multiple linear regression or gradient-based parameter identification methods are highly sensitive to noise and initial parameter estimates. Thus, they are difficult to apply at the bedside to support therapeutic decisions. Methods An iterative integral parameter identification method is applied to a second order respiratory mechanics model. The method is compared to the commonly used regression methods and error-mapping approaches using simulated and clinical data. The clinical potential of the method was evaluated on data from 13 Acute Respiratory Distress Syndrome (ARDS patients. Results The iterative integral method converged to error minima 350 times faster than the Simplex Search Method using simulation data sets and 50 times faster using clinical data sets. Established regression methods reported erroneous results due to sensitivity to noise. In contrast, the iterative integral method was effective independent of initial parameter estimations, and converged successfully in each case tested. Conclusion These investigations reveal that the iterative integral method is beneficial with respect to computing time, operator independence and robustness, and thus applicable at the bedside for this clinical application.

  4. Parallel deposition, sorting, and reordering methods in the Hybrid Ordered Plasma Simulation (HOPS) code

    International Nuclear Information System (INIS)

    Anderson, D.V.; Shumaker, D.E.

    1993-01-01

    From a computational standpoint, particle simulation calculations for plasmas have not adapted well to the transitions from scalar to vector processing nor from serial to parallel environments. They have suffered from inordinate and excessive accessing of computer memory and have been hobbled by relatively inefficient gather-scatter constructs resulting from the use of indirect indexing. Lastly, the many-to-one mapping characteristic of the deposition phase has made it difficult to perform this in parallel. The authors' code sorts and reorders the particles in a spatial order. This allows them to greatly reduce the memory references, to run in directly indexed vector mode, and to employ domain decomposition to achieve parallelization. In this hybrid simulation the electrons are modeled as a fluid and the field equations solved are obtained from the electron momentum equation together with the pre-Maxwell equations (displacement current neglected). Either zero or finite electron mass can be used in the electron model. The resulting field equations are solved with an iteratively explicit procedure which is thus trivial to parallelize. Likewise, the field interpolations and the particle pushing is simple to parallelize. The deposition, sorting, and reordering phases are less simple and it is for these that the authors present detailed algorithms. They have now successfully tested the parallel version of HOPS in serial mode and it is now being readied for parallel execution on the Cray C-90. They will then port HOPS to a massively parallel computer, in the next year

  5. Detailed Design and Fabrication Method of the ITER Vacuum Vessel Ports

    International Nuclear Information System (INIS)

    Hee-Jae Ahn; Kwon, T.H.; Hong, Y.S.

    2006-01-01

    The engineering design of the ITER vacuum vessel (VV) has been progressed by the ITER International Team (IT) with the cooperation of several participant teams (PT). The VV and ports are the components allocated to Korea for the construction of the ITER. Hyundai Heavy Industries has been involved in the structural analysis, detailed design and development of the fabrication method of the upper and lower ports within the framework of the ITER transitional arrangements (ITA). The design of the port structures has been investigated to validate and to improve the conceptual designs of the ITER IT and other PT. The special emphasis was laid on the flange joint between the port extension and the in-port plug to develop the design of the upper port. The modified design with a pure friction type flange with forty-eight pieces of bolts instead of the tangential key is recommended. Furthermore, the alternative flange designs developed by the ITER IT have been analyzed in detail to simplify the lip seal maintenance into the port flange. The structural analyses of the lower RH port have been also performed to verify the capacity for supporting the VV. The maximum stress exceeds the allowable value at the reinforcing block and basement. More elaborate local models have been developed to mitigate the stress concentration and to modify the component design. The fabrication method and the sequence of the detailed fabrication for the ports are developed focusing on the cost reduction as well as the simplification. A typical port structure includes a port stub, a stub extension and a port extension with a connecting duct. The fabrication sequence consists of surface treatment, cutting, forming, cleaning, welding, machining, and non-destructive inspection and test. Tolerance study has been performed to avoid the mismatch of each fabricated component and to obtain the suitable tolerances in the assembly at the shop and site. This study is based on the experience in the fabrication of

  6. ITER shielding blanket

    Energy Technology Data Exchange (ETDEWEB)

    Strebkov, Yu [ENTEK, Moscow (Russian Federation); Avsjannikov, A [ENTEK, Moscow (Russian Federation); Baryshev, M [NIAT, Moscow (Russian Federation); Blinov, Yu [ENTEK, Moscow (Russian Federation); Shatalov, G [KIAE, Moscow (Russian Federation); Vasiliev, N [KIAE, Moscow (Russian Federation); Vinnikov, A [ENTEK, Moscow (Russian Federation); Chernjagin, A [DYNAMICA, Moscow (Russian Federation)

    1995-03-01

    A reference non-breeding blanket is under development now for the ITER Basic Performance Phase for the purpose of high reliability during the first stage of ITER operation. More severe operation modes are expected in this stage with first wall (FW) local heat loads up to 100-300Wcm{sup -2}. Integration of a blanket design with protective and start limiters requires new solutions to achieve high reliability, and possible use of beryllium as a protective material leads to technologies. The rigid shielding blanket concept was developed in Russia to satisfy the above-mentioned requirements. The concept is based on a copper alloy FW, austenitic stainless steel blanket structure, water cooling. Beryllium protection is integrated in the FW design. Fabrication technology and assembly procedure are described in parallel with the equipment used. (orig.).

  7. Geometric interpretation of optimal iteration strategies

    International Nuclear Information System (INIS)

    Jones, R.B.

    1977-01-01

    The relationship between inner and outer iteration errors is extremely complex, and even formal description of total error behavior is difficult. Inner and outer iteration error propagation is analyzed in a variational formalism for a reactor model describing multidimensional, one-group theory. In a generalization the work of Akimov and Sabek, the number of inner iterations performed during each outer serial that minimizes the total computation time is determined. The generalized analysis admits a geometric interpretation of total error behavior. The results can be applied to both transport and diffusion theory computer methods. 1 figure

  8. On a new iterative method for solving linear systems and comparison results

    Science.gov (United States)

    Jing, Yan-Fei; Huang, Ting-Zhu

    2008-10-01

    In Ujevic [A new iterative method for solving linear systems, Appl. Math. Comput. 179 (2006) 725-730], the author obtained a new iterative method for solving linear systems, which can be considered as a modification of the Gauss-Seidel method. In this paper, we show that this is a special case from a point of view of projection techniques. And a different approach is established, which is both theoretically and numerically proven to be better than (at least the same as) Ujevic's. As the presented numerical examples show, in most cases, the convergence rate is more than one and a half that of Ujevic.

  9. Totally parallel multilevel algorithms

    Science.gov (United States)

    Frederickson, Paul O.

    1988-01-01

    Four totally parallel algorithms for the solution of a sparse linear system have common characteristics which become quite apparent when they are implemented on a highly parallel hypercube such as the CM2. These four algorithms are Parallel Superconvergent Multigrid (PSMG) of Frederickson and McBryan, Robust Multigrid (RMG) of Hackbusch, the FFT based Spectral Algorithm, and Parallel Cyclic Reduction. In fact, all four can be formulated as particular cases of the same totally parallel multilevel algorithm, which are referred to as TPMA. In certain cases the spectral radius of TPMA is zero, and it is recognized to be a direct algorithm. In many other cases the spectral radius, although not zero, is small enough that a single iteration per timestep keeps the local error within the required tolerance.

  10. Parallel performances of three 3D reconstruction methods on MIMD computers: Feldkamp, block ART and SIRT algorithms

    International Nuclear Information System (INIS)

    Laurent, C.; Chassery, J.M.; Peyrin, F.; Girerd, C.

    1996-01-01

    This paper deals with the parallel implementations of reconstruction methods in 3D tomography. 3D tomography requires voluminous data and long computation times. Parallel computing, on MIMD computers, seems to be a good approach to manage this problem. In this study, we present the different steps of the parallelization on an abstract parallel computer. Depending on the method, we use two main approaches to parallelize the algorithms: the local approach and the global approach. Experimental results on MIMD computers are presented. Two 3D images reconstructed from realistic data are showed

  11. 3D seismic modeling and reverse‐time migration with the parallel Fourier method using non‐blocking collective communications

    KAUST Repository

    Chu, Chunlei

    2009-01-01

    The major performance bottleneck of the parallel Fourier method on distributed memory systems is the network communication cost. In this study, we investigate the potential of using non‐blocking all‐to‐all communications to solve this problem by overlapping computation and communication. We present the runtime comparison of a 3D seismic modeling problem with the Fourier method using non‐blocking and blocking calls, respectively, on a Linux cluster. The data demonstrate that a performance improvement of up to 40% can be achieved by simply changing blocking all‐to‐all communication calls to non‐blocking ones to introduce the overlapping capability. A 3D reverse‐time migration result is also presented as an extension to the modeling work based on non‐blocking collective communications.

  12. Computation of saddle-type slow manifolds using iterative methods

    DEFF Research Database (Denmark)

    Kristiansen, Kristian Uldall

    2015-01-01

    with respect to , appropriate estimates are directly attainable using the method of this paper. The method is applied to several examples, including a model for a pair of neurons coupled by reciprocal inhibition with two slow and two fast variables, and the computation of homoclinic connections in the Fitz......This paper presents an alternative approach for the computation of trajectory segments on slow manifolds of saddle type. This approach is based on iterative methods rather than collocation-type methods. Compared to collocation methods, which require mesh refinements to ensure uniform convergence...

  13. The multigrid preconditioned conjugate gradient method

    Science.gov (United States)

    Tatebe, Osamu

    1993-01-01

    A multigrid preconditioned conjugate gradient method (MGCG method), which uses the multigrid method as a preconditioner of the PCG method, is proposed. The multigrid method has inherent high parallelism and improves convergence of long wavelength components, which is important in iterative methods. By using this method as a preconditioner of the PCG method, an efficient method with high parallelism and fast convergence is obtained. First, it is considered a necessary condition of the multigrid preconditioner in order to satisfy requirements of a preconditioner of the PCG method. Next numerical experiments show a behavior of the MGCG method and that the MGCG method is superior to both the ICCG method and the multigrid method in point of fast convergence and high parallelism. This fast convergence is understood in terms of the eigenvalue analysis of the preconditioned matrix. From this observation of the multigrid preconditioner, it is realized that the MGCG method converges in very few iterations and the multigrid preconditioner is a desirable preconditioner of the conjugate gradient method.

  14. Extending molecular simulation time scales: Parallel in time integrations for high-level quantum chemistry and complex force representations

    International Nuclear Information System (INIS)

    Bylaska, Eric J.; Weare, Jonathan Q.; Weare, John H.

    2013-01-01

    Parallel in time simulation algorithms are presented and applied to conventional molecular dynamics (MD) and ab initio molecular dynamics (AIMD) models of realistic complexity. Assuming that a forward time integrator, f (e.g., Verlet algorithm), is available to propagate the system from time t i (trajectory positions and velocities x i = (r i , v i )) to time t i+1 (x i+1 ) by x i+1 = f i (x i ), the dynamics problem spanning an interval from t 0 …t M can be transformed into a root finding problem, F(X) = [x i − f(x (i−1 )] i =1,M = 0, for the trajectory variables. The root finding problem is solved using a variety of root finding techniques, including quasi-Newton and preconditioned quasi-Newton schemes that are all unconditionally convergent. The algorithms are parallelized by assigning a processor to each time-step entry in the columns of F(X). The relation of this approach to other recently proposed parallel in time methods is discussed, and the effectiveness of various approaches to solving the root finding problem is tested. We demonstrate that more efficient dynamical models based on simplified interactions or coarsening time-steps provide preconditioners for the root finding problem. However, for MD and AIMD simulations, such preconditioners are not required to obtain reasonable convergence and their cost must be considered in the performance of the algorithm. The parallel in time algorithms developed are tested by applying them to MD and AIMD simulations of size and complexity similar to those encountered in present day applications. These include a 1000 Si atom MD simulation using Stillinger-Weber potentials, and a HCl + 4H 2 O AIMD simulation at the MP2 level. The maximum speedup ((serial execution time)/(parallel execution time) ) obtained by parallelizing the Stillinger-Weber MD simulation was nearly 3.0. For the AIMD MP2 simulations, the algorithms achieved speedups of up to 14.3. The parallel in time algorithms can be implemented in a

  15. Domain decomposition methods for the neutron diffusion problem

    International Nuclear Information System (INIS)

    Guerin, P.; Baudron, A. M.; Lautard, J. J.

    2010-01-01

    The neutronic simulation of a nuclear reactor core is performed using the neutron transport equation, and leads to an eigenvalue problem in the steady-state case. Among the deterministic resolution methods, simplified transport (SPN) or diffusion approximations are often used. The MINOS solver developed at CEA Saclay uses a mixed dual finite element method for the resolution of these problems. and has shown his efficiency. In order to take into account the heterogeneities of the geometry, a very fine mesh is generally required, and leads to expensive calculations for industrial applications. In order to take advantage of parallel computers, and to reduce the computing time and the local memory requirement, we propose here two domain decomposition methods based on the MINOS solver. The first approach is a component mode synthesis method on overlapping sub-domains: several Eigenmodes solutions of a local problem on each sub-domain are taken as basis functions used for the resolution of the global problem on the whole domain. The second approach is an iterative method based on a non-overlapping domain decomposition with Robin interface conditions. At each iteration, we solve the problem on each sub-domain with the interface conditions given by the solutions on the adjacent sub-domains estimated at the previous iteration. Numerical results on parallel computers are presented for the diffusion model on realistic 2D and 3D cores. (authors)

  16. Adaptive Iterative Soft-Input Soft-Output Parallel Decision-Feedback Detectors for Asynchronous Coded DS-CDMA Systems

    Directory of Open Access Journals (Sweden)

    Zhang Wei

    2005-01-01

    Full Text Available The optimum and many suboptimum iterative soft-input soft-output (SISO multiuser detectors require a priori information about the multiuser system, such as the users' transmitted signature waveforms, relative delays, as well as the channel impulse response. In this paper, we employ adaptive algorithms in the SISO multiuser detector in order to avoid the need for this a priori information. First, we derive the optimum SISO parallel decision-feedback detector for asynchronous coded DS-CDMA systems. Then, we propose two adaptive versions of this SISO detector, which are based on the normalized least mean square (NLMS and recursive least squares (RLS algorithms. Our SISO adaptive detectors effectively exploit the a priori information of coded symbols, whose soft inputs are obtained from a bank of single-user decoders. Furthermore, we consider how to select practical finite feedforward and feedback filter lengths to obtain a good tradeoff between the performance and computational complexity of the receiver.

  17. A time-minimizing hybrid method for fitting complex Moessbauer spectra

    International Nuclear Information System (INIS)

    Steiner, K.J.

    2000-07-01

    The process of fitting complex Moessbauer-spectra is known to be time-consuming. The fitting process involves a mathematical model for the combined hyperfine interaction which can be solved by an iteration method only. The iteration method is very sensitive to its input-parameters. In other words, with arbitrary input-parameters it is most unlikely that the iteration method will converge. Up to now a scientist has to spent her/his time to guess appropriate input parameters for the iteration process. The idea is to replace the guessing phase by a genetic algorithm. The genetic algorithm starts with an initial population of arbitrary input parameters. Each parameter set is called an individual. The first step is to evaluate the fitness of all individuals. Afterwards the current population is recombined to form a new population. The process of recombination involves the successive application of genetic operators which are selection, crossover, and mutation. These operators mimic the process of natural evolution, i.e. the concept of the survival of the fittest. Even though there is no formal proof that the genetic algorithm will eventually converge, there is an excellent chance that there will be a population with very good individuals after some generations. The hybrid method presented in the following combines a very modern version of a genetic algorithm with a conventional least-square routine solving the combined interaction Hamiltonian i.e. providing a physical solution with the original Moessbauer parameters by a minimum of input. (author)

  18. Multigrid Reduction in Time for Nonlinear Parabolic Problems

    Energy Technology Data Exchange (ETDEWEB)

    Falgout, R. D. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Manteuffel, T. A. [Univ. of Colorado, Boulder, CO (United States); O' Neill, B. [Univ. of Colorado, Boulder, CO (United States); Schroder, J. B. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)

    2016-01-04

    The need for parallel-in-time is being driven by changes in computer architectures, where future speed-ups will be available through greater concurrency, but not faster clock speeds, which are stagnant.This leads to a bottleneck for sequential time marching schemes, because they lack parallelism in the time dimension. Multigrid Reduction in Time (MGRIT) is an iterative procedure that allows for temporal parallelism by utilizing multigrid reduction techniques and a multilevel hierarchy of coarse time grids. MGRIT has been shown to be effective for linear problems, with speedups of up to 50 times. The goal of this work is the efficient solution of nonlinear problems with MGRIT, where efficient is defined as achieving similar performance when compared to a corresponding linear problem. As our benchmark, we use the p-Laplacian, where p = 4 corresponds to a well-known nonlinear diffusion equation and p = 2 corresponds to our benchmark linear diffusion problem. When considering linear problems and implicit methods, the use of optimal spatial solvers such as spatial multigrid imply that the cost of one time step evaluation is fixed across temporal levels, which have a large variation in time step sizes. This is not the case for nonlinear problems, where the work required increases dramatically on coarser time grids, where relatively large time steps lead to worse conditioned nonlinear solves and increased nonlinear iteration counts per time step evaluation. This is the key difficulty explored by this paper. We show that by using a variety of strategies, most importantly, spatial coarsening and an alternate initial guess to the nonlinear time-step solver, we can reduce the work per time step evaluation over all temporal levels to a range similar with the corresponding linear problem. This allows for parallel scaling behavior comparable to the corresponding linear problem.

  19. The ITER remote maintenance system

    International Nuclear Information System (INIS)

    Tesini, A.; Palmer, J.

    2008-01-01

    The aim of this paper is to summarize the ITER approach to machine components maintenance. A major objective of the ITER project is to demonstrate that a future power producing fusion device can be maintained effectively and offer practical levels of plant availability. During its operational lifetime, many systems of the ITER machine will require maintenance and modification; this can be achieved using remote handling methods. The need for timely, safe and effective remote operations on a machine as complex as ITER and within one of the world's most hostile remote handling environments represents a major challenge at every level of the ITER Project organization, engineering and technology. The basic principles of fusion reactor maintenance are presented. An updated description of the ITER remote maintenance system is provided. This includes the maintenance equipment used inside the vacuum vessel, inside the hot cell and the hot cell itself. The correlation between the functions of the remote handling equipment, of the hot cell and of the radwaste processing system is also described. The paper concludes that ITER has equipped itself with a good platform to tackle the challenges presented by its own maintenance and upgrade needs

  20. Design and fabrication methods of FW/blanket and vessel for ITER-FEAT

    Energy Technology Data Exchange (ETDEWEB)

    Ioki, K. E-mail: iokik@itereu.de; Barabash, V.; Cardella, A.; Elio, F.; Kalinin, G.; Miki, N.; Onozuka, M.; Osaki, T.; Rozov, V.; Sannazzaro, G.; Utin, Y.; Yamada, M.; Yoshimura, H

    2001-11-01

    Design has progressed on the vacuum vessel and FW/blanket for ITER-FEAT. The basic functions and structures are the same as for the 1998 ITER design. Detailed blanket module designs of the radially cooled shield block with flat separable FW panels have been developed. The ITER blanket R and D program covers different materials and fabrication methods in order make a final selection based on the results. Separate manifolds have been designed and analysed for the blanket cooling. The vessel design with flexible support housings has been improved to minimise the number of continuous poloidal ribs. Most of the R and D performed so far during EDA are still applicable.

  1. Design and fabrication methods of FW/blanket and vessel for ITER-FEAT

    International Nuclear Information System (INIS)

    Ioki, K.; Barabash, V.; Cardella, A.; Elio, F.; Kalinin, G.; Miki, N.; Onozuka, M.; Osaki, T.; Rozov, V.; Sannazzaro, G.; Utin, Y.; Yamada, M.; Yoshimura, H.

    2001-01-01

    Design has progressed on the vacuum vessel and FW/blanket for ITER-FEAT. The basic functions and structures are the same as for the 1998 ITER design. Detailed blanket module designs of the radially cooled shield block with flat separable FW panels have been developed. The ITER blanket R and D program covers different materials and fabrication methods in order make a final selection based on the results. Separate manifolds have been designed and analysed for the blanket cooling. The vessel design with flexible support housings has been improved to minimise the number of continuous poloidal ribs. Most of the R and D performed so far during EDA are still applicable

  2. Variational iteration method for solving coupled-KdV equations

    International Nuclear Information System (INIS)

    Assas, Laila M.B.

    2008-01-01

    In this paper, the He's variational iteration method is applied to solve the non-linear coupled-KdV equations. This method is based on the use of Lagrange multipliers for identification of optimal value of a parameter in a functional. This technique provides a sequence of functions which converge to the exact solution of the coupled-KdV equations. This procedure is a powerful tool for solving coupled-KdV equations

  3. A connection between the asymptotic iteration method and the continued fractions formalism

    International Nuclear Information System (INIS)

    Matamala, A.R.; Gutierrez, F.A.; Diaz-Valdes, J.

    2007-01-01

    In this work, we show that there is a connection between the asymptotic iteration method (a method to solve second order linear ordinary differential equations) and the older method of continued fractions to solve differential equations

  4. Scan time reduction in {sup 23}Na-Magnetic Resonance Imaging using the chemical shift imaging sequence. Evaluation of an iterative reconstruction method

    Energy Technology Data Exchange (ETDEWEB)

    Weingaertner, Sebastian; Konstandin, Simon; Schad, Lothar R. [Heidelberg Univ., Mannheim (Germany). Computer Assisted Clinical Medicine; Wetterling, Friedrich [Heidelberg Univ., Mannheim (Germany). Computer Assisted Clinical Medicine; Dublin Univ. (Ireland) Trinity Inst. of Neuroscience; Fatar, Marc [Heidelberg Univ., Mannheim (Germany). Dept. of Neurology; Neumaier-Probst, Eva [Heidelberg Univ., Mannheim (Germany). Dept. of Neuroradiology

    2015-07-01

    To evaluate potential scan time reduction in {sup 23}Na-Magnetic Resonance Imaging with the chemical shift imaging sequence (CSI) using undersampled data of high-quality datasets, reconstructed with an iterative constrained reconstruction, compared to reduced resolution or reduced signal-to-noise ratio. CSI {sup 23}Na-images were retrospectively undersampled and reconstructed with a constrained reconstruction scheme. The results were compared to conventional methods of scan time reduction. The constrained reconstruction scheme used a phase constraint and a finite object support, which was extracted from a spatially registered {sup 1}H-image acquired with a double-tuned coil. The methods were evaluated using numerical simulations, phantom images and in-vivo images of a healthy volunteer and a patient who suffered from cerebral ischemic stroke. The constrained reconstruction scheme showed improved image quality compared to a decreased number of averages, images with decreased resolution or circular undersampling with weighted averaging for any undersampling factor. Brain images of a stroke patient, which were reconstructed from three-fold undersampled k-space data, resulted in only minor differences from the original image (normalized root means square error < 12%) and an almost identical delineation of the stroke region (mismatch < 6%). The acquisition of undersampled {sup 23}Na-CSI images enables up to three-fold scan time reduction with improved image quality compared to conventional methods of scan time saving.

  5. STEP: Self-supporting tailored k-space estimation for parallel imaging reconstruction.

    Science.gov (United States)

    Zhou, Zechen; Wang, Jinnan; Balu, Niranjan; Li, Rui; Yuan, Chun

    2016-02-01

    A new subspace-based iterative reconstruction method, termed Self-supporting Tailored k-space Estimation for Parallel imaging reconstruction (STEP), is presented and evaluated in comparison to the existing autocalibrating method SPIRiT and calibrationless method SAKE. In STEP, two tailored schemes including k-space partition and basis selection are proposed to promote spatially variant signal subspace and incorporated into a self-supporting structured low rank model to enforce properties of locality, sparsity, and rank deficiency, which can be formulated into a constrained optimization problem and solved by an iterative algorithm. Simulated and in vivo datasets were used to investigate the performance of STEP in terms of overall image quality and detail structure preservation. The advantage of STEP on image quality is demonstrated by retrospectively undersampled multichannel Cartesian data with various patterns. Compared with SPIRiT and SAKE, STEP can provide more accurate reconstruction images with less residual aliasing artifacts and reduced noise amplification in simulation and in vivo experiments. In addition, STEP has the capability of combining compressed sensing with arbitrary sampling trajectory. Using k-space partition and basis selection can further improve the performance of parallel imaging reconstruction with or without calibration signals. © 2015 Wiley Periodicals, Inc.

  6. Perl Modules for Constructing Iterators

    Science.gov (United States)

    Tilmes, Curt

    2009-01-01

    The Iterator Perl Module provides a general-purpose framework for constructing iterator objects within Perl, and a standard API for interacting with those objects. Iterators are an object-oriented design pattern where a description of a series of values is used in a constructor. Subsequent queries can request values in that series. These Perl modules build on the standard Iterator framework and provide iterators for some other types of values. Iterator::DateTime constructs iterators from DateTime objects or Date::Parse descriptions and ICal/RFC 2445 style re-currence descriptions. It supports a variety of input parameters, including a start to the sequence, an end to the sequence, an Ical/RFC 2445 recurrence describing the frequency of the values in the series, and a format description that can refine the presentation manner of the DateTime. Iterator::String constructs iterators from string representations. This module is useful in contexts where the API consists of supplying a string and getting back an iterator where the specific iteration desired is opaque to the caller. It is of particular value to the Iterator::Hash module which provides nested iterations. Iterator::Hash constructs iterators from Perl hashes that can include multiple iterators. The constructed iterators will return all the permutations of the iterations of the hash by nested iteration of embedded iterators. A hash simply includes a set of keys mapped to values. It is a very common data structure used throughout Perl programming. The Iterator:: Hash module allows a hash to include strings defining iterators (parsed and dispatched with Iterator::String) that are used to construct an overall series of hash values.

  7. Tomographic reconstruction by using FPSIRT (Fast Particle System Iterative Reconstruction Technique)

    Energy Technology Data Exchange (ETDEWEB)

    Moreira, Icaro Valgueiro M.; Melo, Silvio de Barros; Dantas, Carlos; Lima, Emerson Alexandre; Silva, Ricardo Martins; Cardoso, Halisson Alberdan C., E-mail: ivmm@cin.ufpe.br, E-mail: sbm@cin.ufpe.br, E-mail: rmas@cin.ufpe.br, E-mail: hacc@cin.ufpe.br, E-mail: ccd@ufpe.br, E-mail: eal@cin.ufpe.br [Universidade Federal de Pernambuco (UFPE), Recife, PE (Brazil)

    2015-07-01

    The PSIRT (Particle System Iterative Reconstruction Technique) is a method of tomographic image reconstruction primarily designed to work with configurations suitable for industrial applications. A particle system is an optimization technique inspired in real physical systems that associates to the reconstructing material a set of particles with certain physical features, subject to a force eld, which can produce movement. The system constantly updates the set of particles by repositioning them in such a way as to approach the equilibrium. The elastic potential along a trajectory is a function of the difference between the attenuation coefficient in the current configuration and the corresponding input data. PSIRT has been successfully used to reconstruct simulated and real objects subject to sets of parallel and fanbeam lines in different angles, representing typical gamma-ray tomographic arrangements. One of PSIRT's limitation was its performance, too slow for real time scenarios. In this work, it is presented a reformulation in PSIRT's computational model, which is able to grant the new algorithm, the FPSIRT - Fast System Iterative Reconstruction Technique, a performance up to 200-time faster than PSIRT's. In this work a comparison of their application to real and simulated data from the HSGT, High Speed Gamma Tomograph, is presented. (author)

  8. Tomographic reconstruction by using FPSIRT (Fast Particle System Iterative Reconstruction Technique)

    International Nuclear Information System (INIS)

    Moreira, Icaro Valgueiro M.; Melo, Silvio de Barros; Dantas, Carlos; Lima, Emerson Alexandre; Silva, Ricardo Martins; Cardoso, Halisson Alberdan C.

    2015-01-01

    The PSIRT (Particle System Iterative Reconstruction Technique) is a method of tomographic image reconstruction primarily designed to work with configurations suitable for industrial applications. A particle system is an optimization technique inspired in real physical systems that associates to the reconstructing material a set of particles with certain physical features, subject to a force eld, which can produce movement. The system constantly updates the set of particles by repositioning them in such a way as to approach the equilibrium. The elastic potential along a trajectory is a function of the difference between the attenuation coefficient in the current configuration and the corresponding input data. PSIRT has been successfully used to reconstruct simulated and real objects subject to sets of parallel and fanbeam lines in different angles, representing typical gamma-ray tomographic arrangements. One of PSIRT's limitation was its performance, too slow for real time scenarios. In this work, it is presented a reformulation in PSIRT's computational model, which is able to grant the new algorithm, the FPSIRT - Fast System Iterative Reconstruction Technique, a performance up to 200-time faster than PSIRT's. In this work a comparison of their application to real and simulated data from the HSGT, High Speed Gamma Tomograph, is presented. (author)

  9. Solution of problems in calculus of variations via He's variational iteration method

    International Nuclear Information System (INIS)

    Tatari, Mehdi; Dehghan, Mehdi

    2007-01-01

    In the modeling of a large class of problems in science and engineering, the minimization of a functional is appeared. Finding the solution of these problems needs to solve the corresponding ordinary differential equations which are generally nonlinear. In recent years He's variational iteration method has been attracted a lot of attention of the researchers for solving nonlinear problems. This method finds the solution of the problem without any discretization of the equation. Since this method gives a closed form solution of the problem and avoids the round off errors, it can be considered as an efficient method for solving various kinds of problems. In this research He's variational iteration method will be employed for solving some problems in calculus of variations. Some examples are presented to show the efficiency of the proposed technique

  10. An iterative method for hydrodynamic interactions in Brownian dynamics simulations of polymer dynamics

    Science.gov (United States)

    Miao, Linling; Young, Charles D.; Sing, Charles E.

    2017-07-01

    Brownian Dynamics (BD) simulations are a standard tool for understanding the dynamics of polymers in and out of equilibrium. Quantitative comparison can be made to rheological measurements of dilute polymer solutions, as well as direct visual observations of fluorescently labeled DNA. The primary computational challenge with BD is the expensive calculation of hydrodynamic interactions (HI), which are necessary to capture physically realistic dynamics. The full HI calculation, performed via a Cholesky decomposition every time step, scales with the length of the polymer as O(N3). This limits the calculation to a few hundred simulated particles. A number of approximations in the literature can lower this scaling to O(N2 - N2.25), and explicit solvent methods scale as O(N); however both incur a significant constant per-time step computational cost. Despite this progress, there remains a need for new or alternative methods of calculating hydrodynamic interactions; large polymer chains or semidilute polymer solutions remain computationally expensive. In this paper, we introduce an alternative method for calculating approximate hydrodynamic interactions. Our method relies on an iterative scheme to establish self-consistency between a hydrodynamic matrix that is averaged over simulation and the hydrodynamic matrix used to run the simulation. Comparison to standard BD simulation and polymer theory results demonstrates that this method quantitatively captures both equilibrium and steady-state dynamics after only a few iterations. The use of an averaged hydrodynamic matrix allows the computationally expensive Brownian noise calculation to be performed infrequently, so that it is no longer the bottleneck of the simulation calculations. We also investigate limitations of this conformational averaging approach in ring polymers.

  11. Solving large mixed linear models using preconditioned conjugate gradient iteration.

    Science.gov (United States)

    Strandén, I; Lidauer, M

    1999-12-01

    Continuous evaluation of dairy cattle with a random regression test-day model requires a fast solving method and algorithm. A new computing technique feasible in Jacobi and conjugate gradient based iterative methods using iteration on data is presented. In the new computing technique, the calculations in multiplication of a vector by a matrix were recorded to three steps instead of the commonly used two steps. The three-step method was implemented in a general mixed linear model program that used preconditioned conjugate gradient iteration. Performance of this program in comparison to other general solving programs was assessed via estimation of breeding values using univariate, multivariate, and random regression test-day models. Central processing unit time per iteration with the new three-step technique was, at best, one-third that needed with the old technique. Performance was best with the test-day model, which was the largest and most complex model used. The new program did well in comparison to other general software. Programs keeping the mixed model equations in random access memory required at least 20 and 435% more time to solve the univariate and multivariate animal models, respectively. Computations of the second best iteration on data took approximately three and five times longer for the animal and test-day models, respectively, than did the new program. Good performance was due to fast computing time per iteration and quick convergence to the final solutions. Use of preconditioned conjugate gradient based methods in solving large breeding value problems is supported by our findings.

  12. PARALLEL AND ADAPTIVE UNIFORM-DISTRIBUTED REGISTRATION METHOD FOR CHANG’E-1 LUNAR REMOTE SENSED IMAGERY

    Directory of Open Access Journals (Sweden)

    X. Ning

    2012-08-01

    To resolve the above-mentioned registration difficulties, a parallel and adaptive uniform-distributed registration method for CE-1 lunar remote sensed imagery is proposed in this paper. Based on 6 pairs of randomly selected images, both the standard SIFT algorithm and the parallel and adaptive uniform-distributed registration method were executed, the versatility and effectiveness were assessed. The experimental results indicate that: by applying the parallel and adaptive uniform-distributed registration method, the efficiency of CE-1 lunar remote sensed imagery registration were increased dramatically. Therefore, the proposed method in the paper could acquire uniform-distributed registration results more effectively, the registration difficulties including difficult to obtain results, time-consuming, non-uniform distribution could be successfully solved.

  13. MO-DE-207A-07: Filtered Iterative Reconstruction (FIR) Via Proximal Forward-Backward Splitting: A Synergy of Analytical and Iterative Reconstruction Method for CT

    International Nuclear Information System (INIS)

    Gao, H

    2016-01-01

    Purpose: This work is to develop a general framework, namely filtered iterative reconstruction (FIR) method, to incorporate analytical reconstruction (AR) method into iterative reconstruction (IR) method, for enhanced CT image quality. Methods: FIR is formulated as a combination of filtered data fidelity and sparsity regularization, and then solved by proximal forward-backward splitting (PFBS) algorithm. As a result, the image reconstruction decouples data fidelity and image regularization with a two-step iterative scheme, during which an AR-projection step updates the filtered data fidelity term, while a denoising solver updates the sparsity regularization term. During the AR-projection step, the image is projected to the data domain to form the data residual, and then reconstructed by certain AR to a residual image which is in turn weighted together with previous image iterate to form next image iterate. Since the eigenvalues of AR-projection operator are close to the unity, PFBS based FIR has a fast convergence. Results: The proposed FIR method is validated in the setting of circular cone-beam CT with AR being FDK and total-variation sparsity regularization, and has improved image quality from both AR and IR. For example, AIR has improved visual assessment and quantitative measurement in terms of both contrast and resolution, and reduced axial and half-fan artifacts. Conclusion: FIR is proposed to incorporate AR into IR, with an efficient image reconstruction algorithm based on PFBS. The CBCT results suggest that FIR synergizes AR and IR with improved image quality and reduced axial and half-fan artifacts. The authors was partially supported by the NSFC (#11405105), the 973 Program (#2015CB856000), and the Shanghai Pujiang Talent Program (#14PJ1404500).

  14. Effective Iterated Greedy Algorithm for Flow-Shop Scheduling Problems with Time lags

    Science.gov (United States)

    ZHAO, Ning; YE, Song; LI, Kaidian; CHEN, Siyu

    2017-05-01

    Flow shop scheduling problem with time lags is a practical scheduling problem and attracts many studies. Permutation problem(PFSP with time lags) is concentrated but non-permutation problem(non-PFSP with time lags) seems to be neglected. With the aim to minimize the makespan and satisfy time lag constraints, efficient algorithms corresponding to PFSP and non-PFSP problems are proposed, which consist of iterated greedy algorithm for permutation(IGTLP) and iterated greedy algorithm for non-permutation (IGTLNP). The proposed algorithms are verified using well-known simple and complex instances of permutation and non-permutation problems with various time lag ranges. The permutation results indicate that the proposed IGTLP can reach near optimal solution within nearly 11% computational time of traditional GA approach. The non-permutation results indicate that the proposed IG can reach nearly same solution within less than 1% computational time compared with traditional GA approach. The proposed research combines PFSP and non-PFSP together with minimal and maximal time lag consideration, which provides an interesting viewpoint for industrial implementation.

  15. Domain decomposition methods for the mixed dual formulation of the critical neutron diffusion problem; Methodes de decomposition de domaine pour la formulation mixte duale du probleme critique de la diffusion des neutrons

    Energy Technology Data Exchange (ETDEWEB)

    Guerin, P

    2007-12-15

    The neutronic simulation of a nuclear reactor core is performed using the neutron transport equation, and leads to an eigenvalue problem in the steady-state case. Among the deterministic resolution methods, diffusion approximation is often used. For this problem, the MINOS solver based on a mixed dual finite element method has shown his efficiency. In order to take advantage of parallel computers, and to reduce the computing time and the local memory requirement, we propose in this dissertation two domain decomposition methods for the resolution of the mixed dual form of the eigenvalue neutron diffusion problem. The first approach is a component mode synthesis method on overlapping sub-domains. Several Eigenmodes solutions of a local problem solved by MINOS on each sub-domain are taken as basis functions used for the resolution of the global problem on the whole domain. The second approach is a modified iterative Schwarz algorithm based on non-overlapping domain decomposition with Robin interface conditions. At each iteration, the problem is solved on each sub domain by MINOS with the interface conditions deduced from the solutions on the adjacent sub-domains at the previous iteration. The iterations allow the simultaneous convergence of the domain decomposition and the eigenvalue problem. We demonstrate the accuracy and the efficiency in parallel of these two methods with numerical results for the diffusion model on realistic 2- and 3-dimensional cores. (author)

  16. A non-iterative method for fitting decay curves with background

    International Nuclear Information System (INIS)

    Mukoyama, T.

    1982-01-01

    A non-iterative method for fitting a decay curve with background is presented. The sum of an exponential function and a constant term is linearized by the use of the difference equation and parameters are determined by the standard linear least-squares fitting. The validity of the present method has been tested against pseudo-experimental data. (orig.)

  17. Adaptive Coarse Spaces for FETI-DP and BDDC Methods

    OpenAIRE

    Radtke, Patrick

    2015-01-01

    Iterative substructuring methods are well suited for the parallel iterative solution of elliptic partial differential equations. These methods are based on subdividing the computational domain into smaller nonoverlapping subdomains and solving smaller problems on these subdomains. The solutions are then joined to a global solution in an iterative process. In case of a scalar diffusion equation or the equations of linear elasticity with a diffusion coefficient or Young modulus, respectively, ...

  18. Preconditioned Iterative Methods for Solving Weighted Linear Least Squares Problems

    Czech Academy of Sciences Publication Activity Database

    Bru, R.; Marín, J.; Mas, J.; Tůma, Miroslav

    2014-01-01

    Roč. 36, č. 4 (2014), A2002-A2022 ISSN 1064-8275 Institutional support: RVO:67985807 Keywords : preconditioned iterative methods * incomplete decompositions * approximate inverses * linear least squares Subject RIV: BA - General Mathematics Impact factor: 1.854, year: 2014

  19. Electromagnetic scattering using the iterative multi-region technique

    CERN Document Server

    Al Sharkawy, Mohamed H

    2007-01-01

    In this work, an iterative approach using the finite difference frequency domain method is presented to solve the problem of scattering from large-scale electromagnetic structures. The idea of the proposed iterative approach is to divide one computational domain into smaller subregions and solve each subregion separately. Then the subregion solutions are combined iteratively to obtain a solution for the complete domain. As a result, a considerable reduction in the computation time and memory is achieved. This procedure is referred to as the iterative multiregion (IMR) technique.Different enhan

  20. Optimal task mapping in safety-critical real-time parallel systems

    International Nuclear Information System (INIS)

    Aussagues, Ch.

    1998-01-01

    This PhD thesis is dealing with the correct design of safety-critical real-time parallel systems. Such systems constitutes a fundamental part of high-performance systems for command and control that can be found in the nuclear domain or more generally in parallel embedded systems. The verification of their temporal correctness is the core of this thesis. our contribution is mainly in the following three points: the analysis and extension of a programming model for such real-time parallel systems; the proposal of an original method based on a new operator of synchronized product of state machines task-graphs; the validation of the approach by its implementation and evaluation. The work addresses particularly the main problem of optimal task mapping on a parallel architecture, such that the temporal constraints are globally guaranteed, i.e. the timeliness property is valid. The results incorporate also optimally criteria for the sizing and correct dimensioning of a parallel system, for instance in the number of processing elements. These criteria are connected with operational constraints of the application domain. Our approach is based on the off-line analysis of the feasibility of the deadline-driven dynamic scheduling that is used to schedule tasks inside one processor. This leads us to define the synchronized-product, a system of linear, constraints is automatically generated and then allows to calculate a maximum load of a group of tasks and then to verify their timeliness constraints. The communications, their timeliness verification and incorporation to the mapping problem is the second main contribution of this thesis. FInally, the global solving technique dealing with both task and communication aspects has been implemented and evaluated in the framework of the OASIS project in the LETI research center at the CEA/Saclay. (author)

  1. Preconditioned iterative methods for space-time fractional advection-diffusion equations

    Science.gov (United States)

    Zhao, Zhi; Jin, Xiao-Qing; Lin, Matthew M.

    2016-08-01

    In this paper, we propose practical numerical methods for solving a class of initial-boundary value problems of space-time fractional advection-diffusion equations. First, we propose an implicit method based on two-sided Grünwald formulae and discuss its stability and consistency. Then, we develop the preconditioned generalized minimal residual (preconditioned GMRES) method and preconditioned conjugate gradient normal residual (preconditioned CGNR) method with easily constructed preconditioners. Importantly, because resulting systems are Toeplitz-like, fast Fourier transform can be applied to significantly reduce the computational cost. We perform numerical experiments to demonstrate the efficiency of our preconditioners, even in cases with variable coefficients.

  2. Domain decomposition methods for the mixed dual formulation of the critical neutron diffusion problem

    International Nuclear Information System (INIS)

    Guerin, P.

    2007-12-01

    The neutronic simulation of a nuclear reactor core is performed using the neutron transport equation, and leads to an eigenvalue problem in the steady-state case. Among the deterministic resolution methods, diffusion approximation is often used. For this problem, the MINOS solver based on a mixed dual finite element method has shown his efficiency. In order to take advantage of parallel computers, and to reduce the computing time and the local memory requirement, we propose in this dissertation two domain decomposition methods for the resolution of the mixed dual form of the eigenvalue neutron diffusion problem. The first approach is a component mode synthesis method on overlapping sub-domains. Several Eigenmodes solutions of a local problem solved by MINOS on each sub-domain are taken as basis functions used for the resolution of the global problem on the whole domain. The second approach is a modified iterative Schwarz algorithm based on non-overlapping domain decomposition with Robin interface conditions. At each iteration, the problem is solved on each sub domain by MINOS with the interface conditions deduced from the solutions on the adjacent sub-domains at the previous iteration. The iterations allow the simultaneous convergence of the domain decomposition and the eigenvalue problem. We demonstrate the accuracy and the efficiency in parallel of these two methods with numerical results for the diffusion model on realistic 2- and 3-dimensional cores. (author)

  3. Extending molecular simulation time scales: Parallel in time integrations for high-level quantum chemistry and complex force representations

    Energy Technology Data Exchange (ETDEWEB)

    Bylaska, Eric J., E-mail: Eric.Bylaska@pnnl.gov [Environmental Molecular Sciences Laboratory, Pacific Northwest National Laboratory, P.O. Box 999, Richland, Washington 99352 (United States); Weare, Jonathan Q., E-mail: weare@uchicago.edu [Department of Mathematics, University of Chicago, Chicago, Illinois 60637 (United States); Weare, John H., E-mail: jweare@ucsd.edu [Department of Chemistry and Biochemistry, University of California, San Diego, La Jolla, California 92093 (United States)

    2013-08-21

    Parallel in time simulation algorithms are presented and applied to conventional molecular dynamics (MD) and ab initio molecular dynamics (AIMD) models of realistic complexity. Assuming that a forward time integrator, f (e.g., Verlet algorithm), is available to propagate the system from time t{sub i} (trajectory positions and velocities x{sub i} = (r{sub i}, v{sub i})) to time t{sub i+1} (x{sub i+1}) by x{sub i+1} = f{sub i}(x{sub i}), the dynamics problem spanning an interval from t{sub 0}…t{sub M} can be transformed into a root finding problem, F(X) = [x{sub i} − f(x{sub (i−1})]{sub i} {sub =1,M} = 0, for the trajectory variables. The root finding problem is solved using a variety of root finding techniques, including quasi-Newton and preconditioned quasi-Newton schemes that are all unconditionally convergent. The algorithms are parallelized by assigning a processor to each time-step entry in the columns of F(X). The relation of this approach to other recently proposed parallel in time methods is discussed, and the effectiveness of various approaches to solving the root finding problem is tested. We demonstrate that more efficient dynamical models based on simplified interactions or coarsening time-steps provide preconditioners for the root finding problem. However, for MD and AIMD simulations, such preconditioners are not required to obtain reasonable convergence and their cost must be considered in the performance of the algorithm. The parallel in time algorithms developed are tested by applying them to MD and AIMD simulations of size and complexity similar to those encountered in present day applications. These include a 1000 Si atom MD simulation using Stillinger-Weber potentials, and a HCl + 4H{sub 2}O AIMD simulation at the MP2 level. The maximum speedup ((serial execution time)/(parallel execution time) ) obtained by parallelizing the Stillinger-Weber MD simulation was nearly 3.0. For the AIMD MP2 simulations, the algorithms achieved speedups of up

  4. Results of the JET real-time disruption predictor in the ITER-like wall campaigns

    International Nuclear Information System (INIS)

    Vega, Jesús; Dormido-Canto, Sebastián; López, Juan M.; Murari, Andrea; Ramírez, Jesús M.; Moreno, Raúl; Ruiz, Mariano; Alves, Diogo; Felton, Robert

    2013-01-01

    Highlights: •JET real-time disruption predictor with metallic wall 991 discharges analyzed. •Predictor training has been carried out with JET C wall data. •Success, false alarm and missed alarm rates are 98.4%, 0.9% and 1.6%, respectively. •Alarms are triggered in average 426 ms before the disruption. -- Abstract: The impact of disruptions in JET became even more important with the replacement of the previous Carbon Fiber Composite (CFC) wall with a more fragile full metal ITER-like wall (ILW). The development of robust disruption mitigation systems is crucial for JET (and also for ITER). Moreover, a reliable real-time (RT) disruption predictor is a pre-requisite to any mitigation method. The Advance Predictor Of DISruptions (APODIS) has been installed in the JET Real-Time Data Network (RTDN) for the RT recognition of disruptions. The predictor operates with the new ILW but it has been trained only with discharges belonging to campaigns with the CFC wall. 7 real-time signals are used to characterize the plasma status (disruptive or non-disruptive) at regular intervals of 32 ms. After the first 3 JET ILW campaigns (991 discharges), the success rate of the predictor is 98.36% (alarms are triggered in average 426 ms before the disruptions). The false alarm and missed alarm rates are 0.92% and 1.64%

  5. Results of the JET real-time disruption predictor in the ITER-like wall campaigns

    Energy Technology Data Exchange (ETDEWEB)

    Vega, Jesús, E-mail: jesus.vega@ciemat.es [Asociación EURATOM/CIEMAT para Fusión, Madrid (Spain); Dormido-Canto, Sebastián [Universidad Nacional de Educación a Distancia (UNED), Madrid (Spain); López, Juan M. [Universidad Politécnica de Madrid, CAEND UPM-CSIC, Madrid (Spain); Murari, Andrea [Consorzio RFX, Associazione EURATOM/ENEA per la Fusione, Padua (Italy); Ramírez, Jesús M. [Universidad Nacional de Educación a Distancia (UNED), Madrid (Spain); Moreno, Raúl [Asociación EURATOM/CIEMAT para Fusión, Madrid (Spain); Ruiz, Mariano [Universidad Politécnica de Madrid, CAEND UPM-CSIC, Madrid (Spain); Alves, Diogo [Associação EURATOM/IST, Instituto de Plasmas e Fusão Nuclear – Laboratório Associado, Instituto Superior Técnico, P-1049-001 Lisboa (Portugal); Felton, Robert [EURATOM/CCFE Fusion Association, Culham Science Center, Abingdon, Oxon OX14 3DB (United Kingdom)

    2013-10-15

    Highlights: •JET real-time disruption predictor with metallic wall 991 discharges analyzed. •Predictor training has been carried out with JET C wall data. •Success, false alarm and missed alarm rates are 98.4%, 0.9% and 1.6%, respectively. •Alarms are triggered in average 426 ms before the disruption. -- Abstract: The impact of disruptions in JET became even more important with the replacement of the previous Carbon Fiber Composite (CFC) wall with a more fragile full metal ITER-like wall (ILW). The development of robust disruption mitigation systems is crucial for JET (and also for ITER). Moreover, a reliable real-time (RT) disruption predictor is a pre-requisite to any mitigation method. The Advance Predictor Of DISruptions (APODIS) has been installed in the JET Real-Time Data Network (RTDN) for the RT recognition of disruptions. The predictor operates with the new ILW but it has been trained only with discharges belonging to campaigns with the CFC wall. 7 real-time signals are used to characterize the plasma status (disruptive or non-disruptive) at regular intervals of 32 ms. After the first 3 JET ILW campaigns (991 discharges), the success rate of the predictor is 98.36% (alarms are triggered in average 426 ms before the disruptions). The false alarm and missed alarm rates are 0.92% and 1.64%.

  6. BitPAl: a bit-parallel, general integer-scoring sequence alignment algorithm.

    Science.gov (United States)

    Loving, Joshua; Hernandez, Yozen; Benson, Gary

    2014-11-15

    Mapping of high-throughput sequencing data and other bulk sequence comparison applications have motivated a search for high-efficiency sequence alignment algorithms. The bit-parallel approach represents individual cells in an alignment scoring matrix as bits in computer words and emulates the calculation of scores by a series of logic operations composed of AND, OR, XOR, complement, shift and addition. Bit-parallelism has been successfully applied to the longest common subsequence (LCS) and edit-distance problems, producing fast algorithms in practice. We have developed BitPAl, a bit-parallel algorithm for general, integer-scoring global alignment. Integer-scoring schemes assign integer weights for match, mismatch and insertion/deletion. The BitPAl method uses structural properties in the relationship between adjacent scores in the scoring matrix to construct classes of efficient algorithms, each designed for a particular set of weights. In timed tests, we show that BitPAl runs 7-25 times faster than a standard iterative algorithm. Source code is freely available for download at http://lobstah.bu.edu/BitPAl/BitPAl.html. BitPAl is implemented in C and runs on all major operating systems. jloving@bu.edu or yhernand@bu.edu or gbenson@bu.edu Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press.

  7. Time complexity analysis for distributed memory computers: implementation of parallel conjugate gradient method

    NARCIS (Netherlands)

    Hoekstra, A.G.; Sloot, P.M.A.; Haan, M.J.; Hertzberger, L.O.; van Leeuwen, J.

    1991-01-01

    New developments in Computer Science, both hardware and software, offer researchers, such as physicists, unprecedented possibilities to solve their computational intensive problems.However, full exploitation of e.g. new massively parallel computers, parallel languages or runtime environments

  8. Verifying large modular systems using iterative abstraction refinement

    International Nuclear Information System (INIS)

    Lahtinen, Jussi; Kuismin, Tuomas; Heljanko, Keijo

    2015-01-01

    Digital instrumentation and control (I&C) systems are increasingly used in the nuclear engineering domain. The exhaustive verification of these systems is challenging, and the usual verification methods such as testing and simulation are typically insufficient. Model checking is a formal method that is able to exhaustively analyse the behaviour of a model against a formally written specification. If the model checking tool detects a violation of the specification, it will give out a counter-example that demonstrates how the specification is violated in the system. Unfortunately, sometimes real life system designs are too big to be directly analysed by traditional model checking techniques. We have developed an iterative technique for model checking large modular systems. The technique uses abstraction based over-approximations of the model behaviour, combined with iterative refinement. The main contribution of the work is the concrete abstraction refinement technique based on the modular structure of the model, the dependency graph of the model, and a refinement sampling heuristic similar to delta debugging. The technique is geared towards proving properties, and outperforms BDD-based model checking, the k-induction technique, and the property directed reachability algorithm (PDR) in our experiments. - Highlights: • We have developed an iterative technique for model checking large modular systems. • The technique uses BDD-based model checking, k-induction, and PDR in parallel. • We have tested our algorithm by verifying two models with it. • The technique outperforms classical model checking methods in our experiments

  9. Analysis of the iteratively regularized Gauss-Newton method under a heuristic rule

    Science.gov (United States)

    Jin, Qinian; Wang, Wei

    2018-03-01

    The iteratively regularized Gauss-Newton method is one of the most prominent regularization methods for solving nonlinear ill-posed inverse problems when the data is corrupted by noise. In order to produce a useful approximate solution, this iterative method should be terminated properly. The existing a priori and a posteriori stopping rules require accurate information on the noise level, which may not be available or reliable in practical applications. In this paper we propose a heuristic selection rule for this regularization method, which requires no information on the noise level. By imposing certain conditions on the noise, we derive a posteriori error estimates on the approximate solutions under various source conditions. Furthermore, we establish a convergence result without using any source condition. Numerical results are presented to illustrate the performance of our heuristic selection rule.

  10. Unified Lambert Tool for Massively Parallel Applications in Space Situational Awareness

    Science.gov (United States)

    Woollands, Robyn M.; Read, Julie; Hernandez, Kevin; Probe, Austin; Junkins, John L.

    2018-03-01

    This paper introduces a parallel-compiled tool that combines several of our recently developed methods for solving the perturbed Lambert problem using modified Chebyshev-Picard iteration. This tool (unified Lambert tool) consists of four individual algorithms, each of which is unique and better suited for solving a particular type of orbit transfer. The first is a Keplerian Lambert solver, which is used to provide a good initial guess (warm start) for solving the perturbed problem. It is also used to determine the appropriate algorithm to call for solving the perturbed problem. The arc length or true anomaly angle spanned by the transfer trajectory is the parameter that governs the automated selection of the appropriate perturbed algorithm, and is based on the respective algorithm convergence characteristics. The second algorithm solves the perturbed Lambert problem using the modified Chebyshev-Picard iteration two-point boundary value solver. This algorithm does not require a Newton-like shooting method and is the most efficient of the perturbed solvers presented herein, however the domain of convergence is limited to about a third of an orbit and is dependent on eccentricity. The third algorithm extends the domain of convergence of the modified Chebyshev-Picard iteration two-point boundary value solver to about 90% of an orbit, through regularization with the Kustaanheimo-Stiefel transformation. This is the second most efficient of the perturbed set of algorithms. The fourth algorithm uses the method of particular solutions and the modified Chebyshev-Picard iteration initial value solver for solving multiple revolution perturbed transfers. This method does require "shooting" but differs from Newton-like shooting methods in that it does not require propagation of a state transition matrix. The unified Lambert tool makes use of the General Mission Analysis Tool and we use it to compute thousands of perturbed Lambert trajectories in parallel on the Space Situational

  11. A Novel Compressed Sensing Method for Magnetic Resonance Imaging: Exponential Wavelet Iterative Shrinkage-Thresholding Algorithm with Random Shift

    Directory of Open Access Journals (Sweden)

    Yudong Zhang

    2016-01-01

    Full Text Available Aim. It can help improve the hospital throughput to accelerate magnetic resonance imaging (MRI scanning. Patients will benefit from less waiting time. Task. In the last decade, various rapid MRI techniques on the basis of compressed sensing (CS were proposed. However, both computation time and reconstruction quality of traditional CS-MRI did not meet the requirement of clinical use. Method. In this study, a novel method was proposed with the name of exponential wavelet iterative shrinkage-thresholding algorithm with random shift (abbreviated as EWISTARS. It is composed of three successful components: (i exponential wavelet transform, (ii iterative shrinkage-thresholding algorithm, and (iii random shift. Results. Experimental results validated that, compared to state-of-the-art approaches, EWISTARS obtained the least mean absolute error, the least mean-squared error, and the highest peak signal-to-noise ratio. Conclusion. EWISTARS is superior to state-of-the-art approaches.

  12. A Novel Compressed Sensing Method for Magnetic Resonance Imaging: Exponential Wavelet Iterative Shrinkage-Thresholding Algorithm with Random Shift

    Science.gov (United States)

    Zhang, Yudong; Yang, Jiquan; Yang, Jianfei; Liu, Aijun; Sun, Ping

    2016-01-01

    Aim. It can help improve the hospital throughput to accelerate magnetic resonance imaging (MRI) scanning. Patients will benefit from less waiting time. Task. In the last decade, various rapid MRI techniques on the basis of compressed sensing (CS) were proposed. However, both computation time and reconstruction quality of traditional CS-MRI did not meet the requirement of clinical use. Method. In this study, a novel method was proposed with the name of exponential wavelet iterative shrinkage-thresholding algorithm with random shift (abbreviated as EWISTARS). It is composed of three successful components: (i) exponential wavelet transform, (ii) iterative shrinkage-thresholding algorithm, and (iii) random shift. Results. Experimental results validated that, compared to state-of-the-art approaches, EWISTARS obtained the least mean absolute error, the least mean-squared error, and the highest peak signal-to-noise ratio. Conclusion. EWISTARS is superior to state-of-the-art approaches. PMID:27066068

  13. Depth-Averaged Non-Hydrostatic Hydrodynamic Model Using a New Multithreading Parallel Computing Method

    Directory of Open Access Journals (Sweden)

    Ling Kang

    2017-03-01

    Full Text Available Compared to the hydrostatic hydrodynamic model, the non-hydrostatic hydrodynamic model can accurately simulate flows that feature vertical accelerations. The model’s low computational efficiency severely restricts its wider application. This paper proposes a non-hydrostatic hydrodynamic model based on a multithreading parallel computing method. The horizontal momentum equation is obtained by integrating the Navier–Stokes equations from the bottom to the free surface. The vertical momentum equation is approximated by the Keller-box scheme. A two-step method is used to solve the model equations. A parallel strategy based on block decomposition computation is utilized. The original computational domain is subdivided into two subdomains that are physically connected via a virtual boundary technique. Two sub-threads are created and tasked with the computation of the two subdomains. The producer–consumer model and the thread lock technique are used to achieve synchronous communication between sub-threads. The validity of the model was verified by solitary wave propagation experiments over a flat bottom and slope, followed by two sinusoidal wave propagation experiments over submerged breakwater. The parallel computing method proposed here was found to effectively enhance computational efficiency and save 20%–40% computation time compared to serial computing. The parallel acceleration rate and acceleration efficiency are approximately 1.45% and 72%, respectively. The parallel computing method makes a contribution to the popularization of non-hydrostatic models.

  14. Fourier analysis of parallel block-Jacobi splitting with transport synthetic acceleration in two-dimensional geometry

    International Nuclear Information System (INIS)

    Rosa, M.; Warsa, J. S.; Chang, J. H.

    2007-01-01

    A Fourier analysis is conducted in two-dimensional (2D) Cartesian geometry for the discrete-ordinates (SN) approximation of the neutron transport problem solved with Richardson iteration (Source Iteration) and Richardson iteration preconditioned with Transport Synthetic Acceleration (TSA), using the Parallel Block-Jacobi (PBJ) algorithm. The results for the un-accelerated algorithm show that convergence of PBJ can degrade, leading in particular to stagnation of GMRES(m) in problems containing optically thin sub-domains. The results for the accelerated algorithm indicate that TSA can be used to efficiently precondition an iterative method in the optically thin case when implemented in the 'modified' version MTSA, in which only the scattering in the low order equations is reduced by some non-negative factor β<1. (authors)

  15. Extending molecular simulation time scales: Parallel in time integrations for high-level quantum chemistry and complex force representations.

    Science.gov (United States)

    Bylaska, Eric J; Weare, Jonathan Q; Weare, John H

    2013-08-21

    Parallel in time simulation algorithms are presented and applied to conventional molecular dynamics (MD) and ab initio molecular dynamics (AIMD) models of realistic complexity. Assuming that a forward time integrator, f (e.g., Verlet algorithm), is available to propagate the system from time ti (trajectory positions and velocities xi = (ri, vi)) to time ti + 1 (xi + 1) by xi + 1 = fi(xi), the dynamics problem spanning an interval from t0[ellipsis (horizontal)]tM can be transformed into a root finding problem, F(X) = [xi - f(x(i - 1)]i = 1, M = 0, for the trajectory variables. The root finding problem is solved using a variety of root finding techniques, including quasi-Newton and preconditioned quasi-Newton schemes that are all unconditionally convergent. The algorithms are parallelized by assigning a processor to each time-step entry in the columns of F(X). The relation of this approach to other recently proposed parallel in time methods is discussed, and the effectiveness of various approaches to solving the root finding problem is tested. We demonstrate that more efficient dynamical models based on simplified interactions or coarsening time-steps provide preconditioners for the root finding problem. However, for MD and AIMD simulations, such preconditioners are not required to obtain reasonable convergence and their cost must be considered in the performance of the algorithm. The parallel in time algorithms developed are tested by applying them to MD and AIMD simulations of size and complexity similar to those encountered in present day applications. These include a 1000 Si atom MD simulation using Stillinger-Weber potentials, and a HCl + 4H2O AIMD simulation at the MP2 level. The maximum speedup (serial execution/timeparallel execution time) obtained by parallelizing the Stillinger-Weber MD simulation was nearly 3.0. For the AIMD MP2 simulations, the algorithms achieved speedups of up to 14.3. The parallel in time algorithms can be implemented in a

  16. Self-balanced modulation and magnetic rebalancing method for parallel multilevel inverters

    Science.gov (United States)

    Li, Hui; Shi, Yanjun

    2017-11-28

    A self-balanced modulation method and a closed-loop magnetic flux rebalancing control method for parallel multilevel inverters. The combination of the two methods provides for balancing of the magnetic flux of the inter-cell transformers (ICTs) of the parallel multilevel inverters without deteriorating the quality of the output voltage. In various embodiments a parallel multi-level inverter modulator is provide including a multi-channel comparator to generate a multiplexed digitized ideal waveform for a parallel multi-level inverter and a finite state machine (FSM) module coupled to the parallel multi-channel comparator, the FSM module to receive the multiplexed digitized ideal waveform and to generate a pulse width modulated gate-drive signal for each switching device of the parallel multi-level inverter. The system and method provides for optimization of the output voltage spectrum without influence the magnetic balancing.

  17. A novel iterative energy calibration method for composite germanium detectors

    International Nuclear Information System (INIS)

    Pattabiraman, N.S.; Chintalapudi, S.N.; Ghugre, S.S.

    2004-01-01

    An automatic method for energy calibration of the observed experimental spectrum has been developed. The method presented is based on an iterative algorithm and presents an efficient way to perform energy calibrations after establishing the weights of the calibration data. An application of this novel technique for data acquired using composite detectors in an in-beam γ-ray spectroscopy experiment is presented

  18. A novel iterative energy calibration method for composite germanium detectors

    Energy Technology Data Exchange (ETDEWEB)

    Pattabiraman, N.S.; Chintalapudi, S.N.; Ghugre, S.S. E-mail: ssg@alpha.iuc.res.in

    2004-07-01

    An automatic method for energy calibration of the observed experimental spectrum has been developed. The method presented is based on an iterative algorithm and presents an efficient way to perform energy calibrations after establishing the weights of the calibration data. An application of this novel technique for data acquired using composite detectors in an in-beam {gamma}-ray spectroscopy experiment is presented.

  19. CPU time reduction strategies for the Lambda modes calculation of a nuclear power reactor

    Energy Technology Data Exchange (ETDEWEB)

    Vidal, V.; Garayoa, J.; Hernandez, V. [Universidad Politecnica de Valencia (Spain). Dept. de Sistemas Informaticos y Computacion; Navarro, J.; Verdu, G.; Munoz-Cobo, J.L. [Universidad Politecnica de Valencia (Spain). Dept. de Ingenieria Quimica y Nuclear; Ginestar, D. [Universidad Politecnica de Valencia (Spain). Dept. de Matematica Aplicada

    1997-12-01

    In this paper, we present two strategies to reduce the CPU time spent in the lambda modes calculation for a realistic nuclear power reactor.The discretization of the multigroup neutron diffusion equation has been made using a nodal collocation method, solving the associated eigenvalue problem with two different techniques: the Subspace Iteration Method and Arnoldi`s Method. CPU time reduction is based on a coarse grain parallelization approach together with a multistep algorithm to initialize adequately the solution. (author). 9 refs., 6 tabs.

  20. Post-convergence automatic differentiation of iterative schemes

    International Nuclear Information System (INIS)

    Azmy, Y.Y.

    1997-01-01

    A new approach for performing automatic differentiation (AD) of computer codes that embody an iterative procedure, based on differentiating a single additional iteration upon achieving convergence, is described and implemented. This post-convergence automatic differentiation (PAD) technique results in better accuracy of the computed derivatives, as it eliminates part of the derivatives convergence error, and a large reduction in execution time, especially when many iterations are required to achieve convergence. In addition, it provides a way to compute derivatives of the converged solution without having to repeat the entire iterative process every time new parameters are considered. These advantages are demonstrated and the PAD technique is validated via a set of three linear and nonlinear codes used to solve neutron transport and fluid flow problems. The PAD technique reduces the execution time over direct AD by a factor of up to 30 and improves the accuracy of the derivatives by up to two orders of magnitude. The PAD technique's biggest disadvantage lies in the necessity to compute the iterative map's Jacobian, which for large problems can be prohibitive. Methods are discussed to alleviate this difficulty

  1. A generic method for real time detection of magnetic sensor failure on tokamaks

    International Nuclear Information System (INIS)

    Nouailletas, Rémy; Moreau, Philippe; Bremond, Sylvain

    2012-01-01

    Highlights: ► We propose a generic method to detect and correct in real time faults on magnetic sensor. ► This method is applied to Tore Supra and tested offline with real data. ► Then the method is modified to be applied to ITER ex-vessel sensor configuration. ► The method is tested on the ITER case with simulated data. - Abstract: In tokamaks, magnetic field probe sensors are used to measure the plasma position. If a sensor provides a wrong data, the error may propagate through the control loop and cause undesirable contact between the vessel wall and the plasma. In the case of a tokamak with water cooled walls, these types of event may be very serious. Despite of these unlikely faults, the potential damages call for a real time check of magnetic sensor data before using them for control. In this paper a simple and generic method based on the comparison of each sensor to a weighted sum of its neighbors is proposed. From the analysis of the residue (the result of the comparison), the fault can be detected and compensated. The method is tuned and tested against Tore Supra experimental data. Then, the method is adapted to ITER and assessed on a reference ITER scenario using simulated magnetic sensor data.

  2. Creating IRT-Based Parallel Test Forms Using the Genetic Algorithm Method

    Science.gov (United States)

    Sun, Koun-Tem; Chen, Yu-Jen; Tsai, Shu-Yen; Cheng, Chien-Fen

    2008-01-01

    In educational measurement, the construction of parallel test forms is often a combinatorial optimization problem that involves the time-consuming selection of items to construct tests having approximately the same test information functions (TIFs) and constraints. This article proposes a novel method, genetic algorithm (GA), to construct parallel…

  3. A hyperpower iterative method for computing the generalized Drazin ...

    Indian Academy of Sciences (India)

    Shwetabh Srivastava

    [6, 7]. A number of direct and iterative methods for com- putation of the Drazin inverse were developed in [8–12]. Its extension to Banach algebras is known as the generalized Drazin inverse and was established in [13]. Let J denote the complex. Banach algebra with the unit 1. The generalized Drazin inverse of an element ...

  4. Optimization of image quality and acquisition time for lab-based X-ray microtomography using an iterative reconstruction algorithm

    Science.gov (United States)

    Lin, Qingyang; Andrew, Matthew; Thompson, William; Blunt, Martin J.; Bijeljic, Branko

    2018-05-01

    Non-invasive laboratory-based X-ray microtomography has been widely applied in many industrial and research disciplines. However, the main barrier to the use of laboratory systems compared to a synchrotron beamline is its much longer image acquisition time (hours per scan compared to seconds to minutes at a synchrotron), which results in limited application for dynamic in situ processes. Therefore, the majority of existing laboratory X-ray microtomography is limited to static imaging; relatively fast imaging (tens of minutes per scan) can only be achieved by sacrificing imaging quality, e.g. reducing exposure time or number of projections. To alleviate this barrier, we introduce an optimized implementation of a well-known iterative reconstruction algorithm that allows users to reconstruct tomographic images with reasonable image quality, but requires lower X-ray signal counts and fewer projections than conventional methods. Quantitative analysis and comparison between the iterative and the conventional filtered back-projection reconstruction algorithm was performed using a sandstone rock sample with and without liquid phases in the pore space. Overall, by implementing the iterative reconstruction algorithm, the required image acquisition time for samples such as this, with sparse object structure, can be reduced by a factor of up to 4 without measurable loss of sharpness or signal to noise ratio.

  5. Gravity-Assist Trajectories to the Ice Giants: An Automated Method to Catalog Mass-or Time-Optimal Solutions

    Science.gov (United States)

    Hughes, Kyle M.; Knittel, Jeremy M.; Englander, Jacob A.

    2017-01-01

    This work presents an automated method of calculating mass (or time) optimal gravity-assist trajectories without a priori knowledge of the flyby-body combination. Since gravity assists are particularly crucial for reaching the outer Solar System, we use the Ice Giants, Uranus and Neptune, as example destinations for this work. Catalogs are also provided that list the most attractive trajectories found over launch dates ranging from 2024 to 2038. The tool developed to implement this method, called the Python EMTG Automated Trade Study Application (PEATSA), iteratively runs the Evolutionary Mission Trajectory Generator (EMTG), a NASA Goddard Space Flight Center in-house trajectory optimization tool. EMTG finds gravity-assist trajectories with impulsive maneuvers using a multiple-shooting structure along with stochastic methods (such as monotonic basin hopping) and may be run with or without an initial guess provided. PEATSA runs instances of EMTG in parallel over a grid of launch dates. After each set of runs completes, the best results within a neighborhood of launch dates are used to seed all other cases in that neighborhood---allowing the solutions across the range of launch dates to improve over each iteration. The results here are compared against trajectories found using a grid-search technique, and PEATSA is found to outperform the grid-search results for most launch years considered.

  6. Generalized subspace correction methods

    Energy Technology Data Exchange (ETDEWEB)

    Kolm, P. [Royal Institute of Technology, Stockholm (Sweden); Arbenz, P.; Gander, W. [Eidgenoessiche Technische Hochschule, Zuerich (Switzerland)

    1996-12-31

    A fundamental problem in scientific computing is the solution of large sparse systems of linear equations. Often these systems arise from the discretization of differential equations by finite difference, finite volume or finite element methods. Iterative methods exploiting these sparse structures have proven to be very effective on conventional computers for a wide area of applications. Due to the rapid development and increasing demand for the large computing powers of parallel computers, it has become important to design iterative methods specialized for these new architectures.

  7. A parallel orbital-updating based plane-wave basis method for electronic structure calculations

    International Nuclear Information System (INIS)

    Pan, Yan; Dai, Xiaoying; Gironcoli, Stefano de; Gong, Xin-Gao; Rignanese, Gian-Marco; Zhou, Aihui

    2017-01-01

    Highlights: • Propose three parallel orbital-updating based plane-wave basis methods for electronic structure calculations. • These new methods can avoid the generating of large scale eigenvalue problems and then reduce the computational cost. • These new methods allow for two-level parallelization which is particularly interesting for large scale parallelization. • Numerical experiments show that these new methods are reliable and efficient for large scale calculations on modern supercomputers. - Abstract: Motivated by the recently proposed parallel orbital-updating approach in real space method , we propose a parallel orbital-updating based plane-wave basis method for electronic structure calculations, for solving the corresponding eigenvalue problems. In addition, we propose two new modified parallel orbital-updating methods. Compared to the traditional plane-wave methods, our methods allow for two-level parallelization, which is particularly interesting for large scale parallelization. Numerical experiments show that these new methods are more reliable and efficient for large scale calculations on modern supercomputers.

  8. Polynomial factor models : non-iterative estimation via method-of-moments

    NARCIS (Netherlands)

    Schuberth, Florian; Büchner, Rebecca; Schermelleh-Engel, Karin; Dijkstra, Theo K.

    2017-01-01

    We introduce a non-iterative method-of-moments estimator for non-linear latent variable (LV) models. Under the assumption of joint normality of all exogenous variables, we use the corrected moments of linear combinations of the observed indicators (proxies) to obtain consistent path coefficient and

  9. Recommendations for a cryogenic system for ITER [International Thermonuclear Experimental Reactor

    International Nuclear Information System (INIS)

    Slack, D.S.

    1989-01-01

    The International Thermonuclear Experimental Reactor (ITER) is a new tokamak design project with joint participation from Japan, the European Community, the Soviet Union, and the United States. ITER will be a large machine requiring up to 100 kW of refrigeration at 4.5 K to cool its superconducting magnets. Unlike earlier fusion experiments, the ITER cryogenic system must handle pulse loads constituting a large percentage of the total load. These come from neutron heating during a fusion burn and from ac losses during ramping of current in the PF (poloidal field) coils. This paper presents a conceptual design for a cryogenic system that meets ITER requirements. It describes a system with the following features: Only time-proven components are used. The system obtains a high efficiency without use of cold pumps or other developmental components. High reliability is achieved by paralleling compressors and expanders and by using adequate isolation valving. The problem of load fluctuations is solved by a simple load-leveling device. The cryogenic system can be housed in a separate building located at a considerable distance from the ITER core, if desired. The paper also summarizes physical plant size, cost estimates, and means of handling vented helium during magnet quench. 4 refs., 4 figs., 3 tabs

  10. The inverse method parametric verification of real-time embedded systems

    CERN Document Server

    André , Etienne

    2013-01-01

    This book introduces state-of-the-art verification techniques for real-time embedded systems, based on the inverse method for parametric timed automata. It reviews popular formalisms for the specification and verification of timed concurrent systems and, in particular, timed automata as well as several extensions such as timed automata equipped with stopwatches, linear hybrid automata and affine hybrid automata.The inverse method is introduced, and its benefits for guaranteeing robustness in real-time systems are shown. Then, it is shown how an iteration of the inverse method can solv

  11. Note: interpreting iterative methods convergence with diffusion point of view

    OpenAIRE

    Hong, Dohy

    2013-01-01

    In this paper, we explain the convergence speed of different iteration schemes with the fluid diffusion view when solving a linear fixed point problem. This interpretation allows one to better understand why power iteration or Jacobi iteration may converge faster or slower than Gauss-Seidel iteration.

  12. Parallel implementation of a dynamic unstructured chimera method in the DLR finite volume TAU-code

    Energy Technology Data Exchange (ETDEWEB)

    Madrane, A.; Raichle, A.; Stuermer, A. [German Aerospace Center, DLR, Numerical Methods, Inst. of Aerodynamics and Flow Technology, Braunschweig (Germany)]. E-mail: aziz.madrane@dlr.de

    2004-07-01

    Aerodynamic problems involving moving geometries have many applications, including store separation, high-speed train entering into a tunnel, simulation of full configurations of the helicopter and fast maneuverability. Overset grid method offers the option of calculating these procedures. The solution process uses a grid system that discretizes the problem domain by using separately generated but overlapping unstructured grids that update and exchange boundary information through interpolation. However, such computations are complicated and time consuming. Parallel computing offers a very effective way to improve the productivity in doing computational fluid dynamics (CFD). Therefore the purpose of this study is to develop an efficient parallel computation algorithm for analyzing the flowfield of complex geometries using overset grids method. The strategy adopted in the parallelization of the overset grids method including the use of data structures and communication, is described. Numerical results are presented to demonstrate the efficiency of the resulting parallel overset grids method. (author)

  13. Parallel implementation of a dynamic unstructured chimera method in the DLR finite volume TAU-code

    International Nuclear Information System (INIS)

    Madrane, A.; Raichle, A.; Stuermer, A.

    2004-01-01

    Aerodynamic problems involving moving geometries have many applications, including store separation, high-speed train entering into a tunnel, simulation of full configurations of the helicopter and fast maneuverability. Overset grid method offers the option of calculating these procedures. The solution process uses a grid system that discretizes the problem domain by using separately generated but overlapping unstructured grids that update and exchange boundary information through interpolation. However, such computations are complicated and time consuming. Parallel computing offers a very effective way to improve the productivity in doing computational fluid dynamics (CFD). Therefore the purpose of this study is to develop an efficient parallel computation algorithm for analyzing the flowfield of complex geometries using overset grids method. The strategy adopted in the parallelization of the overset grids method including the use of data structures and communication, is described. Numerical results are presented to demonstrate the efficiency of the resulting parallel overset grids method. (author)

  14. A New General Iterative Method for a Finite Family of Nonexpansive Mappings in Hilbert Spaces

    Directory of Open Access Journals (Sweden)

    Singthong Urailuk

    2010-01-01

    Full Text Available We introduce a new general iterative method by using the -mapping for finding a common fixed point of a finite family of nonexpansive mappings in the framework of Hilbert spaces. A strong convergence theorem of the purposed iterative method is established under some certain control conditions. Our results improve and extend the results announced by many others.

  15. More on Generalizations and Modifications of Iterative Methods for Solving Large Sparse Indefinite Linear Systems

    Directory of Open Access Journals (Sweden)

    Jen-Yuan Chen

    2014-01-01

    Full Text Available Continuing from the works of Li et al. (2014, Li (2007, and Kincaid et al. (2000, we present more generalizations and modifications of iterative methods for solving large sparse symmetric and nonsymmetric indefinite systems of linear equations. We discuss a variety of iterative methods such as GMRES, MGMRES, MINRES, LQ-MINRES, QR MINRES, MMINRES, MGRES, and others.

  16. The fusion code XGC: Enabling kinetic study of multi-scale edge turbulent transport in ITER

    Energy Technology Data Exchange (ETDEWEB)

    D' Azevedo, Eduardo [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Abbott, Stephen [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Koskela, Tuomas [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Worley, Patrick [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Ku, Seung-Hoe [Princeton Plasma Physics Lab. (PPPL), Princeton, NJ (United States); Ethier, Stephane [Princeton Plasma Physics Lab. (PPPL), Princeton, NJ (United States); Yoon, Eisung [Rensselaer Polytechnic Inst., Troy, NY (United States); Shephard, Mark [Rensselaer Polytechnic Inst., Troy, NY (United States); Hager, Robert [Princeton Plasma Physics Lab. (PPPL), Princeton, NJ (United States); Lang, Jianying [Princeton Plasma Physics Lab. (PPPL), Princeton, NJ (United States); Intel Corporation, Santa Clara, CA (United States); Choi, Jong [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Podhorszki, Norbert [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Klasky, Scott [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Parashar, Manish [Rutgers Univ., Piscataway, NJ (United States); Chang, Choong-Seock [Princeton Plasma Physics Lab. (PPPL), Princeton, NJ (United States)

    2017-01-01

    The XGC fusion gyrokinetic code combines state-of-the-art, portable computational and algorithmic technologies to enable complicated multiscale simulations of turbulence and transport dynamics in ITER edge plasma on the largest US open-science computer, the CRAY XK7 Titan, at its maximal heterogeneous capability, which have not been possible before due to a factor of over 10 shortage in the time-to-solution for less than 5 days of wall-clock time for one physics case. Frontier techniques such as nested OpenMP parallelism, adaptive parallel I/O, staging I/O and data reduction using dynamic and asynchronous applications interactions, dynamic repartitioning.

  17. Massive Asynchronous Parallelization of Sparse Matrix Factorizations

    Energy Technology Data Exchange (ETDEWEB)

    Chow, Edmond [Georgia Inst. of Technology, Atlanta, GA (United States)

    2018-01-08

    Solving sparse problems is at the core of many DOE computational science applications. We focus on the challenge of developing sparse algorithms that can fully exploit the parallelism in extreme-scale computing systems, in particular systems with massive numbers of cores per node. Our approach is to express a sparse matrix factorization as a large number of bilinear constraint equations, and then solving these equations via an asynchronous iterative method. The unknowns in these equations are the matrix entries of the factorization that is desired.

  18. Software abstractions and computational issues in parallel structure adaptive mesh methods for electronic structure calculations

    Energy Technology Data Exchange (ETDEWEB)

    Kohn, S.; Weare, J.; Ong, E.; Baden, S.

    1997-05-01

    We have applied structured adaptive mesh refinement techniques to the solution of the LDA equations for electronic structure calculations. Local spatial refinement concentrates memory resources and numerical effort where it is most needed, near the atomic centers and in regions of rapidly varying charge density. The structured grid representation enables us to employ efficient iterative solver techniques such as conjugate gradient with FAC multigrid preconditioning. We have parallelized our solver using an object- oriented adaptive mesh refinement framework.

  19. Parallelization of a hydrological model using the message passing interface

    Science.gov (United States)

    Wu, Yiping; Li, Tiejian; Sun, Liqun; Chen, Ji

    2013-01-01

    With the increasing knowledge about the natural processes, hydrological models such as the Soil and Water Assessment Tool (SWAT) are becoming larger and more complex with increasing computation time. Additionally, other procedures such as model calibration, which may require thousands of model iterations, can increase running time and thus further reduce rapid modeling and analysis. Using the widely-applied SWAT as an example, this study demonstrates how to parallelize a serial hydrological model in a Windows® environment using a parallel programing technology—Message Passing Interface (MPI). With a case study, we derived the optimal values for the two parameters (the number of processes and the corresponding percentage of work to be distributed to the master process) of the parallel SWAT (P-SWAT) on an ordinary personal computer and a work station. Our study indicates that model execution time can be reduced by 42%–70% (or a speedup of 1.74–3.36) using multiple processes (two to five) with a proper task-distribution scheme (between the master and slave processes). Although the computation time cost becomes lower with an increasing number of processes (from two to five), this enhancement becomes less due to the accompanied increase in demand for message passing procedures between the master and all slave processes. Our case study demonstrates that the P-SWAT with a five-process run may reach the maximum speedup, and the performance can be quite stable (fairly independent of a project size). Overall, the P-SWAT can help reduce the computation time substantially for an individual model run, manual and automatic calibration procedures, and optimization of best management practices. In particular, the parallelization method we used and the scheme for deriving the optimal parameters in this study can be valuable and easily applied to other hydrological or environmental models.

  20. A Posteriori Error Estimation for Finite Element Methods and Iterative Linear Solvers

    Energy Technology Data Exchange (ETDEWEB)

    Melboe, Hallgeir

    2001-10-01

    This thesis addresses a posteriori error estimation for finite element methods and iterative linear solvers. Adaptive finite element methods have gained a lot of popularity over the last decades due to their ability to produce accurate results with limited computer power. In these methods a posteriori error estimates play an essential role. Not only do they give information about how large the total error is, they also indicate which parts of the computational domain should be given a more sophisticated treatment in order to reduce the error. A posteriori error estimates are traditionally aimed at estimating the global error, but more recently so called goal oriented error estimators have been shown a lot of interest. The name reflects the fact that they estimate the error in user-defined local quantities. In this thesis the main focus is on global error estimators for highly stretched grids and goal oriented error estimators for flow problems on regular grids. Numerical methods for partial differential equations, such as finite element methods and other similar techniques, typically result in a linear system of equations that needs to be solved. Usually such systems are solved using some iterative procedure which due to a finite number of iterations introduces an additional error. Most such algorithms apply the residual in the stopping criterion, whereas the control of the actual error may be rather poor. A secondary focus in this thesis is on estimating the errors that are introduced during this last part of the solution procedure. The thesis contains new theoretical results regarding the behaviour of some well known, and a few new, a posteriori error estimators for finite element methods on anisotropic grids. Further, a goal oriented strategy for the computation of forces in flow problems is devised and investigated. Finally, an approach for estimating the actual errors associated with the iterative solution of linear systems of equations is suggested. (author)

  1. Phase reconstruction by a multilevel iteratively regularized Gauss–Newton method

    International Nuclear Information System (INIS)

    Langemann, Dirk; Tasche, Manfred

    2008-01-01

    In this paper we consider the numerical solution of a phase retrieval problem for a compactly supported, linear spline f : R → C with the Fourier transform f-circumflex, where values of |f| and |f-circumflex| at finitely many equispaced nodes are given. The unknown phases of complex spline coefficients fulfil a well-structured system of nonlinear equations. Thus the phase reconstruction leads to a nonlinear inverse problem, which is solved by a multilevel strategy and iterative Tikhonov regularization. The multilevel strategy concentrates the main effort of the solution of the phase retrieval problem in the coarse, less expensive levels and provides convenient initial guesses at the next finer level. On each level, the corresponding nonlinear system is solved by an iteratively regularized Gauss–Newton method. The multilevel strategy is motivated by convergence results of IRGN. This method is applicable to a wide range of examples as shown in several numerical tests for noiseless and noisy data

  2. Coupling methods for parallel running RELAPSim codes in nuclear power plant simulation

    Energy Technology Data Exchange (ETDEWEB)

    Li, Yankai; Lin, Meng, E-mail: linmeng@sjtu.edu.cn; Yang, Yanhua

    2016-02-15

    When the plant is modeled detailedly for high precision, it is hard to achieve real-time calculation for one single RELAP5 in a large-scale simulation. To improve the speed and ensure the precision of simulation at the same time, coupling methods for parallel running RELAPSim codes were proposed in this study. Explicit coupling method via coupling boundaries was realized based on a data-exchange and procedure-control environment. Compromise of synchronization frequency was well considered to improve the precision of simulation and guarantee the real-time simulation at the same time. The coupling methods were assessed using both single-phase flow models and two-phase flow models and good agreements were obtained between the splitting–coupling models and the integrated model. The mitigation of SGTR was performed as an integral application of the coupling models. A large-scope NPP simulator was developed adopting six splitting–coupling models of RELAPSim and other simulation codes. The coupling models could improve the speed of simulation significantly and make it possible for real-time calculation. In this paper, the coupling of the models in the engineering simulator is taken as an example to expound the coupling methods, i.e., coupling between parallel running RELAPSim codes, and coupling between RELAPSim code and other types of simulation codes. However, the coupling methods are also referable in other simulator, for example, a simulator employing ATHLETE instead of RELAP5, other logic code instead of SIMULINK. It is believed the coupling method is commonly used for NPP simulator regardless of the specific codes chosen in this paper.

  3. Coarse-grain parallel solution of few-group neutron diffusion equations

    International Nuclear Information System (INIS)

    Sarsour, H.N.; Turinsky, P.J.

    1991-01-01

    The authors present a parallel numerical algorithm for the solution of the finite difference representation of the few-group neutron diffusion equations. The targeted architectures are multiprocessor computers with shared memory like the Cray Y-MP and the IBM 3090/VF, where coarse granularity is important for minimizing overhead. Most of the work done in the past, which attempts to exploit concurrence, has concentrated on the inner iterations of the standard outer-inner iterative strategy. This produces very fine granularity. To coarsen granularity, the authors introduce parallelism at the nested outer-inner level. The problem's spatial domain was partitioned into contiguous subregions and assigned a processor to solve for each subregion independent of all other subregions, hence, processors; i.e., each subregion is treated as a reactor core with imposed boundary conditions. Since those boundary conditions on interior surfaces, referred to as internal boundary conditions (IBCs), are not known, a third iterative level, the recomposition iterations, is introduced to communicate results between subregions

  4. Externally calibrated parallel imaging for 3D multispectral imaging near metallic implants using broadband ultrashort echo time imaging.

    Science.gov (United States)

    Wiens, Curtis N; Artz, Nathan S; Jang, Hyungseok; McMillan, Alan B; Reeder, Scott B

    2017-06-01

    To develop an externally calibrated parallel imaging technique for three-dimensional multispectral imaging (3D-MSI) in the presence of metallic implants. A fast, ultrashort echo time (UTE) calibration acquisition is proposed to enable externally calibrated parallel imaging techniques near metallic implants. The proposed calibration acquisition uses a broadband radiofrequency (RF) pulse to excite the off-resonance induced by the metallic implant, fully phase-encoded imaging to prevent in-plane distortions, and UTE to capture rapidly decaying signal. The performance of the externally calibrated parallel imaging reconstructions was assessed using phantoms and in vivo examples. Phantom and in vivo comparisons to self-calibrated parallel imaging acquisitions show that significant reductions in acquisition times can be achieved using externally calibrated parallel imaging with comparable image quality. Acquisition time reductions are particularly large for fully phase-encoded methods such as spectrally resolved fully phase-encoded three-dimensional (3D) fast spin-echo (SR-FPE), in which scan time reductions of up to 8 min were obtained. A fully phase-encoded acquisition with broadband excitation and UTE enabled externally calibrated parallel imaging for 3D-MSI, eliminating the need for repeated calibration regions at each frequency offset. Significant reductions in acquisition time can be achieved, particularly for fully phase-encoded methods like SR-FPE. Magn Reson Med 77:2303-2309, 2017. © 2016 International Society for Magnetic Resonance in Medicine. © 2016 International Society for Magnetic Resonance in Medicine.

  5. Low-memory iterative density fitting.

    Science.gov (United States)

    Grajciar, Lukáš

    2015-07-30

    A new low-memory modification of the density fitting approximation based on a combination of a continuous fast multipole method (CFMM) and a preconditioned conjugate gradient solver is presented. Iterative conjugate gradient solver uses preconditioners formed from blocks of the Coulomb metric matrix that decrease the number of iterations needed for convergence by up to one order of magnitude. The matrix-vector products needed within the iterative algorithm are calculated using CFMM, which evaluates them with the linear scaling memory requirements only. Compared with the standard density fitting implementation, up to 15-fold reduction of the memory requirements is achieved for the most efficient preconditioner at a cost of only 25% increase in computational time. The potential of the method is demonstrated by performing density functional theory calculations for zeolite fragment with 2592 atoms and 121,248 auxiliary basis functions on a single 12-core CPU workstation. © 2015 Wiley Periodicals, Inc.

  6. Maintenance implications of critical components in ITER CXRS upper port plug design

    International Nuclear Information System (INIS)

    Koning, Jarich; Jaspers, Roger; Doornink, Jan; Ouwehand, Bernard; Klinkhamer, Friso; Snijders, Bart; Sadakov, Sergey; Heemskerk, Cock

    2009-01-01

    Already in the early phase of a design for ITER, the maintenance aspects should be taken into account, since they might have serious implications. This paper presents the arguments in support of the case for the maintainability of the design, notably if this maintenance is to be performed by advanced remote methods. This structure is compliant to the evolving maintenance strategy of ITER. Initial results of a Failure Mode Effects and Criticality Analysis (FMECA) and a development risk analysis for the ITER upper port plug no. 3, housing the Charge Exchange Recombination Spectroscopy (CXRS) diagnostic, are employed for the definition of the maintenance strategy. The CXRS upper port plug is essentially an optical system which transfers visible light from the plasma into a fiber bundle. The most critical component in this path is the first mirror (M1) whose reflectivity degrades during operation due to deposition and/or erosion dominated effects. Amongst other measures to mitigate these effects, the strategy is to allow for a replacement of this mirror. Therefore it is mounted on a retractable central tube. The main purpose of this tube is to make frequent replacements possible without hindering operation. The maintenance method in terms of time, geometry and spare part policy has a large impact on cost of the system and time usage in the hot cell. Replacement of the tube under vacuum and magnetic field seems infeasible due to the operational risk involved. The preferred solution is to have a spare tube available which is replaced in parallel with other maintenance operations on the vessel, as to avoid any interference in the hot cell with the shutdown scheduling. This avoids having to refurbish a full port plug and also allows for a more frequent replacement of M1, as we can replace the mirror anytime the vacuum vessel is vented, estimated to be once a year.

  7. Study of a Biparametric Family of Iterative Methods

    Directory of Open Access Journals (Sweden)

    B. Campos

    2014-01-01

    Full Text Available The dynamics of a biparametric family for solving nonlinear equations is studied on quadratic polynomials. This biparametric family includes the c-iterative methods and the well-known Chebyshev-Halley family. We find the analytical expressions for the fixed and critical points by solving 6-degree polynomials. We use the free critical points to get the parameter planes and, by observing them, we specify some values of (α, c with clear stable and unstable behaviors.

  8. Non-iterative method to calculate the periodical distribution of temperature in reactors with thermal regeneration

    International Nuclear Information System (INIS)

    Sanchez de Alsina, O.L.; Scaricabarozzi, R.A.

    1982-01-01

    A matrix non-iterative method to calculate the periodical distribution in reactors with thermal regeneration is presented. In case of exothermic reaction, a source term will be included. A computer code was developed to calculate the final temperature distribution in solids and in the outlet temperatures of the gases. The results obtained from ethane oxidation calculation in air, using the Dietrich kinetic data are presented. This method is more advantageous than iterative methods. (E.G.) [pt

  9. Qualification of phased array ultrasonic examination on T-joint weld of austenitic stainless steel for ITER vacuum vessel

    Energy Technology Data Exchange (ETDEWEB)

    Kim, G.H. [ITER Korea, National Fusion Research Institute, Daejeon 305-333 (Korea, Republic of); Park, C.K., E-mail: love879@hanmail.net [ITER Korea, National Fusion Research Institute, Daejeon 305-333 (Korea, Republic of); Jin, S.W.; Kim, H.S.; Hong, K.H.; Lee, Y.J.; Ahn, H.J.; Chung, W. [ITER Korea, National Fusion Research Institute, Daejeon 305-333 (Korea, Republic of); Jung, Y.H.; Roh, B.R. [Hyundai Heavy Industries Co. Ltd., Ulsan 682-792 (Korea, Republic of); Sa, J.W.; Choi, C.H. [ITER Organization, Route de Vinon-sur-Verdon, CS 90 046, 13067 St. Paul Lez Durance Cedex (France)

    2016-11-01

    Highlights: • PAUT techniques has been developed by Hyundai Heavy Industries Co., LTD (HHI) and Korea Domestic Agency (KODA) to verify and settle down instrument calibration, test procedures, image processing, and so on. As the first step of development for PAUT technique, Several dozens of qualification blocks with artificial defects, which are parallel side drilled hole, embedded lack of fusion, embedded repair weld notch, and so on, have been designed and fabricated to simulate all potential defects during welding process. Real UT qualification group-1 for T-joint weld was successfully conducted in front of ANB inspector. • In this paper, remarkable progresses of UT qualification are presented for ITER vacuum vessel. - Abstract: Full penetration welding and 100% volumetric examination are required for all welds of pressure retaining parts of the ITER Vacuum Vessel (VV) according to RCC-MR Code and French Order of Nuclear Pressure Equipment (ESPN). The NDE requirement is one of important technical issues because radiographic examination (RT) is not applicable to many welding joints. Therefore the ultrasonic examination (UT) has been selected as an alternative method. Generally the UT on the austenitic welds is regarded as a great challenge due to the high attenuation and dispersion of the ultrasonic signal. In this paper, Phased array ultrasonic examination (PAUT) has been introduced on double sided T-shape austenitic welds of the ITER VV as a major NDE method as well as RT. Several dozens of qualification blocks with artificial defects, which are parallel side drilled hole, embedded lack of fusion, embedded repair weld notch, embedded parallel vertical notch, and so on, have been designed and fabricated to simulate all potential defects during welding process. PAUT techniques on the thick austenitic welds have been developed taking into account the acceptance criteria. Test procedure including calibration of equipment is derived and qualified through

  10. Parallel computation of fluid-structural interactions using high resolution upwind schemes

    Science.gov (United States)

    Hu, Zongjun

    An efficient and accurate solver is developed to simulate the non-linear fluid-structural interactions in turbomachinery flutter flows. A new low diffusion E-CUSP scheme, Zha CUSP scheme, is developed to improve the efficiency and accuracy of the inviscid flux computation. The 3D unsteady Navier-Stokes equations with the Baldwin-Lomax turbulence model are solved using the finite volume method with the dual-time stepping scheme. The linearized equations are solved with Gauss-Seidel line iterations. The parallel computation is implemented using MPI protocol. The solver is validated with 2D cases for its turbulence modeling, parallel computation and unsteady calculation. The Zha CUSP scheme is validated with 2D cases, including a supersonic flat plate boundary layer, a transonic converging-diverging nozzle and a transonic inlet diffuser. The Zha CUSP2 scheme is tested with 3D cases, including a circular-to-rectangular nozzle, a subsonic compressor cascade and a transonic channel. The Zha CUSP schemes are proved to be accurate, robust and efficient in these tests. The steady and unsteady separation flows in a 3D stationary cascade under high incidence and three inlet Mach numbers are calculated to study the steady state separation flow patterns and their unsteady oscillation characteristics. The leading edge vortex shedding is the mechanism behind the unsteady characteristics of the high incidence separated flows. The separation flow characteristics is affected by the inlet Mach number. The blade aeroelasticity of a linear cascade with forced oscillating blades is studied using parallel computation. A simplified two-passage cascade with periodic boundary condition is first calculated under a medium frequency and a low incidence. The full scale cascade with 9 blades and two end walls is then studied more extensively under three oscillation frequencies and two incidence angles. The end wall influence and the blade stability are studied and compared under different

  11. IWR-solution for the ITER vacuum vessel assembly

    Energy Technology Data Exchange (ETDEWEB)

    Wu, H., E-mail: huapeng@lut.fi [Laboratory of Intelligent Machines, Lappeenranta University of Technology (Finland); Handroos, H. [Laboratory of Intelligent Machines, Lappeenranta University of Technology (Finland); Pela, P. [Tekes (Finland); Wang, Y. [Laboratory of Intelligent Machines, Lappeenranta University of Technology (Finland)

    2011-10-15

    The assembly of ITER vacuum vessel (VV) is still a very big challenge as the process can only be done from inside the VV. The welding of the VV assembly is carried out using the dedicated robotic systems. The main functions of the robots are: (i) measuring the actual space between every two sectors, (ii) positioning of the 150 kg splice plates between the sector shells, (iii) welding the splice plates to the sector shells, (iv) NDT of the welds, (v) repairing, including machining of the welds, (vi) He-leak tests of the welds, and (vii) the non-planned functions that may turn out. This paper presents a reasonable method to assemble the ITER VV. In this article, one parallel mobile robot, running on the track rail fixed on the wall inside the VV, is designed and tested. The assembling process, carried out by the mobile robot together with the welding robot, is presented.

  12. A scalable method for parallelizing sampling-based motion planning algorithms

    KAUST Repository

    Jacobs, Sam Ade; Manavi, Kasra; Burgos, Juan; Denny, Jory; Thomas, Shawna; Amato, Nancy M.

    2012-01-01

    This paper describes a scalable method for parallelizing sampling-based motion planning algorithms. It subdivides configuration space (C-space) into (possibly overlapping) regions and independently, in parallel, uses standard (sequential) sampling-based planners to construct roadmaps in each region. Next, in parallel, regional roadmaps in adjacent regions are connected to form a global roadmap. By subdividing the space and restricting the locality of connection attempts, we reduce the work and inter-processor communication associated with nearest neighbor calculation, a critical bottleneck for scalability in existing parallel motion planning methods. We show that our method is general enough to handle a variety of planning schemes, including the widely used Probabilistic Roadmap (PRM) and Rapidly-exploring Random Trees (RRT) algorithms. We compare our approach to two other existing parallel algorithms and demonstrate that our approach achieves better and more scalable performance. Our approach achieves almost linear scalability on a 2400 core LINUX cluster and on a 153,216 core Cray XE6 petascale machine. © 2012 IEEE.

  13. A scalable method for parallelizing sampling-based motion planning algorithms

    KAUST Repository

    Jacobs, Sam Ade

    2012-05-01

    This paper describes a scalable method for parallelizing sampling-based motion planning algorithms. It subdivides configuration space (C-space) into (possibly overlapping) regions and independently, in parallel, uses standard (sequential) sampling-based planners to construct roadmaps in each region. Next, in parallel, regional roadmaps in adjacent regions are connected to form a global roadmap. By subdividing the space and restricting the locality of connection attempts, we reduce the work and inter-processor communication associated with nearest neighbor calculation, a critical bottleneck for scalability in existing parallel motion planning methods. We show that our method is general enough to handle a variety of planning schemes, including the widely used Probabilistic Roadmap (PRM) and Rapidly-exploring Random Trees (RRT) algorithms. We compare our approach to two other existing parallel algorithms and demonstrate that our approach achieves better and more scalable performance. Our approach achieves almost linear scalability on a 2400 core LINUX cluster and on a 153,216 core Cray XE6 petascale machine. © 2012 IEEE.

  14. Real-time data acquisition and parallel data processing solution for TJ-II Bolometer arrays diagnostic

    Energy Technology Data Exchange (ETDEWEB)

    Barrera, E. [Departamento de Sistemas Electronicos y de Control, Universidad Politecnica de Madrid, Crta. Valencia Km. 7, 28031 Madrid (Spain)]. E-mail: eduardo.barrera@upm.es; Ruiz, M. [Grupo de Investigacion en Instrumentacion y Acustica Aplicada, Universidad Politecnica de Madrid, Crta. Valencia Km. 7, 28031 Madrid (Spain); Lopez, S. [Departamento de Sistemas Electronicos y de Control, Universidad Politecnica de Madrid, Crta. Valencia Km. 7, 28031 Madrid (Spain); Machon, D. [Departamento de Sistemas Electronicos y de Control, Universidad Politecnica de Madrid, Crta. Valencia Km. 7, 28031 Madrid (Spain); Vega, J. [Asociacion EURATOM/CIEMAT para Fusion, 28040 Madrid (Spain); Ochando, M. [Asociacion EURATOM/CIEMAT para Fusion, 28040 Madrid (Spain)

    2006-07-15

    Maps of local plasma emissivity of TJ-II plasmas are determined using three-array cameras of silicon photodiodes (AXUV type from IRD). They have assigned the top and side ports of the same sector of the vacuum vessel. Each array consists of 20 unfiltered detectors. The signals from each of these detectors are the inputs to an iterative algorithm of tomographic reconstruction. Currently, these signals are acquired by a PXI standard system at approximately 50 kS/s, with 12 bits of resolution and are stored for off-line processing. A 0.5 s discharge generates 3 Mbytes of raw data. The algorithm's load exceeds the CPU capacity of the PXI system's controller in a continuous mode, making unfeasible to process the samples in parallel with their acquisition in a PXI standard system. A new architecture model has been developed, making possible to add one or several processing cards to a standard PXI system. With this model, it is possible to define how to distribute, in real-time, the data from all acquired signals in the system among the processing cards and the PXI controller. This way, by distributing the data processing among the system controller and two processing cards, the data processing can be done in parallel with the acquisition. Hence, this system configuration would be able to measure even in long pulse devices.

  15. Analyzing the Impacts of Alternated Number of Iterations in Multiple Imputation Method on Explanatory Factor Analysis

    Directory of Open Access Journals (Sweden)

    Duygu KOÇAK

    2017-11-01

    Full Text Available The study aims to identify the effects of iteration numbers used in multiple iteration method, one of the methods used to cope with missing values, on the results of factor analysis. With this aim, artificial datasets of different sample sizes were created. Missing values at random and missing values at complete random were created in various ratios by deleting data. For the data in random missing values, a second variable was iterated at ordinal scale level and datasets with different ratios of missing values were obtained based on the levels of this variable. The data were generated using “psych” program in R software, while “dplyr” program was used to create codes that would delete values according to predetermined conditions of missing value mechanism. Different datasets were generated by applying different iteration numbers. Explanatory factor analysis was conducted on the datasets completed and the factors and total explained variances are presented. These values were first evaluated based on the number of factors and total variance explained of the complete datasets. The results indicate that multiple iteration method yields a better performance in cases of missing values at random compared to datasets with missing values at complete random. Also, it was found that increasing the number of iterations in both missing value datasets decreases the difference in the results obtained from complete datasets.

  16. Iteratively-coupled propagating exterior complex scaling method for electron-hydrogen collisions

    International Nuclear Information System (INIS)

    Bartlett, Philip L; Stelbovics, Andris T; Bray, Igor

    2004-01-01

    A newly-derived iterative coupling procedure for the propagating exterior complex scaling (PECS) method is used to efficiently calculate the electron-impact wavefunctions for atomic hydrogen. An overview of this method is given along with methods for extracting scattering cross sections. Differential scattering cross sections at 30 eV are presented for the electron-impact excitation to the n = 1, 2, 3 and 4 final states, for both PECS and convergent close coupling (CCC), which are in excellent agreement with each other and with experiment. PECS results are presented at 27.2 eV and 30 eV for symmetric and asymmetric energy-sharing triple differential cross sections, which are in excellent agreement with CCC and exterior complex scaling calculations, and with experimental data. At these intermediate energies, the efficiency of the PECS method with iterative coupling has allowed highly accurate partial-wave solutions of the full Schroedinger equation, for L ≤ 50 and a large number of coupled angular momentum states, to be obtained with minimal computing resources. (letter to the editor)

  17. ESPRIT-Forest: Parallel clustering of massive amplicon sequence data in subquadratic time.

    Science.gov (United States)

    Cai, Yunpeng; Zheng, Wei; Yao, Jin; Yang, Yujie; Mai, Volker; Mao, Qi; Sun, Yijun

    2017-04-01

    The rapid development of sequencing technology has led to an explosive accumulation of genomic sequence data. Clustering is often the first step to perform in sequence analysis, and hierarchical clustering is one of the most commonly used approaches for this purpose. However, it is currently computationally expensive to perform hierarchical clustering of extremely large sequence datasets due to its quadratic time and space complexities. In this paper we developed a new algorithm called ESPRIT-Forest for parallel hierarchical clustering of sequences. The algorithm achieves subquadratic time and space complexity and maintains a high clustering accuracy comparable to the standard method. The basic idea is to organize sequences into a pseudo-metric based partitioning tree for sub-linear time searching of nearest neighbors, and then use a new multiple-pair merging criterion to construct clusters in parallel using multiple threads. The new algorithm was tested on the human microbiome project (HMP) dataset, currently one of the largest published microbial 16S rRNA sequence dataset. Our experiment demonstrated that with the power of parallel computing it is now compu- tationally feasible to perform hierarchical clustering analysis of tens of millions of sequences. The software is available at http://www.acsu.buffalo.edu/∼yijunsun/lab/ESPRIT-Forest.html.

  18. Impact of Optimization and Parallelism on Factorization Speed of SIQS

    Directory of Open Access Journals (Sweden)

    Dominik Breitenbacher

    2016-06-01

    Full Text Available This paper examines optimization possibilities of Self-Initialization Quadratic Sieve (SIQS, which is enhanced version of Quadratic Sieve factorization method. SIQS is considered the second fastest factorization method at all and the fastest one for numbers shorter than 100 decimal digits, respectively. Although, SIQS is the fastest method up to 100 decimal digits, it cannot be effectively utilized to work in polynomial time. Therefore, it is desirable to look for options how to speed up the method as much as possible. Two feasible ways of achieving it are code optimization and parallelism. Both of them are utilized in this paper. The goal of this paper is to show how it is possible to take advantage of parallelism in SIQS as well as reach a large speed-up thanks to detailed source code analysis with optimization. Our implementation process consists of two phases. In the first phase, the complete serial algorithm is implemented in the simplest way which does not consider any requirements for execution speed. The solution from the first phase serves as the reference implementation for further experiments. An improvement of factorization speed is performed in the second phase of the SIQS implementation, where we use the method of iterative modifications in order to examine contribution of each proposed step. The final optimized version of the SIQS implementation has achieved over 200x speed-up.

  19. Efficient relaxed-Jacobi smoothers for multigrid on parallel computers

    Science.gov (United States)

    Yang, Xiang; Mittal, Rajat

    2017-03-01

    In this Technical Note, we present a family of Jacobi-based multigrid smoothers suitable for the solution of discretized elliptic equations. These smoothers are based on the idea of scheduled-relaxation Jacobi proposed recently by Yang & Mittal (2014) [18] and employ two or three successive relaxed Jacobi iterations with relaxation factors derived so as to maximize the smoothing property of these iterations. The performance of these new smoothers measured in terms of convergence acceleration and computational workload, is assessed for multi-domain implementations typical of parallelized solvers, and compared to the lexicographic point Gauss-Seidel smoother. The tests include the geometric multigrid method on structured grids as well as the algebraic grid method on unstructured grids. The tests demonstrate that unlike Gauss-Seidel, the convergence of these Jacobi-based smoothers is unaffected by domain decomposition, and furthermore, they outperform the lexicographic Gauss-Seidel by factors that increase with domain partition count.

  20. Efficacy of variational iteration method for chaotic Genesio system - Classical and multistage approach

    International Nuclear Information System (INIS)

    Goh, S.M.; Noorani, M.S.M.; Hashim, I.

    2009-01-01

    This is a case study of solving the Genesio system by using the classical variational iteration method (VIM) and a newly modified version called the multistage VIM (MVIM). VIM is an analytical technique that grants us a continuous representation of the approximate solution, which allows better information of the solution over the time interval. Unlike its counterpart, numerical techniques, such as the Runge-Kutta method, provide solutions only at two ends of the time interval (with condition that the selected time interval is adequately small for convergence). Furthermore, it offers approximate solutions in a discretized form, making it complicated in achieving a continuous representation. The explicit solutions through VIM and MVIM are compared with the numerical analysis of the fourth-order Runge-Kutta method (RK4). VIM had been successfully applied to linear and nonlinear systems of non-chaotic in nature and this had been testified by numerous scientists lately. Our intention is to determine whether VIM is also a feasible method in solving a chaotic system like Genesio. At the same time, MVIM will be applied to gauge its accuracy compared to VIM and RK4. Since, for most situations, the validity domain of the solutions is often an issue, we will consider a reasonably large time frame in our work.

  1. PARALLEL SOLUTION METHODS OF PARTIAL DIFFERENTIAL EQUATIONS

    Directory of Open Access Journals (Sweden)

    Korhan KARABULUT

    1998-03-01

    Full Text Available Partial differential equations arise in almost all fields of science and engineering. Computer time spent in solving partial differential equations is much more than that of in any other problem class. For this reason, partial differential equations are suitable to be solved on parallel computers that offer great computation power. In this study, parallel solution to partial differential equations with Jacobi, Gauss-Siedel, SOR (Succesive OverRelaxation and SSOR (Symmetric SOR algorithms is studied.

  2. Massively parallel red-black algorithms for x-y-z response matrix equations

    International Nuclear Information System (INIS)

    Hanebutte, U.R.; Laurin-Kovitz, K.; Lewis, E.E.

    1992-01-01

    Recently, both discrete ordinates and spherical harmonic (S n and P n ) methods have been cast in the form of response matrices. In x-y geometry, massively parallel algorithms have been developed to solve the resulting response matrix equations on the Connection Machine family of parallel computers, the CM-2, CM-200, and CM-5. These algorithms utilize two-cycle iteration on a red-black checkerboard. In this work we examine the use of massively parallel red-black algorithms to solve response matric equations in three dimensions. This longer term objective is to utilize massively parallel algorithms to solve S n and/or P n response matrix problems. In this exploratory examination, however, we consider the simple 6 x 6 response matrices that are derivable from fine-mesh diffusion approximations in three dimensions

  3. Newton-sor iterative method for solving the two-dimensional porous ...

    African Journals Online (AJOL)

    In this paper, we consider the application of the Newton-SOR iterative method in obtaining the approximate solution of the two-dimensional porous medium equation (2D PME). The nonlinear finite difference approximation equation to the 2D PME is derived by using the implicit finite difference scheme. The developed ...

  4. First-Order Hyperbolic System Method for Time-Dependent Advection-Diffusion Problems

    Science.gov (United States)

    2014-03-01

    accuracy, with rapid convergence over each physical time step, typically less than five Newton iter - ations. 1 Contents 1 Introduction 3 2 Hyperbolic...however, we employ the Gauss - Seidel (GS) relaxation, which is also an O(N) method for the discretization arising from hyperbolic advection-diffusion system...advection-diffusion scheme. The linear dependency of the iterations on Table 1: Boundary layer problem ( Convergence criteria: Residuals < 10−8.) log10Re

  5. Comparison between the Variational Iteration Method and the Homotopy Perturbation Method for the Sturm-Liouville Differential Equation

    OpenAIRE

    Darzi R; Neamaty A

    2010-01-01

    We applied the variational iteration method and the homotopy perturbation method to solve Sturm-Liouville eigenvalue and boundary value problems. The main advantage of these methods is the flexibility to give approximate and exact solutions to both linear and nonlinear problems without linearization or discretization. The results show that both methods are simple and effective.

  6. Semi-coarsening multigrid methods for parallel computing

    Energy Technology Data Exchange (ETDEWEB)

    Jones, J.E.

    1996-12-31

    Standard multigrid methods are not well suited for problems with anisotropic coefficients which can occur, for example, on grids that are stretched to resolve a boundary layer. There are several different modifications of the standard multigrid algorithm that yield efficient methods for anisotropic problems. In the paper, we investigate the parallel performance of these multigrid algorithms. Multigrid algorithms which work well for anisotropic problems are based on line relaxation and/or semi-coarsening. In semi-coarsening multigrid algorithms a grid is coarsened in only one of the coordinate directions unlike standard or full-coarsening multigrid algorithms where a grid is coarsened in each of the coordinate directions. When both semi-coarsening and line relaxation are used, the resulting multigrid algorithm is robust and automatic in that it requires no knowledge of the nature of the anisotropy. This is the basic multigrid algorithm whose parallel performance we investigate in the paper. The algorithm is currently being implemented on an IBM SP2 and its performance is being analyzed. In addition to looking at the parallel performance of the basic semi-coarsening algorithm, we present algorithmic modifications with potentially better parallel efficiency. One modification reduces the amount of computational work done in relaxation at the expense of using multiple coarse grids. This modification is also being implemented with the aim of comparing its performance to that of the basic semi-coarsening algorithm.

  7. A sparsity-regularized Born iterative method for reconstruction of two-dimensional piecewise continuous inhomogeneous domains

    KAUST Repository

    Sandhu, Ali Imran; Desmal, Abdulla; Bagci, Hakan

    2016-01-01

    A sparsity-regularized Born iterative method (BIM) is proposed for efficiently reconstructing two-dimensional piecewise-continuous inhomogeneous dielectric profiles. Such profiles are typically not spatially sparse, which reduces the efficiency of the sparsity-promoting regularization. To overcome this problem, scattered fields are represented in terms of the spatial derivative of the dielectric profile and reconstruction is carried out over samples of the dielectric profile's derivative. Then, like the conventional BIM, the nonlinear problem is iteratively converted into a sequence of linear problems (in derivative samples) and sparsity constraint is enforced on each linear problem using the thresholded Landweber iterations. Numerical results, which demonstrate the efficiency and accuracy of the proposed method in reconstructing piecewise-continuous dielectric profiles, are presented.

  8. An iterative method for the solution of nonlinear systems using the Faber polynomials for annular sectors

    Energy Technology Data Exchange (ETDEWEB)

    Myers, N.J. [Univ. of Durham (United Kingdom)

    1994-12-31

    The author gives a hybrid method for the iterative solution of linear systems of equations Ax = b, where the matrix (A) is nonsingular, sparse and nonsymmetric. As in a method developed by Starke and Varga the method begins with a number of steps of the Arnoldi method to produce some information on the location of the spectrum of A. This method then switches to an iterative method based on the Faber polynomials for an annular sector placed around these eigenvalue estimates. The Faber polynomials for an annular sector are used because, firstly an annular sector can easily be placed around any eigenvalue estimates bounded away from zero, and secondly the Faber polynomials are known analytically for an annular sector. Finally the author gives three numerical examples, two of which allow comparison with Starke and Varga`s results. The third is an example of a matrix for which many iterative methods would fall, but this method converges.

  9. ITER radio frequency systems

    International Nuclear Information System (INIS)

    Bosia, G.

    1998-01-01

    Neutral Beam Injection and RF heating are two of the methods for heating and current drive in ITER. The three ITER RF systems, which have been developed during the EDA, offer several complementary services and are able to fulfil ITER operational requirements

  10. Approximate convex hull of affine iterated function system attractors

    International Nuclear Information System (INIS)

    Mishkinis, Anton; Gentil, Christian; Lanquetin, Sandrine; Sokolov, Dmitry

    2012-01-01

    Highlights: ► We present an iterative algorithm to approximate affine IFS attractor convex hull. ► Elimination of the interior points significantly reduces the complexity. ► To optimize calculations, we merge the convex hull images at each iteration. ► Approximation by ellipses increases speed of convergence to the exact convex hull. ► We present a method of the output convex hull simplification. - Abstract: In this paper, we present an algorithm to construct an approximate convex hull of the attractors of an affine iterated function system (IFS). We construct a sequence of convex hull approximations for any required precision using the self-similarity property of the attractor in order to optimize calculations. Due to the affine properties of IFS transformations, the number of points considered in the construction is reduced. The time complexity of our algorithm is a linear function of the number of iterations and the number of points in the output approximate convex hull. The number of iterations and the execution time increases logarithmically with increasing accuracy. In addition, we introduce a method to simplify the approximate convex hull without loss of accuracy.

  11. Environmental dose rate assessment of ITER using the Monte Carlo method

    Directory of Open Access Journals (Sweden)

    Karimian Alireza

    2014-01-01

    Full Text Available Exposure to radiation is one of the main sources of risk to staff employed in reactor facilities. The staff of a tokamak is exposed to a wide range of neutrons and photons around the tokamak hall. The International Thermonuclear Experimental Reactor (ITER is a nuclear fusion engineering project and the most advanced experimental tokamak in the world. From the radiobiological point of view, ITER dose rates assessment is particularly important. The aim of this study is the assessment of the amount of radiation in ITER during its normal operation in a radial direction from the plasma chamber to the tokamak hall. To achieve this goal, the ITER system and its components were simulated by the Monte Carlo method using the MCNPX 2.6.0 code. Furthermore, the equivalent dose rates of some radiosensitive organs of the human body were calculated by using the medical internal radiation dose phantom. Our study is based on the deuterium-tritium plasma burning by 14.1 MeV neutron production and also photon radiation due to neutron activation. As our results show, the total equivalent dose rate on the outside of the bioshield wall of the tokamak hall is about 1 mSv per year, which is less than the annual occupational dose rate limit during the normal operation of ITER. Also, equivalent dose rates of radiosensitive organs have shown that the maximum dose rate belongs to the kidney. The data may help calculate how long the staff can stay in such an environment, before the equivalent dose rates reach the whole-body dose limits.

  12. A space-time lower-upper symmetric Gauss-Seidel scheme for the time-spectral method

    Science.gov (United States)

    Zhan, Lei; Xiong, Juntao; Liu, Feng

    2016-05-01

    The time-spectral method (TSM) offers the advantage of increased order of accuracy compared to methods using finite-difference in time for periodic unsteady flow problems. Explicit Runge-Kutta pseudo-time marching and implicit schemes have been developed to solve iteratively the space-time coupled nonlinear equations resulting from TSM. Convergence of the explicit schemes is slow because of the stringent time-step limit. Many implicit methods have been developed for TSM. Their computational efficiency is, however, still limited in practice because of delayed implicit temporal coupling, multiple iterative loops, costly matrix operations, or lack of strong diagonal dominance of the implicit operator matrix. To overcome these shortcomings, an efficient space-time lower-upper symmetric Gauss-Seidel (ST-LU-SGS) implicit scheme with multigrid acceleration is presented. In this scheme, the implicit temporal coupling term is split as one additional dimension of space in the LU-SGS sweeps. To improve numerical stability for periodic flows with high frequency, a modification to the ST-LU-SGS scheme is proposed. Numerical results show that fast convergence is achieved using large or even infinite Courant-Friedrichs-Lewy (CFL) numbers for unsteady flow problems with moderately high frequency and with the use of moderately high numbers of time intervals. The ST-LU-SGS implicit scheme is also found to work well in calculating periodic flow problems where the frequency is not known a priori and needed to be determined by using a combined Fourier analysis and gradient-based search algorithm.

  13. Use of the iterative solution method for coupled finite element and boundary element modeling

    International Nuclear Information System (INIS)

    Koteras, J.R.

    1993-07-01

    Tunnels buried deep within the earth constitute an important class geomechanics problems. Two numerical techniques used for the analysis of geomechanics problems, the finite element method and the boundary element method, have complementary characteristics for applications to problems of this type. The usefulness of combining these two methods for use as a geomechanics analysis tool has been recognized for some time, and a number of coupling techniques have been proposed. However, not all of them lend themselves to efficient computational implementations for large-scale problems. This report examines a coupling technique that can form the basis for an efficient analysis tool for large scale geomechanics problems through the use of an iterative equation solver

  14. A novel EMD selecting thresholding method based on multiple iteration for denoising LIDAR signal

    Science.gov (United States)

    Li, Meng; Jiang, Li-hui; Xiong, Xing-long

    2015-06-01

    Empirical mode decomposition (EMD) approach has been believed to be potentially useful for processing the nonlinear and non-stationary LIDAR signals. To shed further light on its performance, we proposed the EMD selecting thresholding method based on multiple iteration, which essentially acts as a development of EMD interval thresholding (EMD-IT), and randomly alters the samples of noisy parts of all the corrupted intrinsic mode functions to generate a better effect of iteration. Simulations on both synthetic signals and LIDAR signals from real world support this method.

  15. The Neutron-Gamma Pulse Shape Discrimination Method for Neutron Flux Detection in the ITER

    International Nuclear Information System (INIS)

    Xu Xiufeng; Li Shiping; Cao Hongrui; Yin Zejie; Yuan Guoliang; Yang Qingwei

    2013-01-01

    The neutron flux monitor (NFM), as a significant diagnostic system in the International Thermonuclear Experimental Reactor (ITER), will play an important role in the readings of a series of key parameters in the fusion reaction process. As the core of the main electronic system of the NFM, the neutron-gamma pulse shape discrimination (n-γ PSD) can distinguish the neutron pulse from the gamma pulse and other disturbing pulses according to the thresholds of the rising time and the amplitude pre-installed on the board, the double timing point CFD method is used to get the rising time of the pulse. The n-γ PSD can provide an accurate neutron count. (magnetically confined plasma)

  16. Parallel computing in plasma physics: Nonlinear instabilities

    International Nuclear Information System (INIS)

    Pohn, E.; Kamelander, G.; Shoucri, M.

    2000-01-01

    processors. Afterwards the υ x - and υ y -shifts also can be performed on the same slices in parallel without communication. Before performing theυ z -shifts communication is necessary. Each processor sends to each other processor a proper part of its υ z -slice to obtain y-slices. On these y-slices the υ z -shifts are performed, again in parallel. Then the y-slices are transformed back to υ z -slices. The nonlinear Poisson-equation is solved iteratively using a Newton-method. From the known potential the electric field was calculated using cubic-spline interpolation. No parallelization is necessary for these not time consuming calculations. In the spatial one-dimensional simulation 6000 time-steps were performed at a grid of N x x N υx x N υy x N υz =160 x 48 x 48 x 48 points. Up to 8 processors were used in parallel. The results show initially the development of a potential, oscillating with a frequency near the ion cyclotron frequency, which is followed by the evolution of equilibrium of the potential, showing a positive potential bump towards the plasma. The results show the importance of a fully-kinetic solution of the charge-separation at a plasma-edge and the important role played by the finite ion gyro-radius in establishing such equilibrium. (author)

  17. Eliminating graphs by means of parallel knock-out schemes

    NARCIS (Netherlands)

    Broersma, H.J.; Fomin, F.V.; Královic, R.; Woeginger, G.J.

    2007-01-01

    In 1997 Lampert and Slater introduced parallel knock-out schemes, an iterative process on graphs that goes through several rounds. In each round of this process, every vertex eliminates exactly one of its neighbors. The parallel knock-out number of a graph is the minimum number of rounds after which

  18. Eliminating graphs by means of parallel knock-out schemes

    NARCIS (Netherlands)

    Broersma, Haitze J.; Fomin, F.V.; Královič, R.; Woeginger, Gerhard

    In 1997 Lampert and Slater introduced parallel knock-out schemes, an iterative process on graphs that goes through several rounds. In each round of this process, every vertex eliminates exactly one of its neighbors. The parallel knock-out number of a graph is the minimum number of rounds after which

  19. Conceptual design Fusion Experimental Reactor (FER/ITER)

    International Nuclear Information System (INIS)

    Uehara, Kazuya; Nagashima, Takashi; Ikeda, Yoshitaka

    1991-11-01

    This report describes a conceptual design of Lower Hybrid Wave (LH) system for FER and ITER. In JAERI, the conceptual design of LH system for FER has been performed in these 3 years in parallel to that of ITER. There must be a common design part with ITER and FER. The physical requirement of LH system is the saving of volt · sec in the current start-up phase, and the current drive at the boundary region. The frequency of 5GHz is mainly chosen for avoidance of the α particle absorption and for the availability of electron tube development. Seventy-two klystrons (FER) and one hundred klystrons (ITER) are necessary to inject the 30 MW (FER) and 45-50 MW (ITER) rf power into plasma using 0.7 - 0.8 MW klystron per one tube. The launching system is the multi-junction type and the rf spectrum must be as sharp as possible with high directivity to improve the current drive efficiency. One port (FER) and two ports (ITER) are used and the injection direction is in horizontal, in which the analysis of the ray-tracing code and the better coupling of LH wave is considered. The transmission line is over-sized waveguide with low rf loss. (author)

  20. Comparison between the Variational Iteration Method and the Homotopy Perturbation Method for the Sturm-Liouville Differential Equation

    Directory of Open Access Journals (Sweden)

    R. Darzi

    2010-01-01

    Full Text Available We applied the variational iteration method and the homotopy perturbation method to solve Sturm-Liouville eigenvalue and boundary value problems. The main advantage of these methods is the flexibility to give approximate and exact solutions to both linear and nonlinear problems without linearization or discretization. The results show that both methods are simple and effective.

  1. Parareal in time 3D numerical solver for the LWR Benchmark neutron diffusion transient model

    Energy Technology Data Exchange (ETDEWEB)

    Baudron, Anne-Marie, E-mail: anne-marie.baudron@cea.fr [Laboratoire de Recherche Conventionné MANON, CEA/DEN/DANS/DM2S and UPMC-CNRS/LJLL (France); CEA-DRN/DMT/SERMA, CEN-Saclay, 91191 Gif sur Yvette Cedex (France); Lautard, Jean-Jacques, E-mail: jean-jacques.lautard@cea.fr [Laboratoire de Recherche Conventionné MANON, CEA/DEN/DANS/DM2S and UPMC-CNRS/LJLL (France); CEA-DRN/DMT/SERMA, CEN-Saclay, 91191 Gif sur Yvette Cedex (France); Maday, Yvon, E-mail: maday@ann.jussieu.fr [Sorbonne Universités, UPMC Univ Paris 06, UMR 7598, Laboratoire Jacques-Louis Lions and Institut Universitaire de France, F-75005, Paris (France); Laboratoire de Recherche Conventionné MANON, CEA/DEN/DANS/DM2S and UPMC-CNRS/LJLL (France); Brown Univ, Division of Applied Maths, Providence, RI (United States); Riahi, Mohamed Kamel, E-mail: riahi@cmap.polytechnique.fr [Laboratoire de Recherche Conventionné MANON, CEA/DEN/DANS/DM2S and UPMC-CNRS/LJLL (France); CMAP, Inria-Saclay and X-Ecole Polytechnique, Route de Saclay, 91128 Palaiseau Cedex (France); Salomon, Julien, E-mail: salomon@ceremade.dauphine.fr [CEREMADE, Univ Paris-Dauphine, Pl. du Mal. de Lattre de Tassigny, F-75016, Paris (France)

    2014-12-15

    In this paper we present a time-parallel algorithm for the 3D neutrons calculation of a transient model in a nuclear reactor core. The neutrons calculation consists in numerically solving the time dependent diffusion approximation equation, which is a simplified transport equation. The numerical resolution is done with finite elements method based on a tetrahedral meshing of the computational domain, representing the reactor core, and time discretization is achieved using a θ-scheme. The transient model presents moving control rods during the time of the reaction. Therefore, cross-sections (piecewise constants) are taken into account by interpolations with respect to the velocity of the control rods. The parallelism across the time is achieved by an adequate use of the parareal in time algorithm to the handled problem. This parallel method is a predictor corrector scheme that iteratively combines the use of two kinds of numerical propagators, one coarse and one fine. Our method is made efficient by means of a coarse solver defined with large time step and fixed position control rods model, while the fine propagator is assumed to be a high order numerical approximation of the full model. The parallel implementation of our method provides a good scalability of the algorithm. Numerical results show the efficiency of the parareal method on large light water reactor transient model corresponding to the Langenbuch–Maurer–Werner benchmark.

  2. Discrete fourier transform (DFT) analysis for applications using iterative transform methods

    Science.gov (United States)

    Dean, Bruce H. (Inventor)

    2012-01-01

    According to various embodiments, a method is provided for determining aberration data for an optical system. The method comprises collecting a data signal, and generating a pre-transformation algorithm. The data is pre-transformed by multiplying the data with the pre-transformation algorithm. A discrete Fourier transform of the pre-transformed data is performed in an iterative loop. The method further comprises back-transforming the data to generate aberration data.

  3. Iterative Methods for Solving Nonlinear Parabolic Problem in Pension Saving Management

    Science.gov (United States)

    Koleva, M. N.

    2011-11-01

    In this work we consider a nonlinear parabolic equation, obtained from Riccati like transformation of the Hamilton-Jacobi-Bellman equation, arising in pension saving management. We discuss two numerical iterative methods for solving the model problem—fully implicit Picard method and mixed Picard-Newton method, which preserves the parabolic characteristics of the differential problem. Numerical experiments for comparison the accuracy and effectiveness of the algorithms are discussed. Finally, observations are given.

  4. Finite element method for time-space-fractional Schrodinger equation

    Directory of Open Access Journals (Sweden)

    Xiaogang Zhu

    2017-07-01

    Full Text Available In this article, we develop a fully discrete finite element method for the nonlinear Schrodinger equation (NLS with time- and space-fractional derivatives. The time-fractional derivative is described in Caputo's sense and the space-fractional derivative in Riesz's sense. Its stability is well derived; the convergent estimate is discussed by an orthogonal operator. We also extend the method to the two-dimensional time-space-fractional NLS and to avoid the iterative solvers at each time step, a linearized scheme is further conducted. Several numerical examples are implemented finally, which confirm the theoretical results as well as illustrate the accuracy of our methods.

  5. An approximate block Newton method for coupled iterations of nonlinear solvers: Theory and conjugate heat transfer applications

    Science.gov (United States)

    Yeckel, Andrew; Lun, Lisa; Derby, Jeffrey J.

    2009-12-01

    A new, approximate block Newton (ABN) method is derived and tested for the coupled solution of nonlinear models, each of which is treated as a modular, black box. Such an approach is motivated by a desire to maintain software flexibility without sacrificing solution efficiency or robustness. Though block Newton methods of similar type have been proposed and studied, we present a unique derivation and use it to sort out some of the more confusing points in the literature. In particular, we show that our ABN method behaves like a Newton iteration preconditioned by an inexact Newton solver derived from subproblem Jacobians. The method is demonstrated on several conjugate heat transfer problems modeled after melt crystal growth processes. These problems are represented by partitioned spatial regions, each modeled by independent heat transfer codes and linked by temperature and flux matching conditions at the boundaries common to the partitions. Whereas a typical block Gauss-Seidel iteration fails about half the time for the model problem, quadratic convergence is achieved by the ABN method under all conditions studied here. Additional performance advantages over existing methods are demonstrated and discussed.

  6. Non-homogeneous updates for the iterative coordinate descent algorithm

    Science.gov (United States)

    Yu, Zhou; Thibault, Jean-Baptiste; Bouman, Charles A.; Sauer, Ken D.; Hsieh, Jiang

    2007-02-01

    Statistical reconstruction methods show great promise for improving resolution, and reducing noise and artifacts in helical X-ray CT. In fact, statistical reconstruction seems to be particularly valuable in maintaining reconstructed image quality when the dosage is low and the noise is therefore high. However, high computational cost and long reconstruction times remain as a barrier to the use of statistical reconstruction in practical applications. Among the various iterative methods that have been studied for statistical reconstruction, iterative coordinate descent (ICD) has been found to have relatively low overall computational requirements due to its fast convergence. This paper presents a novel method for further speeding the convergence of the ICD algorithm, and therefore reducing the overall reconstruction time for statistical reconstruction. The method, which we call nonhomogeneous iterative coordinate descent (NH-ICD) uses spatially non-homogeneous updates to speed convergence by focusing computation where it is most needed. Experimental results with real data indicate that the method speeds reconstruction by roughly a factor of two for typical 3D multi-slice geometries.

  7. Development of a first-principles code based on the screened KKR method for large super-cells

    International Nuclear Information System (INIS)

    Doi, S; Ogura, M; Akai, H

    2013-01-01

    The procedures of performing first-principles electronic structure calculation using the Korringa-Kohn-Rostoker (KKR) and the screened KKR methods are reviewed with an emphasis put on their numerical efficiency. It is shown that an iterative matrix inversion combined with a suitable preconditioning greatly improves the computational time of screened KKR method. The method is well parallelized and also has an O(N) scaling property

  8. A conjugate gradient method for solving the non-LTE line radiation transfer problem

    Science.gov (United States)

    Paletou, F.; Anterrieu, E.

    2009-12-01

    This study concerns the fast and accurate solution of the line radiation transfer problem, under non-LTE conditions. We propose and evaluate an alternative iterative scheme to the classical ALI-Jacobi method, and to the more recently proposed Gauss-Seidel and successive over-relaxation (GS/SOR) schemes. Our study is indeed based on applying a preconditioned bi-conjugate gradient method (BiCG-P). Standard tests, in 1D plane parallel geometry and in the frame of the two-level atom model with monochromatic scattering are discussed. Rates of convergence between the previously mentioned iterative schemes are compared, as are their respective timing properties. The smoothing capability of the BiCG-P method is also demonstrated.

  9. Design of a real-time wind turbine simulator using a custom parallel architecture

    Science.gov (United States)

    Hoffman, John A.; Gluck, R.; Sridhar, S.

    1995-01-01

    The design of a new parallel-processing digital simulator is described. The new simulator has been developed specifically for analysis of wind energy systems in real time. The new processor has been named: the Wind Energy System Time-domain simulator, version 3 (WEST-3). Like previous WEST versions, WEST-3 performs many computations in parallel. The modules in WEST-3 are pure digital processors, however. These digital processors can be programmed individually and operated in concert to achieve real-time simulation of wind turbine systems. Because of this programmability, WEST-3 is very much more flexible and general than its two predecessors. The design features of WEST-3 are described to show how the system produces high-speed solutions of nonlinear time-domain equations. WEST-3 has two very fast Computational Units (CU's) that use minicomputer technology plus special architectural features that make them many times faster than a microcomputer. These CU's are needed to perform the complex computations associated with the wind turbine rotor system in real time. The parallel architecture of the CU causes several tasks to be done in each cycle, including an IO operation and the combination of a multiply, add, and store. The WEST-3 simulator can be expanded at any time for additional computational power. This is possible because the CU's interfaced to each other and to other portions of the simulation using special serial buses. These buses can be 'patched' together in essentially any configuration (in a manner very similar to the programming methods used in analog computation) to balance the input/ output requirements. CU's can be added in any number to share a given computational load. This flexible bus feature is very different from many other parallel processors which usually have a throughput limit because of rigid bus architecture.

  10. Null geodesic deviation II. Conformally flat space--times

    International Nuclear Information System (INIS)

    Peters, P.C.

    1975-01-01

    The equation of geodesic deviation is solved in conformally flat space--time in a covariant manner. The solution is given as an integral equation for general geodesics. The solution is then used to evaluate second derivatives of the world function and derivatives of the parallel propagator, which need to be known in order to find the Green's function for wave equations in curved space--time. A method of null geodesic limits of two-point functions is discussed, and used to find the scalar Green's function as an iterative series

  11. Worst-case Analysis of Strategy Iteration and the Simplex Method

    DEFF Research Database (Denmark)

    Hansen, Thomas Dueholm

    In this dissertation we study strategy iteration (also known as policy iteration) algorithms for solving Markov decision processes (MDPs) and two-player turn-based stochastic games (2TBSGs). MDPs provide a mathematical model for sequential decision making under uncertainty. They are widely used...... to model stochastic optimization problems in various areas ranging from operations research, machine learning, artificial intelligence, economics and game theory. The class of two-player turn-based stochastic games is a natural generalization of Markov decision processes that is obtained by introducing...... in the size of the problem (the bounds have subexponential form). Utilizing a tight connection between MDPs and linear programming, it is shown that the same bounds apply to the corresponding pivoting rules for the simplex method for solving linear programs. Prior to this result no super-polynomial lower...

  12. Near real-time digital holographic microscope based on GPU parallel computing

    Science.gov (United States)

    Zhu, Gang; Zhao, Zhixiong; Wang, Huarui; Yang, Yan

    2018-01-01

    A transmission near real-time digital holographic microscope with in-line and off-axis light path is presented, in which the parallel computing technology based on compute unified device architecture (CUDA) and digital holographic microscopy are combined. Compared to other holographic microscopes, which have to implement reconstruction in multiple focal planes and are time-consuming the reconstruction speed of the near real-time digital holographic microscope can be greatly improved with the parallel computing technology based on CUDA, so it is especially suitable for measurements of particle field in micrometer and nanometer scale. Simulations and experiments show that the proposed transmission digital holographic microscope can accurately measure and display the velocity of particle field in micrometer scale, and the average velocity error is lower than 10%.With the graphic processing units(GPU), the computing time of the 100 reconstruction planes(512×512 grids) is lower than 120ms, while it is 4.9s using traditional reconstruction method by CPU. The reconstruction speed has been raised by 40 times. In other words, it can handle holograms at 8.3 frames per second and the near real-time measurement and display of particle velocity field are realized. The real-time three-dimensional reconstruction of particle velocity field is expected to achieve by further optimization of software and hardware. Keywords: digital holographic microscope,

  13. Fisher's method of scoring in statistical image reconstruction: comparison of Jacobi and Gauss-Seidel iterative schemes.

    Science.gov (United States)

    Hudson, H M; Ma, J; Green, P

    1994-01-01

    Many algorithms for medical image reconstruction adopt versions of the expectation-maximization (EM) algorithm. In this approach, parameter estimates are obtained which maximize a complete data likelihood or penalized likelihood, in each iteration. Implicitly (and sometimes explicitly) penalized algorithms require smoothing of the current reconstruction in the image domain as part of their iteration scheme. In this paper, we discuss alternatives to EM which adapt Fisher's method of scoring (FS) and other methods for direct maximization of the incomplete data likelihood. Jacobi and Gauss-Seidel methods for non-linear optimization provide efficient algorithms applying FS in tomography. One approach uses smoothed projection data in its iterations. We investigate the convergence of Jacobi and Gauss-Seidel algorithms with clinical tomographic projection data.

  14. System engineering and configuration management in ITER

    International Nuclear Information System (INIS)

    Chiocchio, S.; Martin, E.; Barabaschi, P.; Bartels, Hans Werner; How, J.; Spears, W.

    2007-01-01

    The construction of ITER will represent a major challenge for the fusion community at large, because of the intrinsic complexity of the tokamak design, the large number of different systems which are all essential for its operation, the worldwide distribution of the design activities and the unusual procurement scheme based on a combination of in-kind and directly funded deliverables. A key requirement for the success of such a large project is that a systematic approach to ensure the consistency of the design with the required performance is adopted. Also, effective project management methods, tools and working practices must be deployed to facilitate the communication and collaboration among the institutions and industries involved in the project. The authors have been involved in the definition and practical implementation of the design integration and configuration control structure inside ITER and in the system engineering process during the selection and optimization of the machine configuration. In parallel, they have assessed design, drawing and documentation management software to be used for the construction phase. Here, they describe the experience gained in recent years, explain the drivers behind the selection of the documents and drawings management systems, and illustrate the scope and issues of the configuration management activities to ensure the congruence of the design, to control and track the design changes and to manage the interfaces among the ITER systems

  15. He's variational iteration method applied to the solution of the prey and predator problem with variable coefficients

    International Nuclear Information System (INIS)

    Yusufoglu, Elcin; Erbas, Baris

    2008-01-01

    In this Letter, a mathematical model of the problem of prey and predator is presented and He's variational iteration method is employed to compute an approximation to the solution of the system of nonlinear differential equations governing the problem. The results are compared with the results obtained by Adomian decomposition method and homotopy perturbation method. Comparison of the methods show that He's variational iteration method is a powerful method for obtaining approximate solutions to nonlinear equations and their systems

  16. A Sequential Kriging reliability analysis method with characteristics of adaptive sampling regions and parallelizability

    International Nuclear Information System (INIS)

    Wen, Zhixun; Pei, Haiqing; Liu, Hai; Yue, Zhufeng

    2016-01-01

    The sequential Kriging reliability analysis (SKRA) method has been developed in recent years for nonlinear implicit response functions which are expensive to evaluate. This type of method includes EGRA: the efficient reliability analysis method, and AK-MCS: the active learning reliability method combining Kriging model and Monte Carlo simulation. The purpose of this paper is to improve SKRA by adaptive sampling regions and parallelizability. The adaptive sampling regions strategy is proposed to avoid selecting samples in regions where the probability density is so low that the accuracy of these regions has negligible effects on the results. The size of the sampling regions is adapted according to the failure probability calculated by last iteration. Two parallel strategies are introduced and compared, aimed at selecting multiple sample points at a time. The improvement is verified through several troublesome examples. - Highlights: • The ISKRA method improves the efficiency of SKRA. • Adaptive sampling regions strategy reduces the number of needed samples. • The two parallel strategies reduce the number of needed iterations. • The accuracy of the optimal value impacts the number of samples significantly.

  17. A sparsity-regularized Born iterative method for reconstruction of two-dimensional piecewise continuous inhomogeneous domains

    KAUST Repository

    Sandhu, Ali Imran

    2016-04-10

    A sparsity-regularized Born iterative method (BIM) is proposed for efficiently reconstructing two-dimensional piecewise-continuous inhomogeneous dielectric profiles. Such profiles are typically not spatially sparse, which reduces the efficiency of the sparsity-promoting regularization. To overcome this problem, scattered fields are represented in terms of the spatial derivative of the dielectric profile and reconstruction is carried out over samples of the dielectric profile\\'s derivative. Then, like the conventional BIM, the nonlinear problem is iteratively converted into a sequence of linear problems (in derivative samples) and sparsity constraint is enforced on each linear problem using the thresholded Landweber iterations. Numerical results, which demonstrate the efficiency and accuracy of the proposed method in reconstructing piecewise-continuous dielectric profiles, are presented.

  18. Feynman’s clock, a new variational principle, and parallel-in-time quantum dynamics

    Science.gov (United States)

    McClean, Jarrod R.; Parkhill, John A.; Aspuru-Guzik, Alán

    2013-01-01

    We introduce a discrete-time variational principle inspired by the quantum clock originally proposed by Feynman and use it to write down quantum evolution as a ground-state eigenvalue problem. The construction allows one to apply ground-state quantum many-body theory to quantum dynamics, extending the reach of many highly developed tools from this fertile research area. Moreover, this formalism naturally leads to an algorithm to parallelize quantum simulation over time. We draw an explicit connection between previously known time-dependent variational principles and the time-embedded variational principle presented. Sample calculations are presented, applying the idea to a hydrogen molecule and the spin degrees of freedom of a model inorganic compound, demonstrating the parallel speedup of our method as well as its flexibility in applying ground-state methodologies. Finally, we take advantage of the unique perspective of this variational principle to examine the error of basis approximations in quantum dynamics. PMID:24062428

  19. Research of the effectiveness of parallel multithreaded realizations of interpolation methods for scaling raster images

    Science.gov (United States)

    Vnukov, A. A.; Shershnev, M. B.

    2018-01-01

    The aim of this work is the software implementation of three image scaling algorithms using parallel computations, as well as the development of an application with a graphical user interface for the Windows operating system to demonstrate the operation of algorithms and to study the relationship between system performance, algorithm execution time and the degree of parallelization of computations. Three methods of interpolation were studied, formalized and adapted to scale images. The result of the work is a program for scaling images by different methods. Comparison of the quality of scaling by different methods is given.

  20. Predictive Variable Gain Iterative Learning Control for PMSM

    Directory of Open Access Journals (Sweden)

    Huimin Xu

    2015-01-01

    Full Text Available A predictive variable gain strategy in iterative learning control (ILC is introduced. Predictive variable gain iterative learning control is constructed to improve the performance of trajectory tracking. A scheme based on predictive variable gain iterative learning control for eliminating undesirable vibrations of PMSM system is proposed. The basic idea is that undesirable vibrations of PMSM system are eliminated from two aspects of iterative domain and time domain. The predictive method is utilized to determine the learning gain in the ILC algorithm. Compression mapping principle is used to prove the convergence of the algorithm. Simulation results demonstrate that the predictive variable gain is superior to constant gain and other variable gains.