parallel iterative programs: Topics by WorldWideScience.org

Sample records for parallel iterative programs

Parallel S_n iteration schemes

International Nuclear Information System (INIS)

Wienke, B.R.; Hiromoto, R.E.

1986-01-01

The iterative, multigroup, discrete ordinates (S_n) technique for solving the linear transport equation enjoys widespread usage and appeal. Serial iteration schemes and numerical algorithms developed over the years provide a timely framework for parallel extension. On the Denelcor HEP, the authors investigate three parallel iteration schemes for solving the one-dimensional S_n transport equation. The multigroup representation and serial iteration methods are also reviewed. This analysis represents a first attempt to extend serial S_n algorithms to parallel environments and provides good baseline estimates on ease of parallel implementation, relative algorithm efficiency, comparative speedup, and some future directions. The authors examine ordered and chaotic versions of these strategies, with and without concurrent rebalance and diffusion acceleration. Two strategies efficiently support high degrees of parallelization and appear to be robust parallel iteration techniques. The third strategy is a weaker parallel algorithm. Chaotic iteration, difficult to simulate on serial machines, holds promise and converges faster than ordered versions of the schemes. Actual parallel speedup and efficiency are high and payoff appears substantial.
Parallelization of the model-based iterative reconstruction algorithm DIRA

International Nuclear Information System (INIS)

Oertenberg, A.; Sandborg, M.; Alm Carlsson, G.; Malusek, A.; Magnusson, M.

2016-01-01

New paradigms for parallel programming have been devised to simplify software development on multi-core processors and many-core graphical processing units (GPU). Despite their obvious benefits, the parallelization of existing computer programs is not an easy task. In this work, the use of the Open Multiprocessing (OpenMP) and Open Computing Language (OpenCL) frameworks is considered for the parallelization of the model-based iterative reconstruction algorithm DIRA, with the aim of significantly shortening the code's execution time. Selected routines were parallelized using OpenMP and OpenCL libraries; some routines were converted from MATLAB to C and optimised. Parallelization of the code with OpenMP was easy and resulted in an overall speedup of 15 on a 16-core computer. Parallelization with OpenCL was more difficult owing to differences between the central processing unit and GPU architectures. The resulting speedup was substantially lower than the theoretical peak performance of the GPU; the cause was explained. (authors)
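
The reported OpenMP result (a speedup of 15 on 16 cores) can be put in context with a short calculation. The sketch below assumes Amdahl's law as the performance model, which the abstract itself does not invoke, and the function names are illustrative only.

```python
def parallel_efficiency(speedup: float, cores: int) -> float:
    """Fraction of ideal linear speedup that was actually achieved."""
    return speedup / cores


def amdahl_serial_fraction(speedup: float, cores: int) -> float:
    """Serial fraction f implied by Amdahl's law, S = 1 / (f + (1 - f) / p).

    Solving for f gives f = (1/S - 1/p) / (1 - 1/p).
    """
    return (1.0 / speedup - 1.0 / cores) / (1.0 - 1.0 / cores)


if __name__ == "__main__":
    s, p = 15.0, 16  # values quoted in the abstract
    print(f"efficiency      = {parallel_efficiency(s, p):.4f}")
    print(f"serial fraction = {amdahl_serial_fraction(s, p):.4f}")
```

Under this assumed model, the quoted numbers correspond to an efficiency of about 0.94 and a serial fraction below half a percent.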
Iterating skeletons

DEFF Research Database (Denmark)

Dieterle, Mischa; Horstmeyer, Thomas; Berthold, Jost

2012-01-01

Skeleton-based programming is an area of increasing relevance with upcoming highly parallel hardware, since it substantially facilitates parallel programming and separates concerns. When parallel algorithms expressed by skeletons involve iterations – applying the same algorithm repeatedly... a particular skeleton ad-hoc for repeated execution turns out to be considerably complicated, and raises general questions about introducing state into a stateless parallel computation. In addition, one would strongly prefer an approach which leaves the original skeleton intact, and only uses it as a building block inside a bigger structure. In this work, we present a general framework for skeleton iteration and discuss requirements and variations of iteration control and iteration body. Skeleton iteration is expressed by synchronising a parallel iteration body skeleton with a (likewise parallel) state...
Parallel computation of multigroup reactivity coefficient using iterative method

Science.gov (United States)

Susmikanti, Mike; Dewayatna, Winter

2013-09-01

One of the research activities supporting the commercial radioisotope production program is safety research on the irradiation of FPM (Fission Product Molybdenum) targets. An FPM target takes the form of a stainless steel tube in which nuclear-grade high-enriched uranium is superimposed. The FPM tube is irradiated to obtain fission products, which are widely used in the form of kits in nuclear medicine. Irradiating an FPM tube in the reactor core can interfere with core performance. One such disturbance comes from changes in flux or reactivity. It is therefore necessary to study a method for calculating core safety under ongoing configuration changes during the life of the reactor, and making the code faster became an absolute necessity. The neutron safety margin for the research reactor can be reused without modification in the calculation of the reactor's reactivity, which is an advantage of the perturbation method. The criticality and flux in the multigroup diffusion model were calculated at various irradiation positions for several uranium contents. This model is computationally complex. Several parallel algorithms based on iterative methods have been developed for solving large sparse matrix systems. The red-black Gauss-Seidel iteration and the parallel power iteration method can be used to solve the multigroup diffusion equation system and to calculate the criticality and reactivity coefficient. This research developed a code for reactivity calculation, as one part of safety analysis, using parallel processing. The calculation can be done more quickly and efficiently by exploiting parallel processing on a multicore computer. The code was applied to the calculation of safety limits for irradiated FPM targets with increasing uranium content.
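
To illustrate why the red-black ordering named in this abstract suits parallel hardware, here is a minimal sketch on a much simpler model problem than the multigroup diffusion system: the 1D Poisson equation -u'' = f with zero boundary values. The model problem, the function name and its parameters are assumptions made for illustration and are not taken from the paper.

```python
def red_black_gauss_seidel(f, n, sweeps):
    """Approximate -u'' = f on (0, 1), u(0) = u(1) = 0, on n interior points.

    Grid points are coloured by index parity. A point of one colour depends
    only on its two neighbours of the other colour, so all points within a
    half-sweep are independent and could be updated in parallel. Here each
    half-sweep is a plain serial loop.
    """
    h = 1.0 / (n + 1)
    u = [0.0] * (n + 2)  # index 0 and n + 1 hold the boundary values
    rhs = [0.0] + [f(i * h) * h * h for i in range(1, n + 1)] + [0.0]
    for _ in range(sweeps):
        for start in (1, 2):  # odd-indexed points first, then even-indexed
            for i in range(start, n + 1, 2):
                u[i] = 0.5 * (u[i - 1] + u[i + 1] + rhs[i])
    return u


if __name__ == "__main__":
    # With f = 2 the exact solution is u(x) = x (1 - x).
    u = red_black_gauss_seidel(lambda x: 2.0, n=15, sweeps=2000)
    print(u[8])  # value at x = 0.5
```

In a threaded or GPU version, the inner loop over one colour would be the unit of parallel work, with a synchronisation point between the two colours.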
Sparse BLIP: BLind Iterative Parallel imaging reconstruction using compressed sensing.

Science.gov (United States)

She, Huajun; Chen, Rong-Rong; Liang, Dong; DiBella, Edward V R; Ying, Leslie

2014-02-01

To develop a sensitivity-based parallel imaging reconstruction method to reconstruct iteratively both the coil sensitivities and MR image simultaneously based on their prior information. Parallel magnetic resonance imaging reconstruction problem can be formulated as a multichannel sampling problem where solutions are sought analytically. However, the channel functions given by the coil sensitivities in parallel imaging are not known exactly and the estimation error usually leads to artifacts. In this study, we propose a new reconstruction algorithm, termed Sparse BLind Iterative Parallel, for blind iterative parallel imaging reconstruction using compressed sensing. The proposed algorithm reconstructs both the sensitivity functions and the image simultaneously from undersampled data. It enforces the sparseness constraint in the image as done in compressed sensing, but is different from compressed sensing in that the sensing matrix is unknown and additional constraint is enforced on the sensitivities as well. Both phantom and in vivo imaging experiments were carried out with retrospective undersampling to evaluate the performance of the proposed method. Experiments show improvement in Sparse BLind Iterative Parallel reconstruction when compared with Sparse SENSE, JSENSE, IRGN-TV, and L1-SPIRiT reconstructions with the same number of measurements. The proposed Sparse BLind Iterative Parallel algorithm reduces the reconstruction errors when compared to the state-of-the-art parallel imaging methods. Copyright © 2013 Wiley Periodicals, Inc.
Communications oriented programming of parallel iterative solutions of sparse linear systems

Science.gov (United States)

Patrick, M. L.; Pratt, T. W.

1986-01-01

Parallel algorithms are developed for a class of scientific computational problems by partitioning the problems into smaller problems which may be solved concurrently. The effectiveness of the resulting parallel solutions is determined by the amount and frequency of communication and synchronization and the extent to which communication can be overlapped with computation. Three different parallel algorithms for solving the same class of problems are presented, and their effectiveness is analyzed from this point of view. The algorithms are programmed using a new programming environment. Run-time statistics and experience obtained from the execution of these programs assist in measuring the effectiveness of these algorithms.
Adapting high-level language programs for parallel processing using data flow

Science.gov (United States)

Standley, Hilda M.

1988-01-01

EASY-FLOW, a very high-level data flow language, is introduced for the purpose of adapting programs written in a conventional high-level language to a parallel environment. The level of parallelism provided is of the large-grained variety in which parallel activities take place between subprograms or processes. A program written in EASY-FLOW is a set of subprogram calls as units, structured by iteration, branching, and distribution constructs. A data flow graph may be deduced from an EASY-FLOW program.
Accuracy analysis of hybrid parallel robot for the assembling of ITER

Energy Technology Data Exchange (ETDEWEB)

Wang Yongbo [Institute of Mechatronics and Virtual Engineering, Lappeenranta University of Technology, Skinnarilankatu 34, 53850 Lappeenranta (Finland); The State Key Laboratory of Mechanical Transmission, Chongqing University (China); Pessi, Pekka [Institute of Mechatronics and Virtual Engineering, Lappeenranta University of Technology, Skinnarilankatu 34, 53850 Lappeenranta (Finland); Wu Huapeng [Institute of Mechatronics and Virtual Engineering, Lappeenranta University of Technology, Skinnarilankatu 34, 53850 Lappeenranta (Finland)], E-mail: huapeng@lut.fi; Handroos, Heikki [Institute of Mechatronics and Virtual Engineering, Lappeenranta University of Technology, Skinnarilankatu 34, 53850 Lappeenranta (Finland)

2009-06-15

This paper presents a novel mobile parallel robot, which is able to carry out welding and machining processes from inside the international thermonuclear experimental reactor (ITER) vacuum vessel (VV). The kinematics design of the robot has been optimized for ITER access. To improve the accuracy of the parallel robot, the errors caused by the stiffness and manufacturing process have to be compensated or limited to a minimum value. In this paper kinematics errors and stiffness modeling are given. The simulation results are presented.
Accuracy analysis of hybrid parallel robot for the assembling of ITER

International Nuclear Information System (INIS)

Wang Yongbo; Pessi, Pekka; Wu Huapeng; Handroos, Heikki

2009-01-01

This paper presents a novel mobile parallel robot, which is able to carry out welding and machining processes from inside the international thermonuclear experimental reactor (ITER) vacuum vessel (VV). The kinematics design of the robot has been optimized for ITER access. To improve the accuracy of the parallel robot, the errors caused by the stiffness and manufacturing process have to be compensated or limited to a minimum value. In this paper kinematics errors and stiffness modeling are given. The simulation results are presented.
Iterative algorithms for large sparse linear systems on parallel computers

Science.gov (United States)

Adams, L. M.

1982-01-01

Algorithms are developed for assembling in parallel the sparse systems of linear equations that result from finite difference or finite element discretizations of elliptic partial differential equations, such as those that arise in structural engineering. Parallel linear stationary iterative algorithms and parallel preconditioned conjugate gradient algorithms are developed for solving these systems. In addition, a model for comparing parallel algorithms on array architectures is developed and results of this model for the algorithms are given.
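
As a point of reference for the preconditioned conjugate gradient methods mentioned here, the following is a minimal serial sketch. The abstract does not say which preconditioner was used; the Jacobi (diagonal) preconditioner below is an assumption chosen for simplicity, and the helper names are hypothetical.

```python
def matvec(a, x):
    """Dense matrix-vector product; rows are independent of each other."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]


def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))


def jacobi_pcg(a, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive definite A with Jacobi-preconditioned CG."""
    n = len(b)
    inv_diag = [1.0 / a[i][i] for i in range(n)]
    x = [0.0] * n
    r = list(b)  # residual b - A x for the zero initial guess
    z = [d * ri for d, ri in zip(inv_diag, r)]  # preconditioned residual
    p = list(z)
    rz = dot(r, z)
    for _ in range(max_iter):
        if dot(r, r) ** 0.5 < tol:
            break
        ap = matvec(a, p)
        alpha = rz / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        z = [d * ri for d, ri in zip(inv_diag, r)]
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x


if __name__ == "__main__":
    a = [[4.0, -1.0, 0.0], [-1.0, 4.0, -1.0], [0.0, -1.0, 4.0]]
    b = [2.0, 4.0, 10.0]  # equals A times [1, 2, 3]
    print(jacobi_pcg(a, b))
```

The matrix-vector product, the vector updates and the diagonal scaling are all elementwise or row-wise operations, which is what makes this family of methods attractive on parallel machines; the dot products are the steps that require a global reduction.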
Parallel iterative decoding of transform domain Wyner-Ziv video using cross bitplane correlation

DEFF Research Database (Denmark)

Luong, Huynh Van; Huang, Xin; Forchhammer, Søren

2011-01-01

In recent years, Transform Domain Wyner-Ziv (TDWZ) video coding has been proposed as an efficient Distributed Video Coding (DVC) solution, which fully or partly exploits the source statistics at the decoder to reduce the computational burden at the encoder. In this paper, a parallel iterative LDPC decoding scheme is proposed to improve the coding efficiency of TDWZ video codecs. The proposed parallel iterative LDPC decoding scheme is able to utilize cross bitplane correlation during decoding, by iteratively refining the soft-input, updating a modeled noise distribution and thereafter enhancing...
An efficient parallel algorithm: Poststack and prestack Kirchhoff 3D depth migration using flexi-depth iterations

Science.gov (United States)

Rastogi, Richa; Srivastava, Abhishek; Khonde, Kiran; Sirasala, Kirannmayi M.; Londhe, Ashutosh; Chavhan, Hitesh

2015-07-01

This paper presents an efficient parallel 3D Kirchhoff depth migration algorithm suitable for the current class of multicore architectures. The fundamental Kirchhoff depth migration algorithm exhibits inherent parallelism; however, for 3D data migration, the resource requirements of the algorithm grow with the data size. This challenges its practical implementation even on current-generation high performance computing systems. A smart parallelization approach is therefore essential to handle 3D data for migration. The most compute-intensive part of the Kirchhoff depth migration algorithm is the calculation of traveltime tables, owing to its resource requirements such as memory/storage and I/O. In the current research work, we target this area and develop a competent parallel algorithm for poststack and prestack 3D Kirchhoff depth migration, using hybrid MPI+OpenMP programming techniques. We introduce the concept of flexi-depth iterations while depth migrating data in parallel imaging space, using optimized traveltime table computations. This concept provides flexibility to the algorithm by migrating data in a number of depth iterations that depends on the available node memory and the size of the data to be migrated at runtime. Furthermore, it minimizes the requirements for storage, I/O and inter-node communication, making it advantageous over conventional parallelization approaches. The developed parallel algorithm is demonstrated and analysed on Yuva II, a supercomputer of the PARAM series. Optimization, performance and scalability results, along with the migration outcome, show the effectiveness of the parallel algorithm.
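
The idea of letting the number of depth iterations follow from node memory and data size can be sketched as below. This is only an illustration of that general idea under assumed inputs; the function name, its parameters and the slab-splitting rule are hypothetical and are not the authors' algorithm.

```python
import math


def depth_slabs(n_depth, bytes_per_depth_level, node_memory_bytes):
    """Split n_depth levels into contiguous slabs that each fit in node memory.

    Returns a list of half-open (start, stop) index ranges, one per depth
    iteration. All sizes are assumed inputs supplied by the caller.
    """
    levels_per_slab = node_memory_bytes // bytes_per_depth_level
    if levels_per_slab < 1:
        raise ValueError("a single depth level does not fit in node memory")
    n_iter = math.ceil(n_depth / levels_per_slab)
    return [
        (k * levels_per_slab, min((k + 1) * levels_per_slab, n_depth))
        for k in range(n_iter)
    ]


if __name__ == "__main__":
    # 10 depth levels of 100 bytes each with 350 bytes available per node
    print(depth_slabs(10, 100, 350))
```

With more memory per node the same call returns fewer, thicker slabs, so the iteration count adapts at runtime rather than being fixed in advance.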
Cell verification of parallel burnup calculation program MCBMPI based on MPI

International Nuclear Information System (INIS)

Yang Wankui; Liu Yaoguang; Ma Jimin; Wang Guanbo; Yang Xin; She Ding

2014-01-01

The parallel burnup calculation program MCBMPI was developed with a modular design. The parallel MCNP5 program MCNP5MPI was employed as the neutron transport calculation module, and a composite of three solution methods was used to solve the burnup equation: the matrix exponential technique, the TTA analytical solution, and Gauss-Seidel iteration. An MPI parallel zone decomposition strategy was included in the program. The program system consists only of MCNP5MPI and a burnup subroutine. The latter performs three main functions: zone decomposition, nuclide transfer and decay, and data exchange with MCNP5MPI. The program was verified with the pressurized water reactor (PWR) cell burnup benchmark. The results show that the program can be applied to burnup calculations of multiple zones, and that computation efficiency could be significantly improved with the development of computer hardware. (authors)
Design of parallel intersector weld/cut robot for machining processes in ITER vacuum vessel

International Nuclear Information System (INIS)

Wu Huapeng; Handroos, Heikki; Kovanen, Janne; Rouvinen, Asko; Hannukainen, Petri; Saira, Tanja; Jones, Lawrence

2003-01-01

This paper presents a new parallel robot, Penta-WH, which has five degrees of freedom driven by hydraulic cylinders. The manipulator has a large, singularity-free workspace and high stiffness, and it acts as a transport device for welding, machining and inspection end-effectors inside the ITER vacuum vessel. The presented kinematic structure of a parallel robot is particularly suitable for the ITER environment. Analyses of the machining process for ITER, such as the machining methods and forces, are given, and the kinematic analyses, such as workspace and force capacity, are discussed.
Parallel iterative solution of the Hermite Collocation equations on GPUs II

International Nuclear Information System (INIS)

Vilanakis, N; Mathioudakis, E

2014-01-01

Hermite Collocation is a high order finite element method for Boundary Value Problems modelling applications in several fields of science and engineering. Application of this integration free numerical solver for the solution of linear BVPs results in a large and sparse general system of algebraic equations, suggesting the usage of an efficient iterative solver especially for realistic simulations. In part I of this work an efficient parallel algorithm of the Schur complement method coupled with Bi-Conjugate Gradient Stabilized (BiCGSTAB) iterative solver has been designed for multicore computing architectures with a Graphics Processing Unit (GPU). In the present work the proposed algorithm has been extended for high performance computing environments consisting of multiprocessor machines with multiple GPUs. Since this is a distributed GPU and shared CPU memory parallel architecture, a hybrid memory treatment is needed for the development of the parallel algorithm. The realization of the algorithm took place on a multiprocessor machine HP SL390 with Tesla M2070 GPUs using the OpenMP and OpenACC standards. Execution time measurements reveal the efficiency of the parallel implementation
AZTEC: A parallel iterative package for the solving linear systems

Energy Technology Data Exchange (ETDEWEB)

Hutchinson, S.A.; Shadid, J.N.; Tuminaro, R.S. [Sandia National Labs., Albuquerque, NM (United States)

1996-12-31

We describe a parallel linear system package, AZTEC. The package incorporates a number of parallel iterative methods (e.g. GMRES, biCGSTAB, CGS, TFQMR) and preconditioners (e.g. Jacobi, Gauss-Seidel, polynomial, domain decomposition with LU or ILU within subdomains). Additionally, AZTEC allows for the reuse of previous preconditioning factorizations within Newton schemes for nonlinear methods. Currently, a number of different users are using this package to solve a variety of PDE applications.
P-SPARSLIB: A parallel sparse iterative solution package

Energy Technology Data Exchange (ETDEWEB)

Saad, Y. [Univ. of Minnesota, Minneapolis, MN (United States)

1994-12-31

Iterative methods are gaining popularity in engineering and sciences at a time where the computational environment is changing rapidly. P-SPARSLIB is a project to build a software library for sparse matrix computations on parallel computers. The emphasis is on iterative methods and the use of distributed sparse matrices, an extension of the domain decomposition approach to general sparse matrices. One of the goals of this project is to develop a software package geared towards specific applications. For example, the author will test the performance and usefulness of P-SPARSLIB modules on linear systems arising from CFD applications. Equally important is the goal of portability. In the long run, the author wishes to ensure that this package is portable on a variety of platforms, including SIMD environments and shared memory environments.
PARALLEL ITERATIVE RECONSTRUCTION OF PHANTOM CATPHAN ON EXPERIMENTAL DATA

Directory of Open Access Journals (Sweden)

M. A. Mirzavand

2016-01-01

The principles of fast parallel iterative algorithms based on graphics accelerators and the OpenGL library are considered in the paper. The proposed approach simultaneously minimizes the residual of the desired solution and the total variation of the reconstructed three-dimensional image. The number of required input data, i.e. conical X-ray projections, can be reduced several times, which makes it possible to reduce the patient's radiation exposure by a corresponding factor while maintaining the necessary contrast and spatial resolution of the three-dimensional image of the patient. The heuristic iterative algorithm can be used as an alternative to the well-known three-dimensional Feldkamp algorithm.
A mobile robot with parallel kinematics to meet the requirements for assembling and machining the ITER vacuum vessel

Energy Technology Data Exchange (ETDEWEB)

Pessi, Pekka [Lappeenranta University of Technology, Lappeenranta (Finland)], E-mail: pessi@lut.fi; Wu, Huapeng; Handroos, Heikki [Lappeenranta University of Technology, Lappeenranta (Finland); Jones, Lawrence [EFDA Close Support Unit, Boltzmannstrasse 2, Garching D-85748 (Germany)

2007-10-15

The present paper introduces a mobile parallel robot developed for International Thermonuclear Experimental Reactor (ITER). The task of the robot is to carry out welding and machining processes inside the ITER vacuum vessel. The kinematic design of the robot has been optimized for the ITER access. The kinematic analysis is given in the paper. A virtual prototype of the parallel robot is built. A dynamic behavior of the whole robot is studied by the multi-body system simulation (MBS)
A mobile robot with parallel kinematics to meet the requirements for assembling and machining the ITER vacuum vessel

International Nuclear Information System (INIS)

Pessi, Pekka; Wu, Huapeng; Handroos, Heikki; Jones, Lawrence

2007-01-01

The present paper introduces a mobile parallel robot developed for International Thermonuclear Experimental Reactor (ITER). The task of the robot is to carry out welding and machining processes inside the ITER vacuum vessel. The kinematic design of the robot has been optimized for the ITER access. The kinematic analysis is given in the paper. A virtual prototype of the parallel robot is built. A dynamic behavior of the whole robot is studied by the multi-body system simulation (MBS)

Parallelizing flow-accumulation calculations on graphics processing units—From iterative DEM preprocessing algorithm to recursive multiple-flow-direction algorithm

Science.gov (United States)

Qin, Cheng-Zhi; Zhan, Lijun

2012-06-01

As one of the important tasks in digital terrain analysis, the calculation of flow accumulations from gridded digital elevation models (DEMs) usually involves two steps in a real application: (1) using an iterative DEM preprocessing algorithm to remove the depressions and flat areas commonly contained in real DEMs, and (2) using a recursive flow-direction algorithm to calculate the flow accumulation for every cell in the DEM. Because both algorithms are computationally intensive, quick calculation of the flow accumulations from a DEM (especially for a large area) presents a practical challenge to personal computer (PC) users. In recent years, rapid increases in hardware capacity of the graphics processing units (GPUs) provided in modern PCs have made it possible to meet this challenge in a PC environment. Parallel computing on GPUs using a compute-unified-device-architecture (CUDA) programming model has been explored to speed up the execution of the single-flow-direction algorithm (SFD). However, the parallel implementation on a GPU of the multiple-flow-direction (MFD) algorithm, which generally performs better than the SFD algorithm, has not been reported. Moreover, GPU-based parallelization of the DEM preprocessing step in the flow-accumulation calculations has not been addressed. This paper proposes a parallel approach to calculate flow accumulations (including both iterative DEM preprocessing and a recursive MFD algorithm) on a CUDA-compatible GPU. For the parallelization of an MFD algorithm (MFD-md), two different parallelization strategies using a GPU are explored. The first parallelization strategy, which has been used in the existing parallel SFD algorithm on GPU, has the problem of computing redundancy. Therefore, we designed a parallelization strategy based on graph theory. The application results show that the proposed parallel approach to calculate flow accumulations on a GPU performs much faster than either sequential algorithms or other parallel GPU
Parallel Programming with Intel Parallel Studio XE

CERN Document Server

Blair-Chappell, Stephen

2012-01-01

Optimize code for multi-core processors with Intel's Parallel Studio. Parallel programming is rapidly becoming a "must-know" skill for developers. Yet, where to start? This teach-yourself tutorial is an ideal starting point for developers who already know Windows C and C++ and are eager to add parallelism to their code. With a focus on applying tools, techniques, and language extensions to implement parallelism, this essential resource teaches you how to write programs for multicore and leverage the power of multicore in your programs. Sharing hands-on case studies and real-world examples, the
Introduction to parallel programming

CERN Document Server

Brawer, Steven

1989-01-01

Introduction to Parallel Programming focuses on the techniques, processes, methodologies, and approaches involved in parallel programming. The book first offers information on Fortran, hardware and operating system models, and processes, shared memory, and simple parallel programs. Discussions focus on processes and processors, joining processes, shared memory, time-sharing with multiple processors, hardware, loops, passing arguments in function/subroutine calls, program structure, and arithmetic expressions. The text then elaborates on basic parallel programming techniques, barriers and race
Colorado Conference on iterative methods. Volume 1

Energy Technology Data Exchange (ETDEWEB)

NONE

1994-12-31

The conference provided a forum on many aspects of iterative methods. Volume I session topics were: domain decomposition, nonlinear problems, integral equations and inverse problems, eigenvalue problems, and iterative software kernels. Volume II presents nonsymmetric solvers, parallel computation, theory of iterative methods, software and programming environment, ODE solvers, multigrid and multilevel methods, applications, robust iterative methods, preconditioners, Toeplitz and circulant solvers, and saddle point problems. Individual papers are indexed separately on the EDB.
Issues in developing parallel iterative algorithms for solving partial differential equations on a (transputer-based) distributed parallel computing system

International Nuclear Information System (INIS)

Rajagopalan, S.; Jethra, A.; Khare, A.N.; Ghodgaonkar, M.D.; Srivenkateshan, R.; Menon, S.V.G.

1990-01-01

Issues relating to implementing iterative procedures for the numerical solution of elliptic partial differential equations on a distributed parallel computing system are discussed. Preliminary investigations show that a speed-up of about 3.85 is achievable on a four-transputer pipeline network. (author). 2 figs., 3 appendixes, 7 refs.
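
As a worked check of the quoted figure, the parallel efficiency follows from the standard definition of efficiency as speedup divided by processor count. The symbols S, p and E are introduced here for illustration and do not appear in the abstract.

```latex
E = \frac{S}{p} = \frac{3.85}{4} = 0.9625
```

That is, the four-transputer pipeline would be running at roughly 96% of ideal linear speedup.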
Iterative schemes for parallel Sn algorithms in a shared-memory computing environment

International Nuclear Information System (INIS)

Haghighat, A.; Hunter, M.A.; Mattis, R.E.

1995-01-01

Several two-dimensional spatial domain partitioning S_n transport theory algorithms are developed on the basis of different iterative schemes. These algorithms are incorporated into TWOTRAN-II and tested on the shared-memory CRAY Y-MP C90 computer. For a series of fixed-source r-z geometry homogeneous problems, it is demonstrated that the concurrent red-black algorithms may result in large parallel efficiencies (>60%) on the C90. It is also demonstrated that for a realistic shielding problem, the use of the negative flux fixup causes high load imbalance, which results in a significant loss of parallel efficiency.
Writing parallel programs that work

CERN Multimedia

CERN. Geneva

2012-01-01

Serial algorithms typically run inefficiently on parallel machines. This may sound like an obvious statement, but it is the root cause of why parallel programming is considered to be difficult. The current state of the computer industry is still that almost all programs in existence are serial. This talk will describe the techniques used in the Intel Parallel Studio to provide a developer with the tools necessary to understand the behaviors and limitations of the existing serial programs. Once the limitations are known, the developer can refactor the algorithms and reanalyze the resulting programs with the tools in the Intel Parallel Studio to create parallel programs that work. About the speaker: Paul Petersen is a Sr. Principal Engineer in the Software and Solutions Group (SSG) at Intel. He received a Ph.D. degree in Computer Science from the University of Illinois in 1993. After UIUC, he was employed at Kuck and Associates, Inc. (KAI) working on auto-parallelizing compiler (KAP), and was involved in th...
Time parallelization of advanced operation scenario simulations of ITER plasma

International Nuclear Information System (INIS)

Samaddar, D; Casper, T A; Kim, S H; Houlberg, W A; Berry, L A; Elwasif, W R; Batchelor, D

2013-01-01

This work demonstrates that simulations of advanced burning plasma operation scenarios can be successfully parallelized in time using the parareal algorithm. CORSICA, an advanced operation scenario code for tokamak plasmas, is used as a test case. This is a unique application, since the parareal algorithm has so far been applied only to much simpler systems, except for the case of turbulence. In the present application, a computational gain of an order of magnitude has been achieved, which is extremely promising. A successful implementation of the parareal algorithm in codes like CORSICA opens up the possibility of time-efficient simulations of ITER plasmas.
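
For readers unfamiliar with parareal, here is a minimal sketch of the iteration on a toy problem, dy/dt = -y, with explicit Euler as both the coarse and the fine propagator. The toy equation, the solver choice and all names are assumptions for illustration; nothing here models CORSICA or the setup used in the paper.

```python
def euler(y, t0, t1, steps):
    """Integrate dy/dt = -y from t0 to t1 with explicit Euler."""
    dt = (t1 - t0) / steps
    for _ in range(steps):
        y = y + dt * (-y)
    return y


def parareal(y0, t_end, slices, iterations, fine_steps=100):
    """Return the values at the time-slice boundaries after the given iterations."""
    dt = t_end / slices
    times = [n * dt for n in range(slices + 1)]

    def coarse(y, n):  # cheap, inaccurate propagator over slice n
        return euler(y, times[n], times[n + 1], 1)

    def fine(y, n):  # expensive, accurate propagator over slice n
        return euler(y, times[n], times[n + 1], fine_steps)

    # Initial guess: one serial sweep with the coarse propagator.
    u = [y0]
    for n in range(slices):
        u.append(coarse(u[n], n))

    for _ in range(iterations):
        # These solves use only the previous iterate, so they are independent
        # across slices; this is the part that would run in parallel.
        f_old = [fine(u[n], n) for n in range(slices)]
        g_old = [coarse(u[n], n) for n in range(slices)]
        # Serial correction sweep: new coarse prediction plus the fine/coarse
        # discrepancy measured on the previous iterate.
        new = [y0]
        for n in range(slices):
            new.append(coarse(new[n], n) + f_old[n] - g_old[n])
        u = new
    return u


if __name__ == "__main__":
    print(parareal(1.0, 2.0, slices=8, iterations=3)[-1])
```

A useful property for checking an implementation is that after as many iterations as there are time slices, the result coincides with the serial fine solution up to rounding; practical speedup comes from needing far fewer iterations than that.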
The role of ITER in the US MFE Program Strategy

International Nuclear Information System (INIS)

Glass, A.J.

1992-07-01

I want to discuss the role of ITER in the US MFE Program Strategy. I should stress that any opinions I present are purely my own. I'm not speaking ex cathedra, I'm not speaking for the ITER Home Team, and I'm not speaking for the Lawrence Livermore National Laboratory. I'm giving my own personal opinions. In discussing the role of ITER, we have to recognize that ITER plays several roles, and I want to identify how ITER influences MFE program strategy through each of its roles
Parallel programming with Easy Java Simulations

Science.gov (United States)

Esquembre, F.; Christian, W.; Belloni, M.

2018-01-01

Nearly all of today's processors are multicore, and ideally programming and algorithm development utilizing the entire processor should be introduced early in the computational physics curriculum. Parallel programming is often not introduced because it requires a new programming environment and uses constructs that are unfamiliar to many teachers. We describe how we decrease the barrier to parallel programming by using a Java-based programming environment to treat problems in the usual undergraduate curriculum. We use the Easy Java Simulations programming and authoring tool to create the program's graphical user interface together with objects based on those developed by Kaminsky [Building Parallel Programs (Course Technology, Boston, 2010)] to handle common parallel programming tasks. Shared-memory parallel implementations of physics problems, such as time evolution of the Schrödinger equation, are available as source code and as ready-to-run programs from the AAPT-ComPADRE digital library.
Block iterative restoration of astronomical images with the massively parallel processor

International Nuclear Information System (INIS)

Heap, S.R.; Lindler, D.J.

1987-01-01

A method is described for algebraic image restoration capable of treating astronomical images. For a typical 500 x 500 image, direct algebraic restoration would require the solution of a 250,000 x 250,000 linear system. The block iterative approach is used to reduce the problem to solving 4900 121 x 121 linear systems. The algorithm was implemented on the Goddard Massively Parallel Processor, which can solve a 121 x 121 system in approximately 0.06 seconds. Examples are shown of the results for various astronomical images
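
The figures in this abstract can be tied together with a short worked calculation. The total-time estimate assumes the 4900 block systems are solved one after another at the quoted per-system time, ignoring any other overhead; the abstract does not state the total runtime, and the symbols N and T are introduced here for illustration.

```latex
N = 500 \times 500 = 250{,}000, \qquad
T \approx 4900 \times 0.06\ \mathrm{s} = 294\ \mathrm{s} \approx 4.9\ \mathrm{min}
```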
PCG: A software package for the iterative solution of linear systems on scalar, vector and parallel computers

Energy Technology Data Exchange (ETDEWEB)

Joubert, W. [Los Alamos National Lab., NM (United States)]; Carey, G.F. [Univ. of Texas, Austin, TX (United States)]

1994-12-31

A great need exists for high performance numerical software libraries transportable across parallel machines. This talk concerns the PCG package, which solves systems of linear equations by iterative methods on parallel computers. The features of the package are discussed, along with the techniques used to obtain both high performance and transportability across architectures. Representative numerical results are presented for several machines including the Connection Machine CM-5, Intel Paragon and Cray T3D parallel computers.
P3T+: A Performance Estimator for Distributed and Parallel Programs

Directory of Open Access Journals (Sweden)

T. Fahringer

2000-01-01

Developing distributed and parallel programs on today's multiprocessor architectures is still a challenging task. Particularly distressing is the lack of effective performance tools that support the programmer in evaluating changes in code, problem and machine sizes, and target architectures. In this paper we introduce P3T+, a performance estimator for mostly regular HPF (High Performance Fortran) programs that also partially covers message passing programs (MPI). P3T+ is unique in modeling programs, compiler code transformations, and parallel and distributed architectures. It computes at compile-time a variety of performance parameters including work distribution, number of transfers, amount of data transferred, transfer times, computation times, and number of cache misses. Several novel technologies are employed to compute these parameters: loop iteration spaces, array access patterns, and data distributions are modeled by employing highly effective symbolic analysis. Communication is estimated by simulating the behavior of a communication library used by the underlying compiler. Computation times are predicted through pre-measured kernels on every target architecture of interest. We carefully model most critical architecture-specific factors such as cache line sizes, number of cache lines available, startup times, message transfer time per byte, etc. P3T+ has been implemented and is closely integrated with the Vienna High Performance Compiler (VFC) to support programmers in developing parallel and distributed applications. Experimental results for realistic kernel codes taken from real-world applications are presented to demonstrate both accuracy and usefulness of P3T+.
Primal Domain Decomposition Method with Direct and Iterative Solver for Circuit-Field-Torque Coupled Parallel Finite Element Method to Electric Machine Modelling

Directory of Open Access Journals (Sweden)

Daniel Marcsa

2015-01-01

The analysis and design of electromechanical devices involve the solution of large sparse linear systems and therefore require high performance algorithms. In this paper, the primal Domain Decomposition Method (DDM) with parallel forward-backward and with parallel Preconditioned Conjugate Gradient (PCG) solvers is introduced in a two-dimensional parallel time-stepping finite element formulation to analyze a rotating machine, considering the electromagnetic field, external circuit and rotor movement. The proposed parallel direct solver and the iterative solver with two preconditioners are analyzed with respect to computational efficiency and the number of solver iterations with different preconditioners. Simulation results for a rotating machine are also presented.
Refinement of Parallel and Reactive Programs

OpenAIRE

Back, R. J. R.

1992-01-01

We show how to apply the refinement calculus to stepwise refinement of parallel and reactive programs. We use action systems as our basic program model. Action systems are sequential programs which can be implemented in a parallel fashion. Hence refinement calculus methods, originally developed for sequential programs, carry over to the derivation of parallel programs. Refinement of reactive programs is handled by data refinement techniques originally developed for the sequential refinement c...
The Application of Visual Basic Computer Programming Language to Simulate Numerical Iterations

Directory of Open Access Journals (Sweden)

Abdulkadir Baba HASSAN

2006-06-01

This paper examines the application of the Visual Basic computer programming language to simulate numerical iterations, the merits of Visual Basic as a programming language, and the difficulties faced when solving numerical iterations analytically. The paper encourages the use of computer programming methods for the execution of numerical iterations, and finally develops a reliable solution, using the Visual Basic package, in the form of a program for some selected iteration problems.
Migration of vectorized iterative solvers to distributed memory architectures

Energy Technology Data Exchange (ETDEWEB)

Pommerell, C. [AT&T Bell Labs., Murray Hill, NJ (United States)]; Ruehl, R. [CSCS-ETH, Manno (Switzerland)]

1994-12-31

Both necessity and opportunity motivate the use of high-performance computers for iterative linear solvers. Necessity results from the size of the problems being solved; smaller problems are often better handled by direct methods. Opportunity arises from the formulation of the iterative methods in terms of simple linear algebra operations, even if this "natural" parallelism is not easy to exploit in irregularly structured sparse matrices and with good preconditioners. As a result, high-performance implementations of iterative solvers have attracted a lot of interest in recent years. Most efforts are geared to vectorize or parallelize the dominating operation (structured or unstructured sparse matrix-vector multiplication), or to increase locality and parallelism by reformulating the algorithm (reducing global synchronization in inner products or local data exchange in preconditioners). Target architectures for iterative solvers currently include mostly vector supercomputers and architectures with one or few optimized (e.g., super-scalar and/or super-pipelined RISC) processors and hierarchical memory systems. More recently, parallel computers with physically distributed memory and a better price/performance ratio have been offered by vendors as a very interesting alternative to vector supercomputers. However, programming comfort on such distributed memory parallel processors (DMPPs) still lags behind. Here the authors are concerned with iterative solvers and their changing computing environment. In particular, they are considering migration from traditional vector supercomputers to DMPPs. Application requirements force one to use flexible and portable libraries. They want to extend the portability of iterative solvers rather than reimplementing everything for each new machine, or even for each new architecture.
Structured Parallel Programming Patterns for Efficient Computation

CERN Document Server

McCool, Michael; Robison, Arch

2012-01-01

Programming is now parallel programming. Much as structured programming revolutionized traditional serial programming decades ago, a new kind of structured programming, based on patterns, is relevant to parallel programming today. Parallel computing experts and industry insiders Michael McCool, Arch Robison, and James Reinders describe how to design and implement maintainable and efficient parallel algorithms using a pattern-based approach. They present both theory and practice, and give detailed concrete examples using multiple programming models. Examples are primarily given using two of th
Parallel Conjugate Gradient: Effects of Ordering Strategies, Programming Paradigms, and Architectural Platforms

Science.gov (United States)

Oliker, Leonid; Heber, Gerd; Biswas, Rupak

2000-01-01

The Conjugate Gradient (CG) algorithm is perhaps the best-known iterative technique to solve sparse linear systems that are symmetric and positive definite. A sparse matrix-vector multiply (SPMV) usually accounts for most of the floating-point operations within a CG iteration. In this paper, we investigate the effects of various ordering and partitioning strategies on the performance of parallel CG and SPMV using different programming paradigms and architectures. Results show that for this class of applications, ordering significantly improves overall performance, that cache reuse may be more important than reducing communication, and that it is possible to achieve message passing performance using shared memory constructs through careful data ordering and distribution. However, a multi-threaded implementation of CG on the Tera MTA does not require special ordering or partitioning to obtain high efficiency and scalability.
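To make the role of the sparse matrix-vector multiply concrete, here is a minimal serial sketch of CG in which the matrix is touched only through one SPMV per iteration. It is not the authors' code: the CSR layout, the function names (`spmv_csr`, `conjugate_gradient`, `tridiagonal_csr`) and the small tridiagonal test matrix are assumptions chosen for illustration, and no ordering, partitioning, or parallelism is attempted.

```python
import numpy as np

def spmv_csr(indptr, indices, data, x):
    """Sparse matrix-vector product y = A @ x with A stored in CSR form."""
    y = np.zeros(len(indptr) - 1)
    for row in range(len(y)):
        start, end = indptr[row], indptr[row + 1]
        y[row] = np.dot(data[start:end], x[indices[start:end]])
    return y

def conjugate_gradient(indptr, indices, data, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for a symmetric positive definite A given in CSR form."""
    x = np.zeros(len(b))
    r = b - spmv_csr(indptr, indices, data, x)   # initial residual
    p = r.copy()                                 # initial search direction
    rs_old = r @ r
    for _ in range(max_iter):
        if np.sqrt(rs_old) < tol:
            break
        Ap = spmv_csr(indptr, indices, data, p)  # the single SPMV per iteration
        alpha = rs_old / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        p = r + (rs_new / rs_old) * p
        rs_old = rs_new
    return x

def tridiagonal_csr(n):
    """Hypothetical test matrix: tridiag(-1, 2, -1), which is SPD."""
    indptr, indices, data = [0], [], []
    for i in range(n):
        for j, value in ((i - 1, -1.0), (i, 2.0), (i + 1, -1.0)):
            if 0 <= j < n:
                indices.append(j)
                data.append(value)
        indptr.append(len(indices))
    return np.array(indptr), np.array(indices), np.array(data)

indptr, indices, data = tridiagonal_csr(5)
b = np.ones(5)
x = conjugate_gradient(indptr, indices, data, b)
print(np.allclose(spmv_csr(indptr, indices, data, x), b))
```

Besides the SPMV, each iteration needs only two inner products and three vector updates, which is why the ordering of rows and columns (and hence the memory access pattern inside `spmv_csr`) can dominate performance.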
Outline and status of ITER program

International Nuclear Information System (INIS)

Kishimoto, Hiroshi; Shimomura, Yasuo

2002-01-01

ITER is an international joint program for the next-step fusion experimental reactor which aims to demonstrate extended/steady-state fusion burn of deuterium-tritium plasmas and to demonstrate the fusion technologies in an integrated manner as well as to perform integrated testing of components required to utilize fusion energy for practical purposes. On the basis of the recent scientific and engineering achievements in the world-wide tokamak research, the Engineering Design Activities for nine years were fully completed in July 2001. The so-called compact ITER with a finite Q≥10 was proposed and its detailed engineering design was developed along the line of world fusion research. Large scale engineering research and development were completed for superconducting coils, remote-maintenance technology, etc. The four ITER Parties (Japan, the European Union, the Russian Federation, and Canada) have initiated the governmental negotiations for the joint implementation of ITER. (author)

Parallel phase model: a programming model for high-end parallel machines with manycores

Energy Technology Data Exchange (ETDEWEB)

Wu, Junfeng (Syracuse University, Syracuse, NY); Wen, Zhaofang; Heroux, Michael Allen; Brightwell, Ronald Brian

2009-04-01

This paper presents a parallel programming model, Parallel Phase Model (PPM), for next-generation high-end parallel machines based on a distributed memory architecture consisting of a networked cluster of nodes with a large number of cores on each node. PPM has a unified high-level programming abstraction that facilitates the design and implementation of parallel algorithms to exploit both the parallelism of the many cores and the parallelism at the cluster level. The programming abstraction will be suitable for expressing both fine-grained and coarse-grained parallelism. It includes a few high-level parallel programming language constructs that can be added as an extension to an existing (sequential or parallel) programming language such as C; and the implementation of PPM also includes a light-weight runtime library that runs on top of an existing network communication software layer (e.g. MPI). Design philosophy of PPM and details of the programming abstraction are also presented. Several unstructured applications that inherently require high-volume random fine-grained data accesses have been implemented in PPM with very promising results.
An iterative algorithm for solving the multidimensional neutron diffusion nodal method equations on parallel computers

International Nuclear Information System (INIS)

Kirk, B.L.; Azmy, Y.Y.

1992-01-01

In this paper the one-group, steady-state neutron diffusion equation in two-dimensional Cartesian geometry is solved using the nodal integral method. The discrete variable equations comprise loosely coupled sets of equations representing the nodal balance of neutrons, as well as neutron current continuity along rows or columns of computational cells. An iterative algorithm that is more suitable for solving large problems concurrently is derived based on the decomposition of the spatial domain and is accelerated using successive overrelaxation. This algorithm is very well suited for parallel computers, especially since the spatial domain decomposition occurs naturally, so that the number of iterations required for convergence does not depend on the number of processors participating in the calculation. Implementation of the authors' algorithm on the Intel iPSC/2 hypercube and Sequent Balance 8000 parallel computer is presented, and measured speedup and efficiency for test problems are reported. The results suggest that the efficiency of the hypercube quickly deteriorates when many processors are used, while the Sequent Balance retains very high efficiency for a comparable number of participating processors. This leads to the conjecture that message-passing parallel computers are not as well suited for this algorithm as shared-memory machines
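The acceleration step named in this abstract, successive overrelaxation (SOR), can be illustrated on a much simpler model problem. The sketch below is not the nodal integral method and performs no domain decomposition: it assumes a five-point finite difference discretization of a Poisson-type equation on a unit-spaced grid with zero boundary values, and the function name `sor_poisson`, the grid size and the relaxation factor 1.5 are all hypothetical choices.

```python
import numpy as np

def sor_poisson(source, omega, tol=1e-8, max_sweeps=10_000):
    """Solve the 5-point system 4*phi - (sum of neighbours) = source
    with phi = 0 on the boundary. Returns (phi, sweeps used)."""
    phi = np.zeros_like(source, dtype=float)
    ny, nx = source.shape
    for sweep in range(1, max_sweeps + 1):
        max_change = 0.0
        for i in range(1, ny - 1):
            for j in range(1, nx - 1):
                # Plain Gauss-Seidel value for this cell.
                gs = 0.25 * (phi[i - 1, j] + phi[i + 1, j]
                             + phi[i, j - 1] + phi[i, j + 1] + source[i, j])
                # Overrelax: step past the Gauss-Seidel value when omega > 1.
                new = (1.0 - omega) * phi[i, j] + omega * gs
                max_change = max(max_change, abs(new - phi[i, j]))
                phi[i, j] = new
        if max_change < tol:
            return phi, sweep
    return phi, max_sweeps

source = np.ones((12, 12))
phi_gs, sweeps_gs = sor_poisson(source, omega=1.0)    # Gauss-Seidel
phi_sor, sweeps_sor = sor_poisson(source, omega=1.5)  # overrelaxed

print(sweeps_sor < sweeps_gs)
print(np.allclose(phi_gs, phi_sor, atol=1e-5))
```

With `omega = 1.0` the loop reduces to Gauss-Seidel, so the two runs show how overrelaxation cuts the number of sweeps while converging to the same solution.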
A Novel Parallel Algorithm for Edit Distance Computation

Directory of Open Access Journals (Sweden)

Muhammad Murtaza Yousaf

2018-01-01

The edit distance between two sequences is the minimum number of weighted transformation-operations that are required to transform one string into the other. The weighted transformation-operations are insert, remove, and substitute. A dynamic programming solution to find the edit distance exists, but it becomes computationally intensive when the lengths of the strings become very large. This work presents a novel parallel algorithm to solve the edit distance problem of string matching. The algorithm is based on resolving dependencies in the dynamic programming solution of the problem, and it is able to compute each row of the edit distance table in parallel. In this way, it becomes possible to compute the complete table in min(m,n) iterations for strings of size m and n, whereas the state-of-the-art parallel algorithm solves the problem in max(m,n) iterations. The proposed algorithm also increases the amount of parallelism in each of its iterations. The algorithm is also capable of exploiting spatial locality in its implementation. Additionally, the algorithm works in a load balanced way that further improves its performance. The algorithm is implemented for multicore systems having shared memory. Implementation of the algorithm in OpenMP shows linear speedup and better execution time as compared to the state-of-the-art parallel approach. The efficiency of the algorithm is also shown to be better than that of its competitor.
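One way to see how a whole row of the edit distance table can be computed at once is the following sketch. It uses a well-known reformulation in which the left-to-right dependency within a row becomes a running minimum; this may differ from the dependency-resolution scheme of the paper. Unit weights for all three operations are assumed, NumPy vectorization stands in for multicore parallelism, and the function name `edit_distance_rowwise` is illustrative.

```python
import numpy as np

def edit_distance_rowwise(a, b):
    """Unit-cost edit distance, one vectorized step per table row."""
    # Put the shorter string on the rows so the loop runs min(m, n) times.
    if len(a) > len(b):
        a, b = b, a
    b_codes = np.array([ord(ch) for ch in b])
    cols = np.arange(len(b) + 1)
    row = cols.copy()                  # row 0: distance from the empty prefix
    for i, ch in enumerate(a, start=1):
        diagonal = row[:-1] + (b_codes != ord(ch))   # substitute or match
        vertical = row[1:] + 1                       # move down one row
        t = np.empty_like(row)
        t[0] = i
        t[1:] = np.minimum(diagonal, vertical)
        # Horizontal moves chain along the row:
        #   row[j] = min over k <= j of t[k] + (j - k),
        # which is a running minimum of t[k] - k, shifted back by j.
        row = np.minimum.accumulate(t - cols) + cols
    return int(row[-1])

print(edit_distance_rowwise("kitten", "sitting"))  # 3
print(edit_distance_rowwise("flaw", "lawn"))       # 2
```

The running minimum is a prefix scan, which is a standard parallel primitive; that is what allows every entry of a row to be produced in one step once the previous row is known.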
ITER...ation

International Nuclear Information System (INIS)

Troyon, F.

1997-01-01

Recurrent attacks against ITER, the new generation of tokamak, are a mix of political and scientific arguments. This short article gives a historical review of the European fusion program. This program has made it possible to build and operate several installations with the aim of obtaining the experimental results necessary to move the program forward. ITER will bring together a fusion reactor core with technologies such as materials, superconductive coils, heating devices and instrumentation in order to validate and delimit the operating range. ITER will be a logical and decisive step towards the use of controlled fusion. (A.C.)
Programming massively parallel processors a hands-on approach

CERN Document Server

Kirk, David B

2010-01-01

Programming Massively Parallel Processors discusses basic concepts about parallel programming and GPU architecture. "Massively parallel" refers to the use of a large number of processors to perform a set of computations in a coordinated parallel way. The book details various techniques for constructing parallel programs. It also discusses the development process, performance level, floating-point format, parallel patterns, and dynamic parallelism. The book serves as a teaching guide where parallel programming is the main topic of the course. It builds on the basics of C programming for CUDA, a parallel programming environment that is supported on NVIDIA GPUs. Composed of 12 chapters, the book begins with basic information about the GPU as a parallel computer source. It also explains the main concepts of CUDA, data parallelism, and the importance of memory access efficiency using CUDA. The target audience of the book is graduate and undergraduate students from all science and engineering disciplines who ...
Improved Iterative Parallel Interference Cancellation Receiver for Future Wireless DS-CDMA Systems

Directory of Open Access Journals (Sweden)

Andrea Bernacchioni

2005-04-01

We present a new turbo multiuser detector for turbo-coded direct sequence code division multiple access (DS-CDMA) systems. The proposed detector is based on the utilization of a parallel interference cancellation (PIC) and a bank of turbo decoders. The PIC is broken up in order to perform interference cancellation after each constituent decoder of the turbo decoding scheme. Moreover, in the paper we propose a new enhanced algorithm that provides a more accurate estimation of the signal-to-noise-plus-interference-ratio used in the tentative decision device and in the MAP decoding algorithm. The performance of the proposed receiver is evaluated by means of computer simulations for medium to very high system loads, in AWGN and multipath fading channels, and compared to recently proposed interference cancellation-based iterative MUD, by taking into account the number of iterations and the complexity involved. We will see that the proposed receiver outperforms the others, especially for highly loaded systems.
Experiences in Data-Parallel Programming

Directory of Open Access Journals (Sweden)

Terry W. Clark

1997-01-01

To efficiently parallelize a scientific application with a data-parallel compiler requires certain structural properties in the source program, and conversely, the absence of others. A recent parallelization effort of ours reinforced this observation and motivated this correspondence. Specifically, we have transformed a Fortran 77 version of GROMOS, a popular dusty-deck program for molecular dynamics, into Fortran D, a data-parallel dialect of Fortran. During this transformation we have encountered a number of difficulties that probably are neither limited to this particular application nor do they seem likely to be addressed by improved compiler technology in the near future. Our experience with GROMOS suggests a number of points to keep in mind when developing software that may at some time in its life cycle be parallelized with a data-parallel compiler. This note presents some guidelines for engineering data-parallel applications that are compatible with Fortran D or High Performance Fortran compilers.
Automatic Parallelization An Overview of Fundamental Compiler Techniques

CERN Document Server

Midkiff, Samuel P

2012-01-01

Compiling for parallelism is a longstanding topic of compiler research. This book describes the fundamental principles of compiling "regular" numerical programs for parallelism. We begin with an explanation of analyses that allow a compiler to understand the interaction of data reads and writes in different statements and loop iterations during program execution. These analyses include dependence analysis, use-def analysis and pointer analysis. Next, we describe how the results of these analyses are used to enable transformations that make loops more amenable to parallelization, and
The new Exponential Directional Iterative (EDI) 3-D Sn scheme for parallel adaptive differencing

International Nuclear Information System (INIS)

Sjoden, G.E.

2005-01-01

The new Exponential Directional Iterative (EDI) discrete ordinates (Sn) scheme for 3-D Cartesian Coordinates is presented. The EDI scheme is a logical extension of the positive, efficient Exponential Directional Weighted (EDW) Sn scheme currently used as the third level of the adaptive spatial differencing algorithm in the PENTRAN parallel discrete ordinates solver. Here, the derivation and advantages of the EDI scheme are presented; EDI uses EDW-rendered exponential coefficients as initial starting values to begin a fixed point iteration of the exponential coefficients. One issue that required evaluation was an iterative cutoff criterion to prevent the application of an unstable fixed point iteration; although this was needed in some cases, it was readily treated with a default to EDW. Iterative refinement of the exponential coefficients in EDI typically converged in fewer than four fixed point iterations. Moreover, EDI yielded more accurate angular fluxes compared to the other schemes tested, particularly in streaming conditions. Overall, it was found that the EDI scheme was up to an order of magnitude more accurate than the EDW scheme on a given mesh interval in streaming cases, and is potentially a good candidate as a fourth-level differencing scheme in the PENTRAN adaptive differencing sequence. The 3-D Cartesian computational cost of EDI was only about 20% more than the EDW scheme, and about 40% more than Diamond Zero (DZ). More evaluation and testing are required to determine suitable upgrade metrics for EDI to be fully integrated into the current adaptive spatial differencing sequence in PENTRAN. (author)
Direct and iterative algorithms for the parallel solution of the one-dimensional macroscopic Navier-Stokes equations

International Nuclear Information System (INIS)

Doster, J.M.; Sills, E.D.

1986-01-01

Current efforts are under way to develop and evaluate numerical algorithms for the parallel solution of the large sparse matrix equations associated with the finite difference representation of the macroscopic Navier-Stokes equations. Previous work has shown that these equations can be cast into smaller coupled matrix equations suitable for solution utilizing multiple computer processors operating in parallel. The individual processors themselves may exhibit parallelism through the use of vector pipelines. This work has concentrated on the one-dimensional drift flux form of the Navier-Stokes equations. Direct and iterative algorithms that may be suitable for implementation on parallel computer architectures are evaluated in terms of accuracy and overall execution speed. This work has application to engineering and training simulations, on-line process control systems, and engineering workstations where increased computational speeds are required
PDDP, A Data Parallel Programming Model

Directory of Open Access Journals (Sweden)

Karen H. Warren

1996-01-01

PDDP, the parallel data distribution preprocessor, is a data parallel programming model for distributed memory parallel computers. PDDP implements high-performance Fortran-compatible data distribution directives and parallelism expressed by the use of Fortran 90 array syntax, the FORALL statement, and the WHERE construct. Distributed data objects belong to a global name space; other data objects are treated as local and replicated on each processor. PDDP allows the user to program in a shared memory style and generates codes that are portable to a variety of parallel machines. For interprocessor communication, PDDP uses the fastest communication primitives on each platform.
Language constructs for modular parallel programs

Energy Technology Data Exchange (ETDEWEB)

Foster, I.

1996-03-01

We describe programming language constructs that facilitate the application of modular design techniques in parallel programming. These constructs allow us to isolate resource management and processor scheduling decisions from the specification of individual modules, which can themselves encapsulate design decisions concerned with concurrency, communication, process mapping, and data distribution. This approach permits development of libraries of reusable parallel program components and the reuse of these components in different contexts. In particular, alternative mapping strategies can be explored without modifying other aspects of program logic. We describe how these constructs are incorporated in two practical parallel programming languages, PCN and Fortran M. Compilers have been developed for both languages, allowing experimentation in substantial applications.
Development of parallel/serial program analyzing tool

International Nuclear Information System (INIS)

Watanabe, Hiroshi; Nagao, Saichi; Takigawa, Yoshio; Kumakura, Toshimasa

1999-03-01

Japan Atomic Energy Research Institute has been developing 'KMtool', a parallel/serial program analyzing tool, in order to promote the parallelization of science and engineering computation programs. KMtool analyzes the performance of programs written in FORTRAN77 and MPI, and it reduces the effort required for parallelization. This paper describes the development purpose, design, utilization and evaluation of KMtool. (author)
Nuclear systems and testing programs for ITER. Progress report for FY 1998

International Nuclear Information System (INIS)

1998-01-01

The effort during this performance period focused on a number of TBWG activities (including test module design and analysis) that were identified and agreed upon (in the presence of the ITER Director and Deputy Director) at TBWG-4. These include: (a) DEMO test module design and performance analysis under pulsed operation; (b) Test program operation plan; (c) Test port design and analysis; (d) Decay heat calculations and safety analysis; (e) Further discussion among the parties to define collaboratory on R and D for the test program as well as possible collaboration on the construction and operation of test articles; (f) Remote handling and ancillary equipment; (g) Criteria for qualifying a blanket module or submodule for actual insertion and testing in ITER; (h) Definition of test module instrumentation and verification of capability to perform in the ITER fusion environment (magnetic field, radiation, heating, etc.); and (i) Analysis to show that the results to be obtained from the test modules as designed can be extrapolated to DEMO and reactor conditions (e.g., higher wall loads and the need to demonstrate tritium self-sufficiency). The main achievements during this performance period include: (1) updating and finalizing the US DDDs for the ITER Blanket Program to form part of the ITER Final Design Report (FDR). Specific revisions were in response to the minimal lithium volume test blanket design requirements and safety impact and (2) evaluating the feasibility of the US test program, including instrumentation and the benefits of the ITER test program. Details of this assessment, including solid breeder and liquid breeder blanket test plans, are documented in UCLA-IFNT-13 (attached). In addition, dose mapping calculations were performed for the ITER Building, including equipment and layout of coolant pipes/heat exchangers. A report on ITER Building dose calculations was sent to US ITER management and to the Garching Task Coordinator in April, 1998. The report
PSHED: a simplified approach to developing parallel programs

International Nuclear Information System (INIS)

Mahajan, S.M.; Ramesh, K.; Rajesh, K.; Somani, A.; Goel, M.

1992-01-01

This paper presents a simplified approach in the form of a tree-structured computational model for parallel application programs. An attempt is made to provide a standard user interface to execute programs on the BARC Parallel Processing System (BPPS), a scalable distributed memory multiprocessor. The interface package called PSHED provides a basic framework for representing and executing parallel programs on different parallel architectures. The PSHED package incorporates concepts from a broad range of previous research in programming environments and parallel computations. (author). 6 refs
Professional Parallel Programming with C#: Master Parallel Extensions with .NET 4

CERN Document Server

Hillar, Gastón

2010-01-01

Expert guidance for those programming today's dual-core processor PCs. As PC processors explode from one or two to now eight processors, there is an urgent need for programmers to master concurrent programming. This book dives deep into the latest technologies available to programmers for creating professional parallel applications using C#, .NET 4, and Visual Studio 2010. The book covers task-based programming, coordination data structures, PLINQ, thread pools, asynchronous programming model, and more. It also teaches other parallel programming techniques, such as SIMD and vectorization. Teach
III - Template Metaprogramming for massively parallel scientific computing - Templates for Iteration; Thread-level Parallelism

CERN Multimedia

CERN. Geneva

2016-01-01

Large scale scientific computing raises questions on different levels ranging from the formulation of the problems to the choice of the best algorithms and their implementation for a specific platform. There are similarities in these different topics that can be exploited by modern-style C++ template metaprogramming techniques to produce readable, maintainable and generic code. Traditional low-level code tends to be fast but platform-dependent, and it obfuscates the meaning of the algorithm. On the other hand, the object-oriented approach is nice to read, but may come with an inherent performance penalty. These lectures aim to present the basics of the Expression Template (ET) idiom which allows us to keep the object-oriented approach without sacrificing performance. We will in particular show how to enhance ET to include SIMD vectorization. We will then introduce techniques for abstracting iteration, and introduce thread-level parallelism for use in heavy data-centric loads. We will show how to apply these methods i...
About Parallel Programming: Paradigms, Parallel Execution and Collaborative Systems

Directory of Open Access Journals (Sweden)

Loredana MOCEAN

2009-01-01

In recent years, efforts have been made to delineate a stable and unified framework in which the problems of logical parallel processing can find solutions, at least at the level of imperative languages. The results obtained so far do not match the efforts made. This paper aims to be a small contribution to these efforts. We propose an overview of parallel programming, parallel execution and collaborative systems.
A mobile robot with parallel kinematics constructed under requirements for assembling and machining of the ITER vacuum vessel

International Nuclear Information System (INIS)

Pessi, P.; Huapeng Wu; Handroos, H.; Jones, L.

2006-01-01

ITER sectors require more stringent tolerances (± 5 mm) than normally expected for the size of structure involved. The walls of ITER sectors are made of 60 mm thick stainless steel and are joined together by high efficiency structural and leak tight welds. In addition to the initial vacuum vessel assembly, sectors may have to be replaced for repair. Since commercially available machines are too heavy for the required machining operations and the lifting of a possible e-beam gun column system, and conventional robots lack the stiffness and accuracy in such machining conditions, a new flexible, lightweight and mobile robotic machine is being considered. For the assembly of the ITER vacuum vessel sector, precise positioning of welding end-effectors, at some distance in a confined space from the available supports, will be required, which is not possible using conventional machines or robots. This paper presents a special robot, able to carry out welding and machining processes from inside the ITER vacuum vessel, consisting of a ten-degree-of-freedom parallel robot mounted on a carriage driven by electric motor/gearbox on a track. The robot consists of a Stewart platform based parallel mechanism. Water hydraulic cylinders are used as actuators to reach six degrees of freedom for the parallel construction. Two linear and two rotational motions are used to enlarge the workspace of the manipulator. The robot carries both a welding gun, such as a TIG, hybrid laser or e-beam welding gun, to weld the inner and outer walls of the ITER vacuum vessel sectors, and machining tools to cut and mill the walls with the necessary accuracy; it can also carry other tools and material to a required position inside the vacuum vessel. For assembling, an on-line six-degrees-of-freedom seam finding algorithm has been developed, which enables the robot to find the welding seam automatically in a very complex environment. In the machining multi flexible machining processes carried out automatically by
A Tutorial on Parallel and Concurrent Programming in Haskell

Science.gov (United States)

Peyton Jones, Simon; Singh, Satnam

This practical tutorial introduces the features available in Haskell for writing parallel and concurrent programs. We first describe how to write semi-explicit parallel programs by using annotations to express opportunities for parallelism and to help control the granularity of parallelism for effective execution on modern operating systems and processors. We then describe the mechanisms provided by Haskell for writing explicitly parallel programs with a focus on the use of software transactional memory to help share information between threads. Finally, we show how nested data parallelism can be used to write deterministically parallel programs which allows programmers to use rich data types in data parallel programs which are automatically transformed into flat data parallel versions for efficient execution on multi-core processors.

Fast implementations of 3D PET reconstruction using vector and parallel programming techniques

International Nuclear Information System (INIS)

Guerrero, T.M.; Cherry, S.R.; Dahlbom, M.; Ricci, A.R.; Hoffman, E.J.

1993-01-01

Computationally intensive techniques that offer potential clinical use have arisen in nuclear medicine. Examples include iterative reconstruction, 3D PET data acquisition and reconstruction, and 3D image volume manipulation including image registration. One obstacle in achieving clinical acceptance of these techniques is the computational time required. This study focuses on methods to reduce the computation time for 3D PET reconstruction through the use of fast computer hardware, vector and parallel programming techniques, and algorithm optimization. The strengths and weaknesses of i860 microprocessor based workstation accelerator boards are investigated in implementations of 3D PET reconstruction
Productive Parallel Programming: The PCN Approach

Directory of Open Access Journals (Sweden)

Ian Foster

1992-01-01

We describe the PCN programming system, focusing on those features designed to improve the productivity of scientists and engineers using parallel supercomputers. These features include a simple notation for the concise specification of concurrent algorithms, the ability to incorporate existing Fortran and C code into parallel applications, facilities for reusing parallel program components, a portable toolkit that allows applications to be developed on a workstation or small parallel computer and run unchanged on supercomputers, and integrated debugging and performance analysis tools. We survey representative scientific applications and identify problem classes for which PCN has proved particularly useful.
Development and control towards a parallel water hydraulic weld/cut robot for machining processes in ITER vacuum vessel

International Nuclear Information System (INIS)

Wu Huapeng; Handroos, Heikki; Pessi, Pekka; Kilkki, Juha; Jones, Lawrence

2005-01-01

This paper presents a special robot, able to carry out welding and machining processes from inside the ITER vacuum vessel (VV), consisting of a five degree-of-freedom parallel mechanism, mounted on a carriage driven by two electric motors on a rack. The kinematic design of the robot has been optimised for ITER access and a hydraulically actuated pre-prototype built. A hybrid controller is designed for the robot, including position, speed and pressure feedback loops to achieve high accuracy and high dynamic performance. Finally, experimental tests are presented and discussed
ITER tungsten divertor design development and qualification program

Energy Technology Data Exchange (ETDEWEB)

Hirai, T., E-mail: takeshi.hirai@iter.org [ITER Organization, Route de Vinon sur Verdon, F-13115 Saint Paul lez Durance (France); Escourbiac, F.; Carpentier-Chouchana, S.; Fedosov, A.; Ferrand, L.; Jokinen, T.; Komarov, V.; Kukushkin, A.; Merola, M.; Mitteau, R.; Pitts, R.A.; Shu, W.; Sugihara, M. [ITER Organization, Route de Vinon sur Verdon, F-13115 Saint Paul lez Durance (France); Riccardi, B. [F4E, c/ Josep Pla, n.2, Torres Diagonal Litoral, Edificio B3, E-08019 Barcelona (Spain); Suzuki, S. [JAEA, Fusion Research and Development Directorate JAEA, 801-1 Mukouyama, Naka, Ibaragi 311-0193 (Japan); Villari, R. [Associazione EURATOM-ENEA sulla Fusione, Via Enrico Fermi 45, I-00044 Frascati, Rome (Italy)

2013-10-15

Highlights: • Detailed design development plan for the ITER tungsten divertor. • Latest status of the ITER tungsten divertor design. • Brief overview of qualification program for the ITER tungsten divertor and status of R and D activity. -- Abstract: In November 2011, the ITER Council endorsed the recommendation that a period of up to 2 years be set to develop a full-tungsten divertor design and accelerate technology qualification in view of a possible decision to start operation with a divertor having a full-tungsten plasma-facing surface. To ensure a solid foundation for such a decision, a full-tungsten divertor design, together with a demonstration of the necessary high performance tungsten monoblock technology, should be completed within the required timescale. The status of both the design and technology R and D activity is summarized in this paper.
Discrete-Time Local Value Iteration Adaptive Dynamic Programming: Admissibility and Termination Analysis.

Science.gov (United States)

Wei, Qinglai; Liu, Derong; Lin, Qiao

In this paper, a novel local value iteration adaptive dynamic programming (ADP) algorithm is developed to solve infinite horizon optimal control problems for discrete-time nonlinear systems. The paper focuses on the admissibility properties and the termination criteria of discrete-time local value iteration ADP algorithms. In the discrete-time local value iteration ADP algorithm, the iterative value functions and the iterative control laws are both updated in a given subset of the state space in each iteration, instead of the whole state space. For the first time, admissibility properties of iterative control laws are analyzed for the local value iteration ADP algorithm. New termination criteria are established, which terminate the iterative local ADP algorithm with an admissible approximate optimal control law. Finally, simulation results are given to illustrate the performance of the developed algorithm.
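The local update idea can be illustrated with a minimal sketch, assuming a tiny tabular problem instead of the nonlinear systems treated in the paper. The five-state chain, the unit stage cost, and the names `step`, `cost` and `local_value_iteration` are hypothetical choices made for illustration; the paper's admissibility and termination analysis is not reproduced here.

```python
# Toy deterministic chain (an assumption, not the paper's system):
# states 0..4, state 0 is a zero-cost goal, every other state costs 1 per step.
N_STATES = 5
ACTIONS = (-1, 0)  # move one state towards the goal, or stay


def step(state, action):
    """Hypothetical dynamics: next state, clamped to the chain."""
    return min(max(state + action, 0), N_STATES - 1)


def cost(state, action):
    """Hypothetical stage cost: zero at the goal, one elsewhere."""
    return 0.0 if state == 0 else 1.0


def local_value_iteration(subsets, iterations):
    """Update the value function on only one subset of states per iteration."""
    value = [0.0] * N_STATES  # zero initial value function
    for k in range(iterations):
        subset = subsets[k % len(subsets)]
        new_value = list(value)  # states outside the subset keep their value
        for s in subset:
            new_value[s] = min(cost(s, a) + value[step(s, a)] for a in ACTIONS)
        value = new_value
    return value
```

For this toy chain, alternating between the overlapping subsets `[0, 1, 2]` and `[2, 3, 4]` reaches the cost-to-go of the full problem (the distance to the goal) even though no single iteration touches the whole state space.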
Step by step parallel programming method for molecular dynamics code

International Nuclear Information System (INIS)

Orii, Shigeo; Ohta, Toshio

1996-07-01

Parallel programming for a molecular dynamics numerical simulation program is carried out with a step-by-step programming technique using the two phase method. As a result, within a certain range of computing parameters, parallel performance is obtained on the vector parallel computer VPP500 and the scalar parallel computer Paragon by using the level of parallel programming that decomposes the calculation across processors according to the indices of do-loops. It is also found that the VPP500 shows parallel performance over a wider range of computing parameters. The reason is that the time cost of the program parts that cannot be reduced by do-loop-level parallel programming can be reduced to a negligible level by vectorization. The time-consuming parts of the program are then concentrated in fewer parts, which can be accelerated by do-loop-level parallel programming. This report shows the step-by-step parallel programming method and the parallel performance of the molecular dynamics code on VPP500 and Paragon. (author)
Program Transformation to Identify List-Based Parallel Skeletons

Directory of Open Access Journals (Sweden)

Venkatesh Kannan

2016-07-01

Algorithmic skeletons are used as building-blocks to ease the task of parallel programming by abstracting the details of parallel implementation from the developer. Most existing libraries provide implementations of skeletons that are defined over flat data types such as lists or arrays. However, skeleton-based parallel programming is still very challenging as it requires intricate analysis of the underlying algorithm and often uses inefficient intermediate data structures. Further, the algorithmic structure of a given program may not match those of list-based skeletons. In this paper, we present a method to automatically transform any given program to one that is defined over a list and is more likely to contain instances of list-based skeletons. This facilitates the parallel execution of a transformed program using existing implementations of list-based parallel skeletons. Further, by using an existing transformation called distillation in conjunction with our method, we produce transformed programs that contain fewer inefficient intermediate data structures.
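As a rough illustration of what list-based skeletons look like, the sketch below defines map and reduce skeletons over Python lists. It is not the paper's transformation or any particular skeleton library; the names `par_map`, `par_reduce` and `sum_of_squares` are hypothetical, and the reduce skeleton assumes an associative operator with a known identity element.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce
import operator


def par_map(func, items, workers=4):
    """Map skeleton: apply func to every list element using a thread pool."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(func, items))


def par_reduce(op, items, unit, workers=4):
    """Reduce skeleton: fold chunks concurrently, then fold the partial results.

    Assumes op is associative and unit is its identity element.
    """
    if not items:
        return unit
    size = max(1, (len(items) + workers - 1) // workers)
    chunks = [items[i:i + size] for i in range(0, len(items), size)]
    partials = par_map(lambda chunk: reduce(op, chunk, unit), chunks, workers)
    return reduce(op, partials, unit)


def sum_of_squares(xs):
    """A program written over a list, so that both skeletons apply directly."""
    return par_reduce(operator.add, par_map(lambda x: x * x, xs), 0)
```

Threads are used only to keep the sketch short; in standard CPython they will not speed up CPU-bound work, so a real skeleton implementation would typically rely on processes or native code. The intermediate list built by `par_map` is an instance of the intermediate data structures the abstract describes as a source of inefficiency.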
Dynamical behaviour of neuronal networks iterated with memory

International Nuclear Information System (INIS)

Melatagia, P.M.; Ndoundam, R.; Tchuente, M.

2005-11-01

We study memory iteration where the updating considers a longer history of each site and the set of interaction matrices is palindromic. We analyze two different ways of updating the networks: parallel iteration with memory and sequential iteration with memory, which we introduce in this paper. For parallel iteration, we define a Lyapunov functional which permits us to characterize the period behaviour and to explicitly bound the transient lengths of neural networks iterated with memory. For sequential iteration, we use an algebraic invariant to characterize the period behaviour of the studied model of neural computation. (author)
Portable parallel programming in a Fortran environment

International Nuclear Information System (INIS)

May, E.N.

1989-01-01

Experience using the Argonne-developed PARMACs macro package to implement a portable parallel programming environment is described. Fortran programs with intrinsic parallelism of coarse and medium granularity are easily converted to parallel programs which are portable among a number of commercially available parallel processors in the class of shared-memory bus-based and local-memory network based MIMD processors. The parallelism is implemented using standard UNIX (tm) tools and a small number of easily understood synchronization concepts (monitors and message-passing techniques) to construct and coordinate multiple cooperating processes on one or many processors. Benchmark results are presented for parallel computers such as the Alliant FX/8, the Encore MultiMax, the Sequent Balance, the Intel iPSC/2 Hypercube and a network of Sun 3 workstations. These parallel machines are typical MIMD types with from 8 to 30 processors, each rated at from 1 to 10 MIPS processing power. The demonstration code used for this work is a Monte Carlo simulation of the response to photons of a "nearly realistic" lead, iron and plastic electromagnetic and hadronic calorimeter, using the EGS4 code system. 6 refs., 2 figs., 2 tabs
Parallel processor programs in the Federal Government

Science.gov (United States)

Schneck, P. B.; Austin, D.; Squires, S. L.; Lehmann, J.; Mizell, D.; Wallgren, K.

1985-01-01

In 1982, a report dealing with the nation's research needs in high-speed computing called for increased access to supercomputing resources for the research community, research in computational mathematics, and increased research in the technology base needed for the next generation of supercomputers. Since that time a number of programs addressing future generations of computers, particularly parallel processors, have been started by U.S. government agencies. The present paper provides a description of the largest government programs in parallel processing. Established in fiscal year 1985 by the Institute for Defense Analyses for the National Security Agency, the Supercomputing Research Center will pursue research to advance the state of the art in supercomputing. Attention is also given to the DOE applied mathematical sciences research program, the NYU Ultracomputer project, the DARPA multiprocessor system architectures program, NSF research on multiprocessor systems, ONR activities in parallel computing, and NASA parallel processor projects.
Parallelization in Modern C++

CERN Multimedia

CERN. Geneva

2016-01-01

The traditionally used and well established parallel programming models OpenMP and MPI are both targeting lower level parallelism and are meant to be as language agnostic as possible. For a long time, those models were the only widely available portable options for developing parallel C++ applications beyond using plain threads. This has strongly limited the optimization capabilities of compilers, has inhibited extensibility and genericity, and has restricted the use of those models together with other, modern higher level abstractions introduced by the C++11 and C++14 standards. The recent revival of interest in the industry and wider community for the C++ language has also spurred a remarkable amount of standardization proposals and technical specifications being developed. Those efforts however have so far failed to build a vision on how to seamlessly integrate various types of parallelism, such as iterative parallel execution, task-based parallelism, asynchronous many-task execution flows, continuation s...
Parallel iterative procedures for approximate solutions of wave propagation by finite element and finite difference methods

Energy Technology Data Exchange (ETDEWEB)

Kim, S. [Purdue Univ., West Lafayette, IN (United States)

1994-12-31

Parallel iterative procedures based on domain decomposition techniques are defined and analyzed for the numerical solution of wave propagation by finite element and finite difference methods. For finite element methods, in a Lagrangian framework, an efficient way for choosing the algorithm parameter as well as the algorithm convergence are indicated. Some heuristic arguments for finding the algorithm parameter for finite difference schemes are addressed. Numerical results are presented to indicate the effectiveness of the methods.
The Glasgow Parallel Reduction Machine: Programming Shared-memory Many-core Systems using Parallel Task Composition

Directory of Open Access Journals (Sweden)

Ashkan Tousimojarad

2013-12-01

We present the Glasgow Parallel Reduction Machine (GPRM), a novel, flexible framework for parallel task-composition based many-core programming. We allow the programmer to structure programs into task code, written as C++ classes, and communication code, written in a restricted subset of C++ with functional semantics and parallel evaluation. In this paper we discuss the GPRM, the virtual machine framework that enables the parallel task composition approach. We focus the discussion on GPIR, the functional language used as the intermediate representation of the bytecode running on the GPRM. Using examples in this language we show the flexibility and power of our task composition framework. We demonstrate the potential using an implementation of a merge sort algorithm on a 64-core Tilera processor, as well as on a conventional Intel quad-core processor and an AMD 48-core processor system. We also compare our framework with OpenMP tasks in a parallel pointer chasing algorithm running on the Tilera processor. Our results show that the GPRM programs outperform the corresponding OpenMP codes on all test platforms, and can greatly facilitate writing of parallel programs, in particular non-data parallel algorithms such as reductions.
Vector and parallel processors in computational science

International Nuclear Information System (INIS)

Duff, I.S.; Reid, J.K.

1985-01-01

This book presents the papers given at a conference which reviewed the new developments in parallel and vector processing. Topics considered at the conference included hardware (array processors, supercomputers), programming languages, software aids, numerical methods (e.g., Monte Carlo algorithms, iterative methods, finite elements, optimization), and applications (e.g., neutron transport theory, meteorology, image processing)
Fast parallel algorithm for three-dimensional distance-driven model in iterative computed tomography reconstruction

International Nuclear Information System (INIS)

Chen Jian-Lin; Li Lei; Wang Lin-Yuan; Cai Ai-Long; Xi Xiao-Qi; Zhang Han-Ming; Li Jian-Xin; Yan Bin

2015-01-01

The projection matrix model is used to describe the physical relationship between reconstructed object and projection. Such a model has a strong influence on projection and backprojection, two vital operations in iterative computed tomographic reconstruction. The distance-driven model (DDM) is a state-of-the-art technology that simulates forward and back projections. This model has a low computational complexity and a relatively high spatial resolution; however, it includes only a few methods in a parallel operation with a matched model scheme. This study introduces a fast and parallelizable algorithm to improve the traditional DDM for computing the parallel projection and backprojection operations. Our proposed model has been implemented on a GPU (graphic processing unit) platform and has achieved satisfactory computational efficiency with no approximation. The runtime for the projection and backprojection operations with our model is approximately 4.5 s and 10.5 s per loop, respectively, with an image size of 256×256×256 and 360 projections with a size of 512×512. We compare several general algorithms that have been proposed for maximizing GPU efficiency by using the unmatched projection/backprojection models in a parallel computation. The imaging resolution is not sacrificed and remains accurate during computed tomographic reconstruction. (paper)
The kpx, a program analyzer for parallelization

International Nuclear Information System (INIS)

Matsuyama, Yuji; Orii, Shigeo; Ota, Toshiro; Kume, Etsuo; Aikawa, Hiroshi.

1997-03-01

The kpx is a program analyzer, developed as a common technological basis for promoting parallel processing. The kpx consists of three tools. The first is ktool, which shows how much execution time is spent in program segments. The second is ptool, which shows parallelization overhead on the Paragon system. The last is xtool, which shows parallelization overhead on the VPP system. The kpx, designed to work for any FORTRAN code on any UNIX computer, is confirmed to work well after testing on Paragon, SP2, SR2201, VPP500, VPP300, Monte-4, SX-4 and T90. (author)
Speedup predictions on large scientific parallel programs

International Nuclear Information System (INIS)

Williams, E.; Bobrowicz, F.

1985-01-01

How much speedup can we expect for large scientific parallel programs running on supercomputers? For insight into this problem we extend the parallel processing environment currently existing on the Cray X-MP (a shared memory multiprocessor with at most four processors) to a simulated N-processor environment, where N is greater than or equal to 1. Several large scientific parallel programs from Los Alamos National Laboratory were run in this simulated environment, and speedups were predicted. A speedup of 14.4 on 16 processors was measured for one of the three most used codes at the Laboratory
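To put the reported figure in context, the following sketch inverts Amdahl's law to estimate the serial fraction that would produce a speedup of 14.4 on 16 processors. Using Amdahl's law is an assumption made here for illustration; the abstract does not say which model the authors used, and the function names are hypothetical.

```python
def amdahl_speedup(serial_fraction, processors):
    """Amdahl's law: speedup for a given serial fraction and processor count."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / processors)


def implied_serial_fraction(speedup, processors):
    """Invert Amdahl's law: f = (N / S - 1) / (N - 1), valid for N > 1."""
    return (processors / speedup - 1.0) / (processors - 1.0)


# Figures quoted in the abstract: speedup 14.4 on 16 processors.
f = implied_serial_fraction(14.4, 16)
print(f"implied serial fraction: {f:.4f}")        # roughly 0.0074
print(f"parallel efficiency: {14.4 / 16:.2f}")    # 0.90
```

Under this assumed model, less than one percent of the work would be serial, which is consistent with the high efficiency (90%) implied by the quoted numbers.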
Declarative Parallel Programming in Spreadsheet End-User Development

DEFF Research Database (Denmark)

Biermann, Florian

2016-01-01

Spreadsheets are first-order functional languages and are widely used in research and industry as a tool to conveniently perform all kinds of computations. Because cells on a spreadsheet are immutable, there are possibilities for implicit parallelization of spreadsheet computations. In this literature study, we provide an overview of the publications on spreadsheet end-user programming and declarative array programming to inform further research on parallel programming in spreadsheets. Our results show that there is a clear overlap between spreadsheet programming and array programming and we can directly apply results from functional array programming to a spreadsheet model of computations.
Programming parallel architectures - The BLAZE family of languages

Science.gov (United States)

Mehrotra, Piyush

1989-01-01

This paper gives an overview of the various approaches to programming multiprocessor architectures that are currently being explored. It is argued that two of these approaches, interactive programming environments and functional parallel languages, are particularly attractive, since they remove much of the burden of exploiting parallel architectures from the user. This paper also describes recent work in the design of parallel languages. Research on languages for both shared and nonshared memory multiprocessors is described.
Programming parallel architectures: The BLAZE family of languages

Science.gov (United States)

Mehrotra, Piyush

1988-01-01

Programming multiprocessor architectures is a critical research issue. An overview is given of the various approaches to programming these architectures that are currently being explored. It is argued that two of these approaches, interactive programming environments and functional parallel languages, are particularly attractive since they remove much of the burden of exploiting parallel architectures from the user. Also described is recent work by the author in the design of parallel languages. Research on languages for both shared and nonshared memory multiprocessors is described, as well as the relations of this work to other current language research projects.

Parallel adaptation of a vectorised quantumchemical program system

International Nuclear Information System (INIS)

Van Corler, L.C.H.; Van Lenthe, J.H.

1987-01-01

Supercomputers, like the CRAY 1 or the Cyber 205, have had, and still have, a marked influence on Quantum Chemistry. Vectorization has led to a considerable increase in the performance of Quantum Chemistry programs. However, clock cycle times more than a factor of 10 smaller than those of the present supercomputers are not to be expected. Therefore future supercomputers will have to depend on parallel structures. Recently, the first examples of such supercomputers have been installed. To be prepared for this new generation of (parallel) supercomputers one should consider the concepts one wants to use and the kind of problems one will encounter during implementation of existing vectorized programs on those parallel systems. The authors implemented four important parts of a large quantumchemical program system (ATMOL), i.e. integrals, SCF, 4-index and Direct-CI, in the parallel environment at ECSEC (Rome, Italy). This system offers simulated parallelism on the host computer (IBM 4381) and real parallelism on at most 10 attached processors (FPS-164). Quantumchemical programs usually handle large amounts of data and very large, often sparse matrices. The transfer of that much data can cause problems concerning communication and overhead, in view of which shared memory and shared disks must be considered. The strategy and the tools that were used to parallelise the programs are shown. Also, some examples are presented to illustrate the effectiveness and performance of the system in Rome for this type of calculation
Design and development of the ITER vacuum vessel

Energy Technology Data Exchange (ETDEWEB)

Koizumi, K.; Nakahira, M.; Itou, Y.; Tada, E. [Japan Atomic Energy Research Inst., Naka, Ibaraki (Japan); Johnson, G.; Ioki, K.; Elio, F.; Iizuka, T.; Sannazzaro, G.; Takahashi, K.; Utin, Y.; Onozuka, M. [ITER Joint Central Team (JCT), Garching (Germany); Nelson, B. [US Home Team, Oak Ridge National Laboratory (United States); Vallone, C. [EU Home Team, NET Team, Garching (Germany); Kuzmin, E. [RF Home Team, Efremov Institute, City (Russian Federation)

1998-09-01

In ITER, the vacuum vessel (VV) is designed to be a water cooled, double-walled toroidal structure made of 316LN stainless steel with a D-shaped cross section approximately 9 m wide and 15 m high. The design work which began at the beginning of the ITER-EDA is nearing completion by resolving the technical issues. In parallel with the design activities, an R and D program, the full-scale VV sector model project, was initiated in 1995 to resolve the design and fabrication issues. The full-scale sector model corresponds to an 18° sector (9° sub-sector x 2) and is being fabricated on schedule. To date, 60% of the fabrication has been completed. The fabrication of the full-scale model including sector-to-sector connection will be completed by the end of 1997 and performance tests are scheduled until the end of ITER-EDA. This paper describes the latest status of the ITER VV design and the full-scale sector model project. (orig.) 3 refs.
The BLAZE language - A parallel language for scientific programming

Science.gov (United States)

Mehrotra, Piyush; Van Rosendale, John

1987-01-01

A Pascal-like scientific programming language, BLAZE, is described. BLAZE contains array arithmetic, forall loops, and APL-style accumulation operators, which allow natural expression of fine grained parallelism. It also employs an applicative or functional procedure invocation mechanism, which makes it easy for compilers to extract coarse grained parallelism using machine specific program restructuring. Thus BLAZE should allow one to achieve highly parallel execution on multiprocessor architectures, while still providing the user with conceptually sequential control flow. A central goal in the design of BLAZE is portability across a broad range of parallel architectures. The multiple levels of parallelism present in BLAZE code, in principle, allow a compiler to extract the types of parallelism appropriate for the given architecture while neglecting the remainder. The features of BLAZE are described and it is shown how this language would be used in typical scientific programming.
The BLAZE language: A parallel language for scientific programming

Science.gov (United States)

Mehrotra, P.; Vanrosendale, J.

1985-01-01

A Pascal-like scientific programming language, Blaze, is described. Blaze contains array arithmetic, forall loops, and APL-style accumulation operators, which allow natural expression of fine grained parallelism. It also employs an applicative or functional procedure invocation mechanism, which makes it easy for compilers to extract coarse grained parallelism using machine specific program restructuring. Thus Blaze should allow one to achieve highly parallel execution on multiprocessor architectures, while still providing the user with conceptually sequential control flow. A central goal in the design of Blaze is portability across a broad range of parallel architectures. The multiple levels of parallelism present in Blaze code, in principle, allow a compiler to extract the types of parallelism appropriate for the given architecture while neglecting the remainder. The features of Blaze are described and it is shown how this language would be used in typical scientific programming.
Block-Parallel Data Analysis with DIY2

Energy Technology Data Exchange (ETDEWEB)

Morozov, Dmitriy [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Peterka, Tom [Argonne National Lab. (ANL), Argonne, IL (United States)

2017-08-30

DIY2 is a programming model and runtime for block-parallel analytics on distributed-memory machines. Its main abstraction is block-structured data parallelism: data are decomposed into blocks; blocks are assigned to processing elements (processes or threads); computation is described as iterations over these blocks, and communication between blocks is defined by reusable patterns. By expressing computation in this general form, the DIY2 runtime is free to optimize the movement of blocks between slow and fast memories (disk and flash vs. DRAM) and to concurrently execute blocks residing in memory with multiple threads. This enables the same program to execute in-core, out-of-core, serial, parallel, single-threaded, multithreaded, or combinations thereof. This paper describes the implementation of the main features of the DIY2 programming model and optimizations to improve performance. DIY2 is evaluated on benchmark test cases to establish baseline performance for several common patterns and on larger complete analysis codes running on large-scale HPC machines.
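The block-structured pattern described above can be sketched in a few lines. This is not the DIY2 API (DIY2 is a separate library with its own interface); `decompose`, `foreach_block` and `all_reduce_sum` are hypothetical names, and the sketch only shows the same per-block computation running serially or with several threads without changing the program.

```python
from concurrent.futures import ThreadPoolExecutor


def decompose(data, n_blocks):
    """Split data into n_blocks contiguous blocks of nearly equal size."""
    base, extra = divmod(len(data), n_blocks)
    blocks, start = [], 0
    for b in range(n_blocks):
        stop = start + base + (1 if b < extra else 0)
        blocks.append(data[start:stop])
        start = stop
    return blocks


def foreach_block(blocks, compute, threads=1):
    """Run compute on every block; threads=1 gives plain serial execution."""
    if threads == 1:
        return [compute(block) for block in blocks]
    with ThreadPoolExecutor(max_workers=threads) as pool:
        return list(pool.map(compute, blocks))


def all_reduce_sum(partials):
    """Stand-in for a reusable communication pattern: every block gets the total."""
    total = sum(partials)
    return [total] * len(partials)


blocks = decompose(list(range(10)), 3)
serial = foreach_block(blocks, sum, threads=1)
threaded = foreach_block(blocks, sum, threads=3)
```

Because the computation is expressed per block, the choice between serial and multithreaded execution is confined to `foreach_block`; moving blocks between disk and memory, as the abstract describes, would be a further runtime concern that this sketch does not model.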
Parallel programming practical aspects, models and current limitations

CERN Document Server

Tarkov, Mikhail S

2014-01-01

Parallel programming is designed for the use of parallel computer systems for solving time-consuming problems that cannot be solved on a sequential computer in a reasonable time. These problems can be divided into two classes: 1. Processing large data arrays (including processing images and signals in real time); 2. Simulation of complex physical processes and chemical reactions. For each of these classes, prospective methods are designed for solving problems. For data processing, one of the most promising technologies is the use of artificial neural networks. The particle-in-cell method and cellular automata are very useful for simulation. Problems of scalability of parallel algorithms and the transfer of existing parallel programs to future parallel computers are very acute now. An important task is to optimize the use of the equipment (including the CPU cache) of parallel computers. Along with parallelizing information processing, it is essential to ensure the processing reliability by the relevant organization ...
On the Automatic Parallelization of Sparse and Irregular Fortran Programs

Directory of Open Access Journals (Sweden)

Yuan Lin

1999-01-01

Automatic parallelization is usually believed to be less effective at exploiting implicit parallelism in sparse/irregular programs than in their dense/regular counterparts. However, not much is really known because there have been few research reports on this topic. In this work, we have studied the possibility of using an automatic parallelizing compiler to detect the parallelism in sparse/irregular programs. The study with a collection of sparse/irregular programs led us to some common loop patterns. Based on these patterns new techniques were derived that produced good speedups when manually applied to our benchmark codes. More importantly, these parallelization methods can be implemented in a parallelizing compiler and can be applied automatically.
Survey on present status and trend of parallel programming environments

International Nuclear Information System (INIS)

Takemiya, Hiroshi; Higuchi, Kenji; Honma, Ichiro; Ohta, Hirofumi; Kawasaki, Takuji; Imamura, Toshiyuki; Koide, Hiroshi; Akimoto, Masayuki.

1997-03-01

This report is intended to provide useful information on software tools for parallel programming through a survey of the parallel programming environments of the following six parallel computers: Fujitsu VPP300/500, NEC SX-4, Hitachi SR2201, Cray T94, IBM SP, and Intel Paragon, all of which are installed at the Japan Atomic Energy Research Institute (JAERI). The present status of R&D on parallel software, including parallel languages, compilers, debuggers, performance evaluation tools, and integrated tools, is also reported. This survey has been made as part of our project to develop basic software for a parallel programming environment, which is designed on the concept of STA (Seamless Thinking Aid to programmers). (author)
Parallelization for first principles electronic state calculation program

International Nuclear Information System (INIS)

Watanabe, Hiroshi; Oguchi, Tamio.

1997-03-01

In this report we study the parallelization of a first-principles electronic state calculation program. The target machines are the NEC SX-4 for shared-memory parallelization and the Fujitsu VPP300 for distributed-memory parallelization. The features of each parallel machine are surveyed, and the parallelization methods suitable for each are proposed. It is shown that a 1.60-fold acceleration is achieved with 2-CPU parallelization on the SX-4 and a 4.97-fold acceleration is achieved with 12-PE parallelization on the VPP300. (author)
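The two accelerations quoted in this abstract can be compared on a common footing through parallel efficiency, taken here as speedup divided by the number of processors. The following is a minimal sketch of that arithmetic only; the function name is illustrative and nothing beyond the quoted figures comes from the report.

```python
def parallel_efficiency(speedup: float, processors: int) -> float:
    """Parallel efficiency: achieved speedup divided by the processor count."""
    return speedup / processors


# Figures quoted in the abstract above.
sx4 = parallel_efficiency(1.60, 2)       # NEC SX-4, 2 CPUs
vpp300 = parallel_efficiency(4.97, 12)   # Fujitsu VPP300, 12 PEs

print(f"SX-4 efficiency:   {sx4:.2f}")
print(f"VPP300 efficiency: {vpp300:.2f}")
```

Under this definition the 2-CPU run uses its processors roughly twice as effectively as the 12-PE run, even though the latter finishes faster in absolute terms.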
Implementation of the multireference Brillouin-Wigner and Mukherjee’s coupled cluster methods with non-iterative triple excitations utilizing reference-level parallelism

Energy Technology Data Exchange (ETDEWEB)

Bhaskaran-Nair, Kiran; Brabec, Jiri; Apra, Edoardo; van Dam, Hubertus JJ; Pittner, Jiri; Kowalski, Karol

2012-09-07

In this paper we discuss the performance of the non-iterative State-Specific Multireference Coupled Cluster (SS-MRCC) methods accounting for the effect of triply excited cluster amplitudes. The corrections to the Brillouin-Wigner and Mukherjee MRCC models based on the manifold of singly and doubly excited cluster amplitudes (BW-MRCCSD and Mk-MRCCSD, respectively) are tested and compared with the exact full configuration interaction results (FCI) for small systems (H2O, N2, and Be3). For larger systems (naphthyne isomers and β-carotene), the non-iterative BW-MRCCSD(T) and Mk-MRCCSD(T) methods are compared against the results obtained with the single reference coupled cluster methods. We also report on the parallel performance of the non-iterative implementations based on the use of processor groups.
An object-oriented programming paradigm for parallelization of computational fluid dynamics

International Nuclear Information System (INIS)

Ohta, Takashi.

1997-03-01

We propose an object-oriented programming paradigm for parallelization of scientific computing programs, and show that the approach can be a very useful strategy. Generally, parallelization of scientific programs tends to be complicated and unportable due to the specific requirements of each parallel computer or compiler. In this paper, we show that an object-oriented program design, which separates the parallel processing parts from the solver of the application, can greatly improve the maintainability of the code as well as its portability. We design a program for the two-dimensional Euler equations according to the paradigm, and evaluate its parallel performance on the IBM SP2. (author)
Japanese contributions to ITER testing program of solid breeder blankets for DEMO

International Nuclear Information System (INIS)

Kuroda, Toshimasa; Yoshida, Hiroshi; Takatsu, Hideyuki; Maki, Koichi; Mori, Seiji; Kobayashi, Takeshi; Suzuki, Tatsushi; Hirata, Shingo; Miura, Hidenori.

1991-04-01

The ITER Conceptual Design Activity (CDA), which had been conducted by four parties (Japan, EC, USA and USSR) since May 1988, was completed in December 1990 with the major achievement of an international design of the integrated fusion experimental reactor. Numerous issues of physics and technology have been clarified to provide a framework for the next phase of ITER (Engineering Design Activity; EDA). Establishment of an ITER testing program, which includes technical test issues of neutronics, solid breeder blankets, liquid breeder blankets, plasma facing components, and materials, has been one of the goals of the CDA. This report describes the Japanese proposal for the testing program of DEMO/power reactor blanket development. For two concepts of solid breeder blanket (helium-cooled and water-cooled), identification of technical issues, scheduling of the test program, and conceptual design of test modules, including required test facilities such as cooling and tritium recovery systems, have been carried out as the Japanese contribution to the CDA. (author)
Planning for U.S. Fusion Community Participation in the ITER Program

International Nuclear Information System (INIS)

Baker, Charles; Berk, Herbert; Greenwald, Martin; Mauel, Michael E.; Najmabadi, Farrokh; Nevins, William M.; Stambaugh, Ronald; Synakowski, Edmund; Batchelor, Donald B.; Fonck, Raymond; Hawryluk, Richard J.; Meade, Dale M.; Neilson, George H.; Parker, Ronald; Strait, Ted

2006-01-01

A central step in the mission of the U.S. Fusion Energy Sciences program is the creation and study of a fusion-powered 'star on earth', where the same energy source that drives the sun and other stars is reproduced and controlled for sustained periods in the laboratory. This ''star'' is formed by an ionized gas, or plasma, heated to fusion temperatures in a magnetic confinement device known as a tokamak, which is the most advanced magnetic fusion concept. The ITER tokamak is designed to be the premier scientific tool for exploring and testing expectations for plasma behavior in the fusion burning plasma regime, wherein the fusion process itself provides the dominant heat source to sustain the plasma temperature. It will provide the scientific basis and control tools needed to move toward the fusion energy goal. The ITER project confronts the grand challenge of creating and understanding a burning plasma for the first time. The distinguishing characteristic of a burning plasma is the tight coupling between the fusion heating, the resulting energetic particles, and the confinement and stability properties of the plasma. Achieving this strongly coupled burning state requires resolving complex physics issues and integrating challenging technologies. A clear and comprehensive scientific understanding of the burning plasma state is needed to confidently extrapolate plasma behavior and related technology beyond ITER to a fusion power plant. Developing this predictive understanding is the overarching goal of the U.S. Fusion Energy Sciences program. The burning plasma research program in the U.S. is being organized to maximize the scientific benefits of U.S. participation in the international ITER experiment. It is expected that much of the research pursued on ITER will be based on the scientific merit of proposed activities, and it will be necessary to maintain strong fusion research capabilities in the U.S. to successfully contribute to the
Planning for U.S. Fusion Community Participation in the ITER Program

Energy Technology Data Exchange (ETDEWEB)

Baker, Charles [Sandia National Lab. (SNL-CA), Livermore, CA (United States); Berk, Herbert [Univ. of Texas, Austin, TX (United States); Greenwald, Martin [Massachusetts Inst. of Technology (MIT), Cambridge, MA (United States); Mauel, Michael E. [Columbia Univ., New York, NY (United States); Najmabadi, Farrokh [Univ. of California, San Diego, CA (United States); Nevins, William M. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Stambaugh, Ronald [General Atomics, La Jolla, CA (United States); Synakowski, Edmund [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Batchelor, Donald B. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Fonck, Raymond [Univ. of Wisconsin, Madison, WI (United States); Hawryluk, Richard J. [Princeton Plasma Physics Lab. (PPPL), Princeton, NJ (United States); Meade, Dale M. [Princeton Plasma Physics Lab. (PPPL), Princeton, NJ (United States); Neilson, George H. [Princeton Plasma Physics Lab. (PPPL), Princeton, NJ (United States); Parker, Ronald [Massachusetts Inst. of Technology (MIT), Cambridge, MA (United States); Strait, Ted [General Atomics, La Jolla, CA (United States)

2006-06-07

A central step in the mission of the U.S. Fusion Energy Sciences program is the creation and study of a fusion-powered "star on earth", where the same energy source that drives the sun and other stars is reproduced and controlled for sustained periods in the laboratory. This “star” is formed by an ionized gas, or plasma, heated to fusion temperatures in a magnetic confinement device known as a tokamak, which is the most advanced magnetic fusion concept. The ITER tokamak is designed to be the premier scientific tool for exploring and testing expectations for plasma behavior in the fusion burning plasma regime, wherein the fusion process itself provides the dominant heat source to sustain the plasma temperature. It will provide the scientific basis and control tools needed to move toward the fusion energy goal. The ITER project confronts the grand challenge of creating and understanding a burning plasma for the first time. The distinguishing characteristic of a burning plasma is the tight coupling between the fusion heating, the resulting energetic particles, and the confinement and stability properties of the plasma. Achieving this strongly coupled burning state requires resolving complex physics issues and integrating challenging technologies. A clear and comprehensive scientific understanding of the burning plasma state is needed to confidently extrapolate plasma behavior and related technology beyond ITER to a fusion power plant. Developing this predictive understanding is the overarching goal of the U.S. Fusion Energy Sciences program. The burning plasma research program in the U.S. is being organized to maximize the scientific benefits of U.S. participation in the international ITER experiment. It is expected that much of the research pursued on ITER will be based on the scientific merit of proposed activities, and it will be necessary to maintain strong fusion research capabilities in the U.S. to successfully contribute to the success of ITER and optimize
Testing New Programming Paradigms with NAS Parallel Benchmarks

Science.gov (United States)

Jin, H.; Frumkin, M.; Schultz, M.; Yan, J.

2000-01-01

Over the past decade, high performance computing has evolved rapidly, not only in hardware architectures but also with the increasing complexity of real applications. Technologies have been developed with the aim of scaling up to thousands of processors on both distributed and shared memory systems. Development of parallel programs on these computers is always a challenging task. Today, writing parallel programs with message passing (e.g. MPI) is the most popular way of achieving scalability and high performance. However, writing message passing programs is difficult and error prone. In recent years, new effort has been made in defining new parallel programming paradigms. The best examples are HPF (based on data parallelism) and OpenMP (based on shared memory parallelism). Both provide simple and clear extensions to sequential programs, thus greatly simplifying the tedious tasks encountered in writing message passing programs. HPF is independent of memory hierarchy; however, due to the immaturity of compiler technology, its performance is still questionable. Although use of parallel compiler directives is not new, OpenMP offers a portable solution in the shared-memory domain. Another important development involves the tremendous progress in the internet and its associated technology. Although still in its infancy, Java promises portability in a heterogeneous environment and offers the possibility to "compile once and run anywhere." In light of testing these new technologies, we implemented new parallel versions of the NAS Parallel Benchmarks (NPBs) with HPF and OpenMP directives, and extended the work with Java and Java-threads. The purpose of this study is to examine the effectiveness of alternative programming paradigms. NPBs consist of five kernels and three simulated applications that mimic the computation and data movement of large scale computational fluid dynamics (CFD) applications. We started with the serial version included in NPB2.3. Optimization of memory and cache usage
Automatic Parallelization Tool: Classification of Program Code for Parallel Computing

Directory of Open Access Journals (Sweden)

Mustafa Basthikodi

2016-04-01

Performance growth of single-core processors has come to a halt in the past decade, but was re-enabled by the introduction of parallelism in processors. Multicore frameworks along with Graphical Processing Units have broadly enhanced parallelism. A couple of compilers have been updated to address the emerging challenges of synchronization and threading. Appropriate program and algorithm classifications will greatly help software engineers find opportunities for effective parallelization. In the present work we investigated current species for classification of algorithms; related work on classification is discussed along with a comparison of the issues that challenge the classification. A set of algorithms is chosen whose structure matches different issues and that perform a given task. We have tested these algorithms utilizing existing automatic species extraction tools along with the Bones compiler. We have added functionalities to the existing tool, providing a more detailed characterization. The contributions of our work include support for pointer arithmetic, conditional and incremental statements, user defined types, constants and mathematical functions. With this, we can retain significant data which is not captured by the original species of algorithms. We implemented the new theories in the tool, enabling automatic characterization of program code.
ITER test programme

International Nuclear Information System (INIS)

Abdou, M.; Baker, C.; Casini, G.

1991-01-01

ITER has been designed to operate in two phases. The first phase, which lasts for 6 years, is devoted to machine checkout and physics testing. The second phase lasts for 8 years and is devoted primarily to technology testing. This report describes the technology test program development for ITER, the ancillary equipment outside the torus necessary to support the test modules, the international collaboration aspects of conducting the test program on ITER, the requirements on the machine major parameters and the R&D program required to develop the test modules for testing in ITER. 15 refs, figs and tabs
Value Iteration Adaptive Dynamic Programming for Optimal Control of Discrete-Time Nonlinear Systems.

Science.gov (United States)

Wei, Qinglai; Liu, Derong; Lin, Hanquan

2016-03-01

In this paper, a value iteration adaptive dynamic programming (ADP) algorithm is developed to solve infinite horizon undiscounted optimal control problems for discrete-time nonlinear systems. The present value iteration ADP algorithm permits an arbitrary positive semi-definite function to initialize the algorithm. A novel convergence analysis is developed to guarantee that the iterative value function converges to the optimal performance index function. Initialized by different initial functions, it is proven that the iterative value function will be monotonically nonincreasing, monotonically nondecreasing, or nonmonotonic and will converge to the optimum. In this paper, for the first time, the admissibility properties of the iterative control laws are developed for value iteration algorithms. It is emphasized that new termination criteria are established to guarantee the effectiveness of the iterative control laws. Neural networks are used to approximate the iterative value function and compute the iterative control law, respectively, for facilitating the implementation of the iterative ADP algorithm. Finally, two simulation examples are given to illustrate the performance of the present method.
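The value iteration recursion described in this abstract can be illustrated on a toy problem. The sketch below is not the authors' ADP algorithm: it is tabular, with no neural network approximation, and the state grid, control set, dynamics and quadratic utility are all hypothetical choices made for illustration. It initializes with the zero function, which is positive semi-definite, and in that case the iterates are monotonically nondecreasing.

```python
STATES = range(-3, 4)     # hypothetical state grid
CONTROLS = (-1, 0, 1)     # hypothetical control set


def step(x, u):
    """Toy dynamics x_{k+1} = x_k + u, clipped to the state grid."""
    return max(-3, min(3, x + u))


def utility(x, u):
    """Quadratic stage cost; zero only at x = 0, u = 0."""
    return x * x + u * u


def value_iteration(v0, max_iters=100):
    """Iterate V_{i+1}(x) = min_u [U(x, u) + V_i(f(x, u))] until V stops changing."""
    history = [dict(v0)]
    for _ in range(max_iters):
        prev = history[-1]
        new = {x: min(utility(x, u) + prev[step(x, u)] for u in CONTROLS)
               for x in STATES}
        history.append(new)
        if new == prev:
            break
    return history


# Undiscounted problem, initialized with the zero function.
history = value_iteration({x: 0 for x in STATES})
V = history[-1]
print([V[x] for x in STATES])
```

Keeping every iterate in `history` makes the monotonicity easy to inspect; a different initial function can be passed as `v0` to observe nonincreasing or nonmonotonic behaviour on the same toy system.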
Iteration schemes for parallelizing models of superconductivity

Energy Technology Data Exchange (ETDEWEB)

Gray, P.A. [Michigan State Univ., East Lansing, MI (United States)

1996-12-31

The time dependent Lawrence-Doniach model, valid for high fields and high values of the Ginzburg-Landau parameter, is often used for studying vortex dynamics in layered high-T_c superconductors. When solving these equations numerically, the added degrees of complexity due to the coupling and nonlinearity of the model often warrant the use of high-performance computers for their solution. However, the interdependence between the layers can be manipulated so as to allow parallelization of the computations at an individual layer level. The reduced parallel tasks may then be solved independently using a heterogeneous cluster of networked workstations connected together with Parallel Virtual Machine (PVM) software. Here, this parallelization of the model is discussed and several computational implementations of varying degrees of parallelism are presented. Computational results are also given which contrast properties of convergence speed, stability, and consistency of these implementations. Included in these results are models involving the motion of vortices due to an applied current and pinning effects due to various material properties.
Progress and challenges of the ITER TBM Program from the IO perspective

Energy Technology Data Exchange (ETDEWEB)

Giancarli, L.M., E-mail: luciano.giancarli@iter.org [ITER Organization, Route de Vinon-sur-Verdon, CS 90 046, 13067 St Paul Lez Durance (France); Barabash, V.; Campbell, D.J.; Chiocchio, S.; Cordier, J.-J.; Dammann, A.; Dell’Orco, G.; Elbez-Uzan, J.; Fourneron, J.M.; Friconneau, J.P. [ITER Organization, Route de Vinon-sur-Verdon, CS 90 046, 13067 St Paul Lez Durance (France); Gasparotto, M. [Max-Planck-Institut für Plasmaphysik, Wendelsteinstraße 1, 17491 Greifswald (Germany); Iseli, M.; Jung, C.-Y.; Kim, B.-Y.; Lazarov, D.; Levesy, B.; Loughlin, M.; Merola, M. [ITER Organization, Route de Vinon-sur-Verdon, CS 90 046, 13067 St Paul Lez Durance (France); Nevière, J.-C. [Comex-Nucleaire, 13115 Saint Paul Lez Durance (France); Pascal, R. [ITER Organization, Route de Vinon-sur-Verdon, CS 90 046, 13067 St Paul Lez Durance (France); and others

2016-11-01

The paper describes the organization of the Test Blanket Module (TBM) program, its overall objective and schedule and the status of the technical activities within the ITER Organization-Central Team (IO-CT). The latter include the design integration of the Test Blanket Systems (TBSs) into the nuclear buildings, ensuring all interfaces with other ITER systems, the design of the common TBS components such as the TBM Frames, the Dummy TBMs, and the TBS maintenance tools and equipment in the TBM Port Cell as well as in the Hot Cell building, the design of the TBS connection pipes and the definition of the required maintenance operations and associated R&D. The paper also discusses the major challenges that the TBM Program will be facing in ITER such as the potential impact of the TBMs ferritic/martensitic structures on plasma operations, the approaches to tritium and contamination confinement, the required mitigation and recovery actions in case of accidents, and the assessment of the reliability aspects that could have an impact on ITER availability.

Progress and challenges of the ITER TBM Program from the IO perspective

International Nuclear Information System (INIS)

Giancarli, L.M.; Barabash, V.; Campbell, D.J.; Chiocchio, S.; Cordier, J.-J.; Dammann, A.; Dell’Orco, G.; Elbez-Uzan, J.; Fourneron, J.M.; Friconneau, J.P.; Gasparotto, M.; Iseli, M.; Jung, C.-Y.; Kim, B.-Y.; Lazarov, D.; Levesy, B.; Loughlin, M.; Merola, M.; Nevière, J.-C.; Pascal, R.

2016-01-01

The paper describes the organization of the Test Blanket Module (TBM) program, its overall objective and schedule and the status of the technical activities within the ITER Organization-Central Team (IO-CT). The latter include the design integration of the Test Blanket Systems (TBSs) into the nuclear buildings, ensuring all interfaces with other ITER systems, the design of the common TBS components such as the TBM Frames, the Dummy TBMs, and the TBS maintenance tools and equipment in the TBM Port Cell as well as in the Hot Cell building, the design of the TBS connection pipes and the definition of the required maintenance operations and associated R&D. The paper also discusses the major challenges that the TBM Program will be facing in ITER such as the potential impact of the TBMs ferritic/martensitic structures on plasma operations, the approaches to tritium and contamination confinement, the required mitigation and recovery actions in case of accidents, and the assessment of the reliability aspects that could have an impact on ITER availability.
Domain decomposition methods and parallel computing

International Nuclear Information System (INIS)

Meurant, G.

1991-01-01

In this paper, we show how to efficiently solve large linear systems on parallel computers. These linear systems arise from the discretization of scientific computing problems described by systems of partial differential equations. We show how to obtain a discrete finite dimensional system from the continuous problem, and the chosen conjugate gradient iterative algorithm is briefly described. Then, the different kinds of parallel architectures are reviewed and their advantages and deficiencies are emphasized. We sketch the problems found in programming the conjugate gradient method on parallel computers. For this algorithm to be efficient on parallel machines, domain decomposition techniques are introduced. We give results of numerical experiments showing that these techniques allow a good rate of convergence for the conjugate gradient algorithm as well as computational speeds in excess of a billion floating-point operations per second. (author). 5 refs., 11 figs., 2 tabs., 1 inset
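For orientation, the conjugate gradient iteration mentioned in this abstract can be sketched serially as follows. This is a plain, unpreconditioned version applied to a hypothetical 1D Poisson stencil, not the domain-decomposed method of the paper; the function names and the test problem are assumptions made for illustration. The matrix-vector product and the dot products are the steps that a parallel implementation would typically distribute across subdomains.

```python
import math


def apply_poisson_1d(v):
    """Matrix-free product with tridiag(-1, 2, -1), a 1D Poisson stencil."""
    n = len(v)
    return [2.0 * v[i]
            - (v[i - 1] if i > 0 else 0.0)
            - (v[i + 1] if i < n - 1 else 0.0)
            for i in range(n)]


def dot(a, b):
    """Euclidean inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


def conjugate_gradient(apply_a, b, tol=1e-12, max_iters=1000):
    """Solve A x = b for a symmetric positive definite operator apply_a."""
    x = [0.0] * len(b)
    r = list(b)            # residual b - A x for the initial guess x = 0
    p = list(r)            # first search direction
    rs_old = dot(r, r)
    for _ in range(max_iters):
        if math.sqrt(rs_old) < tol:
            break
        ap = apply_a(p)
        alpha = rs_old / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rs_new = dot(r, r)
        beta = rs_new / rs_old
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


n = 8
solution = conjugate_gradient(apply_poisson_1d, [1.0] * n)
print([round(value, 6) for value in solution])
```

For this right-hand side the exact solution is x_i = i(n + 1 - i)/2 for i = 1..n, which gives a simple check on the result.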
Development of massively parallel quantum chemistry program SMASH

International Nuclear Information System (INIS)

Ishimura, Kazuya

2015-01-01

A massively parallel program for quantum chemistry calculations SMASH was released under the Apache License 2.0 in September 2014. The SMASH program is written in the Fortran90/95 language with MPI and OpenMP standards for parallelization. Frequently used routines, such as one- and two-electron integral calculations, are modularized to make program developments simple. The speed-up of the B3LYP energy calculation for (C150H30)2 with the cc-pVDZ basis set (4500 basis functions) was 50,499 on 98,304 cores of the K computer
Implementation of a cell-wise block-Gauss-Seidel iterative method for SN transport on a hybrid parallel computer architecture

International Nuclear Information System (INIS)

Rosa, Massimiliano; Warsa, James S.; Perks, Michael

2011-01-01

We have implemented a cell-wise, block-Gauss-Seidel (bGS) iterative algorithm for the solution of the S_n transport equations on the Roadrunner hybrid, parallel computer architecture. A compute node of this massively parallel machine comprises AMD Opteron cores that are linked to a Cell Broadband Engine™ (Cell/B.E.). LAPACK routines have been ported to the Cell/B.E. in order to make use of its parallel Synergistic Processing Elements (SPEs). The bGS algorithm is based on the LU factorization and solution of a linear system that couples the fluxes for all S_n angles and energy groups on a mesh cell. For every cell of a mesh that has been parallel decomposed on the higher-level Opteron processors, a linear system is transferred to the Cell/B.E. and the parallel LAPACK routines are used to compute a solution, which is then transferred back to the Opteron, where the rest of the computations for the S_n transport problem take place. Compared to standard parallel machines, a hundred-fold speedup of the bGS was observed on the hybrid Roadrunner architecture. Numerical experiments with strong and weak parallel scaling demonstrate the bGS method is viable and compares favorably to full parallel sweeps (FPS) on two-dimensional, unstructured meshes when it is applied to optically thick, multi-material problems. As expected, however, it is not as efficient as FPS in optically thin problems. (author)
Acceleration of iterative tomographic reconstruction using graphics processors

International Nuclear Information System (INIS)

Belzunce, M.A.; Osorio, A.; Verrastro, C.A.

2009-01-01

Iterative algorithms for image reconstruction in 3D Positron Emission Tomography have been shown to produce images of better quality than analytical methods. However, these algorithms are computationally expensive. New Graphics Processing Units (GPUs) provide high performance at low cost, as well as programming tools that make it possible to execute parallel algorithms easily in scientific applications. In this work, we try to accelerate image reconstruction algorithms in 3D PET by using a GPU. A parallel implementation of the 3D ML-EM algorithm was developed using the Siddon algorithm as projector and back-projector. Results show that accelerations of more than one order of magnitude can be achieved while keeping similar image quality. (author)
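The ML-EM update named in this abstract multiplies each pixel by the back-projected ratio of measured to estimated counts, normalized by the pixel sensitivity. The sketch below is a serial illustration only, assuming a tiny explicit system matrix with made-up entries in place of the Siddon ray-tracing projector used in the paper; all names and numbers are hypothetical. On a GPU, the forward and back projections are the loops that would be parallelized.

```python
def forward(a, x):
    """Forward projection: expected counts in each detector bin."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]


def mlem(a, y, iterations=50):
    """Run a fixed number of ML-EM iterations for y ~ Poisson(a x)."""
    n_pixels = len(a[0])
    # Sensitivity of each pixel: column sums of the system matrix.
    sens = [sum(row[j] for row in a) for j in range(n_pixels)]
    x = [1.0] * n_pixels   # strictly positive initial image
    for _ in range(iterations):
        proj = forward(a, x)
        ratio = [yi / pi for yi, pi in zip(y, proj)]
        # Back-project the measured/estimated ratio and rescale each pixel.
        back = [sum(a[i][j] * ratio[i] for i in range(len(a)))
                for j in range(n_pixels)]
        x = [xj * bj / sj for xj, bj, sj in zip(x, back, sens)]
    return x


# Hypothetical 4-bin, 3-pixel system with noise-free data.
A = [[1.0, 0.5, 0.0],
     [0.0, 1.0, 0.5],
     [0.5, 0.0, 1.0],
     [1.0, 1.0, 1.0]]
x_true = [2.0, 1.0, 3.0]
y = forward(A, x_true)
estimate = mlem(A, y)
print([round(value, 3) for value in estimate])
```

Two properties of the update are visible here: the image stays positive, and after every iteration the total of the forward-projected counts matches the total of the measured counts.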
Parallelization for X-ray crystal structural analysis program

Energy Technology Data Exchange (ETDEWEB)

Watanabe, Hiroshi [Japan Atomic Energy Research Inst., Tokyo (Japan); Minami, Masayuki; Yamamoto, Akiji

1997-10-01

In this report we study vectorization and parallelization of an X-ray crystal structural analysis program. The target machine is the NEC SX-4, which is a distributed/shared memory type vector parallel supercomputer. X-ray crystal structural analysis is surveyed, and a new multi-dimensional discrete Fourier transform method is proposed. The new method is designed to have a very long vector length, so that it achieves 12.0 times higher performance than the original code. Besides the above-mentioned vectorization, the parallelization by micro-task functions on the SX-4 reaches 13.7 times acceleration in the multi-dimensional discrete Fourier transform part with 14 CPUs, and 3.0 times acceleration in the whole program. In total, 35.9 times acceleration over the original 1-CPU scalar version is achieved with vectorization and parallelization on the SX-4. (author)
The FORCE: A highly portable parallel programming language

Science.gov (United States)

Jordan, Harry F.; Benten, Muhammad S.; Alaghband, Gita; Jakob, Ruediger

1989-01-01

Here, it is explained why the FORCE parallel programming language is easily portable among six different shared-memory multiprocessors, and how a two-level macro preprocessor makes it possible to hide low-level machine dependencies and to build machine-independent high-level constructs on top of them. These FORCE constructs make it possible to write portable parallel programs largely independent of the number of processes and the specific shared-memory multiprocessor executing them.
The FORCE - A highly portable parallel programming language

Science.gov (United States)

Jordan, Harry F.; Benten, Muhammad S.; Alaghband, Gita; Jakob, Ruediger

1989-01-01

This paper explains why the FORCE parallel programming language is easily portable among six different shared-memory multiprocessors, and how a two-level macro preprocessor makes it possible to hide low-level machine dependencies and to build machine-independent high-level constructs on top of them. These FORCE constructs make it possible to write portable parallel programs largely independent of the number of processes and the specific shared-memory multiprocessor executing them.
Japanese perspective of fusion nuclear technology from ITER to DEMO

International Nuclear Information System (INIS)

Tanaka, Satoru; Takatsu, Hideyuki

2007-01-01

The world fusion community is now launching construction of ITER, the first nuclear-grade fusion machine in the world. In parallel with the ITER program, Broader Approach (BA) activities are to be initiated this year by the EU and Japan, mainly at the Rokkasho BA site in Japan, as complementary activities to ITER toward DEMO. The BA activities include IFMIF-EVEDA (International Fusion Materials Irradiation Facility-Engineering Validation and Engineering Design Activities) and DEMO design activities with generic technology R&D, both of which are critical to the rapid development of DEMO and commercial fusion power plants. The Atomic Energy Commission of Japan reviewed the on-going third phase fusion program and issued the results of the review, 'On the policy of Nuclear Fusion Research and Development', in November 2005. In this report, it is anticipated that ITER will be made operational in a decade and that the programmatic objective can be met in the succeeding seven or eight years. Under this condition, the report presents a roadmap toward DEMO and beyond, and the R&D items on fusion nuclear technology, indispensable for fusion energy utilization, are re-aligned. In the present paper, the Japanese view and policy on ITER and beyond are summarized mainly from the viewpoint of nuclear fusion technology, and a minimum set of R&D elements on fusion nuclear technology, essential for fusion energy utilization, is presented. (orig.)
MPI_XSTAR: MPI-based Parallelization of the XSTAR Photoionization Program

Science.gov (United States)

Danehkar, Ashkbiz; Nowak, Michael A.; Lee, Julia C.; Smith, Randall K.

2018-02-01

We describe a program for the parallel implementation of multiple runs of XSTAR, a photoionization code that is used to predict the physical properties of an ionized gas from its emission and/or absorption lines. The parallelization program, called MPI_XSTAR, has been developed and implemented in the C++ language by using the Message Passing Interface (MPI) protocol, a conventional standard of parallel computing. We have benchmarked parallel multiprocessing executions of XSTAR, using MPI_XSTAR, against a serial execution of XSTAR, in terms of the parallelization speedup and the computing resource efficiency. Our experience indicates that the parallel execution runs significantly faster than the serial execution; however, the efficiency in terms of computing resource usage decreases as the number of processors used in the parallel computation increases.
Development of massively parallel quantum chemistry program SMASH

Energy Technology Data Exchange (ETDEWEB)

Ishimura, Kazuya [Department of Theoretical and Computational Molecular Science, Institute for Molecular Science 38 Nishigo-Naka, Myodaiji, Okazaki, Aichi 444-8585 (Japan)

2015-12-31

A massively parallel program for quantum chemistry calculations SMASH was released under the Apache License 2.0 in September 2014. The SMASH program is written in the Fortran90/95 language with MPI and OpenMP standards for parallelization. Frequently used routines, such as one- and two-electron integral calculations, are modularized to make program developments simple. The speed-up of the B3LYP energy calculation for (C150H30)2 with the cc-pVDZ basis set (4500 basis functions) was 50,499 on 98,304 cores of the K computer.
Discounted semi-Markov decision processes : linear programming and policy iteration

NARCIS (Netherlands)

Wessels, J.; van Nunen, J.A.E.E.

1975-01-01

For semi-Markov decision processes with discounted rewards we derive the well known results regarding the structure of optimal strategies (nonrandomized, stationary Markov strategies) and the standard algorithms (linear programming, policy iteration). Our analysis is completely based on a primal
Discounted semi-Markov decision processes : linear programming and policy iteration

NARCIS (Netherlands)

Wessels, J.; van Nunen, J.A.E.E.

1974-01-01

For semi-Markov decision processes with discounted rewards we derive the well known results regarding the structure of optimal strategies (nonrandomized, stationary Markov strategies) and the standard algorithms (linear programming, policy iteration). Our analysis is completely based on a primal
Three dimensional Burn-up program parallelization using socket programming

International Nuclear Information System (INIS)

Haliyati R, Evi; Su'ud, Zaki

2002-01-01

A parallelization scheme was built to reduce the execution time of a physics program. In this case, a multi-computer system was built to analyze the burn-up process of a nuclear reactor. The system was designed around socket communication over TCP/IP. It consists of one computer acting as a server and the rest acting as clients, with the server holding main control over all its clients. The server also divides the reactor core geometrically into n parts in accordance with the number of clients; each computer, including the server, conducts the burn-up analysis for 1/n of the total reactor core. The burn-up analysis was conducted simultaneously by all computers, so the program execution time achieved was close to 1/n times that of a single computer. A subsequent analysis showed that, for calculating the density of atoms in a reactor of 91 cm x 91 cm x 116 cm, a parallel system of 2 computers has the highest efficiency.
Integrated Task And Data Parallel Programming: Language Design

Science.gov (United States)

Grimshaw, Andrew S.; West, Emily A.

1998-01-01

This research investigates the combination of task and data parallel language constructs within a single programming language. There are a number of applications that exhibit properties which would be well served by such an integrated language. Examples include global climate models, aircraft design problems, and multidisciplinary design optimization problems. Our approach incorporates data parallel language constructs into an existing, object oriented, task parallel language. The language will support creation and manipulation of parallel classes and objects of both types (task parallel and data parallel). Ultimately, the language will allow data parallel and task parallel classes to be used either as building blocks or managers of parallel objects of either type, thus allowing the development of single and multi-paradigm parallel applications. 1995 Research Accomplishments: In February I presented a paper at Frontiers '95 describing the design of the data parallel language subset. During the spring I wrote and defended my dissertation proposal. Since that time I have developed a runtime model for the language subset. I have begun implementing the model and hand-coding simple examples which demonstrate the language subset. I have identified an astrophysical fluid flow application which will validate the data parallel language subset. 1996 Research Agenda: Milestones for the coming year include implementing a significant portion of the data parallel language subset over the Legion system. Using simple hand-coded methods, I plan to demonstrate (1) concurrent task and data parallel objects and (2) task parallel objects managing both task and data parallel objects. My next steps will focus on constructing a compiler and implementing the fluid flow application with the language. Concurrently, I will conduct a search for a real-world application exhibiting both task and data parallelism within the same program. Additional 1995 Activities: During the fall I collaborated
User's guide of parallel program development environment (PPDE). The 2nd edition

International Nuclear Information System (INIS)

Ueno, Hirokazu; Takemiya, Hiroshi; Imamura, Toshiyuki; Koide, Hiroshi; Matsuda, Katsuyuki; Higuchi, Kenji; Hirayama, Toshio; Ohta, Hirofumi

2000-03-01

The STA basic system has been enhanced to accelerate support for parallel programming on heterogeneous parallel computers, through a series of R&D efforts on parallel processing technology. The enhancement was made by extending the functions of the PPDE, the Parallel Program Development Environment in the STA basic system. The extended PPDE provides: 1) automatic creation of a 'makefile' and a shell script file for its execution, 2) multi-tool execution, which lets tools on heterogeneous computers carry out a task on a computer with one operation, and 3) mirror composition, which reflects the editing results of a file on one computer into all related files on other computers. These additional functions will enhance the work efficiency of program development across several computers. More functions have been added to the PPDE to provide help for parallel program development. New functions were also designed to complement an HPF translator and a parallelizing support tool when working together so that a sequential program is efficiently converted to a parallel program. This report describes the use of the extended PPDE. (author)
Physics research needs for ITER

International Nuclear Information System (INIS)

Sauthoff, N.R.

1995-01-01

Design of ITER entails the application of physics design tools that have been validated against the world-wide data base of fusion research. In many cases, these tools do not yet exist and must be developed as part of the ITER physics program. ITER's considerable increases in power and size demand significant extrapolations from the current data base; in several cases, new physical effects are projected to dominate the behavior of the ITER plasma. This paper focuses on those design tools and data that have been identified by the ITER team and are not yet available; these needs serve as the basis for the ITER Physics Research Needs, which have been developed jointly by the ITER Physics Expert Groups and the ITER design team. Development of the tools and the supporting data base is an on-going activity that constitutes a significant opportunity for contributions to the ITER program by fusion research programs world-wide
A Programming Model for Massive Data Parallelism with Data Dependencies

International Nuclear Information System (INIS)

Cui, Xiaohui; Mueller, Frank; Potok, Thomas E.; Zhang, Yongpeng

2009-01-01

Accelerating processors can often be more cost and energy effective for a wide range of data-parallel computing problems than general-purpose processors. For graphics processor units (GPUs), this is particularly the case when program development is aided by environments such as NVIDIA's Compute Unified Device Architecture (CUDA), which dramatically reduces the gap between domain-specific architectures and general purpose programming. Nonetheless, general-purpose GPU (GPGPU) programming remains subject to several restrictions. Most significantly, the separation of host (CPU) and accelerator (GPU) address spaces requires explicit management of GPU memory resources, especially for massive data parallelism that well exceeds the memory capacity of GPUs. One solution to this problem is to transfer data between the GPU and host memories frequently. In this work, we investigate another approach. We run massively data-parallel applications on GPU clusters. We further propose a programming model for massive data parallelism with data dependencies for this scenario. Experience from micro benchmarks and real-world applications shows that our model provides not only ease of programming but also significant performance gains.
Toolkit for high performance Monte Carlo radiation transport and activation calculations for shielding applications in ITER

International Nuclear Information System (INIS)

Serikov, A.; Fischer, U.; Grosse, D.; Leichtle, D.; Majerle, M.

2011-01-01

The Monte Carlo (MC) method is the most suitable computational technique of radiation transport for shielding applications in fusion neutronics. This paper shares the results of the long-term experience of the fusion neutronics group at Karlsruhe Institute of Technology (KIT) in radiation shielding calculations with the MCNP5 code for the ITER fusion reactor, with emphasis on the use of several ITER project-driven computer programs developed at KIT. Two of them, McCad and R2S, seem to be the most useful in radiation shielding analyses. The McCad computer graphical tool performs automatic conversion of the underlying CAD (CATIA) data files into MCNP models, while the R2S activation interface couples the MCNP radiation transport with FISPACT activation calculations, making it possible to estimate nuclear responses such as dose rate and nuclear heating after the ITER reactor shutdown. The cell-based R2S scheme was applied in shutdown photon dose analysis for the design of the In-Vessel Viewing System (IVVS) and the Glow Discharge Cleaning (GDC) unit in ITER. The mesh-based R2S feature newly developed at KIT was successfully tested on the shutdown dose rate calculations for the upper port in the Neutral Beam (NB) cell of ITER. The merits of the McCad graphical program were broadly acknowledged by neutronic analysts, and its continuous improvement at KIT has made it stable and more convenient to run through its Graphical User Interface. Detailed 3D ITER neutronic modeling with the MCNP Monte Carlo method requires a lot of computation resources, inevitably leading to parallel calculations on clusters. Performance assessments of the MCNP5 parallel runs on the JUROPA/HPC-FF supercomputer cluster made it possible to find the optimal number of processors for ITER-type runs. (author)
Fast ℓ1-SPIRiT Compressed Sensing Parallel Imaging MRI: Scalable Parallel Implementation and Clinically Feasible Runtime

Science.gov (United States)

Murphy, Mark; Alley, Marcus; Demmel, James; Keutzer, Kurt; Vasanawala, Shreyas; Lustig, Michael

2012-01-01

We present ℓ1-SPIRiT, a simple algorithm for auto calibrating parallel imaging (acPI) and compressed sensing (CS) that permits an efficient implementation with clinically-feasible runtimes. We propose a CS objective function that minimizes cross-channel joint sparsity in the Wavelet domain. Our reconstruction minimizes this objective via iterative soft-thresholding, and integrates naturally with iterative Self-Consistent Parallel Imaging (SPIRiT). Like many iterative MRI reconstructions, ℓ1-SPIRiT’s image quality comes at a high computational cost. Excessively long runtimes are a barrier to the clinical use of any reconstruction approach, and thus we discuss our approach to efficiently parallelizing ℓ1-SPIRiT and to achieving clinically-feasible runtimes. We present parallelizations of ℓ1-SPIRiT for both multi-GPU systems and multi-core CPUs, and discuss the software optimization and parallelization decisions made in our implementation. The performance of these alternatives depends on the processor architecture, the size of the image matrix, and the number of parallel imaging channels. Fundamentally, achieving fast runtime requires the correct trade-off between cache usage and parallelization overheads. We demonstrate image quality via a case from our clinical experimentation, using a custom 3DFT Spoiled Gradient Echo (SPGR) sequence with up to 8× acceleration via poisson-disc undersampling in the two phase-encoded directions. PMID:22345529
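The soft-thresholding step mentioned in the abstract above can be illustrated in a few lines. The sketch below assumes that "joint sparsity" means shrinking each coefficient position by its l2 magnitude taken across all channels (group soft-thresholding); the abstract does not spell out the exact norm, and the function name and toy data are hypothetical rather than taken from the ℓ1-SPIRiT implementation.

```python
import math


def joint_soft_threshold(coeffs, lam):
    """Shrink each coefficient position jointly across channels.

    coeffs: list of channels, each a list of (possibly complex) coefficients.
    lam: non-negative threshold.
    """
    n_pos = len(coeffs[0])
    out = [[0.0] * n_pos for _ in coeffs]
    for k in range(n_pos):
        # Joint magnitude of position k over all channels (l2 norm, an assumption).
        norm = math.sqrt(sum(abs(ch[k]) ** 2 for ch in coeffs))
        # The scale factor is zero when the joint magnitude is below the threshold.
        scale = max(0.0, 1.0 - lam / norm) if norm > 0 else 0.0
        for c, ch in enumerate(coeffs):
            out[c][k] = scale * ch[k]
    return out


# Two channels, two coefficient positions: one strong, one weak.
two_channels = [[3.0, 0.1], [4.0, 0.1]]
shrunk = joint_soft_threshold(two_channels, lam=1.0)
```

Every coefficient position is processed independently of the others, which is the property that makes this step straightforward to distribute over GPU threads or CPU cores.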

An environment for parallel structuring of Fortran programs

International Nuclear Information System (INIS)

Sridharan, K.; McShea, M.; Denton, C.; Eventoff, B.; Browne, J.C.; Newton, P.; Ellis, M.; Grossbard, D.; Wise, T.; Clemmer, D.

1990-01-01

The paper describes and illustrates an environment for interactive support of the detection and implementation of macro-level parallelism in Fortran programs. The approach couples algorithms for dependence analysis with both innovative techniques for complexity management and capabilities for the measurement and analysis of the parallel computation structures generated through use of the environment. The resulting environment is complementary to the more common approach of seeking local parallelism by loop unrolling, either by an automatic compiler or manually. (orig.)
Development of a parallelization strategy for the VARIANT code

International Nuclear Information System (INIS)

Hanebutte, U.R.; Khalil, H.S.; Palmiotti, G.; Tatsumi, M.

1996-01-01

The VARIANT code solves the multigroup steady-state neutron diffusion and transport equation in three-dimensional Cartesian and hexagonal geometries using the variational nodal method. VARIANT consists of four major parts that must be executed sequentially: input handling, calculation of response matrices, solution algorithm (i.e. inner-outer iteration), and output of results. The objective of the parallelization effort was to reduce the overall computing time by distributing the work of the two computationally intensive (sequential) tasks, the coupling coefficient calculation and the iterative solver, equally among a group of processors. This report describes the code's calculations and gives performance results on one of the benchmark problems used to test the code. The performance analysis on the IBM SPx system shows good efficiency for well-load-balanced programs. Even for relatively small problem sizes, respectable efficiencies are seen for the SPx. An extension to achieve a higher degree of parallelism will be addressed in future work. 7 refs., 1 tab
Parallel Implicit Algorithms for CFD

Science.gov (United States)

Keyes, David E.

1998-01-01

The main goal of this project was efficient distributed parallel and workstation cluster implementations of Newton-Krylov-Schwarz (NKS) solvers for implicit Computational Fluid Dynamics (CFD). "Newton" refers to a quadratically convergent nonlinear iteration using gradient information based on the true residual, "Krylov" to an inner linear iteration that accesses the Jacobian matrix only through highly parallelizable sparse matrix-vector products, and "Schwarz" to a domain decomposition form of preconditioning the inner Krylov iterations with primarily neighbor-only exchange of data between the processors. Prior experience has established that Newton-Krylov methods are competitive solvers in the CFD context and that Krylov-Schwarz methods port well to distributed memory computers. The combination of the techniques into Newton-Krylov-Schwarz was implemented on 2D and 3D unstructured Euler codes on the parallel testbeds that used to be at LaRC and on several other parallel computers operated by other agencies or made available by the vendors. Early implementations were made directly in the Message Passing Interface (MPI) with parallel solvers we adapted from legacy NASA codes and enhanced for full NKS functionality. Later implementations were made in the framework of the PETSC library from Argonne National Laboratory, which now includes pseudo-transient continuation Newton-Krylov-Schwarz solver capability (as a result of demands we made upon PETSC during our early porting experiences). A secondary project pursued with funding from this contract was parallel implicit solvers in acoustics, specifically in the Helmholtz formulation. A 2D acoustic inverse problem has been solved in parallel within the PETSC framework.
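The "Newton" and "Krylov" parts of the description above can be sketched on a toy problem. The example below is a serial, unpreconditioned illustration and makes several assumptions that are not in the abstract: the nonlinear system is a hypothetical one-dimensional model, F(u) = A u + u^3 - b with A = tridiag(-1, 2, -1); the Jacobian-vector product is approximated by a finite difference of residuals; and conjugate gradients is used as the Krylov method because this particular Jacobian is symmetric positive definite. The Schwarz preconditioner and all CFD specifics are omitted.

```python
import math


def dot(x, y):
    return sum(a * c for a, c in zip(x, y))


def residual(u, b):
    """Hypothetical test problem: F(u) = A u + u**3 - b, A = tridiag(-1, 2, -1)."""
    n = len(u)
    out = []
    for i in range(n):
        left = u[i - 1] if i > 0 else 0.0
        right = u[i + 1] if i < n - 1 else 0.0
        out.append(2.0 * u[i] - left - right + u[i] ** 3 - b[i])
    return out


def jacobian_vector(u, v, b, f_u, h=1e-6):
    """Approximate J(u) v from residuals only, so no Jacobian matrix is formed."""
    norm_v = math.sqrt(dot(v, v))
    if norm_v == 0.0:
        return [0.0] * len(v)
    scale = h / norm_v  # perturb along v with a fixed step length h
    shifted = residual([ui + scale * vi for ui, vi in zip(u, v)], b)
    return [(s - f) / scale for s, f in zip(shifted, f_u)]


def cg_matrix_free(matvec, rhs, rel_tol=1e-8, max_iter=200):
    """Krylov inner solve (conjugate gradients) that only calls matvec."""
    x = [0.0] * len(rhs)
    r = list(rhs)
    p = list(r)
    rr = dot(r, r)
    stop = rel_tol * math.sqrt(rr)
    for _ in range(max_iter):
        if math.sqrt(rr) <= stop:
            break
        ap = matvec(p)
        alpha = rr / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rr_new = dot(r, r)
        p = [ri + (rr_new / rr) * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return x


def newton_krylov(b, tol=1e-8, max_newton=50):
    """Outer Newton iteration; each step solves J(u) s = -F(u) with the Krylov solver."""
    u = [0.0] * len(b)
    for _ in range(max_newton):
        f_u = residual(u, b)
        if math.sqrt(dot(f_u, f_u)) < tol:
            break
        step = cg_matrix_free(lambda v: jacobian_vector(u, v, b, f_u),
                              [-f for f in f_u])
        u = [ui + si for ui, si in zip(u, step)]
    return u


rhs = [1.0] * 8
solution = newton_krylov(rhs)
final_residual = residual(solution, rhs)
final_norm = math.sqrt(dot(final_residual, final_residual))
```

The only operation that touches the whole problem is the residual evaluation inside `jacobian_vector`, which is why this family of methods distributes well: each processor can evaluate the residual on its own subdomain and exchange only neighbor data.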
Compiling Scientific Programs for Scalable Parallel Systems

National Research Council Canada - National Science Library

Kennedy, Ken

2001-01-01

...). The research performed in this project included new techniques for recognizing implicit parallelism in sequential programs, a powerful and precise set-based framework for analysis and transformation...
Scalability of Parallel Scientific Applications on the Cloud

Directory of Open Access Journals (Sweden)

Satish Narayana Srirama

2011-01-01

Cloud computing, with its promise of virtually infinite resources, seems well suited to solving resource-greedy scientific computing problems. To study the effects of moving parallel scientific applications onto the cloud, we deployed several benchmark applications, such as matrix–vector operations and the NAS parallel benchmarks, and DOUG (Domain decomposition On Unstructured Grids), on the cloud. DOUG is an open source software package for parallel iterative solution of very large sparse systems of linear equations. The detailed analysis of DOUG on the cloud showed that parallel applications benefit a lot and scale reasonably on the cloud. We could also observe the limitations of the cloud and how it compares with a cluster in terms of performance. However, for efficiently running the scientific applications on the cloud infrastructure, the applications must be reduced to frameworks that can successfully exploit the cloud resources, like the MapReduce framework. Several iterative and embarrassingly parallel algorithms are reduced to the MapReduce model and their performance is measured and analyzed. The analysis showed that Hadoop MapReduce has significant problems with iterative methods, while it is well suited to embarrassingly parallel algorithms. Scientific computing often uses iterative methods to solve large problems. Thus, for scientific computing on the cloud, this paper raises the necessity for better frameworks or optimizations for MapReduce.
User's guide of parallel program development environment (PPDE). The 2nd edition

Energy Technology Data Exchange (ETDEWEB)

Ueno, Hirokazu; Takemiya, Hiroshi; Imamura, Toshiyuki; Koide, Hiroshi; Matsuda, Katsuyuki; Higuchi, Kenji; Hirayama, Toshio [Center for Promotion of Computational Science and Engineering, Japan Atomic Energy Research Institute, Tokyo (Japan); Ohta, Hirofumi [Hitachi Ltd., Tokyo (Japan)

2000-03-01

The STA basic system has been enhanced to accelerate support for parallel programming on heterogeneous parallel computers, through a series of R&D efforts on parallel processing technology. The enhancement was made by extending the functions of the PPDE, the Parallel Program Development Environment in the STA basic system. The extended PPDE provides: 1) automatic creation of a 'makefile' and a shell script file for its execution, 2) multi-tool execution, which lets tools on heterogeneous computers carry out a task on a computer with one operation, and 3) mirror composition, which reflects the editing results of a file on one computer into all related files on other computers. These additional functions will enhance the work efficiency of program development across several computers. More functions have been added to the PPDE to provide help for parallel program development. New functions were also designed to complement an HPF translator and a parallelizing support tool when working together so that a sequential program is efficiently converted to a parallel program. This report describes the use of the extended PPDE. (author)
Explorations of the implementation of a parallel IDW interpolation algorithm in a Linux cluster-based parallel GIS

Science.gov (United States)

Huang, Fang; Liu, Dingsheng; Tan, Xicheng; Wang, Jian; Chen, Yunping; He, Binbin

2011-04-01

To design and implement an open-source parallel GIS (OP-GIS) based on a Linux cluster, the parallel inverse distance weighting (IDW) interpolation algorithm has been chosen as an example to explore the working model and the principle of algorithm parallel pattern (APP), one of the parallelization patterns for OP-GIS. Based on an analysis of the serial IDW interpolation algorithm of GRASS GIS, this paper proposes and designs a specific parallel IDW interpolation algorithm, incorporating both single program, multiple data (SPMD) and master/slave (M/S) programming modes. The main steps of the parallel IDW interpolation algorithm are: (1) the master node packages the related information, and then broadcasts it to the slave nodes; (2) each node calculates its assigned data extent along one row using the serial algorithm; (3) the master node gathers the data from all nodes; and (4) iterations continue until all rows have been processed, after which the results are output. According to the experiments performed in the course of this work, the parallel IDW interpolation algorithm can attain an efficiency greater than 0.93 compared with similar algorithms, which indicates that the parallel algorithm can greatly reduce processing time and maximize speed and performance.
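The four steps listed above map naturally onto a row-wise task farm. The sketch below is a single-machine stand-in, not the OP-GIS or GRASS GIS code: a thread pool plays the role of the worker nodes, the task list plays the role of the broadcast, and `pool.map` plays the role of the gather. All function names, the power parameter of 2, and the sample data are assumptions made for illustration. In CPython, threads will not speed up this arithmetic; the point is only the structure of the decomposition.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def idw_value(x, y, samples, power=2.0):
    """Inverse-distance-weighted estimate at (x, y) from (sx, sy, value) samples."""
    num = 0.0
    den = 0.0
    for sx, sy, value in samples:
        dist = math.hypot(x - sx, y - sy)
        if dist == 0.0:
            return value  # the query point coincides with a sample
        weight = 1.0 / dist ** power
        num += weight * value
        den += weight
    return num / den


def interpolate_row(task):
    """Worker task: run the serial algorithm over every cell of one grid row."""
    row, n_cols, samples = task
    return [idw_value(float(col), float(row), samples) for col in range(n_cols)]


def parallel_idw(n_rows, n_cols, samples, workers=4):
    """Hand out one task per row, then gather the rows back in order."""
    tasks = [(row, n_cols, samples) for row in range(n_rows)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(interpolate_row, tasks))


samples = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0)]
grid = parallel_idw(n_rows=1, n_cols=3, samples=samples)
print(grid)  # [[10.0, 15.0, 20.0]]
```

Because no row depends on any other, the only communication is the initial distribution of the sample points and the final gather, which is consistent with the high efficiency reported in the abstract.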
Program For Parallel Discrete-Event Simulation

Science.gov (United States)

Beckman, Brian C.; Blume, Leo R.; Geiselman, John S.; Presley, Matthew T.; Wedel, John J., Jr.; Bellenot, Steven F.; Diloreto, Michael; Hontalas, Philip J.; Reiher, Peter L.; Weiland, Frederick P.

1991-01-01

User does not have to add any special logic to aid in synchronization. Time Warp Operating System (TWOS) computer program is special-purpose operating system designed to support parallel discrete-event simulation. Complete implementation of Time Warp mechanism. Supports only simulations and other computations designed for virtual time. Time Warp Simulator (TWSIM) subdirectory contains sequential simulation engine interface-compatible with TWOS. TWOS and TWSIM written in, and support simulations in, C programming language.
Vdebug: debugging tool for parallel scientific programs. Design report on vdebug

International Nuclear Information System (INIS)

Matsuda, Katsuyuki; Takemiya, Hiroshi

2000-02-01

We report on a debugging tool called vdebug which supports debugging work for parallel scientific simulation programs. It is difficult to debug scientific programs with an existing debugger, because existing debuggers usually show data values as characters, and the volume of data generated by the programs is too large for users to check in that form. To alleviate this, we have developed vdebug, which makes it possible to check the validity of large amounts of data by showing the data values visually. Although targets of vdebug had been restricted to sequential programs, we have made it applicable to parallel programs by implementing a function that merges and visualizes data distributed among the programs on each computer node. Now, vdebug works on seven kinds of parallel computers. In this report, we describe the design of vdebug. (author)
Canadian contribution to the European Union Home Team program for ITER

International Nuclear Information System (INIS)

Murdoch, D.K.; Blevins, J.D.; Gierszewski, P.; Matsugu, R.

1998-01-01

Canadian participation in R and D and design tasks for the ITER project is predominantly in the fuel cycle, remote handling and safety fields. These tasks are carried out in Canada by Ontario Hydro, research institutes, industry and universities. In addition, Canada provides the services of a number of specialist engineers and scientists in key positions at the three ITER work sites and in the European Home Team. The Canadian contribution, which is coordinated by the Canadian Fusion Fuels Technology Project (CFFTP), forms an integral part of the European Union Home Team program. The key components of the Canadian contribution are described. (author)
PRIM: An Efficient Preconditioning Iterative Reweighted Least Squares Method for Parallel Brain MRI Reconstruction.

Science.gov (United States)

Xu, Zheng; Wang, Sheng; Li, Yeqing; Zhu, Feiyun; Huang, Junzhou

2018-02-08

The most recent history of parallel Magnetic Resonance Imaging (pMRI) has in large part been devoted to finding ways to reduce acquisition time. While the joint total variation (JTV) regularized model has been demonstrated to be a powerful tool for increasing sampling speed in pMRI, the major bottleneck is the inefficiency of the optimization method. Whereas all present state-of-the-art optimization methods for the JTV model reach only a sublinear convergence rate, in this paper we improve on this by proposing a linearly convergent optimization method for the JTV model. The proposed method is based on the Iterative Reweighted Least Squares algorithm. Due to the complexity of the tangled JTV objective, we design a novel preconditioner to further accelerate the proposed method. Extensive experiments demonstrate the superior performance of the proposed algorithm for pMRI regarding both accuracy and efficiency compared with state-of-the-art methods.
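The general idea behind Iterative Reweighted Least Squares can be shown on a much smaller problem than the JTV model. The sketch below, which is an illustration chosen for this listing and not the PRIM algorithm or its preconditioner, minimizes a scalar l1 objective, sum(|x - v|), by repeatedly solving a weighted least-squares problem whose weights are the reciprocals of the current residual magnitudes. The function name, the `eps` safeguard, and the data are hypothetical.

```python
def irls_l1_location(values, iterations=100, eps=1e-9):
    """Minimize sum(|x - v|) over x by iteratively reweighted least squares."""
    x = sum(values) / len(values)  # least-squares starting point (the mean)
    for _ in range(iterations):
        # Weights 1/|residual| turn the l1 objective into a weighted l2 one;
        # eps guards against division by zero when x lands on a data point.
        weights = [1.0 / max(abs(x - v), eps) for v in values]
        # Closed-form minimizer of the weighted least-squares subproblem.
        x = sum(w * v for w, v in zip(weights, values)) / sum(weights)
    return x


estimate = irls_l1_location([1.0, 2.0, 3.0, 4.0, 100.0])
```

For this data the iteration settles on the median, 3.0, and is barely affected by the outlier at 100, whereas the plain least-squares answer (the mean, 22.0) is dominated by it.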
From sequential to parallel programming with patterns

CERN Document Server

CERN. Geneva

2018-01-01

To increase both performance and efficiency, our programming models need to adapt to better exploit modern processors. The classic idioms and patterns for programming such as loops, branches or recursion are the pillars of almost every code and are well known among all programmers. These patterns all have in common that they are sequential in nature. Embracing parallel programming patterns, which allow us to program for multi- and many-core hardware in a natural way, greatly simplifies the task of designing a program that scales and performs on modern hardware, independently of the programming language used, and in a generic way.
On program restructuring, scheduling, and communication for parallel processor systems

Energy Technology Data Exchange (ETDEWEB)

Polychronopoulos, Constantine D. [Univ. of Illinois, Urbana, IL (United States)

1986-08-01

This dissertation discusses several software and hardware aspects of program execution on large-scale, high-performance parallel processor systems. The issues covered are program restructuring, partitioning, scheduling and interprocessor communication, synchronization, and hardware design issues of specialized units. All this work was performed focusing on a single goal: to maximize program speedup, or equivalently, to minimize parallel execution time. Parafrase, a Fortran restructuring compiler, was used to transform programs into a parallel form and conduct experiments. Two new program restructuring techniques are presented, loop coalescing and subscript blocking. Compile-time and run-time scheduling schemes are covered extensively. Depending on the program construct, these algorithms generate optimal or near-optimal schedules. For the case of arbitrarily nested hybrid loops, two optimal scheduling algorithms for dynamic and static scheduling are presented. Simulation results are given for a new dynamic scheduling algorithm. The performance of this algorithm is compared to that of self-scheduling. Techniques for program partitioning and minimization of interprocessor communication for idealized program models and for real Fortran programs are also discussed. The close relationship between scheduling, interprocessor communication, and synchronization becomes apparent at several points in this work. Finally, the impact of various types of overhead on program speedup and experimental results are presented.
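Loop coalescing, as the term is generally understood, rewrites a nest of loops as a single loop over the combined iteration space and recovers the original indices arithmetically. The sketch below shows that general idea on a two-level nest; it is not output of Parafrase, and the function names and loop body are made up for illustration.

```python
def nested_sum(n_rows, n_cols):
    """Original doubly nested loop."""
    total = 0
    for i in range(n_rows):
        for j in range(n_cols):
            total += (i + 1) * (j + 2)
    return total


def coalesced_sum(n_rows, n_cols):
    """Same iteration space walked by a single index k."""
    total = 0
    for k in range(n_rows * n_cols):
        i, j = divmod(k, n_cols)  # recover the original loop indices
        total += (i + 1) * (j + 2)
    return total


print(nested_sum(3, 4), coalesced_sum(3, 4))  # 84 84
```

With a single index, the n_rows * n_cols iterations can be split into equal contiguous chunks per processor, which a scheduler can balance more evenly than it could by distributing only the outer loop.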
Existence test for asynchronous interval iterations

DEFF Research Database (Denmark)

Madsen, Kaj; Caprani, O.; Stauning, Ole

1997-01-01

In the search for regions that contain fixed points of a real function of several variables, tests based on interval calculations can be used to establish existence or non-existence of fixed points in regions that are examined in the course of the search. The search can e.g. be performed...... as a synchronous (sequential) interval iteration: In each iteration step all components of the iterate are calculated based on the previous iterate. In this case it is straightforward to base simple interval existence and non-existence tests on the calculations done in each step of the iteration. The search can also...... on the componentwise calculations done in the course of the iteration. These componentwise tests are useful for parallel implementation of the search, since the tests can then be performed local to each processor and only when a test is successful does a processor communicate this result to other processors....
Characterizing and Mitigating Work Time Inflation in Task Parallel Programs

Directory of Open Access Journals (Sweden)

Stephen L. Olivier

2013-01-01

Task parallelism raises the level of abstraction in shared memory parallel programming to simplify the development of complex applications. However, task parallel applications can exhibit poor performance due to thread idleness, scheduling overheads, and work time inflation – additional time spent by threads in a multithreaded computation beyond the time required to perform the same work in a sequential computation. We identify the contributions of each factor to lost efficiency in various task parallel OpenMP applications and diagnose the causes of work time inflation in those applications. Increased data access latency can cause significant work time inflation in NUMA systems. Our locality framework for task parallel OpenMP programs mitigates this cause of work time inflation. Our extensions to the Qthreads library demonstrate that locality-aware scheduling can improve performance up to 3X compared to the Intel OpenMP task scheduler.
Parallel implementation of the PHOENIX generalized stellar atmosphere program. II. Wavelength parallelization

International Nuclear Information System (INIS)

Baron, E.; Hauschildt, Peter H.

1998-01-01

We describe an important addition to the parallel implementation of our generalized nonlocal thermodynamic equilibrium (NLTE) stellar atmosphere and radiative transfer computer program PHOENIX. In a previous paper in this series we described data and task parallel algorithms we have developed for radiative transfer, spectral line opacity, and NLTE opacity and rate calculations. These algorithms divided the work spatially or by spectral lines, that is, distributing the radial zones, individual spectral lines, or characteristic rays among different processors and employ, in addition, task parallelism for logically independent functions (such as atomic and molecular line opacities). For finite, monotonic velocity fields, the radiative transfer equation is an initial value problem in wavelength, and hence each wavelength point depends upon the previous one. However, for sophisticated NLTE models of both static and moving atmospheres needed to accurately describe, e.g., novae and supernovae, the number of wavelength points is very large (200,000 - 300,000) and hence parallelization over wavelength can lead both to considerable speedup in calculation time and the ability to make use of the aggregate memory available on massively parallel supercomputers. Here, we describe an implementation of a pipelined design for the wavelength parallelization of PHOENIX, where the necessary data from the processor working on a previous wavelength point is sent to the processor working on the succeeding wavelength point as soon as it is known. Our implementation uses a MIMD design based on a relatively small number of standard message passing interface (MPI) library calls and is fully portable between serial and parallel computers. copyright 1998 The American Astronomical Society
Parallel Volunteer Learning during Youth Programs

Science.gov (United States)

Lesmeister, Marilyn K.; Green, Jeremy; Derby, Amy; Bothum, Candi

2012-01-01

Lack of time is a hindrance for volunteers to participate in educational opportunities, yet volunteer success in an organization is tied to the orientation and education they receive. Meeting diverse educational needs of volunteers can be a challenge for program managers. Scheduling a Volunteer Learning Track for chaperones that is parallel to a…
Massive parallel electromagnetic field simulation program JEMS-FDTD design and implementation on jasmin

International Nuclear Information System (INIS)

Li Hanyu; Zhou Haijing; Dong Zhiwei; Liao Cheng; Chang Lei; Cao Xiaolin; Xiao Li

2010-01-01

A large-scale parallel electromagnetic field simulation program, JEMS-FDTD (J Electromagnetic Solver-Finite Difference Time Domain), is designed and implemented on JASMIN (J parallel Adaptive Structured Mesh applications INfrastructure). This program can simulate the propagation, radiation, and coupling of electromagnetic fields by solving the Maxwell equations explicitly on a structured mesh with the FDTD method. JEMS-FDTD is able to simulate billion-mesh-scale problems on thousands of processors. In this article, the program is verified by simulating the radiation of an electric dipole. A beam waveguide is simulated to demonstrate the capability of large scale parallel computation. A parallel performance test indicates that a high parallel efficiency is obtained. (authors)
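The explicit FDTD update referred to above can be illustrated in one dimension. The sketch below is a textbook-style leapfrog scheme in normalized units, where both update coefficients reduce to the Courant number; it is not JEMS-FDTD or JASMIN code, and the grid size, source shape, and boundary treatment are assumptions chosen to keep the example short.

```python
import math


def fdtd_1d(n_cells=200, n_steps=150, courant=0.5):
    """Leapfrog update of Ez and Hy on a staggered 1-D grid (normalized units)."""
    ez = [0.0] * n_cells
    hy = [0.0] * (n_cells - 1)  # H samples sit between neighboring E samples
    source = n_cells // 2
    for step in range(n_steps):
        # Update H from the spatial difference of E.
        for k in range(n_cells - 1):
            hy[k] += courant * (ez[k + 1] - ez[k])
        # Update interior E from the spatial difference of H;
        # the two end cells stay at zero (perfectly conducting walls).
        for k in range(1, n_cells - 1):
            ez[k] += courant * (hy[k] - hy[k - 1])
        # Soft Gaussian source added at the center cell.
        ez[source] += math.exp(-((step - 30) / 10.0) ** 2)
    return ez, hy


ez, hy = fdtd_1d()
peak = max(abs(value) for value in ez)
```

Each cell update reads only its immediate neighbors, so a structured mesh can be cut into blocks that exchange a single layer of boundary values per time step, which is the property that lets codes of this kind scale to thousands of processors.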
Development and test of the ITER conductor joints

Energy Technology Data Exchange (ETDEWEB)

Martovetsky, N., LLNL

1998-05-14

Joints for the ITER superconducting Central Solenoid should perform in a rapidly varying magnetic field with low losses and low DC resistance. This paper describes the design of the ITER joint and presents its assembly process. Two joints were built and tested at the PTF facility at MIT. Test results are presented, and losses in transverse and parallel fields and the DC performance are discussed. The developed joint demonstrates sufficient margin for baseline ITER operating scenarios.
Parallel conjugate gradient algorithms for manipulator dynamic simulation

Science.gov (United States)

Fijany, Amir; Scheld, Robert E.

1989-01-01

Parallel conjugate gradient algorithms for the computation of multibody dynamics are developed for the specialized case of a robot manipulator. For an n-dimensional positive-definite linear system, the Classical Conjugate Gradient (CCG) algorithms are guaranteed to converge in n iterations, each with a computation cost of O(n); this leads to a total computational cost of O(n^2) on a serial processor. Conjugate gradient algorithms are presented that provide greater efficiency by using a preconditioner, which reduces the number of iterations required, and by exploiting parallelism, which reduces the cost of each iteration. Two Preconditioned Conjugate Gradient (PCG) algorithms are proposed which respectively use a diagonal and a tridiagonal matrix, composed of the diagonal and tridiagonal elements of the mass matrix, as preconditioners. Parallel algorithms are developed to compute the preconditioners and their inversions in O(log_2 n) steps using n processors. A parallel algorithm is also presented which, on the same architecture, achieves the computational time of O(log_2 n) for each iteration. Simulation results for a seven degree-of-freedom manipulator are presented. Variants of the proposed algorithms are also developed which can be efficiently implemented on the Robot Mathematics Processor (RMP).
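As a rough illustration of the diagonal-preconditioner variant mentioned above, the following is a minimal serial sketch of textbook preconditioned conjugate gradients with M = diag(A). It is not the authors' parallel algorithm; the function name `pcg_diagonal` and the dense list-of-lists matrix are assumptions made for the example. The matrix-vector product and the dot products are the steps a parallel version would distribute.

```python
def pcg_diagonal(A, b, tol=1e-10, max_iter=None):
    """Solve A x = b for a symmetric positive-definite A (list of rows)
    with conjugate gradients preconditioned by the diagonal of A."""
    n = len(b)
    max_iter = max_iter or 10 * n

    def matvec(v):
        return [sum(a * x for a, x in zip(row, v)) for row in A]

    def dot(u, v):
        return sum(a * c for a, c in zip(u, v))

    inv_diag = [1.0 / A[i][i] for i in range(n)]   # M^{-1} for M = diag(A)

    x = [0.0] * n
    r = list(b)                                    # residual b - A x for x = 0
    z = [d * ri for d, ri in zip(inv_diag, r)]     # preconditioned residual
    p = list(z)
    rz = dot(r, z)
    for k in range(max_iter):
        if dot(r, r) ** 0.5 < tol:
            return x, k
        Ap = matvec(p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        z = [d * ri for d, ri in zip(inv_diag, r)]
        rz_new = dot(r, z)
        beta = rz_new / rz
        p = [zi + beta * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter
```

The function returns the approximate solution together with the number of iterations used, so the effect of the preconditioner on the iteration count can be inspected for a given matrix.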

MICADO: Parallel implementation of a 2D-1D iterative algorithm for the 3D neutron transport problem in prismatic geometries

International Nuclear Information System (INIS)

Fevotte, F.; Lathuiliere, B.

2013-01-01

The large increase in computing power over the past few years now makes it possible to consider developing 3D full-core heterogeneous deterministic neutron transport solvers for reference calculations. Among all approaches presented in the literature, the method first introduced in [1] seems very promising. It consists in iterating over resolutions of 2D and 1D MOC problems by taking advantage of prismatic geometries without introducing approximations of a low order operator such as diffusion. However, before developing a solver with all industrial options at EDF, several points needed to be clarified. In this work, we first prove the convergence of this iterative process, under some assumptions. We then present our high-performance, parallel implementation of this algorithm in the MICADO solver. Benchmarking the solver against the Takeda case shows that the 2D-1D coupling algorithm does not seem to affect the spatial convergence order of the MOC solver. As for performance issues, our study shows that even though the data distribution is suited to the 2D solver part, the efficiency of the 1D part is sufficient to ensure a good parallel efficiency of the global algorithm. After this study, the main remaining difficulty implementation-wise is about the memory requirement of a vector used for initialization. An efficient acceleration operator will also need to be developed. (authors)
Parallelized implicit propagators for the finite-difference Schrödinger equation

Science.gov (United States)

Parker, Jonathan; Taylor, K. T.

1995-08-01

We describe the application of block Gauss-Seidel and block Jacobi iterative methods to the design of implicit propagators for finite-difference models of the time-dependent Schrödinger equation. The block-wise iterative methods discussed here are mixed direct-iterative methods for solving simultaneous equations, in the sense that direct methods (e.g. LU decomposition) are used to invert certain block sub-matrices, and iterative methods are used to complete the solution. We describe parallel variants of the basic algorithm that are well suited to the medium- to coarse-grained parallelism of work-station clusters, and MIMD supercomputers, and we show that under a wide range of conditions, fine-grained parallelism of the computation can be achieved. Numerical tests are conducted on a typical one-electron atom Hamiltonian. The methods converge robustly to machine precision (15 significant figures), in some cases in as few as 6 or 7 iterations. The rate of convergence is nearly independent of the finite-difference grid-point separations.
Parallel computation for solving the tridiagonal linear system of equations

International Nuclear Information System (INIS)

Ishiguro, Misako; Harada, Hiroo; Fujii, Minoru; Fujimura, Toichiro; Nakamura, Yasuhiro; Nanba, Katsumi.

1981-09-01

Recently, applications of parallel computation for scientific calculations have increased, driven by the need for high-speed calculation of large-scale programs. At the JAERI computing center, an array processor, the FACOM 230-75 APU, has been installed to study the applicability of parallel computation to nuclear codes. We performed numerical experiments on the APU with methods for solving tridiagonal linear systems of equations, an important problem in scientific calculations. Drawing on recent papers on parallel methods, we investigate eight of them: the Gauss elimination method, Parallel Gauss method, Accelerated parallel Gauss method, Jacobi method, Recursive doubling method, Cyclic reduction method, Chebyshev iteration method, and Conjugate gradient method. The computing time and accuracy of the methods were compared on the basis of the numerical experiments. As a result, the Cyclic reduction method is found to be the best in both computing time and accuracy, and the Gauss elimination method is second. (author)
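For readers unfamiliar with the method ranked first here, the following is a minimal serial sketch of textbook odd-even cyclic reduction for a tridiagonal system a[i]*x[i-1] + b[i]*x[i] + c[i]*x[i+1] = d[i]. It is not the APU implementation from the report; the function name `cyclic_reduction` and the convention a[0] = c[n-1] = 0 are assumptions of this example. The iterations of both loops are independent of one another, which is what makes the method attractive on array processors.

```python
def cyclic_reduction(a, b, c, d):
    """Solve a tridiagonal system by recursive odd-even cyclic reduction.

    a: sub-diagonal (a[0] must be 0), b: diagonal,
    c: super-diagonal (c[-1] must be 0), d: right-hand side.
    """
    n = len(b)
    if n == 1:
        return [d[0] / b[0]]

    # Reduction: eliminate even-indexed unknowns from each odd-indexed equation.
    ra, rb, rc, rd = [], [], [], []
    for i in range(1, n, 2):              # independent iterations
        alpha = -a[i] / b[i - 1]
        new_b = b[i] + alpha * c[i - 1]
        new_d = d[i] + alpha * d[i - 1]
        new_c = 0.0
        if i + 1 < n:
            gamma = -c[i] / b[i + 1]
            new_b += gamma * a[i + 1]
            new_d += gamma * d[i + 1]
            new_c = gamma * c[i + 1]
        ra.append(alpha * a[i - 1])
        rb.append(new_b)
        rc.append(new_c)
        rd.append(new_d)

    x = [0.0] * n
    x[1::2] = cyclic_reduction(ra, rb, rc, rd)   # half-size tridiagonal system

    # Back-substitution: recover even-indexed unknowns from their neighbours.
    for i in range(0, n, 2):              # independent iterations
        left = a[i] * x[i - 1] if i > 0 else 0.0
        right = c[i] * x[i + 1] if i + 1 < n else 0.0
        x[i] = (d[i] - left - right) / b[i]
    return x
```

The sketch performs no pivoting, so it is only appropriate for systems such as diagonally dominant ones where the diagonal entries stay safely away from zero during the reduction.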
Parallel GPU implementation of iterative PCA algorithms.

Science.gov (United States)

Andrecut, M

2009-11-01

Principal component analysis (PCA) is a key statistical technique for multivariate data analysis. For large data sets, the common approach to PCA computation is based on the standard NIPALS-PCA algorithm, which unfortunately suffers from loss of orthogonality, and therefore its applicability is usually limited to the estimation of the first few components. Here we present an algorithm based on Gram-Schmidt orthogonalization (called GS-PCA), which eliminates this shortcoming of NIPALS-PCA. Also, we discuss the GPU (Graphics Processing Unit) parallel implementation of both NIPALS-PCA and GS-PCA algorithms. The numerical results show that the GPU parallel optimized versions, based on CUBLAS (NVIDIA), are substantially faster (up to 12 times) than the CPU optimized versions based on CBLAS (GNU Scientific Library).
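The role of re-orthogonalization can be sketched as follows. This is a NIPALS-style power iteration with a Gram-Schmidt step against the loadings already found; it is an illustrative CPU sketch in plain Python, not the paper's GS-PCA algorithm or its CUBLAS implementation, and the name `gs_pca` and its parameters are assumptions of this example.

```python
import math

def gs_pca(X, n_components, n_iter=200, tol=1e-12):
    """Return (loadings, scores) for the leading principal components of X
    (list of rows), re-orthogonalizing each loading against earlier ones."""
    m, n = len(X), len(X[0])
    means = [sum(row[j] for row in X) / m for j in range(n)]
    R = [[row[j] - means[j] for j in range(n)] for row in X]   # centred residual
    loadings, scores = [], []
    for _ in range(n_components):
        # start from the residual column with the largest norm
        j0 = max(range(n), key=lambda j: sum(r[j] ** 2 for r in R))
        t = [r[j0] for r in R]
        p = [0.0] * n
        for _ in range(n_iter):
            tt = sum(v * v for v in t)
            p = [sum(R[i][j] * t[i] for i in range(m)) / tt for j in range(n)]
            # Gram-Schmidt: remove components along previously found loadings
            for q in loadings:
                proj = sum(u * v for u, v in zip(p, q))
                p = [u - proj * v for u, v in zip(p, q)]
            norm = math.sqrt(sum(v * v for v in p))
            p = [v / norm for v in p]
            t_new = [sum(R[i][j] * p[j] for j in range(n)) for i in range(m)]
            change = sum((u - v) ** 2 for u, v in zip(t, t_new))
            t = t_new
            if change < tol:
                break
        loadings.append(p)
        scores.append(t)
        # deflation: subtract the rank-one contribution of this component
        R = [[R[i][j] - t[i] * p[j] for j in range(n)] for i in range(m)]
    return loadings, scores
```

Without the Gram-Schmidt step, rounding errors in the deflation can let later loadings drift away from orthogonality, which is the shortcoming the abstract attributes to plain NIPALS-PCA. The sketch assumes the data have enough rank for the requested number of components.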
ITER concept definition. V.1

International Nuclear Information System (INIS)

1989-01-01

Under the auspices of the International Atomic Energy Agency (IAEA), an agreement among the four parties representing the world's major fusion programs resulted in a program for conceptual design of the next logical step in the fusion program, the International Thermonuclear Experimental Reactor (ITER). The definition phase, which ended in November, 1989, is summarized in two reports: a brief summary is contained in the ITER Definition Phase Report (IAEA/ITER/DS/2); the extended technical summary and technical details of ITER are contained in this two-volume report. The first volume of this report contains the Introduction and Summary, and the remainder will appear in Volume II. In the Conceptual Design Activities phase, ITER has been defined as being a tokamak device. The basic performance parameters of ITER are given in Volume I of this report. In addition, the rationale for selection of this concept, the performance flexibility, technical issues, operations, safety, reliability, cost, and research and development needed to proceed with the design are discussed. Figs and tabs
Development and benchmark verification of a parallelized Monte Carlo burnup calculation program MCBMPI

International Nuclear Information System (INIS)

Yang Wankui; Liu Yaoguang; Ma Jimin; Yang Xin; Wang Guanbo

2014-01-01

MCBMPI, a parallelized burnup calculation program, was developed. The program is modularized. The neutron transport calculation module employs the parallelized MCNP5 program MCNP5MPI, and the burnup calculation module employs ORIGEN2, with an MPI parallel zone decomposition strategy. The program system consists only of MCNP5MPI and an interface subroutine. The interface subroutine performs three main functions: zone decomposition, nuclide transfer and decay, and data exchange with MCNP5MPI. The program was verified with the Pressurized Water Reactor (PWR) cell burnup benchmark. The results showed that the program can be applied to burnup calculations of multiple zones, and that the computation efficiency could be significantly improved with the development of computer hardware. (authors)
Angular parallelization of a curvilinear Sn transport theory method

International Nuclear Information System (INIS)

Haghighat, A.

1991-01-01

In this paper a parallel algorithm for angular domain decomposition (or parallelization) of an r-dependent spherical Sn transport theory method is derived. The parallel formulation is incorporated into TWOTRAN-II using the IBM Parallel Fortran compiler and implemented on an IBM 3090/400 (with four processors). The behavior of the parallel algorithm for different physical problems is studied, and it is concluded that the parallel algorithm behaves differently in the presence of a fission source as opposed to the absence of a fission source; this is attributed to the relative contributions of the source and the angular redistribution terms in the Sn algorithm. Further, the parallel performance of the algorithm is measured for various problem sizes and different combinations of angular subdomains or processors. Poor parallel efficiencies between ∼35 and 50% are achieved in situations where the relative difference of parallel to serial iterations is ∼50%. High parallel efficiencies between ∼60% and 90% are obtained in situations where the relative difference of parallel to serial iterations is <35%.
Iterative Decoding of Concatenated Codes: A Tutorial

Directory of Open Access Journals (Sweden)

Phillip A. Regalia

2005-05-01

The turbo decoding algorithm of a decade ago constituted a milestone in error-correction coding for digital communications, and has inspired extensions to generalized receiver topologies, including turbo equalization, turbo synchronization, and turbo CDMA, among others. Despite an accrued understanding of iterative decoding over the years, the “turbo principle” remains elusive to master analytically, thereby inciting interest from researchers outside the communications domain. In this spirit, we develop a tutorial presentation of iterative decoding for parallel and serial concatenated codes, in terms hopefully accessible to a broader audience. We motivate iterative decoding as a computationally tractable attempt to approach maximum-likelihood decoding, and characterize fixed points in terms of a “consensus” property between constituent decoders. We review how the decoding algorithm for both parallel and serial concatenated codes coincides with an alternating projection algorithm, which allows one to identify conditions under which the algorithm indeed converges to a maximum-likelihood solution, in terms of particular likelihood functions factoring into the product of their marginals. The presentation emphasizes a common framework applicable to both parallel and serial concatenated codes.
User's guide of parallel program development environment (PPDE). The 2nd edition

Energy Technology Data Exchange (ETDEWEB)

Ueno, Hirokazu; Takemiya, Hiroshi; Imamura, Toshiyuki; Koide, Hiroshi; Matsuda, Katsuyuki; Higuchi, Kenji; Hirayama, Toshio [Center for Promotion of Computational Science and Engineering, Japan Atomic Energy Research Institute, Tokyo (Japan)]; Ohta, Hirofumi [Hitachi Ltd., Tokyo (Japan)]

2000-03-01

The STA basic system has been enhanced to accelerate support for parallel programming on heterogeneous parallel computers, through a series of R and D on the technology of parallel processing. The enhancement has been made by extending the functions of the PPDE, the Parallel Program Development Environment in the STA basic system. The extended PPDE provides: 1) automatic creation of a 'makefile' and a shell script file for its execution, 2) multi-tool execution, which lets tools on heterogeneous computers carry out a task with a single operation on one computer, and 3) mirror composition, which reflects the editing results of a file on one computer into all related files on other computers. These additional functions will enhance the work efficiency of program development across several computers. More functions have been added to the PPDE to provide help for parallel program development. New functions were also designed to complement an HPF translator and a parallelizing support tool when working together, so that a sequential program is efficiently converted to a parallel program. This report describes the use of the extended PPDE. (author)
Compiling the parallel programming language NestStep to the CELL processor

OpenAIRE

Holm, Magnus

2010-01-01

The goal of this project is to create a source-to-source compiler which will translate NestStep code to C code. The compiler's job is to replace NestStep constructs with a series of function calls to the NestStep runtime system. NestStep is a parallel programming language extension based on the BSP model. It adds constructs for parallel programming on top of an imperative programming language. For this project, only constructs extending the C language are relevant. The output code will compil...
A language for data-parallel and task parallel programming dedicated to multi-SIMD computers. Contributions to hydrodynamic simulation with lattice gases

International Nuclear Information System (INIS)

Pic, Marc Michel

1995-01-01

Parallel programming covers task-parallelism and data-parallelism. Many problems need both parallelisms. Multi-SIMD computers allow a hierarchical approach to these parallelisms. The T++ language, based on C++, is dedicated to exploiting Multi-SIMD computers using a programming paradigm which is an extension of array-programming to task management. Our language introduces arrays of independent tasks to be executed separately (MIMD) on subsets of processors of identical behaviour (SIMD), in order to express the hierarchical inclusion of data-parallelism in task-parallelism. To manipulate tasks and data in a symmetrical way, we propose meta-operations which have the same behaviour on task arrays and on data arrays. We explain how to implement this language on our parallel computer SYMPHONIE in order to benefit from the locally-shared memory, the hardware virtualization, and the multiplicity of communication networks. We also analyse a typical application of such an architecture. Finite element schemes for fluid mechanics need powerful parallel computers and require strong floating point capabilities. Lattice gases are an alternative to such simulations. Boolean lattice gases are simple, stable, modular, and need no floating point computation, but include numerical noise. Boltzmann lattice gases offer high computational precision, but need floating point computation and are only locally stable. We propose a new scheme, called multi-bit, which keeps the advantages of each boolean model to which it is applied, with high numerical precision and reduced noise. Experiments on viscosity, physical behaviour, noise reduction and spurious invariants are shown, and implementation techniques for parallel Multi-SIMD computers are detailed. (author)
Contributions to computational stereology and parallel programming

DEFF Research Database (Denmark)

Rasmusson, Allan

rotator, even without the need for isotropic sections. To meet the need for computational power to perform image restoration of virtual tissue sections, parallel programming on GPUs has also been part of the project. This has led to a significant change in paradigm for a previously developed surgical...
Heterogeneous Multicore Parallel Programming for Graphics Processing Units

Directory of Open Access Journals (Sweden)

Francois Bodin

2009-01-01

Hybrid parallel multicore architectures based on graphics processing units (GPUs) can provide tremendous computing power. Current NVIDIA and AMD Graphics Product Group hardware displays a peak performance of hundreds of gigaflops. However, exploiting GPUs from existing applications is a difficult task that requires non-portable rewriting of the code. In this paper, we present HMPP, a Heterogeneous Multicore Parallel Programming workbench with compilers, developed by CAPS entreprise, that allows the integration of heterogeneous hardware accelerators in an unintrusive manner while preserving the legacy code.
Parallel algorithms for nuclear reactor analysis via domain decomposition method

International Nuclear Information System (INIS)

Kim, Yong Hee

1995-02-01

In this thesis, the neutron diffusion equation in reactor physics is discretized by the finite difference method and is solved on a parallel computer network which is composed of T-800 transputers. The T-800 transputer is a message-passing type MIMD (multiple instruction streams and multiple data streams) architecture. A parallel variant of the Schwarz alternating procedure for overlapping subdomains is developed with domain decomposition. The thesis provides convergence analysis and improvement of the convergence of the algorithm. The convergence of the parallel Schwarz algorithms with DN (or ND), DD, NN, and mixed pseudo-boundary conditions (a weighted combination of Dirichlet and Neumann conditions) is analyzed for both continuous and discrete models in the two-subdomain case and various underlying features are explored. The analysis shows that the convergence rate of the algorithm highly depends on the pseudo-boundary conditions and the theoretically best one is the mixed boundary conditions (MM conditions). Also it is shown that there may exist a significant discrepancy between continuous model analysis and discrete model analysis. In order to accelerate the convergence of the parallel Schwarz algorithm, relaxation in pseudo-boundary conditions is introduced and the convergence analysis of the algorithm for the two-subdomain case is carried out. The analysis shows that under-relaxation of the pseudo-boundary conditions accelerates the convergence of the parallel Schwarz algorithm if the convergence rate without relaxation is negative, and any relaxation (under or over) decelerates convergence if the convergence rate without relaxation is positive. Numerical implementation of the parallel Schwarz algorithm on an MIMD system requires multi-level iterations: two levels for fixed source problems, three levels for eigenvalue problems. Performance of the algorithm turns out to be very sensitive to the iteration strategy. In general, multi-level iterations provide good performance when
Preliminary Study on the Enhancement of Reconstruction Speed for Emission Computed Tomography Using Parallel Processing

International Nuclear Information System (INIS)

Park, Min Jae; Lee, Jae Sung; Kim, Soo Mee; Kang, Ji Yeon; Lee, Dong Soo; Park, Kwang Suk

2009-01-01

Conventional image reconstruction uses simplified physical models of projection. However, reconstruction with realistic physics, for example in 3D, takes too long to process all the data in the clinic and is not feasible on a common reconstruction machine because complex physical models require a large amount of memory. We propose a realistic distributed-memory model for fast reconstruction using parallel processing on personal computers to enable large-scale techniques. Preliminary feasibility tests on virtual machines and various performance tests on the commercial supercomputer Tachyon were performed. The expectation maximization algorithm was tested with common 2D projection and with realistic 3D lines of response. Because the processing time became slower (by up to 6 times) after a certain iteration, compiler optimization was performed to maximize the efficiency of parallelization. Parallel processing of a program on multiple computers was possible on Linux with MPICH and NFS. We verified that the differences between the parallel-processed image and the single-processed image at the same iteration were below the significant digits of a floating point number (about 6 bit). Two processors showed good parallel efficiency (a speedup of 1.96 times). The delay phenomenon was resolved by vectorization using SSE. Through this study, a realistic parallel computing system for the clinic was established that can reconstruct images, given plenty of memory, using realistic physical models that could not be simplified.
ITER physics design guidelines: 1989

International Nuclear Information System (INIS)

Uckan, N.A.

1990-01-01

The physics basis for ITER has been developed from an assessment of the results of the last twenty-five years of tokamak research and from detailed analysis of important physics issues specifically for the ITER design. This assessment has been carried out with direct participation of members of the experimental teams of each of the major tokamaks in the world fusion program through participation in ITER workshops, contributions to the ITER Physics R and D Program, and by direct contacts between the ITER team and the cognizant experimentalists. Extrapolations to the present data base, where needed, are made in the most cautious way consistent with engineering constraints and performance goals of the ITER. Where a working assumption that is insufficiently supported by the present data base had to be introduced, this is explicitly stated. While a strong emphasis has been placed on the physics credibility of the design, the guidelines also take into account that ITER should be designed to be able to take advantage of potential improvements in tokamak physics that may occur before and during the operation of ITER. (author). 33 refs
ITER Safety and Licensing

International Nuclear Information System (INIS)

Girard, J.-P.; Taylor, N.; Garin, P.; Uzan-Elbez, J.; Gulden, W.; Rodriguez-Rodrigo, L.

2006-01-01

The site for the construction of ITER was chosen in June 2005. The facility will be built in Europe, in the south of France close to Marseille. The generic safety scheme is now under revision to adapt the design to the host country regulation. Even though ITER will be an international organization, it will have to comply with the French requirements in the fields of public and occupational health and safety, nuclear safety, radiation protection, licensing, nuclear substances and environmental protection. The organization of the central team together with its partners organized in domestic agencies for the in-kind procurement of components is a key issue for the success of the experimentation. ITER is the first facility that will achieve sustained nuclear fusion. It is important both for the experimental one-of-a-kind device, ITER itself, and for the future of fusion power plants to understand well the key safety issues of this potential new source of energy production. The main safety concern is confinement of the tritium, activated dust in the vacuum vessel and activated corrosion products in the coolant of the plasma-facing components. This is achieved in the design through multiple confinement barriers to implement the defence in depth approach. It will be demonstrated in documents submitted to the French regulator that these barriers maintain their function in all postulated incident and accident conditions. The licensing process started with examination of the safety options. This step was performed by Europe during the candidature phase in 2002. In parallel to the final design, and taking into account the local regulations, the Preliminary Safety Report (RPrS) will be drafted with support of the European partner and others in the framework of ITER Task Agreements. Together with the license application, the RPrS will be forwarded to the regulatory bodies, which will launch public hearings and a safety review. Both processes must succeed in order to
Parallel programming of saccades during natural scene viewing: evidence from eye movement positions.

Science.gov (United States)

Wu, Esther X W; Gilani, Syed Omer; van Boxtel, Jeroen J A; Amihai, Ido; Chua, Fook Kee; Yen, Shih-Cheng

2013-10-24

Previous studies have shown that saccade plans during natural scene viewing can be programmed in parallel. This evidence comes mainly from temporal indicators, i.e., fixation durations and latencies. In the current study, we asked whether eye movement positions recorded during scene viewing also reflect parallel programming of saccades. As participants viewed scenes in preparation for a memory task, their inspection of the scene was suddenly disrupted by a transition to another scene. We examined whether saccades after the transition were invariably directed immediately toward the center or were contingent on saccade onset times relative to the transition. The results, which showed a dissociation in eye movement behavior between two groups of saccades after the scene transition, supported the parallel programming account. Saccades with relatively long onset times (>100 ms) after the transition were directed immediately toward the center of the scene, probably to restart scene exploration. Saccades with short onset times (programming of saccades during scene viewing. Additionally, results from the analyses of intersaccadic intervals were also consistent with the parallel programming hypothesis.
Parallel supercomputing: Advanced methods, algorithms, and software for large-scale linear and nonlinear problems

Energy Technology Data Exchange (ETDEWEB)

Carey, G.F.; Young, D.M.

1993-12-31

The program outlined here is directed to research on methods, algorithms, and software for distributed parallel supercomputers. Of particular interest are finite element methods and finite difference methods together with sparse iterative solution schemes for scientific and engineering computations of very large-scale systems. Both linear and nonlinear problems will be investigated. In the nonlinear case, applications with bifurcation to multiple solutions will be considered using continuation strategies. The parallelizable numerical methods of particular interest are a family of partitioning schemes embracing domain decomposition, element-by-element strategies, and multi-level techniques. The methods will be further developed incorporating parallel iterative solution algorithms with associated preconditioners in parallel computer software. The schemes will be implemented on distributed memory parallel architectures such as the CRAY MPP, Intel Paragon, the NCUBE3, and the Connection Machine. We will also consider other new architectures such as the Kendall-Square (KSQ) and proposed machines such as the TERA. The applications will focus on large-scale three-dimensional nonlinear flow and reservoir problems with strong convective transport contributions. These are legitimate grand challenge class computational fluid dynamics (CFD) problems of significant practical interest to DOE. The methods developed and algorithms will, however, be of wider interest.
Basic design of parallel computational program for probabilistic structural analysis

International Nuclear Information System (INIS)

Kaji, Yoshiyuki; Arai, Taketoshi; Gu, Wenwei; Nakamura, Hitoshi

1999-06-01

In our laboratory, for 'development of damage evaluation method of structural brittle materials by microscopic fracture mechanics and probabilistic theory' (nuclear computational science cross-over research), we examine computational methods related to a super parallel computation system, coupled with material strength theory based on microscopic fracture mechanics for latent cracks and a continuum structural model, to develop new structural reliability evaluation methods for ceramic structures. This technical report presents the results of a review of probabilistic structural mechanics theory, basic formulae, and parallel-computation programming methods related to the principal items in the basic design of a computational mechanics program. (author)

Basic design of parallel computational program for probabilistic structural analysis

Energy Technology Data Exchange (ETDEWEB)

Kaji, Yoshiyuki; Arai, Taketoshi [Japan Atomic Energy Research Inst., Tokai, Ibaraki (Japan). Tokai Research Establishment]; Gu, Wenwei; Nakamura, Hitoshi

1999-06-01

In our laboratory, for 'development of damage evaluation method of structural brittle materials by microscopic fracture mechanics and probabilistic theory' (nuclear computational science cross-over research), we examine computational methods related to a super parallel computation system, coupled with material strength theory based on microscopic fracture mechanics for latent cracks and a continuum structural model, to develop new structural reliability evaluation methods for ceramic structures. This technical report presents the results of a review of probabilistic structural mechanics theory, basic formulae, and parallel-computation programming methods related to the principal items in the basic design of a computational mechanics program. (author)
Adaptive control in multi-threaded iterated integration

International Nuclear Information System (INIS)

Doncker, Elise de; Yuasa, Fukuko

2013-01-01

In recent years we have developed a technique for the direct computation of Feynman loop-integrals, which are notorious for the occurrence of integrand singularities. Especially for handling singularities in the interior of the domain, we approximate the iterated integral using an adaptive algorithm in the coordinate directions. We present a novel multi-core parallelization scheme for adaptive multivariate integration, by assigning threads to the rule evaluations in the outer dimensions of the iterated integral. The method ensures a large parallel granularity as each function evaluation by itself comprises an integral over the lower dimensions, while the application of the threads is governed by the adaptive control in the outer level. We give computational results for a test set of 3- to 6-dimensional integrals, where several problems exhibit a loop integral behavior.
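The thread assignment described above can be sketched in a simplified, non-adaptive form. The example below is not the authors' method: it uses a fixed composite Simpson rule instead of adaptive control, and the names `simpson`, `iterated_integral`, `n_outer` and `workers` are assumptions of this example. What it does show is the granularity argument: each outer rule point handed to a thread is itself a complete inner integral.

```python
from concurrent.futures import ThreadPoolExecutor

def simpson(f, a, b, n=64):
    """Composite Simpson rule on [a, b] with an even number n of panels."""
    h = (b - a) / n
    s = f(a) + f(b)
    for k in range(1, n):
        s += (4 if k % 2 else 2) * f(a + k * h)
    return s * h / 3

def iterated_integral(f, ax, bx, ay, by, n_outer=64, workers=4):
    """Integrate f(x, y) over [ax, bx] x [ay, by] as an iterated integral,
    evaluating the outer rule points in a thread pool."""
    h = (bx - ax) / n_outer
    xs = [ax + k * h for k in range(n_outer + 1)]

    def inner(x):
        # one outer function evaluation = one full integral over y
        return simpson(lambda y: f(x, y), ay, by)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        vals = list(pool.map(inner, xs))

    weights = [1] + [4 if k % 2 else 2 for k in range(1, n_outer)] + [1]
    return h / 3 * sum(w * v for w, v in zip(weights, vals))
```

In CPython the global interpreter lock prevents a real speedup for pure-Python integrands, so this sketch illustrates only the decomposition. A compiled integrand or a process pool would be needed to observe parallel gains.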
Status of R&D activity for ITER ICRF power source system

International Nuclear Information System (INIS)

Mukherjee, Aparajita; Trivedi, Rajesh; Singh, Raghuraj; Rajnish, Kumar; Machchhar, Harsha; Ajesh, P.; Suthar, Gajendra; Soni, Dipal; Patel, Manoj; Mohan, Kartik; Hari, J.V.S.; Anand, Rohit; Verma, Sriprakash; Agarwal, Rohit; Jha, Akhil; Kazarian, Fabienne; Beaumont, Bertrand

2015-01-01

Highlights: • R&D program to establish high power RF technology for ITER ICRF source is described. • R&D RF source is being developed using Diacrode & Tetrode technologies. • Test rig (3 MW/3600 s/35–65 MHz) simulating plasma load is developed. - Abstract: India is in charge of the procurement of ITER Ion Cyclotron Resonance Frequency (ICRF) sources (1 prototype + 8 series units) along with auxiliary power supplies and Local Control Unit. As no single amplifier chain is able to meet the output power specifications required by ITER (2.5 MW per source at 35–65 MHz/CW/VSWR 2.0), two parallel three-stage amplifier chains with a combiner circuit on the output side are considered. This kind of RF source will be unique in terms of its stringent specifications, and building a first of its kind is always a challenge. An R&D phase has been initiated to establish the technology through single amplifier chain experimentation (1.5 MW/35–65 MHz/3600 s/VSWR 2.0) prior to prototype and series production. This paper presents the status of the R&D activity to resolve the technological challenges involved and the various infrastructures developed at the ITER-India lab to support such operation.
Status of R&D activity for ITER ICRF power source system

Energy Technology Data Exchange (ETDEWEB)

Mukherjee, Aparajita, E-mail: aparajita.mukherjee@iter-india.org [ITER-India, Institute for Plasma Research, Bhat, Gandhinagar–382428 (India); Trivedi, Rajesh; Singh, Raghuraj; Rajnish, Kumar; Machchhar, Harsha; Ajesh, P.; Suthar, Gajendra; Soni, Dipal; Patel, Manoj; Mohan, Kartik; Hari, J.V.S.; Anand, Rohit; Verma, Sriprakash; Agarwal, Rohit; Jha, Akhil [ITER-India, Institute for Plasma Research, Bhat, Gandhinagar–382428 (India); Kazarian, Fabienne; Beaumont, Bertrand [ITER Organization, CS 90 046, 13067 Saint-Paul-lez-Durance (France)

2015-10-15

Highlights: • R&D program to establish high power RF technology for ITER ICRF source is described. • R&D RF source is being developed using Diacrode & Tetrode technologies. • Test rig (3 MW/3600 s/35–65 MHz) simulating plasma load is developed. - Abstract: India is in charge of the procurement of ITER Ion Cyclotron Resonance Frequency (ICRF) sources (1 Prototype + 8 series units) along with auxiliary power supplies and Local Control Unit. As no single amplifier chain is able to meet the output power specifications required by ITER (2.5 MW per source at 35–65 MHz/CW/VSWR 2.0), two parallel three-stage amplifier chains along with a combiner circuit on the output side are considered. This kind of RF source will be unique in terms of its stringent specifications, and building a first of its kind is always a challenge. An R&D phase has been initiated for establishing the technology considering single amplifier chain experimentation (1.5 MW/35–65 MHz/3600 s/VSWR 2.0) prior to Prototype and series production. This paper presents the status of R&D activity to resolve the technological challenges involved and the various infrastructure developed at the ITER-India lab to support such operation.
Parallel iterative solvers and preconditioners using approximate hierarchical methods

Energy Technology Data Exchange (ETDEWEB)

Grama, A.; Kumar, V.; Sameh, A. [Univ. of Minnesota, Minneapolis, MN (United States)

1996-12-31

In this paper, we report results on the performance, convergence, and accuracy of a parallel GMRES solver for Boundary Element Methods. The solver uses a hierarchical approximate matrix-vector product based on a hybrid Barnes-Hut / Fast Multipole Method. We study the impact of various accuracy parameters on the convergence and show that with minimal loss in accuracy, our solver yields significant speedups. We demonstrate the excellent parallel efficiency and scalability of our solver. The combined speedups from approximation and parallelism represent an improvement of several orders of magnitude in solution time. We also develop fast and parallelizable preconditioners for this problem. We report on the performance of an inner-outer scheme and a preconditioner based on a truncated Green's function. Experimental results on a 256-processor Cray T3D are presented.
ITER shielding blanket

Energy Technology Data Exchange (ETDEWEB)

Strebkov, Yu [ENTEK, Moscow (Russian Federation); Avsjannikov, A [ENTEK, Moscow (Russian Federation); Baryshev, M [NIAT, Moscow (Russian Federation); Blinov, Yu [ENTEK, Moscow (Russian Federation); Shatalov, G [KIAE, Moscow (Russian Federation); Vasiliev, N [KIAE, Moscow (Russian Federation); Vinnikov, A [ENTEK, Moscow (Russian Federation); Chernjagin, A [DYNAMICA, Moscow (Russian Federation)

1995-03-01

A reference non-breeding blanket is under development now for the ITER Basic Performance Phase for the purpose of high reliability during the first stage of ITER operation. More severe operation modes are expected in this stage with first wall (FW) local heat loads up to 100-300 W/cm². Integration of a blanket design with protective and start limiters requires new solutions to achieve high reliability, and possible use of beryllium as a protective material leads to technologies. The rigid shielding blanket concept was developed in Russia to satisfy the above-mentioned requirements. The concept is based on a copper alloy FW, austenitic stainless steel blanket structure, water cooling. Beryllium protection is integrated in the FW design. Fabrication technology and assembly procedure are described in parallel with the equipment used. (orig.).
On the Performance of the Python Programming Language for Serial and Parallel Scientific Computations

Directory of Open Access Journals (Sweden)

Xing Cai

2005-01-01

This article addresses the performance of scientific applications that use the Python programming language. First, we investigate several techniques for improving the computational efficiency of serial Python codes. Then, we discuss the basic programming techniques in Python for parallelizing serial scientific applications. It is shown that an efficient implementation of the array-related operations is essential for achieving good parallel performance, as for the serial case. Once the array-related operations are efficiently implemented, probably using a mixed-language implementation, good serial and parallel performance become achievable. This is confirmed by a set of numerical experiments. Python is also shown to be well suited for writing high-level parallel programs.
MulticoreBSP for C : A high-performance library for shared-memory parallel programming

NARCIS (Netherlands)

Yzelman, A. N.; Bisseling, R. H.; Roose, D.; Meerbergen, K.

2014-01-01

The bulk synchronous parallel (BSP) model, as well as parallel programming interfaces based on BSP, classically target distributed-memory parallel architectures. In earlier work, Yzelman and Bisseling designed a MulticoreBSP for Java library specifically for shared-memory architectures. In the
A multithreaded parallel implementation of a dynamic programming algorithm for sequence comparison.

Science.gov (United States)

Martins, W S; Del Cuvillo, J B; Useche, F J; Theobald, K B; Gao, G R

2001-01-01

This paper discusses the issues involved in implementing a dynamic programming algorithm for biological sequence comparison on a general-purpose parallel computing platform based on a fine-grain event-driven multithreaded program execution model. Fine-grain multithreading permits efficient parallelism exploitation in this application both by taking advantage of asynchronous point-to-point synchronizations and communication with low overheads and by effectively tolerating latency through the overlapping of computation and communication. We have implemented our scheme on EARTH, a fine-grain event-driven multithreaded execution and architecture model which has been ported to a number of parallel machines with off-the-shelf processors. Our experimental results show that the dynamic programming algorithm can be efficiently implemented on EARTH systems with high performance (e.g., speedup of 90 on 120 nodes), good programmability and reasonable cost.
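The data dependencies that make this kind of dynamic programming parallelizable can be sketched with a generic edit-distance table filled by anti-diagonals. This is an assumption-laden illustration, not the EARTH implementation or the scoring scheme of the paper: the function name `edit_distance_wavefront` is hypothetical, and the loop runs serially. The point is that all cells on one anti-diagonal depend only on earlier anti-diagonals, so they could be computed concurrently.

```python
def edit_distance_wavefront(s, t):
    """Fill the edit-distance table one anti-diagonal (wavefront) at a time."""
    m, n = len(s), len(t)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i          # deleting i characters of s
    for j in range(n + 1):
        d[0][j] = j          # inserting j characters of t

    # Cells with i + j == diag form one wavefront; they are mutually independent.
    for diag in range(2, m + n + 1):
        for i in range(max(1, diag - n), min(m, diag - 1) + 1):
            j = diag - i
            cost = 0 if s[i - 1] == t[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # match or substitution
    return d[m][n]


distance = edit_distance_wavefront("kitten", "sitting")
```

In a fine-grain multithreaded setting, the inner loop over `i` is the part that would be distributed, with point-to-point synchronization replacing the implicit barrier between wavefronts.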
Totally parallel multilevel algorithms

Science.gov (United States)

Frederickson, Paul O.

1988-01-01

Four totally parallel algorithms for the solution of a sparse linear system have common characteristics which become quite apparent when they are implemented on a highly parallel hypercube such as the CM2. These four algorithms are Parallel Superconvergent Multigrid (PSMG) of Frederickson and McBryan, Robust Multigrid (RMG) of Hackbusch, the FFT based Spectral Algorithm, and Parallel Cyclic Reduction. In fact, all four can be formulated as particular cases of the same totally parallel multilevel algorithm, which are referred to as TPMA. In certain cases the spectral radius of TPMA is zero, and it is recognized to be a direct algorithm. In many other cases the spectral radius, although not zero, is small enough that a single iteration per timestep keeps the local error within the required tolerance.
Adaptive Iterative Soft-Input Soft-Output Parallel Decision-Feedback Detectors for Asynchronous Coded DS-CDMA Systems

Directory of Open Access Journals (Sweden)

Zhang Wei

2005-01-01

The optimum and many suboptimum iterative soft-input soft-output (SISO) multiuser detectors require a priori information about the multiuser system, such as the users' transmitted signature waveforms, relative delays, as well as the channel impulse response. In this paper, we employ adaptive algorithms in the SISO multiuser detector in order to avoid the need for this a priori information. First, we derive the optimum SISO parallel decision-feedback detector for asynchronous coded DS-CDMA systems. Then, we propose two adaptive versions of this SISO detector, which are based on the normalized least mean square (NLMS) and recursive least squares (RLS) algorithms. Our SISO adaptive detectors effectively exploit the a priori information of coded symbols, whose soft inputs are obtained from a bank of single-user decoders. Furthermore, we consider how to select practical finite feedforward and feedback filter lengths to obtain a good tradeoff between the performance and computational complexity of the receiver.
High performance parallelism pearls 2 multicore and many-core programming approaches

CERN Document Server

Jeffers, Jim

2015-01-01

High Performance Parallelism Pearls Volume 2 offers another set of examples that demonstrate how to leverage parallelism. Similar to Volume 1, the techniques included here explain how to use processors and coprocessors with the same programming - illustrating the most effective ways to combine Xeon Phi coprocessors with Xeon and other multicore processors. The book includes examples of successful programming efforts, drawn from across industries and domains such as biomed, genetics, finance, manufacturing, imaging, and more. Each chapter in this edited work includes detailed explanations of t
Towards Interactive Visual Exploration of Parallel Programs using a Domain-Specific Language

KAUST Repository

Klein, Tobias

2016-04-19

The use of GPUs and the massively parallel computing paradigm have become wide-spread. We describe a framework for the interactive visualization and visual analysis of the run-time behavior of massively parallel programs, especially OpenCL kernels. This facilitates understanding a program's function and structure, finding the causes of possible slowdowns, locating program bugs, and interactively exploring and visually comparing different code variants in order to improve performance and correctness. Our approach enables very specific, user-centered analysis, both in terms of the recording of the run-time behavior and the visualization itself. Instead of having to manually write instrumented code to record data, simple code annotations tell the source-to-source compiler which code instrumentation to generate automatically. The visualization part of our framework then enables the interactive analysis of kernel run-time behavior in a way that can be very specific to a particular problem or optimization goal, such as analyzing the causes of memory bank conflicts or understanding an entire parallel algorithm.
A Unique Technique to get Kaprekar Iteration in Linear Programming Problem

Science.gov (United States)

Sumathi, P.; Preethy, V.

2018-04-01

This paper explores a frivolous number popularly known as the Kaprekar constant, together with Kaprekar numbers. A large number of courses and differing classroom capacities, with differences in study periods, make the assignment between classrooms and courses complicated. The minimum number of iterations needed to reach the Kaprekar constant for four-digit numbers is obtained, along with the maximum, through linear programming techniques.
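For reference, the Kaprekar iteration itself is easy to state in code. The sketch below counts iterations by brute force rather than by the linear programming approach of the abstract; the function name `kaprekar_steps` is hypothetical, and four-digit numbers are taken with leading zeros allowed, excluding those whose digits are all equal.

```python
def kaprekar_steps(n):
    """Count Kaprekar iterations needed for a four-digit number to reach 6174."""
    if len(set(f"{n:04d}")) == 1:
        raise ValueError("numbers with four equal digits collapse to 0 and never reach 6174")
    steps = 0
    while n != 6174:
        digits = sorted(f"{n:04d}")               # ascending digits, zero-padded
        small = int("".join(digits))
        large = int("".join(reversed(digits)))
        n = large - small                         # one Kaprekar step
        steps += 1
    return steps


# Iteration counts for every admissible four-digit number.
counts = [kaprekar_steps(n) for n in range(1, 10000) if len(set(f"{n:04d}")) > 1]
fewest, most = min(counts), max(counts)
```

For example, 3524 goes to 3087, then 8352, then 6174, so `kaprekar_steps(3524)` is 3.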
SequenceL: Automated Parallel Algorithms Derived from CSP-NT Computational Laws

Science.gov (United States)

Cooke, Daniel; Rushton, Nelson

2013-01-01

With the introduction of new parallel architectures like the Cell and multicore chips from IBM, Intel, AMD, and ARM, as well as the petascale processing available for high-end computing, a larger number of programmers will need to write parallel codes. Adding the parallel control structure to the sequence, selection, and iterative control constructs increases the complexity of code development, which often results in increased development costs and decreased reliability. SequenceL is a high-level programming language; that is, a programming language that is closer to a human's way of thinking than to a machine's. Historically, high-level languages have resulted in decreased development costs and increased reliability, at the expense of performance. In recent applications at JSC and in industry, SequenceL has demonstrated the usual advantages of high-level programming in terms of low cost and high reliability. SequenceL programs, however, have run at speeds typically comparable with, and in many cases faster than, their counterparts written in C and C++ when run on single-core processors. Moreover, SequenceL is able to generate parallel executables automatically for multicore hardware, gaining parallel speedups without any extra effort from the programmer beyond what is required to write the sequential/single-core code. A SequenceL-to-C++ translator has been developed that automatically renders readable multithreaded C++ from a combination of a SequenceL program and sample data input. The SequenceL language is based on two fundamental computational laws, Consume-Simplify-Produce (CSP) and Normalize-Transpose (NT), which enable it to automate the creation of parallel algorithms from high-level code that has no annotations of parallelism whatsoever. In our anecdotal experience, SequenceL development has been in every case less costly than development of the same algorithm in sequential (that is, single-core, single process) C or C++, and an order of magnitude less
Run-Time and Compiler Support for Programming in Adaptive Parallel Environments

Directory of Open Access Journals (Sweden)

Guy Edjlali

1997-01-01

For better utilization of computing resources, it is important to consider parallel programming environments in which the number of available processors varies at run-time. In this article, we discuss run-time support for data-parallel programming in such an adaptive environment. Executing programs in an adaptive environment requires redistributing data when the number of processors changes, and also requires determining new loop bounds and communication patterns for the new set of processors. We have developed a run-time library to provide this support. We discuss how the run-time library can be used by compilers of High Performance Fortran (HPF)-like languages to generate code for an adaptive environment. We present performance results for a Navier-Stokes solver and a multigrid template run on a network of workstations and an IBM SP-2. Our experiments show that if the number of processors is not varied frequently, the cost of data redistribution is not significant compared to the time required for the actual computation. Overall, our work establishes the feasibility of compiling HPF for a network of nondedicated workstations, which are likely to be an important resource for parallel programming in the future.
ITER concept definition. V.2

International Nuclear Information System (INIS)

1989-01-01

Volume II of the two volumes describing the concept definition of the International Thermonuclear Experimental Reactor deals with the ITER concept in technical depth, and covers all areas of design of the ITER tokamak. Included are an assessment of the current database for design, scoping studies, rationale for concepts selection, performance flexibility, the ITER concept, the operations and experimental/testing program, ITER parameters and design phase schedule, and research and development specific to ITER. This latter includes a definition of specific research and development tasks, a division of tasks among members, specific milestones, required results, and schedules. Figs and tabs
Improved parallel solution techniques for the integral transport matrix method

Energy Technology Data Exchange (ETDEWEB)

Zerr, R. Joseph, E-mail: rjz116@psu.edu [Department of Mechanical and Nuclear Engineering, The Pennsylvania State University, University Park, PA (United States); Azmy, Yousry Y., E-mail: yyazmy@ncsu.edu [Department of Nuclear Engineering, North Carolina State University, Burlington Engineering Laboratories, Raleigh, NC (United States)

2011-07-01

Alternative solution strategies to the parallel block Jacobi (PBJ) method for the solution of the global problem with the integral transport matrix method operators have been designed and tested. The most straightforward improvement to the Jacobi iterative method is the Gauss-Seidel alternative. The parallel red-black Gauss-Seidel (PGS) algorithm can improve on the number of iterations and reduce work per iteration by applying an alternating red-black color-set to the sub-domains and assigning multiple sub-domains per processor. A parallel GMRES(m) method was implemented as an alternative to stationary iterations. Computational results show that the PGS method can improve on the PBJ method execution time by up to 10× when eight sub-domains per processor are used. However, compared to traditional source iterations with diffusion synthetic acceleration, it is still approximately an order of magnitude slower. The best-performing cases are optically thick because sub-domains decouple, yielding faster convergence. Further tests revealed that 64 sub-domains per processor was the best performing level of sub-domain division. An acceleration technique that improves the convergence rate would greatly improve the ITMM. The GMRES(m) method with a diagonal block preconditioner consumes approximately the same time as the PBJ solver but could be improved by an as yet undeveloped, more efficient preconditioner. (author)
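The difference between a Jacobi sweep and a red-black Gauss-Seidel sweep can be illustrated on a much simpler model problem than the transport operators of the abstract. The sketch below assumes a one-dimensional Poisson problem, -u'' = 1 on [0, 1] with zero boundary values, discretized by finite differences; points play the role of sub-domains, and all function names are hypothetical. Within one color, every update reads only values of the other color, so each half-sweep could run in parallel.

```python
def jacobi_sweep(u, rhs):
    """One Jacobi sweep: every interior point is updated from the previous iterate."""
    new = u[:]
    for i in range(1, len(u) - 1):
        new[i] = 0.5 * (u[i - 1] + u[i + 1] + rhs[i])
    return new


def red_black_sweep(u, rhs):
    """One red-black Gauss-Seidel sweep: odd points first, then even points."""
    new = u[:]
    for parity in (1, 0):
        # Points of one color only read neighbours of the other color.
        for i in range(1, len(new) - 1):
            if i % 2 == parity:
                new[i] = 0.5 * (new[i - 1] + new[i + 1] + rhs[i])
    return new


def iterations_to_converge(sweep, n=33, tol=1e-8, max_iter=100000):
    """Iterate until the largest pointwise change drops below tol."""
    h = 1.0 / (n - 1)
    rhs = [h * h] * n            # h^2 * f with f = 1
    u = [0.0] * n                # zero initial guess and zero boundary values
    for k in range(1, max_iter + 1):
        new = sweep(u, rhs)
        change = max(abs(a - b) for a, b in zip(new, u))
        u = new
        if change < tol:
            return k, u
    raise RuntimeError("no convergence within max_iter sweeps")


jacobi_iters, u_jacobi = iterations_to_converge(jacobi_sweep)
rb_iters, u_rb = iterations_to_converge(red_black_sweep)
```

On this model problem the red-black variant needs fewer sweeps than Jacobi, which mirrors the reduction in iteration count that the abstract attributes to PGS; the factor observed here does not carry over to the transport problem.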
Improved parallel solution techniques for the integral transport matrix method

International Nuclear Information System (INIS)

Zerr, R. Joseph; Azmy, Yousry Y.

2011-01-01

Alternative solution strategies to the parallel block Jacobi (PBJ) method for the solution of the global problem with the integral transport matrix method operators have been designed and tested. The most straightforward improvement to the Jacobi iterative method is the Gauss-Seidel alternative. The parallel red-black Gauss-Seidel (PGS) algorithm can improve on the number of iterations and reduce work per iteration by applying an alternating red-black color-set to the sub-domains and assigning multiple sub-domains per processor. A parallel GMRES(m) method was implemented as an alternative to stationary iterations. Computational results show that the PGS method can improve on the PBJ method execution time by up to 10× when eight sub-domains per processor are used. However, compared to traditional source iterations with diffusion synthetic acceleration, it is still approximately an order of magnitude slower. The best-performing cases are optically thick because sub-domains decouple, yielding faster convergence. Further tests revealed that 64 sub-domains per processor was the best performing level of sub-domain division. An acceleration technique that improves the convergence rate would greatly improve the ITMM. The GMRES(m) method with a diagonal block preconditioner consumes approximately the same time as the PBJ solver but could be improved by an as yet undeveloped, more efficient preconditioner. (author)
ITER management advisory committee meeting in NAKA

International Nuclear Information System (INIS)

Yoshikawa, M.

1999-01-01

The ITER Management Advisory Committee (MAC) Meeting was held on 17 December 1999 in Naka, Japan. The main topics were the ITER EDA Status, Task Status Summary and Work Program and a schedule of ITER meetings

Scientific programming on massively parallel processor CP-PACS

International Nuclear Information System (INIS)

Boku, Taisuke

1998-01-01

The massively parallel processor CP-PACS targets a range of problems in computational physics, and its architecture has been designed to support various kinds of numerical processing. This report outlines the CP-PACS and gives a programming example based on the Kernel CG benchmark in the NAS Parallel Benchmarks (NPB), version 1. It describes two major features of the CP-PACS architecture: the pseudo vector processing mechanism, and the parallel tuning of scientific and technical computations that exploits the three-dimensional hyper crossbar network. The processing units (PUs) of the CP-PACS are based on a RISC processor extended with a pseudo vector processor. Pseudo vector processing is realized as loop processing with scalar instructions. The features of the network connecting the PUs are explained. The algorithm of the NPB version 1 Kernel CG is shown. The most time-consuming part of the main loop is the matrix-vector product (matvec), and its parallel processing is explained. The CPU computation time is determined. The performance evaluation covers execution time, short vector processing on the slide-window-based pseudo vector processor, and a comparison with other parallel computers. (K.I.)
F-Nets and Software Cabling: Deriving a Formal Model and Language for Portable Parallel Programming

Science.gov (United States)

DiNucci, David C.; Saini, Subhash (Technical Monitor)

1998-01-01

Parallel programming is still based upon antiquated sequence-based definitions of the terms "algorithm" and "computation", resulting in programs which are architecture dependent and difficult to design and analyze. By focusing on obstacles inherent in existing practice, a more portable model is derived here, which is then formalized into a model called F-Nets which utilizes a combination of imperative and functional styles. This formalization suggests more general notions of algorithm and computation, as well as insights into the meaning of structured programming in a parallel setting. To illustrate how these principles can be applied, a very-high-level graphical architecture-independent parallel language, called Software Cabling, is described, with many of the features normally expected from today's computer languages (e.g. data abstraction, data parallelism, and object-based programming constructs).
Evolution of a minimal parallel programming model

International Nuclear Information System (INIS)

Lusk, Ewing; Butler, Ralph; Pieper, Steven C.

2017-01-01

Here, we take a historical approach to our presentation of self-scheduled task parallelism, a programming model with its origins in early irregular and nondeterministic computations encountered in automated theorem proving and logic programming. We show how an extremely simple task model has evolved into a system, asynchronous dynamic load balancing (ADLB), and a scalable implementation capable of supporting sophisticated applications on today’s (and tomorrow’s) largest supercomputers; and we illustrate the use of ADLB with a Green’s function Monte Carlo application, a modern, mature nuclear physics code in production use. Our lesson is that by surrendering a certain amount of generality and thus applicability, a minimal programming model (in terms of its basic concepts and the size of its application programmer interface) can achieve extreme scalability without introducing complexity.
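The self-scheduled task model described above can be sketched with a shared work pool from which workers repeatedly take tasks and to which they may add new ones. This is a minimal single-process illustration using Python threads; it is not the ADLB interface, and `run_self_scheduled` and `split_or_sum` are hypothetical names chosen for the example.

```python
import queue
import threading


def run_self_scheduled(initial_tasks, handle, workers=4):
    """Run tasks from a shared pool; handle(task) returns (value, new_tasks)."""
    tasks = queue.Queue()
    for task in initial_tasks:
        tasks.put(task)
    results = []
    lock = threading.Lock()

    def worker():
        while True:
            task = tasks.get()
            if task is None:              # sentinel: no more work will arrive
                tasks.task_done()
                return
            try:
                value, new_tasks = handle(task)
                with lock:
                    results.append(value)
                for new_task in new_tasks:
                    tasks.put(new_task)   # spawned work goes back into the pool
            finally:
                tasks.task_done()

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for thread in threads:
        thread.start()
    tasks.join()                          # returns once all tasks, including spawned ones, are done
    for _ in threads:
        tasks.put(None)
    for thread in threads:
        thread.join()
    return results


def split_or_sum(task):
    """Irregular workload: split large ranges, sum small ones directly."""
    lo, hi = task
    if hi - lo <= 4:
        return sum(range(lo, hi)), []
    mid = (lo + hi) // 2
    return 0, [(lo, mid), (mid, hi)]


total = sum(run_self_scheduled([(0, 100)], split_or_sum))
```

The interface is deliberately small: a worker only takes a task and optionally puts new ones, and no worker needs to know how many tasks exist or who will run them.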
Adaptive dynamic programming for discrete-time linear quadratic regulation based on multirate generalised policy iteration

Science.gov (United States)

Chun, Tae Yoon; Lee, Jae Young; Park, Jin Bae; Choi, Yoon Ho

2018-06-01

In this paper, we propose two multirate generalised policy iteration (GPI) algorithms applied to discrete-time linear quadratic regulation problems. The proposed algorithms are extensions of the existing GPI algorithm that consists of the approximate policy evaluation and policy improvement steps. The two proposed schemes, named heuristic dynamic programming (HDP) and dual HDP (DHP), based on multirate GPI, use multi-step estimation (M-step Bellman equation) at the approximate policy evaluation step for estimating the value function and its gradient, called the costate, respectively. Then, we show that these two methods with the same update horizon can be considered equivalent in the iteration domain. Furthermore, monotonically increasing and decreasing convergences, so-called value iteration (VI)-mode and policy iteration (PI)-mode convergences, are proved to hold for the proposed multirate GPIs. Further, general convergence properties in terms of eigenvalues are also studied. The data-driven online implementation methods for the proposed HDP and DHP are demonstrated and finally, we present the results of numerical simulations performed to verify the effectiveness of the proposed methods.
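The interplay of an M-step policy evaluation with a policy improvement step can be sketched for a scalar linear quadratic regulator. The code below is a model-based illustration under assumptions of my own, not the data-driven HDP or DHP schemes of the paper: the system is x_{k+1} = a x_k + b u_k with stage cost q x^2 + r u^2, the value function is p x^2, and `multirate_gpi` is a hypothetical name. With `m_steps=1` the update reduces to ordinary value iteration on the Riccati equation.

```python
def multirate_gpi(a, b, q, r, m_steps, iterations=60, p0=0.0):
    """Scalar generalised policy iteration with an m-step policy evaluation."""
    p = p0
    history = [p]
    for _ in range(iterations):
        # Policy improvement: greedy feedback gain u = -k x for the current value p x^2.
        k = (b * p * a) / (r + b * b * p)
        # Approximate policy evaluation: apply the Bellman operator of gain k m_steps times.
        for _ in range(m_steps):
            p = q + r * k * k + (a - b * k) ** 2 * p
        history.append(p)
    return history


# With a = b = q = r = 1 the Riccati fixed point solves p^2 - p - 1 = 0.
golden = (1 + 5 ** 0.5) / 2
vi_history = multirate_gpi(1.0, 1.0, 1.0, 1.0, m_steps=1)
m3_history = multirate_gpi(1.0, 1.0, 1.0, 1.0, m_steps=3)
```

Starting from `p0 = 0`, the `m_steps=1` sequence rises monotonically towards the fixed point, which corresponds to the VI-mode convergence named in the abstract; larger `m_steps` moves the scheme towards policy iteration.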
Parallelization of pressure equation solver for incompressible N-S equations

International Nuclear Information System (INIS)

Ichihara, Kiyoshi; Yokokawa, Mitsuo; Kaburaki, Hideo.

1996-03-01

A pressure equation solver in a code for 3-dimensional incompressible flow analysis has been parallelized by using the red-black SOR method and the PCG method on the Fujitsu VPP500, a vector parallel computer with distributed memory. To compare scalability, the solver using the red-black SOR method has also been parallelized on the Intel Paragon, a scalar parallel computer with distributed memory. The scalability of the red-black SOR method was lost on both the VPP500 and the Paragon when the number of processor elements was increased. The reason for the non-scalability on both systems is the increasing communication time between processor elements. In addition, parallelization by DO-loop division lowers the vectorizing efficiency on the VPP500. For an effective implementation on the VPP500, a large-scale problem that keeps very long vectorized DO-loops in the parallel program should be solved. The PCG method with the red-black SOR method applied to incomplete LU factorization (red-black PCG) takes more iteration steps than the normal PCG method with forward and backward substitution, in spite of the same number of floating point operations in a DO-loop of the incomplete LU factorization. The parallelized red-black PCG method has fewer merits than the parallelized red-black SOR method when the computational region has fewer grid points, because the red-black PCG method attains a low vectorization efficiency. (author)
Towards Interactive Visual Exploration of Parallel Programs using a Domain-Specific Language

KAUST Repository

Klein, Tobias; Bruckner, Stefan; Gröller, M. Eduard; Hadwiger, Markus; Rautek, Peter

2016-01-01

The use of GPUs and the massively parallel computing paradigm have become wide-spread. We describe a framework for the interactive visualization and visual analysis of the run-time behavior of massively parallel programs, especially OpenCL kernels. This facilitates understanding a program's function and structure, finding the causes of possible slowdowns, locating program bugs, and interactively exploring and visually comparing different code variants in order to improve performance and correctness. Our approach enables very specific, user-centered analysis, both in terms of the recording of the run-time behavior and the visualization itself. Instead of having to manually write instrumented code to record data, simple code annotations tell the source-to-source compiler which code instrumentation to generate automatically. The visualization part of our framework then enables the interactive analysis of kernel run-time behavior in a way that can be very specific to a particular problem or optimization goal, such as analyzing the causes of memory bank conflicts or understanding an entire parallel algorithm.
A backtracking algorithm for the stream AND-parallel execution of logic programs

Energy Technology Data Exchange (ETDEWEB)

Somogyi, Z.; Ramamohanarao, K.; Vaghani, J. (Univ. of Melbourne, Parkville (Australia))

1988-06-01

The authors present the first backtracking algorithm for stream AND-parallel logic programs. It relies on compile-time knowledge of the data flow graph of each clause to let it figure out efficiently which goals to kill or restart when a goal fails. This crucial information, which they derive from mode declarations, was not available at compile-time in any previous stream AND-parallel system. They show that modes can increase the precision of the backtracking algorithm, though their algorithm allows this precision to be traded off against overhead on a procedure-by-procedure and call-by-call basis. The modes also allow their algorithm to handle efficiently programs that manipulate partially instantiated data structures and an important class of programs with circular dependency graphs. On code that does not need backtracking, the efficiency of their algorithm approaches that of the committed-choice languages; on code that does need backtracking its overhead is comparable to that of the independent AND-parallel backtracking algorithms.
A parallel algorithm for the non-symmetric eigenvalue problem

International Nuclear Information System (INIS)

Sidani, M.M.

1991-01-01

An algorithm is presented for the solution of the non-symmetric eigenvalue problem. The algorithm is based on a divide-and-conquer procedure that provides initial approximations to the eigenpairs, which are then refined using Newton iterations. Since the smaller subproblems can be solved independently, and since Newton iterations with different initial guesses can be started simultaneously, the algorithm - unlike the standard QR method - is ideal for parallel computers. The author also reports on his investigation of deflation methods designed to obtain further eigenpairs if needed. Numerical results from implementations on a host of parallel machines (distributed and shared-memory) are presented
Spirit and prospects of ITER

Energy Technology Data Exchange (ETDEWEB)

Velikhov, E.P. [Kurchatov Institute of Atomic Energy, Moscow (Russian Federation)

2002-10-01

ITER is the unique and most straightforward way to study burning plasma science in the near future. ITER has a firm physics basis in the results from the world's tokamaks in terms of confinement, stability, heating, current drive, divertor, and energetic particle confinement to the extent required in ITER. The flexibility of ITER will allow the exploration of a broad operation space of fusion power, beta, pulse length and Q values in various operational scenarios. The success of the engineering R and D programs has demonstrated that all parties have sufficient capability to produce all the necessary equipment in agreement with the specifications of ITER. The knowledge and technologies acquired in the ITER project allow us to demonstrate the scientific and technical feasibility of a fusion reactor. It can be concluded that ITER must be constructed in the near future. (author)
Spirit and prospects of ITER

International Nuclear Information System (INIS)

Velikhov, E.P.

2002-01-01

ITER is the unique and most straightforward way to study burning plasma science in the near future. ITER has a firm physics basis in the results from the world's tokamaks in terms of confinement, stability, heating, current drive, divertor, and energetic particle confinement to the extent required in ITER. The flexibility of ITER will allow the exploration of a broad operation space of fusion power, beta, pulse length and Q values in various operational scenarios. The success of the engineering R and D programs has demonstrated that all parties have sufficient capability to produce all the necessary equipment in agreement with the specifications of ITER. The knowledge and technologies acquired in the ITER project allow us to demonstrate the scientific and technical feasibility of a fusion reactor. It can be concluded that ITER must be constructed in the near future. (author)
New iteration of decommissioning program for NPP Krsko

International Nuclear Information System (INIS)

Lokner, V.; Levanat, I.; Rapic, A.; Zeleznik, N.; Mele, I.; Jenko, T.

2004-01-01

As required by paragraph 10 of the Agreement between the governments of Slovenia and Croatia on status and other legal issues related to investment, exploitation, and decommissioning of the Krsko Nuclear Power Plant, a Decommissioning Program for Krsko NPP, including LILW and spent fuel management, was drafted. The Intergovernmental body required that the Program be an extensive revision of the existing program, as one of several iterations to be prepared before the final version. The purpose of the Program is to estimate the expenses of future decommissioning and of radioactive waste and spent fuel management for Krsko NPP. The cost estimate would be the basis for establishing a special fund in Croatia and for adjusting the annual rates for the existing decommissioning fund in Slovenia. Development of the Program was entrusted to specialized organizations in both Croatia and Slovenia, which formed the Project team as the operative body. Consulting firms from Croatia and Slovenia were involved, as well as experts from the International Atomic Energy Agency (through short visits to Zagreb and Ljubljana) for specialized fields (e.g. economic aspects of decommissioning, pre-feasibility study for a spent fuel repository in crystalline rock, etc.). The analysis was performed in several steps. The first step was to develop rational and feasible integral scenarios (strategies) for decommissioning and for LILW and spent fuel management on the basis of detailed technical analysis and within defined boundary conditions. Based on technological data, each scenario was assigned a time distribution of expenses for all main activities. In the second step, a financial analysis of the scenarios was undertaken to estimate the total discounted expense and the related annuity (19 installments to the single fund, empty in 2003) for each scenario. The third step involves additional analysis of the chosen scenarios aiming at their (technical or financial) improvements even at
Coarse-grain parallel solution of few-group neutron diffusion equations

International Nuclear Information System (INIS)

Sarsour, H.N.; Turinsky, P.J.

1991-01-01

The authors present a parallel numerical algorithm for the solution of the finite difference representation of the few-group neutron diffusion equations. The targeted architectures are shared-memory multiprocessor computers such as the Cray Y-MP and the IBM 3090/VF, where coarse granularity is important for minimizing overhead. Most past work that attempts to exploit concurrency has concentrated on the inner iterations of the standard outer-inner iterative strategy, which produces very fine granularity. To coarsen granularity, the authors introduce parallelism at the nested outer-inner level. The problem's spatial domain is partitioned into contiguous subregions, and a processor is assigned to solve each subregion independently of all other subregions (and hence of all other processors); i.e., each subregion is treated as a reactor core with imposed boundary conditions. Since the boundary conditions on interior surfaces, referred to as internal boundary conditions (IBCs), are not known, a third iterative level, the recomposition iterations, is introduced to communicate results between subregions
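The partition-and-recompose idea in this abstract can be illustrated with a minimal sketch. It is not the authors' code: it assumes a one-group, one-dimensional, fixed-source diffusion problem with zero-flux outer boundaries, and it uses a plain block-Jacobi update as a stand-in for the recomposition iterations. The function names, the grid size, and the values of `D`, `sigma`, and `source` are hypothetical choices made for illustration.

```python
def solve_tridiagonal(off, diag, rhs):
    """Thomas algorithm for off*x[i-1] + diag*x[i] + off*x[i+1] = rhs[i]."""
    n = len(rhs)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = off / diag
    d[0] = rhs[0] / diag
    for i in range(1, n):
        denom = diag - off * c[i - 1]
        c[i] = off / denom
        d[i] = (rhs[i] - off * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def direct_solve(n=40, D=1.0, sigma=50.0, source=1.0):
    """Reference solution: solve the whole domain as a single region."""
    k = D * (n + 1) ** 2          # D / h^2 with h = 1 / (n + 1)
    return solve_tridiagonal(-k, 2.0 * k + sigma, [source] * n)


def recomposition_solve(n=40, n_sub=4, D=1.0, sigma=50.0, source=1.0,
                        tol=1e-12, max_iter=10_000):
    """Solve each subregion with imposed internal boundary values, then
    exchange interface fluxes and repeat until the flux stops changing."""
    k = D * (n + 1) ** 2
    off, diag = -k, 2.0 * k + sigma
    bounds = [(s * n // n_sub, (s + 1) * n // n_sub) for s in range(n_sub)]
    flux = [0.0] * n
    for iteration in range(1, max_iter + 1):
        new = [0.0] * n
        # The subregion solves are independent of each other, so each one
        # could be handed to its own processor.
        for lo, hi in bounds:
            rhs = [source] * (hi - lo)
            if lo > 0:                      # internal boundary on the left
                rhs[0] += k * flux[lo - 1]
            if hi < n:                      # internal boundary on the right
                rhs[-1] += k * flux[hi]
            new[lo:hi] = solve_tridiagonal(off, diag, rhs)
        change = max(abs(a - b) for a, b in zip(new, flux))
        flux = new
        if change < tol:
            return flux, iteration
    raise RuntimeError("recomposition iterations did not converge")
```

Calling `recomposition_solve()` returns the flux and the number of recomposition sweeps; comparing the flux with `direct_solve()` shows whether the decomposed solve has reproduced the single-region answer. The loop over `bounds` is the coarse-grained unit of work that the abstract assigns to one processor each.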
Perl Modules for Constructing Iterators

Science.gov (United States)

Tilmes, Curt

2009-01-01

The Iterator Perl Module provides a general-purpose framework for constructing iterator objects within Perl, and a standard API for interacting with those objects. Iterators are an object-oriented design pattern where a description of a series of values is used in a constructor. Subsequent queries can request values in that series. These Perl modules build on the standard Iterator framework and provide iterators for some other types of values. Iterator::DateTime constructs iterators from DateTime objects or Date::Parse descriptions and ICal/RFC 2445 style recurrence descriptions. It supports a variety of input parameters, including a start to the sequence, an end to the sequence, an ICal/RFC 2445 recurrence describing the frequency of the values in the series, and a format description that can refine the presentation manner of the DateTime. Iterator::String constructs iterators from string representations. This module is useful in contexts where the API consists of supplying a string and getting back an iterator where the specific iteration desired is opaque to the caller. It is of particular value to the Iterator::Hash module which provides nested iterations. Iterator::Hash constructs iterators from Perl hashes that can include multiple iterators. The constructed iterators will return all the permutations of the iterations of the hash by nested iteration of embedded iterators. A hash simply includes a set of keys mapped to values. It is a very common data structure used throughout Perl programming. The Iterator::Hash module allows a hash to include strings defining iterators (parsed and dispatched with Iterator::String) that are used to construct an overall series of hash values.
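The nested iteration described for Iterator::Hash can be sketched as follows. This is a Python analogue written for illustration only; it is not the Perl module's API, and the function name `iterate_hash` and the sample keys are hypothetical. It reads "all the permutations" as the Cartesian product of the embedded series, which is an assumption about the abstract's wording.

```python
from itertools import product


def iterate_hash(spec):
    """Yield one dict per combination of the values in `spec`.

    Values that are lists, tuples, or ranges are treated as embedded
    iterators; any other value is treated as a constant.
    """
    keys = list(spec)
    pools = [list(v) if isinstance(v, (list, tuple, range)) else [v]
             for v in spec.values()]
    # product() nests the iterations: the last key varies fastest.
    for combo in product(*pools):
        yield dict(zip(keys, combo))


rows = list(iterate_hash({"year": range(2007, 2009),
                          "sensor": ["A", "B"],
                          "level": 2}))
```

Here `rows` holds four dictionaries, one for each pairing of the two years with the two sensors, and every dictionary carries the constant `level`.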
ITER EDA newsletter. V. 7, no. 12

International Nuclear Information System (INIS)

1998-12-01

This edition of the ITER EDA Newsletter is dedicated to celebrating the achievements of the ITER activities at the San Diego Joint Work Site. Articles by E. Velikhov, A. Davies and R. Aymar mark the final days of American participation in the ITER program
Feedback Driven Annotation and Refactoring of Parallel Programs

DEFF Research Database (Denmark)

Larsen, Per

and communication in embedded programs. Runtime checks are developed to ensure that annotations correctly describe observable program behavior. The performance impact of runtime checking is evaluated on several benchmark kernels and is negligible in all cases. The second aspect is compilation feedback. Annotations...... are not effective unless programmers are told how and when they are beneficial. A prototype compilation feedback system was developed in collaboration with IBM Haifa Research Labs. It reports issues that prevent further analysis to the programmer. Performance evaluation shows that three programs perform significantly......This thesis combines programmer knowledge and feedback to improve modeling and optimization of software. The research is motivated by two observations. First, there is a great need for automatic analysis of software for embedded systems - to expose and model parallelism inherent in programs. Second...
A discrete ordinate response matrix method for massively parallel computers

International Nuclear Information System (INIS)

Hanebutte, U.R.; Lewis, E.E.

1991-01-01

A discrete ordinate response matrix method is formulated for the solution of neutron transport problems on massively parallel computers. The response matrix formulation eliminates iteration on the scattering source. The nodal matrices which result from the diamond-differenced equations are utilized in a factored form which minimizes memory requirements and significantly reduces the required number of algorithm utilizes massive parallelism by assigning each spatial node to a processor. The algorithm is accelerated effectively by a synthetic method in which the low-order diffusion equations are also solved by massively parallel red/black iterations. The method has been implemented on a 16k Connection Machine-2, and S 8 and S 16 solutions have been obtained for fixed-source benchmark problems in X--Y geometry
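The red/black iteration mentioned for the low-order diffusion equations can be sketched on a simpler model problem. The code below is an assumption-laden illustration, not the authors' method: it applies red/black Gauss-Seidel to the five-point Poisson stencil on a square grid with fixed boundary values, and the names `red_black_sweep` and `solve_laplace` are hypothetical. The property that matters for massive parallelism is that all points of one colour depend only on points of the other colour, so each half-sweep can be updated simultaneously.

```python
def red_black_sweep(u, f, h):
    """One red/black Gauss-Seidel sweep for -laplace(u) = f.

    Returns the largest change made to any interior value.
    """
    n = len(u)
    max_change = 0.0
    for colour in (0, 1):                     # 0 = red, 1 = black
        # Every point of one colour reads only points of the other
        # colour, so this inner double loop is safe to run in parallel.
        for i in range(1, n - 1):
            for j in range(1, n - 1):
                if (i + j) % 2 != colour:
                    continue
                new = 0.25 * (u[i - 1][j] + u[i + 1][j]
                              + u[i][j - 1] + u[i][j + 1]
                              + h * h * f[i][j])
                max_change = max(max_change, abs(new - u[i][j]))
                u[i][j] = new
    return max_change


def solve_laplace(n=10, boundary=1.0, tol=1e-10, max_sweeps=10_000):
    """Iterate sweeps on an n-by-n grid with constant boundary values."""
    h = 1.0 / (n - 1)
    u = [[boundary if i in (0, n - 1) or j in (0, n - 1) else 0.0
          for j in range(n)] for i in range(n)]
    f = [[0.0] * n for _ in range(n)]         # no source term
    for sweep in range(1, max_sweeps + 1):
        if red_black_sweep(u, f, h) < tol:
            return u, sweep
    raise RuntimeError("red/black iteration did not converge")
```

With a constant boundary value and no source, the converged interior equals the boundary value, which gives a simple check that the sweep is implemented correctly.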
Development and test of prototype components for ITER; Entwicklung und Test von Prototypkomponenten fuer ITER

Energy Technology Data Exchange (ETDEWEB)

Biel, Wolfgang; Behr, Wilfried; Castano-Bardawil, David; and others

2015-08-15

The scientific program of the project is divided into the following partial projects: (1.) ITER Diagnostic Port Plug for the charge-exchange spectroscopy (CXRS) with the subthemes: (a) development of prototypes for critical mechanical components, (b) development of a robot for the laser welding of vacuum seals and piping at the Port Plug, (c) mirror studies, (d) CXRS prototype spectrometer, (2.) ITER tritium retention diagnostics (TR), (3.) ITER disruption mitigation valve (DMV).
SPINET: A Parallel Computing Approach to Spine Simulations

Directory of Open Access Journals (Sweden)

Peter G. Kropf

1996-01-01

Research in scientific programming enables us to realize more and more complex applications, and on the other hand, application-driven demands on computing methods and power are continuously growing. Therefore, interdisciplinary approaches become more widely used. The interdisciplinary SPINET project presented in this article applies modern scientific computing tools to biomechanical simulations: parallel computing and symbolic and modern functional programming. The target application is the human spine. Simulations of the spine help us to investigate and better understand the mechanisms of back pain and spinal injury. Two approaches have been used: the first uses the finite element method for high-performance simulations of static biomechanical models, and the second generates a simulation development tool for experimenting with different dynamic models. A finite element program for static analysis has been parallelized for the MUSIC machine. To solve the sparse system of linear equations, a conjugate gradient solver (iterative method) and a frontal solver (direct method) have been implemented. The preprocessor required for the frontal solver is written in the modern functional programming language SML, the solver itself in C, thus exploiting the characteristic advantages of both functional and imperative programming. The speedup analysis of both solvers shows very satisfactory results for this irregular problem. A mixed symbolic-numeric environment for rigid body system simulations is presented. It automatically generates C code from a problem specification expressed by the Lagrange formalism using Maple.
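For readers unfamiliar with the iterative solver named in this abstract, the following is a minimal, serial sketch of the conjugate gradient method for a sparse symmetric positive definite system. It is a textbook formulation and not the SPINET implementation; the row-dictionary storage format and the names `sparse_matvec` and `conjugate_gradient` are assumptions made for illustration. In a parallel version the matrix-vector product and the dot products are the operations that would be distributed.

```python
def sparse_matvec(rows):
    """Build a matrix-vector product from rows stored as {column: value}."""
    def matvec(v):
        return [sum(a * v[j] for j, a in row.items()) for row in rows]
    return matvec


def conjugate_gradient(matvec, b, tol=1e-10, max_iter=None):
    """Solve A x = b for symmetric positive definite A, starting from x = 0."""
    n = len(b)
    max_iter = max_iter or 10 * n
    x = [0.0] * n
    r = list(b)                       # residual b - A x for x = 0
    p = list(r)                       # first search direction
    rs_old = sum(ri * ri for ri in r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:       # residual norm small enough
            break
        Ap = matvec(p)
        alpha = rs_old / sum(pi * api for pi, api in zip(p, Ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = sum(ri * ri for ri in r)
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


# A small stiffness-like test matrix: tridiagonal with 2 on the diagonal
# and -1 beside it. The right-hand side is chosen so that the exact
# solution is a vector of ones.
n = 5
rows = [{j: (2.0 if j == i else -1.0)
         for j in (i - 1, i, i + 1) if 0 <= j < n} for i in range(n)]
b = [1.0, 0.0, 0.0, 0.0, 1.0]
x = conjugate_gradient(sparse_matvec(rows), b)
```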
A Parallel Particle Swarm Optimization Algorithm Accelerated by Asynchronous Evaluations

Science.gov (United States)

Venter, Gerhard; Sobieszczanski-Sobieski, Jaroslaw

2005-01-01

A parallel Particle Swarm Optimization (PSO) algorithm is presented. Particle swarm optimization is a fairly recent addition to the family of non-gradient based, probabilistic search algorithms that is based on a simplified social model and is closely tied to swarming theory. Although PSO algorithms present several attractive properties to the designer, they are plagued by high computational cost as measured by elapsed time. One approach to reduce the elapsed time is to make use of coarse-grained parallelization to evaluate the design points. Previous parallel PSO algorithms were mostly implemented in a synchronous manner, where all design points within a design iteration are evaluated before the next iteration is started. This approach leads to poor parallel speedup in cases where a heterogeneous parallel environment is used and/or where the analysis time depends on the design point being analyzed. This paper introduces an asynchronous parallel PSO algorithm that greatly improves the parallel efficiency. The asynchronous algorithm is benchmarked on a cluster assembled of Apple Macintosh G5 desktop computers, using the multi-disciplinary optimization of a typical transport aircraft wing as an example.
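The difference between synchronous and asynchronous evaluation can be made concrete with a small sketch. It is not the authors' algorithm: it assumes a thread pool in place of a cluster, a cheap `sphere` test function in place of the wing analysis, and conventional PSO coefficients (`w`, `c1`, `c2`) that do not come from the paper. The point it illustrates is that a particle is moved and resubmitted as soon as its own evaluation returns, using whatever best positions are known at that moment, instead of waiting for the whole swarm.

```python
import random
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


def sphere(x):
    """Hypothetical stand-in for an expensive analysis."""
    return sum(xi * xi for xi in x)


def async_pso(objective, dim=2, n_particles=8, n_evals=400,
              workers=4, seed=0):
    rng = random.Random(seed)
    pos = [[rng.uniform(-5, 5) for _ in range(dim)]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [(float("inf"), None)] * n_particles   # per-particle best
    gbest = (float("inf"), None)                   # swarm best
    w, c1, c2 = 0.7, 1.5, 1.5                      # assumed coefficients
    with ThreadPoolExecutor(max_workers=workers) as pool:
        pending = {pool.submit(objective, list(pos[i])): (i, list(pos[i]))
                   for i in range(n_particles)}
        submitted = n_particles
        while pending:
            # Wake up as soon as any single evaluation has finished.
            done, _ = wait(pending, return_when=FIRST_COMPLETED)
            for fut in done:
                i, x = pending.pop(fut)
                value = fut.result()
                if value < pbest[i][0]:
                    pbest[i] = (value, x)
                if value < gbest[0]:
                    gbest = (value, x)
                if submitted < n_evals:
                    # Move particle i now, without waiting for the others.
                    for d in range(dim):
                        vel[i][d] = (w * vel[i][d]
                                     + c1 * rng.random() * (pbest[i][1][d] - pos[i][d])
                                     + c2 * rng.random() * (gbest[1][d] - pos[i][d]))
                        pos[i][d] += vel[i][d]
                    pending[pool.submit(objective, list(pos[i]))] = (i, list(pos[i]))
                    submitted += 1
    return gbest
```

Because evaluations finish in an order that depends on scheduling, two runs with the same seed need not follow the same trajectory. That is inherent to the asynchronous scheme and not a defect of the sketch.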
Automatic Loop Parallelization via Compiler Guided Refactoring

DEFF Research Database (Denmark)

Larsen, Per; Ladelsky, Razya; Lidman, Jacob

For many parallel applications, performance relies not on instruction-level parallelism, but on loop-level parallelism. Unfortunately, many modern applications are written in ways that obstruct automatic loop parallelization. Since we cannot identify sufficient parallelization opportunities...... for these codes in a static, off-line compiler, we developed an interactive compilation feedback system that guides the programmer in iteratively modifying application source, thereby improving the compiler’s ability to generate loop-parallel code. We use this compilation system to modify two sequential...... benchmarks, finding that the code parallelized in this way runs up to 8.3 times faster on an octo-core Intel Xeon 5570 system and up to 12.5 times faster on a quad-core IBM POWER6 system. Benchmark performance varies significantly between the systems. This suggests that semi-automatic parallelization should...

Parallel solutions of the two-group neutron diffusion equations

International Nuclear Information System (INIS)

Zee, K.S.; Turinsky, P.J.

1987-01-01

Recent efforts to adapt various numerical solution algorithms to parallel computer architectures have addressed the possibility of substantially reducing the running time of few-group neutron diffusion calculations. The authors have developed an efficient iterative parallel algorithm and an associated computer code for the rapid solution of the finite difference method representation of the two-group neutron diffusion equations on the CRAY X/MP-48 supercomputer having multi-CPUs and vector pipelines. For realistic simulation of light water reactor cores, the code employs a macroscopic depletion model with trace capability for selected fission product transients and critical boron. In addition to this, moderator and fuel temperature feedback models are also incorporated into the code. The validity of the physics models used in the code was benchmarked against qualified codes and proved accurate. This work is an extension of previous work in that various feedback effects are accounted for in the system; the entire code is structured to accommodate extensive vectorization; and an additional parallelism by multitasking is achieved not only for the solution of the matrix equations associated with the inner iterations but also for the other segments of the code, e.g., outer iterations
Discrete-Time Nonzero-Sum Games for Multiplayer Using Policy-Iteration-Based Adaptive Dynamic Programming Algorithms.

Science.gov (United States)

Zhang, Huaguang; Jiang, He; Luo, Chaomin; Xiao, Geyang

2017-10-01

In this paper, we investigate the nonzero-sum games for a class of discrete-time (DT) nonlinear systems by using a novel policy iteration (PI) adaptive dynamic programming (ADP) method. The main idea of our proposed PI scheme is to utilize the iterative ADP algorithm to obtain the iterative control policies, which not only ensure the system to achieve stability but also minimize the performance index function for each player. This paper integrates game theory, optimal control theory, and reinforcement learning technique to formulate and handle the DT nonzero-sum games for multiplayer. First, we design three actor-critic algorithms, an offline one and two online ones, for the PI scheme. Subsequently, neural networks are employed to implement these algorithms and the corresponding stability analysis is also provided via the Lyapunov theory. Finally, a numerical simulation example is presented to demonstrate the effectiveness of our proposed approach.
Dynamic stability calculations for power grids employing a parallel computer

Energy Technology Data Exchange (ETDEWEB)

Schmidt, K

1982-06-01

The aim of dynamic contingency calculations in power systems is to estimate the effects of assumed disturbances, such as loss of generation. Due to the large dimensions of the problem, these simulations require considerable computing time and cost, so that at present they are used only at the planning stage and not for routine checks in power control stations. In view of the homogeneity of the problem, where a multitude of equal generator models, having different parameters, are to be integrated simultaneously, the use of a parallel computer looks very attractive. The results of this study employing a prototype parallel computer (SMS 201) are presented. It consists of up to 128 equal microcomputers bus-connected to a control computer. Each of the modules is programmed to simulate a node of the power grid. Generators with their associated control are represented by models of 13 states each. Passive nodes are complemented by 'phantom' generators, so that the whole power grid is homogeneous, thus removing the need for load-flow iterations. Programming of the microcomputers is essentially performed in FORTRAN.
Process-Oriented Parallel Programming with an Application to Data-Intensive Computing

OpenAIRE

Givelberg, Edward

2014-01-01

We introduce process-oriented programming as a natural extension of object-oriented programming for parallel computing. It is based on the observation that every class of an object-oriented language can be instantiated as a process, accessible via a remote pointer. The introduction of process pointers requires no syntax extension, identifies processes with programming objects, and enables processes to exchange information simply by executing remote methods. Process-oriented programming is a h...
U.S. Contributions to ITER

International Nuclear Information System (INIS)

Sauthoff, Ned R.

2005-01-01

The United States participates in the ITER project and program to enable the study of the science and technology of burning plasmas, a key programmatic element missing from the world fusion program. The 2003 U.S. decision to enter the ITER negotiations followed an extensive series of community and governmental reviews of the benefits, readiness, and approaches to the study of burning plasmas. This paper describes both the technical and the organizational preparations and plans for U.S. participation in the ITER construction activity: in-kind contributions, staff contributions, and cash contributions as well as supporting physics and technology research. Near-term technical activities focus on the completion of R and D and design and mitigation of risks in the areas of the central solenoid magnet, shield/blanket, diagnostics, ion cyclotron system, electron cyclotron system, pellet fueling system, vacuum system, tritium processing system, and conventional systems. Outside the project, the U.S. is engaged in preparations for the test blanket module program. Organizational activities focus on preparations of the project management arrangements to maximize the overall success of the ITER Project; elements include refinement of U.S. directions on the international arrangements, the establishment of the U.S. Domestic Agency, progress along the path of the U.S. Department of Energy's Project Management Order, and overall preparations for commencement of the fabrication of major items of equipment and for provision of staff and cash as specified in the upcoming ITER agreement
Eliminating graphs by means of parallel knock-out schemes

NARCIS (Netherlands)

Broersma, H.J.; Fomin, F.V.; Královic, R.; Woeginger, G.J.

2007-01-01

In 1997 Lampert and Slater introduced parallel knock-out schemes, an iterative process on graphs that goes through several rounds. In each round of this process, every vertex eliminates exactly one of its neighbors. The parallel knock-out number of a graph is the minimum number of rounds after which
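A single round of the process described here can be sketched as follows. The representation of a graph as a dictionary of neighbour sets, the name `knock_out_round`, and the two small example graphs are assumptions made for illustration; only the rule that every vertex eliminates exactly one of its neighbours, with all eliminations happening at once, comes from the abstract.

```python
def knock_out_round(adjacency, choice):
    """Apply one round in which every vertex eliminates a chosen neighbour.

    `adjacency` maps each vertex to the set of its neighbours; `choice`
    maps each vertex to the neighbour it eliminates. All eliminations
    happen simultaneously. Returns the adjacency of the surviving graph.
    """
    if set(choice) != set(adjacency):
        raise ValueError("every vertex must choose exactly one neighbour")
    for v, target in choice.items():
        if target not in adjacency[v]:
            raise ValueError(f"{target!r} is not a neighbour of {v!r}")
    eliminated = set(choice.values())
    return {v: nbrs - eliminated
            for v, nbrs in adjacency.items() if v not in eliminated}


# On a 4-cycle, each vertex eliminating its clockwise neighbour removes
# every vertex in a single round.
cycle4 = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
after_cycle = knock_out_round(cycle4, {v: (v + 1) % 4 for v in cycle4})

# On a 3-vertex path, the two end vertices must both eliminate the middle
# one, so whichever end the middle vertex picks, the other end survives
# with no neighbour left to eliminate.
path3 = {0: {1}, 1: {0, 2}, 2: {1}}
after_path = knock_out_round(path3, {0: 1, 1: 0, 2: 1})
```

After these calls `after_cycle` is empty, while `after_path` contains the single isolated vertex `2`.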
Eliminating graphs by means of parallel knock-out schemes

NARCIS (Netherlands)

Broersma, Haitze J.; Fomin, F.V.; Královič, R.; Woeginger, Gerhard

In 1997 Lampert and Slater introduced parallel knock-out schemes, an iterative process on graphs that goes through several rounds. In each round of this process, every vertex eliminates exactly one of its neighbors. The parallel knock-out number of a graph is the minimum number of rounds after which
Conceptual design Fusion Experimental Reactor (FER/ITER)

International Nuclear Information System (INIS)

Uehara, Kazuya; Nagashima, Takashi; Ikeda, Yoshitaka

1991-11-01

This report describes a conceptual design of the Lower Hybrid Wave (LH) system for FER and ITER. At JAERI, the conceptual design of the LH system for FER has been carried out over the past 3 years in parallel with that of ITER, and parts of the design must be common to ITER and FER. The physical requirements of the LH system are the saving of volt-seconds in the current start-up phase and current drive at the boundary region. The frequency of 5 GHz is chosen mainly to avoid α particle absorption and for the availability of electron tube development. Seventy-two klystrons (FER) and one hundred klystrons (ITER), each delivering 0.7-0.8 MW per tube, are necessary to inject 30 MW (FER) and 45-50 MW (ITER) of rf power into the plasma. The launching system is of the multi-junction type, and the rf spectrum must be as sharp as possible with high directivity to improve the current drive efficiency. One port (FER) and two ports (ITER) are used, and the injection direction is horizontal, a choice that takes into account ray-tracing code analysis and better coupling of the LH wave. The transmission line is an over-sized waveguide with low rf loss. (author)
Efficient parallel iterative solvers for the solution of large dense linear systems arising from the boundary element method in electromagnetism

International Nuclear Information System (INIS)

Alleon, G.; Carpentieri, B.; Duff, I.S.; Giraud, L.; Langou, J.; Martin, E.

2003-01-01

The boundary element method has become a popular tool for the solution of Maxwell's equations in electromagnetism. It discretizes only the surface of the radiating object and gives rise to linear systems that are smaller in size compared to those arising from finite element or finite difference discretizations. However, these systems are prohibitively demanding in terms of memory for direct methods and challenging to solve by iterative methods. In this paper we address the iterative solution via preconditioned Krylov methods of electromagnetic scattering problems expressed in an integral formulation, with main focus on the design of the pre-conditioner. We consider an approximate inverse method based on the Frobenius-norm minimization with a pattern prescribed in advance. The pre-conditioner is constructed from a sparse approximation of the dense coefficient matrix, and the patterns both for the pre-conditioner and for the coefficient matrix are computed a priori using geometric information from the mesh. We describe the implementation of the approximate inverse in an out-of-core parallel code that uses multipole techniques for the matrix-vector products, and show results on the numerical scalability of our method on systems of size up to one million unknowns. We propose an embedded iterative scheme based on the GMRES method and combined with multipole techniques, aimed at improving the robustness of the approximate inverse for large problems. We prove by numerical experiments that the proposed scheme enables the solution of very large and difficult problems efficiently at reduced computational and memory cost. Finally we perform a preliminary study on a spectral two-level pre-conditioner to enhance the robustness of our method. This numerical technique exploits spectral information of the preconditioned systems to build a low rank-update of the pre-conditioner. (authors)
Efficient parallel iterative solvers for the solution of large dense linear systems arising from the boundary element method in electromagnetism

Energy Technology Data Exchange (ETDEWEB)

Alleon, G. [EADS-CCR, 31 - Blagnac (France)]; Carpentieri, B.; Duff, I.S.; Giraud, L.; Langou, J.; Martin, E. [Cerfacs, 31 - Toulouse (France)]

2003-07-01

The boundary element method has become a popular tool for the solution of Maxwell's equations in electromagnetism. It discretizes only the surface of the radiating object and gives rise to linear systems that are smaller in size compared to those arising from finite element or finite difference discretizations. However, these systems are prohibitively demanding in terms of memory for direct methods and challenging to solve by iterative methods. In this paper we address the iterative solution via preconditioned Krylov methods of electromagnetic scattering problems expressed in an integral formulation, with main focus on the design of the pre-conditioner. We consider an approximate inverse method based on the Frobenius-norm minimization with a pattern prescribed in advance. The pre-conditioner is constructed from a sparse approximation of the dense coefficient matrix, and the patterns both for the pre-conditioner and for the coefficient matrix are computed a priori using geometric information from the mesh. We describe the implementation of the approximate inverse in an out-of-core parallel code that uses multipole techniques for the matrix-vector products, and show results on the numerical scalability of our method on systems of size up to one million unknowns. We propose an embedded iterative scheme based on the GMRES method and combined with multipole techniques, aimed at improving the robustness of the approximate inverse for large problems. We prove by numerical experiments that the proposed scheme enables the solution of very large and difficult problems efficiently at reduced computational and memory cost. Finally we perform a preliminary study on a spectral two-level pre-conditioner to enhance the robustness of our method. This numerical technique exploits spectral information of the preconditioned systems to build a low rank-update of the pre-conditioner. (authors)
Fourteenth meeting of the ITER management advisory committee

International Nuclear Information System (INIS)

Yoshikawa, M.

1998-01-01

Following the Director's report on the progress made in the ITER Engineering Design Activities, the ITER Management Advisory Committee reviewed the Task Status Summary, Work Program and Task Agreements for EDA Extension, Joint Fund and a schedule of ITER meetings
Rokkasho: Japanese site for ITER

International Nuclear Information System (INIS)

Ohtake, S.; Yamaguchi, V.; Matsuda, S.; Kishimoto, H.

2003-01-01

The Atomic Energy Commission of Japan authorized ITER as the core machine of the Third Phase Basic Program of Fusion Energy Development. After a series of discussions in the Atomic Energy Commission and the Council of Science and Technology Policy, Japanese Government concluded formally with the Cabinet Agreement on 31 May 2002 that Japan should participate in the ITER Project and offer the Rokkasho-Mura site for construction of ITER to the Negotiations among Canada (CA), the European Union (EU), Japan (JA), and the Russian Federation (RF). The JA site proposal is now under the international assessment in the framework of the ITER Negotiations. (author)
Academic training: From Evolution Theory to Parallel and Distributed Genetic Programming

CERN Multimedia

2007-01-01

2006-2007 ACADEMIC TRAINING PROGRAMME LECTURE SERIES 15, 16 March From 11:00 to 12:00 - Main Auditorium, bldg. 500 From Evolution Theory to Parallel and Distributed Genetic Programming F. FERNANDEZ DE VEGA / Univ. of Extremadura, SP Lecture No. 1: From Evolution Theory to Evolutionary Computation Evolutionary computation is a subfield of artificial intelligence (more particularly computational intelligence) involving combinatorial optimization problems, which are based to some degree on the evolution of biological life in the natural world. In this tutorial we will review the source of inspiration for this metaheuristic and its capability for solving problems. We will show the main flavours within the field, and different problems that have been successfully solved employing this kind of techniques. Lecture No. 2: Parallel and Distributed Genetic Programming The successful application of Genetic Programming (GP, one of the available Evolutionary Algorithms) to optimization problems has encouraged an ...
Policy Iteration for $H_\infty$ Optimal Control of Polynomial Nonlinear Systems via Sum of Squares Programming.

Science.gov (United States)

Zhu, Yuanheng; Zhao, Dongbin; Yang, Xiong; Zhang, Qichao

2018-02-01

Sum of squares (SOS) polynomials have provided a computationally tractable way to deal with inequality constraints appearing in many control problems. They can also act as approximators in the framework of adaptive dynamic programming. In this paper, an approximate solution to the optimal control of polynomial nonlinear systems is proposed. Under a given attenuation coefficient, the Hamilton-Jacobi-Isaacs equation is relaxed to an optimization problem with a set of inequalities. After applying the policy iteration technique and constraining inequalities to SOS, the optimization problem is divided into a sequence of feasible semidefinite programming problems. With the converged solution, the attenuation coefficient is further minimized to a lower value. After iterations, approximate solutions to the smallest $L_2$-gain and the associated optimal controller are obtained. Four examples are employed to verify the effectiveness of the proposed algorithm.
Automatic programming via iterated local search for dynamic job shop scheduling.

Science.gov (United States)

Nguyen, Su; Zhang, Mengjie; Johnston, Mark; Tan, Kay Chen

2015-01-01

Dispatching rules have been commonly used in practice for making sequencing and scheduling decisions. Due to specific characteristics of each manufacturing system, there is no universal dispatching rule that can dominate in all situations. Therefore, it is important to design specialized dispatching rules to enhance the scheduling performance for each manufacturing environment. Evolutionary computation approaches such as tree-based genetic programming (TGP) and gene expression programming (GEP) have been proposed to facilitate the design task through automatic design of dispatching rules. However, these methods are still limited by their high computational cost and low exploitation ability. To overcome this problem, we develop a new approach to automatic programming via iterated local search (APRILS) for dynamic job shop scheduling. The key idea of APRILS is to perform multiple local searches started with programs modified from the best obtained programs so far. The experiments show that APRILS outperforms TGP and GEP in most simulation scenarios in terms of effectiveness and efficiency. The analysis also shows that programs generated by APRILS are more compact than those obtained by genetic programming. An investigation of the behavior of APRILS suggests that the good performance of APRILS comes from the balance between exploration and exploitation in its search mechanism.
MPL-A program for computations with iterated integrals on moduli spaces of curves of genus zero

Science.gov (United States)

Bogner, Christian

2016-06-01

We introduce the Maple program MPL for computations with multiple polylogarithms. The program is based on homotopy invariant iterated integrals on moduli spaces M0,n of curves of genus 0 with n ordered marked points. It includes the symbol map and procedures for the analytic computation of period integrals on M0,n. It supports the automated computation of a certain class of Feynman integrals.
ITER Central Solenoid Module Fabrication

Energy Technology Data Exchange (ETDEWEB)

Smith, John [General Atomics, San Diego, CA (United States)

2016-09-23

The fabrication of the modules for the ITER Central Solenoid (CS) has started in a dedicated production facility located in Poway, California, USA. The necessary tools have been designed, built, installed, and tested in the facility to enable the start of production. The current schedule has fabrication of the first module completed in 2017, followed by testing and subsequent shipment to ITER. The Central Solenoid is a key component of the ITER tokamak, providing the inductive voltage to initiate and sustain the plasma current and to position and shape the plasma. The design of the CS has been a collaborative effort between the US ITER Project Office (US ITER), the international ITER Organization (IO) and General Atomics (GA). GA’s responsibility includes completing the fabrication design, developing and qualifying the fabrication processes and tools, and then completing the fabrication of the seven 110 tonne CS modules. The modules will be shipped separately to the ITER site, and then stacked and aligned in the Assembly Hall prior to insertion in the core of the ITER tokamak. A dedicated facility in Poway, California, USA has been established by GA to complete the fabrication of the seven modules. Infrastructure improvements included thick reinforced concrete floors, a diesel generator for backup power, and cranes for moving the tooling within the facility. The fabrication process for a single module requires approximately 22 months followed by five months of testing, which includes preliminary electrical testing followed by high current (48.5 kA) tests at 4.7 K. The production of the seven modules is completed in a parallel fashion through ten process stations. The process stations have been designed and built, with most stations having completed testing and qualification for carrying out the required fabrication processes. The final qualification step for each process station is achieved by the successful production of a prototype coil. Fabrication of the first
Computational chemistry with transputers: A direct SCF program

International Nuclear Information System (INIS)

Wedig, U.; Burkhardt, A.; Schnering, H.G. von

1989-01-01

By using transputers it is possible to build up networks of parallel processors with varying topology. Due to the architecture of the processors it is appropriate to use the MIMD (multiple instruction multiple data) concept of parallel computing. The most suitable programming language is OCCAM. We investigate the use of transputer networks in computational chemistry, starting with the direct SCF method. The most time-consuming step, the calculation of the two-electron integrals, is executed in parallel. Each node in the network calculates whole batches of integrals. The main program is written in OCCAM. For some large-scale arithmetic processes running on a single node, however, we used FORTRAN subroutines out of standard ab-initio programs to reduce the programming effort. Test calculations show that the integral calculation step can be parallelized very efficiently. We observed a speed-up of almost 8 using eight network processors. Even in consideration of the scalar part of the SCF iteration, the speed-up is not less than 7.1. (orig.)
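The two speed-up figures quoted above can be related to each other with Amdahl's law. The abstract does not say that the authors used this model, so the following is only a minimal sketch under that assumption; the function names are hypothetical.

```python
def amdahl_speedup(serial_fraction, processors):
    """Speed-up predicted by Amdahl's law for a given serial fraction."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / processors)


def serial_fraction_from_speedup(speedup, processors):
    """Invert Amdahl's law: S = 1 / (f + (1 - f) / p)  =>  f = (p / S - 1) / (p - 1)."""
    return (processors / speedup - 1.0) / (processors - 1.0)


# Overall speed-up of 7.1 on eight processors, as reported in the abstract.
f = serial_fraction_from_speedup(7.1, 8)
print(f"implied serial fraction: {f:.3f}")
```

Under this model, a speed-up of 7.1 on eight processors corresponds to a serial (scalar) share of roughly 2% of the single-processor run time.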
Application Portable Parallel Library

Science.gov (United States)

Cole, Gary L.; Blech, Richard A.; Quealy, Angela; Townsend, Scott

1995-01-01

Application Portable Parallel Library (APPL) computer program is subroutine-based message-passing software library intended to provide consistent interface to variety of multiprocessor computers on market today. Minimizes effort needed to move application program from one computer to another. User develops application program once and then easily moves application program from parallel computer on which created to another parallel computer. ("Parallel computer" also includes heterogeneous collection of networked computers.) Written in C language with one FORTRAN 77 subroutine for UNIX-based computers and callable from application programs written in C language or FORTRAN 77.
Final Report: Center for Programming Models for Scalable Parallel Computing

Energy Technology Data Exchange (ETDEWEB)

Mellor-Crummey, John [William Marsh Rice University]

2011-09-13

As part of the Center for Programming Models for Scalable Parallel Computing, Rice University collaborated with project partners in the design, development and deployment of language, compiler, and runtime support for parallel programming models to support application development for the “leadership-class” computer systems at DOE national laboratories. Work over the course of this project has focused on the design, implementation, and evaluation of a second-generation version of Coarray Fortran. Research and development efforts of the project have focused on the CAF 2.0 language, compiler, runtime system, and supporting infrastructure. This has involved working with the teams that provide infrastructure for CAF that we rely on, implementing new language and runtime features, producing an open source compiler that enabled us to evaluate our ideas, and evaluating our design and implementation through the use of benchmarks. The report details the research, development, findings, and conclusions from this work.

ITER Conceptual design: Interim report

International Nuclear Information System (INIS)

1990-01-01

This interim report describes the results of the International Thermonuclear Experimental Reactor (ITER) Conceptual Design Activities after the first year of design following the selection of the ITER concept in the autumn of 1988. Using the concept definition as the basis for conceptual design, the Design Phase has been underway since October 1988, and will be completed at the end of 1990, at which time a final report will be issued. This interim report includes an executive summary of ITER activities, a description of the ITER device and facility, an operation and research program summary, and a description of the physics and engineering design bases. Included are preliminary cost estimates and schedule for completion of the project
76 FR 62808 - Pilot Program for Parallel Review of Medical Products

Science.gov (United States)

2011-10-11

... voluntary participation in the pilot program, as well as the guiding principles the Agencies intend to... 57045), parallel review is intended to reduce the time between FDA marketing approval and CMS national...
Parallel paving: An algorithm for generating distributed, adaptive, all-quadrilateral meshes on parallel computers

Energy Technology Data Exchange (ETDEWEB)

Lober, R.R.; Tautges, T.J.; Vaughan, C.T.

1997-03-01

Paving is an automated mesh generation algorithm which produces all-quadrilateral elements. It can additionally generate these elements in varying sizes such that the resulting mesh adapts to a function distribution, such as an error function. While powerful, conventional paving is a very serial algorithm in its operation. Parallel paving is the extension of serial paving into parallel environments to perform the same meshing functions as conventional paving only on distributed, discretized models. This extension allows large, adaptive, parallel finite element simulations to take advantage of paving's meshing capabilities for h-remap remeshing. A significantly modified version of the CUBIT mesh generation code has been developed to host the parallel paving algorithm and demonstrate its capabilities on both two dimensional and three dimensional surface geometries and compare the resulting parallel produced meshes to conventionally paved meshes for mesh quality and algorithm performance. Sandia's "tiling" dynamic load balancing code has also been extended to work with the paving algorithm to retain parallel efficiency as subdomains undergo iterative mesh refinement.
Remote maintenance development for ITER

International Nuclear Information System (INIS)

Tada, Eisuke; Shibanuma, Kiyoshi

1998-01-01

This paper describes the overall ITER remote maintenance design concept developed mainly for in-vessel components such as divertors and blankets, and outlines the ITER R and D program to develop remote handling equipment and radiation hard components. Reactor structures inside the ITER cryostat must be maintained remotely due to DT operation, making remote handling technology basic to reactor design. The overall maintenance scenario and design concepts have been developed, and maintenance design feasibility, including fabrication and testing of full-scale in-vessel remote maintenance handling equipment and tools, is being verified. (author)
Remote maintenance development for ITER

Energy Technology Data Exchange (ETDEWEB)

Tada, Eisuke [Japan Atomic Energy Research Inst., Tokai, Ibaraki (Japan). Tokai Research Establishment]; Shibanuma, Kiyoshi

1998-04-01

This paper describes the overall ITER remote maintenance design concept developed mainly for in-vessel components such as divertors and blankets, and outlines the ITER R and D program to develop remote handling equipment and radiation hard components. Reactor structures inside the ITER cryostat must be maintained remotely due to DT operation, making remote handling technology basic to reactor design. The overall maintenance scenario and design concepts have been developed, and maintenance design feasibility, including fabrication and testing of full-scale in-vessel remote maintenance handling equipment and tools, is being verified. (author)
Copper Mountain conference on iterative methods: Proceedings: Volume 2

Energy Technology Data Exchange (ETDEWEB)

NONE

1996-10-01

This volume (the second of two) contains information presented during the last two days of the Copper Mountain Conference on Iterative Methods held April 9-13, 1996 at Copper Mountain, Colorado. Topics of the sessions held these two days include domain decomposition, Krylov methods, computational fluid dynamics, Markov chains, sparse and parallel basic linear algebra subprograms, multigrid methods, applications of iterative methods, equation systems with multiple right-hand sides, projection methods, and the Helmholtz equation. Selected papers indexed separately for the Energy Science and Technology Database.
Empirical valence bond models for reactive potential energy surfaces: a parallel multilevel genetic program approach.

Science.gov (United States)

Bellucci, Michael A; Coker, David F

2011-07-28

We describe a new method for constructing empirical valence bond potential energy surfaces using a parallel multilevel genetic program (PMLGP). Genetic programs can be used to perform an efficient search through function space and parameter space to find the best functions and sets of parameters that fit energies obtained by ab initio electronic structure calculations. Building on the traditional genetic program approach, the PMLGP utilizes a hierarchy of genetic programming on two different levels. The lower level genetic programs are used to optimize coevolving populations in parallel while the higher level genetic program (HLGP) is used to optimize the genetic operator probabilities of the lower level genetic programs. The HLGP allows the algorithm to dynamically learn the mutation or combination of mutations that most effectively increase the fitness of the populations, causing a significant increase in the algorithm's accuracy and efficiency. The algorithm's accuracy and efficiency are tested against a standard parallel genetic program with a variety of one-dimensional test cases. Subsequently, the PMLGP is utilized to obtain an accurate empirical valence bond model for proton transfer in 3-hydroxy-gamma-pyrone in gas phase and protic solvent. © 2011 American Institute of Physics
Distributed Parallel Endmember Extraction of Hyperspectral Data Based on Spark

Directory of Open Access Journals (Sweden)

Zebin Wu

2016-01-01

Due to the increasing dimensionality and volume of remotely sensed hyperspectral data, the development of acceleration techniques for massive hyperspectral image analysis approaches is a very important challenge. Cloud computing offers many possibilities of distributed processing of hyperspectral datasets. This paper proposes a novel distributed parallel endmember extraction method based on iterative error analysis that utilizes cloud computing principles to efficiently process massive hyperspectral data. The proposed method takes advantage of technologies including the MapReduce programming model, Hadoop Distributed File System (HDFS), and Apache Spark to realize distributed parallel implementation for hyperspectral endmember extraction, which significantly accelerates the computation of hyperspectral processing and provides high throughput access to large hyperspectral data. The experimental results, which are obtained by extracting endmembers of hyperspectral datasets on a cloud computing platform built on a cluster, demonstrate the effectiveness and computational efficiency of the proposed method.
CAD-Based Shielding Analysis for ITER Port Diagnostics

Directory of Open Access Journals (Sweden)

Serikov Arkady

2017-01-01

Radiation shielding analysis conducted in support of design development of the contemporary diagnostic systems integrated inside the ITER ports relies on the use of CAD models. This paper presents the CAD-based MCNP Monte Carlo radiation transport and activation analyses for the Diagnostic Upper and Equatorial Port Plugs (UPP #3 and EPP #8, #17). The creation of the complicated 3D MCNP models of the diagnostic systems was substantially accelerated by application of the CAD-to-MCNP converter programs MCAM and McCad. High performance computing resources of the Helios supercomputer made it possible to speed up the MCNP parallel transport calculations with the MPI/OpenMP interface. The shielding solutions found could be universal, reducing port R&D costs. A shield block behind the Tritium and Deposit Monitor (TDM) optical box was added to study its influence on the Shut-Down Dose Rate (SDDR) in the Port Interspace (PI) of EPP#17. The influence of neutron streaming along the Lost Alpha Monitor (LAM) on the neutron energy spectra calculated in the Tangential Neutron Spectrometer (TNS) of EPP#8 was also examined. For the UPP#3 with Charge eXchange Recombination Spectroscopy (CXRS-core), excessive neutron streaming along the CXRS shutter was found, which should be prevented in further design iterations.
CAD-Based Shielding Analysis for ITER Port Diagnostics

Science.gov (United States)

Serikov, Arkady; Fischer, Ulrich; Anthoine, David; Bertalot, Luciano; De Bock, Maartin; O'Connor, Richard; Juarez, Rafael; Krasilnikov, Vitaly

2017-09-01

Radiation shielding analysis conducted in support of design development of the contemporary diagnostic systems integrated inside the ITER ports relies on the use of CAD models. This paper presents the CAD-based MCNP Monte Carlo radiation transport and activation analyses for the Diagnostic Upper and Equatorial Port Plugs (UPP #3 and EPP #8, #17). The creation of the complicated 3D MCNP models of the diagnostic systems was substantially accelerated by application of the CAD-to-MCNP converter programs MCAM and McCad. High performance computing resources of the Helios supercomputer made it possible to speed up the MCNP parallel transport calculations with the MPI/OpenMP interface. The shielding solutions found could be universal, reducing port R&D costs. A shield block behind the Tritium and Deposit Monitor (TDM) optical box was added to study its influence on the Shut-Down Dose Rate (SDDR) in the Port Interspace (PI) of EPP#17. The influence of neutron streaming along the Lost Alpha Monitor (LAM) on the neutron energy spectra calculated in the Tangential Neutron Spectrometer (TNS) of EPP#8 was also examined. For the UPP#3 with Charge eXchange Recombination Spectroscopy (CXRS-core), excessive neutron streaming along the CXRS shutter was found, which should be prevented in further design iterations.
Solution of the within-group multidimensional discrete ordinates transport equations on massively parallel architectures

Science.gov (United States)

Zerr, Robert Joseph

2011-12-01

The integral transport matrix method (ITMM) has been used as the kernel of new parallel solution methods for the discrete ordinates approximation of the within-group neutron transport equation. The ITMM abandons the repetitive mesh sweeps of the traditional source iterations (SI) scheme in favor of constructing stored operators that account for the direct coupling factors among all the cells and between the cells and boundary surfaces. The main goals of this work were to develop the algorithms that construct these operators and employ them in the solution process, determine the most suitable way to parallelize the entire procedure, and evaluate the behavior and performance of the developed methods for an increasing number of processes. This project compares the effectiveness of the ITMM with the SI scheme parallelized with the Koch-Baker-Alcouffe (KBA) method. The primary parallel solution method involves a decomposition of the domain into smaller spatial sub-domains, each with its own transport matrices, and coupled together via interface boundary angular fluxes. Each sub-domain has its own set of ITMM operators and represents an independent transport problem. Multiple iterative parallel solution methods have been investigated, including parallel block Jacobi (PBJ), parallel red/black Gauss-Seidel (PGS), and parallel GMRES (PGMRES). The fastest observed parallel solution method, PGS, was used in a weak scaling comparison with the PARTISN code. Compared to the state-of-the-art SI-KBA with diffusion synthetic acceleration (DSA), this new method without acceleration/preconditioning is not competitive for any problem parameters considered. The best comparisons occur for problems that are difficult for SI DSA, namely highly scattering and optically thick. SI DSA execution time curves are generally steeper than the PGS ones. However, until further testing is performed it cannot be concluded that SI DSA does not outperform the ITMM with PGS even on several thousand or tens of
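The difference between the Jacobi and red/black Gauss-Seidel orderings named above can be illustrated on a much simpler stand-in problem. The sketch below is not the ITMM: it assumes a 1D Poisson model problem, -u'' = 1 with zero boundary values, and point-wise rather than block updates; all names are hypothetical. In both schemes the updates within one loop are independent of each other, which is what makes them candidates for parallel execution.

```python
def jacobi_sweep(u, rhs, h):
    """One Jacobi sweep: every interior point is updated from old values only."""
    new = u[:]
    for i in range(1, len(u) - 1):
        new[i] = 0.5 * (u[i - 1] + u[i + 1] + h * h * rhs[i])
    return new


def red_black_sweep(u, rhs, h):
    """One red/black Gauss-Seidel sweep: odd points first, then even points.

    Points of one colour only read points of the other colour, so each
    half-sweep could be executed in parallel.
    """
    new = u[:]
    for start in (1, 2):
        for i in range(start, len(new) - 1, 2):
            new[i] = 0.5 * (new[i - 1] + new[i + 1] + h * h * rhs[i])
    return new


def sweeps_to_converge(sweep, n=33, tol=1e-8, max_sweeps=100000):
    """Iterate until the largest change per sweep drops below tol."""
    h = 1.0 / (n - 1)
    u = [0.0] * n
    rhs = [1.0] * n
    for k in range(1, max_sweeps + 1):
        new = sweep(u, rhs, h)
        change = max(abs(a - b) for a, b in zip(new, u))
        u = new
        if change < tol:
            return k, u
    raise RuntimeError("no convergence within max_sweeps")


jacobi_count, jacobi_u = sweeps_to_converge(jacobi_sweep)
rb_count, rb_u = sweeps_to_converge(red_black_sweep)
print("Jacobi sweeps:", jacobi_count, "red/black sweeps:", rb_count)
```

On this model problem the red/black ordering needs fewer sweeps than Jacobi because the second colour already sees the updated values of the first; whether the same holds for a given transport problem is a separate question.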
Non-Cartesian parallel imaging reconstruction.

Science.gov (United States)

Wright, Katherine L; Hamilton, Jesse I; Griswold, Mark A; Gulani, Vikas; Seiberlich, Nicole

2014-11-01

Non-Cartesian parallel imaging has played an important role in reducing data acquisition time in MRI. The use of non-Cartesian trajectories can enable more efficient coverage of k-space, which can be leveraged to reduce scan times. These trajectories can be undersampled to achieve even faster scan times, but the resulting images may contain aliasing artifacts. Just as Cartesian parallel imaging can be used to reconstruct images from undersampled Cartesian data, non-Cartesian parallel imaging methods can mitigate aliasing artifacts by using additional spatial encoding information in the form of the nonhomogeneous sensitivities of multi-coil phased arrays. This review will begin with an overview of non-Cartesian k-space trajectories and their sampling properties, followed by an in-depth discussion of several selected non-Cartesian parallel imaging algorithms. Three representative non-Cartesian parallel imaging methods will be described, including Conjugate Gradient SENSE (CG SENSE), non-Cartesian generalized autocalibrating partially parallel acquisition (GRAPPA), and Iterative Self-Consistent Parallel Imaging Reconstruction (SPIRiT). After a discussion of these three techniques, several potential promising clinical applications of non-Cartesian parallel imaging will be covered. © 2014 Wiley Periodicals, Inc.
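The conjugate gradient iteration that gives CG SENSE its name can be sketched on a small real-valued system. This is only an illustration of the solver: in an actual reconstruction the matrix would be the (complex) normal-equations operator built from coil sensitivities and the sampling trajectory, which is not modelled here, and the function names are hypothetical.

```python
def matvec(a, x):
    """Dense matrix-vector product for a list-of-rows matrix."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]


def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))


def conjugate_gradient(a, b, tol=1e-10, max_iter=100):
    """Solve a x = b for a symmetric positive definite matrix a."""
    x = [0.0] * len(b)
    r = b[:]            # residual b - a x for the zero initial guess
    p = r[:]            # first search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:
            break
        ap = matvec(a, p)
        alpha = rs_old / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


solution = conjugate_gradient([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0])
print(solution)
```

The solver only needs matrix-vector products, which is why the method suits problems where the operator is applied on the fly rather than stored.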
Implementing the PM Programming Language using MPI and OpenMP - a New Tool for Programming Geophysical Models on Parallel Systems

Science.gov (United States)

Bellerby, Tim

2015-04-01

PM (Parallel Models) is a new parallel programming language specifically designed for writing environmental and geophysical models. The language is intended to enable implementers to concentrate on the science behind the model rather than the details of running on parallel hardware. At the same time PM leaves the programmer in control - all parallelisation is explicit and the parallel structure of any given program may be deduced directly from the code. This paper describes a PM implementation based on the Message Passing Interface (MPI) and Open Multi-Processing (OpenMP) standards, looking at issues involved with translating the PM parallelisation model to MPI/OpenMP protocols and considering performance in terms of the competing factors of finer-grained parallelisation and increased communication overhead. In order to maximise portability, the implementation stays within the MPI 1.3 standard as much as possible, with MPI-2 MPI-IO file handling the only significant exception. Moreover, it does not assume a thread-safe implementation of MPI. PM adopts a two-tier abstract representation of parallel hardware. A PM processor is a conceptual unit capable of efficiently executing a set of language tasks, with a complete parallel system consisting of an abstract N-dimensional array of such processors. PM processors may map to single cores executing tasks using cooperative multi-tasking, to multiple cores or even to separate processing nodes, efficiently sharing tasks using algorithms such as work stealing. While tasks may move between hardware elements within a PM processor, they may not move between processors without specific programmer intervention. Tasks are assigned to processors using a nested parallelism approach, building on ideas from Reyes et al. (2009). The main program owns all available processors. When the program enters a parallel statement then either processors are divided out among the newly generated tasks (number of new tasks number of processors
The ITER Management Advisory Committee (MAC) meeting in Garching

International Nuclear Information System (INIS)

Yoshikawa, M.

1999-01-01

The ITER management advisory committee meeting was held on 22-23 July 1999 in Garching, Germany. The main topics were the ITER EDA status, task status summary and work program, joint fund, information technology needs at the ITER joint work sites, the disposition of R and D components and a schedule of ITER meetings
Reactor structure and superconducting magnet system of ITER

International Nuclear Information System (INIS)

Tada, Eisuke; Yoshida, Kiyoshi; Shibanuma, Kiyoshi; Okuno, Kiyoshi; Tsuji, Hiroshi; Shimamoto, Susumu

1993-01-01

Fusion experimental reactors are one of the major steps toward the realization of fusion energy, and their key objective is to demonstrate scientific and technological feasibility prior to the Demo Fusion Reactor. ITER (International Thermonuclear Experimental Reactor) is one such experimental reactor, and its conceptual design has been completed by the united efforts of the USA, USSR, EC and Japan. In parallel with the conceptual design, key technology development in various areas has been conducted. This paper describes the overall design concepts and the latest technological achievements of the ITER reactor structure and superconducting magnet system. (author)
76 FR 66309 - Pilot Program for Parallel Review of Medical Products; Correction

Science.gov (United States)

2011-10-26

... DEPARTMENT OF HEALTH AND HUMAN SERVICES Centers for Medicare and Medicaid Services [CMS-3180-N2] Food and Drug Administration [Docket No. FDA-2010-N-0308] Pilot Program for Parallel Review of Medical... 11, 2011 (76 FR 62808). The document announced a pilot program for sponsors of innovative device...
1.5 MW RF Load for ITER

International Nuclear Information System (INIS)

Ives, Robert Lawrence; Marsden, David; Collins, George; Karimov, Rasul; Mizuhara, Max; Neilson, Jeffrey

2016-01-01

Calabazas Creek Research, Inc. developed a 1.5 MW RF load for the ITER fusion research facility currently under construction in France. This program leveraged technology developed in two previous SBIR programs that successfully developed high power RF loads for fusion research applications. This program specifically focused on modifications required by revised technical performance, materials, and assembly specification for ITER. This program implemented an innovative approach to actively distribute the RF power inside the load to avoid excessive heating or arcing associated with constructive interference. The new design implemented materials and assembly changes required to meet specifications. Critical components were built and successfully tested during the program.
1.5 MW RF Load for ITER

Energy Technology Data Exchange (ETDEWEB)

Ives, Robert Lawrence [Calabazas Creek Research, Inc., San Mateo, CA (United States); Marsden, David [Calabazas Creek Research, Inc., San Mateo, CA (United States); Collins, George [Calabazas Creek Research, Inc., San Mateo, CA (United States); Karimov, Rasul [Calabazas Creek Research, Inc., San Mateo, CA (United States); Mizuhara, Max [Calabazas Creek Research, Inc., San Mateo, CA (United States); Neilson, Jeffrey [Lexam Research, Redwood City, CA (United States)

2016-09-01

Calabazas Creek Research, Inc. developed a 1.5 MW RF load for the ITER fusion research facility currently under construction in France. This program leveraged technology developed in two previous SBIR programs that successfully developed high power RF loads for fusion research applications. This program specifically focused on modifications required by revised technical performance, materials, and assembly specification for ITER. This program implemented an innovative approach to actively distribute the RF power inside the load to avoid excessive heating or arcing associated with constructive interference. The new design implemented materials and assembly changes required to meet specifications. Critical components were built and successfully tested during the program.
A cryogenic system design for the international thermonuclear experimental reactor (ITER)

International Nuclear Information System (INIS)

Slack, D.S.

1991-01-01

A conceptual design for ITER was completed last year. The author developed a suitable cryogenic system for ITER as part of this conceptual design effort. An overview of the design is reported. Emphasis is on the fact that cryogenics is a mature science, and a system supporting ITER needs can be made from time-proven components without loss of efficiency or reliability. Because of the large size of the ITER cryogenic system, large numbers of compressors and expanders must be used. Very high reliability is assured by arranging these components in parallel banks where servicing of individual components can be done without interruption of operations. This and other ideas based on the author's experience with Mirror Fusion Test Facility (MFTF) operations are described. 5 refs., 3 figs
Teaching Scientific Computing: A Model-Centered Approach to Pipeline and Parallel Programming with C

Directory of Open Access Journals (Sweden)

Vladimiras Dolgopolovas

2015-01-01

The aim of this study is to present an approach to introducing pipeline and parallel computing, using a model of the multiphase queueing system. Pipeline computing, including software pipelines, is among the key concepts in modern computing and electronics engineering. Modern computer science and engineering education requires a comprehensive curriculum, so an introduction to pipeline and parallel computing is an essential topic to be included in the curriculum. At the same time, the topic is among the most motivating tasks due to the comprehensive multidisciplinary and technical requirements. To enhance the educational process, the paper proposes a novel model-centered framework and develops the relevant learning objects. It allows implementing an educational platform of constructivist learning process, thus enabling learners’ experimentation with the provided programming models, obtaining learners’ competences of the modern scientific research and computational thinking, and capturing the relevant technical knowledge. It also provides an integral platform that allows a simultaneous and comparative introduction to pipelining and parallel computing. The programming language C for developing programming models, and the message passing interface (MPI) and OpenMP parallelization tools, have been chosen for implementation.
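A multiphase queueing system of the kind mentioned above can be modelled with a short recurrence for departure times. The sketch below is an illustration in Python rather than the C/MPI/OpenMP code of the paper, and it assumes first-come-first-served order, one server per phase, and unlimited waiting room between phases; the function name is hypothetical.

```python
def multiphase_departures(arrivals, service):
    """Departure time of every customer from every phase.

    arrivals[n]   : arrival time of customer n at the first phase
    service[n][k] : service time of customer n in phase k
    A customer starts phase k once it has left phase k - 1 and the
    previous customer has left phase k.
    """
    phases = len(service[0])
    done = []
    for n, arrival in enumerate(arrivals):
        row = []
        ready = arrival  # time at which customer n can enter the next phase
        for k in range(phases):
            phase_free = done[n - 1][k] if n > 0 else 0.0
            ready = max(ready, phase_free) + service[n][k]
            row.append(ready)
        done.append(row)
    return done


times = multiphase_departures(
    arrivals=[0.0, 1.0, 2.0],
    service=[[2.0, 1.0], [2.0, 1.0], [2.0, 1.0]],
)
print(times)
```

Each entry depends only on its left and upper neighbours in the table, which is the same wavefront dependency that a software pipeline exploits: phase k can work on customer n while phase k - 1 already serves customer n + 1.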

A Study on GPU-based Iterative ML-EM Reconstruction Algorithm for Emission Computed Tomographic Imaging Systems

Energy Technology Data Exchange (ETDEWEB)

Ha, Woo Seok; Kim, Soo Mee; Park, Min Jae; Lee, Dong Soo; Lee, Jae Sung [Seoul National University, Seoul (Korea, Republic of)

2009-10-15

The maximum likelihood-expectation maximization (ML-EM) is a statistical reconstruction algorithm derived from a probabilistic model of the emission and detection processes. Although ML-EM has many advantages in accuracy and utility, its use is limited by the computational burden of iterative processing on a CPU (central processing unit). In this study, we developed a parallel computing technique on a GPU (graphics processing unit) for the ML-EM algorithm. Using a GeForce 9800 GTX+ graphics card and CUDA (compute unified device architecture), the projection and backprojection in the ML-EM algorithm were parallelized with NVIDIA's technology. The time spent per iteration on projection, on computing the errors between measured and estimated data, and on backprojection was measured. Total time included the latency in data transmission between RAM and GPU memory. The total computation times of the CPU- and GPU-based ML-EM with 32 iterations were 3.83 and 0.26 sec, respectively. In this case, the computing speed was improved about 15 times on the GPU. When the number of iterations was increased to 1024, the CPU- and GPU-based computing took 18 min and 8 sec in total, respectively. The improvement was about 135 times and was caused by delay in the CPU-based computing after certain iterations. On the other hand, the GPU-based computation showed very small variation in time delay per iteration due to the use of shared memory. The GPU-based parallel computation for ML-EM significantly improved the computing speed and stability. The developed GPU-based ML-EM algorithm could be easily modified for other imaging geometries.
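The three per-iteration steps named above (projection, comparison of measured and estimated data, backprojection) correspond to the standard ML-EM update. The following is a minimal sequential sketch of that update on a toy problem, not the authors' CUDA implementation; the tiny system matrix and the function name are assumptions made for illustration.

```python
def mlem(system, measured, iterations=50):
    """Standard ML-EM update on a dense system matrix.

    system[i][j] : weight linking image pixel j to detector bin i
    measured[i]  : measured counts in detector bin i
    """
    n_bins, n_pix = len(system), len(system[0])
    # Sensitivity of each pixel: column sums of the system matrix.
    sens = [sum(system[i][j] for i in range(n_bins)) for j in range(n_pix)]
    image = [1.0] * n_pix
    for _ in range(iterations):
        # Projection: estimated data from the current image.
        est = [sum(system[i][j] * image[j] for j in range(n_pix))
               for i in range(n_bins)]
        # Comparison: ratio of measured to estimated data.
        ratio = [m / e if e > 0 else 0.0 for m, e in zip(measured, est)]
        # Backprojection of the ratio, then multiplicative update.
        back = [sum(system[i][j] * ratio[i] for i in range(n_bins))
                for j in range(n_pix)]
        image = [x * b / s for x, b, s in zip(image, back, sens)]
    return image


# Toy example: two pixels, three detector bins, noise-free data from [2, 4].
toy_system = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
toy_measured = [2.0, 4.0, 3.0]
reconstructed = mlem(toy_system, toy_measured)
print(reconstructed)
```

The projection and backprojection lines are sums over independent bins and pixels, which is why these two steps are the natural targets for GPU parallelization.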
A Study on GPU-based Iterative ML-EM Reconstruction Algorithm for Emission Computed Tomographic Imaging Systems

International Nuclear Information System (INIS)

Ha, Woo Seok; Kim, Soo Mee; Park, Min Jae; Lee, Dong Soo; Lee, Jae Sung

2009-01-01

The maximum likelihood-expectation maximization (ML-EM) is a statistical reconstruction algorithm derived from a probabilistic model of the emission and detection processes. Although ML-EM has many advantages in accuracy and utility, its use is limited by the computational burden of iterative processing on a CPU (central processing unit). In this study, we developed a parallel computing technique on a GPU (graphics processing unit) for the ML-EM algorithm. Using a GeForce 9800 GTX+ graphics card and CUDA (compute unified device architecture), the projection and backprojection in the ML-EM algorithm were parallelized with NVIDIA's technology. The time spent per iteration on projection, on computing the errors between measured and estimated data, and on backprojection was measured. Total time included the latency in data transmission between RAM and GPU memory. The total computation times of the CPU- and GPU-based ML-EM with 32 iterations were 3.83 and 0.26 sec, respectively. In this case, the computing speed was improved about 15 times on the GPU. When the number of iterations was increased to 1024, the CPU- and GPU-based computing took 18 min and 8 sec in total, respectively. The improvement was about 135 times and was caused by delay in the CPU-based computing after certain iterations. On the other hand, the GPU-based computation showed very small variation in time delay per iteration due to the use of shared memory. The GPU-based parallel computation for ML-EM significantly improved the computing speed and stability. The developed GPU-based ML-EM algorithm could be easily modified for other imaging geometries.
Massively Parallel Finite Element Programming

KAUST Repository

Heister, Timo; Kronbichler, Martin; Bangerth, Wolfgang

2010-01-01

Today's large finite element simulations require parallel algorithms to scale on clusters with thousands or tens of thousands of processor cores. We present data structures and algorithms to take advantage of the power of high performance computers in generic finite element codes. Existing generic finite element libraries often restrict the parallelization to parallel linear algebra routines. This is a limiting factor when solving on more than a few hundreds of cores. We describe routines for distributed storage of all major components coupled with efficient, scalable algorithms. We give an overview of our effort to enable the modern and generic finite element library deal.II to take advantage of the power of large clusters. In particular, we describe the construction of a distributed mesh and develop algorithms to fully parallelize the finite element calculation. Numerical results demonstrate good scalability. © 2010 Springer-Verlag.
Massively Parallel Finite Element Programming

KAUST Repository

Heister, Timo

2010-01-01

Today's large finite element simulations require parallel algorithms to scale on clusters with thousands or tens of thousands of processor cores. We present data structures and algorithms to take advantage of the power of high performance computers in generic finite element codes. Existing generic finite element libraries often restrict the parallelization to parallel linear algebra routines. This is a limiting factor when solving on more than a few hundreds of cores. We describe routines for distributed storage of all major components coupled with efficient, scalable algorithms. We give an overview of our effort to enable the modern and generic finite element library deal.II to take advantage of the power of large clusters. In particular, we describe the construction of a distributed mesh and develop algorithms to fully parallelize the finite element calculation. Numerical results demonstrate good scalability. © 2010 Springer-Verlag.
Conceptual design of SC magnet system for ITER, (6)

International Nuclear Information System (INIS)

Yoshida, Kiyoshi; Sugimoto, Makoto; Tsuji, Hiroshi

1991-08-01

The International Thermonuclear Experimental Reactor (ITER) is an experimental thermonuclear tokamak reactor intended to test basic physics performance and technologies. The conceptual design activity (CDA) of ITER required joint work at a technical site at the Max Planck Institute for Plasma Physics in Garching, Germany, from 1988 to 1990. The technical proposals from Japan were summarized by the Fusion Experimental Reactor (FER) Team and the Superconducting Magnet Laboratory of the Japan Atomic Energy Research Institute (JAERI). This paper describes the Japanese contributions of R and D proposals to the magnet system for ITER. These proposals were discussed in the ITER CDA design team and summarized in ITER Technical Report No. 20. The development program for the Toroidal Field Coil was proposed largely by Japan, together with the design and analysis reports. Most of the Japanese proposals were adopted in the ITER Long-Term R and D program. (author)
Parallelizing More Loops with Compiler Guided Refactoring

DEFF Research Database (Denmark)

Larsen, Per; Ladelsky, Razya; Lidman, Jacob

2012-01-01

an interactive compilation feedback system that guides programmers in iteratively modifying their application source code. This helps leverage the compiler’s ability to generate loop-parallel code. We employ our system to modify two sequential benchmarks dealing with image processing and edge detection...
Industrial opportunities on the International Thermonuclear Experimental Reactor (ITER) project

International Nuclear Information System (INIS)

Ellis, W.R.

1996-01-01

Industry has been a long-term contributor to the magnetic fusion program, playing a variety of important roles over the years. Manufacturing firms, engineering-construction companies, and the electric utility industry should all be regarded as legitimate stakeholders in the fusion energy program. In a program focused primarily on energy production, industry's future roles should follow in a natural way, leading to the commercialization of the technology. In a program focused primarily on science and technology, industry's roles, in the near term, should be, in addition to operating existing research facilities, largely devoted to providing industrial support to the International Thermonuclear Experimental Reactor (ITER) Project. Industrial opportunities on the ITER Project will be guided by the amount of funding available to magnetic fusion generally, since ITER is funded as a component of that program. The ITER Project can conveniently be discussed in terms of its phases, namely, the present Engineering Design Activities (EDA) phase, and the future (as yet not approved) construction phase. 2 refs., 3 tabs
Comparison of multihardware parallel implementations for a phase unwrapping algorithm

Science.gov (United States)

Hernandez-Lopez, Francisco Javier; Rivera, Mariano; Salazar-Garibay, Adan; Legarda-Sáenz, Ricardo

2018-04-01

Phase unwrapping is an important problem in the areas of optical metrology, synthetic aperture radar (SAR) image analysis, and magnetic resonance imaging (MRI) analysis. These images are becoming larger in size and, particularly, the availability and need for processing of SAR and MRI data have increased significantly with the acquisition of remote sensing data and the popularization of magnetic resonators in clinical diagnosis. Therefore, it is important to develop faster and more accurate phase unwrapping algorithms. We propose a parallel multigrid algorithm for a phase unwrapping method named accumulation of residual maps, which builds on a serial algorithm that consists of the minimization of a cost function; the minimization is achieved by means of a serial Gauss-Seidel type algorithm. Our algorithm also optimizes the original cost function, but unlike the original work, our algorithm is a parallel Jacobi-class method with alternating minimizations. This strategy is known as the chessboard type, where red pixels can be updated in parallel in the same iteration since they are independent. Similarly, black pixels can be updated in parallel in an alternating iteration. We present parallel implementations of our algorithm for different parallel multicore architectures such as multicore CPUs, the Xeon Phi coprocessor, and Nvidia graphics processing units. In all cases, we obtain a superior performance of our parallel algorithm when compared with the original serial version. In addition, we present a detailed performance comparison of the developed parallel versions.
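The chessboard (red-black) ordering mentioned in this abstract can be illustrated with a small sketch. This is not the authors' phase unwrapping code: it assumes a toy problem (neighbour averaging on a grid with a fixed boundary) in place of their cost function, and the names `red_black_sweep` and `solve` are hypothetical. It only shows why cells of one colour are independent of each other.

```python
def red_black_sweep(grid):
    """One chessboard sweep: update all 'red' interior cells, then all 'black' ones."""
    rows, cols = len(grid), len(grid[0])
    for colour in (0, 1):  # 0 = red cells (i + j even), 1 = black cells (i + j odd)
        for i in range(1, rows - 1):
            for j in range(1, cols - 1):
                if (i + j) % 2 == colour:
                    # All four neighbours have the opposite colour, so every cell
                    # of the current colour could be updated concurrently.
                    grid[i][j] = 0.25 * (grid[i - 1][j] + grid[i + 1][j]
                                         + grid[i][j - 1] + grid[i][j + 1])
    return grid


def solve(grid, sweeps=200):
    """Repeat the red-black sweep a fixed number of times (toy stopping rule)."""
    for _ in range(sweeps):
        red_black_sweep(grid)
    return grid


# Toy problem (an assumption for illustration): boundary fixed at 1.0, interior starts at 0.0.
n = 6
grid = [[1.0 if i in (0, n - 1) or j in (0, n - 1) else 0.0 for j in range(n)]
        for i in range(n)]
solve(grid)
```

In a parallel implementation the two inner loops for one colour would be distributed over threads or GPU work-items, with a synchronization point between the red and the black half-sweep.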
An iterative method for tri-level quadratic fractional programming problems using fuzzy goal programming approach

Science.gov (United States)

Kassa, Semu Mitiku; Tsegay, Teklay Hailay

2017-08-01

Tri-level optimization problems are optimization problems with three nested hierarchical structures, where in most cases conflicting objectives are set at each level of hierarchy. Such problems are common in management, engineering designs and in decision making situations in general, and are known to be strongly NP-hard. Existing solution methods lack universality in solving these types of problems. In this paper, we investigate a tri-level programming problem with quadratic fractional objective functions at each of the three levels. A solution algorithm is proposed that applies a fuzzy goal programming approach and reformulates the fractional constraints as equivalent non-fractional, non-linear constraints. Based on the transformed formulation, an iterative procedure is developed that can yield a satisfactory solution to the tri-level problem. The numerical results on various illustrative examples demonstrate that the proposed algorithm is very promising and that it can also be used to solve larger-sized as well as n-level problems of similar structure.
Feasibility studies for a high energy physics MC program on massive parallel platforms

International Nuclear Information System (INIS)

Bertolotto, L.M.; Peach, K.J.; Apostolakis, J.; Bruschini, C.E.; Calafiura, P.; Gagliardi, F.; Metcalf, M.; Norton, A.; Panzer-Steindel, B.

1994-01-01

The parallelization of a Monte Carlo program for the NA48 experiment is presented. As a first step, a task farming structure was realized. Based on this, a further step, making use of a distributed database for showers in the electro-magnetic calorimeter, was implemented. Further possibilities for using parallel processing for a quasi-real time calibration of the calorimeter are described
A Parallel Genetic Algorithm for Automated Electronic Circuit Design

Science.gov (United States)

Long, Jason D.; Colombano, Silvano P.; Haith, Gary L.; Stassinopoulos, Dimitris

2000-01-01

Parallelized versions of genetic algorithms (GAs) are popular primarily for three reasons: the GA is an inherently parallel algorithm, typical GA applications are very compute intensive, and powerful computing platforms, especially Beowulf-style computing clusters, are becoming more affordable and easier to implement. In addition, the low communication bandwidth required allows the use of inexpensive networking hardware such as standard office ethernet. In this paper we describe a parallel GA and its use in automated high-level circuit design. Genetic algorithms are a type of trial-and-error search technique that is guided by principles of Darwinian evolution. Just as the genetic material of two living organisms can intermix to produce offspring that are better adapted to their environment, GAs expose genetic material, frequently strings of 1s and 0s, to the forces of artificial evolution: selection, mutation, recombination, etc. GAs start with a pool of randomly-generated candidate solutions which are then tested and scored with respect to their utility. Solutions are then bred by probabilistically selecting high quality parents and recombining their genetic representations to produce offspring solutions. Offspring are typically subjected to a small amount of random mutation. After a pool of offspring is produced, this process iterates until a satisfactory solution is found or an iteration limit is reached. Genetic algorithms have been applied to a wide variety of problems in many fields, including chemistry, biology, and many engineering disciplines. There are many styles of parallelism used in implementing parallel GAs. One such method is called the master-slave or processor farm approach. In this technique, slave nodes are used solely to compute fitness evaluations (the most time consuming part). The master processor collects fitness scores from the nodes and performs the genetic operators (selection, reproduction, variation, etc.). Because of dependency
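The processor-farm structure described in this abstract can be sketched as follows. This is a minimal illustration and not the authors' circuit-design system. The bit-counting objective, the thread pool standing in for cluster nodes, the top-half parent selection, and the names `fitness` and `evolve` are all assumptions made for the example.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def fitness(bits):
    """Toy objective (an assumption): count the 1s in the bit string."""
    return sum(bits)


def evolve(pop_size=20, length=16, generations=30, mutation_rate=0.02, seed=0):
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    history = []  # best score seen in each generation
    with ThreadPoolExecutor(max_workers=4) as workers:
        for _ in range(generations):
            # Workers only evaluate fitness; the master performs the genetic operators.
            scores = list(workers.map(fitness, population))
            history.append(max(scores))
            ranked = [ind for _, ind in sorted(zip(scores, population),
                                               key=lambda pair: pair[0], reverse=True)]
            parents = ranked[:pop_size // 2]   # simplified selection: keep the top half
            children = [ranked[0][:]]          # elitism: carry the best individual over
            while len(children) < pop_size:
                a, b = rng.sample(parents, 2)
                cut = rng.randrange(1, length)  # single-point recombination
                child = a[:cut] + b[cut:]
                child = [bit ^ 1 if rng.random() < mutation_rate else bit for bit in child]
                children.append(child)
            population = children
        history.append(max(workers.map(fitness, population)))
    return max(population, key=fitness), history


best, history = evolve()
```

Because only the fitness calls cross the master/worker boundary, each generation needs one scatter and one gather, which fits the low communication bandwidth the abstract mentions.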
Parallel programming with Python

CERN Document Server

Palach, Jan

2014-01-01

A fast, easy-to-follow and clear tutorial to help you develop parallel computing systems using Python. Along with explaining the fundamentals, the book will also introduce you to slightly advanced concepts and will help you in implementing these techniques in the real world. If you are an experienced Python programmer and are willing to utilize the available computing resources by parallelizing applications in a simple way, then this book is for you. You are required to have a basic knowledge of Python development to get the most out of this book.
A massively parallel discrete ordinates response matrix method for neutron transport

International Nuclear Information System (INIS)

Hanebutte, U.R.; Lewis, E.E.

1992-01-01

In this paper a discrete ordinates response matrix method is formulated with anisotropic scattering for the solution of neutron transport problems on massively parallel computers. The response matrix formulation eliminates iteration on the scattering source. The nodal matrices that result from the diamond-differenced equations are utilized in a factored form that minimizes memory requirements and significantly reduces the number of arithmetic operations required per node. The red-black solution algorithm utilizes massive parallelism by assigning each spatial node to one or more processors. The algorithm is accelerated by a synthetic method in which the low-order diffusion equations are also solved by massively parallel red-black iterations. The method is implemented on a 16K Connection Machine-2, and S/sub 8/ and S/sub 16/ solutions are obtained for fixed-source benchmark problems in x-y geometry
Experimental test campaign on an ITER divertor mock-up

Energy Technology Data Exchange (ETDEWEB)

Dell'Orco, G.; Malavasi, A.; Merola, M.; Polazzi, G.; Simoncini, M.; Zito, D.

2002-11-01

In 1998, in the frame of the European R and D on ITER high heat flux components, the fabrication of a full scale ITER Divertor Outboard mock-up was launched. It comprised a Cassette Body (CB), designed with some mechanical and hydraulic simplifications with respect to the reference body and its actively cooled Dummy Armour Prototype (DAP). This DAP consists of a Vertical Target (VT), a Wing (WI) and a Dump Target (DT), manufactured by European industries, which are integrated to the Gas Box Liner (GBL) supplied by the Russian Federation ITER Home Team. In 1999, in parallel with the manufacturing activity, the ITER European Home Team decided to assign to ENEA a Task for checking the component integration and performing the thermal-hydraulic and thermal mechanical testing of the DAP and CB. In 1999-2000, ENEA performed the experimental campaign at Brasimone Labs. The present work presents the experimental results of the component integration and the thermal-hydraulic and thermo-mechanical fatigue tests.
Experimental test campaign on an ITER divertor mock-up

International Nuclear Information System (INIS)

Dell'Orco, G.; Malavasi, A.; Merola, M.; Polazzi, G.; Simoncini, M.; Zito, D.

2002-01-01

In 1998, in the frame of the European R and D on ITER high heat flux components, the fabrication of a full scale ITER Divertor Outboard mock-up was launched. It comprised a Cassette Body (CB), designed with some mechanical and hydraulic simplifications with respect to the reference body and its actively cooled Dummy Armour Prototype (DAP). This DAP consists of a Vertical Target (VT), a Wing (WI) and a Dump Target (DT), manufactured by European industries, which are integrated to the Gas Box Liner (GBL) supplied by the Russian Federation ITER Home Team. In 1999, in parallel with the manufacturing activity, the ITER European Home Team decided to assign to ENEA a Task for checking the component integration and performing the thermal-hydraulic and thermal mechanical testing of the DAP and CB. In 1999-2000, ENEA performed the experimental campaign at Brasimone Labs. The present work presents the experimental results of the component integration and the thermal-hydraulic and thermo-mechanical fatigue tests
Numeric algorithms for parallel processors computer architectures with applications to the few-groups neutron diffusion equations

International Nuclear Information System (INIS)

Zee, S.K.

1987-01-01

A numeric algorithm and an associated computer code were developed for the rapid solution of the finite-difference method representation of the few-group neutron-diffusion equations on parallel computers. Applications of the numeric algorithm on both SIMD (vector pipeline) and MIMD/SIMD (multi-CPU/vector pipeline) architectures were explored. The algorithm was successfully implemented in the two-group, 3-D neutron diffusion computer code named DIFPAR3D (DIFfusion PARallel 3-Dimension). Numerical-solution techniques used in the code include the Chebyshev polynomial acceleration technique in conjunction with the power method of outer iteration. For inner iterations, a parallel form of red-black (cyclic) line SOR is incorporated, with automated determination of the group-dependent relaxation factors and iteration numbers required to achieve a specified inner iteration error tolerance. The code employs a macroscopic depletion model with trace capability for selected fission products' transients and critical boron. In addition, moderator and fuel temperature feedback models are incorporated into the DIFPAR3D code for realistic simulation of power reactor cores. The physics models used were proven acceptable in separate benchmarking studies
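The power method named in this abstract as the outer iteration can be sketched in a few lines. This is a generic illustration, not DIFPAR3D: there is no Chebyshev acceleration, no inner SOR solve, and the 2x2 matrix is an arbitrary stand-in for the diffusion operator. The sketch assumes the dominant eigenvalue is positive, and `power_method` is a hypothetical name.

```python
def power_method(A, iterations=200):
    """Estimate the dominant eigenvalue and eigenvector of A by repeated multiplication."""
    n = len(A)
    x = [1.0] * n
    eigenvalue = 0.0
    for _ in range(iterations):
        y = [sum(a * xj for a, xj in zip(row, x)) for row in A]
        # With x scaled to infinity-norm 1, the norm of A x approaches |lambda_max|.
        eigenvalue = max(abs(v) for v in y)
        x = [v / eigenvalue for v in y]
    return eigenvalue, x


# Arbitrary symmetric test matrix (an assumption); its eigenvalues are (5 +/- sqrt(5)) / 2.
A = [[2.0, 1.0], [1.0, 3.0]]
eigenvalue, vector = power_method(A)
```

Each outer iteration is dominated by the matrix-vector product, which is where the parallel inner iterations described in the abstract would take over in a real code.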
Enabling Requirements-Based Programming for Highly-Dependable Complex Parallel and Distributed Systems

Science.gov (United States)

Hinchey, Michael G.; Rash, James L.; Rouff, Christopher A.

2005-01-01

The manual application of formal methods in system specification has produced successes, but in the end, despite any claims and assertions by practitioners, there is no provable relationship between a manually derived system specification or formal model and the customer's original requirements. Complex parallel and distributed systems present the worst case implications for today's dearth of viable approaches for achieving system dependability. No avenue other than formal methods constitutes a serious contender for resolving the problem, and so recognition of requirements-based programming has come at a critical juncture. We describe a new, NASA-developed automated requirements-based programming method that can be applied to certain classes of systems, including complex parallel and distributed systems, to achieve a high degree of dependability.
RF modeling of the ITER-relevant lower hybrid antenna

International Nuclear Information System (INIS)

Hillairet, J.; Ceccuzzi, S.; Belo, J.; Marfisi, L.; Artaud, J.F.; Bae, Y.S.; Berger-By, G.; Bernard, J.M.; Cara, Ph.; Cardinali, A.; Castaldo, C.; Cesario, R.; Decker, J.; Delpech, L.; Ekedahl, A.; Garcia, J.; Garibaldi, P.; Goniche, M.; Guilhem, D.; Hoang, G.T.

2011-01-01

In the frame of the EFDA task HCD-08-03-01, a 5 GHz Lower Hybrid system which should be able to deliver 20 MW CW on ITER and sustain the expected high heat fluxes has been reviewed. The design and overall dimensions of the key RF elements of the launcher and its subsystem have been updated from the 2001 design in collaboration with the ITER organization. Modeling of the LH wave propagation and absorption into the plasma shows that the optimal parallel index must be chosen between 1.9 and 2.0 for the ITER steady-state scenario. The present study has been made with n || = 2.0 but can be adapted for n || = 1.9. Individual components have been studied separately, giving confidence in the global RF design of the whole antenna.
Compiler Technology for Parallel Scientific Computation

Directory of Open Access Journals (Sweden)

Can Özturan

1994-01-01

There is a need for compiler technology that, given the source program, will generate efficient parallel codes for different architectures with minimal user involvement. Parallel computation is becoming indispensable in solving large-scale problems in science and engineering. Yet, the use of parallel computation is limited by the high costs of developing the needed software. To overcome this difficulty we advocate a comprehensive approach to the development of scalable architecture-independent software for scientific computation based on our experience with equational programming language (EPL). Our approach is based on a program decomposition, parallel code synthesis, and run-time support for parallel scientific computation. The program decomposition is guided by the source program annotations provided by the user. The synthesis of parallel code is based on configurations that describe the overall computation as a set of interacting components. Run-time support is provided by the compiler-generated code that redistributes computation and data during object program execution. The generated parallel code is optimized using techniques of data alignment, operator placement, wavefront determination, and memory optimization. In this article we discuss annotations, configurations, parallel code generation, and run-time support suitable for parallel programs written in the functional parallel programming language EPL and in Fortran.
ITER blanket designs

International Nuclear Information System (INIS)

Gohar, Y.; Parker, R.; Rebut, P.H.

1995-01-01

The ITER first wall, blanket, and shield system is being designed to handle 1.5±0.3 GW of fusion power and 3 MWa/m^2 average neutron fluence. In the basic performance phase of ITER operation, the shielding blanket uses austenitic steel structural material and water coolant. The first wall is made of a bimetallic structure, austenitic steel and copper alloy, coated with beryllium, and it is protected by beryllium bumper limiters. The choice of copper first wall is dictated by the surface heat flux values anticipated during ITER operation. The water coolant is used at low pressure and low temperature. A breeding blanket has been designed to satisfy the technical objectives of the Enhanced Performance Phase of ITER operation for the Test Program. The breeding blanket design is geometrically similar to the shielding blanket design except it is a self-cooled liquid lithium system with vanadium structural material. Self-healing electrical insulator (aluminum nitride) is used to reduce the MHD pressure drop in the system. Reactor relevancy, low tritium inventory, low activation material, low decay heat, and a tritium self-sufficiency goal are the main features of the breeding blanket design. (orig.)

Solving large mixed linear models using preconditioned conjugate gradient iteration.

Science.gov (United States)

Strandén, I; Lidauer, M

1999-12-01

Continuous evaluation of dairy cattle with a random regression test-day model requires a fast solving method and algorithm. A new computing technique feasible in Jacobi and conjugate gradient based iterative methods using iteration on data is presented. In the new computing technique, the calculations in multiplication of a vector by a matrix were reordered into three steps instead of the commonly used two steps. The three-step method was implemented in a general mixed linear model program that used preconditioned conjugate gradient iteration. Performance of this program in comparison to other general solving programs was assessed via estimation of breeding values using univariate, multivariate, and random regression test-day models. Central processing unit time per iteration with the new three-step technique was, at best, one-third that needed with the old technique. Performance was best with the test-day model, which was the largest and most complex model used. The new program did well in comparison to other general software. Programs keeping the mixed model equations in random access memory required at least 20 and 435% more time to solve the univariate and multivariate animal models, respectively. Computations of the second best iteration on data took approximately three and five times longer for the animal and test-day models, respectively, than did the new program. Good performance was due to fast computing time per iteration and quick convergence to the final solutions. Use of preconditioned conjugate gradient based methods in solving large breeding value problems is supported by our findings.
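For readers unfamiliar with the solver named in this abstract, a textbook preconditioned conjugate gradient iteration is sketched below. It is not the authors' program. It assumes a small dense symmetric positive definite matrix held in memory and a diagonal (Jacobi) preconditioner, whereas the paper forms the matrix-vector product by iteration on data. The names `matvec`, `dot` and `pcg` are hypothetical.

```python
def matvec(A, x):
    """Dense matrix-vector product; the step the paper reorganizes for iteration on data."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def pcg(A, b, tol=1e-10, max_iter=100):
    """Conjugate gradient with a diagonal preconditioner (assumed choice)."""
    n = len(b)
    inv_diag = [1.0 / A[i][i] for i in range(n)]
    x = [0.0] * n
    r = b[:]                                     # residual b - A x for x = 0
    z = [d * ri for d, ri in zip(inv_diag, r)]   # preconditioned residual
    p = z[:]
    rz = dot(r, z)
    for iteration in range(max_iter):
        if dot(r, r) ** 0.5 < tol:
            return x, iteration
        Ap = matvec(A, p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pj for xi, pj in zip(x, p)]
        r = [ri - alpha * apj for ri, apj in zip(r, Ap)]
        z = [d * ri for d, ri in zip(inv_diag, r)]
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pj for zi, pj in zip(z, p)]
        rz = rz_new
    return x, max_iter


# Small symmetric positive definite test system (an assumption); the exact solution is [1, 1, 1].
A = [[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]]
b = [5.0, 5.0, 3.0]
x, iterations = pcg(A, b)
```

The only place the matrix appears is inside `matvec`, which is why reorganizing that single product, as the abstract describes, can change the cost of every iteration.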
On the adequacy of message-passing parallel supercomputers for solving neutron transport problems

International Nuclear Information System (INIS)

Azmy, Y.Y.

1990-01-01

A coarse-grained, static-scheduling parallelization of the standard iterative scheme used for solving the discrete-ordinates approximation of the neutron transport equation is described. The parallel algorithm is based on a decomposition of the angular domain along the discrete ordinates, thus naturally producing a set of completely uncoupled systems of equations in each iteration. Implementation of the parallel code on Intel's iPSC/2 hypercube, and solutions to test problems are presented as evidence of the high speedup and efficiency of the parallel code. The performance of the parallel code on the iPSC/2 is analyzed, and a model for the CPU time as a function of the problem size (order of angular quadrature) and the number of participating processors is developed and validated against measured CPU times. The performance model is used to speculate on the potential of massively parallel computers for significantly speeding up real-life transport calculations at acceptable efficiencies. We conclude that parallel computers with a few hundred processors are capable of producing large speedups at very high efficiencies in very large three-dimensional problems. 10 refs., 8 figs
News from ITER controls - a status report

International Nuclear Information System (INIS)

Wallander, A.; Abadie, L.; Di Maio, F.; Evrard, B.; Fourneron, J.M.; Gulati, H.; Hansalia, C.; Journeaux, J.Y.; Kim, C.; Klotz, W.D.; Mahajan, K.; Makijarvi, P.; Matsumoto, Y.; Pande, S.; Simrock, S.; Stepanov, D.; Utzel, N.; Vergara, A.; Winter, A.; Yonekawa, I.

2012-01-01

Construction of ITER has started at the Cadarache site in southern France. The first buildings are taking shape and more than 60% of the in-kind procurement has been committed by the seven ITER member states (China, Europe, India, Japan, Korea, Russia and United States). The design and manufacturing of the main components of the machine are now underway all over the world. Each of these components comes with a local control system, which must be integrated in the central control system. The control group at ITER has developed two products to facilitate this integration: the plant control design handbook (PCDH) and the control, data access and communication (CODAC) core system. PCDH is a document which prescribes the technologies and methods to be used in developing local control systems and sets the rules applicable to the in-kind procurements. CODAC core system is a software package, distributed to all in-kind procurement developers, which implements the PCDH and facilitates the compliance of the local control system. In parallel, the ITER control group is proceeding with the design of the central control system to allow fully integrated and automated operation of ITER. In this paper we report on the progress of the design and technology choices and we discuss justifications of those choices. We also report on the results of some pilot projects aimed at validating the design and technologies. (authors)
Introducing PROFESS 2.0: A parallelized, fully linear scaling program for orbital-free density functional theory calculations

Science.gov (United States)

Hung, Linda; Huang, Chen; Shin, Ilgyou; Ho, Gregory S.; Lignères, Vincent L.; Carter, Emily A.

2010-12-01

Orbital-free density functional theory (OFDFT) is a first principles quantum mechanics method to find the ground-state energy of a system by variationally minimizing with respect to the electron density. No orbitals are used in the evaluation of the kinetic energy (unlike Kohn-Sham DFT), and the method scales nearly linearly with the size of the system. The PRinceton Orbital-Free Electronic Structure Software (PROFESS) uses OFDFT to model materials from the atomic scale to the mesoscale. This new version of PROFESS allows the study of larger systems with two significant changes: PROFESS is now parallelized, and the ion-electron and ion-ion terms scale quasilinearly, instead of quadratically as in PROFESS v1 (L. Hung and E.A. Carter, Chem. Phys. Lett. 475 (2009) 163). At the start of a run, PROFESS reads the various input files that describe the geometry of the system (ion positions and cell dimensions), the type of elements (defined by electron-ion pseudopotentials), the actions you want it to perform (minimize with respect to electron density and/or ion positions and/or cell lattice vectors), and the various options for the computation (such as which functionals you want it to use). Based on these inputs, PROFESS sets up a computation and performs the appropriate optimizations. Energies, forces, stresses, material geometries, and electron density configurations are some of the values that can be output throughout the optimization.
New version program summary
Program Title: PROFESS
Catalogue identifier: AEBN_v2_0
Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEBN_v2_0.html
Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland
Licensing provisions: Standard CPC licence, http://cpc.cs.qub.ac.uk/licence/licence.html
No. of lines in distributed program, including test data, etc.: 68 721
No. of bytes in distributed program, including test data, etc.: 1 708 547
Distribution format: tar.gz
Programming language: Fortran 90
Computer
Practical parallel programming

CERN Document Server

Bauer, Barr E

2014-01-01

This is the book that will teach programmers to write faster, more efficient code for parallel processors. The reader is introduced to a vast array of procedures and paradigms on which actual coding may be based. Examples and real-life simulations using these devices are presented in C and FORTRAN.
The FORCE: A portable parallel programming language supporting computational structural mechanics

Science.gov (United States)

Jordan, Harry F.; Benten, Muhammad S.; Brehm, Juergen; Ramanan, Aruna

1989-01-01

This project supports the conversion of codes in Computational Structural Mechanics (CSM) to a parallel form which will efficiently exploit the computational power available from multiprocessors. The work is a part of a comprehensive, FORTRAN-based system to form a basis for a parallel version of the NICE/SPAR combination which will form the CSM Testbed. The software is macro-based and rests on the force methodology developed by the principal investigator in connection with an early scientific multiprocessor. Machine independence is an important characteristic of the system so that retargeting it to the Flex/32, or any other multiprocessor on which NICE/SPAR might be implemented, is well supported. The principal investigator has experience in producing parallel software for both full and sparse systems of linear equations using the force macros. Other researchers have used the Force in finite element programs. It has been possible to rapidly develop software which performs at maximum efficiency on a multiprocessor. The inherent machine independence of the system also means that the parallelization will not be limited to a specific multiprocessor.
Parallelized preconditioned BiCGStab solution of sparse linear system equations in F-COBRA-TF

International Nuclear Information System (INIS)

Geemert, Rene van; Glück, Markus; Riedmann, Michael; Gabriel, Harry

2011-01-01

Recently, the in-house development of a preconditioned and parallelized BiCGStab solver has been pursued successfully in AREVA’s advanced sub-channel code F-COBRA-TF. This solver can be run either in a sequential computation mode on a single CPU, or in a parallel computation mode on multiple parallel CPUs. The developed procedure enables the computation of several thousands of successive sparse linear system solutions in F-COBRA-TF with acceptable wall clock run times. The current paper provides general information about F-COBRA-TF in terms of modeling capabilities and application areas, and points out where the relevance arises for the efficient iterative solution of sparse linear systems. Furthermore, the preconditioning and parallelization strategies in the developed BiCGStab iterative solution approach are discussed. The paper is concluded with a number of verification examples. (author)
Accelerating nuclear configuration interaction calculations through a preconditioned block iterative eigensolver

Science.gov (United States)

Shao, Meiyue; Aktulga, H. Metin; Yang, Chao; Ng, Esmond G.; Maris, Pieter; Vary, James P.

2018-01-01

We describe a number of recently developed techniques for improving the performance of large-scale nuclear configuration interaction calculations on high performance parallel computers. We show the benefit of using a preconditioned block iterative method to replace the Lanczos algorithm that has traditionally been used to perform this type of computation. The rapid convergence of the block iterative method is achieved by a proper choice of starting guesses of the eigenvectors and the construction of an effective preconditioner. These acceleration techniques take advantage of special structure of the nuclear configuration interaction problem which we discuss in detail. The use of a block method also allows us to improve the concurrency of the computation, and take advantage of the memory hierarchy of modern microprocessors to increase the arithmetic intensity of the computation relative to data movement. We also discuss the implementation details that are critical to achieving high performance on massively parallel multi-core supercomputers, and demonstrate that the new block iterative solver is two to three times faster than the Lanczos based algorithm for problems of moderate sizes on a Cray XC30 system.
Evaluation of global synchronization for iterative algebra algorithms on many-core

KAUST Repository

ul Hasan Khan, Ayaz; Al-Mouhamed, Mayez; Firdaus, Lutfi A.

2015-01-01

© 2015 IEEE. Massively parallel computing is applied extensively in various scientific and engineering domains. With the growing interest in many-core architectures and due to the lack of explicit support for inter-block synchronization specifically in GPUs, synchronization becomes necessary to minimize inter-block communication time. In this paper, we have proposed two new inter-block synchronization techniques: 1) Relaxed Synchronization, and 2) Block-Query Synchronization. These schemes are used in implementing numerical iterative solvers, where overlapping computation and communication is one optimization used to enhance application performance. We have evaluated and analyzed the performance of the proposed synchronization techniques using a Jacobi Iterative Solver in comparison to the state-of-the-art inter-block lock-free synchronization techniques. We have achieved about 1-8% performance improvement in terms of execution time over lock-free synchronization depending on the problem size and the number of thread blocks. We have also evaluated the proposed algorithm on GPU and MIC architectures and obtained about 8-26% performance improvement over the barrier synchronization available in the OpenMP programming environment depending on the problem size and number of cores used.
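The role of global synchronization in a Jacobi solver can be shown with a small sketch. It uses a plain barrier, that is, the baseline the abstract compares against, and not the proposed Relaxed or Block-Query schemes. Python threads stand in for GPU thread blocks, and the name `jacobi_parallel`, the row distribution and the test system are assumptions for illustration.

```python
import threading


def jacobi_parallel(A, b, num_threads=2, iterations=60):
    """Jacobi iteration with rows split across threads and one barrier per sweep."""
    n = len(b)
    buffers = [[0.0] * n, [0.0] * n]          # ping-pong buffers: read one, write the other
    barrier = threading.Barrier(num_threads)

    def worker(tid):
        my_rows = range(tid, n, num_threads)  # cyclic row distribution
        for k in range(iterations):
            old, new = buffers[k % 2], buffers[(k + 1) % 2]
            for i in my_rows:
                off_diag = sum(A[i][j] * old[j] for j in range(n) if j != i)
                new[i] = (b[i] - off_diag) / A[i][i]
            # Global synchronization: nobody starts sweep k + 1 until every
            # thread has finished writing its rows for sweep k.
            barrier.wait()

    threads = [threading.Thread(target=worker, args=(t,)) for t in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return buffers[iterations % 2]


# Diagonally dominant test system (an assumption); the exact solution is [1, 1, 1].
A = [[4.0, -1.0, 0.0], [-1.0, 4.0, -1.0], [0.0, -1.0, 4.0]]
b = [3.0, 2.0, 3.0]
x = jacobi_parallel(A, b)
```

Every sweep pays for one `barrier.wait()`, so the cost of that call relative to the row updates is the quantity that cheaper synchronization schemes aim to reduce.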
Evaluation of global synchronization for iterative algebra algorithms on many-core

KAUST Repository

ul Hasan Khan, Ayaz

2015-06-01

© 2015 IEEE. Massively parallel computing is applied extensively in various scientific and engineering domains. With the growing interest in many-core architectures, and because GPUs in particular lack explicit support for inter-block synchronization, efficient synchronization becomes necessary to minimize inter-block communication time. In this paper, we propose two new inter-block synchronization techniques: 1) Relaxed Synchronization, and 2) Block-Query Synchronization. These schemes are used in implementing numerical iterative solvers, where overlapping computation with communication is one commonly used optimization to enhance application performance. We evaluate and analyze the performance of the proposed synchronization techniques using a Jacobi iterative solver, in comparison with state-of-the-art inter-block lock-free synchronization techniques. We achieve about a 1-8% improvement in execution time over lock-free synchronization, depending on the problem size and the number of thread blocks. We also evaluate the proposed algorithm on GPU and MIC architectures and obtain about an 8-26% performance improvement over the barrier synchronization available in the OpenMP programming environment, depending on the problem size and the number of cores used.
Plane-wave electronic structure calculations on a parallel supercomputer

International Nuclear Information System (INIS)

Nelson, J.S.; Plimpton, S.J.; Sears, M.P.

1993-01-01

The development of iterative solutions of Schrödinger's equation in a plane-wave (pw) basis over the last several years has coincided with great advances in the computational power available for performing the calculations. These dual developments have enabled many new and interesting condensed matter phenomena to be studied from a first-principles approach. The authors present a detailed description of the implementation on a parallel supercomputer (hypercube) of the first-order equation-of-motion solution to Schrödinger's equation, using plane-wave basis functions and ab initio separable pseudopotentials. By distributing the plane waves across the processors of the hypercube, many of the computations can be performed in parallel, resulting in decreases in the overall computation time relative to conventional vector supercomputers. This partitioning also provides ample memory for large Fast Fourier Transform (FFT) meshes and the storage of plane-wave coefficients for many hundreds of energy bands. The usefulness of the parallel techniques is demonstrated by benchmark timings for both the FFTs and iterations of the self-consistent solution of Schrödinger's equation for different-sized Si unit cells of up to 512 atoms
Parallelization and checkpointing of GPU applications through program transformation

Energy Technology Data Exchange (ETDEWEB)

Solano-Quinde, Lizandro Damian [Iowa State Univ., Ames, IA (United States)

2012-01-01

GPUs have emerged as a powerful tool for accelerating general-purpose applications. The availability of programming languages that make writing general-purpose applications for GPUs tractable has consolidated GPUs as an alternative for accelerating general-purpose applications. Among the areas that have benefited from GPU acceleration are signal and image processing, computational fluid dynamics, quantum chemistry, and, in general, the High Performance Computing (HPC) industry. In order to continue to exploit higher levels of parallelism with GPUs, multi-GPU systems are gaining popularity. In this context, single-GPU applications are parallelized for running on multi-GPU systems. Furthermore, multi-GPU systems help to overcome the GPU memory limitation for applications with a large memory footprint. Parallelizing single-GPU applications has been approached by libraries that distribute the workload at runtime; however, they impose execution overhead and are not portable. On the other hand, on traditional CPU systems, parallelization has been approached through application transformation at pre-compile time, which enhances the application to distribute the workload at application level and does not have the issues of library-based approaches. Hence, a parallelization scheme for GPU systems based on application transformation is needed. As with any computing engine of today, reliability is also a concern in GPUs. GPUs are vulnerable to transient and permanent failures. Current checkpoint/restart techniques are not suitable for systems with GPUs. Checkpointing for GPU systems presents new and interesting challenges, primarily due to the natural differences imposed by the hardware design, the memory subsystem architecture, the massive number of threads, and the limited amount of synchronization among threads. Therefore, a checkpoint/restart technique suitable for GPU systems is needed. The goal of this work is to exploit higher levels of parallelism and
Development and adjustment of programs for solving systems of linear equations

International Nuclear Information System (INIS)

Fujimura, Toichiro

1978-03-01

Programs for solving the systems of linear equations have been adjusted and developed in expanding the scientific subroutine library SSL. The principal programs adjusted are based on the congruent method, method of product form of the inverse, orthogonal method, Crout's method for sparse system, and acceleration of iterative methods. The programs developed are based on the escalator method, direct parallel residue method and block tridiagonal method for band system. Described are usage of the programs developed and their future improvement. FORTRAN lists with simple examples in tests of the programs are also given. (auth.)
The path from ITER to a power plant - initial results from the ARIES "Pathways" program

International Nuclear Information System (INIS)

Najmabadi, F.

2007-01-01

The US national power plant studies program, ARIES, has initiated a 3-year integrated study, called the "Pathways Program", to investigate what the fusion program needs to do, in addition to successful operation of ITER, in order to transform fusion into a commercial reality. The US power industry and regulatory agencies view the demonstration power plant, DEMO, as a device which is built and operated by industry, possibly with government participation, to demonstrate the commercial readiness of fusion power. As such, the "Pathways" program will investigate what is needed, in addition to successful operation of ITER, to convince industry to move forward with a fusion DEMO. While many reports exist that provide a strategic view of the needs for fusion development, in the ITER era a much more detailed view is needed to provide the necessary information for program planning. By comparing the anticipated results from ITER and existing facilities with the requirements for a power plant in the first phase of the Pathways study, we will develop a comprehensive list of remaining R&D items for developing fusion, will identify metrics for distributing resources among R&D issues, and will identify which of those items can or should be done in existing or simulation facilities. In the second phase of the study, we will develop potential embodiments for the fusion test facility(ies) and explore their cost/performance parametrically. An important by-product of this study is the identification of key R&D issues that can be addressed and resolved in existing facilities to make the fusion facility a cheaper and/or higher-performance device. This paper summarizes the results from the first phase of our study. We have adopted a "holistic" or integrated approach with the focus on the needs of the customer. In such an approach, the remaining R&D should generate all of the information needed by industry to move forward with the DEMO, i.e., data needed to
Dynamic programming in parallel boundary detection with application to ultrasound intima-media segmentation.

Science.gov (United States)

Zhou, Yuan; Cheng, Xinyao; Xu, Xiangyang; Song, Enmin

2013-12-01

Segmentation of carotid artery intima-media in longitudinal ultrasound images for measuring its thickness to predict cardiovascular diseases can be simplified as detecting two nearly parallel boundaries within a certain distance range, when plaque with irregular shapes is not considered. In this paper, we improve the implementation of two dynamic programming (DP) based approaches to parallel boundary detection, dual dynamic programming (DDP) and piecewise linear dual dynamic programming (PL-DDP). Then, a novel DP based approach, dual line detection (DLD), which translates the original 2-D curve position to a 4-D parameter space representing two line segments in a local image segment, is proposed to solve the problem while maintaining efficiency and rotation invariance. To apply the DLD to ultrasound intima-media segmentation, it is imbedded in a framework that employs an edge map obtained from multiplication of the responses of two edge detectors with different scales and a coupled snake model that simultaneously deforms the two contours for maintaining parallelism. The experimental results on synthetic images and carotid arteries of clinical ultrasound images indicate improved performance of the proposed DLD compared to DDP and PL-DDP, with respect to accuracy and efficiency. Copyright © 2013 Elsevier B.V. All rights reserved.
Final Report on ITER Task Agreement 81-08

Energy Technology Data Exchange (ETDEWEB)

Richard L. Moore

2008-03-01

As part of an ITER Implementing Task Agreement (ITA) between the ITER US Participant Team (PT) and the ITER International Team (IT), the INL Fusion Safety Program was tasked to provide the ITER IT with upgrades to the fusion version of the MELCOR 1.8.5 code, including a beryllium dust oxidation model. The purpose of this model is to allow the ITER IT to investigate hydrogen production from beryllium dust layers on hot surfaces inside the ITER vacuum vessel (VV) during in-vessel loss-of-cooling accidents (LOCAs). Also included in the ITER ITA was a task to construct a RELAP5/ATHENA model of the ITER divertor cooling loop to model the draining of the loop during a large ex-vessel pipe break followed by an in-vessel divertor break, and to compare the results to a similar MELCOR model developed by the ITER IT. This report, which is the final report for this agreement, documents the completion of the work scope under this ITER TA, designated as TA 81-08.
Parallel multigrid smoothing: polynomial versus Gauss-Seidel

International Nuclear Information System (INIS)

Adams, Mark; Brezina, Marian; Hu, Jonathan; Tuminaro, Ray

2003-01-01

Gauss-Seidel is often the smoother of choice within multigrid applications. In the context of unstructured meshes, however, maintaining good parallel efficiency is difficult with multiplicative iterative methods such as Gauss-Seidel. This leads us to consider alternative smoothers. We discuss the computational advantages of polynomial smoothers within parallel multigrid algorithms for positive definite symmetric systems. Two particular polynomials are considered: Chebyshev and a multilevel specific polynomial. The advantages of polynomial smoothing over traditional smoothers such as Gauss-Seidel are illustrated on several applications: Poisson's equation, thin-body elasticity, and eddy current approximations to Maxwell's equations. While parallelizing the Gauss-Seidel method typically involves a compromise between a scalable convergence rate and maintaining high flop rates, polynomial smoothers achieve parallel scalable multigrid convergence rates without sacrificing flop rates. We show that, although parallel computers are the main motivation, polynomial smoothers are often surprisingly competitive with Gauss-Seidel smoothers on serial machines
Parallel multigrid smoothing: polynomial versus Gauss-Seidel

Science.gov (United States)

Adams, Mark; Brezina, Marian; Hu, Jonathan; Tuminaro, Ray

2003-07-01

Gauss-Seidel is often the smoother of choice within multigrid applications. In the context of unstructured meshes, however, maintaining good parallel efficiency is difficult with multiplicative iterative methods such as Gauss-Seidel. This leads us to consider alternative smoothers. We discuss the computational advantages of polynomial smoothers within parallel multigrid algorithms for positive definite symmetric systems. Two particular polynomials are considered: Chebyshev and a multilevel specific polynomial. The advantages of polynomial smoothing over traditional smoothers such as Gauss-Seidel are illustrated on several applications: Poisson's equation, thin-body elasticity, and eddy current approximations to Maxwell's equations. While parallelizing the Gauss-Seidel method typically involves a compromise between a scalable convergence rate and maintaining high flop rates, polynomial smoothers achieve parallel scalable multigrid convergence rates without sacrificing flop rates. We show that, although parallel computers are the main motivation, polynomial smoothers are often surprisingly competitive with Gauss-Seidel smoothers on serial machines.
On the Convergence of Iterative Receiver Algorithms Utilizing Hard Decisions

Directory of Open Access Journals (Sweden)

Jürgen F. Rößler

2009-01-01

The convergence of receivers performing iterative hard decision interference cancellation (IHDIC) is analyzed in a general framework for ASK, PSK, and QAM constellations. We first give an overview of IHDIC algorithms known from the literature applied to linear modulation and DS-CDMA-based transmission systems and show the relation to Hopfield neural network theory. It is proven analytically that IHDIC with a serial update scheme always converges to a stable state in the estimated values in the course of the iterations and that IHDIC with a parallel update scheme converges to cycles of length 2. Additionally, we visualize the convergence behavior with the aid of convergence charts. In doing so, we give insight into possible errors occurring in IHDIC, which turn out to be caused by locked error situations. The derived results can be applied directly to those iterative soft decision interference cancellation (ISDIC) receivers whose soft decision functions approach hard decision functions in the course of the iterations.
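The contrast between serial and parallel update schemes can be illustrated with a toy Hopfield-style network. The sketch below is an assumption-laden illustration in Python, not the receiver model from the paper: the two-node coupling matrix `W`, the helper names, and the tie rule (a zero input keeps the previous decision) are all hypothetical choices. With this symmetric, zero-diagonal coupling, the parallel scheme oscillates between two states while the serial scheme settles.

```python
def sign_keep(value, previous):
    """Hard decision: +1 or -1; a zero input keeps the previous decision."""
    if value > 0:
        return 1
    if value < 0:
        return -1
    return previous


def parallel_step(W, s):
    # All decisions are recomputed at once from the old state.
    return [sign_keep(sum(W[i][j] * s[j] for j in range(len(s))), s[i])
            for i in range(len(s))]


def serial_step(W, s):
    # Decisions are updated one at a time; later ones see earlier updates.
    s = list(s)
    for i in range(len(s)):
        s[i] = sign_keep(sum(W[i][j] * s[j] for j in range(len(s))), s[i])
    return s


W = [[0, -1],
     [-1, 0]]          # symmetric coupling, zero diagonal (toy example)
start = [1, 1]

p1 = parallel_step(W, start)   # both decisions flip together
p2 = parallel_step(W, p1)      # and flip back: a cycle of length 2
s1 = serial_step(W, start)     # the second decision sees the first flip
s2 = serial_step(W, s1)        # no further change: a stable state
```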
Chatter suppression methods of a robot machine for ITER vacuum vessel assembly and maintenance

International Nuclear Information System (INIS)

Wu, Huapeng; Wang, Yongbo; Li, Ming; Al-Saedi, Mazin; Handroos, Heikki

2014-01-01

Highlights: • A redundant 10-DOF serial-parallel hybrid robot for ITER assembly and maintenance is presented. • A dynamic model of the robot is developed. • A feedback and feedforward controller is presented to suppress machining vibration of the robot. -- Abstract: In the process of assembly and maintenance of the ITER vacuum vessel (ITER VV), various machining tasks, including threading, milling, welding-defect cutting and flexible hose boring, must be performed from inside the ITER VV by on-site machining tools. A robot machine is a promising option for these tasks, but severe chatter (machine vibration) can occur in the machining process. The chatter vibration degrades the robot accuracy and surface quality, and can even damage the end-effector tools and the robot structure itself. This paper introduces two vibration control methods, one passive and one active. For passive vibration control, a parallel mechanism is presented to increase the stiffness of the robot machine; for active vibration control, a hybrid control method combining a feedforward controller and a nonlinear feedback controller is introduced for chatter suppression. A dynamic model of a hybrid robot and its chatter vibration phenomena are demonstrated. Simulation results are given for the proposed hybrid robot machine, which is developed for the ITER VV assembly and maintenance

Chatter suppression methods of a robot machine for ITER vacuum vessel assembly and maintenance

Energy Technology Data Exchange (ETDEWEB)

Wu, Huapeng; Wang, Yongbo, E-mail: yongbo.wang@lut.fi; Li, Ming; Al-Saedi, Mazin; Handroos, Heikki

2014-10-15

Highlights: • A redundant 10-DOF serial-parallel hybrid robot for ITER assembly and maintenance is presented. • A dynamic model of the robot is developed. • A feedback and feedforward controller is presented to suppress machining vibration of the robot. -- Abstract: In the process of assembly and maintenance of the ITER vacuum vessel (ITER VV), various machining tasks, including threading, milling, welding-defect cutting and flexible hose boring, must be performed from inside the ITER VV by on-site machining tools. A robot machine is a promising option for these tasks, but severe chatter (machine vibration) can occur in the machining process. The chatter vibration degrades the robot accuracy and surface quality, and can even damage the end-effector tools and the robot structure itself. This paper introduces two vibration control methods, one passive and one active. For passive vibration control, a parallel mechanism is presented to increase the stiffness of the robot machine; for active vibration control, a hybrid control method combining a feedforward controller and a nonlinear feedback controller is introduced for chatter suppression. A dynamic model of a hybrid robot and its chatter vibration phenomena are demonstrated. Simulation results are given for the proposed hybrid robot machine, which is developed for the ITER VV assembly and maintenance.
COMPUTATIONAL EFFICIENCY OF A MODIFIED SCATTERING KERNEL FOR FULL-COUPLED PHOTON-ELECTRON TRANSPORT PARALLEL COMPUTING WITH UNSTRUCTURED TETRAHEDRAL MESHES

Directory of Open Access Journals (Sweden)

JONG WOON KIM

2014-04-01

In this paper, we introduce a modified scattering kernel approach that avoids unnecessarily repeated calculations in the scattering source calculation, and we use it with parallel computing to reduce the computation time effectively. Its computational efficiency was tested for three-dimensional fully coupled photon-electron transport problems using our computer program, which solves the multi-group discrete ordinates transport equation by the discontinuous finite element method with unstructured tetrahedral meshes for problems with complicated geometry. The numerical tests show that the modified scattering kernel reduces the elapsed time per iteration by a factor of 17 to 42, not only in single-CPU calculations but also in parallel computing with several CPUs.
IHadoop: Asynchronous iterations for MapReduce

KAUST Repository

Elnikety, Eslam Mohamed Ibrahim

2011-11-01

MapReduce is a distributed programming framework designed to ease the development of scalable data-intensive applications for large clusters of commodity machines. Most machine learning and data mining applications involve iterative computations over large datasets, such as the Web hyperlink structures and social network graphs. Yet, the MapReduce model does not efficiently support this important class of applications. The architecture of MapReduce, most critically its dataflow techniques and task scheduling, is completely unaware of the nature of iterative applications; tasks are scheduled according to a policy that optimizes the execution for a single iteration, which wastes bandwidth, I/O, and CPU cycles when compared with an optimal execution for a consecutive set of iterations. This work presents iHadoop, a modified MapReduce model, and an associated implementation, optimized for iterative computations. The iHadoop model schedules iterations asynchronously. It connects the output of one iteration to the next, allowing both to process their data concurrently. iHadoop's task scheduler exploits inter-iteration data locality by scheduling tasks that exhibit a producer/consumer relation on the same physical machine, allowing a fast local data transfer. For those iterative applications that require satisfying certain criteria before termination, iHadoop runs the check concurrently during the execution of the subsequent iteration to further reduce the application's latency. This paper also describes our implementation of the iHadoop model, and evaluates its performance against Hadoop, the widely used open source implementation of MapReduce. Experiments using different data analysis applications over real-world and synthetic datasets show that iHadoop performs better than Hadoop for iterative algorithms, reducing execution time of iterative applications by 25% on average. Furthermore, integrating iHadoop with HaLoop, a variant Hadoop implementation that caches
IHadoop: Asynchronous iterations for MapReduce

KAUST Repository

Elnikety, Eslam Mohamed Ibrahim; El Sayed, Tamer S.; Ramadan, Hany E.

2011-01-01

MapReduce is a distributed programming framework designed to ease the development of scalable data-intensive applications for large clusters of commodity machines. Most machine learning and data mining applications involve iterative computations over large datasets, such as the Web hyperlink structures and social network graphs. Yet, the MapReduce model does not efficiently support this important class of applications. The architecture of MapReduce, most critically its dataflow techniques and task scheduling, is completely unaware of the nature of iterative applications; tasks are scheduled according to a policy that optimizes the execution for a single iteration, which wastes bandwidth, I/O, and CPU cycles when compared with an optimal execution for a consecutive set of iterations. This work presents iHadoop, a modified MapReduce model, and an associated implementation, optimized for iterative computations. The iHadoop model schedules iterations asynchronously. It connects the output of one iteration to the next, allowing both to process their data concurrently. iHadoop's task scheduler exploits inter-iteration data locality by scheduling tasks that exhibit a producer/consumer relation on the same physical machine, allowing a fast local data transfer. For those iterative applications that require satisfying certain criteria before termination, iHadoop runs the check concurrently during the execution of the subsequent iteration to further reduce the application's latency. This paper also describes our implementation of the iHadoop model, and evaluates its performance against Hadoop, the widely used open source implementation of MapReduce. Experiments using different data analysis applications over real-world and synthetic datasets show that iHadoop performs better than Hadoop for iterative algorithms, reducing execution time of iterative applications by 25% on average. Furthermore, integrating iHadoop with HaLoop, a variant Hadoop implementation that caches
Pattern-Driven Automatic Parallelization

Directory of Open Access Journals (Sweden)

Christoph W. Kessler

1996-01-01

This article describes a knowledge-based system for automatic parallelization of a wide class of sequential numerical codes operating on vectors and dense matrices, and for execution on distributed memory message-passing multiprocessors. Its main feature is a fast and powerful pattern recognition tool that locally identifies frequently occurring computations and programming concepts in the source code. This tool also works for dusty deck codes that have been "encrypted" by former machine-specific code transformations. Successful pattern recognition guides sophisticated code transformations including local algorithm replacement such that the parallelized code need not emerge from the sequential program structure by just parallelizing the loops. It allows access to an expert's knowledge on useful parallel algorithms, available machine-specific library routines, and powerful program transformations. The partially restored program semantics also supports local array alignment, distribution, and redistribution, and allows for faster and more exact prediction of the performance of the parallelized target code than is usually possible.
ITMETH, Iterative Routines for Linear System

International Nuclear Information System (INIS)

Greenbaum, A.

1989-01-01

1 - Description of program or function: ITMETH is a collection of iterative routines for solving large, sparse linear systems. 2 - Method of solution: ITMETH solves general linear systems of the form AX=B using a variety of methods: Jacobi iteration; Gauss-Seidel iteration; incomplete LU decomposition or matrix splitting with iterative refinement; diagonal scaling, matrix splitting, or incomplete LU decomposition with the conjugate gradient method for the problem AA'Y=B, X=A'Y; bi-conjugate gradient method with diagonal scaling, matrix splitting, or incomplete LU decomposition; and ortho-min method with diagonal scaling, matrix splitting, or incomplete LU decomposition. ITMETH also solves symmetric positive definite linear systems AX=B using the conjugate gradient method with diagonal scaling or matrix splitting, or the incomplete Cholesky conjugate gradient method
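As an illustration of the simplest methods in this list, the following Python sketch shows a Gauss-Seidel sweep for AX=B. It is not ITMETH's interface or code; the function name `gauss_seidel`, the dense list-of-lists storage, and the strictly diagonally dominant test system are assumptions made for the example (a real sparse solver would store only the nonzeros).

```python
def gauss_seidel(A, b, tol=1e-10, max_sweeps=500):
    """Solve A x = b by Gauss-Seidel iteration; returns (x, sweeps_used)."""
    n = len(b)
    x = [0.0] * n
    for sweep in range(1, max_sweeps + 1):
        change = 0.0
        for i in range(n):
            # x[0..i-1] already hold this sweep's values, x[i+1..] the
            # previous sweep's values: updates are applied in place.
            s = sum(A[i][j] * x[j] for j in range(n) if j != i)
            new_xi = (b[i] - s) / A[i][i]
            change = max(change, abs(new_xi - x[i]))
            x[i] = new_xi
        if change < tol:
            return x, sweep
    return x, max_sweeps


# Hypothetical strictly diagonally dominant system with solution (1, -2, 1).
A = [[10.0, 2.0, 1.0],
     [1.0, 5.0, 1.0],
     [2.0, 3.0, 10.0]]
b = [7.0, -8.0, 6.0]
x, sweeps = gauss_seidel(A, b)
```

Jacobi iteration differs only in reading every value from the previous sweep, which costs a second vector but makes the row updates independent of one another.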
Development and test of prototype components for ITER

International Nuclear Information System (INIS)

Biel, Wolfgang; Behr, Wilfried; Castano-Bardawil, David

2015-08-01

The scientific program of the project is divided into the following subprojects: (1.) ITER Diagnostic Port Plug for charge-exchange recombination spectroscopy (CXRS), with the subtopics: (a) development of prototypes for critical mechanical components, (b) development of a robot for the laser welding of vacuum seals and piping at the Port Plug, (c) mirror studies, (d) CXRS prototype spectrometer; (2.) ITER tritium retention diagnostics (TR); (3.) ITER disruption mitigation valve (DMV).
Design and fabrication of the 'ITER-like' SINGAP D- acceleration system

International Nuclear Information System (INIS)

Massmann, P.; Esch, H.P.L. de; Hemsworth, R.S.; Svensson, L.

2005-01-01

To demonstrate ITER NBI (1 MV, 40 A) relevant beam optics in the Cadarache 1 MV, 100 mA test bed, a new D⁻ beam source system has been put into operation. The system retains a maximum of the ITER SINGAP key parameters, e.g. the perveance-matched D⁻ current density at 1 MeV is 20 mA/cm². The accelerator parameters are identical to the ITER SINGAP design, aiming at a near-parallel 1 MeV beam of 5 mrad divergence. The design is also aimed at demonstrating SINGAP 'on to off-axis' beam steering by a simple transverse displacement of the post-acceleration electrode. First beams up to 850 keV have been obtained after only 4 weeks of commissioning
MPI_XSTAR: MPI-based parallelization of XSTAR program

Science.gov (United States)

Danehkar, A.

2017-12-01

MPI_XSTAR parallelizes execution of multiple XSTAR runs using Message Passing Interface (MPI). XSTAR (ascl:9910.008), part of the HEASARC's HEAsoft (ascl:1408.004) package, calculates the physical conditions and emission spectra of ionized gases. MPI_XSTAR invokes XSTINITABLE from HEASoft to generate a job list of XSTAR commands for given physical parameters. The job list is used to make directories in ascending order, where each individual XSTAR is spawned on each processor and outputs are saved. HEASoft's XSTAR2TABLE program is invoked upon the contents of each directory in order to produce table model FITS files for spectroscopy analysis tools.
Parallel preconditioned conjugate gradient algorithm applied to neutron diffusion problem

International Nuclear Information System (INIS)

Majumdar, A.; Martin, W.R.

1992-01-01

Numerical solution of the neutron diffusion problem requires solving a linear system of equations such as Ax = b, where A is an n x n symmetric positive definite (SPD) matrix; x and b are vectors with n components. The preconditioned conjugate gradient (PCG) algorithm is an efficient iterative method for solving such a linear system of equations. In this paper, the authors describe the implementation of a parallel PCG algorithm on a shared memory machine (BBN TC2000) and on a distributed workstation (IBM RS6000) environment created by the Parallel Virtual Machine (PVM) parallelization software
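For readers unfamiliar with PCG, a minimal serial sketch in Python follows. It is not the authors' implementation: the diagonal (Jacobi) preconditioner, the function names, and the small SPD test matrix are assumptions chosen for illustration, and the abstract does not say which preconditioner was used.

```python
def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def matvec(A, x):
    return [dot(row, x) for row in A]


def pcg(A, b, tol=1e-10, max_iter=100):
    """Conjugate gradient for SPD A with a diagonal (Jacobi) preconditioner."""
    n = len(b)
    x = [0.0] * n
    r = list(b)                                  # r = b - A x with x = 0
    z = [r[i] / A[i][i] for i in range(n)]       # z = M^{-1} r, M = diag(A)
    p = list(z)
    rz = dot(r, z)
    for k in range(1, max_iter + 1):
        Ap = matvec(A, p)
        alpha = rz / dot(p, Ap)                  # step length along p
        x = [x[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * Ap[i] for i in range(n)]
        if dot(r, r) ** 0.5 < tol:               # residual 2-norm test
            return x, k
        z = [r[i] / A[i][i] for i in range(n)]
        rz_new = dot(r, z)
        beta = rz_new / rz
        p = [z[i] + beta * p[i] for i in range(n)]   # new search direction
        rz = rz_new
    return x, max_iter


# Hypothetical SPD system with exact solution (1, 2, 3).
A = [[4.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 2.0]]
b = [6.0, 10.0, 8.0]
x, iters = pcg(A, b)
```

In a parallel setting the matrix-vector product and the dot products are the operations that are typically distributed; the dot products require a global reduction in every iteration.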
Methodologies and Tools for Tuning Parallel Programs: 80% Art, 20% Science, and 10% Luck

Science.gov (United States)

Yan, Jerry C.; Bailey, David (Technical Monitor)

1996-01-01

The need for computing power has forced a migration from serial computation on a single processor to parallel processing on multiprocessors. However, without effective means to monitor (and analyze) program execution, tuning the performance of parallel programs becomes exponentially difficult as program complexity and machine size increase. In the past few years, the ubiquitous introduction of performance tuning tools from various supercomputer vendors (Intel's ParAide, TMC's PRISM, CRI's Apprentice, and Convex's CXtrace) seems to indicate the maturity of performance instrumentation/monitor/tuning technologies and vendors'/customers' recognition of their importance. However, a few important questions remain: What kind of performance bottlenecks can these tools detect (or correct)? How time consuming is the performance tuning process? What are some important technical issues that remain to be tackled in this area? This workshop reviews the fundamental concepts involved in analyzing and improving the performance of parallel and heterogeneous message-passing programs. Several alternative strategies will be contrasted, and for each we will describe how currently available tuning tools (e.g. AIMS, ParAide, PRISM, Apprentice, CXtrace, ATExpert, Pablo, IPS-2) can be used to facilitate the process. We will characterize the effectiveness of the tools and methodologies based on actual user experiences at NASA Ames Research Center. Finally, we will discuss their limitations and outline recent approaches taken by vendors and the research community to address them.
Measurement and control system for the ITER remote handling mock-up test

International Nuclear Information System (INIS)

Oka, K.; Kakudate, S.; Takiguchi, Y.; Ako, K.; Taguchi, K.; Tada, E.; Ozaki, F.; Shibanuma, K.

1998-01-01

The mock-up test platforms, composed of full-scale remote handling (RH) equipment, were developed for demonstrating remote replacement of the ITER blanket and divertor. In parallel, the measurement and control system for operating this RH equipment was constructed on the basis of an open architecture with object-oriented features, aiming at the fully remote automatic operation required for ITER. This paper describes the design concept of the measurement and control system for the remote handling equipment of ITER, and outlines the measured performance of the fabricated measurement system for the remote handling mock-up tests, which includes a Data Acquisition System (DAS), a Visual Monitoring System (VMS) and a Virtual Reality System (VRS). (authors)
Fabrication progress of the ITER vacuum vessel sector in Korea

Energy Technology Data Exchange (ETDEWEB)

Kim, B.C., E-mail: bckim@nfri.re.kr [National Fusion Research Institute, Gwahangno 113, Yuseong-gu, Daejeon (Korea, Republic of); Lee, Y.J.; Hong, K.H.; Sa, J.W.; Kim, H.S.; Park, C.K.; Ahn, H.J.; Bak, J.S.; Jung, K.J. [National Fusion Research Institute, Gwahangno 113, Yuseong-gu, Daejeon (Korea, Republic of); Park, K.H.; Roh, B.R.; Kim, T.S.; Lee, J.S.; Jung, Y.H.; Sung, H.J.; Choi, S.Y.; Kim, H.G.; Kwon, I.K.; Kwon, T.H. [Hyundai Heavy Industries Co. Ltd., Dong-gu, Ulsan (Korea, Republic of)

2013-10-15

Highlights: ► Fabrication of an ITER vacuum vessel sector full-scale mock-up to develop fabrication procedures. ► The welding and nondestructive examination techniques conform to RCC-MR. ► Preparation for the actual manufacturing of the ITER vacuum vessel sector. -- Abstract: As a participant in the ITER project, ITER Korea has to supply two (Sector no. 6 and no. 1) of the nine ITER VV sectors. After the procurement arrangement with the ITER Organization, ITER Korea contracted Hyundai Heavy Industries (HHI) for fabrication of the two sectors. The manufacturing design started in January 2010. HHI made three real-scale R&D mock-ups to verify the critical fabrication feasibility issues of electron beam welding, 3D forming, welding distortion and achievable tolerances. The documentation required by the IO and the French nuclear safety regulations, and the qualification of welding and nondestructive examination procedures conforming to RCC-MR 2007, proceeded in parallel. The mass production of raw material was done after receiving ANB (agreed notified body) verification of product/parts and shop qualification. The manufacturing drawings and the manufacturing and inspection plan of the VV sector, with supporting fabrication procedures, were also verified by the ANB; accordingly, the first cutting and forming of plates for VV sector fabrication started in February 2012. This paper reports the latest fabrication progress of ITER vacuum vessel Sector no. 6, which will be assembled as the first sector in the ITER pit. The overall fabrication route, R&D mock-up fabrication results with forming and welding distortion analysis, and the qualification status of welding and nondestructive examination (NDE) are also presented.
Fast parallel algorithm for CT image reconstruction.

Science.gov (United States)

Flores, Liubov A; Vidal, Vicent; Mayo, Patricia; Rodenas, Francisco; Verdú, Gumersindo

2012-01-01

In X-ray computed tomography (CT), X-rays are used to obtain the projection data needed to generate an image of the inside of an object. The image can be generated with different techniques. Iterative methods are more suitable for reconstructing images with high contrast and precision in noisy conditions and from a small number of projections. Their use may be important in portable scanners for their functionality in emergency situations. However, in practice, these methods are not widely used because of the high computational cost of their implementation. In this work we analyze iterative parallel image reconstruction with the Portable, Extensible Toolkit for Scientific Computation (PETSc).
Performance and capacity analysis of Poisson photon-counting based Iter-PIC OCDMA systems.

Science.gov (United States)

Li, Lingbin; Zhou, Xiaolin; Zhang, Rong; Zhang, Dingchen; Hanzo, Lajos

2013-11-04

In this paper, an iterative parallel interference cancellation (Iter-PIC) technique is developed for optical code-division multiple-access (OCDMA) systems relying on shot-noise limited Poisson photon-counting reception. The novel semi-analytical tool of extrinsic information transfer (EXIT) charts is used for analysing both the bit error rate (BER) performance and the channel capacity of these systems, and the results are verified by Monte Carlo simulations. The proposed Iter-PIC OCDMA system is capable of achieving a BER improvement of two orders of magnitude and a capacity improvement of 0.1 nats over conventional chip-level OCDMA systems at a coding rate of 1/10.
Joining technologies for the plasma facing components of ITER

International Nuclear Information System (INIS)

Barabash, V.; Kalinin, G.; Matera, R.

1998-01-01

An extensive R and D program on the development of the joining technologies between armour (beryllium, tungsten and carbon fibre composites)/copper alloy heat sink and copper alloys/stainless steel has been carried out by the ITER Home Teams. A brief review of this R and D program is presented in this paper. Based on the results, reference technologies for use in ITER have been selected and recommended for further development. (author)
ITER management advisory committee (MAC) meeting in Naka

International Nuclear Information System (INIS)

Yoshikawa, M.

2000-01-01

The ITER Management Advisory Committee (MAC) Meeting was held on 28 June 2000 in Moscow, Russia. The main topics were the consideration of the report by the director on the ITER EDA status, the review of the work program, the review of the joint fund, the review of a schedule of ITER meetings and initial discussion and consideration on the disposition of R and D hardware and facilities and other dispositions relating to the termination of the EDA
Parallel ray tracing for one-dimensional discrete ordinate computations

International Nuclear Information System (INIS)

Jarvis, R.D.; Nelson, P.

1996-01-01

The ray-tracing sweep in discrete-ordinates, spatially discrete numerical approximation methods applied to the linear, steady-state, plane-parallel, mono-energetic, azimuthally symmetric, neutral-particle transport equation can be reduced to a parallel prefix computation. In so doing, the often severe penalty in convergence rate of the source iteration, suffered by most current parallel algorithms using spatial domain decomposition, can be avoided while attaining parallelism in the spatial domain to whatever extent desired. In addition, the reduction implies parallel algorithm complexity limits for the ray-tracing sweep. The reduction applies to all closed, linear, one-cell functional (CLOF) spatial approximation methods, which encompasses most in current popular use. Scalability test results of an implementation of the algorithm on a 64-node nCube-2S hypercube-connected, message-passing, multi-computer are described. (author)
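The reduction to a parallel prefix computation can be illustrated with a generic one-cell recurrence. The sketch below assumes a sweep of the form psi[i+1] = a[i]*psi[i] + b[i]; the coefficient lists a and b and all function names are hypothetical and are not taken from the paper. Because composing such affine maps is associative, every cell-edge value follows from an inclusive scan over the maps. The scan is written here as a recursive-doubling loop executed serially; on a parallel machine each pass over i could run concurrently, giving a number of passes logarithmic in the number of cells.

```python
def compose(f, g):
    """Return the affine map 'apply f, then g'; a map (a, b) means x -> a*x + b."""
    a1, b1 = f
    a2, b2 = g
    return (a2 * a1, a2 * b1 + b2)


def prefix_scan(maps):
    """Inclusive scan by recursive doubling; each pass is independent over i."""
    out = list(maps)
    step = 1
    while step < len(out):
        prev = list(out)  # read only from the previous pass
        for i in range(step, len(out)):
            out[i] = compose(prev[i - step], prev[i])
        step *= 2
    return out


def sweep_serial(a, b, psi0):
    """Reference sweep: march cell by cell from the incoming value psi0."""
    psi = [psi0]
    for ai, bi in zip(a, b):
        psi.append(ai * psi[-1] + bi)
    return psi


def sweep_prefix(a, b, psi0):
    """Same result, obtained from the scanned (composed) affine maps."""
    scanned = prefix_scan(list(zip(a, b)))
    return [psi0] + [A * psi0 + B for A, B in scanned]
```

Both functions return the same cell-edge values up to rounding, so the prefix form does not alter the result of the sweep, only the order in which the arithmetic is organized.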
Practical parallel computing

CERN Document Server

Morse, H Stephen

1994-01-01

Practical Parallel Computing provides information pertinent to the fundamental aspects of high-performance parallel processing. This book discusses the development of parallel applications on a variety of equipment. Organized into three parts encompassing 12 chapters, this book begins with an overview of the technology trends that converge to favor massively parallel hardware over traditional mainframes and vector machines. This text then gives a tutorial introduction to parallel hardware architectures. Other chapters provide worked-out examples of programs using several parallel languages. Thi
ITER driver blanket, European Community design

International Nuclear Information System (INIS)

Simbolotti, G.; Zampaglione, V.; Ferrari, M.; Gallina, M.; Mazzone, G.; Nardi, C.; Petrizzi, L.; Rado, V.; Violante, V.; Daenner, W.; Lorenzetto, P.; Gierszewski, P.; Grattarola, M.; Rosatelli, F.; Secolo, F.; Zacchia, F.; Caira, M.; Sorabella, L.

1993-01-01

Depending on the final decision on the operation time of ITER (International Thermonuclear Experimental Reactor), the Driver Blanket might become a basic component of the machine with the main function of producing a significant fraction (close to 0.8) of the tritium required for the ITER operation, the remaining fraction being available from external supplies. The Driver Blanket is not required to provide reactor relevant performance in terms of tritium self-sufficiency. However, reactor relevant reliability and safety are mandatory requirements for this component in order not to significantly affect the overall plant availability and to allow the ITER experimental program to be safely and successfully carried out. Within the framework of the ITER Conceptual Design Activities (CDA, 1988-1990), a conceptual design of the ITER Driver Blanket has been carried out by ENEA Fusion Dept., in collaboration with ANSALDO S.p.A. and SRS S.r.l., and in close consultation with the NET Team and CFFTP (Canadian Fusion Fuels Technology Project). Such a design has been selected as EC (European Community) reference design for the ITER Driver Blanket. The status of the design at the end of CDA is reported in the present paper. (orig.)

Advances in iterative methods

International Nuclear Information System (INIS)

Beauwens, B.; Arkuszewski, J.; Boryszewicz, M.

1981-01-01

Results obtained in the field of linear iterative methods within the Coordinated Research Program on Transport Theory and Advanced Reactor Calculations are summarized. The general convergence theory of linear iterative methods is essentially based on the properties of nonnegative operators on ordered normed spaces. The following aspects of this theory have been improved: new comparison theorems for regular splittings, generalization of the notions of M- and H-matrices, new interpretations of classical convergence theorems for positive-definite operators. The estimation of asymptotic convergence rates was developed with two purposes: the analysis of model problems and the optimization of relaxation parameters. In the framework of factorization iterative methods, model problem analysis is needed to investigate whether the increased computational complexity of higher-order methods does not offset their increased asymptotic convergence rates, as well as to appreciate the effect of standard relaxation techniques (polynomial relaxation). On the other hand, the optimal use of factorization iterative methods requires the development of adequate relaxation techniques and their optimization. The relative performances of a few possibilities have been explored for model problems. Presently, the best results have been obtained with optimal diagonal-Chebyshev relaxation
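As a small illustration of the splitting framework mentioned above: a splitting A = M - N is called regular when M is nonsingular with a nonnegative inverse and N is nonnegative, and the iteration x <- M^{-1}(N x + b) then converges whenever A has a nonnegative inverse. The sketch below uses the Jacobi splitting (M is the diagonal of A) on a small M-matrix. The matrix, right-hand side and function name are illustrative assumptions and are not taken from the report.

```python
def jacobi_splitting_solve(A, b, tol=1e-12, max_iter=10_000):
    """Iterate x <- M^{-1} (N x + b) with M = diag(A) and N = M - A."""
    n = len(A)
    x = [0.0] * n
    for k in range(1, max_iter + 1):
        # N x is minus the off-diagonal part of A applied to x
        x_new = [
            (b[i] - sum(A[i][j] * x[j] for j in range(n) if j != i)) / A[i][i]
            for i in range(n)
        ]
        change = max(abs(x_new[i] - x[i]) for i in range(n))
        x = x_new
        if change < tol:
            return x, k
    raise RuntimeError("no convergence within max_iter")


# Small M-matrix: positive diagonal, non-positive off-diagonal, diagonally dominant.
A = [[4.0, -1.0, 0.0],
     [-1.0, 4.0, -1.0],
     [0.0, -1.0, 4.0]]
b = [2.0, 4.0, 10.0]
x, iterations = jacobi_splitting_solve(A, b)
```

For this matrix M^{-1} and N are both entrywise nonnegative, so the splitting is regular, and the iterates approach the solution (1, 2, 3).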
Overview of the Force Scientific Parallel Language

Directory of Open Access Journals (Sweden)

Gita Alaghband

1994-01-01

The Force parallel programming language designed for large-scale shared-memory multiprocessors is presented. The language provides a number of parallel constructs as extensions to the ordinary Fortran language and is implemented as a two-level macro preprocessor to support portability across shared memory multiprocessors. The global parallelism model on which the Force is based provides a powerful parallel language. The parallel constructs, generic synchronization, and freedom from process management supported by the Force have resulted in structured parallel programs that are ported to the many multiprocessors on which the Force is implemented. Two new parallel constructs for looping and functional decomposition are discussed. Several programming examples to illustrate some parallel programming approaches using the Force are also presented.
Initial activities of the ITER Management Advisory Committee

International Nuclear Information System (INIS)

Yoshikawa, Masaji

1994-01-01

The first ITER Council meeting (IC-1) took place in Vienna, Austria, at the International Atomic Energy Agency (IAEA) headquarters on 10-11 September, 1992. At that meeting the Council appointed Dr. Masaji Yoshikawa as Chair of the Management Advisory Committee (MAC). The Parties designated the members of MAC. The first MAC meeting was held at the ITER CoCenter, Naka, Japan, on 1-3 December 1992 and reviewed its tasks as charged by the Council. In accordance with Article 7 of the ITER EDA Agreement, it is the responsibility of MAC to report to and advise the ITER Council on management and administrative matters, including finance, personnel and task assignment. Also, in accordance with Article 11 of the ITER EDA Agreement, the MAC advises the ITER Council on the Work Program developed by the Director. The Work Program is to contain a detailed list of specific tasks, including a technical description of each task, and the assignment of the specific tasks to each of the Home Teams and the Joint Central Team (JCT), and a flow chart of the specific tasks during the whole EDA. According to Section 3 of Protocol 1, the tasks are to be performed by the Home Teams; task descriptions are to contain a detailed technical description of each task and an indication of the facilities and background information needed for its implementation
Analysis of Parallel Algorithms on SMP Node and Cluster of Workstations Using Parallel Programming Models with New Tile-based Method for Large Biological Datasets

Science.gov (United States)

Shrimankar, D. D.; Sathe, S. R.

2016-01-01

Sequence alignment is an important tool for describing the relationships between DNA sequences. Many sequence alignment algorithms exist, differing in efficiency, in their models of the sequences, and in the relationship between sequences. The focus of this study is to obtain an optimal alignment between two sequences of biological data, particularly DNA sequences. The algorithm is discussed with particular emphasis on time, speedup, and efficiency optimizations. Parallel programming presents a number of critical challenges to application developers. Today's supercomputers often consist of clusters of SMP nodes. Programming paradigms such as OpenMP and MPI are used to write parallel codes for such architectures. OpenMP programs, however, cannot scale beyond a single SMP node, whereas programs written in MPI can span more than one SMP node, at the cost of internode communication overhead. In this work, we explore the tradeoffs between using OpenMP and MPI. We demonstrate that significant communication overhead is incurred even in OpenMP loop execution and that it increases with the number of participating cores. We also demonstrate a communication model to approximate the overhead from communication in OpenMP loops. Our results are of interest for a large variety of input data files. We have developed our own load balancing and cache optimization technique for the message passing model. Our experimental results show that these techniques give optimum performance of our parallel algorithm for various sizes of input parameters, such as sequence size and tile size, on a wide variety of multicore architectures. PMID:27932868
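The abstract does not describe the tile-based method in detail. As a hedged illustration of why tiling helps, the sketch below fills a global alignment score matrix tile by tile in wavefront order. The scoring values (match +1, mismatch -1, gap -1), the wavefront schedule and all function names are assumptions made for this example, not the authors' algorithm. Tiles whose row and column indices sum to the same value depend only on tiles from earlier wavefronts, so they could be handed to different OpenMP threads or MPI ranks.

```python
def fill_tile(H, s, t, i0, i1, j0, j1, match=1, mismatch=-1, gap=-1):
    """Fill score cells H[i][j] for i0 <= i < i1 and j0 <= j < j1."""
    for i in range(i0, i1):
        for j in range(j0, j1):
            diag = H[i - 1][j - 1] + (match if s[i - 1] == t[j - 1] else mismatch)
            H[i][j] = max(diag, H[i - 1][j] + gap, H[i][j - 1] + gap)


def align_score_tiled(s, t, tile=4, gap=-1):
    """Global alignment score computed tile by tile along anti-diagonals of tiles."""
    n, m = len(s), len(t)
    H = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        H[i][0] = i * gap
    for j in range(1, m + 1):
        H[0][j] = j * gap
    rows = (n + tile - 1) // tile   # number of tile rows
    cols = (m + tile - 1) // tile   # number of tile columns
    for d in range(rows + cols - 1):            # wavefront: tile_row + tile_col == d
        for ti in range(max(0, d - cols + 1), min(rows, d + 1)):
            tj = d - ti
            # tiles that share the same d do not depend on each other
            fill_tile(H, s, t,
                      1 + ti * tile, min(n, (ti + 1) * tile) + 1,
                      1 + tj * tile, min(m, (tj + 1) * tile) + 1,
                      gap=gap)
    return H[n][m]
```

The tile size trades scheduling overhead against cache reuse and available parallelism, which is consistent with the abstract's treatment of tile size as a tuning parameter.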
High-speed parallel solution of the neutron diffusion equation with the hierarchical domain decomposition boundary element method incorporating parallel communications

International Nuclear Information System (INIS)

Tsuji, Masashi; Chiba, Gou

2000-01-01

A hierarchical domain decomposition boundary element method (HDD-BEM) for solving the multiregion neutron diffusion equation (NDE) has been fully parallelized, both for numerical computations and for data communications, to achieve high parallel efficiency on distributed memory message passing parallel computers. Data exchanges between node processors, which are repeated during the iteration processes of HDD-BEM, are implemented without any intervention of the host processor that was used to supervise parallel processing in the conventional parallelized HDD-BEM (P-HDD-BEM). Thus, the parallel processing can be executed with only cooperative operations of node processors. In the conventional P-HDD-BEM, the communication overhead was the dominant time-consuming part, and the parallelization efficiency decreased steeply as the number of processors increased. With the parallel data communication, the efficiency is affected only by the number of boundary elements assigned to decomposed subregions, and the communication overhead can be drastically reduced. This feature can be particularly advantageous in the analysis of three-dimensional problems where a large number of processors are required. The proposed P-HDD-BEM offers a promising solution to the deterioration problem of parallel efficiency and opens a new path to parallel computations of NDEs on distributed memory message passing parallel computers. (author)
FAST ITERATIVE KILOVOLTAGE CONE BEAM TOMOGRAPHY

Directory of Open Access Journals (Sweden)

S. A. Zolotarev

2015-01-01

Creating fast parallel iterative tomographic algorithms based on graphics accelerators, which simultaneously minimize the residual and the total variation of the reconstructed image, is an important and urgent task of great scientific and practical importance. Such algorithms can be used, for example, in radiation therapy, because computed tomography of the patient is always performed beforehand in order to better identify the areas that will then be exposed to radiation.
Solving the Stokes problem on a massively parallel computer

DEFF Research Database (Denmark)

Axelsson, Owe; Barker, Vincent A.; Neytcheva, Maya

2001-01-01

boundary value problem for each velocity component, are solved by the conjugate gradient method with a preconditioning based on the algebraic multi‐level iteration (AMLI) technique. The velocity is found from the computed pressure. The method is optimal in the sense that the computational work...... is proportional to the number of unknowns. Further, it is designed to exploit a massively parallel computer with distributed memory architecture. Numerical experiments on a Cray T3E computer illustrate the parallel performance of the method....
A kind of iteration algorithm for fast wave heating

International Nuclear Information System (INIS)

Zhu Xueguang; Kuang Guangli; Zhao Yanping; Li Youyi; Xie Jikang

1998-03-01

The standard normal distribution for particles in Tokamak geometry is usually assumed in fast wave heating. In fact, owing to the quasi-linear diffusion effect, the parallel and perpendicular temperatures of resonant particles are not equal, which introduces some error. To handle this case, the Fokker-Planck equation is introduced and an iteration algorithm is adopted to solve the problem
A parallel algorithm for solving the integral form of the discrete ordinates equations

International Nuclear Information System (INIS)

Zerr, R. J.; Azmy, Y. Y.

2009-01-01

The integral form of the discrete ordinates equations involves a system of equations that has a large, dense coefficient matrix. The serial construction methodology is presented and properties that affect the execution times to construct and solve the system are evaluated. Two approaches for massively parallel implementation of the solution algorithm are proposed and the current results of one of these are presented. The system of equations may be solved using two parallel solvers: block Jacobi and conjugate gradient. Results indicate that both methods can reduce overall wall-clock time for execution. The conjugate gradient solver exhibits better performance to compete with the traditional source iteration technique in terms of execution time and scalability. The parallel conjugate gradient method is synchronous, hence it does not increase the number of iterations for convergence compared to serial execution, and the efficiency of the algorithm demonstrates an apparent asymptotic decline. (authors)
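To make the remark about synchronous conjugate gradient concrete, a minimal serial sketch is shown here. It assumes a symmetric positive-definite dense matrix, which is an assumption for illustration only: the abstract does not say how the authors' system is brought into a form suitable for conjugate gradient, and the function names are hypothetical. In a parallel version the matrix-vector product and the two inner products per iteration are the points where processes must synchronize. Since every process then sees the same scalars alpha and beta, the iterates and the iteration count match serial execution.

```python
def matvec(A, x):
    """Dense matrix-vector product; rows could be distributed across processes."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in A]


def dot(u, v):
    """Inner product; a global reduction in a distributed setting."""
    return sum(ui * vi for ui, vi in zip(u, v))


def conjugate_gradient(A, b, tol=1e-12, max_iter=None):
    """Solve A x = b for symmetric positive-definite A, starting from x = 0."""
    n = len(b)
    max_iter = max_iter or 10 * n
    x = [0.0] * n
    r = list(b)                 # residual b - A x for the zero initial guess
    p = list(r)
    rs_old = dot(r, r)
    if rs_old ** 0.5 < tol:
        return x, 0
    for k in range(1, max_iter + 1):
        Ap = matvec(A, p)
        alpha = rs_old / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        if rs_new ** 0.5 < tol:
            return x, k
        beta = rs_new / rs_old
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x, max_iter
```

A block Jacobi solver, by contrast, lets each process update its own block from older data, which is what can change the iteration count relative to a serial sweep.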
Variable aperture-based ptychographical iterative engine method

Science.gov (United States)

Sun, Aihui; Kong, Yan; Meng, Xin; He, Xiaoliang; Du, Ruijun; Jiang, Zhilong; Liu, Fei; Xue, Liang; Wang, Shouyu; Liu, Cheng

2018-02-01

A variable aperture-based ptychographical iterative engine (vaPIE) is demonstrated both numerically and experimentally to reconstruct the sample phase and amplitude rapidly. By adjusting the size of a tiny aperture under the illumination of a parallel light beam to change the illumination on the sample step by step and recording the corresponding diffraction patterns sequentially, both the sample phase and amplitude can be faithfully reconstructed with a modified ptychographical iterative engine (PIE) algorithm. Since many fewer diffraction patterns are required than in common PIE, and the shape, size and position of the aperture need not be known exactly, the proposed vaPIE method remarkably reduces the data acquisition time and makes PIE less dependent on the mechanical accuracy of the translation stage; therefore, the proposed technique can potentially be applied in various fields of scientific research.
Status of the Japanese ITER Home Team: January 1993

International Nuclear Information System (INIS)

Matsuda, Shinzaburo

1994-01-01

In June 1992, the Atomic Energy Commission of Japan determined the Third Phase Basic Program of Fusion Research and Development. It directs national policy for the experimental reactor phase of fusion research and development. As a government committee, the promotion and the planning of the entire fusion program will be continually carried out by the Fusion Council of the Atomic Energy Commission. The Fusion Council has recently established an ITER Technical Committee which will give advice on technical matters of the ITER program to the Fusion Council. Thus, the government is ready to be fully supportive of ITER for the execution of this unprecedented international collaboration. There will be some other units to be organized in the near future, in pace with the evolution of ITER activities. The involvement of other research institutes is open as a future possibility. The number of persons nominated as Home Team members is about 100 at present and will be increased depending upon the tasks assigned to the Japanese Home Team. The participation of industries in the EDA is of significant importance for the success of ITER. Firstly, innovative concepts or proposals owing to the technical expertise in other fields can be expected. Secondly, experience in production, fabrication or assembly is valuable in the integral review of the design. Thirdly, development and integration of production technologies are essential to realize future construction
Tungsten recrystallization and cracking under ITER-relevant heat loads

Energy Technology Data Exchange (ETDEWEB)

Budaev, V.P., E-mail: Budaev@mail.ru [NRC «Kurchatov Institute», Akademika Kurchatova pl., Moscow (Russian Federation); Martynenko, Yu.V. [NRC «Kurchatov Institute», Akademika Kurchatova pl., Moscow (Russian Federation); National Research Nuclear University MEPhI, Kashirskoe sh. 31, Moscow (Russian Federation); Karpov, A.V.; Belova, N.E. [NRC «Kurchatov Institute», Akademika Kurchatova pl., Moscow (Russian Federation); Zhitlukhin, A.M. [SRC RF TRINITI, Moscow Region (Russian Federation); Klimov, N.S., E-mail: klimov@triniti.ru [SRC RF TRINITI, Moscow Region (Russian Federation); National Research Nuclear University MEPhI, Kashirskoe sh. 31, Moscow (Russian Federation); Podkovyrov, V.L.; Barsuk, V.A.; Putrik, A.B.; Yaroshevskaya, A.D. [SRC RF TRINITI, Moscow Region (Russian Federation); Giniyatulin, R.N. [Efremov Institute, St. Petersburg (Russian Federation); Safronov, V.M. [Institution «Project Center ITER», Moscow (Russian Federation); SRC RF TRINITI, Moscow Region (Russian Federation); Khimchenko, L.N. [Institution «Project Center ITER», Moscow (Russian Federation)

2015-08-15

The tungsten surface structure was analyzed after tests in the QSPA-T under heat loads relevant to those expected in ITER during disruptions. Repeated pulses lead to melting and resolidification of a tungsten surface layer of ∼50 μm thickness. Between the original structure and the resolidified layer there is an intermediate layer of ∼50 μm thickness. The intermediate layer is recrystallized and has randomly oriented grains, whereas the resolidified layer and the basic structure have a texture with preferred orientation 〈1 0 0〉 normal to the surface. Cracks normal to the surface were observed in the resolidified layer, as well as cracks parallel to the surface at depths of up to 300 μm. Such cracks can result in brittle destruction, which is a hazard for the full tungsten divertor of ITER. A theoretical analysis of the causes of crack formation and its possible consequences for ITER is given.
The numerical parallel computing of photon transport

International Nuclear Information System (INIS)

Huang Qingnan; Liang Xiaoguang; Zhang Lifa

1998-12-01

The parallel computing of photon transport is investigated; the parallel algorithm and the parallelization of programs on parallel computers with both shared memory and distributed memory are discussed. By analyzing the inherent laws of the mathematical and physical model of photon transport in light of the structural features of parallel computers, applying a divide-and-conquer strategy, adjusting the algorithm structure of the program, resolving the data dependencies, finding the parallelizable components and creating large-grain parallel subtasks, the sequential computing of photon transport is efficiently transformed into parallel and vector computing. The program was run on various HP parallel computers such as the HY-1 (PVP), the Challenge (SMP) and the YH-3 (MPP), and very good parallel speedup was obtained
Shared Variable Oriented Parallel Precompiler for SPMD Model

Institute of Scientific and Technical Information of China (English)

None

1995-01-01

At present, commercial parallel computer systems with distributed memory architecture are usually provided with parallel FORTRAN or parallel C compilers, which are just traditional sequential FORTRAN or C compilers extended with communication statements. Programmers suffer from writing parallel programs with communication statements. The Shared Variable Oriented Parallel Precompiler (SVOPP) proposed in this paper can automatically generate appropriate communication statements based on shared variables for the SPMD (Single Program Multiple Data) computation model and greatly eases parallel programming while retaining high communication efficiency. The core function of the parallel C precompiler has been successfully verified on a transputer-based parallel computer. Its prominent performance shows that SVOPP is probably a breakthrough in parallel programming technique.
Iterative solution of high order compact systems

Energy Technology Data Exchange (ETDEWEB)

Spotz, W.F.; Carey, G.F. [Univ. of Texas, Austin, TX (United States)

1996-12-31

We have recently developed a class of finite difference methods which provide higher accuracy and greater stability than standard central or upwind difference methods, but still reside on a compact patch of grid cells. In the present study we investigate the performance of several gradient-type iterative methods for solving the associated sparse systems. Both serial and parallel performance studies have been made. Representative examples are taken from elliptic PDEs for diffusion, convection-diffusion, and viscous flow applications.
Resolutions of the Coulomb operator: VIII. Parallel implementation using the modern programming language X10.

Science.gov (United States)

Limpanuparb, Taweetham; Milthorpe, Josh; Rendell, Alistair P

2014-10-30

Use of the modern parallel programming language X10 for computing long-range Coulomb and exchange interactions is presented. By using X10, a partitioned global address space language with support for task parallelism and the explicit representation of data locality, the resolution of the Ewald operator can be parallelized in a straightforward manner including use of both intranode and internode parallelism. We evaluate four different schemes for dynamic load balancing of integral calculation using X10's work stealing runtime, and report performance results for long-range HF energy calculation of large molecule/high quality basis running on up to 1024 cores of a high performance cluster machine. Copyright © 2014 Wiley Periodicals, Inc.
A scalable parallel algorithm for multiple objective linear programs

Science.gov (United States)

Wiecek, Malgorzata M.; Zhang, Hong

1994-01-01

This paper presents an ADBASE-based parallel algorithm for solving multiple objective linear programs (MOLPs). Job balance, speedup and scalability are of primary interest in evaluating the efficiency of the new algorithm. Implementation results on Intel iPSC/2 and Paragon multiprocessors show that the algorithm significantly speeds up the process of solving MOLPs, which is understood as generating all or some efficient extreme points and unbounded efficient edges. The algorithm gives especially good results for large and very large problems. Motivation and justification for solving such large MOLPs are also included.
Determination of quantitative tissue composition by iterative reconstruction on 3D DECT volumes

Energy Technology Data Exchange (ETDEWEB)

Magnusson, Maria [Linkoeping Univ. (Sweden). Dept. of Electrical Engineering; Linkoeping Univ. (Sweden). Dept. of Medical and Health Sciences, Radiation Physics; Linkoeping Univ. (Sweden). Center for Medical Image Science and Visualization (CMIV); Malusek, Alexandr [Linkoeping Univ. (Sweden). Dept. of Medical and Health Sciences, Radiation Physics; Linkoeping Univ. (Sweden). Center for Medical Image Science and Visualization (CMIV); Nuclear Physics Institute AS CR, Prague (Czech Republic). Dept. of Radiation Dosimetry; Muhammad, Arif [Linkoeping Univ. (Sweden). Dept. of Medical and Health Sciences, Radiation Physics; Carlsson, Gudrun Alm [Linkoeping Univ. (Sweden). Dept. of Medical and Health Sciences, Radiation Physics; Linkoeping Univ. (Sweden). Center for Medical Image Science and Visualization (CMIV)

2011-07-01

Quantitative tissue classification using dual-energy CT has the potential to improve accuracy in radiation therapy dose planning as it provides more information about material composition of scanned objects than the currently used methods based on single-energy CT. One problem that hinders successful application of both single- and dual-energy CT is the presence of beam hardening and scatter artifacts in reconstructed data. Current pre- and post-correction methods used for image reconstruction often bias CT attenuation values and thus limit their applicability for quantitative tissue classification. Here we demonstrate simulation studies with a novel iterative algorithm that decomposes every soft tissue voxel into three base materials: water, protein, and adipose. The results demonstrate that beam hardening artifacts can effectively be removed and accurate estimation of mass fractions of each base material can be achieved. Our iterative algorithm starts by calculating parallel projections on two DECT volumes previously reconstructed from fan-beam or helical projections with a small cone-beam angle. The parallel projections are then used in an iterative loop. Future developments include segmentation of soft and bone tissue and subsequent determination of bone composition. (orig.)
Iterative solution of general sparse linear systems on clusters of workstations

Energy Technology Data Exchange (ETDEWEB)

Lo, Gen-Ching; Saad, Y. [Univ. of Minnesota, Minneapolis, MN (United States)

1996-12-31

Solving sparse irregularly structured linear systems on parallel platforms poses several challenges. First, sparsity makes it difficult to exploit data locality, whether in a distributed or shared memory environment. A second, perhaps more serious challenge, is to find efficient ways to precondition the system. Preconditioning techniques which have a large degree of parallelism, such as multicolor SSOR, often have a slower rate of convergence than their sequential counterparts. Finally, a number of other computational kernels such as inner products could wipe out any gains from parallel speed-ups, and this is especially true on workstation clusters where start-up times may be high. In this paper we discuss these issues and report on our experience with PSPARSLIB, an on-going project for building a library of parallel iterative sparse matrix solvers.
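The idea behind multicolor orderings can be sketched on a model problem. The example below is a simplification chosen for illustration and does not use PSPARSLIB: it applies a two-color (red-black) Gauss-Seidel sweep to the 1D problem -u'' = f with zero boundary values, with the relaxation parameter fixed at 1 and no backward sweep, so it is not full SSOR. The function names and the test problem are assumptions. Every unknown of one color depends only on unknowns of the other color, so each half-sweep can be updated in parallel.

```python
def red_black_sweep(u, f, h):
    """One two-color Gauss-Seidel sweep for -u'' = f on a uniform 1D grid."""
    n = len(u)
    for color in (1, 0):                 # odd-indexed points first, then even-indexed
        for i in range(1, n - 1):
            if i % 2 == color:
                # neighbours i-1 and i+1 carry the other colour, so the
                # updates within one colour are independent of each other
                u[i] = 0.5 * (u[i - 1] + u[i + 1] + h * h * f[i])
    return u


def solve(n=17, sweeps=2000):
    """Iterate on -u'' = 2 with u(0) = u(1) = 0; the exact solution is x (1 - x)."""
    h = 1.0 / (n - 1)
    f = [2.0] * n
    u = [0.0] * n
    for _ in range(sweeps):
        red_black_sweep(u, f, h)
    return u, h
```

On irregular sparse matrices more than two colors are generally needed, and the reordering changes the iteration itself, which is the source of the convergence penalty the abstract mentions.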

Materials challenges for ITER - Current status and future activities

Energy Technology Data Exchange (ETDEWEB)

Barabash, V. [ITER International Team, Boltsmannstrasse 2, 85748 Garching (Germany)]. E-mail: valdimir.barabash@iter.org; Peacock, A. [EFDA Close Support Unit, 85748 Garching (Germany); Fabritsiev, S. [D.V. Efremov Scientific Research Institute, 196641 St. Petersburg (Russian Federation); Kalinin, G. [ENES, P.O. Box 788, 101000 Moscow (Russian Federation); Zinkle, S. [Metals and Ceramics Division, ORNL, P.O. Box 2008, Oak Ridge, TN 37831-6138 (United States); Rowcliffe, A. [Metals and Ceramics Division, ORNL, P.O. Box 2008, Oak Ridge, TN 37831-6138 (United States); Rensman, J.-W. [NRG, P.O. Box 25, 1755 ZG Petten (Netherlands); Tavassoli, A.A. [Commissariat a l' Energie Atomique, CEA/Saclay, 91191 Gif sur Yvette cedex (France); Marmy, P. [CRPP, EPFL, Association EURATOM-Confederation Suisse, 5232, Villigen PSI (Switzerland); Karditsas, P.J. [EURATOM/UKAEA Fusion Association, Abingdon, Oxon OX14 3DB (United Kingdom); Gillemot, F. [AEKI Atomic Research Institute, 1121 Budapest, (Hungary); Akiba, M. [JAEA, Naka-machi, Naka-gun, Ibaraki-ken 311-0193 (Japan)

2007-08-01

ITER will be the first experimental fusion facility, which brings together the key physical, material and technological issues related to development of fusion reactors. The design of ITER is complete and the construction will start soon. This paper discusses the main directions of the project oriented materials activity and main challenges related to selection of materials for the ITER components. For each application in ITER the main materials issues were identified and these issues were addressed in the dedicated ITER R and D program. The justification of materials performance was fully documented, which allows traceability and reliability of design data. Several examples are given to illustrate the main achievements and recommendations from the recently updated ITER Materials Properties Handbook. The main ongoing and future materials activities are described.
U.S. contributions to ITER

International Nuclear Information System (INIS)

Sauthoff, Ned R.

2006-01-01

The United States participates in the ITER project to undertake the study of the science and technology of burning plasmas. The 2003 U.S. decision to enter the ITER negotiations followed an extensive series of community and governmental reviews of the benefits, readiness, and approaches to the study of burning plasmas. This paper describes both the technical and the organizational preparations and plans for U.S. participation in the ITER construction activity: in-kind contributions, staff contributions, and cash contributions as well as supporting physics and technology research. Near-term technical activities focus on the completion of R and D and design and mitigation of risks in the areas of provisionally assigned US contributions. Outside the project, the U.S. is engaged in preparations for the test blanket module program. Organizational activities focus on preparations of the project management arrangements to maximize the overall success of the ITER project: elements include refinement of U.S. positions on the international arrangements, the establishment of the U.S. Domestic Agency, progress along the path of the U.S. Department of Energy's Project Management Order, and overall preparations for commencement of the fabrication of major items of equipment and for provision of staff and cash as specified in the upcoming ITER agreement
Challenges and status of ITER conductor production

International Nuclear Information System (INIS)

Devred, A; Backbier, I; Bessette, D; Bevillard, G; Gardner, M; Jong, C; Lillaz, F; Mitchell, N; Romano, G; Vostner, A

2014-01-01

Taking the relay of the Large Hadron Collider (LHC) at CERN, ITER has become the largest project in applied superconductivity. In addition to its technical complexity, ITER is also a management challenge as it relies on an unprecedented collaboration of seven partners, representing more than half of the world population, who provide 90% of the components as in-kind contributions. The ITER magnet system is one of the most sophisticated superconducting magnet systems ever designed, with an enormous stored energy of 51 GJ. It involves six of the ITER partners. The coils are wound from cable-in-conduit conductors (CICCs) made up of superconducting and copper strands assembled into a multistage cable, inserted into a conduit of butt-welded austenitic steel tubes. The conductors for the toroidal field (TF) and central solenoid (CS) coils require about 600 t of Nb3Sn strands while the poloidal field (PF) and correction coil (CC) and busbar conductors need around 275 t of Nb–Ti strands. The required amount of Nb3Sn strands far exceeds pre-existing industrial capacity and has called for a significant worldwide production scale up. The TF conductors are the first ITER components to be mass produced and are more than 50% complete. During its lifetime, the CS coil will have to sustain several tens of thousands of electromagnetic (EM) cycles to high current and field conditions, way beyond anything a large Nb3Sn coil has ever experienced. Following a comprehensive R and D program, a technical solution has been found for the CS conductor, which ensures stable performance versus EM and thermal cycling. Productions of PF, CC and busbar conductors are also underway. After an introduction to the ITER project and magnet system, we describe the ITER conductor procurements and the quality assurance/quality control programs that have been implemented to ensure production uniformity across numerous suppliers. Then, we provide examples of technical challenges that have been
Challenges and status of ITER conductor production

Science.gov (United States)

Devred, A.; Backbier, I.; Bessette, D.; Bevillard, G.; Gardner, M.; Jong, C.; Lillaz, F.; Mitchell, N.; Romano, G.; Vostner, A.

2014-04-01

Taking the relay of the Large Hadron Collider (LHC) at CERN, ITER has become the largest project in applied superconductivity. In addition to its technical complexity, ITER is also a management challenge as it relies on an unprecedented collaboration of seven partners, representing more than half of the world population, who provide 90% of the components as in-kind contributions. The ITER magnet system is one of the most sophisticated superconducting magnet systems ever designed, with an enormous stored energy of 51 GJ. It involves six of the ITER partners. The coils are wound from cable-in-conduit conductors (CICCs) made up of superconducting and copper strands assembled into a multistage cable, inserted into a conduit of butt-welded austenitic steel tubes. The conductors for the toroidal field (TF) and central solenoid (CS) coils require about 600 t of Nb3Sn strands while the poloidal field (PF) and correction coil (CC) and busbar conductors need around 275 t of Nb-Ti strands. The required amount of Nb3Sn strands far exceeds pre-existing industrial capacity and has called for a significant worldwide production scale up. The TF conductors are the first ITER components to be mass produced and are more than 50% complete. During its lifetime, the CS coil will have to sustain several tens of thousands of electromagnetic (EM) cycles to high current and field conditions, way beyond anything a large Nb3Sn coil has ever experienced. Following a comprehensive R&D program, a technical solution has been found for the CS conductor, which ensures stable performance versus EM and thermal cycling. Productions of PF, CC and busbar conductors are also underway. After an introduction to the ITER project and magnet system, we describe the ITER conductor procurements and the quality assurance/quality control programs that have been implemented to ensure production uniformity across numerous suppliers. Then, we provide examples of technical challenges that have been encountered and
A dimension decomposition approach based on iterative observer design for an elliptic Cauchy problem

KAUST Repository

Majeed, Muhammad Usman; Laleg-Kirati, Taous-Meriem

2015-01-01

A state observer inspired iterative algorithm is presented to solve boundary estimation problem for Laplace equation using one of the space variables as a time-like variable. Three dimensional domain with two congruent parallel surfaces
New algorithms for parallel MRI

International Nuclear Information System (INIS)

Anzengruber, S; Ramlau, R; Bauer, F; Leitao, A

2008-01-01

Magnetic Resonance Imaging with parallel data acquisition requires algorithms for reconstructing the patient's image from a small number of measured lines of the Fourier domain (k-space). In contrast to well-known algorithms like SENSE and GRAPPA and its flavors we consider the problem as a non-linear inverse problem. However, in order to avoid cost intensive derivatives we will use Landweber-Kaczmarz iteration and in order to improve the overall results some additional sparsity constraints.
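The Landweber-Kaczmarz idea can be sketched for the much simpler linear case, where the data are split into blocks A_i x = y_i and one damped gradient (Landweber) step is applied per block in cyclic order. This is an illustrative assumption, not the authors' non-linear MRI formulation: the function name `landweber_kaczmarz`, the block layout, and the step size `omega` are all hypothetical, and sparsity constraints are omitted.

```python
def landweber_kaczmarz(blocks, x0, omega, sweeps):
    """Cyclic Landweber steps over blocks (A_i, y_i) of a linear system.

    Each step is x <- x + omega * A_i^T (y_i - A_i x). Convergence needs
    omega small enough relative to the squared norm of every A_i.
    """
    x = list(x0)
    for _ in range(sweeps):
        for A_i, y_i in blocks:
            # Residual of the current block, evaluated before x is changed.
            residual = [
                y - sum(a * xj for a, xj in zip(row, x))
                for row, y in zip(A_i, y_i)
            ]
            # Apply the adjoint (transpose) of the block to the residual.
            for row, r in zip(A_i, residual):
                for j, a in enumerate(row):
                    x[j] += omega * a * r
    return x
```

With the two single-row blocks `([[1.0, 1.0]], [3.0])` and `([[1.0, -1.0]], [-1.0])`, a start at the origin and `omega = 0.25`, the iterates approach the solution (1, 2). In the non-linear setting the transpose would be replaced by the adjoint of the derivative of each forward operator.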
The ITER divertor concept

International Nuclear Information System (INIS)

Janeschitz, G.; Borrass, K.; Federici, G.; Igitkhanov, Y.; Kukushkin, A.; Pacher, H.D.; Pacher, G.W.; Sugihara, M.

1995-01-01

The ITER divertor must exhaust most of the alpha particle power and the He ash at acceptable erosion rates. The high recycling regime of the ITER-CDA for present parameters would yield high power loads and erosion rates on conventional targets. Improvement by radiation in the SOL at constant pressure is limited in principle. To permit a higher radiation fraction, the plasma pressure along the field must be reduced by more than a factor of 10, reducing also the target ion flux. This pressure reduction can be obtained by strong plasma-neutral interaction below the X-point. Under these conditions T_e in the divertor can be reduced to <5 eV along a flame-like ionisation front by impurity radiation and CX losses. Downstream of the front, neutrals undergo more CX or i-n collisions than ionisation events, resulting in significant momentum loss via neutrals to the divertor chamber wall. The pressure reduction by this mechanism depends on the along-field length for neutral-plasma interaction, the parallel power flux, the neutral density, the ratio of neutral-neutral collision length to the plasma-wall distance and on the Mach number of ions and neutrals. A supersonic transition in the main plasma-neutral interaction region, expected to occur near the ionisation front, would be beneficial for momentum removal. The momentum transfer fraction to the side walls is calculated: low Knudsen number is beneficial. The impact of the different physics effects on the chosen geometry and on the ITER divertor design and the lifetime of the various divertor components are discussed. ((orig.))
R and D on support to ITER safety assessment

International Nuclear Information System (INIS)

Van Dorsselaere, J.P.; Perrault, D.; Barrachin, M.; Bentaib, A.; Bez, J.; Cortes, P.; Seropian, C.; Tregoures, N.; Vendel, J.

2009-01-01

After performing its first ITER safety assessment in 2002 on behalf of the French 'Autorite de Surete Nucleaire (ASN)', the French 'Institut de Radioprotection et de Surete Nucleaire (IRSN)' is now analysing the new ITER Fusion facility safety file. The operator delivered this file to the ASN as part of its request for a creation decree, legally necessary before building works can begin on the site. The IRSN's first task in following ITER throughout its lifetime is to study the safety approach adopted by the operator and the associated issues. Such a challenging new technology calls for further in-house expertise and so in parallel an R and D program has been set up to support this safety assessment process, now and in the coming years. Its main objectives are to identify the key parameters for mastering some risks (that would have been insufficiently justified by the operator) and to perform some verifications with methods and codes independent from the operator's ones. Priority has been given to four technical issues (others could be investigated in the future, like the behaviour of activated corrosion products). The first issue concerns the simulation of accident sequences with the help of the ASTEC European system code, developed by IRSN (jointly with its German counterpart, the GRS) for severe accidents in Pressurised Water Reactors. A preliminary analysis showed that most of its physical models are already applicable, e.g., for thermal-hydraulics in accidents caused by water or air ingress into the vacuum vessel (VV) or dust transport. Work started in 2008 on some model adaptations, for instance oxidation of VV first wall materials by steam or air, and on validation on the ITER-specific ICE and LOVA experiments. Other model improvements are planned in the coming years, as feedback from the work done for the other technical issues and from the code validation. The second issue concerns the risk of gas explosion due to concentrations of hydrogen and carbon
Remote maintenance development for ITER

International Nuclear Information System (INIS)

Tada, Eisuke; Shibanuma, Kiyoshi

1997-01-01

This paper describes the overall design concept of the ITER remote maintenance system, which has been developed mainly for use with in-vessel components such as the divertor and blanket, and outlines the ITER R and D program, which has been established to develop remote handling equipment/tools and radiation-hard components. In ITER, the reactor structures inside the cryostat have to be maintained remotely because of activation due to DT operation. Therefore, remote-handling technology is fundamental, and the reactor-structure design must be made consistent with remote maintainability. The overall maintenance scenario and design concepts of the required remote handling equipment/tools have been developed according to their maintenance classification. Technologies are also being developed to verify the feasibility of the maintenance design, including fabrication and testing of full-scale remote-handling equipment/tools for in-vessel maintenance. (author)
Parallelization of a spherical Sn transport theory algorithm

International Nuclear Information System (INIS)

Haghighat, A.

1989-01-01

The work described in this paper derives a parallel algorithm for an R-dependent spherical S_N transport theory algorithm and studies its performance by testing different sample problems. The S_N transport method is one of the most accurate techniques used to solve the linear Boltzmann equation. Several studies have been done on the vectorization of the S_N algorithms; however, very few studies have been performed on the parallelization of this algorithm. Wienke and Hiromoto have looked at the parallel processing of the different energy groups, and Azmy recently studied the parallel processing of the inner iterations of an X-Y S_N nodal transport theory method. Both studies have reported very encouraging results, which have prompted us to look at the parallel processing of an R-dependent S_N spherical geometry algorithm. This geometry was chosen because, in spite of its simplicity, it contains the complications of the curvilinear geometries (i.e., redistribution of neutrons over the discretized angular bins)
A Parallel Solver for Large-Scale Markov Chains

Czech Academy of Sciences Publication Activity Database

Benzi, M.; Tůma, Miroslav

2002-01-01

Vol. 41 (2002), pp. 135-153 ISSN 0168-9274 R&D Projects: GA AV ČR IAA2030801; GA ČR GA101/00/1035 Keywords : parallel preconditioning * iterative methods * discrete Markov chains * generalized inverses * singular matrices * graph partitioning * AINV * Bi-CGSTAB Subject RIV: BA - General Mathematics Impact factor: 0.504, year: 2002
Iterated Process Analysis over Lattice-Valued Regular Expressions

DEFF Research Database (Denmark)

Midtgaard, Jan; Nielson, Flemming; Nielson, Hanne Riis

2016-01-01

We present an iterated approach to statically analyze programs of two processes communicating by message passing. Our analysis operates over a domain of lattice-valued regular expressions, and computes increasingly better approximations of each process's communication behavior. Overall the work extends traditional semantics-based program analysis techniques to automatically reason about message passing in a manner that can simultaneously analyze both values of variables as well as message order, message content, and their interdependencies.
LHCD and coupling experiments with an ITER-like PAM launcher on the FTU tokamak

International Nuclear Information System (INIS)

Pericoli Ridolfini, V.; Apicella, M.L.; Barbato, E.; Buratti, P.; Calabro, G.; Cardinali, A.; Mirizzi, F.; Panaccione, L.; Podda, S.; Tuccillo, A.A.; Bibet, Ph.; Granucci, G.; Sozzi, C.

2005-01-01

Successful experimental tests on a PAM (passive active multijunction) prototype antenna for the Lower Hybrid (LH) waves similar to that foreseen for ITER have been carried out on FTU. The power level routinely achieved without any fault in the transmission lines for the maximum time allowed by the LH power plant, i.e. 0.9 s, is 250 kW versus a design value of 270. It corresponds to 50 MW/m^2 through the ITER antenna active area if it is scaled for the different LH frequencies (5 GHz in ITER, 8 GHz in FTU) and it is more than 1.4 times the goal of the ITER design (33 MW/m^2). The test results validate the main features indicated by the simulation codes, concerning the power handling, the coupling and the launched N∥ spectrum. The power reflection coefficient R_c is always ≤ 2.5%, once the PAM launcher has been properly conditioned, even with the grill mouth retracted 2 mm inside the port shadow, with density in front of the launcher very close or even lower than the cut-off value. The current drive efficiency is comparable to a conventional grill in similar conditions, once the lower directivity is taken into account. The flexibility in the N∥ spectrum is confirmed by the HXR and ECE spectra. Conditioning the PAM to operate at the ITER equivalent power level has required only one day of RF operation, without a previous baking of the waveguides. (author)
Progress and Achievements on the R&D Activities for ITER Vacuum Vessel

Energy Technology Data Exchange (ETDEWEB)

Nakahira, M. [Japan Atomic Energy Research Institute (JAERI); Koizumi, K. [Japan Atomic Energy Research Institute (JAERI); Takahashi, H. [Japan Atomic Energy Research Institute (JAERI); Onozuka, M. [ITER Joint Central Team, Garching, Germany; Ioki, K. [ITER Joint Central Team, Garching, Germany; Kuzumin, E. [D.V. Efremov Scientific Research Institute, St. Petersburg, Russia; Krylov, V. [D.V. Efremov Scientific Research Institute, St. Petersburg, Russia; Maslakowski, J. [Oak Ridge National Laboratory (ORNL); Nelson, Brad E [ORNL; Jones, L. [Max-Planck Institute, Garching, Germany; Danner, W. [Max-Planck Institute, Garching, Germany; Maisonnier, D. [Max-Planck Institute, Garching, Germany

2001-01-01

The ITER vacuum vessel (VV) is designed to be a large double-walled structure with a D-shaped cross-section. The achievable fabrication tolerance of this structure was unknown due to its size and complex shape. The Full-scale Sector Model of the ITER Vacuum Vessel, which was 15 m in height, was fabricated and tested to obtain the fabrication and assembly tolerances. The model was fabricated within the target tolerance of 5 mm and welding deformation during assembly operation was obtained. The port structure was also connected using remotized welding tools to demonstrate the basic maintenance activity. In parallel, tests of advanced welding, cutting and inspection systems were performed to improve the efficiency of fabrication and maintenance of the vacuum vessel. These activities demonstrate that the ITER vacuum vessel is feasible in practice. This paper describes the major progress, achievements and latest status of the R&D activities on the ITER vacuum vessel.
Algorithmic differentiation of pragma-defined parallel regions differentiating computer programs containing OpenMP

CERN Document Server

Förster, Michael

2014-01-01

Numerical programs often use parallel programming techniques such as OpenMP to compute the program's output values as efficiently as possible. In addition, derivative values of these output values with respect to certain input values play a crucial role. To achieve code that computes not only the output values simultaneously but also the derivative values, this work introduces several source-to-source transformation rules. These rules are based on a technique called algorithmic differentiation. The main focus of this work lies on the important reverse mode of algorithmic differentiation. The inh
Parallel Jacobi EVD Methods on Integrated Circuits

Directory of Open Access Journals (Sweden)

Chi-Chia Sun

2014-01-01

Design strategies for parallel iterative algorithms are presented. In order to further study different tradeoff strategies in design criteria for integrated circuits, a 10 × 10 Jacobi Brent-Luk-EVD array with the simplified μ-CORDIC processor is used as an example. The experimental results show that using the μ-CORDIC processor is beneficial for the design criteria as it yields a smaller area, faster overall computation time, and less energy consumption than the regular CORDIC processor. It is worth noting that the proposed parallel EVD method can be applied to real-time and low-power array signal processing algorithms performing beamforming or DOA estimation.
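The Jacobi eigenvalue decomposition underlying such arrays can be sketched in software as repeated plane rotations that zero one off-diagonal pair at a time. The version below is a serial, cyclic-by-row sketch in floating point, not the Brent-Luk array or the μ-CORDIC arithmetic of the paper; the function name `jacobi_evd` and its parameters are assumptions. In a Brent-Luk style ordering, rotations on disjoint index pairs (p, q) would be applied concurrently.

```python
import math


def jacobi_evd(A, sweeps=50, tol=1e-12):
    """Eigenvalues and eigenvectors of a symmetric matrix by Jacobi rotations.

    Returns (eigenvalues, V) where column j of V pairs with eigenvalue j.
    """
    n = len(A)
    A = [row[:] for row in A]  # work on a copy
    V = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(sweeps):
        off = math.sqrt(sum(A[i][j] ** 2
                            for i in range(n) for j in range(n) if i != j))
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if A[p][q] == 0.0:
                    continue
                # Small-angle rotation that annihilates A[p][q].
                tau = (A[q][q] - A[p][p]) / (2.0 * A[p][q])
                t = math.copysign(1.0, tau) / (abs(tau) + math.hypot(1.0, tau))
                c = 1.0 / math.hypot(1.0, t)
                s = t * c
                for k in range(n):  # A <- A J (columns p and q)
                    akp, akq = A[k][p], A[k][q]
                    A[k][p] = c * akp - s * akq
                    A[k][q] = s * akp + c * akq
                for k in range(n):  # A <- J^T A (rows p and q)
                    apk, aqk = A[p][k], A[q][k]
                    A[p][k] = c * apk - s * aqk
                    A[q][k] = s * apk + c * aqk
                for k in range(n):  # accumulate eigenvectors: V <- V J
                    vkp, vkq = V[k][p], V[k][q]
                    V[k][p] = c * vkp - s * vkq
                    V[k][q] = s * vkp + c * vkq
    return [A[i][i] for i in range(n)], V
```

Each rotation needs only an angle computation and a handful of multiply-adds on two rows and two columns, which is the part a CORDIC-type processor element would take over in hardware.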
Parallel preconditioning techniques for sparse CG solvers

Energy Technology Data Exchange (ETDEWEB)

Basermann, A.; Reichel, B.; Schelthoff, C. [Central Institute for Applied Mathematics, Juelich (Germany)

1996-12-31

Conjugate gradient (CG) methods to solve sparse systems of linear equations play an important role in numerical methods for solving discretized partial differential equations. The large size and the condition of many technical or physical applications in this area result in the need for efficient parallelization and preconditioning techniques of the CG method. In particular, for very ill-conditioned matrices, sophisticated preconditioners are necessary to obtain both acceptable convergence and accuracy of CG. Here, we investigate variants of polynomial and incomplete Cholesky preconditioners that markedly reduce the iterations of the simply diagonally scaled CG and are shown to be well suited for massively parallel machines.
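The baseline mentioned in the abstract, diagonally scaled CG, can be sketched as preconditioned CG with the inverse of the matrix diagonal as preconditioner. This is a minimal dense-matrix sketch under that assumption, not the authors' code; the names `pcg_diag`, `matvec` and `dot` are hypothetical. The polynomial and incomplete Cholesky variants would replace only the line that computes `z`.

```python
def matvec(A, x):
    """Dense matrix-vector product (a sparse format would be used in practice)."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def dot(u, v):
    """Inner product; on a parallel machine this is a global reduction."""
    return sum(a * b for a, b in zip(u, v))


def pcg_diag(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive definite A with diagonal scaling.

    Returns (x, number_of_iterations).
    """
    n = len(b)
    inv_diag = [1.0 / A[i][i] for i in range(n)]
    x = [0.0] * n
    r = list(b)  # residual b - A x for the zero initial guess
    z = [d * ri for d, ri in zip(inv_diag, r)]  # preconditioned residual
    p = list(z)
    rz = dot(r, z)
    b_norm = dot(b, b) ** 0.5 or 1.0
    for k in range(max_iter):
        if dot(r, r) ** 0.5 <= tol * b_norm:
            return x, k
        Ap = matvec(A, p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        z = [d * ri for d, ri in zip(inv_diag, r)]
        rz_new = dot(r, z)
        beta = rz_new / rz
        p = [zi + beta * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter
```

Diagonal scaling is trivially parallel because each entry of `z` is computed independently, which is why it serves as the reference point for more elaborate preconditioners.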
Variable aperture-based ptychographical iterative engine method.

Science.gov (United States)

Sun, Aihui; Kong, Yan; Meng, Xin; He, Xiaoliang; Du, Ruijun; Jiang, Zhilong; Liu, Fei; Xue, Liang; Wang, Shouyu; Liu, Cheng

2018-02-01

A variable aperture-based ptychographical iterative engine (vaPIE) is demonstrated both numerically and experimentally to reconstruct the sample phase and amplitude rapidly. By adjusting the size of a tiny aperture under the illumination of a parallel light beam to change the illumination on the sample step by step and recording the corresponding diffraction patterns sequentially, both the sample phase and amplitude can be faithfully reconstructed with a modified ptychographical iterative engine (PIE) algorithm. Since many fewer diffraction patterns are required than in common PIE, and the shape, size, and position of the aperture need not be known exactly, the proposed vaPIE method remarkably reduces the data acquisition time and makes PIE less dependent on the mechanical accuracy of the translation stage; therefore, the proposed technique can potentially be applied in various areas of scientific research. (2018) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE).
Generalized Analytical Program of Thyristor Phase Control Circuit with Series and Parallel Resonance Load

OpenAIRE

Nakanishi, Sen-ichiro; Ishida, Hideaki; Himei, Toyoji

1981-01-01

A systematic analytical method is required for the ac phase control circuit using an inverse-parallel thyristor pair with a series and parallel L-C resonant load, because the phase control action causes abnormal and interesting phenomena, such as an extreme increase of voltage and current, a unique increase and decrease of higher-harmonic content, and a wide variation of power factor. In this paper, the program for the analysis of the thyristor phase control circuit with...
Efficiency Analysis of the Parallel Implementation of the SIMPLE Algorithm on Multiprocessor Computers

Science.gov (United States)

Lashkin, S. V.; Kozelkov, A. S.; Yalozo, A. V.; Gerasimov, V. Yu.; Zelensky, D. K.

2017-12-01

This paper describes the details of the parallel implementation of the SIMPLE algorithm for numerical solution of the Navier-Stokes system of equations on arbitrary unstructured grids. The iteration schemes for the serial and parallel versions of the SIMPLE algorithm are implemented. In the description of the parallel implementation, special attention is paid to computational data exchange among processors under the condition of the grid model decomposition using fictitious cells. We discuss the specific features for the storage of distributed matrices and implementation of vector-matrix operations in parallel mode. It is shown that the proposed way of matrix storage reduces the number of interprocessor exchanges. A series of numerical experiments illustrates the effect of the multigrid SLAE solver tuning on the general efficiency of the algorithm; the tuning involves the types of the cycles used (V, W, and F), the number of iterations of a smoothing operator, and the number of cells for coarsening. Two ways (direct and indirect) of efficiency evaluation for parallelization of the numerical algorithm are demonstrated. The paper presents the results of solving some internal and external flow problems with the evaluation of parallelization efficiency by two algorithms. It is shown that the proposed parallel implementation enables efficient computations for the problems on a thousand processors. Based on the results obtained, some general recommendations are made for the optimal tuning of the multigrid solver, as well as for selecting the optimal number of cells per processor.

Parallelism in matrix computations

CERN Document Server

Gallopoulos, Efstratios; Sameh, Ahmed H

2016-01-01

This book is primarily intended as a research monograph that could also be used in graduate courses for the design of parallel algorithms in matrix computations. It assumes general but not extensive knowledge of numerical linear algebra, parallel architectures, and parallel programming paradigms. The book consists of four parts: (I) Basics; (II) Dense and Special Matrix Computations; (III) Sparse Matrix Computations; and (IV) Matrix functions and characteristics. Part I deals with parallel programming paradigms and fundamental kernels, including reordering schemes for sparse matrices. Part II is devoted to dense matrix computations such as parallel algorithms for solving linear systems, linear least squares, the symmetric algebraic eigenvalue problem, and the singular-value decomposition. It also deals with the development of parallel algorithms for special linear systems such as banded, Vandermonde, Toeplitz, and block Toeplitz systems. Part III addresses sparse matrix computations: (a) the development of pa...
Eigenvalues calculation algorithms for λ-modes determination. Parallelization approach

Energy Technology Data Exchange (ETDEWEB)

Vidal, V. [Universidad Politecnica de Valencia (Spain). Departamento de Sistemas Informaticos y Computacion; Verdu, G.; Munoz-Cobo, J.L. [Universidad Politecnica de Valencia (Spain). Departamento de Ingenieria Quimica y Nuclear; Ginestart, D. [Universidad Politecnica de Valencia (Spain). Departamento de Matematica Aplicada

1997-03-01

In this paper, we review two methods to obtain the λ-modes of a nuclear reactor, the Subspace Iteration method and Arnoldi's method, which are popular methods to solve the partial eigenvalue problem for a given matrix. In the developed application for the neutron diffusion equation we include improved acceleration techniques for both methods. Also, we propose two parallelization approaches for these methods, a coarse grain parallelization and a fine grain one. We have tested the developed algorithms with two realistic problems, focusing on the efficiency of the methods according to the CPU times. (author).
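The basic subspace iteration can be sketched as follows for a small symmetric matrix: multiply a block of vectors by the matrix, re-orthonormalize, and repeat. This is a simplified illustration under stated assumptions (symmetric matrix, eigenvalues of distinct magnitude, no acceleration, no Rayleigh-Ritz step), not the λ-modes solver of the paper; the names `subspace_iteration` and `orthonormalize` are hypothetical.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def apply_matrix(A, cols):
    """Apply A to each vector in cols; the vectors are independent of each
    other, which is the natural coarse-grain unit of parallel work."""
    return [[dot(row, q) for row in A] for q in cols]


def orthonormalize(cols):
    """Modified Gram-Schmidt on a list of column vectors."""
    basis = []
    for v in cols:
        w = list(v)
        for u in basis:
            h = dot(u, w)
            w = [wi - h * ui for wi, ui in zip(w, u)]
        norm = math.sqrt(dot(w, w))
        basis.append([wi / norm for wi in w])
    return basis


def subspace_iteration(A, k, iters=200):
    """Estimate the k eigenvalues of largest magnitude of symmetric A."""
    n = len(A)
    # Arbitrary linearly independent starting block (an assumption).
    start = [[1.0 if i == j else 0.5 for i in range(n)] for j in range(k)]
    Q = orthonormalize(start)
    for _ in range(iters):
        Q = orthonormalize(apply_matrix(A, Q))
    # Rayleigh quotients of the converged basis vectors.
    values = [dot(q, Aq) for q, Aq in zip(Q, apply_matrix(A, Q))]
    return values, Q
```

A fine-grain parallelization would instead split each matrix-vector product and each inner product across processors.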
Potential for Australian involvement in ITER

International Nuclear Information System (INIS)

O'Connor, D. J.; Collins, G. A.; Hole, M. J.

2006-01-01

Fusion, the process that powers the sun and stars, offers a solution to the world's long-term energy needs: providing large scale energy production with zero greenhouse gas emissions, short-lived radioactive waste compared to conventional nuclear fission cycles, and a virtually limitless supply of fuel. Almost three decades of fusion research has produced spectacular progress. Present-day experiments have a power gain ratio of approximately 1 (ratio of power out to power in), with a power output in the tens of megawatts. The world's next major fusion experiment, the International Thermonuclear Experimental Reactor (ITER), will be a pre-prototype power plant. Since announcement of the ITER site in June 2005, the ITER project has gained momentum and political support. Despite Australia's foundation role in the field of fusion science, through the pioneering work of Sir Mark Oliphant, and significant contributions to the international fusion program over the succeeding years, Australia is not involved in the ITER project. In this talk, the activities of a recently formed consortium of scientists and engineers, the Australian ITER Forum, will be outlined. The Forum is drawn from five Universities, ANSTO (the Australian Nuclear Science and Technology Organisation) and AINSE (the Australian Institute for Nuclear Science and Engineering), and seeks to promote fusion energy in the Australian community and negotiate a role for Australia in the ITER project. As part of this activity, the Australian government recently funded a workshop that discussed the ways and means of engaging Australia in ITER. The workshop brought the research, industrial, government and general public communities together with the ITER partners, and forged an opportunity for ITER engagement, with scientific, industrial, and energy security rewards for Australia. We will report on the emerging scope for Australian involvement
Performance Analysis of Iterative Channel Estimation and Multiuser Detection in Multipath DS-CDMA Channels

Science.gov (United States)

Li, Husheng; Betz, Sharon M.; Poor, H. Vincent

2007-05-01

This paper examines the performance of decision feedback based iterative channel estimation and multiuser detection in channel coded aperiodic DS-CDMA systems operating over multipath fading channels. First, explicit expressions describing the performance of channel estimation and parallel interference cancellation based multiuser detection are developed. These results are then combined to characterize the evolution of the performance of a system that iterates among channel estimation, multiuser detection and channel decoding. Sufficient conditions for convergence of this system to a unique fixed point are developed.
US ITER Management Plan

International Nuclear Information System (INIS)

1991-12-01

This US ITER Management Plan is the plan for conducting the Engineering Design Activities within the US. The plan applies to all design, analyses, and associated physics and technology research and development (R&D) required to support the program. The plan defines the management considerations associated with these activities. The plan also defines the management controls that the project participants will follow to establish, implement, monitor, and report these activities. The activities are to be conducted by the project in accordance with this plan. The plan will be updated to reflect the then-current management approach required to meet the project objectives. The plan will be reviewed at least annually for possible revision. Section 2 presents the ITER objectives, a brief description of the ITER concept as developed during the Conceptual Design Activities, and comments on the Engineering Design Activities. Section 3 discusses the planned international organization for the Engineering Design Activities, from which the tasks will flow to the US Home Team. Section 4 describes the US ITER management organization and responsibilities during the Engineering Design Activities. Section 5 describes the project management and control to be used to perform the assigned tasks during the Engineering Design Activities. Section 6 presents the references. Several appendices are provided that contain detailed information related to the front material
High performance parallel computers for science: New developments at the Fermilab advanced computer program

International Nuclear Information System (INIS)

Nash, T.; Areti, H.; Atac, R.

1988-08-01

Fermilab's Advanced Computer Program (ACP) has been developing highly cost effective, yet practical, parallel computers for high energy physics since 1984. The ACP's latest developments are proceeding in two directions. A Second Generation ACP Multiprocessor System for experiments will include $3500 RISC processors, each with performance over 15 VAX MIPS. To support such high performance, the new system allows parallel I/O, parallel interprocess communication, and parallel host processes. The ACP Multi-Array Processor has been developed for theoretical physics. Each $4000 node is a FORTRAN or C programmable pipelined 20 MFlops (peak), 10 MByte single board computer. These are plugged into a 16 port crossbar switch crate which handles both inter and intra crate communication. The crates are connected in a hypercube. Site-oriented applications like lattice gauge theory are supported by system software called CANOPY, which makes the hardware virtually transparent to users. A 256 node, 5 GFlop system is under construction. 10 refs., 7 figs
Use of MCAM in creating 3D neutronics model for ITER building

International Nuclear Information System (INIS)

Zeng Qin; Wang Guozhong; Dang Tongqiang; Long Pengcheng; Loughlin, Michael

2012-01-01

Highlights: ► We created a 3D neutronics model of the ITER building. ► The model was produced from the engineering CAD model by MCAM software. ► The neutron flux map in the ITER building was calculated. - Abstract: The three dimensional (3D) neutronics reference model of International Thermonuclear Experimental Reactor (ITER) only defines the tokamak machine and extends to the bio-shield. In order to meet further 3D neutronics analysis needs, it is necessary to create a 3D reference model of the ITER building. Monte Carlo Automatic Modeling Program for Radiation Transport Simulation (MCAM) was developed as a computer aided design (CAD) based bi-directional interface program between general CAD systems and Monte Carlo radiation transport simulation codes. With the help of MCAM version 4.8, the 3D neutronics model of ITER building was created based on the engineering CAD model. The calculation of the neutron flux map in ITER building during operation showed the correctness and usability of the model. This model is the first detailed ITER building 3D neutronics model and it will be made available to all international organization collaborators as a reference model.
ITER oriented issues-2 (etc.)

International Nuclear Information System (INIS)

Goryayev, G.V.; Savchuk, V.V.; Shakhvorostov, Yu. V.

2004-01-01

The study analyzes the possibility of using beryllium ingots produced at UMZ (Ulba Metallurgical Plant) for the ITER program. The results of a comparative analysis of the specification requirements for the chemical composition of S-65 grade beryllium and statistical data on the impurity content of UMZ beryllium ingots are presented. It has been demonstrated that industrial beryllium ingots produced at UMZ can be used for the production of powders and billets conforming to the requirements of the ITER specification. The beryllium ingot production flow chart, a description of the basic process equipment, the layout of the metallurgical production upgrade, and the results of implementing that upgrade are complementary data to this study. The study is illustrated with explanatory drawings. (author)
Recommendations for a cryogenic system for ITER [International Thermonuclear Experimental Reactor]

International Nuclear Information System (INIS)

Slack, D.S.

1989-01-01

The International Thermonuclear Experimental Reactor (ITER) is a new tokamak design project with joint participation from Japan, the European Community, the Soviet Union, and the United States. ITER will be a large machine requiring up to 100 kW of refrigeration at 4.5 K to cool its superconducting magnets. Unlike earlier fusion experiments, the ITER cryogenic system must handle pulse loads constituting a large percentage of the total load. These come from neutron heating during a fusion burn and from ac losses during ramping of current in the PF (poloidal field) coils. This paper presents a conceptual design for a cryogenic system that meets ITER requirements. It describes a system with the following features: Only time-proven components are used. The system obtains a high efficiency without use of cold pumps or other developmental components. High reliability is achieved by paralleling compressors and expanders and by using adequate isolation valving. The problem of load fluctuations is solved by a simple load-leveling device. The cryogenic system can be housed in a separate building located at a considerable distance from the ITER core, if desired. The paper also summarizes physical plant size, cost estimates, and means of handling vented helium during magnet quench. 4 refs., 4 figs., 3 tabs
iHadoop: Asynchronous Iterations Support for MapReduce

KAUST Repository

Elnikety, Eslam

2011-08-01

MapReduce is a distributed programming framework designed to ease the development of scalable data-intensive applications for large clusters of commodity machines. Most machine learning and data mining applications involve iterative computations over large datasets, such as the Web hyperlink structures and social network graphs. Yet, the MapReduce model does not efficiently support this important class of applications. The architecture of MapReduce, most critically its dataflow techniques and task scheduling, is completely unaware of the nature of iterative applications; tasks are scheduled according to a policy that optimizes the execution for a single iteration, which wastes bandwidth, I/O, and CPU cycles when compared with an optimal execution for a consecutive set of iterations. This work presents iHadoop, a modified MapReduce model, and an associated implementation, optimized for iterative computations. The iHadoop model schedules iterations asynchronously. It connects the output of one iteration to the next, allowing both to process their data concurrently. iHadoop's task scheduler exploits inter-iteration data locality by scheduling tasks that exhibit a producer/consumer relation on the same physical machine, allowing a fast local data transfer. For those iterative applications that require satisfying certain criteria before termination, iHadoop runs the check concurrently during the execution of the subsequent iteration to further reduce the application's latency. This thesis also describes our implementation of the iHadoop model, and evaluates its performance against Hadoop, the widely used open source implementation of MapReduce. Experiments using different data analysis applications over real-world and synthetic datasets show that iHadoop performs better than Hadoop for iterative algorithms, reducing execution time of iterative applications by 25% on average. Furthermore, integrating iHadoop with HaLoop, a variant Hadoop implementation that caches
Involvement of the EU industry in ITER EDA

International Nuclear Information System (INIS)

Bogusch, E.

2001-01-01

Since the fifties, European industry has been involved in research and development in the field of nuclear fusion as a potential future source of energy. Early contributions mainly included deliveries of plant components and services to experimental facilities. In the Engineering Design Activities (EDA) phase of the planned multinational International Thermonuclear Experimental Reactor (ITER), from 1993 to 2001, this commitment of industry was intensified. Industries from seven European countries participated in the project with various contributions, e.g., in the development, design, and manufacture of components, and in the development of methods of planning and executing the complex ITER project. These activities were accompanied by an extensive R and D program, e.g., on materials and methods of manufacturing ITER components. In this way, European industry made an important contribution to the further development of nuclear fusion within the framework of ITER EDA activities, and will be able to continue this work intensively in the expected ITER construction phase to follow. (orig.) [de
FPGA implementation of low complexity LDPC iterative decoder

Science.gov (United States)

Verma, Shivani; Sharma, Sanjay

2016-07-01

Low-density parity-check (LDPC) codes, proposed by Gallager, emerged as a class of codes which can yield very good performance on the additive white Gaussian noise channel as well as on the binary symmetric channel. LDPC codes have gained considerable importance due to their capacity-achieving property and excellent performance in noisy channels. The belief propagation (BP) algorithm and its approximations, most notably min-sum, are popular iterative decoding algorithms used for LDPC and turbo codes. The trade-off between hardware complexity and decoding throughput is a critical factor in the implementation of a practical decoder. This article presents an introduction to LDPC codes and their various decoding algorithms, followed by the realisation of an LDPC decoder using a simplified message passing algorithm and a partially parallel decoder architecture. The simplified message passing algorithm has been proposed as a trade-off between low decoding complexity and decoder performance. It greatly reduces the routing and check node complexity of the decoder. The partially parallel decoder architecture possesses high speed and reduced complexity. The improved design of the decoder possesses a maximum symbol throughput of 92.95 Mbps and a maximum of 18 decoding iterations. The article presents the implementation of a 9216-bit, rate-1/2, (3, 6) LDPC decoder on a Xilinx XC3D3400A device from the Spartan-3A DSP family.
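As a rough illustration of the min-sum approximation named in this abstract, the following sketch computes the outgoing messages of a single check node. The function name `min_sum_check_update` and the sample values are hypothetical; this is the textbook min-sum rule, not the simplified message passing algorithm or the hardware architecture of the article.

```python
def min_sum_check_update(llrs):
    """Return the check-to-variable messages for one check node.

    llrs: incoming variable-to-check log-likelihood ratios (non-zero floats).
    For edge i, the outgoing message takes the product of the signs and the
    minimum of the magnitudes of all *other* incoming messages.
    """
    out = []
    for i in range(len(llrs)):
        others = llrs[:i] + llrs[i + 1:]
        sign = 1.0
        for v in others:
            if v < 0:
                sign = -sign
        out.append(sign * min(abs(v) for v in others))
    return out


if __name__ == "__main__":
    # Degree-3 check node with illustrative inputs.
    print(min_sum_check_update([2.0, -0.5, 1.5]))
```

Replacing the hyperbolic-tangent rule of full belief propagation with a sign product and a minimum is what makes this update cheap in hardware, at some cost in decoding performance.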
GPU Parallel Bundle Block Adjustment

Directory of Open Access Journals (Sweden)

ZHENG Maoteng

2017-09-01

To deal with massive data in photogrammetry, we introduce GPU parallel computing technology. The preconditioned conjugate gradient and inexact Newton methods are also applied to reduce the number of iterations needed to solve the normal equation. A brand new workflow of bundle adjustment is developed to utilize GPU parallel computing technology. Our method avoids the storage and inversion of the big normal matrix, and computes the normal matrix in real time. The proposed method not only greatly decreases the memory requirement of the normal matrix, but also greatly improves the efficiency of bundle adjustment. It also achieves the same accuracy as the conventional method. Preliminary experimental results show that the bundle adjustment of a dataset with about 4500 images and 9 million image points can be done in only 1.5 minutes while achieving sub-pixel accuracy.
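The preconditioned conjugate gradient method mentioned in this abstract can be sketched on a small dense system as follows. This is a generic serial version with a Jacobi (diagonal) preconditioner chosen purely for illustration. The function name `pcg` is hypothetical, and unlike the paper's approach this sketch stores the matrix explicitly and does not run on a GPU.

```python
def pcg(A, b, tol=1e-10, max_iter=100):
    """Solve A x = b for a symmetric positive-definite A (list of lists)
    by conjugate gradients with a Jacobi (diagonal) preconditioner."""
    n = len(b)

    def matvec(v):
        return [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]

    def dot(u, v):
        return sum(ui * vi for ui, vi in zip(u, v))

    x = [0.0] * n
    r = b[:]                                  # residual b - A x for x = 0
    z = [r[i] / A[i][i] for i in range(n)]    # apply the preconditioner
    p = z[:]
    rz = dot(r, z)
    for _ in range(max_iter):
        if dot(r, r) ** 0.5 < tol:            # stop once the residual is small
            break
        Ap = matvec(p)
        alpha = rz / dot(p, Ap)
        x = [x[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * Ap[i] for i in range(n)]
        z = [r[i] / A[i][i] for i in range(n)]
        rz_new = dot(r, z)
        p = [z[i] + (rz_new / rz) * p[i] for i in range(n)]
        rz = rz_new
    return x


if __name__ == "__main__":
    print(pcg([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0]))
```

Only matrix-vector products with `A` are needed, which is why such solvers can be arranged to avoid forming or inverting a large normal matrix.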
Development, Verification and Validation of Parallel, Scalable Volume of Fluid CFD Program for Propulsion Applications

Science.gov (United States)

West, Jeff; Yang, H. Q.

2014-01-01

There are many instances involving liquid/gas interfaces and their dynamics in the design of liquid engine powered rockets such as the Space Launch System (SLS). Some examples of these applications are: propellant tank draining and slosh, subcritical condition injector analysis for gas generators, preburners and thrust chambers, water deluge mitigation for launch induced environments and even solid rocket motor liquid slag dynamics. Commercially available CFD programs simulating gas/liquid interfaces using the Volume of Fluid approach are currently limited in their parallel scalability. In 2010 for instance, an internal NASA/MSFC review of three commercial tools revealed that parallel scalability was seriously compromised at 8 cpus and no additional speedup was possible after 32 cpus. Other non-interface CFD applications at the time were demonstrating useful parallel scalability up to 4,096 processors or more. Based on this review, NASA/MSFC initiated an effort to develop a Volume of Fluid implementation within the unstructured mesh, pressure-based algorithm CFD program, Loci-STREAM. After verification was achieved by comparing results to the commercial CFD program CFD-Ace+, and validation by direct comparison with data, Loci-STREAM-VoF is now the production CFD tool for propellant slosh force and slosh damping rate simulations at NASA/MSFC. On these applications, good parallel scalability has been demonstrated for problem sizes of tens of millions of cells and thousands of cpu cores. Ongoing efforts are focused on the application of Loci-STREAM-VoF to predict the transient flow patterns of water on the SLS Mobile Launch Platform in order to support the phasing of water for launch environment mitigation so that detrimental effects on the vehicle are not realized.
The technical feasibility of uranium enrichment for nuclear bomb construction at the parallel nuclear program plant

International Nuclear Information System (INIS)

Rosa, L.P.

1990-01-01

The role of the Parallel Nuclear Program in Brazil and the feasibility of uranium enrichment for nuclear bomb construction are discussed. This program involves two research centers, one belonging to the Brazilian navy and another to the aeronautics. Some other Brazilian institutes, such as CTA, IPEN, COPESP and CETEX, are also taking part in the program. (A.C.A.S.)
Status and plans for US ITER studies

International Nuclear Information System (INIS)

Doggett, J.N.

1992-01-01

The United States' participation in the International Thermonuclear Experimental Reactor (ITER) began in late 1987 when the initiative to start a cooperative program among the four Parties -- the Soviet Union, Japan, the European Community, and the United States -- was initiated. Participation then continued through the start of Joint Work in May 1988 until the conclusion of the Conceptual Design Activities (CDA) in December 1990. In the period between the conclusion of the CDA and the agreement to execute the Engineering Design Activities (EDA), the US ITER Home Team continued to do work on the design, executed additional research and development, and participated in the preparations for the EDA. Activities included one major design study on a High-Aspect-Ratio Design and input to the National ITER Technical Review, the ITER Steering Committee -- US, Special Working Group 1, and the Fusion Energy Advisory Committee's Panel 1. Research and development was continued in areas of work that were identified as critical-path elements by an international panel chartered by the four ITER Parties near the end of the CDA. I will describe the conclusion of the CDA and the interim US ITER activities and will give an indication of our involvement in the EDA
A fast iterative scheme for the linearized Boltzmann equation

Science.gov (United States)

Wu, Lei; Zhang, Jun; Liu, Haihu; Zhang, Yonghao; Reese, Jason M.

2017-06-01

Iterative schemes to find steady-state solutions to the Boltzmann equation are efficient for highly rarefied gas flows, but can be very slow to converge in the near-continuum flow regime. In this paper, a synthetic iterative scheme is developed to speed up the solution of the linearized Boltzmann equation by penalizing the collision operator L into the form L = (L + Nδh) - Nδh, where δ is the gas rarefaction parameter, h is the velocity distribution function, and N is a tuning parameter controlling the convergence rate. The velocity distribution function is first solved by the conventional iterative scheme, then it is corrected such that the macroscopic flow velocity is governed by a diffusion-type equation that is asymptotic-preserving into the Navier-Stokes limit. The efficiency of this new scheme is assessed by calculating the eigenvalue of the iteration, as well as solving for Poiseuille and thermal transpiration flows. We find that the fastest convergence of our synthetic scheme for the linearized Boltzmann equation is achieved when Nδ is close to the average collision frequency. The synthetic iterative scheme is significantly faster than the conventional iterative scheme in both the transition and the near-continuum gas flow regimes. Moreover, due to its asymptotic-preserving properties, the synthetic iterative scheme does not need high spatial resolution in the near-continuum flow regime, which makes it even faster than the conventional iterative scheme. Using this synthetic scheme, with the fast spectral approximation of the linearized Boltzmann collision operator, Poiseuille and thermal transpiration flows between two parallel plates, through channels of circular/rectangular cross sections and various porous media are calculated over the whole range of gas rarefaction. Finally, the flow of a Ne-Ar gas mixture is solved based on the linearized Boltzmann equation with the Lennard-Jones intermolecular potential for the first time, and the difference
ITER JCT presentation at the International Conference on Fusion Reactor Materials (ICFRM-9)

International Nuclear Information System (INIS)

Kalinin, G.; Barabash, V.; Ioki, K.

1999-01-01

During this conference, four invited papers and one poster paper were presented on behalf of the ITER Joint Central Team, reviewing the latest achievements. The results of the comprehensive materials R and D program in support of the ITER design were extensively reported by the ITER Home Teams
India's participation in the ITER (International Thermonuclear Experimental Reactor) collaboration

International Nuclear Information System (INIS)

Deshpande, Shishir

2012-01-01

Keeping its vision of developing fusion energy as a viable source, India joined the ITER collaboration in December 2005. ITER is a seven party collaboration with China, EU, India, Japan, S. Korea, Russia and the USA. ITER has a challenging mission of achieving Q=10 figure of merit at 500 MW fusion power output. The construction of ITER is structured as a set of 'in-kind' procurement packages to be executed by the partners. This involves all activities like design, prototyping, testing, shipping and assembly with commissioning at the ITER site at Cadarache, France. Currently, ITER presents the only opportunity to carry out novel experiments with burning plasmas and the new realms of fusion physics. It is important to participate in such experiments with a view for their exploitation in future. This talk summarizes the ITER device, its key challenges, role played by India and how these enmesh with the future of domestic program in fusion research. (author)
Overview of International Thermonuclear Experimental Reactor (ITER) engineering design activities*

Science.gov (United States)

Shimomura, Y.

1994-05-01

The International Thermonuclear Experimental Reactor (ITER) [International Thermonuclear Experimental Reactor (ITER) (International Atomic Energy Agency, Vienna, 1988), ITER Documentation Series, No. 1] project is a multiphased project, presently proceeding under the auspices of the International Atomic Energy Agency according to the terms of a four-party agreement among the European Atomic Energy Community (EC), the Government of Japan (JA), the Government of the Russian Federation (RF), and the Government of the United States (US), "the Parties." The ITER project is based on the tokamak, a Russian invention that has since been brought to a high level of development in all major fusion programs in the world. The objective of ITER is to demonstrate the scientific and technological feasibility of fusion energy for peaceful purposes. The ITER design is being developed by the Joint Central Team, with support from the Parties' four Home Teams. An overview of ITER design activities is presented.

Implementations of BLAST for parallel computers.

Science.gov (United States)

Jülich, A

1995-02-01

The BLAST sequence comparison programs have been ported to a variety of parallel computers: the shared memory machine Cray Y-MP 8/864 and the distributed memory architectures Intel iPSC/860 and nCUBE. Additionally, the programs were ported to run on workstation clusters. We explain the parallelization techniques and consider the pros and cons of these methods. The BLAST programs are very well suited for parallelization for a moderate number of processors. We illustrate our results using the program blastp as an example. As input data for blastp, a 799 residue protein query sequence and the protein database PIR were used.
On the Convergence of Asynchronous Parallel Pattern Search

International Nuclear Information System (INIS)

Tamara Gibson Kolda

2002-01-01

In this paper the authors prove global convergence for asynchronous parallel pattern search. In standard pattern search, decisions regarding the update of the iterate and the step-length control parameter are synchronized implicitly across all search directions. They lose this feature in asynchronous parallel pattern search since the search along each direction proceeds semi-autonomously. By bounding the value of the step-length control parameter after any step that produces decrease along a single search direction, they can prove that all the processes share a common accumulation point and that such a point is a stationary point of the standard nonlinear unconstrained optimization problem
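For orientation, the standard (synchronous) pattern search that this abstract contrasts with can be sketched as a simple compass search. The function name `compass_search` and its parameters are hypothetical, and the sketch does not implement the asynchronous parallel variant analysed in the paper. It only shows how the iterate and the step-length control parameter are updated together across all search directions.

```python
def compass_search(f, x0, step=1.0, tol=1e-6, max_iter=10000):
    """Minimise f by polling the 2n coordinate directions around the iterate."""
    x = list(x0)
    fx = f(x)
    n = len(x)
    for _ in range(max_iter):
        if step < tol:
            break
        improved = False
        for i in range(n):
            for d in (1.0, -1.0):
                trial = x[:]
                trial[i] += d * step
                ft = f(trial)
                if ft < fx:                  # accept the first decrease found
                    x, fx, improved = trial, ft, True
                    break
            if improved:
                break
        if not improved:
            step *= 0.5                      # contract after an unsuccessful poll
    return x, fx


if __name__ == "__main__":
    quad = lambda v: (v[0] - 1.0) ** 2 + (v[1] + 2.0) ** 2
    print(compass_search(quad, [0.0, 0.0]))
```

In the asynchronous setting each direction would be searched by a separate process with its own step length, which is why a bound on the step-length parameter is needed to recover a common accumulation point.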
A parallel algorithm for solving the multidimensional within-group discrete ordinates equations with spatial domain decomposition - 104

International Nuclear Information System (INIS)

Zerr, R.J.; Azmy, Y.Y.

2010-01-01

A spatial domain decomposition with a parallel block Jacobi solution algorithm has been developed based on the integral transport matrix formulation of the discrete ordinates approximation for solving the within-group transport equation. The new methodology abandons the typical source iteration scheme and solves directly for the fully converged scalar flux. Four matrix operators are constructed based upon the integral form of the discrete ordinates equations. A single differential mesh sweep is performed to construct these operators. The method is parallelized by decomposing the problem domain into several smaller sub-domains, each treated as an independent problem. The scalar flux of each sub-domain is solved exactly given incoming angular flux boundary conditions. Sub-domain boundary conditions are updated iteratively, and convergence is achieved when the scalar flux error in all cells meets a pre-specified convergence criterion. The method has been implemented in a computer code that was then employed for strong scaling studies of the algorithm's parallel performance via a fixed-size problem in tests ranging from one domain up to one cell per sub-domain. Results indicate that the best parallel performance compared to source iterations occurs for optically thick, highly scattering problems, the variety that is most difficult for the traditional SI scheme to solve. Moreover, the minimum execution time occurs when each sub-domain contains a total of four cells. (authors)
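The block Jacobi idea described in this abstract (each sub-domain solved exactly given boundary data from its neighbours, with the boundary data updated iteratively) can be sketched on a toy problem. The model system below, a tridiagonal matrix with diagonal 3 and off-diagonals -1, and the names `thomas` and `block_jacobi` are assumptions made for illustration; the sketch does not construct the integral transport matrix operators of the paper.

```python
def thomas(diag, off, rhs):
    """Exactly solve a tridiagonal system with constant diagonal `diag`
    and constant off-diagonals `off`."""
    n = len(rhs)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = off / diag
    d[0] = rhs[0] / diag
    for i in range(1, n):
        m = diag - off * c[i - 1]
        c[i] = off / m
        d[i] = (rhs[i] - off * d[i - 1]) / m
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def block_jacobi(b, block_size, diag=3.0, off=-1.0, tol=1e-10, max_iter=500):
    """Block Jacobi iteration: every block is solved exactly, using the
    neighbouring blocks' values from the previous iterate as boundary data."""
    n = len(b)
    u = [0.0] * n
    for it in range(max_iter):
        new_u = u[:]
        for start in range(0, n, block_size):   # blocks are independent
            end = min(start + block_size, n)
            rhs = b[start:end]
            if start > 0:
                rhs[0] -= off * u[start - 1]     # left boundary coupling
            if end < n:
                rhs[-1] -= off * u[end]          # right boundary coupling
            new_u[start:end] = thomas(diag, off, rhs)
        change = max(abs(p - q) for p, q in zip(new_u, u))
        u = new_u
        if change < tol:
            return u, it + 1
    return u, max_iter


if __name__ == "__main__":
    print(block_jacobi([1.0, 2.0, 3.0, 4.0, 5.0, 13.0], block_size=2))
```

Because each block in the inner loop reads only the previous iterate, the block solves could be distributed over processors with one exchange of boundary values per iteration.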
Development of parallel algorithms for electrical power management in space applications

Science.gov (United States)

Berry, Frederick C.

1989-01-01

The application of parallel techniques for electrical power system analysis is discussed. The Newton-Raphson method of load flow analysis was used along with the decomposition-coordination technique to perform load flow analysis. The decomposition-coordination technique enables tasks to be performed in parallel by partitioning the electrical power system into independent local problems. Each independent local problem represents a portion of the total electrical power system on which a load flow analysis can be performed. The load flow analysis is performed on these partitioned elements by using the Newton-Raphson load flow method. These independent local problems will produce results for voltage and power which can then be passed to the coordinator portion of the solution procedure. The coordinator problem uses the results of the local problems to determine if any correction is needed on the local problems. The coordinator problem is also solved by an iterative method much like the local problem. The iterative method for the coordination problem will also be the Newton-Raphson method. Therefore, each iteration at the coordination level will result in new values for the local problems. The local problems will have to be solved again along with the coordinator problem until some convergence conditions are met.
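The Newton-Raphson iteration used at both the local and the coordinator level can be sketched for a generic pair of nonlinear equations. The function name `newton_raphson_2d` and the test equations are hypothetical and stand in for the load flow mismatch equations, which are not given in the abstract.

```python
def newton_raphson_2d(F, J, x0, tol=1e-12, max_iter=50):
    """Solve F(x, y) = (0, 0) given the Jacobian J(x, y) as ((a, b), (c, d))."""
    x, y = x0
    for _ in range(max_iter):
        f1, f2 = F(x, y)
        if max(abs(f1), abs(f2)) < tol:      # converged: mismatches are small
            break
        (a, b), (c, d) = J(x, y)
        det = a * d - b * c
        # Solve J * [dx, dy] = -F by Cramer's rule.
        dx = (-f1 * d + f2 * b) / det
        dy = (-f2 * a + f1 * c) / det
        x, y = x + dx, y + dy
    return x, y


if __name__ == "__main__":
    F = lambda x, y: (x * x + y * y - 4.0, x - y)
    J = lambda x, y: ((2.0 * x, 2.0 * y), (1.0, -1.0))
    print(newton_raphson_2d(F, J, (1.0, 0.5)))
```

In a decomposition-coordination scheme, one such solve would run per local problem in parallel, followed by a coordinator solve that adjusts the shared quantities before the local problems are solved again.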
Implementation of GPU parallel equilibrium reconstruction for plasma control in EAST

Energy Technology Data Exchange (ETDEWEB)

Huang, Yao, E-mail: yaohuang@ipp.ac.cn [Institute of Plasma Physics, Chinese Academy of Sciences, Hefei (China); Xiao, B.J. [Institute of Plasma Physics, Chinese Academy of Sciences, Hefei (China); School of Nuclear Science & Technology, University of Science & Technology of China (China); Luo, Z.P.; Yuan, Q.P.; Pei, X.F. [Institute of Plasma Physics, Chinese Academy of Sciences, Hefei (China); Yue, X.N. [School of Nuclear Science & Technology, University of Science & Technology of China (China)

2016-11-15

Highlights: • We describe how the parallel equilibrium reconstruction code P-EFIT, running on a GPU, was integrated with the EAST plasma control system. • Compared with RT-EFIT used in EAST, P-EFIT has better spatial resolution and runs the full EFIT algorithm per iteration. • With the data interface through RFM, P-EFIT with 65 × 65 spatial grids can satisfy the accuracy and time feasibility requirements for plasma control. • Successful control using ISOFLUX/P-EFIT was established in the dedicated experiment during the EAST 2014 campaign. • This work is a stepping-stone towards versatile ISOFLUX/P-EFIT control, such as real-time equilibrium reconstruction with more diagnostics. - Abstract: Implementation of the P-EFIT code for plasma control in EAST is described. P-EFIT is based on the EFIT framework, but built with the CUDA™ architecture to take advantage of massively parallel Graphical Processing Unit (GPU) cores to significantly accelerate the computation. With a 65 × 65 grid, P-EFIT can complete one reconstruction iteration in 300 μs; with a one-iteration strategy, it can satisfy the needs of real-time plasma shape control. The data interface between P-EFIT and PCS is realized by transferring data through RFM. The first application of P-EFIT to discharge control in EAST is described.
Novel Robot Solutions for Carrying out Field Joint Welding and Machining in the Assembly of the Vacuum Vessel of ITER

International Nuclear Information System (INIS)

Pessi, P.

2009-01-01

It is necessary to use highly specialized robots in ITER (International Thermonuclear Experimental Reactor) both in the manufacturing and maintenance of the reactor due to a demanding environment. The sectors of the ITER vacuum vessel (VV) require more stringent tolerances than normally expected for the size of the structure involved. VV consists of nine sectors that are to be welded together. The vacuum vessel has a toroidal chamber structure. The task of the designed robot is to carry the welding apparatus along a path with a stringent tolerance during the assembly operation. In addition to the initial vacuum vessel assembly, after a limited running period, sectors need to be replaced for repair. Mechanisms with closed-loop kinematic chains are used in the design of robots in this work. One version is a purely parallel manipulator and another is a hybrid manipulator where the parallel and serial structures are combined. Traditional industrial robots that generally have the links actuated in series are inherently not very rigid and have poor dynamic performance in high speed and high dynamic loading conditions. Compared with open chain manipulators, parallel manipulators have high stiffness, high accuracy and a high force/torque capacity in a reduced workspace. Parallel manipulators have a mechanical architecture where all of the links are connected to the base and to the end-effector of the robot. The purpose of this thesis is to develop special parallel robots for the assembly, machining and repairing of the VV of the ITER. The process of the assembly and machining of the vacuum vessel needs a special robot. By studying the structure of the vacuum vessel, two novel parallel robots were designed and built; they have six and ten degrees of freedom driven by hydraulic cylinders and electrical servo motors. Kinematic models for the proposed robots were defined and two prototypes built. Experiments for machine cutting and laser welding with the 6-DOF robot were
SPECT reconstruction of combined cone beam and parallel hole collimation with experimental data

International Nuclear Information System (INIS)

Li, Jianying; Jaszczak, R.J.; Turkington, T.G.; Greer, K.L.; Coleman, R.E.

1993-01-01

The authors have developed three methods to combine parallel and cone beam (P and CB) SPECT data using modified Maximum Likelihood-Expectation Maximization (ML-EM) algorithms. The first combination method applies both parallel and cone beam data sets to reconstruct a single intermediate image after each iteration using the ML-EM algorithm. The other two iterative methods combine the intermediate parallel beam (PB) and cone beam (CB) source estimates to enhance the uniformity of images. These two methods are ad hoc methods. In earlier studies using computer Monte Carlo simulation, they suggested that improved images might be obtained by reconstructing combined P and CB SPECT data. These combined collimation methods are qualitatively evaluated using experimental data. Attenuation compensation is performed by including the effects of attenuation in the transition matrix as a multiplicative factor. The combined P and CB images are compared with CB-only images, and the results indicate that the combined P and CB approaches suppress artifacts caused by truncated projections and correct for the distortions of the CB-only images
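The basic ML-EM update underlying these methods can be sketched as follows. This is the generic algorithm on a tiny made-up transition matrix, not any of the three combination methods of the paper. The function name `mlem` and the sample numbers are hypothetical.

```python
def mlem(A, y, n_iter=50):
    """Generic ML-EM reconstruction.

    A: transition matrix as a list of rows; A[i][j] is the probability that
       an emission from voxel j is detected in projection bin i (attenuation
       can be folded in as a multiplicative factor on these entries).
    y: measured counts per projection bin.
    """
    n_bins, n_vox = len(A), len(A[0])
    sens = [sum(A[i][j] for i in range(n_bins)) for j in range(n_vox)]
    x = [1.0] * n_vox                         # uniform initial estimate
    for _ in range(n_iter):
        # Forward-project the current estimate.
        proj = [sum(A[i][j] * x[j] for j in range(n_vox)) for i in range(n_bins)]
        # Compare with the measurements bin by bin.
        ratio = [y[i] / proj[i] if proj[i] > 0 else 0.0 for i in range(n_bins)]
        # Back-project the ratios and apply the multiplicative update.
        x = [x[j] / sens[j] * sum(A[i][j] * ratio[i] for i in range(n_bins))
             for j in range(n_vox)]
    return x


if __name__ == "__main__":
    A = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
    print(mlem(A, [2.0, 4.0, 3.0]))
```

The multiplicative form keeps the estimate non-negative, and any additional measurement geometry enters only through extra rows of `A`.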
Efficient parallel implicit methods for rotary-wing aerodynamics calculations

Science.gov (United States)

Wissink, Andrew M.

Euler/Navier-Stokes Computational Fluid Dynamics (CFD) methods are commonly used for prediction of the aerodynamics and aeroacoustics of modern rotary-wing aircraft. However, their widespread application to large complex problems is limited by a lack of adequate computing power. Parallel processing offers the potential for dramatic increases in computing power, but most conventional implicit solution methods are inefficient in parallel and new techniques must be adopted to realize its potential. This work proposes alternative implicit schemes for Euler/Navier-Stokes rotary-wing calculations which are robust and efficient in parallel. The first part of this work proposes an efficient parallelizable modification of the Lower Upper-Symmetric Gauss Seidel (LU-SGS) implicit operator used in the well-known Transonic Unsteady Rotor Navier Stokes (TURNS) code. The new hybrid LU-SGS scheme couples a point-relaxation approach of the Data Parallel-Lower Upper Relaxation (DP-LUR) algorithm for inter-processor communication with the Symmetric Gauss Seidel algorithm of LU-SGS for on-processor computations. With the modified operator, TURNS is implemented in parallel using Message Passing Interface (MPI) for communication. Numerical performance and parallel efficiency are evaluated on the IBM SP2 and Thinking Machines CM-5 multi-processors for a variety of steady-state and unsteady test cases. The hybrid LU-SGS scheme maintains the numerical performance of the original LU-SGS algorithm in all cases and shows a good degree of parallel efficiency. It experiences a higher degree of robustness than DP-LUR for third-order upwind solutions. The second part of this work examines use of Krylov subspace iterative solvers for the nonlinear CFD solutions. The hybrid LU-SGS scheme is used as a parallelizable preconditioner. Two iterative methods are tested, Generalized Minimum Residual (GMRES) and Orthogonal s-Step Generalized Conjugate Residual (OSGCR). The Newton method demonstrates good
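The coupling idea behind the hybrid scheme (stale, Jacobi-style data across processor boundaries; symmetric Gauss-Seidel within each processor's own unknowns) can be sketched on a small scalar linear system. This is a simplified analogue under stated assumptions, not the TURNS operator: the matrix, the block partition, and the name `hybrid_sweep` are all hypothetical.

```python
def hybrid_sweep(A, b, x, blocks):
    """One sweep: Jacobi coupling between blocks, symmetric Gauss-Seidel inside."""
    old = list(x)   # values from other blocks stay frozen for the whole sweep
    new = list(x)
    for block in blocks:            # each block stands in for one processor
        inside = set(block)
        # Forward pass followed by backward pass over the block's unknowns.
        for i in list(block) + list(reversed(block)):
            total = b[i]
            for j, a in enumerate(A[i]):
                if j == i:
                    continue
                # Fresh values inside the block, stale values from outside it.
                total -= a * (new[j] if j in inside else old[j])
            new[i] = total / A[i][i]
    return new


# Hypothetical strictly diagonally dominant test system.
A = [[4.0, -1.0, 0.0, 0.0],
     [-1.0, 4.0, -1.0, 0.0],
     [0.0, -1.0, 4.0, -1.0],
     [0.0, 0.0, -1.0, 4.0]]
x_true = [1.0, 2.0, 3.0, 4.0]
b = [sum(a * v for a, v in zip(row, x_true)) for row in A]

x = [0.0] * 4
blocks = [[0, 1], [2, 3]]           # two "processors", two unknowns each
for _ in range(50):
    x = hybrid_sweep(A, b, x, blocks)
print([round(v, 6) for v in x])
```

Because off-block values are read only from the previous sweep, the blocks are independent within a sweep and could be updated concurrently, with one exchange of boundary values per sweep.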
Exploiting Symmetry on Parallel Architectures.

Science.gov (United States)

Stiller, Lewis Benjamin

1995-01-01

This thesis describes techniques for the design of parallel programs that solve well-structured problems with inherent symmetry. Part I demonstrates the reduction of such problems to generalized matrix multiplication by a group-equivariant matrix. Fast techniques for this multiplication are described, including factorization, orbit decomposition, and Fourier transforms over finite groups. Our algorithms entail interaction between two symmetry groups: one arising at the software level from the problem's symmetry and the other arising at the hardware level from the processors' communication network. Part II illustrates the applicability of our symmetry-exploitation techniques by presenting a series of case studies of the design and implementation of parallel programs. First, a parallel program that solves chess endgames by factorization of an associated dihedral group-equivariant matrix is described. This code runs faster than previous serial programs, and it discovered a number of results. Second, parallel algorithms for Fourier transforms for finite groups are developed, and preliminary parallel implementations for group transforms of dihedral and of symmetric groups are described. Applications in learning, vision, pattern recognition, and statistics are proposed. Third, parallel implementations solving several computational science problems are described, including the direct n-body problem, convolutions arising from molecular biology, and some communication primitives such as broadcast and reduce. Some of our implementations ran orders of magnitude faster than previous techniques, and were used in the investigation of various physical phenomena.
Future on the ITER program. On a branch of research on nuclear fusion

International Nuclear Information System (INIS)

Masaike, Akira

2000-01-01

Because research and development on nuclear fusion requires a huge investment, international cooperative programs such as ITER are being promoted, and a Japanese response is required. Although the significance of the program can be understood from the viewpoint of promoting basic science, its concept as a key to the energy problem is not yet sufficient. Moreover, its effects on technical problems and on the environment cannot be neglected. Some proposals are presented on the need to discuss how the program should be promoted in view of these problems. When a large-scale program that consumes the national budget is carried out, it is natural that the agreement of the people must be obtained. Regrettably, in Japan, science programs, and nuclear policy above all, have scarcely been discussed at the citizens' level, and there have been bitter experiences in which those concerned promoted a program one-sidedly under a concept that, once decided, was never changed, pressing ahead without responding to scientific advances and social changes. Therefore, the future plan for nuclear fusion must be subjected to thorough and wide-ranging discussion from various viewpoints, such as feasibility, safety, and economics, so that it can be adapted carefully. Those promoting the program and their colleagues in the academic community also seem to have a responsibility to explain the present state and scope of the plan in plain terms, not only to scientists but also to citizens, and to encourage discussion with them. (G.K.)
Qualification Test for Korean Mockups of ITER Blanket First Wall

International Nuclear Information System (INIS)

Kim, S. K.; Lee, D. W.; Bae, Y. D.; Hong, B. G.; Jung, H. K.; Jung, Y. I.; Park, J. Y.; Jeong, Y. H.; Choi, B. K.; Kim, B. Y.

2009-01-01

ITER First Wall (FW) includes the beryllium armor tiles joined to CuCrZr heat sink with stainless steel cooling tubes. These first wall panels are one of the critical components in the ITER machine, with a surface heat flux of 0.5 MW/m² or above. A qualification program therefore needs to be performed with the goal of qualifying the joining technologies required for the ITER First Wall. Based on the results of tests, the acceptance of the developed joining technologies will be established. The results of this qualification test will affect the final selection of the manufacturers for the ITER First Wall
Use of MCAM in creating 3D neutronics model for ITER building

Energy Technology Data Exchange (ETDEWEB)

Zeng Qin [Institute of Nuclear Energy Safety Technology, Chinese Academy of Sciences, Hefei, Anhui 230031 (China); School of Nuclear Science and Technology, University of Science and Technology of China, Hefei, Anhui 230027 (China); Wang Guozhong, E-mail: mango33@mail.ustc.edu.cn [School of Nuclear Science and Technology, University of Science and Technology of China, Hefei, Anhui 230027 (China); Dang Tongqiang [School of Nuclear Science and Technology, University of Science and Technology of China, Hefei, Anhui 230027 (China); Long Pengcheng [Institute of Nuclear Energy Safety Technology, Chinese Academy of Sciences, Hefei, Anhui 230031 (China); School of Nuclear Science and Technology, University of Science and Technology of China, Hefei, Anhui 230027 (China); Loughlin, Michael [ITER Organization, Route de Vinon sur Verdon, 13115 St. Paul-Lz-Durance (France)

2012-08-15

Highlights: ► We created a 3D neutronics model of the ITER building. ► The model was produced from the engineering CAD model by MCAM software. ► The neutron flux map in the ITER building was calculated. Abstract: The three dimensional (3D) neutronics reference model of International Thermonuclear Experimental Reactor (ITER) only defines the tokamak machine and extends to the bio-shield. In order to meet further 3D neutronics analysis needs, it is necessary to create a 3D reference model of the ITER building. Monte Carlo Automatic Modeling Program for Radiation Transport Simulation (MCAM) was developed as a computer aided design (CAD) based bi-directional interface program between general CAD systems and Monte Carlo radiation transport simulation codes. With the help of MCAM version 4.8, the 3D neutronics model of ITER building was created based on the engineering CAD model. The calculation of the neutron flux map in ITER building during operation showed the correctness and usability of the model. This model is the first detailed ITER building 3D neutronics model and it will be made available to all international organization collaborators as a reference model.
Parallelization of a blind deconvolution algorithm

Science.gov (United States)

Matson, Charles L.; Borelli, Kathy J.

2006-09-01

Often it is of interest to deblur imagery in order to obtain higher-resolution images. Deblurring requires knowledge of the blurring function - information that is often not available separately from the blurred imagery. Blind deconvolution algorithms overcome this problem by jointly estimating both the high-resolution image and the blurring function from the blurred imagery. Because blind deconvolution algorithms are iterative in nature, they can take minutes to days to deblur an image depending on how many frames of data are used for the deblurring and the platforms on which the algorithms are executed. Here we present our progress in parallelizing a blind deconvolution algorithm to increase its execution speed. This progress includes sub-frame parallelization and a code structure that is not specialized to a specific computer hardware architecture.
High-performance blob-based iterative three-dimensional reconstruction in electron tomography using multi-GPUs

Directory of Open Access Journals (Sweden)

Wan Xiaohua

2012-06-01

Background: Three-dimensional (3D) reconstruction in electron tomography (ET) has emerged as a leading technique to elucidate the molecular structures of complex biological specimens. Blob-based iterative methods are advantageous reconstruction methods for 3D reconstruction in ET, but demand huge computational costs. Multiple graphic processing units (multi-GPUs) offer an affordable platform to meet these demands. However, a synchronous communication scheme between multi-GPUs leads to idle GPU time, and a weighted matrix involved in iterative methods cannot be loaded into GPUs especially for large images due to the limited available memory of GPUs. Results: In this paper we propose a multilevel parallel strategy combined with an asynchronous communication scheme and a blob-ELLR data structure to efficiently perform blob-based iterative reconstructions on multi-GPUs. The asynchronous communication scheme is used to minimize the idle GPU time so as to asynchronously overlap communications with computations. The blob-ELLR data structure only needs nearly 1/16 of the storage space in comparison with the ELLPACK-R (ELLR) data structure and yields significant acceleration. Conclusions: Experimental results indicate that the multilevel parallel scheme combined with the asynchronous communication scheme and the blob-ELLR data structure allows efficient implementations of 3D reconstruction in ET on multi-GPUs.
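For readers unfamiliar with the baseline format, the following sketch shows a plain ELLPACK-R (ELLR) layout and the matching sparse matrix-vector product: padded value and column arrays plus a per-row length array that lets the product skip the padding. The helper names `to_ellr` and `ellr_matvec` and the test matrix are assumptions; the blob-specific compression of blob-ELLR described in the abstract is not reproduced here, and a GPU implementation would typically store the arrays column-major.

```python
def to_ellr(dense):
    """Convert a dense matrix (list of rows) to ELLR arrays."""
    rows = [[(j, v) for j, v in enumerate(row) if v != 0] for row in dense]
    row_len = [len(r) for r in rows]          # nonzeros per row
    width = max(row_len) if row_len else 0    # padded row width
    cols = [[r[k][0] if k < len(r) else 0 for k in range(width)] for r in rows]
    vals = [[r[k][1] if k < len(r) else 0.0 for k in range(width)] for r in rows]
    return vals, cols, row_len


def ellr_matvec(vals, cols, row_len, x):
    """y = A x; the row-length array bounds the inner loop, so padding is skipped."""
    y = []
    for i in range(len(row_len)):             # conceptually one thread per row
        acc = 0.0
        for k in range(row_len[i]):
            acc += vals[i][k] * x[cols[i][k]]
        y.append(acc)
    return y


# Hypothetical test matrix with rows of different lengths, including an empty row.
dense = [[2.0, 0.0, 0.0, 1.0],
         [0.0, 3.0, 0.0, 0.0],
         [0.0, 0.0, 0.0, 0.0],
         [4.0, 0.0, 5.0, 6.0]]
vals, cols, row_len = to_ellr(dense)
y = ellr_matvec(vals, cols, row_len, [1.0, 2.0, 3.0, 4.0])
print(row_len, y)
```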
MEGA-CC: computing core of molecular evolutionary genetics analysis program for automated and iterative data analysis.

Science.gov (United States)

Kumar, Sudhir; Stecher, Glen; Peterson, Daniel; Tamura, Koichiro

2012-10-15

There is a growing need in the research community to apply the molecular evolutionary genetics analysis (MEGA) software tool for batch processing a large number of datasets and to integrate it into analysis workflows. Therefore, we now make available the computing core of the MEGA software as a stand-alone executable (MEGA-CC), along with an analysis prototyper (MEGA-Proto). MEGA-CC provides users with access to all the computational analyses available through MEGA's graphical user interface version. This includes methods for multiple sequence alignment, substitution model selection, evolutionary distance estimation, phylogeny inference, substitution rate and pattern estimation, tests of natural selection and ancestral sequence inference. Additionally, we have upgraded the source code for phylogenetic analysis using the maximum likelihood methods for parallel execution on multiple processors and cores. Here, we describe MEGA-CC and outline the steps for using MEGA-CC in tandem with MEGA-Proto for iterative and automated data analysis. http://www.megasoftware.net/.
A Quadratically Convergent O(√n L)-Iteration Algorithm for Linear Programming

National Research Council Canada - National Science Library

Ye, Y; Gueler, O; Tapia, Richard A; Zhang, Y

1991-01-01

...)-iteration complexity while exhibiting superlinear convergence of the duality gap to zero under the assumption that the iteration sequence converges, and quadratic convergence of the duality gap...
The language parallel Pascal and other aspects of the massively parallel processor

Science.gov (United States)

Reeves, A. P.; Bruner, J. D.

1982-01-01

A high level language for the Massively Parallel Processor (MPP) was designed. This language, called Parallel Pascal, is described in detail. A description of the language design, a description of the intermediate language, Parallel P-Code, and details for the MPP implementation are included. Formal descriptions of Parallel Pascal and Parallel P-Code are given. A compiler was developed which converts programs in Parallel Pascal into the intermediate Parallel P-Code language. The code generator to complete the compiler for the MPP is being developed independently. A Parallel Pascal to Pascal translator was also developed. The architecture design for a VLSI version of the MPP was completed with a description of fault tolerant interconnection networks. The memory arrangement aspects of the MPP are discussed and a survey of other high level languages is given.
Iterative Adaptive Dynamic Programming for Solving Unknown Nonlinear Zero-Sum Game Based on Online Data.

Science.gov (United States)

Zhu, Yuanheng; Zhao, Dongbin; Li, Xiangjun

2017-03-01

H∞ control is a powerful method to solve the disturbance attenuation problems that occur in some control systems. The design of such controllers relies on solving the zero-sum game (ZSG). But in practical applications, the exact dynamics is mostly unknown. Identification of dynamics also produces errors that are detrimental to the control performance. To overcome this problem, an iterative adaptive dynamic programming algorithm is proposed in this paper to solve the continuous-time, unknown nonlinear ZSG with only online data. A model-free approach to the Hamilton-Jacobi-Isaacs equation is developed based on the policy iteration method. Control and disturbance policies and value are approximated by neural networks (NNs) under the critic-actor-disturber structure. The NN weights are solved by the least-squares method. According to the theoretical analysis, our algorithm is equivalent to a Gauss-Newton method solving an optimization problem, and it converges uniformly to the optimal solution. The online data can also be used repeatedly, which is highly efficient. Simulation results demonstrate its feasibility to solve the unknown nonlinear ZSG. When compared with other algorithms, it saves a significant amount of online measurement time.
Parallel 3-D method of characteristics in MPACT

International Nuclear Information System (INIS)

Kochunas, B.; Downar, T. J.; Liu, Z.

2013-01-01

A new parallel 3-D MOC kernel has been developed and implemented in MPACT which makes use of the modular ray tracing technique to reduce computational requirements and to facilitate parallel decomposition. The parallel model makes use of both distributed and shared memory parallelism which are implemented with the MPI and OpenMP standards, respectively. The kernel is capable of parallel decomposition of problems in space, angle, and by characteristic rays up to O(10^4) processors. Initial verification of the parallel 3-D MOC kernel was performed using the Takeda 3-D transport benchmark problems. The eigenvalues computed by MPACT are within the statistical uncertainty of the benchmark reference and agree well with the averages of other participants. The MPACT k_eff differs from the benchmark results for rodded and un-rodded cases by 11 and -40 pcm, respectively. The calculations were performed for various numbers of processors and parallel decompositions up to 15625 processors; all producing the same result at convergence. The parallel efficiency of the worst case was 60%, while very good efficiency (>95%) was observed for cases using 500 processors. The overall run time for the 500 processor case was 231 seconds and 19 seconds for the case with 15625 processors. Ongoing work is focused on developing theoretical performance models and the implementation of acceleration techniques to minimize the number of iterations to converge. (authors)
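The two run times quoted above allow a quick relative-scaling calculation. The short sketch below computes the speedup of the 15625-processor run over the 500-processor run and the corresponding relative efficiency, assuming both runs solve the same problem. Because the abstract does not state the baseline or the decomposition behind its quoted efficiencies, this relative figure should not be expected to match them.

```python
# Run times and processor counts as quoted in the abstract.
procs_small, time_small = 500, 231.0
procs_large, time_large = 15625, 19.0

# Speedup of the large run relative to the small run.
speedup = time_small / time_large

# Ideal speedup if run time scaled inversely with processor count.
ideal = procs_large / procs_small

# Relative efficiency: achieved speedup as a fraction of the ideal one.
relative_efficiency = speedup / ideal

print(f"speedup {speedup:.2f}x of an ideal {ideal:.2f}x, "
      f"relative efficiency {relative_efficiency:.1%}")
```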

Mission of ITER and Challenges for the Young

International Nuclear Information System (INIS)

Ikeda, Kaname

2009-01-01

It is recognized that the ongoing effort to provide sufficient energy for the wellbeing of the globe's population and to power the world economy is of the greatest importance. ITER is a joint international research and development project that aims to demonstrate the scientific and technical feasibility of fusion power. It represents the responsible actions of governments whose countries comprise over half the world's population, to create fusion power as a source of clean, economic, carbon dioxide-free energy. This is the most important science initiative of our time. The partners in the Project--the ITER Parties--are the European Union, Japan, the People's Republic of China, India, the Republic of Korea, the Russian Federation and the USA. ITER will be constructed in Europe, at Cadarache in the South of France. The talk will illustrate the genesis of the ITER Organization, the ongoing work at the Cadarache site and the planned schedule for construction. There will also be an explanation of the unique aspects of international collaboration that have been developed for ITER. Although the present focus of the project is construction activities, ITER is also a major scientific and technological research program, for which the best of the world's intellectual resources is needed. Challenges for the young, imperative for fulfillment of the objective of ITER, will be identified. It is important that young students and researchers worldwide recognize the rapid development of the project, and the fundamental issues that must be overcome in ITER. The talk will also cover the exciting career and fellowship opportunities for young people at the ITER Organization.
Divertor cassette movers prototypes for ITER

International Nuclear Information System (INIS)

Bogusch, E.; Batz, R.; Bieber, O.; Gottfried, R.; Cerdan, G.

1998-01-01

Following competitive tendering, in October 1996 Siemens was contracted by the European Commission to design and supply an assembly of four Divertor Cassette Movers Prototypes including the control and command systems for the movers proper. The assembly, consisting of one Cassette Toroidal Mover (CTM), one Radial Mover Tractor (TRC), one Second Cassette Carrier (SCC), and one Radial Cassette Carrier (RCC), represents key components of the Divertor Test Platform at Brasimone, one of the seven large R and D projects for ITER. By detailed design, high-precision manufacturing and testing of these devices, Siemens contributed to the verification of an important task within the European R and D program towards ITER construction. Replacement of the divertor cassettes is a scheduled maintenance operation throughout the life of ITER. The successful fabrication and testing of the Divertor Cassette Movers Prototypes is an important milestone to verify this delicate operation. (authors)
ITER Remote Maintenance System (IRMS) lifecycle management

Energy Technology Data Exchange (ETDEWEB)

Tesini, Alessandro, E-mail: alessandro.tesini@iter.org [ITER Organization, CS 90 046, 13067 St. Paul Lez Durance Cedex (France); Otto' , Bede [Oxford Technologies Ltd, 7, Nuffield Way, Abingdon, Oxon OX14 1RJ (United Kingdom); Blight, John [FAAST 31c Allee de la Granette, 13600 Ceyreste (France); Choi, Chang-Hwan; Friconneau, Jean-Pierre; Gotewal, Krishan Kumar; Hamilton, David [ITER Organization, CS 90 046, 13067 St. Paul Lez Durance Cedex (France); Heckendorn, Frank [FD Technologies, PO Box 6686, Aiken, SC (United States); Martins, Jean-Pierre [ITER Organization, CS 90 046, 13067 St. Paul Lez Durance Cedex (France); Marty, Thomas [Westinghouse, 122, avenue de Hambourg, 13008 Marseille (France); Nakahira, Masataka; Palmer, Jim; Subramanian, Rajendran [ITER Organization, CS 90 046, 13067 St. Paul Lez Durance Cedex (France)

2011-10-15

The availability of the ITER machine to perform its scientific program is strongly dependent on the performance of the different Remote Handling (RH) systems constituting the ITER Remote Maintenance System (IRMS). The lifecycle of the IRMS will largely exceed 40 years from initial concept design and proof testing through to machine decommissioning. Such a long lifecycle requires that a rigorous approach is put in place to guarantee the technical capabilities of the highly innovative IRMS, its efficiency and its availability. For this purpose, an IRMS System Engineering and IRMS lifecycle management approach has been adopted by ITER. The approach aims at ensuring the IRMS full operability and availability at an acceptable cost of ownership over the full ITER machine assembly and operations period. The IRMS lifecycle management method described in this paper covers such subjects as specific requirements for IRMS design reviews, monitoring during manufacture, factory and site acceptance testing, integrated commissioning, decontamination, maintenance and re-qualification strategies, requirements for Integrated Logistical Support during operations. The updating and implementation of the IRMS lifecycle strategy and this procedure will be managed and monitored by the Remote Handling Integrated Product Team (RH-IPT). Although developed for the IRMS, the basic principles and procedures of lifecycle management could be applied to other ITER plant systems whose reliability and availability will be essential for the continued operation of the ITER machine.
ITER blanket module shield block design and analysis

International Nuclear Information System (INIS)

Mitin, D.; Khomyakov, S.; Razmerov, A.; Strebkov, Yu.

2008-01-01

This paper presents the alternative design of the shield block cooling path for a typical ITER blanket module with a predominantly sequential flow circuit. A number of serious disadvantages have been observed for the reference design, where the parallel flow circuit is used, which is inherent in the majority of blanket modules. The paper discusses these disadvantages and demonstrates the benefit of the alternative design based on the detailed design and the technological, hydraulic, thermal, structural and strength analyses, conducted for module no. 17
ITER-FEAT - outline design report. Report by the ITER Director. ITER meeting, Tokyo, January 2000

International Nuclear Information System (INIS)

2001-01-01

It is now possible to define the key elements of ITER-FEAT. This report provides the results, to date, of the joint work of the Special Working Group in the form of an Outline Design Report on the ITER-FEAT design which, subject to the views of ITER Council and of the Parties, will be the focus of further detailed design work and analysis in order to provide to the Parties a complete and fully integrated engineering design within the framework of the ITER EDA extension
System engineering and configuration management in ITER

International Nuclear Information System (INIS)

Chiocchio, S.; Martin, E.; Barabaschi, P.; Bartels, Hans Werner; How, J.; Spears, W.

2007-01-01

The construction of ITER will represent a major challenge for the fusion community at large, because of the intrinsic complexity of the tokamak design, the large number of different systems which are all essential for its operation, the worldwide distribution of the design activities and the unusual procurement scheme based on a combination of in-kind and directly funded deliverables. A key requirement for the success of such a large project is that a systematic approach to ensure the consistency of the design with the required performance is adopted. Also, effective project management methods, tools and working practices must be deployed to facilitate the communication and collaboration among the institutions and industries involved in the project. The authors have been involved in the definition and practical implementation of the design integration and configuration control structure inside ITER and in the system engineering process during the selection and optimization of the machine configuration. In parallel, they have assessed design, drawing and documentation management software to be used for the construction phase. Here, they describe the experience gained in recent years, explain the drivers behind the selection of the documents and drawings management systems, and illustrate the scope and issues of the configuration management activities to ensure the congruence of the design, to control and track the design changes and to manage the interfaces among the ITER systems
Fourier analysis of parallel block-Jacobi splitting with transport synthetic acceleration in two-dimensional geometry

International Nuclear Information System (INIS)

Rosa, M.; Warsa, J. S.; Chang, J. H.

2007-01-01

A Fourier analysis is conducted in two-dimensional (2D) Cartesian geometry for the discrete-ordinates (SN) approximation of the neutron transport problem solved with Richardson iteration (Source Iteration) and Richardson iteration preconditioned with Transport Synthetic Acceleration (TSA), using the Parallel Block-Jacobi (PBJ) algorithm. The results for the un-accelerated algorithm show that convergence of PBJ can degrade, leading in particular to stagnation of GMRES(m) in problems containing optically thin sub-domains. The results for the accelerated algorithm indicate that TSA can be used to efficiently precondition an iterative method in the optically thin case when implemented in the 'modified' version MTSA, in which only the scattering in the low order equations is reduced by some non-negative factor β<1. (authors)
Tokamak equilibria with non-parallel flow in a triangularity-deformed axisymmetric toroidal coordinate system

Directory of Open Access Journals (Sweden)

Ap Kuiroukidis

2018-01-01

We consider a generalized Grad–Shafranov equation (GGSE) in a triangularity-deformed axisymmetric toroidal coordinate system and solve it numerically for the generic case of ITER-like and JET-like equilibria with non-parallel flow. It turns out that increase of the triangularity improves confinement by leading to larger values of the toroidal beta and the safety factor. This result is supported by the application of a criterion for linear stability valid for equilibria with flow parallel to the magnetic field. Also, the parallel flow has a weaker stabilizing effect.
An approach to multicore parallelism using functional programming: A case study based on Presburger Arithmetic

DEFF Research Database (Denmark)

Dung, Phan Anh; Hansen, Michael Reichhardt

2015-01-01

In this paper we investigate multicore parallelism in the context of functional programming by means of two quantifier-elimination procedures for Presburger Arithmetic: one is based on Cooper’s algorithm and the other is based on the Omega Test. We first develop correct-by-construction prototype...... platform executing on an 8-core machine. A speedup of approximately 4 was obtained for Cooper’s algorithm and a speedup of approximately 6 was obtained for the exact-shadow part of the Omega Test. The considered procedures are complex, memory-intense algorithms on huge formula trees and the case study...... reveals more generally applicable techniques and guidelines for deriving parallel algorithms from sequential ones in the context of data-intensive tree algorithms. The obtained insights should apply for any strict and impure functional programming language. Furthermore, the results obtained for the exact...
Tuning iteration space slicing based tiled multi-core code implementing Nussinov's RNA folding.

Science.gov (United States)

Palkowski, Marek; Bielecki, Wlodzimierz

2018-01-15

RNA folding is an ongoing compute-intensive task of bioinformatics. Parallelization and improving code locality for this kind of algorithm are among the most relevant areas in computational biology. Fortunately, RNA secondary structure approaches, such as Nussinov's recurrence, involve mathematical operations over affine control loops whose iteration space can be represented by the polyhedral model. This allows us to apply powerful polyhedral compilation techniques based on the transitive closure of dependence graphs to generate parallel tiled code implementing Nussinov's RNA folding. Such techniques are within the iteration space slicing framework - the transitive dependences are applied to the statement instances of interest to produce valid tiles. The main problem in generating parallel tiled code is defining a proper tile size and tile dimension, which impact the degree of parallelism and code locality. To choose the best tile size and tile dimension, we first construct parallel parametric tiled code (parameters are variables defining tile size). For this purpose, we first generate two nonparametric tiled codes with different fixed tile sizes but with the same code structure and then derive a general affine model, which describes all integer factors available in expressions of those codes. Using this model and known integer factors present in the mentioned expressions (they define the left-hand side of the model), we find unknown integers in this model for each integer factor available in the same fixed tiled code position and replace in this code expressions, including integer factors, with those including parameters. Then we use this parallel parametric tiled code to implement the well-known tile size selection (TSS) technique, which allows us to discover in a given search space the best tile size and tile dimension maximizing target code performance. For a given search space, the presented approach allows us to choose the best tile size and tile dimension in
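For context, Nussinov's recurrence itself is a short dynamic program over a triangular table, and its affine loop nest is what makes it a candidate for polyhedral tiling. The sketch below is a plain serial, untiled version that counts the maximum number of base pairs; it is not the authors' generated code, and the function name, the pairing set (including G-U wobble pairs), and the `min_loop` parameter are assumptions.

```python
def nussinov(seq, min_loop=0):
    """Maximum number of non-crossing base pairs (Nussinov's recurrence)."""
    pairs = {("A", "U"), ("U", "A"), ("G", "C"),
             ("C", "G"), ("G", "U"), ("U", "G")}
    n = len(seq)
    if n == 0:
        return 0
    # table[i][j] = best score for the subsequence seq[i..j]; empty spans score 0.
    table = [[0] * n for _ in range(n)]
    for span in range(1, n):                 # fill by increasing subsequence length
        for i in range(n - span):
            j = i + span
            # Case 1: leave base i or base j unpaired.
            best = max(table[i + 1][j], table[i][j - 1])
            # Case 2: pair i with j, if allowed and the loop is long enough.
            if (seq[i], seq[j]) in pairs and span > min_loop:
                inner = table[i + 1][j - 1] if i + 1 <= j - 1 else 0
                best = max(best, inner + 1)
            # Case 3: split into two independent sub-structures.
            for k in range(i + 1, j):
                best = max(best, table[i][k] + table[k + 1][j])
            table[i][j] = best
    return table[0][n - 1]


print(nussinov("GGGAAAUCC"))
```

Every cell depends only on cells with a shorter span, so all cells on one anti-diagonal are independent of each other; that is the basic source of parallelism that tiling schemes build on.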
Fast Time and Space Parallel Algorithms for Solution of Parabolic Partial Differential Equations

Science.gov (United States)

Fijany, Amir

1993-01-01

In this paper, fast time- and space-parallel algorithms for solution of linear parabolic PDEs are developed. It is shown that the seemingly strictly serial iterations of the time-stepping procedure for solution of the problem can be completely decoupled.
IWR-solution for the ITER vacuum vessel assembly

Energy Technology Data Exchange (ETDEWEB)

Wu, H., E-mail: huapeng@lut.fi [Laboratory of Intelligent Machines, Lappeenranta University of Technology (Finland); Handroos, H. [Laboratory of Intelligent Machines, Lappeenranta University of Technology (Finland); Pela, P. [Tekes (Finland); Wang, Y. [Laboratory of Intelligent Machines, Lappeenranta University of Technology (Finland)

2011-10-15

The assembly of the ITER vacuum vessel (VV) is still a very big challenge as the process can only be done from inside the VV. The welding of the VV assembly is carried out using dedicated robotic systems. The main functions of the robots are: (i) measuring the actual space between every two sectors, (ii) positioning of the 150 kg splice plates between the sector shells, (iii) welding the splice plates to the sector shells, (iv) NDT of the welds, (v) repairing, including machining of the welds, (vi) He-leak tests of the welds, and (vii) any unplanned functions that may arise. This paper presents a reasonable method to assemble the ITER VV. In this article, one parallel mobile robot, running on the track rail fixed on the wall inside the VV, is designed and tested. The assembling process, carried out by the mobile robot together with the welding robot, is presented.
Parallel computing: numerics, applications, and trends

National Research Council Canada - National Science Library

Trobec, Roman; Vajteršic, Marián; Zinterhof, Peter

2009-01-01

... and/or distributed systems. The contributions to this book are focused on topics most concerned in the trends of today's parallel computing. These range from parallel algorithmics, programming, tools, network computing to future parallel computing. Particular attention is paid to parallel numerics: linear algebra, differential equations, numerica...
ITER council proceedings: 2001

International Nuclear Information System (INIS)

2001-01-01

Continuing the ITER EDA, two further ITER Council Meetings were held since the publication of ITER EDA documentation series no. 20, namely the ITER Council Meeting on 27-28 February 2001 in Toronto, and the ITER Council Meeting on 18-19 July in Vienna. That Meeting was the last one during the ITER EDA. This volume contains records of these Meetings, including: Records of decisions; List of attendees; ITER EDA status report; ITER EDA technical activities report; MAC report and advice; Final report of ITER EDA; and Press release
A homotopy method for solving Riccati equations on a shared memory parallel computer

International Nuclear Information System (INIS)

Zigic, D.; Watson, L.T.; Collins, E.G. Jr.; Davis, L.D.

1993-01-01

Although there are numerous algorithms for solving Riccati equations, there still remains a need for algorithms which can operate efficiently on large problems and on parallel machines. This paper gives a new homotopy-based algorithm for solving Riccati equations on a shared memory parallel computer. The central part of the algorithm is the computation of the kernel of the Jacobian matrix, which is essential for the corrector iterations along the homotopy zero curve. Using a Schur decomposition the tensor product structure of various matrices can be efficiently exploited. The algorithm allows for efficient parallelization on shared memory machines
Parallel Computing Strategies for Irregular Algorithms

Science.gov (United States)

Biswas, Rupak; Oliker, Leonid; Shan, Hongzhang; Biegel, Bryan (Technical Monitor)

2002-01-01

Parallel computing promises several orders of magnitude increase in our ability to solve realistic computationally-intensive problems, but relies on their efficient mapping and execution on large-scale multiprocessor architectures. Unfortunately, many important applications are irregular and dynamic in nature, making their effective parallel implementation a daunting task. Moreover, with the proliferation of parallel architectures and programming paradigms, the typical scientist is faced with a plethora of questions that must be answered in order to obtain an acceptable parallel implementation of the solution algorithm. In this paper, we consider three representative irregular applications: unstructured remeshing, sparse matrix computations, and N-body problems, and parallelize them using various popular programming paradigms on a wide spectrum of computer platforms ranging from state-of-the-art supercomputers to PC clusters. We present the underlying problems, the solution algorithms, and the parallel implementation strategies. Smart load-balancing, partitioning, and ordering techniques are used to enhance parallel performance. Overall results demonstrate the complexity of efficiently parallelizing irregular algorithms.
Iterative Object Localization Algorithm Using Visual Images with a Reference Coordinate

Directory of Open Access Journals (Sweden)

We-Duke Cho

2008-09-01

We present a simplified algorithm for localizing an object using multiple visual images that are obtained from widely used digital imaging devices. We use a parallel projection model which supports both zooming and panning of the imaging devices. Our proposed algorithm is based on a virtual viewable plane for creating a relationship between an object position and a reference coordinate. The reference point is obtained from a rough estimate which may be obtained from the preestimation process. The algorithm minimizes localization error through the iterative process with relatively low computational complexity. In addition, nonlinearity distortion of the digital image devices is compensated during the iterative process. Finally, the performances of several scenarios are evaluated and analyzed in both indoor and outdoor environments.
Fast iterative censoring CFAR algorithm for ship detection from SAR images

Science.gov (United States)

Gu, Dandan; Yue, Hui; Zhang, Yuan; Gao, Pengcheng

2017-11-01

Ship detection is one of the essential techniques for ship recognition from synthetic aperture radar (SAR) images. This paper presents a fast iterative detection procedure to eliminate the influence of target returns on the estimation of local sea clutter distributions for constant false alarm rate (CFAR) detectors. A fast block detector is first employed to extract potential target sub-images; then an iterative censoring CFAR algorithm is used to detect ship candidates from each target block adaptively and efficiently, where parallel detection is available, and the statistical parameters of the G0 distribution, which fits local sea clutter well, can be quickly estimated based on an integral image operator. Experimental results on TerraSAR-X images demonstrate the effectiveness of the proposed technique.
ITER safety

International Nuclear Information System (INIS)

Raeder, J.; Piet, S.; Buende, R.

1991-01-01

As part of the series of publications by the IAEA that summarize the results of the Conceptual Design Activities for the ITER project, this document describes the ITER safety analyses. It contains an assessment of normal operation effluents, accident scenarios, plasma chamber safety, tritium system safety, magnet system safety, external loss of coolant and coolant flow problems, and a waste management assessment, while it describes the implementation of the safety approach for ITER. The document ends with a list of major conclusions, a set of topical remarks on technical safety issues, and recommendations for the Engineering Design Activities, safety considerations for siting ITER, and recommendations with regard to the safety issues for the R and D for ITER. Refs, figs and tabs

Scientific Programming with High Performance Fortran: A Case Study Using the xHPF Compiler

Directory of Open Access Journals (Sweden)

Eric De Sturler

1997-01-01

Recently, the first commercial High Performance Fortran (HPF) subset compilers have appeared. This article reports on our experiences with the xHPF compiler of Applied Parallel Research, version 1.2, for the Intel Paragon. At this stage, we do not expect very high performance from our HPF programs, even though performance will eventually be of paramount importance for the acceptance of HPF. Instead, our primary objective is to study how to convert large Fortran 77 (F77) programs to HPF such that the compiler generates reasonably efficient parallel code. We report on a case study that identifies several problems when parallelizing code with HPF; most of these problems affect current HPF compiler technology in general, although some are specific to the xHPF compiler. We discuss our solutions from the perspective of the scientific programmer, and present timing results on the Intel Paragon. The case study comprises three programs of different complexity with respect to parallelization. We use the dense matrix-matrix product to show that the distribution of arrays and the order of nested loops significantly influence the performance of the parallel program. We use Gaussian elimination with partial pivoting to study the parallelization strategy of the compiler. There are various ways to structure this algorithm for a particular data distribution. This example shows how much effort may be demanded from the programmer to support the compiler in generating an efficient parallel implementation. Finally, we use a small application to show that the more complicated structure of a larger program may introduce problems for the parallelization, even though all subroutines of the application are easy to parallelize by themselves. The application consists of a finite volume discretization on a structured grid and a nested iterative solver. Our case study shows that it is possible to obtain reasonably efficient parallel programs with xHPF, although the compiler
Modeling of ELM Dynamics in ITER

International Nuclear Information System (INIS)

Pankin, A.Y.; Bateman, G.; Kritz, A.H.; Brennan, D.P.; Snyder, P.B.; Kruger, S.

2007-01-01

Edge localized modes (ELMs) are large scale instabilities that alter the H-mode pedestal, reduce the total plasma stored energy, and can result in heat pulses to the divertor plates. These modes can be triggered by pressure driven ballooning modes or by current driven peeling instabilities. In this study, stability analyses are carried out for a series of ITER equilibria that are generated with the TEQ and TOQ equilibrium codes. The H-mode pedestal pressure and parallel component of plasma current density are varied in a systematic way in order to include the relevant parameter space for a specific ITER discharge. Ideal MHD stability codes, DCON, ELITE, and BALOO code, are employed to determine whether or not each ITER equilibrium profile is unstable to peeling or ballooning modes in the pedestal region. Several equilibria that are close to the marginal stability boundary for peeling and ballooning modes are tested with the NIMROD non-ideal MHD code. The effects of finite resistivity are studied in a series of linear NIMROD computations. It is found that the peeling-ballooning stability threshold is very sensitive to the resistivity and viscosity profiles, which vary dramatically over a wide range near the separatrix. Due to the effects of finite resistivity and viscosity, the peeling-ballooning stability threshold is shifted compared to the ideal threshold. A fundamental question in the integrated modeling of ELMy H-mode discharges concerning how much plasma and current density is removed during each ELM crash can be addressed with nonlinear non-ideal MHD simulations. In this study, the NIMROD computer simulations are continued into the nonlinear stage for several ITER equilibria that are marginally unstable to peeling or ballooning modes. The role of two-fluid and finite Larmor radius effects on the ELM dynamics in ITER geometry is examined. The formation of ELM filament structures, which are observed in many existing tokamak experiments, is demonstrated for ITER
MELCOR 1.8.2 Analyses in Support of ITER's RPrS

International Nuclear Information System (INIS)

Brad J Merrill

2008-01-01

The International Thermonuclear Experimental Reactor (ITER) Program is performing accident analyses for ITER's 'Rapport Preliminaire de Surete' (Report Preliminary on Safety - RPrS) with a modified version of the MELCOR 1.8.2 code. The RPrS is an ITER safety document required in the ITER licensing process to obtain a 'Decret Autorisation de Construction' (a Decree Authorizing Construction - DAC) for the ITER device. This report documents the accident analyses performed by the US with the MELCOR 1.8.2 code in support of the ITER RPrS effort. This work was funded through an ITER Task Agreement for MELCOR Quality Assurance and Safety Analyses. Under this agreement, the US was tasked with performing analyses for three accident scenarios in the ITER facility. Contained within the text of this report are discussions that identify the cause of these accidents, descriptions of how these accidents are likely to proceed, the method used to analyze the consequences of these accidents, and discussions of the transient thermal hydraulic and radiological release results for these accidents
Analysis of the ITER cryoplant operational modes

International Nuclear Information System (INIS)

Henry, D.; Journeaux, J.Y.; Roussel, P.; Michel, F.; Poncet, J.M.; Girard, A.; Kalinin, V.; Chesny, P.

2007-01-01

In the framework of an EFDA task, CEA is carrying out an analysis of the various ITER cryoplant operational modes. According to the project integration document, ITER is designed to be operated 365 days per year in order to optimize the available time of the Tokamak. It is anticipated that operation will be performed in long periods separated by maintenance periods (e.g. 10 days continuous operation and 1 week break) with annual or bi-annual major shutdown periods of a few months for maintenance, further installation and commissioning. For this operation schedule, auxiliary subsystems like the cryoplant and the cryodistribution have to cope with different heat loads which depend on the different ITER operating states. The cryoplant consists of four identical 4.5 K refrigerators and two 80 K helium loops coupled with two LN2 modules. All of these cryogenic subsystems have to operate in parallel to remove the heat loads from the magnet, 80 K shields, cryopumps and other small users. After a brief review of the main particularities of a cryogenic system operating in a Tokamak environment, the first part of this study is dedicated to the assessment of the main ITER operation states. A new design of refrigeration loop for the HTS current leads, the updated layout of the cryodistribution system and a revised strategy for operation of the cryopumps have been taken into consideration. The relevant normal operating scenarios of the cryoplant are checked for the typical ITER operating states like plasma operation state, short term stand by, short term maintenance, or test and conditioning state. The second part of the paper is dedicated to the abnormal operating modes arising from the magnets and those generated by the cryoplant itself. The occurrence of a fast discharge or a quench of the magnets generates large heat load disturbances and produces exceptionally high mass flow rates which have to be managed by the cryoplant, while a failure of a cryogenic component induces
Final report of the ITER EDA. Final report of the ITER Engineering Design Activities. Prepared by the ITER Council

International Nuclear Information System (INIS)

2001-01-01

This is the Final Report by the ITER Council on work carried out by ITER participating countries on cooperation in the Engineering Design Activities (EDA) for the ITER. In this report the main ITER EDA technical objectives, the scope of ITER EDA, its organization and resources, engineering design of ITER tokamak and its main parameters are presented. This Report also includes safety and environmental assessments, site requirements and proposed schedule and estimates of manpower and cost as well as proposals on approaches to joint implementation of the project
Demonstration of ITER Operational Scenarios on DIII-D

International Nuclear Information System (INIS)

Doyle, E.J.; Budny, R.V.; DeBoo, J.C.; Ferron, J.R.; Jackson, G.L.; Luce, T.C.; Murakami, M.; Osborne, T.H.; Park, J.; Politzer, P.A.; Reimerdes, H.; Casper, T.A.; Challis, C.D.; Groebner, R.J.; Holcomb, C.T.; Hyatt, A.W.; La Haye, R.J.; McKee, G.R.; Petrie, T.W.; Petty, C.C.; Rhodes, T.L.; Shafer, M.W.; Snyder, P.B.; Strait, E.J.; Wade, M.R.; Wang, G.; West, W.P.; Zeng, L.

2008-01-01

The DIII-D program has recently initiated an effort to provide suitably scaled experimental evaluations of four primary ITER operational scenarios. New and unique features of this work are that the plasmas incorporate essential features of the ITER scenarios and anticipated operating characteristics; e.g., the plasma cross-section, aspect ratio and value of I/aB of the DIII-D discharges match the ITER design, with size reduced by a factor of 3.7. Key aspects of all four scenarios, such as target values for β_N and H_98, have been replicated successfully on DIII-D, providing an improved and unified physics basis for transport and stability modeling, as well as for performance extrapolation to ITER. In all four scenarios normalized performance equals or closely approaches that required to realize the physics and technology goals of ITER, and projections of the DIII-D discharges are consistent with ITER achieving its goals of ≥ 400 MW of fusion power production and Q ≥ 10. These studies also address many of the key physics issues related to the ITER design, including the L-H transition power threshold, the size of ELMs, pedestal parameter scaling, the impact of tearing modes on confinement and disruptivity, beta limits and the required capabilities of the plasma control system. An example of direct influence on the ITER design from this work is a modification of the specified operating range in internal inductance at 15 MA for the poloidal field coil set, based on observations that the measured inductance in the baseline scenario case lay outside the original ITER specification
Parallel processing of two-dimensional Sn transport calculations

International Nuclear Information System (INIS)

Uematsu, M.

1997-01-01

A parallel processing method for the two-dimensional S_n transport code DOT3.5 has been developed to achieve a drastic reduction in computation time. In the proposed method, parallelization is achieved with angular domain decomposition and/or space domain decomposition. The calculational speed of parallel processing by angular domain decomposition is largely influenced by frequent communications between processing elements. To assess parallelization efficiency, sample problems with up to 32 x 32 spatial meshes were solved with a Sun workstation using the PVM message-passing library. As a result, parallel calculation using 16 processing elements, for example, was found to be nine times as fast as that with one processing element. As for parallel processing by geometry segmentation, the influence of processing element communications on computation time is small; however, discontinuity at the segment boundary degrades convergence speed. To accelerate the convergence, an alternate sweep of angular flux in conjunction with space domain decomposition and a two-step rescaling method consisting of segmentwise rescaling and ordinary pointwise rescaling have been developed. By applying the developed method, the number of iterations needed to obtain a converged flux solution was reduced by a factor of 2. As a result, parallel calculation using 16 processing elements was found to be 5.98 times as fast as the original DOT3.5 calculation
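The two speedup figures quoted in the abstract above can be turned into parallel efficiencies with the usual definition, efficiency = speedup / number of processing elements. The short calculation below is only a worked illustration of that arithmetic; the function and variable names are chosen here, and the abstract itself does not report efficiencies.

```python
def parallel_efficiency(speedup, processing_elements):
    """Fraction of ideal linear speedup actually achieved."""
    if processing_elements <= 0:
        raise ValueError("processing_elements must be positive")
    return speedup / processing_elements


# Figures taken from the abstract: 16 processing elements in both cases.
first_case = parallel_efficiency(9.0, 16)    # "nine times as fast"
second_case = parallel_efficiency(5.98, 16)  # "5.98 times as fast"

print(f"first case:  {first_case:.1%}")
print(f"second case: {second_case:.1%}")
```

Under this definition the first figure corresponds to roughly 56% efficiency and the second to roughly 37%. Note that the second speedup is stated relative to the original DOT3.5 calculation rather than to a one-element run of the parallel code.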
Technical meeting on materials for in-vessel components of ITER

International Nuclear Information System (INIS)

Kalinin, G.; Barabash, V.

2000-01-01

The Technical meeting on materials for in-vessel components of ITER was held at the ITER Joint Work Site in Garching from 31 January to 4 February. The main objectives of the meeting were: 1. to summarize the requirements, 2. to review new data, 3. to discuss in detail the R&D program and to discuss the material assessment report
Parallel computation of rotating flows

DEFF Research Database (Denmark)

Lundin, Lars Kristian; Barker, Vincent A.; Sørensen, Jens Nørkær

1999-01-01

This paper deals with the simulation of 3‐D rotating flows based on the velocity‐vorticity formulation of the Navier‐Stokes equations in cylindrical coordinates. The governing equations are discretized by a finite difference method. The solution is advanced to a new time level by a two‐step process ... is that of solving a singular, large, sparse, over‐determined linear system of equations, and the iterative method CGLS is applied for this purpose. We discuss some of the mathematical and numerical aspects of this procedure and report on the performance of our software on a wide range of parallel computers.
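The abstract above names CGLS as the iterative solver for its over-determined system. As a rough illustration of what that method does, here is a small serial sketch of the textbook CGLS iteration (conjugate gradients applied implicitly to the normal equations) on dense lists. It is not the authors' parallel software; the helper names, the stopping rule, and the tiny test system are assumptions made for this example.

```python
def matvec(A, x):
    """Return A x for a dense matrix stored as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def rmatvec(A, y):
    """Return A^T y without forming the transpose."""
    return [sum(A[i][j] * y[i] for i in range(len(A)))
            for j in range(len(A[0]))]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cgls(A, b, max_iter=100, tol=1e-12):
    """Least-squares solution of A x ~ b, starting from x = 0."""
    x = [0.0] * len(A[0])
    r = list(b)              # residual b - A x
    s = rmatvec(A, r)        # normal-equation residual A^T r
    p = list(s)              # search direction
    gamma = dot(s, s)
    gamma0 = gamma
    for _ in range(max_iter):
        if gamma == 0.0 or gamma <= tol * tol * gamma0:
            break
        q = matvec(A, p)
        qq = dot(q, q)
        if qq == 0.0:
            break
        alpha = gamma / qq
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * qi for ri, qi in zip(r, q)]
        s = rmatvec(A, r)
        gamma_new = dot(s, s)
        p = [si + (gamma_new / gamma) * pi for si, pi in zip(s, p)]
        gamma = gamma_new
    return x


# Hypothetical 3 x 2 over-determined system used only to exercise the sketch.
solution = cgls([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], [1.0, 2.0, 4.0])
```

Each iteration needs only one product with A and one with its transpose, which is why the method suits large sparse systems distributed over many processors.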
Variation in efficiency of parallel algorithms. [for study of stiffness matrices in planar trusses

Science.gov (United States)

Hayashi, A.; Melosh, R. J.; Utku, S.; Salama, M.

1985-01-01

The objective of the present study is to investigate some iterative parallel-processor algorithms for solving linear equations, with respect to their efficiency in analyses of typical linear engineering systems. Attention is given to a set of n linear equations, Ku = p, where K is an n x n positive definite, sparsely populated, symmetric matrix, u is an n x 1 vector of unknown responses, and p is an n x 1 vector of prescribed constants. This study is concerned with a hybrid method in which iteration is used to solve the problem, while a direct method is used on the local processor level. Variations in the efficiency of parallel algorithms are explored. Measures of the efficiency are based on computer experiments regarding the algorithms. For all the algorithms, the wall clock time is found to decrease as the number of processors increases.
iterClust: a statistical framework for iterative clustering analysis.

Science.gov (United States)

Ding, Hongxu; Wang, Wanxin; Califano, Andrea

2018-03-22

In a scenario where populations A, B1 and B2 (subpopulations of B) exist, pronounced differences between A and B may mask subtle differences between B1 and B2. Here we present iterClust, an iterative clustering framework, which can separate more pronounced differences (e.g. A and B) in starting iterations, followed by relatively subtle differences (e.g. B1 and B2), providing a comprehensive clustering trajectory. iterClust is implemented as a Bioconductor R package. Contact: andrea.califano@columbia.edu, hd2326@columbia.edu. Supplementary information is available at Bioinformatics online.
Qualification of phased array ultrasonic examination on T-joint weld of austenitic stainless steel for ITER vacuum vessel

Energy Technology Data Exchange (ETDEWEB)

Kim, G.H. [ITER Korea, National Fusion Research Institute, Daejeon 305-333 (Korea, Republic of); Park, C.K., E-mail: love879@hanmail.net [ITER Korea, National Fusion Research Institute, Daejeon 305-333 (Korea, Republic of); Jin, S.W.; Kim, H.S.; Hong, K.H.; Lee, Y.J.; Ahn, H.J.; Chung, W. [ITER Korea, National Fusion Research Institute, Daejeon 305-333 (Korea, Republic of); Jung, Y.H.; Roh, B.R. [Hyundai Heavy Industries Co. Ltd., Ulsan 682-792 (Korea, Republic of); Sa, J.W.; Choi, C.H. [ITER Organization, Route de Vinon-sur-Verdon, CS 90 046, 13067 St. Paul Lez Durance Cedex (France)

2016-11-01

Highlights: • PAUT techniques have been developed by Hyundai Heavy Industries Co., Ltd. (HHI) and the Korea Domestic Agency (KODA) to verify and establish instrument calibration, test procedures, image processing, and so on. As the first step in developing the PAUT technique, several dozen qualification blocks with artificial defects (parallel side-drilled holes, embedded lack of fusion, embedded repair weld notches, and so on) have been designed and fabricated to simulate all potential defects arising during the welding process. The real UT qualification group-1 for the T-joint weld was successfully conducted in front of the ANB inspector. • In this paper, notable progress in UT qualification for the ITER vacuum vessel is presented. - Abstract: Full penetration welding and 100% volumetric examination are required for all welds of pressure-retaining parts of the ITER Vacuum Vessel (VV) according to the RCC-MR Code and the French Order of Nuclear Pressure Equipment (ESPN). The NDE requirement is an important technical issue because radiographic examination (RT) is not applicable to many welding joints. Therefore ultrasonic examination (UT) has been selected as an alternative method. Generally, UT on austenitic welds is regarded as a great challenge due to the high attenuation and dispersion of the ultrasonic signal. In this paper, phased array ultrasonic examination (PAUT) has been introduced on double-sided T-shape austenitic welds of the ITER VV as a major NDE method as well as RT. Several dozen qualification blocks with artificial defects (parallel side-drilled holes, embedded lack of fusion, embedded repair weld notches, embedded parallel vertical notches, and so on) have been designed and fabricated to simulate all potential defects arising during the welding process. PAUT techniques for thick austenitic welds have been developed taking into account the acceptance criteria. The test procedure, including calibration of equipment, is derived and qualified through
A Note on Using Partitioning Techniques for Solving Unconstrained Optimization Problems on Parallel Systems

Directory of Open Access Journals (Sweden)

Mehiddin Al-Baali

2015-12-01

We deal with the design of parallel algorithms by using variable partitioning techniques to solve nonlinear optimization problems. We propose an iterative solution method that is very efficient for separable functions, our scope being to discuss its performance for general functions. Experimental results on an illustrative example have suggested some useful modifications that, even though they improve the efficiency of our parallel method, leave some questions open for further investigation.
Parallel keyed hash function construction based on chaotic maps

International Nuclear Information System (INIS)

Xiao Di; Liao Xiaofeng; Deng Shaojiang

2008-01-01

Recently, a variety of chaos-based hash functions have been proposed. Nevertheless, none of them works efficiently in a parallel computing environment. In this Letter, an algorithm for parallel keyed hash function construction is proposed, whose structure can ensure the uniform sensitivity of the hash value to the message. By means of the mechanism of both changeable-parameter and self-synchronization, the keystream establishes a close relation with the algorithm key, the content and the order of each message block. The entire message is modulated into the chaotic iteration orbit, and the coarse-graining trajectory is extracted as the hash value. Theoretical analysis and computer simulation indicate that the proposed algorithm can satisfy the performance requirements of a hash function. It is simple, efficient, practicable, and reliable. These properties make it a good choice for hashing on parallel computing platforms
Feasibility analysis of fuzzy logic control for ITER Poloidal field (PF) AC/DC converter system

Energy Technology Data Exchange (ETDEWEB)

Hassan, Mahmood Ul; Fu, Peng [Institute of Plasma Physics, Chinese Academy of Sciences, Hefei 230031 (China); University of Science and Technology of China (China); Song, Zhiquan, E-mail: zhquansong@ipp.ac.cn [Institute of Plasma Physics, Chinese Academy of Sciences, Hefei 230031 (China); Chen, Xiaojiao [Institute of Plasma Physics, Chinese Academy of Sciences, Hefei 230031 (China); University of Science and Technology of China (China); Zhang, Xiuqing [Institute of Plasma Physics, Chinese Academy of Sciences, Hefei 230031 (China); Humayun, Muhammad [Shanghai Jiaotong University (China)

2017-05-15

Highlights: • The implementation of the fuzzy controller for the ITER PF converter system is presented. • The FLC and PI simulations are compared. • FLC single and parallel bridge operation is presented. • Fuzzification and defuzzification algorithms are presented using the FLC. - Abstract: This paper describes the feasibility analysis of fuzzy logic control to increase the performance of the ITER poloidal field (PF) converter systems. A fuzzy-logic-based controller is designed for the ITER PF converter system. Using the traditional PI controller and the fuzzy controller (FC), the dynamic behavior and transient response of the PF converter system are compared under normal operation by analysis and simulation. The analysis results show that fuzzy logic control can achieve better operation performance than PI control.
ITER implementation and fusion energy research in China

International Nuclear Information System (INIS)

Zhao, Jing; Feng, Zhaoliang; Yang, Changchun

2015-01-01

The ITER Project is jointly implemented by China, the EU, India, Japan, Korea, the Russian Federation and the USA, under the coordination of the Center Team of the ITER International Fusion Energy Organization (IO-CT). Chinese fusion research institutes and industrial enterprises are fully involved in implementing China's contribution to the project under the leadership of the ITER China Domestic Agency (CN-DA), together with IO-CT. The progress of the Procurement Packages (PA) allocated to China and the technical issues, especially key technology development, schedule, and QA/QC issues, are highlighted in this report. The specific enterprises carrying out the different PAs are identified so that the growing number of international manufacturers and producers working on ITER PAs can get to know each other well, for the successful implementation of the ITER project. China's participation in the management of IO-CT is also covered, mainly from the governmental aspect and in terms of staff recruited from China. In addition, domestic fusion research, including the upgrade of the EAST and HL-2A tokamaks in China, the TBM program, the next-step design activities for a fusion power plant, namely CFETR, and training in this area, is introduced to promote global cooperation within the international fusion community. (author)
Virtual fringe projection system with nonparallel illumination based on iteration

International Nuclear Information System (INIS)

Zhou, Duo; Wang, Zhangying; Gao, Nan; Zhang, Zonghua; Jiang, Xiangqian

2017-01-01

Fringe projection profilometry has been widely applied in many fields. To set up an ideal measuring system, a virtual fringe projection technique has been studied to assist in the design of hardware configurations. However, existing virtual fringe projection systems use parallel illumination and have a fixed optical framework. This paper presents a virtual fringe projection system with nonparallel illumination. Using an iterative method to calculate intersection points between rays and reference planes or object surfaces, the proposed system can simulate projected fringe patterns and captured images. A new explicit calibration method has been presented to validate the precision of the system. Simulated results indicate that the proposed iterative method outperforms previous systems. Our virtual system can be applied to error analysis, algorithm optimization, and help operators to find ideal system parameter settings for actual measurements. (paper)
Massively parallel red-black algorithms for x-y-z response matrix equations

International Nuclear Information System (INIS)

Hanebutte, U.R.; Laurin-Kovitz, K.; Lewis, E.E.

1992-01-01

Recently, both discrete ordinates and spherical harmonic (S_n and P_n) methods have been cast in the form of response matrices. In x-y geometry, massively parallel algorithms have been developed to solve the resulting response matrix equations on the Connection Machine family of parallel computers, the CM-2, CM-200, and CM-5. These algorithms utilize two-cycle iteration on a red-black checkerboard. In this work we examine the use of massively parallel red-black algorithms to solve response matrix equations in three dimensions. The longer-term objective is to utilize massively parallel algorithms to solve S_n and/or P_n response matrix problems. In this exploratory examination, however, we consider the simple 6 x 6 response matrices that are derivable from fine-mesh diffusion approximations in three dimensions
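The two-cycle red-black iteration mentioned above can be illustrated with a small sketch. It does not solve response matrix equations: as a stand-in, it assumes a 7-point Laplace stencil on a 3D grid, and the function name `red_black_sweeps` is hypothetical. Cells are colored by the parity of i + j + k, so every cell of one color depends only on cells of the other color and each half-sweep can be updated fully in parallel.

```python
import numpy as np

def red_black_sweeps(u, sweeps):
    """Red-black Gauss-Seidel sweeps for the 3D Laplace equation (illustrative stand-in)."""
    i, j, k = np.indices(u.shape)
    interior = np.zeros(u.shape, dtype=bool)
    interior[1:-1, 1:-1, 1:-1] = True
    # Checkerboard coloring: neighbors of a red cell are all black, and vice versa.
    red = interior & ((i + j + k) % 2 == 0)
    black = interior & ((i + j + k) % 2 == 1)
    for _ in range(sweeps):
        for mask in (red, black):  # the two cycles of one iteration
            avg = np.zeros_like(u)
            avg[1:-1, 1:-1, 1:-1] = (
                u[:-2, 1:-1, 1:-1] + u[2:, 1:-1, 1:-1]
                + u[1:-1, :-2, 1:-1] + u[1:-1, 2:, 1:-1]
                + u[1:-1, 1:-1, :-2] + u[1:-1, 1:-1, 2:]
            ) / 6.0
            # All cells of one color are updated at once; black sees the new red values.
            u[mask] = avg[mask]
    return u
```

On a massively parallel machine each half-sweep maps naturally to one data-parallel step, which is presumably why the checkerboard ordering suits this class of hardware.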
SOFTWARE FOR DESIGNING PARALLEL APPLICATIONS

Directory of Open Access Journals (Sweden)

M. K. Bouza

2017-01-01

The object of research is the tooling that supports the development of parallel programs in C/C++. Methods and software that automate the process of designing parallel applications are proposed.
Iterative regularization in intensity-modulated radiation therapy optimization

International Nuclear Information System (INIS)

Carlsson, Fredrik; Forsgren, Anders

2006-01-01

A common way to solve intensity-modulated radiation therapy (IMRT) optimization problems is to use a beamlet-based approach. The approach is usually employed in a three-step manner: first a beamlet-weight optimization problem is solved, then the fluence profiles are converted into step-and-shoot segments, and finally postoptimization of the segment weights is performed. A drawback of beamlet-based approaches is that beamlet-weight optimization problems are ill-conditioned and have to be regularized in order to produce smooth fluence profiles that are suitable for conversion. The purpose of this paper is twofold: first, to explain the suitability of solving beamlet-based IMRT problems by a BFGS quasi-Newton sequential quadratic programming method with diagonal initial Hessian estimate, and second, to empirically show that beamlet-weight optimization problems should be solved in relatively few iterations when using this optimization method. The explanation of the suitability is based on viewing the optimization method as an iterative regularization method. In iterative regularization, the optimization problem is solved approximately by iterating long enough to obtain a solution close to the optimal one, but terminating before too much noise occurs. Iterative regularization requires an optimization method that initially proceeds in smooth directions and makes rapid initial progress. Solving ten beamlet-based IMRT problems with dose-volume objectives and bounds on the beamlet-weights, we find that the considered optimization method fulfills the requirements for performing iterative regularization. After segment-weight optimization, the treatments obtained using 35 beamlet-weight iterations outperform the treatments obtained using 100 beamlet-weight iterations, both in terms of objective value and of target uniformity. We conclude that iterating too long may in fact deteriorate the quality of the deliverable plan

NUMERICAL WITHOUT ITERATION METHOD OF MODELING OF ELECTROMECHANICAL PROCESSES IN ASYNCHRONOUS ENGINES

Directory of Open Access Journals (Sweden)

D. G. Patalakh

2018-02-01

Purpose. To develop a method for calculating electromagnetic and electromechanical transients in asynchronous engines without iterations. Methodology. Numerical methods for integrating ordinary differential equations; programming. Findings. The system of equations describing the dynamics of an asynchronous engine contains products of the rotor and stator currents and products of the rotor rotation frequency and the currents, so the system is nonlinear. The numerical solution of nonlinear differential equations normally involves an iterative process at every integration step. A long-running, poorly converging iterative process can slow the calculation down. An improvement of the numerical method that removes the iterative process is proposed; as a result, the modeling time is reduced. The improved numerical method is applied to integrate the differential equations describing the dynamics of an asynchronous engine. Originality. The proposed improvement makes it possible to numerically integrate differential equations containing products of functions while avoiding an iterative process at every integration step, which shortens the modeling time. Practical value. On the basis of the proposed methodology, a universal program for modeling electromechanical processes in asynchronous engines could be developed, with speed as its main advantage.
Parallel numerical modeling of hybrid-dimensional compositional non-isothermal Darcy flows in fractured porous media

Science.gov (United States)

Xing, F.; Masson, R.; Lopez, S.

2017-09-01

This paper introduces a new discrete fracture model accounting for non-isothermal compositional multiphase Darcy flows and complex networks of fractures with intersecting, immersed and non-immersed fractures. The so-called hybrid-dimensional model, using a 2D model in the fractures coupled with a 3D model in the matrix, is first derived rigorously starting from the equi-dimensional matrix fracture model. Then, it is discretized using a fully implicit time integration combined with the Vertex Approximate Gradient (VAG) finite volume scheme, which is adapted to polyhedral meshes and anisotropic heterogeneous media. The fully coupled systems are assembled and solved in parallel using the Single Program Multiple Data (SPMD) paradigm with one layer of ghost cells. This strategy allows for a local assembly of the discrete systems. An efficient preconditioner is implemented to solve the linear systems at each time step and each Newton type iteration of the simulation. The numerical efficiency of our approach is assessed on different meshes, fracture networks, and physical settings in terms of parallel scalability, nonlinear convergence and linear convergence.
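The role of a single layer of ghost cells in an SPMD decomposition can be sketched as follows. This is only an illustration under simplifying assumptions: a 1D array and a three-point average replace the VAG scheme, and the "ranks" are entries of a Python list rather than MPI processes. The helper names `split_with_ghosts`, `exchange_ghosts` and `local_smooth` are hypothetical.

```python
import numpy as np

def split_with_ghosts(x, parts):
    """Give each simulated rank its owned cells plus one ghost cell on each side."""
    return [np.concatenate(([0.0], chunk, [0.0])) for chunk in np.array_split(x, parts)]

def exchange_ghosts(local_arrays):
    """Fill ghost cells from the neighbors' owned boundary cells.

    In a real SPMD code this would be a message exchange between processes.
    At the physical domain ends, the ghost simply copies the nearest owned cell.
    """
    last = len(local_arrays) - 1
    for rank, loc in enumerate(local_arrays):
        loc[0] = local_arrays[rank - 1][-2] if rank > 0 else loc[1]
        loc[-1] = local_arrays[rank + 1][1] if rank < last else loc[-2]

def local_smooth(loc):
    """Purely local three-point average over owned cells; needs only one ghost layer."""
    return (loc[:-2] + loc[1:-1] + loc[2:]) / 3.0
```

Once the ghost layer is filled, every rank can apply its stencil without touching remote data, which is the sense in which such a strategy permits local assembly.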
Speeding up predictive electromagnetic simulations for ITER application

International Nuclear Information System (INIS)

Alekseev, A.B.; Amoskov, V.M.; Bazarov, A.M.; Belov, A.V.; Belyakov, V.A.; Gapionok, E.I.; Gornikel, I.V.; Gribov, Yu. V.; Kukhtin, V.P.; Lamzin, E.A.; Sytchevsky, S.E.

2017-01-01

Highlights: • A general concept of an engineering EM simulator for tokamak applications is proposed. • The algorithm is based on influence functions and the superposition principle. • The software works with extensive databases and offers parallel processing. • The simulator obtains the solution hundreds of times faster. - Abstract: The paper presents an attempt to move toward a general concept of a software environment for fast and consistent multi-task simulation of EM transients (an engineering simulator for tokamak applications). The ITER tokamak is taken as an example to introduce the computational technique. The strategy exploits parallel processing with optimized simulation algorithms based on influence functions and the superposition principle to take full advantage of parallelism. The software has been tested on a multi-core supercomputer. The results were compared with data obtained in TYPHOON computations. The discrepancy was found to be below 0.4%. The computation cost for the simulator is proportional to the number of observation points. The average computation time with the simulator is found to be hundreds of times shorter than the time required by known software tools to numerically solve the relevant system of differential equations.
Speeding up predictive electromagnetic simulations for ITER application

Energy Technology Data Exchange (ETDEWEB)

Alekseev, A.B. [ITER Organization, Route de Vinon sur Verdon, 13067 St. Paul Lez Durance Cedex (France); Amoskov, V.M. [JSC “NIIEFA”, Doroga na Metallostroy 3, St. Petersburg, 196641 (Russian Federation); Bazarov, A.M., E-mail: alexander.bazarov@gmail.com [JSC “NIIEFA”, Doroga na Metallostroy 3, St. Petersburg, 196641 (Russian Federation); Belov, A.V. [JSC “NIIEFA”, Doroga na Metallostroy 3, St. Petersburg, 196641 (Russian Federation); Belyakov, V.A. [JSC “NIIEFA”, Doroga na Metallostroy 3, St. Petersburg, 196641 (Russian Federation); St. Petersburg State University, 7/9 Universitetskaya Embankment, St. Petersburg, 199034 (Russian Federation); Gapionok, E.I. [JSC “NIIEFA”, Doroga na Metallostroy 3, St. Petersburg, 196641 (Russian Federation); Gornikel, I.V. [Alphysica GmbH, Unterreut, 6, D-76135, Karlsruhe (Germany); Gribov, Yu. V. [ITER Organization, Route de Vinon sur Verdon, 13067 St. Paul Lez Durance Cedex (France); Kukhtin, V.P.; Lamzin, E.A. [JSC “NIIEFA”, Doroga na Metallostroy 3, St. Petersburg, 196641 (Russian Federation); Sytchevsky, S.E. [JSC “NIIEFA”, Doroga na Metallostroy 3, St. Petersburg, 196641 (Russian Federation); St. Petersburg State University, 7/9 Universitetskaya Embankment, St. Petersburg, 199034 (Russian Federation)

2017-05-15

Highlights: • A general concept of an engineering EM simulator for tokamak applications is proposed. • The algorithm is based on influence functions and the superposition principle. • The software works with extensive databases and offers parallel processing. • The simulator obtains the solution hundreds of times faster. - Abstract: The paper presents an attempt to move toward a general concept of a software environment for fast and consistent multi-task simulation of EM transients (an engineering simulator for tokamak applications). The ITER tokamak is taken as an example to introduce the computational technique. The strategy exploits parallel processing with optimized simulation algorithms based on influence functions and the superposition principle to take full advantage of parallelism. The software has been tested on a multi-core supercomputer. The results were compared with data obtained in TYPHOON computations. The discrepancy was found to be below 0.4%. The computation cost for the simulator is proportional to the number of observation points. The average computation time with the simulator is found to be hundreds of times shorter than the time required by known software tools to numerically solve the relevant system of differential equations.
Fast parallel algorithms for the x-ray transform and its adjoint.

Science.gov (United States)

Gao, Hao

2012-11-01

Iterative reconstruction methods often offer better imaging quality and allow for reconstructions with lower imaging dose than classical methods in computed tomography. However, computational speed is a major concern for these iterative methods, for which the x-ray transform and its adjoint are the two most time-consuming components. The speed issue becomes even more notable for 3D imaging such as cone beam scans or helical scans, since the x-ray transform and its adjoint must be recomputed frequently because there is usually not enough computer memory to store the corresponding system matrix. The purpose of this paper is to optimize the algorithm for computing the x-ray transform and its adjoint, and their parallel computation. Fast and highly parallelizable algorithms for the x-ray transform and its adjoint are proposed for the infinitely narrow beam in both 2D and 3D. The extension of these fast algorithms to the finite-size beam is proposed in 2D and discussed in 3D. The CPU and GPU codes are available at https://sites.google.com/site/fastxraytransform. The proposed algorithm is faster than Siddon's algorithm for computing the x-ray transform. In particular, the improvement for the parallel computation can be an order of magnitude. The authors have proposed fast and highly parallelizable algorithms for the x-ray transform and its adjoint, which are extendable to the finite-size beam. The proposed algorithms are suitable for parallel computing in the sense that the computational cost per parallel thread is O(1).
Performance of a multi-section ICRF array for a RTO/RC ITER

International Nuclear Information System (INIS)

Bosia, Giuseppe; Brambilla, Marco

1999-01-01

In an RTO/RC ITER, the Ion Cyclotron (IC) Heating and Current Drive System would need to operate at a power density of 6.5 MW/m² (about twice the design value adopted in the ITER Final Design Report) in order to provide the required total output of 40 MW of RF power from two equatorial ports. A significant upgrade of the original IC array design is necessary in order to keep the operating RF voltage at the plasma interface within acceptable limits. This is in principle possible by increasing the number of array elements and operating them in parallel. The paper discusses the prospects of these modifications and their implications for the array layout
ITER council proceedings: 2000

International Nuclear Information System (INIS)

2001-01-01

No ITER Council Meetings were held during 2000. However, two ITER EDA Meetings were held, one in Tokyo, January 19-20, and one in Moscow, June 29-30. The parties participating in these meetings were those that take part in the extended ITER EDA, namely the EU, the Russian Federation, and Japan. This document contains, among other things, the records of these meetings, the list of attendees, the agenda, the ITER EDA Status Reports issued during these meetings, the TAC (Technical Advisory Committee) reports and recommendations, the MAC Reports and Advice (also for the July 1999 Meeting), the ITER-FEAT Outline Design Report, the TAC Reports and Recommendations (both meetings), Site Requirements and Site Design Assumptions, the Tentative Sequence of Technical Activities 2000-2001, the Report of the ITER SWG-P2 on Joint Implementation of ITER, and the EU/ITER Canada Proposal for New ITER Identification
Parallelization of a hydrological model using the message passing interface

Science.gov (United States)

Wu, Yiping; Li, Tiejian; Sun, Liqun; Chen, Ji

2013-01-01

With increasing knowledge about natural processes, hydrological models such as the Soil and Water Assessment Tool (SWAT) are becoming larger and more complex, with increasing computation time. Additionally, other procedures such as model calibration, which may require thousands of model iterations, can increase running time and thus further hinder rapid modeling and analysis. Using the widely applied SWAT as an example, this study demonstrates how to parallelize a serial hydrological model in a Windows® environment using a parallel programming technology, the Message Passing Interface (MPI). With a case study, we derived the optimal values for the two parameters (the number of processes and the corresponding percentage of work to be distributed to the master process) of the parallel SWAT (P-SWAT) on an ordinary personal computer and a workstation. Our study indicates that model execution time can be reduced by 42%–70% (or a speedup of 1.74–3.36) using multiple processes (two to five) with a proper task-distribution scheme (between the master and slave processes). Although the computation time decreases with an increasing number of processes (from two to five), the gain diminishes because of the accompanying increase in demand for message passing between the master and all slave processes. Our case study demonstrates that the P-SWAT with a five-process run may reach the maximum speedup, and the performance can be quite stable (fairly independent of project size). Overall, the P-SWAT can help reduce the computation time substantially for an individual model run, manual and automatic calibration procedures, and optimization of best management practices. In particular, the parallelization method we used and the scheme for deriving the optimal parameters in this study can be valuable and easily applied to other hydrological or environmental models.
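The relation between a percentage reduction in run time and a speedup factor can be made explicit with a short calculation. The sketch below assumes the usual definitions, speedup = T_serial / T_parallel and efficiency = speedup / number of processes; the function names are hypothetical. With the rounded percentages quoted above, a 42% reduction gives about 1.72 and a 70% reduction about 3.33, close to the reported 1.74–3.36 (the small gap is presumably due to rounding of the percentages).

```python
def speedup_from_reduction(reduction):
    """Convert a fractional run-time reduction (0.42 for 42%) into a speedup factor."""
    if not 0.0 <= reduction < 1.0:
        raise ValueError("reduction must be in [0, 1)")
    # T_parallel = (1 - reduction) * T_serial, so speedup = 1 / (1 - reduction).
    return 1.0 / (1.0 - reduction)

def parallel_efficiency(speedup, processes):
    """Fraction of ideal linear speedup achieved with the given number of processes."""
    return speedup / processes
```

For example, `parallel_efficiency(speedup_from_reduction(0.70), 5)` is roughly 0.67, consistent with the diminishing gains attributed to message passing overhead.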
Summary of beryllium qualification activity for ITER first-wall applications

International Nuclear Information System (INIS)

Barabash, V; Eaton, R; Hirai, T; Kupriyanov, I; Nikolaev, G; Wang Zhanhong; Liu Xiang; Roedig, M; Linke, J

2011-01-01

Beryllium is considered as an armor material for the ITER first wall. The ITER Final Design Report 2001 identified the reference grades S-65C vacuum hot pressed (VHP) from Brush Wellman and DShG-200 from the Russian Federation. These grades have been selected based on excellent thermal fatigue/shock behavior and the available comprehensive database. Later, Chinese and Russian ITER Parties proposed their new grades: CN-G01 (from China) and TGP-56FW (from Russia). To assess the performance of these new grades, the ITER Organization, Chinese and Russian Parties established a program for the characterization of these materials. A summary of the published data and new results are presented in the paper. It was concluded that the proposed Chinese (CN-G01) and Russian (TGP-56FW) beryllium grades can be accepted. Three grades of beryllium are now available for the armor application for the ITER first wall: S-65, CN-G01 and TGP-56FW.
Summary of beryllium qualification activity for ITER first-wall applications

Science.gov (United States)

Barabash, V.; Eaton, R.; Hirai, T.; Kupriyanov, I.; Nikolaev, G.; Wang, Zhanhong; Liu, Xiang; Roedig, M.; Linke, J.

2011-12-01

Beryllium is considered as an armor material for the ITER first wall. The ITER Final Design Report 2001 identified the reference grades S-65C vacuum hot pressed (VHP) from Brush Wellman and DShG-200 from the Russian Federation. These grades have been selected based on excellent thermal fatigue/shock behavior and the available comprehensive database. Later, Chinese and Russian ITER Parties proposed their new grades: CN-G01 (from China) and TGP-56FW (from Russia). To assess the performance of these new grades, the ITER Organization, Chinese and Russian Parties established a program for the characterization of these materials. A summary of the published data and new results are presented in the paper. It was concluded that the proposed Chinese (CN-G01) and Russian (TGP-56FW) beryllium grades can be accepted. Three grades of beryllium are now available for the armor application for the ITER first wall: S-65, CN-G01 and TGP-56FW.
ITER: the first experimental fusion reactor

International Nuclear Information System (INIS)

Rebut, P.H.

1995-01-01

The International Thermonuclear Experimental Reactor (ITER) project is a multiphased project, at present proceeding under the auspices of the International Atomic Energy Agency according to the terms of a four-party agreement between the European Atomic Energy Community, the Government of Japan, the Government of the USA and the Government of Russia ("the parties"). The project is based on the tokamak, a Russian invention which has been brought to a high level of development and progress in all major fusion programs throughout the world. The objective of ITER is to demonstrate the scientific and technological feasibility of fusion energy for commercial energy production and to test technologies for a demonstration fusion power plant. During the extended performance phase of ITER, it will demonstrate the characteristics of a fusion power plant, producing more than 1500 MW of fusion power. The objective of the engineering design activity (EDA) phase is to produce a detailed, complete and fully integrated engineering design of ITER and all technical data necessary for the future decision on the construction of ITER. The ITER device will be a major step from present fusion experiments and will encompass all the major elements required for a fusion reactor. It will also require the development and the implementation of major new components and technologies. The inside surface of the plasma containment chamber will be designed to withstand temperatures of up to 500 °C, although normal operating temperatures will be substantially lower. Materials will have to be carefully chosen to withstand these temperatures, and a high neutron flux. In addition, other components of the device will be composed of state-of-the-art metal alloys, ceramics and composites, many of which are now in the early stage of development of testing. (orig.)
Development of whole core thermal-hydraulic analysis program ACT. 4. Simplified fuel assembly model and parallelization by MPI

International Nuclear Information System (INIS)

Ohshima, Hiroyuki

2001-10-01

A whole core thermal-hydraulic analysis program ACT is being developed for the purpose of evaluating detailed in-core thermal hydraulic phenomena of fast reactors, including the effect of the flow between wrapper-tube walls (inter-wrapper flow), under various reactor operation conditions. Because appropriate boundary conditions, in addition to detailed modeling of the core, are essential for accurate simulations of in-core thermal hydraulics, ACT consists of not only fuel assembly and inter-wrapper flow analysis modules but also a heat transport system analysis module that gives the response of the plant dynamics to the core model. This report describes the incorporation of a simplified model into the fuel assembly analysis module and program parallelization by a message passing method toward large-scale simulations. ACT has a fuel assembly analysis module that can simulate the whole fuel pin bundle in each fuel assembly of the core; however, this may take much CPU time for a large-scale core simulation. Therefore, a simplified fuel assembly model that is thermal-hydraulically equivalent to the detailed one has been incorporated in order to save simulation time and resources. This simplified model is applied to those fuel assemblies in a core where detailed simulation results are not required. With regard to the program parallelization, the calculation load and the data flow of ACT were analyzed and the parallelization was optimized, including an improvement of the numerical simulation algorithm of ACT. The Message Passing Interface (MPI) is applied to data communication between processes and synchronization in parallel calculations. Parallelized ACT was verified through a comparison simulation with the original one. In addition to the above work, input manuals of the core analysis module and the heat transport system analysis module have been prepared. (author)
Mobile and replicated alignment of arrays in data-parallel programs

Science.gov (United States)

Chatterjee, Siddhartha; Gilbert, John R.; Schreiber, Robert

1993-01-01

When a data-parallel language like FORTRAN 90 is compiled for a distributed-memory machine, aggregate data objects (such as arrays) are distributed across the processor memories. The mapping determines the amount of residual communication needed to bring operands of parallel operations into alignment with each other. A common approach is to break the mapping into two stages: first, an alignment that maps all the objects to an abstract template, and then a distribution that maps the template to the processors. We solve two facets of the problem of finding alignments that reduce residual communication: we determine alignments that vary in loops, and objects that should have replicated alignments. We show that loop-dependent mobile alignment is sometimes necessary for optimum performance, and we provide algorithms with which a compiler can determine good mobile alignments for objects within do loops. We also identify situations in which replicated alignment is either required by the program itself (via spread operations) or can be used to improve performance. We propose an algorithm based on network flow that determines which objects to replicate so as to minimize the total amount of broadcast communication in replication. This work on mobile and replicated alignment extends our earlier work on determining static alignment.
Beryllium application in ITER plasma facing components

International Nuclear Information System (INIS)

Raffray, A.R.; Federici, G.; Barabash, V.; Cardella, A.; Jakeman, R.; Ioki, K.; Janeschitz, G.; Parker, R.; Tivey, R.; Pacher, H.D.; Wu, C.H.; Bartels, H.W.

1997-01-01

Beryllium is a candidate armour material for the in-vessel components of the International Thermonuclear Experimental Reactor (ITER), namely the primary first wall, the limiter, the baffle and the divertor. However, a number of issues arising from the performance requirements of the ITER plasma facing components (PFCs) must be addressed to better assess the attractiveness of Be as armour for these different components. These issues include heat loading limits arising from temperature and stress constraints under steady state conditions, armour lifetime including the effects of sputtering erosion as well as vaporisation and loss of melt during disruption events, tritium retention and permeation, and chemical hazards, in particular with respect to potential Be/steam reaction. Other issues such as fabrication and the possibility of in-situ repair are not performance-dependent but have an important impact on the overall assessment of Be as PFC armour. This paper describes the present view on Be application for ITER PFCs. The key issues are discussed including an assessment of the current level of understanding based on analysis and experimental data; and on-going activities as part of the ITER EDA R and D program are highlighted. (orig.)
Computational acceleration for MR image reconstruction in partially parallel imaging.

Science.gov (United States)

Ye, Xiaojing; Chen, Yunmei; Huang, Feng

2011-05-01

In this paper, we present a fast numerical algorithm for solving total variation and ℓ1 (TVL1) based image reconstruction with application in partially parallel magnetic resonance imaging. Our algorithm uses a variable splitting method to reduce computational cost. Moreover, the Barzilai-Borwein step size selection method is adopted in our algorithm for much faster convergence. Experimental results on clinical partially parallel imaging data demonstrate that the proposed algorithm requires far fewer iterations and/or less computational cost than recently developed operator splitting and Bregman operator splitting methods, which can deal with a general sensing matrix in the reconstruction framework, to get similar or even better quality of reconstructed images.
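The Barzilai-Borwein (BB) step size rule named above can be shown in isolation. The sketch below is not the TVL1 reconstruction algorithm: it assumes a plain gradient iteration on a smooth objective and uses the first BB formula, alpha = (s·s)/(s·y), with s the change in the iterate and y the change in the gradient. The function name `bb_gradient_descent` and its parameters are hypothetical.

```python
import numpy as np

def bb_gradient_descent(grad, x0, iters=500, alpha0=1e-2):
    """Gradient iteration with the Barzilai-Borwein step size alpha = (s.s)/(s.y)."""
    x = np.asarray(x0, dtype=float)
    g = grad(x)
    alpha = alpha0  # the first step has no history, so an initial guess is needed
    for _ in range(iters):
        x_new = x - alpha * g
        g_new = grad(x_new)
        s = x_new - x   # change in the iterate
        y = g_new - g   # change in the gradient
        x, g = x_new, g_new
        sy = s @ y
        if sy <= 1e-300:  # converged, or no positive curvature information: stop
            break
        alpha = (s @ s) / sy
    return x
```

The step size is computed from two inner products per iteration and needs no line search, which is one reason the rule is attractive for large reconstruction problems.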
ITER overview

International Nuclear Information System (INIS)

Shimomura, Y.; Aymar, R.; Chuyanov, V.; Huguet, M.; Parker, R.R.

2001-01-01

This report summarizes six years of technical work by the ITER Joint Central Team and Home Teams under the terms of the ITER Engineering Design Activities Agreement. The major products are as follows: a complete and detailed engineering design with supporting assessments, industry-based cost estimates and schedule, a non-site-specific comprehensive safety and environmental assessment, and technology R and D to validate and qualify the design, including proof of technologies and industrial manufacture and testing of full-size or scalable models of key components. The ITER design is at an advanced stage of maturity and contains sufficient technical information for a construction decision. The operation of ITER will demonstrate the availability of a new energy source, fusion. (author)
ITER Overview

International Nuclear Information System (INIS)

Shimomura, Y.; Aymar, R.; Chuyanov, V.; Huguet, M.; Parker, R.

1999-01-01

This report summarizes six years of technical work by the ITER Joint Central Team and Home Teams under the terms of the ITER Engineering Design Activities Agreement. The major products are as follows: a complete and detailed engineering design with supporting assessments, industry-based cost estimates and schedule, a non-site-specific comprehensive safety and environmental assessment, and technology R and D to validate and qualify the design, including proof of technologies and industrial manufacture and testing of full-size or scalable models of key components. The ITER design is at an advanced stage of maturity and contains sufficient technical information for a construction decision. The operation of ITER will demonstrate the availability of a new energy source, fusion. (author)
ITER Council proceedings: 1993

International Nuclear Information System (INIS)

1994-01-01

Records of the third ITER Council Meeting (IC-3), held on 21-22 April 1993, in Tokyo, Japan, and the fourth ITER Council Meeting (IC-4) held on 29 September - 1 October 1993 in San Diego, USA, are presented, giving essential information on the evolution of the ITER Engineering Design Activities (EDA), such as the text of the draft of Protocol 2 further elaborated in "ITER EDA Agreement and Protocol 2" (ITER EDA Documentation Series No. 5), recommendations on future work programmes: a description of technology R and D tasks; the establishment of a trust fund for the ITER EDA activities; arrangements for Visiting Home Team Personnel; the general framework for the involvement of other countries in the ITER EDA; conditions for the involvement of Canada in the Euratom Contribution to the ITER EDA; and other attachments as parts of the Records of Decision of the aforementioned ITER Council Meetings
ITER council proceedings: 1993

Energy Technology Data Exchange (ETDEWEB)

NONE

1994-12-31

Records of the third ITER Council Meeting (IC-3), held on 21-22 April 1993, in Tokyo, Japan, and the fourth ITER Council Meeting (IC-4) held on 29 September - 1 October 1993 in San Diego, USA, are presented, giving essential information on the evolution of the ITER Engineering Design Activities (EDA), such as the text of the draft of Protocol 2 further elaborated in "ITER EDA Agreement and Protocol 2" (ITER EDA Documentation Series No. 5), recommendations on future work programmes: a description of technology R and D tasks; the establishment of a trust fund for the ITER EDA activities; arrangements for Visiting Home Team Personnel; the general framework for the involvement of other countries in the ITER EDA; conditions for the involvement of Canada in the Euratom Contribution to the ITER EDA; and other attachments as parts of the Records of Decision of the aforementioned ITER Council Meetings.
Parallelizing Gene Expression Programming Algorithm in Enabling Large-Scale Classification

Directory of Open Access Journals (Sweden)

Lixiong Xu

2017-01-01

As one of the most effective function mining algorithms, the Gene Expression Programming (GEP) algorithm has been widely used in classification, pattern recognition, prediction, and other research fields. Based on self-evolution, GEP is able to mine an optimal function for dealing with further complicated tasks. However, in big data research, GEP suffers from low efficiency because of its long mining processes. To improve the efficiency of GEP in big data research, especially for processing large-scale classification tasks, this paper presents a parallelized GEP algorithm using the MapReduce computing model. The experimental results show that the presented algorithm is scalable and efficient for processing large-scale classification tasks.

Massive Asynchronous Parallelization of Sparse Matrix Factorizations

Energy Technology Data Exchange (ETDEWEB)

Chow, Edmond [Georgia Inst. of Technology, Atlanta, GA (United States)

2018-01-08

Solving sparse problems is at the core of many DOE computational science applications. We focus on the challenge of developing sparse algorithms that can fully exploit the parallelism in extreme-scale computing systems, in particular systems with massive numbers of cores per node. Our approach is to express a sparse matrix factorization as a large number of bilinear constraint equations, and then to solve these equations via an asynchronous iterative method. The unknowns in these equations are the matrix entries of the desired factorization.
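The idea of treating the factor entries as unknowns of bilinear equations can be sketched on a small example. This is a minimal illustration, not the project's code, and it rests on several assumptions: the factorization is taken to be an incomplete LU restricted to the nonzero pattern of A, the constraints are sum_k l_ik u_kj = a_ij for each (i, j) in that pattern, and synchronous Jacobi-style sweeps in serial NumPy stand in for truly asynchronous updates. The function name `fixed_point_ilu` is hypothetical.

```python
import numpy as np

def fixed_point_ilu(A, sweeps=10):
    """Approximate L (unit lower) and U with L @ U = A on the nonzero pattern of A."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    pattern = [(i, j) for i in range(n) for j in range(n) if A[i, j] != 0.0]
    # Initial guess: scaled strictly lower part of A for L, upper part of A for U.
    L = np.tril(A, -1) / np.diag(A)
    np.fill_diagonal(L, 1.0)
    U = np.triu(A)
    for _ in range(sweeps):
        # Every unknown is updated from the previous sweep's values, so the
        # updates within a sweep are independent and could run concurrently.
        L_old, U_old = L.copy(), U.copy()
        for i, j in pattern:
            k = min(i, j)
            residual = A[i, j] - L_old[i, :k] @ U_old[:k, j]
            if i > j:
                L[i, j] = residual / U_old[j, j]  # l_ij from its bilinear equation
            else:
                U[i, j] = residual                # u_ij from its bilinear equation
    return L, U
```

For a tridiagonal matrix the pattern admits no fill-in, so the sweeps converge to the exact LU factors; for general sparse matrices the result would only be an incomplete factorization, typically used as a preconditioner.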
ITER central solenoid model coil heat treatment complete and assembly started

International Nuclear Information System (INIS)

Thome, R.J.; Okuno, K.

1998-01-01

A major R and D task in the ITER program is to fabricate a Superconducting Model Coil for the Central Solenoid to establish the design and fabrication methods for ITER size coils and to demonstrate conductor performance. Completion of its components is expected in 1998, to be followed by assembly with structural components and testing in a facility at JAERI
ITER-FEAT safety

International Nuclear Information System (INIS)

Gordon, C.W.; Bartels, H.-W.; Honda, T.; Raeder, J.; Topilski, L.; Iseli, M.; Moshonas, K.; Taylor, N.; Gulden, W.; Kolbasov, B.; Inabe, T.; Tada, E.

2001-01-01

Safety has been an integral part of the design process for ITER since the Conceptual Design Activities of the project. The safety approach adopted in the ITER-FEAT design and the complementary assessments underway, to be documented in the Generic Site Safety Report (GSSR), are expected to help demonstrate the attractiveness of fusion and thereby set a good precedent for future fusion power reactors. The assessments address ITER's radiological hazards taking into account fusion's favourable safety characteristics. The expectation that ITER will need regulatory approval has influenced the entire safety design and assessment approach. This paper summarises the ITER-FEAT safety approach and assessments underway. (author)
Hybrid parallelization of the XTOR-2F code for the simulation of two-fluid MHD instabilities in tokamaks

Science.gov (United States)

Marx, Alain; Lütjens, Hinrich

2017-03-01

A hybrid MPI/OpenMP parallel version of the XTOR-2F code [Lütjens and Luciani, J. Comput. Phys. 229 (2010) 8130] solving the two-fluid MHD equations in full tokamak geometry by means of an iterative Newton-Krylov matrix-free method has been developed. The present work shows that the code has been parallelized significantly despite the numerical profile of the problem solved by XTOR-2F, i.e. a discretization with pseudo-spectral representations in all angular directions, the stiffness of the two-fluid stability problem in tokamaks, and the use of a direct LU decomposition to invert the physical pre-conditioner at every Krylov iteration of the solver. The execution time of the parallelized version is an order of magnitude smaller than the sequential one for low resolution cases, with an increasing speedup when the discretization mesh is refined. Moreover, it allows simulations to be performed at higher resolutions that were previously out of reach because of memory limitations.
A study of reconstruction artifacts in cone beam tomography using filtered backprojection and iterative EM algorithms

International Nuclear Information System (INIS)

Zeng, G.L.; Gullberg, G.T.

1990-01-01

Reconstruction artifacts in cone beam tomography are studied for filtered backprojection (Feldkamp) and iterative EM algorithms. The filtered backprojection algorithm uses a voxel-driven, interpolated backprojection to reconstruct the cone beam data, whereas the iterative EM algorithm performs ray-driven projection and backprojection operations for each iteration. Two weighting schemes for the projection and backprojection operations in the EM algorithm are studied. One weights each voxel by the length of the ray through the voxel and the other equates the value of a voxel to the functional value of the midpoint of the line intersecting the voxel, which is obtained by interpolating between eight neighboring voxels. Cone beam reconstruction artifacts such as rings, bright vertical extremities, and slice-to-slice cross talk are not found with parallel beam and fan beam geometries.
Disruptions in ITER and strategies for their control and mitigation

Energy Technology Data Exchange (ETDEWEB)

Lehnen, M., E-mail: michael.lehnen@iter.org [ITER Organization, Route de Vinon sur Verdon, 13115 St Paul Lez Durance (France); Aleynikova, K.; Aleynikov, P.B.; Campbell, D.J. [ITER Organization, Route de Vinon sur Verdon, 13115 St Paul Lez Durance (France); Drewelow, P. [Max-Planck-Institut für Plasmaphysik, Greifswald branch, EURATOM Ass., D-17491 Greifswald (Germany); Eidietis, N.W. [General Atomics, P.O. Box 85608, San Diego, CA 92186-5608 (United States); Gasparyan, Yu. [National Research Nuclear University MEPhI (Moscow Engineering Physics Institute), Kashirskoe sh. 31, Moscow 115409 (Russian Federation); Granetz, R.S. [MIT Plasma Science and Fusion Center, Cambridge, MA 02139 (United States); Gribov, Y. [ITER Organization, Route de Vinon sur Verdon, 13115 St Paul Lez Durance (France); Hartmann, N. [Forschungszentrum Jülich GmbH, Institute of Energy and Climate Research—Plasma Physics, Association EURATOM-FZJ, Trilateral Euregio Cluster, 52425 Jülich (Germany); Hollmann, E.M. [University of California-San Diego, La Jolla, CA 92093 (United States); Izzo, V.A. [ITER Organization, Route de Vinon sur Verdon, 13115 St Paul Lez Durance (France); Jachmich, S. [Laboratory for Plasma Physics, ERM/KMS, Association EURATOM – Belgian State, B-1000 Brussels (Belgium); Kim, S.-H.; Kočan, M. [ITER Organization, Route de Vinon sur Verdon, 13115 St Paul Lez Durance (France); Koslowski, H.R. [Forschungszentrum Jülich GmbH, Institute of Energy and Climate Research—Plasma Physics, Association EURATOM-FZJ, Trilateral Euregio Cluster, 52425 Jülich (Germany); Kovalenko, D. [SRC RF TRINITI, ul. Pushkovykh, vladenie 12, Troitsk, Moscow 142190 (Russian Federation); Kruezi, U. [CCFE, Culham Science Centre, Abingdon, Oxon, OX14 3DB (United Kingdom); and others

2015-08-15

The thermal and electromagnetic loads related to disruptions in ITER are substantial and require careful design of tokamak components to ensure they reach the projected lifetime and to ensure that safety relevant components fulfil their function for the worst foreseen scenarios. The disruption load specifications are the basis for the design process of components like the full-W divertor, the blanket modules and the vacuum vessel and will set the boundary conditions for ITER operations. This paper will give a brief overview on the disruption loads and mitigation strategies for ITER and will discuss the physics basis which is continuously refined through the current disruption R&D programs.
Copper Mountain conference on iterative methods: Proceedings: Volume 1

Energy Technology Data Exchange (ETDEWEB)

NONE

1996-10-01

This volume (one of two) contains information presented during the first three days of the Copper Mountain Conference on Iterative Methods held April 9-13, 1996 at Copper Mountain, Colorado. Topics of the sessions held these three days included nonlinear systems, parallel processing, preconditioning, sparse matrix test collections, first-order system least squares, Arnoldi's method, integral equations, software, Navier-Stokes equations, Euler equations, Krylov methods, and eigenvalues. The top three papers from a student competition are also included. Selected papers indexed separately for the Energy Science and Technology Database.
Advanced neutron diagnostics for ITER fusion experiments

International Nuclear Information System (INIS)

Kaellne, J.; Giacomelli, L.; Hjalmarsson, A.; Conroy, S.; Ericsson, G.; Johnson, M.G.; Glasser, W.; Henriksson, H.; Ronchi, E.; Sjoestrand, H.; Andersson, E.S.; Thun, J.; Weiszflog, M.; Gorini, G.; Tardocchi, M.; Popovichev, S.; Sousa, J.

2005-01-01

Results are presented from the neutron emission spectroscopy (NES) diagnosis of JET plasma performed with the MPR during the DTE1 campaign of 1997 and the recent TTE of 2003. The NES diagnostic capabilities at JET are presently being drastically enhanced by an upgrade of the MPR (MPRu) and a new 2.5-MeV TOF neutron spectrometer (TOFOR). The principles of MPRu and TOFOR are described and illustrated with the diagnostic role they will play in the high performance fusion experiments in the forward program of JET largely aimed at supporting ITER. The importance of the JET NES effort for ITER is discussed. (author)
Remote handling demonstration of ITER blanket module replacement

International Nuclear Information System (INIS)

Kakudate, S.; Nakahira, M.; Oka, K.; Taguchi, K.; Obara, K.; Tada, E.; Shibanuma, K.; Tesini, A.; Haange, R.; Maisonnier, D.

2001-01-01

In ITER, the in-vessel components such as the blanket are to be maintained or replaced remotely since they will be activated by 14 MeV neutrons, and a complete exchange of the shielding blanket with a breeding blanket is foreseen after the Basic Performance Phase. The blanket is segmented into about seven hundred modules to facilitate remote maintainability and allow individual module replacement. For this, the remote handling equipment for blanket maintenance is required to handle a module with a dead weight of about 4 tonnes within a positioning accuracy of a few mm under intense gamma radiation. According to the ITER R and D program, a rail-mounted vehicle manipulator system was developed and the basic feasibility of this system was verified through prototype testing. Following this, development of full-scale remote handling equipment has been conducted as one of the ITER Seven R and D Projects aiming at a remote handling demonstration of the ITER blanket. As a result, the Blanket Test Platform (BTP) composed of the full-scale remote handling equipment has been completed and the first integrated performance test in March 1998 has shown that the fabricated remote handling equipment satisfies the main requirements of ITER blanket maintenance. (author)
The Davidson Method as an alternative to power iterations for criticality calculations

International Nuclear Information System (INIS)

Subramanian, C.; Van Criekingen, S.; Heuveline, V.; Nataf, F.; Have, P.

2011-01-01

The Davidson method is implemented within the neutron transport core solver parafish to solve k-eigenvalue criticality transport problems. The parafish solver is based on domain decomposition, uses spherical harmonics (P_N method) for angular discretization, and nonconforming finite elements for spatial discretization. The Davidson method is compared to the traditional power iteration method in that context. Encouraging numerical results are obtained with both sequential and parallel calculations. (author)
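For orientation, the baseline that the Davidson method is compared against can be sketched generically. The following is a minimal power iteration for the dominant eigenvalue of a small symmetric matrix; it is not taken from the parafish solver, and the names `power_iteration` and `matvec` as well as the tolerance are assumptions made for illustration.

```python
import math


def power_iteration(matvec, x0, tol=1e-12, max_iter=1000):
    """Estimate the dominant eigenpair by repeated operator application.

    `matvec` is any callable returning A @ x as a list; the Rayleigh quotient
    used here as the eigenvalue estimate assumes a symmetric operator.
    """
    norm = math.sqrt(sum(v * v for v in x0))
    x = [v / norm for v in x0]
    eig = 0.0
    for _ in range(max_iter):
        y = matvec(x)
        new_eig = sum(a * b for a, b in zip(x, y))  # x has unit norm
        norm = math.sqrt(sum(v * v for v in y))
        x = [v / norm for v in y]
        if abs(new_eig - eig) < tol:
            return new_eig, x
        eig = new_eig
    return eig, x


def example_matvec(x):
    """Apply the hypothetical 2x2 test matrix [[2, 1], [1, 3]]."""
    return [2.0 * x[0] + 1.0 * x[1], 1.0 * x[0] + 3.0 * x[1]]


if __name__ == "__main__":
    eig, vec = power_iteration(example_matvec, [1.0, 1.0])
    print(eig, vec)
```

The convergence rate of power iteration is governed by the ratio of the two largest eigenvalue magnitudes, which is the usual motivation for considering subspace methods such as Davidson when that ratio is close to one.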
Status and plans for U.S. ITER studies

International Nuclear Information System (INIS)

Doggett, J.N.

1992-01-01

The United States' participation in the International Thermonuclear Experimental Reactor (ITER) began in late 1987 when the initiative to start a cooperative program among the four Parties (the Soviet Union, Japan, the European Community, and the United States) was launched. Participation then continued through the start of joint work in May 1988 until the conclusion of the Conceptual Design Activities (CDA) in December 1990. In the period between the conclusion of the CDA and the agreement to execute the Engineering Design Activities (EDA), the U.S. ITER Home Team continued to do work on the design, executed additional research and development (R and D) and participated in the preparations for the EDA. Activities included one major design study on a High-Aspect-Ratio Design (HARD) and input to the National ITER Technical Review, the ITER Steering Committee-U.S. (ISCUS), Special Working Group 1 (SWG-1), and the Fusion Energy Advisory Committee's Panel 1 (FEAC-1). Research and development was continued in areas of work that were identified as critical-path elements by an international panel chartered by the four ITER Parties near the end of the CDA. During the interim period, the U.S. Home Team Management (HTM) was in the process of organizing to support the EDA both at home and in the central design sites. The major efforts have been in producing a management plan, establishing memorandums of agreement with the performing institutions for ITER tasks, establishing an industrial council, and producing a list of candidates who are qualified, willing, and available to serve on the joint Central Team or to participate in ITER home tasks. The author describes the conclusion of the CDA and the interim U.S. ITER activities and gives an indication of U.S. involvement in the EDA.
New proposal on the development of machine protection functions for ITER diagnostics control

International Nuclear Information System (INIS)

Yamamoto, Tsuyoshi; Yatsuka, Eiichi; Hatae, Takaki; Takeuchi, Masaki; Kitazawa, Sin-iti; Ogawa, Hiroaki; Kawano, Yasunori; Itami, Kiyoshi; Ota, Kazuya; Hashimoto, Yasunori; Nakamura, Kitaru; Sugie, Tatsuo

2016-01-01

There is a need to develop ITER instrumentation and control (I and C) systems with high reliabilities. Interlock systems that activate machine protection functions are implemented on robust wired-logic systems such as programmable logic controllers (PLCs). We herein propose a software tool that generates program code templates for the control systems using PLC logic. This tool decreases careless mistakes by developers and increases reliability of the program codes. A large-scale engineering database has been implemented in the ITER project. To derive useful information from this database, we propose adding semantic data to it using the Resource Description Framework format. In our novel proposal for the ITER diagnostic control system, a guide words generator that analyzes the engineering data by inference is applied to the hazard and operability study. We validated the methods proposed in this paper by applying them to the preliminary design for the I and C system of the ITER edge Thomson scattering system. (author)
ITER plasma facing components, design and development

International Nuclear Information System (INIS)

Vieider, G.; Cardella, A.; Akiba, M.; Matera, R.; Watson, R.

1991-01-01

The paper summarizes the collaborative effort of the ITER Conceptual Design Activity (CDA) on Plasma Facing Components (PFC) which focused on the following main tasks: (a) The definition of basic design concepts for the First Wall (FW) and Divertor Plates (DP), (b) the analysis of the performance and likely lifetime of these PFC designs including the identification of major critical issues, (c) the start of R and D work, which has already given first results, and the definition of the further R and D program required to support the contemplated ITER Engineering Design Activity (EDA). From the ITER CDA effort on PFC it is mainly concluded that: (a) The expected PFC operating conditions lead to design solutions at the limit of present technology, in particular for the divertor, which may constrain the overall machine performance, (b) the development of convincing PFC designs requires an intensified R and D effort both on PFC technology and plasma physics. (orig.)
Development of radiation hardness components for ITER remote maintenance

Energy Technology Data Exchange (ETDEWEB)

Obara, Kenjiro; Kakudate, Satoshi; Oka, Kiyoshi; Ito, Akira [Japan Atomic Energy Research Inst., Tokai, Ibaraki (Japan). Tokai Research Establishment; Yagi, Toshiaki; Morita, Yousuke

1998-04-01

In the ITER, in-vessel remote handling is required to assemble and maintain in-vessel components in DT operations. Since in-vessel remote handling systems must operate under intense gamma ray radiation exceeding 30 kGy/h, their components must have sufficiently high radiation hardness to allow maintenance long enough in ITER in-vessel environments. Thus, extensive radiation tests and quality improvement, including optimization of material compositions, have been conducted through the ITER R and D program to develop radiation hardness components that meet radiation doses from 10 to 100 MGy at 10 kGy/h. This paper presents the latest on radiation hardness component development conducted by the Japan Home Team as a contribution to the ITER. The remote handling components tested are categorized for use in robotic or viewing systems, or as common components. Radiation tests have been conducted on commercially available products for screening, on modified products, and on new products to improve the radiation hardness. (author)
HPC parallel programming model for gyrokinetic MHD simulation

International Nuclear Information System (INIS)

Naitou, Hiroshi; Yamada, Yusuke; Tokuda, Shinji; Ishii, Yasutomo; Yagi, Masatoshi

2011-01-01

The 3-dimensional gyrokinetic PIC (particle-in-cell) code for MHD simulation, Gpic-MHD, was installed on SR16000 (“Plasma Simulator”), which is a scalar cluster system consisting of 8,192 logical cores. The Gpic-MHD code advances particle and field quantities in time. In order to distribute calculations over a large number of logical cores, the total simulation domain in cylindrical geometry was broken up into N_DD-r × N_DD-z (number of radial decompositions times number of axial decompositions) small domains, each including approximately the same number of particles. The axial direction was uniformly decomposed, while the radial direction was non-uniformly decomposed. N_RP replicas (copies) of each decomposed domain were used (“particle decomposition”). The hybrid parallelization model of multi-threads and multi-processes was employed: threads were parallelized by auto-parallelization and the N_DD-r × N_DD-z × N_RP processes were parallelized by MPI (message-passing interface). The parallelization performance of Gpic-MHD was investigated for a medium-size system of N_r × N_θ × N_z = 1025 × 128 × 128 mesh with 4.196 or 8.192 billion particles. The highest speed for a fixed number of logical cores was obtained for two threads, the maximum number of N_DD-z, and the optimum combination of N_DD-r and N_RP. The observed optimum speeds demonstrated good scaling up to 8,192 logical cores. (author)
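The decomposition described above can be made concrete with a small sketch. It rests on two assumptions that are not stated in the abstract: the particle density is taken to be uniform, so that radial shells of equal cross-sectional area hold roughly equal particle counts, and process ranks are laid out with the replica index varying fastest. The function names `radial_boundaries` and `decode_rank` are hypothetical and do not come from Gpic-MHD.

```python
import math


def radial_boundaries(radius, n_dd_r):
    """Radial cut points giving equal cross-sectional area per shell.

    Assumes a uniform particle density (an assumption of this sketch), so
    equal area implies an approximately equal particle count per shell.
    """
    return [radius * math.sqrt(k / n_dd_r) for k in range(n_dd_r + 1)]


def decode_rank(rank, n_dd_r, n_dd_z, n_rp):
    """Map a flat process rank to (radial domain, axial domain, replica).

    The ordering, with the replica index varying fastest, is an arbitrary
    choice made for this sketch.
    """
    if not 0 <= rank < n_dd_r * n_dd_z * n_rp:
        raise ValueError("rank out of range")
    domain, replica = divmod(rank, n_rp)
    i_r, i_z = divmod(domain, n_dd_z)
    return i_r, i_z, replica


if __name__ == "__main__":
    print(radial_boundaries(1.0, 4))
    print(decode_rank(13, n_dd_r=2, n_dd_z=4, n_rp=2))
```

Under the uniform-density assumption the radial shells become thinner toward the outer edge, which is one simple way a non-uniform radial decomposition can arise in cylindrical geometry.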
ITER ITA newsletter. No. 24, July 2005

International Nuclear Information System (INIS)

2005-08-01

stimulant for international co-operation on science and technology in the twenty first century, and taking a broader view of the situation, Japan has decided that they will let the EU host the ITER site. Dr. J. Potocnik, European Commissioner for Science and Research, thanked Minister Nakayama for the highly constructive spirit with which he and his colleagues had conducted the bilateral discussions. He expressed his respect for the honourable manner in which the most sensitive stages were handled. He pointed out that the EU was well aware of the important task it had in front of it as the Host of ITER. The action taken had implications beyond that of establishing fusion energy. It was also an expression of mutual confidence to face the scientific, technical and political challenges that will occur in the course of this first-of-a-kind true international science cooperation among the leading nations of the world. ITER was establishing a model of global co-operation to address the increasingly global nature of the challenges confronting today's society. The Chinese Minister of Science and Technology, Mr. Xu Guanhua, expressed his pleasure that agreement on the site had been found within the six-Party framework. China considered that a sustainable solution to the world's energy source problem required multilateral international collaboration on fusion, so that participants could complement each other's skills and pool resources in the shared challenge. Mr. S. Choi, Vice-Minister of Science and Technology, Republic of Korea, reminded the delegates that the eyes of the world were on ITER as one of the most significant projects of the century, with a view to it being a peaceful and affluent one. Having just crossed the barrier of the site decision, there was still more to be done ahead, particularly by concluding the ITER Joint Implementation Agreement as soon as possible. 
He quoted a Korean proverb, literally translated as 'After rain ground hardens', which parallels with the
Recent Progress on ECH Technology for ITER

Science.gov (United States)

Sirigiri, Jagadishwar

2005-10-01

The Electron Cyclotron Heating and Current Drive (ECH&CD) system for ITER is a critical ITER system that must be available for use on Day 1 of the ITER experimental program. The applications of the system include plasma start-up, plasma heating and suppression of Neoclassical Tearing Modes (NTMs). These applications are accomplished using 27 one megawatt continuous wave gyrotrons: 24 at a frequency of 170 GHz and 3 at a frequency of 120 GHz. There are DC power supplies for the gyrotrons, a transmission line system, one launcher at the equatorial plane and three upper port launchers. The US will play a major role in delivering parts of the ECH&CD system to ITER. The present state-of-the-art includes major advances in all areas of ECH technology. In the US, a major effort is underway to supply gyrotrons of up to 1.5 MW power level at 110 GHz to General Atomics for use in heating the DIII-D tokamak. This presentation will include a brief review of the state-of-the-art, worldwide, in ECH technology. The requirements for the ITER ECH&CD system will then be reviewed. ITER calls for gyrotrons capable of operating from a 50 kV power supply, after potential depression, with a minimum of 50% overall efficiency. This is a very significant challenge and some approaches to meeting this goal will be presented. Recent experimental results at MIT showing improved efficiency of high frequency, 1.5 MW gyrotrons will be described. These results will be incorporated into the planned development of gyrotrons for ITER. The ITER ECH&CD system will also be a challenge to the transmission lines, which must operate at high average power at up to 1000 seconds and with high efficiency. The technology challenges and efforts in the US and other ITER parties to solve these problems will be reviewed. *In collaboration with E. Choi, C. Marchewka, I. Mastovosky, M. A. Shapiro and R. J. Temkin. This work is supported by the Office of Fusion Energy Sciences of the U. S. Department of Energy.
An Introduction to Parallel Computation R

Indian Academy of Sciences (India)

How are they programmed? This article provides an introduction. A parallel computer is a network of processors built for ... and have been used to solve problems much faster than a single ... in parallel computer design is to select an organization which ..... The most ambitious approach to parallel computing is to develop.
Is Carbon a Realistic Choice for ITER's Divertor?

International Nuclear Information System (INIS)

Skinner, C.H.; Federici, G.

2005-01-01

Tritium retention by co-deposition with carbon on the divertor target plate is predicted to limit ITER's DT burning plasma operations (e.g. to about 100 pulses for the worst conditions) before the in-vessel tritium inventory limit, currently set at 350 g, is reached. At this point, ITER will only be able to continue its burning plasma program if technology is available that is capable of rapidly removing large quantities of tritium from the vessel with over 90% efficiency. The removal rate required is four orders of magnitude faster than that demonstrated in current tokamaks. Eighteen years after the observation of co-deposition on JET and TFTR, such technology is nowhere in sight. The inexorable conclusion is that either a major initiative in tritium removal should be funded or that research priorities for ITER should focus on metal alternatives

ITER CTA newsletter. No. 14, November 2002

International Nuclear Information System (INIS)

2003-01-01

The Sixth ITER Negotiations Meeting (N6) took place on 29-30 October 2002 at Rokkasho-mura in the Aomori Prefecture - the location of the site that Japan has offered to host the ITER project. Japan hosted the meeting, which was also attended by delegations from Canada, the European Union, and the Russian Federation. At the start of the meeting, Mr. Yoshiro Mori, the former Prime Minister of Japan said that energy issues are important to achieving human prosperity, world peace and conservation of the environment, and that therefore the Japanese Government as a whole should promote the ITER project under international collaboration to realize fusion energy. The JA delegation reported that JA had sent a letter to China on 22 October 2002 on behalf of the ITER Negotiators in response to a letter from Mr. Liu, Vice Minister of Science and Technology of China. The Canadian delegation reported on the special informal ITER session at the IAEA Fusion Energy Conference in Lyon, France, and noted that it raised the ITER profile in a positive way. The EU delegation reported on the adoption, within the Sixth Framework Programme, of the Specific Euratom Programme, which gives an explicit basis for continuing activities in the period up to the end of 2006, including a provision of up to Euro 200 million for a possible start of ITER construction. The RF delegation reported that the ITER activities in the Russian Federation are conducted in accordance with the Federal Program (2002-2005) approved by the Russian Government. Funding for ITER activities in 2003 is expected to be on the same level as in previous years. It was reported that the mandate of the Russian delegation to participate in the Negotiations in 2003 is expected to be approved soon by the Government. The RF delegation also reported that they had received informal enquiries from the Republic of Korea about possible participation in ITER. Significant progress was also made on a wide range of other issues, including
Sixth negotiations meeting on the joint implementation of ITER

International Nuclear Information System (INIS)

Okumura, Y.

2003-01-01

During the Sixth ITER Negotiations Meeting (N6), the JA delegation reported that JA had sent a letter to China on 22 October 2002 on behalf of the ITER Negotiators in response to a letter from Mr. Liu, Vice Minister of Science and Technology of China. The Canadian delegation reported on the special informal ITER session at the IAEA Fusion Energy Conference in Lyon, France, and noted that it raised the ITER profile in a positive way. The EU delegation reported on the adoption, within the Sixth Framework Programme, of the Specific Euratom Programme, which gives an explicit basis for continuing activities in the period up to the end of 2006, including a provision of up to Euro 200 million for a possible start of ITER construction. The RF delegation reported that the ITER activities in the Russian Federation are conducted in accordance with the Federal Program (2002-2005) approved by the Russian Government. Funding for ITER activities in 2003 is expected to be on the same level as in previous years. It was reported that the mandate of the Russian delegation to participate in the Negotiations in 2003 is expected to be approved soon by the Government. The RF delegation also reported that they had received informal enquiries from the Republic of Korea about possible participation in ITER. Significant progress was also made on a wide range of other issues, including matters such as the treaty to implement ITER (the Joint Implementation Agreement - JIA), procurement allocation and the intellectual property rights that would accrue to participants in the project. The Negotiators agreed that the international organization responsible for implementing the project would be called the ITER International Fusion Energy Organization. The delegations noted the progress in developing the fifth draft of the JIA and charged the NSSG to elaborate further the JIA and Related Instruments. At the conclusion of the N6 meeting, the delegations reaffirmed their belief that the critical issues
Comparison of the deflated preconditioned conjugate gradient method and parallel direct solver for composite materials

NARCIS (Netherlands)

Jönsthövel, T.B.; Van Gijzen, M.B.; MacLachlan, S.; Vuik, C.; Scarpas, A.

2011-01-01

The demand for large FE meshes increases as parallel computing becomes the standard in FE simulations. Direct and iterative solution methods are used to solve the resulting linear systems. Many applications concern composite materials, which are characterized by large discontinuities in the material
A Guided Online and Mobile Self-Help Program for Individuals With Eating Disorders: An Iterative Engagement and Usability Study.

Science.gov (United States)

Nitsch, Martina; Dimopoulos, Christina N; Flaschberger, Edith; Saffran, Kristina; Kruger, Jenna F; Garlock, Lindsay; Wilfley, Denise E; Taylor, Craig B; Jones, Megan

2016-01-11

Numerous digital health interventions have been developed for mental health promotion and intervention, including eating disorders. Efficacy of many interventions has been evaluated, yet knowledge about reasons for dropout and poor adherence is scarce. Most digital health intervention studies lack appropriate research design and methods to investigate individual engagement issues. User engagement and program usability are inextricably linked, making usability studies vital in understanding and improving engagement. The aim of this study was to explore engagement and corresponding usability issues of the Healthy Body Image Program-a guided online intervention for individuals with body image concerns or eating disorders. The secondary aim was to demonstrate the value of usability research in order to investigate engagement. We conducted an iterative usability study based on a mixed-methods approach, combining cognitive and semistructured interviews as well as questionnaires, prior to program launch. Two separate rounds of usability studies were completed, testing a total of 9 potential users. Thematic analysis and descriptive statistics were used to analyze the think-aloud tasks, interviews, and questionnaires. Participants were satisfied with the overall usability of the program. The average usability score was 77.5/100 for the first test round and improved to 83.1/100 after applying modifications for the second iteration. The analysis of the qualitative data revealed five central themes: layout, navigation, content, support, and engagement conditions. The first three themes highlight usability aspects of the program, while the latter two highlight engagement issues. An easy-to-use format, clear wording, the nature of guidance, and opportunity for interactivity were important issues related to usability. The coach support, time investment, and severity of users' symptoms, the program's features and effectiveness, trust, anonymity, and affordability were relevant to
The parallel processing of EGS4 code on distributed memory scalar parallel computer: Intel Paragon XP/S15-256

Energy Technology Data Exchange (ETDEWEB)

Takemiya, Hiroshi; Ohta, Hirofumi; Honma, Ichirou

1996-03-01

The parallelization of the Electro-Magnetic Cascade Monte Carlo Simulation Code, EGS4, on a distributed memory scalar parallel computer, the Intel Paragon XP/S15-256, is described. In EGS4, the calculation time varies considerably from one incident particle to another because of the dynamic generation of secondary particles and the different behavior of each particle. Granularity for parallel processing, the parallel programming model and the algorithm of parallel random number generation are discussed, and two methods, which allocate particles dynamically or statically, are used to realize high speed parallel processing of this code. Among four problems chosen for performance evaluation, speedup factors of nearly 100 were attained for three problems with 128 processors. It was found that when both the calculation time for each incident particle and its dispersion are large, it is preferable to use the dynamic particle allocation method, which can average the load across processors. It was also found that when they are small, it is preferable to use the static particle allocation method, which reduces the communication overhead. Moreover, it is pointed out that double precision variables must be used in the EGS4 code to obtain accurate results. Finally, the workflow of program parallelization is analyzed and tools for program parallelization are discussed in light of the experience of the EGS4 parallelization. (author).
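The trade-off between the two allocation methods can be illustrated with a toy model. The sketch below compares the finishing time (makespan) of a static round-robin assignment with a dynamic assignment that hands each particle to the first idle worker; the per-particle costs are invented, communication overhead is not modelled, and the function names are hypothetical rather than part of the EGS4 parallelization.

```python
import heapq


def static_makespan(costs, workers):
    """Finishing time when particle i is pre-assigned to worker i % workers."""
    loads = [0.0] * workers
    for i, cost in enumerate(costs):
        loads[i % workers] += cost
    return max(loads)


def dynamic_makespan(costs, workers):
    """Finishing time when each particle goes to the first worker to be free."""
    free_at = [0.0] * workers
    heapq.heapify(free_at)
    for cost in costs:
        earliest = heapq.heappop(free_at)
        heapq.heappush(free_at, earliest + cost)
    return max(free_at)


if __name__ == "__main__":
    uneven = [8.0, 1.0, 8.0, 1.0, 8.0, 1.0, 8.0, 1.0]  # large dispersion
    even = [2.0] * 8                                   # no dispersion
    print(static_makespan(uneven, 2), dynamic_makespan(uneven, 2))
    print(static_makespan(even, 2), dynamic_makespan(even, 2))
```

With widely varying costs the dynamic scheme balances the load, while with uniform costs both schemes finish at the same time, so the extra coordination of the dynamic scheme brings no benefit in this model.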
Helium embrittlement model and program plan for weldability of ITER materials

International Nuclear Information System (INIS)

Louthan, M.R. Jr.; Kanne, W.R. Jr.; Tosten, M.H.; Rankin, D.T.; Cross, B.J.

1997-02-01

This report presents a refined model of how helium embrittles irradiated stainless steel during welding. The model was developed based on experimental observations drawn from experience at the Savannah River Site and from an extensive literature search. The model shows how helium content, stress, and temperature interact to produce embrittlement. The model takes into account defect structure, time, and gradients in stress, temperature and composition. The report also proposes an experimental program based on the refined helium embrittlement model. A parametric study of the effect of initial defect density on the resulting helium bubble distribution and weldability of tritium aged material is proposed to demonstrate the role that defects play in embrittlement. This study should include samples charged using vastly different aging times to obtain equivalent helium contents. Additionally, studies to establish the minimal sample thickness and size are needed for extrapolation to real structural materials. The results of these studies should provide a technical basis for the use of tritium aged materials to predict the weldability of irradiated structures. Use of tritium charged and aged material would provide a cost effective approach to developing weld repair techniques for ITER components
Parallel computing works

Energy Technology Data Exchange (ETDEWEB)

1991-10-23

An account of the Caltech Concurrent Computation Program (C{sup 3}P), a five year project that focused on answering the question: Can parallel computers be used to do large-scale scientific computations? As the title indicates, the question is answered in the affirmative, by implementing numerous scientific applications on real parallel computers and doing computations that produced new scientific results. In the process of doing so, C{sup 3}P helped design and build several new computers, designed and implemented basic system software, developed algorithms for frequently used mathematical computations on massively parallel machines, devised performance models and measured the performance of many computers, and created a high performance computing facility based exclusively on parallel computers. While the initial focus of C{sup 3}P was the hypercube architecture developed by C. Seitz, many of the methods developed and lessons learned have been applied successfully on other massively parallel architectures.
Rubus: A compiler for seamless and extensible parallelism

Science.gov (United States)

Adnan, Muhammad; Aslam, Faisal; Sarwar, Syed Mansoor

2017-01-01

Nowadays, a typical processor may have multiple processing cores on a single chip. Furthermore, a special purpose processing unit called Graphic Processing Unit (GPU), originally designed for 2D/3D games, is now available for general purpose use in computers and mobile devices. However, the traditional programming languages which were designed to work with machines having single core CPUs, cannot utilize the parallelism available on multi-core processors efficiently. Therefore, to exploit the extraordinary processing power of multi-core processors, researchers are working on new tools and techniques to facilitate parallel programming. To this end, languages like CUDA and OpenCL have been introduced, which can be used to write code with parallelism. The main shortcoming of these languages is that programmer needs to specify all the complex details manually in order to parallelize the code across multiple cores. Therefore, the code written in these languages is difficult to understand, debug and maintain. Furthermore, to parallelize legacy code can require rewriting a significant portion of code in CUDA or OpenCL, which can consume significant time and resources. Thus, the amount of parallelism achieved is proportional to the skills of the programmer and the time spent in code optimizations. This paper proposes a new open source compiler, Rubus, to achieve seamless parallelism. The Rubus compiler relieves the programmer from manually specifying the low-level details. It analyses and transforms a sequential program into a parallel program automatically, without any user intervention. This achieves massive speedup and better utilization of the underlying hardware without a programmer’s expertise in parallel programming. For five different benchmarks, on average a speedup of 34.54 times has been achieved by Rubus as compared to Java on a basic GPU having only 96 cores. Whereas, for a matrix multiplication benchmark the average execution speedup of 84 times has been
Rubus: A compiler for seamless and extensible parallelism.

Directory of Open Access Journals (Sweden)

Muhammad Adnan

Nowadays, a typical processor may have multiple processing cores on a single chip. Furthermore, a special purpose processing unit called Graphic Processing Unit (GPU), originally designed for 2D/3D games, is now available for general purpose use in computers and mobile devices. However, the traditional programming languages which were designed to work with machines having single core CPUs, cannot utilize the parallelism available on multi-core processors efficiently. Therefore, to exploit the extraordinary processing power of multi-core processors, researchers are working on new tools and techniques to facilitate parallel programming. To this end, languages like CUDA and OpenCL have been introduced, which can be used to write code with parallelism. The main shortcoming of these languages is that programmer needs to specify all the complex details manually in order to parallelize the code across multiple cores. Therefore, the code written in these languages is difficult to understand, debug and maintain. Furthermore, to parallelize legacy code can require rewriting a significant portion of code in CUDA or OpenCL, which can consume significant time and resources. Thus, the amount of parallelism achieved is proportional to the skills of the programmer and the time spent in code optimizations. This paper proposes a new open source compiler, Rubus, to achieve seamless parallelism. The Rubus compiler relieves the programmer from manually specifying the low-level details. It analyses and transforms a sequential program into a parallel program automatically, without any user intervention. This achieves massive speedup and better utilization of the underlying hardware without a programmer's expertise in parallel programming. For five different benchmarks, on average a speedup of 34.54 times has been achieved by Rubus as compared to Java on a basic GPU having only 96 cores. Whereas, for a matrix multiplication benchmark the average execution speedup of 84
Comparison of collective Thomson scattering signals due to fast ions in ITER scenarios with fusion and auxiliary heating

DEFF Research Database (Denmark)

Salewski, Mirko; Asunta, O.; Eriksson, L.-G.

2009-01-01

Auxiliary heating such as neutral beam injection (NBI) and ion cyclotron resonance heating (ICRH) will accelerate ions in ITER up to energies in the MeV range, i.e. energies which are also typical for alpha particles. Fast ions of any of these populations will elevate the collective Thomson...... functions of fast ions generated by NBI and ICRH are calculated for a steady-state ITER burning plasma equilibrium with the ASCOT and PION codes, respectively. The parameters for the auxiliary heating systems correspond to the design currently foreseen for ITER. The geometry of the CTS system for ITER...... is chosen such that near perpendicular and near parallel velocity components are resolved. In the investigated ICRH scenario, waves at 50MHz resonate with tritium at the second harmonic off-axis on the low field side. Effects of a minority heating scheme with He-3 are also considered. CTS scattering...
ITER council proceedings: 1998

International Nuclear Information System (INIS)

1999-01-01

This volume contains documents of the 13th and the 14th ITER council meeting as well as of the 1st extraordinary ITER council meeting. Documents of the ITER meetings held in Vienna and Yokohama during 1998 are also included. The contents include an outline of the ITER objectives, the ITER parameters and design overview as well as operating scenarios and plasma performance. Furthermore, design features, safety and environmental characteristics are given
Preparation of the ITER Poloidal Field Conductor Insert (PFCI) test

International Nuclear Information System (INIS)

Zanino, R.; Egorov, S.; Kim, K.; Martovetsky, N.; Nunoya, Y.; Okuno, K.; Salpietro, E.; Sborchia, C.; Takahashi, Y.; Weng, P.; Bangasco, M.; Savoldi Richard, L.; Polak, M.; Formisano, A.; Zapretilina, E.; Shikov, A.; Vedernikov, G.; Ciazynski, D.; Zani, L.; Muzzi, L.; Ricci, M.; Deela Corte, A.; Sugimoto, M.; Hamada, K.; Portone, A.; Hurd, F.; Mitchell, N.; Nijhuis, A.; Ilyin, Y.

2004-01-01

The Poloidal Field Conductor Insert (PFCI) of the International Thermonuclear Experimental Reactor (ITER) has been designed in Europe and is being manufactured at Tesla Engineering, UK, in the frame of a Task Agreement with the ITER International Team. Completion of the PFCI is expected at the beginning of 2005. Then, the coil shall be shipped to JAERI Naka, Japan, and inserted into the bore of the ITER Central Solenoid Model Coil, where it should be tested in 2005 to 2006. The PFCI consists of a NbTi dual-channel conductor, almost identical to the ITER PF1 and PF6 design, about 45 m long, with a 50 mm thick square stainless steel jacket, wound in a single-layer solenoid. It should carry up to 50 kA in a field of about 6 T, and it will be cooled by supercritical He at around 4.5 K and 0.6 MPa. An intermediate joint, representative of the ITER PF joints and located at relatively high field, will be an important new item in the test configuration with respect to the previous ITER Insert Coils. The PFCI will be fully instrumented with inductive and resistive heaters, as well as with voltage taps, Hall probes, pick-up coils, temperature sensors, pressure taps, strain and displacement sensors. The test program shall be aimed at DC and pulsed performance assessment of conductor and intermediate joint, AC loss measurement, stability and quench propagation, thermalhydraulic characterization. Here we give an overview of the preparatory work towards the test, including a review of the coil manufacturing and of the available instrumentation, a discussion of the most likely test program items, and a presentation of the supporting modeling and characterization work performed so far. (authors)
Using SharePoint to manage and disseminate fusion project information: An ITER case study

International Nuclear Information System (INIS)

Prescott, Barry; Downing, James; Di Maio, Marco; How, John

2010-01-01

The ITER Organization, in common with many other fusion laboratories, has an authenticated-access website devoted to the communication of information to all its staff and remote collaborators. In 2007 and 2008, the number of registered users of this site increased by more than a factor of ten, to over 3000 at present, and with approximately 900 unique users using the website per month. In parallel, the project management of the organisation has been put in place. A decision was taken to move the web platform from simple HTML to Microsoft SharePoint and to web-enable the many applications and databases used for ITER management. This decision has been well justified by the power and extensive flexibility provided by SharePoint, for example it permits different groups to publish their own information and to collaborate, and to consolidate disparate spreadsheet data in linked SharePoint lists to improve quality and maintainability. This paper examines the use of SharePoint at ITER: why it was selected and what benefits it brings to both the local and remote ITER community. Some active case studies are presented. The paper also looks ahead at what future benefits to ITER this platform offers, and reviews the type of information that the site can profitably publish. The paper also highlights some of the limitations of the platform, the problems of integration with other ITER systems, and discusses its potential for adaptability in other scientific organisations.
Iter

Science.gov (United States)

Iotti, Robert

2015-04-01

ITER is an international experimental facility being built by seven Parties to demonstrate the long term potential of fusion energy. The ITER Joint Implementation Agreement (JIA) defines the structure and governance model of such cooperation. There are a number of necessary conditions for such international projects to be successful: a complete design, strong systems engineering working with an agreed set of requirements, an experienced organization with systems and plans in place to manage the project, a cost estimate backed by industry, and someone in charge. Unfortunately for ITER many of these conditions were not present. The paper discusses the priorities in the JIA which led to setting up the project with a Central Integrating Organization (IO) in Cadarache, France as the ITER HQ, and seven Domestic Agencies (DAs) located in the countries of the Parties, responsible for delivering 90%+ of the project hardware as Contributions-in-Kind and also financial contributions to the IO, as "Contributions-in-Cash." Theoretically the Director General (DG) is responsible for everything. In practice the DG does not have the power to control the work of the DAs, and there is not an effective management structure enabling the IO and the DAs to arbitrate disputes, so the project is not really managed, but is a loose collaboration of competing interests. Any DA can effectively block a decision reached by the DG. Inefficiencies in completing design while setting up a competent organization from scratch contributed to the delays and cost increases during the initial few years. So did the fact that the original estimate was not developed from industry input. Unforeseen inflation and market demand on certain commodities/materials further exacerbated the cost increases. Since then, improvements are debatable. Does this mean that the governance model of ITER is a wrong model for international scientific cooperation? I do not believe so. Had the necessary conditions for success
.NET 4.5 parallel extensions

CERN Document Server

Freeman, Bryan

2013-01-01

This book contains practical recipes on everything you will need to create task-based parallel programs using C#, .NET 4.5, and Visual Studio. The book is packed with illustrated code examples to create scalable programs. This book is intended to help experienced C# developers write applications that leverage the power of modern multicore processors. It provides the necessary knowledge for an experienced C# developer to work with .NET parallelism APIs. Previous experience of writing multithreaded applications is not necessary.
Patterns for Parallel Software Design

CERN Document Server

Ortega-Arjona, Jorge Luis

2010-01-01

Essential reading to understand patterns for parallel programming. Software patterns have revolutionized the way we think about how software is designed, built, and documented, and the design of parallel software requires you to consider other particular design aspects and special skills. From clusters to supercomputers, success heavily depends on the design skills of software developers. Patterns for Parallel Software Design presents a pattern-oriented software architecture approach to parallel software design. This approach is not a design method in the classic sense, but a new way of managin
ITER EDA newsletter. V. 8, no. 6

International Nuclear Information System (INIS)

1999-06-01

A ceremony was held on 1 June 1999 at the Naka Fusion Research Establishment of JAERI to celebrate the successful development and fabrication of the ITER Central Solenoid Model Coil Inner Module and Outer Module and the CS Insert Coil. On this occasion, Dr. Martha Krebs of the US-DOE expressed regret at the withdrawal of the United States from the ITER project and said that the US now looks to Japan, the European Union and the Russian Federation to continue making progress. In response, Mr. Tsutomu Imamura said that the withdrawal was to be regretted and stated that Japan actively promoted the ITER project. Then, Dr. Michel Huguet, representing the JCT, presented a message from Dr. R. Aymar, the director of the ITER program. In this message he indicated that everyone who had been involved in the project could take great pride in it. The ceremony was concluded by warm and thoughtful words from Dr. Masami Fujiwara and a toast by Dr. Masaji Yoshikawa. At the end, all participants praised each other for their efforts and the three coils, the CS Model Coil Inner Module, the Outer Module and the CS Insert Coil, seemed to be smiling at them
Current Status on the Korean Test Blanket Module Development for testing in the ITER

International Nuclear Information System (INIS)

Lee, Dong Won; Kim, Suk Kwon; Bae, Young Dug; Yoon, Jae Sung; Jung, Ki Sok

2010-01-01

Korea has proposed and designed a Helium Cooled Molten Lithium (HCML) Test Blanket Module (TBM) to be tested in the International Thermonuclear Experimental Reactor (ITER). Ferritic martensitic (FM) steel is used as the structural material, and helium (He) is used as the coolant for the first wall (FW) and breeding zone. Liquid lithium (Li) is circulated for tritium breeding, not for cooling. The main purpose of developing the TBM is to develop design technology for DEMO and fusion reactors, which should be proved through experiments with the TBM in ITER. Therefore, we have developed the design scheme and related codes, including the safety analysis needed to obtain the license for testing in ITER. In order to develop the TBM and install it at ITER, several technologies were developed in parallel: fabrication, breeder, He cooling, tritium extraction and so on. Figure 1 shows the overall TBM development scheme. In Korea, the official strategy for developing the TBM is to participate in other parties' concepts, such as the US and EU ones, in which PbLi (lead lithium eutectic), He, and FM steel are used as the liquid breeder, coolant, and structural material, respectively
Large-Scale Parallel Finite Element Analysis of the Stress Singular Problems

International Nuclear Information System (INIS)

Noriyuki Kushida; Hiroshi Okuda; Genki Yagawa

2002-01-01

In this paper, the convergence behavior of the large-scale parallel finite element method for stress singular problems was investigated. The convergence behavior of iterative solvers depends on the efficiency of the preconditioners. However, the efficiency of preconditioners may be influenced by the domain decomposition that is necessary for parallel FEM. In this study the following results were obtained: as expected, the conjugate gradient method without preconditioning and the diagonal scaling preconditioned conjugate gradient method were not influenced by the domain decomposition. The symmetric successive over-relaxation preconditioned conjugate gradient method converged up to 6% faster if the stress singular area was contained in one sub-domain. (authors)
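As an illustration of the diagonal scaling preconditioned conjugate gradient method named in this abstract, the following is a minimal dense-matrix sketch in plain Python. It is not the authors' parallel FEM solver; the function names, the tolerance, and the 3 x 3 test system are assumptions. The preconditioner uses only each row's own diagonal entry, which is why its effect does not depend on how the rows are split among sub-domains.

```python
def matvec(A, x):
    """Dense matrix-vector product."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def pcg_diagonal(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b (A symmetric positive definite) by CG with diagonal scaling.

    Returns the solution and the number of iterations performed.
    """
    n = len(b)
    inv_diag = [1.0 / A[i][i] for i in range(n)]  # preconditioner M^-1 = diag(A)^-1
    x = [0.0] * n
    r = list(b)                                   # residual b - A x for x = 0
    z = [d * ri for d, ri in zip(inv_diag, r)]    # preconditioned residual
    p = list(z)
    rz = dot(r, z)
    b_norm = dot(b, b) ** 0.5 or 1.0
    for k in range(max_iter):
        if dot(r, r) ** 0.5 <= tol * b_norm:
            return x, k
        Ap = matvec(A, p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        z = [d * ri for d, ri in zip(inv_diag, r)]
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter


# Small symmetric positive definite system, chosen only for illustration.
A = [[4.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 2.0]]
b = [1.0, 2.0, 3.0]
x, iters = pcg_diagonal(A, b)
residual = [bi - yi for bi, yi in zip(b, matvec(A, x))]
print(iters, max(abs(v) for v in residual))
```

Preconditioners such as SSOR or ILU, by contrast, couple neighbouring unknowns, so a parallel version that applies them per sub-domain changes with the partition; the sketch above does not model that effect.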
Verifying large modular systems using iterative abstraction refinement

International Nuclear Information System (INIS)

Lahtinen, Jussi; Kuismin, Tuomas; Heljanko, Keijo

2015-01-01

Digital instrumentation and control (I&C) systems are increasingly used in the nuclear engineering domain. The exhaustive verification of these systems is challenging, and the usual verification methods such as testing and simulation are typically insufficient. Model checking is a formal method that is able to exhaustively analyse the behaviour of a model against a formally written specification. If the model checking tool detects a violation of the specification, it will give out a counter-example that demonstrates how the specification is violated in the system. Unfortunately, sometimes real life system designs are too big to be directly analysed by traditional model checking techniques. We have developed an iterative technique for model checking large modular systems. The technique uses abstraction based over-approximations of the model behaviour, combined with iterative refinement. The main contribution of the work is the concrete abstraction refinement technique based on the modular structure of the model, the dependency graph of the model, and a refinement sampling heuristic similar to delta debugging. The technique is geared towards proving properties, and outperforms BDD-based model checking, the k-induction technique, and the property directed reachability algorithm (PDR) in our experiments. - Highlights: • We have developed an iterative technique for model checking large modular systems. • The technique uses BDD-based model checking, k-induction, and PDR in parallel. • We have tested our algorithm by verifying two models with it. • The technique outperforms classical model checking methods in our experiments

The ITER poloidal field system

Energy Technology Data Exchange (ETDEWEB)

Wesley, J [General Atomics, San Diego, CA (USA); Beljakov, V; Kavin, A; Korshakov, V; Kostenko, A; Roshal, A; Zakharov, L [Kurchatov Inst. of Atomic Energy, Moscow (USSR); Bulmer, R; Kaiser, T; Miller, J R; Pearlstein, L D [Lawrence Livermore National Lab., CA (USA); Hogan, J [Oak Ridge National Lab., TN (USA); Kurihara, K; Shimomura, Y; Sugihara, M; Yoshino, R [Japan Atomic Energy Resea

1990-12-15

The ITER poloidal field (PF) system uses superconducting coils to provide the plasma equilibrium fields, slow equilibrium control and plasma flux linkage (V-s) needed for the ITER Operations and Research Program. Double-null (DN) divertor plasmas and operation scenarios for 22 MA Physics (high-Q/ignition) and 15 MA Technology (high-fluence testing) phases are provided. For 22 MA plasmas, total PF flux swing is 333 V-s. This provides inductive current drive (CD) for start-up with 66 V-s of resistive loss and 440-s (330-s minimum) sustained burn. The PF system also allows plasma start-up and shutdown scenarios, and can maintain the plasma configuration during burn over a range of current and pressure profiles. Other capabilities include increased plasma current (25 MA with inductive CD; 28 MA with non-inductive CD assist), divertor separatrix sweeping, and semi-DN and single-null plasmas.
Iterative Sparse Channel Estimation and Decoding for Underwater MIMO-OFDM

Directory of Open Access Journals (Sweden)

Berger ChristianR

2010-01-01

We propose a block-by-block iterative receiver for underwater MIMO-OFDM that couples channel estimation with multiple-input multiple-output (MIMO) detection and low-density parity-check (LDPC) channel decoding. In particular, the channel estimator is based on a compressive sensing technique to exploit the channel sparsity, the MIMO detector consists of a hybrid use of successive interference cancellation and soft minimum mean-square error (MMSE) equalization, and channel coding uses nonbinary LDPC codes. Various feedback strategies from the channel decoder to the channel estimator are studied, including full feedback of hard or soft symbol decisions, as well as their threshold-controlled versions. We study the receiver performance using numerical simulation and experimental data collected from the RACE08 and SPACE08 experiments. We find that iterative receiver processing including sparse channel estimation leads to impressive performance gains. These gains are more pronounced when the number of available pilots to estimate the channel is decreased, for example, when a fixed number of pilots is split between an increasing number of parallel data streams in MIMO transmission. For the various feedback strategies for iterative channel estimation, we observe that soft decision feedback slightly outperforms hard decision feedback.
Three-Dimensional Induced Polarization Parallel Inversion Using Nonlinear Conjugate Gradients Method

Directory of Open Access Journals (Sweden)

Huan Ma

2015-01-01

Four kinds of induced polarization (IP) array methods (surface, borehole-surface, surface-borehole, and borehole-borehole) are widely used in resource exploration. However, because of the large number of sources, the inversion takes a long time to complete. In this paper, a new parallel algorithm is described which uses the message passing interface (MPI) and the graphics processing unit (GPU) to accelerate 3D inversion for these four methods. The forward finite difference equation is solved by the ILU0 preconditioner and the conjugate gradient (CG) solver. The inverse problem is solved by nonlinear conjugate gradients (NLCG) iteration, which is used to calculate one forward and two “pseudo-forward” modelings and update the direction, space, and model in turn. Because each source is independent in forward and “pseudo-forward” modelings, multiprocess modes are opened by calling the MPI library. The iterative matrix solver within CULA is called in each process. Some tables and synthetic data examples illustrate that this parallel inversion algorithm is effective. Furthermore, we demonstrate that the joint inversion of surface and borehole data produces resistivity and chargeability results that are superior to those obtained from inversions of individual surface data.
STEP: Self-supporting tailored k-space estimation for parallel imaging reconstruction.

Science.gov (United States)

Zhou, Zechen; Wang, Jinnan; Balu, Niranjan; Li, Rui; Yuan, Chun

2016-02-01

A new subspace-based iterative reconstruction method, termed Self-supporting Tailored k-space Estimation for Parallel imaging reconstruction (STEP), is presented and evaluated in comparison to the existing autocalibrating method SPIRiT and calibrationless method SAKE. In STEP, two tailored schemes including k-space partition and basis selection are proposed to promote spatially variant signal subspace and incorporated into a self-supporting structured low rank model to enforce properties of locality, sparsity, and rank deficiency, which can be formulated into a constrained optimization problem and solved by an iterative algorithm. Simulated and in vivo datasets were used to investigate the performance of STEP in terms of overall image quality and detail structure preservation. The advantage of STEP on image quality is demonstrated by retrospectively undersampled multichannel Cartesian data with various patterns. Compared with SPIRiT and SAKE, STEP can provide more accurate reconstruction images with less residual aliasing artifacts and reduced noise amplification in simulation and in vivo experiments. In addition, STEP has the capability of combining compressed sensing with arbitrary sampling trajectory. Using k-space partition and basis selection can further improve the performance of parallel imaging reconstruction with or without calibration signals. © 2015 Wiley Periodicals, Inc.
ParaHaplo 3.0: A program package for imputation and a haplotype-based whole-genome association study using hybrid parallel computing

Directory of Open Access Journals (Sweden)

Kamatani Naoyuki

2011-05-01

Background: Use of missing genotype imputations and haplotype reconstructions is valuable in genome-wide association studies (GWASs). By modeling the patterns of linkage disequilibrium in a reference panel, genotypes not directly measured in the study samples can be imputed and used for GWASs. Since millions of single nucleotide polymorphisms need to be imputed in a GWAS, faster methods for genotype imputation and haplotype reconstruction are required. Results: We developed a program package for parallel computation of genotype imputation and haplotype reconstruction. Our program package, ParaHaplo 3.0, is intended for use in workstation clusters using the Intel Message Passing Interface. We compared the performance of ParaHaplo 3.0 on the Japanese in Tokyo, Japan, and Han Chinese in Beijing, China, populations of the HapMap dataset. A parallel version of ParaHaplo 3.0 can conduct genotype imputation 20 times faster than a non-parallel version of ParaHaplo. Conclusions: ParaHaplo 3.0 is an invaluable tool for conducting haplotype-based GWASs. The need for faster genotype imputation and haplotype reconstruction using parallel computing will become increasingly important as the data sizes of such projects continue to increase. ParaHaplo executable binaries and program sources are available at http://en.sourceforge.jp/projects/parallelgwas/releases/.
Positron emission tomographic images and expectation maximization: A VLSI architecture for multiple iterations per second

International Nuclear Information System (INIS)

Jones, W.F.; Byars, L.G.; Casey, M.E.

1988-01-01

A digital electronic architecture for parallel processing of the expectation maximization (EM) algorithm for Positron Emission tomography (PET) image reconstruction is proposed. Rapid (0.2 second) EM iterations on high resolution (256 x 256) images are supported. Arrays of two very large scale integration (VLSI) chips perform forward and back projection calculations. A description of the architecture is given, including data flow and partitioning relevant to EM and parallel processing. EM images shown are produced with software simulating the proposed hardware reconstruction algorithm. Projected cost of the system is estimated to be small in comparison to the cost of current PET scanners
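The forward and back projection steps that the proposed hardware performs are the two halves of each EM update. The sketch below shows the standard maximum-likelihood EM (MLEM) iteration in plain Python on a toy problem; it does not model the VLSI architecture, and the system matrix, measurement vector, and function names are assumptions made for illustration.

```python
def forward_project(A, x):
    """Expected counts in each detector bin for image x."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def back_project(A, v):
    """Distribute per-bin values v back onto the image pixels."""
    n_pix = len(A[0])
    return [sum(A[i][j] * v[i] for i in range(len(A))) for j in range(n_pix)]


def mlem(A, y, n_iter=50):
    """Standard MLEM update: x <- x / s * A^T (y / (A x)), with s = A^T 1."""
    sens = back_project(A, [1.0] * len(A))  # sensitivity image s
    x = [1.0] * len(A[0])                   # uniform positive start image
    for _ in range(n_iter):
        proj = forward_project(A, x)
        ratio = [yi / pi if pi > 0 else 0.0 for yi, pi in zip(y, proj)]
        corr = back_project(A, ratio)
        x = [xj * cj / sj if sj > 0 else 0.0
             for xj, cj, sj in zip(x, corr, sens)]
    return x


# Toy system: 2 pixels seen by 3 detector bins (hypothetical geometry).
A = [[1.0, 0.0],
     [0.0, 1.0],
     [1.0, 1.0]]
y = [2.0, 3.0, 5.0]  # noise-free data generated from the image [2, 3]
print(mlem(A, y))
```

Each iteration costs one forward projection and one back projection, which is why an architecture that accelerates those two operations determines the achievable iteration rate.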
SPSS and SAS programs for determining the number of components using parallel analysis and Velicer's MAP test.

Science.gov (United States)

O'Connor, B P

2000-08-01

Popular statistical software packages do not have the proper procedures for determining the number of components in factor and principal components analyses. Parallel analysis and Velicer's minimum average partial (MAP) test are validated procedures, recommended widely by statisticians. However, many researchers continue to use alternative, simpler, but flawed procedures, such as the eigenvalues-greater-than-one rule. Use of the proper procedures might be increased if these procedures could be conducted within familiar software environments. This paper describes brief and efficient programs for using SPSS and SAS to conduct parallel analyses and the MAP test.
Enhancing Application Performance Using Mini-Apps: Comparison of Hybrid Parallel Programming Paradigms

Science.gov (United States)

Lawson, Gary; Sosonkina, Masha; Baurle, Robert; Hammond, Dana

2017-01-01

In many fields, real-world applications for High Performance Computing have already been developed. For these applications to stay up-to-date, new parallel strategies must be explored to yield the best performance; however, restructuring or modifying a real-world application may be daunting depending on the size of the code. In this case, a mini-app may be employed to quickly explore such options without modifying the entire code. In this work, several mini-apps have been created to enhance a real-world application performance, namely the VULCAN code for complex flow analysis developed at the NASA Langley Research Center. These mini-apps explore hybrid parallel programming paradigms with Message Passing Interface (MPI) for distributed memory access and either Shared MPI (SMPI) or OpenMP for shared memory accesses. Performance testing shows that MPI+SMPI yields the best execution performance, while requiring the largest number of code changes. A maximum speedup of 23 was measured for MPI+SMPI, but only 11 was measured for MPI+OpenMP.
Parallel Computing Characteristics of Two-Phase Thermal-Hydraulics code, CUPID

International Nuclear Information System (INIS)

Lee, Jae Ryong; Yoon, Han Young

2013-01-01

The parallelized CUPID code has been shown to reproduce multi-dimensional thermal-hydraulic analyses through validation against various conceptual problems and experimental data. In this paper, the characteristics of the parallelized CUPID code are investigated. Both single-phase and two-phase simulations are taken into account. Since the scalability of a parallel simulation is known to be better for fine mesh systems, two types of mesh system are considered. In addition, the dependence on the preconditioner of the matrix solver is compared. The scalability for single-phase flow is better than that for two-phase flow because fewer iterations are needed to solve the pressure matrix. The parallel performance of the CUPID code was investigated in terms of scalability. The code was parallelized with the domain decomposition method, and the MPI library was adopted to communicate information at the interface cells. Scalability improves as the number of mesh cells increases. For a given mesh, the single-phase flow simulation with the diagonal preconditioner shows the best speedup. However, for the two-phase flow simulation, the ILU preconditioner is recommended since it reduces the overall simulation time.
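The interface-cell communication described in this abstract can be illustrated with a small sketch. It is not the CUPID implementation: it assumes a 1-D Jacobi iteration split into two subdomains, and a plain Python assignment stands in for the MPI send/receive. The names `jacobi_step` and `decomposed_jacobi` are hypothetical.

```python
def jacobi_step(u):
    """One Jacobi sweep on a 1-D array; the first and last entries are kept fixed."""
    interior = [0.5 * (u[k - 1] + u[k + 1]) for k in range(1, len(u) - 1)]
    return [u[0]] + interior + [u[-1]]


def decomposed_jacobi(u, steps):
    """Jacobi iteration on two subdomains that exchange one ghost cell per step."""
    mid = len(u) // 2
    left = u[:mid + 1]    # owns cells 0..mid-1, plus a ghost copy of cell mid
    right = u[mid - 1:]   # owns cells mid..end, plus a ghost copy of cell mid-1
    for _ in range(steps):
        # Each subdomain updates its own cells independently (the parallel part).
        left, right = jacobi_step(left), jacobi_step(right)
        # Stand-in for MPI communication: refresh each ghost cell from the
        # interface cell owned by the neighbouring subdomain.
        left[-1] = right[1]
        right[0] = left[-2]
    return left[:-1] + right[1:]


if __name__ == "__main__":
    start = [0.0] * 9 + [1.0]
    print(decomposed_jacobi(start, 50))
```

Because only the ghost cells are exchanged, the communication volume per step depends on the interface size while the work depends on the subdomain size, which is consistent with the abstract's observation that finer meshes scale better.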
Adapting algorithms to massively parallel hardware

CERN Document Server

Sioulas, Panagiotis

2016-01-01

In recent years, the trend in computing has shifted from delivering processors with faster clock speeds to increasing the number of cores per processor. This marks a paradigm shift towards parallel programming, in which applications are programmed to exploit the power provided by multiple cores. Usually there is a gain in terms of the time-to-solution and the memory footprint. Specifically, this trend has sparked an interest in massively parallel systems that can provide a large number of processors, and possibly computing nodes, as in GPUs and MPPAs (Massively Parallel Processor Arrays). In this project, the focus was on two distinct computing problems: k-d tree searches and track seeding cellular automata. The goal was to adapt the algorithms to parallel systems and evaluate their performance in different cases.
Intelligent controller of a flexible hybrid robot machine for ITER assembly and maintenance

International Nuclear Information System (INIS)

Al-saedi, Mazin I.; Wu, Huapeng; Handroos, Heikki

2014-01-01

Highlights: • Studying the flexible multibody dynamics of a hybrid parallel robot. • Investigating a fuzzy-PD controller to control a hybrid flexible hydraulically driven robot. • Investigating an ANFIS-PD controller to control a hybrid flexible robot. Compared to traditional PID, this method gives better performance. • Using the equilibrium of reaction forces between the parallel and serial parts of the hybrid robot to control the hydraulically driven serial part. - Abstract: The assembly and maintenance of the International Thermonuclear Experimental Reactor (ITER) vacuum vessel (VV) is highly challenging since the tasks performed by the robot involve welding, material handling, and machine cutting from inside the VV. To fulfill the tasks in the ITER application, this paper presents a hybrid redundant manipulator with four DOFs provided by serial kinematic axes and six DOFs by a parallel mechanism. Thus, in machining, to achieve greater end-effector trajectory tracking accuracy for surface quality, a robust control of the actuators for the flexible link has to be deduced. In this paper, the intelligent control of the hydraulically driven parallel robot part, based on the dynamic model, has been investigated with two control schemes: (1) a fuzzy-PID self-tuning controller composed of conventional PID control and fuzzy logic; (2) adaptive neuro-fuzzy inference system-PID (ANFIS-PID) self-tuning of the gains of the PID controller. Both are implemented independently to control each hydraulic cylinder of the parallel robot based on rod position predictions. The results show that the fuzzy-PID and ANFIS-PID self-tuning controllers reduce tracking errors more than the conventional PID controller. Subsequently, the serial component of the hybrid robot can be analyzed using the equilibrium of reaction forces at the universal joint connections of the hexa-element. To achieve precise positional control of the end effector for maximum precision machining, the hydraulic cylinder should
Intelligent controller of a flexible hybrid robot machine for ITER assembly and maintenance

Energy Technology Data Exchange (ETDEWEB)

Al-saedi, Mazin I., E-mail: mazin.al-saedi@lut.fi; Wu, Huapeng; Handroos, Heikki

2014-10-15

Highlights: • Studying the flexible multibody dynamics of a hybrid parallel robot. • Investigating a fuzzy-PD controller to control a hybrid flexible hydraulically driven robot. • Investigating an ANFIS-PD controller to control a hybrid flexible robot. Compared to traditional PID, this method gives better performance. • Using the equilibrium of reaction forces between the parallel and serial parts of the hybrid robot to control the hydraulically driven serial part. - Abstract: The assembly and maintenance of the International Thermonuclear Experimental Reactor (ITER) vacuum vessel (VV) is highly challenging since the tasks performed by the robot involve welding, material handling, and machine cutting from inside the VV. To fulfill the tasks in the ITER application, this paper presents a hybrid redundant manipulator with four DOFs provided by serial kinematic axes and six DOFs by a parallel mechanism. Thus, in machining, to achieve greater end-effector trajectory tracking accuracy for surface quality, a robust control of the actuators for the flexible link has to be deduced. In this paper, the intelligent control of the hydraulically driven parallel robot part, based on the dynamic model, has been investigated with two control schemes: (1) a fuzzy-PID self-tuning controller composed of conventional PID control and fuzzy logic; (2) adaptive neuro-fuzzy inference system-PID (ANFIS-PID) self-tuning of the gains of the PID controller. Both are implemented independently to control each hydraulic cylinder of the parallel robot based on rod position predictions. The results show that the fuzzy-PID and ANFIS-PID self-tuning controllers reduce tracking errors more than the conventional PID controller. Subsequently, the serial component of the hybrid robot can be analyzed using the equilibrium of reaction forces at the universal joint connections of the hexa-element. To achieve precise positional control of the end effector for maximum precision machining, the hydraulic cylinder should
Parallel alternating direction preconditioner for isogeometric simulations of explicit dynamics

KAUST Repository

Łoś, Marcin

2015-04-27

In this paper we present a parallel implementation of the alternating direction preconditioner for isogeometric simulations of explicit dynamics. The Alternating Direction Implicit (ADI) algorithm, which belongs to the category of matrix-splitting iterative methods, was proposed almost six decades ago for solving parabolic and elliptic partial differential equations, see [1–4]. A new version of this algorithm has recently been developed for isogeometric simulations of two-dimensional explicit dynamics [5] and steady-state diffusion equations with orthotropic heterogeneous coefficients [6]. In this paper we present a parallel version of the alternating direction implicit algorithm for three-dimensional simulations. The algorithm has been incorporated as a part of PETIGA, an isogeometric framework [7] built on top of PETSc [8]. We show the scalability of the parallel algorithm on the STAMPEDE Linux cluster up to 10,000 processors, as well as the convergence rate of the PCG solver with the ADI algorithm as preconditioner.
Fuel cycle design for ITER and its extrapolation to DEMO

International Nuclear Information System (INIS)

Konishi, Satoshi; Glugla, Manfred; Hayashi, Takumi

2008-01-01

future energy source. Some of the subjects cannot be expected to be within the extrapolation of ITER technology and require long term efforts paralleling ITER
Fuel cycle design for ITER and its extrapolation to DEMO

Energy Technology Data Exchange (ETDEWEB)

Konishi, Satoshi [Institute of Advanced Energy, Kyoto University, Kyoto 611-0011 (Japan)], E-mail: s-konishi@iae.kyoto-u.ac.jp; Glugla, Manfred [Forschungszentrum Karlsruhe, P.O. Box 3640, D 76021 Karlsruhe (Germany)]; Hayashi, Takumi [Japan Atomic Energy Agency, Tokai, Ibaraki 319-0015 (Japan)]

2008-12-15

future energy source. Some of the subjects cannot be expected to be within the extrapolation of ITER technology and require long term efforts paralleling ITER.
Design strategies for irregularly adapting parallel applications

International Nuclear Information System (INIS)

Oliker, Leonid; Biswas, Rupak; Shan, Hongzhang; Singh, Jaswinder Pal

2000-01-01

Achieving scalable performance for dynamic irregular applications is eminently challenging. Traditional message-passing approaches have been making steady progress towards this goal; however, they suffer from complex implementation requirements. The use of a global address space greatly simplifies the programming task, but can degrade the performance of dynamically adapting computations. In this work, we examine two major classes of adaptive applications, under five competing programming methodologies and four leading parallel architectures. Results indicate that it is possible to achieve message-passing performance using shared-memory programming techniques by carefully following the same high level strategies. Adaptive applications have computational work loads and communication patterns which change unpredictably at runtime, requiring dynamic load balancing to achieve scalable performance on parallel machines. Efficient parallel implementations of such adaptive applications are therefore a challenging task. This work examines the implementation of two typical adaptive applications, Dynamic Remeshing and N-Body, across various programming paradigms and architectural platforms. We compare several critical factors of the parallel code development, including performance, programmability, scalability, algorithmic development, and portability
A PARALLEL NONOVERLAPPING DOMAIN DECOMPOSITION METHOD FOR STOKES PROBLEMS

Institute of Scientific and Technical Information of China (English)

Mei-qun Jiang; Pei-liang Dai

2006-01-01

A nonoverlapping domain decomposition iterative procedure is developed and analyzed for generalized Stokes problems and their finite element approximations in R^N (N = 2, 3). The method is based on a mixed-type consistency condition with two parameters as a transmission condition, together with a derivative-free technique for updating the transmission data on the artificial interfaces. The method can be applied to a general multi-subdomain decomposition and implemented naturally on parallel machines using only simple local communication.
Vector-Parallel processing of the successive overrelaxation method

International Nuclear Information System (INIS)

Yokokawa, Mitsuo

1988-02-01

The successive overrelaxation (SOR) method is an iterative method for solving linear systems of equations, and in many nuclear codes it has been computed serially with a natural ordering. Since the appearance of vector processors, this natural SOR method has been replaced by parallel algorithms such as the hyperplane or red-black method, in which the calculation order is modified. These methods are suitable for vector processors and run faster on them than the natural SOR method. In this report, a new scheme named the 4-color SOR method is proposed. We find that the 4-color SOR method can be executed on vector-parallel processors and that it is the fastest of all the SOR methods, according to results of vector-parallel execution on the Alliant FX/8 multiprocessor system. It is also shown that the theoretical optimal acceleration parameters are equal among five SOR methods with different orderings, and the differences between the convergence rates of these SOR methods are examined. (author)
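The red-black reordering mentioned in this abstract can be sketched as follows. This is a minimal illustration of the two-color scheme only, assuming the 2-D Laplace equation on a square grid with a 5-point stencil; it is not the report's 4-color method, whose details the abstract does not give. The names `red_black_sor` and `make_grid` and the value `omega=1.5` are illustrative assumptions.

```python
def red_black_sor(u, omega=1.5, sweeps=200):
    """In-place red-black SOR for the 2-D Laplace equation (5-point stencil).

    u is a list of lists; boundary entries hold Dirichlet values and are
    never modified. omega is the relaxation (acceleration) parameter.
    """
    n_rows, n_cols = len(u), len(u[0])
    for _ in range(sweeps):
        for colour in (0, 1):  # 0 = "red" points, 1 = "black" points
            for i in range(1, n_rows - 1):
                # First interior column j with (i + j) % 2 == colour.
                start = 1 + (i + 1 + colour) % 2
                for j in range(start, n_cols - 1, 2):
                    average = 0.25 * (u[i - 1][j] + u[i + 1][j]
                                      + u[i][j - 1] + u[i][j + 1])
                    u[i][j] += omega * (average - u[i][j])
    return u


def make_grid(n):
    """n x n grid: boundary values u = i + j, interior initialised to zero."""
    return [[float(i + j) if i in (0, n - 1) or j in (0, n - 1) else 0.0
             for j in range(n)] for i in range(n)]


if __name__ == "__main__":
    solution = red_black_sor(make_grid(8))
    print(solution[3][4])  # interior value; the exact solution here is i + j
```

Every point of one color has neighbours only of the other color, so all updates within a color are independent of each other. That independence is what makes the inner loops candidates for vector or parallel execution.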
Implementation and performance of parallelized elegant

International Nuclear Information System (INIS)

Wang, Y.; Borland, M.

2008-01-01

The program elegant is widely used for design and modeling of linacs for free-electron lasers and energy recovery linacs, as well as storage rings and other applications. As part of a multi-year effort, we have parallelized many aspects of the code, including single-particle dynamics, wakefields, and coherent synchrotron radiation. We report on the approach used for gradual parallelization, which proved very beneficial in getting parallel features into the hands of users quickly. We also report details of parallelization of collective effects. Finally, we discuss performance of the parallelized code in various applications.
Structural analysis of the ITER Vacuum Vessel regarding 2012 ITER Project-Level Loads

Energy Technology Data Exchange (ETDEWEB)

Martinez, J.-M., E-mail: jean-marc.martinez@live.fr [ITER Organization, Route de Vinon sur Verdon, 13115 St Paul lez Durance (France)]; Jun, C.H.; Portafaix, C.; Choi, C.-H.; Ioki, K.; Sannazzaro, G.; Sborchia, C. [ITER Organization, Route de Vinon sur Verdon, 13115 St Paul lez Durance (France)]; Cambazar, M.; Corti, Ph.; Pinori, K.; Sfarni, S.; Tailhardat, O. [Assystem EOS, 117 rue Jacquard, L'Atrium, 84120 Pertuis (France)]; Borrelly, S. [Sogeti High Tech, RE2, 180 rue René Descartes, Le Millenium – Bat C, 13857 Aix en Provence (France)]; Albin, V.; Pelletier, N. [SOM Calcul – Groupe ORTEC, 121 ancien Chemin de Cassis – Immeuble Grand Pré, 13009 Marseille (France)]

2014-10-15

Highlights: • ITER Vacuum Vessel is a part of the first barrier to confine the plasma. • ITER Vacuum Vessel as Nuclear Pressure Equipment (NPE) necessitates a third party organization authorized by the French nuclear regulator to assure design, fabrication, conformance testing and quality assurance, i.e. Agreed Notified Body (ANB). • A revision of the ITER Project-Level Load Specification was implemented in April 2012. • ITER Vacuum Vessel Loads (seismic, pressure, thermal and electromagnetic loads) were summarized. • ITER Vacuum Vessel Structural Margins with regards to RCC-MR code were summarized. - Abstract: A revision of the ITER Project-Level Load Specification (to be used for all systems of the ITER machine) was implemented in April 2012. This revision supports ITER's licensing by accommodating requests from the French regulator to maintain consistency with the plasma physics database and our present understanding of plasma transients and electro-magnetic (EM) loads, to investigate the possibility of removing unnecessary conservatism in the load requirements and to review the list and definition of incidental cases. The purpose of this paper is to present the impact of this 2012 revision of the ITER Project-Level Load Specification (LS) on the ITER Vacuum Vessel (VV) loads and the main structural margins required by the applicable French code, RCC-MR.
