model parallel models: Topics by WorldWideScience.org

Sample records for model parallel models

Parallel phase model : a programming model for high-end parallel machines with manycores.

Energy Technology Data Exchange (ETDEWEB)

Wu, Junfeng (Syracuse University, Syracuse, NY); Wen, Zhaofang; Heroux, Michael Allen; Brightwell, Ronald Brian

2009-04-01

This paper presents a parallel programming model, Parallel Phase Model (PPM), for next-generation high-end parallel machines based on a distributed memory architecture consisting of a networked cluster of nodes with a large number of cores on each node. PPM has a unified high-level programming abstraction that facilitates the design and implementation of parallel algorithms to exploit both the parallelism of the many cores and the parallelism at the cluster level. The programming abstraction will be suitable for expressing both fine-grained and coarse-grained parallelism. It includes a few high-level parallel programming language constructs that can be added as an extension to an existing (sequential or parallel) programming language such as C; and the implementation of PPM also includes a light-weight runtime library that runs on top of an existing network communication software layer (e.g. MPI). Design philosophy of PPM and details of the programming abstraction are also presented. Several unstructured applications that inherently require high-volume random fine-grained data accesses have been implemented in PPM with very promising results.
Parallel Boltzmann machines : a mathematical model

NARCIS (Netherlands)

Zwietering, P.J.; Aarts, E.H.L.

1991-01-01

A mathematical model is presented for the description of parallel Boltzmann machines. The framework is based on the theory of Markov chains and combines a number of previously known results into one generic model. It is argued that parallel Boltzmann machines maximize a function consisting of a
Models of parallel computation: a survey and classification

Institute of Scientific and Technical Information of China (English)

ZHANG Yunquan; CHEN Guoliang; SUN Guangzhong; MIAO Qiankun

2007-01-01

In this paper, the state-of-the-art parallel computational model research is reviewed. We will introduce various models that were developed during the past decades. According to their target architecture features, especially memory organization, we classify these parallel computational models into three generations. These models and their characteristics are discussed based on this three-generation classification. We believe that with the ever increasing speed gap between the CPU and memory systems, incorporating non-uniform memory hierarchy into computational models will become unavoidable. With the emergence of multi-core CPUs, the parallelism hierarchy of current computing platforms becomes more and more complicated. Describing this complicated parallelism hierarchy in future computational models becomes more and more important. A semi-automatic toolkit that can extract model parameters and their values on real computers can reduce the model analysis complexity, thus allowing more complicated models with more parameters to be adopted. Hierarchical memory and hierarchical parallelism will be two very important features that should be considered in future model design and research.
Multitasking TORT Under UNICOS: Parallel Performance Models and Measurements

International Nuclear Information System (INIS)

Azmy, Y.Y.; Barnett, D.A.

1999-01-01

The existing parallel algorithms in the TORT discrete ordinates code were updated to function in a UNICOS environment. A performance model for the parallel overhead was derived for the existing algorithms. The largest contributors to the parallel overhead were identified and a new algorithm was developed. A parallel overhead model was also derived for the new algorithm. The parallel performance models were compared to applications of the code to two TORT standard test problems and a large production problem. The parallel performance models agree well with the measured parallel overhead.
Multitasking TORT under UNICOS: Parallel performance models and measurements

International Nuclear Information System (INIS)

Barnett, A.; Azmy, Y.Y.

1999-01-01

The existing parallel algorithms in the TORT discrete ordinates code were updated to function in a UNICOS environment. A performance model for the parallel overhead was derived for the existing algorithms. The largest contributors to the parallel overhead were identified and a new algorithm was developed. A parallel overhead model was also derived for the new algorithm. The parallel performance models were compared to applications of the code to two TORT standard test problems and a large production problem. The parallel performance models agree well with the measured parallel overhead.
Cellular automata: a parallel model

CERN Document Server

Mazoyer, J

1999-01-01

Cellular automata can be viewed both as computational models and modelling systems of real processes. This volume emphasises the first aspect. In articles written by leading researchers, sophisticated massive parallel algorithms (firing squad, life, Fischer's primes recognition) are treated. Their computational power and the specific complexity classes they determine are surveyed, while some recent results in relation to chaos from a new dynamic systems point of view are also presented. Audience: This book will be of interest to specialists of theoretical computer science and the parallelism challenge.
Towards a streaming model for nested data parallelism

DEFF Research Database (Denmark)

Madsen, Frederik Meisner; Filinski, Andrzej

2013-01-01

The language-integrated cost semantics for nested data parallelism pioneered by NESL provides an intuitive, high-level model for predicting performance and scalability of parallel algorithms with reasonable accuracy. However, this predictability, obtained through a uniform, parallelism-flattening... -processable in a streaming fashion. This semantics is directly compatible with previously proposed piecewise execution models for nested data parallelism, but allows the expected space usage to be reasoned about directly at the source-language level. The language definition and implementation are still very much work...
Iteration schemes for parallelizing models of superconductivity

Energy Technology Data Exchange (ETDEWEB)

Gray, P.A. [Michigan State Univ., East Lansing, MI (United States)]

1996-12-31

The time dependent Lawrence-Doniach model, valid for high fields and high values of the Ginzburg-Landau parameter, is often used for studying vortex dynamics in layered high-Tc superconductors. When solving these equations numerically, the added degrees of complexity due to the coupling and nonlinearity of the model often warrant the use of high-performance computers for their solution. However, the interdependence between the layers can be manipulated so as to allow parallelization of the computations at an individual layer level. The reduced parallel tasks may then be solved independently using a heterogeneous cluster of networked workstations connected together with Parallel Virtual Machine (PVM) software. Here, this parallelization of the model is discussed and several computational implementations of varying degrees of parallelism are presented. Computational results are also given which contrast properties of convergence speed, stability, and consistency of these implementations. Included in these results are models involving the motion of vortices due to an applied current and pinning effects due to various material properties.
Optimisation of a parallel ocean general circulation model

OpenAIRE

M. I. Beare; D. P. Stevens

1997-01-01

This paper presents the development of a general-purpose parallel ocean circulation model, for use on a wide range of computer platforms, from traditional scalar machines to workstation clusters and massively parallel processors. Parallelism is provided, as a modular option, via high-level message-passing routines, thus hiding the technical intricacies from the user. An initial implementation highlights that the parallel efficiency of the model is adversely affected by...
Research on Multi-Person Parallel Modeling Method Based on Integrated Model Persistent Storage

Science.gov (United States)

Qu, MingCheng; Wu, XiangHu; Tao, YongChao; Liu, Ying

2018-03-01

This paper mainly studies a multi-person parallel modeling method based on integrated model persistent storage. The integrated model refers to a set of MDDT modeling graphics system, which can carry out multi-angle, multi-level and multi-stage description of aerospace general embedded software. Persistent storage refers to converting the data model in memory into a storage model and converting the storage model back into a data model in memory, where the data model is the object model and the storage model is a binary stream. Multi-person parallel modeling refers to the need for multi-person collaboration, separation of roles, and even real-time remote synchronized modeling.
Intelligent spatial ecosystem modeling using parallel processors

International Nuclear Information System (INIS)

Maxwell, T.; Costanza, R.

1993-01-01

Spatial modeling of ecosystems is essential if one's modeling goals include developing a relatively realistic description of past behavior and predictions of the impacts of alternative management policies on future ecosystem behavior. Development of these models has been limited in the past by the large amount of input data required and the difficulty of even large mainframe serial computers in dealing with large spatial arrays. These two limitations have begun to erode with the increasing availability of remote sensing data and GIS systems to manipulate it, and the development of parallel computer systems which allow computation of large, complex, spatial arrays. Although many forms of dynamic spatial modeling are highly amenable to parallel processing, the primary focus in this project is on process-based landscape models. These models simulate spatial structure by first compartmentalizing the landscape into some geometric design and then describing flows within compartments and spatial processes between compartments according to location-specific algorithms. The authors are currently building and running parallel spatial models at the regional scale for the Patuxent River region in Maryland, the Everglades in Florida, and Barataria Basin in Louisiana. The authors are also planning a project to construct a series of spatially explicit linked ecological and economic simulation models aimed at assessing the long-term potential impacts of global climate change
PDDP, A Data Parallel Programming Model

Directory of Open Access Journals (Sweden)

Karen H. Warren

1996-01-01

PDDP, the parallel data distribution preprocessor, is a data parallel programming model for distributed memory parallel computers. PDDP implements High Performance Fortran-compatible data distribution directives and parallelism expressed by the use of Fortran 90 array syntax, the FORALL statement, and the WHERE construct. Distributed data objects belong to a global name space; other data objects are treated as local and replicated on each processor. PDDP allows the user to program in a shared memory style and generates codes that are portable to a variety of parallel machines. For interprocessor communication, PDDP uses the fastest communication primitives on each platform.
Structured building model reduction toward parallel simulation

Energy Technology Data Exchange (ETDEWEB)

Dobbs, Justin R. [Cornell University; Hencey, Brondon M. [Cornell University

2013-08-26

Building energy model reduction exchanges accuracy for improved simulation speed by reducing the number of dynamical equations. Parallel computing aims to improve simulation times without loss of accuracy but is poorly utilized by contemporary simulators and is inherently limited by inter-processor communication. This paper bridges these disparate techniques to implement efficient parallel building thermal simulation. We begin with a survey of three structured reduction approaches that compares their performance to a leading unstructured method. We then use structured model reduction to find thermal clusters in the building energy model and allocate processing resources. Experimental results demonstrate faster simulation and low error without any interprocessor communication.
Optimisation of a parallel ocean general circulation model

Science.gov (United States)

Beare, M. I.; Stevens, D. P.

1997-10-01

This paper presents the development of a general-purpose parallel ocean circulation model, for use on a wide range of computer platforms, from traditional scalar machines to workstation clusters and massively parallel processors. Parallelism is provided, as a modular option, via high-level message-passing routines, thus hiding the technical intricacies from the user. An initial implementation highlights that the parallel efficiency of the model is adversely affected by a number of factors, for which optimisations are discussed and implemented. The resulting ocean code is portable and, in particular, allows science to be achieved on local workstations that could otherwise only be undertaken on state-of-the-art supercomputers.
The Potsdam Parallel Ice Sheet Model (PISM-PIK) - Part 1: Model description

Science.gov (United States)

Winkelmann, R.; Martin, M. A.; Haseloff, M.; Albrecht, T.; Bueler, E.; Khroulev, C.; Levermann, A.

2011-09-01

We present the Potsdam Parallel Ice Sheet Model (PISM-PIK), developed at the Potsdam Institute for Climate Impact Research to be used for simulations of large-scale ice sheet-shelf systems. It is derived from the Parallel Ice Sheet Model (Bueler and Brown, 2009). Velocities are calculated by superposition of two shallow stress balance approximations within the entire ice covered region: the shallow ice approximation (SIA) is dominant in grounded regions and accounts for shear deformation parallel to the geoid. The plug-flow type shallow shelf approximation (SSA) dominates the velocity field in ice shelf regions and serves as a basal sliding velocity in grounded regions. Ice streams can be identified diagnostically as regions with a significant contribution of membrane stresses to the local momentum balance. All lateral boundaries in PISM-PIK are free to evolve, including the grounding line and ice fronts. Ice shelf margins in particular are modeled using Neumann boundary conditions for the SSA equations, reflecting a hydrostatic stress imbalance along the vertical calving face. The ice front position is modeled using a subgrid-scale representation of calving front motion (Albrecht et al., 2011) and a physically-motivated calving law based on horizontal spreading rates. The model is tested in experiments from the Marine Ice Sheet Model Intercomparison Project (MISMIP). A dynamic equilibrium simulation of Antarctica under present-day conditions is presented in Martin et al. (2011).
Optimisation of a parallel ocean general circulation model

Directory of Open Access Journals (Sweden)

M. I. Beare

1997-10-01

This paper presents the development of a general-purpose parallel ocean circulation model, for use on a wide range of computer platforms, from traditional scalar machines to workstation clusters and massively parallel processors. Parallelism is provided, as a modular option, via high-level message-passing routines, thus hiding the technical intricacies from the user. An initial implementation highlights that the parallel efficiency of the model is adversely affected by a number of factors, for which optimisations are discussed and implemented. The resulting ocean code is portable and, in particular, allows science to be achieved on local workstations that could otherwise only be undertaken on state-of-the-art supercomputers.
A new parallelization algorithm of ocean model with explicit scheme

Science.gov (United States)

Fu, X. D.

2017-08-01

This paper focuses on the parallelization of ocean models with an explicit scheme, one of the most commonly used schemes for discretizing the governing equations of an ocean model. An explicit scheme is simple to compute: the value at a given grid point depends only on grid-point values at the previous time step, so no sparse linear system has to be solved when solving the governing equations. Exploiting this property, the paper designs a parallel algorithm named halo cells update, which parallelizes an ocean model by adding a transmission module between sub-domains and requires only tiny modifications to the original model and little change to its space step and time step. The algorithm is implemented for GRGO (Global Reduced Gravity Ocean model) as an example. The results demonstrate that higher speedup can be achieved at different problem sizes.
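The halo-update idea described in this abstract can be illustrated with a minimal sketch. It is not the GRGO code: it assumes a hypothetical 1D explicit diffusion stencil with fixed boundary values, and the names `serial_step`, `split`, `halo_update` and `decomposed_run` are invented for illustration. Each sub-domain carries one extra halo cell per side, halos are refreshed from the neighbouring sub-domain before every time step, and the sub-domain updates are then independent of each other.

```python
def serial_step(u, r):
    """One explicit time step; the two end cells are left unchanged."""
    new = u[:]
    for i in range(1, len(u) - 1):
        new[i] = u[i] + r * (u[i - 1] - 2.0 * u[i] + u[i + 1])
    return new


def split(u, parts):
    """Split the updated cells into sub-domains, each padded with one halo cell per side."""
    n = len(u)
    size = (n - 2) // parts  # cells 1..n-2 are updated; 0 and n-1 are fixed boundaries
    subs = []
    for p in range(parts):
        lo = 1 + p * size
        hi = n - 1 if p == parts - 1 else lo + size
        subs.append(u[lo - 1:hi + 1])
    return subs


def halo_update(subs):
    """Copy edge values between neighbouring sub-domains (the 'transmission' step)."""
    for p in range(len(subs) - 1):
        left, right = subs[p], subs[p + 1]
        left[-1] = right[1]   # right halo of left sub-domain <- first owned cell of right
        right[0] = left[-2]   # left halo of right sub-domain <- last owned cell of left


def decomposed_run(u, r, steps, parts):
    """Run `steps` time steps on `parts` sub-domains and gather the result."""
    subs = split(u, parts)
    for _ in range(steps):
        halo_update(subs)
        # Each sub-domain update is independent and could run on its own worker.
        subs = [serial_step(s, r) for s in subs]
    result = [u[0]]
    for s in subs:
        result.extend(s[1:-1])  # keep owned cells only, drop halos
    result.append(u[-1])
    return result
```

Because the stencil only reads values from the previous time step, the decomposed run reproduces the serial result as long as the halos are refreshed before every step.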
Performance Tuning and Evaluation of a Parallel Community Climate Model

Energy Technology Data Exchange (ETDEWEB)

Drake, J.B.; Worley, P.H.; Hammond, S.

1999-11-13

The Parallel Community Climate Model (PCCM) is a message-passing parallelization of version 2.1 of the Community Climate Model (CCM) developed by researchers at Argonne and Oak Ridge National Laboratories and at the National Center for Atmospheric Research in the early to mid 1990s. In preparation for use in the Department of Energy's Parallel Climate Model (PCM), PCCM has recently been updated with new physics routines from version 3.2 of the CCM, improvements to the parallel implementation, and ports to the SGI/Cray Research T3E and Origin 2000. We describe our experience in porting and tuning PCCM on these new platforms, evaluating the performance of different parallel algorithm options and comparing performance between the T3E and Origin 2000.
Shared Variable Oriented Parallel Precompiler for SPMD Model

Institute of Scientific and Technical Information of China (English)

(none)

1995-01-01

At present, commercial parallel computer systems with distributed memory architecture are usually provided with parallel FORTRAN or parallel C compilers, which are just traditional sequential FORTRAN or C compilers expanded with communication statements. Programmers suffer from writing parallel programs with communication statements. The Shared Variable Oriented Parallel Precompiler (SVOPP) proposed in this paper can automatically generate appropriate communication statements based on shared variables for the SPMD (Single Program Multiple Data) computation model and greatly ease parallel programming with high communication efficiency. The core function of the parallel C precompiler has been successfully verified on a transputer-based parallel computer. Its prominent performance shows that SVOPP is probably a breakthrough in parallel programming technique.

The Potsdam Parallel Ice Sheet Model (PISM-PIK) – Part 1: Model description

Directory of Open Access Journals (Sweden)

R. Winkelmann

2011-09-01

We present the Potsdam Parallel Ice Sheet Model (PISM-PIK), developed at the Potsdam Institute for Climate Impact Research to be used for simulations of large-scale ice sheet-shelf systems. It is derived from the Parallel Ice Sheet Model (Bueler and Brown, 2009). Velocities are calculated by superposition of two shallow stress balance approximations within the entire ice covered region: the shallow ice approximation (SIA) is dominant in grounded regions and accounts for shear deformation parallel to the geoid. The plug-flow type shallow shelf approximation (SSA) dominates the velocity field in ice shelf regions and serves as a basal sliding velocity in grounded regions. Ice streams can be identified diagnostically as regions with a significant contribution of membrane stresses to the local momentum balance. All lateral boundaries in PISM-PIK are free to evolve, including the grounding line and ice fronts. Ice shelf margins in particular are modeled using Neumann boundary conditions for the SSA equations, reflecting a hydrostatic stress imbalance along the vertical calving face. The ice front position is modeled using a subgrid-scale representation of calving front motion (Albrecht et al., 2011) and a physically-motivated calving law based on horizontal spreading rates. The model is tested in experiments from the Marine Ice Sheet Model Intercomparison Project (MISMIP). A dynamic equilibrium simulation of Antarctica under present-day conditions is presented in Martin et al. (2011).
High-Performance Psychometrics: The Parallel-E Parallel-M Algorithm for Generalized Latent Variable Models. Research Report. ETS RR-16-34

Science.gov (United States)

von Davier, Matthias

2016-01-01

This report presents results on a parallel implementation of the expectation-maximization (EM) algorithm for multidimensional latent variable models. The developments presented here are based on code that parallelizes both the E step and the M step of the parallel-E parallel-M algorithm. Examples presented in this report include item response…
Electromagnetic Physics Models for Parallel Computing Architectures

Science.gov (United States)

Amadio, G.; Ananya, A.; Apostolakis, J.; Aurora, A.; Bandieramonte, M.; Bhattacharyya, A.; Bianchini, C.; Brun, R.; Canal, P.; Carminati, F.; Duhem, L.; Elvira, D.; Gheata, A.; Gheata, M.; Goulas, I.; Iope, R.; Jun, S. Y.; Lima, G.; Mohanty, A.; Nikitina, T.; Novak, M.; Pokorski, W.; Ribon, A.; Seghal, R.; Shadura, O.; Vallecorsa, S.; Wenzel, S.; Zhang, Y.

2016-10-01

The recent emergence of hardware architectures characterized by many-core or accelerated processors has opened new opportunities for concurrent programming models taking advantage of both SIMD and SIMT architectures. GeantV, a next generation detector simulation, has been designed to exploit both the vector capability of mainstream CPUs and multi-threading capabilities of coprocessors including NVidia GPUs and Intel Xeon Phi. The characteristics of these architectures are very different in terms of the vectorization depth and type of parallelization needed to achieve optimal performance. In this paper we describe implementation of electromagnetic physics models developed for parallel computing architectures as a part of the GeantV project. Results of preliminary performance evaluation and physics validation are presented as well.
Electromagnetic Physics Models for Parallel Computing Architectures

International Nuclear Information System (INIS)

Amadio, G; Bianchini, C; Iope, R; Ananya, A; Apostolakis, J; Aurora, A; Bandieramonte, M; Brun, R; Carminati, F; Gheata, A; Gheata, M; Goulas, I; Nikitina, T; Bhattacharyya, A; Mohanty, A; Canal, P; Elvira, D; Jun, S Y; Lima, G; Duhem, L

2016-01-01

The recent emergence of hardware architectures characterized by many-core or accelerated processors has opened new opportunities for concurrent programming models taking advantage of both SIMD and SIMT architectures. GeantV, a next generation detector simulation, has been designed to exploit both the vector capability of mainstream CPUs and multi-threading capabilities of coprocessors including NVidia GPUs and Intel Xeon Phi. The characteristics of these architectures are very different in terms of the vectorization depth and type of parallelization needed to achieve optimal performance. In this paper we describe implementation of electromagnetic physics models developed for parallel computing architectures as a part of the GeantV project. Results of preliminary performance evaluation and physics validation are presented as well. (paper)
Modelling and parallel calculation of a kinetic boundary layer

International Nuclear Information System (INIS)

Perlat, Jean Philippe

1998-01-01

This research thesis aims at addressing reliability and cost issues in the calculation by numeric simulation of flows in transition regime. The first step has been to reduce calculation cost and memory space for the Monte Carlo method which is known to provide performance and reliability for rarefied regimes. Vector and parallel computers allow this objective to be reached. Here, a MIMD (multiple instructions, multiple data) machine has been used which implements parallel calculation at different levels of parallelization. Parallelization procedures have been adapted, and results showed that parallelization by calculation domain decomposition was far more efficient. Due to reliability issues related to the statistical nature of Monte Carlo methods, a new deterministic model was necessary to simulate gas molecules in transition regime. New models and hyperbolic systems have therefore been studied. One is chosen which allows thermodynamic values (density, average velocity, temperature, deformation tensor, heat flow) present in Navier-Stokes equations to be determined, and the equations of evolution of thermodynamic values are described for the mono-atomic case. Numerical resolution of is reported. A kinetic scheme is developed which complies with the structure of all systems, and which naturally expresses boundary conditions. The validation of the obtained 14 moment-based model is performed on shock problems and on Couette flows.
Distributed parallel computing in stochastic modeling of groundwater systems.

Science.gov (United States)

Dong, Yanhui; Li, Guomin; Xu, Haizhen

2013-03-01

Stochastic modeling is a rapidly evolving, popular approach to the study of the uncertainty and heterogeneity of groundwater systems. However, the use of Monte Carlo-type simulations to solve practical groundwater problems often encounters computational bottlenecks that hinder the acquisition of meaningful results. To improve the computational efficiency, a system that combines stochastic model generation with MODFLOW-related programs and distributed parallel processing is investigated. The distributed computing framework, called the Java Parallel Processing Framework, is integrated into the system to allow the batch processing of stochastic models in distributed and parallel systems. As an example, the system is applied to the stochastic delineation of well capture zones in the Pinggu Basin in Beijing. Through the use of 50 processing threads on a cluster with 10 multicore nodes, the execution times of 500 realizations are reduced to 3% compared with those of a serial execution. Through this application, the system demonstrates its potential in solving difficult computational problems in practical stochastic modeling. © 2012, The Author(s). Groundwater © 2012, National Ground Water Association.
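The figures quoted in this abstract (execution time reduced to 3% of the serial time using 50 processing threads) can be turned into speedup and parallel efficiency with a short calculation. The sketch below only restates the standard definitions (speedup = serial time / parallel time, efficiency = speedup / number of workers); the function names are illustrative and not taken from the paper.

```python
def speedup_from_time_fraction(fraction):
    """Speedup when the parallel run takes `fraction` of the serial run time."""
    return 1.0 / fraction


def parallel_efficiency(speedup, workers):
    """Fraction of ideal linear speedup actually achieved."""
    return speedup / workers


s = speedup_from_time_fraction(0.03)   # run time reduced to 3% of serial
e = parallel_efficiency(s, 50)         # 50 processing threads
print(f"speedup ~ {s:.1f}x, efficiency ~ {e:.0%}")
# prints "speedup ~ 33.3x, efficiency ~ 67%"
```

Under these definitions the reported reduction corresponds to roughly a 33-fold speedup, or about two thirds of the ideal 50-fold speedup.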
Modeling and Control of Primary Parallel Isolated Boost Converter

DEFF Research Database (Denmark)

Mira Albert, Maria del Carmen; Hernandez Botella, Juan Carlos; Sen, Gökhan

2012-01-01

In this paper state space modeling and closed loop controlled operation have been presented for primary parallel isolated boost converter (PPIBC) topology as a battery charging unit. Parasitic resistances have been included to have an accurate dynamic model. The accuracy of the model has been...
Performance modeling of parallel algorithms for solving neutron diffusion problems

International Nuclear Information System (INIS)

Azmy, Y.Y.; Kirk, B.L.

1995-01-01

Neutron diffusion calculations are the most common computational methods used in the design, analysis, and operation of nuclear reactors and related activities. Here, mathematical performance models are developed for the parallel algorithm used to solve the neutron diffusion equation on message passing and shared memory multiprocessors represented by the Intel iPSC/860 and the Sequent Balance 8000, respectively. The performance models are validated through several test problems, and these models are used to estimate the performance of each of the two considered architectures in situations typical of practical applications, such as fine meshes and a large number of participating processors. While message passing computers are capable of producing speedup, the parallel efficiency deteriorates rapidly as the number of processors increases. Furthermore, the speedup fails to improve appreciably for massively parallel computers so that only small- to medium-sized message passing multiprocessors offer a reasonable platform for this algorithm. In contrast, the performance model for the shared memory architecture predicts very high efficiency over a wide range of number of processors reasonable for this architecture. Furthermore, the model efficiency of the Sequent remains superior to that of the hypercube if its model parameters are adjusted to make its processors as fast as those of the iPSC/860. It is concluded that shared memory computers are better suited for this parallel algorithm than message passing computers
Parallelized Genetic Identification of the Thermal-Electrochemical Model for Lithium-Ion Battery

Directory of Open Access Journals (Sweden)

Liqiang Zhang

2013-01-01

The parameters of a well predicted model can be used as health characteristics for Lithium-ion battery. This article reports a parallelized parameter identification of the thermal-electrochemical model, which significantly reduces the time consumption of parameter identification. Since the P2D model has the most predictability, it is chosen for further research and expanded to the thermal-electrochemical model by coupling thermal effect and temperature-dependent parameters. Then Genetic Algorithm is used for parameter identification, but it takes too much time because of the long simulation time of the model. For this reason, a computer cluster is built from surplus computing resources in our laboratory based on Parallel Computing Toolbox and Distributed Computing Server in MATLAB. The performance of two parallelized methods, namely Single Program Multiple Data (SPMD) and parallel FOR loop (PARFOR), is investigated and then the parallelized GA identification is proposed. With this method, model simulations run in parallel and the parameter identification can be sped up more than a dozen times, and the identification result is better than that from serial GA. This conclusion is validated by model parameter identification of a real LiFePO4 battery.
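The structure that makes a genetic algorithm easy to parallelize is that each candidate parameter set in a generation is simulated independently. The sketch below shows that pattern in Python rather than the MATLAB PARFOR/SPMD setup the article uses. `simulate_error` is a hypothetical stand-in for one model simulation, and the target values are invented. A thread pool is used only to keep the example self-contained; for CPU-bound simulations in Python a process pool would usually be the better choice.

```python
from concurrent.futures import ThreadPoolExecutor


def simulate_error(params):
    """Hypothetical stand-in for one model run: squared distance to a known target."""
    target = (1.0, 2.0, 3.0)
    return sum((p - t) ** 2 for p, t in zip(params, target))


def evaluate_population(population, workers=4):
    """Evaluate every candidate of one GA generation concurrently, preserving order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(simulate_error, population))


population = [(0.0, 0.0, 0.0), (1.0, 2.0, 2.0), (1.0, 2.0, 3.0)]
errors = evaluate_population(population)
best = population[errors.index(min(errors))]
```

Selection, crossover and mutation would then operate on `errors` as in a serial GA; only the fitness evaluation is distributed.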
Parallelization of elliptic solver for solving 1D Boussinesq model

Science.gov (United States)

Tarwidi, D.; Adytia, D.

2018-03-01

In this paper, a parallel implementation of an elliptic solver in solving 1D Boussinesq model is presented. Numerical solution of Boussinesq model is obtained by implementing a staggered grid scheme to continuity, momentum, and elliptic equation of Boussinesq model. Tridiagonal system emerging from numerical scheme of elliptic equation is solved by cyclic reduction algorithm. The parallel implementation of cyclic reduction is executed on multicore processors with shared memory architectures using OpenMP. To measure the performance of the parallel program, the number of grid points is varied from 2^8 to 2^14. Two test cases of numerical experiment, i.e. propagation of solitary and standing wave, are proposed to evaluate the parallel program. The numerical results are verified with analytical solution of solitary and standing wave. The best speedup of solitary and standing wave test cases is about 2.07 with 2^14 grid points and 1.86 with 2^13 grid points, respectively, which are executed by using 8 threads. Moreover, the best efficiency of parallel program is 76.2% and 73.5% for solitary and standing wave test cases, respectively.
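For readers unfamiliar with cyclic reduction, the following is a minimal serial sketch of the algorithm for a tridiagonal system a[i]*x[i-1] + b[i]*x[i] + c[i]*x[i+1] = d[i]. It is a recursive textbook-style formulation written for this listing, not the authors' OpenMP implementation, and it assumes a[0] == 0, c[-1] == 0 and a well-conditioned (for example diagonally dominant) matrix. The loop over odd indices at each level has no dependencies between iterations, which is the part a shared-memory implementation can distribute across threads.

```python
def cyclic_reduction(a, b, c, d):
    """Solve a tridiagonal system by cyclic reduction.

    a: sub-diagonal (a[0] must be 0), b: diagonal,
    c: super-diagonal (c[-1] must be 0), d: right-hand side.
    """
    n = len(b)
    if n == 1:
        return [d[0] / b[0]]

    # Reduction: eliminate the even-indexed unknowns from each odd-indexed equation.
    ra, rb, rc, rd = [], [], [], []
    for i in range(1, n, 2):          # iterations are independent of each other
        alpha = -a[i] / b[i - 1]
        na = alpha * a[i - 1]
        nb = b[i] + alpha * c[i - 1]
        nc = 0.0
        nd = d[i] + alpha * d[i - 1]
        if i + 1 < n:
            gamma = -c[i] / b[i + 1]
            nb += gamma * a[i + 1]
            nc = gamma * c[i + 1]
            nd += gamma * d[i + 1]
        ra.append(na)
        rb.append(nb)
        rc.append(nc)
        rd.append(nd)

    # The reduced system (odd unknowns only) is again tridiagonal.
    x = [0.0] * n
    x[1::2] = cyclic_reduction(ra, rb, rc, rd)

    # Back-substitution: recover the even-indexed unknowns from their neighbours.
    for i in range(0, n, 2):          # iterations are independent of each other
        left = a[i] * x[i - 1] if i > 0 else 0.0
        right = c[i] * x[i + 1] if i + 1 < n else 0.0
        x[i] = (d[i] - left - right) / b[i]
    return x
```

Each level halves the number of unknowns, so a system of size 2^k is solved in about k reduction levels followed by the matching back-substitution levels.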
A hybrid parallel framework for the cellular Potts model simulations

Energy Technology Data Exchange (ETDEWEB)

Jiang, Yi [Los Alamos National Laboratory]; He, Kejing [SOUTH CHINA UNIV]; Dong, Shoubin [SOUTH CHINA UNIV]

2009-01-01

The Cellular Potts Model (CPM) has been widely used for biological simulations. However, most current implementations are either sequential or approximated, and cannot be used for large-scale complex 3D simulation. In this paper we present a hybrid parallel framework for CPM simulations. The time-consuming PDE solving, cell division, and cell reaction operations are distributed to clusters using the Message Passing Interface (MPI). The Monte Carlo lattice update is parallelized on a shared-memory SMP system using OpenMP. Because the Monte Carlo lattice update is much faster than the PDE solving and SMP systems are more and more common, this hybrid approach achieves good performance and high accuracy at the same time. Based on the parallel Cellular Potts Model, we studied avascular tumor growth using a multiscale model. The application and performance analysis show that the hybrid parallel framework is quite efficient. The hybrid parallel CPM can be used for large-scale simulation (~10^8 sites) of the complex collective behavior of numerous cells (~10^6).
Error Modeling and Design Optimization of Parallel Manipulators

DEFF Research Database (Denmark)

Wu, Guanglei

/backlash, manufacturing and assembly errors and joint clearances. From the error prediction model, the distributions of the pose errors due to joint clearances are mapped within its constant-orientation workspace and the correctness of the developed model is validated experimentally. Additionally, using the screw......, dynamic modeling etc. Next, the first-order differential equation of the kinematic closure equation of planar parallel manipulator is obtained to develop its error model both in Polar and Cartesian coordinate systems. The established error model contains the error sources of actuation error...
Badlands: A parallel basin and landscape dynamics model

Directory of Open Access Journals (Sweden)

T. Salles

2016-01-01

Over more than three decades, a number of numerical landscape evolution models (LEMs) have been developed to study the combined effects of climate, sea-level, tectonics and sediments on Earth surface dynamics. Most of them are written in efficient programming languages, but often cannot be used on parallel architectures. Here, I present a LEM which ports a common core of accepted physical principles governing landscape evolution into a distributed memory parallel environment. Badlands (acronym for BAsin anD LANdscape DynamicS) is an open-source, flexible, TIN-based landscape evolution model, built to simulate topography development at various space and time scales.
Implementation of a parallel version of a regional climate model

Energy Technology Data Exchange (ETDEWEB)

Gerstengarbe, F.W. [ed.]; Kuecken, M. [Potsdam-Institut fuer Klimafolgenforschung (PIK), Potsdam (Germany)]; Schaettler, U. [Deutscher Wetterdienst, Offenbach am Main (Germany). Geschaeftsbereich Forschung und Entwicklung]

1997-10-01

A regional climate model developed by the Max Planck Institute for Meteorology and the German Climate Computing Centre in Hamburg, based on the 'Europa' and 'Deutschland' models of the German Weather Service, has been parallelized and implemented on the IBM RS/6000 SP computer system of the Potsdam Institute for Climate Impact Research, including parallel input/output processing, the explicit Eulerian time-step, the semi-implicit corrections, the normal-mode initialization and the physical parameterizations of the German Weather Service. The implementation utilizes Fortran 90 and the Message Passing Interface. The parallelization strategy used is a 2D domain decomposition. This report describes the parallelization strategy, the parallel I/O organization, the influence of different domain decomposition approaches for static and dynamic load imbalances and first numerical results. (orig.)
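The idea of a 2D domain decomposition can be sketched in a few lines of Python. This is only an illustration of the general technique, not the report's Fortran 90/MPI code: the function names `block_range` and `decompose_2d`, the row-major rank numbering, and the grid sizes are hypothetical, and the halo cells and message exchange that a real model needs are left out.

```python
def block_range(n, parts, index):
    """Half-open range [start, stop) of block `index` when n cells are
    split into `parts` contiguous blocks of nearly equal size."""
    base, extra = divmod(n, parts)
    start = index * base + min(index, extra)
    stop = start + base + (1 if index < extra else 0)
    return start, stop


def decompose_2d(nx, ny, px, py):
    """Map each rank of a px-by-py process grid to its (x-range, y-range)."""
    patches = {}
    for j in range(py):
        for i in range(px):
            rank = j * px + i          # row-major rank numbering (assumed)
            patches[rank] = (block_range(nx, px, i), block_range(ny, py, j))
    return patches


patches = decompose_2d(nx=10, ny=7, px=3, py=2)
for rank, (x_range, y_range) in sorted(patches.items()):
    print(rank, x_range, y_range)
```

Spreading the remainder cells over the first blocks keeps patch sizes within one cell of each other. That addresses static load balance only; dynamic imbalance, which the report also examines, depends on where the expensive physics happens to be active.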
Tutorial: Parallel Computing of Simulation Models for Risk Analysis.

Science.gov (United States)

Reilly, Allison C; Staid, Andrea; Gao, Michael; Guikema, Seth D

2016-10-01

Simulation models are widely used in risk analysis to study the effects of uncertainties on outcomes of interest in complex problems. Often, these models are computationally complex and time consuming to run. This latter point may be at odds with time-sensitive evaluations or may limit the number of parameters that are considered. In this article, we give an introductory tutorial focused on parallelizing simulation code to better leverage modern computing hardware, enabling risk analysts to better utilize simulation-based methods for quantifying uncertainty in practice. This article is aimed primarily at risk analysts who use simulation methods but do not yet utilize parallelization to decrease the computational burden of these models. The discussion is focused on conceptual aspects of embarrassingly parallel computer code and software considerations. Two complementary examples are shown using the languages MATLAB and R. A brief discussion of hardware considerations is located in the Appendix. © 2016 Society for Risk Analysis.
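The notion of embarrassingly parallel simulation code can be made concrete with a small sketch. The tutorial itself uses MATLAB and R; the Python version below, the toy loss model, and the names `simulate_once`, `run_serial`, `run_parallel` and `summarize` are assumptions chosen purely for illustration. Each replication gets its own seeded random generator and shares no state with the others, so the replications can be distributed over worker processes without any coordination.

```python
import random
import statistics
from concurrent.futures import ProcessPoolExecutor


def simulate_once(seed):
    """One independent replication of a toy loss model (hypothetical):
    the sum of 1000 exponentially distributed losses with mean 1."""
    rng = random.Random(seed)      # private generator, so runs do not interact
    return sum(rng.expovariate(1.0) for _ in range(1000))


def run_serial(seeds):
    """Reference version: one replication after another."""
    return [simulate_once(s) for s in seeds]


def run_parallel(seeds, workers=4):
    """Same replications, farmed out to a pool of worker processes."""
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(simulate_once, seeds))


def summarize(results):
    """Mean and sample standard deviation of the simulated outcomes."""
    return statistics.mean(results), statistics.stdev(results)
```

In a script, `run_parallel(range(200))` would normally be called under an `if __name__ == "__main__":` guard so that worker processes can import the module safely. Because every replication is seeded explicitly, it returns the same list as `run_serial(range(200))`, which makes the parallel version easy to check.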
Implementing parallel spreadsheet models for health policy decisions: The impact of unintentional errors on model projections.

Science.gov (United States)

Bailey, Stephanie L; Bono, Rose S; Nash, Denis; Kimmel, April D

2018-01-01

Spreadsheet software is increasingly used to implement systems science models informing health policy decisions, both in academia and in practice where technical capacity may be limited. However, spreadsheet models are prone to unintentional errors that may not always be identified using standard error-checking techniques. Our objective was to illustrate, through a methodologic case study analysis, the impact of unintentional errors on model projections by implementing parallel model versions. We leveraged a real-world need to revise an existing spreadsheet model designed to inform HIV policy. We developed three parallel versions of a previously validated spreadsheet-based model; versions differed by the spreadsheet cell-referencing approach (named single cells; column/row references; named matrices). For each version, we implemented three model revisions (re-entry into care; guideline-concordant treatment initiation; immediate treatment initiation). After standard error-checking, we identified unintentional errors by comparing model output across the three versions. Concordant model output across all versions was considered error-free. We calculated the impact of unintentional errors as the percentage difference in model projections between model versions with and without unintentional errors, using +/-5% difference to define a material error. We identified 58 original and 4,331 propagated unintentional errors across all model versions and revisions. Over 40% (24/58) of original unintentional errors occurred in the column/row reference model version; most (23/24) were due to incorrect cell references. Overall, >20% of model spreadsheet cells had material unintentional errors. When examining error impact along the HIV care continuum, the percentage difference between versions with and without unintentional errors ranged from +3% to +16% (named single cells), +26% to +76% (column/row reference), and 0% (named matrices). Standard error-checking techniques may not
Parallelization of the Coupled Earthquake Model

Science.gov (United States)

Block, Gary; Li, P. Peggy; Song, Yuhe T.

2007-01-01

This Web-based tsunami simulation system allows users to remotely run a model on JPL's supercomputers for a given undersea earthquake. At the time of this reporting, predicting tsunamis on the Internet has never happened before. This new code directly couples the earthquake model and the ocean model on parallel computers and improves simulation speed. Seismometers can only detect information from earthquakes; they cannot detect whether or not a tsunami may occur as a result of the earthquake. When earthquake-tsunami models are coupled with the improved computational speed of modern, high-performance computers and constrained by remotely sensed data, they are able to provide early warnings for those coastal regions at risk. The software is capable of testing NASA's satellite observations of tsunamis. It has been successfully tested for several historical tsunamis, has passed all alpha and beta testing, and is well documented for users.
Parallel-Batch Scheduling with Two Models of Deterioration to Minimize the Makespan

Directory of Open Access Journals (Sweden)

Cuixia Miao

2014-01-01

We consider bounded parallel-batch scheduling with two models of deterioration, in which the processing time is p_j = a_j + αt in the first model and p_j = a + α_j t in the second. The objective is to minimize the makespan. We present O(n log n)-time algorithms for the single-machine problems under both models, and we propose fully polynomial time approximation schemes for the identical-parallel-machine and uniform-parallel-machine problems.
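To make the first deterioration model concrete, the sketch below evaluates the makespan of a given batch sequence on one machine. It only evaluates the objective; it is not the paper's O(n log n) algorithm. The function name is hypothetical, the batch capacity bound is not enforced, and the sketch assumes that a batch runs as long as its longest job, which is the usual convention in parallel-batch scheduling.

```python
def makespan_linear_deterioration(batches, alpha, t0=0.0):
    """Completion time of the last batch on one machine when a job started
    at time t takes a_j + alpha * t (the first model in the abstract).

    Assumption for this sketch: a batch lasts as long as its longest job.
    `batches` is a list of lists of basic processing times a_j.
    """
    t = t0
    for batch in batches:
        longest = max(batch)
        # The batch starts at time t and lasts longest + alpha * t.
        t += longest + alpha * t
    return t


print(makespan_linear_deterioration([[3.0, 1.0], [2.0]], alpha=0.5))  # 6.5
print(makespan_linear_deterioration([[2.0], [3.0, 1.0]], alpha=0.5))  # 6.0
```

The two calls schedule the same jobs in the same batches but in a different order, and the makespan changes from 6.5 to 6.0. Under deterioration the order of batches matters, which is what makes the scheduling problem non-trivial.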
Parallelization of a hydrological model using the message passing interface

Science.gov (United States)

Wu, Yiping; Li, Tiejian; Sun, Liqun; Chen, Ji

2013-01-01

With the increasing knowledge about the natural processes, hydrological models such as the Soil and Water Assessment Tool (SWAT) are becoming larger and more complex with increasing computation time. Additionally, other procedures such as model calibration, which may require thousands of model iterations, can increase running time and thus further hinder rapid modeling and analysis. Using the widely-applied SWAT as an example, this study demonstrates how to parallelize a serial hydrological model in a Windows® environment using a parallel programming technology, the Message Passing Interface (MPI). With a case study, we derived the optimal values for the two parameters (the number of processes and the corresponding percentage of work to be distributed to the master process) of the parallel SWAT (P-SWAT) on an ordinary personal computer and a work station. Our study indicates that model execution time can be reduced by 42%–70% (or a speedup of 1.74–3.36) using multiple processes (two to five) with a proper task-distribution scheme (between the master and slave processes). Although the computation time becomes lower with an increasing number of processes (from two to five), the gain diminishes because of the accompanying increase in demand for message passing between the master and all slave processes. Our case study demonstrates that the P-SWAT with a five-process run may reach the maximum speedup, and the performance can be quite stable (fairly independent of project size). Overall, the P-SWAT can help reduce the computation time substantially for an individual model run, manual and automatic calibration procedures, and optimization of best management practices. In particular, the parallelization method we used and the scheme for deriving the optimal parameters in this study can be valuable and easily applied to other hydrological or environmental models.
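The two ways of reporting the gain in this abstract, a percentage reduction in run time and a speedup factor, are linked by simple arithmetic: a speedup S saves a fraction 1 - 1/S of the serial time. The short Python sketch below (function names are illustrative) checks the reported range.

```python
def time_reduction(speedup):
    """Fraction of the serial run time saved at speedup S: 1 - 1/S."""
    return 1.0 - 1.0 / speedup


def efficiency(speedup, processes):
    """Parallel efficiency: speedup divided by the number of processes."""
    return speedup / processes


for s in (1.74, 3.36):
    print(f"speedup {s:.2f} -> {time_reduction(s):.1%} less run time")
```

Speedups of 1.74 and 3.36 correspond to about 42.5% and 70.2% less run time, consistent with the rounded 42%–70% quoted above. If the upper figure belongs to the five-process run (an assumption; the abstract does not pair the numbers explicitly), `efficiency(3.36, 5)` gives about 0.67.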
A Parallel Computational Model for Multichannel Phase Unwrapping Problem

Science.gov (United States)

Imperatore, Pasquale; Pepe, Antonio; Lanari, Riccardo

2015-05-01

In this paper, a parallel model for the solution of the computationally intensive multichannel phase unwrapping (MCh-PhU) problem is proposed. Firstly, the Extended Minimum Cost Flow (EMCF) algorithm for solving the MCh-PhU problem is revised within the rigorous mathematical framework of discrete calculus, which makes it possible to capture its topological structure in terms of meaningful discrete differential operators. Secondly, emphasis is placed on those methodological and practical aspects which lead to a parallel reformulation of the EMCF algorithm. Thus, a novel dual-level parallel computational model, in which the parallelism is hierarchically implemented at two different (i.e., process and thread) levels, is presented. The validity of our approach has been demonstrated through a series of experiments that have revealed a significant speedup. Therefore, the attained high-performance prototype is suitable for the solution of large-scale phase unwrapping problems in reasonable time frames, with a significant impact on the systematic exploitation of the existing, and rapidly growing, large archives of SAR data.

Climate models on massively parallel computers

International Nuclear Information System (INIS)

Vitart, F.; Rouvillois, P.

1993-01-01

First results obtained on massively parallel computers (Multiple Instruction Multiple Data and Single Instruction Multiple Data) make it possible to consider building coupled models with high resolution. This would allow simulation of thermohaline circulation and other interaction phenomena between the atmosphere and the ocean. The increase in computing power, and the resulting improvement in resolution, will force us to revise our approximations. The hydrostatic approximation (in ocean circulation) will no longer be valid when the grid mesh is smaller than a few kilometers, so other models will have to be found. The expertise in numerical analysis acquired at the Center of Limeil-Valenton (CEL-V) will be used again to devise global models that take into account the atmosphere, ocean, ice floe and biosphere, allowing climate simulation down to a regional scale.
The Parallel System for Integrating Impact Models and Sectors (pSIMS)

Science.gov (United States)

Elliott, Joshua; Kelly, David; Chryssanthacopoulos, James; Glotter, Michael; Jhunjhnuwala, Kanika; Best, Neil; Wilde, Michael; Foster, Ian

2014-01-01

We present a framework for massively parallel climate impact simulations: the parallel System for Integrating Impact Models and Sectors (pSIMS). This framework comprises a) tools for ingesting and converting large amounts of data to a versatile datatype based on a common geospatial grid; b) tools for translating this datatype into custom formats for site-based models; c) a scalable parallel framework for performing large ensemble simulations, using any one of a number of different impacts models, on clusters, supercomputers, distributed grids, or clouds; d) tools and data standards for reformatting outputs to common datatypes for analysis and visualization; and e) methodologies for aggregating these datatypes to arbitrary spatial scales such as administrative and environmental demarcations. By automating many time-consuming and error-prone aspects of large-scale climate impacts studies, pSIMS accelerates computational research, encourages model intercomparison, and enhances reproducibility of simulation results. We present the pSIMS design and use example assessments to demonstrate its multi-model, multi-scale, and multi-sector versatility.
Efficient Parallel Statistical Model Checking of Biochemical Networks

Directory of Open Access Journals (Sweden)

Paolo Ballarini

2009-12-01

We consider the problem of verifying stochastic models of biochemical networks against behavioral properties expressed in temporal logic terms. Exact probabilistic verification approaches such as, for example, CSL/PCTL model checking, are undermined by a huge computational demand which rules them out for most real case studies. Less demanding approaches, such as statistical model checking, estimate the likelihood that a property is satisfied by sampling executions out of the stochastic model. We propose a methodology for efficiently estimating the likelihood that an LTL property P holds of a stochastic model of a biochemical network. As with other statistical verification techniques, the methodology we propose uses a stochastic simulation algorithm for generating execution samples; however, there are three key aspects that improve the efficiency. First, the sample generation is driven by on-the-fly verification of P, which results in optimal overall simulation time. Second, the confidence interval estimation for the probability of P to hold is based on an efficient variant of the Wilson method, which ensures a faster convergence. Third, the whole methodology is designed to run in parallel, and a prototype software tool has been implemented that performs the sampling/verification process in parallel over an HPC architecture.
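For readers unfamiliar with the Wilson method mentioned here, the following Python sketch computes the standard Wilson score interval for the probability that a property holds, given how many sampled executions satisfied it. It is the textbook formula, not the efficient variant the authors describe, and the function name and the example counts are made up for illustration.

```python
import math


def wilson_interval(successes, samples, z=1.96):
    """Wilson score interval for a binomial proportion.

    z is the standard normal quantile (1.96 gives roughly 95% confidence).
    """
    if samples <= 0:
        raise ValueError("need at least one sample")
    p_hat = successes / samples
    denom = 1.0 + z * z / samples
    centre = (p_hat + z * z / (2.0 * samples)) / denom
    half = z * math.sqrt(p_hat * (1.0 - p_hat) / samples
                         + z * z / (4.0 * samples * samples)) / denom
    return centre - half, centre + half


# Hypothetical run: 87 of 100 sampled executions satisfied the property.
low, high = wilson_interval(successes=87, samples=100)
print(f"P(property holds) is estimated in [{low:.3f}, {high:.3f}]")
```

Unlike the simple normal-approximation interval, the Wilson interval stays inside [0, 1] and behaves sensibly when the observed proportion is close to 0 or 1, which is common when checking rare behaviours.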
Parallel computing in enterprise modeling.

Energy Technology Data Exchange (ETDEWEB)

Goldsby, Michael E.; Armstrong, Robert C.; Shneider, Max S.; Vanderveen, Keith; Ray, Jaideep; Heath, Zach; Allan, Benjamin A.

2008-08-01

This report presents the results of our efforts to apply high-performance computing to entity-based simulations with a multi-use plugin for parallel computing. We use the term 'Entity-based simulation' to describe a class of simulation which includes both discrete event simulation and agent based simulation. What simulations of this class share, and what distinguishes them from more traditional models, is that the result sought is emergent from a large number of contributing entities. Logistic, economic and social simulations are members of this class where things or people are organized or self-organize to produce a solution. Entity-based problems never have an a priori ergodic principle that will greatly simplify calculations. Because the results of entity-based simulations can only be realized at scale, scalable computing is de rigueur for large problems. Having said that, the absence of a spatial organizing principle makes the decomposition of the problem onto processors problematic. In addition, practitioners in this domain commonly use the Java programming language, which presents its own problems in a high-performance setting. The plugin we have developed, called the Parallel Particle Data Model, overcomes both of these obstacles and is now being used by two Sandia frameworks: the Decision Analysis Center, and the Seldon social simulation facility. While the ability to engage U.S.-sized problems is now available to the Decision Analysis Center, this plugin is central to the success of Seldon. Because Seldon relies on computationally intensive cognitive sub-models, this work is necessary to achieve the scale necessary for realistic results. With the recent upheavals in the financial markets, and the inscrutability of terrorist activity, this simulation domain will likely need a capability with ever greater fidelity. High-performance computing will play an important part in enabling that greater fidelity.
Exploration Of Deep Learning Algorithms Using OpenACC Parallel Programming Model

KAUST Repository

Hamam, Alwaleed A.

2017-03-13

Deep learning is based on a set of algorithms that attempt to model high-level abstractions in data. Specifically, the Restricted Boltzmann Machine (RBM) is the deep learning algorithm used in this project, whose time performance is improved through an efficient parallel implementation with the OpenACC tool, applying the best possible optimizations to harness the massively parallel power of NVIDIA GPUs. GPU development in the last few years has contributed to the growth of deep learning. OpenACC is a directive-based approach to computing in which directives provide compiler hints to accelerate code. The traditional Restricted Boltzmann Machine is a stochastic neural network that essentially performs a binary version of factor analysis. The RBM is a useful neural network basis for larger modern deep learning models, such as the Deep Belief Network. RBM parameters are estimated using an efficient training method called Contrastive Divergence. Parallel implementations of RBM are available using different models such as OpenMP and CUDA, but this project is the first attempt to apply the OpenACC model to RBM.
Exploration Of Deep Learning Algorithms Using OpenACC Parallel Programming Model

KAUST Repository

Hamam, Alwaleed A.; Khan, Ayaz H.

2017-01-01

Deep learning is based on a set of algorithms that attempt to model high-level abstractions in data. Specifically, the Restricted Boltzmann Machine (RBM) is the deep learning algorithm used in this project, whose time performance is improved through an efficient parallel implementation with the OpenACC tool, applying the best possible optimizations to harness the massively parallel power of NVIDIA GPUs. GPU development in the last few years has contributed to the growth of deep learning. OpenACC is a directive-based approach to computing in which directives provide compiler hints to accelerate code. The traditional Restricted Boltzmann Machine is a stochastic neural network that essentially performs a binary version of factor analysis. The RBM is a useful neural network basis for larger modern deep learning models, such as the Deep Belief Network. RBM parameters are estimated using an efficient training method called Contrastive Divergence. Parallel implementations of RBM are available using different models such as OpenMP and CUDA, but this project is the first attempt to apply the OpenACC model to RBM.
A model for optimizing file access patterns using spatio-temporal parallelism

Energy Technology Data Exchange (ETDEWEB)

Boonthanome, Nouanesengsy [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Patchett, John [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Geveci, Berk [Kitware Inc., Clifton Park, NY (United States); Ahrens, James [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Bauer, Andy [Kitware Inc., Clifton Park, NY (United States); Chaudhary, Aashish [Kitware Inc., Clifton Park, NY (United States); Miller, Ross G. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Shipman, Galen M. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Williams, Dean N. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)

2013-01-01

For many years now, I/O read time has been recognized as the primary bottleneck for parallel visualization and analysis of large-scale data. In this paper, we introduce a model that can estimate the read time for a file stored in a parallel filesystem when given the file access pattern. Read times ultimately depend on how the file is stored and the access pattern used to read the file. The file access pattern will be dictated by the type of parallel decomposition used. We employ spatio-temporal parallelism, which combines both spatial and temporal parallelism, to provide greater flexibility to possible file access patterns. Using our model, we were able to configure the spatio-temporal parallelism to design optimized read access patterns that resulted in a speedup factor of approximately 400 over traditional file access patterns.
Development of Parallel Code for the Alaska Tsunami Forecast Model

Science.gov (United States)

Bahng, B.; Knight, W. R.; Whitmore, P.

2014-12-01

The Alaska Tsunami Forecast Model (ATFM) is a numerical model used to forecast propagation and inundation of tsunamis generated by earthquakes and other means in both the Pacific and Atlantic Oceans. At the U.S. National Tsunami Warning Center (NTWC), the model is mainly used in a pre-computed fashion. That is, results for hundreds of hypothetical events are computed before alerts, and are accessed and calibrated with observations during tsunamis to immediately produce forecasts. ATFM uses the non-linear, depth-averaged, shallow-water equations of motion with multiply nested grids in two-way communications between domains of each parent-child pair as waves get closer to coastal waters. Even with the pre-computation the task becomes non-trivial as sub-grid resolution gets finer. Currently, the finest resolution Digital Elevation Models (DEM) used by ATFM are 1/3 arc-seconds. With a serial code, large or multiple areas of very high resolution can produce run-times that are unrealistic even in a pre-computed approach. One way to increase the model performance is code parallelization used in conjunction with a multi-processor computing environment. NTWC developers have undertaken an ATFM code-parallelization effort to streamline the creation of the pre-computed database of results with the long term aim of tsunami forecasts from source to high resolution shoreline grids in real time. Parallelization will also permit timely regeneration of the forecast model database with new DEMs; and, will make possible future inclusion of new physics such as the non-hydrostatic treatment of tsunami propagation. The purpose of our presentation is to elaborate on the parallelization approach and to show the compute speed increase on various multi-processor systems.
Performance of Air Pollution Models on Massively Parallel Computers

DEFF Research Database (Denmark)

Brown, John; Hansen, Per Christian; Wasniewski, Jerzy

1996-01-01

To compare the performance and use of three massively parallel SIMD computers, we implemented a large air pollution model on the computers. Using a realistic large-scale model, we gain detailed insight about the performance of the three computers when used to solve large-scale scientific problems...
Parallel community climate model: Description and user's guide

Energy Technology Data Exchange (ETDEWEB)

Drake, J.B.; Flanery, R.E.; Semeraro, B.D.; Worley, P.H. [and others]

1996-07-15

This report gives an overview of a parallel version of the NCAR Community Climate Model, CCM2, implemented for MIMD massively parallel computers using a message-passing programming paradigm. The parallel implementation was developed on an Intel iPSC/860 with 128 processors and on the Intel Delta with 512 processors, and the initial target platform for the production version of the code is the Intel Paragon with 2048 processors. Because the implementation uses standard, portable message-passing libraries, the code has been easily ported to other multiprocessors supporting a message-passing programming paradigm. The parallelization strategy used is to decompose the problem domain into geographical patches and assign each processor the computation associated with a distinct subset of the patches. With this decomposition, the physics calculations involve only grid points and data local to a processor and are performed in parallel. Using parallel algorithms developed for the semi-Lagrangian transport, the fast Fourier transform and the Legendre transform, both physics and dynamics are computed in parallel with minimal data movement and modest change to the original CCM2 source code. Sequential or parallel history tapes are written and input files (in history tape format) are read sequentially by the parallel code to promote compatibility with production use of the model on other computer systems. A validation exercise has been performed with the parallel code and is detailed along with some performance numbers on the Intel Paragon and the IBM SP2. A discussion of reproducibility of results is included. A user's guide for the PCCM2 version 2.1 on the various parallel machines completes the report. Procedures for compilation, setup and execution are given. A discussion of code internals is included for those who may wish to modify and use the program in their own research.
Methods to model-check parallel systems software

International Nuclear Information System (INIS)

Matlin, O. S.; McCune, W.; Lusk, E.

2003-01-01

We report on an effort to develop methodologies for formal verification of parts of the Multi-Purpose Daemon (MPD) parallel process management system. MPD is a distributed collection of communicating processes. While the individual components of the collection execute simple algorithms, their interaction leads to unexpected errors that are difficult to uncover by conventional means. Two verification approaches are discussed here: the standard model checking approach using the software model checker SPIN, and the nonstandard use of a general-purpose first-order resolution-style theorem prover, OTTER, to conduct the traditional state space exploration. We compare modeling methodology and analyze performance and scalability of the two methods with respect to verification of MPD.
cellGPU: Massively parallel simulations of dynamic vertex models

Science.gov (United States)

Sussman, Daniel M.

2017-10-01

Vertex models represent confluent tissue by polygonal or polyhedral tilings of space, with the individual cells interacting via force laws that depend on both the geometry of the cells and the topology of the tessellation. This dependence on the connectivity of the cellular network introduces several complications to performing molecular-dynamics-like simulations of vertex models, and in particular makes parallelizing the simulations difficult. cellGPU addresses this difficulty and lays the foundation for massively parallelized, GPU-based simulations of these models. This article discusses its implementation for a pair of two-dimensional models, and compares the typical performance that can be expected between running cellGPU entirely on the CPU versus its performance when running on a range of commercial and server-grade graphics cards. By implementing the calculation of topological changes and forces on cells in a highly parallelizable fashion, cellGPU enables researchers to simulate time- and length-scales previously inaccessible via existing single-threaded CPU implementations. Program Files doi:http://dx.doi.org/10.17632/6j2cj29t3r.1 Licensing provisions: MIT Programming language: CUDA/C++ Nature of problem: Simulations of off-lattice "vertex models" of cells, in which the interaction forces depend on both the geometry and the topology of the cellular aggregate. Solution method: Highly parallelized GPU-accelerated dynamical simulations in which the force calculations and the topological features can be handled on either the CPU or GPU. Additional comments: The code is hosted at https://gitlab.com/dmsussman/cellGPU, with documentation additionally maintained at http://dmsussman.gitlab.io/cellGPUdocumentation
Co-simulation of dynamic systems in parallel and serial model configurations

International Nuclear Information System (INIS)

Sweafford, Trevor; Yoon, Hwan Sik

2013-01-01

Recent advances in simulation software and computation hardware make it feasible to simulate complex dynamic systems comprised of multiple submodels developed in different modeling languages. So-called co-simulation enables one to study various aspects of a complex dynamic system with heterogeneous submodels in a cost-effective manner. Among several different model configurations for co-simulation, the synchronized parallel configuration is regarded as expediting the simulation process by simulating multiple submodels concurrently on a multi-core processor. In this paper, computational accuracy as well as computation time are studied for three different co-simulation frameworks: integrated, serial, and parallel. For this purpose, analytical evaluations of the three methods are made using the explicit Euler method, and then they are applied to two-DOF mass-spring systems. The results show that while the parallel simulation configuration produces the same accurate results as the integrated configuration, results of the serial configuration show a slight deviation. It is also shown that the computation time can be reduced by running the simulation in the parallel configuration. Therefore, it can be concluded that the synchronized parallel simulation methodology is the best for both simulation accuracy and time efficiency.
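The reported behaviour of the three configurations can be reproduced qualitatively with a small explicit Euler experiment on a two-DOF mass-spring chain. The sketch below is an interpretation, not the paper's setup: it assumes that "parallel" means both submodels advance from the same previous-step coupling values, and that "serial" means the second submodel consumes the position the first one has just computed. Masses, stiffnesses, step size and all function names are illustrative.

```python
M1 = M2 = 1.0   # masses (illustrative values)
K1 = K2 = 1.0   # spring stiffnesses (illustrative values)


def sub1(x1, v1, x2_in, h):
    """Euler step of mass 1; x2_in is the coupling input from submodel 2."""
    acc = (-K1 * x1 + K2 * (x2_in - x1)) / M1
    return x1 + h * v1, v1 + h * acc


def sub2(x2, v2, x1_in, h):
    """Euler step of mass 2; x1_in is the coupling input from submodel 1."""
    acc = -K2 * (x2 - x1_in) / M2
    return x2 + h * v2, v2 + h * acc


def step_integrated(state, h):
    """Single model: all derivatives are evaluated from the old state."""
    x1, v1, x2, v2 = state
    a1 = (-K1 * x1 + K2 * (x2 - x1)) / M1
    a2 = -K2 * (x2 - x1) / M2
    return x1 + h * v1, v1 + h * a1, x2 + h * v2, v2 + h * a2


def step_parallel(state, h):
    """Both submodels read only old coupling values, so they could run concurrently."""
    x1, v1, x2, v2 = state
    return sub1(x1, v1, x2, h) + sub2(x2, v2, x1, h)


def step_serial(state, h):
    """Submodel 2 waits for submodel 1 and uses its freshly updated position."""
    x1, v1, x2, v2 = state
    x1_new, v1_new = sub1(x1, v1, x2, h)
    return (x1_new, v1_new) + sub2(x2, v2, x1_new, h)


def run(step, steps=1000, h=0.001):
    state = (1.0, 0.0, 0.0, 0.0)   # mass 1 displaced, everything else at rest
    for _ in range(steps):
        state = step(state, h)
    return state


def max_diff(p, q):
    return max(abs(u - v) for u, v in zip(p, q))


reference = run(step_integrated)
print("parallel vs integrated:", max_diff(run(step_parallel), reference))
print("serial   vs integrated:", max_diff(run(step_serial), reference))
```

Under these assumptions the parallel configuration performs the same arithmetic as the integrated model, because explicit Euler only ever reads the previous state, while the serial configuration mixes old and new values and drifts away slightly. The sketch does not measure run time; the time savings the abstract reports come from executing the two submodel steps on separate cores.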
Efficient parallel implementation of active appearance model fitting algorithm on GPU.

Science.gov (United States)

Wang, Jinwei; Ma, Xirong; Zhu, Yuanping; Sun, Jizhou

2014-01-01

The active appearance model (AAM) is one of the most powerful model-based object detecting and tracking methods and has been widely used in various situations. However, the high-dimensional texture representation causes very time-consuming computations, which makes the AAM difficult to apply to real-time systems. The emergence of modern graphics processing units (GPUs) that feature a many-core, fine-grained parallel architecture provides new and promising solutions to overcome the computational challenge. In this paper, we propose an efficient parallel implementation of the AAM fitting algorithm on GPUs. Our design idea is fine-grained parallelism, in which we distribute the texture data of the AAM, in pixels, to thousands of parallel GPU threads for processing, which makes the algorithm fit better into the GPU architecture. We implement our algorithm using the compute unified device architecture (CUDA) on Nvidia's GTX 650 GPU, which has the latest Kepler architecture. To compare the performance of our algorithm with different data sizes, we built sixteen face AAM models of different dimensional textures. The experiment results show that our parallel AAM fitting algorithm can achieve real-time performance for videos even on very high-dimensional textures.
Depth-Averaged Non-Hydrostatic Hydrodynamic Model Using a New Multithreading Parallel Computing Method

Directory of Open Access Journals (Sweden)

Ling Kang

2017-03-01

Full Text Available Compared to the hydrostatic hydrodynamic model, the non-hydrostatic hydrodynamic model can accurately simulate flows that feature vertical accelerations. However, the model’s low computational efficiency severely restricts its wider application. This paper proposes a non-hydrostatic hydrodynamic model based on a multithreading parallel computing method. The horizontal momentum equation is obtained by integrating the Navier–Stokes equations from the bottom to the free surface. The vertical momentum equation is approximated by the Keller-box scheme. A two-step method is used to solve the model equations. A parallel strategy based on block decomposition computation is utilized. The original computational domain is subdivided into two subdomains that are physically connected via a virtual boundary technique. Two sub-threads are created and tasked with the computation of the two subdomains. The producer–consumer model and the thread lock technique are used to achieve synchronous communication between sub-threads. The validity of the model was verified by solitary wave propagation experiments over a flat bottom and a slope, followed by two sinusoidal wave propagation experiments over a submerged breakwater. The parallel computing method proposed here was found to effectively enhance computational efficiency and save 20%–40% of the computation time compared to serial computing. The parallel acceleration rate (speedup) and acceleration efficiency are approximately 1.45 and 72%, respectively. The parallel computing method makes a contribution to the popularization of non-hydrostatic models.
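The reported figures can be cross-checked with a short calculation. It reads the acceleration rate as a speedup factor of about 1.45 on two threads, which is an interpretation rather than a statement from the paper; the function name is hypothetical.

```python
def parallel_metrics(speedup, n_threads):
    """Return (parallel efficiency, fraction of serial run time saved)."""
    efficiency = speedup / n_threads   # speedup per thread
    time_saved = 1.0 - 1.0 / speedup   # 1 - T_parallel / T_serial
    return efficiency, time_saved


eff, saved = parallel_metrics(1.45, 2)
print(f"efficiency = {eff:.1%}, time saved = {saved:.1%}")
```

Under that reading, a speedup of 1.45 on two threads corresponds to an efficiency of 72.5% and a time saving of roughly 31%, which sits inside the quoted 20%–40% range.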
Boltzmann machines as a model for parallel annealing

NARCIS (Netherlands)

Aarts, E.H.L.; Korst, J.H.M.

1991-01-01

The potential of Boltzmann machines to cope with difficult combinatorial optimization problems is investigated. A discussion of various (parallel) models of Boltzmann machines is given based on the theory of Markov chains. A general strategy is presented for solving (approximately) combinatorial
Efficient Parallel Implementation of Active Appearance Model Fitting Algorithm on GPU

Directory of Open Access Journals (Sweden)

Jinwei Wang

2014-01-01

Full Text Available The active appearance model (AAM) is one of the most powerful model-based object detecting and tracking methods which has been widely used in various situations. However, the high-dimensional texture representation causes very time-consuming computations, which makes the AAM difficult to apply to real-time systems. The emergence of modern graphics processing units (GPUs) that feature a many-core, fine-grained parallel architecture provides new and promising solutions to overcome the computational challenge. In this paper, we propose an efficient parallel implementation of the AAM fitting algorithm on GPUs. Our design idea is fine grain parallelism in which we distribute the texture data of the AAM, in pixels, to thousands of parallel GPU threads for processing, which makes the algorithm fit better into the GPU architecture. We implement our algorithm using the compute unified device architecture (CUDA) on the Nvidia’s GTX 650 GPU, which has the latest Kepler architecture. To compare the performance of our algorithm with different data sizes, we built sixteen face AAM models of different dimensional textures. The experiment results show that our parallel AAM fitting algorithm can achieve real-time performance for videos even on very high-dimensional textures.
Evolution of a minimal parallel programming model

International Nuclear Information System (INIS)

Lusk, Ewing; Butler, Ralph; Pieper, Steven C.

2017-01-01

Here, we take a historical approach to our presentation of self-scheduled task parallelism, a programming model with its origins in early irregular and nondeterministic computations encountered in automated theorem proving and logic programming. We show how an extremely simple task model has evolved into a system, asynchronous dynamic load balancing (ADLB), and a scalable implementation capable of supporting sophisticated applications on today’s (and tomorrow’s) largest supercomputers; and we illustrate the use of ADLB with a Green’s function Monte Carlo application, a modern, mature nuclear physics code in production use. Our lesson is that by surrendering a certain amount of generality and thus applicability, a minimal programming model (in terms of its basic concepts and the size of its application programmer interface) can achieve extreme scalability without introducing complexity.
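The core idea of self-scheduled task parallelism, workers repeatedly pulling the next unit of work from a shared pool until it is empty, can be sketched with Python's standard library. This is only an illustration of the concept on one machine; it is not ADLB's API, and the function names are hypothetical. Unlike the system described above, the sketch uses a fixed task pool and does not let running tasks add new ones.

```python
import queue
import threading


def run_self_scheduled(tasks, work_fn, n_workers=4):
    """Process tasks with workers that fetch work themselves (self-scheduling)."""
    pool = queue.Queue()
    for task in tasks:
        pool.put(task)

    results = {}
    lock = threading.Lock()

    def worker():
        while True:
            try:
                task = pool.get_nowait()  # grab the next available task
            except queue.Empty:
                return                    # pool drained: this worker is done
            out = work_fn(task)
            with lock:                    # protect the shared result table
                results[task] = out

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    return results


squares = run_self_scheduled(range(10), lambda n: n * n)
print(squares)
```

Because each worker asks for work only when it is idle, uneven task durations balance themselves without a static assignment of tasks to workers.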
Investigation of Mediational Processes Using Parallel Process Latent Growth Curve Modeling

Science.gov (United States)

Cheong, JeeWon; MacKinnon, David P.; Khoo, Siek Toon

2010-01-01

This study investigated a method to evaluate mediational processes using latent growth curve modeling. The mediator and the outcome measured across multiple time points were viewed as 2 separate parallel processes. The mediational process was defined as the independent variable influencing the growth of the mediator, which, in turn, affected the growth of the outcome. To illustrate modeling procedures, empirical data from a longitudinal drug prevention program, Adolescents Training and Learning to Avoid Steroids, were used. The program effects on the growth of the mediator and the growth of the outcome were examined first in a 2-group structural equation model. The mediational process was then modeled and tested in a parallel process latent growth curve model by relating the prevention program condition, the growth rate factor of the mediator, and the growth rate factor of the outcome. PMID:20157639
Parallel Optimization of 3D Cardiac Electrophysiological Model Using GPU

Directory of Open Access Journals (Sweden)

Yong Xia

2015-01-01

Full Text Available Large-scale 3D virtual heart model simulations are highly demanding in computational resources. This imposes a big challenge to the traditional computation resources based on CPU environment, which already cannot meet the requirement of the whole computation demands or are not easily available due to expensive costs. GPU as a parallel computing environment therefore provides an alternative to solve the large-scale computational problems of whole heart modeling. In this study, using a 3D sheep atrial model as a test bed, we developed a GPU-based simulation algorithm to simulate the conduction of electrical excitation waves in the 3D atria. In the GPU algorithm, a multicellular tissue model was split into two components: one is the single cell model (ordinary differential equation) and the other is the diffusion term of the monodomain model (partial differential equation). Such a decoupling enabled realization of the GPU parallel algorithm. Furthermore, several optimization strategies were proposed based on the features of the virtual heart model, which enabled a 200-fold speedup as compared to a CPU implementation. In conclusion, an optimized GPU algorithm has been developed that provides an economic and powerful platform for 3D whole heart simulations.

A Programming Model for Massive Data Parallelism with Data Dependencies

International Nuclear Information System (INIS)

Cui, Xiaohui; Mueller, Frank; Potok, Thomas E.; Zhang, Yongpeng

2009-01-01

Accelerating processors can often be more cost and energy effective for a wide range of data-parallel computing problems than general-purpose processors. For graphics processor units (GPUs), this is particularly the case when program development is aided by environments such as NVIDIA's Compute Unified Device Architecture (CUDA), which dramatically reduces the gap between domain-specific architectures and general purpose programming. Nonetheless, general-purpose GPU (GPGPU) programming remains subject to several restrictions. Most significantly, the separation of host (CPU) and accelerator (GPU) address spaces requires explicit management of GPU memory resources, especially for massive data parallelism that well exceeds the memory capacity of GPUs. One solution to this problem is to transfer data between the GPU and host memories frequently. In this work, we investigate another approach. We run massively data-parallel applications on GPU clusters. We further propose a programming model for massive data parallelism with data dependencies for this scenario. Experience from micro benchmarks and real-world applications shows that our model provides not only ease of programming but also significant performance gains.
Parallelization of the model-based iterative reconstruction algorithm DIRA

International Nuclear Information System (INIS)

Oertenberg, A.; Sandborg, M.; Alm Carlsson, G.; Malusek, A.; Magnusson, M.

2016-01-01

New paradigms for parallel programming have been devised to simplify software development on multi-core processors and many-core graphical processing units (GPU). Despite their obvious benefits, the parallelization of existing computer programs is not an easy task. In this work, the use of the Open Multiprocessing (OpenMP) and Open Computing Language (OpenCL) frameworks is considered for the parallelization of the model-based iterative reconstruction algorithm DIRA with the aim to significantly shorten the code's execution time. Selected routines were parallelized using OpenMP and OpenCL libraries; some routines were converted from MATLAB to C and optimised. Parallelization of the code with the OpenMP was easy and resulted in an overall speedup of 15 on a 16-core computer. Parallelization with OpenCL was more difficult owing to differences between the central processing unit and GPU architectures. The resulting speedup was substantially lower than the theoretical peak performance of the GPU; the cause was explained. (authors)
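One way to read the reported OpenMP speedup of 15 on a 16-core computer is through Amdahl's law, S = 1 / ((1 - p) + p / n). The law is brought in here as an interpretive aid and is not part of the abstract; the function names are hypothetical.

```python
def amdahl_speedup(p, n):
    """Speedup predicted by Amdahl's law for parallel fraction p on n cores."""
    return 1.0 / ((1.0 - p) + p / n)


def implied_parallel_fraction(speedup, n):
    """Invert Amdahl's law: the parallel fraction that yields this speedup."""
    return (1.0 - 1.0 / speedup) / (1.0 - 1.0 / n)


p = implied_parallel_fraction(15.0, 16)
print(f"implied parallel fraction: {p:.4f}")
print(f"check, speedup on 16 cores: {amdahl_speedup(p, 16):.2f}")
```

Under this idealized model, a speedup of 15 on 16 cores would require about 99.6% of the run time to be parallelizable, ignoring overheads such as thread start-up and memory bandwidth limits.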
A parallelized three-dimensional cellular automaton model for grain growth during additive manufacturing

Science.gov (United States)

Lian, Yanping; Lin, Stephen; Yan, Wentao; Liu, Wing Kam; Wagner, Gregory J.

2018-05-01

In this paper, a parallelized 3D cellular automaton computational model is developed to predict grain morphology for solidification of metal during the additive manufacturing process. Solidification phenomena are characterized by highly localized events, such as the nucleation and growth of multiple grains. As a result, parallelization requires careful treatment of load balancing between processors as well as interprocess communication in order to maintain a high parallel efficiency. We give a detailed summary of the formulation of the model, as well as a description of the communication strategies implemented to ensure parallel efficiency. Scaling tests on a representative problem with about half a billion cells demonstrate parallel efficiency of more than 80% on 8 processors and around 50% on 64; loss of efficiency is attributable to load imbalance due to near-surface grain nucleation in this test problem. The model is further demonstrated through an additive manufacturing simulation with resulting grain structures showing reasonable agreement with those observed in experiments.
Parallel Motion Simulation of Large-Scale Real-Time Crowd in a Hierarchical Environmental Model

Directory of Open Access Journals (Sweden)

Xin Wang

2012-01-01

Full Text Available This paper presents a parallel real-time crowd simulation method based on a hierarchical environmental model. A dynamical model of the complex environment should be constructed to simulate the state transition and propagation of individual motions. By modeling of a virtual environment where virtual crowds reside, we employ different parallel methods on a topological layer, a path layer and a perceptual layer. We propose a parallel motion path matching method based on the path layer and a parallel crowd simulation method based on the perceptual layer. The large-scale real-time crowd simulation becomes possible with these methods. Numerical experiments are carried out to demonstrate the methods and results.
Mechatronic Model Based Computed Torque Control of a Parallel Manipulator

Directory of Open Access Journals (Sweden)

Zhiyong Yang

2008-11-01

Full Text Available Owing to their high speed and accuracy, parallel manipulators are widely applied in industry, but many difficulties remain in the actual control process because of time-varying and coupling effects. Unfortunately, present-day commercial controllers cannot provide satisfactory performance because they offer only single-axis linear control. Therefore, for a novel 2-DOF (Degree of Freedom) parallel manipulator called Diamond 600, a control scheme based on a motor-mechanism coupling dynamic model and employing the computed torque control algorithm is presented in this paper. First, the integrated dynamic coupling model is derived according to the equivalent torques between the mechanical structure and the PM (Permanent Magnetism) servomotor. Second, the computed torque controller is described in detail for the proposed model. Finally, a series of numerical simulations and experiments are carried out to test the effectiveness of the system, and the results verify the favourable tracking ability and robustness.
Parallelization of simulation code for liquid-gas model of lattice-gas fluid

International Nuclear Information System (INIS)

Kawai, Wataru; Ebihara, Kenichi; Kume, Etsuo; Watanabe, Tadashi

2000-03-01

A simulation code for hydrodynamical phenomena, based on the liquid-gas model of lattice-gas fluid, is parallelized using the MPI (Message Passing Interface) library. The parallelized code can be applied to larger simulations than the non-parallelized code. The calculation times of the parallelized code on VPP500 (Vector-Parallel super computer with dispersed memory units), AP3000 (Scalar-parallel server with dispersed memory units), and a workstation cluster decreased in inverse proportion to the number of processors. (author)
Parallel computing of a climate model on the dawn 1000 by domain decomposition method

Science.gov (United States)

Bi, Xunqiang

1997-12-01

In this paper the parallel computing of a grid-point nine-level atmospheric general circulation model on the Dawn 1000 is introduced. The model was developed by the Institute of Atmospheric Physics (IAP), Chinese Academy of Sciences (CAS). The Dawn 1000 is a MIMD massive parallel computer made by National Research Center for Intelligent Computer (NCIC), CAS. A two-dimensional domain decomposition method is adopted to perform the parallel computing. The potential ways to increase the speed-up ratio and exploit more resources of future massively parallel supercomputation are also discussed.
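A two-dimensional domain decomposition assigns each process a rectangular block of the horizontal grid. The following minimal sketch computes the index ranges per process; the grid size (72 by 46 points) and the 4 by 2 process layout are hypothetical and are not taken from the paper, and the function names are illustrative.

```python
def split_1d(n, parts):
    """Split range(n) into `parts` contiguous [start, stop) chunks of near-equal size."""
    base, extra = divmod(n, parts)
    bounds, start = [], 0
    for i in range(parts):
        size = base + (1 if i < extra else 0)  # first `extra` chunks get one more point
        bounds.append((start, start + size))
        start += size
    return bounds


def decompose_2d(nx, ny, px, py):
    """Map each process rank to its ((x0, x1), (y0, y1)) block of an nx-by-ny grid."""
    xs, ys = split_1d(nx, px), split_1d(ny, py)
    return {j * px + i: (xs[i], ys[j]) for j in range(py) for i in range(px)}


blocks = decompose_2d(72, 46, 4, 2)
for rank, ((x0, x1), (y0, y1)) in sorted(blocks.items()):
    print(f"rank {rank}: x[{x0}:{x1}] y[{y0}:{y1}]")
```

In a real grid-point model each block would also carry halo (ghost) rows and columns that are exchanged with neighbouring processes every time step; that exchange is omitted here.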
Modeling and optimization of parallel and distributed embedded systems

CERN Document Server

Munir, Arslan; Ranka, Sanjay

2016-01-01

This book introduces the state-of-the-art in research in parallel and distributed embedded systems, which have been enabled by developments in silicon technology, micro-electro-mechanical systems (MEMS), wireless communications, computer networking, and digital electronics. These systems have diverse applications in domains including military and defense, medical, automotive, and unmanned autonomous vehicles. The emphasis of the book is on the modeling and optimization of emerging parallel and distributed embedded systems in relation to the three key design metrics of performance, power and dependability.
Development Of A Parallel Performance Model For The THOR Neutral Particle Transport Code

Energy Technology Data Exchange (ETDEWEB)

Yessayan, Raffi; Azmy, Yousry; Schunert, Sebastian

2017-02-01

The THOR neutral particle transport code enables simulation of complex geometries for various problems from reactor simulations to nuclear non-proliferation. It is undergoing a thorough V&V requiring computational efficiency. This has motivated various improvements including angular parallelization, outer iteration acceleration, and development of peripheral tools. For guiding future improvements to the code’s efficiency, better characterization of its parallel performance is useful. A parallel performance model (PPM) can be used to evaluate the benefits of modifications and to identify performance bottlenecks. Using INL’s Falcon HPC, the PPM development incorporates an evaluation of network communication behavior over heterogeneous links and a functional characterization of the per-cell/angle/group runtime of each major code component. After evaluating several possible sources of variability, this resulted in a communication model and a parallel portion model. The former’s accuracy is bounded by the variability of communication on Falcon while the latter has an error on the order of 1%.
Stage-by-Stage and Parallel Flow Path Compressor Modeling for a Variable Cycle Engine

Science.gov (United States)

Kopasakis, George; Connolly, Joseph W.; Cheng, Larry

2015-01-01

This paper covers the development of stage-by-stage and parallel flow path compressor modeling approaches for a Variable Cycle Engine. The stage-by-stage compressor modeling approach is an extension of a technique for lumped volume dynamics and performance characteristic modeling. It was developed to improve the accuracy of axial compressor dynamics over lumped volume dynamics modeling. The stage-by-stage compressor model presented here is formulated into a parallel flow path model that includes both axial and rotational dynamics. This is done to enable the study of compressor and propulsion system dynamic performance under flow distortion conditions. The approaches utilized here are generic and should be applicable for the modeling of any axial flow compressor design.
Analysis and Modeling of Circulating Current in Two Parallel-Connected Inverters

DEFF Research Database (Denmark)

Maheshwari, Ram Krishan; Gohil, Ghanshyamsinh Vijaysinh; Bede, Lorand

2015-01-01

Parallel-connected inverters are gaining attention for high power applications because of the limited power handling capability of the power modules. Moreover, the parallel-connected inverters may have low total harmonic distortion of the ac current if they are operated with the interleaved pulse-width modulation (PWM). However, the interleaved PWM causes a circulating current between the inverters, which in turn causes additional losses. A model describing the dynamics of the circulating current is presented in this study which shows that the circulating current depends on the common-mode voltage. Using this model, the circulating current between two parallel-connected inverters is analysed in this study. The peak and root mean square (rms) values of the normalised circulating current are calculated for different PWM methods, which makes this analysis a valuable tool to design a filter for the circulating...
PVeStA: A Parallel Statistical Model Checking and Quantitative Analysis Tool

KAUST Repository

AlTurki, Musab

2011-01-01

Statistical model checking is an attractive formal analysis method for probabilistic systems such as, for example, cyber-physical systems which are often probabilistic in nature. This paper is about drastically increasing the scalability of statistical model checking, and making such scalability of analysis available to tools like Maude, where probabilistic systems can be specified at a high level as probabilistic rewrite theories. It presents PVeStA, an extension and parallelization of the VeStA statistical model checking tool [10]. PVeStA supports statistical model checking of probabilistic real-time systems specified as either: (i) discrete or continuous Markov Chains; or (ii) probabilistic rewrite theories in Maude. Furthermore, the properties that it can model check can be expressed in either: (i) PCTL/CSL, or (ii) the QuaTEx quantitative temporal logic. As our experiments show, the performance gains obtained from parallelization can be very high. © 2011 Springer-Verlag.
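Statistical model checking parallelizes well because simulation runs are independent and can be divided among workers. The sketch below shows that idea with a generic Monte Carlo estimate whose sample size comes from the Chernoff-Hoeffding bound N >= ln(2/delta) / (2 eps^2). It is not PVeStA's algorithm or interface, which the abstract does not detail; the Bernoulli "simulation" with success probability `p_true` is a hypothetical stand-in for running a probabilistic model and checking a property on the trace.

```python
import math
import random
from concurrent.futures import ThreadPoolExecutor


def required_samples(eps, delta):
    """Samples needed so that |estimate - p| < eps with probability >= 1 - delta."""
    return math.ceil(math.log(2.0 / delta) / (2.0 * eps ** 2))


def count_hits(n, seed, p_true):
    """Run n independent stand-in simulations; count those satisfying the property."""
    rng = random.Random(seed)  # one private generator per batch
    return sum(1 for _ in range(n) if rng.random() < p_true)


def estimate(eps, delta, p_true=0.3, workers=4):
    """Split the required samples into batches and combine the hit counts."""
    n = required_samples(eps, delta)
    batches = [n // workers + (1 if i < n % workers else 0) for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        hits = sum(pool.map(count_hits, batches, range(workers), [p_true] * workers))
    return hits / n, n


p_hat, n_used = estimate(eps=0.01, delta=0.05)
print(f"estimate {p_hat:.4f} from {n_used} samples")
```

Threads are used only to keep the sketch self-contained; CPU-bound simulations in Python would normally be spread across processes or machines instead.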
A simple and efficient parallel FFT algorithm using the BSP model

NARCIS (Netherlands)

Bisseling, R.H.; Inda, M.A.

2000-01-01

In this paper we present a new parallel radix FFT algorithm based on the BSP model. Our parallel algorithm uses the group-cyclic distribution family, which makes it simple to understand and easy to implement. We show how to reduce the communication cost of the algorithm by a factor of three in the case
A new model for reliability optimization of series-parallel systems with non-homogeneous components

International Nuclear Information System (INIS)

Feizabadi, Mohammad; Jahromi, Abdolhamid Eshraghniaye

2017-01-01

In discussions related to reliability optimization using redundancy allocation, one of the structures that has attracted the attention of many researchers, is series-parallel structure. In models previously presented for reliability optimization of series-parallel systems, there is a restricting assumption based on which all components of a subsystem must be homogeneous. This constraint limits system designers in selecting components and prevents achieving higher levels of reliability. In this paper, a new model is proposed for reliability optimization of series-parallel systems, which makes possible the use of non-homogeneous components in each subsystem. As a result of this flexibility, the process of supplying system components will be easier. To solve the proposed model, since the redundancy allocation problem (RAP) belongs to the NP-hard class of optimization problems, a genetic algorithm (GA) is developed. The computational results of the designed GA are indicative of high performance of the proposed model in increasing system reliability and decreasing costs. - Highlights: • In this paper, a new model is proposed for reliability optimization of series-parallel systems. • In the previous models, there is a restricting assumption based on which all components of a subsystem must be homogeneous. • The presented model provides a possibility for the subsystems’ components to be non- homogeneous in the required conditions. • The computational results demonstrate the high performance of the proposed model in improving reliability and reducing costs.
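For reference, the objective evaluated in such redundancy allocation problems can be sketched as follows. The sketch assumes active redundancy and statistically independent component failures, which are common textbook assumptions and not necessarily the exact model of the paper; the component reliabilities are made-up values. Each inner list is one subsystem, and its components may differ from one another (non-homogeneous).

```python
from math import prod


def subsystem_reliability(component_reliabilities):
    """A parallel subsystem works if at least one of its components works."""
    return 1.0 - prod(1.0 - r for r in component_reliabilities)


def system_reliability(subsystems):
    """Subsystems are in series, so all of them must work."""
    return prod(subsystem_reliability(s) for s in subsystems)


# Hypothetical design: three subsystems with mixed component types.
design = [[0.9, 0.8], [0.95], [0.7, 0.7, 0.6]]
R = system_reliability(design)
print(f"system reliability: {R:.6f}")
```

A genetic algorithm such as the one mentioned above would search over candidate designs like `design`, scoring each with a function of this kind subject to cost or weight constraints.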
Reduced-Order Structure-Preserving Model for Parallel-Connected Three-Phase Grid-Tied Inverters: Preprint

Energy Technology Data Exchange (ETDEWEB)

Johnson, Brian B [National Renewable Energy Laboratory (NREL), Golden, CO (United States); Purba, Victor [University of Minnesota; Jafarpour, Saber [University of California, Santa Barbara; Bullo, Francesco [University of California, Santa Barbara; Dhople, Sairaj [University of Minnesota

2017-08-31

Given that next-generation infrastructures will contain large numbers of grid-connected inverters and these interfaces will be satisfying a growing fraction of system load, it is imperative to analyze the impacts of power electronics on such systems. However, since each inverter model has a relatively large number of dynamic states, it would be impractical to execute complex system models where the full dynamics of each inverter are retained. To address this challenge, we derive a reduced-order structure-preserving model for parallel-connected grid-tied three-phase inverters. Here, each inverter in the system is assumed to have a full-bridge topology, LCL filter at the point of common coupling, and the control architecture for each inverter includes a current controller, a power controller, and a phase-locked loop for grid synchronization. We outline a structure-preserving reduced-order inverter model for the setting where the parallel inverters are each designed such that the filter components and controller gains scale linearly with the power rating. By structure preserving, we mean that the reduced-order three-phase inverter model is also composed of an LCL filter, a power controller, current controller, and PLL. That is, we show that the system of parallel inverters can be modeled exactly as one aggregated inverter unit and this equivalent model has the same number of dynamical states as an individual inverter in the paralleled system. Numerical simulations validate the reduced-order models.
Reduced-Order Structure-Preserving Model for Parallel-Connected Three-Phase Grid-Tied Inverters

Energy Technology Data Exchange (ETDEWEB)

Johnson, Brian B [National Renewable Energy Laboratory (NREL), Golden, CO (United States); Purba, Victor [University of Minnesota; Jafarpour, Saber [University of California Santa-Barbara; Bullo, Francesco [University of California Santa-Barbara; Dhople, Sairaj V. [University of Minnesota

2017-08-21

Next-generation power networks will contain large numbers of grid-connected inverters satisfying a significant fraction of system load. Since each inverter model has a relatively large number of dynamic states, it is impractical to analyze complex system models where the full dynamics of each inverter are retained. To address this challenge, we derive a reduced-order structure-preserving model for parallel-connected grid-tied three-phase inverters. Here, each inverter in the system is assumed to have a full-bridge topology, LCL filter at the point of common coupling, and the control architecture for each inverter includes a current controller, a power controller, and a phase-locked loop for grid synchronization. We outline a structure-preserving reduced-order inverter model with lumped parameters for the setting where the parallel inverters are each designed such that the filter components and controller gains scale linearly with the power rating. By structure preserving, we mean that the reduced-order three-phase inverter model is also composed of an LCL filter, a power controller, current controller, and PLL. We show that the system of parallel inverters can be modeled exactly as one aggregated inverter unit and this equivalent model has the same number of dynamical states as any individual inverter in the system. Numerical simulations validate the reduced-order model.
Parallel computation for biological sequence comparison: comparing a portable model to the native model for the Intel Hypercube.

Science.gov (United States)

Nadkarni, P M; Miller, P L

1991-01-01

A parallel program for inter-database sequence comparison was developed on the Intel Hypercube using two models of parallel programming. One version was built using machine-specific Hypercube parallel programming commands. The other version was built using Linda, a machine-independent parallel programming language. The two versions of the program provide a case study comparing these two approaches to parallelization in an important biological application area. Benchmark tests with both programs gave comparable results with a small number of processors. As the number of processors was increased, the Linda version was somewhat less efficient. The Linda version was also run without change on Network Linda, a virtual parallel machine running on a network of desktop workstations.
New Parallel Algorithms for Landscape Evolution Model

Science.gov (United States)

Jin, Y.; Zhang, H.; Shi, Y.

2017-12-01

Most landscape evolution models (LEM) developed in the last two decades solve the diffusion equation to simulate the transportation of surface sediments. This numerical approach is difficult to parallelize because the drainage area must be computed for each node, which requires a huge amount of communication if run in parallel. In order to overcome this difficulty, we developed two parallel algorithms for LEM with a stream net. One algorithm handles the partition of the grid with traditional methods and applies an efficient global reduction algorithm to compute the drainage areas and transport rates for the stream net; the other algorithm is based on a new partition algorithm, which first partitions the nodes in catchments between processes and then partitions the cells according to the partition of nodes. Both methods focus on decreasing communication between processes and take advantage of massive computing techniques, and numerical experiments show that they are both adequate to handle large-scale problems with millions of cells. We implemented the two algorithms in our program based on the widely used finite element library deal.II, so that it can be easily coupled with ASPECT.
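The reason drainage area is awkward to parallelize is that each node's value depends on every node upstream of it. A minimal sequential sketch makes the dependency explicit. It assumes single-direction flow, where each node drains to exactly one receiver and outlets point to themselves; the data layout and names are hypothetical and are not the authors' implementation.

```python
def drainage_area(receiver, cell_area):
    """Accumulate drainage area downstream along single-receiver flow paths.

    receiver[i] is the node that node i drains to; an outlet has receiver[i] == i.
    """
    n = len(receiver)
    pending_donors = [0] * n
    for i, r in enumerate(receiver):
        if r != i:
            pending_donors[r] += 1

    area = list(cell_area)
    # Start from nodes with no upstream donors and sweep downstream.
    ready = [i for i in range(n) if pending_donors[i] == 0]
    while ready:
        i = ready.pop()
        r = receiver[i]
        if r != i:
            area[r] += area[i]
            pending_donors[r] -= 1
            if pending_donors[r] == 0:  # all upstream contributions received
                ready.append(r)
    return area


# Tiny network: 0 -> 1 -> 2 (outlet), 4 -> 3 -> 2
areas = drainage_area([1, 2, 2, 2, 3], [1.0] * 5)
print(areas)
```

When the grid is split across processes, a flow path can cross partition boundaries many times, so this downstream sweep turns into inter-process communication. Partitioning by catchment keeps whole flow paths on one process.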

A model of breakdown in parallel-plate detectors

International Nuclear Information System (INIS)

Fonte, P.

1996-01-01

Parallel-plate avalanche chambers (PPAC's) have many desirable properties, such as a fast, large area particle detector. However, the maximum gain is limited by a form of violent breakdown that limits the usefulness of this detector, despite its other evident qualities. The exact nature of this phenomenon is not yet sufficiently clear to sustain possible improvements. A previous experimental study is complemented in the present work by a quantitative model of the breakdown phenomenon in PPAC's, based on the streamer theory. The model reproduces well the peculiar behavior of the external current observed in PPAC's and resistive-plate chambers. Other breakdown properties measured in PPAC's are also well reproduced
Modelling distribution of evaporating CO2 in parallel minichannels

DEFF Research Database (Denmark)

Brix, Wiebke; Kærn, Martin Ryhl; Elmegaard, Brian

2010-01-01

The effects of airflow non-uniformity and uneven inlet qualities on the performance of a minichannel evaporator with parallel channels, using CO2 as refrigerant, are investigated numerically. For this purpose a one-dimensional discretised steady-state model was developed, applying well-known empi...... to maldistribution of the refrigerant and considerable capacity reduction of the evaporator. Uneven inlet qualities to the different channels show only minor effects on the refrigerant distribution and evaporator capacity as long as the channels are vertically oriented with CO2 flowing upwards. For horizontal channels capacity reductions are found for both non-uniform airflow and uneven inlet qualities. For horizontal minichannels the results are very similar to those obtained using R134a as refrigerant.
Optimal parallel algorithms for problems modeled by a family of intervals

Science.gov (United States)

Olariu, Stephan; Schwing, James L.; Zhang, Jingyuan

1992-01-01

A family of intervals on the real line provides a natural model for a vast number of scheduling and VLSI problems. Recently, a number of parallel algorithms to solve a variety of practical problems on such a family of intervals have been proposed in the literature. Computational tools are developed, and it is shown how they can be used for the purpose of devising cost-optimal parallel algorithms for a number of interval-related problems including finding a largest subset of pairwise nonoverlapping intervals, a minimum dominating subset of intervals, along with algorithms to compute the shortest path between a pair of intervals and, based on the shortest path, a parallel algorithm to find the center of the family of intervals. More precisely, with an arbitrary family of n intervals as input, all algorithms run in O(log n) time using O(n) processors in the EREW-PRAM model of computation.
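To make the first of these interval problems concrete, here is a minimal sequential sketch in Python of finding a largest subset of pairwise nonoverlapping intervals. It is not the cost-optimal EREW-PRAM algorithm of the paper; it is the classical greedy scan by right endpoint, shown only to illustrate what is being computed. The function name `max_nonoverlapping` and the convention that intervals sharing an endpoint count as overlapping are assumptions made for this example.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # (left endpoint, right endpoint), closed


def max_nonoverlapping(intervals: List[Interval]) -> List[Interval]:
    """Return a largest subset of pairwise nonoverlapping intervals.

    Sequential greedy baseline (hypothetical helper, not the parallel
    algorithm from the abstract): scan by increasing right endpoint and
    keep an interval when it starts strictly after the last kept one ends.
    """
    chosen: List[Interval] = []
    last_end = float("-inf")
    for left, right in sorted(intervals, key=lambda iv: iv[1]):
        if left > last_end:  # touching endpoints are treated as overlapping
            chosen.append((left, right))
            last_end = right
    return chosen


if __name__ == "__main__":
    demo = [(1, 4), (2, 3), (3, 5), (6, 8), (7, 9)]
    print(max_nonoverlapping(demo))
```

The sort dominates the running time, so this baseline takes O(n log n) sequential time, which is the cost that an O(log n)-time, O(n)-processor parallel algorithm matches in total work.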
Verification of Electromagnetic Physics Models for Parallel Computing Architectures in the GeantV Project

Energy Technology Data Exchange (ETDEWEB)

Amadio, G.; et al.

2017-11-22

An intensive R&D and programming effort is required to accomplish new challenges posed by future experimental high-energy particle physics (HEP) programs. The GeantV project aims to narrow the gap between the performance of the existing HEP detector simulation software and the ideal performance achievable, exploiting latest advances in computing technology. The project has developed a particle detector simulation prototype capable of transporting in parallel particles in complex geometries exploiting instruction level microparallelism (SIMD and SIMT), task-level parallelism (multithreading) and high-level parallelism (MPI), leveraging both the multi-core and the many-core opportunities. We present preliminary verification results concerning the electromagnetic (EM) physics models developed for parallel computing architectures within the GeantV project. In order to exploit the potential of vectorization and accelerators and to make the physics model effectively parallelizable, advanced sampling techniques have been implemented and tested. In this paper we introduce a set of automated statistical tests in order to verify the vectorized models by checking their consistency with the corresponding Geant4 models and to validate them against experimental data.
Dynamic modelling of a 3-CPU parallel robot via screw theory

Directory of Open Access Journals (Sweden)

L. Carbonari

2013-04-01

The article describes the dynamic modelling of I.Ca.Ro., a novel Cartesian parallel robot recently designed and prototyped by the robotics research group of the Polytechnic University of Marche. By means of screw theory and virtual work principle, a computationally efficient model has been built, with the final aim of realising advanced model based controllers. Then a dynamic analysis has been performed in order to point out possible model simplifications that could lead to a more efficient run time implementation.
PARAMO: a PARAllel predictive MOdeling platform for healthcare analytic research using electronic health records.

Science.gov (United States)

Ng, Kenney; Ghoting, Amol; Steinhubl, Steven R; Stewart, Walter F; Malin, Bradley; Sun, Jimeng

2014-04-01

Healthcare analytics research increasingly involves the construction of predictive models for disease targets across varying patient cohorts using electronic health records (EHRs). To facilitate this process, it is critical to support a pipeline of tasks: (1) cohort construction, (2) feature construction, (3) cross-validation, (4) feature selection, and (5) classification. To develop an appropriate model, it is necessary to compare and refine models derived from a diversity of cohorts, patient-specific features, and statistical frameworks. The goal of this work is to develop and evaluate a predictive modeling platform that can be used to simplify and expedite this process for health data. To support this goal, we developed a PARAllel predictive MOdeling (PARAMO) platform which (1) constructs a dependency graph of tasks from specifications of predictive modeling pipelines, (2) schedules the tasks in a topological ordering of the graph, and (3) executes those tasks in parallel. We implemented this platform using Map-Reduce to enable independent tasks to run in parallel in a cluster computing environment. Different task scheduling preferences are also supported. We assess the performance of PARAMO on various workloads using three datasets derived from the EHR systems in place at Geisinger Health System and Vanderbilt University Medical Center and an anonymous longitudinal claims database. We demonstrate significant gains in computational efficiency against a standard approach. In particular, PARAMO can build 800 different models on a 300,000 patient data set in 3 h in parallel compared to 9 days if running sequentially. This work demonstrates that an efficient parallel predictive modeling platform can be developed for EHR data. This platform can facilitate large-scale modeling endeavors and speed up the research workflow and reuse of health information. This platform is only a first step and provides the foundation for our ultimate goal of building analytic pipelines
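The three platform steps named in this abstract (build a dependency graph, schedule in topological order, run independent tasks in parallel) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses the standard-library `graphlib.TopologicalSorter` (Python 3.9+) and a local thread pool rather than Map-Reduce on a cluster, and the function `run_pipeline` and the task names in the demo are hypothetical, not part of PARAMO.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from graphlib import TopologicalSorter


def run_pipeline(deps, actions, max_workers=4):
    """Run tasks in dependency order, executing ready tasks concurrently.

    deps maps a task name to the set of task names it depends on.
    actions maps a task name to a zero-argument callable.
    Returns the task names in the order they finished.
    """
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    finished = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        running = {}  # future -> task name
        while sorter.is_active():
            # Submit every task whose prerequisites are all done.
            for task in sorter.get_ready():
                running[pool.submit(actions[task])] = task
            # Block until at least one running task completes.
            done, _ = wait(running, return_when=FIRST_COMPLETED)
            for future in done:
                future.result()  # re-raise any exception from the task
                task = running.pop(future)
                finished.append(task)
                sorter.done(task)  # may unlock dependent tasks
    return finished


if __name__ == "__main__":
    # Hypothetical pipeline: two feature sets built from one cohort.
    deps = {
        "cohort": set(),
        "features_a": {"cohort"},
        "features_b": {"cohort"},
        "classification": {"features_a", "features_b"},
    }
    actions = {name: (lambda n=name: print("running", n)) for name in deps}
    print(run_pipeline(deps, actions))
```

In this demo `features_a` and `features_b` become ready together once `cohort` finishes, which is the kind of independence between tasks that a parallel scheduler can exploit.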
Analysis of clinical complication data for radiation hepatitis using a parallel architecture model

International Nuclear Information System (INIS)

Jackson, A.; Haken, R.K. ten; Robertson, J.M.; Kessler, M.L.; Kutcher, G.J.; Lawrence, T.S.

1995-01-01

Purpose: The detailed knowledge of dose volume distributions available from the three-dimensional (3D) conformal radiation treatment of tumors in the liver (reported elsewhere) offers new opportunities to quantify the effect of volume on the probability of producing radiation hepatitis. We aim to test a new parallel architecture model of normal tissue complication probability (NTCP) with these data. Methods and Materials: Complication data and dose volume histograms from a total of 93 patients with normal liver function, treated on a prospective protocol with 3D conformal radiation therapy and intraarterial hepatic fluorodeoxyuridine, were analyzed with a new parallel architecture model. Patient treatment fell into six categories differing in doses delivered and volumes irradiated. By modeling the radiosensitivity of liver subunits, we are able to use dose volume histograms to calculate the fraction of the liver damaged in each patient. A complication results if this fraction exceeds the patient's functional reserve. To determine the patient distribution of functional reserves and the subunit radiosensitivity, the maximum likelihood method was used to fit the observed complication data. Results: The parallel model fit the complication data well, although uncertainties on the functional reserve distribution and subunit radiosensitivity are highly correlated. Conclusion: The observed radiation hepatitis complications show a threshold effect that can be described well with a parallel architecture model. However, additional independent studies are required to better determine the parameters defining the functional reserve distribution and subunit radiosensitivity
Fast parallel algorithm for three-dimensional distance-driven model in iterative computed tomography reconstruction

International Nuclear Information System (INIS)

Chen Jian-Lin; Li Lei; Wang Lin-Yuan; Cai Ai-Long; Xi Xiao-Qi; Zhang Han-Ming; Li Jian-Xin; Yan Bin

2015-01-01

The projection matrix model is used to describe the physical relationship between reconstructed object and projection. Such a model has a strong influence on projection and backprojection, two vital operations in iterative computed tomographic reconstruction. The distance-driven model (DDM) is a state-of-the-art technology that simulates forward and back projections. This model has a low computational complexity and a relatively high spatial resolution; however, it includes only a few methods in a parallel operation with a matched model scheme. This study introduces a fast and parallelizable algorithm to improve the traditional DDM for computing the parallel projection and backprojection operations. Our proposed model has been implemented on a GPU (graphic processing unit) platform and has achieved satisfactory computational efficiency with no approximation. The runtime for the projection and backprojection operations with our model is approximately 4.5 s and 10.5 s per loop, respectively, with an image size of 256×256×256 and 360 projections with a size of 512×512. We compare several general algorithms that have been proposed for maximizing GPU efficiency by using the unmatched projection/backprojection models in a parallel computation. The imaging resolution is not sacrificed and remains accurate during computed tomographic reconstruction. (paper)
Parallel shooting methods for finding steady state solutions to engine simulation models

DEFF Research Database (Denmark)

Andersen, Stig Kildegård; Thomsen, Per Grove; Carlsen, Henrik

2007-01-01

Parallel single- and multiple shooting methods were tested for finding periodic steady state solutions to a Stirling engine model. The model was used to illustrate features of the methods and possibilities for optimisations. Performance was measured using simulation of an experimental data set...
Parallelization Experience with Four Canonical Econometric Models Using ParMitISEM

Directory of Open Access Journals (Sweden)

Nalan Baştürk

2016-03-01

This paper presents the parallel computing implementation of the MitISEM algorithm, labeled Parallel MitISEM. The basic MitISEM algorithm provides an automatic and flexible method to approximate a non-elliptical target density using adaptive mixtures of Student-t densities, where only a kernel of the target density is required. The approximation can be used as a candidate density in Importance Sampling or Metropolis-Hastings methods for Bayesian inference on model parameters and probabilities. We present and discuss four canonical econometric models using a Graphics Processing Unit and a multi-core Central Processing Unit version of the MitISEM algorithm. The results show that the parallelization of the MitISEM algorithm on Graphics Processing Units and multi-core Central Processing Units is straightforward and fast to program using MATLAB. Moreover, the speed performance of the Graphics Processing Unit version is much higher than that of the Central Processing Unit version.
A Predator-Prey Model Based on the Fully Parallel Cellular Automata

Science.gov (United States)

He, Mingfeng; Ruan, Hongbo; Yu, Changliang

We presented a predator-prey lattice model containing moveable wolves and sheep, which are characterized by Penna double bit strings. Sexual reproduction and child-care strategies are considered. To implement this model in an efficient way, we build a fully parallel Cellular Automata based on a new definition of the neighborhood. We show the roles played by the initial densities of the populations, the mutation rate and the linear size of the lattice in the evolution of this model.
Connectionist Models and Parallelism in High Level Vision.

Science.gov (United States)

1985-01-01

Jerome A. Feldman; grant N00014-82-K-0193; Computer Science...Connectionist Models 2.1 Background and Overview: Computer science is just beginning to look seriously at parallel computation: it may turn out that...the chair. The program includes intermediate level networks that compute more complex joints and ones that compute parallelograms in the image. These
Teaching Scientific Computing: A Model-Centered Approach to Pipeline and Parallel Programming with C

Directory of Open Access Journals (Sweden)

Vladimiras Dolgopolovas

2015-01-01

The aim of this study is to present an approach to the introduction to pipeline and parallel computing, using a model of the multiphase queueing system. Pipeline computing, including software pipelines, is among the key concepts in modern computing and electronics engineering. Modern computer science and engineering education requires a comprehensive curriculum, so the introduction to pipeline and parallel computing is an essential topic to be included in the curriculum. At the same time, the topic is among the most motivating tasks due to the comprehensive multidisciplinary and technical requirements. To enhance the educational process, the paper proposes a novel model-centered framework and develops the relevant learning objects. It allows implementing an educational platform of constructivist learning process, thus enabling learners’ experimentation with the provided programming models, obtaining learners’ competences of the modern scientific research and computational thinking, and capturing the relevant technical knowledge. It also provides an integral platform that allows a simultaneous and comparative introduction to pipelining and parallel computing. The programming language C for developing programming models and the message passing interface (MPI) and OpenMP parallelization tools have been chosen for implementation.
Toward a model framework of generalized parallel componential processing of multi-symbol numbers.

Science.gov (United States)

Huber, Stefan; Cornelsen, Sonja; Moeller, Korbinian; Nuerk, Hans-Christoph

2015-05-01

In this article, we propose and evaluate a new model framework of parallel componential multi-symbol number processing, generalizing the idea of parallel componential processing of multi-digit numbers to the case of negative numbers by considering the polarity signs similar to single digits. In a first step, we evaluated this account by defining and investigating a sign-decade compatibility effect for the comparison of positive and negative numbers, which extends the unit-decade compatibility effect in 2-digit number processing. Then, we evaluated whether the model is capable of accounting for previous findings in negative number processing. In a magnitude comparison task, in which participants had to single out the larger of 2 integers, we observed a reliable sign-decade compatibility effect with prolonged reaction times for incompatible (e.g., -97 vs. +53; in which the number with the larger decade digit has the smaller, i.e., negative polarity sign) as compared with sign-decade compatible number pairs (e.g., -53 vs. +97). Moreover, an analysis of participants' eye fixation behavior corroborated our model of parallel componential processing of multi-symbol numbers. These results are discussed in light of concurrent theoretical notions about negative number processing. On the basis of the present results, we propose a generalized integrated model framework of parallel componential multi-symbol processing. (c) 2015 APA, all rights reserved.
New physics beyond the standard model of particle physics and parallel universes

Energy Technology Data Exchange (ETDEWEB)

Plaga, R. [Franzstr. 40, 53111 Bonn (Germany)]. E-mail: rainer.plaga@gmx.de

2006-03-09

It is shown that if-and only if-'parallel universes' exist, an electroweak vacuum that is expected to have decayed since the big bang with a high probability might exist. It would neither necessarily render our existence unlikely nor could it be observed. In this special case the observation of certain combinations of Higgs-boson and top-quark masses-for which the standard model predicts such a decay-cannot be interpreted as evidence for new physics at low energy scales. The question of whether parallel universes exist is of interest to our understanding of the standard model of particle physics.
A Hybrid Parallel Execution Model for Logic Based Requirement Specifications (Invited Paper)

Directory of Open Access Journals (Sweden)

Jeffrey J. P. Tsai

1999-05-01

It is well known that undiscovered errors in a requirements specification are extremely expensive to fix when discovered in the software maintenance phase. Errors in the requirements phase can be reduced through the validation and verification of the requirements specification. Many logic-based requirements specification languages have been developed to achieve these goals. However, the execution of and reasoning about a logic-based requirements specification can be very slow. An effective way to improve performance is to execute and reason about the logic-based requirements specification in parallel. In this paper, we present a hybrid model to facilitate the parallel execution of a logic-based requirements specification language. A logic-based specification is first processed by a data dependency analysis technique which can find all the mode combinations that exist within a specification clause. This mode information is used to support a novel hybrid parallel execution model, which combines both top-down and bottom-up evaluation strategies. This new execution model can find the failure in the deepest node of the search tree at an early stage of the evaluation; it can thus reduce the total number of nodes searched in the tree, the total number of processes that need to be generated, and the total number of communication channels needed in the search process. A simulator has been implemented to analyze the execution behavior of the new model. Experiments show significant improvement based on several criteria.
Vlasov modelling of parallel transport in a tokamak scrape-off layer

International Nuclear Information System (INIS)

Manfredi, G; Hirstoaga, S; Devaux, S

2011-01-01

A one-dimensional Vlasov-Poisson model is used to describe the parallel transport in a tokamak scrape-off layer. Thanks to a recently developed 'asymptotic-preserving' numerical scheme, it is possible to lift numerical constraints on the time step and grid spacing, which are no longer limited by, respectively, the electron plasma period and Debye length. The Vlasov approach provides a good velocity-space resolution even in regions of low density. The model is applied to the study of parallel transport during edge-localized modes, with particular emphasis on the particles and energy fluxes on the divertor plates. The numerical results are compared with analytical estimates based on a free-streaming model, with good general agreement. An interesting feature is the observation of an early electron energy flux, due to suprathermal electrons escaping the ions' attraction. In contrast, the long-time evolution is essentially quasi-neutral and dominated by the ion dynamics.
Vlasov modelling of parallel transport in a tokamak scrape-off layer

Energy Technology Data Exchange (ETDEWEB)

Manfredi, G [Institut de Physique et Chimie des Materiaux, CNRS and Universite de Strasbourg, BP 43, F-67034 Strasbourg (France); Hirstoaga, S [INRIA Nancy Grand-Est and Institut de Recherche en Mathematiques Avancees, 7 rue Rene Descartes, F-67084 Strasbourg (France); Devaux, S, E-mail: Giovanni.Manfredi@ipcms.u-strasbg.f, E-mail: hirstoaga@math.unistra.f, E-mail: Stephane.Devaux@ccfe.ac.u [JET-EFDA, Culham Science Centre, Abingdon, OX14 3DB (United Kingdom)

2011-01-15

A one-dimensional Vlasov-Poisson model is used to describe the parallel transport in a tokamak scrape-off layer. Thanks to a recently developed 'asymptotic-preserving' numerical scheme, it is possible to lift numerical constraints on the time step and grid spacing, which are no longer limited by, respectively, the electron plasma period and Debye length. The Vlasov approach provides a good velocity-space resolution even in regions of low density. The model is applied to the study of parallel transport during edge-localized modes, with particular emphasis on the particles and energy fluxes on the divertor plates. The numerical results are compared with analytical estimates based on a free-streaming model, with good general agreement. An interesting feature is the observation of an early electron energy flux, due to suprathermal electrons escaping the ions' attraction. In contrast, the long-time evolution is essentially quasi-neutral and dominated by the ion dynamics.
Exploiting Thread Parallelism for Ocean Modeling on Cray XC Supercomputers

Energy Technology Data Exchange (ETDEWEB)

Sarje, Abhinav [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Jacobsen, Douglas W. [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Williams, Samuel W. [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Ringler, Todd [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Oliker, Leonid [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)

2016-05-01

The incorporation of increasing core counts in modern processors used to build state-of-the-art supercomputers is driving application development towards exploitation of thread parallelism, in addition to distributed memory parallelism, with the goal of delivering efficient high-performance codes. In this work we describe the exploitation of threading and our experiences with it with respect to a real-world ocean modeling application code, MPAS-Ocean. We present detailed performance analysis and comparisons of various approaches and configurations for threading on the Cray XC series supercomputers.
HPC parallel programming model for gyrokinetic MHD simulation

International Nuclear Information System (INIS)

Naitou, Hiroshi; Yamada, Yusuke; Tokuda, Shinji; Ishii, Yasutomo; Yagi, Masatoshi

2011-01-01

The 3-dimensional gyrokinetic PIC (particle-in-cell) code for MHD simulation, Gpic-MHD, was installed on SR16000 (“Plasma Simulator”), which is a scalar cluster system consisting of 8,192 logical cores. The Gpic-MHD code advances particle and field quantities in time. In order to distribute calculations over large number of logical cores, the total simulation domain in cylindrical geometry was broken up into $N_{\text{DD-}r} \times N_{\text{DD-}z}$ (number of radial decomposition times number of axial decomposition) small domains including approximately the same number of particles. The axial direction was uniformly decomposed, while the radial direction was non-uniformly decomposed. $N_{\text{RP}}$ replicas (copies) of each decomposed domain were used (“particle decomposition”). The hybrid parallelization model of multi-threads and multi-processes was employed: threads were parallelized by the auto-parallelization and $N_{\text{DD-}r} \times N_{\text{DD-}z} \times N_{\text{RP}}$ processes were parallelized by MPI (message-passing interface). The parallelization performance of Gpic-MHD was investigated for the medium size system of $N_r \times N_\theta \times N_z = 1025 \times 128 \times 128$ mesh with 4.196 or 8.192 billion particles. The highest speed for the fixed number of logical cores was obtained for two threads, the maximum number of $N_{\text{DD-}z}$, and optimum combination of $N_{\text{DD-}r}$ and $N_{\text{RP}}$. The observed optimum speeds demonstrated good scaling up to 8,192 logical cores. (author)

PARALLEL ADAPTIVE MULTILEVEL SAMPLING ALGORITHMS FOR THE BAYESIAN ANALYSIS OF MATHEMATICAL MODELS

KAUST Repository

Prudencio, Ernesto; Cheung, Sai Hung

2012-01-01

In recent years, Bayesian model updating techniques based on measured data have been applied to many engineering and applied science problems. At the same time, parallel computational platforms are becoming increasingly more powerful and are being used more frequently by the engineering and scientific communities. Bayesian techniques usually require the evaluation of multi-dimensional integrals related to the posterior probability density function (PDF) of uncertain model parameters. The fact that such integrals cannot be computed analytically motivates the research of stochastic simulation methods for sampling posterior PDFs. One such algorithm is the adaptive multilevel stochastic simulation algorithm (AMSSA). In this paper we discuss the parallelization of AMSSA, formulating the necessary load balancing step as a binary integer programming problem. We present a variety of results showing the effectiveness of load balancing on the overall performance of AMSSA in a parallel computational environment.
Efficient Out of Core Sorting Algorithms for the Parallel Disks Model.

Science.gov (United States)

Kundeti, Vamsi; Rajasekaran, Sanguthevar

2011-11-01

In this paper we present efficient algorithms for sorting on the Parallel Disks Model (PDM). Numerous asymptotically optimal algorithms have been proposed in the literature. However many of these merge based algorithms have large underlying constants in the time bounds, because they suffer from the lack of read parallelism on PDM. The irregular consumption of the runs during the merge affects the read parallelism and contributes to the increased sorting time. In this paper we first introduce a novel idea called the dirty sequence accumulation that improves the read parallelism. Secondly, we show analytically that this idea can reduce the number of parallel I/O's required to sort the input close to the lower bound of [Formula: see text]. We experimentally verify our dirty sequence idea with the standard R-Way merge and show that our idea can reduce the number of parallel I/Os to sort on PDM significantly.
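The irregular consumption of runs that this abstract mentions can be seen in a small in-memory sketch of a standard R-way merge. The Python below is only an illustration: it merges lists held in memory rather than blocks striped across parallel disks, it does not implement the dirty sequence accumulation idea, and the function name `r_way_merge` and its `read_order` return value are hypothetical names chosen for this example.

```python
import heapq
from typing import List, Sequence, Tuple


def r_way_merge(runs: Sequence[Sequence[int]]) -> Tuple[List[int], List[int]]:
    """Merge R sorted runs with a min-heap.

    Returns (merged, read_order), where read_order[i] is the index of the
    run that supplied merged[i]. Runs are assumed to be sorted and to
    contain integers only.
    """
    iters = [iter(run) for run in runs]
    heap = []
    # Seed the heap with the first key of every non-empty run.
    for idx, it in enumerate(iters):
        first = next(it, None)
        if first is not None:
            heap.append((first, idx))
    heapq.heapify(heap)

    merged: List[int] = []
    read_order: List[int] = []
    while heap:
        key, idx = heapq.heappop(heap)  # smallest key among run heads
        merged.append(key)
        read_order.append(idx)
        nxt = next(iters[idx], None)  # refill from the same run
        if nxt is not None:
            heapq.heappush(heap, (nxt, idx))
    return merged, read_order


if __name__ == "__main__":
    merged, read_order = r_way_merge([[1, 2, 3, 10], [4, 11], [5, 6, 12]])
    print(merged)      # the sorted output
    print(read_order)  # which run each output key came from
```

In the demo, the first three output keys all come from run 0 before any other run is touched. If each run lived on a different disk, those steps would read from a single disk at a time, which is the loss of read parallelism that the abstract describes.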
Improved modelling of a parallel plate active magnetic regenerator

International Nuclear Information System (INIS)

Engelbrecht, K; Nielsen, K K; Bahl, C R H; Tušek, J; Kitanovski, A; Poredoš, A

2013-01-01

Much of the active magnetic regenerator (AMR) modelling presented in the literature considers only the solid and fluid domains of the regenerator and ignores other physical effects that have been shown to be important, such as demagnetizing fields in the regenerator, parasitic heat losses and fluid flow maldistribution in the regenerator. This paper studies the effects of these loss mechanisms and compares theoretical results with experimental results obtained on an experimental AMR device. Three parallel plate regenerators were tested, each having different demagnetizing field characteristics and fluid flow maldistributions. It was shown that when these loss mechanisms are ignored, the model significantly overpredicts experimental results. Including the loss mechanisms can significantly change the model predictions, depending on the operating conditions and construction of the regenerator. The model is compared with experimental results for a range of fluid flow rates and cooling loads. (paper)
Final Report: Center for Programming Models for Scalable Parallel Computing

Energy Technology Data Exchange (ETDEWEB)

Mellor-Crummey, John [William Marsh Rice University

2011-09-13

As part of the Center for Programming Models for Scalable Parallel Computing, Rice University collaborated with project partners in the design, development and deployment of language, compiler, and runtime support for parallel programming models to support application development for the “leadership-class” computer systems at DOE national laboratories. Work over the course of this project has focused on the design, implementation, and evaluation of a second-generation version of Coarray Fortran. Research and development efforts of the project have focused on the CAF 2.0 language, compiler, runtime system, and supporting infrastructure. This has involved working with the teams that provide infrastructure for CAF that we rely on, implementing new language and runtime features, producing an open source compiler that enabled us to evaluate our ideas, and evaluating our design and implementation through the use of benchmarks. The report details the research, development, findings, and conclusions from this work.
A one-dimensional heat transfer model for parallel-plate thermoacoustic heat exchangers

NARCIS (Netherlands)

de Jong, Anne; Wijnant, Ysbrand H.; de Boer, Andries

2014-01-01

A one-dimensional (1D) laminar oscillating flow heat transfer model is derived and applied to parallel-plate thermoacoustic heat exchangers. The model can be used to estimate the heat transfer from the solid wall to the acoustic medium, which is required for the heat input/output of thermoacoustic
Measuring effectiveness of a university by a parallel network DEA model

Science.gov (United States)

Kashim, Rosmaini; Kasim, Maznah Mat; Rahman, Rosshairy Abd

2017-11-01

Universities contribute significantly to the development of human capital and socio-economic improvement of a country. For this reason, Malaysian universities have carried out various initiatives to improve their performance. Most studies have used the Data Envelopment Analysis (DEA) model to measure efficiency rather than effectiveness, even though the measurement of effectiveness is important for understanding how effective a university is in achieving its ultimate goals. A university system has two major functions, namely teaching and research, and each function has different resources based on its emphasis. Therefore, a university is actually structured as a parallel production system whose overall effectiveness is the aggregated effectiveness of teaching and research. Hence, this paper proposes a parallel network DEA model to measure the effectiveness of a university. This model takes the internal operations of both the teaching and research functions into account in computing the effectiveness of a university system. In the literature, graduates and the number of programs offered are defined as the outputs, while the employed graduates and the number of programs accredited by professional bodies are considered as the outcomes for measuring teaching effectiveness. The amount of grants is regarded as the output of research, while the different qualities of publications are considered as the outcomes of research. A system is considered effective only if all functions are effective. This model has been tested using a hypothetical set of data consisting of 14 faculties at a public university in Malaysia. The results show that none of the faculties is relatively effective for the overall performance. Three faculties are effective in teaching and two faculties are effective in research. The potential applications of the parallel network DEA model allow the top management of a university to identify weaknesses in any functions in their universities and take rational steps for improvement.
Interaction Admittance Based Modeling of Multi-Paralleled Grid-Connected Inverter with LCL-Filter

DEFF Research Database (Denmark)

Lu, Minghui; Blaabjerg, Frede; Wang, Xiongfei

2016-01-01

This paper investigates the mutual interaction and stability issues of multi-parallel LCL-filtered inverters. The stability and power quality of multiple grid-tied inverters are gaining more and more research attention as the penetration of renewables increases. In this paper, interactions...... and coupling effects among the multi-paralleled inverters and power grid are explicitly revealed. An Interaction Admittance concept is introduced to express and model the interaction through the physical admittances of the network. Compared to the existing modeling methods, the proposed analysis provides...
Parallel Computing for Terrestrial Ecosystem Carbon Modeling

International Nuclear Information System (INIS)

Wang, Dali; Post, Wilfred M.; Ricciuto, Daniel M.; Berry, Michael

2011-01-01

Terrestrial ecosystems are a primary component of research on global environmental change. Observational and modeling research on terrestrial ecosystems at the global scale, however, has lagged behind their counterparts for oceanic and atmospheric systems, largely because of the unique challenges associated with the tremendous diversity and complexity of terrestrial ecosystems. There are 8 major types of terrestrial ecosystem: tropical rain forest, savannas, deserts, temperate grassland, deciduous forest, coniferous forest, tundra, and chaparral. The carbon cycle is an important mechanism in the coupling of terrestrial ecosystems with climate through biological fluxes of CO2. The influence of terrestrial ecosystems on atmospheric CO2 can be modeled via several means at different timescales. Important processes include plant dynamics, change in land use, as well as ecosystem biogeography. Over the past several decades, many terrestrial ecosystem models (see the 'Model developments' section) have been developed to understand the interactions between terrestrial carbon storage and CO2 concentration in the atmosphere, as well as the consequences of these interactions. Early TECMs generally adapted simple box-flow exchange models, in which photosynthetic CO2 uptake and respiratory CO2 release are simulated in an empirical manner with a small number of vegetation and soil carbon pools. Demands on the kinds and amount of information required from global TECMs have grown. Recently, along with the rapid development of parallel computing, spatially explicit TECMs with detailed process based representations of carbon dynamics have become attractive, because those models can readily incorporate a variety of additional ecosystem processes (such as dispersal, establishment, growth, mortality etc.) and environmental factors (such as landscape position, pest populations, disturbances, resource manipulations, etc.), and provide information to frame policy options for climate change
Lamb wave propagation modelling and simulation using parallel processing architecture and graphical cards

International Nuclear Information System (INIS)

Paćko, P; Bielak, T; Staszewski, W J; Uhl, T; Spencer, A B; Worden, K

2012-01-01

This paper demonstrates new parallel computation technology and an implementation for Lamb wave propagation modelling in complex structures. A graphical processing unit (GPU) and computer unified device architecture (CUDA), available in low-cost graphical cards in standard PCs, are used for Lamb wave propagation numerical simulations. The local interaction simulation approach (LISA) wave propagation algorithm has been implemented as an example. Other algorithms suitable for parallel discretization can also be used in practice. The method is illustrated using examples related to damage detection. The results demonstrate good accuracy and effective computational performance of very large models. The wave propagation modelling presented in the paper can be used in many practical applications of science and engineering.
Using Hadoop MapReduce for Parallel Genetic Algorithms: A Comparison of the Global, Grid and Island Models.

Science.gov (United States)

Ferrucci, Filomena; Salza, Pasquale; Sarro, Federica

2017-06-29

The need to improve the scalability of Genetic Algorithms (GAs) has motivated the research on Parallel Genetic Algorithms (PGAs), and different technologies and approaches have been used. Hadoop MapReduce represents one of the most mature technologies to develop parallel algorithms. Based on the fact that parallel algorithms introduce communication overhead, the aim of the present work is to understand if, and possibly when, the parallel GAs solutions using Hadoop MapReduce show better performance than sequential versions in terms of execution time. Moreover, we are interested in understanding which PGA model can be most effective among the global, grid, and island models. We empirically assessed the performance of these three parallel models with respect to a sequential GA on a software engineering problem, evaluating the execution time and the achieved speedup. We also analysed the behaviour of the parallel models in relation to the overhead produced by the use of Hadoop MapReduce and the GAs' computational effort, which gives a more machine-independent measure of these algorithms. We exploited three problem instances to differentiate the computation load and three cluster configurations based on 2, 4, and 8 parallel nodes. Moreover, we estimated the costs of the execution of the experimentation on a potential cloud infrastructure, based on the pricing of the major commercial cloud providers. The empirical study revealed that the use of PGA based on the island model outperforms the other parallel models and the sequential GA for all the considered instances and clusters. Using 2, 4, and 8 nodes, the island model achieves an average speedup over the three datasets of 1.8, 3.4, and 7.0 times, respectively. Hadoop MapReduce has a set of different constraints that need to be considered during the design and the implementation of parallel algorithms. The overhead of data store (i.e., HDFS) accesses, communication, and latency requires solutions that reduce data store
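The speedups quoted in this abstract can be turned into parallel efficiencies with a one-line calculation. The sketch below is only an illustration of that arithmetic: it assumes the usual definition of efficiency as speedup divided by node count, and the names `reported_speedup` and `parallel_efficiency` are hypothetical, not taken from the paper.

```python
# Average island-model speedups as quoted in the abstract (nodes -> speedup).
reported_speedup = {2: 1.8, 4: 3.4, 8: 7.0}


def parallel_efficiency(speedup: float, nodes: int) -> float:
    """Assumed definition: efficiency = speedup / nodes (1.0 is ideal linear scaling)."""
    return speedup / nodes


efficiency = {n: parallel_efficiency(s, n) for n, s in reported_speedup.items()}

for nodes, value in efficiency.items():
    print(f"{nodes} nodes: efficiency {value:.3f}")
```

Under this definition the reported figures correspond to efficiencies of roughly 0.90, 0.85, and 0.875, which suggests that the island model stays close to linear scaling across the three cluster sizes.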
Methods and models for the construction of weakly parallel tests

NARCIS (Netherlands)

Adema, J.J.; Adema, Jos J.

1992-01-01

Several methods are proposed for the construction of weakly parallel tests [i.e., tests with the same test information function (TIF)]. A mathematical programming model that constructs tests containing a prespecified TIF and a heuristic that assigns items to tests with information functions that are
Methods and models for the construction of weakly parallel tests

NARCIS (Netherlands)

Adema, J.J.; Adema, Jos J.

1990-01-01

Methods are proposed for the construction of weakly parallel tests, that is, tests with the same test information function. A mathematical programming model for constructing tests with a prespecified test information function and a heuristic for assigning items to tests such that their information
cudaBayesreg: Parallel Implementation of a Bayesian Multilevel Model for fMRI Data Analysis

Directory of Open Access Journals (Sweden)

Adelino R. Ferreira da Silva

2011-10-01

Graphic processing units (GPUs) are rapidly gaining maturity as powerful general parallel computing devices. A key feature in the development of modern GPUs has been the advancement of the programming model and programming tools. Compute Unified Device Architecture (CUDA) is a software platform for massively parallel high-performance computing on Nvidia many-core GPUs. In functional magnetic resonance imaging (fMRI), the volume of the data to be processed and the type of statistical analysis to perform call for high-performance computing strategies. In this work, we present the main features of the R-CUDA package cudaBayesreg, which implements in CUDA the core of a Bayesian multilevel model for the analysis of brain fMRI data. The statistical model implements a Gibbs sampler for multilevel/hierarchical linear models with a normal prior. The main contribution to the increased performance comes from the use of separate threads for fitting the linear regression model at each voxel in parallel. The R-CUDA implementation of the Bayesian model proposed here has been able to reduce significantly the run-time processing of Markov chain Monte Carlo (MCMC) simulations used in Bayesian fMRI data analyses. Presently, cudaBayesreg is only configured for Linux systems with Nvidia CUDA support.
Mathematical Model of Thyristor Inverter Including a Series-parallel Resonant Circuit

OpenAIRE

Miroslaw Luft; Elzbieta Szychta

2008-01-01

The article presents a mathematical model of a thyristor inverter including a series-parallel resonant circuit with the aid of the state variable method. Maple procedures are used to compute current and voltage waveforms in the inverter.
Mathematical model of thyristor inverter including a series-parallel resonant circuit

OpenAIRE

Luft, M.; Szychta, E.

2008-01-01

The article presents a mathematical model of thyristor inverter including a series-parallel resonant circuit with the aid of state variable method. Maple procedures are used to compute current and voltage waveforms in the inverter.
Parallel Algorithm for Solving TOV Equations for Sequence of Cold and Dense Nuclear Matter Models

Science.gov (United States)

Ayriyan, Alexander; Buša, Ján; Grigorian, Hovik; Poghosyan, Gevorg

2018-04-01

We introduce a parallel algorithm for simulating neutron star configurations for a set of equation of state (EoS) models. The performance of the parallel algorithm has been investigated for a test set of EoS models on two computational systems. It scales when using MPI on modern CPUs, and this investigation also allowed us to compare two different types of computational nodes.
The island dynamics model on parallel quadtree grids

Science.gov (United States)

Mistani, Pouria; Guittet, Arthur; Bochkov, Daniil; Schneider, Joshua; Margetis, Dionisios; Ratsch, Christian; Gibou, Frederic

2018-05-01

We introduce an approach for simulating epitaxial growth by use of an island dynamics model on a forest of quadtree grids, and in a parallel environment. To this end, we use a parallel framework introduced in the context of the level-set method. This framework utilizes: discretizations that achieve a second-order accurate level-set method on non-graded adaptive Cartesian grids for solving the associated free boundary value problem for surface diffusion; and an established library for the partitioning of the grid. We consider the cases with: irreversible aggregation, which amounts to applying Dirichlet boundary conditions at the island boundary; and an asymmetric (Ehrlich-Schwoebel) energy barrier for attachment/detachment of atoms at the island boundary, which entails the use of a Robin boundary condition. We provide the scaling analyses performed on the Stampede supercomputer and numerical examples that illustrate the capability of our methodology to efficiently simulate different aspects of epitaxial growth. The combination of adaptivity and parallelism in our approach enables simulations that are several orders of magnitude faster than those reported in the recent literature and, thus, provides a viable framework for the systematic study of mound formation on crystal surfaces.
Algorithm comparison and benchmarking using a parallel spectral transform shallow water model

Energy Technology Data Exchange (ETDEWEB)

Worley, P.H. [Oak Ridge National Lab., TN (United States)]; Foster, I.T.; Toonen, B. [Argonne National Lab., IL (United States)]

1995-04-01

In recent years, a number of computer vendors have produced supercomputers based on a massively parallel processing (MPP) architecture. These computers have been shown to be competitive in performance with conventional vector supercomputers for some applications. As spectral weather and climate models are heavy users of vector supercomputers, it is interesting to determine how these models perform on MPPs, and which MPPs are best suited to the execution of spectral models. The benchmarking of MPPs is complicated by the fact that different algorithms may be more efficient on different architectures. Hence, a comprehensive benchmarking effort must answer two related questions: which algorithm is most efficient on each computer, and how do the most efficient algorithms compare on different computers. In general, these are difficult questions to answer because of the high cost associated with implementing and evaluating a range of different parallel algorithms on each MPP platform.
Facilitating arrhythmia simulation: the method of quantitative cellular automata modeling and parallel running

Directory of Open Access Journals (Sweden)

Mondry Adrian

2004-08-01

Background: Many arrhythmias are triggered by abnormal electrical activity at the ionic channel and cell level, and then evolve spatio-temporally within the heart. To understand arrhythmias better and to diagnose them more precisely by their ECG waveforms, a whole-heart model is required to explore the association between the massively parallel activities at the channel/cell level and the integrative electrophysiological phenomena at organ level. Methods: We have developed a method to build large-scale electrophysiological models by using extended cellular automata, and to run such models on a cluster of shared memory machines. We describe here the method, including the extension of a language-based cellular automaton to implement quantitative computing, the building of a whole-heart model with Visible Human Project data, the parallelization of the model on a cluster of shared memory computers with OpenMP and MPI hybrid programming, and a simulation algorithm that links cellular activity with the ECG. Results: We demonstrate that electrical activities at channel, cell, and organ levels can be traced and captured conveniently in our extended cellular automaton system. Examples of some ECG waveforms simulated with a 2-D slice are given to support the ECG simulation algorithm. A performance evaluation of the 3-D model on a four-node cluster is also given. Conclusions: Quantitative multicellular modeling with extended cellular automata is a highly efficient and widely applicable method to weave experimental data at different levels into computational models. This process can be used to investigate complex and collective biological activities that can be described neither by their governing differential equations nor by discrete parallel computation. Transparent cluster computing is a convenient and effective method to make time-consuming simulation feasible. Arrhythmias, as a typical case, can be effectively simulated with the methods
Mathematical Model of Thyristor Inverter Including a Series-parallel Resonant Circuit

Directory of Open Access Journals (Sweden)

Miroslaw Luft

2008-01-01

The article presents a mathematical model of a thyristor inverter including a series-parallel resonant circuit with the aid of the state variable method. Maple procedures are used to compute current and voltage waveforms in the inverter.

A model for dealing with parallel processes in supervision

Directory of Open Access Journals (Sweden)

Lilja Cajvert

2011-03-01

Supervision in social work is essential for successful outcomes when working with clients. In social work, unconscious difficulties may arise and similar difficulties may occur in supervision as parallel processes. In this article, the development of a practice-based model of supervision to deal with parallel processes in supervision is described. The model has six phases. In the first phase, the focus is on the supervisor's inner world, his/her own reflections and observations. In the second phase, the supervision situation is "frozen", and the supervisees are invited to join the supervisor in taking a meta-perspective on the current situation of supervision. The focus in the third phase is on the inner world of all the group members as well as the visualization and identification of reflections and feelings that arose during the supervision process. Phase four focuses on the supervisee who presented a case, and in phase five the focus shifts to the common understanding and theorization of the supervision process as well as the definition and identification of possible parallel processes. In the final phase, the supervisee, with the assistance of the supervisor and other members of the group, develops a solution and determines how to proceed with the client in treatment. This article uses phenomenological concepts to provide a theoretical framework for the supervision model. Phenomenological reduction is an important approach to examine and to externalize and visualize the inner worlds of the supervisor and supervisees. A model for handling parallel processes during supervision: Supervision is crucial in social work for providing successful help to clients. During casework, implicit difficulties can surface, and similar difficulties sometimes also arise during supervision. These are called parallel processes. This article describes a practice-based model for such parallel ...
Modelling and simulation of multiple single-phase induction motor in parallel connection

Directory of Open Access Journals (Sweden)

Sujitjorn, S.

2006-11-01

A mathematical model for parallel connected n-multiple single-phase induction motors in generalized state-space form is proposed in this paper. The motor group draws electric power from one inverter. The model is developed by the dq-frame theory and was tested against four loading scenarios in which satisfactory results were obtained.
Two Phase Flow Split Model for Parallel Channels | Iloeje | Nigerian ...

African Journals Online (AJOL)

The model and code are capable of handling single and two phase flows, steady states and transients, up to ten parallel flow paths, simple and complicated geometries, including the boilers of fossil steam generators and nuclear power plants. A test calculation has been made with a simplified three-channel system ...
Unified Singularity Modeling and Reconfiguration of 3rTPS Metamorphic Parallel Mechanisms with Parallel Constraint Screws

Directory of Open Access Journals (Sweden)

Yufeng Zhuang

2015-01-01

This paper presents a unified singularity modeling and reconfiguration analysis of variable topologies of a class of metamorphic parallel mechanisms with parallel constraint screws. The new parallel mechanisms consist of three reconfigurable rTPS limbs that have two working phases stemming from the reconfigurable Hooke (rT) joint. While one phase has full mobility, the other supplies a constraint force to the platform. Based on these, the platform constraint screw systems show that the new metamorphic parallel mechanisms have four topologies by altering the limb phases with mobility change among 1R2T (one rotation with two translations), 2R2T, and 3R2T and mobility 6. Geometric conditions of the mechanism design are investigated with some special topologies illustrated considering the limb arrangement. Following this and the actuation scheme analysis, a unified Jacobian matrix is formed using screw theory to include the change between geometric constraints and actuation constraints in the topology reconfiguration. Various singular configurations are identified by analyzing screw dependency in the Jacobian matrix. The work in this paper provides a basis for singularity-free workspace analysis and optimal design of the class of metamorphic parallel mechanisms with parallel constraint screws, which show simple geometric constraints with potentially simple kinematics and dynamics properties.
The Extended Parallel Process Model: Illuminating the Gaps in Research

Science.gov (United States)

Popova, Lucy

2012-01-01

This article examines constructs, propositions, and assumptions of the extended parallel process model (EPPM). Review of the EPPM literature reveals that its theoretical concepts are thoroughly developed, but the theory lacks consistency in operational definitions of some of its constructs. Out of the 12 propositions of the EPPM, a few have not…
Three-dimensional parallel edge-based finite element modeling of electromagnetic data with field redatuming

DEFF Research Database (Denmark)

Cai, Hongzhu; Čuma, Martin; Zhdanov, Michael

2015-01-01

This paper presents a parallelized version of the edge-based finite element method with a novel post-processing approach for numerical modeling of an electromagnetic field in complex media. The method uses an unstructured tetrahedral mesh which can reduce the number of degrees of freedom significantly. The linear system of finite element equations is solved using parallel direct solvers which are robust for ill-conditioned systems and efficient for multiple source electromagnetic (EM) modeling. We also introduce a novel approach to compute the scalar components of the electric field from the tangential components along each edge based on field redatuming. The method can produce a more accurate result as compared to the conventional approach. We have applied the developed algorithm to compute the EM response for a typical 3D anisotropic geoelectrical model of the off-shore HC reservoir with complex...
A one-dimensional heat transfer model for parallel-plate thermoacoustic heat exchangers.

Science.gov (United States)

de Jong, J A; Wijnant, Y H; de Boer, A

2014-03-01

A one-dimensional (1D) laminar oscillating flow heat transfer model is derived and applied to parallel-plate thermoacoustic heat exchangers. The model can be used to estimate the heat transfer from the solid wall to the acoustic medium, which is required for the heat input/output of thermoacoustic systems. The model is implementable in existing (quasi-)1D thermoacoustic codes, such as DeltaEC. Examples of generated results show good agreement with literature results. The model allows for arbitrary wave phasing; however, it is shown that the wave phasing does not significantly influence the heat transfer.
Parallel performance of TORT on the CRAY J90: Model and measurement

International Nuclear Information System (INIS)

Barnett, A.; Azmy, Y.Y.

1997-10-01

A limitation on the parallel performance of TORT on the CRAY J90 is the amount of extra work introduced by the multitasking algorithm itself. The extra work beyond that of the serial version of the code, called overhead, arises from the synchronization of the parallel tasks and the accumulation of results by the master task. The goal of recent updates to TORT was to reduce the time consumed by these activities. To help understand which components of the multitasking algorithm contribute significantly to the overhead, a parallel performance model was constructed and compared to measurements of actual timings of the code
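The abstract does not give the form of the performance model, so the following is only a generic sketch of how an overhead-aware timing model of this kind might be written. It assumes, purely for illustration, that the serial work divides evenly among tasks and that synchronization and master-task accumulation each add a fixed cost per task; the function names and all numbers are hypothetical and are not taken from the TORT study.

```python
def predicted_time(t_serial: float, tasks: int,
                   t_sync_per_task: float, t_accum_per_task: float) -> float:
    """Hypothetical model: evenly divided work plus overhead that grows with task count."""
    if tasks == 1:
        return t_serial  # the serial code pays no multitasking overhead
    overhead = tasks * (t_sync_per_task + t_accum_per_task)
    return t_serial / tasks + overhead


def predicted_speedup(t_serial: float, tasks: int,
                      t_sync_per_task: float, t_accum_per_task: float) -> float:
    """Speedup relative to the serial run under the same assumed model."""
    return t_serial / predicted_time(t_serial, tasks, t_sync_per_task, t_accum_per_task)


# Illustrative numbers only: 100 s of serial work, 0.5 s of each overhead per task.
example_time = predicted_time(100.0, 4, 0.5, 0.5)
example_speedup = predicted_speedup(100.0, 4, 0.5, 0.5)
print(example_time, round(example_speedup, 3))
```

Comparing such a prediction against measured timings, component by component, is one way to see which overhead term dominates as the number of tasks grows.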
Parallel algorithms for interactive manipulation of digital terrain models

Science.gov (United States)

Davis, E. W.; Mcallister, D. F.; Nagaraj, V.

1988-01-01

Interactive three-dimensional graphics applications, such as terrain data representation and manipulation, require extensive arithmetic processing. Massively parallel machines are attractive for this application since they offer high computational rates, and grid connected architectures provide a natural mapping for grid based terrain models. Presented here are algorithms for data movement on the massive parallel processor (MPP) in support of pan and zoom functions over large data grids. It is an extension of earlier work that demonstrated real-time performance of graphics functions on grids that were equal in size to the physical dimensions of the MPP. When the dimensions of a data grid exceed the processing array size, data is packed in the array memory. Windows of the total data grid are interactively selected for processing. Movement of packed data is needed to distribute items across the array for efficient parallel processing. Execution time for data movement was found to exceed that for arithmetic aspects of graphics functions. Performance figures are given for routines written in MPP Pascal.
The design of multi-core DSP parallel model based on message passing and multi-level pipeline

Science.gov (United States)

Niu, Jingyu; Hu, Jian; He, Wenjing; Meng, Fanrong; Li, Chuanrong

2017-10-01

Currently, the design of embedded signal processing system is often based on a specific application, but this idea is not conducive to the rapid development of signal processing technology. In this paper, a parallel processing model architecture based on multi-core DSP platform is designed, and it is mainly suitable for the complex algorithms which are composed of different modules. This model combines the ideas of multi-level pipeline parallelism and message passing, and summarizes the advantages of the mainstream model of multi-core DSP (the Master-Slave model and the Data Flow model), so that it has better performance. This paper uses three-dimensional image generation algorithm to validate the efficiency of the proposed model by comparing with the effectiveness of the Master-Slave and the Data Flow model.
Improvements in fast-response flood modeling: desktop parallel computing and domain tracking

Energy Technology Data Exchange (ETDEWEB)

Judi, David R [Los Alamos National Laboratory]; Mcpherson, Timothy N [Los Alamos National Laboratory]; Burian, Steven J [UNIV. OF UTAH]

2009-01-01

It is becoming increasingly important to have the ability to accurately forecast flooding, as flooding accounts for the most losses due to natural disasters in the world and the United States. Flood inundation modeling has been dominated by one-dimensional approaches. These models are computationally efficient and are considered by many engineers to produce reasonably accurate water surface profiles. However, because the profiles estimated in these models must be superimposed on digital elevation data to create a two-dimensional map, the result may be sensitive to the ability of the elevation data to capture relevant features (e.g. dikes/levees, roads, walls, etc.). Moreover, one-dimensional models do not explicitly represent the complex flow processes present in floodplains and urban environments. Because two-dimensional models based on the shallow water equations have significantly greater ability to determine flow velocity and direction, the National Research Council (NRC) has recommended that two-dimensional models be used over one-dimensional models for flood inundation studies. This paper has shown that two-dimensional flood modeling computational time can be greatly reduced through the use of Java multithreading on multi-core computers, which effectively provides a means for parallel computing on a desktop computer. In addition, this paper has shown that when desktop parallel computing is coupled with a domain tracking algorithm, significant computation time can be eliminated when computations are completed only on inundated cells. The drastic reduction in computational time shown here enhances the ability of two-dimensional flood inundation models to be used as a near-real time flood forecasting tool, engineering design tool, or planning tool. Perhaps even of greater significance, the reduction in computation time makes the incorporation of risk and uncertainty/ensemble forecasting more feasible for flood inundation modeling (NRC 2000; Sayers et al
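The domain tracking idea mentioned in this abstract, doing work only on inundated cells, can be sketched with a toy grid. Everything below is an assumption made for illustration: the flow rule is an invented placeholder rather than the shallow water equations, the code is Python rather than the Java multithreading used in the paper, and the names `step`, `wet`, and `rate` are hypothetical.

```python
# Toy illustration of domain tracking: only cells in the tracked "wet" set are
# visited each step. The flow rule is an invented placeholder, NOT the shallow
# water equations used in the paper.
NEIGHBOURS = ((1, 0), (-1, 0), (0, 1), (0, -1))


def step(depth, wet, rate=0.1):
    """Advance one step, visiting only tracked cells; return the new tracked set."""
    rows, cols = len(depth), len(depth[0])
    delta = {}
    for r, c in wet:
        outflow = rate * depth[r][c]  # placeholder rule: equal share to each neighbour
        for dr, dc in NEIGHBOURS:
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                delta[(r, c)] = delta.get((r, c), 0.0) - outflow
                delta[(nr, nc)] = delta.get((nr, nc), 0.0) + outflow
    # Apply all changes after the sweep so every cell reads pre-step depths.
    for (r, c), change in delta.items():
        depth[r][c] += change
    # Cells that received water join the tracked domain for the next step.
    return wet | set(delta)


depth = [[0.0] * 5 for _ in range(5)]
depth[2][2] = 1.0          # a single inundated cell in the middle of a dry grid
wet = {(2, 2)}
wet = step(depth, wet)
tracked_fraction = len(wet) / 25
print(len(wet), tracked_fraction)
```

In this toy case the step visits one cell instead of all twenty-five, and the tracked set grows only as water spreads, which is the source of the savings the abstract describes for sparsely inundated domains.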
runjags: An R Package Providing Interface Utilities, Model Templates, Parallel Computing Methods and Additional Distributions for MCMC Models in JAGS

Directory of Open Access Journals (Sweden)

Matthew J. Denwood

2016-07-01

The runjags package provides a set of interface functions to facilitate running Markov chain Monte Carlo models in JAGS from within R. Automated calculation of appropriate convergence and sample length diagnostics, user-friendly access to commonly used graphical outputs and summary statistics, and parallelized methods of running JAGS are provided. Template model specifications can be generated using a standard lme4-style formula interface to assist users less familiar with the BUGS syntax. Automated simulation study functions are implemented to facilitate model performance assessment, as well as drop-k type cross-validation studies, using high performance computing clusters such as those provided by parallel. A module extension for JAGS is also included within runjags, providing the Pareto family of distributions and a series of minimally-informative priors including the DuMouchel and half-Cauchy priors. This paper outlines the primary functions of this package, and gives an illustration of a simulation study to assess the sensitivity of two equivalent model formulations to different prior distributions.
Parallelization of a Quantum-Classic Hybrid Model For Nanoscale Semiconductor Devices

Directory of Open Access Journals (Sweden)

Oscar Salas

2011-07-01

The expensive reengineering of sequential software and the difficulty of parallel programming are two of the many technical and economic obstacles to the wide use of HPC. We investigate the possibility of rapidly improving the performance of a serial numerical code for the simulation of the transport of charged carriers in a Double-Gate MOSFET. We introduce the Drift-Diffusion-Schrödinger-Poisson (DDSP) model and we study a rapid parallelization strategy of the numerical procedure on shared memory architectures.
Parallel processing and non-uniform grids in global air quality modeling

NARCIS (Netherlands)

Berkvens, P.J.F.; Bochev, Mikhail A.

2002-01-01

A large-scale global air quality model, running efficiently on a single vector processor, is enhanced to make more realistic and more long-term simulations feasible. Two strategies are combined: non-uniform grids and parallel processing. The communication through the hierarchy of non-uniform grids
F-Nets and Software Cabling: Deriving a Formal Model and Language for Portable Parallel Programming

Science.gov (United States)

DiNucci, David C.; Saini, Subhash (Technical Monitor)

1998-01-01

Parallel programming is still being based upon antiquated sequence-based definitions of the terms "algorithm" and "computation", resulting in programs which are architecture dependent and difficult to design and analyze. By focusing on obstacles inherent in existing practice, a more portable model is derived here, which is then formalized into a model called F-Nets which utilizes a combination of imperative and functional styles. This formalization suggests more general notions of algorithm and computation, as well as insights into the meaning of structured programming in a parallel setting. To illustrate how these principles can be applied, a very-high-level graphical architecture-independent parallel language, called Software Cabling, is described, with many of the features normally expected from today's computer languages (e.g. data abstraction, data parallelism, and object-based programming constructs).
Image reconstruction method for electrical capacitance tomography based on the combined series and parallel normalization model

International Nuclear Information System (INIS)

Dong, Xiangyuan; Guo, Shuqing

2008-01-01

In this paper, a novel image reconstruction method for electrical capacitance tomography (ECT) based on the combined series and parallel model is presented. A regularization technique is used to obtain a stabilized solution of the inverse problem. Also, the adaptive coefficient of the combined model is deduced by numerical optimization. Simulation results indicate that it can produce higher quality images when compared to the algorithm based on the parallel or series models for the cases tested in this paper. It provides a new algorithm for ECT application
Modeling, analysis, and design of stationary reference frame droop controlled parallel three-phase voltage source inverters

DEFF Research Database (Denmark)

Vasquez, Juan Carlos; Guerrero, Josep M.; Savaghebi, Mehdi

2013-01-01

Power electronics based MicroGrids consist of a number of voltage source inverters (VSIs) operating in parallel. In this paper, the modeling, control design, and stability analysis of parallel connected three-phase VSIs are derived. The proposed voltage and current inner control loops and the mat... control restores the frequency and amplitude deviations produced by the primary control. Also, a synchronization algorithm is presented in order to connect the MicroGrid to the grid. Experimental results are provided to validate the performance and robustness of the parallel VSI system control...
Availability modeling and optimization of dynamic multi-state series–parallel systems with random reconfiguration

International Nuclear Information System (INIS)

Li, Y.F.; Peng, R.

2014-01-01

Most studies on multi-state series–parallel systems focus on the static type of system architecture. However, it is insufficient to model many complex industrial systems having several operation phases and each requires a subset of the subsystems combined together to perform certain tasks. To bridge this gap, this study takes into account this type of dynamic behavior in the multi-state series–parallel system and proposes an analytical approach to calculate the system availability and the operation cost. In this approach, Markov process is used to model the dynamics of system phase changing and component state changing, Markov reward model is used to calculate the operation cost associated with the dynamics, and universal generating function (UGF) is used to build system availability function from the system phase model and the component models. Based upon these models, an optimization problem is formulated to minimize the total system cost with the constraint that system availability is greater than a desired level. The genetic algorithm is then applied to solve the optimization problem. The proposed modeling and solution procedures are illustrated on a system design problem modified from a real-world maritime oil transportation system
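As background for the series-parallel structure discussed here, the textbook availability formula for the simplest case can be sketched in a few lines. This is a deliberate simplification and not the paper's method: it assumes binary-state, independent components and a static architecture, whereas the study uses Markov models and the universal generating function for multi-state, phase-changing systems. The function names and example values are hypothetical.

```python
from math import prod


def subsystem_availability(component_availabilities):
    """Parallel block: available unless every (independent) component is down."""
    return 1.0 - prod(1.0 - a for a in component_availabilities)


def system_availability(subsystems):
    """Series arrangement of parallel blocks: every block must be available."""
    return prod(subsystem_availability(block) for block in subsystems)


# Illustrative values only: two blocks in series, with 2 and 3 redundant components.
example_system = [[0.9, 0.9], [0.8, 0.8, 0.8]]
example_availability = system_availability(example_system)
print(round(example_availability, 5))
```

An optimization of the kind the abstract describes would then search over designs such as `example_system` to minimize cost while keeping the computed availability above a required level.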
Fast robot kinematics modeling by using a parallel simulator (PSIM)

International Nuclear Information System (INIS)

El-Gazzar, H.M.; Ayad, N.M.A.

2002-01-01

High-speed computers are strongly needed not only for solving scientific and engineering problems, but also for numerous industrial applications. Such applications include computer-aided design, oil exploration, weather prediction, space applications and safety of nuclear reactors. The rapid development in VLSI technology makes it possible to implement time-consuming algorithms in real-time situations. Parallel processing approaches can now be used to reduce the processing time for models of very high mathematical structure such as the kinematics modeling of a robot manipulator. This system is used to construct and evaluate the performance and cost effectiveness of several proposed methods to solve the Jacobian algorithm. Parallelism is introduced to the algorithms by using different task allocations and dividing the whole job into sub-tasks. Detailed analysis is performed and results are obtained for the case of six DOF (degree of freedom) robot arms (Stanford Arm). Execution-time comparisons between Von Neumann (uniprocessor) and parallel processor architectures, obtained with the parallel simulator package (PSIM), are presented. The results strongly favour the parallel techniques, with improvements of at least fifty percent. Further studies are needed to determine the convenient and optimum number of processors.
Fast robot kinematics modeling by using a parallel simulator (PSIM)

Energy Technology Data Exchange (ETDEWEB)

El-Gazzar, H M; Ayad, N M.A. [Atomic Energy Authority, Reactor Dept., Computer and Control Lab., P.O. Box no 13759 (Egypt)]

2002-09-15

High-speed computers are strongly needed not only for solving scientific and engineering problems, but also for numerous industrial applications. Such applications include computer-aided design, oil exploration, weather prediction, space applications and safety of nuclear reactors. The rapid development in VLSI technology makes it possible to implement time-consuming algorithms in real-time situations. Parallel processing approaches can now be used to reduce the processing time for models of very high mathematical structure such as the kinematics modeling of a robot manipulator. This system is used to construct and evaluate the performance and cost effectiveness of several proposed methods to solve the Jacobian algorithm. Parallelism is introduced to the algorithms by using different task allocations and dividing the whole job into sub-tasks. Detailed analysis is performed and results are obtained for the case of six DOF (degree of freedom) robot arms (Stanford Arm). Execution-time comparisons between Von Neumann (uniprocessor) and parallel processor architectures, obtained with the parallel simulator package (PSIM), are presented. The results strongly favour the parallel techniques, with improvements of at least fifty percent. Further studies are needed to determine the convenient and optimum number of processors.

A fractional model with parallel fractional Maxwell elements for amorphous thermoplastics

Science.gov (United States)

Lei, Dong; Liang, Yingjie; Xiao, Rui

2018-01-01

We develop a fractional model to describe the thermomechanical behavior of amorphous thermoplastics. The fractional model is composed of two parallel fractional Maxwell elements. The first fractional Maxwell element is used to describe the glass transition, while the second is aimed at describing the viscous flow. We further derive the analytical solutions for the stress relaxation modulus and complex modulus through the Laplace transform. We then demonstrate that the model is able to describe the master curves of the stress relaxation modulus, storage modulus and loss modulus, which all show two distinct transition regions. The obtained parameters show that the moduli of the two fractional Maxwell elements differ by 2-3 orders of magnitude, while the relaxation times differ by 7-9 orders of magnitude. Finally, we apply the model to describe the stress response of constant strain rate tests. The model, together with the parameters obtained from fitting the master curve of stress relaxation modulus, can accurately predict the temperature and strain rate dependent stress response.
War and peace: morphemes and full forms in a noninteractive activation parallel dual-route model.

Science.gov (United States)

Baayen, H; Schreuder, R

This article introduces a computational tool for modeling the process of morphological segmentation in visual and auditory word recognition in the framework of a parallel dual-route model. Copyright 1999 Academic Press.
Parallel two-phase-flow-induced vibrations in fuel pin model

International Nuclear Information System (INIS)

Hara, Fumio; Yamashita, Tadashi

1978-01-01

This paper reports the experimental results of vibrations of a fuel pin model -herein meaning the essential form of a fuel pin from the standpoint of vibration- in a parallel air-and-water two-phase flow. The essential part of the experimental apparatus consisted of a flat elastic strip made of stainless steel, both ends of which were firmly supported in a circular channel conveying the two-phase fluid. Vibrational strain of the fuel pin model, pressure fluctuation of the two-phase flow and two-phase-flow void signals were measured. Statistical measures such as power spectral density, variance and correlation function were calculated. The authors obtained (1) the relation between variance of vibrational strain and two-phase-flow velocity, (2) the relation between variance of vibrational strain and two-phase-flow pressure fluctuation, (3) frequency characteristics of variance of vibrational strain against the dominant frequency of the two-phase-flow pressure fluctuation, and (4) frequency characteristics of variance of vibrational strain against the dominant frequency of two-phase-flow void signals. The authors conclude that there exist two kinds of excitation mechanisms in vibrations of a fuel pin model inserted in a parallel air-and-water two-phase flow; namely, (1) parametric excitation, which occurs when the fundamental natural frequency of the fuel pin model is related to the dominant travelling frequency of water slugs in the two-phase flow by the ratio 1/2, 1/1, 3/2 and so on; and (2) vibrational resonance, which occurs when the fundamental frequency coincides with the dominant frequency of the two-phase-flow pressure fluctuation. (auth.)
A scalable approach to modeling groundwater flow on massively parallel computers

International Nuclear Information System (INIS)

Ashby, S.F.; Falgout, R.D.; Tompson, A.F.B.

1995-12-01

We describe a fully scalable approach to the simulation of groundwater flow on a hierarchy of computing platforms, ranging from workstations to massively parallel computers. Specifically, we advocate the use of scalable conceptual models in which the subsurface model is defined independently of the computational grid on which the simulation takes place. We also describe a scalable multigrid algorithm for computing the groundwater flow velocities. We are thus able to leverage both the engineer's time spent developing the conceptual model and the computing resources used in the numerical simulation. We have successfully employed this approach at the LLNL site, where we have run simulations ranging in size from just a few thousand spatial zones (on workstations) to more than eight million spatial zones (on the CRAY T3D), all using the same conceptual model.
Algorithms for a parallel implementation of Hidden Markov Models with a small state space

DEFF Research Database (Denmark)

Nielsen, Jesper; Sand, Andreas

2011-01-01

Two of the most important algorithms for Hidden Markov Models are the forward and the Viterbi algorithms. We show how formulating these using linear algebra naturally lends itself to parallelization. Although the obtained algorithms are slow for Hidden Markov Models with large state spaces...
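The linear-algebra view mentioned in the abstract above can be sketched as follows. The forward recursion is written as a chain of matrix products, and because matrix multiplication is associative, the chain can be reduced pairwise in a tree, which is where parallelism becomes available. This is an illustration under assumed toy parameters, not the authors' implementation; the function names and the two-state model are hypothetical, and the tree reduction is executed serially here.

```python
from functools import reduce
from itertools import product as all_sequences


def matmul(X, Y):
    """Plain matrix product of two lists-of-lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def step_matrix(A, B, obs):
    # C[i][j] = A[i][j] * B[j][obs]: move from state i to j, then emit obs.
    n = len(A)
    return [[A[i][j] * B[j][obs] for j in range(n)] for i in range(n)]


def sequential_product(mats):
    # Left-to-right fold: the classical forward algorithm.
    return reduce(matmul, mats)


def tree_product(mats):
    # Pairwise reduction; each level's products are independent of each other.
    while len(mats) > 1:
        paired = [matmul(mats[i], mats[i + 1])
                  for i in range(0, len(mats) - 1, 2)]
        if len(mats) % 2:
            paired.append(mats[-1])
        mats = paired
    return mats[0]


def likelihood(pi, A, B, observations, chain_product=tree_product):
    # 1 x N row vector for the first observation, N x N matrices afterwards.
    first = [[pi[j] * B[j][observations[0]] for j in range(len(pi))]]
    mats = [first] + [step_matrix(A, B, o) for o in observations[1:]]
    return sum(chain_product(mats)[0])


# Hypothetical two-state, two-symbol model.
pi = [0.6, 0.4]
A = [[0.7, 0.3], [0.4, 0.6]]   # A[i][j]: transition probability i -> j
B = [[0.5, 0.5], [0.1, 0.9]]   # B[j][o]: probability of symbol o in state j

obs = [0, 1, 0, 1, 1]
print(likelihood(pi, A, B, obs, tree_product))
print(likelihood(pi, A, B, obs, sequential_product))
```

The tree reduction performs more arithmetic than the sequential fold, since it multiplies full matrices rather than a vector by a matrix, which is consistent with the abstract's remark that the approach is slow for large state spaces.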
Construction of a digital elevation model: methods and parallelization

International Nuclear Information System (INIS)

Mazzoni, Christophe

1995-01-01

The aim of this work is to reduce the computation time needed to produce Digital Elevation Models (DEM) by using a parallel machine. It was carried out in collaboration between the French 'Institut Geographique National' (IGN) and the Laboratoire d'Electronique de Technologie et d'Instrumentation (LETI) of the French Atomic Energy Commission (CEA). The IGN has developed a system which provides the DEM used to produce topographic maps. The kernel of this system is the correlator, a piece of software which automatically matches pairs of homologous points in a stereo-pair of photographs. However, the correlator is expensive in computing time. In order to reduce computation time and to produce the DEM with the same accuracy as the current system, we have parallelized the IGN's correlator on the OPENVISION system. This hardware solution uses the SIMD (Single Instruction Multiple Data) parallel machine SYMPATI-2, developed by the LETI, which works on parallel architectures and image processing. Our analysis of the implementation has demonstrated the difficulty of coupling scalar and parallel structures efficiently, so we propose solutions to reinforce this coupling. In order to accelerate the processing further, we evaluate SYMPHONIE, a SIMD calculator and successor of SYMPATI-2. In addition, we developed a multi-agent approach for which a MIMD (Multiple Instruction, Multiple Data) architecture is suitable. Finally, we describe a Multi-SIMD architecture that reconciles our two approaches. This architecture is able to handle multi-level image processing efficiently. It is flexible owing to its modularity, and its communication network provides the reliability required by sensitive systems. (author)
On a model of three-dimensional bursting and its parallel implementation

Science.gov (United States)

Tabik, S.; Romero, L. F.; Garzón, E. M.; Ramos, J. I.

2008-04-01

A mathematical model for the simulation of three-dimensional bursting phenomena and its parallel implementation are presented. The model consists of four nonlinearly coupled partial differential equations that include fast and slow variables, and exhibits bursting in the absence of diffusion. The differential equations have been discretized by means of a linearly-implicit finite difference method, second-order accurate in both space and time, on equally-spaced grids. The resulting system of linear algebraic equations at each time level has been solved by means of the Preconditioned Conjugate Gradient (PCG) method. Three different parallel implementations of the proposed mathematical model have been developed; two of these implementations, i.e., the MPI and the PETSc codes, are based on a message passing paradigm, while the third one, i.e., the OpenMP code, is based on a shared address space paradigm. These three implementations are evaluated on two current high performance parallel architectures, i.e., a dual-processor cluster and a Shared Distributed Memory (SDM) system. A novel representation of the results that emphasizes the most relevant factors affecting the performance of the parallel implementations is proposed. The comparative analysis of the computational results shows that the MPI and the OpenMP implementations are about twice as efficient as the PETSc code on the SDM system. It is also shown that, for the conditions reported here, the nonlinear dynamics of the three-dimensional bursting phenomena exhibits three stages characterized by asynchronous, synchronous and then asynchronous oscillations, before a quiescent state is reached. It is also shown that the fast system reaches steady state in much less time than the slow variables.
Parallel Execution of Functional Mock-up Units in Buildings Modeling

Energy Technology Data Exchange (ETDEWEB)

Ozmen, Ozgur [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Nutaro, James J. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); New, Joshua Ryan [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)

2016-06-30

A Functional Mock-up Interface (FMI) defines a standardized interface to be used in computer simulations to develop complex cyber-physical systems. FMI implementation by a software modeling tool enables the creation of a simulation model that can be interconnected, or the creation of a software library called a Functional Mock-up Unit (FMU). This report describes an FMU wrapper implementation that imports FMUs into a C++ environment and uses an Euler solver that executes FMUs in parallel using Open Multi-Processing (OpenMP). The purpose of this report is to elucidate the runtime performance of the solver when a multi-component system is imported as a single FMU (for the whole system) or as multiple FMUs (for different groups of components as sub-systems). This performance comparison is conducted using two test cases: (1) a simple, multi-tank problem; and (2) a more realistic use case based on the Modelica Buildings Library. In both test cases, the performance gains are promising when each FMU consists of a large number of states and state events that are wrapped in a single FMU. Load balancing is demonstrated to be a critical factor in speeding up parallel execution of multiple FMUs.
Incorrectness of conventional one-dimensional parallel thermal resistance circuit model for two-dimensional circular composite pipes

International Nuclear Information System (INIS)

Wong, K.-L.; Hsien, T.-L.; Chen, W.-L.; Yu, S.-J.

2008-01-01

This study shows that two-dimensional steady-state heat transfer problems of composite circular pipes cannot be appropriately solved by the conventional one-dimensional parallel thermal resistance circuits (PTRC) model because its interface temperatures are not unique. Thus, the PTRC model is fundamentally different from its conventionally recognized analogy, the parallel electrical resistance circuits (PERC) model, which has unique node voltages. Two typical composite circular pipe examples are solved by CFD software, and the numerical results are compared with those obtained by the PTRC model. The comparison shows that the PTRC model generates large errors. Thus, this conventional model, introduced in most heat transfer textbooks, cannot be applied to two-dimensional composite circular pipes. Instead, an alternative one-dimensional separately series thermal resistance circuit (SSTRC) model is proposed and applied to a two-dimensional composite circular pipe with isothermal boundaries, and acceptable results are returned.
Experimental and modelling results of a parallel-plate based active magnetic regenerator

DEFF Research Database (Denmark)

Tura, A.; Nielsen, Kaspar Kirstein; Rowe, A.

2012-01-01

The performance of a permanent magnet magnetic refrigerator (PMMR) using gadolinium parallel plates is described. The configuration and operating parameters are described in detail. Experimental results are compared to simulations using an established two-dimensional model of an active magnetic...
A web-based, collaborative modeling, simulation, and parallel computing environment for electromechanical systems

Directory of Open Access Journals (Sweden)

Xiaoliang Yin

2015-03-01

A complex electromechanical system is usually composed of multiple components from different domains, including mechanical, electronic, hydraulic, control, and so on. Modeling and simulation of electromechanical systems on a unified platform is currently one of the research hotspots in system engineering. It is also the development trend in the design of complex electromechanical systems. The unified modeling techniques and tools based on the Modelica language provide a satisfactory solution. To meet the requirements of collaborative modeling, simulation, and parallel computing for complex electromechanical systems based on Modelica, a general web-based modeling and simulation prototype environment, namely WebMWorks, is designed and implemented. Based on rich Internet application technologies, an interactive graphical user interface for modeling and post-processing in the web browser was implemented; with the collaborative design module, the environment supports top-down, concurrent modeling and team cooperation; additionally, a service-oriented architecture was applied to supply compiling and solving services which run on cloud-like servers, so the environment can manage and dispatch large-scale simulation tasks in parallel on multiple computing servers simultaneously. An engineering application concerning a pure electric vehicle is tested on WebMWorks. The results of simulation and parametric experiments demonstrate that the tested web-based environment can effectively shorten the design cycle of the complex electromechanical system.
The inaccuracy of conventional one-dimensional parallel thermal resistance circuit model for two-dimensional composite walls

International Nuclear Information System (INIS)

Wong, K.-L.; Hsien, T.-L.; Hsiao, M.-C.; Chen, W.-L.; Lin, K.-C.

2008-01-01

This investigation shows that two-dimensional steady-state heat transfer problems of composite walls should not be solved by the conventional one-dimensional parallel thermal resistance circuits (PTRC) model because the interface temperatures are not unique. Thus the PTRC model cannot be used like its conventionally recognized analogy, the parallel electrical resistance circuits (PERC) model, which has unique node voltages. Two typical composite wall examples, solved by CFD software, are used to demonstrate the incorrectness. The numerical results are compared with those obtained by the PTRC model, and very large differences are observed between the results. This shows that the application of the conventional heat transfer PTRC model to two-dimensional composite walls, introduced in most heat transfer textbooks, is incorrect. An alternative one-dimensional separately series thermal resistance circuit (SSTRC) model is proposed and applied to two-dimensional composite walls with isothermal boundaries. Results with acceptable accuracy can be obtained by the new model.
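For reference, the conventional textbook PTRC calculation that the abstract above argues against can be sketched as follows. Each slab is treated as a one-dimensional conduction resistance R = L / (k A), and the slabs are combined like parallel electrical resistors. The dimensions, conductivities, and temperatures are hypothetical, and the SSTRC alternative is not shown because the abstract does not give its formulas.

```python
def slab_resistance(thickness_m, conductivity_w_per_mk, area_m2):
    """One-dimensional conduction resistance of a plane slab, in K/W."""
    return thickness_m / (conductivity_w_per_mk * area_m2)


def parallel_resistance(resistances):
    """Combine resistances exactly as in a parallel electrical circuit."""
    return 1.0 / sum(1.0 / r for r in resistances)


# Hypothetical wall: two materials side by side between the same two faces.
r_a = slab_resistance(0.1, 1.0, 0.5)   # better conductor
r_b = slab_resistance(0.1, 0.1, 0.5)   # poorer conductor
r_eq = parallel_resistance([r_a, r_b])

t_hot, t_cold = 40.0, 20.0             # isothermal face temperatures, in deg C
heat_rate_w = (t_hot - t_cold) / r_eq  # PTRC estimate of the total heat flow
print(r_eq, heat_rate_w)
```

The calculation assumes that heat flows strictly through the thickness of each slab; lateral conduction between adjacent materials is ignored, which is the kind of two-dimensional effect a one-dimensional circuit cannot represent.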
An Inconvenient Truth: An Application of the Extended Parallel Process Model

Science.gov (United States)

Goodall, Catherine E.; Roberto, Anthony J.

2008-01-01

"An Inconvenient Truth" is an Academy Award-winning documentary about global warming presented by Al Gore. This documentary is appropriate for a lesson on fear appeals and the extended parallel process model (EPPM). The EPPM is concerned with the effects of perceived threat and efficacy on behavior change. Perceived threat is composed of an…
Computer model of a reverberant and parallel circuit coupling

Science.gov (United States)

Kalil, Camila de Andrade; de Castro, Maria Clícia Stelling; Cortez, Célia Martins

2017-11-01

The objective of the present study was to deepen the knowledge about the functioning of the neural circuits by implementing a signal transmission model using the Graph Theory in a small network of neurons composed of an interconnected reverberant and parallel circuit, in order to investigate the processing of the signals in each of them and the effects on the output of the network. For this, a program was developed in C language and simulations were done using neurophysiological data obtained in the literature.
Improved Path Loss Simulation Incorporating Three-Dimensional Terrain Model Using Parallel Coprocessors

Directory of Open Access Journals (Sweden)

Zhang Bin Loo

2017-01-01

Current network simulators abstract out wireless propagation models due to the high computation requirements of realistic modeling. As such, there is still a large gap between the results obtained from simulators and real-world scenarios. In this paper, we present a framework for improved path loss simulation built on top of an existing network simulation software, NS-3. Unlike the conventional disk model, the proposed simulation also considers the diffraction loss, computed using Epstein and Peterson’s model from actual terrain elevation data, to give an accurate estimate of the path loss between a transmitter and a receiver. The drawback of high computation requirements is mitigated by offloading the computationally intensive components onto an inexpensive off-the-shelf parallel coprocessor, an NVIDIA GPU. Experiments are performed using actual terrain elevation data provided by the United States Geological Survey. Compared to the conventional CPU architecture, the experimental results show that a speedup of 20x to 42x is achieved by exploiting the parallel processing of the GPU to compute the path loss between two nodes using terrain elevation data. The results show that the path loss between two nodes is greatly affected by the terrain profile between them. In addition, the results suggest that the common strategy of placing the transmitter in the highest position may not always work.
Analysis and Modeling of Parallel Photovoltaic Systems under Partial Shading Conditions

Science.gov (United States)

Buddala, Santhoshi Snigdha

Since the industrial revolution, fossil fuels such as petroleum, coal, oil, and natural gas, along with other non-renewable energy sources, have been used as the primary energy source. The consumption of fossil fuels releases various harmful gases into the atmosphere as byproducts; these are hazardous in nature and tend to deplete the protective layers and affect the overall environmental balance. Fossil fuels are also finite resources, and their rapid depletion has prompted the need to investigate alternative, renewable sources of energy. One such promising source of renewable energy is solar/photovoltaic energy. This work focuses on investigating a new solar array architecture with solar cells connected in a parallel configuration. Retaining the structural simplicity of the parallel architecture, a theoretical small-signal model of the solar cell is proposed and used to analyze the variations in the module parameters when subjected to partial shading conditions. Simulations were run in SPICE to validate the model implemented in Matlab. The voltage limitations of the proposed architecture are addressed by adopting a simple dc-dc boost converter, and the performance of the architecture is evaluated in terms of efficiency by comparing it with the traditional architectures. SPICE simulations are used to compare the architectures and identify the best one in terms of power conversion efficiency under partial shading conditions.
Modelling radiative transfer through ponded first-year Arctic sea ice with a plane-parallel model

Directory of Open Access Journals (Sweden)

T. Taskjelle

2017-09-01

Under-ice irradiance measurements were done on ponded first-year pack ice along three transects during the ICE12 expedition north of Svalbard. Bulk transmittances (400–900 nm) were found to be on average 0.15–0.20 under bare ice, and 0.39–0.46 under ponded ice. Radiative transfer modelling was done with a plane-parallel model. While simulated transmittances deviate significantly from measured transmittances close to the edge of ponds, spatially averaged bulk transmittances agree well. That is, transect-average bulk transmittances, calculated using typical simulated transmittances for ponded and bare ice weighted by the fractional coverage of the two surface types, are in good agreement with the measured values. Radiative heating rates calculated from model output indicate that about 20 % of the incident solar energy is absorbed in bare ice, and 50 % in ponded ice (35 % in pond itself, 15 % in the underlying ice). This large difference is due to the highly scattering surface scattering layer (SSL) increasing the albedo of the bare ice.
Modelling radiative transfer through ponded first-year Arctic sea ice with a plane-parallel model

Science.gov (United States)

Taskjelle, Torbjørn; Hudson, Stephen R.; Granskog, Mats A.; Hamre, Børge

2017-09-01

Under-ice irradiance measurements were done on ponded first-year pack ice along three transects during the ICE12 expedition north of Svalbard. Bulk transmittances (400-900 nm) were found to be on average 0.15-0.20 under bare ice, and 0.39-0.46 under ponded ice. Radiative transfer modelling was done with a plane-parallel model. While simulated transmittances deviate significantly from measured transmittances close to the edge of ponds, spatially averaged bulk transmittances agree well. That is, transect-average bulk transmittances, calculated using typical simulated transmittances for ponded and bare ice weighted by the fractional coverage of the two surface types, are in good agreement with the measured values. Radiative heating rates calculated from model output indicate that about 20 % of the incident solar energy is absorbed in bare ice, and 50 % in ponded ice (35 % in pond itself, 15 % in the underlying ice). This large difference is due to the highly scattering surface scattering layer (SSL) increasing the albedo of the bare ice.
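The transect averaging described in the abstract above amounts to an area-weighted mean of two surface types. A minimal sketch follows; the pond fraction and the two transmittance values are hypothetical inputs, chosen only to lie inside the measured ranges quoted in the abstract, and are not values from the paper.

```python
def transect_average_transmittance(pond_fraction, t_pond, t_bare):
    """Area-weighted bulk transmittance of a transect with two surface types."""
    if not 0.0 <= pond_fraction <= 1.0:
        raise ValueError("pond_fraction must lie between 0 and 1")
    return pond_fraction * t_pond + (1.0 - pond_fraction) * t_bare


# Hypothetical inputs: 30 % pond coverage, typical transmittances per type.
average = transect_average_transmittance(pond_fraction=0.3,
                                         t_pond=0.42, t_bare=0.18)
print(average)
```

Because the average is linear in the pond fraction, errors in the plane-parallel model near pond edges can cancel in the transect mean even when individual points disagree with measurements.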
Steady-state and time-dependent modelling of parallel transport in the scrape-off layer

DEFF Research Database (Denmark)

Havlickova, E.; Fundamenski, W.; Naulin, Volker

2011-01-01

The one-dimensional fluid code SOLF1D has been used for modelling of plasma transport in the scrape-off layer (SOL) along magnetic field lines, both in steady state and under transient conditions that arise due to plasma turbulence. The presented work summarizes results of SOLF1D with attention...... given to transient parallel transport which reveals two distinct time scales due to the transport mechanisms of convection and diffusion. Time-dependent modelling combined with the effect of ballooning shows propagation of particles along the magnetic field line with Mach number up to M ≈ 1...... temperature calculated in SOLF1D is compared with the approximative model used in the turbulence code ESEL both for steady-state and turbulent SOL. Dynamics of the parallel transport are investigated for a simple transient event simulating the propagation of particles and energy to the targets from a blob...
Error modelling and experimental validation of a planar 3-PPR parallel manipulator with joint clearances

DEFF Research Database (Denmark)

Wu, Guanglei; Bai, Shaoping; Kepler, Jørgen Asbøl

2012-01-01

This paper deals with the error modelling and analysis of a 3-PPR planar parallel manipulator with joint clearances. The kinematics and the Cartesian workspace of the manipulator are analyzed. An error model is established with considerations of both configuration errors and joint clearances. Using...

Parallel numerical modeling of hybrid-dimensional compositional non-isothermal Darcy flows in fractured porous media

Science.gov (United States)

Xing, F.; Masson, R.; Lopez, S.

2017-09-01

This paper introduces a new discrete fracture model accounting for non-isothermal compositional multiphase Darcy flows and complex networks of fractures with intersecting, immersed and non-immersed fractures. The so-called hybrid-dimensional model, using a 2D model in the fractures coupled with a 3D model in the matrix, is first derived rigorously starting from the equi-dimensional matrix fracture model. Then, it is discretized using a fully implicit time integration combined with the Vertex Approximate Gradient (VAG) finite volume scheme, which is adapted to polyhedral meshes and anisotropic heterogeneous media. The fully coupled systems are assembled and solved in parallel using the Single Program Multiple Data (SPMD) paradigm with one layer of ghost cells. This strategy allows for a local assembly of the discrete systems. An efficient preconditioner is implemented to solve the linear systems at each time step and each Newton type iteration of the simulation. The numerical efficiency of our approach is assessed on different meshes, fracture networks, and physical settings in terms of parallel scalability, nonlinear convergence and linear convergence.
Precise Modeling Based on Dynamic Phasors for Droop-Controlled Parallel-Connected Inverters

DEFF Research Database (Denmark)

Wang, L.; Guo, X.Q.; Gu, H.R.

2012-01-01

This paper deals with the precise modeling of droop-controlled parallel inverters. This is of interest because it is a common structure found in stand-alone droop-controlled MicroGrids. The conventional small-signal dynamic is not able to predict instabilities of the system, so...
Practical enhancement factor model based on GM for multiple parallel reactions: Piperazine (PZ) CO2 capture

DEFF Research Database (Denmark)

Gaspar, Jozsef; Fosbøl, Philip Loldrup

2017-01-01

Reactive absorption is a key process for gas separation and purification and it is the main technology for CO2 capture. Thus, reliable and simple mathematical models for mass transfer rate calculation are essential. Models which apply to parallel interacting and non-interacting reactions, for all......, desorption and pinch conditions. In this work, we apply the GM model to multiple parallel reactions. We deduce the model for piperazine (PZ) CO2 capture and we validate it against wetted-wall column measurements using 2, 5 and 8 molal PZ for temperatures between 40 °C and 100 °C and CO2 loadings between 0.......23 and 0.41 mol CO2/2 mol PZ. We show that overall second order kinetics describes well the reaction between CO2 and PZ accounting for the carbamate and bicarbamate reactions. Here we prove the GM model for piperazine and MEA but we expect that this practical approach is applicable for various amines...
Parallel Genetic Algorithms for calibrating Cellular Automata models: Application to lava flows

International Nuclear Information System (INIS)

D'Ambrosio, D.; Spataro, W.; Di Gregorio, S.; Calabria Univ., Cosenza; Crisci, G.M.; Rongo, R.; Calabria Univ., Cosenza

2005-01-01

Cellular Automata are highly nonlinear dynamical systems which are suitable for simulating natural phenomena whose behaviour may be specified in terms of local interactions. The Cellular Automata model SCIARA, developed for the simulation of lava flows, proved able to reproduce the behaviour of Etnean events. However, in order to apply the model to the prediction of future scenarios, a thorough calibration phase is required. This work presents the application of Genetic Algorithms, general-purpose search algorithms inspired by natural selection and genetics, to the parameter optimisation of the SCIARA model. Difficulties due to the high computational time suggested the adoption of a Master-Slave Parallel Genetic Algorithm for the calibration of the model with respect to the 2001 Mt. Etna eruption. Results demonstrated the usefulness of the approach, both in terms of computing time and quality of the simulations performed.
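The master-slave scheme named in the abstract above can be sketched generically: a master process runs selection, crossover, and mutation, and farms out only the expensive fitness evaluations to workers. The sketch below is not the SCIARA calibration; the fitness function is a cheap hypothetical stand-in for a simulation run, and the target parameters, population size, and mutation settings are all assumptions.

```python
import random
from concurrent.futures import ThreadPoolExecutor

TARGET = [0.3, 0.7, 0.5]  # hypothetical "true" parameters to recover


def fitness(params):
    """Stand-in for one expensive simulation run; lower is better."""
    return sum((p - t) ** 2 for p, t in zip(params, TARGET))


def evaluate_population(population, workers=4):
    # The master hands individuals to the workers and collects the scores.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fitness, population))


def evolve(generations=30, size=20, seed=0):
    rng = random.Random(seed)
    population = [[rng.random() for _ in TARGET] for _ in range(size)]
    history = []
    for _ in range(generations):
        scores = evaluate_population(population)
        ranked = [ind for _, ind in sorted(zip(scores, population))]
        history.append(min(scores))
        parents = ranked[: size // 2]      # truncation selection
        children = [ranked[0]]             # elitism: keep the best unchanged
        while len(children) < size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, len(TARGET))
            child = a[:cut] + b[cut:]      # one-point crossover
            i = rng.randrange(len(child))  # Gaussian mutation of one gene
            child[i] = min(1.0, max(0.0, child[i] + rng.gauss(0, 0.05)))
            children.append(child)
        population = children
    return ranked[0], history


best, history = evolve()
print(best, history[-1])
```

Threads are used here only to keep the sketch self-contained; for a CPU-bound simulation in Python, a process pool or a cluster scheduler would be the more likely choice, since threads do not run pure-Python code concurrently.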
Parallel eigenanalysis of finite element models in a completely connected architecture

Science.gov (United States)

Akl, F. A.; Morel, M. R.

1989-01-01

A parallel algorithm is presented for the solution of the generalized eigenproblem in linear elastic finite element analysis, (K)(phi) = (M)(phi)(omega), where (K) and (M) are of order N, and (omega) is of order q. The concurrent solution of the eigenproblem is based on the multifrontal/modified subspace method and is achieved in a completely connected parallel architecture in which each processor is allowed to communicate with all other processors. The algorithm was successfully implemented on a tightly coupled multiple-instruction multiple-data parallel processing machine, the Cray X-MP. A finite element model is divided into m domains, each of which is assumed to process n elements. Each domain is then assigned to a processor, or to a logical processor (task) if the number of domains exceeds the number of physical processors. The macrotasking library routines are used in mapping each domain to a user task. Computational speed-up and efficiency are used to determine the effectiveness of the algorithm. The effects of the number of domains, the number of degrees-of-freedom located along the global fronts and the dimension of the subspace on the performance of the algorithm are investigated. A parallel finite element dynamic analysis program, p-feda, is documented and the performance of its subroutines in a parallel environment is analyzed.
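The two effectiveness measures named in the abstract above have standard definitions: speed-up is the serial time divided by the parallel time, and efficiency is the speed-up divided by the number of processors. A small sketch follows; the timings are hypothetical, not results from the report.

```python
def speedup(t_serial, t_parallel):
    """Ratio of serial run time to parallel run time."""
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, processors):
    """Speed-up per processor; 1.0 would be ideal linear scaling."""
    return speedup(t_serial, t_parallel) / processors


# Hypothetical timings: 120 s on one processor, 40 s on four.
s = speedup(120.0, 40.0)
e = efficiency(120.0, 40.0, 4)
print(s, e)
```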
Modeling of Electromagnetic Fields in Parallel-Plane Structures: A Unified Contour-Integral Approach

Directory of Open Access Journals (Sweden)

M. Stumpf

2017-04-01

A unified reciprocity-based modeling approach for analyzing electromagnetic fields in dispersive parallel-plane structures of arbitrary shape is described. It is shown that the use of the reciprocity theorem of the time-convolution type leads to a global contour-integral interaction quantity from which novel time- and frequency-domain numerical schemes can be derived. Applications of the numerical method concerning the time-domain radiated interference and susceptibility of parallel-plane structures are discussed and illustrated with numerical examples.
CSDFa: a model for exploiting the trade-off between data and pipeline parallelism

NARCIS (Netherlands)

Koek, Peter; Geuns, S.J.; Hausmans, J.P.H.M.; Corporaal, Henk; Bekooij, Marco Jan Gerrit

2016-01-01

Real-time stream processing applications, such as SDR applications, are often executed concurrently on multiprocessor systems. A unified data flow model and analysis method have been proposed that can be used to simultaneously determine the amount of pipeline and coarse-grained data parallelism
On Modeling Large-Scale Multi-Agent Systems with Parallel, Sequential and Genuinely Asynchronous Cellular Automata

International Nuclear Information System (INIS)

Tosic, P.T.

2011-01-01

We study certain types of Cellular Automata (CA) viewed as an abstraction of large-scale Multi-Agent Systems (MAS). We argue that the classical CA model needs to be modified in several important respects, in order to become a relevant and sufficiently general model for large-scale MAS, and so that the generalized model can capture many important MAS properties at the level of agent ensembles and their long-term collective behavior patterns. We specifically focus on the issue of inter-agent communication in CA, and propose sequential cellular automata (SCA) as the first step, and genuinely Asynchronous Cellular Automata (ACA) as the ultimate deterministic CA-based abstract models for large-scale MAS made of simple reactive agents. We first formulate deterministic and nondeterministic versions of sequential CA, and then summarize some interesting configuration space properties (i.e., possible behaviors) of a restricted class of sequential CA. In particular, we compare and contrast those properties of sequential CA with the corresponding properties of the classical (that is, parallel and perfectly synchronous) CA with the same restricted class of update rules. We analytically demonstrate failure of the studied sequential CA models to simulate all possible behaviors of perfectly synchronous parallel CA, even for a very restricted class of non-linear totalistic node update rules. The lesson learned is that the interleaving semantics of concurrency, when applied to sequential CA, is not refined enough to adequately capture the perfect synchrony of parallel CA updates. Last but not least, we outline what would be an appropriate CA-like abstraction for large-scale distributed computing insofar as the inter-agent communication model is concerned, and in that context we propose genuinely asynchronous CA. (author)
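The gap between synchronous and sequential updating can be seen on a toy example. The sketch below is an illustration only, not the paper's proof: it assumes a one-dimensional ring with the three-cell majority rule as a representative totalistic threshold rule, and the function names are hypothetical.

```python
def majority(left, centre, right):
    # Totalistic threshold rule: 1 when at least two of the three inputs are 1.
    return 1 if left + centre + right >= 2 else 0


def parallel_step(cells):
    # Classical CA: every cell reads the OLD configuration, all update at once.
    n = len(cells)
    return [majority(cells[i - 1], cells[i], cells[(i + 1) % n]) for i in range(n)]


def sequential_sweep(cells, order=None):
    # Sequential CA: cells update one at a time and see earlier updates.
    cells = list(cells)
    n = len(cells)
    for i in (order if order is not None else range(n)):
        cells[i] = majority(cells[i - 1], cells[i], cells[(i + 1) % n])
    return cells


start = [0, 1, 0, 1]
print(parallel_step(start))                 # [1, 0, 1, 0]
print(parallel_step(parallel_step(start)))  # [0, 1, 0, 1]: a cycle of length 2
print(sequential_sweep(start))              # [1, 1, 1, 1]: a fixed point
```

Under parallel updating the alternating configuration oscillates forever, whereas a left-to-right sequential sweep of the same rule settles into a fixed point.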
Belief–logic conflict resolution in syllogistic reasoning: Inspection-time evidence for a parallel process model

OpenAIRE

Stupple, Edward J.N; Ball, Linden

2008-01-01

An experiment is reported examining dual-process models of belief bias in syllogistic reasoning using a problem complexity manipulation and an inspection-time method to monitor processing latencies for premises and conclusions. Endorsement rates indicated increased belief bias on complex problems, a finding that runs counter to the “belief-first” selective scrutiny model, but which is consistent with other theories, including “reasoning-first” and “parallel-process” models. Inspection-time da...
Parallel Beam Dynamics Simulation Tools for Future Light Source Linac Modeling

International Nuclear Information System (INIS)

Qiang, Ji; Pogorelov, Ilya V.; Ryne, Robert D.

2007-01-01

Large-scale modeling on parallel computers is playing an increasingly important role in the design of future light sources. Such modeling provides a means to accurately and efficiently explore issues such as limits to beam brightness, emittance preservation, the growth of instabilities, etc. Recently the IMPACT code suite was enhanced to be applicable to future light source design. Simulations with IMPACT-Z were performed using up to one billion simulation particles for the main linac of a future light source to study the microbunching instability. Combined with the time domain code IMPACT-T, it is now possible to perform large-scale start-to-end linac simulations for future light sources, including the injector, main linac, chicanes, and transfer lines. In this paper we provide an overview of the IMPACT code suite, its key capabilities, and recent enhancements pertinent to accelerator modeling for future linac-based light sources.
Developing a Massively Parallel Forward Projection Radiography Model for Large-Scale Industrial Applications

Energy Technology Data Exchange (ETDEWEB)

Bauerle, Matthew [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)

2014-08-01

This project utilizes Graphics Processing Units (GPUs) to compute radiograph simulations for arbitrary objects. The generation of radiographs, also known as the forward projection imaging model, is computationally intensive and not widely utilized. The goal of this research is to develop a massively parallel algorithm that can compute forward projections for objects with a trillion voxels (3D pixels). To achieve this end, the data are divided into blocks that can each fit into GPU memory. The forward projected image is also divided into segments to allow for future parallelization and to avoid needless computations.
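The idea of cutting a voxel volume into pieces that each fit into device memory can be sketched as follows. This is a hedged illustration: slicing along the z axis, the function name `split_into_blocks` and the memory figures are assumptions, since the abstract does not say how the blocks are formed.

```python
def split_into_blocks(shape, bytes_per_voxel, memory_budget):
    """Return (z_start, z_stop) ranges so that each block fits in memory_budget bytes."""
    nx, ny, nz = shape
    slab_bytes = nx * ny * bytes_per_voxel  # one z-slice of the volume
    if slab_bytes > memory_budget:
        raise ValueError("a single z-slab exceeds the memory budget")
    slabs_per_block = memory_budget // slab_bytes
    return [(z0, min(z0 + slabs_per_block, nz))
            for z0 in range(0, nz, slabs_per_block)]


# A 10000^3 volume (one trillion voxels) of 4-byte values with an 8 GiB budget.
blocks = split_into_blocks((10_000, 10_000, 10_000), 4, 8 * 2**30)
print(len(blocks), blocks[0], blocks[-1])  # 477 (0, 21) (9996, 10000)
```

Each block can then be projected independently and the partial images accumulated, which is what makes the scheme suitable for processing one block at a time on a GPU.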
Parameter estimation in large-scale systems biology models: a parallel and self-adaptive cooperative strategy.

Science.gov (United States)

Penas, David R; González, Patricia; Egea, Jose A; Doallo, Ramón; Banga, Julio R

2017-01-21

The development of large-scale kinetic models is one of the current key issues in computational systems biology and bioinformatics. Here we consider the problem of parameter estimation in nonlinear dynamic models. Global optimization methods can be used to solve this type of problem but the associated computational cost is very large. Moreover, many of these methods need the tuning of a number of adjustable search parameters, requiring a number of initial exploratory runs and therefore further increasing the computation times. Here we present a novel parallel method, self-adaptive cooperative enhanced scatter search (saCeSS), to accelerate the solution of this class of problems. The method is based on the scatter search optimization metaheuristic and incorporates several key new mechanisms: (i) asynchronous cooperation between parallel processes, (ii) coarse and fine-grained parallelism, and (iii) self-tuning strategies. The performance and robustness of saCeSS is illustrated by solving a set of challenging parameter estimation problems, including medium and large-scale kinetic models of the bacterium E. coli, baker's yeast S. cerevisiae, the vinegar fly D. melanogaster, Chinese Hamster Ovary cells, and a generic signal transduction network. The results consistently show that saCeSS is a robust and efficient method, allowing very significant reduction of computation times with respect to several previous state of the art methods (from days to minutes, in several cases) even when only a small number of processors is used. The new parallel cooperative method presented here allows the solution of medium and large scale parameter estimation problems in reasonable computation times and with small hardware requirements. Further, the method includes self-tuning mechanisms which facilitate its use by non-experts. We believe that this new method can play a key role in the development of large-scale and even whole-cell dynamic models.
Transfer function modeling of parallel connected two three-phase induction motor implementation using LabView platform

DEFF Research Database (Denmark)

Gunabalan, R.; Sanjeevikumar, P.; Blaabjerg, Frede

2015-01-01

This paper presents the transfer function modeling and stability analysis of two induction motors of same ratings and parameters connected in parallel. The induction motors are controlled by a single inverter and the entire drive system is modeled using transfer function in LabView. Further...
Modeling and Control of the Redundant Parallel Adjustment Mechanism on a Deployable Antenna Panel

Directory of Open Access Journals (Sweden)

Lili Tian

2016-10-01

With the aim of developing multiple input and multiple output (MIMO) coupling systems with a redundant parallel adjustment mechanism on the deployable antenna panel, a structural control integrated design methodology is proposed in this paper. Firstly, the modal information from the finite element model of the structure of the antenna panel is extracted, and then the mathematical model is established with the Hamilton principle. Secondly, the discrete Linear Quadratic Regulator (LQR) controller is added to the model in order to control the actuators and adjust the shape of the panel. Finally, the engineering practicality of the modeling and control method based on finite element analysis simulation is verified.
SBML-PET-MPI: a parallel parameter estimation tool for Systems Biology Markup Language based models.

Science.gov (United States)

Zi, Zhike

2011-04-01

Parameter estimation is crucial for the modeling and dynamic analysis of biological systems. However, implementing parameter estimation is time consuming and computationally demanding. Here, we introduced a parallel parameter estimation tool for Systems Biology Markup Language (SBML)-based models (SBML-PET-MPI). SBML-PET-MPI allows the user to perform parameter estimation and parameter uncertainty analysis by collectively fitting multiple experimental datasets. The tool is developed and parallelized using the message passing interface (MPI) protocol, which provides good scalability with the number of processors. SBML-PET-MPI is freely available for non-commercial use at http://www.bioss.uni-freiburg.de/cms/sbml-pet-mpi.html or http://sites.google.com/site/sbmlpetmpi/.
Implementation of an Agent-Based Parallel Tissue Modelling Framework for the Intel MIC Architecture

Directory of Open Access Journals (Sweden)

Maciej Cytowski

2017-01-01

Timothy is a novel large scale modelling framework that allows simulation of biological processes involving different cellular colonies growing and interacting with a variable environment. Timothy was designed for execution on massively parallel High Performance Computing (HPC) systems. The high parallel scalability of the implementation allows for simulations of up to 10^9 individual cells (i.e., simulations at tissue spatial scales of up to 1 cm^3 in size). With the recent advancements of the Timothy model, it has become critical to ensure appropriate performance level on emerging HPC architectures. For instance, the introduction of blood vessels supplying nutrients to the tissue is a very important step towards realistic simulations of complex biological processes, but it greatly increased the computational complexity of the model. In this paper, we describe the process of modernization of the application in order to achieve high computational performance on HPC hybrid systems based on modern Intel® MIC architecture. Experimental results on the Intel Xeon Phi™ coprocessor x100 and the Intel Xeon Phi processor x200 are presented.
Condition-based maintenance effectiveness for series–parallel power generation system—A combined Markovian simulation model

International Nuclear Information System (INIS)

Azadeh, A.; Asadzadeh, S.M.; Salehi, N.; Firoozi, M.

2015-01-01

Condition-based maintenance (CBM) is an increasingly applicable policy in the competitive marketplace as a means of improving equipment reliability and efficiency. Not only does maintenance have a close relationship with safety, but its costs also make it an even more attractive issue for researchers. This study proposes a model to evaluate the effectiveness of CBM policy compared to two other maintenance policies: Corrective Maintenance (CM) and Preventive Maintenance (PM). Maintenance policies are compared through two system performance indicators: reliability and cost. To estimate the reliability and costs of the system, the proposed Markovian discrete-event simulation model is developed under each of these policies. The applicability and usefulness of the proposed Markovian simulation model is illustrated for a series–parallel power generation system. The simulated characteristics of the CBM system include its prognostics efficiency in estimating the remaining useful life of the equipment. Results show that with efficient prognostics, CBM policy is an effective strategy compared to other maintenance strategies. - Highlights: • A model is developed to evaluate the effectiveness of CBM policy. • Maintenance policies are compared through reliability and cost. • A Markovian simulation model is developed. • A series–parallel power generation system is considered. • CBM is an effective strategy compared to others.
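For orientation, the static reliability of a series–parallel arrangement can be computed with the textbook formula: a stage of redundant units fails only if all its units fail, and the system works only if every stage works. The sketch below assumes independent units with known reliabilities. It is not the Markovian simulation model of the study, and the function name and numbers are illustrative.

```python
from math import prod


def series_parallel_reliability(stages):
    """stages: list of stages in series, each a list of parallel unit reliabilities."""
    # Stage reliability = 1 - P(all parallel units in the stage fail).
    return prod(1.0 - prod(1.0 - r for r in stage) for stage in stages)


# Two redundant 0.9-reliable units followed by a single 0.8-reliable unit.
print(series_parallel_reliability([[0.9, 0.9], [0.8]]))  # approximately 0.792
```

A maintenance policy enters such a calculation only through the unit reliabilities, which is why a dynamic model is needed to compare CBM, CM and PM over time.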
A queueing network model to analyze the impact of parallelization of care on patient cycle time.

Science.gov (United States)

Jiang, Lixiang; Giachetti, Ronald E

2008-09-01

The total time a patient spends in an outpatient facility, called the patient cycle time, is a major contributor to overall patient satisfaction. A frequently recommended strategy to reduce the total time is to perform some activities in parallel, thereby shortening patient cycle time. To analyze patient cycle time this paper extends and improves upon an existing multi-class open queueing network model (MOQN) so that the patient flow in an urgent care center can be modeled. Results of the model are analyzed using data from an urgent care center contemplating greater parallelization of patient care activities. The results indicate that parallelization can reduce the cycle time for those patient classes which require more than one diagnostic and/or treatment intervention. However, for many patient classes there would be little if any improvement, indicating the importance of tools to analyze business process reengineering rules. The paper makes contributions by implementing an approximation for fork/join queues in the network and by improving the approximation for multiple server queues in both low traffic and high traffic conditions. We demonstrate the accuracy of the MOQN results through comparisons to simulation results.
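Why parallelizing two activities helps, and why the gain is limited, can be shown with a deliberately simplified calculation. The sketch assumes two independent, exponentially distributed activity durations and no waiting in queues; it is not the paper's fork/join approximation, and the function names and rates are hypothetical.

```python
def serial_time(rate_a, rate_b):
    # Expected total time when activity B starts only after A has finished.
    return 1.0 / rate_a + 1.0 / rate_b


def parallel_time(rate_a, rate_b):
    # Expected time until BOTH activities finish when they run side by side:
    # E[max(X, Y)] = E[X] + E[Y] - E[min(X, Y)] for independent exponentials.
    return 1.0 / rate_a + 1.0 / rate_b - 1.0 / (rate_a + rate_b)


# Activities with mean durations of 10 and 20 minutes.
print(serial_time(1 / 10, 1 / 20))    # about 30 minutes
print(parallel_time(1 / 10, 1 / 20))  # about 23.3 minutes
```

The join must wait for the slower branch, so the saving is always smaller than the shorter activity's mean duration. A patient class with only one intervention gains nothing, in line with the abstract's observation.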
Parallel Factor-Based Model for Two-Dimensional Direction Estimation

Directory of Open Access Journals (Sweden)

Nizar Tayem

2017-01-01

Two-dimensional (2D) Direction-of-Arrival (DOA) estimation for elevation and azimuth angles assuming noncoherent, mixture of coherent and noncoherent, and coherent sources using extended three parallel uniform linear arrays (ULAs) is proposed. Most of the existing schemes have drawbacks in estimating 2D DOA for multiple narrowband incident sources as follows: use of large number of snapshots, estimation failure problem for elevation and azimuth angles in the range of typical mobile communication, and estimation of coherent sources. Moreover, the DOA estimation for multiple sources requires complex pair-matching methods. The algorithm proposed in this paper is based on first-order data matrix to overcome these problems. The main contributions of the proposed method are as follows: (1) it avoids estimation failure problem using a new antenna configuration and estimates elevation and azimuth angles for coherent sources; (2) it reduces the estimation complexity by constructing Toeplitz data matrices, which are based on a single or few snapshots; (3) it derives parallel factor (PARAFAC) model to avoid pair-matching problems between multiple sources. Simulation results demonstrate the effectiveness of the proposed algorithm.
Development of whole core thermal-hydraulic analysis program ACT. 4. Simplified fuel assembly model and parallelization by MPI

International Nuclear Information System (INIS)

Ohshima, Hiroyuki

2001-10-01

A whole core thermal-hydraulic analysis program ACT is being developed for the purpose of evaluating detailed in-core thermal hydraulic phenomena of fast reactors including the effect of the flow between wrapper-tube walls (inter-wrapper flow) under various reactor operation conditions. As appropriate boundary conditions in addition to a detailed modeling of the core are essential for accurate simulations of in-core thermal hydraulics, ACT consists of not only fuel assembly and inter-wrapper flow analysis modules but also a heat transport system analysis module that gives response of the plant dynamics to the core model. This report describes incorporation of a simplified model into the fuel assembly analysis module and program parallelization by a message passing method toward large-scale simulations. ACT has a fuel assembly analysis module which can simulate a whole fuel pin bundle in each fuel assembly of the core; however, it may take considerable CPU time for a large-scale core simulation. Therefore, a simplified fuel assembly model that is thermal-hydraulically equivalent to the detailed one has been incorporated in order to save the simulation time and resources. This simplified model is applied to several parts of fuel assemblies in a core where the detailed simulation results are not required. With regard to the program parallelization, the calculation load and the data flow of ACT were analyzed and the optimum parallelization has been done including the improvement of the numerical simulation algorithm of ACT. Message Passing Interface (MPI) is applied to data communication between processes and synchronization in parallel calculations. Parallelized ACT was verified through a comparison simulation with the original one. In addition to the above works, input manuals of the core analysis module and the heat transport system analysis module have been prepared. (author)

A self-calibrating robot based upon a virtual machine model of parallel kinematics

DEFF Research Database (Denmark)

Pedersen, David Bue; Eiríksson, Eyþór Rúnar; Hansen, Hans Nørgaard

2016-01-01

A delta-type parallel kinematics system for Additive Manufacturing has been created, which through a probing system can recognise its geometrical deviations from nominal and compensate for these in the driving inverse kinematic model of the machine. Novelty is that this model is derived from a virtual machine of the kinematics system, built on principles from geometrical metrology. Relevant mathematically non-trivial deviations to the ideal machine are identified and decomposed into elemental deviations. From these deviations, a routine is added to a physical machine tool, which allows...
Parallel implementation of a Lagrangian-based model on an adaptive mesh in C++: Application to sea-ice

Science.gov (United States)

Samaké, Abdoulaye; Rampal, Pierre; Bouillon, Sylvain; Ólason, Einar

2017-12-01

We present a parallel implementation framework for a new dynamic/thermodynamic sea-ice model, called neXtSIM, based on the Elasto-Brittle rheology and using an adaptive mesh. The spatial discretisation of the model is done using the finite-element method. The temporal discretisation is semi-implicit and the advection is achieved using either a pure Lagrangian scheme or an Arbitrary Lagrangian Eulerian scheme (ALE). The parallel implementation presented here focuses on the distributed-memory approach using the message-passing library MPI. The efficiency and the scalability of the parallel algorithms are illustrated by the numerical experiments performed using up to 500 processor cores of a cluster computing system. The performance obtained by the proposed parallel implementation of the neXtSIM code is shown to be sufficient to perform simulations for state-of-the-art sea ice forecasting and geophysical process studies over geographical domains of several million square kilometers, like the Arctic region.
Dynamics modeling for parallel haptic interfaces with force sensing and control.

Science.gov (United States)

Bernstein, Nicholas; Lawrence, Dale; Pao, Lucy

2013-01-01

Closed-loop force control can be used on haptic interfaces (HIs) to mitigate the effects of mechanism dynamics. A single multidimensional force-torque sensor is often employed to measure the interaction force between the haptic device and the user's hand. The parallel haptic interface at the University of Colorado (CU) instead employs smaller 1D force sensors oriented along each of the five actuating rods to build up a 5D force vector. This paper shows that a particular manipulandum/hand partition in the system dynamics is induced by the placement and type of force sensing, and discusses the implications on force and impedance control for parallel haptic interfaces. The details of a "squaring down" process are also discussed, showing how to obtain reduced degree-of-freedom models from the general six degree-of-freedom dynamics formulation.
Parameters Design for a Parallel Hybrid Electric Bus Using Regenerative Brake Model

Directory of Open Access Journals (Sweden)

Zilin Ma

2014-01-01

A design methodology which uses the regenerative brake model is introduced to determine the major system parameters of a parallel electric hybrid bus drive train. Hybrid system parameters mainly include the power rating of the internal combustion engine (ICE), gear ratios of the transmission, power rating and maximal torque of the motor, and power and capacity of the battery. The regenerative model is built in the vehicle model to estimate the regenerative energy in real road conditions. The design target is to ensure that the vehicle meets the specified vehicle performance, such as speed and acceleration, and at the same time, operates the ICE within an expected speed range. Several pairs of parameters are selected from the result analysis, and the fuel saving result in the road test shows that a 25% reduction is achieved in fuel consumption.
Nambu-Jona-Lasinio model in a parallel electromagnetic field

Science.gov (United States)

Wang, Lingxiao; Cao, Gaoqing; Huang, Xu-Guang; Zhuang, Pengfei

2018-05-01

We explore the features of the U_A(1) and chiral symmetry breaking of the Nambu-Jona-Lasinio model without the Kobayashi-Maskawa-'t Hooft determinant term in the presence of a parallel electromagnetic field. We show that the electromagnetic chiral anomaly can induce both finite neutral pion condensate and isospin-singlet pseudo-scalar η condensate and thus modifies the chiral symmetry breaking pattern. In order to characterize the strength of the U_A(1) symmetry breaking, we evaluate the susceptibility associated with the U_A(1) charge. The result shows that the susceptibility contributed from the chiral anomaly is consistent with the behavior of the corresponding η condensate. The spectra of the mesonic excitations are also studied.
Parallel imaging enhanced MR colonography using a phantom model.

LENUS (Irish Health Repository)

Morrin, Martina M

2008-09-01

To compare various Array Spatial and Sensitivity Encoding Technique (ASSET)-enhanced T2W SSFSE (single shot fast spin echo) and T1-weighted (T1W) 3D SPGR (spoiled gradient recalled echo) sequences for polyp detection and image quality at MR colonography (MRC) in a phantom model. Limitations of MRC using standard 3D SPGR T1W imaging include the long breath-hold required to cover the entire colon within one acquisition and the relatively low spatial resolution due to the long acquisition time. Parallel imaging using ASSET-enhanced T2W SSFSE and 3D T1W SPGR imaging results in much shorter imaging times, which allows for increased spatial resolution.
When fast logic meets slow belief: Evidence for a parallel-processing model of belief bias

OpenAIRE

Trippas, Dries; Thompson, Valerie A.; Handley, Simon J.

2016-01-01

Two experiments pitted the default-interventionist account of belief bias against a parallel-processing model. According to the former, belief bias occurs because a fast, belief-based evaluation of the conclusion pre-empts a working-memory demanding logical analysis. In contrast, according to the latter both belief-based and logic-based responding occur in parallel. Participants were given deductive reasoning problems of variable complexity and instructed to decide whether the conclusion was ...
Highly accelerated cardiac cine parallel MRI using low-rank matrix completion and partial separability model

Science.gov (United States)

Lyu, Jingyuan; Nakarmi, Ukash; Zhang, Chaoyi; Ying, Leslie

2016-05-01

This paper presents a new approach to highly accelerated dynamic parallel MRI using low-rank matrix completion and a partial separability (PS) model. In data acquisition, k-space data is moderately randomly undersampled at the center k-space navigator locations, but highly undersampled at the outer k-space for each temporal frame. In reconstruction, the navigator data is reconstructed from undersampled data using structured low-rank matrix completion. After all the unacquired navigator data is estimated, the partial separable model is used to obtain partial k-t data. Then the parallel imaging method is used to acquire the entire dynamic image series from highly undersampled data. The proposed method has been shown to achieve high quality reconstructions with reduction factors up to 31, and temporal resolution of 29 ms, when the conventional PS method fails.
Parallel Algorithms for Model Checking

NARCIS (Netherlands)

van de Pol, Jaco; Mousavi, Mohammad Reza; Sgall, Jiri

2017-01-01

Model checking is an automated verification procedure, which checks that a model of a system satisfies certain properties. These properties are typically expressed in some temporal logic, like LTL and CTL. Algorithms for LTL model checking (linear time logic) are based on automata theory and graph
Parallel family trees for transfer matrices in the Potts model

Science.gov (United States)

Navarro, Cristobal A.; Canfora, Fabrizio; Hitschfeld, Nancy; Navarro, Gonzalo

2015-02-01

The computational cost of transfer matrix methods for the Potts model is related to the question: in how many ways can two layers of a lattice be connected? Answering the question leads to the generation of a combinatorial set of lattice configurations. This set defines the configuration space of the problem, and the smaller it is, the faster the transfer matrix can be computed. The configuration space of generic (q, v) transfer matrix methods for strips is in the order of the Catalan numbers, which grows asymptotically as O(4^m) where m is the width of the strip. Other transfer matrix methods with a smaller configuration space indeed exist but they make assumptions on the temperature, number of spin states, or restrict the structure of the lattice. In this paper we propose a parallel algorithm that uses a sub-Catalan configuration space of O(3^m) to build the generic (q, v) transfer matrix in a compressed form. The improvement is achieved by grouping the original set of Catalan configurations into a forest of family trees, in such a way that the solution to the problem is now computed by solving the root node of each family. As a result, the algorithm becomes exponentially faster than the Catalan approach while still highly parallel. The resulting matrix is stored in a compressed form using O(3^m × 4^m) of space, making numerical evaluation and decompression faster than evaluating the matrix in its O(4^m × 4^m) uncompressed form. Experimental results for different sizes of strip lattices show that the parallel family trees (PFT) strategy indeed runs exponentially faster than the Catalan Parallel Method (CPM), especially when dealing with dense transfer matrices. In terms of parallel performance, we report strong-scaling speedups of up to 5.7× when running on an 8-core shared memory machine and 28× for a 32-core cluster. The best balance of speedup and efficiency for the multi-core machine was achieved when using p = 4 processors, while for the cluster
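Two of the quantities quoted above are easy to check numerically: the growth rate of the Catalan numbers, and the parallel efficiency implied by the reported speedups. The following Python sketch uses the closed form C_m = binom(2m, m) / (m + 1) and the usual definition efficiency = speedup / processors. The function names are illustrative and the code is not part of the PFT algorithm.

```python
from math import comb


def catalan(m):
    # Closed form for the m-th Catalan number (the division is always exact).
    return comb(2 * m, m) // (m + 1)


def efficiency(speedup, processors):
    # Parallel efficiency: fraction of ideal linear speedup actually obtained.
    return speedup / processors


# The ratio of consecutive Catalan numbers approaches 4, hence the O(4^m) bound.
for m in (5, 50, 200):
    print(m, catalan(m + 1) / catalan(m))

print(efficiency(5.7, 8))    # about 0.71 on the 8-core shared memory machine
print(efficiency(28.0, 32))  # 0.875 on the 32-core cluster
```

The ratio equals 2(2m + 1)/(m + 2), which stays below 4 for every finite m, so O(4^m) is an asymptotic upper bound rather than the exact size of the configuration space.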
Cpl6: The New Extensible, High-Performance Parallel Coupler for the Community Climate System Model

Energy Technology Data Exchange (ETDEWEB)

Craig, Anthony P.; Jacob, Robert L.; Kauffman, Brian; Bettge, Tom; Larson, Jay; Ong, Everest; Ding, Chris; He, Yun

2005-03-24

Coupled climate models are large, multiphysics applications designed to simulate the Earth's climate and predict the response of the climate to any changes in the forcing or boundary conditions. The Community Climate System Model (CCSM) is a widely used state-of-the-art climate model that has released several versions to the climate community over the past ten years. Like many climate models, CCSM employs a coupler, a functional unit that coordinates the exchange of data between parts of the climate system such as the atmosphere and ocean. This paper describes the new coupler, cpl6, contained in the latest version of CCSM, CCSM3. Cpl6 introduces distributed-memory parallelism to the coupler, a class library for important coupler functions, and a standardized interface for component models. Cpl6 is implemented entirely in Fortran90 and uses Model Coupling Toolkit as the base for most of its classes. Cpl6 gives improved performance over previous versions and scales well on multiple platforms.
Fear Control and Danger Control: A Test of the Extended Parallel Process Model (EPPM).

Science.gov (United States)

Witte, Kim

1994-01-01

Explores cognitive and emotional mechanisms underlying success and failure of fear appeals in context of AIDS prevention. Offers general support for Extended Parallel Process Model. Suggests that cognitions lead to fear appeal success (attitude, intention, or behavior changes) via danger control processes, whereas the emotion fear leads to fear…
Element-by-element parallel spectral-element methods for 3-D teleseismic wave modeling

KAUST Repository

Liu, Shaolin

2017-09-28

The development of an efficient algorithm for teleseismic wave field modeling is valuable for calculating the gradients of the misfit function (termed misfit gradients) or Fréchet derivatives when the teleseismic waveform is used for adjoint tomography. Here, we introduce an element-by-element parallel spectral-element method (EBE-SEM) for the efficient modeling of teleseismic wave field propagation in a reduced geology model. Under the plane-wave assumption, the frequency-wavenumber (FK) technique is implemented to compute the boundary wave field used to construct the boundary condition of the teleseismic wave incidence. To reduce the memory required for the storage of the boundary wave field for the incidence boundary condition, a strategy is introduced to efficiently store the boundary wave field on the model boundary. The perfectly matched layers absorbing boundary condition (PML ABC) is formulated using the EBE-SEM to absorb the scattered wave field from the model interior. The misfit gradient can easily be constructed in each time step during the calculation of the adjoint wave field. Three synthetic examples demonstrate the validity of the EBE-SEM for use in teleseismic wave field modeling and the misfit gradient calculation.
A Parallel, Multi-Scale Watershed-Hydrologic-Inundation Model with Adaptively Switching Mesh for Capturing Flooding and Lake Dynamics

Science.gov (United States)

Ji, X.; Shen, C.

2017-12-01

Flood inundation presents substantial societal hazards and also changes biogeochemistry for systems like the Amazon. It is often expensive to simulate high-resolution flood inundation and propagation in a long-term watershed-scale model. Due to the Courant-Friedrichs-Lewy (CFL) restriction, high resolution and large local flow velocity both demand prohibitively small time steps even for parallel codes. Here we develop a parallel surface-subsurface process-based model enhanced by multi-resolution meshes that are adaptively switched on or off. The high-resolution overland flow meshes are enabled only when the flood wave invades floodplains. This model applies a semi-implicit, semi-Lagrangian (SISL) scheme in solving dynamic wave equations, and with the assistance of the multi-mesh method, it also adaptively chooses the dynamic wave equation only in the area of deep inundation. Therefore, the model achieves a balance between accuracy and computational cost.
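The cost of the CFL restriction can be made concrete with the generic form dt <= C * dx / speed, where C is the Courant number. The sketch below is an illustration of that inequality only. The function name, the Courant number of 1 and the sample mesh sizes are assumptions, and the actual stability limit depends on the numerical scheme and on which wave speed is relevant.

```python
def max_stable_dt(dx, speed, courant=1.0):
    """Largest time step (s) allowed by dt <= courant * dx / speed."""
    if dx <= 0 or speed <= 0:
        raise ValueError("dx and speed must be positive")
    return courant * dx / speed


# Refining the mesh from 100 m to 1 m at a flow speed of 5 m/s
# shrinks the admissible time step by the same factor of 100.
print(max_stable_dt(100.0, 5.0))  # 20.0 s
print(max_stable_dt(1.0, 5.0))    # 0.2 s
```

Because the number of cells also grows as the mesh is refined, the total work rises faster than the resolution, which motivates enabling fine meshes only where and when a flood wave is present.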
Stem thrust prediction model for W-K-M double wedge parallel expanding gate valves

Energy Technology Data Exchange (ETDEWEB)

Eldiwany, B.; Alvarez, P.D. [Kalsi Engineering Inc., Sugar Land, TX (United States)]; Wolfe, K. [Electric Power Research Institute, Palo Alto, CA (United States)]

1996-12-01

An analytical model for determining the required valve stem thrust during opening and closing strokes of W-K-M parallel expanding gate valves was developed as part of the EPRI Motor-Operated Valve Performance Prediction Methodology (EPRI MOV PPM) Program. The model was validated against measured stem thrust data obtained from in-situ testing of three W-K-M valves. Model predictions show favorable, bounding agreement with the measured data for valves with Stellite 6 hardfacing on the disks and seat rings for water flow in the preferred flow direction (gate downstream). The maximum required thrust to open and to close the valve (excluding wedging and unwedging forces) occurs at a slightly open position and not at the fully closed position. In the nonpreferred flow direction, the model shows that premature wedging can occur during ΔP closure strokes even when the coefficients of friction at different sliding surfaces are within the typical range. This paper summarizes the model description and comparison against test data.
Stem thrust prediction model for W-K-M double wedge parallel expanding gate valves

International Nuclear Information System (INIS)

Eldiwany, B.; Alvarez, P.D.; Wolfe, K.

1996-01-01

An analytical model for determining the required valve stem thrust during opening and closing strokes of W-K-M parallel expanding gate valves was developed as part of the EPRI Motor-Operated Valve Performance Prediction Methodology (EPRI MOV PPM) Program. The model was validated against measured stem thrust data obtained from in-situ testing of three W-K-M valves. Model predictions show favorable, bounding agreement with the measured data for valves with Stellite 6 hardfacing on the disks and seat rings for water flow in the preferred flow direction (gate downstream). The maximum required thrust to open and to close the valve (excluding wedging and unwedging forces) occurs at a slightly open position and not at the fully closed position. In the nonpreferred flow direction, the model shows that premature wedging can occur during ΔP closure strokes even when the coefficients of friction at different sliding surfaces are within the typical range. This paper summarizes the model description and comparison against test data
SiGN-SSM: open source parallel software for estimating gene networks with state space models.

Science.gov (United States)

Tamada, Yoshinori; Yamaguchi, Rui; Imoto, Seiya; Hirose, Osamu; Yoshida, Ryo; Nagasaki, Masao; Miyano, Satoru

2011-04-15

SiGN-SSM is open-source gene network estimation software able to run in parallel on PCs and massively parallel supercomputers. The software estimates a state space model (SSM), that is, a statistical dynamic model suitable for analyzing short time and/or replicated time series gene expression profiles. SiGN-SSM implements a novel parameter constraint that is effective in stabilizing the estimated models. Also, by using a supercomputer, it is able to determine the gene network structure by a statistical permutation test in a practical time. SiGN-SSM is applicable not only to analyzing temporal regulatory dependencies between genes, but also to extracting the differentially regulated genes from time series expression profiles. SiGN-SSM is distributed under the GNU Affero General Public Licence (GNU AGPL) version 3 and can be downloaded at http://sign.hgc.jp/signssm/. Pre-compiled binaries for some architectures are available in addition to the source code. Pre-installed binaries are also available on the Human Genome Center supercomputer system. The online manual and the supplementary information of SiGN-SSM are available on our web site. Contact: tamada@ims.u-tokyo.ac.jp.
Sustainability Attitudes and Behavioral Motivations of College Students: Testing the Extended Parallel Process Model

Science.gov (United States)

Perrault, Evan K.; Clark, Scott K.

2018-01-01

Purpose: A planet that can no longer sustain life is a frightening thought--and one that is often present in mass media messages. Therefore, this study aims to test the components of a classic fear appeal theory, the extended parallel process model (EPPM) and to determine how well its constructs predict sustainability behavioral intentions. This…
Performance evaluation of parallel electric field tunnel field-effect transistor by a distributed-element circuit model

Science.gov (United States)

Morita, Yukinori; Mori, Takahiro; Migita, Shinji; Mizubayashi, Wataru; Tanabe, Akihito; Fukuda, Koichi; Matsukawa, Takashi; Endo, Kazuhiko; O'uchi, Shin-ichi; Liu, Yongxun; Masahara, Meishoku; Ota, Hiroyuki

2014-12-01

The performance of parallel electric field tunnel field-effect transistors (TFETs), in which band-to-band tunneling (BTBT) was initiated in-line to the gate electric field, was evaluated. The TFET was fabricated by inserting an epitaxially-grown parallel-plate tunnel capacitor between heavily doped source wells and gate insulators. Analysis using a distributed-element circuit model indicated there should be a limit of the drain current caused by the self-voltage-drop effect in the ultrathin channel layer.
Simplified numerical model for predicting onset of flow instability in parallel heated channels

International Nuclear Information System (INIS)

Noura Rassoul; El-Khider Si-Ahmed; Tewfik Hamidouche; Anis Bousbia-Salah

2005-01-01

Flow instabilities are undesirable phenomena in heated channels since a change in flow rate affects the local heat transfer characteristics and may result in premature burnout. For instance, two-phase flow excursion (Ledinegg) instability in boiling channels is of great concern in the design and operation of numerous practical systems, especially MTR fuel type research reactors. For heated parallel channels, the instability occurs where the slope of the pressure drop-flow rate characteristic (demand curve) of a boiling channel becomes negative. Such instability can lead to a significant reduction in channel flow, thereby causing premature burnout of the heated channel before the CHF point. Furthermore, as a consequence of this flow decrease, other types of flow instability may appear and induce (density wave) flow oscillations of constant or diverging amplitude. The present work focuses on a numerical simulation of pressure drop in forced convection boiling in vertical, narrow, parallel, uniformly heated channels. The objective is to determine the point of onset of flow instability by varying the input flow rate, without any consideration of density wave oscillations. The axial void distribution is also provided. The numerical model is based on the finite difference method, which transforms the partial differential conservation equations of mass, momentum and energy into algebraic equations. Closure relationships such as the drift flux model and other constitutive equations are considered to determine the channel pressure drop under steady state boiling conditions. The model is validated by comparing the calculations with the Oak Ridge National Laboratory Thermal Hydraulic Test Loop (THTL) experimental data set. Further verification of this model is performed by code-to-code comparison using the results of the RELAP5/Mod 3.2 code. (authors)

Generalized coupling resonance modeling, analysis, and active damping of multi-parallel inverters in microgrid operating in grid-connected mode

DEFF Research Database (Denmark)

Chen, Zhiyong; Chen, Yandong; Guerrero, Josep M.

2016-01-01

This paper firstly presents an equivalent coupling circuit modeling of multi-parallel inverters in microgrid operating in grid-connected mode. By using the model, the coupling resonance phenomena are explicitly investigated through the mathematical approach, and the intrinsic and extrinsic...
Contribution to the optimal design of a hybrid parallel power-train: choice of a battery model; Contribution a la conception optimale d'une motorisation hybride parallele. Choix d'un modele d'accumulateur

Energy Technology Data Exchange (ETDEWEB)

Kuhn, E.

2004-09-15

This work deals with the dynamical and energetic modeling of a 42 V NiMH battery, a model that is taken into account in a control law for a hybrid electric vehicle. Using an inventory of the electrochemical phenomena, an equivalent electrical scheme has been established. In this model, diffusion phenomena were represented using non-integer derivatives. This tool leads to a very good approximation of diffusion phenomena; nevertheless, such a purely mathematical approach did not allow energetic losses inside the battery to be represented. Consequently, a second model, made of a series of electric circuits, has been proposed to represent energetic transfers. This second model has been used to determine a control law that guarantees autonomous management of the electrical energy embedded in a parallel hybrid electric vehicle and prevents deep discharge of the battery. (author)
Parallel Solver for Diffuse Optical Tomography on Realistic Head Models With Scattering and Clear Regions.

Science.gov (United States)

Placati, Silvio; Guermandi, Marco; Samore, Andrea; Scarselli, Eleonora Franchi; Guerrieri, Roberto

2016-09-01

Diffuse optical tomography is an imaging technique based on evaluating how light propagates within the human head to obtain functional information about the brain. Precision in reconstructing such an optical properties map is highly affected by the accuracy of the light propagation model implemented, which needs to take into account the presence of clear and scattering tissues. We present a numerical solver based on the radiosity-diffusion model, integrating the anatomical information provided by a structural MRI. The solver is designed to run on parallel heterogeneous platforms based on multiple GPUs and CPUs. We demonstrate how the solver provides a 7 times speed-up over an isotropic-scattered parallel Monte Carlo engine based on a radiative transport equation for a domain composed of 2 million voxels, along with a significant improvement in accuracy. The speed-up greatly increases for larger domains, allowing us to compute the light distribution of a full human head (≈3 million voxels) in 116 s for the platform used.
Analysis of Parallel Algorithms on SMP Node and Cluster of Workstations Using Parallel Programming Models with New Tile-based Method for Large Biological Datasets.

Science.gov (United States)

Shrimankar, D D; Sathe, S R

2016-01-01

Sequence alignment is an important tool for describing the relationships between DNA sequences. Many sequence alignment algorithms exist, differing in efficiency, in their models of the sequences, and in the relationship between sequences. The focus of this study is to obtain an optimal alignment between two sequences of biological data, particularly DNA sequences. The algorithm is discussed with particular emphasis on time, speedup, and efficiency optimizations. Parallel programming presents a number of critical challenges to application developers. Today's supercomputer often consists of clusters of SMP nodes. Programming paradigms such as OpenMP and MPI are used to write parallel codes for such architectures. However, the OpenMP programs cannot be scaled for more than a single SMP node. However, programs written in MPI can have more than single SMP nodes. But such a programming paradigm has an overhead of internode communication. In this work, we explore the tradeoffs between using OpenMP and MPI. We demonstrate that the communication overhead incurs significantly even in OpenMP loop execution and increases with the number of cores participating. We also demonstrate a communication model to approximate the overhead from communication in OpenMP loops. Our results are astonishing and interesting to a large variety of input data files. We have developed our own load balancing and cache optimization technique for message passing model. Our experimental results show that our own developed techniques give optimum performance of our parallel algorithm for various sizes of input parameter, such as sequence size and tile size, on a wide variety of multicore architectures.
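The speedup and efficiency metrics that this abstract emphasizes follow the standard definitions, sketched below. The timings are made-up placeholder values, not results from the study, and the function names are hypothetical.

```python
def speedup(t_serial, t_parallel):
    """Ratio of serial run time to parallel run time."""
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, workers):
    """Speedup per worker; 1.0 would mean perfect scaling."""
    return speedup(t_serial, t_parallel) / workers


# Hypothetical timings: 120 s serially, 20 s on 8 cores.
s = speedup(120.0, 20.0)
e = efficiency(120.0, 20.0, workers=8)
```

With these placeholder numbers the speedup is 6 and the efficiency 0.75; the gap from 1.0 is the kind of loss that communication overhead, as discussed in the abstract, would account for.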
Analysis of Parallel Algorithms on SMP Node and Cluster of Workstations Using Parallel Programming Models with New Tile-based Method for Large Biological Datasets

Science.gov (United States)

Shrimankar, D. D.; Sathe, S. R.

2016-01-01

Sequence alignment is an important tool for describing the relationships between DNA sequences. Many sequence alignment algorithms exist, differing in efficiency, in their models of the sequences, and in the relationship between sequences. The focus of this study is to obtain an optimal alignment between two sequences of biological data, particularly DNA sequences. The algorithm is discussed with particular emphasis on time, speedup, and efficiency optimizations. Parallel programming presents a number of critical challenges to application developers. Today’s supercomputer often consists of clusters of SMP nodes. Programming paradigms such as OpenMP and MPI are used to write parallel codes for such architectures. However, the OpenMP programs cannot be scaled for more than a single SMP node. However, programs written in MPI can have more than single SMP nodes. But such a programming paradigm has an overhead of internode communication. In this work, we explore the tradeoffs between using OpenMP and MPI. We demonstrate that the communication overhead incurs significantly even in OpenMP loop execution and increases with the number of cores participating. We also demonstrate a communication model to approximate the overhead from communication in OpenMP loops. Our results are astonishing and interesting to a large variety of input data files. We have developed our own load balancing and cache optimization technique for message passing model. Our experimental results show that our own developed techniques give optimum performance of our parallel algorithm for various sizes of input parameter, such as sequence size and tile size, on a wide variety of multicore architectures. PMID:27932868
Massively Parallel Assimilation of TOGA/TAO and Topex/Poseidon Measurements into a Quasi Isopycnal Ocean General Circulation Model Using an Ensemble Kalman Filter

Science.gov (United States)

Keppenne, Christian L.; Rienecker, Michele; Borovikov, Anna Y.; Suarez, Max

1999-01-01

A massively parallel ensemble Kalman filter (EnKF) is used to assimilate temperature data from the TOGA/TAO array and altimetry from TOPEX/POSEIDON into a Pacific basin version of the NASA Seasonal to Interannual Prediction Project (NSIPP)'s quasi-isopycnal ocean general circulation model. The EnKF is an approximate Kalman filter in which the error-covariance propagation step is modeled by the integration of multiple instances of a numerical model. An estimate of the true error covariances is then inferred from the distribution of the ensemble of model state vectors. This implementation of the filter takes advantage of the inherent parallelism in the EnKF algorithm by running all the model instances concurrently. The Kalman filter update step also occurs in parallel by having each processor process the observations that occur in the region of physical space for which it is responsible. The massively parallel data assimilation system is validated by withholding some of the data and then quantifying the extent to which the withheld information can be inferred from the assimilation of the remaining data. The distributions of the forecast and analysis error covariances predicted by the EnKF are also examined.
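The EnKF idea described in this abstract can be sketched for a single scalar state. This is a toy under stated assumptions, not the NSIPP implementation: the linear `forecast` function stands in for an ocean-model integration, the state is observed directly, and the perturbed-observation variant of the update is used. All names and numbers are hypothetical.

```python
import random
from statistics import mean, variance


def forecast(x):
    """Toy linear model standing in for one model integration (hypothetical)."""
    return 0.9 * x + 1.0


def enkf_update(ensemble, obs, obs_var, rng):
    """Perturbed-observation EnKF update for a directly observed scalar."""
    p_f = variance(ensemble)          # error variance inferred from the ensemble
    gain = p_f / (p_f + obs_var)      # scalar Kalman gain
    obs_std = obs_var ** 0.5
    # Each member assimilates its own noisy copy of the observation.
    return [x + gain * (obs + rng.gauss(0.0, obs_std) - x) for x in ensemble]


rng = random.Random(0)
prior = [rng.gauss(0.0, 2.0) for _ in range(200)]

# Members are independent, so this step could run concurrently.
forecast_ens = [forecast(x) for x in prior]

analysis_ens = enkf_update(forecast_ens, obs=5.0, obs_var=0.25, rng=rng)
```

In this sketch the analysis ensemble is pulled toward the observation and its spread shrinks, with the forecast error variance estimated from the ensemble itself rather than propagated explicitly.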
Parallel programming practical aspects, models and current limitations

CERN Document Server

Tarkov, Mikhail S

2014-01-01

Parallel programming is designed for the use of parallel computer systems for solving time-consuming problems that cannot be solved on a sequential computer in a reasonable time. These problems can be divided into two classes: 1. Processing large data arrays (including processing images and signals in real time). 2. Simulation of complex physical processes and chemical reactions. For each of these classes, prospective methods are designed for solving problems. For data processing, one of the most promising technologies is the use of artificial neural networks. Particles-in-cell method and cellular automata are very useful for simulation. Problems of scalability of parallel algorithms and the transfer of existing parallel programs to future parallel computers are very acute now. An important task is to optimize the use of the equipment (including the CPU cache) of parallel computers. Along with parallelizing information processing, it is essential to ensure the processing reliability by the relevant organization ...
Implementing the PM Programming Language using MPI and OpenMP - a New Tool for Programming Geophysical Models on Parallel Systems

Science.gov (United States)

Bellerby, Tim

2015-04-01

PM (Parallel Models) is a new parallel programming language specifically designed for writing environmental and geophysical models. The language is intended to enable implementers to concentrate on the science behind the model rather than the details of running on parallel hardware. At the same time PM leaves the programmer in control - all parallelisation is explicit and the parallel structure of any given program may be deduced directly from the code. This paper describes a PM implementation based on the Message Passing Interface (MPI) and Open Multi-Processing (OpenMP) standards, looking at issues involved with translating the PM parallelisation model to MPI/OpenMP protocols and considering performance in terms of the competing factors of finer-grained parallelisation and increased communication overhead. In order to maximise portability, the implementation stays within the MPI 1.3 standard as much as possible, with MPI-2 MPI-IO file handling the only significant exception. Moreover, it does not assume a thread-safe implementation of MPI. PM adopts a two-tier abstract representation of parallel hardware. A PM processor is a conceptual unit capable of efficiently executing a set of language tasks, with a complete parallel system consisting of an abstract N-dimensional array of such processors. PM processors may map to single cores executing tasks using cooperative multi-tasking, to multiple cores or even to separate processing nodes, efficiently sharing tasks using algorithms such as work stealing. While tasks may move between hardware elements within a PM processor, they may not move between processors without specific programmer intervention. Tasks are assigned to processors using a nested parallelism approach, building on ideas from Reyes et al. (2009). The main program owns all available processors. When the program enters a parallel statement then either processors are divided out among the newly generated tasks (number of new tasks number of processors
When fast logic meets slow belief: Evidence for a parallel-processing model of belief bias.

Science.gov (United States)

Trippas, Dries; Thompson, Valerie A; Handley, Simon J

2017-05-01

Two experiments pitted the default-interventionist account of belief bias against a parallel-processing model. According to the former, belief bias occurs because a fast, belief-based evaluation of the conclusion pre-empts a working-memory demanding logical analysis. In contrast, according to the latter both belief-based and logic-based responding occur in parallel. Participants were given deductive reasoning problems of variable complexity and instructed to decide whether the conclusion was valid on half the trials or to decide whether the conclusion was believable on the other half. When belief and logic conflict, the default-interventionist view predicts that it should take less time to respond on the basis of belief than logic, and that the believability of a conclusion should interfere with judgments of validity, but not the reverse. The parallel-processing view predicts that beliefs should interfere with logic judgments only if the processing required to evaluate the logical structure exceeds that required to evaluate the knowledge necessary to make a belief-based judgment, and vice versa otherwise. Consistent with this latter view, for the simplest reasoning problems (modus ponens), judgments of belief resulted in lower accuracy than judgments of validity, and believability interfered more with judgments of validity than the converse. For problems of moderate complexity (modus tollens and single-model syllogisms), the interference was symmetrical, in that validity interfered with belief judgments to the same degree that believability interfered with validity judgments. For the most complex (three-term multiple-model syllogisms), conclusion believability interfered more with judgments of validity than vice versa, in spite of the significant interference from conclusion validity on judgments of belief.
Partial Overhaul and Initial Parallel Optimization of KINETICS, a Coupled Dynamics and Chemistry Atmosphere Model

Science.gov (United States)

Nguyen, Howard; Willacy, Karen; Allen, Mark

2012-01-01

KINETICS is a coupled dynamics and chemistry atmosphere model that is data intensive and computationally demanding. The potential performance gain from using a supercomputer motivates the adaptation from a serial version to a parallelized one. Although the initial parallelization had been done, bottlenecks caused by an abundance of communication calls between processors led to an unfavorable drop in performance. Before starting on the parallel optimization process, a partial overhaul was required because a large emphasis was placed on streamlining the code for user convenience and revising the program to accommodate the new supercomputers at Caltech and JPL. After the first round of optimizations, the partial runtime was reduced by a factor of 23; however, performance gains are dependent on the size of the data, the number of processors requested, and the computer used.
Efficient multi-objective calibration of a computationally intensive hydrologic model with parallel computing software in Python

Science.gov (United States)

With enhanced data availability, distributed watershed models for large areas with high spatial and temporal resolution are increasingly used to understand water budgets and examine effects of human activities and climate change/variability on water resources. Developing parallel computing software...
Model-driven product line engineering for mapping parallel algorithms to parallel computing platforms

NARCIS (Netherlands)

Arkin, Ethem; Tekinerdogan, Bedir

2016-01-01

Mapping parallel algorithms to parallel computing platforms requires several activities such as the analysis of the parallel algorithm, the definition of the logical configuration of the platform, the mapping of the algorithm to the logical configuration platform and the implementation of the
An application of analyzing the trajectories of two disorders: A parallel piecewise growth model of substance use and attention-deficit/hyperactivity disorder.

Science.gov (United States)

Mamey, Mary Rose; Barbosa-Leiker, Celestina; McPherson, Sterling; Burns, G Leonard; Parks, Craig; Roll, John

2015-12-01

Researchers often want to examine 2 comorbid conditions simultaneously. One strategy to do so is through the use of parallel latent growth curve modeling (LGCM). This statistical technique allows for the simultaneous evaluation of 2 disorders to determine the explanations and predictors of change over time. Additionally, a piecewise model can help identify whether there are more than 2 growth processes within each disorder (e.g., during a clinical trial). A parallel piecewise LGCM was applied to self-reported attention-deficit/hyperactivity disorder (ADHD) and self-reported substance use symptoms in 303 adolescents enrolled in cognitive-behavioral therapy treatment for a substance use disorder and receiving either oral-methylphenidate or placebo for ADHD across 16 weeks. Assessing these 2 disorders concurrently allowed us to determine whether elevated levels of 1 disorder predicted elevated levels or increased risk of the other disorder. First, a piecewise growth model measured ADHD and substance use separately. Next, a parallel piecewise LGCM was used to estimate the regressions across disorders to determine whether higher scores at baseline of the disorders (i.e., ADHD or substance use disorder) predicted rates of change in the related disorder. Finally, treatment was added to the model to predict change. While the analyses revealed no significant relationships across disorders, this study explains and applies a parallel piecewise growth model to examine the developmental processes of comorbid conditions over the course of a clinical trial. Strengths of piecewise and parallel LGCMs for other addictions researchers interested in examining dual processes over time are discussed. (PsycINFO Database Record (c) 2015 APA, all rights reserved).
Parallel LC circuit model for multi-band absorption and preliminary design of radiative cooling.

Science.gov (United States)

Feng, Rui; Qiu, Jun; Liu, Linhua; Ding, Weiqiang; Chen, Lixue

2014-12-15

We perform a comprehensive analysis of multi-band absorption by exciting magnetic polaritons in the infrared region. According to the independent properties of the magnetic polaritons, we propose a parallel inductance and capacitance(PLC) circuit model to explain and predict the multi-band resonant absorption peaks, which is fully validated by using the multi-sized structure with identical dielectric spacing layer and the multilayer structure with the same strip width. More importantly, we present the application of the PLC circuit model to preliminarily design a radiative cooling structure realized by merging several close peaks together. This omnidirectional and polarization insensitive structure is a good candidate for radiative cooling application.
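The circuit analogy in this abstract can be illustrated with the textbook resonance formula for an ideal lossless LC loop, f0 = 1 / (2π sqrt(LC)). The abstract does not give the authors' expressions for L and C, so the function and the values below are purely illustrative assumptions.

```python
import math


def lc_resonance_hz(inductance_h, capacitance_f):
    """Resonant frequency (Hz) of an ideal lossless LC circuit."""
    return 1.0 / (2.0 * math.pi * math.sqrt(inductance_h * capacitance_f))


# Hypothetical (L, C) pairs for two independent resonators. In a parallel
# model each pair would contribute its own absorption peak.
resonators = [(1.0e-13, 1.0e-17), (1.0e-13, 4.0e-17)]
peaks_hz = [lc_resonance_hz(l, c) for l, c in resonators]
```

Because each resonator is treated independently, changing one capacitance moves only its own peak; here quadrupling C halves that peak frequency.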
Final Report, Center for Programming Models for Scalable Parallel Computing: Co-Array Fortran, Grant Number DE-FC02-01ER25505

Energy Technology Data Exchange (ETDEWEB)

Robert W. Numrich

2008-04-22

The major accomplishment of this project is the production of CafLib, an 'object-oriented' parallel numerical library written in Co-Array Fortran. CafLib contains distributed objects such as block vectors and block matrices along with procedures, attached to each object, that perform basic linear algebra operations such as matrix multiplication, matrix transpose and LU decomposition. It also contains constructors and destructors for each object that hide the details of data decomposition from the programmer, and it contains collective operations that allow the programmer to calculate global reductions, such as global sums, global minima and global maxima, as well as vector and matrix norms of several kinds. CafLib is designed to be extensible in such a way that programmers can define distributed grid and field objects, based on vector and matrix objects from the library, for finite difference algorithms to solve partial differential equations. A very important extra benefit that resulted from the project is the inclusion of the co-array programming model in the next Fortran standard called Fortran 2008. It is the first parallel programming model ever included as a standard part of the language. Co-arrays will be a supported feature in all Fortran compilers, and the portability provided by standardization will encourage a large number of programmers to adopt it for new parallel application development. The combination of object-oriented programming in Fortran 2003 with co-arrays in Fortran 2008 provides a very powerful programming model for high-performance scientific computing. Additional benefits from the project, beyond the original goal, include a program to provide access to the co-array model through access to the Cray compiler as a resource for teaching and research. Several academics, for the first time, included the co-array model as a topic in their courses on parallel computing. A separate collaborative project with LANL and PNNL showed how to
Modeling and Grid impedance Variation Analysis of Parallel Connected Grid Connected Inverter based on Impedance Based Harmonic Analysis

DEFF Research Database (Denmark)

Kwon, JunBum; Wang, Xiongfei; Bak, Claus Leth

2014-01-01

This paper addresses the harmonic compensation error problem that arises with parallel connected inverters under the same grid interface conditions by means of impedance-based analysis and modeling. Unlike the single grid connected inverter, it is found that multiple parallel connected inverters and grid impedance can influence each other if they each have a harmonic compensation function. The analysis method proposed in this paper is based on the relationship between the overall output impedance and input impedance of parallel connected inverter, where controller gain design method, which can...
Measurement model and calibration experiment of over-constrained parallel six-dimensional force sensor based on stiffness characteristics analysis

International Nuclear Information System (INIS)

Niu, Zhi; Zhao, Yanzhi; Zhao, Tieshi; Cao, Yachao; Liu, Menghua

2017-01-01

An over-constrained, parallel six-dimensional force sensor has various advantages, including its ability to bear heavy loads and provide redundant force measurement information. These advantages render the sensor valuable in important applications in the field of aerospace (space docking tests, etc). The stiffness of each component in the over-constrained structure has a considerable influence on the internal force distribution of the structure. Thus, the measurement model changes when the measurement branches of the sensor are under tensile or compressive force. This study establishes a general measurement model for an over-constrained parallel six-dimensional force sensor considering the different branch tensions and compression stiffness values. Numerical calculations and analyses are performed using practical examples. Based on the parallel mechanism, an over-constrained, orthogonal structure is proposed for a six-dimensional force sensor. Hence, a prototype is designed and developed, and a calibration experiment is conducted. The measurement accuracy of the sensor is improved based on the measurement model under different branch tensions and compression stiffness values. Moreover, the largest class I error is reduced from 5.81 to 2.23% full scale (FS), and the largest class II error is reduced from 3.425 to 1.871% FS. (paper)
Exploiting multi-scale parallelism for large scale numerical modelling of laser wakefield accelerators

International Nuclear Information System (INIS)

Fonseca, R A; Vieira, J; Silva, L O; Fiuza, F; Davidson, A; Tsung, F S; Mori, W B

2013-01-01

A new generation of laser wakefield accelerators (LWFA), supported by the extreme accelerating fields generated in the interaction of PW-Class lasers and underdense targets, promises the production of high quality electron beams in short distances for multiple applications. Achieving this goal will rely heavily on numerical modelling to further understand the underlying physics and identify optimal regimes, but large scale modelling of these scenarios is computationally heavy and requires the efficient use of state-of-the-art petascale supercomputing systems. We discuss the main difficulties involved in running these simulations and the new developments implemented in the OSIRIS framework to address these issues, ranging from multi-dimensional dynamic load balancing and hybrid distributed/shared memory parallelism to the vectorization of the PIC algorithm. We present the results of the OASCR Joule Metric program on the issue of large scale modelling of LWFA, demonstrating speedups of over 1 order of magnitude on the same hardware. Finally, scalability to over ∼10^6 cores and sustained performance over ∼2 PFlops is demonstrated, opening the way for large scale modelling of LWFA scenarios. (paper)
Kinetics of transformations nucleated on random parallel planes: analytical modelling and computer simulation

International Nuclear Information System (INIS)

Rios, Paulo R; Assis, Weslley L S; Ribeiro, Tatiana C S; Villa, Elena

2012-01-01

In a classical paper, Cahn derived expressions for the kinetics of transformations nucleated on random planes and lines. He used those as a model for nucleation on the boundaries, edges and vertices of a polycrystal consisting of equiaxed grains. In this paper it is demonstrated that Cahn's expression for random planes may be used in situations beyond the scope envisaged in Cahn's original paper. For instance, we derived an expression for the kinetics of transformations nucleated on random parallel planes that is identical to that formerly obtained by Cahn considering random planes. Computer simulation of transformations nucleated on random parallel planes is carried out. It is shown that there is excellent agreement between simulated results and analytical solutions. Such an agreement is to be expected if both the simulation and the analytical solution are correct. (paper)
Cocaine Use and Delinquent Behavior among High-Risk Youths: A Growth Model of Parallel Processes

Science.gov (United States)

Dembo, Richard; Sullivan, Christopher

2009-01-01

We report the results of a parallel-process, latent growth model analysis examining the relationships between cocaine use and delinquent behavior among youths. The study examined a sample of 278 justice-involved juveniles completing at least one of three follow-up interviews as part of a National Institute on Drug Abuse-funded study. The results…

Error Modeling and Experimental Study of a Flexible Joint 6-UPUR Parallel Six-Axis Force Sensor.

Science.gov (United States)

Zhao, Yanzhi; Cao, Yachao; Zhang, Caifeng; Zhang, Dan; Zhang, Jie

2017-09-29

By combining a parallel mechanism with integrated flexible joints, a large measurement range and high accuracy sensor is realized. However, the main errors of the sensor involve not only assembly errors, but also deformation errors of its flexible leg. Based on a flexible joint 6-UPUR (a kind of mechanism configuration where U-universal joint, P-prismatic joint, R-revolute joint) parallel six-axis force sensor developed during the prephase, assembly and deformation error modeling and analysis of the resulting sensors with a large measurement range and high accuracy are made in this paper. First, an assembly error model is established based on the imaginary kinematic joint method and the Denavit-Hartenberg (D-H) method. Next, a stiffness model is built to solve the stiffness matrix. The deformation error model of the sensor is obtained. Then, the first order kinematic influence coefficient matrix when the synthetic error is taken into account is solved. Finally, measurement and calibration experiments of the sensor composed of the hardware and software system are performed. Forced deformation of the force-measuring platform is detected by using laser interferometry and analyzed to verify the correctness of the synthetic error model. In addition, the first order kinematic influence coefficient matrix in actual circumstances is calculated. By comparing the condition numbers and square norms of the coefficient matrices, the conclusion is drawn theoretically that it is very important to take into account the synthetic error for design stage of the sensor and helpful to improve performance of the sensor in order to meet needs of actual working environments.
Error Modeling and Experimental Study of a Flexible Joint 6-UPUR Parallel Six-Axis Force Sensor

Directory of Open Access Journals (Sweden)

Yanzhi Zhao

2017-09-01

By combining a parallel mechanism with integrated flexible joints, a large measurement range and high accuracy sensor is realized. However, the main errors of the sensor involve not only assembly errors, but also deformation errors of its flexible leg. Based on a flexible joint 6-UPUR (a kind of mechanism configuration where U-universal joint, P-prismatic joint, R-revolute joint) parallel six-axis force sensor developed during the prephase, assembly and deformation error modeling and analysis of the resulting sensors with a large measurement range and high accuracy are made in this paper. First, an assembly error model is established based on the imaginary kinematic joint method and the Denavit-Hartenberg (D-H) method. Next, a stiffness model is built to solve the stiffness matrix. The deformation error model of the sensor is obtained. Then, the first order kinematic influence coefficient matrix when the synthetic error is taken into account is solved. Finally, measurement and calibration experiments of the sensor composed of the hardware and software system are performed. Forced deformation of the force-measuring platform is detected by using laser interferometry and analyzed to verify the correctness of the synthetic error model. In addition, the first order kinematic influence coefficient matrix in actual circumstances is calculated. By comparing the condition numbers and square norms of the coefficient matrices, the conclusion is drawn theoretically that it is very important to take into account the synthetic error for design stage of the sensor and helpful to improve performance of the sensor in order to meet needs of actual working environments.
Modeling of fatigue crack induced nonlinear ultrasonics using a highly parallelized explicit local interaction simulation approach

Science.gov (United States)

Shen, Yanfeng; Cesnik, Carlos E. S.

2016-04-01

This paper presents a parallelized modeling technique for the efficient simulation of nonlinear ultrasonics introduced by the wave interaction with fatigue cracks. The elastodynamic wave equations with contact effects are formulated using an explicit Local Interaction Simulation Approach (LISA). The LISA formulation is extended to capture the contact-impact phenomena during the wave damage interaction based on the penalty method. A Coulomb friction model is integrated into the computation procedure to capture the stick-slip contact shear motion. The LISA procedure is coded using the Compute Unified Device Architecture (CUDA), which enables highly parallelized supercomputing on powerful graphics cards. Both the explicit contact formulation and the parallel feature contribute to LISA's superb computational efficiency over the conventional finite element method (FEM). The theoretical formulations based on the penalty method are introduced and a guideline for the proper choice of the contact stiffness is given. The convergence behavior of the solution under various contact stiffness values is examined. A numerical benchmark problem is used to investigate the new LISA formulation and results are compared with a conventional contact finite element solution. Various nonlinear ultrasonic phenomena are successfully captured using this contact LISA formulation, including the generation of nonlinear higher harmonic responses. Nonlinear mode conversion of guided waves at fatigue cracks is also studied.
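The penalty contact and Coulomb stick-slip ideas named in this abstract can be illustrated with a minimal single-contact-point sketch. This is not the authors' LISA or CUDA implementation; the function name, the parameter names, and the separate tangential penalty stiffness are assumptions made for illustration only.

```python
import math

def penalty_contact(gap, tangential_slip, k_normal, k_tangent, mu):
    """Return (normal_force, tangential_force, state) for one contact point.

    gap             : signed crack-face distance; negative means penetration
    tangential_slip : relative tangential displacement of the two faces
    k_normal        : normal penalty (contact) stiffness, hypothetical name
    k_tangent       : tangential penalty stiffness, hypothetical name
    mu              : Coulomb friction coefficient
    """
    if gap >= 0.0:
        # Faces are separated: no contact forces at all.
        return 0.0, 0.0, "open"
    # Penalty method: the normal force grows linearly with penetration depth.
    normal_force = k_normal * (-gap)
    # Trial tangential force, assuming the faces stick together.
    trial = k_tangent * tangential_slip
    limit = mu * normal_force
    if abs(trial) <= limit:
        return normal_force, trial, "stick"
    # Coulomb limit exceeded: the force magnitude is capped and the faces slip.
    return normal_force, math.copysign(limit, trial), "slip"
```

A larger penalty stiffness reduces the allowed penetration but tightens the stable time step of an explicit scheme, which is presumably why the abstract mentions a guideline for choosing the contact stiffness.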
A detailed chemistry model for transient hydrogen and carbon monoxide catalytic recombination on parallel flat Pt surfaces implemented in an integral code

International Nuclear Information System (INIS)

Jimenez, Miguel A.; Martin-Valdepenas, Juan M.; Martin-Fuertes, Francisco; Fernandez, Jose A.

2007-01-01

A detailed chemistry model has been adapted and developed for surface chemistry, heat and mass transfer between H2/CO/air/steam/CO2 mixtures and vertical parallel Pt-coated surfaces. This model is based on a simplified Deutschmann reaction scheme for methane surface combustion and the analysis by Elenbaas for buoyancy-induced heat transfer between parallel plates. Mass transfer is treated by the heat and mass transfer analogy. The proposed model is able to simulate the H2/CO recombination phenomena characteristic of parallel-plate Passive Autocatalytic Recombiners (PARs), which have been proposed and implemented as a promising hydrogen-control strategy in the safety of nuclear power stations or other industries. The transient model is able to approach the warm-up phase of the PAR and its shut-down as well as the dynamic changes within the surrounding atmosphere. The model has been implemented within the MELCOR code and assessed against results of the Battelle Model Containment tests of the Zx series. Results show accurate predictions and a better performance than traditional methods in integral codes, i.e. empirical correlations, which are also highly case-specific. The influence of CO present in the mixture on the PAR performance is also addressed in this paper.
An approach to computing discrete adjoints for MPI-parallelized models applied to Ice Sheet System Model 4.11

Directory of Open Access Journals (Sweden)

E. Larour

2016-11-01

Within the framework of sea-level rise projections, there is a strong need for hindcast validation of the evolution of polar ice sheets in a way that tightly matches observational records (from radar, gravity, and altimetry observations mainly). However, the computational requirements for making hindcast reconstructions possible are severe and rely mainly on the evaluation of the adjoint state of transient ice-flow models. Here, we look at the computation of adjoints in the context of the NASA/JPL/UCI Ice Sheet System Model (ISSM), written in C++ and designed for parallel execution with MPI. We present the adaptations required in the way the software is designed and written, but also generic adaptations in the tools facilitating the adjoint computations. We concentrate on the use of operator overloading coupled with the AdjoinableMPI library to achieve the adjoint computation of the ISSM. We present a comprehensive approach to (1) carry out type changing through the ISSM, hence facilitating operator overloading, (2) bind to external solvers such as MUMPS and GSL-LU, and (3) handle MPI-based parallelism to scale the capability. We demonstrate the success of the approach by computing sensitivities of hindcast metrics such as the misfit to observed records of surface altimetry on the northeastern Greenland Ice Stream, or the misfit to observed records of surface velocities on Upernavik Glacier, central West Greenland. We also provide metrics for the scalability of the approach, and the expected performance. This approach has the potential to enable a new generation of hindcast-validated projections that make full use of the wealth of datasets currently being collected, or already collected, in Greenland and Antarctica.
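To make the phrase "operator overloading" concrete, the following is a minimal reverse-mode (adjoint) differentiation sketch. It is not ISSM or AdjoinableMPI code, it is serial, and the class name `Var`, the function `backward`, and the toy misfit are all hypothetical. The idea it shows is the "type changing" mentioned above: replacing plain floating-point numbers with an active type records each operation so that sensitivities can be propagated backwards afterwards.

```python
class Var:
    """Active scalar: stores a value plus the operations that produced it."""

    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents  # tuple of (parent Var, local partial derivative)
        self.grad = 0.0         # adjoint, filled in by backward()

    @staticmethod
    def _lift(other):
        return other if isinstance(other, Var) else Var(other)

    def __add__(self, other):
        other = Var._lift(other)
        return Var(self.value + other.value, ((self, 1.0), (other, 1.0)))

    __radd__ = __add__

    def __sub__(self, other):
        other = Var._lift(other)
        return Var(self.value - other.value, ((self, 1.0), (other, -1.0)))

    def __mul__(self, other):
        other = Var._lift(other)
        return Var(self.value * other.value,
                   ((self, other.value), (other, self.value)))

    __rmul__ = __mul__


def backward(output):
    """Propagate adjoints from the output back to every input."""
    order, seen = [], set()

    def visit(node):
        if id(node) in seen:
            return
        seen.add(id(node))
        for parent, _ in node.parents:
            visit(parent)
        order.append(node)  # post-order: parents come before children

    visit(output)
    output.grad = 1.0
    for node in reversed(order):  # children are processed before parents
        for parent, local in node.parents:
            parent.grad += local * node.grad


# Toy "hindcast misfit": squared difference between a model value and an observation.
a, x = Var(2.0), Var(3.0)
diff = (a * x + 1.0) - 5.0
misfit = diff * diff
backward(misfit)
```

After the call, `a.grad` and `x.grad` hold the sensitivities of the misfit with respect to each input, obtained from a single backward pass regardless of how many inputs there are.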
A Parallel Process Growth Model of Avoidant Personality Disorder Symptoms and Personality Traits

Science.gov (United States)

Wright, Aidan G. C.; Pincus, Aaron L.; Lenzenweger, Mark F.

2012-01-01

Background: Avoidant personality disorder (AVPD), like other personality disorders, has historically been construed as a highly stable disorder. However, results from a number of longitudinal studies have found that the symptoms of AVPD demonstrate marked change over time. Little is known about which other psychological systems are related to this change. Although cross-sectional research suggests a strong relationship between AVPD and personality traits, no work has examined the relationship of their change trajectories. The current study sought to establish the longitudinal relationship between AVPD and basic personality traits using parallel process growth curve modeling. Methods: Parallel process growth curve modeling was applied to the trajectories of AVPD and basic personality traits from the Longitudinal Study of Personality Disorders (Lenzenweger, 2006), a naturalistic, prospective, multiwave, longitudinal study of personality disorder, temperament, and normal personality. The focus of these analyses is on the relationship between the rates of change in both AVPD symptoms and basic personality traits. Results: AVPD symptom trajectories demonstrated significant negative relationships with the trajectories of interpersonal dominance and affiliation, and a significant positive relationship to rates of change in neuroticism. Conclusions: These results provide some of the first compelling evidence that trajectories of change in PD symptoms and personality traits are linked. These results have important implications for the ways in which temporal stability is conceptualized in AVPD specifically, and PD in general. PMID:22506627
"Let's Move" campaign: applying the extended parallel process model.

Science.gov (United States)

Batchelder, Alicia; Matusitz, Jonathan

2014-01-01

This article examines Michelle Obama's health campaign, "Let's Move," through the lens of the extended parallel process model (EPPM). "Let's Move" aims to reduce the childhood obesity epidemic in the United States. Developed by Kim Witte, EPPM rests on the premise that people's attitudes can be changed when fear is exploited as a factor of persuasion. Fear appeals work best (a) when a person feels a concern about the issue or situation, and (b) when he or she believes he or she is capable of dealing with that issue or situation. Overall, the analysis found that "Let's Move" is based on past health campaigns that have been successful. An important element of the campaign is the use of fear appeals (as postulated by EPPM). For example, part of the campaign's strategy is to explain the severity of the diseases associated with obesity. By looking at the steps of EPPM, readers can also understand the strengths and weaknesses of "Let's Move."
Modeling, analysis, and design of stationary reference frame droop controlled parallel three-phase voltage source inverters

DEFF Research Database (Denmark)

Vasquez, Juan Carlos; Guerrero, Josep M.; Savaghebi, Mehdi

2011-01-01

Power electronics based microgrids consist of a number of voltage source inverters (VSIs) operating in parallel. In this paper, the modeling, control design, and stability analysis of three-phase VSIs are derived. The proposed voltage and current inner control loops and the mathematical models... and discussed. Experimental results are provided to validate the performance and robustness of the VSIs functionality during islanded and grid-connected operations, allowing a seamless transition between these modes through control hierarchies by regulating frequency and voltage, main-grid interactivity...
Coupled Model of channels in parallel and neutron kinetics in two dimensions

International Nuclear Information System (INIS)

Cecenas F, M.; Campos G, R.M.; Valle G, E. del

2004-01-01

This work presents an arrangement of thermohydraulic channels that represent the four quadrants of a BWR-type reactor core. The channels are coupled to a two-dimensional neutron kinetics model that generates the radial power profile of the reactor. Although the neutronic model is two-dimensional, it is supplemented with additional axial information by considering the axial power profile of each thermohydraulic channel. The steady state is obtained by imposing, as a boundary condition, the same pressure drop for all channels. This condition is satisfied by iterating on the coolant flow in each channel until the pressure drops in all channels are equal. This steady state is then perturbed by modifying the cross-section values corresponding to one assembly. The parallel calculation of the neutronics and the thermohydraulics is carried out with PVM (Parallel Virtual Machine) using a master-slave scheme on a local network of computers. (Author)
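The steady-state condition described in this abstract (equal pressure drop across all parallel channels, reached by iterating on the channel flows) can be sketched as a nested bisection. This is an illustration only: the quadratic loss law `k * w**2`, the loss coefficients, the bracketing limits and all function names are assumptions, and a real thermohydraulic channel model would replace the pressure-drop functions.

```python
def flow_for_drop(drop_fn, target_drop, flow_hi=1.0e4, iters=100):
    """Find the flow at which one channel produces target_drop.

    Assumes drop_fn increases monotonically with flow on [0, flow_hi].
    """
    lo, hi = 0.0, flow_hi
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if drop_fn(mid) < target_drop:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def balance_channels(drop_fns, total_flow, drop_hi=1.0e6, iters=100):
    """Find the common pressure drop and the flow split among parallel channels.

    Outer bisection on the shared pressure drop; for each trial value the
    per-channel flows are computed and their sum is compared with total_flow.
    """
    lo, hi = 0.0, drop_hi
    flows = []
    for _ in range(iters):
        drop = 0.5 * (lo + hi)
        flows = [flow_for_drop(fn, drop) for fn in drop_fns]
        if sum(flows) < total_flow:
            lo = drop
        else:
            hi = drop
    return drop, flows


# Hypothetical loss coefficients for two channels; pressure drop = k * w**2.
channels = [lambda w, k=k: k * w * w for k in (1.0, 4.0)]
common_drop, channel_flows = balance_channels(channels, total_flow=3.0)
```

In a master-slave arrangement such as the one the abstract describes, the per-channel evaluations inside the outer loop are independent of one another, so each could be handed to a separate worker.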
Parallel Solution of Robust Nonlinear Model Predictive Control Problems in Batch Crystallization

Directory of Open Access Journals (Sweden)

Yankai Cao

2016-06-01

Representing the uncertainties with a set of scenarios, the optimization problem resulting from a robust nonlinear model predictive control (NMPC) strategy at each sampling instance can be viewed as a large-scale stochastic program. This paper solves these optimization problems using the parallel Schur complement method developed to solve stochastic programs on distributed and shared memory machines. The control strategy is illustrated with a case study of a multidimensional unseeded batch crystallization process. For this application, a robust NMPC based on min–max optimization guarantees satisfaction of all state and input constraints for a set of uncertainty realizations, and also provides better robust performance compared with open-loop optimal control, nominal NMPC, and robust NMPC minimizing the expected performance at each sampling instance. The performance of robust NMPC can be improved by generating optimization scenarios using Bayesian inference. With the efficient parallel solver, the solution time of one optimization problem is reduced from 6.7 min to 0.5 min, allowing for real-time application.
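The Schur complement idea behind scenario decomposition can be shown on a deliberately tiny block-arrowhead linear system in which every block is a single number. This is a sketch under strong simplifying assumptions, not the solver used in the paper: real scenario blocks are large sparse matrices, and the names `k`, `b`, `r`, `k0` and `r0` are hypothetical. Each scenario i contributes an equation k[i]*x[i] + b[i]*y = r[i], and one coupling equation sum(b[i]*x[i]) + k0*y = r0 links all scenarios through the shared variable y.

```python
def solve_arrowhead(k, b, r, k0, r0):
    """Solve a scalar-block arrowhead system by Schur complement elimination.

    Scenario equations : k[i] * x[i] + b[i] * y = r[i]
    Coupling equation  : sum(b[i] * x[i]) + k0 * y = r0
    """
    # Step 1 (independent per scenario, hence parallelizable):
    # each scenario contributes one term to the Schur complement and its right-hand side.
    schur_terms = [bi * bi / ki for ki, bi in zip(k, b)]
    rhs_terms = [bi * ri / ki for ki, bi, ri in zip(k, b, r)]

    # Step 2 (small, serial): solve the reduced system for the coupling variable.
    schur = k0 - sum(schur_terms)
    y = (r0 - sum(rhs_terms)) / schur

    # Step 3 (independent per scenario again): back-substitute for each x[i].
    x = [(ri - bi * y) / ki for ki, bi, ri in zip(k, b, r)]
    return x, y


x, y = solve_arrowhead(k=[2.0, 4.0], b=[1.0, 2.0], r=[3.0, 10.0], k0=5.0, r0=9.0)
```

Only step 2 needs information from all scenarios at once, which is what makes the method attractive on distributed and shared memory machines.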
Analytical model for vibration prediction of two parallel tunnels in a full-space

Science.gov (United States)

He, Chao; Zhou, Shunhua; Guo, Peijun; Di, Honggui; Zhang, Xiaohui

2018-06-01

This paper presents a three-dimensional analytical model for the prediction of ground vibrations from two parallel tunnels embedded in a full-space. The two tunnels are modelled as cylindrical shells of infinite length, and the surrounding soil is modelled as a full-space with two cylindrical cavities. A virtual interface is introduced to divide the soil into the right layer and the left layer. By transforming the cylindrical waves into the plane waves, the solution of wave propagation in the full-space with two cylindrical cavities is obtained. The transformations from the plane waves to cylindrical waves are then used to satisfy the boundary conditions on the tunnel-soil interfaces. The proposed model provides a highly efficient tool to predict the ground vibration induced by the underground railway, which accounts for the dynamic interaction between neighbouring tunnels. Analysis of the vibration fields produced over a range of frequencies and soil properties is conducted. When the distance between the two tunnels is smaller than three times the tunnel diameter, the interaction between neighbouring tunnels is highly significant, at times on the order of 20 dB. It is necessary to consider the interaction between neighbouring tunnels for the prediction of ground vibrations induced by underground railways.
Investigation of the applicability of a functional programming model to fault-tolerant parallel processing for knowledge-based systems

Science.gov (United States)

Harper, Richard

1989-01-01

In a fault-tolerant parallel computer, a functional programming model can facilitate distributed checkpointing, error recovery, load balancing, and graceful degradation. Such a model has been implemented on the Draper Fault-Tolerant Parallel Processor (FTPP). When used in conjunction with the FTPP's fault detection and masking capabilities, this implementation results in a graceful degradation of system performance after faults. Three graceful degradation algorithms have been implemented and are presented. A user interface has been implemented which requires minimal cognitive overhead by the application programmer, masking such complexities as the system's redundancy, distributed nature, variable complement of processing resources, load balancing, fault occurrence and recovery. This user interface is described and its use demonstrated. The applicability of the functional programming style to the Activation Framework, a paradigm for intelligent systems, is then briefly described.
A modified parallel constitutive model for elevated temperature flow behavior of Ti-6Al-4V alloy based on multiple regression

Energy Technology Data Exchange (ETDEWEB)

Cai, Jun; Shi, Jiamin; Wang, Kuaishe; Wang, Wen; Wang, Qingjuan; Liu, Yingying [Xi'an Univ. of Architecture and Technology, Xi'an (China). School of Metallurgical Engineering; Li, Fuguo [Northwestern Polytechnical Univ., Xi'an (China). School of Materials Science and Engineering

2017-07-15

Constitutive analysis for hot working of Ti-6Al-4V alloy was carried out by using experimental stress-strain data from isothermal hot compression tests. A new kind of constitutive equation called a modified parallel constitutive model was proposed by considering the independent effects of strain, strain rate and temperature. The predicted flow stress data were compared with the experimental data. Statistical analysis was introduced to verify the validity of the developed constitutive equation. Subsequently, the accuracy of the proposed constitutive equations was evaluated by comparing with other constitutive models. The results showed that the developed modified parallel constitutive model based on multiple regression could predict flow stress of Ti-6Al-4V alloy with good correlation and generalization.
Towards anatomic scale agent-based modeling with a massively parallel spatially explicit general-purpose model of enteric tissue (SEGMEnT_HPC).

Science.gov (United States)

Cockrell, Robert Chase; Christley, Scott; Chang, Eugene; An, Gary

2015-01-01

Perhaps the greatest challenge currently facing the biomedical research community is the ability to integrate highly detailed cellular and molecular mechanisms to represent clinical disease states as a pathway to engineer effective therapeutics. This is particularly evident in the representation of organ-level pathophysiology in terms of abnormal tissue structure, which, through histology, remains a mainstay in disease diagnosis and staging. As such, being able to generate anatomic scale simulations is a highly desirable goal. While computational limitations have previously constrained the size and scope of multi-scale computational models, advances in the capacity and availability of high-performance computing (HPC) resources have greatly expanded the ability of computational models of biological systems to achieve anatomic, clinically relevant scale. Diseases of the intestinal tract are prime examples of pathophysiological processes that manifest at multiple scales of spatial resolution, with structural abnormalities present at the microscopic, macroscopic and organ levels. In this paper, we describe a novel, massively parallel computational model of the gut, the Spatially Explicit General-purpose Model of Enteric Tissue_HPC (SEGMEnT_HPC), which extends an existing model of the gut epithelium, SEGMEnT, in order to create cell-for-cell anatomic scale simulations. We present an example implementation of SEGMEnT_HPC that simulates the pathogenesis of ileal pouchitis, an important clinical entity that affects patients following remedial surgery for ulcerative colitis.
Modelling and Experimental Evaluation of a Static Balancing Technique for a New Horizontally Mounted 3-UPU Parallel Mechanism

Directory of Open Access Journals (Sweden)

Maryam Banitalebi Dehkordi

2012-11-01

This paper presents the modelling and experimental evaluation of the gravity compensation of a horizontal 3-UPU parallel mechanism. The conventional Newton-Euler method for static analysis and balancing of mechanisms works for serial robots; however, it can become computationally expensive when applied to the analysis of parallel manipulators. To overcome this difficulty, in this paper we propose an approach, based on a Lagrangian method, that is more efficient in terms of computation time. The derivation of the gravity compensation model is based on the analytical computation of the total potential energy of the system at each position of the end-effector. In order to satisfy the gravity compensation condition, the total potential energy of the system should remain constant for all of the manipulator's configurations. Analytical and mechanical gravity compensation is taken into account, and the set of conditions and the system of springs are defined. Finally, employing a virtual reality environment, some experiments are carried out and the reliability and feasibility of the proposed model are evaluated in the presence and absence of the elastic components.
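The balancing condition stated in this abstract (total potential energy constant over all configurations) can be checked numerically on the simplest textbook case: a single pivoted link with one zero-free-length spring. This is not the 3-UPU mechanism of the paper; the geometry, the parameter values and the function names below are assumptions chosen for illustration. The link pivots at the origin, its centre of mass lies at distance `com_dist` along the link, and the spring runs from a fixed anchor at height `a` above the pivot to a point at distance `b` along the link.

```python
import math

def total_potential_energy(theta, mass, g, com_dist, k, a, b):
    """Gravitational plus elastic energy at link angle theta (from the upward vertical)."""
    gravity = mass * g * com_dist * math.cos(theta)
    # Squared spring length from the law of cosines (zero free length assumed).
    spring_len_sq = a * a + b * b - 2.0 * a * b * math.cos(theta)
    return gravity + 0.5 * k * spring_len_sq


def balancing_stiffness(mass, g, com_dist, a, b):
    """Stiffness for which the cos(theta) terms cancel: k * a * b = m * g * L."""
    return mass * g * com_dist / (a * b)


def energy_spread(k, mass=2.0, g=9.81, com_dist=0.5, a=0.2, b=0.3, samples=181):
    """Difference between the largest and smallest energy over a sweep of angles."""
    angles = [math.pi * i / (samples - 1) for i in range(samples)]
    energies = [total_potential_energy(t, mass, g, com_dist, k, a, b) for t in angles]
    return max(energies) - min(energies)


k_balanced = balancing_stiffness(2.0, 9.81, 0.5, 0.2, 0.3)
spread_balanced = energy_spread(k_balanced)   # essentially zero
spread_unbalanced = energy_spread(0.0)        # gravity only, no spring
```

With the balancing stiffness the energy no longer depends on the angle, so the link stays at rest in any pose without actuator effort, which is the property a Lagrangian (energy-based) derivation exploits.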
Climate Ocean Modeling on Parallel Computers

Science.gov (United States)

Wang, P.; Cheng, B. N.; Chao, Y.

1998-01-01

Ocean modeling plays an important role in both understanding the current climatic conditions and predicting future climate change. However, modeling the ocean circulation at various spatial and temporal scales is a very challenging computational task.
P-HS-SFM: a parallel harmony search algorithm for the reproduction of experimental data in the continuous microscopic crowd dynamic models

Science.gov (United States)

Jaber, Khalid Mohammad; Alia, Osama Moh'd.; Shuaib, Mohammed Mahmod

2018-03-01

Finding the optimal parameters that can reproduce experimental data (such as the velocity-density relation and the specific flow rate) is a very important component of the validation and calibration of microscopic crowd dynamic models. Heavy computational demand during parameter search is a known limitation that exists in a previously developed model known as the Harmony Search-Based Social Force Model (HS-SFM). In this paper, a parallel-based mechanism is proposed to reduce the computational time and memory resource utilisation required to find these parameters. More specifically, two MATLAB-based multicore techniques (parfor and create independent jobs) using shared memory are developed by taking advantage of the multithreading capabilities of parallel computing, resulting in a new framework called the Parallel Harmony Search-Based Social Force Model (P-HS-SFM). The experimental results show that the parfor-based P-HS-SFM achieved a better computational time of about 26 h, an efficiency improvement of approximately 54% and a speedup factor of 2.196 times in comparison with the HS-SFM sequential processor. The performance of the P-HS-SFM using the create independent jobs approach is also comparable to parfor with a computational time of 26.8 h, an efficiency improvement of about 30% and a speedup of 2.137 times.
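As a quick consistency check on the figures quoted above, the sequential runtime implied by each technique can be recovered from the usual definition of speedup (sequential time divided by parallel time). The sketch below assumes that definition applies here and treats the rounded runtimes from the abstract as exact; the function name is hypothetical.

```python
def implied_sequential_hours(parallel_hours, speedup):
    """Sequential runtime implied by speedup = sequential time / parallel time."""
    return parallel_hours * speedup


# Figures as quoted in the abstract (runtimes are approximate there).
parfor_sequential = implied_sequential_hours(26.0, 2.196)
jobs_sequential = implied_sequential_hours(26.8, 2.137)
```

Both techniques imply a sequential baseline of roughly 57 hours, so the two reported speedups are mutually consistent under this definition.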
Cache-aware data structure model for parallelism and dynamic load balancing

International Nuclear Information System (INIS)

Sridi, Marwa

2016-01-01

This PhD thesis is dedicated to the implementation of innovative parallel methods in the framework of fast transient fluid-structure dynamics. It improves existing methods within the EUROPLEXUS software in order to optimize the shared memory parallel strategy, which complements the original distributed memory approach; the two are brought together into a global hybrid strategy for clusters of multi-core nodes. Starting from a sound analysis of the state of the art concerning data structuring techniques correlated to the hierarchical memory organization of current multi-processor architectures, the proposed work introduces an approach suitable for explicit time integration (i.e. with no linear system to solve at each step). A data structure of type 'Structure of arrays' is retained for the global data storage, providing flexibility and efficiency for common operations on kinematic fields (displacement, velocity and acceleration). In contrast, in the particular case of elementary operations (generic internal force computations, as well as flux computations between cell faces for fluid models), which are particularly time consuming but localized in the program, a temporary data structure of type 'Array of structures' is used instead, to force an efficient filling of the cache memory and increase the performance of the resolution, for both serial and shared memory parallel processing. Switching from the global structure to the temporary one is based on a cell grouping strategy, following classic cache-blocking principles but, specifically for this work, also handling the neighboring data necessary for the efficient treatment of ALE fluxes for cells on the group boundaries. The proposed approach is extensively tested, from the points of view of both computation time and cache misses, weighing the gains obtained within the elementary operations against the potential overhead generated by the data structure switch. Obtained results are very satisfactory, especially
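The switch between the two layouts described in this abstract can be sketched as follows. This is not EUROPLEXUS code: the field names, the group size and the explicit update rule are assumptions, and pure Python cannot show the actual cache effects. The sketch only shows the data movement, in which global fields stay in a structure of arrays while each group of cells is gathered into a temporary array of structures, processed, and scattered back.

```python
from array import array

def make_fields(n_cells):
    """Global storage as a structure of arrays: one contiguous array per field."""
    return {
        "disp": array("d", [0.0] * n_cells),
        "vel": array("d", [1.0] * n_cells),
        "acc": array("d", [2.0] * n_cells),
    }

def gather_block(fields, cell_ids):
    """Copy one group of cells into a temporary array of structures."""
    return [(fields["disp"][i], fields["vel"][i], fields["acc"][i]) for i in cell_ids]

def advance_block(block, dt):
    """Element-level work on the compact block (a simple explicit update here)."""
    updated = []
    for disp, vel, acc in block:
        new_vel = vel + acc * dt
        updated.append((disp + new_vel * dt, new_vel, acc))
    return updated

def scatter_block(fields, cell_ids, block):
    """Write the results of one group back into the global arrays."""
    for i, (disp, vel, acc) in zip(cell_ids, block):
        fields["disp"][i] = disp
        fields["vel"][i] = vel
        fields["acc"][i] = acc

def step(fields, dt, group_size):
    """One time step, processed group by group (cache-blocking style)."""
    n_cells = len(fields["disp"])
    for start in range(0, n_cells, group_size):
        cell_ids = range(start, min(start + group_size, n_cells))
        block = advance_block(gather_block(fields, cell_ids), dt)
        scatter_block(fields, cell_ids, block)

fields = make_fields(10)
step(fields, dt=0.5, group_size=4)
```

The gather and scatter copies are the overhead the abstract refers to; the approach pays off only if the work done on each compact block outweighs them.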
Evaluation of RANS and LES models for Natural Convection in High-Aspect-Ratio Parallel Plate Channels

Science.gov (United States)

Fradeneck, Austen; Kimber, Mark

2017-11-01

The present study evaluates the effectiveness of current RANS and LES models in simulating natural convection in high-aspect ratio parallel plate channels. The geometry under consideration is based on a simplification of the coolant and bypass channels in the very high-temperature gas reactor (VHTR). Two thermal conditions are considered, asymmetric and symmetric wall heating with an applied heat flux to match Rayleigh numbers experienced in the VHTR during a loss of flow accident (LOFA). RANS models are compared to analogous high-fidelity LES simulations. Preliminary results demonstrate the efficacy of the low-Reynolds number k-ε formulations and their enhancement to the standard form and Reynolds stress transport model in terms of calculating the turbulence production due to buoyancy and overall mean flow variables.
Algebraic multigrid preconditioning within parallel finite-element solvers for 3-D electromagnetic modelling problems in geophysics

Science.gov (United States)

Koldan, Jelena; Puzyrev, Vladimir; de la Puente, Josep; Houzeaux, Guillaume; Cela, José María

2014-06-01

We present an elaborate preconditioning scheme for Krylov subspace methods which has been developed to improve the performance and reduce the execution time of parallel node-based finite-element (FE) solvers for 3-D electromagnetic (EM) numerical modelling in exploration geophysics. This new preconditioner is based on algebraic multigrid (AMG) that uses different basic relaxation methods, such as Jacobi, symmetric successive over-relaxation (SSOR) and Gauss-Seidel, as smoothers and the wave front algorithm to create groups, which are used for a coarse-level generation. We have implemented and tested this new preconditioner within our parallel nodal FE solver for 3-D forward problems in EM induction geophysics. We have performed a series of experiments for several models with different conductivity structures and characteristics to test the performance of our AMG preconditioning technique when combined with the biconjugate gradient stabilized method. The results have shown that the more challenging the problem is in terms of conductivity contrasts, ratio between the sizes of grid elements and/or frequency, the more benefit is obtained by using this preconditioner. Compared to other preconditioning schemes, such as diagonal, SSOR and truncated approximate inverse, the AMG preconditioner greatly improves the convergence of the iterative solver for all tested models. Also, in cases where other preconditioners succeed in converging to a desired precision, AMG is able to considerably reduce the total execution time of the forward-problem code, by up to an order of magnitude. Furthermore, the tests have confirmed that our AMG scheme ensures a grid-independent rate of convergence, as well as improvement in convergence regardless of how big local mesh refinements are. In addition, AMG is designed to be a black-box preconditioner, which makes it easy to use and combine with different iterative methods. Finally, it has proved to be very practical and efficient in the
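Two of the relaxation methods named in this abstract as AMG smoothers, Jacobi and Gauss-Seidel, can be sketched on a small dense system. This shows only the smoothing step, not the coarse-level construction or the Krylov solver, and it is not the authors' parallel code; the damping factor, the 1-D Poisson test matrix and the function names are assumptions.

```python
def jacobi_sweep(A, b, x, omega=2.0 / 3.0):
    """One damped Jacobi sweep: every update uses only the previous iterate."""
    n = len(b)
    updated = []
    for i in range(n):
        off_diag = sum(A[i][j] * x[j] for j in range(n) if j != i)
        updated.append((1.0 - omega) * x[i] + omega * (b[i] - off_diag) / A[i][i])
    return updated


def gauss_seidel_sweep(A, b, x):
    """One Gauss-Seidel sweep: each update immediately reuses the newest values."""
    n = len(b)
    x = list(x)
    for i in range(n):
        off_diag = sum(A[i][j] * x[j] for j in range(n) if j != i)
        x[i] = (b[i] - off_diag) / A[i][i]
    return x


def residual_norm(A, b, x):
    """Euclidean norm of b - A x."""
    n = len(b)
    res = [b[i] - sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
    return sum(r * r for r in res) ** 0.5


def poisson_1d(n):
    """Tridiagonal (-1, 2, -1) matrix, a standard small test problem."""
    return [[2.0 if i == j else -1.0 if abs(i - j) == 1 else 0.0
             for j in range(n)] for i in range(n)]


A, b = poisson_1d(8), [1.0] * 8
x_jacobi, x_gs = [0.0] * 8, [0.0] * 8
for _ in range(200):
    x_jacobi = jacobi_sweep(A, b, x_jacobi)
    x_gs = gauss_seidel_sweep(A, b, x_gs)
```

Used on their own these sweeps converge slowly because smooth error components decay slowly; inside a multigrid cycle only a few sweeps are applied per level and the coarse levels handle the smooth components. Jacobi updates are independent of one another, which makes that smoother the easier of the two to parallelize.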

Development of mathematical model and optimal control system of internal temperatures of hot-blast stove process in staggered parallel operation; Netsufuro sushiki model to parallel sofu ni okeru ronai ondo saiteki seigyo system no kaihatsu

Energy Technology Data Exchange (ETDEWEB)

Matoba, Y. [Sumitomo Metal Industries, Ltd., Osaka (Japan); Otsuka, K.

1998-07-01

A mathematical model and an optimal control system of hot-blast stove process are described. A precise mathematical simulation model of the hot-blast stove was developed and the accuracy of the model has been confirmed. An optimal control system of the thermal conditions of the hot-blast stoves in staggered parallel operation was also developed. By the use of the multivariable optimal regulator and the feedforward compensations for the change of the aimed blast temperature and blast volume, the system is able to control the hot blast temperature and the brick temperature efficiently. The system has been applied to Kashima works. The variations of the blast temperature and the silica brick temperature have been decreased. The ultimate low heat level operations have been realized and the thermal efficiency furthermore has been raised by about 1%. 8 refs., 14 figs., 1 tab.
Modeling and control of a parallel waste heat recovery system for Euro-VI heavy-duty diesel engines

NARCIS (Netherlands)

Feru, E.; Willems, F.P.T.; Jager, de A.G.; Steinbuch, M.

2014-01-01

This paper presents the modeling and control of a waste heat recovery system for a Euro-VI heavy-duty truck engine. The considered waste heat recovery system consists of two parallel evaporators with expander and pumps mechanically coupled to the engine crankshaft. Compared to previous work, the
Modeling and Control of a Parallel Waste Heat Recovery System for Euro-VI Heavy-Duty Diesel Engines

NARCIS (Netherlands)

Feru, E.; Willems, F.P.T.; Jager, B. de; Steinbuch, M.

2014-01-01

This paper presents the modeling and control of a waste heat recovery system for a Euro-VI heavy-duty truck engine. The considered waste heat recovery system consists of two parallel evaporators with expander and pumps mechanically coupled to the engine crankshaft. Compared to previous work, the
Toward real-time diffuse optical tomography: accelerating light propagation modeling employing parallel computing on GPU and CPU

Science.gov (United States)

Doulgerakis, Matthaios; Eggebrecht, Adam; Wojtkiewicz, Stanislaw; Culver, Joseph; Dehghani, Hamid

2017-12-01

Parameter recovery in diffuse optical tomography is a computationally expensive algorithm, especially when used for large and complex volumes, as in the case of human brain functional imaging. The modeling of light propagation, also known as the forward problem, is the computational bottleneck of the recovery algorithm, whereby the lack of a real-time solution is impeding practical and clinical applications. The objective of this work is the acceleration of the forward model, within a diffusion approximation-based finite-element modeling framework, employing parallelization to expedite the calculation of light propagation in realistic adult head models. The proposed methodology is applicable for modeling both continuous wave and frequency-domain systems with the results demonstrating a 10-fold speed increase when GPU architectures are available, while maintaining high accuracy. It is shown that, for a very high-resolution finite-element model of the adult human head with ∼600,000 nodes, consisting of heterogeneous layers, light propagation can be calculated at ∼0.25 s/excitation source.
SEISMIC SIMULATIONS USING PARALLEL COMPUTING AND THREE-DIMENSIONAL EARTH MODELS TO IMPROVE NUCLEAR EXPLOSION PHENOMENOLOGY AND MONITORING

Energy Technology Data Exchange (ETDEWEB)

Rodgers, A; Matzel, E; Pasyanos, M; Petersson, A; Sjogreen, B; Bono, C; Vorobiev, O; Antoun, T; Walter, W; Myers, S; Lomov, I

2008-07-07

The development of accurate numerical methods to simulate wave propagation in three-dimensional (3D) earth models and advances in computational power offer exciting possibilities for modeling the motions excited by underground nuclear explosions. This presentation will describe recent work to use new numerical techniques and parallel computing to model earthquakes and underground explosions to improve understanding of the wave excitation at the source and path-propagation effects. Firstly, we are using the spectral element method (SEM, SPECFEM3D code of Komatitsch and Tromp, 2002) to model earthquakes and explosions at regional distances using available 3D models. SPECFEM3D simulates anelastic wave propagation in fully 3D earth models in spherical geometry with the ability to account for free surface topography, anisotropy, ellipticity, rotation and gravity. Results show in many cases that 3D models are able to reproduce features of the observed seismograms that arise from path-propagation effects (e.g. enhanced surface wave dispersion, refraction, amplitude variations from focusing and defocusing, tangential component energy from isotropic sources). We are currently investigating the ability of different 3D models to predict path-specific seismograms as a function of frequency. A number of models developed using a variety of methodologies are available for testing. These include the WENA/Unified model of Eurasia (e.g. Pasyanos et al 2004), the global CUB 2.0 model (Shapiro and Ritzwoller, 2002), the partitioned waveform model for the Mediterranean (van der Lee et al., 2007) and stochastic models of the Yellow Sea Korean Peninsula region (Pasyanos et al., 2006). Secondly, we are extending our Cartesian anelastic finite difference code (WPP of Nilsson et al., 2007) to model the effects of free-surface topography. WPP models anelastic wave propagation in fully 3D earth models using mesh refinement to increase computational speed and improve memory efficiency. Thirdly
Bentonite electrical conductivity: a model based on series–parallel transport

KAUST Repository

Lima, Ana T.

2010-01-30

Bentonite has significant applications nowadays, among them as landfill liners, in the concrete industry as a repairing material, and as drilling mud in oil well construction. The application of an electric field to such perimeters is under wide discussion, and is the subject of many studies. However, to understand the behaviour of such an expansive and plastic material under the influence of an electric field, knowledge of its electrical properties is essential. This work serves to compare existing data on such electrical behaviour with new laboratory results. Electrical conductivity is a pertinent parameter since it indicates how prone a material is to conduct electricity. In the current study, total conductivity of a compacted porous medium was established to be dependent upon density of the bentonite plug. Therefore, surface conductivity was addressed and a series-parallel transport model used to quantify/predict the total conductivity of the system. © The Author(s) 2010.
Particle and parallel momentum balance equations with inclusion of drifts, for modelling strong- to weakly-collisional edge plasmas

International Nuclear Information System (INIS)

Chankin, A. V.; Stangeby, P. C.

2006-01-01

A system of plasma particle and parallel momentum balance equations is derived appropriate for understanding the role of drifts in the edge and for edge modelling, particularly in the scrape-off layer (SOL) of tokamaks, stellarators and other magnetic confinement devices. The formulation allows for strong collisionality but also covers the case of weak collisionality and strong drifts, a combination often encountered in the SOL. The most important terms are identified by assessing the magnitude of characteristic velocities and fluxes for the plasma edge region. Explanations of the physical nature of each term are provided. A number of terms that are sometimes not included in edge modelling have been included in the parallel momentum balance equation after detailed analysis of the parallel component of the gradient of the total pressure-stress tensor. This includes terms related to curvature and divergence of the field lines, as well as further contributions coming from viscous forces related mainly to the ion centrifugal drift. All these terms are shown to be roughly of the same order of magnitude as convective momentum fluxes related to drifts and therefore should be included in the momentum balance equation.
Parallel Computation of RCS of Electrically Large Platform with Coatings Modeled with NURBS Surfaces

Directory of Open Access Journals (Sweden)

Ying Yan

2012-01-01

The significance of Radar Cross Section (RCS) in military applications makes its prediction an important problem. This paper uses large-scale parallel Physical Optics (PO) to realize the fast computation of the RCS of electrically large targets, which are modeled by Non-Uniform Rational B-Spline (NURBS) surfaces and coated with dielectric materials. Some numerical examples are presented to validate this paper’s method. In addition, 1024 CPUs are used at the Shanghai Supercomputer Center (SSC) to perform the simulation of a model with a maximum electrical size of 1966.7 λ for the first time in China. The results show that this paper’s method can greatly speed up the calculation and is capable of solving the real-life problem of RCS prediction.
Petascale Hierarchical Modeling VIA Parallel Execution

Energy Technology Data Exchange (ETDEWEB)

Gelman, Andrew [Principal Investigator]

2014-04-14

The research allows more effective model building. By allowing researchers to fit complex models to large datasets in a scalable manner, our algorithms and software enable more effective scientific research. In the new area of “big data,” it is often necessary to fit “big models” to adjust for systematic differences between sample and population. For this task, scalable and efficient model-fitting tools are needed, and these have been achieved with our new Hamiltonian Monte Carlo algorithm, the no-U-turn sampler, and our new C++ program, Stan. In layman’s terms, our research enables researchers to create improved mathematical models for large and complex systems.
A parallel process growth model of avoidant personality disorder symptoms and personality traits.

Science.gov (United States)

Wright, Aidan G C; Pincus, Aaron L; Lenzenweger, Mark F

2013-07-01

Avoidant personality disorder (AVPD), like other personality disorders, has historically been construed as a highly stable disorder. However, results from a number of longitudinal studies have found that the symptoms of AVPD demonstrate marked change over time. Little is known about which other psychological systems are related to this change. Although cross-sectional research suggests a strong relationship between AVPD and personality traits, no work has examined the relationship of their change trajectories. The current study sought to establish the longitudinal relationship between AVPD and basic personality traits using parallel process growth curve modeling. Parallel process growth curve modeling was applied to the trajectories of AVPD and basic personality traits from the Longitudinal Study of Personality Disorders (Lenzenweger, M. F., 2006, The longitudinal study of personality disorders: History, design considerations, and initial findings. Journal of Personality Disorders, 20, 645-670. doi:10.1521/pedi.2006.20.6.645), a naturalistic, prospective, multiwave, longitudinal study of personality disorder, temperament, and normal personality. The focus of these analyses is on the relationship between the rates of change in both AVPD symptoms and basic personality traits. AVPD symptom trajectories demonstrated significant negative relationships with the trajectories of interpersonal dominance and affiliation, and a significant positive relationship to rates of change in neuroticism. These results provide some of the first compelling evidence that trajectories of change in PD symptoms and personality traits are linked. These results have important implications for the ways in which temporal stability is conceptualized in AVPD specifically, and PD in general.
Parallel Application Development Using Architecture View Driven Model Transformations

NARCIS (Netherlands)

Arkin, E.; Tekinerdogan, B.

2015-01-01

To meet the increased need for computing performance, the current trend is towards applying parallel computing in which the tasks are run in parallel on multiple nodes. In turn, we can observe the rapid increase of the scale of parallel computing platforms. This situation has led to a complexity
Kernel integration scatter model for parallel beam gamma camera and SPECT point source response

International Nuclear Information System (INIS)

Marinkovic, P.M.

2001-01-01

Scatter correction is a prerequisite for quantitative single photon emission computed tomography (SPECT). In this paper a kernel integration scatter model for parallel beam gamma camera and SPECT point source response based on the Klein-Nishina formula is proposed. This method models primary photon distribution as well as first Compton scattering. It also includes a correction for multiple scattering by applying a point isotropic single medium buildup factor for the path segment between the point of scatter and the point of detection. Gamma ray attenuation in the object of imaging, based on known μ-map distribution, is considered too. Intrinsic spatial resolution of the camera is approximated by a simple Gaussian function. The collimator is modeled simply using acceptance angles derived from the physical dimensions of the collimator. Any gamma rays satisfying this angle were passed through the collimator to the crystal. Septal penetration and scatter in the collimator were not included in the model. The method was validated by comparison with Monte Carlo MCNP-4a numerical phantom simulation and excellent results were obtained. Physical phantom experiments to confirm this method are planned. (author)
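The Klein-Nishina formula named in the abstract gives the differential cross-section for Compton scattering of a photon by a free electron. The sketch below evaluates the standard textbook form of that formula; it is not the paper's scatter kernel. The function name, the constants' variable names and the 140.5 keV example energy (typical of Tc-99m imaging) are assumptions introduced here for illustration.

```python
import math

# Physical constants (CODATA values); variable names are illustrative.
R_E = 2.8179403262e-15      # classical electron radius, metres
ME_C2_KEV = 510.99895       # electron rest energy, keV


def klein_nishina(energy_kev: float, theta: float) -> float:
    """Differential cross-section d(sigma)/d(Omega) in m^2/sr.

    energy_kev: incident photon energy in keV.
    theta: scattering angle in radians.
    """
    # Ratio of scattered to incident photon energy (Compton relation).
    ratio = 1.0 / (1.0 + (energy_kev / ME_C2_KEV) * (1.0 - math.cos(theta)))
    return 0.5 * R_E ** 2 * ratio ** 2 * (
        ratio + 1.0 / ratio - math.sin(theta) ** 2
    )


if __name__ == "__main__":
    # Hypothetical example energy: 140.5 keV, scanned over a few angles.
    for deg in (0, 45, 90, 135, 180):
        value = klein_nishina(140.5, math.radians(deg))
        print(f"{deg:3d} deg  {value:.3e} m^2/sr")
```

At zero scattering angle the energy ratio is 1 and the expression reduces to the square of the classical electron radius, which is a convenient sanity check.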
Temporal Precedence Checking for Switched Models and its Application to a Parallel Landing Protocol

Science.gov (United States)

Duggirala, Parasara Sridhar; Wang, Le; Mitra, Sayan; Viswanathan, Mahesh; Munoz, Cesar A.

2014-01-01

This paper presents an algorithm for checking temporal precedence properties of nonlinear switched systems. This class of properties subsumes bounded safety and captures requirements about visiting a sequence of predicates within given time intervals. The algorithm handles nonlinear predicates that arise from dynamics-based predictions used in alerting protocols for state-of-the-art transportation systems. It is sound and complete for nonlinear switched systems that robustly satisfy the given property. The algorithm is implemented in the Compare Execute Check Engine (C2E2) using validated simulations. As a case study, a simplified model of an alerting system for closely spaced parallel runways is considered. The proposed approach is applied to this model to check safety properties of the alerting logic for different operating conditions such as initial velocities, bank angles, aircraft longitudinal separation, and runway separation.
Design Analysis and Dynamic Modeling of a High-Speed 3T1R Pick-and-Place Parallel Robot

DEFF Research Database (Denmark)

Wu, Guanglei; Bai, Shaoping; Hjørnet, Preben

2015-01-01

This paper introduces a four degree-of-freedom parallel robot producing three translations and one rotation (Schönflies motion). This robot can generate a rectangular workspace that is close to the applicable work envelope and suitable for pick-and-place operations. The kinematics of the robot...... is studied to analyze the workspace and the isocontours of the local dexterity over the representative regular workspace are visualized. The simplified dynamics is modeled and compared with Adams model to show its effectiveness....
Modeling the Control Systems of Gas-Turbines to Ensure Their Reliable Parallel Operation in the UPS of Russia

Energy Technology Data Exchange (ETDEWEB)

Vinogradov, A. Yu., E-mail: vinogradov-a@ntcees.ru; Gerasimov, A. S.; Kozlov, A. V.; Smirnov, A. N. [JSC “STC UPS” (Russian Federation)

2016-05-15

Consideration is given to different approaches to modeling the control systems of gas turbines as a component of CCPP and GTPP to ensure their reliable parallel operation in the UPS of Russia. The disadvantages of the approaches to the modeling of combined-cycle units in studying long-term electromechanical transients accompanied by power imbalance are pointed out. Examples are presented to support the use of more detailed models of gas turbines in electromechanical transient calculations. It is shown that the modern speed control systems of gas turbines in combination with relatively low equivalent inertia have a considerable effect on electromechanical transients, including those caused by disturbances not related to power imbalance.
Parallelization in Modern C++

CERN Multimedia

CERN. Geneva

2016-01-01

The traditionally used and well established parallel programming models OpenMP and MPI are both targeting lower level parallelism and are meant to be as language agnostic as possible. For a long time, those models were the only widely available portable options for developing parallel C++ applications beyond using plain threads. This has strongly limited the optimization capabilities of compilers, has inhibited extensibility and genericity, and has restricted the use of those models together with other, modern higher level abstractions introduced by the C++11 and C++14 standards. The recent revival of interest in the industry and wider community for the C++ language has also spurred a remarkable amount of standardization proposals and technical specifications being developed. Those efforts however have so far failed to build a vision on how to seamlessly integrate various types of parallelism, such as iterative parallel execution, task-based parallelism, asynchronous many-task execution flows, continuation s...
Toward real-time diffuse optical tomography: accelerating light propagation modeling employing parallel computing on GPU and CPU.

Science.gov (United States)

Doulgerakis, Matthaios; Eggebrecht, Adam; Wojtkiewicz, Stanislaw; Culver, Joseph; Dehghani, Hamid

2017-12-01

Parameter recovery in diffuse optical tomography is a computationally expensive algorithm, especially when used for large and complex volumes, as in the case of human brain functional imaging. The modeling of light propagation, also known as the forward problem, is the computational bottleneck of the recovery algorithm, whereby the lack of a real-time solution is impeding practical and clinical applications. The objective of this work is the acceleration of the forward model, within a diffusion approximation-based finite-element modeling framework, employing parallelization to expedite the calculation of light propagation in realistic adult head models. The proposed methodology is applicable for modeling both continuous wave and frequency-domain systems with the results demonstrating a 10-fold speed increase when GPU architectures are available, while maintaining high accuracy. It is shown that, for a very high-resolution finite-element model of the adult human head with ∼600,000 nodes, consisting of heterogeneous layers, light propagation can be calculated at ∼0.25 s/excitation source. (2017) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE).
Development of a new dynamic turbulent model, applications to two-dimensional and plane parallel flows

International Nuclear Information System (INIS)

Laval, Jean Philippe

1999-01-01

We developed a turbulent model based on asymptotic development of the Navier-Stokes equations within the hypothesis of non-local interactions at small scales. This model provides expressions of the turbulent Reynolds sub-grid stresses via estimates of the sub-grid velocities rather than velocity correlations as is usually done. The model involves the coupling of two dynamical equations: one for the resolved scales of motions, which depends upon the Reynolds stresses generated by the sub-grid motions, and one for the sub-grid scales of motions, which can be used to compute the sub-grid Reynolds stresses. The non-locality of interaction at sub-grid scales allows their evolution to be modelled with a linear inhomogeneous equation where the forcing occurs via the energy cascade from resolved to sub-grid scales. This model was solved using a decomposition of sub-grid scales on Gabor's modes and implemented numerically in 2D with periodic boundary conditions. A particles method (PIC) was used to compute the sub-grid scales. The results were compared with results of direct simulations for several typical flows. The model was also applied to plane parallel flows. An analytical study of the equations allows a description of mean velocity profiles in agreement with experimental results and theoretical results based on the symmetries of the Navier-Stokes equation. Possible applications and improvements of the model are discussed in the conclusion. (author) [fr]
Model-to-model interface for multiscale materials modeling

Energy Technology Data Exchange (ETDEWEB)

Antonelli, Perry Edward [Iowa State Univ., Ames, IA (United States)

2017-12-17

A low-level model-to-model interface is presented that will enable independent models to be linked into an integrated system of models. The interface is based on a standard set of functions that contain appropriate export and import schemas that enable models to be linked with no changes to the models themselves. These ideas are presented in the context of a specific multiscale material problem that couples atomistic-based molecular dynamics calculations to continuum calculations of fluid flow. These simulations will be used to examine the influence of interactions of the fluid with an adjacent solid on the fluid flow. The interface will also be examined by adding it to an already existing modeling code, Large-scale Atomic/Molecular Massively Parallel Simulator (LAMMPS), and comparing it with our own molecular dynamics code.
SF-FDTD analysis of a predictive physical model for parallel aligned liquid crystal devices

Science.gov (United States)

Márquez, Andrés; Francés, Jorge; Martínez, Francisco J.; Gallego, Sergi; Alvarez, Mariela L.; Calzado, Eva M.; Pascual, Inmaculada; Beléndez, Augusto

2017-08-01

Recently we demonstrated a novel and simplified model that calculates the voltage-dependent retardance provided by parallel aligned liquid crystal devices (PA-LCoS) for a very wide range of incidence angles and any wavelength in the visible. To our knowledge it represents the most simplified approach still showing predictive capability. Deeper insight into the physics behind the simplified model is necessary to understand if the parameters in the model are physically meaningful. Since the PA-LCoS is a black box for which we do not have information about the physical parameters of the device, we cannot perform this kind of analysis using the experimental retardance measurements. In this work we develop realistic simulations for the non-linear tilt of the liquid crystal director across the thickness of the liquid crystal layer in the PA devices. We consider these profiles to have a sine-like shape, which is a good approximation for typical ranges of applied voltage in commercial PA-LCoS microdisplays. For these simulations we develop a rigorous method based on the split-field finite difference time domain (SF-FDTD) technique which provides realistic retardance values. These values are used as the experimental measurements to which the simplified model is fitted. From this analysis we learn that the simplified model is very robust, providing unambiguous solutions when fitting its parameters. We also learn that two of the parameters in the model are physically meaningful, proving a useful reverse-engineering approach, with predictive capability, to probe into internal characteristics of the PA-LCoS device.

GPGPU Parallel SPIN Model Checker

Data.gov (United States)

National Aeronautics and Space Administration — Model Checking is a powerful technique used to verify that a system does not violate its intended behavior. While this is very useful in proving the robustness of a...
Jet formation and equatorial superrotation in Jupiter's atmosphere: Numerical modelling using a new efficient parallel code

Science.gov (United States)

Rivier, Leonard Gilles

Using an efficient parallel code solving the primitive equations of atmospheric dynamics, the jet structure of a Jupiter-like atmosphere is modeled. In the first part of this thesis, a parallel spectral code solving both the shallow water equations and the multi-level primitive equations of atmospheric dynamics is built. The implementation of this code, called BOB, is done so that it runs effectively on an inexpensive cluster of workstations. A one-dimensional decomposition and transposition method ensuring load balancing among processes is used. The Legendre transform is cache-blocked. A "compute on the fly" of the Legendre polynomials used in the spectral method produces a lower memory footprint and enables high resolution runs on relatively small memory machines. Performance studies are done using a cluster of workstations located at the National Center for Atmospheric Research (NCAR). BOB's performance is compared to the parallel benchmark code PSTSWM and the dynamical core of NCAR's CCM3.6.6. In both cases, the comparison favors BOB. In the second part of this thesis, the primitive equation version of the code described in part I is used to study the formation of organized zonal jets and equatorial superrotation in a planetary atmosphere where the parameters are chosen to best model the upper atmosphere of Jupiter. Two levels are used in the vertical and only large scale forcing is present. The model is forced towards a baroclinically unstable flow, so that eddies are generated by baroclinic instability. We consider several types of forcing, acting on either the temperature or the momentum field. We show that only under very specific parametric conditions do zonally elongated structures form and persist, resembling the jet structure observed near the cloud level top (1 bar) on Jupiter. We also study the effect of an equatorial heat source, meant to be a crude representation of the effect of the deep convective planetary interior onto the outer atmospheric layer. We
Performance modeling and analysis of parallel Gaussian elimination on multi-core computers

Directory of Open Access Journals (Sweden)

Fadi N. Sibai

2014-01-01

Gaussian elimination is used in many applications and in particular in the solution of systems of linear equations. This paper presents mathematical performance models and analysis of four parallel Gaussian Elimination methods (precisely the Original method and the new Meet in the Middle –MiM– algorithms and their variants with SIMD vectorization) on multi-core systems. Analytical performance models of the four methods are formulated and presented, followed by evaluations of these models with modern multi-core systems’ operation latencies. Our results reveal that the four methods generally exhibit good performance scaling with increasing matrix size and number of cores. SIMD vectorization only makes a large difference in performance for a low number of cores. For a large matrix size (n ⩾ 16 K), the performance difference between the MiM and Original methods falls from 16× with four cores to 4× with 16 K cores. The efficiencies of all four methods are low with 1 K cores or more, highlighting a major problem of multi-core systems where the network-on-chip and memory latencies are too high in relation to basic arithmetic operations. Thus Gaussian Elimination can greatly benefit from the resources of multi-core systems, but higher performance gains can be achieved if multi-core systems can be designed with lower memory operation, synchronization, and interconnect communication latencies, requirements of utmost importance and challenge in the exascale computing age.
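For readers unfamiliar with the baseline, the following is a minimal sequential sketch of textbook Gaussian elimination with partial pivoting and back substitution. It is neither the paper's "Original" parallel method nor the MiM variant, whose details the abstract does not give; the function name is an assumption. The comment on the inner row loop marks the work that parallel formulations typically distribute across cores.

```python
def gaussian_solve(a, b):
    """Solve a x = b for a square matrix a (list of rows) and vector b."""
    n = len(a)
    # Work on an augmented copy so the inputs are left untouched.
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]

    for k in range(n):
        # Partial pivoting: bring the largest entry of column k to the diagonal.
        pivot = max(range(k, n), key=lambda r: abs(m[r][k]))
        if m[pivot][k] == 0:
            raise ValueError("matrix is singular")
        m[k], m[pivot] = m[pivot], m[k]

        # The updates of rows k+1..n-1 are independent of each other for a
        # fixed k, which is what a parallel version can split across cores.
        for i in range(k + 1, n):
            factor = m[i][k] / m[k][k]
            for j in range(k, n + 1):
                m[i][j] -= factor * m[k][j]

    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


if __name__ == "__main__":
    print(gaussian_solve([[2.0, 1.0], [1.0, 3.0]], [3.0, 5.0]))
```

Each elimination step k must finish before step k+1 starts, so a parallel version synchronizes once per pivot column; the abstract identifies synchronization latency as one of the costs that limit efficiency at high core counts.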
A parallel model for SQL astronomical databases based on solid state storage. Application to the Gaia Archive PostgreSQL database

Science.gov (United States)

González-Núñez, J.; Gutiérrez-Sánchez, R.; Salgado, J.; Segovia, J. C.; Merín, B.; Aguado-Agelet, F.

2017-07-01

Query planning and optimisation algorithms in most popular relational databases were developed at a time when hard disk drives were the only storage technology available. The advent of devices with higher parallel random access capacity, such as solid state disks, opens up the way for intra-machine parallel computing over large datasets. We describe a two-phase parallel model for the implementation of heavy analytical processes in single-instance PostgreSQL astronomical databases. This model is particularised to address two frequent astronomical problems, density maps and crossmatch computation with Quad Tree Cube (Q3C) indexes. They are implemented as part of the relational database infrastructure for the Gaia Archive and performance is assessed. An improvement by a factor of 28.40 in comparison to sequential execution is observed in the reference implementation for a histogram computation. Speedup ratios of 3.7 and 4.0 are attained for the reference positional crossmatches considered. We observe large performance enhancements over sequential execution for both CPU and disk access intensive computations, suggesting these methods might be useful with the growing data volumes in Astronomy.
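The abstract does not spell out what the two phases are. One common reading, assumed here, is a per-partition partial computation followed by a merge of the partial results. The sketch below applies that pattern to a density map (a 2-D histogram of sky positions) in plain Python rather than PostgreSQL; the function names, the cell size and the sample coordinates are all hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def partial_density(points, cell):
    """Phase 1: count points per (ra, dec) grid cell for one partition."""
    counts = Counter()
    for ra, dec in points:
        counts[(int(ra // cell), int(dec // cell))] += 1
    return counts


def density_map(points, cell=1.0, workers=4):
    """Run phase 1 on each partition, then merge (phase 2)."""
    # Round-robin partitioning; a database would partition by key range.
    chunks = [points[i::workers] for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(lambda ch: partial_density(ch, cell), chunks))

    # Phase 2: merging is cheap because histogram counts simply add up.
    total = Counter()
    for part in partials:
        total.update(part)
    return total


if __name__ == "__main__":
    sample = [(0.5, 0.5), (0.7, 0.2), (1.5, 0.5), (10.2, -3.4)]
    print(dict(density_map(sample)))
```

Threads are used only to keep the sketch dependency-free; in CPython they do not speed up CPU-bound counting, whereas the database setting described in the abstract runs the partial computations on separate backend processes.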
Tug-of-war model for the two-bandit problem: nonlocally-correlated parallel exploration via resource conservation.

Science.gov (United States)

Kim, Song-Ju; Aono, Masashi; Hara, Masahiko

2010-07-01

We propose a model - the "tug-of-war (TOW) model" - to conduct unique parallel searches using many nonlocally-correlated search agents. The model is based on the property of a single-celled amoeba, the true slime mold Physarum, which maintains a constant intracellular resource volume while collecting environmental information by concurrently expanding and shrinking its branches. The conservation law entails a "nonlocal correlation" among the branches, i.e., volume increment in one branch is immediately compensated by volume decrement(s) in the other branch(es). This nonlocal correlation was shown to be useful for decision making in the case of a dilemma. The multi-armed bandit problem is to determine the optimal strategy for maximizing the total reward sum with incompatible demands, by either exploiting the rewards obtained using the already collected information or exploring new information for acquiring higher payoffs involving risks. Our model can efficiently manage the "exploration-exploitation dilemma" and exhibits good performance. The average accuracy rate of our model is higher than those of well-known algorithms such as the modified ε-greedy algorithm and modified softmax algorithm, especially for solving relatively difficult problems. Moreover, our model flexibly adapts to changing environments, a property essential for living organisms surviving in uncertain environments.
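To make the exploration-exploitation trade-off concrete, here is a minimal sketch of a plain ε-greedy strategy on a two-armed Bernoulli bandit. It is the textbook baseline, not the TOW model and not the "modified" ε-greedy variant the abstract compares against, neither of which is specified in the abstract. The function name, the reward probabilities and the default parameters are assumptions.

```python
import random


def epsilon_greedy(probs, steps, epsilon=0.1, seed=0):
    """Play a Bernoulli bandit; return (pull counts per arm, total reward).

    probs: success probability of each arm (unknown to the player).
    epsilon: probability of exploring a uniformly random arm.
    """
    rng = random.Random(seed)
    counts = [0] * len(probs)
    values = [0.0] * len(probs)   # running mean reward per arm
    total = 0

    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(len(probs))                       # explore
        else:
            arm = max(range(len(probs)), key=values.__getitem__)  # exploit

        reward = 1 if rng.random() < probs[arm] else 0
        counts[arm] += 1
        # Incremental update of the mean avoids storing the reward history.
        values[arm] += (reward - values[arm]) / counts[arm]
        total += reward

    return counts, total


if __name__ == "__main__":
    pulls, reward = epsilon_greedy([0.1, 0.9], steps=2000)
    print("pulls per arm:", pulls, "total reward:", reward)
```

With a fixed ε the player keeps spending a constant fraction of pulls on exploration even after the better arm is obvious, which is one reason adaptive schemes are studied.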
Empirical valence bond models for reactive potential energy surfaces: a parallel multilevel genetic program approach.

Science.gov (United States)

Bellucci, Michael A; Coker, David F

2011-07-28

We describe a new method for constructing empirical valence bond potential energy surfaces using a parallel multilevel genetic program (PMLGP). Genetic programs can be used to perform an efficient search through function space and parameter space to find the best functions and sets of parameters that fit energies obtained by ab initio electronic structure calculations. Building on the traditional genetic program approach, the PMLGP utilizes a hierarchy of genetic programming on two different levels. The lower level genetic programs are used to optimize coevolving populations in parallel while the higher level genetic program (HLGP) is used to optimize the genetic operator probabilities of the lower level genetic programs. The HLGP allows the algorithm to dynamically learn the mutation or combination of mutations that most effectively increase the fitness of the populations, causing a significant increase in the algorithm's accuracy and efficiency. The algorithm's accuracy and efficiency are tested against a standard parallel genetic program with a variety of one-dimensional test cases. Subsequently, the PMLGP is utilized to obtain an accurate empirical valence bond model for proton transfer in 3-hydroxy-gamma-pyrone in gas phase and protic solvent. © 2011 American Institute of Physics
Parallel External Memory Graph Algorithms

DEFF Research Database (Denmark)

Arge, Lars Allan; Goodrich, Michael T.; Sitchinava, Nodari

2010-01-01

In this paper, we study parallel I/O efficient graph algorithms in the Parallel External Memory (PEM) model, one of the private-cache chip multiprocessor (CMP) models. We study the fundamental problem of list ranking which leads to efficient solutions to problems on trees, such as computing lowest...... an optimal speedup of Θ(P) in parallel I/O complexity and parallel computation time, compared to the single-processor external memory counterparts....
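List ranking asks, for every node of a linked list, how far it is from the end of the list. The sketch below uses classic pointer jumping, the standard PRAM-style technique, simulated sequentially; it is not the I/O-efficient PEM algorithm of the paper, which the abstract does not describe. The function name and the convention that the tail's successor is `None` are assumptions.

```python
def list_rank(succ):
    """Return each node's distance to the tail of a linked list.

    succ[i] is the index of the node after i, or None for the tail.
    """
    n = len(succ)
    rank = [0 if s is None else 1 for s in succ]
    nxt = list(succ)

    # Each round doubles the distance covered by every pointer, so about
    # log2(n) rounds suffice. Within one round all nodes could be updated
    # in parallel because they read only the previous round's arrays.
    while any(p is not None for p in nxt):
        new_rank = rank[:]
        new_nxt = nxt[:]
        for i in range(n):
            if nxt[i] is not None:
                new_rank[i] = rank[i] + rank[nxt[i]]
                new_nxt[i] = nxt[nxt[i]]
        rank, nxt = new_rank, new_nxt

    return rank


if __name__ == "__main__":
    # Hypothetical list stored out of order: 0 -> 3 -> 2 -> 1 -> end.
    print(list_rank([3, None, 1, 2]))
```

Writing into fresh arrays each round mirrors the synchronous parallel step: no node observes a neighbour's value from the current round.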
Small-Signal Modeling, Analysis and Testing of Parallel Three-Phase-Inverters with A Novel Autonomous Current Sharing Controller

DEFF Research Database (Denmark)

Guan, Yajuan; Quintero, Juan Carlos Vasquez; Guerrero, Josep M.

2015-01-01

A novel simple and effective autonomous current-sharing controller for parallel three-phase inverters is employed in this paper. The novel controller is able to endow the system with high-speed response and precision in contrast to the conventional droop control as it does not require calculating any...... active or reactive power, instead it uses a virtual impedance loop and an SFR phase-locked loop. The small-signal model of the system was developed for the autonomous operation of inverter-based microgrid with the proposed controller. The developed model shows a large stability margin and fast transient...
Experiments with parallel algorithms for combinatorial problems

NARCIS (Netherlands)

G.A.P. Kindervater (Gerard); H.W.J.M. Trienekens

1985-01-01

In the last decade many models for parallel computation have been proposed and many parallel algorithms have been developed. However, few of these models have been realized and most of these algorithms are supposed to run on idealized, unrealistic parallel machines. The parallel machines
Parallel education: what is it?

OpenAIRE

Amos, Michelle Peta

2017-01-01

In the history of education it has long been discussed that single-sex and coeducation are the two models of education present in schools. With the introduction of parallel schools over the last 15 years, there has been very little research into this 'new model'. Many people do not understand what it means for a school to be parallel or they confuse a parallel model with co-education, due to the presence of both boys and girls within the one institution. Therefore, the main obj...
Shapes of leaves with parallel venation. Modelling of the Epipactis sp. (Orchidaceae) leaves with the help of a system of coupled elastic beams

Directory of Open Access Journals (Sweden)

Anna Jakubska-Busse

2016-06-01

Static properties of leaves with parallel venation, with particular emphasis on the genus Epipactis Zinn, 1757 (Orchidaceae, Neottieae), have been modelled with coupled quasi-parallel elastic "beams." The non-linear theory of strongly bent beams has been employed. The resulting boundary-value problem has been solved numerically with the help of the finite-difference method. Possible dislocations resulting in additional Dirac-delta like forces have been taken into account. Morphological similarity of the model and real leaves has been obtained. In particular, the concentrated forces have been shown to cause undulation in the leaves.
Quantitative modelling of the closure of meso-scale parallel currents in the nightside ionosphere

Directory of Open Access Journals (Sweden)

A. Marchaudon

2004-01-01

On 12 January 2000, during a northward IMF period, two successive conjunctions occur between the CUTLASS SuperDARN radar pair and the two satellites Ørsted and FAST. This situation is used to describe and model the electrodynamics of a nightside meso-scale arc associated with a convection shear. Three field-aligned current sheets, one upward and two downward on both sides, are observed. Based on the measurements of the parallel currents and either the conductance or the electric field profile, a model of the ionospheric current closure is developed along each satellite orbit. This model is one-dimensional in a first attempt, and a two-dimensional model is tested for the Ørsted case. These models allow one to quantify the balance between electric field gradients and ionospheric conductance gradients in the closure of the field-aligned currents. These radar and satellite data are also combined with images from Polar-UVI, allowing for a description of the time evolution of the arc between the two satellite passes. The arc is very dynamic, in spite of quiet solar wind conditions. Periodic enhancements of the convection and of electron precipitation associated with the arc are observed, probably associated with quasi-periodic injections of particles due to reconnection in the magnetotail. Also, a northward shift and a reorganisation of the precipitation pattern are observed, together with a southward shift of the convection shear. Key words. Ionosphere (auroral ionosphere; electric fields and currents; particle precipitation) – Magnetospheric physics (magnetosphere-ionosphere interactions)
Animated computer graphics models of space and earth sciences data generated via the massively parallel processor

Science.gov (United States)

Treinish, Lloyd A.; Gough, Michael L.; Wildenhain, W. David

1987-01-01

The capability was developed of rapidly producing visual representations of large, complex, multi-dimensional space and earth sciences data sets via the implementation of computer graphics modeling techniques on the Massively Parallel Processor (MPP) by employing techniques recently developed for typically non-scientific applications. Such capabilities can provide a new and valuable tool for the understanding of complex scientific data, and a new application of parallel computing via the MPP. A prototype system with such capabilities was developed and integrated into the National Space Science Data Center's (NSSDC) Pilot Climate Data System (PCDS) data-independent environment for computer graphics data display to provide easy access to users. While developing these capabilities, several problems had to be solved independently of the actual use of the MPP, all of which are outlined.
Parallelizing the spectral transform method: A comparison of alternative parallel algorithms

International Nuclear Information System (INIS)

Foster, I.; Worley, P.H.

1993-01-01

The spectral transform method is a standard numerical technique for solving partial differential equations on the sphere and is widely used in global climate modeling. In this paper, we outline different approaches to parallelizing the method and describe experiments that we are conducting to evaluate the efficiency of these approaches on parallel computers. The experiments are conducted using a testbed code that solves the nonlinear shallow water equations on a sphere, but are designed to permit evaluation in the context of a global model. They allow us to evaluate the relative merits of the approaches as a function of problem size and number of processors. The results of this study are guiding ongoing work on PCCM2, a parallel implementation of the Community Climate Model developed at the National Center for Atmospheric Research
Coupled model of INM-IO global ocean model, CICE sea ice model and SCM OIAS framework

Science.gov (United States)

Bayburin, Ruslan; Rashit, Ibrayev; Konstantin, Ushakov; Vladimir, Kalmykov; Gleb, Dyakonov

2015-04-01

Status of coupled Arctic model of ocean and sea ice is presented. Model consists of INM IO global ocean component of high resolution, Los Alamos National Laboratory CICE sea ice model and a framework SCM OIAS for the ocean-ice-atmosphere-land coupled modeling on massively-parallel architectures. Model is currently under development at the Institute of Numerical Mathematics (INM), Hydrometeorological Center (HMC) and P.P. Shirshov Institute of Oceanology (IO). Model is aimed at modeling of intra-annual variability of hydrodynamics in Arctic and. The computational characteristics of the world ocean-sea ice coupled model governed by SCM OIAS are presented. The model is parallelized using MPI technologies and currently can use efficiently up to 5000 cores. Details of programming implementation, computational configuration and physical phenomena parametrization are analyzed in terms of intercoupling complex. Results of five year computational experiment of sea ice, snow and ocean state evolution in Arctic region on tripole grid with horizontal resolution of 3-5 kilometers, closed by atmospheric forcing field from repeating "normal" annual course taken from CORE1 experiment data base are presented and analyzed in terms of the state of vorticity and warm Atlantic water expansion.
Influence of heterogeneity on rock strength and stiffness using discrete element method and parallel bond model

Directory of Open Access Journals (Sweden)

Spyridon Liakas

2017-08-01

The particulate discrete element method (DEM) can be employed to capture the response of rock, provided that appropriate bonding models are used to cement the particles to each other. Simulations of laboratory tests are important to establish the extent to which those models can capture realistic rock behaviors. Hitherto the focus in such comparison studies has either been on homogeneous specimens or use of two-dimensional (2D) models. In situ rock formations are often heterogeneous, thus exploring the ability of this type of model to capture heterogeneous material behavior is important to facilitate their use in design analysis. In situ stress states are basically three-dimensional (3D), and therefore it is important to develop 3D models for this purpose. This paper revisits an earlier experimental study on heterogeneous specimens, of which the relative proportions of weaker material (siltstone) and stronger, harder material (sandstone) were varied in a controlled manner. Using a 3D DEM model with the parallel bond model, virtual heterogeneous specimens were created. The overall responses in terms of variations in strength and stiffness with different percentages of weaker material (siltstone) were shown to agree with the experimental observations. There was also a good qualitative agreement in the failure patterns observed in the experiments and the simulations, suggesting that the DEM data enabled analysis of the initiation of localizations and micro fractures in the specimens.
Parallel unstructured mesh optimisation for 3D radiation transport and fluids modelling

International Nuclear Information System (INIS)

Gorman, G.J.; Pain, Ch. C.; Oliveira, C.R.E. de; Umpleby, A.P.; Goddard, A.J.H.

2003-01-01

In this paper we describe the theory and application of a parallel mesh optimisation procedure to obtain self-adapting finite element solutions on unstructured tetrahedral grids. The optimisation procedure adapts the tetrahedral mesh to the solution of a radiation transport or fluid flow problem without sacrificing the integrity of the boundary (geometry), or internal boundaries (regions) of the domain. The objective is to obtain a mesh which has both a uniform interpolation error in any direction and the element shapes are of good quality. This is accomplished with use of a non-Euclidean (anisotropic) metric which is related to the Hessian of the solution field. Appropriate scaling of the metric enables the resolution of multi-scale phenomena as encountered in transient incompressible fluids and multigroup transport calculations. The resulting metric is used to calculate element size and shape quality. The mesh optimisation method is based on a series of mesh connectivity and node position searches of the landscape defining mesh quality which is gauged by a functional. The mesh modification thus fits the solution field(s) in an optimal manner. The parallel mesh optimisation/adaptivity procedure presented in this paper is of general applicability. We illustrate this by applying it to a transient CFD (computational fluid dynamics) problem. Incompressible flow past a cylinder at moderate Reynolds numbers is modelled to demonstrate that the mesh can follow transient flow features. (authors)
Parallelization and automatic data distribution for nuclear reactor simulations

Energy Technology Data Exchange (ETDEWEB)

Liebrock, L.M. [Liebrock-Hicks Research, Calumet, MI (United States)

1997-07-01

Detailed attempts at realistic nuclear reactor simulations currently take many times real time to execute on high performance workstations. Even the fastest sequential machine can not run these simulations fast enough to ensure that the best corrective measure is used during a nuclear accident to prevent a minor malfunction from becoming a major catastrophe. Since sequential computers have nearly reached the speed of light barrier, these simulations will have to be run in parallel to make significant improvements in speed. In physical reactor plants, parallelism abounds. Fluids flow, controls change, and reactions occur in parallel with only adjacent components directly affecting each other. These do not occur in the sequentialized manner, with global instantaneous effects, that is often used in simulators. Development of parallel algorithms that more closely approximate the real-world operation of a reactor may, in addition to speeding up the simulations, actually improve the accuracy and reliability of the predictions generated. Three types of parallel architecture (shared memory machines, distributed memory multicomputers, and distributed networks) are briefly reviewed as targets for parallelization of nuclear reactor simulation. Various parallelization models (loop-based model, shared memory model, functional model, data parallel model, and a combined functional and data parallel model) are discussed along with their advantages and disadvantages for nuclear reactor simulation. A variety of tools are introduced for each of the models. Emphasis is placed on the data parallel model as the primary focus for two-phase flow simulation. Tools to support data parallel programming for multiple component applications and special parallelization considerations are also discussed.
Parallelization and automatic data distribution for nuclear reactor simulations

International Nuclear Information System (INIS)

Liebrock, L.M.

1997-01-01

Detailed attempts at realistic nuclear reactor simulations currently take many times real time to execute on high performance workstations. Even the fastest sequential machine can not run these simulations fast enough to ensure that the best corrective measure is used during a nuclear accident to prevent a minor malfunction from becoming a major catastrophe. Since sequential computers have nearly reached the speed of light barrier, these simulations will have to be run in parallel to make significant improvements in speed. In physical reactor plants, parallelism abounds. Fluids flow, controls change, and reactions occur in parallel with only adjacent components directly affecting each other. These do not occur in the sequentialized manner, with global instantaneous effects, that is often used in simulators. Development of parallel algorithms that more closely approximate the real-world operation of a reactor may, in addition to speeding up the simulations, actually improve the accuracy and reliability of the predictions generated. Three types of parallel architecture (shared memory machines, distributed memory multicomputers, and distributed networks) are briefly reviewed as targets for parallelization of nuclear reactor simulation. Various parallelization models (loop-based model, shared memory model, functional model, data parallel model, and a combined functional and data parallel model) are discussed along with their advantages and disadvantages for nuclear reactor simulation. A variety of tools are introduced for each of the models. Emphasis is placed on the data parallel model as the primary focus for two-phase flow simulation. Tools to support data parallel programming for multiple component applications and special parallelization considerations are also discussed
Time-dependent Perpendicular Transport of Energetic Particles for Different Turbulence Configurations and Parallel Transport Models

Energy Technology Data Exchange (ETDEWEB)

Lasuik, J.; Shalchi, A., E-mail: andreasm4@yahoo.com [Department of Physics and Astronomy, University of Manitoba, Winnipeg, MB R3T 2N2 (Canada)

2017-09-20

Recently, a new theory for the transport of energetic particles across a mean magnetic field was presented. Compared to other nonlinear theories the new approach has the advantage that it provides a full time-dependent description of the transport. Furthermore, a diffusion approximation is no longer part of that theory. The purpose of this paper is to combine this new approach with a time-dependent model for parallel transport and different turbulence configurations in order to explore the parameter regimes for which we get ballistic transport, compound subdiffusion, and normal Markovian diffusion.

Parallel Development of Products and New Business Models

DEFF Research Database (Denmark)

Lund, Morten; Hansen, Poul H. Kyvsgård

2014-01-01

The perception of product development and the practical execution of product development in professional organizations have undergone dramatic changes in recent years. Many of these changes relate to introduction of broader and more cross-disciplinary views that involve new organizational functi...... and innovation management the 4th generation models are increasingly including the concept of business models and business model innovation....... functions and new concepts. These changes can be captured in various generations of practice. This paper will discuss the recent development of 3rd generation product development process models and the emergence of a 4th generation. While the 3rd generation models included the concept of innovation...
PALM: a paralleled and integrated framework for phylogenetic inference with automatic likelihood model selectors.

Directory of Open Access Journals (Sweden)

Shu-Hwa Chen

BACKGROUND: Selecting an appropriate substitution model and deriving a tree topology for a given sequence set are essential in phylogenetic analysis. However, such time-consuming, computationally intensive tasks rely on knowledge of substitution model theories and related expertise to run through all possible combinations of several separate programs. To ensure a thorough and efficient analysis and avert tedious manipulations of various programs, this work presents an intuitive framework, the phylogenetic reconstruction with automatic likelihood model selectors (PALM), with convincing, updated algorithms and a best-fit model selection mechanism for seamless phylogenetic analysis. METHODOLOGY: As an integrated framework of ClustalW, PhyML, MODELTEST, ProtTest, and several in-house programs, PALM evaluates the fitness of 56 substitution models for nucleotide sequences and 112 substitution models for protein sequences with scores in various criteria. The input for PALM can be either sequences in FASTA format or a sequence alignment file in PHYLIP format. To accelerate the computing of maximum likelihood and bootstrapping, this work integrates MPICH2/PhyML, PalmMonitor and Palm job controller across several machines with multiple processors and adopts the task parallelism approach. Moreover, an intuitive and interactive web component, PalmTree, is developed for displaying and operating the output tree with options of tree rooting, branch swapping, viewing the branch length values, and viewing bootstrapping scores, as well as removing nodes to restart analysis iteratively. SIGNIFICANCE: The workflow of PALM is straightforward and coherent. Via a succinct, user-friendly interface, researchers unfamiliar with phylogenetic analysis can easily use this server to submit sequences, retrieve the output, and re-submit a job based on a previous result if some sequences are to be deleted or added for phylogenetic reconstruction. PALM results in an inference of
A finite parallel zone model to interpret and extend Giddings' coupling theory for the eddy-dispersion in porous chromatographic media.

Science.gov (United States)

Desmet, Gert

2013-11-01

The finite length parallel zone (FPZ)-model is proposed as an alternative model for the axial- or eddy-dispersion caused by the occurrence of local velocity biases or flow heterogeneities in porous media such as those used in liquid chromatography columns. The mathematical plate height expression evolving from the model shows that the A- and C-term band broadening effects that can originate from a given velocity bias should be coupled in an exponentially decaying way instead of harmonically as proposed in Giddings' coupling theory. In the low and high velocity limit both models converge, while a 12% difference can be observed in the (practically most relevant) intermediate range of reduced velocities. Explicit expressions for the A- and C-constants appearing in the exponential decay-based plate height expression have been derived for each of the different possible velocity bias levels (single through-pore and particle level, multi-particle level and trans-column level). These expressions make it possible to directly relate the band broadening originating from these different levels to the local fundamental transport parameters, hence offering the possibility to include a velocity-dependent and, if needed, retention factor-dependent transversal dispersion coefficient. Having developed the mathematics for the general case wherein a difference in retention equilibrium establishes between the two parallel zones, the effect of any possible local variations in packing density and/or retention capacity on the eddy-dispersion can be explicitly accounted for as well. It is furthermore also shown that, whereas the lumped transport parameter model used in the basic variant of the FPZ-model only provides a first approximation of the true decay constant, the model can be extended by introducing a constant correction factor to correctly account for the continuous transversal dispersion transport in the velocity bias zones. Copyright © 2013 Elsevier B.V. All rights reserved.
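For orientation, the harmonic coupling that this abstract attributes to Giddings is commonly written as below. The symbols $h_{\mathrm{eddy}}$ (reduced eddy-dispersion plate height) and $\nu$ (reduced velocity) are notation assumed here, since the abstract names only the A- and C-constants. The exponential-decay expression of the FPZ-model is not given in the abstract and is therefore not reproduced.

```latex
% Giddings' harmonic coupling of the A- and C-term contributions
% (notation h_eddy and \nu assumed for this sketch)
h_{\mathrm{eddy}} = \frac{1}{\dfrac{1}{A} + \dfrac{1}{C\,\nu}}

% Limiting behaviour:
%   low velocity:   \nu \to 0       =>  h_eddy \to C\,\nu
%   high velocity:  \nu \to \infty  =>  h_eddy \to A
```

The two limits illustrate the low- and high-velocity regimes in which, according to the abstract, both models converge.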
Parallel Human and Animal Models of Blast- and Concussion-Induced Tinnitus and Related Traumatic Brain Injury (TBI)

Science.gov (United States)

2014-01-01

Andersson G (2009) The role of anxiety sensitivity and behavioral avoidance in tinnitus disability. IntJAudiol 48:295-299. Hiller W, Goebel G (1999...Parallel Human and Animal Models of Blast- and Concussion-Induced Tinnitus and Related Traumatic Brain Injury (TBI) PRINCIPAL INVESTIGATOR...Induced Tinnitus and Related Traumatic Brain Injury (TBI) 5b. GRANT NUMBER W81XWH-11-2-0031 5c. PROGRAM ELEMENT NUMBER 6. AUTHOR(S
A parallel Discrete Element Method to model collisions between non-convex particles

Directory of Open Access Journals (Sweden)

Rakotonirina Andriarimina Daniel

2017-01-01

In many dry granular and suspension flow configurations, particles can be highly non-spherical. It is now well established in the literature that particle shape affects the flow dynamics or the microstructure of the particles assembly in assorted ways, e.g. compactness of a packed bed or heap, dilation under shear, resistance to shear, momentum transfer between translational and angular motions, ability to form arches and block the flow. In this talk, we suggest an accurate and efficient way to model collisions between particles of (almost) arbitrary shape. For that purpose, we develop a Discrete Element Method (DEM) combined with a soft particle contact model. The collision detection algorithm handles contacts between bodies of various shape and size. For non-convex bodies, our strategy is based on decomposing a non-convex body into a set of convex ones. Therefore, our novel method can be called "glued-convex method" (in the sense of clumping convex bodies together), as an extension of the popular "glued-spheres" method, and is implemented in our own granular dynamics code Grains3D. Since the whole problem is solved explicitly, our fully-MPI parallelized code Grains3D exhibits a very high scalability when dynamic load balancing is not required. In particular, simulations on up to a few thousand cores in configurations involving up to a few tens of millions of particles can readily be performed. We apply our enhanced numerical model to (i) the collapse of a granular column made of convex particles and (ii) the microstructure of a heap of non-convex particles in a cylindrical reactor.
3D multiphysics modeling of superconducting cavities with a massively parallel simulation suite

Directory of Open Access Journals (Sweden)

Oleksiy Kononenko

2017-10-01

Radiofrequency cavities based on superconducting technology are widely used in particle accelerators for various applications. The cavities usually have high quality factors and hence narrow bandwidths, so the field stability is sensitive to detuning from the Lorentz force and external loads, including vibrations and helium pressure variations. If not properly controlled, the detuning can result in a serious performance degradation of a superconducting accelerator, so an understanding of the underlying detuning mechanisms can be very helpful. Recent advances in the simulation suite ACE3P have enabled realistic multiphysics characterization of such complex accelerator systems on supercomputers. In this paper, we present the new capabilities in ACE3P for large-scale 3D multiphysics modeling of superconducting cavities, in particular, a parallel eigensolver for determining mechanical resonances, a parallel harmonic response solver to calculate the response of a cavity to external vibrations, and a numerical procedure to decompose mechanical loads, such as from the Lorentz force or piezoactuators, into the corresponding mechanical modes. These capabilities have been used to do an extensive rf-mechanical analysis of dressed TESLA-type superconducting cavities. The simulation results and their implications for the operational stability of the Linac Coherent Light Source-II are discussed.
Parallel sorting algorithms

CERN Document Server

Akl, Selim G

1985-01-01

Parallel Sorting Algorithms explains how to use parallel algorithms to sort a sequence of items on a variety of parallel computers. The book reviews the sorting problem, the parallel models of computation, parallel algorithms, and the lower bounds on the parallel sorting problems. The text also presents twenty different algorithms, such as linear arrays, mesh-connected computers, cube-connected computers. Another example where algorithm can be applied is on the shared-memory SIMD (single instruction stream multiple data stream) computers in which the whole sequence to be sorted can fit in the
Numerical modelling of series-parallel cooling systems in power plant

Directory of Open Access Journals (Sweden)

Regucki Paweł

2017-01-01

The paper presents a mathematical model allowing one to study series-parallel hydraulic systems like, e.g., the cooling system of a power boiler's auxiliary devices or a closed cooling system including condensers and cooling towers. The analytical approach is based on a set of non-linear algebraic equations solved using numerical techniques. As a result of the iterative process, a set of volumetric flow rates of water through all the branches of the investigated hydraulic system is obtained. The calculations indicate the influence of changes in the pipeline's geometrical parameters on the total cooling water flow rate in the analysed installation. Such an approach makes it possible to analyse different variants of the modernization of the studied systems, as well as allowing for the indication of its critical elements. Based on these results, an investor can choose the optimal variant of the reconstruction of the installation from the economic point of view. As examples of such a calculation, two hydraulic installations are described. One is a boiler auxiliary cooling installation including two screw ash coolers. The other is a closed cooling system consisting of cooling towers and condensers.
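The kind of calculation this abstract describes can be illustrated with a small sketch. It is not the model from the paper: it assumes that every branch obeys a quadratic loss law, dp = R * Q**2, and that the pump is described by a head curve supplied by the caller. All function names and numerical values are hypothetical.

```python
import math


def parallel_equivalent(resistances):
    """Equivalent resistance of parallel branches, assuming dp = R * Q**2."""
    # Parallel branches share the same pressure drop and their flows add.
    return 1.0 / sum(1.0 / math.sqrt(r) for r in resistances) ** 2


def series_equivalent(resistances):
    """Equivalent resistance of branches in series: pressure drops add."""
    return sum(resistances)


def operating_flow(pump_head, r_system, q_max, tol=1e-10):
    """Find the flow where pump head equals system loss, by bisection.

    pump_head is a caller-supplied decreasing function of flow (assumed).
    """
    lo, hi = 0.0, q_max
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if pump_head(mid) - r_system * mid * mid > 0.0:
            lo = mid  # the pump still delivers more head than is lost
        else:
            hi = mid
    return 0.5 * (lo + hi)


def branch_flows(q_total, resistances):
    """Split a total flow among parallel branches."""
    dp = parallel_equivalent(resistances) * q_total ** 2
    return [math.sqrt(dp / r) for r in resistances]


# Hypothetical system: two identical coolers in parallel, then a pipe in series.
coolers = [4.0, 4.0]
r_sys = series_equivalent([parallel_equivalent(coolers), 3.0])
q = operating_flow(lambda flow: 20.0 - flow ** 2, r_sys, q_max=4.0)
flows = branch_flows(q, coolers)
```

Changing one branch resistance and recomputing `q` mimics the abstract's study of how pipeline geometry affects the total cooling water flow rate.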
A Model of Parallel Kinematics for Machine Calibration

DEFF Research Database (Denmark)

Pedersen, David Bue; Bæk Nielsen, Morten; Kløve Christensen, Simon

2016-01-01

Parallel kinematics have been adopted by more than 25 manufacturers of high-end desktop 3D printers [Wohlers Report (2015), p.118] as well as by research projects such as the WASP project [WASP (2015)], a 12 meter tall linear delta robot for Additive Manufacture of large-scale components for cons...
A Model for Speedup of Parallel Programs

Science.gov (United States)

1997-01-01

Sanjeev K. Setia. The interaction between memory allocation and adaptive partitioning in message-passing multicomputers. In IPPS '95 Workshop on Job...Scheduling Strategies for Parallel Processing, pages 89-99, 1995. [15] Sanjeev K. Setia and Satish K. Tripathi. A comparative analysis of static
Dynamic Model and Vibration Characteristics of Planar 3-RRR Parallel Manipulator with Flexible Intermediate Links considering Exact Boundary Conditions

Directory of Open Access Journals (Sweden)

Lianchao Sheng

2017-01-01

Due to the complexity of the dynamic model of a planar 3-RRR flexible parallel manipulator (FPM), it is often difficult to implement an active vibration control algorithm based on the system dynamic model. To establish a simple and efficient dynamic model of the planar 3-RRR FPM to study its dynamic characteristics and build a controller conveniently, firstly, considering the effect of rigid-flexible coupling and the moment of inertia at the end of the flexible intermediate link, the modal function is determined with the pinned-free boundary condition. Then, considering the main vibration modes of the system, a high-efficiency coupling dynamic model is established on the basis of guaranteeing the model control accuracy. According to the model, the modal characteristics of the flexible intermediate link are analyzed and compared with the modal test results. The results show that the model can effectively reflect the main vibration modes of the planar 3-RRR FPM; in addition, the model can be used to analyze the effects of inertial and coupling forces on the dynamics model and the drive torque of the drive motor. Because this model has fewer dynamic parameters, it is convenient to carry out the control program.
Dynamic modeling and experiment of a new type of parallel servo press considering gravity counterbalance

Science.gov (United States)

He, Jun; Gao, Feng; Bai, Yongjun; Wu, Shengfu

2013-11-01

The large capacity servo press is traditionally realized by means of redundant actuation; however, there exist an over-constraint problem and interference among actuators, which increase the control difficulty and the product cost. A new type of press mechanism with parallel topology is presented to develop the mechanical servo press with high stamping capacity. The dynamic model considering gravity counterbalance is proposed based on the virtual work principle, and then the effect of the counterbalance cylinder on the dynamic performance of the servo press is studied. It is found that the motor torque required to operate the press is much lower than in the other cases when the ratio of the counterbalance force to the gravity of the ram is in the vicinity of 1.0. The stamping force of the real press prototype can reach up to 25 MN at a position 13 mm away from the bottom dead center. The typical deep-drawing process with 1 200 mm stroke at 8 strokes per minute is planned by means of a fifth-order polynomial. Under this process condition, the driving torques are calculated based on the above dynamic model and the torque measuring test is also carried out on the prototype. It is shown that the curve trend of the calculated torque is consistent with the measured result and that the average error is less than 15%. The parallel mechanism is introduced into the development of large capacity servo press to avoid the over-constraint and interference of traditional redundant actuation, and its dynamic characteristics with gravity counterbalance are presented.
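A polynomial of order five is a common choice for planning a smooth ram motion. The sketch below shows the standard rest-to-rest quintic, which has zero velocity and zero acceleration at both ends of the move. The abstract does not give the boundary conditions or segment timing actually used, so the function name, the 3.75 s duration (half of the 7.5 s cycle implied by 8 strokes per minute) and the rest-to-rest assumption are all illustrative.

```python
def quintic_profile(t, duration, stroke):
    """Displacement and velocity of a rest-to-rest fifth-order polynomial move.

    Assumes zero velocity and acceleration at both ends (an assumption of
    this sketch, not a detail taken from the abstract).
    """
    tau = min(max(t / duration, 0.0), 1.0)  # normalised time in [0, 1]
    displacement = stroke * (10 * tau**3 - 15 * tau**4 + 6 * tau**5)
    velocity = stroke / duration * (30 * tau**2 - 60 * tau**3 + 30 * tau**4)
    return displacement, velocity


# Hypothetical downstroke: 1200 mm covered in 3.75 s.
samples = [quintic_profile(0.375 * k, 3.75, 1200.0) for k in range(11)]
```

The velocity peaks at mid-stroke and falls smoothly to zero at both ends, which avoids abrupt jumps in the motion commanded from the drive.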
Modelling, Simulation and Testing of a Reconfigurable Cable-Based Parallel Manipulator as Motion Aiding System

Directory of Open Access Journals (Sweden)

Gianni Castelli

2010-01-01

This paper presents results on the modelling, simulation and experimental tests of a cable-based parallel manipulator to be used as an aiding or guiding system for people with motion disabilities. There is a high level of motivation for people with a motion disability or the elderly to perform basic daily-living activities independently. Therefore, it is of great interest to design and implement safe and reliable motion assisting and guiding devices that are able to help end-users. In general, a robot for a medical application should be able to interact with a patient in safety conditions, i.e. it must not damage people or surroundings; it must be designed to guarantee high accuracy and low acceleration during the operation. Furthermore, it should not be too bulky and it should exert limited wrenches after close interaction with people. It can be advisable to have a portable system which can be easily brought into and assembled in a hospital or a domestic environment. Cable-based robotic structures can fulfil those requirements because of their main characteristics that make them light and intrinsically safe. In this paper, a reconfigurable four-cable-based parallel manipulator has been proposed as a motion assisting and guiding device to help people to accomplish a number of tasks, such as an aiding or guiding system to move the upper and lower limbs or the whole body. Modelling and simulation are presented in the ADAMS environment. Moreover, experimental tests are reported as based on an available laboratory prototype.
Implementation science: a role for parallel dual processing models of reasoning?

Science.gov (United States)

Sladek, Ruth M; Phillips, Paddy A; Bond, Malcolm J

2006-05-25

A better theoretical base for understanding professional behaviour change is needed to support evidence-based changes in medical practice. Traditionally strategies to encourage changes in clinical practices have been guided empirically, without explicit consideration of underlying theoretical rationales for such strategies. This paper considers a theoretical framework for reasoning from within psychology for identifying individual differences in cognitive processing between doctors that could moderate the decision to incorporate new evidence into their clinical decision-making. Parallel dual processing models of reasoning posit two cognitive modes of information processing that are in constant operation as humans reason. One mode has been described as experiential, fast and heuristic; the other as rational, conscious and rule based. Within such models, the uptake of new research evidence can be represented by the latter mode; it is reflective, explicit and intentional. On the other hand, well practiced clinical judgments can be positioned in the experiential mode, being automatic, reflexive and swift. Research suggests that individual differences between people in both cognitive capacity (e.g., intelligence) and cognitive processing (e.g., thinking styles) influence how both reasoning modes interact. This being so, it is proposed that these same differences between doctors may moderate the uptake of new research evidence. Such dispositional characteristics have largely been ignored in research investigating effective strategies in implementing research evidence. Whilst medical decision-making occurs in a complex social environment with multiple influences and decision makers, it remains true that an individual doctor's judgment still retains a key position in terms of diagnostic and treatment decisions for individual patients. This paper argues therefore, that individual differences between doctors in terms of reasoning are important considerations in any
Comparing planar image quality of rotating slat and parallel hole collimation: influence of system modeling

International Nuclear Information System (INIS)

Holen, Roel van; Vandenberghe, Stefaan; Staelens, Steven; Lemahieu, Ignace

2008-01-01

The main remaining challenge for a gamma camera is to overcome the existing trade-off between collimator spatial resolution and system sensitivity. This problem, strongly limiting the performance of parallel hole collimated gamma cameras, can be overcome by applying new collimator designs such as rotating slat (RS) collimators which have a much higher photon collection efficiency. The drawback of a RS collimated gamma camera is that, even for obtaining planar images, image reconstruction is needed, resulting in noise accumulation. However, nowadays iterative reconstruction techniques with accurate system modeling can provide better image quality. Because the impact of this modeling on image quality differs from one system to another, an objective assessment of the image quality obtained with a RS collimator is needed in comparison to classical projection images obtained using a parallel hole (PH) collimator. In this paper, a comparative study of image quality, achieved with system modeling, is presented. RS data are reconstructed to planar images using maximum likelihood expectation maximization (MLEM) with an accurate Monte Carlo derived system matrix while PH projections are deconvolved using a Monte Carlo derived point-spread function. Contrast-to-noise characteristics are used to show image quality for cold and hot spots of varying size. Influence of the object size and contrast is investigated using the optimal contrast-to-noise ratio (CNR_o). For a typical phantom setup, results show that cold spot imaging is slightly better for a PH collimator. For hot spot imaging, the CNR_o of the RS images is found to increase with increasing lesion diameter and lesion contrast while it decreases when background dimensions become larger. Only for very large background dimensions in combination with low contrast lesions, the use of a PH collimator could be beneficial for hot spot imaging. In all other cases, the RS collimator scores better. Finally, the simulation of a
Accelerating population balance-Monte Carlo simulation for coagulation dynamics from the Markov jump model, stochastic algorithm and GPU parallel computing

International Nuclear Information System (INIS)

Xu, Zuwei; Zhao, Haibo; Zheng, Chuguang

2015-01-01

This paper proposes a comprehensive framework for accelerating population balance-Monte Carlo (PBMC) simulation of particle coagulation dynamics. By combining Markov jump model, weighted majorant kernel and GPU (graphics processing unit) parallel computing, a significant gain in computational efficiency is achieved. The Markov jump model constructs a coagulation-rule matrix of differentially-weighted simulation particles, so as to capture the time evolution of particle size distribution with low statistical noise over the full size range and as far as possible to reduce the number of time loopings. Here three coagulation rules are highlighted and it is found that constructing appropriate coagulation rule provides a route to attain the compromise between accuracy and cost of PBMC methods. Further, in order to avoid double looping over all simulation particles when considering the two-particle events (typically, particle coagulation), the weighted majorant kernel is introduced to estimate the maximum coagulation rates being used for acceptance–rejection processes by single-looping over all particles, and meanwhile the mean time-step of coagulation event is estimated by summing the coagulation kernels of rejected and accepted particle pairs. The computational load of these fast differentially-weighted PBMC simulations (based on the Markov jump model) is reduced greatly to be proportional to the number of simulation particles in a zero-dimensional system (single cell). Finally, for a spatially inhomogeneous multi-dimensional (multi-cell) simulation, the proposed fast PBMC is performed in each cell, and multiple cells are parallel processed by multi-cores on a GPU that can implement the massively threaded data-parallel tasks to obtain remarkable speedup ratio (comparing with CPU computation, the speedup ratio of GPU parallel computing is as high as 200 in a case of 100 cells with 10 000 simulation particles per cell). These accelerating approaches of PBMC are
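The acceptance-rejection step described in this abstract can be illustrated with a much simpler stand-in. The sketch below is not the paper's weighted majorant kernel: it assumes equally weighted particles, a continuum-regime Brownian kernel without its physical prefactor, and a single global upper bound obtained from the smallest and largest particle in a linear-time scan. All function names are hypothetical.

```python
import random


def brownian_kernel(vi, vj):
    """Continuum-regime Brownian kernel, up to a constant prefactor."""
    a, b = vi ** (1 / 3), vj ** (1 / 3)
    return (a + b) * (1 / a + 1 / b)  # equals 2 + a/b + b/a


def kernel_upper_bound(volumes):
    """Upper bound on the kernel over all pairs, from min and max size only.

    Since K = 2 + x + 1/x with x the size ratio, and x + 1/x grows with x
    for x >= 1, the extreme ratio gives the largest possible value.
    """
    r = (max(volumes) / min(volumes)) ** (1 / 3)
    return 2 + r + 1 / r


def pick_coagulation_pair(volumes, rng):
    """Propose uniform random pairs; accept with probability K / K_max."""
    k_max = kernel_upper_bound(volumes)
    n = len(volumes)
    while True:
        i, j = rng.sample(range(n), 2)
        if rng.random() * k_max <= brownian_kernel(volumes[i], volumes[j]):
            return i, j


def coagulate(volumes, rng):
    """Merge one accepted pair in place; total volume is conserved."""
    i, j = pick_coagulation_pair(volumes, rng)
    volumes[i] += volumes[j]
    volumes.pop(j)
    return volumes


rng = random.Random(0)
particles = [1.0, 2.0, 8.0, 27.0]
print(coagulate(particles, rng))
```

The point of the bound is that no double loop over particle pairs is needed before sampling: only quantities computed by single passes over the particle list enter the acceptance test.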
Getting To Exascale: Applying Novel Parallel Programming Models To Lab Applications For The Next Generation Of Supercomputers

Energy Technology Data Exchange (ETDEWEB)

Dube, Evi [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Shereda, Charles [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Nau, Lee [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Harris, Lance [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)

2010-09-27

As supercomputing moves toward exascale, node architectures will change significantly. CPU core counts on nodes will increase by an order of magnitude or more. Heterogeneous architectures will become more commonplace, with GPUs or FPGAs providing additional computational power. Novel programming models may make better use of on-node parallelism in these new architectures than do current models. In this paper we examine several of these novel models (UPC, CUDA, and OpenCL) to determine their suitability to LLNL scientific application codes. Our study consisted of several phases: we conducted interviews with code teams and selected two codes to port; we learned how to program in the new models and ported the codes; we debugged and tuned the ported applications; and we measured results and documented our findings. We conclude that UPC is a challenge for porting code, Berkeley UPC is not very robust, and UPC is not suitable as a general alternative to OpenMP for a number of reasons. CUDA is well supported and robust but is a proprietary NVIDIA standard, while OpenCL is an open standard. Both are well suited to a specific set of application problems that can be run on GPUs, but some problems are not suited to GPUs. Further study of the landscape of novel models is recommended.
OpenSWPC: an open-source integrated parallel simulation code for modeling seismic wave propagation in 3D heterogeneous viscoelastic media

Science.gov (United States)

Maeda, Takuto; Takemura, Shunsuke; Furumura, Takashi

2017-07-01

We have developed an open-source software package, Open-source Seismic Wave Propagation Code (OpenSWPC), for parallel numerical simulations of seismic wave propagation in 3D and 2D (P-SV and SH) viscoelastic media based on the finite difference method at local-to-regional scales. This code is equipped with a frequency-independent attenuation model based on the generalized Zener body and an efficient perfectly matched layer for the absorbing boundary condition. Hybrid-style programming using OpenMP and the Message Passing Interface (MPI) is adopted for efficient parallel computation. OpenSWPC has wide applicability for seismological studies and is highly portable, delivering excellent performance on systems ranging from PC clusters to supercomputers. Without modifying the code, users can conduct seismic wave propagation simulations using their own velocity structure models and the necessary source representations by specifying them in an input parameter file. The code has various modes for different types of velocity structure model input and different source representations such as single force, moment tensor and plane-wave incidence, which can easily be selected via the input parameters. Widely used binary data formats, the Network Common Data Form (NetCDF) and the Seismic Analysis Code (SAC), are adopted for the input of the heterogeneous structure model and the outputs of the simulation results, so users can easily handle the input/output datasets. All codes are written in Fortran 2003 and are available with detailed documents in a public repository.
New method for model coupling using Stampi. Application to the coupling of atmosphere model (MM5) and land-surface model (SOLVEG)

International Nuclear Information System (INIS)

Nagai, Haruyasu

2003-12-01

A new method to couple atmosphere and land-surface models using the message passing interface (MPI) was proposed to develop an atmosphere-land model for studies on heat, water, and material exchanges around the land surface. A non-hydrostatic atmospheric dynamic model of Pennsylvania State University and National Center for Atmospheric Research (PSU/NCAR-MM5) and a detailed land-surface model (SOLVEG), which includes the surface-layer atmosphere, soil, and vegetation and was developed at Japan Atomic Energy Research Institute (JAERI), are used as the atmosphere and land-surface models, respectively. For the MPI, a message passing library named Stampi, developed at JAERI, that can be used between different parallel computers is used. The models are coupled by exchanging calculation results via MPI during their independent parallel calculations. The modifications for this model coupling are easy: some modules for data exchange are simply added to each model code without changing each model's original structure. Moreover, this coupling method is flexible and allows the use of an independent time step and grid interval for each model. (author)
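The coupling pattern in this record, two models that advance with their own time steps and only meet at exchange points, can be sketched without any message passing library. The example below is a single-process stand-in: the plain variable swap marks where a send/receive pair would sit in an MPI-based coupling. The class, its relaxation "physics", and all numerical values are hypothetical and have nothing to do with MM5 or SOLVEG themselves.

```python
class ToyModel:
    """Relaxes its temperature toward the last value received from a partner."""

    def __init__(self, temp, dt, rate):
        self.temp = temp          # model state (K)
        self.dt = dt              # model's own time step (s)
        self.rate = rate          # relaxation rate (1/s)
        self.partner_temp = temp  # boundary value from the other model
        self.time = 0

    def advance_to(self, t_end):
        """Step with the model's own dt until the next exchange time."""
        while self.time < t_end:
            self.temp += self.rate * self.dt * (self.partner_temp - self.temp)
            self.time += self.dt


def run_coupled(atmos, land, exchange_interval, t_final):
    """Alternate between data exchange and independent integration."""
    t = 0
    while t < t_final:
        # Exchange step: in a real coupling this is a send/receive pair.
        atmos.partner_temp, land.partner_temp = land.temp, atmos.temp
        t += exchange_interval
        atmos.advance_to(t)
        land.advance_to(t)
    return atmos.temp, land.temp


atmosphere = ToyModel(temp=300.0, dt=10, rate=0.01)
land_surface = ToyModel(temp=280.0, dt=30, rate=0.002)
print(run_coupled(atmosphere, land_surface, exchange_interval=60, t_final=600))
```

The only constraint the sketch imposes is that the exchange interval is a multiple of both time steps; neither model needs to know the other's step size or internal structure.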
Development of design technology on thermal-hydraulic performance in tight-lattice rod bundle. 4. Large paralleled simulation by the advanced two-fluid model code

International Nuclear Information System (INIS)

Misawa, Takeharu; Yoshida, Hiroyuki; Akimoto, Hajime

2008-01-01

At the Japan Atomic Energy Agency (JAEA), the Innovative Water Reactor for Flexible Fuel Cycle (FLWR) has been developed. For the thermal design of the FLWR, it is necessary to develop an analytical method to predict boiling transition in the FLWR. JAEA has been developing a three-dimensional two-fluid model analysis code, ACE-3D, which adopts a boundary-fitted coordinate system to simulate flow in channels of complex shape. In this paper, as part of the development of ACE-3D for application to rod bundle analysis, the introduction of parallelization to ACE-3D and assessments of ACE-3D are shown. In the analysis of a large-scale domain such as a rod bundle, even a two-fluid model incurs a large computational cost, which exceeds the memory limit of a single CPU. Therefore, parallelization was introduced to ACE-3D to divide the data for the analysis of a large-scale domain among a large number of CPUs, and it is confirmed that the analysis of a large-scale domain such as a rod bundle can be performed by parallel computation while maintaining parallel performance even when using a large number of CPUs. ACE-3D adopts two-phase flow models, some of which are dependent upon channel geometry. Therefore, analyses in domains that simulate an individual subchannel and a 37-rod bundle are performed and compared with experiments. It is confirmed that the results obtained by both analyses using ACE-3D agree qualitatively with past experimental results. (author)

Programming Models in HPC

Energy Technology Data Exchange (ETDEWEB)

Shipman, Galen M. [Los Alamos National Lab. (LANL), Los Alamos, NM (United States)

2016-06-13

These are the slides for a presentation on programming models in HPC, at the Los Alamos National Laboratory's Parallel Computing Summer School. The following topics are covered: Flynn's Taxonomy of computer architectures; single instruction single data; single instruction multiple data; multiple instruction multiple data; address space organization; definition of Trinity (Intel Xeon-Phi is a MIMD architecture); single program multiple data; multiple program multiple data; ExMatEx workflow overview; definition of a programming model, programming languages, runtime systems; programming model and environments; MPI (Message Passing Interface); OpenMP; Kokkos (Performance Portable Thread-Parallel Programming Model); Kokkos abstractions, patterns, policies, and spaces; RAJA, a systematic approach to node-level portability and tuning; overview of the Legion Programming Model; mapping tasks and data to hardware resources; interoperability: supporting task-level models; Legion S3D execution and performance details; workflow, integration of external resources into the programming model.
Sierra toolkit computational mesh conceptual model

International Nuclear Information System (INIS)

Baur, David G.; Edwards, Harold Carter; Cochran, William K.; Williams, Alan B.; Sjaardema, Gregory D.

2010-01-01

The Sierra Toolkit computational mesh is a software library intended to support massively parallel multi-physics computations on dynamically changing unstructured meshes. This domain of intended use is inherently complex due to distributed memory parallelism, parallel scalability, heterogeneity of physics, heterogeneous discretization of an unstructured mesh, and runtime adaptation of the mesh. Management of this inherent complexity begins with a conceptual analysis and modeling of this domain of intended use; i.e., development of a domain model. The Sierra Toolkit computational mesh software library is designed and implemented based upon this domain model. Software developers using, maintaining, or extending the Sierra Toolkit computational mesh library must be familiar with the concepts/domain model presented in this report.
Finite mixture model applied in the analysis of a turbulent bistable flow on two parallel circular cylinders

Energy Technology Data Exchange (ETDEWEB)

Paula, A.V. de, E-mail: vagtinski@mecanica.ufrgs.br [PROMEC – Programa de Pós Graduação em Engenharia Mecânica, UFRGS – Universidade Federal do Rio Grande do Sul, Porto Alegre, RS (Brazil); Möller, S.V., E-mail: svmoller@ufrgs.br [PROMEC – Programa de Pós Graduação em Engenharia Mecânica, UFRGS – Universidade Federal do Rio Grande do Sul, Porto Alegre, RS (Brazil)

2013-11-15

This paper presents a study of the bistable phenomenon which occurs in the turbulent flow impinging on circular cylinders placed side-by-side. Time series of axial and transversal velocity obtained with the constant temperature hot wire anemometry technique in an aerodynamic channel are used as input data in a finite mixture model, to classify the observed data according to a family of probability density functions. Wavelet transforms are applied to analyze the unsteady turbulent signals. Results of flow visualization show that the flow is predominantly two-dimensional. A double-well energy model is suggested to describe the behavior of the bistable phenomenon in this case. -- Highlights: ► Bistable flow on two parallel cylinders is studied with hot wire anemometry as a first step for the application on the analysis to tube bank flow. ► The method of maximum likelihood estimation is applied to hot wire experimental series to classify the data according to PDF functions in a mixture model approach. ► Results show no evident correlation between the changes of flow modes with time. ► An energy model suggests the presence of more than two flow modes.
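A finite mixture fit by maximum likelihood, as mentioned in this record, is commonly computed with the expectation-maximization (EM) algorithm. The sketch below assumes two Gaussian components and a one-dimensional velocity signal; the paper's actual family of probability density functions and number of modes may differ. The synthetic data and all names are illustrative only.

```python
import math
import random


def normal_pdf(x, mean, std):
    """Density of a normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2 * math.pi))


def fit_two_mode_mixture(data, iterations=100):
    """EM for a two-component 1D Gaussian mixture.

    Returns (weights, means, stds), each a list of two values.
    """
    lo, hi = min(data), max(data)
    means = [lo + 0.25 * (hi - lo), lo + 0.75 * (hi - lo)]
    stds = [(hi - lo) / 4, (hi - lo) / 4]
    weights = [0.5, 0.5]
    for _ in range(iterations):
        # E-step: probability that each sample belongs to component 0.
        resp0 = []
        for x in data:
            p0 = weights[0] * normal_pdf(x, means[0], stds[0])
            p1 = weights[1] * normal_pdf(x, means[1], stds[1])
            resp0.append(p0 / (p0 + p1))
        # M-step: re-estimate weight, mean and spread of each component.
        for k in (0, 1):
            resp = resp0 if k == 0 else [1 - r for r in resp0]
            total = sum(resp)
            weights[k] = total / len(data)
            means[k] = sum(r * x for r, x in zip(resp, data)) / total
            var = sum(r * (x - means[k]) ** 2 for r, x in zip(resp, data)) / total
            stds[k] = max(math.sqrt(var), 1e-6)
    return weights, means, stds


# Synthetic "bistable" signal: two velocity levels visited 60% / 40% of the time.
rng = random.Random(1)
signal = [rng.gauss(5.0, 0.5) for _ in range(600)]
signal += [rng.gauss(9.0, 0.5) for _ in range(400)]
print(fit_two_mode_mixture(signal))
```

Each sample can then be assigned to the flow mode with the larger posterior probability, which is the classification step the abstract refers to.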
Stage-by-Stage and Parallel Flow Path Compressor Modeling for a Variable Cycle Engine, NASA Advanced Air Vehicles Program - Commercial Supersonic Technology Project - AeroServoElasticity

Science.gov (United States)

Kopasakis, George; Connolly, Joseph W.; Cheng, Larry

2015-01-01

This paper covers the development of stage-by-stage and parallel flow path compressor modeling approaches for a Variable Cycle Engine. The stage-by-stage compressor modeling approach is an extension of a technique for lumped volume dynamics and performance characteristic modeling. It was developed to improve the accuracy of axial compressor dynamics over lumped volume dynamics modeling. The stage-by-stage compressor model presented here is formulated into a parallel flow path model that includes both axial and rotational dynamics. This is done to enable the study of compressor and propulsion system dynamic performance under flow distortion conditions. The approaches utilized here are generic and should be applicable for the modeling of any axial flow compressor design accurate time domain simulations. The objective of this work is as follows. Given the parameters describing the conditions of atmospheric disturbances, and utilizing the derived formulations, directly compute the transfer function poles and zeros describing these disturbances for acoustic velocity, temperature, pressure, and density. Time domain simulations of representative atmospheric turbulence can then be developed by utilizing these computed transfer functions together with the disturbance frequencies of interest.
Ocean Modeling and Visualization on Massively Parallel Computer

Science.gov (United States)

Chao, Yi; Li, P. Peggy; Wang, Ping; Katz, Daniel S.; Cheng, Benny N.

1997-01-01

Climate modeling is one of the grand challenges of computational science, and ocean modeling plays an important role in both understanding the current climatic conditions and predicting future climate change.
Sequential and Parallel Attack Tree Modelling

NARCIS (Netherlands)

Arnold, Florian; Guck, Dennis; Kumar, Rajesh; Stoelinga, Mariëlle Ida Antoinette; Koornneef, Floor; van Gulijk, Coen

The intricacy of socio-technical systems requires a careful planning and utilisation of security resources to ensure uninterrupted, secure and reliable services. Even though many studies have been conducted to understand and model the behaviour of a potential attacker, the detection of crucial
Evaluation of the hydrodynamic behaviour of turbulence promoters in parallel plate electrochemical reactors by means of the dispersion model

International Nuclear Information System (INIS)

Colli, A.N.; Bisang, J.M.

2011-01-01

Highlights: · The type of turbulence promoters has a strong influence on the hydrodynamics. · The dispersion model is appropriate for expanded plastic turbulence promoters. · The dispersion model is appropriate for glass beads turbulence promoters. - Abstract: The hydrodynamic behaviour of electrochemical reactors with parallel plate electrodes is experimentally studied using the stimulus-response method either with an empty reactor or with different turbulence promoters. Theoretical results which are in accordance with the analytical and numerical resolution of the dispersion model for a closed system are compared with the classical relationships of the normalized outlet concentration for open systems and the validity range of the equations is discussed. The experimental results were well correlated with the dispersion model using glass beads or expanded plastic meshes as turbulence promoters, which have shown the most advantageous performance. The Peclet number was higher than 63. The dispersion coefficient was found to increase linearly with flow velocity in these cases.
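The abstract contrasts closed-system and open-system forms of the dispersion model. A quick way to see how much the boundary conditions matter is to compare the textbook expressions for the dimensionless variance of the residence time distribution as a function of the Peclet number Pe = uL/D. These are the standard moment relations from reaction engineering texts, used here as an assumption; they are not necessarily the equations the authors worked with, and the function names are made up for the example.

```python
import math


def variance_closed(pe):
    """Dimensionless RTD variance for a closed-closed vessel."""
    return 2 / pe - 2 / pe**2 * (1 - math.exp(-pe))


def variance_open(pe):
    """Dimensionless RTD variance for an open-open vessel."""
    return 2 / pe + 8 / pe**2


# The abstract reports Peclet numbers above 63 for the best promoters.
for pe in (5, 20, 63, 200):
    closed, opened = variance_closed(pe), variance_open(pe)
    print(pe, round(closed, 5), round(opened, 5), round(opened / closed, 3))
```

At low Peclet numbers the two expressions differ substantially, while for large Peclet numbers both approach 2/Pe, so the choice of boundary condition matters less as dispersion decreases.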
Isotropic damage model and serial/parallel mix theory applied to nonlinear analysis of ferrocement thin walls. Experimental and numerical analysis

Directory of Open Access Journals (Sweden)

Jairo A. Paredes

2016-01-01

Ferrocement thin walls are the structural elements that comprise the earthquake resistant system of dwellings built with this material. This article presents the results drawn from an experimental campaign carried out over full-scale precast ferrocement thin walls that were assessed under lateral static loading conditions. The tests allowed the identification of structural parameters and the evaluation of the performance of the walls under static loading conditions. Additionally, an isotropic damage model for modelling the mortar was applied, as well as the classic elasto-plastic theory for modelling the meshes and reinforcing bars. The ferrocement is considered as a composite material, thus the serial/parallel mix theory is used for modelling its mechanical behavior. In this work a methodology for the numerical analysis that allows modeling the nonlinear behavior exhibited by ferrocement walls under static loading conditions, as well as their potential use in earthquake resistant design, is proposed.
Computational split-field finite-difference time-domain evaluation of simplified tilt-angle models for parallel-aligned liquid-crystal devices

Science.gov (United States)

Márquez, Andrés; Francés, Jorge; Martínez, Francisco J.; Gallego, Sergi; Álvarez, Mariela L.; Calzado, Eva M.; Pascual, Inmaculada; Beléndez, Augusto

2018-03-01

Simplified analytical models with predictive capability enable simpler and faster optimization of the performance in applications of complex photonic devices. We recently demonstrated the most simplified analytical model still showing predictive capability for parallel-aligned liquid crystal on silicon (PA-LCoS) devices, which provides the voltage-dependent retardance for a very wide range of incidence angles and any wavelength in the visible. We further show that the proposed model is not only phenomenological but also physically meaningful, since two of its parameters provide the correct values for important internal properties of these devices related to the birefringence, cell gap, and director profile. Therefore, the proposed model can be used as a means to inspect internal physical properties of the cell. As an innovation, we also show the applicability of the split-field finite-difference time-domain (SF-FDTD) technique for phase-shift and retardance evaluation of PA-LCoS devices under oblique incidence. As a simplified model for PA-LCoS devices, we also consider the exact description of homogeneous birefringent slabs. However, we show that, despite its higher degree of simplification, the proposed model is more robust, providing unambiguous and physically meaningful solutions when fitting its parameters.
Calibrationless Parallel Magnetic Resonance Imaging: A Joint Sparsity Model

Directory of Open Access Journals (Sweden)

Angshul Majumdar

2013-12-01

State-of-the-art parallel MRI techniques either explicitly or implicitly require certain parameters to be estimated, e.g., the sensitivity map for SENSE and SMASH, and the interpolation weights for GRAPPA and SPIRiT. Thus all these techniques are sensitive to the calibration (parameter estimation) stage. In this work, we propose a parallel MRI technique that does not require any calibration but yields reconstruction results that are on par with (or even better than) state-of-the-art methods in parallel MRI. Our proposed method requires solving non-convex analysis and synthesis prior joint-sparsity problems. This work also derives the algorithms for solving them. Experimental validation was carried out on two datasets: an eight-channel brain dataset and an eight-channel Shepp-Logan phantom. Two sampling methods were used: Variable Density Random sampling and non-Cartesian Radial sampling. For the brain data, an acceleration factor of 4 was used, and for the phantom an acceleration factor of 6 was used. The reconstruction results were quantitatively evaluated based on the Normalised Mean Squared Error between the reconstructed images and the originals. The qualitative evaluation was based on the actual reconstructed images. We compared our work with four state-of-the-art parallel imaging techniques: two calibrated methods (CS SENSE and l1SPIRiT) and two calibration-free techniques (Distributed CS and SAKE). Our method yields better reconstruction results than all of them.
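The abstract names the Normalised Mean Squared Error as its quantitative metric but does not spell out the normalisation. A minimal sketch, assuming the common convention of dividing the squared error energy by the energy of the reference image, is given below. Images are passed as flat sequences of pixel values, which may be complex; the function name is hypothetical.

```python
def nmse(reconstructed, reference):
    """Squared error energy divided by the energy of the reference image."""
    if len(reconstructed) != len(reference):
        raise ValueError("images must have the same number of pixels")
    error_energy = sum(abs(r - x) ** 2 for r, x in zip(reconstructed, reference))
    reference_energy = sum(abs(x) ** 2 for x in reference)
    return error_energy / reference_energy


original = [1 + 1j, 2 + 0j, 0 + 3j, 4 - 1j]
estimate = [1 + 1j, 2.1 + 0j, 0 + 2.9j, 4 - 1j]
print(nmse(estimate, original))
```

Under this convention a perfect reconstruction scores 0 and an all-zero image scores 1, so lower values indicate better reconstructions.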
The CRAFT Fortran Programming Model

Directory of Open Access Journals (Sweden)

Douglas M. Pase

1994-01-01

Many programming models for massively parallel machines exist, and each has its advantages and disadvantages. In this article we present a programming model that combines features from other programming models that (1) can be efficiently implemented on present and future Cray Research massively parallel processor (MPP) systems and (2) are useful in constructing highly parallel programs. The model supports several styles of programming: message-passing, data parallel, global address (shared data), and work-sharing. These styles may be combined within the same program. The model includes features that allow a user to define a program in terms of the behavior of the system as a whole, where the behavior of individual tasks is implicit from this systemic definition. (In general, features marked as shared are designed to support this perspective.) It also supports an opposite perspective, where a program may be defined in terms of the behaviors of individual tasks, and a program is implicitly the sum of the behaviors of all tasks. (Features marked as private are designed to support this perspective.) Users can exploit any combination of either set of features without ambiguity and thus are free to define a program from whatever perspective is most appropriate to the problem at hand.
Shapes of leaves with parallel venation. Modelling of the Epipactis sp. (Orchidaceae) leaves with the help of a system of coupled elastic beams

OpenAIRE

Jakubska-Busse, Anna; Janowicz, Maciej; Ochnio, Luiza; Jackowska-Zduniak, Beata

2016-01-01

Static properties of leaves with parallel venation, with particular emphasis on the genus Epipactis Zinn, 1757 (Orchidaceae, Neottieae), have been modelled with coupled quasi-parallel elastic “beams.” The non-linear theory of strongly bent beams has been employed. The resulting boundary-value problem has been solved numerically with the help of the finite-difference method. Possible dislocations resulting in additional Dirac-delta-like forces have been taken into account. Morphological simila...
Parallel models of associative memory

CERN Document Server

Hinton, Geoffrey E

2014-01-01

This update of the 1981 classic on neural networks includes new commentaries by the authors that show how the original ideas are related to subsequent developments. As researchers continue to uncover ways of applying the complex information processing abilities of neural networks, they give these models an exciting future which may well involve revolutionary developments in understanding the brain and the mind -- developments that may allow researchers to build adaptive intelligent machines. The original chapters show where the ideas came from and the new commentaries show where they are going
Hybrid parallel execution model for logic-based specification languages

CERN Document Server

Tsai, Jeffrey J P

2001-01-01

Parallel processing is a very important technique for improving the performance of various software development and maintenance activities. The purpose of this book is to introduce important techniques for parallel execution of high-level specifications of software systems. These techniques are very useful for the construction, analysis, and transformation of reliable large-scale and complex software systems. Contents: Current Approaches; Overview of the New Approach; FRORL Requirements Specification Language and Its Decomposition; Rewriting and Data Dependency, Control Flow Analysis of a Lo
Multi-objective optimization algorithms for mixed model assembly line balancing problem with parallel workstations

Directory of Open Access Journals (Sweden)

Masoud Rabbani

2016-12-01

This paper deals with the mixed model assembly line (MMAL) balancing problem of type I. In MMALs, several products are made on one assembly line, and the similarity of these products is high. As a result, it is possible to assemble several types of products simultaneously without any additional setup times. The problem has some particular features such as parallel workstations and precedence constraints in dynamic periods, in which each period also affects the next period. The research intends to reduce the number of workstations and maximize the workload smoothness between workstations. Dynamic periods are used to determine all variables in different periods to achieve efficient solutions. A non-dominated sorting genetic algorithm (NSGA-II) and multi-objective particle swarm optimization (MOPSO) are used to solve the problem. The proposed model is validated with GAMS software for small-size problems, and the performance of the two algorithms is compared based on several comparison metrics. NSGA-II outperforms MOPSO with respect to some of the comparison metrics used in this paper, but in other metrics MOPSO is better than NSGA-II. Finally, conclusions and future research directions are provided.
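Both algorithms named in this record rank candidate line balances by Pareto dominance rather than by a single score. The sketch below shows only that ranking step, in its simplest quadratic form, and is not the paper's implementation. It assumes both objectives are written as quantities to minimise: the number of workstations and a hypothetical smoothness index where lower means a more even workload. The candidate values are made up.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and better in one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    better = any(x < y for x, y in zip(a, b))
    return no_worse and better


def non_dominated_sort(points):
    """Group point indices into fronts; front 0 is the Pareto-optimal set."""
    remaining = set(range(len(points)))
    fronts = []
    while remaining:
        front = sorted(
            i for i in remaining
            if not any(dominates(points[j], points[i])
                       for j in remaining if j != i)
        )
        fronts.append(front)
        remaining -= set(front)
    return fronts


# (number of workstations, smoothness index) for five hypothetical balances.
candidates = [(5, 3.0), (6, 2.0), (6, 3.5), (7, 2.0), (5, 4.0)]
print(non_dominated_sort(candidates))
```

Candidates 0 and 1 form the first front because neither is beaten on both objectives at once; the remaining three are each dominated by one of them and fall into the second front.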
ModelMate - A graphical user interface for model analysis

Science.gov (United States)

Banta, Edward R.

2011-01-01

ModelMate is a graphical user interface designed to facilitate use of model-analysis programs with models. This initial version of ModelMate supports one model-analysis program, UCODE_2005, and one model software program, MODFLOW-2005. ModelMate can be used to prepare input files for UCODE_2005, run UCODE_2005, and display analysis results. A link to the GW_Chart graphing program facilitates visual interpretation of results. ModelMate includes capabilities for organizing directories used with the parallel-processing capabilities of UCODE_2005 and for maintaining files in those directories to be identical to a set of files in a master directory. ModelMate can be used on its own or in conjunction with ModelMuse, a graphical user interface for MODFLOW-2005 and PHAST.
Physics based modeling of a series parallel battery pack for asymmetry analysis, predictive control and life extension

Science.gov (United States)

Ganesan, Nandhini; Basu, Suman; Hariharan, Krishnan S.; Kolake, Subramanya Mayya; Song, Taewon; Yeo, Taejung; Sohn, Dong Kee; Doo, Seokgwang

2016-08-01

Lithium-Ion batteries used for electric vehicle applications are subject to large currents and various operation conditions, making battery pack design and life extension a challenging problem. With increase in complexity, modeling and simulation can lead to insights that ensure optimal performance and life extension. In this manuscript, an electrochemical-thermal (ECT) coupled model for a 6 series × 5 parallel pack is developed for Li ion cells with NCA/C electrodes and validated against experimental data. Contribution of the cathode to overall degradation at various operating conditions is assessed. Pack asymmetry is analyzed from a design and an operational perspective. Design based asymmetry leads to a new approach of obtaining the individual cell responses of the pack from an average ECT output. Operational asymmetry is demonstrated in terms of effects of thermal gradients on cycle life, and an efficient model predictive control technique is developed. Concept of reconfigurable battery pack is studied using detailed simulations that can be used for effective monitoring and extension of battery pack life.
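As a point of reference for the 6 series × 5 parallel layout, the short Python sketch below computes the nominal ratings of an idealized pack of identical cells. The cell voltage and capacity are assumed, illustrative values and are not taken from the paper; the asymmetry the paper studies is precisely the departure from this idealization.

```python
SERIES_CELLS = 6      # from the abstract: 6 series
PARALLEL_CELLS = 5    # from the abstract: 5 parallel

# Assumed, illustrative cell ratings (not from the paper).
CELL_NOMINAL_VOLTAGE_V = 3.6
CELL_CAPACITY_AH = 3.0


def ideal_pack_ratings(series: int, parallel: int,
                       cell_voltage_v: float, cell_capacity_ah: float):
    """Nominal ratings of a pack of identical cells.

    Cells in series add voltage; cells in parallel add capacity.
    """
    pack_voltage_v = series * cell_voltage_v
    pack_capacity_ah = parallel * cell_capacity_ah
    pack_energy_wh = pack_voltage_v * pack_capacity_ah
    return pack_voltage_v, pack_capacity_ah, pack_energy_wh


voltage_v, capacity_ah, energy_wh = ideal_pack_ratings(
    SERIES_CELLS, PARALLEL_CELLS, CELL_NOMINAL_VOLTAGE_V, CELL_CAPACITY_AH
)
print(f"{voltage_v:.1f} V, {capacity_ah:.1f} Ah, {energy_wh:.0f} Wh")
```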
Implementation science: a role for parallel dual processing models of reasoning?

Directory of Open Access Journals (Sweden)

Phillips Paddy A

2006-05-01

Background: A better theoretical base for understanding professional behaviour change is needed to support evidence-based changes in medical practice. Traditionally, strategies to encourage changes in clinical practices have been guided empirically, without explicit consideration of underlying theoretical rationales for such strategies. This paper considers a theoretical framework for reasoning from within psychology for identifying individual differences in cognitive processing between doctors that could moderate the decision to incorporate new evidence into their clinical decision-making. Discussion: Parallel dual processing models of reasoning posit two cognitive modes of information processing that are in constant operation as humans reason. One mode has been described as experiential, fast and heuristic; the other as rational, conscious and rule based. Within such models, the uptake of new research evidence can be represented by the latter mode; it is reflective, explicit and intentional. On the other hand, well practiced clinical judgments can be positioned in the experiential mode, being automatic, reflexive and swift. Research suggests that individual differences between people in both cognitive capacity (e.g., intelligence) and cognitive processing (e.g., thinking styles) influence how both reasoning modes interact. This being so, it is proposed that these same differences between doctors may moderate the uptake of new research evidence. Such dispositional characteristics have largely been ignored in research investigating effective strategies in implementing research evidence. Whilst medical decision-making occurs in a complex social environment with multiple influences and decision makers, it remains true that an individual doctor's judgment still retains a key position in terms of diagnostic and treatment decisions for individual patients. This paper argues therefore, that individual differences between doctors in terms of
3D seismic modeling and reverse‐time migration with the parallel Fourier method using non‐blocking collective communications

KAUST Repository

Chu, Chunlei

2009-01-01

The major performance bottleneck of the parallel Fourier method on distributed memory systems is the network communication cost. In this study, we investigate the potential of using non‐blocking all‐to‐all communications to solve this problem by overlapping computation and communication. We present the runtime comparison of a 3D seismic modeling problem with the Fourier method using non‐blocking and blocking calls, respectively, on a Linux cluster. The data demonstrate that a performance improvement of up to 40% can be achieved by simply changing blocking all‐to‐all communication calls to non‐blocking ones to introduce the overlapping capability. A 3D reverse‐time migration result is also presented as an extension to the modeling work based on non‐blocking collective communications.
LMFAO! Humor as a Response to Fear: Decomposing Fear Control within the Extended Parallel Process Model

Science.gov (United States)

Abril, Eulàlia P.; Szczypka, Glen; Emery, Sherry L.

2017-01-01

This study seeks to analyze fear control responses to the 2012 Tips from Former Smokers campaign using the Extended Parallel Process Model (EPPM). The goal is to examine the occurrence of ancillary fear control responses, like humor. In order to explore individuals’ responses in an organic setting, we use Twitter data—tweets—collected via the Firehose. Content analysis of relevant fear control tweets (N = 14,281) validated the existence of boomerang responses within the EPPM: denial, defensive avoidance, and reactance. More importantly, results showed that humor tweets were not only a significant occurrence but constituted the majority of fear control responses. PMID:29527092

Landau fluid models of collisionless magnetohydrodynamics

International Nuclear Information System (INIS)

Snyder, P.B.; Hammett, G.W.; Dorland, W.

1997-01-01

A closed set of fluid moment equations including models of kinetic Landau damping is developed which describes the evolution of collisionless plasmas in the magnetohydrodynamic parameter regime. The model is fully electromagnetic and describes the dynamics of both compressional and shear Alfven waves, as well as ion acoustic waves. The model allows for separate parallel and perpendicular pressures $p_\parallel$ and $p_\perp$, and, unlike previous models such as Chew-Goldberger-Low theory, correctly predicts the instability threshold for the mirror instability. Both a simple 3 + 1 moment model and a more accurate 4 + 2 moment model are developed, and both could be useful for numerical simulations of astrophysical and fusion plasmas.
Parallelization of 2-D lattice Boltzmann codes

International Nuclear Information System (INIS)

Suzuki, Soichiro; Kaburaki, Hideo; Yokokawa, Mitsuo.

1996-03-01

Lattice Boltzmann (LB) codes to simulate two-dimensional fluid flow are developed on the vector parallel computer Fujitsu VPP500 and the scalar parallel computer Intel Paragon XP/S. While a 2-D domain decomposition method is used for the scalar parallel LB code, a 1-D domain decomposition method is used for the vector parallel LB code so that it can be vectorized along the axis perpendicular to the direction of the decomposition. High parallel efficiencies of 95.1% for the vector parallel calculation on 16 processors with a 1152x1152 grid and 88.6% for the scalar parallel calculation on 100 processors with an 800x800 grid are obtained. Performance models are developed to analyze the performance of the LB codes. Our performance models show that the execution speed of the vector parallel code is about one hundred times faster than that of the scalar parallel code with the same number of processors, up to 100 processors. We also analyze the scalability when the available memory of one processor element is kept at its maximum. Our performance model predicts that the execution time of the vector parallel code increases by about 3% on 500 processors. Although the 1-D domain decomposition method in general has a drawback in interprocessor communication, the vector parallel LB code is still suitable for large-scale and/or high-resolution simulations. (author)
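The reported efficiencies can be related to speedup with a few lines of Python. This sketch assumes the usual definition of parallel efficiency, E = S / p (speedup divided by processor count); the abstract does not state which definition the authors used, so the derived speedups are indicative only.

```python
def implied_speedup(efficiency: float, processors: int) -> float:
    """Speedup S implied by parallel efficiency E = S / p (assumed definition)."""
    return efficiency * processors


# Efficiencies and processor counts quoted in the abstract.
vector_speedup = implied_speedup(0.951, 16)    # vector parallel code, 1152x1152 grid
scalar_speedup = implied_speedup(0.886, 100)   # scalar parallel code, 800x800 grid

print(f"vector: about {vector_speedup:.1f}x on 16 processors")
print(f"scalar: about {scalar_speedup:.1f}x on 100 processors")
```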
Case Study: Modelling Telecommunication Systems using Concurrent ML

DEFF Research Database (Denmark)

Hoffmann, Torben

1998-01-01

How can telecommunication hardware be modelled in CML? How are generalities and parallelism captured in CML?
A Topological Model for Parallel Algorithm Design

Science.gov (United States)

1991-09-01

effort should be directed to planning, requirements analysis, specification and design, with 20% invested into the actual coding, and then the final 40...be one more language to learn. And by investing the effort into improving the utility of an existing language instead of creating a new one, this...193) it abandons the notion of a process as a fundamental concept of parallel program design and that it facilitates program derivation by rigorously
OpenMP parallelization of a gridded SWAT (SWATG)

Science.gov (United States)

Zhang, Ying; Hou, Jinliang; Cao, Yongpan; Gu, Juan; Huang, Chunlin

2017-12-01

Large-scale, long-term and high spatial resolution simulation is a common issue in environmental modeling. A Gridded Hydrologic Response Unit (HRU)-based Soil and Water Assessment Tool (SWATG) that integrates grid modeling scheme with different spatial representations also presents such problems. The time-consuming problem affects applications of very high resolution large-scale watershed modeling. The OpenMP (Open Multi-Processing) parallel application interface is integrated with SWATG (called SWATGP) to accelerate grid modeling based on the HRU level. Such parallel implementation takes better advantage of the computational power of a shared memory computer system. We conducted two experiments at multiple temporal and spatial scales of hydrological modeling using SWATG and SWATGP on a high-end server. At 500-m resolution, SWATGP was found to be up to nine times faster than SWATG in modeling over a roughly 2000 km² watershed with 1 CPU and a 15-thread configuration. The study results demonstrate that parallel models save considerable time relative to traditional sequential simulation runs. Parallel computations of environmental models are beneficial for model applications, especially at large spatial and temporal scales and at high resolutions. The proposed SWATGP model is thus a promising tool for large-scale and high-resolution water resources research and management in addition to offering data fusion and model coupling ability.
Introduction to parallel programming

CERN Document Server

Brawer, Steven

1989-01-01

Introduction to Parallel Programming focuses on the techniques, processes, methodologies, and approaches involved in parallel programming. The book first offers information on Fortran, hardware and operating system models, and processes, shared memory, and simple parallel programs. Discussions focus on processes and processors, joining processes, shared memory, time-sharing with multiple processors, hardware, loops, passing arguments in function/subroutine calls, program structure, and arithmetic expressions. The text then elaborates on basic parallel programming techniques, barriers and race
A combined PHREEQC-2/parallel fracture model for the simulation of laminar/non-laminar flow and contaminant transport with reactions

Science.gov (United States)

Masciopinto, Costantino; Volpe, Angela; Palmiotta, Domenico; Cherubini, Claudia

2010-09-01

A combination of a parallel fracture model with the PHREEQC-2 geochemical model was developed to simulate sequential flow and chemical transport with reactions in fractured media where both laminar and turbulent flows occur. The integration of non-laminar flow resistances in one model produced relevant effects on water flow velocities, thus improving model prediction capabilities on contaminant transport. The proposed conceptual model consists of 3D rock-blocks, separated by horizontal bedding plane fractures with variable apertures. Particle tracking solved the transport equations for conservative compounds and provided input for PHREEQC-2. For each cluster of contaminant pathways, PHREEQC-2 determined the concentration for mass-transfer, sorption/desorption, ion exchange, mineral dissolution/precipitation and biodegradation, under kinetically controlled reactive processes of equilibrated chemical species. Field tests have been performed for the code verification. As an example, the combined model has been applied to a contaminated fractured aquifer of southern Italy in order to simulate the phenol transport. The code correctly fitted the field available data and also predicted a possible rapid depletion of phenols as a result of an increased biodegradation rate induced by a simulated artificial injection of nitrates, upgradient to the sources.
Execution Model of Three Parallel Languages: OpenMP, UPC and CAF

Directory of Open Access Journals (Sweden)

Ami Marowka

2005-01-01

The aim of this paper is to present a qualitative evaluation of three state-of-the-art parallel languages: OpenMP, Unified Parallel C (UPC) and Co-Array Fortran (CAF). OpenMP and UPC are explicit parallel programming languages based on the ANSI standard. CAF is an implicit programming language. On the one hand, OpenMP is designed for shared-memory architectures and extends the base language with compiler directives that annotate the original source code. On the other hand, UPC and CAF are designed for distributed shared-memory architectures and extend the base language with new parallel constructs. We deconstruct each language into its basic components, show examples, make a detailed analysis, compare them, and finally draw some conclusions.
Dynamics of parallel robots from rigid bodies to flexible elements

CERN Document Server

Briot, Sébastien

2015-01-01

This book starts with a short recapitulation of basic concepts, common to all types of robots (serial, tree structure, parallel, etc.), that are also necessary for computing the dynamic models of parallel robots. Then, as dynamics requires the use of geometry and kinematics, the general equations of the geometric and kinematic models of parallel robots are given. Next, it is explained that parallel robot dynamic models can be obtained by decomposing the real robot into two virtual systems: a tree-structure robot (equivalent to the robot legs for which all joints would be actuated) plus a free body corresponding to the platform. Thus, the dynamics of rigid tree-structure robots is analyzed and algorithms to obtain their dynamic models in the most compact form are given. The dynamic model of the real rigid parallel robot is obtained by closing the loops through the use of the Lagrange multipliers. The problem of the dynamic model degeneracy near singularities is treated and optimal trajectory planning for cro...
Statistical Model Checking of Rich Models and Properties

DEFF Research Database (Denmark)

Poulsen, Danny Bøgsted

in undecidability issues for the traditional model checking approaches. Statistical model checking has proven itself a valuable supplement to model checking and this thesis is concerned with extending this software validation technique to stochastic hybrid systems. The thesis consists of two parts: the first part...... motivates why existing model checking technology should be supplemented by new techniques. It also contains a brief introduction to probability theory and concepts covered by the six papers making up the second part. The first two papers are concerned with developing online monitoring techniques...... systems. The fifth paper shows how stochastic hybrid automata are useful for modelling biological systems and the final paper is concerned with showing how statistical model checking is efficiently distributed. In parallel with developing the theory contained in the papers, a substantial part of this work...
Parallel plasma fluid turbulence calculations

International Nuclear Information System (INIS)

Leboeuf, J.N.; Carreras, B.A.; Charlton, L.A.; Drake, J.B.; Lynch, V.E.; Newman, D.E.; Sidikman, K.L.; Spong, D.A.

1994-01-01

The study of plasma turbulence and transport is a complex problem of critical importance for fusion-relevant plasmas. To this day, the fluid treatment of plasma dynamics is the best approach to realistic physics at the high resolution required for certain experimentally relevant calculations. Core and edge turbulence in a magnetic fusion device have been modeled using state-of-the-art, nonlinear, three-dimensional, initial-value fluid and gyrofluid codes. Parallel implementation of these models on diverse platforms--vector parallel (National Energy Research Supercomputer Center's CRAY Y-MP C90), massively parallel (Intel Paragon XP/S 35), and serial parallel (clusters of high-performance workstations using the Parallel Virtual Machine protocol)--offers a variety of paths to high resolution and significant improvements in real-time efficiency, each with its own advantages. The largest and most efficient calculations have been performed at the 200 Mword memory limit on the C90 in dedicated mode, where an overlap of 12 to 13 out of a maximum of 16 processors has been achieved with a gyrofluid model of core fluctuations. The richness of the physics captured by these calculations is commensurate with the increased resolution and efficiency and is limited only by the ingenuity brought to the analysis of the massive amounts of data generated
Aspects of computation on asynchronous parallel processors

International Nuclear Information System (INIS)

Wright, M.

1989-01-01

The increasing availability of asynchronous parallel processors has provided opportunities for original and useful work in scientific computing. However, the field of parallel computing is still in a highly volatile state, and researchers display a wide range of opinion about many fundamental questions such as models of parallelism, approaches for detecting and analyzing parallelism of algorithms, and tools that allow software developers and users to make effective use of diverse forms of complex hardware. This volume collects the work of researchers specializing in different aspects of parallel computing, who met to discuss the framework and the mechanics of numerical computing. The far-reaching impact of high-performance asynchronous systems is reflected in the wide variety of topics, which include scientific applications (e.g. linear algebra, lattice gauge simulation, ordinary and partial differential equations), models of parallelism, parallel language features, task scheduling, automatic parallelization techniques, tools for algorithm development in parallel environments, and system design issues
A diffusion model for two parallel queues with processor sharing: transient behavior and asymptotics

Directory of Open Access Journals (Sweden)

Charles Knessl

1999-01-01

We consider two identical, parallel M/M/1 queues. Both queues are fed by a Poisson arrival stream of rate λ and have service rates equal to μ. When both queues are non-empty, the two systems behave independently of each other. However, when one of the queues becomes empty, the corresponding server helps in the other queue. This is called head-of-the-line processor sharing. We study this model in the heavy traffic limit, where ρ=λ/μ→1. We formulate the heavy traffic diffusion approximation and explicitly compute the time-dependent probability of the diffusion approximation to the joint queue length process. We then evaluate the solution asymptotically for large values of space and/or time. This leads to simple expressions that show how the process achieves its steady state and other transient aspects.
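To illustrate why the heavy traffic limit ρ→1 is of interest, the Python sketch below evaluates the standard steady-state mean number in system for a single, uncoupled M/M/1 queue, L = ρ/(1−ρ). It does not model the processor-sharing coupling studied in the paper; the function name and the sample rates are illustrative assumptions.

```python
def mm1_mean_in_system(arrival_rate: float, service_rate: float) -> float:
    """Steady-state mean number in a single M/M/1 queue: L = rho / (1 - rho)."""
    rho = arrival_rate / service_rate
    if not 0.0 <= rho < 1.0:
        raise ValueError("a steady state requires 0 <= rho < 1")
    return rho / (1.0 - rho)


# With the service rate fixed at 1, the arrival rate equals rho.
for arrival_rate in (0.5, 0.9, 0.99):
    mean_jobs = mm1_mean_in_system(arrival_rate, 1.0)
    print(f"rho = {arrival_rate:.2f}: mean number in system = {mean_jobs:.1f}")
```

The mean grows without bound as ρ approaches 1, which is the regime where a diffusion approximation is typically used.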
3D printed soft parallel actuator

Science.gov (United States)

Zolfagharian, Ali; Kouzani, Abbas Z.; Khoo, Sui Yang; Noshadi, Amin; Kaynak, Akif

2018-04-01

This paper presents a 3-dimensional (3D) printed soft parallel contactless actuator for the first time. The actuator involves an electro-responsive parallel mechanism made of two segments namely active chain and passive chain both 3D printed. The active chain is attached to the ground from one end and constitutes two actuator links made of responsive hydrogel. The passive chain, on the other hand, is attached to the active chain from one end and consists of two rigid links made of polymer. The actuator links are printed using an extrusion-based 3D-Bioplotter with polyelectrolyte hydrogel as printer ink. The rigid links are also printed by a 3D fused deposition modelling (FDM) printer with acrylonitrile butadiene styrene (ABS) as print material. The kinematics model of the soft parallel actuator is derived via transformation matrices notations to simulate and determine the workspace of the actuator. The printed soft parallel actuator is then immersed into NaOH solution with specific voltage applied to it via two contactless electrodes. The experimental data is then collected and used to develop a parametric model to estimate the end-effector position and regulate kinematics model in response to specific input voltage over time. It is observed that the electroactive actuator demonstrates expected behaviour according to the simulation of its kinematics model. The use of 3D printing for the fabrication of parallel soft actuators opens a new chapter in manufacturing sophisticated soft actuators with high dexterity and mechanical robustness for biomedical applications such as cell manipulation and drug release.
Parallelization Experience with Four Canonical Econometric Models Using ParMitISEM

NARCIS (Netherlands)

N. Basturk (Nalan); S. Grassi (Stefano); L.F. Hoogerheide (Lennart); H.K. van Dijk (Herman)

2016-01-01

This paper presents the parallel computing implementation of the MitISEM algorithm, labeled Parallel MitISEM. The basic MitISEM algorithm, introduced by Hoogerheide, Opschoor and Van Dijk (2012), provides an automatic and flexible method to approximate a non-elliptical target density
Parallelization experience with four canonical econometric models using ParMitISEM

NARCIS (Netherlands)

Baştürk, N.; Grassi, S.; Hoogerheide, L.; van Dijk, H.K.

2016-01-01

This paper presents the parallel computing implementation of the MitISEM algorithm, labeled Parallel MitISEM. The basic MitISEM algorithm, introduced by Hoogerheide et al. (2012), provides an automatic and flexible method to approximate a non-elliptical target density using adaptive mixtures of
Parallelization experience with four canonical econometric models using ParMitISEM

NARCIS (Netherlands)

Bastürk, Nalan; Grassi, S.; Hoogerheide, L.; van Dijk, Herman K.

2016-01-01

This paper presents the parallel computing implementation of the MitISEM algorithm, labeled Parallel MitISEM. The basic MitISEM algorithm provides an automatic and flexible method to approximate a non-elliptical target density using adaptive mixtures of Student-t densities, where only a kernel of
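As a rough illustration of what a mixture of Student-t densities looks like, the Python sketch below evaluates a univariate two-component mixture. It is not the MitISEM algorithm, which adapts the mixture to a target density and works with multivariate densities; the component weights, degrees of freedom, locations and scales are made-up values.

```python
import math


def student_t_pdf(x: float, dof: float, loc: float = 0.0, scale: float = 1.0) -> float:
    """Density of a univariate location-scale Student-t distribution."""
    z = (x - loc) / scale
    log_norm = (math.lgamma((dof + 1.0) / 2.0)
                - math.lgamma(dof / 2.0)
                - 0.5 * math.log(dof * math.pi))
    log_kernel = -(dof + 1.0) / 2.0 * math.log1p(z * z / dof)
    return math.exp(log_norm + log_kernel) / scale


def mixture_pdf(x: float, components) -> float:
    """Weighted sum of Student-t densities; the weights are assumed to sum to 1."""
    return sum(w * student_t_pdf(x, dof, loc, scale)
               for w, dof, loc, scale in components)


# Illustrative components: (weight, degrees of freedom, location, scale).
components = [(0.7, 5.0, 0.0, 1.0), (0.3, 3.0, 2.0, 0.5)]
density_at_zero = mixture_pdf(0.0, components)
print(density_at_zero)
```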
Synchronization Of Parallel Discrete Event Simulations

Science.gov (United States)

Steinman, Jeffrey S.

1992-01-01

Adaptive, parallel, discrete-event-simulation-synchronization algorithm, Breathing Time Buckets, developed in Synchronous Parallel Environment for Emulation and Discrete Event Simulation (SPEEDES) operating system. Algorithm allows parallel simulations to process events optimistically in fluctuating time cycles that naturally adapt while simulation in progress. Combines best of optimistic and conservative synchronization strategies while avoiding major disadvantages. Algorithm processes events optimistically in time cycles adapting while simulation in progress. Well suited for modeling communication networks, for large-scale war games, for simulated flights of aircraft, for simulations of computer equipment, for mathematical modeling, for interactive engineering simulations, and for depictions of flows of information.
High Efficiency EBCOT with Parallel Coding Architecture for JPEG2000

Directory of Open Access Journals (Sweden)

Chiang Jen-Shiun

2006-01-01

This work presents a parallel context-modeling coding architecture and a matching arithmetic coder (MQ-coder) for the embedded block coding (EBCOT) unit of the JPEG2000 encoder. Tier-1 of the EBCOT consumes most of the computation time in a JPEG2000 encoding system. The proposed parallel architecture can increase the throughput rate of the context modeling. To match the high throughput rate of the parallel context-modeling architecture, an efficient pipelined architecture for a context-based adaptive arithmetic encoder is proposed. This JPEG2000 encoder can work at 180 MHz to encode one symbol each cycle. Compared with previous context-modeling architectures, our parallel architectures can improve the throughput rate by up to 25%.

Dynamic modeling and hierarchical compound control of a novel 2-DOF flexible parallel manipulator with multiple actuation modes

Science.gov (United States)

Liang, Dong; Song, Yimin; Sun, Tao; Jin, Xueying

2018-03-01

This paper addresses the problem of rigid-flexible coupling dynamic modeling and active control of a novel flexible parallel manipulator (PM) with multiple actuation modes. Firstly, based on the flexible multi-body dynamics theory, the rigid-flexible coupling dynamic model (RFDM) of system is developed by virtue of the augmented Lagrangian multipliers approach. For completeness, the mathematical models of permanent magnet synchronous motor (PMSM) and piezoelectric transducer (PZT) are further established and integrated with the RFDM of mechanical system to formulate the electromechanical coupling dynamic model (ECDM). To achieve the trajectory tracking and vibration suppression, a hierarchical compound control strategy is presented. Within this control strategy, the proportional-differential (PD) feedback controller is employed to realize the trajectory tracking of end-effector, while the strain and strain rate feedback (SSRF) controller is developed to restrain the vibration of the flexible links using PZT. Furthermore, the stability of the control algorithm is demonstrated based on the Lyapunov stability theory. Finally, two simulation case studies are performed to illustrate the effectiveness of the proposed approach. The results indicate that, under the redundant actuation mode, the hierarchical compound control strategy can guarantee the flexible PM achieves singularity-free motion and vibration attenuation within task workspace simultaneously. The systematic methodology proposed in this study can be conveniently extended for the dynamic modeling and efficient controller design of other flexible PMs, especially the emerging ones with multiple actuation modes.
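The proportional-differential (PD) feedback idea can be sketched on a much simpler plant than the manipulator in the paper. The Python example below drives a single rigid point mass to a fixed target position; the mass, gains, time step and integration scheme are illustrative assumptions, and the flexible links, PZT actuation and SSRF controller are not modeled.

```python
def simulate_pd(target: float, kp: float, kd: float,
                mass: float = 1.0, dt: float = 0.001, steps: int = 5000) -> float:
    """Drive a point mass toward a fixed target with PD feedback.

    Uses semi-implicit Euler integration and returns the final position.
    """
    position = 0.0
    velocity = 0.0
    for _ in range(steps):
        error = target - position
        # The target is constant, so the derivative of the error is -velocity.
        force = kp * error - kd * velocity
        velocity += (force / mass) * dt
        position += velocity * dt
    return position


# Gains chosen so that kd = 2 * sqrt(kp * mass), i.e. critical damping.
final_position = simulate_pd(target=1.0, kp=100.0, kd=20.0)
print(final_position)
```

With these assumed gains the mass settles at the target without overshoot; a trajectory-tracking controller would replace the fixed target with a time-varying reference and its derivative.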
The Modeling and Harmonic Coupling Analysis of Multiple-Parallel Connected Inverter Using Harmonic State Space (HSS)

DEFF Research Database (Denmark)

Kwon, Jun Bum; Wang, Xiongfei; Bak, Claus Leth

2015-01-01

As the number of power electronics based systems is increasing, studies about overall stability and harmonic problems are rising. In order to analyze harmonics and stability, most research uses an analysis method based on the Linear Time Invariant (LTI) approach. However, this can...... be difficult in terms of complex multi-parallel connected systems, especially in the case of renewable energy, where possibilities for intermittent operation due to the weather conditions exist. Hence, it can bring many different operating points to the power converter, and the impedance characteristics can...... can demonstrate other phenomena, which cannot be found in the conventional LTI approach. The theoretical modeling and analysis are verified by means of simulations and experiments.
Model coupler for coupling of atmospheric, oceanic, and terrestrial models

International Nuclear Information System (INIS)

Nagai, Haruyasu; Kobayashi, Takuya; Tsuduki, Katsunori; Kim, Keyong-Ok

2007-02-01

A numerical simulation system SPEEDI-MP, which is applicable to various environmental studies, consists of dynamical models and material transport models for the atmospheric, terrestrial, and oceanic environments, meteorological and geographical databases for model inputs, and system utilities for file management, visualization, analysis, etc., using graphical user interfaces (GUIs). As a numerical simulation tool, a model coupling program (model coupler) has been developed. It controls parallel calculations of several models and data exchanges among them to realize the dynamical coupling of the models. It is applicable to any model with a three-dimensional structured grid system, which is used by most environmental and hydrodynamic models. A coupled model system for water circulation has been constructed with atmosphere, ocean, wave, hydrology, and land-surface models using the model coupler. Performance tests of the coupled model system for water circulation were also carried out for the flood event in Saudi Arabia in January 2005 and the storm surge caused by Hurricane Katrina in August 2005. (author)
Biocellion: accelerating computer simulation of multicellular biological system models.

Science.gov (United States)

Kang, Seunghwa; Kahan, Simon; McDermott, Jason; Flann, Nicholas; Shmulevich, Ilya

2014-11-01

Biological system behaviors are often the outcome of complex interactions among a large number of cells and their biotic and abiotic environment. Computational biologists attempt to understand, predict and manipulate biological system behavior through mathematical modeling and computer simulation. Discrete agent-based modeling (in combination with high-resolution grids to model the extracellular environment) is a popular approach for building biological system models. However, the computational complexity of this approach forces computational biologists to resort to coarser resolution approaches to simulate large biological systems. High-performance parallel computers have the potential to address the computing challenge, but writing efficient software for parallel computers is difficult and time-consuming. We have developed Biocellion, a high-performance software framework, to solve this computing challenge using parallel computers. To support a wide range of multicellular biological system models, Biocellion asks users to provide their model specifics by filling the function body of pre-defined model routines. Using Biocellion, modelers without parallel computing expertise can efficiently exploit parallel computers with less effort than writing sequential programs from scratch. We simulate cell sorting, microbial patterning and a bacterial system in soil aggregate as case studies. Biocellion runs on x86 compatible systems with the 64 bit Linux operating system and is freely available for academic use. Visit http://biocellion.com for additional information.
A massively parallel GPU-accelerated model for analysis of fully nonlinear free surface waves

DEFF Research Database (Denmark)

Engsig-Karup, Allan Peter; Madsen, Morten G.; Glimberg, Stefan Lemvig

2011-01-01

-storage flexible-order accurate finite difference method that is known to be efficient and scalable on a CPU core (single thread). To achieve parallel performance of the relatively complex numerical model, we investigate a new trend in high-performance computing where many-core GPUs are utilized as high......-throughput co-processors to the CPU. We describe and demonstrate how this approach makes it possible to do fast desktop computations for large nonlinear wave problems in numerical wave tanks (NWTs) with close to 50/100 million total grid points in double/single precision with 4 GB global device memory...... available. A new code base has been developed in C++ and Compute Unified Device Architecture (CUDA) C and is found to improve the runtime by more than an order of magnitude in double precision arithmetic for the same accuracy over an existing CPU (single thread) Fortran 90 code when executed on a single modern GPU...
Application of Parallel Algorithms in an Air Pollution Model

DEFF Research Database (Denmark)

Georgiev, K.; Zlatev, Z.

1999-01-01

Proceedings of the NATO Advanced Research Workshop on Large Scale Computations in Air Pollution Modelling, Sofia, Bulgaria, 6-10 July 1998
Prediction of Adequate Prenatal Care Utilization Based on the Extended Parallel Process Model.

Science.gov (United States)

Hajian, Sepideh; Imani, Fatemeh; Riazi, Hedyeh; Salmani, Fatemeh

2017-10-01

Pregnancy complications are one of the major public health concerns. One of the main causes of preventable complications is the absence of or inadequate provision of prenatal care. The present study was conducted to investigate whether the constructs of the Extended Parallel Process Model can predict the utilization of prenatal care services. This longitudinal prospective study was conducted on 192 pregnant women selected through multi-stage sampling of health facilities in Qeshm, Hormozgan province, from April to June 2015. Participants were followed up from the first half of pregnancy until childbirth to assess adequate or inadequate/non-utilization of prenatal care services. Data were collected using the structured Risk Behavior Diagnosis Scale. The data were analyzed in SPSS-22 using one-way ANOVA, linear regression and logistic regression analysis. The level of significance was set at 0.05. In total, 178 pregnant women with a mean age of 25.31±5.42 completed the study. Perceived self-efficacy (OR=25.23; P[...] prenatal care. Husband's occupation in the labor market (OR=0.43; P=0.02), unwanted pregnancy (OR=0.352; P[...] care for the minors or elderly at home (OR=0.35; P=0.045) were associated with lower odds of receiving prenatal care. The model showed that when the perceived efficacy of the prenatal care services overcame the perceived threat, the likelihood of prenatal care usage increased. This study identified some modifiable factors associated with prenatal care usage by women, providing key targets for appropriate clinical interventions.
A Parallel Saturation Algorithm on Shared Memory Architectures

Science.gov (United States)

Ezekiel, Jonathan; Siminiceanu

2007-01-01

Symbolic state-space generators are notoriously hard to parallelize. However, the Saturation algorithm implemented in the SMART verification tool differs from other sequential symbolic state-space generators in that it exploits the locality of firing events in asynchronous system models. This paper explores whether event locality can be utilized to efficiently parallelize Saturation on shared-memory architectures. Conceptually, we propose to parallelize the firing of events within a decision diagram node, which is technically realized via a thread pool. We discuss the challenges involved in our parallel design and conduct experimental studies on its prototypical implementation. On a dual-processor dual-core PC, our studies show speed-ups for several example models, e.g., of up to 50% for a Kanban model, when compared to running our algorithm only on a single core.
Pthreads vs MPI Parallel Performance of Angular-Domain Decomposed S

International Nuclear Information System (INIS)

Azmy, Y.Y.; Barnett, D.A.

2000-01-01

Two programming models for parallelizing the Angular Domain Decomposition (ADD) of the discrete ordinates (S_n) approximation of the neutron transport equation are examined. These are the shared memory model based on the POSIX threads (Pthreads) standard, and the message passing model based on the Message Passing Interface (MPI) standard. These standard libraries are available on most multiprocessor platforms, thus making the resulting parallel codes widely portable. The question is: on a fixed platform, and for a particular code solving a given test problem, which of the two programming models delivers better parallel performance? Such a comparison is possible on Symmetric Multi-Processor (SMP) architectures in which several CPUs physically share a common memory, and in addition are capable of emulating message passing functionality. Implementation of the two-dimensional S_n Arbitrarily High Order Transport (AHOT) code for solving neutron transport problems using these two parallelization models is described. Measured parallel performance of each model on the COMPAQ AlphaServer 8400 and the SGI Origin 2000 platforms is described, and a comparison of the observed speedup for the two programming models is reported. For the case presented in this paper, it appears that the MPI implementation scales better than the Pthreads implementation on both platforms.
Comparison of least squares and exponential sine sweep methods for Parallel Hammerstein Models estimation

Science.gov (United States)

Rebillat, Marc; Schoukens, Maarten

2018-05-01

Linearity is a common assumption for many real-life systems, but in many cases the nonlinear behavior of systems cannot be ignored and must be modeled and estimated. Among the various existing classes of nonlinear models, Parallel Hammerstein Models (PHM) are interesting as they are both easy to interpret and easy to estimate. One way to estimate PHM relies on the fact that the estimation problem is linear in the parameters, so that classical least squares (LS) estimation algorithms can be used. In that area, this article introduces a regularized LS estimation algorithm inspired by some of the recently developed regularized impulse response estimation techniques. Another means of estimating PHM consists of using parametric or non-parametric exponential sine sweep (ESS) based methods. These methods (LS and ESS) are founded on radically different mathematical backgrounds but are expected to tackle the same issue. A methodology is proposed here to compare them with respect to (i) their accuracy, (ii) their computational cost, and (iii) their robustness to noise. Tests are performed on simulated systems for several values of the methods' respective parameters and of the signal-to-noise ratio. Results show that, for a given set of data points, the ESS method is less demanding in computational resources than the LS method but that it is also less accurate. Furthermore, the LS method needs parameters to be set in advance, whereas the ESS method is not subject to conditioning issues and can be fully non-parametric. In summary, for a given set of data points, the ESS method can provide a first, automatic, and quick overview of a nonlinear system that can guide more computationally demanding and precise methods, such as the regularized LS one proposed here.
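The remark that PHM estimation is linear in the parameters can be illustrated with a small sketch. It assumes a PHM whose branches are the powers u, u^2, ..., each followed by a short FIR filter, and it uses a plain ridge penalty as a stand-in for regularization; it is not the regularized algorithm of the article. The names `phm_regressors`, `estimate_phm`, `order`, `taps` and `ridge` are hypothetical.

```python
import numpy as np

def phm_regressors(u, order, taps):
    """Stack delayed copies of u, u**2, ..., u**order as regressor columns."""
    n = len(u)
    cols = []
    for k in range(1, order + 1):
        uk = u ** k
        for m in range(taps):
            col = np.zeros(n)
            col[m:] = uk[:n - m]  # branch k delayed by m samples
            cols.append(col)
    return np.column_stack(cols)

def estimate_phm(u, y, order, taps, ridge=0.0):
    """Least squares estimate of the FIR kernels, one row per branch.

    ridge > 0 adds a simple Tikhonov penalty (an assumption made here,
    not the kernel-based regularization discussed in the abstract).
    """
    phi = phm_regressors(u, order, taps)
    gram = phi.T @ phi + ridge * np.eye(phi.shape[1])
    theta = np.linalg.solve(gram, phi.T @ y)
    return theta.reshape(order, taps)

# Noise-free toy system with two branches and three taps per branch.
rng = np.random.default_rng(0)
u = rng.standard_normal(2000)
true_h = np.array([[1.0, 0.5, 0.25], [0.3, -0.2, 0.1]])
y = phm_regressors(u, 2, 3) @ true_h.ravel()
est_h = estimate_phm(u, y, order=2, taps=3)
```

Because the output is a linear combination of the regressor columns, a single linear solve recovers all kernels at once; with noisy data the `ridge` term trades bias for conditioning.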
Bio-Inspired Neural Model for Learning Dynamic Models

Science.gov (United States)

Duong, Tuan; Duong, Vu; Suri, Ronald

2009-01-01

A neural-network mathematical model that, relative to prior such models, places greater emphasis on some of the temporal aspects of real neural physical processes, has been proposed as a basis for massively parallel, distributed algorithms that learn dynamic models of possibly complex external processes by means of learning rules that are local in space and time. The algorithms could be made to perform such functions as recognition and prediction of words in speech and of objects depicted in video images. The approach embodied in this model is said to be "hardware-friendly" in the following sense: The algorithms would be amenable to execution by special-purpose computers implemented as very-large-scale integrated (VLSI) circuits that would operate at relatively high speeds and low power demands.
The Potsdam Parallel Ice Sheet Model (PISM-PIK) - Part 2: Dynamic equilibrium simulation of the Antarctic ice sheet

Science.gov (United States)

Martin, M. A.; Winkelmann, R.; Haseloff, M.; Albrecht, T.; Bueler, E.; Khroulev, C.; Levermann, A.

2011-09-01

We present a dynamic equilibrium simulation of the ice sheet-shelf system on Antarctica with the Potsdam Parallel Ice Sheet Model (PISM-PIK). The simulation is initialized with present-day conditions for bed topography and ice thickness and then run to steady state with constant present-day surface mass balance. Surface temperature and sub-shelf basal melt distribution are parameterized. Grounding lines and calving fronts are free to evolve, and their modeled equilibrium state is compared to observational data. A physically-motivated calving law based on horizontal spreading rates allows for realistic calving fronts for various types of shelves. Steady-state dynamics including surface velocity and ice flux are analyzed for whole Antarctica and the Ronne-Filchner and Ross ice shelf areas in particular. The results show that the different flow regimes in sheet and shelves, and the transition zone between them, are captured reasonably well, supporting the approach of superposition of SIA and SSA for the representation of fast motion of grounded ice. This approach also leads to a natural emergence of sliding-dominated flow in stream-like features in this new 3-D marine ice sheet model.
MIP Models and Hybrid Algorithms for Simultaneous Job Splitting and Scheduling on Unrelated Parallel Machines

Science.gov (United States)

Ozmutlu, H. Cenk

2014-01-01

We developed mixed integer programming (MIP) models and hybrid genetic-local search algorithms for scheduling unrelated parallel machines with job-sequence- and machine-dependent setup times and with the job splitting property. The first contribution of this paper is to introduce novel algorithms that perform splitting and scheduling simultaneously with a variable number of subjobs. We propose a simple chromosome structure composed of random key numbers in the hybrid genetic-local search algorithm (GAspLA). Random key numbers are used frequently in genetic algorithms, but they create additional difficulty when hybrid factors in local search are implemented. We developed algorithms that adapt the results of local search into the genetic algorithm with a minimum number of relocation operations on the genes' random key numbers. This is the second contribution of the paper. The third contribution is three new MIP models that perform splitting and scheduling simultaneously. The fourth contribution is the implementation of GAspLAMIP, which lets us verify the optimality of GAspLA for the studied combinations. The proposed methods are tested on a set of problems taken from the literature, and the results validate the effectiveness of the proposed algorithms. PMID:24977204
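As a rough illustration of random-key chromosomes for parallel machines, the sketch below decodes one real-valued key per job: the integer part selects the machine and the fractional part gives the processing order on that machine. This is a generic random-key scheme, not the GAspLA chromosome of the paper, and it ignores job splitting and setup times; `decode_random_keys` is a hypothetical helper.

```python
def decode_random_keys(keys, n_machines):
    """Decode random keys into a job sequence per machine.

    Each key must lie in [0, n_machines): int(key) is the machine index,
    and jobs on a machine are ordered by the fractional part of their key.
    """
    buckets = {m: [] for m in range(n_machines)}
    for job, key in enumerate(keys):
        machine = int(key)
        buckets[machine].append((key - machine, job))
    return {m: [job for _, job in sorted(items)] for m, items in buckets.items()}

# Five jobs on two machines.
keys = [0.71, 1.20, 0.15, 1.05, 0.40]
schedule = decode_random_keys(keys, 2)
print(schedule)  # {0: [2, 4, 0], 1: [3, 1]}
```

Any vector of keys decodes to a feasible assignment, which is why crossover and mutation can act on the keys directly; a local search that reorders jobs then has to write its result back as modified keys, which is the difficulty the abstract mentions.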
MIP models and hybrid algorithms for simultaneous job splitting and scheduling on unrelated parallel machines.

Science.gov (United States)

Eroglu, Duygu Yilmaz; Ozmutlu, H Cenk

2014-01-01

We developed mixed integer programming (MIP) models and hybrid genetic-local search algorithms for scheduling unrelated parallel machines with job-sequence- and machine-dependent setup times and with the job splitting property. The first contribution of this paper is to introduce novel algorithms that perform splitting and scheduling simultaneously with a variable number of subjobs. We propose a simple chromosome structure composed of random key numbers in the hybrid genetic-local search algorithm (GAspLA). Random key numbers are used frequently in genetic algorithms, but they create additional difficulty when hybrid factors in local search are implemented. We developed algorithms that adapt the results of local search into the genetic algorithm with a minimum number of relocation operations on the genes' random key numbers. This is the second contribution of the paper. The third contribution is three new MIP models that perform splitting and scheduling simultaneously. The fourth contribution is the implementation of GAspLAMIP, which lets us verify the optimality of GAspLA for the studied combinations. The proposed methods are tested on a set of problems taken from the literature, and the results validate the effectiveness of the proposed algorithms.
Smuggling, non-fundamental uncertainty, and parallel market exchange rate volatility

OpenAIRE

Richard Clay Barnett

2003-01-01

We explore a model where smuggling and a parallel currency market arise, owing to government restrictions that prevent agents from legally holding foreign exchange. Despite such restrictions, agents are able to diversify their savings, holding both domestic and parallel foreign cash, basing their portfolio allocation on current and prospective parallel exchange rates. We attribute movements in parallel rates to non-fundamental uncertainty. The model generates equilibria with both positive and...
Vectorization, parallelization and porting of nuclear codes (vectorization and parallelization). Progress report fiscal 1998

International Nuclear Information System (INIS)

Ishizuki, Shigeru; Kawai, Wataru; Nemoto, Toshiyuki; Ogasawara, Shinobu; Kume, Etsuo; Adachi, Masaaki; Kawasaki, Nobuo; Yatake, Yo-ichi

2000-03-01

Several computer codes in the nuclear field have been vectorized, parallelized and transported on the FUJITSU VPP500 system, the AP3000 system and the Paragon system at Center for Promotion of Computational Science and Engineering in Japan Atomic Energy Research Institute. We dealt with 12 codes in fiscal 1998. These results are reported in 3 parts, i.e., the vectorization and parallelization on vector processors part, the parallelization on scalar processors part and the porting part. In this report, we describe the vectorization and parallelization on vector processors. In this vectorization and parallelization on vector processors part, the vectorization of General Tokamak Circuit Simulation Program code GTCSP, the vectorization and parallelization of Molecular Dynamics NTV (n-particle, Temperature and Velocity) Simulation code MSP2, Eddy Current Analysis code EDDYCAL, Thermal Analysis Code for Test of Passive Cooling System by HENDEL T2 code THANPACST2 and MHD Equilibrium code SELENEJ on the VPP500 are described. In the parallelization on scalar processors part, the parallelization of Monte Carlo N-Particle Transport code MCNP4B2, Plasma Hydrodynamics code using Cubic Interpolated Propagation Method PHCIP and Vectorized Monte Carlo code (continuous energy model / multi-group model) MVP/GMVP on the Paragon are described. In the porting part, the porting of Monte Carlo N-Particle Transport code MCNP4B2 and Reactor Safety Analysis code RELAP5 on the AP3000 are described. (author)
Model Driven Engineering

Science.gov (United States)

Gaševic, Dragan; Djuric, Dragan; Devedžic, Vladan

A relevant initiative from the software engineering community called Model Driven Engineering (MDE) is being developed in parallel with the Semantic Web (Mellor et al. 2003a). The MDE approach to software development suggests that one should first develop a model of the system under study, which is then transformed into the real thing (i.e., an executable software entity). The most important research initiative in this area is the Model Driven Architecture (MDA), which is being developed under the umbrella of the Object Management Group (OMG). This chapter describes the basic concepts of this software engineering effort.
Modeling the Fracture of Ice Sheets on Parallel Computers

Energy Technology Data Exchange (ETDEWEB)

Waisman, Haim [Columbia Univ., New York, NY (United States); Tuminaro, Ray [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)

2013-10-10

The objective of this project was to investigate the complex fracture of ice and understand its role within larger ice sheet simulations and global climate change. This objective was achieved by developing novel physics-based models for ice and novel numerical tools to enable the modeling of the physics, and by collaborating with experts in the ice community. At present, ice fracture is not explicitly considered within ice sheet models, due in part to the large computational costs associated with accurate modeling of this complex phenomenon. However, fracture not only plays an extremely important role in regional behavior but also influences ice dynamics over much larger zones in ways that are currently not well understood. To this end, our research findings from this project offer a significant advancement to the field and close a large gap in knowledge about understanding and modeling the fracture of ice sheets in the polar regions. Thus, we believe that our objective has been achieved and our research accomplishments are significant. This is corroborated by a set of published papers, posters and presentations at technical conferences in the field. In particular, significant progress has been made in the mechanics of ice, the fracture of ice sheets and ice shelves in polar regions, and sophisticated numerical methods that enable an efficient solution of the physics.
Performance of GeantV EM Physics Models

Energy Technology Data Exchange (ETDEWEB)

Amadio, G.; et al.

2016-10-14

The recent progress in parallel hardware architectures with deeper vector pipelines or many-cores technologies brings opportunities for HEP experiments to take advantage of SIMD and SIMT computing models. Launched in 2013, the GeantV project studies performance gains in propagating multiple particles in parallel, improving instruction throughput and data locality in HEP event simulation on modern parallel hardware architecture. Due to the complexity of geometry description and physics algorithms of a typical HEP application, performance analysis is indispensable in identifying factors limiting parallel execution. In this report, we will present design considerations and preliminary computing performance of GeantV physics models on coprocessors (Intel Xeon Phi and NVidia GPUs) as well as on mainstream CPUs.
Performance of GeantV EM Physics Models

Science.gov (United States)

Amadio, G.; Ananya, A.; Apostolakis, J.; Aurora, A.; Bandieramonte, M.; Bhattacharyya, A.; Bianchini, C.; Brun, R.; Canal, P.; Carminati, F.; Cosmo, G.; Duhem, L.; Elvira, D.; Folger, G.; Gheata, A.; Gheata, M.; Goulas, I.; Iope, R.; Jun, S. Y.; Lima, G.; Mohanty, A.; Nikitina, T.; Novak, M.; Pokorski, W.; Ribon, A.; Seghal, R.; Shadura, O.; Vallecorsa, S.; Wenzel, S.; Zhang, Y.

2017-10-01

The recent progress in parallel hardware architectures with deeper vector pipelines or many-cores technologies brings opportunities for HEP experiments to take advantage of SIMD and SIMT computing models. Launched in 2013, the GeantV project studies performance gains in propagating multiple particles in parallel, improving instruction throughput and data locality in HEP event simulation on modern parallel hardware architecture. Due to the complexity of geometry description and physics algorithms of a typical HEP application, performance analysis is indispensable in identifying factors limiting parallel execution. In this report, we will present design considerations and preliminary computing performance of GeantV physics models on coprocessors (Intel Xeon Phi and NVidia GPUs) as well as on mainstream CPUs.

Performance of GeantV EM Physics Models

CERN Document Server

Amadio, G; Apostolakis, J; Aurora, A; Bandieramonte, M; Bhattacharyya, A; Bianchini, C; Brun, R; Canal P; Carminati, F; Cosmo, G; Duhem, L; Elvira, D; Folger, G; Gheata, A; Gheata, M; Goulas, I; Iope, R; Jun, S Y; Lima, G; Mohanty, A; Nikitina, T; Novak, M; Pokorski, W; Ribon, A; Seghal, R; Shadura, O; Vallecorsa, S; Wenzel, S; Zhang, Y

2017-01-01

The recent progress in parallel hardware architectures with deeper vector pipelines or many-cores technologies brings opportunities for HEP experiments to take advantage of SIMD and SIMT computing models. Launched in 2013, the GeantV project studies performance gains in propagating multiple particles in parallel, improving instruction throughput and data locality in HEP event simulation on modern parallel hardware architecture. Due to the complexity of geometry description and physics algorithms of a typical HEP application, performance analysis is indispensable in identifying factors limiting parallel execution. In this report, we will present design considerations and preliminary computing performance of GeantV physics models on coprocessors (Intel Xeon Phi and NVidia GPUs) as well as on mainstream CPUs.
Iterative algorithms for large sparse linear systems on parallel computers

Science.gov (United States)

Adams, L. M.

1982-01-01

Algorithms for assembling in parallel the sparse system of linear equations that result from finite difference or finite element discretizations of elliptic partial differential equations, such as those that arise in structural engineering are developed. Parallel linear stationary iterative algorithms and parallel preconditioned conjugate gradient algorithms are developed for solving these systems. In addition, a model for comparing parallel algorithms on array architectures is developed and results of this model for the algorithms are given.
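To make the idea of a parallel linear stationary iteration concrete, here is a minimal sketch of Jacobi iteration for the 1D Poisson problem -u'' = f with zero boundary values, discretized by finite differences. The problem, the function name `jacobi_poisson_1d` and the iteration count are illustrative assumptions and are not taken from the report. Every entry of the new iterate depends only on the previous iterate, so all entries can be updated simultaneously, which is what makes the method attractive on array architectures.

```python
import numpy as np

def jacobi_poisson_1d(f, h, iterations):
    """Jacobi sweeps for (2*u[i] - u[i-1] - u[i+1]) / h**2 = f[i]."""
    u = np.zeros_like(f)
    for _ in range(iterations):
        padded = np.pad(u, 1)  # zero Dirichlet values at both ends
        # All unknowns are updated at once from the previous iterate.
        u = 0.5 * (padded[:-2] + padded[2:] + h * h * f)
    return u

# 15 interior grid points on (0, 1) with f = 1.
n, h = 15, 1.0 / 16
x = h * np.arange(1, n + 1)
u = jacobi_poisson_1d(np.ones(n), h, iterations=2000)
```

For this right-hand side the discrete solution coincides with x(1 - x)/2 at the grid points, which gives a convenient check. Jacobi converges slowly on fine grids, which is one reason preconditioned conjugate gradient methods are considered alongside it.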
Fermion analogy for layered superconducting films in parallel magnetic field

International Nuclear Information System (INIS)

Rodriguez, J.P.

1997-01-01

The equivalence between the Lawrence-Doniach model for films of extreme type-II layered superconductors and a generalization of the back-scattering model for spin-1/2 electrons in one dimension is demonstrated. This fermion analogy is then exploited to obtain an anomalous $H_\parallel^{-1}$ tail for the parallel equilibrium magnetization of the minimal double-layer case in the limit of high parallel magnetic fields $H_\parallel$ for temperatures in the critical regime. (orig.)
GPU-based parallel computing in real-time modeling of atmospheric transport and diffusion of radioactive material

Energy Technology Data Exchange (ETDEWEB)

Santos, Marcelo C. dos; Pereira, Claudio M.N.A.; Schirru, Roberto; Pinheiro, André [Instituto de Engenharia Nuclear (IEN/CNEN-RJ), Rio de Janeiro, RJ (Brazil); Coordenacao de Pos-Graduacao e Pesquisa de Engenharia (COPPE/UFRJ), Rio de Janeiro, RJ (Brazil). Programa de Engenharia Nuclear]

2017-07-01

Atmospheric radionuclide dispersion systems (ARDS) are essential mechanisms to predict the consequences of unexpected radioactive releases from nuclear power plants. In the event of an accident with a radioactive material release, an accurate forecast is vital to guide the evacuation plan for the potentially affected areas. To predict the dispersion of the radioactive material and its impact on the environment, the model must process information about the source term (radioactive materials released, activities and location), weather conditions (wind, humidity and precipitation) and geographical characteristics (topography). An ARDS is basically composed of four main modules: Source Term, Wind Field, Plume Dispersion and Doses Calculations. The Wind Field and Plume Dispersion modules are the ones that require high computational performance to achieve accurate results within an acceptable time. Taking this into account, this work focuses on the development of a GPU-based parallel Plume Dispersion module, concentrating on the radionuclide transport and diffusion calculations, which use a given wind field and a released source term as parameters. The program is being developed in the C++ programming language together with CUDA libraries. In a comparative case study between parallel and sequential versions of the slower function of the Plume Dispersion module, a speedup of 11.63 times was observed. (author)
GPU-based parallel computing in real-time modeling of atmospheric transport and diffusion of radioactive material

International Nuclear Information System (INIS)

Santos, Marcelo C. dos; Pereira, Claudio M.N.A.; Schirru, Roberto; Pinheiro, André; Coordenacao de Pos-Graduacao e Pesquisa de Engenharia

2017-01-01

Atmospheric radionuclide dispersion systems (ARDS) are essential mechanisms to predict the consequences of unexpected radioactive releases from nuclear power plants. In the event of an accident with a radioactive material release, an accurate forecast is vital to guide the evacuation plan for the potentially affected areas. To predict the dispersion of the radioactive material and its impact on the environment, the model must process information about the source term (radioactive materials released, activities and location), weather conditions (wind, humidity and precipitation) and geographical characteristics (topography). An ARDS is basically composed of four main modules: Source Term, Wind Field, Plume Dispersion and Doses Calculations. The Wind Field and Plume Dispersion modules are the ones that require high computational performance to achieve accurate results within an acceptable time. Taking this into account, this work focuses on the development of a GPU-based parallel Plume Dispersion module, concentrating on the radionuclide transport and diffusion calculations, which use a given wind field and a released source term as parameters. The program is being developed in the C++ programming language together with CUDA libraries. In a comparative case study between parallel and sequential versions of the slower function of the Plume Dispersion module, a speedup of 11.63 times was observed. (author)
Gradient-based model calibration with proxy-model assistance

Science.gov (United States)

Burrows, Wesley; Doherty, John

2016-02-01

Use of a proxy model in gradient-based calibration and uncertainty analysis of a complex groundwater model with large run times and problematic numerical behaviour is described. The methodology is general, and can be used with models of all types. The proxy model is based on a series of analytical functions that link all model outputs used in the calibration process to all parameters requiring estimation. In enforcing history-matching constraints during the calibration and post-calibration uncertainty analysis processes, the proxy model is run for the purposes of populating the Jacobian matrix, while the original model is run when testing parameter upgrades; the latter process is readily parallelized. Use of a proxy model in this fashion dramatically reduces the computational burden of complex model calibration and uncertainty analysis. At the same time, the effect of model numerical misbehaviour on calculation of local gradients is mitigated, this allowing access to the benefits of gradient-based analysis where lack of integrity in finite-difference derivatives calculation would otherwise have impeded such access. Construction of a proxy model, and its subsequent use in calibration of a complex model, and in analysing the uncertainties of predictions made by that model, is implemented in the PEST suite.
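The division of labour described above, where the proxy fills the Jacobian and the original model tests parameter upgrades, can be sketched with a toy Gauss-Newton loop. This is an illustrative assumption-laden example and not the PEST implementation: `finite_difference_jacobian`, `calibrate`, the linear proxy and the slightly nonlinear "full" model are all hypothetical.

```python
import numpy as np

def finite_difference_jacobian(model, params, step=1e-6):
    """Forward-difference Jacobian of model outputs with respect to params."""
    base = model(params)
    jac = np.empty((base.size, params.size))
    for j in range(params.size):
        shifted = params.copy()
        shifted[j] += step
        jac[:, j] = (model(shifted) - base) / step
    return jac

def calibrate(full_model, proxy_model, observations, params, iterations=20):
    """Gauss-Newton loop: cheap proxy for gradients, full model for testing."""
    residual = observations - full_model(params)
    for _ in range(iterations):
        jac = finite_difference_jacobian(proxy_model, params)  # proxy runs only
        upgrade, *_ = np.linalg.lstsq(jac, residual, rcond=None)
        # Candidate upgrades are independent full-model runs, so this
        # inner loop is the part that could be farmed out in parallel.
        for scale in (1.0, 0.5, 0.25, 0.1):
            trial = params + scale * upgrade
            trial_residual = observations - full_model(trial)
            if trial_residual @ trial_residual < residual @ residual:
                params, residual = trial, trial_residual
                break
    return params

# Toy setup: the proxy is linear, the "full" model adds a small nonlinearity.
t = np.linspace(0.0, 1.0, 20)
design = np.column_stack([np.ones_like(t), t])

def proxy_model(p):
    return design @ p

def full_model(p):
    return design @ p + 1e-3 * np.sin(p.sum())

true_params = np.array([2.0, -1.0])
observations = full_model(true_params)
estimate = calibrate(full_model, proxy_model, observations, np.zeros(2))
```

In this sketch the full model is never differentiated, so any roughness in its outputs cannot contaminate the gradients; it only decides whether a proposed upgrade is accepted.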
Center for Programming Models for Scalable Parallel Computing - Towards Enhancing OpenMP for Manycore and Heterogeneous Nodes

Energy Technology Data Exchange (ETDEWEB)

Barbara Chapman

2012-02-01

OpenMP was not well recognized at the beginning of the project, around 2003, because of its limited use in DoE production applications and the immature hardware support for an efficient implementation. In recent years, however, it has been gradually adopted both in HPC applications, mostly in the form of MPI+OpenMP hybrid code, and in mid-scale desktop applications for scientific and experimental studies. We have observed this trend and worked diligently to improve our OpenMP compiler and runtimes, as well as to work with the OpenMP standard organization to make sure OpenMP evolves in a direction close to DoE missions. In the Center for Programming Models for Scalable Parallel Computing project, the HPCTools team at the University of Houston (UH), directed by Dr. Barbara Chapman, has been working with project partners, external collaborators and hardware vendors to increase the scalability and applicability of OpenMP for multi-core (and future manycore) platforms and for distributed memory systems by exploring different programming models, language extensions, compiler optimizations, as well as runtime library support.
Parallel Computing Using Web Servers and "Servlets".

Science.gov (United States)

Lo, Alfred; Bloor, Chris; Choi, Y. K.

2000-01-01

Describes parallel computing and presents inexpensive ways to implement a virtual parallel computer with multiple Web servers. Highlights include performance measurement of parallel systems; models for using Java and intranet technology including single server, multiple clients and multiple servers, single client; and a comparison of CGI (common…
Multi-Physics Modelling of Fault Mechanics Using REDBACK: A Parallel Open-Source Simulator for Tightly Coupled Problems

Science.gov (United States)

Poulet, Thomas; Paesold, Martin; Veveakis, Manolis

2017-03-01

Faults play a major role in many economically and environmentally important geological systems, ranging from impermeable seals in petroleum reservoirs to fluid pathways in ore-forming hydrothermal systems. Their behavior is therefore widely studied and fault mechanics is particularly focused on the mechanisms explaining their transient evolution. Single faults can change in time from seals to open channels as they become seismically active and various models have recently been presented to explain the driving forces responsible for such transitions. A model of particular interest is the multi-physics oscillator of Alevizos et al. (J Geophys Res Solid Earth 119(6), 4558-4582, 2014) which extends the traditional rate and state friction approach to rate and temperature-dependent ductile rocks, and has been successfully applied to explain spatial features of exposed thrusts as well as temporal evolutions of current subduction zones. In this contribution we implement that model in REDBACK, a parallel open-source multi-physics simulator developed to solve such geological instabilities in three dimensions. The resolution of the underlying system of equations in a tightly coupled manner allows REDBACK to capture appropriately the various theoretical regimes of the system, including the periodic and non-periodic instabilities. REDBACK can then be used to simulate the drastic permeability evolution in time of such systems, where nominally impermeable faults can sporadically become fluid pathways, with permeability increases of several orders of magnitude.
Parameter discovery in stochastic biological models using simulated annealing and statistical model checking.

Science.gov (United States)

Hussain, Faraz; Jha, Sumit K; Jha, Susmit; Langmead, Christopher J

2014-01-01

Stochastic models are increasingly used to study the behaviour of biochemical systems. While the structure of such models is often readily available from first principles, unknown quantitative features of the model are incorporated into the model as parameters. Algorithmic discovery of parameter values from experimentally observed facts remains a challenge for the computational systems biology community. We present a new parameter discovery algorithm that uses simulated annealing, sequential hypothesis testing, and statistical model checking to learn the parameters in a stochastic model. We apply our technique to a model of glucose and insulin metabolism used for in-silico validation of artificial pancreata and demonstrate its effectiveness by developing a parallel CUDA-based implementation for parameter synthesis in this model.
Introducing heterogeneity in Monte Carlo models for risk assessments of high-level nuclear waste. A parallel implementation of the MLCRYSTAL code

Energy Technology Data Exchange (ETDEWEB)

Andersson, M.

1996-09-01

We have introduced heterogeneity to an existing model as a special feature and simultaneously extended the model from 1D to 3D. Briefly, the code generates stochastic fractures in a given geosphere. These fractures are connected in series to form one pathway for radionuclide transport from the repository to the biosphere. Rock heterogeneity is realized by simulating physical and chemical properties for each fracture, i.e. these properties vary along the transport pathway (which is an ensemble of all fractures serially connected). In this case, each Monte Carlo simulation involves a set of many thousands of realizations, one for each pathway. Each pathway can be formed by approx. 100 fractures. This means that for a Monte Carlo simulation of 1000 realizations, we need to perform a total of 100,000 simulations. Therefore the introduction of heterogeneity has increased the CPU demands by two orders of magnitude. To overcome the demand for CPU, the program, MLCRYSTAL, has been implemented in a parallel workstation environment using the MPI, Message Passing Interface, and later on ported to an IBM-SP2 parallel supercomputer. The program is presented here and a preliminary set of results is given with the conclusions that can be drawn. 3 refs, 12 figs.
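The workload arithmetic in this abstract, and the fact that the realizations are independent of one another, can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: `simulate_pathway` is a hypothetical placeholder that sums random per-fracture values and is not the MLCRYSTAL physics, and a thread pool stands in for the MPI processes the abstract mentions.

```python
import random
from concurrent.futures import ThreadPoolExecutor

N_REALIZATIONS = 1000        # Monte Carlo realizations (figure from the abstract)
FRACTURES_PER_PATHWAY = 100  # approximate fractures per pathway (from the abstract)


def simulate_pathway(seed):
    """Hypothetical stand-in for one realization: accumulate a placeholder
    quantity over fractures connected in series along one pathway."""
    rng = random.Random(seed)  # per-realization seed keeps results reproducible
    return sum(rng.uniform(0.5, 1.5) for _ in range(FRACTURES_PER_PATHWAY))


def run_all(workers=4):
    # Realizations do not depend on each other, so they can be handed
    # to any number of workers without changing the results.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(simulate_pathway, range(N_REALIZATIONS)))


# 1000 realizations x ~100 fractures each = 100,000 fracture-level simulations
total_fracture_simulations = N_REALIZATIONS * FRACTURES_PER_PATHWAY
```

Seeding each realization separately is one simple way to make a parallel run reproduce the serial one exactly, regardless of how the work is split.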
A kinetic model for hydrodesulfurisation

Energy Technology Data Exchange (ETDEWEB)

Sau, M.; Narasimhan, C.S.L.; Verma, R.P. [Indian Oil Corporation Limited, Research and Development Centre, Faridabad (India)

1997-07-01

Due to stringent environmental considerations and the related insistence on low-sulfur fuels, hydrodesulfurisation has emerged as an important component of any refining scheme globally. The process is used in applications ranging from naphtha/kerosene hydrotreating to heavy oil hydrotreating. Processes such as deep gas oil desulfurisation, which aim to reduce sulfur levels to less than 500 ppm, have emerged as major players. Hydrodesulfurisation (HDS) involves parallel desulfurisation of the different organo-sulfur compounds present in complex petroleum mixtures. In order to design, monitor, optimise and control the HDS reactor, it is necessary to have a detailed yet simple model which follows the reaction chemistry accurately. In the present paper, a kinetic model is presented for HDS using the continuum theory of lumping. The sulfur distribution in the reaction mixture is treated as a continuum, and parallel reaction networks are devised for kinetic modelling. The model follows the HDS chemistry reasonably well, and hence the model parameters are almost feed invariant. Methods are also devised to incorporate heat and pressure effects into the model. The model has been validated against commercial kero-HDS data, and its predictions agree with the experimental/commercial data. 17 refs.
Modeling and simulation of complex systems a framework for efficient agent-based modeling and simulation

CERN Document Server

Siegfried, Robert

2014-01-01

Robert Siegfried presents a framework for efficient agent-based modeling and simulation of complex systems. He compares different approaches for describing structure and dynamics of agent-based models in detail. Based on this evaluation the author introduces the "General Reference Model for Agent-based Modeling and Simulation" (GRAMS). Furthermore he presents parallel and distributed simulation approaches for execution of agent-based models -from small scale to very large scale. The author shows how agent-based models may be executed by different simulation engines that utilize underlying hard
Detailed numerical modeling of a linear parallel-plate Active Magnetic Regenerator

DEFF Research Database (Denmark)

Nielsen, Kaspar Kirstein; Bahl, Christian Robert Haffenden; Smith, Anders

2009-01-01

A numerical model simulating Active Magnetic Regeneration (AMR) is presented and compared to a selection of experiments. The model is an extension and re-implementation of a previous two-dimensional model. The new model is extended to 2.5D, meaning that parasitic thermal losses are included...
Metastable states in the hierarchical Dyson model drive parallel processing in the hierarchical Hopfield network

International Nuclear Information System (INIS)

Agliari, Elena; Barra, Adriano; Guerra, Francesco; Galluzzi, Andrea; Tantari, Daniele; Tavani, Flavia

2015-01-01

In this paper, we introduce and investigate the statistical mechanics of hierarchical neural networks. First, we approach these systems à la Mattis, by thinking of the Dyson model as a single-pattern hierarchical neural network. We also discuss the stability of different retrievable states as predicted by the related self-consistencies obtained both from a mean-field bound and from a bound that bypasses the mean-field limitation. The latter is worked out by properly reabsorbing the magnetization fluctuations related to higher levels of the hierarchy into effective fields for the lower levels. Remarkably, mixing Amit's ansatz technique for selecting candidate-retrievable states with the interpolation procedure for solving for the free energy of these states, we prove that, due to gauge symmetry, the Dyson model accomplishes both serial and parallel processing. We extend this scenario to multiple stored patterns by implementing the Hebb prescription for learning within the couplings. This results in Hopfield-like networks constrained on a hierarchical topology, for which, by restricting to the low-storage regime where the number of patterns grows at most logarithmically with the number of neurons, we prove the existence of the thermodynamic limit for the free energy, and we give an explicit expression of its mean-field bound and of its related improved bound. The resulting self-consistencies for the Mattis magnetizations, which act as order parameters, are studied and the stability of solutions is analyzed to get a picture of the overall retrieval capabilities of the system according to both mean-field and non-mean-field scenarios. Our main finding is that embedding the Hebbian rule on a hierarchical topology allows the network to accomplish both serial and parallel processing. By tuning the level of fast noise affecting it or triggering the decay of the interactions with the distance among neurons, the system may switch from sequential retrieval to
Implementation and performance of parallelized elegant

International Nuclear Information System (INIS)

Wang, Y.; Borland, M.

2008-01-01

The program elegant is widely used for design and modeling of linacs for free-electron lasers and energy recovery linacs, as well as storage rings and other applications. As part of a multi-year effort, we have parallelized many aspects of the code, including single-particle dynamics, wakefields, and coherent synchrotron radiation. We report on the approach used for gradual parallelization, which proved very beneficial in getting parallel features into the hands of users quickly. We also report details of parallelization of collective effects. Finally, we discuss performance of the parallelized code in various applications.
The Potsdam Parallel Ice Sheet Model (PISM-PIK) – Part 2: Dynamic equilibrium simulation of the Antarctic ice sheet

Directory of Open Access Journals (Sweden)

M. A. Martin

2011-09-01

We present a dynamic equilibrium simulation of the ice sheet-shelf system on Antarctica with the Potsdam Parallel Ice Sheet Model (PISM-PIK). The simulation is initialized with present-day conditions for bed topography and ice thickness and then run to steady state with constant present-day surface mass balance. Surface temperature and sub-shelf basal melt distribution are parameterized. Grounding lines and calving fronts are free to evolve, and their modeled equilibrium state is compared to observational data. A physically-motivated calving law based on horizontal spreading rates allows for realistic calving fronts for various types of shelves. Steady-state dynamics including surface velocity and ice flux are analyzed for the whole of Antarctica and for the Ronne-Filchner and Ross ice shelf areas in particular. The results show that the different flow regimes in sheet and shelves, and the transition zone between them, are captured reasonably well, supporting the approach of superposition of SIA and SSA for the representation of fast motion of grounded ice. This approach also leads to a natural emergence of sliding-dominated flow in stream-like features in this new 3-D marine ice sheet model.
Modeling an in-register, parallel "Iowa" Aβ fibril structure using solid-state NMR data from labeled samples with Rosetta.

Science.gov (United States)

Sgourakis, Nikolaos G; Yau, Wai-Ming; Qiang, Wei

2015-01-06

Determining the structures of amyloid fibrils is an important first step toward understanding the molecular basis of neurodegenerative diseases. For β-amyloid (Aβ) fibrils, conventional solid-state NMR structure determination using uniform labeling is limited by extensive peak overlap. We describe the characterization of a distinct structural polymorph of Aβ using solid-state NMR, transmission electron microscopy (TEM), and Rosetta model building. First, the overall fibril arrangement is established using mass-per-length measurements from TEM. Then, the fibril backbone arrangement, stacking registry, and "steric zipper" core interactions are determined using a number of solid-state NMR techniques on sparsely ¹³C-labeled samples. Finally, we perform Rosetta structure calculations with an explicitly symmetric representation of the system. We demonstrate the power of the hybrid Rosetta/NMR approach by modeling the in-register, parallel "Iowa" mutant (D23N) at high resolution (1.2 Å backbone rmsd). The final models are validated using an independent set of NMR experiments that confirm key features.
An Improved QTM Subdivision Model with Approximate Equal-area

Directory of Open Access Journals (Sweden)

ZHAO Xuesheng

2016-01-01

To overcome the defect of large area deformation in the traditional QTM subdivision model, an improved subdivision model is proposed which is based on the “parallel method” and the idea of equal-area subdivision with changed longitude and latitude. By adjusting the position of the parallels, this model ensures that the total grid area between two adjacent parallels does not vary, so as to control area variation and variation accumulation of the QTM grid. The experimental results show that this improved model not only retains some advantages of the traditional QTM model (such as simple calculation and a clear correspondence with the longitude/latitude grid), but also has the following advantages: ① the improved model converges better than the traditional one, and the ratio area_max/min finally converges to 1.38, far less than the 1.73 of the “parallel method”; ② the grid units in middle and low latitude regions have small area variations and successive distributions, while, as the subdivision level increases, the grid units with large variations gradually concentrate toward the poles; ③ the area variation of a grid unit does not accumulate as the subdivision level increases.

Model selection for Gaussian kernel PCA denoising

DEFF Research Database (Denmark)

Jørgensen, Kasper Winther; Hansen, Lars Kai

2012-01-01

We propose kernel Parallel Analysis (kPA) for automatic kernel scale and model order selection in Gaussian kernel PCA. Parallel Analysis [1] is based on a permutation test for covariance and has previously been applied for model order selection in linear PCA; we here augment the procedure to also... tune the Gaussian kernel scale of radial basis function based kernel PCA. We evaluate kPA for denoising of simulated data and the US Postal data set of handwritten digits. We find that kPA outperforms other heuristics to choose the model order and kernel scale in terms of signal-to-noise ratio (SNR...
A formal definition of data flow graph models

Science.gov (United States)

Kavi, Krishna M.; Buckles, Bill P.; Bhat, U. Narayan

1986-01-01

In this paper, a new model for parallel computations and parallel computer systems that is based on data flow principles is presented. Uninterpreted data flow graphs can be used to model computer systems including data driven and parallel processors. A data flow graph is defined to be a bipartite graph with actors and links as the two vertex classes. Actors can be considered similar to transitions in Petri nets, and links similar to places. The nondeterministic nature of uninterpreted data flow graphs necessitates the derivation of liveness conditions.
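The bipartite structure described in this abstract can be sketched as a small data structure. This is only an illustration under stated assumptions: the class name, its fields, and the firing rule (an actor is enabled when every input link holds a token, borrowed from the Petri-net analogy the abstract draws) are hypothetical and are not taken from the paper's formal definition.

```python
from dataclasses import dataclass, field


@dataclass
class DataFlowGraph:
    """Bipartite graph whose two vertex classes are actors and links."""
    actors: set = field(default_factory=set)
    links: set = field(default_factory=set)
    inputs: dict = field(default_factory=dict)   # actor -> list of input links
    outputs: dict = field(default_factory=dict)  # actor -> list of output links
    tokens: dict = field(default_factory=dict)   # link -> number of tokens held

    def add_actor(self, name, inputs=(), outputs=()):
        # Edges only ever join an actor to a link, which keeps the graph bipartite
        self.actors.add(name)
        self.inputs[name] = list(inputs)
        self.outputs[name] = list(outputs)
        for link in list(inputs) + list(outputs):
            self.links.add(link)
            self.tokens.setdefault(link, 0)

    def enabled(self, actor):
        # Assumed rule: an actor may fire when all its input links hold a token
        return all(self.tokens[link] > 0 for link in self.inputs[actor])

    def fire(self, actor):
        if not self.enabled(actor):
            raise ValueError(f"actor {actor!r} is not enabled")
        for link in self.inputs[actor]:
            self.tokens[link] -= 1   # consume one token per input link
        for link in self.outputs[actor]:
            self.tokens[link] += 1   # produce one token per output link


g = DataFlowGraph()
g.add_actor("add", inputs=["x", "y"], outputs=["s"])
g.add_actor("double", inputs=["s"], outputs=["out"])
g.tokens["x"] = 1
g.tokens["y"] = 1
g.fire("add")      # moves tokens from x and y onto s
g.fire("double")   # moves the token from s onto out
```

In this analogy actors play the role of Petri-net transitions and links the role of places, so a liveness question becomes whether some actor can always eventually become enabled.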
The convergence of parallel Boltzmann machines

NARCIS (Netherlands)

Zwietering, P.J.; Aarts, E.H.L.; Eckmiller, R.; Hartmann, G.; Hauske, G.

1990-01-01

We discuss the main results obtained in a study of a mathematical model of synchronously parallel Boltzmann machines. We present supporting evidence for the conjecture that a synchronously parallel Boltzmann machine maximizes a consensus function that consists of a weighted sum of the regular
Parallel computing works

Energy Technology Data Exchange (ETDEWEB)

1991-10-23

An account of the Caltech Concurrent Computation Program (C³P), a five year project that focused on answering the question: "Can parallel computers be used to do large-scale scientific computations?" As the title indicates, the question is answered in the affirmative, by implementing numerous scientific applications on real parallel computers and doing computations that produced new scientific results. In the process of doing so, C³P helped design and build several new computers, designed and implemented basic system software, developed algorithms for frequently used mathematical computations on massively parallel machines, devised performance models and measured the performance of many computers, and created a high performance computing facility based exclusively on parallel computers. While the initial focus of C³P was the hypercube architecture developed by C. Seitz, many of the methods developed and lessons learned have been applied successfully on other massively parallel architectures.
Optimized parallel convolutions for non-linear fluid models of tokamak ηi turbulence

International Nuclear Information System (INIS)

Milovich, J.L.; Tomaschke, G.; Kerbel, G.D.

1993-01-01

Non-linear computational fluid models of plasma turbulence based on spectral methods typically spend a large fraction of the total computing time evaluating convolutions. Usually these convolutions arise from an explicit or semi-implicit treatment of the convective non-linearities in the problem. Often the principal convective velocity is perpendicular to magnetic field lines, allowing a reduction of the convolution to two dimensions in an appropriate geometry, but beyond this, different models vary widely in the particulars of which mode amplitudes are selectively evolved to get the most efficient representation of the turbulence. As the number of modes in the problem, N, increases, the amount of computation required for this part of the evolution algorithm scales as N^2 per timestep for a direct or analytic method and as N ln N per timestep for a pseudospectral method. The constants of proportionality depend on the particulars of mode selection and determine the problem size at which the two methods perform equally. For large enough N, the pseudospectral method is always superior, though some problems do not require correspondingly high resolution. Further, the Courant condition for numerical stability requires that the timestep size decrease proportionately as N increases, thus accentuating the need for fast methods for larger N problems. The authors have developed a package for the Cray system which performs these convolutions for a rather arbitrary mode selection scheme using either method. The package is highly optimized using a combination of macro- and microtasking techniques, as well as vectorization and in some cases assembly-coded routines. Parts of the package have also been developed and optimized for the CM200 and CM5 systems. Performance comparisons with respect to problem size, parallelization, selection schemes and architecture are presented.
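The two cost scalings contrasted in this abstract can be illustrated with a one-dimensional toy example. It is a sketch under stated assumptions, not the authors' Cray package: the function names are hypothetical, the convolution is circular over a full set of modes (no selective mode evolution and no dealiasing), and a textbook radix-2 FFT stands in for the transform step of a pseudospectral method.

```python
import cmath


def direct_convolution(a, b):
    """Circular convolution straight from the definition: O(N^2) operations."""
    n = len(a)
    return [sum(a[j] * b[(k - j) % n] for j in range(n)) for k in range(n)]


def fft(x, inverse=False):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two.
    The inverse transform is left unnormalized."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + w
        out[k + n // 2] = even[k] - w
    return out


def fft_convolution(a, b):
    """Transform, multiply pointwise, transform back: O(N log N) operations."""
    n = len(a)
    product = [p * q for p, q in zip(fft(a), fft(b))]
    return [v.real / n for v in fft(product, inverse=True)]  # normalize by N


a = [1.0, 2.0, 3.0, 4.0, 0.0, 0.0, 0.0, 0.0]
b = [0.5, -1.0, 0.0, 2.0, 0.0, 0.0, 0.0, 0.0]
slow = direct_convolution(a, b)
fast = fft_convolution(a, b)   # agrees with `slow` up to rounding error
```

Both routes give the same result, so the choice between them is purely one of cost: the direct sum can win for small or sparsely selected mode sets, while the transform route wins once N is large enough.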
Using parallel computing in modeling and optimization of mineral ...

African Journals Online (AJOL)

Then to solve ultimate pit limit problem it is required to find such a sub graph in a graph whose sum of weights will be maximal. One of the possible solutions of this problem is using genetic algorithms. We use a ... Details of implementation parallel genetic algorithm for searching open pit limits are provided. Comparison with ...
Towards a standard model for research in agent-based modeling and simulation

Directory of Open Access Journals (Sweden)

Nuno Fachada

2015-11-01

Agent-based modeling (ABM) is a bottom-up modeling approach, where each entity of the system being modeled is uniquely represented as an independent decision-making agent. ABMs are very sensitive to implementation details. Thus, it is very easy to inadvertently introduce changes which modify model dynamics. Such problems usually arise due to the lack of transparency in model descriptions, which constrains how models are assessed, implemented and replicated. In this paper, we present PPHPC, a model which aims to serve as a standard in agent-based modeling research, namely, but not limited to, conceptual model specification, statistical analysis of simulation output, model comparison and parallelization studies. This paper focuses on the first two aspects (conceptual model specification and statistical analysis of simulation output), also providing a canonical implementation of PPHPC. The paper serves as a complete reference to the presented model, and can be used as a tutorial for simulation practitioners who wish to improve the way they communicate their ABMs.
Xyce Parallel Electronic Simulator : users' guide, version 2.0.

Energy Technology Data Exchange (ETDEWEB)

Hoekstra, Robert John; Waters, Lon J.; Rankin, Eric Lamont; Fixel, Deborah A.; Russo, Thomas V.; Keiter, Eric Richard; Hutchinson, Scott Alan; Pawlowski, Roger Patrick; Wix, Steven D.

2004-06-01

This manual describes the use of the Xyce Parallel Electronic Simulator. Xyce has been designed as a SPICE-compatible, high-performance analog circuit simulator capable of simulating electrical circuits at a variety of abstraction levels. Primarily, Xyce has been written to support the simulation needs of the Sandia National Laboratories electrical designers. This development has focused on improving capability over the current state-of-the-art in the following areas:
• Capability to solve extremely large circuit problems by supporting large-scale parallel computing platforms (up to thousands of processors). Note that this includes support for most popular parallel and serial computers.
• Improved performance for all numerical kernels (e.g., time integrator, nonlinear and linear solvers) through state-of-the-art algorithms and novel techniques.
• Device models which are specifically tailored to meet Sandia's needs, including many radiation-aware devices.
• A client-server or multi-tiered operating model wherein the numerical kernel can operate independently of the graphical user interface (GUI).
• Object-oriented code design and implementation using modern coding practices that ensure that the Xyce Parallel Electronic Simulator will be maintainable and extensible far into the future.
Xyce is a parallel code in the most general sense of the phrase - a message passing parallel implementation - which allows it to run efficiently on the widest possible number of computing platforms. These include serial, shared-memory and distributed-memory parallel as well as heterogeneous platforms. Careful attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved as the number of processors grows. One feature required by designers is the ability to add device models, many specific to the needs of Sandia, to the code. To this end, the device package in the Xyce
The boat hull model : adapting the roofline model to enable performance prediction for parallel computing

NARCIS (Netherlands)

Nugteren, C.; Corporaal, H.

2012-01-01

Multi-core and many-core were already major trends for the past six years, and are expected to continue for the next decades. With these trends of parallel computing, it becomes increasingly difficult to decide on which architecture to run a given application. In this work, we use an algorithm
Three dimensional transport model for toroidal plasmas

International Nuclear Information System (INIS)

Copenhauer, C.

1980-12-01

A nonlinear MHD model, developed for three-dimensional toroidal geometries (asymmetric) and for high β (β ~ ε), is used as a basis for a three-dimensional transport model. Since inertia terms are needed in describing evolving magnetic islands, the model can calculate transport, both in the transient phase before nonlinear saturation of magnetic islands and afterwards on the resistive time scale. In the β ~ ε ordering, the plasma does not have sufficient energy to compress the parallel magnetic field, which allows the Alfvén wave to be eliminated in the reduced nonlinear equations, and the model then follows the slower time scales. The resulting perpendicular and parallel plasma drift velocities can be identified with those of guiding center theory.
Comparative Evaluation and Case Studies of Shared-Memory and Data-Parallel Execution Patterns

Directory of Open Access Journals (Sweden)

Xiaodong Zhang

1999-01-01

Shared‐memory and data‐parallel programming models are two important paradigms for scientific applications. Both models provide high‐level program abstractions, and simple and uniform views of network structures. The common features of the two models significantly simplify program coding and debugging for scientific applications. However, the underlying execution and overhead patterns are significantly different between the two models due to their programming constraints, and due to the different and complex structures of the interconnection networks and systems which support the two models. We performed this experimental study to present implications and comparisons of execution patterns on two commercial architectures. We implemented a standard electromagnetic simulation program (EM) and a linear system solver using the shared‐memory model on the KSR‐1 and the data‐parallel model on the CM‐5. Our objectives are to examine the execution pattern changes required for an implementation transformation between the two models; to study memory access patterns; to address scalability issues; and to investigate relative costs and advantages/disadvantages of using the two models for scientific computations. Our results indicate that the EM program tends to become computation‐intensive in the KSR‐1 shared‐memory system, and memory‐demanding in the CM‐5 data‐parallel system when the systems and the problems are scaled. The EM program, a highly data‐parallel program, performed extremely well, and the linear system solver, a highly control‐structured program, suffered significantly in the data‐parallel model on the CM‐5. Our study provides further evidence that matching execution patterns of algorithms to parallel architectures would achieve better performance.
Comparison of elastic-viscous-plastic and viscous-plastic dynamics models using a high resolution Arctic sea ice model

Energy Technology Data Exchange (ETDEWEB)

Hunke, E.C. [Los Alamos National Lab., NM (United States); Zhang, Y. [Naval Postgraduate School, Monterey, CA (United States)

1997-12-31

A nonlinear viscous-plastic (VP) rheology proposed by Hibler (1979) has been demonstrated to be the most suitable of the rheologies commonly used for modeling sea ice dynamics. However, the presence of a huge range of effective viscosities hinders numerical implementations of this model, particularly on high resolution grids or when the ice model is coupled to an ocean or atmosphere model. Hunke and Dukowicz (1997) have modified the VP model by including elastic waves as a numerical regularization in the case of zero strain rate. This modification (EVP) allows an efficient, fully explicit discretization that adapts well to parallel architectures. The authors present a comparison of EVP and VP dynamics model results from two 5-year simulations of Arctic sea ice, obtained with a high resolution sea ice model. The purpose of the comparison is to determine how differently the two dynamics models behave, and to decide whether the elastic-viscous-plastic model is preferable for high resolution climate simulations, considering its high efficiency in parallel computation. Results from the first year of this experiment (1990) are discussed in detail in Hunke and Zhang (1997).
Integrated Task And Data Parallel Programming: Language Design

Science.gov (United States)

Grimshaw, Andrew S.; West, Emily A.

1998-01-01

This research investigates the combination of task and data parallel language constructs within a single programming language. There are a number of applications that exhibit properties which would be well served by such an integrated language. Examples include global climate models, aircraft design problems, and multidisciplinary design optimization problems. Our approach incorporates data parallel language constructs into an existing, object oriented, task parallel language. The language will support creation and manipulation of parallel classes and objects of both types (task parallel and data parallel). Ultimately, the language will allow data parallel and task parallel classes to be used either as building blocks or managers of parallel objects of either type, thus allowing the development of single and multi-paradigm parallel applications.
1995 Research Accomplishments: In February I presented a paper at Frontiers '95 describing the design of the data parallel language subset. During the spring I wrote and defended my dissertation proposal. Since that time I have developed a runtime model for the language subset. I have begun implementing the model and hand-coding simple examples which demonstrate the language subset. I have identified an astrophysical fluid flow application which will validate the data parallel language subset.
1996 Research Agenda: Milestones for the coming year include implementing a significant portion of the data parallel language subset over the Legion system. Using simple hand-coded methods, I plan to demonstrate (1) concurrent task and data parallel objects and (2) task parallel objects managing both task and data parallel objects. My next steps will focus on constructing a compiler and implementing the fluid flow application with the language. Concurrently, I will conduct a search for a real-world application exhibiting both task and data parallelism within the same program.
Additional 1995 Activities: During the fall I collaborated
Parallel science and engineering applications the Charm++ approach

CERN Document Server

Kale, Laxmikant V

2016-01-01

Developed in the context of science and engineering applications, with each abstraction motivated by and further honed by specific application needs, Charm++ is a production-quality system that runs on almost all parallel computers available. Parallel Science and Engineering Applications: The Charm++ Approach surveys a diverse and scalable collection of science and engineering applications, most of which are used regularly on supercomputers by scientists to further their research. After a brief introduction to Charm++, the book presents several parallel CSE codes written in the Charm++ model, along with their underlying scientific and numerical formulations, explaining their parallelization strategies and parallel performance. These chapters demonstrate the versatility of Charm++ and its utility for a wide variety of applications, including molecular dynamics, cosmology, quantum chemistry, fracture simulations, agent-based simulations, and weather modeling. The book is intended for a wide audience of people i...
Traffic Flow Prediction Model for Large-Scale Road Network Based on Cloud Computing

Directory of Open Access Journals (Sweden)

Zhaosheng Yang

2014-01-01

To increase the efficiency and precision of large-scale road network traffic flow prediction, a genetic algorithm-support vector machine (GA-SVM) model based on cloud computing is proposed in this paper, building on an analysis of the characteristics and defects of the genetic algorithm and the support vector machine. In the cloud computing environment, the SVM parameters are first optimized by the parallel genetic algorithm, and the optimized parallel SVM model is then used to predict traffic flow. Using traffic flow data from Haizhu District in Guangzhou City, the proposed model was verified and compared with the serial GA-SVM model and with a parallel GA-SVM model based on MPI (message passing interface). The results demonstrate that the parallel GA-SVM model based on cloud computing has higher prediction accuracy, shorter running time, and higher speedup.
Performance of a fine-grained parallel model for multi-group nodal-transport calculations in three-dimensional pin-by-pin reactor geometry

International Nuclear Information System (INIS)

Masahiro, Tatsumi; Akio, Yamamoto

2003-01-01

A production code, SCOPE2, was developed based on a fine-grained parallel algorithm using the red/black iterative method, targeting parallel computing environments such as a PC cluster. It can perform a depletion calculation in a few hours on a PC cluster with a model based on a 9-group nodal-SP3 transport method in 3-dimensional pin-by-pin geometry for in-core fuel management of commercial PWRs. The present algorithm guarantees the same convergence process as in serial execution, which is very important from the viewpoint of quality management. The fine-mesh geometry is constructed by hierarchical decomposition, with the introduction of an intermediate management layer in the form of a block that is a quarter piece of a fuel assembly in the radial direction. A combination of a mesh division scheme forcing even meshes on each edge and a latency-hidden communication algorithm provided simplicity and efficiency in message passing to enhance parallel performance. Inter-processor communication and parallel I/O access were realized using MPI functions. Parallel performance was measured for depletion calculations by the 9-group nodal-SP3 transport method in 3-dimensional pin-by-pin geometry with 340 x 340 x 26 meshes for full core geometry and 170 x 170 x 26 for quarter core geometry. A PC cluster consisting of 24 Pentium-4 processors connected by Fast Ethernet was used for the performance measurement. Calculations in full core geometry gave better speedups than those in quarter core geometry because of larger granularity. The fine-mesh sweep and feedback calculation parts gave almost perfect scalability since granularity is large enough, while the 1-group coarse-mesh diffusion acceleration gave only around 80%. The speedup and parallel efficiency for total computation time were 22.6 and 94%, respectively, for the calculation in full core geometry with 24 processors. (authors)
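The reported speedup and parallel efficiency are related by the usual definition, efficiency = speedup / number of processors. The short sketch below checks the abstract's figures for consistency; the function name is illustrative and not taken from SCOPE2.

```python
def parallel_efficiency(speedup: float, processors: int) -> float:
    """Standard definition: fraction of ideal (linear) speedup achieved."""
    if processors <= 0:
        raise ValueError("processors must be positive")
    return speedup / processors


# Figures quoted in the abstract: speedup 22.6 on 24 processors.
efficiency = parallel_efficiency(22.6, 24)
print(f"parallel efficiency: {efficiency:.1%}")
```

Rounded to whole percent, this agrees with the 94% stated for the full core calculation.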
A new model for volume recombination in plane-parallel chambers in pulsed fields of high dose-per-pulse.

Science.gov (United States)

Gotz, M; Karsch, L; Pawelke, J

2017-11-01

In order to describe the volume recombination in a pulsed radiation field of high dose-per-pulse this study presents a numerical solution of a 1D transport model of the liberated charges in a plane-parallel ionization chamber. In addition, measurements were performed on an Advanced Markus ionization chamber in a pulsed electron beam to obtain suitable data to test the calculation. The experiment used radiation pulses of 4 μs duration and variable dose-per-pulse values up to about 1 Gy, as well as pulses of variable duration up to 308 [Formula: see text] at constant dose-per-pulse values between 85 mGy and 400 mGy. Those experimental data were compared to the developed numerical model and existing descriptions of volume recombination. At low collection voltages the observed dose-per-pulse dependence of volume recombination can be approximated by the existing theory using effective parameters. However, at high collection voltages large discrepancies are observed. The developed numerical model shows much better agreement with the observations and is able to replicate the observed behavior over the entire range of dose-per-pulse values and collection voltages. Using the developed numerical model, the differences between observation and existing theory are shown to be the result of a large fraction of the charge being collected as free electrons and the resultant distortion of the electric field inside the chamber. Furthermore, the numerical solution is able to calculate recombination losses for arbitrary pulse durations in good agreement with the experimental data, an aspect not covered by current theory. Overall, the presented numerical solution of the charge transport model should provide a more flexible tool to describe volume recombination for high dose-per-pulse values as well as for arbitrary pulse durations and repetition rates.
Professional Parallel Programming with C#: Master Parallel Extensions with .NET 4

CERN Document Server

Hillar, Gastón

2010-01-01

Expert guidance for those programming today's dual-core processor PCs. As PC processors explode from one or two to now eight processors, there is an urgent need for programmers to master concurrent programming. This book dives deep into the latest technologies available to programmers for creating professional parallel applications using C#, .NET 4, and Visual Studio 2010. The book covers task-based programming, coordination data structures, PLINQ, thread pools, the asynchronous programming model, and more. It also teaches other parallel programming techniques, such as SIMD and vectorization. Teach
Capacity Analysis for Parallel Runway through Agent-Based Simulation

Directory of Open Access Journals (Sweden)

Yang Peng

2013-01-01

Parallel runways are the mainstream layout of China's hub airports. The runway is often the bottleneck of an airport, and the evaluation of its capacity is of great importance to airport management. This study outlines a model, multiagent architecture, implementation approach, and software prototype of a simulation system for evaluating runway capacity. Agent Unified Modeling Language (AUML) is applied to illustrate the arrival and departure procedures of aircraft and to design the agent-based model. The model is evaluated experimentally, and its quality is studied in comparison with models created with SIMMOD and Arena. The results suggest that the method is highly efficient, so it can be applied to parallel runway capacity evaluation, and the model offers favorable flexibility and extensibility.
Modelling and simulation of a heat exchanger

Science.gov (United States)

Xia, Lei; Deabreu-Garcia, J. Alex; Hartley, Tom T.

1991-01-01

Two models for two different control systems are developed for a parallel heat exchanger. First by spatially lumping a heat exchanger model, a good approximate model which has a high system order is produced. Model reduction techniques are applied to these to obtain low order models that are suitable for dynamic analysis and control design. The simulation method is discussed to ensure a valid simulation result.

Realization of parking task based on affine system modeling

International Nuclear Information System (INIS)

Kim, Young Woo; Narikiyo, Tatsuo

2007-01-01

This paper presents a motion control system of an unmanned vehicle, where parallel parking task is realized based on a self-organizing affine system modeling and a quadratic programming based robust controller. Because of non-linearity of the vehicle system and complexity of the task to realize, control objective is not always realized with a single algorithm or control mode. This paper presents a hybrid model for parallel parking task in which seven modes for describing sub-tasks constitute an entire model
Optimal use of data in parallel tempering simulations for the construction of discrete-state Markov models of biomolecular dynamics.

Science.gov (United States)

Prinz, Jan-Hendrik; Chodera, John D; Pande, Vijay S; Swope, William C; Smith, Jeremy C; Noé, Frank

2011-06-28

Parallel tempering (PT) molecular dynamics simulations have been extensively investigated as a means of efficient sampling of the configurations of biomolecular systems. Recent work has demonstrated how the short physical trajectories generated in PT simulations of biomolecules can be used to construct the Markov models describing biomolecular dynamics at each simulated temperature. While this approach describes the temperature-dependent kinetics, it does not make optimal use of all available PT data, instead estimating the rates at a given temperature using only data from that temperature. This can be problematic, as some relevant transitions or states may not be sufficiently sampled at the temperature of interest, but might be readily sampled at nearby temperatures. Further, the comparison of temperature-dependent properties can suffer from the false assumption that data collected from different temperatures are uncorrelated. We propose here a strategy in which, by a simple modification of the PT protocol, the harvested trajectories can be reweighted, permitting data from all temperatures to contribute to the estimated kinetic model. The method reduces the statistical uncertainty in the kinetic model relative to the single temperature approach and provides estimates of transition probabilities even for transitions not observed at the temperature of interest. Further, the method allows the kinetics to be estimated at temperatures other than those at which simulations were run. We illustrate this method by applying it to the generation of a Markov model of the conformational dynamics of the solvated terminally blocked alanine peptide.
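For readers unfamiliar with parallel tempering, the sketch below shows the standard Metropolis criterion for exchanging configurations between two replicas. It illustrates the generic PT exchange step only; it is not the reweighting estimator proposed in the paper, and the function and argument names are illustrative assumptions.

```python
import math


def swap_acceptance(beta_i: float, beta_j: float,
                    energy_i: float, energy_j: float) -> float:
    """Probability of accepting a swap between replicas i and j.

    beta_* are inverse temperatures 1/(k_B T); energy_* are the potential
    energies of the configurations currently held by each replica.
    Standard criterion: min(1, exp((beta_i - beta_j) * (E_i - E_j))).
    """
    delta = (beta_i - beta_j) * (energy_i - energy_j)
    return 1.0 if delta >= 0.0 else math.exp(delta)


# A cold replica (larger beta) holding the higher-energy configuration
# always swaps with a hot replica holding a lower-energy one.
print(swap_acceptance(1.0, 0.5, 3.0, 1.0))
# The reverse situation is accepted only with probability exp(delta) < 1.
print(swap_acceptance(1.0, 0.5, 1.0, 3.0))
```

Because the criterion satisfies detailed balance, each replica still samples its own canonical distribution, which is what makes the per-temperature trajectories usable for kinetic modeling.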
The numerical parallel computing of photon transport

International Nuclear Information System (INIS)

Huang Qingnan; Liang Xiaoguang; Zhang Lifa

1998-12-01

The parallel computing of photon transport is investigated; the parallel algorithm and the parallelization of programs on parallel computers with both shared memory and distributed memory are discussed. By analyzing the inherent laws of the mathematical and physical model of photon transport in light of the structural features of parallel computers, using a divide-and-conquer strategy, adjusting the algorithm structure of the program, resolving the data dependencies, finding the parallelizable components and creating large-grain parallel subtasks, the sequential computation of photon transport is efficiently transformed into parallel and vector computation. The program was run on various HP parallel computers such as the HY-1 (PVP), the Challenge (SMP) and the YH-3 (MPP), and very good parallel speedup was obtained.
Refinement of Parallel and Reactive Programs

OpenAIRE

Back, R. J. R.

1992-01-01

We show how to apply the refinement calculus to stepwise refinement of parallel and reactive programs. We use action systems as our basic program model. Action systems are sequential programs which can be implemented in a parallel fashion. Hence refinement calculus methods, originally developed for sequential programs, carry over to the derivation of parallel programs. Refinement of reactive programs is handled by data refinement techniques originally developed for the sequential refinement c...
Parallel 3-D numerical simulation of dielectric barrier discharge plasma actuators

Science.gov (United States)

Houba, Tomas

Dielectric barrier discharge plasma actuators have shown promise in a range of applications including flow control, sterilization and ozone generation. Developing numerical models of plasma actuators is of great importance, because a high-fidelity parallel numerical model allows new design configurations to be tested rapidly. Additionally, it provides a better understanding of the plasma actuator physics which is useful for further innovation. The physics of plasma actuators is studied numerically. A loosely coupled approach is utilized for the coupling of the plasma to the neutral fluid. The state of the art in numerical plasma modeling is advanced by the development of a parallel, three-dimensional, first-principles model with detailed air chemistry. The model incorporates 7 charged species and 18 reactions, along with a solution of the electron energy equation. To the author's knowledge, a parallel three-dimensional model of a gas discharge with a detailed air chemistry model and the solution of electron energy is unique. Three representative geometries are studied using the gas discharge model. The discharge of gas between two parallel electrodes is used to validate the air chemistry model developed for the gas discharge code. The gas discharge model is then applied to the discharge produced by placing a dc powered wire and grounded plate electrodes in a channel. Finally, a three-dimensional simulation of gas discharge produced by electrodes placed inside a riblet is carried out. The body force calculated with the gas discharge model is loosely coupled with a fluid model to predict the induced flow inside the riblet.
On synchronous parallel computations with independent probabilistic choice

International Nuclear Information System (INIS)

Reif, J.H.

1984-01-01

This paper introduces probabilistic choice to synchronous parallel machine models, in particular parallel RAMs. The power of probabilistic choice in parallel computations is illustrated by parallelizing some known probabilistic sequential algorithms. The authors characterize the computational complexity of time, space, and processor bounded probabilistic parallel RAMs in terms of the computational complexity of probabilistic sequential RAMs. They show that parallelism uniformly speeds up time bounded probabilistic sequential RAM computations by nearly a quadratic factor. They also show that probabilistic choice can be eliminated from parallel computations by introducing nonuniformity.
Phase Field Modeling Using PetIGA

KAUST Repository

Vignal, Philippe; Collier, Nathan; Calo, Victor M.

2013-01-01

, and having a highly efficient and parallel framework to solve them is necessary. In this work, a brief review on phase field models is given, followed by a short analysis of the Phase Field Crystal Model solved with Isogeometric Analysis using PetIGA. We
Methodology of modeling and measuring computer architectures for plasma simulations

Science.gov (United States)

Wang, L. P. T.

1977-01-01

A brief introduction to plasma simulation using computers and the difficulties on currently available computers is given. Through the use of an analyzing and measuring methodology - SARA, the control flow and data flow of a particle simulation model REM2-1/2D are exemplified. After recursive refinements the total execution time may be greatly shortened and a fully parallel data flow can be obtained. From this data flow, a matched computer architecture or organization could be configured to achieve the computation bound of an application problem. A sequential type simulation model, an array/pipeline type simulation model, and a fully parallel simulation model of a code REM2-1/2D are proposed and analyzed. This methodology can be applied to other application problems which have implicitly parallel nature.
The Medial Temporal Lobe – Conduit of Parallel Connectivity: A model for Attention, Memory, and Perception.

Directory of Open Access Journals (Sweden)

Brian B. Mozaffari

2014-11-01

Based on the notion that the brain is equipped with a hierarchical organization, which embodies environmental contingencies across many time scales, this paper suggests that the medial temporal lobe (MTL), located deep in the hierarchy, serves as a bridge connecting supra- to infra-MTL levels. Bridging the upper and lower regions of the hierarchy provides a parallel architecture that optimizes information flow between upper and lower regions to aid attention, encoding, and processing of quick complex visual phenomena. Bypassing intermediate hierarchy levels, information conveyed through the MTL ‘bridge’ allows upper levels to make educated predictions about the prevailing context and accordingly select lower representations to increase the efficiency of predictive coding throughout the hierarchy. This selection or activation/deactivation is associated with endogenous attention. In the event that these ‘bridge’ predictions are inaccurate, this architecture enables the rapid encoding of novel contingencies. A review of hierarchical models in relation to memory is provided along with a new theory, Medial-temporal-lobe Conduit for Parallel Connectivity (MCPC). In this scheme, consolidation is considered as a secondary process, occurring after an MTL-bridged connection, which eventually allows upper and lower levels to access each other directly. With repeated reactivations, as contingencies become consolidated, less MTL activity is predicted. Finally, MTL bridging may aid processing of transient but structured perceptual events, by allowing communication between upper and lower levels without calling on intermediate levels of representation.
Impact analysis on a massively parallel computer

International Nuclear Information System (INIS)

Zacharia, T.; Aramayo, G.A.

1994-01-01

Advanced mathematical techniques and computer simulation play a major role in evaluating and enhancing the design of beverage cans, industrial, and transportation containers for improved performance. Numerical models are used to evaluate the impact requirements of containers used by the Department of Energy (DOE) for transporting radioactive materials. Many of these models are highly compute-intensive. An analysis may require several hours of computational time on current supercomputers despite the simplicity of the models being studied. As computer simulations and materials databases grow in complexity, massively parallel computers have become important tools. Massively parallel computational research at the Oak Ridge National Laboratory (ORNL) and its application to the impact analysis of shipping containers is briefly described in this paper
VALIDATION OF CRACK INTERACTION LIMIT MODEL FOR PARALLEL EDGE CRACKS USING TWO-DIMENSIONAL FINITE ELEMENT ANALYSIS

Directory of Open Access Journals (Sweden)

R. Daud

2013-06-01

Shielding interaction effects of two parallel edge cracks in finite thickness plates subjected to remote tension load are analyzed using a developed finite element analysis program. In the present study, the crack interaction limit is evaluated based on the fitness-for-service (FFS) code, and focus is given to the weak crack interaction region as the crack interval exceeds the length of cracks (b > a). Crack interaction factors are evaluated based on Mode I stress intensity factors (SIFs) using a displacement extrapolation technique. Parametric studies involved a wide range of crack-to-width ratios (0.05 ≤ a/W ≤ 0.5) and crack interval ratios (b/a > 1). For validation, crack interaction factors are compared with single edge crack SIFs as a state of zero interaction. Within the considered range of parameters, the proposed numerical evaluation used to predict the crack interaction factor reduces the error of the existing analytical solution from 1.92% to 0.97% at higher a/W. In reference to FFS codes, the small discrepancy in the prediction of the crack interaction factor validates the reliability of the numerical model to predict crack interaction limits under shielding interaction effects. In conclusion, the numerical model gave a successful prediction in estimating the crack interaction limit, which can be used as a reference for the shielding orientation of other cracks.
Theoretical study on instability mechanism of jet-induced sloshing. Model development using Orr-Sommerfeld equation generalized for non-parallel flow; Funryu reiki sloshing gensho no hassei kiko ni kansuru rironteki kenkyu. Hiheiko nagare ni ippankashita Orr-Sommerfeld hoteishiki wo mochiita model ka

Energy Technology Data Exchange (ETDEWEB)

Eguchi, Y. [Central Research Institute of Electric Power Industry, Tokyo (Japan)

1998-07-25

A theoretical model was developed to study the mechanism of free surface sloshing in a vessel induced by a steady vertical jet flow. In the model, jet deflection is calculated with eigenvalues of the generalized Orr-Sommerfeld equation, which is applicable to a slightly non-parallel jet. The instability criteria employed in the model are (1) a resonance condition between sloshing and jet frequencies and (2) a π phase relation between jet displacement at an inlet and global jet deflection. Numerical results of the mathematical model have shown good agreement with experimental ones, which supports the view that the inherent instability of the free jet itself and edge tone feedback are the main causes of the self-excited sloshing. 9 refs., 10 figs.
Xyce Parallel Electronic Simulator - User's Guide, Version 1.0

Energy Technology Data Exchange (ETDEWEB)

HUTCHINSON, SCOTT A; KEITER, ERIC R.; HOEKSTRA, ROBERT J.; WATERS, LON J.; RUSSO, THOMAS V.; RANKIN, ERIC LAMONT; WIX, STEVEN D.

2002-11-01

This manual describes the use of the Xyce Parallel Electronic Simulator code for simulating electrical circuits at a variety of abstraction levels. The Xyce Parallel Electronic Simulator has been written to support, in a rigorous manner, the simulation needs of the Sandia National Laboratories electrical designers. As such, the development has focused on improving the capability over the current state-of-the-art in the following areas: (1) Capability to solve extremely large circuit problems by supporting large-scale parallel computing platforms (up to thousands of processors). Note that this includes support for most popular parallel and serial computers. (2) Improved performance for all numerical kernels (e.g., time integrator, nonlinear and linear solvers) through state-of-the-art algorithms and novel techniques. (3) A client-server or multi-tiered operating model wherein the numerical kernel can operate independently of the graphical user interface (GUI). (4) Object-oriented code design and implementation using modern coding practices that ensure that the Xyce Parallel Electronic Simulator will be maintainable and extensible far into the future. The code is a parallel code in the most general sense of the phrase--a message passing parallel implementation--which allows it to run efficiently on the widest possible number of computing platforms. These include serial, shared-memory and distributed-memory parallel as well as heterogeneous platforms. Furthermore, careful attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved even as the number of processors grows. Another feature required by designers is the ability to add device models, many specific to the needs of Sandia, to the code. To this end, the device package in the Xyce Parallel Electronic Simulator is designed to support a variety of device model inputs. These input formats include standard analytical models, behavioral models
Finite element modelling

International Nuclear Information System (INIS)

Tonks, M.R.; Williamson, R.; Masson, R.

2015-01-01

The Finite Element Method (FEM) is a numerical technique for finding approximate solutions to boundary value problems (BVPs). While FEM is commonly used to solve solid mechanics equations, it can be applied to a large range of BVPs from many different fields. FEM has been used for reactor fuels modelling for many years. It is most often used for fuel performance modelling at the pellet and pin scale; however, it has also been used to investigate properties of the fuel material, such as thermal conductivity and fission gas release. Recently, the United States Department of Energy Nuclear Energy Advanced Modelling and Simulation Program has begun using FEM as the basis of the MOOSE-BISON-MARMOT Project, which is developing a multi-dimensional, multi-physics fuel performance capability that is massively parallel and will use multi-scale material models to provide a truly predictive modelling capability. (authors)
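As a minimal illustration of FEM applied to a boundary value problem, the sketch below solves -u'' = f on [0, 1] with u(0) = u(1) = 0 using linear elements on a uniform mesh. It is a textbook toy problem chosen for this listing, not code from MOOSE, BISON or MARMOT; the function name and the constant source term are assumptions.

```python
def solve_poisson_1d(n_elements: int, f: float = 1.0) -> list[float]:
    """Linear-element FEM for -u'' = f on [0, 1], u(0) = u(1) = 0.

    Returns nodal values at the n_elements + 1 equally spaced nodes.
    """
    if n_elements < 2:
        raise ValueError("need at least two elements")
    h = 1.0 / n_elements
    m = n_elements - 1                  # number of interior (unknown) nodes
    diag = [2.0 / h] * m                # assembled stiffness: tridiagonal
    off = -1.0 / h                      # constant off-diagonal entry
    rhs = [f * h] * m                   # load vector for constant f

    # Thomas algorithm: forward elimination on the tridiagonal system.
    for i in range(1, m):
        w = off / diag[i - 1]
        diag[i] -= w * off
        rhs[i] -= w * rhs[i - 1]

    # Back substitution.
    u = [0.0] * m
    u[-1] = rhs[-1] / diag[-1]
    for i in range(m - 2, -1, -1):
        u[i] = (rhs[i] - off * u[i + 1]) / diag[i]

    return [0.0] + u + [0.0]            # add the Dirichlet boundary values


nodes = solve_poisson_1d(8)
for i, value in enumerate(nodes):
    x = i / 8
    print(f"x = {x:.3f}  u_h = {value:.6f}  exact = {x * (1 - x) / 2:.6f}")
```

For this particular problem the exact solution is u(x) = x(1 - x)/2, and linear elements reproduce it at the nodes, which makes the example convenient for checking an implementation.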
Modeling of arylamide helix mimetics in the p53 peptide binding site of hDM2 suggests parallel and anti-parallel conformations are both stable.

Directory of Open Access Journals (Sweden)

Jonathan C Fuller

The design of novel α-helix mimetic inhibitors of protein-protein interactions is of interest to pharmaceuticals and chemical genetics researchers as these inhibitors provide a chemical scaffold presenting side chains in the same geometry as an α-helix. This conformational arrangement allows the design of high affinity inhibitors mimicking known peptide sequences binding specific protein substrates. We show that GAFF and AutoDock potentials do not properly capture the conformational preferences of α-helix mimetics based on arylamide oligomers and identify alternate parameters matching solution NMR data and suitable for molecular dynamics simulation of arylamide compounds. Results from both docking and molecular dynamics simulations are consistent with the arylamides binding in the p53 peptide binding pocket. Simulations of arylamides in the p53 binding pocket of hDM2 are consistent with binding, exhibiting similar structural dynamics in the pocket as simulations of known hDM2 binders Nutlin-2 and a benzodiazepinedione compound. Arylamide conformations converge towards the same region of the binding pocket on the 20 ns time scale, and most, though not all, dihedrals in the binding pocket are well sampled on this timescale. We show that there are two putative classes of binding modes for arylamide compounds supported equally by the modeling evidence. In the first, the arylamide compound lies parallel to the observed p53 helix. In the second class, not previously identified or proposed, the arylamide compound lies anti-parallel to the p53 helix.
Maximum entropy models of ecosystem functioning

International Nuclear Information System (INIS)

Bertram, Jason

2014-01-01

Using organism-level traits to deduce community-level relationships is a fundamental problem in theoretical ecology. This problem parallels the physical one of using particle properties to deduce macroscopic thermodynamic laws, which was successfully achieved with the development of statistical physics. Drawing on this parallel, theoretical ecologists from Lotka onwards have attempted to construct statistical mechanistic theories of ecosystem functioning. Jaynes’ broader interpretation of statistical mechanics, which hinges on the entropy maximisation algorithm (MaxEnt), is of central importance here because the classical foundations of statistical physics do not have clear ecological analogues (e.g. phase space, dynamical invariants). However, models based on the information theoretic interpretation of MaxEnt are difficult to interpret ecologically. Here I give a broad discussion of statistical mechanical models of ecosystem functioning and the application of MaxEnt in these models. Emphasising the sample frequency interpretation of MaxEnt, I show that MaxEnt can be used to construct models of ecosystem functioning which are statistical mechanical in the traditional sense using a savanna plant ecology model as an example
Maximum entropy models of ecosystem functioning

Energy Technology Data Exchange (ETDEWEB)

Bertram, Jason, E-mail: jason.bertram@anu.edu.au [Research School of Biology, The Australian National University, Canberra ACT 0200 (Australia)

2014-12-05

Using organism-level traits to deduce community-level relationships is a fundamental problem in theoretical ecology. This problem parallels the physical one of using particle properties to deduce macroscopic thermodynamic laws, which was successfully achieved with the development of statistical physics. Drawing on this parallel, theoretical ecologists from Lotka onwards have attempted to construct statistical mechanistic theories of ecosystem functioning. Jaynes’ broader interpretation of statistical mechanics, which hinges on the entropy maximisation algorithm (MaxEnt), is of central importance here because the classical foundations of statistical physics do not have clear ecological analogues (e.g. phase space, dynamical invariants). However, models based on the information theoretic interpretation of MaxEnt are difficult to interpret ecologically. Here I give a broad discussion of statistical mechanical models of ecosystem functioning and the application of MaxEnt in these models. Emphasising the sample frequency interpretation of MaxEnt, I show that MaxEnt can be used to construct models of ecosystem functioning which are statistical mechanical in the traditional sense using a savanna plant ecology model as an example.
Using the extended parallel process model to prevent noise-induced hearing loss among coal miners in Appalachia

Energy Technology Data Exchange (ETDEWEB)

Murray-Johnson, L.; Witte, K.; Patel, D.; Orrego, V.; Zuckerman, C.; Maxfield, A.M.; Thimons, E.D. [Ohio State University, Columbus, OH (US)

2004-12-15

Occupational noise-induced hearing loss is the second most self-reported occupational illness or injury in the United States. Among coal miners, more than 90% of the population reports a hearing deficit by age 55. In this formative evaluation, focus groups were conducted with coal miners in Appalachia to ascertain whether miners perceive hearing loss as a major health risk and if so, what would motivate the consistent wearing of hearing protection devices (HPDs). The theoretical framework of the Extended Parallel Process Model was used to identify the miners' knowledge, attitudes, beliefs, and current behaviors regarding hearing protection. Focus group participants had strong perceived severity and varying levels of perceived susceptibility to hearing loss. Various barriers significantly reduced the self-efficacy and the response efficacy of using hearing protection.
Geometrical (Degree 0) Modelling of a FP3+3×RTR+MP3 Type Parallel Topology Robotic Guiding Device, Using the „Pair of Frames” (PF) Concept

Directory of Open Access Journals (Sweden)

Calin Miclosina

2005-01-01

The geometrical (degree 0) model of a parallel topology robotic guiding device represents the position-orientation matrix of the mobile platform (MP) versus the fixed one (FP); this model refers to generalized displacements. The kinematical scheme of a FP3+3×RTR+MP3 type mechanism is presented, as well as the manner of choosing the pairs of frames (PF) attached to the links. In the case of direct geometrical modelling, for given displacements of the actuated translational joints, the position-orientation matrix of the mobile platform versus the fixed one is determined. For inverse geometrical modelling, the position-orientation matrix of MP versus FP is known and the displacements of the actuated translational joints are determined.
Analysis of multigrid methods on massively parallel computers: Architectural implications

Science.gov (United States)

Matheson, Lesley R.; Tarjan, Robert E.

1993-01-01

We study the potential performance of multigrid algorithms running on massively parallel computers with the intent of discovering whether presently envisioned machines will provide an efficient platform for such algorithms. We consider the domain parallel version of the standard V cycle algorithm on model problems, discretized using finite difference techniques in two and three dimensions on block structured grids of size 10^6 and 10^9, respectively. Our models of parallel computation were developed to reflect the computing characteristics of the current generation of massively parallel multicomputers. These models are based on an interconnection network of 256 to 16,384 message passing, 'workstation size' processors executing in an SPMD mode. The first model accomplishes interprocessor communications through a multistage permutation network. The communication cost is a logarithmic function which is similar to the costs in a variety of different topologies. The second model allows single stage communication costs only. Both models were designed with information provided by machine developers and utilize implementation derived parameters. With the medium grain parallelism of the current generation and the high fixed cost of an interprocessor communication, our analysis suggests an efficient implementation requires the machine to support the efficient transmission of long messages (up to 1000 words), or the high initiation cost of a communication must be significantly reduced through an alternative optimization technique. Furthermore, with variable length message capability, our analysis suggests the low diameter multistage networks provide little or no advantage over a simple single stage communications network.
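The point about high fixed communication cost and long messages can be illustrated with the common latency-bandwidth cost model, T(n) = alpha + beta * n. This is a generic sketch, not the authors' model, and the parameter values below are hypothetical placeholders rather than figures from the report.

```python
# Hypothetical parameters, chosen only for illustration.
ALPHA_US = 100.0   # fixed initiation (startup) cost per message, microseconds
BETA_US = 0.1      # transfer cost per word, microseconds


def message_time(words: int, alpha: float = ALPHA_US, beta: float = BETA_US) -> float:
    """Time to send one message of `words` words: startup plus transfer."""
    return alpha + beta * words


def per_word_cost(words: int) -> float:
    """Effective cost per word; the startup cost is amortized over the message."""
    return message_time(words) / words


# Sending 1000 words as one long message versus 1000 one-word messages.
one_long = message_time(1000)
many_short = 1000 * message_time(1)
print(f"one 1000-word message : {one_long:.1f} us")
print(f"1000 one-word messages: {many_short:.1f} us")
```

Under these assumed values the startup term dominates short messages, which is why aggregating data into long messages, or cutting the initiation cost, matters for the efficiency of a fine- to medium-grain algorithm such as multigrid.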

Analysis of gamma irradiator dose rate using spent fuel elements with parallel configuration

International Nuclear Information System (INIS)

Setiyanto; Pudjijanto MS; Ardani

2006-01-01

To enhance the utilization of spent fuel from the RSG-GAS reactor, a gamma irradiator that uses spent fuel elements as the gamma source is a suitable choice. Such an irradiator can be used for food sterilization and preservation. As a first step before realization, the gamma dose rate must be determined theoretically. The assessment was carried out for fuel elements in a parallel configuration, in which the irradiation space is placed between rows of fuel elements. The parallel model was chosen for comparison with the circle model and to obtain as much space as possible for irradiation and for manipulating the irradiation target. Dose rate calculations were done with MCNP, while the gamma activities of the fuel elements were estimated with the OREGEN code, assuming an average delay time of 1 year. The calculations show that the gamma dose rate of the parallel model is up to 50% lower than that of the circle model, but the value is still sufficient for sterilization and preservation. For food preservation in particular, the parallel model is more flexible, since the gamma dose rate can be adjusted to the irradiation required. The assessment concludes that using reactor spent fuel in a gamma irradiator with the parallel model offers more advantages than the circle model. (author)
PIXIE3D: An efficient, fully implicit, parallel, 3D extended MHD code for fusion plasma modeling

International Nuclear Information System (INIS)

Chacon, L.

2007-01-01

PIXIE3D is a modern, parallel, state-of-the-art extended MHD code that employs fully implicit methods for efficiency and accuracy. It features a general geometry formulation, and is therefore suitable for the study of many magnetic fusion configurations of interest. PIXIE3D advances the state of the art in extended MHD modeling in two fundamental ways. Firstly, it employs a novel conservative finite volume scheme which is remarkably robust and stable, and demands very small physical and/or numerical dissipation. This is a fundamental requirement when one wants to study fusion plasmas with realistic conductivities. Secondly, PIXIE3D features fully-implicit time stepping, employing Newton-Krylov methods for inverting the associated nonlinear systems. These methods have been shown to be scalable and efficient when preconditioned properly. Novel preconditioning ideas (so-called physics-based), which were prototyped in the context of reduced MHD, have been adapted for 3D primitive-variable resistive MHD in PIXIE3D, and are currently being extended to Hall MHD. PIXIE3D is fully parallel, employing PETSc for parallelism. PIXIE3D has been thoroughly benchmarked against linear theory and against other available extended MHD codes on nonlinear test problems (such as the GEM reconnection challenge). We are currently in the process of extending such comparisons to fusion-relevant problems in realistic geometries. In this talk, we will describe both the spatial discretization approach and the preconditioning strategy employed for extended MHD in PIXIE3D. We will report on recent benchmarking studies between PIXIE3D and other 3D extended MHD codes, and will demonstrate its usefulness in a variety of fusion-relevant configurations such as Tokamaks and Reversed Field Pinches. (Author)
Modeling bidirectional reflectance of forests and woodlands using Boolean models and geometric optics

Science.gov (United States)

Strahler, Alan H.; Jupp, David L. B.

1990-01-01

Geometric-optical discrete-element mathematical models for forest canopies have been developed using the Boolean logic and models of Serra. The geometric-optical approach is considered to be particularly well suited to describing the bidirectional reflectance of forest woodland canopies, where the concentration of leaf material within crowns and the resulting between-tree gaps make plane-parallel, radiative-transfer models inappropriate. The approach leads to invertible formulations, in which the spatial and directional variance provides the means for remote estimation of tree crown size, shape, and total cover from remotely sensed imagery.
Structured Parallel Programming Patterns for Efficient Computation

CERN Document Server

McCool, Michael; Robison, Arch

2012-01-01

Programming is now parallel programming. Much as structured programming revolutionized traditional serial programming decades ago, a new kind of structured programming, based on patterns, is relevant to parallel programming today. Parallel computing experts and industry insiders Michael McCool, Arch Robison, and James Reinders describe how to design and implement maintainable and efficient parallel algorithms using a pattern-based approach. They present both theory and practice, and give detailed concrete examples using multiple programming models. Examples are primarily given using two of th
Mental models of adherence: parallels in perceptions, values, and expectations in adherence to prescribed home exercise programs and other personal regimens.

Science.gov (United States)

Rizzo, Jon; Bell, Alexandra

2018-05-09

A mental model is the collection of an individual's perceptions, values, and expectations about a particular aspect of their life, which strongly influences behaviors. This study explored orthopedic outpatients' mental models of adherence to prescribed home exercise programs and how they related to mental models of adherence to other types of personal regimens. The study followed an interpretive description qualitative design. Data were collected via two semi-structured interviews. Interview One focused on participants' prior experiences adhering to personal regimens. Interview Two focused on experiences adhering to their current prescribed home exercise program. Data analysis followed a constant comparative method. Findings revealed similarity in perceptions, values, and expectations that informed individuals' mental models of adherence to personal regimens and prescribed home exercise programs. Perceived realized results, expected results, perceived social supports, and value of convenience characterized mental models of adherence. Parallels between mental models of adherence for prescribed home exercise and other personal regimens suggest that patients' adherence behavior to prescribed routines may be influenced by adherence experiences in other aspects of their lives. By gaining insight into patients' adherence experiences, values, and expectations across life domains, clinicians may tailor supports that enhance home exercise adherence. Implications for Rehabilitation: A mental model is the collection of an individual's perceptions, values, and expectations about a particular aspect of their life, which is based on prior experiences and strongly influences behaviors. This study demonstrated similarity in orthopedic outpatients' mental models of adherence to prescribed home exercise programs and adherence to personal regimens in other aspects of their lives. Physical therapists should inquire about patients' non-medical adherence experiences, as strategies patients
Numerical Investigation of Startup Instabilities in Parallel-Channel Natural Circulation Boiling Systems

Directory of Open Access Journals (Sweden)

S. P. Lakshmanan

2010-01-01

The behaviour of a parallel-channel natural circulation boiling water reactor under a low-pressure low-power startup condition has been studied numerically (using RELAP5) and compared with its scaled model. The parallel-channel RELAP5 model is an extension of a single-channel model developed and validated with experimental results. The existence of in-phase and out-of-phase flashing instabilities in the parallel-channel systems is investigated through simulations under equal and unequal power boundary conditions in the channels. The effect of flow resistance on Type-I oscillations is explored. For nonidentical conditions in the channels, the flow fluctuations in the parallel-channel systems are found to be out-of-phase.
On Scalable Deep Learning and Parallelizing Gradient Descent

CERN Document Server

AUTHOR|(CDS)2129036; Möckel, Rico; Baranowski, Zbigniew; Canali, Luca

Speeding up gradient-based methods has been a subject of interest over the past years, with many practical applications, especially with respect to Deep Learning. Although many optimizations have been done at the hardware level, the convergence rate of very large models remains problematic. Therefore, data-parallel methods, alongside mini-batch parallelism, have been suggested to further decrease the training time of parameterized models using gradient-based methods. Nevertheless, asynchronous optimization was considered too unstable for practical purposes because the underlying mechanisms were poorly understood. Recently, a theoretical contribution has been made which defines asynchronous optimization in terms of (implicit) momentum due to the presence of a queuing model of gradients based on past parameterizations. This thesis mainly builds upon this work to construct a better understanding of why asynchronous optimization shows proportionally more divergent behavior when the number of parallel worker...
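
Why gradients computed on past parameterizations can destabilize training may be seen in a toy setting. The sketch below is not the thesis's method: it assumes a one-dimensional quadratic objective and a fixed delay (`staleness`, a hypothetical stand-in for the lag introduced by additional asynchronous workers), and only shows that a learning rate which converges with fresh gradients can diverge with stale ones.

```python
from collections import deque


def stale_gradient_descent(lr, staleness, steps, x0=1.0):
    """Minimise f(x) = 0.5 * x**2 using gradients that are `staleness` steps old.

    Returns the largest |x| among the last `staleness + 1` iterates.
    """
    # history[0] is the oldest stored parameter value, history[-1] the newest.
    history = deque([x0] * (staleness + 1), maxlen=staleness + 1)
    for _ in range(steps):
        stale_x = history[0]   # parameters the (hypothetical) worker last saw
        gradient = stale_x     # f'(x) = x for f(x) = 0.5 * x**2
        # Appending drops the oldest entry because of maxlen.
        history.append(history[-1] - lr * gradient)
    return max(abs(x) for x in history)


fresh = stale_gradient_descent(lr=0.9, staleness=0, steps=100)
stale = stale_gradient_descent(lr=0.9, staleness=2, steps=100)
```

With these assumed settings, `fresh` shrinks toward zero while `stale` oscillates with growing amplitude, even though the objective and learning rate are identical.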
Neutron generator power supply modeling in EMMA

International Nuclear Information System (INIS)

Robinson, A.C.; Farnsworth, A.V.; Montgomery, S.T.; Peery, J.S.; Merewether, K.O.

1996-01-01

Sandia National Laboratories has prime responsibility for neutron generator design and manufacturing, and is committed to developing predictive tools for modeling neutron generator performance. An important aspect of understanding component performance is explosively driven ferroelectric power supply modeling. EMMA (ElectroMechanical Modeling in ALEGRA) is a three-dimensional compile-time version of Sandia's ALEGRA code. The code is built on top of the general ALEGRA framework for parallel shock-physics computations but also includes additional capability for modeling the electric potential field in dielectrics. The overall package includes shock propagation due to explosive detonation, depoling of ferroelectric ceramics, electric field calculation and coupling with a general lumped element circuit equation system. The AZTEC parallel iterative solver is used to solve for the electric potential. The DASPK differential algebraic equation package is used to solve the circuit equation system. Sample calculations are described.
Research on Parallel Three Phase PWM Converters base on RTDS

Science.gov (United States)

Xia, Yan; Zou, Jianxiao; Li, Kai; Liu, Jingbo; Tian, Jun

2018-01-01

Parallel operation of converters can increase system capacity, but it may give rise to zero-sequence circulating current, so controlling the circulating current is an important goal in the design of parallel inverters. In this paper, the Real Time Digital Simulator (RTDS) is used to model the parallel converter system in real time and to study circulating-current suppression. An equivalent model of two parallel converters and of the zero-sequence circulating current (ZSCC) is established and analyzed, and a strategy using variable zero-vector control is proposed to suppress the circulating current. For two parallel modular converters, a hardware-in-the-loop (HIL) study based on RTDS and a practical experiment were carried out; the results show that the proposed control strategy is feasible and effective.
A Validated Set of MIDAS V5 Task Network Model Scenarios to Evaluate Nextgen Closely Spaced Parallel Operations Concepts

Science.gov (United States)

Gore, Brian Francis; Hooey, Becky Lee; Haan, Nancy; Socash, Connie; Mahlstedt, Eric; Foyle, David C.

2013-01-01

The Closely Spaced Parallel Operations (CSPO) scenario is a complex human performance model scenario that tested alternate operator roles and responsibilities to a series of off-nominal operations on approach and landing (see Gore, Hooey, Mahlstedt, Foyle, 2013). The model links together the procedures, equipment, crewstation, and external environment to produce predictions of operator performance in response to Next Generation system designs, like those expected in the National Airspace's NextGen concepts. The task analysis that is contained in the present report comes from the task analysis window in the MIDAS software. These tasks link definitions and states for equipment components, environmental features as well as operational contexts. The current task analysis culminated in 3300 tasks that included over 1000 Subject Matter Expert (SME)-vetted, re-usable procedural sets for three critical phases of flight: the Descent, Approach, and Land procedural sets (see Gore et al., 2011 for a description of the development of the tasks included in the model; Gore, Hooey, Mahlstedt, Foyle, 2013 for a description of the model and its results; Hooey, Gore, Mahlstedt, Foyle, 2013 for a description of the guidelines that were generated from the model's results; Gore, Hooey, Foyle, 2012 for a description of the model's implementation and its settings). The rollout, after landing checks, taxi to gate and arrive at gate illustrated in Figure 1 were not used in the approach and divert scenarios exercised. The other networks in Figure 1 set up appropriate context settings for the flight deck. The current report presents the model's task decomposition from the top, highest level and decomposes it to finer-grained levels. The first task that is completed by the model is to set all of the initial settings for the scenario runs included in the model (network 75 in Figure 1). This initialization process also resets the CAD graphic files contained with MIDAS, as well as the embedded
Same-source parallel implementation of the PSU/NCAR MM5

Energy Technology Data Exchange (ETDEWEB)

Michalakes, J.

1997-12-31

The Pennsylvania State/National Center for Atmospheric Research Mesoscale Model is a limited-area model of atmospheric systems, now in its fifth generation, MM5. Designed and maintained for vector and shared-memory parallel architectures, the official version of MM5 does not run on message-passing distributed memory (DM) parallel computers. The authors describe a same-source parallel implementation of the PSU/NCAR MM5 using FLIC, the Fortran Loop and Index Converter. The resulting source is nearly line-for-line identical with the original source code. The result is an efficient distributed memory parallel option to MM5 that can be seamlessly integrated into the official version.
PSHED: a simplified approach to developing parallel programs

International Nuclear Information System (INIS)

Mahajan, S.M.; Ramesh, K.; Rajesh, K.; Somani, A.; Goel, M.

1992-01-01

This paper presents a simplified approach in the form of a tree-structured computational model for parallel application programs. An attempt is made to provide a standard user interface to execute programs on the BARC Parallel Processing System (BPPS), a scalable distributed memory multiprocessor. The interface package, called PSHED, provides a basic framework for representing and executing parallel programs on different parallel architectures. The PSHED package incorporates concepts from a broad range of previous research in programming environments and parallel computations. (author). 6 refs
Hypergraph partitioning implementation for parallelizing matrix-vector multiplication using CUDA GPU-based parallel computing

Science.gov (United States)

Murni, Bustamam, A.; Ernastuti, Handhika, T.; Kerami, D.

2017-07-01

Matrix-vector multiplication in real-world problems often involves large matrices of arbitrary size. Parallelization is therefore needed to speed up a calculation that usually takes a long time. Graph partitioning techniques discussed in previous studies cannot be used to parallelize matrix-vector multiplication for matrices of arbitrary size, because they assume a square, symmetric matrix. Hypergraph partitioning techniques overcome this shortcoming. This paper addresses the efficient parallelization of matrix-vector multiplication through hypergraph partitioning techniques using CUDA GPU-based parallel computing. CUDA (compute unified device architecture) is a parallel computing platform and programming model that was created by NVIDIA and implemented by the GPU (graphics processing unit).
Overview of the Force Scientific Parallel Language

Directory of Open Access Journals (Sweden)

Gita Alaghband

1994-01-01

The Force parallel programming language, designed for large-scale shared-memory multiprocessors, is presented. The language provides a number of parallel constructs as extensions to the ordinary Fortran language and is implemented as a two-level macro preprocessor to support portability across shared memory multiprocessors. The global parallelism model on which the Force is based provides a powerful parallel language. The parallel constructs, generic synchronization, and freedom from process management supported by the Force have resulted in structured parallel programs that are ported to the many multiprocessors on which the Force is implemented. Two new parallel constructs for looping and functional decomposition are discussed. Several programming examples to illustrate some parallel programming approaches using the Force are also presented.
Synchronization Techniques in Parallel Discrete Event Simulation

OpenAIRE

Lindén, Jonatan

2018-01-01

Discrete event simulation is an important tool for evaluating system models in many fields of science and engineering. To improve the performance of large-scale discrete event simulations, several techniques to parallelize discrete event simulation have been developed. In parallel discrete event simulation, the work of a single discrete event simulation is distributed over multiple processing elements. A key challenge in parallel discrete event simulation is to ensure that causally dependent ...
CICE, The Los Alamos Sea Ice Model

Energy Technology Data Exchange (ETDEWEB)

2017-05-12

The Los Alamos sea ice model (CICE) is the result of an effort to develop a computationally efficient sea ice component for a fully coupled atmosphere–land–ocean–ice global climate model. It was originally designed to be compatible with the Parallel Ocean Program (POP), an ocean circulation model developed at Los Alamos National Laboratory for use on massively parallel computers. CICE has several interacting components: a vertical thermodynamic model that computes local growth rates of snow and ice due to vertical conductive, radiative and turbulent fluxes, along with snowfall; an elastic-viscous-plastic model of ice dynamics, which predicts the velocity field of the ice pack based on a model of the material strength of the ice; an incremental remapping transport model that describes horizontal advection of the areal concentration, ice and snow volume and other state variables; and a ridging parameterization that transfers ice among thickness categories based on energetic balances and rates of strain. It also includes a biogeochemical model that describes evolution of the ice ecosystem. The CICE sea ice model is used for climate research as one component of complex global earth system models that include atmosphere, land, ocean and biogeochemistry components. It is also used for operational sea ice forecasting in the polar regions and in numerical weather prediction models.
A method of paralleling computer calculation for two-dimensional kinetic plasma model

International Nuclear Information System (INIS)

Brazhnik, V.A.; Demchenko, V.V.; Dem'yanov, V.G.; D'yakov, V.E.; Ol'shanskij, V.V.; Panchenko, V.I.

1987-01-01

A method for parallel computer calculation is described, together with the OSIRIS program complex that implements it, which is designed for numerical plasma simulation by the macroparticle method. The calculation can be carried out on one BESM-6 computer or on two simultaneously, which is provided by a package of interacting programs running on each computer. Program interaction on each computer is based on event techniques implemented in OS DISPAK. Parallel calculation on two BESM-6 computers accelerates the computation by a factor of 1.5.
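
As a rough reading of the reported figure, a speedup of $S = 1.5$ on $p = 2$ computers corresponds to a parallel efficiency of 75%. If one additionally assumes Amdahl's law (an assumption made here for illustration; the abstract does not state it), the implied non-parallelizable fraction $f$ of the work is one third:

```latex
E = \frac{S}{p} = \frac{1.5}{2} = 0.75,
\qquad
S = \frac{1}{f + \dfrac{1 - f}{p}}
\;\Longrightarrow\;
f = \frac{p/S - 1}{p - 1} = \frac{2/1.5 - 1}{2 - 1} = \frac{1}{3}.
```

Substituting back gives $1 / (1/3 + 1/3) = 1.5$, consistent with the reported acceleration.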
Fundamental Parallel Algorithms for Private-Cache Chip Multiprocessors

DEFF Research Database (Denmark)

Arge, Lars Allan; Goodrich, Michael T.; Nelson, Michael

2008-01-01

about the way cores are interconnected, for we assume that all inter-processor communication occurs through the memory hierarchy. We study several fundamental problems, including prefix sums, selection, and sorting, which often form the building blocks of other parallel algorithms. Indeed, we present...... two sorting algorithms, a distribution sort and a mergesort. Our algorithms are asymptotically optimal in terms of parallel cache accesses and space complexity under reasonable assumptions about the relationships between the number of processors, the size of memory, and the size of cache blocks....... In addition, we study sorting lower bounds in a computational model, which we call the parallel external-memory (PEM) model, that formalizes the essential properties of our algorithms for private-cache CMPs....
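
Prefix sums, one of the building blocks named above, are commonly parallelized with a two-pass blocked scheme. The sketch below illustrates that general scheme only; it is not the cache-optimal PEM algorithm of the paper, and the function name and `workers` parameter are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import accumulate


def blocked_prefix_sums(values, workers=4):
    """Inclusive prefix sums via the classic two-pass blocked scheme."""
    if not values:
        return []
    block = -(-len(values) // workers)  # ceiling division
    chunks = [values[i:i + block] for i in range(0, len(values), block)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Pass 1: each worker sums its own block independently.
        block_sums = list(pool.map(sum, chunks))
        # An exclusive scan of the block sums gives each block's starting offset.
        offsets = [0] + list(accumulate(block_sums))[:-1]
        # Pass 2: each worker scans its block, starting from its offset.
        scanned = pool.map(
            lambda pair: list(accumulate(pair[0], initial=pair[1]))[1:],
            zip(chunks, offsets),
        )
        return [x for part in scanned for x in part]


result = blocked_prefix_sums([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], workers=3)
```

Threads are used here only to show the structure of the two passes; in CPython they would not be expected to speed up this arithmetic.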
Consumer preference models: fuzzy theory approach

Science.gov (United States)

Turksen, I. B.; Wilson, I. A.

1993-12-01

Consumer preference models are widely used in new product design, marketing management, pricing and market segmentation. The purpose of this article is to develop and test a fuzzy set preference model which can represent linguistic variables in individual-level models implemented in parallel with existing conjoint models. The potential improvements in market share prediction and predictive validity can substantially improve management decisions about what to make (product design), for whom to make it (market segmentation) and how much to make (market share prediction).
Reconstruction of the 1997/1998 El Nino from TOPEX/POSEIDON and TOGA/TAO Data Using a Massively Parallel Pacific-Ocean Model and Ensemble Kalman Filter

Science.gov (United States)

Keppenne, C. L.; Rienecker, M.; Borovikov, A. Y.

1999-01-01

Two massively parallel data assimilation systems in which the model forecast-error covariances are estimated from the distribution of an ensemble of model integrations are applied to the assimilation of 97-98 TOPEX/POSEIDON altimetry and TOGA/TAO temperature data into a Pacific basin version of the NASA Seasonal to Interannual Prediction Project (NSIPP)'s quasi-isopycnal ocean general circulation model. In the first system, an ensemble of model runs forced by an ensemble of atmospheric model simulations is used to calculate asymptotic error statistics. The data assimilation then occurs in the reduced phase space spanned by the corresponding leading empirical orthogonal functions. The second system is an ensemble Kalman filter in which new error statistics are computed during each assimilation cycle from the time-dependent ensemble distribution. The data assimilation experiments are conducted on NSIPP's 512-processor CRAY T3E. The two data assimilation systems are validated by withholding part of the data and quantifying the extent to which the withheld information can be inferred from the assimilation of the remaining data. The pros and cons of each system are discussed.
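
The idea of estimating forecast-error statistics from the ensemble itself can be shown with a single analysis step for a scalar state. This is a minimal sketch under strong simplifying assumptions (one directly observed variable, the perturbed-observation variant of the filter, hypothetical numbers); the system described above is multivariate and runs on a parallel machine.

```python
import random
import statistics


def enkf_update(ensemble, observation, obs_variance, rng):
    """One scalar ensemble Kalman filter analysis step with perturbed observations."""
    # Forecast-error variance is estimated from the ensemble spread itself.
    forecast_variance = statistics.variance(ensemble)
    # Kalman gain for a directly observed scalar state (observation operator H = 1).
    gain = forecast_variance / (forecast_variance + obs_variance)
    obs_sd = obs_variance ** 0.5
    # Each member assimilates its own noisy copy of the observation.
    return [
        member + gain * (observation + rng.gauss(0.0, obs_sd) - member)
        for member in ensemble
    ]


rng = random.Random(0)
forecast = [rng.gauss(0.0, 1.0) for _ in range(200)]   # hypothetical prior ensemble
analysis = enkf_update(forecast, observation=2.0, obs_variance=0.25, rng=rng)
```

After the update the ensemble mean is pulled toward the observation and the ensemble spread shrinks, reflecting the information gained from the data.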

Kinematics and dynamics analysis of a novel serial-parallel dynamic simulator

Energy Technology Data Exchange (ETDEWEB)

Hu, Bo; Zhang, Lian Dong; Yu, Jingjing [Parallel Robot and Mechatronic System Laboratory of Hebei Province, Yanshan University, Qinhuangdao, Hebei (China)

2016-11-15

A serial-parallel dynamics simulator based on serial-parallel manipulator is proposed. According to the dynamics simulator motion requirement, the proposed serial-parallel dynamics simulator formed by 3-RRS (active revolute joint-revolute joint-spherical joint) and 3-SPR (Spherical joint-active prismatic joint-revolute joint) PMs adopts the outer and inner layout. By integrating the kinematics, constraint and coupling information of the 3-RRS and 3-SPR PMs into the serial-parallel manipulator, the inverse Jacobian matrix, velocity, and acceleration of the serial-parallel dynamics simulator are studied. Based on the principle of virtual work and the kinematics model, the inverse dynamic model is established. Finally, the workspace of the (3-RRS)+(3-SPR) dynamics simulator is constructed.
Kinematics and dynamics analysis of a novel serial-parallel dynamic simulator

International Nuclear Information System (INIS)

Hu, Bo; Zhang, Lian Dong; Yu, Jingjing

2016-01-01

A serial-parallel dynamics simulator based on serial-parallel manipulator is proposed. According to the dynamics simulator motion requirement, the proposed serial-parallel dynamics simulator formed by 3-RRS (active revolute joint-revolute joint-spherical joint) and 3-SPR (Spherical joint-active prismatic joint-revolute joint) PMs adopts the outer and inner layout. By integrating the kinematics, constraint and coupling information of the 3-RRS and 3-SPR PMs into the serial-parallel manipulator, the inverse Jacobian matrix, velocity, and acceleration of the serial-parallel dynamics simulator are studied. Based on the principle of virtual work and the kinematics model, the inverse dynamic model is established. Finally, the workspace of the (3-RRS)+(3-SPR) dynamics simulator is constructed
Running parallel applications with topology-aware grid middleware

NARCIS (Netherlands)

Bar, P.; Coti, C.; Groen, D.; Herault, T.; Kravtsov, V.; Schuster, A; Swain, M.

2009-01-01

The concept of topology-aware grid applications is derived from parallelized computational models of complex systems that are executed on heterogeneous resources, either because they require specialized hardware for certain calculations, or because their parallelization is flexible enough to exploit
Nonlinear ECRH and ECCD modeling in toroidal devices

International Nuclear Information System (INIS)

Kamendje, R.; Kernbichler, W.; Heyn, M.F.; Kasilov, S.V.; Poli, E.

2003-01-01

A Monte Carlo method for evaluating the electron distribution function, which takes into account realistic orbits of electrons during their nonlinear cyclotron interaction with the wave beam, has been proposed. The focus there was on a proper description of particle interaction with a wave beam, while the geometry of the main magnetic field outside the beam was the simplest possible (slab model). In the present work, a more realistic tokamak geometry has been implemented in the model. In addition, an expression for the parallel current density through Green's function has been used. This makes it possible to reduce the statistical errors which result from the fact that the current generated by particles with $v_\parallel > 0$ is almost compensated by the current resulting from particles with $v_\parallel < 0$ if the complete distribution function is taken into account in the expression for the current. The code ECNL, which is a Monte Carlo kinetic equation solver based on this model, has been coupled with the beam tracing code TORBEAM. The results of nonlinear modeling of ECCD in a tokamak with ASDEX Upgrade parameters with the help of this combination of codes are compared below to the results of linear modeling performed with TORBEAM alone. In addition, implications for stellarators are discussed. (orig.)
A further extension of the Extended Parallel Process Model (E-EPPM): implications of cognitive appraisal theory of emotion and dispositional coping style.

Science.gov (United States)

So, Jiyeon

2013-01-01

For two decades, the extended parallel process model (EPPM; Witte, 1992) has been one of the most widely used theoretical frameworks in health risk communication. The model has gained much popularity because it recognizes that, ironically, preceding fear appeal models do not incorporate the concept of fear as a legitimate and central part of them. As a remedy to this situation, the EPPM aims at "putting the fear back into fear appeals" (Witte, 1992, p. 330). Despite this attempt, however, this article argues that the EPPM still does not fully capture the essence of fear as an emotion. Specifically, drawing upon Lazarus's (1991) cognitive appraisal theory of emotion and the concept of dispositional coping style (Miller, 1995), this article seeks to further extend the EPPM. The revised EPPM incorporates a more comprehensive perspective on risk perceptions as a construct involving both cognitive and affective aspects (i.e., fear and anxiety) and integrates the concept of monitoring and blunting coping style as a moderator of further information seeking regarding a given risk topic.
User's guide to the western spruce budworm modeling system

Science.gov (United States)

Nicholas L. Crookston; J. J. Colbert; Paul W. Thomas; Katharine A. Sheehan; William P. Kemp

1990-01-01

The Budworm Modeling System is a set of four computer programs: The Budworm Dynamics Model, the Prognosis-Budworm Dynamics Model, the Prognosis-Budworm Damage Model, and the Parallel Processing-Budworm Dynamics Model. Input to the first three programs and the output produced are described in this guide. A guide to the fourth program will be published separately....
MARMOT update for oxide fuel modeling

Energy Technology Data Exchange (ETDEWEB)

Zhang, Yongfeng [Idaho National Lab. (INL), Idaho Falls, ID (United States); Schwen, Daniel [Idaho National Lab. (INL), Idaho Falls, ID (United States); Chakraborty, Pritam [Idaho National Lab. (INL), Idaho Falls, ID (United States); Jiang, Chao [Idaho National Lab. (INL), Idaho Falls, ID (United States); Aagesen, Larry [Idaho National Lab. (INL), Idaho Falls, ID (United States); Ahmed, Karim [Idaho National Lab. (INL), Idaho Falls, ID (United States); Jiang, Wen [Idaho National Lab. (INL), Idaho Falls, ID (United States); Biner, Bulent [Idaho National Lab. (INL), Idaho Falls, ID (United States); Bai, Xianming [Virginia Polytechnic Inst. and State Univ. (Virginia Tech), Blacksburg, VA (United States); Tonks, Michael [Pennsylvania State Univ., University Park, PA (United States); Millett, Paul [Univ. of Arkansas, Fayetteville, AR (United States)

2016-09-01

This report summarizes the lower-length-scale research and development progress in FY16 at Idaho National Laboratory in developing mechanistic materials models for oxide fuels, in parallel to the development of the MARMOT code, which will be summarized in a separate report. This effort is a critical component of the microstructure-based fuel performance modeling approach, supported by the Fuels Product Line in the Nuclear Energy Advanced Modeling and Simulation (NEAMS) program. The progress can be classified into three categories: 1) development of materials models to be used in engineering scale fuel performance modeling regarding the effect of lattice defects on thermal conductivity, 2) development of modeling capabilities for mesoscale fuel behaviors including stage-3 gas release, grain growth, high burn-up structure, fracture and creep, and 3) improved understanding in material science by calculating the anisotropic grain boundary energies in UO$_2$ and obtaining thermodynamic data for solid fission products. Many of these topics are still under active development. They are updated in the report with an appropriate level of detail. For some topics, separate reports are generated in parallel, as stated in the text. The accomplishments have led to a better understanding of fuel behaviors and enhanced the capability of the MOOSE-BISON-MARMOT toolkit.
Xyce Parallel Electronic Simulator Users' Guide Version 6.8

Energy Technology Data Exchange (ETDEWEB)

Keiter, Eric R. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Aadithya, Karthik Venkatraman [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Mei, Ting [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Russo, Thomas V. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Schiek, Richard L. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Sholander, Peter E. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Thornquist, Heidi K. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Verley, Jason C. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)

2017-10-01

This manual describes the use of the Xyce Parallel Electronic Simulator. Xyce has been designed as a SPICE-compatible, high-performance analog circuit simulator, and has been written to support the simulation needs of the Sandia National Laboratories electrical designers. This development has focused on improving capability over the current state-of-the-art in the following areas: Capability to solve extremely large circuit problems by supporting large-scale parallel computing platforms (up to thousands of processors). This includes support for most popular parallel and serial computers. A differential-algebraic-equation (DAE) formulation, which better isolates the device model package from solver algorithms. This allows one to develop new types of analysis without requiring the implementation of analysis-specific device models. Device models that are specifically tailored to meet Sandia's needs, including some radiation-aware devices (for Sandia users only). Object-oriented code design and implementation using modern coding practices. Xyce is a parallel code in the most general sense of the phrase (a message-passing parallel implementation), which allows it to run efficiently on a wide range of computing platforms. These include serial, shared-memory and distributed-memory parallel platforms. Attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved as the number of processors grows.
A parallel process model of the development of positive smoking expectancies and smoking behavior during early adolescence in Caucasian and African American girls

OpenAIRE

Chung, Tammy; White, Helene R.; Hipwell, Alison E.; Stepp, Stephanie D.; Loeber, Rolf

2010-01-01

This study examined the development of positive smoking expectancies and smoking behavior in an urban cohort of girls followed annually over ages 11-14. Longitudinal data from the oldest cohort of the Pittsburgh Girls Study (N=566, 56% African American, 44% Caucasian) were used to estimate a parallel process growth model of positive smoking expectancies and smoking behavior. Average level of positive smoking expectancies was relatively stable over ages 11-14, although there was significant va...
High performance statistical computing with parallel R: applications to biology and climate modelling

International Nuclear Information System (INIS)

Samatova, Nagiza F; Branstetter, Marcia; Ganguly, Auroop R; Hettich, Robert; Khan, Shiraj; Kora, Guruprasad; Li, Jiangtian; Ma, Xiaosong; Pan, Chongle; Shoshani, Arie; Yoginath, Srikanth

2006-01-01

Ultrascale computing and high-throughput experimental technologies have enabled the production of scientific data about complex natural phenomena. With this opportunity comes a new problem: the massive quantities of data so produced. Answers to fundamental questions about the nature of those phenomena remain largely hidden in the produced data. The goal of this work is to provide a scalable, high-performance statistical data analysis framework to help scientists perform interactive analyses of these raw data to extract knowledge. Towards this goal we have been developing an open source parallel statistical analysis package, called Parallel R, that lets scientists employ a wide range of statistical analysis routines on high-performance shared and distributed memory architectures without having to deal with the intricacies of parallelizing these routines.
Parallel 3-D method of characteristics in MPACT

International Nuclear Information System (INIS)

Kochunas, B.; Downar, T. J.; Liu, Z.

2013-01-01

A new parallel 3-D MOC kernel has been developed and implemented in MPACT which makes use of the modular ray tracing technique to reduce computational requirements and to facilitate parallel decomposition. The parallel model makes use of both distributed and shared memory parallelism, which are implemented with the MPI and OpenMP standards, respectively. The kernel is capable of parallel decomposition of problems in space, angle, and by characteristic rays up to O(10^4) processors. Initial verification of the parallel 3-D MOC kernel was performed using the Takeda 3-D transport benchmark problems. The eigenvalues computed by MPACT are within the statistical uncertainty of the benchmark reference and agree well with the averages of other participants. The MPACT k_eff differs from the benchmark results for rodded and un-rodded cases by 11 and -40 pcm, respectively. The calculations were performed for various numbers of processors and parallel decompositions up to 15625 processors, all producing the same result at convergence. The parallel efficiency of the worst case was 60%, while very good efficiency (>95%) was observed for cases using 500 processors. The overall run time was 231 seconds for the 500-processor case and 19 seconds for the case with 15625 processors. Ongoing work is focused on developing theoretical performance models and the implementation of acceleration techniques to minimize the number of iterations to converge. (authors)
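The two run times quoted above allow a rough strong-scaling check. The sketch below computes the speedup and the relative efficiency of the 15625-processor run with respect to the 500-processor run. The function name is illustrative, and the abstract does not state which baseline its own efficiency figures use, so the value computed here is not directly comparable to the quoted 60% and >95%.

```python
def relative_efficiency(t_base, p_base, t_scaled, p_scaled):
    """Speedup and strong-scaling efficiency of a larger run relative to a base run."""
    speedup = t_base / t_scaled      # how much faster the larger run finished
    ideal = p_scaled / p_base        # speedup expected under perfect scaling
    return speedup, speedup / ideal


# Run times and processor counts as reported in the abstract.
speedup, efficiency = relative_efficiency(231.0, 500, 19.0, 15625)
print(f"speedup = {speedup:.2f}, relative efficiency = {efficiency:.1%}")
```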
Interpretable decision-tree induction in a big data parallel framework

Directory of Open Access Journals (Sweden)

Weinberg Abraham Itzhak

2017-12-01

When running data-mining algorithms on big data platforms, a parallel, distributed framework, such as MapReduce, may be used. However, in a parallel framework, each individual model fits the data allocated to its own computing node without necessarily fitting the entire dataset. In order to induce a single consistent model, ensemble algorithms, such as majority voting, aggregate the local models rather than analyzing the entire dataset directly. Our goal is to develop an efficient algorithm for choosing one representative model from multiple, locally induced decision-tree models. The proposed SySM (syntactic similarity method) algorithm computes the similarity between the models produced by parallel nodes and chooses the model which is most similar to the others as the best representative of the entire dataset. In 18.75% of 48 experiments on four big datasets, SySM accuracy is significantly higher than that of the ensemble; in about 43.75% of the experiments, SySM accuracy is significantly lower; in one case, the results are identical; and in the remaining 35.41% of cases the difference is not statistically significant. Compared with ensemble methods, the representative tree models selected by the proposed methodology are more compact and interpretable, their induction consumes less memory, and, as confirmed by the empirical results, they allow faster classification of new records.
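The selection step described above (pick the model most similar to the others) can be sketched as a medoid choice over a pairwise similarity matrix. This is only an illustration: the abstract does not define the syntactic similarity measure, so the matrix is assumed to be precomputed, and `most_representative` and the sample values are hypothetical.

```python
def most_representative(similarity):
    """Return the index of the model with the highest mean similarity to the others.

    similarity[i][j] is an (assumed precomputed) similarity between models
    i and j; the diagonal is ignored.
    """
    n = len(similarity)
    if n < 2:
        raise ValueError("need at least two models to compare")

    def mean_to_others(i):
        return sum(similarity[i][j] for j in range(n) if j != i) / (n - 1)

    return max(range(n), key=mean_to_others)


# Hypothetical similarities between three locally induced trees.
sim = [
    [1.0, 0.8, 0.6],
    [0.8, 1.0, 0.7],
    [0.6, 0.7, 1.0],
]
best = most_representative(sim)  # index of the chosen representative tree
```

Unlike an ensemble, this returns a single existing tree, which is what makes the result compact and directly interpretable.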
Parallel genetic algorithms with migration for the hybrid flow shop scheduling problem

Directory of Open Access Journals (Sweden)

K. Belkadi

2006-01-01

This paper addresses scheduling problems in hybrid flow shop-like systems with a migration parallel genetic algorithm (PGA_MIG). This parallel genetic algorithm model allows genetic diversity by the application of selection and reproduction mechanisms nearer to nature. The space structure of the population is modified by dividing it into disjoint subpopulations. From time to time, individuals are exchanged between the different subpopulations (migration). Influence of parameters and dedicated strategies are studied. These parameters are the number of independent subpopulations, the interconnection topology between subpopulations, the choice/replacement strategy of the migrant individuals, and the migration frequency. A comparison between the sequential and parallel versions of the genetic algorithm (GA) is provided. This comparison relates to the quality of the solution and the execution time of the two versions. The efficiency of the parallel model highly depends on the parameters and especially on the migration frequency. In the same way, this parallel model gives a significant improvement of computational time if it is implemented on a parallel architecture which offers an acceptable number of processors (as many processors as subpopulations).
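The parameters listed above (number of subpopulations, topology, replacement strategy, migration frequency) map directly onto an island-model loop. The following is a minimal sketch, not the PGA_MIG implementation: it assumes a toy bit-counting objective instead of a flow shop schedule, a ring topology, best-replaces-worst migration, and it runs the islands one after another, whereas a parallel version would place each island on its own processor.

```python
import random


def evolve(pop, rng, fitness):
    """One generation: tournament selection, one-point crossover, bit-flip mutation."""
    def pick():
        a, b = rng.sample(pop, 2)
        return a if fitness(a) >= fitness(b) else b

    nxt = [max(pop, key=fitness)]              # elitism keeps the island's best
    while len(nxt) < len(pop):
        p1, p2 = pick(), pick()
        cut = rng.randrange(1, len(p1))
        child = p1[:cut] + p2[cut:]
        if rng.random() < 0.2:                 # occasional single-bit mutation
            i = rng.randrange(len(child))
            child[i] = 1 - child[i]
        nxt.append(child)
    return nxt


def migrate(islands, fitness):
    """Ring topology: each island's best replaces the worst of the next island."""
    bests = [max(isl, key=fitness) for isl in islands]
    for k, isl in enumerate(islands):
        worst = min(range(len(isl)), key=lambda j: fitness(isl[j]))
        isl[worst] = list(bests[(k - 1) % len(islands)])


def island_ga(n_islands=4, pop_size=10, n_bits=20, generations=40, interval=5, seed=0):
    rng = random.Random(seed)
    fitness = sum                              # toy objective: number of 1 bits
    islands = [[[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
               for _ in range(n_islands)]
    for g in range(1, generations + 1):
        islands = [evolve(isl, rng, fitness) for isl in islands]
        if g % interval == 0:                  # migration frequency
            migrate(islands, fitness)
    return max((ind for isl in islands for ind in isl), key=fitness)


best = island_ga()
```

Varying `interval` is the simplest way to explore the dependence on migration frequency that the abstract emphasizes.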
Detailed Performance Model for Photovoltaic Systems: Preprint

Energy Technology Data Exchange (ETDEWEB)

Tian, H.; Mancilla-David, F.; Ellis, K.; Muljadi, E.; Jenkins, P.

2012-07-01

This paper presents a modified current-voltage relationship for the single diode model. The single-diode model has been derived from the well-known equivalent circuit for a single photovoltaic cell. The modification presented in this paper accounts for both parallel and series connections in an array.
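As an illustration of how series and parallel connections enter the single-diode relation, the derivation below scales the standard single-cell equation to an array of identical cells. The symbols ($I_{ph}$, $I_0$, $R_s$, $R_p$, $n$, $V_t$, $N_s$, $N_p$) are assumptions introduced here, since the abstract gives no notation, and the result is the textbook scaling rather than necessarily the exact form used in the paper.

```latex
% Single cell (cell voltage V_c, cell current I_c):
I_c = I_{ph} - I_0\left[\exp\!\left(\frac{V_c + I_c R_s}{n V_t}\right) - 1\right]
      - \frac{V_c + I_c R_s}{R_p}

% Array of identical cells, N_s in series and N_p strings in parallel:
% V = N_s V_c, \quad I = N_p I_c. Substituting and multiplying by N_p gives
I = N_p I_{ph}
    - N_p I_0\left[\exp\!\left(\frac{V + I\,\frac{N_s}{N_p} R_s}{N_s\, n V_t}\right) - 1\right]
    - \frac{V + I\,\frac{N_s}{N_p} R_s}{\frac{N_s}{N_p} R_p}
```

Setting $N_s = N_p = 1$ recovers the single-cell equation, which is a quick consistency check on the scaling.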
Simulating spin models on GPU

Science.gov (United States)

Weigel, Martin

2011-09-01

Over the last couple of years it has been realized that the vast computational power of graphics processing units (GPUs) could be harvested for purposes other than the video game industry. This power, which at least nominally exceeds that of current CPUs by large factors, results from the relative simplicity of the GPU architectures as compared to CPUs, combined with a large number of parallel processing units on a single chip. To benefit from this setup for general computing purposes, the problems at hand need to be prepared in a way to profit from the inherent parallelism and hierarchical structure of memory accesses. In this contribution I discuss the performance potential for simulating spin models, such as the Ising model, on GPU as compared to conventional simulations on CPU.
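One common way to expose the parallelism mentioned above for the Ising model is a checkerboard decomposition: sites of one colour only interact with sites of the other colour, so all sites of a colour can be updated at the same time. The sketch below is a plain-Python illustration of that idea under stated assumptions (2-D square lattice, coupling J = 1, periodic boundaries, Metropolis updates); it is not taken from the paper, and on a GPU each same-colour site would be handled by its own thread instead of the inner loops.

```python
import math
import random


def checkerboard_sweep(spins, beta, rng):
    """One Metropolis sweep of a 2-D Ising lattice, one colour at a time."""
    n = len(spins)  # lattice side; must be even so the colouring wraps consistently
    for colour in (0, 1):
        # All sites visited in this pass depend only on the other colour,
        # so these updates are mutually independent.
        for i in range(n):
            for j in range(n):
                if (i + j) % 2 != colour:
                    continue
                nb = (spins[(i - 1) % n][j] + spins[(i + 1) % n][j] +
                      spins[i][(j - 1) % n] + spins[i][(j + 1) % n])
                delta_e = 2 * spins[i][j] * nb   # energy change if this spin flips
                if delta_e <= 0 or rng.random() < math.exp(-beta * delta_e):
                    spins[i][j] = -spins[i][j]
    return spins


rng = random.Random(0)
spins = [[1] * 16 for _ in range(16)]            # ordered start
for _ in range(50):
    checkerboard_sweep(spins, beta=1.0, rng=rng)
magnetization = abs(sum(map(sum, spins))) / (16 * 16)
```

At `beta = 1.0`, well inside the ordered phase, the lattice started from all spins up should remain almost fully magnetized.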
CRLH Transmission Lines for Telecommunications: Fast and Effective Modeling

Directory of Open Access Journals (Sweden)

Juanjuan Gao

2017-01-01

A different parameter extraction approach based on zero immittances for the composite right/left-handed (CRLH) structure is presented. For the lossless unit cell equivalent circuit model, the LC parameters of the series and parallel branches are extracted according to the series resonance frequency and the parallel resonance frequency, respectively. This approach can be applied to symmetric and unbalanced CRLH structures. The parameter extraction procedure is provided and validated by a T-type unit cell model. The responses of the distributed prototype and the extracted equivalent LC circuit model are in good agreement. The equivalent circuit modeling can improve the degree of freedom in CRLH TL design. This parameter extraction method provides an effective and straightforward way in CRLH metamaterials design and applications in telecommunication systems.
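The link between branch resonances and LC values can be illustrated with the usual lossless CRLH unit cell, where the series branch is L_R in series with C_L and the shunt branch is C_R in parallel with L_L. The sketch below only shows these standard resonance relations and their inversion; it is not the paper's zero-immittance procedure, and the function names and component values are hypothetical.

```python
import math


def resonance_hz(inductance_h, capacitance_f):
    """Resonance frequency of an ideal LC pair."""
    return 1.0 / (2.0 * math.pi * math.sqrt(inductance_h * capacitance_f))


def capacitance_from_resonance(f_hz, inductance_h):
    """Invert the resonance relation: recover C once f and L are known."""
    return 1.0 / ((2.0 * math.pi * f_hz) ** 2 * inductance_h)


# Hypothetical unit-cell values (henries and farads).
L_R, C_L = 2.5e-9, 1.0e-12   # series branch
L_L, C_R = 2.0e-9, 1.0e-12   # shunt (parallel) branch

f_se = resonance_hz(L_R, C_L)   # series resonance
f_sh = resonance_hz(L_L, C_R)   # shunt resonance
balanced = math.isclose(f_se, f_sh, rel_tol=1e-6)  # balanced cell: f_se == f_sh
```

With these values the two resonances differ, so the cell is unbalanced, which is one of the cases the abstract says the approach covers.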
Parallelizing an electron transport Monte Carlo simulator (MOCASIN 2.0)

International Nuclear Information System (INIS)

Schwetman, H.; Burdick, S.

1988-01-01

Electron transport simulators are tools for studying electrical properties of semiconducting materials and devices. As demands for modeling more complex devices and new materials have emerged, so have demands for more processing power. This paper documents a project to convert an electron transport simulator (MOCASIN 2.0) to a parallel processing environment. In addition to describing the conversion, the paper presents PPL, a parallel programming version of C running on a Sequent multiprocessor system. In timing tests, models that simulated the movement of 2,000 particles for 100 time steps were executed on ten processors, with a parallel efficiency of over 97%
Parallel paving: An algorithm for generating distributed, adaptive, all-quadrilateral meshes on parallel computers

Energy Technology Data Exchange (ETDEWEB)

Lober, R.R.; Tautges, T.J.; Vaughan, C.T.

1997-03-01

Paving is an automated mesh generation algorithm which produces all-quadrilateral elements. It can additionally generate these elements in varying sizes such that the resulting mesh adapts to a function distribution, such as an error function. While powerful, conventional paving is a very serial algorithm in its operation. Parallel paving is the extension of serial paving into parallel environments to perform the same meshing functions as conventional paving only on distributed, discretized models. This extension allows large, adaptive, parallel finite element simulations to take advantage of paving's meshing capabilities for h-remap remeshing. A significantly modified version of the CUBIT mesh generation code has been developed to host the parallel paving algorithm and demonstrate its capabilities on both two dimensional and three dimensional surface geometries and compare the resulting parallel produced meshes to conventionally paved meshes for mesh quality and algorithm performance. Sandia's "tiling" dynamic load balancing code has also been extended to work with the paving algorithm to retain parallel efficiency as subdomains undergo iterative mesh refinement.
Parallel Optimization of an Earth System Model (100 Gigaflops and Beyond?)

Science.gov (United States)

Drummond, L. A.; Farrara, J. D.; Mechoso, C. R.; Spahr, J. A.; Chao, Y.; Katz, S.; Lou, J. Z.; Wang, P.

1997-01-01

We are developing an Earth System Model (ESM) to be used in research aimed to better understand the interactions between the components of the Earth System and to eventually predict their variations. Currently, our ESM includes models of the atmosphere, oceans and the important chemical tracers therein.
Event monitoring of parallel computations

Directory of Open Access Journals (Sweden)

Gruzlikov Alexander M.

2015-06-01

The paper considers the monitoring of parallel computations for detection of abnormal events. It is assumed that computations are organized according to an event model, and monitoring is based on specific test sequences.

Radiative Heat Transfer in Combustion Applications: Parallel Efficiencies of Two Gas Models, Turbulent Radiation Interactions in Particulate Laden Flows, and Coarse Mesh Finite Difference Acceleration for Improved Temporal Accuracy

Science.gov (United States)

Cleveland, Mathew A.

We investigate several aspects of the numerical solution of the radiative transfer equation in the context of coal combustion: the parallel efficiency of two commonly-used opacity models, the sensitivity of turbulent radiation interaction (TRI) effects to the presence of coal particulate, and an improvement of the order of temporal convergence using the coarse mesh finite difference (CMFD) method. There are four opacity models commonly employed to evaluate the radiative transfer equation in combustion applications: line-by-line (LBL), multigroup, band, and global. Most of these models have been rigorously evaluated for serial computations of a spectrum of problem types [1]. Studies of these models for parallel computations [2] are limited. We assessed the performance of the Spectral-Line-Based weighted sum of gray gases (SLW) model, a global method related to K-distribution methods [1], and the LBL model. The LBL model directly interpolates opacity information from large data tables. The LBL model outperforms the SLW model in almost all cases, as suggested by Wang et al. [3]. The SLW model, however, shows superior parallel scaling performance and a decreased sensitivity to load imbalancing, suggesting that for some problems, global methods such as the SLW model could outperform the LBL model. Turbulent radiation interaction (TRI) effects are associated with the differences in the time scales of the fluid dynamic equations and the radiative transfer equations. Solving on the fluid dynamic time step size produces large changes in the radiation field over the time step. We have modified the statistically homogeneous, non-premixed flame problem of Deshmukh et al. [4] to include coal-type particulate. The addition of low mass loadings of particulate minimally impacts the TRI effects. Observed differences in the TRI effects from variations in the packing fractions and Stokes numbers are difficult to analyze because of the significant effect of variations in problem
Parallel Development of Products and New Business Models

OpenAIRE

Lund, Morten; Hansen, Poul H. Kyvsgård

2014-01-01

The perception of product development and the practical execution of product development in professional organizations have undergone dramatic changes in recent years. Many of these changes relate to the introduction of broader and more cross-disciplinary views that involve new organizational functions and new concepts. These changes can be captured in various generations of practice. This paper will discuss the recent development of 3rd generation product development process models and the emer...
Unified dataflow model for the analysis of data and pipeline parallelism, and buffer sizing

NARCIS (Netherlands)

Hausmans, J.P.H.M.; Geuns, S.J.; Wiggers, M.H.; Bekooij, Marco Jan Gerrit

2014-01-01

Real-time stream processing applications such as software defined radios are usually executed concurrently on multiprocessor systems. Exploiting coarse-grained data parallelism by duplicating tasks is often required, besides pipeline parallelism, to meet the temporal constraints of the applications.
Takagi-Sugeno's fuzzy models

Directory of Open Access Journals (Sweden)

Yann Blanco

2001-01-01

This paper outlines a methodology to study the stability of Takagi-Sugeno's (TS) fuzzy models. The stability analysis of the TS model is performed using a quadratic Liapunov candidate function. This paper proposes a relaxation of Tanaka's stability condition: unlike related works, the equations to be solved are not Liapunov equations for each rule matrix, but a convex combination of them. The coefficients of this sum depend on the membership functions. This method is applied to the design of continuous controllers for the TS model. Three different control structures are investigated, among which is the Parallel Distributed Compensation (PDC). An application to the inverted pendulum is proposed here.
PV panel model based on datasheet values

DEFF Research Database (Denmark)

Sera, Dezso; Teodorescu, Remus; Rodriguez, Pedro

2007-01-01

This work presents the construction of a model for a PV panel using the single-diode five-parameters model, based exclusively on data-sheet parameters. The model takes into account the series and parallel (shunt) resistance of the panel. The equivalent circuit and the basic equations of the PV cell....... Based on these equations, a PV panel model, which is able to predict the panel behavior in different temperature and irradiance conditions, is built and tested....
STEREOMETRIC MODELLING

Directory of Open Access Journals (Sweden)

P. Grimaldi

2012-07-01

Stereometric modelling means modelling achieved with: the use of a pair of virtual cameras, with parallel axes and positioned at a mutual distance averaging 1/10 of the camera-object distance (in practice, the realization and use of a stereometric camera in the modelling program); the visualization of the shot in two distinct windows; and the stereoscopic viewing of the shot while modelling. Since the term "3D vision" is inaccurately used for the simple perspective of an object, the word stereo must be added, so that "3D stereo vision" stands for "three-dimensional view" and therefore allows the width, height and depth of the surveyed image to be measured. Thanks to the development of a stereometric model, either real or virtual, through the "materialization", either real or virtual, of the optical-stereometric model made visible with a stereoscope, continuous online updating of the cultural heritage is feasible with the help of photogrammetry and stereometric modelling. The catalogue of the Architectonic Photogrammetry Laboratory of Politecnico di Bari is available on line at: http://rappresentazione.stereofot.it:591/StereoFot/FMPro?-db=StereoFot.fp5&-lay=Scheda&-format=cerca.htm&-view
Xyce™ Parallel Electronic Simulator Users' Guide, Version 6.5.

Energy Technology Data Exchange (ETDEWEB)

Keiter, Eric R. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States). Electrical Models and Simulation; Aadithya, Karthik V. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States). Electrical Models and Simulation; Mei, Ting [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States). Electrical Models and Simulation; Russo, Thomas V. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States). Electrical Models and Simulation; Schiek, Richard L. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States). Electrical Models and Simulation; Sholander, Peter E. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States). Electrical Models and Simulation; Thornquist, Heidi K. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States). Electrical Models and Simulation; Verley, Jason C. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States). Electrical Models and Simulation

2016-06-01

This manual describes the use of the Xyce Parallel Electronic Simulator. Xyce has been designed as a SPICE-compatible, high-performance analog circuit simulator, and has been written to support the simulation needs of the Sandia National Laboratories electrical designers. This development has focused on improving capability over the current state-of-the-art in the following areas: Capability to solve extremely large circuit problems by supporting large-scale parallel computing platforms (up to thousands of processors). This includes support for most popular parallel and serial computers. A differential-algebraic-equation (DAE) formulation, which better isolates the device model package from solver algorithms. This allows one to develop new types of analysis without requiring the implementation of analysis-specific device models. Device models that are specifically tailored to meet Sandia's needs, including some radiation-aware devices (for Sandia users only). Object-oriented code design and implementation using modern coding practices. Xyce is a parallel code in the most general sense of the phrase (a message-passing parallel implementation), which allows it to run efficiently on a wide range of computing platforms. These include serial, shared-memory and distributed-memory parallel platforms. Attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved as the number of processors grows. The information herein is subject to change without notice. Copyright © 2002-2016 Sandia Corporation. All rights reserved.
Parallel computing simulation of fluid flow in the unsaturated zone of Yucca Mountain, Nevada

International Nuclear Information System (INIS)

Zhang, Keni; Wu, Yu-Shu; Bodvarsson, G.S.

2001-01-01

This paper presents the application of parallel computing techniques to large-scale modeling of fluid flow in the unsaturated zone (UZ) at Yucca Mountain, Nevada. In this study, parallel computing techniques, as implemented in the TOUGH2 code, are applied in large-scale numerical simulations on a distributed-memory parallel computer. The modeling study has been conducted using an over-one-million-cell three-dimensional numerical model, which incorporates a wide variety of field data for the highly heterogeneous fractured formation at Yucca Mountain. The objective of this study is to analyze the impact of various surface infiltration scenarios (under current and possible future climates) on flow through the UZ system, using various hydrogeological conceptual models with refined grids. The results indicate that the one-million-cell models produce better-resolved results and reveal some flow patterns that cannot be obtained using coarse-grid models.
Parallelized preconditioned model building algorithm for matrix factorization

OpenAIRE

Kaya, Kamer; Birbil, İlker; Öztürk, Mehmet Kaan; Gohari, Amir

2017-01-01

Matrix factorization is a common task underlying several machine learning applications such as recommender systems, topic modeling, or compressed sensing. Given a large and possibly sparse matrix A, we seek two smaller matrices W and H such that their product is as close to A as possible. The objective is minimizing the sum of square errors in the approximation. Typically such problems involve hundreds of thousands of unknowns, so an optimizer must be exceptionally efficient. In this study, a...
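The objective stated above, minimizing the sum of squared errors between A and the product WH, can be made concrete with a small example. The sketch below uses plain gradient descent on a tiny dense matrix; it is a stand-in chosen for brevity and is not the preconditioned model building algorithm of the paper. The function names, step size and iteration count are illustrative assumptions.

```python
import random


def matmul(X, Y):
    """Dense matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def loss(A, W, H):
    """Sum of squared errors between A and the product W H."""
    P = matmul(W, H)
    return sum((a - p) ** 2 for ra, rp in zip(A, P) for a, p in zip(ra, rp))


def factorize(A, k, steps=500, lr=0.01, seed=0):
    rng = random.Random(seed)
    m, n = len(A), len(A[0])
    W = [[rng.uniform(0.1, 1.0) for _ in range(k)] for _ in range(m)]
    H = [[rng.uniform(0.1, 1.0) for _ in range(n)] for _ in range(k)]
    start = (W, H)
    for _ in range(steps):
        P = matmul(W, H)
        E = [[a - p for a, p in zip(ra, rp)] for ra, rp in zip(A, P)]  # residual A - WH
        dW = matmul(E, [list(c) for c in zip(*H)])   # E H^T, i.e. -1/2 of dLoss/dW
        dH = matmul([list(c) for c in zip(*W)], E)   # W^T E, i.e. -1/2 of dLoss/dH
        W = [[w + 2 * lr * g for w, g in zip(rw, rg)] for rw, rg in zip(W, dW)]
        H = [[h + 2 * lr * g for h, g in zip(rh, rg)] for rh, rg in zip(H, dH)]
    return start, (W, H)


A = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]   # rank-1 matrix, so k = 1 can fit it exactly
(W0, H0), (W, H) = factorize(A, k=1)
initial_loss, final_loss = loss(A, W0, H0), loss(A, W, H)
```

For the problem sizes mentioned in the abstract, a fixed small step like this would be far too slow, which is the motivation for more efficient optimizers.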
Downscaling GISS ModelE Boreal Summer Climate over Africa

Science.gov (United States)

Druyan, Leonard M.; Fulakeza, Matthew

2015-01-01

The study examines the perceived added value of downscaling atmosphere-ocean global climate model simulations over Africa and adjacent oceans by a nested regional climate model. NASA/Goddard Institute for Space Studies (GISS) coupled ModelE simulations for June-September 1998-2002 are used to form lateral boundary conditions for synchronous simulations by the GISS RM3 regional climate model. The ModelE computational grid spacing is 2° latitude by 2.5° longitude and the RM3 grid spacing is 0.44°. ModelE precipitation climatology for June-September 1998-2002 is shown to be a good proxy for 30-year means so results based on the 5-year sample are presumed to be generally representative. Comparison with observational evidence shows several discrepancies in ModelE configuration of the boreal summer inter-tropical convergence zone (ITCZ). One glaring shortcoming is that ModelE simulations do not advance the West African rain band northward during the summer to represent monsoon precipitation onset over the Sahel. Results for 1998-2002 show that onset simulation is an important added value produced by downscaling with RM3. ModelE Eastern South Atlantic Ocean computed sea-surface temperatures (SST) are some 4 K warmer than reanalysis, contributing to large positive biases in overlying surface air temperatures (Tsfc). ModelE Tsfc are also too warm over most of Africa. RM3 downscaling somewhat mitigates the magnitude of Tsfc biases over the African continent, it eliminates the ModelE double ITCZ over the Atlantic and it produces more realistic orographic precipitation maxima. Parallel ModelE and RM3 simulations with observed SST forcing (in place of the predicted ocean) lower Tsfc errors but have mixed impacts on circulation and precipitation biases. Downscaling improvements of the meridional movement of the rain band over West Africa and the configuration of orographic precipitation maxima are realized irrespective of the SST biases.
Thermodynamic Model of Spatial Memory

Science.gov (United States)

Kaufman, Miron; Allen, P.

1998-03-01

We develop and test a thermodynamic model of spatial memory. Our model is an application of statistical thermodynamics to cognitive science. It is related to applications of the statistical mechanics framework in parallel distributed processes research. Our macroscopic model allows us to evaluate an entropy associated with spatial memory tasks. We find that older adults exhibit higher levels of entropy than younger adults. Thurstone's Law of Categorical Judgment, according to which the discriminal processes along the psychological continuum produced by presentations of a single stimulus are normally distributed, is explained by using a Hooke spring model of spatial memory. We have also analyzed a nonlinear modification of the ideal spring model of spatial memory. This work is supported by NIH/NIA grant AG09282-06.
The Research of the Parallel Computing Development from the Angle of Cloud Computing

Science.gov (United States)

Peng, Zhensheng; Gong, Qingge; Duan, Yanyu; Wang, Yun

2017-10-01

Cloud computing is a development of parallel computing, distributed computing and grid computing. The development of cloud computing makes parallel computing come into people’s lives. Firstly, this paper expounds the concept of cloud computing and introduces several traditional parallel programming models. Secondly, it analyzes and studies the principles, advantages and disadvantages of OpenMP, MPI and MapReduce respectively. Finally, it compares the MPI and OpenMP models with MapReduce from the angle of cloud computing. The results of this paper are intended to provide a reference for the development of parallel computing.
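Of the three models compared above, MapReduce is the one whose structure is least like explicit message passing or shared-memory threading, so a small illustration may help. The sketch below runs the map, shuffle and reduce phases of a word count sequentially in one process; the function names and sample documents are illustrative, and a real framework would distribute the map and reduce calls across nodes.

```python
from collections import defaultdict


def map_phase(document):
    """Emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group all emitted values by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Combine the values of each key; here, sum the counts."""
    return {key: sum(values) for key, values in groups.items()}


documents = ["parallel computing", "cloud computing", "parallel models"]
pairs = [pair for doc in documents for pair in map_phase(doc)]  # independent per document
counts = reduce_phase(shuffle(pairs))
```

The programmer writes only the map and reduce functions; partitioning, communication and fault handling are left to the framework, which is the main contrast with MPI and OpenMP.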
Three-dimensional all-speed CFD code for safety analysis of nuclear reactor containment: Status of GASFLOW parallelization, model development, validation and application

Energy Technology Data Exchange (ETDEWEB)

Xiao, Jianjun, E-mail: jianjun.xiao@kit.edu [Institute of Nuclear and Energy Technologies, Karlsruhe Institute of Technology, P.O. Box 3640, 76021 Karlsruhe (Germany); Travis, John R., E-mail: jack_travis@comcast.com [Engineering and Scientific Software Inc., 3010 Old Pecos Trail, Santa Fe, NM 87505 (United States); Royl, Peter, E-mail: peter.royl@partner.kit.edu [Institute of Nuclear and Energy Technologies, Karlsruhe Institute of Technology, P.O. Box 3640, 76021 Karlsruhe (Germany); Necker, Gottfried, E-mail: gottfried.necker@partner.kit.edu [Institute of Nuclear and Energy Technologies, Karlsruhe Institute of Technology, P.O. Box 3640, 76021 Karlsruhe (Germany); Svishchev, Anatoly, E-mail: anatoly.svishchev@kit.edu [Institute of Nuclear and Energy Technologies, Karlsruhe Institute of Technology, P.O. Box 3640, 76021 Karlsruhe (Germany); Jordan, Thomas, E-mail: thomas.jordan@kit.edu [Institute of Nuclear and Energy Technologies, Karlsruhe Institute of Technology, P.O. Box 3640, 76021 Karlsruhe (Germany)

2016-05-15

Highlights: • 3-D scalable semi-implicit pressure-based CFD code for containment safety analysis. • Robust solution algorithm valid for all-speed flows. • Well validated and widely used CFD code for hydrogen safety analysis. • Code applied in various types of nuclear reactor containments. • Parallelization enables high-fidelity models in large scale containment simulations. - Abstract: GASFLOW is a three dimensional semi-implicit all-speed CFD code which can be used to predict fluid dynamics, chemical kinetics, heat and mass transfer, aerosol transportation and other related phenomena involved in postulated accidents in nuclear reactor containments. The main purpose of the paper is to give a brief review on recent GASFLOW code development, validations and applications in the field of nuclear safety. GASFLOW code has been well validated by international experimental benchmarks, and has been widely applied to hydrogen safety analysis in various types of nuclear power plants in European and Asian countries, which have been summarized in this paper. Furthermore, four benchmark tests of a lid-driven cavity flow, low Mach number jet flow, 1-D shock tube and supersonic flow over a forward-facing step are presented in order to demonstrate the accuracy and wide-ranging capability of ICE’d ALE solution algorithm for all-speed flows. GASFLOW has been successfully parallelized using the paradigms of Message Passing Interface (MPI) and domain decomposition. The parallel version, GASFLOW-MPI, adds great value to large scale containment simulations by enabling high-fidelity models, including more geometric details and more complex physics. It will be helpful for the nuclear safety engineers to better understand the hydrogen safety related physical phenomena during the severe accident, to optimize the design of the hydrogen risk mitigation systems and to fulfill the licensing requirements by the nuclear regulatory authorities. GASFLOW-MPI is targeting a high
ICRF edge modeling studies

Energy Technology Data Exchange (ETDEWEB)

Lehrman, I.S. (Grumman Corp. Research Center, Princeton, NJ (USA)); Colestock, P.L. (Princeton Univ., NJ (USA). Plasma Physics Lab.)

1990-04-01

Theoretical models have been developed, and are currently being refined, to explain the edge plasma-antenna interaction that occurs during ICRF heating. The periodic structure of a Faraday shielded antenna is found to result in strong ponderomotive force in the vicinity of the antenna. A fluid model, which incorporates the ponderomotive force, shows an increase in transport to the Faraday shield. A kinetic model shows that the strong antenna near fields act to increase the energy of deuterons which strike the shield, thereby increasing the sputtering of shield material. Estimates of edge impurity harmonic heating show no significant heating for either in or out-of-phase antenna operation. Additionally, a particle model for electrons near the shield shows that heating results from the parallel electric field associated with the fast wave. A quasilinear model for edge electron heating is presented and compared to the particle calculations. The models' predictions are shown to be consistent with measurements of enhanced transport. (orig.).
Regional-scale calculation of the LS factor using parallel processing

Science.gov (United States)

Liu, Kai; Tang, Guoan; Jiang, Ling; Zhu, A.-Xing; Yang, Jianyi; Song, Xiaodong

2015-05-01

With the increase of data resolution and the increasing application of USLE over large areas, the existing serial implementation of algorithms for computing the LS factor is becoming a bottleneck. In this paper, a parallel processing model based on message passing interface (MPI) is presented for the calculation of the LS factor, so that massive datasets at a regional scale can be processed efficiently. The parallel model contains algorithms for calculating flow direction, flow accumulation, drainage network, slope, slope length and the LS factor. Depending on whether data dependence exists, the algorithms are divided into local algorithms and global algorithms. Parallel strategies are designed according to the characteristics of each algorithm, including a decomposition method for maintaining the integrity of the results, an optimized workflow for reducing the time taken to export unnecessary intermediate data, and a buffer-communication-computation strategy for improving communication efficiency. Experiments on a multi-node system show that the proposed parallel model allows efficient calculation of the LS factor at a regional scale with a massive dataset.
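The decomposition-with-buffer idea for local algorithms can be illustrated with a minimal sketch. It is not the authors' implementation: it runs serially instead of using MPI, and a 3x3 neighbourhood mean stands in for a local operator such as slope. The function names (`split_with_halo`, `process_strip`) and the one-row buffer width are assumptions made for illustration.

```python
def local_mean_3x3(grid, r, c):
    """Mean of the 3x3 neighbourhood around (r, c), clipped at the grid edge."""
    rows, cols = len(grid), len(grid[0])
    vals = [grid[i][j]
            for i in range(max(r - 1, 0), min(r + 2, rows))
            for j in range(max(c - 1, 0), min(c + 2, cols))]
    return sum(vals) / len(vals)


def serial(grid):
    """Reference result computed on the whole raster at once."""
    return [[local_mean_3x3(grid, r, c) for c in range(len(grid[0]))]
            for r in range(len(grid))]


def split_with_halo(n_rows, n_parts, halo=1):
    """Yield (own_start, own_end, buf_start, buf_end) row ranges per worker."""
    base, extra = divmod(n_rows, n_parts)
    start = 0
    for p in range(n_parts):
        end = start + base + (1 if p < extra else 0)
        yield start, end, max(start - halo, 0), min(end + halo, n_rows)
        start = end


def process_strip(grid, own_start, own_end, buf_start, buf_end):
    """Compute the rows a worker owns, using only its strip plus buffer rows."""
    strip = grid[buf_start:buf_end]   # the data a worker would receive
    offset = own_start - buf_start    # position of the first owned row in the strip
    return [[local_mean_3x3(strip, offset + i, c) for c in range(len(strip[0]))]
            for i in range(own_end - own_start)]


def decomposed(grid, n_parts):
    """Stitch the per-strip results back together in row order."""
    out = []
    for ranges in split_with_halo(len(grid), n_parts):
        out.extend(process_strip(grid, *ranges))
    return out


# Hypothetical 7x5 test raster.
grid = [[(r * 7 + c * 3) % 11 for c in range(5)] for r in range(7)]
same = decomposed(grid, 3) == serial(grid)
```

Because every strip carries one extra row on each interior boundary, the stitched result matches the serial one. Global algorithms such as flow accumulation depend on cells far outside any fixed buffer and would need a different strategy.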
Mathematical Methods and Algorithms of Mobile Parallel Computing on the Base of Multi-core Processors

Directory of Open Access Journals (Sweden)

Alexander B. Bakulev

2012-11-01

This article deals with mathematical models and algorithms that provide mobility of the parallel representation of sequential programs in a high-level language. It presents a formal model of process management in the operating environment, based on the proposed model of parallel program representation, which describes the computation process on multi-core processors.
Parallel kinematics type, kinematics, and optimal design

CERN Document Server

Liu, Xin-Jun

2014-01-01

Parallel Kinematics: Type, Kinematics, and Optimal Design presents the results of 15 years' research on parallel mechanisms and parallel kinematics machines. This book covers the systematic classification of parallel mechanisms (PMs) and provides a large number of mechanical architectures of PMs available for use in practical applications. It focuses on the kinematic design of parallel robots. One successful application of parallel mechanisms in the field of machine tools, also called parallel kinematics machines, has been the emerging trend in advanced machine tools. The book describes not only the main aspects and important topics in parallel kinematics, but also references novel concepts and approaches, i.e. type synthesis based on evolution, performance evaluation and optimization based on screw theory, singularity model taking into account motion and force transmissibility, and others. This book is intended for researchers, scientists, engineers and postgraduates or above with interes...
Application of parallel computing techniques to a large-scale reservoir simulation

International Nuclear Information System (INIS)

Zhang, Keni; Wu, Yu-Shu; Ding, Chris; Pruess, Karsten

2001-01-01

Even with the continual advances made in both computational algorithms and computer hardware used in reservoir modeling studies, large-scale simulation of fluid and heat flow in heterogeneous reservoirs remains a challenge. The problem commonly arises from intensive computational requirement for detailed modeling investigations of real-world reservoirs. This paper presents the application of a massive parallel-computing version of the TOUGH2 code developed for performing large-scale field simulations. As an application example, the parallelized TOUGH2 code is applied to develop a three-dimensional unsaturated-zone numerical model simulating flow of moisture, gas, and heat in the unsaturated zone of Yucca Mountain, Nevada, a potential repository for high-level radioactive waste. The modeling approach employs refined spatial discretization to represent the heterogeneous fractured tuffs of the system, using more than a million 3-D gridblocks. The problem of two-phase flow and heat transfer within the model domain leads to a total of 3,226,566 linear equations to be solved per Newton iteration. The simulation is conducted on a Cray T3E-900, a distributed-memory massively parallel computer. Simulation results indicate that the parallel computing technique, as implemented in the TOUGH2 code, is very efficient. The reliability and accuracy of the model results have been demonstrated by comparing them to those of small-scale (coarse-grid) models. These comparisons show that simulation results obtained with the refined grid provide more detailed predictions of the future flow conditions at the site, aiding in the assessment of proposed repository performance
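As a rough consistency check on the problem size quoted above, and assuming three balance equations per gridblock (one each for moisture, gas, and heat; the abstract does not state this count explicitly), the number of linear equations implies the number of gridblocks:

```latex
\[
N_{\text{blocks}} = \frac{N_{\text{eq}}}{3} = \frac{3{,}226{,}566}{3} = 1{,}075{,}522
\]
```

Under that assumption the result agrees with the stated "more than a million 3-D gridblocks".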
Xyce parallel electronic simulator users guide, version 6.0.

Energy Technology Data Exchange (ETDEWEB)

Keiter, Eric R; Mei, Ting; Russo, Thomas V.; Schiek, Richard Louis; Thornquist, Heidi K.; Verley, Jason C.; Fixel, Deborah A.; Coffey, Todd S; Pawlowski, Roger P; Warrender, Christina E.; Baur, David Gregory.

2013-08-01

This manual describes the use of the Xyce Parallel Electronic Simulator. Xyce has been designed as a SPICE-compatible, high-performance analog circuit simulator, and has been written to support the simulation needs of the Sandia National Laboratories electrical designers. This development has focused on improving capability over the current state-of-the-art in the following areas: Capability to solve extremely large circuit problems by supporting large-scale parallel computing platforms (up to thousands of processors). This includes support for most popular parallel and serial computers. A differential-algebraic-equation (DAE) formulation, which better isolates the device model package from solver algorithms. This allows one to develop new types of analysis without requiring the implementation of analysis-specific device models. Device models that are specifically tailored to meet Sandia's needs, including some radiation-aware devices (for Sandia users only). Object-oriented code design and implementation using modern coding practices. Xyce is a parallel code in the most general sense of the phrase (a message-passing parallel implementation), which allows it to run efficiently on a wide range of computing platforms. These include serial, shared-memory and distributed-memory parallel platforms. Attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved as the number of processors grows.
Xyce parallel electronic simulator users' guide, Version 6.0.1.

Energy Technology Data Exchange (ETDEWEB)

Keiter, Eric R; Mei, Ting; Russo, Thomas V.; Schiek, Richard Louis; Thornquist, Heidi K.; Verley, Jason C.; Fixel, Deborah A.; Coffey, Todd S; Pawlowski, Roger P; Warrender, Christina E.; Baur, David Gregory.

2014-01-01

This manual describes the use of the Xyce Parallel Electronic Simulator. Xyce has been designed as a SPICE-compatible, high-performance analog circuit simulator, and has been written to support the simulation needs of the Sandia National Laboratories electrical designers. This development has focused on improving capability over the current state-of-the-art in the following areas: Capability to solve extremely large circuit problems by supporting large-scale parallel computing platforms (up to thousands of processors). This includes support for most popular parallel and serial computers. A differential-algebraic-equation (DAE) formulation, which better isolates the device model package from solver algorithms. This allows one to develop new types of analysis without requiring the implementation of analysis-specific device models. Device models that are specifically tailored to meet Sandia's needs, including some radiation-aware devices (for Sandia users only). Object-oriented code design and implementation using modern coding practices. Xyce is a parallel code in the most general sense of the phrase (a message-passing parallel implementation), which allows it to run efficiently on a wide range of computing platforms. These include serial, shared-memory and distributed-memory parallel platforms. Attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved as the number of processors grows.

Xyce parallel electronic simulator users guide, version 6.1

Energy Technology Data Exchange (ETDEWEB)

Keiter, Eric R; Mei, Ting; Russo, Thomas V.; Schiek, Richard Louis; Sholander, Peter E.; Thornquist, Heidi K.; Verley, Jason C.; Baur, David Gregory

2014-03-01

This manual describes the use of the Xyce Parallel Electronic Simulator. Xyce has been designed as a SPICE-compatible, high-performance analog circuit simulator, and has been written to support the simulation needs of the Sandia National Laboratories electrical designers. This development has focused on improving capability over the current state-of-the-art in the following areas: capability to solve extremely large circuit problems by supporting large-scale parallel computing platforms (up to thousands of processors), which includes support for most popular parallel and serial computers; a differential-algebraic-equation (DAE) formulation, which better isolates the device model package from solver algorithms and allows one to develop new types of analysis without requiring the implementation of analysis-specific device models; device models that are specifically tailored to meet Sandia's needs, including some radiation-aware devices (for Sandia users only); and object-oriented code design and implementation using modern coding practices. Xyce is a parallel code in the most general sense of the phrase (a message-passing parallel implementation), which allows it to run efficiently on a wide range of computing platforms. These include serial, shared-memory and distributed-memory parallel platforms. Attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved as the number of processors grows.
Reliability-Based Optimization of Series Systems of Parallel Systems

DEFF Research Database (Denmark)

Enevoldsen, I.; Sørensen, John Dalsgaard

1993-01-01

Reliability-based design of structural systems is considered. In particular, systems where the reliability model is a series system of parallel systems are treated. A sensitivity analysis for this class of problems is presented. Optimization problems with series systems of parallel systems...... optimization of series systems of parallel systems, but it is also efficient in reliability-based optimization of series systems in general....
A Parallel Workload Model and its Implications for Processor Allocation

Science.gov (United States)

1996-11-01

with SEV or AVG, both of which can tolerate c = 0.4-0.6 before their performance deteriorates significantly. On the other hand, Setia [10] has...Sanjeev K. Setia. The interaction between memory allocation and adaptive partitioning in message-passing multicomputers. In IPPS '95 Workshop on Job...Scheduling Strategies for Parallel Processing, pages 89-99, 1995. [11] Sanjeev K. Setia and Satish K. Tripathi. An analysis of several processor
Parallel object-oriented specification language

NARCIS (Netherlands)

Florescu, O.; Voeten, J.P.M.; Theelen, B.D.; Geilen, M.C.W.; Corporaal, H.; Burns, Alan

2008-01-01

The Parallel Object-Oriented Specification Language (POOSL) is an expressive modelling language for hardware/software systems [10]. It was originally defined in [7] as an object-oriented extension of process algebra CCS [6], supporting (conditional) synchronous message passing between
Modelling electrolyte conductivity in a water electrolyzer cell

DEFF Research Database (Denmark)

Caspersen, Michael; Kirkegaard, Julius Bier

2012-01-01

An analytical model describing the hydrogen gas evolution under natural convection in an electrolyzer cell is developed. The main purpose of the model is to investigate the electrolyte conductivity through the cell under various conditions. Cell conductivity is calculated from a parallel resistor...
General Potential-Current Model and Validation for Electrocoagulation

International Nuclear Information System (INIS)

Dubrawski, Kristian L.; Du, Codey; Mohseni, Madjid

2014-01-01

A model relating potential and current in continuous parallel plate iron electrocoagulation (EC) was developed for application in drinking water treatment. The general model can be applied to any EC parallel plate system relying only on geometric and tabulated input variables without the need of system-specific experimentally derived constants. For the theoretical model, the anode and cathode were vertically divided into n equipotential segments in a single pass, upflow, and adiabatic EC reactor. Potential and energy balances were simultaneously solved at each vertical segment, which included the contribution of ionic concentrations, solution temperature and conductivity, cathodic hydrogen flux, and gas/liquid ratio. We experimentally validated the numerical model with a vertical upflow EC reactor using a 24 cm high, 99.99% pure iron anode divided into twelve 2 cm segments. Individual experimental currents from each segment were summed to determine total current, and compared with the theoretically derived value. Several key variables were studied to determine their impact on model accuracy: solute type, solute concentration, current density, flow rate, inter-electrode gap, and electrode surface condition. Model results were in good agreement with experimental values at cell potentials of 2-20 V (corresponding to a current density range of approximately 50-800 A/m²), with mean relative deviation of 9% for low flow rate, narrow electrode gap, polished electrodes, and 150 mg/L NaCl. Highest deviation occurred with a large electrode gap, unpolished electrodes, and Na₂SO₄ electrolyte, due to parasitic H₂O oxidation and less-than-unity current efficiency. This is the first general model which can be applied to any parallel plate EC system for accurate electrochemical voltage or current prediction
Landau fluid model for weakly nonlinear dispersive magnetohydrodynamics

International Nuclear Information System (INIS)

Passot, T.; Sulem, P. L.

2005-01-01

In many astrophysical plasmas such as the solar wind, the terrestrial magnetosphere, or the interstellar medium at small enough scales, collisions are negligible. When interested in the large-scale dynamics, a hydrodynamic approach is advantageous not only because its numerical simulation is easier than that of the full Vlasov-Maxwell equations, but also because it provides a deep understanding of cross-scale nonlinear couplings. It is thus of great interest to construct fluid models that extend the classical magnetohydrodynamic (MHD) equations to collisionless situations. Two ingredients need to be included in such a model to capture the main kinetic effects: finite Larmor radius (FLR) corrections and Landau damping, the only fluid-particle resonance that can affect large scales and can be modeled in a relatively simple way. The modeling of Landau damping in a fluid formalism is hardly possible in the framework of a systematic asymptotic expansion and was addressed mainly by means of parameter fitting in a linearized setting. We introduce a similar Landau fluid model that has the advantage of taking dispersive effects into account. This model properly describes dispersive MHD waves in quasi-parallel propagation. Since, by construction, the system correctly reproduces their linear dynamics, appropriate tests should address the nonlinear regime. As a first test, we show analytically that the weakly nonlinear modulational dynamics of quasi-parallel propagating Alfven waves is well captured. As a second test, we consider the parametric decay instability of parallel Alfven waves and show that numerical simulations of the dispersive Landau fluid model lead to results that closely match the outcome of hybrid simulations. (Author)
Inside the Monitor Model

DEFF Research Database (Denmark)

Carl, Michael; Dragsted, Barbara

2012-01-01

a “monitor model” according to which translators start with a literal default rendering procedure and where a monitor interrupts the default procedure when a problem occurs. This paper suggests an extension of the monitor model in which comprehension and production are processed in parallel by the default...
Parallel-Architecture Simulator Development Using Hardware Transactional Memory

OpenAIRE

Armejach Sanosa, Adrià

2009-01-01

To address the need for a simpler parallel programming model, Transactional Memory (TM) has been developed and promises good parallel performance with easy-to-write parallel code. Unlike lock-based approaches, with TM, programmers do not need to explicitly specify and manage the synchronization among threads. Instead, programmers simply mark code segments as transactions, and the TM system manages the concurrency control for them. TM can be implemented either in software (STM) or hardware (HT...
Applications of computer modeling to fusion research

International Nuclear Information System (INIS)

Dawson, J.M.

1989-01-01

Progress achieved during this report period is presented on the following topics: Development and application of gyrokinetic particle codes to tokamak transport, development of techniques to take advantage of parallel computers; model dynamo and bootstrap current drive; and in general maintain our broad-based program in basic plasma physics and computer modeling
High Performance Programming Using Explicit Shared Memory Model on Cray T3D

Science.gov (United States)

Simon, Horst D.; Saini, Subhash; Grassi, Charles

1994-01-01

The Cray T3D system is the first-phase system in Cray Research, Inc.'s (CRI) three-phase massively parallel processing (MPP) program. This system features a heterogeneous architecture that closely couples DEC's Alpha microprocessors and CRI's parallel-vector technology, i.e., the Cray Y-MP and Cray C90. An overview of the Cray T3D hardware and available programming models is presented. Under the Cray Research adaptive Fortran (CRAFT) model, four programming methods (data parallel, work sharing, message-passing using PVM, and explicit shared memory model) are available to the users. However, at this time the data parallel and work sharing programming models are not available to the user community. The differences between standard PVM and CRI's PVM are highlighted with performance measurements such as latencies and communication bandwidths. We have found that neither standard PVM nor CRI's PVM exploits the hardware capabilities of the T3D. The reasons for the poor performance of PVM as a native message-passing library are presented. This is illustrated by the performance of NAS Parallel Benchmarks (NPB) programmed in the explicit shared memory model on the Cray T3D. In general, the performance of standard PVM is about 4 to 5 times lower than that obtained by using the explicit shared memory model. This degradation in performance is also seen on the CM-5, where the performance of applications using the native message-passing library CMMD is also about 4 to 5 times lower than that obtained using data parallel methods. The issues involved (such as barriers, synchronization, invalidating data cache, aligning data cache etc.) while programming in the explicit shared memory model are discussed. Comparative performance of NPB using the explicit shared memory programming model on the Cray T3D and other highly parallel systems such as the TMC CM-5, Intel Paragon, Cray C90, IBM-SP1, etc. is presented.
Modelling of the thermomechanical behaviour of salt rock

International Nuclear Information System (INIS)

Albers, G.; Graefe, V.; Korthaus, E.; Pudewillis, A.; Prij, J.

1986-01-01

The modelling of the thermomechanical behaviour of salt rock is examined, with respect to the disposal of radioactive waste in salt formations. The calculation methods and programmes currently available for the modelling are described. Some examples are given of calculations carried out in parallel with tests. Some results of modelling calculations for a repository are presented by way of illustration. (U.K.)
Modeling a Million-Node Slim Fly Network Using Parallel Discrete-Event Simulation

Energy Technology Data Exchange (ETDEWEB)

Wolfe, Noah; Carothers, Christopher; Mubarak, Misbah; Ross, Robert; Carns, Philip

2016-05-15

As supercomputers close in on exascale performance, the increased number of processors and processing power translates to an increased demand on the underlying network interconnect. The Slim Fly network topology, a new low-diameter and low-latency interconnection network, is gaining interest as one possible solution for next-generation supercomputing interconnect systems. In this paper, we present a high-fidelity Slim Fly flit-level model leveraging the Rensselaer Optimistic Simulation System (ROSS) and Co-Design of Exascale Storage (CODES) frameworks. We validate our Slim Fly model with the Kathareios et al. Slim Fly model results provided at moderately sized network scales. We further scale the model size up to an unprecedented 1 million compute nodes; and through visualization of network simulation metrics such as link bandwidth, packet latency, and port occupancy, we get an insight into the network behavior at the million-node scale. We also show linear strong scaling of the Slim Fly model on an Intel cluster achieving a peak event rate of 36 million events per second using 128 MPI tasks to process 7 billion events. Detailed analysis of the underlying discrete-event simulation performance shows that a million-node Slim Fly model simulation can execute in 198 seconds on the Intel cluster.
State-space-based harmonic stability analysis for paralleled grid-connected inverters

DEFF Research Database (Denmark)

Wang, Yanbo; Wang, Xiongfei; Chen, Zhe

2016-01-01

This paper addresses a state-space-based harmonic stability analysis of paralleled grid-connected inverters system. A small signal model of individual inverter is developed, where LCL filter, the equivalent delay of control system, and current controller are modeled. Then, the overall small signal...... model of paralleled grid-connected inverters is built. Finally, the state space-based stability analysis approach is developed to explain the harmonic resonance phenomenon. The eigenvalue traces associated with time delay and coupled grid impedance are obtained, which accounts for how the unstable...... inverter produces the harmonic resonance and leads to the instability of whole paralleled system. The proposed approach reveals the contributions of the grid impedance as well as the coupled effect on other grid-connected inverters under different grid conditions. Simulation and experimental results...
Parallel-In-Time For Moving Meshes

Energy Technology Data Exchange (ETDEWEB)

Falgout, R. D. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Manteuffel, T. A. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Southworth, B. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Schroder, J. B. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)

2016-02-04

With steadily growing computational resources available, scientists must develop effective ways to utilize the increased resources. High performance, highly parallel software has become a standard. However, until recent years parallelism has focused primarily on the spatial domain. When solving a space-time partial differential equation (PDE), this leads to a sequential bottleneck in the temporal dimension, particularly when taking a large number of time steps. The XBraid parallel-in-time library was developed as a practical way to add temporal parallelism to existing sequential codes with only minor modifications. In this work, a rezoning-type moving mesh is applied to a diffusion problem and formulated in a parallel-in-time framework. Tests and scaling studies are run using XBraid and demonstrate excellent results for the simple model problem considered herein.
Surface topography of parallel grinding process for nonaxisymmetric aspheric lens

International Nuclear Information System (INIS)

Zhang Ningning; Wang Zhenzhong; Pan Ri; Wang Chunjin; Guo Yinbiao

2012-01-01

Workpiece surface profile, texture and roughness can be predicted by modeling the topography of the wheel surface and the kinematics of the grinding process, which together form an important part of precision grinding process theory. Parallel grinding technology is an important method for machining nonaxisymmetric aspheric lenses, but there are few reports on relevant simulations. In this paper, a simulation method based on parallel grinding for precision machining of aspheric lenses is proposed. The method combines modeling of the random wheel surface with modeling of the single grain track based on arc wheel contact points. Then, a mathematical algorithm for surface topography is proposed and applied under different machining parameters. The consistency between the simulation and test results shows that the algorithm is correct and efficient. (authors)
Vacuum Large Current Parallel Transfer Numerical Analysis

Directory of Open Access Journals (Sweden)

Enyuan Dong

2014-01-01

The stable operation and reliable breaking of large generator currents is a difficult problem in power systems. It can be solved by parallel interrupters and a proper timing sequence with phase-control technology, in which the breaker control strategy is determined by the times of the first-opening and second-opening phases. A precise model of the transfer current can provide the proper timing sequence for breaking the generator circuit breaker. By analysis of transfer-current experiments and data, the real vacuum arc resistance and a precise corrected model of the large-current transfer process are obtained in this paper. The transfer time calculated by the corrected model is very close to the actual transfer time. The model can provide guidance for planning a proper timing sequence and breaking the vacuum generator circuit breaker with parallel interrupters.
Scaling predictive modeling in drug development with cloud computing.

Science.gov (United States)

Moghadam, Behrooz Torabi; Alvarsson, Jonathan; Holm, Marcus; Eklund, Martin; Carlsson, Lars; Spjuth, Ola

2015-01-26

Growing data sets with increased time for analysis are hampering predictive modeling in drug discovery. Model building can be carried out on high-performance computer clusters, but these can be expensive to purchase and maintain. We have evaluated ligand-based modeling on cloud computing resources where computations are parallelized and run on the Amazon Elastic Cloud. We trained models on open data sets of varying sizes for the end points logP and Ames mutagenicity and compared the results with model building parallelized on a traditional high-performance computing cluster. We show that while high-performance computing results in faster model building, the use of cloud computing resources is feasible for large data sets and scales well within cloud instances. An additional advantage of cloud computing is that the costs of predictive models can be easily quantified, and a choice can be made between speed and economy. The easy access to computational resources with no up-front investments makes cloud computing an attractive alternative for scientists, especially for those without access to a supercomputer, and our study shows that it enables cost-efficient modeling of large data sets on demand within reasonable time.
Self-organizing map models of language acquisition

Science.gov (United States)

Li, Ping; Zhao, Xiaowei

2013-01-01

Connectionist models have had a profound impact on theories of language. While most early models were inspired by the classic parallel distributed processing architecture, recent models of language have explored various other types of models, including self-organizing models for language acquisition. In this paper, we aim at providing a review of the latter type of models, and highlight a number of simulation experiments that we have conducted based on these models. We show that self-organizing connectionist models can provide significant insights into long-standing debates in both monolingual and bilingual language development. We suggest future directions in which these models can be extended, to better connect with behavioral and neural data, and to make clear predictions in testing relevant psycholinguistic theories. PMID:24312061
Lock-free parallel garbage collection

NARCIS (Netherlands)

H. Gao; J.F. Groote (Jan Friso); W.H. Hesselink (Wim)

2005-01-01

This paper presents a lock-free parallel algorithm for mark&sweep garbage collection (GC) in a realistic model using synchronization primitives compare-and-swap (CAS) and load-linked/store-conditional (LL/SC) offered by machine architectures. Mutators and collectors can simultaneously
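The role of compare-and-swap in a parallel mark phase can be sketched as follows. This is a simplified illustration, not the algorithm from the paper. Python exposes no hardware CAS, so the hypothetical `AtomicFlag` class emulates one with an internal lock, which means the sketch itself is not lock-free. The classes, the demo heap, and the use of one worker per root are all assumptions.

```python
import threading


class AtomicFlag:
    """Emulates a one-word compare-and-swap; real CAS is a hardware primitive."""

    def __init__(self):
        self._value = False
        self._guard = threading.Lock()  # stands in for hardware atomicity only

    def compare_and_swap(self, expected, new):
        """Set the flag to `new` only if it currently equals `expected`."""
        with self._guard:
            if self._value == expected:
                self._value = new
                return True
            return False

    def get(self):
        return self._value


class Node:
    def __init__(self, name):
        self.name = name
        self.children = []
        self.marked = AtomicFlag()


def mark_from(root, claimed):
    """Depth-first mark; a node is traversed only by the worker whose CAS succeeds."""
    stack = [root]
    while stack:
        node = stack.pop()
        if node.marked.compare_and_swap(False, True):
            claimed.append(node.name)  # relies on CPython's thread-safe list.append
            stack.extend(node.children)


def parallel_mark(roots):
    """Run one marking worker per root and return the names each CAS claimed."""
    claimed = []
    workers = [threading.Thread(target=mark_from, args=(r, claimed)) for r in roots]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return claimed


def build_demo_heap():
    """Hypothetical heap: a and b are roots, e is unreachable garbage."""
    a, b, c, d, e = (Node(n) for n in "abcde")
    a.children = [c]
    b.children = [c, d]
    d.children = [a]
    return {"a": a, "b": b, "c": c, "d": d, "e": e}


heap = build_demo_heap()
claimed = parallel_mark([heap["a"], heap["b"]])
garbage = sorted(name for name, node in heap.items() if not node.marked.get())
```

Whichever worker reaches a shared node first wins the CAS and traverses it; the other sees the failed CAS and moves on, so every reachable node is claimed exactly once regardless of thread interleaving. The sweep phase would then reclaim the nodes left unmarked.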
