Parallel Software Model Checking
2015-01-08
Software Engineering Institute, Carnegie Mellon University, Pittsburgh, PA 15213. Team members: Sagar Chaki, Arie Gurfinkel.
Cellular automata a parallel model
Mazoyer, J
1999-01-01
Cellular automata can be viewed both as computational models and modelling systems of real processes. This volume emphasises the first aspect. In articles written by leading researchers, sophisticated massive parallel algorithms (firing squad, life, Fischer's primes recognition) are treated. Their computational power and the specific complexity classes they determine are surveyed, while some recent results in relation to chaos from a new dynamic systems point of view are also presented. Audience: This book will be of interest to specialists of theoretical computer science and the parallelism challenge.
Parallel computing in enterprise modeling.
Energy Technology Data Exchange (ETDEWEB)
Goldsby, Michael E.; Armstrong, Robert C.; Shneider, Max S.; Vanderveen, Keith; Ray, Jaideep; Heath, Zach; Allan, Benjamin A.
2008-08-01
This report presents the results of our efforts to apply high-performance computing to entity-based simulations with a multi-use plugin for parallel computing. We use the term 'Entity-based simulation' to describe a class of simulation which includes both discrete event simulation and agent based simulation. What simulations of this class share, and what differs from more traditional models, is that the result sought is emergent from a large number of contributing entities. Logistic, economic and social simulations are members of this class where things or people are organized or self-organize to produce a solution. Entity-based problems never have an a priori ergodic principle that will greatly simplify calculations. Because the results of entity-based simulations can only be realized at scale, scalable computing is de rigueur for large problems. Having said that, the absence of a spatial organizing principle makes the decomposition of the problem onto processors problematic. In addition, practitioners in this domain commonly use the Java programming language, which presents its own problems in a high-performance setting. The plugin we have developed, called the Parallel Particle Data Model, overcomes both of these obstacles and is now being used by two Sandia frameworks: the Decision Analysis Center and the Seldon social simulation facility. While the ability to engage U.S.-sized problems is now available to the Decision Analysis Center, this plugin is central to the success of Seldon. Because Seldon relies on computationally intensive cognitive sub-models, this work is necessary to achieve the scale necessary for realistic results. With the recent upheavals in the financial markets, and the inscrutability of terrorist activity, this simulation domain will likely need a capability with ever greater fidelity. High-performance computing will play an important part in enabling that greater fidelity.
A Parallel Programming Model With Sequential Semantics
1996-01-01
Parallel programming is more difficult than sequential programming in part because of the complexity of reasoning, testing, and debugging in the...context of concurrency. In the thesis, we present and investigate a parallel programming model that provides direct control of parallelism in a notation
A Topological Model for Parallel Algorithm Design
1991-09-01
Dissertation, Air Force Institute of Technology (AFIT/DS/ENG/91-02). Author: Jeffrey A. Simmers, Captain, USAF. Approved for public release; distribution unlimited.
PDDP, A Data Parallel Programming Model
Directory of Open Access Journals (Sweden)
Karen H. Warren
1996-01-01
Full Text Available PDDP, the parallel data distribution preprocessor, is a data parallel programming model for distributed memory parallel computers. PDDP implements High Performance Fortran-compatible data distribution directives and parallelism expressed by the use of Fortran 90 array syntax, the FORALL statement, and the WHERE construct. Distributed data objects belong to a global name space; other data objects are treated as local and replicated on each processor. PDDP allows the user to program in a shared memory style and generates codes that are portable to a variety of parallel machines. For interprocessor communication, PDDP uses the fastest communication primitives on each platform.
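The data-distribution style described in this abstract can be sketched in miniature. The following is a hypothetical Python illustration, not PDDP itself: a global array is block-distributed over P logical processors, and a FORALL-style elementwise update runs under the owner-computes rule. The function names and the contiguous block layout are assumptions made for the sketch.

```python
# Hypothetical sketch (not PDDP): block-distribute a global array over
# P logical "processors" and apply a FORALL-style elementwise update
# under the owner-computes rule.

def block_ranges(n, p):
    """Split indices 0..n-1 into p contiguous blocks, mimicking a
    block data-distribution directive."""
    base, extra = divmod(n, p)
    ranges, start = [], 0
    for rank in range(p):
        size = base + (1 if rank < extra else 0)
        ranges.append(range(start, start + size))
        start += size
    return ranges

def forall(global_array, p, update):
    """Each owner applies `update` only to the elements it owns."""
    for owned in block_ranges(len(global_array), p):
        for i in owned:  # owner-computes: no communication needed here
            global_array[i] = update(i, global_array[i])
    return global_array

data = forall(list(range(8)), p=3, update=lambda i, x: x * x)
```

Because every element has exactly one owner, no interprocessor communication is needed for a purely elementwise FORALL; communication only enters when an update reads elements owned by another processor.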
Parallel models of associative memory
Hinton, Geoffrey E
2014-01-01
This update of the 1981 classic on neural networks includes new commentaries by the authors that show how the original ideas are related to subsequent developments. As researchers continue to uncover ways of applying the complex information processing abilities of neural networks, they give these models an exciting future which may well involve revolutionary developments in understanding the brain and the mind -- developments that may allow researchers to build adaptive intelligent machines. The original chapters show where the ideas came from and the new commentaries show where they are going
Structured building model reduction toward parallel simulation
Energy Technology Data Exchange (ETDEWEB)
Dobbs, Justin R. [Cornell University]; Hencey, Brondon M. [Cornell University]
2013-08-26
Building energy model reduction exchanges accuracy for improved simulation speed by reducing the number of dynamical equations. Parallel computing aims to improve simulation times without loss of accuracy but is poorly utilized by contemporary simulators and is inherently limited by inter-processor communication. This paper bridges these disparate techniques to implement efficient parallel building thermal simulation. We begin with a survey of three structured reduction approaches that compares their performance to a leading unstructured method. We then use structured model reduction to find thermal clusters in the building energy model and allocate processing resources. Experimental results demonstrate faster simulation and low error without any interprocessor communication.
Parallel Computing of Ocean General Circulation Model
Institute of Scientific and Technical Information of China (English)
[No author listed]
2001-01-01
This paper discusses the parallel computing of the third-generation Ocean General Circulation Model (OGCM) from the State Key Laboratory of Numerical Modeling for Atmospheric Sciences and Geophysical Fluid Dynamics (LASG), Institute of Atmospheric Physics (IAP). Meanwhile, several optimization strategies for parallel computing of the OGCM (POGCM) on Scalable Shared Memory Multiprocessors (S2MP) are presented. Using the Message Passing Interface (MPI), we obtain superlinear speedup on an SGI Origin 2000 for the parallel OGCM (POGCM) after optimization.
Iteration schemes for parallelizing models of superconductivity
Energy Technology Data Exchange (ETDEWEB)
Gray, P.A. [Michigan State Univ., East Lansing, MI (United States)]
1996-12-31
The time dependent Lawrence-Doniach model, valid for high fields and high values of the Ginzburg-Landau parameter, is often used for studying vortex dynamics in layered high-T{sub c} superconductors. When solving these equations numerically, the added degrees of complexity due to the coupling and nonlinearity of the model often warrant the use of high-performance computers for their solution. However, the interdependence between the layers can be manipulated so as to allow parallelization of the computations at an individual layer level. The reduced parallel tasks may then be solved independently using a heterogeneous cluster of networked workstations connected together with Parallel Virtual Machine (PVM) software. Here, this parallelization of the model is discussed and several computational implementations of varying degrees of parallelism are presented. Computational results are also given which contrast properties of convergence speed, stability, and consistency of these implementations. Included in these results are models involving the motion of vortices due to an applied current and pinning effects due to various material properties.
A Scalable Prescriptive Parallel Debugging Model
DEFF Research Database (Denmark)
Jensen, Nicklas Bo; Quarfot Nielsen, Niklas; Lee, Gregory L.
2015-01-01
Debugging is a critical step in the development of any parallel program. However, the traditional interactive debugging model, where users manually step through code and inspect their application, does not scale well even for current supercomputers due to its centralized nature. While lightweight...
Synthetic models of distributed memory parallel programs
Energy Technology Data Exchange (ETDEWEB)
Poplawski, D.A. (Michigan Technological Univ., Houghton, MI (USA). Dept. of Computer Science)
1990-09-01
This paper deals with the construction and use of simple synthetic programs that model the behavior of more complex, real parallel programs. Synthetic programs can be used in many ways: to construct an easily ported suite of benchmark programs, to experiment with alternate parallel implementations of a program without actually writing them, and to predict the behavior and performance of an algorithm on a new or hypothetical machine. Synthetic programs are constructed easily from scratch, from existing programs, and can even be constructed using nothing but information obtained from traces of the real program's execution.
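The abstract's idea of predicting behavior on a hypothetical machine from a synthetic program can be sketched as follows. This is an illustrative Python toy, not the paper's model: a program trace is abstracted into alternating compute and communicate phases, and a target machine is described only by an assumed per-operation cost, per-byte cost, and message latency.

```python
# Hypothetical sketch of a synthetic-program performance model: a real
# program is reduced to alternating compute and communicate phases, and
# total runtime is predicted for a machine described by three assumed
# cost parameters.

def predict_runtime(phases, flop_time, byte_time, latency):
    """phases: list of ('compute', n_ops) or ('send', n_bytes) tuples."""
    total = 0.0
    for kind, amount in phases:
        if kind == 'compute':
            total += amount * flop_time
        elif kind == 'send':
            total += latency + amount * byte_time
        else:
            raise ValueError(kind)
    return total

# A toy trace: 1e6 ops, a 4 kB message, then 1e6 more ops.
trace = [('compute', 1_000_000), ('send', 4096), ('compute', 1_000_000)]
t = predict_runtime(trace, flop_time=1e-9, byte_time=1e-9, latency=1e-6)
```

Swapping in a different machine's parameters re-predicts the runtime without rewriting or re-running the real program, which is the use case the abstract describes.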
Dynamic stiffness model of spherical parallel robots
Cammarata, Alessandro; Caliò, Ivo; D'Urso, Domenico; Greco, Annalisa; Lacagnina, Michele; Fichera, Gabriele
2016-12-01
A novel approach to study the elastodynamics of Spherical Parallel Robots is described through an exact dynamic model. Timoshenko arches are used to simulate flexible curved links while the base and mobile platforms are modelled as rigid bodies. Spatial joints are inherently included into the model without Lagrangian multipliers. At first, the equivalent dynamic stiffness matrix of each leg, made up of curved links joined by spatial joints, is derived; then these matrices are assembled to obtain the Global Dynamic Stiffness Matrix of the robot at a given pose. Actuator stiffness is also included into the model to verify its influence on vibrations and modes. The latter are found by applying the Wittrick-Williams algorithm. Finally, numerical simulations and direct comparison to commercial FE results are used to validate the proposed model.
Electromagnetic Physics Models for Parallel Computing Architectures
Amadio, G.; Ananya, A.; Apostolakis, J.; Aurora, A.; Bandieramonte, M.; Bhattacharyya, A.; Bianchini, C.; Brun, R.; Canal, P.; Carminati, F.; Duhem, L.; Elvira, D.; Gheata, A.; Gheata, M.; Goulas, I.; Iope, R.; Jun, S. Y.; Lima, G.; Mohanty, A.; Nikitina, T.; Novak, M.; Pokorski, W.; Ribon, A.; Seghal, R.; Shadura, O.; Vallecorsa, S.; Wenzel, S.; Zhang, Y.
2016-10-01
The recent emergence of hardware architectures characterized by many-core or accelerated processors has opened new opportunities for concurrent programming models taking advantage of both SIMD and SIMT architectures. GeantV, a next generation detector simulation, has been designed to exploit both the vector capability of mainstream CPUs and multi-threading capabilities of coprocessors including NVidia GPUs and Intel Xeon Phi. The characteristics of these architectures are very different in terms of the vectorization depth and type of parallelization needed to achieve optimal performance. In this paper we describe implementation of electromagnetic physics models developed for parallel computing architectures as a part of the GeantV project. Results of preliminary performance evaluation and physics validation are presented as well.
Parallelization of the Coupled Earthquake Model
Block, Gary; Li, P. Peggy; Song, Yuhe T.
2007-01-01
This Web-based tsunami simulation system allows users to remotely run a model on JPL's supercomputers for a given undersea earthquake. At the time of this reporting, tsunami prediction over the Internet had never been done before. This new code directly couples the earthquake model and the ocean model on parallel computers and improves simulation speed. Seismometers can only detect information from earthquakes; they cannot detect whether or not a tsunami may occur as a result of the earthquake. When earthquake-tsunami models are coupled with the improved computational speed of modern, high-performance computers and constrained by remotely sensed data, they are able to provide early warnings for those coastal regions at risk. The software is capable of testing NASA's satellite observations of tsunamis. It has been successfully tested for several historical tsunamis, has passed all alpha and beta testing, and is well documented for users.
A Parallel, High-Fidelity Radar Model
Horsley, M.; Fasenfest, B.
2010-09-01
Accurate modeling of Space Surveillance sensors is necessary for a variety of applications. Accurate models can be used to perform trade studies on sensor designs, locations, and scheduling. In addition, they can be used to predict the system-level performance of the Space Surveillance Network in response to a collision or satellite break-up event. A high-fidelity physics-based radar simulator has been developed for Space Surveillance applications. This simulator is designed in a modular fashion, where each module describes a particular physical process or radar function (radio wave propagation and scattering, waveform generation, noise sources, etc.) involved in simulating the radar and its environment. For each of these modules, multiple versions are available in order to meet the end users' needs and requirements. For instance, the radar simulator supports different atmospheric models in order to facilitate different methods of simulating refraction of the radar beam. The radar model also has the capability to use highly accurate radar cross sections generated by the method of moments, accelerated by the fast multipole method. To accelerate this computationally expensive model, it is parallelized using MPI. As a testing framework for the radar model, it is incorporated into the Testbed Environment for Space Situational Awareness (TESSA). TESSA is based on a flexible, scalable architecture, designed to exploit high-performance computing resources and allow physics-based simulation of the SSA enterprise. In addition to the radar models, TESSA includes hydrodynamic models of satellite intercept and debris generation, orbital propagation algorithms, optical brightness calculations, optical system models, object detection algorithms, orbit determination algorithms, and simulation analysis and visualization tools. Within this framework, observations and tracks generated by the new radar model are compared to results from a phenomenological radar model. In particular, the new model will be
A parallel-pipelining software process model
Institute of Scientific and Technical Information of China (English)
[No author listed]
2007-01-01
A software process is a framework for effective and timely delivery of a software system, and it plays a crucial role in software success. However, the development of large-scale software still faces the crisis of high risks, low quality, high costs, and long cycle times. This paper proposes a three-phase parallel-pipelining software process model for improving speed and productivity and for reducing software costs and risks without sacrificing software quality. In this model, two strategies are presented. One strategy, based on subsystem-cost priority, is used to prevent wasted software development cost and to reduce software complexity; the other, used for balancing subsystem complexity, is designed to reduce software complexity in the later development stages. Moreover, the proposed function-detailed and workload-simplified subsystem pipelining software process model offers much higher parallelism than the concurrent incremental model. Finally, component-based product line technology not only ensures software quality and further reduces cycle time, software costs, and software risks, but also rationally utilizes previous software product resources and enhances the competitiveness of software development organizations.
Parallel computing in atmospheric chemistry models
Energy Technology Data Exchange (ETDEWEB)
Rotman, D. [Lawrence Livermore National Lab., CA (United States). Atmospheric Sciences Div.]
1996-02-01
Studies of atmospheric chemistry are of high scientific interest, involve computations that are complex and intense, and require enormous amounts of I/O. Current supercomputer computational capabilities are limiting the studies of stratospheric and tropospheric chemistry and will certainly not be able to handle the upcoming coupled chemistry/climate models. To enable such calculations, the authors have developed a computing framework that allows computations on a wide range of computational platforms, including massively parallel machines. Because of the fast paced changes in this field, the modeling framework and scientific modules have been developed to be highly portable and efficient. Here, the authors present the important features of the framework and focus on the atmospheric chemistry module, named IMPACT, and its capabilities. Applications of IMPACT to aircraft studies will be presented.
A Network Model for Parallel Line Balancing Problem
Recep Benzer; Hadi Gökçen; Tahsin Çetinyokus; Hakan Çerçioglu
2007-01-01
Gökçen et al. (2006) have proposed several procedures and a mathematical model on the single-model (product) assembly line balancing (ALB) problem with parallel lines. In the parallel ALB problem, the goal is to balance more than one assembly line together. In this paper, a network model for the parallel ALB problem has been proposed and illustrated on a numerical example. This model is a new approach for parallel ALB and it provides a different point of view for interested researchers.
Exploitation of Parallelism in Climate Models
Energy Technology Data Exchange (ETDEWEB)
Baer, F.; Tribbia, J.J.; Williamson, D.L.
1999-03-01
The US Department of Energy (DOE), through its CHAMMP initiative, hopes to develop the capability to make meaningful regional climate forecasts on time scales exceeding a decade, such capability to be based on numerical prediction type models. We propose research to contribute to each of the specific items enumerated in the CHAMMP announcement (Notice 91-3); i.e., to consider theoretical limits to prediction of climate and climate change on appropriate time scales, to develop new mathematical techniques to utilize massively parallel processors (MPP), to actually utilize MPPs as a research tool, and to develop improved representations of some processes essential to climate prediction. In particular, our goals are to: (1) Reconfigure the prediction equations such that the time iteration process can be compressed by use of MPP architecture, and to develop appropriate algorithms. (2) Develop local subgrid scale models which can provide time and space dependent parameterization for a state-of-the-art climate model to minimize the scale resolution necessary for a climate model, and to utilize MPP capability to simultaneously integrate those subgrid models and their statistics. (3) Capitalize on the MPP architecture to study the inherent ensemble nature of the climate problem. By careful choice of initial states, many realizations of the climate system can be determined concurrently and more realistic assessments of the climate prediction can be made in a realistic time frame. To explore these initiatives, we will exploit all available computing technology, and in particular MPP machines. We anticipate that significant improvements in modeling of climate on the decadal and longer time scales for regional space scales will result from our efforts.
A Parallel Lattice Boltzmann Model of a Carotid Artery
Boyd, J.; Ryan, S. J.; Buick, J. M.
2008-11-01
A parallel implementation of the lattice Boltzmann model is considered for a three dimensional model of the carotid artery. The computational method and its parallel implementation are described. The performance of the parallel implementation on a Beowulf cluster is presented, as are preliminary hemodynamic results.
DYNAMIC TASK PARTITIONING MODEL IN PARALLEL COMPUTING
Directory of Open Access Journals (Sweden)
Javed Ali
2012-04-01
Full Text Available Parallel computing systems compose task partitioning strategies in a true multiprocessing manner. Such systems share the algorithm and processing units as computing resources, which leads to intensive inter-process communication. The main part of the proposed algorithm is the resource management unit, which performs task partitioning and co-scheduling. In this paper, we present a technique for integrated task partitioning and co-scheduling on a privately owned network. We focus on real-time and non-preemptive systems. A large variety of experiments have been conducted on the proposed algorithm using synthetic and real tasks. The goal of the computation model is to provide a realistic representation of the costs of programming. The results show the benefit of the task partitioning. The main characteristics of our method are optimal scheduling and a strong link between partitioning, scheduling, and communication. Some important models for task partitioning are also discussed in the paper. We target an algorithm for task partitioning which improves inter-process communication between the tasks and uses the resources of the system in an efficient manner. The proposed algorithm contributes to minimizing the inter-process communication cost among the executing processes.
Hierarchical Bulk Synchronous Parallel Model and Performance Optimization
Institute of Scientific and Technical Information of China (English)
HUANG Linpeng; SUN Yongqiang; YUAN Wei
1999-01-01
Based on the framework of BSP, a Hierarchical Bulk Synchronous Parallel (HBSP) performance model is introduced in this paper to capture the performance optimization problem at various stages of parallel program development and to accurately predict the performance of a parallel program by considering factors causing variance in local computation and global communication. The related methodology has been applied to several real applications, and the results show that HBSP is a suitable model for optimizing parallel programs.
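The flat BSP cost model that HBSP builds on can be shown concretely. In common BSP usage the cost of one superstep is w + h*g + l, where w is the maximum local computation, h the maximum number of messages sent or received by any processor, g the per-message gap, and l the barrier synchronization cost. The sketch below uses that standard formulation with made-up numbers; HBSP's hierarchical extensions and exact parameterization may differ.

```python
# Illustrative flat BSP cost model: cost of a superstep = w + h*g + l.
# Parameter names follow common BSP usage; values are made up.

def superstep_cost(local_work, messages, g, l):
    """local_work, messages: per-processor lists for one superstep."""
    w = max(local_work)   # slowest processor's computation
    h = max(messages)     # largest communication degree (h-relation)
    return w + h * g + l

def program_cost(supersteps, g, l):
    """A BSP program's cost is the sum over its supersteps."""
    return sum(superstep_cost(w, m, g, l) for w, m in supersteps)

# Two supersteps on 3 processors: (per-proc work, per-proc messages).
steps = [([100, 80, 120], [4, 2, 3]),
         ([60, 60, 60], [1, 1, 5])]
cost = program_cost(steps, g=5.0, l=20.0)
```

Because each superstep is charged at the slowest processor, the model directly exposes the load-imbalance and communication terms that a performance optimizer would target.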
Harmony Theory: Problem Solving, Parallel Cognitive Models, and Thermal Physics.
Smolensky, Paul; Riley, Mary S.
This document consists of three papers. The first, "A Parallel Model of (Sequential) Problem Solving," describes a parallel model designed to solve a class of relatively simple problems from elementary physics and discusses implications for models of problem-solving in general. It is shown that one of the most salient features of problem…
A Network Model for Parallel Line Balancing Problem
Directory of Open Access Journals (Sweden)
Recep Benzer
2007-01-01
Full Text Available Gökçen et al. (2006) have proposed several procedures and a mathematical model on the single-model (product) assembly line balancing (ALB) problem with parallel lines. In the parallel ALB problem, the goal is to balance more than one assembly line together. In this paper, a network model for the parallel ALB problem has been proposed and illustrated on a numerical example. This model is a new approach for parallel ALB and it provides a different point of view for interested researchers.
The Modeling of the ERP Systems within Parallel Calculus
Directory of Open Access Journals (Sweden)
Loredana MOCEAN
2011-01-01
Full Text Available As has been known for some years, the basic characteristics of ERP systems are: modular design, a central common database, integration of the modules, automatic data transfer between modules, system complexity, and flexible configuration. Because of this, a parallel approach to designing and implementing them, using parallel algorithms, parallel calculus, and distributed databases, is a natural fit. This paper aims to support these assertions and to provide a model, in summary, of what an ERP system based on parallel computing and algorithms could be.
Parallel Evolutionary Modeling for Nonlinear Ordinary Differential Equations
Institute of Scientific and Technical Information of China (English)
[No author listed]
2001-01-01
We introduce a new parallel evolutionary algorithm for modeling dynamic systems by nonlinear higher-order ordinary differential equations (NHODEs). The NHODE models are much more universal than traditional linear models. In order to accelerate the modeling process, we propose and realize a parallel evolutionary algorithm using distributed CORBA objects on a heterogeneous network. Some numerical experiments show that the new algorithm is feasible and efficient.
Shared Variable Oriented Parallel Precompiler for SPMD Model
Institute of Scientific and Technical Information of China (English)
[No author listed]
1995-01-01
For the moment, commercial parallel computer systems with distributed memory architectures are usually provided with parallel FORTRAN or parallel C compilers, which are just traditional sequential FORTRAN or C compilers expanded with communication statements. Programmers suffer from writing parallel programs with communication statements. The Shared Variable Oriented Parallel Precompiler (SVOPP) proposed in this paper can automatically generate appropriate communication statements based on shared variables for the SPMD (Single Program Multiple Data) computation model and greatly eases parallel programming with high communication efficiency. The core function of the parallel C precompiler has been successfully verified on a transputer-based parallel computer. Its prominent performance shows that SVOPP is probably a breakthrough in parallel programming technique.
Mathematical model partitioning and packing for parallel computer calculation
Arpasi, Dale J.; Milner, Edward J.
1986-01-01
This paper deals with the development of multiprocessor simulations from a serial set of ordinary differential equations describing a physical system. The identification of computational parallelism within the model equations is discussed. A technique is presented for identifying this parallelism and for partitioning the equations for parallel solution on a multiprocessor. Next, an algorithm which packs the equations into a minimum number of processors is described. The results of applying the packing algorithm to a turboshaft engine model are presented.
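The packing step described above, assigning equations to a minimum number of processors, is a bin-packing problem. The paper's actual algorithm is not reproduced here; the sketch below uses first-fit decreasing, a standard heuristic, with made-up per-equation costs and processor capacity.

```python
# First-fit-decreasing bin packing as a stand-in for the paper's
# equation-packing step: pack per-equation evaluation costs into the
# fewest processors of a fixed per-processor capacity.

def first_fit_decreasing(costs, capacity):
    """Return a list of bins (processors), each a list of packed costs.
    Classic heuristic; the paper's own packing algorithm may differ."""
    bins = []
    for cost in sorted(costs, reverse=True):  # largest equations first
        for b in bins:
            if sum(b) + cost <= capacity:     # first bin with room
                b.append(cost)
                break
        else:
            bins.append([cost])               # open a new processor
    return bins

# Made-up per-equation costs and a per-processor time budget of 10.
processors = first_fit_decreasing([4, 8, 1, 4, 2, 1], capacity=10)
```

Here the six equations fit on two processors, whereas packing them in arrival order would need three; sorting before packing is what makes the heuristic effective.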
Parallel local approximation MCMC for expensive models
Conrad, Patrick; Davis, Andrew; Marzouk, Youssef; Pillai, Natesh; Smith, Aaron
2016-01-01
Performing Bayesian inference via Markov chain Monte Carlo (MCMC) can be exceedingly expensive when posterior evaluations invoke the evaluation of a computationally expensive model, such as a system of partial differential equations. In recent work [Conrad et al. JASA 2015, arXiv:1402.1694] we described a framework for constructing and refining local approximations of such models during an MCMC simulation. These posterior-adapted approximations harness regularity of the model to reduce the c...
Parallel Dynamics of Continuous Hopfield Model Revisited
Mimura, Kazushi
2009-03-01
We have applied the generating functional analysis (GFA) to the continuous Hopfield model. We have also confirmed that the GFA predictions in some typical cases exhibit good consistency with computer simulation results. When a retarded self-interaction term is omitted, the GFA result becomes identical to that obtained using the statistical neurodynamics as well as the case of the sequential binary Hopfield model.
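The "parallel dynamics" analyzed above means synchronous updating: every spin is updated at once from the previous state. The following is a toy binary Hopfield instance, not the paper's continuous model, with Hebbian couplings built from a single stored pattern; all names and values are assumptions for the sketch.

```python
# Toy synchronous ("parallel dynamics") binary Hopfield network:
# all spins update simultaneously from the OLD state, using Hebbian
# couplings J_ij built from stored patterns.

def sign(x):
    return 1 if x >= 0 else -1

def hebbian_couplings(patterns):
    """J_ij = (1/N) * sum over patterns of p_i * p_j, with J_ii = 0."""
    n = len(patterns[0])
    return [[0 if i == j else
             sum(p[i] * p[j] for p in patterns) / n
             for j in range(n)] for i in range(n)]

def parallel_update(state, J):
    """Synchronous sweep: every spin reads the previous state."""
    n = len(state)
    return [sign(sum(J[i][j] * state[j] for j in range(n)))
            for i in range(n)]

pattern = [1, -1, 1, -1, 1, -1]
J = hebbian_couplings([pattern])
noisy = [1, 1, 1, -1, 1, -1]          # stored pattern with one flipped spin
recalled = parallel_update(noisy, J)  # one parallel step restores it
```

Sequential dynamics would instead update one spin at a time, immediately feeding each new value into the next update; the distinction between the two schedules is exactly what the retarded self-interaction term in the GFA captures.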
PDDP: A data parallel programming model. Revision 1
Energy Technology Data Exchange (ETDEWEB)
Warren, K.H.
1995-06-01
PDDP, the Parallel Data Distribution Preprocessor, is a data parallel programming model for distributed memory parallel computers. PDDP implements High Performance Fortran-compatible data distribution directives and parallelism expressed by the use of Fortran 90 array syntax, the FORALL statement, and the WHERE construct. Distributed data objects belong to a global name space; other data objects are treated as local and replicated on each processor. PDDP allows the user to program in a shared-memory style and generates codes that are portable to a variety of parallel machines. For interprocessor communication, PDDP uses the fastest communication primitives on each platform.
Parallelism and optimization of numerical ocean forecasting model
Xu, Jianliang; Pang, Renbo; Teng, Junhua; Liang, Hongtao; Yang, Dandan
2016-10-01
According to the characteristics of the Chinese marginal seas, the Marginal Sea Model of China (MSMC) has been developed independently in China. Because the model requires long simulation times as a routine forecasting model, parallelism must be introduced into MSMC to improve its performance. However, some methods used in MSMC, such as the Successive Over-Relaxation (SOR) algorithm, are not suitable for parallelism. In this paper, methods are developed to solve the parallelization problem of the SOR algorithm in the following steps. First, based on a 3D computing grid system, an automatic data partition method is implemented to dynamically divide the computing grid according to computing resources. Next, based on the characteristics of the numerical forecasting model, a parallel method is designed to solve the parallelization problem of the SOR algorithm. Lastly, a communication optimization method is provided to avoid the cost of communication. In this method, the non-blocking communication of the Message Passing Interface (MPI) is used to implement the parallelism of MSMC with its complex physical equations, and communication is overlapped with computation to improve the performance of the parallel MSMC. The experiments show that the parallel MSMC runs 97.2 times faster than the serial MSMC, and the root mean square error between the parallel MSMC and the serial MSMC is less than 0.01 for a 30-day simulation (172,800 time steps), which meets the requirements of timeliness and accuracy for numerical ocean forecasting products.
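Plain SOR resists parallelization because each point update depends on the latest values of its neighbors. A standard remedy, which may differ from the scheme actually used in MSMC, is red-black ordering: points of one color depend only on points of the other color, so each half-sweep can run fully in parallel. The toy below applies it to a 1D Laplace problem with fixed boundary values; the problem and parameters are assumptions for the sketch.

```python
# Red-black SOR on a toy 1D Laplace problem: odd-indexed interior
# points depend only on even-indexed neighbors and vice versa, so each
# half-sweep is embarrassingly parallel.

def red_black_sor(u, omega, sweeps):
    """u[0] and u[-1] are fixed boundary values; interior points relax."""
    n = len(u)
    for _ in range(sweeps):
        for color in (1, 2):               # odd interior points, then even
            # All updates of the same color are independent of each
            # other and could be done simultaneously on many processors.
            for i in range(color, n - 1, 2):
                gs = 0.5 * (u[i - 1] + u[i + 1])   # Gauss-Seidel value
                u[i] = (1 - omega) * u[i] + omega * gs
    return u

# Steady state is the linear profile 0, 0.25, 0.5, 0.75, 1.
u = red_black_sor([0.0, 0.0, 0.0, 0.0, 1.0], omega=1.5, sweeps=200)
```

The reordering changes the iteration's intermediate values but converges to the same solution, which is why it is a common first step before distributing an SOR solver with MPI.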
Models of parallel computation :a survey and classification
Institute of Scientific and Technical Information of China (English)
ZHANG Yunquan; CHEN Guoliang; SUN Guangzhong; MIAO Qiankun
2007-01-01
In this paper, the state of the art in parallel computational model research is reviewed. We introduce various models that were developed during the past decades. According to the features of their target architectures, especially memory organization, we classify these parallel computational models into three generations, and the models and their characteristics are discussed based on this classification. We believe that with the ever-increasing speed gap between CPUs and memory systems, incorporating non-uniform memory hierarchies into computational models will become unavoidable. With the emergence of multi-core CPUs, the parallelism hierarchy of current computing platforms becomes more and more complicated, and describing this complicated parallelism hierarchy in future computational models becomes more and more important. A semi-automatic toolkit that can extract model parameters and their values on real computers can reduce the model analysis complexity, thus allowing more complicated models with more parameters to be adopted. Hierarchical memory and hierarchical parallelism will be two very important features that should be considered in future model design and research.
Deterministic Consistency: A Programming Model for Shared Memory Parallelism
Aviram, Amittai; Ford, Bryan
2009-01-01
The difficulty of developing reliable parallel software is generating interest in deterministic environments, where a given program and input can yield only one possible result. Languages or type systems can enforce determinism in new code, and runtime systems can impose synthetic schedules on legacy parallel code. To parallelize existing serial code, however, we would like a programming model that is naturally deterministic without language restrictions or artificial scheduling. We propose "...
Modeling and Control of Primary Parallel Isolated Boost Converter
DEFF Research Database (Denmark)
Mira Albert, Maria del Carmen; Hernandez Botella, Juan Carlos; Sen, Gökhan
2012-01-01
In this paper, state-space modeling and closed-loop controlled operation are presented for the primary parallel isolated boost converter (PPIBC) topology as a battery charging unit. Parasitic resistances have been included to obtain an accurate dynamic model. The accuracy of the model has been tes...
Development of a Massively Parallel NOGAPS Forecast Model
2016-06-07
parallel computer architectures. These algorithms will be critical for inter-processor communication dependent and computationally intensive model...to exploit massively parallel processor (MPP), distributed-memory computer architectures. Future increases in computer power from MPPs will allow...passing (MPI) is the paradigm chosen for communication between distributed memory processors. APPROACH Use integrations of the current operational
Dynamic Distribution Model with Prime Granularity for Parallel Computing
Institute of Scientific and Technical Information of China (English)
[Anonymous]
2005-01-01
The dynamic distribution model is one of the best schemes for parallel volume rendering. However, in a homogeneous cluster system where the granularity is traditionally identical for all processors, all processors communicate almost simultaneously and the computation load may become unbalanced. To address these problems, a dynamic distribution model with prime granularity for parallel computing is presented. The granularities of the processors are pairwise relatively prime, and the related theory is introduced. High parallel performance can be achieved by minimizing network competition and by using a load-balancing strategy that ensures all processors finish almost simultaneously. Based on the Master-Slave-Gleaner (MSG) scheme, the parallel splatting algorithm for volume rendering is used to test the model on an IBM Cluster 1350 system. The experimental results show that the model brings a considerable improvement in performance, including computational efficiency, total execution time, speedup, and load balancing.
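The idea in this abstract can be sketched with a toy round-robin simulation (the chunk sizes 13, 17, 19, 23 are illustrative, not the paper's values): each worker always claims a chunk of its own prime size, so workers' requests to the master drift out of phase instead of arriving simultaneously.

```python
# Toy simulation of dynamic distribution with prime granularity: workers
# take turns claiming chunks of distinct, pairwise relatively prime sizes.

def distribute(total_tasks, granularities):
    remaining = total_tasks
    loads = [0] * len(granularities)
    turn = 0
    while remaining > 0:
        w = turn % len(granularities)          # next worker to ask the master
        chunk = min(granularities[w], remaining)
        loads[w] += chunk
        remaining -= chunk
        turn += 1
    return loads

loads = distribute(1000, [13, 17, 19, 23])     # pairwise relatively prime
```

Because no two chunk sizes share a factor, request instants for any two workers coincide only rarely over a long run, which is the network-contention argument made above.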
Graph Partitioning Models for Parallel Computing
Energy Technology Data Exchange (ETDEWEB)
Hendrickson, B.; Kolda, T.G.
1999-03-02
Calculations can naturally be described as graphs in which vertices represent computation and edges reflect data dependencies. By partitioning the vertices of a graph, the calculation can be divided among processors of a parallel computer. However, the standard methodology for graph partitioning minimizes the wrong metric and lacks expressibility. We survey several recently proposed alternatives and discuss their relative merits.
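The "wrong metric" this abstract refers to is the edge cut: the data of one boundary vertex is sent once per remote part, not once per cut edge, so cut and true communication volume can disagree. A toy example (my own, for illustration):

```python
# Edge cut vs. communication volume for a vertex partition of a small graph.

def edge_cut(edges, part):
    return sum(1 for u, v in edges if part[u] != part[v])

def comm_volume(edges, part):
    remote = {}                      # vertex -> set of remote parts adjacent
    for u, v in edges:
        if part[u] != part[v]:
            remote.setdefault(u, set()).add(part[v])
            remote.setdefault(v, set()).add(part[u])
    return sum(len(parts) for parts in remote.values())

edges = [(0, 1), (0, 2), (0, 3), (0, 4)]       # star centred on vertex 0
part = {0: 0, 1: 1, 2: 1, 3: 1, 4: 1}          # centre alone in part 0
cut, vol = edge_cut(edges, part), comm_volume(edges, part)
```

Here four edges are cut, but vertex 0's value crosses the boundary only once; the two metrics count the same partition differently (4 vs. 5 in total), which is why minimizing edge cut alone can mis-rank partitions.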
Modeling and Adaptive Control of a Planar Parallel Mechanism
Institute of Scientific and Technical Information of China (English)
敖银辉; 陈新
2004-01-01
The dynamic modeling and control of parallel mechanisms have long been a problem in robotics research. In this paper, different dynamics formulation methods are discussed first. A model of a redundantly driven parallel mechanism is then constructed, using a planar parallel manipulator as an example, and a nonlinear adaptive control method is introduced. Matrix pseudo-inversion is used to obtain the desired actuator torques from the desired end-effector coordinates, while the feedback torque is calculated directly in actuator space. This treatment avoids the forward kinematics computation, which is very difficult for a parallel mechanism. Experiments with PID control and with the described adaptive control strategy were carried out on a planar parallel mechanism. The results show that the proposed adaptive controller outperforms conventional PID methods in tracking a desired input at high speed.
Vectorial Preisach-type model designed for parallel computing
Energy Technology Data Exchange (ETDEWEB)
Stancu, Alexandru [Department of Solid State and Theoretical Physics, Al. I. Cuza University, Blvd. Carol I, 11, 700506 Iasi (Romania)]. E-mail: alstancu@uaic.ro; Stoleriu, Laurentiu [Department of Solid State and Theoretical Physics, Al. I. Cuza University, Blvd. Carol I, 11, 700506 Iasi (Romania); Andrei, Petru [Electrical and Computer Engineering, Florida State University, Tallahassee, FL (United States); Electrical and Computer Engineering, Florida A and M University, Tallahassee, FL (United States)
2007-09-15
Most of the hysteresis phenomenological models are scalar, while all the magnetization processes are vectorial. The vector models, phenomenological or micromagnetic (physical), are time consuming and sometimes difficult to implement. In this paper, we introduce a new vector Preisach-type model that uses micromagnetic results to simulate the magnetic response of a system of several tens of thousands of pseudo-particles. The model has a modular structure that allows easy implementation for parallel computing.
Modeling groundwater flow on massively parallel computers
Energy Technology Data Exchange (ETDEWEB)
Ashby, S.F.; Falgout, R.D.; Fogwell, T.W.; Tompson, A.F.B.
1994-12-31
The authors explore the numerical simulation of groundwater flow in three-dimensional heterogeneous porous media. An interdisciplinary team of mathematicians, computer scientists, hydrologists, and environmental engineers is developing a sophisticated simulation code for use on workstation clusters and MPPs. To date, they have concentrated on modeling flow in the saturated zone (single phase), which requires the solution of a large linear system. They discuss their implementation of preconditioned conjugate gradient solvers. The preconditioners under consideration include simple diagonal scaling, s-step Jacobi, adaptive Chebyshev polynomial preconditioning, and multigrid. They present some preliminary numerical results, including simulations of groundwater flow at the LLNL site, and also demonstrate the code's scalability.
Advances in parallel computer technology for desktop atmospheric dispersion models
Energy Technology Data Exchange (ETDEWEB)
Bian, X.; Ionescu-Niscov, S.; Fast, J.D. [Pacific Northwest National Lab., Richland, WA (United States); Allwine, K.J. [Allwine Environmental Serv., Richland, WA (United States)
1996-12-31
Desktop models are those models used by analysts with varied backgrounds, for performing, for example, air quality assessment and emergency response activities. These models must be robust, well documented, have minimal and well controlled user inputs, and have clear outputs. Existing coarse-grained parallel computers can provide significant increases in computation speed in desktop atmospheric dispersion modeling without considerable increases in hardware cost. This increased speed will allow for significant improvements to be made in the scientific foundations of these applied models, in the form of more advanced diffusion schemes and better representation of the wind and turbulence fields. This is especially attractive for emergency response applications where speed and accuracy are of utmost importance. This paper describes one particular application of coarse-grained parallel computer technology to a desktop complex terrain atmospheric dispersion modeling system. By comparing performance characteristics of the coarse-grained parallel version of the model with the single-processor version, we will demonstrate that applying coarse-grained parallel computer technology to desktop atmospheric dispersion modeling systems will allow us to address critical issues facing future requirements of this class of dispersion models.
Term Structure Models with Parallel and Proportional Shifts
DEFF Research Database (Denmark)
Armerin, Frederik; Björk, Tomas; Astrup Jensen, Bjarne
this general framework we show that there does indeed exist a large variety of nontrivial parallel shift term structure models, and we also describe these in detail. We also show that there exists no nontrivial flat term structure model. The same analysis is repeated for the similar case, where the yield curve...
Towards an Accurate Performance Modeling of Parallel Sparse Factorization
Energy Technology Data Exchange (ETDEWEB)
Grigori, Laura; Li, Xiaoye S.
2006-05-26
We present a performance model to analyze a parallel sparse LU factorization algorithm on modern cache-based, high-end parallel architectures. Our model characterizes the algorithmic behavior by taking into account the underlying processor speed, memory system performance, as well as the interconnect speed. The model is validated using the SuperLU_DIST linear system solver, sparse matrices from real applications, and an IBM POWER3 parallel machine. Our modeling methodology can be easily adapted to study the performance of other types of sparse factorizations, such as Cholesky or QR.
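A generic sketch of this style of performance model (the parameter values below are illustrative assumptions, not the paper's calibrated ones): predicted time is the sum of compute, memory-traffic, and interconnect terms.

```python
# Toy analytic performance model: time = arithmetic + memory traffic
# + per-message latency and bandwidth costs on the interconnect.

def predicted_time(flops, mem_bytes, nmsgs, msg_bytes,
                   flop_rate, mem_bw, latency, net_bw):
    compute = flops / flop_rate                    # seconds of arithmetic
    memory = mem_bytes / mem_bw                    # seconds of memory traffic
    network = nmsgs * (latency + msg_bytes / net_bw)
    return compute + memory + network

t = predicted_time(flops=1e9, mem_bytes=4e8, nmsgs=1000, msg_bytes=8192,
                   flop_rate=1e9, mem_bw=2e9, latency=1e-5, net_bw=1e8)
```

Such additive models ignore overlap between the three terms, so they give an upper-bound flavor of estimate; calibrated models like the one above refine each term per algorithm phase.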
Parallel community climate model: Description and user's guide
Energy Technology Data Exchange (ETDEWEB)
Drake, J.B.; Flanery, R.E.; Semeraro, B.D.; Worley, P.H. [and others]
1996-07-15
This report gives an overview of a parallel version of the NCAR Community Climate Model, CCM2, implemented for MIMD massively parallel computers using a message-passing programming paradigm. The parallel implementation was developed on an Intel iPSC/860 with 128 processors and on the Intel Delta with 512 processors, and the initial target platform for the production version of the code is the Intel Paragon with 2048 processors. Because the implementation uses standard, portable message-passing libraries, the code has been easily ported to other multiprocessors supporting a message-passing programming paradigm. The parallelization strategy used is to decompose the problem domain into geographical patches and assign each processor the computation associated with a distinct subset of the patches. With this decomposition, the physics calculations involve only grid points and data local to a processor and are performed in parallel. Using parallel algorithms developed for the semi-Lagrangian transport, the fast Fourier transform and the Legendre transform, both physics and dynamics are computed in parallel with minimal data movement and modest change to the original CCM2 source code. Sequential or parallel history tapes are written and input files (in history tape format) are read sequentially by the parallel code to promote compatibility with production use of the model on other computer systems. A validation exercise has been performed with the parallel code and is detailed along with some performance numbers on the Intel Paragon and the IBM SP2. A discussion of reproducibility of results is included. A user's guide for PCCM2 version 2.1 on the various parallel machines completes the report. Procedures for compilation, setup and execution are given. A discussion of code internals is included for those who may wish to modify and use the program in their own research.
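The geographical-patch decomposition described in this abstract can be sketched as follows (grid and processor counts are illustrative, not PCCM2's actual configuration): the lat-lon grid is split into rectangular patches and each patch is owned by one processor, so physics at a grid point touches only locally owned data.

```python
# Split an nlat x nlon grid into plat x plon rectangular patches given as
# half-open index ranges; one patch per processor.

def make_patches(nlat, nlon, plat, plon):
    patches = []
    for i in range(plat):
        for j in range(plon):
            lat0, lat1 = i * nlat // plat, (i + 1) * nlat // plat
            lon0, lon1 = j * nlon // plon, (j + 1) * nlon // plon
            patches.append(((lat0, lat1), (lon0, lon1)))
    return patches

patches = make_patches(64, 128, 4, 8)          # 32 processors, one patch each
```

The integer-division bounds guarantee the patches tile the grid exactly even when the grid size is not divisible by the processor count.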
Modelling parallel programs and multiprocessor architectures with AXE
Yan, Jerry C.; Fineman, Charles E.
1991-01-01
AXE, An Experimental Environment for Parallel Systems, was designed to model and simulate parallel systems at the process level. It provides an integrated environment for specifying computation models, multiprocessor architectures, data collection, and performance visualization. AXE is being used at NASA-Ames for developing resource management strategies, parallel problem formulation, multiprocessor architectures, and operating system issues related to the High Performance Computing and Communications Program. AXE's simple, structured user interface enables the user to model parallel programs and machines precisely and efficiently. Its quick turn-around time keeps the user interested and productive. AXE models multicomputers. The user may easily modify various architectural parameters including the number of sites, connection topologies, and overhead for operating system activities. Parallel computations in AXE are represented as collections of autonomous computing objects known as players. Their use and behavior are described. Performance data of the multiprocessor model can be observed on a color screen. These include CPU and message routing bottlenecks, and the dynamic status of the software.
Parallelizing the Cellular Potts Model on graphics processing units
Tapia, José Juan; D'Souza, Roshan M.
2011-04-01
The Cellular Potts Model (CPM) is a lattice based modeling technique used for simulating cellular structures in computational biology. The computational complexity of the model means that current serial implementations restrict the size of simulation to a level well below biological relevance. Parallelization on computing clusters enables scaling the size of the simulation but marginally addresses computational speed due to the limited memory bandwidth between nodes. In this paper we present new data-parallel algorithms and data structures for simulating the Cellular Potts Model on graphics processing units. Our implementations handle most terms in the Hamiltonian, including cell-cell adhesion constraint, cell volume constraint, cell surface area constraint, and cell haptotaxis. We use fine level checkerboards with lock mechanisms using atomic operations to enable consistent updates while maintaining a high level of parallelism. A new data-parallel memory allocation algorithm has been developed to handle cell division. Tests show that our implementation enables simulations of >10 cells with lattice sizes of up to 256³ on a single graphics card. Benchmarks show that our implementation runs ~80× faster than serial implementations, and ~5× faster than previous parallel implementations on computing clusters consisting of 25 nodes. The wide availability and economy of graphics cards mean that our techniques will enable simulation of realistically sized models at a fraction of the time and cost of previous implementations and are expected to greatly broaden the scope of CPM applications.
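The fine-level checkerboard mentioned in this abstract can be sketched as follows (lattice size illustrative): sites are grouped so that no two same-colored sites are 4-neighbours, so every site of the active color can attempt its Monte Carlo flip concurrently without two updates touching adjacent sites.

```python
# Checkerboard partition of an n x n lattice into two color groups; only one
# group is "active" per update phase, so concurrent flips never conflict.

def checkerboard(n):
    groups = {0: [], 1: []}
    for i in range(n):
        for j in range(n):
            groups[(i + j) % 2].append((i, j))
    return groups

groups = checkerboard(8)
```

On a GPU, each active-color site maps to one thread; the paper's additional atomic locks handle the coarser conflicts (e.g. two flips modifying the same cell's volume), which this sketch does not model.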
Towards a streaming model for nested data parallelism
DEFF Research Database (Denmark)
Madsen, Frederik Meisner; Filinski, Andrzej
2013-01-01
-flattening execution strategy, comes at the price of potentially prohibitive space usage in the common case of computations with an excess of available parallelism, such as dense-matrix multiplication. We present a simple nested data-parallel functional language and associated cost semantics that retains NESL......'s intuitive work--depth model for time complexity, but also allows highly parallel computations to be expressed in a space-efficient way, in the sense that memory usage on a single (or a few) processors is of the same order as for a sequential formulation of the algorithm, and in general scales smoothly......-processable in a streaming fashion. This semantics is directly compatible with previously proposed piecewise execution models for nested data parallelism, but allows the expected space usage to be reasoned about directly at the source-language level. The language definition and implementation are still very much work...
Optimisation of a parallel ocean general circulation model
Directory of Open Access Journals (Sweden)
M. I. Beare
This paper presents the development of a general-purpose parallel ocean circulation model, for use on a wide range of computer platforms, from traditional scalar machines to workstation clusters and massively parallel processors. Parallelism is provided, as a modular option, via high-level message-passing routines, thus hiding the technical intricacies from the user. An initial implementation highlights that the parallel efficiency of the model is adversely affected by a number of factors, for which optimisations are discussed and implemented. The resulting ocean code is portable and, in particular, allows science to be achieved on local workstations that could otherwise only be undertaken on state-of-the-art supercomputers.
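Message-passing ocean codes of the kind described in this abstract typically split the domain into subdomains bordered by "halo" cells that mirror neighbouring subdomains. A minimal serial sketch of a 1-D halo exchange with periodic boundaries (sizes are illustrative, not this model's):

```python
# Each subdomain is [halo, interior..., halo]. An exchange copies every
# neighbour's edge interior cell into the local halo cell.

def exchange_halos(subdomains):
    p = len(subdomains)
    for k, sub in enumerate(subdomains):
        left, right = subdomains[(k - 1) % p], subdomains[(k + 1) % p]
        sub[0] = left[-2]     # halo <- left neighbour's last interior cell
        sub[-1] = right[1]    # halo <- right neighbour's first interior cell
    return subdomains

# 3 subdomains, each with 1 halo + 3 interior + 1 halo cells
subs = [[0, 1, 2, 3, 0], [0, 4, 5, 6, 0], [0, 7, 8, 9, 0]]
subs = exchange_halos(subs)
```

In the real code each copy would be a pair of message-passing sends/receives, and minimizing how often this step stalls the computation is exactly the kind of optimisation the abstract refers to.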
Financial Data Modeling by Using Asynchronous Parallel Evolutionary Algorithms
Institute of Scientific and Technical Information of China (English)
Wang Chun; Li Qiao-yun
2003-01-01
In this paper, high-level knowledge of financial data, modeled by ordinary differential equations (ODEs), is discovered in dynamic data using an asynchronous parallel evolutionary modeling algorithm (APHEMA). A numerical example of Nasdaq index analysis demonstrates the potential of APHEMA. The results show that the dynamic models automatically discovered in the data by computer can be used to predict financial trends.
Advanced parallel programming models research and development opportunities.
Energy Technology Data Exchange (ETDEWEB)
Wen, Zhaofang.; Brightwell, Ronald Brian
2004-07-01
There is currently a large research and development effort within the high-performance computing community on advanced parallel programming models. This research can potentially have an impact on parallel applications, system software, and computing architectures in the next several years. Given Sandia's expertise and unique perspective in these areas, particularly on very large-scale systems, there are many areas in which Sandia can contribute to this effort. This technical report provides a survey of past and present parallel programming model research projects and provides a detailed description of the Partitioned Global Address Space (PGAS) programming model. The PGAS model may offer several improvements over the traditional distributed memory message passing model, which is the dominant model currently being used at Sandia. This technical report discusses these potential benefits and outlines specific areas where Sandia's expertise could contribute to current research activities. In particular, we describe several projects in the areas of high-performance networking, operating systems and parallel runtime systems, compilers, application development, and performance evaluation.
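The PGAS idea surveyed in this report can be sketched in a few lines (this is a conceptual toy, not any specific PGAS library's API): a global array is partitioned into per-rank segments, a global index resolves to (owning rank, local offset), and remote accesses become one-sided reads or writes against the owner's segment.

```python
# Toy partitioned global address space: a global array of n elements split
# into equal blocks across nranks simulated ranks.

class PGASArray:
    def __init__(self, n, nranks):
        self.n, self.nranks = n, nranks
        self.block = (n + nranks - 1) // nranks      # ceil(n / nranks)
        self.segments = [[0] * self.block for _ in range(nranks)]

    def owner(self, i):
        """Translate a global index into (owning rank, local offset)."""
        return i // self.block, i % self.block

    def put(self, i, value):                         # one-sided write
        r, off = self.owner(i)
        self.segments[r][off] = value

    def get(self, i):                                # one-sided read
        r, off = self.owner(i)
        return self.segments[r][off]

a = PGASArray(100, nranks=4)
a.put(42, 7)
```

The appeal over plain message passing, as the report discusses, is that the programmer addresses data by global index while still seeing (and being able to exploit) which accesses are local and which are remote.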
Parallelization of the NASA Goddard Cumulus Ensemble Model for Massively Parallel Computing
Directory of Open Access Journals (Sweden)
Hann-Ming Henry Juang
2007-01-01
Massively parallel computing, using a message passing interface (MPI), has been implemented into a three-dimensional version of the Goddard Cumulus Ensemble (GCE) model. The implementation uses the domain-resemble concept to design a code structure for both the whole domain and the sub-domains after decomposition. Instead of inserting groups of MPI-related statements into the model routines, these statements are packed into a single routine, and only a single call statement is added at each place in the model code, so there is minimal impact on the original code. Therefore, the model is easily modified and/or managed by model developers and users who have little knowledge of massively parallel computing.
Badlands: A parallel basin and landscape dynamics model
Directory of Open Access Journals (Sweden)
T. Salles
2016-01-01
Over more than three decades, a number of numerical landscape evolution models (LEMs) have been developed to study the combined effects of climate, sea-level, tectonics and sediments on Earth surface dynamics. Most of them are written in efficient programming languages, but often cannot be used on parallel architectures. Here, I present a LEM which ports a common core of accepted physical principles governing landscape evolution into a distributed memory parallel environment. Badlands (acronym for BAsin anD LANdscape DynamicS) is an open-source, flexible, TIN-based landscape evolution model, built to simulate topography development at various space and time scales.
Genetic Algorithm Modeling with GPU Parallel Computing Technology
Cavuoti, Stefano; Brescia, Massimo; Pescapé, Antonio; Longo, Giuseppe; Ventre, Giorgio
2012-01-01
We present a multi-purpose genetic algorithm, designed and implemented with GPGPU/CUDA parallel computing technology. The model is derived from a multi-core CPU serial implementation, named GAME, that has already been scientifically tested and validated on massive astrophysical data classification problems through a web application resource (DAMEWARE) specialized in data mining based on machine learning paradigms. Since genetic algorithms are inherently parallel, the GPGPU computing paradigm allows the internal training features of the model to be exploited, permitting strong optimization in terms of processing performance and scalability.
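Genetic algorithms are "inherently parallel" mainly in fitness evaluation: every individual is scored independently. A minimal sketch with a thread pool standing in for CUDA threads (the fitness function here is a toy of my own, not GAME's):

```python
# Score every individual of a population concurrently; each evaluation is
# independent, which is what makes the step embarrassingly parallel.

from concurrent.futures import ThreadPoolExecutor

def fitness(individual):
    return sum(individual)            # toy objective: sum of genes

def evaluate_population(pop):
    with ThreadPoolExecutor() as pool:
        return list(pool.map(fitness, pop))

scores = evaluate_population([[1, 2], [3, 4], [5, 6]])
```

On a GPU the same structure maps one individual (or one gene) per thread; with CPU threads in Python a real speedup would additionally require releasing the GIL or using processes, so this sketch shows only the structure of the parallelism.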
Inverse kinematics model of parallel macro-micro manipulator system
Institute of Scientific and Technical Information of China (English)
[Anonymous]
2000-01-01
An improved design, which employs the integration of optical, mechanical and electronic technologies for the next generation large radio telescope, is presented in this note. The authors propose the concept of a parallel macro-micro manipulator system for the feed support structure, with a rough tuning subsystem based on a cable structure and a fine tuning subsystem based on the Stewart platform. According to the requirements of astronomical observation, the inverse kinematics model of this parallel macro-micro manipulator system is deduced. This inverse kinematics model is necessary for the computer-controlled motion of the feed.
A hybrid parallel framework for the cellular Potts model simulations
Energy Technology Data Exchange (ETDEWEB)
Jiang, Yi [Los Alamos National Laboratory]; He, Kejing [South China Univ.]; Dong, Shoubin [South China Univ.]
2009-01-01
The Cellular Potts Model (CPM) has been widely used for biological simulations. However, most current implementations are either sequential or approximate, and cannot be used for large-scale, complex 3D simulation. In this paper we present a hybrid parallel framework for CPM simulations. The time-consuming PDE solving, cell division, and cell reaction operations are distributed to clusters using the Message Passing Interface (MPI). The Monte Carlo lattice update is parallelized on shared-memory SMP systems using OpenMP. Because the Monte Carlo lattice update is much faster than the PDE solving, and SMP systems are increasingly common, this hybrid approach achieves good performance and high accuracy at the same time. Based on the parallel Cellular Potts Model, we studied avascular tumor growth using a multiscale model. The application and performance analysis show that the hybrid parallel framework is quite efficient. The hybrid parallel CPM can be used for large-scale simulation (~10⁸ sites) of the complex collective behavior of numerous cells (~10⁶).
Numerical modeling of parallel-plate based AMR
DEFF Research Database (Denmark)
In this work we present an improved 2-dimensional numerical model of a parallel-plate based AMR. The model includes heat transfer in fluid and magnetocaloric domains respectively. The domains are coupled via inner thermal boundaries. The MCE is modeled either as an instantaneous change between high...... and low field or as a magnetic field profile including the actual physical movement of the regenerator block in and out of field, i.e. as a source term in the thermal equation for the magnetocaloric material (MCM). The model is further developed to include parasitic thermal losses throughout the bed...
Parallelization of a hydrological model using the message passing interface
Wu, Yiping; Li, Tiejian; Sun, Liqun; Chen, Ji
2013-01-01
With the increasing knowledge about the natural processes, hydrological models such as the Soil and Water Assessment Tool (SWAT) are becoming larger and more complex with increasing computation time. Additionally, other procedures such as model calibration, which may require thousands of model iterations, can increase running time and thus further reduce rapid modeling and analysis. Using the widely-applied SWAT as an example, this study demonstrates how to parallelize a serial hydrological model in a Windows® environment using a parallel programming technology—Message Passing Interface (MPI). With a case study, we derived the optimal values for the two parameters (the number of processes and the corresponding percentage of work to be distributed to the master process) of the parallel SWAT (P-SWAT) on an ordinary personal computer and a work station. Our study indicates that model execution time can be reduced by 42%–70% (or a speedup of 1.74–3.36) using multiple processes (two to five) with a proper task-distribution scheme (between the master and slave processes). Although the computation time cost becomes lower with an increasing number of processes (from two to five), this enhancement becomes less due to the accompanied increase in demand for message passing procedures between the master and all slave processes. Our case study demonstrates that the P-SWAT with a five-process run may reach the maximum speedup, and the performance can be quite stable (fairly independent of a project size). Overall, the P-SWAT can help reduce the computation time substantially for an individual model run, manual and automatic calibration procedures, and optimization of best management practices. In particular, the parallelization method we used and the scheme for deriving the optimal parameters in this study can be valuable and easily applied to other hydrological or environmental models.
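The two tuned parameters in this abstract (process count and the master's share of work) can be explored with a toy cost model. This is my own assumed formula, not the paper's: runtime is the slowest process's compute time plus a message-passing overhead that grows with the number of slave processes.

```python
# Hypothetical cost model: pick the master's work fraction that balances
# the master against the slaves, given a fixed per-slave messaging cost.

def runtime(total_work, nproc, master_frac, msg_cost=0.5):
    master = total_work * master_frac
    slave = total_work * (1.0 - master_frac) / (nproc - 1)
    return max(master, slave) + msg_cost * (nproc - 1)

def best_master_frac(total_work, nproc):
    fracs = [f / 100.0 for f in range(1, 100)]
    return min(fracs, key=lambda f: runtime(total_work, nproc, f))

f = best_master_frac(100.0, 5)
```

Under this model the optimum is simply an even split (each of the five processes does a fifth of the work); the paper's measured optimum differs because the master also spends time coordinating, which is exactly why the parameter is worth calibrating empirically.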
Performance of Air Pollution Models on Massively Parallel Computers
DEFF Research Database (Denmark)
Brown, John; Hansen, Per Christian; Wasniewski, Jerzy
1996-01-01
To compare the performance and use of three massively parallel SIMD computers, we implemented a large air pollution model on the computers. Using a realistic large-scale model, we gain detailed insight about the performance of the three computers when used to solve large-scale scientific problems...... that involve several types of numerical computations. The computers considered in our study are the Connection Machines CM-200 and CM-5, and the MasPar MP-2216...
Parallel finite element modeling of earthquake ground response and liquefaction
Institute of Scientific and Technical Information of China (English)
Jinchi Lu(陆金池); Jun Peng(彭军); Ahmed Elgamal; Zhaohui Yang(杨朝晖); Kincho H. Law
2004-01-01
Parallel computing is a promising approach to alleviate the computational demand in conducting large-scale finite element analyses. This paper presents a numerical modeling approach for earthquake ground response and liquefaction using the parallel nonlinear finite element program, ParCYCLIC, designed for distributed-memory message-passing parallel computer systems. In ParCYCLIC, finite elements are employed within an incremental plasticity, coupled solid-fluid formulation. A constitutive model calibrated by physical tests represents the salient characteristics of sand liquefaction and associated accumulation of shear deformations. Key elements of the computational strategy employed in ParCYCLIC include the development of a parallel sparse direct solver, the deployment of an automatic domain decomposer, and the use of the Multilevel Nested Dissection algorithm for ordering of the finite element nodes. Simulation results of centrifuge test models using ParCYCLIC are presented. Performance results from grid models and geotechnical simulations show that ParCYCLIC is efficiently scalable to a large number of processors.
The Extended Parallel Process Model: Illuminating the Gaps in Research
Popova, Lucy
2012-01-01
This article examines constructs, propositions, and assumptions of the extended parallel process model (EPPM). Review of the EPPM literature reveals that its theoretical concepts are thoroughly developed, but the theory lacks consistency in operational definitions of some of its constructs. Out of the 12 propositions of the EPPM, a few have not…
Postscript: Parallel Distributed Processing in Localist Models without Thresholds
Plaut, David C.; McClelland, James L.
2010-01-01
The current authors reply to a response by Bowers on a comment by the current authors on the original article. Bowers (2010) mischaracterizes the goals of parallel distributed processing (PDP research)--explaining performance on cognitive tasks is the primary motivation. More important, his claim that localist models, such as the interactive…
Methods and models for the construction of weakly parallel tests
Adema, Jos J.
1992-01-01
Several methods are proposed for the construction of weakly parallel tests [i.e., tests with the same test information function (TIF)]. A mathematical programming model that constructs tests containing a prespecified TIF and a heuristic that assigns items to tests with information functions that are
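A toy version of the assignment idea (my own greedy sketch, not Adema's mathematical-programming model): items, summarized here by a single information value each, go one at a time to whichever test form currently has the least accumulated information, so the resulting test information functions end up approximately equal, i.e. "weakly parallel".

```python
# Greedy assignment of items to ntests test forms so that accumulated
# information totals stay balanced.

def assign_items(item_infos, ntests):
    tests = [[] for _ in range(ntests)]
    totals = [0.0] * ntests
    for info in sorted(item_infos, reverse=True):   # biggest items first
        k = totals.index(min(totals))               # least-filled test so far
        tests[k].append(info)
        totals[k] += info
    return tests, totals

tests, totals = assign_items([0.9, 0.8, 0.7, 0.6, 0.5, 0.5, 0.4, 0.3], 2)
```

A real TIF is a function of ability level rather than a single number, so the published methods balance the curves at several ability points; the greedy structure carries over point-wise.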
Parallel Computation of the Regional Ocean Modeling System (ROMS)
Energy Technology Data Exchange (ETDEWEB)
Wang, P; Song, Y T; Chao, Y; Zhang, H
2005-04-05
The Regional Ocean Modeling System (ROMS) is a regional ocean general circulation modeling system solving the free surface, hydrostatic, primitive equations over varying topography. It is free software distributed world-wide for studying both complex coastal ocean problems and the basin-to-global scale ocean circulation. The original ROMS code could only be run on shared-memory systems. With the increasing need to simulate larger model domains with finer resolutions and on a variety of computer platforms, there is a need in the ocean-modeling community to have a ROMS code that can be run on any parallel computer ranging from 10 to hundreds of processors. Recently, we have explored parallelization for ROMS using the MPI programming model. In this paper, an efficient parallelization strategy for such a large-scale scientific software package, based on an existing shared-memory computing model, is presented. In addition, scientific applications and data-performance issues on a couple of SGI systems, including Columbia, the world's third-fastest supercomputer, are discussed.
Exploitation of parallelism in climate models. Final report
Energy Technology Data Exchange (ETDEWEB)
Baer, Ferdinand; Tribbia, Joseph J.; Williamson, David L.
2001-02-05
This final report includes details on the research accomplished by the grant entitled 'Exploitation of Parallelism in Climate Models' to the University of Maryland. The purpose of the grant was to shed light on (a) how to reconfigure the atmospheric prediction equations such that the time iteration process could be compressed by use of MPP architecture; (b) how to develop local subgrid scale models which can provide time and space dependent parameterization for a state-of-the-art climate model to minimize the scale resolution necessary for a climate model, and to utilize MPP capability to simultaneously integrate those subgrid models and their statistics; and (c) how to capitalize on the MPP architecture to study the inherent ensemble nature of the climate problem. In the process of addressing these issues, we created parallel algorithms with spectral accuracy; we developed a process for concurrent climate simulations; we established suitable model reconstructions to speed up computation; we identified and tested optimum realization statistics; we undertook a number of parameterization studies to better understand model physics; and we studied the impact of subgrid scale motions and their parameterization in atmospheric models.
Parallelization of MATLAB for Euro50 integrated modeling
Browne, Michael; Andersen, Torben E.; Enmark, Anita; Moraru, Dan; Shearer, Andrew
2004-09-01
MATLAB and its companion product Simulink are commonly used tools in systems modelling and other scientific disciplines. A cross-disciplinary integrated MATLAB model is used to study the overall performance of the proposed 50 m optical and infrared telescope, Euro50. However, the computational requirements of this kind of end-to-end simulation of the telescope's behaviour exceed the capability of an individual contemporary personal computer. By parallelizing the model, primarily on a functional basis, it can be implemented across a Beowulf cluster of generic PCs. This requires MATLAB to distribute data and calculations to the cluster nodes in some way and to combine the completed results. There have been a number of attempts to produce toolkits that allow MATLAB to be used in a parallel fashion, employing a variety of techniques. Here we present findings from using some of these toolkits, together with proposed advances.
Modeling and optimization of parallel and distributed embedded systems
Munir, Arslan; Ranka, Sanjay
2016-01-01
This book introduces the state-of-the-art in research in parallel and distributed embedded systems, which have been enabled by developments in silicon technology, micro-electro-mechanical systems (MEMS), wireless communications, computer networking, and digital electronics. These systems have diverse applications in domains including military and defense, medical, automotive, and unmanned autonomous vehicles. The emphasis of the book is on the modeling and optimization of emerging parallel and distributed embedded systems in relation to the three key design metrics of performance, power and dependability.
X: A Comprehensive Analytic Model for Parallel Machines
Energy Technology Data Exchange (ETDEWEB)
Li, Ang; Song, Shuaiwen; Brugel, Eric; Kumar, Akash; Chavarría-Miranda, Daniel; Corporaal, Henk
2016-05-23
To keep pace with Moore's Law, modern parallel machines have become increasingly complex, and effectively tuning application performance for them is therefore a daunting task. Moreover, identifying performance bottlenecks at the application and architecture levels, and evaluating various optimization strategies, becomes extremely difficult when numerous correlated factors are entangled. To tackle these challenges, we present a visual analytical model named "X". It is intuitive and sufficiently flexible to track all the typical features of a parallel machine.
Parallel Optimization of 3D Cardiac Electrophysiological Model Using GPU
Directory of Open Access Journals (Sweden)
Yong Xia
2015-01-01
Large-scale 3D virtual heart model simulations are highly demanding in computational resources. This imposes a big challenge for traditional CPU-based computing environments, which either cannot meet the full computational demand or are not easily available due to expensive costs. GPUs as a parallel computing environment therefore provide an alternative for solving the large-scale computational problems of whole-heart modeling. In this study, using a 3D sheep atrial model as a test bed, we developed a GPU-based simulation algorithm to simulate the conduction of electrical excitation waves in the 3D atria. In the GPU algorithm, the multicellular tissue model was split into two components: the single-cell model (a system of ordinary differential equations) and the diffusion term of the monodomain model (a partial differential equation). Such a decoupling enabled realization of the GPU parallel algorithm. Furthermore, several optimization strategies were proposed based on the features of the virtual heart model, which enabled a 200-fold speedup as compared to a CPU implementation. In conclusion, an optimized GPU algorithm has been developed that provides an economic and powerful platform for 3D whole-heart simulations.
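The ODE/PDE decoupling described in this abstract can be sketched as an operator-split update. The sketch below uses FitzHugh-Nagumo kinetics on a 1D fiber as an illustrative stand-in, not the paper's detailed sheep atrial cell model; all parameter values are assumptions chosen for numerical stability.

```python
import numpy as np

def step(v, w, dt=0.05, dx=1.0, D=0.1, a=0.7, b=0.8, eps=0.08, I=0.5):
    """One operator-split step of a toy 1D monodomain model.

    Part 1 (per-cell ODEs, embarrassingly parallel): FitzHugh-Nagumo
    kinetics -- a stand-in for a detailed atrial cell model; on a GPU
    each cell would be one thread.
    Part 2 (PDE coupling): explicit finite-difference diffusion.
    """
    # ODE part: every cell updates independently of its neighbours.
    dv = v - v**3 / 3.0 - w + I
    dw = eps * (v + a - b * w)
    v = v + dt * dv
    w = w + dt * dw
    # PDE part: diffusion couples neighbouring cells (no-flux boundaries).
    lap = np.zeros_like(v)
    lap[1:-1] = (v[2:] - 2 * v[1:-1] + v[:-2]) / dx**2
    lap[0] = (v[1] - v[0]) / dx**2
    lap[-1] = (v[-2] - v[-1]) / dx**2
    v = v + dt * D * lap
    return v, w

v = np.zeros(100); v[:5] = 1.0   # stimulate one end of the fiber
w = np.zeros(100)
for _ in range(200):
    v, w = step(v, w)
```

Because the ODE part touches only local state, it maps directly onto one-thread-per-cell GPU kernels, while the diffusion stencil is the only step that needs neighbour communication.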
Exploration Of Deep Learning Algorithms Using Openacc Parallel Programming Model
Hamam, Alwaleed A.
2017-03-13
Deep learning is based on a set of algorithms that attempt to model high-level abstractions in data. In this project, the Restricted Boltzmann Machine (RBM), a deep learning algorithm, is accelerated through an efficient parallel implementation using the OpenACC tool, with the best possible optimizations applied to harness the massively parallel power of NVIDIA GPUs. GPU development in the last few years has contributed to the growth of deep learning. OpenACC is a directive-based approach to computing in which directives provide compiler hints to accelerate code. The traditional Restricted Boltzmann Machine is a stochastic neural network that essentially performs a binary version of factor analysis. The RBM is a useful neural-network building block for larger modern deep learning models, such as the Deep Belief Network. RBM parameters are estimated using an efficient training method called Contrastive Divergence. Parallel implementations of the RBM are available using other models such as OpenMP and CUDA, but this project is the first attempt to apply the OpenACC model to the RBM.
Distributed parallel computing in stochastic modeling of groundwater systems.
Dong, Yanhui; Li, Guomin; Xu, Haizhen
2013-03-01
Stochastic modeling is a rapidly evolving, popular approach to the study of the uncertainty and heterogeneity of groundwater systems. However, the use of Monte Carlo-type simulations to solve practical groundwater problems often encounters computational bottlenecks that hinder the acquisition of meaningful results. To improve the computational efficiency, a system that combines stochastic model generation with MODFLOW-related programs and distributed parallel processing is investigated. The distributed computing framework, called the Java Parallel Processing Framework, is integrated into the system to allow the batch processing of stochastic models in distributed and parallel systems. As an example, the system is applied to the stochastic delineation of well capture zones in the Pinggu Basin in Beijing. Through the use of 50 processing threads on a cluster with 10 multicore nodes, the execution times of 500 realizations are reduced to 3% compared with those of a serial execution. Through this application, the system demonstrates its potential in solving difficult computational problems in practical stochastic modeling.
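The batch structure described here, in which independent Monte Carlo realizations are farmed out to a pool of workers, can be sketched as follows. The `run_realization` body is a hypothetical stand-in for a MODFLOW-based capture-zone run, and threads stand in for the cluster nodes the paper drives through the Java Parallel Processing Framework.

```python
from concurrent.futures import ThreadPoolExecutor
import random

def run_realization(seed):
    """Stand-in for one stochastic model realization (in the paper, a
    MODFLOW-related solve); here it just averages a synthetic lognormal
    conductivity field.  The field and statistic are illustrative only."""
    rng = random.Random(seed)
    field = [rng.lognormvariate(0.0, 1.0) for _ in range(1000)]
    return sum(field) / len(field)

def run_batch(n_realizations, workers=5):
    # Realizations are independent, so they can be dispatched to a pool;
    # the paper's system does the same over 50 threads on 10 cluster nodes.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(run_realization, range(n_realizations)))

results = run_batch(50)
```

Each realization carries its own seed, so the batch is reproducible regardless of how the pool schedules the work.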
A systemic approach for modeling biological evolution using Parallel DEVS.
Heredia, Daniel; Sanz, Victorino; Urquia, Alfonso; Sandín, Máximo
2015-08-01
A new model for studying the evolution of living organisms is proposed in this manuscript. The proposed model is based on a non-neodarwinian systemic approach. The model is focused on considering several controversies and open discussions about modern evolutionary biology. Additionally, a simplification of the proposed model, named EvoDEVS, has been mathematically described using the Parallel DEVS formalism and implemented as a computer program using the DEVSLib Modelica library. EvoDEVS serves as an experimental platform to study different conditions and scenarios by means of computer simulations. Two preliminary case studies are presented to illustrate the behavior of the model and validate its results. EvoDEVS is freely available at http://www.euclides.dia.uned.es.
Dynamic modeling of flexible-links planar parallel robots
Institute of Scientific and Technical Information of China (English)
2008-01-01
This paper presents a finite element-based method for dynamic modeling of parallel robots with flexible links and a rigid moving platform. The elastic displacements of the flexible links are investigated while considering the coupling effects between links due to structural flexibility. The kinematic and dynamic constraint conditions for elastic displacements are presented. Considering the effects of distributed mass, lumped mass, shearing deformation, bending deformation, tensile deformation and lateral displacements, the Kineto-Elasto Dynamics (KED) theory and the Lagrange formula are used to derive the dynamic equations of planar flexible-link parallel robots. The dynamic behavior of a flexible-link planar parallel robot is illustrated through numerical simulation of a planar 3-RRR parallel robot. The numerical simulation results show good agreement with those of the finite element software SAMCEF, confirming the proposed method. The flexibility of the links is demonstrated to have a significant impact on the position and orientation errors of flexible-link planar parallel robots.
Accuracy Improvement for Stiffness Modeling of Parallel Manipulators
Pashkevich, Anatoly; Chablat, Damien; Wenger, Philippe
2009-01-01
The paper focuses on improving the accuracy of stiffness models for parallel manipulators employed in high-speed precision machining. It is based on an integrated methodology that combines analytical and numerical techniques and deals with multidimensional lumped-parameter models of the links. The latter replace link flexibility with localized 6-dof virtual springs describing both translational/rotational compliance and the coupling between them. A detailed accuracy analysis of the stiffness identification procedures employed in commercial CAD systems is presented, including a statistical analysis of round-off errors and an evaluation of confidence intervals for the stiffness matrices. The efficiency of the developed technique is confirmed by application examples dealing with the stiffness analysis of translational parallel manipulators.
Center for Programming Models for Scalable Parallel Computing
Energy Technology Data Exchange (ETDEWEB)
John Mellor-Crummey
2008-02-29
Rice University's achievements as part of the Center for Programming Models for Scalable Parallel Computing include: (1) design and implementation of cafc, the first multi-platform CAF compiler for distributed and shared-memory machines, (2) performance studies of the efficiency of programs written using the CAF and UPC programming models, (3) a novel technique to analyze explicitly-parallel SPMD programs that facilitates optimization, (4) design, implementation, and evaluation of new language features for CAF, including communication topologies, multi-version variables, and distributed multithreading to simplify development of high-performance codes in CAF, and (5) a synchronization strength reduction transformation for automatically replacing barrier-based synchronization with more efficient point-to-point synchronization. The prototype Co-array Fortran compiler cafc developed in this project is available as open source software from http://www.hipersoft.rice.edu/caf.
Final Report: Center for Programming Models for Scalable Parallel Computing
Energy Technology Data Exchange (ETDEWEB)
Mellor-Crummey, John [William Marsh Rice University]
2011-09-13
As part of the Center for Programming Models for Scalable Parallel Computing, Rice University collaborated with project partners in the design, development and deployment of language, compiler, and runtime support for parallel programming models to support application development for the “leadership-class” computer systems at DOE national laboratories. Work over the course of this project has focused on the design, implementation, and evaluation of a second-generation version of Coarray Fortran. Research and development efforts of the project have focused on the CAF 2.0 language, compiler, runtime system, and supporting infrastructure. This has involved working with the teams that provide infrastructure for CAF that we rely on, implementing new language and runtime features, producing an open source compiler that enabled us to evaluate our ideas, and evaluating our design and implementation through the use of benchmarks. The report details the research, development, findings, and conclusions from this work.
Load-balancing algorithms for the parallel community climate model
Energy Technology Data Exchange (ETDEWEB)
Foster, I.T.; Toonen, B.R.
1995-01-01
Implementations of climate models on scalable parallel computer systems can suffer from load imbalances resulting from temporal and spatial variations in the amount of computation required for physical parameterizations such as solar radiation and convective adjustment. We have developed specialized techniques for correcting such imbalances. These techniques are incorporated in a general-purpose, programmable load-balancing library that allows the mapping of computation to processors to be specified as a series of maps generated by a programmer-supplied load-balancing module. The communication required to move from one map to another is performed automatically by the library, without programmer intervention. In this paper, we describe the load-balancing problem and the techniques that we have developed to solve it. We also describe specific load-balancing algorithms that we have developed for PCCM2, a scalable parallel implementation of the Community Climate Model, and present experimental results that demonstrate the effectiveness of these algorithms on parallel computers. The load-balancing library developed in this work is available for use in other climate models.
Efficient Parallel Statistical Model Checking of Biochemical Networks
Directory of Open Access Journals (Sweden)
Paolo Ballarini
2009-12-01
We consider the problem of verifying stochastic models of biochemical networks against behavioral properties expressed in temporal logic. Exact probabilistic verification approaches, such as CSL/PCTL model checking, are undermined by a huge computational demand that rules them out for most real case studies. Less demanding approaches, such as statistical model checking, estimate the likelihood that a property is satisfied by sampling executions from the stochastic model. We propose a methodology for efficiently estimating the likelihood that an LTL property P holds for a stochastic model of a biochemical network. As with other statistical verification techniques, the proposed methodology uses a stochastic simulation algorithm for generating execution samples; however, three key aspects improve its efficiency. First, sample generation is driven by on-the-fly verification of P, which results in optimal overall simulation time. Second, the confidence-interval estimate for the probability that P holds is based on an efficient variant of the Wilson method, which ensures faster convergence. Third, the whole methodology is designed in a parallel fashion, and a prototype software tool has been implemented that performs the sampling/verification process in parallel on an HPC architecture.
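The Wilson confidence interval referenced above can be written down directly. This is the plain textbook formula; the paper uses an efficient variant of it, so treat this as background rather than the authors' exact method.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for the probability that a property holds,
    given `successes` satisfying runs out of `n` sampled executions.
    z = 1.96 corresponds to 95% confidence."""
    p = successes / n
    denom = 1.0 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half

# 50 of 100 sampled runs satisfied the property:
lo, hi = wilson_interval(50, 100)
```

Unlike the naive normal approximation, the Wilson interval stays inside [0, 1] and behaves sensibly for small sample counts, which is what makes it attractive when each sample is an expensive simulation run.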
cellGPU: Massively parallel simulations of dynamic vertex models
Sussman, Daniel M.
2017-10-01
Vertex models represent confluent tissue by polygonal or polyhedral tilings of space, with the individual cells interacting via force laws that depend on both the geometry of the cells and the topology of the tessellation. This dependence on the connectivity of the cellular network introduces several complications to performing molecular-dynamics-like simulations of vertex models, and in particular makes parallelizing the simulations difficult. cellGPU addresses this difficulty and lays the foundation for massively parallelized, GPU-based simulations of these models. This article discusses its implementation for a pair of two-dimensional models, and compares the typical performance that can be expected between running cellGPU entirely on the CPU versus its performance when running on a range of commercial and server-grade graphics cards. By implementing the calculation of topological changes and forces on cells in a highly parallelizable fashion, cellGPU enables researchers to simulate time- and length-scales previously inaccessible via existing single-threaded CPU implementations.
Program Files doi: http://dx.doi.org/10.17632/6j2cj29t3r.1
Licensing provisions: MIT
Programming language: CUDA/C++
Nature of problem: Simulations of off-lattice "vertex models" of cells, in which the interaction forces depend on both the geometry and the topology of the cellular aggregate.
Solution method: Highly parallelized GPU-accelerated dynamical simulations in which the force calculations and the topological features can be handled on either the CPU or GPU.
Additional comments: The code is hosted at https://gitlab.com/dmsussman/cellGPU, with documentation additionally maintained at http://dmsussman.gitlab.io/cellGPUdocumentation
Ski Control Model for Parallel Turn Using Multibody System
Kawai, Shigehiro; Yamaguchi, Keishi; Sakata, Toshiyuki
It is now possible to discuss qualitatively, by simulation, the effects of the skis, the skier's ski control and the slope on a ski turn. The reliability of a simulation depends on the accuracy of the models used in it. In the present study, we attempt to develop a new ski control model for the "parallel turn" using a computer graphics technique. The "ski control" necessary for the simulation consists of the relative motion of the skier's center of gravity with respect to the ski and the force acting on the ski from the skier. The developed procedure is as follows. First, the skier is modeled using a multibody system consisting of body parts. Second, various postures of the skier during the "parallel turn" are drawn using a 3D-CAD (three-dimensional computer-aided design) system, referring to pictures videotaped on a slope. The position of the skier's center of gravity is estimated from the produced posture. Third, the skier's ski control is obtained by arranging these postures on a time schedule; the resulting ski control can be watched on a TV. Last, the three types of forces acting on the ski from the skier are estimated from the gravity force and the three corresponding relative inertia forces acting on the skier. Consequently, one can obtain accurate ski control for the simulation of the "parallel turn", that is, the relative motion of the skier's center of gravity with respect to the ski and the force acting on the ski from the skier. Furthermore, the edging angle can be estimated numerically from the ski control model.
Parallel tempering and 3D spin glass models
Papakonstantinou, T.; Malakis, A.
2014-03-01
We review parallel tempering (PT) schemes and examine their main ingredients for accuracy and efficiency. We discuss two methods for selecting temperatures and some alternatives for the exchange of replicas, including all-pair exchange methods. We measure specific-heat errors and round-trip efficiency using the two-dimensional (2D) Ising model, and also test the efficiency of ground-state (GS) production in 3D spin glass models. We find that the optimization of the GS problem is highly influenced by the choice of the temperature range of the PT process. Finally, we present numerical evidence concerning the universality aspects of an anisotropic case of the 3D spin-glass model.
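The replica-exchange step that all the schemes reviewed above share can be sketched in a few lines. This is a generic neighbour-swap sweep with the standard Metropolis acceptance rule, not the paper's implementation; swapping the energy values here stands in for swapping full configurations.

```python
import math, random

def pt_swap(energies, betas, rng):
    """One sweep of neighbour swaps in a parallel tempering chain.
    Replicas i and i+1 exchange configurations with probability
    min(1, exp((beta_i - beta_{i+1}) * (E_i - E_{i+1}))), which preserves
    detailed balance across the whole temperature ladder."""
    for i in range(len(betas) - 1):
        delta = (betas[i] - betas[i + 1]) * (energies[i] - energies[i + 1])
        if delta >= 0 or rng.random() < math.exp(delta):
            # Swap the states attached to the two temperatures.
            energies[i], energies[i + 1] = energies[i + 1], energies[i]
    return energies

rng = random.Random(0)
# Three replicas from cold (beta=1.0) to hot (beta=0.1).
energies = pt_swap([-3.0, -2.0, -1.0], [1.0, 0.5, 0.1], rng)
```

The temperature range matters precisely because the exponent couples the beta gap to the energy gap: ladders that are too sparse make `delta` large and negative, and replicas stop round-tripping, which is the effect the paper measures.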
The parallel network dynamic DEA model with interval data
Directory of Open Access Journals (Sweden)
S. Keikha-Javan
2014-09-01
In the original DEA models, precise data are used to measure relative efficiency, whereas in reality we do not always deal with precise data; when the data are imprecise, the resulting efficiencies are expected to be imprecise as well. In this article, we apply the parallel network dynamic DEA model to imprecise data, in which the carry-overs among periods are treated as desirable and undesirable. Upper and lower efficiency bounds are then obtained for the overall, periodical and divisional efficiencies, computed with respect to the subunits of the DMU under evaluation. Finally, applying this model to a data set of branches of several banks in Iran, we compute the efficiency intervals.
Methods to model-check parallel systems software.
Energy Technology Data Exchange (ETDEWEB)
Matlin, O. S.; McCune, W.; Lusk, E.
2003-12-15
We report on an effort to develop methodologies for formal verification of parts of the Multi-Purpose Daemon (MPD) parallel process management system. MPD is a distributed collection of communicating processes. While the individual components of the collection execute simple algorithms, their interaction leads to unexpected errors that are difficult to uncover by conventional means. Two verification approaches are discussed here: the standard model checking approach using the software model checker SPIN and the nonstandard use of a general-purpose first-order resolution-style theorem prover OTTER to conduct the traditional state space exploration. We compare modeling methodology and analyze performance and scalability of the two methods with respect to verification of MPD.
Error Modeling and Design Optimization of Parallel Manipulators
DEFF Research Database (Denmark)
Wu, Guanglei
challenges due to their highly nonlinear behaviors, thus, the parameter and performance analysis, especially the accuracy and stiffness, are particularly important. Toward the requirements of robotic technology such as light weight, compactness, high accuracy and low energy consumption, utilizing optimization...... technique in the design procedure is a suitable approach to handle these complex tasks. As there is no unified design guideline for the parallel manipulators, the study described in this thesis aims to provide a systematic analysis for this type of mechanisms in the early design stage, focusing on accuracy...... analysis and design optimization. The proposed approach is illustrated with the planar and spherical parallel manipulators. The geometric design, kinematic and dynamic analysis, kinetostatic modeling and stiffness analysis are also presented. Firstly, the study on the geometric architecture and kinematic......
Calibration of parallel kinematics machine using generalized distance error model
Institute of Scientific and Technical Information of China (English)
[No author listed]
2007-01-01
This paper focuses on the accuracy enhancement of parallel kinematics machines through kinematic calibration. In the calibration process, constructing a well-structured identification Jacobian matrix and measuring the end-effector position and orientation are the two main difficulties. In this paper, the identification Jacobian matrix is constructed easily by numerical calculation using the unit virtual velocity method. A generalized distance error model is presented to avoid directly measuring the position and orientation, which are difficult to measure. Finally, a measurement tool is given for acquiring the data points in the calibration process. Experimental studies confirmed the effectiveness of the method. It is also shown that the proposed approach can be applied to other types of parallel manipulators.
Hybrid fluid/kinetic model for parallel heat conduction
Energy Technology Data Exchange (ETDEWEB)
Callen, J.D.; Hegna, C.C.; Held, E.D. [Univ. of Wisconsin, Madison, WI (United States)
1998-12-31
It is argued that in order to use fluid-like equations to model low-frequency (ω < ν) phenomena such as neoclassical tearing modes in low-collisionality (ν < ω_b) tokamak plasmas, a Chapman-Enskog-like approach is most appropriate for developing an equation for the kinetic distortion (F) of the distribution function, whose velocity-space moments lead to the needed fluid moment closure relations. Further, parallel heat conduction in a long collision mean-free-path regime can be described through a combination of a reduced phase-space Chapman-Enskog-like approach for the kinetics and a multiple-time-scale analysis for the fluid and kinetic equations.
Phase dynamics modeling of parallel stacks of Josephson junctions
Rahmonov, I. R.; Shukrinov, Yu. M.
2014-11-01
The phase dynamics of two parallel connected stacks of intrinsic Josephson junctions (JJs) in high temperature superconductors is numerically investigated. The calculations are based on the system of nonlinear differential equations obtained within the CCJJ + DC model, which allows one to determine the general current-voltage characteristic of the system, as well as each individual stack. The processes with increasing and decreasing base currents are studied. The features in the behavior of the current in each stack of the system due to the switching between the states with rotating and oscillating phases are analyzed.
PKind: A parallel k-induction based model checker
Kahsai, Temesghen; doi: 10.4204/EPTCS.72.6
2011-01-01
PKind is a novel parallel k-induction-based model checker of invariant properties for finite- or infinite-state Lustre programs. Its architecture, which is strictly message-based, is designed to minimize synchronization delays and easily accommodate the incorporation of incremental invariant generators to enhance basic k-induction. We describe PKind's functionality and main features, and present experimental evidence that PKind significantly speeds up the verification of safety properties and, due to incremental invariant generation, also considerably increases the number of provable ones.
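The k-induction principle that PKind builds on has two checks: a base case (the property holds on all paths of length up to k from the initial states) and a step case (any k consecutive property-satisfying states are followed only by property-satisfying states). PKind performs these checks symbolically on Lustre programs with solver back-ends; the sketch below is an explicit-state toy over an enumerable state set, shown only to make the principle concrete.

```python
def k_induction(states, init, trans, prop, k):
    """Explicit-state illustration of k-induction.  `states` is a finite
    iterable, `trans(s, t)` is the transition-relation predicate, and
    `prop(s)` is the safety property.  Returns True if the property is
    proved, False if either check fails (not necessarily a real bug)."""
    # Base case: breadth-first unrolling up to depth k.
    frontier = set(init)
    for _ in range(k + 1):
        if any(not prop(s) for s in frontier):
            return False
        frontier = {t for s in frontier for t in states if trans(s, t)}
    # Step case: no path s_0 .. s_k with prop on s_0 .. s_{k-1} may end
    # in a violation.
    def paths(prefix):
        if len(prefix) == k + 1:
            yield prefix
            return
        for t in states:
            if trans(prefix[-1], t):
                yield from paths(prefix + [t])
    for s0 in states:
        if not prop(s0):
            continue
        for path in paths([s0]):
            if all(prop(s) for s in path[:-1]) and not prop(path[-1]):
                return False
    return True

# Toy system: a counter over 0..5 that wraps from 3 back to 0.
states = range(6)
trans = lambda s, t: t == (s + 1 if s < 3 else 0)
ok = k_induction(states, [0], trans, lambda s: s <= 3, 1)
```

PKind's contribution is running the base case, the step case, and incremental invariant generation as concurrent message-passing processes, so that invariants discovered along the way strengthen the step case and let more properties be proved at small k.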
Parallel algorithms for interactive manipulation of digital terrain models
Davis, E. W.; Mcallister, D. F.; Nagaraj, V.
1988-01-01
Interactive three-dimensional graphics applications, such as terrain data representation and manipulation, require extensive arithmetic processing. Massively parallel machines are attractive for this application since they offer high computational rates, and grid connected architectures provide a natural mapping for grid based terrain models. Presented here are algorithms for data movement on the massive parallel processor (MPP) in support of pan and zoom functions over large data grids. It is an extension of earlier work that demonstrated real-time performance of graphics functions on grids that were equal in size to the physical dimensions of the MPP. When the dimensions of a data grid exceed the processing array size, data is packed in the array memory. Windows of the total data grid are interactively selected for processing. Movement of packed data is needed to distribute items across the array for efficient parallel processing. Execution time for data movement was found to exceed that for arithmetic aspects of graphics functions. Performance figures are given for routines written in MPP Pascal.
Requirements and Problems in Parallel Model Development at DWD
Directory of Open Access Journals (Sweden)
Ulrich Schättler
2000-01-01
Nearly 30 years after introducing its first computer model for weather forecasting, the Deutscher Wetterdienst (DWD) is developing the 4th generation of its numerical weather prediction (NWP) system. It consists of a global grid-point model (GME) based on a triangular grid and a non-hydrostatic Lokal Modell (LM). The operational demand of running this new system is immense and can only be met by parallel computers. From the experience gained in developing earlier NWP models, several new problems had to be taken into account during the design phase of the system. Most important were portability (including efficiency of the programs on several computer architectures) and ease of code maintainability. Also, the organization and administration of the work done by developers from different teams and institutions is more complex than it used to be. This paper describes the models and gives some performance results. The modular approach used for the design of the LM is explained, and its effects on the development are discussed.
Dynamic modeling of Tampa Bay urban development using parallel computing
Xian, G.; Crane, M.; Steinwand, D.
2005-01-01
Urban land use and land cover have changed significantly in the environs of Tampa Bay, Florida, over the past 50 years. Extensive urbanization has created substantial change to the region's landscape and ecosystems. This paper uses a dynamic urban-growth model, SLEUTH, which applies six geospatial data themes (slope, land use, exclusion, urban extent, transportation, hillshade), to study the process of urbanization and the associated land use and land cover change in the Tampa Bay area. To reduce processing time and complete the modeling within an acceptable period, the model is recoded and ported to a Beowulf cluster. The parallel-processing computer system accomplishes the massive amount of computation the modeling simulation requires; the SLEUTH calibration process for the Tampa Bay urban growth simulation takes only 10 h of CPU time. The model predicts future land use/cover change trends for Tampa Bay from 1992 to 2025. Urban extent is predicted to double in the Tampa Bay watershed between 1992 and 2025, with results showing an upward trend of urbanization at the expense of declines of 58% and 80% in agricultural and forested lands, respectively.
Parallel multiscale modeling of biopolymer dynamics with hydrodynamic correlations
Fyta, Maria; Kaxiras, Efthimios; Melchionna, Simone; Bernaschi, Massimo; Succi, Sauro
2007-01-01
We employ a multiscale approach to model the translocation of biopolymers through nanometer size pores. Our computational scheme combines microscopic Molecular Dynamics (MD) with a mesoscopic Lattice Boltzmann (LB) method for the solvent dynamics, explicitly taking into account the interactions of the molecule with the surrounding fluid. We describe an efficient parallel implementation of the method which exhibits excellent scalability on the Blue Gene platform. We investigate both dynamical and statistical aspects of the translocation process by simulating polymers of various initial configurations and lengths. For a representative molecule size, we explore the effects of important parameters that enter in the simulation, paying particular attention to the strength of the molecule-solvent coupling and of the external electric field which drives the translocation process. Finally, we explore the connection between the generic polymers modeled in the simulation and DNA, for which interesting recent experimenta...
Applying the Extended Parallel Process Model to workplace safety messages.
Basil, Michael; Basil, Debra; Deshpande, Sameer; Lavack, Anne M
2013-01-01
The extended parallel process model (EPPM) proposes fear appeals are most effective when they combine threat and efficacy. Three studies conducted in the workplace safety context examine the use of various EPPM factors and their effects, especially multiplicative effects. Study 1 was a content analysis examining the use of EPPM factors in actual workplace safety messages. Study 2 experimentally tested these messages with 212 construction trainees. Study 3 replicated this experiment with 1,802 men across four English-speaking countries: Australia, Canada, the United Kingdom, and the United States. The results of these three studies (1) demonstrate the inconsistent use of EPPM components in real-world work safety communications, (2) support the necessity of self-efficacy for the effective use of threat, (3) show a multiplicative effect where communication effectiveness is maximized when all model components are present (severity, susceptibility, and efficacy), and (4) validate these findings with gory appeals across four English-speaking countries.
Energy consumption model over parallel programs implemented on multicore architectures
Directory of Open Access Journals (Sweden)
Ricardo Isidro-Ramirez
2015-06-01
In High Performance Computing, energy consumption is becoming an important aspect to consider. Because of the high cost of energy production in all countries, there is a search for ways to save energy, reflected in efforts to reduce the energy requirements of hardware components and applications. Several options have appeared for scaling down energy use and, consequently, scaling up energy efficiency. One of these strategies is the multithreaded programming paradigm, whose purpose is to produce parallel programs able to use the full amount of computing resources available in a microprocessor. This energy-saving strategy focuses on the efficient use of the multicore processors found in various computing devices, such as mobile devices. Indeed, as a growing trend, multicore processors have been part of various special-purpose computers since 2003, from High Performance Computing servers to mobile devices. However, it is not clear how multiprogramming affects energy efficiency. This paper presents an analysis of different types of multicore-based architectures used in computing, and then a model is presented. Based on Amdahl's Law, a model is proposed that considers different scenarios of energy use in multicore architectures. Some interesting results were found from experiments with the developed algorithm, which was executed in both parallel and sequential ways: a lower limit of energy consumption was found for one type of multicore architecture, and this behavior was observed experimentally.
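An Amdahl's-Law-based energy analysis of the kind described above can be made concrete with a short worked example. The energy model below (one active core during the serial fraction, all cores active during the parallel fraction, idle cores drawing a fixed fraction of active power) is an illustrative assumption, not the paper's exact model; `p_core` and `p_idle` are hypothetical parameters.

```python
def amdahl_speedup(f, n):
    """Amdahl's Law: speedup of a program whose parallelizable fraction
    is f when run on n cores."""
    return 1.0 / ((1.0 - f) + f / n)

def energy(f, n, p_core=1.0, p_idle=0.3):
    """Illustrative energy model: energy = power x time, with the
    single-core runtime normalized to 1.  During the serial fraction one
    core is active and n-1 idle; during the parallel fraction all n cores
    are active."""
    t_serial = 1.0 - f
    t_parallel = f / n
    return (t_serial * (p_core + (n - 1) * p_idle)
            + t_parallel * n * p_core)

s = amdahl_speedup(0.9, 10)   # ~5.26x, well short of 10x
e = energy(0.9, 10)
```

With f = 0.9 and 10 cores the speedup is about 5.26x, yet under this model the energy rises above the single-core baseline, because the serial fraction keeps nine idle-but-powered cores waiting; this is the kind of trade-off the abstract's lower-limit result captures.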
Parallel implementation of approximate atomistic models of the AMOEBA polarizable model
Demerdash, Omar; Head-Gordon, Teresa
2016-11-01
In this work we present a replicated-data hybrid OpenMP/MPI implementation of a hierarchical progression of approximate classical polarizable models that yields speedups of up to ∼10 compared to the standard OpenMP implementation of the exact parent AMOEBA polarizable model. In addition, our parallel implementation exhibits reasonable weak and strong scaling. The resulting parallel software will prove useful for those who are interested in how molecular properties converge in the condensed phase with respect to the many-body expansion (MBE); it provides a fruitful test bed for exploring different electrostatic embedding schemes and offers an interesting possibility for future exascale computing paradigms.
Multiphysics & Parallel Kinematics Modeling of a 3DOF MEMS Mirror
Directory of Open Access Journals (Sweden)
Mamat N.
2015-01-01
Full Text Available This paper presents the modeling of a 3DoF electrothermally actuated micro-electro-mechanical systems (MEMS) mirror used to achieve scanning for optical coherence tomography (OCT) imaging. Because the device is integrated into an OCT endoscopic probe, the optical scanner should have a small footprint for minimum invasiveness, a large and flat optical aperture for a large scanning range, and low driving voltage and low power consumption for safety reasons. With a footprint of 2 mm × 2 mm, the MEMS scanner, also called a Tip-Tilt-Piston micro-mirror, can perform two rotations around the x- and y-axes and a vertical translation along the z-axis. This work develops a complete model and experimental characterization. The modeling is divided into two parts: multiphysics characterization of the actuators and parallel kinematics studies of the overall system. With proper experimental procedures, we are able to validate the model via the Visual Servoing Platform (ViSP). The results give a detailed overview of the performance of the mirror platform while varying the applied voltage at a stable working frequency. The paper also presents a discussion of the MEMS control system based on several scanning trajectories.
Parallel Semi-Implicit Spectral Element Atmospheric Model
Fournier, A.; Thomas, S.; Loft, R.
2001-05-01
The shallow-water equations (SWE) have long been used to test atmospheric-modeling numerical methods. The SWE contain essential wave-propagation and nonlinear effects of more complete models. We present a semi-implicit (SI) improvement of the Spectral Element Atmospheric Model to solve the SWE (SEAM, Taylor et al. 1997, Fournier et al. 2000, Thomas & Loft 2000). SE methods are h-p finite element methods combining the geometric flexibility of size-h finite elements with the accuracy of degree-p spectral methods. Our work suggests that exceptional parallel-computation performance is achievable by a General-Circulation-Model (GCM) dynamical core, even at modest climate-simulation resolutions (>1°). The code derivation involves a weak variational formulation of the SWE, Gauss(-Lobatto) quadrature over the collocation points, and Legendre cardinal interpolators. Appropriate weak variation yields a symmetric positive-definite Helmholtz operator. To meet the Ladyzhenskaya-Babuska-Brezzi inf-sup condition and avoid spurious modes, we use a staggered grid. The SI scheme combines leapfrog and Crank-Nicolson schemes for the nonlinear and linear terms, respectively. The localization of operations to elements ideally fits the method to cache-based microprocessor architectures: derivatives are computed as collections of small (8x8), naturally cache-blocked matrix-vector products. SEAM also has desirable boundary-exchange communication, like finite-difference models. Timings on the IBM SP and Compaq ES40 supercomputers indicate that the SI code (20-min timestep) requires 1/3 the CPU time of the explicit code (2-min timestep) at T42 resolution. Both codes scale nearly linearly out to 400 processors. We achieved single-processor performance up to 30% of peak for both codes on the 375-MHz IBM Power-3 processors. Fast computation and linear scaling lead to a useful climate-simulation dycore only if enough model time is computed per unit wall-clock time. An efficient SI
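The leapfrog/Crank-Nicolson splitting used by the SI scheme can be illustrated on a scalar model equation. This is a sketch of the time-stepping idea only, not SEAM's code; the damping rate, the quadratic nonlinearity, and the start-up step are invented for the example.

```python
def semi_implicit_step(u_prev, u_curr, dt, lam, nonlin):
    """One step for du/dt = nonlin(u) - lam*u.

    Leapfrog (explicit, centered at level n) on the nonlinear term and
    Crank-Nicolson (implicit average of levels n-1 and n+1) on the stiff
    linear term:
        (u_next - u_prev)/(2 dt) = nonlin(u_curr) - lam*(u_next + u_prev)/2
    """
    return (u_prev * (1.0 - dt * lam) + 2.0 * dt * nonlin(u_curr)) / (1.0 + dt * lam)

# stiff linear damping plus a weak quadratic nonlinearity; treating the
# damping term explicitly with leapfrog would be unstable, but the CN
# average keeps the scheme stable even with dt*lam = 0.5
lam, dt = 50.0, 0.01
nonlin = lambda u: -0.1 * u * u
u_prev = 1.0
u_curr = u_prev / (1.0 + dt * lam)   # crude backward-Euler start-up step
for _ in range(200):
    u_prev, u_curr = u_curr, semi_implicit_step(u_prev, u_curr, dt, lam, nonlin)
print(abs(u_curr) < 1e-6)
```

In SEAM the implicit part is the Helmholtz solve for the fast gravity waves; the scalar division by `(1 + dt*lam)` above plays that role in miniature.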
Parallel processing for efficient 3D slope stability modelling
Marchesini, Ivan; Mergili, Martin; Alvioli, Massimiliano; Metz, Markus; Schneider-Muntau, Barbara; Rossi, Mauro; Guzzetti, Fausto
2014-05-01
We test the performance of the GIS-based, three-dimensional slope stability model r.slope.stability. The model was developed as a C- and python-based raster module of the GRASS GIS software. It considers the three-dimensional geometry of the sliding surface, adopting a modification of the model proposed by Hovland (1977), and revised and extended by Xie and co-workers (2006). Given a terrain elevation map and a set of relevant thematic layers, the model evaluates the stability of slopes for a large number of randomly selected potential slip surfaces, ellipsoidal or truncated in shape. Any single raster cell may be intersected by multiple sliding surfaces, each associated with a value of the factor of safety, FS. For each pixel, the minimum value of FS and the depth of the associated slip surface are stored. This information is used to obtain a spatial overview of the potentially unstable slopes in the study area. We test the model in the Collazzone area, Umbria, central Italy, an area known to be susceptible to landslides of different type and size. Availability of a comprehensive and detailed landslide inventory map allowed for a critical evaluation of the model results. The r.slope.stability code automatically splits the study area into a defined number of tiles, with proper overlap in order to provide the same statistical significance for the entire study area. The tiles are then processed in parallel by a given number of processors, exploiting a multi-purpose computing environment at CNR IRPI, Perugia. The map of the FS is obtained collecting the individual results, taking the minimum values on the overlapping cells. This procedure significantly reduces the processing time. We show how the gain in terms of processing time depends on the tile dimensions and on the number of cores.
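The tile-split, parallel-process, minimum-combine strategy described above can be sketched as follows. `fs_for_cell` is a hypothetical stand-in for the expensive per-cell factor-of-safety computation, and a thread pool stands in for the multi-processor environment used at CNR IRPI.

```python
from concurrent.futures import ThreadPoolExecutor

def fs_for_cell(idx):
    """Hypothetical stand-in for the per-cell factor-of-safety computation."""
    return 1.0 + (idx * 37 % 11) / 10.0

def process_tile(tile):
    """Compute FS for every cell in one (possibly overlapping) tile."""
    start, stop = tile
    return {i: fs_for_cell(i) for i in range(start, stop)}

def tiled_min_fs(n_cells, tile_size, overlap, workers=4):
    """Split cells into overlapping tiles, process tiles in parallel,
    and keep the minimum (most conservative) FS on overlapping cells."""
    tiles, start = [], 0
    while start < n_cells:
        tiles.append((start, min(start + tile_size + overlap, n_cells)))
        start += tile_size
    fs_map = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for result in pool.map(process_tile, tiles):
            for cell, fs in result.items():
                fs_map[cell] = min(fs, fs_map.get(cell, float("inf")))
    return fs_map

serial = {i: fs_for_cell(i) for i in range(100)}
print(tiled_min_fs(100, 30, 5) == serial)
```

Because the per-cell computation is deterministic here, the tiled result matches a serial pass exactly; in the real model the overlap ensures each cell sees enough candidate slip surfaces regardless of which tile computed it.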
Coupled Models and Parallel Simulations for Three-Dimensional Full-Stokes Ice Sheet Modeling
Energy Technology Data Exchange (ETDEWEB)
Zhang, Huai; Ju, Lili
2011-01-01
A three-dimensional full-Stokes computational model is considered for determining the dynamics, temperature, and thickness of ice sheets. The governing thermomechanical equations consist of the three-dimensional full-Stokes system with nonlinear rheology for the momentum, an advective-diffusion energy equation for temperature evolution, and a mass conservation equation for ice-thickness changes. Here, we discuss the variable resolution meshes, the finite element discretizations, and the parallel algorithms employed by the model components. The solvers are integrated through a well-designed coupler for the exchange of parametric data between components. The discretization utilizes high-quality, variable-resolution centroidal Voronoi Delaunay triangulation meshing and existing parallel solvers. We demonstrate the gridding technology, discretization schemes, and the efficiency and scalability of the parallel solvers through computational experiments using both simplified geometries arising from benchmark test problems and a realistic Greenland ice sheet geometry.
A model for dealing with parallel processes in supervision
Directory of Open Access Journals (Sweden)
Lilja Cajvert
2011-03-01
Supervision in social work is essential for successful outcomes when working with clients. In social work, unconscious difficulties may arise, and similar difficulties may occur in supervision as parallel processes. In this article, the development of a practice-based model of supervision to deal with parallel processes in supervision is described. The model has six phases. In the first phase, the focus is on the supervisor's inner world, his/her own reflections and observations. In the second phase, the supervision situation is "frozen", and the supervisees are invited to join the supervisor in taking a meta-perspective on the current situation of supervision. The focus in the third phase is on the inner world of all the group members as well as the visualization and identification of reflections and feelings that arose during the supervision process. Phase four focuses on the supervisee who presented a case, and in phase five the focus shifts to the common understanding and theorization of the supervision process as well as the definition and identification of possible parallel processes. In the final phase, the supervisee, with the assistance of the supervisor and other members of the group, develops a solution and determines how to proceed with the client in treatment. This article uses phenomenological concepts to provide a theoretical framework for the supervision model. Phenomenological reduction is an important approach to examine, externalize and visualize the inner worlds of the supervisor and supervisees.
Tarmo: A Framework for Parallelized Bounded Model Checking
Wieringa, Siert; Heljanko, Keijo (DOI: 10.4204/EPTCS.14.5)
2009-01-01
This paper investigates approaches to parallelizing Bounded Model Checking (BMC) for shared memory environments as well as for clusters of workstations. We present a generic framework for parallelized BMC named Tarmo. Our framework can be used with any incremental SAT encoding for BMC but for the results in this paper we use only the current state-of-the-art encoding for full PLTL. Using this encoding allows us to check both safety and liveness properties, contrary to an earlier work on distributing BMC that is limited to safety properties only. Despite our focus on BMC after it has been translated to SAT, existing distributed SAT solvers are not well suited for our application. This is because solving a BMC problem is not solving a set of independent SAT instances but rather involves solving multiple related SAT instances, encoded incrementally, where the satisfiability of each instance corresponds to the existence of a counterexample of a specific length. Our framework includes a generic architecture for a ...
"Let's Move" campaign: applying the extended parallel process model.
Batchelder, Alicia; Matusitz, Jonathan
2014-01-01
This article examines Michelle Obama's health campaign, "Let's Move," through the lens of the extended parallel process model (EPPM). "Let's Move" aims to reduce the childhood obesity epidemic in the United States. Developed by Kim Witte, EPPM rests on the premise that people's attitudes can be changed when fear is exploited as a factor of persuasion. Fear appeals work best (a) when a person feels concern about the issue or situation, and (b) when he or she believes in his or her capability to deal with that issue or situation. Overall, the analysis found that "Let's Move" builds on past health campaigns that have been successful. An important element of the campaign is the use of fear appeals (as postulated by EPPM). For example, part of the campaign's strategy is to explain the severity of the diseases associated with obesity. By looking at the steps of EPPM, readers can also understand the strengths and weaknesses of "Let's Move."
Parallel imaging enhanced MR colonography using a phantom model.
LENUS (Irish Health Repository)
Morrin, Martina M
2008-09-01
To compare various Array Spatial and Sensitivity Encoding Technique (ASSET)-enhanced T2W SSFSE (single shot fast spin echo) and T1-weighted (T1W) 3D SPGR (spoiled gradient recalled echo) sequences for polyp detection and image quality at MR colonography (MRC) in a phantom model. Limitations of MRC using standard 3D SPGR T1W imaging include the long breath-hold required to cover the entire colon within one acquisition and the relatively low spatial resolution due to the long acquisition time. Parallel imaging using ASSET-enhanced T2W SSFSE and 3D T1W SPGR imaging results in much shorter imaging times, which allows for increased spatial resolution.
Time efficient 3-D electromagnetic modeling on massively parallel computers
Energy Technology Data Exchange (ETDEWEB)
Alumbaugh, D.L.; Newman, G.A.
1995-08-01
A numerical modeling algorithm has been developed to simulate the electromagnetic response of a three-dimensional earth to a dipole source for frequencies ranging from 100 Hz to 100 MHz. The numerical problem is formulated in terms of a frequency-domain, modified vector Helmholtz equation for the scattered electric fields. The resulting differential equation is approximated using a staggered finite-difference grid, which results in a linear system of equations for which the matrix is sparse and complex symmetric. The system of equations is solved using a preconditioned quasi-minimum-residual method. Dirichlet boundary conditions are employed at the edges of the mesh by setting the tangential electric fields equal to zero. At frequencies less than 1 MHz, normal grid stretching is employed to mitigate unwanted reflections off the grid boundaries. For frequencies greater than this, absorbing boundary conditions must be employed by making the stretching parameters of the modified vector Helmholtz equation complex, which introduces loss at the boundaries. To allow for faster calculation of realistic models, the original serial version of the code has been modified to run on a massively parallel architecture. This modification involves three distinct tasks: (1) mapping the finite-difference stencil to a processor stencil which allows the necessary information to be exchanged between processors that contain adjacent nodes in the model, (2) determining the most efficient method to input the model, which is accomplished by dividing the input into "global" and "local" data and then reading the two sets in differently, and (3) deciding how to output the data, which is an inherently nonparallel process.
Parallel family trees for transfer matrices in the Potts model
Navarro, Cristobal A; Kahler, Nancy Hitschfeld; Navarro, Gonzalo
2013-01-01
The computational cost of transfer matrix methods for the Potts model is directly related to the question of how many ways two adjacent blocks of a lattice can be connected. Answering this question leads to the generation of a combinatorial set of lattice configurations. This set defines the configuration space of the problem, and the smaller it is, the faster the transfer matrix method can be. The configuration space of generic transfer matrix methods for strip lattices in the Potts model is on the order of the Catalan numbers, leading to an asymptotic cost of O(4^m), with m being the width of the strip. Transfer matrix methods with a smaller configuration space do exist, but they make assumptions on the temperature or number of spin states, or restrict the topology of the lattice in order to work. In this paper we propose a general and parallel transfer matrix method, based on family trees, that uses a sub-Catalan configuration space of size O(3^m). The improvement is achieved by...
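Since the Catalan numbers grow as roughly 4^m / m^(3/2), a configuration space of size O(3^m) is asymptotically smaller, although the crossover only appears at moderate strip widths. A quick comparison, assuming nothing beyond the stated space sizes:

```python
from math import comb

def catalan(n):
    """n-th Catalan number: C_n = binom(2n, n) / (n + 1)."""
    return comb(2 * n, n) // (n + 1)

# Catalan-sized configuration space (generic method) vs. the O(3^m)
# space of the family-tree method; 3^m wins once m is large enough
for m in (8, 12, 16, 20):
    print(m, catalan(m), 3 ** m)
```

For small widths the Catalan count is actually below 3^m; the asymptotic advantage of the sub-Catalan space shows up around m = 20 and grows from there.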
Tarmo: A Framework for Parallelized Bounded Model Checking
Directory of Open Access Journals (Sweden)
Siert Wieringa
2009-12-01
Full Text Available This paper investigates approaches to parallelizing Bounded Model Checking (BMC) for shared memory environments as well as for clusters of workstations. We present a generic framework for parallelized BMC named Tarmo. Our framework can be used with any incremental SAT encoding for BMC, but for the results in this paper we use only the current state-of-the-art encoding for full PLTL. Using this encoding allows us to check both safety and liveness properties, contrary to an earlier work on distributing BMC that is limited to safety properties only. Despite our focus on BMC after it has been translated to SAT, existing distributed SAT solvers are not well suited for our application. This is because solving a BMC problem is not solving a set of independent SAT instances but rather involves solving multiple related SAT instances, encoded incrementally, where the satisfiability of each instance corresponds to the existence of a counterexample of a specific length. Our framework includes a generic architecture for a shared clause database that allows easy clause sharing between SAT solver threads solving various such instances. We present extensive experimental results obtained with multiple variants of our Tarmo implementation. Our shared memory variants have significantly better performance than conventional single-threaded approaches, a result that many users can benefit from as multi-core and multi-processor technology is widely available. Furthermore, we demonstrate that our framework can be deployed in a typical cluster of workstations, where several multi-core machines are connected by a network.
Parallelized CCHE2D flow model with CUDA Fortran on Graphics Process Units
This paper presents the CCHE2D implicit flow model parallelized using CUDA Fortran programming technique on Graphics Processing Units (GPUs). A parallelized implicit Alternating Direction Implicit (ADI) solver using Parallel Cyclic Reduction (PCR) algorithm on GPU is developed and tested. This solve...
Parallel Lagrangian models for turbulent transport and chemistry
Crone, Gilia Cornelia
1997-01-01
In this thesis we give an overview of recent stochastic Lagrangian models and present a new particle model for turbulent dispersion and chemical reactions. Our purpose is to investigate and assess the feasibility of the Lagrangian approach for modelling turbulent dispersion and chemistry.
Parallel Development of Products and New Business Models
DEFF Research Database (Denmark)
Lund, Morten; Hansen, Poul H. Kyvsgård
2014-01-01
The perception of product development and the practical execution of product development in professional organizations have undergone dramatic changes in recent years. Many of these changes relate to the introduction of broader and more cross-disciplinary views that involve new organizational functi...... and innovation management the 4th generation models are increasingly including the concept business models and business model innovation....
The Rochester Checkers Player: Multi-Model Parallel Programming for Animate Vision
1991-06-01
parallel programming is likely to serve for all tasks, however. Early vision algorithms are intensely data parallel, often utilizing fine-grain parallel computations that share an image, while cognition algorithms decompose naturally by function, often consisting of loosely-coupled, coarse-grain parallel units. A typical animate vision application will likely consist of many tasks, each of which may require a different parallel programming model, and all of which must cooperate to achieve the desired behavior. These multi-model programs require an
Stiffness Model of a 3-DOF Parallel Manipulator with Two Additional Legs
Directory of Open Access Journals (Sweden)
Guang Yu
2014-10-01
Full Text Available This paper investigates the stiffness modelling of a 3-DOF parallel manipulator with two additional legs. The stiffness model in six directions of the 3-DOF parallel manipulator with two additional legs is derived by performing condensation of DOFs for the joint connections and treatment of the fixed-end connections. Moreover, this modelling method is used to derive the stiffness models of the manipulator with zero or one additional leg. Two performance indices are given to compare the stiffness of the parallel manipulator with two additional legs with that of the manipulators with zero or one additional leg. The method can be used not only to derive the stiffness model of a redundant parallel manipulator, but also to model the stiffness of non-redundant parallel manipulators.
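The condensation-of-DOFs step is closely related to classic static (Guyan) condensation, K_c = K_rr - K_ri K_ii^(-1) K_ir, which eliminates internal DOFs from a stiffness matrix. A minimal sketch on a two-spring chain, not the paper's manipulator model:

```python
def solve(A, B):
    """Solve A X = B for small dense matrices (Gauss-Jordan elimination)."""
    n = len(A)
    M = [arow[:] + brow[:] for arow, brow in zip(A, B)]  # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))  # pivoting
        M[col], M[piv] = M[piv], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        for r in range(n):
            if r != col and M[r][col]:
                f = M[r][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    return [row[n:] for row in M]

def condense(K, retained):
    """Static (Guyan) condensation: K_c = K_rr - K_ri K_ii^-1 K_ir."""
    n = len(K)
    internal = [i for i in range(n) if i not in retained]
    Krr = [[K[i][j] for j in retained] for i in retained]
    Kri = [[K[i][j] for j in internal] for i in retained]
    Kir = [[K[i][j] for j in retained] for i in internal]
    Kii = [[K[i][j] for j in internal] for i in internal]
    X = solve(Kii, Kir)                      # X = Kii^-1 Kir
    return [[Krr[a][b] - sum(Kri[a][k] * X[k][b] for k in range(len(internal)))
             for b in range(len(retained))] for a in range(len(retained))]

# two springs in series (k1 to ground, k2 from internal node to tip):
# condensing the internal DOF must give the series stiffness k1*k2/(k1+k2)
k1, k2 = 3.0, 6.0
K = [[k2, -k2], [-k2, k1 + k2]]   # DOF 0 = tip (retained), DOF 1 = internal
print(round(condense(K, [0])[0][0], 9))   # -> 2.0, i.e. k1*k2/(k1+k2)
```

The paper's six-direction model applies the same elimination idea to the joint-connection DOFs of the legs; this sketch only shows the algebra.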
LARGE SIGNAL DISCRETE-TIME MODEL FOR PARALLELED BUCK CONVERTERS
Institute of Scientific and Technical Information of China (English)
Anonymous
2002-01-01
As a number of switch combinations are involved in the operation of a multi-converter system, conventional methods for obtaining discrete-time large-signal models of these converter systems result in very complex solutions. A simple sampled-data technique for modeling a distributed dc-dc PWM converter system (DCS) is proposed. The resulting model is nonlinear and can be linearized for the analysis and design of a DCS. These models are also suitable for fast simulation of such networks. As the input and output of dc-dc converters are slowly varying, a suitable model for the DCS is obtained in terms of a finite-order input/output approximation.
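A minimal sampled-data model of a single buck cell illustrates the idea of sampling the state once per PWM period instead of tracking every switch transition. All component values, the forward-Euler integration inside each switch interval, and the ideal-switch assumption are illustrative, not taken from the paper.

```python
def buck_cycle(iL, vC, d, Vin, L=1e-3, C=1e-4, R=5.0, fs=10e3, substeps=20):
    """Advance one PWM period of an ideal buck converter.

    Switch on for d*T (input connected), off for (1-d)*T (freewheeling);
    forward Euler inside each interval. Returns the state sampled at the
    end of the period, i.e. one iteration of the sampled-data map.
    """
    T = 1.0 / fs
    for phase_frac, vin_eff in ((d, Vin), (1.0 - d, 0.0)):
        h = phase_frac * T / substeps
        for _ in range(substeps):
            diL = (vin_eff - vC) / L        # inductor: L di/dt = v_in - v_C
            dvC = (iL - vC / R) / C         # capacitor: C dv/dt = i_L - v_C/R
            iL += h * diL
            vC += h * dvC
    return iL, vC

# iterate the period-to-period map to (near) steady state;
# for an ideal buck in continuous conduction, output -> d * Vin
iL = vC = 0.0
for _ in range(2000):
    iL, vC = buck_cycle(iL, vC, d=0.5, Vin=10.0)
print(round(vC, 2))
```

The nonlinear sampled-data map `buck_cycle` is exactly the kind of object the paper linearizes around an operating point for analysis and design.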
Institute of Scientific and Technical Information of China (English)
Cao Hong-qing; Kang Li-shan; Yu Jing-xian
2003-01-01
First, an asynchronous distributed parallel evolutionary modeling algorithm (PEMA) for building models of systems of ordinary differential equations for dynamical systems is proposed in this paper. Then a series of parallel experiments is conducted to systematically test the influence of several important parallel control parameters on the performance of the algorithm. Numerous experimental results are obtained, analyzed, and explained.
Arkin, Ethem; Tekinerdogan, Bedir
2016-01-01
Mapping parallel algorithms to parallel computing platforms requires several activities such as the analysis of the parallel algorithm, the definition of the logical configuration of the platform, the mapping of the algorithm to the logical configuration platform, and the implementation of the source code.
A global parallel model based design of experiments method to minimize model output uncertainty.
Bazil, Jason N; Buzzard, Gregory T; Rundell, Ann E
2012-03-01
Model-based experiment design specifies the data to be collected that will most effectively characterize the biological system under study. Existing model-based design of experiment algorithms have primarily relied on Fisher Information Matrix-based methods to choose the best experiment in a sequential manner. However, these are largely local methods that require an initial estimate of the parameter values, which are often highly uncertain, particularly when data is limited. In this paper, we provide an approach to specify an informative sequence of multiple design points (parallel design) that will constrain the dynamical uncertainty of the biological system responses to within experimentally detectable limits as specified by the estimated experimental noise. The method is based upon computationally efficient sparse grids and requires only a bounded uncertain parameter space; it does not rely upon initial parameter estimates. The design sequence emerges through the use of scenario trees with experimental design points chosen to minimize the uncertainty in the predicted dynamics of the measurable responses of the system. The algorithm was illustrated herein using a T cell activation model for three problems that ranged in dimension from 2D to 19D. The results demonstrate that it is possible to extract useful information from a mathematical model where traditional model-based design of experiments approaches most certainly fail. The experiments designed via this method fully constrain the model output dynamics to within experimentally resolvable limits. The method is effective for highly uncertain biological systems characterized by deterministic mathematical models with limited data sets. Also, it is highly modular and can be modified to include a variety of methodologies such as input design and model discrimination.
Parallel Application Development Using Architecture View Driven Model Transformations
Arkin, E.; Tekinerdogan, B.
2015-01-01
To realize the increased need for computing performance, the current trend is towards applying parallel computing, in which the tasks are run in parallel on multiple nodes. In turn, we can observe the rapid increase of the scale of parallel computing platforms. This situation has led to a complexity
Parallel programming practical aspects, models and current limitations
Tarkov, Mikhail S
2014-01-01
Parallel programming is designed for the use of parallel computer systems for solving time-consuming problems that cannot be solved on a sequential computer in a reasonable time. These problems can be divided into two classes: (1) processing of large data arrays (including processing of images and signals in real time), and (2) simulation of complex physical processes and chemical reactions. For each of these classes, prospective methods are designed for solving problems. For data processing, one of the most promising technologies is the use of artificial neural networks. The particle-in-cell method and cellular automata are very useful for simulation. Problems of the scalability of parallel algorithms and of the transfer of existing parallel programs to future parallel computers are very acute now. An important task is to optimize the use of the equipment (including the CPU cache) of parallel computers. Along with parallelizing information processing, it is essential to ensure the processing reliability by the relevant organization ...
Running Large-Scale Air Pollution Models on Parallel Computers
DEFF Research Database (Denmark)
Georgiev, K.; Zlatev, Z.
2000-01-01
Proceedings of the 23rd NATO/CCMS International Technical Meeting on Air Pollution Modeling and Its Application, held 28 September - 2 October 1998, in Varna, Bulgaria.
Application of Parallel Algorithms in an Air Pollution Model
DEFF Research Database (Denmark)
Georgiev, K.; Zlatev, Z.
1999-01-01
Proceedings of the NATO Advanced Research Workshop on Large Scale Computations in Air Pollution Modelling, Sofia, Bulgaria, 6-10 July 1998.
Directory of Open Access Journals (Sweden)
Yufeng Zhuang
2015-01-01
Full Text Available This paper presents a unified singularity modeling and reconfiguration analysis of variable topologies of a class of metamorphic parallel mechanisms with parallel constraint screws. The new parallel mechanisms consist of three reconfigurable rTPS limbs that have two working phases stemming from the reconfigurable Hooke (rT) joint. While one phase has full mobility, the other supplies a constraint force to the platform. Based on these, the platform constraint screw systems show that the new metamorphic parallel mechanisms have four topologies by altering the limb phases, with mobility change among 1R2T (one rotation with two translations), 2R2T, and 3R2T, and mobility 6. Geometric conditions of the mechanism design are investigated, with some special topologies illustrated considering the limb arrangement. Following this and the actuation scheme analysis, a unified Jacobian matrix is formed using screw theory to include the change between geometric constraints and actuation constraints in the topology reconfiguration. Various singular configurations are identified by analyzing screw dependency in the Jacobian matrix. The work in this paper provides a basis for singularity-free workspace analysis and optimal design of this class of metamorphic parallel mechanisms with parallel constraint screws, which show simple geometric constraints with potentially simple kinematics and dynamics properties.
Selecting Simulation Models when Predicting Parallel Program Behaviour
Broberg, Magnus; Lundberg, Lars; Grahn, Håkan
2002-01-01
The use of multiprocessors is an important way to increase the performance of a supercomputing program. This means that the program has to be parallelized to make use of the multiple processors. The parallelization is unfortunately not an easy task. Development tools supporting parallel programs are important. Further, it is the customer that decides the number of processors in the target machine, and as a result the developer has to make sure that the program runs efficiently on any number...
Modeling the Fracture of Ice Sheets on Parallel Computers
Energy Technology Data Exchange (ETDEWEB)
Waisman, Haim [Columbia University; Tuminaro, Ray [Sandia National Labs
2013-10-10
The objective of this project was to investigate the complex fracture of ice and understand its role within larger ice sheet simulations and global climate change. This objective was achieved by developing novel physics-based models for ice, novel numerical tools to enable the modeling of the physics, and by collaboration with the ice community experts. At the present time, ice fracture is not explicitly considered within ice sheet models, due in part to the large computational costs associated with the accurate modeling of this complex phenomenon. However, fracture not only plays an extremely important role in regional behavior but also influences ice dynamics over much larger zones in ways that are currently not well understood. To this end, our research findings through this project offer significant advancement to the field and close a large gap of knowledge in understanding and modeling the fracture of ice sheets in the polar regions. Thus, we believe that our objective has been achieved and our research accomplishments are significant. This is corroborated through a set of published papers, posters and presentations at technical conferences in the field. In particular, significant progress has been made in the mechanics of ice, fracture of ice sheets and ice shelves in polar regions, and sophisticated numerical methods that enable the solution of the physics in an efficient way.
Improved modelling of a parallel plate active magnetic regenerator
DEFF Research Database (Denmark)
Engelbrecht, Kurt; Tušek, J.; Nielsen, Kaspar Kirstein;
2013-01-01
flow maldistribution in the regenerator. This paper studies the effects of these loss mechanisms and compares theoretical results with experimental results obtained on an experimental AMR device. Three parallel plate regenerators were tested, each having different demagnetizing field characteristics...
Parallel-Batch Scheduling with Two Models of Deterioration to Minimize the Makespan
Directory of Open Access Journals (Sweden)
Cuixia Miao
2014-01-01
Full Text Available We consider bounded parallel-batch scheduling with two models of deterioration, in which the processing time of a job is p_j = a_j + αt in the first model and p_j = a + α_j t in the second. The objective is to minimize the makespan. We present O(n log n)-time algorithms for the single-machine problems, and we propose fully polynomial-time approximation schemes to solve the identical-parallel-machine problem and the uniform-parallel-machine problem, respectively.
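Under the first deterioration model (p_j = a_j + αt, a common rate α), all jobs in a batch start together, so a batch started at time t occupies the machine for max_j a_j + αt. Evaluating the makespan of a given batch sequence is then straightforward; the batching rule itself, which the paper's O(n log n) algorithms determine, is not reproduced here.

```python
def makespan(batches, alpha):
    """Makespan of a batch sequence under linear deterioration p_j = a_j + alpha*t.

    batches: list of batches, each a list of normal processing times a_j.
    A batch started at time t runs for max_j a_j + alpha*t, since all of
    its jobs are processed together on the batching machine.
    """
    t = 0.0
    for batch in batches:
        t += max(batch) + alpha * t
    return t

# two singleton batches, alpha = 0.5:
# batch [2] at t=0 runs 2 -> t=2; batch [4] at t=2 runs 4 + 0.5*2 = 5 -> t=7
print(makespan([[2], [4]], 0.5))   # -> 7.0
```

Note that deterioration makes the sequence matter: everything processed later is inflated by the α·t term, which is what the sorting step in an O(n log n) algorithm has to exploit.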
A model for optimizing file access patterns using spatio-temporal parallelism
Energy Technology Data Exchange (ETDEWEB)
Boonthanome, Nouanesengsy [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Patchett, John [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Geveci, Berk [Kitware Inc., Clifton Park, NY (United States); Ahrens, James [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Bauer, Andy [Kitware Inc., Clifton Park, NY (United States); Chaudhary, Aashish [Kitware Inc., Clifton Park, NY (United States); Miller, Ross G. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Shipman, Galen M. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Williams, Dean N. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)
2013-01-01
For many years now, I/O read time has been recognized as the primary bottleneck for parallel visualization and analysis of large-scale data. In this paper, we introduce a model that can estimate the read time for a file stored in a parallel filesystem when given the file access pattern. Read times ultimately depend on how the file is stored and the access pattern used to read the file. The file access pattern will be dictated by the type of parallel decomposition used. We employ spatio-temporal parallelism, which combines both spatial and temporal parallelism, to provide greater flexibility to possible file access patterns. Using our model, we were able to configure the spatio-temporal parallelism to design optimized read access patterns that resulted in a speedup factor of approximately 400 over traditional file access patterns.
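A toy read-time estimator (with hypothetical parameters, not the paper's actual model) already shows why the access pattern matters: each contiguous request pays a fixed latency, so a strided pattern with many small requests loses badly to a contiguous one for the same volume of data.

```python
def read_time_estimate(n_requests, bytes_per_request, latency_s, bandwidth_bps):
    """Toy cost model: per-request fixed latency plus transfer time
    at the filesystem's streaming bandwidth."""
    return n_requests * latency_s + n_requests * bytes_per_request / bandwidth_bps

# A contiguous pattern (few large requests) vs. a strided pattern
# (many small requests) for the same 1 GiB of data; 1 ms latency,
# 5 GB/s bandwidth are illustrative numbers.
GiB = 2**30
contiguous = read_time_estimate(16, GiB // 16, 1e-3, 5e9)
strided = read_time_estimate(65536, GiB // 65536, 1e-3, 5e9)
print(contiguous < strided)  # latency dominates the strided pattern
```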
Parallelization of fine-scale computation in Agile Multiscale Modelling Methodology
Macioł, Piotr; Michalik, Kazimierz
2016-10-01
Nowadays, multiscale modelling of material behavior is an extensively developed area. An important obstacle to its wide application is its high computational demands. Among other approaches, the parallelization of multiscale computations is a promising solution. Heterogeneous multiscale models are good candidates for parallelization, since communication between sub-models is limited. In this paper, the possibility of parallelizing multiscale models based on the Agile Multiscale Methodology framework is discussed. A sequential, FEM-based macroscopic model has been combined with concurrently computed fine-scale models, employing a MatCalc thermodynamic simulator. The main issues investigated in this work are (i) the speed-up of multiscale models, with special focus on fine-scale computations, and (ii) the decrease in the quality of computations enforced by parallel execution. Speed-up has been evaluated on the basis of Amdahl's law. The problem of 'delay error', arising from the parallel execution of fine-scale sub-models controlled by the sequential macroscopic sub-model, is discussed. Some technical aspects of combining third-party commercial modelling software with an in-house multiscale framework and an MPI library are also discussed.
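The Amdahl's-law evaluation mentioned in the abstract takes a one-line form; a minimal sketch:

```python
def amdahl_speedup(parallel_fraction: float, n_workers: int) -> float:
    """Amdahl's law: overall speed-up of a program whose parallelizable
    fraction p runs on n workers while the remainder stays serial."""
    p = parallel_fraction
    return 1.0 / ((1.0 - p) + p / n_workers)

# Example: if 90% of a multiscale run is fine-scale work that
# parallelizes, 8 workers bound the overall speed-up well below 8x.
print(round(amdahl_speedup(0.9, 8), 2))  # → 4.71
```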
Parallel direct solver for finite element modeling of manufacturing processes
DEFF Research Database (Denmark)
Nielsen, Chris Valentin; Martins, P.A.F.
2017-01-01
The central processing unit (CPU) time is of paramount importance in finite element modeling of manufacturing processes. Because the most significant part of the CPU time is consumed in solving the main system of equations resulting from finite element assemblies, different approaches have been...
Nadkarni, P M; Miller, P L
1991-01-01
A parallel program for inter-database sequence comparison was developed on the Intel Hypercube using two models of parallel programming. One version was built using machine-specific Hypercube parallel programming commands. The other version was built using Linda, a machine-independent parallel programming language. The two versions of the program provide a case study comparing these two approaches to parallelization in an important biological application area. Benchmark tests with both programs gave comparable results with a small number of processors. As the number of processors was increased, the Linda version was somewhat less efficient. The Linda version was also run without change on Network Linda, a virtual parallel machine running on a network of desktop workstations.
Taatgen, N
2005-01-01
Emerging parallel processing and increased flexibility during the acquisition of cognitive skills form a combination that is hard to reconcile with rule-based models that often produce brittle behavior. Rule-based models can exhibit these properties by adhering to 2 principles: that the model gradua
A simple and efficient parallel FFT algorithm using the BSP model
Bisseling, R.H.; Inda, M.A.
2000-01-01
In this paper we present a new parallel radix FFT algorithm based on the BSP model. Our parallel algorithm uses the group-cyclic distribution family, which makes it simple to understand and easy to implement. We show how to reduce the communication cost of the algorithm by a factor of three in the case
Toward a Model Framework of Generalized Parallel Componential Processing of Multi-Symbol Numbers
Huber, Stefan; Cornelsen, Sonja; Moeller, Korbinian; Nuerk, Hans-Christoph
2015-01-01
In this article, we propose and evaluate a new model framework of parallel componential multi-symbol number processing, generalizing the idea of parallel componential processing of multi-digit numbers to the case of negative numbers by considering the polarity signs similar to single digits. In a first step, we evaluated this account by defining…
Hybrid parallel execution model for logic-based specification languages
Tsai, Jeffrey J P
2001-01-01
Parallel processing is a very important technique for improving the performance of various software development and maintenance activities. The purpose of this book is to introduce important techniques for the parallel execution of high-level specifications of software systems. These techniques are very useful for the construction, analysis, and transformation of reliable large-scale and complex software systems. Contents: Current Approaches; Overview of the New Approach; FRORL Requirements Specification Language and Its Decomposition; Rewriting and Data Dependency, Control Flow Analysis of a Lo
Mobile Parallel Manipulators, Modelling and Data-Driven Motion Planning
Directory of Open Access Journals (Sweden)
Amar Khoukhi
2013-11-01
Full Text Available This paper provides a kinematic and dynamic analysis of mobile parallel manipulators (MPM). The study is conducted on a composed multi-degree-of-freedom (DOF) parallel robot carried by a wheeled mobile platform. Both the positional and the differential kinematics problems for the hybrid structure are solved, and the redundancy problem is resolved using a generalized pseudo-inverse with a joint-limit secondary criterion. A minimum-time trajectory parameterization is obtained via a cycloidal profile to initialize the multi-objective trajectory planning of the MPM. The considered objectives include time-energy minimization, redundancy resolution and singularity avoidance. Simulation results illustrating the effectiveness of the proposed approach are presented and discussed.
Double-layer parallelization for hydrological model calibration on HPC systems
Zhang, Ang; Li, Tiejian; Si, Yuan; Liu, Ronghua; Shi, Haiyun; Li, Xiang; Li, Jiaye; Wu, Xia
2016-04-01
Large-scale problems that demand high precision have remarkably increased the computational time of numerical simulation models. Therefore, the parallelization of models has been widely implemented in recent years. However, computing time remains a major challenge when a large model is calibrated using optimization techniques. To overcome this difficulty, we proposed a double-layer parallel system for hydrological model calibration using high-performance computing (HPC) systems. The lower-layer parallelism is achieved using a hydrological model, the Digital Yellow River Integrated Model, which was parallelized by decomposing river basins. The upper-layer parallelism is achieved by simultaneous hydrological simulations with different parameter combinations in the same generation of the genetic algorithm and is implemented using the job scheduling functions of an HPC system. The proposed system was applied to the upstream of the Qingjian River basin, a sub-basin of the middle Yellow River, to calibrate the model effectively by making full use of the computing resources in the HPC system and to investigate the model's behavior under various parameter combinations. This approach is applicable to most of the existing hydrology models for many applications.
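The upper-layer parallelism described above, evaluating all parameter combinations of one genetic-algorithm generation concurrently, can be sketched as follows. This is a toy illustration: a stand-in objective replaces the hydrological model, a local thread pool stands in for the HPC job scheduler, and all names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
import random

def simulate(params):
    """Stand-in for one hydrological model run; returns a calibration
    error (lower is better). The real lower layer is itself parallel."""
    a, b = params
    return (a - 1.3) ** 2 + (b - 0.7) ** 2

def next_generation(population, errors):
    """Toy GA step: keep the better half, refill with mutated copies."""
    ranked = [p for _, p in sorted(zip(errors, population))]
    survivors = ranked[: len(ranked) // 2]
    children = [(a + random.gauss(0, 0.1), b + random.gauss(0, 0.1))
                for a, b in survivors]
    return survivors + children

random.seed(0)
population = [(random.uniform(0, 2), random.uniform(0, 2)) for _ in range(8)]
with ThreadPoolExecutor() as pool:  # upper layer: one job per parameter set
    for _ in range(5):
        errors = list(pool.map(simulate, population))
        population = next_generation(population, errors)
print(min(map(simulate, population)))  # the toy GA converges toward (1.3, 0.7)
```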
A Parallel and Distributed Surrogate Model Implementation for Computational Steering
Butnaru, Daniel
2012-06-01
Understanding the influence of multiple parameters in a complex simulation setting is a difficult task. In the ideal case, the scientist can freely steer such a simulation and is immediately presented with the results for a certain configuration of the input parameters. Such an exploration process is however not possible if the simulation is computationally too expensive. For these cases we present in this paper a scalable computational steering approach utilizing a fast surrogate model as substitute for the time-consuming simulation. The surrogate model we propose is based on the sparse grid technique, and we identify the main computational tasks associated with its evaluation and its extension. We further show how distributed data management combined with the specific use of accelerators allows us to approximate and deliver simulation results to a high-resolution visualization system in real-time. This significantly enhances the steering workflow and facilitates the interactive exploration of large datasets. © 2012 IEEE.
Experimental and modelling results of a parallel-plate based active magnetic regenerator
DEFF Research Database (Denmark)
Tura, A.; Nielsen, Kaspar Kirstein; Rowe, A.
2012-01-01
The performance of a permanent magnet magnetic refrigerator (PMMR) using gadolinium parallel plates is described. The configuration and operating parameters are described in detail. Experimental results are compared to simulations using an established twodimensional model of an active magnetic...
Mathematical Model of Thyristor Inverter Including a Series-parallel Resonant Circuit
Directory of Open Access Journals (Sweden)
Miroslaw Luft
2008-01-01
Full Text Available The article presents a mathematical model of a thyristor inverter including a series-parallel resonant circuit, derived with the aid of the state-variable method. Maple procedures are used to compute current and voltage waveforms in the inverter.
Pi: A Parallel Architecture Interface for Multi-Model Execution
1990-07-01
Parallel computer processing and modeling: applications for the ICU
Baxter, Grant; Pranger, L. Alex; Draghic, Nicole; Sims, Nathaniel M.; Wiesmann, William P.
2003-07-01
Current patient monitoring procedures in hospital intensive care units (ICUs) generate vast quantities of medical data, much of which is considered extemporaneous and not evaluated. Although sophisticated monitors to analyze individual types of patient data are routinely used in the hospital setting, this equipment lacks high-order signal analysis tools for detecting long-term trends and correlations between different signals within a patient data set. Without the ability to continuously analyze disjoint sets of patient data, it is difficult to detect slow-forming complications. As a result, the early onset of conditions such as pneumonia or sepsis may not be apparent until the advanced stages. We report here on the development of a distributed software architecture test bed and software medical models to analyze both asynchronous and continuous patient data in real time. Hardware and software have been developed to support a multi-node distributed computer cluster capable of amassing data from multiple patient monitors and projecting near- and long-term outcomes based upon the application of physiologic models to the incoming patient data stream. One computer acts as a central coordinating node; additional computers accommodate processing needs. A simple, non-clinical model for sepsis detection was implemented on the system for demonstration purposes. This work shows exceptional promise as a highly effective means to rapidly predict and thereby mitigate the effect of nosocomial infections.
Design and Performance Analysis of a Massively Parallel Atmospheric General Circulation Model
Directory of Open Access Journals (Sweden)
Daniel S. Schaffer
2000-01-01
Full Text Available In the 1990s, computer manufacturers are increasingly turning to the development of parallel processor machines to meet the high-performance needs of their customers. Simultaneously, atmospheric scientists studying weather and climate phenomena ranging from hurricanes to El Niño to global warming require increasingly fine-resolution models. Here, the implementation of a parallel atmospheric general circulation model (GCM) which exploits the power of massively parallel machines is described. Using a horizontal data-domain decomposition methodology, this Fortran 90 model is able to integrate a 0.6° longitude by 0.5° latitude problem at a rate of 19 Gigaflops on 512 processors of a Cray T3E 600, corresponding to 280 seconds of wall-clock time per simulated model day. At this resolution, the model has 64 times as many degrees of freedom and performs 400 times as many floating-point operations per simulated day as the model it replaces.
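The horizontal data-domain decomposition mentioned above amounts to splitting the longitude-latitude grid across a processor mesh. A sketch, assuming a 600 x 361 point global grid (0.6° x 0.5°) and a 32 x 16 arrangement of the 512 processors; the arrangement is an assumption for illustration, not stated in the abstract.

```python
def decompose(nlon, nlat, px, py):
    """Horizontal 2D domain decomposition: split an nlon x nlat grid
    over a px x py processor mesh, giving each rank its index ranges."""
    def split(n, parts):
        # Distribute n points over `parts` ranks as evenly as possible.
        base, extra = divmod(n, parts)
        bounds, start = [], 0
        for p in range(parts):
            size = base + (1 if p < extra else 0)
            bounds.append((start, start + size))
            start += size
        return bounds
    lon_b, lat_b = split(nlon, px), split(nlat, py)
    return {(i, j): (lon_b[i], lat_b[j]) for i in range(px) for j in range(py)}

tiles = decompose(600, 361, 32, 16)
print(len(tiles))        # 512 subdomains
print(tiles[(0, 0)])     # index ranges owned by the first rank
```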
Zhu, Xiang; Zhang, Dianwen
2013-01-01
We present a fast, accurate and robust parallel Levenberg-Marquardt minimization optimizer, GPU-LMFit, which is implemented on graphics processing unit for high performance scalable parallel model fitting processing. GPU-LMFit can provide a dramatic speed-up in massive model fitting analyses to enable real-time automated pixel-wise parametric imaging microscopy. We demonstrate the performance of GPU-LMFit for the applications in superresolution localization microscopy and fluorescence lifetime imaging microscopy.
Describing, using 'recognition cones' [parallel-serial model with English-like computer program]
Uhr, L.
1973-01-01
A parallel-serial 'recognition cone' model is examined, taking into account the model's ability to describe scenes of objects. An actual program is presented in an English-like language. The concept of a 'description' is discussed together with possible types of descriptive information. Questions regarding the level and the variety of detail are considered along with approaches for improving the serial representations of parallel systems.
Forward and backward models for fault diagnosis based on parallel genetic algorithms
Institute of Scientific and Technical Information of China (English)
Yi LIU; Ying LI; Yi-jia CAO; Chuang-xin GUO
2008-01-01
In this paper, a mathematical model consisting of forward and backward models is built on parallel genetic algorithms (PGAs) for fault diagnosis in a transmission power system. A new method to reduce the scale of fault sections is developed in the forward model, and the message passing interface (MPI) approach is chosen to parallelize the genetic algorithms by the global single-population master-slave method (GPGAs). The proposed approach is applied to a sample system consisting of 28 sections, 84 protective relays and 40 circuit breakers. Simulation results show that the new model based on GPGAs can achieve very fast computation in online applications to large-scale power systems.
Parallel Nonnegative Least Squares Solvers for Model Order Reduction
2016-03-01
For the PQN method, the size of the active set is controlled to promote sparse solutions (Section 3.2.1). Parallel nonnegative least squares (NNLS) solvers are developed specifically for
THE IMPROVEMENT OF THE COMPUTATIONAL PERFORMANCE OF THE ZONAL MODEL POMA USING PARALLEL TECHNIQUES
Directory of Open Access Journals (Sweden)
Yao Yu
2014-01-01
Full Text Available The zonal modeling approach is a simplified computational method used to predict the temperature distribution, energy use and indoor airflow thermal behavior in multi-zone buildings. Although this approach is known to use fewer computing resources than CFD models, the computational time is still an issue, especially when buildings are characterized by complicated geometry and indoor layouts of furnishings. Therefore, applying a new computing technique to current zonal models in order to reduce the computational time is a promising way to further improve model performance and promote the wide application of zonal models. Parallel computing techniques provide a way to accomplish these purposes. Unlike the serial computations commonly used in current zonal models, these parallel techniques decompose the serial program into several discrete instructions which can be executed simultaneously on different processors/threads. As a result, the computational time of the parallelized program can be significantly reduced compared to that of the traditional serial program. In this article, a parallel computing technique, Open Multi-Processing (OpenMP), is applied to the zonal model Pressurized zOnal Model with Air diffuser (POMA) in order to improve the model's computational performance, including the reduction of computational time and an investigation of the model's scalability.
Block and parallel modelling of broad domain nonlinear continuous mapping based on NN
Institute of Scientific and Technical Information of China (English)
Yang Guowei; Tu Xuyan; Wang Shoujue
2006-01-01
The necessity of block and parallel modeling of nonlinear continuous mappings with neural networks (NN) is first expounded quantitatively. Then, a practical approach for the block and parallel modeling of nonlinear continuous mappings with NN is proposed. Finally, an example indicating that the method presented in this paper can be realized with suitable existing software is given. The results of experiments with the discussed model on the 3-D Mexican straw hat indicate that block and parallel modeling based on NN is more precise and faster in computation than the direct approach, and that it is a concrete example and development of the large-scale general model established by Tu Xuyan.
DEFF Research Database (Denmark)
Wu, Guanglei; Bai, Shaoping; Kepler, Jørgen Asbøl
2012-01-01
This paper deals with the error modelling and analysis of a 3-PPR planar parallel manipulator with joint clearances. The kinematics and the Cartesian workspace of the manipulator are analyzed. An error model is established with considerations of both configuration errors and joint clearances. Using this model, the upper bounds and distributions of the pose errors for this manipulator are established. The results are compared with experimental measurements and show the effectiveness of the error prediction model.
A Model of Parallel Kinematics for Machine Calibration
DEFF Research Database (Denmark)
Pedersen, David Bue; Bæk Nielsen, Morten; Kløve Christensen, Simon
2016-01-01
Parallel kinematics have been adopted by more than 25 manufacturers of high-end desktop 3D printers [Wohlers Report (2015), p. 118] as well as by research projects such as the WASP project [WASP (2015)], a 12-meter-tall linear delta robot for Additive Manufacture of large-scale components … developed in order to decompose the different types of geometrical errors into 6 elementary cases. Deliberate introduction of errors to the virtual machine has subsequently allowed for the generation of deviation plots that can be used as a strong tool for the identification and correction of geometrical errors on a physical machine tool.
Parallelized Genetic Identification of the Thermal-Electrochemical Model for Lithium-Ion Battery
Directory of Open Access Journals (Sweden)
Liqiang Zhang
2013-01-01
Full Text Available The parameters of a well-predicting model can be used as health characteristics for a Lithium-ion battery. This article reports a parallelized parameter identification of the thermal-electrochemical model, which significantly reduces the time consumed by parameter identification. Since the P2D model has the most predictability, it is chosen for further research and extended to a thermal-electrochemical model by coupling the thermal effect and temperature-dependent parameters. A Genetic Algorithm is then used for parameter identification, but it takes too much time because of the long simulation time of the model. For this reason, a computer cluster was built from surplus computing resources in our laboratory based on the Parallel Computing Toolbox and Distributed Computing Server in MATLAB. The performance of two parallelized methods, namely Single Program Multiple Data (SPMD) and the parallel FOR loop (PARFOR), is investigated, and a parallelized GA identification is then proposed. With this method, model simulations run in parallel and parameter identification can be sped up more than a dozen times, and the identification result is better than that from the serial GA. This conclusion is validated by model parameter identification of a real LiFePO4 battery.
[Parallel PLS algorithm using MapReduce and its application in spectral modeling].
Yang, Hui-Hua; Du, Ling-Ling; Li, Ling-Qiao; Tang, Tian-Biao; Guo, Tuo; Liang, Qiong-Lin; Wang, Yi-Ming; Luo, Guo-An
2012-09-01
Partial least squares (PLS) has been widely used in spectral analysis and modeling; it is computation-intensive and time-demanding when dealing with massive data. To solve this problem effectively, a novel parallel PLS using MapReduce is proposed, which consists of two procedures: the parallelization of data standardization and the parallelization of principal component computation. Using NIR spectral modeling as an example, experiments were conducted on a Hadoop cluster, which is a collection of ordinary computers. The experimental results demonstrate that the proposed parallel PLS algorithm can handle massive spectra, can significantly cut down the modeling time, gains a basically linear speedup, and can be easily scaled up.
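The first procedure, parallelized data standardization, reduces to computing per-column means and variances with a map step over data chunks and an associative reduce step. A minimal sketch of that pattern (illustrative, not the authors' Hadoop implementation):

```python
from functools import reduce

def map_stats(chunk):
    """Map step: per-chunk count, column sums, and column sums of squares."""
    n = len(chunk)
    ncols = len(chunk[0])
    s = [sum(row[j] for row in chunk) for j in range(ncols)]
    s2 = [sum(row[j] ** 2 for row in chunk) for j in range(ncols)]
    return n, s, s2

def reduce_stats(a, b):
    """Reduce step: combine partial statistics from two chunks."""
    return (a[0] + b[0],
            [x + y for x, y in zip(a[1], b[1])],
            [x + y for x, y in zip(a[2], b[2])])

# Spectra split across chunks, as a MapReduce framework would shard them.
chunks = [[[1.0, 10.0], [2.0, 20.0]],
          [[3.0, 30.0], [4.0, 40.0]]]
n, s, s2 = reduce(reduce_stats, map(map_stats, chunks))
means = [x / n for x in s]
variances = [q / n - m ** 2 for q, m in zip(s2, means)]
print(means)      # → [2.5, 25.0]
print(variances)  # → [1.25, 125.0]
```

With means and variances in hand, each mapper can standardize its own chunk independently, which is what makes the step embarrassingly parallel.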
Parallel Motion Simulation of Large-Scale Real-Time Crowd in a Hierarchical Environmental Model
Directory of Open Access Journals (Sweden)
Xin Wang
2012-01-01
Full Text Available This paper presents a parallel real-time crowd simulation method based on a hierarchical environmental model. A dynamical model of the complex environment should be constructed to simulate the state transition and propagation of individual motions. By modeling of a virtual environment where virtual crowds reside, we employ different parallel methods on a topological layer, a path layer and a perceptual layer. We propose a parallel motion path matching method based on the path layer and a parallel crowd simulation method based on the perceptual layer. The large-scale real-time crowd simulation becomes possible with these methods. Numerical experiments are carried out to demonstrate the methods and results.
A Tool for Performance Modeling of Parallel Programs
Directory of Open Access Journals (Sweden)
J.A. González
2003-01-01
Full Text Available Current performance prediction analytical models try to characterize the performance behavior of actual machines through a small set of parameters. In practice, substantial deviations are observed. These differences are due to factors such as memory hierarchies or network latency. A natural approach is to associate a different proportionality constant with each basic block and, analogously, to associate different latencies and bandwidths with each "communication block". Unfortunately, using this approach implies that the evaluation of parameters must be done for each algorithm. This is a heavy task, involving experiment design, timing, statistics, pattern recognition and multi-parameter fitting algorithms; software support is required. We present a compiler that takes as source a C program annotated with complexity formulas and produces as output an instrumented code. The trace files obtained from the execution of the resulting code are analyzed with an interactive interpreter, giving us, among other information, the values of those parameters.
Efficiently parallelized modeling of tightly focused, large bandwidth laser pulses
Dumont, Joey; Lefebvre, Catherine; Gagnon, Denis; MacLean, Steve
2016-01-01
The Stratton-Chu integral representation of electromagnetic fields is used to study the spatio-temporal properties of large bandwidth laser pulses focused by high numerical aperture mirrors. We review the formal aspects of the derivation of diffraction integrals from the Stratton-Chu representation and discuss the use of the Hadamard finite part in the derivation of the physical optics approximation. By analyzing the formulation we show that, for the specific case of a parabolic mirror, the integrands involved in the description of the reflected field near the focal spot do not possess the strong oscillations characteristic of diffraction integrals. Consequently, the integrals can be evaluated with simple and efficient quadrature methods rather than with specialized, more costly approaches. We report on the development of an efficiently parallelized algorithm that evaluates the Stratton-Chu diffraction integrals for incident fields of arbitrary temporal and spatial dependence. We use our method to show that t...
Dirba, I.; Kleperis, J.
2011-01-01
Analytical and numerical modelling is performed for the linear actuator of a parallel-path magnet motor. In the model based on finite-element analysis, the 3D problem is reduced to a 2D problem, which is sufficiently precise for design purposes and allows modelling the principle of a parallel-path motor. The paper also describes a relevant numerical model and gives a comparison with experimental results. The numerical model includes all geometrical and physical characteristics of the motor components. The magnetic flux density and magnetic force are simulated using FEMM 4.2 software. An experimental model has also been developed and verified for the core of a switchable-magnetic-flux linear actuator and motor. The results of the experiments are compared with those of theoretical/analytical and numerical modelling.
Parallelization and Performance of the NIM Weather Model Running on GPUs
Govett, Mark; Middlecoff, Jacques; Henderson, Tom; Rosinski, James
2014-05-01
The Non-hydrostatic Icosahedral Model (NIM) is a global weather prediction model being developed to run on GPU and MIC fine-grain architectures. The model dynamics, written in Fortran, was initially parallelized for GPUs in 2009 using the F2C-ACC compiler and demonstrated good results running on a single GPU. Subsequent efforts have focused on (1) running efficiently on multiple GPUs, (2) parallelization of NIM for the Intel MIC using OpenMP, (3) assessing commercial Fortran GPU compilers now available from Cray, PGI and CAPS, (4) keeping the model up to date with the latest scientific developments while maintaining a single-source, performance-portable code, and (5) parallelization of two physics packages used in the NIM: the Global Forecast System (GFS) physics used operationally, and the widely used Weather Research and Forecasting (WRF) model physics. The presentation will touch on each of these efforts, but will highlight improvements in the parallel performance of the NIM running on the Titan GPU cluster at ORNL, the ongoing parallelization of model physics, and a recent evaluation of commercial GPU compilers using the F2C-ACC compiler as the baseline.
Modelling and simulation of multiple single - phase induction motor in parallel connection
Directory of Open Access Journals (Sweden)
Sujitjorn, S.
2006-11-01
Full Text Available A mathematical model for n parallel-connected single-phase induction motors in generalized state-space form is proposed in this paper. The motor group draws electric power from one inverter. The model is developed using dq-frame theory and was tested against four loading scenarios, in which satisfactory results were obtained.
A one-dimensional heat transfer model for parallel-plate thermoacoustic heat exchangers
de Jong, Anne; Wijnant, Ysbrand H.; de Boer, Andries
2014-01-01
A one-dimensional (1D) laminar oscillating flow heat transfer model is derived and applied to parallel-plate thermoacoustic heat exchangers. The model can be used to estimate the heat transfer from the solid wall to the acoustic medium, which is required for the heat input/output of thermoacoustic
A numerical model for thermoelectric generator with the parallel-plate heat exchanger
Yu, Jianlin; Zhao, Hua
This paper presents a numerical model to predict the performance of a thermoelectric generator with a parallel-plate heat exchanger. The model is based on an elemental approach; its distinguishing feature is the analysis of the temperature change in a thermoelectric generator and, concomitantly, its performance under operating conditions. Numerically simulated examples are demonstrated for thermoelectric generators of the parallel-flow and counter-flow types in this paper. Simulation results show that the variations in temperature of the fluids in the thermoelectric generator are linear. The numerical model developed in this paper may also be applied to further optimization studies for thermoelectric generators.
Advancing the extended parallel process model through the inclusion of response cost measures.
Rintamaki, Lance S; Yang, Z Janet
2014-01-01
This study advances the Extended Parallel Process Model through the inclusion of response cost measures, which are drawbacks associated with a proposed response to a health threat. A sample of 502 college students completed a questionnaire on perceptions regarding sexually transmitted infections and condom use after reading information from the Centers for Disease Control and Prevention on the health risks of sexually transmitted infections and the utility of latex condoms in preventing sexually transmitted infection transmission. The questionnaire included standard Extended Parallel Process Model assessments of perceived threat and efficacy, as well as questions pertaining to response costs associated with condom use. Results from hierarchical ordinary least squares regression demonstrated how the addition of response cost measures improved the predictive power of the Extended Parallel Process Model, supporting the inclusion of this variable in the model.
Efficiently parallelized modeling of tightly focused, large bandwidth laser pulses
Dumont, Joey; Fillion-Gourdeau, François; Lefebvre, Catherine; Gagnon, Denis; MacLean, Steve
2017-02-01
The Stratton-Chu integral representation of electromagnetic fields is used to study the spatio-temporal properties of large bandwidth laser pulses focused by high numerical aperture mirrors. We review the formal aspects of the derivation of diffraction integrals from the Stratton-Chu representation and discuss the use of the Hadamard finite part in the derivation of the physical optics approximation. By analyzing the formulation we show that, for the specific case of a parabolic mirror, the integrands involved in the description of the reflected field near the focal spot do not possess the strong oscillations characteristic of diffraction integrals. Consequently, the integrals can be evaluated with simple and efficient quadrature methods rather than with specialized, more costly approaches. We report on the development of an efficiently parallelized algorithm that evaluates the Stratton-Chu diffraction integrals for incident fields of arbitrary temporal and spatial dependence. This method has the advantage that its input is the unfocused field coming from the laser chain, which is experimentally known with high accuracy. We use our method to show that the reflection of a linearly polarized Gaussian beam of femtosecond duration off a high numerical aperture parabolic mirror induces ellipticity in the dominant field components and generates strong longitudinal components. We also estimate that future high-power laser facilities may reach intensities of 10^24 W cm^-2.
Modelling and analysis of fringing and metal thickness effects in MEMS parallel plate capacitors
Shah, Kriyang; Singh, Jugdutt; Zayegh, Aladin
2005-12-01
This paper presents a detailed design and analysis of fringing and metal thickness effects in a Micro Electro Mechanical System (MEMS) parallel plate capacitor. The MEMS capacitor is one of the most widely deployed components in various applications such as pressure sensors, accelerometers, Voltage Controlled Oscillators (VCOs) and other tuning circuits. The advantages of MEMS capacitors are miniaturisation, integration with optics, low power consumption and high quality factor for RF circuits. Parallel plate capacitor models found in the literature are discussed and the model best suited to MEMS capacitors is presented. From the equations presented it is found that the fringing field and metal thickness have logarithmic effects on capacitance and depend on the width of the parallel plates, the distance between them and the thickness of the metal plates. From this analysis a precise model of a MEMS parallel plate capacitor is developed which incorporates the effects of fringing fields and metal thickness. A parallel plate MEMS capacitor has been implemented using the Coventor design suite. Finite Element Method (FEM) analysis in the Coventorware design suite has been performed to verify the accuracy of the proposed model for a suitable range of MEMS capacitor dimensions. Simulations and analysis show that the error between the designed and the simulated values of the MEMS capacitor is significantly reduced. Application of the modified model to computing the capacitance of a combed device shows that the designed values differ noticeably from the simulated results, from 1.0339 pF to 1.3171 pF, in the case of fringed devices.
F-Nets and Software Cabling: Deriving a Formal Model and Language for Portable Parallel Programming
DiNucci, David C.; Saini, Subhash (Technical Monitor)
1998-01-01
Parallel programming is still being based upon antiquated sequence-based definitions of the terms "algorithm" and "computation", resulting in programs which are architecture dependent and difficult to design and analyze. By focusing on obstacles inherent in existing practice, a more portable model is derived here, which is then formalized into a model called Soviets which utilizes a combination of imperative and functional styles. This formalization suggests more general notions of algorithm and computation, as well as insights into the meaning of structured programming in a parallel setting. To illustrate how these principles can be applied, a very-high-level graphical architecture-independent parallel language, called Software Cabling, is described, with many of the features normally expected from today's computer languages (e.g. data abstraction, data parallelism, and object-based programming constructs).
Partitioning and packing mathematical simulation models for calculation on parallel computers
Arpasi, D. J.; Milner, E. J.
1986-01-01
The development of multiprocessor simulations from a serial set of ordinary differential equations describing a physical system is described. Degrees of parallelism (i.e., coupling between the equations) and their impact on parallel processing are discussed. The problem of identifying computational parallelism within sets of closely coupled equations that require the exchange of current values of variables is described. A technique is presented for identifying this parallelism and for partitioning the equations for parallel solution on a multiprocessor. An algorithm which packs the equations into a minimum number of processors is also described. The results of the packing algorithm when applied to a turbojet engine model are presented in terms of processor utilization.
Error Modelling and Experimental Validation for a Planar 3-PPR Parallel Manipulator
DEFF Research Database (Denmark)
Wu, Guanglei; Bai, Shaoping; Kepler, Jørgen Asbøl
2011-01-01
In this paper, the positioning error of a 3-PPR planar parallel manipulator is studied with an error model and experimental validation. First, the displacement and workspace are analyzed. An error model considering both configuration errors and joint clearance errors is established. Using this model, the maximum positioning error was estimated for a U-shape PPR planar manipulator, the results being compared with experimental measurements. It is found that the error distribution from the simulation approximates that of the measurements.
Error Modelling and Experimental Validation for a Planar 3-PPR Parallel Manipulator
Wu, Guanglei; Shaoping, Bai; Jørgen A., Kepler; Caro, Stéphane
2012-01-01
This paper deals with the error modelling and analysis of a 3-PPR planar parallel manipulator with joint clearances. The kinematics and the Cartesian workspace of the manipulator are analyzed. An error model is established with considerations of both configuration errors and joint clearances. Using this model, the upper bounds and distributions of the pose errors for this manipulator are established. The results are compared with experimental measurements...
Design and Implementation of “Many Parallel Task” Hybrid Subsurface Model
Energy Technology Data Exchange (ETDEWEB)
Agarwal, Khushbu; Chase, Jared M.; Schuchardt, Karen L.; Scheibe, Timothy D.; Palmer, Bruce J.; Elsethagen, Todd O.
2011-11-01
Continuum scale models have been used to study subsurface flow, transport, and reactions for many years. Recently, pore scale models, which operate at scales of individual soil grains, have been developed to more accurately model pore scale phenomena, such as precipitation, that may not be well represented at the continuum scale. However, particle-based models become prohibitively expensive for modeling realistic domains. Instead, we are developing a hybrid model that simulates the full domain at continuum scale and applies the pore model only to areas of high reactivity. The hybrid model uses a dimension reduction approach to formulate the mathematical exchange of information across scales. Since the location, size, and number of pore regions in the model vary, an adaptive Pore Generator is being implemented to define pore regions at each iteration. A fourth code will provide data transformation from the pore scale back to the continuum scale. These components are coupled into a single hybrid model using the SWIFT workflow system. Our hybrid model workflow simulates a kinetic controlled mixing reaction in which multiple pore-scale simulations occur for every continuum scale timestep. Each pore-scale simulation is itself parallel, thus exhibiting multi-level parallelism. Our workflow manages these multiple parallel tasks simultaneously, with the number of tasks changing across iterations. It also supports dynamic allocation of job resources and visualization processing at each iteration. We discuss the design, implementation and challenges associated with building a scalable, Many Parallel Task, hybrid model to run efficiently on thousands to tens of thousands of processors.
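The many-parallel-task pattern described above, where several independent pore-scale simulations are launched for every continuum timestep, can be sketched with a task pool. All names and the toy "kernel" here are illustrative assumptions; the actual hybrid model couples full pore-scale codes through the SWIFT workflow system.

```python
from concurrent.futures import ThreadPoolExecutor

def pore_scale_sim(region):
    # Stand-in for one pore-scale simulation of a high-reactivity region:
    # returns a refined reaction rate (the real model runs a particle code).
    return region["rate"] * 0.9  # purely illustrative damping factor

def continuum_step(state, pore_regions):
    # Launch all pore-scale tasks for this continuum timestep in parallel,
    # then fold the refined rates back into the continuum state.
    with ThreadPoolExecutor() as pool:
        refined = list(pool.map(pore_scale_sim, pore_regions))
    for region, rate in zip(pore_regions, refined):
        state[region["cell"]] = rate
    return state
```

In the actual workflow each task is itself a parallel (MPI) job, and the number of regions, and hence tasks, changes between iterations.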
Institute of Scientific and Technical Information of China (English)
HOU Fu-jun; WU Qi-zong
2007-01-01
A method for modeling the parallel machine scheduling problems with fuzzy parameters and precedence constraints based on credibility measure is provided. For the given n jobs to be processed on m machines, it is assumed that the processing times and the due dates are nonnegative fuzzy numbers and all the weights are positive, crisp numbers. Based on credibility measure, three parallel machine scheduling problems and a goal-programming model are formulated. Feasible schedules are evaluated not only by their objective values but also by the credibility degree of satisfaction with their precedence constraints. The genetic algorithm is utilized to find the best solutions in a short period of time. An illustrative numerical example is also given. Simulation results show that the proposed models are effective, which can deal with the parallel machine scheduling problems with fuzzy parameters and precedence constraints based on credibility measure.
Boyko, Oleksiy; Zheleznyak, Mark
2015-04-01
The original numerical code TOPKAPI-IMMS of the distributed rainfall-runoff model TOPKAPI (Todini et al., 1996-2014) is developed and implemented in Ukraine. A parallel version of the code has recently been developed for use on multiprocessor systems, i.e. multicore PCs and clusters. The algorithm is based on a binary-tree decomposition of the watershed to balance the amount of computation across all processors/cores. The Message Passing Interface (MPI) protocol is used as the parallel computing framework. The numerical efficiency of the parallelization algorithm is demonstrated in case studies of flood predictions for mountain watersheds of the Ukrainian Carpathian regions. The modeling results are compared with predictions based on lumped-parameter models.
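The balancing goal above, splitting the watershed so every processor receives a comparable amount of computation, can be illustrated with a simple greedy heuristic. The paper's actual scheme is a binary-tree decomposition of the watershed; this longest-processing-time sketch is only an assumed stand-in for the load-balancing idea.

```python
import heapq

def balance_loads(loads, n_procs):
    """Greedily assign per-subcatchment costs to processors, always giving
    the next-largest task to the currently least-loaded processor."""
    heap = [(0.0, p) for p in range(n_procs)]  # (assigned cost, proc id)
    heapq.heapify(heap)
    assignment = {}
    for i in sorted(range(len(loads)), key=lambda i: -loads[i]):
        cost, p = heapq.heappop(heap)
        assignment[i] = p                       # subcatchment i -> proc p
        heapq.heappush(heap, (cost + loads[i], p))
    return assignment
```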
Alvioli, M.; Baum, R.L.
2016-01-01
We describe a parallel implementation of TRIGRS, the Transient Rainfall Infiltration and Grid-Based Regional Slope-Stability Model for the timing and distribution of rainfall-induced shallow landslides. We have parallelized the four time-demanding execution modes of TRIGRS, namely both the saturated and unsaturated model with finite and infinite soil depth options, within the Message Passing Interface framework. In addition to new features of the code, we outline details of the parallel implementation and show the performance gain with respect to the serial code. Results are obtained both on commercial hardware and on a high-performance multi-node machine, showing the different limits of applicability of the new code. We also discuss the implications for the application of the model on large-scale areas and as a tool for real-time landslide hazard monitoring.
A primitive kinetic-fluid model for quasi-parallel propagating magnetohydrodynamic waves
Energy Technology Data Exchange (ETDEWEB)
Nariyuki, Y. [Faculty of Human Development, University of Toyama, 3190 Toyama City, Toyama 930-8555 (Japan); Saito, S. [Graduate School of Science, Nagoya University, Nagoya, Aichi 464-8601 (Japan); Umeda, T. [Solar-Terrestrial Environment Laboratory, Nagoya University, Nagoya, Aichi 464-8601 (Japan)
2013-07-15
The extension and limitation of the existing one-dimensional kinetic-fluid model (Vlasov-MHD (magnetohydrodynamic) model), which has been used to analyze parametric instabilities of parallel propagating Alfvén waves, are discussed. The inconsistency among the given velocity distribution functions in the past studies is resolved through the systematic derivation of the multi-dimensional Vlasov-MHD model. The linear dispersion analysis of the present model indicates that the collisionless damping of the slow modes is adequately evaluated in low beta plasmas, although the deviation between the present model and the full-Vlasov theory increases with increasing plasma beta and increasing propagation angle. This is because the transit-time damping is not correctly evaluated in the present model. It is also shown that the ponderomotive density fluctuations associated with the envelope-modulated quasi-parallel propagating Alfvén waves derived from the present model is not consistent with those derived from the other models such as the Landau-fluid model, except for low beta plasmas. The result indicates the present model would be useful to understand the linear and nonlinear development of the Alfvénic turbulence in the inner heliosphere, whose condition is relatively low beta, while the existing model and the present model are insufficient to discuss the parametric instabilities of Alfvén waves in high beta plasmas and the obliquely propagating waves.
PVeStA: A Parallel Statistical Model Checking and Quantitative Analysis Tool
AlTurki, Musab
2011-01-01
Statistical model checking is an attractive formal analysis method for probabilistic systems, such as cyber-physical systems, which are often probabilistic in nature. This paper is about drastically increasing the scalability of statistical model checking, and making such scalability of analysis available to tools like Maude, where probabilistic systems can be specified at a high level as probabilistic rewrite theories. It presents PVeStA, an extension and parallelization of the VeStA statistical model checking tool [10]. PVeStA supports statistical model checking of probabilistic real-time systems specified as either: (i) discrete or continuous Markov Chains; or (ii) probabilistic rewrite theories in Maude. Furthermore, the properties that it can model check can be expressed in either: (i) PCTL/CSL, or (ii) the QuaTEx quantitative temporal logic. As our experiments show, the performance gains obtained from parallelization can be very high. © 2011 Springer-Verlag.
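The parallelization that statistical model checking admits is embarrassingly parallel: Monte Carlo verification runs are independent, so they can be farmed out to workers and the satisfaction probability aggregated afterwards. A minimal sketch follows, in which a biased coin stands in for sampling a trace and checking the property; the real tool distributes Maude simulations over a network rather than threads.

```python
import random
from concurrent.futures import ThreadPoolExecutor

def simulate_batch(n, seed):
    # One worker: n independent runs of a toy probabilistic system.
    # "Heads with p = 0.3" stands in for "the sampled trace satisfies
    # the PCTL/CSL or QuaTEx property being checked".
    rng = random.Random(seed)
    return sum(rng.random() < 0.3 for _ in range(n))

def parallel_estimate(total_runs, workers):
    # Farm out equal batches, then pool the hit counts into one estimate.
    per = total_runs // workers
    with ThreadPoolExecutor(max_workers=workers) as pool:
        hits = pool.map(simulate_batch, [per] * workers, range(workers))
        return sum(hits) / (per * workers)
```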
Sparse Probabilistic Parallel Factor Analysis for the Modeling of PET and Task-fMRI Data
DEFF Research Database (Denmark)
Beliveau, Vincent; Papoutsakis, Georgios; Hinrich, Jesper Løve
2017-01-01
Modern datasets are often multiway in nature and can contain patterns common to a mode of the data (e.g. space, time, and subjects). Multiway decompositions such as parallel factor analysis (PARAFAC) take into account the intrinsic structure of the data, and sparse versions of these methods improve interpretability of the results. Here we propose a variational Bayesian parallel factor analysis (VB-PARAFAC) model and an extension with sparse priors (SP-PARAFAC). Notably, our formulation admits time and subject specific noise modeling as well as subject specific offsets (i.e., mean values). We confirmed the validity of the models through simulation and performed exploratory analysis of positron emission tomography (PET) and functional magnetic resonance imaging (fMRI) data. Although more constrained, the proposed models performed similarly to more flexible models in approximating the PET data, which supports...
Zehner, Björn; Hellwig, Olaf; Linke, Maik; Görz, Ines; Buske, Stefan
2016-01-01
3D geological underground models are often presented by vector data, such as triangulated networks representing boundaries of geological bodies and geological structures. Since models are to be used for numerical simulations based on the finite difference method, they have to be converted into a representation discretizing the full volume of the model into hexahedral cells. Often the simulations require a high grid resolution and are done using parallel computing. The storage of such a high-resolution raster model would require a large amount of storage space and it is difficult to create such a model using the standard geomodelling packages. Since the raster representation is only required for the calculation, but not for the geometry description, we present an algorithm and concept for rasterizing geological models on the fly for the use in finite difference codes that are parallelized by domain decomposition. As a proof of concept we implemented a rasterizer library and integrated it into seismic simulation software that is run as parallel code on a UNIX cluster using the Message Passing Interface. We can thus run the simulation with realistic and complicated surface-based geological models that are created using 3D geomodelling software, instead of using a simplified representation of the geological subsurface using mathematical functions or geometric primitives. We tested this set-up using an example model that we provide along with the implemented library.
Parallel plate model for trabecular bone exhibits volume fraction-dependant bias
DEFF Research Database (Denmark)
Day, J; Ding, Ming; Odgaard, A;
2000-01-01
Unbiased stereological methods were used in conjunction with microcomputed tomographic (micro-CT) scans of human and animal bone to investigate errors created when the parallel plate model was used to calculate morphometric parameters. Bone samples were obtained from the human proximal tibia, canine distal femur, rat tail, and pig spine and scanned in a micro-CT scanner. Trabecular thickness, trabecular spacing, and trabecular number were calculated using the parallel plate model. Direct thickness and spacing, and connectivity density were calculated using unbiased three-dimensional methods...
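The parallel plate model referred to above derives the morphometric indices from two measured quantities, volume fraction (BV/TV) and surface density (BS/TV), through the standard plate-model relations; the bias the study reports arises because real trabeculae violate the plate assumption. A sketch of the relations (function and variable names are ours):

```python
def plate_model(bv_tv, bs_tv):
    """Parallel plate model morphometry from volume fraction BV/TV and
    surface density BS/TV (standard histomorphometric relations)."""
    tb_th = 2.0 * bv_tv / bs_tv    # Tb.Th = 2 * BV/BS
    tb_n = bs_tv / 2.0             # Tb.N = (BV/TV) / Tb.Th = (BS/TV) / 2
    tb_sp = (1.0 - bv_tv) / tb_n   # Tb.Sp = 1/Tb.N - Tb.Th
    return tb_th, tb_n, tb_sp
```

For example, BV/TV = 0.2 and BS/TV = 4.0 mm^-1 give Tb.Th = 0.1 mm, Tb.N = 2.0 mm^-1 and Tb.Sp = 0.4 mm; direct 3D measurements on the same sample will generally disagree with these plate-model values, which is exactly the bias under study.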
Directory of Open Access Journals (Sweden)
ZHOU Hao
2015-08-01
In order to solve the intensive computing tasks and high memory demand problem in satellite gravity field model inversion on the basis of huge amounts of satellite gravity observations, a parallel algorithm for high truncated degree and order satellite gravity field model inversion with the least squares method on the basis of MPI is introduced. After analyzing the time and space complexity of each step in the solution flow, parallel I/O, block-organized storage and block-organized computation algorithms on the basis of MPI are introduced to design the parallel algorithm for building the design matrix and for establishing and solving the normal equations; the simulation results indicate that the parallel efficiencies of building the design matrix and of establishing and solving the normal equations reach about 95%, 68% and 63% respectively. In addition, on the basis of GOCE simulated orbits and radial disturbance gravity gradient data (518 400 epochs in total), two earth gravity models truncated to degree and order 120 and 240 are inverted, with computation times of only about 40 minutes and 7 hours and memory demands of 290 MB and 1.57 GB, respectively. Eventually, a simulation numerical calculation for earth gravity field model inversion is conducted with simulated data whose noise level is equivalent to the GRACE and GOCE missions. The accuracy of the inverted model is in good agreement with currently released models, and the combined mode can complement the spectral information of each individual mission, which indicates that the parallel algorithm in this paper can be applied to invert high truncated degree and order earth gravity models efficiently and stably.
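The block-organized computation of the normal equations described above rests on the fact that N = AᵀA and b = Aᵀy are sums over row blocks of the design matrix A, so each block can be processed independently (and assigned to a different MPI rank) without ever holding the full A in memory. A serial NumPy sketch of that accumulation, with illustrative names:

```python
import numpy as np

def normal_equations_blocked(obs_blocks, rhs_blocks):
    """Accumulate N = A^T A and b = A^T y over row blocks of the design
    matrix, then solve N x = b. Each block's contribution is independent,
    which is what allows the blocks to be farmed out over MPI ranks."""
    n_params = obs_blocks[0].shape[1]
    N = np.zeros((n_params, n_params))
    b = np.zeros(n_params)
    for A_blk, y_blk in zip(obs_blocks, rhs_blocks):
        N += A_blk.T @ A_blk
        b += A_blk.T @ y_blk
    return np.linalg.solve(N, b)
```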
The Parallelism of Traditional Transaction Model
Institute of Scientific and Technical Information of China (English)
张志强; 李建中; 周立柱
2001-01-01
The transaction is a very important concept in DBMSs, with features such as consistency, atomicity, durability and isolation. In this paper, we first analyze the parallelism of the traditional transaction model. Next we point out that more parallelism can be exploited through a highly parallel processing manner on multi-processor parallel architectures. We then compare the influence of two different software architectures on database system parallelism.
Energy Technology Data Exchange (ETDEWEB)
Leng, Wei [Chinese Academy of Sciences]; Ju, Lili [University of South Carolina]; Gunzburger, Max [Florida State University]; Price, Stephen [Los Alamos National Laboratory]; Ringler, Todd [Los Alamos National Laboratory]
2012-01-01
The numerical modeling of glacier and ice sheet evolution is a subject of growing interest, in part because of the potential for models to inform estimates of global sea level change. This paper focuses on the development of a numerical model that determines the velocity and pressure fields within an ice sheet. Our numerical model features a high-fidelity mathematical model involving the nonlinear Stokes system and combinations of no-sliding and sliding basal boundary conditions, high-order accurate finite element discretizations based on variable resolution grids, and highly scalable parallel solution strategies, all of which contribute to a numerical model that can achieve accurate velocity and pressure approximations in a highly efficient manner. We demonstrate the accuracy and efficiency of our model by analytical solution tests, established ice sheet benchmark experiments, and comparisons with other well-established ice sheet models.
Wilde, M.; Mickelson, S. A.; Jacob, R. L.; Zamboni, L.; Elliott, J.; Yan, E.
2012-12-01
Climate models continually increase both in their resolution and structural complexity, resulting in multi-terabyte model outputs. This volume of data overwhelms the current model processing procedures that are used to derive climate averages, perform analysis, produce visualizations, and integrate climate models with other datasets. We describe here the application of a new programming model - implicitly parallel functional dataflow scripting - for expressing the processing steps needed to post-process, analyze, integrate, and visualize the output of climate models. This programming model, implemented in the Swift parallel scripting language, provides a many-fold speedup of processing while reducing the amount of manual effort involved. It is characterized by: - implicit, pervasive parallelism, enabling scientists to leverage diverse parallel resources with reduced programming complexity; - abstraction of computing location and resource types, and automation of high performance data transport; - compact, uniform representation for the processing protocols and procedures of a research group or community under which virtually all existing software tools and languages can be coordinated; and - tracking of the provenance of derived data objects, providing a means for diagnostic interrogation and assessment of computational results. We report here on four model-analysis and/or data integration applications of this approach: 1) Re-coding of the community-standard diagnostic packages used to post-process data from the Community Atmosphere Model and the Parallel Ocean Program in Swift. This has resulted in valuable speedups in model analysis for these heavily used procedures. 2) Processing of model output from HiRAM, the GFDL global HIgh Resolution Atmospheric Model, automating and parallelizing post-processing steps that have in the past been both manually and computationally intensive. Swift automatically processed 50 HiRAM realizations comprising over 50TB of model
Efficient Parallel Implementation of Active Appearance Model Fitting Algorithm on GPU
Directory of Open Access Journals (Sweden)
Jinwei Wang
2014-01-01
The active appearance model (AAM) is one of the most powerful model-based object detecting and tracking methods which has been widely used in various situations. However, the high-dimensional texture representation causes very time-consuming computations, which makes the AAM difficult to apply to real-time systems. The emergence of modern graphics processing units (GPUs) that feature a many-core, fine-grained parallel architecture provides new and promising solutions to overcome the computational challenge. In this paper, we propose an efficient parallel implementation of the AAM fitting algorithm on GPUs. Our design idea is fine grain parallelism in which we distribute the texture data of the AAM, in pixels, to thousands of parallel GPU threads for processing, which makes the algorithm fit better into the GPU architecture. We implement our algorithm using the compute unified device architecture (CUDA) on Nvidia's GTX 650 GPU, which has the latest Kepler architecture. To compare the performance of our algorithm with different data sizes, we built sixteen face AAM models of different dimensional textures. The experiment results show that our parallel AAM fitting algorithm can achieve real-time performance for videos even on very high-dimensional textures.
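The fine-grained parallelism described above assigns one GPU thread per texture pixel. The reason this works is that the residual computation dominating AAM fitting is elementwise: every pixel's error is independent of the others, so it maps naturally onto CUDA threads, or, as a CPU analogue, onto vectorized array operations. A NumPy sketch with illustrative names (not the paper's actual code):

```python
import numpy as np

def texture_residual(sampled, model_mean, model_modes, params):
    """Per-pixel residual between the image texture sampled under the
    current shape and the AAM texture model instance. Every output pixel
    depends only on its own inputs, which is what maps each pixel onto
    one GPU thread in the CUDA implementation."""
    synthesized = model_mean + model_modes @ params  # model texture instance
    return sampled - synthesized                     # elementwise, per pixel
```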
Co-simulation of dynamic systems in parallel and serial model configurations
Energy Technology Data Exchange (ETDEWEB)
Sweafford, Trevor [General Motors, Milford (United States); Yoon, Hwan Sik [The University of Alabama, Tuscaloosa (United States)
2013-12-15
Recent advancements in simulation software and computation hardware make it possible to simulate complex dynamic systems comprised of multiple submodels developed in different modeling languages. This so-called co-simulation enables one to study various aspects of a complex dynamic system with heterogeneous submodels in a cost-effective manner. Among several different model configurations for co-simulation, the synchronized parallel configuration is regarded as expediting the simulation process by simulating multiple submodels concurrently on a multi-core processor. In this paper, computational accuracy as well as computation time are studied for three different co-simulation frameworks: integrated, serial, and parallel. For this purpose, analytical evaluations of the three different methods are made using the explicit Euler method and then applied to two-DOF mass-spring systems. The results show that while the parallel simulation configuration produces the same accurate results as the integrated configuration, results of the serial configuration show a slight deviation. It is also shown that the computation time can be reduced by running the simulation in the parallel configuration. Therefore, it can be concluded that the synchronized parallel simulation methodology is the best for both simulation accuracy and time efficiency.
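The parallel (Jacobi-type) co-simulation configuration can be illustrated on a two-DOF mass-spring example of the kind the paper uses: each submodel advances one step with the coupling partner's state frozen at the last exchange, so the two submodels could run concurrently on separate cores. A sketch with explicit Euler and assumed parameter values, not the paper's actual setup:

```python
def step_subsystem(x, v, k, kc, x_other, m, h):
    # Explicit Euler step of one mass, with the coupling partner's
    # displacement frozen at its value from the last data exchange.
    a = (-k * x + kc * (x_other - x)) / m
    return x + h * v, v + h * a

def cosim_parallel(n_steps, h=1e-3):
    # Jacobi (parallel) configuration: both submodels advance from the
    # same exchanged states, so they are independent within a macro step.
    x1, v1, x2, v2 = 1.0, 0.0, 0.0, 0.0   # mass 1 displaced initially
    for _ in range(n_steps):
        x1_old, x2_old = x1, x2            # exchange coupling states
        x1, v1 = step_subsystem(x1, v1, k=1.0, kc=0.5, x_other=x2_old, m=1.0, h=h)
        x2, v2 = step_subsystem(x2, v2, k=1.0, kc=0.5, x_other=x1_old, m=1.0, h=h)
    return x1, x2
```

A serial (Gauss-Seidel) variant would instead pass the freshly updated x1 into the second subsystem's step, which removes the concurrency and, as the paper observes, slightly changes the results.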
A one-dimensional heat transfer model for parallel-plate thermoacoustic heat exchangers.
de Jong, J A; Wijnant, Y H; de Boer, A
2014-03-01
A one-dimensional (1D) laminar oscillating flow heat transfer model is derived and applied to parallel-plate thermoacoustic heat exchangers. The model can be used to estimate the heat transfer from the solid wall to the acoustic medium, which is required for the heat input/output of thermoacoustic systems. The model is implementable in existing (quasi-)1D thermoacoustic codes, such as DeltaEC. Examples of generated results show good agreement with literature results. The model allows for arbitrary wave phasing; however, it is shown that the wave phasing does not significantly influence the heat transfer.
Dynamic modelling of a 3-CPU parallel robot via screw theory
Directory of Open Access Journals (Sweden)
L. Carbonari
2013-04-01
The article describes the dynamic modelling of I.Ca.Ro., a novel Cartesian parallel robot recently designed and prototyped by the robotics research group of the Polytechnic University of Marche. By means of screw theory and virtual work principle, a computationally efficient model has been built, with the final aim of realising advanced model based controllers. Then a dynamic analysis has been performed in order to point out possible model simplifications that could lead to a more efficient run time implementation.
Modeling and Control of the Redundant Parallel Adjustment Mechanism on a Deployable Antenna Panel.
Tian, Lili; Bao, Hong; Wang, Meng; Duan, Xuechao
2016-10-01
With the aim of developing multiple input and multiple output (MIMO) coupling systems with a redundant parallel adjustment mechanism on the deployable antenna panel, a structural control integrated design methodology is proposed in this paper. Firstly, the modal information from the finite element model of the structure of the antenna panel is extracted, and then the mathematical model is established with the Hamilton principle; Secondly, the discrete Linear Quadratic Regulator (LQR) controller is added to the model in order to control the actuators and adjust the shape of the panel. Finally, the engineering practicality of the modeling and control method based on finite element analysis simulation is verified.
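The discrete LQR controller mentioned above computes its state-feedback gain from the solution of a discrete algebraic Riccati equation. A minimal sketch via fixed-point iteration follows; the plant matrices in the usage below are illustrative scalars, not the panel's actual modal model:

```python
import numpy as np

def dlqr(A, B, Q, R, iters=200):
    """Discrete LQR gain K for x' = A x + B u minimizing sum of
    x^T Q x + u^T R u, via fixed-point iteration of the discrete
    Riccati equation (a sketch; production code would use a solver
    such as scipy.linalg.solve_discrete_are)."""
    P = Q.copy()
    for _ in range(iters):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)  # gain update
        P = Q + A.T @ P @ (A - B @ K)                       # Riccati update
    return K
```

For the scalar system A = B = Q = R = 1 the iteration converges to the known closed-form gain K = (√5 − 1)/2 ≈ 0.618.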
Grid Service Framework: Supporting Multi-Models Parallel Grid Programming
Institute of Scientific and Technical Information of China (English)
邓倩妮; 陆鑫达
2004-01-01
Web services are a grid computing technology that promises greater ease of use and interoperability than previous distributed computing technologies. This paper proposes the Grid Service Framework, a grid computing platform based on Microsoft .NET that uses web services to: (1) locate and harness volunteer computing resources for different applications; (2) support multiple parallel programming paradigms, such as Master/Slave, Divide and Conquer, and Phase Parallel, in a Grid environment; and (3) allocate data and balance load dynamically and transparently for grid computing applications. The framework was used to implement several simple parallel computing applications. The results show that the proposed Grid Service Framework is suitable for generic parallel numerical computing.
Advanced boundary electrode modeling for tES and parallel tES/EEG
Agsten, Britte; Pursiainen, Sampsa; Wolters, Carsten H
2016-01-01
This paper explores advanced electrode modeling in the context of separate and parallel transcranial electrical stimulation (tES) and electroencephalography (EEG) measurements. We focus on boundary condition based approaches that do not necessitate adding auxiliary elements, e.g. sponges, to the computational domain. In particular, we investigate the complete electrode model (CEM) which incorporates a detailed description of the skin-electrode interface including its contact surface, impedance and normal current distribution. The CEM can be applied for both tES and EEG electrodes which is advantageous when a parallel system is used. In comparison to the CEM, we test two important reduced approaches: the gap model (GAP) and the point electrode model (PEM). We aim to find out the differences of these approaches for a realistic numerical setting based on the stimulation of the auditory cortex. The results obtained suggest, among other things, that GAP and GAP/PEM are sufficiently accurate for the practical appli...
Hill, Gary; Du Val, Ronald W.; Green, John A.; Huynh, Loc C.
1990-01-01
A piloted comparison of rigid and aeroelastic blade-element rotor models was conducted at the Crew Station Research and Development Facility (CSRDF) at Ames Research Center. FLIGHTLAB, a new simulation development and analysis tool, was used to implement these models in real time using parallel processing technology. Pilot comments and quantitative analysis performed both on-line and off-line confirmed that elastic degrees of freedom significantly affect perceived handling qualities. Trim comparisons show improved correlation with flight test data when elastic modes are modeled. The results demonstrate the efficiency with which the mathematical modeling sophistication of existing simulation facilities can be upgraded using parallel processing, and the importance of these upgrades to simulation fidelity.
From Cells to Islands: An unified Model of Cellular Parallel Genetic Algorithms
Simoncini, David; Verel, Sébastien; Clergue, Manuel
2008-01-01
This paper presents the anisotropic selection scheme for cellular Genetic Algorithms (cGA). This new scheme makes it possible to enhance diversity and to control the selective pressure, two important issues in Genetic Algorithms, especially when trying to solve difficult optimization problems. Varying the anisotropic degree of selection allows switching from a cellular to an island model of parallel genetic algorithm. Measures of performance and diversity have been performed on one well-known problem, the Quadratic Assignment Problem, which is known to be difficult to optimize. Experiments show that, by tuning the anisotropic degree, we can find an appropriate trade-off between cGA and island models that optimizes the performance of parallel evolutionary algorithms. This trade-off can be interpreted as the suitable degree of migration among subpopulations in a parallel Genetic Algorithm.
PARALLEL ADAPTIVE MULTILEVEL SAMPLING ALGORITHMS FOR THE BAYESIAN ANALYSIS OF MATHEMATICAL MODELS
Prudencio, Ernesto
2012-01-01
In recent years, Bayesian model updating techniques based on measured data have been applied to many engineering and applied science problems. At the same time, parallel computational platforms are becoming increasingly more powerful and are being used more frequently by the engineering and scientific communities. Bayesian techniques usually require the evaluation of multi-dimensional integrals related to the posterior probability density function (PDF) of uncertain model parameters. The fact that such integrals cannot be computed analytically motivates research into stochastic simulation methods for sampling posterior PDFs. One such algorithm is the adaptive multilevel stochastic simulation algorithm (AMSSA). In this paper we discuss the parallelization of AMSSA, formulating the necessary load balancing step as a binary integer programming problem. We present a variety of results showing the effectiveness of load balancing on the overall performance of AMSSA in a parallel computational environment.
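The load-balancing step above is formulated in the paper as a binary integer program. As a loose illustration of the same balancing goal (not the authors' formulation), a greedy longest-processing-time-first heuristic distributes tasks of uneven cost across processors:

```python
import heapq

def lpt_assign(task_costs, n_procs):
    """Longest-processing-time-first heuristic: assign each task,
    largest first, to the currently least-loaded processor."""
    heap = [(0.0, p) for p in range(n_procs)]  # (load, processor id)
    heapq.heapify(heap)
    assignment = {}
    for i in sorted(range(len(task_costs)), key=lambda i: -task_costs[i]):
        load, p = heapq.heappop(heap)
        assignment[i] = p
        heapq.heappush(heap, (load + task_costs[i], p))
    return assignment
```

LPT gives a provably near-optimal makespan for this scheduling problem, whereas the binary integer program in the paper can balance exactly at higher solve cost.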
BSIRT: a block-iterative SIRT parallel algorithm using curvilinear projection model.
Zhang, Fa; Zhang, Jingrong; Lawrence, Albert; Ren, Fei; Wang, Xuan; Liu, Zhiyong; Wan, Xiaohua
2015-03-01
Large-field high-resolution electron tomography enables visualizing detailed mechanisms under the global structure. As the field enlarges, the distortions of reconstruction and the processing time become more critical. Using the curvilinear projection model can improve the quality of large-field ET reconstruction, but its computational complexity further exacerbates the processing time. Moreover, there has been no parallel strategy on GPUs for iterative reconstruction methods with curvilinear projection. Here we propose a new block-iterative SIRT parallel algorithm with the curvilinear projection model (BSIRT) for large-field ET reconstruction, to improve the quality of reconstruction and accelerate the reconstruction process. We also develop several key techniques, including a block-iterative method with curvilinear projection, a scope-based data decomposition method and a page-based data transfer scheme, to implement the parallelization of BSIRT on the GPU platform. Experimental results show that BSIRT can improve the reconstruction quality as well as the speed of the reconstruction process.
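BSIRT builds on the classical SIRT iteration. A minimal straight-ray sketch of that base iteration (not the curvilinear, GPU-blocked algorithm itself) updates the volume estimate with row- and column-normalized back-projected residuals:

```python
import numpy as np

def sirt(A, b, n_iter=200):
    """Simultaneous Iterative Reconstruction Technique:
    x <- x + C A^T R (b - A x), where R and C hold inverse row and
    column sums of the (nonnegative) projection matrix A."""
    R = 1.0 / np.maximum(A.sum(axis=1), 1e-12)
    C = 1.0 / np.maximum(A.sum(axis=0), 1e-12)
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        x = x + C * (A.T @ (R * (b - A @ x)))
    return x
```

On a consistent system the iterates converge to an exact solution; the paper's contribution is applying this style of update block-by-block with curvilinear rays on the GPU.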
On dynamic loads in parallel shaft transmissions. 1: Modelling and analysis
Lin, Edward Hsiang-Hsi; Huston, Ronald L.; Coy, John J.
1987-01-01
A model of a simple parallel-shaft, spur-gear transmission is presented. The model is developed to simulate dynamic loads in power transmissions. Factors affecting these loads are identified. Included are shaft stiffness, local compliance due to contact stress, load sharing, and friction. Governing differential equations are developed and a solution procedure is outlined. A parameter study of the solutions is presented in NASA TM-100181 (AVSCOM TM-87-C-3).
The Modelling of Mechanism with Parallel Kinematic Structure in Software Matlab/Simulink
Directory of Open Access Journals (Sweden)
Vladimir Bulej
2016-09-01
Full Text Available The article deals with the preparation of a simulation model of a mechanism with parallel kinematic structure, called a hexapod, as an electro-mechanical system in MATLAB/Simulink. The simulation model is composed of functional blocks, each representing a part of the mechanism's kinematic structure with certain properties. The results should be used for further simulation of its behaviour as well as for generating control algorithms for a real functional prototype.
On the adequation of dynamic modelling and control of parallel kinematic manipulators.
Ozgür, Erol; Andreff, Nicolas; Martinet, Philippe
2010-01-01
This paper addresses the problem of controlling the dynamics of parallel kinematic manipulators from a global point of view, where modeling, sensing and control are considered simultaneously. The methodology is presented through the examples of the Gough-Stewart manipulator and the Quattro robot.
All-pairs Shortest Path Algorithm based on MPI+CUDA Distributed Parallel Programming Model
Directory of Open Access Journals (Sweden)
Qingshuang Wu
2013-12-01
Full Text Available Computing shortest paths in a graph is a complex and time-consuming process, and traditional algorithms that rely solely on the CPU as the computing unit cannot meet the demands of real-time processing. In this paper, we present an all-pairs shortest paths algorithm using the MPI+CUDA hybrid programming model, which exploits the overwhelming computing power of a GPU cluster to speed up processing. The proposed algorithm combines the advantages of the MPI and CUDA programming models and realizes two-level parallel computing. At the cluster level, we use the MPI programming model to achieve coarse-grained parallel computing between the computational nodes of the GPU cluster. At the node level, we use the CUDA programming model to achieve GPU-accelerated fine-grained parallel computing within each computational node. The experimental results show that the MPI+CUDA-based parallel algorithm can take full advantage of the powerful computing capability of the GPU cluster and can achieve a speedup of several hundred times; the whole algorithm has good computing performance, reliability and scalability, and is able to meet the demand of real-time processing of massive spatial shortest-path analysis.
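The node-level kernel in such schemes is typically the Floyd-Warshall all-pairs relaxation, whose inner two loops vectorize naturally (and map onto CUDA threads, with MPI distributing row blocks across nodes). This NumPy sketch is illustrative, not the authors' MPI+CUDA code:

```python
import numpy as np

def floyd_warshall(adj):
    """All-pairs shortest paths; adj[i, j] is the edge weight, np.inf if absent."""
    dist = adj.copy().astype(float)
    n = dist.shape[0]
    np.fill_diagonal(dist, 0.0)
    for k in range(n):
        # Relax every pair through intermediate vertex k; the whole n x n
        # update is a single vectorized operation (one GPU kernel per k).
        dist = np.minimum(dist, dist[:, k:k + 1] + dist[k:k + 1, :])
    return dist

INF = np.inf
adj = np.array([[0, 3, INF, 7],
                [8, 0, 2, INF],
                [5, INF, 0, 1],
                [2, INF, INF, 0]])
d = floyd_warshall(adj)
```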
Teaching Scientific Computing: A Model-Centered Approach to Pipeline and Parallel Programming with C
Directory of Open Access Journals (Sweden)
Vladimiras Dolgopolovas
2015-01-01
Full Text Available The aim of this study is to present an approach to the introduction to pipeline and parallel computing, using a model of a multiphase queueing system. Pipeline computing, including software pipelines, is among the key concepts in modern computing and electronics engineering. Modern computer science and engineering education requires a comprehensive curriculum, so the introduction to pipeline and parallel computing is an essential topic to be included. At the same time, the topic is among the most demanding to teach, owing to its comprehensive multidisciplinary and technical requirements. To enhance the educational process, the paper proposes a novel model-centered framework and develops the relevant learning objects. It allows implementing an educational platform for a constructivist learning process, thus enabling learners' experimentation with the provided programming models, developing learners' competences in modern scientific research and computational thinking, and capturing the relevant technical knowledge. It also provides an integral platform that allows a simultaneous and comparative introduction to pipelining and parallel computing. The programming language C for developing programming models, and the message passing interface (MPI) and OpenMP parallelization tools, have been chosen for implementation.
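A software pipeline of the kind the course introduces can be sketched with threads and queues. This Python illustration (the course itself uses C with MPI/OpenMP) passes items through two stages, with a `None` sentinel marking end of stream:

```python
import queue
import threading

def stage(fn, q_in, q_out):
    # Each stage consumes items, applies fn, and forwards the result;
    # None is the end-of-stream sentinel, forwarded so downstream stages stop.
    while True:
        item = q_in.get()
        if item is None:
            q_out.put(None)
            break
        q_out.put(fn(item))

q0, q1, q2 = queue.Queue(), queue.Queue(), queue.Queue()
threads = [
    threading.Thread(target=stage, args=(lambda x: x + 1, q0, q1)),
    threading.Thread(target=stage, args=(lambda x: x * 2, q1, q2)),
]
for t in threads:
    t.start()
for item in [1, 2, 3, None]:  # feed the pipeline, then the sentinel
    q0.put(item)
results = []
while (item := q2.get()) is not None:
    results.append(item)
for t in threads:
    t.join()
```

Once the pipeline is full, the stages work concurrently on different items, which is exactly the throughput argument the multiphase queueing model formalizes.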
Parallel plate model for trabecular bone exhibits volume fraction-dependent bias
J.S. Day (Judd); M. Ding; A. Odgaard; D.R. Sumner (Dale); I. Hvid (Ivan); H.H. Weinans (Harrie)
2000-01-01
Unbiased stereological methods were used in conjunction with micro-computed tomography (micro-CT) scans of human and animal bone to investigate errors created when the parallel plate model was used to calculate morphometric parameters. Bone samples were obtained from the human proximal t
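For reference, the parallel plate (Parfitt) model evaluated here derives the standard trabecular indices from the measured bone volume (BV), bone surface (BS) and total volume (TV). A sketch of those textbook relations (the paper's exact conventions may differ):

```python
def plate_model_indices(BV, BS, TV):
    """Parfitt parallel-plate estimates of trabecular indices from
    bone volume BV, bone surface BS, and total volume TV."""
    tb_th = 2.0 * BV / BS        # trabecular thickness
    tb_n = 0.5 * BS / TV         # trabecular number
    tb_sp = 1.0 / tb_n - tb_th   # trabecular separation
    return tb_th, tb_n, tb_sp
```

Because these indices are derived rather than measured directly, any deviation of real trabeculae from the plate geometry propagates into them, which is the volume-fraction-dependent bias the study quantifies.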
Baba, Toshitaka; Takahashi, Narumi; Kaneda, Yoshiyuki; Ando, Kazuto; Matsuoka, Daisuke; Kato, Toshihiro
2015-12-01
Because of improvements in offshore tsunami observation technology, dispersion phenomena during tsunami propagation have often been observed in recent tsunamis, for example the 2004 Indian Ocean and 2011 Tohoku tsunamis. The dispersive propagation of tsunamis can be simulated by use of the Boussinesq model, but the model demands many computational resources. However, rapid progress has been made in parallel computing technology. In this study, we investigated a parallelized approach for dispersive tsunami wave modeling. Our new parallel software solves the nonlinear Boussinesq dispersive equations in spherical coordinates. A variable nested algorithm was used to increase spatial resolution in the target region. The software can also be used to predict tsunami inundation on land. We used the dispersive tsunami model to simulate the 2011 Tohoku earthquake on the Supercomputer K. Good agreement was apparent between the dispersive wave model results and the tsunami waveforms observed offshore. The finest bathymetric grid interval was 2/9 arcsec (approx. 5 m) along longitude and latitude lines. Use of this grid simulated tsunami soliton fission near the Sendai coast. Incorporating the three-dimensional shape of buildings and structures led to improved modeling of tsunami inundation.
Zhu, Y. K.; Yu, Y. G.; Li, L.; Jiang, T.; Wang, X. Y.; Zheng, X. J.
2016-07-01
A Timoshenko beam model combined with piezoelectric constitutive equations and an electrical model was proposed to describe the energy harvesting performance of multilayered d15-mode PZT-51 piezoelectric bimorphs in series and parallel connections. The effect of different clamped conditions was considered for non-piezoelectric and piezoelectric layers in the theoretical model. The frequency dependences of output peak voltage and power at different load resistances and excitation voltages were studied theoretically, and the results were verified by finite element modeling (FEM) simulation and experimental measurements. Results show that the theoretical model considering different clamped conditions for non-piezoelectric and piezoelectric layers makes a reliable prediction of the energy harvesting performance of multilayered d15-mode piezoelectric bimorphs. The multilayered d15-mode piezoelectric bimorph in a series connection exhibits a higher output peak voltage and power than that of a parallel connection at a load resistance of 1 MΩ. The criterion for choosing a series or parallel connection for a multilayered d15-mode piezoelectric bimorph depends on comparing the applied load resistance with the critical resistance of about 55 kΩ. The proposed model may provide some useful guidelines for the design and performance optimization of d15-mode piezoelectric energy harvesters.
Parallel Computation of Air Pollution Using a Second-Order Closure Model
Pai, Prasad Prabhakar
1991-02-01
Rational analysis, prediction and policy making of air pollution problems depend on our understanding of the individual processes that govern the atmospheric system. In the past, computational constraints have prohibited the incorporation of detailed physics of many individual processes in air pollution models. This has resulted in poor model performance for realistic situations. Recent advances in computing capabilities make it possible to develop air pollution models which capture the essential physics of the individual processes. The present study uses a three-dimensional second-order closure diffusion model to simulate dispersion from ground level and elevated point sources in convective (daytime) boundary layers. The model uses mean and turbulence variables simulated with a one-dimensional second-order closure fluid dynamic model. The calculated mean profiles of wind and temperature are found to be in good agreement with the observed Day 33 Wangara data, whereas the calculated vertical profiles of turbulence variables agree well with those estimated from other numerical models and laboratory experiments. The three-dimensional second-order closure diffusion model can capture the plume behavior in the daytime atmospheric boundary layer remarkably well in comparison with laboratory data. We also compare the second-order closure diffusion model with the commonly used K-diffusion model for the same meteorological conditions. In order to reduce the computational requirements for second-order closure models, we propose a parallel algorithm of a time-splitting finite element method for the numerical solution of the governing equations. The parallel time-splitting finite element method substantially reduces the model wallclock or turnaround time by exploiting the vector and parallel capabilities of modern supercomputers. The plethora of supercomputers in the market today made it important for us to study the key issue of algorithm "portability". In view of this, we
A Hybrid Parallel Execution Model for Logic Based Requirement Specifications (Invited Paper)
Directory of Open Access Journals (Sweden)
Jeffrey J. P. Tsai
1999-05-01
Full Text Available It is well known that undiscovered errors in a requirements specification are extremely expensive to fix when discovered in the software maintenance phase. Errors in the requirements phase can be reduced through validation and verification of the requirements specification. Many logic-based requirements specification languages have been developed to achieve these goals. However, the execution and reasoning of a logic-based requirements specification can be very slow. An effective way to improve performance is to execute and reason over the logic-based requirements specification in parallel. In this paper, we present a hybrid model to facilitate the parallel execution of a logic-based requirements specification language. A data dependency analysis technique is first applied to a logic-based specification to find all the mode combinations that exist within a specification clause. This mode information is used to support a novel hybrid parallel execution model, which combines both top-down and bottom-up evaluation strategies. The new execution model can find the failure in the deepest node of the search tree at an early stage of the evaluation, and can thus reduce the total number of nodes searched in the tree, the total number of processes that need to be generated, and the total number of communication channels needed in the search process. A simulator has been implemented to analyze the execution behavior of the new model. Experiments show significant improvement based on several criteria.
Ali, S Tabrez
2014-01-01
In this article, we present Defmod, a fully unstructured, two- or three-dimensional, parallel finite element code for modeling crustal deformation over time scales ranging from milliseconds to thousands of years. Defmod can simulate deformation due to all major processes that make up the earthquake/rifting cycle, in non-homogeneous media. Specifically, it can be used to model deformation due to dynamic and quasi-static processes such as co-seismic slip or dike intrusion(s), poroelastic rebound due to fluid flow, and post-seismic or post-rifting viscoelastic relaxation. It can also be used to model deformation due to processes such as post-glacial rebound, hydrological (un)loading, and injection and/or withdrawal of compressible or incompressible fluids from subsurface reservoirs, etc. Defmod is written in Fortran 95 and uses PETSc's parallel sparse data structures and implicit solvers. Problems can be solved using (stabilized) linear triangular, quadrilateral, tetrahedral or hexahedral elements on shared or distribut...
Dynamic Modelling and Trajectory Tracking of Parallel Manipulator with Flexible Link
Directory of Open Access Journals (Sweden)
Chen Zhengsheng
2013-09-01
Full Text Available This paper mainly focuses on dynamic modelling and real-time control of a parallel manipulator with flexible links. The Lagrange principle and the assumed modes method (AMM) substructure technique are presented to formulate the dynamic model of a two-degrees-of-freedom (DOF) parallel manipulator with flexible links. Then, the singular perturbation technique (SPT) is used to decompose the nonlinear dynamic system into slow-time-scale and fast-time-scale subsystems. Furthermore, the SPT is employed to transform the differential algebraic equations (DAEs) for kinematic constraints into explicit ordinary differential equations (ODEs), which makes real-time control possible. In addition, a novel composite control scheme is presented: computed torque control is applied to the slow subsystem and the H∞ technique to the fast subsystem, taking account of model uncertainty and outside disturbance. The simulation results show the composite control can effectively achieve fast and accurate tracking control.
Algorithm comparison and benchmarking using a parallel spectral transform shallow water model
Energy Technology Data Exchange (ETDEWEB)
Worley, P.H. [Oak Ridge National Lab., TN (United States); Foster, I.T.; Toonen, B. [Argonne National Lab., IL (United States)
1995-04-01
In recent years, a number of computer vendors have produced supercomputers based on a massively parallel processing (MPP) architecture. These computers have been shown to be competitive in performance with conventional vector supercomputers for some applications. As spectral weather and climate models are heavy users of vector supercomputers, it is interesting to determine how these models perform on MPPs, and which MPPs are best suited to the execution of spectral models. The benchmarking of MPPs is complicated by the fact that different algorithms may be more efficient on different architectures. Hence, a comprehensive benchmarking effort must answer two related questions: which algorithm is most efficient on each computer, and how do the most efficient algorithms compare across computers. In general, these are difficult questions to answer because of the high cost associated with implementing and evaluating a range of different parallel algorithms on each MPP platform.
Simulation of levulinic acid adsorption in packed beds using parallel pore/surface diffusion model
Energy Technology Data Exchange (ETDEWEB)
Zeng, L.; Mao, J. [Zhejiang Provincial Key Laboratory for Chemical and Biological Processing Technology of Farm Products, Zhejiang University of Science and Technology, Hangzhou (China); Ren, Q. [National Laboratory of Secondary Resources Chemical Engineering, Zhejiang University, Hangzhou (China); Liu, B.
2010-07-15
The adsorption of levulinic acid in fixed beds of basic polymeric adsorbents at 22 °C was studied under various operating conditions. A general rate model which considers pore diffusion and parallel pore/surface diffusion was solved numerically by orthogonal collocation on finite elements to describe the experimental breakthrough data. The adsorption isotherms and the pore and surface diffusion coefficients were determined independently in batch adsorption studies. The external film resistance and the axial dispersion coefficient were estimated by the Wilson-Geankoplis equation and the Chung-Wen equation, respectively. Simulation elucidated that the model which considers parallel diffusion successfully describes the breakthrough behavior and gave a much better prediction than the model which considers pore diffusion alone. The results obtained in this work are applicable to the design and optimization of the separation process. (Abstract Copyright [2010], Wiley Periodicals, Inc.)
Stellar Structure Modeling using a Parallel Genetic Algorithm for Objective Global Optimization
Metcalfe, T S
2002-01-01
Genetic algorithms are a class of heuristic search techniques that apply basic evolutionary operators in a computational setting. We have designed a fully parallel and distributed hardware/software implementation of the generalized optimization subroutine PIKAIA, which utilizes a genetic algorithm to provide an objective determination of the globally optimal parameters for a given model against an observational data set. We have used this modeling tool in the context of white dwarf asteroseismology, i.e., the art and science of extracting physical and structural information about these stars from observations of their oscillation frequencies. The efficient, parallel exploration of parameter-space made possible by genetic-algorithm-based numerical optimization led us to a number of interesting physical results: (1) resolution of a hitherto puzzling discrepancy between stellar evolution models and prior asteroseismic inferences of the surface helium layer mass for a DBV white dwarf; (2) precise determination of...
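PIKAIA-style optimization rests on a basic genetic algorithm loop. A deliberately minimal real-coded sketch (tournament selection, blend crossover, Gaussian mutation; not PIKAIA itself, which adds ranking, encoding and elitism) looks like this:

```python
import random

def genetic_minimize(f, bounds, pop=30, gens=60, seed=0):
    """Minimal real-coded GA: tournament selection, blend crossover,
    Gaussian mutation; returns the best individual found."""
    rng = random.Random(seed)
    lo, hi = bounds
    P = [rng.uniform(lo, hi) for _ in range(pop)]
    for _ in range(gens):
        nxt = []
        for _ in range(pop):
            # two size-3 tournaments pick the parents
            a = min(rng.sample(P, 3), key=f)
            b = min(rng.sample(P, 3), key=f)
            child = 0.5 * (a + b)                  # blend crossover
            if rng.random() < 0.2:                 # occasional mutation
                child += rng.gauss(0.0, 0.1 * (hi - lo))
            nxt.append(min(max(child, lo), hi))    # clamp to bounds
        P = nxt
    return min(P, key=f)

best = genetic_minimize(lambda x: (x - 2.0) ** 2, (-10.0, 10.0))
```

Each fitness evaluation is independent within a generation, which is why the scheme parallelizes so naturally over an observational-data misfit function.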
A Parallel Interval Computation Model for Global Optimization with Automatic Load Balancing
Institute of Scientific and Technical Information of China (English)
Yong Wu; Arun Kumar
2012-01-01
In this paper, we propose a decentralized parallel computation model for global optimization using interval analysis. The model is adaptive to any number of processors, and the workload is automatically and evenly distributed among all processors by alternative message passing. The problems received by each processor are processed based on their local dominance properties, which avoids unnecessary interval evaluations. Further, the problem is treated as a whole at the beginning of computation, so that no initial decomposition scheme is required. Numerical experiments indicate that the model works well and is stable with different numbers of parallel processors, distributes the load evenly among the processors, and provides an impressive speedup, especially when the problem is time-consuming to solve.
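The dominance test mentioned above is the heart of interval branch-and-bound: a box is discarded when its guaranteed lower bound exceeds the best verified upper bound. A minimal serial sketch (the paper's model is decentralized and message-passing; the interval bounds here are hand-written for one example function):

```python
def interval_minimize(f_lo, f_hi, a, b, tol=1e-4):
    """Interval branch-and-bound: f_lo/f_hi bound f over a box [lo, hi];
    boxes whose lower bound exceeds the best upper bound are pruned."""
    best_ub = f_hi(a, b)
    stack = [(a, b)]
    while stack:
        lo, hi = stack.pop()
        if f_lo(lo, hi) > best_ub or hi - lo < tol:
            continue  # dominated, or already small enough
        mid = 0.5 * (lo + hi)
        for box in ((lo, mid), (mid, hi)):
            best_ub = min(best_ub, f_hi(*box))  # tighten the verified bound
            stack.append(box)
    return best_ub

# Hand-written interval extension of f(x) = (x - 1)^2.
def f_lo(a, b):
    u, v = a - 1.0, b - 1.0
    return 0.0 if u <= 0.0 <= v else min(u * u, v * v)

def f_hi(a, b):
    u, v = a - 1.0, b - 1.0
    return max(u * u, v * v)
```

In the paper's setting, the work stack is distributed across processors and `best_ub` is shared via message passing, which is what makes load balancing and dominance pruning interact.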
Development Of A Parallel Performance Model For The THOR Neutral Particle Transport Code
Energy Technology Data Exchange (ETDEWEB)
Yessayan, Raffi; Azmy, Yousry; Schunert, Sebastian
2017-02-01
The THOR neutral particle transport code enables simulation of complex geometries for various problems, from reactor simulations to nuclear non-proliferation. It is undergoing a thorough verification and validation (V&V) effort that requires computational efficiency. This has motivated various improvements, including angular parallelization, outer iteration acceleration, and development of peripheral tools. To guide future improvements to the code's efficiency, better characterization of its parallel performance is useful. A parallel performance model (PPM) can be used to evaluate the benefits of modifications and to identify performance bottlenecks. Using INL's Falcon HPC, the PPM development incorporates an evaluation of network communication behavior over heterogeneous links and a functional characterization of the per-cell/angle/group runtime of each major code component. Evaluating several possible sources of variability resulted in a communication model and a parallel-portion model. The former's accuracy is bounded by the variability of communication on Falcon, while the latter has an error on the order of 1%.
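A parallel performance model of this kind typically splits runtime into a compute term that scales with the local workload and a communication term. This toy model is only an illustration of the structure (THOR's actual PPM fits its terms to measured per-cell/angle/group runtimes and link behavior):

```python
def model_runtime(n_work, n_procs, t_work, n_msgs, t_msg):
    """Toy parallel performance model: perfectly divisible compute plus
    a communication term that is absent on a single processor."""
    compute = (n_work / n_procs) * t_work
    comm = 0.0 if n_procs == 1 else n_msgs * t_msg
    return compute + comm

def speedup(n_work, n_procs, t_work, n_msgs, t_msg):
    return (model_runtime(n_work, 1, t_work, n_msgs, t_msg)
            / model_runtime(n_work, n_procs, t_work, n_msgs, t_msg))
```

Even this crude form shows why measured speedup falls below the processor count: the fixed communication term caps scalability, which is exactly the bottleneck a fitted PPM is meant to locate.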
Modeling of Electromagnetic Fields in Parallel-Plane Structures: A Unified Contour-Integral Approach
Directory of Open Access Journals (Sweden)
M. Stumpf
2017-04-01
Full Text Available A unified reciprocity-based modeling approach for analyzing electromagnetic fields in dispersive parallel-plane structures of arbitrary shape is described. It is shown that the use of the reciprocity theorem of the time-convolution type leads to a global contour-integral interaction quantity from which novel time- and frequency-domain numerical schemes can be derived. Applications of the numerical method concerning the time-domain radiated interference and susceptibility of parallel-plane structures are discussed and illustrated with numerical examples.
Zemlyanaya, E. V.; Bashashin, M. V.; Rahmonov, I. R.; Shukrinov, Yu. M.; Atanasova, P. Kh.; Volokhova, A. V.
2016-10-01
We consider a model of a system of long Josephson junctions (LJJ) with inductive and capacitive coupling. The corresponding system of nonlinear partial differential equations is solved by means of a standard three-point finite-difference approximation in the spatial coordinate, with the Runge-Kutta method used for the solution of the resulting Cauchy problem. A parallel algorithm is developed and implemented on the basis of the MPI (Message Passing Interface) technology. The effect of the coupling between the JJs on the properties of the LJJ system is demonstrated. Numerical results are discussed from the viewpoint of the effectiveness of the parallel implementation.
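The time integration described relies on the classical fourth-order Runge-Kutta step applied to the semi-discretized system. A generic sketch of that step (illustrated here on a harmonic oscillator, not the LJJ equations) is:

```python
def rk4_step(f, t, y, h):
    """One classical fourth-order Runge-Kutta step for y' = f(t, y),
    with y a list of floats (the spatially discretized state)."""
    k1 = f(t, y)
    k2 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k1)])
    k3 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k2)])
    k4 = f(t + h, [yi + h * ki for yi, ki in zip(y, k3)])
    return [yi + h / 6 * (a + 2 * b + 2 * c + d)
            for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]

# harmonic oscillator y'' = -y written as a first-order system
f = lambda t, y: [y[1], -y[0]]
y, t, h = [1.0, 0.0], 0.0, 0.01
for _ in range(628):  # integrate to t ~ 2*pi
    y = rk4_step(f, t, y, h)
    t += h
```

In the parallel setting, each MPI rank advances its own spatial slice of the state with this step, exchanging boundary values of the three-point stencil between steps.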
Energy Technology Data Exchange (ETDEWEB)
Bauerle, Matthew [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)
2014-08-01
This project utilizes Graphics Processing Units (GPUs) to compute radiograph simulations for arbitrary objects. The generation of radiographs, also known as the forward projection imaging model, is computationally intensive and not widely utilized. The goal of this research is to develop a massively parallel algorithm that can compute forward projections for objects with a trillion voxels (3D pixels). To achieve this end, the data are divided into blocks that can each fit into GPU memory. The forward projected image is also divided into segments to allow for future parallelization and to avoid needless computations.
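The blocking strategy can be illustrated with a toy parallel-beam forward projection, where each line integral reduces to a sum of voxel values along one axis and the volume is split into blocks that would each fit in GPU memory. This is an illustrative sketch, not the project's code:

```python
import numpy as np

def forward_project(volume, axis=0):
    """Toy parallel-beam forward projection: each detector pixel is the
    sum of voxel values along one axis (attenuation-free model)."""
    return volume.sum(axis=axis)

def forward_project_blocked(volume, axis=0, n_blocks=2):
    """Split the volume along the projection axis, project each block
    (as a GPU would, one block in memory at a time), then accumulate."""
    parts = np.array_split(volume, n_blocks, axis=axis)
    return sum(p.sum(axis=axis) for p in parts)
```

Because line integrals are additive, the blocked accumulation reproduces the full projection exactly, which is what makes out-of-core GPU processing of trillion-voxel objects feasible.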
A PARALLEL NUMERICAL MODEL OF SOLVING N-S EQUATIONS BY USING SEQUENTIAL REGULARIZATION METHOD
Institute of Scientific and Technical Information of China (English)
(author not listed)
2003-01-01
A parallel numerical model was established for solving the Navier-Stokes equations using the Sequential Regularization Method (SRM). The computational domain is decomposed into P sub-domains in which the difference formulae are obtained from the governing equations. Data are exchanged at the virtual boundaries of the sub-domains during parallel computation. The close-channel cavity flow was solved by the implicit method, and the driven square cavity flow by the explicit method. The results compared well with those given by Ghia.
Parallelization of a Quantum-Classic Hybrid Model For Nanoscale Semiconductor Devices
Directory of Open Access Journals (Sweden)
Oscar Salas
2011-07-01
Full Text Available The expensive reengineering of sequential software and the difficulty of parallel programming are two of the many technical and economic obstacles to the wide use of HPC. We investigate the possibility of rapidly improving the performance of a serial numerical code for the simulation of charge-carrier transport in a Double-Gate MOSFET. We introduce the Drift-Diffusion-Schrödinger-Poisson (DDSP) model and study a rapid parallelization strategy for the numerical procedure on shared memory architectures.
Efficient Parallel Global Optimization for High Resolution Hydrologic and Climate Impact Models
Shoemaker, C. A.; Mueller, J.; Pang, M.
2013-12-01
High-resolution hydrologic models are typically computationally expensive, requiring many minutes or perhaps hours for one simulation. Optimization can be used with these models for parameter estimation or for analyzing management alternatives. However, optimization of these computationally expensive simulations requires algorithms that can obtain accurate answers with relatively few simulations to avoid infeasibly long computation times. We have developed a number of efficient parallel algorithms and software codes for optimization of expensive problems with multiple local minima. This is open-source software that we are distributing; it runs in Matlab and Python, and has been run on the Yellowstone supercomputer. The talk will discuss the characteristics of the problem (e.g. the presence of integer as well as continuous variables, the number of dimensions, the availability of parallel/grid computing, the number of simulations that can be allowed to find a solution, etc.) that determine which algorithms are most appropriate for each type of problem. A major application of this optimization software is parameter estimation for nonlinear hydrologic models, including contaminant transport in the subsurface (e.g. for groundwater remediation or multi-phase flow for carbon sequestration), nutrient transport in watersheds, and climate models. We will present results for carbon sequestration plume monitoring (multi-phase, multi-constituent), for groundwater remediation, and for the CLM climate model. The carbon sequestration example is based on the Frio CO2 field site, and the groundwater example is for a 50,000-acre remediation site (with a model requiring about 1 hour per simulation). Parallel speed-ups are excellent in most cases, and our serial and parallel algorithms tend to outperform alternative methods on complex, computationally expensive simulations that have multiple local minima.
A flowstream based analytical model for design of parallel plate heatsinks
Energy Technology Data Exchange (ETDEWEB)
Holahan, M.F.; Kang, S.S. [IBM Corp., Rochester, MN (United States); Bar-Cohen, A. [Univ. of Minnesota, Minneapolis, MN (United States). Dept. of Mechanical Engineering
1996-12-31
An analytical model for calculating the thermal and pressure-drop performance of compact, laminar-flow parallel plate heatsink fins is developed. The flow field in the channel between the fins is modeled as a Hele-Shaw flow. Conduction within the fin is modeled by superposition of a kernel function derived from the method of images. Convective heat transfer coefficients are adapted from existing parallel plate correlations. A pressure drop model is also developed. Using examples of a simple side-inlet-side-outlet (SISE) flow pattern and a complex top-inlet-side-outlet (TISE) flow pattern, the model is shown to handle arbitrary flow stream patterns. TISE model results are in good agreement with experimental and CFD results. Optimization of the flow pattern in a TISE heatsink at constant pumping power resulted in a 5% reduction in thermal resistance. The model can solve for a new fin geometry or flow rate in just 5 seconds on a PC platform, making it suitable for parametric design studies.
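The conduction-convection coupling in such fin models is often summarized by the textbook straight-fin efficiency, eta = tanh(mL)/(mL) with m = sqrt(2h/(k t)). This standard relation (not the paper's Hele-Shaw/kernel formulation) can be sketched as:

```python
import math

def fin_efficiency(h, k, t, L):
    """Straight-fin efficiency eta = tanh(mL)/(mL), m = sqrt(2h/(k*t)),
    for convection coefficient h, fin conductivity k, thickness t, height L."""
    m = math.sqrt(2.0 * h / (k * t))
    return math.tanh(m * L) / (m * L)
```

For example, an aluminum-like fin (k = 200 W/m·K, t = 2 mm, L = 20 mm) in moderate forced convection (h = 50 W/m²·K) is about 97% efficient; the paper's kernel-superposition model replaces this 1D idealization with the full 2D fin temperature field.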
3-D Parallel Simulation Model of Continuous Beam-Electron Cloud Interactions
Ghalam, Ali F; Decyk, Viktor K; Huang Cheng Kun; Katsouleas, Thomas C; Mori, Warren; Rumolo, Giovanni; Zimmermann, Frank
2005-01-01
A 3D Particle-In-Cell model for continuous modeling of beam and electron cloud interaction in a circular accelerator is presented. A simple model of the lattice structure, mainly the quadrupole and dipole magnets and chromaticity, has been added to a plasma PIC code, QuickPIC, used extensively to model the plasma wakefield acceleration concept. The code utilizes parallel processing techniques with domain decomposition in both the longitudinal and transverse domains to overcome the massive computational cost of continuously modeling the beam-cloud interaction. Through parallel modeling, we have been able to simulate long-term beam propagation in the presence of electron cloud in many existing and future circular machines around the world. The exact dipole lattice structure has been added to the code, and the simulation results for the CERN SPS and LHC with the new lattice structure have been studied. The simulation results are also compared to results from the two-macroparticle model of the strong head-tail instability. ...
Kostogryz, N. M.; Yakobchuk, T. M.; Berdyugina, S. V.; Milic, I.
2017-05-01
Context. To properly interpret photometric and polarimetric observations of exoplanetary transits, accurate calculations of center-to-limb variations of intensity and linear polarization of the host star are needed. These variations, in turn, depend on the choice of geometry of stellar atmosphere. Aims: We want to understand the dependence of the flux and the polarization curves during a transit on the choice of the applied approximation for the stellar atmosphere: spherical and plane-parallel. We examine whether simpler plane-parallel models of stellar atmospheres are good enough to interpret the flux and the polarization light curves during planetary transits, or whether more complicated spherical models should be used. Methods: Linear polarization during a transit appears because a planet eclipses a stellar disk and thus breaks left-right symmetry. We calculate the flux and the polarization variations during a transit with given center-to-limb variations of intensity and polarization. Results: We calculate the flux and the polarization variations during transit for a sample of 405 extrasolar systems. Most of them show higher transit polarization for the spherical stellar atmosphere. Our calculations reveal a group of exoplanetary systems that demonstrates lower maximum polarization during the transits with spherical model atmospheres of host stars with effective temperatures of Teff = 4400-5400 K and surface gravity of log g = 4.45-4.65 than that obtained with plane-parallel atmospheres. Moreover, we have found two trends of the transit polarization. The first trend is a decrease in the polarization calculated with spherical model atmosphere of host stars with effective temperatures Teff = 3500-5100 K, and the second shows an increase in the polarization for host stars with Teff = 5100-7000 K. These trends can be explained by the relative variation of temperature and pressure dependences in the plane-parallel and spherical model atmospheres. Conclusions: For
Parallel processing optimization strategy based on MapReduce model in cloud storage environment
Cui, Jianming; Liu, Jiayi; Li, Qiuyan
2017-05-01
Currently, a large number of documents in the cloud storage process are packaged only after all packets have been received. In this stored procedure from the local transmitter to the server, packing and unpacking consume a lot of time, and transmission efficiency is low as well. A new parallel processing algorithm is proposed to optimize the transmission mode: following the MapReduce model, MPI technology is used to execute the Mapper and Reducer mechanisms in parallel. In simulation experiments on the Hadoop cloud computing platform, the algorithm not only accelerated the file transfer rate but also shortened the waiting time of the Reducer mechanism. It breaks through the traditional sequential transmission constraint and reduces storage coupling to improve transmission efficiency.
Bruen, Thomas; Marco, James
2016-04-01
Variations in cell properties are unavoidable and can be caused by manufacturing tolerances and usage conditions. As a result of this, cells connected in series may have different voltages and states of charge that limit the energy and power capability of the complete battery pack. Methods of removing this energy imbalance have been extensively reported within literature. However, there has been little discussion around the effect that such variation has when cells are connected electrically in parallel. This work aims to explore the impact of connecting cells, with varied properties, in parallel and the issues regarding energy imbalance and battery management that may arise. This has been achieved through analysing experimental data and a validated model. The main results from this study highlight that significant differences in current flow can occur between cells within a parallel stack that will affect how the cells age and the temperature distribution within the battery assembly.
Nguyen, Howard; Willacy, Karen; Allen, Mark
2012-01-01
KINETICS is a coupled dynamics and chemistry atmosphere model that is data intensive and computationally demanding. The potential performance gain from using a supercomputer motivates the adaptation from a serial version to a parallelized one. Although the initial parallelization had been done, bottlenecks caused by an abundance of communication calls between processors led to an unfavorable drop in performance. Before starting on the parallel optimization process, a partial overhaul was required because a large emphasis was placed on streamlining the code for user convenience and revising the program to accommodate the new supercomputers at Caltech and JPL. After the first round of optimizations, the partial runtime was reduced by a factor of 23; however, performance gains are dependent on the size of the data, the number of processors requested, and the computer used.
Lee, Dongchul; Gillespie, Ewan; Bradley, Kerry
2011-02-10
In spinal cord stimulation (SCS), concordance of stimulation-induced paresthesia over painful body regions is a necessary condition for therapeutic efficacy. Since patient pain patterns can be unique, a common stimulation configuration is the placement of two leads in parallel in the dorsal epidural space. This construct provides flexibility in steering stimulation current mediolaterally over the dorsal column to achieve better pain-paresthesia overlap. Using a mathematical model with an accurate fiber diameter distribution, we studied the ability to steer stimulation between adjacent contacts on dual parallel leads using (1) a single-source system, and (2) a multi-source system with a dedicated current source for each contact. The volume conductor model of a low-thoracic spinal cord with epidurally-positioned dual parallel (2 mm separation) percutaneous leads was first created, and the electric field was calculated using ANSYS, a finite element modeling tool. The activating function for 10 μm fibers was computed as the second difference of the extracellular potential along the nodes of Ranvier on the nerve fibers in the dorsal column. The volume of activation (VOA) and the central point of the VOA were computed using a predetermined threshold of the activating function. The model compared the field steering results of the single-source and dedicated-source systems on dual 8-contact stimulation leads. The model predicted that the multi-source system can target more central points of stimulation on the dorsal column than a single-source system (100 vs. 3), and that the mean mediolateral steering step is 0.02 mm for multi-source systems vs. 1 mm for single-source systems, a 50-fold improvement. The ability to center stimulation regions in the dorsal column with high resolution may allow better optimization of paresthesia-pain overlap in patients.
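The activating-function computation mentioned in the abstract is simple to sketch: it is the second spatial difference of the extracellular potential along the nodes of Ranvier. The potential profile below is invented for illustration, not taken from the paper's finite element model.

```python
# Hedged sketch: the activating function as the second spatial difference of
# extracellular potential along a fiber. Positive lobes indicate likely sites
# of depolarization. The voltage profile is illustrative only.

def activating_function(potentials):
    """AF_n = V[n-1] - 2*V[n] + V[n+1] at each interior node."""
    return [potentials[i - 1] - 2 * potentials[i] + potentials[i + 1]
            for i in range(1, len(potentials) - 1)]

# Illustrative extracellular potential (mV) along a fiber under a cathode:
v = [0.0, -0.5, -2.0, -6.0, -2.0, -0.5, 0.0]
print(activating_function(v))  # peak positive value under the cathode
```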
Directory of Open Access Journals (Sweden)
Xiaoliang Yin
2015-03-01
Full Text Available A complex electromechanical system is usually composed of multiple components from different domains, including mechanical, electronic, hydraulic, control, and so on. Modeling and simulation of electromechanical systems on a unified platform is one of the research hotspots in system engineering at present, and is also the development trend of the design of complex electromechanical systems. The unified modeling techniques and tools based on the Modelica language provide a satisfactory solution. To meet the requirements of collaborative modeling, simulation, and parallel computing for complex electromechanical systems based on Modelica, a general web-based modeling and simulation prototype environment, namely WebMWorks, is designed and implemented. Based on rich Internet application technologies, an interactive graphic user interface for modeling and post-processing in a web browser was implemented; with the collaborative design module, the environment supports top-down, concurrent modeling and team cooperation; additionally, a service-oriented architecture was applied to supply compiling and solving services which run on cloud-like servers, so the environment can manage and dispatch large-scale simulation tasks in parallel on multiple computing servers simultaneously. An engineering application, a pure electric vehicle, was tested on WebMWorks. The results of simulation and a parametric experiment demonstrate that the web-based environment can effectively shorten the design cycle of a complex electromechanical system.
Energy Technology Data Exchange (ETDEWEB)
1992-03-10
The first phase of the proposed work is largely completed on schedule. Scientists at the San Diego Supercomputer Center (SDSC) succeeded in putting a version of the Hamburg isopycnal coordinate ocean model (OPYC) onto the INTEL parallel computer. Due to the slow run speeds of the OPYC on the parallel machine, another ocean model is being used during the first part of phase 2. The model chosen is the Large Scale Geostrophic (LSG) model from the Max Planck Institute.
Parallelizing Backpropagation Neural Network Using MapReduce and Cascading Model.
Liu, Yang; Jing, Weizhe; Xu, Lixiong
2016-01-01
Artificial Neural Network (ANN) is a widely used algorithm in pattern recognition, classification, and prediction fields. Among the many neural networks, the backpropagation neural network (BPNN) has become the most famous one due to its remarkable function approximation ability. However, a standard BPNN frequently employs a large number of sum and sigmoid calculations, which may result in low efficiency when dealing with large volumes of data. Therefore, parallelizing BPNN with distributed computing technologies is an effective way to improve the algorithm's efficiency. However, traditional parallelization may lead to accuracy loss, and although several improvements have been proposed, it is still difficult to find a compromise between efficiency and precision. This paper presents a parallelized BPNN based on the MapReduce computing model, which supplies advanced features including fault tolerance, data replication, and load balancing. To also improve the algorithm's precision, this paper creates a cascading-model-based classification approach, which helps to refine the classification results. The experimental results indicate that the presented parallelized BPNN is able to offer high efficiency whilst maintaining excellent precision in enabling large-scale machine learning.
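The data-parallel idea can be sketched as follows: map tasks compute gradients over disjoint data shards, and a reduce step averages them into one weight update. This toy uses a single sigmoid neuron in place of a full BPNN and plain Python in place of Hadoop; the data and learning parameters are invented for illustration.

```python
import math

# Hedged sketch of MapReduce-style data-parallel training: per-shard gradient
# in "map", averaged update in "reduce". A single sigmoid neuron stands in for
# a full BPNN; no Hadoop is involved.

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def map_gradient(shard, w, b):
    """Map: accumulate the loss gradient over one data shard."""
    gw = [0.0] * len(w)
    gb = 0.0
    for x, y in shard:
        a = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
        d = a - y  # delta for a sigmoid output with cross-entropy loss
        gw = [gi + d * xi for gi, xi in zip(gw, x)]
        gb += d
    return gw, gb, len(shard)

def reduce_gradients(parts):
    """Reduce: combine per-shard gradients into one averaged gradient."""
    n = sum(p[2] for p in parts)
    gw = [sum(p[0][j] for p in parts) / n for j in range(len(parts[0][0]))]
    gb = sum(p[1] for p in parts) / n
    return gw, gb

# Learn logical OR from two shards (illustrative data and learning rate).
data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]
w, b = [0.0, 0.0], 0.0
for _ in range(5000):
    parts = [map_gradient(data[:2], w, b), map_gradient(data[2:], w, b)]
    gw, gb = reduce_gradients(parts)
    w = [wi - 5.0 * gi for wi, gi in zip(w, gw)]
    b -= 5.0 * gb
print([round(sigmoid(w[0] * x[0] + w[1] * x[1] + b)) for x, _ in data])
```

Averaging gradients across shards is exactly where the accuracy-versus-efficiency tension the paper discusses arises: the averaged update is not identical to a sequential pass over all data.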
Simulating Capacitances to Silicon Quantum Dots: Breakdown of the Parallel Plate Capacitor Model
Thorbeck, Ted; Fujiwara, Akira; Zimmerman, Neil M.
2012-09-01
Many electrical applications of quantum dots rely on capacitively coupled gates; therefore, to make reliable devices we need those gate capacitances to be predictable and reproducible. We demonstrate in silicon nanowire quantum dots that gate capacitances are reproducible to within 10% for nominally identical devices. We demonstrate experimentally that gate capacitances scale with device dimensions. We also demonstrate that a capacitance simulator can be used to predict measured gate capacitances to within 20%. A simple parallel plate capacitor model can be used to predict how the capacitances change with device dimensions; however, the parallel plate capacitor model fails for the smallest devices because the capacitances are dominated by fringing fields. We show how the capacitances due to fringing fields can be quickly estimated.
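The parallel plate estimate and its breakdown can be illustrated with a quick calculation. The gate dimensions and dielectric below are illustrative, not the devices measured in the paper.

```python
# Hedged sketch: the parallel plate estimate C = eps0 * eps_r * A / d, and why
# it fails for small gates. All dimensions are invented for illustration.

EPS0 = 8.854e-12  # vacuum permittivity, F/m

def plate_capacitance(area_m2, gap_m, eps_r=3.9):  # eps_r ~ SiO2
    return EPS0 * eps_r * area_m2 / gap_m

c_large = plate_capacitance(area_m2=(1e-6)**2, gap_m=20e-9)   # 1 um x 1 um gate
c_small = plate_capacitance(area_m2=(20e-9)**2, gap_m=20e-9)  # 20 nm x 20 nm gate
print(f"{c_large * 1e15:.2f} fF vs {c_small * 1e18:.3f} aF")
# The plate term shrinks with gate area, while fringing-field contributions
# fall off much more slowly with size; for the small gate they dominate, so a
# field simulator (or a fringing estimate) is needed for a reliable value.
```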
SBML-PET-MPI: a parallel parameter estimation tool for Systems Biology Markup Language based models.
Zi, Zhike
2011-04-01
Parameter estimation is crucial for the modeling and dynamic analysis of biological systems. However, implementing parameter estimation is time consuming and computationally demanding. Here, we introduce a parallel parameter estimation tool for Systems Biology Markup Language (SBML)-based models (SBML-PET-MPI). SBML-PET-MPI allows the user to perform parameter estimation and parameter uncertainty analysis by collectively fitting multiple experimental datasets. The tool is developed and parallelized using the message passing interface (MPI) protocol, which provides good scalability with the number of processors. SBML-PET-MPI is freely available for non-commercial use at http://www.bioss.uni-freiburg.de/cms/sbml-pet-mpi.html or http://sites.google.com/site/sbmlpetmpi/.
Lyu, Jingyuan; Nakarmi, Ukash; Zhang, Chaoyi; Ying, Leslie
2016-05-01
This paper presents a new approach to highly accelerated dynamic parallel MRI using low-rank matrix completion and the partial separability (PS) model. In data acquisition, k-space data is moderately randomly undersampled at the central k-space navigator locations, but highly undersampled in the outer k-space for each temporal frame. In reconstruction, the navigator data is reconstructed from undersampled data using structured low-rank matrix completion. After all the unacquired navigator data is estimated, the PS model is used to obtain partial k-t data. Then a parallel imaging method is used to reconstruct the entire dynamic image series from the highly undersampled data. The proposed method is shown to achieve high-quality reconstructions with reduction factors up to 31 and a temporal resolution of 29 ms, where the conventional PS method fails.
Web System Dedicated to Parallel Computation for Modeling of Mushy Steel Deformation
Directory of Open Access Journals (Sweden)
Dębiński T.
2014-10-01
Full Text Available The paper presents a web-based system for the application of parallel object-oriented programming techniques in modelling of the rolling process of steel plates with a semi-solid zone. It also throws light on the problem of the yield stress relationship of semi-solid steels, one of the main inputs of the simulation, and on the application of the inverse solution, the only feasible method of developing stress-strain curves at extremely high temperatures. Due to limitations of available computer resources, a very accurate computation can sometimes be impossible, or the run time can be a barrier to practical application of complex sequential models. Taking advantage of parallel computing, the authors have developed an algorithm allowing fast computation using multiple processors, which is the main subject of the presented paper.
Compliance modeling and analysis of a 3-RPS parallel kinematic machine module
Zhang, Jun; Zhao, Yanqin; Dai, Jiansheng
2014-07-01
Compliance modeling and rigidity performance evaluation of lower-mobility parallel manipulators remain two major challenges at the conceptual design stage due to their geometric complexity. Using screw theory, this paper explores the compliance modeling and eigencompliance evaluation of a newly patented 1T2R spindle head whose topological architecture is a 3-RPS parallel mechanism. The kinematic definitions and inverse position analysis are briefly addressed first to provide the information necessary for compliance modeling. Considering the 3-RPS parallel kinematic machine (PKM) as a typical compliant parallel device whose three limb assemblages have bending, extending and torsional deflections, an analytical compliance model for the spindle head is established with screw theory, and the analytical stiffness matrix of the platform is formulated. Based on eigenscrew decomposition, the eigencompliances and corresponding eigenscrews are analyzed, and the platform's compliance properties are physically interpreted as the suspension of six screw springs. The distributions of the stiffness constants of the six screw springs throughout the workspace are predicted quickly with a piece-by-piece calculation algorithm. The numerical simulation reveals a strong dependency of the platform's compliance on its configuration; the distributions are axially symmetric due to structural features. Finally, the effects of design variables such as structural, configurational and dimensional parameters on system rigidity characteristics are investigated with the purpose of providing useful information for the structural design and performance improvement of the PKM. Compared with previous efforts in compliance analysis of PKMs, the present methodology is more intuitive and universal, and can thus be easily applied to evaluate the overall rigidity performance of other PKMs with high efficiency.
Au, Jennifer; Choi, Jungik; Jones, Shawn W; Venkataramanan, Keerthi P; Antoniewicz, Maciek R
2014-11-01
In this work, we provide new insights into the metabolism of Clostridium acetobutylicum ATCC 824 obtained using a systematic approach for quantifying fluxes based on parallel labeling experiments and (13)C-metabolic flux analysis ((13)C-MFA). Here, cells were grown in parallel cultures with [1-(13)C]glucose and [U-(13)C]glucose as tracers, and (13)C-MFA was used to quantify intracellular metabolic fluxes. Several metabolic network models were compared: an initial model based on current knowledge, and extended network models that included additional reactions that improved the fits of experimental data. While the initial network model did not produce a statistically acceptable fit of (13)C-labeling data, an extended network model with five additional reactions was able to fit all data with 292 redundant measurements. The model was subsequently trimmed to produce a minimal network model of C. acetobutylicum for (13)C-MFA, which could still reproduce all of the experimental data. The flux results provided valuable new insights into the metabolism of C. acetobutylicum. First, we found that the TCA cycle was effectively incomplete, as there was no measurable flux between α-ketoglutarate and succinyl-CoA, succinate and fumarate, and malate and oxaloacetate. Second, an active pathway was identified from pyruvate to fumarate via aspartate. Third, we found that isoleucine was produced exclusively through the citramalate synthase pathway in C. acetobutylicum and that CAC3174 was likely responsible for citramalate synthase activity. These model predictions were confirmed in several follow-up tracer experiments. The validated metabolic network model established in this study can be used in future investigations for unbiased (13)C-flux measurements in C. acetobutylicum. Copyright © 2014 International Metabolic Engineering Society. Published by Elsevier Inc. All rights reserved.
Precise Modeling Based on Dynamic Phasors for Droop-Controlled Parallel-Connected Inverters
DEFF Research Database (Denmark)
Wang, L.; Guo, X.Q.; Gu, H.R.;
2012-01-01
This paper deals with the precise modeling of droop-controlled parallel inverters. This is very attractive since that is a common structure that can be found in a stand-alone droop-controlled MicroGrid. The conventional small-signal dynamic is not able to predict instabilities of the system, so th....... In addition, the virtual ω-E frame power control method, which deals with the power coupling caused by the line impedance X/R characteristic, has been chosen as an application example of this modeling technique....
A massively parallel GPU-accelerated model for analysis of fully nonlinear free surface waves
DEFF Research Database (Denmark)
Engsig-Karup, Allan Peter; Madsen, Morten G.; Glimberg, Stefan Lemvig
2011-01-01
-throughput co-processors to the CPU. We describe and demonstrate how this approach makes it possible to do fast desktop computations for large nonlinear wave problems in numerical wave tanks (NWTs) with close to 50/100 million total grid points in double/ single precision with 4 GB global device memory...... space dimensions and is useful for fast analysis and prediction purposes in coastal and offshore engineering. A dedicated numerical model based on the proposed algorithm is executed in parallel by utilizing affordable modern special purpose graphics processing unit (GPU). The model is based on a low...
ON THE DYNAMIC MODELING AND CONTROL OF 2-DOF PLANAR PARALLEL MECHANISM WITH FLEXIBLE LINKS
Institute of Scientific and Technical Information of China (English)
Luo Lei; Wang Shigang; Mo Jinqiu; Cai Jianguo
2005-01-01
This study addresses the dynamic modeling and control of a 2-degree-of-freedom (DOF) planar parallel mechanism (PM) with flexible links. The kinematic and dynamic equations are established according to the characteristics of the mixed rigid and flexible structure. By using the singular perturbation approach (SPA), the model of the mechanism can be separated into slow and fast subsystems. Based on feedback linearization theory and the input shaping technique, the large-scale rigid motion controller and the flexible link vibration controller can be designed separately to achieve fast and accurate positioning of the PM.
Parallel LC circuit model for multi-band absorption and preliminary design of radiative cooling.
Feng, Rui; Qiu, Jun; Liu, Linhua; Ding, Weiqiang; Chen, Lixue
2014-12-15
We perform a comprehensive analysis of multi-band absorption by exciting magnetic polaritons in the infrared region. According to the independent properties of the magnetic polaritons, we propose a parallel inductance and capacitance (PLC) circuit model to explain and predict the multi-band resonant absorption peaks, which is fully validated by using the multi-sized structure with identical dielectric spacing layer and the multilayer structure with the same strip width. More importantly, we present the application of the PLC circuit model to preliminarily design a radiative cooling structure realized by merging several close peaks together. This omnidirectional and polarization-insensitive structure is a good candidate for radiative cooling applications.
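A minimal sketch of the circuit-model idea: each magnetic polariton resonance sits near f = 1/(2π√(LC)), so a parallel combination of branches with different effective L and C yields multiple absorption peaks. The inductance and capacitance values below are invented, not fitted to any structure in the paper.

```python
import math

# Hedged sketch: resonance frequencies of an LC circuit model for magnetic
# polaritons. Each parallel branch (one strip size) contributes one peak.
# The L and C values are illustrative only.

def resonant_freq_thz(l_henry, c_farad):
    """f = 1 / (2*pi*sqrt(L*C)), returned in THz."""
    return 1.0 / (2.0 * math.pi * math.sqrt(l_henry * c_farad)) / 1e12

# Three hypothetical branches (differently sized strips -> different L and C):
branches = [(2.0e-13, 1.4e-16), (2.5e-13, 1.6e-16), (3.0e-13, 1.8e-16)]
for l, c in branches:
    print(f"peak near {resonant_freq_thz(l, c):.1f} THz")
```

Merging several such closely spaced peaks into a broad band is the design strategy the abstract describes for radiative cooling.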
Hasta, D T
2010-01-01
The current trend toward multicore architectures on shared memory systems underscores the need for parallelism. While there are several programming models for expressing parallelism, thread programming models have become a standard for supporting these systems, such as OpenMP and POSIX threads. MPI (Message Passing Interface), which remains the dominant model used in high-performance computing today, faces this challenge. The previous version of MPI, MPI-1, has no shared memory concept, and the current version, MPI-2, has only limited support for shared memory systems. In this research, MPI-2 is compared with OpenMP to see how well MPI performs on multicore/SMP (symmetric multiprocessor) machines. The comparison between OpenMP as a thread programming model and MPI as a message passing programming model is conducted on multicore shared memory machine architectures to see which has better performance in terms of speed and throughput. The application used to assess the scalability of the evaluated parall...
New 2D diffraction model and its applications to terahertz parallel-plate waveguide power splitters
Zhang, Fan; Song, Kaijun; Fan, Yong
2017-02-01
A two-dimensional (2D) diffraction model for the calculation of the diffraction field in 2D space, and its applications to terahertz parallel-plate waveguide power splitters, are proposed in this paper. Compared with the Huygens-Fresnel principle in three-dimensional (3D) space, the proposed model provides an approximate analytical expression to calculate the diffraction field in 2D space. The diffraction field is regarded as a superposition integral in 2D space. The calculated results obtained from the proposed diffraction model agree well with those from the software HFSS, based on the finite element method (FEM). Based on the proposed 2D diffraction model, two parallel-plate waveguide power splitters are presented. The splitters consist of a transmitting horn antenna, reflectors, and a receiving antenna array. The reflector is cylindrical-parabolic with superimposed surface relief to efficiently couple the transmitted wave into the receiving antenna array. The reflector is applied as a computer-generated hologram to match the transformed field to the receiving antenna aperture field. The power splitters were optimized by a modified real-coded genetic algorithm. The computed results for the splitters agree well with those obtained by HFSS, verifying the novel design method for power splitters and showing the good application prospects of the proposed 2D diffraction model.
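For reference, a hedged sketch of what such a 2D superposition integral typically looks like: the standard far-field cylindrical-wave Huygens-Fresnel kernel, which may differ from the exact expression derived in the paper.

```latex
% Standard 2D (cylindrical-wave) Huygens-Fresnel superposition integral,
% far-field form; NOT necessarily the paper's exact expression:
U(P) \approx \sqrt{\frac{k}{2\pi}}\, e^{-i\pi/4}
      \int_{\Sigma} U(x')\, \frac{e^{ikr}}{\sqrt{r}}\, \cos\theta \,\mathrm{d}x'
% r: distance from aperture point x' to observation point P
% \theta: obliquity angle at x'
% the 1/\sqrt{r} decay replaces the 1/r kernel of the 3D principle
```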
DEFF Research Database (Denmark)
Gaspar, Jozsef; Fosbøl, Philip Loldrup
2017-01-01
Reactive absorption is a key process for gas separation and purification and it is the main technology for CO2 capture. Thus, reliable and simple mathematical models for mass transfer rate calculation are essential. Models which apply to parallel interacting and non-interacting reactions, for all......, desorption and pinch conditions. In this work, we apply the GM model to multiple parallel reactions. We deduce the model for piperazine (PZ) CO2 capture and we validate it against wetted-wall column measurements using 2, 5 and 8 molal PZ for temperatures between 40 °C and 100 °C and CO2 loadings between 0.23 and 0.41 mol CO2/2 mol PZ. We show that overall second-order kinetics describes well the reaction between CO2 and PZ, accounting for the carbamate and bicarbamate reactions. Here we prove the GM model for piperazine and MEA, but we expect that this practical approach is applicable for various amines...
Parallel 3-d simulations for porous media models in soil mechanics
Wieners, C.; Ammann, M.; Diebels, S.; Ehlers, W.
Numerical simulations in 3-d for porous media models in soil mechanics are a difficult task for engineering modelling as well as for the numerical realization. Here, we present a general numerical scheme for the simulation of two-phase models in combination with a material model via the stress response, with a specialized parallel saddle point solver. We give a brief introduction to the theoretical background of the Theory of Porous Media and constitute a two-phase model consisting of a porous solid skeleton saturated by a viscous pore-fluid. The material behaviour of the skeleton is assumed to be elasto-viscoplastic. The governing equations are transferred to a weak formulation suitable for the application of the finite element method. Introducing a formulation in terms of the stress response, we define a clear interface between the assembling process and the parallel solver modules. We demonstrate the efficiency of this approach with challenging numerical experiments realized on the Linux Cluster in Chemnitz.
Algorithms for a parallel implementation of Hidden Markov Models with a small state space
DEFF Research Database (Denmark)
Nielsen, Jesper; Sand, Andreas
2011-01-01
Two of the most important algorithms for Hidden Markov Models are the forward and the Viterbi algorithms. We show how formulating these using linear algebra naturally lends itself to parallelization. Although the obtained algorithms are slow for Hidden Markov Models with large state spaces, they require very little communication between processors, and are fast in practice on models with a small state space. We have tested our implementation against two other implementations on artificial data and observe a speed-up of roughly a factor of 5 for the forward algorithm and more than 6 for the Viterbi algorithm. We also tested our algorithm in the Coalescent Hidden Markov Model framework, where it gave a significant speed-up.
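The linear-algebra formulation the authors exploit can be sketched directly: the forward recursion is a chain of matrix-vector products, one per observation, which is what makes it amenable to parallel evaluation. A pure-Python toy with made-up 2-state parameters:

```python
# Hedged sketch: the HMM forward algorithm written as repeated matrix-vector
# products, alpha' = diag(B[:, o]) * A^T * alpha. The 2-state model parameters
# are invented for illustration.

def forward_likelihood(pi, A, B, obs):
    """P(obs) for an HMM with start probs pi, transition matrix A
    (A[i][j] = P(j | i)) and emission matrix B (B[i][o] = P(o | i))."""
    n = len(pi)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]      # initialization
    for o in obs[1:]:                                      # one product per symbol
        alpha = [B[j][o] * sum(alpha[i] * A[i][j] for i in range(n))
                 for j in range(n)]
    return sum(alpha)

pi = [0.6, 0.4]
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.9, 0.1], [0.2, 0.8]]
print(forward_likelihood(pi, A, B, [0, 1, 0]))
```

Because each step is a matrix-vector product, consecutive steps can also be grouped into matrix-matrix products and evaluated in parallel, at the cost of the extra work the abstract notes for large state spaces.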
A comparison of distributed memory and virtual shared memory parallel programming models
Energy Technology Data Exchange (ETDEWEB)
Keane, J.A. [Univ. of Manchester (United Kingdom). Dept. of Computer Science; Grant, A.J. [Univ. of Manchester (United Kingdom). Computer Graphics Unit; Xu, M.Q. [Argonne National Lab., IL (United States)
1993-04-01
The virtues of the different parallel programming models, shared memory and distributed memory, have been much debated. Conventionally the debate could be reduced to programming convenience on the one hand, and high scalability on the other. More recently the debate has become somewhat blurred with the provision of virtual shared memory models built on machines with physically distributed memory. The intention of such models/machines is to provide scalable shared memory, i.e. to provide both programmer convenience and high scalability. In this paper, the different models are considered in light of experiences gained with a number of systems ranging from applications in both commerce and science to languages and operating systems. Case studies are introduced as appropriate.
Design and Development of Parallel Programs on the Bulk Synchronous Parallel Model
Institute of Scientific and Technical Information of China (English)
赖树华; 陆朝俊; 孙永强
2001-01-01
The Bulk Synchronous Parallel (BSP) model is briefly introduced, and the advantages of designing and developing parallel programs on the BSP model are discussed. The paper then analyses how to design and develop parallel programs on the BSP model and summarizes several principles the developer must comply with. Finally, a useful parallel programming method based on the BSP model is presented: the two-phase method of BSP parallel program design. The multiplication of two matrices is used as an example to illustrate how to apply this method and the BSP performance prediction tool in the design and development of BSP parallel programs.
Parallel flow accumulation algorithms for graphical processing units with application to RUSLE model
Sten, Johan; Lilja, Harri; Hyväluoma, Jari; Westerholm, Jan; Aspnäs, Mats
2016-04-01
Digital elevation models (DEMs) are widely used in the modeling of surface hydrology, which typically includes the determination of flow directions and flow accumulation. The use of high-resolution DEMs increases the accuracy of flow accumulation computation, but as a drawback, the computational time may become excessively long if large areas are analyzed. In this paper we investigate the use of graphical processing units (GPUs) for efficient flow accumulation calculations. We present two new parallel flow accumulation algorithms based on dependency transfer and topological sorting and compare them to previously published flow-transfer and indegree-based algorithms. We benchmark the GPU implementations against industry standards, ArcGIS and SAGA. With the flow-transfer D8 flow routing model and binary input data, a speedup of 19 is achieved compared to ArcGIS and 15 compared to SAGA. We show that on GPUs the topological sort-based flow accumulation algorithm leads on average to a speedup by a factor of 7 over the flow-transfer algorithm. Thus a total speedup of the order of 100 is achieved. We test the algorithms by applying them to the Revised Universal Soil Loss Equation (RUSLE) erosion model. For this purpose we present parallel versions of the slope, LS factor and RUSLE algorithms and show that the RUSLE erosion results for an area of 12 km x 24 km containing 72 million cells can be calculated in less than a second. Since flow accumulation is needed in many hydrological models, the developed algorithms may find use in many applications other than RUSLE modeling. The algorithm based on topological sorting is particularly promising for dynamic hydrological models where flow accumulations are repeatedly computed over an unchanged DEM.
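The topological-sort idea behind the faster algorithm can be sketched serially: when cells are visited in dependency order, each cell's accumulated flow is final at the moment it is processed. This is a minimal serial sketch (the paper's GPU version processes the independent cells of each level in parallel); the dict-based drainage-graph representation is an assumption for illustration.

```python
from collections import deque

def flow_accumulation(downstream):
    """Flow accumulation over a D8 drainage graph.

    `downstream` maps each cell to the single cell it drains into
    (or None for an outlet). Cells are processed in topological order
    (Kahn's algorithm), so each cell is visited exactly once.
    """
    indegree = {c: 0 for c in downstream}
    for c, d in downstream.items():
        if d is not None:
            indegree[d] += 1
    acc = {c: 1 for c in downstream}  # each cell contributes its own unit area
    queue = deque(c for c, n in indegree.items() if n == 0)
    while queue:
        c = queue.popleft()
        d = downstream[c]
        if d is not None:
            acc[d] += acc[c]       # pass accumulated flow downstream
            indegree[d] -= 1
            if indegree[d] == 0:   # all upstream contributions received
                queue.append(d)
    return acc
```

Because the DEM (and hence the dependency order) does not change between time steps of a dynamic model, the topological order can be computed once and reused, which is the repeated-computation advantage noted above.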
Honkonen, I.
2015-03-01
I present a method for developing extensible and modular computational models without sacrificing serial or parallel performance or source code readability. By using a generic simulation cell method I show that it is possible to combine several distinct computational models to run in the same computational grid without requiring modification of existing code. This is an advantage for the development and testing of, e.g., geoscientific software as each submodel can be developed and tested independently and subsequently used without modification in a more complex coupled program. An implementation of the generic simulation cell method presented here, generic simulation cell class (gensimcell), also includes support for parallel programming by allowing model developers to select which simulation variables of, e.g., a domain-decomposed model to transfer between processes via a Message Passing Interface (MPI) library. This allows the communication strategy of a program to be formalized by explicitly stating which variables must be transferred between processes for the correct functionality of each submodel and the entire program. The generic simulation cell class requires a C++ compiler that supports a version of the language standardized in 2011 (C++11). The code is available at https://github.com/nasailja/gensimcell for everyone to use, study, modify and redistribute; those who do are kindly requested to acknowledge and cite this work.
Error modeling and tolerance design of a parallel manipulator with full-circle rotation
Directory of Open Access Journals (Sweden)
Yanbing Ni
2016-05-01
A method for improving the accuracy of a parallel manipulator with full-circle rotation is systematically investigated in this work via kinematic analysis, error modeling, sensitivity analysis, and tolerance allocation. First, a kinematic analysis of the mechanism is made using the space vector chain method. Using the results as a basis, an error model is formulated considering the main error sources. Position and orientation error-mapping models are established by mathematical transformation of the parallelogram structure characteristics. Second, a sensitivity analysis is performed on the geometric error sources. A global sensitivity evaluation index is proposed to evaluate the contribution of the geometric errors to the accuracy of the end-effector. The analysis results provide a theoretical basis for the allocation of tolerances to the parts of the mechanical design. Finally, based on the results of the sensitivity analysis, the design of the tolerances can be solved as a nonlinearly constrained optimization problem. A genetic algorithm is applied to carry out the allocation of the manufacturing tolerances of the parts. Accordingly, the tolerance ranges for nine kinds of geometrical error sources are obtained. The achievements made in this work can also be applied to other similar parallel mechanisms with full-circle rotation to improve error modeling and design accuracy.
Kinematic Model Building and Servo Parameter Identification of 3-HSS Parallel Mechanism
Institute of Scientific and Technical Information of China (English)
YANG Zhi-yong; WU Jiang; HUANG Tian; NI Yan-bing
2006-01-01
Aiming at a parallel mechanism with three degrees of freedom, a method for dynamic model building and the parameter identification of its servo system is presented. First, the inverse solution models of position, velocity, and acceleration of the parallelogram branch structure are deduced, and then its rigid-body dynamic model is set up using the virtual work principle. Based on the above model, a method to identify the servo parameters of the parallel mechanism is proposed. In this method, a triangle-shaped input with variable frequency is adopted to offset the disadvantages of pseudorandom number sequences in parameter identification, such as dramatically changing the vibration amplitude of the motor and easily impacting the motor so that its velocity loop tends to open. Moreover, the rotary inertia can also be identified by the additive mass. These results lay a solid foundation for the optimum performance of the system over the whole workspace.
Lin, Youzuo; O'Malley, Daniel; Vesselinov, Velimir V.
2016-09-01
Inverse modeling seeks model parameters given a set of observations. However, for practical problems, because the number of measurements is often large and the model parameters are also numerous, conventional methods for inverse modeling can be computationally expensive. We have developed a new, computationally efficient parallel Levenberg-Marquardt method for solving inverse modeling problems with a highly parameterized model space. Levenberg-Marquardt methods require the solution of a linear system of equations which can be prohibitively expensive to compute for moderate to large-scale problems. Our novel method projects the original linear problem down to a Krylov subspace such that the dimensionality of the problem can be significantly reduced. Furthermore, we store the Krylov subspace computed with the first damping parameter and recycle the subspace for the subsequent damping parameters. The efficiency of our new inverse modeling algorithm is significantly improved by these computational techniques. We apply this new inverse modeling method to invert for random transmissivity fields in 2-D and a random hydraulic conductivity field in 3-D. Our algorithm is fast enough to solve for the distributed model parameters (transmissivity) in the model domain. The algorithm is coded in Julia and implemented in the MADS computational framework (http://mads.lanl.gov). Compared with Levenberg-Marquardt methods using standard linear inversion techniques such as QR or SVD methods, our Levenberg-Marquardt method yields a speed-up ratio on the order of ~10^1 to ~10^2 in a multicore computational environment. Therefore, our new inverse modeling method is a powerful tool for characterizing subsurface heterogeneity for moderate to large-scale problems.
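For context, a plain Levenberg-Marquardt iteration looks like the sketch below. The paper's contribution replaces the inner dense solve with a projection onto a Krylov subspace that is recycled across damping parameters; this minimal sketch uses the straightforward normal-equations solve instead, and the function names are illustrative.

```python
import numpy as np

def levenberg_marquardt(residual, jacobian, x0, lam=1e-3, iters=50):
    """Basic Levenberg-Marquardt: solve (J^T J + lam*I) dx = -J^T r.

    Accepted steps shrink the damping parameter (more Gauss-Newton-like);
    rejected steps grow it (more gradient-descent-like).
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        r = residual(x)
        J = jacobian(x)
        dx = np.linalg.solve(J.T @ J + lam * np.eye(x.size), -J.T @ r)
        if np.linalg.norm(residual(x + dx)) < np.linalg.norm(r):
            x = x + dx
            lam *= 0.5   # accept step, trust the linear model more
        else:
            lam *= 2.0   # reject step, damp harder
    return x
```

Each damping parameter `lam` changes only the diagonal shift of the same matrix, which is why a Krylov basis built for one value of `lam` can be reused for the others.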
Shen, Yanfeng; Cesnik, Carlos E. S.
2016-04-01
This paper presents a parallelized modeling technique for the efficient simulation of nonlinear ultrasonics introduced by wave interaction with fatigue cracks. The elastodynamic wave equations with contact effects are formulated using an explicit Local Interaction Simulation Approach (LISA). The LISA formulation is extended to capture the contact-impact phenomena during the wave-damage interaction based on the penalty method. A Coulomb friction model is integrated into the computation procedure to capture stick-slip contact shear motion. The LISA procedure is coded using the Compute Unified Device Architecture (CUDA), which enables highly parallelized computation on powerful graphics cards. Both the explicit contact formulation and the parallel implementation give LISA superb computational efficiency over the conventional finite element method (FEM). The theoretical formulation based on the penalty method is introduced and a guideline for the proper choice of the contact stiffness is given. The convergence behavior of the solution under various contact stiffness values is examined. A numerical benchmark problem is used to investigate the new LISA formulation and results are compared with a conventional contact finite element solution. Various nonlinear ultrasonic phenomena are successfully captured using this contact LISA formulation, including the generation of nonlinear higher harmonic responses. Nonlinear mode conversion of guided waves at fatigue cracks is also studied.
Service Virtualization Using a Non-von Neumann Parallel, Distributed, and Scalable Computing Model
Directory of Open Access Journals (Sweden)
Rao Mikkilineni
2012-01-01
This paper describes a prototype implementing a high degree of transaction resilience in distributed software systems using a non-von Neumann computing model exploiting parallelism in computing nodes. The prototype incorporates fault, configuration, accounting, performance, and security (FCAPS) management using a signaling network overlay and allows the dynamic control of a set of distributed computing elements in a network. Each node is a computing entity endowed with self-management and signaling capabilities to collaborate with similar nodes in a network. The separation of parallel computing and management channels allows the end-to-end transaction management of computing tasks (provided by the autonomous distributed computing elements) to be implemented as network-level FCAPS management. While the new computing model is operating system agnostic, a Linux, Apache, MySQL, PHP/Perl/Python (LAMP) based services architecture is implemented in a prototype to demonstrate end-to-end transaction management with auto-scaling, self-repair, dynamic performance management and distributed transaction security assurance. The implementation is made possible by a non-von Neumann middleware library providing Linux process management through multi-threaded parallel execution of self-management and signaling abstractions. We did not use hypervisors, virtual machines, or layers of complex virtualization management systems in implementing this prototype.
Energy Technology Data Exchange (ETDEWEB)
Wehner, W.F.; Mirin, A.A.; Bolstad, J.H. [and others
1996-09-01
A comprehensive climate system model is under development at Lawrence Livermore National Laboratory. The basis for this model is a consistent coupling of multiple complex subsystem models, each describing a major component of the Earth's climate. Among these are general circulation models of the atmosphere and ocean, a dynamic and thermodynamic sea ice model, and models of the chemical processes occurring in the air, sea water, and near-surface land. The computational resources necessary to carry out simulations at adequate spatial resolutions for durations of climatic time scales exceed those currently available. Distributed memory massively parallel processing (MPP) computers promise to affordably scale to the computational rates required by directing large numbers of relatively inexpensive processors onto a single problem. We have developed a suite of routines designed to exploit current generation MPP architectures via domain and functional decomposition strategies. These message passing techniques have been implemented in each of the component models and in their coupling interfaces. Production runs of the atmospheric and oceanic components performed on the National Environmental Supercomputing Center (NESC) Cray T3D are described.
Jagiella, Nick; Rickert, Dennis; Theis, Fabian J; Hasenauer, Jan
2017-02-22
Mechanistic understanding of multi-scale biological processes, such as cell proliferation in a changing biological tissue, is readily facilitated by computational models. While tools exist to construct and simulate multi-scale models, the statistical inference of the unknown model parameters remains an open problem. Here, we present and benchmark a parallel approximate Bayesian computation sequential Monte Carlo (pABC SMC) algorithm, tailored for high-performance computing clusters. pABC SMC is fully automated and returns reliable parameter estimates and confidence intervals. By running the pABC SMC algorithm for ~10^6 hr, we parameterize multi-scale models that accurately describe quantitative growth curves and histological data obtained in vivo from individual tumor spheroid growth in media droplets. The models capture the hybrid deterministic-stochastic behaviors of 10^5-10^6 cells growing in a 3D dynamically changing nutrient environment. The pABC SMC algorithm reliably converges to a consistent set of parameters. Our study demonstrates a proof of principle for robust, data-driven modeling of multi-scale biological systems and the feasibility of multi-scale model parameterization through statistical inference.
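The SMC machinery aside, the core approximate Bayesian computation idea the algorithm builds on fits in a few lines. The sketch below is a minimal single-population rejection sampler, not the paper's pABC SMC algorithm; function names and the scalar summary statistic are assumptions for illustration.

```python
def abc_rejection(observed, simulate, prior, eps, n_accept):
    """Minimal ABC rejection sampler.

    Draw a parameter from the prior, simulate data with it, and keep
    the parameter only if the simulated summary statistic falls within
    eps of the observed one. The accepted draws approximate the
    posterior without ever evaluating a likelihood.
    """
    accepted = []
    while len(accepted) < n_accept:
        theta = prior()
        if abs(simulate(theta) - observed) <= eps:
            accepted.append(theta)
    return accepted
```

SMC variants tighten `eps` over a sequence of populations and reweight particles between rounds; because every simulation is independent, the simulate-and-compare loop is what parallelizes across cluster nodes.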
Directory of Open Access Journals (Sweden)
Spyridon Liakas
2017-08-01
The particulate discrete element method (DEM) can be employed to capture the response of rock, provided that appropriate bonding models are used to cement the particles to each other. Simulations of laboratory tests are important to establish the extent to which those models can capture realistic rock behaviors. Hitherto the focus in such comparison studies has been either on homogeneous specimens or on two-dimensional (2D) models. In situ rock formations are often heterogeneous, so exploring the ability of this type of model to capture heterogeneous material behavior is important to facilitate its use in design analysis. In situ stress states are essentially three-dimensional (3D), and therefore it is important to develop 3D models for this purpose. This paper revisits an earlier experimental study on heterogeneous specimens, in which the relative proportions of weaker material (siltstone) and stronger, harder material (sandstone) were varied in a controlled manner. Using a 3D DEM model with the parallel bond model, virtual heterogeneous specimens were created. The overall responses in terms of variations in strength and stiffness with different percentages of weaker material (siltstone) were shown to agree with the experimental observations. There was also good qualitative agreement between the failure patterns observed in the experiments and the simulations, suggesting that the DEM data enabled analysis of the initiation of localizations and micro-fractures in the specimens.
Tri-Lab data models and format (DMF) project: parallel I/O and data exchange
Energy Technology Data Exchange (ETDEWEB)
Cook, L. M.; Matarazzo, C. M.; Rathkopf, J.
1998-10-01
A central goal of the ASCI program is to push simulation and modeling for Science-based Stockpile Stewardship to unprecedented levels. ASCI applications will use extremely high-fidelity models, on the order of one billion cells, to generate terabytes of raw data. Such vast amounts of data produced by these supercomputing applications will overwhelm scientists, whose efforts to understand their results are hindered by inadequate visualization and data management tools. Much of the Scientific Data Management (SDM) effort concerns managing the large and complex data emerging from these simulation codes. One particular area for which commercial and scalable solutions do not exist is in Parallel I/O and data exchange between simulations. To address these needs, the Tri-lab Data Models and Formats effort of the SDM project is developing capabilities to enable the capturing and sharing of simulation data.
Fonseca, Ricardo A; Fiúza, Frederico; Davidson, Asher; Tsung, Frank S; Mori, Warren B; Silva, Luís O
2013-01-01
A new generation of laser wakefield accelerators, supported by the extreme accelerating fields generated in the interaction of PW-class lasers and underdense targets, promises the production of high quality electron beams in short distances for multiple applications. Achieving this goal will rely heavily on numerical modeling for further understanding of the underlying physics and identification of optimal regimes, but large scale modeling of these scenarios is computationally heavy and requires efficient use of state-of-the-art petascale supercomputing systems. We discuss the main difficulties involved in running these simulations and the new developments implemented in the OSIRIS framework to address these issues, ranging from multi-dimensional dynamic load balancing and hybrid distributed/shared memory parallelism to the vectorization of the PIC algorithm. We present the results of the OASCR Joule Metric program on the issue of large scale modeling of LWFA, demonstrating speedups of over 1 order of magnitude.
Parameters Design for a Parallel Hybrid Electric Bus Using Regenerative Brake Model
Directory of Open Access Journals (Sweden)
Zilin Ma
2014-01-01
A design methodology which uses the regenerative brake model is introduced to determine the major system parameters of a parallel hybrid electric bus drive train. Hybrid system parameters mainly include the power rating of the internal combustion engine (ICE), the gear ratios of the transmission, the power rating and maximal torque of the motor, and the power and capacity of the battery. The regenerative model is built into the vehicle model to estimate the regenerative energy under real road conditions. The design target is to ensure that the vehicle meets the specified vehicle performance, such as speed and acceleration, and at the same time operates the ICE within an expected speed range. Several pairs of parameters are selected from the result analysis, and the fuel saving result in the road test shows that a 25% reduction is achieved in fuel consumption.
Directory of Open Access Journals (Sweden)
Peng Liang
2015-01-01
This research considers an unrelated parallel machine scheduling problem with energy consumption and total tardiness. The problem is compounded by two challenges: differences in energy consumption among unrelated parallel machines, and the interaction between job assignments and machine state operations. To begin with, we establish a mathematical model for this problem. Then an ant colony optimization algorithm based on the ATC heuristic rule (ATC-ACO) is presented. Furthermore, optimal parameters of the proposed algorithm are determined via Taguchi methods using generated test data. Finally, comparative experiments indicate that the proposed ATC-ACO algorithm performs better at minimizing energy consumption as well as total tardiness, and that the modified ATC heuristic rule is more effective at reducing energy consumption.
Puzyrev, Vladimir; Koldan, Jelena; de la Puente, Josep; Houzeaux, Guillaume; Vázquez, Mariano; Cela, José María
2013-05-01
We present a nodal finite-element method that can be used to compute in parallel highly accurate solutions for 3-D controlled-source electromagnetic forward-modelling problems in anisotropic media. A secondary coupled-potential formulation of Maxwell's equations allows us to avoid the singularities introduced by the sources, while completely unstructured tetrahedral meshes and mesh refinement support an accurate representation of geological and bathymetric complexity and improve the solution accuracy. Different complex iterative solvers and an efficient preconditioner based on the sparse approximate inverse are used for solving the resulting large sparse linear system of equations. Results are compared with those of other researchers to check the accuracy of the method. We demonstrate the performance of the code on large problems with tens and even hundreds of millions of degrees of freedom. Scalability tests on massively parallel computers show that our code is highly scalable.
Institute of Scientific and Technical Information of China (English)
LUO Bing; LU Nian-li; CHE Ren-wei
2009-01-01
The equivalent integrated finite element method is a canonical and efficient modeling method in the dynamic analysis of complex mechanisms. The key to establishing the dynamic equations of a spatial mechanism by this method is to determine the Jacobian matrices reflecting the relations of all joints, nodes, and generalized coordinates, namely, the second-order and corresponding third-order conversion tensors. Given the complex motion relations of components in a parallel robot, this paper gives the second-order and third-order conversion tensors of the dynamic equations for the 6-HTRT parallel robot based on the equivalent integrated finite element method. The method is suitable for typical robots whose workspace positions and mechanism sizes differ. The solving course of the method is simple and convenient, so the method lays the foundation for the dynamic analysis of robots.
Implementation of Newton-Raphson iterations for parallel staggered-grid geodynamic models
Popov, A. A.; Kaus, B. J. P.
2012-04-01
Staggered-grid finite difference discretizations have good potential for solving highly heterogeneous geodynamic models on parallel computers (e.g. Tackley, 2008; Gerya & Yuen, 2007). They are inherently stable, computationally inexpensive and relatively easy to implement. However, currently used staggered-grid geodynamic codes employ almost exclusively the sub-optimal Picard linearization scheme to deal with nonlinearities. It has been shown that Newton-Raphson linearization can lead to substantial improvements in solution quality in geodynamic problems, simultaneously with a reduction of computer time (e.g. Popov & Sobolev, 2008). This work is aimed at the implementation of Newton-Raphson linearization in the parallel geodynamic code LaMEM, together with staggered-grid discretization and visco-(elasto-)plastic rock rheologies. We present the expressions for the approximate Jacobian matrix, and give detailed comparisons with the currently employed Picard linearization scheme, in terms of solution quality and number of iterations.
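The difference between the two linearization schemes is easiest to see on a scalar problem: Picard (fixed-point) iteration converges linearly, while Newton-Raphson converges quadratically near the root, which is why the switch can cut iteration counts substantially. A minimal sketch, with the toy equation x = cos(x) standing in for the nonlinear rheology:

```python
import math

def picard(g, x0, tol=1e-10, max_iter=1000):
    """Picard (fixed-point) iteration: x <- g(x)."""
    x, n = x0, 0
    while abs(g(x) - x) > tol and n < max_iter:
        x, n = g(x), n + 1
    return x, n

def newton(f, df, x0, tol=1e-10, max_iter=100):
    """Newton-Raphson iteration: x <- x - f(x)/f'(x)."""
    x, n = x0, 0
    while abs(f(x)) > tol and n < max_iter:
        x, n = x - f(x) / df(x), n + 1
    return x, n
```

For x = cos(x) starting from x0 = 1, Newton typically reaches the tolerance in a handful of iterations while Picard needs dozens; in a PDE code the same contrast appears, at the cost of assembling a (possibly approximate) Jacobian instead of a simple operator re-evaluation.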
Institute of Scientific and Technical Information of China (English)
QIU Zhi-qiang; ZOU Hai; SUN Jian-hua
2008-01-01
Parallel turbine-driven feedwater pumps are needed when ships travel at high speed. In order to study marine steam generator feedwater control systems which use parallel turbine-driven feed pumps, a mathematical model of the marine steam generator feedwater control system was developed which includes mathematical models of two steam generators and the parallel turbine-driven feed pumps, as well as mathematical models of the feedwater pipes and feed regulating valves. The operating condition points of the parallel turbine-driven feed pumps were calculated by the Chebyshev curve fit method. A water level controller for the steam generator and a rotary speed controller for the turbine-driven feed pumps were also included in the model. The accuracy of the mathematical models and their controllers was verified by comparing their results with those from a simulator.
Optimizing ion channel models using a parallel genetic algorithm on graphical processors.
Ben-Shalom, Roy; Aviv, Amit; Razon, Benjamin; Korngreen, Alon
2012-01-01
We have recently shown that we can semi-automatically constrain models of voltage-gated ion channels by combining a stochastic search algorithm with ionic currents measured using multiple voltage-clamp protocols. Although numerically successful, this approach is highly demanding computationally, with optimization on a high performance Linux cluster typically lasting several days. To solve this computational bottleneck we converted our optimization algorithm for work on a graphical processing unit (GPU) using NVIDIA's CUDA. Parallelizing the process on a Fermi graphic computing engine from NVIDIA increased the speed ∼180 times over an application running on an 80 node Linux cluster, considerably reducing simulation times. This application allows users to optimize models for ion channel kinetics on a single, inexpensive, desktop "super computer," greatly reducing the time and cost of building models relevant to neuronal physiology. We also demonstrate that the point of algorithm parallelization is crucial to its performance. We substantially reduced computing time by solving the ODEs (Ordinary Differential Equations) so as to massively reduce memory transfers to and from the GPU. This approach may be applied to speed up other data intensive applications requiring iterative solutions of ODEs.
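A toy version of the stochastic search being parallelized might look like the following real-valued genetic algorithm. It is a generic sketch (truncation/elitist selection, blend crossover, Gaussian mutation), not the authors' algorithm, and the hyperparameters are illustrative; the relevant property is that every individual's loss evaluation is independent, which is what maps naturally onto GPU threads.

```python
import random

def genetic_minimize(loss, bounds, pop_size=40, generations=60, seed=0):
    """Minimize `loss` over a box given by `bounds` with a toy GA.

    Each generation keeps the better half of the population (elitism)
    and refills the rest with blend-crossover children plus Gaussian
    mutation, clamped back into the search box.
    """
    rng = random.Random(seed)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=loss)                      # independent loss evaluations
        elite = pop[: pop_size // 2]            # survivors
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)         # two parents
            child = [(x + y) / 2 + rng.gauss(0, 0.05 * (hi - lo))
                     for x, y, (lo, hi) in zip(a, b, bounds)]
            # clamp mutated coordinates back into the search box
            child = [min(max(c, lo), hi) for c, (lo, hi) in zip(child, bounds)]
            children.append(child)
        pop = elite + children
    return min(pop, key=loss)
```

In the ion-channel setting the loss would compare simulated currents against the voltage-clamp recordings; solving the model ODEs on-device, as the abstract notes, avoids paying a host-device memory transfer per evaluation.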
SiGN-SSM: open source parallel software for estimating gene networks with state space models.
Tamada, Yoshinori; Yamaguchi, Rui; Imoto, Seiya; Hirose, Osamu; Yoshida, Ryo; Nagasaki, Masao; Miyano, Satoru
2011-04-15
SiGN-SSM is an open-source gene network estimation software able to run in parallel on PCs and massively parallel supercomputers. The software estimates a state space model (SSM), that is, a statistical dynamic model suitable for analyzing short time and/or replicated time series gene expression profiles. SiGN-SSM implements a novel parameter constraint effective in stabilizing the estimated models. Also, by using a supercomputer, it is able to determine the gene network structure by a statistical permutation test in a practical time. SiGN-SSM is applicable not only to analyzing temporal regulatory dependencies between genes, but also to extracting the differentially regulated genes from time series expression profiles. SiGN-SSM is distributed under the GNU Affero General Public License (GNU AGPL) version 3 and can be downloaded at http://sign.hgc.jp/signssm/. The pre-compiled binaries for some architectures are available in addition to the source code. The pre-installed binaries are also available on the Human Genome Center supercomputer system. The online manual and the supplementary information of SiGN-SSM are available on our website. tamada@ims.u-tokyo.ac.jp.
Honkonen, I.
2014-07-01
I present a method for developing extensible and modular computational models without sacrificing serial or parallel performance or source code readability. By using a generic simulation cell method I show that it is possible to combine several distinct computational models to run in the same computational grid without requiring any modification of existing code. This is an advantage for the development and testing of computational modeling software as each submodel can be developed and tested independently and subsequently used without modification in a more complex coupled program. Support for parallel programming is also provided by allowing users to select which simulation variables to transfer between processes via a Message Passing Interface library. This allows the communication strategy of a program to be formalized by explicitly stating which variables must be transferred between processes for the correct functionality of each submodel and the entire program. The generic simulation cell class presented here requires a C++ compiler that supports variadic templates which were standardized in 2011 (C++11). The code is available at: https://github.com/nasailja/gensimcell for everyone to use, study, modify and redistribute; those that do are kindly requested to cite this work.
Vlasov modelling of parallel transport in a tokamak scrape-off layer
Energy Technology Data Exchange (ETDEWEB)
Manfredi, G [Institut de Physique et Chimie des Materiaux, CNRS and Universite de Strasbourg, BP 43, F-67034 Strasbourg (France); Hirstoaga, S [INRIA Nancy Grand-Est and Institut de Recherche en Mathematiques Avancees, 7 rue Rene Descartes, F-67084 Strasbourg (France); Devaux, S, E-mail: Giovanni.Manfredi@ipcms.u-strasbg.f, E-mail: hirstoaga@math.unistra.f, E-mail: Stephane.Devaux@ccfe.ac.u [JET-EFDA, Culham Science Centre, Abingdon, OX14 3DB (United Kingdom)
2011-01-15
A one-dimensional Vlasov-Poisson model is used to describe the parallel transport in a tokamak scrape-off layer. Thanks to a recently developed 'asymptotic-preserving' numerical scheme, it is possible to lift numerical constraints on the time step and grid spacing, which are no longer limited by, respectively, the electron plasma period and Debye length. The Vlasov approach provides a good velocity-space resolution even in regions of low density. The model is applied to the study of parallel transport during edge-localized modes, with particular emphasis on the particle and energy fluxes on the divertor plates. The numerical results are compared with analytical estimates based on a free-streaming model, with good general agreement. An interesting feature is the observation of an early electron energy flux, due to suprathermal electrons escaping the ions' attraction. In contrast, the long-time evolution is essentially quasi-neutral and dominated by the ion dynamics.
Analysis and Modeling of Parallel Photovoltaic Systems under Partial Shading Conditions
Buddala, Santhoshi Snigdha
Since the industrial revolution, fossil fuels like petroleum, coal, oil, and natural gas, along with other non-renewable energy sources, have been used as the primary sources of energy. The consumption of fossil fuels releases various harmful gases into the atmosphere as byproducts, which are hazardous in nature; they tend to deplete the protective layers of the atmosphere and affect the overall environmental balance. Fossil fuels are also finite resources, and their rapid depletion has prompted the need to investigate alternative sources of energy, called renewable energy. One such promising source of renewable energy is solar/photovoltaic energy. This work focuses on investigating a new solar array architecture with solar cells connected in a parallel configuration. While retaining the structural simplicity of the parallel architecture, a theoretical small-signal model of the solar cell is proposed and used to analyze the variations in the module parameters when subjected to partial shading conditions. Simulations were run in SPICE to validate the model implemented in Matlab. The voltage limitations of the proposed architecture are addressed by adopting a simple dc-dc boost converter, and the performance of the architecture is evaluated in terms of efficiency by comparing it with traditional architectures. SPICE simulations are used to compare the architectures and identify the best one in terms of power conversion efficiency under partial shading conditions.
A Review on Large Scale Graph Processing Using Big Data Based Parallel Programming Models
Directory of Open Access Journals (Sweden)
Anuraj Mohan
2017-02-01
Processing big graphs has become an increasingly essential activity in various fields such as engineering, business intelligence, and computer science. Social networks and search engines generate large graphs, which demand sophisticated techniques for social network analysis and web structure mining. The latest trends in graph processing tend toward using Big Data platforms for parallel graph analytics. MapReduce has emerged as a Big Data based programming model for processing massively large datasets. Apache Giraph, an open source implementation of Google Pregel based on the Bulk Synchronous Parallel (BSP) model, is used for graph analytics in social networks such as Facebook. The proposed work investigates the algorithmic effects of the MapReduce and BSP models on graph problems. The triangle counting problem in graphs is taken as a benchmark, and evaluations are made on the basis of computation time on the same cluster, scalability with respect to graph and cluster size, resource utilization, and the structure of the graph.
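The benchmark used above, triangle counting, is easy to sketch serially; the MapReduce and BSP (Pregel/Giraph) versions discussed in the paper parallelize the same neighbor-intersection idea by distributing vertices across workers and summing partial counts in a reduction step. A minimal serial sketch (illustrative only, not the paper's implementation):

```python
# Node-iterator triangle counting: for each vertex u and each neighbor v,
# count the common neighbors of u and v. Every triangle is found once per
# ordered edge (6 times in total), hence the final division.
# BSP/MapReduce versions distribute the outer loop over workers and merge
# the partial counts in a reduction step.
def count_triangles(edges):
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    total = 0
    for u, nbrs in adj.items():
        for v in nbrs:
            total += len(nbrs & adj[v])  # common neighbors of u and v
    return total // 6  # each triangle counted 6 times

# K4 (complete graph on 4 vertices) contains 4 triangles
k4 = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
print(count_triangles(k4))  # -> 4
```

The serial cost is dominated by the set intersections; the distributed versions trade this against communication, which is why the paper's comparison focuses on computation time and scalability with cluster size.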
Cpl6: The New Extensible, High-Performance Parallel Coupler for the Community Climate System Model
Energy Technology Data Exchange (ETDEWEB)
Craig, Anthony P.; Jacob, Robert L.; Kauffman, Brian; Bettge, Tom; Larson, Jay; Ong, Everest; Ding, Chris; He, Yun
2005-03-24
Coupled climate models are large, multiphysics applications designed to simulate the Earth's climate and predict the response of the climate to any changes in the forcing or boundary conditions. The Community Climate System Model (CCSM) is a widely used state-of-the-art climate model that has released several versions to the climate community over the past ten years. Like many climate models, CCSM employs a coupler, a functional unit that coordinates the exchange of data between parts of the climate system such as the atmosphere and ocean. This paper describes the new coupler, cpl6, contained in the latest version of CCSM, CCSM3. Cpl6 introduces distributed-memory parallelism to the coupler, a class library for important coupler functions, and a standardized interface for component models. Cpl6 is implemented entirely in Fortran90 and uses the Model Coupling Toolkit as the base for most of its classes. Cpl6 gives improved performance over previous versions and scales well on multiple platforms.
A Parallel Ocean Model With Adaptive Mesh Refinement Capability For Global Ocean Prediction
Energy Technology Data Exchange (ETDEWEB)
Herrnstein, Aaron R. [Univ. of California, Davis, CA (United States)
2005-12-01
An ocean model with adaptive mesh refinement (AMR) capability is presented for simulating ocean circulation on decade time scales. The model closely resembles the LLNL ocean general circulation model with some components incorporated from other well known ocean models when appropriate. Spatial components are discretized using finite differences on a staggered grid where tracer and pressure variables are defined at cell centers and velocities at cell vertices (B-grid). Horizontal motion is modeled explicitly with leapfrog and Euler forward-backward time integration, and vertical motion is modeled semi-implicitly. New AMR strategies are presented for horizontal refinement on a B-grid, leapfrog time integration, and time integration of coupled systems with unequal time steps. These AMR capabilities are added to the LLNL software package SAMRAI (Structured Adaptive Mesh Refinement Application Infrastructure) and validated with standard benchmark tests. The ocean model is built on top of the amended SAMRAI library. The resulting model has the capability to dynamically increase resolution in localized areas of the domain. Limited basin tests are conducted using various refinement criteria and produce convergence trends in the model solution as refinement is increased. Carbon sequestration simulations are performed on decade time scales in domains the size of the North Atlantic and the global ocean. A suggestion is given for refinement criteria in such simulations. AMR predicts maximum pH changes and increases in CO₂ concentration near the injection sites that are virtually unattainable with a uniform high resolution due to extremely long run times. Fine scale details near the injection sites are achieved by AMR with shorter run times than the finest uniform resolution tested despite the need for enhanced parallel performance. The North Atlantic simulations show a reduction in passive tracer errors when AMR is applied instead of a uniform coarse resolution.
Parallelization Load Balance Strategy for a Global Grid-Point Model
Institute of Scientific and Technical Information of China (English)
WU Xiangjun; CHEN Dehui; SONG Junqiang; JIN Zhiyan; YANG Xuesheng; ZHANG Hongliang
2010-01-01
The Global/Regional Assimilation and PrEdiction System (GRAPES) is a new-generation operational numerical weather prediction (NWP) model developed by the China Meteorological Administration (CMA). It is a grid-point model with a code structure different from that of spectral models used in other operational NWP centers such as the European Centre for Medium-Range Weather Forecasts (ECMWF), the National Centers for Environmental Prediction (NCEP), and the Japan Meteorological Agency (JMA), especially in the context of parallel computing. In the GRAPES global model, a semi-implicit semi-Lagrangian scheme is used for the discretization over a sphere, which requires careful planning of the busy communications between the arrays of processors, because the Lagrangian differential scheme results in shortened trajectories interpolated between the grid points at the poles and in the associated adjacent areas. This means that the latitude-longitude partitioning is more complex for the polar processors. Therefore, a parallel strategy with efficient computation, balanced load, and synchronous communication shall be developed. In this paper, a message passing approach based on MPI (Message Passing Interface) group communication is proposed. Its key point is to group the polar processors by row with a matrix topology during processor partitioning. A load balance task distribution algorithm is also discussed. Test runs on the IBM-cluster 1600 at CMA show that the new algorithm has the desired scalability, and the readjusted load balance scheme can reduce the absolute wall clock time by 10% or more. The quasi-operational runs of the model demonstrate that the wall clock time secured by the strategy meets the real-time needs of NWP operations.
Zhang, Yanzhen; Liu, Yonghong; Wang, Xiaolong; Shen, Yang; Ji, Renjie; Cai, Baoping
2013-02-01
The charging characteristics of micrometer-sized aqueous droplets have attracted increasing attention with the development of microfluidics technology, since the electrophoretic motion of a charged droplet can be used as a droplet actuation method. This work proposes a novel method for investigating the charging characteristics of micrometer-sized aqueous droplets based on a parallel plate capacitor model. With this method, the effects of the electric field strength, electrolyte concentration, and ion species on the charging characteristics of the aqueous droplets were investigated. Experimental results confirmed that the charging characteristics of micrometer-sized droplets can be studied in this way.
Directory of Open Access Journals (Sweden)
Chen Zhao
2016-01-01
It is important for large ships to find suitable ceramic/metal functionally graded thermal barrier coating materials. A parallel computation model is built for the optimization design of a three-dimensional ceramic/metal functionally graded thermal barrier coating material. The heat transfer problem is formulated from the governing equation and the initial and boundary conditions, and a numerical optimization design algorithm is constructed by adapting a finite difference method. The numerical results show that the graded thermal barrier coating material can improve the thermal performance of the material.
Kerr, I. D.; Sankararamakrishnan, R; Smart, O.S.; Sansom, M S
1994-01-01
A parallel bundle of transmembrane (TM) alpha-helices surrounding a central pore is present in several classes of ion channel, including the nicotinic acetylcholine receptor (nAChR). We have modeled bundles of hydrophobic and of amphipathic helices using simulated annealing via restrained molecular dynamics. Bundles of Ala20 helices, with N = 4, 5, or 6 helices/bundle were generated. For all three N values the helices formed left-handed coiled coils, with pitches ranging from 160 A (N = 4) to...
Hybrid Parallel Programming Models for AMR Neutron Monte-Carlo Transport
Dureau, David; Poëtte, Gaël
2014-06-01
This paper deals with High Performance Computing (HPC) applied to neutron transport theory on complex geometries, using both an Adaptive Mesh Refinement (AMR) algorithm and a Monte-Carlo (MC) solver. Several parallelism models are presented and analyzed in this context, among them shared memory and distributed memory ones such as Domain Replication and Domain Decomposition, together with hybrid strategies. The study is illustrated by weak and strong scalability tests on complex benchmarks on several thousand cores of the petaflop supercomputer Tera100.
Miura, Yuichiro; Matsuda, Tadashi; Usuda, Haruo; Watanabe, Shimpei; Kitanishi, Ryuta; Saito, Masatoshi; Hanita, Takushi; Kobayashi, Yoshiyasu
2016-05-01
An artificial placenta (AP) is an arterio-venous extracorporeal life support system that is connected to the fetal circulation via the umbilical vasculature. Previously, we published an article describing a pumpless AP system with a small priming volume. We subsequently developed a parallelized system, hypothesizing that the reduced circuit resistance conveyed by this modification would enable healthy fetal survival time to be prolonged. We conducted experiments using a premature lamb model to test this hypothesis. As a result, the fetal survival period was significantly prolonged (60.4 ± 3.8 vs. 18.2 ± 3.2 h, P < 0.01), and circuit resistance and minimal blood lactate levels were significantly lower in the parallel circuit group, compared with our previous single circuit group. Fetal physiological parameters remained stable until the conclusion of the experiments. In summary, parallelization of the AP system was associated with reduced circuit resistance and lactate levels and allowed preterm lamb fetuses to survive for a significantly longer period when compared with previous studies.
Modeling flue pipes: Subsonic flow, lattice Boltzmann, and parallel distributed computers
Skordos, Panayotis A.
1995-01-01
The problem of simulating the hydrodynamics and the acoustic waves inside wind musical instruments such as the recorder, the organ, and the flute is considered. The problem is attacked by developing suitable local-interaction algorithms and a parallel simulation system on a cluster of non-dedicated workstations. Physical measurements of the acoustic signal of various flue pipes show good agreement with the simulations. Previous attempts at this problem have been frustrated because the modeling of acoustic waves requires small integration time steps which make the simulation very compute-intensive. In addition, the simulation of subsonic viscous compressible flow at high Reynolds numbers is susceptible to slow-growing numerical instabilities which are triggered by high-frequency acoustic modes. The numerical instabilities are mitigated by employing suitable explicit algorithms: the lattice Boltzmann method, compressible finite differences, and a fourth-order artificial-viscosity filter. Further, a technique for accurate initial and boundary conditions for the lattice Boltzmann method is developed, and the second-order accuracy of the lattice Boltzmann method is demonstrated. The compute-intensive requirements are handled by developing a parallel simulation system on a cluster of non-dedicated workstations. The system achieves 80 percent parallel efficiency (speedup/processors) using 20 HP-Apollo workstations. The system is built on UNIX and TCP/IP communication routines, and includes automatic process migration from busy hosts to free hosts.
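At its core, the lattice Boltzmann method mentioned above is a stream-and-collide update on particle populations. A minimal diffusion-only sketch (a D1Q2 lattice with BGK relaxation; this toy is far simpler than the subsonic compressible solver the dissertation describes, and is included only to show the update structure):

```python
import numpy as np

# Minimal 1D lattice Boltzmann (D1Q2, BGK collision) for pure diffusion:
# two populations stream left/right; relaxing toward the local equilibrium
# f_eq = rho/2 recovers the diffusion equation with D = tau - 1/2
# (in lattice units). This is a sketch, not the paper's solver.
def lbm_diffusion(rho0, tau=1.0, steps=100):
    f_r = 0.5 * rho0.copy()   # right-moving population
    f_l = 0.5 * rho0.copy()   # left-moving population
    for _ in range(steps):
        rho = f_r + f_l
        # BGK collision: relax each population toward equilibrium
        f_r += (0.5 * rho - f_r) / tau
        f_l += (0.5 * rho - f_l) / tau
        # streaming step (periodic boundaries)
        f_r = np.roll(f_r, 1)
        f_l = np.roll(f_l, -1)
    return f_r + f_l

rho0 = np.zeros(64)
rho0[32] = 1.0                 # point pulse
rho = lbm_diffusion(rho0)
print(rho.sum())               # mass is conserved (prints ~1.0)
```

Both collision and streaming conserve mass exactly, and the pulse spreads diffusively; the real solver adds momentum and energy moments, more lattice velocities, and the stability filters described in the abstract.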
Bellucci, Michael A; Coker, David F
2011-07-28
We describe a new method for constructing empirical valence bond potential energy surfaces using a parallel multilevel genetic program (PMLGP). Genetic programs can be used to perform an efficient search through function space and parameter space to find the best functions and sets of parameters that fit energies obtained by ab initio electronic structure calculations. Building on the traditional genetic program approach, the PMLGP utilizes a hierarchy of genetic programming on two different levels. The lower level genetic programs are used to optimize coevolving populations in parallel, while the higher level genetic program (HLGP) is used to optimize the genetic operator probabilities of the lower level genetic programs. The HLGP allows the algorithm to dynamically learn the mutation or combination of mutations that most effectively increase the fitness of the populations, causing a significant increase in the algorithm's accuracy and efficiency. Both are tested against a standard parallel genetic program on a variety of one-dimensional test cases. Subsequently, the PMLGP is utilized to obtain an accurate empirical valence bond model for proton transfer in 3-hydroxy-gamma-pyrone in the gas phase and in protic solvent.
Energy Technology Data Exchange (ETDEWEB)
Xie, G.; Li, J.; Majer, E.; Zuo, D.
1998-07-01
This paper describes a new 3D parallel GILD electromagnetic (EM) modeling and nonlinear inversion algorithm. The algorithm consists of: (a) a new magnetic integral equation instead of the electric integral equation to solve the electromagnetic forward modeling and inverse problem; (b) a collocation finite element method for solving the magnetic integral and a Galerkin finite element method for the magnetic differential equations; (c) a nonlinear regularizing optimization method to make the inversion stable and of high resolution; and (d) a new parallel 3D modeling and inversion using a global integral and local differential domain decomposition technique (GILD). The new 3D nonlinear electromagnetic inversion has been tested with synthetic data and field data. The authors obtained very good imaging for the synthetic data and reasonable subsurface EM imaging for the field data. The parallel algorithm has a high parallel efficiency of over 90% and can serve as a parallel solver for elliptic, parabolic, and hyperbolic modeling and inversion. The parallel GILD algorithm can be extended to high-resolution, large-scale seismic and hydrology modeling and inversion on massively parallel computers.
Stem thrust prediction model for W-K-M double wedge parallel expanding gate valves
Energy Technology Data Exchange (ETDEWEB)
Eldiwany, B.; Alvarez, P.D. [Kalsi Engineering Inc., Sugar Land, TX (United States); Wolfe, K. [Electric Power Research Institute, Palo Alto, CA (United States)
1996-12-01
An analytical model for determining the required valve stem thrust during opening and closing strokes of W-K-M parallel expanding gate valves was developed as part of the EPRI Motor-Operated Valve Performance Prediction Methodology (EPRI MOV PPM) Program. The model was validated against measured stem thrust data obtained from in-situ testing of three W-K-M valves. Model predictions show favorable, bounding agreement with the measured data for valves with Stellite 6 hardfacing on the disks and seat rings for water flow in the preferred flow direction (gate downstream). The maximum required thrust to open and to close the valve (excluding wedging and unwedging forces) occurs at a slightly open position and not at the fully closed position. In the nonpreferred flow direction, the model shows that premature wedging can occur during ΔP closure strokes even when the coefficients of friction at different sliding surfaces are within the typical range. This paper summarizes the model description and comparison against test data.
Oscillations of low-current electrical discharges between parallel-plane electrodes. III. Models
Phelps, A. V.; Petrović, Z. Lj.; Jelenković, B. M.
1993-04-01
Simple models are developed to describe the results of measurements of the oscillatory and negative differential resistance properties of low- to moderate-current discharges in parallel-plane geometry. The time-dependent model assumes that the ion transit time is fixed and is short compared to the times of interest, that electrons are produced at the cathode only by ions, and that space-charge distortion of the electric field is small but not negligible. Illustrative numerical solutions are given for large voltage and current changes and analytic solutions for the time dependence of current and voltage are obtained in the small-signal limit. The small-signal results include the frequency and damping constants for decaying oscillations following a voltage change or following the injection of photoelectrons. The conditions for underdamped, overdamped, and self-sustained or growing oscillations are obtained. A previously developed steady-state, nonequilibrium model for low-pressure hydrogen discharges that includes the effects of space-charge distortion of the electric field on the yield of electrons at the cathode is used to obtain the negative differential resistance. Analytic expressions for the differential resistance and capacitance are developed using the steady-state, local-equilibrium model for electron and ion motion and a first-order perturbation treatment of space-charge electric fields. These models generally show good agreement with data from dc and pulsed discharge experiments presented in the accompanying papers.
Wu, Johnny; Witkiewitz, Katie; McMahon, Robert J; Dodge, Kenneth A
2010-10-01
Conduct problems, substance use, and risky sexual behavior have been shown to coexist among adolescents, which may lead to significant health problems. The current study was designed to examine relations among these problem behaviors in a community sample of children at high risk for conduct disorder. A latent growth model of childhood conduct problems showed a decreasing trend from grades K to 5. During adolescence, four concurrent conduct problem and substance use trajectory classes were identified (high conduct problems and high substance use, increasing conduct problems and increasing substance use, minimal conduct problems and increasing substance use, and minimal conduct problems and minimal substance use) using a parallel process growth mixture model. Across all substances (tobacco, binge drinking, and marijuana use), higher levels of childhood conduct problems during kindergarten predicted a greater probability of classification into more problematic adolescent trajectory classes relative to less problematic classes. For tobacco and binge drinking models, increases in childhood conduct problems over time also predicted a greater probability of classification into more problematic classes. For all models, individuals classified into more problematic classes showed higher proportions of early sexual intercourse, infrequent condom use, receiving money for sexual services, and ever contracting an STD. Specifically, tobacco use and binge drinking during early adolescence predicted higher levels of sexual risk taking into late adolescence. Results highlight the importance of studying the conjoint relations among conduct problems, substance use, and risky sexual behavior in a unified model. Copyright © 2010 Elsevier Ireland Ltd. All rights reserved.
Advanced parallel computing for the coupled PCR-GLOBWB-MODFLOW model
Verkaik, Jarno; Schmitz, Oliver; Sutanudjaja, Edwin
2017-04-01
PCR-GLOBWB (https://github.com/UU-Hydro/PCR-GLOBWB_model) is a large-scale hydrological model intended for global to regional studies and developed at the Department of Physical Geography, Utrecht University (Netherlands). The latest version of the model can simulate terrestrial hydrological and water resource fluxes and storages with a typical spatial resolution of 5 arc-minutes (less than 10 km) at the global extent. One of the recent features in the model development is the inclusion of a global 2-layer MODFLOW model simulating groundwater lateral flow. This advanced feature enables us to simulate and assess the groundwater head dynamics at the global extent, including at regions with declining groundwater head problems. Unfortunately, the current coupled PCR-GLOBWB-MODFLOW requires long run times mainly attributed to the current inefficient parallel computing and coupling algorithm. In this work, we aim to improve it by setting-up a favorable river-basin partitioning manner that reduces I/O communication and optimizes load balance between PCR-GLOBWB and MODFLOW. We also aim to replace the MODFLOW-2000 in the current coupled model with MODFLOW-USG. This will allow us to use the new Parallel Krylov Solver (PKS) that can run with Message Passing Interface (MPI) and can be easily combined with Open Multi-Processing (OpenMP). The latest scaling test carried out on the Cartesius Dutch National supercomputer shows that the usage of MODFLOW-USG and new PKS solver can result in significant MODFLOW calculation speedups (up to 45). The encouraging result of this work opens a possibility for running the model with more detailed setup and at higher resolution. As MODFLOW-USG supports both structured and unstructured grids, this includes an opportunity to have a next generation of PCR-GLOBWB-MODFLOW model that has flexibility in grid design for its groundwater flow simulation (e.g. grid design can be used to focus along rivers and around wells, to discretize individual
Pazzona, Federico G.; Demontis, Pierfranco; Suffritti, Giuseppe B.
2014-08-01
The adsorption isotherm for the recently proposed parallel Kawasaki (PK) lattice-gas model [Phys. Rev. E 88, 062144 (2013), 10.1103/PhysRevE.88.062144] is calculated exactly in one dimension. To do so, a third-order difference equation for the grand-canonical partition function is derived and solved analytically. In the present version of the PK model, the attraction and repulsion effects between two neighboring particles and between a particle and a neighboring empty site are ruled, respectively, by the dimensionless parameters ϕ and θ. We discuss the inflections induced in the isotherms by situations of high repulsion, the role played by finite lattice sizes in the emergence of substeps, and the adequacy of the two most widely used mean-field approximations in lattice gases, namely, the Bragg-Williams and the Bethe-Peierls approximations.
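For orientation, the exact isotherm of the textbook one-dimensional lattice gas (nearest-neighbor interactions only, with no PK dynamics; the symbols z and w below are generic fugacity and interaction parameters, not the paper's ϕ and θ) follows from a 2×2 transfer matrix rather than the paper's third-order difference equation. A sketch:

```python
import numpy as np

# Coverage isotherm of a 1D nearest-neighbor lattice gas via the classic
# 2x2 transfer matrix. This is the textbook analog of an exact 1D
# solution, NOT the PK model of the paper, whose dynamics and parameters
# differ. z = exp(beta*mu) is the fugacity; w is the Boltzmann factor of
# the nearest-neighbor interaction energy.
def coverage(z, w):
    def lam(zz):
        T = np.array([[1.0, np.sqrt(zz)],
                      [np.sqrt(zz), zz * w]])
        return np.linalg.eigvalsh(T)[-1]   # largest eigenvalue
    # coverage per site: theta = z * d(ln lambda)/dz, centered difference
    h = 1e-6 * z
    return z * (np.log(lam(z + h)) - np.log(lam(z - h))) / (2 * h)

# Non-interacting case (w = 1) reduces to Langmuir: theta = z / (1 + z)
print(round(coverage(1.0, 1.0), 4))  # -> 0.5
```

In the thermodynamic limit the grand partition function is dominated by the largest transfer-matrix eigenvalue, so the coverage is its logarithmic derivative with respect to fugacity; substeps and finite-size effects such as those discussed in the abstract only appear once the lattice length is kept finite.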
Angle- and distance-constrained matcher with parallel implementations for model-based vision
Anhalt, David J.; Raney, Steven; Severson, William E.
1992-02-01
The matching component of a model-based vision system hypothesizes one-to-one correspondences between 2D image features and locations on the 3D model. As part of Wright Laboratory's ARAGTAP program [a synthetic aperture radar (SAR) object recognition program], we developed a matcher that searches for feature matches based on the hypothesized object type and aspect angle. Search is constrained by the presumed accuracy of the hypothesized aspect angle and scale. These constraints reduce the search space for matches, thus improving match performance and quality. The algorithm is presented and compared with a matcher based on geometric hashing. Parallel implementations on commercially available shared memory MIMD machines, distributed memory MIMD machines, and SIMD machines are presented and contrasted.
A self-calibrating robot based upon a virtual machine model of parallel kinematics
DEFF Research Database (Denmark)
Pedersen, David Bue; Eiríksson, Eyþór Rúnar; Hansen, Hans Nørgaard
2016-01-01
A delta-type parallel kinematics system for Additive Manufacturing has been created, which through a probing system can recognise its geometrical deviations from nominal and compensate for these in the driving inverse kinematic model of the machine. Novelty is that this model is derived from ... a virtual machine of the kinematics system, built on principles from geometrical metrology. Relevant mathematically non-trivial deviations to the ideal machine are identified and decomposed into elemental deviations. From these deviations, a routine is added to a physical machine tool, which allows ... it to recognise its own geometry by probing the vertical offset from tool point to the machine table, at positions in the horizontal plane. After automatic calibration the positioning error of the machine tool was reduced from an initial error after its assembly of ±170 µm to a calibrated error of ±3 µm ...
Modeling of the phase lag causing fluidelastic instability in a parallel triangular tube array
Khalifa, Ahmed; Weaver, David; Ziada, Samir
2013-11-01
Fluidelastic instability is considered a critical flow induced vibration mechanism in tube and shell heat exchangers. It is believed that a finite time lag between tube vibration and fluid response is essential to predict the phenomenon. However, the physical nature of this time lag is not fully understood. This paper presents a fundamental study of this time delay using a parallel triangular tube array with a pitch ratio of 1.54. A computational fluid dynamics (CFD) model was developed and validated experimentally in an attempt to investigate the interaction between tube vibrations and flow perturbations at lower reduced velocities Ur=1-6 and Reynolds numbers Re=2000-12 000. The numerical predictions of the phase lag are in reasonable agreement with the experimental measurements for the range of reduced velocities Ug/fd=6-7. It was found that there are two propagation mechanisms; the first is associated with the acoustic wave propagation at low reduced velocities, Ur<2, and the second mechanism for higher reduced velocities is associated with the vorticity shedding and convection. An empirical model of the two mechanisms is developed and the phase lag predictions are in reasonable agreement with the experimental and numerical measurements. The developed phase lag model is then coupled with the semi-analytical model of Lever and Weaver to predict the fluidelastic stability threshold. Improved predictions of the stability boundaries for the parallel triangular array were achieved. In addition, the present study has explained why fluidelastic instability does not occur below some threshold reduced velocity.
Fast and Parallel Spectral Transform Algorithms for Global Shallow Water Models
Jakob, Ruediger
1993-01-01
This dissertation examines spectral transform algorithms for the solution of the shallow water equations on the sphere and studies their implementation and performance on shared memory vector multiprocessors. Beginning with the standard spectral transform algorithm in vorticity divergence form and its implementation in the Fortran-based parallel programming language Force, two modifications are researched. First, the transforms and matrices associated with the meridional derivatives of the associated Legendre functions are replaced by corresponding operations with the spherical harmonic coefficients. Second, based on the fast Fourier transform and the fast multipole method, a lower complexity algorithm is derived that uses fast transformations between Legendre and interior Fourier nodes, fast surface spherical truncation and a fast spherical Helmholtz solver. The first modification is fully implemented, and comparative performance data are obtained for varying resolution and number of processes, showing a significant storage saving and slightly reduced execution time on a Cray Y-MP 8/864. The important performance parameters for the spectral transform algorithm and its implementation on vector multiprocessors are determined and validated with the measured performance data. The second modification is described at the algorithmic level, but only the novel fast surface spherical truncation algorithm is implemented. This new multipole algorithm has lower complexity than the standard algorithm, and requires asymptotically only order N^2 log N operations per time step for a grid with order N^2 points. Because the global shallow water equations are similar to the horizontal dynamical component of general circulation models, the results can be applied to spectral transform numerical weather prediction and climate models. In general, the derived algorithms may speed up the solution of time dependent partial differential equations in spherical geometry. A performance model
Randles, Amanda Elizabeth
the modeling of fluids in vessels with smaller diameters and a method for introducing the deformational forces exerted on the arterial flows from the movement of the heart by borrowing concepts from cosmodynamics are presented. These additional forces have a great impact on the endothelial shear stress. Third, the fluid model is extended to not only recover Navier-Stokes hydrodynamics, but also a wider range of Knudsen numbers, which is especially important in micro- and nano-scale flows. The tradeoffs of many optimization methods such as the use of deep halo level ghost cells that, alongside hybrid programming models, reduce the impact of such higher-order models and enable efficient modeling of extreme regimes of computational fluid dynamics are discussed. Fourth, the extension of these models to other research questions like clogging in microfluidic devices and determining the severity of coarctation of the aorta is presented. Through this work, a validation of these methods by taking real patient data and the measured pressure value before the narrowing of the aorta and predicting the pressure drop across the coarctation is shown. Comparison with the measured pressure drop in vivo highlights the accuracy and potential impact of such patient-specific simulations. Finally, a method to enable the simulation of longer trajectories in time by discretizing both spatially and temporally is presented. In this method, a serial coarse iterator is used to initialize data at discrete time steps for a fine model that runs in parallel. This coarse solver is based on a larger time step and typically a coarser discretization in space. Iterative refinement enables the compute-intensive fine iterator to be modeled with temporal parallelization. The algorithm consists of a series of prediction-corrector iterations completing when the results have converged within a certain tolerance. Combined, these developments allow large fluid models to be simulated for longer time durations
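The prediction-corrector scheme described above, with a serial coarse sweep seeding independent fine solves over each time slice, is the parareal pattern. A minimal scalar sketch under the assumption of a simple decay ODE dy/dt = -y (the test problem, solvers, and slice counts are illustrative, not the dissertation's):

```python
import math

# Parareal-style time parallelization sketch for dy/dt = -y (an assumed
# test problem). A cheap coarse solver seeds the time slices serially,
# a fine solver refines each slice (these independent fine solves are
# the part that runs in parallel), and a correction sweep combines them.
def coarse(y, dt):                 # one big forward Euler step
    return y * (1.0 - dt)

def fine(y, dt, substeps=100):     # many small forward Euler steps
    h = dt / substeps
    for _ in range(substeps):
        y = y * (1.0 - h)
    return y

def parareal(y0, t_end=1.0, slices=10, iters=5):
    dt = t_end / slices
    y = [y0] * (slices + 1)
    for n in range(slices):        # serial coarse initialization
        y[n + 1] = coarse(y[n], dt)
    for _ in range(iters):
        # fine solves over each slice: independent, hence parallelizable
        f = [fine(y[n], dt) for n in range(slices)]
        g_old = [coarse(y[n], dt) for n in range(slices)]
        # serial correction sweep: y_{n+1} = G(y_n^new) + F(y_n^old) - G(y_n^old)
        for n in range(slices):
            y[n + 1] = coarse(y[n], dt) + f[n] - g_old[n]
    return y[-1]

print(abs(parareal(1.0) - math.exp(-1.0)) < 1e-3)  # -> True
```

After enough iterations the parareal result converges to the serial fine solution, which is what makes the wall-clock saving possible: the expensive fine solves happen concurrently while only the cheap coarse sweep remains serial.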
Parallel Execution of Functional Mock-up Units in Buildings Modeling
Energy Technology Data Exchange (ETDEWEB)
Ozmen, Ozgur [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Nutaro, James J. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); New, Joshua Ryan [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)
2016-06-30
A Functional Mock-up Interface (FMI) defines a standardized interface to be used in computer simulations to develop complex cyber-physical systems. FMI implementation by a software modeling tool enables the creation of a simulation model that can be interconnected, or the creation of a software library called a Functional Mock-up Unit (FMU). This report describes an FMU wrapper implementation that imports FMUs into a C++ environment and uses an Euler solver that executes FMUs in parallel using Open Multi-Processing (OpenMP). The purpose of this report is to elucidate the runtime performance of the solver when a multi-component system is imported as a single FMU (for the whole system) or as multiple FMUs (for different groups of components as sub-systems). This performance comparison is conducted using two test cases: (1) a simple, multi-tank problem; and (2) a more realistic use case based on the Modelica Buildings Library. In both test cases, the performance gains are promising when each FMU consists of a large number of states and state events that are wrapped in a single FMU. Load balancing is demonstrated to be a critical factor in speeding up parallel execution of multiple FMUs.
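The structure of the report's solver can be caricatured as follows (all names and the tank dynamics are illustrative, not ORNL's code): each component carries its own state and an independent Euler step, so the per-step loop over components is a parallel-for. OpenMP plays the role that the thread pool plays in this Python sketch:

```python
import math
from concurrent.futures import ThreadPoolExecutor

# Sketch of an Euler solver stepping several independent components in
# parallel, analogous to executing multiple FMUs with an OpenMP
# 'parallel for'. The Tank class and its dynamics are hypothetical.
class Tank:
    def __init__(self, level, outflow_coeff):
        self.level = level
        self.k = outflow_coeff

    def derivative(self):
        return -self.k * self.level      # simple draining tank

    def euler_step(self, dt):
        self.level += dt * self.derivative()

def step_all(tanks, dt, pool):
    # parallel-for over components; each task touches a distinct object
    list(pool.map(lambda t: t.euler_step(dt), tanks))

tanks = [Tank(1.0, k) for k in (0.5, 1.0, 2.0)]
with ThreadPoolExecutor() as pool:
    for _ in range(1000):
        step_all(tanks, 0.001, pool)
print([round(t.level, 3) for t in tanks])  # levels approach exp(-k)
```

The sketch also illustrates the report's load-balancing point: if one component carries far more states (work) than the others, the per-step barrier makes every step as slow as the heaviest component, so grouping components into FMUs of comparable cost matters.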
A parallel process growth model of avoidant personality disorder symptoms and personality traits.
Wright, Aidan G C; Pincus, Aaron L; Lenzenweger, Mark F
2013-07-01
Avoidant personality disorder (AVPD), like other personality disorders, has historically been construed as a highly stable disorder. However, results from a number of longitudinal studies have found that the symptoms of AVPD demonstrate marked change over time. Little is known about which other psychological systems are related to this change. Although cross-sectional research suggests a strong relationship between AVPD and personality traits, no work has examined the relationship of their change trajectories. The current study sought to establish the longitudinal relationship between AVPD and basic personality traits using parallel process growth curve modeling. Parallel process growth curve modeling was applied to the trajectories of AVPD and basic personality traits from the Longitudinal Study of Personality Disorders (Lenzenweger, M. F., 2006, The longitudinal study of personality disorders: History, design considerations, and initial findings. Journal of Personality Disorders, 20, 645-670. doi:10.1521/pedi.2006.20.6.645), a naturalistic, prospective, multiwave, longitudinal study of personality disorder, temperament, and normal personality. The focus of these analyses is on the relationship between the rates of change in both AVPD symptoms and basic personality traits. AVPD symptom trajectories demonstrated significant negative relationships with the trajectories of interpersonal dominance and affiliation, and a significant positive relationship to rates of change in neuroticism. These results provide some of the first compelling evidence that trajectories of change in PD symptoms and personality traits are linked. These results have important implications for the ways in which temporal stability is conceptualized in AVPD specifically, and PD in general.
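Parallel process growth modeling estimates the trajectories jointly as latent variables; a crude two-stage approximation, per-person OLS slopes followed by a correlation between the slopes, conveys the core idea. The data below are simulated (a shared change rate drives both the symptom and trait trajectories), not drawn from the study.

```python
import random

def slope(xs, ys):
    # ordinary least-squares slope of ys regressed on xs
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den

def pearson(a, b):
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    da = sum((x - ma) ** 2 for x in a) ** 0.5
    db = sum((x - mb) ** 2 for x in b) ** 0.5
    return num / (da * db)

random.seed(1)
waves = [0, 1, 2, 3]                       # four assessment waves
sym_slopes, trait_slopes = [], []
for _ in range(200):
    # simulated subject: symptom change and neuroticism change share a
    # common rate, so their slopes should correlate positively
    rate = random.gauss(0.0, 1.0)
    sym = [10 + rate * t + random.gauss(0, 0.3) for t in waves]
    neu = [50 + 2 * rate * t + random.gauss(0, 0.3) for t in waves]
    sym_slopes.append(slope(waves, sym))
    trait_slopes.append(slope(waves, neu))

r = pearson(sym_slopes, trait_slopes)      # correlation of rates of change
```

The latent growth model does this in one step and separates measurement error from true change, which is why the two-stage version above is only an illustration.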
Chevin, Luis-Miguel; Martin, Guillaume; Lenormand, Thomas
2010-11-01
Genetic theories of adaptation generally overlook the genes in which beneficial substitutions occur, and the likely variation in their mutational effects. We investigate the consequences of heterogeneous mutational effects among loci on the genetics of adaptation. We use a generalization of Fisher's geometrical model, which assumes multivariate Gaussian stabilizing selection on multiple characters. In our model, mutation has a distinct variance-covariance matrix of phenotypic effects for each locus. Consequently, the distribution of selection coefficients s varies across loci. We assume each locus can only affect a limited number of independent linear combinations of phenotypic traits (restricted pleiotropy), which differ among loci, an effect we term "orientation heterogeneity." Restricted pleiotropy can sharply reduce the overall proportion of beneficial mutations. Orientation heterogeneity has little impact on the shape of the genomic distribution, but can substantially increase the probability of parallel evolution (the repeated fixation of beneficial mutations at the same gene in independent populations), which is highest with low pleiotropy. We also consider variation in the degree of pleiotropy and in the mean s across loci. The latter impacts the genomic distribution of s, but has a much milder effect on parallel evolution. We discuss these results in the light of evolution experiments. © 2010 The Author(s). Evolution© 2010 The Society for the Study of Evolution.
Placati, Silvio; Guermandi, Marco; Samore, Andrea; Franchi Scarselli, Eleonora; Guerrieri, Roberto
2015-11-26
Diffuse Optical Tomography is an imaging technique that evaluates how light propagates within the human head to obtain functional information about the brain. Precision in reconstructing such a map of optical properties is strongly affected by the accuracy of the light propagation model, which needs to take into account the presence of both clear and scattering tissues. We present a numerical solver based on the radiosity-diffusion model that integrates the anatomical information provided by a structural MRI. The solver is designed to run on parallel heterogeneous platforms based on multiple GPUs and CPUs. We demonstrate how the solver provides a 7x speed-up over an isotropic-scattering parallel Monte Carlo engine based on the Radiative Transport Equation for a domain of 2 million voxels, along with a significant improvement in accuracy. The speed-up grows for larger domains, allowing us to compute the light distribution of a full human head (3 million voxels) in 116 seconds on the platform used.
Johnson, Andrew; Balash, Cheslav
2015-06-01
Numerous studies have been undertaken to improve the viability, durability and suitability of materials and methods used for aquaculture enclosures. While many of the previous studies considered macro-deformation of nets, there is a paucity of information on netting micro-deformation. When aquaculture pens are towed, industry operators have observed a motion described as "baffling": the transverse oscillation of net planes parallel and near parallel to the flow. The difficulty in observing and assessing baffling motion in a controlled experimental environment lies in sufficiently reproducing the netting boundary conditions and the flow environment experienced at sea. The focus of the present study was to develop and assess experimental methods for visualisation and quantification of these transverse oscillations. Four net-rig configurations with varied boundary conditions and model-netting properties were tested in a flume tank. While the Reynolds number was not equivalent to full scale, the use of pliable, fine-mesh model netting that enabled baffling to develop at low flow velocities was deemed of greater relevance to this initial study. Baffling was observed in the testing frame that constrained the net sheet on the leading edge, similar to a flag attached to a pole. Baffling motion increased the hydrodynamic drag of the net by 35%-58% compared with the previously developed formula for taut net sheets aligned parallel to the flow. Furthermore, it was found that the drag due to baffling decreased with increasing velocity over the studied Reynolds numbers (below 200), and the drag coefficient was non-linear for Reynolds numbers below 120. It is hypothesised that baffling motion is initially propagated by vortex shedding off the netting twine, which causes the netting to oscillate; thereafter the restoring force causes unstable pressure differences on each side of the netting, which excite the amplitude of the netting oscillations.
Modelling radiative transfer through ponded first-year Arctic sea ice with a plane-parallel model
Directory of Open Access Journals (Sweden)
T. Taskjelle
2017-09-01
Under-ice irradiance measurements were made on ponded first-year pack ice along three transects during the ICE12 expedition north of Svalbard. Bulk transmittances (400–900 nm) were found to be on average 0.15–0.20 under bare ice and 0.39–0.46 under ponded ice. Radiative transfer modelling was done with a plane-parallel model. While simulated transmittances deviate significantly from measured transmittances close to the edge of ponds, spatially averaged bulk transmittances agree well. That is, transect-average bulk transmittances, calculated using typical simulated transmittances for ponded and bare ice weighted by the fractional coverage of the two surface types, are in good agreement with the measured values. Radiative heating rates calculated from model output indicate that about 20 % of the incident solar energy is absorbed in bare ice, and 50 % in ponded ice (35 % in the pond itself, 15 % in the underlying ice). This large difference is due to the highly scattering surface scattering layer (SSL), which increases the albedo of the bare ice.
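The transect-average comparison reduces to an area-weighted mean of the typical transmittances for the two surface types. A minimal sketch, with the pond fraction and transmittance values chosen for illustration from the ranges quoted above:

```python
def transect_transmittance(t_bare, t_pond, pond_fraction):
    # area-weighted average of typical bulk transmittances for the two
    # surface types, as used to compare against the transect mean
    return pond_fraction * t_pond + (1.0 - pond_fraction) * t_bare

# illustrative numbers only: mid-range transmittances, 30% pond coverage
avg = transect_transmittance(t_bare=0.18, t_pond=0.42, pond_fraction=0.3)
```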
Balaji, V.; Benson, Rusty; Wyman, Bruce; Held, Isaac
2016-10-01
Climate models represent a large variety of processes on a variety of timescales and space scales, a canonical example of multi-physics multi-scale modeling. Current hardware trends, such as Graphical Processing Units (GPUs) and Many Integrated Core (MIC) chips, are based on, at best, marginal increases in clock speed, coupled with vast increases in concurrency, particularly at the fine grain. Multi-physics codes face particular challenges in achieving fine-grained concurrency, as different physics and dynamics components have different computational profiles, and universal solutions are hard to come by. We propose here one approach for multi-physics codes. These codes are typically structured as components interacting via software frameworks. The component structure of a typical Earth system model consists of a hierarchical and recursive tree of components, each representing a different climate process or dynamical system. This recursive structure generally encompasses a modest level of concurrency at the highest level (e.g., atmosphere and ocean on different processor sets) with serial organization underneath. We propose to extend concurrency much further by running more and more lower- and higher-level components in parallel with each other. Each component can further be parallelized on the fine grain, potentially offering a major increase in the scalability of Earth system models. We present here first results from this approach, called coarse-grained component concurrency, or CCC. Within the Geophysical Fluid Dynamics Laboratory (GFDL) Flexible Modeling System (FMS), the atmospheric radiative transfer component has been configured to run in parallel with a composite component consisting of every other atmospheric component, including the atmospheric dynamics and all other atmospheric physics components. We will explore the algorithmic challenges involved in such an approach, and present results from such simulations. Plans to achieve even greater levels of
Larour, Eric; Utke, Jean; Bovin, Anton; Morlighem, Mathieu; Perez, Gilberto
2016-11-01
Within the framework of sea-level rise projections, there is a strong need for hindcast validation of the evolution of polar ice sheets in a way that tightly matches observational records (from radar, gravity, and altimetry observations mainly). However, the computational requirements for making hindcast reconstructions possible are severe and rely mainly on the evaluation of the adjoint state of transient ice-flow models. Here, we look at the computation of adjoints in the context of the NASA/JPL/UCI Ice Sheet System Model (ISSM), written in C++ and designed for parallel execution with MPI. We present the adaptations required in the way the software is designed and written, but also generic adaptations in the tools facilitating the adjoint computations. We concentrate on the use of operator overloading coupled with the AdjoinableMPI library to achieve the adjoint computation of the ISSM. We present a comprehensive approach to (1) carry out type changing through the ISSM, hence facilitating operator overloading, (2) bind to external solvers such as MUMPS and GSL-LU, and (3) handle MPI-based parallelism to scale the capability. We demonstrate the success of the approach by computing sensitivities of hindcast metrics such as the misfit to observed records of surface altimetry on the northeastern Greenland Ice Stream, or the misfit to observed records of surface velocities on Upernavik Glacier, central West Greenland. We also provide metrics for the scalability of the approach, and the expected performance. This approach has the potential to enable a new generation of hindcast-validated projections that make full use of the wealth of datasets currently being collected, or already collected, in Greenland and Antarctica.
fast_protein_cluster: parallel and optimized clustering of large-scale protein modeling data.
Hung, Ling-Hong; Samudrala, Ram
2014-06-15
fast_protein_cluster is a fast, parallel and memory efficient package used to cluster 60 000 sets of protein models (with up to 550 000 models per set) generated by the Nutritious Rice for the World project. fast_protein_cluster is an optimized and extensible toolkit that supports Root Mean Square Deviation after optimal superposition (RMSD) and Template Modeling score (TM-score) as metrics. RMSD calculations using a laptop CPU are 60× faster than qcprot and 3× faster than current graphics processing unit (GPU) implementations. New GPU code further increases the speed of RMSD and TM-score calculations. fast_protein_cluster provides novel k-means and hierarchical clustering methods that are up to 250× and 2000× faster, respectively, than Clusco, and identify significantly more accurate models than Spicker and Clusco. fast_protein_cluster is written in C++ using OpenMP for multi-threading support. Custom streaming Single Instruction Multiple Data (SIMD) extensions and advanced vector extension intrinsics code accelerate CPU calculations, and OpenCL kernels support AMD and Nvidia GPUs. fast_protein_cluster is available under the M.I.T. license. (http://software.compbio.washington.edu/fast_protein_cluster) © The Author 2014. Published by Oxford University Press.
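As a sketch of the kind of clustering involved (not the package's optimized code), the following runs a k-medoids loop, a simple relative of the k-means and hierarchical methods mentioned, under a plain-coordinate RMSD that skips the optimal superposition fast_protein_cluster performs. The synthetic "models" are invented point sets.

```python
import random

def rmsd(a, b):
    # plain coordinate RMSD; the package superposes structures first
    n = len(a)
    return (sum((ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
                for (ax, ay, az), (bx, by, bz) in zip(a, b)) / n) ** 0.5

def kmedoids(models, k, iters=20, seed=0):
    rng = random.Random(seed)
    medoids = rng.sample(range(len(models)), k)
    for _ in range(iters):
        # assignment step: each model joins its nearest medoid
        clusters = {m: [] for m in medoids}
        for i, mod in enumerate(models):
            best = min(medoids, key=lambda m: rmsd(mod, models[m]))
            clusters[best].append(i)
        # update step: new medoid minimizes total within-cluster distance
        new = []
        for m, members in clusters.items():
            best = min(members, key=lambda i: sum(
                rmsd(models[i], models[j]) for j in members))
            new.append(best)
        if sorted(new) == sorted(medoids):
            break
        medoids = new
    return medoids, clusters

# two synthetic blobs of decoy "structures", 5 atoms each
data_rng = random.Random(1)
def fake_model(center):
    return [tuple(center + data_rng.gauss(0, 0.1) for _ in range(3))
            for _ in range(5)]

models = [fake_model(0.0) for _ in range(5)] + [fake_model(10.0) for _ in range(5)]
medoids, clusters = kmedoids(models, 2)
```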
Examining Mechanisms Underlying Fear-Control in the Extended Parallel Process Model.
Quick, Brian L; LaVoie, Nicole R; Reynolds-Tylus, Tobias; Martinez-Gonzalez, Andrea; Skurka, Chris
2017-01-17
This investigation sought to advance the extended parallel process model in important ways by testing associations among the strengths of efficacy and threat appeals with fear as well as two outcomes of fear-control processing, psychological reactance and message minimization. Within the context of print ads admonishing against noise-induced hearing loss (NIHL) and the fictitious Trepidosis virus, partial support was found for the additive model with no support for the multiplicative model. High efficacy appeals mitigated freedom threat perceptions across both contexts. Fear was positively associated with both freedom threat perceptions within the NIHL context and favorable attitudes for both NIHL and Trepidosis virus contexts. In line with psychological reactance theory, a freedom threat was positively associated with psychological reactance. Reactance, in turn, was positively associated with message minimization. The models supported reactance preceding message minimization across both message contexts. Both the theoretical and practical implications are discussed with an emphasis on future research opportunities within the fear-appeal literature.
Fuchs, A.; Androsov, A.; Harig, S.; Hiller, W.; Rakowsky, N.
2012-04-01
Given the danger of devastating tsunamis and the unpredictability of such events, tsunami modelling as part of warning systems remains a contemporary topic. The tsunami group of the Alfred Wegener Institute developed the simulation tool TsunAWI as a contribution to the Early Warning System in Indonesia. Although the precomputed scenarios for this purpose are satisfactory deliverables, the study of further improvements continues. While TsunAWI is governed by the Shallow Water Equations, an extension of the model is based on a nonhydrostatic approach. At the arrival of a tsunami wave in coastal regions with rough bathymetry, the term containing the nonhydrostatic part of the pressure, which is neglected in the original hydrostatic model, gains in importance. Taking this term into account, a better approximation of the wave is expected. Differences between hydrostatic and nonhydrostatic model results are contrasted in the standard benchmark problem of a solitary wave runup on a plane beach. The observation data provided by Titov and Synolakis (1995) serve as reference. The nonhydrostatic approach implies a set of equations that are similar to the Shallow Water Equations, so the variation of the code can be implemented on top. However, these additional routines introduce a number of issues that must be addressed. Until now, the computations of the model were purely explicit. In the nonhydrostatic version, the determination of an additional unknown and the solution of a large sparse system of linear equations are necessary. The latter constitutes the lion's share of computing time and memory requirements. Since the corresponding matrix is symmetric only in structure and not in values, an iterative Krylov subspace method is used, in particular the restarted Generalized Minimal Residual algorithm GMRES(m). With regard to optimization, we present a comparison of several combinations of sequential and parallel preconditioning techniques with respect to the number of iterations and setup
Three-dimensional electromagnetic modeling and inversion on massively parallel computers
Energy Technology Data Exchange (ETDEWEB)
Newman, G.A.; Alumbaugh, D.L. [Sandia National Labs., Albuquerque, NM (United States). Geophysics Dept.
1996-03-01
This report has demonstrated techniques that can be used to construct solutions to the 3-D electromagnetic inverse problem using full wave equation modeling. To this point great progress has been made in developing an inverse solution using the method of conjugate gradients, which employs a 3-D finite difference solver to construct model sensitivities and predicted data. The forward modeling code has been developed to incorporate absorbing boundary conditions for high frequency solutions (radar), as well as complex electrical properties, including electrical conductivity, dielectric permittivity and magnetic permeability. In addition, both forward and inverse codes have been ported to a massively parallel computer architecture, which allows for more realistic solutions than can be achieved with serial machines. While the inversion code has been demonstrated on field data collected at the Richmond field site, techniques for appraising the quality of the reconstructions still need to be developed. Here it is suggested that, rather than employing direct matrix inversion to construct the model covariance matrix, which would be impossible because of the size of the problem, one can linearize about the 3-D model achieved in the inversion and use Monte Carlo simulations to construct it. Using these appraisal and construction tools, it is now necessary to demonstrate 3-D inversion for a variety of EM data sets that span the frequency range from induction sounding to radar: below 100 kHz to 100 MHz. Appraised 3-D images of the earth's electrical properties can provide researchers opportunities to infer the flow paths, flow rates and perhaps the chemistry of fluids in geologic media. It also offers a means to study the frequency-dependent behavior of the properties in situ. This is of significant relevance to the Department of Energy, being paramount to the characterization and monitoring of environmental waste sites and to oil and gas exploration.
Juang, Hann-Ming Henry; Tao, Wei-Kuo; Zeng, Xi-Ping; Shie, Chung-Lin; Simpson, Joanne; Lang, Steve
2004-01-01
The capability for massively parallel programming (MPP) using a message passing interface (MPI) has been implemented in a three-dimensional version of the Goddard Cumulus Ensemble (GCE) model. The design of the MPP with MPI maintains a similar code structure between the whole domain and the portions after decomposition. Hence the model follows the same integration for single and multiple tasks (CPUs). It also requires minimal changes to the original code, so it is easily modified and/or managed by the model developers and users who have little knowledge of MPP. The entire model domain can be sliced into a one- or two-dimensional decomposition with a halo regime overlaid on the partial domains. The halo regime requires that no data be fetched across tasks during the computational stage, but it must be updated before the next computational stage through data exchange via MPI. For reproducibility, transposing data among tasks is required for the spectral transform (Fast Fourier Transform, FFT), which is used in the anelastic version of the model for solving the pressure equation. The performance of the MPI-implemented codes (i.e., the compressible and anelastic versions) was tested on three different computing platforms. The major results are: 1) both versions achieve speedups of about 99% of ideal up to 256 tasks, but not for 512 tasks; 2) the anelastic version has better speedup and efficiency because it requires more computation than the compressible version; 3) equal or approximately equal numbers of slices in the x- and y-directions provide the fastest integration due to fewer data exchanges; and 4) one-dimensional slices in the x-direction result in the slowest integration due to the need for more memory relocation during computation.
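The halo discipline described above, exchange once between computational stages and then compute on purely local data, can be sketched in a few lines. This serial Python toy emulates four tasks on a periodic 1-D diffusion problem; a real implementation would replace `exchange_halos` with MPI sends and receives.

```python
def exchange_halos(subdomains):
    # update one-cell halos from the neighbouring task's interior,
    # mimicking the MPI exchange between computational stages
    n = len(subdomains)
    for i, d in enumerate(subdomains):
        d[0] = subdomains[(i - 1) % n][-2]   # left halo <- left neighbour
        d[-1] = subdomains[(i + 1) % n][1]   # right halo <- right neighbour

def diffuse_step(d, alpha=0.25):
    # interior update touches only local data plus halos: no remote
    # fetches are needed during the computational stage
    interior = [d[j] + alpha * (d[j - 1] - 2 * d[j] + d[j + 1])
                for j in range(1, len(d) - 1)]
    d[1:-1] = interior

# global field of 16 cells split across 4 "tasks", each padded with halos
field = [float(j == 8) for j in range(16)]   # a single spike
tasks = []
for r in range(4):
    chunk = field[r * 4:(r + 1) * 4]
    tasks.append([0.0] + chunk + [0.0])

for _ in range(10):
    exchange_halos(tasks)                    # update before computing
    for d in tasks:
        diffuse_step(d)

total = sum(sum(d[1:-1]) for d in tasks)     # diffusion conserves the sum
```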
Erdmann, Thorsten; Schwarz, Ulrich S
2013-01-01
Non-processive molecular motors have to work together in ensembles in order to generate appreciable levels of force or movement. In skeletal muscle, for example, hundreds of myosin II molecules cooperate in thick filaments. In non-muscle cells, by contrast, small groups with few tens of non-muscle myosin II motors contribute to essential cellular processes such as transport, shape changes or mechanosensing. Here we introduce a detailed and analytically tractable model for this important situation. Using a three-state crossbridge model for the myosin II motor cycle and exploiting the assumptions of fast power stroke kinetics and equal load sharing between motors in equivalent states, we reduce the stochastic reaction network to a one-step master equation for the binding and unbinding dynamics (parallel cluster model) and derive the rules for ensemble movement. We find that for constant external load, ensemble dynamics is strongly shaped by the catch bond character of myosin II, which leads to an increase of th...
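The one-step master equation for the number of bound motors can be sampled directly with a Gillespie loop. The sketch below uses constant binding and unbinding rates, whereas the parallel cluster model makes the unbinding rate load-dependent (the catch bond); the rates, ensemble size and simulation length are arbitrary choices for illustration.

```python
import random

def simulate_bound_fraction(N, k_on, k_off, t_end, seed=0):
    # Gillespie simulation of the one-step master equation for the number
    # i of bound motors: i -> i+1 at rate (N-i)*k_on, i -> i-1 at i*k_off
    rng = random.Random(seed)
    t, i = 0.0, 0
    acc = 0.0                     # time integral of i
    while t < t_end:
        up = (N - i) * k_on
        down = i * k_off
        total = up + down
        dt = rng.expovariate(total)
        acc += i * min(dt, t_end - t)
        t += dt
        if t >= t_end:
            break
        i = i + 1 if rng.random() < up / total else i - 1
    return acc / (t_end * N)      # time-averaged bound fraction

frac = simulate_bound_fraction(N=10, k_on=1.0, k_off=1.0, t_end=2000.0)
# with constant rates the steady state is binomial with mean k_on/(k_on+k_off)
```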
Performance modeling and analysis of parallel Gaussian elimination on multi-core computers
Directory of Open Access Journals (Sweden)
Fadi N. Sibai
2014-01-01
Gaussian elimination is used in many applications, in particular in the solution of systems of linear equations. This paper presents mathematical performance models and analysis of four parallel Gaussian elimination methods (precisely, the Original method and the new Meet in the Middle (MiM) algorithms, and their variants with SIMD vectorization) on multi-core systems. Analytical performance models of the four methods are formulated and presented, followed by evaluations of these models with modern multi-core systems' operation latencies. Our results reveal that the four methods generally exhibit good performance scaling with increasing matrix size and number of cores. SIMD vectorization only makes a large difference in performance for a low number of cores. For a large matrix size (n ⩾ 16 K), the performance difference between the MiM and Original methods falls from 16× with four cores to 4× with 16 K cores. The efficiencies of all four methods are low with 1 K cores or more, underscoring a major problem of multi-core systems where the network-on-chip and memory latencies are too high in relation to basic arithmetic operations. Thus Gaussian elimination can greatly benefit from the resources of multi-core systems, but higher performance gains can be achieved if multi-core systems can be designed with lower memory operation, synchronization, and interconnect communication latencies; these requirements are of utmost importance, and a major challenge, in the exascale computing age.
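The paper's analytical models are not reproduced here, but the shape of such an analysis, arithmetic cost shrinking with core count while synchronization cost grows, can be sketched with made-up constants (`flop_time`, `comm_latency` and the sqrt(p) reduction term are all assumptions, not the paper's model):

```python
def model_time(n, p, flop_time=1e-9, comm_latency=1e-6):
    # hypothetical cost model: O(n^3/p) arithmetic for Gaussian elimination
    # plus a per-pivot-step synchronization cost that grows with core count
    compute = (n ** 3) * flop_time / p
    sync = n * comm_latency * p ** 0.5   # one reduction per pivot step
    return compute + sync

def speedup(n, p):
    return model_time(n, 1) / model_time(n, p)

def efficiency(n, p):
    return speedup(n, p) / p
```

Even this toy model reproduces the qualitative finding: efficiency degrades as core count grows because communication and synchronization latencies dwarf the per-core arithmetic.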
A model of saccade generation based on parallel processing and competitive inhibition.
Findlay, J M; Walker, R
1999-08-01
During active vision, the eyes continually scan the visual environment using saccadic scanning movements. This target article presents an information processing model for the control of these movements, with some close parallels to established physiological processes in the oculomotor system. Two separate pathways are concerned with the spatial and the temporal programming of the movement. In the temporal pathway there is spatially distributed coding and the saccade target is selected from a "salience map." Both pathways descend through a hierarchy of levels, the lower ones operating automatically. Visual onsets have automatic access to the eye control system via the lower levels. Various centres in each pathway are interconnected via reciprocal inhibition. The model accounts for a number of well-established phenomena in target-elicited saccades: the gap effect, express saccades, the remote distractor effect, and the global effect. High-level control of the pathways in tasks such as visual search and reading is discussed; it operates through spatial selection and search selection, which generally combine in an automated way. The model is examined in relation to data from patients with unilateral neglect.
Energy Technology Data Exchange (ETDEWEB)
Johnson, Brian B [National Renewable Energy Laboratory (NREL), Golden, CO (United States)
2017-08-31
Given that next-generation infrastructures will contain large numbers of grid-connected inverters and these interfaces will be satisfying a growing fraction of system load, it is imperative to analyze the impacts of power electronics on such systems. However, since each inverter model has a relatively large number of dynamic states, it would be impractical to execute complex system models where the full dynamics of each inverter are retained. To address this challenge, we derive a reduced-order structure-preserving model for parallel-connected grid-tied three-phase inverters. Here, each inverter in the system is assumed to have a full-bridge topology, LCL filter at the point of common coupling, and the control architecture for each inverter includes a current controller, a power controller, and a phase-locked loop for grid synchronization. We outline a structure-preserving reduced-order inverter model for the setting where the parallel inverters are each designed such that the filter components and controller gains scale linearly with the power rating. By structure preserving, we mean that the reduced-order three-phase inverter model is also composed of an LCL filter, a power controller, current controller, and PLL. That is, we show that the system of parallel inverters can be modeled exactly as one aggregated inverter unit and this equivalent model has the same number of dynamical states as an individual inverter in the paralleled system. Numerical simulations validate the reduced-order models.
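A sketch of the aggregation argument under the linear power-rating scaling (the base values `l_base`, `c_base`, `r_base` are arbitrary placeholders): the parallel combination of the scaled filters equals the filter of a single inverter rated at the total power, which is why the aggregate model keeps the same structure and state count as one inverter.

```python
def scaled_params(p_rating, l_base=1e-3, c_base=1e-6, r_base=0.1):
    # per-inverter filter parameters under the assumed linear scaling:
    # series impedances shrink, shunt capacitance grows with the rating
    return {
        "P": p_rating,
        "L": l_base / p_rating,
        "C": c_base * p_rating,
        "R": r_base / p_rating,
    }

def aggregate(inverters):
    # N parallel inverters collapse to one equivalent unit whose rating is
    # the sum of the individual ratings; with the scaling above this is
    # exactly scaled_params(P_total)
    p_total = sum(inv["P"] for inv in inverters)
    return scaled_params(p_total)

fleet = [scaled_params(p) for p in (1.0, 2.0, 5.0)]
agg = aggregate(fleet)
```

The consistency check is that the circuit-level parallel combination (inductors in parallel, capacitors summed) gives the same numbers as the aggregate rating.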
DEFF Research Database (Denmark)
Gunabalan, R.; Sanjeevikumar, P.; Blaabjerg, Frede
2015-01-01
This paper presents the transfer function modeling and stability analysis of two induction motors of the same ratings and parameters connected in parallel. The induction motors are controlled by a single inverter, and the entire drive system is modeled using transfer functions in LabVIEW. Further, the ...
Dauger, Dean Edward
2001-08-01
We have succeeded in building a code that models many-particle dynamic quantum systems by combining a semiclassical approximation of Feynman path integrals with parallel computing techniques (particle-in-cell) and numerical methods developed for simulating plasmas, establishing this approach as a viable technique for multiparticle time-dependent quantum mechanics. Run on high-performance parallel computers, this code applies semiclassical methods to simulate the time evolution of wavefunctions of many particles. We describe the analytical derivation and computational implementation of these techniques in detail. We present a study that thoroughly demonstrates the code's fidelity to quantum mechanics, resulting in innovative visualization and analysis techniques. We introduce and exhibit a method to address fermion particle statistics. We present studies of two quantum-mechanical problems: a two-electron, one-dimensional atom, resulting in high-quality extractions of one- and two-electron eigenstates, and electrostatic quasi-modes due to quantum effects in a hot electron plasma, relevant for predictions about stellar evolution. We supply discussions of alternative derivations, alternative implementations of the derivations, and an exploration of their consequences. Source code is shown throughout this dissertation. Finally, we present an extensive discussion of applications and extrapolations of this work, with suggestions for future directions.
3D multiphysics modeling of superconducting cavities with a massively parallel simulation suite
Directory of Open Access Journals (Sweden)
Oleksiy Kononenko
2017-10-01
Radiofrequency cavities based on superconducting technology are widely used in particle accelerators for various applications. The cavities usually have high quality factors and hence narrow bandwidths, so the field stability is sensitive to detuning from the Lorentz force and external loads, including vibrations and helium pressure variations. If not properly controlled, the detuning can result in a serious performance degradation of a superconducting accelerator, so an understanding of the underlying detuning mechanisms can be very helpful. Recent advances in the simulation suite ace3p have enabled realistic multiphysics characterization of such complex accelerator systems on supercomputers. In this paper, we present the new capabilities in ace3p for large-scale 3D multiphysics modeling of superconducting cavities, in particular, a parallel eigensolver for determining mechanical resonances, a parallel harmonic response solver to calculate the response of a cavity to external vibrations, and a numerical procedure to decompose mechanical loads, such as from the Lorentz force or piezoactuators, into the corresponding mechanical modes. These capabilities have been used to do an extensive rf-mechanical analysis of dressed TESLA-type superconducting cavities. The simulation results and their implications for the operational stability of the Linac Coherent Light Source-II are discussed.
Directory of Open Access Journals (Sweden)
Brian B. Mozaffari
2014-11-01
Based on the notion that the brain is equipped with a hierarchical organization, which embodies environmental contingencies across many time scales, this paper suggests that the medial temporal lobe (MTL), located deep in the hierarchy, serves as a bridge connecting supra- to infra-MTL levels. Bridging the upper and lower regions of the hierarchy provides a parallel architecture that optimizes information flow between upper and lower regions to aid attention, encoding, and processing of quick complex visual phenomena. Bypassing intermediate hierarchy levels, information conveyed through the MTL 'bridge' allows upper levels to make educated predictions about the prevailing context and accordingly select lower representations to increase the efficiency of predictive coding throughout the hierarchy. This selection or activation/deactivation is associated with endogenous attention. In the event that these 'bridge' predictions are inaccurate, this architecture enables the rapid encoding of novel contingencies. A review of hierarchical models in relation to memory is provided along with a new theory, Medial-temporal-lobe Conduit for Parallel Connectivity (MCPC). In this scheme, consolidation is considered a secondary process, occurring after a MTL-bridged connection, which eventually allows upper and lower levels to access each other directly. With repeated reactivations, as contingencies become consolidated, less MTL activity is predicted. Finally, MTL bridging may aid the processing of transient but structured perceptual events by allowing communication between upper and lower levels without calling on intermediate levels of representation.
MODELING AND CONTROLLING OF PARALLEL MANIPULATOR JOINT DRIVEN BY PNEUMATIC MUSCLES
Institute of Scientific and Technical Information of China (English)
Tao Guoliang; Zhu Xiaocong; Cao Jian
2005-01-01
A parallel manipulator joint driven by three pneumatic muscles and its posture control strategy are presented. Based on geometric constraints and dynamics, a system model is developed, through which the influences of the supply pressure, the initial pressure, and the volume of the pneumatic muscle on the dynamic response and open-loop gain are analyzed. A sliding-mode controller with a nonlinear switching function is applied to control posture. It combines a main method that controls each muscle separately with an auxiliary method that evaluates the posture error across the muscles, and in particular adopts segmented, intelligent adjustment of the sliding-mode parameters to fit different expected postures and initial states. Experimental results show that this control strategy not only achieves a steady-state error of 0.1° without overshoot, but also achieves good trajectory tracking.
Parallelization of a Three-Dimensional Shallow-Water Estuary Model on the KSR-1
Directory of Open Access Journals (Sweden)
C. Falcó Korn
1995-01-01
Full Text Available Flows in estuarial and coastal regions may be described by the shallow-water equations. The processes of pollution transport, sediment transport, and plume dispersion are driven by the underlying hydrodynamics. Accurate resolution of these processes requires a three-dimensional formulation with turbulence modeling, which is very demanding computationally. A numerical scheme has been developed which is both stable and accurate – we show that this scheme is also well suited to parallel processing, making the solution of massive complex problems a practical computing possibility. We describe the implementation of the numerical scheme on a Kendall Square Research KSR-1 multiprocessor, and present experimental results which demonstrate that a problem requiring 600,000 mesh points and 6,000 time steps can be solved in under 8 hours using 32 processors.
A Parallel Decision Model Based on Support Vector Machines and Its Application to Fault Diagnosis
Institute of Scientific and Technical Information of China (English)
Yan Weiwu(阎威武); Shao Huihe
2004-01-01
Many industrial process systems are becoming increasingly complex and are characterized by distributed features. To keep such a system in working order, distributed parameter values are inspected at subsystems or at different points in order to judge the working condition of the system and make global decisions. In this paper, a parallel decision model based on Support Vector Machines (PDMSVM) is introduced and applied to distributed fault diagnosis in industrial processes. PDMSVM is convenient for information fusion in distributed systems and performs well in fault diagnosis with distributed features. It makes decisions based on the synthesized information of the subsystems and takes advantage of the Support Vector Machine. Decisions made by PDMSVM are therefore highly reliable and accurate.
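The decision-fusion idea in this abstract can be sketched in a few lines: each subsystem classifier votes on the global condition, and a fusion stage combines the votes. This is an illustrative stand-in, not the paper's implementation; the linear score functions below merely mimic trained SVM decision functions, and all weights and measurements are made up.

```python
def subsystem_decision(weights, bias, features):
    """Stand-in for an SVM decision function: sign of a linear score."""
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if score >= 0 else -1  # +1 = normal, -1 = faulty

def fuse_decisions(decisions):
    """Majority vote over subsystem decisions (synthetic information fusion)."""
    return 1 if sum(decisions) >= 0 else -1

# Three subsystems inspect different measurement points in parallel.
subsystems = [((0.8, -0.2), 0.1), ((0.5, 0.5), -0.3), ((-0.1, 0.9), 0.0)]
measurements = [(1.0, 0.2), (0.4, 0.8), (0.3, -0.9)]

votes = [subsystem_decision(w, b, x) for (w, b), x in zip(subsystems, measurements)]
print(fuse_decisions(votes))  # global condition from fused votes
```

In practice each subsystem's decision function would be trained on its own local measurements, and only the scalar votes (or scores) need to be communicated to the fusion stage.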
Bentonite electrical conductivity: a model based on series–parallel transport
Lima, Ana T.
2010-01-30
Bentonite has significant applications nowadays, among them as landfill liners, as a repair material in the concrete industry, and as drilling mud in oil well construction. The application of an electric field to such settings is under wide discussion and is the subject of many studies. However, to understand the behaviour of such an expansive and plastic material under the influence of an electric field, knowledge of its electrical properties is essential. This work compares existing data on such electrical behaviour with new laboratory results. Electrical conductivity is a pertinent parameter since it indicates how readily a material conducts electricity. In the current study, the total conductivity of a compacted porous medium was established to depend on the density of the bentonite plug. Therefore, surface conductivity was addressed, and a series-parallel transport model was used to quantify and predict the total conductivity of the system. © The Author(s) 2010.
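A generic series-parallel mixing rule of the kind referred to here combines conduction pathways in two limiting arrangements: in parallel the conductivities add (weighted by volume fraction), while in series the resistivities add. A minimal sketch, with made-up fractions and conductivities (the paper's actual parameterization is not reproduced):

```python
def parallel_conductivity(fractions, sigmas):
    """Pathways side by side: weighted conductivities add."""
    return sum(f * s for f, s in zip(fractions, sigmas))

def series_conductivity(fractions, sigmas):
    """Pathways end to end: weighted resistivities (1/sigma) add."""
    return 1.0 / sum(f / s for f, s in zip(fractions, sigmas))

# Illustrative numbers: pore fluid at 0.1 S/m, surface pathway at 0.5 S/m.
fractions, sigmas = [0.6, 0.4], [0.1, 0.5]
print(parallel_conductivity(fractions, sigmas))  # upper (parallel) bound
print(series_conductivity(fractions, sigmas))    # lower (series) bound
```

Any real series-parallel model sits between these two bounds, with the weighting chosen to match the measured dependence on, e.g., plug density.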
MATTER-ELEMENT MODELING OF PARALLEL STRUCTURE AND APPLICATION ABOUT EXTENSION PID CONTROL SYSTEM
Institute of Scientific and Technical Information of China (English)
Rongde LU; Zonghai CHEN
2006-01-01
This article describes in detail a new method for tuning the parameters of an extension PID controller via the extension predictive algorithm of a parallel-structure matter-element model. In comparison with fuzzy and extension PID controllers, the proposed extension PID predictive controller shows higher control gains when the system states are away from equilibrium, while retaining a lower profile of control signals; consequently, better control performance is achieved. Through the proposed tuning formula, the weighting factors of an extension-logic predictive controller can be systematically selected according to the controlled plant. An experimental example based on industrial field data and site engineers' experience demonstrates the superior performance of the proposed controller over a fuzzy controller.
Chen, Yu-Yi; Juang, Jia-Yang
2016-07-01
The standard collinear four-point probe method is an indispensable tool and has been extensively used for characterizing conductive thin films with homogeneous and isotropic electrical properties. In this paper, we conduct three-dimensional (3D) finite element simulations on conductive multilayer films to study the relationship between the reading of the four-point probe and the conductivity of the individual layers. We find that a multilayer film may be modeled as a simple equivalent circuit with multiple resistances, connected in parallel for a wide range of resistivity and thickness ratios, as long as its total thickness is smaller than approximately half of the probe spacing. As a result, we may determine the resistivity of each layer sequentially by applying the four-point probe, with the original correction factor π/ln(2), after deposition of each layer.
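The parallel-resistance picture described above implies that the sheet conductances of the layers add, so each layer's resistivity can be recovered from four-point probe readings taken before and after its deposition. A sketch under that assumption (function names and all numbers are illustrative):

```python
import math

def sheet_resistance(voltage, current):
    """Collinear four-point probe reading with the correction factor pi/ln(2)."""
    return (math.pi / math.log(2)) * voltage / current

def layer_resistivity(thickness, rs_after, rs_before=None):
    """Recover one layer's resistivity from the parallel-resistance stack.

    rs_before is the sheet resistance measured before this layer was
    deposited (None for the first layer on an insulating substrate)."""
    g_before = 0.0 if rs_before is None else 1.0 / rs_before
    return thickness / (1.0 / rs_after - g_before)

# Illustrative stack: 100 nm at 2e-8 ohm*m, then 50 nm at 1e-7 ohm*m.
rs1 = 2e-8 / 100e-9                     # 0.2 ohm/sq after layer 1
rs2 = 1.0 / (1.0 / rs1 + 50e-9 / 1e-7)  # layers conduct in parallel
print(layer_resistivity(50e-9, rs2, rs1))  # recovers layer 2's resistivity
```

This sequential peeling is valid only within the regime the paper identifies, i.e. while the total film thickness stays well below the probe spacing.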
Wall shear stress measurement method based on parallel flow model near vascular wall in echography
Shimizu, Motochika; Tanaka, Tomohiko; Okada, Takashi; Seki, Yoshinori; Nishiyama, Tomohide
2017-07-01
A high-risk vessel of arteriosclerosis is detected by assessing wall shear stress (WSS), which is calculated from the distribution of velocity in a blood flow. A novel echographic method for measuring WSS, which aims to distinguish a normal vessel from a high-risk vessel, is proposed. To achieve this aim, the measurement error should be less than 28.8%. The proposed method is based on a flow model for the area near a vascular wall under a parallel-flow assumption to avoid the influences of error factors. This was verified by an in vitro experiment in which the WSS of a carotid artery phantom was measured. According to the experimental results, the WSS measured by the proposed method correlated with the ground truth measured by particle image velocimetry; in particular, the correlation coefficient and measurement error between them were respectively 0.70 and 27.4%. The proposed method achieved the target measurement performance.
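Under the parallel-flow assumption, WSS reduces to the dynamic viscosity times the wall-normal velocity gradient, which can be estimated from near-wall velocity samples. A minimal sketch; the paper's echographic estimator is more involved, and the viscosity and sample values here are illustrative:

```python
def wall_shear_stress(mu, y, u):
    """Estimate WSS as mu * du/dy from a least-squares linear fit of near-wall
    velocity samples. Parallel-flow assumption: u varies only with the
    wall-normal coordinate y."""
    n = len(y)
    ybar, ubar = sum(y) / n, sum(u) / n
    slope = (sum((yi - ybar) * (ui - ubar) for yi, ui in zip(y, u))
             / sum((yi - ybar) ** 2 for yi in y))
    return mu * slope

# Illustrative values: mu ~ 0.004 Pa.s (blood), samples at 0.2-0.6 mm.
tau = wall_shear_stress(0.004, [0.0002, 0.0004, 0.0006], [0.02, 0.04, 0.06])
print(tau)  # Pa
```

The least-squares fit damps measurement noise in the individual velocity samples, which matters given the paper's 28.8% error target.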
Institute of Scientific and Technical Information of China (English)
Shung Han Cho; Yuntai Kyong; Sangjin Hong; We-Duke Cho
2009-01-01
This paper presents a novel self-localization method using a parallel projection model for mobile sensors in navigation applications. The algorithm estimates the coordinates and orientation of the mobile sensor from features projected on the visual image. The proposed method accounts for the lens non-linearity of the camera and compensates for the distortion using a calibration table. It determines the coordinates and orientation through an iterative process that is very accurate yet has low computational demand. We identify various sources of error in the coordinate and orientation estimates, and present both a static sensitivity analysis of the algorithm and the dynamic behavior of the mobile sensor. The algorithm can be utilized in mobile robot navigation as well as in positioning applications where accurate self-localization is necessary.
On Predictive Modeling for Optimizing Transaction Execution in Parallel OLTP Systems
Pavlo, Andrew; Zdonik, Stanley
2011-01-01
A new emerging class of parallel database management systems (DBMS) is designed to take advantage of the partitionable workloads of on-line transaction processing (OLTP) applications. Transactions in these systems are optimized to execute to completion on a single node in a shared-nothing cluster without needing to coordinate with other nodes or use expensive concurrency control measures. But some OLTP applications cannot be partitioned such that all of their transactions execute within a single-partition in this manner. These distributed transactions access data not stored within their local partitions and subsequently require more heavy-weight concurrency control protocols. Further difficulties arise when the transaction's execution properties, such as the number of partitions it may need to access or whether it will abort, are not known beforehand. The DBMS could mitigate these performance issues if it is provided with additional information about transactions. Thus, in this paper we present a Markov model...
Directory of Open Access Journals (Sweden)
Gianni Castelli
2010-01-01
Full Text Available This paper presents results on the modelling, simulation and experimental testing of a cable-based parallel manipulator to be used as an aiding or guiding system for people with motion disabilities. People with a motion disability and the elderly are highly motivated to perform basic daily-living activities independently, so it is of great interest to design and implement safe and reliable motion assisting and guiding devices that are able to help end-users. In general, a robot for a medical application should be able to interact with a patient under safe conditions: it must not harm people or its surroundings, and it must be designed to guarantee high accuracy and low acceleration during operation. Furthermore, it should not be too bulky and it should exert limited wrenches during close interaction with people. It is advisable to have a portable system which can be easily brought into and assembled in a hospital or a domestic environment. Cable-based robotic structures can fulfil those requirements because their main characteristics make them light and intrinsically safe. In this paper, a reconfigurable four-cable-based parallel manipulator is proposed as a motion assisting and guiding device to help people accomplish a number of tasks, such as moving the upper and lower limbs or the whole body. Modelling and simulation are presented in the ADAMS environment, and experimental tests based on an available laboratory prototype are reported.
Global dynamic modeling of electro-hydraulic 3-UPS/S parallel stabilized platform by bond graph
Zhang, Lijie; Guo, Fei; Li, Yongquan; Lu, Wenjuan
2016-08-01
Dynamic modeling of a parallel manipulator (PM) is an important issue. A complete PM system is actually composed of multiple physical domains, and as PMs are widely used in various fields, modeling the global dynamics of the PM system becomes increasingly important; yet research on global dynamic modeling remains limited. A unified modeling approach for a multi-energy-domain PM system is proposed based on bond graphs, and a global dynamic model of the 3-UPS/S parallel stabilized platform, involving mechanical and electro-hydraulic elements, is built. Firstly, screw bond graph theory is improved on the basis of screw theory, the modular joint model is built, and the normalized dynamic model of the mechanism is established. Secondly, combined with the electro-hydraulic servo system model built by a traditional bond graph, the global dynamic model of the system is obtained, from which the motion, force and power of any element can be obtained directly. Lastly, experiments and simulations of the driving forces, pressure and flow are performed. The theoretical driving forces agree with the experimental ones, and the pressure and flow of the first and third limbs are symmetric with each other. These results verify the correctness and effectiveness of the model and the method. The proposed dynamic modeling method provides a reference for modeling other multi-energy-domain systems containing complex PMs.
Ferrucci, Filomena; Salza, Pasquale; Sarro, Federica
2017-06-29
The need to improve the scalability of Genetic Algorithms (GAs) has motivated research on Parallel Genetic Algorithms (PGAs), and different technologies and approaches have been used. Hadoop MapReduce represents one of the most mature technologies for developing parallel algorithms. Since parallel algorithms introduce communication overhead, the aim of the present work is to understand if, and possibly when, parallel GA solutions using Hadoop MapReduce perform better than sequential versions in terms of execution time. Moreover, we are interested in understanding which PGA model is most effective among the global, grid and island models. We empirically assessed the performance of these three parallel models against a sequential GA on a software engineering problem, evaluating execution time and the achieved speedup. We also analysed the behaviour of the parallel models in relation to the overhead produced by the use of Hadoop MapReduce and to the GAs' computational effort, which gives a more machine-independent measure of these algorithms. We exploited three problem instances to differentiate the computational load and three cluster configurations based on 2, 4 and 8 parallel nodes. Moreover, we estimated the costs of executing the experiments on a potential cloud infrastructure, based on the pricing of the major commercial cloud providers. The empirical study revealed that the PGA based on the island model outperforms the other parallel models and the sequential GA for all the considered instances and clusters. Using 2, 4 and 8 nodes, the island model achieves average speedups over the three datasets of 1.8×, 3.4× and 7.0×, respectively. Hadoop MapReduce has a set of constraints that need to be considered during the design and implementation of parallel algorithms. The overhead of data store (i.e., HDFS) accesses, communication and latency requires solutions that reduce data store
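The reported speedups can be converted into parallel efficiency (speedup divided by node count) to see how close the island model comes to ideal linear scaling. A short calculation using the averages quoted in the abstract:

```python
def speedup(t_seq, t_par):
    """Classic speedup: sequential execution time over parallel time."""
    return t_seq / t_par

def efficiency(s, nodes):
    """Parallel efficiency: fraction of ideal linear scaling achieved."""
    return s / nodes

# Average island-model speedups reported for 2, 4 and 8 nodes.
for nodes, s in [(2, 1.8), (4, 3.4), (8, 7.0)]:
    print(nodes, round(efficiency(s, nodes), 3))
```

The efficiencies stay close to 0.9 across all three cluster sizes, which is consistent with the island model's low inter-node communication relative to the global and grid models.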
Wang, W.; Zehner, B.; Böttcher, N.; Goerke, U.; Kolditz, O.
2013-12-01
Numerical modeling of the two-phase flow process in porous media for real applications, e.g. CO2 storage in saline aquifers, is computationally expensive due to the complexity and non-linearity of the physical processes involved. Such modeling normally needs a fine discretization of the considered domain for a high degree of accuracy, which leads to extremely high computational requirements. This work focuses on the parallel simulation of the two-phase flow process in porous media. The Galerkin finite element method is used to solve the governing equations. Based on the overlapping domain decomposition approach, the PETSc package is employed to parallelize the global equation assembly and the linear solver, respectively. A numerical model based on the real test site Ketzin in Germany is adopted for parallel computing. The model domain is discretized with more than four million tetrahedral elements. The parallel simulations are carried out on a Linux cluster with different numbers of cores. The obtained speedup shows good scalability of the current parallel finite element approach for two-phase flow modeling in geological CO2 storage applications.
Ferrighi, Lara; Frediani, Luca; Fossgaard, Eirik; Ruud, Kenneth
2006-10-01
We present a parallel implementation of the integral equation formalism of the polarizable continuum model for Hartree-Fock and density functional theory calculations of energies and linear, quadratic, and cubic response functions. The contributions to the free energy of the solute due to the polarizable continuum have been implemented using a master-slave approach with load balancing to ensure good scalability also on parallel machines with a slow interconnect. We demonstrate the good scaling behavior of the code through calculations of Hartree-Fock energies and linear, quadratic, and cubic response function for a modest-sized sample molecule. We also explore the behavior of the parallelization of the integral equation formulation of the polarizable continuum model code when used in conjunction with a recent scheme for the storage of two-electron integrals in the memory of the different slaves in order to achieve superlinear scaling in the parallel calculations.
Research on the Development Approach for Reusable Model in Parallel Discrete Event Simulation
Directory of Open Access Journals (Sweden)
Jianbo Li
2015-01-01
Full Text Available Model reuse is an essential means to meet the demand for model development in complex simulation. An effective approach to model reusability is to establish a standard model specification, comprising an interface specification and a representation specification. By standardizing a model's external interfaces, the Reusable Component Model Framework (RCMF) achieves model reusability, acting as an interface specification. However, RCMF models are presently developed only through manual programming: besides implementing the model's business logic, the modeler must also ensure that the model strictly follows the reusable framework, which is very distracting. Moreover, there is no model description information to guide model reuse or integration. To address these issues, we first explored an XML-based model description file, which complements the RCMF as the model representation, and then proposed a RCMF model development tool, SuKit. The model description file describes a RCMF model and can be used for regenerating a model and guiding model integration. SuKit generates a skeleton RCMF model together with a model-customized description file from the configured information, so the modeler need only concentrate on the model's processing logic. The case study indicates that SuKit is well suited to developing RCMF models and that the well-formed description file can be used for model reuse and integration.
Ding, Zhong-Jun; Jiang, Rui; Gao, Zi-You; Wang, Bing-Hong; Long, Jiancheng
2013-08-01
The effect of overpasses in the Biham-Middleton-Levine traffic flow model with random and parallel update rules has been studied. An overpass is a site that can be occupied simultaneously by an eastbound car and a northbound one. Under periodic boundary conditions, both self-organized and random patterns are observed in the free-flowing phase of the parallel update model, while only the random pattern is observed in the random update model. We have developed mean-field analysis for the moving phase of the random update model, which agrees with the simulation results well. An intermediate phase is observed in which some cars could pass through the jamming cluster due to the existence of free paths in the random update model. Two intermediate states are observed in the parallel update model, which have been ignored in previous studies. The intermediate phases in which the jamming skeleton is only oriented along the diagonal line in both models have been analyzed, with the analyses agreeing well with the simulation results. With the increase of overpass ratio, the jamming phase and the intermediate phases disappear in succession for both models. Under open boundary conditions, the system exhibits only two phases when the ratio of overpasses is below a threshold in the random update model. When the ratio of the overpass is close to 1, three phases could be observed, similar to the totally asymmetric simple exclusion process model. The dependence of the average velocity, the density, and the flow rate on the injection probability in the moving phase has also been obtained through mean-field analysis. The results of the parallel model under open boundary conditions are similar to that of the random update model.
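The eastbound sub-step of the BML model with overpasses can be sketched as a synchronous update on a torus: a car advances unless its target site was occupied at the start of the sub-step, and an overpass site blocks it only when another eastbound car already sits there. This is an illustrative reconstruction of the rule described above, not the authors' code; the northbound sub-step is symmetric.

```python
def move_east(east, north, width, overpass):
    """One synchronous eastbound sub-step of BML with overpass sites.

    east/north: sets of (x, y) car positions; overpass: sites that an
    eastbound and a northbound car may occupy simultaneously."""
    moved = set()
    for (x, y) in east:
        target = ((x + 1) % width, y)
        # Blocked by any car at the target, except a northbound car on an overpass.
        blocked = target in east or (target in north and target not in overpass)
        moved.add((x, y) if blocked else target)
    return moved

# A car waits behind another; an overpass lets it ignore a northbound car.
print(sorted(move_east({(0, 0), (1, 0)}, set(), 4, set())))  # [(0, 0), (2, 0)]
print(sorted(move_east({(1, 0)}, {(2, 0)}, 4, {(2, 0)})))    # [(2, 0)]
```

Because each car's move is decided against the configuration at the start of the sub-step, the update is parallel in the sense used in the abstract.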
Energy Technology Data Exchange (ETDEWEB)
Procassini, R.J. [Lawrence Livermore National Lab., CA (United States)]
1997-12-31
The fine-scale, multi-space resolution that is envisioned for accurate simulations of complex weapons systems in three spatial dimensions implies flop-rate and memory-storage requirements that will only be obtained in the near future through the use of parallel computational techniques. Since the Monte Carlo transport models in these simulations usually stress both of these computational resources, they are prime candidates for parallelization. The MONACO Monte Carlo transport package, which is currently under development at LLNL, will utilize two types of parallelism within the context of a multi-physics design code: decomposition of the spatial domain across processors (spatial parallelism) and distribution of particles in a given spatial subdomain across additional processors (particle parallelism). This implementation of the package will utilize explicit data communication between domains (message passing). Such a parallel implementation of a Monte Carlo transport model will result in non-deterministic communication patterns. The communication of particles between subdomains during a Monte Carlo time step may require a significant level of effort to achieve a high parallel efficiency.
DEFF Research Database (Denmark)
Vasquez, Juan Carlos; Guerrero, Josep M.; Savaghebi, Mehdi;
2013-01-01
Power electronics based MicroGrids consist of a number of voltage source inverters (VSIs) operating in parallel. In this paper, the modeling, control design, and stability analysis of parallel connected three-phase VSIs are derived. The proposed voltage and current inner control loops and the mathematical models of the VSIs are based on the stationary reference frame. A hierarchical control scheme for the paralleled VSI system is developed comprising two levels. The primary control includes the droop method and the virtual impedance loops, in order to share active and reactive power. The secondary control restores the frequency and amplitude deviations produced by the primary control. Also, a synchronization algorithm is presented in order to connect the MicroGrid to the grid. Experimental results are provided to validate the performance and robustness of the parallel VSI system control...
Eroglu, Duygu Yilmaz; Ozmutlu, H Cenk
2014-01-01
We developed mixed integer programming (MIP) models and hybrid genetic-local search algorithms for the scheduling problem of unrelated parallel machines with job-sequence- and machine-dependent setup times and with the job splitting property. The first contribution of this paper is to introduce novel algorithms which perform splitting and scheduling simultaneously with a variable number of subjobs. We propose a simple chromosome structure constituted by random key numbers in the hybrid genetic-local search algorithm (GAspLA). Random key numbers are used frequently in genetic algorithms, but they create additional difficulty when hybrid local search factors are implemented. We developed algorithms that adapt the results of local search into the genetic algorithm with a minimum of relocation operations on the genes' random key numbers; this is the second contribution of the paper. The third contribution is three new MIP models which perform splitting and scheduling simultaneously. The fourth contribution is the implementation of GAspLAMIP, which lets us verify the optimality of GAspLA for the studied combinations. The proposed methods are tested on a set of problems taken from the literature, and the results validate the effectiveness of the proposed algorithms.
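Random-key chromosomes of the kind mentioned above decode a vector of real genes into a schedule. One common convention takes the integer part of each gene as the machine assignment and sorts the jobs on each machine by the fractional part. The decoding below is a generic illustration of that convention, not the paper's GAspLA encoding (which additionally handles job splitting):

```python
def decode_random_keys(keys, n_machines):
    """Decode a random-key chromosome into per-machine job sequences.

    Integer part of each gene -> machine; fractional part -> position
    in that machine's sequence (ascending)."""
    schedule = {m: [] for m in range(n_machines)}
    for job, key in enumerate(keys):
        machine = int(key) % n_machines
        schedule[machine].append((key - int(key), job))
    return {m: [job for _, job in sorted(pairs)] for m, pairs in schedule.items()}

# Four jobs, two machines: genes in [0, 2) pick the machine and the order.
print(decode_random_keys([0.7, 1.2, 0.3, 1.9], 2))  # {0: [2, 0], 1: [1, 3]}
```

The appeal of this encoding is that any real vector decodes to a feasible schedule, so crossover and mutation never produce invalid offspring.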
Directory of Open Access Journals (Sweden)
Masoud Rabbani
2016-12-01
Full Text Available This paper deals with the mixed model assembly line (MMAL) balancing problem of type-I. In MMALs several highly similar products are made on one assembly line, so several types of products can be assembled simultaneously without additional setup times. The problem has particular features such as parallel workstations and precedence constraints in dynamic periods, where each period also affects the next. The research aims to reduce the number of workstations and maximize the workload smoothness between workstations. Dynamic periods are used to determine all variables in different periods to achieve efficient solutions. A non-dominated sorting genetic algorithm (NSGA-II) and multi-objective particle swarm optimization (MOPSO) are used to solve the problem. The proposed model is validated with GAMS software for small problem sizes, and the performance of the two algorithms is compared on several metrics: NSGA-II outperforms MOPSO on some of the comparison metrics used in this paper, while MOPSO is better on others. Finally, conclusions and future research directions are provided.
Sankararamakrishnan, R; Sansom, M S
1995-11-01
The transbilayer pore of the nicotinic acetylcholine receptor (nAChR) is formed by a pentameric bundle of M2 helices. Models of pentameric bundles of M2 helices have been generated using simulated annealing via restrained molecular dynamics. The influence of: (a) the initial C alpha template; and (b) screening of sidechain electrostatic interactions on the geometry of the resultant M2 helix bundles is explored. Parallel M2 helices, in the absence of sidechain electrostatic interactions, pack in accordance with simple ridges-in-grooves considerations. This results in a helix crossing angle of ca. +12 degrees, corresponding to a left-handed coiled coil structure for the bundle as a whole. Tilting of M2 helices away from the central pore axis at their C-termini and/or inclusion of sidechain electrostatic interactions may perturb such ridges-in-grooves packing. In the most extreme cases right-handed coiled coils are formed. An interplay between inter-helix H-bonding and helix bundle geometry is revealed. The effects of changes in electrostatic screening on the dimensions of the pore mouth are described and the significance of these changes in the context of models for the nAChR pore domain is discussed.
Erdmann, Thorsten; Albert, Philipp J; Schwarz, Ulrich S
2013-11-07
Non-processive molecular motors have to work together in ensembles in order to generate appreciable levels of force or movement. In skeletal muscle, for example, hundreds of myosin II molecules cooperate in thick filaments. In non-muscle cells, by contrast, small groups with few tens of non-muscle myosin II motors contribute to essential cellular processes such as transport, shape changes, or mechanosensing. Here we introduce a detailed and analytically tractable model for this important situation. Using a three-state crossbridge model for the myosin II motor cycle and exploiting the assumptions of fast power stroke kinetics and equal load sharing between motors in equivalent states, we reduce the stochastic reaction network to a one-step master equation for the binding and unbinding dynamics (parallel cluster model) and derive the rules for ensemble movement. We find that for constant external load, ensemble dynamics is strongly shaped by the catch bond character of myosin II, which leads to an increase of the fraction of bound motors under load and thus to firm attachment even for small ensembles. This adaptation to load results in a concave force-velocity relation described by a Hill relation. For external load provided by a linear spring, myosin II ensembles dynamically adjust themselves towards an isometric state with constant average position and load. The dynamics of the ensembles is now determined mainly by the distribution of motors over the different kinds of bound states. For increasing stiffness of the external spring, there is a sharp transition beyond which myosin II can no longer perform the power stroke. Slow unbinding from the pre-power-stroke state protects the ensembles against detachment.
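The one-step (birth-death) master equation named above has a closed-form stationary distribution obtained from detailed balance. The sketch below uses constant binding/unbinding rates, so it deliberately omits the load dependence that produces the catch-bond effect central to the paper; the rates are illustrative.

```python
def stationary_bound_motors(n, k_on, k_off):
    """Stationary distribution of the number of bound motors for a one-step
    master equation with binding rate (N - i) * k_on and unbinding rate
    i * k_off, via detailed balance: p(i+1)/p(i) = g(i)/r(i+1)."""
    weights = [1.0]
    for i in range(n):
        weights.append(weights[-1] * (n - i) * k_on / ((i + 1) * k_off))
    total = sum(weights)
    return [w / total for w in weights]

# Two motors with equal rates: binomial distribution with p = 1/2.
print(stationary_bound_motors(2, 1.0, 1.0))  # [0.25, 0.5, 0.25]
```

With constant rates this reduces to a binomial distribution with binding probability k_on / (k_on + k_off); the catch bond in the full model effectively raises k_off's inverse under load, shifting weight toward higher numbers of bound motors.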
A parallel Discrete Element Method to model collisions between non-convex particles
Rakotonirina, Andriarimina Daniel; Delenne, Jean-Yves; Wachs, Anthony
2017-06-01
In many dry granular and suspension flow configurations, particles can be highly non-spherical. It is now well established in the literature that particle shape affects the flow dynamics or the microstructure of the particle assembly in assorted ways, e.g., the compacity of a packed bed or heap, dilation under shear, resistance to shear, momentum transfer between translational and angular motions, and the ability to form arches and block the flow. In this talk, we suggest an accurate and efficient way to model collisions between particles of (almost) arbitrary shape. For that purpose, we develop a Discrete Element Method (DEM) combined with a soft particle contact model. The collision detection algorithm handles contacts between bodies of various shapes and sizes. For non-convex bodies, our strategy is based on decomposing a non-convex body into a set of convex ones. Our novel method can therefore be called the "glued-convex method" (in the sense of clumping convex bodies together), as an extension of the popular "glued-spheres" method, and is implemented in our own granular dynamics code Grains3D. Since the whole problem is solved explicitly, our fully MPI-parallelized code Grains3D exhibits very high scalability when dynamic load balancing is not required. In particular, simulations on up to a few thousand cores, in configurations involving up to a few tens of millions of particles, can readily be performed. We apply our enhanced numerical model to (i) the collapse of a granular column made of convex particles and (ii) the microstructure of a heap of non-convex particles in a cylindrical reactor.
DEFF Research Database (Denmark)
Kwon, Jun Bum; Wang, Xiongfei; Bak, Claus Leth;
2015-01-01
…be difficult in terms of complex multi-parallel connected systems, especially in the case of renewable energy, where possibilities for intermittent operation due to the weather conditions exist. Hence, it can bring many different operating points to the power converter, and the impedance characteristics can change compared to the conventional operation. In this paper, a Harmonic State Space modeling method, which is based on Linear Time-varying theory, is used to analyze different operating points of the parallel-connected converters. The analyzed results show that the HSS modeling approach explicitly…
DEFF Research Database (Denmark)
Juel-Christiansen, Carsten
2005-01-01
The article highlights visual rotation (images, drawings, models, works) as the privileged medium in the communication of ideas between creative architects.
Research on Task Parallel Programming Models (任务并行编程模型研究与进展)
Institute of Scientific and Technical Information of China (English)
王蕾; 崔慧敏; 陈莉; 冯晓兵
2013-01-01
The task parallel programming model is a widely used parallel programming model on multi-core platforms. With the intention of simplifying parallel programming and improving the utilization of multiple cores, this paper provides an introduction to the essential programming interfaces and the supporting mechanisms used in task parallel programming models, and discusses issues and the latest achievements from three perspectives: parallelism expression, data management and task scheduling. In the end, some future trends in this area are discussed.
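As a toy illustration (not from the paper), the core interfaces such models expose, task creation, result retrieval, and runtime-managed scheduling, can be mimicked with Python's standard library; all names here are illustrative:

```python
# Toy sketch of a task parallel programming interface, using Python's
# standard-library thread pool. submit() plays the role of task creation
# ("spawn"); result() plays the role of task synchronization ("join");
# the pool's internal scheduler decides which worker runs each task.
from concurrent.futures import ThreadPoolExecutor

def work(i):
    return i * i  # stand-in for an independent unit of work

with ThreadPoolExecutor(max_workers=4) as pool:
    futures = [pool.submit(work, i) for i in range(8)]   # parallelism expression
    results = [f.result() for f in futures]              # synchronization

print(results)  # [0, 1, 4, 9, 16, 25, 36, 49]
```

In real task parallel runtimes (e.g. OpenMP tasks, Cilk, TBB) a work-stealing scheduler maps tasks onto cores; the thread pool here plays that role in a much simplified way.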
Non-local approach to kinetic effects on parallel transport in fluid models of the scrape-off layer
Omotani, John
2013-01-01
By using a non-local model, fluid simulations can capture kinetic effects in the parallel electron heat-flux better than is possible using flux limiters in the usual diffusive models. Non-local and diffusive models are compared using a test case representative of an ELM crash in the JET SOL, simulated in one dimension. The non-local model shows substantially enhanced electron temperature gradients, which cannot be achieved using a flux limiter. The performance of the implementation, in the BOUT++ framework, is also analysed to demonstrate its suitability for application in three-dimensional simulations of turbulent transport in the SOL.
Directory of Open Access Journals (Sweden)
S. Sulaiman
2017-06-01
Full Text Available An important element in the electric power distribution system is the underground cable. However, continuous application of high voltage to the cable may lead to insulation degradation and subsequent cable failure. Since any disruption to the electricity supply may lead to economic losses as well as lower customer satisfaction, the maintenance of cables is very important to an electrical utility company. Thus, a reliable diagnostic technique that is able to accurately assess the condition of operating cable insulation is critical for deciding when cable replacement should be done. One such diagnostic technique to assess the level of degradation within the cable insulation is Polarization/Depolarization Current (PDC) analysis. This research work investigates PDC behaviour for medium voltage (MV) cross-linked polyethylene (XLPE) insulated cables via baseline PDC measurements, and utilizes the measured data to simulate the PDC analysis. Once PDC simulations have been achieved, the conductivity of the XLPE cable insulation can be approximated. Cable conductivity serves as an indicator of the level of degradation within XLPE cable insulation. It was found that for new and unused XLPE cables, the polarization and depolarization currents have almost overlapping trendlines, as the cable insulation's conduction current is negligible. Using a linear dielectric equivalent-circuit model of the XLPE cable insulation and its corresponding governing equations, it is possible to optimize the number of parallel RC branches to simulate the PDC analysis with a very high degree of accuracy. The PDC simulation model has been validated against the baseline PDC measurements.
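The linear equivalent-circuit picture described above can be sketched numerically: with a DC conductance in parallel with n RC relaxation branches, the polarization current is a sum of exponential relaxations on top of the conduction current, while the depolarization current contains only the relaxations. All component values below are hypothetical, chosen only to illustrate the shapes:

```python
# Hypothetical sketch of a linear dielectric equivalent circuit for PDC
# analysis: a DC conductance G0 in parallel with n RC relaxation branches.
# All component values are invented for illustration, not measured.
import numpy as np

U0 = 1000.0   # charging voltage [V] (assumed)
G0 = 1e-12    # insulation DC conductance [S] (assumed)
branches = [(1e12, 1e-9), (5e11, 1e-8), (1e11, 1e-7)]  # (R [ohm], C [F])

def polarization_current(t):
    # conduction current plus one exponential relaxation per RC branch
    i = U0 * G0
    for R, C in branches:
        i += (U0 / R) * np.exp(-t / (R * C))
    return i

def depolarization_current(t, t_charge=1e4):
    # after charging for t_charge seconds the branches discharge;
    # there is no conduction term, so the current decays to zero
    i = 0.0
    for R, C in branches:
        tau = R * C
        i += (U0 / R) * (1.0 - np.exp(-t_charge / tau)) * np.exp(-t / tau)
    return i
```

For a healthy cable (G0 negligible) the two currents nearly overlap, matching the observation above; a growing gap between them signals rising insulation conductivity.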
Hill, Gary; Duval, Ronald W.; Green, John A.; Huynh, Loc C.
1991-01-01
A piloted comparison of rigid and aeroelastic blade-element rotor models was conducted at the Crew Station Research and Development Facility (CSRDF) at Ames Research Center. A simulation development and analysis tool, FLIGHTLAB, was used to implement these models in real time using parallel processing technology. Pilot comments and quantitative analysis performed both on-line and off-line confirmed that elastic degrees of freedom significantly affect perceived handling qualities. Trim comparisons show improved correlation with flight test data when elastic modes are modeled. The results demonstrate the efficiency with which the mathematical modeling sophistication of existing simulation facilities can be upgraded using parallel processing, and the importance of these upgrades to simulation fidelity.
Examining HPV threat-to-efficacy ratios in the Extended Parallel Process Model.
Carcioppolo, Nick; Jensen, Jakob D; Wilson, Steven R; Collins, W Bart; Carrion, Melissa; Linnemeier, Georgiann
2013-01-01
The Extended Parallel Process Model (EPPM) posits that an effective fear appeal includes both threat and efficacy components; however, research has not addressed whether there is an optimal threat-to-efficacy ratio. It is possible that varying levels of threat and efficacy in a persuasive message could yield different effects on attitudes, beliefs, and behaviors. In a laboratory experiment, women (n = 442) were exposed to human papilloma virus (HPV) prevention messages containing one of six threat-to-efficacy ratios and one of two message frames (messages emphasizing the connection between HPV and cervical cancer or HPV and genital warts). Multiple mediation analysis revealed that a 1-to-1 ratio of threat to efficacy was most effective at increasing prevention intentions, primarily because it caused more fear and risk susceptibility than other message ratios. Response efficacy significantly mediated the relationship between message framing and intentions, such that participants exposed to a genital warts message reported significantly higher intentions, and this association can be explained in part through response efficacy. Implications for future theoretical research as well as campaigns and intervention research are discussed.
Measurement, Modeling and Reconstruction of Parallel Currents in the HSX Stellarator
Schmitt, J. C.; Talmadge, J. N.; Lore, J.
2010-11-01
Parallel currents are measured with a set of magnetic diagnostics on the HSX. Measurements show that the Pfirsch-Schlüter current is helical due to the lack of toroidal curvature and is reduced in magnitude compared to an equivalent tokamak because of the high effective transform (˜3) in a quasihelically symmetric stellarator. The bootstrap current density is calculated using the PENTA code,^1 which includes momentum conservation between plasma species. The data show better agreement with a model that includes momentum conservation. HSX plasmas are heated by a 28 GHz gyrotron, which allows the electrons to access the low collisionality regime, while the cold ions are generally in the plateau regime. In HSX, a 3-D plasma with small symmetry breaking, the calculations show that for two species in different collisionality regimes, the bootstrap current can be a strong function of the radial electric field. In the plasma core, multiple stable electric field solutions to the ambipolarity constraint exist. The large positive electric field, the ``electron-root'' solution, can result in a reduction and even a reversal of the bootstrap current. The measured fields and fluxes are used in the V3FIT^2 code to reconstruct the current profile. Supported by DOE grant DE-FG02-93ER54222. ^1D.A. Spong, Phys. Plasmas 12 (2005) 056114. ^2J.D. Hanson, et al, Nucl. Fusion 49 (2009) 075031.
Barnes, Richard
2016-11-01
Algorithms for extracting hydrologic features and properties from digital elevation models (DEMs) are challenged by large datasets, which often cannot fit within a computer's RAM. Depression filling is an important preconditioning step to many of these algorithms. Here, I present a new, linearly scaling algorithm which parallelizes the Priority-Flood depression-filling algorithm by subdividing a DEM into tiles. Using a single-producer, multi-consumer design, the new algorithm works equally well on one core, multiple cores, or multiple machines, and can take advantage of large memories or cope with small ones. Unlike previous algorithms, the new algorithm guarantees a fixed number of memory-access and communication events per subdivision of the DEM. In comparison testing, this results in the new algorithm generally running faster while using fewer resources than previous algorithms. For moderately sized tiles, the algorithm exhibits ∼60% strong and weak scaling efficiencies up to 48 cores, and linear time scaling across datasets ranging over three orders of magnitude. The largest dataset on which I ran the algorithm has 2 trillion (2×10¹²) cells. With 48 cores, processing required 4.8 h wall-time (9.3 compute-days). This test is three orders of magnitude larger than any previously performed in the literature. Complete, well-commented source code and correctness tests are available for download from a repository.
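For reference, the serial core that the paper parallelizes, the Priority-Flood fill, can be sketched as follows. This is a minimal, unoptimized version; the paper's actual contribution (tiling plus the single-producer, multi-consumer coordination) is not shown:

```python
# Minimal serial Priority-Flood sketch: seed the DEM border into a
# min-heap, then pop cells lowest-first, raising any unvisited lower
# neighbour to the popped cell's elevation (its spill level).
import heapq

def priority_flood_fill(dem):
    rows, cols = len(dem), len(dem[0])
    filled = [row[:] for row in dem]
    seen = [[False] * cols for _ in range(rows)]
    heap = []
    for r in range(rows):
        for c in range(cols):
            if r in (0, rows - 1) or c in (0, cols - 1):
                heapq.heappush(heap, (filled[r][c], r, c))
                seen[r][c] = True
    while heap:
        z, r, c = heapq.heappop(heap)
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and not seen[nr][nc]:
                seen[nr][nc] = True
                filled[nr][nc] = max(filled[nr][nc], z)  # raise pit to spill level
                heapq.heappush(heap, (filled[nr][nc], nr, nc))
    return filled

filled = priority_flood_fill([[3, 3, 3],
                              [3, 1, 3],
                              [3, 3, 3]])
# the central pit is raised to the surrounding spill level of 3
```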
De Lorenzi, Flavio; Debattista, Victor; Gerhard, Ortwin; Sambhus, Niranjan
2007-01-01
We describe a made-to-measure algorithm for constructing N-particle models of stellar systems from observational data (χ²M2M), extending earlier ideas by Syer and Tremaine. The algorithm properly accounts for observational errors, is flexible, and can be applied to various systems and geometries. We implement this algorithm in a parallel code NMAGIC and carry out a sequence of tests to illustrate its power and performance: (i) We reconstruct an isotropic Hernquist model from density...
Directory of Open Access Journals (Sweden)
Jimit R. Patel
2016-01-01
Full Text Available Efforts have been made to present a comparison of three magnetic fluid flow models (the Neuringer-Rosensweig model, the Shliomis model, and the Jenkins model) as far as the performance of a magnetic fluid based parallel-plate rough slider bearing is concerned. The stochastic model of Christensen and Tonder is adopted for the evaluation of the effect of transverse surface roughness. The stochastically averaged Reynolds-type equation is solved with suitable boundary conditions to obtain the pressure distribution, resulting in the calculation of the load-carrying capacity. The graphical results establish that, for a long bearing life, the Shliomis model may be employed for higher loads. However, for lower to moderate loads, the Neuringer-Rosensweig model may be deployed.
With enhanced data availability, distributed watershed models for large areas with high spatial and temporal resolution are increasingly used to understand water budgets and examine effects of human activities and climate change/variability on water resources. Developing parallel computing software...
Modeling and Control of a Parallel Waste Heat Recovery System for Euro-VI Heavy-Duty Diesel Engines
Feru, E.; Willems, F.P.T.; Jager, B. de; Steinbuch, M.
2014-01-01
This paper presents the modeling and control of a waste heat recovery system for a Euro-VI heavy-duty truck engine. The considered waste heat recovery system consists of two parallel evaporators with expander and pumps mechanically coupled to the engine crankshaft. Compared to previous work, the
Directory of Open Access Journals (Sweden)
Ahmad Rusdiansyah
2010-01-01
Full Text Available Airline revenue management (ARM) is one of the emerging topics in transportation logistics. This paper discusses a problem in ARM: dynamic pricing for two parallel flights owned by the same airline. We extend the existing joint pricing model for parallel flights under passenger choice behavior from the literature, generalizing it to consider multiple full-fare classes instead of only a single full-fare class. Consequently, we have to define the seat allocation for each fare class beforehand. We combine the joint pricing model with the nested Expected Marginal Seat Revenue (EMSR) model. To solve this hybrid model, we have developed a dynamic programming-based algorithm. We have also conducted numerical experiments to show the behavior of our model. Our experimental results show that the expected revenue of both flights is significantly influenced by the proportion of time-flexible passengers and the number of allocated seats in each full-fare class. As a managerial insight, our model shows that there is a close relationship between demand management, represented by the price of each fare class, and total expected revenue under passenger choice behavior.
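The nested EMSR seat-allocation component mentioned above can be sketched in one common variant (EMSR-b) under independent normal demands; the fares, demand parameters and capacity below are hypothetical:

```python
# Hypothetical EMSR-b sketch: protection levels for the aggregate of
# classes 1..k against class k+1 via Littlewood's rule, with demand
# for each class modeled as an independent normal distribution.
from math import sqrt
from statistics import NormalDist

fares = [400.0, 250.0, 150.0]          # highest fare first (assumed)
demand = [(30, 10), (45, 15), (80, 25)]  # (mean, std) per class (assumed)
capacity = 150

def emsr_b_protection_levels(fares, demand):
    levels = []
    for k in range(len(fares) - 1):
        mu = sum(m for m, s in demand[:k + 1])           # aggregate mean
        sd = sqrt(sum(s * s for m, s in demand[:k + 1])) # aggregate std
        # demand-weighted average fare of classes 1..k
        fbar = sum(f * m for f, (m, s) in zip(fares, demand[:k + 1])) / mu
        # Littlewood: protect y seats where P(D > y) = f_{k+1} / fbar
        y = NormalDist(mu, sd).inv_cdf(1.0 - fares[k + 1] / fbar)
        levels.append(max(0.0, y))
    return levels

levels = emsr_b_protection_levels(fares, demand)
booking_limits = [capacity] + [max(0.0, capacity - y) for y in levels]
```

The nested structure appears in the booking limits: each lower class may sell only into capacity not protected for the classes above it.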
Implementation science: a role for parallel dual processing models of reasoning?
Directory of Open Access Journals (Sweden)
Phillips Paddy A
2006-05-01
Full Text Available Abstract Background A better theoretical base for understanding professional behaviour change is needed to support evidence-based changes in medical practice. Traditionally, strategies to encourage changes in clinical practices have been guided empirically, without explicit consideration of underlying theoretical rationales for such strategies. This paper considers a theoretical framework for reasoning from within psychology for identifying individual differences in cognitive processing between doctors that could moderate the decision to incorporate new evidence into their clinical decision-making. Discussion Parallel dual processing models of reasoning posit two cognitive modes of information processing that are in constant operation as humans reason. One mode has been described as experiential, fast and heuristic; the other as rational, conscious and rule based. Within such models, the uptake of new research evidence can be represented by the latter mode; it is reflective, explicit and intentional. On the other hand, well practiced clinical judgments can be positioned in the experiential mode, being automatic, reflexive and swift. Research suggests that individual differences between people in both cognitive capacity (e.g., intelligence) and cognitive processing (e.g., thinking styles) influence how both reasoning modes interact. This being so, it is proposed that these same differences between doctors may moderate the uptake of new research evidence. Such dispositional characteristics have largely been ignored in research investigating effective strategies in implementing research evidence. Whilst medical decision-making occurs in a complex social environment with multiple influences and decision makers, it remains true that an individual doctor's judgment still retains a key position in terms of diagnostic and treatment decisions for individual patients. This paper argues therefore, that individual differences between doctors in terms of
Ostromsky, Tz.; Georgiev, K.; Zlatev, Z.
2012-10-01
In this paper we discuss an efficient distributed-memory parallelization strategy for the Unified Danish Eulerian Model (UNI-DEM). We apply an improved decomposition strategy to the spatial domain in order to get more parallel tasks (based on the larger number of subdomains) with less communication between them (due to optimization of the overlapping area when the advection-diffusion problem is solved numerically). This kind of rectangular block partitioning (with a square-shape trend) allows us not only to increase significantly the number of potential parallel tasks, but also to reduce the local memory requirements per task, which is critical for the distributed-memory implementation of the higher-resolution/finer-grid versions of UNI-DEM on some parallel systems, and particularly on the IBM BlueGene/P platform, our target hardware. We show by experiments that our new parallel implementation can use the resources of the powerful IBM BlueGene/P supercomputer, the largest in Bulgaria, rather efficiently, up to its full capacity. It turned out to be extremely useful in the large and computationally expensive numerical experiments carried out to calculate initial data for sensitivity analysis of the Danish Eulerian Model.
Spädtke, P
2013-01-01
Modeling of technical machines became a standard technique once computers became powerful enough to handle the amount of data relevant to the specific system. Simulation of an existing physical device requires knowledge of all relevant quantities. Electric fields given by the surrounding boundary, as well as magnetic fields caused by coils or permanent magnets, have to be known. Internal sources for both fields are sometimes taken into account, such as space-charge forces or the internal magnetic field of a moving bunch of charged particles. The solver routines used are briefly described, and some benchmarking is shown to estimate the necessary computing times for different problems. Different types of charged-particle sources are shown together with a suitable model to describe the physics. Electron guns are covered as well as different ion sources (volume ion sources, laser ion sources, Penning ion sources, electron cyclotron resonance ion sources, and H⁻ sources), together with some remarks on beam transport.
Institute of Scientific and Technical Information of China (English)
张昕; 刘月巍; 王斌; 季仲贞
2004-01-01
The Spectral Statistical Interpolation (SSI) analysis system of NCEP is used to assimilate meteorological data from the Global Positioning Satellite System (GPS/MET) refraction angles with the variational technique. Verified by radiosonde, including GPS/MET observations in the analysis makes an overall improvement to the analysis variables of temperature, winds, and water vapor. However, the variational model with the ray-tracing method is quite expensive for numerical weather prediction and climate research. For example, about 4 000 GPS/MET refraction angles need to be assimilated to produce an ideal global analysis. Just one iteration of minimization will take more than 24 hours of CPU time on NCEP's Cray C90 computer. Although efforts have been taken to reduce the computational cost, it is still prohibitive for operational data assimilation. In this paper, a parallel version of the three-dimensional variational data assimilation model of GPS/MET occultation measurement suitable for massively parallel processor architectures is developed. The divide-and-conquer strategy is used to achieve parallelism and is implemented by message passing. The authors present the principles for the code's design and examine the performance on the state-of-the-art parallel computers in China. The results show that this parallel model scales favorably as the number of processors is increased. With the Memory-IO technique implemented by the author, the wall clock time per iteration used for assimilating 1420 refraction angles is reduced from 45 s to 12 s using 1420 processors. This suggests that the new parallelized code has the potential to be useful in numerical weather prediction (NWP) and climate studies.
Sakane, Shinji; Takaki, Tomohiro; Rojas, Roberto; Ohno, Munekazu; Shibuta, Yasushi; Shimokawabe, Takashi; Aoki, Takayuki
2017-09-01
Melt flow drastically changes dendrite morphology during the solidification of pure metals and alloys. Numerical simulation of dendrite growth in the presence of melt flow is crucial for the accurate prediction and control of the solidification microstructure. However, accurate simulations are difficult because of the large computational costs required. In this study, we develop a parallel computational scheme using multiple graphics processing units (GPUs) for a very large-scale three-dimensional phase-field-lattice Boltzmann simulation. In the model, a quantitative phase-field model, which can accurately simulate the dendrite growth of a dilute binary alloy, is coupled with a lattice Boltzmann model for the melt flow to simulate dendrite growth in the melt flow. By performing very large-scale simulations using the developed scheme, we demonstrate the applicability of multi-GPU parallel computation to systematic large-scale simulations of dendrite growth with melt flow.
DEFF Research Database (Denmark)
Guan, Yajuan; Quintero, Juan Carlos Vasquez; Guerrero, Josep M.
2015-01-01
A novel, simple and effective autonomous current-sharing controller for parallel three-phase inverters is employed in this paper. The novel controller is able to endow the system with high speed response and precision in contrast to the conventional droop control, as it does not require calculating any active or reactive power; instead it uses a virtual impedance loop and an SFR phase-locked loop. The small-signal model of the system was developed for the autonomous operation of an inverter-based microgrid with the proposed controller. The developed model shows a large stability margin and fast transient response of the system. This model can help identify the origin of each of the modes and possible feedback signals for the design of controllers to improve the system stability. Experimental results from two parallel 2.2 kVA inverters verify the effectiveness of the novel control approach.
Institute of Scientific and Technical Information of China (English)
王新运; 万新军; 陈明强; 王君
2012-01-01
The pyrolysis behavior of two kinds of typical biomass (pine wood and cotton stalk) was studied in a nitrogen atmosphere at various heating rates by thermogravimetric analysis (TGA). The pyrolysis process can be divided into three stages: evolution of moisture (<200 °C), devolatilization (200–400 °C) and carbonization (>400 °C). The comparison of the DTG curves of the two biomass materials shows that the higher the hemicellulose content of the biomass, the more evident the shoulder peak of the DTG curve. The weight loss process of the two materials was simulated by a kinetic model assuming that cellulose, hemicellulose and lignin pyrolyze independently and in parallel, each obeying a first-order reaction. The pyrolysis kinetic parameters corresponding to the three components were estimated by a nonlinear least-squares algorithm. The results show that the fitting curves are in good agreement with the experimental data. The activation energy values for pine wood and cotton stalk are in the ranges 188–215, 90–102, 29–49 and 187–214, 95–101, 30–38 kJ/mol, respectively. The corresponding pre-exponential factors are in the ranges 1.8×10¹⁵–2.0×10¹⁶, 1.6×10⁷–7.1×10⁸, 9.3×10¹–1.5×10³ and 1.2×10¹⁵–6.7×10¹⁷, 1.2×10⁸–1.4×10⁹, 1.4×10²–4.6×10² min⁻¹, respectively. In addition, with increasing heating rate, the activation energy of cellulose and lignin increased and their contributions to the volatiles tended to fall, whereas the activation energy of hemicellulose decreased and its contribution to the volatiles tended to rise.
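The three-component parallel first-order scheme used above can be sketched numerically: under a constant heating rate β, each component obeys dm/dT = -(A/β) exp(-E/RT) m, and the sample mass is the weighted sum. The A and E values below are chosen inside the ranges reported above for pine wood, but the component fractions and char yield are assumed:

```python
# Sketch of the three-parallel independent first-order pyrolysis model:
# dm_i/dT = -(A_i / beta) * exp(-E_i / (R*T)) * m_i  per component.
# Fractions, char yield and the exact A, E values are assumptions.
import numpy as np

R_GAS = 8.314        # gas constant [J/(mol K)]
beta = 10.0 / 60.0   # heating rate: 10 K/min expressed in K/s (assumed)
char = 0.10          # non-volatile residue fraction (assumed)
# (volatile fraction, A [1/s], E [J/mol]) for hemicellulose, cellulose, lignin
comps = [(0.25, 2.7e5, 95e3), (0.45, 1.7e14, 200e3), (0.20, 8.3, 40e3)]

def residual_mass(T_grid):
    """Explicit-Euler integration in temperature of the three parallel
    first-order reactions; returns total mass fraction at each step."""
    m = [1.0] * len(comps)
    out = []
    for j in range(1, len(T_grid)):
        dT = T_grid[j] - T_grid[j - 1]
        for i, (c, A, E) in enumerate(comps):
            k = A * np.exp(-E / (R_GAS * T_grid[j]))
            m[i] = max(0.0, m[i] * (1.0 - k * dT / beta))
        out.append(char + sum(c * mi for (c, _, _), mi in zip(comps, m)))
    return out

T = np.arange(300.0, 1000.0, 0.5)
mass = residual_mass(T)  # decays from ~1.0 toward the char fraction
```

Because the three reactions are independent, each produces its own weight-loss step, which is how the shoulder peak attributed to hemicellulose above arises in the DTG curve.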
Directory of Open Access Journals (Sweden)
Filippo Tempia
2015-06-01
Full Text Available Genetically inherited mutations in the fibroblast growth factor 14 (FGF14) gene lead to spinocerebellar ataxia type 27 (SCA27), an autosomal dominant disorder characterized by severe heterogeneous motor and cognitive impairments. Consistently, genetic deletion of Fgf14 in Fgf14-/- mice recapitulates salient features of the SCA27 human disease. In vitro molecular studies in cultured neurons indicate that the FGF14F145S SCA27 allele acts as a dominant negative mutant, suppressing the FGF14 wild-type function and resulting in inhibition of voltage-gated Na+ and Ca2+ channels. To gain insight into the cerebellar deficits in the animal model of the human disease, we applied whole-cell voltage-clamp in the acute cerebellar slice preparation to examine the properties of parallel fiber (PF) to Purkinje neuron synapses in Fgf14-/- mice and wild-type littermates. We found that the AMPA receptor-mediated excitatory postsynaptic currents evoked by PF stimulation (PF-EPSCs) were significantly reduced in Fgf14-/- animals, while short-term plasticity, measured as paired-pulse facilitation (PPF), was enhanced. Measuring Sr2+-induced release of quanta from stimulated synapses, we found that the size of the PF-EPSCs was unchanged, ruling out a postsynaptic deficit. This phenotype was corroborated by decreased expression of VGLUT1, a specific presynaptic marker at PF-Purkinje neuron synapses. We next examined the mGluR1 receptor-induced response (mGluR1-EPSC), which under normal conditions requires a gradual build-up of glutamate concentration in the synaptic cleft, and found no changes in these responses in Fgf14-/- mice. These results provide evidence of a critical role of FGF14 in maintaining presynaptic function at PF-Purkinje neuron synapses, highlighting critical target mechanisms to recapitulate the complexity of the SCA27 disease.
Healy, T. M.; Fontaine, A. A.; Ellis, J. T.; Walton, S. P.; Yoganathan, A. P.
In this work, a flow visualization experiment was performed to elucidate features of the retrograde hinge flow through a 5:1 scaled model of the Medtronic Parallel bileaflet heart valve. It was hypothesized that this model would provide detailed flow information facilitating identification of flow structures associated with thrombus formation in this valve. The experimental protocol was designed to ensure fluid dynamic similarity between the model and prototype heart valves. Flow was visualized using dye injection. The detailed flow structures observed showed the hinge's inflow channel was the most suspect region for thrombus formation. Here a complex helical structure was observed.
Parallel runs of a large air pollution model on a grid of Sun computers
DEFF Research Database (Denmark)
Alexandrov, V.N.; Owczarz, W.; Thomsen, Per Grove
2004-01-01
Large-scale air pollution models can successfully be used in different environmental studies. These models are described mathematically by systems of partial differential equations. Splitting procedures followed by discretization of the spatial derivatives lead to several large systems of ordinary differential equations…
Parallel computation of a dam-break flow model using OpenMP on a multi-core computer
Zhang, Shanghong; Xia, Zhongxi; Yuan, Rui; Jiang, Xiaoming
2014-05-01
High-performance calculations are of great importance to the simulation of dam-break events, as discontinuous solutions and accelerated speed are key factors in the process of dam-break flow modeling. In this study, Roe's approximate Riemann solver within the finite volume method is adopted to compute the interface flux of grid cells and accurately simulate the discontinuous flow, and shared-memory technology (OpenMP) is used to realize parallel computing. Because an explicit discrete technique is used to solve the governing equations, and there is no correlation between grid calculations within a single time step, the parallel dam-break model can be easily realized by adding OpenMP directives to the loop structure of the grid calculations. The performance of the model is analyzed using six computing cores and four different grid division schemes for the Pangtoupao flood storage area in China. The results show that the parallel computing improves precision and increases the simulation speed of the dam-break flow: the simulation of a 320 h flood process can be completed within 1.6 h on a 16-core computer, a speedup factor of 8.64×. Further analysis reveals that the models involving a larger number of calculations exhibit greater efficiency and a higher rate of acceleration. At the same time, the model has good extendibility, as the speedup increases with the number of processor cores. The parallel model based on OpenMP can make full use of multi-core processors, making it possible to simulate dam-break flows in large-scale watersheds on a single computer.
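The key property exploited above, that an explicit scheme makes grid-cell updates within a time step independent, can be illustrated in Python, with threads standing in for OpenMP worker threads (true shared-memory speedup in Python would need the GIL released, e.g. via NumPy kernels or Numba). The 1-D diffusion-like update is only a stand-in for the finite-volume flux computation:

```python
# Per-time-step parallelism in an explicit scheme: each new cell value
# depends only on the previous step's array, so the grid can be split
# into chunks computed concurrently, mirroring OpenMP loop-level
# parallelization of the grid sweep.
from concurrent.futures import ThreadPoolExecutor
import numpy as np

def step_chunk(u, lo, hi, nu=0.25):
    # explicit update for cells lo..hi-1; reads only the OLD array u
    out = np.empty(hi - lo)
    for i in range(lo, hi):
        left = u[i - 1] if i > 0 else u[i]
        right = u[i + 1] if i < len(u) - 1 else u[i]
        out[i - lo] = u[i] + nu * (left - 2.0 * u[i] + right)
    return out

def parallel_step(u, nchunks=4):
    bounds = np.linspace(0, len(u), nchunks + 1, dtype=int)
    with ThreadPoolExecutor(max_workers=nchunks) as pool:
        parts = pool.map(step_chunk, [u] * nchunks, bounds[:-1], bounds[1:])
    return np.concatenate(list(parts))

u0 = np.zeros(16)
u0[8] = 1.0
u1 = parallel_step(u0)   # identical to a serial sweep over all cells
```

Because chunks only read the old array and write disjoint output slices, no synchronization is needed within a time step; the only barrier is between steps, exactly as in the OpenMP loop described above.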
Energy Technology Data Exchange (ETDEWEB)
Kuhn, E.
2004-09-15
This work deals with the dynamical and energetic modeling of a 42 V NiMH battery, whose model is taken into account in a control law for a hybrid electric vehicle. Based on an inventory of the electrochemical phenomena, an equivalent electrical scheme has been established. In this model, diffusion phenomena are represented using non-integer (fractional) derivatives. This tool leads to a very good approximation of diffusion phenomena; nevertheless, such a purely mathematical approach does not allow the representation of energetic losses inside the battery. Consequently, a second model, made of a series of electric circuits, has been proposed to represent energetic transfers. This second model has been used in the determination of a control law which guarantees autonomous management of the electrical energy embedded in a parallel hybrid electric vehicle and prevents deep discharge of the battery. (author)
African Journals Online (AJOL)
the neural construction of individual and communal identities in ... occurs, including models based on information processing, ... Applying the DSM descriptive approach to dissociation in the ... a personal, narrative path that connects personal to ethnic ..... managed the problem in the context of the community, using a.
Bulygin, Y. I.; Koronchik, D. A.; Abuzyarov, A. A.
2015-09-01
Currently, researchers are giving serious consideration to questions related to atmosphere protection, in particular the study of the effectiveness of new designs of gas-cleaning SPM cyclonic devices. Engineering new devices is impossible without applying mathematical modeling methods, computer modeling, and physical models of the processes under study validated by full-scale tests.
Simulation of the world ocean climate with a massively parallel numerical model
Ushakov, K. V.; Ibrayev, R. A.; Kalmykov, V. V.
2015-07-01
The INM-IO numerical World Ocean model is verified through the calculation of the model ocean climate. The numerical experiment was conducted for a period of 500 years following the CORE-I protocol. We analyze some basic elements of the large-scale ocean circulation and local and integral characteristics of the model solution. The model limitations and ways they are overcome are described. The results generally fit the level of leading models. This experiment is a necessary step preceding the transition to high-resolution diagnostic and prognostic calculations of the state of the World Ocean and its individual basins.
Oikawa, Takaaki; Sonoda, Jun; Sato, Motoyuki; Honma, Noriyasu; Ikegawa, Yutaka
Analysis of lightning electromagnetic fields using the FDTD method has been studied in recent years. However, large-scale three-dimensional analysis of real environments has not been considered, because the FDTD method has a huge computational cost for large-scale analysis. We have therefore proposed a three-dimensional moving window FDTD (MW-FDTD) method with parallel computation. Our method has a lower computational cost than the conventional FDTD method and the original MW-FDTD method. In this paper, we study the computational performance of the parallel MW-FDTD and a large-scale three-dimensional analysis of the lightning electromagnetic field on a real terrain model using our MW-FDTD with parallel computation.
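The moving-window idea builds on the standard Yee leapfrog update. As a point of reference, a minimal 1-D free-space FDTD kernel in normalized units at the Courant limit (a toy serial sketch, not the authors' MW-FDTD code) looks like:

```python
import numpy as np

def fdtd_1d(nx=200, nt=300, src=100):
    """Serial 1-D Yee-scheme FDTD update in normalized units at the
    Courant limit. The moving-window and parallel variants discussed in
    the text translate/partition exactly this kind of leapfrog loop."""
    ez = np.zeros(nx)        # electric field at integer grid points
    hy = np.zeros(nx - 1)    # magnetic field at half grid points
    for n in range(nt):
        hy += np.diff(ez)        # H update from the spatial derivative of E
        ez[1:-1] += np.diff(hy)  # E update from the spatial derivative of H
        ez[src] += np.exp(-0.5 * ((n - 30) / 8.0) ** 2)  # soft Gaussian source
    return ez
```

A 3-D lightning model replaces these two scalar updates with six coupled field components over a much larger grid, which is why the computational cost motivates the moving-window and parallel decompositions.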
Chen, Tianju; Zhang, Jinzhi; Wu, Jinhu
2016-07-01
The kinetics and gaseous product evolution of the pyrolysis of a lignocellulosic biomass were investigated using a three-parallel Gaussian distribution method in this work. The pyrolysis experiment on pine sawdust was performed using a thermogravimetric-mass spectrometry (TG-MS) analyzer. A three-parallel Gaussian distributed activation energy model (DAEM)-reaction model was used to describe the thermal decomposition behaviors of the three components: hemicellulose, cellulose and lignin. The first, second and third pseudo-components represent the fractions of hemicellulose, cellulose and lignin, respectively. It was found that the model is capable of predicting the pyrolysis behavior of the pine sawdust. The activation energy distribution peaks for the three pseudo-components were centered at 186.8, 197.5 and 203.9 kJ mol(-1), respectively. The evolution profiles of H2, CH4, CO, and CO2 were well predicted using the three-parallel Gaussian distribution model. In addition, the chemical composition of the bio-oil was obtained by pyrolysis-gas chromatography/mass spectrometry (Py-GC/MS).
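In a Gaussian DAEM, the unconverted fraction is a Gaussian-weighted average, over activation energies, of first-order Arrhenius survival terms. A minimal numerical sketch for one pseudo-component under linear heating; the parameter values here are illustrative, not the fitted values reported above:

```python
import numpy as np

R = 8.314  # gas constant, J/(mol K)

def daem_unreacted(T, A=1e13, beta=10.0 / 60.0, E0=190e3, sigma=15e3, n_E=200):
    """Unconverted fraction 1 - alpha at temperature T (K) for a Gaussian
    distributed activation energy model under linear heating.
    A: pre-exponential factor (1/s); beta: heating rate (K/s, here 10 K/min);
    E0, sigma: mean/std of the activation-energy distribution (J/mol).
    Illustrative parameters, not fitted to the study's data."""
    E = np.linspace(E0 - 4 * sigma, E0 + 4 * sigma, n_E)
    f_E = np.exp(-(E - E0) ** 2 / (2 * sigma ** 2)) / (sigma * np.sqrt(2 * np.pi))
    # Arrhenius temperature integral from 300 K to T, evaluated numerically
    Tgrid = np.linspace(300.0, T, 400)
    inner = np.trapz(np.exp(-E[:, None] / (R * Tgrid[None, :])), Tgrid, axis=1)
    survival = np.exp(-(A / beta) * inner)   # first-order survival per energy
    return np.trapz(f_E * survival, E)       # Gaussian-weighted average
```

Summing three such terms with component-specific (E0, sigma, A) and mass fractions reproduces the three-parallel structure described in the abstract.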
Cai, Hongzhu; Hu, Xiangyun; Li, Jianhui; Endo, Masashi; Xiong, Bin
2017-02-01
We solve the 3D controlled-source electromagnetic (CSEM) problem using the edge-based finite element method. The modeling domain is discretized with an unstructured tetrahedral mesh. We adopt the total-field formulation for the quasi-static variant of Maxwell's equations, which saves the computational cost of calculating the primary field. We adopt a new boundary condition that approximates the total field on the boundary by the primary field corresponding to the layered-earth approximation of the complicated conductivity model. The primary field on the modeling boundary is calculated using a fast Hankel transform. With this new type of boundary condition, the computational cost can be reduced significantly and the modeling accuracy improved. The conductivity is allowed to be anisotropic. We solve the finite element system of equations using a parallelized multifrontal solver, which works efficiently for multiple sources and large-scale electromagnetic modeling.
Institute of Scientific and Technical Information of China (English)
XI Lifeng; DU Shichang
2007-01-01
The final product quality is determined by the accumulation, coupling and propagation of product quality variations from all stations in multi-stage manufacturing systems (MMSs). Modeling and control of variation propagation is essential to improving product quality. However, the current stream-of-variation (SOV) theory can only solve the problem in which a single variation stream affects product quality. Due to the existence of multiple variation streams, limited research has been done on quality control in serial-parallel hybrid multi-stage manufacturing systems (SPH-MMSs). A state space model and its modeling strategies are developed to describe the stack-up of multiple variation streams in an SPH-MMS, and the SOV theory is extended to the SPH-MMS. The dimensions of the system model are reduced to the production-reality level, and the effectiveness and feasibility of the model are validated by a machining case.
Energy Technology Data Exchange (ETDEWEB)
1992-03-10
The first phase of the proposed work is largely completed on schedule. Scientists at the San Diego Supercomputer Center (SDSC) succeeded in porting a version of the Hamburg isopycnal coordinate ocean model (OPYC) to the Intel parallel computer. Due to the slow run speeds of OPYC on the parallel machine, another ocean model is being used during the first part of phase 2. The model chosen is the Large Scale Geostrophic (LSG) model from the Max Planck Institute.
Directory of Open Access Journals (Sweden)
Jonathan C Fuller
The design of novel α-helix mimetic inhibitors of protein-protein interactions is of interest to pharmaceutical and chemical genetics researchers, as these inhibitors provide a chemical scaffold presenting side chains in the same geometry as an α-helix. This conformational arrangement allows the design of high-affinity inhibitors mimicking known peptide sequences binding specific protein substrates. We show that GAFF and AutoDock potentials do not properly capture the conformational preferences of α-helix mimetics based on arylamide oligomers, and we identify alternate parameters matching solution NMR data and suitable for molecular dynamics simulation of arylamide compounds. Results from both docking and molecular dynamics simulations are consistent with the arylamides binding in the p53 peptide binding pocket. Simulations of arylamides in the p53 binding pocket of hDM2 are consistent with binding, exhibiting structural dynamics in the pocket similar to those in simulations of the known hDM2 binders Nutlin-2 and a benzodiazepinedione compound. Arylamide conformations converge towards the same region of the binding pocket on the 20 ns time scale, and most, though not all, dihedrals in the binding pocket are well sampled on this timescale. We show that there are two putative classes of binding modes for arylamide compounds supported equally by the modeling evidence. In the first, the arylamide compound lies parallel to the observed p53 helix. In the second class, not previously identified or proposed, the arylamide compound lies anti-parallel to the p53 helix.
Limits to the critical current in Bi2Sr2Ca2Cu3Ox tape conductors: The parallel path model
van der Laan, D.C.; Schwartz, J.; ten Haken, Bernard; Dhalle, M.; van Eck, H.J.N.
2008-01-01
An extensive overview of a model that describes current flow and dissipation in high-quality Bi2Sr2Ca2Cu3Ox superconducting tapes is provided. The parallel path model is based on a superconducting current running in two distinct parallel paths. One of the current paths is formed by grains that are
Liang, Dong; Song, Yimin; Sun, Tao; Jin, Xueying
2017-09-01
A systematic dynamic modeling methodology is presented to develop the rigid-flexible coupling dynamic model (RFDM) of an emerging flexible parallel manipulator with multiple actuation modes. By virtue of the assumed mode method, the general dynamic model of an arbitrary flexible body with any number of lumped parameters is derived in an explicit closed form, which possesses a modular characteristic. Then the complete dynamic model of the system is formulated based on flexible multi-body dynamics (FMD) theory and the augmented Lagrangian multipliers method. An approach combining the Udwadia-Kalaba formulation with the hybrid TR-BDF2 numerical algorithm is proposed to address the nonlinear RFDM. Two simulation cases are performed to investigate the dynamic performance of the manipulator with different actuation modes. The results indicate that the redundant actuation modes can effectively attenuate vibration and guarantee higher dynamic performance compared to the traditional non-redundant actuation modes. Finally, a virtual prototype model is developed to demonstrate the validity of the presented RFDM. The systematic methodology proposed in this study can be conveniently extended to the dynamic modeling and controller design of other planar flexible parallel manipulators, especially the emerging ones with multiple actuation modes.
Directory of Open Access Journals (Sweden)
Lianchao Sheng
2017-01-01
Due to the complexity of the dynamic model of a planar 3-RRR flexible parallel manipulator (FPM), it is often difficult to implement an active vibration control algorithm based on the system dynamic model. To establish a simple and efficient dynamic model of the planar 3-RRR FPM in order to study its dynamic characteristics and build a controller conveniently, firstly, considering the effect of rigid-flexible coupling and the moment of inertia at the end of the flexible intermediate link, the modal function is determined with the pinned-free boundary condition. Then, considering the main vibration modes of the system, a high-efficiency coupling dynamic model is established while guaranteeing the model's control accuracy. Using this model, the modal characteristics of the flexible intermediate link are analyzed and compared with modal test results. The results show that the model can effectively reflect the main vibration modes of the planar 3-RRR FPM; in addition, the model can be used to analyze the effects of inertial and coupling forces on the dynamic model and the drive torque of the drive motor. Because this model has fewer dynamic parameters, it is convenient for implementing the control program.
Frickenhaus, Stephan; Hiller, Wolfgang; Best, Meike
The portable software FoSSI is introduced that—in combination with additional free solver software packages—allows for an efficient and scalable parallel solution of large sparse linear equations systems arising in finite element model codes. FoSSI is intended to support rapid model code development, completely hiding the complexity of the underlying solver packages. In particular, the model developer need not be an expert in parallelization and is yet free to switch between different solver packages by simple modifications of the interface call. FoSSI offers an efficient and easy, yet flexible interface to several parallel solvers, most of them available on the web, such as PETSC, AZTEC, MUMPS, PILUT and HYPRE. FoSSI makes use of the concept of handles for vectors, matrices, preconditioners and solvers, that is frequently used in solver libraries. Hence, FoSSI allows for a flexible treatment of several linear equations systems and associated preconditioners at the same time, even in parallel on separate MPI-communicators. The second special feature in FoSSI is the task specifier, being a combination of keywords, each configuring a certain phase in the solver setup. This enables the user to control a solver over one unique subroutine. Furthermore, FoSSI has rather similar features for all solvers, making a fast solver intercomparison or exchange an easy task. FoSSI is a community software, proven in an adaptive 2D-atmosphere model and a 3D-primitive equation ocean model, both formulated in finite elements. The present paper discusses perspectives of an OpenMP-implementation of parallel iterative solvers based on domain decomposition methods. This approach to OpenMP solvers is rather attractive, as the code for domain-local operations of factorization, preconditioning and matrix-vector product can be readily taken from a sequential implementation that is also suitable to be used in an MPI-variant. Code development in this direction is in an advanced state under
Energy Technology Data Exchange (ETDEWEB)
Rajagopalan, A.; Washington, G.; Rizzoni, G.; Guezennec, Y.
2003-12-01
This report describes the development of new control strategies and models for hybrid electric vehicles (HEVs) by the Ohio State University. The report presents results from models created in NREL's ADvanced VehIcle SimulatOR (ADVISOR 3.2), and results of a scalable IC engine model, based on the Willans line technique, implemented in ADVISOR 3.2.
Penas, David R; González, Patricia; Egea, Jose A; Doallo, Ramón; Banga, Julio R
2017-01-21
The development of large-scale kinetic models is one of the current key issues in computational systems biology and bioinformatics. Here we consider the problem of parameter estimation in nonlinear dynamic models. Global optimization methods can be used to solve this type of problem, but the associated computational cost is very large. Moreover, many of these methods need the tuning of a number of adjustable search parameters, requiring a number of initial exploratory runs and therefore further increasing the computation times. Here we present a novel parallel method, self-adaptive cooperative enhanced scatter search (saCeSS), to accelerate the solution of this class of problems. The method is based on the scatter search optimization metaheuristic and incorporates several key new mechanisms: (i) asynchronous cooperation between parallel processes, (ii) coarse and fine-grained parallelism, and (iii) self-tuning strategies. The performance and robustness of saCeSS is illustrated by solving a set of challenging parameter estimation problems, including medium and large-scale kinetic models of the bacterium E. coli, baker's yeast S. cerevisiae, the vinegar fly D. melanogaster, Chinese Hamster Ovary cells, and a generic signal transduction network. The results consistently show that saCeSS is a robust and efficient method, allowing very significant reduction of computation times with respect to several previous state-of-the-art methods (from days to minutes, in several cases) even when only a small number of processors is used. The new parallel cooperative method presented here allows the solution of medium and large scale parameter estimation problems in reasonable computation times and with small hardware requirements. Further, the method includes self-tuning mechanisms which facilitate its use by non-experts. We believe that this new method can play a key role in the development of large-scale and even whole-cell dynamic models.
Institute of Scientific and Technical Information of China (English)
曾庆华; 孙世新; 陈天麒
2003-01-01
LogP is becoming a practical parallel computation model that meets the demands of parallel computers and parallel algorithms, so it is important to re-design parallel algorithms on the LogP model. This paper studies a parallel algorithm for computing the inverse of a matrix on the simplified LogP model and presents the simulation results.
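Under LogP, a point-to-point message arrives L + 2o after the send begins, and a processor can issue successive sends every max(o, g) time units; an efficient broadcast therefore has every informed processor keep forwarding. A greedy sketch of that schedule (an illustration of LogP cost accounting, not the paper's matrix-inversion algorithm):

```python
def logp_broadcast_time(P, L, o, g):
    """Completion time of a greedy LogP broadcast to P processors.
    Each node that holds the datum repeatedly forwards it: a send occupies
    the sender for max(o, g) and the message arrives L + 2*o after the
    send begins. Illustrative sketch of LogP cost accounting."""
    have = [0.0]     # times at which informed nodes become free to send
    finish = 0.0
    while len(have) < P:
        have.sort()
        t_send = have[0]                 # earliest-available informed node
        arrival = t_send + L + 2 * o     # new node informed at this time
        have[0] = t_send + max(o, g)     # sender busy until then
        have.append(arrival)
        finish = max(finish, arrival)
    return finish
```

For example, with (L, o, g) = (5, 2, 3), reaching a second processor takes L + 2o = 9 time units, and each additional round adds at most another message latency.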
Keppenne, Christian L.; Rienecker, Michele M.; Koblinsky, Chester (Technical Monitor)
2001-01-01
A multivariate ensemble Kalman filter (MvEnKF) implemented on a massively parallel computer architecture has been developed for the Poseidon ocean circulation model and tested with a Pacific Basin model configuration. There are about two million prognostic state-vector variables. Parallelism for the data assimilation step is achieved by regionalization of the background-error covariances that are calculated from the phase-space distribution of the ensemble. Each processing element (PE) collects elements of a matrix measurement functional from nearby PEs. To avoid the introduction of spurious long-range covariances associated with finite ensemble sizes, the background-error covariances are given compact support by means of a Hadamard (element-by-element) product with a three-dimensional canonical correlation function. The methodology and the MvEnKF configuration are discussed. It is shown that the regionalization of the background covariances has a negligible impact on the quality of the analyses. The parallel algorithm is very efficient for large numbers of observations but does not scale well beyond 100 PEs at the current model resolution. On a platform with distributed memory, memory rather than speed is the limiting factor.
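The Hadamard localization step can be sketched directly: the sample covariance built from ensemble anomalies is multiplied element-wise by a compactly supported correlation matrix, which zeroes out spurious long-range covariances. The taper below is a simple triangular function of 1-D distance, a stand-in for the canonical three-dimensional correlation function used in the paper:

```python
import numpy as np

def localized_covariance(ensemble, coords, cutoff):
    """Ensemble background-error covariance with compact support imposed
    by a Hadamard (element-wise) product.
    ensemble: (n_members, n_state) array; coords: (n_state,) 1-D positions.
    The triangular taper is illustrative, not the paper's correlation
    function."""
    anomalies = ensemble - ensemble.mean(axis=0)
    B = anomalies.T @ anomalies / (ensemble.shape[0] - 1)  # sample covariance
    d = np.abs(coords[:, None] - coords[None, :])          # pairwise distance
    taper = np.clip(1.0 - d / cutoff, 0.0, 1.0)            # compact support
    return B * taper   # Hadamard product: covariances vanish beyond cutoff
```

Because the taper is zero beyond the cutoff, each state variable only needs covariance entries from nearby points, which is what makes the regionalized parallel implementation possible.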
Institute of Scientific and Technical Information of China (English)
TAN Wenchang; XU Mingyu
2004-01-01
The fractional calculus approach in the constitutive relationship model of a generalized second grade fluid is introduced. Exact analytical solutions are obtained for a class of unsteady flows for the generalized second grade fluid with the fractional derivative model between two parallel plates by using the Laplace transform and Fourier transform for fractional calculus. The unsteady flows are generated by the impulsive motion or periodic oscillation of one of the plates. In addition, the solutions of the shear stresses at the plates are also determined.
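The fractional calculus approach rests on the Riemann-Liouville operator; for reference, its standard definition, together with the fractional shear-stress relation it induces (a sketch of the usual generalized second grade form, with material constant $\alpha_1$ and fractional order $\beta$; not necessarily the exact notation of the paper):

```latex
D_t^{\beta} f(t) = \frac{1}{\Gamma(1-\beta)}\,\frac{d}{dt}
  \int_0^t \frac{f(\tau)}{(t-\tau)^{\beta}}\, d\tau,
  \qquad 0 \le \beta < 1,
\qquad
\tau(y,t) = \left( \mu + \alpha_1 D_t^{\beta} \right)
  \frac{\partial u(y,t)}{\partial y}.
```

Setting $\beta \to 1$ recovers the ordinary second grade fluid, which is why the Laplace/Fourier transform machinery for fractional calculus yields the classical solutions as a limiting case.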
Parallel workflows for data-driven structural equation modeling in functional neuroimaging
Directory of Open Access Journals (Sweden)
Sarah Kenny
2009-10-01
We present a computational framework suitable for a data-driven approach to structural equation modeling (SEM) and describe several workflows for modeling functional magnetic resonance imaging (fMRI) data within this framework. The Computational Neuroscience Applications Research Infrastructure (CNARI) employs a high-level scripting language called Swift, which is capable of spawning hundreds of thousands of simultaneous R processes (R Core Development Team, 2008), consisting of self-contained structural equation models, on a high performance computing system (HPC). These self-contained R processing jobs are data objects generated by OpenMx, a plug-in for R, which can generate a single model object containing the matrices and algebraic information necessary to estimate parameters of the model. With such an infrastructure in place, a structural modeler may begin to investigate exhaustive searches of the model space. Specific applications of the infrastructure, statistics related to model fit, and limitations are discussed in relation to exhaustive SEM. In particular, we discuss how workflow management techniques can help to solve large computational problems in neuroimaging.
PARALLEL MEASUREMENT AND MODELING OF TRANSPORT IN THE DARHT II BEAMLINE ON ETA II
Energy Technology Data Exchange (ETDEWEB)
Chambers, F W; Raymond, B A; Falabella, S; Lee, B S; Richardson, R A; Weir, J T; Davis, H A; Schultze, M E
2005-05-31
To successfully tune the DARHT II transport beamline requires the close coupling of a model of the beam transport and the measurement of the beam observables as the beam conditions and magnet settings are varied. For the ETA II experiment using the DARHT II beamline components this was achieved using the SUICIDE (Simple User Interface Connecting to an Integrated Data Environment) data analysis environment and the FITS (Fully Integrated Transport Simulation) model. The SUICIDE environment has direct access to the experimental beam transport data at acquisition and the FITS predictions of the transport for immediate comparison. The FITS model is coupled into the control system where it can read magnet current settings for real time modeling. We find this integrated coupling is essential for model verification and the successful development of a tuning aid for the efficient convergence on a useable tune. We show the real time comparisons of simulation and experiment and explore the successes and limitations of this close coupled approach.
Energy Technology Data Exchange (ETDEWEB)
Colli, A.N. [Programa de Electroquimica Aplicada e Ingenieria Electroquimica (PRELINE), Facultad de Ingenieria Quimica, Universidad Nacional del Litoral, Santiago del Estero 2829, S3000AOM Santa Fe (Argentina); Bisang, J.M., E-mail: jbisang@fiq.unl.edu.ar [Programa de Electroquimica Aplicada e Ingenieria Electroquimica (PRELINE), Facultad de Ingenieria Quimica, Universidad Nacional del Litoral, Santiago del Estero 2829, S3000AOM Santa Fe (Argentina)
2011-08-30
Highlights: • The type of turbulence promoter has a strong influence on the hydrodynamics. • The dispersion model is appropriate for expanded plastic turbulence promoters. • The dispersion model is appropriate for glass bead turbulence promoters. Abstract: The hydrodynamic behaviour of electrochemical reactors with parallel plate electrodes is experimentally studied using the stimulus-response method, either with an empty reactor or with different turbulence promoters. Theoretical results in accordance with the analytical and numerical resolution of the dispersion model for a closed system are compared with the classical relationships for the normalized outlet concentration of open systems, and the validity range of the equations is discussed. The experimental results were well correlated with the dispersion model using glass beads or expanded plastic meshes as turbulence promoters, which showed the most advantageous performance. The Peclet number was higher than 63. The dispersion coefficient was found to increase linearly with flow velocity in these cases.
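The classical open-system relationship mentioned above has a closed form: the normalized outlet response of the axial dispersion model depends only on the Peclet number. A sketch of that formula (the closed-system counterpart used for the fits requires a series or numerical solution):

```python
import numpy as np

def rtd_dispersion_open(theta, Pe):
    """Normalized residence time distribution E(theta) of the axial
    dispersion model for an open-open vessel, theta = t / t_mean.
    This is the classical open-system relationship the closed-system
    solution is compared against."""
    theta = np.asarray(theta, dtype=float)
    return np.sqrt(Pe / (4.0 * np.pi * theta)) * np.exp(
        -Pe * (1.0 - theta) ** 2 / (4.0 * theta))
```

The curve integrates to one, has mean 1 + 2/Pe, and for Peclet numbers above about 63 (as reported here) it is already a narrow, nearly Gaussian peak around theta = 1, i.e. close to plug flow with superimposed dispersion.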
Tassi, E.; Sulem, P. L.; Passot, T.
2016-12-01
Reduced models are derived for a strongly magnetized collisionless plasma at scales which are large relative to the electron thermal gyroradius, in two asymptotic regimes. One corresponds to cold ions and the other to far sub-ion scales. By including the electron pressure dynamics, these models improve on Hall reduced magnetohydrodynamics (MHD) and the kinetic Alfvén wave model of Boldyrev et al. (Astrophys. J., vol. 777, 2013, p. 41), respectively. We show that the two models can be obtained either within the gyrofluid formalism of Brizard (Phys. Fluids, vol. 4, 1992, pp. 1213-1228) or as suitable weakly nonlinear limits of the finite Larmor radius (FLR)-Landau fluid model of Sulem and Passot (J. Plasma Phys., vol. 81, 2015, 325810103), which extends anisotropic Hall MHD by retaining low-frequency kinetic effects. It is noticeable that, at the far sub-ion scales, the simplifications originating from the gyroaveraging operators in the gyrofluid formalism and leading to subdominant ion velocity and temperature fluctuations correspond, at the level of the FLR-Landau fluid, to cancellation between hydrodynamic contributions and ion finite Larmor radius corrections. Energy conservation properties of the models are discussed and an explicit example of a closure relation leading to a model with a Hamiltonian structure is provided.
Sfakiotakis, Stelios; Vamvuka, Despina
2015-12-01
The pyrolysis of six waste biomass samples was studied and the fuels were kinetically evaluated. A modified independent parallel reactions (IPR) scheme and a distributed activation energy model (DAEM) were developed, and their validity was assessed and compared by checking their accuracy in fitting the experimental results as well as their prediction capability under different experimental conditions. The pyrolysis experiments were carried out in a thermogravimetric analyzer, and a fitting procedure based on least squares minimization was performed simultaneously at different experimental conditions. A modification of the IPR model, considering dependence of the pre-exponential factor on heating rate, was proved to give better fit results for the same number of tuned kinetic parameters compared to the known IPR model, and very good prediction results for stepwise experiments. The fit of the data calculated with the developed DAEM model to the experimental data was also proved to be very good.
Tang, Xiaolin; Yang, Wei; Hu, Xiaosong; Zhang, Dejiu
2017-02-01
In this study, based on our previous work, a novel simplified torsional vibration dynamic model is established to study the torsional vibration characteristics of a compound planetary hybrid propulsion system. The main frequencies of the hybrid driveline are determined. In contrast to vibration characteristics of the previous 16-degree of freedom model, the simplified model can be used to accurately describe the low-frequency vibration property of this hybrid powertrain. This study provides a basis for further vibration control of the hybrid powertrain during the process of engine start/stop.
Global SH-wave propagation in a 2D whole Moon model using the parallel hybrid PSM/FDM method
Jiang, Xianghua; Wang, Yanbin; Qin, Yanfang; Takenaka, Hiroshi
2015-06-01
We present numerical modeling of SH-wave propagation for the recently proposed whole Moon model and try to improve our understanding of lunar seismic wave propagation. We use a hybrid PSM/FDM method on staggered grids to solve the wave equations and implement the calculation on a parallel PC cluster to improve computing efficiency. Features of global SH-wave propagation are first discussed for a 100-km-deep shallow moonquake and a 900-km-deep deep moonquake, respectively. Effects of frequency range and lateral variation of crust thickness are then investigated with various models. Our synthetic waveforms are finally compared with observed Apollo data to show which features of wave propagation are reproduced by our models and which are not. Our numerical modeling shows that the low-velocity upper crust plays a significant role in the development of reverberating wave trains. Increasing frequency enhances the strength and duration of the reverberations. Surface multiples dominate the wavefields for the shallow event. Core-mantle reflections can be clearly identified for the deep event at low frequency. The layered whole Moon model and the low-velocity upper crust produce the reverberating wave trains following each phase, consistent with observation. However, a more realistic Moon model should be considered in order to explain the strong and slowly decaying scattering between the various phases seen in the observed data.
Environmental models are products of the computer architecture and software tools available at the time of development. Scientifically sound algorithms may persist in their original state even as system architectures and software development approaches evolve and progress. Dating...
Analysis and Modeling of Circulating Current in Two Parallel-Connected Inverters
DEFF Research Database (Denmark)
Maheshwari, Ram Krishan; Gohil, Ghanshyamsinh Vijaysinh; Bede, Lorand;
2015-01-01
Parallel-connected inverters are gaining attention for high power applications because of the limited power handling capability of the power modules. Moreover, the parallel-connected inverters may have low total harmonic distortion of the ac current if they are operated with the interleaved pulse...
Parallelization experience with four canonical econometric models using ParMitISEM
Baştürk, N.; Grassi, S.; Hoogerheide, L.; van Dijk, H.K.
2016-01-01
This paper presents the parallel computing implementation of the MitISEM algorithm, labeled Parallel MitISEM. The basic MitISEM algorithm, introduced by Hoogerheide et al. (2012), provides an automatic and flexible method to approximate a non-elliptical target density using adaptive mixtures of
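The core use of such a mixture approximation is as an importance-sampling proposal for a non-elliptical target kernel. A hedged 1-D sketch (MitISEM itself adapts multivariate Student-t mixtures via EM; the fixed mixture below only illustrates the weighting step):

```python
import numpy as np
from math import gamma, sqrt, pi

def t_pdf(x, df, loc, scale):
    """Density of a univariate Student-t with location/scale parameters."""
    c = gamma((df + 1) / 2) / (gamma(df / 2) * sqrt(df * pi) * scale)
    z = (x - loc) / scale
    return c * (1 + z ** 2 / df) ** (-(df + 1) / 2)

def mixture_is_mean(target_logkernel, weights, locs, scales, df, n=20000, seed=1):
    """Self-normalized importance-sampling estimate of the target mean,
    using a fixed mixture of Student-t densities as proposal. Illustrative
    1-D stand-in for the adaptive mixtures MitISEM constructs."""
    rng = np.random.default_rng(seed)
    locs, scales = np.asarray(locs), np.asarray(scales)
    comp = rng.choice(len(weights), size=n, p=weights)   # pick components
    x = locs[comp] + scales[comp] * rng.standard_t(df, size=n)
    q = sum(w * t_pdf(x, df, m, s)                        # mixture density
            for w, m, s in zip(weights, locs, scales))
    lw = target_logkernel(x) - np.log(q)                  # log importance weights
    w = np.exp(lw - lw.max())
    return float(np.sum(w * x) / np.sum(w))
```

Because only a kernel of the target is needed, the same weighting works for posterior densities known up to a normalizing constant, which is the setting of the econometric applications.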
Parallelization experience with four canonical econometric models using ParMitISEM
Bastürk, Nalan; Grassi, S.; Hoogerheide, L.; van Dijk, Herman K.
2016-01-01
This paper presents the parallel computing implementation of the MitISEM algorithm, labeled Parallel MitISEM. The basic MitISEM algorithm provides an automatic and flexible method to approximate a non-elliptical target density using adaptive mixtures of Student-t densities, where only a kernel of
Parallelization Experience with Four Canonical Econometric Models Using ParMitISEM
N. Basturk (Nalan); S. Grassi (Stefano); L.F. Hoogerheide (Lennart); H.K. van Dijk (Herman)
2016-01-01
This paper presents the parallel computing implementation of the MitISEM algorithm, labeled Parallel MitISEM. The basic MitISEM algorithm, introduced by Hoogerheide, Opschoor and Van Dijk (2012), provides an automatic and flexible method to approximate a non-elliptical target density
Some approaches for modeling and analysis of a parallel mechanism with stewart platform architecture
Energy Technology Data Exchange (ETDEWEB)
V. De Sapio
1998-05-01
Parallel mechanisms represent a family of devices based on a closed kinematic architecture. This is in contrast to serial mechanisms, which are comprised of a chain-like series of joints and links in an open kinematic architecture. The closed architecture of parallel mechanisms offers certain benefits and disadvantages.
Modeling a Million-Node Slim Fly Network Using Parallel Discrete-Event Simulation
Energy Technology Data Exchange (ETDEWEB)
Wolfe, Noah; Carothers, Christopher; Mubarak, Misbah; Ross, Robert; Carns, Philip
2016-05-15
As supercomputers close in on exascale performance, the increased number of processors and processing power translates to an increased demand on the underlying network interconnect. The Slim Fly network topology, a new low-diameter and low-latency interconnection network, is gaining interest as one possible solution for next-generation supercomputing interconnect systems. In this paper, we present a high-fidelity Slim Fly flit-level model leveraging the Rensselaer Optimistic Simulation System (ROSS) and Co-Design of Exascale Storage (CODES) frameworks. We validate our Slim Fly model against the Slim Fly model results of Kathareios et al. at moderately sized network scales. We further scale the model up to an unprecedented 1 million compute nodes; and through visualization of network simulation metrics such as link bandwidth, packet latency, and port occupancy, we gain insight into the network behavior at the million-node scale. We also show linear strong scaling of the Slim Fly model on an Intel cluster, achieving a peak event rate of 36 million events per second using 128 MPI tasks to process 7 billion events. Detailed analysis of the underlying discrete-event simulation performance shows that a million-node Slim Fly model simulation can execute in 198 seconds on the Intel cluster.
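At its heart, such a model is a discrete-event simulation: a time-ordered event queue whose handlers schedule future events. A toy serial core (ROSS executes this same pattern optimistically across MPI ranks; the two-hop "network" here is purely illustrative):

```python
import heapq

def simulate_packets(n_packets, hop_delay=2.0, inject_gap=1.0):
    """Minimal discrete-event core: packets injected at a source traverse
    one link and are delivered; each hop is a future event on a
    time-ordered queue. A toy stand-in for flit-level network models,
    which process billions of such events."""
    events = []   # entries: (time, seq, kind, packet_id); seq breaks ties
    seq = 0
    for i in range(n_packets):
        heapq.heappush(events, (i * inject_gap, seq, "hop1", i))
        seq += 1
    delivered = {}
    processed = 0
    while events:
        t, _, kind, pid = heapq.heappop(events)   # advance to next event
        processed += 1
        if kind == "hop1":                        # schedule the delivery
            heapq.heappush(events, (t + hop_delay, seq, "deliver", pid))
            seq += 1
        else:
            delivered[pid] = t
    return processed, delivered
```

Parallel discrete-event engines distribute this queue across processors and, in the optimistic style, roll back any handler that turns out to have run ahead of a causally earlier event.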
Optimality models of phage life history and parallels in disease evolution.
Bull, J J
2006-08-21
Optimality models constitute one of the simplest approaches to understanding phenotypic evolution. Yet they have shortcomings that are not easily evaluated in most organisms. Most importantly, the genetic basis of phenotype evolution is almost never understood, and phenotypic selection experiments are rarely possible. Both limitations can be overcome with bacteriophages. However, phages have such elementary life histories that few phenotypes seem appropriate for optimality approaches. Here we develop optimality models of two phage life history traits, lysis time and host range. The lysis time models show that the optimum is less sensitive to differences in host density than suggested by earlier analytical work. Host range evolution is approached from the perspective of whether the virus should avoid particular hosts, and the results match optimal foraging theory: there is an optimal "diet" in which host types are either strictly included or excluded, depending on their infection qualities. Experimental tests of both models are feasible, and phages provide concrete illustrations of many ways that optimality models can guide understanding and explanation. Phage genetic systems already support the perspective that lysis time and host range can evolve readily and evolve without greatly affecting other traits, one of the main tenets of optimality theory. The models can be extended to more general properties of infection, such as the evolution of virulence and tissue tropism.
Directory of Open Access Journals (Sweden)
Hua Wang
2008-01-01
Full Text Available With the movement of magnetic resonance imaging (MRI) technology towards higher field (and therefore frequency) systems, the interaction of the fields generated by the system with patients, healthcare workers, and internally within the system is attracting more attention. Due to the complexity of the interactions, computational modeling plays an essential role in the analysis, design, and development of modern MRI systems. As a result of the large computational scale associated with most of the MRI models, numerical schemes that rely on a single computer processing unit often require a significant amount of memory and long computational times, which makes modeling of these problems quite inefficient. This paper presents dedicated message passing interface (MPI) and OpenMP parallel computing solvers for finite-difference time-domain (FDTD) and quasistatic finite-difference (QSFD) schemes. The FDTD and QSFD methods have been widely used to model/analyze the induction of electric fields/currents in voxel phantoms and MRI system components at high and low frequencies, respectively. The power of the optimized parallel computing architectures is illustrated by distinct, large-scale field calculation problems and shows significant computational advantages over conventional single processing platforms.
Gart, Natalie; Zamora, Irina; Williams, Marian E
2016-07-01
Therapeutic Assessment (TA; S.E. Finn & M.E. Tonsager, 1997; J.D. Smith, 2010) is a collaborative, semistructured model that encourages self-discovery and meaning-making through the use of assessment as an intervention approach. This model shares core strategies with infant mental health assessment, including close collaboration with parents and caregivers, active participation of the family, a focus on developing new family stories and increasing parents' understanding of their child, and reducing isolation and increasing hope through the assessment process. The intersection of these two theoretical approaches is explored, using case studies of three infants/young children and their families to illustrate the application of TA to infant mental health. The case of an 18-month-old girl whose parents fear that she has bipolar disorder illustrates the core principles of the TA model, highlighting the use of assessment intervention sessions and the clinical approach to preparing assessment feedback. The second case follows an infant with a rare genetic syndrome from ages 2 to 24 months, focusing on the assessor-parent relationship and the importance of a developmental perspective. Finally, assessment of a 3-year-old boy illustrates the development and use of a fable as a tool to provide feedback to a young child about assessment findings and recommendations. © 2016 Michigan Association for Infant Mental Health.
Bazhenov, V. G.; Bragov, A. M.; Konstantinov, A. Yu.; Kotov, V. L.
2015-05-01
This paper presents an analysis of the accuracy of known and new modeling methods using the hypothesis of local and plane sections for solution of problems of the impact and plane-parallel motion of conical bodies at an angle to the free surface of the half-space occupied by elastoplastic soil. The parameters of the local interaction model that is quadratic in velocity are determined by solving the one-dimensional problem of the expansion of a spherical cavity. Axisymmetric problems for each meridional section are solved simultaneously, neglecting mass and momentum transfer in the circumferential direction and using an approach based on the hypothesis of plane sections. The dynamic and kinematic parameters of oblique penetration obtained using modified models are compared with the results of computer simulation in a three-dimensional formulation. The results obtained for the contact stress distribution along the generator of the pointed cone are in satisfactory agreement.
Bisetti, Fabrizio; Attili, Antonio; Pitsch, Heinz
2014-08-13
Combustion of fossil fuels is likely to continue for the near future due to the growing trends in energy consumption worldwide. The increase in efficiency and the reduction of pollutant emissions from combustion devices are pivotal to achieving meaningful levels of carbon abatement as part of the ongoing climate change efforts. Computational fluid dynamics featuring adequate combustion models will play an increasingly important role in the design of more efficient and cleaner industrial burners, internal combustion engines, and combustors for stationary power generation and aircraft propulsion. Today, turbulent combustion modelling is hindered severely by the lack of data that are accurate and sufficiently complete to assess and remedy model deficiencies effectively. In particular, the formation of pollutants is a complex, nonlinear and multi-scale process characterized by the interaction of molecular and turbulent mixing with a multitude of chemical reactions with disparate time scales. The use of direct numerical simulation (DNS) featuring a state of the art description of the underlying chemistry and physical processes has contributed greatly to combustion model development in recent years. In this paper, the analysis of the intricate evolution of soot formation in turbulent flames demonstrates how DNS databases are used to illuminate relevant physico-chemical mechanisms and to identify modelling needs.
Energy Technology Data Exchange (ETDEWEB)
Littlefield, R.J.; Maschhoff, K.J.
1991-04-01
Many linear algebra algorithms utilize an array of processors across which matrices are distributed. Given a particular matrix size and a maximum number of processors, what configuration of processors, i.e., what size and shape of array, will execute the fastest? The answer to this question depends on tradeoffs between load balancing, communication startup and transfer costs, and computational overhead. In this paper we analyze in detail one algorithm: the blocked factored Jacobi method for solving dense eigensystems. A performance model is developed to predict execution time as a function of the processor array and matrix sizes, plus the basic computation and communication speeds of the underlying computer system. In experiments on a large hypercube (up to 512 processors), this model has been found to be highly accurate (mean error of approximately 2%) over a wide range of matrix sizes (10 × 10 through 200 × 200) and processor counts (1 to 512). The model reveals, and direct experiment confirms, that the tradeoffs mentioned above can be surprisingly complex and counterintuitive. We propose decision procedures based directly on the performance model to choose configurations for fastest execution. The model-based decision procedures are compared to a heuristic strategy and shown to be significantly better. 7 refs., 8 figs., 1 tab.
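A model-based decision procedure of the kind described can be sketched generically: enumerate feasible processor-array shapes, evaluate a predicted execution time for each, and pick the minimum. The cost terms and constants below are illustrative assumptions, not the paper's calibrated Jacobi model:

```python
def predict_time(n, pr, pc, flop_rate=1e8, alpha=1e-4, beta=1e-7):
    """Toy execution-time model for an n-by-n matrix on a pr-by-pc array:
    ideal parallel compute plus per-message startup and per-word transfer."""
    p = pr * pc
    compute = (2.0 * n ** 3 / flop_rate) / p   # balanced compute term
    messages = pr + pc                         # row/column exchanges per step
    words = n * n / p                          # data moved per processor
    return compute + alpha * messages + beta * words

def best_configuration(n, max_procs):
    """Enumerate feasible pr x pc arrays and pick the fastest predicted one."""
    configs = [(pr, pc) for pr in range(1, max_procs + 1)
               for pc in range(1, max_procs // pr + 1)]
    return min(configs, key=lambda c: predict_time(n, *c))
```

Even this toy model reproduces the qualitative tradeoff: large matrices favor using all processors in a near-square array, while tiny matrices are fastest on a single processor because message startup dominates.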
Shen, Bo-Wen; Cheung, Samson; Li, Jui-Lin F.; Wu, Yu-ling
2013-01-01
In this study, we discuss the performance of the parallel ensemble empirical mode decomposition (EMD) in the analysis of tropical waves that are associated with tropical cyclone (TC) formation. To efficiently analyze high-resolution, global, multiple-dimensional data sets, we first implement multilevel parallelism into the ensemble EMD (EEMD) and obtain a parallel speedup of 720 using 200 eight-core processors. We then apply the parallel EEMD (PEEMD) to extract the intrinsic mode functions (IMFs) from preselected data sets that represent (1) idealized tropical waves and (2) large-scale environmental flows associated with Hurricane Sandy (2012). Results indicate that the PEEMD is efficient and effective in revealing the major wave characteristics of the data, such as wavelengths and periods, by sifting out the dominant (wave) components. This approach has potential for hurricane climate studies that examine the statistical relationship between tropical waves and TC formation.
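The outermost level of parallelism in EEMD is embarrassingly parallel: each noise-added copy of the signal is decomposed independently and the resulting IMFs are averaged. A structural sketch of that driver (the `emd` routine here is a stand-in placeholder, not the actual sifting algorithm or the authors' implementation):

```python
import numpy as np
from concurrent.futures import ProcessPoolExecutor

def emd(signal):
    """Placeholder decomposition: a real EEMD would sift the signal into
    several intrinsic mode functions; here we return it as one 'IMF'."""
    return np.array([signal])

def _one_member(args):
    # Decompose one noise-perturbed copy with its own random seed.
    signal, noise_std, seed = args
    rng = np.random.default_rng(seed)
    return emd(signal + rng.normal(0.0, noise_std, signal.shape))

def parallel_eemd(signal, ensemble_size=8, noise_std=0.1, workers=2):
    """Average IMFs over independently decomposed noise-added copies;
    the added white noise averages out across the ensemble."""
    jobs = [(signal, noise_std, s) for s in range(ensemble_size)]
    with ProcessPoolExecutor(max_workers=workers) as pool:
        imf_sets = list(pool.map(_one_member, jobs))
    return sum(imf_sets) / ensemble_size
```

The reported multilevel parallelism would add further parallelism inside each decomposition; this sketch shows only the ensemble level.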
Execution Model of Three Parallel Languages: OpenMP, UPC and CAF
Directory of Open Access Journals (Sweden)
Ami Marowka
2005-01-01
Full Text Available The aim of this paper is to present a qualitative evaluation of three state-of-the-art parallel languages: OpenMP, Unified Parallel C (UPC), and Co-Array Fortran (CAF). OpenMP and UPC are explicit parallel programming languages based on the ANSI standard, while CAF is an implicit programming language. OpenMP is designed for shared-memory architectures and extends the base language with compiler directives that annotate the original source code, whereas UPC and CAF are designed for distributed shared-memory architectures and extend the base language with new parallel constructs. We deconstruct each language into its basic components, show examples, make a detailed analysis, compare them, and finally draw some conclusions.
The Multiproduct Parallel Assembly Lines Balancing Problem: Model and Optimization Procedure
Armin Scholl; Nils Boysen
2008-01-01
A production system which consists of a number of parallel assembly lines is considered. On each line a certain product is manufactured observing a common cycle time. By arranging the lines in a favourable manner, it is possible to increase efficiency of the production system by combining stations of neighbouring lines when balancing them. The objective is to minimize the number of operators required. This problem is called Multiproduct Parallel Assembly Lines Balancing Problem (MPALBP) and h...
Energy Technology Data Exchange (ETDEWEB)
Cwik, T.; Jamnejad, V.; Zuffada, C. [California Institute of Technology, Pasadena, CA (United States)
1994-12-31
The usefulness of finite element modeling follows from the ability to accurately simulate the geometry and three-dimensional fields on the scale of a fraction of a wavelength. To make this modeling practical for engineering design, it is necessary to integrate the stages of geometry modeling and mesh generation, numerical solution of the fields, a stage heavily dependent on the efficient use of a sparse matrix equation solver, and display of field information. The stages of geometry modeling, mesh generation, and field display are commonly completed using commercially available software packages. Algorithms for the numerical solution of the fields need to be written for the specific class of problems considered. Interior problems, i.e. simulating fields in waveguides and cavities, have been successfully solved using finite element methods. Exterior problems, i.e. simulating fields scattered or radiated from structures, are more difficult to model because of the need to numerically truncate the finite element mesh. To practically compute a solution to exterior problems, the domain must be truncated at some finite surface where the Sommerfeld radiation condition is enforced, either approximately or exactly. Approximate methods attempt to truncate the mesh using only local field information at each grid point, whereas exact methods are global, needing information from the entire mesh boundary. In this work, a method that couples three-dimensional finite element (FE) solutions interior to the bounding surface, with an efficient integral equation (IE) solution that exactly enforces the Sommerfeld radiation condition, is developed. The bounding surface is taken to be a surface of revolution (SOR) to greatly reduce computational expense in the IE portion of the modeling.
Fang, Ye; Feng, Sheng; Tam, Ka-Ming; Yun, Zhifeng; Moreno, Juana; Ramanujam, J.; Jarrell, Mark
2014-10-01
Monte Carlo simulations of the Ising model play an important role in the field of computational statistical physics, and they have revealed many properties of the model over the past few decades. However, the effect of frustration due to random disorder, in particular the possible spin glass phase, remains a crucial but poorly understood problem. One of the obstacles in the Monte Carlo simulation of random frustrated systems is their long relaxation time making an efficient parallel implementation on state-of-the-art computation platforms highly desirable. The Graphics Processing Unit (GPU) is such a platform that provides an opportunity to significantly enhance the computational performance and thus gain new insight into this problem. In this paper, we present optimization and tuning approaches for the CUDA implementation of the spin glass simulation on GPUs. We discuss the integration of various design alternatives, such as GPU kernel construction with minimal communication, memory tiling, and look-up tables. We present a binary data format, Compact Asynchronous Multispin Coding (CAMSC), which provides an additional 28.4% speedup compared with the traditionally used Asynchronous Multispin Coding (AMSC). Our overall design sustains a performance of 33.5 ps per spin flip attempt for simulating the three-dimensional Edwards-Anderson model with parallel tempering, which significantly improves the performance over existing GPU implementations.
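The core idea behind multispin coding is packing one spin from each of many independent replicas into the bits of a machine word, so that bond quantities for all replicas come out of single bitwise operations. A minimal illustration of the packing idea for a 1D Ising chain (a sketch only, not the paper's CAMSC format or its GPU kernels):

```python
import numpy as np

def chain_bond_antialign(spins):
    """spins: uint64 array, one word per site of a 1D chain; bit k of word i
    is the spin (0/1 for down/up) of site i in replica k. XOR of neighbouring
    words sets bit k exactly when that bond is anti-aligned in replica k."""
    return spins[:-1] ^ spins[1:]

def energy_per_replica(spins, n_replicas):
    """Ising chain energy E = -sum_i s_i s_{i+1} for every replica at once:
    each anti-aligned bond contributes +1, each aligned bond -1."""
    anti = chain_bond_antialign(spins)
    counts = np.zeros(n_replicas, dtype=np.int64)
    for word in anti:
        for k in range(n_replicas):
            counts[k] += (int(word) >> k) & 1   # popcount per bit lane
    n_bonds = len(spins) - 1
    return 2 * counts - n_bonds  # E_k = (#anti) - (#aligned)
```

A full multispin Metropolis update additionally needs per-lane acceptance logic (e.g. via look-up tables, as the abstract mentions); the XOR step above is the part that makes 64 replicas cost one operation.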
Directory of Open Access Journals (Sweden)
Maryam Banitalebi Dehkordi
2012-11-01
Full Text Available This paper presents the modelling and experimental evaluation of the gravity compensation of a horizontal 3-UPU parallel mechanism. The conventional Newton-Euler method for static analysis and balancing of mechanisms works for serial robots; however, it can become computationally expensive when applied to the analysis of parallel manipulators. To overcome this difficulty, in this paper we propose an approach, based on a Lagrangian method, that is more efficient in terms of computation time. The derivation of the gravity compensation model is based on the analytical computation of the total potential energy of the system at each position of the end-effector. In order to satisfy the gravity compensation condition, the total potential energy of the system should remain constant for all of the manipulator's configurations. Analytical and mechanical gravity compensation is taken into account, and the set of conditions and the system of springs are defined. Finally, employing a virtual reality environment, some experiments are carried out and the reliability and feasibility of the proposed model are evaluated in the presence and absence of the elastic components.
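The gravity-compensation condition, that total potential energy stays constant over the workspace, can be checked numerically on the textbook one-DOF case: a pendulum statically balanced by a zero-free-length spring anchored a height h above the pivot. This simple example is our illustration of the condition, not the paper's 3-UPU model:

```python
import math

def total_potential(theta, m=2.0, L=0.5, h=0.3, g=9.81):
    """Pendulum of mass m at distance L from the pivot (theta measured from
    the upward vertical), zero-free-length spring of stiffness k = m*g/h
    anchored at height h above the pivot. The gravity term and the spring
    term both vary as cos(theta) and cancel, so V is constant."""
    k = m * g / h                        # static balancing condition
    v_gravity = m * g * L * math.cos(theta)
    spring_len_sq = L**2 + h**2 - 2 * L * h * math.cos(theta)
    v_spring = 0.5 * k * spring_len_sq   # zero free length: V = k*l^2/2
    return v_gravity + v_spring
```

Expanding the spring term gives V = ½k(L² + h²) + (mgL − kLh)cos θ, so choosing k = mg/h kills the configuration-dependent part, which is exactly the constant-potential-energy condition the abstract imposes.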
Liu, Xunliang; Lou, Guofeng; Wen, Zhi
A non-isothermal, steady-state, three-dimensional (3D), two-phase, multicomponent transport model is developed for a proton exchange membrane (PEM) fuel cell with parallel gas distributors. A key feature of this work is that a detailed membrane model is developed for the liquid water transport with a two-mode water transfer condition, accounting for the non-equilibrium humidification of the membrane and replacing the usual equilibrium assumption. Another key feature is that the water transport processes inside the electrodes are coupled and the balance of water flux between anode and cathode is ensured during the modeling. The model is validated by comparison of the predicted cell polarization curve with experimental data. The simulation is performed for the water vapor concentration field of reactant gases, the water content distribution in the membrane, the liquid water velocity field and the liquid water saturation distribution inside the cathode. The net water flux and net water transport coefficient values are obtained at different current densities in this work, which are seldom discussed in other modeling works. The temperature distribution inside the cell is also simulated by this model.
Bifurcation analysis of 3D ocean flows using a parallel fully-implicit ocean model
Thies, J.; Wubs, F.W.; Dijkstra, H.A.
2009-01-01
To understand the physics and dynamics of the ocean circulation, techniques of numerical bifurcation theory such as continuation methods have proved to be useful. Up to now these techniques have been applied to models with relatively few degrees of freedom such as multi-layer quasi-geostrophic and s
Detailed numerical modeling of a linear parallel-plate Active Magnetic Regenerator
DEFF Research Database (Denmark)
Nielsen, Kaspar Kirstein; Bahl, Christian Robert Haffenden; Smith, Anders;
2009-01-01
in the spatially not-resolved direction. The implementation of the magnetocaloric effect (MCE) is made possible through a source term in the heat equation for the magnetocaloric material (MCM). This adds the possibility to model a continuously varying magnetic field. The adiabatic temperature change of the used...
Modelling and Control of Inverse Dynamics for a 5-DOF Parallel Kinematic Polishing Machine
Directory of Open Access Journals (Sweden)
Weiyang Lin
2013-08-01
A mixed H2/H∞ control method is presented and investigated in order to track the error control of the inverse dynamic model; the simulation results from different conditions show that the mixed H2/H∞ control method could achieve an optimal and robust control performance. This work shows that the presented PKPM has a higher dynamic performance than conventional machine tools.
Toward an animal model for antisocial behavior : parallels between mice and humans
Sluyter, F; Arseneault, L; Moffitt, TE; Veenema, AH; de Boer, S; Koolhaas, JM
The goal of this article is to examine whether mouse lines genetically selected for short and long attack latencies are good animal models for antisocial behavior in humans. To this end, we compared male Short and Long Attack Latency mice (SAL and LAL, respectively) with the extremes of the Dunedin
Modeling and Control of a Parallel Waste Heat Recovery System for Euro-VI Heavy-Duty Diesel Engines
Directory of Open Access Journals (Sweden)
Emanuel Feru
2014-10-01
Full Text Available This paper presents the modeling and control of a waste heat recovery system for a Euro-VI heavy-duty truck engine. The considered waste heat recovery system consists of two parallel evaporators with expander and pumps mechanically coupled to the engine crankshaft. Compared to previous work, the waste heat recovery system modeling is improved by including evaporator models that combine the finite difference modeling approach with a moving boundary one. Over a specific cycle, the steady-state and dynamic temperature prediction accuracy improved on average by 2% and 7%. From a control design perspective, the objective is to maximize the waste heat recovery system output power. However, for safe system operation, the vapor state needs to be maintained before the expander under highly dynamic engine disturbances. To achieve this, a switching model predictive control strategy is developed. The proposed control strategy performance is demonstrated using the high-fidelity waste heat recovery system model subject to measured disturbances from a Euro-VI heavy-duty diesel engine. Simulations are performed using a cold-start World Harmonized Transient cycle that covers typical urban, rural and highway driving conditions. The model predictive control strategy provides 15% more time in vapor and recovered thermal energy than a classical proportional-integral (PI) control strategy. In the case that the model is accurately known, the proposed control strategy performance can be improved by 10% in terms of time in vapor and recovered thermal energy. This is demonstrated with an offline nonlinear model predictive control strategy.
Institute of Scientific and Technical Information of China (English)
Anonymous
2002-01-01
This paper presents an error modeling methodology that enables the tolerance design, assembly and kinematic calibration of a class of 3-DOF parallel kinematic machines with parallelogram struts to be integrated into a unified framework. The error mapping function is formulated to identify the source errors affecting the uncompensable pose error. The sensitivity analysis in the sense of statistics is also carried out to investigate the influences of source errors on the pose accuracy. An assembly process that can effectively minimize the uncompensable pose error is proposed as one of the results of this investigation.
Rajagopalan, J.; Xing, K.; Guo, Y.; Lee, F. C.; Manners, Bruce
1996-01-01
A simple, application-oriented, transfer function model of paralleled converters employing Master-Slave Current-sharing (MSC) control is developed. Dynamically, the Master converter retains its original design characteristics; all the Slave converters are forced to depart significantly from their original design characteristics into current-controlled current sources. Five distinct loop gains to assess system stability and performance are identified and their physical significance is described. A design methodology for the current share compensator is presented. The effect of this current sharing scheme on 'system output impedance' is analyzed.
Massively Parallel Linear Stability Analysis with P_ARPACK for 3D Fluid Flow Modeled with MPSalsa
Energy Technology Data Exchange (ETDEWEB)
Lehoucq, R.B.; Salinger, A.G.
1998-10-13
We are interested in the stability of three-dimensional fluid flows to small disturbances. One computational approach is to solve a sequence of large sparse generalized eigenvalue problems for the leading modes that arise from discretizing the differential equations modeling the flow. The modes of interest are the eigenvalues of largest real part and their associated eigenvectors. We discuss our work to develop an efficient and reliable eigensolver for use by the massively parallel simulation code MPSalsa. MPSalsa allows simulation of complex 3D fluid flow, heat transfer, and mass transfer with detailed bulk fluid and surface chemical reaction kinetics.
Energy Technology Data Exchange (ETDEWEB)
Finsterle, S.; Pruess, K.
1999-01-01
ITOUGH2 is an optimization code that allows estimation of any input parameter of the nonisothermal, multiphase flow simulator TOUGH2. ITOUGH2 inversions are computationally intensive because the so-called forward problem, i.e., the simulation of fluid and heat flow through the geologic formation, must be solved many times for different parameter combinations to evaluate the misfit criterion or to numerically calculate sensitivity coefficients. Most of these forward runs are independent from each other and can therefore be performed in parallel. Message passing based on the Parallel Virtual Machine (PVM) system has been implemented into ITOUGH2 to enable parallel processing of forward simulations on a heterogeneous network of Unix workstations or networked PCs that run under the Linux operating system. This paper describes the PVM system and its implementation into ITOUGH2. Examples are discussed, demonstrating the use, efficiency, and limitations of ITOUGH2-PVM.
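The structure ITOUGH2-PVM exploits, many independent forward simulations evaluated for different parameter sets, maps directly onto any task-parallel worker pool. A generic sketch (the quadratic `forward_model` and the observation values are illustrative stand-ins, not a TOUGH2 simulation):

```python
from concurrent.futures import ProcessPoolExecutor

OBSERVED = [1.0, 4.0, 9.0, 16.0]   # hypothetical field observations

def forward_model(params):
    """Stand-in for one expensive forward simulation: predict the
    observations from a parameter vector (a, b) via y = a*t^2 + b."""
    a, b = params
    return [a * t * t + b for t in (1, 2, 3, 4)]

def misfit(params):
    """Sum-of-squares misfit criterion for one parameter combination."""
    pred = forward_model(params)
    return sum((p - o) ** 2 for p, o in zip(pred, OBSERVED))

def evaluate_in_parallel(param_sets, workers=2):
    """The forward runs are independent, so evaluate them concurrently."""
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(misfit, param_sets))
```

Finite-difference sensitivity coefficients fit the same pattern: the perturbed forward runs for each parameter are likewise independent jobs.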
Energy Technology Data Exchange (ETDEWEB)
Moura, Fabricio A.M.; Camacho, Jose R. [Universidade Federal de Uberlandia, School of Electrical Engineering, Rural Electricity and Alternative Sources Lab, PO Box 593, 38400.902 Uberlandia, MG (Brazil); Chaves, Marcelo L.R.; Guimaraes, Geraldo C. [Universidade Federal de Uberlandia, School of Electrical Engineering, Power Systems Dynamics Group, PO Box: 593, 38400.902 Uberlandia, MG (Brazil)
2010-02-15
The main task in this paper is to present a performance analysis of a distribution network in the presence of an independent power producer (IP) synchronous generator with its speed governor and voltage regulator modeled using TACS (Transient Analysis of Control Systems) for distributed generation studies. Regulators were implemented through their transfer functions in the S domain. However, since ATP-EMTP (Electromagnetic Transient Program) works in the time domain, a discretization is necessary to return the TACS output to the time domain. It must be highlighted that this generator is driven by a steam turbine, and the whole system with regulators and the equivalent of the power authority system at the common coupling point (CCP) are modeled in the ATP-EMTP (Alternative Transients Program). (author)
Energy Technology Data Exchange (ETDEWEB)
Duan, Nan [ORNL; Dimitrovski, Aleksandar D [ORNL; Simunovic, Srdjan [ORNL; Sun, Kai [University of Tennessee (UT)
2016-01-01
The development of high-performance computing techniques and platforms has provided many opportunities for real-time or even faster-than-real-time implementation of power system simulations. One approach uses the Parareal in time framework. The Parareal algorithm has shown promising theoretical simulation speedups by temporally decomposing a simulation run into a coarse simulation on the entire simulation interval and fine simulations on sequential sub-intervals linked through the coarse simulation. However, it has been found that the time cost of the coarse solver needs to be reduced to fully exploit the potential of the Parareal algorithm. This paper studies a Parareal implementation using reduced generator models for the coarse solver and reports the testing results on the IEEE 39-bus system and a 327-generator 2383-bus Polish system model.
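The Parareal scheme described, a cheap coarse propagator correcting expensive fine propagators run in parallel on sub-intervals, can be sketched on a scalar ODE. The forward-Euler coarse and fine solvers here are our illustrative choice, not the reduced generator models of the paper:

```python
def euler(f, y0, t0, t1, n_steps):
    """Fixed-step forward Euler integration of y' = f(t, y) from t0 to t1."""
    y, t, dt = y0, t0, (t1 - t0) / n_steps
    for _ in range(n_steps):
        y, t = y + dt * f(t, y), t + dt
    return y

def parareal(f, y0, t0, t1, n_sub=10, iters=5, coarse=1, fine=100):
    """Parareal iteration: u_{j+1} = G(u_j^new) + F(u_j^old) - G(u_j^old).
    The F (fine) evaluations are independent per sub-interval, which is
    where a real implementation runs in parallel."""
    ts = [t0 + (t1 - t0) * j / n_sub for j in range(n_sub + 1)]
    u = [y0]
    for j in range(n_sub):                      # initial coarse sweep
        u.append(euler(f, u[-1], ts[j], ts[j + 1], coarse))
    for _ in range(iters):
        fine_vals = [euler(f, u[j], ts[j], ts[j + 1], fine)
                     for j in range(n_sub)]     # parallelizable stage
        new_u = [y0]
        for j in range(n_sub):                  # sequential coarse correction
            g_new = euler(f, new_u[-1], ts[j], ts[j + 1], coarse)
            g_old = euler(f, u[j], ts[j], ts[j + 1], coarse)
            new_u.append(g_new + fine_vals[j] - g_old)
        u = new_u
    return u[-1]
```

The sequential correction sweep is exactly the coarse-solver cost the paper targets with reduced generator models: shrinking G's runtime shortens the serial fraction of every iteration.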
2014-11-01
Army position unless so designated by other authorized documents. Citation of manufacturer's or trade names does not constitute an official...high-performance computing (HPC) and overall data longevity. The Battlefield Environment Division Modeling Framework (BMF) v0.90 was developed for the...use of object-oriented programming (OOP) design. Here we extend BMF to include IO functionality for serial and distributed compute configurations. The
Modeling Large Scale Circuits Using Massively Parallel Descrete-Event Simulation
2013-06-01
in VHDL and Verilog. Using the Synopsys Design Compiler and scripts provided by the OpenSPARC code base, we were able to generate gate level...efficiently used in a simulation model. This process is described in Figure 1. 3.2.1 Source. The OpenSPARC T2 design is provided in Verilog Register Transfer...one flat netlist. This file format is still completely valid Verilog code. The module is defined with connection arguments and the netlist of its
Use of massively parallel computing to improve modelling accuracy within the nuclear sector
Directory of Open Access Journals (Sweden)
L M Evans
2016-06-01
This work presents recent advancements in three techniques: uncertainty quantification (UQ), cellular automata finite element (CAFE), and image-based finite element methods (IBFEM). Case studies are presented demonstrating their suitability for use in nuclear engineering, made possible by advancements in parallel computing hardware that is projected to be available to industry within the next decade at a cost of the order of $100k.
Parallel shooting methods for finding steady state solutions to engine simulation models
DEFF Research Database (Denmark)
Andersen, Stig Kildegård; Thomsen, Per Grove; Carlsen, Henrik
2007-01-01
as test case. A parallel speedup factor of 23 on 33 processors was achieved with multiple shooting. But fast transients at the beginnings of sub intervals caused significant overhead for the multiple shooting methods and limited the best speedup to 3.8 relative to the fastest sequential method: Single...
Parallel Process and Isomorphism: A Model for Decision Making in the Supervisory Triad
Koltz, Rebecca L.; Odegard, Melissa A.; Feit, Stephen S.; Provost, Kent; Smith, Travis
2012-01-01
Parallel process and isomorphism are two supervisory concepts that are often discussed independently but rarely discussed in connection with each other. These two concepts, philosophically, have different historical roots, as well as different implications for interventions with regard to the supervisory triad. The authors examine the difference…
Mars-solar wind interaction: LatHyS, an improved parallel 3-D multispecies hybrid model
Modolo, Ronan; Hess, Sebastien; Mancini, Marco; Leblanc, Francois; Chaufray, Jean-Yves; Brain, David; Leclercq, Ludivine; Esteban-Hernández, Rosa; Chanteur, Gerard; Weill, Philippe; González-Galindo, Francisco; Forget, Francois; Yagi, Manabu; Mazelle, Christian
2016-07-01
In order to better represent Mars-solar wind interaction, we present an unprecedented model achieving spatial resolution down to 50 km, a so far unexplored resolution for global kinetic models of the Martian ionized environment. Such resolution approaches the ionospheric plasma scale height. In practice, the model is derived from a first version described in Modolo et al. (2005). An important effort of parallelization has been conducted and is presented here. A better description of the ionosphere was also implemented, including ionospheric chemistry, electrical conductivities, and a drag force modeling the ion-neutral collisions in the ionosphere. This new version of the code, named LatHyS (Latmos Hybrid Simulation), is here used to characterize the impact of various spatial resolutions on simulation results. In addition, and following a global model challenge effort, we present the results of simulation runs for three cases, which allow addressing the effect of the suprathermal corona and of the solar EUV activity on the magnetospheric plasma boundaries and on the global escape. Simulation results showed that global patterns are relatively similar for the different spatial resolution runs, but finest grid runs provide a better representation of the ionosphere and display more details of the planetary plasma dynamics. Simulation results suggest that a significant fraction of escaping O+ ions originates from below 1200 km altitude.
Chu, Chunlei
2009-01-01
The major performance bottleneck of the parallel Fourier method on distributed memory systems is the network communication cost. In this study, we investigate the potential of using non‐blocking all‐to‐all communications to solve this problem by overlapping computation and communication. We present the runtime comparison of a 3D seismic modeling problem with the Fourier method using non‐blocking and blocking calls, respectively, on a Linux cluster. The data demonstrate that a performance improvement of up to 40% can be achieved by simply changing blocking all‐to‐all communication calls to non‐blocking ones to introduce the overlapping capability. A 3D reverse‐time migration result is also presented as an extension to the modeling work based on non‐blocking collective communications.
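The benefit of overlapping reported above follows from simple timing arithmetic: with blocking collectives, the computation and communication phases of each stage serialize, while a posted non-blocking all-to-all exposes only the non-overlapped remainder. A toy cost model (illustrative numbers, not the paper's measurements):

```python
# Toy cost model of overlapping computation and communication, assuming
# perfect overlap; times are hypothetical per-stage seconds.
def stage_time_blocking(t_compute, t_comm):
    return t_compute + t_comm  # phases serialize

def stage_time_nonblocking(t_compute, t_comm):
    # with an Ialltoall posted before the local work, only the
    # non-overlapped remainder of communication is exposed
    return max(t_compute, t_comm)

t_compute, t_comm = 6.0, 4.0
blocking = stage_time_blocking(t_compute, t_comm)
overlapped = stage_time_nonblocking(t_compute, t_comm)
improvement = 1 - overlapped / blocking
print(f"{improvement:.0%}")  # 40%
```

When communication accounts for 40% of the blocking stage time and is fully hidden, the saving matches the up-to-40% improvement reported.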
Institute of Scientific and Technical Information of China (English)
Xueyan TANG; I-Ming CHEN
2009-01-01
This paper systematically studies structure synthesis and dimension optimization of XYZ flexure parallel mechanisms (FPMs) with large motion range and decoupled kinematic structure. Different from structure synthesis of rigid-body mechanisms, structure synthesis of flexure mechanisms is constrained by the limitations inherent in flexure mechanisms. These limitations are investigated and summarized as structure constraints. With consideration of these structure constraints, the configurations of the decoupled XYZ-FPMs are synthesized using screw theory. The synthesized XYZ-FPMs also possess a large motion range, due to the integration of a new type of large-motion prismatic joint designed in this paper. The stiffness models of the synthesized XYZ-FPMs are formulated. A 3-PPP XYZ-FPM is developed as a case study of structure synthesis and stiffness modeling.
DEFF Research Database (Denmark)
Vasquez, Juan Carlos; Guerrero, Josep M.; Savaghebi, Mehdi
2011-01-01
Power electronics based microgrids consist of a number of voltage source inverters (VSIs) operating in parallel. In this paper, the modeling, control design, and stability analysis of three-phase VSIs are derived. The proposed voltage and current inner control loops and the mathematical models … the frequency and amplitude deviations produced by the primary control. The tertiary control regulates the power flow between the grid and the microgrid. Also, a synchronization algorithm is presented in order to connect the microgrid to the grid. The evaluation of the hierarchical control is presented and discussed. Experimental results are provided to validate the performance and robustness of the VSI functionality during islanded and grid-connected operations, allowing a seamless transition between these modes through control hierarchies by regulating frequency and voltage, main-grid interactivity…
Ren, Yihui; Eubank, Stephen; Nath, Madhurima
2016-10-01
Network reliability is the probability that a dynamical system composed of discrete elements interacting on a network will be found in a configuration that satisfies a particular property. We introduce a reliability property, Ising feasibility, for which the network reliability is the Ising model's partition function. As shown by Moore and Shannon, the network reliability can be separated into two factors: structural, solely determined by the network topology, and dynamical, determined by the underlying dynamics. In this case, the structural factor is known as the joint density of states. Using methods developed to approximate the structural factor for other reliability properties, we simulate the joint density of states, yielding an approximation for the partition function. Based on a detailed examination of why naïve Monte Carlo sampling gives a poor approximation, we introduce a parallel scheme for estimating the joint density of states using a Markov-chain Monte Carlo method with a spin-exchange random walk. This parallel scheme makes simulating the Ising model in the presence of an external field practical on small computer clusters for networks with arbitrary topology with ˜106 energy levels and more than 10308 microstates.
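The factorization used above, a structural joint density of states combined with Boltzmann weights, can be illustrated exactly on a tiny network by brute-force enumeration (the 4-node cycle below is a hypothetical example, far smaller than the ~10^308-microstate systems targeted by the parallel scheme):

```python
from itertools import product
from collections import Counter
import math

# Hypothetical small network: a 4-node cycle.
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
n = 4

# Joint density of states g(E, M): number of spin configurations with
# interaction energy E = -sum_{(i,j)} s_i s_j and magnetization M = sum_i s_i.
g = Counter()
for spins in product((-1, 1), repeat=n):
    E = -sum(spins[i] * spins[j] for i, j in edges)
    M = sum(spins)
    g[(E, M)] += 1

def partition_function(beta, h):
    # Z(beta, h) = sum_{E,M} g(E,M) * exp(-beta * (E - h*M))
    return sum(c * math.exp(-beta * (E - h * M)) for (E, M), c in g.items())

# Cross-check against direct enumeration of H = E - h*M.
def partition_function_direct(beta, h):
    return sum(
        math.exp(-beta * (-sum(s[i] * s[j] for i, j in edges) - h * sum(s)))
        for s in product((-1, 1), repeat=n)
    )

print(partition_function(0.5, 0.2))
```

Once g(E, M) is known (here exactly; in the paper, approximated by the parallel spin-exchange walk), the partition function follows for any field and temperature without revisiting microstates.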
Ovaysi, S.; Piri, M.
2009-12-01
We present a three-dimensional fully dynamic parallel particle-based model for direct pore-level simulation of incompressible viscous fluid flow in disordered porous media. The model was developed from scratch and is capable of simulating flow directly in three-dimensional high-resolution microtomography images of naturally occurring or man-made porous systems. It reads the images as input, where the positions of the solid walls are given. The entire medium, i.e., solid and fluid, is then discretized using particles. The model is based on the Moving Particle Semi-implicit (MPS) technique. We modify this technique in order to improve its stability. The model handles highly irregular fluid-solid boundaries effectively. It takes into account viscous pressure drop in addition to gravity forces. It conserves mass and can automatically detect any false connectivity with fluid particles in the neighboring pores and throats. It includes a sophisticated algorithm to automatically split and merge particles to maintain hydraulic connectivity of extremely narrow conduits. Furthermore, it uses novel methods to handle particle inconsistencies and open boundaries. To handle the computational load, we present a fully parallel version of the model that runs on distributed memory computer clusters and exhibits excellent scalability. The model is used to simulate unsteady-state flow problems under different conditions starting from straight noncircular capillary tubes with different cross-sectional shapes, i.e., circular/elliptical, square/rectangular and triangular cross-sections. We compare the predicted dimensionless hydraulic conductances with the data available in the literature and observe an excellent agreement. We then test the scalability of our parallel model with two samples of an artificial sandstone, samples A and B, with different volumes and different distributions (non-uniform and uniform) of solid particles among the processors. An excellent linear scalability is
Christian Becker; Armin Scholl
2008-01-01
Assembly line balancing problems (ALBP) arise whenever an assembly line is configured, redesigned or adjusted. An ALBP consists of distributing the total workload for manufacturing any unit of the products to be assembled among the work stations along the line subject to a strict or average cycle time. Traditionally, stations are considered to be manned by one operator each, or duplicated in the form of identical parallel stations, each also manned by a single operator. In practice...
Modeling of Configuration-Dependent Flexible Joints for a Parallel Robot
Zili Zhou; Mechefske, Chris K.; Fengfeng (Jeff) Xi
2010-01-01
This paper provides a method to determine the variable flexible joint parameters which are dependent on configurations for a PRS Parallel Robot. Based on the continuous force approach, virtual springs were used between the joint components to simulate the joint flexibility. The stiffness matrix of the joint virtual springs was derived. The method uses system dynamic characteristics in different configurations to set the virtual spring stiffness for all the joints in the system. Modal testing ...
Protein folding of the H0P model: A parallel Wang-Landau study
Shi, G.; Wüst, T.; Li, Y. W.; Landau, D. P.
2015-09-01
We propose a simple modification to the hydrophobic-polar (HP) protein model, introducing a new type of monomer, "0", with hydrophobicity intermediate between H and P, representing some amino acids. With the replica-exchange Wang-Landau sampling method, we investigate some widely studied HP sequences as well as their H0P counterparts and observe that the H0P sequences exhibit dramatically reduced ground-state degeneracy and more significant transition signals at low temperature for some thermodynamic properties, such as the specific heat.
Protein folding of the H0P model: A parallel Wang-Landau study
Energy Technology Data Exchange (ETDEWEB)
Shi, Guangjie [University of Georgia, Athens, GA; Wuest, Thomas [ETH Zurich, Switzerland; Li, Ying Wai [ORNL; Landau, David P [University of Georgia, Athens, GA
2015-01-01
We propose a simple modification to the hydrophobic-polar (HP) protein model, introducing a new type of monomer, "0", with hydrophobicity intermediate between H and P, representing some amino acids. With the replica-exchange Wang-Landau sampling method, we investigate some widely studied HP sequences as well as their H0P counterparts and observe that the H0P sequences exhibit dramatically reduced ground-state degeneracy and more significant transition signals at low temperature for some thermodynamic properties, such as the specific heat.
Modeling and Adaptive Control of a Planar Parallel Mechanism
Institute of Scientific and Technical Information of China (English)
敖银辉; 陈新
2004-01-01
Dynamic modeling and control of parallel mechanisms have long been a problem in robotics research. In this paper, different dynamics formulation methods are discussed first. A model of a redundantly driven parallel mechanism with a planar parallel manipulator is then constructed as an example. A nonlinear adaptive control method is introduced. Matrix pseudo-inversion is used to obtain the desired actuator torques from desired end-effector coordinates, while the feedback torque is directly calculated in the actuator space. This treatment avoids forward kinematics computation, which is very difficult for a parallel mechanism. Experiments with PID control together with the described adaptive control strategy were carried out for a planar parallel mechanism. The results show that the proposed adaptive controller outperforms conventional PID methods in tracking the desired input at high speed.
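The pseudo-inversion step can be sketched in NumPy. The Jacobian and wrench values below are hypothetical, not the paper's mechanism: a redundantly actuated mechanism with 4 actuators and a 3-DOF end-effector, where actuator torques tau satisfy J^T tau = F and the Moore-Penrose pseudo-inverse picks the least-norm torque distribution:

```python
import numpy as np

# Assumed 3x4 transposed Jacobian at one configuration (hypothetical values).
J_T = np.array([
    [1.0, 0.0, 0.8, 0.2],
    [0.0, 1.0, 0.3, 0.9],
    [0.5, 0.4, 0.1, 0.7],
])
F_desired = np.array([2.0, -1.0, 0.5])   # desired end-effector wrench

# Least-norm actuator torques realizing the wrench; no forward kinematics.
tau = np.linalg.pinv(J_T) @ F_desired
print(np.allclose(J_T @ tau, F_desired))  # True
```

Because the actuation is redundant (4 torques, 3 wrench components), infinitely many torque vectors realize F; the pseudo-inverse selects the minimum-norm one.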
Directory of Open Access Journals (Sweden)
Umar Iqbal
2010-01-01
Present land vehicle navigation relies mostly on the Global Positioning System (GPS), which may be interrupted or degraded in urban areas. In order to obtain continuous positioning services in all environments, GPS can be integrated with inertial sensors and the vehicle odometer using Kalman filtering (KF). For car navigation, low-cost positioning solutions based on MEMS inertial sensors are utilized. To further reduce the cost, a reduced inertial sensor system (RISS) consisting of only one gyroscope and a speed measurement (obtained from the car odometer) is integrated with GPS. The MEMS-based gyroscope measurement deteriorates over time due to errors such as bias drift. These errors may lead to large azimuth errors, and mitigating them requires robust modeling of both linear and nonlinear effects. Therefore, this paper presents a solution based on a Parallel Cascade Identification (PCI) module that models the azimuth errors and is augmented to the KF. The proposed augmented KF-PCI method can handle both linear and nonlinear system errors, as the linear parts of the errors are modeled inside the KF and the nonlinear and residual parts of the azimuth errors are modeled by PCI. The performance of this method is examined using road test experiments in a land vehicle.
Yuan, G.; Wang, D. H.
2017-03-01
Multi-directional and multi-degree-of-freedom (multi-DOF) vibration energy harvesting are attracting more and more research interest in recent years. In this paper, the principle of a piezoelectric six-DOF vibration energy harvester based on parallel mechanism is proposed to convert the energy of the six-DOF vibration to single-DOF vibrations of the limbs on the energy harvester and output voltages. The dynamic model of the piezoelectric six-DOF vibration energy harvester is established to estimate the vibrations of the limbs. On this basis, a Stewart-type piezoelectric six-DOF vibration energy harvester is developed and explored. In order to validate the established dynamic model and the analysis results, the simulation model of the Stewart-type piezoelectric six-DOF vibration energy harvester is built and tested with different vibration excitations by SimMechanics, and some preliminary experiments are carried out. The results show that the vibration of the limbs on the piezoelectric six-DOF vibration energy harvester can be estimated by the established dynamic model. The developed Stewart-type piezoelectric six-DOF vibration energy harvester can harvest the energy of multi-directional linear vibration and multi-axis rotating vibration with resonance frequencies of 17 Hz, 25 Hz, and 47 Hz. Moreover, the resonance frequencies of the developed piezoelectric six-DOF vibration energy harvester are not affected by the direction changing of the vibration excitation.
Energy Technology Data Exchange (ETDEWEB)
Paula, A.V. de, E-mail: vagtinski@mecanica.ufrgs.br [PROMEC – Programa de Pós Graduação em Engenharia Mecânica, UFRGS – Universidade Federal do Rio Grande do Sul, Porto Alegre, RS (Brazil); Möller, S.V., E-mail: svmoller@ufrgs.br [PROMEC – Programa de Pós Graduação em Engenharia Mecânica, UFRGS – Universidade Federal do Rio Grande do Sul, Porto Alegre, RS (Brazil)
2013-11-15
This paper presents a study of the bistable phenomenon which occurs in the turbulent flow impinging on circular cylinders placed side-by-side. Time series of axial and transversal velocity obtained with the constant temperature hot wire anemometry technique in an aerodynamic channel are used as input data in a finite mixture model, to classify the observed data according to a family of probability density functions. Wavelet transforms are applied to analyze the unsteady turbulent signals. Results of flow visualization show that the flow is predominantly two-dimensional. A double-well energy model is suggested to describe the behavior of the bistable phenomenon in this case. -- Highlights: ► Bistable flow on two parallel cylinders is studied with hot wire anemometry as a first step for the application on the analysis to tube bank flow. ► The method of maximum likelihood estimation is applied to hot wire experimental series to classify the data according to PDF functions in a mixture model approach. ► Results show no evident correlation between the changes of flow modes with time. ► An energy model suggests the presence of more than two flow modes.
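The finite-mixture classification step can be illustrated with a minimal two-component Gaussian mixture fitted by expectation-maximization on synthetic bimodal "velocity" data (the data and constants are made up, not the hot-wire series):

```python
import numpy as np

# Synthetic bimodal data standing in for velocity samples from two flow modes.
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-2.0, 0.5, 500),   # "mode 1"
                    rng.normal(1.5, 0.5, 500)])   # "mode 2"

mu = np.array([-1.0, 1.0])
sigma = np.array([1.0, 1.0])
w = np.array([0.5, 0.5])
for _ in range(100):                              # EM iterations
    # E-step: responsibility of each component for each sample
    pdf = np.exp(-0.5 * ((x[:, None] - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
    r = w * pdf
    r /= r.sum(axis=1, keepdims=True)
    # M-step: reestimate weights, means, standard deviations
    w = r.mean(axis=0)
    mu = (r * x[:, None]).sum(axis=0) / r.sum(axis=0)
    sigma = np.sqrt((r * (x[:, None] - mu) ** 2).sum(axis=0) / r.sum(axis=0))

print(np.sort(mu))  # close to the true mode centers [-2.0, 1.5]
```

The fitted responsibilities r assign each sample to a flow mode, which is the classification-by-PDF-family idea the maximum likelihood mixture approach applies to the experimental time series.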
Velimsky, J.
2011-12-01
Inversion of observatory and low-orbit satellite geomagnetic data in terms of the three-dimensional distribution of electrical conductivity in the Earth's mantle can provide an independent constraint on the physical, chemical, and mineralogical composition of the Earth's mantle. This problem has recently been approached with different numerical methods. There are several key challenges from the numerical and algorithmic point of view, in particular the accuracy and speed of the forward solver, the effective evaluation of the sensitivities of data to changes of model parameters, and the dependence of results on the a priori knowledge of the spatio-temporal structure of the primary ionospheric and magnetospheric electric currents. Here I present recent advancements of the time-domain, spherical harmonic-finite element approach. The forward solver has been adapted to distributed-memory parallel architectures using band-matrix routines from the ScaLAPACK library. The evaluation of the gradient of the data misfit in model space using the adjoint approach has also been parallelized. Finally, the inverse problem has been reformulated in a way which allows for simultaneous reconstruction of the conductivity model and the external field model directly from the data.
Directory of Open Access Journals (Sweden)
Jing Sun
2015-01-01
The torque coordination control during mode transitions is a very important task for a hybrid electric vehicle (HEV) with a clutch serving as the key enabling actuator element. Poor coordination degrades drivability and leads to excessive wear of the clutch friction plates. In this paper, a novel torque coordination control strategy for a single-shaft parallel hybrid electric vehicle is presented to coordinate the motor torque, engine torque, and clutch torque so that seamless mode switching can be achieved. Unlike existing model predictive control (MPC) methods, only one model predictive controller is needed, and the clutch torque is taken as an optimized variable rather than a known parameter. Furthermore, ideas from model reference control (MRC) are borrowed to generate the set-point signal required by the MPC. The parameter sensitivity is studied for better performance of the proposed model predictive controller. The simulation results validate that the proposed torque coordination control strategy yields less vehicle jerk, less torque interruption, and smaller clutch frictional losses than the baseline method. In addition, the sensitivity and adaptiveness of the proposed strategy are evaluated.
de Lorenzi, F; Gerhard, O; Sambhus, N; Lorenzi, Flavio de; Debattista, Victor; Gerhard, Ortwin; Sambhus, Niranjan
2007-01-01
We describe a made-to-measure algorithm for constructing N-particle models of stellar systems from observational data (Chi-Squared-M2M), extending earlier ideas by Syer and Tremaine. The algorithm properly accounts for observational errors, is flexible, and can be applied to various systems and geometries. We implement this algorithm in a parallel code NMAGIC and carry out a sequence of tests to illustrate its power and performance: (i) We reconstruct an isotropic Hernquist model from density moments and projected kinematics and recover the correct differential energy distribution and intrinsic kinematics. (ii) We build a self-consistent oblate three-integral maximum rotator model and compare how the distribution function is recovered from integral field and slit kinematic data. (iii) We create a non-rotating and a figure rotating triaxial stellar particle model, reproduce the projected kinematics of the figure rotating system by a non-rotating system of the same intrinsic shape, and illustrate the signature ...
Jones, Christina L; Jensen, Jakob D; Scherr, Courtney L; Brown, Natasha R; Christy, Katheryn; Weaver, Jeremy
2015-01-01
The Health Belief Model (HBM) posits that messages will achieve optimal behavior change if they successfully target perceived barriers, benefits, self-efficacy, and threat. While the model seems to be an ideal explanatory framework for communication research, theoretical limitations have limited its use in the field. Notably, variable ordering is currently undefined in the HBM. Thus, it is unclear whether constructs mediate relationships comparably (parallel mediation), in sequence (serial mediation), or in tandem with a moderator (moderated mediation). To investigate variable ordering, adults (N = 1,377) completed a survey in the aftermath of an 8-month flu vaccine campaign grounded in the HBM. Exposure to the campaign was positively related to vaccination behavior. Statistical evaluation supported a model where the indirect effect of exposure on behavior through perceived barriers and threat was moderated by self-efficacy (moderated mediation). Perceived barriers and benefits also formed a serial mediation chain. The results indicate that variable ordering in the Health Belief Model may be complex, may help to explain conflicting results of the past, and may be a good focus for future research.
Parallel Adaptive Computation of Blood Flow in a 3D ``Whole'' Body Model
Zhou, M.; Figueroa, C. A.; Taylor, C. A.; Sahni, O.; Jansen, K. E.
2008-11-01
Accurate numerical simulations of vascular trauma require the consideration of a larger portion of the vasculature than previously considered, due to the systemic nature of the human body's response. A patient-specific 3D model composed of 78 connected arterial branches extending from the neck to the lower legs is constructed to effectively represent the entire body. Recently developed outflow boundary conditions that appropriately represent the downstream vasculature bed which is not included in the 3D computational domain are applied at 78 outlets. In this work, the pulsatile blood flow simulations are started on a fairly uniform, unstructured mesh that is subsequently adapted using a solution-based approach to efficiently resolve the flow features. The adapted mesh contains non-uniform, anisotropic elements resulting in resolution that conforms with the physical length scales present in the problem. The effects of the mesh resolution on the flow field are studied, specifically on relevant quantities of pressure, velocity and wall shear stress.
Ofir-Eyal, Shani; Hasson-Ohayon, Ilanit; Kravetz, Shlomo
2014-12-15
Two alternative models of the impaired cognitive and affective processing that may underlie the reduced social quality of life (SQoL) of persons with schizophrenia were examined. According to the parallel process model, impaired cognitive empathy and affective empathy make relatively independent contributions to the symptoms of schizophrenia and to the consequent reduction in SQoL. According to the integrative mediation model, the symptoms of schizophrenia and the reduction in SQoL associated with these symptoms are the products of a process by which impairments of cognitive empathy are contingent on impairments of affective empathy. Ninety persons with schizophrenia were assessed for SQoL, symptoms, and cognitive and affective empathy. Results support the integrative mediation model only for cognitive empathy and negative psychiatric symptoms. Only the negative links between cognitive empathy and negative symptoms served to mediate the positive relation between affective empathy and SQoL. Positive symptoms had a limited negative impact on SQoL and did not play a role in the paths that linked affective empathy to SQoL. Age had a statistically significant and negative indirect relationship to SQoL. The results are consistent with recent approaches that distinguish between cognitive and affective empathy and specify how these two processes are integrated. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
Solving the Maximum Weighted Clique Problem Based on Parallel Biological Computing Model
Directory of Open Access Journals (Sweden)
Zhaocai Wang
2015-01-01
The maximum weighted clique (MWC) problem, a typical NP-complete problem, is difficult to solve with conventional electronic-computer algorithms. The aim of the problem is to seek a vertex clique with maximal weight sum in a given undirected graph. It is an extremely important problem in the field of optimal engineering scheme and control, with numerous practical applications. From a practical point of view, we give a parallel biological algorithm to solve the MWC problem. For a maximum weighted clique problem with m edges and n vertices, we use fixed-length DNA strands to represent different vertices and edges, fully carry out the biochemical reactions, and find the solution to the MWC problem in a certain length range with O(n²) time complexity, compared with the exponential time required by previous electronic-computer algorithms. We expand the applied scope of parallel biological computation and reduce the computational complexity of practical engineering problems. Meanwhile, we provide a meaningful reference for solving other complex problems.
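As a point of reference for the DNA-based algorithm, a brute-force electronic solution of a small hypothetical MWC instance makes the exponential search space explicit:

```python
from itertools import combinations

# Brute-force maximum weighted clique: enumerate every vertex subset and
# keep the heaviest one whose vertices are pairwise adjacent. Exponential
# in the number of vertices, which is what the parallel biochemical
# reactions avoid on the electronic side.
def max_weighted_clique(vertices, weights, edges):
    edge_set = {frozenset(e) for e in edges}
    best, best_w = (), 0
    for r in range(1, len(vertices) + 1):
        for sub in combinations(vertices, r):
            if all(frozenset(p) in edge_set for p in combinations(sub, 2)):
                w = sum(weights[v] for v in sub)
                if w > best_w:
                    best, best_w = sub, w
    return best, best_w

# Hypothetical 5-vertex instance.
vertices = [0, 1, 2, 3, 4]
weights = {0: 3, 1: 2, 2: 5, 3: 4, 4: 1}
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4)]
print(max_weighted_clique(vertices, weights, edges))  # ((0, 1, 2), 10)
```

Here the triangle {0, 1, 2} with weight 10 beats the heavier-vertex edge {2, 3} with weight 9, illustrating why greedy vertex selection is not enough.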
Directory of Open Access Journals (Sweden)
Sekhri Larbi
2014-12-01
The optimal allocation of resources to tasks was the primary objective of research dealing with scheduling problems. These problems are characterized by their complexity, known to be NP-hard in most cases. Currently, with the evolution of technology, classical methods are inadequate because they degrade system performance (inflexibility, inefficient resource-allocation policies, etc.). In the context of parallel and distributed systems, several computing units process multitasking applications concurrently. The main goal of such a process is to schedule tasks and map them onto the appropriate machines to achieve the optimal overall system performance (minimize the makespan and balance the load among the machines). In this paper we present a Time Petri Net (TPN) based approach to solve the scheduling problem by mapping each entity (tasks, resources, and constraints) to a corresponding one in the TPN. In this case, the scheduling problem can be reduced to finding an optimal sequence of transitions leading from an initial marking to a final one. Our approach improves the classical mapping algorithms by introducing control over resource allocation and by taking into consideration the resource-balancing aspect, leading to an acceptable state of the system. The approach is applied to a specific class of problems where the machines are parallel and identical. This class is analyzed using the TiNA (Time Net Analyzer) tool developed at the LAAS laboratory (Toulouse, France).
Institute of Scientific and Technical Information of China (English)
Zhengong Zhou; Peiwei Zhang; Linzhi Wu
2010-01-01
In this paper, the interactions of multiple parallel symmetric and permeable finite-length cracks in a piezoelectric/piezomagnetic material plane subjected to anti-plane shear stress loading are studied by the Schmidt method. The problem is formulated through Fourier transform into dual integral equations, in which the unknown variables are the displacement jumps across the crack surfaces. To solve the dual integral equations, the displacement jumps across the crack surfaces are directly expanded as a series of Jacobi polynomials. Finally, the relation between the electric field, the magnetic flux field and the stress field near the crack tips is obtained. The results show that the stress, the electric displacement and the magnetic flux intensity factors at the crack tips depend on the length and spacing of the cracks. It is also revealed that the crack shielding effect is present in piezoelectric/piezomagnetic materials.
Parallelization of DCT Using the OpenCL Model
Institute of Scientific and Technical Information of China (English)
向阳霞; 张惠民; 王子强
2013-01-01
To improve the speed of the discrete cosine transform (DCT), this paper studies the parallelization of the DCT for the OpenCL model. The characteristics and advantages of the GPU and OpenCL are analyzed first, and the working principle of the traditional DCT is reviewed; the DCT is then tested on both CPU and GPU platforms and the results are analyzed. The experimental results show that OpenCL-based parallelization can effectively improve the speed of the DCT.
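The reason the DCT parallelizes so well is its separability: a 2D DCT-II is two matrix products per image block, and blocks are independent, so each OpenCL work-group can own one block. A plain NumPy sketch of the per-block math (no OpenCL involved; the 8x8 block is a made-up example):

```python
import numpy as np

def dct_matrix(n):
    # Orthonormal DCT-II basis matrix.
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    return c

def dct2(block):
    c = dct_matrix(block.shape[0])
    return c @ block @ c.T          # row transform, then column transform

def idct2(coeffs):
    c = dct_matrix(coeffs.shape[0])
    return c.T @ coeffs @ c         # orthogonal: the inverse is the transpose

block = np.arange(64, dtype=float).reshape(8, 8)   # one 8x8 image block
coeffs = dct2(block)
print(np.allclose(idct2(coeffs), block))           # True: lossless round trip
```

On a GPU, the two matrix products map naturally onto work-items sharing the block in local memory, which is the structure an OpenCL DCT kernel exploits.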
Exploring Parallel Algorithms for Volumetric Mass-Spring-Damper Models in CUDA
DEFF Research Database (Denmark)
Rasmusson, Allan; Mosegaard, Jesper; Sørensen, Thomas Sangild
2008-01-01
Since the advent of programmable graphics processors (GPUs), their computational power has been utilized for general purpose computation, initially by "exploiting" graphics APIs and recently through dedicated parallel computation frameworks such as the Compute Unified Device Architecture (CUDA) from Nvidia. This paper investigates multiple implementations of volumetric mass-spring-damper systems in CUDA. The obtained performance is compared to previous implementations utilizing the GPU through the OpenGL graphics API. We find that both performance and optimization strategies differ widely between the OpenGL and CUDA implementations. Specifically, the previous recommendation of using implicitly connected particles is replaced by a recommendation that supports unstructured meshes and run-time topological changes with an insignificant performance reduction.
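A serial sketch of the per-particle update that such a CUDA implementation parallelizes, with one thread per particle gathering spring forces (the two-particle topology and constants below are hypothetical, and the integrator is plain explicit Euler):

```python
import numpy as np

# Minimal mass-spring-damper step: two 2-D particles joined by one spring.
pos = np.array([[0.0, 0.0], [1.5, 0.0]])
vel = np.zeros((2, 2))
springs = [(0, 1, 1.0)]                    # (i, j, rest length)
k_s, k_d, mass, dt = 50.0, 2.0, 1.0, 0.01  # made-up constants

def step(pos, vel):
    force = np.zeros_like(pos)
    for i, j, rest in springs:
        d = pos[j] - pos[i]
        length = np.linalg.norm(d)
        direction = d / length
        rel_vel = vel[j] - vel[i]
        # spring force plus damping along the spring axis
        f = (k_s * (length - rest) + k_d * rel_vel.dot(direction)) * direction
        force[i] += f
        force[j] -= f
    vel += force / mass * dt               # explicit Euler update
    pos += vel * dt
    return pos, vel

for _ in range(2000):
    pos, vel = step(pos, vel)
print(np.linalg.norm(pos[1] - pos[0]))     # settles near the rest length 1.0
```

In CUDA, the force gather per particle is one thread, and supporting unstructured meshes amounts to storing each particle's spring list explicitly rather than deriving neighbors implicitly from a grid.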
A DISCRETE TIME TWO-LEVEL MIXED SERVICE PARALLEL POLLING MODEL
Institute of Scientific and Technical Information of China (English)
Guan Zheng; Zhao Dongfeng; Zhao Yifan
2012-01-01
We present a discrete-time single-server two-level mixed-service polling system with two queue types: one center queue and N normal queues. Two-level means the center queue is served successively after each normal queue. In the first level, the server alternates between the center queue and a normal queue. In the second level, the normal queues are polled in cyclic order. Mixed service means the service discipline is exhaustive for the center queue and parallel i-limited for the normal queues. We propose an imbedded Markov chain framework to derive closed-form expressions for the mean cycle time, mean queue length, and mean waiting time. Numerical examples demonstrate that theoretical and simulation results are identical and that the new system efficiently differentiates priorities.
Rebič, Matúš; Laaksonen, Aatto; Šponer, Jiří; Uličný, Jozef; Mocci, Francesca
2016-08-04
Most molecular dynamics (MD) simulations of DNA quadruplexes have been performed under minimal salt conditions using the Åqvist potential parameters for the cation with the TIP3P water model. Recently, this combination of parameters has been reported to be problematic for the stability of quadruplex DNA, especially caused by the ion interactions inside or near the quadruplex channel. Here, we verify how the choice of ion parameters and water model can affect the quadruplex structural stability and the interactions with the ions outside the channel. We have performed a series of MD simulations of the human full-parallel telomeric quadruplex by neutralizing its negative charge with K(+) ions. Three combinations of different cation potential parameters and water models have been used: (a) Åqvist ion parameters, TIP3P water model; (b) Joung and Cheatham ion parameters, TIP3P water model; and (c) Joung and Cheatham ion parameters, TIP4Pew water model. For the combinations (b) and (c), the effect of the ionic strength has been evaluated by adding increasing amounts of KCl salt (50, 100, and 200 mM). Two independent simulations using the Åqvist parameters with the TIP3P model show that this combination is clearly less suited for the studied quadruplex with K(+) as counterions. In both simulations, one ion escapes from the channel, followed by significant deformation of the structure, leading to deviating conformation compared to that in the reference crystallographic data. For the other combinations of ion and water potentials, no tendency is observed for the channel ions to escape from the quadruplex channel. In addition, the internal mobility of the three loops, torsion angles, and counterion affinity have been investigated at varied salt concentrations. In summary, the selection of ion and water models is crucial as it can affect both the structure and dynamics as well as the interactions of the quadruplex with its counterions. The results obtained with the TIP4Pew
Gunnar, Megan R; Hostinar, Camelia E; Sanchez, Mar M; Tottenham, Nim; Sullivan, Regina M
2015-01-01
It has been long recognized that parents exert profound influences on child development. Dating back to at least the seventeenth-century Enlightenment, the ability for parents to shape child behavior in an enduring way has been noted. Twentieth-century scholars developed theories to explain how parenting histories influence psychological development, and since that time, the number of scientific publications on parenting influences in both human and nonhuman animal fields has grown at an exponential rate, reaching numbers in the thousands by 2015. This special issue describes a symposium delivered by Megan Gunnar, Regina Sullivan, Mar Sanchez, and Nim Tottenham in the Fall of 2014 at the Society for Social Neuroscience. The goal of the symposium was to describe the emerging knowledge on neurobiological mechanisms that mediate parent-offspring interactions across three different species: rodent, monkey, and human. The talks were aimed at designing testable models of parenting effects on the development of emotional and stress regulation. Specifically, the symposium aimed at characterizing the special modulatory (buffering) effects of parental cues on fear- and stress-relevant neurobiology and behaviors of the offspring and to discuss examples of impaired buffering when the parent-infant relationship is disrupted.
Kordilla, J.; Shigorina, E.; Tartakovsky, A. M.; Pan, W.; Geyer, T.
2015-12-01
Under idealized conditions (smooth surfaces, linear relationship between Bond number and Capillary number of droplets), steady-state flow modes on fracture surfaces have been shown to develop from sliding droplets to rivulets and finally (wavy) film flow, depending on the specified flux. In a recent study we demonstrated the effect of surface roughness on droplet flow in unsaturated wide-aperture fractures; however, its effect on other prevailing flow modes is still an open question. The objective of this work is to investigate the formation of complex flow modes on fracture surfaces employing an efficient three-dimensional parallelized SPH model. The model is able to simulate highly intermittent, gravity-driven free-surface flows under dynamic wetting conditions. The effect of surface tension is included via efficient pairwise interaction forces. We validate the model using various analytical and semi-analytical relationships for droplet and complex flow dynamics. To investigate the effect of surface roughness on flow dynamics we construct surfaces with a self-affine fractal geometry and roughness characterized by the Hurst exponent. We demonstrate the effect of surface roughness (on macroscopic scales this can be understood as a tortuosity) on the steady-state distribution of flow modes. Furthermore, we show the influence of a wide range of natural wetting conditions (defined by static contact angles) on the final distribution of surface coverage, which is of high importance for matrix-fracture interaction processes.
Directory of Open Access Journals (Sweden)
Volzer Thomas
2016-12-01
Full Text Available The use of elastic bodies within multibody simulation has become increasingly important in recent years. To include elastic bodies, described as finite element models, in multibody simulations, the dimension of the system of ordinary differential equations must be reduced by projection. For this purpose, in this work, the modal reduction method, a component mode synthesis based method and a moment-matching method are used. Due to the ever-increasing size of the non-reduced systems, the calculation of the projection matrix leads to a large demand for computational resources and cannot be done on typical serial computers with the available memory. In this paper, the model reduction software Morembs++ is presented, using a parallelization concept based on the message passing interface to satisfy the memory requirements and reduce the runtime of the model reduction process. Additionally, the behaviour of the Block-Krylov-Schur eigensolver, implemented in the Anasazi package of the Trilinos project, is analysed with regard to the choice of the size of the Krylov base, the block size and the number of blocks. In addition, an iterative solver is considered within the CMS-based method.
Poulet, Thomas; Paesold, Martin; Veveakis, Manolis
2017-03-01
Faults play a major role in many economically and environmentally important geological systems, ranging from impermeable seals in petroleum reservoirs to fluid pathways in ore-forming hydrothermal systems. Their behavior is therefore widely studied and fault mechanics is particularly focused on the mechanisms explaining their transient evolution. Single faults can change in time from seals to open channels as they become seismically active and various models have recently been presented to explain the driving forces responsible for such transitions. A model of particular interest is the multi-physics oscillator of Alevizos et al. (J Geophys Res Solid Earth 119(6), 4558-4582, 2014) which extends the traditional rate and state friction approach to rate and temperature-dependent ductile rocks, and has been successfully applied to explain spatial features of exposed thrusts as well as temporal evolutions of current subduction zones. In this contribution we implement that model in REDBACK, a parallel open-source multi-physics simulator developed to solve such geological instabilities in three dimensions. The resolution of the underlying system of equations in a tightly coupled manner allows REDBACK to capture appropriately the various theoretical regimes of the system, including the periodic and non-periodic instabilities. REDBACK can then be used to simulate the drastic permeability evolution in time of such systems, where nominally impermeable faults can sporadically become fluid pathways, with permeability increases of several orders of magnitude.
Directory of Open Access Journals (Sweden)
Yuan Cao
2016-01-01
Full Text Available To directly obtain physical dimensions of parallel coupled microstrip lines with a floating ground-plane conductor (PCMLFGPC, an accurate synthesis model based on an artificial neural network (ANN is proposed. The synthesis model is validated by using the conformal mapping technique (CMT analysis contours. Using the synthesis model and the CMT analysis, the PCMLFGPC having equal even- and odd-mode phase velocities can be obtained by adjusting the width of the floating ground-plane conductor. Applying the method, a 7 dB coupler with the measured isolation better than 27 dB across a wide bandwidth (more than 120%, a 90° Schiffman phase shifter with phase deviation ±2.5° and return loss more than 17.5 dB covering 63.4% bandwidth, and a bandpass filter with completely eliminated second-order spurious band are implemented. The performances of the current designs are superior to those of the previous components configured with the PCMLFGPC.
Ganesan, Nandhini; Basu, Suman; Hariharan, Krishnan S.; Kolake, Subramanya Mayya; Song, Taewon; Yeo, Taejung; Sohn, Dong Kee; Doo, Seokgwang
2016-08-01
Lithium-Ion batteries used for electric vehicle applications are subject to large currents and various operation conditions, making battery pack design and life extension a challenging problem. With increase in complexity, modeling and simulation can lead to insights that ensure optimal performance and life extension. In this manuscript, an electrochemical-thermal (ECT) coupled model for a 6 series × 5 parallel pack is developed for Li ion cells with NCA/C electrodes and validated against experimental data. Contribution of the cathode to overall degradation at various operating conditions is assessed. Pack asymmetry is analyzed from a design and an operational perspective. Design based asymmetry leads to a new approach of obtaining the individual cell responses of the pack from an average ECT output. Operational asymmetry is demonstrated in terms of effects of thermal gradients on cycle life, and an efficient model predictive control technique is developed. Concept of reconfigurable battery pack is studied using detailed simulations that can be used for effective monitoring and extension of battery pack life.
Janssen, P. J. A.; Anderson, P. D.
2008-10-01
A boundary-integral method is presented for drop deformation between two parallel walls for non-unit viscosity ratio systems. To account for the effect of the walls, the Green's functions are modified and all terms for the double-layer potential are derived. The full three-dimensional implementation is validated, and the model is shown to be accurate and consistent. The method is applied to study drop deformation in shear flow. An excellent match with small-deformation theory is found at low capillary numbers, and our results match with other BIM simulations for pressure-driven flows. For shear flow with moderate capillary numbers, we see that the behavior of a low-viscosity drop is similar to that of a drop with a viscosity ratio of unity. High-viscosity drops, on the other hand, are prevented from rotating in shear flow, which results in a larger deformation, but less overshoot in the drop axes is observed. In contrast with unconfined flow, high-viscosity drops can be broken in shear flow between parallel plates; for low-viscosity drops the critical capillary number is higher in confined situations.
PARALLELIZATION OF THE MM4 NUMERICAL MODEL ON DAWNING-1000
Institute of Scientific and Technical Information of China (English)
李国杰; 李柏; 翟武全; 赵建勇; 陈国良; 刘清; 石春娥
2001-01-01
Based on the architecture and characteristics of the Dawning-1000 parallel computer made in China, we designed a parallel calculation scheme for the MM4 mesoscale numerical model. By carefully reworking the data organization and program flow of the sequential code, the parallel version produces reliable calculation results and a markedly improved speedup ratio, reaching 1:6.7 when sixteen processing nodes are used. Over a finite region, the model needs less than 30 minutes to complete a 36-hour integration, and an operational parallel computing environment for the MM4 mesoscale model has been preliminarily established.
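The speedup and efficiency figures quoted in the abstract above can be sanity-checked with a small sketch. The serial wall-clock time used here is an assumption chosen only to reproduce the reported 1:6.7 ratio, not a figure from the paper:

```python
# Hedged sketch: speedup and efficiency for the reported 16-node MM4 run.
# The serial time t1 is an assumption for illustration only.
def speedup(t_serial, t_parallel):
    """Classical speedup S = T1 / Tp."""
    return t_serial / t_parallel

def efficiency(t_serial, t_parallel, n_procs):
    """Parallel efficiency E = S / p."""
    return speedup(t_serial, t_parallel) / n_procs

t1 = 201.0   # assumed serial wall-clock time (minutes)
t16 = 30.0   # reported: under 30 minutes on 16 processing nodes
print(f"speedup on 16 nodes: {speedup(t1, t16):.1f}x")      # 6.7x, the reported ratio
print(f"efficiency: {efficiency(t1, t16, 16):.0%}")
```

An efficiency around 42% on 16 nodes would be plausible for a tightly coupled mesoscale model of that era, where halo exchanges limit scaling.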
Energy Technology Data Exchange (ETDEWEB)
Murray-Johnson, L.; Witte, K.; Patel, D.; Orrego, V.; Zuckerman, C.; Maxfield, A.M.; Thimons, E.D. [Ohio State University, Columbus, OH (US)
2004-12-15
Occupational noise-induced hearing loss is the second most commonly self-reported occupational illness or injury in the United States. Among coal miners, more than 90% of the population reports a hearing deficit by age 55. In this formative evaluation, focus groups were conducted with coal miners in Appalachia to ascertain whether miners perceive hearing loss as a major health risk and, if so, what would motivate the consistent wearing of hearing protection devices (HPDs). The theoretical framework of the Extended Parallel Process Model was used to identify the miners' knowledge, attitudes, beliefs, and current behaviors regarding hearing protection. Focus group participants had strong perceived severity and varying levels of perceived susceptibility to hearing loss. Various barriers significantly reduced the self-efficacy and the response efficacy of using hearing protection.
Matlab/Simulink Modeling of Parallel Resonant DC Link Soft-Switching Four-leg SVPWM Inverter
Directory of Open Access Journals (Sweden)
Riyadh G. Omar
2015-06-01
Full Text Available This paper suggests using the traditional parallel resonant dc link (PRDCL) circuit to provide soft switching for the four-leg Space Vector Pulse Width Modulation (SVPWM) inverter. The proposed circuit provides a short period of zero voltage across the inverter during the occurrence of zero vectors. The transitions between zero and active vectors are accomplished under a zero-voltage condition (ZVC), which reduces the switching losses. Moreover, the Total Harmonic Distortion (THD) of the inverter output voltage is not affected by the circuit operation, since the zero-voltage periods occur simultaneously with the zero-vector periods. To confirm the results, balanced and unbalanced loads are used, and a Matlab/Simulink model is implemented for simulation.
Montreuil, M; Jouvent, R
1989-01-01
A psychometric test of visual detection, based upon the theory of parallel information processing taking place in the left (analytical) and the right (holistic) hemispheres, is presented. The aim of this test is to study the capacity of visual holistic attention during a simultaneous logical distractive task (point-to-point closure). We first present the validation study on one hundred and four normal subjects. A second study on 20 psychosomatic patients is then presented, showing significantly inferior results in all subtests in comparison with paired controls. Results are discussed in terms of cognitive psychology, and in particular the link between holistic processing and the emotion processing function of the right hemisphere. This test therefore constitutes a new model of exploration of the cognitivo-emotional process. It should lead to a better evaluation for the study of interhemispheric dysconnexion in neurology as well as psychopathological states such as psychosomatic diseases, where dysconnexion-like phenomena (alexithymia) are thought to play a part.
Green tea polyphenols and sulfasalazine have parallel anti-inflammatory properties in colitis models
Directory of Open Access Journals (Sweden)
Helieh S Oz
2013-06-01
Full Text Available Background: There is no cure for autoimmune chronic inflammatory bowel disease (IBD). IBD patients commonly use complementary and alternative medications, of which the safety, efficacy and interaction with standard-of-care therapies are not fully known. Thus the consequences can become life-threatening. Sulfasalazine, commonly used in IBD, potentially has severe adverse effects, including infertility, pulmonary fibrosis and lack of response, and ultimately patients may require intestinal resection. We hypothesized that green tea polyphenols (GrTP, EGCG) and sulfasalazine have similar anti-inflammatory properties. Methods: BALB/c mice received dextran sodium sulfate (DSS) to induce colitis (ulcerative colitis model). Exposure of IL-10 deficient mice (BALB/c background) to normal microbiota provoked enterocolitis (mimics Crohn’s disease). Animals were treated with agents incorporated into daily diets. Control animals received sham treatment. Results: DSS-treated animals developed severe bloody diarrhea and colitis (score 0-4; 3.2 ± 0.27). IL-10 deficient mice developed severe enterocolitis as manifested by diarrhea, rectal prolapse and colonic lesions. Animals tolerated the regimens (GrTP, EGCG, sulfasalazine) with no major side effects, and developed less severe colitis/enterocolitis. GrTP, EGCG and sulfasalazine significantly ameliorated colonic damage and histological scores in treated animals in a similar manner (GrTP vs DSS p<0.05; EGCG, sulfasalazine vs DSS p<0.01). The inflammatory markers TNFα (3-fold), IL-6 (14-fold) and serum amyloid A (40-fold) increased in colitic animals and significantly decreased with the treatment regimens. In contrast, circulatory leptin levels decreased in colitic animals (2-fold). EGCG additionally reduced leptin levels (p<0.01), while GrTP and sulfasalazine had no effect on leptin levels (p<0.05). Hepatic and colonic antioxidants were significantly depleted in colitic animals and treatment regimens significantly restored
Cacace, Mauro; Jacquey, Antoine B.
2017-09-01
Theory and numerical implementation describing groundwater flow and the transport of heat and solute mass in fully saturated fractured rocks with elasto-plastic mechanical feedbacks are developed. In our formulation, fractures are considered as being of lower dimension than the hosting deformable porous rock and we consider their hydraulic and mechanical apertures as scaling parameters to ensure continuous exchange of fluid mass and energy within the fracture-solid matrix system. The coupled system of equations is implemented in a new simulator code that makes use of a Galerkin finite-element technique. The code builds on a flexible, object-oriented numerical framework (MOOSE, Multiphysics Object Oriented Simulation Environment) which provides an extensive scalable parallel and implicit coupling to solve for the multiphysics problem. The governing equations of groundwater flow, heat and mass transport, and rock deformation are solved in a weak sense (either by classical Newton-Raphson or by free Jacobian inexact Newton-Krylow schemes) on an underlying unstructured mesh. Nonlinear feedbacks among the active processes are enforced by considering evolving fluid and rock properties depending on the thermo-hydro-mechanical state of the system and the local structure, i.e. degree of connectivity, of the fracture system. A suite of applications is presented to illustrate the flexibility and capability of the new simulator to address problems of increasing complexity and occurring at different spatial (from centimetres to tens of kilometres) and temporal scales (from minutes to hundreds of years).
Directory of Open Access Journals (Sweden)
Lorenzo L. Pesce
2013-01-01
Full Text Available Our limited understanding of the relationship between the behavior of individual neurons and large neuronal networks is an important limitation in current epilepsy research and may be one of the main causes of our inadequate ability to treat it. Addressing this problem directly via experiments is impossibly complex; thus, we have been developing and studying medium-large-scale simulations of detailed neuronal networks to guide us. Flexibility in the connection schemas and a complete description of the cortical tissue seem necessary for this purpose. In this paper we examine some of the basic issues encountered in these multiscale simulations. We have determined the detailed behavior of two such simulators on parallel computer systems. The observed memory and computation-time scaling behavior for a distributed memory implementation were very good over the range studied, both in terms of network sizes (2,000 to 400,000 neurons and processor pool sizes (1 to 256 processors. Our simulations required between a few megabytes and about 150 gigabytes of RAM and lasted between a few minutes and about a week, well within the capability of most multinode clusters. Therefore, simulations of epileptic seizures on networks with millions of cells should be feasible on current supercomputers.
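The "very good" memory and compute-time scaling reported above is usually quantified as a power-law exponent fitted on log-log axes. The sketch below shows that fit in pure Python; the sample points and costs are invented for illustration (the abstract reports only the ranges 2,000-400,000 neurons and 1-256 processors):

```python
import math

# Hedged sketch: estimate a scaling exponent b from cost ~ a * size**b
# via least squares on log-transformed data. Data are invented.
def fit_power_law(sizes, costs):
    """Return the exponent b of a power-law fit cost = a * size**b."""
    xs = [math.log(s) for s in sizes]
    ys = [math.log(c) for c in costs]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) \
        / sum((x - mx) ** 2 for x in xs)

sizes = [2_000, 20_000, 200_000, 400_000]   # neurons (assumed sample points)
mem_mb = [4.0, 40.0, 400.0, 800.0]          # invented, perfectly linear case
print(f"scaling exponent: {fit_power_law(sizes, mem_mb):.2f}")  # 1.00 = linear
```

An exponent near 1.0 in both network size and processor count is what "very good" distributed-memory scaling means in practice: doubling the network roughly doubles memory and runtime.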
Formation of parallel joint sets and shear band/fracture networks in physical models
Jorand, C.; Chemenda, A. I.; Petit, J.-P.
2012-12-01
Both oedometric and plane-strain tests were performed with parallelepipedic samples made of synthetic granular, cohesive, frictional and dilatant rock analogue material GRAM2. For the first time parallel sets of fractures that have all the characteristics of natural joints were reproduced in the laboratory. The fractures are regularly spaced, normal to σ3, and have plumose morphology very similar to that of natural joints. These fractures can form at tensile stress σ3 much smaller in magnitude than the tensile strength of material and even at slightly compressive σ3. When mean stress σ exceeds a certain value, the fractures become oblique to σ1 (the obliquity increases with σ), forming networks of conjugate shear bands/fractures. These results of plane-strain experiments are in good agreement with those of better controlled conventional axisymmetric tests on a similar material in Chemenda et al. (2011b) and are closer to real geological situations. Both types of experiments are complementary. Their results lead to the conclusion that at least certain categories of natural fractures (including joints, and conjugate shear fractures/bands) were initiated as deformation localization bands. The band orientation is defined by the constitutive properties/parameters (notably the dilatancy factor) that are sensitive to σ.
Kostopoulos, Spiros; Glotsos, Dimitris; Sidiropoulos, Konstantinos; Asvestas, Pantelis; Cavouras, Dionisis; Kalatzis, Ioannis
2014-03-01
The aim of the present study was to implement a pattern recognition system for the discrimination of healthy from malignant prostate tumors from proteomic Mass Spectroscopy (MS) samples and to identify m/z intervals of potential biomarkers associated with prostate cancer. One hundred and six MS-spectra were studied in total. Sixty three spectra corresponded to healthy cases (PSA 10). The MS-spectra are publicly available from the NCI Clinical Proteomics Database. The pre-processing comprised the steps: denoising, normalization, peak extraction and peak alignment. Due to the enormous number of features that rose from MS-spectra as informative peaks, and in order to secure optimum system design, the classification task was performed by programming in parallel the multiprocessors of an nVIDIA GPU card, using the CUDA framework. The proposed system achieved 98.1% accuracy. The identified m/z intervals displayed significant statistical differences between the two classes and were found to possess adequate discriminatory power in characterizing prostate samples, when employed in the design of the classification system. Those intervals should be further investigated since they might lead to the identification of potential new biomarkers for prostate cancer.
Directory of Open Access Journals (Sweden)
A. Norozi
2010-01-01
Full Text Available Problem statement: In the era of globalization the degree of competition in the market has increased, and many companies attempt to manufacture products efficiently to overcome the challenges they face. Approach: A mixed-model assembly line can provide a continuous flow of material and flexibility with regard to model changes. The problem under study formulates the mathematical programming model for minimizing the overall makespan and the balancing objective for a set of parallel lines. Results: The proposed mixed-integer model is only able to find the best job sequence in each line for a given allocation of jobs to lines. Using the mathematical model for large problems is therefore time-consuming and inefficient, since many job-allocation values must be checked. This study presents an intelligence-based genetic algorithm to optimize the problem objectives while reducing the problem complexity. A heuristic algorithm is introduced to generate the initial population for the genetic algorithm, which then searches for the best sequence of jobs for each line based on that population. In this way the genetic algorithm concentrates only on initial populations that produce better solutions instead of probing the entire search space. Conclusion/Recommendations: The results obtained from the intelligence-based genetic algorithm were used as an initial point for fine-tuning by simulated annealing to increase the quality of the solution. To check the capability of the proposed algorithm, several experiments were run on a set of problems. Since the total objective values of most problems could not be further improved by simulated annealing, the proposed intelligence-based genetic algorithm performs well in reaching near-optimal solutions.
Energy Technology Data Exchange (ETDEWEB)
Piteau, Ph. [CEA Saclay, DEN, DM2S, SEMT, DYN, CEA, Lab Etud Dynam, F-91191 Gif Sur Yvette (France); Antunes, J. [ITN, ADL, P-2686 Sacavem Codex (Portugal)
2010-07-01
In this paper, we develop a theoretical model to predict the nonlinear fluid-structure interaction forces and the dynamics of parallel vibrating plates subjected to an axial gap flow. The gap is assumed small, when compared to the plate dimensions, the plate width being much larger than the length, so that the simplifying assumptions of 1D bulk-flow models are adequate. We thus develop a simplified theoretical squeeze-film formulation, which includes both the distributed and singular dissipative flow terms. This model is suitable for performing effective time-domain numerical simulations of vibrating systems which are coupled by the nonlinear unsteady flow forces, for instance the vibro-impact dynamics of plates with fluid gap interfaces. A linearized version of the flow model is also presented and discussed, which is appropriate for studying the complex modes and linear stability of flow/structure coupled systems as a function of the average axial gap velocity. Two applications of our formulation are presented: (1) first we study how an axial flow modifies the rigid-body motion of immersed plates falling under gravity; (2) then we compute the dynamical behavior of an immersed oscillating plate as a function of the axial gap flow velocity. Linear stability plots of oscillating plates are shown, as a function of the average fluid gap and of the axial flow velocity, for various scenarios of the loss terms. These results highlight the conditions leading to either the divergence or flutter instabilities. Numerical simulations of the nonlinear flow/structure dynamical responses are also presented, for both stable and unstable regimes. This work is of interest to a large body of real-life problems, for instance the dynamics of nuclear spent fuel racks immersed in a pool when subjected to seismic excitations, or the self-excited vibro-impact motions of valve-like components under axial flows. (authors)
DEFF Research Database (Denmark)
Chen, Zhiyong; Chen, Yandong; Guerrero, Josep M.
2016-01-01
This paper firstly presents an equivalent coupling circuit modeling of multi-parallel inverters in microgrid operating in grid-connected mode. By using the model, the coupling resonance phenomena are explicitly investigated through the mathematical approach, and the intrinsic and extrinsic resona...... to attenuate coupling resonance, and the most salient feature is that the optimal range of the damping parameter can be easily located through an initiatively graphic method. Finally, simulations and experiments verify the validity of the proposed modeling and method....
Directory of Open Access Journals (Sweden)
R. Daud
2013-06-01
Full Text Available Shielding interaction effects of two parallel edge cracks in finite-thickness plates subjected to remote tension load are analyzed using a developed finite element analysis program. In the present study, the crack interaction limit is evaluated based on the fitness-for-service (FFS) code, and focus is given to the weak crack interaction region, where the crack interval exceeds the length of the cracks (b > a). Crack interaction factors are evaluated based on Mode I stress intensity factors (SIFs) using a displacement extrapolation technique. Parametric studies involved a wide range of crack-to-width ratios (0.05 ≤ a/W ≤ 0.5) and crack interval ratios (b/a > 1). For validation, crack interaction factors are compared with single edge crack SIFs as a state of zero interaction. Within the considered range of parameters, the proposed numerical evaluation used to predict the crack interaction factor reduces the error of the existing analytical solution from 1.92% to 0.97% at higher a/W. In reference to FFS codes, the small discrepancy in the prediction of the crack interaction factor validates the reliability of the numerical model to predict crack interaction limits under shielding interaction effects. In conclusion, the numerical model gave a successful prediction in estimating the crack interaction limit, which can be used as a reference for the shielding orientation of other cracks.
Kulmala, A; Tenhunen, M
2012-11-07
The signal of the dosimetric detector is generally dependent on the shape and size of the sensitive volume of the detector. In order to optimize the performance of the detector and reliability of the output signal the effect of the detector size should be corrected or, at least, taken into account. The response of the detector can be modelled using the convolution theorem that connects the system input (actual dose), output (measured result) and the effect of the detector (response function) by a linear convolution operator. We have developed the super-resolution and non-parametric deconvolution method for determination of the cylinder symmetric ionization chamber radial response function. We have demonstrated that the presented deconvolution method is able to determine the radial response for the Roos parallel plate ionization chamber with a better than 0.5 mm correspondence with the physical measures of the chamber. In addition, the performance of the method was proved by the excellent agreement between the output factors of the stereotactic conical collimators (4-20 mm diameter) measured by the Roos chamber, where the detector size is larger than the measured field, and the reference detector (diode). The presented deconvolution method has a potential in providing reference data for more accurate physical models of the ionization chamber as well as for improving and enhancing the performance of the detectors in specific dosimetric problems.
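The convolution model underlying the abstract above, measured(x) = (true dose * detector response)(x), can be illustrated with a toy example. The step-like dose profile and triangular response kernel below are assumptions for demonstration, not the Roos chamber data:

```python
# Hedged sketch of the convolution theorem for detector response:
# a finite-size detector blurs sharp field edges over its width.
def convolve(signal, kernel):
    """Full discrete linear convolution, pure Python."""
    n, m = len(signal), len(kernel)
    out = [0.0] * (n + m - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out

dose = [0, 0, 1, 1, 1, 0, 0]     # idealized narrow-field dose profile
response = [0.25, 0.5, 0.25]     # assumed normalized detector response (sums to 1)
measured = convolve(dose, response)
print(measured)  # [0.0, 0.0, 0.25, 0.75, 1.0, 0.75, 0.25, 0.0, 0.0]
```

Deconvolution, as in the paper, goes the other way: given the measured profile and the response function, recover the true dose, which is ill-posed and motivates the regularized, non-parametric approach described.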
Napper, Lucy E; Harris, Peter R; Klein, William M P
2014-01-01
There is potential for fruitful integration of research using the Extended Parallel Process Model (EPPM) with research using Self-affirmation Theory. However, to date no studies have attempted to do this. This article reports an experiment that tests whether (a) the effects of a self-affirmation manipulation add to those of EPPM variables in predicting intentions to improve a health behavior and (b) self-affirmation moderates the relationship between EPPM variables and intentions. Participants (N = 80) were randomized to either a self-affirmation or control condition prior to receiving personally relevant health information about the risks of not eating at least five portions of fruit and vegetables per day. A hierarchical regression model revealed that efficacy, threat × efficacy, self-affirmation, and self-affirmation × efficacy all uniquely contributed to the prediction of intentions to eat at least five portions per day. Self-affirmed participants and those with higher efficacy reported greater motivation to change. Threat predicted intentions at low levels of efficacy, but not at high levels. Efficacy had a stronger relationship with intentions in the nonaffirmed condition than in the self-affirmed condition. The findings indicate that self-affirmation processes can moderate the impact of variables in the EPPM and also add to the variance explained. We argue that there is potential for integration of the two traditions of research, to the benefit of both.
Energy Technology Data Exchange (ETDEWEB)
Barbara Chapman
2012-02-01
OpenMP was not well recognized at the beginning of the project, around 2003, because of its limited use in DoE production applications and the immature hardware support for an efficient implementation. Yet in recent years it has been gradually adopted both in HPC applications, mostly in the form of MPI+OpenMP hybrid code, and in mid-scale desktop applications for scientific and experimental studies. We observed this trend and worked diligently to improve our OpenMP compiler and runtimes, as well as with the OpenMP standards organization, to make sure OpenMP evolved in a direction close to DoE missions. In the Center for Programming Models for Scalable Parallel Computing project, the HPCTools team at the University of Houston (UH), directed by Dr. Barbara Chapman, has been working with project partners, external collaborators and hardware vendors to increase the scalability and applicability of OpenMP for multi-core (and future manycore) platforms and for distributed memory systems by exploring different programming models, language extensions, compiler optimizations, as well as runtime library support.
Ghosal, Ashitava; Shyam, R. B. Ashith
2016-05-01
There is an increased thrust to harvest solar energy in India to meet increasing energy requirements and to minimize imported fossil fuels. In a solar power tower system, an array of tracking mirrors or heliostats is used to concentrate the incident solar energy on an elevated stationary receiver, and the thermal energy is then converted to electricity using a heat engine. The conventional methods of tracking are the Azimuth-Elevation (Az-El) and Target-Aligned (T-A) mounts. In both cases, the mirror is rotated about two mutually perpendicular axes and is supported at the center using a pedestal which is fixed to the ground. In this paper, a three degree-of-freedom parallel manipulator, namely the 3-RPS, is proposed for tracking the sun in a solar power tower system. We present modeling, simulation and design of the 3-RPS parallel manipulator and show its advantages over conventional Az-El and T-A mounts. The 3-RPS manipulator consists of three rotary (R), three prismatic (P) and three spherical (S) joints, and the mirror assembly is mounted at three points, in contrast to the Az-El and T-A mounts. The kinematic equations for sun tracking are derived for the 3-RPS manipulator and, from the simulations, we obtain the range of motion of the rotary, prismatic and spherical joints. Since the mirror assembly is mounted at three points, the wind load and self-weight are distributed and, as a consequence, the deflections due to loading are smaller than in conventional mounts. It is shown that the weight of the supporting structure is between 15% and 65% less than that of conventional systems. Hence, even though one additional actuator is used, larger-area mirrors can be used and costs can be reduced.
Gibiansky, Leonid; Gibiansky, Ekaterina; Bauer, Robert
2012-02-01
The paper compares performance of Nonmem estimation methods--first order conditional estimation with interaction (FOCEI), iterative two stage (ITS), Monte Carlo importance sampling (IMP), importance sampling assisted by mode a posteriori (IMPMAP), stochastic approximation expectation-maximization (SAEM), and Markov chain Monte Carlo Bayesian (BAYES), on the simulated examples of a monoclonal antibody with target-mediated drug disposition (TMDD), demonstrates how optimization of the estimation options improves performance, and compares standard errors of Nonmem parameter estimates with those predicted by PFIM 3.2 optimal design software. In the examples of the one- and two-target quasi-steady-state TMDD models with rich sampling, the parameter estimates and standard errors of the new Nonmem 7.2.0 ITS, IMP, IMPMAP, SAEM and BAYES estimation methods were similar to the FOCEI method, although larger deviation from the true parameter values (those used to simulate the data) was observed using the BAYES method for poorly identifiable parameters. Standard errors of the parameter estimates were in general agreement with the PFIM 3.2 predictions. The ITS, IMP, and IMPMAP methods with the convergence tester were the fastest methods, reducing the computation time by about ten times relative to the FOCEI method. Use of lower computational precision requirements for the FOCEI method reduced the estimation time by 3-5 times without compromising the quality of the parameter estimates, and equaled or exceeded the speed of the SAEM and BAYES methods. Use of parallel computations with 4-12 processors running on the same computer improved the speed proportionally to the number of processors with the efficiency (for 12 processor run) in the range of 85-95% for all methods except BAYES, which had parallelization efficiency of about 70%.
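The speedup and parallel-efficiency figures quoted above follow from the standard definitions S = T1/Tp and E = S/p. A minimal sketch, using hypothetical timings rather than the paper's measurements:

```python
def speedup(t_serial, t_parallel):
    """Speedup S = T1 / Tp: serial time over parallel time."""
    return t_serial / t_parallel

def efficiency(t_serial, t_parallel, n_procs):
    """Parallel efficiency E = S / p, between 0 and 1 in the typical case."""
    return speedup(t_serial, t_parallel) / n_procs

# Hypothetical timings: a 120 h serial fit finishing in 11.1 h on 12 cores
e = efficiency(120.0, 11.1, 12)   # about 0.90, i.e. within the 85-95% range
```

An efficiency near 1.0 means the run time shrinks almost proportionally to the processor count, which is what the abstract reports for all methods except BAYES.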
Guo, L.; Huang, H.; Gaston, D.; Redden, G. D.; Fox, D. T.; Fujita, Y.
2010-12-01
Inducing mineral precipitation in the subsurface is one potential strategy for immobilizing trace metal and radionuclide contaminants. Generating mineral precipitates in situ can be achieved by manipulating chemical conditions, typically through injection or in situ generation of reactants. How these reactants transport, mix and react within the medium controls the spatial distribution and composition of the resulting mineral phases. Multiple processes, including fluid flow, dispersive/diffusive transport of reactants, biogeochemical reactions and changes in porosity-permeability, are tightly coupled over a number of scales. Numerical modeling can be used to investigate the nonlinear coupling effects of these processes which are quite challenging to explore experimentally. Many subsurface reactive transport simulators employ a de-coupled or operator-splitting approach where transport equations and batch chemistry reactions are solved sequentially. However, such an approach has limited applicability for biogeochemical systems with fast kinetics and strong coupling between chemical reactions and medium properties. A massively parallel, fully coupled, fully implicit Reactive Transport simulator (referred to as “RAT”) based on a parallel multi-physics object-oriented simulation framework (MOOSE) has been developed at the Idaho National Laboratory. Within this simulator, systems of transport and reaction equations can be solved simultaneously in a fully coupled, fully implicit manner using the Jacobian Free Newton-Krylov (JFNK) method with additional advanced computing capabilities such as (1) physics-based preconditioning for solution convergence acceleration, (2) massively parallel computing and scalability, and (3) adaptive mesh refinements for 2D and 3D structured and unstructured mesh. The simulator was first tested against analytical solutions, then applied to simulating induced calcium carbonate mineral precipitation in 1D columns and 2D flow cells as analogs
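The Jacobian-Free Newton-Krylov (JFNK) method mentioned above solves the coupled nonlinear system without forming the Jacobian explicitly. As an illustration only (not the RAT/MOOSE code, whose equations are far richer), a sketch of JFNK applied to a toy 1D steady reaction-diffusion problem via SciPy's `newton_krylov`:

```python
import numpy as np
from scipy.optimize import newton_krylov

n = 50
h = 1.0 / (n + 1)

def residual(u):
    # Discrete steady reaction-diffusion u'' = 5*u**2 on (0, 1),
    # with Dirichlet boundary values u(0) = 1 and u(1) = 0.
    d2 = np.empty_like(u)
    d2[0] = (1.0 - 2.0 * u[0] + u[1]) / h**2
    d2[1:-1] = (u[:-2] - 2.0 * u[1:-1] + u[2:]) / h**2
    d2[-1] = (u[-2] - 2.0 * u[-1] + 0.0) / h**2
    return d2 - 5.0 * u**2

# Newton outer iteration, Krylov (LGMRES) inner solves; no explicit Jacobian.
u = newton_krylov(residual, np.zeros(n), f_tol=1e-9)
```

The production simulator additionally applies physics-based preconditioning to accelerate the inner Krylov solves, which this sketch omits.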
Adamuszek, Marta; Dabrowski, Marcin; Schmid, Daniel W.
2016-04-01
We present Folder, a numerical tool to simulate and analyse structure development in mechanically layered media during layer-parallel shortening or extension. Folder includes a graphical user interface that makes it easy to design complex geometrical models, define material parameters (including linear and non-linear rheology), and specify the type and amount of deformation. It also includes a range of features that facilitate the visualization and examination of relevant quantities, e.g. velocities, stress, rate of deformation, pressure, and finite strain. Folder contains a separate application that illustrates analytical solutions of growth rate spectra for layer-parallel shortening and extension of a single viscous layer. In the study, we also demonstrate a Folder application in which we examine the role of confinement on the growth rate spectrum and the fold shape evolution during deformation of a single layer subject to layer-parallel shortening. In the case of linear viscous materials used for the layer and matrix, close wall proximity leads to a decrease of the growth rate values. The decrease is more pronounced for larger wavelengths than for smaller ones, and the growth rate reduction is greater when the walls are set closer to the layer. Close confinement can also affect the wavelength selection process and significantly shift the position of the dominant wavelength. The influence of wall proximity on the growth rate spectrum in the case of non-linear viscous materials used for the layer and/or matrix is very different from the linear viscous case: we observe multiple maxima in the growth rate spectrum. The number, value and position of the growth rate maxima depend strongly on the closeness of the confinement. The maximum growth rate value for a selected range of layer-wall distances is much larger than in the case when the confinement effect is not taken
Keppenne, Christian L.; Rienecker, Michele; Borovikov, Anna Y.; Suarez, Max
1999-01-01
A massively parallel ensemble Kalman filter (EnKF) is used to assimilate temperature data from the TOGA/TAO array and altimetry from TOPEX/POSEIDON into a Pacific basin version of the NASA Seasonal to Interannual Prediction Project (NSIPP) quasi-isopycnal ocean general circulation model. The EnKF is an approximate Kalman filter in which the error-covariance propagation step is modeled by the integration of multiple instances of a numerical model. An estimate of the true error covariances is then inferred from the distribution of the ensemble of model state vectors. This implementation of the filter takes advantage of the inherent parallelism in the EnKF algorithm by running all the model instances concurrently. The Kalman filter update step also occurs in parallel by having each processor handle the observations that occur in the region of physical space for which it is responsible. The massively parallel data assimilation system is validated by withholding some of the data and then quantifying the extent to which the withheld information can be inferred from the assimilation of the remaining data. The distributions of the forecast and analysis error covariances predicted by the EnKF are also examined.
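The EnKF analysis step described above can be sketched compactly in NumPy. This is a minimal stochastic (perturbed-observations) EnKF update with a linear observation operator, not the NSIPP implementation; the state size, observation operator and error levels below are invented for illustration:

```python
import numpy as np

def enkf_update(ensemble, obs, obs_op, obs_err_std, rng):
    """Stochastic EnKF analysis step with perturbed observations.

    ensemble: (n_members, n_state); obs: (n_obs,);
    obs_op:   (n_obs, n_state) linear observation operator H.
    """
    n, _ = ensemble.shape
    X = ensemble - ensemble.mean(axis=0)            # state anomalies
    HX = ensemble @ obs_op.T                        # forecast observations
    HXa = HX - HX.mean(axis=0)                      # observation anomalies
    Pf_Ht = X.T @ HXa / (n - 1)                     # cross covariance P^f H^T
    S = HXa.T @ HXa / (n - 1) + obs_err_std**2 * np.eye(len(obs))
    K = Pf_Ht @ np.linalg.inv(S)                    # Kalman gain
    perturbed = obs + rng.normal(0.0, obs_err_std, size=(n, len(obs)))
    return ensemble + (perturbed - HX) @ K.T        # analysis ensemble

rng = np.random.default_rng(0)
ens = rng.normal(0.0, 1.0, size=(200, 3))           # 200 members, 3-var state
H = np.array([[1.0, 0.0, 0.0]])                     # observe first variable
updated = enkf_update(ens, np.array([2.0]), H, 0.1, rng)
```

After the update, the ensemble mean of the observed variable moves toward the observation and the ensemble spread shrinks, mirroring how the error covariances are inferred from, and fed back into, the ensemble.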
Research on a Hierarchical Visual Modeling System for Parallel Programs
Institute of Scientific and Technical Information of China (English)
徐祯; 孙济洲; 于策; 孙超; 汤善江
2011-01-01
Although visual modeling technology can effectively reduce the difficulty of designing parallel programs, complex hardware architectures still pose new challenges for parallel program design at the software level. To address these issues, this paper proposes a visual modeling methodology based on a hierarchical idea, together with a hierarchical modeling scheme for parallel programs, and designs and implements a modeling system called e-ParaModel for multi-core cluster environments. A modeling example is completed to verify the system's feasibility and applicability.
Directory of Open Access Journals (Sweden)
K Witte
2005-04-01
Introduction: An effective preventive health education program on drug abuse can be delivered by applying behavior change theories in a complementary fashion. Methods: The aim of this study was to assess the effectiveness of integrating self-control into the Extended Parallel Process Model (EPPM) for drug abuse behaviors. A sample of 189 governmental high school students participated in this survey. Information was collected individually through a researcher-designed questionnaire and a urinary rapid immunochromatography test for opium and marijuana. Results: The results show that 6.9% of students used drugs (especially opium and marijuana) and that peer pressure was a determinant factor in drug use. Moreover, the EPPM theoretical variables of perceived severity and perceived self-efficacy, together with self-control, were predictive of behavioral intention against substance abuse. Self-control had a significant effect on protection motivation and perceived efficacy. Low self-control was a predictive factor for drug abuse, and low self-control students had drug abuse experience. Conclusion: The results of this study suggest that integrating self-control into the EPPM can be effective in designing primary prevention programs against drug abuse, and in assessing abuse and deviance behaviors among adolescent populations, especially risk seekers.
Research and Development on Parallel-Programming Models
Institute of Scientific and Technical Information of China (English)
董仁举; 祝永志
2011-01-01
The parallel-programming model plays a very important role in distributed computing. With the growing need for high-performance computing and the emergence of many new technologies, parallel-programming models are also in continuous development and improvement. Two major parallel-programming models are first compared and analyzed in detail. Based on the advantages and disadvantages of these two models, the applicable scope and benefits of a two-level parallel model are studied. In view of new hardware developments, a new programming model, TBB+MPI, is proposed, and a matrix multiplication algorithm is implemented with it on a CMP-based cluster system. The experimental results show that TBB+MPI has clear advantages for multi-core cluster programming and is therefore better suited to multi-core clusters.
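The paper's two-level idea is MPI across nodes with TBB threads within each node, and its benchmark is matrix multiplication. As the abstract gives no code, a hedged Python stand-in can illustrate the block-row decomposition used by such an algorithm, with a thread pool in place of TBB (NumPy releases the GIL inside BLAS, so the row blocks genuinely run concurrently):

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def parallel_matmul(a, b, n_workers=4):
    """Block-row decomposition: worker i computes A_i @ B; blocks are stacked."""
    blocks = np.array_split(a, n_workers, axis=0)   # split A into row blocks
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        results = list(pool.map(lambda blk: blk @ b, blocks))
    return np.vstack(results)

rng = np.random.default_rng(1)
A, B = rng.random((64, 32)), rng.random((32, 16))
C = parallel_matmul(A, B)                            # equals A @ B
```

In the two-level model of the paper, the outer split over row blocks would be done by MPI ranks and this inner loop by TBB tasks; the decomposition itself is the same.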
Institute of Scientific and Technical Information of China (English)
Bin CHEN; Lao-bing ZHANG; Xiao-cheng LIU; Hans VANGHELUWE
2014-01-01
Improving simulation performance using activity tracking has attracted attention in the modeling field in recent years. The reference to activity has been successfully used to predict and improve simulation performance. Tracking activity, however, uses only the inherent performance information contained in the models. To extend activity prediction in modeling, we propose activity-enhanced modeling with an activity meta-model at the meta-level. The meta-model provides a set of interfaces to model activity in a specific domain. An activity model transformation is subsequently devised to deal with simulation differences due to heterogeneous activity models. Finally, a resource-aware simulation framework is implemented to integrate the activity models in activity-based simulation. The case study shows the improvement brought by activity-based simulation using the discrete event system specification (DEVS).
Alameda, J. C.
2011-12-01
Development and optimization of computational science models, particularly on high performance computers (and, with the advent of ubiquitous multicore processor systems, on practically every system), has been accomplished with basic software tools: typically, command-line compilers, debuggers and performance tools that have not changed substantially since the days of serial and early vector computers. However, model complexity, including the complexity added by modern message passing libraries such as MPI, and the need for hybrid code models (such as OpenMP and MPI) to take full advantage of high performance computers with an increasing core count per shared-memory node, have made development and optimization of such codes an increasingly arduous task. Additional architectural developments, such as many-core processors, only complicate the situation further. In this paper, we describe how our NSF-funded project, "SI2-SSI: A Productive and Accessible Development Workbench for HPC Applications Using the Eclipse Parallel Tools Platform" (WHPC), seeks to improve the Eclipse Parallel Tools Platform, an environment designed to support scientific code development targeted at a diverse set of high performance computing systems. Our WHPC project takes an application-centric view to improving PTP. We are using a set of scientific applications, each with a variety of challenges, both to drive further improvements to the scientific application itself and to understand shortcomings in Eclipse PTP from an application developer perspective, which in turn drives the list of improvements we seek to make. We are also partnering with performance tool providers to drive higher quality performance tool integration. We have partnered with the Cactus group at Louisiana State University to improve Eclipse's ability to work with computational frameworks and extremely complex build systems, as well as to develop educational materials to incorporate into
Li, Gen; Tang, Chun-An; Liang, Zheng-Zhao
2017-01-01
Multi-scale, high-resolution modeling of the rock failure process is a powerful means in modern rock mechanics studies to reveal complex failure mechanisms and to evaluate engineering risks. However, multi-scale continuous modeling of rock, from deformation and damage to failure, places high demands on the design, implementation scheme and computation capacity of the numerical software system. This study is aimed at developing a parallel finite element procedure, a parallel rock failure process analysis (RFPA) simulator, capable of modeling the whole trans-scale failure process of rock. Based on the statistical meso-damage mechanical method, the RFPA simulator is able to construct heterogeneous rock models with multiple mechanical properties and to handle and represent the trans-scale propagation of cracks, in which the stress and strain fields are solved for the damage evolution analysis of a representative volume element by the parallel finite element method (FEM) solver. This paper describes the theoretical basis of the approach and provides the details of the parallel implementation on an interactive Windows-Linux platform. A numerical model is built to test the parallel performance of the FEM solver. Numerical simulations are then carried out on a laboratory-scale uniaxial compression test, a field-scale net fracture spacing example and an engineering-scale rock slope example, respectively. The simulation results indicate that relatively high speedup and computation efficiency can be achieved by the parallel FEM solver with a reasonable boot process. In the laboratory-scale simulation, well-known physical phenomena, such as the macroscopic fracture pattern and stress-strain responses, can be reproduced. In the field-scale simulation, the formation process of net fracture spacing from initiation and propagation to saturation can be revealed completely. In the engineering-scale simulation, the whole progressive failure process of the rock slope can be well modeled. It is
Mamey, Mary Rose; Barbosa-Leiker, Celestina; McPherson, Sterling; Burns, G. Leonard; Parks, Craig; Roll, John
2015-01-01
Researchers often want to examine two comorbid conditions simultaneously. One strategy for doing so is parallel latent growth curve modeling (LGCM). This statistical technique allows the simultaneous evaluation of two disorders to determine the explanations and predictors of change over time. Additionally, a piecewise model can help identify whether there are more than two growth processes within each disorder (e.g., during a clinical trial). A parallel piecewise LGCM was applied to self-reported attention deficit/hyperactivity disorder (ADHD) and self-reported substance use (SU) symptoms in 303 adolescents enrolled in cognitive behavioral therapy treatment for a substance use disorder (SUD) and receiving either oral methylphenidate or placebo for ADHD across 16 weeks. Assessing these two disorders concurrently allowed us to determine whether elevated levels of one disorder predicted elevated levels or increased risk of the other disorder. First, a piecewise growth model measured ADHD and SU separately. Next, a parallel piecewise LGCM was used to estimate the regressions across disorders to determine whether higher baseline scores on one disorder (i.e., ADHD or SUD) predicted rates of change in the related disorder. Finally, treatment was added to the model to predict change. While the analyses revealed no significant relationships across disorders, this study explains and applies a parallel piecewise growth model to examine the developmental processes of comorbid conditions over the course of a clinical trial. Strengths of piecewise and parallel LGCMs for other addictions researchers interested in examining dual processes over time are discussed. PMID:26389639
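A piecewise growth model splits the trajectory into two slopes joined at a knot. As a much-simplified fixed-effects sketch of that idea (a full LGCM also estimates latent random effects; the weekly scores, knot position and coefficients below are invented, not the study's data):

```python
import numpy as np

weeks = np.arange(0, 17, dtype=float)   # 16-week trial, baseline at week 0
knot = 8.0                               # hypothetical change point

# Design matrix: intercept, pre-knot slope basis, post-knot slope basis
X = np.column_stack([
    np.ones_like(weeks),
    np.minimum(weeks, knot),             # grows until the knot, then flat
    np.maximum(weeks - knot, 0.0),       # zero until the knot, then grows
])

# Hypothetical symptom scores: steep early decline, flatter afterwards
y = 20.0 - 1.5 * np.minimum(weeks, knot) - 0.3 * np.maximum(weeks - knot, 0.0)

coef, *_ = np.linalg.lstsq(X, y, rcond=None)
# coef recovers [intercept, pre-knot slope, post-knot slope]
```

A parallel piecewise LGCM fits two such trajectories (e.g., ADHD and SU) jointly and regresses each disorder's growth factors on the other's baseline, which is what the study above tested.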
Directory of Open Access Journals (Sweden)
Muhammad Ali Ismail
2011-08-01
With the arrival of multi-cores, every processor now has built-in parallel computational power, which can be fully utilized only if the program in execution is written accordingly. This study is part of on-going research into the design of a new parallel programming model for multi-core processors. In this paper we present a combined parallel and concurrent implementation of the Lin-Kernighan heuristic (LKH-2) for solving the Travelling Salesman Problem (TSP) using a newly developed parallel programming model, SPC3 PM, for general purpose multi-core processors. This implementation is found to be very simple, highly efficient, scalable and less time consuming compared to existing LKH-2 serial implementations in a multi-core processing environment. We have tested our parallel implementation of LKH-2 with medium and large TSP instances from TSPLIB. For all these tests our proposed approach has shown much improved performance and scalability.
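LKH-2 implements the full Lin-Kernighan move sequence, which is too involved to reproduce here. A minimal 2-opt local search, a simplified relative of Lin-Kernighan (not LKH itself), illustrates the basic tour-improvement idea on a toy instance:

```python
import math
from itertools import combinations

def tour_length(tour, pts):
    """Total length of the closed tour visiting pts in the given order."""
    return sum(math.dist(pts[tour[i]], pts[tour[(i + 1) % len(tour)]])
               for i in range(len(tour)))

def two_opt(tour, pts):
    """Repeatedly reverse a segment while doing so shortens the tour."""
    improved = True
    while improved:
        improved = False
        for i, j in combinations(range(1, len(tour) + 1), 2):
            cand = tour[:i] + tour[i:j][::-1] + tour[j:]
            if tour_length(cand, pts) < tour_length(tour, pts) - 1e-12:
                tour, improved = cand, True
    return tour

# Toy instance: four corners of a unit square, started from a crossing tour
pts = [(0, 0), (0, 1), (1, 0), (1, 1)]
best = two_opt([0, 2, 1, 3], pts)       # untangles to the perimeter tour
```

Lin-Kernighan generalizes this by chaining variable-depth edge exchanges instead of single segment reversals, which is where LKH-2's strength (and the paper's parallelization opportunity) lies.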
Energy Technology Data Exchange (ETDEWEB)
Cai, Jun; Shi, Jiamin; Wang, Kuaishe; Wang, Wen; Wang, Qingjuan; Liu, Yingying [Xi'an Univ. of Architecture and Technology, Xi'an (China). School of Metallurgical Engineering; Li, Fuguo [Northwestern Polytechnical Univ., Xi'an (China). School of Materials Science and Engineering
2017-07-15
Constitutive analysis for hot working of Ti-6Al-4V alloy was carried out by using experimental stress-strain data from isothermal hot compression tests. A new kind of constitutive equation called a modified parallel constitutive model was proposed by considering the independent effects of strain, strain rate and temperature. The predicted flow stress data were compared with the experimental data. Statistical analysis was introduced to verify the validity of the developed constitutive equation. Subsequently, the accuracy of the proposed constitutive equations was evaluated by comparing with other constitutive models. The results showed that the developed modified parallel constitutive model based on multiple regression could predict flow stress of Ti-6Al-4V alloy with good correlation and generalization.
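The multiple-regression step behind such a constitutive fit can be sketched as follows. The functional form and every coefficient below are hypothetical illustrations of regressing flow stress on strain, log strain rate and temperature, not the paper's modified parallel model for Ti-6Al-4V:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 300

# Hypothetical hot-compression conditions
strain = rng.uniform(0.05, 0.9, n)
log_rate = rng.uniform(-3.0, 1.0, n)            # ln(strain rate / s^-1)
temp = rng.uniform(1100.0, 1250.0, n)           # temperature, K
t_scaled = (temp - 1175.0) / 75.0               # centred for conditioning

# Hypothetical linear law: sigma = b0 + b1*strain + b2*ln(rate) + b3*T'
true = np.array([120.0, 40.0, 25.0, -60.0])     # invented coefficients, MPa
X = np.column_stack([np.ones(n), strain, log_rate, t_scaled])
sigma = X @ true                                 # synthetic flow stress

coef, *_ = np.linalg.lstsq(X, sigma, rcond=None)  # recovers `true` exactly
```

The paper's model treats the effects of strain, strain rate and temperature as independent multiplicative/parallel terms rather than one linear surface, but the regression machinery for estimating the coefficients is of this kind.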