A Parallel Particle Swarm Optimization Algorithm Accelerated by Asynchronous Evaluations
Venter, Gerhard; Sobieszczanski-Sobieski, Jaroslaw
2005-01-01
A parallel Particle Swarm Optimization (PSO) algorithm is presented. Particle swarm optimization is a fairly recent addition to the family of non-gradient based, probabilistic search algorithms that is based on a simplified social model and is closely tied to swarming theory. Although PSO algorithms present several attractive properties to the designer, they are plagued by high computational cost as measured by elapsed time. One approach to reduce the elapsed time is to make use of coarse-grained parallelization to evaluate the design points. Previous parallel PSO algorithms were mostly implemented in a synchronous manner, where all design points within a design iteration are evaluated before the next iteration is started. This approach leads to poor parallel speedup in cases where a heterogeneous parallel environment is used and/or where the analysis time depends on the design point being analyzed. This paper introduces an asynchronous parallel PSO algorithm that greatly improves the parallel efficiency. The asynchronous algorithm is benchmarked on a cluster assembled of Apple Macintosh G5 desktop computers, using the multi-disciplinary optimization of a typical transport aircraft wing as an example.
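The difference between synchronous and asynchronous evaluation described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`sphere`, `async_pso`), the thread pool standing in for a cluster, and the coefficients `w`, `c1`, `c2` are all assumptions made for illustration. The point it shows is that a particle is moved and resubmitted as soon as its own evaluation returns, so no worker waits for the rest of the swarm.

```python
import random
from concurrent.futures import ThreadPoolExecutor, FIRST_COMPLETED, wait


def sphere(x):
    """Hypothetical cheap objective standing in for an expensive analysis."""
    return sum(v * v for v in x)


def async_pso(f, dim=2, n_particles=8, n_evals=200, workers=4, seed=0):
    rng = random.Random(seed)
    w, c1, c2 = 0.7, 1.5, 1.5  # assumed inertia and acceleration coefficients
    pos = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [list(p) for p in pos]
    pbest_f = [float("inf")] * n_particles
    gbest, gbest_f = list(pos[0]), float("inf")

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Submit every particle once; map each future back to its particle.
        pending = {pool.submit(f, list(pos[i])): i for i in range(n_particles)}
        submitted = n_particles
        while pending:
            # Wake up as soon as ANY evaluation finishes (no iteration barrier).
            done, _ = wait(list(pending), return_when=FIRST_COMPLETED)
            for fut in done:
                i = pending.pop(fut)
                fx = fut.result()
                if fx < pbest_f[i]:
                    pbest_f[i], pbest[i] = fx, list(pos[i])
                if fx < gbest_f:
                    gbest_f, gbest = fx, list(pos[i])
                if submitted < n_evals:
                    # Move this particle using whatever bests are known now,
                    # then hand it straight back to the pool.
                    for d in range(dim):
                        r1, r2 = rng.random(), rng.random()
                        vel[i][d] = (w * vel[i][d]
                                     + c1 * r1 * (pbest[i][d] - pos[i][d])
                                     + c2 * r2 * (gbest[d] - pos[i][d]))
                        pos[i][d] += vel[i][d]
                    pending[pool.submit(f, list(pos[i]))] = i
                    submitted += 1
    return gbest, gbest_f
```

A synchronous variant would instead wait for all `n_particles` futures before updating any particle, which is where the idle time arises when evaluation times differ.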
Aspects of computation on asynchronous parallel processors
International Nuclear Information System (INIS)
Wright, M.
1989-01-01
The increasing availability of asynchronous parallel processors has provided opportunities for original and useful work in scientific computing. However, the field of parallel computing is still in a highly volatile state, and researchers display a wide range of opinion about many fundamental questions such as models of parallelism, approaches for detecting and analyzing parallelism of algorithms, and tools that allow software developers and users to make effective use of diverse forms of complex hardware. This volume collects the work of researchers specializing in different aspects of parallel computing, who met to discuss the framework and the mechanics of numerical computing. The far-reaching impact of high-performance asynchronous systems is reflected in the wide variety of topics, which include scientific applications (e.g. linear algebra, lattice gauge simulation, ordinary and partial differential equations), models of parallelism, parallel language features, task scheduling, automatic parallelization techniques, tools for algorithm development in parallel environments, and system design issues
Asynchronous Parallelization of a CFD Solver
Abdi, Daniel S.; Bitsuamlak, Girma T.
2015-01-01
The article of record as published may be found at http://dx.doi.org/10.1155/2015/295393 A Navier-Stokes equations solver is parallelized to run on a cluster of computers using the domain decomposition method. Two approaches of communication and computation are investigated, namely, synchronous and asynchronous methods. Asynchronous communication between subdomains is not commonly used in CFD codes; however, it has the potential to alleviate scaling bottlenecks incurred due to process...
On the Convergence of Asynchronous Parallel Pattern Search
International Nuclear Information System (INIS)
Tamara Gibson Kolda
2002-01-01
In this paper the authors prove global convergence for asynchronous parallel pattern search. In standard pattern search, decisions regarding the update of the iterate and the step-length control parameter are synchronized implicitly across all search directions. They lose this feature in asynchronous parallel pattern search since the search along each direction proceeds semi-autonomously. By bounding the value of the step-length control parameter after any step that produces decrease along a single search direction, they can prove that all the processes share a common accumulation point and that such a point is a stationary point of the standard nonlinear unconstrained optimization problem
Parallel, Asynchronous Executive (PAX): System concepts, facilities, and architecture
Jones, W. H.
1983-01-01
The Parallel, Asynchronous Executive (PAX) is a software operating system simulation that allows many computers to work on a single problem at the same time. PAX is currently implemented on a UNIVAC 1100/42 computer system. Independent UNIVAC runstreams are used to simulate independent computers. Data are shared among independent UNIVAC runstreams through shared mass-storage files. PAX has achieved the following: (1) applied several computing processes simultaneously to a single, logically unified problem; (2) resolved most parallel processor conflicts by careful work assignment; (3) resolved by means of worker requests to PAX all conflicts not resolved by work assignment; (4) provided fault isolation and recovery mechanisms to meet the problems of an actual parallel, asynchronous processing machine. Additionally, one real-life problem has been constructed for the PAX environment. This is CASPER, a collection of aerodynamic and structural dynamic problem simulation routines. CASPER is not discussed in this report except to provide examples of parallel-processing techniques.
A Synchronous-Asynchronous Particle Swarm Optimisation Algorithm
Ab Aziz, Nor Azlina; Mubin, Marizan; Mohamad, Mohd Saberi; Ab Aziz, Kamarulzaman
2014-01-01
In the original particle swarm optimisation (PSO) algorithm, the particles' velocities and positions are updated after the whole swarm performance is evaluated. This algorithm is also known as synchronous PSO (S-PSO). The strength of this update method is in the exploitation of the information. Asynchronous update PSO (A-PSO) has been proposed as an alternative to S-PSO. A particle in A-PSO updates its velocity and position as soon as its own performance has been evaluated. Hence, particles are updated using partial information, leading to stronger exploration. In this paper, we attempt to improve PSO by merging both update methods to utilise the strengths of both methods. The proposed synchronous-asynchronous PSO (SA-PSO) algorithm divides the particles into smaller groups. The best member of a group and the swarm's best are chosen to lead the search. Members within a group are updated synchronously, while the groups themselves are asynchronously updated. Five well-known unimodal functions, four multimodal functions, and a real world optimisation problem are used to study the performance of SA-PSO, which is compared with the performances of S-PSO and A-PSO. The results are statistically analysed and show that the proposed SA-PSO has performed consistently well. PMID:25121109
An Evaluation of Parallel Synchronous and Conservative Asynchronous Logic-Level Simulations
Directory of Open Access Journals (Sweden)
Ausif Mahmood
1996-01-01
a circuit remain fixed during the entire simulation. We remove this limitation and, by extending the analyses to multi-input, multi-output circuits with an arbitrary number of input events, show that the conservative asynchronous simulation extracts more parallelism and executes faster than synchronous simulation in general. Our conclusions are supported by a comparison of the idealized execution times of synchronous and conservative asynchronous algorithms on ISCAS combinational and sequential benchmark circuits.
Massive Asynchronous Parallelization of Sparse Matrix Factorizations
Energy Technology Data Exchange (ETDEWEB)
Chow, Edmond [Georgia Inst. of Technology, Atlanta, GA (United States)
2018-01-08
Solving sparse problems is at the core of many DOE computational science applications. We focus on the challenge of developing sparse algorithms that can fully exploit the parallelism in extreme-scale computing systems, in particular systems with massive numbers of cores per node. Our approach is to express a sparse matrix factorization as a large number of bilinear constraint equations, and then solve these equations via an asynchronous iterative method. The unknowns in these equations are the matrix entries of the factorization that is desired.
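To make the idea of "bilinear constraint equations" concrete, the sketch below assumes an incomplete LU factorization with unit-diagonal L, whose sparsity pattern is taken to be the nonzeros of A; the abstract does not specify these details, and the function name `fixed_point_lu` is hypothetical. Each constraint (LU)_ij = a_ij is rearranged into an update for a single unknown. Every update in a sweep reads only the previous values, so the updates are independent of one another; this Jacobi-style sweep stands in here for a truly asynchronous execution.

```python
def fixed_point_lu(a, sweeps=10):
    """Fixed-point sweeps on the constraints sum_k l_ik * u_kj = a_ij."""
    n = len(a)
    # Initial guess: L = unit lower triangle of A, U = upper triangle of A.
    l = [[a[i][j] if i > j else float(i == j) for j in range(n)] for i in range(n)]
    u = [[a[i][j] if i <= j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(sweeps):
        new_l = [row[:] for row in l]
        new_u = [row[:] for row in u]
        for i in range(n):
            for j in range(n):
                if a[i][j] == 0:
                    continue  # outside the assumed sparsity pattern
                s = sum(l[i][k] * u[k][j] for k in range(min(i, j)))
                if i > j:
                    new_l[i][j] = (a[i][j] - s) / u[j][j]  # unknown l_ij
                else:
                    new_u[i][j] = a[i][j] - s              # unknown u_ij
        l, u = new_l, new_u
    return l, u
```

For a tridiagonal matrix such as `[[4, 1, 0], [1, 4, 1], [0, 1, 4]]` there is no fill-in, so after a few sweeps the product of the returned factors reproduces the matrix.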
A parallel algorithm for 3D particle tracking and Lagrangian trajectory reconstruction
International Nuclear Information System (INIS)
Barker, Douglas; Zhang, Yuanhui; Lifflander, Jonathan; Arya, Anshu
2012-01-01
Particle-tracking methods are widely used in fluid mechanics and multi-target tracking research because of their unique ability to reconstruct long trajectories with high spatial and temporal resolution. Researchers have recently demonstrated 3D tracking of several objects in real time, but as the number of objects is increased, real-time tracking becomes impossible due to data transfer and processing bottlenecks. This problem may be solved by using parallel processing. In this paper, a parallel-processing framework has been developed based on frame decomposition and is programmed using the asynchronous object-oriented Charm++ paradigm. This framework can be a key step in achieving a scalable Lagrangian measurement system for particle-tracking velocimetry and may lead to real-time measurement capabilities. The parallel tracking algorithm was evaluated with three data sets including the particle image velocimetry standard 3D images data set #352, a uniform data set for optimal parallel performance and a computational-fluid-dynamics-generated non-uniform data set to test trajectory reconstruction accuracy, consistency with the sequential version and scalability to more than 500 processors. The algorithm showed strong scaling up to 512 processors and no inherent limits of scalability were seen. Ultimately, up to a 200-fold speedup is observed compared to the serial algorithm when 256 processors were used. The parallel algorithm is adaptable and could be easily modified to use any sequential tracking algorithm, which inputs frames of 3D particle location data and outputs particle trajectories
Parallel asynchronous hardware implementation of image processing algorithms
Coon, Darryl D.; Perera, A. G. U.
1990-01-01
Research is being carried out on hardware for a new approach to focal plane processing. The hardware involves silicon injection mode devices. These devices provide a natural basis for parallel asynchronous focal plane image preprocessing. The simplicity and novel properties of the devices would permit an independent analog processing channel to be dedicated to every pixel. A laminar architecture built from arrays of the devices would form a two-dimensional (2-D) array processor with a 2-D array of inputs located directly behind a focal plane detector array. A 2-D image data stream would propagate in neuron-like asynchronous pulse-coded form through the laminar processor. No multiplexing, digitization, or serial processing would occur in the preprocessing state. High performance is expected, based on pulse coding of input currents down to one picoampere with noise referred to input of about 10 femtoamperes. Linear pulse coding has been observed for input currents ranging up to seven orders of magnitude. Low power requirements suggest utility in space and in conjunction with very large arrays. Very low dark current and multispectral capability are possible because of hardware compatibility with the cryogenic environment of high performance detector arrays. The aforementioned hardware development effort is aimed at systems which would integrate image acquisition and image processing.
A new asynchronous parallel algorithm for inferring large-scale gene regulatory networks.
Directory of Open Access Journals (Sweden)
Xiangyun Xiao
The reconstruction of gene regulatory networks (GRNs) from high-throughput experimental data has been considered one of the most important issues in systems biology research. With the development of high-throughput technology and the complexity of biological problems, we need to reconstruct GRNs that contain thousands of genes. However, when many existing algorithms are used to handle these large-scale problems, they will encounter two important issues: low accuracy and high computational cost. To overcome these difficulties, the main goal of this study is to design an effective parallel algorithm to infer large-scale GRNs based on high-performance parallel computing environments. In this study, we proposed a novel asynchronous parallel framework to improve the accuracy and lower the time complexity of large-scale GRN inference by combining splitting technology and ordinary differential equation (ODE)-based optimization. The presented algorithm uses the sparsity and modularity of GRNs to split whole large-scale GRNs into many small-scale modular subnetworks. Through the ODE-based optimization of all subnetworks in parallel and their asynchronous communications, we can easily obtain the parameters of the whole network. To test the performance of the proposed approach, we used well-known benchmark datasets from Dialogue for Reverse Engineering Assessments and Methods challenge (DREAM), experimentally determined GRN of Escherichia coli and one published dataset that contains more than 10 thousand genes to compare the proposed approach with several popular algorithms on the same high-performance computing environments in terms of both accuracy and time complexity. The numerical results demonstrate that our parallel algorithm exhibits obvious superiority in inferring large-scale GRNs.
A new asynchronous parallel algorithm for inferring large-scale gene regulatory networks.
Xiao, Xiangyun; Zhang, Wei; Zou, Xiufen
2015-01-01
The reconstruction of gene regulatory networks (GRNs) from high-throughput experimental data has been considered one of the most important issues in systems biology research. With the development of high-throughput technology and the complexity of biological problems, we need to reconstruct GRNs that contain thousands of genes. However, when many existing algorithms are used to handle these large-scale problems, they will encounter two important issues: low accuracy and high computational cost. To overcome these difficulties, the main goal of this study is to design an effective parallel algorithm to infer large-scale GRNs based on high-performance parallel computing environments. In this study, we proposed a novel asynchronous parallel framework to improve the accuracy and lower the time complexity of large-scale GRN inference by combining splitting technology and ordinary differential equation (ODE)-based optimization. The presented algorithm uses the sparsity and modularity of GRNs to split whole large-scale GRNs into many small-scale modular subnetworks. Through the ODE-based optimization of all subnetworks in parallel and their asynchronous communications, we can easily obtain the parameters of the whole network. To test the performance of the proposed approach, we used well-known benchmark datasets from Dialogue for Reverse Engineering Assessments and Methods challenge (DREAM), experimentally determined GRN of Escherichia coli and one published dataset that contains more than 10 thousand genes to compare the proposed approach with several popular algorithms on the same high-performance computing environments in terms of both accuracy and time complexity. The numerical results demonstrate that our parallel algorithm exhibits obvious superiority in inferring large-scale GRNs.
READ-EVAL-PRINT in Parallel and Asynchronous Proof-checking
Directory of Open Access Journals (Sweden)
Makarius Wenzel
2013-07-01
The LCF tradition of interactive theorem proving, which was started by Milner in the 1970s, appears to be tied to the classic READ-EVAL-PRINT-LOOP of sequential and synchronous evaluation of prover commands. We break up this loop and retrofit the read-eval-print phases into a model of parallel and asynchronous proof processing. Thus we explain some key concepts of the Isabelle/Scala approach to prover interaction and integration, and the Isabelle/jEdit Prover IDE as front-end technology. We hope to open up the scientific discussion about non-trivial interaction models for ITP systems again, and help get other old-school proof assistants on a similar track.
Parallel asynchronous systems and image processing algorithms
Coon, D. D.; Perera, A. G. U.
1989-01-01
A new hardware approach to implementation of image processing algorithms is described. The approach is based on silicon devices which would permit an independent analog processing channel to be dedicated to every pixel. A laminar architecture consisting of a stack of planar arrays of the device would form a two-dimensional array processor with a 2-D array of inputs located directly behind a focal plane detector array. A 2-D image data stream would propagate in neuronlike asynchronous pulse coded form through the laminar processor. Such systems would integrate image acquisition and image processing. Acquisition and processing would be performed concurrently as in natural vision systems. The research is aimed at implementation of algorithms, such as the intensity dependent summation algorithm and pyramid processing structures, which are motivated by the operation of natural vision systems. Implementation of natural vision algorithms would benefit from the use of neuronlike information coding and the laminar, 2-D parallel, vision system type architecture. Besides providing a neural network framework for implementation of natural vision algorithms, a 2-D parallel approach could eliminate the serial bottleneck of conventional processing systems. Conversion to serial format would occur only after raw intensity data has been substantially processed. An interesting challenge arises from the fact that the mathematical formulation of natural vision algorithms does not specify the means of implementation, so that hardware implementation poses intriguing questions involving vision science.
Asynchronous Task-Based Parallelization of Algebraic Multigrid
AlOnazi, Amani A.
2017-06-23
As processor clock rates become more dynamic and workloads become more adaptive, the vulnerability to global synchronization that already complicates programming for performance in today's petascale environment will be exacerbated. Algebraic multigrid (AMG), the solver of choice in many large-scale PDE-based simulations, scales well in the weak sense, with fixed problem size per node, on tightly coupled systems when loads are well balanced and core performance is reliable. However, its strong scaling to many cores within a node is challenging. Reducing synchronization and increasing concurrency are vital adaptations of AMG to hybrid architectures. Recent communication-reducing improvements to classical additive AMG by Vassilevski and Yang improve concurrency and increase communication-computation overlap, while retaining convergence properties close to those of standard multiplicative AMG, but remain bulk synchronous. We extend the Vassilevski and Yang additive AMG to asynchronous task-based parallelism using a hybrid MPI+OmpSs (from the Barcelona Supercomputer Center) within a node, along with MPI for internode communications. We implement a tiling approach to decompose the grid hierarchy into parallel units within task containers. We compare against the MPI-only BoomerAMG and the Auxiliary-space Maxwell Solver (AMS) in the hypre library for the 3D Laplacian operator and the electromagnetic diffusion, respectively. In time to solution for a full solve an MPI-OmpSs hybrid improves over an all-MPI approach in strong scaling at full core count (32 threads per single Haswell node of the Cray XC40) and maintains this per node advantage as both weak scale to thousands of cores, with MPI between nodes.
PENTACLE: Parallelized particle-particle particle-tree code for planet formation
Iwasawa, Masaki; Oshino, Shoichi; Fujii, Michiko S.; Hori, Yasunori
2017-10-01
We have newly developed a parallelized particle-particle particle-tree code for planet formation, PENTACLE, which is a parallelized hybrid N-body integrator executed on a CPU-based (super)computer. PENTACLE uses a fourth-order Hermite algorithm to calculate gravitational interactions between particles within a cut-off radius and a Barnes-Hut tree method for gravity from particles beyond. It also implements an open-source library designed for full automatic parallelization of particle simulations, FDPS (Framework for Developing Particle Simulator), to parallelize a Barnes-Hut tree algorithm for a memory-distributed supercomputer. These allow us to handle 1-10 million particles in a high-resolution N-body simulation on CPU clusters for collisional dynamics, including physical collisions in a planetesimal disc. In this paper, we show the performance and the accuracy of PENTACLE in terms of $\tilde{R}_{\rm cut}$ and a time-step $\Delta t$. It turns out that the accuracy of a hybrid N-body simulation is controlled through $\Delta t / \tilde{R}_{\rm cut}$, and $\Delta t / \tilde{R}_{\rm cut} \sim 0.1$ is necessary to simulate accurately the accretion process of a planet for $\geq 10^6$ yr. For all those interested in large-scale particle simulations, PENTACLE, customized for planet formation, will be freely available from https://github.com/PENTACLE-Team/PENTACLE under the MIT licence.
Amitai, Dganit; Averbuch, Amir; Itzikowitz, Samuel; Turkel, Eli
1991-01-01
A major problem in achieving significant speed-up on parallel machines is the overhead involved with synchronizing the concurrent process. Removing the synchronization constraint has the potential of speeding up the computation. The authors present asynchronous (AS) and corrected-asynchronous (CA) finite difference schemes for the multi-dimensional heat equation. Although the discussion concentrates on the Euler scheme for the solution of the heat equation, it has the potential for being extended to other schemes and other parabolic partial differential equations (PDEs). These schemes are analyzed and implemented on the shared memory multi-user Sequent Balance machine. Numerical results for one and two dimensional problems are presented. It is shown experimentally that the synchronization penalty can be about 50 percent of run time: in most cases, the asynchronous scheme runs twice as fast as the parallel synchronous scheme. In general, the efficiency of the parallel schemes increases with processor load, with the time level, and with the problem dimension. The efficiency of the AS may reach 90 percent and over, but it provides accurate results only for steady-state values. The CA, on the other hand, is less efficient, but provides more accurate results for intermediate (non steady-state) values.
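For orientation, the standard explicit Euler finite difference scheme for the one-dimensional heat equation $u_t = \alpha u_{xx}$ is written out below, together with one way an asynchronous variant can be expressed. The abstract does not give the exact AS and CA schemes, so the second formula is only an assumed illustration; the symbols $r$, $\alpha$, and the delays $k_{\pm}$ are introduced here and are not from the original.

```latex
% Synchronous explicit Euler step (stable for r <= 1/2):
u_i^{n+1} = u_i^{n} + r\left(u_{i+1}^{n} - 2u_i^{n} + u_{i-1}^{n}\right),
\qquad r = \frac{\alpha\,\Delta t}{\Delta x^{2}}

% Assumed asynchronous form: neighbour values owned by another processor
% may be read at older time levels n - k_{+} and n - k_{-}, with k_{\pm} \ge 0:
u_i^{n+1} = u_i^{n} + r\left(u_{i+1}^{\,n-k_{+}} - 2u_i^{n} + u_{i-1}^{\,n-k_{-}}\right)
```

With $k_{+} = k_{-} = 0$ the second form reduces to the synchronous step. Nonzero delays leave a steady state unchanged, since all time levels coincide there, which is consistent with the abstract's remark that the AS scheme is accurate only for steady-state values.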
International Nuclear Information System (INIS)
Tosic, P.T.
2011-01-01
We study certain types of Cellular Automata (CA) viewed as an abstraction of large-scale Multi-Agent Systems (MAS). We argue that the classical CA model needs to be modified in several important respects, in order to become a relevant and sufficiently general model for the large-scale MAS, and so that thus generalized model can capture many important MAS properties at the level of agent ensembles and their long-term collective behavior patterns. We specifically focus on the issue of inter-agent communication in CA, and propose sequential cellular automata (SCA) as the first step, and genuinely Asynchronous Cellular Automata (ACA) as the ultimate deterministic CA-based abstract models for large-scale MAS made of simple reactive agents. We first formulate deterministic and nondeterministic versions of sequential CA, and then summarize some interesting configuration space properties (i.e., possible behaviors) of a restricted class of sequential CA. In particular, we compare and contrast those properties of sequential CA with the corresponding properties of the classical (that is, parallel and perfectly synchronous) CA with the same restricted class of update rules. We analytically demonstrate failure of the studied sequential CA models to simulate all possible behaviors of perfectly synchronous parallel CA, even for a very restricted class of non-linear totalistic node update rules. The lesson learned is that the interleaving semantics of concurrency, when applied to sequential CA, is not refined enough to adequately capture the perfect synchrony of parallel CA updates. Last but not least, we outline what would be an appropriate CA-like abstraction for large-scale distributed computing insofar as the inter-agent communication model is concerned, and in that context we propose genuinely asynchronous CA. (author)
Data parallel sorting for particle simulation
Dagum, Leonardo
1992-01-01
Sorting on a parallel architecture is a communications intensive event which can incur a high penalty in applications where it is required. In the case of particle simulation, only integer sorting is necessary, and sequential implementations easily attain the minimum performance bound of O(N) for N particles. Parallel implementations, however, have to cope with the parallel sorting problem which, in addition to incurring a heavy communications cost, can make the minimum performance bound difficult to attain. This paper demonstrates how the sorting problem in a particle simulation can be reduced to a merging problem, and describes an efficient data parallel algorithm to solve this merging problem in a particle simulation. The new algorithm is shown to be optimal under conditions usual for particle simulation, and its fieldwise implementation on the Connection Machine is analyzed in detail. The new algorithm is about four times faster than a fieldwise implementation of radix sort on the Connection Machine.
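The O(N) sequential bound mentioned above comes from the fact that particles are sorted by a small integer key, such as a cell index. A minimal sketch of such a sequential integer sort is a counting sort; it is not the data parallel merging algorithm of the paper, and the name `counting_sort_by_cell` and its arguments are hypothetical.

```python
def counting_sort_by_cell(cells, n_cells):
    """Return particle indices ordered by cell index in O(N + n_cells) time.

    cells[p] is the integer cell index of particle p, in range(n_cells).
    """
    # Histogram: counts[c + 1] holds the number of particles in cell c.
    counts = [0] * (n_cells + 1)
    for c in cells:
        counts[c + 1] += 1
    # Prefix sum: counts[c] becomes the first output slot for cell c.
    for c in range(n_cells):
        counts[c + 1] += counts[c]
    # Scatter particles into their slots, keeping their original order
    # within each cell (the sort is stable).
    order = [0] * len(cells)
    for p, c in enumerate(cells):
        order[counts[c]] = p
        counts[c] += 1
    return order
```

For example, `counting_sort_by_cell([2, 0, 1, 0, 2], 3)` returns `[1, 3, 2, 0, 4]`: the two particles in cell 0 come first, then the particle in cell 1, then the two in cell 2.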
Design issues in the semantics and scheduling of asynchronous tasks.
Energy Technology Data Exchange (ETDEWEB)
Olivier, Stephen L.
2013-07-01
The asynchronous task model serves as a useful vehicle for shared memory parallel programming, particularly on multicore and manycore processors. As adoption of the model among programmers has increased, support has emerged for the integration of task parallel language constructs into mainstream programming languages, e.g., C and C++. This paper examines some of the design decisions in Cilk and OpenMP concerning semantics and scheduling of asynchronous tasks with the aim of informing the efforts of committees considering language integration, as well as developers of new task parallel languages and libraries.
Error characterization for asynchronous computations: Proxy equation approach
Sallai, Gabriella; Mittal, Ankita; Girimaji, Sharath
2017-11-01
Numerical techniques for asynchronous fluid flow simulations are currently under development to enable efficient utilization of massively parallel computers. These numerical approaches attempt to accurately solve time evolution of transport equations using spatial information at different time levels. The truncation error of asynchronous methods can be divided into two parts: delay dependent (EA) or asynchronous error and delay independent (ES) or synchronous error. The focus of this study is a specific asynchronous error mitigation technique called the proxy-equation approach. The aim of this study is to examine these errors as a function of the characteristic wavelength of the solution. Mitigation of asynchronous effects requires that the asynchronous error be smaller than the synchronous truncation error. For a simple convection-diffusion equation, proxy-equation error analysis identifies a critical initial wave-number, λc. At smaller wave numbers, synchronous errors are larger than asynchronous errors. We examine various approaches to increase the value of λc in order to improve the range of applicability of the proxy-equation approach.
Parallel pic plasma simulation through particle decomposition techniques
International Nuclear Information System (INIS)
Briguglio, S.; Vlad, G.; Di Martino, B.; Naples, Univ. 'Federico II'
1998-02-01
Particle-in-cell (PIC) codes are among the major candidates to yield a satisfactory description of the detail of kinetic effects, such as the resonant wave-particle interaction, relevant in determining the transport mechanism in magnetically confined plasmas. A significant improvement of the simulation performance of such codes can be expected from parallelization, e.g., by distributing the particle population among several parallel processors. Parallelization of a hybrid magnetohydrodynamic-gyrokinetic code has been accomplished within the High Performance Fortran (HPF) framework, and tested on the IBM SP2 parallel system, using a 'particle decomposition' technique. The adopted technique requires a moderate effort in porting the code in parallel form and results in intrinsic load balancing and modest inter processor communication. The performance tests obtained confirm the hypothesis of high effectiveness of the strategy, if targeted towards moderately parallel architectures. Optimal use of resources is also discussed with reference to a specific physics problem [it
Parallelization Issues and Particle-In-Cell Codes.
Elster, Anne Cathrine
1994-01-01
"Everything should be made as simple as possible, but not simpler." Albert Einstein. The field of parallel scientific computing has concentrated on parallelization of individual modules such as matrix solvers and factorizers. However, many applications involve several interacting modules. Our analyses of a particle-in-cell code modeling charged particles in an electric field, show that these accompanying dependencies affect data partitioning and lead to new parallelization strategies concerning processor, memory and cache utilization. Our test-bed, a KSR1, is a distributed memory machine with a globally shared addressing space. However, most of the new methods presented hold generally for hierarchical and/or distributed memory systems. We introduce a novel approach that uses dual pointers on the local particle arrays to keep the particle locations automatically partially sorted. Complexity and performance analyses with accompanying KSR benchmarks, have been included for both this scheme and for the traditional replicated grids approach. The latter approach maintains load-balance with respect to particles. However, our results demonstrate it fails to scale properly for problems with large grids (say, greater than 128-by-128) running on as few as 15 KSR nodes, since the extra storage and computation time associated with adding the grid copies, becomes significant. Our grid partitioning scheme, although harder to implement, does not need to replicate the whole grid. Consequently, it scales well for large problems on highly parallel systems. It may, however, require load balancing schemes for non-uniform particle distributions. Our dual pointer approach may facilitate this through dynamically partitioned grids. We also introduce hierarchical data structures that store neighboring grid-points within the same cache -line by reordering the grid indexing. This alignment produces a 25% savings in cache-hits for a 4-by-4 cache. A consideration of the input data's effect on
A Parallel Particle Swarm Optimizer
National Research Council Canada - National Science Library
Schutte, J. F.; Fregly, B. J.; Haftka, R. T.; George, A. D.
2003-01-01
.... Motivated by a computationally demanding biomechanical system identification problem, we introduce a parallel implementation of a stochastic population based global optimizer, the Particle Swarm...
Multithreaded Asynchronous Graph Traversal for In-Memory and Semi-External Memory
Pearce, Roger
2010-11-01
Processing large graphs is becoming increasingly important for many domains such as social networks, bioinformatics, etc. Unfortunately, many algorithms and implementations do not scale with increasing graph sizes. As a result, researchers have attempted to meet the growing data demands using parallel and external memory techniques. We present a novel asynchronous approach to compute Breadth-First-Search (BFS), Single-Source-Shortest-Paths, and Connected Components for large graphs in shared memory. Our highly parallel asynchronous approach hides data latency due to both poor locality and delays in the underlying graph data storage. We present an experimental study applying our technique to both In-Memory and Semi-External Memory graphs utilizing multi-core processors and solid-state memory devices. Our experiments using synthetic and real-world datasets show that our asynchronous approach is able to overcome data latencies and provide significant speedup over alternative approaches. For example, on billion vertex graphs our asynchronous BFS scales up to 14x on 16-cores. © 2010 IEEE.
Parallel treatment of simulation particles in particle-in-cell codes on SUPRENUM
International Nuclear Information System (INIS)
Seldner, D.
1990-02-01
This report contains the program documentation and description of the program package 2D-PLAS, which has been developed at the Nuclear Research Center Karlsruhe in the Institute for Data Processing in Technology (IDT) under the auspices of the BMFT. 2D-PLAS is a parallel program version of the treatment of the simulation particles of the two-dimensional stationary particle-in-cell code BFCPIC which has been developed at the Nuclear Research Center Karlsruhe. This parallel version has been designed for the parallel computer SUPRENUM.
A Block-Asynchronous Relaxation Method for Graphics Processing Units
Anzt, H.; Dongarra, J.; Heuveline, Vincent; Tomov, S.
2011-01-01
In this paper, we analyze the potential of asynchronous relaxation methods on Graphics Processing Units (GPUs). For this purpose, we developed a set of asynchronous iteration algorithms in CUDA and compared them with a parallel implementation of synchronous relaxation methods on CPU-based systems. For a set of test matrices taken from the University of Florida Matrix Collection we monitor the convergence behavior, the average iteration time and the total time-to-solution time. Analyzing the r...
Study of a centrifugal pump, asynchronous motor and inverter, using ...
African Journals Online (AJOL)
The signals generated by the micro controller have been used to program the parallel port of a computer. By reading the recorded bits of the parallel port in LabVIEW software, the signals from the micro controller have been restored and made available to the simulation model of the three-phase inverter, asynchronous ...
Parallel-vector algorithms for particle simulations on shared-memory multiprocessors
International Nuclear Information System (INIS)
Nishiura, Daisuke; Sakaguchi, Hide
2011-01-01
Over the last few decades, the computational demands of massive particle-based simulations for both scientific and industrial purposes have been continuously increasing. Hence, considerable efforts are being made to develop parallel computing techniques on various platforms. In such simulations, particles freely move within a given space, and so on a distributed-memory system, load balancing, i.e., assigning an equal number of particles to each processor, is not guaranteed. Shared-memory systems, by contrast, achieve better load balancing for particle models, but suffer from the intrinsic drawback of memory access competition, particularly during (1) pairing of contact candidates from among neighboring particles and (2) force summation for each particle. Here, novel algorithms are proposed to overcome these two problems. For the first problem, the key is a pre-conditioning process during which particle labels are sorted by a cell label in the domain to which the particles belong. Then, a list of contact candidates is constructed by pairing the sorted particle labels. For the latter problem, a table comprising the list indexes of the contact candidate pairs is created and used to sum the contact forces acting on each particle for all contacts according to Newton's third law. With just these methods, memory access competition is avoided without additional redundant procedures. The parallel efficiency and compatibility of these two algorithms were evaluated in discrete element method (DEM) simulations on four types of shared-memory parallel computers: a multicore multiprocessor computer, scalar supercomputer, vector supercomputer, and graphics processing unit. The computational efficiency of a DEM code was found to be drastically improved with our algorithms on all but the scalar supercomputer. Thus, the developed parallel algorithms are useful on shared-memory parallel computers with sufficient memory bandwidth.
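The two steps described in this abstract (sorting particle labels by cell to build contact candidates, then summing forces through a per-particle table of pair indexes) can be sketched serially as follows. This is one plausible reading of the abstract, not the authors' code: the 2D cell grid, the half-stencil neighbor search, the scalar forces and all function names are assumptions made for illustration.

```python
from itertools import combinations

def contact_candidates(positions, cell_size):
    """Return candidate pairs (i, j), i < j, from the same or adjacent 2D cells."""
    cell_of = [(int(x // cell_size), int(y // cell_size)) for x, y in positions]
    # Pre-conditioning: sort particle labels by their cell label.
    order = sorted(range(len(positions)), key=lambda i: cell_of[i])
    cells = {}
    for i in order:
        cells.setdefault(cell_of[i], []).append(i)
    pairs = set()
    for (cx, cy), members in cells.items():
        for i, j in combinations(members, 2):          # pairs inside one cell
            pairs.add((min(i, j), max(i, j)))
        # Half stencil: each pair of neighboring cells is visited only once.
        for dx, dy in ((1, -1), (1, 0), (1, 1), (0, 1)):
            for j in cells.get((cx + dx, cy + dy), []):
                for i in members:
                    pairs.add((min(i, j), max(i, j)))
    return sorted(pairs)

def build_pair_table(n, pairs):
    """For each particle, list (pair index, sign) for every pair it belongs to."""
    table = [[] for _ in range(n)]
    for k, (i, j) in enumerate(pairs):
        table[i].append((k, +1.0))   # force on i due to j
        table[j].append((k, -1.0))   # equal and opposite (Newton's third law)
    return table

def sum_forces(table, pair_forces):
    """Gather: each particle reads its own pairs, so no two writers collide."""
    return [sum(sign * pair_forces[k] for k, sign in entries) for entries in table]

positions = [(0.1, 0.1), (0.4, 0.3), (1.2, 0.2), (3.5, 3.5)]
pairs = contact_candidates(positions, cell_size=1.0)
table = build_pair_table(len(positions), pairs)
forces = sum_forces(table, [1.0, 2.0, 0.5])   # made-up scalar force per pair
print(pairs, forces)
```

The point of the table is that force summation becomes a gather per particle rather than a scatter per pair, so each output entry has a single writer and could be computed by an independent thread.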
Load-balancing techniques for a parallel electromagnetic particle-in-cell code
Energy Technology Data Exchange (ETDEWEB)
Plimpton, Steven J.; Seidel, David B.; Pasik, Michael F.; Coats, Rebecca S.
2000-01-01
QUICKSILVER is a 3-d electromagnetic particle-in-cell simulation code developed and used at Sandia to model relativistic charged particle transport. It models the time-response of electromagnetic fields and low-density-plasmas in a self-consistent manner: the fields push the plasma particles and the plasma current modifies the fields. Through an LDRD project a new parallel version of QUICKSILVER was created to enable large-scale plasma simulations to be run on massively-parallel distributed-memory supercomputers with thousands of processors, such as the Intel Tflops and DEC CPlant machines at Sandia. The new parallel code implements nearly all the features of the original serial QUICKSILVER and can be run on any platform which supports the message-passing interface (MPI) standard as well as on single-processor workstations. This report describes basic strategies useful for parallelizing and load-balancing particle-in-cell codes, outlines the parallel algorithms used in this implementation, and provides a summary of the modifications made to QUICKSILVER. It also highlights a series of benchmark simulations which have been run with the new code that illustrate its performance and parallel efficiency. These calculations have up to a billion grid cells and particles and were run on thousands of processors. This report also serves as a user manual for people wishing to run parallel QUICKSILVER.
Generalized Asynchronous Systems
Directory of Open Access Journals (Sweden)
E. S. Kudryashova
2012-01-01
The paper considers a mathematical model of a concurrent system, a special case of which is an asynchronous system. Distributed asynchronous automata are introduced here. It is proved that Petri nets and transition systems with independence can be considered as distributed asynchronous automata. Time distributed asynchronous automata are defined in a standard way by a correspondence which relates events with time intervals. It is proved that the time distributed asynchronous automata generalize time Petri nets and asynchronous systems.
Directory of Open Access Journals (Sweden)
Ufnalski Bartlomiej
2014-12-01
In this paper two different update schemes for the recently developed plug-in direct particle swarm repetitive controller (PDPSRC) are investigated and compared. The proposed approach employs the particle swarm optimizer (PSO) to solve in on-line mode a dynamic optimization problem (DOP) related to the control task in the constant-amplitude constant-frequency voltage-source inverter (CACF VSI) with an LC output filter. The effectiveness of synchronous and asynchronous update rules, both commonly used in static optimization problems (SOPs), is assessed and compared in the case of PDPSRC. The performance of the controller, when synthesized using each of the update schemes, is studied numerically.
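The difference between the synchronous and asynchronous update rules mentioned above can be shown on a static test problem. The sketch below is a generic textbook-style PSO, not the PDPSRC controller: the sphere objective, the coefficients w, c1, c2 and the function names are assumptions chosen only for illustration. In the synchronous variant every particle in an iteration is guided by the global best frozen at the start of that iteration; in the asynchronous variant an improved global best is used by the very next particle.

```python
import random

def sphere(x):
    """Toy static objective (assumption): minimum 0 at the origin."""
    return sum(v * v for v in x)

def pso(f, dim=2, n_particles=10, iters=50, asynchronous=False, seed=0,
        w=0.7, c1=1.5, c2=1.5):
    rng = random.Random(seed)
    pos = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [f(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    for _ in range(iters):
        frozen = gbest[:]  # snapshot used by every particle in synchronous mode
        for i in range(n_particles):
            guide = gbest if asynchronous else frozen
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (guide[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = f(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
            if val < gbest_val:
                # Recorded immediately; only the asynchronous rule lets the
                # remaining particles of this iteration see it.
                gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val

start_val = pso(sphere, iters=0)[1]            # best of the initial swarm
sync_val = pso(sphere, asynchronous=False)[1]
async_val = pso(sphere, asynchronous=True)[1]
print(start_val, sync_val, async_val)
```

Which rule performs better is problem dependent, which is why the abstract treats the comparison as an empirical question.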
Kinetic-Monte-Carlo-Based Parallel Evolution Simulation Algorithm of Dust Particles
Directory of Open Access Journals (Sweden)
Xiaomei Hu
2014-01-01
The evolution simulation of dust particles provides an important way to analyze the impact of dust on the environment. A KMC-based parallel algorithm is proposed to simulate the evolution of dust particles. In this parallel evolution simulation algorithm, a data distribution scheme and a communication optimization strategy are introduced to balance the load of every process and reduce the communication expense among processes. The experimental results show that the diffusion, sedimentation, and resuspension of dust particles in a virtual campus can be simulated and that the parallel algorithm shortens the simulation time, which overcomes the limitations of serial computing and makes the simulation of large-scale virtual environments possible.
On Scalable Deep Learning and Parallelizing Gradient Descent
AUTHOR|(CDS)2129036; Möckel, Rico; Baranowski, Zbigniew; Canali, Luca
Speeding up gradient based methods has been a subject of interest over the past years with many practical applications, especially with respect to Deep Learning. Despite the fact that many optimizations have been done on a hardware level, the convergence rate of very large models remains problematic. Therefore, data parallel methods next to mini-batch parallelism have been suggested to further decrease the training time of parameterized models using gradient based methods. Nevertheless, asynchronous optimization was considered too unstable for practical purposes owing to a lack of understanding of the underlying mechanisms. Recently, a theoretical contribution has been made which defines asynchronous optimization in terms of (implicit) momentum due to the presence of a queuing model of gradients based on past parameterizations. This thesis mainly builds upon this work to construct a better understanding why asynchronous optimization shows proportionally more divergent behavior when the number of parallel worker...
Parallelization of a Monte Carlo particle transport simulation code
Hadjidoukas, P.; Bousis, C.; Emfietzoglou, D.
2010-05-01
We have developed a high performance version of the Monte Carlo particle transport simulation code MC4. The original application code, developed in Visual Basic for Applications (VBA) for Microsoft Excel, was first rewritten in the C programming language to improve code portability. Several pseudo-random number generators have also been integrated and studied. The new MC4 version was then parallelized for shared and distributed-memory multiprocessor systems using the Message Passing Interface. Two parallel pseudo-random number generator libraries (SPRNG and DCMT) have been seamlessly integrated. The performance speedup of parallel MC4 has been studied on a variety of parallel computing architectures including an Intel Xeon server with 4 dual-core processors, a Sun cluster consisting of 16 nodes of 2 dual-core AMD Opteron processors and a 200 dual-processor HP cluster. For large problem size, which is limited only by the physical memory of the multiprocessor server, the speedup results are almost linear on all systems. We have validated the parallel implementation against the serial VBA and C implementations using the same random number generator. Our experimental results on the transport and energy loss of electrons in a water medium show that the serial and parallel codes are equivalent in accuracy. The present improvements allow for studying higher particle energies with the use of more accurate physical models, and improve statistics as more particle tracks can be simulated in low response time.
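The validation idea in this abstract (parallel and serial runs agree when they draw from the same random number streams) can be sketched with per-chunk generators. This is a toy stand-in rather than MC4: the exponential free-path "physics", the seeding scheme and the function names are assumptions, differently seeded standard-library generators stand in for dedicated parallel RNG libraries such as SPRNG or DCMT, and a thread pool is used only to keep the example self-contained where a real code would use MPI or processes.

```python
import random
from concurrent.futures import ThreadPoolExecutor

def simulate_chunk(task):
    """Track n toy histories with a private RNG stream; return total path length."""
    stream_id, n = task
    rng = random.Random(1000 + stream_id)   # one stream per chunk (assumed seeding)
    # Toy physics (assumption): exponentially distributed free path with mean 1.
    return sum(rng.expovariate(1.0) for _ in range(n))

def run(n_histories, n_chunks, workers=None):
    """Mean path length; serial when workers is None, otherwise thread-parallel."""
    base, extra = divmod(n_histories, n_chunks)
    tasks = [(k, base + (1 if k < extra else 0)) for k in range(n_chunks)]
    if workers is None:
        partial = [simulate_chunk(t) for t in tasks]
    else:
        with ThreadPoolExecutor(max_workers=workers) as pool:
            partial = list(pool.map(simulate_chunk, tasks))  # keeps task order
    return sum(partial) / n_histories

serial_mean = run(20000, 8)
parallel_mean = run(20000, 8, workers=4)
print(serial_mean, parallel_mean)
```

Because each chunk owns its stream and partial results are combined in a fixed order, the parallel result reproduces the serial one exactly, which makes regressions in the parallel code easy to detect.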
Directory of Open Access Journals (Sweden)
Zhang Wei
2005-01-01
The optimum and many suboptimum iterative soft-input soft-output (SISO) multiuser detectors require a priori information about the multiuser system, such as the users' transmitted signature waveforms, relative delays, as well as the channel impulse response. In this paper, we employ adaptive algorithms in the SISO multiuser detector in order to avoid the need for this a priori information. First, we derive the optimum SISO parallel decision-feedback detector for asynchronous coded DS-CDMA systems. Then, we propose two adaptive versions of this SISO detector, which are based on the normalized least mean square (NLMS) and recursive least squares (RLS) algorithms. Our SISO adaptive detectors effectively exploit the a priori information of coded symbols, whose soft inputs are obtained from a bank of single-user decoders. Furthermore, we consider how to select practical finite feedforward and feedback filter lengths to obtain a good tradeoff between the performance and computational complexity of the receiver.
Asynchronous Multiparty Computation
DEFF Research Database (Denmark)
Damgård, Ivan Bjerre; Geisler, Martin; Krøigaard, Mikkel
2009-01-01
guarantees termination if the adversary allows a preprocessing phase to terminate, in which no information is released. The communication complexity of this protocol is the same as that of a passively secure solution up to a constant factor. It is secure against an adaptive and active adversary corrupting...... less than n/3 players. We also present a software framework for implementation of asynchronous protocols called VIFF (Virtual Ideal Functionality Framework), which allows automatic parallelization of primitive operations such as secure multiplications, without having to resort to complicated...... multithreading. Benchmarking of a VIFF implementation of our protocol confirms that it is applicable to practical non-trivial secure computations....
Energetic particle diffusion coefficients upstream of quasi-parallel interplanetary shocks
Tan, L. C.; Mason, G. M.; Gloeckler, G.; Ipavich, F. M.
1989-01-01
The properties of about 30 to 130-keV/e protons and alpha particles upstream of six quasi-parallel interplanetary shocks that passed by the ISEE 3 spacecraft during 1978-1979 were analyzed, and the values for the upstream energetic particle diffusion coefficient, kappa, in these six events were deduced for a number of energies and upstream positions. These observations were compared with predictions of Lee's (1983) theory of shock acceleration. It was found that the observations verified the prediction of the A/Q dependence (where A and Q are the particle atomic mass and ionization state, respectively) of kappa for alpha and proton particles upstream of the quasi-parallel shocks.
Multithreaded Asynchronous Graph Traversal for In-Memory and Semi-External Memory
Pearce, Roger; Gokhale, Maya; Amato, Nancy M.
2010-01-01
. Our highly parallel asynchronous approach hides data latency due to both poor locality and delays in the underlying graph data storage. We present an experimental study applying our technique to both In-Memory and Semi-External Memory graphs utilizing
Directory of Open Access Journals (Sweden)
E. G. Dada
2017-04-01
Acute damage to the retinal vessels has been identified as a main cause of blindness and impaired vision all over the world. Timely detection and control of these illnesses can greatly decrease the number of cases of sight loss. Developing a high-performance unsupervised retinal vessel segmentation technique is a difficult task. This paper presents a study of the Primal-Dual Asynchronous Particle Swarm Optimisation (pdAPSO) method for the segmentation of retinal vessels. A maximum average accuracy rate of 0.9243, with an average specificity rate of 0.9834 and an average sensitivity rate of 0.5721, was achieved on the DRIVE database. The proposed method produces higher mean sensitivity and accuracy rates in the same range of very good specificity.
International Nuclear Information System (INIS)
Satake, Shin-ichi; Kanamori, Hiroyuki; Kunugi, Tomoaki; Sato, Kazuho; Ito, Tomoyoshi; Yamamoto, Keisuke
2007-01-01
We have developed a parallel algorithm for microdigital-holographic particle-tracking velocimetry. The algorithm is used in (1) numerical reconstruction of a particle image computer using a digital hologram, and (2) searching for particles. The numerical reconstruction from the digital hologram makes use of the Fresnel diffraction equation and the FFT (fast Fourier transform), whereas the particle search algorithm looks for local maximum graduation in a reconstruction field represented by a 3D matrix. To achieve high performance computing for both calculations (reconstruction and particle search), two memory partitions are allocated to the 3D matrix. In this matrix, the reconstruction part consists of horizontally placed 2D memory partitions on the x-y plane for the FFT, whereas the particle search part consists of vertically placed 2D memory partitions set along the z axes. Consequently, the scalability can be obtained for the proportion of processor elements, where the benchmarks are carried out for parallel computation by a SGI Altix machine.
A software framework for the portable parallelization of particle-mesh simulations
DEFF Research Database (Denmark)
Sbalzarini, I.F.; Walther, Jens Honore; Polasek, B.
2006-01-01
Abstract: We present a software framework for the transparent and portable parallelization of simulations using particle-mesh methods. Particles are used to transport physical properties and a mesh is required in order to reinitialize the distorted particle locations, ensuring the convergence...
Low latency asynchronous interface circuits
Sadowski, Greg
2017-06-20
In one form, a logic circuit includes an asynchronous logic circuit, a synchronous logic circuit, and an interface circuit coupled between the asynchronous logic circuit and the synchronous logic circuit. The asynchronous logic circuit has a plurality of asynchronous outputs for providing a corresponding plurality of asynchronous signals. The synchronous logic circuit has a plurality of synchronous inputs corresponding to the plurality of asynchronous outputs, a stretch input for receiving a stretch signal, and a clock output for providing a clock signal. The synchronous logic circuit provides the clock signal as a periodic signal but prolongs a predetermined state of the clock signal while the stretch signal is active. The asynchronous interface detects whether metastability could occur when latching any of the plurality of the asynchronous outputs of the asynchronous logic circuit using said clock signal, and activates the stretch signal while the metastability could occur.
3-D electromagnetic plasma particle simulations on the Intel Delta parallel computer
International Nuclear Information System (INIS)
Wang, J.; Liewer, P.C.
1994-01-01
A three-dimensional electromagnetic PIC code has been developed on the 512 node Intel Touchstone Delta MIMD parallel computer. This code is based on the General Concurrent PIC algorithm which uses a domain decomposition to divide the computation among the processors. The 3D simulation domain can be partitioned into 1-, 2-, or 3-dimensional sub-domains. Particles must be exchanged between processors as they move among the subdomains. The Intel Delta allows one to use this code for very-large-scale simulations (i.e. over 10^8 particles and 10^6 grid cells). The parallel efficiency of this code is measured, and the overall code performance on the Delta is compared with that on Cray supercomputers. It is shown that the code runs with a high parallel efficiency of ≥ 95% for large size problems. The particle push time achieved is 115 nsecs/particle/time step for 162 million particles on 512 nodes. Compared with the performance on a single processor Cray C90, this represents a factor of 58 speedup. The code uses a finite-difference leapfrog method for the field solve, which is significantly more efficient than fast Fourier transforms on parallel computers. The performance of this code on the 128 node Cray T3D will also be discussed.
A parallel implementation of particle tracking with space charge effects on an INTEL iPSC/860
International Nuclear Information System (INIS)
Chang, L.; Bourianoff, G.; Cole, B.; Machida, S.
1993-05-01
Particle-tracking simulation is one of the scientific applications that is well-suited to parallel computations. At the Superconducting Super Collider, it has been theoretically and empirically demonstrated that particle tracking on a designed lattice can achieve very high parallel efficiency on a MIMD Intel iPSC/860 machine. The key to such success is the realization that the particles can be tracked independently without considering their interaction. The perfectly parallel nature of particle tracking is broken if the interaction effects between particles are included. The space charge introduces an electromagnetic force that will affect the motion of tracked particles in 3-D space. For accurate modeling of the beam dynamics with space charge effects, one needs to solve three-dimensional Maxwell field equations, usually by a particle-in-cell (PIC) algorithm. This will require each particle to communicate with its neighbor grids to compute the momentum changes at each time step. It is expected that the 3-D PIC method will degrade parallel efficiency of particle-tracking implementation on any parallel computer. In this paper, we describe an efficient scheme for implementing particle tracking with space charge effects on an INTEL iPSC/860 machine. Experimental results show that a parallel efficiency of 75% can be obtained
Parallel Global Optimization with the Particle Swarm Algorithm (Preprint)
National Research Council Canada - National Science Library
Schutte, J. F; Reinbolt, J. A; Fregly, B. J; Haftka, R. T; George, A. D
2004-01-01
.... To obtain enhanced computational throughput and global search capability, we detail the coarse-grained parallelization of an increasingly popular global search method, the Particle Swarm Optimization (PSO) algorithm...
Kumar, Sameer
2010-06-15
Disclosed is a mechanism on receiving processors in a parallel computing system for providing order to data packets received from a broadcast call and to distinguish data packets received at nodes from several incoming asynchronous broadcast messages where header space is limited. In the present invention, processors at lower leafs of a tree do not need to obtain a broadcast message by directly accessing the data in a root processor's buffer. Instead, each subsequent intermediate node's rank id information is squeezed into the software header of packet headers. In turn, the entire broadcast message is not transferred from the root processor to each processor in a communicator but instead is replicated on several intermediate nodes which then replicate the message to nodes in lower leafs. Hence, the intermediate compute nodes become "virtual root compute nodes" for the purpose of replicating the broadcast message to lower levels of a tree.
Professional Parallel Programming with C#: Master Parallel Extensions with .NET 4
Hillar, Gastón
2010-01-01
Expert guidance for those programming today's dual-core processor PCs. As PC processors explode from one or two to now eight processors, there is an urgent need for programmers to master concurrent programming. This book dives deep into the latest technologies available to programmers for creating professional parallel applications using C#, .NET 4, and Visual Studio 2010. The book covers task-based programming, coordination data structures, PLINQ, thread pools, asynchronous programming model, and more. It also teaches other parallel programming techniques, such as SIMD and vectorization. Teach
Computational chaos in massively parallel neural networks
Barhen, Jacob; Gulati, Sandeep
1989-01-01
A fundamental issue which directly impacts the scalability of current theoretical neural network models to massively parallel embodiments, in both software and hardware, is the inherent and unavoidable concurrent asynchronicity of emerging fine-grained computational ensembles and the possible emergence of chaotic manifestations. Previous analyses attributed dynamical instability to the topology of the interconnection matrix, to parasitic components or to propagation delays. However, researchers have observed the existence of emergent computational chaos in a concurrently asynchronous framework, independent of the network topology. The researchers present a methodology enabling the effective asynchronous operation of large-scale neural networks. Necessary and sufficient conditions guaranteeing concurrent asynchronous convergence are established in terms of contracting operators. Lyapunov exponents are computed formally to characterize the underlying nonlinear dynamics. Simulation results are presented to illustrate network convergence to the correct results, even in the presence of large delays.
Parallel processing of Monte Carlo code MCNP for particle transport problem
Energy Technology Data Exchange (ETDEWEB)
Higuchi, Kenji; Kawasaki, Takuji
1996-06-01
It is possible to vectorize or parallelize Monte Carlo codes (MC code) for photon and neutron transport problems by exploiting the independence of the calculation for each particle. The applicability of existing MC codes to parallel processing is discussed. For the performance evaluation, we used both a vector-parallel processor and a scalar-parallel processor. We have carried out (i) vector-parallel processing of the MCNP code on the Monte Carlo machine Monte-4 with four vector processors, and (ii) parallel processing on a Paragon XP/S with 256 processors. In this report we describe the methodology and results for parallel processing on two types of parallel or distributed memory computers. In addition, we mention the evaluation of parallel programming environments for parallel computers used in the present work as a part of the work developing STA (Seamless Thinking Aid) Basic Software. (author)
Frog: Asynchronous Graph Processing on GPU with Hybrid Coloring Model
Energy Technology Data Exchange (ETDEWEB)
Shi, Xuanhua; Luo, Xuan; Liang, Junling; Zhao, Peng; Di, Sheng; He, Bingsheng; Jin, Hai
2018-01-01
GPUs have been increasingly used to accelerate graph processing for complicated computational problems regarding graph theory. Many parallel graph algorithms adopt the asynchronous computing model to accelerate the iterative convergence. Unfortunately, consistent asynchronous computing requires locking or atomic operations, leading to significant penalties/overheads when implemented on GPUs. As such, a coloring algorithm is adopted to separate the vertices with potential updating conflicts, guaranteeing the consistency/correctness of the parallel processing. Common coloring algorithms, however, may suffer from low parallelism because of the large number of colors generally required for processing a large-scale graph with billions of vertices. We propose a light-weight asynchronous processing framework called Frog with a preprocessing/hybrid coloring model. The fundamental idea is based on the Pareto principle (or 80-20 rule) about coloring algorithms, as we observed through masses of real-world graph coloring cases. We find that a majority of vertices (about 80%) are colored with only a few colors, such that they can be read and updated in a very high degree of parallelism without violating the sequential consistency. Accordingly, our solution separates the processing of the vertices based on the distribution of colors. In this work, we mainly answer three questions: (1) how to partition the vertices in a sparse graph with maximized parallelism, (2) how to process large-scale graphs that cannot fit into GPU memory, and (3) how to reduce the overhead of data transfers on PCIe while processing each partition. We conduct experiments on real-world data (Amazon, DBLP, YouTube, RoadNet-CA, WikiTalk and Twitter) to evaluate our approach and make comparisons with well-known non-preprocessed (such as Totem, Medusa, MapGraph and Gunrock) and preprocessed (Cusha) approaches, by testing four classical algorithms (BFS, PageRank, SSSP and CC). On all the tested applications and
Parallelization and scheduling of data intensive particle physics analysis jobs on clusters of PCs
Ponce, S
2004-01-01
Summary form only given. Scheduling policies are proposed for parallelizing data intensive particle physics analysis applications on computer clusters. Particle physics analysis jobs require the analysis of tens of thousands of particle collision events, each event requiring typically 200ms processing time and 600KB of data. Many jobs are launched concurrently by a large number of physicists. At a first view, particle physics jobs seem to be easy to parallelize, since particle collision events can be processed independently one from another. However, since large amounts of data need to be accessed, the real challenge resides in making an efficient use of the underlying computing resources. We propose several job parallelization and scheduling policies aiming at reducing job processing times and at increasing the sustainable load of a cluster server. Since particle collision events are usually reused by several jobs, cache based job splitting strategies considerably increase cluster utilization and reduce job ...
Particle orbit tracking on a parallel computer: Hypertrack
International Nuclear Information System (INIS)
Cole, B.; Bourianoff, G.; Pilat, F.; Talman, R.
1991-05-01
A program has been written which performs particle orbit tracking on the Intel iPSC/860 distributed memory parallel computer. The tracking is performed using a thin element approach. A brief description of the structure and performance of the code is presented, along with applications of the code to the analysis of accelerator lattices for the SSC. The concept of "ensemble tracking", i.e. the tracking of ensemble averages of noninteracting particles, such as the emittance, is presented. Preliminary results of such studies will be presented.
Load balancing in highly parallel processing of Monte Carlo code for particle transport
International Nuclear Information System (INIS)
Higuchi, Kenji; Takemiya, Hiroshi; Kawasaki, Takuji
1998-01-01
In parallel processing of Monte Carlo (MC) codes for neutron, photon and electron transport problems, particle histories are assigned to processors by exploiting the independence of the calculation for each particle. Although the main part of an MC code can easily be parallelized by this method, optimizing the code with respect to load balancing is necessary, and difficult in practice, in order to attain a high speedup ratio in highly parallel processing. In fact, the speedup ratio with 128 processors remains at roughly one hundred when using the test bed for the performance evaluation. Through the parallel processing of the MCNP code, which is widely used in the nuclear field, it is shown that it is difficult to attain high performance by static load balancing, especially in neutron transport problems, and that a load balancing method which dynamically changes the number of assigned particles, minimizing the sum of the computational and communication costs, overcomes the difficulty, resulting in a reduction of nearly fifteen percent in execution time. (author)
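The contrast between static and dynamic assignment of particle histories can be illustrated with a small timing model. This is a simplified stand-in, not the method of the report: it hands out fixed-size batches on demand instead of optimizing the batch size, and the per-particle costs, the batch size, the communication cost and the function names are all assumptions.

```python
import heapq

def static_makespan(n_particles, cost_per_particle):
    """Equal shares per processor; the slowest processor sets the finish time."""
    p = len(cost_per_particle)
    base, extra = divmod(n_particles, p)
    return max((base + (1 if k < extra else 0)) * c
               for k, c in enumerate(cost_per_particle))

def dynamic_makespan(n_particles, cost_per_particle, batch, comm_cost):
    """Hand out batches on demand; every hand-out pays a fixed communication cost."""
    free_at = [(0.0, k) for k in range(len(cost_per_particle))]
    heapq.heapify(free_at)
    remaining = n_particles
    finish = 0.0
    while remaining > 0:
        t, k = heapq.heappop(free_at)        # processor that becomes idle first
        n = min(batch, remaining)
        remaining -= n
        done = t + comm_cost + n * cost_per_particle[k]
        finish = max(finish, done)
        heapq.heappush(free_at, (done, k))
    return finish

# Three processors with cheap histories and one where each history costs 4x more.
costs = [1.0, 1.0, 1.0, 4.0]
static_t = static_makespan(1000, costs)
dynamic_t = dynamic_makespan(1000, costs, batch=10, comm_cost=0.5)
print(static_t, dynamic_t)
```

In this model smaller batches balance the load better but pay the communication cost more often, which is the trade-off between computational and communication costs that the abstract refers to.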
Concurrent particle-in-cell plasma simulation on a multi-transputer parallel computer
International Nuclear Information System (INIS)
Khare, A.N.; Jethra, A.; Patel, Kartik
1992-01-01
This report describes the parallelization of a Particle-in-Cell (PIC) plasma simulation code on a multi-transputer parallel computer. The algorithm used in the parallelization of the PIC method is described. The decomposition schemes related to the distribution of the particles among the processors are discussed. The implementation of the algorithm on a transputer network connected as a torus is presented. The solutions of the problems related to global communication of data are presented in the form of a set of generalized communication functions. The performance of the program as a function of data size and the number of transputers shows that the implementation is scalable and represents an effective way of achieving high performance at acceptable cost. (author). 11 refs., 4 figs., 2 tabs., appendices
Memory effect on energy losses of charged particles moving parallel to solid surface
International Nuclear Information System (INIS)
Kwei, C.M.; Tu, Y.H.; Hsu, Y.H.; Tung, C.J.
2006-01-01
Theoretical derivations were made for the induced potential and the stopping power of a charged particle moving close and parallel to the surface of a solid. It was illustrated that the induced potential produced by the interaction of particle and solid depended not only on the velocity but also on the previous velocity of the particle before its last inelastic interaction. In other words, the particle kept a memory of its previous velocity, v′, in determining the stopping power for the particle of velocity v. Based on the dielectric response theory, formulas were derived for the induced potential and the stopping power with memory effect. An extended Drude dielectric function with spatial dispersion was used in the application of these formulas for a proton moving parallel to a Si surface. It was found that the induced potential with memory effect lay between the induced potentials without memory effect for constant velocities v′ and v. The memory effect was manifest as the proton changed its velocity in the previous inelastic interaction. This memory effect also reduced the stopping power of the proton. The formulas derived in the present work can be applied to any solid surface and to a charged particle moving with an arbitrary parallel trajectory either inside or outside the solid
Parallel nanostructuring of GeSbTe film with particle mask
Energy Technology Data Exchange (ETDEWEB)
Wang, Z.B.; Hong, M.H.; Wang, Q.F.; Chong, T.C. [Data Storage Institute, DSI Building, 5 Engineering Drive 1, 117608, Singapore (Singapore); Department of Electrical and Computer Engineering, National University of Singapore, 119260, Singapore (Singapore)]; Luk'yanchuk, B.S.; Huang, S.M.; Shi, L.P. [Data Storage Institute, DSI Building, 5 Engineering Drive 1, 117608, Singapore (Singapore)]
2004-09-01
Parallel nanostructuring of a GeSbTe film may significantly improve the recording performance in data storage. In this paper, a method that permits direct and massively parallel nanopatterning of the substrate surface by laser irradiation is investigated. Polystyrene spherical particles were deposited on the surface in a monolayer array by self-assembly. The array was then irradiated with a 248-nm KrF laser. A sub-micron nanodent array can be obtained after single-pulse irradiation. These nanodents change their shapes at different laser energies. The optical near-field distribution around the particles was calculated according to the exact solution of the light-scattering problem. The influence of the presence of the substrate on the optical near field was also studied. The mechanisms for the generation of the nanodent structures are discussed. (orig.)
Many-particle hydrodynamic interactions in parallel-wall geometry: Cartesian-representation method
International Nuclear Information System (INIS)
Blawzdziewicz, J.; Wajnryb, E.; Bhattacharya, S.
2005-01-01
This talk will describe the results of our theoretical and numerical studies of hydrodynamic interactions in a suspension of spherical particles confined between two parallel planar walls, under creeping-flow conditions. We propose an efficient algorithm for evaluating the many-particle friction matrix in this system; no Stokesian-dynamics algorithm of this kind has been available so far. Our approach involves expanding the fluid velocity field in the wall-bounded suspension into spherical and Cartesian fundamental sets of Stokes flows. The spherical set is used to describe the interaction of the fluid with the particles and the Cartesian set to describe the interaction with the walls. At the core of our method are transformation relations between the spherical and Cartesian fundamental sets. Using the transformation formulas, we derive a system of linear equations for the force multipoles induced on the particle surfaces; the coefficients in these equations are given in terms of lateral Fourier integrals corresponding to the directions parallel to the walls. The force-multipole equations have been implemented in a numerical algorithm for the evaluation of the multiparticle friction matrix in the wall-bounded system. The algorithm involves subtraction of the particle-wall and particle-particle lubrication contributions to accelerate the convergence of the results with the spherical-harmonics order, and a subtraction of the single-wall contributions to accelerate the convergence of the Fourier integrals. (author)
Plasma and energetic particle structure of a collisionless quasi-parallel shock
Kennel, C. F.; Scarf, F. L.; Coroniti, F. V.; Russell, C. T.; Smith, E. J.; Wenzel, K. P.; Reinhard, R.; Sanderson, T. R.; Feldman, W. C.; Parks, G. K.
1983-01-01
The quasi-parallel interplanetary shock of November 11-12, 1978 was studied from both the collisionless shock and energetic particle points of view, using measurements of the interplanetary magnetic and electric fields, solar wind electrons, plasma and MHD waves, and intermediate and high energy ions obtained on ISEE-1, -2, and -3. The interplanetary environment through which the shock was propagating when it encountered the three spacecraft was characterized; the observations of this shock are documented and current theories of quasi-parallel shock structure and particle acceleration are tested. These observations tend to confirm present self-consistent theories of first-order Fermi acceleration by shocks and of collisionless shock dissipation involving the firehose instability.
International Nuclear Information System (INIS)
Zhang, B.; Li, G.; Wang, W.; Shangguan, D.; Deng, L.
2015-01-01
This paper introduces the strategy of multilevel hybrid parallelism of the JCOGIN infrastructure for Monte Carlo particle transport in large-scale full-core pin-by-pin simulations. Particle parallelism, domain decomposition parallelism and MPI/OpenMP parallelism are designed and implemented. In testing, JMCT demonstrates the parallel scalability of JCOGIN, which reaches a parallel efficiency of 80% on 120,000 cores for the pin-by-pin computation of the BEAVRS benchmark. (author)
Parallel and Cooperative Particle Swarm Optimizer for Multimodal Problems
Directory of Open Access Journals (Sweden)
Geng Zhang
2015-01-01
Although the original particle swarm optimizer (PSO) method and its variants show some effectiveness in solving optimization problems, they may easily get trapped in a local optimum, especially when solving complex multimodal problems. To address this issue, this paper puts forward a novel method called the parallel and cooperative particle swarm optimizer (PCPSO). When the elements of the D-dimensional function vector X=[x1,x2,…,xd,…,xD] do not interact, the cooperative particle swarm optimizer (CPSO) is used. Based on this, the PCPSO is presented to solve real problems. Since in real problems the dimensions cannot be split into several lower-dimensional search spaces because the elements interact, PCPSO exploits the cooperation of two parallel CPSO algorithms through orthogonal experimental design (OED) learning. First, the CPSO algorithm is used to generate two locally optimal vectors separately; then OED is used to learn the merits of these two vectors and create a better combination of them to guide further search. Experimental studies on a set of test functions show that PCPSO exhibits better robustness and converges much closer to the global optimum than several other peer algorithms.
Parallelization of MRCI based on hole-particle symmetry.
Suo, Bing; Zhai, Gaohong; Wang, Yubin; Wen, Zhenyi; Hu, Xiangqian; Li, Lemin
2005-01-15
The parallel implementation of a multireference configuration interaction program based on the hole-particle symmetry is described. The platform used to implement the parallelization is an Intel-architecture cluster consisting of 12 nodes, each of which is equipped with two 2.4-GHz XEON processors, 3 GB of memory, and a 36-GB disk, connected by a Gigabit Ethernet switch. The dependence of speedup on molecular symmetries and task granularities is discussed. Test calculations show that the scaling with the number of nodes is about 1.9 (for C1 and Cs), 1.65 (for C2v), and 1.55 (for D2h) when the number of nodes is doubled. The largest calculation performed on this cluster involves 5.6 × 10^8 CSFs.
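As a worked reading of the per-doubling scaling figures above, the short sketch below compounds the reported factor over three doublings (1 to 8 nodes) and converts it to a parallel efficiency. It assumes the factor stays constant at every doubling, which the abstract does not state; the function name is hypothetical.

```python
def speedup_after_doublings(factor, doublings):
    """Speedup after repeatedly doubling the node count, assuming the
    same gain `factor` at every doubling (an assumption, not a result)."""
    return factor ** doublings


if __name__ == "__main__":
    doublings = 3  # 1 -> 2 -> 4 -> 8 nodes
    nodes = 2 ** doublings
    for label, factor in [("C1/Cs", 1.9), ("C2v", 1.65), ("D2h", 1.55)]:
        s = speedup_after_doublings(factor, doublings)
        print(f"{label}: speedup {s:.2f} on {nodes} nodes, efficiency {s / nodes:.0%}")
```

Under this assumption, the gap between the low-symmetry and high-symmetry cases widens quickly as nodes are added.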
Asynchronous schemes for CFD at extreme scales
Konduri, Aditya; Donzis, Diego
2013-11-01
Recent advances in computing hardware and software have made simulations an indispensable research tool in understanding fluid flow phenomena in complex conditions in great detail. Due to the nonlinear nature of the governing NS equations, simulations of high Re turbulent flows are computationally very expensive and demand extreme levels of parallelism. Current large simulations are being done on hundreds of thousands of processing elements (PEs). Benchmarks from these simulations show that communication between PEs takes a substantial amount of time, overwhelming the compute time and resulting in a substantial waste of compute cycles as PEs remain idle. We investigate a novel approach based on widely used finite-difference schemes in which computations are carried out asynchronously, i.e. synchronization of data among PEs is not enforced and computations proceed regardless of the status of messages. This drastically reduces PE idle time and results in much larger computation rates. We show that while these schemes remain stable, their accuracy is significantly affected. We present new schemes that maintain accuracy under asynchronous conditions and provide a viable path towards exascale computing. Performance of these schemes will be shown for simple models like Burgers' equation.
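The effect described above can be imitated in a serial toy model. The sketch below is not the authors' scheme: it uses the 1D heat equation with a standard explicit update on a periodic grid, splits the grid into a few notional PEs, and lets points next to a PE boundary read their cross-boundary neighbour from a fixed number of steps in the past. The fixed `delay`, the grid size, and all function names are assumptions made for illustration.

```python
import math


def step(history, r, boundaries, delay):
    """One explicit heat-equation step; cross-boundary reads use stale data.

    history:    list of past solutions, newest last
    boundaries: indices i such that a PE boundary lies between i and i + 1
    delay:      how many steps old the cross-boundary values are
    """
    u = history[-1]
    stale = history[max(0, len(history) - 1 - delay)]
    n = len(u)
    new = [0.0] * n
    for i in range(n):
        left, right = (i - 1) % n, (i + 1) % n
        u_left = stale[left] if left in boundaries else u[left]
        u_right = stale[right] if i in boundaries else u[right]
        new[i] = u[i] + r * (u_left - 2.0 * u[i] + u_right)
    return new


def run(delay, n=32, n_pe=4, r=0.25, steps=200):
    """Integrate a sine wave on n points split evenly among n_pe notional PEs."""
    chunk = n // n_pe
    boundaries = {k * chunk - 1 for k in range(1, n_pe + 1)}
    history = [[math.sin(2.0 * math.pi * i / n) for i in range(n)]]
    for _ in range(steps):
        history.append(step(history, r, boundaries, delay))
    return history[-1]


if __name__ == "__main__":
    sync = run(delay=0)    # fully synchronized reference
    lagged = run(delay=2)  # boundary data arrives two steps late
    print("max deviation from synchronous run:",
          max(abs(a - b) for a, b in zip(sync, lagged)))
    print("max amplitude of lagged run:", max(abs(v) for v in lagged))
```

With r = 0.25 each update is a weighted average of current and past values, so the lagged run cannot grow beyond the initial amplitude, yet it no longer matches the synchronous solution. This mirrors the abstract's observation that stability survives while accuracy suffers.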
AP-IO: asynchronous pipeline I/O for hiding periodic output cost in CFD simulation.
Xiaoguang, Ren; Xinhai, Xu
2014-01-01
Computational fluid dynamics (CFD) simulation often needs to periodically output intermediate results to files in the form of snapshots for visualization or restart, which seriously impacts performance. In this paper, we present an asynchronous pipeline I/O (AP-IO) optimization scheme for periodic snapshot output, based on asynchronous I/O and the characteristics of CFD applications. In AP-IO, dedicated background I/O processes or threads are in charge of handling the file writes in pipeline mode, so the write overhead can be hidden behind more computation than with classic asynchronous I/O. We design the framework of AP-IO and implement it in OpenFOAM, providing CFD users with a user-friendly interface. Experimental results on the Tianhe-2 supercomputer demonstrate that AP-IO achieves a good optimization effect for periodic snapshot output in CFD applications, and the effect is especially pronounced for massively parallel CFD simulations, where it can reduce the total execution time by up to about 40%.
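The general pattern of handing snapshot writes to a background thread can be sketched as follows. This is a generic producer/consumer illustration using Python's standard library, not AP-IO itself and not OpenFOAM code; the solver update, file naming, and function names are placeholders chosen for the example.

```python
import os
import queue
import tempfile
import threading


def writer(q, out_dir):
    """Background I/O thread: write queued snapshots until a None sentinel."""
    while True:
        item = q.get()
        if item is None:
            break
        step, data = item
        path = os.path.join(out_dir, f"snap_{step:04d}.txt")
        with open(path, "w") as f:
            f.write("\n".join(str(v) for v in data))


def simulate(n_steps, every, out_dir):
    """Run a stand-in solver loop, handing every `every`-th state to the writer."""
    q = queue.Queue(maxsize=2)  # bounded, so snapshots cannot pile up in memory
    t = threading.Thread(target=writer, args=(q, out_dir))
    t.start()
    state = [0.0] * 8
    for step in range(1, n_steps + 1):
        state = [x + 1.0 for x in state]  # placeholder for the real solver step
        if step % every == 0:
            q.put((step, list(state)))    # copy, so the solver can keep going
    q.put(None)  # tell the writer to finish
    t.join()
    return sorted(os.listdir(out_dir))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print(simulate(10, 5, d))
```

The solver only pays for copying the state and enqueueing it; the file write overlaps with the following time steps unless the bounded queue is full.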
AP-IO: Asynchronous Pipeline I/O for Hiding Periodic Output Cost in CFD Simulation
Directory of Open Access Journals (Sweden)
Ren Xiaoguang
2014-01-01
Computational fluid dynamics (CFD) simulation often needs to periodically output intermediate results to files in the form of snapshots for visualization or restart, which seriously impacts performance. In this paper, we present an asynchronous pipeline I/O (AP-IO) optimization scheme for periodic snapshot output, based on asynchronous I/O and the characteristics of CFD applications. In AP-IO, dedicated background I/O processes or threads are in charge of handling the file writes in pipeline mode, so the write overhead can be hidden behind more computation than with classic asynchronous I/O. We design the framework of AP-IO and implement it in OpenFOAM, providing CFD users with a user-friendly interface. Experimental results on the Tianhe-2 supercomputer demonstrate that AP-IO achieves a good optimization effect for periodic snapshot output in CFD applications, and the effect is especially pronounced for massively parallel CFD simulations, where it can reduce the total execution time by up to about 40%.
CERN. Geneva
2016-01-01
The traditionally used and well established parallel programming models OpenMP and MPI are both targeting lower level parallelism and are meant to be as language agnostic as possible. For a long time, those models were the only widely available portable options for developing parallel C++ applications beyond using plain threads. This has strongly limited the optimization capabilities of compilers, has inhibited extensibility and genericity, and has restricted the use of those models together with other, modern higher level abstractions introduced by the C++11 and C++14 standards. The recent revival of interest in the industry and wider community for the C++ language has also spurred a remarkable amount of standardization proposals and technical specifications being developed. Those efforts however have so far failed to build a vision on how to seamlessly integrate various types of parallelism, such as iterative parallel execution, task-based parallelism, asynchronous many-task execution flows, continuation s...
Evaluating the performance of the particle finite element method in parallel architectures
Gimenez, Juan M.; Nigro, Norberto M.; Idelsohn, Sergio R.
2014-05-01
This paper presents a high performance implementation of the particle-mesh based method called particle finite element method two (PFEM-2). It consists of a material derivative based formulation of the equations with a hybrid spatial discretization which uses an Eulerian mesh and Lagrangian particles. The main aim of PFEM-2 is to solve transport equations as fast as possible while keeping some level of accuracy. The method was found to be competitive with classical Eulerian alternatives for these targets, even in their range of optimal application. To evaluate the method on large simulations, it is imperative to use parallel environments. Parallel strategies for the Finite Element Method have been widely studied and many libraries can be used to solve the Eulerian stages of PFEM-2. However, Lagrangian stages, such as streamline integration, must be developed considering the parallel strategy selected. The main drawback of PFEM-2 is the large amount of memory needed, which limits its application to large problems on a single computer. Therefore, a distributed-memory implementation is urgently needed. Unlike in a shared-memory approach, with domain decomposition the memory is automatically isolated, thus avoiding race conditions; however, new issues appear due to data distribution over the processes. Thus, a domain decomposition strategy for both particles and mesh is adopted, which minimizes the communication between processes. Finally, performance analyses on multicore and multinode architectures are presented. The Courant-Friedrichs-Lewy number used influences the efficiency of the parallelization and, in some cases, a weighted partitioning can be used to improve the speed-up. However, the total CPU time for the cases presented is lower than that obtained when using classical Eulerian strategies.
The STAPL Parallel Graph Library
Harshvardhan,
2013-01-01
This paper describes the stapl Parallel Graph Library, a high-level framework that abstracts the user from data-distribution and parallelism details and allows them to concentrate on parallel graph algorithm development. It includes a customizable distributed graph container and a collection of commonly used parallel graph algorithms. The library introduces pGraph pViews that separate algorithm design from the container implementation. It supports three graph processing algorithmic paradigms, level-synchronous, asynchronous and coarse-grained, and provides common graph algorithms based on them. Experimental results demonstrate improved scalability in performance and data size over existing graph libraries on more than 16,000 cores and on internet-scale graphs containing over 16 billion vertices and 250 billion edges. © Springer-Verlag Berlin Heidelberg 2013.
Where are the parallel algorithms?
Voigt, R. G.
1985-01-01
Four paradigms that can be useful in developing parallel algorithms are discussed. These include computational complexity analysis, changing the order of computation, asynchronous computation, and divide and conquer. Each is illustrated with an example from scientific computation, and it is shown that computational complexity must be used with great care or an inefficient algorithm may be selected.
Vanaverbeke, S.; Keppens, R.; Poedts, S.; Boffin, H.
2009-01-01
We describe the algorithms implemented in the first version of GRADSPH, a parallel, tree-based, smoothed particle hydrodynamics code for simulating self-gravitating astrophysical systems written in FORTRAN 90. The paper presents details on the implementation of the Smoothed Particle Hydro (SPH)
Asynchronous LMS adaptive equalization
Bergmans, J.W.M.; Lin, M.Y.; Modrie, D.; Otte, R.
2005-01-01
Digital data receivers often operate at a fixed sampling rate 1/Ts that is asynchronous to the baud rate 1/T. A digital equalizer that processes the incoming signal will also operate in the asynchronous clock domain. Existing adaptation techniques for this equalizer involve an error sequence ek that
Asynchronous design of Networks-on-Chip
DEFF Research Database (Denmark)
Sparsø, Jens
2007-01-01
-synchronous, mesochronous, globally-asynchronous locally-synchronous and fully asynchronous), discusses the circuitry needed to implement these timing methodologies, and provides some implementation details for a couple of asynchronous NoCs designed at the Technical University of Denmark (DTU). The paper is written...... to support an invited talk at the NORCHIP’2007 conference....
A parallel 3D particle-in-cell code with dynamic load balancing
International Nuclear Information System (INIS)
Wolfheimer, Felix; Gjonaj, Erion; Weiland, Thomas
2006-01-01
A parallel 3D electrostatic Particle-In-Cell (PIC) code including an algorithm for modelling Space Charge Limited (SCL) emission [E. Gjonaj, T. Weiland, 3D-modeling of space-charge-limited electron emission. A charge conserving algorithm, Proceedings of the 11th Biennial IEEE Conference on Electromagnetic Field Computation, 2004] is presented. A domain decomposition technique based on orthogonal recursive bisection is used to parallelize the computation on a distributed memory environment of clustered workstations. For problems with a highly nonuniform and time dependent distribution of particles, e.g., bunch dynamics, a dynamic load balancing between the processes is needed to preserve the parallel performance. The algorithm for the detection of a load imbalance and the redistribution of the tasks among the processes is based on a weight function criterion, where the weight of a cell measures the computational load associated with it. The algorithm is studied with two examples. In the first example, multiple electron bunches as occurring in the S-DALINAC [A. Richter, Operational experience at the S-DALINAC, Proceedings of the Fifth European Particle Accelerator Conference, 1996] accelerator are simulated in the absence of space charge fields. In the second example, the SCL emission and electron trajectories in an electron gun are simulated
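The weighted orthogonal recursive bisection mentioned above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' code: each cell is a tuple `(x, y, z, weight)`, the weight is taken to be something like the particle count in the cell, the number of parts is a power of two, and the function name `orb` is hypothetical.

```python
def orb(cells, n_parts):
    """Split cells into n_parts groups of similar total weight.

    cells:   list of (x, y, z, weight) tuples
    n_parts: number of groups, assumed to be a power of two
    """
    if n_parts == 1 or len(cells) <= 1:
        # Nothing left to split; pad with empty groups if necessary.
        return [cells] + [[] for _ in range(n_parts - 1)]
    # Cut perpendicular to the axis along which the cells extend furthest.
    axis = max(range(3),
               key=lambda a: max(c[a] for c in cells) - min(c[a] for c in cells))
    ordered = sorted(cells, key=lambda c: c[axis])
    half = sum(c[3] for c in ordered) / 2.0
    acc, cut = 0.0, 0
    # Advance the cut plane until the left side holds about half the weight.
    while cut < len(ordered) - 1 and acc + ordered[cut][3] <= half:
        acc += ordered[cut][3]
        cut += 1
    cut = max(cut, 1)  # keep at least one cell on the left side
    return orb(ordered[:cut], n_parts // 2) + orb(ordered[cut:], n_parts // 2)


if __name__ == "__main__":
    # Eight cells in a row; the first one holds a dense particle bunch.
    weights = [5, 1, 1, 1, 1, 1, 1, 1]
    cells = [(float(i), 0.0, 0.0, w) for i, w in enumerate(weights)]
    for part in orb(cells, 2):
        print([c[0] for c in part], "load:", sum(c[3] for c in part))
```

For a time-dependent particle distribution, the same routine would be rerun with updated weights whenever the measured imbalance exceeds some threshold; the detection criterion itself is not reproduced here.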
A parallel 3D particle-in-cell code with dynamic load balancing
Energy Technology Data Exchange (ETDEWEB)
Wolfheimer, Felix [Technische Universitaet Darmstadt, Institut fuer Theorie Elektromagnetischer Felder, Schlossgartenstr.8, 64283 Darmstadt (Germany)]. E-mail: wolfheimer@temf.de; Gjonaj, Erion [Technische Universitaet Darmstadt, Institut fuer Theorie Elektromagnetischer Felder, Schlossgartenstr.8, 64283 Darmstadt (Germany); Weiland, Thomas [Technische Universitaet Darmstadt, Institut fuer Theorie Elektromagnetischer Felder, Schlossgartenstr.8, 64283 Darmstadt (Germany)
2006-03-01
A parallel 3D electrostatic Particle-In-Cell (PIC) code including an algorithm for modelling Space Charge Limited (SCL) emission [E. Gjonaj, T. Weiland, 3D-modeling of space-charge-limited electron emission. A charge conserving algorithm, Proceedings of the 11th Biennial IEEE Conference on Electromagnetic Field Computation, 2004] is presented. A domain decomposition technique based on orthogonal recursive bisection is used to parallelize the computation on a distributed memory environment of clustered workstations. For problems with a highly nonuniform and time dependent distribution of particles, e.g., bunch dynamics, a dynamic load balancing between the processes is needed to preserve the parallel performance. The algorithm for the detection of a load imbalance and the redistribution of the tasks among the processes is based on a weight function criterion, where the weight of a cell measures the computational load associated with it. The algorithm is studied with two examples. In the first example, multiple electron bunches as occurring in the S-DALINAC [A. Richter, Operational experience at the S-DALINAC, Proceedings of the Fifth European Particle Accelerator Conference, 1996] accelerator are simulated in the absence of space charge fields. In the second example, the SCL emission and electron trajectories in an electron gun are simulated.
Behavioral synthesis of asynchronous circuits
DEFF Research Database (Denmark)
Nielsen, Sune Fallgaard
2005-01-01
This thesis presents a method for behavioral synthesis of asynchronous circuits, which aims at providing a synthesis flow which uses and transfers methods from synchronous circuits to asynchronous circuits. We move the synchronous behavioral synthesis abstraction into the asynchronous handshake...... is idle. This reduces unnecessary switching activity in the individual functional units and therefore the energy consumption of the entire circuit. A collection of behavioral synthesis algorithms have been developed allowing the designer to perform time and power constrained design space exploration...
International Nuclear Information System (INIS)
Colavita, A.; Capello, G.
1997-01-01
In this paper we present a novel parallel sorting algorithm, which works through a cascade of elementary sorting units and leads to a scalable architecture. The algorithm's complexity is analyzed and compared with a classical parallel algorithm. It turns out that, although it may be less efficient than classical approaches, the proposed algorithm is highly suited for VLSI implementation because of its simplicity and scalability. The paper describes the application of such a device to the asynchronous data acquisition for a gamma-ray telescope. (orig.)
Parallel and Distributed Systems for Probabilistic Reasoning
2012-12-01
Ranganathan et al. ... typically a random permutation over the vertices. Advances by Elidan et al. [2006] and Ranganathan et al. [2007] have focused on dynamic asynchronous ... Wildfire algorithm shown in Alg. 3.6 is a direct parallelization of the algorithm proposed by [Ranganathan et al., 2007]. The Wildfire algorithm
Pro asynchronous programming with .NET
Blewett, Richard; Ltd, Rock Solid Knowledge
2014-01-01
Pro Asynchronous Programming with .NET teaches the essential skill of asynchronous programming in .NET. It answers critical questions in .NET application development, such as: how do I keep my program responding at all times to keep my users happy; how do I make the most of the available hardware; how can I improve performance. In the modern world, users expect more and more from their applications and devices, and multi-core hardware has the potential to provide it. But it takes carefully crafted code to turn that potential into responsive, scalable applications. With Pro Asynchronous Programming
Asynchronized synchronous machines
Botvinnik, M M
1964-01-01
Asynchronized Synchronous Machines focuses on the theoretical research on asynchronized synchronous (AS) machines, which are hybrids of synchronous and induction machines that can operate with slip. Topics covered in this book include the initial equations; vector diagram of an AS machine; regulation in cases of deviation from the law of full compensation; parameters of the excitation system; and schematic diagram of an excitation regulator. The possible applications of AS machines and its calculations in certain cases are also discussed. This publication is beneficial for students and indiv
Asynchronous zero-forcing adaptive equalization
Bergmans, J.W.M.; Pozidis, H.; Lin, M.Y.
2005-01-01
Digital data receivers often operate at a fixed sampling rate 1/Ts that is asynchronous to the baud rate 1/T. A digital equalizer that processes the incoming signal will also be asynchronous, and its adaptation is commonly based on extensions of the LMS algorithm. In this paper, we develop and
A Parallel Adaptive Particle Swarm Optimization Algorithm for Economic/Environmental Power Dispatch
Directory of Open Access Journals (Sweden)
Jinchao Li
2012-01-01
A parallel adaptive particle swarm optimization algorithm (PAPSO) is proposed for economic/environmental power dispatch. It can overcome premature convergence, slow convergence in the late evolutionary phase, and the lack of good direction in the particles' evolutionary process. A search population is randomly divided into several subpopulations. Then, for each subpopulation, the optimal solution is searched synchronously using the proposed method, and thus parallel computing is realized. To avoid converging to a local optimum, a crossover operator is introduced to exchange information among the subpopulations, and the diversity of the population is sustained at the same time. Simulation results show that the proposed algorithm can effectively solve the economic/environmental operation problem of hydropower generating units. Performance comparisons show that the solution from the proposed method is better than those from the conventional particle swarm algorithm and other optimization algorithms.
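The subpopulation-plus-crossover structure described above can be sketched as follows. This is only a structural illustration, not the PAPSO of the paper: the adaptive parameter control is omitted, the sub-swarms run in a serial loop rather than on separate processors, the crossover rule (uniform mixing of two sub-swarm bests every tenth iteration) and the coefficient values are assumptions, and the sphere function stands in for the dispatch objective.

```python
import random


def sphere(x):
    """Simple test objective with its minimum of 0 at the origin."""
    return sum(v * v for v in x)


def papso_sketch(f, dim=4, n_sub=3, sub_size=6, iters=60, seed=0):
    """Return (initial best, final best) objective values over all sub-swarms."""
    rng = random.Random(seed)
    w, c1, c2 = 0.7, 1.5, 1.5  # assumed textbook-style PSO coefficients
    swarms = []
    for _ in range(n_sub):
        pos = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(sub_size)]
        swarms.append({"pos": pos,
                       "vel": [[0.0] * dim for _ in range(sub_size)],
                       "pbest": [p[:] for p in pos],
                       "gbest": min(pos, key=f)[:]})
    start = min(f(s["gbest"]) for s in swarms)
    for it in range(iters):
        for s in swarms:  # each sub-swarm could run on its own processor
            for i in range(sub_size):
                for d in range(dim):
                    s["vel"][i][d] = (w * s["vel"][i][d]
                                      + c1 * rng.random() * (s["pbest"][i][d] - s["pos"][i][d])
                                      + c2 * rng.random() * (s["gbest"][d] - s["pos"][i][d]))
                    s["pos"][i][d] += s["vel"][i][d]
                if f(s["pos"][i]) < f(s["pbest"][i]):
                    s["pbest"][i] = s["pos"][i][:]
                    if f(s["pbest"][i]) < f(s["gbest"]):
                        s["gbest"] = s["pbest"][i][:]
        if it % 10 == 9:
            # Crossover between two sub-swarms: mix their best positions
            # coordinate by coordinate and keep the child only if it improves.
            a, b = rng.sample(range(n_sub), 2)
            child = [rng.choice(pair)
                     for pair in zip(swarms[a]["gbest"], swarms[b]["gbest"])]
            for k in (a, b):
                if f(child) < f(swarms[k]["gbest"]):
                    swarms[k]["gbest"] = child[:]
    return start, min(f(s["gbest"]) for s in swarms)


if __name__ == "__main__":
    before, after = papso_sketch(sphere)
    print("best objective before:", before, "after:", after)
```

Because each sub-swarm only replaces its best position when the objective improves, the final value can never be worse than the initial one; how much it improves depends on the objective and the random seed.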
Asynchronous Checkpoint Migration with MRNet in the Scalable Checkpoint / Restart Library
Energy Technology Data Exchange (ETDEWEB)
Mohror, K; Moody, A; de Supinski, B R
2012-03-20
Applications running on today's supercomputers tolerate failures by periodically saving their state in checkpoint files on stable storage, such as a parallel file system. Although this approach is simple, the overhead of writing the checkpoints can be prohibitive, especially for large-scale jobs. In this paper, we present initial results of an enhancement to our Scalable Checkpoint/Restart Library (SCR). We employ MRNet, a tree-based overlay network library, to transfer checkpoints from the compute nodes to the parallel file system asynchronously. This enhancement increases application efficiency by removing the need for an application to block while checkpoints are transferred to the parallel file system. We show that the integration of SCR with MRNet can reduce the time spent in I/O operations by as much as 15x. However, our experiments exposed new scalability issues with our initial implementation. We discuss the sources of the scalability problems and our plans to address them.
H5Part A Portable High Performance Parallel Data Interface for Particle Simulations
Adelmann, Andreas; Shalf, John M; Siegerist, Cristina
2005-01-01
The largest parallel particle simulations in six-dimensional phase space generate vast amounts of data. It is also desirable to share data and data analysis tools such as ParViT (Particle Visualization Toolkit) among other groups who are working on particle-based accelerator simulations. We define a very simple file schema built on top of HDF5 (Hierarchical Data Format version 5) as well as an API that simplifies the reading/writing of the data to the HDF5 file format. HDF5 offers a self-describing machine-independent binary file format that supports scalable parallel I/O performance for MPI codes on a variety of supercomputing systems and works equally well on laptop computers. The API is available for C, C++, and Fortran codes. The file format will enable disparate research groups with very different simulation implementations to share data transparently and share data analysis tools. For instance, the common file format will enable groups that depend on completely different simulation implementations to share c...
Energetic particle parallel diffusion in a cascading wave turbulence in the foreshock region
Directory of Open Access Journals (Sweden)
F. Otsuka
2007-09-01
We study parallel (field-aligned) diffusion of energetic particles upstream of the bow shock with test particle simulations. We assume a parallel shock geometry for the bow shock, and that the MHD wave turbulence convected by the solar wind toward the shock is purely transverse in a one-dimensional system with a constant background magnetic field. We use three turbulence models: a homogeneous turbulence, a regular cascade from a large scale to smaller scales, and an inverse cascade from a small scale to larger scales. For the homogeneous model the particle motions along the average field are Brownian motions due to random and isotropic scattering across 90 degree pitch angle. On the other hand, for the two cascade models particle motion is non-Brownian due to coherent and anisotropic pitch angle scattering over a finite time scale. The mean free path λ_{||} calculated by the ensemble average of these particle motions exhibits dependence on the distance from the shock. It also depends on parameters such as the thermal velocity of the particles, the solar wind flow velocity, and the wave turbulence model. For the inverse cascade model, the dependence of λ_{||} at the shock on the thermal energy is consistent with the hybrid simulation done by Giacalone (2004), but the spatial dependence of λ_{||} is inconsistent with it.
International Nuclear Information System (INIS)
Apisit, Patchimpattapong; Alireza, Haghighat; Shedlock, D.
2003-01-01
An expert system for generating an effective mesh distribution for the SN particle transport simulation has been developed. This expert system consists of two main parts: 1) an algorithm for generating an effective mesh distribution in a serial environment, and 2) an algorithm for inference of an effective domain decomposition strategy for parallel computing. For the first part, the algorithm prepares an effective mesh distribution considering problem physics and the spatial differencing scheme. For the second part, the algorithm determines a parallel-performance-index (PPI), which is defined as the ratio of the granularity to the degree-of-coupling. The parallel-performance-index provides expected performance of an algorithm depending on computing environment and resources. A large index indicates a high granularity algorithm with relatively low coupling among processors. This expert system has been successfully tested within the PENTRAN (Parallel Environment Neutral-Particle Transport) code system for simulating real-life shielding problems. (authors)
Energy Technology Data Exchange (ETDEWEB)
Apisit, Patchimpattapong [Electricity Generating Authority of Thailand, Office of Corporate Planning, Bangkruai, Nonthaburi (Thailand); Alireza, Haghighat; Shedlock, D. [Florida Univ., Department of Nuclear and Radiological Engineering, Gainesville, FL (United States)
2003-07-01
An expert system for generating an effective mesh distribution for the SN particle transport simulation has been developed. This expert system consists of two main parts: 1) an algorithm for generating an effective mesh distribution in a serial environment, and 2) an algorithm for inference of an effective domain decomposition strategy for parallel computing. For the first part, the algorithm prepares an effective mesh distribution considering problem physics and the spatial differencing scheme. For the second part, the algorithm determines a parallel-performance-index (PPI), which is defined as the ratio of the granularity to the degree-of-coupling. The parallel-performance-index provides expected performance of an algorithm depending on computing environment and resources. A large index indicates a high granularity algorithm with relatively low coupling among processors. This expert system has been successfully tested within the PENTRAN (Parallel Environment Neutral-Particle Transport) code system for simulating real-life shielding problems. (authors)
The Aeolian Asynchronous Generator
Directory of Open Access Journals (Sweden)
Ionel Dragomirescu
2008-10-01
Electric energy can be produced at lower cost with the help of aeolian (wind) power plants. Squirrel-cage asynchronous generators can be used in these plants because these machines are the most reliable in operation and easy to operate. This work analyzes the operation of the asynchronous generator under a driving torque that depends on the square of the wind speed, the air density, and the construction of the wing spiral.
Exploring Asynchronous Many-Task Runtime Systems toward Extreme Scales
Energy Technology Data Exchange (ETDEWEB)
Knight, Samuel [O8953; Baker, Gavin Matthew; Gamell, Marc [Rutgers U; Hollman, David [08953; Sjaardema, Gregor [SNL; Kolla, Hemanth [SNL; Teranishi, Keita; Wilke, Jeremiah J; Slattengren, Nicole [SNL; Bennett, Janine Camille
2015-10-01
Major exascale computing reports indicate a number of software challenges in meeting the dramatic change of system architectures in the near future. While a several-orders-of-magnitude increase in parallelism is the most commonly cited of those, hurdles also include performance heterogeneity of compute nodes across the system, increased imbalance between computational capacity and I/O capabilities, frequent system interrupts, and complex hardware architectures. Asynchronous task-parallel programming models show great promise in addressing these issues, but are not yet fully understood nor developed sufficiently for computational science and engineering application codes. We address these knowledge gaps through quantitative and qualitative exploration of leading candidate solutions in the context of engineering applications at Sandia. In this poster, we evaluate the MiniAero code ported to three leading candidate programming models (Charm++, Legion and UINTAH) to examine the feasibility of these models that permit insertion of new programming model elements into an existing code base.
Burst-Mode Asynchronous Controllers on FPGA
Directory of Open Access Journals (Sweden)
Duarte L. Oliveira
2008-01-01
FPGAs have been mainly used to design synchronous circuits. Asynchronous design on FPGAs is difficult because the resulting circuit may suffer from hazard problems. We propose a method that implements a popular class of asynchronous circuits, known as burst mode, on FPGAs based on look-up table (LUT) architectures. We present two conditions that, if satisfied, guarantee essential-hazard-free implementation on any LUT-based FPGA. In this way, besides all the intrinsic advantages of asynchronous over synchronous circuits, such designs also take advantage of the shorter design time and lower cost associated with FPGA designs.
Asynchronous beating of cilia enhances particle capture rate
Ding, Yang; Kanso, Eva
2014-11-01
Many aquatic micro-organisms use beating cilia to generate feeding currents and capture particles in surrounding fluids. One of the capture strategies is to "catch up" with particles when a cilium is beating towards the overall flow direction (effective stroke) and intercept particles on the downstream side of the cilium. Here, we developed a 3D computational model of a cilia band with prescribed motion in a viscous fluid and calculated the trajectories of particles of different sizes in the fluid. We found an optimal particle diameter that maximizes the capture rate. The flow field and particle motion indicate that the low capture rate of smaller particles is due to the laminar flow in the neighborhood of the cilia, whereas larger particles have to move above the cilia tips to get advected downstream, which decreases their capture rate. We then analyzed the effect of beating coordination between neighboring cilia on the capture rate. Interestingly, we found that asynchrony of the beating of the cilia can enhance the relative motion between a cilium and the particles near it and hence increase the capture rate.
Parallel particle swarm optimization algorithm in nuclear problems
International Nuclear Information System (INIS)
Waintraub, Marcel; Pereira, Claudio M.N.A.; Schirru, Roberto
2009-01-01
Particle Swarm Optimization (PSO) is a population-based metaheuristic (PBM) in which solution candidates evolve through simulation of a simplified social adaptation model. Combining robustness, efficiency and simplicity, PSO has gained great popularity. Many successful applications of PSO have been reported, in which PSO showed advantages over other well-established PBMs. However, computational cost is still a major constraint for PSO, as for all other PBMs, especially in optimization problems with time-consuming objective functions. To overcome this difficulty, parallel computation has been used. The most obvious advantage of parallel PSO (PPSO) is the reduction of computational time. Master-slave approaches, which exploit this characteristic, are the most investigated. However, much more can be expected. It is known that PSO may be improved by more elaborate neighborhood topologies. Hence, in this work, we develop several different PPSO algorithms that exploit the advantages of enhanced neighborhood topologies implemented by communication strategies in multiprocessor architectures. The proposed PPSOs have been applied to two complex and time-consuming nuclear engineering problems: reactor core design and fuel reload optimization. After exhaustive experiments, it has been concluded that PPSO still improves solutions after many thousands of iterations, which makes the use of serial (non-parallel) PSO prohibitive in this kind of real-world problem, and that PPSO with more elaborate communication strategies is more efficient and robust than the master-slave model. Advantages and peculiarities of each model are carefully discussed in this work. (author)
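To make the idea of a neighborhood topology concrete, the following is a minimal serial sketch of a local-best PSO on a ring, where each particle only learns from its two neighbors. It is not the authors' PPSO: the function names, the coefficient values (w, c1, c2), the search bounds and the sphere test function are all illustrative assumptions. In a parallel variant, the objective evaluations marked in the comments are the part that would typically be distributed.

```python
import random

def pso_ring(f, dim, n_particles=20, iters=200, w=0.72, c1=1.49, c2=1.49,
             lo=-5.0, hi=5.0, seed=0):
    """Minimize f with a local-best (ring-neighborhood) PSO.

    All parameter values are illustrative assumptions, not taken from the paper.
    """
    rng = random.Random(seed)
    x = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    v = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in x]
    # Objective evaluations are the expensive step a parallel PSO would farm out.
    pbest_val = [f(p) for p in x]
    for _ in range(iters):
        for i in range(n_particles):
            # Ring topology: a particle sees only itself and its two neighbors.
            nbrs = [(i - 1) % n_particles, i, (i + 1) % n_particles]
            best = min(nbrs, key=lambda j: pbest_val[j])
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                v[i][d] = (w * v[i][d]
                           + c1 * r1 * (pbest[i][d] - x[i][d])
                           + c2 * r2 * (pbest[best][d] - x[i][d]))
                x[i][d] += v[i][d]
            val = f(x[i])  # expensive evaluation
            if val < pbest_val[i]:
                pbest_val[i] = val
                pbest[i] = x[i][:]
    k = min(range(n_particles), key=lambda j: pbest_val[j])
    return pbest[k], pbest_val[k]

def sphere(p):
    """Simple convex test function (an assumption for illustration)."""
    return sum(t * t for t in p)
```

Calling `pso_ring(sphere, dim=2)` returns the best position found and its objective value. Replacing the ring with a fully connected neighborhood gives the usual global-best PSO, so the topology is the only thing this sketch varies.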
Dynamic grid refinement for partial differential equations on parallel computers
International Nuclear Information System (INIS)
Mccormick, S.; Quinlan, D.
1989-01-01
The fast adaptive composite grid method (FAC) is an algorithm that uses various levels of uniform grids to provide adaptive resolution and fast solution of PDEs. An asynchronous version of FAC, called AFAC, that completely eliminates the bottleneck to parallelism is presented. This paper describes the advantage that this algorithm has in adaptive refinement for moving singularities on multiprocessor computers. This work is applicable to the parallel solution of two- and three-dimensional shock tracking problems. 6 refs
Proxy-equation paradigm: A strategy for massively parallel asynchronous computations
Mittal, Ankita; Girimaji, Sharath
2017-09-01
Massively parallel simulations of transport equation systems call for a paradigm change in algorithm development to achieve efficient scalability. Traditional approaches require time synchronization of processing elements (PEs), which severely restricts scalability. Relaxing the synchronization requirement introduces error and slows down convergence. In this paper, we propose and develop a novel "proxy equation" concept for a general transport equation that (i) tolerates asynchrony with minimal added error, (ii) preserves convergence order and thus (iii) is expected to scale efficiently on massively parallel machines. The central idea is to modify a priori the transport equation at the PE boundaries to offset asynchrony errors. Proof-of-concept computations are performed using a one-dimensional advection (convection) diffusion equation. The results demonstrate the promise and advantages of the present strategy.
New physics beyond the standard model of particle physics and parallel universes
Energy Technology Data Exchange (ETDEWEB)
Plaga, R. [Franzstr. 40, 53111 Bonn (Germany)]. E-mail: rainer.plaga@gmx.de
2006-03-09
It is shown that if, and only if, 'parallel universes' exist, an electroweak vacuum that is expected to have decayed since the big bang with a high probability might exist. It would neither necessarily render our existence unlikely nor could it be observed. In this special case the observation of certain combinations of Higgs-boson and top-quark masses, for which the standard model predicts such a decay, cannot be interpreted as evidence for new physics at low energy scales. The question of whether parallel universes exist is of interest to our understanding of the standard model of particle physics.
Comparing the force ripple during asynchronous and conventional stimulation.
Downey, Ryan J; Tate, Mark; Kawai, Hiroyuki; Dixon, Warren E
2014-10-01
Asynchronous stimulation has been shown to reduce fatigue during electrical stimulation; however, it may also exhibit a force ripple. We quantified the ripple during asynchronous and conventional single-channel transcutaneous stimulation across a range of stimulation frequencies. The ripple was measured during 5 asynchronous stimulation protocols, 2 conventional stimulation protocols, and 3 volitional contractions in 12 healthy individuals. Conventional 40 Hz and asynchronous 16 Hz stimulation were found to induce contractions that were as smooth as volitional contractions. Asynchronous 8, 10, and 12 Hz stimulation induced contractions with significant ripple. Lower stimulation frequencies can reduce fatigue; however, they may also lead to increased ripple. Future efforts should study the relationship between force ripple and the smoothness of the evoked movements in addition to the relationship between stimulation frequency and NMES-induced fatigue to elucidate an optimal stimulation frequency for asynchronous stimulation. © 2014 Wiley Periodicals, Inc.
A Parallel Saturation Algorithm on Shared Memory Architectures
Ezekiel, Jonathan; Siminiceanu
2007-01-01
Symbolic state-space generators are notoriously hard to parallelize. However, the Saturation algorithm implemented in the SMART verification tool differs from other sequential symbolic state-space generators in that it exploits the locality of firing events in asynchronous system models. This paper explores whether event locality can be utilized to efficiently parallelize Saturation on shared-memory architectures. Conceptually, we propose to parallelize the firing of events within a decision diagram node, which is technically realized via a thread pool. We discuss the challenges involved in our parallel design and conduct experimental studies on its prototypical implementation. On a dual-processor dual-core PC, our studies show speed-ups for several example models, e.g., of up to 50% for a Kanban model, when compared to running our algorithm only on a single core.
Motion of charged suspended particle in a non-Newtonian fluid between two long parallel plates
Energy Technology Data Exchange (ETDEWEB)
Abd Elkhalek, M M [Nuclear Research Center-Atomic Energy Authority, Cairo (Egypt)
1997-12-31
The motion of a charged suspended particle in a non-Newtonian fluid between two long parallel plates is discussed. The equation of motion of a suspended particle was suggested by Closkin. The equations of motion are reduced to ordinary differential equations by a similarity transformation and solved numerically by using the Runge-Kutta method. The trajectories of particles are calculated by integrating the equation of motion of a single particle. The present simulation requires some empirical parameters concerning the collision of the particles with the wall. The effects of solid particles on flow properties are discussed. Some typical results for both fluid and particle phases and density distributions of the particles are presented graphically. 4 figs.
Motion of Charged Suspended Particle in a Non-Newtonian Fluid between Two Long Parallel Plates
International Nuclear Information System (INIS)
Abd-El Khalek, M.M.
1998-01-01
The motion of charged suspended particle in a non-Newtonian fluid between two long parallel plates is discussed. The equation of motion of a suspended particle was suggested by Closkin. The equations of motion are reduced to ordinary differential equations by similarity transformations and solved numerically by using the Runge-Kutta method. The trajectories of particles are calculated by integrating the equation of motion of a single particle. The present simulation requires some empirical parameters concerning the collision of the particles with the wall. The effects of solid particles on flow properties are discussed. Some typical results for both fluid and particle phases and density distributions of the particles are presented graphically
Induction motor for superconducting synchronous/asynchronous motor
International Nuclear Information System (INIS)
Litz, D.C.; Haller, H.E. III.
1975-01-01
An induction motor structure for use on the outside of a superconducting rotor comprising a cylindrical shell of solid and laminated, magnetic iron with squirrel cage windings embedded in the outer circumference of said shell is described. The sections of the shell between the superconducting windings of the rotor are solid magnetic iron. The sections of the shell over the superconducting windings are made of laminations of magnetic iron. These laminations are parallel to the axis of the machine and are divided in halves with the laminations in each half oriented in diagonal opposition so that the intersection of the laminations forms a V. This structure presents a relatively high reluctance to leakage flux from the superconducting windings in the synchronous operating mode, while presenting a low reluctance path to the stator flux during asynchronous operation
Label-acquired magnetorotation for biosensing: An asynchronous rotation assay
International Nuclear Information System (INIS)
Hecht, Ariel; Kinnunen, Paivo; McNaughton, Brandon; Kopelman, Raoul
2011-01-01
This paper presents a novel application of magnetic particles for biosensing, called label-acquired magnetorotation (LAM). This method is based on a combination of the traditional sandwich assay format with the asynchronous magnetic bead rotation (AMBR) method. In label-acquired magnetorotation, an analyte facilitates the binding of a magnetic label bead to a nonmagnetic solid phase sphere, forming a sandwich complex. The sandwich complex is then placed in a rotating magnetic field, where the rotational frequency of the sandwich complex is a function of the amount of analyte attached to the surface of the sphere. Here, we use streptavidin-coated beads and biotin-coated particles as analyte mimics, to be replaced by proteins and other biological targets in future work. We show this sensing method to have a dynamic range of two orders of magnitude.
Energy Technology Data Exchange (ETDEWEB)
Fornel, B. de [Institut National Polytechnique, 31 - Toulouse (France)
2006-05-15
The asynchronous machine, with its low cost and robustness, is today the most widely used motor in variable-speed drives. However, its main drawback is that the same current generates both the magnetic flux and the torque, and thus any torque variation creates a flux variation. Such a coupling gives the asynchronous machine a nonlinear behaviour which makes its control much more complex. The direct self control (DSC) method has been developed to improve on the low efficiency of the scalar control method and for the specific railway drive application. The direct torque control (DTC) method is derived from the DSC method but corresponds to other types of applications. The DSC and DTC algorithms for asynchronous motors are presented in this article: 1 - direct control of the stator flux (DSC): principle, flux control, torque control, switching frequency of the inverter, speed estimation; 2 - direct torque control (DTC): principle, electromagnetic torque derivative, signal shapes and switching frequency, some results, DTC speed variator without speed sensor, DTC application to multi-machine multi-converter systems; 3 - conclusion. (J.S.)
Gomez-Suarez, C; van der Mei, HC; Busscher, HJ
2001-01-01
Particle size was found to be an important factor in air bubble-induced detachment of colloidal particles from collector surfaces in a parallel plate flow chamber and generally polystyrene particles with a diameter of 806 nm detached less than particles with a diameter of 1400 nm. Particle
Multi-petascale highly efficient parallel supercomputer
Asaad, Sameh; Bellofatto, Ralph E.; Blocksome, Michael A.; Blumrich, Matthias A.; Boyle, Peter; Brunheroto, Jose R.; Chen, Dong; Cher, Chen-Yong; Chiu, George L.; Christ, Norman; Coteus, Paul W.; Davis, Kristan D.; Dozsa, Gabor J.; Eichenberger, Alexandre E.; Eisley, Noel A.; Ellavsky, Matthew R.; Evans, Kahn C.; Fleischer, Bruce M.; Fox, Thomas W.; Gara, Alan; Giampapa, Mark E.; Gooding, Thomas M.; Gschwind, Michael K.; Gunnels, John A.; Hall, Shawn A.; Haring, Rudolf A.; Heidelberger, Philip; Inglett, Todd A.; Knudson, Brant L.; Kopcsay, Gerard V.; Kumar, Sameer; Mamidala, Amith R.; Marcella, James A.; Megerian, Mark G.; Miller, Douglas R.; Miller, Samuel J.; Muff, Adam J.; Mundy, Michael B.; O'Brien, John K.; O'Brien, Kathryn M.; Ohmacht, Martin; Parker, Jeffrey J.; Poole, Ruth J.; Ratterman, Joseph D.; Salapura, Valentina; Satterfield, David L.; Senger, Robert M.; Steinmacher-Burow, Burkhard; Stockdell, William M.; Stunkel, Craig B.; Sugavanam, Krishnan; Sugawara, Yutaka; Takken, Todd E.; Trager, Barry M.; Van Oosten, James L.; Wait, Charles D.; Walkup, Robert E.; Watson, Alfred T.; Wisniewski, Robert W.; Wu, Peng
2018-05-15
A Multi-Petascale Highly Efficient Parallel Supercomputer of 100 petaflop-scale includes node architectures based upon System-On-a-Chip technology, where each processing node comprises a single Application Specific Integrated Circuit (ASIC). The ASIC nodes are interconnected by a five-dimensional torus network that maximizes the throughput of packet communications between nodes and minimizes latency. The network implements a collective network and a global asynchronous network that provides global barrier and notification functions. The node design includes a list-based prefetcher. The memory system implements transactional memory, thread-level speculation, and a multiversioning cache that improves the soft error rate at the same time, and supports DMA functionality allowing for parallel-processing message passing.
An Asynchronous Many-Task Implementation of In-Situ Statistical Analysis using Legion.
Energy Technology Data Exchange (ETDEWEB)
Pebay, Philippe Pierre [Sandia National Lab. (SNL-CA), Livermore, CA (United States); Bennett, Janine Camille [Sandia National Lab. (SNL-CA), Livermore, CA (United States)
2015-11-01
In this report, we propose a framework for the design and implementation of in-situ analyses using an asynchronous many-task (AMT) model, using the Legion programming model together with the MiniAero mini-application as a surrogate for full-scale parallel scientific computing applications. The bulk of this work consists of converting the Learn/Derive/Assess model, which we had initially developed for parallel statistical analysis using MPI [PTBM11], from an SPMD to an AMT model. To this end, we propose an original use of the concept of Legion logical regions as a replacement for the parallel communication schemes used for the only operation of the statistics engines that requires explicit communication. We then evaluate this proposed scheme in a shared-memory environment, using the Legion port of MiniAero as a proxy for a full-scale scientific application, as a means to provide input data sets of variable size for the in-situ statistical analyses in an AMT context. We demonstrate in particular that the approach has merit, and warrants further investigation, in collaboration with ongoing efforts to improve the overall parallel performance of the Legion system.
Current Trends in High-Level Synthesis of Asynchronous Circuits
DEFF Research Database (Denmark)
Sparsø, Jens
2009-01-01
This paper is a survey paper presenting what the author sees as two major and promising trends in the current research in CAD-tools and design-methods for asynchronous circuits. One branch of research builds on top of existing asynchronous CAD-tools that perform syntax directed translation, e...... a conventional synchronous circuit as the starting point, and then adds some form of handshake-based flow-control. One approach keeps the global clock and implements discrete-time asynchronous operation. Another approach substitutes the clocked registers by asynchronous handshake-registers, thus creating truly...
Ultrascalable petaflop parallel supercomputer
Blumrich, Matthias A [Ridgefield, CT; Chen, Dong [Croton On Hudson, NY; Chiu, George [Cross River, NY; Cipolla, Thomas M [Katonah, NY; Coteus, Paul W [Yorktown Heights, NY; Gara, Alan G [Mount Kisco, NY; Giampapa, Mark E [Irvington, NY; Hall, Shawn [Pleasantville, NY; Haring, Rudolf A [Cortlandt Manor, NY; Heidelberger, Philip [Cortlandt Manor, NY; Kopcsay, Gerard V [Yorktown Heights, NY; Ohmacht, Martin [Yorktown Heights, NY; Salapura, Valentina [Chappaqua, NY; Sugavanam, Krishnan [Mahopac, NY; Takken, Todd [Brewster, NY
2010-07-20
A massively parallel supercomputer of petaOPS-scale includes node architectures based upon System-On-a-Chip technology, where each processing node comprises a single Application Specific Integrated Circuit (ASIC) having up to four processing elements. The ASIC nodes are interconnected by multiple independent networks that optimally maximize the throughput of packet communications between nodes with minimal latency. The multiple networks may include three high-speed networks for parallel algorithm message passing including a Torus, collective network, and a Global Asynchronous network that provides global barrier and notification functions. These multiple independent networks may be collaboratively or independently utilized according to the needs or phases of an algorithm for optimizing algorithm processing performance. The use of a DMA engine is provided to facilitate message passing among the nodes without the expenditure of processing resources at the node.
Massively parallel Monte Carlo for many-particle simulations on GPUs
Energy Technology Data Exchange (ETDEWEB)
Anderson, Joshua A.; Jankowski, Eric [Department of Chemical Engineering, University of Michigan, Ann Arbor, MI 48109 (United States); Grubb, Thomas L. [Department of Materials Science and Engineering, University of Michigan, Ann Arbor, MI 48109 (United States); Engel, Michael [Department of Chemical Engineering, University of Michigan, Ann Arbor, MI 48109 (United States); Glotzer, Sharon C., E-mail: sglotzer@umich.edu [Department of Chemical Engineering, University of Michigan, Ann Arbor, MI 48109 (United States); Department of Materials Science and Engineering, University of Michigan, Ann Arbor, MI 48109 (United States)
2013-12-01
Current trends in parallel processors call for the design of efficient massively parallel algorithms for scientific computing. Parallel algorithms for Monte Carlo simulations of thermodynamic ensembles of particles have received little attention because of the inherent serial nature of the statistical sampling. In this paper, we present a massively parallel method that obeys detailed balance and implement it for a system of hard disks on the GPU. We reproduce results of serial high-precision Monte Carlo runs to verify the method. This is a good test case because the hard disk equation of state over the range where the liquid transforms into the solid is particularly sensitive to small deviations away from the balance conditions. On a Tesla K20, our GPU implementation executes over one billion trial moves per second, which is 148 times faster than on a single Intel Xeon E5540 CPU core, enables 27 times better performance per dollar, and cuts energy usage by a factor of 13. With this improved performance we are able to calculate the equation of state for systems of up to one million hard disks. These large system sizes are required in order to probe the nature of the melting transition, which has been debated for the last forty years. In this paper we present the details of our computational method, and discuss the thermodynamics of hard disks separately in a companion paper.
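For orientation, the serial baseline that such a parallel method must reproduce can be sketched as follows. This is a plain single-threaded hard-disk Monte Carlo sweep, not the GPU scheme of the paper; the function names, the periodic square box and the uniform trial displacement are assumptions chosen for illustration. Symmetric trial moves that are accepted whenever they create no overlap satisfy detailed balance for hard disks.

```python
import random

def overlaps(pos, i, trial, sigma, box):
    """True if disk i placed at `trial` overlaps any other disk (periodic box)."""
    for j, (xj, yj) in enumerate(pos):
        if j == i:
            continue
        dx = trial[0] - xj
        dy = trial[1] - yj
        dx -= box * round(dx / box)  # minimum-image convention
        dy -= box * round(dy / box)
        if dx * dx + dy * dy < sigma * sigma:
            return True
    return False

def sweep(pos, sigma, box, delta, rng):
    """One serial sweep of len(pos) trial moves; returns the number accepted.

    Each trial displaces a randomly chosen disk by a uniform offset in
    [-delta, delta] per axis and is rejected if it would create an overlap.
    """
    accepted = 0
    n = len(pos)
    for _ in range(n):
        i = rng.randrange(n)
        trial = ((pos[i][0] + rng.uniform(-delta, delta)) % box,
                 (pos[i][1] + rng.uniform(-delta, delta)) % box)
        if not overlaps(pos, i, trial, sigma, box):
            pos[i] = trial
            accepted += 1
    return accepted
```

The trial moves are sequentially dependent through the shared `pos` list, which is the inherently serial character the abstract refers to. The overlap test costs O(N) per move here; a cell list would be the usual remedy for large systems.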
Asynchronous communication in real space process algebra
Baeten, J.C.M.; Bergstra, J.A.
1991-01-01
A version of classical real space process algebra is given in which messages travel with constant speed through a three-dimensional medium. It follows that communication is asynchronous and has a broadcasting character. A state operator is used to describe asynchronous message transfer and a
Asynchronous communication in real space process algebra
Bergstra, J.A.; Baeten, J.C.M.
1992-01-01
A version of classical real space process algebra is given in which messages travel with constant speed through a three-dimensional medium. It follows that communication is asynchronous and has a broadcasting character. A state operator is used to describe asynchronous message transfer and a
Development Of A Parallel Performance Model For The THOR Neutral Particle Transport Code
Energy Technology Data Exchange (ETDEWEB)
Yessayan, Raffi; Azmy, Yousry; Schunert, Sebastian
2017-02-01
The THOR neutral particle transport code enables simulation of complex geometries for various problems from reactor simulations to nuclear non-proliferation. It is undergoing a thorough V&V requiring computational efficiency. This has motivated various improvements including angular parallelization, outer iteration acceleration, and development of peripheral tools. For guiding future improvements to the code’s efficiency, better characterization of its parallel performance is useful. A parallel performance model (PPM) can be used to evaluate the benefits of modifications and to identify performance bottlenecks. Using INL’s Falcon HPC, the PPM development incorporates an evaluation of network communication behavior over heterogeneous links and a functional characterization of the per-cell/angle/group runtime of each major code component. After evaluating several possible sources of variability, this resulted in a communication model and a parallel portion model. The former’s accuracy is bounded by the variability of communication on Falcon while the latter has an error on the order of 1%.
Dynamic Load Balancing Based on Constrained K-D Tree Decomposition for Parallel Particle Tracing
Energy Technology Data Exchange (ETDEWEB)
Zhang, Jiang; Guo, Hanqi; Yuan, Xiaoru; Hong, Fan; Peterka, Tom
2018-01-01
Particle tracing is a fundamental technique in flow field data visualization. In this work, we present a novel dynamic load balancing method for parallel particle tracing. Specifically, we employ a constrained k-d tree decomposition approach to dynamically redistribute tasks among processes. Each process is initially assigned a regularly partitioned block along with a duplicated ghost layer, subject to the memory limit. During particle tracing, the k-d tree decomposition is dynamically performed by constraining the cutting planes to the overlap range of the duplicated data. This ensures that particles are reassigned to processes as evenly as possible, while the newly assigned particles of a process are always located in its block. Results show good load balance and high efficiency of our method.
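The constraint on the cutting plane can be illustrated with a one-axis sketch. It assumes a single split between two neighboring blocks whose duplicated data cover the interval [lo, hi]; the function names and the median rule are illustrative assumptions, not the authors' implementation, which builds a full k-d tree across many processes.

```python
def constrained_split(coords, lo, hi):
    """Return a cutting plane near the median of `coords`, clamped to [lo, hi].

    [lo, hi] is assumed to be the overlap (ghost) range shared by the two
    neighboring blocks, so any cut inside it keeps particles in local data.
    """
    s = sorted(coords)
    n = len(s)
    if n == 0:
        return 0.5 * (lo + hi)
    if n % 2:
        median = s[n // 2]
    else:
        median = 0.5 * (s[n // 2 - 1] + s[n // 2])
    # Clamp the balanced cut to the range both blocks can serve.
    return min(max(median, lo), hi)

def partition(coords, cut):
    """Split particle coordinates into the two sides of the cutting plane."""
    left = [c for c in coords if c < cut]
    right = [c for c in coords if c >= cut]
    return left, right
```

When the median falls inside the overlap range the two sides receive equal numbers of particles. When it falls outside, the cut is moved to the nearest admissible position, so balance is traded for the guarantee that no particle leaves the data its process already holds.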
Multi-objective parallel particle swarm optimization for day-ahead Vehicle-to-Grid scheduling
DEFF Research Database (Denmark)
Soares, Joao; Vale, Zita; Canizes, Bruno
2013-01-01
This paper presents a methodology for multi-objective day-ahead energy resource scheduling for smart grids considering intensive use of distributed generation and Vehicle-To-Grid (V2G). The main focus is the application of weighted Pareto to a multi-objective parallel particle swarm approach aiming...... to solve the dual-objective V2G scheduling: minimizing total operation costs and maximizing V2G income. A realistic mathematical formulation, considering the network constraints and V2G charging and discharging efficiencies is presented and parallel computing is applied to the Pareto weights. AC power flow...
International Nuclear Information System (INIS)
Mole, C.J.; Haller, H.E. III.
1977-01-01
Two parallel magnetic flux paths are provided in a dynamoelectric machine having a superconductive field winding. A first, or main, magnetic flux path includes at least one area of nonferromagnetic or diamagnetic material. A second, or shunt, magnetic flux path prevents the relatively low frequency ac flux present during starting or asynchronous operation of the machine, when used as an ac motor, from penetrating the superconductive winding
CCS, locations and asynchronous transition systems
DEFF Research Database (Denmark)
Mukund, Madhavan; Nielsen, Mogens
1992-01-01
We provide a simple non-interleaved operational semantics for CCS in terms of asynchronous transition systems. We identify the concurrency present in the system in a natural way, in terms of events occurring at independent locations in the system. We extend the standard interleaving transition...... system for CCS by introducing labels on the transitions with information about the locations of events. We then show that the resulting transition system is an asynchronous transition system which has the additional property of being elementary, which means that it can also be represented by a 1-safe net....... We also introduce a notion of bisimulation on asynchronous transition systems which preserves independence. We conjecture that the induced equivalence on CCS processes coincides with the notion of location equivalence proposed by Boudol et al....
International Nuclear Information System (INIS)
Wong Unhong; Wong Honcheng; Tang Zesheng
2010-01-01
Smoothed particle hydrodynamics (SPH), which is a class of meshfree particle methods (MPMs), has a wide range of applications from micro-scale to macro-scale as well as from discrete systems to continuum systems. Graphics hardware, originally designed for computer graphics, now provides unprecedented computational power for scientific computation. Particle systems need a huge amount of computation in physical simulation. In this paper, an efficient parallel implementation of an SPH method on graphics hardware using the Compute Unified Device Architecture (CUDA) is developed for fluid simulation. Compared with the corresponding CPU implementation, our experimental results show that the new approach allows significant speedups of fluid simulation by handling huge amounts of computation in parallel on graphics hardware.
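The core SPH operation that lends itself to this kind of parallelization is the density summation, in which every particle accumulates kernel-weighted mass from its neighbors independently of all other particles. The sketch below is a serial CPU illustration in two dimensions, not the CUDA implementation of the paper; the choice of the 2D poly6 kernel and the function names are assumptions.

```python
import math

def poly6_2d(r, h):
    """2D poly6 smoothing kernel with support radius h (normalized to unit integral)."""
    if r >= h:
        return 0.0
    return 4.0 / (math.pi * h ** 8) * (h * h - r * r) ** 3

def densities(pos, mass, h):
    """Density at each particle: rho_i = sum_j m * W(|r_i - r_j|, h).

    The outer loop body depends only on particle i, so on a GPU each
    iteration could be handled by its own thread.
    """
    rho = []
    for xi, yi in pos:
        total = 0.0
        for xj, yj in pos:
            total += mass * poly6_2d(math.hypot(xi - xj, yi - yj), h)
        rho.append(total)
    return rho
```

This brute-force version costs O(N^2); practical implementations restrict the inner loop to nearby particles with a uniform grid or similar neighbor search.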
System-Enforced Deterministic Streaming for Efficient Pipeline Parallelism
Institute of Scientific and Technical Information of China (English)
张昱; 李兆鹏; 曹慧芳
2015-01-01
Pipeline parallelism is a popular parallel programming pattern for emerging applications. However, programming pipelines directly on conventional multithreaded shared memory is difficult and error-prone. We present DStream, a C library that provides high-level abstractions of deterministic threads and streams for simply representing pipeline stage workers and their communications. The deterministic stream is established atop our proposed single-producer/multi-consumer (SPMC) virtual memory, which integrates synchronization with the virtual memory model to enforce determinism on shared memory accesses. We investigate various strategies for efficiently implementing DStream atop the SPMC memory, so that an infinite sequence of data items can be asynchronously published (fixed) and asynchronously consumed in order among adjacent stage workers. We have successfully transformed two representative pipeline applications, ferret and dedup, using DStream, and summarize the conversion rules. An empirical evaluation shows that the converted ferret performed on par with its Pthreads and TBB counterparts in terms of running time, while the converted dedup is close to 2.56X and 7.05X faster than the Pthreads counterpart and 1.06X and 3.9X faster than the TBB counterpart on 16 and 32 CPUs, respectively.
Directory of Open Access Journals (Sweden)
G. Ahirwar
2006-08-01
The effect of parallel electric field on the growth rate, parallel and perpendicular resonant energy and marginal stability of the electromagnetic ion-cyclotron (EMIC) wave with general loss-cone distribution function in a low β homogeneous plasma is investigated by particle aspect approach. The effect of the steepness of the loss-cone distribution is investigated on the electromagnetic ion-cyclotron wave. The whole plasma is considered to consist of resonant and non-resonant particles. It is assumed that resonant particles participate in the energy exchange with the wave, whereas non-resonant particles support the oscillatory motion of the wave. The wave is assumed to propagate parallel to the static magnetic field. The effect of the parallel electric field with the general distribution function is to control the growth rate of the EMIC waves, whereas the effect of steep loss-cone distribution is to enhance the growth rate and perpendicular heating of the ions. This study is relevant to the analysis of ion conics in the presence of an EMIC wave in the auroral acceleration region of the Earth's magnetoplasma.
Directory of Open Access Journals (Sweden)
Yu Huang
Parameter estimation for fractional-order chaotic systems is an important issue in fractional-order chaotic control and synchronization and could be essentially formulated as a multidimensional optimization problem. A novel algorithm called quantum parallel particle swarm optimization (QPPSO) is proposed to solve the parameter estimation for fractional-order chaotic systems. The parallel characteristic of quantum computing is used in QPPSO. This characteristic increases the calculation of each generation exponentially. The behavior of particles in quantum space is restrained by the quantum evolution equation, which consists of the current rotation angle, individual optimal quantum rotation angle, and global optimal quantum rotation angle. Numerical simulation based on several typical fractional-order systems and comparisons with some typical existing algorithms show the effectiveness and efficiency of the proposed algorithm.
Liu, Yang; Yucel, Abdulkadir; Bagci, Hakan; Michielssen, Eric
2015-01-01
of processors by leveraging two mechanisms: (i) a hierarchical parallelization strategy to evenly distribute the computation and memory loads at all levels of the PWTD tree among processors, and (ii) a novel asynchronous communication scheme to reduce the cost
Developing asynchronous online interprofessional education.
Sanborn, Heidi
2016-09-01
For many health programmes, developing interprofessional education (IPE) has been a challenge. Evidence on the best method for design and implementation of IPE has been slow to emerge, with little research on how to best incorporate IPE in the asynchronous online learning environment. This leaves online programmes with no clear guidance when embarking upon an initiative to integrate IPE into the curriculum. One tool that can be effective at guiding the incorporation of IPE across all learning platforms is the Interprofessional Education Collaborative (IPEC) competencies. A project was designed to integrate the nationally defined IPEC competencies throughout an asynchronous, online baccalaureate nursing completion programme. A programme-wide review led to targeted revision of course and unit-level objectives, learning experiences, and assessments based on the IPEC framework. As a result of this effort, the programme curriculum now provides interprofessional learning activities across all courses. This report provides a method for using the IPEC competencies to incorporate IPE within various asynchronous learning assessments, assuring students learn about, with, and from other professions.
Asynchronous networks: modularization of dynamics theorem
Bick, Christian; Field, Michael
2017-02-01
Building on the first part of this paper, we develop the theory of functional asynchronous networks. We show that a large class of functional asynchronous networks can be (uniquely) represented as feedforward networks connecting events or dynamical modules. For these networks we can give a complete description of the network function in terms of the function of the events comprising the network: the modularization of dynamics theorem. We give examples to illustrate the main results.
Energy Technology Data Exchange (ETDEWEB)
D' Ettorre Piazzoli, B; Mannocchi, G [Consiglio Nazionale delle Ricerche, Turin (Italy). Lab. di Cosmo-Geofisica; Melone, S [Istituto di Fisica dell' Universita, Ancona, Italy; Picchi, P; Visentin, R [Comitato Nazionale per l' Energia Nucleare, Frascati (Italy). Laboratori Nazionali di Frascati
1976-06-01
Expressions for the counting rate of rectangular telescopes in the case of single as well as multiple particles are given. The aperture for single particles is obtained in the form of a double integral and analytical solutions are given for some cases. The intensity for different multiplicities of parallel particles is related to the geometry of the detectors and to the features of the radiation. This allows an absolute comparison between the data recorded by different devices.
International Nuclear Information System (INIS)
He, H.-Q.; Wan, W.
2012-01-01
The parallel mean free path of solar energetic particles (SEPs), which is determined by physical properties of SEPs as well as those of solar wind, is a very important parameter in space physics to study the transport of charged energetic particles in the heliosphere, especially for space weather forecasting. In space weather practice, it is necessary to find a quick approach to obtain the parallel mean free path of SEPs for a solar event. In addition, the adiabatic focusing effect caused by a spatially varying mean magnetic field in the solar system is important to the transport processes of SEPs. Recently, Shalchi presented an analytical description of the parallel diffusion coefficient with adiabatic focusing. Based on Shalchi's results, in this paper we provide a direct analytical formula as a function of parameters concerning the physical properties of SEPs and solar wind to directly and quickly determine the parallel mean free path of SEPs with adiabatic focusing. Since all of the quantities in the analytical formula can be directly observed by spacecraft, this direct method would be a very useful tool in space weather research. As applications of the direct method, we investigate the inherent relations between the parallel mean free path and various parameters concerning physical properties of SEPs and solar wind. Comparisons of parallel mean free paths with and without adiabatic focusing are also presented.
Implementation of a 3D plasma particle-in-cell code on a MIMD parallel computer
International Nuclear Information System (INIS)
Liewer, P.C.; Lyster, P.; Wang, J.
1993-01-01
A three-dimensional plasma particle-in-cell (PIC) code has been implemented on the Intel Delta MIMD parallel supercomputer using the General Concurrent PIC algorithm. The GCPIC algorithm uses a domain decomposition to divide the computation among the processors: A processor is assigned a subdomain and all the particles in it. Particles must be exchanged between processors as they move. Results are presented comparing the efficiency for 1-, 2- and 3-dimensional partitions of the three dimensional domain. This algorithm has been found to be very efficient even when a large fraction (e.g. 30%) of the particles must be exchanged at every time step. On the 512-node Intel Delta, up to 125 million particles have been pushed with an electrostatic push time of under 500 nsec/particle/time step
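The particle-exchange step that a domain decomposition implies can be illustrated with a minimal one-dimensional sketch. This is not the GCPIC code: the function names `owner` and `push_and_exchange`, the periodic domain, the uniform slab partition, and the field-free push are all assumptions made purely for illustration.

```python
def owner(x, length, n_procs):
    """Index of the slab subdomain that contains position x."""
    # min() guards against x rounding up to exactly `length`.
    return min(int(x / length * n_procs), n_procs - 1)


def push_and_exchange(subdomains, dt, length):
    """Advance particles, then hand each one to the subdomain it now lies in.

    subdomains: list (one entry per processor) of lists of (x, v) tuples.
    Returns the new per-processor lists and the number of particles exchanged.
    """
    n_procs = len(subdomains)
    new_subdomains = [[] for _ in range(n_procs)]
    exchanged = 0
    for rank, particles in enumerate(subdomains):
        for x, v in particles:
            x_new = (x + v * dt) % length   # periodic boundary (assumption)
            dest = owner(x_new, length, n_procs)
            if dest != rank:
                exchanged += 1              # would be a message in a real MIMD code
            new_subdomains[dest].append((x_new, v))
    return new_subdomains, exchanged


# Four "processors" on a domain of length 4.0, one particle each.
parts = [[(0.5, 1.0)], [(1.5, 0.0)], [(2.5, -1.0)], [(3.5, 1.0)]]
parts, moved = push_and_exchange(parts, dt=1.0, length=4.0)
```

In this toy run three of the four particles cross a subdomain boundary in a single step, which loosely mirrors the regime the abstract describes, where a large fraction of particles is exchanged every time step.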
Scattering by a plane-parallel layer with high concentration of optically soft particles
International Nuclear Information System (INIS)
Loiko, Valery A.; Berdnik, Vladimir V.
2009-01-01
A method describing light propagation in a plane-parallel light-scattering layer with large concentration of homogeneous particles is developed. It is based on the radiative transfer equation and the doubling method. The interference approximation is used to take into account collective scattering effects. Spectral dependence of transmitted light for a layer of nonabsorbing, optically soft, subwavelength-sized particles is investigated. At small volume concentration of the particles, weak spectral dependences of the wave exponents for coherently transmitted and diffuse light are observed. It is shown that in a layer with large volume concentration of the subwavelength-sized particles the wave exponent can considerably exceed the value of four, which holds for Rayleigh particles. The dependence of wave exponents for coherently transmitted and diffuse light on the refractive index and concentration of particles is investigated in detail. Multiple scattering of light results in the reduction of the exponent. The quantitative results are presented and discussed. It is shown that there is a range of wavelengths where negative values of the wave exponent occur in the multiple scattering regime.
2011-01-01
An asynchronous analog to digital convertor for converting an analog input signal into a digital output is presented. According to an embodiment, the analog to digital convertor comprises a clock input operable to receive an external clock signal having a clock period, a comparator operable to
A 3D gyrokinetic particle-in-cell simulation of fusion plasma microturbulence on parallel computers
Williams, T. J.
1992-12-01
One of the grand challenge problems now supported by HPCC is the Numerical Tokamak Project. A goal of this project is the study of low-frequency micro-instabilities in tokamak plasmas, which are believed to cause energy loss via turbulent thermal transport across the magnetic field lines. An important tool in this study is gyrokinetic particle-in-cell (PIC) simulation. Gyrokinetic, as opposed to fully-kinetic, methods are particularly well suited to the task because they are optimized to study the frequency and wavelength domain of the microinstabilities. Furthermore, many researchers now employ low-noise delta(f) methods to greatly reduce statistical noise by modelling only the perturbation of the gyrokinetic distribution function from a fixed background, not the entire distribution function. In spite of the increased efficiency of these improved algorithms over conventional PIC algorithms, gyrokinetic PIC simulations of tokamak micro-turbulence are still highly demanding of computer power--even fully-vectorized codes on vector supercomputers. For this reason, we have worked for several years to redevelop these codes on massively parallel computers. We have developed 3D gyrokinetic PIC simulation codes for SIMD and MIMD parallel processors, using control-parallel, data-parallel, and domain-decomposition message-passing (DDMP) programming paradigms. This poster summarizes our earlier work on codes for the Connection Machine and BBN TC2000 and our development of a generic DDMP code for distributed-memory parallel machines. We discuss the memory-access issues which are of key importance in writing parallel PIC codes, with special emphasis on issues peculiar to gyrokinetic PIC. We outline the domain decompositions in our new DDMP code and discuss the interplay of different domain decompositions suited for the particle-pushing and field-solution components of the PIC algorithm.
Simulating fail-stop in asynchronous distributed systems
Sabel, Laura; Marzullo, Keith
1994-01-01
The fail-stop failure model appears frequently in the distributed systems literature. However, in an asynchronous distributed system, the fail-stop model cannot be implemented. In particular, it is impossible to reliably detect crash failures in an asynchronous system. In this paper, we show that it is possible to specify and implement a failure model that is indistinguishable from the fail-stop model from the point of view of any process within an asynchronous system. We give necessary conditions for a failure model to be indistinguishable from the fail-stop model, and derive lower bounds on the amount of process replication needed to implement such a failure model. We present a simple one-round protocol for implementing one such failure model, which we call simulated fail-stop.
Asynchronous decentralized method for interconnected electricity markets
International Nuclear Information System (INIS)
Huang, Anni; Joo, Sung-Kwan; Song, Kyung-Bin; Kim, Jin-Ho; Lee, Kisung
2008-01-01
This paper presents an asynchronous decentralized method to solve the optimization problem of interconnected electricity markets. The proposed method decomposes the optimization problem of combined electricity markets into individual optimization problems. The impact of neighboring markets' information is included in the objective function of the individual market optimization problem by the standard Lagrangian relaxation method. Most decentralized optimization methods use synchronous models of communication to exchange updated market information among markets during the iterative process. In this paper, however, the solutions of the individual optimization problems are coordinated through an asynchronous communication model until they converge to the global optimal solution of combined markets. Numerical examples are presented to demonstrate the advantages of the proposed asynchronous method over the existing synchronous methods. (author)
Gomez-Suarez, C; Van der Mei, HC; Busscher, HJ
2000-01-01
Electrostatic interactions between colloidal particles and collector surfaces were found to be important in particle detachment as induced by the passage of air bubbles in a parallel-plate flow chamber. Electrostatic interactions between adhering particles and passing air bubbles, however, were
Optimization of parameters of special asynchronous electric drives
Karandey, V. Yu; Popov, B. K.; Popova, O. B.; Afanasyev, V. L.
2018-03-01
The article considers the problem of optimizing the parameters of special asynchronous electric drives. Solving this problem makes it possible to design and build special asynchronous electric drives for various industries. The resulting drives will have optimum mass, size, and power parameters. This allows the required control characteristics of technological processes to be met with an optimum level of electric energy consumption, process completion time, or other specified parameters. The solution obtained not only solves a particular optimization problem but also allows dependences between the optimized parameters of special asynchronous electric drives to be constructed, for example as functions of power, current in the stator or rotor winding, induction in the gap or in the steel of the magnetic conductors, and other parameters. From these dependences, the necessary optimum parameter values for special asynchronous electric drives and their components can be chosen without repeated calculations.
Multiparty Asynchronous Session Types
DEFF Research Database (Denmark)
Honda, Kohei; Yoshida, Nobuko; Carbone, Marco
2016-01-01
Communication is a central element in software development. As a potential typed foundation for structured communication-centered programming, session types have been studied over the past decade for a wide range of process calculi and programming languages, focusing on binary (two-party) sessions. This work extends the foregoing theories of binary session types to multiparty, asynchronous sessions, which often arise in practical communication-centered applications. Presented as a typed calculus for mobile processes, the theory introduces a new notion of types in which interactions involving multiple peers are directly abstracted as a global scenario. Global types retain the friendly type syntax of binary session types while specifying dependencies and capturing complex causal chains of multiparty asynchronous interactions. A global type plays the role of a shared agreement among communication peers...
Interpolation algorithm for asynchronous ADC-data
Directory of Open Access Journals (Sweden)
S. Bramburger
2017-09-01
This paper presents a modified interpolation algorithm for signals with variable data rate from asynchronous ADCs. The Adaptive weights Conjugate gradient Toeplitz matrix (ACT) algorithm is extended to operate with a continuous data stream. Additional preprocessing of data with constant and linear sections, together with a weighted overlap of signals transformed step by step into the spectral domain, improves the reconstruction of the asynchronous ADC signal. The interpolation method can be used if asynchronous ADC data is fed into synchronous digital signal processing.
Yunxiao, CAO; Zhiqiang, WANG; Jinjun, WANG; Guofeng, LI
2018-05-01
Electrostatic separation has been extensively used in mineral processing, and has the potential to separate gangue minerals from raw talcum ore. In electrostatic separation, the particle charging status is one of the important influencing factors. To describe the talcum particle charging status in a parallel plate electrostatic separator accurately, this paper proposes a modern image processing method. Based on the actual trajectories obtained from sequence images of particle movement and the analysis of physical forces applied on a charged particle, a numerical model is built, which can calculate the charge-to-mass ratios that represent the charging status of the particles and simulate the particle trajectories. The simulated trajectories agree well with the experimental results obtained by image processing. In addition, chemical composition analysis is employed to reveal the relationship between ferrum gangue mineral content and charge-to-mass ratios. Research results show that the proposed method is effective for describing the particle charging status in electrostatic separation.
Synchronous and Asynchronous ATM Multiplexor Properties Comparison
Jan Zabka
2006-01-01
The article deals with the use of a computer model of an ATM multiplexor. Based on simulation runs, we review aspects of using synchronous and asynchronous ATM multiplexors. The ATM multiplexor is an input queuing model with three inputs. The synchronous multiplexor works without input priority; its inputs are served periodically. The asynchronous multiplexor model supports several queuing and priority mechanisms. CLR and CTD are the basic performance parameters. Input cell flows are genera...
Asynchronous communication in real space process algebra
Baeten, JCM Jos; Bergstra, JA Jan
1990-01-01
A version of classical real space process algebra is given in which messages travel with constant speed through a three-dimensional medium. It follows that communication is asynchronous and has a broadcasting character. A state operator is used to describe asynchronous message transfer, and a priority mechanism allows the broadcasting mechanism to be expressed. As an application, a protocol is specified in which the receiver moves with respect to the sender.
Asynchronous Learning Sources in a High-Tech Organization
Bouhnik, Dan; Giat, Yahel; Sanderovitch, Yafit
2009-01-01
Purpose: The purpose of this study is to characterize learning from asynchronous sources among research and development (R&D) personnel. It aims to examine four aspects of asynchronous source learning: employee preferences regarding self-learning; extent of source usage; employee satisfaction with these sources and the effect of the sources on the…
TCDQ-TCT retraction and losses during asynchronous beam dump
Bracco, Chiara; Quaranta, Elena; CERN. Geneva. ATS Department
2016-01-01
The protection provided by the TCDQs in case of asynchronous beam dump depends strongly on their correct setup. They have to respect the strict hierarchy of the full collimation system and shield the tertiary collimators in the experimental regions. This MD aimed at performing asynchronous beam dump tests with different configurations, in order to assess the minimum allowed retraction between TCTs and TCDQs and, as a consequence, on the β* reach.
Konduri, Aditya
Many natural and engineering systems are governed by nonlinear partial differential equations (PDEs) which result in multiscale phenomena, e.g. turbulent flows. Numerical simulations of these problems are computationally very expensive and demand extreme levels of parallelism. At realistic conditions, simulations are being carried out on massively parallel computers with hundreds of thousands of processing elements (PEs). It has been observed that communication between PEs as well as their synchronization at these extreme scales take up a significant portion of the total simulation time and result in poor scalability of codes. This issue is likely to pose a bottleneck in scalability of codes on future Exascale systems. In this work, we propose an asynchronous computing algorithm based on widely used finite difference methods to solve PDEs in which synchronization between PEs due to communication is relaxed at a mathematical level. We show that while stability is conserved when schemes are used asynchronously, accuracy is greatly degraded. Since message arrivals at PEs are random processes, so is the behavior of the error. We propose a new statistical framework in which we show that average errors always drop to first order regardless of the original scheme. We propose new asynchrony-tolerant schemes that maintain accuracy when synchronization is relaxed. The quality of the solution is shown to depend not only on the physical phenomena and numerical schemes, but also on the characteristics of the computing machine. A novel algorithm using remote memory access communications has been developed to demonstrate excellent scalability of the method for large-scale computing. Finally, we present a path to extend this method to solving complex multi-scale problems on Exascale machines.
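The idea of relaxing synchronization at PE boundaries can be made concrete with a small serial emulation. The sketch below is not the author's scheme or code: it assumes a 1D periodic heat equation, a standard explicit central-difference update, and a hypothetical delay model in which every value read across a PE boundary comes from a randomly chosen time level up to `max_delay` steps old. All names (`step`, `solve`, `max_delay`) are illustrative.

```python
import math
import random


def step(history, r, n_pes, max_delay, rng):
    """One explicit heat-equation update on a periodic 1D grid.

    history[-1] is the current time level; older levels are kept so that
    values read across a PE boundary can be taken from a stale level.
    """
    u = history[-1]
    n = len(u)
    pts_per_pe = n // n_pes
    new = [0.0] * n
    for i in range(n):
        vals = []
        for j in ((i - 1) % n, (i + 1) % n):
            if j // pts_per_pe != i // pts_per_pe:
                # Neighbor lives on another PE: its value may be delayed.
                k = rng.randint(0, min(max_delay, len(history) - 1))
                vals.append(history[-1 - k][j])
            else:
                vals.append(u[j])
        new[i] = u[i] + r * (vals[0] - 2.0 * u[i] + vals[1])
    return new


def solve(n=64, n_pes=4, steps=200, r=0.25, max_delay=0, seed=0):
    """Run the emulation; max_delay=0 reproduces the synchronous scheme."""
    rng = random.Random(seed)
    history = [[math.sin(2.0 * math.pi * i / n) for i in range(n)]]
    for _ in range(steps):
        history.append(step(history, r, n_pes, max_delay, rng))
        history = history[-(max_delay + 1):]   # keep only levels that can be read
    return history[-1]


u_sync = solve(max_delay=0)
u_async = solve(max_delay=3)
deviation = max(abs(a - b) for a, b in zip(u_sync, u_async))
```

With `r <= 0.5` each update is a convex combination of current and stale values, so the emulated asynchronous solution stays bounded, while `deviation` shows that it no longer matches the synchronous result. This is only a qualitative illustration of the stability and accuracy remarks in the abstract, not a reproduction of its analysis.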
Basic Algorithms for the Asynchronous Reconfigurable Mesh
Directory of Open Access Journals (Sweden)
Yosi Ben-Asher
2002-01-01
Many constant time algorithms for various problems have been developed for the reconfigurable mesh (RM) in the past decade. All these algorithms are designed to work with synchronous execution, with no regard for the fact that large size RMs will probably be asynchronous. A similar observation about the PRAM model motivated many researchers to develop algorithms and complexity measures for the asynchronous PRAM (APRAM). In this work, we show how to define the asynchronous reconfigurable mesh (ARM) and how to measure the complexity of asynchronous algorithms executed on it. We show that connecting all processors in a row of an n×n ARM (the analog of barrier synchronization in the APRAM model) can be solved with complexity Θ(n log n). Intuitively, this is average work time for solving such a problem. Next, we describe a general technique for simulating T-step synchronous RM algorithms on the ARM with complexity of Θ(T⋅n² log n). Finally, we consider the simulation of the classical synchronous algorithm for counting the number of non-zero bits in an n-bit vector using (k
Energy Technology Data Exchange (ETDEWEB)
Takemiya, Hiroshi; Ohta, Hirofumi; Honma, Ichirou
1996-03-01
The parallelization of the electromagnetic cascade Monte Carlo simulation code EGS4 on a distributed-memory scalar parallel computer, the Intel Paragon XP/S15-256, is described. In EGS4, the calculation time varies widely from one incident particle to another because of the dynamic generation of secondary particles and the different behavior of each particle. The granularity of parallel processing, the parallel programming model, and the algorithm for parallel random number generation are discussed, and two methods, which allocate particles dynamically or statically, are used to achieve high-speed parallel processing of this code. Among the four problems chosen for performance evaluation, speedup factors of nearly 100 were attained for three problems with 128 processors. It was found that when both the calculation time per incident particle and its dispersion are large, the dynamic particle allocation method, which can even out the load across processors, is preferable. When they are small, the static particle allocation method, which reduces the communication overhead, is preferable. Moreover, it is pointed out that double-precision variables must be used in the EGS4 code to obtain accurate results. Finally, the workflow of program parallelization is analyzed, and tools for program parallelization are discussed based on the experience of parallelizing EGS4. (author).
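The trade-off between static and dynamic particle allocation can be illustrated with a toy scheduling model. This is a sketch under stated assumptions, not the EGS4 parallelization itself: the per-particle costs are invented, static allocation is modeled as round-robin assignment, dynamic allocation as a shared queue served by whichever worker is idle first, and communication overhead is ignored entirely. The function names are hypothetical.

```python
import heapq


def static_makespan(costs, n_workers):
    """Particles are dealt out in advance, round-robin; no runtime communication."""
    loads = [0.0] * n_workers
    for i, c in enumerate(costs):
        loads[i % n_workers] += c
    return max(loads)


def dynamic_makespan(costs, n_workers):
    """Each idle worker fetches the next particle from a shared queue."""
    finish_times = [0.0] * n_workers
    heapq.heapify(finish_times)
    for c in costs:
        t = heapq.heappop(finish_times)    # worker that becomes idle first
        heapq.heappush(finish_times, t + c)
    return max(finish_times)


# Hypothetical per-particle costs: most are cheap, a few spawn large cascades.
skewed = [1, 1, 1, 1, 9, 1, 1, 1, 9, 1, 1, 1]
uniform = [2] * 8

skewed_static = static_makespan(skewed, 4)     # one worker receives both heavy particles
skewed_dynamic = dynamic_makespan(skewed, 4)   # heavy particles end up on different workers
uniform_static = static_makespan(uniform, 4)
uniform_dynamic = dynamic_makespan(uniform, 4)
```

For the skewed costs the dynamic model finishes noticeably earlier than the static one, while for uniform costs both finish at the same time. Because the model charges nothing for fetching work, it cannot show the communication overhead that, according to the abstract, makes static allocation preferable when costs are small and uniform.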
Sandalski, Stou
Smooth particle hydrodynamics is an efficient method for modeling the dynamics of fluids. It is commonly used to simulate astrophysical processes such as binary mergers. We present a newly developed GPU accelerated smooth particle hydrodynamics code for astrophysical simulations. The code is named neptune after the Roman god of water. It is written in OpenMP parallelized C++ and OpenCL and includes octree based hydrodynamic and gravitational acceleration. The design relies on object-oriented methodologies in order to provide a flexible and modular framework that can be easily extended and modified by the user. Several pre-built scenarios for simulating collisions of polytropes and black-hole accretion are provided. The code is released under the MIT Open Source license and publicly available at http://code.google.com/p/neptune-sph/.
Buxton, Eric C
2014-02-12
To evaluate and compare pharmacists' satisfaction with the content and learning environment of a continuing education program series offered as either synchronous or asynchronous webinars. An 8-lecture series of online presentations on the topic of new drug therapies was offered to pharmacists in synchronous and asynchronous webinar formats. Participants completed a 50-question online survey at the end of the program series to evaluate their perceptions of the distance learning experience. Eighty-two participants completed the survey instrument (41 participants from the live webinar series and 41 participants from the asynchronous webinar series.) Responses indicated that while both groups were satisfied with the program content, the asynchronous group showed greater satisfaction with many aspects of the learning environment. The synchronous and asynchronous webinar participants responded positively regarding the quality of the programming and the method of delivery, but asynchronous participants rated their experience more positively overall.
Designing Asynchronous Circuits for Low Power: An IFIR Filter
DEFF Research Database (Denmark)
Nielsen, Lars Skovby; Sparsø, Jens
1999-01-01
This paper addresses the design of asynchronous circuits for low power through an example: a filter bank for a digital hearing aid. The asynchronous design re-implements an existing synchronous circuit which is used in a commercial product. For comparison, both designs have been fabricated...
FPGA BASED ASYNCHRONOUS PIPELINED MB-OFDM UWB TRANSMITTER BACKEND MODULES
Directory of Open Access Journals (Sweden)
M. Santhi
2010-03-01
In this paper, a novel scheme is proposed which combines the advantages of asynchronous pipelining techniques and the advantages of FPGAs for implementing the digital backend modules of a 200 Mbps MB-OFDM UWB transmitter. In an asynchronous pipelined system, registers are used as in a synchronous system, but they are controlled by handshaking signals. Since FPGAs are rich in registers, design and implementation of an asynchronous pipelined MB-OFDM UWB transmitter on FPGA using the four-phase bundled-data protocol is considered in this paper. Novel ideas have also been proposed for designing asynchronous OFDM using Modified Radix-2^4 SDF and an asynchronous interleaver using two RAM banks. Implementation has been performed on an ALTERA STRATIX II EP2S60F1020C4 FPGA, and it operates at a speed of 350 MHz. It is assured that the proposed MB-OFDM UWB system can be made to work on a STRATIX III device with an operating frequency of 528 MHz in compliance with the ECMA-368 standard. The proposed scheme is also applicable to FPGAs from other vendors and to ASICs.
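The signalling order of a four-phase (return-to-zero) bundled-data handshake between two pipeline registers can be written out as a short software trace. This is only a behavioural sketch of the generic protocol, not the paper's FPGA design; the function name `four_phase_transfer` and the event labels are assumptions chosen for illustration.

```python
def four_phase_transfer(items):
    """Emulate one sender and one receiver exchanging data items.

    Returns the items latched by the receiver and a trace of
    (event, req, ack, data) tuples in the order the signals change.
    """
    req = 0
    ack = 0
    received = []
    trace = []
    for data in items:
        # Phase 1: sender places valid data on the bundle and raises req.
        req = 1
        trace.append(("req+", req, ack, data))
        # Phase 2: receiver latches the data and raises ack.
        received.append(data)
        ack = 1
        trace.append(("ack+", req, ack, data))
        # Phase 3: sender sees ack and lowers req (data may now change).
        req = 0
        trace.append(("req-", req, ack, None))
        # Phase 4: receiver lowers ack; both wires are back at zero.
        ack = 0
        trace.append(("ack-", req, ack, None))
    return received, trace


received, trace = four_phase_transfer([3, 7])
```

Each transfer costs four signal transitions, and both wires return to zero before the next item is sent, which is what distinguishes the four-phase protocol from a two-phase one.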
A parallel Discrete Element Method to model collisions between non-convex particles
Directory of Open Access Journals (Sweden)
Rakotonirina Andriarimina Daniel
2017-01-01
In many dry granular and suspension flow configurations, particles can be highly non-spherical. It is now well established in the literature that particle shape affects the flow dynamics or the microstructure of the particles assembly in assorted ways, e.g. compacity of packed bed or heap, dilation under shear, resistance to shear, momentum transfer between translational and angular motions, ability to form arches and block the flow. In this talk, we suggest an accurate and efficient way to model collisions between particles of (almost) arbitrary shape. For that purpose, we develop a Discrete Element Method (DEM) combined with a soft particle contact model. The collision detection algorithm handles contacts between bodies of various shape and size. For non-convex bodies, our strategy is based on decomposing a non-convex body into a set of convex ones. Therefore, our novel method can be called “glued-convex method” (in the sense of clumping convex bodies together), as an extension of the popular “glued-spheres” method, and is implemented in our own granular dynamics code Grains3D. Since the whole problem is solved explicitly, our fully-MPI parallelized code Grains3D exhibits a very high scalability when dynamic load balancing is not required. In particular, simulations on up to a few thousands cores in configurations involving up to a few tens of millions of particles can readily be performed. We apply our enhanced numerical model to (i) the collapse of a granular column made of convex particles and (ii) the microstructure of a heap of non-convex particles in a cylindrical reactor.
Integrating Asynchronous Digital Design Into the Computer Engineering Curriculum
Smith, S. C.; Al-Assadi, W. K.; Di, J.
2010-01-01
As demand increases for circuits with higher performance, higher complexity, and decreased feature size, asynchronous (clockless) paradigms will become more widely used in the semiconductor industry, as evidenced by the International Technology Roadmap for Semiconductors' (ITRS) prediction of a likely shift from synchronous to asynchronous design…
Exploring Asynchronous and Synchronous Tool Use in Online Courses
Oztok, Murat; Zingaro, Daniel; Brett, Clare; Hewitt, Jim
2013-01-01
While the independent contributions of synchronous and asynchronous interaction in online learning are clear, comparatively less is known about the pedagogical consequences of using both modes in the same environment. In this study, we examine relationships between students' use of asynchronous discussion forums and synchronous private messages…
An Overview of the Asynchronous Digital Systems – Part 3
Directory of Open Access Journals (Sweden)
Mihai Timis
2008-01-01
Implementation methods for digital asynchronous systems use different predefined models such as self-timed circuits, speed-independent circuits, delay-insensitive circuits, handshake protocol implementations in asynchronous systems, and Muller C circuits.
An Overview of the Asynchronous Digital Systems – Part 2
Directory of Open Access Journals (Sweden)
Mihai Timis
2008-01-01
Implementation methods for digital asynchronous systems use different predefined models such as self-timed circuits, speed-independent circuits, delay-insensitive circuits, handshake protocol implementations in asynchronous systems, and Muller C circuits.
Router Designs for an Asynchronous Time-Division-Multiplexed Network-on-Chip
DEFF Research Database (Denmark)
Kasapaki, Evangelia; Sparsø, Jens; Sørensen, Rasmus Bo
2013-01-01
In this paper we explore the design of an asynchronous router for a time-division-multiplexed (TDM) network-on-chip (NOC) that is being developed for a multi-processor platform for hard real-time systems. TDM inherently requires a common time reference, and existing TDM-based NOC designs are either....... This adds hardware complexity and increases area and power consumption. We propose to use asynchronous routers in order to achieve a simpler, more robust and globally-asynchronous NOC, and this represents an unexplored point in the design space. The paper presents a range of alternative router designs. All...... routers have been synthesized for a 65nm CMOS technology, and the paper reports post-layout figures for area, speed and energy and compares the asynchronous designs with an existing mesochronous clocked router. The results show that an asynchronous router is 2 times smaller, marginally slower...
Evaluation of discrete modeling efficiency of asynchronous electric machines
Byczkowska-Lipińska, Liliana; Stakhiv, Petro; Hoholyuk, Oksana; Vasylchyshyn, Ivanna
2011-01-01
The paper considers the problem of effective mathematical macromodels, in state-variable form, intended for transient analysis of asynchronous motors. The macromodels were compared with traditional mathematical models of asynchronous motors, including models built into the MATLAB/Simulink software, and their efficiency was analyzed.
Two Studies Examining Argumentation in Asynchronous Computer Mediated Communication
Joiner, Richard; Jones, Sarah; Doherty, John
2008-01-01
Asynchronous computer mediated communication (CMC) would seem to be an ideal medium for supporting development in student argumentation. This paper investigates this assumption through two studies. The first study compared asynchronous CMC with face-to-face discussions. The transactional and strategic level of the argumentation (i.e. measures of…
MODELING AND INVESTIGATION OF ASYNCHRONOUS TWO-MACHINE SYSTEM MODES
Directory of Open Access Journals (Sweden)
V. S. Safaryan
2014-01-01
The paper considers stationary and transient processes of an asynchronous two-machine system. It presents a mathematical model for investigating stationary and transient modes, static characteristics, and the results of a study of the dynamic process of starting up the asynchronous two-machine system.
A prototype pixel readout chip for asynchronous detection applications
International Nuclear Information System (INIS)
Raymond, D.M.; Hall, G.; Lewis, A.J.; Sharp, P.H.
1991-01-01
A two-dimensional array of amplifier cells has been fabricated as a prototype readout system for a matching array of silicon diode detectors. Each cell contains a preamplifier, shaping amplifier, comparator and analogue signal storage in an area of 300 μm × 320 μm using 3 μm CMOS technology. Full size chips will be bump bonded to pixel detector arrays. Low noise and asynchronous operation are novel design features. With noise levels of less than 250 rms electrons for input capacitances up to 600 fF, pixel detectors will be suitable for autoradiography, synchrotron X-ray and high energy particle detection applications. The design of the prototype chip is presented and future developments and prospects for applications are discussed. (orig.)
Plasma and energetic particle structure upstream of a quasi-parallel interplanetary shock
Kennel, C. F.; Scarf, F. L.; Coroniti, F. V.; Russell, C. T.; Wenzel, K.-P.; Sanderson, T. R.; Van Nes, P.; Smith, E. J.; Tsurutani, B. T.; Scudder, J. D.
1984-01-01
ISEE 1, 2 and 3 data from 1978 on interplanetary magnetic fields, shock waves and particle energetics are examined to characterize a quasi-parallel shock. The intense shock studied exhibited a 640 km/sec velocity. The data covered 1-147 keV protons and electrons and ions with energies exceeding 30 keV in regions both upstream and downstream of the shock, and also the magnitudes of ion-acoustic and MHD waves. The energetic particles and MHD waves began being detected 5 hr before the shock. Intense halo electron fluxes appeared ahead of the shock. A closed magnetic field structure was produced with a front end 700 earth radii from the shock. The energetic protons were cut off from the interior of the magnetic bubble, which contained a markedly increased density of 2-6 keV protons as well as the shock itself.
PsychVACS: a system for asynchronous telepsychiatry.
Odor, Alberto; Yellowlees, Peter; Hilty, Donald; Parish, Michelle Burke; Nafiz, Najia; Iosif, Ana-Maria
2011-05-01
To describe the technical development of an asynchronous telepsychiatry application, the Psychiatric Video Archiving and Communication System. A client-server application was developed in Visual Basic.Net with a Microsoft® SQL database as the backend. It includes the capability of storing video-recorded psychiatric interviews and manages the workflow of the system with automated messaging. The Psychiatric Video Archiving and Communication System has been used to conduct the first ever series of asynchronous telepsychiatry consultations worldwide. A review of the software application and the process as part of this project has led to a number of improvements that are being implemented in the next version, which is being written in Java. This is the first description of the use of video recorded data in an asynchronous telemedicine application. Primary care providers and consulting psychiatrists have found it easy to work with and a valuable resource to increase the availability of psychiatric consultation in remote rural locations.
ASCERTAINMENT OF THE EQUIVALENT CIRCUIT PARAMETERS OF THE ASYNCHRONOUS MACHINE
Directory of Open Access Journals (Sweden)
V. S. Safaryan
2015-01-01
The article considers experimental and analytical determination of the equivalent-circuit parameters of the asynchronous machine using reference data. Investigating transient processes in asynchronous machines requires the equivalent-circuit parameters (resistances, inductances and the coefficient of mutual inductance between the stator and rotor circuits), which are needed to build a mathematical simulation model of the transient process. Reference books do not provide these parameters; instead they give the rated quantities (active power, voltage, slip, efficiency and power factor) as well as the ratios of starting to nominal currents and torques. Existing studies on parameterization of the asynchronous machine equivalent circuit either fail to solve the problem completely or solve it only under simplifying assumptions. The paper presents experimental and analytical determination of the equivalent-circuit parameters: the experimental approach is based on the results of two measurements, and the analytical approach reduces to solving a system of nonlinear algebraic equations. The authors investigate the properties of the input resistance of the equivalent circuit and present curves of the input resistance as a function of slip. They present a symbolic model for analytical parameterization of the equivalent circuit that takes the form of a system of nonlinear equations and requires one of the rotor parameters to be assigned arbitrarily. The article demonstrates that experimental parameterization requires measuring the stator-circuit voltage, current and active power at two different slips, together with arbitrary assignment of one of the rotor parameters. The paper substantiates that an additional measurement does not remove the arbitrariness in the choice of that rotor parameter. The authors establish that in motoring mode there is a critical slip by which the
Effect of asynchronous updating on the stability of cellular automata
International Nuclear Information System (INIS)
Baetens, J.M.; Van der Weeën, P.; De Baets, B.
2012-01-01
Highlights: ► An upper bound on the Lyapunov exponent of asynchronously updated CA is established. ► The employed update method has repercussions on the stability of CAs. ► A decision on the employed update method should be taken with care. ► Substantial discrepancies arise between synchronously and asynchronously updated CA. ► Discrepancies between different asynchronous update schemes are less pronounced. - Abstract: Although cellular automata (CAs) were conceptualized as utterly discrete mathematical models in which the states of all their spatial entities are updated simultaneously at every consecutive time step, i.e. synchronously, various CA-based models that rely on so-called asynchronous update methods have been constructed in order to overcome the limitations tied to the classical way of evolving CAs. So far, only a few researchers have addressed the consequences of this way of updating on the evolved spatio-temporal patterns, and the reachable stationary states. In this paper, we exploit Lyapunov exponents to determine to what extent the stability of the rules within a family of totalistic CAs is affected by the underlying update method. For that purpose, we derive an upper bound on the maximum Lyapunov exponent of asynchronously iterated CAs, and show its validity, after which we present a comparative study between the Lyapunov exponents obtained for five different update methods, namely one synchronous method and four well-established asynchronous methods. It is found that the stability of CAs is seriously affected if one of the latter methods is employed, whereas the discrepancies arising between the different asynchronous methods are far less pronounced and, finally, we discuss the repercussions of our findings on the development of CA-based models.
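The difference between synchronous and asynchronous updating can be made concrete with a small sketch. The code below is illustrative only: the totalistic rule is a hypothetical choice, the lattice is a one-dimensional ring, and the random-order sweep is just one common asynchronous scheme, not necessarily one of the four studied in the paper.

```python
import random

def totalistic_rule(total):
    """Hypothetical totalistic rule: next state is 1 iff exactly one cell
    in the 3-cell neighbourhood (left, self, right) is currently 1."""
    return 1 if total == 1 else 0

def neighbourhood_sum(cells, i):
    n = len(cells)
    return cells[(i - 1) % n] + cells[i] + cells[(i + 1) % n]

def step_sync(cells):
    """Synchronous update: every cell reads the same old configuration."""
    return [totalistic_rule(neighbourhood_sum(cells, i)) for i in range(len(cells))]

def step_async_random_order(cells, rng):
    """Asynchronous update: cells are updated one at a time in a random order,
    so later cells already see the new states of earlier ones."""
    cells = list(cells)              # work on a copy, leave the input untouched
    order = list(range(len(cells)))
    rng.shuffle(order)
    for i in order:
        cells[i] = totalistic_rule(neighbourhood_sum(cells, i))
    return cells
```

Calling `step_sync([0, 0, 1, 0, 0])` gives `[0, 1, 1, 1, 0]`, whereas `step_async_random_order` on the same configuration can give different results depending on the order drawn from `rng`, which is the sensitivity to the update method that the abstract quantifies with Lyapunov exponents.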
Parallelized Kalman-Filter-Based Reconstruction of Particle Tracks on Many-Core Architectures
Energy Technology Data Exchange (ETDEWEB)
Cerati, Giuseppe [Fermilab; Elmer, Peter [Princeton U.; Krutelyov, Slava [UC, San Diego; Lantz, Steven [Cornell U., Phys. Dept.; Lefebvre, Matthieu [Princeton U.; Masciovecchio, Mario [UC, San Diego; McDermott, Kevin [Cornell U., Phys. Dept.; Riley, Daniel [Cornell U., Phys. Dept.; Tadel, Matevž [UC, San Diego; Wittich, Peter [Cornell U., Phys. Dept.; Würthwein, Frank [UC, San Diego; Yagil, Avi [UC, San Diego
2017-11-16
Faced with physical and energy density limitations on clock speed, contemporary microprocessor designers have increasingly turned to on-chip parallelism for performance gains. Examples include the Intel Xeon Phi, GPGPUs, and similar technologies. Algorithms should accordingly be designed with ample amounts of fine-grained parallelism if they are to realize the full performance of the hardware. This requirement can be challenging for algorithms that are naturally expressed as a sequence of small-matrix operations, such as the Kalman filter methods widely in use in high-energy physics experiments. In the High-Luminosity Large Hadron Collider (HL-LHC), for example, one of the dominant computational problems is expected to be finding and fitting charged-particle tracks during event reconstruction; today, the most common track-finding methods are those based on the Kalman filter. Experience at the LHC, both in the trigger and offline, has shown that these methods are robust and provide high physics performance. Previously we reported the significant parallel speedups that resulted from our efforts to adapt Kalman-filter-based tracking to many-core architectures such as Intel Xeon Phi. Here we report on how effectively those techniques can be applied to more realistic detector configurations and event complexity.
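To show the kind of per-track arithmetic that has to be batched, here is a deliberately simplified sketch of the Kalman measurement update applied to many track candidates at once. It is an assumption for illustration, not the authors' code: the state is a scalar rather than a small matrix, the data are held in plain Python lists in a structure-of-arrays layout, and the function name is hypothetical.

```python
def kalman_update_batch(x, p, z, r):
    """Scalar Kalman measurement update applied element-wise to a batch.

    x: predicted states, p: predicted variances,
    z: measurements,     r: measurement variances.
    All four are equal-length lists (one entry per track candidate).
    """
    new_x, new_p = [], []
    for xi, pi, zi, ri in zip(x, p, z, r):
        gain = pi / (pi + ri)               # Kalman gain
        new_x.append(xi + gain * (zi - xi))  # corrected state
        new_p.append((1.0 - gain) * pi)      # reduced variance
    return new_x, new_p
```

Every iteration of the loop is independent of the others, so on vector or many-core hardware the same operations can be applied to many tracks in lockstep; with real track states the scalars become small matrices, but the independence across tracks is the same.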
Detection of Failure in Asynchronous Motor Using Soft Computing Method
Vinoth Kumar, K.; Sony, Kevin; Achenkunju John, Alan; Kuriakose, Anto; John, Ano P.
2018-04-01
This paper investigates stator winding short-circuit failures of asynchronous motors and their effects on the motor current spectrum. A fuzzy logic approach, i.e., a model-based technique, can help detect asynchronous motor failures. Fuzzy logic resembles human reasoning in that it draws linguistic inferences from vague data. A dynamic model of the asynchronous motor is developed using a fuzzy logic classifier to investigate stator inter-turn failure as well as open-phase failure. A hardware implementation was carried out with LabVIEW for online monitoring of faults.
Long-time self-diffusion of charged spherical colloidal particles in parallel planar layers.
Contreras-Aburto, Claudio; Báez, César A; Méndez-Alcaraz, José M; Castañeda-Priego, Ramón
2014-06-28
The long-time self-diffusion coefficient, D(L), of charged spherical colloidal particles in parallel planar layers is studied by means of Brownian dynamics computer simulations and mode-coupling theory. All particles (regardless which layer they are located on) interact with each other via the screened Coulomb potential and there is no particle transfer between layers. As a result of the geometrical constraint on particle positions, the simulation results show that D(L) is strongly controlled by the separation between layers. On the basis of the so-called contraction of the description formalism [C. Contreras-Aburto, J. M. Méndez-Alcaraz, and R. Castañeda-Priego, J. Chem. Phys. 132, 174111 (2010)], the effective potential between particles in a layer (the so-called observed layer) is obtained from integrating out the degrees of freedom of particles in the remaining layers. We have shown in a previous work that the effective potential performs well in describing the static structure of the observed layer (loc. cit.). In this work, we find that the D(L) values determined from the simulations of the observed layer, where the particles interact via the effective potential, do not agree with the exact values of D(L). Our findings confirm that even when an effective potential can perform well in describing the static properties, there is no guarantee that it will correctly describe the dynamic properties of colloidal systems.
Parallelization of quantum molecular dynamics simulation code
International Nuclear Information System (INIS)
Kato, Kaori; Kunugi, Tomoaki; Shibahara, Masahiko; Kotake, Susumu
1998-02-01
A quantum molecular dynamics simulation code has been developed at the Kansai Research Establishment for analyzing the thermalization of photon energies in molecules or materials. The simulation code is parallelized for both a scalar massively parallel computer (Intel Paragon XP/S75) and a vector parallel computer (Fujitsu VPP300/12). Scalable speed-up was obtained on both parallel computers by distributing particle groups across processor units. When the work is distributed to processor units not only by particle group but also by the fine-grained calculations that make up each particle's computation, high parallelization performance is achieved on the Intel Paragon XP/S75. (author)
Asynchronous Operators of Sequential Logic Venjunction & Sequention
Vasyukevich, Vadim
2011-01-01
This book is dedicated to new mathematical instruments assigned for logical modeling of the memory of digital devices. The case in point is logic-dynamical operation named venjunction and venjunctive function as well as sequention and sequentional function. Venjunction and sequention operate within the framework of sequential logic. In a form of the corresponding equations, they organically fit analytical expressions of Boolean algebra. Thus, a sort of symbiosis is formed using elements of asynchronous sequential logic on the one hand and combinational logic on the other hand. So, asynchronous
[A novel proposal explaining sleep disturbance of children in Japan--asynchronization].
Kohyama, Jun
2008-07-01
It has been reported that more than half of the children in Japan suffer from daytime sleepiness. In contrast, about one quarter of junior high-school students in Japan complain of insomnia. According to the International Classification of Sleep Disorders (Second edition), these children could be diagnosed as having behaviorally-induced insufficient sleep syndrome due to inadequate sleeping habits. Getting an adequate amount of sleep should solve such problems; however, such a therapeutic approach often fails. Although social factors are involved in these sleep disturbances, I feel that a novel notion - asynchronization - leads to an understanding of the pathophysiology of disturbances in these children. Further, it could contribute to resolving their problems. The essence of asynchronization is a disturbance of various aspects (e.g., cycle, amplitude, phase, and interrelationship) of the biological rhythms that normally exhibit circadian oscillation. The main cause of asynchronization is hypothesized to be the combination of light exposure during the night and the lack of light exposure in the morning. Asynchronization results in the disturbance of various systems. Thus, symptoms of asynchronization include disturbances of the autonomic nervous system (sleepiness, insomnia, disturbance of hormonal excretion, gastrointestinal problems, etc.) and higher brain function (disorientation, loss of sociality, loss of will or motivation, impaired alertness and performance, etc.). Neurological (attention deficit, aggression, impulsiveness, hyperactivity, etc.), psychiatric (depressive disorders, personality disorders, anxiety disorders, etc.) and somatic (tiredness, fatigue, etc.) disturbances could also be symptoms of asynchronization. At the initial phase of asynchronization, disturbances are functional and can be resolved relatively easily, such as by the establishment of a regular sleep-wakefulness cycle; however, without adequate intervention the disturbances could gradually
Localized radio frequency communication using asynchronous transfer mode protocol
Witzke, Edward L.; Robertson, Perry J.; Pierson, Lyndon G.
2007-08-14
A localized wireless communication system for communication between a plurality of circuit boards, and between electronic components on the circuit boards. Transceivers are located on each circuit board and electronic component. The transceivers communicate with one another over spread spectrum radio frequencies. An asynchronous transfer mode protocol controls communication flow with asynchronous transfer mode switches located on the circuit boards.
Energy Technology Data Exchange (ETDEWEB)
Jacob, D.
2005-07-01
This book proposes a presentation of AC electric motors essentially based on physics and technology. Its originality consists in avoiding to use mathematical formulations (like Park's transformation). The modeling retained, which only uses magnetic momentum, magnetic fields and reluctance concepts, leads simply and naturally to the vectorial control principle. The book develops some lecture elements which includes some topics rarely considered like the dimensioning of an asynchronous motor or of a single-phase brush-less motor. Experimental results illustrate the physical phenomena described and many original problems are resolved and commented at the end of each chapter. Content: signals and systems in electrotechnics, torque and rotating magnetic fields generation, asynchronous machine in permanent regime, speed variation of the asynchronous motor, special asynchronous motors, synchronous machine in permanent regime, brush-less motor, note about step motors, note about inverters, index. (J.S.)
Behavioral Synthesis of Asynchronous Circuits Using Syntax Directed Translation as Backend
DEFF Research Database (Denmark)
Nielsen, Sune Fallgaard; Sparsø, Jens; Madsen, Jan
2009-01-01
The current state of the art in high-level synthesis of asynchronous circuits is syntax directed translation, which performs a one-to-one mapping of an HDL description into a corresponding circuit. This paper presents a method for behavioral synthesis of asynchronous circuits which builds on top...... description language Balsa [1]. This “conventional” template architecture allows us to adapt traditional synchronous synthesis techniques for resource sharing, scheduling, binding, etc., to the domain of asynchronous circuits. A prototype tool has been implemented on top of the Balsa framework, and the method...... is illustrated through the implementation of a set of example circuits. The main contributions of the paper are: the fundamental idea, the template architecture and its implementation using asynchronous handshake components, and the implementation of a prototype tool....
Energy Technology Data Exchange (ETDEWEB)
Guerette, D.
2009-07-01
This document presented a detailed mathematical explanation and validation of the steps leading to the development of an asynchronous squirrel-cage machine. The MatLab/Simulink software was used to model a wind turbine at variable high speeds. The asynchronous squirrel-cage machine is an electromechanical system coupled to a magnetic circuit. The resulting electromagnetic circuit can be represented as a set of resistances, leakage inductances and mutual inductances. Different models were used for a comparison study, including the Munteanu, Boldea, Wind Turbine Blockset, and SimPowerSystem. MatLab/Simulink modeling results were in good agreement with the results from other comparable models. Simulation results were in good agreement with analytical calculations. 6 refs, 2 tabs, 9 figs.
Yang, Sheng-Chun; Lu, Zhong-Yuan; Qian, Hu-Jun; Wang, Yong-Lei; Han, Jie-Ping
2017-11-01
In this work, we upgraded the electrostatic interaction method of CU-ENUF (Yang, et al., 2016) which first applied CUNFFT (nonequispaced Fourier transforms based on CUDA) to the reciprocal-space electrostatic computation and made the computation of electrostatic interaction done thoroughly in GPU. The upgraded edition of CU-ENUF runs concurrently in a hybrid parallel way that enables the computation parallelizing on multiple computer nodes firstly, then further on the installed GPU in each computer. By this parallel strategy, the size of simulation system will be never restricted to the throughput of a single CPU or GPU. The most critical technical problem is how to parallelize a CUNFFT in the parallel strategy, which is conquered effectively by deep-seated research of basic principles and some algorithm skills. Furthermore, the upgraded method is capable of computing electrostatic interactions for both the atomistic molecular dynamics (MD) and the dissipative particle dynamics (DPD). Finally, the benchmarks conducted for validation and performance indicate that the upgraded method is able to not only present a good precision when setting suitable parameters, but also give an efficient way to compute electrostatic interactions for huge simulation systems. Program Files doi:http://dx.doi.org/10.17632/zncf24fhpv.1 Licensing provisions: GNU General Public License 3 (GPL) Programming language: C, C++, and CUDA C Supplementary material: The program is designed for effective electrostatic interactions of large-scale simulation systems, which runs on particular computers equipped with NVIDIA GPUs. It has been tested on (a) single computer node with Intel(R) Core(TM) i7-3770@ 3.40 GHz (CPU) and GTX 980 Ti (GPU), and (b) MPI parallel computer nodes with the same configurations. Nature of problem: For molecular dynamics simulation, the electrostatic interaction is the most time-consuming computation because of its long-range feature and slow convergence in simulation space
Yan, Beichuan; Regueiro, Richard A.
2018-02-01
A three-dimensional (3D) DEM code for simulating complex-shaped granular particles is parallelized using message-passing interface (MPI). The concepts of link-block, ghost/border layer, and migration layer are put forward for design of the parallel algorithm, and theoretical scalability function of 3-D DEM scalability and memory usage is derived. Many performance-critical implementation details are managed optimally to achieve high performance and scalability, such as: minimizing communication overhead, maintaining dynamic load balance, handling particle migrations across block borders, transmitting C++ dynamic objects of particles between MPI processes efficiently, eliminating redundant contact information between adjacent MPI processes. The code executes on multiple US Department of Defense (DoD) supercomputers and tests up to 2048 compute nodes for simulating 10 million three-axis ellipsoidal particles. Performance analyses of the code including speedup, efficiency, scalability, and granularity across five orders of magnitude of simulation scale (number of particles) are provided, and they demonstrate high speedup and excellent scalability. It is also discovered that communication time is a decreasing function of the number of compute nodes in strong scaling measurements. The code's capability of simulating a large number of complex-shaped particles on modern supercomputers will be of value in both laboratory studies on micromechanical properties of granular materials and many realistic engineering applications involving granular materials.
Asynchronous and Synchronous Online Discussion: Real and Perceived Achievement Differences
Johnson, Genevieve Marie; Buck, George H.
2007-01-01
Students in an introductory educational psychology course used two WebCT communication tools (synchronous chat and asynchronous discussion) to discuss four case studies. In response to the item, "I learned the case studies best when using," 39 students selected synchronous chat and 51 students selected asynchronous discussion. Students who…
DESIGN METHODOLOGY OF SELF-EXCITED ASYNCHRONOUS GENERATOR
Directory of Open Access Journals (Sweden)
Berzan V.
2012-04-01
The paper sets out a methodology for designing an asynchronous generator with capacitive self-excitation. It is known that such a generator can be designed on the basis of a serial synchronous motor with a squirrel-cage rotor. With this approach, only the stator winding of the electrical machine is redesigned, which makes creating the generator cost-effective. Therefore, the design methodology, the optimization calculations, and the development of the stator winding scheme and the excitation system are not only of practical interest but may also be useful to specialists in electrical machines who design asynchronous generators.
Handbook of asynchronous machines with variable speed
Razik, Hubert
2013-01-01
This handbook deals with the asynchronous machine in its close environment. It was born from a reflection on this electromagnetic converter, which plays a large part in industrial environments. Previously this type of motor operated at fixed speed; it is now increasingly integrated into variable-speed processes. For this reason it seemed useful, or necessary, to write a handbook covering the various aspects, from the motor itself, through its control, to diagnosis. Indeed, an asynchronous motor is used nowadays in industry where variation speed a
Plicht, J. van der
1980-01-01
A parallel plate avalanche detector developed for the detection of fission fragments in particle induced fission reactions is described. The active area is 6 × 10 cm²; it is position sensitive in one dimension with a resolution of 2.5 mm. The detector can withstand a count rate of 25000 fission
A multithreaded parallel implementation of a dynamic programming algorithm for sequence comparison.
Martins, W S; Del Cuvillo, J B; Useche, F J; Theobald, K B; Gao, G R
2001-01-01
This paper discusses the issues involved in implementing a dynamic programming algorithm for biological sequence comparison on a general-purpose parallel computing platform based on a fine-grain event-driven multithreaded program execution model. Fine-grain multithreading permits efficient parallelism exploitation in this application both by taking advantage of asynchronous point-to-point synchronizations and communication with low overheads and by effectively tolerating latency through the overlapping of computation and communication. We have implemented our scheme on EARTH, a fine-grain event-driven multithreaded execution and architecture model which has been ported to a number of parallel machines with off-the-shelf processors. Our experimental results show that the dynamic programming algorithm can be efficiently implemented on EARTH systems with high performance (e.g., speedup of 90 on 120 nodes), good programmability and reasonable cost.
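The parallelism available in such a dynamic programming table can be sketched by sweeping it along anti-diagonals. The example below is an illustration under stated assumptions, not the EARTH implementation: it uses the longest-common-subsequence recurrence as a stand-in for the paper's scoring scheme, runs sequentially, and the function name is hypothetical.

```python
def lcs_length_wavefront(a, b):
    """Length of the longest common subsequence of a and b, computed by
    sweeping the DP table along anti-diagonals (i + j = constant)."""
    n, m = len(a), len(b)
    # score[i][j] = LCS length of a[:i] and b[:j]; row 0 and column 0 stay 0
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for d in range(2, n + m + 1):
        i_lo = max(1, d - m)
        i_hi = min(n, d - 1)
        # Every cell on this anti-diagonal depends only on the two previous
        # anti-diagonals, so these iterations could run in parallel.
        for i in range(i_lo, i_hi + 1):
            j = d - i
            if a[i - 1] == b[j - 1]:
                score[i][j] = score[i - 1][j - 1] + 1
            else:
                score[i][j] = max(score[i - 1][j], score[i][j - 1])
    return score[n][m]
```

In a fine-grain multithreaded setting, each cell (or block of cells) would become a thread that fires once its left, upper and upper-left neighbours have signalled completion, which replaces the global barrier between anti-diagonals with point-to-point synchronization.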
International Nuclear Information System (INIS)
Ishizuki, Shigeru; Kawai, Wataru; Nemoto, Toshiyuki; Ogasawara, Shinobu; Kume, Etsuo; Adachi, Masaaki; Kawasaki, Nobuo; Yatake, Yo-ichi
2000-03-01
Several computer codes in the nuclear field have been vectorized, parallelized and transported on the FUJITSU VPP500 system, the AP3000 system and the Paragon system at Center for Promotion of Computational Science and Engineering in Japan Atomic Energy Research Institute. We dealt with 12 codes in fiscal 1998. These results are reported in 3 parts, i.e., the vectorization and parallelization on vector processors part, the parallelization on scalar processors part and the porting part. In this report, we describe the vectorization and parallelization on vector processors. In this vectorization and parallelization on vector processors part, the vectorization of General Tokamak Circuit Simulation Program code GTCSP, the vectorization and parallelization of Molecular Dynamics NTV (n-particle, Temperature and Velocity) Simulation code MSP2, Eddy Current Analysis code EDDYCAL, Thermal Analysis Code for Test of Passive Cooling System by HENDEL T2 code THANPACST2 and MHD Equilibrium code SELENEJ on the VPP500 are described. In the parallelization on scalar processors part, the parallelization of Monte Carlo N-Particle Transport code MCNP4B2, Plasma Hydrodynamics code using Cubic Interpolated Propagation Method PHCIP and Vectorized Monte Carlo code (continuous energy model / multi-group model) MVP/GMVP on the Paragon are described. In the porting part, the porting of Monte Carlo N-Particle Transport code MCNP4B2 and Reactor Safety Analysis code RELAP5 on the AP3000 are described. (author)
Reliable self-replicating machines in asynchronous cellular automata.
Lee, Jia; Adachi, Susumu; Peper, Ferdinand
2007-01-01
We propose a self-replicating machine that is embedded in a two-dimensional asynchronous cellular automaton with von Neumann neighborhood. The machine dynamically encodes its shape into description signals, and despite the randomness of cell updating, it is able to successfully construct copies of itself according to the description signals. Self-replication on asynchronously updated cellular automata may find application in nanocomputers, where reconfigurability is an essential property, since it allows avoidance of defective parts and simplifies programming of such computers.
Functional asynchronous networks: Factorization of dynamics and function
Directory of Open Access Journals (Sweden)
Bick Christian
2016-01-01
In this note we describe the theory of functional asynchronous networks and one of the main results, the Modularization of Dynamics Theorem, which for a large class of functional asynchronous networks gives a factorization of dynamics in terms of constituent subnetworks. For these networks we can give a complete description of the network function in terms of the function of the events comprising the network and thereby answer a question originally raised by Alon in the context of biological networks.
Low-power Implementation of an Encryption/Decryption System with Asynchronous Techniques
Directory of Open Access Journals (Sweden)
Nikos Sklavos
2002-01-01
An asynchronous VLSI implementation of the International Data Encryption Algorithm (IDEA) is presented in this paper. In order to evaluate the asynchronous design, a synchronous version of the algorithm was also designed. The VHDL hardware description language was used to describe the algorithm, and the VHDL code was synthesized using commercially available Synopsys tools. After placement and routing, both designs were fabricated in 0.6 μm CMOS technology. With a system clock of up to 8 MHz and a power supply of 5 V, the two chips were tested and evaluated in comparison with the software implementation of the IDEA algorithm. The asynchronous implementation shows lower power consumption than the synchronous one. The asynchronous chip is therefore well suited to Wireless Encryption Protocols and high-speed networks.
Cao, Jianfang; Cui, Hongyan; Shi, Hao; Jiao, Lijuan
2016-01-01
A back-propagation (BP) neural network can solve complicated random nonlinear mapping problems; therefore, it can be applied to a wide range of problems. However, as the sample size increases, the time required to train BP neural networks becomes lengthy. Moreover, the classification accuracy decreases as well. To improve the classification accuracy and runtime efficiency of the BP neural network algorithm, we proposed a parallel design and realization method for a particle swarm optimization (PSO)-optimized BP neural network based on MapReduce on the Hadoop platform using both the PSO algorithm and a parallel design. The PSO algorithm was used to optimize the BP neural network's initial weights and thresholds and improve the accuracy of the classification algorithm. The MapReduce parallel programming model was utilized to achieve parallel processing of the BP algorithm, thereby solving the problems of hardware and communication overhead when the BP neural network addresses big data. Datasets on 5 different scales were constructed using the scene image library from the SUN Database. The classification accuracy of the parallel PSO-BP neural network algorithm is approximately 92%, and the system efficiency is approximately 0.85, which presents obvious advantages when processing big data. The algorithm proposed in this study demonstrated both higher classification accuracy and improved time efficiency, which represents a significant improvement obtained from applying parallel processing to an intelligent algorithm on big data.
Application of intelligent soft start in asynchronous motor
Du, Xue; Ye, Ying; Wang, Yuelong; Peng, Lei; Zhang, Suying
2018-05-01
A three-phase asynchronous motor can be started either at full voltage or by step-down starting. Direct starting produces a large current surge that causes excessive local heating in the power grid, and the large starting torque also stresses the motor equipment and shortens the service life of the motor. To address the large current and torque at start-up, an intelligent soft starter is proposed. Applying the intelligent soft starter to an asynchronous motor highlights its advantages in motor control.
Object-Oriented Parallel Particle-in-Cell Code for Beam Dynamics Simulation in Linear Accelerators
International Nuclear Information System (INIS)
Qiang, J.; Ryne, R.D.; Habib, S.; Decky, V.
1999-01-01
In this paper, we present an object-oriented three-dimensional parallel particle-in-cell code for beam dynamics simulation in linear accelerators. A two-dimensional parallel domain decomposition approach is employed within a message passing programming paradigm along with a dynamic load balancing. Implementing object-oriented software design provides the code with better maintainability, reusability, and extensibility compared with conventional structure based code. This also helps to encapsulate the details of communications syntax. Performance tests on SGI/Cray T3E-900 and SGI Origin 2000 machines show good scalability of the object-oriented code. Some important features of this code also include employing symplectic integration with linear maps of external focusing elements and using z as the independent variable, typical in accelerators. A successful application was done to simulate beam transport through three superconducting sections in the APT linac design
Distributed Consensus of Stochastic Delayed Multi-agent Systems Under Asynchronous Switching.
Wu, Xiaotai; Tang, Yang; Cao, Jinde; Zhang, Wenbing
2016-08-01
In this paper, the distributed exponential consensus of stochastic delayed multi-agent systems with nonlinear dynamics is investigated under asynchronous switching. The asynchronous switching considered here accounts for the time needed to identify the active modes of multi-agent systems. The matched controller can be applied only after confirmation of mode switching is received, which means that the switching time of the matched controller in each node usually lags behind that of system switching. In order to handle the coexistence of switched signals and stochastic disturbances, a comparison principle of stochastic switched delayed systems is first proved. By means of this extended comparison principle, several easy-to-verify conditions for the existence of an asynchronously switched distributed controller are derived such that stochastic delayed multi-agent systems with asynchronous switching and nonlinear dynamics can achieve global exponential consensus. Two examples are given to illustrate the effectiveness of the proposed method.
The Determination of the Asynchronous Traction Motor Characteristics of Locomotive
Directory of Open Access Journals (Sweden)
Pavel Grigorievich Kolpakhchyan
2017-01-01
The article deals with the problem of the locomotive asynchronous traction motor control with the AC diesel-electric transmission. The limitations of the torque of the traction motor when powered by the inverter are determined. The recommendations to improve the use of asynchronous traction motor of locomotives with the AC diesel-electric transmission are given.
OFDM with Index Modulation for Asynchronous mMTC Networks.
Doğan, Seda; Tusha, Armed; Arslan, Hüseyin
2018-04-21
One of the critical missions for next-generation wireless communication systems is to fulfill the high demand for massive Machine-Type Communications (mMTC). In mMTC systems, sporadic transmission is performed between machine users and the base station (BS). Lack of coordination in time between the users and the BS destroys orthogonality between the subcarriers and causes inter-carrier interference (ICI). Therefore, providing services to asynchronous massive machine users is a major challenge for Orthogonal Frequency Division Multiplexing (OFDM). In this study, OFDM with index modulation (OFDM-IM) is proposed as an eligible solution to alleviate ICI caused by asynchronous transmission in uncoordinated mMTC networks. In OFDM-IM, data transmission is performed not only by modulated subcarriers but also by the indices of active subcarriers. Unlike classical OFDM, fractional subcarrier activation leads to less ICI in OFDM-IM technology. A novel subcarrier mapping scheme (SMS) named Inner Subcarrier Activation (ISA) is proposed to further alleviate adjacent user interference in asynchronous OFDM-IM-based systems. ISA reduces inter-user interference since it gives more activation priority to inner subcarriers compared with the existing SMSs. The superiority of the proposed SMS is shown through both theoretical analysis and computer-based simulations in comparison to existing mapping schemes for asynchronous systems.
Asynchronous control for networked systems
Rubio, Francisco; Bencomo, Sebastián
2015-01-01
This book sheds light on networked control systems; it describes different techniques for asynchronous control, moving away from the periodic actions of classical control, replacing them with state-based decisions and reducing the frequency with which communication between subsystems is required. The text focuses specially on event-based control. Split into two parts, Asynchronous Control for Networked Systems begins by addressing the problems of single-loop networked control systems, laying out various solutions which include two alternative model-based control schemes (anticipatory and predictive) and the use of H2/H∞ robust control to deal with network delays and packet losses. Results on self-triggering and send-on-delta sampling are presented to reduce the need for feedback in the loop. In Part II, the authors present solutions for distributed estimation and control. They deal first with reliable networks and then extend their results to scenarios in which delays and packet losses may occur. The novel ...
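The send-on-delta sampling mentioned above can be pictured with a short sketch: a sample is transmitted only when it differs from the last transmitted value by at least a threshold. This is a minimal illustration under assumed conventions (the function name `send_on_delta`, the `>=` comparison, and always sending the first sample are choices made for the example, not details from the book).

```python
def send_on_delta(samples, delta):
    """Return the (index, value) pairs that would be transmitted.

    A sample is sent when it deviates from the last *sent* value by at
    least `delta`; the first sample is always sent (an assumption).
    """
    sent = []
    last = None
    for k, y in enumerate(samples):
        if last is None or abs(y - last) >= delta:
            sent.append((k, y))
            last = y
    return sent
```

For a slowly varying signal most samples are suppressed, which is how such schemes reduce the communication needed in the loop.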
Kunin, Marc; Julliard, Kell N; Rodriguez, Tobias E
2014-06-01
The Department of Dental Medicine of Lutheran Medical Center has developed an asynchronous online curriculum consisting of prerecorded PowerPoint presentations with audio explanations. The focus of this study was to evaluate if the new asynchronous format satisfied the educational needs of the residents compared to traditional lecture (face-to-face) and synchronous (distance learning) formats. Lectures were delivered to 219 dental residents employing face-to-face and synchronous formats, as well as the new asynchronous format; 169 (77 percent) participated in the study. Outcomes were assessed with pretests, posttests, and individual lecture surveys. Results found the residents preferred face-to-face and asynchronous formats to the synchronous format in terms of effectiveness and clarity of presentations. This preference was directly related to the residents' perception of how well the technology worked in each format. The residents also rated the quality of student-instructor and student-student interactions in the synchronous and asynchronous formats significantly higher after taking the lecture series than they did before taking it. However, they rated the face-to-face format as significantly more conducive to student-instructor and student-student interaction. While the study found technology had a major impact on the efficacy of this curricular model, the results suggest that the asynchronous format can be an effective way to teach a postgraduate course.
Modeling and Analysis of Asynchronous Systems Using SAL and Hybrid SAL
Tiwari, Ashish; Dutertre, Bruno
2013-01-01
We present formal models and results of formal analysis of two different asynchronous systems. We first examine a mid-value select module that merges the signals coming from three different sensors that are each asynchronously sampling the same input signal. We then consider the phase locking protocol proposed by Daly, Hopkins, and McKenna. This protocol is designed to keep a set of non-faulty (asynchronous) clocks phase locked even in the presence of Byzantine-faulty clocks on the network. All models and verifications have been developed using the SAL model checking tools and the Hybrid SAL abstractor.
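A mid-value select over three sensor readings amounts to taking the median, so that a single outlying reading cannot determine the output. The sketch below shows only this selection step; the function name `mid_value_select` is hypothetical, and the asynchronous sampling behaviour analysed in the paper is not modelled.

```python
def mid_value_select(a: float, b: float, c: float) -> float:
    """Return the middle of three sensor readings (their median)."""
    return sorted((a, b, c))[1]
```

For example, with readings 1.0, 100.0, and 1.2, the selected value is 1.2, so the outlier 100.0 is masked.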
Supporting collaborative discussions on asynchronous time: a technological perspective
Caballé, Santi
2011-01-01
The aim of this paper is to report on an experience of using an innovative on-line learning tool to support real, collaborative learning through discussion in asynchronous time. While asynchronous interaction gives rise to unique opportunities that support active, collaborative learning, unique problems also arise, such as frustration caused by waiting for other people's reactions and feedback and the consequent loss of motivation, which has a negative impact on learning outcomes. In order t...
International Nuclear Information System (INIS)
Colavita, E.; Hacyan, S.
2014-01-01
We analyze the solutions of the Klein–Gordon and Dirac equations describing a charged particle in an electromagnetic plane wave combined with a magnetic field parallel to the direction of propagation of the wave. It is shown that the Klein–Gordon equation admits coherent states as solutions, while the corresponding solutions of the Dirac equation are superpositions of coherent and displaced-number states. Particular attention is paid to the resonant case in which the motion of the particle is unbounded. -- Highlights: •We study a relativistic electron in a particular electromagnetic field configuration. •New exact solutions of the Klein–Gordon and Dirac equations are obtained. •Coherent and displaced number states can describe a relativistic particle
EPOS for Coordination of Asynchronous Sensor Webs
National Aeronautics and Space Administration — Develop, integrate, and deploy software-based tools to coordinate asynchronous, distributed missions and optimize observation planning spanning simultaneous...
Wit, PJ; vanderMei, HC; Busscher, HJ
1997-01-01
By allowing an air-bubble to pass through a parallel plate flow chamber with negatively charged, colloidal polystyrene particles adhering to the bottom collector plate of the chamber, the detachment of adhering particles stimulated by surface tension forces induced by the passage of a liquid-air
DEFF Research Database (Denmark)
Vlachogiannis, Ioannis (John); Lee, K Y
2009-01-01
In this paper the state-of-the-art extended particle swarm optimization (PSO) methods for solving multi-objective optimization problems are presented. Among those, we emphasize the co-evolution technique of the parallel vector evaluated PSO (VEPSO), analysed and applied in a multi-objective problem...
On a model of three-dimensional bursting and its parallel implementation
Tabik, S.; Romero, L. F.; Garzón, E. M.; Ramos, J. I.
2008-04-01
A mathematical model for the simulation of three-dimensional bursting phenomena and its parallel implementation are presented. The model consists of four nonlinearly coupled partial differential equations that include fast and slow variables, and exhibits bursting in the absence of diffusion. The differential equations have been discretized by means of a linearly-implicit finite difference method that is second-order accurate in both space and time, on equally-spaced grids. The resulting system of linear algebraic equations at each time level has been solved by means of the Preconditioned Conjugate Gradient (PCG) method. Three different parallel implementations of the proposed mathematical model have been developed; two of these implementations, i.e., the MPI and the PETSc codes, are based on a message passing paradigm, while the third one, i.e., the OpenMP code, is based on a shared address space paradigm. These three implementations are evaluated on two current high performance parallel architectures, i.e., a dual-processor cluster and a Shared Distributed Memory (SDM) system. A novel representation of the results that emphasizes the most relevant factors that affect the performance of the parallel implementations is proposed. The comparative analysis of the computational results shows that the MPI and the OpenMP implementations are about twice as efficient as the PETSc code on the SDM system. It is also shown that, for the conditions reported here, the nonlinear dynamics of the three-dimensional bursting phenomena exhibits three stages characterized by asynchronous, synchronous and then asynchronous oscillations, before a quiescent state is reached. It is also shown that the fast system reaches steady state in much less time than the slow variables.
Novel Asynchronous Wrapper and Its Application to GALS Systems
Institute of Scientific and Technical Information of China (English)
Zhuang Shengxian; Peng Anjin; Lars Wanhammar
2006-01-01
An asynchronous wrapper with novel handshake circuits for data communication in globally asynchronous locally synchronous (GALS) systems is proposed. The handshake circuits include two communication ports and a local clock generator. Two approaches for the implementation of communication ports are presented, one with pure standard cells and the others with Müller-C elements. The detailed design methodology for GALS systems is given and the circuits are validated with VHDL and circuits simulation in standard CMOS technology.
Blending Online Asynchronous and Synchronous Learning
Directory of Open Access Journals (Sweden)
Lisa C. Yamagata-Lynch
2014-04-01
In this article I will share a qualitative self-study about a 15-week blended 100% online graduate level course facilitated through synchronous meetings on Blackboard Collaborate and asynchronous discussions on Blackboard. I taught the course at the University of Tennessee (UT) during the spring 2012 semester and the course topic was online learning environments. The primary research question of this study was: How can the designer/instructor optimize learning experiences for students who are studying about online learning environments in a blended online course relying on both synchronous and asynchronous technologies? I relied on student reflections of course activities during the beginning, middle, and the end of the semester as the primary data source to obtain their insights regarding course experiences. Through the experiences involved in designing and teaching the course and engaging in this study I found that there is room in the instructional technology research community to address strategies for facilitating online synchronous learning that complement asynchronous learning. Synchronous online whole class meetings and well-structured small group meetings can help students feel a stronger sense of connection to their peers and instructor and stay engaged with course activities. In order to provide meaningful learning spaces in synchronous learning environments, the instructor/designer needs to balance the tension between embracing the flexibility that the online space affords to users and designing deliberate structures that will help them take advantage of the flexible space.
Nishiura, Daisuke; Furuichi, Mikito; Sakaguchi, Hide
2015-09-01
The computational performance of a smoothed particle hydrodynamics (SPH) simulation is investigated for three types of current shared-memory parallel computer devices: many integrated core (MIC) processors, graphics processing units (GPUs), and multi-core CPUs. We are especially interested in efficient shared-memory allocation methods for each chipset, because the efficient data access patterns differ between compute unified device architecture (CUDA) programming for GPUs and OpenMP programming for MIC processors and multi-core CPUs. We first introduce several parallel implementation techniques for the SPH code, and then examine these on our target computer architectures to determine the most effective algorithms for each processor unit. In addition, we evaluate the effective computing performance and power efficiency of the SPH simulation on each architecture, as these are critical metrics for overall performance in a multi-device environment. In our benchmark test, the GPU is found to produce the best arithmetic performance as a standalone device unit, and gives the most efficient power consumption. The multi-core CPU obtains the most effective computing performance. The computational speed of the MIC processor on Xeon Phi approached that of two Xeon CPUs. This indicates that using MICs is an attractive choice for existing SPH codes on multi-core CPUs parallelized by OpenMP, as it gains computational acceleration without the need for significant changes to the source code.
Beam dynamics calculations and particle tracking using massively parallel processors
International Nuclear Information System (INIS)
Ryne, R.D.; Habib, S.
1995-01-01
During the past decade massively parallel processors (MPPs) have slowly gained acceptance within the scientific community. At present these machines typically contain a few hundred to one thousand off-the-shelf microprocessors and a total memory of up to 32 GBytes. The potential performance of these machines is illustrated by the fact that a month-long job on a high-end workstation might require only a few hours on an MPP. The acceptance of MPPs has been slow for a variety of reasons. For example, some algorithms are not easily parallelizable. Also, in the past these machines were difficult to program. But in recent years the development of Fortran-like languages such as CM Fortran and High Performance Fortran has made MPPs much easier to use. In the following we will describe how MPPs can be used for beam dynamics calculations and long term particle tracking
Ibraheem; Hasan, Naimul; Hussein, Arkan Ahmed
2014-01-01
This paper presents the design of a decentralized automatic generation controller for an interconnected power system using PID, Genetic Algorithm (GA) and Particle Swarm Optimization (PSO). The designed controllers are tested on identical two-area interconnected power systems consisting of thermal power plants. The interconnections between the two areas are considered as (i) AC tie-line only and (ii) asynchronous tie-line. The dynamic response analysis is carried out for a 1% load perturbation. The performance of the intelligent controllers based on GA and PSO has been compared with the conventional PID controller. The investigations of the system dynamic responses reveal that the PSO controller gives a better dynamic response than the PID and GA controllers for both types of area interconnection.
The Design of Finite State Machine for Asynchronous Replication Protocol
Wang, Yanlong; Li, Zhanhuai; Lin, Wei; Hei, Minglei; Hao, Jianhua
Data replication is a key way to design a disaster tolerance system and to achieve reliability and availability. It is difficult for a replication protocol to deal with diverse and complex environments, which means that data is less well replicated than it ought to be. To reduce data loss and to optimize replication protocols, we (1) present a finite state machine, (2) run it to manage an asynchronous replication protocol and (3) report a simple evaluation of the asynchronous replication protocol based on our state machine. It is shown that our state machine keeps the asynchronous replication protocol running in the proper state to the largest extent possible when various events occur. It can also help to build replication-based disaster tolerance systems that ensure business continuity.
Network evolution induced by asynchronous stimuli through spike-timing-dependent plasticity.
Directory of Open Access Journals (Sweden)
Wu-Jie Yuan
In sensory neural systems, external asynchronous stimuli play an important role in perceptual learning, associative memory and map development. However, the organization of structure and dynamics of neural networks induced by external asynchronous stimuli is not well understood. Spike-timing-dependent plasticity (STDP) is a typical synaptic plasticity that has been extensively found in the sensory systems and that has received much theoretical attention. This synaptic plasticity is highly sensitive to correlations between pre- and postsynaptic firings. Thus, STDP is expected to play an important role in response to external asynchronous stimuli, which can induce segregative pre- and postsynaptic firings. In this paper, we study the impact of external asynchronous stimuli on the organization of structure and dynamics of neural networks through STDP. We construct a two-dimensional spatial neural network model with local connectivity and sparseness, and use external currents to stimulate alternately on different spatial layers. The adopted external currents imposed alternately on spatial layers can be here regarded as external asynchronous stimuli. Through extensive numerical simulations, we focus on the effects of stimulus number and inter-stimulus timing on synaptic connecting weights and the property of propagation dynamics in the resulting network structure. Interestingly, the resulting feedforward structure induced by stimulus-dependent asynchronous firings and its propagation dynamics reflect both the underlying property of STDP. The results imply a possible important role of STDP in generating feedforward structure and collective propagation activity required for experience-dependent map plasticity in developing in vivo sensory pathways and cortices. The relevance of the results to cue-triggered recall of learned temporal sequences, an important cognitive function, is briefly discussed as well. Furthermore, this finding suggests a potential
International Nuclear Information System (INIS)
Candel, A.; Kabel, A.; Ko, K.; Lee, L.; Li, Z.; Limborg, C.; Ng, C.; Prudencio, E.; Schussman, G.; Uplenchwar, R.
2007-01-01
Over the past years, SLAC's Advanced Computations Department (ACD) has developed the parallel finite element (FE) particle-in-cell code Pic3P (Pic2P) for simulations of beam-cavity interactions dominated by space-charge effects. As opposed to standard space-charge dominated beam transport codes, which are based on the electrostatic approximation, Pic3P (Pic2P) includes space-charge, retardation and boundary effects as it self-consistently solves the complete set of Maxwell-Lorentz equations using higher-order FE methods on conformal meshes. Use of efficient, large-scale parallel processing allows for the modeling of photoinjectors with unprecedented accuracy, aiding the design and operation of the next-generation of accelerator facilities. Applications to the Linac Coherent Light Source (LCLS) RF gun are presented
MED5/355: Using Web-technology for Asynchronous Telemedicine Consulting
Reviakin, Y; Sukhanov, A
1999-01-01
Introduction Common telemedicine consultations can be divided in two classes: real-time telemedicine consultations and asynchronous telemedicine consultations. The advantage of real-time consultations is obvious - this is a natural discussion between physicians, which may be realised on the basis of desktop videoconferences. But the problems are also obvious: the necessity of additional hardware and the elevated demands for channel bandwidth. Because of the latter, the use of asynchronous tel...
Averkin, Sergey N.; Gatsonis, Nikolaos A.
2018-06-01
An unstructured electrostatic Particle-In-Cell (EUPIC) method is developed on arbitrary tetrahedral grids for simulation of plasmas bounded by arbitrary geometries. The electric potential in EUPIC is obtained on cell vertices from a finite volume Multi-Point Flux Approximation of Gauss' law using the indirect dual cell with Dirichlet, Neumann and external circuit boundary conditions. The resulting matrix equation for the nodal potential is solved with a restarted generalized minimal residual method (GMRES) and an ILU(0) preconditioner algorithm, parallelized using a combination of node coloring and level scheduling approaches. The electric field on vertices is obtained using the gradient theorem applied to the indirect dual cell. The algorithms for injection, particle loading, particle motion, and particle tracking are parallelized for unstructured tetrahedral grids. The algorithms for the potential solver, electric field evaluation, loading, scatter-gather algorithms are verified using analytic solutions for test cases subject to Laplace and Poisson equations. Grid sensitivity analysis examines the L2 and L∞ norms of the relative error in potential, field, and charge density as a function of edge-averaged and volume-averaged cell size. Analysis shows second order of convergence for the potential and first order of convergence for the electric field and charge density. Temporal sensitivity analysis is performed and the momentum and energy conservation properties of the particle integrators in EUPIC are examined. The effects of cell size and timestep on heating, slowing-down and the deflection times are quantified. The heating, slowing-down and the deflection times are found to be almost linearly dependent on number of particles per cell. EUPIC simulations of current collection by cylindrical Langmuir probes in collisionless plasmas show good comparison with previous experimentally validated numerical results. These simulations were also used in a parallelization
Asynchronous stream processing with S-Net
Grelck, C.; Scholz, S.-B.; Shafarenko, A.
2010-01-01
We present the rationale and design of S-Net, a coordination language for asynchronous stream processing. The language achieves a near-complete separation between the application code, written in any conventional programming language, and the coordination/communication code written in S-Net. Our
Load Balancing of Parallel Monte Carlo Transport Calculations
International Nuclear Information System (INIS)
Procassini, R J; O'Brien, M J; Taylor, J M
2005-01-01
The performance of parallel Monte Carlo transport calculations which use both spatial and particle parallelism is increased by dynamically assigning processors to the most worked domains. Since the particle work load varies over the course of the simulation, this algorithm determines each cycle if dynamic load balancing would speed up the calculation. If load balancing is required, a small number of particle communications are initiated in order to achieve load balance. This method has decreased the parallel run time by more than a factor of three for certain criticality calculations
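The per-cycle decision described above can be sketched as follows. This is an illustrative model only, not the authors' algorithm: it assumes that the time for a domain is its particle work divided by its processor count, uses a greedy assignment, and rebalances only if the estimated gain exceeds a hypothetical threshold `min_gain`. All function names are invented for the example.

```python
def assign_processors(work, total_procs):
    """Greedily give each extra processor to the most loaded domain."""
    if total_procs < len(work):
        raise ValueError("need at least one processor per domain")
    procs = [1] * len(work)  # every spatial domain keeps one processor
    for _ in range(total_procs - len(work)):
        busiest = max(range(len(work)), key=lambda d: work[d] / procs[d])
        procs[busiest] += 1
    return procs


def cycle_time(work, procs):
    """Estimated cycle time: the slowest domain sets the pace."""
    return max(w / p for w, p in zip(work, procs))


def should_rebalance(work, current, total_procs, min_gain=0.1):
    """Decide whether a new assignment is worth its communication cost.

    `min_gain` is an assumed relative speedup threshold standing in
    for the cost of moving particles between processors.
    """
    proposed = assign_processors(work, total_procs)
    old, new = cycle_time(work, current), cycle_time(work, proposed)
    return (old - new) > min_gain * old, proposed
```

With work `[900, 100, 100]` on six processors, an even split `[2, 2, 2]` has an estimated cycle time of 450, while the greedy assignment `[4, 1, 1]` brings it down to 225, so rebalancing would be triggered.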
Using Television Sitcoms to Facilitate Asynchronous Discussions in the Online Communication Course
Tolman, Elizabeth; Asbury, Bryan
2012-01-01
Asynchronous discussions are a useful instructional resource in the online communication course. In discussion groups students have the opportunity to actively participate and interact with students and the instructor. Asynchronous communication allows for flexibility because "participants can interact with significant amounts of time between…
A Fast, High Quality, and Reproducible Parallel Lagged-Fibonacci Pseudorandom Number Generator
Mascagni, Michael; Cuccaro, Steven A.; Pryor, Daniel V.; Robinson, M. L.
1995-07-01
We study the suitability of the additive lagged-Fibonacci pseudo-random number generator for parallel computation. This generator has a relatively short period with respect to the size of its seed. However, the short period is more than made up for by the huge number of full-period cycles it contains. These different full-period cycles are called equivalence classes. We show how to enumerate the equivalence classes and how to compute seeds to select a given equivalence class. In addition, we present some theoretical measures of quality for this generator when used in parallel. Next, we conjecture on the size of these measures of quality for this generator. Extensive empirical evidence supports this conjecture. In addition, a probabilistic interpretation of these measures leads to another conjecture similarly supported by empirical evidence. Finally, we give an explicit parallelization suitable for a fully reproducible asynchronous MIMD implementation.
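For orientation, an additive lagged-Fibonacci generator follows the recurrence x_n = (x_{n-k} + x_{n-l}) mod 2^m. The sketch below shows only this recurrence; it does not implement the paper's enumeration of equivalence classes or its seeding scheme. The class name, the lags (5, 17), and the 32-bit word size are assumptions chosen for the example.

```python
from collections import deque


class LaggedFibonacci:
    """Additive lagged-Fibonacci generator: x_n = x_{n-k} + x_{n-l} mod 2^bits."""

    def __init__(self, seed, short_lag=5, long_lag=17, bits=32):
        if len(seed) != long_lag:
            raise ValueError("seed must contain exactly long_lag words")
        if not any(s & 1 for s in seed):
            # An all-even seed cannot reach a full-period cycle.
            raise ValueError("at least one seed word must be odd")
        self.mask = (1 << bits) - 1
        self.short_lag = short_lag
        # The deque holds the last long_lag outputs, oldest first.
        self.state = deque((s & self.mask for s in seed), maxlen=long_lag)

    def next(self):
        # state[0] is x_{n-long_lag}; state[-short_lag] is x_{n-short_lag}.
        value = (self.state[0] + self.state[-self.short_lag]) & self.mask
        self.state.append(value)  # maxlen drops the oldest word
        return value
```

Two generators built from the same seed produce identical streams regardless of when each is advanced, which is the basic property a reproducible parallel scheme relies on; giving each process a seed from a different cycle is the part addressed by the paper and not shown here.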
Barrera-Valencia, Camilo; Benito-Devia, Alexis Vladimir; Vélez-Álvarez, Consuelo; Figueroa-Barrera, Mario; Franco-Idárraga, Sandra Milena
Telepsychiatry is defined as the use of information and communication technology (ICT) in providing remote psychiatric services. Telepsychiatry is applied using two types of communication: synchronous (real time) and asynchronous (store and forward). To determine the cost-effectiveness of a synchronous and an asynchronous telepsychiatric model in prison inmate patients with symptoms of depression. A cost-effectiveness study was performed on a population consisting of 157 patients from the Establecimiento Penitenciario y Carcelario de Mediana Seguridad de Manizales, Colombia. The sample was determined by applying Zung self-administered surveys for depression (1965) and the Hamilton Depression Rating Scale (HDRS), the latter being the tool used for the comparison. Initial Hamilton score, arrival time, duration of system downtime, and clinical effectiveness variables had normal distributions (P>.05). There were significant differences (P<.001) between care costs for the different models, showing that the mean cost of the asynchronous model is less than synchronous model, and making the asynchronous model more cost-effective. The asynchronous model is the most cost-effective model of telepsychiatry care for patients with depression admitted to a detention centre, according to the results of clinical effectiveness, cost measurement, and patient satisfaction. Copyright © 2016 Asociación Colombiana de Psiquiatría. Publicado por Elsevier España. All rights reserved.
Suarez, CG; van der Mei, HC; Busscher, HJ
1999-01-01
The detachment of polystyrene particles adhering to collector surfaces with different electrostatic charge and hydrophobicity by attachment to a passing air bubble has been studied in a parallel plate flow chamber. Particle detachment decreased linearly with increasing air bubble velocity and
Parallel Monte Carlo simulation of aerosol dynamics
Zhou, K.
2014-01-01
A highly efficient Monte Carlo (MC) algorithm is developed for the numerical simulation of aerosol dynamics, that is, nucleation, surface growth, and coagulation. Nucleation and surface growth are handled with deterministic means, while coagulation is simulated with a stochastic method (Marcus-Lushnikov stochastic process). Operator splitting techniques are used to synthesize the deterministic and stochastic parts in the algorithm. The algorithm is parallelized using the Message Passing Interface (MPI). The parallel computing efficiency is investigated through numerical examples. Near 60% parallel efficiency is achieved for the maximum testing case with 3.7 million MC particles running on 93 parallel computing nodes. The algorithm is verified through simulating various testing cases and comparing the simulation results with available analytical and/or other numerical solutions. Generally, it is found that only a small number (hundreds or thousands) of MC particles is necessary to accurately predict the aerosol particle number density, volume fraction, and so forth, that is, low order moments of the Particle Size Distribution (PSD) function. Accurately predicting the high order moments of the PSD requires a dramatically larger number of MC particles. 2014 Kun Zhou et al.
International Nuclear Information System (INIS)
Lichters, R.; Pfund, R.E.W.; Meyer-ter-Vehn, J.
1997-08-01
The code LPIC++ presented here is based on a one-dimensional, electromagnetic, relativistic PIC code that was originally developed by one of the authors during a PhD thesis at the Max-Planck-Institut fuer Quantenoptik for kinetic simulations of high harmonic generation from overdense plasma surfaces. The code uses essentially the algorithm of Birdsall and Langdon and Villasenor and Bunemann. It is written in C++ in order to be easily extendable and has been parallelized to be able to grow in power linearly with the size of accessible hardware, e.g. massively parallel machines like Cray T3E. The parallel LPIC++ version uses PVM for communication between processors. PVM is public domain software and can be downloaded from the world wide web. A particular strength of LPIC++ lies in its clear program and data structure, which uses chained lists for the organization of grid cells and enables dynamic adjustment of spatial domain sizes in a very convenient way, and therefore easy balancing of processor loads. Also, particles belonging to one cell are linked in a chained list and are immediately accessible from this cell. In addition to this convenient type of data organization in a PIC code, the code shows excellent performance in both its single processor and parallel version. (orig.)
ZONES OF STEADY CAPACITOR EXCITATION IN A MODE OF GENERATION OF TYPICAL ASYNCHRONOUS MACHINES
Directory of Open Access Journals (Sweden)
Postoronca Sv.
2009-12-01
This work considers some features of the capacitor excitation mode of industrial asynchronous electric motors, and of generators built on their base, which can be used in low-power wind installations. The boundaries of the zones of steady capacitor excitation have been determined for asynchronous electric motors with rated power of 0.25-22.0 kW and for generators built on their base, together with the character of the influence of the machine's own losses and of the active load power, expressed in the referred parameters of the equivalent circuit of the asynchronous machine. Some recommendations are given for maintaining the stability of capacitor excitation of asynchronous machines operating in the mode of electric energy generation.
Gigabit Ethernet signal transmission using asynchronous optical code division multiple access.
Ma, Philip Y; Fok, Mable P; Shastri, Bhavin J; Wu, Ben; Prucnal, Paul R
2015-12-15
We propose and experimentally demonstrate a novel architecture for interfacing and transmitting a Gigabit Ethernet (GbE) signal using asynchronous incoherent optical code division multiple access (OCDMA). This is the first such asynchronous incoherent OCDMA system carrying GbE data being demonstrated to be working among multi-users where each user is operating with an independent clock/data rate and is granted random access to the network. Three major components, the GbE interface, the OCDMA transmitter, and the OCDMA receiver are discussed in detail. The performance of the system is studied and characterized through measuring eye diagrams, bit-error rate and packet loss rate in real-time file transfer. Our Letter also addresses the near-far problem and realizes asynchronous transmission and detection of signal.
Asynchronous vs didactic education: it's too early to throw in the towel on tradition.
Jordan, Jaime; Jalali, Azadeh; Clarke, Samuel; Dyne, Pamela; Spector, Tahlia; Coates, Wendy
2013-08-08
Asynchronous, computer based instruction is cost effective, allows self-directed pacing and review, and addresses preferences of millennial learners. Current research suggests there is no significant difference in learning compared to traditional classroom instruction. Data are limited for novice learners in emergency medicine. The objective of this study was to compare asynchronous, computer-based instruction with traditional didactics for senior medical students during a week-long intensive course in acute care. We hypothesized both modalities would be equivalent. This was a prospective observational quasi-experimental study of 4th year medical students who were novice learners with minimal prior exposure to curricular elements. We assessed baseline knowledge with an objective pre-test. The curriculum was delivered in either traditional lecture format (shock, acute abdomen, dyspnea, field trauma) or via asynchronous, computer-based modules (chest pain, EKG interpretation, pain management, trauma). An interactive review covering all topics was followed by a post-test. Knowledge retention was measured after 10 weeks. Pre and post-test items were written by a panel of medical educators and validated with a reference group of learners. Mean scores were analyzed using dependent t-test and attitudes were assessed by a 5-point Likert scale. 44 of 48 students completed the protocol. Students initially acquired more knowledge from didactic education as demonstrated by mean gain scores (didactic: 28.39% ± 18.06; asynchronous 9.93% ± 23.22). Mean difference between didactic and asynchronous = 18.45% with 95% CI [10.40 to 26.50]; p = 0.0001. Retention testing demonstrated similar knowledge attrition: mean gain scores -14.94% (didactic); -17.61% (asynchronous), which was not significantly different: 2.68% ± 20.85, 95% CI [-3.66 to 9.02], p = 0.399. The attitudinal survey revealed that 60.4% of students believed the asynchronous modules were educational and 95
Computing by Temporal Order: Asynchronous Cellular Automata
Directory of Open Access Journals (Sweden)
Michael Vielhaber
2012-08-01
Our concern is the behaviour of the elementary cellular automata with state set {0,1} over the cell set Z/nZ (one-dimensional finite wrap-around case), under all possible update rules (asynchronicity). Over the torus Z/nZ (n <= 11), we will see that the ECA with Wolfram rule 57 maps any v in F_2^n to any w in F_2^n, varying the update rule. We furthermore show that all even (element of the alternating group) bijective functions on the set F_2^n = {0,...,2^n-1} can be computed by ECA57, by iterating it a sufficient number of times with varying update rules, at least for n <= 10. We characterize the non-bijective functions computable by asynchronous rules.
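To make the idea of "varying the update rule" concrete, here is a small sketch that applies Wolfram rule 57 to one cell at a time on a ring Z/nZ, so that the result depends on the order in which cells are updated. Updating exactly one cell per step is an assumption made for this sketch (the abstract allows all possible update rules), and the function names are hypothetical.

```python
RULE = 57  # Wolfram rule number; bit k gives the output for neighbourhood value k


def update_cell(config, i, rule=RULE):
    """Return a new configuration in which only cell i has been updated."""
    n = len(config)
    left = config[(i - 1) % n]    # wrap-around neighbours on the ring Z/nZ
    centre = config[i]
    right = config[(i + 1) % n]
    index = 4 * left + 2 * centre + right
    new_config = list(config)
    new_config[i] = (rule >> index) & 1
    return tuple(new_config)


def run(config, order, rule=RULE):
    """Apply single-cell updates in the given order (the 'update rule')."""
    for i in order:
        config = update_cell(config, i, rule)
    return config


# Usage: the same start state under two different update orders.
start = (0, 0, 0)
result_a = run(start, [0, 1])
result_b = run(start, [1, 0])
```

Here `result_a` and `result_b` differ, because the second update in each order sees a neighbourhood that the first update has already changed. This order dependence is what lets a single local rule realise many different global maps.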
Computational Aspects of Asynchronous CA
Chandesris, Jérôme; Dennunzio, Alberto; Formenti, Enrico; Manzoni, Luca
2011-01-01
This work studies some aspects of the computational power of fully asynchronous cellular automata (ACA). We deal with some notions of simulation between ACA and Turing Machines. In particular, we characterize the updating sequences specifying which are "universal", i.e., allowing a (specific family of) ACA to simulate any TM on any input. We also consider the computational cost of such simulations.
The design of an asynchronous Tiny RISC TM/TR4101 microprocessor core
DEFF Research Database (Denmark)
Christensen, Kåre Tais; Jensen, P.; Korger, P.
1998-01-01
This paper presents the design of an asynchronous version of the TR4101 embedded microprocessor core developed by LSI Logic Inc. The asynchronous processor, called ARISC, was designed using the same CAD tools and the same standard cell library that was used to implement the TR4101. The paper repo...
Romano, Paul Kollath
Monte Carlo particle transport methods are being considered as a viable option for high-fidelity simulation of nuclear reactors. While Monte Carlo methods offer several potential advantages over deterministic methods, there are a number of algorithmic shortcomings that would prevent their immediate adoption for full-core analyses. In this thesis, algorithms are proposed both to ameliorate the degradation in parallel efficiency typically observed for large numbers of processors and to offer a means of decomposing large tally data that will be needed for reactor analysis. A nearest-neighbor fission bank algorithm was proposed and subsequently implemented in the OpenMC Monte Carlo code. A theoretical analysis of the communication pattern shows that the expected cost is O(√N) whereas traditional fission bank algorithms are O(N) at best. The algorithm was tested on two supercomputers, the Intrepid Blue Gene/P and the Titan Cray XK7, and demonstrated nearly linear parallel scaling up to 163,840 processor cores on a full-core benchmark problem. An algorithm for reducing network communication arising from tally reduction was analyzed and implemented in OpenMC. The proposed algorithm groups only particle histories on a single processor into batches for tally purposes; in doing so, it prevents all network communication for tallies until the very end of the simulation. The algorithm was tested, again on a full-core benchmark, and shown to reduce network communication substantially. A model was developed to predict the impact of load imbalances on the performance of domain decomposed simulations. The analysis demonstrated that load imbalances in domain decomposed simulations arise from two distinct phenomena: non-uniform particle densities and non-uniform spatial leakage. The dominant performance penalty for domain decomposition was shown to come from these physical effects rather than insufficient network bandwidth or high latency. The model predictions were verified with
Xiong, Wenjun; Patel, Ragini; Cao, Jinde; Zheng, Wei Xing
In this brief, our purpose is to apply asynchronous and intermittent sampled-data control methods to achieve the synchronization of hierarchical time-varying neural networks. The asynchronous and intermittent sampled-data controllers are proposed for two reasons: 1) the controllers may not transmit the control information simultaneously and 2) the controllers cannot always exist at any time. The synchronization is then discussed for a kind of hierarchical time-varying neural networks based on the asynchronous and intermittent sampled-data controllers. Finally, the simulation results are given to illustrate the usefulness of the developed criteria.
DEFF Research Database (Denmark)
Nielsen, Sune Fallgaard; Sparsø, Jens; Madsen, Jan
2004-01-01
This paper presents a method for behavioral synthesis of asynchronous circuits. Our approach aims at providing a synthesis flow which is very similar to what is found in existing synchronous design tools. We adapt the synchronous behavioral synthesis abstraction into the asynchronous handshake...
FAST: A fully asynchronous and status-tracking pattern for geoprocessing services orchestration
Wu, Huayi; You, Lan; Gui, Zhipeng; Gao, Shuang; Li, Zhenqiang; Yu, Jingmin
2014-09-01
Geoprocessing service orchestration (GSO) provides a unified and flexible way to implement cross-application, long-lived, and multi-step geoprocessing service workflows by coordinating geoprocessing services collaboratively. Usually, geoprocessing services and geoprocessing service workflows are data and/or computing intensive. This intensity may make the execution of a workflow time-consuming. Since it initiates an execution request without blocking other interactions on the client side, an asynchronous mechanism is especially appropriate for GSO workflows. Many critical problems remain to be solved in existing asynchronous patterns for GSO, including difficulties in improving performance, status tracking, and clarifying the workflow structure. These problems are a challenge when orchestrating performance efficiency, making statuses instantly available, and constructing clearly structured GSO workflows. A Fully Asynchronous and Status-Tracking (FAST) pattern that adopts asynchronous interactions throughout the whole communication tier of a workflow is proposed for GSO. The proposed FAST pattern includes a mechanism that actively pushes the latest status to clients instantly and economically. An independent proxy was designed to isolate the status tracking logic from the geoprocessing business logic, which assists the formation of a clear GSO workflow structure. A workflow was implemented in the FAST pattern to simulate the flooding process in the Poyang Lake region. Experimental results show that the proposed FAST pattern can efficiently tackle data/computing intensive geoprocessing tasks. The performance of all collaborative partners was improved due to the asynchronous mechanism throughout the communication tier. A status-tracking mechanism helps users retrieve the latest running status of a GSO workflow in an efficient and instant way. The clear structure of the GSO workflow lowers the barriers for geospatial domain experts and model designers to
Embedded Vehicle Speed Estimation System Using an Asynchronous Temporal Contrast Vision Sensor
Directory of Open Access Journals (Sweden)
D. Bauer
2007-01-01
This article presents an embedded multilane traffic data acquisition system based on an asynchronous temporal contrast vision sensor, and algorithms for vehicle speed estimation developed to make efficient use of the asynchronous high-precision timing information delivered by this sensor. The vision sensor features high temporal resolution with a latency of less than 100 μs, wide dynamic range of 120 dB of illumination, and zero-redundancy, asynchronous data output. For data collection, processing and interfacing, a low-cost digital signal processor is used. The speed of the detected vehicles is calculated from the vision sensor's asynchronous temporal contrast event data. We present three different algorithms for velocity estimation and evaluate their accuracy by means of calibrated reference measurements. The error of the speed estimation of all algorithms is near zero mean and has a standard deviation better than 3% for both traffic flow directions. The results and the accuracy limitations as well as the combined use of the algorithms in the system are discussed.
Chang, Todd P; Pham, Phung K; Sobolewski, Brad; Doughty, Cara B; Jamal, Nazreen; Kwan, Karen Y; Little, Kim; Brenkert, Timothy E; Mathison, David J
2014-08-01
Asynchronous e-learning allows for targeted teaching, particularly advantageous when bedside and didactic education is insufficient. An asynchronous e-learning curriculum has not been studied across multiple centers in the context of a clinical rotation. We hypothesize that an asynchronous e-learning curriculum during the pediatric emergency medicine (EM) rotation improves medical knowledge among residents and students across multiple participating centers. Trainees on pediatric EM rotations at four large pediatric centers from 2012 to 2013 were randomized in a Solomon four-group design. The experimental arms received an asynchronous e-learning curriculum consisting of nine Web-based, interactive, peer-reviewed Flash/HTML5 modules. Postrotation testing and in-training examination (ITE) scores quantified improvements in knowledge. A 2 × 2 analysis of covariance (ANCOVA) tested interaction and main effects, and Pearson's correlation tested associations between module usage, scores, and ITE scores. A total of 256 of 458 participants completed all study elements; 104 had access to asynchronous e-learning modules, and 152 were controls who used the current education standards. No pretest sensitization was found (p = 0.75). Use of asynchronous e-learning modules was associated with an improvement in posttest scores (p effect (partial η(2) = 0.19). Posttest scores correlated with ITE scores (r(2) = 0.14, p e-learning is an effective educational tool to improve knowledge in a clinical rotation. Web-based asynchronous e-learning is a promising modality to standardize education among multiple institutions with common curricula, particularly in clinical rotations where scheduling difficulties, seasonality, and variable experiences limit in-hospital learning. © 2014 by the Society for Academic Emergency Medicine.
Dynamic Load Balancing of Parallel Monte Carlo Transport Calculations
International Nuclear Information System (INIS)
O'Brien, M; Taylor, J; Procassini, R
2004-01-01
The performance of parallel Monte Carlo transport calculations which use both spatial and particle parallelism is increased by dynamically assigning processors to the most worked domains. Since the particle work load varies over the course of the simulation, this algorithm determines each cycle if dynamic load balancing would speed up the calculation. If load balancing is required, a small number of particle communications are initiated in order to achieve load balance. This method has decreased the parallel run time by more than a factor of three for certain criticality calculations
Increasing Student Engagement Using Asynchronous Learning
Northey, Gavin; Bucic, Tania; Chylinski, Mathew; Govind, Rahul
2015-01-01
Student engagement is an ongoing concern for educators because of its positive association with deep learning and educational outcomes. This article tests the use of a social networking site (Facebook) as a tool to facilitate asynchronous learning opportunities that complement face-to-face interactions and thereby enable a stronger learning…
Asynchronous Gossip for Averaging and Spectral Ranking
Borkar, Vivek S.; Makhijani, Rahul; Sundaresan, Rajesh
2014-08-01
We consider two variants of the classical gossip algorithm. The first variant is a version of asynchronous stochastic approximation. We highlight a fundamental difficulty associated with the classical asynchronous gossip scheme, viz., that it may not converge to a desired average, and suggest an alternative scheme based on reinforcement learning that has guaranteed convergence to the desired average. We then discuss a potential application to a wireless network setting with simultaneous link activation constraints. The second variant is a gossip algorithm for distributed computation of the Perron-Frobenius eigenvector of a nonnegative matrix. While the first variant draws upon a reinforcement learning algorithm for an average cost controlled Markov decision problem, the second variant draws upon a reinforcement learning algorithm for risk-sensitive control. We then discuss potential applications of the second variant to ranking schemes, reputation networks, and principal component analysis.
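For context on what "gossip for averaging" means, the sketch below implements the textbook symmetric pairwise scheme, in which a randomly chosen pair of neighbouring nodes replace their values by their mean. This is not the reinforcement-learning scheme proposed in the abstract, nor the classical scheme the authors criticise; the graph, the function name, and the parameters are assumptions chosen for illustration.

```python
import random


def pairwise_gossip(values, edges, steps, seed=0):
    """Randomized pairwise averaging on an undirected graph.

    values: initial node values; edges: list of (i, j) index pairs.
    Each step wakes one random edge, and its two endpoints average.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    x = list(values)
    for _ in range(steps):
        i, j = rng.choice(edges)
        mean = (x[i] + x[j]) / 2.0
        x[i] = x[j] = mean  # symmetric update keeps the global sum unchanged
    return x


# Usage: four nodes on a ring, starting from different values.
ring = [(0, 1), (1, 2), (2, 3), (3, 0)]
final = pairwise_gossip([0.0, 4.0, 8.0, 12.0], ring, steps=2000)
```

Because each symmetric exchange conserves the sum of the values, the only consensus value the nodes can reach on a connected graph is the initial average (6.0 in this usage example). Schemes in which only one endpoint updates do not conserve the sum, which is one way an asynchronous gossip scheme can settle on a value other than the desired average.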
Understanding and Optimizing Asynchronous Low-Precision Stochastic Gradient Descent
De Sa, Christopher; Feldman, Matthew; Ré, Christopher; Olukotun, Kunle
2018-01-01
Stochastic gradient descent (SGD) is one of the most popular numerical algorithms used in machine learning and other domains. Since this is likely to continue for the foreseeable future, it is important to study techniques that can make it run fast on parallel hardware. In this paper, we provide the first analysis of a technique called Buckwild! that uses both asynchronous execution and low-precision computation. We introduce the DMGC model, the first conceptualization of the parameter space that exists when implementing low-precision SGD, and show that it provides a way to both classify these algorithms and model their performance. We leverage this insight to propose and analyze techniques to improve the speed of low-precision SGD. First, we propose software optimizations that can increase throughput on existing CPUs by up to 11×. Second, we propose architectural changes, including a new cache technique we call an obstinate cache, that increase throughput beyond the limits of current-generation hardware. We also implement and analyze low-precision SGD on the FPGA, which is a promising alternative to the CPU for future SGD systems. PMID:29391770
Violation of the equivalence principle for stressed bodies in asynchronous relativity
Energy Technology Data Exchange (ETDEWEB)
Andrade Martins, R. de (Centro de Logica, Epistemologia e Historia da Ciencia, Campinas (Brazil))
1983-12-11
In the recently developed asynchronous formulation of the relativistic theory of extended bodies, the inertial mass of a body does not explicitly depend on its pressure or stress. The detailed analysis of the weight of a box filled with a gas and placed in a weak gravitational field shows that this feature of asynchronous relativity implies a breakdown of the equivalence between inertial and passive gravitational mass for stressed systems.
International Nuclear Information System (INIS)
Techaumnat, B; Eua-arporn, B; Takuma, T
2004-01-01
This paper presents results of calculations of the electric field and dielectrophoretic force on a dielectric particle chain suspended in a host liquid lying between parallel-plate electrodes. The method of calculation is based on the method of multipole images using the multipole re-expansion technique. We have investigated the effect of the particle permittivity, the tilt angle (between the chain and the applied field) and the chain arrangement on the electric field and force. The results show that the electric field intensification rises in accordance with the increase in the ratio of the particle-to-liquid permittivity, Γ_ε. The electric field at the contact point between the particles decreases with increasing tilt angle, while the maximal field at the contact point between the particles and the plate electrodes is almost unchanged. The maximal field can be approximated by a simple formula, which is a quadratic function of Γ_ε. The dielectrophoretic force depends significantly on the distance from other particles or an electrode. However, for the tilt angles in this paper, the horizontal force on the upper particle of the chain always has the direction opposite to the shear direction. The maximal horizontal force of a chain varies in proportion to (Γ_ε - 1)^1.7 if the particles in the chain are still in contact with each other. The approximated force, based on the force on an isolated chain, has been compared with our calculation results. The comparison shows that no approximation model agrees well with our results throughout the range of permittivity ratios
QDP++: Data Parallel Interface for QCD
Energy Technology Data Exchange (ETDEWEB)
Robert Edwards
2003-03-01
This is a user's guide for the C++ binding for the QDP Data Parallel Applications Programmer Interface developed under the auspices of the US Department of Energy Scientific Discovery through Advanced Computing (SciDAC) program. The QDP Level 2 API has the following features: (1) Provides data parallel operations (logically SIMD) on all sites across the lattice or subsets of these sites. (2) Operates on lattice objects, which have an implementation-dependent data layout that is not visible above this API. (3) Hides details of how the implementation maps onto a given architecture, namely how the logical problem grid (i.e., lattice) is mapped onto the machine architecture. (4) Allows asynchronous (non-blocking) shifts of lattice level objects over any permutation map of sites onto sites. However, from the user's view these instructions appear blocking and in fact may be so in some implementations. (5) Provides broadcast operations (filling a lattice quantity from a scalar value(s)), global reduction operations, and lattice-wide operations on various data-type primitives, such as matrices, vectors, and tensor products of matrices (propagators). (6) Provides operator syntax that supports complex expression constructions.
On the theoretical gap between synchronous and asynchronous MPC protocols
DEFF Research Database (Denmark)
Beerliová-Trubíniová, Zuzana; Hirt, Martin; Nielsen, Jesper Buus
2010-01-01
that in the cryptographic setting (with setup), the sole reason for it is the distribution of inputs: given an oracle for input distribution, cryptographically-secure asynchronous MPC is possible with the very same condition as synchronous MPC, namely t ..., we show that such an input-distribution oracle can be reduced to an oracle that allows each party to synchronously broadcast one single message. This means that when one single round of synchronous broadcast is available, then asynchronous MPC is possible at the same condition as synchronous MPC...
Novel Simplified Model for Asynchronous Machine with Consideration of Frequency Characteristic
Directory of Open Access Journals (Sweden)
Changchun Cai
2014-01-01
The frequency characteristic of electric equipment should be considered in the digital simulation of power systems. The traditional asynchronous machine third-order transient model excludes not only the stator transient but also the frequency characteristics, thus narrowing the range of application of the model and resulting in a large error under some special conditions. Based on the physical equivalent circuit and Park model for asynchronous machines, this study proposes a novel asynchronous third-order transient machine model with consideration of the frequency characteristic. In the new definitions of variables, the voltages behind the reactance are redefined as the linear equation of flux linkage. In this way, the rotor voltage equation is not associated with the derivative terms of frequency. However, the derivative terms of frequency should not always be ignored in the application of the traditional third-order transient model. Compared with the traditional third-order transient model, the novel simplified third-order transient model with consideration of the frequency characteristic is more accurate without increasing the order and complexity. Simulation results show that the novel third-order transient model for the asynchronous machine is suitable and effective and is more accurate than the widely used traditional simplified third-order transient model under some special conditions with drastic frequency fluctuations.
Argo: A Time-Elastic Time-Division-Multiplexed NOC using Asynchronous Routers
DEFF Research Database (Denmark)
Kasapaki, Evangelia; Sparsø, Jens
2014-01-01
are either synchronous or mesochronous. We use asynchronous routers to achieve a simpler, smaller, and more robust, self-timed design. Our design exploits the fact that pipelined asynchronous circuits also behave as ripple FIFOs. Thus, it avoids the need for explicit synchronization FIFOs between the routers......In this paper we explore the use of asynchronous routers in a time-division-multiplexed (TDM) network-on-chip (NOC), Argo, that is being developed for a multi-processor platform for hard real-time systems. TDM inherently requires a common time reference, and existing TDM-based NOC designs...... delays derived from a 65nm CMOS implementation, a worstcase analysis shows that a typical design can tolerate a skew of 1-5 cycles (depending on FIFO depths and NI clock frequency). Simulation results of a 2 x 2 NOC confirm this....
Starnet, a high-speed fiber optical network for particle physics application
International Nuclear Information System (INIS)
Bacilieri, P.; Ghiselli, A.; Caccia, B.; Valentini, S.; Ciaffoni, O.; Di Pirro, G.; Ferrer, M.L.; Martini, A.; Pace, E.; Trasatti, L.
1990-01-01
An asynchronous data transmission optical network using single-mode fibers and capable of transmitting frequencies of a few Gbit/s at distances of tens of kilometers is presented. This network (or part of it) is of interest for application in particle physics. (orig.)
Energy Technology Data Exchange (ETDEWEB)
Lasuik, J.; Shalchi, A., E-mail: andreasm4@yahoo.com [Department of Physics and Astronomy, University of Manitoba, Winnipeg, MB R3T 2N2 (Canada)
2017-09-20
Recently, a new theory for the transport of energetic particles across a mean magnetic field was presented. Compared to other nonlinear theories the new approach has the advantage that it provides a full time-dependent description of the transport. Furthermore, a diffusion approximation is no longer part of that theory. The purpose of this paper is to combine this new approach with a time-dependent model for parallel transport and different turbulence configurations in order to explore the parameter regimes for which we get ballistic transport, compound subdiffusion, and normal Markovian diffusion.
Asynchronous vs didactic education: it’s too early to throw in the towel on tradition
2013-01-01
Background Asynchronous, computer based instruction is cost effective, allows self-directed pacing and review, and addresses preferences of millennial learners. Current research suggests there is no significant difference in learning compared to traditional classroom instruction. Data are limited for novice learners in emergency medicine. The objective of this study was to compare asynchronous, computer-based instruction with traditional didactics for senior medical students during a week-long intensive course in acute care. We hypothesized both modalities would be equivalent. Methods This was a prospective observational quasi-experimental study of 4th year medical students who were novice learners with minimal prior exposure to curricular elements. We assessed baseline knowledge with an objective pre-test. The curriculum was delivered in either traditional lecture format (shock, acute abdomen, dyspnea, field trauma) or via asynchronous, computer-based modules (chest pain, EKG interpretation, pain management, trauma). An interactive review covering all topics was followed by a post-test. Knowledge retention was measured after 10 weeks. Pre and post-test items were written by a panel of medical educators and validated with a reference group of learners. Mean scores were analyzed using dependent t-test and attitudes were assessed by a 5-point Likert scale. Results 44 of 48 students completed the protocol. Students initially acquired more knowledge from didactic education as demonstrated by mean gain scores (didactic: 28.39% ± 18.06; asynchronous 9.93% ± 23.22). Mean difference between didactic and asynchronous = 18.45% with 95% CI [10.40 to 26.50]; p = 0.0001. Retention testing demonstrated similar knowledge attrition: mean gain scores −14.94% (didactic); -17.61% (asynchronous), which was not significantly different: 2.68% ± 20.85, 95% CI [−3.66 to 9.02], p = 0.399. The attitudinal survey revealed that 60.4% of students believed the asynchronous
Psychophysiological effects of synchronous versus asynchronous music during cycling.
Lim, Harry B T; Karageorghis, Costas I; Romer, Lee M; Bishop, Daniel T
2014-02-01
Synchronizing movement to a musical beat may reduce the metabolic cost of exercise, but findings to date have been equivocal. Our aim was to examine the degree to which the synchronous application of music moderates the metabolic demands of a cycle ergometer task. Twenty-three recreationally active men made two laboratory visits. During the first visit, participants completed a maximal incremental ramp test on a cycle ergometer. At the second visit, they completed four randomized 6-min cycling bouts at 90% of ventilatory threshold (control, metronome, synchronous music, and asynchronous music). Main outcome variables were oxygen uptake, HR, ratings of dyspnea and limb discomfort, affective valence, and arousal. No significant differences were evident for oxygen uptake. HR was lower under the metronome condition (122 ± 15 bpm) compared to asynchronous music (124 ± 17 bpm) and control (125 ± 16 bpm). Limb discomfort was lower while listening to the metronome (2.5 ± 1.2) and synchronous music (2.3 ± 1.1) compared to control (3.0 ± 1.5). Both music conditions, synchronous (1.9 ± 1.2) and asynchronous (2.1 ± 1.3), elicited more positive affective valence compared to metronome (1.2 ± 1.4) and control (1.2 ± 1.2), while arousal was higher with synchronous music (3.4 ± 0.9) compared to metronome (2.8 ± 1.0) and control (2.8 ± 0.9). Synchronizing movement to a rhythmic stimulus does not reduce metabolic cost but may lower limb discomfort. Moreover, synchronous music has a stronger effect on limb discomfort and arousal when compared to asynchronous music.
Emphasis on the Impact of Asynchronous Media
African Journals Online (AJOL)
ICTs and their utilization is one of the most pertinent issues in the education industry today. ... The paper pointed out specific impact of asynchronous ICT media in ... The paper finally noted that the struggle to be part of the digital world is ...
Heating calculation features at self-start of large asynchronous motor
Shevchenko, A. A.; Temlyakova, Z. S.; Grechkin, V. V.; Vilberger, M. E.
2017-10-01
The article proposes a method for optimizing the incremental heating calculation in the active volume of a large asynchronous motor for certain kinds of load characteristics. The incremental heating calculation is conditioned by the need to determine the aging level of the insulation and to predict a decrease in the electric machine service life. The method for optimizing the incremental heating calculation of asynchronous motor active volume is based on the automation of calculating the heating when simulating the self-starting process of the motor after eliminating an AC drop.
UNIVERSAL REGULAR AUTONOMOUS ASYNCHRONOUS SYSTEMS: ω-LIMIT SETS, INVARIANCE AND BASINS OF ATTRACTION
Directory of Open Access Journals (Sweden)
Serban Vlad
2011-07-01
The asynchronous systems are the non-deterministic real time binary models of the asynchronous circuits from electrical engineering. Autonomy means that the circuits and their models have no input. Regularity means analogies with the dynamical systems, thus such systems may be considered to be real time dynamical systems with a 'vector field'. Universality refers to the case when the state space of the system is the greatest possible in the sense of the inclusion. The purpose of this paper is that of defining, by analogy with the dynamical systems theory, the omega-limit sets, the invariance and the basins of attraction of the universal regular autonomous asynchronous systems.
A novel asynchronous access method with binary interfaces
Directory of Open Access Journals (Sweden)
Torres-Solis Jorge
2008-10-01
Background Traditionally synchronous access strategies require users to comply with one or more time constraints in order to communicate intent with a binary human-machine interface (e.g., mechanical, gestural or neural switches). Asynchronous access methods are preferable, but have not been used with binary interfaces in the control of devices that require more than two commands to be successfully operated. Methods We present the mathematical development and evaluation of a novel asynchronous access method that may be used to translate sporadic activations of binary interfaces into distinct outcomes for the control of devices requiring an arbitrary number of commands to be controlled. With this method, users are required to activate their interfaces only when the device under control behaves erroneously. Then, a recursive algorithm, incorporating contextual assumptions relevant to all possible outcomes, is used to obtain an informed estimate of user intention. We evaluate this method by simulating a control task requiring a series of target commands to be tracked by a model user. Results When compared to a random selection, the proposed asynchronous access method offers a significant reduction in the number of interface activations required from the user. Conclusion This novel access method offers a variety of advantages over traditionally synchronous access strategies and may be adapted to a wide variety of contexts, with primary relevance to applications involving direct object manipulation.
Directory of Open Access Journals (Sweden)
D. G. Patalakh
2018-02-01
Purpose. To develop a method for calculating electromagnetic and electromechanical transients in asynchronous motors without iterations. Methodology. Numerical methods for integrating ordinary differential equations; programming. Findings. The system of equations describing the dynamics of an asynchronous motor contains products of rotor and stator currents and products of the rotor rotation frequency and the currents, so the system is nonlinear. The numerical solution of nonlinear differential equations requires an iterative process at every integration step, and a long-running, poorly converging iteration can slow the calculation. An improvement of the numerical method that removes the iterative process is proposed; as a result, the modeling time is reduced. The improved numerical method is applied to integrate the differential equations describing the dynamics of an asynchronous motor. Originality. The proposed improvement allows numerical integration of differential equations containing products of functions while avoiding an iterative process at every integration step, which shortens the modeling time. Practical value. On the basis of the proposed methodology, a universal, fast program for modeling electromechanical processes in asynchronous motors could be developed.
Dynamic Performances of Asynchronous Machines | Ubeku ...
African Journals Online (AJOL)
The per-phase parameters of a 1.5 hp, 380 V, 50 Hz, 4-pole, 3-phase asynchronous machine used in the simulation were computed from readings obtained in DC, no-load and blocked-rotor tests carried out on the machine in the laboratory. The results obtained from the computer simulations confirmed the capabilities ...
THE ROLE OF OFFLINE METALANGUAGE TALK IN ASYNCHRONOUS COMPUTER-MEDIATED COMMUNICATION
Directory of Open Access Journals (Sweden)
Keiko Kitade
2008-02-01
In order to demonstrate how learners utilize the text-based asynchronous attributes of the Bulletin Board System, this study explored Japanese-as-a-second-language learners' metalanguage episodes (Swain & Lapkin, 1995, 1998) in offline verbal peer speech and online asynchronous discussions with their Japanese key pals. The findings suggest the crucial role of offline collaborative dialogue, the interactional modes in which the episodes occur, and the unique discourse structure of metalanguage episodes concerning online and offline interactions. A high score on the posttest also suggests the high retention of linguistic knowledge constructed through offline peer dialogue. In the offline mode, the learners were able to collaboratively construct knowledge with peers in the stipulated time, while simultaneously focusing on task content in the online interaction. The retrospective interviews and questionnaires reveal the factors that could affect the benefits of the asynchronous computer-mediated communication medium for language learning.
HPC parallel programming model for gyrokinetic MHD simulation
International Nuclear Information System (INIS)
Naitou, Hiroshi; Yamada, Yusuke; Tokuda, Shinji; Ishii, Yasutomo; Yagi, Masatoshi
2011-01-01
The 3-dimensional gyrokinetic PIC (particle-in-cell) code for MHD simulation, Gpic-MHD, was installed on SR16000 (“Plasma Simulator”), which is a scalar cluster system consisting of 8,192 logical cores. The Gpic-MHD code advances particle and field quantities in time. In order to distribute calculations over a large number of logical cores, the total simulation domain in cylindrical geometry was broken up into N_DD-r × N_DD-z (number of radial decompositions times number of axial decompositions) small domains including approximately the same number of particles. The axial direction was uniformly decomposed, while the radial direction was non-uniformly decomposed. N_RP replicas (copies) of each decomposed domain were used (“particle decomposition”). The hybrid parallelization model of multi-threads and multi-processes was employed: threads were parallelized by the auto-parallelization and N_DD-r × N_DD-z × N_RP processes were parallelized by MPI (message-passing interface). The parallelization performance of Gpic-MHD was investigated for the medium size system of N_r × N_θ × N_z = 1025 × 128 × 128 mesh with 4.196 or 8.192 billion particles. The highest speed for the fixed number of logical cores was obtained for two threads, the maximum number of N_DD-z, and optimum combination of N_DD-r and N_RP. The observed optimum speeds demonstrated good scaling up to 8,192 logical cores. (author)
Determination of power and moment on shaft of special asynchronous electric drives
Karandey, V. Yu; Popov, B. K.; Popova, O. B.; Afanasyev, V. L.
2018-03-01
The article considers the determination of power and torque (moment) on the shaft of special asynchronous electric drives. The use of special asynchronous electric drives in mechanical engineering and other industries is relevant. The types of electric drives considered have improved mass and size characteristics compared with single-engine systems. These drives also have design advantages, and their improved characteristics allow the technological process to be realized. However, the creation and design of new electric drives demands the adjustment of existing methods, or the development of new methods and approaches, for calculating parameters. Determining the power and torque on the shaft of special asynchronous electric drives is the main objective during the design of electric drives. This task has been solved based on a method of electromechanical transformation of energy.
Implementation and performance of parallelized elegant
International Nuclear Information System (INIS)
Wang, Y.; Borland, M.
2008-01-01
The program elegant is widely used for design and modeling of linacs for free-electron lasers and energy recovery linacs, as well as storage rings and other applications. As part of a multi-year effort, we have parallelized many aspects of the code, including single-particle dynamics, wakefields, and coherent synchrotron radiation. We report on the approach used for gradual parallelization, which proved very beneficial in getting parallel features into the hands of users quickly. We also report details of parallelization of collective effects. Finally, we discuss performance of the parallelized code in various applications.
Archer, Charles J [Rochester, MN; Blocksome, Michael A [Rochester, MN; Peters, Amanda E [Cambridge, MA; Ratterman, Joseph D [Rochester, MN; Smith, Brian E [Rochester, MN
2012-04-17
Methods, apparatus, and products are disclosed for reducing power consumption while synchronizing a plurality of compute nodes during execution of a parallel application that include: beginning, by each compute node, performance of a blocking operation specified by the parallel application, each compute node beginning the blocking operation asynchronously with respect to the other compute nodes; reducing, for each compute node, power to one or more hardware components of that compute node in response to that compute node beginning the performance of the blocking operation; and restoring, for each compute node, the power to the hardware components having power reduced in response to all of the compute nodes beginning the performance of the blocking operation.
Archer, Charles J [Rochester, MN; Blocksome, Michael A [Rochester, MN; Peters, Amanda A [Rochester, MN; Ratterman, Joseph D [Rochester, MN; Smith, Brian E [Rochester, MN
2012-01-10
Methods, apparatus, and products are disclosed for reducing power consumption while synchronizing a plurality of compute nodes during execution of a parallel application that include: beginning, by each compute node, performance of a blocking operation specified by the parallel application, each compute node beginning the blocking operation asynchronously with respect to the other compute nodes; reducing, for each compute node, power to one or more hardware components of that compute node in response to that compute node beginning the performance of the blocking operation; and restoring, for each compute node, the power to the hardware components having power reduced in response to all of the compute nodes beginning the performance of the blocking operation.
Methodological Reflections on the Use of Asynchronous Online Focus Groups in Health Research
Directory of Open Access Journals (Sweden)
Sarah Williams PhD
2012-09-01
The Internet is increasingly used as a tool in qualitative research. In particular, asynchronous online focus groups are used when factors such as cost, time, or access to participants can make conducting face-to-face research difficult. In this article we consider key methodological issues involved in using asynchronous online focus groups to explore experiences of health and illness. The written nature of Internet communication, the lack of physical presence, and the asynchronous, longitudinal aspects enable participants who might not normally contribute to research studies to reflect on their personal stories before disclosing them to the researcher. Implications for study design, recruitment strategies, and ethics should be considered when deciding whether to use this method.
An Asynchronous IEEE Floating-Point Arithmetic Unit
Directory of Open Access Journals (Sweden)
Joel R. Noche
2007-12-01
An asynchronous floating-point arithmetic unit is designed and tested at the transistor level using Cadence software. It uses CMOS (complementary metal oxide semiconductor) and DCVS (differential cascode voltage switch) logic in a 0.35 µm process using a 3.3 V supply voltage, with dual-rail data and single-rail control signals using four-phase handshaking. Using 17,085 transistors, the unit handles single-precision (32-bit) addition/subtraction, multiplication, division, and remainder using the IEEE 754-1985 Standard for Binary Floating-Point Arithmetic, with rounding and other operations to be handled by separate hardware or software. Division and remainder are done using a restoring subtractive algorithm; multiplication uses an additive algorithm. Exceptions are noted by flags (and not trap handlers) and the output is in single-precision. Previous work on asynchronous floating-point arithmetic units has mostly focused on single operations such as division. This is the first work to the authors' knowledge that can perform floating-point addition, multiplication, division, and remainder using a common datapath.
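The restoring subtractive division named in this abstract can be illustrated at the integer level. The sketch below is not the paper's transistor-level datapath; it is the generic textbook shift-and-subtract scheme, and the function name `restoring_divide` and the 24-bit default width (the width of a single-precision significand including the hidden bit) are assumptions made for illustration.

```python
def restoring_divide(dividend: int, divisor: int, bits: int = 24) -> tuple[int, int]:
    """Unsigned restoring division; returns (quotient, remainder).

    Hypothetical helper for illustration only. `bits` is the operand width.
    """
    if divisor <= 0:
        raise ZeroDivisionError("divisor must be a positive integer")
    if not 0 <= dividend < (1 << bits):
        raise ValueError("dividend does not fit in the given width")

    remainder = 0
    quotient = 0
    for i in range(bits - 1, -1, -1):
        # Shift the next dividend bit (MSB first) into the partial remainder.
        remainder = (remainder << 1) | ((dividend >> i) & 1)
        # Trial subtraction of the divisor.
        trial = remainder - divisor
        if trial >= 0:
            # Subtraction succeeded: keep the result, quotient bit is 1.
            remainder = trial
            quotient |= 1 << i
        # Otherwise "restore": keep the old partial remainder, quotient bit is 0.
    return quotient, remainder


print(restoring_divide(100, 7))  # (14, 2)
```

One quotient bit is produced per step, so the same loop yields both the quotient and the remainder, which is consistent with the abstract's statement that division and remainder share one algorithm.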
Regression analysis of sparse asynchronous longitudinal data.
Cao, Hongyuan; Zeng, Donglin; Fine, Jason P
2015-09-01
We consider estimation of regression models for sparse asynchronous longitudinal observations, where time-dependent responses and covariates are observed intermittently within subjects. Unlike with synchronous data, where the response and covariates are observed at the same time point, with asynchronous data, the observation times are mismatched. Simple kernel-weighted estimating equations are proposed for generalized linear models with either time invariant or time-dependent coefficients under smoothness assumptions for the covariate processes which are similar to those for synchronous data. For models with either time invariant or time-dependent coefficients, the estimators are consistent and asymptotically normal but converge at slower rates than those achieved with synchronous data. Simulation studies evidence that the methods perform well with realistic sample sizes and may be superior to a naive application of methods for synchronous data based on an ad hoc last value carried forward approach. The practical utility of the methods is illustrated on data from a study on human immunodeficiency virus.
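The idea of kernel-weighting mismatched observation times can be sketched for the simplest case. The code below is an assumption-laden illustration, not the authors' estimator: it takes an identity link, a single scalar covariate with one time-invariant coefficient and no intercept, and a Gaussian kernel with a hand-picked bandwidth `h`. Under those assumptions the estimating equation sum K_h(t - s) * x * (y - x * beta) = 0 has a closed-form solution. All names are hypothetical.

```python
import math


def gaussian_kernel(u: float, h: float) -> float:
    """Gaussian kernel with bandwidth h (an assumed choice of kernel)."""
    return math.exp(-0.5 * (u / h) ** 2) / (h * math.sqrt(2.0 * math.pi))


def kernel_weighted_slope(subjects, h: float) -> float:
    """Solve sum K_h(t - s) * x * (y - x * beta) = 0 for beta.

    subjects: list of (responses, covariates) pairs, one per subject, where
    responses is a list of (time, y) and covariates is a list of (time, x).
    Response and covariate times need not coincide.
    """
    num = 0.0
    den = 0.0
    for responses, covariates in subjects:
        for t, y in responses:
            for s, x in covariates:
                # Pairs observed close together in time get more weight.
                w = gaussian_kernel(t - s, h)
                num += w * x * y
                den += w * x * x
    if den == 0.0:
        raise ValueError("no usable response/covariate pairs")
    return num / den


# Toy data: y = 2 * x, with responses and covariates observed at different times.
data = [
    ([(0.3, 2.0), (0.9, 2.0)], [(0.1, 1.0), (0.6, 1.0)]),
    ([(0.2, 6.0), (0.7, 6.0)], [(0.4, 3.0), (0.8, 3.0)]),
]
beta = kernel_weighted_slope(data, h=0.2)
```

Unlike last-value-carried-forward, every within-subject pair contributes, with the kernel down-weighting pairs whose observation times are far apart.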
An Asynchronous Circuit Design Technique for a Flexible 8-Bit Microprocessor
Karaki, Nobuo; Nanmoto, Takashi; Inoue, Satoshi
This paper presents an asynchronous design technique, an enabler for the emerging technology of flexible microelectronics that feature low-temperature processed polysilicon (LTPS) thin-film transistors (TFT) and surface-free technology by laser annealing/ablation (SUFTLA®). The first design instance chosen is an 8-bit microprocessor. LTPS TFTs are good for realizing displays having integrated VLSI circuit at lower costs. However, LTPS TFTs have drawbacks, including substantial deviations in characteristics and the self-heating phenomenon. To solve these problems, the authors adopted the asynchronous circuit design technique and developed an asynchronous design language called Verilog+, which is based on a subset of Verilog HDL® and includes minimal primitives used for describing the communications between modules, and the dedicated tools including a translator called xlator and a synthesizer called ctrlsyn. The flexible 8-bit microprocessor stably operates at 500kHz, drawing 180μA from a 5V power source. The microprocessor's electromagnetic emissions are 21dB less than those of the synchronous counterpart.
Pseudo Asynchronous Level Crossing ADC for ECG Signal Acquisition.
Marisa, T; Niederhauser, T; Haeberlin, A; Wildhaber, R A; Vogel, R; Goette, J; Jacomet, M
2017-02-07
A new pseudo asynchronous level crossing analogue-to-digital converter (ADC) architecture targeted for low-power, implantable, long-term biomedical sensing applications is presented. In contrast to most of the existing asynchronous level crossing ADC designs, the proposed design has no digital-to-analogue converter (DAC) and no continuous time comparators. Instead, the proposed architecture uses an analogue memory cell and dynamic comparators. The architecture retains the signal activity dependent sampling operation by generating events only when the input signal is changing. The architecture offers the advantages of smaller chip area, energy saving and fewer analogue system components. Besides lower energy consumption, the use of dynamic comparators results in a more robust performance in noise conditions. Moreover, dynamic comparators make interfacing the asynchronous level crossing system to synchronous processing blocks simpler. The proposed ADC was implemented in [Formula: see text] complementary metal-oxide-semiconductor (CMOS) technology, the hardware occupies a chip area of 0.0372 mm² and operates from a supply voltage of [Formula: see text] to [Formula: see text]. The ADC's power consumption is as low as 0.6 μW with signal bandwidth from [Formula: see text] to [Formula: see text] and achieves an equivalent number of bits (ENOB) of up to 8 bits.
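Signal-activity-dependent sampling, as described in this abstract, can be mimicked in software. The sketch below is a behavioural illustration only, not the circuit: it assumes a clocked input sequence (loosely matching the "pseudo asynchronous" idea), a stored reference value standing in for the analogue memory cell, and a level spacing `delta`. The function name and event format are hypothetical.

```python
def level_crossing_events(samples, delta):
    """Return (sample_index, direction) events for a level-crossing sampler.

    An event is emitted only when the input has moved at least `delta`
    away from the stored reference; direction is +1 (up) or -1 (down).
    """
    events = []
    if not samples:
        return events
    ref = samples[0]  # stored reference, standing in for the analogue memory cell
    for n, x in enumerate(samples[1:], start=1):
        # Upward crossings: step the reference until it is within delta of x.
        while x - ref >= delta:
            ref += delta
            events.append((n, +1))
        # Downward crossings.
        while ref - x >= delta:
            ref -= delta
            events.append((n, -1))
    return events


signal = [0.0, 0.5, 1.25, 1.5, 0.25, -0.5]
print(level_crossing_events(signal, 1.0))  # [(2, 1), (5, -1)]
```

A flat input produces no events at all, which is the property that makes this style of sampling attractive for sparse signals such as ECG.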
Evolution of a minimal parallel programming model
International Nuclear Information System (INIS)
Lusk, Ewing; Butler, Ralph; Pieper, Steven C.
2017-01-01
Here, we take a historical approach to our presentation of self-scheduled task parallelism, a programming model with its origins in early irregular and nondeterministic computations encountered in automated theorem proving and logic programming. We show how an extremely simple task model has evolved into a system, asynchronous dynamic load balancing (ADLB), and a scalable implementation capable of supporting sophisticated applications on today’s (and tomorrow’s) largest supercomputers; and we illustrate the use of ADLB with a Green’s function Monte Carlo application, a modern, mature nuclear physics code in production use. Our lesson is that by surrendering a certain amount of generality and thus applicability, a minimal programming model (in terms of its basic concepts and the size of its application programmer interface) can achieve extreme scalability without introducing complexity.
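Self-scheduled task parallelism can be illustrated with a shared work pool from which idle workers pull tasks, and into which running tasks may put new ones. The sketch below is a thread-based stand-in written for illustration; it does not use the ADLB API, and `run_self_scheduled` and `split_sum` are hypothetical names.

```python
import queue
import threading


def run_self_scheduled(initial_tasks, handler, num_workers=4):
    """Run tasks from a shared pool; handler(task) returns (output, new_tasks)."""
    pool = queue.Queue()
    results = []
    lock = threading.Lock()
    for task in initial_tasks:
        pool.put(task)

    def worker():
        while True:
            task = pool.get()  # workers schedule themselves by pulling work
            if task is None:   # sentinel: no more work
                pool.task_done()
                return
            try:
                output, new_tasks = handler(task)
                with lock:
                    results.append(output)
                for new_task in new_tasks:
                    pool.put(new_task)  # tasks may spawn further tasks
            finally:
                pool.task_done()

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for th in threads:
        th.start()
    pool.join()  # returns once every task, including spawned ones, is done
    for _ in threads:
        pool.put(None)
    for th in threads:
        th.join()
    return results


def split_sum(task):
    """Sum the integers in [lo, hi), splitting large ranges into two subtasks."""
    lo, hi = task
    if hi - lo <= 4:
        return sum(range(lo, hi)), []
    mid = (lo + hi) // 2
    return 0, [(lo, mid), (mid, hi)]


total = sum(run_self_scheduled([(0, 100)], split_sum))
print(total)  # 4950
```

Because workers take the next task whenever they become free, irregular task sizes balance themselves without any static assignment of work to workers.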
Asynchronous versus Synchronous Learning in Pharmacy Education
Motycka, Carol A.; St. Onge, Erin L.; Williams, Jennifer
2013-01-01
Objective: To better understand the technology being used today in pharmacy education through a review of the current methodologies being employed at various institutions. Also, to discuss the benefits and difficulties of asynchronous and synchronous methodologies, which are being utilized at both traditional and distance education campuses.…
Three-dimensional particle tracking velocimetry using dynamic vision sensors
Borer, D.; Delbruck, T.; Rösgen, T.
2017-12-01
A fast-flow visualization method is presented based on tracking neutrally buoyant soap bubbles with a set of neuromorphic cameras. The "dynamic vision sensors" register only the changes in brightness with very low latency, capturing fast processes at a low data rate. The data consist of a stream of asynchronous events, each encoding the corresponding pixel position, the time instant of the event and the sign of the change in logarithmic intensity. The work uses three such synchronized cameras to perform 3D particle tracking in a medium sized wind tunnel. The data analysis relies on Kalman filters to associate the asynchronous events with individual tracers and to reconstruct the three-dimensional path and velocity based on calibrated sensor information.
International Nuclear Information System (INIS)
Brown, P.; Chang, B.
1998-01-01
The linear Boltzmann transport equation (BTE) is an integro-differential equation arising in deterministic models of neutral and charged particle transport. In slab (one-dimensional Cartesian) geometry and certain higher-dimensional cases, Diffusion Synthetic Acceleration (DSA) is known to be an effective algorithm for the iterative solution of the discretized BTE. Fourier and asymptotic analyses have been applied to various idealizations (e.g., problems on infinite domains with constant coefficients) to obtain sharp bounds on the convergence rate of DSA in such cases. While DSA has been shown to be a highly effective acceleration (or preconditioning) technique in one-dimensional problems, it has been observed to be less effective in higher dimensions. This is due in part to the expense of solving the related diffusion linear system. We investigate here the effectiveness of a parallel semicoarsening multigrid (SMG) solution approach to DSA preconditioning in several three dimensional problems. In particular, we consider the algorithmic and implementation scalability of a parallel SMG-DSA preconditioner on several types of test problems
Asynchronous Free-Space Optical CDMA Communications System for Last-mile Access Network
DEFF Research Database (Denmark)
Jurado-Navas, Antonio; Raddo, Thiago R.; Sanches, Anderson L.
2016-01-01
We propose a new hybrid asynchronous OCDMA-FSO communications system for access network solutions. New ABER expressions are derived under gamma-gamma scintillation channels, where all users can surprisingly achieve error-free transmissions when FEC is employed.
Strict optical orthogonal codes for purely asynchronous code-division multiple-access applications
Zhang, Jian-Guo
1996-12-01
Strict optical orthogonal codes are presented for purely asynchronous optical code-division multiple-access (CDMA) applications. The proposed code can strictly guarantee the peaks of its cross-correlation functions and the sidelobes of any of its autocorrelation functions to have a value of 1 in purely asynchronous data communications. The basic theory of the proposed codes is given. An experiment on optical CDMA systems is also demonstrated to verify the characteristics of the proposed code.
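The correlation constraints mentioned in this abstract can be checked numerically for small codes. The sketch below tests only the periodic (cyclic) constraints of a conventional optical orthogonal code; the stronger, purely asynchronous condition that the "strict" codes of the paper address is not checked here. The helper names and the length-13, weight-3 example code are assumptions chosen for illustration.

```python
def from_positions(positions, length):
    """Build a 0/1 sequence of the given length with ones at `positions`."""
    return [1 if i in positions else 0 for i in range(length)]


def periodic_correlation(a, b, shift):
    """Cyclic correlation of two equal-length 0/1 sequences at a given shift."""
    n = len(a)
    return sum(a[i] * b[(i + shift) % n] for i in range(n))


def max_autocorr_sidelobe(a):
    """Largest cyclic autocorrelation value over all non-zero shifts."""
    return max(periodic_correlation(a, a, s) for s in range(1, len(a)))


def max_crosscorr(a, b):
    """Largest cyclic cross-correlation value over all shifts."""
    return max(periodic_correlation(a, b, s) for s in range(len(a)))


# Example code: two weight-3 codewords of length 13.
c1 = from_positions({0, 1, 4}, 13)
c2 = from_positions({0, 2, 7}, 13)

print(max_autocorr_sidelobe(c1), max_autocorr_sidelobe(c2), max_crosscorr(c1, c2))  # 1 1 1
```

The autocorrelation peak at zero shift equals the code weight (3 here), so bounding the sidelobes and cross-correlations by 1 keeps the desired peak well separated from interference.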
ON THE ISSUE OF VECTOR CONTROL OF THE ASYNCHRONOUS MOTORS
Directory of Open Access Journals (Sweden)
B. I. Firago
2015-01-01
The paper considers the realization of one of the widespread types of vector control for asynchronous motors with a short-circuited rotor. Of the more than 20 vector control types presently known, the following are applied most frequently: direct vector control with velocity pickup (VP), direct vector control without VP, indirect vector control with VP and indirect vector control without VP. Although asynchronous-motor indirect vector control without VP is the simplest and most widespread, the absence of VP does not allow the motor electromagnetic torque to be controlled at zero velocity. This is why electric motor drives with such requirements use vector control with a velocity transducer. The most widely used systems are the direct and indirect vector control systems with x-axis alignment of the synchronously rotating x–y coordinate frame along the rotor flux-linkage vector, inasmuch as this provides the simplest correlations for the controlling variables. Although these two types of vector control are well presented in the literature, a number of issues concerning their realization and practical application require further elaboration. These include an adequate representation of the block schemes consistent with the modern realization of vector control, and clarification of the analytical expressions for evaluating the regulator parameters. The authors present a technique for evaluating the dynamics of an asynchronous electric motor drive with direct vector control and x-axis alignment along the vector of rotor flux linkage. The article offers a generalized structure of this vector control type with a detailed description of its principal blocks: controlling system, frequency converter, and the asynchronous motor. The paper presents a direct vector control simulating model developed in the MatLab environment on the grounds of this structure. The authors illustrate the described technique with the results
Data Collection for Mobile Group Consumption: An Asynchronous Distributed Approach
Directory of Open Access Journals (Sweden)
Weiping Zhu
2016-04-01
Mobile group consumption refers to consumption by a group of people, such as a couple, a family, colleagues and friends, based on mobile communications. It differs from consumption only involving individuals, because of the complex relations among group members. Existing data collection systems for mobile group consumption are centralized, which has the disadvantages of being a performance bottleneck, having single-point failure and increasing business and security risks. Moreover, these data collection systems are based on a synchronized clock, which is often unrealistic because of hardware constraints, privacy concerns or synchronization cost. In this paper, we propose the first asynchronous distributed approach to collecting data generated by mobile group consumption. We formally built a system model thereof based on asynchronous distributed communication. We then designed a simulation system for the model for which we propose a three-layer solution framework. After that, we describe how to detect the causality relation of two/three gathering events that happened in the system based on the collected data. Various definitions of causality relations based on asynchronous distributed communication are supported. Extensive simulation results show that the proposed approach is effective for data collection relating to mobile group consumption.
Data Collection for Mobile Group Consumption: An Asynchronous Distributed Approach.
Zhu, Weiping; Chen, Weiran; Hu, Zhejie; Li, Zuoyou; Liang, Yue; Chen, Jiaojiao
2016-04-06
Mobile group consumption refers to consumption by a group of people, such as a couple, a family, colleagues and friends, based on mobile communications. It differs from consumption only involving individuals, because of the complex relations among group members. Existing data collection systems for mobile group consumption are centralized, which has the disadvantages of being a performance bottleneck, having single-point failure and increasing business and security risks. Moreover, these data collection systems are based on a synchronized clock, which is often unrealistic because of hardware constraints, privacy concerns or synchronization cost. In this paper, we propose the first asynchronous distributed approach to collecting data generated by mobile group consumption. We formally built a system model thereof based on asynchronous distributed communication. We then designed a simulation system for the model for which we propose a three-layer solution framework. After that, we describe how to detect the causality relation of two/three gathering events that happened in the system based on the collected data. Various definitions of causality relations based on asynchronous distributed communication are supported. Extensive simulation results show that the proposed approach is effective for data collection relating to mobile group consumption.
Laser dynamics of asynchronous rational harmonic mode-locked fiber soliton lasers
International Nuclear Information System (INIS)
Jyu, Siao-Shan; Jiang, Guo-Hao; Lai, Yinchieh
2013-01-01
Laser dynamics of asynchronous rational harmonic mode-locked (ARHM) fiber soliton lasers are investigated in detail. In particular, based on the unique laser dynamics of asynchronous mode-locking, we have developed a new method for determining the effective active modulation strength in situ for ARHM lasers. By measuring the magnitudes of the slowly oscillating pulse timing position and central frequency, the effective phase modulation strength at the multiplication frequency of rational harmonic mode-locking can be accurately inferred. The method can be a very useful tool for developing ARHM fiber lasers. (paper)
A low-power asynchronous data-path for a FIR filter bank
DEFF Research Database (Denmark)
Nielsen, Lars Skovby; Sparsø, Jens
1996-01-01
This paper describes a number of design issues relating to the implementation of low-power asynchronous signal processing circuits. Specifically, the paper addresses the design of a dedicated processor structure that implements an audio FIR filter bank which is part of an industrial application....... The algorithm requires a fixed number of steps and the moderate speed requirement allows a sequential implementation. The latter, in combination with a huge predominance of numerically small data values in the input data stream, is the key to a low-power asynchronous implementation. Power is minimized in two...
Adaptive hatching hypotheses do not explain asynchronous ...
African Journals Online (AJOL)
At the core of the suite of adaptive hatching hypotheses advanced to explain asynchronous hatching in birds is the assumption that if food is not limited then all the hatchlings will develop normally to adulthood. In this study Brown-headed Parrot Poicephalus cryptoxanthus chicks were hand fed and weighed on a daily basis.
Directory of Open Access Journals (Sweden)
Jiayi Wu
Structural heterogeneity in single-particle cryo-electron microscopy (cryo-EM) data represents a major challenge for high-resolution structure determination. Unsupervised classification may serve as the first step in the assessment of structural heterogeneity. However, traditional algorithms for unsupervised classification, such as K-means clustering and maximum likelihood optimization, may classify images into wrong classes with decreasing signal-to-noise-ratio (SNR) in the image data, yet demand increased computational costs. Overcoming these limitations requires further development of clustering algorithms for high-performance cryo-EM data processing. Here we introduce an unsupervised single-particle clustering algorithm derived from a statistical manifold learning framework called generative topographic mapping (GTM). We show that unsupervised GTM clustering improves classification accuracy by about 40% in the absence of input references for data with lower SNRs. Applications to several experimental datasets suggest that our algorithm can detect subtle structural differences among classes via a hierarchical clustering strategy. After code optimization over a high-performance computing (HPC) environment, our software implementation was able to generate thousands of reference-free class averages within hours in a massively parallel fashion, which allows a significant improvement on ab initio 3D reconstruction and assists in the computational purification of homogeneous datasets for high-resolution visualization.
Wu, Jiayi; Ma, Yong-Bei; Congdon, Charles; Brett, Bevin; Chen, Shuobing; Xu, Yaofang; Ouyang, Qi; Mao, Youdong
2017-01-01
Structural heterogeneity in single-particle cryo-electron microscopy (cryo-EM) data represents a major challenge for high-resolution structure determination. Unsupervised classification may serve as the first step in the assessment of structural heterogeneity. However, traditional algorithms for unsupervised classification, such as K-means clustering and maximum likelihood optimization, may classify images into wrong classes with decreasing signal-to-noise-ratio (SNR) in the image data, yet demand increased computational costs. Overcoming these limitations requires further development of clustering algorithms for high-performance cryo-EM data processing. Here we introduce an unsupervised single-particle clustering algorithm derived from a statistical manifold learning framework called generative topographic mapping (GTM). We show that unsupervised GTM clustering improves classification accuracy by about 40% in the absence of input references for data with lower SNRs. Applications to several experimental datasets suggest that our algorithm can detect subtle structural differences among classes via a hierarchical clustering strategy. After code optimization over a high-performance computing (HPC) environment, our software implementation was able to generate thousands of reference-free class averages within hours in a massively parallel fashion, which allows a significant improvement on ab initio 3D reconstruction and assists in the computational purification of homogeneous datasets for high-resolution visualization.
Mišković, Zoran L.; Akbari, Kamran; Segui, Silvina; Gervasoni, Juana L.; Arista, Néstor R.
2018-05-01
We present a fully relativistic formulation for the energy loss rate of a charged particle moving parallel to a sheet containing two-dimensional electron gas, allowing that its in-plane polarization may be described by different longitudinal and transverse conductivities. We apply our formulation to the case of a doped graphene layer in the terahertz range of frequencies, where excitation of the Dirac plasmon polariton (DPP) in graphene plays a major role. By using the Drude model with zero damping we evaluate the energy loss rate due to excitation of the DPP, and show that the retardation effects are important when the incident particle speed and its distance from graphene both increase. Interestingly, the retarded energy loss rate obtained in this manner may be both larger and smaller than its non-retarded counterpart for different combinations of the particle speed and distance.
Energy Technology Data Exchange (ETDEWEB)
Peeters, J.; Van Dorst, C. [Hyteps, Gemert (Netherlands)
2008-10-15
The three-phase asynchronous motor has been used in a wide range of installations for as long as anyone can remember. Although the motor is efficient at full mechanical load, it is not always applied efficiently. Can the efficiency of lightly loaded motors be improved, or is this a utopia? The Sinusoidal Motor Efficiency Controller (SinuMEC) improves efficiency, saves energy and extends the life span.
Asynchronous L1-gain control of uncertain switched positive linear systems with dwell time.
Li, Yang; Zhang, Hongbin
2018-04-01
In this paper, dwell time (DT) stability, L1-gain performance analysis and asynchronous L1-gain controller design problems of uncertain switched positive linear systems (SPLSs) are investigated. Via a time-scheduled multiple linear co-positive Lyapunov function (TSMLCLF) approach, convex sufficient conditions of DT stability and L1-gain performance of SPLSs with interval and polytopic uncertainties are presented. Furthermore, by utilizing the feature that the TSMLCLF keeps decreasing even if the controller is running asynchronously with the system, the asynchronous L1-gain controller design problem of SPLSs with interval and polytopic uncertainties is investigated. Convex sufficient conditions of the existence of time-varying asynchronous state-feedback controller which can ensure the closed-loop system's positivity, stability and L1-gain performance are established, and the controller gain matrices can be calculated instantaneously online. The obtained L1-gain in the paper is standard. All the results are presented in terms of linear programming. A practical example is provided to show the effectiveness of the results. Copyright © 2018 ISA. Published by Elsevier Ltd. All rights reserved.
International Nuclear Information System (INIS)
Damek, Nawel; Kamoun, Samira
2011-01-01
In this communication, two recursive parametric estimation algorithms are analyzed and applied to a squirrel-cage asynchronous machine located at the research "Unit of Automatic Control" (UCA) at ENIS. The first algorithm, which uses the transfer matrix mathematical model, is based on the gradient principle. The second algorithm, which uses the state-space mathematical model, is based on the minimization of the estimation error. These algorithms are applied as a key technique to estimate the parameters of an asynchronous machine when they are unknown but constant or time-varying. Stator voltage and current are used as measured data. The proposed recursive parametric estimation algorithms are validated on experimental data from an asynchronous machine under normal operating conditions at full load. The results show that these algorithms can estimate the machine parameters effectively and reliably.
An Efficient Algorithm for Computing Attractors of Synchronous And Asynchronous Boolean Networks
Zheng, Desheng; Yang, Guowu; Li, Xiaoyu; Wang, Zhicai; Liu, Feng; He, Lei
2013-01-01
Biological networks, such as genetic regulatory networks, often contain positive and negative feedback loops that settle down to dynamically stable patterns. Identifying these patterns, the so-called attractors, can provide important insights for biologists to understand the molecular mechanisms underlying many coordinated cellular processes such as cellular division, differentiation, and homeostasis. Both synchronous and asynchronous Boolean networks have been used to simulate genetic regulatory networks and identify their attractors. The common methods of computing attractors start with a randomly selected initial state and finish with an exhaustive search of the state space of a network. However, the time complexity of these methods grows exponentially with respect to the number and length of attractors. Here, we build two algorithms to compute attractors in synchronous and asynchronous Boolean networks. For the synchronous scenario, combining iterative methods with reduced ordered binary decision diagrams (ROBDD), we propose an improved algorithm to compute attractors. In the second algorithm, the attractors of synchronous Boolean networks are utilized in asynchronous Boolean translation functions to derive the attractors of the asynchronous scenario. The proposed algorithms are implemented in a procedure called geneFAtt. Compared to existing tools such as genYsis, geneFAtt is significantly faster in computing attractors for empirical experimental systems. Availability: The software package is available at https://sites.google.com/site/desheng619/download. PMID:23585840
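To make the notion of an attractor concrete, the following minimal sketch enumerates the attractors of a small synchronous Boolean network by following every state until it revisits one. This is the exhaustive baseline whose cost grows exponentially, not the ROBDD-based method of geneFAtt; the three-gene update rule `toy_network` is a made-up example.

```python
from itertools import product
from typing import Callable, Dict, FrozenSet, List, Set, Tuple

State = Tuple[int, ...]


def synchronous_attractors(
    update: Callable[[State], State], n: int
) -> Set[FrozenSet[State]]:
    """Return every attractor (fixed point or cycle) of an n-node network.

    All nodes are updated at once (synchronous semantics), so each state has
    exactly one successor and every trajectory ends in a cycle.
    """
    attractors: Set[FrozenSet[State]] = set()
    for start in product((0, 1), repeat=n):
        first_visit: Dict[State, int] = {}
        path: List[State] = []
        state = start
        while state not in first_visit:
            first_visit[state] = len(path)
            path.append(state)
            state = update(state)
        # The states from the first visit of the repeated state form the cycle.
        attractors.add(frozenset(path[first_visit[state]:]))
    return attractors


def toy_network(s: State) -> State:
    """Hypothetical rules: a' = b, b' = a, c' = a AND b."""
    a, b, c = s
    return (b, a, a & b)


if __name__ == "__main__":
    for attractor in synchronous_attractors(toy_network, 3):
        print(sorted(attractor))
```

For this toy rule set the search finds two fixed points and one cycle of length two; the loop over all 2^n start states is what makes the approach impractical for large networks.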
Asynchronous Channel-Hopping Scheme under Jamming Attacks
Directory of Open Access Journals (Sweden)
Yongchul Kim
2018-01-01
Cognitive radio networks (CRNs) are considered an attractive technology to mitigate inefficiency in the usage of licensed spectrum. CRNs allow the secondary users (SUs) to access the unused licensed spectrum and use a blind rendezvous process to establish communication links between SUs. In particular, quorum-based channel-hopping (CH) schemes have been studied recently to provide guaranteed blind rendezvous in decentralized CRNs without using global time synchronization. However, these schemes remain vulnerable to jamming attacks. In this paper, we first analyze the limitations of quorum-based rendezvous schemes called asynchronous channel hopping (ACH). Then, we introduce a novel sequence sensing jamming attack (SSJA) model in which a sophisticated jammer can dramatically reduce the rendezvous success rates of ACH schemes. In addition, we propose a fast and robust asynchronous rendezvous scheme (FRARS) that can significantly enhance robustness under jamming attacks. Our numerical results demonstrate that the proposed scheme vastly outperforms the ACH scheme when there are security concerns about a sequence sensing jammer.
Kwek, Jin Wang
2011-07-01
A combination of small parallel plate condenser with Indium Tin Oxide (ITO) glass slides as electrodes and an atomic force microscope (AFM) is used to characterize the electrostatic behavior of single glass bead microparticles (105-150 μm) glued to the AFM cantilever. This novel setup allows measurements of the electrostatic forces acting on a particle in an applied electrical field to be performed in ambient air conditions. By varying the position of the microparticle between the electrodes and the strength of the applied electric field, the relative contributions of the particle net charge, induced and image charges were investigated. When the microparticle is positioned in the middle of the electrodes, the force acting on the microparticle was linear with the applied electric field and proportional to the microparticle net charge. At distances close to the bottom electrode, the force follows a parabolic relationship with the applied electric field reflecting the contributions of induced and image charges. The method can be used for the rapid evaluation of the charging and polarizability properties of the microparticle as well as an alternative to the conventional Faraday's pail technique. © 2011 Elsevier B.V.
Positive semidefinite integrated covariance estimation, factorizations and asynchronicity
DEFF Research Database (Denmark)
Boudt, Kris; Laurent, Sébastien; Lunde, Asger
2017-01-01
An estimator of the ex-post covariation of log-prices under asynchronicity and microstructure noise is proposed. It uses the Cholesky factorization of the covariance matrix in order to exploit the heterogeneity in trading intensities to estimate the different parameters sequentially with as many...
Positive Semidefinite Integrated Covariance Estimation, Factorizations and Asynchronicity
DEFF Research Database (Denmark)
Boudt, Kris; Laurent, Sébastien; Lunde, Asger
An estimator of the ex-post covariation of log-prices under asynchronicity and microstructure noise is proposed. It uses the Cholesky factorization on the correlation matrix in order to exploit the heterogeneity in trading intensity to estimate the different parameters sequentially with as many...
Energy Technology Data Exchange (ETDEWEB)
Maneva, Y. G.; Poedts, Stefaan [Centre for Mathematical Plasma Astrophysics, KU Leuven, B-3001 Leuven (Belgium); Viñas, Adolfo F.; Moya, Pablo S.; Wicks, Robert T., E-mail: yana.maneva@wis.kuleuven.be [NASA Goddard Space Flight Center, Greenbelt, MD 20771 (United States)
2015-11-20
We perform 2.5D hybrid simulations with massless fluid electrons and kinetic particle-in-cell ions to study the temporal evolution of ion temperatures, temperature anisotropies, and velocity distribution functions in relation to the dissipation and turbulent evolution of a broadband spectrum of parallel and obliquely propagating Alfvén-cyclotron waves. The purpose of this paper is to study the relative role of parallel versus oblique Alfvén-cyclotron waves in the observed heating and acceleration of alpha particles in the fast solar wind. We consider collisionless homogeneous multi-species plasma, consisting of isothermal electrons, isotropic protons, and a minor component of drifting α particles in a finite-β fast stream near the Earth. The kinetic ions are modeled by initially isotropic Maxwellian velocity distribution functions, which develop nonthermal features and temperature anisotropies when a broadband spectrum of low-frequency nonresonant, ω ≤ 0.34 Ω_p, Alfvén-cyclotron waves is imposed at the beginning of the simulations. The initial plasma parameter values, such as ion density, temperatures, and relative drift speeds, are supplied by fast solar wind observations made by the Wind spacecraft at 1 AU. The imposed broadband wave spectra are left-hand polarized and resemble Wind measurements of Alfvénic turbulence in the solar wind. The imposed magnetic field fluctuations for all cases are within the inertial range of the solar wind turbulence and have a Kraichnan-type spectral slope α = −3/2. We vary the propagation angle from θ = 0° to θ = 30° and θ = 60°, and find that the heating of alpha particles is most efficient for the highly oblique waves propagating at 60°, whereas the protons exhibit perpendicular cooling at all propagation angles.
International Nuclear Information System (INIS)
Maneva, Y. G.; Poedts, Stefaan; Viñas, Adolfo F.; Moya, Pablo S.; Wicks, Robert T.
2015-01-01
We perform 2.5D hybrid simulations with massless fluid electrons and kinetic particle-in-cell ions to study the temporal evolution of ion temperatures, temperature anisotropies, and velocity distribution functions in relation to the dissipation and turbulent evolution of a broadband spectrum of parallel and obliquely propagating Alfvén-cyclotron waves. The purpose of this paper is to study the relative role of parallel versus oblique Alfvén-cyclotron waves in the observed heating and acceleration of alpha particles in the fast solar wind. We consider collisionless homogeneous multi-species plasma, consisting of isothermal electrons, isotropic protons, and a minor component of drifting α particles in a finite-β fast stream near the Earth. The kinetic ions are modeled by initially isotropic Maxwellian velocity distribution functions, which develop nonthermal features and temperature anisotropies when a broadband spectrum of low-frequency nonresonant, ω ≤ 0.34 Ω_p, Alfvén-cyclotron waves is imposed at the beginning of the simulations. The initial plasma parameter values, such as ion density, temperatures, and relative drift speeds, are supplied by fast solar wind observations made by the Wind spacecraft at 1 AU. The imposed broadband wave spectra are left-hand polarized and resemble Wind measurements of Alfvénic turbulence in the solar wind. The imposed magnetic field fluctuations for all cases are within the inertial range of the solar wind turbulence and have a Kraichnan-type spectral slope α = −3/2. We vary the propagation angle from θ = 0° to θ = 30° and θ = 60°, and find that the heating of alpha particles is most efficient for the highly oblique waves propagating at 60°, whereas the protons exhibit perpendicular cooling at all propagation angles
Beam dynamics simulations using a parallel version of PARMILA
International Nuclear Information System (INIS)
Ryne, R.D.
1996-01-01
The computer code PARMILA has been the primary tool for the design of proton and ion linacs in the United States for nearly three decades. Previously it was sufficient to perform simulations with of order 10000 particles, but recently the need to perform high resolution halo studies for next-generation, high intensity linacs has made it necessary to perform simulations with of order 100 million particles. With the advent of massively parallel computers such simulations are now within reach. Parallel computers already make it possible, for example, to perform beam dynamics calculations with tens of millions of particles, requiring over 10 GByte of core memory, in just a few hours. Also, parallel computers are becoming easier to use thanks to the availability of mature, Fortran-like languages such as Connection Machine Fortran and High Performance Fortran. We will describe our experience developing a parallel version of PARMILA and the performance of the new code
Beam dynamics simulations using a parallel version of PARMILA
International Nuclear Information System (INIS)
Ryne, Robert
1996-01-01
The computer code PARMILA has been the primary tool for the design of proton and ion linacs in the United States for nearly three decades. Previously it was sufficient to perform simulations with of order 10000 particles, but recently the need to perform high resolution halo studies for next-generation, high intensity linacs has made it necessary to perform simulations with of order 100 million particles. With the advent of massively parallel computers such simulations are now within reach. Parallel computers already make it possible, for example, to perform beam dynamics calculations with tens of millions of particles, requiring over 10 GByte of core memory, in just a few hours. Also, parallel computers are becoming easier to use thanks to the availability of mature, Fortran-like languages such as Connection Machine Fortran and High Performance Fortran. We will describe our experience developing a parallel version of PARMILA and the performance of the new code. (author)
Particle injection and cosmic ray acceleration at collisionless parallel shocks
International Nuclear Information System (INIS)
Quest, K.B.
1987-01-01
The structure of collisionless parallel shocks is studied using one-dimensional hybrid simulations, with emphasis on particle injection into the first-order Fermi acceleration process. It is argued that for sufficiently high Mach number shocks, and in the absence of wave turbulence, the fluid firehose marginal stability condition will be exceeded at the interface between the upstream, unshocked, plasma and the heated plasma downstream. As a consequence, nonlinear, low-frequency, electromagnetic waves are generated and act to slow the plasma and provide dissipation for the shock. It is shown that large amplitude waves at the shock ramp scatter a small fraction of the upstream ions back into the upstream medium. These ions, in turn, resonantly generate the electromagnetic waves that are swept back into the shock. As these waves propagate through the shock they are compressed and amplified, allowing them to non-resonantly scatter the bulk of the plasma. Moreover, the compressed waves back-scatter a small fraction of the upstream ions, maintaining the shock structure in a quasi-steady state. The back-scattered ions are accelerated during the wave generation process to 2 to 4 times the ram energy and provide a likely seed population for cosmic rays. 49 refs., 7 figs
International Nuclear Information System (INIS)
Lee, Handol; Yook, Sejin; Han, Seogyoung
2012-01-01
The deposition velocity is used to assess the degree of particulate contamination of wafers or photomasks. A numerical model was developed to predict the deposition velocity under the combined influences of thermophoresis and electrophoresis. The deposition velocity onto a face-up flat plate in parallel airflow was simulated by varying the temperature difference between the plate's surface and ambient air or by changing the strength of the electric field established above the plate. Both attraction and repulsion by thermophoresis or electrophoresis were considered. When the plate's surface was colder than ambient air, the surface of the face-up plate could be at risk of contamination by charged particles even with a repulsive applied electric force. When the temperature of the plate's surface was higher than the ambient temperature, the degree of particulate contamination on the surface of the face-up plate could be remarkably reduced in the presence of an electric field. The effect of repulsive thermophoresis, however, is expected to be reduced for very fine particles of high electric mobility or for micrometer-sized particles with large gravitational settling speed when the charged particles are influenced by an attractive electric force.
Verification and Planning for Stochastic Processes with Asynchronous Events
National Research Council Canada - National Science Library
Younes, Hakan L
2005-01-01
.... The most common assumption is that of history-independence: the Markov assumption. In this thesis, the author considers the problems of verification and planning for stochastic processes with asynchronous events, without relying on the Markov assumption...
International Nuclear Information System (INIS)
Chankin, A. V.; Stangeby, P. C.
2006-01-01
A system of plasma particle and parallel momentum balance equations is derived appropriate for understanding the role of drifts in the edge and for edge modelling, particularly in the scrape-off layer (SOL) of tokamaks, stellarators and other magnetic confinement devices. The formulation allows for strong collisionality, but also covers the case of weak collisionality and strong drifts, a combination often encountered in the SOL. The most important terms are identified by assessing the magnitude of characteristic velocities and fluxes for the plasma edge region. Explanations of the physical nature of each term are provided. A number of terms that are sometimes not included in edge modelling have been included in the parallel momentum balance equation after detailed analysis of the parallel component of the gradient of the total pressure-stress tensor. This includes terms related to curvature and divergence of the field lines, as well as further contributions coming from viscous forces related mainly to the ion centrifugal drift. All these terms are shown to be roughly of the same order of magnitude as convective momentum fluxes related to drifts and therefore should be included in the momentum balance equation
Effect of field-aligned-beam in parallel diffusion of energetic particles in the Earth's foreshock
Matsukiyo, S.; Nakanishi, K.; Otsuka, F.; Kis, A.; Lemperger, I.; Hada, T.
2016-12-01
Diffusive shock acceleration (DSA) is one of the plausible acceleration mechanisms of cosmic rays. In the standard DSA model the partial density of the accelerated particles that have diffused upstream decreases exponentially as the distance to the shock increases. Kis et al. (GRL, 31, L20801, 2004) examined the density gradients of energetic ions upstream of the bow shock with high accuracy by using Cluster data. They estimated the diffusion coefficients of energetic ions for the event on February 18, 2003 and showed that the obtained diffusion coefficients are significantly smaller than those estimated in the past statistical study. This implies that particle acceleration at the bow shock can be more efficient than considered before. Here, we focus on the effect of the field-aligned beam (FAB), which is often observed in the foreshock, and examine how the FAB affects the efficiency of diffusion of the energetic ions by performing test particle simulations. The upstream turbulence is given by the superposition of parallel Alfvén waves with a power-law energy spectrum under the random phase approximation. To this spectrum we further add a peak corresponding to the waves resonantly generated by the FAB. The dependence of the diffusion coefficient on the presence of the FAB, as well as on the total energy of the turbulence, the power-law index of the turbulence, and the intensity of the FAB-oriented waves, is discussed.
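The link between an exponential upstream density profile and a diffusion coefficient can be sketched as follows. The code assumes the steady-state one-dimensional balance between diffusion and convection, in which the density falls off as n(x) = n0 exp(-x/L) with e-folding distance L = κ/V, so that κ = L·V. The function names, the least-squares fit and the numbers in the usage example are illustrative assumptions, not the procedure or data of the cited work.

```python
import math
from typing import Sequence


def e_folding_distance(distances: Sequence[float], densities: Sequence[float]) -> float:
    """Fit ln(n) = ln(n0) - x / L by ordinary least squares and return L."""
    logs = [math.log(n) for n in densities]
    k = len(distances)
    mean_x = sum(distances) / k
    mean_y = sum(logs) / k
    sxx = sum((x - mean_x) ** 2 for x in distances)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(distances, logs))
    slope = sxy / sxx  # negative for a density that decays away from the shock
    return -1.0 / slope


def diffusion_coefficient(e_fold: float, flow_speed: float) -> float:
    """kappa = L * V, with V the upstream flow speed along the shock normal."""
    return e_fold * flow_speed


if __name__ == "__main__":
    # Synthetic profile with L = 2 (arbitrary length units), not measured data.
    xs = [0.0, 1.0, 2.0, 3.0]
    ns = [math.exp(-x / 2.0) for x in xs]
    length = e_folding_distance(xs, ns)
    print(length, diffusion_coefficient(length, 400.0))
```

A steeper density gradient (smaller L) at a given flow speed therefore corresponds to a smaller diffusion coefficient, which is the sense in which the abstract speaks of more efficient acceleration.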
Al Dobaikhi, Hend; Woollard, John
2011-01-01
The impacts of emerging ICT into educational curricula; Asynchronous discussion forum; Discussion groups via e-learning environment; Posting questions and comments; Self-efficacy in asynchronous e-learning; Web community participation; Collaborative learning can be fostered; Positive impacts on objectives of educational curriculum
Adding the Human Touch to Asynchronous Online Learning
Glenn, Cynthia Wheatley
2018-01-01
For learners to actively accept responsibility in a virtual classroom platform, it is necessary to provide special motivation extending across the traditional classroom setting into asynchronous online learning. This article explores specific ways to do this that bridge the gap between ground and online students' learning experiences, and how…
Designing a Web-Based Asynchronous Innovation/Entrepreneurism Course
Ghandforoush, Parviz
2017-01-01
Teaching an online fully asynchronous information technology course that requires students to ideate, build an e-commerce website, and develop an effective business plan involves a well-developed and highly engaging course design. This paper describes the design, development, and implementation of such a course and presents information on…
Asynchronous decision making in a memorized paddle pressing task.
Dankert, James R; Olson, Byron; Si, Jennie
2008-12-01
This paper presents a method for asynchronous decision making using recorded neural data in a binary decision task. This is a demonstration of a technique for developing motor cortical neural prosthetics that do not rely on external cued timing information. The system presented in this paper uses support vector machines and leaky integrate-and-fire elements to predict directional paddle presses. In addition to the traditional metrics of accuracy, asynchronous systems must also optimize the time needed to make a decision. The system presented is able to predict paddle presses with a median accuracy of 88% and all decisions are made before the time of the actual paddle press. An alternative bit rate measure of performance is defined to show that the system proposed here is able to perform the task with the same efficiency as the rats.
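The role of a leaky integrate-and-fire element in making a decision without an external timing cue can be sketched as follows. The input `evidence` stands in for a per-time-bin classifier score (for example a signed support vector machine output); the leak factor and threshold are hypothetical values, not those of the paper.

```python
from typing import Optional, Sequence, Tuple


def lif_decision(
    evidence: Sequence[float], leak: float = 0.9, threshold: float = 1.0
) -> Optional[Tuple[int, int]]:
    """Integrate signed evidence with leak and fire on the first threshold crossing.

    Returns (time bin, +1 or -1) for the first crossing of +threshold or
    -threshold, or None if the integrator never reaches either bound.
    """
    potential = 0.0
    for t, x in enumerate(evidence):
        potential = leak * potential + x  # old evidence decays, new evidence adds
        if potential >= threshold:
            return t, +1
        if potential <= -threshold:
            return t, -1
    return None


if __name__ == "__main__":
    print(lif_decision([0.4, 0.4, 0.4]))   # consistent positive evidence
    print(lif_decision([0.05] * 100))      # weak evidence leaks away
```

Because the element fires whenever enough consistent evidence has accumulated, the decision time is an output of the system rather than an input, which is why decision latency becomes a performance metric alongside accuracy.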
Huang, Yuecheng; Cheng, Wuyi; Luo, Sida; Luo, Yun; Ma, Chengchen; He, Tailin
2016-01-01
The features of the asynchronous correlation between accident indices and the factors that influence accidents can provide an effective reference for warnings of coal mining accidents. However, what are the features of this correlation? To answer this question, data from the China coal price index and the number of deaths from coal mining accidents were selected as the sample data. The fluctuation modes of the asynchronous correlation between the two data sets were defined according to the asynchronous correlation coefficients, symbolization, and sliding windows. We then built several directed and weighted network models, within which the fluctuation modes and the transformations between modes were represented by nodes and edges. Then, the features of the asynchronous correlation between these two variables could be studied from a perspective of network topology. We found that the correlation between the price index and the accidental deaths was asynchronous and fluctuating. Certain aspects, such as the key fluctuation modes, the subgroups characteristics, the transmission medium, the periodicity and transmission path length in the network, were analyzed by using complex network theory, analytical methods and spectral analysis method. These results provide a scientific reference for generating warnings for coal mining accidents based on economic indices. PMID:27902748
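The pipeline described above (lagged correlation, sliding windows, symbolization, and a network whose edges are transitions between modes) can be sketched in a few lines. This is a simplified illustration: the lag, window length, cut-off of 0.5 and the three symbols P, W and N are assumptions, and each mode here is a single symbol rather than whatever pattern the authors define.

```python
import math
from collections import Counter
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equally long sequences (0.0 if one is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def lagged_window_correlations(
    x: Sequence[float], y: Sequence[float], lag: int, window: int
) -> List[float]:
    """Correlate x[t] with y[t + lag] inside each sliding window."""
    shifted = y[lag:]
    usable = min(len(x), len(shifted))
    return [
        pearson(x[i:i + window], shifted[i:i + window])
        for i in range(usable - window + 1)
    ]


def symbolize(r: float, cut: float = 0.5) -> str:
    """Map a coefficient to P (positive), N (negative) or W (weak)."""
    if r >= cut:
        return "P"
    if r <= -cut:
        return "N"
    return "W"


def mode_transitions(symbols: Sequence[str]) -> Counter:
    """Count transitions between consecutive modes: weighted, directed edges."""
    return Counter(zip(symbols, symbols[1:]))


if __name__ == "__main__":
    price = [1, 2, 3, 4, 5, 6]       # synthetic values, not the real index
    deaths = [9, 1, 2, 3, 4, 5, 6]   # follows price with a lag of one step
    rs = lagged_window_correlations(price, deaths, lag=1, window=3)
    modes = [symbolize(r) for r in rs]
    print(modes, dict(mode_transitions(modes)))
```

The transition counts form the weighted adjacency structure on which measures such as node strength or path length could then be computed.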
Data Collection for Mobile Group Consumption: An Asynchronous Distributed Approach †
Zhu, Weiping; Chen, Weiran; Hu, Zhejie; Li, Zuoyou; Liang, Yue; Chen, Jiaojiao
2016-01-01
Mobile group consumption refers to consumption by a group of people, such as a couple, a family, colleagues and friends, based on mobile communications. It differs from consumption only involving individuals, because of the complex relations among group members. Existing data collection systems for mobile group consumption are centralized, which has the disadvantages of being a performance bottleneck, having single-point failure and increasing business and security risks. Moreover, these data collection systems are based on a synchronized clock, which is often unrealistic because of hardware constraints, privacy concerns or synchronization cost. In this paper, we propose the first asynchronous distributed approach to collecting data generated by mobile group consumption. We formally built a system model thereof based on asynchronous distributed communication. We then designed a simulation system for the model for which we propose a three-layer solution framework. After that, we describe how to detect the causality relation of two/three gathering events that happened in the system based on the collected data. Various definitions of causality relations based on asynchronous distributed communication are supported. Extensive simulation results show that the proposed approach is effective for data collection relating to mobile group consumption. PMID:27058544
The study of transient processes in the asynchronous starting of the synchronous motor
Alexandru Bârlea; Olivian Chiver
2012-01-01
Synchronous motors can be started by several methods: starting with an auxiliary motor, starting in the asynchronous regime, feeding from a variable-frequency source, or auto-synchronization with the network. In our case, we study the transient processes in the asynchronous regime. In this case the synchronous motor is started like a squirrel-cage induction motor. To start, the synchronous motor is equipped with a starting cage winding placed in the pole pieces of the polar inducers; la...
Ford, L E; Smiseth, P T
2016-02-01
In species with biparental care, sexual conflict occurs because the benefit of care depends on the total amount of care provided by the two parents while the cost of care depends on each parent's own contribution. Asynchronous hatching may play a role in mediating the resolution of this conflict over parental care. The sexual conflict hypothesis for the evolution of asynchronous hatching suggests that females adjust hatching patterns in order to increase male parental effort relative to female effort. We tested this hypothesis in the burying beetle Nicrophorus vespilloides by setting up experimental broods with three different hatching patterns: synchronous, asynchronous and highly asynchronous broods. As predicted, we found that males provided care for longer in asynchronous broods whereas the opposite was true of females. However, we did not find any benefit to females of reducing their duration of care in terms of increased lifespan or reduced mass loss during breeding. We found substantial negative effects of hatching asynchrony on offspring fitness as larval mass was lower and fewer larvae survived to dispersal in highly asynchronous broods compared to synchronous or asynchronous broods. Our results suggest that, even though females can increase male parental effort by hatching their broods more asynchronously, females pay a substantial cost from doing so in terms of reducing offspring growth and survival. Thus, females should be under selection to produce a hatching pattern that provides the best possible trade-off between the benefits of increased male parental effort and the costs due to reduced offspring fitness. © 2015 European Society For Evolutionary Biology.
Tolba, Khaled Ibrahim; Morgenthal, Guido
2018-01-01
This paper presents an analysis of the scalability and efficiency of a simulation framework based on the vortex particle method. The code is applied for the numerical aerodynamic analysis of line-like structures. The numerical code runs on multicore CPU and GPU architectures using OpenCL framework. The focus of this paper is the analysis of the parallel efficiency and scalability of the method being applied to an engineering test case, specifically the aeroelastic response of a long-span bridge girder at the construction stage. The target is to assess the optimal configuration and the required computer architecture, such that it becomes feasible to efficiently utilise the method within the computational resources available for a regular engineering office. The simulations and the scalability analysis are performed on a regular gaming type computer.
Dual stator winding variable speed asynchronous generator: optimal design and experiments
International Nuclear Information System (INIS)
Tutelea, L N; Deaconu, S I; Popa, G N
2015-01-01
This paper presents a theoretical and experimental study of the behavior of a dual stator winding squirrel cage asynchronous generator (DSWA) in the presence of the saturation (non-sinusoidal) regime caused by variable speed operation. The main aim is to determine the relations for calculating the equivalent parameters of the machine windings, for use in optimal design with a Matlab code. The study is limited to three-phase dual stator winding cage-induction generators of small power ratings, the type most commonly used in small adjustable-speed wind or hydro power plants. The tests were carried out using a three-phase asynchronous generator with a rated power of 6 kVA. (paper)
Asynchronous Assessment in a Large Lecture Marketing Course
Downey, W. Scott; Schetzsle, Stacey
2012-01-01
Asynchronous assessment, which includes quizzes or exams online or outside class, offers marketing educators an opportunity to make more efficient use of class time and to enhance students' learning experiences by giving them more flexibility and choice in their assessment environment. In this paper, we examine the performance difference between…
Data driven parallelism in experimental high energy physics applications
International Nuclear Information System (INIS)
Pohl, M.
1987-01-01
I present global design principles for the implementation of high energy physics data analysis code on sequential and parallel processors with mixed shared and local memory. Potential parallelism in the structure of high energy physics tasks is identified with granularity varying from a few times 10^8 instructions all the way down to a few times 10^4 instructions. It follows the hierarchical structure of detector and data acquisition systems. To take advantage of this, while preserving the necessary portability of the code, I propose a computational model with purely data driven concurrency in Single Program Multiple Data (SPMD) mode. The task granularity is defined by varying the granularity of the central data structure manipulated. Concurrent processes coordinate themselves asynchronously using simple lock constructs on parts of the data structure. Load balancing among processes occurs naturally. The scheme allows the internal layout of the data structure to be mapped closely onto the layout of local and shared memory in a parallel architecture. It thus allows the application to be optimized with respect to synchronization as well as data transport overheads. I present a coarse top level design for a portable implementation of this scheme on sequential machines, multiprocessor mainframes (e.g. IBM 3090), tightly coupled multiprocessors (e.g. RP-3) and loosely coupled processor arrays (e.g. LCAP, Emulating Processor Farms). (orig.)
Data driven parallelism in experimental high energy physics applications
Pohl, Martin
1987-08-01
I present global design principles for the implementation of High Energy Physics data analysis code on sequential and parallel processors with mixed shared and local memory. Potential parallelism in the structure of High Energy Physics tasks is identified with granularity varying from a few times 10^8 instructions all the way down to a few times 10^4 instructions. It follows the hierarchical structure of detector and data acquisition systems. To take advantage of this, while preserving the necessary portability of the code, I propose a computational model with purely data driven concurrency in Single Program Multiple Data (SPMD) mode. The task granularity is defined by varying the granularity of the central data structure manipulated. Concurrent processes coordinate themselves asynchronously using simple lock constructs on parts of the data structure. Load balancing among processes occurs naturally. The scheme allows the internal layout of the data structure to be mapped closely onto the layout of local and shared memory in a parallel architecture. It thus allows the application to be optimized with respect to synchronization as well as data transport overheads. I present a coarse top level design for a portable implementation of this scheme on sequential machines, multiprocessor mainframes (e.g. IBM 3090), tightly coupled multiprocessors (e.g. RP-3) and loosely coupled processor arrays (e.g. LCAP, Emulating Processor Farms).
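The lock-based, data-driven coordination described in this abstract can be illustrated with a minimal Python sketch. It is not the author's design: the names `process_part` and `run_spmd`, the partitioning into equal slices, and the use of threads are all assumptions made for illustration. Every worker runs the same code over one central data structure and claims a part by taking that part's lock without blocking, so faster workers simply end up processing more parts.

```python
import threading

def process_part(part):
    # Hypothetical stand-in for an analysis step on one part of the data.
    return sum(x * x for x in part)

def run_spmd(data, n_parts=8, n_workers=4):
    # Split the central data structure into parts; one lock per part.
    size = max(1, (len(data) + n_parts - 1) // n_parts)
    parts = [data[i:i + size] for i in range(0, len(data), size)]
    locks = [threading.Lock() for _ in parts]
    done = [False] * len(parts)
    results = [None] * len(parts)

    def worker():
        # Every worker executes the same program (SPMD style).
        for i in range(len(parts)):
            # Non-blocking claim: skip parts another worker is handling.
            if locks[i].acquire(blocking=False):
                try:
                    if not done[i]:
                        results[i] = process_part(parts[i])
                        done[i] = True
                finally:
                    locks[i].release()

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results

if __name__ == "__main__":
    print(sum(run_spmd(list(range(100)))))
```

The granularity of the work is set only by `n_parts`, which mirrors the abstract's point that task granularity follows from the granularity of the data structure rather than from the program.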
Spatiotemporal Features for Asynchronous Event-based Data
Directory of Open Access Journals (Sweden)
Xavier Lagorce
2015-02-01
Bio-inspired asynchronous event-based vision sensors are currently introducing a paradigm shift in visual information processing. These new sensors rely on a stimulus-driven principle of light acquisition similar to biological retinas. They are event-driven and fully asynchronous, thereby reducing redundancy and encoding exact times of input signal changes, leading to a very precise temporal resolution. Approaches for higher-level computer vision often rely on the reliable detection of features in visual frames, but similar definitions of features for the novel dynamic and event-based visual input representation of silicon retinas have so far been lacking. This article addresses the problem of learning and recognizing features for event-based vision sensors, which capture properties of truly spatiotemporal volumes of sparse visual event information. A novel computational architecture for learning and encoding spatiotemporal features is introduced based on a set of predictive recurrent reservoir networks, competing via winner-take-all selection. Features are learned in an unsupervised manner from real-world input recorded with event-based vision sensors. It is shown that the networks in the architecture learn distinct and task-specific dynamic visual features, and can predict their trajectories over time.
Asynchronous replication and autosome-pair non-equivalence in human embryonic stem cells.
Directory of Open Access Journals (Sweden)
Devkanya Dutta
A number of mammalian genes exhibit the unusual properties of random monoallelic expression and random asynchronous replication. Such exceptional genes include genes subject to X inactivation and autosomal genes including odorant receptors, immunoglobulins, interleukins, pheromone receptors, and p120 catenin. In differentiated cells, random asynchronous replication of interspersed autosomal genes is coordinated at the whole chromosome level, indicative of chromosome-pair non-equivalence. Here we have investigated the replication pattern of the random asynchronously replicating genes in undifferentiated human embryonic stem cells, using a fluorescence in situ hybridization-based assay. We show that allele-specific replication of X-linked genes and random monoallelic autosomal genes occurs in human embryonic stem cells. The direction of replication is coordinated at the whole chromosome level and can cross the centromere, indicating the existence of autosome-pair non-equivalence in human embryonic stem cells. These results suggest that epigenetic mechanism(s) that randomly distinguish between two parental alleles are emerging in the cells of the inner cell mass, the source of human embryonic stem cells.
Content Analysis Coding Schemes for Online Asynchronous Discussion
Weltzer-Ward, Lisa
2011-01-01
Purpose: Researchers commonly utilize coding-based analysis of classroom asynchronous discussion contributions as part of studies of online learning and instruction. However, this analysis is inconsistent from study to study with over 50 coding schemes and procedures applied in the last eight years. The aim of this article is to provide a basis…
Directory of Open Access Journals (Sweden)
T. Kalavathi Devi
2015-01-01
Convolutional codes are widely used as Forward Error Correction (FEC) codes in digital communication systems. To decode convolutional codes at the receiver end, a Viterbi decoder is often the preferred choice. This decoder meets the demand for high speed and low power. At present, the design of a competent system in Very Large Scale Integration (VLSI) technology requires these VLSI parameters to be finely defined. The proposed asynchronous method focuses on reducing the power consumption of the Viterbi decoder for various constraint lengths using asynchronous modules. The asynchronous designs are based on commonly used Quasi Delay Insensitive (QDI) templates, namely, Precharge Half Buffer (PCHB) and Weak Conditioned Half Buffer (WCHB). The functionality of the proposed asynchronous design is simulated and verified using Tanner Spice (TSPICE) in 0.25 µm, 65 nm, and 180 nm technologies of Taiwan Semiconductor Manufacturing Company (TSMC). The simulation results show that the asynchronous design techniques achieve a 25.21% power reduction compared to the synchronous design and work at a speed of 475 MHz.
Miscellany of Students' Satisfaction in an Asynchronous Learning Environment
Larbi-Siaw, Otu; Owusu-Agyeman, Yaw
2017-01-01
This study investigates the determinants of students' satisfaction in an asynchronous learning environment using seven key considerations: the e-learning environment, student-content interaction, student and student interaction, student-teacher interaction, group cohesion and timely participation, knowledge of Internet usage, and satisfaction. The…
Directory of Open Access Journals (Sweden)
Jilin Zhang
2017-01-01
With the development of mobile systems, we gain many benefits and conveniences by leveraging mobile devices; at the same time, the information gathered by smartphones, such as location and environment, is also valuable for businesses seeking to provide more intelligent services for customers. More and more machine learning methods have been used in the field of mobile information systems to study user behavior and classify usage patterns, especially convolutional neural networks. As the number of model training parameters and the data scale increase, the traditional single-machine training method cannot meet the time requirements of practical application scenarios. Current training frameworks often use a simple data parallel or model parallel method to speed up the training process, which is why heterogeneous computing resources have not been fully utilized. To solve these problems, our paper proposes a delay synchronization convolutional neural network parallel strategy, which leverages the heterogeneous system. The strategy is based on both synchronous parallel and asynchronous parallel approaches; the model training process can reduce the dependence on the heterogeneous architecture while still ensuring model convergence, so the convolutional neural network framework is more adaptive to different heterogeneous system environments. The experimental results show that the proposed delay synchronization strategy can achieve at least three times the speedup compared to traditional data parallelism.
Pass-transistor asynchronous sequential circuits
Whitaker, Sterling R.; Maki, Gary K.
1989-01-01
Design methods for asynchronous sequential pass-transistor circuits, which result in circuits that are hazard- and critical-race-free and which have added degrees of freedom for the input signals, are discussed. The design procedures are straightforward and easy to implement. Two single-transition-time state assignment methods are presented, and hardware bounds for each are established. A surprising result is that the hardware realizations for each next state variable and output variable is identical for a given flow table. Thus, a state machine with N states and M outputs can be constructed using a single layout replicated N + M times.
A Novel Approach to Asynchronous MVP Data Interpretation Based on Elliptical-Vectors
Kruglyakov, M.; Trofimov, I.; Korotaev, S.; Shneyer, V.; Popova, I.; Orekhova, D.; Scshors, Y.; Zhdanov, M. S.
2014-12-01
We suggest a novel approach to asynchronous magnetic-variation profiling (MVP) data interpretation. The standard method in MVP is based on the interpretation of the coefficients of the linear relation between the vertical and horizontal components of the measured magnetic field. From a mathematical point of view, this pair of linear coefficients is not a vector, which leads to significant difficulties in asynchronous data interpretation. Our approach allows us to actually treat such a pair of complex numbers as a special vector called an ellipse-vector (EV). By choosing particular definitions of complex length and direction, the basic relation of MVP can be considered as the dot product. This considerably simplifies the interpretation of asynchronous data. The EV is described by four real numbers: the values of the major and minor semiaxes, the angular direction of the major semiaxis and the phase. The notation choice is motivated by historical reasons. It is important that different components of the EV have different sensitivity with respect to the field sources and the local heterogeneities. Namely, the value of the major semiaxis and the angular direction are mostly determined by the field source and the normal cross-section. On the other hand, the value of the minor semiaxis and the phase are responsive to local heterogeneities. Since the EV is the general form of a complex vector, the traditional Schmucker vectors can be explicitly expressed through its components. The proposed approach was successfully applied to the interpretation of the results of asynchronous measurements that had been obtained in the Arctic Ocean at the drift stations "North Pole" in 1962-1976.
Monte Carlo problem and parallel computers, and how to do a fast particle mover on the STAR 100
International Nuclear Information System (INIS)
Sinz, K.H.P.H.
1975-01-01
Particle simulation problems of the Monte Carlo type are widely believed to be intrinsically highly scalar problems. In the absence of a definitive mathematical theorem to the contrary, this belief is based on the very apparent programming difficulties encountered on a vector machine. This class of problem is therefore thought to be ill-suited to highly parallel and vectorized computers. However, it is demonstrated by several examples that a particle mover is fully vectorizable. In the case of the CDC STAR 100 it is found that the performance of such a particle mover is not hopeless but hopeful, and is in fact helpful. One of the several possible vectorizations is estimated to yield a gain of a factor of 15 on the STAR over good serial coding on the same machine. This falls far short of the STAR's peak vector performance of 30 to 70 times scalar rates because certain fast vector instructions are not available and have to be simulated. The current STAR algorithm outperforms the carefully handcoded 7600 by a factor of 3. This performance margin is achievable despite the 7600's fivefold superior scalar capability. A more generally vectorized particle mover will always substantially outperform scalar coding on any machine equipped with a properly chosen set of fast vector instructions. (U.S.)
Particle beam dynamics simulations using the POOMA framework
International Nuclear Information System (INIS)
Humphrey, W.; Ryne, R.; Cleland, T.; Cummings, J.; Habib, S.; Mark, G.; Ji Qiang
1998-01-01
A program for simulation of the dynamics of high intensity charged particle beams in linear particle accelerators has been developed in C++ using the POOMA Framework, for use on serial and parallel architectures. The code models the trajectories of charged particles through a sequence of different accelerator beamline elements such as drift chambers, quadrupole magnets, or RF cavities. An FFT-based particle-in-cell algorithm is used to solve the Poisson equation that models the Coulomb interactions of the particles. The code employs an object-oriented design with software abstractions for the particle beam, accelerator beamline, and beamline elements, using C++ templates to efficiently support both 2D and 3D capabilities in the same code base. The POOMA Framework, which encapsulates much of the effort required for parallel execution, provides particle and field classes, particle-field interaction capabilities, and parallel FFT algorithms. The performance of this application running serially and in parallel is compared to an existing HPF implementation, with the POOMA version seen to run four times faster than the HPF code
Directory of Open Access Journals (Sweden)
Richard A. Schwier
2002-06-01
A group of graduate students and an instructor at the University of Saskatchewan experimented with the use of synchronous communication (chat) and asynchronous communication (bulletin board) in a theory course in Educational Communications and Technology for an eight-month period. Synchronous communication contributed dramatically to the continuity and convenience of the class, and promoted a strong sense of community. At the same time, it was viewed as less effective than asynchronous communication for dealing with content and issues deeply, and it introduced a number of pedagogical and intellectual limitations. We concluded that synchronous and asynchronous strategies were suitable for different types of learning, and what we experienced was a balancing act between content and community in our group. A combination of synchronous and asynchronous experiences seems to be necessary to promote the kind of engagement and depth required in a graduate seminar.
Zhao, Yaqin; Zhong, Xin; Wu, Di; Zhang, Ye; Ren, Guanghui; Wu, Zhilu
2013-09-01
Optical code-division multiple access (OCDMA) systems usually allocate orthogonal or quasi-orthogonal codes to the active users. When transmitting through an atmospheric scattering channel, the coding pulses are broadened and the orthogonality of the codes is degraded. In the truly asynchronous case, namely when both the chips and the bits are asynchronous among the active users, the pulse broadening strongly affects the system performance. In this paper, we evaluate the performance of a 2D asynchronous hard-limiting wireless OCDMA system through an atmospheric scattering channel. The probability density function of multiple access interference in the truly asynchronous case is given. The bit error rate decreases as the ratio of the chip period to the root mean square delay spread increases, and the channel limits the bit rate to different levels when the chip period varies.
(Nearly) portable PIC code for parallel computers
International Nuclear Information System (INIS)
Decyk, V.K.
1993-01-01
As part of the Numerical Tokamak Project, the author has developed a (nearly) portable, one dimensional version of the GCPIC algorithm for particle-in-cell codes on parallel computers. This algorithm uses a spatial domain decomposition for the fields, and passes particles from one domain to another as the particles move spatially. With only minor changes, the code has been run in parallel on the Intel Delta, the Cray C-90, the IBM ES/9000 and a cluster of workstations. After a line by line translation into cmfortran, the code was also run on the CM-200. Impressive speeds have been achieved, both on the Intel Delta and the Cray C-90, around 30 nanoseconds per particle per time step. In addition, the author was able to isolate the data management modules, so that the physics modules were not changed much from their sequential version, and the data management modules can be used as "black boxes."
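The particle-passing step of a spatially decomposed particle-in-cell code, as summarized in this abstract, can be sketched in a few lines of Python. This is an illustrative assumption, not the GCPIC code: the domains are plain lists held in one process rather than separate processors, the box is one dimensional and periodic, and the names `owner` and `push_and_exchange` are hypothetical.

```python
def owner(x, length, n_domains):
    # Index of the equal-width slab of [0, length) that contains x.
    return min(int(x / length * n_domains), n_domains - 1)

def push_and_exchange(domains, length, dt):
    """Advance particles (x, v) by dt, then hand each one to the domain
    that now owns its position. Stands in for inter-processor messages."""
    n = len(domains)
    outgoing = [[] for _ in range(n)]
    for d, particles in enumerate(domains):
        kept = []
        for x, v in particles:
            x = (x + v * dt) % length      # periodic boundary
            dest = owner(x, length, n)
            if dest == d:
                kept.append((x, v))        # particle stays local
            else:
                outgoing[dest].append((x, v))  # particle leaves the slab
        domains[d] = kept
    # Deliver the departing particles to their new domains.
    for d in range(n):
        domains[d].extend(outgoing[d])
    return domains

if __name__ == "__main__":
    doms = [[(0.9, 1.0)], [(1.1, -1.0)], [], [(3.9, 1.0)]]
    push_and_exchange(doms, length=4.0, dt=0.2)
    print([len(d) for d in doms])
```

Keeping the exchange in its own function reflects the abstract's remark that data management can be isolated from the physics modules.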
Asynchronous Magnetic Bead Rotation (AMBR) Microviscometer for Label-Free DNA Analysis
Directory of Open Access Journals (Sweden)
Yunzi Li
2014-03-01
We have developed a label-free viscosity-based DNA detection system, using paramagnetic beads as an asynchronous magnetic bead rotation (AMBR) microviscometer. We have demonstrated experimentally that the bead rotation period is linearly proportional to the viscosity of a DNA solution surrounding the paramagnetic bead, as expected theoretically. Simple optical measurement of asynchronous microbead motion determines solution viscosity precisely in microscale volumes, thus allowing an estimate of DNA concentration or average fragment length. The response of the AMBR microviscometer yields reproducible measurement of DNA solutions, enzymatic digestion reactions, and PCR systems at template concentrations across a 5000-fold range. The results demonstrate the feasibility of viscosity-based DNA detection using AMBR in microscale aqueous volumes.
Asynchronous Multi-Party Computation with Quadratic Communication
DEFF Research Database (Denmark)
Hirt, Martin; Nielsen, Jesper Buus; Przydatek, Bartosz
2008-01-01
We present an efficient protocol for secure multi-party computation in the asynchronous model with optimal resilience. For n parties, up to t < n/3 of them being corrupted, and security parameter κ, a circuit with c gates can be securely computed with communication complexity O(cn^2κ) bits, which...... circuit randomization due to Beaver (Crypto’91), and an abstraction of certificates, which can be of independent interest....
A modular control architecture for real-time synchronous and asynchronous systems
International Nuclear Information System (INIS)
Butler, P.L.; Jones, J.P.
1993-01-01
This paper describes a control architecture for real-time control of complex robotic systems. The Modular Integrated Control Architecture (MICA), which is actually two complementary control systems, recognizes and exploits the differences between asynchronous and synchronous control. The asynchronous control system simulates shared memory on a heterogeneous network. For control information, a portable event-scheme is used. This scheme provides consistent interprocess coordination among multiple tasks on a number of distributed systems. The machines in the network can vary with respect to their native operating systems and the internal representation of numbers they use. The synchronous control system is needed for tight real-time control of complex electromechanical systems such as robot manipulators, and the system uses multiple processors at a specified rate. Both the synchronous and asynchronous portions of MICA have been developed to be extremely modular. MICA presents a simple programming model to code developers and also considers the needs of system integrators and maintainers. MICA has been used successfully in a complex robotics project involving a mobile 7-degree-of-freedom manipulator in a heterogeneous network with a body of software totaling over 100,000 lines of code. MICA has also been used in another robotics system, controlling a commercial long-reach manipulator
Full-load converter connected asynchronous generators for MW class wind turbines
Energy Technology Data Exchange (ETDEWEB)
Akhmatov, Vladislav
2005-06-15
Wind turbines equipped with full-load converter-connected asynchronous generators are a known concept. These have ratings of up to hundreds of kW, are a feasible concept for MW class wind turbines, and may have advantages when compared to conventional wind turbines with directly connected generators. The concept requires the use of full-scale frequency converters, but the mechanical gearbox is smaller than in conventional wind turbines of the same rating. Use of a smaller gearbox may reduce the no-load losses in the wind turbines, which is why such wind turbines with converter-connected generators may start operation at a lower wind speed. Wind turbines equipped with such converter-connected asynchronous generators are pitch-controlled and variable-speed. This allows better performance and control. The converter control may be applied to support the grid voltage at short-circuit faults and to improve the fault-ride-through capability of the wind turbines, which makes the concept relevant for large wind farms. The Danish transmission system operator Energinet-DK has implemented the general model of wind turbines equipped with converter-connected asynchronous generators in the simulation tool PowerFactory (DIgSILENT). The article presents Energinet-DK's experience of modeling this feasible wind turbine concept. (Author)
Vectorization, parallelization and porting of nuclear codes. 2001
International Nuclear Information System (INIS)
Akiyama, Mitsunaga; Katakura, Fumishige; Kume, Etsuo; Nemoto, Toshiyuki; Tsuruoka, Takuya; Adachi, Masaaki
2003-07-01
Several computer codes in the nuclear field have been vectorized, parallelized and transported on the super computer system at Center for Promotion of Computational Science and Engineering in Japan Atomic Energy Research Institute. We dealt with 10 codes in fiscal 2001. In this report, the parallelization of Neutron Radiography for 3 Dimensional CT code NR3DCT, the vectorization of unsteady-state heat conduction code THERMO3D, the porting of initial program of MHD simulation, the tuning of Heat And Mass Balance Analysis Code HAMBAC, the porting and parallelization of Monte Carlo N-Particle transport code MCNP4C3, the porting and parallelization of Monte Carlo N-Particle transport code system MCNPX2.1.5, the porting of induced activity calculation code CINAC-V4, the use of VisLink library in multidimensional two-fluid model code ACD3D and the porting of experiment data processing code from GS8500 to SR8000 are described. (author)
Turing Incompleteness of Asynchronous P Systems with Active Membranes
Leporati, Alberto; Manzoni, Luca; Porreca, Antonio E.
2013-01-01
We prove that asynchronous P systems with active membranes without division rules can be simulated by place/transition Petri nets, and hence are computationally weaker than Turing machines. This result holds even if the synchronisation mechanisms provided by electrical charges and membrane dissolution are exploited.
An improved modelling of asynchronous machine with skin-effect ...
African Journals Online (AJOL)
The conventional method of analysis of the asynchronous machine fails to give accurate results, especially when the machine is operated at high rotor frequency. At high rotor frequency, the skin effect dominates, causing the rotor impedance to be frequency dependent. This paper therefore presents an improved method of ...
Emmanouilidou, Kyriaki; Derri, Vassiliki; Antoniou, Panagiotis; Kyrgiridis, Pavlos
2012-01-01
The purpose of the study was to compare the influences of a training programme's instructional delivery method (synchronous and asynchronous) on Greek in-service physical educators' cognitive understanding on student assessment. Forty nine participants were randomly divided into synchronous, asynchronous, and control group. The experimental groups…
Area/latency optimized early output asynchronous full adders and relative-timed ripple carry adders.
Balasubramanian, P; Yamashita, S
2016-01-01
This article presents two area/latency optimized gate level asynchronous full adder designs which correspond to early output logic. The proposed full adders are constructed using the delay-insensitive dual-rail code and adhere to the four-phase return-to-zero handshaking. For an asynchronous ripple carry adder (RCA) constructed using the proposed early output full adders, the relative-timing assumption becomes necessary and the inherent advantages of the relative-timed RCA are: (1) computation with valid inputs, i.e., forward latency is data-dependent, and (2) computation with spacer inputs involves a bare minimum constant reverse latency of just one full adder delay, thus resulting in the optimal cycle time. With respect to different 32-bit RCA implementations, and in comparison with the optimized strong-indication, weak-indication, and early output full adder designs, one of the proposed early output full adders achieves respective reductions in latency by 67.8, 12.3 and 6.1 %, while the other proposed early output full adder achieves corresponding reductions in area by 32.6, 24.6 and 6.9 %, with practically no power penalty. Further, the proposed early output full adders based asynchronous RCAs enable minimum reductions in cycle time by 83.4, 15, and 8.8 % when considering carry-propagation over the entire RCA width of 32-bits, and maximum reductions in cycle time by 97.5, 27.4, and 22.4 % for the consideration of a typical carry chain length of 4 full adder stages, when compared to the least of the cycle time estimates of various strong-indication, weak-indication, and early output asynchronous RCAs of similar size. All the asynchronous full adders and RCAs were realized using standard cells in a semi-custom design fashion based on a 32/28 nm CMOS process technology.
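A behavioral Python sketch can make the terms in this abstract concrete. It is not the proposed gate-level design: the rail convention (data 1 as `(1, 0)`, data 0 as `(0, 1)`, spacer as `(0, 0)`), the function names, and the specific early-output rule are assumptions for illustration. The early-output rule used here is the standard observation that when both operand bits are equal the carry-out equals them, so it can be issued before the carry-in arrives.

```python
SPACER = (0, 0)  # return-to-zero phase of four-phase handshaking

def encode(bit):
    # Dual-rail code word for a data bit (assumed rail order: true, false).
    return (1, 0) if bit else (0, 1)

def decode(rails):
    # Returns 0 or 1 for valid data, None for the spacer or illegal (1, 1).
    if rails == (1, 0):
        return 1
    if rails == (0, 1):
        return 0
    return None

def full_adder(a, b, cin):
    va, vb, vc = decode(a), decode(b), decode(cin)
    all_valid = None not in (va, vb, vc)
    # Early output: equal operand bits fix the carry without waiting for cin.
    if va is not None and va == vb:
        cout = encode(va)
    elif all_valid:
        cout = encode(va + vb + vc >= 2)
    else:
        cout = SPACER
    s = encode((va + vb + vc) % 2) if all_valid else SPACER
    return s, cout

def ripple_carry_add(x, y, width):
    # Chain `width` full adders; the carry ripples from bit 0 upward.
    carry, total = encode(0), 0
    for i in range(width):
        s, carry = full_adder(encode((x >> i) & 1), encode((y >> i) & 1), carry)
        total |= decode(s) << i
    return total | (decode(carry) << width)

if __name__ == "__main__":
    print(ripple_carry_add(11, 6, 4))
```

Applying spacers to all inputs returns both outputs to the spacer, which corresponds to the reset phase between successive data words.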
Fast electrostatic force calculation on parallel computer clusters
International Nuclear Information System (INIS)
Kia, Amirali; Kim, Daejoong; Darve, Eric
2008-01-01
The fast multipole method (FMM) and smooth particle mesh Ewald (SPME) are well-known fast algorithms for evaluating long-range electrostatic interactions in molecular dynamics and other fields. FMM is a multi-scale method that reduces the computational cost by approximating the potential due to a group of particles at a large distance using a few multipole functions. This algorithm scales as O(N) for N particles. The SPME algorithm is an O(N ln N) method based on interpolating the Fourier-space part of the Ewald sum and evaluating the resulting convolutions with the fast Fourier transform (FFT). Both algorithms suffer from relatively poor efficiency on large parallel machines, especially for mid-size problems of around hundreds of thousands of atoms. A variation of the FMM, called PWA, based on plane wave expansions is presented in this paper. A new parallelization strategy for PWA, which takes advantage of the specific form of this expansion, is described. Its parallel efficiency is compared with that of SPME through detailed timing measurements on two different computer clusters.
Integration of asynchronous knowledge sources in a novel speech recognition framework
Van hamme, Hugo
2008-01-01
Van hamme H., ''Integration of asynchronous knowledge sources in a novel speech recognition framework'', Proceedings ITRW on speech analysis and processing for knowledge discovery, 4 pp., June 2008, Aalborg, Denmark.
Liu, Yang
2015-12-17
A scalable parallel plane-wave time-domain (PWTD) algorithm for efficient and accurate analysis of transient scattering from electrically large objects is presented. The algorithm produces scalable communication patterns on very large numbers of processors by leveraging two mechanisms: (i) a hierarchical parallelization strategy to evenly distribute the computation and memory loads at all levels of the PWTD tree among processors, and (ii) a novel asynchronous communication scheme to reduce the cost and memory requirement of the communications between the processors. The efficiency and accuracy of the algorithm are demonstrated through its applications to the analysis of transient scattering from a perfect electrically conducting (PEC) sphere with a diameter of 70 wavelengths and a PEC square plate with a dimension of 160 wavelengths. Furthermore, the proposed algorithm is used to analyze transient fields scattered from realistic airplane and helicopter models under high frequency excitation.
Resonance parallel viscosity in the banana regime in poloidally rotating tokamak plasmas
International Nuclear Information System (INIS)
Shaing, K.C.; Hsu, C.T.; Dominguez, N.
1994-01-01
Parallel viscosity in the banana regime in a poloidally ($E \times B$) rotating tokamak plasma is calculated to include the effects of orbit squeezing and to allow the poloidal $E \times B$ Mach number $M_p$ to have a value of order unity. Here, $E$ is the electric field and $B$ is the magnetic field. The effects of orbit squeezing not only modify the size of the particle orbit, but also change the fraction of poloidally trapped particles. Resonance between the particle parallel (to $B$) speed $u$ and the poloidal component of the $E \times B$ velocity can only occur for those particles with energy $(v/v_t)^2 > M_p^2$ (with $v$ the particle speed and $v_t$ the thermal speed). Thus, the resonance parallel plasma viscosity in the banana regime decreases exponentially with $M_p^2$ when $M_p^2 \geq 1$, and has a local maximum at $M_p^2 \sim 1$.
Aloise, Fabio; Schettini, Francesca; Aricò, Pietro; Salinari, Serenella; Guger, Christoph; Rinsma, Johanna; Aiello, Marco; Mattia, Donatella; Cincotti, Febo
2011-10-01
Motor disability and/or ageing can prevent individuals from fully enjoying home facilities, thus worsening their quality of life. Advances in the field of accessible user interfaces for domotic appliances can represent a valuable way to improve the independence of these persons. An asynchronous P300-based Brain-Computer Interface (BCI) system was recently validated with the participation of healthy young volunteers for environmental control. In this study, the asynchronous P300-based BCI for the interaction with a virtual home environment was tested with the participation of potential end-users (clients of a Frisian home care organization) with limited autonomy due to ageing and/or motor disabilities. System testing revealed that the minimum number of stimulation sequences needed to achieve correct classification had a higher intra-subject variability in potential end-users with respect to what was previously observed in young controls. Here we show that the asynchronous modality performed significantly better as compared to the synchronous mode in continuously adapting its speed to the users' state. Furthermore, the asynchronous system modality confirmed its reliability in avoiding misclassifications and false positives, as previously shown in young healthy subjects. The asynchronous modality may contribute to filling the usability gap between BCI systems and traditional input devices, representing an important step towards their use in the activities of daily living.
Engineered plant biomass feedstock particles
Dooley, James H [Federal Way, WA; Lanning, David N [Federal Way, WA; Broderick, Thomas F [Lake Forest Park, WA
2012-04-17
A new class of plant biomass feedstock particles characterized by consistent piece size and shape uniformity, high skeletal surface area, and good flow properties. The particles of plant biomass material having fibers aligned in a grain are characterized by a length dimension (L) aligned substantially parallel to the grain and defining a substantially uniform distance along the grain, a width dimension (W) normal to L and aligned cross grain, and a height dimension (H) normal to W and L. In particular, the L × H dimensions define a pair of substantially parallel side surfaces characterized by substantially intact longitudinally arrayed fibers, the W × H dimensions define a pair of substantially parallel end surfaces characterized by crosscut fibers and end checking between fibers, and the L × W dimensions define a pair of substantially parallel top and bottom surfaces. The L × W surfaces of particles with L/H dimension ratios of 4:1 or less are further elaborated by surface checking between longitudinally arrayed fibers. The length dimension L is preferably aligned within 30° parallel to the grain, and more preferably within 10° parallel to the grain. The plant biomass material is preferably selected from among wood, agricultural crop residues, plantation grasses, hemp, bagasse, and bamboo.
A tomograph VMEbus parallel processing data acquisition system
International Nuclear Information System (INIS)
Wilkinson, N.A.; Rogers, J.G.; Atkins, M.S.
1989-01-01
This paper describes a VME-based data acquisition system suitable for the development of Positron Volume Imaging tomographs, which use 3-D data for improved image resolution over slice-oriented tomographs. The data acquisition must be flexible enough to accommodate several 3-D reconstruction algorithms; hence, a software-based system is most suitable. Furthermore, because of the increased dimensions and resolution of volume imaging tomographs, the raw data event rate is greater than that of slice-oriented machines. These dual requirements are met by our data acquisition system. Flexibility is achieved through an array of processors connected over a VMEbus, operating asynchronously and in parallel. High raw data throughput is achieved using a dedicated high-speed data transfer device available for the VMEbus. The device can attain a raw data rate of 2.5 million coincidence events per second for raw events which are 64 bits wide.
Directory of Open Access Journals (Sweden)
Jinhui TANG
2018-05-01
Attacking time-sensitive targets places rigid demands on the timeliness and reliability of information transmission, while the typical Media Access Control (MAC) designed for this application works well only in very light-load scenarios; as a consequence, system throughput and channel utilization are degraded. To address this problem, a feedback-retransmission based asynchronous FRequency hopping Media Access (FRMA) control protocol is proposed. Burst communication, asynchronous Frequency Hopping (FH), channel coding, and feedback retransmission are utilized in FRMA. With the mechanism of asynchronous FH, immediate packet transmission and multi-packet reception can be realized, and thus the timeliness is improved. Furthermore, reliability can be achieved via channel coding and feedback retransmission. Using queuing theory, a Markov model, a packet collision model, and the discrete Laplace transformation, formulas for the packet success probability, system throughput, average packet end-to-end delay, and delay distribution are obtained. The approximation accuracy of the theoretical derivation is verified by experimental results. Within a light-load network, the proposed FRMA achieves millisecond delay and 99% reliability, and outperforms the non-feedback-retransmission based asynchronous frequency hopping media access control protocol. Keywords: Ad hoc networks, Aeronautical communications, Frequency hopping, Media Access Control (MAC), Time-sensitive
Directory of Open Access Journals (Sweden)
Sungsik Park
2016-02-01
Parallel-plate capacitors were fabricated using a printed multi-layer structure in order to determine the effects of particle size and solvent on the capacitance. The conductive-dielectric-conductive layers were sequentially spun using commercial inks and by intermediate drying with the aid of a masking polymeric layer. Both optical and scanning electron microscopy were used to characterize the morphology of the printed layers. The measured capacitance was larger than the theoretically calculated value when ink with small-sized particles was used as the top plate. Furthermore, the use of a solvent whose polarity was similar to that of the underlying dielectric layer enhanced the penetration and resulted in an increase in capacitance. The functional resistance-capacitance low-pass filter was implemented using printed resistors and capacitors, a process that may be scalable in the future.
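For orientation, the "theoretically calculated value" of a parallel-plate capacitor is usually taken to be the ideal estimate C = ε0·εr·A/d, and a first-order resistance-capacitance low-pass filter has cutoff frequency f_c = 1/(2πRC). The sketch below evaluates both; the permittivity, geometry, and resistance are hypothetical numbers chosen for illustration and are not taken from the paper.

```python
import math

EPSILON_0 = 8.8541878128e-12  # vacuum permittivity in F/m


def parallel_plate_capacitance(rel_permittivity, area_m2, thickness_m):
    """Ideal parallel-plate capacitance, ignoring fringing and ink penetration."""
    return EPSILON_0 * rel_permittivity * area_m2 / thickness_m


def rc_cutoff_hz(resistance_ohm, capacitance_f):
    """-3 dB cutoff frequency of a first-order RC low-pass filter."""
    return 1.0 / (2.0 * math.pi * resistance_ohm * capacitance_f)


# Hypothetical printed device: 1 mm^2 plates, 1 um dielectric, relative permittivity 3.
c = parallel_plate_capacitance(rel_permittivity=3.0, area_m2=1e-6, thickness_m=1e-6)
# Hypothetical 10 kOhm printed resistor in series with that capacitor.
f_c = rc_cutoff_hz(resistance_ohm=10e3, capacitance_f=c)
print(f"C = {c * 1e12:.2f} pF, f_c = {f_c / 1e3:.0f} kHz")
```

A measured capacitance above this estimate is consistent with an effective dielectric thickness smaller than the nominal one, which is the penetration effect the abstract describes.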
Asynchronous data change notification between database server and accelerator controls system
International Nuclear Information System (INIS)
Fu, W.; Morris, J.; Nemesure, S.
2011-01-01
Database data change notification (DCN) is a commonly used feature, but not all database management systems (DBMSs) provide an explicit DCN mechanism. Even for those DBMSs which support DCN (such as Oracle and MS SQL Server), some server-side and/or client-side programming may be required to make the DCN system work. This makes the setup of DCN between a database server and interested clients tedious and time consuming. In accelerator control systems, there are many well-established software client/server architectures (such as CDEV, EPICS, and ADO) that can be used to implement data reflection servers that transfer data asynchronously to any client using the standard SET/GET API. This paper describes a method for using such a data reflection server to set up asynchronous DCN (ADCN) between a DBMS and clients. This method works well for all DBMSs which provide database trigger functionality. ADCN between a database server and clients can be realized by combining a database trigger mechanism, which is supported by major DBMSs, with server processes that use client/server software architectures that are familiar in the accelerator controls community (such as EPICS, CDEV or ADO). This approach makes the ADCN system easy to set up and integrate into an accelerator controls system. Several ADCN systems have been set up and used in the RHIC-AGS controls system.
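The trigger-plus-reflection-server idea can be illustrated with a minimal sketch. It uses SQLite and asyncio from the Python standard library rather than Oracle, EPICS, CDEV, or ADO, and it substitutes polling of a trigger-filled change-log table for whatever transport the authors actually use. The table names, the `reflect` coroutine, and the `magnet_current` setting are all hypothetical.

```python
import asyncio
import sqlite3


def make_db():
    """In-memory database whose trigger records every update in change_log."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE settings (name TEXT PRIMARY KEY, value REAL);
        CREATE TABLE change_log (id INTEGER PRIMARY KEY AUTOINCREMENT,
                                 name TEXT, value REAL);
        CREATE TRIGGER settings_changed AFTER UPDATE ON settings
        BEGIN
            INSERT INTO change_log (name, value) VALUES (NEW.name, NEW.value);
        END;
    """)
    return db


async def reflect(db, subscribers, stop, poll_s=0.01):
    """Stand-in reflection server: forward new change_log rows to client queues."""
    last_id = 0
    while not stop.is_set():
        rows = db.execute(
            "SELECT id, name, value FROM change_log WHERE id > ? ORDER BY id",
            (last_id,)).fetchall()
        for row_id, name, value in rows:
            last_id = row_id
            for queue in subscribers:
                queue.put_nowait((name, value))
        await asyncio.sleep(poll_s)


async def demo():
    db = make_db()
    db.execute("INSERT INTO settings VALUES ('magnet_current', 1.0)")
    client = asyncio.Queue()          # one subscribed client
    stop = asyncio.Event()
    server = asyncio.create_task(reflect(db, [client], stop))
    db.execute("UPDATE settings SET value = 2.5 WHERE name = 'magnet_current'")
    update = await asyncio.wait_for(client.get(), timeout=1.0)
    stop.set()
    await server
    return update


print(asyncio.run(demo()))
```

The client never queries the database itself; it only waits on its queue, which is the asynchronous behaviour the abstract attributes to the reflection server.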
International Nuclear Information System (INIS)
Plimpton, Steven J.; Hendrickson, Bruce; Burns, Shawn P.; McLendon, William III; Rauchwerger, Lawrence
2005-01-01
The method of discrete ordinates is commonly used to solve the Boltzmann transport equation. The solution in each ordinate direction is most efficiently computed by sweeping the radiation flux across the computational grid. For unstructured grids this poses many challenges, particularly when implemented on distributed-memory parallel machines where the grid geometry is spread across processors. We present several algorithms relevant to this approach: (a) an asynchronous message-passing algorithm that performs sweeps simultaneously in multiple ordinate directions, (b) a simple geometric heuristic to prioritize the computational tasks that a processor works on, (c) a partitioning algorithm that creates columnar-style decompositions for unstructured grids, and (d) an algorithm for detecting and eliminating cycles that sometimes exist in unstructured grids and can prevent sweeps from successfully completing. Algorithms (a) and (d) are fully parallel; algorithms (b) and (c) can be used in conjunction with (a) to achieve higher parallel efficiencies. We describe our message-passing implementations of these algorithms within a radiation transport package. Performance and scalability results are given for unstructured grids with up to 3 million elements (500 million unknowns) running on thousands of processors of Sandia National Laboratories' Intel Tflops machine and DEC-Alpha CPlant cluster
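To make the sweep-ordering and cycle problem concrete: for one ordinate direction, each cell can be solved only after its upwind neighbours, which defines a directed dependency graph. The serial sketch below orders cells with Kahn's topological sort and reports the cells that can never become ready, which indicates a cycle. It is an illustration only, not the authors' parallel algorithms (a) or (d); the function name and edge-list format are assumptions.

```python
from collections import deque


def sweep_order(num_cells, upwind_edges):
    """Order cells for one sweep direction.

    upwind_edges holds (upwind_cell, downwind_cell) pairs. Returns the
    solvable order and the list of cells blocked by a dependency cycle.
    """
    downstream = [[] for _ in range(num_cells)]
    pending = [0] * num_cells          # number of unsolved upwind neighbours
    for src, dst in upwind_edges:
        downstream[src].append(dst)
        pending[dst] += 1

    ready = deque(c for c in range(num_cells) if pending[c] == 0)
    order = []
    while ready:
        cell = ready.popleft()
        order.append(cell)
        for nxt in downstream[cell]:
            pending[nxt] -= 1
            if pending[nxt] == 0:
                ready.append(nxt)

    stuck = [c for c in range(num_cells) if pending[c] > 0]
    return order, stuck


# Acyclic four-cell example, then a three-cell example where cells 1 and 2 form a cycle.
print(sweep_order(4, [(0, 1), (0, 2), (1, 3), (2, 3)]))
print(sweep_order(3, [(0, 1), (1, 2), (2, 1)]))
```

A non-empty second return value is the situation in which a sweep cannot complete until some dependency in the cycle is broken.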
Continuous EEG signal analysis for asynchronous BCI application.
Hsu, Wei-Yen
2011-08-01
In this study, we propose a two-stage recognition system for continuous analysis of electroencephalogram (EEG) signals. Independent component analysis (ICA) and the correlation coefficient are used to automatically eliminate electrooculography (EOG) artifacts. Based on the continuous wavelet transform (CWT) and Student's two-sample t-statistics, active segment selection then detects the location of the active segment in the time-frequency domain. Next, multiresolution fractal feature vectors (MFFVs) are extracted from the wavelet data with the proposed modified fractal dimension. Finally, a support vector machine (SVM) is adopted for robust classification of the MFFVs. The EEG signals are analyzed continuously in 1-s segments, with the window advanced every 0.5 s, to simulate asynchronous BCI operation in the two-stage recognition architecture. In the first stage, each segment is recognized as finger lifting or not; segments recognized as lifting are then classified as left or right finger lifting in the second stage. Several statistical analyses are used to evaluate the performance of the proposed system. The results indicate that it is a promising system for asynchronous BCI applications.
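The windowing and two-stage decision described in this abstract can be sketched as follows. The 250 Hz sampling rate is an assumption (the abstract gives only the 1-s window and 0.5-s step), and the two classifier callables are hypothetical placeholders for the SVM stages.

```python
def sliding_segments(num_samples, sample_rate_hz, window_s=1.0, step_s=0.5):
    """Return (start, stop) sample indices of overlapping analysis windows."""
    window = int(round(window_s * sample_rate_hz))
    step = int(round(step_s * sample_rate_hz))
    return [(start, start + window)
            for start in range(0, num_samples - window + 1, step)]


def two_stage_label(segment, is_lifting, is_left):
    """Stage 1 decides lifting vs. rest; stage 2 runs only on lifting segments."""
    if not is_lifting(segment):
        return "rest"
    return "left" if is_left(segment) else "right"


# 4 s of signal at an assumed 250 Hz gives seven half-overlapping 1-s windows.
segments = sliding_segments(num_samples=1000, sample_rate_hz=250)
print(len(segments), segments[0], segments[-1])
```

Running the second classifier only on segments that pass the first is what lets the system stay idle during rest, which is the defining property of asynchronous operation.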
Sum rates of asynchronous GFDMA and SC-FDMA for 5G uplink
Directory of Open Access Journals (Sweden)
Woojin Park
2015-12-01
The fifth generation (5G) of mobile communication envisions ultralow latency of less than 1 ms for the radio interface. To this end, frameless asynchronous multiple access may be needed to allow users to transmit instantly without waiting for the next frame start. In this paper, generalized frequency division multiple-access (GFDMA), one of the promising multiple-access candidates for 5G mobile, is compared with the conventional single-carrier FDMA (SC-FDMA) in terms of the uplink sum rate when both techniques are adapted for the asynchronous scenario. In particular, a waveform windowing technique is applied to both schemes to mitigate the inter-user interference due to non-zero out-of-band emission.
High-speed asynchronous optical sampling for high-sensitivity detection of coherent phonons
International Nuclear Information System (INIS)
Dekorsy, T; Taubert, R; Hudert, F; Schrenk, G; Bartels, A; Cerna, R; Kotaidis, V; Plech, A; Koehler, K; Schmitz, J; Wagner, J
2007-01-01
A new optical pump-probe technique, based on two asynchronously linked femtosecond lasers, is implemented for the investigation of coherent acoustic phonon dynamics in the GHz to THz frequency range. Asynchronous optical sampling (ASOPS) provides the performance of an all-optical oscilloscope and allows us to record optically induced lattice dynamics over nanosecond times with femtosecond resolution at scan rates of 10 kHz without any moving part in the set-up. Within 1 minute of data acquisition time, signal-to-noise ratios better than 10^7 are achieved. We present examples of the high-sensitivity detection of coherent phonons in superlattices and of the coherent acoustic vibration of metallic nanoparticles.
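The relation between scan rate, time window, and time resolution in ASOPS follows from the repetition-rate offset between the two lasers: the pump-probe delay grows by |1/f1 - 1/f2| per pulse pair and wraps around after one pulse period. The sketch below evaluates this for the 10 kHz scan rate quoted above; the 1 GHz repetition rate is an assumption for illustration, since the abstract does not state it.

```python
def asops_parameters(rep_rate_hz, offset_hz):
    """Return (measurement window, delay increment per pulse pair) in seconds.

    The two lasers run at rep_rate_hz and rep_rate_hz + offset_hz; the scan
    rate equals offset_hz.
    """
    window_s = 1.0 / rep_rate_hz
    step_s = offset_hz / (rep_rate_hz * (rep_rate_hz + offset_hz))
    return window_s, step_s


# Assumed 1 GHz lasers detuned by the 10 kHz scan rate from the abstract.
window_s, step_s = asops_parameters(rep_rate_hz=1e9, offset_hz=1e4)
print(f"window = {window_s * 1e9:.1f} ns, step = {step_s * 1e15:.1f} fs")
```

Under these assumed numbers the window is about a nanosecond and the delay step is about ten femtoseconds, in line with the "nanosecond times with femtosecond resolution" stated in the abstract. In practice the achievable resolution is also limited by pulse duration and timing jitter, which this estimate ignores.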
Directory of Open Access Journals (Sweden)
Annie Jane Hill
2016-12-01
Asynchronous telerehabilitation, in which computer-based interventions are remotely monitored and adapted offline, is an emerging service delivery model in the rehabilitation of communication disorders. The asynchronous nature of this model may hold a benefit over its synchronous counterpart by eliminating scheduling issues and thus improving efficiency in a healthcare landscape of constrained resource allocation. The design of asynchronous telerehabilitation platforms should therefore ensure efficiency and flexibility. The authors have been engaged in a program of research to develop and evaluate an asynchronous telerehabilitation platform for use in speech-language pathology. eSALT is a novel asynchronous telerehabilitation platform in which clinicians design and individualize therapy tasks for upload to a client’s mobile device. An inbuilt telerehabilitation module allows for remote monitoring and updating of tasks. This paper introduces eSALT and reports outcomes from a usability study that considered the needs of two end-user groups, people with aphasia and clinicians, in the on-going refinement of eSALT. In the study, participants with aphasia were paired with clinicians who used eSALT to design and customize therapy tasks. After training on the mobile device, the participants engaged in therapy at home for a period of three weeks, while clinicians remotely monitored and updated tasks. Following the home trial, participants and clinicians engaged in semi-structured interviews and completed surveys on the usability of eSALT and their satisfaction with the platform. Content analysis of data involving five participants and three clinicians revealed a number of usability themes including ease of use, user support, satisfaction, limitations and potential improvements. These findings were translated into a number of refinements of the eSALT platform including the development of a client interface for use on the Apple iPad®, greater variety in
A parallel algorithm for 3D dislocation dynamics
International Nuclear Information System (INIS)
Wang Zhiqiang; Ghoniem, Nasr; Swaminarayan, Sriram; LeSar, Richard
2006-01-01
Dislocation dynamics (DD), a discrete dynamic simulation method in which dislocations are the fundamental entities, is a powerful tool for investigation of plasticity, deformation and fracture of materials at the micron length scale. However, severe computational difficulties arising from complex, long-range interactions between these curvilinear line defects limit the application of DD in the study of large-scale plastic deformation. We present here the development of a parallel algorithm for accelerated computer simulations of DD. By representing dislocations as a 3D set of dislocation particles, we show here that the problem of an interacting ensemble of dislocations can be converted to a problem of a particle ensemble, interacting with a long-range force field. A grid using binary space partitioning is constructed to keep track of node connectivity across domains. We demonstrate the computational efficiency of the parallel micro-plasticity code and discuss how O(N) methods map naturally onto the parallel data structure. Finally, we present results from applications of the parallel code to deformation in single crystal fcc metals
International Nuclear Information System (INIS)
Yokohama, Noriya
2013-01-01
This report was aimed at structuring the design of architectures and studying performance measurement of a parallel computing environment using a Monte Carlo simulation for particle therapy using a high performance computing (HPC) instance within a public cloud-computing infrastructure. Performance measurements showed an approximately 28 times faster speed than seen with single-thread architecture, combined with improved stability. A study of methods of optimizing the system operations also indicated lower cost. (author)
The aspect of vector control using the asynchronous traction motor in locomotives
Directory of Open Access Journals (Sweden)
L. Liudvinavičius
2009-12-01
The article examines the control curves of asynchronous traction motors, which are increasingly used in locomotive electric drives. The main task of such a drive is to produce the tractive effort-speed curve of an ideal locomotive, Fk = f(v), including a hyperbolic region in which the energy produced by the diesel engine of a diesel locomotive (or, for electric locomotives and electric trains, the electricity taken from the contact network) is converted into useful work over the entire range of locomotive speed. When the mechanical power on the wheel sets is constant, Pk = Fkv = const, the power of the diesel engine is fully used over the entire range of locomotive speed. The tractive effort-speed curve Fk(v) shows the dependence of the locomotive tractive effort Fk on the speed v. The article presents theoretical and practical aspects of designing the structure of a locomotive electric drive and selecting its optimal control, which is especially relevant for drives using the asynchronous traction motor (ATM), which is gaining popularity in traction rolling stock as a replacement for DC traction motors of low reliability. The frequency modes of asynchronous motor speed regulation are examined. To control the ATM, the authors propose vector control and present structural schemes of a locomotive with ATMs together with the control algorithm.
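The hyperbolic part of the ideal tractive effort-speed curve follows directly from Pk = Fk·v = const. The sketch below adds a constant low-speed limit on tractive effort, which is a common modelling assumption and is not stated in the abstract; the power and force values are hypothetical.

```python
import math


def tractive_effort_n(speed_m_s, power_w, max_effort_n):
    """Ideal tractive effort Fk(v): force-limited at low speed, Pk/v above it."""
    if speed_m_s <= 0:
        return max_effort_n
    return min(max_effort_n, power_w / speed_m_s)


# Hypothetical locomotive: 2 MW at the wheel sets, 300 kN low-speed limit.
POWER_W = 2.0e6
MAX_EFFORT_N = 3.0e5
for v in (0.0, 5.0, 20.0, 40.0):
    f = tractive_effort_n(v, POWER_W, MAX_EFFORT_N)
    print(f"v = {v:5.1f} m/s  Fk = {f / 1e3:6.1f} kN  Fk*v = {f * v / 1e6:.2f} MW")
```

Above the speed at which the force limit and the hyperbola meet, the product Fk·v stays equal to the available power, which is the full-utilization condition the abstract describes.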
Chick Development and Asynchroneous Hatching in the Zebra Finch (Taeniopygia guttata castanotis).
Ikebuchi, Maki; Okanoya, Kazuo; Hasegawa, Toshikazu; Bischof, Hans-Joachim
2017-10-01
The mode of hatching in birds has important impacts on both parents and chicks, including the costs and risks of breeding for parents, and sibling competition in a clutch. Birds with multiple eggs in a single clutch often begin incubating when most eggs are laid, thereby reducing time of incubation, nursing burden, and sibling competition. In some songbirds and some other species, however, incubation starts immediately after the first egg is laid, and the chicks thus hatch asynchronously. This may result in differences in parental care and in sibling competition based on body size differences among older and younger chicks, which in turn might produce asynchronous development among siblings favoring the first hatchling, and further affect the development and fitness of the chicks after fledging. To determine whether such processes in fact occur in the zebra finch, we observed chick development in 18 clutches of zebra finches. We found that there were effects of asynchronous hatching, but these were smaller than expected and mostly not significant. Our observations suggest that the amount of care given to each chick may be equated with such factors as a camouflage effect of the down feathers, and that the low illumination within the nest also complicates the determination of the hatching order by the parents.
Directory of Open Access Journals (Sweden)
Keming Zhou
2017-05-01
Excitation-inhibition imbalance in neural networks is widely linked to neurological and neuropsychiatric disorders. However, how genetic factors alter neuronal activity, leading to excitation-inhibition imbalance, remains unclear. Here, using the C. elegans locomotor circuit, we examine how altering neuronal activity for varying time periods affects synaptic release pattern and animal behavior. We show that while short-duration activation of excitatory cholinergic neurons elicits a reversible enhancement of presynaptic strength, persistent activation results in asynchronous and reduced cholinergic drive, inducing imbalance between endogenous excitation and inhibition. We find that the neuronal calcium sensor protein NCS-2 is required for asynchronous cholinergic release in an activity-dependent manner and dampens excitability of inhibitory neurons non-cell autonomously. The function of NCS-2 requires its Ca2+ binding and membrane association domains. These results reveal a synaptic mechanism implicating asynchronous release in regulation of excitation-inhibition balance.
Asynchronous online foresight panels: the case of wildfire management
David N. Bengston; Robert L. Olson
2015-01-01
Text-based asynchronous online conferencing involves structured online discussion and deliberation among multiple participants from multiple sites in which there is a delay in interaction between contributors. This method has been widely used for a variety of purposes in higher education and other settings, but has not been commonly used in futures research. This paper...
A Coupling Tool for Parallel Molecular Dynamics-Continuum Simulations
Neumann, Philipp
2012-06-01
We present a tool for coupling Molecular Dynamics and continuum solvers. It is written in C++ and is meant to support the developers of hybrid molecular - continuum simulations in terms of both realisation of the respective coupling algorithm as well as parallel execution of the hybrid simulation. We describe the implementational concept of the tool and its parallel extensions. We particularly focus on the parallel execution of particle insertions into dense molecular systems and propose a respective parallel algorithm. Our implementations are validated for serial and parallel setups in two and three dimensions. © 2012 IEEE.
International Nuclear Information System (INIS)
Clark, Haley; Wu, Jonn; Moiseenko, Vitali; Thomas, Steven
2014-01-01
Many have speculated about the future of computational technology in clinical radiation oncology. It has been advocated that the next generation of computational infrastructure will improve on the current generation by incorporating richer aspects of automation, more heavily and seamlessly featuring distributed and parallel computation, and providing more flexibility toward aggregate data analysis. In this report we describe how a recently created — but currently existing — analysis framework (DICOMautomaton) incorporates these aspects. DICOMautomaton supports a variety of use cases but is especially suited for dosimetric outcomes correlation analysis, investigation and comparison of radiotherapy treatment efficacy, and dose-volume computation. We describe: how it overcomes computational bottlenecks by distributing workload across a network of machines; how modern, asynchronous computational techniques are used to reduce blocking and avoid unnecessary computation; and how issues of out-of-date data are addressed using reactive programming techniques and data dependency chains. We describe internal architecture of the software and give a detailed demonstration of how DICOMautomaton could be used to search for correlations between dosimetric and outcomes data
Hu, Wang; Yen, Gary G; Luo, Guangchun
2017-06-01
It is a daunting challenge to balance the convergence and diversity of an approximate Pareto front in a many-objective optimization evolutionary algorithm. A novel algorithm, named many-objective particle swarm optimization with the two-stage strategy and parallel cell coordinate system (PCCS), is proposed in this paper to improve the comprehensive performance in terms of the convergence and diversity. In the proposed two-stage strategy, the convergence and diversity are separately emphasized at different stages by a single-objective optimizer and a many-objective optimizer, respectively. A PCCS is exploited to manage the diversity, such as maintaining a diverse archive, identifying the dominance resistant solutions, and selecting the diversified solutions. In addition, a leader group is used for selecting the global best solutions to balance the exploitation and exploration of a population. The experimental results illustrate that the proposed algorithm outperforms six chosen state-of-the-art designs in terms of the inverted generational distance and hypervolume over the DTLZ test suite.
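One of the two metrics named in this abstract, the inverted generational distance (IGD), can be sketched in its common form: the mean, over points of a reference Pareto front, of the Euclidean distance to the nearest obtained solution. Definitions vary between papers, so this form is an assumption and may differ from the exact variant used by the authors; the two small fronts are made-up data.

```python
import math


def inverted_generational_distance(reference_front, approximation):
    """Mean distance from each reference point to its nearest approximation point."""
    total = 0.0
    for ref in reference_front:
        total += min(math.dist(ref, sol) for sol in approximation)
    return total / len(reference_front)


# Made-up two-objective fronts for illustration.
reference = [(0.0, 1.0), (1.0, 0.0)]
obtained = [(0.0, 1.0), (1.0, 1.0)]
print(inverted_generational_distance(reference, obtained))
```

Lower values are better. Because every reference point must have a nearby solution, the metric penalizes both poor convergence and poor coverage of the front, which is why it is used to assess the balance the abstract discusses.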
Human Exposure to Electromagnetic Fields from Parallel Wireless Power Transfer Systems.
Wen, Feng; Huang, Xueliang
2017-02-08
The scenario of multiple wireless power transfer (WPT) systems working closely, synchronously or asynchronously with phase difference often occurs in power supply for household appliances and electric vehicles in parking lots. Magnetic field leakage from the WPT systems is also varied due to unpredictable asynchronous working conditions. In this study, the magnetic field leakage from parallel WPT systems working with phase difference is predicted, and the induced electric field and specific absorption rate (SAR) in a human body standing in the vicinity are also evaluated. Computational results are compared with the restrictions prescribed in the regulations established to limit human exposure to time-varying electromagnetic fields (EMFs). The results show that the middle region between the two WPT coils is safer for the two WPT systems working in-phase, and the peripheral regions are safer around the WPT systems working anti-phase. Thin metallic plates larger than the WPT coils can shield the magnetic field leakage well, while smaller ones may worsen the situation. The orientation of the human body will influence the maximum magnitude of induced electric field and its distribution within the human body. The induced electric field centralizes in the trunk, groin, and genitals with only one exception: when the human body is standing right at the middle of the two WPT coils working in-phase, the induced electric field focuses on lower limbs. The SAR value in the lungs always seems to be greater than in other organs, while the value in the liver is minimal. Human exposure to EMFs meets the guidelines of the International Committee on Non-Ionizing Radiation Protection (ICNIRP), specifically reference levels with respect to magnetic field and basic restrictions on induced electric fields and SAR, as the charging power is lower than 3.1 kW and 55.5 kW, respectively. These results are positive with respect to the safe applications of parallel WPT systems working
Human Exposure to Electromagnetic Fields from Parallel Wireless Power Transfer Systems
Wen, Feng; Huang, Xueliang
2017-01-01
The scenario of multiple wireless power transfer (WPT) systems working closely, synchronously or asynchronously with phase difference often occurs in power supply for household appliances and electric vehicles in parking lots. Magnetic field leakage from the WPT systems is also varied due to unpredictable asynchronous working conditions. In this study, the magnetic field leakage from parallel WPT systems working with phase difference is predicted, and the induced electric field and specific absorption rate (SAR) in a human body standing in the vicinity are also evaluated. Computational results are compared with the restrictions prescribed in the regulations established to limit human exposure to time-varying electromagnetic fields (EMFs). The results show that the middle region between the two WPT coils is safer for the two WPT systems working in-phase, and the peripheral regions are safer around the WPT systems working anti-phase. Thin metallic plates larger than the WPT coils can shield the magnetic field leakage well, while smaller ones may worsen the situation. The orientation of the human body will influence the maximum magnitude of induced electric field and its distribution within the human body. The induced electric field centralizes in the trunk, groin, and genitals with only one exception: when the human body is standing right at the middle of the two WPT coils working in-phase, the induced electric field focuses on lower limbs. The SAR value in the lungs always seems to be greater than in other organs, while the value in the liver is minimal. Human exposure to EMFs meets the guidelines of the International Committee on Non-Ionizing Radiation Protection (ICNIRP), specifically reference levels with respect to magnetic field and basic restrictions on induced electric fields and SAR, as the charging power is lower than 3.1 kW and 55.5 kW, respectively. These results are positive with respect to the safe applications of parallel WPT systems working
Human Exposure to Electromagnetic Fields from Parallel Wireless Power Transfer Systems
Directory of Open Access Journals (Sweden)
Feng Wen
2017-02-01
The scenario of multiple wireless power transfer (WPT) systems working closely, synchronously or asynchronously with phase difference often occurs in power supply for household appliances and electric vehicles in parking lots. Magnetic field leakage from the WPT systems is also varied due to unpredictable asynchronous working conditions. In this study, the magnetic field leakage from parallel WPT systems working with phase difference is predicted, and the induced electric field and specific absorption rate (SAR) in a human body standing in the vicinity are also evaluated. Computational results are compared with the restrictions prescribed in the regulations established to limit human exposure to time-varying electromagnetic fields (EMFs). The results show that the middle region between the two WPT coils is safer for the two WPT systems working in-phase, and the peripheral regions are safer around the WPT systems working anti-phase. Thin metallic plates larger than the WPT coils can shield the magnetic field leakage well, while smaller ones may worsen the situation. The orientation of the human body will influence the maximum magnitude of induced electric field and its distribution within the human body. The induced electric field centralizes in the trunk, groin, and genitals with only one exception: when the human body is standing right at the middle of the two WPT coils working in-phase, the induced electric field focuses on lower limbs. The SAR value in the lungs always seems to be greater than in other organs, while the value in the liver is minimal. Human exposure to EMFs meets the guidelines of the International Committee on Non-Ionizing Radiation Protection (ICNIRP), specifically reference levels with respect to magnetic field and basic restrictions on induced electric fields and SAR, as the charging power is lower than 3.1 kW and 55.5 kW, respectively. These results are positive with respect to the safe applications of parallel WPT systems
Parallelized Kalman-Filter-Based Reconstruction of Particle Tracks on Many-Core Processors and GPUs
Cerati, Giuseppe; Elmer, Peter; Krutelyov, Slava; Lantz, Steven; Lefebvre, Matthieu; Masciovecchio, Mario; McDermott, Kevin; Riley, Daniel; Tadel, Matevž; Wittich, Peter; Würthwein, Frank; Yagil, Avi
2017-08-01
For over a decade now, physical and energy constraints have limited clock speed improvements in commodity microprocessors. Instead, chipmakers have been pushed into producing lower-power, multi-core processors such as Graphical Processing Units (GPU), ARM CPUs, and Intel MICs. Broad-based efforts from manufacturers and developers have been devoted to making these processors user-friendly enough to perform general computations. However, extracting performance from a larger number of cores, as well as specialized vector or SIMD units, requires special care in algorithm design and code optimization. One of the most computationally challenging problems in high-energy particle experiments is finding and fitting the charged-particle tracks during event reconstruction. This is expected to become by far the dominant problem at the High-Luminosity Large Hadron Collider (HL-LHC), for example. Today the most common track finding methods are those based on the Kalman filter. Experience with Kalman techniques on real tracking detector systems has shown that they are robust and provide high physics performance. This is why they are currently in use at the LHC, both in the trigger and offline. Previously we reported on the significant parallel speedups that resulted from our investigations to adapt Kalman filters to track fitting and track building on Intel Xeon and Xeon Phi. Here, we discuss our progress toward understanding these processors and the new developments to port the Kalman filter to NVIDIA GPUs.
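As a rough illustration of the predict/update cycle that Kalman-filter track fitting repeats at each detector hit, the following is a minimal scalar sketch in Python. It is not the authors' implementation: the function names, the identity state model, and the noise parameters `q` and `r` are assumptions chosen for brevity, whereas real track fits propagate a multi-parameter track state through a magnetic field.

```python
def kalman_step(x, p, z, q, r):
    """One predict/update cycle of a scalar Kalman filter (identity dynamics)."""
    # Predict: the state estimate is carried over, its variance grows by q.
    x_pred = x
    p_pred = p + q
    # Update: blend the prediction with measurement z using the Kalman gain.
    gain = p_pred / (p_pred + r)
    x_new = x_pred + gain * (z - x_pred)
    p_new = (1.0 - gain) * p_pred
    return x_new, p_new


def filter_track(measurements, x0=0.0, p0=1.0, q=0.0, r=1.0):
    """Run the filter over a sequence of hypothetical hit measurements."""
    x, p = x0, p0
    for z in measurements:
        x, p = kalman_step(x, p, z, q, r)
    return x, p
```

Each hit depends on the state left by the previous one, so a single track is fitted sequentially; parallel implementations of the kind the abstract describes typically gain speed by processing many tracks or candidates at once.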
Parallelized Kalman-Filter-Based Reconstruction of Particle Tracks on Many-Core Processors and GPUs
Directory of Open Access Journals (Sweden)
Cerati Giuseppe
2017-01-01
For over a decade now, physical and energy constraints have limited clock speed improvements in commodity microprocessors. Instead, chipmakers have been pushed into producing lower-power, multi-core processors such as Graphical Processing Units (GPU), ARM CPUs, and Intel MICs. Broad-based efforts from manufacturers and developers have been devoted to making these processors user-friendly enough to perform general computations. However, extracting performance from a larger number of cores, as well as specialized vector or SIMD units, requires special care in algorithm design and code optimization. One of the most computationally challenging problems in high-energy particle experiments is finding and fitting the charged-particle tracks during event reconstruction. This is expected to become by far the dominant problem at the High-Luminosity Large Hadron Collider (HL-LHC), for example. Today the most common track finding methods are those based on the Kalman filter. Experience with Kalman techniques on real tracking detector systems has shown that they are robust and provide high physics performance. This is why they are currently in use at the LHC, both in the trigger and offline. Previously we reported on the significant parallel speedups that resulted from our investigations to adapt Kalman filters to track fitting and track building on Intel Xeon and Xeon Phi. Here, we discuss our progress toward understanding these processors and the new developments to port the Kalman filter to NVIDIA GPUs.
Parallelized Kalman-Filter-Based Reconstruction of Particle Tracks on Many-Core Processors and GPUs
Energy Technology Data Exchange (ETDEWEB)
Cerati, Giuseppe [Fermilab; Elmer, Peter [Princeton U.; Krutelyov, Slava [UC, San Diego; Lantz, Steven [Cornell U.; Lefebvre, Matthieu [Princeton U.; Masciovecchio, Mario [UC, San Diego; McDermott, Kevin [Cornell U.; Riley, Daniel [Cornell U., LNS; Tadel, Matevž [UC, San Diego; Wittich, Peter [Cornell U.; Würthwein, Frank [UC, San Diego; Yagil, Avi [UC, San Diego
2017-01-01
For over a decade now, physical and energy constraints have limited clock speed improvements in commodity microprocessors. Instead, chipmakers have been pushed into producing lower-power, multi-core processors such as Graphical Processing Units (GPU), ARM CPUs, and Intel MICs. Broad-based efforts from manufacturers and developers have been devoted to making these processors user-friendly enough to perform general computations. However, extracting performance from a larger number of cores, as well as specialized vector or SIMD units, requires special care in algorithm design and code optimization. One of the most computationally challenging problems in high-energy particle experiments is finding and fitting the charged-particle tracks during event reconstruction. This is expected to become by far the dominant problem at the High-Luminosity Large Hadron Collider (HL-LHC), for example. Today the most common track finding methods are those based on the Kalman filter. Experience with Kalman techniques on real tracking detector systems has shown that they are robust and provide high physics performance. This is why they are currently in use at the LHC, both in the trigger and offline. Previously we reported on the significant parallel speedups that resulted from our investigations to adapt Kalman filters to track fitting and track building on Intel Xeon and Xeon Phi. Here, we discuss our progress toward understanding these processors and the new developments to port the Kalman filter to NVIDIA GPUs.
Particle simulation on a distributed memory highly parallel processor
International Nuclear Information System (INIS)
Sato, Hiroyuki; Ikesaka, Morio
1990-01-01
This paper describes parallel molecular dynamics simulation of atoms governed by local force interaction. The space in the model is divided into cubic subspaces and mapped to the processor array of the CAP-256, a distributed memory, highly parallel processor developed at Fujitsu Labs. We developed a new technique to avoid redundant calculation of forces between atoms in different processors. Experiments showed the communication overhead was less than 5%, and the idle time due to load imbalance was less than 11% for two model problems which contain 11,532 and 46,128 argon atoms. From the software simulation, the CAP-II which is under development is estimated to be about 45 times faster than CAP-256 and will be able to run the same problem about 40 times faster than Fujitsu's M-380 mainframe when 256 processors are used. (author)
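The spatial decomposition described above, in which the model space is cut into cubic subspaces that are mapped to processors, can be sketched as follows. This is only an illustration under stated assumptions: the function names, the row-major cell-to-rank mapping, and the cubic box are hypothetical, and the abstract does not say how the CAP-256 mapping or the redundant-force avoidance was actually implemented.

```python
def cell_index(position, box_length, cells_per_side):
    """Return the (i, j, k) index of the cubic subspace containing a position."""
    cell_size = box_length / cells_per_side
    # Clamp so that a coordinate exactly on the upper wall stays in the last cell.
    return tuple(min(int(c // cell_size), cells_per_side - 1) for c in position)


def owner_rank(cell, cells_per_side):
    """Map a cell index to a processor rank (assumed row-major ordering)."""
    i, j, k = cell
    return (i * cells_per_side + j) * cells_per_side + k


def assign_atoms(positions, box_length, cells_per_side):
    """Group atom ids by the rank that owns their cubic subspace."""
    owners = {}
    for atom_id, pos in enumerate(positions):
        cell = cell_index(pos, box_length, cells_per_side)
        owners.setdefault(owner_rank(cell, cells_per_side), []).append(atom_id)
    return owners
```

With short-range forces, each processor then only needs atom data from neighbouring subspaces, which is consistent with the low communication overhead the abstract reports.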
Eighth SIAM conference on parallel processing for scientific computing: Final program and abstracts
Energy Technology Data Exchange (ETDEWEB)
NONE
1997-12-31
This SIAM conference is the premier forum for developments in parallel numerical algorithms, a field that has seen very lively and fruitful developments over the past decade, and whose health is still robust. Themes for this conference were: combinatorial optimization; data-parallel languages; large-scale parallel applications; message-passing; molecular modeling; parallel I/O; parallel libraries; parallel software tools; parallel compilers; particle simulations; problem-solving environments; and sparse matrix computations.
Dewiyanti, Silvia; Brand-Gruwel, Saskia; Jochems, Wim; Broers, Nick
2008-01-01
Dewiyanti, S., Brand-Gruwel, S., Jochems, W., & Broers, N. (2007). Students experiences with collaborative learning in asynchronous computer-supported collaborative learning environments. Computers in Human Behavior, 23, 496-514.
Reconceptualising Moderation in Asynchronous Online Discussions Using Grounded Theory
Vlachopoulos, Panos; Cowan, John
2010-01-01
This article reports a grounded theory study of the moderation of asynchronous online discussions, to explore the processes by which tutors in higher education decide when and how to moderate. It aims to construct a theory of e-moderation based on some key factors which appear to influence e-moderation. It discusses previous research on the…
How to share concurrent asynchronous wait-free variables: Preliminary version
M. Li (Ming); P.M.B. Vitányi (Paul)
1989-01-01
We use a structured top-down approach to develop algorithms for atomic variables shared by concurrent asynchronous wait-free processes, starting from the problem specification. By this design we obtain a better understanding of what the algorithms do, why they do it, and that they
Developing a Successful Asynchronous Online Extension Program for Forest Landowners
Zobrist, Kevin W.
2014-01-01
Asynchronous online Extension classes can reach a wide audience, are convenient for the learner, and minimize ongoing demands on instructor time. However, producing such classes takes significant effort up front. Advance planning and good communication with contributors are essential to success. Considerations include delivery platforms, content…
Rissanen, Mikko J; Kume, Naoto; Kuroda, Yoshihiro; Kuroda, Tomohiro; Yoshimura, Koji; Yoshihara, Hiroyuki
2008-01-01
Many VR technology based training systems use expert's motion data as the training aid, but would not provide any short-cut to teaching medical skills that do not depend on exact motions. Earlier we presented Annotated Simulation Records (ASRs), which can be used to encapsulate experts' insight on psychomotor skills. Annotations made to behavioural parameters in training simulators enable asynchronous teaching instead of just motion training in a proactive way to the learner. We evaluated ASRs for asynchronous teaching of Digital Rectal Examination (DRE) with 3 urologists and 8 medical students. The ASRs were found more effective than motion-based training with verbal feedback.
Ultra Low Energy FDSOI Asynchronous Reconfiguration Network for Adaptive Circuits
Directory of Open Access Journals (Sweden)
Soundous Chairat
2017-05-01
This paper introduces a plug-and-play on-chip asynchronous communication network aimed at the dynamic reconfiguration of a low-power adaptive circuit such as an internet of things (IoT) system. By using a separate communication network, we can address both digital and analog blocks at a lower configuration cost, increasing the overall system power efficiency. As reconfiguration only occurs according to specific events and has to be automatically in stand-by most of the time, our design is fully asynchronous using handshake protocols. The paper presents the circuit’s architecture, performance results, and an example of the reconfiguration of frequency locked loops (FLL) to validate our work. We obtain an overall energy per bit of 0.07 pJ/bit for one stage, in a 28 nm Fully Depleted Silicon On Insulator (FDSOI) technology at 0.6 V and a 1.1 ns/bit latency per stage.
Algebraic Number Precoded OFDM Transmission for Asynchronous Cooperative Multirelay Networks
Directory of Open Access Journals (Sweden)
Hua Jiang
2014-01-01
This paper proposes a space-time block coding (STBC) transmission scheme for asynchronous cooperative systems. By combination of rotated complex constellations and Hadamard transform, these constructed codes are capable of achieving full cooperative diversity with the analysis of the pairwise error probability (PEP). Due to the asynchronous characteristic of cooperative systems, orthogonal frequency division multiplexing (OFDM) technique with cyclic prefix (CP) is adopted for combating timing delays from relay nodes. The total transmit power across the entire network is fixed and appropriate power allocation can be implemented to optimize the network performance. The relay nodes do not require decoding and demodulation operation, resulting in a low complexity. Besides, there is no delay for forwarding the OFDM symbols to the destination node. At the destination node the received signals have the corresponding STBC structure on each subcarrier. In order to reduce the decoding complexity, the sphere decoder is implemented for fast data decoding. Bit error rate (BER) performance demonstrates the effectiveness of the proposed scheme.
Precision wood particle feedstocks
Dooley, James H; Lanning, David N
2013-07-30
Wood particles having fibers aligned in a grain, wherein: the wood particles are characterized by a length dimension (L) aligned substantially parallel to the grain, a width dimension (W) normal to L and aligned cross grain, and a height dimension (H) normal to W and L; the L × H dimensions define two side surfaces characterized by substantially intact longitudinally arrayed fibers; the W × H dimensions define two cross-grain end surfaces characterized individually as aligned either normal to the grain or oblique to the grain; the L × W dimensions define two substantially parallel top and bottom surfaces; and, a majority of the W × H surfaces in the mixture of wood particles have end checking.
Particle image velocimetry measurements of the flow in the converging region of two parallel jets
Energy Technology Data Exchange (ETDEWEB)
Wang, Huhu, E-mail: huhuwang@tamu.edu; Lee, Saya, E-mail: sayalee@tamu.edu; Hassan, Yassin A., E-mail: y-hassan@tamu.edu
2016-09-15
Highlights: • The flow behaviors in the converging region were non-intrusively investigated using PIV. • The PIV results using two measuring scales and LDV data matched very well. • Significant momentum transfer was observed in the merging region right after the merging point. • Instantaneous vector field revealed characteristic interacting patterns of the jets. - Abstract: The interaction between parallel jets plays a critical role in determining the characteristics of the momentum and heat transfer in the flow. Specifically for next generation VHTR, the output temperature will be about 900 °C, and any thermal oscillations will create safety issues. The mixing variations of the coolants in the reactor core may influence these power oscillations. Numerous numerical tools such as computational fluid dynamics (CFD) simulations have been used to support the reactor design. The validation of CFD method is important to ensure the fidelity of the calculations. This requires high-fidelity, qualified benchmark data. Particle image velocimetry (PIV), a non-intrusive measuring technique, was used to provide benchmark data for resolving a simultaneous flow field in the converging region of two submerged parallel jets issued from rectangular channels. The jets studied in this work had an equal discharge velocity at room temperature. The turbulent characteristics including the distributions of mean velocities, turbulence intensities, Reynolds stresses and z-component vorticity were studied. The streamwise mean velocity measured by PIV and LDV were compared, and they agreed very well.
The 2nd Symposium on the Frontiers of Massively Parallel Computations
Mills, Ronnie (Editor)
1988-01-01
Programming languages, computer graphics, neural networks, massively parallel computers, SIMD architecture, algorithms, digital terrain models, sort computation, simulation of charged particle transport on the massively parallel processor and image processing are among the topics discussed.
Involving the users remotely: an exploratory study using asynchronous usability testing
Directory of Open Access Journals (Sweden)
Beth Filar Williams
2015-02-01
Open Educational Resources (OER) are increasingly used in the higher education landscape as a solution for a variety of copyright, publishing and cost-prohibiting issues. While OERs are becoming more common, reports of usability tests that evaluate how well learners can use them to accomplish their learning tasks have lagged behind. Because both the researchers and the learners in this study use resources and tools remotely, asynchronous usability testing of a prototype OER and MOOC online guide was conducted with an exploratory group of users to determine the guide’s ease of use for two distinct groups of users: Educators and Learners. In this article, we share the background and context of this usability project, suggest best methods for asynchronous remote usability testing, and share challenges and insights of the process and results of the testing
Theory of charged particle heating by low-frequency Alfven waves
International Nuclear Information System (INIS)
Guo Zehua; Crabtree, Chris; Chen, Liu
2008-01-01
The heating of charged particles by a linearly polarized and obliquely propagating shear Alfvén wave (SAW) at frequencies a fraction of the charged particle cyclotron frequency is demonstrated both analytically and numerically. Applying Lie perturbation theory, with the wave amplitude as the perturbation parameter, the resonance conditions in the laboratory frame are systematically derived. At the lowest order, one recovers the well-known linear cyclotron resonance condition $k_\parallel v_\parallel - \omega - n\Omega = 0$, where $v_\parallel$ is the particle velocity parallel to the background magnetic field, $k_\parallel$ is the parallel wave number, $\omega$ is the wave frequency, $\Omega$ is the gyrofrequency, and $n$ is any integer. At higher orders, however, one discovers a novel nonlinear cyclotron resonance condition given by $k_\parallel v_\parallel - \omega - n\Omega/2 = 0$. Analytical predictions on the locations of fixed points, widths of resonances, and resonance overlapping criteria for global stochasticity are also found to agree with those given by computed Poincaré surfaces of section
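For orientation, the two resonance conditions quoted in this abstract can each be solved for the resonant parallel velocity. The rearrangement below is plain algebra on the stated conditions, assuming $k_\parallel \neq 0$; it is not taken from the paper itself.

```latex
% Linear cyclotron resonance (lowest order):
k_\parallel v_\parallel - \omega - n\Omega = 0
\quad\Longrightarrow\quad
v_\parallel = \frac{\omega + n\Omega}{k_\parallel}

% Nonlinear cyclotron resonance (higher order):
k_\parallel v_\parallel - \omega - \frac{n\Omega}{2} = 0
\quad\Longrightarrow\quad
v_\parallel = \frac{\omega + n\Omega/2}{k_\parallel}
```

For even $n$ the nonlinear condition coincides with a linear resonance, while odd $n$ places additional resonant velocities midway between neighbouring linear ones.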
Asynchronous Group Review of EFL Writing: Interactions and Text Revisions
Saeed, Murad Abdu; Ghazali, Kamila
2017-01-01
The current paper reports an empirical study of asynchronous online group review of argumentative essays among nine English as foreign language (EFL) Arab university learners joining English in their first, second, and third years at the institution. In investigating online interactions, commenting patterns, and how the students facilitate text…
Asynchronous cracking with dissimilar paths in multilayer graphene.
Jang, Bongkyun; Kim, Byungwoon; Kim, Jae-Hyun; Lee, Hak-Joo; Sumigawa, Takashi; Kitamura, Takayuki
2017-11-16
Multilayer graphene consists of a stack of single-atomic-thick monolayer graphene sheets bound with π-π interactions and is a fascinating model material opening up a new field of fracture mechanics. In this study, fracture behavior of single-crystalline multilayer graphene was investigated using an in situ mode I fracture test under a scanning electron microscope, and abnormal crack propagation in multilayer graphene was identified for the first time. The fracture toughness of graphene was determined from the measured load-displacement curves and the realistic finite element modelling of specimen geometries. Nonlinear fracture behavior of the multilayer graphene is discussed based on nonlinear elastic fracture mechanics. In situ scanning electron microscope images obtained during the fracture test showed asynchronous crack propagation along independent paths, causing interlayer shear stress and slippages. We also found that energy dissipation by interlayer slippages between the graphene layers is the reason for the enhanced fracture toughness of multilayer graphene. The asynchronous cracking with independent paths is a unique cracking and toughening mechanism for single-crystalline multilayer graphene, which is not observed for the monolayer graphene. This could provide a useful insight for the design and development of graphene-based composite materials for structural applications.
Formation of the wide asynchronous binary asteroid population
International Nuclear Information System (INIS)
Jacobson, Seth A.; Scheeres, Daniel J.; McMahon, Jay
2014-01-01
We propose and analyze a new mechanism for the formation of the wide asynchronous binary population. These binary asteroids have wide semimajor axes relative to most near-Earth and main belt asteroid systems. Confirmed members have rapidly rotating primaries and satellites that are not tidally locked. Previously suggested formation mechanisms from impact ejecta, from planetary flybys, and directly from rotational fission events cannot satisfy all of the observations. The newly hypothesized mechanism works as follows: (1) these systems are formed from rotational fission, (2) their satellites are tidally locked, (3) their orbits are expanded by the binary Yarkovsky-O'Keefe-Radzievskii-Paddack (BYORP) effect, (4) their satellites desynchronize as a result of the adiabatic invariance between the libration of the secondary and the mutual orbit, and (5) the secondary avoids resynchronization because of the YORP effect. This seemingly complex chain of events is a natural pathway for binaries with satellites that have particular shapes, which define the BYORP effect torque that acts on the system. After detailing the theory, we analyze each of the wide asynchronous binary members and candidates to assess their most likely formation mechanism. Finally, we suggest possible future observations to check and constrain our hypothesis.
Pedagogical dimensions of effective online asynchronous teacher communication in higher education
Smits, A.; Voogt, J.; Rutledge, D.; Slykhuis, D.
2015-01-01
In this research teacher behaviour in online asynchronous discussions is studied. To this end teachers’ online messages were analyzed and correlated to measures of student satisfaction. Findings show a positive relation between student satisfaction and the presence of content knowledge, multiple
DEFF Research Database (Denmark)
Zhao, Bo; Blanke, Mogens; Skjetne, Roger
2012-01-01
This paper presents a fault tolerant navigation system for a remotely operated vehicle (ROV). The navigation system uses hydro-acoustic position reference (HPR) and Doppler velocity log (DVL) measurements to achieve an integrated navigation. The fault tolerant functionality is based on a modified...... particle filter. This particle filter is able to run in an asynchronous manner to accommodate the measurement dropout problem, and it overcomes the measurement outliers by switching observation models. Simulations with experimental data show that this fault tolerant navigation system can accurately estimate...
Parallel field line and stream line tracing algorithms for space physics applications
Toth, G.; de Zeeuw, D.; Monostori, G.
2004-05-01
Field line and stream line tracing is required in various space physics applications, such as the coupling of the global magnetosphere and inner magnetosphere models, the coupling of the solar energetic particle and heliosphere models, or the modeling of comets, where the multispecies chemical equations are solved along stream lines of a steady state solution obtained with a single fluid MHD model. Tracing a vector field is an inherently serial process, which is difficult to parallelize. This is especially true when the data corresponding to the vector field is distributed over a large number of processors. We designed algorithms for the various applications, which scale well to a large number of processors. In the first algorithm the computational domain is divided into blocks. Each block is on a single processor. The algorithm follows the vector field inside the blocks, and calculates a mapping of the block surfaces. The blocks communicate the values at the coinciding surfaces, and the results are interpolated. Finally all block surfaces are defined and values inside the blocks are obtained. In the second algorithm all processors start integrating along the vector field inside the accessible volume. When the field line leaves the local subdomain, the position and other information is stored in a buffer. Periodically the processors exchange the buffers, and continue integration of the field lines until they reach a boundary. At that point the results are sent back to the originating processor. Efficiency is achieved by a careful phasing of computation and communication. In the third algorithm the results of a steady state simulation are stored on a hard drive. The vector field is contained in blocks. All processors read in all the grid and vector field data and the stream lines are integrated in parallel. If a stream line enters a block, which has already been integrated, the results can be interpolated. By a clever ordering of the blocks the execution speed can be
Students' Learning in Asynchronous Discussion Forums: A Meta-Analysis
Martono, Fkipuntan; Salam, Urai
2017-01-01
Asynchronous discussion forums are among the most preferred tools chosen to foster learning opportunities and knowledge construction. To reveal the cognitive engagement evidenced in the transcripts of the discussion forums, this study presents 51 papers. 17 papers reported research on students' attitude toward the use of ICT for learning, 16…
DEFF Research Database (Denmark)
Krøigård, Anne Bruun; Larsen, Martin Jakob; Lænkholm, Anne Vibeke
2015-01-01
Evolution of the breast cancer genome from pre-invasive stages to asynchronous metastasis is complex and mostly unexplored, but highly demanded as it may provide novel markers for and mechanistic insights in cancer progression. The increasing use of personalized therapy of breast cancer necessita...... progression from one breast cancer patient, including two different regions of Ductal Carcinoma In Situ (DCIS), primary tumor and an asynchronous metastasis. We identify a remarkable landscape of somatic mutations, retained throughout breast cancer progression and with new mutational events emerging at each...
Costa, M; Letheren, M; Djidi, K; Gustafsson, L; Lazraq, T; Minerskjold, M; Tenhunen, H; Manabe, A; Nomachi, M; Watase, Y
2002-01-01
RD31: The project is evaluating a new approach to event building for level-two and level-three processor farms at high rate experiments. It is based on the use of commercial switching fabrics to replace the traditional bus-based architectures used in most previous data acquisition systems. Switching fabrics permit the construction of parallel, expandable, hardware-driven event builders that can deliver higher aggregate throughput than the bus-based architectures. A standard industrial switching fabric technology is being evaluated. It is based on Asynchronous Transfer Mode (ATM) packet-switching network technology. Commercial, expandable ATM switching fabrics and processor interfaces, now being developed for the future Broadband ISDN infrastructure, could form the basis of an implementation. The goals of the project are to demonstrate the viability of this approach, to evaluate the trade-offs involved in make versus buy options, to study the interfacing of the physics frontend data buffers to such a fabric, a...
International Nuclear Information System (INIS)
Procassini, R J; Beck, B R
2004-01-01
It might be assumed that use of a "high-quality" random number generator (RNG), producing a sequence of "pseudo random" numbers with a "long" repetition period, is crucial for producing unbiased results in Monte Carlo particle transport simulations. While several theoretical and empirical tests have been devised to check the quality (randomness and period) of an RNG, for many applications it is not clear what level of RNG quality is required to produce unbiased results. This paper explores the issue of RNG quality in the context of parallel, Monte Carlo transport simulations in order to determine how "good" is "good enough". This study employs the MERCURY Monte Carlo code, which incorporates the CNPRNG library for the generation of pseudo-random numbers via linear congruential generator (LCG) algorithms. The paper outlines the usage of random numbers during parallel MERCURY simulations, and then describes the source and criticality transport simulations which comprise the empirical basis of this study. A series of calculations for each test problem in which the quality of the RNG (period of the LCG) is varied provides the empirical basis for determining the minimum repetition period which may be employed without producing a bias in the mean integrated results
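To make the notion of an LCG's repetition period concrete, the sketch below steps a linear congruential recurrence, state = (a * state + c) mod m, and measures its cycle length by brute force. It is a toy illustration only: the function name and the tiny parameters are assumptions chosen so the period can be counted directly, and nothing here reflects the CNPRNG library's interface or the generators used in MERCURY.

```python
def lcg_period(seed, a, c, m):
    """Return the cycle length of the LCG state = (a * state + c) % m."""
    seen = {}            # state -> step at which it first appeared
    state = seed % m
    step = 0
    while state not in seen:
        seen[state] = step
        state = (a * state + c) % m
        step += 1
    # The first repeated state closes the cycle; its length is the period.
    return step - seen[state]


# Toy parameters with modulus 16 (hypothetical, for illustration only).
full = lcg_period(seed=1, a=5, c=3, m=16)   # mixed LCG, reaches all 16 states
short = lcg_period(seed=1, a=5, c=0, m=16)  # multiplicative LCG, shorter cycle
```

With the same modulus, the choice of multiplier and increment alone changes the period from 16 to 4, which is the kind of quality knob the study varies when asking how short a period can be before results become biased.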
Wind generator based on cascade connection of two asynchronized synchronous machines
International Nuclear Information System (INIS)
Dzhagarov, N.; Dzhagarova, Yu.
2000-01-01
A model of a wind generator with two asynchronized synchronous machines is presented, and different regimes are investigated. The analysis shows that the suggested brushless generator scheme works and has more advantages (reliability, ease of operation) than the known ones
An Asynchronous Time-Division-Multiplexed Network-on-Chip for Real-Time Systems
DEFF Research Database (Denmark)
Kasapaki, Evangelia
is an important part of the T-CREST platform and used in a number of configurations. The flexible timing organization of Argo combines asynchronous routers with mesochronous NIs, which are connected to individually clocked cores, supporting a GALS system organization. The mesochronous NIs operate at the same......Multi-processor architectures using networks-on-chip (NOCs) for communication are becoming the standard approach in the development of embedded systems and general purpose platforms. Typically, multi-processor platforms follow a globally asynchronous locally synchronous (GALS) timing organization....... This thesis focuses on the design of Argo, a NOC targeted at hard real-time multi-processor platforms with a GALS timing organization. To support real-time communication, NOCs establish end-to-end connections and provide latency and throughput guarantees for these connections. Argo uses time division...
Fluka studies of the Asynchronous Beam Dump Effects on LHC Point 6
Versaci, R; Goddard, B; Mereghetti, A; Schmidt, R; Vlachoudis, V; CERN. Geneva. ATS Department
2011-01-01
The LHC is a record-breaking machine for beam energy and intensity. An intense effort has therefore been deployed in simulating critical operational scenarios of energy deposition. Using FLUKA Monte Carlo simulations, we have investigated the effects of an asynchronous beam dump at the LHC Point 6 where beams, with a stored energy of 360 MJ, can instantaneously release up to a few J cm^-3 in the cryogenic magnets which have a quench limit of the order of the mJ cm^-3. In the present paper we will describe the simulation approach, and discuss the evaluated maximum energy release onto the superconducting magnets during an asynchronous beam dump. We will then analyze the shielding provided by collimators installed in the area and discuss safety limits for the operation of the LHC.
International Nuclear Information System (INIS)
Byers, J.A.; Williams, T.J.; Cohen, B.I.; Dimits, A.M.
1994-01-01
One of the programs of the Magnetic Fusion Energy (MFE) Theory and Computations Program is studying the anomalous transport of thermal energy across the field lines in the core of a tokamak. We use the method of gyrokinetic particle-in-cell simulation in this study. For this LDRD project we employed massively parallel processing, new algorithms, and new formal techniques to improve this research. Specifically, we sought to take steps toward: researching experimentally-relevant parameters in our simulations, learning parallel computing to have as a resource for our group, and achieving a 100x speedup over our starting-point Cray2 simulation code's performance.
An Asynchronous Multi-Sensor Micro Control Unit for Wireless Body Sensor Networks (WBSNs)
Directory of Open Access Journals (Sweden)
Ching-Hsing Luo
2011-07-01
In this work, an asynchronous multi-sensor micro control unit (MCU) core is proposed for wireless body sensor networks (WBSNs). It consists of asynchronous interfaces, a power management unit, a multi-sensor controller, a data encoder (DE), and an error correct coder (ECC). To improve the system performance and expansion abilities, the asynchronous interface is created for handshaking between the different clock domains of the ADC and RF and the MCU. To increase the use time of the WBSN system, a power management technique is developed for reducing power consumption. In addition, the multi-sensor controller is designed for detecting various biomedical signals. To prevent loss error from wireless transmission, use of an error correct coding technique is important in biomedical applications. The data encoder is added for lossless compression of various biomedical signals with a compression ratio of almost three. This design is successfully tested on an FPGA board. The VLSI architecture of this work contains 2.68-K gate counts and consumes 496 μW of power at a 133-MHz processing rate using the TSMC 0.13-μm CMOS process. Compared with previous techniques, this work offers higher performance, more functions, and lower hardware cost than other micro controller designs.
Miller, Steven P.; Whalen, Mike W.; O'Brien, Dan; Heimdahl, Mats P.; Joshi, Anjali
2005-01-01
Recent advances in model checking have made it practical to formally verify the correctness of many complex synchronous systems (i.e., systems driven by a single clock). However, many computer systems are implemented by asynchronously composing several synchronous components, where each component has its own clock and these clocks are not synchronized. Formal verification of such Globally Asynchronous/Locally Synchronous (GA/LS) architectures is a much more difficult task. In this report, we describe a methodology for developing and reasoning about such systems. This approach allows a developer to start from an ideal system specification and refine it along two axes. Along one axis, the system can be refined one component at a time towards an implementation. Along the other axis, the behavior of the system can be relaxed to produce a more cost effective but still acceptable solution. We illustrate this process by applying it to the synchronization logic of a Dual Flight Guidance System, evolving the system from an ideal case in which the components do not fail and communicate synchronously to one in which the components can fail and communicate asynchronously. For each step, we show how the system requirements have to change if the system is to be implemented and prove that each implementation meets the revised system requirements through model checking.
A Parallel Modular Biomimetic Cilia Sorting Platform
Directory of Open Access Journals (Sweden)
James G. H. Whiting
2018-03-01
The aquatic unicellular organism Paramecium caudatum uses cilia to swim around its environment and to graze on food particles and bacteria. Paramecia use waves of ciliary beating for locomotion, intake of food particles and sensing. There is some evidence that Paramecia pre-sort food particles by discarding larger particles, but intake the particles matching their mouth cavity. Most prior attempts to mimic cilia-based manipulation merely mimicked the overall action rather than the beating of cilia. The majority of massive-parallel actuators are controlled by a central computer; however, a distributed control would be far more true-to-life. We propose and test a distributed parallel cilia platform where each actuating unit is autonomous, yet exchanging information with its closest neighboring units. The units are arranged in a hexagonal array. Each unit is a tileable circuit board, with a microprocessor, color-based object sensor and servo-actuated biomimetic cilia actuator. Localized synchronous communication between cilia allowed for the emergence of coordinated action, moving different colored objects together. The coordinated beating action was capable of moving objects up to 4 cm/s at its highest beating frequency; however, objects were moved at a speed proportional to the beat frequency. Using the local communication, we were able to detect the shape of objects and rotating an object using edge detection was performed; however, lateral manipulation using shape information was unsuccessful.
An Asynchronous P300 BCI With SSVEP-Based Control State Detection
DEFF Research Database (Denmark)
Panicker, Rajesh C.; Puthusserypady, Sadasivan; Sun, Ying
2011-01-01
In this paper, an asynchronous brain–computer interface (BCI) system combining the P300 and steady-state visually evoked potentials (SSVEPs) paradigms is proposed. The information transfer is accomplished using P300 event-related potential paradigm and the control state (CS) detection is achieved...
SU-F-SPS-09: Parallel MC Kernel Calculations for VMAT Plan Improvement
International Nuclear Information System (INIS)
Chamberlain, S; French, S; Nazareth, D
2016-01-01
Purpose: Adding kernels (small perturbations in leaf positions) to the existing apertures of VMAT control points may improve plan quality. We investigate the calculation of kernel doses using a parallelized Monte Carlo (MC) method. Methods: A clinical prostate VMAT DICOM plan was exported from Eclipse. An arbitrary control point and leaf were chosen, and a modified MLC file was created, corresponding to the leaf position offset by 0.5 cm. The additional dose produced by this 0.5 cm × 0.5 cm kernel was calculated using the DOSXYZnrc component module of BEAMnrc. A range of particle history counts were run (varying from 3 × 10^6 to 3 × 10^7); each job was split among 1, 10, or 100 parallel processes. A particle count of 3 × 10^6 was established as the lower range because it provided the minimal accuracy level. Results: As expected, an increase in particle counts linearly increases run time. For the lowest particle count, the time varied from 30 hours for the single-processor run, to 0.30 hours for the 100-processor run. Conclusion: Parallel processing of MC calculations in the EGS framework significantly decreases time necessary for each kernel dose calculation. Particle counts lower than 1 × 10^6 have too large of an error to output accurate dose for a Monte Carlo kernel calculation. Future work will investigate increasing the number of parallel processes and optimizing run times for multiple kernel calculations.
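The run times quoted in the Results can be turned into the usual parallel speedup and efficiency figures. The sketch below uses only the two times stated in the abstract (30 hours on 1 process, 0.30 hours on 100 processes); the function names are illustrative, and the 10-process case is left out because no time is reported for it.

```python
def speedup(t_serial, t_parallel):
    """Parallel speedup: serial run time divided by parallel run time."""
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, n_procs):
    """Parallel efficiency: speedup per process (1.0 means ideal scaling)."""
    return speedup(t_serial, t_parallel) / n_procs


# Times in hours for the lowest particle count, as reported in the abstract.
s = speedup(30.0, 0.30)
e = efficiency(30.0, 0.30, 100)
print(f"speedup ~ {s:.1f}, efficiency ~ {e:.2f}")
```

A speedup of about 100 on 100 processes corresponds to near-ideal scaling, which is what one would expect when independent particle histories are split across processes with little communication.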
Parallel programming practical aspects, models and current limitations
Tarkov, Mikhail S
2014-01-01
Parallel programming is designed for the use of parallel computer systems for solving time-consuming problems that cannot be solved on a sequential computer in a reasonable time. These problems can be divided into two classes: (1) processing large data arrays (including processing images and signals in real time) and (2) simulation of complex physical processes and chemical reactions. For each of these classes, prospective methods are designed for solving problems. For data processing, one of the most promising technologies is the use of artificial neural networks. The particle-in-cell method and cellular automata are very useful for simulation. Problems of scalability of parallel algorithms and the transfer of existing parallel programs to future parallel computers are very acute now. An important task is to optimize the use of the equipment (including the CPU cache) of parallel computers. Along with parallelizing information processing, it is essential to ensure the processing reliability by the relevant organization ...
Advanced computers and Monte Carlo
International Nuclear Information System (INIS)
Jordan, T.L.
1979-01-01
High-performance parallelism that is currently available is synchronous in nature. It is manifested in such architectures as Burroughs ILLIAC-IV, CDC STAR-100, TI ASC, CRI CRAY-1, ICL DAP, and many special-purpose array processors designed for signal processing. This form of parallelism has apparently not been of significant value to many important Monte Carlo calculations. Nevertheless, there is much asynchronous parallelism in many of these calculations. A model of a production code that requires up to 20 hours per problem on a CDC 7600 is studied for suitability on some asynchronous architectures that are on the drawing board. The code is described and some of its properties and resource requirements are identified to compare with corresponding properties and resources of some asynchronous multiprocessor architectures. Arguments are made for programmer aids and special syntax to identify and support important asynchronous parallelism. 2 figures, 5 tables
Moradi, Saber; Qiao, Ning; Stefanini, Fabio; Indiveri, Giacomo
2018-02-01
Neuromorphic computing systems comprise networks of neurons that use asynchronous events for both computation and communication. This type of representation offers several advantages in terms of bandwidth and power consumption in neuromorphic electronic systems. However, managing the traffic of asynchronous events in large scale systems is a daunting task, both in terms of circuit complexity and memory requirements. Here, we present a novel routing methodology that employs both hierarchical and mesh routing strategies and combines heterogeneous memory structures for minimizing both memory requirements and latency, while maximizing programming flexibility to support a wide range of event-based neural network architectures, through parameter configuration. We validated the proposed scheme in a prototype multicore neuromorphic processor chip that employs hybrid analog/digital circuits for emulating synapse and neuron dynamics together with asynchronous digital circuits for managing the address-event traffic. We present a theoretical analysis of the proposed connectivity scheme, describe the methods and circuits used to implement such scheme, and characterize the prototype chip. Finally, we demonstrate the use of the neuromorphic processor with a convolutional neural network for the real-time classification of visual symbols being flashed to a dynamic vision sensor (DVS) at high speed.
Multivariable control of the three-phase squirrel-cage asynchronous motor ...
African Journals Online (AJOL)
AKA BOKO
Correspondence, e-mail: rabenarivo.michel@yahoo.fr. Abstract. The control of the three-phase asynchronous motor with ... synthesis of the system using MATLAB software. Keywords: control, multivariable system, variation of ... of the system by MATLAB software. Keywords: control, MIMO system, frequency variation, ...
Parallel transport of long mean-free-path plasma along open magnetic field lines: Parallel heat flux
International Nuclear Information System (INIS)
Guo Zehua; Tang Xianzhu
2012-01-01
In a long mean-free-path plasma where temperature anisotropy can be sustained, the parallel heat flux has two components with one associated with the parallel thermal energy and the other the perpendicular thermal energy. Due to the large deviation of the distribution function from local Maxwellian in an open field line plasma with low collisionality, the conventional perturbative calculation of the parallel heat flux closure in its local or non-local form is no longer applicable. Here, a non-perturbative calculation is presented for a collisionless plasma in a two-dimensional flux expander bounded by absorbing walls. Specifically, closures of previously unfamiliar form are obtained for ions and electrons, which relate two distinct components of the species parallel heat flux to the lower order fluid moments such as density, parallel flow, parallel and perpendicular temperatures, and the field quantities such as the magnetic field strength and the electrostatic potential. The plasma source and boundary condition at the absorbing wall enter explicitly in the closure calculation. Although the closure calculation does not take into account wave-particle interactions, the results based on passing orbits from steady-state collisionless drift-kinetic equation show remarkable agreement with fully kinetic-Maxwell simulations. As an example of the physical implications of the theory, the parallel heat flux closures are found to predict a surprising observation in the kinetic-Maxwell simulation of the 2D magnetic flux expander problem, where the parallel heat flux of the parallel thermal energy flows from low to high parallel temperature region.
Forced synchronization and asynchronous quenching in a thermo-acoustic system
Mondal, Sirshendu; Pawar, Samadhan A.; Sujith, Raman
2017-11-01
Forced synchronization, which has been extensively studied in theory and experiments, occurs through two different mechanisms known as phase locking and asynchronous quenching. The latter indicates the suppression of oscillation amplitude. In most practical combustion systems such as gas turbine engines, the main concern is high amplitude pressure oscillations, known as thermo-acoustic instability. Thermo-acoustic instability is undesirable and needs to be suppressed because of its damaging consequences to an engine. In the present study, a systematic experimental investigation of forced synchronization is performed in a prototypical thermo-acoustic system, a Rijke tube, in its limit cycle operation. Further, we show a qualitatively similar behavior using a reduced order model. In the phase locking region, the simultaneous occurrence of synchronization and resonant amplification leads to high amplitude pressure oscillations. However, a reduction in the amplitude of natural oscillations by about 78% of the unforced amplitude is observed when the forcing frequency is far lower than the natural frequency. This shows the possibility of suppression of the oscillation amplitude through asynchronous quenching in thermo-acoustic systems.
A task parallel implementation of fast multipole methods
Taura, Kenjiro
2012-11-01
This paper describes a task parallel implementation of ExaFMM, an open source implementation of fast multipole methods (FMM), using a lightweight task parallel library MassiveThreads. Although there have been many attempts on parallelizing FMM, experiences have almost exclusively been limited to formulation based on flat homogeneous parallel loops. FMM in fact contains operations that cannot be readily expressed in such conventional but restrictive models. We show that task parallelism, or parallel recursions in particular, allows us to parallelize all operations of FMM naturally and scalably. Moreover it allows us to parallelize a "mutual interaction" for force/potential evaluation, which is roughly twice as efficient as a more conventional, unidirectional force/potential evaluation. The net result is an open source FMM that is clearly among the fastest single node implementations, including those on GPUs; with a million particles on a 32-core Sandy Bridge 2.20 GHz node, it completes a single time step including tree construction and force/potential evaluation in 65 milliseconds. The study clearly showcases both programmability and performance benefits of flexible parallel constructs over more monolithic parallel loops. © 2012 IEEE.
Directory of Open Access Journals (Sweden)
Axel J. Fenwick
2014-01-01
Cranial visceral afferents contained within the solitary tract (ST) contact second-order neurons in the nucleus of the solitary tract (NTS) and release the excitatory amino acid glutamate via three distinct exocytosis pathways: synchronous, asynchronous, and spontaneous release. The presence of TRPV1 in the central terminals of a majority of ST afferents conveys activity-dependent asynchronous glutamate release and provides a temperature sensitive calcium conductance which largely determines the rate of spontaneous vesicle fusion. TRPV1 is present in unmyelinated C-fiber afferents and these facilitated forms of glutamate release may underlie the relative strength of C-fibers in activating autonomic reflex pathways. However, pharmacological blockade of TRPV1 signaling eliminates only ~50% of the asynchronous profile and attenuates the temperature sensitivity of spontaneous release, indicating that additional thermosensitive calcium influx pathways may exist which mediate these forms of vesicle release. In the present study we isolate the contribution of TRPV1 independent forms of glutamate release at ST-NTS synapses. We found that ST afferent innervation at NTS neurons and synchronous vesicle release from TRPV1 KO mice were not different from those of control animals; however, only half of TRPV1 KO ST afferents completely lacked asynchronous glutamate release. Further, temperature driven spontaneous rates of vesicle release were not different from 33 °C to 37 °C between control and TRPV1 KO afferents. These findings suggest additional temperature dependent mechanisms controlling asynchronous and thermosensitive spontaneous release at physiological temperatures, possibly mediated by additional thermosensitive TRP channels in primary afferent terminals.
Asynchronous social search as a single point of access to information
Buijs, M.P.; Spruit, M.
2017-01-01
The purpose of this paper is to present asynchronous social search as a novel and intuitive approach to search for information in which people collaborate to find the information they are looking for. Design/methodology/approach A prototype was built to test the feasibility in a business
Adaptive control of an asynchronous machine (Commande adaptive d'une machine asynchrone)
Slama-Belkhodja, I.; de Fornel, B.
1996-06-01
The paper deals with an indirect self-tuning speed control for an induction motor supplied by a chopper-filter-inverter system. Input/output models are identified with the recursive least squares algorithm and the controller adaptation is based on a pole assignment strategy. Emphasis is put on the evaluation of the parameter identification in order to avoid instabilities because of disturbances or insufficient excitations. This is especially of importance when the adaptive control is carried out in closed loop systems and without additional test signals. Simulation results show the improvement of the dynamic responses and the robustness against load variations or parameter variations (rotor resistance, inertia). This article describes an indirect adaptive pole-placement (PP) control strategy applied to the speed control of an asynchronous machine fed by a chopper-filter-voltage-inverter assembly. The recursive least squares (RLS) algorithm is used to identify input/output behavioral models. Particular attention is paid to the implementation of this algorithm and to the discussion of its results, taking into account modeling errors and the poorly exciting nature of the process inputs. Various transient regimes were simulated to assess the contribution of this combination (RLS-PP): start-ups and reversals of the direction of rotation, at no load and under load, application of load torque steps, and parametric variations. The results illustrate, in terms of both performance and robustness, the contribution of such adaptive control to electric drives with an asynchronous machine.
Asynchronous data change notification between database server and accelerator control systems
International Nuclear Information System (INIS)
Wenge Fu; Seth Nemesure; Morris, J.
2012-01-01
Database data change notification (DCN) is a commonly used feature that allows a client to be informed when data has been changed on the server side by another client. Not all database management systems (DBMSs) provide an explicit DCN mechanism. Even for those DBMSs which support DCN (such as Oracle and MS SQL Server), some server-side and/or client-side programming may be required to make the DCN system work. This makes the setup of DCN between a database server and interested clients tedious and time consuming. In accelerator control systems, there are many well established software client/server architectures (such as CDEV, EPICS, and ADO) that can be used to implement data reflection servers that transfer data asynchronously to any client using the standard SET/GET API. This paper describes a method for using such a data reflection server to set up asynchronous DCN (ADCN) between a DBMS and clients. This method works well for all DBMSs which provide database trigger functionality. (authors)
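The trigger-based idea can be sketched in a few lines. The example below is only an illustration under stated assumptions: it uses SQLite from the Python standard library in place of the DBMSs named in the abstract, the table names, the `ReflectionServer` class and its polling loop are all hypothetical, and polling a change log stands in for whatever push mechanism the authors' reflection server actually uses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE settings (name TEXT PRIMARY KEY, value REAL);
CREATE TABLE change_log (id INTEGER PRIMARY KEY AUTOINCREMENT,
                         name TEXT, new_value REAL);
-- The trigger records every update so that changes can be relayed later.
CREATE TRIGGER settings_changed AFTER UPDATE ON settings
BEGIN
    INSERT INTO change_log (name, new_value) VALUES (NEW.name, NEW.value);
END;
""")


class ReflectionServer:
    """Hypothetical relay: forwards logged changes to subscribed callbacks."""

    def __init__(self, conn):
        self.conn = conn
        self.last_seen = 0        # id of the last change already forwarded
        self.subscribers = []

    def subscribe(self, callback):
        self.subscribers.append(callback)

    def poll(self):
        """Forward changes logged since the previous poll; return their count."""
        rows = self.conn.execute(
            "SELECT id, name, new_value FROM change_log "
            "WHERE id > ? ORDER BY id", (self.last_seen,)).fetchall()
        for row_id, name, value in rows:
            self.last_seen = row_id
            for callback in self.subscribers:
                callback(name, value)
        return len(rows)


received = []
server = ReflectionServer(conn)
server.subscribe(lambda name, value: received.append((name, value)))

conn.execute("INSERT INTO settings VALUES ('magnet_current', 1.0)")
conn.execute("UPDATE settings SET value = 2.5 WHERE name = 'magnet_current'")
server.poll()
print(received)
```

The design point this illustrates is that the database only needs trigger support; the notification fan-out lives entirely in the relay, so clients never talk to the DBMS notification machinery directly.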
Directory of Open Access Journals (Sweden)
Romanenko N. G.
2017-10-01
The application of virtual laboratories makes it possible to show different transient processes, as well as to carry out experiments that are very expensive in real electrical machine labs, for example, calculating the energy costs in electric drives. Models of an unregulated asynchronous drive and a frequency-regulated asynchronous electric drive are examined in this article. The author has calculated and compared the energy losses of these systems with various types of loads, which makes it possible to evaluate the working processes of many technical devices.
Application of Artificial Intelligence Techniques for the Control of the Asynchronous Machine
Directory of Open Access Journals (Sweden)
F. Khammar
2016-01-01
The induction machine has enjoyed growing success over the past two decades, gradually replacing DC and synchronous machines in many industrial applications. This paper is devoted to the study of advanced methods applied to the control of the asynchronous machine in order to obtain a high-performance control system. While the criteria for response time, overshoot, and static error can be met by conventional control techniques, the criterion of robustness remains a challenge for researchers. This criterion can be satisfied only by applying advanced control techniques. After mathematical modeling of the asynchronous machine, the paper defines control strategies based on the orientation of the rotor flux. The results of the different simulation tests highlight the robustness properties of the proposed algorithms and allow the different control strategies to be compared.
Asynchronous Sensor fuSion for Improved Safety of air Traffic (ASSIST), Phase I
National Aeronautics and Space Administration — SSCI proposes to develop, implement and test a collision detection system for unmanned aerial vehicles (UAV), referred to as the Asynchronous Sensor fuSion for...
High-Fidelity RF Gun Simulations with the Parallel 3D Finite Element Particle-In-Cell Code Pic3P
Energy Technology Data Exchange (ETDEWEB)
Candel, A; Kabel, A.; Lee, L.; Li, Z.; Limborg, C.; Ng, C.; Schussman, G.; Ko, K.; /SLAC
2009-06-19
SLAC's Advanced Computations Department (ACD) has developed the first parallel Finite Element 3D Particle-In-Cell (PIC) code, Pic3P, for simulations of RF guns and other space-charge dominated beam-cavity interactions. Pic3P solves the complete set of Maxwell-Lorentz equations and thus includes space charge, retardation and wakefield effects from first principles. Pic3P uses higher-order Finite Element methods on unstructured conformal meshes. A novel scheme for causal adaptive refinement and dynamic load balancing enables unprecedented simulation accuracy, aiding the design and operation of the next generation of accelerator facilities. Application to the Linac Coherent Light Source (LCLS) RF gun is presented.
Teleoperation system using Asynchronous transfer mode, ATM network
International Nuclear Information System (INIS)
Mohd Dani Baba; A Nasoruddin Mohamad
1999-01-01
This paper examines the application of Asynchronous Transfer Mode (ATM) in a distributed industrial environment such as teleoperation, which performs real-time control manipulation from a remote location. In our study, two models of teleoperation are proposed; the first model is a point-to-point connection and the second model is through an ATM network. The performance results are analysed via simulation, using a commercial software design tool, to determine whether the two models can support teleoperation traffic. (Author)
Alqadoumi, Omar Mohamed
2012-01-01
Previous studies in the field of e-tutoring dealt either with asynchronous tutoring or synchronous conferencing as modes for providing e-tutoring services to English learners. This qualitative research study reports the experiences of Arab ESL tutees with both asynchronous tutoring and synchronous conferencing. It also reports the experiences of…
International Nuclear Information System (INIS)
Anderson, D.V.; Shumaker, D.E.
1993-01-01
From a computational standpoint, particle simulation calculations for plasmas have not adapted well to the transitions from scalar to vector processing nor from serial to parallel environments. They have suffered from inordinate and excessive accessing of computer memory and have been hobbled by relatively inefficient gather-scatter constructs resulting from the use of indirect indexing. Lastly, the many-to-one mapping characteristic of the deposition phase has made it difficult to perform this in parallel. The authors' code sorts and reorders the particles in a spatial order. This allows them to greatly reduce the memory references, to run in directly indexed vector mode, and to employ domain decomposition to achieve parallelization. In this hybrid simulation the electrons are modeled as a fluid and the field equations solved are obtained from the electron momentum equation together with the pre-Maxwell equations (displacement current neglected). Either zero or finite electron mass can be used in the electron model. The resulting field equations are solved with an iteratively explicit procedure which is thus trivial to parallelize. Likewise, the field interpolations and the particle pushing are simple to parallelize. The deposition, sorting, and reordering phases are less simple and it is for these that the authors present detailed algorithms. They have now successfully tested the parallel version of HOPS in serial mode and it is now being readied for parallel execution on the Cray C-90. They will then port HOPS to a massively parallel computer in the next year.
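Why spatial sorting helps the deposition phase can be shown with a small sketch. This is not the HOPS algorithm, whose details the abstract does not give; it is a one-dimensional, nearest-grid-point illustration with hypothetical function names. Once particles of the same cell are contiguous, each cell sums its own slice, so no two cells write to the same location and the many-to-one scatter disappears.

```python
def sort_particles_by_cell(positions, weights, dx, n_cells):
    """Reorder particles so that particles in the same cell are contiguous."""
    cells = [min(int(x / dx), n_cells - 1) for x in positions]
    counts = [0] * n_cells
    for c in cells:
        counts[c] += 1
    # starts[c] is the index of the first particle of cell c after sorting.
    starts = [0] * (n_cells + 1)
    for c in range(n_cells):
        starts[c + 1] = starts[c] + counts[c]
    cursor = starts[:-1].copy()
    sorted_pos = [0.0] * len(positions)
    sorted_w = [0.0] * len(positions)
    for x, w, c in zip(positions, weights, cells):
        sorted_pos[cursor[c]] = x
        sorted_w[cursor[c]] = w
        cursor[c] += 1
    return sorted_pos, sorted_w, starts


def deposit(sorted_w, starts, n_cells):
    """Nearest-grid-point deposition: each cell sums its contiguous slice."""
    return [sum(sorted_w[starts[c]:starts[c + 1]]) for c in range(n_cells)]


positions = [0.1, 2.7, 1.2, 0.9, 2.1]
weights = [1.0, 1.0, 2.0, 1.0, 3.0]
pos, w, starts = sort_particles_by_cell(positions, weights, dx=1.0, n_cells=3)
density = deposit(w, starts, n_cells=3)
print(starts, density)
```

Because every cell reads only its own directly indexed slice, the per-cell sums are independent and could be distributed across processors by domain decomposition, which is the property the abstract attributes to sorting.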
Directory of Open Access Journals (Sweden)
R. I. Mustafayev
2012-01-01
The paper presents a methodology for mathematical modeling of a power system (or part of it) operated jointly with wind power plants (stations) that contain asynchronous doubly-fed machines used as generators. The essence and advantage of the methodology is that it makes it possible to efficiently couple the equations of doubly-fed asynchronous machines, written in axes that rotate at the machine rotor speed, with the equations of the external electric power system, written in synchronously rotating axes.
Flipping the Online Classroom with Web 2.0: The Asynchronous Workshop
Cummings, Lance
2016-01-01
This article examines how Web 2.0 technologies can be used to "flip" the online classroom by creating asynchronous workshops in social environments where immediacy and social presence can be maximized. Using experience teaching several communication and writing classes in Google Apps (Google+, Google Hangouts, Google Drive, etc.), I…
Reactors: A data-oriented synchronous/asynchronous programming model for distributed applications
DEFF Research Database (Denmark)
Field, John; Marinescu, Maria-Cristina; Stefansen, Christian Oskar Erik
2009-01-01
of messages. Similarly, the interface to a reactor is simply its state, rather than a collection of message channels, ports, or methods. One novel feature of our model is the ability to compose behaviors both synchronously and asynchronously. Also, our use of Datalog-style rules allows aspect-like composition...
Strang, Kenneth
2013-01-01
Cooperative learning was applied in a graduate project management course to compare the effectiveness of asynchronous versus synchronous online team meetings. An experiment was constructed to allocate students to project teams while ensuring there was a balance of requisite skills, namely systems analysis and design along with HTML/Javascript…
Defining the Symmetry of the Universal Semi-Regular Autonomous Asynchronous Systems
Directory of Open Access Journals (Sweden)
Serban E. Vlad
2012-02-01
The regular autonomous asynchronous systems are the non-deterministic Boolean dynamical systems and universality means the greatest in the sense of the inclusion. The paper gives four definitions of symmetry of these systems in a slightly more general framework, called semi-regularity, and also many examples.
Bashashati, Ali; Mason, Steve; Ward, Rabab K.; Birch, Gary E.
2006-06-01
The low-frequency asynchronous switch design (LF-ASD) has been introduced as a direct brain interface (BI) for asynchronous control applications. Asynchronous interfaces, as opposed to synchronous interfaces, have the advantage of being operational at all times and not only at specific system-defined periods. This paper modifies the LF-ASD design by incorporating into the system more knowledge about the attempted movements. Specifically, the history of feature values extracted from the EEG signal is used to detect a right index finger movement attempt. Using data collected from individuals with high-level spinal cord injuries and able-bodied subjects, it is shown that the error characteristics of the modified design are significantly better than the previous LF-ASD design. The true positive rate increased by up to 15 percentage points, which corresponds to a 50% improvement, when the system is operating with false positive rates in the 1-2% range.
International Nuclear Information System (INIS)
Jejcic, A.; Maillard, J.; Maurel, G.; Silva, J.; Wolff-Bacha, F.
1997-01-01
Work in the field of parallel processing has developed as a set of research activities using several numerical Monte Carlo simulations related to basic or applied current problems of nuclear and particle physics. For the applications using the GEANT code, development or improvement work was done on the parts simulating low-energy physical phenomena such as radiation, transport and interaction. The problem of actinide burning by means of accelerators was approached using a simulation with the GEANT code. A program for neutron tracking in the range of low energies up to the thermal region has been developed. It is coupled to the GEANT code and permits, in a single pass, the simulation of a hybrid reactor core receiving a proton burst. Other work in this field refers to simulations for nuclear medicine applications, for instance the development of biological probes, the evaluation and characterization of gamma cameras (collimators, crystal thickness), and the method for dosimetric calculations. In particular, these calculations are suited to a geometrical parallelization approach especially adapted to parallel machines of the TN310 type. Other work mentioned in the same field refers to the simulation of electron channelling in crystals and of the beam-beam interaction effect in colliders. The GEANT code was also used to simulate the operation of germanium detectors designed for monitoring natural and artificial radioactivity in the environment.
Determining sociability, social space, and social presence in (a)synchronous collaborating groups
Kreijns, C.J.; Kirschner, P.A.; Jochems, W.M.G.; Buuren, van H.
2004-01-01
The effectiveness of group learning in asynchronous distributed learning groups depends on the social interaction that takes place. This social interaction affects both cognitive and socioemotional processes that take place during learning, group forming, establishment of group structures, and group
Determining sociability, social space, and social presence in (A)synchronous collaborative groups
Kreijns, K.; Kirschner, P.A.; Jochems, W.; Buuren, H. van
2004-01-01
The effectiveness of group learning in asynchronous distributed learning groups depends on the social interaction that takes place. This social interaction affects both cognitive and socioemotional processes that take place during learning, group forming, establishment of group structures, and group
Energy Technology Data Exchange (ETDEWEB)
Nishioka, K.; Nakamura, Y. [Graduate School of Energy Science, Kyoto University, Gokasho, Uji, Kyoto 611-0011 (Japan); Nishimura, S. [National Institute for Fusion Science, 322-6 Oroshi-cho, Toki, Gifu 509-5292 (Japan); Lee, H. Y. [Korea Advanced Institute of Science and Technology, Daejeon 305-701 (Korea, Republic of); Kobayashi, S.; Mizuuchi, T.; Nagasaki, K.; Okada, H.; Minami, T.; Kado, S.; Yamamoto, S.; Ohshima, S.; Konoshima, S.; Sano, F. [Institute of Advanced Energy, Kyoto University, Gokasho, Uji, Kyoto 611-0011 (Japan)
2016-03-15
A moment approach to calculate neoclassical transport in non-axisymmetric torus plasmas composed of multiple ion species is extended to include the external parallel momentum sources due to unbalanced tangential neutral beam injections (NBIs). The momentum sources that are included in the parallel momentum balance are calculated from the collision operators of background particles with fast ions. This method is applied to clarify the physical mechanism of the neoclassical parallel ion flows and the multi-ion species effect on them in Heliotron J NBI plasmas. It is found that the parallel ion flow can be determined by the balance between the parallel viscosity and the external momentum source in the region where the external source is much larger than the thermodynamic force driven source in collisional plasmas. This is because the friction between C⁶⁺ and D⁺ prevents a large difference between the C⁶⁺ and D⁺ flow velocities in such plasmas. The C⁶⁺ flow velocities, which are measured by the charge exchange recombination spectroscopy system, are numerically evaluated with this method. It is shown that the experimentally measured C⁶⁺ impurity flow velocities do not clearly contradict the neoclassical estimations, and the dependence of the parallel flow velocities on the magnetic field ripples is consistent in both results.
Parallel beam dynamics simulation of linear accelerators
International Nuclear Information System (INIS)
Qiang, Ji; Ryne, Robert D.
2002-01-01
In this paper we describe parallel particle-in-cell methods for the large scale simulation of beam dynamics in linear accelerators. These techniques have been implemented in the IMPACT (Integrated Map and Particle Accelerator Tracking) code. IMPACT is being used to study the behavior of intense charged particle beams and as a tool for the design of next-generation linear accelerators. As examples, we present applications of the code to the study of emittance exchange in high intensity beams and to the study of beam transport in a proposed accelerator for the development of accelerator-driven waste transmutation technologies
MacDonald, Penny; Garcia-Carbonell, Amparo; Carot, Sierra, Jose Miguel
2013-01-01
This study focuses on the computer-aided analysis of interlanguage errors made by the participants in the telematic simulation IDEELS (Intercultural Dynamics in European Education through on-Line Simulation). The synchronous and asynchronous communication analysed was part of the MiLC Corpus, a multilingual learner corpus of texts written by…
An Examination of Computer Engineering Students' Perceptions about Asynchronous Discussion Forums
Ozyurt, Ozcan; Ozyurt, Hacer
2013-01-01
This study was conducted in order to reveal the usage profiles and perceptions of Asynchronous Discussion Forums (ADFs) of 126 computer engineering students from the Computer Engineering Department in a university in Turkey. By using a mixed methods research design both quantitative and qualitative data were collected and analyzed. Research…
Information criteria for quantifying loss of reversibility in parallelized KMC
Energy Technology Data Exchange (ETDEWEB)
Gourgoulias, Konstantinos, E-mail: gourgoul@math.umass.edu; Katsoulakis, Markos A., E-mail: markos@math.umass.edu; Rey-Bellet, Luc, E-mail: luc@math.umass.edu
2017-01-01
Parallel Kinetic Monte Carlo (KMC) is a potent tool to simulate stochastic particle systems efficiently. However, despite literature on quantifying domain decomposition errors of the particle system for this class of algorithms in the short and in the long time regime, no study yet explores and quantifies the loss of time-reversibility in Parallel KMC. Inspired by concepts from non-equilibrium statistical mechanics, we propose the entropy production per unit time, or entropy production rate, given in terms of an observable and a corresponding estimator, as a metric that quantifies the loss of reversibility. Typically, this is a quantity that cannot be computed explicitly for Parallel KMC, which is why we develop a posteriori estimators that have good scaling properties with respect to the size of the system. Through these estimators, we can connect the different parameters of the scheme, such as the communication time step of the parallelization, the choice of the domain decomposition, and the computational schedule, with its performance in controlling the loss of reversibility. From this point of view, the entropy production rate can be seen both as an information criterion to compare the reversibility of different parallel schemes and as a tool to diagnose reversibility issues with a particular scheme. As a demonstration, we use Sandia Lab's SPPARKS software to compare different parallelization schemes and different domain (lattice) decompositions.
Simulation of the Dynamic Behavior of an Asynchronous Machine Using Direct Self-Control
Directory of Open Access Journals (Sweden)
Cristian Paul Chioncel
2007-01-01
The paper presents the major steps that must be followed to implement the mathematical model of the asynchronous machine (ASM) in Scilab/Scicos. The implemented ASM model will be used to check the dynamic behavior of the system, the current diagrams, and the behavior of the motor speed and torque when the input signal has a pulsed form. The implementation is made in the Scilab/Scicos environment, a clone of MATLAB that provides number-crunching power similar to MATLAB at a much better cost/performance ratio. The implemented model makes it possible to analyze the behavior of the asynchronous machine in different dynamic situations (speed, torque, and current in the motor or generator regime) and to study its behavior under different possible control schemes.
Asynchronous Gossip-Based Gradient-Free Method for Multiagent Optimization
Deming Yuan
2014-01-01
This paper considers the constrained multiagent optimization problem. The objective function of the problem is a sum of convex functions, each of which is known by a specific agent only. For solving this problem, we propose an asynchronous distributed method that is based on gradient-free oracles and gossip algorithm. In contrast to the existing work, we do not require that agents be capable of computing the subgradients of their objective functions and coordinating their...
The queueing perspective of asynchronous network coding in two-way relay network
Liang, Yaping; Chang, Qing; Li, Xianxu
2018-04-01
Asynchronous network coding (NC) has the potential to improve wireless network performance compared with routing or synchronous network coding. Recent research concentrates on the trade-off between throughput/energy consumption and delay with a pair of independent input flows. However, the implementation of NC requires a thorough investigation of its impact on the relevant queueing systems, on which little work has focused. Moreover, few works study the probability density function (pdf) in the network coding scenario. In this paper, the scenario with two independent Poisson input flows and one output flow is considered. In the asynchronous NC-based strategy, a new arrival evicts the head packet that is held in its queue while waiting for a packet from the other flow to encode with. The pdf of the output flow, which contains both coded and uncoded packets, is derived. In addition, the statistical characteristics of this strategy are analyzed. These results are verified by numerical simulations.
Directory of Open Access Journals (Sweden)
Jens P Weber
Synchronization of neurotransmitter release with the presynaptic action potential is essential for maintaining fidelity of information transfer in the central nervous system. However, synchronous release is frequently accompanied by an asynchronous release component that builds up during repetitive stimulation, and can even play a dominant role in some synapses. Here, we show that substitution of SNAP-23 for SNAP-25 in mouse autaptic glutamatergic hippocampal neurons results in asynchronous release and a higher frequency of spontaneous release events (mEPSCs). Use of neurons from double-knock-out (SNAP-25, synaptotagmin-7) mice in combination with viral transduction showed that SNAP-23-driven release is triggered by endogenous synaptotagmin-7. In the absence of synaptotagmin-7, release became even more asynchronous, and the spontaneous release rate increased even more, indicating that synaptotagmin-7 acts to synchronize release and suppress spontaneous release. However, compared to synaptotagmin-1, synaptotagmin-7 is both a leaky and an asynchronous calcium sensor. In the presence of SNAP-25, consequences of the elimination of synaptotagmin-7 were small or absent, indicating that the protein pairs SNAP-25/synaptotagmin-1 and SNAP-23/synaptotagmin-7 might act as mutually exclusive calcium sensors. Expression of fusion proteins between pHluorin (pH-sensitive GFP) and synaptotagmin-1 or -7 showed that vesicles that fuse using the SNAP-23/synaptotagmin-7 combination contained synaptotagmin-1, while synaptotagmin-7 barely displayed activity-dependent trafficking between vesicle and plasma membrane, implying that it acts as a plasma membrane calcium sensor. Overall, these findings support the idea of alternative syt:SNARE combinations driving release with different kinetics and fidelity.
A tomograph VMEbus parallel processing data acquisition system
International Nuclear Information System (INIS)
Atkins, M.S.; Wilkinson, N.A.; Rogers, J.G.
1988-11-01
This paper describes a VME-based data acquisition system suitable for the development of Positron Volume Imaging tomographs, which use 3-D data for improved image resolution over slice-oriented tomographs. The data acquisition must be flexible enough to accommodate several 3-D reconstruction algorithms; hence, a software-based system is most suitable. Furthermore, because of the increased dimensions and resolution of volume imaging tomographs, the raw data event rate is greater than that of slice-oriented machines. These dual requirements are met by our data acquisition system. Flexibility is achieved through an array of processors connected over a VMEbus, operating asynchronously and in parallel. High raw data throughput is achieved using a dedicated high-speed data transfer device available for the VMEbus. The device can attain a raw data rate of 2.5 million coincidence events per second for raw events which are 64 bits wide. Real-time data acquisition and pre-processing requirements can be met by about forty 20 MHz Motorola 68020/68881 processors.
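As a quick check of what the quoted figures imply, the following sketch converts the event rate and event width from the abstract into a sustained byte rate. The per-processor figure assumes the load is spread evenly over forty processors, which the abstract does not state; the function and constant names are illustrative only.

```python
# Figures quoted in the abstract above.
EVENTS_PER_SECOND = 2.5e6  # coincidence events per second
BITS_PER_EVENT = 64        # width of one raw event
PROCESSORS = 40            # "about forty" processors (even split is an assumption)


def raw_data_rate_bytes_per_s(events_per_s, bits_per_event):
    """Return the sustained raw data rate in bytes per second."""
    return events_per_s * bits_per_event / 8


rate = raw_data_rate_bytes_per_s(EVENTS_PER_SECOND, BITS_PER_EVENT)
events_per_processor = EVENTS_PER_SECOND / PROCESSORS

print(f"{rate / 1e6:.0f} MB/s")                      # 20 MB/s
print(f"{events_per_processor:.0f} events/s per processor")  # 62500 events/s per processor
```

In other words, the transfer device would need to sustain roughly 20 MB/s, and under the even-split assumption each processor would handle about 62,500 events per second.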
Asynchronous Task-Based Polar Decomposition on Manycore Architectures
Sukkari, Dalal
2016-10-25
This paper introduces the first asynchronous, task-based implementation of the polar decomposition on manycore architectures. Based on a new formulation of the iterative QR dynamically-weighted Halley algorithm (QDWH) for the calculation of the polar decomposition, the proposed implementation replaces the original and hostile LU factorization for the condition number estimator by the more adequate QR factorization to enable software portability across various architectures. Relying on fine-grained computations, the novel task-based implementation is also capable of taking advantage of the identity structure of the matrix involved during the QDWH iterations, which decreases the overall algorithmic complexity. Furthermore, the artifactual synchronization points have been severely weakened compared to previous implementations, unveiling look-ahead opportunities for better hardware occupancy. The overall QDWH-based polar decomposition can then be represented as a directed acyclic graph (DAG), where nodes represent computational tasks and edges define the inter-task data dependencies. The StarPU dynamic runtime system is employed to traverse the DAG, to track the various data dependencies and to asynchronously schedule the computational tasks on the underlying hardware resources, resulting in an out-of-order task scheduling. Benchmarking experiments show significant improvements against existing state-of-the-art high performance implementations (i.e., Intel MKL and Elemental) for the polar decomposition on latest shared-memory vendors' systems (i.e., Intel Haswell/Broadwell/Knights Landing, NVIDIA K80/P100 GPUs and IBM Power8), while maintaining high numerical accuracy.
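The DAG-driven, out-of-order scheduling described above can be illustrated with a minimal sketch. It does not use StarPU or reproduce the authors' implementation: a Python thread pool stands in for the runtime, and the task names (scale, qr_top, qr_bottom, update) are hypothetical placeholders rather than the actual QDWH kernels.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


def run_dag(tasks, deps, max_workers=4):
    """Run the callables in `tasks`, honouring `deps` (name -> prerequisite names).

    A task is submitted as soon as all of its prerequisites have finished,
    so independent tasks may complete out of order.
    Returns the task names in completion order.
    """
    remaining = {name: set(deps.get(name, ())) for name in tasks}
    running = {}   # future -> task name
    finished = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while remaining or running:
            # Submit every task whose dependencies are all satisfied.
            ready = [name for name, pending in remaining.items() if not pending]
            for name in ready:
                del remaining[name]
                running[pool.submit(tasks[name])] = name
            if not running:
                raise ValueError("unsatisfiable dependencies (cycle or unknown task)")
            done, _ = wait(list(running), return_when=FIRST_COMPLETED)
            for future in done:
                name = running.pop(future)
                future.result()  # re-raise any exception from the task
                finished.append(name)
                for pending in remaining.values():
                    pending.discard(name)
    return finished


# Hypothetical four-task graph: two independent tasks between a source and a sink.
tasks = {name: (lambda: None) for name in ("scale", "qr_top", "qr_bottom", "update")}
deps = {"qr_top": {"scale"}, "qr_bottom": {"scale"}, "update": {"qr_top", "qr_bottom"}}
order = run_dag(tasks, deps)
print(order)
```

The two middle tasks may finish in either order, which is the point of dependency-driven scheduling: only the edges of the graph constrain execution, not the order in which the tasks were written down.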
Parallel ray tracing for one-dimensional discrete ordinate computations
International Nuclear Information System (INIS)
Jarvis, R.D.; Nelson, P.
1996-01-01
The ray-tracing sweep in discrete-ordinates, spatially discrete numerical approximation methods applied to the linear, steady-state, plane-parallel, mono-energetic, azimuthally symmetric, neutral-particle transport equation can be reduced to a parallel prefix computation. In so doing, the often severe penalty in convergence rate of the source iteration, suffered by most current parallel algorithms using spatial domain decomposition, can be avoided while attaining parallelism in the spatial domain to whatever extent desired. In addition, the reduction implies parallel algorithm complexity limits for the ray-tracing sweep. The reduction applies to all closed, linear, one-cell functional (CLOF) spatial approximation methods, which encompasses most in current popular use. Scalability test results of an implementation of the algorithm on a 64-node nCube-2S hypercube-connected, message-passing, multi-computer are described. (author)
Lu, San; Artemyev, A. V.; Angelopoulos, V.
2017-11-01
Magnetotail current sheet thinning is a distinctive feature of substorm growth phase, during which magnetic energy is stored in the magnetospheric lobes. Investigation of charged particle dynamics in such thinning current sheets is believed to be important for understanding the substorm energy storage and the current sheet destabilization responsible for substorm expansion phase onset. We use Time History of Events and Macroscale Interactions during Substorms (THEMIS) B and C observations in 2008 and 2009 at 18 - 25 RE to show that during magnetotail current sheet thinning, the electron temperature decreases (cooling), and the parallel temperature decreases faster than the perpendicular temperature, leading to a decrease of the initially strong electron temperature anisotropy (isotropization). This isotropization cannot be explained by pure adiabatic cooling or by pitch angle scattering. We use test particle simulations to explore the mechanism responsible for the cooling and isotropization. We find that during the thinning, a fast decrease of a parallel electric field (directed toward the Earth) can speed up the electron parallel cooling, causing it to exceed the rate of perpendicular cooling, and thus lead to isotropization, consistent with observation. If the parallel electric field is too small or does not change fast enough, the electron parallel cooling is slower than the perpendicular cooling, so the parallel electron anisotropy grows, contrary to observation. The same isotropization can also be accomplished by an increasing parallel electric field directed toward the equatorial plane. Our study reveals the existence of a large-scale parallel electric field, which plays an important role in magnetotail particle dynamics during the current sheet thinning process.
Pteros 2.0: Evolution of the fast parallel molecular analysis library for C++ and python.
Yesylevskyy, Semen O
2015-07-15
Pteros is a high-performance open-source library for molecular modeling and analysis of molecular dynamics trajectories. Starting from version 2.0, Pteros is available for the C++ and Python programming languages with very similar interfaces. This makes it suitable for writing complex reusable programs in C++ and simple interactive scripts in Python alike. The new version improves the facilities for asynchronous trajectory reading and parallel execution of analysis tasks by introducing analysis plugins, which can be written in either C++ or Python in a completely uniform way. The high level of abstraction provided by analysis plugins greatly simplifies prototyping and implementation of complex analysis algorithms. Pteros is available for free under the Artistic License from http://sourceforge.net/projects/pteros/. © 2015 Wiley Periodicals, Inc.
Neutron particle injection device
International Nuclear Information System (INIS)
Hashimoto, Kiyoshi.
1997-01-01
Plasma particles are used as target particles for converting ions to neutral particles by a charge exchange reaction in a neutralization cell, and the neutralization cell is disposed adjacent to the drawing electrodes. In addition, a magnetic field generation means is provided for generating magnetic field lines substantially parallel to the most downstream drawing electrode in the direction of ion travel. The intensity of the electric field between the most downstream drawing electrode and the nearest upstream electrode is made smaller than the intensity of the electric fields between the other electrodes. Since magnetic field lines substantially parallel to the most downstream drawing electrode are generated, the ions are prevented from being accelerated in the direction opposite to their direction of travel, which further enhances the neutralization efficiency of the neutralization cell. As a result, the electrode structure of the NBI (Neutral particle Beam Injector) can be simplified and the power source for preventing acceleration of neutral particles can be omitted. (N.H.)
Beyond Social Presence: Facelessness and the Ethics of Asynchronous Online Education
Rose, Ellen
2017-01-01
In this position paper, I argue that a focus on achieving and increasing social presence in online courses tends to derail a consideration of the ethical implications and dimensions of the essential facelessness of asynchronous education. Drawing upon the work of Emmanuel Levinas and Nel Noddings, who contended that the face is the basis of…
Asynchronous monitoring of the quality of multilevel optical PAM signals
Siuzdak, J.
2017-08-01
The paper analyzes a signal quality assessment method based on delay-tap asynchronous sampling, for both binary and multilevel PAM signals. The multilevel phase diagrams obtained are far more complicated than the binary ones. The phase diagrams are affected by signal distortions, but it is difficult to relate the form of the phase diagram reliably to the distortion type and its influence on the signal quality.
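To make the idea of delay-tap asynchronous sampling concrete, the sketch below collects sample pairs (x(t), x(t + delay)) at a rate unrelated to the symbol clock; a scatter plot of such pairs is the phase diagram. It assumes an ideal rectangular (undistorted) PAM waveform and arbitrary values for the sampling period and tap delay, none of which come from the paper.

```python
import random


def nrz_pam(symbols, t):
    """Ideal rectangular PAM waveform: level of the symbol active at time t.

    Time is measured in symbol periods.
    """
    return symbols[int(t)]


def delay_tap_pairs(symbols, sample_period, tap_delay):
    """Collect (x(t), x(t + tap_delay)) pairs at a rate unrelated to the symbol clock."""
    pairs = []
    t = 0.0
    while t + tap_delay < len(symbols):
        pairs.append((nrz_pam(symbols, t), nrz_pam(symbols, t + tap_delay)))
        t += sample_period
    return pairs


rng = random.Random(0)
levels = [-3, -1, 1, 3]                      # PAM-4 amplitude levels
symbols = [rng.choice(levels) for _ in range(200)]

# Assumed values: sampling period of 0.731 symbol periods, tap delay of half a symbol.
pairs = delay_tap_pairs(symbols, sample_period=0.731, tap_delay=0.5)
distinct_points = sorted(set(pairs))
print(len(pairs), "pairs,", len(distinct_points), "distinct points in the phase diagram")
```

For an ideal binary signal the diagram has at most 4 distinct points, whereas PAM-4 allows up to 16, which hints at why the multilevel diagrams are harder to interpret once distortion smears the points.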
Energy Technology Data Exchange (ETDEWEB)
Watanabe, Hideo; Kawai, Wataru; Nemoto, Toshiyuki [Fujitsu Ltd., Tokyo (Japan); and others
1997-12-01
Several computer codes in the nuclear field have been vectorized, parallelized and ported to the FUJITSU VPP500 system at the Center for Promotion of Computational Science and Engineering of the Japan Atomic Energy Research Institute. These results are reported in 3 parts, i.e., the vectorization part, the parallelization part and the porting part. This report describes the parallelization. The parallelization part describes the parallelization of the 2-dimensional relativistic electromagnetic particle code EM2D, the cylindrical direct numerical simulation code CYLDNS, and DGR, a molecular dynamics code for simulating radiation damage in diamond crystals. The vectorization part describes the vectorization of the two- and three-dimensional discrete ordinates simulation code DORT-TORT, the gas dynamics analysis code FLOWGR and the relativistic Boltzmann-Uehling-Uhlenbeck simulation code RBUU. The porting part describes the porting of the reactor safety analysis codes RELAP5/MOD3.2 and RELAP5/MOD3.2.1.2, the nuclear data processing system NJOY and the 2-D multigroup discrete ordinate transport code TWOTRAN-II. A survey for the porting of the command-driven interactive data analysis plotting program IPLOT is also described. (author)
Asynchronous data-driven classification of weapon systems
International Nuclear Information System (INIS)
Jin, Xin; Mukherjee, Kushal; Gupta, Shalabh; Ray, Asok; Phoha, Shashi; Damarla, Thyagaraju
2009-01-01
This communication addresses real-time weapon classification by analysis of asynchronous acoustic data, collected from microphones on a sensor network. The weapon classification algorithm consists of two parts: (i) feature extraction from time-series data using symbolic dynamic filtering (SDF), and (ii) pattern classification based on the extracted features using the language measure (LM) and support vector machine (SVM). The proposed algorithm has been tested on field data, generated by firing of two types of rifles. The results of analysis demonstrate high accuracy and fast execution of the pattern classification algorithm with low memory requirements. Potential applications include simultaneous shooter localization and weapon classification with soldier-wearable networked sensors. (rapid communication)
Non-fragile switched H∞ control for morphing aircraft with asynchronous switching
Directory of Open Access Journals (Sweden)
Haoyu CHENG
2017-06-01
This paper deals with the problem of non-fragile linear parameter-varying (LPV) H∞ control for morphing aircraft with asynchronous switching. The switched LPV model of the morphing aircraft is established from the nonlinear model by the Jacobian linearization approach. Missing data, which follows a Bernoulli distribution, is taken into account in the link from sensors to controllers and in the link from controllers to actuators. The non-fragile switched LPV controllers are constructed with consideration of controller uncertainties and the asynchronous switching phenomenon. The parameter-dependent Lyapunov functional method and the mode-dependent average dwell time (MDADT) method are combined to guarantee the stability and prescribed performance of the system. Sufficient conditions for the solvability of the problem are derived in the form of linear matrix inequalities (LMIs). To make the design process more efficient, an algorithm is applied to divide the whole set into subsets automatically. Simulation results are provided to verify the effectiveness and superiority of the proposed method.
Energy Technology Data Exchange (ETDEWEB)
Amadio, G.; et al.
2017-11-22
An intensive R&D and programming effort is required to meet the new challenges posed by future experimental high-energy particle physics (HEP) programs. The GeantV project aims to narrow the gap between the performance of the existing HEP detector simulation software and the ideal performance achievable, exploiting the latest advances in computing technology. The project has developed a particle detector simulation prototype capable of transporting particles in parallel through complex geometries, exploiting instruction-level microparallelism (SIMD and SIMT), task-level parallelism (multithreading) and high-level parallelism (MPI), and leveraging both the multi-core and the many-core opportunities. We present preliminary verification results concerning the electromagnetic (EM) physics models developed for parallel computing architectures within the GeantV project. In order to exploit the potential of vectorization and accelerators and to make the physics model effectively parallelizable, advanced sampling techniques have been implemented and tested. In this paper we introduce a set of automated statistical tests in order to verify the vectorized models by checking their consistency with the corresponding Geant4 models and to validate them against experimental data.
From discrete-time models to continuous-time, asynchronous modeling of financial markets
Boer, Katalin; Kaymak, Uzay; Spiering, Jaap
2007-01-01
Most agent-based simulation models of financial markets are discrete-time in nature. In this paper, we investigate to what degree such models are extensible to continuous-time, asynchronous modeling of financial markets. We study the behavior of a learning market maker in a market with information
From Discrete-Time Models to Continuous-Time, Asynchronous Models of Financial Markets
K. Boer-Sorban (Katalin); U. Kaymak (Uzay); J. Spiering (Jaap)
2006-01-01
Most agent-based simulation models of financial markets are discrete-time in nature. In this paper, we investigate to what degree such models are extensible to continuous-time, asynchronous modelling of financial markets. We study the behaviour of a learning market maker in a market with
Cultural Influences on Chinese Students' Asynchronous Online Learning in a Canadian University
Zhao, Naxin; McDougall, Douglas
2008-01-01
This study explored six Chinese graduate students' asynchronous online learning in a large urban Canadian university. Individual interviews in Mandarin elicited their perceptions of online learning, their participation in it, and the cultural factors that influenced their experiences. In general, the participants had a positive attitude towards…
FLUKA Studies of the Asynchronous Beam Dump Effects on LHC Point 6
Versaci, R; Goddard, B; Schmidt, R; Vlachoudis, V; Mereghetti, A
2011-01-01
The LHC is a record-breaking machine for beam energy and intensity. An intense effort has therefore been devoted to simulating critical operational scenarios of energy deposition. FLUKA is the most widely used code for this kind of simulation at CERN because of the high reliability of its results and the ease of customizing detailed simulations along hundreds of meters of beam line. We have investigated the effects of an asynchronous beam dump on LHC Point 6, where beams with a stored energy of 360 MJ can instantaneously release up to a few J cm−3 in the cryogenic magnets, which have a quench limit of the order of mJ cm−3. In the present paper we describe the simulation approach and discuss the evaluated maximum energy release onto the superconducting magnets during an asynchronous beam dump. We then analyse the shielding provided by collimators installed in the area and discuss safety limits for the operation of the LHC.
Wu, Kaihua; Shao, Zhencheng; Chen, Nian; Wang, Wenjie
2018-01-01
The degree of wear of the wheel set tread is one of the main factors that influence the safety and stability of a running train. The geometrical parameters mainly include flange thickness and flange height. Line-structured laser light was projected onto the wheel tread surface, and the geometrical parameters can be deduced from the profile image. An online image acquisition system was designed based on the asynchronous reset of a CCD and a CUDA parallel processing unit. The image acquisition was carried out in hardware interrupt mode. A high-efficiency parallel segmentation algorithm based on CUDA was proposed. The algorithm first divides the image into smaller squares and then extracts the squares of the target by fusing the k-means and STING clustering image segmentation algorithms. The segmentation time is less than 0.97 ms. A considerable acceleration ratio compared with serial CPU calculation was obtained, which greatly improved the real-time image processing capacity. When the wheel set is running at a limited speed, the system placed along the railway line can measure the geometrical parameters automatically. The maximum measuring speed is 120 km/h.
Lari, L; Boccone, V; Bruce, R; Cerutti, F; Rossi, A; Vlachoudis, V; Mereghetti, A; Faus-Golfe, A
2012-01-01
Asynchronous beam aborts at the LHC are estimated to occur on average once per year. Accelerator physics studies of asynchronous dumps have been performed at different beam energies and beta-stars. The loss patterns are analyzed in order to identify the losses, in particular on the Phase 1 Tertiary Collimators (TCTs), since their tungsten-based active jaw insert has a lower damage threshold than the other, carbon-based LHC collimators. Settings of the tilt angle of the TCTs are discussed with the aim of reducing the thermal loads on the TCTs themselves.
Transition to Asynchronous Transfer Mode (ATM) an Implementation Model for NPS Software Metrics Lab
National Research Council Canada - National Science Library
Carney, Cameron
1999-01-01
With Asynchronous Transfer Mode (ATM), we are experiencing the emergence of a network technology that has the potential of satisfying the requirement for a worldwide standard to allow interoperability of information, regardless...
Investigation of a photo-voltaic pump station with asynchronous electric drive
International Nuclear Information System (INIS)
Dzhagarov, N.; Vladimirov, P.
2000-01-01
A scheme of a photo-voltaic pump station with constant current drive is presented. The requirements for reliability and minimal maintenance necessitate the use of asynchronous drive which has been studied. The studies of the system's model for various regimes show its adequacy. The model can be used for determination of the optimal conditions providing maximal working efficiency
International Nuclear Information System (INIS)
Mao, Xiaoan; Jaworski, Artur J
2010-01-01
This paper describes the development of the experimental setup and measurement methodologies to study the physics of oscillatory flows in the vicinity of parallel-plate stacks by using the particle image velocimetry (PIV) techniques. Parallel-plate configurations often appear as internal structures in thermoacoustic devices and are responsible for the hydrodynamic energy transfer processes. The flow around selected stack configurations is induced by a standing acoustic wave, whose amplitude can be varied. Depending on the direction of the flow within the acoustic cycle, relative to the stack, it can be treated as an entrance flow or a wake flow. The insight into the flow behaviour, its kinematics, dynamics and scales of turbulence, is obtained using the classical Reynolds decomposition to separate the instantaneous velocity fields into ensemble-averaged mean velocity fields and fluctuations in a set of predetermined phases within an oscillation cycle. The mean velocity field and the fluctuation intensity distributions are investigated over the acoustic oscillation cycle. The velocity fluctuation is further divided into large- and small-scale fluctuations by using fast Fourier transform (FFT) spatial filtering techniques
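The Reynolds decomposition mentioned above can be sketched in a few lines: for a set of velocity fields recorded at the same phase of the oscillation cycle, the ensemble average gives the mean field and the residuals give the fluctuations. The data layout (each field as a flat list of velocity values) and the sample numbers are assumptions made for illustration; the paper's actual PIV processing is not reproduced here.

```python
from math import sqrt


def reynolds_decompose(snapshots):
    """Split phase-locked velocity snapshots into mean and fluctuating parts.

    `snapshots` is a list of N fields recorded at the same phase, each a flat
    list of velocity values at the same grid points.
    Returns (mean_field, fluctuations, rms_intensity).
    """
    n = len(snapshots)
    size = len(snapshots[0])
    # Ensemble-averaged mean velocity at each grid point.
    mean = [sum(s[k] for s in snapshots) / n for k in range(size)]
    # Fluctuation of each snapshot about the ensemble mean.
    fluct = [[s[k] - mean[k] for k in range(size)] for s in snapshots]
    # Root-mean-square fluctuation intensity at each grid point.
    rms = [sqrt(sum(f[k] ** 2 for f in fluct) / n) for k in range(size)]
    return mean, fluct, rms


# Two hypothetical snapshots of a two-point velocity field at one phase.
snapshots = [[1.0, 2.0], [3.0, 2.0]]
mean, fluct, rms = reynolds_decompose(snapshots)
print(mean)   # [2.0, 2.0]
print(fluct)  # [[-1.0, 0.0], [1.0, 0.0]]
print(rms)    # [1.0, 0.0]
```

Repeating this for each predetermined phase yields the mean velocity fields and fluctuation intensity distributions over the acoustic cycle that the abstract refers to.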
Synchronous and Asynchronous Communication in Distance Learning: A Review of the Literature
Watts, Lynette
2016-01-01
Distance learning is commonplace in higher education, with increasing numbers of students enjoying the flexibility e-learning provides. Keeping students connected with peers and instructors has been a challenge with e-learning, but as technology has advanced, the methods by which educators keep students engaged, synchronously and asynchronously,…
International Nuclear Information System (INIS)
Gus'kov, B.N.; Kalinnikov, V.A.; Krastev, V.R.; Maksimov, A.N.; Nikityuk, N.M.
1985-01-01
This paper describes a high-speed parallel counter that contains 31 inputs and 15 outputs and is implemented by integrated circuits of series 500. The counter is designed for fast sampling of events according to the number of particles that pass simultaneously through the hodoscopic plane of the detector. The minimum delay of the output signals relative to the input is 43 nsec. The duration of the output signals can be varied from 75 to 120 nsec
Potential of the test particle in the magnetic field. I
International Nuclear Information System (INIS)
Sestak, B.
1980-01-01
The problem of the test particle potential in an external homogeneous magnetic field is solved in an unmagnetized plasma. It is shown that for the case when the parallel velocity component of the test particle is greater than the thermal velocity of the background particles, the potential is of a Coulomb character while for the case where the parallel velocity component is less than the thermal velocity the potential is of a Debye character. The Larmor radius of the test particle appears as an additional parameter in these potentials. (author)
Parallel processing at the SSC: The fact and the fiction
International Nuclear Information System (INIS)
Bourianoff, G.; Cole, B.
1991-10-01
Accurately modelling the behavior of particles circulating in accelerators is a computationally demanding task. The particle tracking code currently in use at the SSC is based upon a "thin element" analysis (TEAPOT). In this model each magnet in the lattice is described by a thin element at which the particle experiences an impulsive kick. Each kick requires approximately 200 floating point operations (FLOP). For the SSC collider lattice consisting of 10^4 elements, performing a tracking study for a set of 100 particles for 10^7 turns would require 2 x 10^15 FLOPs. Even on a machine capable of 100 MFLOP/sec (MFLOPS), this would require 2 x 10^7 seconds, and many such runs are necessary. It should be noted that the accuracy with which the kicks are to be calculated is important: the large number of iterations involved will magnify the effects of small errors. The inability of current computational resources to effectively perform the full calculation motivates the migration of this calculation to the most powerful computers available. A survey of the current research into new technologies for supercomputing reveals that the supercomputers of the future will be parallel in nature. Further, numerous such machines exist today, and are being used to solve other difficult problems. Thus it seems clear that it is not too early to begin building the capability to develop tracking codes for parallel architectures. This report discusses implementing parallel processing at the SSC
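The cost estimate in this abstract can be checked with a few lines of arithmetic. The following is a minimal sketch that uses only the figures quoted above; the constant and function names are illustrative and are not taken from TEAPOT.

```python
# Figures quoted in the abstract; all names here are illustrative.
FLOP_PER_KICK = 200          # floating point operations per thin-element kick
ELEMENTS = 10**4             # thin elements in the SSC collider lattice
PARTICLES = 100              # particles tracked in one study
TURNS = 10**7                # turns around the ring
MACHINE_RATE = 100 * 10**6   # 100 MFLOP/sec, expressed in FLOP per second


def tracking_cost(flop_per_kick, elements, particles, turns):
    """Total floating point operations for one tracking study."""
    return flop_per_kick * elements * particles * turns


total_flop = tracking_cost(FLOP_PER_KICK, ELEMENTS, PARTICLES, TURNS)
seconds = total_flop / MACHINE_RATE

print(f"total work: {total_flop:.1e} FLOP")
print(f"run time:   {seconds:.1e} s (about {seconds / 86400:.0f} days)")
```

At the quoted rate a single study occupies the machine for well over half a year, which is the motivation the abstract gives for moving to parallel architectures.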
A general concurrent algorithm for plasma particle-in-cell simulation codes
International Nuclear Information System (INIS)
Liewer, P.C.; Decyk, V.K.
1989-01-01
We have developed a new algorithm for implementing plasma particle-in-cell (PIC) simulation codes on concurrent processors with distributed memory. This algorithm, named the general concurrent PIC algorithm (GCPIC), has been used to implement an electrostatic PIC code on the 32-node JPL Mark III Hypercube parallel computer. To decompose a PIC code using the GCPIC algorithm, the physical domain of the particle simulation is divided into sub-domains, equal in number to the number of processors, such that all sub-domains have roughly equal numbers of particles. For problems with non-uniform particle densities, these sub-domains will be of unequal physical size. Each processor is assigned a sub-domain and is responsible for updating the particles in its sub-domain. This algorithm has led to a very efficient parallel implementation of a well-benchmarked 1-dimensional PIC code. The dominant portion of the code, updating the particle positions and velocities, is nearly 100% efficient when the number of particles is increased linearly with the number of hypercube processors used so that the number of particles per processor is constant. For example, the increase in time spent updating particles in going from a problem with 11,264 particles run on 1 processor to 360,448 particles on 32 processors was only 3% (parallel efficiency of 97%). Although implemented on a hypercube concurrent computer, this algorithm should also be efficient for PIC codes on other parallel architectures and for large PIC codes on sequential computers where part of the data must reside on external disks. Copyright 1989 Academic Press, Inc.
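The decomposition rule described above (equal particle counts per sub-domain, hence unequal physical widths when the density is non-uniform) and the quoted 97% efficiency can be illustrated with a short sketch. This is not the GCPIC code: the function names, the 1-D test density and the assumption that particle positions are distinct are all choices made here for illustration.

```python
def balanced_boundaries(positions, n_procs):
    """Boundaries of n_procs sub-domains holding roughly equal particle counts.

    Sub-domain k covers [bounds[k], bounds[k + 1]); the last one also
    includes its right end point. Assumes distinct positions.
    """
    ordered = sorted(positions)
    n = len(ordered)
    bounds = [ordered[0]]
    for k in range(1, n_procs):
        bounds.append(ordered[k * n // n_procs])
    bounds.append(ordered[-1])
    return bounds


def counts_per_subdomain(positions, bounds):
    """Number of particles owned by each sub-domain."""
    counts = [0] * (len(bounds) - 1)
    for x in positions:
        k = 0
        # advance while x lies at or beyond the next interior boundary
        while k < len(counts) - 1 and x >= bounds[k + 1]:
            k += 1
        counts[k] += 1
    return counts


def weak_scaling_efficiency(t_one, t_many):
    """Efficiency when the work per processor is held constant."""
    return t_one / t_many


# Illustrative non-uniform density: particles crowd towards x = 0.
n_particles, n_procs = 11264, 32
positions = [(i / n_particles) ** 2 for i in range(n_particles)]

bounds = balanced_boundaries(positions, n_procs)
counts = counts_per_subdomain(positions, bounds)
widths = [bounds[k + 1] - bounds[k] for k in range(n_procs)]

print("particles per sub-domain:", min(counts), "to", max(counts))
print("narrowest / widest sub-domain:", min(widths), max(widths))
# A 3% longer update time at constant particles per processor:
print(f"efficiency: {weak_scaling_efficiency(1.0, 1.03):.0%}")
```

Every sub-domain ends up with the same number of particles while the widths differ by more than an order of magnitude, which is the trade the abstract describes for non-uniform densities.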
DEFF Research Database (Denmark)
Hu, Hao; Laguardia Areal, Janaina; Mulvad, Hans Christian Hansen
2011-01-01
An asynchronous 10G Ethernet packet is synchronized and retimed to a master clock using a time lens. The NRZ packet is converted into an RZ packet and multiplexed with a serial 1.28 Tb/s signal.
Gerry, Shannon P; Ramsay, Jason B; Dean, Mason N; Wilga, Cheryl D
2008-08-01
Many studies of feeding behavior have implanted electrodes unilaterally (in muscles on only one side of the head) to determine the basic motor patterns of muscles controlling the jaws. However, bilateral implantation has the potential to achieve a more comprehensive understanding of modification of the motor activity that may be occurring between the left and right sides of the head. In particular, complex processing of prey is often characterized by bilaterally asynchronous and even unilateral activation of the jaw musculature. In this study, we bilaterally implant feeding muscles in species from four orders of elasmobranchs (Squaliformes, Orectolobiformes, Carcharhiniformes, Rajoidea) in order to characterize the effects of type of prey, feeding behavior, and phylogeny on the degree of asynchronous muscle activation. Electrodes were implanted in three of the jaw adductors, two divisions of the quadratomandibularis and the preorbitalis, as well as in a cranial elevator in sharks, the epaxialis. The asynchrony of feeding events (measured as the degree to which activity of members of a muscle pair is out of phase) was compared across species for capture versus processing and simple versus complex prey, then interpreted in the contexts of phylogeny, morphology, and ecology to clarify determinants of asynchronous activity. Whereas capture and processing of prey were characterized by statistically similar degrees of asynchrony for data pooled across species, events involving complex prey were more asynchronous than were those involving simple prey. The two trophic generalists, Squalus acanthias and Leucoraja erinacea, modulated the degree of asynchrony according to type of prey, whereas the two behavioral specialists, Chiloscyllium plagiosum and Mustelus canis, activated the cranial muscles synchronously regardless of type of prey. These differences in jaw muscle activity would not have been detected with unilateral implantation. Therefore, we advocate bilateral
Vectorization, parallelization and porting of nuclear codes (porting). Progress report fiscal 1998
International Nuclear Information System (INIS)
Nemoto, Toshiyuki; Kawai, Wataru; Ishizuki, Shigeru; Kawasaki, Nobuo; Kume, Etsuo; Adachi, Masaaki; Ogasawara, Shinobu
2000-03-01
Several computer codes in the nuclear field have been vectorized, parallelized and ported to the FUJITSU VPP500 system, the AP3000 system and the Paragon system at the Center for Promotion of Computational Science and Engineering of the Japan Atomic Energy Research Institute. We dealt with 12 codes in fiscal 1998. These results are reported in 3 parts, i.e., the vectorization and parallelization on vector processors part, the parallelization on scalar processors part and the porting part. In this report, we describe the porting. In this porting part, the porting of Monte Carlo N-Particle Transport code MCNP4B2 and Reactor Safety Analysis code RELAP5 to the AP3000 is described. In the vectorization and parallelization on vector processors part, the vectorization of General Tokamak Circuit Simulation Program code GTCSP, and the vectorization and parallelization of Molecular Dynamics Ntv Simulation code MSP2, Eddy Current Analysis code EDDYCAL, Thermal Analysis Code for Test of Passive Cooling System by HENDEL T2 code THANPACST2 and MHD Equilibrium code SELENEJ on the VPP500 are described. In the parallelization on scalar processors part, the parallelization of Monte Carlo N-Particle Transport code MCNP4B2, Plasma Hydrodynamics code using Cubic Interpolated propagation Method PHCIP and Vectorized Monte Carlo code (continuous energy model/multi-group model) MVP/GMVP on the Paragon is described. (author)
Long, Caryn L. Smith
This dissertation examines how various designs of asynchronous online courses for teacher professional development may impact science-teacher self-efficacy. Mayer's studies, providing the cognitive theory of multimedia learning, targeted designs of asynchronous online learning and the point where contributions of written, auditory, and visual information on these sites could cause cognitive overload (Mayer, 2005). With increasing usage of online resources for educators to gain teaching credits, understanding how to construct these professional development offerings is critical. Teacher self-efficacy can affect how well information from these courses relays to students in their classroom. This research explored the connection between online asynchronous professional development design and teacher self-efficacy through analysis of a physics-based course in three distinct course-design offerings, while collecting content-acquisition data and self-efficacy effects before and after participation. Results from this research showed that teacher self-efficacy had improved in all online treatments, which included a text-only version, a text-and-audio version, and a text, audio and animation version of the same physics content. Content knowledge was most affected by the text-only and text-and-audio treatments, with significant growth occurring in the remember, apply, and analyze levels of Bloom's taxonomy. Due to the small number of participants, it cannot be said that these results are conclusive.
Fang, L.; Yang, X. H.; Sun, B. Q.; Qin, W. J.; Kong, Y.
2013-09-01
The measurement of the inter-satellite link is one of the key techniques in the autonomous operation of a satellite navigation system. Based on the asynchronous inter-satellite two-way measurement mode in the GPS constellation, the reduction formula of the inter-satellite time synchronization is built in this paper. Moreover, a corrective method for the main systematic errors is proposed. Inter-satellite two-way time synchronization is simulated on the basis of the IGS (International GNSS Service) precise ephemeris. The impacts of the epoch domestication of the asynchronous inter-satellite link pseudo-range, the initial orbit, and the main systematic errors on satellite time synchronization are analyzed. Furthermore, the broadcast clock error of each satellite is calculated by the "centralized" inter-satellite autonomous time synchronization. Simulation results show that the epoch domestication of the asynchronous inter-satellite link pseudo-range and the initial orbit have little impact on the satellite clock errors, and thus they need not be taken into account. The errors caused by the relativistic effect and the asymmetry of path travel have a large impact on the satellite clock errors. These should be corrected with theoretical formulas. Compared with the IGS precise clock error, the root mean square of the broadcast clock error of each satellite is about 0.4 ns.
Asynchronous Execution of the Fast Multipole Method Using Charm++
AbdulJabbar, Mustafa; Yokota, Rio; Keyes, David
2014-01-01
Fast multipole methods (FMM) on distributed memory have traditionally used a bulk-synchronous model of communicating the local essential tree (LET) and overlapping it with computation of the local data. This could be perceived as an extreme case of data aggregation, where the whole LET is communicated at once. Charm++ allows a much finer control over the granularity of communication, and has an asynchronous execution model that fits well with the structure of our FMM code. Unlike previous ...
Hu, Guoqing; Pan, Yingling; Zhao, Xin; Yin, Siyao; Zhang, Meng; Zheng, Zheng
2017-12-01
The evolution from asynchronous to synchronous dual-wavelength pulse generation in a passively mode-locked fiber laser is experimentally investigated by tailoring the intracavity dispersion. Through tuning the intracavity-loss-dependent gain profile and the birefringence-induced filter effect, asynchronous dual-wavelength soliton pulses can be generated until the intracavity anomalous dispersion is reduced to ∼8 fs/nm. The transition from asynchronous to synchronous pulse generation is then observed at an elevated pump power in the presence of residual anomalous dispersion, and it is shown that pulses are temporally synchronized at the mode-locker in the cavity. Spectral sidelobes are observed and could be attributed to the four-wave-mixing effect between dual-wavelength pulses at the carbon nanotube mode-locker. These results could provide further insight into the design and realization of such dual-wavelength ultrafast lasers for different applications such as dual-comb metrology as well as better understanding of the inter-pulse interactions in such dual-comb lasers.
Current-voltage characteristic of parallel-plane ionization chamber with inhomogeneous ionization
International Nuclear Information System (INIS)
Stoyanov, D G
2007-01-01
The balances of particles and charges in the volume of parallel-plane ionization chamber are considered. Differential equations describing the distribution of current densities in the chamber volume are obtained. As a result of the differential equations solution an analytical form of the current-voltage characteristic of parallel-plane ionization chamber with inhomogeneous ionization in the volume is obtained
Current-voltage characteristic of parallel-plane ionization chamber with inhomogeneous ionization
Energy Technology Data Exchange (ETDEWEB)
Stoyanov, D G [Faculty of Engineering and Pedagogy in Sliven, Technical University of Sofia, 59, Bourgasko Shaussee Blvd, 8800 Sliven (Bulgaria)
2007-08-15
The balances of particles and charges in the volume of parallel-plane ionization chamber are considered. Differential equations describing the distribution of current densities in the chamber volume are obtained. As a result of the differential equations solution an analytical form of the current-voltage characteristic of parallel-plane ionization chamber with inhomogeneous ionization in the volume is obtained.
A numerical simulation of solar energetic particle dropouts during impulsive events
International Nuclear Information System (INIS)
Wang, Y.; Qin, G.; Zhang, M.; Dalla, S.
2014-01-01
This paper investigates the conditions for producing rapid variations of solar energetic particle (SEP) intensity commonly known as 'dropouts'. In particular, we use numerical model simulations based on solving the focused transport equation in the three-dimensional Parker interplanetary magnetic field to put constraints on the properties of particle transport coefficients in both directions perpendicular and parallel to the magnetic field. Our calculations of the temporal intensity profile of 0.5 and 5 MeV protons at the Earth show that the perpendicular diffusion must be small while the parallel mean free path is long in order to reproduce the phenomenon of SEP dropouts. When the parallel mean free path is a fraction of 1 AU and the observer is located at 1 AU, the perpendicular to parallel diffusion ratio must be below 10^-5 if we want to see the particle flux dropping by at least several times within 3 hr. When the observer is located at a larger solar radial distance, the perpendicular to parallel diffusion ratio for reproducing the dropouts should be even lower than that in the case of 1 AU distance. A shorter parallel mean free path or a larger radial distance from the source to observer will cause the particles to arrive later, making the effects of perpendicular diffusion more prominent and SEP dropouts disappear. All of these effects require the magnetic turbulence that resonates with the particles to be low everywhere in the inner heliosphere.
Brennan-Jones, Christopher G; Eikelboom, Robert H; Swanepoel, De Wet
2017-02-01
Introduction Standard criteria exist for diagnosing different types of hearing loss, yet audiologists interpret audiograms manually. This pilot study examined the feasibility of standardised interpretations of audiometry in a telehealth model of care. The aim of this study was to examine diagnostic accuracy of automated audiometry in adults with hearing loss in an asynchronous telehealth model using pre-defined diagnostic protocols. Materials and methods We recruited 42 study participants from a public audiology and otolaryngology clinic in Perth, Western Australia. Manual audiometry was performed by an audiologist either before or after automated audiometry. Diagnostic protocols were applied asynchronously for normal hearing, disabling hearing loss, conductive hearing loss and unilateral hearing loss. Sensitivity and specificity analyses were conducted using a two-by-two matrix and Cohen's kappa was used to measure agreement. Results The overall sensitivity for the diagnostic criteria was 0.88 (range: 0.86-1) and overall specificity was 0.93 (range: 0.86-0.97). Overall kappa ( k) agreement was 'substantial' k = 0.80 (95% confidence interval (CI) 0.70-0.89) and significant at p loss. This method has the potential to improve synchronous and asynchronous tele-audiology service delivery.
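The analysis named in this abstract (sensitivity and specificity from a two-by-two matrix, with Cohen's kappa for agreement) follows standard formulas, sketched below. The counts in the example are hypothetical and are not the study's data; the function name is also an assumption made for illustration.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity and Cohen's kappa from a two-by-two matrix.

    tp / fn: reference-positive cases the test called positive / negative.
    fp / tn: reference-negative cases the test called positive / negative.
    """
    total = tp + fp + fn + tn
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    observed = (tp + tn) / total
    # Agreement expected by chance, from the row and column totals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / total**2
    kappa = (observed - expected) / (1 - expected)
    return sensitivity, specificity, kappa


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
sens, spec, kappa = diagnostic_metrics(tp=40, fp=5, fn=10, tn=45)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}, kappa = {kappa:.2f}")
```

Kappa discounts the agreement that two raters would reach by chance, which is why it is reported alongside, rather than instead of, sensitivity and specificity.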
Otter, den A.F.H.J.; Emmitt, S.
2007-01-01
Purpose – Effective teams use a balance of synchronous and asynchronous communication. Team communication is dependent on the communication acts of team members and the ability of managers to facilitate, stimulate and motivate them. Team members from organizations using different information systems
De Oliveira, Luciana C.; Olesova, Larisa
2013-01-01
This study examined asynchronous online discussions in the online course "English Language Development" to identify themes related to participants' learning about the language and literacy development of English Language Learners when they facilitated online discussions to determine whether the participants developed sufficient…
de Jong, Catharina Carolina; Ros, Wynand Jg; Schrijvers, Guus
2014-01-16
In support of professional practice, asynchronous communication between the patient and the provider is implemented separately or in combination with Internet-based self-management interventions. This interaction occurs primarily through electronic messaging or discussion boards. There is little evidence as to whether it is a useful tool for chronically ill patients to support their self-management and increase the effectiveness of interventions. The aim of our study was to review the use and usability of patient-provider asynchronous communication for chronically ill patients and the effects of such communication on health behavior, health outcomes, and patient satisfaction. A literature search was performed using PubMed and Embase. The quality of the articles was appraised according to the National Institute for Health and Clinical Excellence (NICE) criteria. The use and usability of the asynchronous communication was analyzed by examining the frequency of use and the number of users of the interventions with asynchronous communication, as well as of separate electronic messaging. The effectiveness of asynchronous communication was analyzed by examining effects on health behavior, health outcomes, and patient satisfaction. Patients' knowledge concerning their chronic condition increased and they seemed to appreciate being able to communicate asynchronously with their providers. They not only had specific questions but also wanted to communicate about feeling ill. A decrease in visits to the physician was shown in two studies (P=.07, P=.07). Increases in self-management/self-efficacy for patients with back pain, dyspnea, and heart failure were found. Positive health outcomes were shown in 12 studies, where the clinical outcomes for diabetic patients (HbA1c level) and for asthmatic patients (forced expiratory volume [FEV]) improved. Physical symptoms improved in five studies. Five studies generated a variety of positive psychosocial outcomes. The effect of
Asynchronous Task-Based Polar Decomposition on Single Node Manycore Architectures
Sukkari, Dalal E.; Ltaief, Hatem; Faverge, Mathieu; Keyes, David E.
2017-01-01
This paper introduces the first asynchronous, task-based formulation of the polar decomposition and its corresponding implementation on manycore architectures. Based on a formulation of the iterative QR dynamically-weighted Halley algorithm (QDWH) for the calculation of the polar decomposition, the proposed implementation replaces the original LU factorization for the condition number estimator by the more adequate QR factorization to enable software portability across various architectures. Relying on fine-grained computations, the novel task-based implementation is capable of taking advantage of the identity structure of the matrix involved during the QDWH iterations, which decreases the overall algorithmic complexity. Furthermore, the artifactual synchronization points have been weakened compared to previous implementations, unveiling look-ahead opportunities for better hardware occupancy. The overall QDWH-based polar decomposition can then be represented as a directed acyclic graph (DAG), where nodes represent computational tasks and edges define the inter-task data dependencies. The StarPU dynamic runtime system is employed to traverse the DAG, to track the various data dependencies and to asynchronously schedule the computational tasks on the underlying hardware resources, resulting in an out-of-order task scheduling. Benchmarking experiments show significant improvements against existing state-of-the-art high performance implementations for the polar decomposition on latest shared-memory vendors' systems, while maintaining numerical accuracy.
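The idea of representing a computation as a DAG and letting a runtime launch each task as soon as its dependencies are satisfied can be sketched with the Python standard library. This is only an illustration of dependency-driven, out-of-order scheduling: it does not use StarPU, and the task names below are hypothetical placeholders rather than the actual QDWH kernels.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


def run_dag(tasks, deps, workers=4):
    """Run the callables in `tasks`, respecting `deps`.

    tasks: dict mapping task name -> zero-argument callable.
    deps:  dict mapping task name -> iterable of prerequisite task names.
    Returns the task names in the order in which they completed.
    """
    remaining = {name: set(deps.get(name, ())) for name in tasks}
    finished = []
    running = {}  # future -> task name
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while remaining or running:
            # Launch every task whose prerequisites have all completed.
            ready = [name for name, pending in remaining.items() if not pending]
            for name in ready:
                del remaining[name]
                running[pool.submit(tasks[name])] = name
            if not running:
                raise ValueError("dependency cycle detected")
            done, _ = wait(list(running), return_when=FIRST_COMPLETED)
            for future in done:
                name = running.pop(future)
                future.result()  # re-raise any exception from the task
                finished.append(name)
                for pending in remaining.values():
                    pending.discard(name)
    return finished


# Hypothetical four-task graph: one factorization, two independent
# updates that may run in either order, and a final combination step.
names = ("factorize", "update_left", "update_right", "combine")
tasks = {name: (lambda: None) for name in names}
deps = {
    "update_left": {"factorize"},
    "update_right": {"factorize"},
    "combine": {"update_left", "update_right"},
}
order = run_dag(tasks, deps)
print(order)
```

The two update tasks carry no mutual dependency, so their relative order in the output may vary from run to run; only the constraints encoded in the edges are guaranteed.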
Asynchronous Task-Based Polar Decomposition on Single Node Manycore Architectures
Sukkari, Dalal E.
2017-09-29
This paper introduces the first asynchronous, task-based formulation of the polar decomposition and its corresponding implementation on manycore architectures. Based on a formulation of the iterative QR dynamically-weighted Halley algorithm (QDWH) for the calculation of the polar decomposition, the proposed implementation replaces the original LU factorization for the condition number estimator by the more adequate QR factorization to enable software portability across various architectures. Relying on fine-grained computations, the novel task-based implementation is capable of taking advantage of the identity structure of the matrix involved during the QDWH iterations, which decreases the overall algorithmic complexity. Furthermore, the artifactual synchronization points have been weakened compared to previous implementations, unveiling look-ahead opportunities for better hardware occupancy. The overall QDWH-based polar decomposition can then be represented as a directed acyclic graph (DAG), where nodes represent computational tasks and edges define the inter-task data dependencies. The StarPU dynamic runtime system is employed to traverse the DAG, to track the various data dependencies and to asynchronously schedule the computational tasks on the underlying hardware resources, resulting in an out-of-order task scheduling. Benchmarking experiments show significant improvements against existing state-of-the-art high performance implementations for the polar decomposition on latest shared-memory vendors' systems, while maintaining numerical accuracy.
McGhee, Rosie M. Hector
This research is a correlational study of the relationship among the independent variables: asynchronous interaction, online technologies self-efficacy, and self-regulated learning, and the dependent variable; academic achievement. This study involves an online computer literacy course at a local community college. Very little research exists on the relationship among asynchronous interaction, online technologies self-efficacy and self-regulated learning on predicting academic achievement in an online class. Liu (2008), in his study on student interaction in online courses, concluded that student interaction is a complex issue that needs more research to increase our understanding as it relates to distance education. The purpose of this study was to examine the relationships between asynchronous interaction, online technologies self-efficacy, self-regulated learning and academic achievement in an online computer literacy class at a community college. The researcher used quantitative methods to obtain and analyze data on the relationships among the variables during the summer 2010 semester. Forty-five community college students completed three web-based self-reporting instruments: (a) the GVU 10th WWW User Survey Questionnaire, (b) the Online Technologies Self-Efficacy Survey, and (c) selected items from the Motivated Strategies for Learning Questionnaire. Additional data was obtained from asynchronous discussions posted on Blackboard(TM) Learning Management System. The results of this study found that there were statistically significant relationships between asynchronous interaction and academic achievement (r = .55, p online technologies self-efficacy and academic achievement (r = .50, p online instructors, online course designers, faculty, students and others who are concerned about predictors for online students' success. Also, it serves as a foundation for future research and provides valuable information for educators interested in taking online teaching and
A Coding Scheme to Analyse the Online Asynchronous Discussion Forums of University Students
Biasutti, Michele
2017-01-01
The current study describes the development of a content analysis coding scheme to examine transcripts of online asynchronous discussion groups in higher education. The theoretical framework comprises the theories regarding knowledge construction in computer-supported collaborative learning (CSCL) based on a sociocultural perspective. The coding…
Majeski, Robin; Stover, Merrily
2007-01-01
Online learning has enjoyed increasing popularity in gerontology. This paper presents instructional strategies grounded in Fink's (2003) theory of significant learning designed for the completely asynchronous online gerontology classroom. It links these components with the development of mastery learning goals and provides specific guidelines for…
Scalable Domain Decomposed Monte Carlo Particle Transport
Energy Technology Data Exchange (ETDEWEB)
O' Brien, Matthew Joseph [Univ. of California, Davis, CA (United States)
2013-12-05
In this dissertation, we present the parallel algorithms necessary to run domain decomposed Monte Carlo particle transport on large numbers of processors (millions of processors). Previous algorithms were not scalable, and the parallel overhead became more computationally costly than the numerical simulation.
The Mechanical Transient Process at Asynchronous Motor Oscillating Mode
Antonovičs, Uldis; Bražis, Viesturs; Greivulis, Jānis
2009-01-01
The research object is a squirrel-cage asynchronous motor connected to a single-phase sinusoidal supply. It is shown that, by applying a certain sequence of positive and negative voltage half-periods to the stator windings, the motor rotor rotates, but three times slower than in the three-phase mode. By changing the sequence in which the positive and negative voltage half-periods are applied to the stator windings, the motor can work in various oscillating modes. This is tested experimentally. The mechanical transient processes have been studied in the rotation and oscillating modes.
Particle transport due to magnetic fluctuations
International Nuclear Information System (INIS)
Stoneking, M.R.; Hokin, S.A.; Prager, S.C.; Fiksel, G.; Ji, H.; Den Hartog, D.J.
1994-01-01
Electron current fluctuations are measured with an electrostatic energy analyzer at the edge of the MST reversed-field pinch plasma. The radial flux of fast electrons (E>T e ) due to parallel streaming along a fluctuating magnetic field is determined locally by measuring the correlated product e B r >. Particle transport is small just inside the last closed flux surface (Γ e,mag e,total ), but can account for all observed particle losses inside r/a=0.8. Electron diffusion is found to increase with parallel velocity, as expected for diffusion in a region of field stochasticity
Effects of synchronous versus asynchronous mode of propulsion on wheelchair basketball sprinting.
Faupin, Arnaud; Borel, Benoit; Meyer, Christophe; Gorce, Philippe; Watelain, Eric
2013-11-01
This study aimed first to investigate synchronous (SYN) versus asynchronous (ASY) modes of propulsion and, second, to investigate the effects of wheel camber on sprinting performance and temporal parameters. Seven wheelchair basketball players performed four maximal eight-second sprints on a wheelchair ergometer. They repeated the test with two modes of propulsion (SYN and ASY) and two wheel cambers (9° and 15°). The mean maximal velocity and push power output were greater in the synchronous mode than in the asynchronous mode for both camber angles. However, the fluctuation in the velocity profile was lower for the ASY mode than for the SYN mode for both camber angles. Push time/cycle time (Pt/Ct) and arm frequency (AF) were greater for the synchronous mode than for the asynchronous mode; conversely, cycle time (Ct) and rest time (Rt) were lower for the synchronous mode, for which greater velocities were observed. SYN mode leads to better performance than ASY mode in terms of maximal propulsion velocity. However, ASY propulsion allows greater continuity of the hand-rim force application, reducing fluctuations in the velocity profile. The camber angle had no effect on ASY and SYN mean maximal velocity and push power output. The study of wheelchair propulsion strategies is important for better understanding the physiological and biomechanical impacts of wheelchair propulsion for individuals with disabilities. From a kinematic point of view, this study shows the synchronous mode of propulsion to be more efficient with regard to the mean maximal velocity reached during maximal sprinting exercises. Although this study focuses on well-trained wheelchair athletes, its results could complement knowledge of the physiological and biomechanical adaptations to wheelchair propulsion and might therefore inform wheelchair modifications for rehabilitation purposes.
Parallelizing an electron transport Monte Carlo simulator (MOCASIN 2.0)
International Nuclear Information System (INIS)
Schwetman, H.; Burdick, S.
1988-01-01
Electron transport simulators are tools for studying electrical properties of semiconducting materials and devices. As demands for modeling more complex devices and new materials have emerged, so have demands for more processing power. This paper documents a project to convert an electron transport simulator (MOCASIN 2.0) to a parallel processing environment. In addition to describing the conversion, the paper presents PPL, a parallel programming version of C running on a Sequent multiprocessor system. In timing tests, models that simulated the movement of 2,000 particles for 100 time steps were executed on ten processors, with a parallel efficiency of over 97%
Moutinho, Filipe de Carvalho
2016-01-01
This book describes a model-based development approach for globally-asynchronous locally-synchronous distributed embedded controllers. This approach uses Petri nets as the modeling formalism to create platform- and network-independent models supporting the use of design automation tools. To support this development approach, the Petri net class in use is extended with time-domains and asynchronous-channels. The authors’ approach uses models that not only provide a better understanding of the distributed controller and improve communication among the stakeholders, but are also ready to support the entire lifecycle, including the simulation, the verification (using model-checking tools), the implementation (relying on automatic code generators), and the deployment of the distributed controller into specific platforms. Uses a graphical and intuitive modeling formalism supported by design automation tools; Enables verification, ensuring that the distributed controller was correctly specified; Provides flex...
Quorum system and random based asynchronous rendezvous protocol for cognitive radio ad hoc networks
Directory of Open Access Journals (Sweden)
Sylwia Romaszko
2013-12-01
This paper proposes a rendezvous protocol for cognitive radio ad hoc networks, RAC2E-gQS, which utilizes (1) the asynchronous and randomness properties of the RAC2E protocol, and (2) a channel mapping protocol based on a grid Quorum System (gQS) that takes into account channel heterogeneity and asymmetric channel views. We show that the combination of the RAC2E protocol with the grid-quorum based channel mapping can yield a powerful RAC2E-gQS rendezvous protocol for asynchronous operation in a distributed environment, assuring a rapid rendezvous between cognitive radio nodes that have either symmetric or asymmetric channel views available. We also propose an enhancement of the protocol, which uses a torus QS for slot allocation, to deal with the worst-case scenario of a large number of channels with opposite ranking lists.
Evaluating the Quality of Interaction in Asynchronous Discussion Forums in Fully Online Courses
Nandi, Dip; Hamilton, Margaret; Harland, James
2012-01-01
Fully online courses are becoming progressively more popular because of their "anytime anywhere" learning flexibility. One of the ways students interact with each other and with the instructors within fully online learning environments is via asynchronous discussion forums. However, student engagement in online discussion forums does not…
Scalability of Several Asynchronous Many-Task Models for In Situ Statistical Analysis.
Energy Technology Data Exchange (ETDEWEB)
Pebay, Philippe Pierre [Sandia National Lab. (SNL-CA), Livermore, CA (United States); Bennett, Janine Camille [Sandia National Lab. (SNL-CA), Livermore, CA (United States); Kolla, Hemanth [Sandia National Lab. (SNL-CA), Livermore, CA (United States); Borghesi, Giulio [Sandia National Lab. (SNL-CA), Livermore, CA (United States)
2017-05-01
This report is a sequel to [PB16], in which we provided a first progress report on research and development towards a scalable, asynchronous many-task, in situ statistical analysis engine using the Legion runtime system. This earlier work included a prototype implementation of a proposed solution, using a proxy mini-application as a surrogate for a full-scale scientific simulation code. The first scalability studies were conducted with the above on modestly-sized experimental clusters. In contrast, in the current work we have integrated our in situ analysis engines with a full-size scientific application (S3D, using the Legion-SPMD model), and have conducted numerical tests on the largest computational platform currently available for DOE science applications. We also provide details regarding the design and development of a light-weight asynchronous collectives library. We describe how this library is utilized within our SPMD-Legion S3D workflow, and compare the data aggregation technique deployed herein to the approach taken within our previous work.
BER and total throughput of asynchronous DS-OCDMA/WDM systems with multiple user interference
Ghiringhelli, F.; Zervas, M.N.
2003-01-01
The BER and throughput of Direct-Sequence OCDMA/WDM systems based on quadripolar codes and superstructured fiber Bragg gratings are statistically derived under asynchronous operation, intensity detection, and Multiple User Interference. Performance improvements with Forward Error Correction are included.
A tandem parallel plate analyzer
International Nuclear Information System (INIS)
Hamada, Y.; Fujisawa, A.; Iguchi, H.; Nishizawa, A.; Kawasumi, Y.
1996-11-01
By a new modification of a parallel plate analyzer, the second-order focus is obtained at an arbitrary injection angle. This kind of analyzer with a small injection angle has the advantage of a small operational voltage compared to the Proca and Green analyzer, where the injection angle is 30 degrees. Thus, the newly proposed analyzer will be very useful for the precise energy measurement of high-energy particles in the MeV range. (author)
Massively parallel computation of PARASOL code on the Origin 3800 system
International Nuclear Information System (INIS)
Hosokawa, Masanari; Takizuka, Tomonori
2001-10-01
The divertor particle simulation code PARASOL simulates open-field plasmas between divertor walls self-consistently, using an electrostatic PIC method and a binary collision Monte Carlo model. PARASOL, parallelized with MPI-1.1 for scalar parallel computers, previously ran on an Intel Paragon XP/S system. An SGI Origin 3800 system was newly installed in May 2001, and the parallel programming was improved at this switchover. As a result of the high-performance new hardware and this improvement, PARASOL runs about 60 times faster with the same number of processors. (author)
Information flow vs resource access in the asynchronous pi-calculus.
HENNESSY, MATTHEW
2002-01-01
We propose an extension of the asynchronous π-calculus in which a variety of security properties may be captured using types. These are an extension of the input/output types for the π-calculus in which I/O capabilities are assigned specific security levels. The main innovation is a uniform typing system that, by varying slightly the allowed set of types, captures different notions of security. We first define a typing system that ensures that processes running at security level σ...
Appleton, Jessica; Fowler, Cathrine; Brown, Nicola
2014-01-01
The use of the Internet and social media is increasing in every area of life. Parents are increasingly using online mediums to seek information about their children's health. Therefore, this is becoming an increasingly important topic area for health professionals to acknowledge. Developing an understanding of the dissemination of child health information through these online mediums will assist health professionals to continue to engage and support parents to seek and share accurate and safe child health information. The aim was to explore parents' use of asynchronous online discussion boards for child health information seeking, advice and social support. A qualitative descriptive approach using an a priori template analysis was used to explore 34 discussion threads sampled from two Australian-based online parenting discussion forums. To contain the scope of this study, the threads chosen focused on childhood obesity in the Australian context. Four major themes related to parents' use of asynchronous online discussion boards were found: seeking advice, sharing advice, social support and making judgements. This final theme of making judgements included parents' perceptions of health professionals' advice. Asynchronous online discussion boards are online mediums being utilised for seeking and sharing child health related information and support between parents. The notion
Control or non-control state: that is the question! An asynchronous visual P300-based BCI approach
Pinegger, Andreas; Faller, Josef; Halder, Sebastian; Wriessnegger, Selina C.; Müller-Putz, Gernot R.
2015-02-01
Objective. Brain-computer interfaces (BCI) based on event-related potentials (ERP) were proven to be a reliable synchronous communication method. For everyday life situations, however, this synchronous mode is impractical because the system will deliver a selection even if the user is not paying attention to the stimulation. So far, research into attention-aware visual ERP-BCIs (i.e., asynchronous ERP-BCIs) has led to variable success. In this study, we investigate new approaches for detection of user engagement. Approach. Classifier output and frequency-domain features of electroencephalogram signals as well as the hybridization of them were used to detect the user's state. We tested their capabilities for state detection in different control scenarios on offline data from 21 healthy volunteers. Main results. The hybridization of classifier output and frequency-domain features outperformed the results of the single methods, and allowed building an asynchronous P300-based BCI with an average correct state detection accuracy of more than 95%. Significance. Our results show that all introduced approaches for state detection in an asynchronous P300-based BCI can effectively avoid involuntary selections, and that the hybrid method is the most effective approach.
Parallel, distributed and GPU computing technologies in single-particle electron microscopy.
Schmeisser, Martin; Heisen, Burkhard C; Luettich, Mario; Busche, Boris; Hauer, Florian; Koske, Tobias; Knauber, Karl-Heinz; Stark, Holger
2009-07-01
Most known methods for the determination of the structure of macromolecular complexes are limited or at least restricted at some point by their computational demands. Recent developments in information technology such as multicore, parallel and GPU processing can be used to overcome these limitations. In particular, graphics processing units (GPUs), which were originally developed for rendering real-time effects in computer games, are now ubiquitous and provide unprecedented computational power for scientific applications. Each parallel-processing paradigm alone can improve overall performance; the increased computational performance obtained by combining all paradigms, unleashing the full power of today's technology, makes certain applications feasible that were previously virtually impossible. In this article, state-of-the-art paradigms are introduced, the tools and infrastructure needed to apply these paradigms are presented and a state-of-the-art infrastructure and solution strategy for moving scientific applications to the next generation of computer hardware is outlined.
Parallel simulation of radio-frequency plasma discharges
International Nuclear Information System (INIS)
Fivaz, M.; Howling, A.; Ruegsegger, L.; Schwarzenbach, W.; Baeumle, B.
1994-01-01
The 1D Particle-In-Cell and Monte Carlo collision code XPDP1 is used to model radio-frequency argon plasma discharges. The code runs faster on a single-user parallel system called MUSIC than on a CRAY-YMP. The low cost of the MUSIC system allows 24-hours-per-day use, and the simulation results are available one to two orders of magnitude more quickly than with a supercomputer shared with other users. The parallelization strategy and its implementation are discussed. Very good agreement is found between simulation results and measurements made in an experimental argon discharge. (author) 2 figs., 3 refs
Asynchronous variational integration using continuous assumed gradient elements.
Wolff, Sebastian; Bucher, Christian
2013-03-01
Asynchronous variational integration (AVI) is a tool which improves the numerical efficiency of explicit time stepping schemes when applied to finite element meshes with local spatial refinement. This is achieved by associating an individual time step length to each spatial domain. Furthermore, long-term stability is ensured by its variational structure. This article presents AVI in the context of finite elements based on a weakened weak form (W2) Liu (2009) [1], exemplified by continuous assumed gradient elements Wolff and Bucher (2011) [2]. The article presents the main ideas of the modified AVI, gives implementation notes and a recipe for estimating the critical time step.
Green, Rodney A; Farchione, Davide; Hughes, Diane L; Chan, Siew-Pang
2014-01-01
Asynchronous online discussion forums are common in blended learning models and are popular with students. A previous report has suggested that participation in these forums may assist student learning in a gross anatomy subject but it was unclear as to whether more academically able students post more often or whether participation led to improved learning outcomes. This study used a path model to analyze the contribution of forum participation, previous academic ability, and student campus of enrolment to final marks in a multicampus gross anatomy course for physiotherapy students. The course has a substantial online learning management system (LMS) that incorporates asynchronous forums as a learning tool, particularly to answer learning objectives. Students were encouraged to post new threads and answer queries in threads started by others. The forums were moderated weekly by staff. Discussion forums were the most used feature of the LMS site with 31,920 hits. Forty-eight percent of the students posted at least once with 186 threads initiated by students and a total of 608 posts. The total number of posts made a significant direct contribution to final mark (P = 0.008) as did previous academic ability (P = 0.002). Although campus did not contribute to final mark, there was a trend for students at the campus where the course coordinator was situated to post more often than those at the other campus (P = 0.073). These results indicate that asynchronous online discussion forums can be an effective tool for improving student learning outcomes as evidenced by final marks in gross anatomy teaching. Copyright © 2013 American Association of Anatomists.
DEFF Research Database (Denmark)
Weber, Jens P; Toft-Bertelsen, Trine L; Mohrmann, Ralf
2014-01-01
Synchronization of neurotransmitter release with the presynaptic action potential is essential for maintaining fidelity of information transfer in the central nervous system. However, synchronous release is frequently accompanied by an asynchronous release component that builds up during repetitive...... stimulation, and can even play a dominant role in some synapses. Here, we show that substitution of SNAP-23 for SNAP-25 in mouse autaptic glutamatergic hippocampal neurons results in asynchronous release and a higher frequency of spontaneous release events (mEPSCs). Use of neurons from double-knock-out (SNAP......, while synaptotagmin-7 barely displayed activity-dependent trafficking between vesicle and plasma membrane, implying that it acts as a plasma membrane calcium sensor. Overall, these findings support the idea of alternative syt∶SNARE combinations driving release with different kinetics and fidelity....
International Nuclear Information System (INIS)
Orii, Shigeo
1998-06-01
A benchmark specification for performance evaluation of parallel computers for numerical analysis is proposed. The Level 1 benchmark, a conventional benchmark based on processing time, measures the performance of a computer running a code. The Level 2 benchmark proposed in this report explains the reasons for that performance. As an example, the scalar-parallel computer SP2 is evaluated with this benchmark specification using a molecular dynamics code. The results show that the main factors limiting parallel performance are the maximum bandwidth and the start-up time of communication between nodes. In particular, the start-up time is proportional not only to the number of processors but also to the number of particles. (author)
Asynchronous emergence by loggerhead turtle (Caretta caretta) hatchlings.
Houghton, J D; Hays, G C
2001-03-01
For many decades it has been accepted that marine turtle hatchlings from the same nest generally emerge from the sand together. However, for loggerhead turtles (Caretta caretta) nesting on the Greek Island of Kefalonia, a more asynchronous pattern of emergence has been documented. By placing temperature loggers at the top and bottom of nests laid on Kefalonia during 1998, we examined whether this asynchronous emergence was related to the thermal conditions within nests. Pronounced thermal variation existed not only between, but also within, individual nests. These within-nest temperature differences were related to the patterns of hatchling emergence, with hatchlings from nests displaying large thermal ranges emerging over a longer time-scale than those characterised by more uniform temperatures. In many egg-laying animals, parental care of the offspring may continue while the eggs are incubating and also after they have hatched. Consequently, the importance of the nest site for determining incubation conditions may be reduced since the parents themselves may alter the local environment. By contrast, in marine turtles, parental care ceases once the eggs have been laid and the nest site covered. The positioning of the nest site, in both space and time, may therefore have profound effects for marine turtles by affecting, for example, the survival of the eggs and hatchlings as well as their sex (Janzen and Paukstis 1991). During incubation, sea turtle embryos grow from a few cells at oviposition to a self-sufficient organism at hatching some 50-80 days later (Ackerman 1997). After hatching, the young turtles dig up through the sand and emerge typically en masse at the surface 1-7 nights later, with a number of stragglers following over the next few nights (Christens 1990). This contrasts with the frequently observed pattern of hatching asynchrony in birds. It has been suggested that the cause of mass emergence in turtles is that eggs within a clutch are fertilised
Asynchronous Task-Based Parallelization of Algebraic Multigrid
AlOnazi, Amani A.; Markomanolis, George S.; Keyes, David E.
2017-01-01
As processor clock rates become more dynamic and workloads become more adaptive, the vulnerability to global synchronization that already complicates programming for performance in today's petascale environment will be exacerbated. Algebraic
Plasma Physics Calculations on a Parallel Macintosh Cluster
Decyk, Viktor; Dauger, Dean; Kokelaar, Pieter
2000-03-01
We have constructed a parallel cluster consisting of 16 Apple Macintosh G3 computers running the MacOS, and achieved very good performance on numerically intensive, parallel plasma particle-in-cell simulations. A subset of the MPI message-passing library was implemented in Fortran77 and C. This library enabled us to port code, without modification, from other parallel processors to the Macintosh cluster. For large problems where message packets are large and relatively few in number, performance of 50-150 MFlops/node is possible, depending on the problem. This is fast enough that 3D calculations can be routinely done. Unlike Unix-based clusters, no special expertise in operating systems is required to build and run the cluster. Full details are available on our web site: http://exodus.physics.ucla.edu/appleseed/.
Parallel, distributed and GPU computing technologies in single-particle electron microscopy
International Nuclear Information System (INIS)
Schmeisser, Martin; Heisen, Burkhard C.; Luettich, Mario; Busche, Boris; Hauer, Florian; Koske, Tobias; Knauber, Karl-Heinz; Stark, Holger
2009-01-01
An introduction to the current paradigm shift towards concurrency in software. Most known methods for the determination of the structure of macromolecular complexes are limited or at least restricted at some point by their computational demands. Recent developments in information technology such as multicore, parallel and GPU processing can be used to overcome these limitations. In particular, graphics processing units (GPUs), which were originally developed for rendering real-time effects in computer games, are now ubiquitous and provide unprecedented computational power for scientific applications. Each parallel-processing paradigm alone can improve overall performance; the increased computational performance obtained by combining all paradigms, unleashing the full power of today’s technology, makes certain applications feasible that were previously virtually impossible. In this article, state-of-the-art paradigms are introduced, the tools and infrastructure needed to apply these paradigms are presented and a state-of-the-art infrastructure and solution strategy for moving scientific applications to the next generation of computer hardware is outlined
Asynchronous discrete event schemes for PDEs
Stone, D.; Geiger, S.; Lord, G. J.
2017-08-01
A new class of asynchronous discrete-event simulation schemes for advection-diffusion-reaction equations is introduced, based on the principle of allowing quanta of mass to pass through faces of a (regular, structured) Cartesian finite volume grid. The timescales of these events are linked to the flux on the face. The resulting schemes are self-adaptive, and local in both time and space. Experiments are performed on realistic physical systems related to porous media flow applications, including a large 3D advection diffusion equation and advection diffusion reaction systems. The results are compared to highly accurate reference solutions where the temporal evolution is computed with exponential integrator schemes using the same finite volume discretisation. This allows a reliable estimation of the solution error. Our results indicate a first order convergence of the error as a control parameter is decreased, and we outline a framework for analysis.
Information flow vs. resource access in the asynchronous pi-calculus
Hennessy, Matthew; Riely, James
2002-01-01
We propose an extension of the asynchronous π-calculus in which a variety of security properties may be captured using types. These are an extension of the input/output types for the π-calculus in which I/O capabilities are assigned specific security levels. The main innovation is a uniform typing system that, by varying slightly the allowed set of types, captures different notions of security. We first define a typing system that ensures that processes running at security level σ cannot acces...
Parallel computers and three-dimensional computational electromagnetics
International Nuclear Information System (INIS)
Madsen, N.K.
1994-01-01
The authors have continued to enhance their ability to use new massively parallel processing computers to solve time-domain electromagnetic problems. New vectorization techniques have improved the performance of their code DSI3D by factors of 5 to 15, depending on the computer used. New radiation boundary conditions and far-field transformations now allow the computation of radar cross-section values for complex objects. A new parallel-data extraction code has been developed that allows the extraction of data subsets from large problems, which have been run on parallel computers, for subsequent post-processing on workstations with enhanced graphics capabilities. A new charged-particle-pushing version of DSI3D is under development. Finally, DSI3D has become a focal point for several new Cooperative Research and Development Agreement activities with industrial companies such as Lockheed Advanced Development Company, Varian, Hughes Electron Dynamics Division, General Atomic, and Cray
Parallel SOL transport in MAST and JET: the impact of the mirror force
International Nuclear Information System (INIS)
Kirk, A; Fundamenski, W; Ahn, J-W; Counsell, G
2003-01-01
Interpretative modelling of the SOL plasma in conventional (JET) and tight (MAST) aspect ratio devices has been performed using OSM2/EIRENE. A detailed comparison has been made of the solutions of the fluid equations, and one key issue uncovered by this modelling is the significance of the mirror force for the spherical tokamak (ST) SOL. This force is proportional to ∇∥B/B, which is typically a factor of 10 larger in an ST due to the low aspect ratio. This term leads to changes in the charged particle velocity distributions near regions with large ∇∥B/B, representing an effective upstream particle and momentum source. The modelling performed in this paper indicates that exclusion of the ∇∥B term may lead to incorrect conclusions on, for example, the upstream density, especially in STs
Energy Technology Data Exchange (ETDEWEB)
Liu, Hongjun [Faculty of Electronic Information and Electrical Engineering, Dalian University of Technology, Dalian 116024 (China); Weifang Vocational College, Weifang 261041 (China); Wang, Xingyuan, E-mail: wangxy@dlut.edu.cn [Faculty of Electronic Information and Electrical Engineering, Dalian University of Technology, Dalian 116024 (China); Zhu, Quanlong [Faculty of Electronic Information and Electrical Engineering, Dalian University of Technology, Dalian 116024 (China)
2011-07-18
This Letter designs an asynchronous hyper chaotic secure communication system that is highly stable against noise and uses dynamic delay and state variable switching to ensure high security. The relationship between the bit error ratio (BER) and the signal-to-noise ratio (SNR) is analyzed by simulation tests; the results show that the BER can be driven to zero by proportionally adjusting the amplitudes of the state variables and the noise figure. The modules of the transmitter and receiver are implemented, and numerical simulations demonstrate the effectiveness of the system. -- Highlights: → Asynchronous anti-noise hyper chaotic secure communication system. → Dynamic delay and state switching to ensure the high security. → BER can reach zero by adjusting the amplitudes of state variables and noise figure.
Particle Acceleration, Magnetic Field Generation in Relativistic Shocks
Nishikawa, Ken-Ichi; Hardee, P.; Hededal, C. B.; Richardson, G.; Sol, H.; Preece, R.; Fishman, G. J.
2005-01-01
Shock acceleration is a ubiquitous phenomenon in astrophysical plasmas. Plasma waves and their associated instabilities (e.g., the Buneman instability, two-stream instability, and the Weibel instability) created in the shocks are responsible for particle (electron, positron, and ion) acceleration. Using a 3-D relativistic electromagnetic particle (REMP) code, we have investigated particle acceleration associated with a relativistic jet front propagating through an ambient plasma with and without initial magnetic fields. We find only small differences in the results between no ambient and weak ambient parallel magnetic fields. Simulations show that the Weibel instability created in the collisionless shock front accelerates particles perpendicular and parallel to the jet propagation direction. New simulations with an ambient perpendicular magnetic field show a strong interaction between the relativistic jet and the magnetic fields. The magnetic fields are piled up by the jet and the jet electrons are deflected, which creates currents and displacement currents. At the nonlinear stage, the magnetic fields are reversed by the current and reconnection may take place. Due to these dynamics, the jet and ambient electrons are strongly accelerated in both parallel and perpendicular directions.
Duncan, Keith; Kenworthy, Amy; McNamara, Ray
2012-01-01
This article examines the relationship between MBA students' performance and participation in two online environments: a synchronous forum (chat room) and an asynchronous forum (discussion board) at an Australian university. The "quality" and "quantity" of students' participation is used to predict their final examination and…