Dynamic Load Balancing Based on Constrained K-D Tree Decomposition for Parallel Particle Tracing
Energy Technology Data Exchange (ETDEWEB)
Zhang, Jiang; Guo, Hanqi; Yuan, Xiaoru; Hong, Fan; Peterka, Tom
2018-01-01
Particle tracing is a fundamental technique in flow field data visualization. In this work, we present a novel dynamic load balancing method for parallel particle tracing. Specifically, we employ a constrained k-d tree decomposition approach to dynamically redistribute tasks among processes. Each process is initially assigned a regularly partitioned block along with duplicated ghost layer under the memory limit. During particle tracing, the k-d tree decomposition is dynamically performed by constraining the cutting planes in the overlap range of duplicated data. This ensures that each process is reassigned particles as even as possible, and on the other hand the new assigned particles for a process always locate in its block. Result shows good load balance and high efficiency of our method.
Efficient Delaunay Tessellation through K-D Tree Decomposition
Energy Technology Data Exchange (ETDEWEB)
Morozov, Dmitriy; Peterka, Tom
2017-08-21
Delaunay tessellations are fundamental data structures in computational geometry. They are important in data analysis, where they can represent the geometry of a point set or approximate its density. The algorithms for computing these tessellations at scale perform poorly when the input data is unbalanced. We investigate the use of k-d trees to evenly distribute points among processes and compare two strategies for picking split points between domain regions. Because resulting point distributions no longer satisfy the assumptions of existing parallel Delaunay algorithms, we develop a new parallel algorithm that adapts to its input and prove its correctness. We evaluate the new algorithm using two late-stage cosmology datasets. The new running times are up to 50 times faster using k-d tree compared with regular grid decomposition. Moreover, in the unbalanced data sets, decomposing the domain into a k-d tree is up to five times faster than decomposing it into a regular grid.
Directory of Open Access Journals (Sweden)
Lixin Li
2014-09-01
Full Text Available Epidemiological studies have identified associations between mortality and changes in concentration of particulate matter. These studies have highlighted the public concerns about health effects of particulate air pollution. Modeling fine particulate matter PM2.5 exposure risk and monitoring day-to-day changes in PM2.5 concentration is a critical step for understanding the pollution problem and embarking on the necessary remedy. This research designs, implements and compares two inverse distance weighting (IDW-based spatiotemporal interpolation methods, in order to assess the trend of daily PM2.5 concentration for the contiguous United States over the year of 2009, at both the census block group level and county level. Traditionally, when handling spatiotemporal interpolation, researchers tend to treat space and time separately and reduce the spatiotemporal interpolation problems to a sequence of snapshots of spatial interpolations. In this paper, PM2.5 data interpolation is conducted in the continuous space-time domain by integrating space and time simultaneously, using the so-called extension approach. Time values are calculated with the help of a factor under the assumption that spatial and temporal dimensions are equally important when interpolating a continuous changing phenomenon in the space-time domain. Various IDW-based spatiotemporal interpolation methods with different parameter configurations are evaluated by cross-validation. In addition, this study explores computational issues (computer processing speed faced during implementation of spatiotemporal interpolation for huge data sets. Parallel programming techniques and an advanced data structure, named k-d tree, are adapted in this paper to address the computational challenges. Significant computational improvement has been achieved. Finally, a web-based spatiotemporal IDW-based interpolation application is designed and implemented where users can visualize and animate
Li, Lixin; Losser, Travis; Yorke, Charles; Piltner, Reinhard
2014-01-01
Epidemiological studies have identified associations between mortality and changes in concentration of particulate matter. These studies have highlighted the public concerns about health effects of particulate air pollution. Modeling fine particulate matter PM2.5 exposure risk and monitoring day-to-day changes in PM2.5 concentration is a critical step for understanding the pollution problem and embarking on the necessary remedy. This research designs, implements and compares two inverse distance weighting (IDW)-based spatiotemporal interpolation methods, in order to assess the trend of daily PM2.5 concentration for the contiguous United States over the year of 2009, at both the census block group level and county level. Traditionally, when handling spatiotemporal interpolation, researchers tend to treat space and time separately and reduce the spatiotemporal interpolation problems to a sequence of snapshots of spatial interpolations. In this paper, PM2.5 data interpolation is conducted in the continuous space-time domain by integrating space and time simultaneously, using the so-called extension approach. Time values are calculated with the help of a factor under the assumption that spatial and temporal dimensions are equally important when interpolating a continuous changing phenomenon in the space-time domain. Various IDW-based spatiotemporal interpolation methods with different parameter configurations are evaluated by cross-validation. In addition, this study explores computational issues (computer processing speed) faced during implementation of spatiotemporal interpolation for huge data sets. Parallel programming techniques and an advanced data structure, named k-d tree, are adapted in this paper to address the computational challenges. Significant computational improvement has been achieved. Finally, a web-based spatiotemporal IDW-based interpolation application is designed and implemented where users can visualize and animate spatiotemporal interpolation
Li, Lixin; Losser, Travis; Yorke, Charles; Piltner, Reinhard
2014-09-03
Epidemiological studies have identified associations between mortality and changes in concentration of particulate matter. These studies have highlighted the public concerns about health effects of particulate air pollution. Modeling fine particulate matter PM2.5 exposure risk and monitoring day-to-day changes in PM2.5 concentration is a critical step for understanding the pollution problem and embarking on the necessary remedy. This research designs, implements and compares two inverse distance weighting (IDW)-based spatiotemporal interpolation methods, in order to assess the trend of daily PM2.5 concentration for the contiguous United States over the year of 2009, at both the census block group level and county level. Traditionally, when handling spatiotemporal interpolation, researchers tend to treat space and time separately and reduce the spatiotemporal interpolation problems to a sequence of snapshots of spatial interpolations. In this paper, PM2.5 data interpolation is conducted in the continuous space-time domain by integrating space and time simultaneously, using the so-called extension approach. Time values are calculated with the help of a factor under the assumption that spatial and temporal dimensions are equally important when interpolating a continuous changing phenomenon in the space-time domain. Various IDW-based spatiotemporal interpolation methods with different parameter configurations are evaluated by cross-validation. In addition, this study explores computational issues (computer processing speed) faced during implementation of spatiotemporal interpolation for huge data sets. Parallel programming techniques and an advanced data structure, named k-d tree, are adapted in this paper to address the computational challenges. Significant computational improvement has been achieved. Finally, a web-based spatiotemporal IDW-based interpolation application is designed and implemented where users can visualize and animate spatiotemporal interpolation
Implementasi KD-Tree K-Means Clustering untuk Klasterisasi Dokumen
Directory of Open Access Journals (Sweden)
Eric Budiman Gosno
2013-09-01
Full Text Available Klasterisasi dokumen adalah suatu proses pengelompokan dokumen secara otomatis dan unsupervised. Klasterisasi dokumen merupakan permasalahan yang sering ditemui dalam berbagai bidang seperti text mining dan sistem temu kembali informasi. Metode klasterisasi dokumen yang memiliki akurasi dan efisiensi waktu yang tinggi sangat diperlukan untuk meningkatkan hasil pada mesin pencari web, dan untuk proses filtering. Salah satu metode klasterisasi yang telah dikenal dan diaplikasikan dalam klasterisasi dokumen adalah K-Means Clustering. Tetapi K-Means Clustering sensitif terhadap pemilihan posisi awal dari titik tengah klaster sehingga pemilihan posisi awal dari titik tengah klaster yang buruk akan mengakibatkan K-Means Clustering terjebak dalam local optimum. KD-Tree K-Means Clustering merupakan perbaikan dari K-Means Clustering. KD-Tree K-Means Clustering menggunakan struktur data K-Dimensional Tree dan nilai kerapatan pada proses inisialisasi titik tengah klaster. Pada makalah ini diimplementasikan algoritma KD-Tree K-Means Clustering untuk permasalahan klasterisasi dokumen. Performa klasterisasi dokumen yang dihasilkan oleh metode KD-Tree K-Means Clustering pada data set 20 newsgroup memiliki nilai distorsi 3×105 lebih rendah dibandingkan dengan nilai rerata distorsi K-Means Clustering dan nilai NIG 0,09 lebih baik dibandingkan dengan nilai NIG K-Means Clustering.
Using an implicit min/max KD-Tree for doing efficient terrain line of sight calculations
CSIR Research Space (South Africa)
Duvenhage, B
2009-02-01
Full Text Available -dimensional tree (kd-tree) based raytracing approach, to calculating LOS information, is efficient. A new implicit min/max kd-tree algorithm is discussed for evaluating LOS queries on large scale spherical terrain. In particular the value of low resolution boundary...
A γ dose distribution evaluation technique using the k-d tree for nearest neighbor searching
International Nuclear Information System (INIS)
Yuan Jiankui; Chen Weimin
2010-01-01
Purpose: The authors propose an algorithm based on the k-d tree for nearest neighbor searching to improve the γ calculation time for 2D and 3D dose distributions. Methods: The γ calculation method has been widely used for comparisons of dose distributions in clinical treatment plans and quality assurances. By specifying the acceptable dose and distance-to-agreement criteria, the method provides quantitative measurement of the agreement between the reference and evaluation dose distributions. The γ value indicates the acceptability. In regions where γ≤1, the predefined criterion is satisfied and thus the agreement is acceptable; otherwise, the agreement fails. Although the concept of the method is not complicated and a quick naieve implementation is straightforward, an efficient and robust implementation is not trivial. Recent algorithms based on exhaustive searching within a maximum radius, the geometric Euclidean distance, and the table lookup method have been proposed to improve the computational time for multidimensional dose distributions. Motivated by the fact that the least searching time for finding a nearest neighbor can be an O(log N) operation with a k-d tree, where N is the total number of the dose points, the authors propose an algorithm based on the k-d tree for the γ evaluation in this work. Results: In the experiment, the authors found that the average k-d tree construction time per reference point is O(log N), while the nearest neighbor searching time per evaluation point is proportional to O(N 1/k ), where k is between 2 and 3 for two-dimensional and three-dimensional dose distributions, respectively. Conclusions: Comparing with other algorithms such as exhaustive search and sorted list O(N), the k-d tree algorithm for γ evaluation is much more efficient.
a Gross Error Elimination Method for Point Cloud Data Based on Kd-Tree
Kang, Q.; Huang, G.; Yang, S.
2018-04-01
Point cloud data has been one type of widely used data sources in the field of remote sensing. Key steps of point cloud data's pro-processing focus on gross error elimination and quality control. Owing to the volume feature of point could data, existed gross error elimination methods need spend massive memory both in space and time. This paper employed a new method which based on Kd-tree algorithm to construct, k-nearest neighbor algorithm to search, settled appropriate threshold to determine with result turns out a judgement that whether target point is or not an outlier. Experimental results show that, our proposed algorithm will help to delete gross error in point cloud data and facilitate to decrease memory consumption, improve efficiency.
A GROSS ERROR ELIMINATION METHOD FOR POINT CLOUD DATA BASED ON KD-TREE
Directory of Open Access Journals (Sweden)
Q. Kang
2018-04-01
Full Text Available Point cloud data has been one type of widely used data sources in the field of remote sensing. Key steps of point cloud data’s pro-processing focus on gross error elimination and quality control. Owing to the volume feature of point could data, existed gross error elimination methods need spend massive memory both in space and time. This paper employed a new method which based on Kd-tree algorithm to construct, k-nearest neighbor algorithm to search, settled appropriate threshold to determine with result turns out a judgement that whether target point is or not an outlier. Experimental results show that, our proposed algorithm will help to delete gross error in point cloud data and facilitate to decrease memory consumption, improve efficiency.
Sirait, Kamson; Tulus; Budhiarti Nababan, Erna
2017-12-01
Clustering methods that have high accuracy and time efficiency are necessary for the filtering process. One method that has been known and applied in clustering is K-Means Clustering. In its application, the determination of the begining value of the cluster center greatly affects the results of the K-Means algorithm. This research discusses the results of K-Means Clustering with starting centroid determination with a random and KD-Tree method. The initial determination of random centroid on the data set of 1000 student academic data to classify the potentially dropout has a sse value of 952972 for the quality variable and 232.48 for the GPA, whereas the initial centroid determination by KD-Tree has a sse value of 504302 for the quality variable and 214,37 for the GPA variable. The smaller sse values indicate that the result of K-Means Clustering with initial KD-Tree centroid selection have better accuracy than K-Means Clustering method with random initial centorid selection.
Yin, Lucy; Andrews, Jennifer; Heaton, Thomas
2018-05-01
Earthquake parameter estimations using nearest neighbor searching among a large database of observations can lead to reliable prediction results. However, in the real-time application of Earthquake Early Warning (EEW) systems, the accurate prediction using a large database is penalized by a significant delay in the processing time. We propose to use a multidimensional binary search tree (KD tree) data structure to organize large seismic databases to reduce the processing time in nearest neighbor search for predictions. We evaluated the performance of KD tree on the Gutenberg Algorithm, a database-searching algorithm for EEW. We constructed an offline test to predict peak ground motions using a database with feature sets of waveform filter-bank characteristics, and compare the results with the observed seismic parameters. We concluded that large database provides more accurate predictions of the ground motion information, such as peak ground acceleration, velocity, and displacement (PGA, PGV, PGD), than source parameters, such as hypocenter distance. Application of the KD tree search to organize the database reduced the average searching process by 85% time cost of the exhaustive method, allowing the method to be feasible for real-time implementation. The algorithm is straightforward and the results will reduce the overall time of warning delivery for EEW.
Algoritmos rápidos de detecção de colisão broad phase utilizando KD-trees
Rocha, Rafael de Sousa
2010-01-01
Neste trabalho, três novos algoritmos rápidos de detecção de colisão broad phase, os quais utilizam a estrutura de particionamento espacial conhecida como KD-Tree, foram pro- postos e implementados: KDTreeSpace, DynamicKDTreeSpace e StatelessKDTreeSpace. Estes algoritmos foram integrados à biblioteca Open Dynamics Engine (ODE), responsável pelo cálculo do movimento dos objetos dinâmicos, como possíveis alternativas aos algoritmos de broad phase disponíveis nesta biblioteca. Os algori...
Adapting algorithms to massively parallel hardware
Sioulas, Panagiotis
2016-01-01
In the recent years, the trend in computing has shifted from delivering processors with faster clock speeds to increasing the number of cores per processor. This marks a paradigm shift towards parallel programming in which applications are programmed to exploit the power provided by multi-cores. Usually there is gain in terms of the time-to-solution and the memory footprint. Specifically, this trend has sparked an interest towards massively parallel systems that can provide a large number of processors, and possibly computing nodes, as in the GPUs and MPPAs (Massively Parallel Processor Arrays). In this project, the focus was on two distinct computing problems: k-d tree searches and track seeding cellular automata. The goal was to adapt the algorithms to parallel systems and evaluate their performance in different cases.
Crockett, Thomas W.
1995-01-01
This article provides a broad introduction to the subject of parallel rendering, encompassing both hardware and software systems. The focus is on the underlying concepts and the issues which arise in the design of parallel rendering algorithms and systems. We examine the different types of parallelism and how they can be applied in rendering applications. Concepts from parallel computing, such as data decomposition, task granularity, scalability, and load balancing, are considered in relation to the rendering problem. We also explore concepts from computer graphics, such as coherence and projection, which have a significant impact on the structure of parallel rendering algorithms. Our survey covers a number of practical considerations as well, including the choice of architectural platform, communication and memory requirements, and the problem of image assembly and display. We illustrate the discussion with numerous examples from the parallel rendering literature, representing most of the principal rendering methods currently used in computer graphics.
1982-01-01
Parallel Computations focuses on parallel computation, with emphasis on algorithms used in a variety of numerical and physical applications and for many different types of parallel computers. Topics covered range from vectorization of fast Fourier transforms (FFTs) and of the incomplete Cholesky conjugate gradient (ICCG) algorithm on the Cray-1 to calculation of table lookups and piecewise functions. Single tridiagonal linear systems and vectorized computation of reactive flow are also discussed.Comprised of 13 chapters, this volume begins by classifying parallel computers and describing techn
Implementasi KD-Tree K-Means Clustering Untuk Klasterisasi Dokumen
Gosno, Eric Budiman; Arieshanti, Isye; Soelaiman, Rully
2013-01-01
Klasterisasi dokumen adalah suatu proses pengelompokan dokumen secara otomatis dan unsupervised. Klasterisasi dokumen merupakan permasalahan yang sering ditemui dalam berbagai bidang seperti text mining dan sistem temu kembali informasi. Metode klasterisasi dokumen yang memiliki akurasi dan efisiensi waktu yang tinggi sangat diperlukan untuk meningkatkan hasil pada mesin pencari web, dan untuk proses filtering. Salah satu metode klasterisasi yang telah dikenal dan diaplikasikan dalam klaster...
Real-time generation of kd-trees for ray tracing using DirectX 11
Säll, Martin; Cronqvist, Fredrik
2017-01-01
Context. Ray tracing has always been a simple but eﬀective way to create a photorealistic scene but at a greater cost when expanding the scene. Recent improvements in GPU and CPU hardware have made ray tracing faster, making more complex scenes possible with the same amount of time needed to process the scene. Despite the improvements in hardware ray tracing is still rarely run at a interactive speed. Objectives. The aim of this experiment was to implement a new kdtree generation algorithm us...
Casanova, Henri; Robert, Yves
2008-01-01
""…The authors of the present book, who have extensive credentials in both research and instruction in the area of parallelism, present a sound, principled treatment of parallel algorithms. … This book is very well written and extremely well designed from an instructional point of view. … The authors have created an instructive and fascinating text. The book will serve researchers as well as instructors who need a solid, readable text for a course on parallelism in computing. Indeed, for anyone who wants an understandable text from which to acquire a current, rigorous, and broad vi
International Nuclear Information System (INIS)
Jejcic, A.; Maillard, J.; Maurel, G.; Silva, J.; Wolff-Bacha, F.
1997-01-01
The work in the field of parallel processing has developed as research activities using several numerical Monte Carlo simulations related to basic or applied current problems of nuclear and particle physics. For the applications utilizing the GEANT code development or improvement works were done on parts simulating low energy physical phenomena like radiation, transport and interaction. The problem of actinide burning by means of accelerators was approached using a simulation with the GEANT code. A program of neutron tracking in the range of low energies up to the thermal region has been developed. It is coupled to the GEANT code and permits in a single pass the simulation of a hybrid reactor core receiving a proton burst. Other works in this field refers to simulations for nuclear medicine applications like, for instance, development of biological probes, evaluation and characterization of the gamma cameras (collimators, crystal thickness) as well as the method for dosimetric calculations. Particularly, these calculations are suited for a geometrical parallelization approach especially adapted to parallel machines of the TN310 type. Other works mentioned in the same field refer to simulation of the electron channelling in crystals and simulation of the beam-beam interaction effect in colliders. The GEANT code was also used to simulate the operation of germanium detectors designed for natural and artificial radioactivity monitoring of environment
Laparoscopy After Previous Laparotomy
Directory of Open Access Journals (Sweden)
Zulfo Godinjak
2006-11-01
Full Text Available Following the abdominal surgery, extensive adhesions often occur and they can cause difficulties during laparoscopic operations. However, previous laparotomy is not considered to be a contraindication for laparoscopy. The aim of this study is to present that an insertion of Veres needle in the region of umbilicus is a safe method for creating a pneumoperitoneum for laparoscopic operations after previous laparotomy. In the last three years, we have performed 144 laparoscopic operations in patients that previously underwent one or two laparotomies. Pathology of digestive system, genital organs, Cesarean Section or abdominal war injuries were the most common causes of previouslaparotomy. During those operations or during entering into abdominal cavity we have not experienced any complications, while in 7 patients we performed conversion to laparotomy following the diagnostic laparoscopy. In all patients an insertion of Veres needle and trocar insertion in the umbilical region was performed, namely a technique of closed laparoscopy. Not even in one patient adhesions in the region of umbilicus were found, and no abdominal organs were injured.
McCallum, Ethan
2011-01-01
It's tough to argue with R as a high-quality, cross-platform, open source statistical software product-unless you're in the business of crunching Big Data. This concise book introduces you to several strategies for using R to analyze large datasets. You'll learn the basics of Snow, Multicore, Parallel, and some Hadoop-related tools, including how to find them, how to use them, when they work well, and when they don't. With these packages, you can overcome R's single-threaded nature by spreading work across multiple CPUs, or offloading work to multiple machines to address R's memory barrier.
Directory of Open Access Journals (Sweden)
James G. Worner
2017-05-01
Full Text Available James Worner is an Australian-based writer and scholar currently pursuing a PhD at the University of Technology Sydney. His research seeks to expose masculinities lost in the shadow of Australia’s Anzac hegemony while exploring new opportunities for contemporary historiography. He is the recipient of the Doctoral Scholarship in Historical Consciousness at the university’s Australian Centre of Public History and will be hosted by the University of Bologna during 2017 on a doctoral research writing scholarship. ‘Parallel Lines’ is one of a collection of stories, The Shapes of Us, exploring liminal spaces of modern life: class, gender, sexuality, race, religion and education. It looks at lives, like lines, that do not meet but which travel in proximity, simultaneously attracted and repelled. James’ short stories have been published in various journals and anthologies.
Concurrent computation of attribute filters on shared memory parallel machines
Wilkinson, Michael H.F.; Gao, Hui; Hesselink, Wim H.; Jonker, Jan-Eppo; Meijster, Arnold
2008-01-01
Morphological attribute filters have not previously been parallelized mainly because they are both global and nonseparable. We propose a parallel algorithm that achieves efficient parallelism for a large class of attribute filters, including attribute openings, closings, thinnings, and thickenings,
Wald, Ingo; Ize, Santiago
2015-07-28
Parallel population of a grid with a plurality of objects using a plurality of processors. One example embodiment is a method for parallel population of a grid with a plurality of objects using a plurality of processors. The method includes a first act of dividing a grid into n distinct grid portions, where n is the number of processors available for populating the grid. The method also includes acts of dividing a plurality of objects into n distinct sets of objects, assigning a distinct set of objects to each processor such that each processor determines by which distinct grid portion(s) each object in its distinct set of objects is at least partially bounded, and assigning a distinct grid portion to each processor such that each processor populates its distinct grid portion with any objects that were previously determined to be at least partially bounded by its distinct grid portion.
Development of a Robust and Efficient Parallel Solver for Unsteady Turbomachinery Flows
West, Jeff; Wright, Jeffrey; Thakur, Siddharth; Luke, Ed; Grinstead, Nathan
2012-01-01
second-order time-stepping scheme, (c) a novel cloud-of-points interpolation method (based on a fast parallel kd-tree search algorithm) for interfaces between turbomachinery components in relative motion which is demonstrated to be highly scalable, and (d) demonstrated accuracy and parallel scalability on large grids (approx 250 million cells) in full turbomachinery geometries.
Template based parallel checkpointing in a massively parallel computer system
Archer, Charles Jens [Rochester, MN; Inglett, Todd Alan [Rochester, MN
2009-01-13
A method and apparatus for a template based parallel checkpoint save for a massively parallel super computer system using a parallel variation of the rsync protocol, and network broadcast. In preferred embodiments, the checkpoint data for each node is compared to a template checkpoint file that resides in the storage and that was previously produced. Embodiments herein greatly decrease the amount of data that must be transmitted and stored for faster checkpointing and increased efficiency of the computer system. Embodiments are directed to a parallel computer system with nodes arranged in a cluster with a high speed interconnect that can perform broadcast communication. The checkpoint contains a set of actual small data blocks with their corresponding checksums from all nodes in the system. The data blocks may be compressed using conventional non-lossy data compression algorithms to further reduce the overall checkpoint size.
Exploiting Symmetry on Parallel Architectures.
Stiller, Lewis Benjamin
1995-01-01
This thesis describes techniques for the design of parallel programs that solve well-structured problems with inherent symmetry. Part I demonstrates the reduction of such problems to generalized matrix multiplication by a group-equivariant matrix. Fast techniques for this multiplication are described, including factorization, orbit decomposition, and Fourier transforms over finite groups. Our algorithms entail interaction between two symmetry groups: one arising at the software level from the problem's symmetry and the other arising at the hardware level from the processors' communication network. Part II illustrates the applicability of our symmetry -exploitation techniques by presenting a series of case studies of the design and implementation of parallel programs. First, a parallel program that solves chess endgames by factorization of an associated dihedral group-equivariant matrix is described. This code runs faster than previous serial programs, and discovered it a number of results. Second, parallel algorithms for Fourier transforms for finite groups are developed, and preliminary parallel implementations for group transforms of dihedral and of symmetric groups are described. Applications in learning, vision, pattern recognition, and statistics are proposed. Third, parallel implementations solving several computational science problems are described, including the direct n-body problem, convolutions arising from molecular biology, and some communication primitives such as broadcast and reduce. Some of our implementations ran orders of magnitude faster than previous techniques, and were used in the investigation of various physical phenomena.
Parallel Programming with Intel Parallel Studio XE
Blair-Chappell , Stephen
2012-01-01
Optimize code for multi-core processors with Intel's Parallel Studio Parallel programming is rapidly becoming a "must-know" skill for developers. Yet, where to start? This teach-yourself tutorial is an ideal starting point for developers who already know Windows C and C++ and are eager to add parallelism to their code. With a focus on applying tools, techniques, and language extensions to implement parallelism, this essential resource teaches you how to write programs for multicore and leverage the power of multicore in your programs. Sharing hands-on case studies and real-world examples, the
Parallel algorithms for continuum dynamics
International Nuclear Information System (INIS)
Hicks, D.L.; Liebrock, L.M.
1987-01-01
Simply porting existing parallel programs to a new parallel processor may not achieve the full speedup possible; to achieve the maximum efficiency may require redesigning the parallel algorithms for the specific architecture. The authors discuss here parallel algorithms that were developed first for the HEP processor and then ported to the CRAY X-MP/4, the ELXSI/10, and the Intel iPSC/32. Focus is mainly on the most recent parallel processing results produced, i.e., those on the Intel Hypercube. The applications are simulations of continuum dynamics in which the momentum and stress gradients are important. Examples of these are inertial confinement fusion experiments, severe breaks in the coolant system of a reactor, weapons physics, shock-wave physics. Speedup efficiencies on the Intel iPSC Hypercube are very sensitive to the ratio of communication to computation. Great care must be taken in designing algorithms for this machine to avoid global communication. This is much more critical on the iPSC than it was on the three previous parallel processors
Parallel processing from applications to systems
Moldovan, Dan I
1993-01-01
This text provides one of the broadest presentations of parallelprocessing available, including the structure of parallelprocessors and parallel algorithms. The emphasis is on mappingalgorithms to highly parallel computers, with extensive coverage ofarray and multiprocessor architectures. Early chapters provideinsightful coverage on the analysis of parallel algorithms andprogram transformations, effectively integrating a variety ofmaterial previously scattered throughout the literature. Theory andpractice are well balanced across diverse topics in this concisepresentation. For exceptional cla
Parallel Boltzmann machines : a mathematical model
Zwietering, P.J.; Aarts, E.H.L.
1991-01-01
A mathematical model is presented for the description of parallel Boltzmann machines. The framework is based on the theory of Markov chains and combines a number of previously known results into one generic model. It is argued that parallel Boltzmann machines maximize a function consisting of a
Morse, H Stephen
1994-01-01
Practical Parallel Computing provides information pertinent to the fundamental aspects of high-performance parallel processing. This book discusses the development of parallel applications on a variety of equipment.Organized into three parts encompassing 12 chapters, this book begins with an overview of the technology trends that converge to favor massively parallel hardware over traditional mainframes and vector machines. This text then gives a tutorial introduction to parallel hardware architectures. Other chapters provide worked-out examples of programs using several parallel languages. Thi
Akl, Selim G
1985-01-01
Parallel Sorting Algorithms explains how to use parallel algorithms to sort a sequence of items on a variety of parallel computers. The book reviews the sorting problem, the parallel models of computation, parallel algorithms, and the lower bounds on the parallel sorting problems. The text also presents twenty different algorithms, such as linear arrays, mesh-connected computers, cube-connected computers. Another example where algorithm can be applied is on the shared-memory SIMD (single instruction stream multiple data stream) computers in which the whole sequence to be sorted can fit in the
PSHED: a simplified approach to developing parallel programs
International Nuclear Information System (INIS)
Mahajan, S.M.; Ramesh, K.; Rajesh, K.; Somani, A.; Goel, M.
1992-01-01
This paper presents a simplified approach in the forms of a tree structured computational model for parallel application programs. An attempt is made to provide a standard user interface to execute programs on BARC Parallel Processing System (BPPS), a scalable distributed memory multiprocessor. The interface package called PSHED provides a basic framework for representing and executing parallel programs on different parallel architectures. The PSHED package incorporates concepts from a broad range of previous research in programming environments and parallel computations. (author). 6 refs
Introduction to parallel programming
Brawer, Steven
1989-01-01
Introduction to Parallel Programming focuses on the techniques, processes, methodologies, and approaches involved in parallel programming. The book first offers information on Fortran, hardware and operating system models, and processes, shared memory, and simple parallel programs. Discussions focus on processes and processors, joining processes, shared memory, time-sharing with multiple processors, hardware, loops, passing arguments in function/subroutine calls, program structure, and arithmetic expressions. The text then elaborates on basic parallel programming techniques, barriers and race
Fox, Geoffrey C; Messina, Guiseppe C
2014-01-01
A clear illustration of how parallel computers can be successfully appliedto large-scale scientific computations. This book demonstrates how avariety of applications in physics, biology, mathematics and other scienceswere implemented on real parallel computers to produce new scientificresults. It investigates issues of fine-grained parallelism relevant forfuture supercomputers with particular emphasis on hypercube architecture. The authors describe how they used an experimental approach to configuredifferent massively parallel machines, design and implement basic systemsoftware, and develop
Parallel imaging microfluidic cytometer.
Ehrlich, Daniel J; McKenna, Brian K; Evans, James G; Belkina, Anna C; Denis, Gerald V; Sherr, David H; Cheung, Man Ching
2011-01-01
By adding an additional degree of freedom from multichannel flow, the parallel microfluidic cytometer (PMC) combines some of the best features of fluorescence-activated flow cytometry (FCM) and microscope-based high-content screening (HCS). The PMC (i) lends itself to fast processing of large numbers of samples, (ii) adds a 1D imaging capability for intracellular localization assays (HCS), (iii) has a high rare-cell sensitivity, and (iv) has an unusual capability for time-synchronized sampling. An inability to practically handle large sample numbers has restricted applications of conventional flow cytometers and microscopes in combinatorial cell assays, network biology, and drug discovery. The PMC promises to relieve a bottleneck in these previously constrained applications. The PMC may also be a powerful tool for finding rare primary cells in the clinic. The multichannel architecture of current PMC prototypes allows 384 unique samples for a cell-based screen to be read out in ∼6-10 min, about 30 times the speed of most current FCM systems. In 1D intracellular imaging, the PMC can obtain protein localization using HCS marker strategies at many times for the sample throughput of charge-coupled device (CCD)-based microscopes or CCD-based single-channel flow cytometers. The PMC also permits the signal integration time to be varied over a larger range than is practical in conventional flow cytometers. The signal-to-noise advantages are useful, for example, in counting rare positive cells in the most difficult early stages of genome-wide screening. We review the status of parallel microfluidic cytometry and discuss some of the directions the new technology may take. Copyright © 2011 Elsevier Inc. All rights reserved.
Parallel Atomistic Simulations
Energy Technology Data Exchange (ETDEWEB)
HEFFELFINGER,GRANT S.
2000-01-18
Algorithms developed to enable the use of atomistic molecular simulation methods with parallel computers are reviewed. Methods appropriate for bonded as well as non-bonded (and charged) interactions are included. While strategies for obtaining parallel molecular simulations have been developed for the full variety of atomistic simulation methods, molecular dynamics and Monte Carlo have received the most attention. Three main types of parallel molecular dynamics simulations have been developed, the replicated data decomposition, the spatial decomposition, and the force decomposition. For Monte Carlo simulations, parallel algorithms have been developed which can be divided into two categories, those which require a modified Markov chain and those which do not. Parallel algorithms developed for other simulation methods such as Gibbs ensemble Monte Carlo, grand canonical molecular dynamics, and Monte Carlo methods for protein structure determination are also reviewed and issues such as how to measure parallel efficiency, especially in the case of parallel Monte Carlo algorithms with modified Markov chains are discussed.
CERN. Geneva
2016-01-01
The traditionally used and well established parallel programming models OpenMP and MPI are both targeting lower level parallelism and are meant to be as language agnostic as possible. For a long time, those models were the only widely available portable options for developing parallel C++ applications beyond using plain threads. This has strongly limited the optimization capabilities of compilers, has inhibited extensibility and genericity, and has restricted the use of those models together with other, modern higher level abstractions introduced by the C++11 and C++14 standards. The recent revival of interest in the industry and wider community for the C++ language has also spurred a remarkable amount of standardization proposals and technical specifications being developed. Those efforts however have so far failed to build a vision on how to seamlessly integrate various types of parallelism, such as iterative parallel execution, task-based parallelism, asynchronous many-task execution flows, continuation s...
Parallelism in matrix computations
Gallopoulos, Efstratios; Sameh, Ahmed H
2016-01-01
This book is primarily intended as a research monograph that could also be used in graduate courses for the design of parallel algorithms in matrix computations. It assumes general but not extensive knowledge of numerical linear algebra, parallel architectures, and parallel programming paradigms. The book consists of four parts: (I) Basics; (II) Dense and Special Matrix Computations; (III) Sparse Matrix Computations; and (IV) Matrix functions and characteristics. Part I deals with parallel programming paradigms and fundamental kernels, including reordering schemes for sparse matrices. Part II is devoted to dense matrix computations such as parallel algorithms for solving linear systems, linear least squares, the symmetric algebraic eigenvalue problem, and the singular-value decomposition. It also deals with the development of parallel algorithms for special linear systems such as banded ,Vandermonde ,Toeplitz ,and block Toeplitz systems. Part III addresses sparse matrix computations: (a) the development of pa...
DEFF Research Database (Denmark)
Sitchinava, Nodar; Zeh, Norbert
2012-01-01
We present the parallel buffer tree, a parallel external memory (PEM) data structure for batched search problems. This data structure is a non-trivial extension of Arge's sequential buffer tree to a private-cache multiprocessor environment and reduces the number of I/O operations by the number of...... in the optimal OhOf(psortN + K/PB) parallel I/O complexity, where K is the size of the output reported in the process and psortN is the parallel I/O complexity of sorting N elements using P processors....
Deshmane, Anagha; Gulani, Vikas; Griswold, Mark A; Seiberlich, Nicole
2012-07-01
Parallel imaging is a robust method for accelerating the acquisition of magnetic resonance imaging (MRI) data, and has made possible many new applications of MR imaging. Parallel imaging works by acquiring a reduced amount of k-space data with an array of receiver coils. These undersampled data can be acquired more quickly, but the undersampling leads to aliased images. One of several parallel imaging algorithms can then be used to reconstruct artifact-free images from either the aliased images (SENSE-type reconstruction) or from the undersampled data (GRAPPA-type reconstruction). The advantages of parallel imaging in a clinical setting include faster image acquisition, which can be used, for instance, to shorten breath-hold times resulting in fewer motion-corrupted examinations. In this article the basic concepts behind parallel imaging are introduced. The relationship between undersampling and aliasing is discussed and two commonly used parallel imaging methods, SENSE and GRAPPA, are explained in detail. Examples of artifacts arising from parallel imaging are shown and ways to detect and mitigate these artifacts are described. Finally, several current applications of parallel imaging are presented and recent advancements and promising research in parallel imaging are briefly reviewed. Copyright © 2012 Wiley Periodicals, Inc.
Parallel Algorithms and Patterns
Energy Technology Data Exchange (ETDEWEB)
Robey, Robert W. [Los Alamos National Lab. (LANL), Los Alamos, NM (United States)
2016-06-16
This is a powerpoint presentation on parallel algorithms and patterns. A parallel algorithm is a well-defined, step-by-step computational procedure that emphasizes concurrency to solve a problem. Examples of problems include: Sorting, searching, optimization, matrix operations. A parallel pattern is a computational step in a sequence of independent, potentially concurrent operations that occurs in diverse scenarios with some frequency. Examples are: Reductions, prefix scans, ghost cell updates. We only touch on parallel patterns in this presentation. It really deserves its own detailed discussion which Gabe Rockefeller would like to develop.
Application Portable Parallel Library
Cole, Gary L.; Blech, Richard A.; Quealy, Angela; Townsend, Scott
1995-01-01
Application Portable Parallel Library (APPL) computer program is subroutine-based message-passing software library intended to provide consistent interface to variety of multiprocessor computers on market today. Minimizes effort needed to move application program from one computer to another. User develops application program once and then easily moves application program from parallel computer on which created to another parallel computer. ("Parallel computer" also include heterogeneous collection of networked computers). Written in C language with one FORTRAN 77 subroutine for UNIX-based computers and callable from application programs written in C language or FORTRAN 77.
Parallel discrete event simulation
Overeinder, B.J.; Hertzberger, L.O.; Sloot, P.M.A.; Withagen, W.J.
1991-01-01
In simulating applications for execution on specific computing systems, the simulation performance figures must be known in a short period of time. One basic approach to the problem of reducing the required simulation time is the exploitation of parallelism. However, in parallelizing the simulation
Parallel reservoir simulator computations
International Nuclear Information System (INIS)
Hemanth-Kumar, K.; Young, L.C.
1995-01-01
The adaptation of a reservoir simulator for parallel computations is described. The simulator was originally designed for vector processors. It performs approximately 99% of its calculations in vector/parallel mode and relative to scalar calculations it achieves speedups of 65 and 81 for black oil and EOS simulations, respectively on the CRAY C-90
Totally parallel multilevel algorithms
Frederickson, Paul O.
1988-01-01
Four totally parallel algorithms for the solution of a sparse linear system have common characteristics which become quite apparent when they are implemented on a highly parallel hypercube such as the CM2. These four algorithms are Parallel Superconvergent Multigrid (PSMG) of Frederickson and McBryan, Robust Multigrid (RMG) of Hackbusch, the FFT based Spectral Algorithm, and Parallel Cyclic Reduction. In fact, all four can be formulated as particular cases of the same totally parallel multilevel algorithm, which are referred to as TPMA. In certain cases the spectral radius of TPMA is zero, and it is recognized to be a direct algorithm. In many other cases the spectral radius, although not zero, is small enough that a single iteration per timestep keeps the local error within the required tolerance.
Energy Technology Data Exchange (ETDEWEB)
1991-10-23
An account of the Caltech Concurrent Computation Program (C{sup 3}P), a five year project that focused on answering the question: Can parallel computers be used to do large-scale scientific computations '' As the title indicates, the question is answered in the affirmative, by implementing numerous scientific applications on real parallel computers and doing computations that produced new scientific results. In the process of doing so, C{sup 3}P helped design and build several new computers, designed and implemented basic system software, developed algorithms for frequently used mathematical computations on massively parallel machines, devised performance models and measured the performance of many computers, and created a high performance computing facility based exclusively on parallel computers. While the initial focus of C{sup 3}P was the hypercube architecture developed by C. Seitz, many of the methods developed and lessons learned have been applied successfully on other massively parallel architectures.
Massively parallel mathematical sieves
Energy Technology Data Exchange (ETDEWEB)
Montry, G.R.
1989-01-01
The Sieve of Eratosthenes is a well-known algorithm for finding all prime numbers in a given subset of integers. A parallel version of the Sieve is described that produces computational speedups over 800 on a hypercube with 1,024 processing elements for problems of fixed size. Computational speedups as high as 980 are achieved when the problem size per processor is fixed. The method of parallelization generalizes to other sieves and will be efficient on any ensemble architecture. We investigate two highly parallel sieves using scattered decomposition and compare their performance on a hypercube multiprocessor. A comparison of different parallelization techniques for the sieve illustrates the trade-offs necessary in the design and implementation of massively parallel algorithms for large ensemble computers.
Previously unknown species of Aspergillus.
Gautier, M; Normand, A-C; Ranque, S
2016-08-01
The use of multi-locus DNA sequence analysis has led to the description of previously unknown 'cryptic' Aspergillus species, whereas classical morphology-based identification of Aspergillus remains limited to the section or species-complex level. The current literature highlights two main features concerning these 'cryptic' Aspergillus species. First, the prevalence of such species in clinical samples is relatively high compared with emergent filamentous fungal taxa such as Mucorales, Scedosporium or Fusarium. Second, it is clearly important to identify these species in the clinical laboratory because of the high frequency of antifungal drug-resistant isolates of such Aspergillus species. Matrix-assisted laser desorption/ionization-time of flight mass spectrometry (MALDI-TOF MS) has recently been shown to enable the identification of filamentous fungi with an accuracy similar to that of DNA sequence-based methods. As MALDI-TOF MS is well suited to the routine clinical laboratory workflow, it facilitates the identification of these 'cryptic' Aspergillus species at the routine mycology bench. The rapid establishment of enhanced filamentous fungi identification facilities will lead to a better understanding of the epidemiology and clinical importance of these emerging Aspergillus species. Based on routine MALDI-TOF MS-based identification results, we provide original insights into the key interpretation issues of a positive Aspergillus culture from a clinical sample. Which ubiquitous species that are frequently isolated from air samples are rarely involved in human invasive disease? Can both the species and the type of biological sample indicate Aspergillus carriage, colonization or infection in a patient? Highly accurate routine filamentous fungi identification is central to enhance the understanding of these previously unknown Aspergillus species, with a vital impact on further improved patient care. Copyright © 2016 European Society of Clinical Microbiology and
Freeman, Bryan
2013-01-01
This book contains practical recipes on everything you will need to create task-based parallel programs using C#, .NET 4.5, and Visual Studio. The book is packed with illustrated code examples to create scalable programs.This book is intended to help experienced C# developers write applications that leverage the power of modern multicore processors. It provides the necessary knowledge for an experienced C# developer to work with .NET parallelism APIs. Previous experience of writing multithreaded applications is not necessary.
Algorithms for parallel computers
International Nuclear Information System (INIS)
Churchhouse, R.F.
1985-01-01
Until relatively recently almost all the algorithms for use on computers had been designed on the (usually unstated) assumption that they were to be run on single processor, serial machines. With the introduction of vector processors, array processors and interconnected systems of mainframes, minis and micros, however, various forms of parallelism have become available. The advantage of parallelism is that it offers increased overall processing speed but it also raises some fundamental questions, including: (i) which, if any, of the existing 'serial' algorithms can be adapted for use in the parallel mode. (ii) How close to optimal can such adapted algorithms be and, where relevant, what are the convergence criteria. (iii) How can we design new algorithms specifically for parallel systems. (iv) For multi-processor systems how can we handle the software aspects of the interprocessor communications. Aspects of these questions illustrated by examples are considered in these lectures. (orig.)
Parallelism and array processing
International Nuclear Information System (INIS)
Zacharov, V.
1983-01-01
Modern computing, as well as the historical development of computing, has been dominated by sequential monoprocessing. Yet there is the alternative of parallelism, where several processes may be in concurrent execution. This alternative is discussed in a series of lectures, in which the main developments involving parallelism are considered, both from the standpoint of computing systems and that of applications that can exploit such systems. The lectures seek to discuss parallelism in a historical context, and to identify all the main aspects of concurrency in computation right up to the present time. Included will be consideration of the important question as to what use parallelism might be in the field of data processing. (orig.)
Parallel magnetic resonance imaging
International Nuclear Information System (INIS)
Larkman, David J; Nunes, Rita G
2007-01-01
Parallel imaging has been the single biggest innovation in magnetic resonance imaging in the last decade. The use of multiple receiver coils to augment the time consuming Fourier encoding has reduced acquisition times significantly. This increase in speed comes at a time when other approaches to acquisition time reduction were reaching engineering and human limits. A brief summary of spatial encoding in MRI is followed by an introduction to the problem parallel imaging is designed to solve. There are a large number of parallel reconstruction algorithms; this article reviews a cross-section, SENSE, SMASH, g-SMASH and GRAPPA, selected to demonstrate the different approaches. Theoretical (the g-factor) and practical (coil design) limits to acquisition speed are reviewed. The practical implementation of parallel imaging is also discussed, in particular coil calibration. How to recognize potential failure modes and their associated artefacts are shown. Well-established applications including angiography, cardiac imaging and applications using echo planar imaging are reviewed and we discuss what makes a good application for parallel imaging. Finally, active research areas where parallel imaging is being used to improve data quality by repairing artefacted images are also reviewed. (invited topical review)
The STAPL Parallel Graph Library
Harshvardhan,; Fidel, Adam; Amato, Nancy M.; Rauchwerger, Lawrence
2013-01-01
This paper describes the stapl Parallel Graph Library, a high-level framework that abstracts the user from data-distribution and parallelism details and allows them to concentrate on parallel graph algorithm development. It includes a customizable
Massively parallel multicanonical simulations
Gross, Jonathan; Zierenberg, Johannes; Weigel, Martin; Janke, Wolfhard
2018-03-01
Generalized-ensemble Monte Carlo simulations such as the multicanonical method and similar techniques are among the most efficient approaches for simulations of systems undergoing discontinuous phase transitions or with rugged free-energy landscapes. As Markov chain methods, they are inherently serial computationally. It was demonstrated recently, however, that a combination of independent simulations that communicate weight updates at variable intervals allows for the efficient utilization of parallel computational resources for multicanonical simulations. Implementing this approach for the many-thread architecture provided by current generations of graphics processing units (GPUs), we show how it can be efficiently employed with of the order of 104 parallel walkers and beyond, thus constituting a versatile tool for Monte Carlo simulations in the era of massively parallel computing. We provide the fully documented source code for the approach applied to the paradigmatic example of the two-dimensional Ising model as starting point and reference for practitioners in the field.
Murni, Bustamam, A.; Ernastuti, Handhika, T.; Kerami, D.
2017-07-01
Calculation of the matrix-vector multiplication in the real-world problems often involves large matrix with arbitrary size. Therefore, parallelization is needed to speed up the calculation process that usually takes a long time. Graph partitioning techniques that have been discussed in the previous studies cannot be used to complete the parallelized calculation of matrix-vector multiplication with arbitrary size. This is due to the assumption of graph partitioning techniques that can only solve the square and symmetric matrix. Hypergraph partitioning techniques will overcome the shortcomings of the graph partitioning technique. This paper addresses the efficient parallelization of matrix-vector multiplication through hypergraph partitioning techniques using CUDA GPU-based parallel computing. CUDA (compute unified device architecture) is a parallel computing platform and programming model that was created by NVIDIA and implemented by the GPU (graphics processing unit).
SPINning parallel systems software
International Nuclear Information System (INIS)
Matlin, O.S.; Lusk, E.; McCune, W.
2002-01-01
We describe our experiences in using Spin to verify parts of the Multi Purpose Daemon (MPD) parallel process management system. MPD is a distributed collection of processes connected by Unix network sockets. MPD is dynamic processes and connections among them are created and destroyed as MPD is initialized, runs user processes, recovers from faults, and terminates. This dynamic nature is easily expressible in the Spin/Promela framework but poses performance and scalability challenges. We present here the results of expressing some of the parallel algorithms of MPD and executing both simulation and verification runs with Spin
Parallel programming with Python
Palach, Jan
2014-01-01
A fast, easy-to-follow and clear tutorial to help you develop Parallel computing systems using Python. Along with explaining the fundamentals, the book will also introduce you to slightly advanced concepts and will help you in implementing these techniques in the real world. If you are an experienced Python programmer and are willing to utilize the available computing resources by parallelizing applications in a simple way, then this book is for you. You are required to have a basic knowledge of Python development to get the most of this book.
Contributions to computational stereology and parallel programming
DEFF Research Database (Denmark)
Rasmusson, Allan
rotator, even without the need for isotropic sections. To meet the need for computational power to perform image restoration of virtual tissue sections, parallel programming on GPUs has also been part of the project. This has lead to a significant change in paradigm for a previously developed surgical...
Expressing Parallelism with ROOT
Energy Technology Data Exchange (ETDEWEB)
Piparo, D. [CERN; Tejedor, E. [CERN; Guiraud, E. [CERN; Ganis, G. [CERN; Mato, P. [CERN; Moneta, L. [CERN; Valls Pla, X. [CERN; Canal, P. [Fermilab
2017-11-22
The need for processing the ever-increasing amount of data generated by the LHC experiments in a more efficient way has motivated ROOT to further develop its support for parallelism. Such support is being tackled both for shared-memory and distributed-memory environments. The incarnations of the aforementioned parallelism are multi-threading, multi-processing and cluster-wide executions. In the area of multi-threading, we discuss the new implicit parallelism and related interfaces, as well as the new building blocks to safely operate with ROOT objects in a multi-threaded environment. Regarding multi-processing, we review the new MultiProc framework, comparing it with similar tools (e.g. multiprocessing module in Python). Finally, as an alternative to PROOF for cluster-wide executions, we introduce the efforts on integrating ROOT with state-of-the-art distributed data processing technologies like Spark, both in terms of programming model and runtime design (with EOS as one of the main components). For all the levels of parallelism, we discuss, based on real-life examples and measurements, how our proposals can increase the productivity of scientists.
Expressing Parallelism with ROOT
Piparo, D.; Tejedor, E.; Guiraud, E.; Ganis, G.; Mato, P.; Moneta, L.; Valls Pla, X.; Canal, P.
2017-10-01
The need for processing the ever-increasing amount of data generated by the LHC experiments in a more efficient way has motivated ROOT to further develop its support for parallelism. Such support is being tackled both for shared-memory and distributed-memory environments. The incarnations of the aforementioned parallelism are multi-threading, multi-processing and cluster-wide executions. In the area of multi-threading, we discuss the new implicit parallelism and related interfaces, as well as the new building blocks to safely operate with ROOT objects in a multi-threaded environment. Regarding multi-processing, we review the new MultiProc framework, comparing it with similar tools (e.g. multiprocessing module in Python). Finally, as an alternative to PROOF for cluster-wide executions, we introduce the efforts on integrating ROOT with state-of-the-art distributed data processing technologies like Spark, both in terms of programming model and runtime design (with EOS as one of the main components). For all the levels of parallelism, we discuss, based on real-life examples and measurements, how our proposals can increase the productivity of scientists.
Parallel Fast Legendre Transform
Alves de Inda, M.; Bisseling, R.H.; Maslen, D.K.
1998-01-01
We discuss a parallel implementation of a fast algorithm for the discrete polynomial Legendre transform We give an introduction to the DriscollHealy algorithm using polynomial arithmetic and present experimental results on the eciency and accuracy of our implementation The algorithms were
Practical parallel programming
Bauer, Barr E
2014-01-01
This is the book that will teach programmers to write faster, more efficient code for parallel processors. The reader is introduced to a vast array of procedures and paradigms on which actual coding may be based. Examples and real-life simulations using these devices are presented in C and FORTRAN.
Parallel hierarchical radiosity rendering
Energy Technology Data Exchange (ETDEWEB)
Carter, Michael [Iowa State Univ., Ames, IA (United States)
1993-07-01
In this dissertation, the step-by-step development of a scalable parallel hierarchical radiosity renderer is documented. First, a new look is taken at the traditional radiosity equation, and a new form is presented in which the matrix of linear system coefficients is transformed into a symmetric matrix, thereby simplifying the problem and enabling a new solution technique to be applied. Next, the state-of-the-art hierarchical radiosity methods are examined for their suitability to parallel implementation, and scalability. Significant enhancements are also discovered which both improve their theoretical foundations and improve the images they generate. The resultant hierarchical radiosity algorithm is then examined for sources of parallelism, and for an architectural mapping. Several architectural mappings are discussed. A few key algorithmic changes are suggested during the process of making the algorithm parallel. Next, the performance, efficiency, and scalability of the algorithm are analyzed. The dissertation closes with a discussion of several ideas which have the potential to further enhance the hierarchical radiosity method, or provide an entirely new forum for the application of hierarchical methods.
Parallel universes beguile science
2007-01-01
A staple of mind-bending science fiction, the possibility of multiple universes has long intrigued hard-nosed physicists, mathematicians and cosmologists too. We may not be able -- as least not yet -- to prove they exist, many serious scientists say, but there are plenty of reasons to think that parallel dimensions are more than figments of eggheaded imagination.
Energy Technology Data Exchange (ETDEWEB)
2017-04-04
A parallelization of the k-means++ seed selection algorithm on three distinct hardware platforms: GPU, multicore CPU, and multithreaded architecture. K-means++ was developed by David Arthur and Sergei Vassilvitskii in 2007 as an extension of the k-means data clustering technique. These algorithms allow people to cluster multidimensional data, by attempting to minimize the mean distance of data points within a cluster. K-means++ improved upon traditional k-means by using a more intelligent approach to selecting the initial seeds for the clustering process. While k-means++ has become a popular alternative to traditional k-means clustering, little work has been done to parallelize this technique. We have developed original C++ code for parallelizing the algorithm on three unique hardware architectures: GPU using NVidia's CUDA/Thrust framework, multicore CPU using OpenMP, and the Cray XMT multithreaded architecture. By parallelizing the process for these platforms, we are able to perform k-means++ clustering much more quickly than it could be done before.
International Nuclear Information System (INIS)
Gardes, D.; Volkov, P.
1981-01-01
A 5x3cm 2 (timing only) and a 15x5cm 2 (timing and position) parallel plate avalanche counters (PPAC) are considered. The theory of operation and timing resolution is given. The measurement set-up and the curves of experimental results illustrate the possibilities of the two counters [fr
Parallel hierarchical global illumination
Energy Technology Data Exchange (ETDEWEB)
Snell, Quinn O. [Iowa State Univ., Ames, IA (United States)
1997-10-08
Solving the global illumination problem is equivalent to determining the intensity of every wavelength of light in all directions at every point in a given scene. The complexity of the problem has led researchers to use approximation methods for solving the problem on serial computers. Rather than using an approximation method, such as backward ray tracing or radiosity, the authors have chosen to solve the Rendering Equation by direct simulation of light transport from the light sources. This paper presents an algorithm that solves the Rendering Equation to any desired accuracy, and can be run in parallel on distributed memory or shared memory computer systems with excellent scaling properties. It appears superior in both speed and physical correctness to recent published methods involving bidirectional ray tracing or hybrid treatments of diffuse and specular surfaces. Like progressive radiosity methods, it dynamically refines the geometry decomposition where required, but does so without the excessive storage requirements for ray histories. The algorithm, called Photon, produces a scene which converges to the global illumination solution. This amounts to a huge task for a 1997-vintage serial computer, but using the power of a parallel supercomputer significantly reduces the time required to generate a solution. Currently, Photon can be run on most parallel environments from a shared memory multiprocessor to a parallel supercomputer, as well as on clusters of heterogeneous workstations.
Towards a streaming model for nested data parallelism
DEFF Research Database (Denmark)
Madsen, Frederik Meisner; Filinski, Andrzej
2013-01-01
The language-integrated cost semantics for nested data parallelism pioneered by NESL provides an intuitive, high-level model for predicting performance and scalability of parallel algorithms with reasonable accuracy. However, this predictability, obtained through a uniform, parallelism-flattening......The language-integrated cost semantics for nested data parallelism pioneered by NESL provides an intuitive, high-level model for predicting performance and scalability of parallel algorithms with reasonable accuracy. However, this predictability, obtained through a uniform, parallelism......-processable in a streaming fashion. This semantics is directly compatible with previously proposed piecewise execution models for nested data parallelism, but allows the expected space usage to be reasoned about directly at the source-language level. The language definition and implementation are still very much work...
Ultrascalable petaflop parallel supercomputer
Blumrich, Matthias A [Ridgefield, CT; Chen, Dong [Croton On Hudson, NY; Chiu, George [Cross River, NY; Cipolla, Thomas M [Katonah, NY; Coteus, Paul W [Yorktown Heights, NY; Gara, Alan G [Mount Kisco, NY; Giampapa, Mark E [Irvington, NY; Hall, Shawn [Pleasantville, NY; Haring, Rudolf A [Cortlandt Manor, NY; Heidelberger, Philip [Cortlandt Manor, NY; Kopcsay, Gerard V [Yorktown Heights, NY; Ohmacht, Martin [Yorktown Heights, NY; Salapura, Valentina [Chappaqua, NY; Sugavanam, Krishnan [Mahopac, NY; Takken, Todd [Brewster, NY
2010-07-20
A massively parallel supercomputer of petaOPS-scale includes node architectures based upon System-On-a-Chip technology, where each processing node comprises a single Application Specific Integrated Circuit (ASIC) having up to four processing elements. The ASIC nodes are interconnected by multiple independent networks that optimally maximize the throughput of packet communications between nodes with minimal latency. The multiple networks may include three high-speed networks for parallel algorithm message passing including a Torus, collective network, and a Global Asynchronous network that provides global barrier and notification functions. These multiple independent networks may be collaboratively or independently utilized according to the needs or phases of an algorithm for optimizing algorithm processing performance. The use of a DMA engine is provided to facilitate message passing among the nodes without the expenditure of processing resources at the node.
DEFF Research Database (Denmark)
Gregersen, Frans; Josephson, Olle; Kristoffersen, Gjert
of departure that English may be used in parallel with the various local, in this case Nordic, languages. As such, the book integrates the challenge of internationalization faced by any university with the wish to improve quality in research, education and administration based on the local language......Abstract [en] More parallel, please is the result of the work of an Inter-Nordic group of experts on language policy financed by the Nordic Council of Ministers 2014-17. The book presents all that is needed to plan, practice and revise a university language policy which takes as its point......(s). There are three layers in the text: First, you may read the extremely brief version of the in total 11 recommendations for best practice. Second, you may acquaint yourself with the extended version of the recommendations and finally, you may study the reasoning behind each of them. At the end of the text, we give...
PARALLEL MOVING MECHANICAL SYSTEMS
Directory of Open Access Journals (Sweden)
Florian Ion Tiberius Petrescu
2014-09-01
Full Text Available Normal 0 false false false EN-US X-NONE X-NONE MicrosoftInternetExplorer4 Moving mechanical systems parallel structures are solid, fast, and accurate. Between parallel systems it is to be noticed Stewart platforms, as the oldest systems, fast, solid and precise. The work outlines a few main elements of Stewart platforms. Begin with the geometry platform, kinematic elements of it, and presented then and a few items of dynamics. Dynamic primary element on it means the determination mechanism kinetic energy of the entire Stewart platforms. It is then in a record tail cinematic mobile by a method dot matrix of rotation. If a structural mottoelement consists of two moving elements which translates relative, drive train and especially dynamic it is more convenient to represent the mottoelement as a single moving components. We have thus seven moving parts (the six motoelements or feet to which is added mobile platform 7 and one fixed.
Xyce parallel electronic simulator.
Energy Technology Data Exchange (ETDEWEB)
Keiter, Eric R; Mei, Ting; Russo, Thomas V.; Rankin, Eric Lamont; Schiek, Richard Louis; Thornquist, Heidi K.; Fixel, Deborah A.; Coffey, Todd S; Pawlowski, Roger P; Santarelli, Keith R.
2010-05-01
This document is a reference guide to the Xyce Parallel Electronic Simulator, and is a companion document to the Xyce Users Guide. The focus of this document is (to the extent possible) exhaustively list device parameters, solver options, parser options, and other usage details of Xyce. This document is not intended to be a tutorial. Users who are new to circuit simulation are better served by the Xyce Users Guide.
Betchov, R
2012-01-01
Stability of Parallel Flows provides information pertinent to hydrodynamical stability. This book explores the stability problems that occur in various fields, including electronics, mechanics, oceanography, administration, economics, as well as naval and aeronautical engineering. Organized into two parts encompassing 10 chapters, this book starts with an overview of the general equations of a two-dimensional incompressible flow. This text then explores the stability of a laminar boundary layer and presents the equation of the inviscid approximation. Other chapters present the general equation
Algorithmically specialized parallel computers
Snyder, Lawrence; Gannon, Dennis B
1985-01-01
Algorithmically Specialized Parallel Computers focuses on the concept and characteristics of an algorithmically specialized computer.This book discusses the algorithmically specialized computers, algorithmic specialization using VLSI, and innovative architectures. The architectures and algorithms for digital signal, speech, and image processing and specialized architectures for numerical computations are also elaborated. Other topics include the model for analyzing generalized inter-processor, pipelined architecture for search tree maintenance, and specialized computer organization for raster
International Nuclear Information System (INIS)
Deng Li; Xie Zhongsheng
1999-01-01
The coupled neutron and photon transport Monte Carlo code MCNP (version 3B) has been parallelized in parallel virtual machine (PVM) and message passing interface (MPI) by modifying a previous serial code. The new code has been verified by solving sample problems. The speedup increases linearly with the number of processors and the average efficiency is up to 99% for 12-processor. (author)
Resistor Combinations for Parallel Circuits.
McTernan, James P.
1978-01-01
To help simplify both teaching and learning of parallel circuits, a high school electricity/electronics teacher presents and illustrates the use of tables of values for parallel resistive circuits in which total resistances are whole numbers. (MF)
SOFTWARE FOR DESIGNING PARALLEL APPLICATIONS
Directory of Open Access Journals (Sweden)
M. K. Bouza
2017-01-01
Full Text Available The object of research is the tools to support the development of parallel programs in C/C ++. The methods and software which automates the process of designing parallel applications are proposed.
A Parallel Priority Queue with Constant Time Operations
DEFF Research Database (Denmark)
Brodal, Gerth Stølting; Träff, Jesper Larsson; Zaroliagis, Christos D.
1998-01-01
We present a parallel priority queue that supports the following operations in constant time:parallel insertionof a sequence of elements ordered according to key,parallel decrease keyfor a sequence of elements ordered according to key,deletion of the minimum key element, anddeletion of an arbitrary...... application is a parallel implementation of Dijkstra's algorithm for the single-source shortest path problem, which runs inO(n) time andO(mlogn) work on a CREW PRAM on graphs withnvertices andmedges. This is a logarithmic factor improvement in the running time compared with previous approaches....
Parallel External Memory Graph Algorithms
DEFF Research Database (Denmark)
Arge, Lars Allan; Goodrich, Michael T.; Sitchinava, Nodari
2010-01-01
In this paper, we study parallel I/O efficient graph algorithms in the Parallel External Memory (PEM) model, one o f the private-cache chip multiprocessor (CMP) models. We study the fundamental problem of list ranking which leads to efficient solutions to problems on trees, such as computing lowest...... an optimal speedup of Â¿(P) in parallel I/O complexity and parallel computation time, compared to the single-processor external memory counterparts....
Parallelization of Subchannel Analysis Code MATRA
International Nuclear Information System (INIS)
Kim, Seongjin; Hwang, Daehyun; Kwon, Hyouk
2014-01-01
A stand-alone calculation of MATRA code used up pertinent computing time for the thermal margin calculations while a relatively considerable time is needed to solve the whole core pin-by-pin problems. In addition, it is strongly required to improve the computation speed of the MATRA code to satisfy the overall performance of the multi-physics coupling calculations. Therefore, a parallel approach to improve and optimize the computability of the MATRA code is proposed and verified in this study. The parallel algorithm is embodied in the MATRA code using the MPI communication method and the modification of the previous code structure was minimized. An improvement is confirmed by comparing the results between the single and multiple processor algorithms. The speedup and efficiency are also evaluated when increasing the number of processors. The parallel algorithm was implemented to the subchannel code MATRA using the MPI. The performance of the parallel algorithm was verified by comparing the results with those from the MATRA with the single processor. It is also noticed that the performance of the MATRA code was greatly improved by implementing the parallel algorithm for the 1/8 core and whole core problems
Improvement of Parallel Algorithm for MATRA Code
International Nuclear Information System (INIS)
Kim, Seong-Jin; Seo, Kyong-Won; Kwon, Hyouk; Hwang, Dae-Hyun
2014-01-01
The feasibility study to parallelize the MATRA code was conducted in KAERI early this year. As a result, a parallel algorithm for the MATRA code has been developed to decrease a considerably required computing time to solve a bigsize problem such as a whole core pin-by-pin problem of a general PWR reactor and to improve an overall performance of the multi-physics coupling calculations. It was shown that the performance of the MATRA code was greatly improved by implementing the parallel algorithm using MPI communication. For problems of a 1/8 core and whole core for SMART reactor, a speedup was evaluated as about 10 when the numbers of used processor were 25. However, it was also shown that the performance deteriorated as the axial node number increased. In this paper, the procedure of a communication between processors is optimized to improve the previous parallel algorithm.. To improve the performance deterioration of the parallelized MATRA code, the communication algorithm between processors was newly presented. It was shown that the speedup was improved and stable regardless of the axial node number
International Nuclear Information System (INIS)
Baron, E.; Hauschildt, Peter H.
1998-01-01
We describe an important addition to the parallel implementation of our generalized nonlocal thermodynamic equilibrium (NLTE) stellar atmosphere and radiative transfer computer program PHOENIX. In a previous paper in this series we described data and task parallel algorithms we have developed for radiative transfer, spectral line opacity, and NLTE opacity and rate calculations. These algorithms divided the work spatially or by spectral lines, that is, distributing the radial zones, individual spectral lines, or characteristic rays among different processors and employ, in addition, task parallelism for logically independent functions (such as atomic and molecular line opacities). For finite, monotonic velocity fields, the radiative transfer equation is an initial value problem in wavelength, and hence each wavelength point depends upon the previous one. However, for sophisticated NLTE models of both static and moving atmospheres needed to accurately describe, e.g., novae and supernovae, the number of wavelength points is very large (200,000 - 300,000) and hence parallelization over wavelength can lead both to considerable speedup in calculation time and the ability to make use of the aggregate memory available on massively parallel supercomputers. Here, we describe an implementation of a pipelined design for the wavelength parallelization of PHOENIX, where the necessary data from the processor working on a previous wavelength point is sent to the processor working on the succeeding wavelength point as soon as it is known. Our implementation uses a MIMD design based on a relatively small number of standard message passing interface (MPI) library calls and is fully portable between serial and parallel computers. copyright 1998 The American Astronomical Society
Parallel inter channel interaction mechanisms
International Nuclear Information System (INIS)
Jovic, V.; Afgan, N.; Jovic, L.
1995-01-01
Parallel channels interactions are examined. For experimental researches of nonstationary regimes flow in three parallel vertical channels results of phenomenon analysis and mechanisms of parallel channel interaction for adiabatic condition of one-phase fluid and two-phase mixture flow are shown. (author)
International Nuclear Information System (INIS)
Soltz, R; Vranas, P; Blumrich, M; Chen, D; Gara, A; Giampap, M; Heidelberger, P; Salapura, V; Sexton, J; Bhanot, G
2007-01-01
The theory of the strong nuclear force, Quantum Chromodynamics (QCD), can be numerically simulated from first principles on massively-parallel supercomputers using the method of Lattice Gauge Theory. We describe the special programming requirements of lattice QCD (LQCD) as well as the optimal supercomputer hardware architectures that it suggests. We demonstrate these methods on the BlueGene massively-parallel supercomputer and argue that LQCD and the BlueGene architecture are a natural match. This can be traced to the simple fact that LQCD is a regular lattice discretization of space into lattice sites while the BlueGene supercomputer is a discretization of space into compute nodes, and that both are constrained by requirements of locality. This simple relation is both technologically important and theoretically intriguing. The main result of this paper is the speedup of LQCD using up to 131,072 CPUs on the largest BlueGene/L supercomputer. The speedup is perfect with sustained performance of about 20% of peak. This corresponds to a maximum of 70.5 sustained TFlop/s. At these speeds LQCD and BlueGene are poised to produce the next generation of strong interaction physics theoretical results
A Parallel Butterfly Algorithm
Poulson, Jack; Demanet, Laurent; Maxwell, Nicholas; Ying, Lexing
2014-01-01
The butterfly algorithm is a fast algorithm which approximately evaluates a discrete analogue of the integral transform (Equation Presented.) at large numbers of target points when the kernel, K(x, y), is approximately low-rank when restricted to subdomains satisfying a certain simple geometric condition. In d dimensions with O(Nd) quasi-uniformly distributed source and target points, when each appropriate submatrix of K is approximately rank-r, the running time of the algorithm is at most O(r2Nd logN). A parallelization of the butterfly algorithm is introduced which, assuming a message latency of α and per-process inverse bandwidth of β, executes in at most (Equation Presented.) time using p processes. This parallel algorithm was then instantiated in the form of the open-source DistButterfly library for the special case where K(x, y) = exp(iΦ(x, y)), where Φ(x, y) is a black-box, sufficiently smooth, real-valued phase function. Experiments on Blue Gene/Q demonstrate impressive strong-scaling results for important classes of phase functions. Using quasi-uniform sources, hyperbolic Radon transforms, and an analogue of a three-dimensional generalized Radon transform were, respectively, observed to strong-scale from 1-node/16-cores up to 1024-nodes/16,384-cores with greater than 90% and 82% efficiency, respectively. © 2014 Society for Industrial and Applied Mathematics.
A Parallel Butterfly Algorithm
Poulson, Jack
2014-02-04
The butterfly algorithm is a fast algorithm which approximately evaluates a discrete analogue of the integral transform (Equation Presented.) at large numbers of target points when the kernel, K(x, y), is approximately low-rank when restricted to subdomains satisfying a certain simple geometric condition. In d dimensions with O(Nd) quasi-uniformly distributed source and target points, when each appropriate submatrix of K is approximately rank-r, the running time of the algorithm is at most O(r2Nd logN). A parallelization of the butterfly algorithm is introduced which, assuming a message latency of α and per-process inverse bandwidth of β, executes in at most (Equation Presented.) time using p processes. This parallel algorithm was then instantiated in the form of the open-source DistButterfly library for the special case where K(x, y) = exp(iΦ(x, y)), where Φ(x, y) is a black-box, sufficiently smooth, real-valued phase function. Experiments on Blue Gene/Q demonstrate impressive strong-scaling results for important classes of phase functions. Using quasi-uniform sources, hyperbolic Radon transforms, and an analogue of a three-dimensional generalized Radon transform were, respectively, observed to strong-scale from 1-node/16-cores up to 1024-nodes/16,384-cores with greater than 90% and 82% efficiency, respectively. © 2014 Society for Industrial and Applied Mathematics.
Fast parallel event reconstruction
CERN. Geneva
2010-01-01
On-line processing of large data volumes produced in modern HEP experiments requires using maximum capabilities of modern and future many-core CPU and GPU architectures.One of such powerful feature is a SIMD instruction set, which allows packing several data items in one register and to operate on all of them, thus achievingmore operations per clock cycle. Motivated by the idea of using the SIMD unit ofmodern processors, the KF based track fit has been adapted for parallelism, including memory optimization, numerical analysis, vectorization with inline operator overloading, and optimization using SDKs. The speed of the algorithm has been increased in 120000 times with 0.1 ms/track, running in parallel on 16 SPEs of a Cell Blade computer. Running on a Nehalem CPU with 8 cores it shows the processing speed of 52 ns/track using the Intel Threading Building Blocks. The same KF algorithm running on an Nvidia GTX 280 in the CUDA frameworkprovi...
International Nuclear Information System (INIS)
DeHart, Mark D.; Williams, Mark L.; Bowman, Stephen M.
2010-01-01
The SCALE computational architecture has remained basically the same since its inception 30 years ago, although constituent modules and capabilities have changed significantly. This SCALE concept was intended to provide a framework whereby independent codes can be linked to provide a more comprehensive capability than possible with the individual programs - allowing flexibility to address a wide variety of applications. However, the current system was designed originally for mainframe computers with a single CPU and with significantly less memory than today's personal computers. It has been recognized that the present SCALE computation system could be restructured to take advantage of modern hardware and software capabilities, while retaining many of the modular features of the present system. Preliminary work is being done to define specifications and capabilities for a more advanced computational architecture. This paper describes the state of current SCALE development activities and plans for future development. With the release of SCALE 6.1 in 2010, a new phase of evolutionary development will be available to SCALE users within the TRITON and NEWT modules. The SCALE (Standardized Computer Analyses for Licensing Evaluation) code system developed by Oak Ridge National Laboratory (ORNL) provides a comprehensive and integrated package of codes and nuclear data for a wide range of applications in criticality safety, reactor physics, shielding, isotopic depletion and decay, and sensitivity/uncertainty (S/U) analysis. Over the last three years, since the release of version 5.1 in 2006, several important new codes have been introduced within SCALE, and significant advances applied to existing codes. Many of these new features became available with the release of SCALE 6.0 in early 2009. However, beginning with SCALE 6.1, a first generation of parallel computing is being introduced. In addition to near-term improvements, a plan for longer term SCALE enhancement
Parallel Polarization State Generation.
She, Alan; Capasso, Federico
2016-05-17
The control of polarization, an essential property of light, is of wide scientific and technological interest. The general problem of generating arbitrary time-varying states of polarization (SOP) has always been mathematically formulated by a series of linear transformations, i.e. a product of matrices, imposing a serial architecture. Here we show a parallel architecture described by a sum of matrices. The theory is experimentally demonstrated by modulating spatially-separated polarization components of a laser using a digital micromirror device that are subsequently beam combined. This method greatly expands the parameter space for engineering devices that control polarization. Consequently, performance characteristics, such as speed, stability, and spectral range, are entirely dictated by the technologies of optical intensity modulation, including absorption, reflection, emission, and scattering. This opens up important prospects for polarization state generation (PSG) with unique performance characteristics with applications in spectroscopic ellipsometry, spectropolarimetry, communications, imaging, and security.
Neural Parallel Engine: A toolbox for massively parallel neural signal processing.
Tam, Wing-Kin; Yang, Zhi
2018-05-01
Large-scale neural recordings provide detailed information on neuronal activities and can help elicit the underlying neural mechanisms of the brain. However, the computational burden is also formidable when we try to process the huge data stream generated by such recordings. In this study, we report the development of Neural Parallel Engine (NPE), a toolbox for massively parallel neural signal processing on graphical processing units (GPUs). It offers a selection of the most commonly used routines in neural signal processing such as spike detection and spike sorting, including advanced algorithms such as exponential-component-power-component (EC-PC) spike detection and binary pursuit spike sorting. We also propose a new method for detecting peaks in parallel through a parallel compact operation. Our toolbox is able to offer a 5× to 110× speedup compared with its CPU counterparts depending on the algorithms. A user-friendly MATLAB interface is provided to allow easy integration of the toolbox into existing workflows. Previous efforts on GPU neural signal processing only focus on a few rudimentary algorithms, are not well-optimized and often do not provide a user-friendly programming interface to fit into existing workflows. There is a strong need for a comprehensive toolbox for massively parallel neural signal processing. A new toolbox for massively parallel neural signal processing has been created. It can offer significant speedup in processing signals from large-scale recordings up to thousands of channels. Copyright © 2018 Elsevier B.V. All rights reserved.
About Parallel Programming: Paradigms, Parallel Execution and Collaborative Systems
Directory of Open Access Journals (Sweden)
Loredana MOCEAN
2009-01-01
Full Text Available In the last years, there were made efforts for delineation of a stabile and unitary frame, where the problems of logical parallel processing must find solutions at least at the level of imperative languages. The results obtained by now are not at the level of the made efforts. This paper wants to be a little contribution at these efforts. We propose an overview in parallel programming, parallel execution and collaborative systems.
Parallel Framework for Cooperative Processes
Directory of Open Access Journals (Sweden)
Mitică Craus
2005-01-01
Full Text Available This paper describes the work of an object oriented framework designed to be used in the parallelization of a set of related algorithms. The idea behind the system we are describing is to have a re-usable framework for running several sequential algorithms in a parallel environment. The algorithms that the framework can be used with have several things in common: they have to run in cycles and the work should be possible to be split between several "processing units". The parallel framework uses the message-passing communication paradigm and is organized as a master-slave system. Two applications are presented: an Ant Colony Optimization (ACO parallel algorithm for the Travelling Salesman Problem (TSP and an Image Processing (IP parallel algorithm for the Symmetrical Neighborhood Filter (SNF. The implementations of these applications by means of the parallel framework prove to have good performances: approximatively linear speedup and low communication cost.
Parallel Monte Carlo reactor neutronics
International Nuclear Information System (INIS)
Blomquist, R.N.; Brown, F.B.
1994-01-01
The issues affecting implementation of parallel algorithms for large-scale engineering Monte Carlo neutron transport simulations are discussed. For nuclear reactor calculations, these include load balancing, recoding effort, reproducibility, domain decomposition techniques, I/O minimization, and strategies for different parallel architectures. Two codes were parallelized and tested for performance. The architectures employed include SIMD, MIMD-distributed memory, and workstation network with uneven interactive load. Speedups linear with the number of nodes were achieved
DEFF Research Database (Denmark)
Kosbar, Tamer R.; Sofan, Mamdouh A.; Waly, Mohamed A.
2015-01-01
about 6.1 °C when the TFO strand was modified with Z and the Watson-Crick strand with adenine-LNA (AL). The molecular modeling results showed that, in case of nucleobases Y and Z a hydrogen bond (1.69 and 1.72 Å, respectively) was formed between the protonated 3-aminopropyn-1-yl chain and one...... of the phosphate groups in Watson-Crick strand. Also, it was shown that the nucleobase Y made a good stacking and binding with the other nucleobases in the TFO and Watson-Crick duplex, respectively. In contrast, the nucleobase Z with LNA moiety was forced to twist out of plane of Watson-Crick base pair which......The phosphoramidites of DNA monomers of 7-(3-aminopropyn-1-yl)-8-aza-7-deazaadenine (Y) and 7-(3-aminopropyn-1-yl)-8-aza-7-deazaadenine LNA (Z) are synthesized, and the thermal stability at pH 7.2 and 8.2 of anti-parallel triplexes modified with these two monomers is determined. When, the anti...
Parallel consensual neural networks.
Benediktsson, J A; Sveinsson, J R; Ersoy, O K; Swain, P H
1997-01-01
A new type of a neural-network architecture, the parallel consensual neural network (PCNN), is introduced and applied in classification/data fusion of multisource remote sensing and geographic data. The PCNN architecture is based on statistical consensus theory and involves using stage neural networks with transformed input data. The input data are transformed several times and the different transformed data are used as if they were independent inputs. The independent inputs are first classified using the stage neural networks. The output responses from the stage networks are then weighted and combined to make a consensual decision. In this paper, optimization methods are used in order to weight the outputs from the stage networks. Two approaches are proposed to compute the data transforms for the PCNN, one for binary data and another for analog data. The analog approach uses wavelet packets. The experimental results obtained with the proposed approach show that the PCNN outperforms both a conjugate-gradient backpropagation neural network and conventional statistical methods in terms of overall classification accuracy of test data.
Performance studies of the parallel VIM code
International Nuclear Information System (INIS)
Shi, B.; Blomquist, R.N.
1996-01-01
In this paper, the authors evaluate the performance of the parallel version of the VIM Monte Carlo code on the IBM SPx at the High Performance Computing Research Facility at ANL. Three test problems with contrasting computational characteristics were used to assess effects in performance. A statistical method for estimating the inefficiencies due to load imbalance and communication is also introduced. VIM is a large scale continuous energy Monte Carlo radiation transport program and was parallelized using history partitioning, the master/worker approach, and p4 message passing library. Dynamic load balancing is accomplished when the master processor assigns chunks of histories to workers that have completed a previously assigned task, accommodating variations in the lengths of histories, processor speeds, and worker loads. At the end of each batch (generation), the fission sites and tallies are sent from each worker to the master process, contributing to the parallel inefficiency. All communications are between master and workers, and are serial. The SPx is a scalable 128-node parallel supercomputer with high-performance Omega switches of 63 microsec latency and 35 MBytes/sec bandwidth. For uniform and reproducible performance, they used only the 120 identical regular processors (IBM RS/6000) and excluded the remaining eight planet nodes, which may be loaded by other's jobs
Parallel generation of architecture on the GPU
Steinberger, Markus
2014-05-01
In this paper, we present a novel approach for the parallel evaluation of procedural shape grammars on the graphics processing unit (GPU). Unlike previous approaches that are either limited in the kind of shapes they allow, the amount of parallelism they can take advantage of, or both, our method supports state of the art procedural modeling including stochasticity and context-sensitivity. To increase parallelism, we explicitly express independence in the grammar, reduce inter-rule dependencies required for context-sensitive evaluation, and introduce intra-rule parallelism. Our rule scheduling scheme avoids unnecessary back and forth between CPU and GPU and reduces round trips to slow global memory by dynamically grouping rules in on-chip shared memory. Our GPU shape grammar implementation is multiple orders of magnitude faster than the standard in CPU-based rule evaluation, while offering equal expressive power. In comparison to the state of the art in GPU shape grammar derivation, our approach is nearly 50 times faster, while adding support for geometric context-sensitivity. © 2014 The Author(s) Computer Graphics Forum © 2014 The Eurographics Association and John Wiley & Sons Ltd. Published by John Wiley & Sons Ltd.
A Parallel Particle Swarm Optimizer
National Research Council Canada - National Science Library
Schutte, J. F; Fregly, B .J; Haftka, R. T; George, A. D
2003-01-01
.... Motivated by a computationally demanding biomechanical system identification problem, we introduce a parallel implementation of a stochastic population based global optimizer, the Particle Swarm...
Patterns for Parallel Software Design
Ortega-Arjona, Jorge Luis
2010-01-01
Essential reading to understand patterns for parallel programming Software patterns have revolutionized the way we think about how software is designed, built, and documented, and the design of parallel software requires you to consider other particular design aspects and special skills. From clusters to supercomputers, success heavily depends on the design skills of software developers. Patterns for Parallel Software Design presents a pattern-oriented software architecture approach to parallel software design. This approach is not a design method in the classic sense, but a new way of managin
DEFF Research Database (Denmark)
Christensen, Mark Schram; Ehrsson, H Henrik; Nielsen, Jens Bo
2013-01-01
a different network, involving bilateral dorsal premotor cortex (PMd), primary motor cortex, and SMA, was more active when subjects viewed parallel movements while performing either symmetrical or parallel movements. Correlations between behavioral instability and brain activity were present in right lateral...... adduction-abduction movements symmetrically or in parallel with real-time congruent or incongruent visual feedback of the movements. One network, consisting of bilateral superior and middle frontal gyrus and supplementary motor area (SMA), was more active when subjects performed parallel movements, whereas...
Scalable parallel prefix solvers for discrete ordinates transport
International Nuclear Information System (INIS)
Pautz, S.; Pandya, T.; Adams, M.
2009-01-01
The well-known 'sweep' algorithm for inverting the streaming-plus-collision term in first-order deterministic radiation transport calculations has some desirable numerical properties. However, it suffers from parallel scaling issues caused by a lack of concurrency. The maximum degree of concurrency, and thus the maximum parallelism, grows more slowly than the problem size for sweeps-based solvers. We investigate a new class of parallel algorithms that involves recasting the streaming-plus-collision problem in prefix form and solving via cyclic reduction. This method, although computationally more expensive at low levels of parallelism than the sweep algorithm, offers better theoretical scalability properties. Previous work has demonstrated this approach for one-dimensional calculations; we show how to extend it to multidimensional calculations. Notably, for multiple dimensions it appears that this approach is limited to long-characteristics discretizations; other discretizations cannot be cast in prefix form. We implement two variants of the algorithm within the radlib/SCEPTRE transport code library at Sandia National Laboratories and show results on two different massively parallel systems. Both the 'forward' and 'symmetric' solvers behave similarly, scaling well to larger degrees of parallelism then sweeps-based solvers. We do observe some issues at the highest levels of parallelism (relative to the system size) and discuss possible causes. We conclude that this approach shows good potential for future parallel systems, but the parallel scalability will depend heavily on the architecture of the communication networks of these systems. (authors)
Preoperative screening: value of previous tests.
Macpherson, D S; Snow, R; Lofgren, R P
1990-12-15
To determine the frequency of tests done in the year before elective surgery that might substitute for preoperative screening tests and to determine the frequency of test results that change from a normal value to a value likely to alter perioperative management. Retrospective cohort analysis of computerized laboratory data (complete blood count, sodium, potassium, and creatinine levels, prothrombin time, and partial thromboplastin time). Urban tertiary care Veterans Affairs Hospital. Consecutive sample of 1109 patients who had elective surgery in 1988. At admission, 7549 preoperative tests were done, 47% of which duplicated tests performed in the previous year. Of 3096 previous results that were normal as defined by hospital reference range and done closest to the time of but before admission (median interval, 2 months), 13 (0.4%; 95% CI, 0.2% to 0.7%), repeat values were outside a range considered acceptable for surgery. Most of the abnormalities were predictable from the patient's history, and most were not noted in the medical record. Of 461 previous tests that were abnormal, 78 (17%; CI, 13% to 20%) repeat values at admission were outside a range considered acceptable for surgery (P less than 0.001, frequency of clinically important abnormalities of patients with normal previous results with those with abnormal previous results). Physicians evaluating patients preoperatively could safely substitute the previous test results analyzed in this study for preoperative screening tests if the previous tests are normal and no obvious indication for retesting is present.
PARALLEL IMPORT: REALITY FOR RUSSIA
Directory of Open Access Journals (Sweden)
Т. А. Сухопарова
2014-01-01
Full Text Available Problem of parallel import is urgent question at now. Parallel import legalization in Russia is expedient. Such statement based on opposite experts opinion analysis. At the same time it’s necessary to negative consequences consider of this decision and to apply remedies to its minimization.Purchase on Elibrary.ru > Buy now
Large amplitude parallel propagating electromagnetic oscillitons
International Nuclear Information System (INIS)
Cattaert, Tom; Verheest, Frank
2005-01-01
Earlier systematic nonlinear treatments of parallel propagating electromagnetic waves have been given within a fluid dynamic approach, in a frame where the nonlinear structures are stationary and various constraining first integrals can be obtained. This has lead to the concept of oscillitons that has found application in various space plasmas. The present paper differs in three main aspects from the previous studies: first, the invariants are derived in the plasma frame, as customary in the Sagdeev method, thus retaining in Maxwell's equations all possible effects. Second, a single differential equation is obtained for the parallel fluid velocity, in a form reminiscent of the Sagdeev integrals, hence allowing a fully nonlinear discussion of the oscilliton properties, at such amplitudes as the underlying Mach number restrictions allow. Third, the transition to weakly nonlinear whistler oscillitons is done in an analytical rather than a numerical fashion
The Galley Parallel File System
Nieuwejaar, Nils; Kotz, David
1996-01-01
Most current multiprocessor file systems are designed to use multiple disks in parallel, using the high aggregate bandwidth to meet the growing I/0 requirements of parallel scientific applications. Many multiprocessor file systems provide applications with a conventional Unix-like interface, allowing the application to access multiple disks transparently. This interface conceals the parallelism within the file system, increasing the ease of programmability, but making it difficult or impossible for sophisticated programmers and libraries to use knowledge about their I/O needs to exploit that parallelism. In addition to providing an insufficient interface, most current multiprocessor file systems are optimized for a different workload than they are being asked to support. We introduce Galley, a new parallel file system that is intended to efficiently support realistic scientific multiprocessor workloads. We discuss Galley's file structure and application interface, as well as the performance advantages offered by that interface.
Parallelization of the FLAPW method
International Nuclear Information System (INIS)
Canning, A.; Mannstadt, W.; Freeman, A.J.
1999-01-01
The FLAPW (full-potential linearized-augmented plane-wave) method is one of the most accurate first-principles methods for determining electronic and magnetic properties of crystals and surfaces. Until the present work, the FLAPW method has been limited to systems of less than about one hundred atoms due to a lack of an efficient parallel implementation to exploit the power and memory of parallel computers. In this work we present an efficient parallelization of the method by division among the processors of the plane-wave components for each state. The code is also optimized for RISC (reduced instruction set computer) architectures, such as those found on most parallel computers, making full use of BLAS (basic linear algebra subprograms) wherever possible. Scaling results are presented for systems of up to 686 silicon atoms and 343 palladium atoms per unit cell, running on up to 512 processors on a CRAY T3E parallel computer
Parallelization of the FLAPW method
Canning, A.; Mannstadt, W.; Freeman, A. J.
2000-08-01
The FLAPW (full-potential linearized-augmented plane-wave) method is one of the most accurate first-principles methods for determining structural, electronic and magnetic properties of crystals and surfaces. Until the present work, the FLAPW method has been limited to systems of less than about a hundred atoms due to the lack of an efficient parallel implementation to exploit the power and memory of parallel computers. In this work, we present an efficient parallelization of the method by division among the processors of the plane-wave components for each state. The code is also optimized for RISC (reduced instruction set computer) architectures, such as those found on most parallel computers, making full use of BLAS (basic linear algebra subprograms) wherever possible. Scaling results are presented for systems of up to 686 silicon atoms and 343 palladium atoms per unit cell, running on up to 512 processors on a CRAY T3E parallel supercomputer.
Automatic electromagnetic valve for previous vacuum
International Nuclear Information System (INIS)
Granados, C. E.; Martin, F.
1959-01-01
A valve which permits the maintenance of an installation vacuum when electric current fails is described. It also lets the air in the previous vacuum bomb to prevent the oil ascending in the vacuum tubes. (Author)
Parallel computing techniques for rotorcraft aerodynamics
Ekici, Kivanc
The modification of unsteady three-dimensional Navier-Stokes codes for application on massively parallel and distributed computing environments is investigated. The Euler/Navier-Stokes code TURNS (Transonic Unsteady Rotor Navier-Stokes) was chosen as a test bed because of its wide use by universities and industry. For the efficient implementation of TURNS on parallel computing systems, two algorithmic changes are developed. First, main modifications to the implicit operator, Lower-Upper Symmetric Gauss Seidel (LU-SGS) originally used in TURNS, is performed. Second, application of an inexact Newton method, coupled with a Krylov subspace iterative method (Newton-Krylov method) is carried out. Both techniques have been tried previously for the Euler equations mode of the code. In this work, we have extended the methods to the Navier-Stokes mode. Several new implicit operators were tried because of convergence problems of traditional operators with the high cell aspect ratio (CAR) grids needed for viscous calculations on structured grids. Promising results for both Euler and Navier-Stokes cases are presented for these operators. For the efficient implementation of Newton-Krylov methods to the Navier-Stokes mode of TURNS, efficient preconditioners must be used. The parallel implicit operators used in the previous step are employed as preconditioners and the results are compared. The Message Passing Interface (MPI) protocol has been used because of its portability to various parallel architectures. It should be noted that the proposed methodology is general and can be applied to several other CFD codes (e.g. OVERFLOW).
Integrative Dynamic Reconfiguration in a Parallel Stream Processing Engine
DEFF Research Database (Denmark)
Madsen, Kasper Grud Skat; Zhou, Yongluan; Cao, Jianneng
2017-01-01
Load balancing, operator instance collocations and horizontal scaling are critical issues in Parallel Stream Processing Engines to achieve low data processing latency, optimized cluster utilization and minimized communication cost respectively. In previous work, these issues are typically tackled...... solution called ALBIC, which support general jobs. We implement the proposed techniques on top of Apache Storm, an open-source Parallel Stream Processing Engine. The extensive experimental results over both synthetic and real datasets show that our techniques clearly outperform existing approaches....
Sensitivity Analysis of the Proximal-Based Parallel Decomposition Methods
Directory of Open Access Journals (Sweden)
Feng Ma
2014-01-01
Full Text Available The proximal-based parallel decomposition methods were recently proposed to solve structured convex optimization problems. These algorithms are eligible for parallel computation and can be used efficiently for solving large-scale separable problems. In this paper, compared with the previous theoretical results, we show that the range of the involved parameters can be enlarged while the convergence can be still established. Preliminary numerical tests on stable principal component pursuit problem testify to the advantages of the enlargement.
AZTEC: A parallel iterative package for the solving linear systems
Energy Technology Data Exchange (ETDEWEB)
Hutchinson, S.A.; Shadid, J.N.; Tuminaro, R.S. [Sandia National Labs., Albuquerque, NM (United States)
1996-12-31
We describe a parallel linear system package, AZTEC. The package incorporates a number of parallel iterative methods (e.g. GMRES, biCGSTAB, CGS, TFQMR) and preconditioners (e.g. Jacobi, Gauss-Seidel, polynomial, domain decomposition with LU or ILU within subdomains). Additionally, AZTEC allows for the reuse of previous preconditioning factorizations within Newton schemes for nonlinear methods. Currently, a number of different users are using this package to solve a variety of PDE applications.
Is Monte Carlo embarrassingly parallel?
Energy Technology Data Exchange (ETDEWEB)
Hoogenboom, J. E. [Delft Univ. of Technology, Mekelweg 15, 2629 JB Delft (Netherlands); Delft Nuclear Consultancy, IJsselzoom 2, 2902 LB Capelle aan den IJssel (Netherlands)
2012-07-01
Monte Carlo is often stated as being embarrassingly parallel. However, running a Monte Carlo calculation, especially a reactor criticality calculation, in parallel using tens of processors shows a serious limitation in speedup and the execution time may even increase beyond a certain number of processors. In this paper the main causes of the loss of efficiency when using many processors are analyzed using a simple Monte Carlo program for criticality. The basic mechanism for parallel execution is MPI. One of the bottlenecks turn out to be the rendez-vous points in the parallel calculation used for synchronization and exchange of data between processors. This happens at least at the end of each cycle for fission source generation in order to collect the full fission source distribution for the next cycle and to estimate the effective multiplication factor, which is not only part of the requested results, but also input to the next cycle for population control. Basic improvements to overcome this limitation are suggested and tested. Also other time losses in the parallel calculation are identified. Moreover, the threading mechanism, which allows the parallel execution of tasks based on shared memory using OpenMP, is analyzed in detail. Recommendations are given to get the maximum efficiency out of a parallel Monte Carlo calculation. (authors)
Is Monte Carlo embarrassingly parallel?
International Nuclear Information System (INIS)
Hoogenboom, J. E.
2012-01-01
Monte Carlo is often stated as being embarrassingly parallel. However, running a Monte Carlo calculation, especially a reactor criticality calculation, in parallel using tens of processors shows a serious limitation in speedup and the execution time may even increase beyond a certain number of processors. In this paper the main causes of the loss of efficiency when using many processors are analyzed using a simple Monte Carlo program for criticality. The basic mechanism for parallel execution is MPI. One of the bottlenecks turn out to be the rendez-vous points in the parallel calculation used for synchronization and exchange of data between processors. This happens at least at the end of each cycle for fission source generation in order to collect the full fission source distribution for the next cycle and to estimate the effective multiplication factor, which is not only part of the requested results, but also input to the next cycle for population control. Basic improvements to overcome this limitation are suggested and tested. Also other time losses in the parallel calculation are identified. Moreover, the threading mechanism, which allows the parallel execution of tasks based on shared memory using OpenMP, is analyzed in detail. Recommendations are given to get the maximum efficiency out of a parallel Monte Carlo calculation. (authors)
Parallel integer sorting with medium and fine-scale parallelism
Dagum, Leonardo
1993-01-01
Two new parallel integer sorting algorithms, queue-sort and barrel-sort, are presented and analyzed in detail. These algorithms do not have optimal parallel complexity, yet they show very good performance in practice. Queue-sort designed for fine-scale parallel architectures which allow the queueing of multiple messages to the same destination. Barrel-sort is designed for medium-scale parallel architectures with a high message passing overhead. The performance results from the implementation of queue-sort on a Connection Machine CM-2 and barrel-sort on a 128 processor iPSC/860 are given. The two implementations are found to be comparable in performance but not as good as a fully vectorized bucket sort on the Cray YMP.
Parallel education: what is it?
Amos, Michelle Peta
2017-01-01
In the history of education it has long been discussed that single-sex and coeducation are the two models of education present in schools. With the introduction of parallel schools over the last 15 years, there has been very little research into this 'new model'. Many people do not understand what it means for a school to be parallel or they confuse a parallel model with co-education, due to the presence of both boys and girls within the one institution. Therefore, the main obj...
Balanced, parallel operation of flashlamps
International Nuclear Information System (INIS)
Carder, B.M.; Merritt, B.T.
1979-01-01
A new energy store, the Compensated Pulsed Alternator (CPA), promises to be a cost effective substitute for capacitors to drive flashlamps that pump large Nd:glass lasers. Because the CPA is large and discrete, it will be necessary that it drive many parallel flashlamp circuits, presenting a problem in equal current distribution. Current division to +- 20% between parallel flashlamps has been achieved, but this is marginal for laser pumping. A method is presented here that provides equal current sharing to about 1%, and it includes fused protection against short circuit faults. The method was tested with eight parallel circuits, including both open-circuit and short-circuit fault tests
77 FR 70176 - Previous Participation Certification
2012-11-23
... participants' previous participation in government programs and ensure that the past record is acceptable prior... information is designed to be 100 percent automated and digital submission of all data and certifications is... government programs and ensure that the past record is acceptable prior to granting approval to participate...
On the Tengiz petroleum deposit previous study
International Nuclear Information System (INIS)
Nysangaliev, A.N.; Kuspangaliev, T.K.
1997-01-01
Tengiz petroleum deposit previous study is described. Some consideration about structure of productive formation, specific characteristic properties of petroleum-bearing collectors are presented. Recommendation on their detail study and using of experience on exploration and development of petroleum deposit which have analogy on most important geological and industrial parameters are given. (author)
Subsequent pregnancy outcome after previous foetal death
Nijkamp, J. W.; Korteweg, F. J.; Holm, J. P.; Timmer, A.; Erwich, J. J. H. M.; van Pampus, M. G.
Objective: A history of foetal death is a risk factor for complications and foetal death in subsequent pregnancies as most previous risk factors remain present and an underlying cause of death may recur. The purpose of this study was to evaluate subsequent pregnancy outcome after foetal death and to
Workspace Analysis for Parallel Robot
Directory of Open Access Journals (Sweden)
Ying Sun
2013-05-01
Full Text Available As a completely new-type of robot, the parallel robot possesses a lot of advantages that the serial robot does not, such as high rigidity, great load-carrying capacity, small error, high precision, small self-weight/load ratio, good dynamic behavior and easy control, hence its range is extended in using domain. In order to find workspace of parallel mechanism, the numerical boundary-searching algorithm based on the reverse solution of kinematics and limitation of link length has been introduced. This paper analyses position workspace, orientation workspace of parallel robot of the six degrees of freedom. The result shows: It is a main means to increase and decrease its workspace to change the length of branch of parallel mechanism; The radius of the movement platform has no effect on the size of workspace, but will change position of workspace.
"Feeling" Series and Parallel Resistances.
Morse, Robert A.
1993-01-01
Equipped with drinking straws and stirring straws, a teacher can help students understand how resistances in electric circuits combine in series and in parallel. Follow-up suggestions are provided. (ZWH)
Parallel encoders for pixel detectors
International Nuclear Information System (INIS)
Nikityuk, N.M.
1991-01-01
A new method of fast encoding and determining the multiplicity and coordinates of fired pixels is described. A specific example construction of parallel encodes and MCC for n=49 and t=2 is given. 16 refs.; 6 figs.; 2 tabs
Massively Parallel Finite Element Programming
Heister, Timo
2010-01-01
Today\\'s large finite element simulations require parallel algorithms to scale on clusters with thousands or tens of thousands of processor cores. We present data structures and algorithms to take advantage of the power of high performance computers in generic finite element codes. Existing generic finite element libraries often restrict the parallelization to parallel linear algebra routines. This is a limiting factor when solving on more than a few hundreds of cores. We describe routines for distributed storage of all major components coupled with efficient, scalable algorithms. We give an overview of our effort to enable the modern and generic finite element library deal.II to take advantage of the power of large clusters. In particular, we describe the construction of a distributed mesh and develop algorithms to fully parallelize the finite element calculation. Numerical results demonstrate good scalability. © 2010 Springer-Verlag.
Event monitoring of parallel computations
Directory of Open Access Journals (Sweden)
Gruzlikov Alexander M.
2015-06-01
Full Text Available The paper considers the monitoring of parallel computations for detection of abnormal events. It is assumed that computations are organized according to an event model, and monitoring is based on specific test sequences
Massively Parallel Finite Element Programming
Heister, Timo; Kronbichler, Martin; Bangerth, Wolfgang
2010-01-01
Today's large finite element simulations require parallel algorithms to scale on clusters with thousands or tens of thousands of processor cores. We present data structures and algorithms to take advantage of the power of high performance computers in generic finite element codes. Existing generic finite element libraries often restrict the parallelization to parallel linear algebra routines. This is a limiting factor when solving on more than a few hundreds of cores. We describe routines for distributed storage of all major components coupled with efficient, scalable algorithms. We give an overview of our effort to enable the modern and generic finite element library deal.II to take advantage of the power of large clusters. In particular, we describe the construction of a distributed mesh and develop algorithms to fully parallelize the finite element calculation. Numerical results demonstrate good scalability. © 2010 Springer-Verlag.
The STAPL Parallel Graph Library
Harshvardhan,
2013-01-01
This paper describes the stapl Parallel Graph Library, a high-level framework that abstracts the user from data-distribution and parallelism details and allows them to concentrate on parallel graph algorithm development. It includes a customizable distributed graph container and a collection of commonly used parallel graph algorithms. The library introduces pGraph pViews that separate algorithm design from the container implementation. It supports three graph processing algorithmic paradigms, level-synchronous, asynchronous and coarse-grained, and provides common graph algorithms based on them. Experimental results demonstrate improved scalability in performance and data size over existing graph libraries on more than 16,000 cores and on internet-scale graphs containing over 16 billion vertices and 250 billion edges. © Springer-Verlag Berlin Heidelberg 2013.
Writing parallel programs that work
CERN. Geneva
2012-01-01
Serial algorithms typically run inefficiently on parallel machines. This may sound like an obvious statement, but it is the root cause of why parallel programming is considered to be difficult. The current state of the computer industry is still that almost all programs in existence are serial. This talk will describe the techniques used in the Intel Parallel Studio to provide a developer with the tools necessary to understand the behaviors and limitations of the existing serial programs. Once the limitations are known the developer can refactor the algorithms and reanalyze the resulting programs with the tools in the Intel Parallel Studio to create parallel programs that work. About the speaker Paul Petersen is a Sr. Principal Engineer in the Software and Solutions Group (SSG) at Intel. He received a Ph.D. degree in Computer Science from the University of Illinois in 1993. After UIUC, he was employed at Kuck and Associates, Inc. (KAI) working on auto-parallelizing compiler (KAP), and was involved in th...
Subsequent childbirth after a previous traumatic birth.
Beck, Cheryl Tatano; Watson, Sue
2010-01-01
Nine percent of new mothers in the United States who participated in the Listening to Mothers II Postpartum Survey screened positive for meeting the Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition criteria for posttraumatic stress disorder after childbirth. Women who have had a traumatic birth experience report fewer subsequent children and a longer length of time before their second baby. Childbirth-related posttraumatic stress disorder impacts couples' physical relationship, communication, conflict, emotions, and bonding with their children. The purpose of this study was to describe the meaning of women's experiences of a subsequent childbirth after a previous traumatic birth. Phenomenology was the research design used. An international sample of 35 women participated in this Internet study. Women were asked, "Please describe in as much detail as you can remember your subsequent pregnancy, labor, and delivery following your previous traumatic birth." Colaizzi's phenomenological data analysis approach was used to analyze the stories of the 35 women. Data analysis yielded four themes: (a) riding the turbulent wave of panic during pregnancy; (b) strategizing: attempts to reclaim their body and complete the journey to motherhood; (c) bringing reverence to the birthing process and empowering women; and (d) still elusive: the longed-for healing birth experience. Subsequent childbirth after a previous birth trauma has the potential to either heal or retraumatize women. During pregnancy, women need permission and encouragement to grieve their prior traumatic births to help remove the burden of their invisible pain.
Archer, Charles J.; Blocksome, Michael A.; Ratterman, Joseph D.; Smith, Brian E.
2014-08-12
Endpoint-based parallel data processing in a parallel active messaging interface (`PAMI`) of a parallel computer, the PAMI composed of data communications endpoints, each endpoint including a specification of data communications parameters for a thread of execution on a compute node, including specifications of a client, a context, and a task, the compute nodes coupled for data communications through the PAMI, including establishing a data communications geometry, the geometry specifying, for tasks representing processes of execution of the parallel application, a set of endpoints that are used in collective operations of the PAMI including a plurality of endpoints for one of the tasks; receiving in endpoints of the geometry an instruction for a collective operation; and executing the instruction for a collective operation through the endpoints in dependence upon the geometry, including dividing data communications operations among the plurality of endpoints for one of the tasks.
Parallel Implicit Algorithms for CFD
Keyes, David E.
1998-01-01
The main goal of this project was efficient distributed parallel and workstation cluster implementations of Newton-Krylov-Schwarz (NKS) solvers for implicit Computational Fluid Dynamics (CFD.) "Newton" refers to a quadratically convergent nonlinear iteration using gradient information based on the true residual, "Krylov" to an inner linear iteration that accesses the Jacobian matrix only through highly parallelizable sparse matrix-vector products, and "Schwarz" to a domain decomposition form of preconditioning the inner Krylov iterations with primarily neighbor-only exchange of data between the processors. Prior experience has established that Newton-Krylov methods are competitive solvers in the CFD context and that Krylov-Schwarz methods port well to distributed memory computers. The combination of the techniques into Newton-Krylov-Schwarz was implemented on 2D and 3D unstructured Euler codes on the parallel testbeds that used to be at LaRC and on several other parallel computers operated by other agencies or made available by the vendors. Early implementations were made directly in Massively Parallel Integration (MPI) with parallel solvers we adapted from legacy NASA codes and enhanced for full NKS functionality. Later implementations were made in the framework of the PETSC library from Argonne National Laboratory, which now includes pseudo-transient continuation Newton-Krylov-Schwarz solver capability (as a result of demands we made upon PETSC during our early porting experiences). A secondary project pursued with funding from this contract was parallel implicit solvers in acoustics, specifically in the Helmholtz formulation. A 2D acoustic inverse problem has been solved in parallel within the PETSC framework.
Second derivative parallel block backward differentiation type ...
African Journals Online (AJOL)
Second derivative parallel block backward differentiation type formulas for Stiff ODEs. ... Log in or Register to get access to full text downloads. ... and the methods are inherently parallel and can be distributed over parallel processors. They are ...
A Parallel Approach to Fractal Image Compression
Lubomir Dedera
2004-01-01
The paper deals with a parallel approach to coding and decoding algorithms in fractal image compressionand presents experimental results comparing sequential and parallel algorithms from the point of view of achieved bothcoding and decoding time and effectiveness of parallelization.
Parallel transport of long mean-free-path plasma along open magnetic field lines: Parallel heat flux
International Nuclear Information System (INIS)
Guo Zehua; Tang Xianzhu
2012-01-01
In a long mean-free-path plasma where temperature anisotropy can be sustained, the parallel heat flux has two components with one associated with the parallel thermal energy and the other the perpendicular thermal energy. Due to the large deviation of the distribution function from local Maxwellian in an open field line plasma with low collisionality, the conventional perturbative calculation of the parallel heat flux closure in its local or non-local form is no longer applicable. Here, a non-perturbative calculation is presented for a collisionless plasma in a two-dimensional flux expander bounded by absorbing walls. Specifically, closures of previously unfamiliar form are obtained for ions and electrons, which relate two distinct components of the species parallel heat flux to the lower order fluid moments such as density, parallel flow, parallel and perpendicular temperatures, and the field quantities such as the magnetic field strength and the electrostatic potential. The plasma source and boundary condition at the absorbing wall enter explicitly in the closure calculation. Although the closure calculation does not take into account wave-particle interactions, the results based on passing orbits from steady-state collisionless drift-kinetic equation show remarkable agreement with fully kinetic-Maxwell simulations. As an example of the physical implications of the theory, the parallel heat flux closures are found to predict a surprising observation in the kinetic-Maxwell simulation of the 2D magnetic flux expander problem, where the parallel heat flux of the parallel thermal energy flows from low to high parallel temperature region.
A Parallel Particle Swarm Optimization Algorithm Accelerated by Asynchronous Evaluations
Venter, Gerhard; Sobieszczanski-Sobieski, Jaroslaw
2005-01-01
A parallel Particle Swarm Optimization (PSO) algorithm is presented. Particle swarm optimization is a fairly recent addition to the family of non-gradient based, probabilistic search algorithms that is based on a simplified social model and is closely tied to swarming theory. Although PSO algorithms present several attractive properties to the designer, they are plagued by high computational cost as measured by elapsed time. One approach to reduce the elapsed time is to make use of coarse-grained parallelization to evaluate the design points. Previous parallel PSO algorithms were mostly implemented in a synchronous manner, where all design points within a design iteration are evaluated before the next iteration is started. This approach leads to poor parallel speedup in cases where a heterogeneous parallel environment is used and/or where the analysis time depends on the design point being analyzed. This paper introduces an asynchronous parallel PSO algorithm that greatly improves the parallel e ciency. The asynchronous algorithm is benchmarked on a cluster assembled of Apple Macintosh G5 desktop computers, using the multi-disciplinary optimization of a typical transport aircraft wing as an example.
Beam dynamics simulations using a parallel version of PARMILA
International Nuclear Information System (INIS)
Ryne, R.D.
1996-01-01
The computer code PARMILA has been the primary tool for the design of proton and ion linacs in the United States for nearly three decades. Previously it was sufficient to perform simulations with of order 10000 particles, but recently the need to perform high resolution halo studies for next-generation, high intensity linacs has made it necessary to perform simulations with of order 100 million particles. With the advent of massively parallel computers such simulations are now within reach. Parallel computers already make it possible, for example, to perform beam dynamics calculations with tens of millions of particles, requiring over 10 GByte of core memory, in just a few hours. Also, parallel computers are becoming easier to use thanks to the availability of mature, Fortran-like languages such as Connection Machine Fortran and High Performance Fortran. We will describe our experience developing a parallel version of PARMILA and the performance of the new code
Beam dynamics simulations using a parallel version of PARMILA
International Nuclear Information System (INIS)
Ryne, Robert
1996-01-01
The computer code PARMILA has been the primary tool for the design of proton and ion linacs in the United States for nearly three decades. Previously it was sufficient to perform simulations with of order 10000 particles, but recently the need to perform high resolution halo studies for next-generation, high intensity linacs has made it necessary to perform simulations with of order 100 million particles. With the advent of massively parallel computers such simulations are now within reach. Parallel computers already make it possible, for example, to perform beam dynamics calculations with tens of millions of particles, requiring over 10 GByte of core memory, in just a few hours. Also, parallel computers are becoming easier to use thanks to the availability of mature, Fortran-like languages such as Connection Machine Fortran and High Performance Fortran. We will describe our experience developing a parallel version of PARMILA and the performance of the new code. (author)
High Efficiency EBCOT with Parallel Coding Architecture for JPEG2000
Directory of Open Access Journals (Sweden)
Chiang Jen-Shiun
2006-01-01
Full Text Available This work presents a parallel context-modeling coding architecture and a matching arithmetic coder (MQ-coder for the embedded block coding (EBCOT unit of the JPEG2000 encoder. Tier-1 of the EBCOT consumes most of the computation time in a JPEG2000 encoding system. The proposed parallel architecture can increase the throughput rate of the context modeling. To match the high throughput rate of the parallel context-modeling architecture, an efficient pipelined architecture for context-based adaptive arithmetic encoder is proposed. This encoder of JPEG2000 can work at 180 MHz to encode one symbol each cycle. Compared with the previous context-modeling architectures, our parallel architectures can improve the throughput rate up to 25%.
Parallel fabrication of macroporous scaffolds.
Dobos, Andrew; Grandhi, Taraka Sai Pavan; Godeshala, Sudhakar; Meldrum, Deirdre R; Rege, Kaushal
2018-07-01
Scaffolds generated from naturally occurring and synthetic polymers have been investigated in several applications because of their biocompatibility and tunable chemo-mechanical properties. Existing methods for generation of 3D polymeric scaffolds typically cannot be parallelized, suffer from low throughputs, and do not allow for quick and easy removal of the fragile structures that are formed. Current molds used in hydrogel and scaffold fabrication using solvent casting and porogen leaching are often single-use and do not facilitate 3D scaffold formation in parallel. Here, we describe a simple device and related approaches for the parallel fabrication of macroporous scaffolds. This approach was employed for the generation of macroporous and non-macroporous materials in parallel, in higher throughput and allowed for easy retrieval of these 3D scaffolds once formed. In addition, macroporous scaffolds with interconnected as well as non-interconnected pores were generated, and the versatility of this approach was employed for the generation of 3D scaffolds from diverse materials including an aminoglycoside-derived cationic hydrogel ("Amikagel"), poly(lactic-co-glycolic acid) or PLGA, and collagen. Macroporous scaffolds generated using the device were investigated for plasmid DNA binding and cell loading, indicating the use of this approach for developing materials for different applications in biotechnology. Our results demonstrate that the device-based approach is a simple technology for generating scaffolds in parallel, which can enhance the toolbox of current fabrication techniques. © 2018 Wiley Periodicals, Inc.
Parallel plasma fluid turbulence calculations
International Nuclear Information System (INIS)
Leboeuf, J.N.; Carreras, B.A.; Charlton, L.A.; Drake, J.B.; Lynch, V.E.; Newman, D.E.; Sidikman, K.L.; Spong, D.A.
1994-01-01
The study of plasma turbulence and transport is a complex problem of critical importance for fusion-relevant plasmas. To this day, the fluid treatment of plasma dynamics is the best approach to realistic physics at the high resolution required for certain experimentally relevant calculations. Core and edge turbulence in a magnetic fusion device have been modeled using state-of-the-art, nonlinear, three-dimensional, initial-value fluid and gyrofluid codes. Parallel implementation of these models on diverse platforms--vector parallel (National Energy Research Supercomputer Center's CRAY Y-MP C90), massively parallel (Intel Paragon XP/S 35), and serial parallel (clusters of high-performance workstations using the Parallel Virtual Machine protocol)--offers a variety of paths to high resolution and significant improvements in real-time efficiency, each with its own advantages. The largest and most efficient calculations have been performed at the 200 Mword memory limit on the C90 in dedicated mode, where an overlap of 12 to 13 out of a maximum of 16 processors has been achieved with a gyrofluid model of core fluctuations. The richness of the physics captured by these calculations is commensurate with the increased resolution and efficiency and is limited only by the ingenuity brought to the analysis of the massive amounts of data generated
Evaluating parallel optimization on transputers
Directory of Open Access Journals (Sweden)
A.G. Chalmers
2003-12-01
Full Text Available The faster processing power of modern computers and the development of efficient algorithms have made it possible for operations researchers to tackle a much wider range of problems than ever before. Further improvements in processing speed can be achieved utilising relatively inexpensive transputers to process components of an algorithm in parallel. The Davidon-Fletcher-Powell method is one of the most successful and widely used optimisation algorithms for unconstrained problems. This paper examines the algorithm and identifies the components that can be processed in parallel. The results of some experiments with these components are presented which indicates under what conditions parallel processing with an inexpensive configuration is likely to be faster than the traditional sequential implementations. The performance of the whole algorithm with its parallel components is then compared with the original sequential algorithm. The implementation serves to illustrate the practicalities of speeding up typical OR algorithms in terms of difficulty, effort and cost. The results give an indication of the savings in time a given parallel implementation can be expected to yield.
Pattern-Driven Automatic Parallelization
Directory of Open Access Journals (Sweden)
Christoph W. Kessler
1996-01-01
Full Text Available This article describes a knowledge-based system for automatic parallelization of a wide class of sequential numerical codes operating on vectors and dense matrices, and for execution on distributed memory message-passing multiprocessors. Its main feature is a fast and powerful pattern recognition tool that locally identifies frequently occurring computations and programming concepts in the source code. This tool also works for dusty deck codes that have been "encrypted" by former machine-specific code transformations. Successful pattern recognition guides sophisticated code transformations including local algorithm replacement such that the parallelized code need not emerge from the sequential program structure by just parallelizing the loops. It allows access to an expert's knowledge on useful parallel algorithms, available machine-specific library routines, and powerful program transformations. The partially restored program semantics also supports local array alignment, distribution, and redistribution, and allows for faster and more exact prediction of the performance of the parallelized target code than is usually possible.
Parallel artificial liquid membrane extraction
DEFF Research Database (Denmark)
Gjelstad, Astrid; Rasmussen, Knut Einar; Parmer, Marthe Petrine
2013-01-01
This paper reports development of a new approach towards analytical liquid-liquid-liquid membrane extraction termed parallel artificial liquid membrane extraction. A donor plate and acceptor plate create a sandwich, in which each sample (human plasma) and acceptor solution is separated by an arti......This paper reports development of a new approach towards analytical liquid-liquid-liquid membrane extraction termed parallel artificial liquid membrane extraction. A donor plate and acceptor plate create a sandwich, in which each sample (human plasma) and acceptor solution is separated...... by an artificial liquid membrane. Parallel artificial liquid membrane extraction is a modification of hollow-fiber liquid-phase microextraction, where the hollow fibers are replaced by flat membranes in a 96-well plate format....
Efficient Parallel Kernel Solvers for Computational Fluid Dynamics Applications
Sun, Xian-He
1997-01-01
Distributed-memory parallel computers dominate today's parallel computing arena. These machines, such as Intel Paragon, IBM SP2, and Cray Origin2OO, have successfully delivered high performance computing power for solving some of the so-called "grand-challenge" problems. Despite initial success, parallel machines have not been widely accepted in production engineering environments due to the complexity of parallel programming. On a parallel computing system, a task has to be partitioned and distributed appropriately among processors to reduce communication cost and to attain load balance. More importantly, even with careful partitioning and mapping, the performance of an algorithm may still be unsatisfactory, since conventional sequential algorithms may be serial in nature and may not be implemented efficiently on parallel machines. In many cases, new algorithms have to be introduced to increase parallel performance. In order to achieve optimal performance, in addition to partitioning and mapping, a careful performance study should be conducted for a given application to find a good algorithm-machine combination. This process, however, is usually painful and elusive. The goal of this project is to design and develop efficient parallel algorithms for highly accurate Computational Fluid Dynamics (CFD) simulations and other engineering applications. The work plan is 1) developing highly accurate parallel numerical algorithms, 2) conduct preliminary testing to verify the effectiveness and potential of these algorithms, 3) incorporate newly developed algorithms into actual simulation packages. The work plan has well achieved. Two highly accurate, efficient Poisson solvers have been developed and tested based on two different approaches: (1) Adopting a mathematical geometry which has a better capacity to describe the fluid, (2) Using compact scheme to gain high order accuracy in numerical discretization. The previously developed Parallel Diagonal Dominant (PDD) algorithm
Parallel algorithms for mapping pipelined and parallel computations
Nicol, David M.
1988-01-01
Many computational problems in image processing, signal processing, and scientific computing are naturally structured for either pipelined or parallel computation. When mapping such problems onto a parallel architecture it is often necessary to aggregate an obvious problem decomposition. Even in this context the general mapping problem is known to be computationally intractable, but recent advances have been made in identifying classes of problems and architectures for which optimal solutions can be found in polynomial time. Among these, the mapping of pipelined or parallel computations onto linear array, shared memory, and host-satellite systems figures prominently. This paper extends that work first by showing how to improve existing serial mapping algorithms. These improvements have significantly lower time and space complexities: in one case a published O(nm sup 3) time algorithm for mapping m modules onto n processors is reduced to an O(nm log m) time complexity, and its space requirements reduced from O(nm sup 2) to O(m). Run time complexity is further reduced with parallel mapping algorithms based on these improvements, which run on the architecture for which they create the mappings.
Books average previous decade of economic misery.
Bentley, R Alexander; Acerbi, Alberto; Ormerod, Paul; Lampos, Vasileios
2014-01-01
For the 20(th) century since the Depression, we find a strong correlation between a 'literary misery index' derived from English language books and a moving average of the previous decade of the annual U.S. economic misery index, which is the sum of inflation and unemployment rates. We find a peak in the goodness of fit at 11 years for the moving average. The fit between the two misery indices holds when using different techniques to measure the literary misery index, and this fit is significantly better than other possible correlations with different emotion indices. To check the robustness of the results, we also analysed books written in German language and obtained very similar correlations with the German economic misery index. The results suggest that millions of books published every year average the authors' shared economic experiences over the past decade.
Books Average Previous Decade of Economic Misery
Bentley, R. Alexander; Acerbi, Alberto; Ormerod, Paul; Lampos, Vasileios
2014-01-01
For the 20th century since the Depression, we find a strong correlation between a ‘literary misery index’ derived from English language books and a moving average of the previous decade of the annual U.S. economic misery index, which is the sum of inflation and unemployment rates. We find a peak in the goodness of fit at 11 years for the moving average. The fit between the two misery indices holds when using different techniques to measure the literary misery index, and this fit is significantly better than other possible correlations with different emotion indices. To check the robustness of the results, we also analysed books written in German language and obtained very similar correlations with the German economic misery index. The results suggest that millions of books published every year average the authors' shared economic experiences over the past decade. PMID:24416159
Cellular automata a parallel model
Mazoyer, J
1999-01-01
Cellular automata can be viewed both as computational models and modelling systems of real processes. This volume emphasises the first aspect. In articles written by leading researchers, sophisticated massive parallel algorithms (firing squad, life, Fischer's primes recognition) are treated. Their computational power and the specific complexity classes they determine are surveyed, while some recent results in relation to chaos from a new dynamic systems point of view are also presented. Audience: This book will be of interest to specialists of theoretical computer science and the parallelism challenge.
Some aspects of radial flow between parallel disks
International Nuclear Information System (INIS)
Tabatabai, M.; Pollard, A.
1985-01-01
Radial flow of air between two closely spaced parallel disks is examined experimentally. A comprehensive review of the previous work performed on similar flow situations is given by Tabatabai and Pollard. The present paper is a discussion of some of the results obtained so far and offers some observations on the decay of turbulence in this flow. (author)
Underestimation of Severity of Previous Whiplash Injuries
Naqui, SZH; Lovell, SJ; Lovell, ME
2008-01-01
INTRODUCTION We noted a report that more significant symptoms may be expressed after second whiplash injuries by a suggested cumulative effect, including degeneration. We wondered if patients were underestimating the severity of their earlier injury. PATIENTS AND METHODS We studied recent medicolegal reports, to assess subjects with a second whiplash injury. They had been asked whether their earlier injury was worse, the same or lesser in severity. RESULTS From the study cohort, 101 patients (87%) felt that they had fully recovered from their first injury and 15 (13%) had not. Seventy-six subjects considered their first injury of lesser severity, 24 worse and 16 the same. Of the 24 that felt the violence of their first accident was worse, only 8 had worse symptoms, and 16 felt their symptoms were mainly the same or less than their symptoms from their second injury. Statistical analysis of the data revealed that the proportion of those claiming a difference who said the previous injury was lesser was 76% (95% CI 66–84%). The observed proportion with a lesser injury was considerably higher than the 50% anticipated. CONCLUSIONS We feel that subjects may underestimate the severity of an earlier injury and associated symptoms. Reasons for this may include secondary gain rather than any proposed cumulative effect. PMID:18201501
[Electronic cigarettes - effects on health. Previous reports].
Napierała, Marta; Kulza, Maksymilian; Wachowiak, Anna; Jabłecka, Katarzyna; Florek, Ewa
2014-01-01
Currently very popular in the market of tobacco products have gained electronic cigarettes (ang. E-cigarettes). These products are considered to be potentially less harmful in compared to traditional tobacco products. However, current reports indicate that the statements of the producers regarding to the composition of the e- liquids not always are sufficient, and consumers often do not have reliable information on the quality of the product used by them. This paper contain a review of previous reports on the composition of e-cigarettes and their impact on health. Most of the observed health effects was related to symptoms of the respiratory tract, mouth, throat, neurological complications and sensory organs. Particularly hazardous effects of the e-cigarettes were: pneumonia, congestive heart failure, confusion, convulsions, hypotension, aspiration pneumonia, face second-degree burns, blindness, chest pain and rapid heartbeat. In the literature there is no information relating to passive exposure by the aerosols released during e-cigarette smoking. Furthermore, the information regarding to the use of these products in the long term are not also available.
Parallel Sparse Matrix - Vector Product
DEFF Research Database (Denmark)
Alexandersen, Joe; Lazarov, Boyan Stefanov; Dammann, Bernd
This technical report contains a case study of a sparse matrix-vector product routine, implemented for parallel execution on a compute cluster with both pure MPI and hybrid MPI-OpenMP solutions. C++ classes for sparse data types were developed and the report shows how these class can be used...
[Falsified medicines in parallel trade].
Muckenfuß, Heide
2017-11-01
The number of falsified medicines on the German market has distinctly increased over the past few years. In particular, stolen pharmaceutical products, a form of falsified medicines, have increasingly been introduced into the legal supply chain via parallel trading. The reasons why parallel trading serves as a gateway for falsified medicines are most likely the complex supply chains and routes of transport. It is hardly possible for national authorities to trace the history of a medicinal product that was bought and sold by several intermediaries in different EU member states. In addition, the heterogeneous outward appearance of imported and relabelled pharmaceutical products facilitates the introduction of illegal products onto the market. Official batch release at the Paul-Ehrlich-Institut offers the possibility of checking some aspects that might provide an indication of a falsified medicine. In some circumstances, this may allow the identification of falsified medicines before they come onto the German market. However, this control is only possible for biomedicinal products that have not received a waiver regarding official batch release. For improved control of parallel trade, better networking among the EU member states would be beneficial. European-wide regulations, e. g., for disclosure of the complete supply chain, would help to minimise the risks of parallel trading and hinder the marketing of falsified medicines.
The parallel adult education system
DEFF Research Database (Denmark)
Wahlgren, Bjarne
2015-01-01
for competence development. The Danish university educational system includes two parallel programs: a traditional academic track (candidatus) and an alternative practice-based track (master). The practice-based program was established in 2001 and organized as part time. The total program takes half the time...
Where are the parallel algorithms?
Voigt, R. G.
1985-01-01
Four paradigms that can be useful in developing parallel algorithms are discussed. These include computational complexity analysis, changing the order of computation, asynchronous computation, and divide and conquer. Each is illustrated with an example from scientific computation, and it is shown that computational complexity must be used with great care or an inefficient algorithm may be selected.
Parallel imaging with phase scrambling.
Zaitsev, Maxim; Schultz, Gerrit; Hennig, Juergen; Gruetter, Rolf; Gallichan, Daniel
2015-04-01
Most existing methods for accelerated parallel imaging in MRI require additional data, which are used to derive information about the sensitivity profile of each radiofrequency (RF) channel. In this work, a method is presented to avoid the acquisition of separate coil calibration data for accelerated Cartesian trajectories. Quadratic phase is imparted to the image to spread the signals in k-space (aka phase scrambling). By rewriting the Fourier transform as a convolution operation, a window can be introduced to the convolved chirp function, allowing a low-resolution image to be reconstructed from phase-scrambled data without prominent aliasing. This image (for each RF channel) can be used to derive coil sensitivities to drive existing parallel imaging techniques. As a proof of concept, the quadratic phase was applied by introducing an offset to the x(2) - y(2) shim and the data were reconstructed using adapted versions of the image space-based sensitivity encoding and GeneRalized Autocalibrating Partially Parallel Acquisitions algorithms. The method is demonstrated in a phantom (1 × 2, 1 × 3, and 2 × 2 acceleration) and in vivo (2 × 2 acceleration) using a 3D gradient echo acquisition. Phase scrambling can be used to perform parallel imaging acceleration without acquisition of separate coil calibration data, demonstrated here for a 3D-Cartesian trajectory. Further research is required to prove the applicability to other 2D and 3D sampling schemes. © 2014 Wiley Periodicals, Inc.
Default Parallels Plesk Panel Page
services that small businesses want and need. Our software includes key building blocks of cloud service virtualized servers Service Provider Products ParallelsÂ® Automation Hosting, SaaS, and cloud computing , the leading hosting automation software. You see this page because there is no Web site at this
Parallel plate transmission line transformer
Voeten, S.J.; Brussaard, G.J.H.; Pemen, A.J.M.
2011-01-01
A Transmission Line Transformer (TLT) can be used to transform high-voltage nanosecond pulses. These transformers rely on the fact that the length of the pulse is shorter than the transmission lines used. This allows connecting the transmission lines in parallel at the input and in series at the
Matpar: Parallel Extensions for MATLAB
Springer, P. L.
1998-01-01
Matpar is a set of client/server software that allows a MATLAB user to take advantage of a parallel computer for very large problems. The user can replace calls to certain built-in MATLAB functions with calls to Matpar functions.
Massively parallel quantum computer simulator
De Raedt, K.; Michielsen, K.; De Raedt, H.; Trieu, B.; Arnold, G.; Richter, M.; Lippert, Th.; Watanabe, H.; Ito, N.
2007-01-01
We describe portable software to simulate universal quantum computers on massive parallel Computers. We illustrate the use of the simulation software by running various quantum algorithms on different computer architectures, such as a IBM BlueGene/L, a IBM Regatta p690+, a Hitachi SR11000/J1, a Cray
Parallel computing: numerics, applications, and trends
National Research Council Canada - National Science Library
Trobec, Roman; Vajteršic, Marián; Zinterhof, Peter
2009-01-01
... and/or distributed systems. The contributions to this book are focused on topics most concerned in the trends of today's parallel computing. These range from parallel algorithmics, programming, tools, network computing to future parallel computing. Particular attention is paid to parallel numerics: linear algebra, differential equations, numerica...
Experiments with parallel algorithms for combinatorial problems
G.A.P. Kindervater (Gerard); H.W.J.M. Trienekens
1985-01-01
textabstractIn the last decade many models for parallel computation have been proposed and many parallel algorithms have been developed. However, few of these models have been realized and most of these algorithms are supposed to run on idealized, unrealistic parallel machines. The parallel machines
International Nuclear Information System (INIS)
Heggarty, J.W.
1999-06-01
For almost thirty years, sequential R-matrix computation has been used by atomic physics research groups, from around the world, to model collision phenomena involving the scattering of electrons or positrons with atomic or molecular targets. As considerable progress has been made in the understanding of fundamental scattering processes, new data, obtained from more complex calculations, is of current interest to experimentalists. Performing such calculations, however, places considerable demands on the computational resources to be provided by the target machine, in terms of both processor speed and memory requirement. Indeed, in some instances the computational requirements are so great that the proposed R-matrix calculations are intractable, even when utilising contemporary classic supercomputers. Historically, increases in the computational requirements of R-matrix computation were accommodated by porting the problem codes to a more powerful classic supercomputer. Although this approach has been successful in the past, it is no longer considered to be a satisfactory solution due to the limitations of current (and future) Von Neumann machines. As a consequence, there has been considerable interest in the high performance multicomputers, that have emerged over the last decade which appear to offer the computational resources required by contemporary R-matrix research. Unfortunately, developing codes for these machines is not as simple a task as it was to develop codes for successive classic supercomputers. The difficulty arises from the considerable differences in the computing models that exist between the two types of machine and results in the programming of multicomputers to be widely acknowledged as a difficult, time consuming and error-prone task. Nevertheless, unless parallel R-matrix computation is realised, important theoretical and experimental atomic physics research will continue to be hindered. This thesis describes work that was undertaken in
The numerical parallel computing of photon transport
International Nuclear Information System (INIS)
Huang Qingnan; Liang Xiaoguang; Zhang Lifa
1998-12-01
The parallel computing of photon transport is investigated, the parallel algorithm and the parallelization of programs on parallel computers both with shared memory and with distributed memory are discussed. By analyzing the inherent law of the mathematics and physics model of photon transport according to the structure feature of parallel computers, using the strategy of 'to divide and conquer', adjusting the algorithm structure of the program, dissolving the data relationship, finding parallel liable ingredients and creating large grain parallel subtasks, the sequential computing of photon transport into is efficiently transformed into parallel and vector computing. The program was run on various HP parallel computers such as the HY-1 (PVP), the Challenge (SMP) and the YH-3 (MPP) and very good parallel speedup has been gotten
Mediastinal involvement in lymphangiomatosis: a previously unreported MRI sign
Energy Technology Data Exchange (ETDEWEB)
Shah, Vikas; Shah, Sachit; Barnacle, Alex; McHugh, Kieran [Great Ormond Street Hospital for Children, Department of Radiology, London (United Kingdom); Sebire, Neil J. [Great Ormond Street Hospital for Children, Department of Histopathology, London (United Kingdom); Brock, Penelope [Great Ormond Street Hospital for Children, Department of Oncology, London (United Kingdom); Harper, John I. [Great Ormond Street Hospital for Children, Department of Dermatology, London (United Kingdom)
2011-08-15
Multifocal lymphangiomatosis is a rare systemic disorder affecting children. Due to its rarity and wide spectrum of clinical, histological and imaging features, establishing the diagnosis of multifocal lymphangiomatosis can be challenging. The purpose of this study was to describe a new imaging sign in this disorder: paraspinal soft tissue and signal abnormality at MRI. We retrospectively reviewed the imaging, clinical and histopathological findings in a cohort of eight children with thoracic involvement from this condition. Evidence of paraspinal chest disease was identified at MRI and CT in all eight of these children. The changes comprise heterogeneous intermediate-to-high signal parallel to the thoracic vertebrae on T2-weighted sequences at MRI, with abnormal paraspinal soft tissue at CT and plain radiography. Multifocal lymphangiomatosis is a rare disorder with a broad range of clinicopathological and imaging features. MRI allows complete evaluation of disease extent without the use of ionising radiation and has allowed us to describe a previously unreported imaging sign in this disorder, namely, heterogeneous hyperintense signal in abnormal paraspinal tissue on T2-weighted images. (orig.)
Twelve previously unknown phage genera are ubiquitous in global oceans.
Holmfeldt, Karin; Solonenko, Natalie; Shah, Manesh; Corrier, Kristen; Riemann, Lasse; Verberkmoes, Nathan C; Sullivan, Matthew B
2013-07-30
Viruses are fundamental to ecosystems ranging from oceans to humans, yet our ability to study them is bottlenecked by the lack of ecologically relevant isolates, resulting in "unknowns" dominating culture-independent surveys. Here we present genomes from 31 phages infecting multiple strains of the aquatic bacterium Cellulophaga baltica (Bacteroidetes) to provide data for an underrepresented and environmentally abundant bacterial lineage. Comparative genomics delineated 12 phage groups that (i) each represent a new genus, and (ii) represent one novel and four well-known viral families. This diversity contrasts the few well-studied marine phage systems, but parallels the diversity of phages infecting human-associated bacteria. Although all 12 Cellulophaga phages represent new genera, the podoviruses and icosahedral, nontailed ssDNA phages were exceptional, with genomes up to twice as large as those previously observed for each phage type. Structural novelty was also substantial, requiring experimental phage proteomics to identify 83% of the structural proteins. The presence of uncommon nucleotide metabolism genes in four genera likely underscores the importance of scavenging nutrient-rich molecules as previously seen for phages in marine environments. Metagenomic recruitment analyses suggest that these particular Cellulophaga phages are rare and may represent a first glimpse into the phage side of the rare biosphere. However, these analyses also revealed that these phage genera are widespread, occurring in 94% of 137 investigated metagenomes. Together, this diverse and novel collection of phages identifies a small but ubiquitous fraction of unknown marine viral diversity and provides numerous environmentally relevant phage-host systems for experimental hypothesis testing.
Automatic Parallelization Tool: Classification of Program Code for Parallel Computing
Directory of Open Access Journals (Sweden)
Mustafa Basthikodi
2016-04-01
Full Text Available Performance growth of single-core processors has come to a halt in the past decade, but was re-enabled by the introduction of parallelism in processors. Multicore frameworks along with Graphical Processing Units empowered to enhance parallelism broadly. Couples of compilers are updated to developing challenges forsynchronization and threading issues. Appropriate program and algorithm classifications will have advantage to a great extent to the group of software engineers to get opportunities for effective parallelization. In present work we investigated current species for classification of algorithms, in that related work on classification is discussed along with the comparison of issues that challenges the classification. The set of algorithms are chosen which matches the structure with different issues and perform given task. We have tested these algorithms utilizing existing automatic species extraction toolsalong with Bones compiler. We have added functionalities to existing tool, providing a more detailed characterization. The contributions of our work include support for pointer arithmetic, conditional and incremental statements, user defined types, constants and mathematical functions. With this, we can retain significant data which is not captured by original speciesof algorithms. We executed new theories into the device, empowering automatic characterization of program code.
Structural synthesis of parallel robots
Gogu, Grigore
This book represents the fifth part of a larger work dedicated to the structural synthesis of parallel robots. The originality of this work resides in the fact that it combines new formulae for mobility, connectivity, redundancy and overconstraints with evolutionary morphology in a unified structural synthesis approach that yields interesting and innovative solutions for parallel robotic manipulators. This is the first book on robotics that presents solutions for coupled, decoupled, uncoupled, fully-isotropic and maximally regular robotic manipulators with Schönflies motions systematically generated by using the structural synthesis approach proposed in Part 1. Overconstrained non-redundant/overactuated/redundantly actuated solutions with simple/complex limbs are proposed. Many solutions are presented here for the first time in the literature. The author had to make a difficult and challenging choice between protecting these solutions through patents and releasing them directly into the public domain. T...
GPU Parallel Bundle Block Adjustment
Directory of Open Access Journals (Sweden)
ZHENG Maoteng
2017-09-01
Full Text Available To deal with massive data in photogrammetry, we introduce the GPU parallel computing technology. The preconditioned conjugate gradient and inexact Newton method are also applied to decrease the iteration times while solving the normal equation. A brand new workflow of bundle adjustment is developed to utilize GPU parallel computing technology. Our method can avoid the storage and inversion of the big normal matrix, and compute the normal matrix in real time. The proposed method can not only largely decrease the memory requirement of normal matrix, but also largely improve the efficiency of bundle adjustment. It also achieves the same accuracy as the conventional method. Preliminary experiment results show that the bundle adjustment of a dataset with about 4500 images and 9 million image points can be done in only 1.5 minutes while achieving sub-pixel accuracy.
Computational chaos in massively parallel neural networks
Barhen, Jacob; Gulati, Sandeep
1989-01-01
A fundamental issue which directly impacts the scalability of current theoretical neural network models to massively parallel embodiments, in both software as well as hardware, is the inherent and unavoidable concurrent asynchronicity of emerging fine-grained computational ensembles and the possible emergence of chaotic manifestations. Previous analyses attributed dynamical instability to the topology of the interconnection matrix, to parasitic components or to propagation delays. However, researchers have observed the existence of emergent computational chaos in a concurrently asynchronous framework, independent of the network topology. Researcher present a methodology enabling the effective asynchronous operation of large-scale neural networks. Necessary and sufficient conditions guaranteeing concurrent asynchronous convergence are established in terms of contracting operators. Lyapunov exponents are computed formally to characterize the underlying nonlinear dynamics. Simulation results are presented to illustrate network convergence to the correct results, even in the presence of large delays.
Parallel Fast Multipole Boundary Element Method for crustal dynamics
International Nuclear Information System (INIS)
Quevedo, Leonardo; Morra, Gabriele; Mueller, R Dietmar
2010-01-01
Crustal faults and sharp material transitions in the crust are usually represented as triangulated surfaces in structural geological models. The complex range of volumes separating such surfaces is typically three-dimensionally meshed in order to solve equations that describe crustal deformation with the finite-difference (FD) or finite-element (FEM) methods. We show here how the Boundary Element Method, combined with the Multipole approach, can revolutionise the calculation of stress and strain, solving the problem of computational scalability from reservoir to basin scales. The Fast Multipole Boundary Element Method (Fast BEM) tackles the difficulty of handling the intricate volume meshes and high resolution of crustal data that has put classical Finite 3D approaches in a performance crisis. The two main performance enhancements of this method: the reduction of required mesh elements from cubic to quadratic with linear size and linear-logarithmic runtime; achieve a reduction of memory and runtime requirements allowing the treatment of a new scale of geodynamic models. This approach was recently tested and applied in a series of papers by [1, 2, 3] for regional and global geodynamics, using KD trees for fast identification of near and far-field interacting elements, and MPI parallelised code on distributed memory architectures, and is now in active development for crustal dynamics. As the method is based on a free-surface, it allows easy data transfer to geological visualisation tools where only changes in boundaries and material properties are required as input parameters. In addition, easy volume mesh sampling of physical quantities enables direct integration with existing FD/FEM code.
A tandem parallel plate analyzer
International Nuclear Information System (INIS)
Hamada, Y.; Fujisawa, A.; Iguchi, H.; Nishizawa, A.; Kawasumi, Y.
1996-11-01
By a new modification of a parallel plate analyzer the second-order focus is obtained in an arbitrary injection angle. This kind of an analyzer with a small injection angle will have an advantage of small operational voltage, compared to the Proca and Green analyzer where the injection angle is 30 degrees. Thus, the newly proposed analyzer will be very useful for the precise energy measurement of high energy particles in MeV range. (author)
International Nuclear Information System (INIS)
Gus'kov, B.N.; Kalinnikov, V.A.; Krastev, V.R.; Maksimov, A.N.; Nikityuk, N.M.
1985-01-01
This paper describes a high-speed parallel counter that contains 31 inputs and 15 outputs and is implemented by integrated circuits of series 500. The counter is designed for fast sampling of events according to the number of particles that pass simultaneously through the hodoscopic plane of the detector. The minimum delay of the output signals relative to the input is 43 nsec. The duration of the output signals can be varied from 75 to 120 nsec
An anthropologist in parallel structure
Directory of Open Access Journals (Sweden)
Noelle Molé Liston
2016-08-01
Full Text Available The essay examines the parallels between Molé Liston’s studies on labor and precarity in Italy and the United States’ anthropology job market. Probing the way economic shift reshaped the field of anthropology of Europe in the late 2000s, the piece explores how the neoliberalization of the American academy increased the value in studying the hardships and daily lives of non-western populations in Europe.
Combinatorics of spreads and parallelisms
Johnson, Norman
2010-01-01
Partitions of Vector Spaces Quasi-Subgeometry Partitions Finite Focal-SpreadsGeneralizing André SpreadsThe Going Up Construction for Focal-SpreadsSubgeometry Partitions Subgeometry and Quasi-Subgeometry Partitions Subgeometries from Focal-SpreadsExtended André SubgeometriesKantor's Flag-Transitive DesignsMaximal Additive Partial SpreadsSubplane Covered Nets and Baer Groups Partial Desarguesian t-Parallelisms Direct Products of Affine PlanesJha-Johnson SL(2,
New algorithms for parallel MRI
International Nuclear Information System (INIS)
Anzengruber, S; Ramlau, R; Bauer, F; Leitao, A
2008-01-01
Magnetic Resonance Imaging with parallel data acquisition requires algorithms for reconstructing the patient's image from a small number of measured lines of the Fourier domain (k-space). In contrast to well-known algorithms like SENSE and GRAPPA and its flavors we consider the problem as a non-linear inverse problem. However, in order to avoid cost intensive derivatives we will use Landweber-Kaczmarz iteration and in order to improve the overall results some additional sparsity constraints.
Wakefield calculations on parallel computers
International Nuclear Information System (INIS)
Schoessow, P.
1990-01-01
The use of parallelism in the solution of wakefield problems is illustrated for two different computer architectures (SIMD and MIMD). Results are given for finite difference codes which have been implemented on a Connection Machine and an Alliant FX/8 and which are used to compute wakefields in dielectric loaded structures. Benchmarks on code performance are presented for both cases. 4 refs., 3 figs., 2 tabs
Automatic parallelization of while-Loops using speculative execution
International Nuclear Information System (INIS)
Collard, J.F.
1995-01-01
Automatic parallelization of imperative sequential programs has focused on nests of for-loops. The most recent of them consist in finding an affine mapping with respect to the loop indices to simultaneously capture the temporal and spatial properties of the parallelized program. Such a mapping is usually called a open-quotes space-time transformation.close quotes This work describes an extension of these techniques to while-loops using speculative execution. We show that space-time transformations are a good framework for summing up previous restructuration techniques of while-loop, such as pipelining. Moreover, we show that these transformations can be derived and applied automatically
A high performance parallel approach to medical imaging
International Nuclear Information System (INIS)
Frieder, G.; Frieder, O.; Stytz, M.R.
1988-01-01
Research into medical imaging using general purpose parallel processing architectures is described and a review of the performance of previous medical imaging machines is provided. Results demonstrating that general purpose parallel architectures can achieve performance comparable to other, specialized, medical imaging machine architectures is presented. A new back-to-front hidden-surface removal algorithm is described. Results demonstrating the computational savings obtained by using the modified back-to-front hidden-surface removal algorithm are presented. Performance figures for forming a full-scale medical image on a mesh interconnected multiprocessor are presented
Aspects of computation on asynchronous parallel processors
International Nuclear Information System (INIS)
Wright, M.
1989-01-01
The increasing availability of asynchronous parallel processors has provided opportunities for original and useful work in scientific computing. However, the field of parallel computing is still in a highly volatile state, and researchers display a wide range of opinion about many fundamental questions such as models of parallelism, approaches for detecting and analyzing parallelism of algorithms, and tools that allow software developers and users to make effective use of diverse forms of complex hardware. This volume collects the work of researchers specializing in different aspects of parallel computing, who met to discuss the framework and the mechanics of numerical computing. The far-reaching impact of high-performance asynchronous systems is reflected in the wide variety of topics, which include scientific applications (e.g. linear algebra, lattice gauge simulation, ordinary and partial differential equations), models of parallelism, parallel language features, task scheduling, automatic parallelization techniques, tools for algorithm development in parallel environments, and system design issues
Parallel processing of genomics data
Agapito, Giuseppe; Guzzi, Pietro Hiram; Cannataro, Mario
2016-10-01
The availability of high-throughput experimental platforms for the analysis of biological samples, such as mass spectrometry, microarrays and Next Generation Sequencing, have made possible to analyze a whole genome in a single experiment. Such platforms produce an enormous volume of data per single experiment, thus the analysis of this enormous flow of data poses several challenges in term of data storage, preprocessing, and analysis. To face those issues, efficient, possibly parallel, bioinformatics software needs to be used to preprocess and analyze data, for instance to highlight genetic variation associated with complex diseases. In this paper we present a parallel algorithm for the parallel preprocessing and statistical analysis of genomics data, able to face high dimension of data and resulting in good response time. The proposed system is able to find statistically significant biological markers able to discriminate classes of patients that respond to drugs in different ways. Experiments performed on real and synthetic genomic datasets show good speed-up and scalability.
Overview of the Force Scientific Parallel Language
Directory of Open Access Journals (Sweden)
Gita Alaghband
1994-01-01
Full Text Available The Force parallel programming language designed for large-scale shared-memory multiprocessors is presented. The language provides a number of parallel constructs as extensions to the ordinary Fortran language and is implemented as a two-level macro preprocessor to support portability across shared memory multiprocessors. The global parallelism model on which the Force is based provides a powerful parallel language. The parallel constructs, generic synchronization, and freedom from process management supported by the Force has resulted in structured parallel programs that are ported to the many multiprocessors on which the Force is implemented. Two new parallel constructs for looping and functional decomposition are discussed. Several programming examples to illustrate some parallel programming approaches using the Force are also presented.
Automatic Loop Parallelization via Compiler Guided Refactoring
DEFF Research Database (Denmark)
Larsen, Per; Ladelsky, Razya; Lidman, Jacob
For many parallel applications, performance relies not on instruction-level parallelism, but on loop-level parallelism. Unfortunately, many modern applications are written in ways that obstruct automatic loop parallelization. Since we cannot identify sufficient parallelization opportunities...... for these codes in a static, off-line compiler, we developed an interactive compilation feedback system that guides the programmer in iteratively modifying application source, thereby improving the compiler’s ability to generate loop-parallel code. We use this compilation system to modify two sequential...... benchmarks, finding that the code parallelized in this way runs up to 8.3 times faster on an octo-core Intel Xeon 5570 system and up to 12.5 times faster on a quad-core IBM POWER6 system. Benchmark performance varies significantly between the systems. This suggests that semi-automatic parallelization should...
Parallel kinematics type, kinematics, and optimal design
Liu, Xin-Jun
2014-01-01
Parallel Kinematics- Type, Kinematics, and Optimal Design presents the results of 15 year's research on parallel mechanisms and parallel kinematics machines. This book covers the systematic classification of parallel mechanisms (PMs) as well as providing a large number of mechanical architectures of PMs available for use in practical applications. It focuses on the kinematic design of parallel robots. One successful application of parallel mechanisms in the field of machine tools, which is also called parallel kinematics machines, has been the emerging trend in advanced machine tools. The book describes not only the main aspects and important topics in parallel kinematics, but also references novel concepts and approaches, i.e. type synthesis based on evolution, performance evaluation and optimization based on screw theory, singularity model taking into account motion and force transmissibility, and others. This book is intended for researchers, scientists, engineers and postgraduates or above with interes...
Applied Parallel Computing Industrial Computation and Optimization
DEFF Research Database (Denmark)
Madsen, Kaj; NA NA NA Olesen, Dorte
Proceedings and the Third International Workshop on Applied Parallel Computing in Industrial Problems and Optimization (PARA96)......Proceedings and the Third International Workshop on Applied Parallel Computing in Industrial Problems and Optimization (PARA96)...
Parallel algorithms and cluster computing
Hoffmann, Karl Heinz
2007-01-01
This book presents major advances in high performance computing as well as major advances due to high performance computing. It contains a collection of papers in which results achieved in the collaboration of scientists from computer science, mathematics, physics, and mechanical engineering are presented. From the science problems to the mathematical algorithms and on to the effective implementation of these algorithms on massively parallel and cluster computers we present state-of-the-art methods and technology as well as exemplary results in these fields. This book shows that problems which seem superficially distinct become intimately connected on a computational level.
Parallel computation of rotating flows
DEFF Research Database (Denmark)
Lundin, Lars Kristian; Barker, Vincent A.; Sørensen, Jens Nørkær
1999-01-01
This paper deals with the simulation of 3‐D rotating flows based on the velocity‐vorticity formulation of the Navier‐Stokes equations in cylindrical coordinates. The governing equations are discretized by a finite difference method. The solution is advanced to a new time level by a two‐step process...... is that of solving a singular, large, sparse, over‐determined linear system of equations, and the iterative method CGLS is applied for this purpose. We discuss some of the mathematical and numerical aspects of this procedure and report on the performance of our software on a wide range of parallel computers. Darbe...
The parallel volume at large distances
DEFF Research Database (Denmark)
Kampf, Jürgen
In this paper we examine the asymptotic behavior of the parallel volume of planar non-convex bodies as the distance tends to infinity. We show that the difference between the parallel volume of the convex hull of a body and the parallel volume of the body itself tends to . This yields a new proof...... for the fact that a planar body can only have polynomial parallel volume, if it is convex. Extensions to Minkowski spaces and random sets are also discussed....
The parallel volume at large distances
DEFF Research Database (Denmark)
Kampf, Jürgen
In this paper we examine the asymptotic behavior of the parallel volume of planar non-convex bodies as the distance tends to infinity. We show that the difference between the parallel volume of the convex hull of a body and the parallel volume of the body itself tends to 0. This yields a new proof...... for the fact that a planar body can only have polynomial parallel volume, if it is convex. Extensions to Minkowski spaces and random sets are also discussed....
Parallel Careers and their Consequences for Companies in Brazil
Directory of Open Access Journals (Sweden)
Maria Candida Baumer Azevedo
2014-04-01
Full Text Available Given the relevance of the need to manage parallel careers to attract and retain people in organizations, this paper provides insight into this phenomenon from an organizational perspective. The parallel career concept, introduced by Alboher (2007 and recently addressed by Schuiling (2012, has previously been examined only from the perspective of the parallel career holder (PC holder. The paper provides insight from both individual and organizational perspectives on the phenomenon of parallel careers and considers how it can function as an important tool for attracting and retaining people by contributing to human development. This paper employs a qualitative approach that includes 30 semi-structured one-on-one interviews. The organizational perspective arises from the 15 interviews with human resources (HR executives from different companies. The individual viewpoint originates from the interviews with 15 executives who are also PC holders. An inductive content analysis approach was used to examine Brazilian companies and the Brazilian office of multinationals. Companies that are concerned about having the best talent on their teams can benefit from a deeper understanding of parallel careers, which can be used to attract, develop, and retain talent. Limitations and directions for future research are discussed.
Development of parallel Fokker-Planck code ALLAp
International Nuclear Information System (INIS)
Batishcheva, A.A.; Sigmar, D.J.; Koniges, A.E.
1996-01-01
We report on our ongoing development of the 3D Fokker-Planck code ALLA for a highly collisional scrape-off-layer (SOL) plasma. A SOL with strong gradients of density and temperature in the spatial dimension is modeled. Our method is based on a 3-D adaptive grid (in space, magnitude of the velocity, and cosine of the pitch angle) and a second order conservative scheme. Note that the grid size is typically 100 x 257 x 65 nodes. It was shown in our previous work that only these capabilities make it possible to benchmark a 3D code against a spatially-dependent self-similar solution of a kinetic equation with the Landau collision term. In the present work we show results of a more precise benchmarking against the exact solutions of the kinetic equation using a new parallel code ALLAp with an improved method of parallelization and a modified boundary condition at the plasma edge. We also report first results from the code parallelization using Message Passing Interface for a Massively Parallel CRI T3D platform. We evaluate the ALLAp code performance versus the number of T3D processors used and compare its efficiency against a Work/Data Sharing parallelization scheme and a workstation version
A Parallel Approach to Fractal Image Compression
Directory of Open Access Journals (Sweden)
Lubomir Dedera
2004-01-01
Full Text Available The paper deals with a parallel approach to coding and decoding algorithms in fractal image compressionand presents experimental results comparing sequential and parallel algorithms from the point of view of achieved bothcoding and decoding time and effectiveness of parallelization.
Parallel Computing Using Web Servers and "Servlets".
Lo, Alfred; Bloor, Chris; Choi, Y. K.
2000-01-01
Describes parallel computing and presents inexpensive ways to implement a virtual parallel computer with multiple Web servers. Highlights include performance measurement of parallel systems; models for using Java and intranet technology including single server, multiple clients and multiple servers, single client; and a comparison of CGI (common…
An Introduction to Parallel Computation R
Indian Academy of Sciences (India)
How are they programmed? This article provides an introduction. A parallel computer is a network of processors built for ... and have been used to solve problems much faster than a single ... in parallel computer design is to select an organization which ..... The most ambitious approach to parallel computing is to develop.
Comparison of parallel viscosity with neoclassical theory
International Nuclear Information System (INIS)
Ida, K.; Nakajima, N.
1996-04-01
Toroidal rotation profiles are measured with charge exchange spectroscopy for the plasma heated with tangential NBI in CHS heliotron/torsatron device to estimate parallel viscosity. The parallel viscosity derived from the toroidal rotation velocity shows good agreement with the neoclassical parallel viscosity plus the perpendicular viscosity. (μ perpendicular = 2 m 2 /s). (author)
Advances in randomized parallel computing
Rajasekaran, Sanguthevar
1999-01-01
The technique of randomization has been employed to solve numerous prob lems of computing both sequentially and in parallel. Examples of randomized algorithms that are asymptotically better than their deterministic counterparts in solving various fundamental problems abound. Randomized algorithms have the advantages of simplicity and better performance both in theory and often in practice. This book is a collection of articles written by renowned experts in the area of randomized parallel computing. A brief introduction to randomized algorithms In the aflalysis of algorithms, at least three different measures of performance can be used: the best case, the worst case, and the average case. Often, the average case run time of an algorithm is much smaller than the worst case. 2 For instance, the worst case run time of Hoare's quicksort is O(n ), whereas its average case run time is only O( n log n). The average case analysis is conducted with an assumption on the input space. The assumption made to arrive at t...
Xyce parallel electronic simulator design.
Energy Technology Data Exchange (ETDEWEB)
Thornquist, Heidi K.; Rankin, Eric Lamont; Mei, Ting; Schiek, Richard Louis; Keiter, Eric Richard; Russo, Thomas V.
2010-09-01
This document is the Xyce Circuit Simulator developer guide. Xyce has been designed from the 'ground up' to be a SPICE-compatible, distributed memory parallel circuit simulator. While it is in many respects a research code, Xyce is intended to be a production simulator. As such, having software quality engineering (SQE) procedures in place to insure a high level of code quality and robustness are essential. Version control, issue tracking customer support, C++ style guildlines and the Xyce release process are all described. The Xyce Parallel Electronic Simulator has been under development at Sandia since 1999. Historically, Xyce has mostly been funded by ASC, the original focus of Xyce development has primarily been related to circuits for nuclear weapons. However, this has not been the only focus and it is expected that the project will diversify. Like many ASC projects, Xyce is a group development effort, which involves a number of researchers, engineers, scientists, mathmaticians and computer scientists. In addition to diversity of background, it is to be expected on long term projects for there to be a certain amount of staff turnover, as people move on to different projects. As a result, it is very important that the project maintain high software quality standards. The point of this document is to formally document a number of the software quality practices followed by the Xyce team in one place. Also, it is hoped that this document will be a good source of information for new developers.
PDDP, A Data Parallel Programming Model
Directory of Open Access Journals (Sweden)
Karen H. Warren
1996-01-01
Full Text Available PDDP, the parallel data distribution preprocessor, is a data parallel programming model for distributed memory parallel computers. PDDP implements high-performance Fortran-compatible data distribution directives and parallelism expressed by the use of Fortran 90 array syntax, the FORALL statement, and the WHERE construct. Distributed data objects belong to a global name space; other data objects are treated as local and replicated on each processor. PDDP allows the user to program in a shared memory style and generates codes that are portable to a variety of parallel machines. For interprocessor communication, PDDP uses the fastest communication primitives on each platform.
Parallelization of quantum molecular dynamics simulation code
International Nuclear Information System (INIS)
Kato, Kaori; Kunugi, Tomoaki; Shibahara, Masahiko; Kotake, Susumu
1998-02-01
A quantum molecular dynamics simulation code has been developed for the analysis of the thermalization of photon energies in the molecule or materials in Kansai Research Establishment. The simulation code is parallelized for both Scalar massively parallel computer (Intel Paragon XP/S75) and Vector parallel computer (Fujitsu VPP300/12). Scalable speed-up has been obtained with a distribution to processor units by division of particle group in both parallel computers. As a result of distribution to processor units not only by particle group but also by the particles calculation that is constructed with fine calculations, highly parallelization performance is achieved in Intel Paragon XP/S75. (author)
Implementation and performance of parallelized elegant
International Nuclear Information System (INIS)
Wang, Y.; Borland, M.
2008-01-01
The program elegant is widely used for design and modeling of linacs for free-electron lasers and energy recovery linacs, as well as storage rings and other applications. As part of a multi-year effort, we have parallelized many aspects of the code, including single-particle dynamics, wakefields, and coherent synchrotron radiation. We report on the approach used for gradual parallelization, which proved very beneficial in getting parallel features into the hands of users quickly. We also report details of parallelization of collective effects. Finally, we discuss performance of the parallelized code in various applications.
Parallelization of 2-D lattice Boltzmann codes
International Nuclear Information System (INIS)
Suzuki, Soichiro; Kaburaki, Hideo; Yokokawa, Mitsuo.
1996-03-01
Lattice Boltzmann (LB) codes to simulate two dimensional fluid flow are developed on vector parallel computer Fujitsu VPP500 and scalar parallel computer Intel Paragon XP/S. While a 2-D domain decomposition method is used for the scalar parallel LB code, a 1-D domain decomposition method is used for the vector parallel LB code to be vectorized along with the axis perpendicular to the direction of the decomposition. High parallel efficiency of 95.1% by the vector parallel calculation on 16 processors with 1152x1152 grid and 88.6% by the scalar parallel calculation on 100 processors with 800x800 grid are obtained. The performance models are developed to analyze the performance of the LB codes. It is shown by our performance models that the execution speed of the vector parallel code is about one hundred times faster than that of the scalar parallel code with the same number of processors up to 100 processors. We also analyze the scalability in keeping the available memory size of one processor element at maximum. Our performance model predicts that the execution time of the vector parallel code increases about 3% on 500 processors. Although the 1-D domain decomposition method has in general a drawback in the interprocessor communication, the vector parallel LB code is still suitable for the large scale and/or high resolution simulations. (author)
Parallelization of 2-D lattice Boltzmann codes
Energy Technology Data Exchange (ETDEWEB)
Suzuki, Soichiro; Kaburaki, Hideo; Yokokawa, Mitsuo
1996-03-01
Lattice Boltzmann (LB) codes to simulate two dimensional fluid flow are developed on vector parallel computer Fujitsu VPP500 and scalar parallel computer Intel Paragon XP/S. While a 2-D domain decomposition method is used for the scalar parallel LB code, a 1-D domain decomposition method is used for the vector parallel LB code to be vectorized along with the axis perpendicular to the direction of the decomposition. High parallel efficiency of 95.1% by the vector parallel calculation on 16 processors with 1152x1152 grid and 88.6% by the scalar parallel calculation on 100 processors with 800x800 grid are obtained. The performance models are developed to analyze the performance of the LB codes. It is shown by our performance models that the execution speed of the vector parallel code is about one hundred times faster than that of the scalar parallel code with the same number of processors up to 100 processors. We also analyze the scalability in keeping the available memory size of one processor element at maximum. Our performance model predicts that the execution time of the vector parallel code increases about 3% on 500 processors. Although the 1-D domain decomposition method has in general a drawback in the interprocessor communication, the vector parallel LB code is still suitable for the large scale and/or high resolution simulations. (author).
Arkin, Ethem; Tekinerdogan, Bedir; Imre, Kayhan M.
2017-01-01
The need for high-performance computing together with the increasing trend from single processor to parallel computer architectures has leveraged the adoption of parallel computing. To benefit from parallel computing power, usually parallel algorithms are defined that can be mapped and executed
Experiences in Data-Parallel Programming
Directory of Open Access Journals (Sweden)
Terry W. Clark
1997-01-01
Full Text Available To efficiently parallelize a scientific application with a data-parallel compiler requires certain structural properties in the source program, and conversely, the absence of others. A recent parallelization effort of ours reinforced this observation and motivated this correspondence. Specifically, we have transformed a Fortran 77 version of GROMOS, a popular dusty-deck program for molecular dynamics, into Fortran D, a data-parallel dialect of Fortran. During this transformation we have encountered a number of difficulties that probably are neither limited to this particular application nor do they seem likely to be addressed by improved compiler technology in the near future. Our experience with GROMOS suggests a number of points to keep in mind when developing software that may at some time in its life cycle be parallelized with a data-parallel compiler. This note presents some guidelines for engineering data-parallel applications that are compatible with Fortran D or High Performance Fortran compilers.
Streaming for Functional Data-Parallel Languages
DEFF Research Database (Denmark)
Madsen, Frederik Meisner
In this thesis, we investigate streaming as a general solution to the space inefficiency commonly found in functional data-parallel programming languages. The data-parallel paradigm maps well to parallel SIMD-style hardware. However, the traditional fully materializing execution strategy...... by extending two existing data-parallel languages: NESL and Accelerate. In the extensions we map bulk operations to data-parallel streams that can evaluate fully sequential, fully parallel or anything in between. By a dataflow, piecewise parallel execution strategy, the runtime system can adjust to any target...... flattening necessitates all sub-computations to materialize at the same time. For example, naive n by n matrix multiplication requires n^3 space in NESL because the algorithm contains n^3 independent scalar multiplications. For large values of n, this is completely unacceptable. We address the problem...
Discontinuous interleaving of parallel inverters for efficiency improvement
DEFF Research Database (Denmark)
Rannestad, Bjørn; Munk-Nielsen, Stig; Gadgaard, Kristian
2017-01-01
Interleaved switching of parallel inverters has previously been proposed for efficiency/size improvements of grid connected three-phase inverters. This paper proposes a novel interleaving method which practically eliminates insulated gate bipolar transistor (IGBT) turn-on losses and drastically...... overall power module losses are reduced. The modulation strategy is suited for converters with doubly fed induction generators (DFIG) for wind turbines, but are not limited hereto. Improvement of switching performance are measured and operational efficiency improvements are calculated and verified...
Parallel algorithms for simulating continuous time Markov chains
Nicol, David M.; Heidelberger, Philip
1992-01-01
We have previously shown that the mathematical technique of uniformization can serve as the basis of synchronization for the parallel simulation of continuous-time Markov chains. This paper reviews the basic method and compares five different methods based on uniformization, evaluating their strengths and weaknesses as a function of problem characteristics. The methods vary in their use of optimism, logical aggregation, communication management, and adaptivity. Performance evaluation is conducted on the Intel Touchstone Delta multiprocessor, using up to 256 processors.
Massively parallel diffuse optical tomography
Energy Technology Data Exchange (ETDEWEB)
Sandusky, John V.; Pitts, Todd A.
2017-09-05
Diffuse optical tomography systems and methods are described herein. In a general embodiment, the diffuse optical tomography system comprises a plurality of sensor heads, the plurality of sensor heads comprising respective optical emitter systems and respective sensor systems. A sensor head in the plurality of sensors heads is caused to act as an illuminator, such that its optical emitter system transmits a transillumination beam towards a portion of a sample. Other sensor heads in the plurality of sensor heads act as observers, detecting portions of the transillumination beam that radiate from the sample in the fields of view of the respective sensory systems of the other sensor heads. Thus, sensor heads in the plurality of sensors heads generate sensor data in parallel.
Embodied and Distributed Parallel DJing.
Cappelen, Birgitta; Andersson, Anders-Petter
2016-01-01
Everyone has a right to take part in cultural events and activities, such as music performances and music making. Enforcing that right, within Universal Design, is often limited to a focus on physical access to public areas, hearing aids etc., or groups of persons with special needs performing in traditional ways. The latter might be people with disabilities, being musicians playing traditional instruments, or actors playing theatre. In this paper we focus on the innovative potential of including people with special needs, when creating new cultural activities. In our project RHYME our goal was to create health promoting activities for children with severe disabilities, by developing new musical and multimedia technologies. Because of the users' extreme demands and rich contribution, we ended up creating both a new genre of musical instruments and a new art form. We call this new art form Embodied and Distributed Parallel DJing, and the new genre of instruments for Empowering Multi-Sensorial Things.
Device for balancing parallel strings
Mashikian, Matthew S.
1985-01-01
A battery plant is described which features magnetic circuit means in association with each of the battery strings in the battery plant for balancing the electrical current flow through the battery strings by equalizing the voltage across each of the battery strings. Each of the magnetic circuit means generally comprises means for sensing the electrical current flow through one of the battery strings, and a saturable reactor having a main winding connected electrically in series with the battery string, a bias winding connected to a source of alternating current and a control winding connected to a variable source of direct current controlled by the sensing means. Each of the battery strings is formed by a plurality of batteries connected electrically in series, and these battery strings are connected electrically in parallel across common bus conductors.
Linear parallel processing machines I
Energy Technology Data Exchange (ETDEWEB)
Von Kunze, M
1984-01-01
As is well-known, non-context-free grammars for generating formal languages happen to be of a certain intrinsic computational power that presents serious difficulties to efficient parsing algorithms as well as for the development of an algebraic theory of contextsensitive languages. In this paper a framework is given for the investigation of the computational power of formal grammars, in order to start a thorough analysis of grammars consisting of derivation rules of the form aB ..-->.. A/sub 1/ ... A /sub n/ b/sub 1/...b /sub m/ . These grammars may be thought of as automata by means of parallel processing, if one considers the variables as operators acting on the terminals while reading them right-to-left. This kind of automata and their 2-dimensional programming language prove to be useful by allowing a concise linear-time algorithm for integer multiplication. Linear parallel processing machines (LP-machines) which are, in their general form, equivalent to Turing machines, include finite automata and pushdown automata (with states encoded) as special cases. Bounded LP-machines yield deterministic accepting automata for nondeterministic contextfree languages, and they define an interesting class of contextsensitive languages. A characterization of this class in terms of generating grammars is established by using derivation trees with crossings as a helpful tool. From the algebraic point of view, deterministic LP-machines are effectively represented semigroups with distinguished subsets. Concerning the dualism between generating and accepting devices of formal languages within the algebraic setting, the concept of accepting automata turns out to reduce essentially to embeddability in an effectively represented extension monoid, even in the classical cases.
Parallel computing in enterprise modeling.
Energy Technology Data Exchange (ETDEWEB)
Goldsby, Michael E.; Armstrong, Robert C.; Shneider, Max S.; Vanderveen, Keith; Ray, Jaideep; Heath, Zach; Allan, Benjamin A.
2008-08-01
This report presents the results of our efforts to apply high-performance computing to entity-based simulations with a multi-use plugin for parallel computing. We use the term 'Entity-based simulation' to describe a class of simulation which includes both discrete event simulation and agent based simulation. What simulations of this class share, and what differs from more traditional models, is that the result sought is emergent from a large number of contributing entities. Logistic, economic and social simulations are members of this class where things or people are organized or self-organize to produce a solution. Entity-based problems never have an a priori ergodic principle that will greatly simplify calculations. Because the results of entity-based simulations can only be realized at scale, scalable computing is de rigueur for large problems. Having said that, the absence of a spatial organizing principal makes the decomposition of the problem onto processors problematic. In addition, practitioners in this domain commonly use the Java programming language which presents its own problems in a high-performance setting. The plugin we have developed, called the Parallel Particle Data Model, overcomes both of these obstacles and is now being used by two Sandia frameworks: the Decision Analysis Center, and the Seldon social simulation facility. While the ability to engage U.S.-sized problems is now available to the Decision Analysis Center, this plugin is central to the success of Seldon. Because Seldon relies on computationally intensive cognitive sub-models, this work is necessary to achieve the scale necessary for realistic results. With the recent upheavals in the financial markets, and the inscrutability of terrorist activity, this simulation domain will likely need a capability with ever greater fidelity. High-performance computing will play an important part in enabling that greater fidelity.
Compiler Technology for Parallel Scientific Computation
Directory of Open Access Journals (Sweden)
Can Özturan
1994-01-01
Full Text Available There is a need for compiler technology that, given the source program, will generate efficient parallel codes for different architectures with minimal user involvement. Parallel computation is becoming indispensable in solving large-scale problems in science and engineering. Yet, the use of parallel computation is limited by the high costs of developing the needed software. To overcome this difficulty we advocate a comprehensive approach to the development of scalable architecture-independent software for scientific computation based on our experience with equational programming language (EPL. Our approach is based on a program decomposition, parallel code synthesis, and run-time support for parallel scientific computation. The program decomposition is guided by the source program annotations provided by the user. The synthesis of parallel code is based on configurations that describe the overall computation as a set of interacting components. Run-time support is provided by the compiler-generated code that redistributes computation and data during object program execution. The generated parallel code is optimized using techniques of data alignment, operator placement, wavefront determination, and memory optimization. In this article we discuss annotations, configurations, parallel code generation, and run-time support suitable for parallel programs written in the functional parallel programming language EPL and in Fortran.
Computer-Aided Parallelizer and Optimizer
Jin, Haoqiang
2011-01-01
The Computer-Aided Parallelizer and Optimizer (CAPO) automates the insertion of compiler directives (see figure) to facilitate parallel processing on Shared Memory Parallel (SMP) machines. While CAPO currently is integrated seamlessly into CAPTools (developed at the University of Greenwich, now marketed as ParaWise), CAPO was independently developed at Ames Research Center as one of the components for the Legacy Code Modernization (LCM) project. The current version takes serial FORTRAN programs, performs interprocedural data dependence analysis, and generates OpenMP directives. Due to the widely supported OpenMP standard, the generated OpenMP codes have the potential to run on a wide range of SMP machines. CAPO relies on accurate interprocedural data dependence information currently provided by CAPTools. Compiler directives are generated through identification of parallel loops in the outermost level, construction of parallel regions around parallel loops and optimization of parallel regions, and insertion of directives with automatic identification of private, reduction, induction, and shared variables. Attempts also have been made to identify potential pipeline parallelism (implemented with point-to-point synchronization). Although directives are generated automatically, user interaction with the tool is still important for producing good parallel codes. A comprehensive graphical user interface is included for users to interact with the parallelization process.
Parallel processing for fluid dynamics applications
International Nuclear Information System (INIS)
Johnson, G.M.
1989-01-01
The impact of parallel processing on computational science and, in particular, on computational fluid dynamics is growing rapidly. In this paper, particular emphasis is given to developments which have occurred within the past two years. Parallel processing is defined and the reasons for its importance in high-performance computing are reviewed. Parallel computer architectures are classified according to the number and power of their processing units, their memory, and the nature of their connection scheme. Architectures which show promise for fluid dynamics applications are emphasized. Fluid dynamics problems are examined for parallelism inherent at the physical level. CFD algorithms and their mappings onto parallel architectures are discussed. Several example are presented to document the performance of fluid dynamics applications on present-generation parallel processing devices
Design considerations for parallel graphics libraries
Crockett, Thomas W.
1994-01-01
Applications which run on parallel supercomputers are often characterized by massive datasets. Converting these vast collections of numbers to visual form has proven to be a powerful aid to comprehension. For a variety of reasons, it may be desirable to provide this visual feedback at runtime. One way to accomplish this is to exploit the available parallelism to perform graphics operations in place. In order to do this, we need appropriate parallel rendering algorithms and library interfaces. This paper provides a tutorial introduction to some of the issues which arise in designing parallel graphics libraries and their underlying rendering algorithms. The focus is on polygon rendering for distributed memory message-passing systems. We illustrate our discussion with examples from PGL, a parallel graphics library which has been developed on the Intel family of parallel systems.
A parallel solver for huge dense linear systems
Badia, J. M.; Movilla, J. L.; Climente, J. I.; Castillo, M.; Marqués, M.; Mayo, R.; Quintana-Ortí, E. S.; Planelles, J.
2011-11-01
: Linux/Unix Has the code been vectorized or parallelized?: Yes, includes MPI primitives. RAM: Tested for up to 190 GB Classification: 6.5 External routines: MPI ( http://www.mpi-forum.org/), BLAS ( http://www.netlib.org/blas/), PLAPACK ( http://www.cs.utexas.edu/~plapack/), POOCLAPACK ( ftp://ftp.cs.utexas.edu/pub/rvdg/PLAPACK/pooclapack.ps) (code for PLAPACK and POOCLAPACK is included in the distribution). Catalogue identifier of previous version: AEHU_v1_0 Journal reference of previous version: Comput. Phys. Comm. 182 (2011) 533 Does the new version supersede the previous version?: Yes Nature of problem: Huge scale dense systems of linear equations, Ax=B, beyond standard LAPACK capabilities. Solution method: The linear systems are solved by means of parallelized routines based on the LU factorization, using efficient secondary storage algorithms when the available main memory is insufficient. Reasons for new version: In many applications we need to guarantee a high accuracy in the solution of very large linear systems and we can do it by using double-precision arithmetic. Summary of revisions: Version 1.1 Can be used to solve linear systems using double-precision arithmetic. New version of the initialization routine. The user can choose the kind of arithmetic and the values of several parameters of the environment. Running time: About 5 hours to solve a system with more than 200 000 equations and more than 10 000 right-hand side vectors using double-precision arithmetic on an eight-node commodity cluster with a total of 64 Intel cores.
Synchronization Techniques in Parallel Discrete Event Simulation
Lindén, Jonatan
2018-01-01
Discrete event simulation is an important tool for evaluating system models in many fields of science and engineering. To improve the performance of large-scale discrete event simulations, several techniques to parallelize discrete event simulation have been developed. In parallel discrete event simulation, the work of a single discrete event simulation is distributed over multiple processing elements. A key challenge in parallel discrete event simulation is to ensure that causally dependent ...
Parallel processing for artificial intelligence 1
Kanal, LN; Kumar, V; Suttner, CB
1994-01-01
Parallel processing for AI problems is of great current interest because of its potential for alleviating the computational demands of AI procedures. The articles in this book consider parallel processing for problems in several areas of artificial intelligence: image processing, knowledge representation in semantic networks, production rules, mechanization of logic, constraint satisfaction, parsing of natural language, data filtering and data mining. The publication is divided into six sections. The first addresses parallel computing for processing and understanding images. The second discus
A survey of parallel multigrid algorithms
Chan, Tony F.; Tuminaro, Ray S.
1987-01-01
A typical multigrid algorithm applied to well-behaved linear-elliptic partial-differential equations (PDEs) is described. Criteria for designing and evaluating parallel algorithms are presented. Before evaluating the performance of some parallel multigrid algorithms, consideration is given to some theoretical complexity results for solving PDEs in parallel and for executing the multigrid algorithm. The effect of mapping and load imbalance on the partial efficiency of the algorithm is studied.
Refinement of Parallel and Reactive Programs
Back, R. J. R.
1992-01-01
We show how to apply the refinement calculus to stepwise refinement of parallel and reactive programs. We use action systems as our basic program model. Action systems are sequential programs which can be implemented in a parallel fashion. Hence refinement calculus methods, originally developed for sequential programs, carry over to the derivation of parallel programs. Refinement of reactive programs is handled by data refinement techniques originally developed for the sequential refinement c...
Parallel Prediction of Stock Volatility
Directory of Open Access Journals (Sweden)
Priscilla Jenq
2017-10-01
Full Text Available Volatility is a measurement of the risk of financial products. A stock will hit new highs and lows over time and if these highs and lows fluctuate wildly, then it is considered a high volatile stock. Such a stock is considered riskier than a stock whose volatility is low. Although highly volatile stocks are riskier, the returns that they generate for investors can be quite high. Of course, with a riskier stock also comes the chance of losing money and yielding negative returns. In this project, we will use historic stock data to help us forecast volatility. Since the financial industry usually uses S&P 500 as the indicator of the market, we will use S&P 500 as a benchmark to compute the risk. We will also use artificial neural networks as a tool to predict volatilities for a specific time frame that will be set when we configure this neural network. There have been reports that neural networks with different numbers of layers and different numbers of hidden nodes may generate varying results. In fact, we may be able to find the best configuration of a neural network to compute volatilities. We will implement this system using the parallel approach. The system can be used as a tool for investors to allocating and hedging assets.
Vectoring of parallel synthetic jets
Berk, Tim; Ganapathisubramani, Bharathram; Gomit, Guillaume
2015-11-01
A pair of parallel synthetic jets can be vectored by applying a phase difference between the two driving signals. The resulting jet can be merged or bifurcated and either vectored towards the actuator leading in phase or the actuator lagging in phase. In the present study, the influence of phase difference and Strouhal number on the vectoring behaviour is examined experimentally. Phase-locked vorticity fields, measured using Particle Image Velocimetry (PIV), are used to track vortex pairs. The physical mechanisms that explain the diversity in vectoring behaviour are observed based on the vortex trajectories. For a fixed phase difference, the vectoring behaviour is shown to be primarily influenced by pinch-off time of vortex rings generated by the synthetic jets. Beyond a certain formation number, the pinch-off timescale becomes invariant. In this region, the vectoring behaviour is determined by the distance between subsequent vortex rings. We acknowledge the financial support from the European Research Council (ERC grant agreement no. 277472).
A Soft Parallel Kinematic Mechanism.
White, Edward L; Case, Jennifer C; Kramer-Bottiglio, Rebecca
2018-02-01
In this article, we describe a novel holonomic soft robotic structure based on a parallel kinematic mechanism. The design is based on the Stewart platform, which uses six sensors and actuators to achieve full six-degree-of-freedom motion. Our design is much less complex than a traditional platform, since it replaces the 12 spherical and universal joints found in a traditional Stewart platform with a single highly deformable elastomer body and flexible actuators. This reduces the total number of parts in the system and simplifies the assembly process. Actuation is achieved through coiled-shape memory alloy actuators. State observation and feedback is accomplished through the use of capacitive elastomer strain gauges. The main structural element is an elastomer joint that provides antagonistic force. We report the response of the actuators and sensors individually, then report the response of the complete assembly. We show that the completed robotic system is able to achieve full position control, and we discuss the limitations associated with using responsive material actuators. We believe that control demonstrated on a single body in this work could be extended to chains of such bodies to create complex soft robots.
Study and simulation of a parallel numerical processing machine
International Nuclear Information System (INIS)
Bel Hadj, Slaheddine
1981-12-01
This study has been carried out in the perspective of the implementation on a minicomputer of the NEPTUNIX package (software for the resolution of very large algebra-differential equation systems). Aiming at increasing the system performance, a previous research work has shown the necessity of reducing the execution time of certain numerical computation tasks, which are of frequent use. It has also demonstrated the feasibility of handling these tasks with efficient algorithms of parallel type. The present work deals with the study and simulation of a parallel architecture processor adapted to the fast execution of these algorithms. A minicomputer fitted with a connection to such a parallel processor, has a greatly extended computing power. Then the architecture of a parallel numerical processor, based on the use of VLSI microprocessors and co-processors, is described. Its design aims at the best cost / performance ratio. The last part deals with the simulation processor with the 'CHAMBOR' program. Results show an increasing factor of 30 in speed, in comparison with the execution on a MITRA 15 minicomputer. Moreover the conflicts importance, mainly at the level of access to a shared resource is evaluated. Although this implementation has been designed having in mind a dedicated application, other uses could be envisaged, particularly for the simulation of nuclear reactors: operator guiding system, the behavioural study under accidental circumstances, etc. (author) [fr
A new parallelization algorithm of ocean model with explicit scheme
Fu, X. D.
2017-08-01
This paper will focus on the parallelization of ocean model with explicit scheme which is one of the most commonly used schemes in the discretization of governing equation of ocean model. The characteristic of explicit schema is that calculation is simple, and that the value of the given grid point of ocean model depends on the grid point at the previous time step, which means that one doesn’t need to solve sparse linear equations in the process of solving the governing equation of the ocean model. Aiming at characteristics of the explicit scheme, this paper designs a parallel algorithm named halo cells update with tiny modification of original ocean model and little change of space step and time step of the original ocean model, which can parallelize ocean model by designing transmission module between sub-domains. This paper takes the GRGO for an example to implement the parallelization of GRGO (Global Reduced Gravity Ocean model) with halo update. The result demonstrates that the higher speedup can be achieved at different problem size.
Optimal conductive constructal configurations with “parallel design”
International Nuclear Information System (INIS)
Eslami, M.
2016-01-01
Highlights: • A new parallel design is proposed for conductive cooling of heat generating rectangles. • The geometric features are optimized analytically. • The internal structure morph as a function of available conductive material. • Thermal performance is superior to the previously numerically optimized designs. - Abstract: Today, conductive volume to point cooling of heat generating bodies is under investigation as an alternative method for thermal management of electronic chipsets with high power density. In this paper, a new simple geometry called “parallel design” is proposed for effective conductive cooling of rectangular heat generating bodies. This configuration tries to minimize the thermal resistance associated with the temperature drop inside the heat generating volume. The geometric features of the design are all optimized analytically and expressed with simple explicit equations. It is proved that optimal number of parallel links is equal to the thermal conductivity ratio multiplied by the porosity (or the volume ratio). With the universal aspect ratio of H/L = 2, total thermal resistance of the present parallel design is lower than the recently proposed networks of various shapes that are optimized with help of numerical simulations; especially when more conducting material is available.
Productive Parallel Programming: The PCN Approach
Directory of Open Access Journals (Sweden)
Ian Foster
1992-01-01
Full Text Available We describe the PCN programming system, focusing on those features designed to improve the productivity of scientists and engineers using parallel supercomputers. These features include a simple notation for the concise specification of concurrent algorithms, the ability to incorporate existing Fortran and C code into parallel applications, facilities for reusing parallel program components, a portable toolkit that allows applications to be developed on a workstation or small parallel computer and run unchanged on supercomputers, and integrated debugging and performance analysis tools. We survey representative scientific applications and identify problem classes for which PCN has proved particularly useful.
Prabhat
2014-01-01
Gain Critical Insight into the Parallel I/O EcosystemParallel I/O is an integral component of modern high performance computing (HPC), especially in storing and processing very large datasets to facilitate scientific discovery. Revealing the state of the art in this field, High Performance Parallel I/O draws on insights from leading practitioners, researchers, software architects, developers, and scientists who shed light on the parallel I/O ecosystem.The first part of the book explains how large-scale HPC facilities scope, configure, and operate systems, with an emphasis on choices of I/O har
Parallel, Rapid Diffuse Optical Tomography of Breast
National Research Council Canada - National Science Library
Yodh, Arjun
2001-01-01
During the last year we have experimentally and computationally investigated rapid acquisition and analysis of informationally dense diffuse optical data sets in the parallel plate compressed breast geometry...
Parallel, Rapid Diffuse Optical Tomography of Breast
National Research Council Canada - National Science Library
Yodh, Arjun
2002-01-01
During the last year we have experimentally and computationally investigated rapid acquisition and analysis of informationally dense diffuse optical data sets in the parallel plate compressed breast geometry...
Parallel auto-correlative statistics with VTK.
Energy Technology Data Exchange (ETDEWEB)
Pebay, Philippe Pierre; Bennett, Janine Camille
2013-08-01
This report summarizes existing statistical engines in VTK and presents both the serial and parallel auto-correlative statistics engines. It is a sequel to [PT08, BPRT09b, PT09, BPT09, PT10] which studied the parallel descriptive, correlative, multi-correlative, principal component analysis, contingency, k-means, and order statistics engines. The ease of use of the new parallel auto-correlative statistics engine is illustrated by the means of C++ code snippets and algorithm verification is provided. This report justifies the design of the statistics engines with parallel scalability in mind, and provides scalability and speed-up analysis results for the autocorrelative statistics engine.
Conformal pure radiation with parallel rays
International Nuclear Information System (INIS)
Leistner, Thomas; Paweł Nurowski
2012-01-01
We define pure radiation metrics with parallel rays to be n-dimensional pseudo-Riemannian metrics that admit a parallel null line bundle K and whose Ricci tensor vanishes on vectors that are orthogonal to K. We give necessary conditions in terms of the Weyl, Cotton and Bach tensors for a pseudo-Riemannian metric to be conformal to a pure radiation metric with parallel rays. Then, we derive conditions in terms of the tractor calculus that are equivalent to the existence of a pure radiation metric with parallel rays in a conformal class. We also give analogous results for n-dimensional pseudo-Riemannian pp-waves. (paper)
Compiling Scientific Programs for Scalable Parallel Systems
National Research Council Canada - National Science Library
Kennedy, Ken
2001-01-01
...). The research performed in this project included new techniques for recognizing implicit parallelism in sequential programs, a powerful and precise set-based framework for analysis and transformation...
Parallel thermal radiation transport in two dimensions
International Nuclear Information System (INIS)
Smedley-Stevenson, R.P.; Ball, S.R.
2003-01-01
This paper describes the distributed memory parallel implementation of a deterministic thermal radiation transport algorithm in a 2-dimensional ALE hydrodynamics code. The parallel algorithm consists of a variety of components which are combined in order to produce a state of the art computational capability, capable of solving large thermal radiation transport problems using Blue-Oak, the 3 Tera-Flop MPP (massive parallel processors) computing facility at AWE (United Kingdom). Particular aspects of the parallel algorithm are described together with examples of the performance on some challenging applications. (author)
Parallel Algorithms for the Exascale Era
Energy Technology Data Exchange (ETDEWEB)
Robey, Robert W. [Los Alamos National Laboratory
2016-10-19
New parallel algorithms are needed to reach the Exascale level of parallelism with millions of cores. We look at some of the research developed by students in projects at LANL. The research blends ideas from the early days of computing while weaving in the fresh approach brought by students new to the field of high performance computing. We look at reproducibility of global sums and why it is important to parallel computing. Next we look at how the concept of hashing has led to the development of more scalable algorithms suitable for next-generation parallel computers. Nearly all of this work has been done by undergraduates and published in leading scientific journals.
Parallel thermal radiation transport in two dimensions
Energy Technology Data Exchange (ETDEWEB)
Smedley-Stevenson, R.P.; Ball, S.R. [AWE Aldermaston (United Kingdom)
2003-07-01
This paper describes the distributed memory parallel implementation of a deterministic thermal radiation transport algorithm in a 2-dimensional ALE hydrodynamics code. The parallel algorithm consists of a variety of components which are combined in order to produce a state of the art computational capability, capable of solving large thermal radiation transport problems using Blue-Oak, the 3 Tera-Flop MPP (massive parallel processors) computing facility at AWE (United Kingdom). Particular aspects of the parallel algorithm are described together with examples of the performance on some challenging applications. (author)
Structured Parallel Programming Patterns for Efficient Computation
McCool, Michael; Robison, Arch
2012-01-01
Programming is now parallel programming. Much as structured programming revolutionized traditional serial programming decades ago, a new kind of structured programming, based on patterns, is relevant to parallel programming today. Parallel computing experts and industry insiders Michael McCool, Arch Robison, and James Reinders describe how to design and implement maintainable and efficient parallel algorithms using a pattern-based approach. They present both theory and practice, and give detailed concrete examples using multiple programming models. Examples are primarily given using two of th
Parallel Computing for Brain Simulation.
Pastur-Romay, L A; Porto-Pazos, A B; Cedron, F; Pazos, A
2017-01-01
The human brain is the most complex system in the known universe, it is therefore one of the greatest mysteries. It provides human beings with extraordinary abilities. However, until now it has not been understood yet how and why most of these abilities are produced. For decades, researchers have been trying to make computers reproduce these abilities, focusing on both understanding the nervous system and, on processing data in a more efficient way than before. Their aim is to make computers process information similarly to the brain. Important technological developments and vast multidisciplinary projects have allowed creating the first simulation with a number of neurons similar to that of a human brain. This paper presents an up-to-date review about the main research projects that are trying to simulate and/or emulate the human brain. They employ different types of computational models using parallel computing: digital models, analog models and hybrid models. This review includes the current applications of these works, as well as future trends. It is focused on various works that look for advanced progress in Neuroscience and still others which seek new discoveries in Computer Science (neuromorphic hardware, machine learning techniques). Their most outstanding characteristics are summarized and the latest advances and future plans are presented. In addition, this review points out the importance of considering not only neurons: Computational models of the brain should also include glial cells, given the proven importance of astrocytes in information processing. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.
von Davier, Matthias
2016-01-01
This report presents results on a parallel implementation of the expectation-maximization (EM) algorithm for multidimensional latent variable models. The developments presented here are based on code that parallelizes both the E step and the M step of the parallel-E parallel-M algorithm. Examples presented in this report include item response…
The language parallel Pascal and other aspects of the massively parallel processor
Reeves, A. P.; Bruner, J. D.
1982-01-01
A high level language for the Massively Parallel Processor (MPP) was designed. This language, called Parallel Pascal, is described in detail. A description of the language design, a description of the intermediate language, Parallel P-Code, and details for the MPP implementation are included. Formal descriptions of Parallel Pascal and Parallel P-Code are given. A compiler was developed which converts programs in Parallel Pascal into the intermediate Parallel P-Code language. The code generator to complete the compiler for the MPP is being developed independently. A Parallel Pascal to Pascal translator was also developed. The architecture design for a VLSI version of the MPP was completed with a description of fault tolerant interconnection networks. The memory arrangement aspects of the MPP are discussed and a survey of other high level languages is given.
The convergence of parallel Boltzmann machines
Zwietering, P.J.; Aarts, E.H.L.; Eckmiller, R.; Hartmann, G.; Hauske, G.
1990-01-01
We discuss the main results obtained in a study of a mathematical model of synchronously parallel Boltzmann machines. We present supporting evidence for the conjecture that a synchronously parallel Boltzmann machine maximizes a consensus function that consists of a weighted sum of the regular
Customizable Memory Schemes for Data Parallel Architectures
Gou, C.
2011-01-01
Memory system efficiency is crucial for any processor to achieve high performance, especially in the case of data parallel machines. Processing capabilities of parallel lanes will be wasted, when data requests are not accomplished in a sustainable and timely manner. Irregular vector memory accesses
Parallel Narrative Structure in Paul Harding's "Tinkers"
Çirakli, Mustafa Zeki
2014-01-01
The present paper explores the implications of parallel narrative structure in Paul Harding's "Tinkers" (2009). Besides primarily recounting the two sets of parallel narratives, "Tinkers" also comprises of seemingly unrelated fragments such as excerpts from clock repair manuals and diaries. The main stories, however, told…
Streaming nested data parallelism on multicores
DEFF Research Database (Denmark)
Madsen, Frederik Meisner; Filinski, Andrzej
2016-01-01
The paradigm of nested data parallelism (NDP) allows a variety of semi-regular computation tasks to be mapped onto SIMD-style hardware, including GPUs and vector units. However, some care is needed to keep down space consumption in situations where the available parallelism may vastly exceed...
Bayer image parallel decoding based on GPU
Hu, Rihui; Xu, Zhiyong; Wei, Yuxing; Sun, Shaohua
2012-11-01
In the photoelectrical tracking system, Bayer image is decompressed in traditional method, which is CPU-based. However, it is too slow when the images become large, for example, 2K×2K×16bit. In order to accelerate the Bayer image decoding, this paper introduces a parallel speedup method for NVIDA's Graphics Processor Unit (GPU) which supports CUDA architecture. The decoding procedure can be divided into three parts: the first is serial part, the second is task-parallelism part, and the last is data-parallelism part including inverse quantization, inverse discrete wavelet transform (IDWT) as well as image post-processing part. For reducing the execution time, the task-parallelism part is optimized by OpenMP techniques. The data-parallelism part could advance its efficiency through executing on the GPU as CUDA parallel program. The optimization techniques include instruction optimization, shared memory access optimization, the access memory coalesced optimization and texture memory optimization. In particular, it can significantly speed up the IDWT by rewriting the 2D (Tow-dimensional) serial IDWT into 1D parallel IDWT. Through experimenting with 1K×1K×16bit Bayer image, data-parallelism part is 10 more times faster than CPU-based implementation. Finally, a CPU+GPU heterogeneous decompression system was designed. The experimental result shows that it could achieve 3 to 5 times speed increase compared to the CPU serial method.
Parallelization of TMVA Machine Learning Algorithms
Hajili, Mammad
2017-01-01
This report reflects my work on Parallelization of TMVA Machine Learning Algorithms integrated to ROOT Data Analysis Framework during summer internship at CERN. The report consists of 4 impor- tant part - data set used in training and validation, algorithms that multiprocessing applied on them, parallelization techniques and re- sults of execution time changes due to number of workers.
17 CFR 12.24 - Parallel proceedings.
2010-04-01
...) Definition. For purposes of this section, a parallel proceeding shall include: (1) An arbitration proceeding... the receivership includes the resolution of claims made by customers; or (3) A petition filed under... any of the foregoing with knowledge of a parallel proceeding shall promptly notify the Commission, by...
Parallel S/sub n/ iteration schemes
International Nuclear Information System (INIS)
Wienke, B.R.; Hiromoto, R.E.
1986-01-01
The iterative, multigroup, discrete ordinates (S/sub n/) technique for solving the linear transport equation enjoys widespread usage and appeal. Serial iteration schemes and numerical algorithms developed over the years provide a timely framework for parallel extension. On the Denelcor HEP, the authors investigate three parallel iteration schemes for solving the one-dimensional S/sub n/ transport equation. The multigroup representation and serial iteration methods are also reviewed. This analysis represents a first attempt to extend serial S/sub n/ algorithms to parallel environments and provides good baseline estimates on ease of parallel implementation, relative algorithm efficiency, comparative speedup, and some future directions. The authors examine ordered and chaotic versions of these strategies, with and without concurrent rebalance and diffusion acceleration. Two strategies efficiently support high degrees of parallelization and appear to be robust parallel iteration techniques. The third strategy is a weaker parallel algorithm. Chaotic iteration, difficult to simulate on serial machines, holds promise and converges faster than ordered versions of the schemes. Actual parallel speedup and efficiency are high and payoff appears substantial
Parallel Computing Strategies for Irregular Algorithms
Biswas, Rupak; Oliker, Leonid; Shan, Hongzhang; Biegel, Bryan (Technical Monitor)
2002-01-01
Parallel computing promises several orders of magnitude increase in our ability to solve realistic computationally-intensive problems, but relies on their efficient mapping and execution on large-scale multiprocessor architectures. Unfortunately, many important applications are irregular and dynamic in nature, making their effective parallel implementation a daunting task. Moreover, with the proliferation of parallel architectures and programming paradigms, the typical scientist is faced with a plethora of questions that must be answered in order to obtain an acceptable parallel implementation of the solution algorithm. In this paper, we consider three representative irregular applications: unstructured remeshing, sparse matrix computations, and N-body problems, and parallelize them using various popular programming paradigms on a wide spectrum of computer platforms ranging from state-of-the-art supercomputers to PC clusters. We present the underlying problems, the solution algorithms, and the parallel implementation strategies. Smart load-balancing, partitioning, and ordering techniques are used to enhance parallel performance. Overall results demonstrate the complexity of efficiently parallelizing irregular algorithms.
Parallel fuzzy connected image segmentation on GPU
Zhuge, Ying; Cao, Yong; Udupa, Jayaram K.; Miller, Robert W.
2011-01-01
Purpose: Image segmentation techniques using fuzzy connectedness (FC) principles have shown their effectiveness in segmenting a variety of objects in several large applications. However, one challenge in these algorithms has been their excessive computational requirements when processing large image datasets. Nowadays, commodity graphics hardware provides a highly parallel computing environment. In this paper, the authors present a parallel fuzzy connected image segmentation algorithm impleme...
Non-Cartesian parallel imaging reconstruction.
Wright, Katherine L; Hamilton, Jesse I; Griswold, Mark A; Gulani, Vikas; Seiberlich, Nicole
2014-11-01
Non-Cartesian parallel imaging has played an important role in reducing data acquisition time in MRI. The use of non-Cartesian trajectories can enable more efficient coverage of k-space, which can be leveraged to reduce scan times. These trajectories can be undersampled to achieve even faster scan times, but the resulting images may contain aliasing artifacts. Just as Cartesian parallel imaging can be used to reconstruct images from undersampled Cartesian data, non-Cartesian parallel imaging methods can mitigate aliasing artifacts by using additional spatial encoding information in the form of the nonhomogeneous sensitivities of multi-coil phased arrays. This review will begin with an overview of non-Cartesian k-space trajectories and their sampling properties, followed by an in-depth discussion of several selected non-Cartesian parallel imaging algorithms. Three representative non-Cartesian parallel imaging methods will be described, including Conjugate Gradient SENSE (CG SENSE), non-Cartesian generalized autocalibrating partially parallel acquisition (GRAPPA), and Iterative Self-Consistent Parallel Imaging Reconstruction (SPIRiT). After a discussion of these three techniques, several potential promising clinical applications of non-Cartesian parallel imaging will be covered. © 2014 Wiley Periodicals, Inc.
Parallel Algorithms for Groebner-Basis Reduction
1987-09-25
22209 ELEMENT NO. NO. NO. ACCESSION NO. 11. TITLE (Include Security Classification) * PARALLEL ALGORITHMS FOR GROEBNER -BASIS REDUCTION 12. PERSONAL...All other editions are obsolete. Productivity Engineering in the UNIXt Environment p Parallel Algorithms for Groebner -Basis Reduction Technical Report
Parallel knock-out schemes in networks
Broersma, H.J.; Fomin, F.V.; Woeginger, G.J.
2004-01-01
We consider parallel knock-out schemes, a procedure on graphs introduced by Lampert and Slater in 1997 in which each vertex eliminates exactly one of its neighbors in each round. We are considering cases in which after a finite number of rounds, where the minimimum number is called the parallel
Building a parallel file system simulator
International Nuclear Information System (INIS)
Molina-Estolano, E; Maltzahn, C; Brandt, S A; Bent, J
2009-01-01
Parallel file systems are gaining in popularity in high-end computing centers as well as commercial data centers. High-end computing systems are expected to scale exponentially and to pose new challenges to their storage scalability in terms of cost and power. To address these challenges scientists and file system designers will need a thorough understanding of the design space of parallel file systems. Yet there exist few systematic studies of parallel file system behavior at petabyte- and exabyte scale. An important reason is the significant cost of getting access to large-scale hardware to test parallel file systems. To contribute to this understanding we are building a parallel file system simulator that can simulate parallel file systems at very large scale. Our goal is to simulate petabyte-scale parallel file systems on a small cluster or even a single machine in reasonable time and fidelity. With this simulator, file system experts will be able to tune existing file systems for specific workloads, scientists and file system deployment engineers will be able to better communicate workload requirements, file system designers and researchers will be able to try out design alternatives and innovations at scale, and instructors will be able to study very large-scale parallel file system behavior in the class room. In this paper we describe our approach and provide preliminary results that are encouraging both in terms of fidelity and simulation scalability.
Efficient sequential and parallel algorithms for record linkage.
Mamun, Abdullah-Al; Mi, Tian; Aseltine, Robert; Rajasekaran, Sanguthevar
2014-01-01
Integrating data from multiple sources is a crucial and challenging problem. Even though there exist numerous algorithms for record linkage or deduplication, they suffer from either large time needs or restrictions on the number of datasets that they can integrate. In this paper we report efficient sequential and parallel algorithms for record linkage which handle any number of datasets and outperform previous algorithms. Our algorithms employ hierarchical clustering algorithms as the basis. A key idea that we use is radix sorting on certain attributes to eliminate identical records before any further processing. Another novel idea is to form a graph that links similar records and find the connected components. Our sequential and parallel algorithms have been tested on a real dataset of 1,083,878 records and synthetic datasets ranging in size from 50,000 to 9,000,000 records. Our sequential algorithm runs at least two times faster, for any dataset, than the previous best-known algorithm, the two-phase algorithm using faster computation of the edit distance (TPA (FCED)). The speedups obtained by our parallel algorithm are almost linear. For example, we get a speedup of 7.5 with 8 cores (residing in a single node), 14.1 with 16 cores (residing in two nodes), and 26.4 with 32 cores (residing in four nodes). We have compared the performance of our sequential algorithm with TPA (FCED) and found that our algorithm outperforms the previous one. The accuracy is the same as that of this previous best-known algorithm.
Apparatus to examine pulsed parallel field losses in large conductors
International Nuclear Information System (INIS)
Miller, J.R.; Shen, S.S.
1977-01-01
Conductors in tokamak toroidal field coils will be exposed to pulsed fields both parallel and perpendicular to the current direction. These conductors will likely be quite high capacity (10 to 20 kA) and therefore probably will be built up out of smaller units. We have previously published measurements of losses in conductors exposed to a pulsed parallel field, but those experiments necessarily used monolithic conductors of relatively small cross section because the pulse coil, a torus that surrounded the test conductor, was itself small. Here we describe an apparatus that is conceptually similar but has been scaled up to accept conductors of much larger cross section and current capacity. The apparatus consists basically of a superconducting torus that contains a movable spool to allow test samples to be wound inside without unwinding the torus. Details of apparatus design and capabilities are described and preliminary results from tests of the apparatus and from loss measurements using it are reported
A Robust Parallel Algorithm for Combinatorial Compressed Sensing
Mendoza-Smith, Rodrigo; Tanner, Jared W.; Wechsung, Florian
2018-04-01
In previous work two of the authors have shown that a vector $x \\in \\mathbb{R}^n$ with at most $k Parallel-$\\ell_0$ decoding algorithm, where $\\mathrm{nnz}(A)$ denotes the number of nonzero entries in $A \\in \\mathbb{R}^{m \\times n}$. In this paper we present the Robust-$\\ell_0$ decoding algorithm, which robustifies Parallel-$\\ell_0$ when the sketch $Ax$ is corrupted by additive noise. This robustness is achieved by approximating the asymptotic posterior distribution of values in the sketch given its corrupted measurements. We provide analytic expressions that approximate these posteriors under the assumptions that the nonzero entries in the signal and the noise are drawn from continuous distributions. Numerical experiments presented show that Robust-$\\ell_0$ is superior to existing greedy and combinatorial compressed sensing algorithms in the presence of small to moderate signal-to-noise ratios in the setting of Gaussian signals and Gaussian additive noise.
Broadcasting a message in a parallel computer
Berg, Jeremy E [Rochester, MN; Faraj, Ahmad A [Rochester, MN
2011-08-02
Methods, systems, and products are disclosed for broadcasting a message in a parallel computer. The parallel computer includes a plurality of compute nodes connected together using a data communications network. The data communications network optimized for point to point data communications and is characterized by at least two dimensions. The compute nodes are organized into at least one operational group of compute nodes for collective parallel operations of the parallel computer. One compute node of the operational group assigned to be a logical root. Broadcasting a message in a parallel computer includes: establishing a Hamiltonian path along all of the compute nodes in at least one plane of the data communications network and in the operational group; and broadcasting, by the logical root to the remaining compute nodes, the logical root's message along the established Hamiltonian path.
Advanced parallel processing with supercomputer architectures
International Nuclear Information System (INIS)
Hwang, K.
1987-01-01
This paper investigates advanced parallel processing techniques and innovative hardware/software architectures that can be applied to boost the performance of supercomputers. Critical issues on architectural choices, parallel languages, compiling techniques, resource management, concurrency control, programming environment, parallel algorithms, and performance enhancement methods are examined and the best answers are presented. The authors cover advanced processing techniques suitable for supercomputers, high-end mainframes, minisupers, and array processors. The coverage emphasizes vectorization, multitasking, multiprocessing, and distributed computing. In order to achieve these operation modes, parallel languages, smart compilers, synchronization mechanisms, load balancing methods, mapping parallel algorithms, operating system functions, application library, and multidiscipline interactions are investigated to ensure high performance. At the end, they assess the potentials of optical and neural technologies for developing future supercomputers
Differences Between Distributed and Parallel Systems
Energy Technology Data Exchange (ETDEWEB)
Brightwell, R.; Maccabe, A.B.; Rissen, R.
1998-10-01
Distributed systems have been studied for twenty years and are now coming into wider use as fast networks and powerful workstations become more readily available. In many respects a massively parallel computer resembles a network of workstations and it is tempting to port a distributed operating system to such a machine. However, there are significant differences between these two environments and a parallel operating system is needed to get the best performance out of a massively parallel system. This report characterizes the differences between distributed systems, networks of workstations, and massively parallel systems and analyzes the impact of these differences on operating system design. In the second part of the report, we introduce Puma, an operating system specifically developed for massively parallel systems. We describe Puma portals, the basic building blocks for message passing paradigms implemented on top of Puma, and show how the differences observed in the first part of the report have influenced the design and implementation of Puma.
Parallel-In-Time For Moving Meshes
Energy Technology Data Exchange (ETDEWEB)
Falgout, R. D. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Manteuffel, T. A. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Southworth, B. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Schroder, J. B. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)
2016-02-04
With steadily growing computational resources available, scientists must develop e ective ways to utilize the increased resources. High performance, highly parallel software has be- come a standard. However until recent years parallelism has focused primarily on the spatial domain. When solving a space-time partial di erential equation (PDE), this leads to a sequential bottleneck in the temporal dimension, particularly when taking a large number of time steps. The XBraid parallel-in-time library was developed as a practical way to add temporal parallelism to existing se- quential codes with only minor modi cations. In this work, a rezoning-type moving mesh is applied to a di usion problem and formulated in a parallel-in-time framework. Tests and scaling studies are run using XBraid and demonstrate excellent results for the simple model problem considered herein.
Parallel programming with Easy Java Simulations
Esquembre, F.; Christian, W.; Belloni, M.
2018-01-01
Nearly all of today's processors are multicore, and ideally programming and algorithm development utilizing the entire processor should be introduced early in the computational physics curriculum. Parallel programming is often not introduced because it requires a new programming environment and uses constructs that are unfamiliar to many teachers. We describe how we decrease the barrier to parallel programming by using a java-based programming environment to treat problems in the usual undergraduate curriculum. We use the easy java simulations programming and authoring tool to create the program's graphical user interface together with objects based on those developed by Kaminsky [Building Parallel Programs (Course Technology, Boston, 2010)] to handle common parallel programming tasks. Shared-memory parallel implementations of physics problems, such as time evolution of the Schrödinger equation, are available as source code and as ready-to-run programs from the AAPT-ComPADRE digital library.
Multilevel Parallelization of AutoDock 4.2
Directory of Open Access Journals (Sweden)
Norgan Andrew P
2011-04-01
Full Text Available Abstract Background Virtual (computational screening is an increasingly important tool for drug discovery. AutoDock is a popular open-source application for performing molecular docking, the prediction of ligand-receptor interactions. AutoDock is a serial application, though several previous efforts have parallelized various aspects of the program. In this paper, we report on a multi-level parallelization of AutoDock 4.2 (mpAD4. Results Using MPI and OpenMP, AutoDock 4.2 was parallelized for use on MPI-enabled systems and to multithread the execution of individual docking jobs. In addition, code was implemented to reduce input/output (I/O traffic by reusing grid maps at each node from docking to docking. Performance of mpAD4 was examined on two multiprocessor computers. Conclusions Using MPI with OpenMP multithreading, mpAD4 scales with near linearity on the multiprocessor systems tested. In situations where I/O is limiting, reuse of grid maps reduces both system I/O and overall screening time. Multithreading of AutoDock's Lamarkian Genetic Algorithm with OpenMP increases the speed of execution of individual docking jobs, and when combined with MPI parallelization can significantly reduce the execution time of virtual screens. This work is significant in that mpAD4 speeds the execution of certain molecular docking workloads and allows the user to optimize the degree of system-level (MPI and node-level (OpenMP parallelization to best fit both workloads and computational resources.
Multilevel Parallelization of AutoDock 4.2.
Norgan, Andrew P; Coffman, Paul K; Kocher, Jean-Pierre A; Katzmann, David J; Sosa, Carlos P
2011-04-28
Virtual (computational) screening is an increasingly important tool for drug discovery. AutoDock is a popular open-source application for performing molecular docking, the prediction of ligand-receptor interactions. AutoDock is a serial application, though several previous efforts have parallelized various aspects of the program. In this paper, we report on a multi-level parallelization of AutoDock 4.2 (mpAD4). Using MPI and OpenMP, AutoDock 4.2 was parallelized for use on MPI-enabled systems and to multithread the execution of individual docking jobs. In addition, code was implemented to reduce input/output (I/O) traffic by reusing grid maps at each node from docking to docking. Performance of mpAD4 was examined on two multiprocessor computers. Using MPI with OpenMP multithreading, mpAD4 scales with near linearity on the multiprocessor systems tested. In situations where I/O is limiting, reuse of grid maps reduces both system I/O and overall screening time. Multithreading of AutoDock's Lamarkian Genetic Algorithm with OpenMP increases the speed of execution of individual docking jobs, and when combined with MPI parallelization can significantly reduce the execution time of virtual screens. This work is significant in that mpAD4 speeds the execution of certain molecular docking workloads and allows the user to optimize the degree of system-level (MPI) and node-level (OpenMP) parallelization to best fit both workloads and computational resources.
Parallel computation of rotating flows
DEFF Research Database (Denmark)
Lundin, Lars Kristian; Barker, Vincent A.; Sørensen, Jens Nørkær
1999-01-01
This paper deals with the simulation of 3‐D rotating flows based on the velocity‐vorticity formulation of the Navier‐Stokes equations in cylindrical coordinates. The governing equations are discretized by a finite difference method. The solution is advanced to a new time level by a two‐step process....... In the first step, the vorticity at the new time level is computed using the velocity at the previous time level. In the second step, the velocity at the new time level is computed using the new vorticity. We discuss here the second part which is by far the most time‐consuming. The numerical problem...
Arkin, Ethem; Tekinerdogan, Bedir
2016-01-01
Mapping parallel algorithms to parallel computing platforms requires several activities such as the analysis of the parallel algorithm, the definition of the logical configuration of the platform, the mapping of the algorithm to the logical configuration platform and the implementation of the
Impact of previously disadvantaged land-users on sustainable ...
African Journals Online (AJOL)
Impact of previously disadvantaged land-users on sustainable agricultural ... about previously disadvantaged land users involved in communal farming systems ... of input, capital, marketing, information and land use planning, with effect on ...
The Effects of Stress and Executive Functions on Decision Making in an Executive Parallel Task
McGuigan, Brian
2016-01-01
The aim of this study was to investigate the effects of acute stress on parallel task performance with the Game of Dice Task (GDT) to measure decision making and the Stroop test. Two previous studies have found that the combination of stress and a parallel task with the GDT and an executive functions task preserved performance on the GDT for a stress group compared to a control group. The purpose of this study was to create and use a new parallel task with the GDT and the stroop test to elu...
22 CFR 40.91 - Certain aliens previously removed.
2010-04-01
... 22 Foreign Relations 1 2010-04-01 2010-04-01 false Certain aliens previously removed. 40.91... IMMIGRANTS UNDER THE IMMIGRATION AND NATIONALITY ACT, AS AMENDED Aliens Previously Removed § 40.91 Certain aliens previously removed. (a) 5-year bar. An alien who has been found inadmissible, whether as a result...
Portable parallel programming in a Fortran environment
International Nuclear Information System (INIS)
May, E.N.
1989-01-01
Experience using the Argonne-developed PARMACs macro package to implement a portable parallel programming environment is described. Fortran programs with intrinsic parallelism of coarse and medium granularity are easily converted to parallel programs which are portable among a number of commercially available parallel processors in the class of shared-memory bus-based and local-memory network based MIMD processors. The parallelism is implemented using standard UNIX (tm) tools and a small number of easily understood synchronization concepts (monitors and message-passing techniques) to construct and coordinate multiple cooperating processes on one or many processors. Benchmark results are presented for parallel computers such as the Alliant FX/8, the Encore MultiMax, the Sequent Balance, the Intel iPSC/2 Hypercube and a network of Sun 3 workstations. These parallel machines are typical MIMD types with from 8 to 30 processors, each rated at from 1 to 10 MIPS processing power. The demonstration code used for this work is a Monte Carlo simulation of the response to photons of a ''nearly realistic'' lead, iron and plastic electromagnetic and hadronic calorimeter, using the EGS4 code system. 6 refs., 2 figs., 2 tabs
Performance of the Galley Parallel File System
Nieuwejaar, Nils; Kotz, David
1996-01-01
As the input/output (I/O) needs of parallel scientific applications increase, file systems for multiprocessors are being designed to provide applications with parallel access to multiple disks. Many parallel file systems present applications with a conventional Unix-like interface that allows the application to access multiple disks transparently. This interface conceals the parallism within the file system, which increases the ease of programmability, but makes it difficult or impossible for sophisticated programmers and libraries to use knowledge about their I/O needs to exploit that parallelism. Furthermore, most current parallel file systems are optimized for a different workload than they are being asked to support. We introduce Galley, a new parallel file system that is intended to efficiently support realistic parallel workloads. Initial experiments, reported in this paper, indicate that Galley is capable of providing high-performance 1/O to applications the applications that rely on them. In Section 3 we describe that access data in patterns that have been observed to be common.
Determining root correspondence between previously and newly detected objects
Paglieroni, David W.; Beer, N Reginald
2014-06-17
A system that applies attribute and topology based change detection to networks of objects that were detected on previous scans of a structure, roadway, or area of interest. The attributes capture properties or characteristics of the previously detected objects, such as location, time of detection, size, elongation, orientation, etc. The topology of the network of previously detected objects is maintained in a constellation database that stores attributes of previously detected objects and implicitly captures the geometrical structure of the network. A change detection system detects change by comparing the attributes and topology of new objects detected on the latest scan to the constellation database of previously detected objects.
The kpx, a program analyzer for parallelization
International Nuclear Information System (INIS)
Matsuyama, Yuji; Orii, Shigeo; Ota, Toshiro; Kume, Etsuo; Aikawa, Hiroshi.
1997-03-01
The kpx is a program analyzer, developed as a common technological basis for promoting parallel processing. The kpx consists of three tools. The first is ktool, that shows how much execution time is spent in program segments. The second is ptool, that shows parallelization overhead on the Paragon system. The last is xtool, that shows parallelization overhead on the VPP system. The kpx, designed to work for any FORTRAN cord on any UNIX computer, is confirmed to work well after testing on Paragon, SP2, SR2201, VPP500, VPP300, Monte-4, SX-4 and T90. (author)
Synchronization Of Parallel Discrete Event Simulations
Steinman, Jeffrey S.
1992-01-01
Adaptive, parallel, discrete-event-simulation-synchronization algorithm, Breathing Time Buckets, developed in Synchronous Parallel Environment for Emulation and Discrete Event Simulation (SPEEDES) operating system. Algorithm allows parallel simulations to process events optimistically in fluctuating time cycles that naturally adapt while simulation in progress. Combines best of optimistic and conservative synchronization strategies while avoiding major disadvantages. Algorithm processes events optimistically in time cycles adapting while simulation in progress. Well suited for modeling communication networks, for large-scale war games, for simulated flights of aircraft, for simulations of computer equipment, for mathematical modeling, for interactive engineering simulations, and for depictions of flows of information.
Multistage parallel-serial time averaging filters
International Nuclear Information System (INIS)
Theodosiou, G.E.
1980-01-01
Here, a new time averaging circuit design, the 'parallel filter' is presented, which can reduce the time jitter, introduced in time measurements using counters of large dimensions. This parallel filter could be considered as a single stage unit circuit which can be repeated an arbitrary number of times in series, thus providing a parallel-serial filter type as a result. The main advantages of such a filter over a serial one are much less electronic gate jitter and time delay for the same amount of total time uncertainty reduction. (orig.)
Implementations of BLAST for parallel computers.
Jülich, A
1995-02-01
The BLAST sequence comparison programs have been ported to a variety of parallel computers-the shared memory machine Cray Y-MP 8/864 and the distributed memory architectures Intel iPSC/860 and nCUBE. Additionally, the programs were ported to run on workstation clusters. We explain the parallelization techniques and consider the pros and cons of these methods. The BLAST programs are very well suited for parallelization for a moderate number of processors. We illustrate our results using the program blastp as an example. As input data for blastp, a 799 residue protein query sequence and the protein database PIR were used.
Speedup predictions on large scientific parallel programs
International Nuclear Information System (INIS)
Williams, E.; Bobrowicz, F.
1985-01-01
How much speedup can we expect for large scientific parallel programs running on supercomputers. For insight into this problem we extend the parallel processing environment currently existing on the Cray X-MP (a shared memory multiprocessor with at most four processors) to a simulated N-processor environment, where N greater than or equal to 1. Several large scientific parallel programs from Los Alamos National Laboratory were run in this simulated environment, and speedups were predicted. A speedup of 14.4 on 16 processors was measured for one of the three most used codes at the Laboratory
Language constructs for modular parallel programs
Energy Technology Data Exchange (ETDEWEB)
Foster, I.
1996-03-01
We describe programming language constructs that facilitate the application of modular design techniques in parallel programming. These constructs allow us to isolate resource management and processor scheduling decisions from the specification of individual modules, which can themselves encapsulate design decisions concerned with concurrence, communication, process mapping, and data distribution. This approach permits development of libraries of reusable parallel program components and the reuse of these components in different contexts. In particular, alternative mapping strategies can be explored without modifying other aspects of program logic. We describe how these constructs are incorporated in two practical parallel programming languages, PCN and Fortran M. Compilers have been developed for both languages, allowing experimentation in substantial applications.
Distributed parallel messaging for multiprocessor systems
Chen, Dong; Heidelberger, Philip; Salapura, Valentina; Senger, Robert M; Steinmacher-Burrow, Burhard; Sugawara, Yutaka
2013-06-04
A method and apparatus for distributed parallel messaging in a parallel computing system. The apparatus includes, at each node of a multiprocessor network, multiple injection messaging engine units and reception messaging engine units, each implementing a DMA engine and each supporting both multiple packet injection into and multiple reception from a network, in parallel. The reception side of the messaging unit (MU) includes a switch interface enabling writing of data of a packet received from the network to the memory system. The transmission side of the messaging unit, includes switch interface for reading from the memory system when injecting packets into the network.
Massively parallel Fokker-Planck code ALLAp
International Nuclear Information System (INIS)
Batishcheva, A.A.; Krasheninnikov, S.I.; Craddock, G.G.; Djordjevic, V.
1996-01-01
The recently developed for workstations Fokker-Planck code ALLA simulates the temporal evolution of 1V, 2V and 1D2V collisional edge plasmas. In this work we present the results of code parallelization on the CRI T3D massively parallel platform (ALLAp version). Simultaneously we benchmark the 1D2V parallel vesion against an analytic self-similar solution of the collisional kinetic equation. This test is not trivial as it demands a very strong spatial temperature and density variation within the simulation domain. (orig.)
A parallel coordinates style interface for exploratory volume visualization.
Tory, Melanie; Potts, Simeon; Möller, Torsten
2005-01-01
We present a user interface, based on parallel coordinates, that facilitates exploration of volume data. By explicitly representing the visualization parameter space, the interface provides an overview of rendering options and enables users to easily explore different parameters. Rendered images are stored in an integrated history bar that facilitates backtracking to previous visualization options. Initial usability testing showed clear agreement between users and experts of various backgrounds (usability, graphic design, volume visualization, and medical physics) that the proposed user interface is a valuable data exploration tool.
Stability of unstably stratified shear flow between parallel plates
Energy Technology Data Exchange (ETDEWEB)
Fujimura, Kaoru; Kelly, R E
1987-09-01
The linear stability of unstably stratified shear flows between two horizontal parallel plates was investigated. Eigenvalue problems were solved numerically by making use of the expansion method in Chebyshev polynomials, and the critical Rayleigh numbers were obtained accurately in the Reynolds number range of (0.01, 100). It was found that the critical Rayleigh number increases with an increase of the Reynolds number. The result strongly supports previous stability analyses except for the analysis by Makino and Ishikawa (J. Jpn. Soc. Fluid Mech. 4 (1985) 148 - 158) in which a decrease of the critical Rayleigh number was obtained.
Stability of unstably stratified shear flow between parallel plates
International Nuclear Information System (INIS)
Fujimura, Kaoru; Kelly, R.E.
1987-01-01
The linear stability of unstably stratified shear flows between two horizontal parallel plates was investigated. Eigenvalue problems were solved numerically by making use of the expansion method in Chebyshev polynomials, and the critical Rayleigh numbers were obtained accurately in the Reynolds number range of [0.01, 100]. It was found that the critical Rayleigh number increases with an increase of the Reynolds number. The result strongly supports previous stability analyses except for the analysis by Makino and Ishikawa [J. Jpn. Soc. Fluid Mech. 4 (1985) 148 - 158] in which a decrease of the critical Rayleigh number was obtained. (author)
A model of breakdown in parallel-plate detectors
International Nuclear Information System (INIS)
Fonte, P.
1996-01-01
Parallel-plate avalanche chambers (PPAC's) have many desirable properties, such as a fast, large area particle detector. However, the maximum gain is limited by a form of violent breakdown that limits the usefulness of this detector, despite its other evident qualities. The exact nature of this phenomenon is not yet sufficiently clear to sustain possible improvements. A previous experimental study is complemented in the present work by a quantitative model of the breakdown phenomenon in PPAC's, based on the streamer theory. The model reproduces well the peculiar behavior of the external current observed in PPAC's and resistive-plate chambers. Other breakdown properties measured in PPAC's are also well reproduced
A curvature theory for discrete surfaces based on mesh parallelity
Bobenko, Alexander Ivanovich
2009-12-18
We consider a general theory of curvatures of discrete surfaces equipped with edgewise parallel Gauss images, and where mean and Gaussian curvatures of faces are derived from the faces\\' areas and mixed areas. Remarkably these notions are capable of unifying notable previously defined classes of surfaces, such as discrete isothermic minimal surfaces and surfaces of constant mean curvature. We discuss various types of natural Gauss images, the existence of principal curvatures, constant curvature surfaces, Christoffel duality, Koenigs nets, contact element nets, s-isothermic nets, and interesting special cases such as discrete Delaunay surfaces derived from elliptic billiards. © 2009 Springer-Verlag.
Massively Parallel Computing: A Sandia Perspective
Energy Technology Data Exchange (ETDEWEB)
Dosanjh, Sudip S.; Greenberg, David S.; Hendrickson, Bruce; Heroux, Michael A.; Plimpton, Steve J.; Tomkins, James L.; Womble, David E.
1999-05-06
The computing power available to scientists and engineers has increased dramatically in the past decade, due in part to progress in making massively parallel computing practical and available. The expectation for these machines has been great. The reality is that progress has been slower than expected. Nevertheless, massively parallel computing is beginning to realize its potential for enabling significant break-throughs in science and engineering. This paper provides a perspective on the state of the field, colored by the authors' experiences using large scale parallel machines at Sandia National Laboratories. We address trends in hardware, system software and algorithms, and we also offer our view of the forces shaping the parallel computing industry.
Parallel generation of architecture on the GPU
Steinberger, Markus; Kenzel, Michael; Kainz, Bernhard K.; Mü ller, Jö rg; Wonka, Peter; Schmalstieg, Dieter
2014-01-01
they can take advantage of, or both, our method supports state of the art procedural modeling including stochasticity and context-sensitivity. To increase parallelism, we explicitly express independence in the grammar, reduce inter-rule dependencies
New high voltage parallel plate analyzer
International Nuclear Information System (INIS)
Hamada, Y.; Kawasumi, Y.; Masai, K.; Iguchi, H.; Fujisawa, A.; Abe, Y.
1992-01-01
A new modification on the parallel plate analyzer for 500 keV heavy ions to eliminate the effect of the intense UV and visible radiations, is successfully conducted. Its principle and results are discussed. (author)
Parallel data encryption with RSA algorithm
Неретин, А. А.
2016-01-01
In this paper a parallel RSA algorithm with preliminary shuffling of source text was presented.Dependence of an encryption speed on the number of encryption nodes has been analysed, The proposed algorithm was implemented on C# language.
Data parallel sorting for particle simulation
Dagum, Leonardo
1992-01-01
Sorting on a parallel architecture is a communications intensive event which can incur a high penalty in applications where it is required. In the case of particle simulation, only integer sorting is necessary, and sequential implementations easily attain the minimum performance bound of O (N) for N particles. Parallel implementations, however, have to cope with the parallel sorting problem which, in addition to incurring a heavy communications cost, can make the minimun performance bound difficult to attain. This paper demonstrates how the sorting problem in a particle simulation can be reduced to a merging problem, and describes an efficient data parallel algorithm to solve this merging problem in a particle simulation. The new algorithm is shown to be optimal under conditions usual for particle simulation, and its fieldwise implementation on the Connection Machine is analyzed in detail. The new algorithm is about four times faster than a fieldwise implementation of radix sort on the Connection Machine.
Parallel debt in the Serbian finance law
Directory of Open Access Journals (Sweden)
Kuzman Miloš
2014-01-01
Full Text Available The purpose of this paper is to present the mechanism of parallel debt in the Serbian financial law. While considering whether the mechanism of parallel debt exists under the Serbian law, the Anglo-Saxon mechanism of trust is represented. Hence it is explained why the mechanism of trust is not allowed under the Serbian law. Further on, the mechanism of parallel debt is introduced as well as a debate on permissibility of its cause in the Serbian law. Comparative legal arguments about this issue are also presented in this paper. In conclusion, the author suggests that on the basis of the conclusions drawn in this paper, the parallel debt mechanism is to be declared admissible if it is ever taken into consideration by the Serbian courts.
Parallel Monte Carlo simulation of aerosol dynamics
Zhou, K.; He, Z.; Xiao, M.; Zhang, Z.
2014-01-01
is simulated with a stochastic method (Marcus-Lushnikov stochastic process). Operator splitting techniques are used to synthesize the deterministic and stochastic parts in the algorithm. The algorithm is parallelized using the Message Passing Interface (MPI
Stranger than fiction: parallel universes beguile science
2007-01-01
We may not be able - at least not yet - to prove they exist, many serious scientists say, but there are plenty of reasons to think that parallel dimensions are more than figments of effeaded imagination. (1/2 page)
Parallel computation of nondeterministic algorithms in VLSI
Energy Technology Data Exchange (ETDEWEB)
Hortensius, P D
1987-01-01
This work examines parallel VLSI implementations of nondeterministic algorithms. It is demonstrated that conventional pseudorandom number generators are unsuitable for highly parallel applications. Efficient parallel pseudorandom sequence generation can be accomplished using certain classes of elementary one-dimensional cellular automata. The pseudorandom numbers appear in parallel on each clock cycle. Extensive study of the properties of these new pseudorandom number generators is made using standard empirical random number tests, cycle length tests, and implementation considerations. Furthermore, it is shown these particular cellular automata can form the basis of efficient VLSI architectures for computations involved in the Monte Carlo simulation of both the percolation and Ising models from statistical mechanics. Finally, a variation on a Built-In Self-Test technique based upon cellular automata is presented. These Cellular Automata-Logic-Block-Observation (CALBO) circuits improve upon conventional design for testability circuitry.
Implementing Shared Memory Parallelism in MCBEND
Directory of Open Access Journals (Sweden)
Bird Adam
2017-01-01
Full Text Available MCBEND is a general purpose radiation transport Monte Carlo code from AMEC Foster Wheelers’s ANSWERS® Software Service. MCBEND is well established in the UK shielding community for radiation shielding and dosimetry assessments. The existing MCBEND parallel capability effectively involves running the same calculation on many processors. This works very well except when the memory requirements of a model restrict the number of instances of a calculation that will fit on a machine. To more effectively utilise parallel hardware OpenMP has been used to implement shared memory parallelism in MCBEND. This paper describes the reasoning behind the choice of OpenMP, notes some of the challenges of multi-threading an established code such as MCBEND and assesses the performance of the parallel method implemented in MCBEND.
Domain decomposition methods and parallel computing
International Nuclear Information System (INIS)
Meurant, G.
1991-01-01
In this paper, we show how to efficiently solve large linear systems on parallel computers. These linear systems arise from discretization of scientific computing problems described by systems of partial differential equations. We show how to get a discrete finite dimensional system from the continuous problem and the chosen conjugate gradient iterative algorithm is briefly described. Then, the different kinds of parallel architectures are reviewed and their advantages and deficiencies are emphasized. We sketch the problems found in programming the conjugate gradient method on parallel computers. For this algorithm to be efficient on parallel machines, domain decomposition techniques are introduced. We give results of numerical experiments showing that these techniques allow a good rate of convergence for the conjugate gradient algorithm as well as computational speeds in excess of a billion of floating point operations per second. (author). 5 refs., 11 figs., 2 tabs., 1 inset
6th International Parallel Tools Workshop
Brinkmann, Steffen; Gracia, José; Resch, Michael; Nagel, Wolfgang
2013-01-01
The latest advances in the High Performance Computing hardware have significantly raised the level of available compute performance. At the same time, the growing hardware capabilities of modern supercomputing architectures have caused an increasing complexity of the parallel application development. Despite numerous efforts to improve and simplify parallel programming, there is still a lot of manual debugging and tuning work required. This process is supported by special software tools, facilitating debugging, performance analysis, and optimization and thus making a major contribution to the development of robust and efficient parallel software. This book introduces a selection of the tools, which were presented and discussed at the 6th International Parallel Tools Workshop, held in Stuttgart, Germany, 25-26 September 2012.
Parallel processor programs in the Federal Government
Schneck, P. B.; Austin, D.; Squires, S. L.; Lehmann, J.; Mizell, D.; Wallgren, K.
1985-01-01
In 1982, a report dealing with the nation's research needs in high-speed computing called for increased access to supercomputing resources for the research community, research in computational mathematics, and increased research in the technology base needed for the next generation of supercomputers. Since that time a number of programs addressing future generations of computers, particularly parallel processors, have been started by U.S. government agencies. The present paper provides a description of the largest government programs in parallel processing. Established in fiscal year 1985 by the Institute for Defense Analyses for the National Security Agency, the Supercomputing Research Center will pursue research to advance the state of the art in supercomputing. Attention is also given to the DOE applied mathematical sciences research program, the NYU Ultracomputer project, the DARPA multiprocessor system architectures program, NSF research on multiprocessor systems, ONR activities in parallel computing, and NASA parallel processor projects.
Density functional theory and parallel processing
International Nuclear Information System (INIS)
Ward, R.C.; Geist, G.A.; Butler, W.H.
1987-01-01
The authors demonstrate a method for obtaining the ground state energies and charge densities of a system of atoms described within density functional theory using simulated annealing on a parallel computer
High performance parallel computers for science
International Nuclear Information System (INIS)
Nash, T.; Areti, H.; Atac, R.; Biel, J.; Cook, A.; Deppe, J.; Edel, M.; Fischler, M.; Gaines, I.; Hance, R.
1989-01-01
This paper reports that Fermilab's Advanced Computer Program (ACP) has been developing cost effective, yet practical, parallel computers for high energy physics since 1984. The ACP's latest developments are proceeding in two directions. A Second Generation ACP Multiprocessor System for experiments will include $3500 RISC processors each with performance over 15 VAX MIPS. To support such high performance, the new system allows parallel I/O, parallel interprocess communication, and parallel host processes. The ACP Multi-Array Processor, has been developed for theoretical physics. Each $4000 node is a FORTRAN or C programmable pipelined 20 Mflops (peak), 10 MByte single board computer. These are plugged into a 16 port crossbar switch crate which handles both inter and intra crate communication. The crates are connected in a hypercube. Site oriented applications like lattice gauge theory are supported by system software called CANOPY, which makes the hardware virtually transparent to users. A 256 node, 5 GFlop, system is under construction
Massively parallel evolutionary computation on GPGPUs
Tsutsui, Shigeyoshi
2013-01-01
Evolutionary algorithms (EAs) are metaheuristics that learn from natural collective behavior and are applied to solve optimization problems in domains such as scheduling, engineering, bioinformatics, and finance. Such applications demand acceptable solutions with high-speed execution using finite computational resources. Therefore, there have been many attempts to develop platforms for running parallel EAs using multicore machines, massively parallel cluster machines, or grid computing environments. Recent advances in general-purpose computing on graphics processing units (GPGPU) have opened u
Simulation Exploration through Immersive Parallel Planes: Preprint
Energy Technology Data Exchange (ETDEWEB)
Brunhart-Lupo, Nicholas; Bush, Brian W.; Gruchalla, Kenny; Smith, Steve
2016-03-01
We present a visualization-driven simulation system that tightly couples systems dynamics simulations with an immersive virtual environment to allow analysts to rapidly develop and test hypotheses in a high-dimensional parameter space. To accomplish this, we generalize the two-dimensional parallel-coordinates statistical graphic as an immersive 'parallel-planes' visualization for multivariate time series emitted by simulations running in parallel with the visualization. In contrast to traditional parallel coordinate's mapping the multivariate dimensions onto coordinate axes represented by a series of parallel lines, we map pairs of the multivariate dimensions onto a series of parallel rectangles. As in the case of parallel coordinates, each individual observation in the dataset is mapped to a polyline whose vertices coincide with its coordinate values. Regions of the rectangles can be 'brushed' to highlight and select observations of interest: a 'slider' control allows the user to filter the observations by their time coordinate. In an immersive virtual environment, users interact with the parallel planes using a joystick that can select regions on the planes, manipulate selection, and filter time. The brushing and selection actions are used to both explore existing data as well as to launch additional simulations corresponding to the visually selected portions of the input parameter space. As soon as the new simulations complete, their resulting observations are displayed in the virtual environment. This tight feedback loop between simulation and immersive analytics accelerates users' realization of insights about the simulation and its output.
Alternative derivation of the parallel ion viscosity
International Nuclear Information System (INIS)
Bravenec, R.V.; Berk, H.L.; Hammer, J.H.
1982-01-01
A set of double-adiabatic fluid equations with additional collisional relaxation between the ion temperatures parallel and perpendicular to a magnetic field are shown to reduce to a set involving a single temperature and a parallel viscosity. This result is applied to a recently published paper [R. V. Bravenec, A. J. Lichtenberg, M. A. Leiberman, and H. L. Berk, Phys. Fluids 24, 1320 (1981)] on viscous flow in a multiple-mirror configuration
Acoustic simulation in architecture with parallel algorithm
Li, Xiaohong; Zhang, Xinrong; Li, Dan
2004-03-01
In allusion to complexity of architecture environment and Real-time simulation of architecture acoustics, a parallel radiosity algorithm was developed. The distribution of sound energy in scene is solved with this method. And then the impulse response between sources and receivers at frequency segment, which are calculated with multi-process, are combined into whole frequency response. The numerical experiment shows that parallel arithmetic can improve the acoustic simulating efficiency of complex scene.
PARALLEL SOLUTION METHODS OF PARTIAL DIFFERENTIAL EQUATIONS
Directory of Open Access Journals (Sweden)
Korhan KARABULUT
1998-03-01
Full Text Available Partial differential equations arise in almost all fields of science and engineering. Computer time spent in solving partial differential equations is much more than that of in any other problem class. For this reason, partial differential equations are suitable to be solved on parallel computers that offer great computation power. In this study, parallel solution to partial differential equations with Jacobi, Gauss-Siedel, SOR (Succesive OverRelaxation and SSOR (Symmetric SOR algorithms is studied.
Simulation Exploration through Immersive Parallel Planes
Energy Technology Data Exchange (ETDEWEB)
Brunhart-Lupo, Nicholas J [National Renewable Energy Laboratory (NREL), Golden, CO (United States); Bush, Brian W [National Renewable Energy Laboratory (NREL), Golden, CO (United States); Gruchalla, Kenny M [National Renewable Energy Laboratory (NREL), Golden, CO (United States); Smith, Steve [Los Alamos Visualization Associates
2017-05-25
We present a visualization-driven simulation system that tightly couples systems dynamics simulations with an immersive virtual environment to allow analysts to rapidly develop and test hypotheses in a high-dimensional parameter space. To accomplish this, we generalize the two-dimensional parallel-coordinates statistical graphic as an immersive 'parallel-planes' visualization for multivariate time series emitted by simulations running in parallel with the visualization. In contrast to traditional parallel coordinate's mapping the multivariate dimensions onto coordinate axes represented by a series of parallel lines, we map pairs of the multivariate dimensions onto a series of parallel rectangles. As in the case of parallel coordinates, each individual observation in the dataset is mapped to a polyline whose vertices coincide with its coordinate values. Regions of the rectangles can be 'brushed' to highlight and select observations of interest: a 'slider' control allows the user to filter the observations by their time coordinate. In an immersive virtual environment, users interact with the parallel planes using a joystick that can select regions on the planes, manipulate selection, and filter time. The brushing and selection actions are used to both explore existing data as well as to launch additional simulations corresponding to the visually selected portions of the input parameter space. As soon as the new simulations complete, their resulting observations are displayed in the virtual environment. This tight feedback loop between simulation and immersive analytics accelerates users' realization of insights about the simulation and its output.
Caligula-Christ: Preliminary Study of a Parallel
Directory of Open Access Journals (Sweden)
Lorene M. Birden
2010-01-01
Full Text Available Caligula, at the very beginning of the Albert Camus play, conceives a very ambitious project; to surpass the gods and take their place in his empire, in order to decree impossibility. Camus has, however, gone a step further in developing the god-image of his main character through the incorporation of much Christian imagery into the scenes. This aspect of the play seems not to have been noticed by Camus scholars; there is no in-depth study of the use of this imagery. However, Camus scholar Patricia Johnson and the members of the Société des études camusiennes have noted the usefulness of the analysis presented here and the absence of it in previous research. This study, designated as “preliminary,” attempts to prompt further analyses of the question and offers different approaches. It proceeds by intertextual study of Caligula and the gospels (here referred to in Revised Standard Version and brings out aspects of the emperor’s intentions that expose a combination of perversion and similarity in relation to deity. It briefly outlines the sources of this parallel and the reasons for creating it, then details the parallels that show first the reversal of the image of Jesus, then the striking consonance. It ends with interpretations of the parallels and concludes with commentaries on the use of irony to create them.
Application of parallelized software architecture to an autonomous ground vehicle
Shakya, Rahul; Wright, Adam; Shin, Young Ho; Momin, Orko; Petkovsek, Steven; Wortman, Paul; Gautam, Prasanna; Norton, Adam
2011-01-01
This paper presents improvements made to Q, an autonomous ground vehicle designed to participate in the Intelligent Ground Vehicle Competition (IGVC). For the 2010 IGVC, Q was upgraded with a new parallelized software architecture and a new vision processor. Improvements were made to the power system reducing the number of batteries required for operation from six to one. In previous years, a single state machine was used to execute the bulk of processing activities including sensor interfacing, data processing, path planning, navigation algorithms and motor control. This inefficient approach led to poor software performance and made it difficult to maintain or modify. For IGVC 2010, the team implemented a modular parallel architecture using the National Instruments (NI) LabVIEW programming language. The new architecture divides all the necessary tasks - motor control, navigation, sensor data collection, etc. into well-organized components that execute in parallel, providing considerable flexibility and facilitating efficient use of processing power. Computer vision is used to detect white lines on the ground and determine their location relative to the robot. With the new vision processor and some optimization of the image processing algorithm used last year, two frames can be acquired and processed in 70ms. With all these improvements, Q placed 2nd in the autonomous challenge.
Design and test of a parallel kinematic solar tracker
Directory of Open Access Journals (Sweden)
Stefano Mauro
2015-12-01
Full Text Available This article proposes a parallel kinematic solar tracker designed for driving high-concentration photovoltaic modules. This kind of module produces energy only if they are oriented with misalignment errors lower than 0.4°. Generally, a parallel kinematic structure provides high stiffness and precision in positioning, so these features make this mechanism fit for the purpose. This article describes the work carried out to design a suitable parallel machine: an already existing architecture was chosen, and the geometrical parameters of the system were defined in order to obtain a workspace consistent with the requirements for sun tracking. Besides, an analysis of the singularities of the system was carried out. The method used for the singularity analysis revealed the existence of singularities which had not been previously identified for this kind of mechanism. From the analysis of the mechanism developed, very low nominal energy consumption and elevated stiffness were found. A small-scale prototype of the system was constructed for the first time. A control algorithm was also developed, implemented, and tested. Finally, experimental tests were carried out in order to verify the capability of the system of ensuring precise pointing. The tests have been considered passed as the system showed an orientation error lower than 0.4° during sun tracking.
Parallel solutions of the two-group neutron diffusion equations
International Nuclear Information System (INIS)
Zee, K.S.; Turinsky, P.J.
1987-01-01
Recent efforts to adapt various numerical solution algorithms to parallel computer architectures have addressed the possibility of substantially reducing the running time of few-group neutron diffusion calculations. The authors have developed an efficient iterative parallel algorithm and an associated computer code for the rapid solution of the finite difference method representation of the two-group neutron diffusion equations on the CRAY X/MP-48 supercomputer having multi-CPUs and vector pipelines. For realistic simulation of light water reactor cores, the code employees a macroscopic depletion model with trace capability for selected fission product transients and critical boron. In addition to this, moderator and fuel temperature feedback models are also incorporated into the code. The validity of the physics models used in the code were benchmarked against qualified codes and proved accurate. This work is an extension of previous work in that various feedback effects are accounted for in the system; the entire code is structured to accommodate extensive vectorization; and an additional parallelism by multitasking is achieved not only for the solution of the matrix equations associated with the inner iterations but also for the other segments of the code, e.g., outer iterations
Prosodic Parallelism – comparing spoken and written language
Directory of Open Access Journals (Sweden)
Richard Wiese
2016-10-01
Full Text Available The Prosodic Parallelism hypothesis claims adjacent prosodic categories to prefer identical branching of internal adjacent constituents. According to Wiese and Speyer (2015, this preference implies feet contained in the same phonological phrase to display either binary or unary branching, but not different types of branching. The seemingly free schwa-zero alternations at the end of some words in German make it possible to test this hypothesis. The hypothesis was successfully tested by conducting a corpus study which used large-scale bodies of written German. As some open questions remain, and as it is unclear whether Prosodic Parallelism is valid for the spoken modality as well, the present study extends this inquiry to spoken German. As in the previous study, the results of a corpus analysis recruiting a variety of linguistic constructions are presented. The Prosodic Parallelism hypothesis can be demonstrated to be valid for spoken German as well as for written German. The paper thus contributes to the question whether prosodic preferences are similar between the spoken and written modes of a language. Some consequences of the results for the production of language are discussed.
Thermal effects on parallel-propagating electron cyclotron waves
International Nuclear Information System (INIS)
Robinson, P.A.
1987-01-01
Thermal effects on the dispersion of right-handed electron cyclotron waves propagating parallel to a uniform, ambient magnetic field are investigated in the strictly non-relativistic ('classical') and weakly relativistic approximations for real frequency and complex wave vector. In each approximation, the two branches of the RH mode reconnect near the cyclotron frequency as the plasma temperature is increased or the density is lowered. This reconnection occurs in a manner different from that previously assumed at parallel propagation and from that at perpendicular propagation, giving rise to a new mode near the cold plasma cut-off frequency ωsub(xC). For both parallel and perpendicular propagation, it is noted that reconnection occurs approximately when the cyclotron linewidth equals the width of the stop-band in the cold plasma dispersion relation. Inclusion of weakly relativistic effects is found to be necessary for quantitative calculations and for an accurate treatment of the new mode near ωsub(xC). Weakly relativistic effects also modify the analytic properties of the dispersion relation so as to introduce a new family of weakly damped and undamped solutions. (author)
Current distribution characteristics of superconducting parallel circuits
International Nuclear Information System (INIS)
Mori, K.; Suzuki, Y.; Hara, N.; Kitamura, M.; Tominaka, T.
1994-01-01
In order to increase the current carrying capacity of the current path of the superconducting magnet system, the portion of parallel circuits such as insulated multi-strand cables or parallel persistent current switches (PCS) are made. In superconducting parallel circuits of an insulated multi-strand cable or a parallel persistent current switch (PCS), the current distribution during the current sweep, the persistent mode, and the quench process were investigated. In order to measure the current distribution, two methods were used. (1) Each strand was surrounded with a pure iron core with the air gap. In the air gap, a Hall probe was located. The accuracy of this method was deteriorated by the magnetic hysteresis of iron. (2) The Rogowski coil without iron was used for the current measurement of each path in a 4-parallel PCS. As a result, it was shown that the current distribution characteristics of a parallel PCS is very similar to that of an insulated multi-strand cable for the quench process
Parallel processing of structural integrity analysis codes
International Nuclear Information System (INIS)
Swami Prasad, P.; Dutta, B.K.; Kushwaha, H.S.
1996-01-01
Structural integrity analysis forms an important role in assessing and demonstrating the safety of nuclear reactor components. This analysis is performed using analytical tools such as Finite Element Method (FEM) with the help of digital computers. The complexity of the problems involved in nuclear engineering demands high speed computation facilities to obtain solutions in reasonable amount of time. Parallel processing systems such as ANUPAM provide an efficient platform for realising the high speed computation. The development and implementation of software on parallel processing systems is an interesting and challenging task. The data and algorithm structure of the codes plays an important role in exploiting the parallel processing system capabilities. Structural analysis codes based on FEM can be divided into two categories with respect to their implementation on parallel processing systems. The first category codes such as those used for harmonic analysis, mechanistic fuel performance codes need not require the parallelisation of individual modules of the codes. The second category of codes such as conventional FEM codes require parallelisation of individual modules. In this category, parallelisation of equation solution module poses major difficulties. Different solution schemes such as domain decomposition method (DDM), parallel active column solver and substructuring method are currently used on parallel processing systems. Two codes, FAIR and TABS belonging to each of these categories have been implemented on ANUPAM. The implementation details of these codes and the performance of different equation solvers are highlighted. (author). 5 refs., 12 figs., 1 tab
GRADSPMHD: A parallel MHD code based on the SPH formalism
Vanaverbeke, S.; Keppens, R.; Poedts, S.
2014-03-01
We present GRADSPMHD, a completely Lagrangian parallel magnetohydrodynamics code based on the SPH formalism. The implementation of the equations of SPMHD in the “GRAD-h” formalism assembles known results, including the derivation of the discretized MHD equations from a variational principle, the inclusion of time-dependent artificial viscosity, resistivity and conductivity terms, as well as the inclusion of a mixed hyperbolic/parabolic correction scheme for satisfying the ∇ṡB→ constraint on the magnetic field. The code uses a tree-based formalism for neighbor finding and can optionally use the tree code for computing the self-gravity of the plasma. The structure of the code closely follows the framework of our parallel GRADSPH FORTRAN 90 code which we added previously to the CPC program library. We demonstrate the capabilities of GRADSPMHD by running 1, 2, and 3 dimensional standard benchmark tests and we find good agreement with previous work done by other researchers. The code is also applied to the problem of simulating the magnetorotational instability in 2.5D shearing box tests as well as in global simulations of magnetized accretion disks. We find good agreement with available results on this subject in the literature. Finally, we discuss the performance of the code on a parallel supercomputer with distributed memory architecture. Catalogue identifier: AERP_v1_0 Program summary URL:http://cpc.cs.qub.ac.uk/summaries/AERP_v1_0.html Program obtainable from: CPC Program Library, Queen’s University, Belfast, N. Ireland Licensing provisions: Standard CPC licence, http://cpc.cs.qub.ac.uk/licence/licence.html No. of lines in distributed program, including test data, etc.: 620503 No. of bytes in distributed program, including test data, etc.: 19837671 Distribution format: tar.gz Programming language: FORTRAN 90/MPI. Computer: HPC cluster. Operating system: Unix. Has the code been vectorized or parallelized?: Yes, parallelized using MPI. RAM: ˜30 MB for a
Characterization of a parallel-stranded DNA hairpin
International Nuclear Information System (INIS)
Germann, M.W.; Vogel, H.J.; Pon, R.T.; van de Sande, J.H.
1989-01-01
Recently, the authors have shown that synthetic DNA containing homooligomeric A-T base pairs can form a parallel-stranded intramolecular hairpin structure. In the present study, they have employed NMR and optical spectroscopy to investigate the structure of the parallel-stranded (PS) DNA hairpin 3'-d(T) 8 C 4 (A) 8 -3' and the related antiparallel (APS) hair 5'-d(T) 8 C 4 (A) 8 -3'. The parallel orientation of the strands in the PS oligonucleotide is achieved by introducing a 5'-5' phosphodiester linkage in the hairpin loop. Ultraviolet spectroscopic and fluorescence data of drug binding are consistent with the formation of PS and APS structures, respectively, in these two hairpins. Vacuum circular dichroism measurements in combination with theoretical CD calculations indicate that the PS structure forms a right-handed helix. 31 P NMR measurements indicate that the conformation of the phosphodiester backbone of the PS structure is not drastically different from that of the APS control. The presence of slowly exchanging imino protons at 14 ppm and the observation of nuclear Overhauser enhancement between imino protons and the AH-2 protons demonstrate that similar base pairing and base stacking between T and A residues occur in both hairpins. On the basis of NOESY measurements, they find that the orientation of the bases is in the anti region and that the sugar puckering is in the 2'-endo range. The results indicate a B-like conformation for each of the strands in the stem part of the PS hairpin and reverse Watson-Crick base pairing between the T and A residues. These data are consistent with a previously calculated structure for parallel-stranded DNA
Two-state ion heating at quasi-parallel shocks
International Nuclear Information System (INIS)
Thomsen, M.F.; Gosling, J.T.; Bame, S.J.; Onsager, T.G.; Russell, C.T.
1990-01-01
In a previous study of ion heating at quasi-parallel shocks, the authors showed a case in which the ion distributions downstream from the shock alternated between a cooler, denser, core/shoulder type and a hotter, less dense, more Maxwellian type. In this paper they further document the alternating occurrence of two different ion states downstream from several quasi-parallel shocks. Three separate lines of evidence are presented to show that the two states are not related in an evolutionary sense, but rather both are produced alternately at the shock: (1) the asymptotic downstream plasma parameters (density, ion temperature, and flow speed) are intermediate between those characterizing the two different states closer to the shock, suggesting that the asymptotic state is produced by a mixing of the two initial states; (2) examples of apparently interpenetrating (i.e., mixing) distributions can be found during transitions from one state to the other; and (3) examples of both types of distributions can be found at actual crossings of the shock ramp. The alternation between the two different types of ion distribution provides direct observational support for the idea that the dissipative dynamics of at least some quasi-parallel shocks is non-stationary and cyclic in nature, as demonstrated by recent numerical simulations. Typical cycle times between intervals of similar ion heating states are ∼2 upstream ion gyroperiods. Both the simulations and the in situ observations indicate that a process of coherent ion reflection is commonly an important part of the dissipation at quasi-parallel shocks
49 CFR 173.23 - Previously authorized packaging.
2010-10-01
... 49 Transportation 2 2010-10-01 2010-10-01 false Previously authorized packaging. 173.23 Section... REQUIREMENTS FOR SHIPMENTS AND PACKAGINGS Preparation of Hazardous Materials for Transportation § 173.23 Previously authorized packaging. (a) When the regulations specify a packaging with a specification marking...
28 CFR 10.5 - Incorporation of papers previously filed.
2010-07-01
... 28 Judicial Administration 1 2010-07-01 2010-07-01 false Incorporation of papers previously filed... CARRYING ON ACTIVITIES WITHIN THE UNITED STATES Registration Statement § 10.5 Incorporation of papers previously filed. Papers and documents already filed with the Attorney General pursuant to the said act and...
75 FR 76056 - FEDERAL REGISTER CITATION OF PREVIOUS ANNOUNCEMENT:
2010-12-07
... SECURITIES AND EXCHANGE COMMISSION Sunshine Act Meeting FEDERAL REGISTER CITATION OF PREVIOUS ANNOUNCEMENT: STATUS: Closed meeting. PLACE: 100 F Street, NE., Washington, DC. DATE AND TIME OF PREVIOUSLY ANNOUNCED MEETING: Thursday, December 9, 2010 at 2 p.m. CHANGE IN THE MEETING: Time change. The closed...
No discrimination against previous mates in a sexually cannibalistic spider
Fromhage, Lutz; Schneider, Jutta M.
2005-09-01
In several animal species, females discriminate against previous mates in subsequent mating decisions, increasing the potential for multiple paternity. In spiders, female choice may take the form of selective sexual cannibalism, which has been shown to bias paternity in favor of particular males. If cannibalistic attacks function to restrict a male's paternity, females may have little interest to remate with males having survived such an attack. We therefore studied the possibility of female discrimination against previous mates in sexually cannibalistic Argiope bruennichi, where females almost always attack their mate at the onset of copulation. We compared mating latency and copulation duration of males having experienced a previous copulation either with the same or with a different female, but found no evidence for discrimination against previous mates. However, males copulated significantly shorter when inserting into a used, compared to a previously unused, genital pore of the female.
Implant breast reconstruction after salvage mastectomy in previously irradiated patients.
Persichetti, Paolo; Cagli, Barbara; Simone, Pierfranco; Cogliandro, Annalisa; Fortunato, Lucio; Altomare, Vittorio; Trodella, Lucio
2009-04-01
The most common surgical approach in case of local tumor recurrence after quadrantectomy and radiotherapy is salvage mastectomy. Breast reconstruction is the subsequent phase of the treatment and the plastic surgeon has to operate on previously irradiated and manipulated tissues. The medical literature highlights that breast reconstruction with tissue expanders is not a pursuable option, considering previous radiotherapy a contraindication. The purpose of this retrospective study is to evaluate the influence of previous radiotherapy on 2-stage breast reconstruction (tissue expander/implant). Only patients with analogous timing of radiation therapy and the same demolitive and reconstructive procedures were recruited. The results of this study prove that, after salvage mastectomy in previously irradiated patients, implant reconstruction is still possible. Further comparative studies are, of course, advisable to draw any conclusion on the possibility to perform implant reconstruction in previously irradiated patients.
Parallel Architectures and Parallel Algorithms for Integrated Vision Systems. Ph.D. Thesis
Choudhary, Alok Nidhi
1989-01-01
Computer vision is regarded as one of the most complex and computationally intensive problems. An integrated vision system (IVS) is a system that uses vision algorithms from all levels of processing to perform for a high level application (e.g., object recognition). An IVS normally involves algorithms from low level, intermediate level, and high level vision. Designing parallel architectures for vision systems is of tremendous interest to researchers. Several issues are addressed in parallel architectures and parallel algorithms for integrated vision systems.
A task parallel implementation of fast multipole methods
Taura, Kenjiro; Nakashima, Jun; Yokota, Rio; Maruyama, Naoya
2012-01-01
This paper describes a task parallel implementation of ExaFMM, an open source implementation of fast multipole methods (FMM), using a lightweight task parallel library MassiveThreads. Although there have been many attempts on parallelizing FMM
Parallel phase model : a programming model for high-end parallel machines with manycores.
Energy Technology Data Exchange (ETDEWEB)
Wu, Junfeng (Syracuse University, Syracuse, NY); Wen, Zhaofang; Heroux, Michael Allen; Brightwell, Ronald Brian
2009-04-01
This paper presents a parallel programming model, Parallel Phase Model (PPM), for next-generation high-end parallel machines based on a distributed memory architecture consisting of a networked cluster of nodes with a large number of cores on each node. PPM has a unified high-level programming abstraction that facilitates the design and implementation of parallel algorithms to exploit both the parallelism of the many cores and the parallelism at the cluster level. The programming abstraction will be suitable for expressing both fine-grained and coarse-grained parallelism. It includes a few high-level parallel programming language constructs that can be added as an extension to an existing (sequential or parallel) programming language such as C; and the implementation of PPM also includes a light-weight runtime library that runs on top of an existing network communication software layer (e.g. MPI). Design philosophy of PPM and details of the programming abstraction are also presented. Several unstructured applications that inherently require high-volume random fine-grained data accesses have been implemented in PPM with very promising results.
Parallel algorithms for placement and routing in VLSI design. Ph.D. Thesis
Brouwer, Randall Jay
1991-01-01
The computational requirements for high quality synthesis, analysis, and verification of very large scale integration (VLSI) designs have rapidly increased with the fast growing complexity of these designs. Research in the past has focused on the development of heuristic algorithms, special purpose hardware accelerators, or parallel algorithms for the numerous design tasks to decrease the time required for solution. Two new parallel algorithms are proposed for two VLSI synthesis tasks, standard cell placement and global routing. The first algorithm, a parallel algorithm for global routing, uses hierarchical techniques to decompose the routing problem into independent routing subproblems that are solved in parallel. Results are then presented which compare the routing quality to the results of other published global routers and which evaluate the speedups attained. The second algorithm, a parallel algorithm for cell placement and global routing, hierarchically integrates a quadrisection placement algorithm, a bisection placement algorithm, and the previous global routing algorithm. Unique partitioning techniques are used to decompose the various stages of the algorithm into independent tasks which can be evaluated in parallel. Finally, results are presented which evaluate the various algorithm alternatives and compare the algorithm performance to other placement programs. Measurements are presented on the parallel speedups available.
The role of parallelism in the real-time processing of anaphora.
Poirier, Josée; Walenski, Matthew; Shapiro, Lewis P
2012-06-01
Parallelism effects refer to the facilitated processing of a target structure when it follows a similar, parallel structure. In coordination, a parallelism-related conjunction triggers the expectation that a second conjunct with the same structure as the first conjunct should occur. It has been proposed that parallelism effects reflect the use of the first structure as a template that guides the processing of the second. In this study, we examined the role of parallelism in real-time anaphora resolution by charting activation patterns in coordinated constructions containing anaphora, Verb-Phrase Ellipsis (VPE) and Noun-Phrase Traces (NP-traces). Specifically, we hypothesised that an expectation of parallelism would incite the parser to assume a structure similar to the first conjunct in the second, anaphora-containing conjunct. The speculation of a similar structure would result in early postulation of covert anaphora. Experiment 1 confirms that following a parallelism-related conjunction, first-conjunct material is activated in the second conjunct. Experiment 2 reveals that an NP-trace in the second conjunct is posited immediately where licensed, which is earlier than previously reported in the literature. In light of our findings, we propose an intricate relation between structural expectations and anaphor resolution.
Plastic collapse behavior for thin tube with two parallel cracks
International Nuclear Information System (INIS)
Moon, Seong In; Chang, Yoon Suk; Kim, Young Jin; Lee, Jin Ho; Song, Myung Ho; Choi, Young Hwan; Kim, Joung Soo
2004-01-01
The current plugging criterion is known to be too conservative for some locations and types of defects. Many defects detected during in-service inspection take on the form of multiple cracks at the top of tube sheet but there is no reliable plugging criterion for the steam generator tubes with multiple cracks. Most of the previous studies on multiple cracks are confined to elastic analyses and only few studies have been done on the steam generator tubes failed by plastic collapse. Therefore, it is necessary to develop models which can be used to estimate the failure behavior of steam generator tubes with multiple cracks. The objective of this study is to verify the applicability of the optimum local failure prediction models proposed in the previous study. For this, plastic collapse tests are performed with the tube specimens containing two parallel through-wall cracks. The plastic collapse load of the steam generator tubes containing two parallel through-wall cracks are also estimated by using the proposed optimum global failure model and the applicability is investigated by comparing the estimated results with the experimental results. Also, the interaction effect between two cracks was evaluated to explain the plastic collapse behavior
Random number generators for large-scale parallel Monte Carlo simulations on FPGA
Lin, Y.; Wang, F.; Liu, B.
2018-05-01
Through parallelization, field programmable gate array (FPGA) can achieve unprecedented speeds in large-scale parallel Monte Carlo (LPMC) simulations. FPGA presents both new constraints and new opportunities for the implementations of random number generators (RNGs), which are key elements of any Monte Carlo (MC) simulation system. Using empirical and application based tests, this study evaluates all of the four RNGs used in previous FPGA based MC studies and newly proposed FPGA implementations for two well-known high-quality RNGs that are suitable for LPMC studies on FPGA. One of the newly proposed FPGA implementations: a parallel version of additive lagged Fibonacci generator (Parallel ALFG) is found to be the best among the evaluated RNGs in fulfilling the needs of LPMC simulations on FPGA.
Modular high-temperature gas-cooled reactor simulation using parallel processors
International Nuclear Information System (INIS)
Ball, S.J.; Conklin, J.C.
1989-01-01
The MHPP (Modular HTGR Parallel Processor) code has been developed to simulate modular high-temperature gas-cooled reactor (MHTGR) transients and accidents. MHPP incorporates a very detailed model for predicting the dynamics of the reactor core, vessel, and cooling systems over a wide variety of scenarios ranging from expected transients to very-low-probability severe accidents. The simulations routines, which had originally been developed entirely as serial code, were readily adapted to parallel processing Fortran. The resulting parallelized simulation speed was enhanced significantly. Workstation interfaces are being developed to provide for user (operator) interaction. In this paper the benefits realized by adapting previous MHTGR codes to run on a parallel processor are discussed, along with results of typical accident analyses
Parallel evolutionary computation in bioinformatics applications.
Pinho, Jorge; Sobral, João Luis; Rocha, Miguel
2013-05-01
A large number of optimization problems within the field of Bioinformatics require methods able to handle its inherent complexity (e.g. NP-hard problems) and also demand increased computational efforts. In this context, the use of parallel architectures is a necessity. In this work, we propose ParJECoLi, a Java based library that offers a large set of metaheuristic methods (such as Evolutionary Algorithms) and also addresses the issue of its efficient execution on a wide range of parallel architectures. The proposed approach focuses on the easiness of use, making the adaptation to distinct parallel environments (multicore, cluster, grid) transparent to the user. Indeed, this work shows how the development of the optimization library can proceed independently of its adaptation for several architectures, making use of Aspect-Oriented Programming. The pluggable nature of parallelism related modules allows the user to easily configure its environment, adding parallelism modules to the base source code when needed. The performance of the platform is validated with two case studies within biological model optimization. Copyright © 2012 Elsevier Ireland Ltd. All rights reserved.
Iteration schemes for parallelizing models of superconductivity
Energy Technology Data Exchange (ETDEWEB)
Gray, P.A. [Michigan State Univ., East Lansing, MI (United States)
1996-12-31
The time dependent Lawrence-Doniach model, valid for high fields and high values of the Ginzburg-Landau parameter, is often used for studying vortex dynamics in layered high-T{sub c} superconductors. When solving these equations numerically, the added degrees of complexity due to the coupling and nonlinearity of the model often warrant the use of high-performance computers for their solution. However, the interdependence between the layers can be manipulated so as to allow parallelization of the computations at an individual layer level. The reduced parallel tasks may then be solved independently using a heterogeneous cluster of networked workstations connected together with Parallel Virtual Machine (PVM) software. Here, this parallelization of the model is discussed and several computational implementations of varying degrees of parallelism are presented. Computational results are also given which contrast properties of convergence speed, stability, and consistency of these implementations. Included in these results are models involving the motion of vortices due to an applied current and pinning effects due to various material properties.
Personality disorders in previously detained adolescent females: a prospective study
Krabbendam, A.; Colins, O.F.; Doreleijers, T.A.H.; van der Molen, E.; Beekman, A.T.F.; Vermeiren, R.R.J.M.
2015-01-01
This longitudinal study investigated the predictive value of trauma and mental health problems for the development of antisocial personality disorder (ASPD) and borderline personality disorder (BPD) in previously detained women. The participants were 229 detained adolescent females who were assessed
Payload specialist Reinhard Furrer show evidence of previous blood sampling
1985-01-01
Payload specialist Reinhard Furrer shows evidence of previous blood sampling while Wubbo J. Ockels, Dutch payload specialist (only partially visible), extends his right arm after a sample has been taken. Both men show bruises on their arms.
Choice of contraception after previous operative delivery at a family ...
African Journals Online (AJOL)
Choice of contraception after previous operative delivery at a family planning clinic in Northern Nigeria. Amina Mohammed‑Durosinlorun, Joel Adze, Stephen Bature, Caleb Mohammed, Matthew Taingson, Amina Abubakar, Austin Ojabo, Lydia Airede ...
Previous utilization of service does not improve timely booking in ...
African Journals Online (AJOL)
Previous utilization of service does not improve timely booking in antenatal care: Cross sectional study ... Journal Home > Vol 24, No 3 (2010) > ... Results: Past experience on antenatal care service utilization did not come out as a predictor for ...
A previous hamstring injury affects kicking mechanics in soccer players.
Navandar, Archit; Veiga, Santiago; Torres, Gonzalo; Chorro, David; Navarro, Enrique
2018-01-10
Although the kicking skill is influenced by limb dominance and sex, how a previous hamstring injury affects kicking has not been studied in detail. Thus, the objective of this study was to evaluate the effect of sex and limb dominance on kicking in limbs with and without a previous hamstring injury. 45 professional players (males: n=19, previously injured players=4, age=21.16 ± 2.00 years; females: n=19, previously injured players=10, age=22.15 ± 4.50 years) performed 5 kicks each with their preferred and non-preferred limb at a target 7m away, which were recorded with a three-dimensional motion capture system. Kinematic and kinetic variables were extracted for the backswing, leg cocking, leg acceleration and follow through phases. A shorter backswing (20.20 ± 3.49% vs 25.64 ± 4.57%), and differences in knee flexion angle (58 ± 10o vs 72 ± 14o) and hip flexion velocity (8 ± 0rad/s vs 10 ± 2rad/s) were observed in previously injured, non-preferred limb kicks for females. A lower peak hip linear velocity (3.50 ± 0.84m/s vs 4.10 ± 0.45m/s) was observed in previously injured, preferred limb kicks of females. These differences occurred in the backswing and leg-cocking phases where the hamstring muscles were the most active. A variation in the functioning of the hamstring muscles and that of the gluteus maximus and iliopsoas in the case of a previous injury could account for the differences observed in the kicking pattern. Therefore, the effects of a previous hamstring injury must be considered while designing rehabilitation programs to re-educate kicking movement.
Parallel visualization on leadership computing resources
Energy Technology Data Exchange (ETDEWEB)
Peterka, T; Ross, R B [Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, IL 60439 (United States); Shen, H-W [Department of Computer Science and Engineering, Ohio State University, Columbus, OH 43210 (United States); Ma, K-L [Department of Computer Science, University of California at Davis, Davis, CA 95616 (United States); Kendall, W [Department of Electrical Engineering and Computer Science, University of Tennessee at Knoxville, Knoxville, TN 37996 (United States); Yu, H, E-mail: tpeterka@mcs.anl.go [Sandia National Laboratories, California, Livermore, CA 94551 (United States)
2009-07-01
Changes are needed in the way that visualization is performed, if we expect the analysis of scientific data to be effective at the petascale and beyond. By using similar techniques as those used to parallelize simulations, such as parallel I/O, load balancing, and effective use of interprocess communication, the supercomputers that compute these datasets can also serve as analysis and visualization engines for them. Our team is assessing the feasibility of performing parallel scientific visualization on some of the most powerful computational resources of the U.S. Department of Energy's National Laboratories in order to pave the way for analyzing the next generation of computational results. This paper highlights some of the conclusions of that research.
Parallelization of ITOUGH2 using PVM
International Nuclear Information System (INIS)
Finsterle, Stefan
1998-01-01
ITOUGH2 inversions are computationally intensive because the forward problem must be solved many times to evaluate the objective function for different parameter combinations or to numerically calculate sensitivity coefficients. Most of these forward runs are independent from each other and can therefore be performed in parallel. Message passing based on the Parallel Virtual Machine (PVM) system has been implemented into ITOUGH2 to enable parallel processing of ITOUGH2 jobs on a heterogeneous network of Unix workstations. This report describes the PVM system and its implementation into ITOUGH2. Instructions are given for installing PVM, compiling ITOUGH2-PVM for use on a workstation cluster, the preparation of an 1.TOUGH2 input file under PVM, and the execution of an ITOUGH2-PVM application. Examples are discussed, demonstrating the use of ITOUGH2-PVM
Distributed Parallel Architecture for "Big Data"
Directory of Open Access Journals (Sweden)
Catalin BOJA
2012-01-01
Full Text Available This paper is an extension to the "Distributed Parallel Architecture for Storing and Processing Large Datasets" paper presented at the WSEAS SEPADS’12 conference in Cambridge. In its original version the paper went over the benefits of using a distributed parallel architecture to store and process large datasets. This paper analyzes the problem of storing, processing and retrieving meaningful insight from petabytes of data. It provides a survey on current distributed and parallel data processing technologies and, based on them, will propose an architecture that can be used to solve the analyzed problem. In this version there is more emphasis put on distributed files systems and the ETL processes involved in a distributed environment.
Java parallel secure stream for grid computing
International Nuclear Information System (INIS)
Chen, J.; Akers, W.; Chen, Y.; Watson, W.
2001-01-01
The emergence of high speed wide area networks makes grid computing a reality. However grid applications that need reliable data transfer still have difficulties to achieve optimal TCP performance due to network tuning of TCP window size to improve the bandwidth and to reduce latency on a high speed wide area network. The authors present a pure Java package called JPARSS (Java Parallel Secure Stream) that divides data into partitions that are sent over several parallel Java streams simultaneously and allows Java or Web applications to achieve optimal TCP performance in a gird environment without the necessity of tuning the TCP window size. Several experimental results are provided to show that using parallel stream is more effective than tuning TCP window size. In addition X.509 certificate based single sign-on mechanism and SSL based connection establishment are integrated into this package. Finally a few applications using this package will be discussed
Applications of Parallel Processing in Mobile Banking
Directory of Open Access Journals (Sweden)
2007-01-01
Full Text Available The future of mobile banking will be represented by such applications that support mobile, Internet banking and EFT (Electronic Funds Transfer transactions in a single user interface. In such a way, the mobile banking will be able to cover all the types of applications demanded at the market level. The parallel processing of credit card bank transactions could be performed with the help of a grid network. Excluding some limitations, the grid processing offers huge opportunities to exploit the parallelism. For this reason, a lot of applications of waiting queues in grid processing were developed in the last years. Grid networks represent a distinctive and very modern field of the parallel and distributed processing.
Parallel optoelectronic trinary signed-digit division
Alam, Mohammad S.
1999-03-01
The trinary signed-digit (TSD) number system has been found to be very useful for parallel addition and subtraction of any arbitrary length operands in constant time. Using the TSD addition and multiplication modules as the basic building blocks, we develop an efficient algorithm for performing parallel TSD division in constant time. The proposed division technique uses one TSD subtraction and two TSD multiplication steps. An optoelectronic correlator based architecture is suggested for implementation of the proposed TSD division algorithm, which fully exploits the parallelism and high processing speed of optics. An efficient spatial encoding scheme is used to ensure better utilization of space bandwidth product of the spatial light modulators used in the optoelectronic implementation.
Parallel computational in nuclear group constant calculation
International Nuclear Information System (INIS)
Su'ud, Zaki; Rustandi, Yaddi K.; Kurniadi, Rizal
2002-01-01
In this paper parallel computational method in nuclear group constant calculation using collision probability method will be discuss. The main focus is on the calculation of collision matrix which need large amount of computational time. The geometry treated here is concentric cylinder. The calculation of collision probability matrix is carried out using semi analytic method using Beckley Naylor Function. To accelerate computation speed some computer parallel used to solve the problem. We used LINUX based parallelization using PVM software with C or fortran language. While in windows based we used socket programming using DELPHI or C builder. The calculation results shows the important of optimal weight for each processor in case there area many type of processor speed
Abstract Level Parallelization of Finite Difference Methods
Directory of Open Access Journals (Sweden)
Edwin Vollebregt
1997-01-01
Full Text Available A formalism is proposed for describing finite difference calculations in an abstract way. The formalism consists of index sets and stencils, for characterizing the structure of sets of data items and interactions between data items (“neighbouring relations”. The formalism provides a means for lifting programming to a more abstract level. This simplifies the tasks of performance analysis and verification of correctness, and opens the way for automaticcode generation. The notation is particularly useful in parallelization, for the systematic construction of parallel programs in a process/channel programming paradigm (e.g., message passing. This is important because message passing, unfortunately, still is the only approach that leads to acceptable performance for many more unstructured or irregular problems on parallel computers that have non-uniform memory access times. It will be shown that the use of index sets and stencils greatly simplifies the determination of which data must be exchanged between different computing processes.
Parallel visualization on leadership computing resources
International Nuclear Information System (INIS)
Peterka, T; Ross, R B; Shen, H-W; Ma, K-L; Kendall, W; Yu, H
2009-01-01
Changes are needed in the way that visualization is performed, if we expect the analysis of scientific data to be effective at the petascale and beyond. By using similar techniques as those used to parallelize simulations, such as parallel I/O, load balancing, and effective use of interprocess communication, the supercomputers that compute these datasets can also serve as analysis and visualization engines for them. Our team is assessing the feasibility of performing parallel scientific visualization on some of the most powerful computational resources of the U.S. Department of Energy's National Laboratories in order to pave the way for analyzing the next generation of computational results. This paper highlights some of the conclusions of that research.
A possibility of parallel and anti-parallel diffraction measurements on ...
Indian Academy of Sciences (India)
However, a bent perfect crystal (BPC) monochromator at monochromatic focusing condition can provide a quite flat and equal resolution property at both parallel and anti-parallel positions and thus one can have a chance to use both sides for the diffraction experiment. From the data of the FWHM and the / measured ...
International Nuclear Information System (INIS)
Ishizuki, Shigeru; Kawai, Wataru; Nemoto, Toshiyuki; Ogasawara, Shinobu; Kume, Etsuo; Adachi, Masaaki; Kawasaki, Nobuo; Yatake, Yo-ichi
2000-03-01
Several computer codes in the nuclear field have been vectorized, parallelized and transported on the FUJITSU VPP500 system, the AP3000 system and the Paragon system at Center for Promotion of Computational Science and Engineering in Japan Atomic Energy Research Institute. We dealt with 12 codes in fiscal 1998. These results are reported in 3 parts, i.e., the vectorization and parallelization on vector processors part, the parallelization on scalar processors part and the porting part. In this report, we describe the vectorization and parallelization on vector processors. In this vectorization and parallelization on vector processors part, the vectorization of General Tokamak Circuit Simulation Program code GTCSP, the vectorization and parallelization of Molecular Dynamics NTV (n-particle, Temperature and Velocity) Simulation code MSP2, Eddy Current Analysis code EDDYCAL, Thermal Analysis Code for Test of Passive Cooling System by HENDEL T2 code THANPACST2 and MHD Equilibrium code SELENEJ on the VPP500 are described. In the parallelization on scalar processors part, the parallelization of Monte Carlo N-Particle Transport code MCNP4B2, Plasma Hydrodynamics code using Cubic Interpolated Propagation Method PHCIP and Vectorized Monte Carlo code (continuous energy model / multi-group model) MVP/GMVP on the Paragon are described. In the porting part, the porting of Monte Carlo N-Particle Transport code MCNP4B2 and Reactor Safety Analysis code RELAP5 on the AP3000 are described. (author)
A SPECT reconstruction method for extending parallel to non-parallel geometries
International Nuclear Information System (INIS)
Wen Junhai; Liang Zhengrong
2010-01-01
Due to its simplicity, parallel-beam geometry is usually assumed for the development of image reconstruction algorithms. The established reconstruction methodologies are then extended to fan-beam, cone-beam and other non-parallel geometries for practical application. This situation occurs for quantitative SPECT (single photon emission computed tomography) imaging in inverting the attenuated Radon transform. Novikov reported an explicit parallel-beam formula for the inversion of the attenuated Radon transform in 2000. Thereafter, a formula for fan-beam geometry was reported by Bukhgeim and Kazantsev (2002 Preprint N. 99 Sobolev Institute of Mathematics). At the same time, we presented a formula for varying focal-length fan-beam geometry. Sometimes, the reconstruction formula is so implicit that we cannot obtain the explicit reconstruction formula in the non-parallel geometries. In this work, we propose a unified reconstruction framework for extending parallel-beam geometry to any non-parallel geometry using ray-driven techniques. Studies by computer simulations demonstrated the accuracy of the presented unified reconstruction framework for extending parallel-beam to non-parallel geometries in inverting the attenuated Radon transform.
Programming massively parallel processors a hands-on approach
Kirk, David B
2010-01-01
Programming Massively Parallel Processors discusses basic concepts about parallel programming and GPU architecture. ""Massively parallel"" refers to the use of a large number of processors to perform a set of computations in a coordinated parallel way. The book details various techniques for constructing parallel programs. It also discusses the development process, performance level, floating-point format, parallel patterns, and dynamic parallelism. The book serves as a teaching guide where parallel programming is the main topic of the course. It builds on the basics of C programming for CUDA, a parallel programming environment that is supported on NVI- DIA GPUs. Composed of 12 chapters, the book begins with basic information about the GPU as a parallel computer source. It also explains the main concepts of CUDA, data parallelism, and the importance of memory access efficiency using CUDA. The target audience of the book is graduate and undergraduate students from all science and engineering disciplines who ...
Parallelization of Reversible Ripple-carry Adders
DEFF Research Database (Denmark)
Thomsen, Michael Kirkedal; Axelsen, Holger Bock
2009-01-01
The design of fast arithmetic logic circuits is an important research topic for reversible and quantum computing. A special challenge in this setting is the computation of standard arithmetical functions without the generation of \\emph{garbage}. Here, we present a novel parallelization scheme...... wherein $m$ parallel $k$-bit reversible ripple-carry adders are combined to form a reversible $mk$-bit \\emph{ripple-block carry adder} with logic depth $\\mathcal{O}(m+k)$ for a \\emph{minimal} logic depth $\\mathcal{O}(\\sqrt{mk})$, thus improving on the $mk$-bit ripple-carry adder logic depth $\\mathcal...
Parallel algorithms for numerical linear algebra
van der Vorst, H
1990-01-01
This is the first in a new series of books presenting research results and developments concerning the theory and applications of parallel computers, including vector, pipeline, array, fifth/future generation computers, and neural computers.All aspects of high-speed computing fall within the scope of the series, e.g. algorithm design, applications, software engineering, networking, taxonomy, models and architectural trends, performance, peripheral devices.Papers in Volume One cover the main streams of parallel linear algebra: systolic array algorithms, message-passing systems, algorithms for p
Keldysh formalism for multiple parallel worlds
International Nuclear Information System (INIS)
Ansari, M.; Nazarov, Y. V.
2016-01-01
We present a compact and self-contained review of the recently developed Keldysh formalism for multiple parallel worlds. The formalism has been applied to consistent quantum evaluation of the flows of informational quantities, in particular, to the evaluation of Renyi and Shannon entropy flows. We start with the formulation of the standard and extended Keldysh techniques in a single world in a form convenient for our presentation. We explain the use of Keldysh contours encompassing multiple parallel worlds. In the end, we briefly summarize the concrete results obtained with the method.
Keldysh formalism for multiple parallel worlds
Ansari, M.; Nazarov, Y. V.
2016-03-01
We present a compact and self-contained review of the recently developed Keldysh formalism for multiple parallel worlds. The formalism has been applied to consistent quantum evaluation of the flows of informational quantities, in particular, to the evaluation of Renyi and Shannon entropy flows. We start with the formulation of the standard and extended Keldysh techniques in a single world in a form convenient for our presentation. We explain the use of Keldysh contours encompassing multiple parallel worlds. In the end, we briefly summarize the concrete results obtained with the method.
A Massively Parallel Face Recognition System
Directory of Open Access Journals (Sweden)
Lahdenoja Olli
2007-01-01
Full Text Available We present methods for processing the LBPs (local binary patterns with a massively parallel hardware, especially with CNN-UM (cellular nonlinear network-universal machine. In particular, we present a framework for implementing a massively parallel face recognition system, including a dedicated highly accurate algorithm suitable for various types of platforms (e.g., CNN-UM and digital FPGA. We study in detail a dedicated mixed-mode implementation of the algorithm and estimate its implementation cost in the view of its performance and accuracy restrictions.
Xyce parallel electronic simulator release notes.
Energy Technology Data Exchange (ETDEWEB)
Keiter, Eric R; Hoekstra, Robert John; Mei, Ting; Russo, Thomas V.; Schiek, Richard Louis; Thornquist, Heidi K.; Rankin, Eric Lamont; Coffey, Todd S; Pawlowski, Roger P; Santarelli, Keith R.
2010-05-01
The Xyce Parallel Electronic Simulator has been written to support, in a rigorous manner, the simulation needs of the Sandia National Laboratories electrical designers. Specific requirements include, among others, the ability to solve extremely large circuit problems by supporting large-scale parallel computing platforms, improved numerical performance and object-oriented code design and implementation. The Xyce release notes describe: Hardware and software requirements New features and enhancements Any defects fixed since the last release Current known defects and defect workarounds For up-to-date information not available at the time these notes were produced, please visit the Xyce web page at http://www.cs.sandia.gov/xyce.
Parallel transposition of sparse data structures
DEFF Research Database (Denmark)
Wang, Hao; Liu, Weifeng; Hou, Kaixi
2016-01-01
Many applications in computational sciences and social sciences exploit sparsity and connectivity of acquired data. Even though many parallel sparse primitives such as sparse matrix-vector (SpMV) multiplication have been extensively studied, some other important building blocks, e.g., parallel tr...... transposition in the latest vendor-supplied library on an Intel multicore CPU platform, and the MergeTrans approach achieves on average of 3.4-fold (up to 11.7-fold) speedup on an Intel Xeon Phi many-core processor....
Temporal fringe pattern analysis with parallel computing
International Nuclear Information System (INIS)
Tuck Wah Ng; Kar Tien Ang; Argentini, Gianluca
2005-01-01
Temporal fringe pattern analysis is invaluable in transient phenomena studies but necessitates long processing times. Here we describe a parallel computing strategy based on the single-program multiple-data model and hyperthreading processor technology to reduce the execution time. In a two-node cluster workstation configuration we found that execution periods were reduced by 1.6 times when four virtual processors were used. To allow even lower execution times with an increasing number of processors, the time allocated for data transfer, data read, and waiting should be minimized. Parallel computing is found here to present a feasible approach to reduce execution times in temporal fringe pattern analysis
On radial flow between parallel disks
International Nuclear Information System (INIS)
Wee, A Y L; Gorin, A
2015-01-01
Approximate analytical solutions are presented for converging flow in between two parallel non rotating disks. The static pressure distribution and radial component of the velocity are developed by averaging the inertial term across the gap in between parallel disks. The predicted results from the first approximation are favourable to experimental results as well as results presented by other authors. The second approximation shows that as the fluid approaches the center, the velocity at the mid channel slows down which is due to the struggle between the inertial term and the flowrate. (paper)
Logical inference techniques for loop parallelization
DEFF Research Database (Denmark)
Oancea, Cosmin Eugen; Rauchwerger, Lawrence
2012-01-01
the parallelization transformation by verifying the independence of the loop's memory references. To this end it represents array references using the USR (uniform set representation) language and expresses the independence condition as an equation, S={}, where S is a set expression representing array indexes. Using...... of their estimated complexities. We evaluate our automated solution on 26 benchmarks from PERFECT-CLUB and SPEC suites and show that our approach is effective in parallelizing large, complex loops and obtains much better full program speedups than the Intel and IBM Fortran compilers....
A PARALLEL EXTENSION OF THE UAL ENVIRONMENT
International Nuclear Information System (INIS)
MALITSKY, N.; SHISHLO, A.
2001-01-01
The deployment of the Unified Accelerator Library (UAL) environment on the parallel cluster is presented. The approach is based on the Message-Passing Interface (MPI) library and the Perl adapter that allows one to control and mix together the existing conventional UAL components with the new MPI-based parallel extensions. In the paper, we provide timing results and describe the application of the new environment to the SNS Ring complex beam dynamics studies, particularly, simulations of several physical effects, such as space charge, field errors, fringe fields, and others
Analysis of a parallel multigrid algorithm
Chan, Tony F.; Tuminaro, Ray S.
1989-01-01
The parallel multigrid algorithm of Frederickson and McBryan (1987) is considered. This algorithm uses multiple coarse-grid problems (instead of one problem) in the hope of accelerating convergence and is found to have a close relationship to traditional multigrid methods. Specifically, the parallel coarse-grid correction operator is identical to a traditional multigrid coarse-grid correction operator, except that the mixing of high and low frequencies caused by aliasing error is removed. Appropriate relaxation operators can be chosen to take advantage of this property. Comparisons between the standard multigrid and the new method are made.
Parallel processing for artificial intelligence 2
Kumar, V; Suttner, CB
1994-01-01
With the increasing availability of parallel machines and the raising of interest in large scale and real world applications, research on parallel processing for Artificial Intelligence (AI) is gaining greater importance in the computer science environment. Many applications have been implemented and delivered but the field is still considered to be in its infancy. This book assembles diverse aspects of research in the area, providing an overview of the current state of technology. It also aims to promote further growth across the discipline. Contributions have been grouped according to their
Configuration affects parallel stent grafting results.
Tanious, Adam; Wooster, Mathew; Armstrong, Paul A; Zwiebel, Bruce; Grundy, Shane; Back, Martin R; Shames, Murray L
2018-05-01
A number of adjunctive "off-the-shelf" procedures have been described to treat complex aortic diseases. Our goal was to evaluate parallel stent graft configurations and to determine an optimal formula for these procedures. This is a retrospective review of all patients at a single medical center treated with parallel stent grafts from January 2010 to September 2015. Outcomes were evaluated on the basis of parallel graft orientation, type, and main body device. Primary end points included parallel stent graft compromise and overall endovascular aneurysm repair (EVAR) compromise. There were 78 patients treated with a total of 144 parallel stents for a variety of pathologic processes. There was a significant correlation between main body oversizing and snorkel compromise (P = .0195) and overall procedural complication (P = .0019) but not with endoleak rates. Patients were organized into the following oversizing groups for further analysis: 0% to 10%, 10% to 20%, and >20%. Those oversized into the 0% to 10% group had the highest rate of overall EVAR complication (73%; P = .0003). There were no significant correlations between any one particular configuration and overall procedural complication. There was also no significant correlation between total number of parallel stents employed and overall complication. Composite EVAR configuration had no significant correlation with individual snorkel compromise, endoleak, or overall EVAR or procedural complication. The configuration most prone to individual snorkel compromise and overall EVAR complication was a four-stent configuration with two stents in an antegrade position and two stents in a retrograde position (60% complication rate). The configuration most prone to endoleak was one or two stents in retrograde position (33% endoleak rate), followed by three stents in an all-antegrade position (25%). There was a significant correlation between individual stent configuration and stent compromise (P = .0385), with 31
Keldysh formalism for multiple parallel worlds
Energy Technology Data Exchange (ETDEWEB)
Ansari, M.; Nazarov, Y. V., E-mail: y.v.nazarov@tudelft.nl [Delft University of Technology, Kavli Institute of Nanoscience (Netherlands)
2016-03-15
We present a compact and self-contained review of the recently developed Keldysh formalism for multiple parallel worlds. The formalism has been applied to consistent quantum evaluation of the flows of informational quantities, in particular, to the evaluation of Renyi and Shannon entropy flows. We start with the formulation of the standard and extended Keldysh techniques in a single world in a form convenient for our presentation. We explain the use of Keldysh contours encompassing multiple parallel worlds. In the end, we briefly summarize the concrete results obtained with the method.
Use of parallel counters for triggering
International Nuclear Information System (INIS)
Nikityuk, N.M.
1991-01-01
Results of investigation of using parallel counters, majority coincidence schemes, parallel compressors for triggering in multichannel high energy spectrometers are described. Concrete examples of methods of constructing fast and economic new devices used to determine multiplicity hits t>900 registered in a hodoscopic plane and a pixel detector are given. For this purpose the author uses the syndrome coding method and cellular arrays. In addition, an effective coding matrix has been created which can be used for light signal coding. For example, such signals are supplied from scintillators to photomultipliers. 23 refs.; 21 figs
A Massively Parallel Face Recognition System
Directory of Open Access Journals (Sweden)
Ari Paasio
2006-12-01
Full Text Available We present methods for processing the LBPs (local binary patterns with a massively parallel hardware, especially with CNN-UM (cellular nonlinear network-universal machine. In particular, we present a framework for implementing a massively parallel face recognition system, including a dedicated highly accurate algorithm suitable for various types of platforms (e.g., CNN-UM and digital FPGA. We study in detail a dedicated mixed-mode implementation of the algorithm and estimate its implementation cost in the view of its performance and accuracy restrictions.
Parallel processor for fast event analysis
International Nuclear Information System (INIS)
Hensley, D.C.
1983-01-01
Current maximum data rates from the Spin Spectrometer of approx. 5000 events/s (up to 1.3 MBytes/s) and minimum analysis requiring at least 3000 operations/event require a CPU cycle time near 70 ns. In order to achieve an effective cycle time of 70 ns, a parallel processing device is proposed where up to 4 independent processors will be implemented in parallel. The individual processors are designed around the Am2910 Microsequencer, the AM29116 μP, and the Am29517 Multiplier. Satellite histogramming in a mass memory system will be managed by a commercial 16-bit μP system
Parallel adaptive simulations on unstructured meshes
International Nuclear Information System (INIS)
Shephard, M S; Jansen, K E; Sahni, O; Diachin, L A
2007-01-01
This paper discusses methods being developed by the ITAPS center to support the execution of parallel adaptive simulations on unstructured meshes. The paper first outlines the ITAPS approach to the development of interoperable mesh, geometry and field services to support the needs of SciDAC application in these areas. The paper then demonstrates the ability of unstructured adaptive meshing methods built on such interoperable services to effectively solve important physics problems. Attention is then focused on ITAPs' developing ability to solve adaptive unstructured mesh problems on massively parallel computers
Structured building model reduction toward parallel simulation
Energy Technology Data Exchange (ETDEWEB)
Dobbs, Justin R. [Cornell University; Hencey, Brondon M. [Cornell University
2013-08-26
Building energy model reduction exchanges accuracy for improved simulation speed by reducing the number of dynamical equations. Parallel computing aims to improve simulation times without loss of accuracy but is poorly utilized by contemporary simulators and is inherently limited by inter-processor communication. This paper bridges these disparate techniques to implement efficient parallel building thermal simulation. We begin with a survey of three structured reduction approaches that compares their performance to a leading unstructured method. We then use structured model reduction to find thermal clusters in the building energy model and allocate processing resources. Experimental results demonstrate faster simulation and low error without any interprocessor communication.
Parallel preconditioning techniques for sparse CG solvers
Energy Technology Data Exchange (ETDEWEB)
Basermann, A.; Reichel, B.; Schelthoff, C. [Central Institute for Applied Mathematics, Juelich (Germany)
1996-12-31
Conjugate gradient (CG) methods to solve sparse systems of linear equations play an important role in numerical methods for solving discretized partial differential equations. The large size and the condition of many technical or physical applications in this area result in the need for efficient parallelization and preconditioning techniques of the CG method. In particular for very ill-conditioned matrices, sophisticated preconditioner are necessary to obtain both acceptable convergence and accuracy of CG. Here, we investigate variants of polynomial and incomplete Cholesky preconditioners that markedly reduce the iterations of the simply diagonally scaled CG and are shown to be well suited for massively parallel machines.
Data communications in a parallel active messaging interface of a parallel computer
Archer, Charles J; Blocksome, Michael A; Ratterman, Joseph D; Smith, Brian E
2013-11-12
Data communications in a parallel active messaging interface (`PAMI`) of a parallel computer composed of compute nodes that execute a parallel application, each compute node including application processors that execute the parallel application and at least one management processor dedicated to gathering information regarding data communications. The PAMI is composed of data communications endpoints, each endpoint composed of a specification of data communications parameters for a thread of execution on a compute node, including specifications of a client, a context, and a task, the compute nodes and the endpoints coupled for data communications through the PAMI and through data communications resources. Embodiments function by gathering call site statistics describing data communications resulting from execution of data communications instructions and identifying in dependence upon the call cite statistics a data communications algorithm for use in executing a data communications instruction at a call site in the parallel application.
Secondary recurrent miscarriage is associated with previous male birth.
LENUS (Irish Health Repository)
Ooi, Poh Veh
2012-01-31
Secondary recurrent miscarriage (RM) is defined as three or more consecutive pregnancy losses after delivery of a viable infant. Previous reports suggest that a firstborn male child is associated with less favourable subsequent reproductive potential, possibly due to maternal immunisation against male-specific minor histocompatibility antigens. In a retrospective cohort study of 85 cases of secondary RM we aimed to determine if secondary RM was associated with (i) gender of previous child, maternal age, or duration of miscarriage history, and (ii) increased risk of pregnancy complications. Fifty-three women (62.0%; 53\\/85) gave birth to a male child prior to RM compared to 32 (38.0%; 32\\/85) who gave birth to a female child (p=0.002). The majority (91.7%; 78\\/85) had uncomplicated, term deliveries and normal birth weight neonates, with one quarter of the women previously delivered by Caesarean section. All had routine RM investigations and 19.0% (16\\/85) had an abnormal result. Fifty-seven women conceived again and 33.3% (19\\/57) miscarried, but there was no significant difference in failure rates between those with a previous male or female child (13\\/32 vs. 6\\/25, p=0.2). When patients with abnormal results were excluded, or when women with only one previous child were considered, there was still no difference in these rates. A previous male birth may be associated with an increased risk of secondary RM but numbers preclude concluding whether this increases recurrence risk. The suggested association with previous male birth provides a basis for further investigations at a molecular level.
Secondary recurrent miscarriage is associated with previous male birth.
LENUS (Irish Health Repository)
Ooi, Poh Veh
2011-01-01
Secondary recurrent miscarriage (RM) is defined as three or more consecutive pregnancy losses after delivery of a viable infant. Previous reports suggest that a firstborn male child is associated with less favourable subsequent reproductive potential, possibly due to maternal immunisation against male-specific minor histocompatibility antigens. In a retrospective cohort study of 85 cases of secondary RM we aimed to determine if secondary RM was associated with (i) gender of previous child, maternal age, or duration of miscarriage history, and (ii) increased risk of pregnancy complications. Fifty-three women (62.0%; 53\\/85) gave birth to a male child prior to RM compared to 32 (38.0%; 32\\/85) who gave birth to a female child (p=0.002). The majority (91.7%; 78\\/85) had uncomplicated, term deliveries and normal birth weight neonates, with one quarter of the women previously delivered by Caesarean section. All had routine RM investigations and 19.0% (16\\/85) had an abnormal result. Fifty-seven women conceived again and 33.3% (19\\/57) miscarried, but there was no significant difference in failure rates between those with a previous male or female child (13\\/32 vs. 6\\/25, p=0.2). When patients with abnormal results were excluded, or when women with only one previous child were considered, there was still no difference in these rates. A previous male birth may be associated with an increased risk of secondary RM but numbers preclude concluding whether this increases recurrence risk. The suggested association with previous male birth provides a basis for further investigations at a molecular level.
Knechtle, Beat; Knechtle, Patrizia; Rüst, Christoph Alexander; Rosemann, Thomas; Lepers, Romuald
2012-02-01
The association of characteristics of anthropometry, training, and previous experience with race time in 84 recreational, long-distance, inline skaters at the longest inline marathon in Europe (111 km), the Inline One-eleven in Switzerland, was investigated to identify predictor variables for performance. Age, duration per training unit, and personal best time were the only three variables related to race time in a multiple regression, while none of the 16 anthropometric variables were related. Anthropometric characteristics seem to be of no importance for a fast race time in a long-distance inline skating race in contrast to training volume and previous experience, when controlled with covariates. Improving performance in a long-distance inline skating race might be related to a high training volume and previous race experience. Also, doing such a race requires a parallel psychological effort, mental stamina, focus, and persistence. This may be reflected in the preparation and training for the event. Future studies should investigate what motivates these athletes to train and compete.
Erlotinib-induced rash spares previously irradiated skin
International Nuclear Information System (INIS)
Lips, Irene M.; Vonk, Ernest J.A.; Koster, Mariska E.Y.; Houwing, Ronald H.
2011-01-01
Erlotinib is an epidermal growth factor receptor inhibitor prescribed to patients with locally advanced or metastasized non-small cell lung carcinoma after failure of at least one earlier chemotherapy treatment. Approximately 75% of the patients treated with erlotinib develop acneiform skin rashes. A patient treated with erlotinib 3 months after finishing concomitant treatment with chemotherapy and radiotherapy for non-small cell lung cancer is presented. Unexpectedly, the part of the skin that had been included in his previously radiotherapy field was completely spared from the erlotinib-induced acneiform skin rash. The exact mechanism of erlotinib-induced rash sparing in previously irradiated skin is unclear. The underlying mechanism of this phenomenon needs to be explored further, because the number of patients being treated with a combination of both therapeutic modalities is increasing. The therapeutic effect of erlotinib in the area of the previously irradiated lesion should be assessed. (orig.)
Reasoning with Previous Decisions: Beyond the Doctrine of Precedent
DEFF Research Database (Denmark)
Komárek, Jan
2013-01-01
in different jurisdictions use previous judicial decisions in their argument, we need to move beyond the concept of precedent to a wider notion, which would embrace practices and theories in legal systems outside the Common law tradition. This article presents the concept of ‘reasoning with previous decisions...... law method’, but they are no less rational and intellectually sophisticated. The reason for the rather conceited attitude of some comparatists is in the dominance of the common law paradigm of precedent and the accompanying ‘case law method’. If we want to understand how courts and lawyers......’ as such an alternative and develops its basic models. The article first points out several shortcomings inherent in limiting the inquiry into reasoning with previous decisions by the common law paradigm (1). On the basis of numerous examples provided in section (1), I will present two basic models of reasoning...
[Prevalence of previously diagnosed diabetes mellitus in Mexico.
Rojas-Martínez, Rosalba; Basto-Abreu, Ana; Aguilar-Salinas, Carlos A; Zárate-Rojas, Emiliano; Villalpando, Salvador; Barrientos-Gutiérrez, Tonatiuh
2018-01-01
To compare the prevalence of previously diagnosed diabetes in 2016 with previous national surveys and to describe treatment and its complications. Mexico's national surveys Ensa 2000, Ensanut 2006, 2012 and 2016 were used. For 2016, logistic regression models and measures of central tendency and dispersion were obtained. The prevalence of previously diagnosed diabetes in 2016 was 9.4%. The increase of 2.2% relative to 2012 was not significant and only observed in patients older than 60 years. While preventive measures have increased, the access to medical treatment and lifestyle has not changed. The treatment has been modified, with an increase in insulin and decrease in hypoglycaemic agents. Population aging, lack of screening actions and the increase in diabetes complications will lead to an increase on the burden of disease. Policy measures targeting primary and secondary prevention of diabetes are crucial.
Streaming Parallel GPU Acceleration of Large-Scale filter-based Spiking Neural Networks
L.P. Slazynski (Leszek); S.M. Bohte (Sander)
2012-01-01
htmlabstractThe arrival of graphics processing (GPU) cards suitable for massively parallel computing promises a↵ordable large-scale neural network simulation previously only available at supercomputing facil- ities. While the raw numbers suggest that GPUs may outperform CPUs by at least an order of
Further comments on the geometrical efficiency of a parallel-disk source and detector system
International Nuclear Information System (INIS)
Ruby, L.
1994-01-01
A derivation is presented for a previously published formula, which determines the geometrical efficiency of a parallel-disk source and detector system. The formula involves an integral over a product of two Bessel functions. An algebraic approximation to the integral is also discussed. (orig.)
'Re-zoning' proximal development in a parallel e-learning course ...
African Journals Online (AJOL)
'Re-zoning' proximal development in a parallel e-learning course. ... Journal Home > Vol 22, No 4 (2002) >. Log in or Register to get access to full text downloads. ... This twinning course was introduced to expand learning opportunities in what we ... face-to-face curriculum with less scheduled teaching time than previously.
Parallelizing More Loops with Compiler Guided Refactoring
DEFF Research Database (Denmark)
Larsen, Per; Ladelsky, Razya; Lidman, Jacob
2012-01-01
an interactive compilation feedback system that guides programmers in iteratively modifying their application source code. This helps leverage the compiler’s ability to generate loop-parallel code. We employ our system to modify two sequential benchmarks dealing with image processing and edge detection...
Parallel and Distributed Systems for Probabilistic Reasoning
2012-12-01
Ranganathan "et"al...typically a random permutation over the vertices. Advances by Elidan et al. [2006] and Ranganathan et al. [2007] have focused on dynamic asynchronous...Wildfire algorithm shown in Alg. 3.6 is a direct parallelization of the algorithm proposed by [ Ranganathan et al., 2007]. The Wildfire algorithm
Lock-free parallel garbage collection
H. Gao; J.F. Groote (Jan Friso); W.H. Hesselink (Wim)
2005-01-01
htmlabstract This paper presents a lock-free parallel algorithm for mark&sweep garbage collection (GC) in a realistic model using synchronization primitives compare-and-swap (CAS) and load-linked/store-conditional (LL/SC) offered by machine architectures. Mutators and collectors can simultaneously
Parallel Monte Carlo simulation of aerosol dynamics
Zhou, K.
2014-01-01
A highly efficient Monte Carlo (MC) algorithm is developed for the numerical simulation of aerosol dynamics, that is, nucleation, surface growth, and coagulation. Nucleation and surface growth are handled with deterministic means, while coagulation is simulated with a stochastic method (Marcus-Lushnikov stochastic process). Operator splitting techniques are used to synthesize the deterministic and stochastic parts in the algorithm. The algorithm is parallelized using the Message Passing Interface (MPI). The parallel computing efficiency is investigated through numerical examples. Near 60% parallel efficiency is achieved for the maximum testing case with 3.7 million MC particles running on 93 parallel computing nodes. The algorithm is verified through simulating various testing cases and comparing the simulation results with available analytical and/or other numerical solutions. Generally, it is found that only small number (hundreds or thousands) of MC particles is necessary to accurately predict the aerosol particle number density, volume fraction, and so forth, that is, low order moments of the Particle Size Distribution (PSD) function. Accurately predicting the high order moments of the PSD needs to dramatically increase the number of MC particles. 2014 Kun Zhou et al.
Parallel Education and Defining the Fourth Sector.
Chessell, Diana
1996-01-01
Parallel to the primary, secondary, postsecondary, and adult/community education sectors is education not associated with formal programs--learning in arts and cultural sites. The emergence of cultural and educational tourism is an opportunity for adult/community education to define itself by extending lifelong learning opportunities into parallel…
Evidence of Parallel Processing During Translation
DEFF Research Database (Denmark)
Balling, Laura Winther; Hvelplund, Kristian Tangsgaard; Sjørup, Annette Camilla
2014-01-01
conclude that translation is a parallel process and that literal translation is likely to be a universal initial default strategy in translation. This conclusion is strengthened by the fact that all three experiments were relatively naturalistic, due to the combination of remote eye tracking and mixed...
Vector and parallel processors in computational science
International Nuclear Information System (INIS)
Duff, I.S.; Reid, J.K.
1985-01-01
These proceedings contain the articles presented at the named conference. These concern hardware and software for vector and parallel processors, numerical methods and algorithms for the computation on such processors, as well as applications of such methods to different fields of physics and related sciences. See hints under the relevant topics. (HSI)
Message passing with parallel queue traversal
Underwood, Keith D [Albuquerque, NM; Brightwell, Ronald B [Albuquerque, NM; Hemmert, K Scott [Albuquerque, NM
2012-05-01
In message passing implementations, associative matching structures are used to permit list entries to be searched in parallel fashion, thereby avoiding the delay of linear list traversal. List management capabilities are provided to support list entry turnover semantics and priority ordering semantics.
Parallel Volunteer Learning during Youth Programs
Lesmeister, Marilyn K.; Green, Jeremy; Derby, Amy; Bothum, Candi
2012-01-01
Lack of time is a hindrance for volunteers to participate in educational opportunities, yet volunteer success in an organization is tied to the orientation and education they receive. Meeting diverse educational needs of volunteers can be a challenge for program managers. Scheduling a Volunteer Learning Track for chaperones that is parallel to a…
Parallel electric fields from ionospheric winds
International Nuclear Information System (INIS)
Nakada, M.P.
1987-01-01
The possible production of electric fields parallel to the magnetic field by dynamo winds in the E region is examined, using a jet stream wind model. Current return paths through the F region above the stream are examined as well as return paths through the conjugate ionosphere. The Wulf geometry with horizontal winds moving in opposite directions one above the other is also examined. Parallel electric fields are found to depend strongly on the width of current sheets at the edges of the jet stream. If these are narrow enough, appreciable parallel electric fields are produced. These appear to be sufficient to heat the electrons which reduces the conductivity and produces further increases in parallel electric fields and temperatures. Calculations indicate that high enough temperatures for optical emission can be produced in less than 0.3 s. Some properties of auroras that might be produced by dynamo winds are examined; one property is a time delay in brightening at higher and lower altitudes
Kalman Filter Tracking on Parallel Architectures
International Nuclear Information System (INIS)
Cerati, Giuseppe; Elmer, Peter; Krutelyov, Slava; Lantz, Steven; Lefebvre, Matthieu; McDermott, Kevin; Riley, Daniel; Tadel, Matevž; Wittich, Peter; Würthwein, Frank; Yagil, Avi
2016-01-01
Power density constraints are limiting the performance improvements of modern CPUs. To address this we have seen the introduction of lower-power, multi-core processors such as GPGPU, ARM and Intel MIC. In order to achieve the theoretical performance gains of these processors, it will be necessary to parallelize algorithms to exploit larger numbers of lightweight cores and specialized functions like large vector units. Track finding and fitting is one of the most computationally challenging problems for event reconstruction in particle physics. At the High-Luminosity Large Hadron Collider (HL-LHC), for example, this will be by far the dominant problem. The need for greater parallelism has driven investigations of very different track finding techniques such as Cellular Automata or Hough Transforms. The most common track finding techniques in use today, however, are those based on a Kalman filter approach. Significant experience has been accumulated with these techniques on real tracking detector systems, both in the trigger and offline. They are known to provide high physics performance, are robust, and are in use today at the LHC. Given the utility of the Kalman filter in track finding, we have begun to port these algorithms to parallel architectures, namely Intel Xeon and Xeon Phi. We report here on our progress towards an end-to-end track reconstruction algorithm fully exploiting vectorization and parallelization techniques in a simplified experimental environment
Bessel functions: parallel display and processing.
Lohmann, A W; Ojeda-Castañeda, J; Serrano-Heredia, A
1994-01-01
We present an optical setup that converts planar binary curves into two-dimensional amplitude distributions, which are proportional, along one axis, to the Bessel function of order n, whereas along the other axis the order n increases. This Bessel displayer can be used for parallel Bessel transformation of a signal. Experimental verifications are included.
Hypercube Expert System Shell - Applying Production Parallelism.
1989-12-01
possible processor organizations, or int( rconntction n thod,, for par- allel architetures . The following are examples of commonlv used interconnection...this timing analysis because match speed-up avaiiah& from production parallelism is proportional to the average number of affected produclions1 ( 11:5
Efficient Parallel Algorithms for Unsteady Incompressible Flows
Guermond, Jean-Luc; Minev, Peter D.
2013-01-01
The objective of this paper is to give an overview of recent developments on splitting schemes for solving the time-dependent incompressible Navier–Stokes equations and to discuss possible extensions to the variable density/viscosity case. A particular attention is given to algorithms that can be implemented efficiently on large parallel clusters.
Stranger than fiction parallel universes beguile science
2007-01-01
A staple of mind-bending science fiction, the possibility of multiple universes has long intrigued hard-nosed physicists, mathematicians and cosmologists too. We may not be able -- at least not yet -- to prove they exist, many serious scientists say, but there are plenty of reasons to think that parallel dimensions are more than figments of eggheaded imagination.
Stranger that fiction parallel universes beguile science
2007-01-01
Is the universe -- correction: 'our' universe -- no more than a speck of cosmic dust amid an infinite number of parallel worlds? A staple of mind-bending science fiction, the possibility of multiple universes has long intrigued hard-nosed physicists, mathematicians and cosmologists too.
Stranger than fiction: parallel universes beguile science
Hautefeuille, Annie
2007-01-01
Is the universe-correction: 'our' universe-no more than a speck of cosmic dust amid an infinite number of parallel worlds? A staple of mind-bending science fiction, the possibility of multiple universes has long intrigued hard-nosed physicists, mathematicians and cosmologists too.
Logical inference techniques for loop parallelization
Oancea, Cosmin E.; Rauchwerger, Lawrence
2012-01-01
This paper presents a fully automatic approach to loop parallelization that integrates the use of static and run-time analysis and thus overcomes many known difficulties such as nonlinear and indirect array indexing and complex control flow. Our hybrid analysis framework validates the parallelization transformation by verifying the independence of the loop's memory references. To this end it represents array references using the USR (uniform set representation) language and expresses the independence condition as an equation, S = Ø, where S is a set expression representing array indexes. Using a language instead of an array-abstraction representation for S results in a smaller number of conservative approximations but exhibits a potentially-high runtime cost. To alleviate this cost we introduce a language translation F from the USR set-expression language to an equally rich language of predicates (F(S) ⇒ S = Ø). Loop parallelization is then validated using a novel logic inference algorithm that factorizes the obtained complex predicates (F(S)) into a sequence of sufficient-independence conditions that are evaluated first statically and, when needed, dynamically, in increasing order of their estimated complexities. We evaluate our automated solution on 26 benchmarks from PERFECTCLUB and SPEC suites and show that our approach is effective in parallelizing large, complex loops and obtains much better full program speedups than the Intel and IBM Fortran compilers. Copyright © 2012 ACM.
Design strategies for irregularly adapting parallel applications
International Nuclear Information System (INIS)
Oliker, Leonid; Biswas, Rupak; Shan, Hongzhang; Sing, Jaswinder Pal
2000-01-01
Achieving scalable performance for dynamic irregular applications is eminently challenging. Traditional message-passing approaches have been making steady progress towards this goal; however, they suffer from complex implementation requirements. The use of a global address space greatly simplifies the programming task, but can degrade the performance of dynamically adapting computations. In this work, we examine two major classes of adaptive applications, under five competing programming methodologies and four leading parallel architectures. Results indicate that it is possible to achieve message-passing performance using shared-memory programming techniques by carefully following the same high level strategies. Adaptive applications have computational work loads and communication patterns which change unpredictably at runtime, requiring dynamic load balancing to achieve scalable performance on parallel machines. Efficient parallel implementations of such adaptive applications are therefore a challenging task. This work examines the implementation of two typical adaptive applications, Dynamic Remeshing and N-Body, across various programming paradigms and architectural platforms. We compare several critical factors of the parallel code development, including performance, programmability, scalability, algorithmic development, and portability
Learning and Parallelization Boost Constraint Search
Yun, Xi
2013-01-01
Constraint satisfaction problems are a powerful way to abstract and represent academic and real-world problems from both artificial intelligence and operations research. A constraint satisfaction problem is typically addressed by a sequential constraint solver running on a single processor. Rather than construct a new, parallel solver, this work…
Impedance Control of a Redundant Parallel Manipulator
DEFF Research Database (Denmark)
Méndez, Juan de Dios Flores; Schiøler, Henrik; Madsen, Ole
2017-01-01
This paper presents the design of Impedance Control to a redundantly actuated Parallel Kinematic Manipulator. The proposed control is based on treating each limb as a single system and their connection through the internal interaction forces. The controller introduces a stiffness and damping...
Gestalt and Adventure Therapy: Parallels and Perspectives.
Gilsdorf, Rudiger
This paper calls attention to parallels in the literature of adventure education and that of Gestalt therapy, demonstrating that both are rooted in an experiential tradition. The philosophies of adventure or experiential education and Gestalt therapy have the following areas in common: (1) emphasis on personal growth and the development of present…
Parallel single-cell analysis microfluidic platform
van den Brink, Floris Teunis Gerardus; Gool, Elmar; Frimat, Jean-Philippe; Bomer, Johan G.; van den Berg, Albert; le Gac, Severine
2011-01-01
We report a PDMS microfluidic platform for parallel single-cell analysis (PaSCAl) as a powerful tool to decipher the heterogeneity found in cell populations. Cells are trapped individually in dedicated pockets, and thereafter, a number of invasive or non-invasive analysis schemes are performed.
Vector and parallel processors in computational science
International Nuclear Information System (INIS)
Duff, I.S.; Reid, J.K.
1985-01-01
This book presents the papers given at a conference which reviewed the new developments in parallel and vector processing. Topics considered at the conference included hardware (array processors, supercomputers), programming languages, software aids, numerical methods (e.g., Monte Carlo algorithms, iterative methods, finite elements, optimization), and applications (e.g., neutron transport theory, meteorology, image processing)
An interactive parallel processor for data analysis
International Nuclear Information System (INIS)
Mong, J.; Logan, D.; Maples, C.; Rathbun, W.; Weaver, D.
1984-01-01
A parallel array of eight minicomputers has been assembled in an attempt to deal with kiloparameter data events. By exporting computer system functions to a separate processor, the authors have been able to achieve computer amplification linearly proportional to the number of executing processors
Partitions in languages and parallel computations
Energy Technology Data Exchange (ETDEWEB)
Burgin, M S; Burgina, E S
1982-05-01
Partitions of entries (linguistic structures) are studied that are intended for parallel data processing. The representations of formal languages with the aid of such structures is examined, and the relationships are considered between partitions of entries and abstract families of languages and automata. 18 references.
Heuristic framework for parallel sorting computations | Nwanze ...
African Journals Online (AJOL)
Parallel sorting techniques have become of practical interest with the advent of new multiprocessor architectures. The decreasing cost of these processors will probably in the future, make the solutions that are derived thereof to be more appealing. Efficient algorithms for sorting scheme that are encountered in a number of ...
Algorithms for parallel and vector computations
Ortega, James M.
1995-01-01
This is a final report on work performed under NASA grant NAG-1-1112-FOP during the period March, 1990 through February 1995. Four major topics are covered: (1) solution of nonlinear poisson-type equations; (2) parallel reduced system conjugate gradient method; (3) orderings for conjugate gradient preconditioners, and (4) SOR as a preconditioner.
Parallel algorithms on the ASTRA SIMD machine
International Nuclear Information System (INIS)
Odor, G.; Rohrbach, F.; Vesztergombi, G.; Varga, G.; Tatrai, F.
1996-01-01
In view of the tremendous computing power jump of modern RISC processors the interest in parallel computing seems to be thinning out. Why use a complicated system of parallel processors, if the problem can be solved by a single powerful micro-chip. It is a general law, however, that exponential growth will always end by some kind of a saturation, and then parallelism will again become a hot topic. We try to prepare ourselves for this eventuality. The MPPC project started in 1990 in the keydeys of parallelism and produced four ASTRA machines (presented at CHEP's 92) with 4k processors (which are expandable to 16k) based on yesterday's chip-technology (chip presented at CHEP'91). These machines now provide excellent test-beds for algorithmic developments in a complete, real environment. We are developing for example fast-pattern recognition algorithms which could be used in high-energy physics experiments at the LHC (planned to be operational after 2004 at CERN) for triggering and data reduction. The basic feature of our ASP (Associate String Processor) approach is to use extremely simple (thus very cheap) processor elements but in huge quantities (up to millions of processors) connected together by a very simple string-like communication chain. In this paper we present powerful algorithms based on this architecture indicating the performance perspectives if the hardware quality reaches present or even future technology levels. (author)
Parallel object-oriented specification language
Florescu, O.; Voeten, J.P.M.; Theelen, B.D.; Geilen, M.C.W.; Corporaal, H.; Burns, Alan
2008-01-01
The Parallel Object-Oriented Specification Language (POOSL) is an expressive modelling language for hardware/software systems [10]. It was originally defined in [7] as an object-oriented extension of process algebra CCS [6], supporting (conditional) synchronous message passing between
Massively parallel sequencing of forensic STRs
DEFF Research Database (Denmark)
Parson, Walther; Ballard, David; Budowle, Bruce
2016-01-01
The DNA Commission of the International Society for Forensic Genetics (ISFG) is reviewing factors that need to be considered ahead of the adoption by the forensic community of short tandem repeat (STR) genotyping by massively parallel sequencing (MPS) technologies. MPS produces sequence data that...
A Model for Speedup of Parallel Programs
1997-01-01
Sanjeev. K Setia . The interaction between mem- ory allocation and adaptive partitioning in message- passing multicomputers. In IPPS Workshop on Job...Scheduling Strategies for Parallel Processing, pages 89{99, 1995. [15] Sanjeev K. Setia and Satish K. Tripathi. A compar- ative analysis of static
Parallel computing, failure recovery, and extreme values
DEFF Research Database (Denmark)
Andersen, Lars Nørvang; Asmussen, Søren
A task of random size T is split into M subtasks of lengths T1, . . . , TM, each of which is sent to one out of M parallel processors. Each processor may fail at a random time before completing its allocated task, and then has to restart it from the beginning. If X1, . . . ,XM are the total task ...
Experience with a clustered parallel reduction machine
Beemster, M.; Hartel, Pieter H.; Hertzberger, L.O.; Hofman, R.F.H.; Langendoen, K.G.; Li, L.L.; Milikowski, R.; Vree, W.G.; Barendregt, H.P.; Mulder, J.C.
A clustered architecture has been designed to exploit divide and conquer parallelism in functional programs. The programming methodology developed for the machine is based on explicit annotations and program transformations. It has been successfully applied to a number of algorithms resulting in a
Logical inference techniques for loop parallelization
Oancea, Cosmin E.
2012-01-01
This paper presents a fully automatic approach to loop parallelization that integrates the use of static and run-time analysis and thus overcomes many known difficulties such as nonlinear and indirect array indexing and complex control flow. Our hybrid analysis framework validates the parallelization transformation by verifying the independence of the loop\\'s memory references. To this end it represents array references using the USR (uniform set representation) language and expresses the independence condition as an equation, S = Ø, where S is a set expression representing array indexes. Using a language instead of an array-abstraction representation for S results in a smaller number of conservative approximations but exhibits a potentially-high runtime cost. To alleviate this cost we introduce a language translation F from the USR set-expression language to an equally rich language of predicates (F(S) ⇒ S = Ø). Loop parallelization is then validated using a novel logic inference algorithm that factorizes the obtained complex predicates (F(S)) into a sequence of sufficient-independence conditions that are evaluated first statically and, when needed, dynamically, in increasing order of their estimated complexities. We evaluate our automated solution on 26 benchmarks from PERFECTCLUB and SPEC suites and show that our approach is effective in parallelizing large, complex loops and obtains much better full program speedups than the Intel and IBM Fortran compilers. Copyright © 2012 ACM.
Cardiovascular magnetic resonance in adults with previous cardiovascular surgery.
von Knobelsdorff-Brenkenhoff, Florian; Trauzeddel, Ralf Felix; Schulz-Menger, Jeanette
2014-03-01
Cardiovascular magnetic resonance (CMR) is a versatile non-invasive imaging modality that serves a broad spectrum of indications in clinical cardiology and has proven evidence. Most of the numerous applications are appropriate in patients with previous cardiovascular surgery in the same manner as in non-surgical subjects. However, some specifics have to be considered. This review article is intended to provide information about the application of CMR in adults with previous cardiovascular surgery. In particular, the two main scenarios, i.e. following coronary artery bypass surgery and following heart valve surgery, are highlighted. Furthermore, several pictorial descriptions of other potential indications for CMR after cardiovascular surgery are given.
Researching the Parallel Process in Supervision and Psychotherapy
DEFF Research Database (Denmark)
Jacobsen, Claus Haugaard
Reflects upon how to do process research in supervision and in the parallel process. A single case study is presented illustrating how a study on parallel process can be carried out.......Reflects upon how to do process research in supervision and in the parallel process. A single case study is presented illustrating how a study on parallel process can be carried out....
3D printed soft parallel actuator
Zolfagharian, Ali; Kouzani, Abbas Z.; Khoo, Sui Yang; Noshadi, Amin; Kaynak, Akif
2018-04-01
This paper presents a 3-dimensional (3D) printed soft parallel contactless actuator for the first time. The actuator involves an electro-responsive parallel mechanism made of two segments namely active chain and passive chain both 3D printed. The active chain is attached to the ground from one end and constitutes two actuator links made of responsive hydrogel. The passive chain, on the other hand, is attached to the active chain from one end and consists of two rigid links made of polymer. The actuator links are printed using an extrusion-based 3D-Bioplotter with polyelectrolyte hydrogel as printer ink. The rigid links are also printed by a 3D fused deposition modelling (FDM) printer with acrylonitrile butadiene styrene (ABS) as print material. The kinematics model of the soft parallel actuator is derived via transformation matrices notations to simulate and determine the workspace of the actuator. The printed soft parallel actuator is then immersed into NaOH solution with specific voltage applied to it via two contactless electrodes. The experimental data is then collected and used to develop a parametric model to estimate the end-effector position and regulate kinematics model in response to specific input voltage over time. It is observed that the electroactive actuator demonstrates expected behaviour according to the simulation of its kinematics model. The use of 3D printing for the fabrication of parallel soft actuators opens a new chapter in manufacturing sophisticated soft actuators with high dexterity and mechanical robustness for biomedical applications such as cell manipulation and drug release.
Effects of parallel planning on agreement production.
Veenstra, Alma; Meyer, Antje S; Acheson, Daniel J
2015-11-01
An important issue in current psycholinguistics is how the time course of utterance planning affects the generation of grammatical structures. The current study investigated the influence of parallel activation of the components of complex noun phrases on the generation of subject-verb agreement. Specifically, the lexical interference account (Gillespie & Pearlmutter, 2011b; Solomon & Pearlmutter, 2004) predicts more agreement errors (i.e., attraction) for subject phrases in which the head and local noun mismatch in number (e.g., the apple next to the pears) when nouns are planned in parallel than when they are planned in sequence. We used a speeded picture description task that yielded sentences such as the apple next to the pears is red. The objects mentioned in the noun phrase were either semantically related or unrelated. To induce agreement errors, pictures sometimes mismatched in number. In order to manipulate the likelihood of parallel processing of the objects and to test the hypothesized relationship between parallel processing and the rate of agreement errors, the pictures were either placed close together or far apart. Analyses of the participants' eye movements and speech onset latencies indicated slower processing of the first object and stronger interference from the related (compared to the unrelated) second object in the close than in the far condition. Analyses of the agreement errors yielded an attraction effect, with more errors in mismatching than in matching conditions. However, the magnitude of the attraction effect did not differ across the close and far conditions. Thus, spatial proximity encouraged parallel processing of the pictures, which led to interference of the associated conceptual and/or lexical representation, but, contrary to the prediction, it did not lead to more attraction errors. Copyright © 2015 Elsevier B.V. All rights reserved.
Collectively loading an application in a parallel computer
Aho, Michael E.; Attinella, John E.; Gooding, Thomas M.; Miller, Samuel J.; Mundy, Michael B.
2016-01-05
Collectively loading an application in a parallel computer, the parallel computer comprising a plurality of compute nodes, including: identifying, by a parallel computer control system, a subset of compute nodes in the parallel computer to execute a job; selecting, by the parallel computer control system, one of the subset of compute nodes in the parallel computer as a job leader compute node; retrieving, by the job leader compute node from computer memory, an application for executing the job; and broadcasting, by the job leader to the subset of compute nodes in the parallel computer, the application for executing the job.
On synchronous parallel computations with independent probabilistic choice
International Nuclear Information System (INIS)
Reif, J.H.
1984-01-01
This paper introduces probabilistic choice to synchronous parallel machine models; in particular parallel RAMs. The power of probabilistic choice in parallel computations is illustrate by parallelizing some known probabilistic sequential algorithms. The authors characterize the computational complexity of time, space, and processor bounded probabilistic parallel RAMs in terms of the computational complexity of probabilistic sequential RAMs. They show that parallelism uniformly speeds up time bounded probabilistic sequential RAM computations by nearly a quadratic factor. They also show that probabilistic choice can be eliminated from parallel computations by introducing nonuniformity
Multitasking TORT Under UNICOS: Parallel Performance Models and Measurements
International Nuclear Information System (INIS)
Azmy, Y.Y.; Barnett, D.A.
1999-01-01
The existing parallel algorithms in the TORT discrete ordinates were updated to function in a UNI-COS environment. A performance model for the parallel overhead was derived for the existing algorithms. The largest contributors to the parallel overhead were identified and a new algorithm was developed. A parallel overhead model was also derived for the new algorithm. The results of the comparison of parallel performance models were compared to applications of the code to two TORT standard test problems and a large production problem. The parallel performance models agree well with the measured parallel overhead
Multitasking TORT under UNICOS: Parallel performance models and measurements
International Nuclear Information System (INIS)
Barnett, A.; Azmy, Y.Y.
1999-01-01
The existing parallel algorithms in the TORT discrete ordinates code were updated to function in a UNICOS environment. A performance model for the parallel overhead was derived for the existing algorithms. The largest contributors to the parallel overhead were identified and a new algorithm was developed. A parallel overhead model was also derived for the new algorithm. The results of the comparison of parallel performance models were compared to applications of the code to two TORT standard test problems and a large production problem. The parallel performance models agree well with the measured parallel overhead
Parallelization for first principles electronic state calculation program
International Nuclear Information System (INIS)
Watanabe, Hiroshi; Oguchi, Tamio.
1997-03-01
In this report we study the parallelization for First principles electronic state calculation program. The target machines are NEC SX-4 for shared memory type parallelization and FUJITSU VPP300 for distributed memory type parallelization. The features of each parallel machine are surveyed, and the parallelization methods suitable for each are proposed. It is shown that 1.60 times acceleration is achieved with 2 CPU parallelization by SX-4 and 4.97 times acceleration is achieved with 12 PE parallelization by VPP 300. (author)
Squamous cell carcinoma arising in previously burned or irradiated skin
International Nuclear Information System (INIS)
Edwards, M.J.; Hirsch, R.M.; Broadwater, J.R.; Netscher, D.T.; Ames, F.C.
1989-01-01
Squamous cell carcinoma (SCC) arising in previously burned or irradiated skin was reviewed in 66 patients treated between 1944 and 1986. Healing of the initial injury was complicated in 70% of patients. Mean interval from initial injury to diagnosis of SCC was 37 years. The overwhelming majority of patients presented with a chronic intractable ulcer in previously injured skin. The regional relapse rate after surgical excision was very high, 58% of all patients. Predominant patterns of recurrence were in local skin and regional lymph nodes (93% of recurrences). Survival rates at 5, 10, and 20 years were 52%, 34%, and 23%, respectively. Five-year survival rates in previously burned and irradiated patients were not significantly different (53% and 50%, respectively). This review, one of the largest reported series, better defines SCC arising in previously burned or irradiated skin as a locally aggressive disease that is distinct from SCC arising in sunlight-damaged skin. An increased awareness of the significance of chronic ulceration in scar tissue may allow earlier diagnosis. Regional disease control and survival depend on surgical resection of all known disease and may require radical lymph node dissection or amputation
Outcome Of Pregnancy Following A Previous Lower Segment ...
African Journals Online (AJOL)
Background: A previous ceasarean section is an important variable that influences patient management in subsequent pregnancies. A trial of vaginal delivery in such patients is a feasible alternative to a secondary section, thus aiding to reduce the ceasarean section rate and its associated co-morbidities. Objective: To ...
24 CFR 1710.552 - Previously accepted state filings.
2010-04-01
... of Substantially Equivalent State Law § 1710.552 Previously accepted state filings. (a) Materials... and contracts or agreements contain notice of purchaser's revocation rights. In addition see § 1715.15..., unless the developer is obligated to do so in the contract. (b) If any such filing becomes inactive or...
The job satisfaction of principals of previously disadvantaged schools
African Journals Online (AJOL)
The aim of this study was to identify influences on the job satisfaction of previously disadvantaged ..... I am still riding the cloud … I hope it lasts. .... as a way of creating a climate and culture in schools where individuals are willing to explore.
Haemophilus influenzae type f meningitis in a previously healthy boy
DEFF Research Database (Denmark)
Ronit, Andreas; Berg, Ronan M G; Bruunsgaard, Helle
2013-01-01
Non-serotype b strains of Haemophilus influenzae are extremely rare causes of acute bacterial meningitis in immunocompetent individuals. We report a case of acute bacterial meningitis in a 14-year-old boy, who was previously healthy and had been immunised against H influenzae serotype b (Hib...
Research Note Effects of previous cultivation on regeneration of ...
African Journals Online (AJOL)
We investigated the effects of previous cultivation on regeneration potential under miombo woodlands in a resettlement area, a spatial product of Zimbabwe's land reforms. We predicted that cultivation would affect population structure, regeneration, recruitment and potential grazing capacity of rangelands. Plant attributes ...
Cryptococcal meningitis in a previously healthy child | Chimowa ...
African Journals Online (AJOL)
An 8-year-old previously healthy female presented with a 3 weeks history of headache, neck stiffness, deafness, fever and vomiting and was diagnosed with cryptococcal meningitis. She had documented hearing loss and was referred to tertiary-level care after treatment with fluconazole did not improve her neurological ...
Investigation of previously derived Hyades, Coma, and M67 reddenings
International Nuclear Information System (INIS)
Taylor, B.J.
1980-01-01
New Hyades polarimetry and field star photometry have been obtained to check the Hyades reddening, which was found to be nonzero in a previous paper. The new Hyades polarimetry implies essentially zero reddening; this is also true of polarimetry published by Behr (which was incorrectly interpreted in the previous paper). Four photometric techniques which are presumed to be insensitive to blanketing are used to compare the Hyades to nearby field stars; these four techniques also yield essentially zero reddening. When all of these results are combined with others which the author has previously published and a simultaneous solution for the Hyades, Coma, and M67 reddenings is made, the results are E (B-V) =3 +- 2 (sigma) mmag, -1 +- 3 (sigma) mmag, and 46 +- 6 (sigma) mmag, respectively. No support for a nonzero Hyades reddening is offered by the new results. When the newly obtained reddenings for the Hyades, Coma, and M67 are compared with results from techniques given by Crawford and by users of the David Dunlap Observatory photometric system, no differences between the new and other reddenings are found which are larger than about 2 sigma. The author had previously found that the M67 main-sequence stars have about the same blanketing as that of Coma and less blanketing than the Hyades; this conclusion is essentially unchanged by the revised reddenings
Rapid fish stock depletion in previously unexploited seamounts: the ...
African Journals Online (AJOL)
Rapid fish stock depletion in previously unexploited seamounts: the case of Beryx splendens from the Sierra Leone Rise (Gulf of Guinea) ... A spectral analysis and red-noise spectra procedure (REDFIT) algorithm was used to identify the red-noise spectrum from the gaps in the observed time-series of catch per unit effort by ...
18 CFR 154.302 - Previously submitted material.
2010-04-01
... 18 Conservation of Power and Water Resources 1 2010-04-01 2010-04-01 false Previously submitted material. 154.302 Section 154.302 Conservation of Power and Water Resources FEDERAL ENERGY REGULATORY... concurrently with the rate change filing. There must be furnished to the Director, Office of Energy Market...
Process cells dismantling of EUREX pant: previous activities
International Nuclear Information System (INIS)
Gili, M.
1998-01-01
In the '98-'99 period some process cells of the EUREX pant will be dismantled, in order to place there the liquid wastes conditioning plant 'CORA'. This report resumes the previous activities (plant rinsing campaigns and inactive Cell 014 dismantling), run in the past three years and the drawn experience [it
The job satisfaction of principals of previously disadvantaged schools
African Journals Online (AJOL)
The aim of this study was to identify influences on the job satisfaction of previously disadvantaged school principals in North-West Province. Evans's theory of job satisfaction, morale and motivation was useful as a conceptual framework. A mixedmethods explanatory research design was important in discovering issues with ...
Obstructive pulmonary disease in patients with previous tuberculosis ...
African Journals Online (AJOL)
Obstructive pulmonary disease in patients with previous tuberculosis: Pathophysiology of a community-based cohort. B.W. Allwood, R Gillespie, M Galperin-Aizenberg, M Bateman, H Olckers, L Taborda-Barata, G.L. Calligaro, Q Said-Hartley, R van Zyl-Smit, C.B. Cooper, E van Rikxoort, J Goldin, N Beyers, E.D. Bateman ...
Abiraterone in metastatic prostate cancer without previous chemotherapy
Ryan, Charles J.; Smith, Matthew R.; de Bono, Johann S.; Molina, Arturo; Logothetis, Christopher J.; de Souza, Paul; Fizazi, Karim; Mainwaring, Paul; Piulats, Josep M.; Ng, Siobhan; Carles, Joan; Mulders, Peter F. A.; Basch, Ethan; Small, Eric J.; Saad, Fred; Schrijvers, Dirk; van Poppel, Hendrik; Mukherjee, Som D.; Suttmann, Henrik; Gerritsen, Winald R.; Flaig, Thomas W.; George, Daniel J.; Yu, Evan Y.; Efstathiou, Eleni; Pantuck, Allan; Winquist, Eric; Higano, Celestia S.; Taplin, Mary-Ellen; Park, Youn; Kheoh, Thian; Griffin, Thomas; Scher, Howard I.; Rathkopf, Dana E.; Boyce, A.; Costello, A.; Davis, I.; Ganju, V.; Horvath, L.; Lynch, R.; Marx, G.; Parnis, F.; Shapiro, J.; Singhal, N.; Slancar, M.; van Hazel, G.; Wong, S.; Yip, D.; Carpentier, P.; Luyten, D.; de Reijke, T.
2013-01-01
Abiraterone acetate, an androgen biosynthesis inhibitor, improves overall survival in patients with metastatic castration-resistant prostate cancer after chemotherapy. We evaluated this agent in patients who had not received previous chemotherapy. In this double-blind study, we randomly assigned
Response to health insurance by previously uninsured rural children.
Tilford, J M; Robbins, J M; Shema, S J; Farmer, F L
1999-08-01
To examine the healthcare utilization and costs of previously uninsured rural children. Four years of claims data from a school-based health insurance program located in the Mississippi Delta. All children who were not Medicaid-eligible or were uninsured, were eligible for limited benefits under the program. The 1987 National Medical Expenditure Survey (NMES) was used to compare utilization of services. The study represents a natural experiment in the provision of insurance benefits to a previously uninsured population. Premiums for the claims cost were set with little or no information on expected use of services. Claims from the insurer were used to form a panel data set. Mixed model logistic and linear regressions were estimated to determine the response to insurance for several categories of health services. The use of services increased over time and approached the level of utilization in the NMES. Conditional medical expenditures also increased over time. Actuarial estimates of claims cost greatly exceeded actual claims cost. The provision of a limited medical, dental, and optical benefit package cost approximately $20-$24 per member per month in claims paid. An important uncertainty in providing health insurance to previously uninsured populations is whether a pent-up demand exists for health services. Evidence of a pent-up demand for medical services was not supported in this study of rural school-age children. States considering partnerships with private insurers to implement the State Children's Health Insurance Program could lower premium costs by assembling basic data on previously uninsured children.
Reoperative sentinel lymph node biopsy after previous mastectomy.
Karam, Amer; Stempel, Michelle; Cody, Hiram S; Port, Elisa R
2008-10-01
Sentinel lymph node (SLN) biopsy is the standard of care for axillary staging in breast cancer, but many clinical scenarios questioning the validity of SLN biopsy remain. Here we describe our experience with reoperative-SLN (re-SLN) biopsy after previous mastectomy. Review of the SLN database from September 1996 to December 2007 yielded 20 procedures done in the setting of previous mastectomy. SLN biopsy was performed using radioisotope with or without blue dye injection superior to the mastectomy incision, in the skin flap in all patients. In 17 of 20 patients (85%), re-SLN biopsy was performed for local or regional recurrence after mastectomy. Re-SLN biopsy was successful in 13 of 20 patients (65%) after previous mastectomy. Of the 13 patients, 2 had positive re-SLN, and completion axillary dissection was performed, with 1 having additional positive nodes. In the 11 patients with negative re-SLN, 2 patients underwent completion axillary dissection demonstrating additional negative nodes. One patient with a negative re-SLN experienced chest wall recurrence combined with axillary recurrence 11 months after re-SLN biopsy. All others remained free of local or axillary recurrence. Re-SLN biopsy was unsuccessful in 7 of 20 patients (35%). In three of seven patients, axillary dissection was performed, yielding positive nodes in two of the three. The remaining four of seven patients all had previous modified radical mastectomy, so underwent no additional axillary surgery. In this small series, re-SLN was successful after previous mastectomy, and this procedure may play some role when axillary staging is warranted after mastectomy.
cellGPU: Massively parallel simulations of dynamic vertex models
Sussman, Daniel M.
2017-10-01
Vertex models represent confluent tissue by polygonal or polyhedral tilings of space, with the individual cells interacting via force laws that depend on both the geometry of the cells and the topology of the tessellation. This dependence on the connectivity of the cellular network introduces several complications to performing molecular-dynamics-like simulations of vertex models, and in particular makes parallelizing the simulations difficult. cellGPU addresses this difficulty and lays the foundation for massively parallelized, GPU-based simulations of these models. This article discusses its implementation for a pair of two-dimensional models, and compares the typical performance that can be expected between running cellGPU entirely on the CPU versus its performance when running on a range of commercial and server-grade graphics cards. By implementing the calculation of topological changes and forces on cells in a highly parallelizable fashion, cellGPU enables researchers to simulate time- and length-scales previously inaccessible via existing single-threaded CPU implementations. Program Files doi:http://dx.doi.org/10.17632/6j2cj29t3r.1 Licensing provisions: MIT Programming language: CUDA/C++ Nature of problem: Simulations of off-lattice "vertex models" of cells, in which the interaction forces depend on both the geometry and the topology of the cellular aggregate. Solution method: Highly parallelized GPU-accelerated dynamical simulations in which the force calculations and the topological features can be handled on either the CPU or GPU. Additional comments: The code is hosted at https://gitlab.com/dmsussman/cellGPU, with documentation additionally maintained at http://dmsussman.gitlab.io/cellGPUdocumentation
Analysis of parallel computing performance of the code MCNP
International Nuclear Information System (INIS)
Wang Lei; Wang Kan; Yu Ganglin
2006-01-01
Parallel computing can reduce the running time of the code MCNP effectively. With the MPI message transmitting software, MCNP5 can achieve its parallel computing on PC cluster with Windows operating system. Parallel computing performance of MCNP is influenced by factors such as the type, the complexity level and the parameter configuration of the computing problem. This paper analyzes the parallel computing performance of MCNP regarding with these factors and gives measures to improve the MCNP parallel computing performance. (authors)
A comparison of energetic ions in the plasma depletion layer and the quasi-parallel magnetosheath
Fuselier, Stephen A.
1994-01-01
Energetic ion spectra measured by the Active Magnetospheric Particle Tracer Explorers/Charge Composition Explorer (AMPTE/CCE) downstream from the Earth's quasi-parallel bow shock (in the quasi-parallel magnetosheath) and in the plasma depletion layer are compared. In the latter region, energetic ions are from a single source, leakage of magnetospheric ions across the magnetopause and into the plasma depletion layer. In the former region, both the magnetospheric source and shock acceleration of the thermal solar wind population at the quasi-parallel shock can contribute to the energetic ion spectra. The relative strengths of these two energetic ion sources are determined through the comparison of spectra from the two regions. It is found that magnetospheric leakage can provide an upper limit of 35% of the total energetic H(+) population in the quasi-parallel magnetosheath near the magnetopause in the energy range from approximately 10 to approximately 80 keV/e and substantially less than this limit for the energetic He(2+) population. The rest of the energetic H(+) population and nearly all of the energetic He(2+) population are accelerated out of the thermal solar wind population through shock acceleration processes. By comparing the energetic and thermal He(2+) and H(+) populations in the quasi-parallel magnetosheath, it is found that the quasi-parallel bow shock is 2 to 3 times more efficient at accelerating He(2+) than H(+). This result is consistent with previous estimates from shock acceleration theory and simulati ons.
DEFF Research Database (Denmark)
Li, Helong; Zhou, Wei; Wang, Xiongfei
2018-01-01
This paper addresses the transient current distribution in the multichip half-bridge power modules, where two types of paralleling connections with different current commutation mechanisms are considered: paralleling dies and paralleling half-bridges. It reveals that with paralleling dies, both t...
Parallel Processing of Images in Mobile Devices using BOINC
Curiel, Mariela; Calle, David F.; Santamaría, Alfredo S.; Suarez, David F.; Flórez, Leonardo
2018-04-01
Medical image processing helps health professionals make decisions for the diagnosis and treatment of patients. Since some algorithms for processing images require substantial amounts of resources, one could take advantage of distributed or parallel computing. A mobile grid can be an adequate computing infrastructure for this problem. A mobile grid is a grid that includes mobile devices as resource providers. In a previous step of this research, we selected BOINC as the infrastructure to build our mobile grid. However, parallel processing of images in mobile devices poses at least two important challenges: the execution of standard libraries for processing images and obtaining adequate performance when compared to desktop computers grids. By the time we started our research, the use of BOINC in mobile devices also involved two issues: a) the execution of programs in mobile devices required to modify the code to insert calls to the BOINC API, and b) the division of the image among the mobile devices as well as its merging required additional code in some BOINC components. This article presents answers to these four challenges.
Massive parallel 3D PIC simulation of negative ion extraction
Revel, Adrien; Mochalskyy, Serhiy; Montellano, Ivar Mauricio; Wünderlich, Dirk; Fantz, Ursel; Minea, Tiberiu
2017-09-01
The 3D PIC-MCC code ONIX is dedicated to modeling Negative hydrogen/deuterium Ion (NI) extraction and co-extraction of electrons from radio-frequency driven, low pressure plasma sources. It provides valuable insight on the complex phenomena involved in the extraction process. In previous calculations, a mesh size larger than the Debye length was used, implying numerical electron heating. Important steps have been achieved in terms of computation performance and parallelization efficiency allowing successful massive parallel calculations (4096 cores), imperative to resolve the Debye length. In addition, the numerical algorithms have been improved in terms of grid treatment, i.e., the electric field near the complex geometry boundaries (plasma grid) is calculated more accurately. The revised model preserves the full 3D treatment, but can take advantage of a highly refined mesh. ONIX was used to investigate the role of the mesh size, the re-injection scheme for lost particles (extracted or wall absorbed), and the electron thermalization process on the calculated extracted current and plasma characteristics. It is demonstrated that all numerical schemes give the same NI current distribution for extracted ions. Concerning the electrons, the pair-injection technique is found well-adapted to simulate the sheath in front of the plasma grid.
Parallel Processing of Images in Mobile Devices using BOINC
Directory of Open Access Journals (Sweden)
Curiel Mariela
2018-04-01
Full Text Available Medical image processing helps health professionals make decisions for the diagnosis and treatment of patients. Since some algorithms for processing images require substantial amounts of resources, one could take advantage of distributed or parallel computing. A mobile grid can be an adequate computing infrastructure for this problem. A mobile grid is a grid that includes mobile devices as resource providers. In a previous step of this research, we selected BOINC as the infrastructure to build our mobile grid. However, parallel processing of images in mobile devices poses at least two important challenges: the execution of standard libraries for processing images and obtaining adequate performance when compared to desktop computers grids. By the time we started our research, the use of BOINC in mobile devices also involved two issues: a the execution of programs in mobile devices required to modify the code to insert calls to the BOINC API, and b the division of the image among the mobile devices as well as its merging required additional code in some BOINC components. This article presents answers to these four challenges.
International Nuclear Information System (INIS)
Waintraub, Marcel; Pereira, Claudio M.N.A.; Baptista, Rafael P.
2005-01-01
This work presents the development of a distributed parallel genetic algorithm applied to a nuclear reactor core design optimization. In the implementation of the parallelism, a 'Message Passing Interface' (MPI) library, standard for parallel computation in distributed memory platforms, has been used. Another important characteristic of MPI is its portability for various architectures. The main objectives of this paper are: validation of the results obtained by the application of this algorithm in a nuclear reactor core optimization problem, through comparisons with previous results presented by Pereira et al.; and performance test of the Brazilian Nuclear Engineering Institute (IEN) cluster in reactors physics optimization problems. The experiments demonstrated that the developed parallel genetic algorithm using the MPI library presented significant gains in the obtained results and an accentuated reduction of the processing time. Such results ratify the use of the parallel genetic algorithms for the solution of nuclear reactor core optimization problems. (author)
[Fatal amnioinfusion with previous choriocarcinoma in a parturient woman].
Hrgović, Z; Bukovic, D; Mrcela, M; Hrgović, I; Siebzehnrübl, E; Karelovic, D
2004-04-01
The case of 36-year-old tercipare is described who developed choriocharcinoma in a previous pregnancy. During the first term labour the patient developed cardiac arrest, so reanimation and sectio cesarea was performed. A male new-born was delivered in good condition, but even after intensive therapy and reanimation occurred death of parturient woman with picture of disseminate intravascular coagulopathia (DIK). On autopsy and on histology there was no sign of malignant disease, so it was not possible to connect previous choricarcinoma with amniotic fluid embolism. Maybe was place of choriocarcinoma "locus minoris resistentiae" which later resulted with failure in placentation what was hard to prove. On autopsy we found embolia of lung with a microthrombosis of terminal circulation with punctiformis bleeding in mucous, what stands for DIK.
Challenging previous conceptions of vegetarianism and eating disorders.
Fisak, B; Peterson, R D; Tantleff-Dunn, S; Molnar, J M
2006-12-01
The purpose of this study was to replicate and expand upon previous research that has examined the potential association between vegetarianism and disordered eating. Limitations of previous research studies are addressed, including possible low reliability of measures of eating pathology within vegetarian samples, use of only a few dietary restraint measures, and a paucity of research examining potential differences in body image and food choice motives of vegetarians versus nonvegetarians. Two hundred and fifty-six college students completed a number of measures of eating pathology and body image, and a food choice motives questionnaire. Interestingly, no significant differences were found between vegetarians and nonvegetarians in measures of eating pathology or body image. However, significant differences in food choice motives were found. Implications for both researchers and clinicians are discussed.
Previously unreported abnormalities in Wolfram Syndrome Type 2.
Akturk, Halis Kaan; Yasa, Seda
2017-01-01
Wolfram syndrome (WFS) is a rare autosomal recessive disease with non-autoimmune childhood onset insulin dependent diabetes and optic atrophy. WFS type 2 (WFS2) differs from WFS type 1 (WFS1) with upper intestinal ulcers, bleeding tendency and the lack ofdiabetes insipidus. Li-fespan is short due to related comorbidities. Only a few familieshave been reported with this syndrome with the CISD2 mutation. Here we report two siblings with a clinical diagnosis of WFS2, previously misdiagnosed with type 1 diabetes mellitus and diabetic retinopathy-related blindness. We report possible additional clinical and laboratory findings that have not been pre-viously reported, such as asymptomatic hypoparathyroidism, osteomalacia, growth hormone (GH) deficiency and hepatomegaly. Even though not a requirement for the diagnosis of WFS2 currently, our case series confirm hypogonadotropic hypogonadism to be also a feature of this syndrome, as reported before. © Polish Society for Pediatric Endocrinology and Diabetology.
Data communications in a parallel active messaging interface of a parallel computer
Archer, Charles J; Blocksome, Michael A; Ratterman, Joseph D; Smith, Brian E
2013-10-29
Data communications in a parallel active messaging interface (`PAMI`) of a parallel computer, the parallel computer including a plurality of compute nodes that execute a parallel application, the PAMI composed of data communications endpoints, each endpoint including a specification of data communications parameters for a thread of execution on a compute node, including specifications of a client, a context, and a task, the compute nodes and the endpoints coupled for data communications through the PAMI and through data communications resources, including receiving in an origin endpoint of the PAMI a data communications instruction, the instruction characterized by an instruction type, the instruction specifying a transmission of transfer data from the origin endpoint to a target endpoint and transmitting, in accordance with the instruction type, the transfer data from the origin endpoint to the target endpoint.
A possibility of parallel and anti-parallel diffraction measurements on ...
Indian Academy of Sciences (India)
resolution property of the other one, anti-parallel position, is very poor. .... in a wide angular region using BPC mochromator at the MF condition by showing ... and N Nimura, Proceedings of the 7th World Conference on Neutron Radiography,.
Previous climatic alterations are caused by the sun
International Nuclear Information System (INIS)
Groenaas, Sigbjoern
2003-01-01
The article surveys the scientific results of previous research into the contribution of the sun to climatic alterations. The author concludes that there is evidence of eight cold periods after the last ice age and that the alterations largely were due to climate effects from the sun. However, these effects are only causing a fraction of the registered global warming. It is assumed that the human activities are contributing to the rest of the greenhouse effect
Influence of previous knowledge in Torrance tests of creative thinking
Aranguren, María; Consejo Nacional de Investigaciones Científicas y Técnicas CONICET
2015-01-01
The aim of this work is to analyze the influence of study field, expertise and recreational activities participation in Torrance Tests of Creative Thinking (TTCT, 1974) performance. Several hypotheses were postulated to explore the possible effects of previous knowledge in TTCT verbal and TTCT figural university students’ outcomes. Participants in this study included 418 students from five study fields: Psychology;Philosophy and Literature, Music; Engineering; and Journalism and Advertisin...
Analysis of previous screening examinations for patients with breast cancer
International Nuclear Information System (INIS)
Lee, Eun Hye; Cha, Joo Hee; Han, Dae Hee; Choi, Young Ho; Hwang, Ki Tae; Ryu, Dae Sik; Kwak, Jin Ho; Moon, Woo Kyung
2007-01-01
We wanted to improve the quality of subsequent screening by reviewing the previous screening of breast cancer patients. Twenty-four breast cancer patients who underwent previous screening were enrolled. All 24 took mammograms and 15 patients also took sonograms. We reviewed the screening retrospectively according to the BI-RADS criteria and we categorized the results into false negative, true negative, true positive and occult cancers. We also categorized the causes of false negative cancers into misperception, misinterpretation and technical factors and then we analyzed the attributing factors. Review of the previous screening revealed 66.7% (16/24) false negative, 25.0% (6/24) true negative, and 8.3% (2/24) true positive cancers. False negative cancers were caused by the mammogram in 56.3% (9/16) and by the sonogram in 43.7% (7/16). For the false negative cases, all of misperception were related with mammograms and this was attributed to dense breast, a lesion located at the edge of glandular tissue or the image, and findings seen on one view only. Almost all misinterpretations were related with sonograms and attributed to loose application of the final assessment. To improve the quality of breast screening, it is essential to overcome the main causes of false negative examinations, including misperception and misinterpretation. We need systematic education and strict application of final assessment categories of BI-RADS. For effective communication among physicians, it is also necessary to properly educate them about BI-RADS
Parallel scalability of Hartree-Fock calculations
Chow, Edmond; Liu, Xing; Smelyanskiy, Mikhail; Hammond, Jeff R.
2015-03-01
Quantum chemistry is increasingly performed using large cluster computers consisting of multiple interconnected nodes. For a fixed molecular problem, the efficiency of a calculation usually decreases as more nodes are used, due to the cost of communication between the nodes. This paper empirically investigates the parallel scalability of Hartree-Fock calculations. The construction of the Fock matrix and the density matrix calculation are analyzed separately. For the former, we use a parallelization of Fock matrix construction based on a static partitioning of work followed by a work stealing phase. For the latter, we use density matrix purification from the linear scaling methods literature, but without using sparsity. When using large numbers of nodes for moderately sized problems, density matrix computations are network-bandwidth bound, making purification methods potentially faster than eigendecomposition methods.
Locating hardware faults in a parallel computer
Archer, Charles J.; Megerian, Mark G.; Ratterman, Joseph D.; Smith, Brian E.
2010-04-13
Locating hardware faults in a parallel computer, including defining within a tree network of the parallel computer two or more sets of non-overlapping test levels of compute nodes of the network that together include all the data communications links of the network, each non-overlapping test level comprising two or more adjacent tiers of the tree; defining test cells within each non-overlapping test level, each test cell comprising a subtree of the tree including a subtree root compute node and all descendant compute nodes of the subtree root compute node within a non-overlapping test level; performing, separately on each set of non-overlapping test levels, an uplink test on all test cells in a set of non-overlapping test levels; and performing, separately from the uplink tests and separately on each set of non-overlapping test levels, a downlink test on all test cells in a set of non-overlapping test levels.
Parallel interactive data analysis with PROOF
International Nuclear Information System (INIS)
Ballintijn, Maarten; Biskup, Marek; Brun, Rene; Canal, Philippe; Feichtinger, Derek; Ganis, Gerardo; Kickinger, Guenter; Peters, Andreas; Rademakers, Fons
2006-01-01
The Parallel ROOT Facility, PROOF, enables the analysis of much larger data sets on a shorter time scale. It exploits the inherent parallelism in data of uncorrelated events via a multi-tier architecture that optimizes I/O and CPU utilization in heterogeneous clusters with distributed storage. The system provides transparent and interactive access to gigabytes today. Being part of the ROOT framework PROOF inherits the benefits of a performant object storage system and a wealth of statistical and visualization tools. This paper describes the data analysis model of ROOT and the latest developments on closer integration of PROOF into that model and the ROOT user environment, e.g. support for PROOF-based browsing of trees stored remotely, and the popular TTree::Draw() interface. We also outline the ongoing developments aimed to improve the flexibility and user-friendliness of the system
Electromagnetic Physics Models for Parallel Computing Architectures
International Nuclear Information System (INIS)
Amadio, G; Bianchini, C; Iope, R; Ananya, A; Apostolakis, J; Aurora, A; Bandieramonte, M; Brun, R; Carminati, F; Gheata, A; Gheata, M; Goulas, I; Nikitina, T; Bhattacharyya, A; Mohanty, A; Canal, P; Elvira, D; Jun, S Y; Lima, G; Duhem, L
2016-01-01
The recent emergence of hardware architectures characterized by many-core or accelerated processors has opened new opportunities for concurrent programming models taking advantage of both SIMD and SIMT architectures. GeantV, a next generation detector simulation, has been designed to exploit both the vector capability of mainstream CPUs and multi-threading capabilities of coprocessors including NVidia GPUs and Intel Xeon Phi. The characteristics of these architectures are very different in terms of the vectorization depth and type of parallelization needed to achieve optimal performance. In this paper we describe implementation of electromagnetic physics models developed for parallel computing architectures as a part of the GeantV project. Results of preliminary performance evaluation and physics validation are presented as well. (paper)
Electromagnetic Physics Models for Parallel Computing Architectures
Amadio, G.; Ananya, A.; Apostolakis, J.; Aurora, A.; Bandieramonte, M.; Bhattacharyya, A.; Bianchini, C.; Brun, R.; Canal, P.; Carminati, F.; Duhem, L.; Elvira, D.; Gheata, A.; Gheata, M.; Goulas, I.; Iope, R.; Jun, S. Y.; Lima, G.; Mohanty, A.; Nikitina, T.; Novak, M.; Pokorski, W.; Ribon, A.; Seghal, R.; Shadura, O.; Vallecorsa, S.; Wenzel, S.; Zhang, Y.
2016-10-01
The recent emergence of hardware architectures characterized by many-core or accelerated processors has opened new opportunities for concurrent programming models taking advantage of both SIMD and SIMT architectures. GeantV, a next generation detector simulation, has been designed to exploit both the vector capability of mainstream CPUs and multi-threading capabilities of coprocessors including NVidia GPUs and Intel Xeon Phi. The characteristics of these architectures are very different in terms of the vectorization depth and type of parallelization needed to achieve optimal performance. In this paper we describe implementation of electromagnetic physics models developed for parallel computing architectures as a part of the GeantV project. Results of preliminary performance evaluation and physics validation are presented as well.
Vacuum Large Current Parallel Transfer Numerical Analysis
Directory of Open Access Journals (Sweden)
Enyuan Dong
2014-01-01
Full Text Available The stable operation and reliable breaking of large generator current are a difficult problem in power system. It can be solved successfully by the parallel interrupters and proper timing sequence with phase-control technology, in which the strategy of breaker’s control is decided by the time of both the first-opening phase and second-opening phase. The precise transfer current’s model can provide the proper timing sequence to break the generator circuit breaker. By analysis of the transfer current’s experiments and data, the real vacuum arc resistance and precise correctional model in the large transfer current’s process are obtained in this paper. The transfer time calculated by the correctional model of transfer current is very close to the actual transfer time. It can provide guidance for planning proper timing sequence and breaking the vacuum generator circuit breaker with the parallel interrupters.
Complementarity beyond physics Niels Bohr's parallels
Bala, Arun
2017-01-01
In this study Arun Bala examines the implications that Niels Bohr’s principle of complementarity holds for fields beyond physics. Bohr, one of the founding figures of modern quantum physics, argued that the principle of complementarity he proposed for understanding atomic processes has parallels in psychology, biology, and social science, as well as in Buddhist and Taoist thought. But Bohr failed to offer any explanation for why complementarity might extend beyond physics, and his claims have been widely rejected by scientists as empty speculation. Scientific scepticism has only been reinforced by the naïve enthusiasm of postmodern relativists and New Age intuitionists, who seize upon Bohr’s ideas to justify anti-realist and mystical positions. Arun Bala offers a detailed defence of Bohr’s claim that complementarity has far-reaching implications for the biological and social sciences, as well as for comparative philosophies of science, by explaining Bohr’s parallels as responses to the omnipresence...
Flexibility and Performance of Parallel File Systems
Kotz, David; Nieuwejaar, Nils
1996-01-01
As we gain experience with parallel file systems, it becomes increasingly clear that a single solution does not suit all applications. For example, it appears to be impossible to find a single appropriate interface, caching policy, file structure, or disk-management strategy. Furthermore, the proliferation of file-system interfaces and abstractions make applications difficult to port. We propose that the traditional functionality of parallel file systems be separated into two components: a fixed core that is standard on all platforms, encapsulating only primitive abstractions and interfaces, and a set of high-level libraries to provide a variety of abstractions and application-programmer interfaces (API's). We present our current and next-generation file systems as examples of this structure. Their features, such as a three-dimensional file structure, strided read and write interfaces, and I/O-node programs, are specifically designed with the flexibility and performance necessary to support a wide range of applications.
(Nearly) portable PIC code for parallel computers
International Nuclear Information System (INIS)
Decyk, V.K.
1993-01-01
As part of the Numerical Tokamak Project, the author has developed a (nearly) portable, one dimensional version of the GCPIC algorithm for particle-in-cell codes on parallel computers. This algorithm uses a spatial domain decomposition for the fields, and passes particles from one domain to another as the particles move spatially. With only minor changes, the code has been run in parallel on the Intel Delta, the Cray C-90, the IBM ES/9000 and a cluster of workstations. After a line by line translation into cmfortran, the code was also run on the CM-200. Impressive speeds have been achieved, both on the Intel Delta and the Cray C-90, around 30 nanoseconds per particle per time step. In addition, the author was able to isolate the data management modules, so that the physics modules were not changed much from their sequential version, and the data management modules can be used as open-quotes black boxes.close quotes
Parallel Jacobi EVD Methods on Integrated Circuits
Directory of Open Access Journals (Sweden)
Chi-Chia Sun
2014-01-01
Full Text Available Design strategies for parallel iterative algorithms are presented. In order to further study different tradeoff strategies in design criteria for integrated circuits, A 10 × 10 Jacobi Brent-Luk-EVD array with the simplified μ-CORDIC processor is used as an example. The experimental results show that using the μ-CORDIC processor is beneficial for the design criteria as it yields a smaller area, faster overall computation time, and less energy consumption than the regular CORDIC processor. It is worth to notice that the proposed parallel EVD method can be applied to real-time and low-power array signal processing algorithms performing beamforming or DOA estimation.
Oxytocin: parallel processing in the social brain?
Dölen, Gül
2015-06-01
Early studies attempting to disentangle the network complexity of the brain exploited the accessibility of sensory receptive fields to reveal circuits made up of synapses connected both in series and in parallel. More recently, extension of this organisational principle beyond the sensory systems has been made possible by the advent of modern molecular, viral and optogenetic approaches. Here, evidence supporting parallel processing of social behaviours mediated by oxytocin is reviewed. Understanding oxytocinergic signalling from this perspective has significant implications for the design of oxytocin-based therapeutic interventions aimed at disorders such as autism, where disrupted social function is a core clinical feature. Moreover, identification of opportunities for novel technology development will require a better appreciation of the complexity of the circuit-level organisation of the social brain. © 2015 The Authors. Journal of Neuroendocrinology published by John Wiley & Sons Ltd on behalf of British Society for Neuroendocrinology.
The new landscape of parallel computer architecture
Energy Technology Data Exchange (ETDEWEB)
Shalf, John [NERSC Division, Lawrence Berkeley National Laboratory 1 Cyclotron Road, Berkeley California, 94720 (United States)
2007-07-15
The past few years has seen a sea change in computer architecture that will impact every facet of our society as every electronic device from cell phone to supercomputer will need to confront parallelism of unprecedented scale. Whereas the conventional multicore approach (2, 4, and even 8 cores) adopted by the computing industry will eventually hit a performance plateau, the highest performance per watt and per chip area is achieved using manycore technology (hundreds or even thousands of cores). However, fully unleashing the potential of the manycore approach to ensure future advances in sustained computational performance will require fundamental advances in computer architecture and programming models that are nothing short of reinventing computing. In this paper we examine the reasons behind the movement to exponentially increasing parallelism, and its ramifications for system design, applications and programming models.
Parallel Evolutionary Optimization for Neuromorphic Network Training
Energy Technology Data Exchange (ETDEWEB)
Schuman, Catherine D [ORNL; Disney, Adam [University of Tennessee (UT); Singh, Susheela [North Carolina State University (NCSU), Raleigh; Bruer, Grant [University of Tennessee (UT); Mitchell, John Parker [University of Tennessee (UT); Klibisz, Aleksander [University of Tennessee (UT); Plank, James [University of Tennessee (UT)
2016-01-01
One of the key impediments to the success of current neuromorphic computing architectures is the issue of how best to program them. Evolutionary optimization (EO) is one promising programming technique; in particular, its wide applicability makes it especially attractive for neuromorphic architectures, which can have many different characteristics. In this paper, we explore different facets of EO on a spiking neuromorphic computing model called DANNA. We focus on the performance of EO in the design of our DANNA simulator, and on how to structure EO on both multicore and massively parallel computing systems. We evaluate how our parallel methods impact the performance of EO on Titan, the U.S.'s largest open science supercomputer, and BOB, a Beowulf-style cluster of Raspberry Pi's. We also focus on how to improve the EO by evaluating commonality in higher performing neural networks, and present the result of a study that evaluates the EO performed by Titan.
Parallelization of a blind deconvolution algorithm
Matson, Charles L.; Borelli, Kathy J.
2006-09-01
Often it is of interest to deblur imagery in order to obtain higher-resolution images. Deblurring requires knowledge of the blurring function - information that is often not available separately from the blurred imagery. Blind deconvolution algorithms overcome this problem by jointly estimating both the high-resolution image and the blurring function from the blurred imagery. Because blind deconvolution algorithms are iterative in nature, they can take minutes to days to deblur an image depending how many frames of data are used for the deblurring and the platforms on which the algorithms are executed. Here we present our progress in parallelizing a blind deconvolution algorithm to increase its execution speed. This progress includes sub-frame parallelization and a code structure that is not specialized to a specific computer hardware architecture.
Capacity Bounds for Parallel Optical Wireless Channels
Chaaban, Anas; Rezki, Zouheir; Alouini, Mohamed-Slim
2016-01-01
A system consisting of parallel optical wireless channels with a total average intensity constraint is studied. Capacity upper and lower bounds for this system are derived. Under perfect channel-state information at the transmitter (CSIT), the bounds have to be optimized with respect to the power allocation over the parallel channels. The optimization of the lower bound is non-convex, however, the KKT conditions can be used to find a list of possible solutions one of which is optimal. The optimal solution can then be found by an exhaustive search algorithm, which is computationally expensive. To overcome this, we propose low-complexity power allocation algorithms which are nearly optimal. The optimized capacity lower bound nearly coincides with the capacity at high SNR. Without CSIT, our capacity bounds lead to upper and lower bounds on the outage probability. The outage probability bounds meet at high SNR. The system with average and peak intensity constraints is also discussed.
Parallel algorithms for boundary value problems
Lin, Avi
1991-01-01
A general approach to solve boundary value problems numerically in a parallel environment is discussed. The basic algorithm consists of two steps: the local step where all the P available processors work in parallel, and the global step where one processor solves a tridiagonal linear system of the order P. The main advantages of this approach are twofold. First, this suggested approach is very flexible, especially in the local step and thus the algorithm can be used with any number of processors and with any of the SIMD or MIMD machines. Secondly, the communication complexity is very small and thus can be used as easily with shared memory machines. Several examples for using this strategy are discussed.
Parallel GPU implementation of iterative PCA algorithms.
Andrecut, M
2009-11-01
Principal component analysis (PCA) is a key statistical technique for multivariate data analysis. For large data sets, the common approach to PCA computation is based on the standard NIPALS-PCA algorithm, which unfortunately suffers from loss of orthogonality, and therefore its applicability is usually limited to the estimation of the first few components. Here we present an algorithm based on Gram-Schmidt orthogonalization (called GS-PCA), which eliminates this shortcoming of NIPALS-PCA. Also, we discuss the GPU (Graphics Processing Unit) parallel implementation of both NIPALS-PCA and GS-PCA algorithms. The numerical results show that the GPU parallel optimized versions, based on CUBLAS (NVIDIA), are substantially faster (up to 12 times) than the CPU optimized versions based on CBLAS (GNU Scientific Library).
Frontiers of massively parallel scientific computation
International Nuclear Information System (INIS)
Fischer, J.R.
1987-07-01
Practical applications using massively parallel computer hardware first appeared during the 1980s. Their development was motivated by the need for computing power orders of magnitude beyond that available today for tasks such as numerical simulation of complex physical and biological processes, generation of interactive visual displays, satellite image analysis, and knowledge based systems. Representative of the first generation of this new class of computers is the Massively Parallel Processor (MPP). A team of scientists was provided the opportunity to test and implement their algorithms on the MPP. The first results are presented. The research spans a broad variety of applications including Earth sciences, physics, signal and image processing, computer science, and graphics. The performance of the MPP was very good. Results obtained using the Connection Machine and the Distributed Array Processor (DAP) are presented
Computation and parallel implementation for early vision
Gualtieri, J. Anthony
1990-01-01
The problem of early vision is to transform one or more retinal illuminance images-pixel arrays-to image representations built out of such primitive visual features such as edges, regions, disparities, and clusters. These transformed representations form the input to later vision stages that perform higher level vision tasks including matching and recognition. Researchers developed algorithms for: (1) edge finding in the scale space formulation; (2) correlation methods for computing matches between pairs of images; and (3) clustering of data by neural networks. These algorithms are formulated for parallel implementation of SIMD machines, such as the Massively Parallel Processor, a 128 x 128 array processor with 1024 bits of local memory per processor. For some cases, researchers can show speedups of three orders of magnitude over serial implementations.
A parallel robot to assist vitreoretinal surgery
Energy Technology Data Exchange (ETDEWEB)
Nakano, Taiga; Sugita, Naohiko; Mitsuishi, Mamoru [University of Tokyo, School of Engineering, Tokyo (Japan); Ueta, Takashi; Tamaki, Yasuhiro [University of Tokyo, Graduate School of Medicine, Tokyo (Japan)
2009-11-15
This paper describes the development and evaluation of a parallel prototype robot for vitreoretinal surgery where physiological hand tremor limits performance. The manipulator was specifically designed to meet requirements such as size, precision, and sterilization; this has six-degree-of-freedom parallel architecture and provides positioning accuracy with micrometer resolution within the eye. The manipulator is controlled by an operator with a ''master manipulator'' consisting of multiple joints. Results of the in vitro experiments revealed that when compared to the manual procedure, a higher stability and accuracy of tool positioning could be achieved using the prototype robot. This microsurgical system that we have developed has superior operability as compared to traditional manual procedure and has sufficient potential to be used clinically for vitreoretinal surgery. (orig.)
Impact analysis on a massively parallel computer
International Nuclear Information System (INIS)
Zacharia, T.; Aramayo, G.A.
1994-01-01
Advanced mathematical techniques and computer simulation play a major role in evaluating and enhancing the design of beverage cans, industrial, and transportation containers for improved performance. Numerical models are used to evaluate the impact requirements of containers used by the Department of Energy (DOE) for transporting radioactive materials. Many of these models are highly compute-intensive. An analysis may require several hours of computational time on current supercomputers despite the simplicity of the models being studied. As computer simulations and materials databases grow in complexity, massively parallel computers have become important tools. Massively parallel computational research at the Oak Ridge National Laboratory (ORNL) and its application to the impact analysis of shipping containers is briefly described in this paper
The new landscape of parallel computer architecture
International Nuclear Information System (INIS)
Shalf, John
2007-01-01
The past few years has seen a sea change in computer architecture that will impact every facet of our society as every electronic device from cell phone to supercomputer will need to confront parallelism of unprecedented scale. Whereas the conventional multicore approach (2, 4, and even 8 cores) adopted by the computing industry will eventually hit a performance plateau, the highest performance per watt and per chip area is achieved using manycore technology (hundreds or even thousands of cores). However, fully unleashing the potential of the manycore approach to ensure future advances in sustained computational performance will require fundamental advances in computer architecture and programming models that are nothing short of reinventing computing. In this paper we examine the reasons behind the movement to exponentially increasing parallelism, and its ramifications for system design, applications and programming models
Parallel magnetotransport in multiple quantum well structures
International Nuclear Information System (INIS)
Sheregii, E.M.; Ploch, D.; Marchewka, M.; Tomaka, G.; Kolek, A.; Stadler, A.; Mleczko, K.; Strupinski, W.; Jasik, A.; Jakiela, R.
2004-01-01
The results of investigations of parallel magnetotransport in AlGaAs/GaAs and InGaAs/InAlAs/InP multiple quantum wells structures (MQW's) are presented in this paper. The MQW's were obtained by metalorganic vapour phase epitaxy with different shapes of QW, numbers of QW and levels of doping. The magnetotransport measurements were performed in wide region of temperatures (0.5-300 K) and at high magnetic fields up to 30 T (B is perpendicular and current is parallel to the plane of the QW). Three types of observed effects are analyzed: quantum Hall effect and Shubnikov-de Haas oscillations at low temperatures (0.5-6 K) as well as magnetophonon resonance at higher temperatures (77-300 K)
Multi-petascale highly efficient parallel supercomputer
Asaad, Sameh; Bellofatto, Ralph E.; Blocksome, Michael A.; Blumrich, Matthias A.; Boyle, Peter; Brunheroto, Jose R.; Chen, Dong; Cher, Chen-Yong; Chiu, George L.; Christ, Norman; Coteus, Paul W.; Davis, Kristan D.; Dozsa, Gabor J.; Eichenberger, Alexandre E.; Eisley, Noel A.; Ellavsky, Matthew R.; Evans, Kahn C.; Fleischer, Bruce M.; Fox, Thomas W.; Gara, Alan; Giampapa, Mark E.; Gooding, Thomas M.; Gschwind, Michael K.; Gunnels, John A.; Hall, Shawn A.; Haring, Rudolf A.; Heidelberger, Philip; Inglett, Todd A.; Knudson, Brant L.; Kopcsay, Gerard V.; Kumar, Sameer; Mamidala, Amith R.; Marcella, James A.; Megerian, Mark G.; Miller, Douglas R.; Miller, Samuel J.; Muff, Adam J.; Mundy, Michael B.; O'Brien, John K.; O'Brien, Kathryn M.; Ohmacht, Martin; Parker, Jeffrey J.; Poole, Ruth J.; Ratterman, Joseph D.; Salapura, Valentina; Satterfield, David L.; Senger, Robert M.; Steinmacher-Burow, Burkhard; Stockdell, William M.; Stunkel, Craig B.; Sugavanam, Krishnan; Sugawara, Yutaka; Takken, Todd E.; Trager, Barry M.; Van Oosten, James L.; Wait, Charles D.; Walkup, Robert E.; Watson, Alfred T.; Wisniewski, Robert W.; Wu, Peng
2018-05-15
A Multi-Petascale Highly Efficient Parallel Supercomputer of 100 petaflop-scale includes node architectures based upon System-On-a-Chip technology, where each processing node comprises a single Application Specific Integrated Circuit (ASIC). The ASIC nodes are interconnected by a five dimensional torus network that optimally maximize the throughput of packet communications between nodes and minimize latency. The network implements collective network and a global asynchronous network that provides global barrier and notification functions. Integrated in the node design include a list-based prefetcher. The memory system implements transaction memory, thread level speculation, and multiversioning cache that improves soft error rate at the same time and supports DMA functionality allowing for parallel processing message-passing.
Neural nets for massively parallel optimization
Dixon, Laurence C. W.; Mills, David
1992-07-01
To apply massively parallel processing systems to the solution of large scale optimization problems it is desirable to be able to evaluate any function f(z), z (epsilon) Rn in a parallel manner. The theorem of Cybenko, Hecht Nielsen, Hornik, Stinchcombe and White, and Funahasi shows that this can be achieved by a neural network with one hidden layer. In this paper we address the problem of the number of nodes required in the layer to achieve a given accuracy in the function and gradient values at all points within a given n dimensional interval. The type of activation function needed to obtain nonsingular Hessian matrices is described and a strategy for obtaining accurate minimal networks presented.
Parallel channel effects under BWR LOCA conditions
International Nuclear Information System (INIS)
Suzuki, H.; Hatamiya, S.; Murase, M.
1988-01-01
Due to parallel channel effects, different flow patterns such as liquid down-flow and gas up-flow appear simultaneously in fuel bundles of a BWR core during postulated LOCAs. Applying the parallel channel effects to the fuel bundle, water drain tubes with a restricted bottom end have been developed in order to mitigate counter-current flow limiting and to increase the falling water flow rate at the upper tie plate. The upper tie plate with water drain tubes is an especially effective means of increasing the safety margin of a reactor with narrow gaps between fuel rods and high steam velocity at the upper tie plate. The characteristics of the water drain tubes have been experimentally investigated using a small-scaled steam-water system simulating a BWR core. Then, their effect on the fuel cladding temperature was evaluated using the LOCA analysis program SAFER. (orig.)
Parallel Relational Universes – experiments in modularity
DEFF Research Database (Denmark)
Pagliarini, Luigi; Lund, Henrik Hautop
2015-01-01
: We here describe Parallel Relational Universes, an artistic method used for the psychological analysis of group dynamics. The design of the artistic system, which mediates group dynamics, emerges from our studies of modular playware and remixing playware. Inspired from remixing modular playware......, where users remix samples in the form of physical and functional modules, we created an artistic instantiation of such a concept with the Parallel Relational Universes, allowing arts alumni to remix artistic expressions. Here, we report the data emerged from a first pre-test, run with gymnasium’s alumni....... We then report both the artistic and the psychological findings. We discuss possible variations of such an instrument. Between an art piece and a psychological test, at a first cognitive analysis, it seems to be a promising research tool...
SNSPD with parallel nanowires (Conference Presentation)
Ejrnaes, Mikkel; Parlato, Loredana; Gaggero, Alessandro; Mattioli, Francesco; Leoni, Roberto; Pepe, Giampiero; Cristiano, Roberto
2017-05-01
Superconducting nanowire single-photon detectors (SNSPDs) have shown to be promising in applications such as quantum communication and computation, quantum optics, imaging, metrology and sensing. They offer the advantages of a low dark count rate, high efficiency, a broadband response, a short time jitter, a high repetition rate, and no need for gated-mode operation. Several SNSPD designs have been proposed in literature. Here, we discuss the so-called parallel nanowires configurations. They were introduced with the aim of improving some SNSPD property like detection efficiency, speed, signal-to-noise ratio, or photon number resolution. Although apparently similar, the various parallel designs are not the same. There is no one design that can improve the mentioned properties all together. In fact, each design presents its own characteristics with specific advantages and drawbacks. In this work, we will discuss the various designs outlining peculiarities and possible improvements.
Nonlocal microscopic theory of quantum friction between parallel metallic slabs
International Nuclear Information System (INIS)
Despoja, Vito; Echenique, Pedro M.; Sunjic, Marijan
2011-01-01
We present a new derivation of the friction force between two metallic slabs moving with constant relative parallel velocity, based on T=0 quantum-field theory formalism. By including a fully nonlocal description of dynamically screened electron fluctuations in the slab, and avoiding the usual matching-condition procedure, we generalize previous expressions for the friction force, to which our results reduce in the local limit. Analyzing the friction force calculated in the two local models and in the nonlocal theory, we show that for physically relevant velocities local theories using the plasmon and Drude models of dielectric response are inappropriate to describe friction, which is due to excitation of low-energy electron-hole pairs, which are properly included in nonlocal theory. We also show that inclusion of dissipation in the nonlocal electronic response has negligible influence on friction.
Plagiarism Detection for Indonesian Language using Winnowing with Parallel Processing
Arifin, Y.; Isa, S. M.; Wulandhari, L. A.; Abdurachman, E.
2018-03-01
The plagiarism has many forms, not only copy paste but include changing passive become active voice, or paraphrasing without appropriate acknowledgment. It happens on all language include Indonesian Language. There are many previous research that related with plagiarism detection in Indonesian Language with different method. But there are still some part that still has opportunity to improve. This research proposed the solution that can improve the plagiarism detection technique that can detect not only copy paste form but more advance than that. The proposed solution is using Winnowing with some addition process in pre-processing stage. With stemming processing in Indonesian Language and generate fingerprint in parallel processing that can saving time processing and produce the plagiarism result on the suspected document.
Energy Technology Data Exchange (ETDEWEB)
Nishioka, K.; Nakamura, Y. [Graduate School of Energy Science, Kyoto University, Gokasho, Uji, Kyoto 611-0011 (Japan); Nishimura, S. [National Institute for Fusion Science, 322-6 Oroshi-cho, Toki, Gifu 509-5292 (Japan); Lee, H. Y. [Korea Advanced Institute of Science and Technology, Daejeon 305-701 (Korea, Republic of); Kobayashi, S.; Mizuuchi, T.; Nagasaki, K.; Okada, H.; Minami, T.; Kado, S.; Yamamoto, S.; Ohshima, S.; Konoshima, S.; Sano, F. [Institute of Advanced Energy, Kyoto University, Gokasho, Uji, Kyoto 611-0011 (Japan)
2016-03-15
A moment approach to calculate neoclassical transport in non-axisymmetric torus plasmas composed of multiple ion species is extended to include the external parallel momentum sources due to unbalanced tangential neutral beam injections (NBIs). The momentum sources that are included in the parallel momentum balance are calculated from the collision operators of background particles with fast ions. This method is applied for the clarification of the physical mechanism of the neoclassical parallel ion flows and the multi-ion species effect on them in Heliotron J NBI plasmas. It is found that parallel ion flow can be determined by the balance between the parallel viscosity and the external momentum source in the region where the external source is much larger than the thermodynamic force driven source in the collisional plasmas. This is because the friction between C{sup 6+} and D{sup +} prevents a large difference between C{sup 6+} and D{sup +} flow velocities in such plasmas. The C{sup 6+} flow velocities, which are measured by the charge exchange recombination spectroscopy system, are numerically evaluated with this method. It is shown that the experimentally measured C{sup 6+} impurity flow velocities do not contradict clearly with the neoclassical estimations, and the dependence of parallel flow velocities on the magnetic field ripples is consistent in both results.
Energy Technology Data Exchange (ETDEWEB)
Lober, R.R.; Tautges, T.J.; Vaughan, C.T.
1997-03-01
Paving is an automated mesh generation algorithm which produces all-quadrilateral elements. It can additionally generate these elements in varying sizes such that the resulting mesh adapts to a function distribution, such as an error function. While powerful, conventional paving is a very serial algorithm in its operation. Parallel paving is the extension of serial paving into parallel environments to perform the same meshing functions as conventional paving only on distributed, discretized models. This extension allows large, adaptive, parallel finite element simulations to take advantage of paving`s meshing capabilities for h-remap remeshing. A significantly modified version of the CUBIT mesh generation code has been developed to host the parallel paving algorithm and demonstrate its capabilities on both two dimensional and three dimensional surface geometries and compare the resulting parallel produced meshes to conventionally paved meshes for mesh quality and algorithm performance. Sandia`s {open_quotes}tiling{close_quotes} dynamic load balancing code has also been extended to work with the paving algorithm to retain parallel efficiency as subdomains undergo iterative mesh refinement.
Aspects of parallel processing and control engineering
McKittrick, Brendan J
1991-01-01
The concept of parallel processing is not a new one, but the application of it to control engineering tasks is a relatively recent development, made possible by contemporary hardware and software innovation. It has long been accepted that, if properly orchestrated several processors/CPUs when combined can form a powerful processing entity. What prevented this from being implemented in commercial systems was the adequacy of the microprocessor for most tasks and hence the expense of a multi-pro...
Program For Parallel Discrete-Event Simulation
Beckman, Brian C.; Blume, Leo R.; Geiselman, John S.; Presley, Matthew T.; Wedel, John J., Jr.; Bellenot, Steven F.; Diloreto, Michael; Hontalas, Philip J.; Reiher, Peter L.; Weiland, Frederick P.
1991-01-01
User does not have to add any special logic to aid in synchronization. Time Warp Operating System (TWOS) computer program is special-purpose operating system designed to support parallel discrete-event simulation. Complete implementation of Time Warp mechanism. Supports only simulations and other computations designed for virtual time. Time Warp Simulator (TWSIM) subdirectory contains sequential simulation engine interface-compatible with TWOS. TWOS and TWSIM written in, and support simulations in, C programming language.
Parallel Hybrid Vehicle Optimal Storage System
Bloomfield, Aaron P.
2009-01-01
A paper reports the results of a Hybrid Diesel Vehicle Project focused on a parallel hybrid configuration suitable for diesel-powered, medium-sized, commercial vehicles commonly used for parcel delivery and shuttle buses, as the missions of these types of vehicles require frequent stops. During these stops, electric hybridization can effectively recover the vehicle's kinetic energy during the deceleration, store it onboard, and then use that energy to assist in the subsequent acceleration.
Asynchronous Parallelization of a CFD Solver
Abdi, Daniel S.; Bitsuamlak, Girma T.
2015-01-01
The article of record as published may be found at http://dx.doi.org/10.1155/2015/295393 A Navier-Stokes equations solver is parallelized to run on a cluster of computers using the domain decomposition method. Two approaches of communication and computation are investigated, namely, synchronous and asynchronous methods. Asynchronous communication between subdomains is not commonly used inCFDcodes; however, it has a potential to alleviate scaling bottlenecks incurred due to process...
A position sensitive parallel plate avalanche counter
International Nuclear Information System (INIS)
Lombardi, M.; Tan Jilian; Potenza, R.; D'amico, V.
1986-01-01
A position sensitive parallel plate avalanche counter with a distributed constant delay-line-cathode (PSAC) is described. The strips formed on the printed board were served as the cathode and the delay line for readout of signals. The detector (PSAC) was operated in isobutane gas at the pressure range from 10 to 20 torr. The position resolution is better than 1 mm and the time resolution is about 350 ps, for 252 Cf fission-spectrum source
Parallel fuzzy connected image segmentation on GPU.
Zhuge, Ying; Cao, Yong; Udupa, Jayaram K; Miller, Robert W
2011-07-01
Image segmentation techniques using fuzzy connectedness (FC) principles have shown their effectiveness in segmenting a variety of objects in several large applications. However, one challenge in these algorithms has been their excessive computational requirements when processing large image datasets. Nowadays, commodity graphics hardware provides a highly parallel computing environment. In this paper, the authors present a parallel fuzzy connected image segmentation algorithm implementation on NVIDIA's compute unified device Architecture (CUDA) platform for segmenting medical image data sets. In the FC algorithm, there are two major computational tasks: (i) computing the fuzzy affinity relations and (ii) computing the fuzzy connectedness relations. These two tasks are implemented as CUDA kernels and executed on GPU. A dramatic improvement in speed for both tasks is achieved as a result. Our experiments based on three data sets of small, medium, and large data size demonstrate the efficiency of the parallel algorithm, which achieves a speed-up factor of 24.4x, 18.1x, and 10.3x, correspondingly, for the three data sets on the NVIDIA Tesla C1060 over the implementation of the algorithm on CPU, and takes 0.25, 0.72, and 15.04 s, correspondingly, for the three data sets. The authors developed a parallel algorithm of the widely used fuzzy connected image segmentation method on the NVIDIA GPUs, which are far more cost- and speed-effective than both cluster of workstations and multiprocessing systems. A near-interactive speed of segmentation has been achieved, even for the large data set.
Base drive for paralleled inverter systems
Nagano, S. (Inventor)
1980-01-01
In a paralleled inverter system, a positive feedback current derived from the total current from all of the modules of the inverter system is applied to the base drive of each of the power transistors of all modules, thereby to provide all modules protection against open or short circuit faults occurring in any of the modules, and force equal current sharing among the modules during turn on of the power transistors.