WorldWideScience

Sample records for parallel functional testing

  1. Methods and models for the construction of weakly parallel tests

    NARCIS (Netherlands)

    Adema, J.J.; Adema, Jos J.

    1990-01-01

    Methods are proposed for the construction of weakly parallel tests, that is, tests with the same test information function. A mathematical programming model for constructing tests with a prespecified test information function and a heuristic for assigning items to tests such that their information

  2. Methods and models for the construction of weakly parallel tests

    NARCIS (Netherlands)

    Adema, J.J.; Adema, Jos J.

    1992-01-01

    Several methods are proposed for the construction of weakly parallel tests [i.e., tests with the same test information function (TIF)]. A mathematical programming model that constructs tests containing a prespecified TIF and a heuristic that assigns items to tests with information functions that are

  3. Creating IRT-Based Parallel Test Forms Using the Genetic Algorithm Method

    Science.gov (United States)

    Sun, Koun-Tem; Chen, Yu-Jen; Tsai, Shu-Yen; Cheng, Chien-Fen

    2008-01-01

    In educational measurement, the construction of parallel test forms is often a combinatorial optimization problem that involves the time-consuming selection of items to construct tests having approximately the same test information functions (TIFs) and constraints. This article proposes a novel method, genetic algorithm (GA), to construct parallel…

  4. Streaming for Functional Data-Parallel Languages

    DEFF Research Database (Denmark)

    Madsen, Frederik Meisner

    In this thesis, we investigate streaming as a general solution to the space inefficiency commonly found in functional data-parallel programming languages. The data-parallel paradigm maps well to parallel SIMD-style hardware. However, the traditional fully materializing execution strategy...... by extending two existing data-parallel languages: NESL and Accelerate. In the extensions we map bulk operations to data-parallel streams that can evaluate fully sequentially, fully in parallel, or anything in between. By a dataflow, piecewise parallel execution strategy, the runtime system can adjust to any target...... flattening necessitates all sub-computations to materialize at the same time. For example, naive n by n matrix multiplication requires n^3 space in NESL because the algorithm contains n^3 independent scalar multiplications. For large values of n, this is completely unacceptable. We address the problem...
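
    To make the space argument concrete, the sketch below (plain Python generators, not NESL or Accelerate) contrasts a fully materializing evaluation of the n^3 elementary products behind naive matrix multiplication with a streaming one that folds each product into the result as it is produced.

```python
def matmul_materializing(a, b, n):
    """Builds all n^3 elementary products up front: O(n^3) extra space."""
    prods = [(i, j, a[i][k] * b[k][j])
             for i in range(n) for j in range(n) for k in range(n)]
    c = [[0.0] * n for _ in range(n)]
    for i, j, p in prods:
        c[i][j] += p
    return c

def matmul_streaming(a, b, n):
    """Generates the same products lazily: only O(n^2) space for the result."""
    prods = ((i, j, a[i][k] * b[k][j])
             for i in range(n) for j in range(n) for k in range(n))
    c = [[0.0] * n for _ in range(n)]
    for i, j, p in prods:
        c[i][j] += p
    return c

a = b = [[1.0, 2.0], [3.0, 4.0]]
assert matmul_streaming(a, b, 2) == matmul_materializing(a, b, 2)
```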

  5. Test generation for digital circuits using parallel processing

    Science.gov (United States)

    Hartmann, Carlos R.; Ali, Akhtar-Uz-Zaman M.

    1990-12-01

    The problem of test generation for digital logic circuits is an NP-Hard problem. Recently, the availability of low cost, high performance parallel machines has spurred interest in developing fast parallel algorithms for computer-aided design and test. This report describes a method of applying a 15-valued logic system for digital logic circuit test vector generation in a parallel programming environment. A concept called fault site testing allows for test generation, in parallel, that targets more than one fault at a given location. The multi-valued logic system allows results obtained by distinct processors and/or processes to be merged by means of simple set intersections. A machine-independent description is given for the proposed algorithm.
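
    The set-intersection merging step can be illustrated with a small, hypothetical sketch: the fault names, candidate vectors, and the `candidate_vectors` helper below are invented for illustration and stand in for the 15-valued deductions described in the report.

```python
# Toy sketch of the merging idea only (hypothetical data, not the 15-valued
# algebra from the report): each worker derives the set of input vectors it
# considers able to detect its target fault, and the partial results are
# combined by plain set intersection.
from multiprocessing import Pool

def candidate_vectors(fault):
    # Placeholder lookup; the real algorithm deduces this from the circuit.
    table = {
        "f1": {(0, 1, 1), (1, 1, 1), (0, 0, 1)},
        "f2": {(0, 1, 1), (1, 1, 1)},
        "f3": {(1, 1, 1), (1, 0, 1), (0, 1, 1)},
    }
    return table[fault]

if __name__ == "__main__":
    faults = ["f1", "f2", "f3"]
    with Pool(processes=3) as pool:
        partial = pool.map(candidate_vectors, faults)
    # Vectors on which every fault-site computation agrees.
    common = set.intersection(*partial)
    print(common)   # {(0, 1, 1), (1, 1, 1)} (set order may vary)
```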

  6. Massively parallel sparse matrix function calculations with NTPoly

    Science.gov (United States)

    Dawson, William; Nakajima, Takahito

    2018-04-01

    We present NTPoly, a massively parallel library for computing the functions of sparse, symmetric matrices. The theory of matrix functions is a well developed framework with a wide range of applications including differential equations, graph theory, and electronic structure calculations. One particularly important application area is diagonalization free methods in quantum chemistry. When the input and output of the matrix function are sparse, methods based on polynomial expansions can be used to compute matrix functions in linear time. We present a library based on these methods that can compute a variety of matrix functions. Distributed memory parallelization is based on a communication avoiding sparse matrix multiplication algorithm. OpenMP task parallelization is utilized to implement hybrid parallelization. We describe NTPoly's interface and show how it can be integrated with programs written in many different programming languages. We demonstrate the merits of NTPoly by performing large scale calculations on the K computer.
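
    As a rough illustration of the polynomial-expansion idea (a serial SciPy sketch, not the NTPoly API or its communication-avoiding distributed multiply), the function below approximates exp(A) of a sparse symmetric matrix using only sparse matrix-matrix products.

```python
# Minimal serial sketch: approximate f(A) = exp(A) by a truncated Taylor
# series evaluated with Horner's rule, so the only kernel required is
# sparse matrix-matrix multiplication.
import scipy.sparse as sp

def expm_polynomial(A, order=20):
    n = A.shape[0]
    I = sp.identity(n, format="csr")
    result = I
    # Horner evaluation: exp(A) ~ I + A/1 (I + A/2 (I + ... ))
    for k in range(order, 0, -1):
        result = I + (A @ result) / k
    return result

A = sp.random(200, 200, density=0.02, format="csr")
A = (A + A.T) / 2                      # symmetrize the test matrix
approx = expm_polynomial(A)
```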

  7. The Effects of Stress and Executive Functions on Decision Making in an Executive Parallel Task

    OpenAIRE

    McGuigan, Brian

    2016-01-01

    The aim of this study was to investigate the effects of acute stress on parallel task performance with the Game of Dice Task (GDT) to measure decision making and the Stroop test.  Two previous studies have found that the combination of stress and a parallel task with the GDT and an executive functions task preserved performance on the GDT for a stress group compared to a control group.  The purpose of this study was to create and use a new parallel task with the GDT and the Stroop test to elu...

  8. Parallel transaction processing in functional languages, towards practical functional databases

    NARCIS (Netherlands)

    Wevers, L.; Huisman, Marieke; de Keijzer, Ander

    2013-01-01

    This paper shows how functional languages can be adapted for transaction processing, and discusses the implementation of a parallel runtime system for such functional transaction processing languages. We extend functional languages with current state variables and result state variables to allow the

  9. Parallel-Processing Test Bed For Simulation Software

    Science.gov (United States)

    Blech, Richard; Cole, Gary; Townsend, Scott

    1996-01-01

    Second-generation Hypercluster computing system is multiprocessor test bed for research on parallel algorithms for simulation in fluid dynamics, electromagnetics, chemistry, and other fields with large computational requirements but relatively low input/output requirements. Built from standard, off-the-shelf hardware readily upgraded as improved technology becomes available. System used for experiments with such parallel-processing concepts as message-passing algorithms, debugging software tools, and computational steering. First-generation Hypercluster system described in "Hypercluster Parallel Processor" (LEW-15283).

  10. Parallel Execution of Functional Mock-up Units in Buildings Modeling

    Energy Technology Data Exchange (ETDEWEB)

    Ozmen, Ozgur [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Nutaro, James J. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); New, Joshua Ryan [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)

    2016-06-30

    The Functional Mock-up Interface (FMI) defines a standardized interface to be used in computer simulations to develop complex cyber-physical systems. FMI implementation by a software modeling tool enables the creation of a simulation model that can be interconnected, or the creation of a software library called a Functional Mock-up Unit (FMU). This report describes an FMU wrapper implementation that imports FMUs into a C++ environment and uses an Euler solver that executes FMUs in parallel using Open Multi-Processing (OpenMP). The purpose of this report is to elucidate the runtime performance of the solver when a multi-component system is imported as a single FMU (for the whole system) or as multiple FMUs (for different groups of components as sub-systems). This performance comparison is conducted using two test cases: (1) a simple, multi-tank problem; and (2) a more realistic use case based on the Modelica Buildings Library. In both test cases, the performance gains are promising when each FMU consists of a large number of states and state events that are wrapped in a single FMU. Load balancing is demonstrated to be a critical factor in speeding up parallel execution of multiple FMUs.
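
    The scheduling idea can be sketched as follows; this is a hedged Python illustration with toy callables standing in for FMUs, not the report's C++/OpenMP solver, and it ignores state events and load balancing.

```python
# One forward-Euler step evaluates every component's derivative in
# parallel, then advances all states together.
from concurrent.futures import ThreadPoolExecutor

def euler_step(components, states, t, dt, pool):
    # components[i](t, states) -> time derivative of component i's state
    derivs = list(pool.map(lambda f: f(t, states), components))
    return [x + dt * dx for x, dx in zip(states, derivs)]

def simulate(components, x0, t_end, dt):
    states, t = list(x0), 0.0
    with ThreadPoolExecutor() as pool:
        while t < t_end:
            states = euler_step(components, states, t, dt, pool)
            t += dt
    return states

# Two toy "tank" components exchanging fluid with each other.
tanks = [lambda t, x: 0.5 * (x[1] - x[0]),
         lambda t, x: 0.5 * (x[0] - x[1])]
print(simulate(tanks, [1.0, 0.0], 5.0, 0.01))
```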

  11. Density functional theory and parallel processing

    International Nuclear Information System (INIS)

    Ward, R.C.; Geist, G.A.; Butler, W.H.

    1987-01-01

    The authors demonstrate a method for obtaining the ground state energies and charge densities of a system of atoms described within density functional theory using simulated annealing on a parallel computer

  12. Rocket measurement of auroral partial parallel distribution functions

    Science.gov (United States)

    Lin, C.-A.

    1980-01-01

    The auroral partial parallel distribution functions are obtained by using the observed energy spectra of electrons. The experiment package was launched by a Nike-Tomahawk rocket from Poker Flat, Alaska over a bright auroral band and covered an altitude range of up to 180 km. Calculated partial distribution functions are presented with emphasis on their slopes. The implications of the slopes are discussed. It should be pointed out that the slope of the partial parallel distribution function obtained from one energy spectrum will be changed by superposing another energy spectrum on it.

  13. A two-level parallel direct search implementation for arbitrarily sized objective functions

    Energy Technology Data Exchange (ETDEWEB)

    Hutchinson, S.A.; Shadid, N.; Moffat, H.K. [Sandia National Labs., Albuquerque, NM (United States)] [and others]

    1994-12-31

    In the past, many optimization schemes for massively parallel computers have attempted to achieve parallel efficiency using one of two methods. In the case of large and expensive objective function calculations, the optimization itself may be run in serial and the objective function calculations parallelized. In contrast, if the objective function calculations are relatively inexpensive and can be performed on a single processor, then the actual optimization routine itself may be parallelized. In this paper, a scheme based upon the Parallel Direct Search (PDS) technique is presented which allows the objective function calculations to be done on an arbitrarily large number (p₂) of processors. If p, the number of processors available, is greater than or equal to 2p₂, then the optimization may be parallelized as well. This allows for efficient use of computational resources since the objective function calculations can be performed on the number of processors that allow for peak parallel efficiency and then further speedup may be achieved by parallelizing the optimization. Results are presented for an optimization problem which involves the solution of a PDE using a finite-element algorithm as part of the objective function calculation. The optimum number of processors for the finite-element calculations is less than p/2. Thus, the PDS method is also parallelized. Performance comparisons are given for a nCUBE 2 implementation.
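
    A back-of-the-envelope sketch of the two-level split (illustrative bookkeeping only; the paper's implementation targets the nCUBE 2) is shown below.

```python
# p processors are divided into groups of p2; each group evaluates one
# objective function, and the search itself fans out over the groups.
def two_level_layout(p, p2):
    if p < 2 * p2:
        # Not enough processors to parallelize the search as well:
        # run the optimizer serially on top of one p2-wide group.
        return {"search_width": 1, "group_size": p2, "groups": 1}
    groups = p // p2
    return {"search_width": groups, "group_size": p2, "groups": groups}

print(two_level_layout(p=64, p2=12))   # 5 concurrent evaluations of 12 processors each
print(two_level_layout(p=16, p2=12))   # falls back to a serial search
```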

  14. When do evolutionary algorithms optimize separable functions in parallel?

    DEFF Research Database (Denmark)

    Doerr, Benjamin; Sudholt, Dirk; Witt, Carsten

    2013-01-01

    is that evolutionary algorithms make progress on all subfunctions in parallel, so that optimizing a separable function does not take much longer than optimizing the hardest subfunction-subfunctions are optimized "in parallel." We show that this is only partially true, already for the simple (1+1) evolutionary...... algorithm ((1+1) EA). For separable functions composed of k Boolean functions indeed the optimization time is the maximum optimization time of these functions times a small O(log k) overhead. More generally, for sums of weighted subfunctions that each attain non-negative integer values less than r = o(log1...

  15. Parallel keyed hash function construction based on chaotic maps

    International Nuclear Information System (INIS)

    Xiao Di; Liao Xiaofeng; Deng Shaojiang

    2008-01-01

    Recently, a variety of chaos-based hash functions have been proposed. Nevertheless, none of them works efficiently in parallel computing environment. In this Letter, an algorithm for parallel keyed hash function construction is proposed, whose structure can ensure the uniform sensitivity of hash value to the message. By means of the mechanism of both changeable-parameter and self-synchronization, the keystream establishes a close relation with the algorithm key, the content and the order of each message block. The entire message is modulated into the chaotic iteration orbit, and the coarse-graining trajectory is extracted as the hash value. Theoretical analysis and computer simulation indicate that the proposed algorithm can satisfy the performance requirements of hash function. It is simple, efficient, practicable, and reliable. These properties make it a good choice for hashing on a parallel computing platform
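
    For orientation only, the toy sketch below mimics the overall structure of such schemes, i.e. independent per-block chaotic iteration followed by an order-dependent combination; it is not the authors' algorithm and offers no security guarantees.

```python
# Toy illustration of the structure: each block is digested independently
# with a logistic-map iteration (workers run in parallel), then the partial
# digests are combined in a block-order-dependent way.
from multiprocessing import Pool

def logistic_block_digest(args):
    block, key = args
    x, r = key, 3.99                        # logistic-map state and parameter
    for byte in block:
        x = r * x * (1 - x)
        # let each message byte perturb the iteration (loosely mimicking
        # the "changeable-parameter" idea)
        x = (x + byte / 255.0) % 1.0
    return int(x * 2**32)

def parallel_hash(message: bytes, key: float, block_size=16):
    blocks = [message[i:i + block_size] for i in range(0, len(message), block_size)]
    with Pool() as pool:
        partials = pool.map(logistic_block_digest, [(b, key) for b in blocks])
    digest = 0
    for i, h in enumerate(partials):        # order-dependent combination
        digest ^= (h * (i + 1)) & 0xFFFFFFFF
    return digest

if __name__ == "__main__":
    print(hex(parallel_hash(b"parallel keyed hash demo", key=0.37)))
```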

  16. High temporal resolution functional MRI using parallel echo volumar imaging

    International Nuclear Information System (INIS)

    Rabrait, C.; Ciuciu, P.; Ribes, A.; Poupon, C.; Dehaine-Lambertz, G.; LeBihan, D.; Lethimonnier, F.; Le Roux, P.

    2008-01-01

    Purpose: To combine parallel imaging with 3D single-shot acquisition (echo volumar imaging, EVI) in order to acquire high temporal resolution volumar functional MRI (fMRI) data. Materials and Methods: An improved EVI sequence was associated with parallel acquisition and field of view reduction in order to acquire a large brain volume in 200 msec. Temporal stability and functional sensitivity were increased through optimization of all imaging parameters and Tikhonov regularization of parallel reconstruction. Two human volunteers were scanned with parallel EVI in a 1.5 T whole-body MR system, while submitted to a slow event-related auditory paradigm. Results: Thanks to parallel acquisition, the EVI volumes display a low level of geometric distortions and signal losses. After removal of low-frequency drifts and physiological artifacts, activations were detected in the temporal lobes of both volunteers and voxel-wise hemodynamic response functions (HRF) could be computed. On these HRFs, different habituation behaviors in response to sentence repetition could be identified. Conclusion: This work demonstrates the feasibility of high temporal resolution 3D fMRI with parallel EVI. Combined with advanced estimation tools, this acquisition method should prove useful to measure neural activity timing differences or study the nonlinearities and non-stationarities of the BOLD response. (authors)

  17. Unpacking the cognitive map: the parallel map theory of hippocampal function.

    Science.gov (United States)

    Jacobs, Lucia F; Schenk, Françoise

    2003-04-01

    In the parallel map theory, the hippocampus encodes space with 2 mapping systems. The bearing map is constructed primarily in the dentate gyrus from directional cues such as stimulus gradients. The sketch map is constructed within the hippocampus proper from positional cues. The integrated map emerges when data from the bearing and sketch maps are combined. Because the component maps work in parallel, the impairment of one can reveal residual learning by the other. Such parallel function may explain paradoxes of spatial learning, such as learning after partial hippocampal lesions, taxonomic and sex differences in spatial learning, and the function of hippocampal neurogenesis. By integrating evidence from physiology to phylogeny, the parallel map theory offers a unified explanation for hippocampal function.

  18. Simplifying the parallelization of scientific codes by a function-centric approach in Python

    International Nuclear Information System (INIS)

    Nilsen, Jon K; Cai Xing; Langtangen, Hans Petter; Hoeyland, Bjoern

    2010-01-01

    The purpose of this paper is to show how existing scientific software can be parallelized using a separate thin layer of Python code where all parallelization-specific tasks are implemented. We provide specific examples of such a Python code layer, which can act as templates for parallelizing a wide set of serial scientific codes. The use of Python for parallelization is motivated by the fact that the language is well suited for reusing existing serial codes programmed in other languages. The extreme flexibility of Python with regard to handling functions makes it very easy to wrap up decomposed computational tasks of a serial scientific application as Python functions. Many parallelization-specific components can be implemented as generic Python functions, which may take as input those wrapped functions that perform concrete computational tasks. The overall programming effort needed by this parallelization approach is limited, and the resulting parallel Python scripts have a compact and clean structure. The usefulness of the parallelization approach is exemplified by three different classes of application in natural and social sciences.
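
    A minimal sketch of the function-centric idea (generic Python, not the authors' templates) is given below: the serial kernel stays an ordinary function, while a reusable helper owns all parallelization-specific logic.

```python
# The serial science code is wrapped as an ordinary Python function, and a
# generic helper handles the parallel bookkeeping.
from multiprocessing import Pool

def serial_task(subdomain):
    # Stand-in for a wrapped serial kernel (possibly calling C/Fortran).
    return sum(x * x for x in subdomain)

def parallel_map_reduce(task, subdomains, reduce_fn=sum, workers=4):
    # All parallelization-specific logic lives in this reusable function;
    # the scientific code itself stays untouched.
    with Pool(processes=workers) as pool:
        partial_results = pool.map(task, subdomains)
    return reduce_fn(partial_results)

if __name__ == "__main__":
    data = list(range(1_000_000))
    chunks = [data[i::4] for i in range(4)]      # simple domain decomposition
    print(parallel_map_reduce(serial_task, chunks))
```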

  19. An approach to multicore parallelism using functional programming: A case study based on Presburger Arithmetic

    DEFF Research Database (Denmark)

    Dung, Phan Anh; Hansen, Michael Reichhardt

    2015-01-01

    In this paper we investigate multicore parallelism in the context of functional programming by means of two quantifier-elimination procedures for Presburger Arithmetic: one is based on Cooper’s algorithm and the other is based on the Omega Test. We first develop correct-by-construction prototype...... platform executing on an 8-core machine. A speedup of approximately 4 was obtained for Cooper’s algorithm and a speedup of approximately 6 was obtained for the exact-shadow part of the Omega Test. The considered procedures are complex, memory-intense algorithms on huge formula trees and the case study...... reveals more generally applicable techniques and guidelines for deriving parallel algorithms from sequential ones in the context of data-intensive tree algorithms. The obtained insights should apply to any strict and impure functional programming language. Furthermore, the results obtained for the exact...

  20. Bessel functions: parallel display and processing.

    Science.gov (United States)

    Lohmann, A W; Ojeda-Castañeda, J; Serrano-Heredia, A

    1994-01-01

    We present an optical setup that converts planar binary curves into two-dimensional amplitude distributions, which are proportional, along one axis, to the Bessel function of order n, whereas along the other axis the order n increases. This Bessel displayer can be used for parallel Bessel transformation of a signal. Experimental verifications are included.

  1. Parallel Alterations of Functional Connectivity during Execution and Imagination after Motor Imagery Learning

    Science.gov (United States)

    Zhang, Rushao; Hui, Mingqi; Long, Zhiying; Zhao, Xiaojie; Yao, Li

    2012-01-01

    Background Neural substrates underlying motor learning have been widely investigated with neuroimaging technologies. Investigations have illustrated the critical regions of motor learning and further revealed parallel alterations of functional activation during imagination and execution after learning. However, little is known about the functional connectivity associated with motor learning, especially motor imagery learning, although benefits from functional connectivity analysis attract more attention to the related explorations. We explored whether motor imagery (MI) and motor execution (ME) shared parallel alterations of functional connectivity after MI learning. Methodology/Principal Findings Graph theory analysis, which is widely used in functional connectivity exploration, was performed on the functional magnetic resonance imaging (fMRI) data of MI and ME tasks before and after 14 days of consecutive MI learning. The control group had no learning. Two measures, connectivity degree and interregional connectivity, were calculated and further assessed at a statistical level. Two interesting results were obtained: (1) The connectivity degree of the right posterior parietal lobe decreased in both MI and ME tasks after MI learning in the experimental group; (2) The parallel alterations of interregional connectivity related to the right posterior parietal lobe occurred in the supplementary motor area for both tasks. Conclusions/Significance These computational results may provide the following insights: (1) The establishment of motor schema through MI learning may induce the significant decrease of connectivity degree in the posterior parietal lobe; (2) The decreased interregional connectivity between the supplementary motor area and the right posterior parietal lobe in post-test implicates the dissociation between motor learning and task performing. These findings and explanations further revealed the neural substrates underpinning MI learning and supported that
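
    For readers unfamiliar with the connectivity-degree measure used above, the sketch below computes it from synthetic data; the threshold and preprocessing are assumptions and differ from the study's pipeline.

```python
# Correlate regional fMRI time series, binarize at a threshold, and count
# each region's connections (its connectivity degree).
import numpy as np

rng = np.random.default_rng(0)
timeseries = rng.standard_normal((200, 90))   # 200 volumes x 90 regions

corr = np.corrcoef(timeseries, rowvar=False)  # 90 x 90 correlation matrix
np.fill_diagonal(corr, 0.0)
adjacency = np.abs(corr) > 0.3                # assumed binarization threshold
degree = adjacency.sum(axis=0)                # connectivity degree per region
print(degree[:10])
```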

  2. Design and test of a parallel kinematic solar tracker

    Directory of Open Access Journals (Sweden)

    Stefano Mauro

    2015-12-01

    This article proposes a parallel kinematic solar tracker designed for driving high-concentration photovoltaic modules. This kind of module produces energy only if it is oriented with misalignment errors lower than 0.4°. Generally, a parallel kinematic structure provides high stiffness and precision in positioning, so these features make this mechanism fit for the purpose. This article describes the work carried out to design a suitable parallel machine: an already existing architecture was chosen, and the geometrical parameters of the system were defined in order to obtain a workspace consistent with the requirements for sun tracking. Besides, an analysis of the singularities of the system was carried out. The method used for the singularity analysis revealed the existence of singularities which had not been previously identified for this kind of mechanism. From the analysis of the mechanism developed, very low nominal energy consumption and elevated stiffness were found. A small-scale prototype of the system was constructed for the first time. A control algorithm was also developed, implemented, and tested. Finally, experimental tests were carried out in order to verify the capability of the system of ensuring precise pointing. The tests have been considered passed as the system showed an orientation error lower than 0.4° during sun tracking.

  3. Functional networks in parallel with cortical development associate with executive functions in children.

    Science.gov (United States)

    Zhong, Jidan; Rifkin-Graboi, Anne; Ta, Anh Tuan; Yap, Kar Lai; Chuang, Kai-Hsiang; Meaney, Michael J; Qiu, Anqi

    2014-07-01

    Children begin performing similarly to adults on tasks requiring executive functions in late childhood, a transition that is probably due to neuroanatomical fine-tuning processes, including myelination and synaptic pruning. In parallel to such structural changes in neuroanatomical organization, development of functional organization may also be associated with cognitive behaviors in children. We examined 6- to 10-year-old children's cortical thickness, functional organization, and cognitive performance. We used structural magnetic resonance imaging (MRI) to identify areas with cortical thinning, resting-state fMRI to identify functional organization in parallel to cortical development, and working memory/response inhibition tasks to assess executive functioning. We found that neuroanatomical changes in the form of cortical thinning spread over bilateral frontal, parietal, and occipital regions. These regions were engaged in 3 functional networks: sensorimotor and auditory, executive control, and default mode network. Furthermore, we found that working memory and response inhibition only associated with regional functional connectivity, but not topological organization (i.e., local and global efficiency of information transfer) of these functional networks. Interestingly, functional connections associated with "bottom-up" as opposed to "top-down" processing were more clearly related to children's performance on working memory and response inhibition, implying an important role for brain systems involved in late childhood. © The Author 2013. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  4. Convergent Evolution of Hemoglobin Function in High-Altitude Andean Waterfowl Involves Limited Parallelism at the Molecular Sequence Level.

    Directory of Open Access Journals (Sweden)

    Chandrasekhar Natarajan

    2015-12-01

    A fundamental question in evolutionary genetics concerns the extent to which adaptive phenotypic convergence is attributable to convergent or parallel changes at the molecular sequence level. Here we report a comparative analysis of hemoglobin (Hb) function in eight phylogenetically replicated pairs of high- and low-altitude waterfowl taxa to test for convergence in the oxygenation properties of Hb, and to assess the extent to which convergence in biochemical phenotype is attributable to repeated amino acid replacements. Functional experiments on native Hb variants and protein engineering experiments based on site-directed mutagenesis revealed the phenotypic effects of specific amino acid replacements that were responsible for convergent increases in Hb-O2 affinity in multiple high-altitude taxa. In six of the eight taxon pairs, high-altitude taxa evolved derived increases in Hb-O2 affinity that were caused by a combination of unique replacements, parallel replacements (involving identical-by-state variants with independent mutational origins in different lineages), and collateral replacements (involving shared, identical-by-descent variants derived via introgressive hybridization). In genome scans of nucleotide differentiation involving high- and low-altitude populations of three separate species, function-altering amino acid polymorphisms in the globin genes emerged as highly significant outliers, providing independent evidence for adaptive divergence in Hb function. The experimental results demonstrate that convergent changes in protein function can occur through multiple historical paths, and can involve multiple possible mutations. Most cases of convergence in Hb function did not involve parallel substitutions and most parallel substitutions did not affect Hb-O2 affinity, indicating that the repeatability of phenotypic evolution does not require parallelism at the molecular level.

  5. A preclinical cognitive test battery to parallel the National Institute of Health Toolbox in humans: bridging the translational gap.

    Science.gov (United States)

    Snigdha, Shikha; Milgram, Norton W; Willis, Sherry L; Albert, Marylin; Weintraub, S; Fortin, Norbert J; Cotman, Carl W

    2013-07-01

    A major goal of animal research is to identify interventions that can promote successful aging and delay or reverse age-related cognitive decline in humans. Recent advances in standardizing cognitive assessment tools for humans have the potential to bring preclinical work closer to human research in aging and Alzheimer's disease. The National Institute of Health (NIH) has led an initiative to develop a comprehensive Toolbox for Neurologic Behavioral Function (NIH Toolbox) to evaluate cognitive, motor, sensory and emotional function for use in epidemiologic and clinical studies spanning 3 to 85 years of age. This paper aims to analyze the strengths and limitations of animal behavioral tests that can be used to parallel those in the NIH Toolbox. We conclude that there are several paradigms available to define a preclinical battery that parallels the NIH Toolbox. We also suggest areas in which new tests may benefit the development of a comprehensive preclinical test battery for assessment of cognitive function in animal models of aging and Alzheimer's disease. Copyright © 2013 Elsevier Inc. All rights reserved.

  6. Improving the security of a parallel keyed hash function based on chaotic maps

    Energy Technology Data Exchange (ETDEWEB)

    Xiao Di, E-mail: xiaodi_cqu@hotmail.co [College of Computer Science and Engineering, Chongqing University, Chongqing 400044 (China); Liao Xiaofeng [College of Computer Science and Engineering, Chongqing University, Chongqing 400044 (China); Wang Yong [College of Computer Science and Engineering, Chongqing University, Chongqing 400044 (China)] [College of Economy and Management, Chongqing University of Posts and Telecommunications, Chongqing 400065 (China)]

    2009-11-23

    In this Letter, we analyze the cause of vulnerability of the original parallel keyed hash function based on chaotic maps in detail, and then propose the corresponding enhancement measures. Theoretical analysis and computer simulation indicate that the modified hash function is more secure than the original one. At the same time, it can keep the parallel merit and satisfy the other performance requirements of hash function.

  7. Improving the security of a parallel keyed hash function based on chaotic maps

    International Nuclear Information System (INIS)

    Xiao Di; Liao Xiaofeng; Wang Yong

    2009-01-01

    In this Letter, we analyze the cause of vulnerability of the original parallel keyed hash function based on chaotic maps in detail, and then propose the corresponding enhancement measures. Theoretical analysis and computer simulation indicate that the modified hash function is more secure than the original one. At the same time, it can keep the parallel merit and satisfy the other performance requirements of hash function.

  8. Searching for globally optimal functional forms for interatomic potentials using genetic programming with parallel tempering.

    Science.gov (United States)

    Slepoy, A; Peters, M D; Thompson, A P

    2007-11-30

    Molecular dynamics and other molecular simulation methods rely on a potential energy function, based only on the relative coordinates of the atomic nuclei. Such a function, called a force field, approximately represents the electronic structure interactions of a condensed matter system. Developing such approximate functions and fitting their parameters remains an arduous, time-consuming process, relying on expert physical intuition. To address this problem, a functional programming methodology was developed that may enable automated discovery of entirely new force-field functional forms, while simultaneously fitting parameter values. The method uses a combination of genetic programming, Metropolis Monte Carlo importance sampling and parallel tempering, to efficiently search a large space of candidate functional forms and parameters. The methodology was tested using a nontrivial problem with a well-defined globally optimal solution: a small set of atomic configurations was generated and the energy of each configuration was calculated using the Lennard-Jones pair potential. Starting with a population of random functions, our fully automated, massively parallel implementation of the method reproducibly discovered the original Lennard-Jones pair potential by searching for several hours on 100 processors, sampling only a minuscule portion of the total search space. This result indicates that, with further improvement, the method may be suitable for unsupervised development of more accurate force fields with completely new functional forms. Copyright (c) 2007 Wiley Periodicals, Inc.
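
    The parallel-tempering ingredient can be sketched in isolation; the toy below optimizes a one-dimensional stand-in for the fitting error rather than evolving functional forms by genetic programming, and all parameters are invented for illustration.

```python
# Replicas run Metropolis moves at different temperatures and periodically
# attempt to swap configurations between neighbouring temperatures.
import math
import random

def energy(x):
    # Hypothetical rugged objective standing in for the force-field fit error.
    return (x * x - 1.0) ** 2 + 0.3 * math.sin(8.0 * x)

def parallel_tempering(betas, steps=5000, step_size=0.3):
    xs = [random.uniform(-2, 2) for _ in betas]
    for step in range(steps):
        # Ordinary Metropolis move within each replica.
        for i, beta in enumerate(betas):
            trial = xs[i] + random.uniform(-step_size, step_size)
            d_e = energy(trial) - energy(xs[i])
            if random.random() < math.exp(min(0.0, -beta * d_e)):
                xs[i] = trial
        # Occasionally try to exchange neighbouring replicas.
        if step % 20 == 0:
            for i in range(len(betas) - 1):
                d = (betas[i] - betas[i + 1]) * (energy(xs[i]) - energy(xs[i + 1]))
                if random.random() < math.exp(min(0.0, d)):
                    xs[i], xs[i + 1] = xs[i + 1], xs[i]
    return min(xs, key=energy)

print(parallel_tempering(betas=[4.0, 2.0, 1.0, 0.5]))
```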

  9. Parallelize Automated Tests in a Build and Test Environment

    OpenAIRE

    Durairaj, Selva Ganesh

    2016-01-01

    This thesis investigates the possibilities of finding solutions, in order to reduce the total time spent for testing and waiting times for running multiple automated test cases in a test framework. The “Automated Test Framework”, developed by Axis Communications AB, is used to write the functional tests to test both hardware and software of a resource. The functional tests that test the software are considered in this thesis work. In the current infrastructure, tests are executed sequentially...

  10. Parallel sites implicate functional convergence of the hearing gene prestin among echolocating mammals.

    Science.gov (United States)

    Liu, Zhen; Qi, Fei-Yan; Zhou, Xin; Ren, Hai-Qing; Shi, Peng

    2014-09-01

    Echolocation is a sensory system whereby certain mammals navigate and forage using sound waves, usually in environments where visibility is limited. Curiously, echolocation has evolved independently in bats and whales, which occupy entirely different environments. Based on this phenotypic convergence, recent studies identified several echolocation-related genes with parallel sites at the protein sequence level among different echolocating mammals, and among these, prestin seems the most promising. Although previous studies analyzed the evolutionary mechanism of prestin, the functional roles of the parallel sites in the evolution of mammalian echolocation are not clear. By functional assays, we show that a key parameter of prestin function, 1/α, is increased in all echolocating mammals and that the N7T parallel substitution accounted for this functional convergence. Moreover, another parameter, V1/2, was shifted toward the depolarization direction in a toothed whale, the bottlenose dolphin (Tursiops truncatus) and a constant-frequency (CF) bat, the Stoliczka's trident bat (Aselliscus stoliczkanus). The parallel site of I384T between toothed whales and CF bats was responsible for this functional convergence. Furthermore, the two parameters (1/α and V1/2) were correlated with mammalian high-frequency hearing, suggesting that the convergent changes of the prestin function in echolocating mammals may play important roles in mammalian echolocation. To our knowledge, these findings present the functional patterns of echolocation-related genes in echolocating mammals for the first time and rigorously demonstrate adaptive parallel evolution at the protein sequence level, paving the way to insights into the molecular mechanism underlying mammalian echolocation. © The Author 2014. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  11. The Glasgow Parallel Reduction Machine: Programming Shared-memory Many-core Systems using Parallel Task Composition

    Directory of Open Access Journals (Sweden)

    Ashkan Tousimojarad

    2013-12-01

    We present the Glasgow Parallel Reduction Machine (GPRM), a novel, flexible framework for parallel task-composition based many-core programming. We allow the programmer to structure programs into task code, written as C++ classes, and communication code, written in a restricted subset of C++ with functional semantics and parallel evaluation. In this paper we discuss the GPRM, the virtual machine framework that enables the parallel task composition approach. We focus the discussion on GPIR, the functional language used as the intermediate representation of the bytecode running on the GPRM. Using examples in this language we show the flexibility and power of our task composition framework. We demonstrate the potential using an implementation of a merge sort algorithm on a 64-core Tilera processor, as well as on a conventional Intel quad-core processor and an AMD 48-core processor system. We also compare our framework with OpenMP tasks in a parallel pointer chasing algorithm running on the Tilera processor. Our results show that the GPRM programs outperform the corresponding OpenMP codes on all test platforms, and can greatly facilitate writing of parallel programs, in particular non-data parallel algorithms such as reductions.

  12. Massively Parallel Interrogation of Aptamer Sequence, Structure and Function

    Energy Technology Data Exchange (ETDEWEB)

    Fischer, N O; Tok, J B; Tarasow, T M

    2008-02-08

    Optimization of high affinity reagents is a significant bottleneck in medicine and the life sciences. The ability to synthetically create thousands of permutations of a lead high-affinity reagent and survey the properties of individual permutations in parallel could potentially relieve this bottleneck. Aptamers are single-stranded oligonucleotide affinity reagents isolated by in vitro selection processes and as a class have been shown to bind a wide variety of target molecules. Methodology/Principal Findings. High density DNA microarray technology was used to synthesize, in situ, arrays of approximately 3,900 aptamer sequence permutations in triplicate. These sequences were interrogated on-chip for their ability to bind the fluorescently-labeled cognate target, immunoglobulin E, resulting in the parallel execution of thousands of experiments. Fluorescence intensity at each array feature was well resolved and shown to be a function of the sequence present. The data demonstrated high intra- and interchip correlation between the same features as well as among the sequence triplicates within a single array. Consistent with aptamer mediated IgE binding, fluorescence intensity correlated strongly with specific aptamer sequences and the concentration of IgE applied to the array. The massively parallel sequence-function analyses provided by this approach confirmed the importance of a consensus sequence found in all 21 of the original IgE aptamer sequences and support a common stem:loop structure as being the secondary structure underlying IgE binding. The microarray application, data and results presented illustrate an efficient, high information content approach to optimizing aptamer function. It also provides a foundation from which to better understand and manipulate this important class of high affinity biomolecules.

  13. Massively parallel interrogation of aptamer sequence, structure and function.

    Directory of Open Access Journals (Sweden)

    Nicholas O Fischer

    BACKGROUND: Optimization of high affinity reagents is a significant bottleneck in medicine and the life sciences. The ability to synthetically create thousands of permutations of a lead high-affinity reagent and survey the properties of individual permutations in parallel could potentially relieve this bottleneck. Aptamers are single-stranded oligonucleotide affinity reagents isolated by in vitro selection processes and as a class have been shown to bind a wide variety of target molecules. METHODOLOGY/PRINCIPAL FINDINGS: High density DNA microarray technology was used to synthesize, in situ, arrays of approximately 3,900 aptamer sequence permutations in triplicate. These sequences were interrogated on-chip for their ability to bind the fluorescently-labeled cognate target, immunoglobulin E, resulting in the parallel execution of thousands of experiments. Fluorescence intensity at each array feature was well resolved and shown to be a function of the sequence present. The data demonstrated high intra- and inter-chip correlation between the same features as well as among the sequence triplicates within a single array. Consistent with aptamer mediated IgE binding, fluorescence intensity correlated strongly with specific aptamer sequences and the concentration of IgE applied to the array. CONCLUSION AND SIGNIFICANCE: The massively parallel sequence-function analyses provided by this approach confirmed the importance of a consensus sequence found in all 21 of the original IgE aptamer sequences and support a common stem:loop structure as being the secondary structure underlying IgE binding. The microarray application, data and results presented illustrate an efficient, high information content approach to optimizing aptamer function. It also provides a foundation from which to better understand and manipulate this important class of high affinity biomolecules.

  14. Design, construction, and testing of a hysteresis controlled inverter for paralleling

    OpenAIRE

    Fillmore, Paul F.

    2003-01-01

    The U. S. Navy is pursuing an all electric ship that will require enormous amounts of power for applications such as electric propulsion. Reliability and redundancy in the electronics are imperative, since failure of a critical system could leave a ship stranded and vulnerable. A parallel inverter drive topology has been proposed to provide reliability and redundancy through load sharing. The parallel architecture enables some functionality in the event that one of the inverters fails. This t...

  15. Cryptanalysis on a parallel keyed hash function based on chaotic maps

    International Nuclear Information System (INIS)

    Guo Wei; Wang Xiaoming; He Dake; Cao Yang

    2009-01-01

    This Letter analyzes the security of a novel parallel keyed hash function based on chaotic maps, proposed by Xiao et al. to improve the efficiency in a parallel computing environment. We show how to devise forgery attacks on Xiao's scheme with differential cryptanalysis and first give experimental results for two kinds of forgery attacks. Furthermore, we discuss the problem of weak keys in the scheme and demonstrate how to utilize weak keys to construct collisions.

  16. Testing New Programming Paradigms with NAS Parallel Benchmarks

    Science.gov (United States)

    Jin, H.; Frumkin, M.; Schultz, M.; Yan, J.

    2000-01-01

    Over the past decade, high performance computing has evolved rapidly, not only in hardware architectures but also with increasing complexity of real applications. Technologies have been developing to aim at scaling up to thousands of processors on both distributed and shared memory systems. Development of parallel programs on these computers is always a challenging task. Today, writing parallel programs with message passing (e.g. MPI) is the most popular way of achieving scalability and high performance. However, writing message passing programs is difficult and error prone. In recent years, new efforts have been made to define new parallel programming paradigms. The best examples are: HPF (based on data parallelism) and OpenMP (based on shared memory parallelism). Both provide simple and clear extensions to sequential programs, thus greatly simplifying the tedious tasks encountered in writing message passing programs. HPF is independent of memory hierarchy, however, due to the immaturity of compiler technology its performance is still questionable. Although use of parallel compiler directives is not new, OpenMP offers a portable solution in the shared-memory domain. Another important development involves the tremendous progress in the internet and its associated technology. Although still in its infancy, Java promises portability in a heterogeneous environment and offers the possibility to "compile once and run anywhere." In light of testing these new technologies, we implemented new parallel versions of the NAS Parallel Benchmarks (NPBs) with HPF and OpenMP directives, and extended the work with Java and Java-threads. The purpose of this study is to examine the effectiveness of alternative programming paradigms. NPBs consist of five kernels and three simulated applications that mimic the computation and data movement of large scale computational fluid dynamics (CFD) applications. We started with the serial version included in NPB2.3. Optimization of memory and cache usage

  17. DGDFT: A massively parallel method for large scale density functional theory calculations.

    Science.gov (United States)

    Hu, Wei; Lin, Lin; Yang, Chao

    2015-09-28

    We describe a massively parallel implementation of the recently developed discontinuous Galerkin density functional theory (DGDFT) method, for efficient large-scale Kohn-Sham DFT based electronic structure calculations. The DGDFT method uses adaptive local basis (ALB) functions generated on-the-fly during the self-consistent field iteration to represent the solution to the Kohn-Sham equations. The use of the ALB set provides a systematic way to improve the accuracy of the approximation. By using the pole expansion and selected inversion technique to compute electron density, energy, and atomic forces, we can make the computational complexity of DGDFT scale at most quadratically with respect to the number of electrons for both insulating and metallic systems. We show that for the two-dimensional (2D) phosphorene systems studied here, using 37 basis functions per atom allows us to reach an accuracy level of 1.3 × 10⁻⁴ Hartree/atom in terms of the error of energy and 6.2 × 10⁻⁴ Hartree/bohr in terms of the error of atomic force, respectively. DGDFT can achieve 80% parallel efficiency on 128,000 high performance computing cores when it is used to study the electronic structure of 2D phosphorene systems with 3500-14 000 atoms. This high parallel efficiency results from a two-level parallelization scheme that we will describe in detail.

  18. DGDFT: A massively parallel method for large scale density functional theory calculations

    International Nuclear Information System (INIS)

    Hu, Wei; Yang, Chao; Lin, Lin

    2015-01-01

    We describe a massively parallel implementation of the recently developed discontinuous Galerkin density functional theory (DGDFT) method, for efficient large-scale Kohn-Sham DFT based electronic structure calculations. The DGDFT method uses adaptive local basis (ALB) functions generated on-the-fly during the self-consistent field iteration to represent the solution to the Kohn-Sham equations. The use of the ALB set provides a systematic way to improve the accuracy of the approximation. By using the pole expansion and selected inversion technique to compute electron density, energy, and atomic forces, we can make the computational complexity of DGDFT scale at most quadratically with respect to the number of electrons for both insulating and metallic systems. We show that for the two-dimensional (2D) phosphorene systems studied here, using 37 basis functions per atom allows us to reach an accuracy level of 1.3 × 10⁻⁴ Hartree/atom in terms of the error of energy and 6.2 × 10⁻⁴ Hartree/bohr in terms of the error of atomic force, respectively. DGDFT can achieve 80% parallel efficiency on 128,000 high performance computing cores when it is used to study the electronic structure of 2D phosphorene systems with 3500-14 000 atoms. This high parallel efficiency results from a two-level parallelization scheme that we will describe in detail

  19. DGDFT: A massively parallel method for large scale density functional theory calculations

    Energy Technology Data Exchange (ETDEWEB)

    Hu, Wei, E-mail: whu@lbl.gov; Yang, Chao, E-mail: cyang@lbl.gov [Computational Research Division, Lawrence Berkeley National Laboratory, Berkeley, California 94720 (United States); Lin, Lin, E-mail: linlin@math.berkeley.edu [Computational Research Division, Lawrence Berkeley National Laboratory, Berkeley, California 94720 (United States); Department of Mathematics, University of California, Berkeley, California 94720 (United States)

    2015-09-28

    We describe a massively parallel implementation of the recently developed discontinuous Galerkin density functional theory (DGDFT) method, for efficient large-scale Kohn-Sham DFT based electronic structure calculations. The DGDFT method uses adaptive local basis (ALB) functions generated on-the-fly during the self-consistent field iteration to represent the solution to the Kohn-Sham equations. The use of the ALB set provides a systematic way to improve the accuracy of the approximation. By using the pole expansion and selected inversion technique to compute electron density, energy, and atomic forces, we can make the computational complexity of DGDFT scale at most quadratically with respect to the number of electrons for both insulating and metallic systems. We show that for the two-dimensional (2D) phosphorene systems studied here, using 37 basis functions per atom allows us to reach an accuracy level of 1.3 × 10⁻⁴ Hartree/atom in terms of the error of energy and 6.2 × 10⁻⁴ Hartree/bohr in terms of the error of atomic force, respectively. DGDFT can achieve 80% parallel efficiency on 128,000 high performance computing cores when it is used to study the electronic structure of 2D phosphorene systems with 3500-14 000 atoms. This high parallel efficiency results from a two-level parallelization scheme that we will describe in detail.

  20. Multitasking TORT Under UNICOS: Parallel Performance Models and Measurements

    International Nuclear Information System (INIS)

    Azmy, Y.Y.; Barnett, D.A.

    1999-01-01

    The existing parallel algorithms in the TORT discrete ordinates code were updated to function in a UNICOS environment. A performance model for the parallel overhead was derived for the existing algorithms. The largest contributors to the parallel overhead were identified and a new algorithm was developed. A parallel overhead model was also derived for the new algorithm. The results of the comparison of parallel performance models were compared to applications of the code to two TORT standard test problems and a large production problem. The parallel performance models agree well with the measured parallel overhead

  1. Multitasking TORT under UNICOS: Parallel performance models and measurements

    International Nuclear Information System (INIS)

    Barnett, A.; Azmy, Y.Y.

    1999-01-01

    The existing parallel algorithms in the TORT discrete ordinates code were updated to function in a UNICOS environment. A performance model for the parallel overhead was derived for the existing algorithms. The largest contributors to the parallel overhead were identified and a new algorithm was developed. A parallel overhead model was also derived for the new algorithm. The results of the comparison of parallel performance models were compared to applications of the code to two TORT standard test problems and a large production problem. The parallel performance models agree well with the measured parallel overhead

  2. A hybrid method for the parallel computation of Green's functions

    International Nuclear Information System (INIS)

    Petersen, Dan Erik; Li Song; Stokbro, Kurt; Sorensen, Hans Henrik B.; Hansen, Per Christian; Skelboe, Stig; Darve, Eric

    2009-01-01

    Quantum transport models for nanodevices using the non-equilibrium Green's function method require the repeated calculation of the block tridiagonal part of the Green's and lesser Green's function matrices. This problem is related to the calculation of the inverse of a sparse matrix. Because of the large number of times this calculation needs to be performed, this is computationally very expensive even on supercomputers. The classical approach is based on recurrence formulas which cannot be efficiently parallelized. This practically prevents the solution of large problems with hundreds of thousands of atoms. We propose new recurrences for a general class of sparse matrices to calculate Green's and lesser Green's function matrices which extend formulas derived by Takahashi and others. We show that these recurrences may lead to a dramatically reduced computational cost because they only require computing a small number of entries of the inverse matrix. Then, we propose a parallelization strategy for block tridiagonal matrices which involves a combination of Schur complement calculations and cyclic reduction. It achieves good scalability even on problems of modest size.
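
    A common textbook variant of the serial block-tridiagonal recurrence mentioned above can be sketched with dense NumPy blocks; this is not the paper's new recurrences or its parallel Schur-complement/cyclic-reduction strategy, and it assumes the leading Schur complements are invertible.

```python
# Compute only the diagonal blocks of A^{-1} for a block-tridiagonal A,
# which is what the Green's function calculation needs.
import numpy as np

def diagonal_blocks_of_inverse(D, U, L):
    """D[i] = A[i,i], U[i] = A[i,i+1], L[i] = A[i+1,i]."""
    n = len(D)
    gL = [None] * n                      # left-connected partial inverses
    gL[0] = np.linalg.inv(D[0])
    for i in range(1, n):
        gL[i] = np.linalg.inv(D[i] - L[i - 1] @ gL[i - 1] @ U[i - 1])
    G = [None] * n                       # true diagonal blocks of A^{-1}
    G[n - 1] = gL[n - 1]
    for i in range(n - 2, -1, -1):
        G[i] = gL[i] + gL[i] @ U[i] @ G[i + 1] @ L[i] @ gL[i]
    return G

# Quick check against a dense inverse of a random block-tridiagonal matrix.
b, n = 3, 4
rng = np.random.default_rng(1)
D = [rng.standard_normal((b, b)) + 5 * np.eye(b) for _ in range(n)]
U = [rng.standard_normal((b, b)) for _ in range(n - 1)]
L = [rng.standard_normal((b, b)) for _ in range(n - 1)]
A = np.zeros((n * b, n * b))
for i in range(n):
    A[i*b:(i+1)*b, i*b:(i+1)*b] = D[i]
    if i < n - 1:
        A[i*b:(i+1)*b, (i+1)*b:(i+2)*b] = U[i]
        A[(i+1)*b:(i+2)*b, i*b:(i+1)*b] = L[i]
G = diagonal_blocks_of_inverse(D, U, L)
assert np.allclose(G[2], np.linalg.inv(A)[2*b:3*b, 2*b:3*b])
```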

  3. Automatic Parallelization Tool: Classification of Program Code for Parallel Computing

    Directory of Open Access Journals (Sweden)

    Mustafa Basthikodi

    2016-04-01

    Performance growth of single-core processors came to a halt in the past decade, but was re-enabled by the introduction of parallelism in processors. Multicore frameworks, along with graphical processing units, have broadened the scope for parallelism. Several compilers have been updated to address the emerging challenges of synchronization and threading. Appropriate program and algorithm classification can greatly help software engineers identify opportunities for effective parallelization. In the present work we investigate current species for the classification of algorithms; related work on classification is discussed along with a comparison of the issues that challenge classification. A set of algorithms is chosen that matches these structures with different issues and performs the given tasks. We have tested these algorithms using existing automatic species-extraction tools along with the Bones compiler. We have added functionality to the existing tool, providing a more detailed characterization. The contributions of our work include support for pointer arithmetic, conditional and incremental statements, user-defined types, constants, and mathematical functions. With this, we can retain significant information that is not captured by the original species of algorithms. We implemented these extensions in the tool, enabling automatic characterization of program code.

  4. Eddy current testing probe optimization using a parallel genetic algorithm

    Directory of Open Access Journals (Sweden)

    Dolapchiev Ivaylo

    2008-01-01

    This paper uses the developed parallel version of Michalewicz's Genocop III Genetic Algorithm (GA) searching technique to optimize the coil geometry of an eddy current non-destructive testing probe (ECTP). The electromagnetic field is computed using FEMM 2D finite element code. The aim of this optimization was to determine coil dimensions and positions that improve ECTP sensitivity to physical properties of the tested devices.

  5. COMPARISON BETWEEN TEST METHODS TO DETERMINE WOOD EMBEDMENT STRENGTH PARALLEL TO THE GRAIN

    Directory of Open Access Journals (Sweden)

    Diego Henrique de Almeida

    This study compares the test methods according to the ABNT NBR 7190:1997, EN 383:2007, ASTM D5764:2007, EUROCODE 5:2004, and NDS:2001 standards in order to provide support to establish a new test method for determining the embedment strength of wood parallel to the grain. Parallel-to-grain tests were carried out for six wood species (Schizolobium amazonicum; Pinus elliottii; Pinus oocarpa; Hymenaea spp.; Lyptus®: a hybrid of Eucalyptus grandis and Eucalyptus urophylla; and Goupia glabra) using four diameters (8 mm, 10 mm, 12 mm and 16 mm) for the metal pin fasteners (bolts). The experimental results obtained according to the EN 383:2007 standard were closer to the specific values for the metal-dowel connections design used by ABNT NBR 7190:1997, which are considered equal to the compression strength parallel to the grain. The use of maximum embedment force or the force causing displacement of 5 mm between the bolt and the test-piece as criteria for determining embedment strength for EN 383:2007 appears to be more appropriate than the criteria used by the Brazilian and American Standards.

  6. Device-independent parallel self-testing of two singlets

    Science.gov (United States)

    Wu, Xingyao; Bancal, Jean-Daniel; McKague, Matthew; Scarani, Valerio

    2016-06-01

    Device-independent self-testing offers the possibility of certifying the quantum state and measurements, up to local isometries, using only the statistics observed by querying uncharacterized local devices. In this paper we study parallel self-testing of two maximally entangled pairs of qubits; in particular, the local tensor product structure is not assumed but derived. We prove two criteria that achieve the desired result: a double use of the Clauser-Horne-Shimony-Holt inequality and the 3×3 magic square game. This demonstrates that the magic square game can only be perfectly won by measuring a two-singlet state. The tolerance to noise is well within reach of state-of-the-art experiments.

  7. Functional Parallel Factor Analysis for Functions of One- and Two-dimensional Arguments.

    Science.gov (United States)

    Choi, Ji Yeh; Hwang, Heungsun; Timmerman, Marieke E

    2018-03-01

    Parallel factor analysis (PARAFAC) is a useful multivariate method for decomposing three-way data that consist of three different types of entities simultaneously. This method estimates trilinear components, each of which is a low-dimensional representation of a set of entities, often called a mode, to explain the maximum variance of the data. Functional PARAFAC permits the entities in different modes to be smooth functions or curves, varying over a continuum, rather than a collection of unconnected responses. The existing functional PARAFAC methods handle functions of a one-dimensional argument (e.g., time) only. In this paper, we propose a new extension of functional PARAFAC for handling three-way data whose responses are sequenced along both a two-dimensional domain (e.g., a plane with x- and y-axis coordinates) and a one-dimensional argument. Technically, the proposed method combines PARAFAC with basis function expansion approximations, using a set of piecewise quadratic finite element basis functions for estimating two-dimensional smooth functions and a set of one-dimensional basis functions for estimating one-dimensional smooth functions. In a simulation study, the proposed method appeared to outperform the conventional PARAFAC. We apply the method to EEG data to demonstrate its empirical usefulness.

  8. From functional programming to multicore parallelism: A case study based on Presburger Arithmetic

    DEFF Research Database (Denmark)

    Dung, Phan Anh; Hansen, Michael Reichhardt

    2011-01-01

    , we are interested in using PA in connection with the Duration Calculus Model Checker (DCMC) [5]. There are effective decision procedures for PA including Cooper’s algorithm and the Omega Test; however, their complexity is extremely high with doubly exponential lower bound and triply exponential upper...... bound [7]. We investigate these decision procedures in the context of multicore parallelism with the hope of exploiting multicore powers. Unfortunately, we are not aware of any prior parallelism research related to decision procedures for PA. The closest work is the preliminary results on parallelism...

  9. Administering truncated receive functions in a parallel messaging interface

    Science.gov (United States)

    Archer, Charles J; Blocksome, Michael A; Ratterman, Joseph D; Smith, Brian E

    2014-12-09

    Administering truncated receive functions in a parallel messaging interface (`PMI`) of a parallel computer comprising a plurality of compute nodes coupled for data communications through the PMI and through a data communications network, including: sending, through the PMI on a source compute node, a quantity of data from the source compute node to a destination compute node; specifying, by an application on the destination compute node, a portion of the quantity of data to be received by the application on the destination compute node and a portion of the quantity of data to be discarded; receiving, by the PMI on the destination compute node, all of the quantity of data; providing, by the PMI on the destination compute node to the application on the destination compute node, only the portion of the quantity of data to be received by the application; and discarding, by the PMI on the destination compute node, the portion of the quantity of data to be discarded.

  10. A hybrid method for the parallel computation of Green's functions

    DEFF Research Database (Denmark)

    Petersen, Dan Erik; Li, Song; Stokbro, Kurt

    2009-01-01

    of the large number of times this calculation needs to be performed, this is computationally very expensive even on supercomputers. The classical approach is based on recurrence formulas which cannot be efficiently parallelized. This practically prevents the solution of large problems with hundreds...... of thousands of atoms. We propose new recurrences for a general class of sparse matrices to calculate Green's and lesser Green's function matrices which extend formulas derived by Takahashi and others. We show that these recurrences may lead to a dramatically reduced computational cost because they only...... require computing a small number of entries of the inverse matrix. Then, we propose a parallelization strategy for block tridiagonal matrices which involves a combination of Schur complement calculations and cyclic reduction. It achieves good scalability even on problems of modest size....
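
    For context, the classical serial recurrences can be illustrated on the diagonal blocks of the inverse of a block-tridiagonal matrix. The sketch below is an editor's NumPy rendering of the standard recursive Green's function sweeps (not the parallel Schur complement/cyclic reduction scheme proposed by the authors), with a brute-force check against a dense inverse.

        import numpy as np

        def rgf_diagonal_blocks(A_diag, A_lower, A_upper):
            """Diagonal blocks of inv(A) for a block-tridiagonal A via the standard serial
            recursive Green's function sweeps. A_diag[i] is block (i,i), A_lower[i] is
            block (i+1,i) and A_upper[i] is block (i,i+1)."""
            n = len(A_diag)
            g = [np.linalg.inv(A_diag[0])]        # left-connected Green's functions
            for i in range(1, n):                 # forward recurrence
                g.append(np.linalg.inv(A_diag[i] - A_lower[i - 1] @ g[i - 1] @ A_upper[i - 1]))
            G = [None] * n                        # exact diagonal blocks of the inverse
            G[n - 1] = g[n - 1]
            for i in range(n - 2, -1, -1):        # backward recurrence
                G[i] = g[i] + g[i] @ A_upper[i] @ G[i + 1] @ A_lower[i] @ g[i]
            return G

        # brute-force check on a small random block-tridiagonal matrix
        rng = np.random.default_rng(0)
        b, nb = 2, 4
        Ad = [rng.standard_normal((b, b)) + 4 * np.eye(b) for _ in range(nb)]
        Al = [rng.standard_normal((b, b)) for _ in range(nb - 1)]
        Au = [rng.standard_normal((b, b)) for _ in range(nb - 1)]
        A = np.zeros((b * nb, b * nb))
        for i in range(nb):
            A[i*b:(i+1)*b, i*b:(i+1)*b] = Ad[i]
        for i in range(nb - 1):
            A[(i+1)*b:(i+2)*b, i*b:(i+1)*b] = Al[i]
            A[i*b:(i+1)*b, (i+1)*b:(i+2)*b] = Au[i]
        G = rgf_diagonal_blocks(Ad, Al, Au)
        Ainv = np.linalg.inv(A)
        print(all(np.allclose(G[i], Ainv[i*b:(i+1)*b, i*b:(i+1)*b]) for i in range(nb)))  # expected: True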

  11. Compiling the functional data-parallel language SaC for Microgrids of Self-Adaptive Virtual Processors

    NARCIS (Netherlands)

    Grelck, C.; Herhut, S.; Jesshope, C.; Joslin, C.; Lankamp, M.; Scholz, S.-B.; Shafarenko, A.

    2009-01-01

    We present preliminary results from compiling the high-level, functional and data-parallel programming language SaC into a novel multi-core design: Microgrids of Self-Adaptive Virtual Processors (SVPs). The side-effect free nature of SaC in conjunction with its data-parallel foundation make it an

  12. Test Time Reduction for BIST by Parallel Divide-and-Conquer Method

    Energy Technology Data Exchange (ETDEWEB)

    Choi, Byung Gu; Kim, Dong Wook [Kwangwoon University (Korea)

    2000-06-01

    BIST (built-in self test) has been considered the most promising DFT (design-for-test) scheme for present and future test strategies. The most serious problem in applying BIST to a large circuit is the excessive increase in test time. This paper is focused on this problem. We proposed a new BIST construction scheme which uses a parallel divide-and-conquer method. The circuit division is performed with respect to some internal nodes called test points. The test points are selected by considering the nodal connectivity of the circuit rather than the testability of each node. The test patterns are generated by only one linear feedback shift register (LFSR) and they are shared by all the divided circuits. Thus, the test for each divided circuit is performed in parallel. Test responses are collected from the test points as well as the primary outputs. Even though the divide-and-conquer scheme is used and the test patterns are generated in one LFSR, the proposed scheme does not lose its pseudo-exhaustive property. We proposed a selection procedure to find the test points and implemented it in C/C++. Several example circuits were applied to this procedure, and the results showed that test time was reduced by up to 1/2{sup 151} while the increases in hardware overhead and delay remained modest. Because the proposed scheme showed a tendency for the increases in hardware and delay overhead to grow more slowly than the reduction in test time as the circuit size increases, it is expected to be used efficiently for large circuits such as VLSI and ULSI. (author). 15 refs., 7 figs., 5 tabs.
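
    The single shared pattern generator mentioned here is an LFSR. A minimal sketch of a maximal-length 16-bit Fibonacci LFSR (an editor's Python illustration with a standard tap set, not necessarily the polynomial used in the paper) is:

        def lfsr16_patterns(seed=0xACE1, count=8):
            """16-bit maximal-length Fibonacci LFSR (taps 16, 14, 13, 11), period 2**16 - 1."""
            lfsr = seed
            for _ in range(count):
                yield lfsr
                # feedback bit is the XOR of the tapped register bits
                bit = ((lfsr >> 0) ^ (lfsr >> 2) ^ (lfsr >> 3) ^ (lfsr >> 5)) & 1
                lfsr = (lfsr >> 1) | (bit << 15)

        for p in lfsr16_patterns():
            print(format(p, '016b'))   # pseudo-random test patterns shared by all circuit partitions

    In a scheme of the kind described above, each divided sub-circuit would receive these patterns in parallel, and the responses would be collected at the test points and primary outputs.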

  13. Electron beam test of an iron/gas calorimeter based on ceramic parallel plate chambers

    International Nuclear Information System (INIS)

    Arefiev, A.; Bencze, Gy.L.; Bizzeti, A.; Choumilov, E.; Civinini, C; Dalla Santa, F.; D'Alessandro, R.; Ferrando, A.; Fouz, M.C.; Herve, A.; Iglesias, A.; Ivochkin, V.; Josa, M.I.; Maggi, F.; Malinin, A.; Meschini, M.; Pojidaev, V.; Radermacher, E.; Salicio, J.M.

    1995-01-01

    The baseline option for the very forward calorimetry in the CMS experiment is an iron/gas calorimeter based on parallel plate chambers. A small prototype module of such a calorimeter has been tested using electrons of 5 to 100 GeV/c momentum with various high voltages and two gases: CO2 (100%) and CF4/CO2 (80/20), at atmospheric pressure. The collected charge has been measured as a function of the high voltage and of the electron energy. The energy resolution has also been measured. Comparisons have been made with Monte-Carlo predictions. The agreement between data and simulation allows an estimation of the expected performance of a full size calorimeter. (Author) 23 refs

  14. Electron beam test of an iron/gas calorimeter based on ceramic parallel plate chambers

    International Nuclear Information System (INIS)

    Arefiev, A.; Bencze, G.L.; Bizzeti, A.; Choumilov, E.; Civinini, C.; Dalla Santa, F.; D'Alessandro, R.; Ferrando, A.; Fouz, M.C.; Herve, A.; Iglesias, A.; Ivochkin, V.; Josa, M.I.; Maggi, F.; Malinin, A.; Meschini, M.; Pojidaev, V.; Radermacher, E.; Salicio, J.M.

    1995-12-01

    The baseline option for the very forward calorimetry in the CMS experiment is an iron/gas calorimeter based on parallel plate chambers. A small prototype module of such a calorimeter has been tested using electrons of 5 to 100 GeV/c momentum with various high voltages and two gases: CO2 (100%) and CF4/CO2 (80/20), at atmospheric pressure. The collected charge has been measured as a function of the high voltage and of the electron energy. The energy resolution has also been measured. Comparisons have been made with Monte-Carlo predictions. The agreement between data and simulation allows an estimation of the expected performance of a full size calorimeter. (Author)

  15. Parallel computations

    CERN Document Server

    1982-01-01

    Parallel Computations focuses on parallel computation, with emphasis on algorithms used in a variety of numerical and physical applications and for many different types of parallel computers. Topics covered range from vectorization of fast Fourier transforms (FFTs) and of the incomplete Cholesky conjugate gradient (ICCG) algorithm on the Cray-1 to calculation of table lookups and piecewise functions. Single tridiagonal linear systems and vectorized computation of reactive flow are also discussed.Comprised of 13 chapters, this volume begins by classifying parallel computers and describing techn

  16. Thyroid Function Tests

    Science.gov (United States)


  17. A study of objective functions for organs with parallel and serial architecture

    International Nuclear Information System (INIS)

    Stavrev, P.V.; Stavreva, N.A.; Round, W.H.

    1997-01-01

    An objective function analysis has been carried out for the case in which target volumes are deliberately enlarged to account for tumour mobility and the consequent uncertainty in tumour position in external beam radiotherapy. The dose distribution inside the tumour is assumed to have a logarithmic dependence on the tumour cell density, which assures an iso-local tumour control probability. The normal tissue immediately surrounding the tumour is irradiated homogeneously at a dose level equal to the dose D(R) delivered at the edge of the tumour. The normal tissue in the high-dose field is modelled as being organized in identical functional subunits (FSUs) composed of a relatively large number of cells. Two types of organs - having serial and parallel architecture - are considered. Implicit averaging over intra-patient normal tissue radiosensitivity variations is done. A function describing the normal tissue survival probability S0 is constructed. The objective function is given as the product of the total tumour control probability (TCP) and the normal tissue survival probability S0. The values of the dose D(R) which result in a maximum of the objective function are obtained for different combinations of tumour and normal tissue parameters, such as tumour and normal tissue radiosensitivities, the number of cells constituting a normal tissue functional subunit, the total number of normal cells under high-dose D(R) exposure, and the functional reserve for organs having parallel architecture. The corresponding TCP and S0 values are computed and discussed. (authors)
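
    A minimal numerical illustration of such an objective function follows. This is an editor's sketch using a generic Poisson TCP model with single-hit exponential cell survival; the parameter values, the all-cells-killed FSU death rule and the functional-reserve criterion are illustrative assumptions, not the authors' formulation.

        import numpy as np
        from scipy.stats import binom

        def tcp(D, n_clonogens=1e7, alpha_t=0.35):
            # Poisson TCP with single-hit cell survival exp(-alpha_t * D)
            return np.exp(-n_clonogens * np.exp(-alpha_t * D))

        def fsu_death_prob(D, cells_per_fsu=10**6, alpha_n=0.25):
            # an FSU is assumed lost only if every one of its cells is killed
            return (1.0 - np.exp(-alpha_n * D)) ** cells_per_fsu

        def s0_serial(D, n_fsu=10**4, **kw):
            # a serial organ fails if any single FSU is lost
            return (1.0 - fsu_death_prob(D, **kw)) ** n_fsu

        def s0_parallel(D, n_fsu=1000, reserve=0.3, **kw):
            # a parallel organ keeps functioning while at most `reserve` of its FSUs are lost
            return binom.cdf(int(reserve * n_fsu), n_fsu, fsu_death_prob(D, **kw))

        doses = np.linspace(30.0, 90.0, 601)   # candidate edge doses D(R), in Gy
        for s0 in (s0_serial, s0_parallel):
            objective = tcp(doses) * s0(doses)
            print(s0.__name__, "-> optimal D(R) ~", doses[np.argmax(objective)], "Gy")

    With these illustrative parameters the scan reproduces the qualitative behaviour discussed above: the parallel organ, protected by its functional reserve, tolerates a noticeably higher optimal edge dose than the serial one.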

  18. Time-dependent density-functional theory in massively parallel computer architectures: the OCTOPUS project.

    Science.gov (United States)

    Andrade, Xavier; Alberdi-Rodriguez, Joseba; Strubbe, David A; Oliveira, Micael J T; Nogueira, Fernando; Castro, Alberto; Muguerza, Javier; Arruabarrena, Agustin; Louie, Steven G; Aspuru-Guzik, Alán; Rubio, Angel; Marques, Miguel A L

    2012-06-13

    Octopus is a general-purpose density-functional theory (DFT) code, with a particular emphasis on the time-dependent version of DFT (TDDFT). In this paper we present the ongoing efforts to achieve the parallelization of octopus. We focus on the real-time variant of TDDFT, where the time-dependent Kohn-Sham equations are directly propagated in time. This approach has great potential for execution in massively parallel systems such as modern supercomputers with thousands of processors and graphics processing units (GPUs). For harvesting the potential of conventional supercomputers, the main strategy is a multi-level parallelization scheme that combines the inherent scalability of real-time TDDFT with a real-space grid domain-partitioning approach. A scalable Poisson solver is critical for the efficiency of this scheme. For GPUs, we show how using blocks of Kohn-Sham states provides the required level of data parallelism and that this strategy is also applicable for code optimization on standard processors. Our results show that real-time TDDFT, as implemented in octopus, can be the method of choice for studying the excited states of large molecular systems in modern parallel architectures.

  19. Time-dependent density-functional theory in massively parallel computer architectures: the octopus project

    Science.gov (United States)

    Andrade, Xavier; Alberdi-Rodriguez, Joseba; Strubbe, David A.; Oliveira, Micael J. T.; Nogueira, Fernando; Castro, Alberto; Muguerza, Javier; Arruabarrena, Agustin; Louie, Steven G.; Aspuru-Guzik, Alán; Rubio, Angel; Marques, Miguel A. L.

    2012-06-01

    Octopus is a general-purpose density-functional theory (DFT) code, with a particular emphasis on the time-dependent version of DFT (TDDFT). In this paper we present the ongoing efforts to achieve the parallelization of octopus. We focus on the real-time variant of TDDFT, where the time-dependent Kohn-Sham equations are directly propagated in time. This approach has great potential for execution in massively parallel systems such as modern supercomputers with thousands of processors and graphics processing units (GPUs). For harvesting the potential of conventional supercomputers, the main strategy is a multi-level parallelization scheme that combines the inherent scalability of real-time TDDFT with a real-space grid domain-partitioning approach. A scalable Poisson solver is critical for the efficiency of this scheme. For GPUs, we show how using blocks of Kohn-Sham states provides the required level of data parallelism and that this strategy is also applicable for code optimization on standard processors. Our results show that real-time TDDFT, as implemented in octopus, can be the method of choice for studying the excited states of large molecular systems in modern parallel architectures.

  20. Time-dependent density-functional theory in massively parallel computer architectures: the octopus project

    International Nuclear Information System (INIS)

    Andrade, Xavier; Aspuru-Guzik, Alán; Alberdi-Rodriguez, Joseba; Rubio, Angel; Strubbe, David A; Louie, Steven G; Oliveira, Micael J T; Nogueira, Fernando; Castro, Alberto; Muguerza, Javier; Arruabarrena, Agustin; Marques, Miguel A L

    2012-01-01

    Octopus is a general-purpose density-functional theory (DFT) code, with a particular emphasis on the time-dependent version of DFT (TDDFT). In this paper we present the ongoing efforts to achieve the parallelization of octopus. We focus on the real-time variant of TDDFT, where the time-dependent Kohn-Sham equations are directly propagated in time. This approach has great potential for execution in massively parallel systems such as modern supercomputers with thousands of processors and graphics processing units (GPUs). For harvesting the potential of conventional supercomputers, the main strategy is a multi-level parallelization scheme that combines the inherent scalability of real-time TDDFT with a real-space grid domain-partitioning approach. A scalable Poisson solver is critical for the efficiency of this scheme. For GPUs, we show how using blocks of Kohn-Sham states provides the required level of data parallelism and that this strategy is also applicable for code optimization on standard processors. Our results show that real-time TDDFT, as implemented in octopus, can be the method of choice for studying the excited states of large molecular systems in modern parallel architectures. (topical review)

  1. A massively-parallel electronic-structure calculations based on real-space density functional theory

    International Nuclear Information System (INIS)

    Iwata, Jun-Ichi; Takahashi, Daisuke; Oshiyama, Atsushi; Boku, Taisuke; Shiraishi, Kenji; Okada, Susumu; Yabana, Kazuhiro

    2010-01-01

    Based on the real-space finite-difference method, we have developed a first-principles density functional program that efficiently performs large-scale calculations on massively-parallel computers. In addition to an efficient parallel implementation, we also implemented several computational improvements, substantially reducing the computational costs of O(N^3) operations such as the Gram-Schmidt procedure and subspace diagonalization. Using the program on a massively-parallel computer cluster with a theoretical peak performance of several TFLOPS, we perform electronic-structure calculations for a system consisting of over 10,000 Si atoms, and obtain a self-consistent electronic structure in a few hundred hours. We analyze in detail the costs of the program in terms of computation and of inter-node communications to clarify the efficiency, the applicability, and the possibility for further improvements.

  2. Vector Green's function algorithm for radiative transfer in plane-parallel atmosphere

    International Nuclear Information System (INIS)

    Qin Yi; Box, Michael A.

    2006-01-01

    Green's function is a widely used approach for boundary value problems. In problems related to radiative transfer, Green's function has been found to be useful in land, ocean and atmosphere remote sensing. It is also a key element in higher order perturbation theory. This paper presents an explicit expression of the Green's function, in terms of the source and radiation field variables, for a plane-parallel atmosphere with either vacuum boundaries or a reflecting (BRDF) surface. Full polarization state is considered but the algorithm has been developed in such way that it can be easily reduced to solve scalar radiative transfer problems, which makes it possible to implement a single set of code for computing both the scalar and the vector Green's function

  3. Nematic-smectic A and nematic-solid transitions of parallel hard spherocylinders from density functional theory

    NARCIS (Netherlands)

    University Utrecht

    1992-01-01

    A simple density functional theory for the various liquid-crystalline phases of parallel hard spherocylinders is formulated on the basis of Pynn's ansatz for the direct correlation function of the spherocylinders. Fair agreement with the computer simulations is found.

  4. Functional assessment of the ex vivo vocal folds through biomechanical testing: A review

    Science.gov (United States)

    Dion, Gregory R.; Jeswani, Seema; Roof, Scott; Fritz, Mark; Coelho, Paulo; Sobieraj, Michael; Amin, Milan R.; Branski, Ryan C.

    2016-01-01

    The human vocal folds are complex structures made up of distinct layers that vary in cellular and extracellular composition. The mechanical properties of vocal fold tissue are fundamental to the study of both the acoustics and biomechanics of voice production. To date, quantitative methods have been applied to characterize the vocal fold tissue in both normal and pathologic conditions. This review describes, summarizes, and discusses the most commonly employed methods for vocal fold biomechanical testing. Force-elongation, torsional parallel plate rheometry, simple-shear parallel plate rheometry, linear skin rheometry, and indentation are the most frequently employed biomechanical tests for vocal fold tissues, and each provides material property data that can be used to compare native tissue versus diseased or treated tissue. Force-elongation testing is clinically useful, as it allows for functional unit testing, while rheometry provides physiologically relevant shear data, and nanoindentation permits micrometer-scale testing across different areas of the vocal fold as well as whole-organ testing. Thoughtful selection of the testing technique during experimental design to evaluate a hypothesis is important to optimizing biomechanical testing of vocal fold tissues. PMID:27127075

  5. Air-side performance of a parallel-flow parallel-fin (PF{sup 2}) heat exchanger in sequential frosting

    Energy Technology Data Exchange (ETDEWEB)

    Zhang, Ping [Zhejiang Vocational College of Commerce, Hangzhou, Binwen Road 470 (China); Department of Mechanical Science and Engineering, University of Illinois at Urbana-Champaign, 1206 West Green Street, Urbana, IL 61801 (United States); Hrnjak, P.S. [Department of Mechanical Science and Engineering, University of Illinois at Urbana-Champaign, 1206 West Green Street, Urbana, IL 61801 (United States)

    2010-09-15

    The thermal-hydraulic performance in periodic frosting conditions is experimentally studied for the parallel-flow parallel-fin heat exchanger, henceforth referred to as a PF{sup 2} heat exchanger, a new style of heat exchanger that uses louvered bent fins on flat tubes to enhance water drainage when the flat tubes are horizontal. Typically, it takes a few frosting/defrosting cycles to come to repeatable conditions. The criterion for the initiation of defrost and a sufficiently long defrost period are determined for the test PF{sup 2} heat exchanger and test condition. The effects of blower operation on the pressure drop, frost accumulation, water retention, and capacity in time are compared under the conditions of 15 sequential frosting cycles. Pressure drop across the heat exchanger and overall heat transfer coefficient are quantified under frost conditions as functions of the air humidity and air face velocity. The performances of two types of flat-tube heat exchangers, PF{sup 2} heat exchanger and conventional parallel-flow serpentine-fin (PFSF) heat exchanger, are compared and the results obtained are presented. (author)

  6. Parallel Monte Carlo simulation of aerosol dynamics

    KAUST Repository

    Zhou, K.

    2014-01-01

    A highly efficient Monte Carlo (MC) algorithm is developed for the numerical simulation of aerosol dynamics, that is, nucleation, surface growth, and coagulation. Nucleation and surface growth are handled with deterministic means, while coagulation is simulated with a stochastic method (Marcus-Lushnikov stochastic process). Operator splitting techniques are used to synthesize the deterministic and stochastic parts in the algorithm. The algorithm is parallelized using the Message Passing Interface (MPI). The parallel computing efficiency is investigated through numerical examples. Near 60% parallel efficiency is achieved for the maximum testing case with 3.7 million MC particles running on 93 parallel computing nodes. The algorithm is verified through simulating various testing cases and comparing the simulation results with available analytical and/or other numerical solutions. Generally, it is found that only small number (hundreds or thousands) of MC particles is necessary to accurately predict the aerosol particle number density, volume fraction, and so forth, that is, low order moments of the Particle Size Distribution (PSD) function. Accurately predicting the high order moments of the PSD needs to dramatically increase the number of MC particles. 2014 Kun Zhou et al.
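
    The stochastic coagulation part can be illustrated with a minimal Marcus-Lushnikov-style Monte Carlo. This is an editor's sketch for a constant kernel and pure coagulation only, without the nucleation/surface-growth operator splitting or the MPI parallelization described in the paper, and all parameter values are illustrative.

        import numpy as np

        def coagulate_constant_kernel(n0=10000, K=1.0, V=1.0, t_end=1e-3, seed=0):
            """Pure coagulation with a constant kernel K: pick two distinct MC particles
            at random, merge them, and advance time by an exponential waiting time."""
            rng = np.random.default_rng(seed)
            v = np.ones(n0)                  # MC particle volumes (monodisperse start)
            t, n = 0.0, n0
            while n > 1 and t < t_end:
                rate = K * n * (n - 1) / (2.0 * V)   # total coagulation rate
                t += rng.exponential(1.0 / rate)
                i = rng.integers(n)
                j = rng.integers(n)
                while j == i:
                    j = rng.integers(n)
                v[i] += v[j]                 # merge particle j into particle i
                v[j] = v[n - 1]              # remove j by swapping in the last active particle
                n -= 1
            return t, v[:n]

        t, v = coagulate_constant_kernel()
        # For the constant kernel the particle count should roughly follow
        # N(t) = N0 / (1 + K*N0*t/(2*V)), i.e. ~10000/6 ~ 1667 particles at t = 1e-3.
        print(len(v), "MC particles remain at t =", f"{t:.2e}")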

  7. Parallel ray tracing for one-dimensional discrete ordinate computations

    International Nuclear Information System (INIS)

    Jarvis, R.D.; Nelson, P.

    1996-01-01

    The ray-tracing sweep in discrete-ordinates, spatially discrete numerical approximation methods applied to the linear, steady-state, plane-parallel, mono-energetic, azimuthally symmetric, neutral-particle transport equation can be reduced to a parallel prefix computation. In so doing, the often severe penalty in convergence rate of the source iteration, suffered by most current parallel algorithms using spatial domain decomposition, can be avoided while attaining parallelism in the spatial domain to whatever extent desired. In addition, the reduction implies parallel algorithm complexity limits for the ray-tracing sweep. The reduction applies to all closed, linear, one-cell functional (CLOF) spatial approximation methods, which encompasses most in current popular use. Scalability test results of an implementation of the algorithm on a 64-node nCube-2S hypercube-connected, message-passing, multi-computer are described. (author)
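
    The reduction can be made concrete for a single ordinate: a one-cell sweep is an affine update psi_i = a_i*psi_(i-1) + b_i, and composing the (a, b) pairs is associative, so the whole sweep is a prefix (scan) computation. The sketch below is an editor's illustration in plain Python; the coefficients a and b are placeholders, and a real implementation would distribute the scan itself across processors. It checks that the scan form reproduces the serial sweep.

        from itertools import accumulate

        def compose(f, g):
            # composition of affine maps x -> a*x + b; f is applied first, then g
            a1, b1 = f
            a2, b2 = g
            return (a2 * a1, a2 * b1 + b2)

        def serial_sweep(a, b, psi0):
            psi = [psi0]
            for ai, bi in zip(a, b):
                psi.append(ai * psi[-1] + bi)
            return psi[1:]

        def scan_sweep(a, b, psi0):
            # prefix "sums" under the associative composition of affine maps;
            # applying each prefix to psi0 gives the angular flux at that cell
            prefixes = accumulate(zip(a, b), compose)
            return [ai * psi0 + bi for ai, bi in prefixes]

        a = [0.9, 0.8, 0.95, 0.7]     # per-cell attenuation factors (illustrative)
        b = [0.1, 0.2, 0.05, 0.3]     # per-cell source contributions (illustrative)
        print(serial_sweep(a, b, 1.0))
        print(scan_sweep(a, b, 1.0))  # identical output; the scan itself can be parallelized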

  8. Parallel paving: An algorithm for generating distributed, adaptive, all-quadrilateral meshes on parallel computers

    Energy Technology Data Exchange (ETDEWEB)

    Lober, R.R.; Tautges, T.J.; Vaughan, C.T.

    1997-03-01

    Paving is an automated mesh generation algorithm which produces all-quadrilateral elements. It can additionally generate these elements in varying sizes such that the resulting mesh adapts to a function distribution, such as an error function. While powerful, conventional paving is a very serial algorithm in its operation. Parallel paving is the extension of serial paving into parallel environments to perform the same meshing functions as conventional paving only on distributed, discretized models. This extension allows large, adaptive, parallel finite element simulations to take advantage of paving's meshing capabilities for h-remap remeshing. A significantly modified version of the CUBIT mesh generation code has been developed to host the parallel paving algorithm and demonstrate its capabilities on both two dimensional and three dimensional surface geometries and compare the resulting parallel produced meshes to conventionally paved meshes for mesh quality and algorithm performance. Sandia's "tiling" dynamic load balancing code has also been extended to work with the paving algorithm to retain parallel efficiency as subdomains undergo iterative mesh refinement.

  9. Mixed-meal tolerance test versus glucagon stimulation test for the assessment of beta-cell function in therapeutic trials in type 1 diabetes

    DEFF Research Database (Denmark)

    Greenbaum, Carla J; Mandrup-Poulsen, Thomas; McGee, Paula Friedenberg

    2008-01-01

    OBJECTIVE: Beta-cell function in type 1 diabetes clinical trials is commonly measured by C-peptide response to a secretagogue in either a mixed-meal tolerance test (MMTT) or a glucagon stimulation test (GST). The Type 1 Diabetes TrialNet Research Group and the European C-peptide Trial (ECPT) Study...... Group conducted parallel randomized studies to compare the sensitivity, reproducibility, and tolerability of these procedures. RESEARCH DESIGN AND METHODS: In randomized sequences, 148 TrialNet subjects completed 549 tests with up to 2 MMTT and 2 GST tests on separate days, and 118 ECPT subjects...

  10. Storing files in a parallel computing system based on user-specified parser function

    Science.gov (United States)

    Faibish, Sorin; Bent, John M; Tzelnic, Percy; Grider, Gary; Manzanares, Adam; Torres, Aaron

    2014-10-21

    Techniques are provided for storing files in a parallel computing system based on a user-specified parser function. A plurality of files generated by a distributed application in a parallel computing system are stored by obtaining a parser from the distributed application for processing the plurality of files prior to storage; and storing one or more of the plurality of files in one or more storage nodes of the parallel computing system based on the processing by the parser. The plurality of files comprise one or more of a plurality of complete files and a plurality of sub-files. The parser can optionally store only those files that satisfy one or more semantic requirements of the parser. The parser can also extract metadata from one or more of the files and the extracted metadata can be stored with one or more of the plurality of files and used for searching for files.

  11. CUBESIM, Hypercube and Denelcor Hep Parallel Computer Simulation

    International Nuclear Information System (INIS)

    Dunigan, T.H.

    1988-01-01

    1 - Description of program or function: CUBESIM is a set of subroutine libraries and programs for the simulation of message-passing parallel computers and shared-memory parallel computers. Subroutines are supplied to simulate the Intel hypercube and the Denelcor HEP parallel computers. The system permits a user to develop and test parallel programs written in C or FORTRAN on a single processor. The user may alter such hypercube parameters as message startup times, packet size, and the computation-to-communication ratio. The simulation generates a trace file that can be used for debugging, performance analysis, or graphical display. 2 - Method of solution: The CUBESIM simulator is linked with the user's parallel application routines to run as a single UNIX process. The simulator library provides a small operating system to perform process and message management. 3 - Restrictions on the complexity of the problem: Up to 128 processors can be simulated with a virtual memory limit of 6 million bytes. Up to 1000 processes can be simulated

  12. Parallelization of ultrasonic field simulations for non destructive testing

    International Nuclear Information System (INIS)

    Lambert, Jason

    2015-01-01

    The Non Destructive Testing field increasingly uses simulation. It is used at every step of the whole control process of an industrial part, from speeding up control development to helping experts understand results. During this thesis, a simulation tool dedicated to the fast computation of an ultrasonic field radiated by a phased array probe in an isotropic specimen has been developed. Its performance enables an interactive usage. To benefit from the commonly available parallel architectures, a regular model (aimed at removing divergent branching) derived from the generic CIVA model has been developed. First, a reference implementation was developed to validate this model against CIVA results, and to analyze its performance behaviour before optimization. The resulting code has been optimized for three kinds of parallel architectures commonly available in workstations: general purpose processors (GPP), many-core co-processors (Intel MIC) and graphics processing units (nVidia GPU). On the GPP and the MIC, the algorithm was reorganized and implemented to benefit from both parallelism levels, multithreading and vector instructions. On the GPU, the multiple steps of field computation have been divided into multiple successive CUDA kernels. Moreover, libraries dedicated to each architecture were used to speed up Fast Fourier Transforms: Intel MKL on GPP and MIC and nVidia cuFFT on GPU. Performance and hardware adequation of the produced codes were thoroughly studied for each architecture. On multiple realistic control configurations, interactive performance was reached. Perspectives to address more complex configurations were drawn. Finally, the integration and the industrialization of this code in the commercial NDT platform CIVA is discussed. (author) [fr

  13. Parallel Relational Universes – experiments in modularity

    DEFF Research Database (Denmark)

    Pagliarini, Luigi; Lund, Henrik Hautop

    2015-01-01

    We here describe Parallel Relational Universes, an artistic method used for the psychological analysis of group dynamics. The design of the artistic system, which mediates group dynamics, emerges from our studies of modular playware and remixing playware. Inspired by remixing modular playware......, where users remix samples in the form of physical and functional modules, we created an artistic instantiation of such a concept with the Parallel Relational Universes, allowing arts alumni to remix artistic expressions. Here, we report the data that emerged from a first pre-test, run with gymnasium alumni....... We then report both the artistic and the psychological findings. We discuss possible variations of such an instrument. Between an art piece and a psychological test, at a first cognitive analysis, it seems to be a promising research tool

  14. Introduction to parallel programming

    CERN Document Server

    Brawer, Steven

    1989-01-01

    Introduction to Parallel Programming focuses on the techniques, processes, methodologies, and approaches involved in parallel programming. The book first offers information on Fortran, hardware and operating system models, and processes, shared memory, and simple parallel programs. Discussions focus on processes and processors, joining processes, shared memory, time-sharing with multiple processors, hardware, loops, passing arguments in function/subroutine calls, program structure, and arithmetic expressions. The text then elaborates on basic parallel programming techniques, barriers and race

  15. Topology in SU(2) lattice gauge theory and parallelization of functional magnetic resonance imaging

    Energy Technology Data Exchange (ETDEWEB)

    Solbrig, Stefan

    2008-07-01

    In this thesis, I discuss topological properties of quenched SU(2) lattice gauge fields. In particular, clusters of topological charge density exhibit a power-law. The exponent of that power-law can be used to validate models for lattice gauge fields. Instead of working with fixed cutoffs of the topological charge density, using the notion of a ''watermark'' is more convenient. Furthermore, I discuss how a parallel computer, originally designed for lattice gauge field simulations, can be used for functional magnetic resonance imaging. Multi parameter fits can be parallelized to achieve almost real-time evaluation of fMRI data. (orig.)

  16. Topology in SU(2) lattice gauge theory and parallelization of functional magnetic resonance imaging

    International Nuclear Information System (INIS)

    Solbrig, Stefan

    2008-01-01

    In this thesis, I discuss topological properties of quenched SU(2) lattice gauge fields. In particular, clusters of topological charge density exhibit a power-law. The exponent of that power-law can be used to validate models for lattice gauge fields. Instead of working with fixed cutoffs of the topological charge density, using the notion of a ''watermark'' is more convenient. Furthermore, I discuss how a parallel computer, originally designed for lattice gauge field simulations, can be used for functional magnetic resonance imaging. Multi parameter fits can be parallelized to achieve almost real-time evaluation of fMRI data. (orig.)

  17. Three pillars for achieving quantum mechanical molecular dynamics simulations of huge systems: Divide-and-conquer, density-functional tight-binding, and massively parallel computation.

    Science.gov (United States)

    Nishizawa, Hiroaki; Nishimura, Yoshifumi; Kobayashi, Masato; Irle, Stephan; Nakai, Hiromi

    2016-08-05

    The linear-scaling divide-and-conquer (DC) quantum chemical methodology is applied to density-functional tight-binding (DFTB) theory to develop a massively parallel program that achieves on-the-fly molecular reaction dynamics simulations of huge systems from scratch. Functions to perform large-scale geometry optimization and molecular dynamics on the DC-DFTB potential energy surface are implemented in the program, called DC-DFTB-K. A novel interpolation-based algorithm is developed for parallelizing the determination of the Fermi level in the DC method. The performance of the DC-DFTB-K program is assessed using a laboratory computer and the K computer. Numerical tests show the high efficiency of the DC-DFTB-K program: a single-point energy gradient calculation of a one-million-atom system is completed within 60 s using 7290 nodes of the K computer. © 2016 Wiley Periodicals, Inc.
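
    The Fermi-level step amounts to a one-dimensional root find: locate the chemical potential at which the summed Fermi-Dirac occupations equal the electron count. The paper's contribution is an interpolation-based parallel scheme; the sketch below is an editor's illustration of only the underlying problem, solved with a plain bisection on made-up eigenvalues.

        import numpy as np

        def fermi_level(eigs, n_electrons, kT=0.01, tol=1e-10):
            """Bisection for the chemical potential mu at which the spin-degenerate
            Fermi-Dirac occupations of the eigenvalues sum to n_electrons."""
            def occupations(mu):
                x = (eigs - mu) / kT
                return 0.5 * (1.0 - np.tanh(0.5 * x))   # = 1/(1+exp(x)), overflow-safe
            lo, hi = eigs.min() - 10 * kT, eigs.max() + 10 * kT
            while hi - lo > tol:
                mid = 0.5 * (lo + hi)
                if 2.0 * occupations(mid).sum() < n_electrons:
                    lo = mid    # too few electrons below mid: raise the chemical potential
                else:
                    hi = mid
            return 0.5 * (lo + hi)

        eigs = np.sort(np.random.default_rng(0).normal(size=200))   # made-up eigenvalue spectrum
        print(fermi_level(eigs, n_electrons=120))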

  18. Forced-convection boiling tests performed in parallel simulated LMR fuel assemblies

    International Nuclear Information System (INIS)

    Rose, S.D.; Carbajo, J.J.; Levin, A.E.; Lloyd, D.B.; Montgomery, B.H.; Wantland, J.L.

    1985-01-01

    Forced-convection tests have been carried out using parallel simulated Liquid Metal Reactor fuel assemblies in an engineering-scale sodium loop, the Thermal-Hydraulic Out-of-Reactor Safety facility. The tests, performed under single- and two-phase conditions, have shown that for low forced-convection flow there is significant flow augmentation by thermal convection, an important phenomenon under degraded shutdown heat removal conditions in an LMR. The power and flows required for boiling and dryout to occur are much higher than decay heat levels. The experimental evidence supports analytical results that heat removal from an LMR is possible with a degraded shutdown heat removal system

  19. Transfer function modeling of parallel connected two three-phase induction motor implementation using LabView platform

    DEFF Research Database (Denmark)

    Gunabalan, R.; Sanjeevikumar, P.; Blaabjerg, Frede

    2015-01-01

    This paper presents the transfer function modeling and stability analysis of two induction motors of same ratings and parameters connected in parallel. The induction motors are controlled by a single inverter and the entire drive system is modeled using transfer function in LabView. Further...

  20. Platelet Function Tests

    Science.gov (United States)

    Platelet function tests are also known as platelet aggregation studies, PFT, or platelet function assay (PFA).

  1. Liver Function Tests

    Science.gov (United States)

    ... digest food, store energy, and remove poisons. Liver function tests are blood tests that check to see ... as hepatitis and cirrhosis. You may have liver function tests as part of a regular checkup. Or ...

  2. SPSS and SAS programs for determining the number of components using parallel analysis and velicer's MAP test.

    Science.gov (United States)

    O'Connor, B P

    2000-08-01

    Popular statistical software packages do not have the proper procedures for determining the number of components in factor and principal components analyses. Parallel analysis and Velicer's minimum average partial (MAP) test are validated procedures, recommended widely by statisticians. However, many researchers continue to use alternative, simpler, but flawed procedures, such as the eigenvalues-greater-than-one rule. Use of the proper procedures might be increased if these procedures could be conducted within familiar software environments. This paper describes brief and efficient programs for using SPSS and SAS to conduct parallel analyses and the MAP test.
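
    The same procedure is straightforward to reproduce outside SPSS and SAS. The following compact NumPy version of Horn's parallel analysis (an editor's sketch, not O'Connor's program) compares the observed correlation-matrix eigenvalues with a chosen percentile of eigenvalues obtained from random normal data of the same shape.

        import numpy as np

        def parallel_analysis(data, n_sims=1000, percentile=95, seed=0):
            """Horn's parallel analysis on the correlation matrix of `data` (n_obs x n_vars).
            Counts the leading components whose observed eigenvalue exceeds the chosen
            percentile of eigenvalues from random normal data of the same shape."""
            rng = np.random.default_rng(seed)
            n_obs, n_vars = data.shape
            obs_eigs = np.linalg.eigvalsh(np.corrcoef(data, rowvar=False))[::-1]
            rand_eigs = np.empty((n_sims, n_vars))
            for s in range(n_sims):
                r = rng.standard_normal((n_obs, n_vars))
                rand_eigs[s] = np.linalg.eigvalsh(np.corrcoef(r, rowvar=False))[::-1]
            threshold = np.percentile(rand_eigs, percentile, axis=0)
            n_comp = 0
            for observed, simulated in zip(obs_eigs, threshold):
                if observed > simulated:
                    n_comp += 1
                else:
                    break
            return n_comp

        # synthetic two-factor example; the call should typically suggest 2 components
        rng = np.random.default_rng(1)
        latent = rng.standard_normal((300, 2))
        loadings = rng.standard_normal((2, 8))
        data = latent @ loadings + 0.5 * rng.standard_normal((300, 8))
        print(parallel_analysis(data))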

  3. Kidney function tests

    Science.gov (United States)

    Kidney function tests are common lab tests used to evaluate how well the kidneys are working. Such tests include: ... Oh MS, Briefel G. Evaluation of renal function, water, electrolytes ... and Management by Laboratory Methods . 23rd ed. Philadelphia, ...

  4. Topology in SU(2) lattice gauge theory and parallelization of functional magnetic resonance imaging

    Energy Technology Data Exchange (ETDEWEB)

    Solbrig, Stefan

    2008-07-01

    In this thesis, I discuss topological properties of quenched SU(2) lattice gauge fields. In particular, clusters of topological charge density exhibit a power-law. The exponent of that power-law can be used to validate models for lattice gauge fields. Instead of working with fixed cutoffs of the topological charge density, using the notion of a ''watermark'' is more convenient. Furthermore, I discuss how a parallel computer, originally designed for lattice gauge field simulations, can be used for functional magnetic resonance imaging. Multi parameter fits can be parallelized to achieve almost real-time evaluation of fMRI data. (orig.)

  5. Parallel susceptibility testing of bacteria through culture-quantitative PCR in 96-well plates

    Directory of Open Access Journals (Sweden)

    Jun Luo

    2018-05-01

    Full Text Available Objective: The methods combining culture and quantitative PCR (qPCR) offer new solutions for rapid antibiotic susceptibility testing (AST). However, the multiple steps of DNA extraction and the cold storage of PCR reagents they require make them unsuitable for rapid high-throughput AST. In this study, a parallel culture-qPCR method was developed to overcome the above problems. Method: In this method, bacteria culture and DNA extraction are completed automatically and simultaneously by using a common PCR instrument as a controllable heating device. A lyophilized 16S rDNA-targeted qPCR reagent was also developed, which is stable and can be kept at 4 °C for a long time and at 37 °C for about two months. Result: Testing of 36 P. aeruginosa isolates and 28 S. aureus isolates showed that the method had good agreement with the standard broth microdilution method, with an overall agreement of 97.22% (95% CI, 85.83–99.51) for P. aeruginosa and 96.43% (95% CI, 79.76–99.81) for S. aureus. This method can test 12 samples against a panel of up to 7 antibiotics simultaneously in two 96-well PCR plates within 4 h, which greatly improves the testing efficiency of the culture-qPCR method. Conclusion: Being rapid and amenable to automation and multiple-sample testing, the parallel culture-qPCR method has great potential for clinical labs. Keywords: Antibiotic susceptibility testing, Thermo-cold lysis, Lyophilized qPCR reagent, Quantitative PCR, Bacteria

  6. Fort St. Vrain hot functional test results

    International Nuclear Information System (INIS)

    Phelps, R.D.

    1974-01-01

    A description is given of Fort St. Vrain hot functional tests performed to evaluate the initial nonnuclear performance of the primary coolant system and the associated effects on the various internal components of the reactor vessel and primary coolant system. The components included the twelve steam generator modules, the four helium circulators, the PCRV thermal barrier and liner coolant system, the helium purification system, and the primary and secondary closures at each of the PCRV penetrations. Additional objectives included analysis of the parallel operation of the four helium circulators and the performance of several circulator start/stop transients under various conditions of primary coolant temperature and pressure. Vibration and acoustical phenomena within the vessel were measured, recorded, and compared to theoretical analyses; a verification of reverse flow in the shutdown loop steam generator during one loop operation was performed; the PCRV was again observed for its structural response to internal pressure; and comparisons were made relative to data recorded during the initial pressure test completed in July 1971. (U.S.)

  7. Systematic test on fast time resolution parallel plate avalanche counter

    International Nuclear Information System (INIS)

    Chen Yu; Li Guangwu; Gu Xianbao; Chen Yanchao; Zhang Gang; Zhang Wenhui; Yan Guohong

    2011-01-01

    A systematic test of each detection unit of the parallel plate avalanche counter (PPAC) used in the fission multi-parameter measurement was performed with a 241Am α source to obtain the time resolution and position resolution. The detectors work with 600 Pa of flowing isobutane and -600 V on the cathode. The time resolution was obtained by the TOF method and the position resolution by the delay-line method. The time resolution of the detection units is better than 400 ps, and the position resolution is 6 mm. The results show that the requirements of the measurement are fully met. (authors)

  8. Introducing PROFESS 2.0: A parallelized, fully linear scaling program for orbital-free density functional theory calculations

    Science.gov (United States)

    Hung, Linda; Huang, Chen; Shin, Ilgyou; Ho, Gregory S.; Lignères, Vincent L.; Carter, Emily A.

    2010-12-01

    Orbital-free density functional theory (OFDFT) is a first principles quantum mechanics method to find the ground-state energy of a system by variationally minimizing with respect to the electron density. No orbitals are used in the evaluation of the kinetic energy (unlike Kohn-Sham DFT), and the method scales nearly linearly with the size of the system. The PRinceton Orbital-Free Electronic Structure Software (PROFESS) uses OFDFT to model materials from the atomic scale to the mesoscale. This new version of PROFESS allows the study of larger systems with two significant changes: PROFESS is now parallelized, and the ion-electron and ion-ion terms scale quasilinearly, instead of quadratically as in PROFESS v1 (L. Hung and E.A. Carter, Chem. Phys. Lett. 475 (2009) 163). At the start of a run, PROFESS reads the various input files that describe the geometry of the system (ion positions and cell dimensions), the type of elements (defined by electron-ion pseudopotentials), the actions you want it to perform (minimize with respect to electron density and/or ion positions and/or cell lattice vectors), and the various options for the computation (such as which functionals you want it to use). Based on these inputs, PROFESS sets up a computation and performs the appropriate optimizations. Energies, forces, stresses, material geometries, and electron density configurations are some of the values that can be output throughout the optimization. New version program summaryProgram Title: PROFESS Catalogue identifier: AEBN_v2_0 Program summary URL:http://cpc.cs.qub.ac.uk/summaries/AEBN_v2_0.html Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland Licensing provisions: Standard CPC licence, http://cpc.cs.qub.ac.uk/licence/licence.html No. of lines in distributed program, including test data, etc.: 68 721 No. of bytes in distributed program, including test data, etc.: 1 708 547 Distribution format: tar.gz Programming language: Fortran 90 Computer

  9. Vector Green's function algorithm for radiative transfer in plane-parallel atmosphere

    Energy Technology Data Exchange (ETDEWEB)

    Qin Yi [School of Physics, University of New South Wales (Australia)]. E-mail: yi.qin@csiro.au; Box, Michael A. [School of Physics, University of New South Wales (Australia)

    2006-01-15

    Green's function is a widely used approach for boundary value problems. In problems related to radiative transfer, Green's function has been found to be useful in land, ocean and atmosphere remote sensing. It is also a key element in higher order perturbation theory. This paper presents an explicit expression of the Green's function, in terms of the source and radiation field variables, for a plane-parallel atmosphere with either vacuum boundaries or a reflecting (BRDF) surface. Full polarization state is considered but the algorithm has been developed in such way that it can be easily reduced to solve scalar radiative transfer problems, which makes it possible to implement a single set of code for computing both the scalar and the vector Green's function.

  10. Parallel Fixed Point Implementation of a Radial Basis Function Network in an FPGA

    Directory of Open Access Journals (Sweden)

    Alisson C. D. de Souza

    2014-09-01

    Full Text Available This paper proposes a parallel fixed-point radial basis function (RBF) artificial neural network (ANN), implemented in a field programmable gate array (FPGA) and trained online with a least mean square (LMS) algorithm. The processing time and occupied area were analyzed for various fixed-point formats. The precision of the ANN response was also analyzed in the hardware implementation, for nonlinear classification using the XOR gate and for interpolation using the sine function. The entire project was developed using the System Generator platform (Xilinx), with a Virtex-6 xc6vcx240t-1ff1156 as the target FPGA.
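
    The floating-point reference behaviour of such a network (before any fixed-point quantization) can be sketched as follows. This is an editor's NumPy illustration of an RBF layer with fixed Gaussian centres and LMS-trained output weights, applied to the sine interpolation task mentioned in the abstract; it is not the authors' FPGA design, and the centres, width and step size are illustrative choices.

        import numpy as np

        def rbf_features(x, centres, width):
            # Gaussian radial basis functions evaluated at scalar inputs x
            return np.exp(-((x[:, None] - centres[None, :]) ** 2) / (2.0 * width ** 2))

        rng = np.random.default_rng(0)
        centres = np.linspace(0, 2 * np.pi, 10)   # fixed RBF centres
        width = 0.6
        w = np.zeros(len(centres))                # output weights, trained online
        mu = 0.1                                  # LMS step size

        for _ in range(20):                       # a few passes of sample-by-sample LMS
            for x in rng.uniform(0, 2 * np.pi, 200):
                phi = rbf_features(np.array([x]), centres, width)[0]
                err = np.sin(x) - w @ phi         # desired output minus network output
                w += mu * err * phi               # LMS update

        x_test = np.linspace(0, 2 * np.pi, 50)
        y_hat = rbf_features(x_test, centres, width) @ w
        print(np.max(np.abs(y_hat - np.sin(x_test))))   # a small approximation error is expected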

  11. Sensitivity and specificity of parallel or serial serological testing for detection of canine Leishmania infection

    Directory of Open Access Journals (Sweden)

    Mauro Maciel de Arruda

    2016-01-01

    Full Text Available In Brazil, human and canine visceral leishmaniasis (CVL) caused by Leishmania infantum has undergone urbanisation since 1980, constituting a public health problem, and serological tests are the tools of choice for identifying infected dogs. Until recently, the Brazilian zoonoses control program recommended enzyme-linked immunosorbent assays (ELISA) and indirect immunofluorescence assays (IFA) as the screening and confirmatory methods, respectively, for the detection of canine infection. The purpose of this study was to estimate the accuracy of ELISA and IFA in parallel or serial combinations. The reference standard comprised the results of direct visualisation of parasites in histological sections, immunohistochemical testing, or isolation of the parasite in culture. Samples from 98 cases and 1,327 non-cases were included. Individually, the tests presented sensitivities of 91.8% and 90.8% and specificities of 83.4% and 53.4% for the ELISA and IFA, respectively. When the tests were used in parallel combination, sensitivity reached 99.2%, while specificity dropped to 44.8%. When used in serial combination (ELISA followed by IFA), decreased sensitivity (83.3%) and increased specificity (92.5%) were observed. The serial testing approach improved specificity with a moderate loss in sensitivity. This strategy could partially fulfill the needs of public health and dog owners for a more accurate diagnosis of CVL.
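
    The reported combined figures follow, to a good approximation, from the standard formulas for combining two tests, assuming conditional independence of ELISA and IFA given infection status (an editor's sketch; real test results need not satisfy this assumption).

        def parallel_combo(se1, sp1, se2, sp2):
            # "parallel" rule: positive if either test is positive
            se = 1 - (1 - se1) * (1 - se2)
            sp = sp1 * sp2
            return se, sp

        def serial_combo(se1, sp1, se2, sp2):
            # "serial" rule: positive only if both tests are positive
            se = se1 * se2
            sp = 1 - (1 - sp1) * (1 - sp2)
            return se, sp

        elisa = (0.918, 0.834)   # sensitivity, specificity reported for ELISA
        ifa = (0.908, 0.534)     # sensitivity, specificity reported for IFA
        print("parallel:", parallel_combo(*elisa, *ifa))
        print("serial:  ", serial_combo(*elisa, *ifa))

    With the individual values quoted above, these formulas give roughly 99.2%/44.5% for the parallel combination and 83.4%/92.3% for the serial one, close to the combined figures reported in the abstract.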

  12. Parallelism in matrix computations

    CERN Document Server

    Gallopoulos, Efstratios; Sameh, Ahmed H

    2016-01-01

    This book is primarily intended as a research monograph that could also be used in graduate courses for the design of parallel algorithms in matrix computations. It assumes general but not extensive knowledge of numerical linear algebra, parallel architectures, and parallel programming paradigms. The book consists of four parts: (I) Basics; (II) Dense and Special Matrix Computations; (III) Sparse Matrix Computations; and (IV) Matrix functions and characteristics. Part I deals with parallel programming paradigms and fundamental kernels, including reordering schemes for sparse matrices. Part II is devoted to dense matrix computations such as parallel algorithms for solving linear systems, linear least squares, the symmetric algebraic eigenvalue problem, and the singular-value decomposition. It also deals with the development of parallel algorithms for special linear systems such as banded, Vandermonde, Toeplitz, and block Toeplitz systems. Part III addresses sparse matrix computations: (a) the development of pa...

  13. Extraocular muscle function testing

    Science.gov (United States)

    ... medlineplus.gov/ency/article/003397.htm Extraocular muscle function testing examines the function of the eye muscles. ...

  14. Sustainability Attitudes and Behavioral Motivations of College Students: Testing the Extended Parallel Process Model

    Science.gov (United States)

    Perrault, Evan K.; Clark, Scott K.

    2018-01-01

    Purpose: A planet that can no longer sustain life is a frightening thought--and one that is often present in mass media messages. Therefore, this study aims to test the components of a classic fear appeal theory, the extended parallel process model (EPPM) and to determine how well its constructs predict sustainability behavioral intentions. This…

  15. Processing data communications events by awakening threads in parallel active messaging interface of a parallel computer

    Science.gov (United States)

    Archer, Charles J.; Blocksome, Michael A.; Ratterman, Joseph D.; Smith, Brian E.

    2016-03-15

    Processing data communications events in a parallel active messaging interface (`PAMI`) of a parallel computer that includes compute nodes that execute a parallel application, with the PAMI including data communications endpoints, and the endpoints are coupled for data communications through the PAMI and through other data communications resources, including determining by an advance function that there are no actionable data communications events pending for its context, placing by the advance function its thread of execution into a wait state, waiting for a subsequent data communications event for the context; responsive to occurrence of a subsequent data communications event for the context, awakening by the thread from the wait state; and processing by the advance function the subsequent data communications event now pending for the context.
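
    The wait-and-awaken behaviour described here corresponds to the classic condition-variable idiom. The sketch below is an editor's illustration with Python threads, not the PAMI implementation or its API: an advance loop sleeps while no events are pending and is woken when another thread posts one.

        import threading
        from collections import deque

        events = deque()
        cv = threading.Condition()
        done = False

        def advance():
            # the "advance function": process pending events, otherwise wait to be awakened
            while True:
                with cv:
                    while not events and not done:
                        cv.wait()              # thread of execution enters a wait state
                    if done and not events:
                        return
                    ev = events.popleft()
                print("processed", ev)

        def post(ev):
            with cv:
                events.append(ev)
                cv.notify()                    # awaken the waiting advance thread

        t = threading.Thread(target=advance)
        t.start()
        for i in range(3):
            post(f"event-{i}")
        with cv:
            done = True
            cv.notify()
        t.join()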

  16. Comparison of likelihood testing procedures for parallel systems with covariances

    International Nuclear Information System (INIS)

    Ayman Baklizi; Isa Daud; Noor Akma Ibrahim

    1998-01-01

    In this paper we investigate and compare the behavior of the likelihood ratio, Rao and Wald statistics for testing hypotheses on the parameters of the simple linear regression model based on parallel systems with covariances. These statistics are asymptotically equivalent (Barndorff-Nielsen and Cox, 1994); however, their relative performances in finite samples are not generally known. A Monte Carlo experiment is conducted to simulate the sizes and powers of these statistics for complete samples and in the presence of time censoring. Comparisons of the statistics are made according to the attainment of the assumed size of the test and their powers at various points in the parameter space. The results show that the likelihood ratio statistic appears to have the best performance in terms of attaining the assumed size of the test. Power comparisons show that the Rao statistic has some advantage over the Wald statistic in almost all of the space of alternatives, while the likelihood ratio statistic occupies either the first or the last position in terms of power. Overall, the likelihood ratio statistic appears to be the most appropriate for the model under study, especially for small sample sizes

  17. Parallel island genetic algorithm applied to a nuclear power plant auxiliary feedwater system surveillance tests policy optimization

    International Nuclear Information System (INIS)

    Pereira, Claudio M.N.A.; Lapa, Celso M.F.

    2003-01-01

    In this work, we focus on the application of an Island Genetic Algorithm (IGA), a coarse-grained parallel genetic algorithm (PGA) model, to a Nuclear Power Plant (NPP) Auxiliary Feedwater System (AFWS) surveillance tests policy optimization. Here, the main objective is to outline, by means of comparisons, the advantages of the IGA over the simple (non-parallel) genetic algorithm (GA), which has been successfully applied in the solution of this kind of problem. The goal of the optimization is to maximize the system's average availability for a given period of time, considering realistic features such as: i) aging effects on standby components during the tests; ii) failures revealed by the tests imply corrective maintenance, increasing outage times; iii) components have distinct test parameters (outage time, aging factors, etc.); and iv) tests are not necessarily periodic. In our experiments, which were made on a cluster comprised of 8 1-GHz personal computers, we could clearly observe gains not only in the computational time, which decreased linearly with the number of computers, but also in the optimization outcome

  18. Data communications in a parallel active messaging interface of a parallel computer

    Science.gov (United States)

    Archer, Charles J; Blocksome, Michael A; Ratterman, Joseph D; Smith, Brian E

    2013-11-12

    Data communications in a parallel active messaging interface (`PAMI`) of a parallel computer composed of compute nodes that execute a parallel application, each compute node including application processors that execute the parallel application and at least one management processor dedicated to gathering information regarding data communications. The PAMI is composed of data communications endpoints, each endpoint composed of a specification of data communications parameters for a thread of execution on a compute node, including specifications of a client, a context, and a task, the compute nodes and the endpoints coupled for data communications through the PAMI and through data communications resources. Embodiments function by gathering call site statistics describing data communications resulting from execution of data communications instructions and identifying in dependence upon the call site statistics a data communications algorithm for use in executing a data communications instruction at a call site in the parallel application.

  19. PASSPORT-seq: A Novel High-Throughput Bioassay to Functionally Test Polymorphisms in Micro-RNA Target Sites

    Directory of Open Access Journals (Sweden)

    Joseph Ipe

    2018-06-01

    Full Text Available Next-generation sequencing (NGS studies have identified large numbers of genetic variants that are predicted to alter miRNA–mRNA interactions. We developed a novel high-throughput bioassay, PASSPORT-seq, that can functionally test in parallel 100s of these variants in miRNA binding sites (mirSNPs. The results are highly reproducible across both technical and biological replicates. The utility of the bioassay was demonstrated by testing 100 mirSNPs in HEK293, HepG2, and HeLa cells. The results of several of the variants were validated in all three cell lines using traditional individual luciferase assays. Fifty-five mirSNPs were functional in at least one of three cell lines (FDR ≤ 0.05; 11, 36, and 27 of them were functional in HEK293, HepG2, and HeLa cells, respectively. Only four of the variants were functional in all three cell lines, which demonstrates the cell-type specific effects of mirSNPs and the importance of testing the mirSNPs in multiple cell lines. Using PASSPORT-seq, we functionally tested 111 variants in the 3′ UTR of 17 pharmacogenes that are predicted to alter miRNA regulation. Thirty-three of the variants tested were functional in at least one cell line.

  20. Conceptual design and testing strategy of a dual functional lithium-lead test blanket module in ITER and EAST

    International Nuclear Information System (INIS)

    Wu, Y.

    2007-01-01

    A dual functional lithium-lead (DFLL) test blanket module (TBM) concept has been proposed for testing in the International Thermonuclear Experimental Reactor (ITER) and the Experimental Advanced Superconducting Tokamak (EAST) in China to demonstrate the technologies of liquid lithium-lead breeder blankets, with emphasis on the balance between the risks and the potential attractiveness of blanket technology development. The DFLL-TBM design has the flexibility to test both the helium-cooled quasi-static lithium-lead (SLL) blanket concept and the He/PbLi dual-cooled lithium-lead (DLL) blanket concept. This paper presents a testing strategy proposed to achieve the testing targets of the SLL and DLL DEMO blankets under relevant conditions, which includes three parts: materials R and D and small-scale out-of-pile mock-up testing in loops, medium-scale TBM pre-testing in EAST, and full-scale consecutive TBM testing corresponding to the different operation phases of ITER during its first 10 years. The design of the DFLL-TBM concept and the ability of the testing strategy to test TBMs for both blanket concepts, in sequence and/or in parallel, in both ITER and EAST are discussed.

  1. Distributed Cooperative Current-Sharing Control of Parallel Chargers Using Feedback Linearization

    Directory of Open Access Journals (Sweden)

    Jiangang Liu

    2014-01-01

    Full Text Available We propose a distributed current-sharing scheme to address the output current imbalance problem for the parallel chargers in an energy storage type light rail vehicle system. By treating the parallel chargers as a group of agents that share output information through a communication network, the current-sharing control problem is recast as a consensus tracking problem for multiple agents. To facilitate the design, input-output feedback linearization is first applied to transform the nonidentical nonlinear charging system models into first-order integrators. Then, a general saturation function is introduced to design the cooperative current-sharing control law, which guarantees the boundedness of the proposed control. The cooperative stability of the closed-loop system under fixed and dynamic communication topologies is rigorously proved with the aid of a Lyapunov function and the LaSalle invariance principle. Simulation using a multicharging test system further illustrates that the output currents of the parallel chargers are balanced using the proposed control.
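
    A minimal sketch of the current-sharing idea, after feedback linearization has reduced each charger to an integrator: each agent nudges its output current toward the currents reported by its communication neighbours through a saturated consensus term. The gains, saturation limit, leader assignment and ring topology below are illustrative assumptions, not the parameters of the cited controller.

    ```python
    import numpy as np

    def saturate(x, limit):
        """Generic saturation so the control input stays bounded."""
        return np.clip(x, -limit, limit)

    def simulate_current_sharing(i0, adjacency, i_ref, gain=2.0, sat=5.0,
                                 dt=0.01, steps=2000):
        """Consensus current sharing for integrator-modelled chargers.

        di_k/dt = u_k, with u_k = -gain * sat( sum_j a_kj (i_k - i_j) + leader term ).
        Agent 0 is assumed to know the reference current i_ref.
        """
        currents = np.array(i0, dtype=float)
        n = len(currents)
        for _ in range(steps):
            u = np.zeros(n)
            for k in range(n):
                disagreement = sum(adjacency[k][j] * (currents[k] - currents[j])
                                   for j in range(n))
                if k == 0:                       # only the leader sees the reference
                    disagreement += currents[k] - i_ref
                u[k] = -gain * saturate(disagreement, sat)
            currents += dt * u
        return currents

    # Four chargers on a ring network, badly unbalanced initial currents.
    ring = [[0, 1, 0, 1], [1, 0, 1, 0], [0, 1, 0, 1], [1, 0, 1, 0]]
    print(simulate_current_sharing([10.0, 2.0, 7.0, 1.0], ring, i_ref=5.0))
    ```

    With these illustrative gains all four currents settle near the 5 A reference, which is the balanced-sharing behaviour the abstract describes.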

  2. Development of imaging and reconstruction algorithms on parallel processing architectures for applications in non-destructive testing

    International Nuclear Information System (INIS)

    Pedron, Antoine

    2013-01-01

    This thesis work is placed between the scientific domain of ultrasound non-destructive testing and algorithm-architecture adequation. Ultrasound non-destructive testing includes a group of analysis techniques used in science and industry to evaluate the properties of a material, component, or system without causing damage. In order to characterise possible defects, determining their position, size and shape, imaging and reconstruction tools have been developed at CEA-LIST, within the CIVA software platform. The evolution of acquisition sensors implies a continuous growth of datasets, and consequently more and more computing power is needed to maintain interactive reconstructions. General purpose processors (GPP) evolving towards parallelism and emerging architectures such as GPUs offer large acceleration possibilities that can be applied to these algorithms. The main goal of the thesis is to evaluate the acceleration that can be obtained for two reconstruction algorithms on these architectures. The two algorithms differ in their parallelization scheme. The first one can be properly parallelized on GPP, whereas on GPU an intensive use of atomic instructions is required. Within the second algorithm, parallelism is easier to express, but loop ordering on GPP, as well as thread scheduling and a good use of shared memory on GPU, are necessary in order to obtain efficient results. Different APIs and libraries, such as OpenMP, CUDA and OpenCL, are evaluated through chosen benchmarks. An integration of both algorithms into the CIVA software platform is proposed, and different issues related to code maintenance and durability are discussed. (author) [fr

  3. Parallelization methods study of thermal-hydraulics codes

    International Nuclear Information System (INIS)

    Gaudart, Catherine

    2000-01-01

    The variety of parallelization methods and machines leads to a wide range of choices for programmers. In this study we suggest, in an industrial context, some solutions drawn from the experience acquired with different parallelization methods. The study covers several scientific codes which simulate a large variety of thermal-hydraulics phenomena. A survey of parallelization methods and a first analysis of the codes showed the difficulty of applying our process to the applications as a whole. It was therefore necessary to identify and extract a representative part of these applications and parallelization methods. The linear solver part of the codes imposed itself as that representative part, and several parallelization methods have been applied to it. From these developments one can estimate the work required for a non-specialist programmer to parallelize an application, and the impact of the development constraints. The different parallelization methods tested are the numerical library PETSc, the parallelizer PAF, the language HPF, the formalism PEI and the communication libraries MPI and PVM. In order to test several methods on different applications and to respect the constraint of minimizing the modifications to the codes, a tool called SPS (Server of Parallel Solvers) has been developed. We propose to describe the different constraints on the optimization of codes in an industrial context, to present the solutions provided by the SPS tool, to show the development of the linear solver part with the tested parallelization methods and, lastly, to compare the results against the imposed criteria. (author) [fr

  4. Parallel Implementation of the Discrete Green's Function Formulation of the FDTD Method on a Multicore Central Processing Unit

    Directory of Open Access Journals (Sweden)

    T. Stefański

    2014-12-01

    Full Text Available Parallel implementation of the discrete Green's function formulation of the finite-difference time-domain (DGF-FDTD method was developed on a multicore central processing unit. DGF-FDTD avoids computations of the electromagnetic field in free-space cells and does not require domain termination by absorbing boundary conditions. Computed DGF-FDTD solutions are compatible with the FDTD grid enabling the perfect hybridization of FDTD with the use of time-domain integral equation methods. The developed implementation can be applied to simulations of antenna characteristics. For the sake of example, arrays of Yagi-Uda antennas were simulated with the use of parallel DGF-FDTD. The efficiency of parallel computations was investigated as a function of the number of current elements in the FDTD grid. Although the developed method does not apply the fast Fourier transform for convolution computations, advantages stemming from the application of DGF-FDTD instead of FDTD can be demonstrated for one-dimensional wire antennas when simulation results are post-processed by the near-to-far-field transformation.

  5. Balanced, parallel operation of flashlamps

    International Nuclear Information System (INIS)

    Carder, B.M.; Merritt, B.T.

    1979-01-01

    A new energy store, the Compensated Pulsed Alternator (CPA), promises to be a cost-effective substitute for capacitors to drive flashlamps that pump large Nd:glass lasers. Because the CPA is large and discrete, it will be necessary for it to drive many parallel flashlamp circuits, presenting a problem in equal current distribution. Current division to ±20% between parallel flashlamps has been achieved, but this is marginal for laser pumping. A method is presented here that provides equal current sharing to about 1%, and it includes fused protection against short-circuit faults. The method was tested with eight parallel circuits, including both open-circuit and short-circuit fault tests.

  6. Parallel computations of molecular dynamics trajectories using the stochastic path approach

    Science.gov (United States)

    Zaloj, Veaceslav; Elber, Ron

    2000-06-01

    A novel protocol to parallelize molecular dynamics trajectories is discussed and tested on a cluster of PCs running the NT operating system. The new technique does not propagate the solution in small time steps, but uses instead a global optimization of a functional of the whole trajectory. The new approach is especially attractive for parallel and distributed computing and its advantages (and disadvantages) are presented. Two numerical examples are discussed: (a) A conformational transition in a solvated dipeptide, and (b) The R→T conformational transition in solvated hemoglobin.

  7. Combining Compile-Time and Run-Time Parallelization

    Directory of Open Access Journals (Sweden)

    Sungdo Moon

    1999-01-01

    Full Text Available This paper demonstrates that significant improvements to automatic parallelization technology require that existing systems be extended in two ways: (1) they must combine high‐quality compile‐time analysis with low‐cost run‐time testing; and (2) they must take control flow into account during analysis. We support this claim with the results of an experiment that measures the safety of parallelization at run time for loops left unparallelized by the Stanford SUIF compiler’s automatic parallelization system. We present results of measurements on programs from two benchmark suites – SPECFP95 and NAS sample benchmarks – which identify inherently parallel loops in these programs that are missed by the compiler. We characterize remaining parallelization opportunities, and find that most of the loops require run‐time testing, analysis of control flow, or some combination of the two. We present a new compile‐time analysis technique that can be used to parallelize most of these remaining loops. This technique is designed to not only improve the results of compile‐time parallelization, but also to produce low‐cost, directed run‐time tests that allow the system to defer binding of parallelization until run‐time when safety cannot be proven statically. We call this approach predicated array data‐flow analysis. We augment array data‐flow analysis, which the compiler uses to identify independent and privatizable arrays, by associating predicates with array data‐flow values. Predicated array data‐flow analysis allows the compiler to derive “optimistic” data‐flow values guarded by predicates; these predicates can be used to derive a run‐time test guaranteeing the safety of parallelization.
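
    A toy illustration of the deferred, run-time safety test the abstract describes: the compiler cannot prove that the scatter writes below are independent, so it emits a cheap predicate that is evaluated at run time and selects the parallel path only when it holds. The helper names and the use of a thread pool are assumptions made for the sketch, not the SUIF implementation.

    ```python
    from concurrent.futures import ThreadPoolExecutor

    def indices_are_independent(idx):
        """Run-time predicate: the loop is safe to parallelize iff no index repeats."""
        return len(set(idx)) == len(idx)

    def scaled_scatter(src, idx, factor):
        """out[idx[i]] = src[i] * factor, run in parallel only when proven safe at run time."""
        out = [0.0] * len(src)

        def body(i):
            out[idx[i]] = src[i] * factor

        if indices_are_independent(idx):           # predicate emitted by the "compiler"
            with ThreadPoolExecutor() as pool:     # safe: iterations touch disjoint slots
                list(pool.map(body, range(len(src))))
        else:
            for i in range(len(src)):              # fall back to the sequential loop
                body(i)
        return out

    print(scaled_scatter([1.0, 2.0, 3.0], [2, 0, 1], factor=10))   # parallel path taken
    print(scaled_scatter([1.0, 2.0, 3.0], [0, 0, 1], factor=10))   # duplicate index, sequential
    ```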

  8. Pancreatic Exocrine Function Testing

    OpenAIRE

    Berk, J. Edward

    1982-01-01

    It is important to understand which pancreatic function tests are available and how to interpret them when evaluating patients with malabsorption. Available direct tests are the secretin stimulation test, the Lundh test meal, and measurement of serum or fecal enzymes. Indirect tests assess pancreatic exocrine function by measuring the effect of pancreatic secretion on various nutrients. These include triglycerides labeled with carbon 14, cobalamin labeled with cobalt 57 and cobalt 58, and par...

  9. Green's function for electrons in a narrow quantum well in a parallel magnetic field

    International Nuclear Information System (INIS)

    Horing, Norman J. Morgenstern; Glasser, M. Lawrence; Dong Bing

    2005-01-01

    Electron dynamics in a narrow quantum well in a parallel magnetic field of arbitrary strength are examined here. We derive an explicit analytical closed-form solution for the Green's function of Landau-quantized electrons in skipping states of motion between the narrow well walls coupled with in-plane translational motion and hybridized with the zero-field lowest subband energy eigenstate. Such Landau-quantized modes are not uniformly spaced

  10. Parallel hole collimator acceptance tests for SPECT and planar studies

    International Nuclear Information System (INIS)

    Babicheva, R.R.; Bennie, D.N.; Collins, L.T.; Gruenwald, S.M.

    1998-01-01

    Full text: Different kinds of collimator damage can occur either during shipping or from regular use. Imperfections of construction along the strips or their connections give rise to hole alignments that are not perpendicular to the crystal face and can produce problems such as ring artifacts and image degradation. Gamma camera collimator hole alignments and integrity were compared in four parallel hole high resolution collimators: two new cast and two used foil collimators, one with damage to the protective surface. [1] The point source flood image of the defective collimator was non-circular, as were the images of the cast collimators; the image of the new foil collimator was circular. [2] A high count sheet flood did not show any imperfections. [3] A bone mineral densitometer was used to produce a collimated X-ray beam; the collimator was placed on the scanning bed with an X-ray cassette directly above it. The damaged area was well demonstrated. [4] The COR offset test was taken at two extreme radii. The offset value with the defective collimator increased by 0.53 pixel, or 129%, as the radius increased from 14 cm to 28 cm. [5] The collimator hole alignment test involves performing multiple measurements of COR along the length of the collimator and checking for variations in COR with both position of the source and angle of rotation. The maximum variation in COR for the defective collimator was 1.13 mm. Collimators require testing when new, at regular intervals, and following damage. The point source test can be used for foil collimators. The most sensitive tests were the collimated X-ray source, the COR offset test and the collimator hole alignment test.

  11. Parallel Monte Carlo reactor neutronics

    International Nuclear Information System (INIS)

    Blomquist, R.N.; Brown, F.B.

    1994-01-01

    The issues affecting implementation of parallel algorithms for large-scale engineering Monte Carlo neutron transport simulations are discussed. For nuclear reactor calculations, these include load balancing, recoding effort, reproducibility, domain decomposition techniques, I/O minimization, and strategies for different parallel architectures. Two codes were parallelized and tested for performance. The architectures employed include SIMD, MIMD-distributed memory, and workstation network with uneven interactive load. Speedups linear with the number of nodes were achieved

  12. Electromagnetic ion-cyclotron instability in the presence of a parallel electric field with general loss-cone distribution function - particle aspect analysis

    Directory of Open Access Journals (Sweden)

    G. Ahirwar

    2006-08-01

    Full Text Available The effect of a parallel electric field on the growth rate, the parallel and perpendicular resonant energies, and the marginal stability of the electromagnetic ion-cyclotron (EMIC) wave with a general loss-cone distribution function in a low-β homogeneous plasma is investigated by a particle aspect approach. The effect of the steepness of the loss-cone distribution on the electromagnetic ion-cyclotron wave is also investigated. The whole plasma is considered to consist of resonant and non-resonant particles. It is assumed that resonant particles participate in the energy exchange with the wave, whereas non-resonant particles support the oscillatory motion of the wave. The wave is assumed to propagate parallel to the static magnetic field. The effect of the parallel electric field with the general distribution function is to control the growth rate of the EMIC waves, whereas the effect of a steep loss-cone distribution is to enhance the growth rate and the perpendicular heating of the ions. This study is relevant to the analysis of ion conics in the presence of an EMIC wave in the auroral acceleration region of the Earth's magnetoplasma.

  13. Investigation of the applicability of a functional programming model to fault-tolerant parallel processing for knowledge-based systems

    Science.gov (United States)

    Harper, Richard

    1989-01-01

    In a fault-tolerant parallel computer, a functional programming model can facilitate distributed checkpointing, error recovery, load balancing, and graceful degradation. Such a model has been implemented on the Draper Fault-Tolerant Parallel Processor (FTPP). When used in conjunction with the FTPP's fault detection and masking capabilities, this implementation results in a graceful degradation of system performance after faults. Three graceful degradation algorithms have been implemented and are presented. A user interface has been implemented which requires minimal cognitive overhead by the application programmer, masking such complexities as the system's redundancy, distributed nature, variable complement of processing resources, load balancing, fault occurrence and recovery. This user interface is described and its use demonstrated. The applicability of the functional programming style to the Activation Framework, a paradigm for intelligent systems, is then briefly described.

  14. Analysis of clinical complication data for radiation hepatitis using a parallel architecture model

    International Nuclear Information System (INIS)

    Jackson, A.; Haken, R.K. ten; Robertson, J.M.; Kessler, M.L.; Kutcher, G.J.; Lawrence, T.S.

    1995-01-01

    Purpose: The detailed knowledge of dose volume distributions available from the three-dimensional (3D) conformal radiation treatment of tumors in the liver (reported elsewhere) offers new opportunities to quantify the effect of volume on the probability of producing radiation hepatitis. We aim to test a new parallel architecture model of normal tissue complication probability (NTCP) with these data. Methods and Materials: Complication data and dose volume histograms from a total of 93 patients with normal liver function, treated on a prospective protocol with 3D conformal radiation therapy and intraarterial hepatic fluorodeoxyuridine, were analyzed with a new parallel architecture model. Patient treatment fell into six categories differing in doses delivered and volumes irradiated. By modeling the radiosensitivity of liver subunits, we are able to use dose volume histograms to calculate the fraction of the liver damaged in each patient. A complication results if this fraction exceeds the patient's functional reserve. To determine the patient distribution of functional reserves and the subunit radiosensitivity, the maximum likelihood method was used to fit the observed complication data. Results: The parallel model fit the complication data well, although uncertainties on the functional reserve distribution and subunit radiosensitivity are highly correlated. Conclusion: The observed radiation hepatitis complications show a threshold effect that can be described well with a parallel architecture model. However, additional independent studies are required to better determine the parameters defining the functional reserve distribution and subunit radiosensitivity
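
    The parallel-architecture calculation outlined above can be sketched as follows: each liver subunit is inactivated with a dose-dependent probability, the dose-volume histogram gives the fraction of subunits at each dose, and a complication is predicted when the damaged fraction exceeds the patient's functional reserve. The logistic form of the subunit response and all numerical parameters below are illustrative assumptions, not the fitted values from the study.

    ```python
    import math

    def subunit_damage_prob(dose, d50=35.0, k=0.15):
        """Assumed logistic dose-response of a single liver subunit (dose in Gy)."""
        return 1.0 / (1.0 + math.exp(-k * (dose - d50)))

    def damaged_fraction(dvh):
        """dvh: list of (dose_in_gy, fractional_volume) bins whose volumes sum to 1."""
        return sum(vol * subunit_damage_prob(dose) for dose, vol in dvh)

    def complication_predicted(dvh, functional_reserve):
        """Parallel model: complication iff the damaged fraction exceeds the reserve."""
        return damaged_fraction(dvh) > functional_reserve

    # Hypothetical differential DVH: 40% of the liver at 60 Gy, 60% nearly spared.
    dvh = [(60.0, 0.4), (5.0, 0.6)]
    print(round(damaged_fraction(dvh), 3))
    print(complication_predicted(dvh, functional_reserve=0.5))
    ```

    In the maximum-likelihood fit described in the abstract, the reserve would be drawn from a patient population distribution rather than fixed per patient.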

  15. Functional Task Test (FTT)

    Science.gov (United States)

    Bloomberg, Jacob J.; Mulavara, Ajitkumar; Peters, Brian T.; Rescheke, Millard F.; Wood, Scott; Lawrence, Emily; Koffman, Igor; Ploutz-Snyder, Lori; Spiering, Barry A.; Feeback, Daniel L.; hide

    2009-01-01

    This slide presentation reviews the Functional Task Test (FTT), an interdisciplinary testing regimen that has been developed to evaluate astronaut postflight functional performance and related physiological changes. The objectives of the project are: (1) to develop a set of functional tasks that represent critical mission tasks for the Constellation Program, (2) determine the ability to perform these tasks after space flight, (3) Identify the key physiological factors that contribute to functional decrements and (4) Use this information to develop targeted countermeasures.

  16. Overview of the Force Scientific Parallel Language

    Directory of Open Access Journals (Sweden)

    Gita Alaghband

    1994-01-01

    Full Text Available The Force parallel programming language designed for large-scale shared-memory multiprocessors is presented. The language provides a number of parallel constructs as extensions to the ordinary Fortran language and is implemented as a two-level macro preprocessor to support portability across shared memory multiprocessors. The global parallelism model on which the Force is based provides a powerful parallel language. The parallel constructs, generic synchronization, and freedom from process management supported by the Force has resulted in structured parallel programs that are ported to the many multiprocessors on which the Force is implemented. Two new parallel constructs for looping and functional decomposition are discussed. Several programming examples to illustrate some parallel programming approaches using the Force are also presented.

  17. High temporal resolution magnetic resonance imaging: development of a parallel three dimensional acquisition method for functional neuroimaging

    International Nuclear Information System (INIS)

    Rabrait, C.

    2007-11-01

    Echo Planar Imaging is widely used to perform data acquisition in functional neuroimaging. This sequence allows the acquisition of a set of about 30 slices, covering the whole brain, at a spatial resolution ranging from 2 to 4 mm, and a temporal resolution ranging from 1 to 2 s. It is thus well adapted to the mapping of activated brain areas but does not allow precise study of brain dynamics. Moreover, temporal interpolation is needed in order to correct for inter-slice delays, and 2-dimensional acquisition is subject to vascular inflow artifacts. To improve the estimation of the hemodynamic response functions associated with activation, this thesis aimed at developing a 3-dimensional high temporal resolution acquisition method. To do so, Echo Volume Imaging was combined with reduced field-of-view acquisition and parallel imaging. Indeed, E.V.I. allows the acquisition of a whole volume in Fourier space following a single excitation, but it requires very long echo trains. Parallel imaging and field-of-view reduction are used to reduce the echo train durations by a factor of 4, which allows the acquisition of a 3-dimensional brain volume with limited susceptibility-induced distortions and signal losses, in 200 ms. All imaging parameters have been optimized in order to reduce echo train durations and to maximize S.N.R., so that cerebral activation can be detected with a high level of confidence. Robust detection of brain activation was demonstrated with both visual and auditory paradigms. High temporal resolution hemodynamic response functions could be estimated through selective averaging of the response to the different trials of the stimulation. To further improve S.N.R., the matrix inversions required in parallel reconstruction were regularized, and the impact of the level of regularization on activation detection was investigated. Eventually, potential applications of parallel E.V.I. such as the study of non-stationary effects in the B.O.L.D. response

  18. Parallelizing the spectral transform method: A comparison of alternative parallel algorithms

    International Nuclear Information System (INIS)

    Foster, I.; Worley, P.H.

    1993-01-01

    The spectral transform method is a standard numerical technique for solving partial differential equations on the sphere and is widely used in global climate modeling. In this paper, we outline different approaches to parallelizing the method and describe experiments that we are conducting to evaluate the efficiency of these approaches on parallel computers. The experiments are conducted using a testbed code that solves the nonlinear shallow water equations on a sphere, but are designed to permit evaluation in the context of a global model. They allow us to evaluate the relative merits of the approaches as a function of problem size and number of processors. The results of this study are guiding ongoing work on PCCM2, a parallel implementation of the Community Climate Model developed at the National Center for Atmospheric Research

  19. The convergence of parallel Boltzmann machines

    NARCIS (Netherlands)

    Zwietering, P.J.; Aarts, E.H.L.; Eckmiller, R.; Hartmann, G.; Hauske, G.

    1990-01-01

    We discuss the main results obtained in a study of a mathematical model of synchronously parallel Boltzmann machines. We present supporting evidence for the conjecture that a synchronously parallel Boltzmann machine maximizes a consensus function that consists of a weighted sum of the regular

  20. Parallel computation of nondeterministic algorithms in VLSI

    Energy Technology Data Exchange (ETDEWEB)

    Hortensius, P D

    1987-01-01

    This work examines parallel VLSI implementations of nondeterministic algorithms. It is demonstrated that conventional pseudorandom number generators are unsuitable for highly parallel applications. Efficient parallel pseudorandom sequence generation can be accomplished using certain classes of elementary one-dimensional cellular automata. The pseudorandom numbers appear in parallel on each clock cycle. Extensive study of the properties of these new pseudorandom number generators is made using standard empirical random number tests, cycle length tests, and implementation considerations. Furthermore, it is shown these particular cellular automata can form the basis of efficient VLSI architectures for computations involved in the Monte Carlo simulation of both the percolation and Ising models from statistical mechanics. Finally, a variation on a Built-In Self-Test technique based upon cellular automata is presented. These Cellular Automata-Logic-Block-Observation (CALBO) circuits improve upon conventional design for testability circuitry.
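
    As an illustration of the idea, the sketch below steps an elementary one-dimensional cellular automaton (rule 30 is used here as a representative choice) and reads one pseudorandom bit per cell per clock cycle, so all bits of a word appear in parallel. The rule number, register width and seeding are assumptions made for the sketch, not the specific automata classes identified in the thesis.

    ```python
    def ca_step(state, rule=30):
        """One synchronous update of an elementary CA with periodic boundaries."""
        n = len(state)
        return [(rule >> (state[(i - 1) % n] << 2 | state[i] << 1 | state[(i + 1) % n])) & 1
                for i in range(n)]

    def ca_random_words(width=16, n_words=4, seed_cell=None):
        """Generate pseudorandom words; every cell contributes one bit per cycle."""
        state = [0] * width
        state[seed_cell if seed_cell is not None else width // 2] = 1   # single-seed start
        words = []
        for _ in range(n_words):
            state = ca_step(state)
            words.append(int("".join(map(str, state)), 2))              # all bits in parallel
        return words

    print(ca_random_words())
    ```

    In the VLSI setting each cell is a flip-flop plus a small amount of combinational logic, which is why a full word of random bits is available on every clock cycle.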

  1. Parallel Boltzmann machines : a mathematical model

    NARCIS (Netherlands)

    Zwietering, P.J.; Aarts, E.H.L.

    1991-01-01

    A mathematical model is presented for the description of parallel Boltzmann machines. The framework is based on the theory of Markov chains and combines a number of previously known results into one generic model. It is argued that parallel Boltzmann machines maximize a function consisting of a

  2. Shared Variable Oriented Parallel Precompiler for SPMD Model

    Institute of Scientific and Technical Information of China (English)

    1995-01-01

    For the moment, commercial parallel computer systems with distributed memory architecture are usually provided with parallel FORTRAN or parallel C compilers, which are just traditional sequential FORTRAN or C compilers expanded with communication statements. Programmers suffer from writing parallel programs with communication statements. The Shared Variable Oriented Parallel Precompiler (SVOPP) proposed in this paper can automatically generate appropriate communication statements based on shared variables for the SPMD (Single Program Multiple Data) computation model and greatly eases parallel programming while achieving high communication efficiency. The core functionality of the parallel C precompiler has been successfully verified on a transputer-based parallel computer. Its prominent performance shows that SVOPP is probably a breakthrough in parallel programming technique.

  3. Pancreatic exocrine function testing

    International Nuclear Information System (INIS)

    Goff, J.S.

    1981-01-01

    It is important to understand which pancreatic function tests are available and how to interpret them when evaluating patients with malabsorption. Available direct tests are the secretin stimulation test, the Lundh test meal, and measurement of serum or fecal enzymes. Indirect tests assess pancreatic exocrine function by measuring the effect of pancreatic secretion on various nutrients. These include triglycerides labeled with carbon 14, cobalamin labeled with cobalt 57 and cobalt 58, and para-aminobenzoic acid bound to a dipeptide. Of all these tests the secretin stimulation test is the most accurate and reliable if done by experienced personnel. However, the indirect tests are simpler to do and appear to be comparable to the secretin test at detecting pancreatic exocrine insufficiency. These indirect tests are becoming clinically available and clinicians should familiarize themselves with the strengths and weaknesses of each

  4. A poloidal non-uniformity of the collisionless parallel current in a tokamak plasma

    Energy Technology Data Exchange (ETDEWEB)

    Romannikov, A.; Fenzi-Bonizec, C

    2005-07-01

    The collisionless distortion of the ion (electron) distribution function at certain points on a magnetic surface is studied in the framework of a simple model of a large aspect ratio tokamak plasma. The flow velocity driven by this distortion is calculated. The possibility of an additional non-uniform collisionless parallel current density on a magnetic surface, other than the known neo-classical non-uniformity is shown. The difference between the parallel current density on the low and high field side of a magnetic surface is close to the neoclassical bootstrap current density. The first Tore-Supra experimental test indicates the possibility of the poloidal non-uniformity of the parallel current density. (authors)

  5. NASA's Functional Task Test: Providing Information for an Integrated Countermeasure System

    Science.gov (United States)

    Bloomberg, J. J.; Feiveson, A. H.; Laurie, S. S.; Lee, S. M. C.; Mulavara, A. P.; Peters, B. T.; Platts, S. H.; Ploutz-Snyder, L. L.; Reschke, M. F.; Ryder, J. W.; hide

    2015-01-01

    postural stability (i.e. hatch opening, ladder climb, manual manipulation of objects and tool use) showed little reduction in performance. These changes in functional performance were paralleled by similar decrements in sensorimotor tests designed to specifically assess postural equilibrium and dynamic gait control. Bed rest subjects experienced similar deficits both in functional tests with balance challenges and in sensorimotor tests designed to evaluate postural and gait control as spaceflight subjects indicating that body support unloading experienced during spaceflight plays a central role in post-flight alteration of functional task performance. To determine how differences in body-support loading experienced during in-flight treadmill exercise affect postflight functional performance, the loading history for each subject during in-flight treadmill (T2) exercise was correlated with postflight measures of performance. ISS crewmembers who walked on the treadmill with higher pull-down loads had enhanced post-flight performance on tests requiring mobility. Taken together the spaceflight and bed rest data point to the importance of supplementing inflight exercise countermeasures with balance and sensorimotor adaptability training. These data also support the notion that inflight treadmill exercise performed with higher body loading provides sensorimotor benefits leading to improved performance on functional tasks that require dynamic postural stability and mobility.

  6. Vectorization, parallelization and porting of nuclear codes. Vectorization and parallelization. Progress report fiscal 1999

    Energy Technology Data Exchange (ETDEWEB)

    Adachi, Masaaki; Ogasawara, Shinobu; Kume, Etsuo [Japan Atomic Energy Research Inst., Tokai, Ibaraki (Japan). Tokai Research Establishment; Ishizuki, Shigeru; Nemoto, Toshiyuki; Kawasaki, Nobuo; Kawai, Wataru [Fujitsu Ltd., Tokyo (Japan); Yatake, Yo-ichi [Hitachi Ltd., Tokyo (Japan)

    2001-02-01

    Several computer codes in the nuclear field have been vectorized, parallelized and ported to the FUJITSU VPP500 system, the AP3000 system, the SX-4 system and the Paragon system at the Center for Promotion of Computational Science and Engineering of the Japan Atomic Energy Research Institute. We dealt with 18 codes in fiscal 1999. These results are reported in three parts, i.e., the vectorization and parallelization part on vector processors, the parallelization part on scalar processors, and the porting part. In this report, we describe the vectorization and parallelization on vector processors. In this part, the vectorization of the Relativistic Molecular Orbital Calculation code RSCAT, the microscopic transport code for high-energy nuclear collisions JAM, the three-dimensional non-steady thermal-fluid analysis code STREAM, the Relativistic Density Functional Theory code RDFT and the High Speed Three-Dimensional Nodal Diffusion code MOSRA-Light on the VPP500 system and the SX-4 system is described. (author)

  7. Mixed-meal tolerance test versus glucagon stimulation test for the assessment of beta-cell function in therapeutic trials in type 1 diabetes.

    Science.gov (United States)

    Greenbaum, Carla J; Mandrup-Poulsen, Thomas; McGee, Paula Friedenberg; Battelino, Tadej; Haastert, Burkhard; Ludvigsson, Johnny; Pozzilli, Paolo; Lachin, John M; Kolb, Hubert

    2008-10-01

    Beta-cell function in type 1 diabetes clinical trials is commonly measured by C-peptide response to a secretagogue in either a mixed-meal tolerance test (MMTT) or a glucagon stimulation test (GST). The Type 1 Diabetes TrialNet Research Group and the European C-peptide Trial (ECPT) Study Group conducted parallel randomized studies to compare the sensitivity, reproducibility, and tolerability of these procedures. In randomized sequences, 148 TrialNet subjects completed 549 tests with up to 2 MMTT and 2 GST tests on separate days, and 118 ECPT subjects completed 348 tests (up to 3 each) with either two MMTTs or two GSTs. Among individuals with up to 4 years' duration of type 1 diabetes, >85% had measurable stimulated C-peptide values. The MMTT stimulus produced significantly higher concentrations of C-peptide than the GST. Whereas both tests were highly reproducible, the MMTT was significantly more so (R(2) = 0.96 for peak C-peptide response). Overall, the majority of subjects preferred the MMTT, and there were few adverse events. Some older subjects preferred the shorter duration of the GST. Nausea was reported in the majority of GST studies, particularly in the young age-group. The MMTT is preferred for the assessment of beta-cell function in therapeutic trials in type 1 diabetes.

  8. An Experimental Study of Natural Circulation in a Loop with Parallel Flow Test Sections

    Energy Technology Data Exchange (ETDEWEB)

    Mathisen, R P; Eklind, O

    1965-10-15

    The dynamic behaviour of a natural circulation loop with parallel round duct channels has been studied. The test sections were both electrically heated and the power distribution was uniform along the 4300 mm heated length of the 20 mm dia. channels. The inter-channel interference and the threshold of flow instability were obtained by using a dynamically calibrated flowmeter in each channel. The pressure was 50 bars and the sub-cooling 6 deg C. The main parameters varied were the flow restrictions in the one-phase and two-phase sections. The instability data were correlated to the resistance coefficients due to these restrictions. Theoretical calculations for parallel channels in natural circulation have been compared with the experimental results. For the conditions determined by the above-mentioned magnitudes, the steady-state computations are in excellent agreement with experiment. The transients are also nearly similar, except for the resonance frequency, which for the theoretical case is higher by an amount between 0.3 and 0.5 c.p.s.

  9. An Experimental Study of Natural Circulation in a Loop with Parallel Flow Test Sections

    International Nuclear Information System (INIS)

    Mathisen, R.P.; Eklind, O.

    1965-10-01

    The dynamic behaviour of a natural circulation loop with parallel round duct channels has been studied. The test sections were both electrically heated and the power distribution was uniform along the 4300 mm heated length of the 20 mm dia. channels. The inter-channel interference and the threshold of flow instability were obtained by using a dynamically calibrated flowmeter in each channel. The pressure was 50 bars and the sub-cooling 6 deg C. The main parameters varied were the flow restrictions in the one-phase and two-phase sections. The instability data were correlated to the resistance coefficients due to these restrictions. Theoretical calculations for parallel channels in natural circulation have been compared with the experimental results. For the conditions determined by the above-mentioned magnitudes, the steady-state computations are in excellent agreement with experiment. The transients are also nearly similar, except for the resonance frequency, which for the theoretical case is higher by an amount between 0.3 and 0.5 c.p.s.

  10. Advanced parallel processing with supercomputer architectures

    International Nuclear Information System (INIS)

    Hwang, K.

    1987-01-01

    This paper investigates advanced parallel processing techniques and innovative hardware/software architectures that can be applied to boost the performance of supercomputers. Critical issues on architectural choices, parallel languages, compiling techniques, resource management, concurrency control, programming environment, parallel algorithms, and performance enhancement methods are examined and the best answers are presented. The authors cover advanced processing techniques suitable for supercomputers, high-end mainframes, minisupers, and array processors. The coverage emphasizes vectorization, multitasking, multiprocessing, and distributed computing. In order to achieve these operation modes, parallel languages, smart compilers, synchronization mechanisms, load balancing methods, mapping parallel algorithms, operating system functions, application library, and multidiscipline interactions are investigated to ensure high performance. At the end, they assess the potentials of optical and neural technologies for developing future supercomputers

  11. LUNG FUNCTION TESTING IN CHILDREN

    Directory of Open Access Journals (Sweden)

    Matjaž Fležar

    2004-03-01

    Full Text Available Background. Lung function testing in children above five years of age is standardised similarly to that in the adult population (1). Nevertheless, bronchial provocation testing can be more hazardous, since the calibre and reactivity of the childhood airway are different. We analysed the frequency of different lung function testing procedures and addressed the safety issues of bronchial provocation testing in children. Methods. We analysed lung function testing results in 517 children, older than 5 years, tested in our laboratory over a three-year period. Spirometry was done in every patient; a methacholine provocation test was used as part of the diagnostic work-up of suspected asthma. In case of airway obstruction, a bronchodilator test with salbutamol was used instead of a methacholine provocation test. Results. The most common procedure in children was spirometry with a bronchial provocation test as part of the diagnostic work-up of obstructive syndrome (mostly asthma). 291 children required a methacholine test and 153 tests were interpreted as positive. The decline in expiratory flow (forced expiratory volume in the first second, FEV1) in positive tests was greater than in the adult population, as was the dose of methacholine needed to induce bronchoconstriction. The compliance of children was better than that of adults. Conclusions. Lung function testing in children is reliable and safe and can be done in a well-standardised laboratory that follows the regulations of such testing in adults.

  12. Matpar: Parallel Extensions for MATLAB

    Science.gov (United States)

    Springer, P. L.

    1998-01-01

    Matpar is a set of client/server software that allows a MATLAB user to take advantage of a parallel computer for very large problems. The user can replace calls to certain built-in MATLAB functions with calls to Matpar functions.

  13. Solving Large Quadratic Assignment Problems in Parallel

    DEFF Research Database (Denmark)

    Clausen, Jens; Perregaard, Michael

    1997-01-01

    ... processors, and have hence not been ideally suited for computations essentially involving non-vectorizable computations on integers. In this paper we investigate the combination of one of the best bound functions for a Branch-and-Bound algorithm (the Gilmore-Lawler bound) and various testing, variable binding and recalculation of bounds between branchings when used in a parallel Branch-and-Bound algorithm. The algorithm has been implemented on a 16-processor MEIKO Computing Surface with Intel i860 processors. Computational results from the solution of a number of large QAPs, including the classical Nugent 20...

  14. Functional Parallel Factor Analysis for Functions of One- and Two-dimensional Arguments

    NARCIS (Netherlands)

    Choi, Ji Yeh; Hwang, Heungsun; Timmerman, Marieke

    Parallel factor analysis (PARAFAC) is a useful multivariate method for decomposing three-way data that consist of three different types of entities simultaneously. This method estimates trilinear components, each of which is a low-dimensional representation of a set of entities, often called a mode,

  15. On the adequacy of message-passing parallel supercomputers for solving neutron transport problems

    International Nuclear Information System (INIS)

    Azmy, Y.Y.

    1990-01-01

    A coarse-grained, static-scheduling parallelization of the standard iterative scheme used for solving the discrete-ordinates approximation of the neutron transport equation is described. The parallel algorithm is based on a decomposition of the angular domain along the discrete ordinates, thus naturally producing a set of completely uncoupled systems of equations in each iteration. Implementation of the parallel code on Intel's iPSC/2 hypercube, and solutions to test problems, are presented as evidence of the high speedup and efficiency of the parallel code. The performance of the parallel code on the iPSC/2 is analyzed, and a model for the CPU time as a function of the problem size (order of angular quadrature) and the number of participating processors is developed and validated against measured CPU times. The performance model is used to speculate on the potential of massively parallel computers for significantly speeding up real-life transport calculations at acceptable efficiencies. We conclude that parallel computers with a few hundred processors are capable of producing large speedups at very high efficiencies in very large three-dimensional problems. 10 refs., 8 figs
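
    The decomposition described above is easy to mimic in miniature: because the scattering source is frozen within an iteration, the transport sweep for each discrete ordinate is independent and can be farmed out to a separate process, with the results recombined before the next source update. The toy one-dimensional sweep, the cross-section values and the use of a process pool below are illustrative assumptions, not the iPSC/2 implementation.

    ```python
    from multiprocessing import Pool

    CELLS, SIGMA_T, DX = 50, 1.0, 0.1

    def sweep_one_ordinate(args):
        """Solve a toy 1-D transport sweep along a single ordinate (mu), given a
        fixed scattering source; ordinates are completely uncoupled here."""
        mu, source = args
        flux = [0.0] * CELLS
        cells = range(CELLS) if mu > 0 else range(CELLS - 1, -1, -1)
        incoming = 0.0                      # vacuum boundary
        for c in cells:
            flux[c] = (source[c] + abs(mu) / DX * incoming) / (SIGMA_T + abs(mu) / DX)
            incoming = flux[c]
        return flux

    def parallel_source_iteration(ordinates, weights, iterations=5):
        source = [1.0] * CELLS              # initial external source
        with Pool(processes=len(ordinates)) as pool:
            for _ in range(iterations):
                # Each ordinate's sweep runs on its own worker: angular decomposition.
                angular_fluxes = pool.map(sweep_one_ordinate,
                                          [(mu, source) for mu in ordinates])
                scalar_flux = [sum(w * f[c] for w, f in zip(weights, angular_fluxes))
                               for c in range(CELLS)]
                source = [1.0 + 0.5 * SIGMA_T * phi for phi in scalar_flux]  # new source
        return scalar_flux

    if __name__ == "__main__":
        print(parallel_source_iteration(ordinates=[-0.8, -0.3, 0.3, 0.8],
                                        weights=[0.25] * 4)[:5])
    ```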

  16. Large-area parallel near-field optical nanopatterning of functional materials using microsphere mask

    Energy Technology Data Exchange (ETDEWEB)

    Chen, G.X.; Hong, M.H.; Lin, Y.; Wang, Z.B.; Ng, D.K.T.; Xie, Q.; Tan, L.S.; Chong, T.C. (NUS Nanoscience and Nanotechnology Initiative and Department of Electrical and Computer Engineering, National University of Singapore; Data Storage Institute, ASTAR, Singapore)

    2008-01-31

    Large-area parallel near-field optical nanopatterning on functional material surfaces was investigated with KrF excimer laser irradiation. A monolayer of silicon dioxide microspheres was self-assembled on the sample surfaces as the processing mask. Nanoholes and nanospots were obtained on silicon surfaces and thin silver films, respectively. The nanopatterning results were affected by the refractive indices of the surrounding media. Near-field optical enhancement beneath the microspheres is the physical origin of nanostructure formation. Theoretical calculation was performed to study the intensity of optical field distributions under the microspheres according to the light scattering model of a sphere on the substrate.

  17. Parallel implementation of the PHOENIX generalized stellar atmosphere program. II. Wavelength parallelization

    International Nuclear Information System (INIS)

    Baron, E.; Hauschildt, Peter H.

    1998-01-01

    We describe an important addition to the parallel implementation of our generalized nonlocal thermodynamic equilibrium (NLTE) stellar atmosphere and radiative transfer computer program PHOENIX. In a previous paper in this series we described data and task parallel algorithms we have developed for radiative transfer, spectral line opacity, and NLTE opacity and rate calculations. These algorithms divide the work spatially or by spectral lines, distributing the radial zones, individual spectral lines, or characteristic rays among different processors, and in addition employ task parallelism for logically independent functions (such as atomic and molecular line opacities). For finite, monotonic velocity fields, the radiative transfer equation is an initial value problem in wavelength, and hence each wavelength point depends upon the previous one. However, for sophisticated NLTE models of both static and moving atmospheres needed to accurately describe, e.g., novae and supernovae, the number of wavelength points is very large (200,000 - 300,000), and hence parallelization over wavelength can lead both to a considerable speedup in calculation time and to the ability to make use of the aggregate memory available on massively parallel supercomputers. Here, we describe an implementation of a pipelined design for the wavelength parallelization of PHOENIX, where the necessary data from the processor working on a previous wavelength point is sent to the processor working on the succeeding wavelength point as soon as it is known. Our implementation uses a MIMD design based on a relatively small number of standard message passing interface (MPI) library calls and is fully portable between serial and parallel computers.
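
    A minimal sketch of the pipelined wavelength parallelization described above, written with mpi4py (an assumption for illustration; PHOENIX itself is Fortran over MPI): each rank handles a contiguous block of wavelength points, waits for the state of the last point of the upstream rank, processes its own block, and immediately forwards its final state downstream.

    ```python
    # Run with e.g.: mpiexec -n 4 python wavelength_pipeline.py
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank, size = comm.Get_rank(), comm.Get_size()

    N_WAVELENGTHS = 1000
    block = range(rank * N_WAVELENGTHS // size, (rank + 1) * N_WAVELENGTHS // size)

    def solve_wavelength_point(index, previous_state):
        """Stand-in for the radiative transfer solution at one wavelength point,
        which depends on the solution at the preceding point (an initial value
        problem in wavelength for monotonic velocity fields)."""
        return previous_state + 1.0 / (index + 1)

    # The pipeline: wait for the upstream rank's last state before starting our block.
    state = 0.0 if rank == 0 else comm.recv(source=rank - 1, tag=11)

    for i in block:
        state = solve_wavelength_point(i, state)

    # Forward our final state downstream as soon as it is known, so the next rank
    # can start while earlier ranks move on to post-processing their blocks.
    if rank < size - 1:
        comm.send(state, dest=rank + 1, tag=11)
    if rank == size - 1:
        print("final state after", N_WAVELENGTHS, "wavelength points:", state)
    ```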

  18. 14 CFR 35.40 - Functional test.

    Science.gov (United States)

    2010-01-01

    ... 14 Aeronautics and Space; ... Standards: Propellers; Tests and Inspections; § 35.40 Functional test. The variable-pitch propeller system must be subjected to the applicable functional tests of this section. The same propeller system used in...

  19. Compiler Technology for Parallel Scientific Computation

    Directory of Open Access Journals (Sweden)

    Can Özturan

    1994-01-01

    Full Text Available There is a need for compiler technology that, given the source program, will generate efficient parallel codes for different architectures with minimal user involvement. Parallel computation is becoming indispensable in solving large-scale problems in science and engineering. Yet, the use of parallel computation is limited by the high costs of developing the needed software. To overcome this difficulty we advocate a comprehensive approach to the development of scalable architecture-independent software for scientific computation based on our experience with equational programming language (EPL. Our approach is based on a program decomposition, parallel code synthesis, and run-time support for parallel scientific computation. The program decomposition is guided by the source program annotations provided by the user. The synthesis of parallel code is based on configurations that describe the overall computation as a set of interacting components. Run-time support is provided by the compiler-generated code that redistributes computation and data during object program execution. The generated parallel code is optimized using techniques of data alignment, operator placement, wavefront determination, and memory optimization. In this article we discuss annotations, configurations, parallel code generation, and run-time support suitable for parallel programs written in the functional parallel programming language EPL and in Fortran.

  20. Testing properties of generic functions

    NARCIS (Netherlands)

    Jansson, P.; Jeuring, J.T.; Cabenda, L.; Engels, G.; Kleerekoper, J.; Mak, S.; Overeem, M.; Visser, Kees

    2007-01-01

    A datatype-generic function is a family of functions indexed by (the structure of) a type. Examples include equality tests, maps and pretty printers. Property-based testing tools like QuickCheck and Gast support the definition of properties and test-data generators, and they check if a

  1. Pthreads vs MPI Parallel Performance of Angular-Domain Decomposed S

    International Nuclear Information System (INIS)

    Azmy, Y.Y.; Barnett, D.A.

    2000-01-01

    Two programming models for parallelizing the Angular Domain Decomposition (ADD) of the discrete ordinates (Sn) approximation of the neutron transport equation are examined. These are the shared memory model based on the POSIX threads (Pthreads) standard, and the message passing model based on the Message Passing Interface (MPI) standard. These standard libraries are available on most multiprocessor platforms, thus making the resulting parallel codes widely portable. The question is: on a fixed platform, and for a particular code solving a given test problem, which of the two programming models delivers better parallel performance? Such a comparison is possible on Symmetric Multi-Processor (SMP) architectures, in which several CPUs physically share a common memory and, in addition, are capable of emulating message passing functionality. Implementation of the two-dimensional Sn Arbitrarily High Order Transport (AHOT) code for solving neutron transport problems using these two parallelization models is described. Measured parallel performance of each model on the COMPAQ AlphaServer 8400 and the SGI Origin 2000 platforms is described, and a comparison of the observed speedup for the two programming models is reported. For the case presented in this paper it appears that the MPI implementation scales better than the Pthreads implementation on both platforms.

  2. Endpoint-based parallel data processing with non-blocking collective instructions in a parallel active messaging interface of a parallel computer

    Science.gov (United States)

    Archer, Charles J; Blocksome, Michael A; Cernohous, Bob R; Ratterman, Joseph D; Smith, Brian E

    2014-11-11

    Endpoint-based parallel data processing with non-blocking collective instructions in a PAMI of a parallel computer is disclosed. The PAMI is composed of data communications endpoints, each including a specification of data communications parameters for a thread of execution on a compute node, including specifications of a client, a context, and a task. The compute nodes are coupled for data communications through the PAMI. The parallel application establishes a data communications geometry specifying a set of endpoints that are used in collective operations of the PAMI by associating with the geometry a list of collective algorithms valid for use with the endpoints of the geometry; registering in each endpoint in the geometry a dispatch callback function for a collective operation; and executing without blocking, through a single one of the endpoints in the geometry, an instruction for the collective operation.

  3. A parallelization study of the general purpose Monte Carlo code MCNP4 on a distributed memory highly parallel computer

    International Nuclear Information System (INIS)

    Yamazaki, Takao; Fujisaki, Masahide; Okuda, Motoi; Takano, Makoto; Masukawa, Fumihiro; Naito, Yoshitaka

    1993-01-01

    The general purpose Monte Carlo code MCNP4 has been implemented on the Fujitsu AP1000 distributed memory highly parallel computer. Parallelization techniques developed and studied are reported. A shielding analysis function of the MCNP4 code is parallelized in this study. A technique to map histories to processors dynamically and to map the control process to a dedicated processor was applied. The efficiency of the parallelized code is up to 80% for a typical practical problem with 512 processors. These results demonstrate the advantages of a highly parallel computer over conventional computers in the field of shielding analysis by the Monte Carlo method. (orig.)
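
    The dynamic mapping of histories to processors can be illustrated with a toy fixed-source problem: a pool of workers pulls batches of particle histories on demand (so faster workers naturally take more work), and the per-history tallies are reduced at the end. The batch size, the exponential path-length model, the absorption probability and the use of a process pool are assumptions for the sketch, not details of the AP1000 implementation.

    ```python
    import math
    import random
    from multiprocessing import Pool

    def run_history_batch(args):
        """Track one batch of particle histories through a 1-D absorbing slab
        and tally how many leak out of the far side (toy physics)."""
        seed, n_histories, thickness, sigma_t = args
        rng = random.Random(seed)                       # independent stream per batch
        leaked = 0
        for _ in range(n_histories):
            x = 0.0
            while True:
                x += -math.log(1.0 - rng.random()) / sigma_t   # next collision site
                if x > thickness:                              # escaped the slab
                    leaked += 1
                    break
                if rng.random() < 0.7:                         # assumed absorption probability
                    break                                      # history terminated
                # otherwise a forward "scatter": continue the free flight from x
        return leaked

    def parallel_leakage(n_batches=32, histories_per_batch=10_000,
                         thickness=3.0, sigma_t=1.0, workers=4):
        jobs = [(seed, histories_per_batch, thickness, sigma_t) for seed in range(n_batches)]
        with Pool(processes=workers) as pool:
            # chunksize=1 hands batches out dynamically, giving simple load balancing.
            tallies = pool.map(run_history_batch, jobs, chunksize=1)
        return sum(tallies) / (n_batches * histories_per_batch)

    if __name__ == "__main__":
        print("leakage probability ~", parallel_leakage())
    ```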

  4. Vectorization and parallelization of the finite strip method for dynamic Mindlin plate problems

    Science.gov (United States)

    Chen, Hsin-Chu; He, Ai-Fang

    1993-01-01

    The finite strip method is a semi-analytical finite element process which allows for a discrete analysis of certain types of physical problems by discretizing the domain of the problem into finite strips. This method decomposes a single large problem into m smaller independent subproblems when m harmonic functions are employed, thus yielding natural parallelism at a very high level. In this paper we address vectorization and parallelization strategies for the dynamic analysis of simply-supported Mindlin plate bending problems and show how to prevent potential conflicts in memory access during the assemblage process. The vector and parallel implementations of this method and the performance results of a test problem under scalar, vector, and vector-concurrent execution modes on the Alliant FX/80 are also presented.
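
    To make the "m independent subproblems" point concrete, here is a hedged sketch: each retained harmonic yields its own small system, the systems are solved independently (and therefore in parallel), and the contributions are superposed afterwards. The matrices below are random stand-ins, not an actual Mindlin plate formulation.

    ```python
    import numpy as np
    from concurrent.futures import ProcessPoolExecutor

    N_DOF = 200          # degrees of freedom per strip subproblem (illustrative)

    def solve_harmonic(m):
        """Assemble and solve the decoupled system for harmonic number m."""
        rng = np.random.default_rng(m)
        k_m = np.eye(N_DOF) * (1.0 + m ** 2) + 0.01 * rng.standard_normal((N_DOF, N_DOF))
        f_m = rng.standard_normal(N_DOF)
        return np.linalg.solve(k_m, f_m)          # no coupling to other harmonics

    def finite_strip_solution(n_harmonics=8):
        with ProcessPoolExecutor() as pool:        # natural, high-level parallelism
            per_harmonic = list(pool.map(solve_harmonic, range(1, n_harmonics + 1)))
        return np.sum(per_harmonic, axis=0)        # superpose harmonic contributions

    if __name__ == "__main__":
        print(finite_strip_solution()[:5])
    ```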

  5. Automate functional testing

    Directory of Open Access Journals (Sweden)

    Ramesh Kalindri

    2014-06-01

    Full Text Available Currently, software engineers are increasingly turning to the option of automating functional tests, but they are not always successful in this endeavor. Reasons range from poor planning to cost overruns in the process. Some principles that can guide teams in automating these tests are described in this article.

  6. SPRINT: A new parallel framework for R

    Directory of Open Access Journals (Sweden)

    Scharinger Florian

    2008-12-01

    Full Text Available Abstract Background Microarray analysis allows the simultaneous measurement of thousands to millions of genes or sequences across tens to thousands of different samples. The analysis of the resulting data tests the limits of existing bioinformatics computing infrastructure. A solution to this issue is to use High Performance Computing (HPC) systems, which contain many processors and more memory than desktop computer systems. Many biostatisticians use R to process the data gleaned from microarray analysis, and there is even a dedicated group of packages, Bioconductor, for this purpose. However, to exploit HPC systems, R must be able to utilise the multiple processors available on these systems. There are existing modules that enable R to use multiple processors, but these are either difficult to use for the HPC novice or cannot be used to solve certain classes of problems. A method of exploiting HPC systems, using R, but without recourse to mastering parallel programming paradigms, is therefore necessary to analyse genomic data to its fullest. Results We have designed and built a prototype framework that allows the addition of parallelised functions to R to enable the easy exploitation of HPC systems. The Simple Parallel R INTerface (SPRINT) is a wrapper around such parallelised functions. Their use requires very little modification to existing sequential R scripts and no expertise in parallel computing. As an example we created a function that carries out the computation of a pairwise calculated correlation matrix. This performs well with SPRINT. When executed using SPRINT on an HPC resource with eight processors, this computation takes less than one third of the time R needs to complete it on one processor. Conclusion SPRINT allows the biostatistician to concentrate on the research problems rather than the computation, while still allowing exploitation of HPC systems. It is easy to use and with further development will become more useful as more
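
    The pairwise correlation example in the abstract maps naturally onto a worker pool: each worker correlates a block of rows against the whole matrix, and the blocks are stacked into the full result. The sketch below is a generic Python/numpy analogue written only for illustration; it is not the SPRINT API, which exposes such operations as drop-in R functions backed by an HPC implementation.

    ```python
    import numpy as np
    from concurrent.futures import ProcessPoolExecutor

    def row_block_correlation(args):
        """Correlate one block of rows (e.g. genes) against every row of the matrix."""
        data, start, stop = args
        block = stop - start
        # corrcoef stacks the block on top of the full matrix; keep only the cross part.
        return np.corrcoef(data[start:stop], data)[:block, block:]

    def parallel_correlation_matrix(data, n_blocks=4):
        bounds = np.linspace(0, data.shape[0], n_blocks + 1, dtype=int)
        jobs = [(data, bounds[i], bounds[i + 1]) for i in range(n_blocks)]
        with ProcessPoolExecutor(max_workers=n_blocks) as pool:
            blocks = list(pool.map(row_block_correlation, jobs))
        return np.vstack(blocks)

    if __name__ == "__main__":
        expression = np.random.default_rng(0).standard_normal((1000, 50))  # genes x samples
        corr = parallel_correlation_matrix(expression)
        print(corr.shape)          # (1000, 1000), matching np.corrcoef(expression)
    ```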

  7. Empirical valence bond models for reactive potential energy surfaces: a parallel multilevel genetic program approach.

    Science.gov (United States)

    Bellucci, Michael A; Coker, David F

    2011-07-28

    We describe a new method for constructing empirical valence bond potential energy surfaces using a parallel multilevel genetic program (PMLGP). Genetic programs can be used to perform an efficient search through function space and parameter space to find the best functions and sets of parameters that fit energies obtained by ab initio electronic structure calculations. Building on the traditional genetic program approach, the PMLGP utilizes a hierarchy of genetic programming on two different levels. The lower level genetic programs are used to optimize coevolving populations in parallel while the higher level genetic program (HLGP) is used to optimize the genetic operator probabilities of the lower level genetic programs. The HLGP allows the algorithm to dynamically learn the mutation or combination of mutations that most effectively increase the fitness of the populations, causing a significant increase in the algorithm's accuracy and efficiency. The algorithm's accuracy and efficiency are tested against a standard parallel genetic program with a variety of one-dimensional test cases. Subsequently, the PMLGP is utilized to obtain an accurate empirical valence bond model for proton transfer in 3-hydroxy-gamma-pyrone in the gas phase and in protic solvent. © 2011 American Institute of Physics.
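
    The two-level organisation described above (lower-level populations evolving in parallel while a higher level tunes their genetic-operator probabilities) can be illustrated with a deliberately simplified analogue. In the Python sketch below the "program" is just a parameter vector and the fitness a toy least-squares error; the populations evolve sequentially here, and none of this is the EVB fitting machinery of the paper.

      # Toy two-level scheme in the spirit of the PMLGP: several populations
      # evolve with different mutation rates, and the upper level pulls the
      # rates toward whichever population is currently fittest.
      import numpy as np

      rng = np.random.default_rng(1)
      target = np.array([1.0, -2.0, 0.5])                 # "exact" parameters

      def fitness(pop):
          return -np.sum((pop - target) ** 2, axis=1)     # higher is better

      def evolve(pop, rate, generations=20):
          for _ in range(generations):
              parents = pop[np.argsort(fitness(pop))[-len(pop) // 2:]]    # selection
              children = parents + rng.normal(0.0, rate, parents.shape)   # mutation
              pop = np.vstack([parents, children])
          return pop

      rates = [0.5, 0.1, 0.02]                            # one rate per population
      pops = [rng.normal(size=(20, 3)) for _ in rates]
      for outer in range(5):                              # higher-level loop
          pops = [evolve(p, r) for p, r in zip(pops, rates)]   # (in parallel in the paper)
          best = max(range(len(pops)), key=lambda i: fitness(pops[i]).max())
          rates = [0.5 * (r + rates[best]) for r in rates]     # adapt operator rates
          print(outer, round(rates[best], 4), round(fitness(pops[best]).max(), 6))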

  8. Parallel pic plasma simulation through particle decomposition techniques

    International Nuclear Information System (INIS)

    Briguglio, S.; Vlad, G.; Di Martino, B.; Naples, Univ. 'Federico II'

    1998-02-01

    Particle-in-cell (PIC) codes are among the major candidates to yield a satisfactory description of the detail of kinetic effects, such as the resonant wave-particle interaction, relevant in determining the transport mechanism in magnetically confined plasmas. A significant improvement of the simulation performance of such codes can be expected from parallelization, e.g., by distributing the particle population among several parallel processors. Parallelization of a hybrid magnetohydrodynamic-gyrokinetic code has been accomplished within the High Performance Fortran (HPF) framework, and tested on the IBM SP2 parallel system, using a 'particle decomposition' technique. The adopted technique requires a moderate effort in porting the code to parallel form and results in intrinsic load balancing and modest inter-processor communication. The performance tests obtained confirm the hypothesis of high effectiveness of the strategy, if targeted towards moderately parallel architectures. Optimal use of resources is also discussed with reference to a specific physics problem.
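
    The 'particle decomposition' strategy (each processor owns a fixed share of the particles while grid data are replicated) can be made concrete with a small mpi4py sketch. The particle push and the field solve are placeholders, not the hybrid MHD-gyrokinetic model of the abstract.

      # Particle decomposition sketch: each rank pushes its own particles and
      # the density deposited on the replicated grid is summed with one Allreduce.
      # Run with e.g.: mpiexec -n 4 python pic_sketch.py
      import numpy as np
      from mpi4py import MPI

      comm = MPI.COMM_WORLD
      n_total, n_grid = 1_000_000, 64
      n_local = n_total // comm.size                  # intrinsic load balance

      rng = np.random.default_rng(comm.rank)
      x = rng.uniform(0.0, 1.0, n_local)              # this rank's particles
      v = rng.normal(0.0, 1.0, n_local)

      for step in range(10):
          x = (x + 0.01 * v) % 1.0                    # placeholder particle push
          local_density, _ = np.histogram(x, bins=n_grid, range=(0.0, 1.0))
          density = np.empty_like(local_density)
          comm.Allreduce(local_density, density, op=MPI.SUM)   # only global step
          # A field solve on the replicated grid would use 'density' here.

      if comm.rank == 0:
          print("total particles:", density.sum())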

  9. User's guide of parallel program development environment (PPDE). The 2nd edition

    International Nuclear Information System (INIS)

    Ueno, Hirokazu; Takemiya, Hiroshi; Imamura, Toshiyuki; Koide, Hiroshi; Matsuda, Katsuyuki; Higuchi, Kenji; Hirayama, Toshio; Ohta, Hirofumi

    2000-03-01

    The STA basic system has been enhanced to accelerate support for parallel programming on heterogeneous parallel computers, through a series of R and D on parallel processing technology. The enhancement has been made by extending the functions of the PPDE, the Parallel Program Development Environment in the STA basic system. The extended PPDE provides functions for: 1) automatic creation of a 'makefile' and a shell script file for its execution, 2) multi-tool execution, which lets the tools on heterogeneous computers execute a task on a computer with one operation, and 3) mirror composition, which reflects the editing results of a file on one computer into all related files on the other computers. These additional functions will improve the efficiency of program development involving several computers. More functions have been added to the PPDE to support parallel program development. New functions were also designed to complement an HPF translator and a parallelizing support tool when working together, so that a sequential program is efficiently converted to a parallel program. This report describes the use of the extended PPDE. (author)

  10. Neural nets for massively parallel optimization

    Science.gov (United States)

    Dixon, Laurence C. W.; Mills, David

    1992-07-01

    To apply massively parallel processing systems to the solution of large scale optimization problems it is desirable to be able to evaluate any function f(z), z ∈ R^n, in a parallel manner. The theorem of Cybenko, Hecht-Nielsen, Hornik, Stinchcombe and White, and Funahashi shows that this can be achieved by a neural network with one hidden layer. In this paper we address the problem of the number of nodes required in the layer to achieve a given accuracy in the function and gradient values at all points within a given n-dimensional interval. The type of activation function needed to obtain nonsingular Hessian matrices is described, and a strategy for obtaining accurate minimal networks is presented.
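
    The representation the abstract relies on, a single hidden layer approximating f(z) for z in R^n, is easy to state concretely. The Python sketch below evaluates such a network with random, untrained weights and a sigmoid activation purely to show the structure whose hidden nodes can be evaluated in parallel; it makes no claim about the node counts or activation choices analysed in the paper.

      # One-hidden-layer network y = W2 . sigma(W1 z + b1) + b2: every hidden
      # node is independent of the others, which is what makes the evaluation
      # attractive for massively parallel hardware.
      import numpy as np

      def sigma(t):
          return 1.0 / (1.0 + np.exp(-t))

      rng = np.random.default_rng(0)
      n, hidden = 5, 64                        # input dimension, hidden nodes
      W1, b1 = rng.normal(size=(hidden, n)), rng.normal(size=hidden)
      W2, b2 = rng.normal(size=hidden), 0.0

      def f_approx(z):
          return W2 @ sigma(W1 @ z + b1) + b2  # hidden units evaluated independently

      print(f_approx(np.ones(n)))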

  11. PERFORMANCE EVALUATION OF OR1200 PROCESSOR WITH EVOLUTIONARY PARALLEL HPRC USING GEP

    Directory of Open Access Journals (Sweden)

    R. Maheswari

    2012-04-01

    Full Text Available In this fast computing era, most embedded systems require more computing power to complete complex functions and tasks in less time. One way to achieve this is to boost processor performance, which allows the processor core to run faster. This paper presents a novel technique for increasing performance through parallel HPRC (High Performance Reconfigurable Computing) in the CPU/DSP (Digital Signal Processor) unit of the OR1200 (Open Reduced Instruction Set Computer (RISC) 1200) using Gene Expression Programming (GEP), an evolutionary programming model. OR1200 is a soft-core RISC processor of the Intellectual Property cores that can efficiently run any modern operating system. In the manufacturing process of the OR1200, a parallel HPRC is placed internally in the Integer Execution Pipeline unit of the CPU/DSP core to increase the performance. The GEP parallel HPRC is activated and deactivated by triggering the signals (i) HPRC_Gene_Start and (ii) HPRC_Gene_End. A Verilog HDL (Hardware Description Language) functional code for the Gene Expression Programming parallel HPRC is developed and synthesised using XILINX ISE in the former part of the work, and the CoreMark processor core benchmark is used to test the performance of the OR1200 soft core in the latter part. The results of the implementation show an overall speed-up of 20.59% with the GEP-based parallel HPRC in the execution unit of the OR1200.

  12. Parallel selection on TRPV6 in human populations.

    Science.gov (United States)

    Hughes, David A; Tang, Kun; Strotmann, Rainer; Schöneberg, Torsten; Prenen, Jean; Nilius, Bernd; Stoneking, Mark

    2008-02-27

    We identified and examined a candidate gene for local directional selection in Europeans, TRPV6, and conclude that selection has acted on standing genetic variation at this locus, creating parallel soft sweep events in humans. A novel modification of the extended haplotype homozygosity (EHH) test was utilized, which compares EHH for a single allele across populations, to investigate the signature of selection at TRPV6 and neighboring linked loci in published data sets for Europeans, Asians and African-Americans, as well as in newly-obtained sequence data for additional populations. We find that all non-African populations carry a signature of selection on the same haplotype at the TRPV6 locus. The selective footprints, however, are significantly differentiated between non-African populations and estimated to be younger than an ancestral population of non-Africans. The possibility of a single selection event occurring in an ancestral population of non-Africans was tested by simulations and rejected. The putatively-selected TRPV6 haplotype contains three candidate sites for functional differences, namely derived non-synonymous substitutions C157R, M378V and M681T. Potential functional differences between the ancestral and derived TRPV6 proteins were investigated by cloning the ancestral and derived forms, transfecting cell lines, and carrying out electrophysiology experiments via patch clamp analysis. No statistically-significant differences in biophysical channel function were found, although one property of the protein, namely Ca(2+) dependent inactivation, may show functionally relevant differences between the ancestral and derived forms. Although the reason for selection on this locus remains elusive, this is the first demonstration of a widespread parallel selection event acting on standing genetic variation in humans, and highlights the utility of between population EHH statistics.

  13. Parallel selection on TRPV6 in human populations.

    Directory of Open Access Journals (Sweden)

    David A Hughes

    Full Text Available We identified and examined a candidate gene for local directional selection in Europeans, TRPV6, and conclude that selection has acted on standing genetic variation at this locus, creating parallel soft sweep events in humans. A novel modification of the extended haplotype homozygosity (EHH) test was utilized, which compares EHH for a single allele across populations, to investigate the signature of selection at TRPV6 and neighboring linked loci in published data sets for Europeans, Asians and African-Americans, as well as in newly-obtained sequence data for additional populations. We find that all non-African populations carry a signature of selection on the same haplotype at the TRPV6 locus. The selective footprints, however, are significantly differentiated between non-African populations and estimated to be younger than an ancestral population of non-Africans. The possibility of a single selection event occurring in an ancestral population of non-Africans was tested by simulations and rejected. The putatively-selected TRPV6 haplotype contains three candidate sites for functional differences, namely derived non-synonymous substitutions C157R, M378V and M681T. Potential functional differences between the ancestral and derived TRPV6 proteins were investigated by cloning the ancestral and derived forms, transfecting cell lines, and carrying out electrophysiology experiments via patch clamp analysis. No statistically-significant differences in biophysical channel function were found, although one property of the protein, namely Ca(2+) dependent inactivation, may show functionally relevant differences between the ancestral and derived forms. Although the reason for selection on this locus remains elusive, this is the first demonstration of a widespread parallel selection event acting on standing genetic variation in humans, and highlights the utility of between population EHH statistics.

  14. Short-circuit testing of monofilar Bi-2212 coils connected in series and in parallel

    International Nuclear Information System (INIS)

    Polasek, A; Dias, R; Serra, E T; Filho, O O; Niedu, D

    2010-01-01

    Superconducting Fault Current Limiters (SCFCLs) are one of the most promising technologies for fault current limitation. In the present work, resistive SCFCL components based on Bi-2212 monofilar coils are subjected to short-circuit testing. These SCFCL components can easily be connected in series and/or in parallel by using joints and clamps. This allows considerable flexibility in developing larger SCFCL devices, since the configuration and size of the whole device can easily be adapted to the operational conditions. The single components presented critical current (Ic) values of 240-260 A at 77 K. Short circuits lasting 40-120 ms were applied. A single component can withstand a voltage drop of 126-252 V (0.3-0.6 V/cm). Components connected in series withstand higher voltage levels, whereas parallel connection allows higher rated currents during normal operation, but the limited current is also higher. Prospective currents as high as 10-40 kA (peak value) were limited to 3-9 kA (peak value) in the first half cycle.

  15. Mixed-Meal Tolerance Test Versus Glucagon Stimulation Test for the Assessment of β-Cell Function in Therapeutic Trials in Type 1 Diabetes

    Science.gov (United States)

    Greenbaum, Carla J.; Mandrup-Poulsen, Thomas; McGee, Paula Friedenberg; Battelino, Tadej; Haastert, Burkhard; Ludvigsson, Johnny; Pozzilli, Paolo; Lachin, John M.; Kolb, Hubert

    2008-01-01

    OBJECTIVE—β-Cell function in type 1 diabetes clinical trials is commonly measured by C-peptide response to a secretagogue in either a mixed-meal tolerance test (MMTT) or a glucagon stimulation test (GST). The Type 1 Diabetes TrialNet Research Group and the European C-peptide Trial (ECPT) Study Group conducted parallel randomized studies to compare the sensitivity, reproducibility, and tolerability of these procedures. RESEARCH DESIGN AND METHODS—In randomized sequences, 148 TrialNet subjects completed 549 tests with up to 2 MMTT and 2 GST tests on separate days, and 118 ECPT subjects completed 348 tests (up to 3 each) with either two MMTTs or two GSTs. RESULTS—Among individuals with up to 4 years’ duration of type 1 diabetes, >85% had measurable stimulated C-peptide values. The MMTT stimulus produced significantly higher concentrations of C-peptide than the GST. Whereas both tests were highly reproducible, the MMTT was significantly more so (R2 = 0.96 for peak C-peptide response). Overall, the majority of subjects preferred the MMTT, and there were few adverse events. Some older subjects preferred the shorter duration of the GST. Nausea was reported in the majority of GST studies, particularly in the young age-group. CONCLUSIONS—The MMTT is preferred for the assessment of β-cell function in therapeutic trials in type 1 diabetes. PMID:18628574

  16. Construction and functional characterization of double and triple mutants of parallel beta-bulge of ubiquitin.

    Science.gov (United States)

    Sharma, Mrinal; Prabha, C Ratna

    2011-12-01

    Ubiquitin, a small eukaryotic protein serving as a post-translational modification on many important proteins, plays a central role in cellular homeostasis and cell cycle regulation. Ubiquitin features two beta-bulges; the second beta-bulge, located at the C-terminal region of the protein along with a type II turn, holds 3 residues: Glu64(1), Ser65(2) and Gln2(X). The frequency of occurrence of such a sequence in a parallel beta-bulge is very low. However, the sequence and structure have been conserved in ubiquitin throughout evolution. The present study involves replacement of residues in the unusual beta-bulge of ubiquitin by introducing mutations in combination through site-directed mutagenesis, generating double and triple mutants, and their functional characterization. Mutant ubiquitins cloned into the yeast expression vector YEp96 and tested in growth profile, viability and heat-stress complementation assays revealed a significant decrease in growth rate, loss of viability and non-complementation of the heat-sensitive phenotype with the UbE64G-S65D and UbQ2N-E64G-S65D mutations. However, UbQ2N-S65D did not show any negative effects in the above assays. The present results show that replacement of residues in the beta-bulge of ubiquitin exerts severe effects on growth and viability in Saccharomyces cerevisiae due to functional failure of the mutant ubiquitins UbE64G-S65D and UbQ2N-E64G-S65D.

  17. 99m-Tc-IDA scintigraphic demonstrability of biliary elements and liver function tests in hepatobiliary diseases

    International Nuclear Information System (INIS)

    Kim, C.Y.; Bahk, Y.W.

    1982-01-01

    In the present communication, the results of a clinical study are reported on how well scintigraphic visualization of the hepatobiliary elements and several commonly used clinical liver function tests correlate with each other in various diseases of the hepatobiliary system. The demonstrability of the biliary tract, gallbladder (GB) and duodenum paralleled the serum bilirubin level rather closely, the alkaline phosphatase less closely, and the SGOT and SGPT rather poorly. The biliary tree could not be visualized scintigraphically when bilirubin exceeded 10 mg/dl.

  18. Research on Control Strategy of Complex Systems through VSC-HVDC Grid Parallel Device

    Directory of Open Access Journals (Sweden)

    Xue Mei-Juan

    2014-07-01

    Full Text Available After grid paralleling is completed, the device can be converted into a UPFC, STATCOM or SSSC; the conversion circuit and the transformation method realized by the corresponding switching operations are studied. The aim is to accomplish grid paralleling, comprehensive control of the tie-line, and stable operation and control of the grid after paralleling. A function-selection switch matrix and the branch variables of the grid parallel system are defined, forming a switch matrix that realizes the corresponding function of the composite system. A selection criterion is formulated to choose the control strategy according to the switch matrix, so that the corresponding function is accomplished. Grid paralleling, STATCOM, SSSC and UPFC are combined into one system, improving the stable operation and flexible control of the power system.

  19. Vectorization, parallelization and porting of nuclear codes (vectorization and parallelization). Progress report fiscal 1998

    International Nuclear Information System (INIS)

    Ishizuki, Shigeru; Kawai, Wataru; Nemoto, Toshiyuki; Ogasawara, Shinobu; Kume, Etsuo; Adachi, Masaaki; Kawasaki, Nobuo; Yatake, Yo-ichi

    2000-03-01

    Several computer codes in the nuclear field have been vectorized, parallelized and transported on the FUJITSU VPP500 system, the AP3000 system and the Paragon system at Center for Promotion of Computational Science and Engineering in Japan Atomic Energy Research Institute. We dealt with 12 codes in fiscal 1998. These results are reported in 3 parts, i.e., the vectorization and parallelization on vector processors part, the parallelization on scalar processors part and the porting part. In this report, we describe the vectorization and parallelization on vector processors. In this vectorization and parallelization on vector processors part, the vectorization of General Tokamak Circuit Simulation Program code GTCSP, the vectorization and parallelization of Molecular Dynamics NTV (n-particle, Temperature and Velocity) Simulation code MSP2, Eddy Current Analysis code EDDYCAL, Thermal Analysis Code for Test of Passive Cooling System by HENDEL T2 code THANPACST2 and MHD Equilibrium code SELENEJ on the VPP500 are described. In the parallelization on scalar processors part, the parallelization of Monte Carlo N-Particle Transport code MCNP4B2, Plasma Hydrodynamics code using Cubic Interpolated Propagation Method PHCIP and Vectorized Monte Carlo code (continuous energy model / multi-group model) MVP/GMVP on the Paragon are described. In the porting part, the porting of Monte Carlo N-Particle Transport code MCNP4B2 and Reactor Safety Analysis code RELAP5 on the AP3000 are described. (author)

  20. Linkage mechanisms in the vertebrate skull: Structure and function of three-dimensional, parallel transmission systems.

    Science.gov (United States)

    Olsen, Aaron M; Westneat, Mark W

    2016-12-01

    Many musculoskeletal systems, including the skulls of birds, fishes, and some lizards consist of interconnected chains of mobile skeletal elements, analogous to linkage mechanisms used in engineering. Biomechanical studies have applied linkage models to a diversity of musculoskeletal systems, with previous applications primarily focusing on two-dimensional linkage geometries, bilaterally symmetrical pairs of planar linkages, or single four-bar linkages. Here, we present new, three-dimensional (3D), parallel linkage models of the skulls of birds and fishes and use these models (available as free kinematic simulation software), to investigate structure-function relationships in these systems. This new computational framework provides an accessible and integrated workflow for exploring the evolution of structure and function in complex musculoskeletal systems. Linkage simulations show that kinematic transmission, although a suitable functional metric for linkages with single rotating input and output links, can give misleading results when applied to linkages with substantial translational components or multiple output links. To take into account both linear and rotational displacement we define force mechanical advantage for a linkage (analogous to lever mechanical advantage) and apply this metric to measure transmission efficiency in the bird cranial mechanism. For linkages with multiple, expanding output points we propose a new functional metric, expansion advantage, to measure expansion amplification and apply this metric to the buccal expansion mechanism in fishes. Using the bird cranial linkage model, we quantify the inaccuracies that result from simplifying a 3D geometry into two dimensions. We also show that by combining single-chain linkages into parallel linkages, more links can be simulated while decreasing or maintaining the same number of input parameters. This generalized framework for linkage simulation and analysis can accommodate linkages of differing

  1. Is Monte Carlo embarrassingly parallel?

    Energy Technology Data Exchange (ETDEWEB)

    Hoogenboom, J. E. [Delft Univ. of Technology, Mekelweg 15, 2629 JB Delft (Netherlands); Delft Nuclear Consultancy, IJsselzoom 2, 2902 LB Capelle aan den IJssel (Netherlands)

    2012-07-01

    Monte Carlo is often stated as being embarrassingly parallel. However, running a Monte Carlo calculation, especially a reactor criticality calculation, in parallel using tens of processors shows a serious limitation in speedup and the execution time may even increase beyond a certain number of processors. In this paper the main causes of the loss of efficiency when using many processors are analyzed using a simple Monte Carlo program for criticality. The basic mechanism for parallel execution is MPI. One of the bottlenecks turns out to be the rendezvous points in the parallel calculation used for synchronization and exchange of data between processors. This happens at least at the end of each cycle for fission source generation in order to collect the full fission source distribution for the next cycle and to estimate the effective multiplication factor, which is not only part of the requested results, but also input to the next cycle for population control. Basic improvements to overcome this limitation are suggested and tested. Also other time losses in the parallel calculation are identified. Moreover, the threading mechanism, which allows the parallel execution of tasks based on shared memory using OpenMP, is analyzed in detail. Recommendations are given to get the maximum efficiency out of a parallel Monte Carlo calculation. (authors)
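
    The rendezvous the abstract identifies, where every rank must stop at the end of a cycle so the fission source and the multiplication factor estimate can be combined, can be shown with a skeletal mpi4py cycle loop. The "transport" below is a placeholder; the sketch illustrates only the synchronisation pattern, not any production Monte Carlo code.

      # Skeleton of a parallel criticality calculation: the per-cycle
      # allreduce/allgather calls are the rendezvous points discussed above.
      # Run with: mpiexec -n <p> python mc_cycles.py
      import numpy as np
      from mpi4py import MPI

      comm = MPI.COMM_WORLD
      histories_per_rank, cycles = 10_000, 20
      rng = np.random.default_rng(comm.rank)
      source = rng.uniform(0.0, 1.0, histories_per_rank)    # local fission sites

      for cycle in range(cycles):
          nu = rng.poisson(1.0, source.size)                # placeholder "transport"
          # Rendezvous 1: combine production to estimate k_eff for this cycle.
          total = comm.allreduce(float(nu.sum()), op=MPI.SUM)
          k_eff = total / (histories_per_rank * comm.size)
          # Rendezvous 2: rebuild the global source and resample it so every
          # rank starts the next cycle with the same bank size (population control).
          bank = np.concatenate(comm.allgather(np.repeat(source, nu)))
          source = rng.choice(bank, size=histories_per_rank, replace=True)
          if comm.rank == 0:
              print(f"cycle {cycle:2d}  k_eff estimate {k_eff:.3f}")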

  2. Is Monte Carlo embarrassingly parallel?

    International Nuclear Information System (INIS)

    Hoogenboom, J. E.

    2012-01-01

    Monte Carlo is often stated as being embarrassingly parallel. However, running a Monte Carlo calculation, especially a reactor criticality calculation, in parallel using tens of processors shows a serious limitation in speedup and the execution time may even increase beyond a certain number of processors. In this paper the main causes of the loss of efficiency when using many processors are analyzed using a simple Monte Carlo program for criticality. The basic mechanism for parallel execution is MPI. One of the bottlenecks turns out to be the rendezvous points in the parallel calculation used for synchronization and exchange of data between processors. This happens at least at the end of each cycle for fission source generation in order to collect the full fission source distribution for the next cycle and to estimate the effective multiplication factor, which is not only part of the requested results, but also input to the next cycle for population control. Basic improvements to overcome this limitation are suggested and tested. Also other time losses in the parallel calculation are identified. Moreover, the threading mechanism, which allows the parallel execution of tasks based on shared memory using OpenMP, is analyzed in detail. Recommendations are given to get the maximum efficiency out of a parallel Monte Carlo calculation. (authors)

  3. Parallelization and automatic data distribution for nuclear reactor simulations

    Energy Technology Data Exchange (ETDEWEB)

    Liebrock, L.M. [Liebrock-Hicks Research, Calumet, MI (United States)

    1997-07-01

    Detailed attempts at realistic nuclear reactor simulations currently take many times real time to execute on high performance workstations. Even the fastest sequential machine can not run these simulations fast enough to ensure that the best corrective measure is used during a nuclear accident to prevent a minor malfunction from becoming a major catastrophe. Since sequential computers have nearly reached the speed of light barrier, these simulations will have to be run in parallel to make significant improvements in speed. In physical reactor plants, parallelism abounds. Fluids flow, controls change, and reactions occur in parallel with only adjacent components directly affecting each other. These do not occur in the sequentialized manner, with global instantaneous effects, that is often used in simulators. Development of parallel algorithms that more closely approximate the real-world operation of a reactor may, in addition to speeding up the simulations, actually improve the accuracy and reliability of the predictions generated. Three types of parallel architecture (shared memory machines, distributed memory multicomputers, and distributed networks) are briefly reviewed as targets for parallelization of nuclear reactor simulation. Various parallelization models (loop-based model, shared memory model, functional model, data parallel model, and a combined functional and data parallel model) are discussed along with their advantages and disadvantages for nuclear reactor simulation. A variety of tools are introduced for each of the models. Emphasis is placed on the data parallel model as the primary focus for two-phase flow simulation. Tools to support data parallel programming for multiple component applications and special parallelization considerations are also discussed.

  4. Parallelization and automatic data distribution for nuclear reactor simulations

    International Nuclear Information System (INIS)

    Liebrock, L.M.

    1997-01-01

    Detailed attempts at realistic nuclear reactor simulations currently take many times real time to execute on high performance workstations. Even the fastest sequential machine can not run these simulations fast enough to ensure that the best corrective measure is used during a nuclear accident to prevent a minor malfunction from becoming a major catastrophe. Since sequential computers have nearly reached the speed of light barrier, these simulations will have to be run in parallel to make significant improvements in speed. In physical reactor plants, parallelism abounds. Fluids flow, controls change, and reactions occur in parallel with only adjacent components directly affecting each other. These do not occur in the sequentialized manner, with global instantaneous effects, that is often used in simulators. Development of parallel algorithms that more closely approximate the real-world operation of a reactor may, in addition to speeding up the simulations, actually improve the accuracy and reliability of the predictions generated. Three types of parallel architecture (shared memory machines, distributed memory multicomputers, and distributed networks) are briefly reviewed as targets for parallelization of nuclear reactor simulation. Various parallelization models (loop-based model, shared memory model, functional model, data parallel model, and a combined functional and data parallel model) are discussed along with their advantages and disadvantages for nuclear reactor simulation. A variety of tools are introduced for each of the models. Emphasis is placed on the data parallel model as the primary focus for two-phase flow simulation. Tools to support data parallel programming for multiple component applications and special parallelization considerations are also discussed

  5. Development of Parallel Computing Framework to Enhance Radiation Transport Code Capabilities for Rare Isotope Beam Facility Design

    Energy Technology Data Exchange (ETDEWEB)

    Kostin, Mikhail [Michigan State Univ., East Lansing, MI (United States); Mokhov, Nikolai [Fermi National Accelerator Lab. (FNAL), Batavia, IL (United States); Niita, Koji [Research Organization for Information Science and Technology, Ibaraki-ken (Japan)

    2013-09-25

    A parallel computing framework has been developed for use with general-purpose radiation transport codes. The framework was implemented as a C++ module that uses MPI for message passing. It is intended to be used with older radiation transport codes implemented in Fortran 77, Fortran 90 or C. The module is largely independent of the radiation transport codes it is used with, and is connected to the codes by means of a number of interface functions. The framework was developed and tested in conjunction with the MARS15 code. It is possible to use it with other codes such as PHITS, FLUKA and MCNP after certain adjustments. Besides the parallel computing functionality, the framework offers a checkpoint facility that allows restarting calculations from a saved checkpoint file. The checkpoint facility can be used in single-process calculations as well as in the parallel regime. The framework corrects some of the known problems with the scheduling and load balancing found in the original implementations of the parallel computing functionality in MARS15 and PHITS. The framework can be used efficiently on homogeneous systems and networks of workstations, where interference from other users is possible.
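
    The checkpoint facility mentioned above is, in essence, periodic serialisation of the calculation state plus a restart path. A minimal Python sketch of that pattern follows (the real module is C++/MPI glued to Fortran transport codes; the file name and the batch loop here are invented for illustration).

      # Minimal checkpoint/restart pattern: save state atomically every few
      # batches and resume from the last saved state if the file exists.
      import os
      import pickle

      CHECKPOINT = "run.ckpt"                              # hypothetical file name

      def save_checkpoint(state):
          tmp = CHECKPOINT + ".tmp"
          with open(tmp, "wb") as fh:                      # write atomically so an
              pickle.dump(state, fh)                       # interrupted write cannot
          os.replace(tmp, CHECKPOINT)                      # corrupt the checkpoint

      def load_checkpoint():
          if os.path.exists(CHECKPOINT):
              with open(CHECKPOINT, "rb") as fh:
                  return pickle.load(fh)
          return {"next_batch": 0, "tally": 0.0}           # fresh start

      state = load_checkpoint()
      for batch in range(state["next_batch"], 100):
          state["tally"] += batch * 0.01                   # placeholder batch of work
          state["next_batch"] = batch + 1
          if batch % 10 == 9:
              save_checkpoint(state)                       # restart point
      print(state["tally"])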

  6. Parallel and vector implementation of APROS simulator code

    International Nuclear Information System (INIS)

    Niemi, J.; Tommiska, J.

    1990-01-01

    In this paper the vector and parallel processing implementation of a general purpose simulator code is discussed. In this code the utilization of vector processing is straightforward. In addition to loop-level parallel processing, functional decomposition and domain decomposition have been considered. Results presented for a PWR-plant simulation illustrate the potential speed-up factors of the alternatives. It turns out that loop-level parallelism and domain decomposition are the most promising alternatives for employing parallel processing. (author)

  7. Parallel transport of long mean-free-path plasma along open magnetic field lines: Parallel heat flux

    International Nuclear Information System (INIS)

    Guo Zehua; Tang Xianzhu

    2012-01-01

    In a long mean-free-path plasma where temperature anisotropy can be sustained, the parallel heat flux has two components with one associated with the parallel thermal energy and the other the perpendicular thermal energy. Due to the large deviation of the distribution function from local Maxwellian in an open field line plasma with low collisionality, the conventional perturbative calculation of the parallel heat flux closure in its local or non-local form is no longer applicable. Here, a non-perturbative calculation is presented for a collisionless plasma in a two-dimensional flux expander bounded by absorbing walls. Specifically, closures of previously unfamiliar form are obtained for ions and electrons, which relate two distinct components of the species parallel heat flux to the lower order fluid moments such as density, parallel flow, parallel and perpendicular temperatures, and the field quantities such as the magnetic field strength and the electrostatic potential. The plasma source and boundary condition at the absorbing wall enter explicitly in the closure calculation. Although the closure calculation does not take into account wave-particle interactions, the results based on passing orbits from steady-state collisionless drift-kinetic equation show remarkable agreement with fully kinetic-Maxwell simulations. As an example of the physical implications of the theory, the parallel heat flux closures are found to predict a surprising observation in the kinetic-Maxwell simulation of the 2D magnetic flux expander problem, where the parallel heat flux of the parallel thermal energy flows from low to high parallel temperature region.

  8. User's guide of parallel program development environment (PPDE). The 2nd edition

    Energy Technology Data Exchange (ETDEWEB)

    Ueno, Hirokazu; Takemiya, Hiroshi; Imamura, Toshiyuki; Koide, Hiroshi; Matsuda, Katsuyuki; Higuchi, Kenji; Hirayama, Toshio [Center for Promotion of Computational Science and Engineering, Japan Atomic Energy Research Institute, Tokyo (Japan); Ohta, Hirofumi [Hitachi Ltd., Tokyo (Japan)

    2000-03-01

    The STA basic system has been enhanced to accelerate support for parallel programming on heterogeneous parallel computers, through a series of R and D on parallel processing technology. The enhancement has been made by extending the functions of the PPDE, the Parallel Program Development Environment in the STA basic system. The extended PPDE provides functions for: 1) automatic creation of a 'makefile' and a shell script file for its execution, 2) multi-tool execution, which lets the tools on heterogeneous computers execute a task on a computer with one operation, and 3) mirror composition, which reflects the editing results of a file on one computer into all related files on the other computers. These additional functions will improve the efficiency of program development involving several computers. More functions have been added to the PPDE to support parallel program development. New functions were also designed to complement an HPF translator and a parallelizing support tool when working together, so that a sequential program is efficiently converted to a parallel program. This report describes the use of the extended PPDE. (author)

  9. Three-dimensional radiative transfer in an isotropically scattering, plane-parallel medium: generalized X- and Y-functions

    International Nuclear Information System (INIS)

    Mueller, D.W.; Crosbie, A.L.

    2005-01-01

    The topic of this work is the generalized X- and Y-functions of multidimensional radiative transfer. The physical problem considered is spatially varying, collimated radiation incident on the upper boundary of an isotropically scattering, plane-parallel medium. An integral transform is used to reduce the three-dimensional transport equation to a one-dimensional form, and a modified Ambarzumian's method is used to derive coupled, integro-differential equations for the source functions at the boundaries of the medium. The resulting equations are said to be in double-integral form because the integration is over both angular variables. Numerical results are presented to illustrate the computational characteristics of the formulation

  10. Declarative Parallel Programming in Spreadsheet End-User Development

    DEFF Research Database (Denmark)

    Biermann, Florian

    2016-01-01

    Spreadsheets are first-order functional languages and are widely used in research and industry as a tool to conveniently perform all kinds of computations. Because cells on a spreadsheet are immutable, there are possibilities for implicit parallelization of spreadsheet computations. In this literature study, we provide an overview of the publications on spreadsheet end-user programming and declarative array programming to inform further research on parallel programming in spreadsheets. Our results show that there is a clear overlap between spreadsheet programming and array programming and we can directly apply results from functional array programming to a spreadsheet model of computations.
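
    The implicit parallelism the study is concerned with comes from cell immutability: cells whose inputs are already known can be evaluated at the same time. A toy Python sketch of that idea follows; the sheet dictionary and its formulas are invented, and no actual spreadsheet engine is involved.

      # Toy illustration of implicit spreadsheet parallelism: repeatedly find
      # the cells whose dependencies are satisfied and evaluate them together.
      from concurrent.futures import ThreadPoolExecutor

      # Hypothetical sheet: cell -> (pure formula over known values, dependencies).
      sheet = {
          "A1": (lambda v: 2.0, []),
          "A2": (lambda v: 3.0, []),
          "B1": (lambda v: v["A1"] + v["A2"], ["A1", "A2"]),
          "B2": (lambda v: v["A1"] * v["A2"], ["A1", "A2"]),
          "C1": (lambda v: v["B1"] - v["B2"], ["B1", "B2"]),
      }

      values, remaining = {}, dict(sheet)
      with ThreadPoolExecutor() as pool:
          while remaining:
              ready = [c for c, (_, deps) in remaining.items()
                       if all(d in values for d in deps)]      # mutually independent cells
              values.update(dict(pool.map(lambda c: (c, remaining[c][0](values)), ready)))
              for c in ready:
                  del remaining[c]
      print(values["C1"])   # -1.0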

  11. Parallel-In-Time For Moving Meshes

    Energy Technology Data Exchange (ETDEWEB)

    Falgout, R. D. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Manteuffel, T. A. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Southworth, B. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Schroder, J. B. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)

    2016-02-04

    With steadily growing computational resources available, scientists must develop effective ways to utilize the increased resources. High performance, highly parallel software has become a standard. However, until recent years parallelism has focused primarily on the spatial domain. When solving a space-time partial differential equation (PDE), this leads to a sequential bottleneck in the temporal dimension, particularly when taking a large number of time steps. The XBraid parallel-in-time library was developed as a practical way to add temporal parallelism to existing sequential codes with only minor modifications. In this work, a rezoning-type moving mesh is applied to a diffusion problem and formulated in a parallel-in-time framework. Tests and scaling studies are run using XBraid and demonstrate excellent results for the simple model problem considered herein.

  12. Massively parallel Fokker-Planck code ALLAp

    International Nuclear Information System (INIS)

    Batishcheva, A.A.; Krasheninnikov, S.I.; Craddock, G.G.; Djordjevic, V.

    1996-01-01

    The Fokker-Planck code ALLA, recently developed for workstations, simulates the temporal evolution of 1V, 2V and 1D2V collisional edge plasmas. In this work we present the results of parallelizing the code on the CRI T3D massively parallel platform (the ALLAp version). Simultaneously we benchmark the 1D2V parallel version against an analytic self-similar solution of the collisional kinetic equation. This test is not trivial as it demands a very strong spatial temperature and density variation within the simulation domain. (orig.)

  13. Testing properties of generic functions

    NARCIS (Netherlands)

    Jansson, P.; Jeuring, J.T.; Cabenda, L.; Engels, G.; Kleerekoper, J.; Mak, S.; Overeem, M.; Visser, Kees

    2006-01-01

    Software testing is an important part of the software development process. Testing comes in many flavours: unit testing, property testing, regression testing, contract checking, etc. QuickCheck is probably one of the most advanced tools for testing properties of functional programs. It
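
    QuickCheck-style property testing exists in several languages. As a hedged illustration only, the sketch below states one property in Python using the third-party Hypothesis library (assumed to be installed); it is not the Haskell QuickCheck tool the abstract refers to.

      # A QuickCheck-style property: decoding inverts encoding, checked on
      # many generated inputs supplied by the Hypothesis library.
      from hypothesis import given, strategies as st

      def run_length_encode(xs):
          out = []
          for x in xs:
              if out and out[-1][0] == x:
                  out[-1] = (x, out[-1][1] + 1)
              else:
                  out.append((x, 1))
          return out

      def run_length_decode(pairs):
          return [x for x, n in pairs for _ in range(n)]

      @given(st.lists(st.integers()))
      def test_decode_inverts_encode(xs):
          assert run_length_decode(run_length_encode(xs)) == xs

      if __name__ == "__main__":
          test_decode_inverts_encode()   # Hypothesis drives the generated cases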

  14. Algorithms for the Construction of Parallel Tests by Zero-One Programming. Project Psychometric Aspects of Item Banking No. 7. Research Report 86-7.

    Science.gov (United States)

    Boekkooi-Timminga, Ellen

    Nine methods for automated test construction are described. All are based on the concepts of information from item response theory. Two general kinds of methods for the construction of parallel tests are presented: (1) sequential test design; and (2) simultaneous test design. Sequential design implies that the tests are constructed one after the…
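
    Parallel-form assembly of this kind is usually posed as a 0-1 program over item information. As a rough illustration only, the Python sketch below uses a simple greedy heuristic that assigns items to two forms so their test information functions stay close; the item pool and its information values are fabricated, and the heuristic is not one of the nine methods of the report.

      # Greedy heuristic for two weakly parallel forms: give the next most
      # informative item to whichever form currently has the lower total
      # information, so the two test information functions stay matched.
      import numpy as np

      rng = np.random.default_rng(42)
      pool = rng.uniform(0.05, 0.6, size=(40, 3))   # item information at 3 ability points

      forms = [[], []]
      tif = np.zeros((2, 3))
      for item in np.argsort(-pool.sum(axis=1)):    # most informative items first
          k = int(np.argmin(tif.sum(axis=1)))       # currently weaker form
          forms[k].append(int(item))
          tif[k] += pool[item]

      print("form sizes:", [len(f) for f in forms])
      print("TIF gap at the 3 ability points:", np.round(tif[0] - tif[1], 3))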

  15. ELEFUNT, Testing of Elementary Function Subroutines

    International Nuclear Information System (INIS)

    Cody, W.J.

    1981-01-01

    1 - Description of problem or function: ELEFUNT is a FORTRAN test package for the elementary functions. Each program is an aggressive test of one or more of the elementary function subroutines generally supplied with the support library accompanying a FORTRAN compiler. Functions tested are ALOG/ALOG10, ASIN/ACOS, ATAN, EXP, POWER, SIN/COS, SINH/COSH, SQRT, TAN/COTAN, and TANH. 2 - Method of solution: The programs check the accuracy of the functions by using purified random arguments from appropriate intervals in carefully selected identities. They also check special properties of each function, test for the handling of special arguments, and exercise the error returns. 3 - Restrictions on the complexity of the problem: The package contains one subroutine (MACHAR) for dynamic determination of parameters describing the floating-point arithmetic system of the host machine; the test programs must be modified to insert the necessary machine-dependent parameters in DATA statements, or otherwise make them available. This computing environment inquiry routine is known to malfunction when the arithmetic registers are wider than the storage registers.
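
    The style of test ELEFUNT applies, random arguments substituted into a carefully chosen identity, is easy to mimic. The sketch below probes SIN through the triple-angle identity in Python rather than Fortran; it omits ELEFUNT's argument purification and is only an illustration of the method, not part of the package.

      # ELEFUNT-style accuracy probe: compare sin(3x) with the identity
      # 3*sin(x) - 4*sin(x)**3 on random arguments and report the worst
      # relative error in units of the machine epsilon.
      import math
      import random
      import sys

      random.seed(0)
      worst = 0.0
      for _ in range(10_000):
          x = random.uniform(0.0, math.pi / 6.0)
          lhs = math.sin(3.0 * x)
          rhs = 3.0 * math.sin(x) - 4.0 * math.sin(x) ** 3
          if lhs != 0.0:
              worst = max(worst, abs((lhs - rhs) / lhs))
      print("worst relative error:", worst / sys.float_info.epsilon, "x machine epsilon")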

  16. Analytic energy gradient of excited electronic state within TDDFT/MMpol framework: Benchmark tests and parallel implementation.

    Science.gov (United States)

    Zeng, Qiao; Liang, WanZhen

    2015-10-07

    The time-dependent density functional theory (TDDFT) has become the most popular method to calculate the electronic excitation energies, describe the excited-state properties, and perform the excited-state geometric optimization of medium and large-size molecules due to the implementation of analytic excited-state energy gradient and Hessian in many electronic structure software packages. To describe the molecules in condensed phase, one usually adopts the computationally efficient hybrid Quantum Mechanics/Molecular Mechanics (QM/MM) models. Here, we extend our previous work on the energy gradient of TDDFT/MM excited state to account for the mutual polarization effects between QM and MM regions, which is believed to hold a crucial position in the potential energy surface of molecular systems when the photoexcitation-induced charge rearrangement in the QM region is drastic. The implementation of a simple polarizable TDDFT/MM (TDDFT/MMpol) model in Q-Chem/CHARMM interface with both the linear response and the state-specific features has been realized. Several benchmark tests and preliminary applications are exhibited to confirm our implementation and assess the effects of different treatment of environmental polarization on the excited-state properties, and the efficiency of parallel implementation is demonstrated as well.

  17. A Green's function method for two-dimensional reactive solute transport in a parallel fracture-matrix system

    Science.gov (United States)

    Chen, Kewei; Zhan, Hongbin

    2018-06-01

    The reactive solute transport in a single fracture bounded by upper and lower matrices is a classical problem that captures the dominant factors affecting transport behavior beyond pore scale. A parallel fracture-matrix system, which considers the interaction among multiple parallel fractures, is an extension of the single fracture-matrix system. The existing analytical or semi-analytical solutions for solute transport in a parallel fracture-matrix system simplify the problem to various degrees, such as neglecting the transverse dispersion in the fracture and/or the longitudinal diffusion in the matrix. The difficulty of solving the full two-dimensional (2-D) problem lies in the calculation of the mass exchange between the fracture and matrix. In this study, we propose an innovative Green's function approach to address the 2-D reactive solute transport in a parallel fracture-matrix system. The flux at the interface is calculated numerically. It is found that the transverse dispersion in the fracture can be safely neglected due to the small scale of the fracture aperture. However, neglecting the longitudinal matrix diffusion would overestimate the concentration profile near the solute entrance face and underestimate the concentration profile at the far side. The error caused by neglecting the longitudinal matrix diffusion decreases with increasing Peclet number. The longitudinal matrix diffusion does not have an obvious influence on the concentration profile in the long term. The developed model is applied to a dense non-aqueous phase liquid (DNAPL) contamination field case in the New Haven Arkose of Connecticut, USA, to estimate the trichloroethylene (TCE) behavior over 40 years. The ratio of the TCE mass stored in the matrix to the injected TCE mass increases above 90% in less than 10 years.

  18. Locating hardware faults in a parallel computer

    Science.gov (United States)

    Archer, Charles J.; Megerian, Mark G.; Ratterman, Joseph D.; Smith, Brian E.

    2010-04-13

    Locating hardware faults in a parallel computer, including defining within a tree network of the parallel computer two or more sets of non-overlapping test levels of compute nodes of the network that together include all the data communications links of the network, each non-overlapping test level comprising two or more adjacent tiers of the tree; defining test cells within each non-overlapping test level, each test cell comprising a subtree of the tree including a subtree root compute node and all descendant compute nodes of the subtree root compute node within a non-overlapping test level; performing, separately on each set of non-overlapping test levels, an uplink test on all test cells in a set of non-overlapping test levels; and performing, separately from the uplink tests and separately on each set of non-overlapping test levels, a downlink test on all test cells in a set of non-overlapping test levels.

  19. Real-time trajectory optimization on parallel processors

    Science.gov (United States)

    Psiaki, Mark L.

    1993-01-01

    A parallel algorithm has been developed for rapidly solving trajectory optimization problems. The goal of the work has been to develop an algorithm that is suitable to do real-time, on-line optimal guidance through repeated solution of a trajectory optimization problem. The algorithm has been developed on an INTEL iPSC/860 message passing parallel processor. It uses a zero-order-hold discretization of a continuous-time problem and solves the resulting nonlinear programming problem using a custom-designed augmented Lagrangian nonlinear programming algorithm. The algorithm achieves parallelism of function, derivative, and search direction calculations through the principle of domain decomposition applied along the time axis. It has been encoded and tested on 3 example problems, the Goddard problem, the acceleration-limited, planar minimum-time to the origin problem, and a National Aerospace Plane minimum-fuel ascent guidance problem. Execution times as fast as 118 sec of wall clock time have been achieved for a 128-stage Goddard problem solved on 32 processors. A 32-stage minimum-time problem has been solved in 151 sec on 32 processors. A 32-stage National Aerospace Plane problem required 2 hours when solved on 32 processors. A speed-up factor of 7.2 has been achieved by using 32-nodes instead of 1-node to solve a 64-stage Goddard problem.

  20. Liver Function Tests

    Science.gov (United States)


  1. Programming parallel architectures - The BLAZE family of languages

    Science.gov (United States)

    Mehrotra, Piyush

    1989-01-01

    This paper gives an overview of the various approaches to programming multiprocessor architectures that are currently being explored. It is argued that two of these approaches, interactive programming environments and functional parallel languages, are particularly attractive, since they remove much of the burden of exploiting parallel architectures from the user. This paper also describes recent work in the design of parallel languages. Research on languages for both shared and nonshared memory multiprocessors is described.

  2. Parallel and non-parallel laminar mixed convection flow in an inclined tube: The effect of the boundary conditions

    International Nuclear Information System (INIS)

    Barletta, A.

    2008-01-01

    The necessary condition for the onset of parallel flow in the fully developed region of an inclined duct is applied to the case of a circular tube. Parallel flow in inclined ducts is an uncommon regime, since in most cases buoyancy tends to produce the onset of secondary flow. The present study shows how proper thermal boundary conditions may preserve parallel flow regime. Mixed convection flow is studied for a special non-axisymmetric thermal boundary condition that, with a proper choice of a switch parameter, may be compatible with parallel flow. More precisely, a circumferentially variable heat flux distribution is prescribed on the tube wall, expressed as a sinusoidal function of the azimuthal coordinate θ with period 2π. A π/2 rotation in the position of the maximum heat flux, achieved by setting the switch parameter, may allow or not the existence of parallel flow. Two cases are considered corresponding to parallel and non-parallel flow. In the first case, the governing balance equations allow a simple analytical solution. On the contrary, in the second case, the local balance equations are solved numerically by employing a finite element method

  3. Stampi: a message passing library for distributed parallel computing. User's guide

    International Nuclear Information System (INIS)

    Imamura, Toshiyuki; Koide, Hiroshi; Takemiya, Hiroshi

    1998-11-01

    A new message passing library, Stampi, has been developed to enable computation across arbitrarily different kinds of parallel computers, with MPI (Message Passing Interface) as the unique interface for communication. Stampi is based on the MPI-2 specification. It realizes dynamic process creation on different machines and communication with the spawned processes within the scope of MPI semantics. Vendor MPI implementations are closed systems within one parallel machine and do not support these two functions: process creation on and communication with external machines. Stampi supports both functions and thus enables distributed parallel computing. Currently, Stampi has been implemented on COMPACS (COMplex PArallel Computer System) introduced at CCSE, on five parallel computers and on one graphic workstation, and any communication among them can be processed. (author)
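
    The MPI-2 feature Stampi builds on, dynamic process creation plus communication with the spawned processes, can be shown with mpi4py. This is generic MPI-2 usage rather than Stampi's own interface, and worker.py is a hypothetical child script whose expected behaviour is sketched in the trailing comments.

      # parent.py: MPI-2 style dynamic process creation with mpi4py.
      # Spawns 4 workers and talks to them through the intercommunicator.
      import sys
      from mpi4py import MPI

      inter = MPI.COMM_SELF.Spawn(sys.executable, args=["worker.py"], maxprocs=4)
      inter.bcast({"task": "ping"}, root=MPI.ROOT)     # send work to the children
      replies = inter.gather(None, root=MPI.ROOT)      # collect their answers
      print(replies)
      inter.Disconnect()

      # worker.py (sketch of the hypothetical child script):
      #   from mpi4py import MPI
      #   parent = MPI.Comm.Get_parent()
      #   task = parent.bcast(None, root=0)
      #   parent.gather(("done", MPI.COMM_WORLD.rank), root=0)
      #   parent.Disconnect()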

  4. A NEW CLINICAL MUSCLE FUNCTION TEST FOR ASSESSMENT OF HIP EXTERNAL ROTATION STRENGTH: AUGUSTSSON STRENGTH TEST.

    Science.gov (United States)

    Augustsson, Jesper

    2016-08-01

    Dynamic clinical tests of hip strength applicable to patients, non-athletes and athletes alike are lacking. The aim of this study was therefore to develop and evaluate the reliability of a dynamic muscle function test of hip external rotation strength, using a novel device. A second aim was to determine if gender differences exist in absolute and relative hip strength using the new test. Fifty-three healthy sport science students (34 women and 19 men) were tested for hip external rotation strength using a device that consisted of a strap connected in series with an elastic resistance band loop, and a measuring tape connected in parallel with the elastic resistance band. The test was carried out with the subject side-lying, positioned in 45° of hip flexion and the knees flexed to 90°, with the device firmly fastened proximally across the knees. The subject then exerted maximal concentric hip external rotation force against the device, thereby extending the elastic resistance band. The displacement achieved by the subject was documented by the tape measure and the corresponding force production was calculated. Both right and left hip strength was measured. Fifteen of the subjects were tested on repeated occasions to evaluate test-retest reliability. No significant test-retest differences were observed. Intra-class correlation coefficients ranged from 0.93 to 0.94 and coefficients of variation from 2.76 to 4.60%. In absolute values, men were significantly stronger in hip external rotation than women (right side 13.2 vs 11.0 kg, p = 0.001, left side 13.2 vs 11.5 kg, p = 0.002). There were no significant differences in hip external rotation strength normalized for body weight (BW) between men and women (right side 0.17 kg/BW vs 0.17 kg/BW, p = 0.675, left side 0.17 kg/BW vs 0.18 kg/BW, p = 0.156). The new muscle function test showed high reliability and thus could be useful for measuring dynamic hip external rotation strength in patients, non-athletes and athletes.

  5. Multi-petascale highly efficient parallel supercomputer

    Science.gov (United States)

    Asaad, Sameh; Bellofatto, Ralph E.; Blocksome, Michael A.; Blumrich, Matthias A.; Boyle, Peter; Brunheroto, Jose R.; Chen, Dong; Cher, Chen-Yong; Chiu, George L.; Christ, Norman; Coteus, Paul W.; Davis, Kristan D.; Dozsa, Gabor J.; Eichenberger, Alexandre E.; Eisley, Noel A.; Ellavsky, Matthew R.; Evans, Kahn C.; Fleischer, Bruce M.; Fox, Thomas W.; Gara, Alan; Giampapa, Mark E.; Gooding, Thomas M.; Gschwind, Michael K.; Gunnels, John A.; Hall, Shawn A.; Haring, Rudolf A.; Heidelberger, Philip; Inglett, Todd A.; Knudson, Brant L.; Kopcsay, Gerard V.; Kumar, Sameer; Mamidala, Amith R.; Marcella, James A.; Megerian, Mark G.; Miller, Douglas R.; Miller, Samuel J.; Muff, Adam J.; Mundy, Michael B.; O'Brien, John K.; O'Brien, Kathryn M.; Ohmacht, Martin; Parker, Jeffrey J.; Poole, Ruth J.; Ratterman, Joseph D.; Salapura, Valentina; Satterfield, David L.; Senger, Robert M.; Steinmacher-Burow, Burkhard; Stockdell, William M.; Stunkel, Craig B.; Sugavanam, Krishnan; Sugawara, Yutaka; Takken, Todd E.; Trager, Barry M.; Van Oosten, James L.; Wait, Charles D.; Walkup, Robert E.; Watson, Alfred T.; Wisniewski, Robert W.; Wu, Peng

    2018-05-15

    A Multi-Petascale Highly Efficient Parallel Supercomputer of 100 petaflop-scale includes node architectures based upon System-On-a-Chip technology, where each processing node comprises a single Application Specific Integrated Circuit (ASIC). The ASIC nodes are interconnected by a five-dimensional torus network that maximizes the throughput of packet communications between nodes and minimizes latency. The network implements a collective network and a global asynchronous network that provide global barrier and notification functions. Integrated in the node design is a list-based prefetcher. The memory system implements transactional memory, thread-level speculation, and a multiversioning cache that improves the soft error rate and at the same time supports DMA functionality, allowing for parallel message passing.

  6. Performance of a parallel plate volume calorimeter prototype

    International Nuclear Information System (INIS)

    Arefiev, A.; Bencze, Gy.L.; Bizzeti, A.; Choumilov, E.; Civinini, C; D'Alessandro, R.; Ferrando, A.; Fouz, M.C.; Iglesias, A.; Ivochkin, V.; Josa, M.I.; Malinin, A.; Meschini, M.; Misyura, S.; Pojidaev, V.; Salicio, J.M.; Sikler, F.

    1995-01-01

    An iron/gas parallel plate volume calorimeter prototype, working in the avalanche mode, has been tested using electrons of 20 to 150 GeV/c momentum with high voltages varying from 5400 to 5600 V (electric fields ranging from 36 to 37 kV/cm), and a gas mixture of CF4/CO2 (80/20%). The collected charge was measured as a function of the high voltage and of the electron energy. The energy resolution was also measured. Comparisons are made with Monte-Carlo predictions. Agreement between data and simulation allows the calculation of the expected performance of a full size calorimeter. (Author)

  7. Performance of a parallel plate volume calorimeter prototype

    International Nuclear Information System (INIS)

    Arefiev, A.; Bencze, G.L.; Bizzeti, A.

    1995-09-01

    An iron/gas parallel plate volume calorimeter prototype, working in the avalanche mode, has been tested using electrons of 20 to 150 GeV/c momentum with high voltages varying from 5400 to 5600 V (electric fields ranging from 36 to 37 kV/cm), and a gas mixture of CF4/CO2 (80/20%). The collected charge was measured as a function of the high voltage and of the electron energy. The energy resolution was also measured. Comparisons are made with Monte-Carlo predictions. Agreement between data and simulation allows the calculation of the expected performance of a full size calorimeter

  8. Measuring effectiveness of a university by a parallel network DEA model

    Science.gov (United States)

    Kashim, Rosmaini; Kasim, Maznah Mat; Rahman, Rosshairy Abd

    2017-11-01

    Universities contribute significantly to the development of human capital and the socio-economic improvement of a country. Because of that, Malaysian universities have carried out various initiatives to improve their performance. Most studies have used the Data Envelopment Analysis (DEA) model to measure efficiency rather than effectiveness, even though measuring effectiveness is important for understanding how effective a university is in achieving its ultimate goals. A university system has two major functions, namely teaching and research, and each function draws on different resources depending on its emphasis. A university is therefore structured as a parallel production system whose overall effectiveness is the aggregated effectiveness of teaching and research. Hence, this paper proposes a parallel network DEA model to measure the effectiveness of a university. The model takes the internal operations of both the teaching and research functions into account in computing the effectiveness of a university system. In the literature, graduates and the number of programmes offered are defined as the outputs, while employed graduates and the number of programmes accredited by professional bodies are considered the outcomes for measuring teaching effectiveness. The amount of grants is regarded as the output of research, while publications of different quality are considered the outcomes of research. A system is considered effective only if all functions are effective. The model has been tested using a hypothetical data set consisting of 14 faculties at a public university in Malaysia. The results show that none of the faculties is relatively effective in overall performance. Three faculties are effective in teaching and two faculties are effective in research. The parallel network DEA model allows the top management of a university to identify weaknesses in any function of their universities and take rational steps for improvement.

  9. Parallel Implicit Algorithms for CFD

    Science.gov (United States)

    Keyes, David E.

    1998-01-01

    The main goal of this project was efficient distributed parallel and workstation cluster implementations of Newton-Krylov-Schwarz (NKS) solvers for implicit Computational Fluid Dynamics (CFD). "Newton" refers to a quadratically convergent nonlinear iteration using gradient information based on the true residual, "Krylov" to an inner linear iteration that accesses the Jacobian matrix only through highly parallelizable sparse matrix-vector products, and "Schwarz" to a domain decomposition form of preconditioning the inner Krylov iterations with primarily neighbor-only exchange of data between the processors. Prior experience has established that Newton-Krylov methods are competitive solvers in the CFD context and that Krylov-Schwarz methods port well to distributed memory computers. The combination of the techniques into Newton-Krylov-Schwarz was implemented on 2D and 3D unstructured Euler codes on the parallel testbeds that used to be at LaRC and on several other parallel computers operated by other agencies or made available by the vendors. Early implementations were made directly in the Message Passing Interface (MPI) with parallel solvers we adapted from legacy NASA codes and enhanced for full NKS functionality. Later implementations were made in the framework of the PETSc library from Argonne National Laboratory, which now includes pseudo-transient continuation Newton-Krylov-Schwarz solver capability (as a result of demands we made upon PETSc during our early porting experiences). A secondary project pursued with funding from this contract was parallel implicit solvers in acoustics, specifically in the Helmholtz formulation. A 2D acoustic inverse problem has been solved in parallel within the PETSc framework.

  10. A parallel approach to the stable marriage problem

    DEFF Research Database (Denmark)

    Larsen, Jesper

    1997-01-01

    This paper describes two parallel algorithms for the stable marriage problem implemented on a MIMD parallel computer. The algorithms are tested against sequential algorithms on randomly generated and worst-case instances. The results clearly show that the combination of a very simple problem ... and a commercial MIMD system results in parallel algorithms which are not competitive with sequential algorithms with respect to practical performance.
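
    For reference, the sequential baseline that parallel stable-marriage algorithms are usually measured against is the classic Gale-Shapley proposal algorithm. The sketch below is a minimal, generic implementation of that algorithm, not the paper's MIMD code; the data layout and names are illustrative.

    ```python
    def gale_shapley(men_prefs, women_prefs):
        """Classic sequential Gale-Shapley: men propose, women tentatively accept.

        men_prefs[m]  : list of women in m's order of preference
        women_prefs[w]: list of men in w's order of preference
        Returns a stable matching as a dict {man: woman}.
        """
        rank = {w: {m: i for i, m in enumerate(prefs)} for w, prefs in women_prefs.items()}
        next_choice = {m: 0 for m in men_prefs}      # index of next woman to propose to
        engaged_to = {}                              # woman -> man
        free_men = list(men_prefs)
        while free_men:
            m = free_men.pop()
            w = men_prefs[m][next_choice[m]]
            next_choice[m] += 1
            if w not in engaged_to:                    # w is free: accept
                engaged_to[w] = m
            elif rank[w][m] < rank[w][engaged_to[w]]:  # w prefers m: swap partners
                free_men.append(engaged_to[w])
                engaged_to[w] = m
            else:                                      # w rejects m
                free_men.append(m)
        return {m: w for w, m in engaged_to.items()}

    men = {"a": ["x", "y"], "b": ["y", "x"]}
    women = {"x": ["b", "a"], "y": ["a", "b"]}
    print(gale_shapley(men, women))                    # {'a': 'x', 'b': 'y'}
    ```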

  11. NDL-v2.0: A new version of the numerical differentiation library for parallel architectures

    Science.gov (United States)

    Hadjidoukas, P. E.; Angelikopoulos, P.; Voglis, C.; Papageorgiou, D. G.; Lagaris, I. E.

    2014-07-01

    We present a new version of the numerical differentiation library (NDL) used for the numerical estimation of first and second order partial derivatives of a function by finite differencing. In this version we have restructured the serial implementation of the code so as to achieve optimal task-based parallelization. The pure shared-memory parallelization of the library has been based on the lightweight OpenMP tasking model allowing for the full extraction of the available parallelism and efficient scheduling of multiple concurrent library calls. On multicore clusters, parallelism is exploited by means of TORC, an MPI-based multi-threaded tasking library. The new MPI implementation of NDL provides optimal performance in terms of function calls and, furthermore, supports asynchronous execution of multiple library calls within legacy MPI programs. In addition, a Python interface has been implemented for all cases, exporting the functionality of our library to sequential Python codes. Catalog identifier: AEDG_v2_0 Program summary URL:http://cpc.cs.qub.ac.uk/summaries/AEDG_v2_0.html Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland Licensing provisions: Standard CPC licence, http://cpc.cs.qub.ac.uk/licence/licence.html No. of lines in distributed program, including test data, etc.: 63036 No. of bytes in distributed program, including test data, etc.: 801872 Distribution format: tar.gz Programming language: ANSI Fortran-77, ANSI C, Python. Computer: Distributed systems (clusters), shared memory systems. Operating system: Linux, Unix. Has the code been vectorized or parallelized?: Yes. RAM: The library uses O(N) internal storage, N being the dimension of the problem. It can use up to O(N2) internal storage for Hessian calculations, if a task throttling factor has not been set by the user. Classification: 4.9, 4.14, 6.5. Catalog identifier of previous version: AEDG_v1_0 Journal reference of previous version: Comput. Phys. Comm. 180
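
    As a rough illustration of the task parallelism such a library exploits, the sketch below estimates a gradient by central differences, with the mutually independent function evaluations farmed out to a process pool. This is a generic Python example under assumed names, not NDL's Fortran/OpenMP/MPI implementation.

    ```python
    import numpy as np
    from multiprocessing import Pool

    def _eval(args):
        f, x = args
        return f(x)

    def parallel_gradient(f, x, h=1e-6, processes=4):
        """Central-difference gradient: the 2*N function evaluations are
        independent of each other, so they can run concurrently."""
        x = np.asarray(x, dtype=float)
        points = []
        for i in range(x.size):
            step = np.zeros_like(x)
            step[i] = h
            points.append(x + step)
            points.append(x - step)
        with Pool(processes) as pool:
            vals = pool.map(_eval, [(f, p) for p in points])
        vals = np.array(vals).reshape(-1, 2)       # rows: (f(x + h e_i), f(x - h e_i))
        return (vals[:, 0] - vals[:, 1]) / (2.0 * h)

    def rosenbrock(x):
        return float(np.sum(100.0 * (x[1:] - x[:-1] ** 2) ** 2 + (1.0 - x[:-1]) ** 2))

    if __name__ == "__main__":
        print(parallel_gradient(rosenbrock, np.zeros(4)))
    ```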

  12. A highly efficient parallel algorithm for solving the neutron diffusion nodal equations on shared-memory computers

    International Nuclear Information System (INIS)

    Azmy, Y.Y.; Kirk, B.L.

    1990-01-01

    Modern parallel computer architectures offer an enormous potential for reducing CPU and wall-clock execution times of large-scale computations commonly performed in various applications in science and engineering. Recently, several authors have reported their efforts in developing and implementing parallel algorithms for solving the neutron diffusion equation on a variety of shared- and distributed-memory parallel computers. Testing of these algorithms for a variety of two- and three-dimensional meshes showed significant speedup of the computation. Even for very large problems (i.e., three-dimensional fine meshes) executed concurrently on a few nodes in serial (nonvector) mode, however, the measured computational efficiency is very low (40 to 86%). In this paper, the authors present a highly efficient (∼85 to 99.9%) algorithm for solving the two-dimensional nodal diffusion equations on the Sequent Balance 8000 parallel computer. Also presented is a model for the performance, represented by the efficiency, as a function of problem size and the number of participating processors. The model is validated through several tests and then extrapolated to larger problems and more processors to predict the performance of the algorithm in more computationally demanding situations
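
    The paper's own performance model is not reproduced here, but a generic fixed-size (Amdahl-style) model shows how efficiency can be written as a function of the serial fraction, a per-processor communication overhead, and the number of processors; the parameter names below are illustrative assumptions, not fitted values from the paper.

    ```python
    def speedup_and_efficiency(p, serial_fraction, comm_overhead=0.0):
        """Generic fixed-size performance model (illustrative only):
        T(p) = serial + parallel/p + overhead*p, normalised so T(1) is about 1."""
        t_parallel = serial_fraction + (1.0 - serial_fraction) / p + comm_overhead * p
        speedup = 1.0 / t_parallel
        return speedup, speedup / p

    for p in (1, 2, 4, 8, 16):
        s, e = speedup_and_efficiency(p, serial_fraction=0.01, comm_overhead=0.001)
        print(f"p={p:2d}  speedup={s:5.2f}  efficiency={100*e:5.1f}%")
    ```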

  13. Shaft torsional oscillation interactions between turbo-generators in parallel in series compensated transmission systems

    Energy Technology Data Exchange (ETDEWEB)

    Mello, F.P. de

    1994-12-31

    Several investigators have raised the possibility of interaction between the shaft systems of parallel units, particularly among identical units. The question addressed in this paper is the significance of this interaction between shaft systems of units coupled through the electrical system. A time domain model of two parallel units connected to an infinite bus through a series-compensated transmission is used to evaluate the phenomena. The same model is used to extract pertinent frequency response functions by Fourier processing of pulse response tests, from which a frequency response analysis is performed to lend additional insight into the phenomena. (author) 8 refs., 13 figs., 3 tabs.

  14. Fast ℓ1-SPIRiT Compressed Sensing Parallel Imaging MRI: Scalable Parallel Implementation and Clinically Feasible Runtime

    Science.gov (United States)

    Murphy, Mark; Alley, Marcus; Demmel, James; Keutzer, Kurt; Vasanawala, Shreyas; Lustig, Michael

    2012-01-01

    We present ℓ1-SPIRiT, a simple algorithm for autocalibrating parallel imaging (acPI) and compressed sensing (CS) that permits an efficient implementation with clinically-feasible runtimes. We propose a CS objective function that minimizes cross-channel joint sparsity in the Wavelet domain. Our reconstruction minimizes this objective via iterative soft-thresholding, and integrates naturally with iterative Self-Consistent Parallel Imaging (SPIRiT). Like many iterative MRI reconstructions, ℓ1-SPIRiT’s image quality comes at a high computational cost. Excessively long runtimes are a barrier to the clinical use of any reconstruction approach, and thus we discuss our approach to efficiently parallelizing ℓ1-SPIRiT and to achieving clinically-feasible runtimes. We present parallelizations of ℓ1-SPIRiT for both multi-GPU systems and multi-core CPUs, and discuss the software optimization and parallelization decisions made in our implementation. The performance of these alternatives depends on the processor architecture, the size of the image matrix, and the number of parallel imaging channels. Fundamentally, achieving fast runtime requires the correct trade-off between cache usage and parallelization overheads. We demonstrate image quality via a case from our clinical experimentation, using a custom 3DFT Spoiled Gradient Echo (SPGR) sequence with up to 8× acceleration via Poisson-disc undersampling in the two phase-encoded directions. PMID:22345529
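
    The cross-channel joint-sparsity penalty mentioned above is commonly handled with a group (joint) soft-thresholding proximal step. The snippet below is a generic sketch of that operator, not the ℓ1-SPIRiT GPU/CPU code; the coefficient layout is an assumption.

    ```python
    import numpy as np

    def joint_soft_threshold(W, lam):
        """Joint soft-thresholding of multi-channel coefficients: shrink the
        across-channel l2 norm of each coefficient by lam (generic sketch).
        W has shape (n_channels, n_coeffs), possibly complex-valued."""
        norms = np.sqrt((np.abs(W) ** 2).sum(axis=0, keepdims=True))
        scale = np.maximum(1.0 - lam / np.maximum(norms, 1e-12), 0.0)
        return W * scale

    W = np.random.default_rng(0).normal(size=(8, 1000))   # e.g. 8 coil channels
    W_thr = joint_soft_threshold(W, lam=0.5)
    ```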

  15. Parallel Programming with Intel Parallel Studio XE

    CERN Document Server

    Blair-Chappell , Stephen

    2012-01-01

    Optimize code for multi-core processors with Intel's Parallel Studio. Parallel programming is rapidly becoming a "must-know" skill for developers. Yet, where to start? This teach-yourself tutorial is an ideal starting point for developers who already know Windows C and C++ and are eager to add parallelism to their code. With a focus on applying tools, techniques, and language extensions to implement parallelism, this essential resource teaches you how to write programs for multicore and leverage the power of multicore in your programs. Sharing hands-on case studies and real-world examples, the

  16. Parallelization of elliptic solver for solving 1D Boussinesq model

    Science.gov (United States)

    Tarwidi, D.; Adytia, D.

    2018-03-01

    In this paper, a parallel implementation of an elliptic solver for the 1D Boussinesq model is presented. The numerical solution of the Boussinesq model is obtained by applying a staggered grid scheme to the continuity, momentum, and elliptic equations of the Boussinesq model. The tridiagonal system emerging from the numerical scheme of the elliptic equation is solved by the cyclic reduction algorithm. The parallel implementation of cyclic reduction is executed on multicore processors with shared memory architectures using OpenMP. To measure the performance of the parallel program, the number of grid points is varied from 2^8 to 2^14. Two test cases, the propagation of a solitary wave and of a standing wave, are proposed to evaluate the parallel program. The numerical results are verified against the analytical solutions of the solitary and standing waves. The best speedup for the solitary and standing wave test cases is about 2.07 with 2^14 grid points and 1.86 with 2^13 grid points, respectively, executed using 8 threads. Moreover, the best efficiency of the parallel program is 76.2% and 73.5% for the solitary and standing wave test cases, respectively.
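
    For illustration, the sketch below implements cyclic reduction for a tridiagonal system; within each reduction and back-substitution level the updates are mutually independent, which is what makes the method amenable to the OpenMP parallelization described above. This is a serial NumPy sketch under the assumption that the system size is 2^m - 1; it is not the authors' code.

    ```python
    import numpy as np

    def cyclic_reduction_solve(a, b, c, d):
        """Tridiagonal solve by cyclic reduction.
        a: sub-diagonal (a[0] ignored), b: diagonal, c: super-diagonal
        (c[-1] ignored), d: right-hand side; len(b) must equal 2**m - 1."""
        a, b, c, d = (np.asarray(v, dtype=float).copy() for v in (a, b, c, d))
        a[0] = 0.0
        c[-1] = 0.0
        n = b.size
        levels = int(np.log2(n + 1))
        stride = 1
        for _ in range(levels - 1):
            # equations updated at this level are mutually independent
            idx = np.arange(2 * stride - 1, n, 2 * stride)
            lo, hi = idx - stride, idx + stride
            alpha = a[idx] / b[lo]
            gamma = c[idx] / b[hi]
            b[idx] -= alpha * c[lo] + gamma * a[hi]
            d[idx] -= alpha * d[lo] + gamma * d[hi]
            a[idx] = -alpha * a[lo]
            c[idx] = -gamma * c[hi]
            stride *= 2
        x = np.zeros(n)
        mid = (n - 1) // 2
        x[mid] = d[mid] / b[mid]
        for _ in range(levels - 1):
            # back-substitution: again independent within each level
            stride //= 2
            idx = np.arange(stride - 1, n, 2 * stride)
            lo, hi = idx - stride, idx + stride
            xlo = np.where(lo >= 0, x[np.clip(lo, 0, n - 1)], 0.0)
            xhi = np.where(hi < n, x[np.clip(hi, 0, n - 1)], 0.0)
            x[idx] = (d[idx] - a[idx] * xlo - c[idx] * xhi) / b[idx]
        return x

    n = 2 ** 5 - 1
    a = -np.ones(n); b = 2.0 * np.ones(n); c = -np.ones(n)
    d = np.random.default_rng(0).random(n)
    A = np.diag(b) + np.diag(a[1:], -1) + np.diag(c[:-1], 1)
    print(np.allclose(cyclic_reduction_solve(a, b, c, d), np.linalg.solve(A, d)))  # True
    ```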

  17. Test bank for precalculus functions & graphs

    CERN Document Server

    Kolman, Bernard; Levitan, Michael L

    1984-01-01

    Test Bank for Precalculus: Functions & Graphs is supplementary material for the text, Precalculus: Functions & Graphs. The book is intended for use by mathematics teachers. The book contains standard tests for each chapter in the textbook. Each set of tests focuses on gauging the level of knowledge the student has achieved during the course. The answers for each chapter test and the final exam are found at the end of the book. Mathematics teachers teaching calculus will find the book extremely useful.

  18. FILMPAR: A parallel algorithm designed for the efficient and accurate computation of thin film flow on functional surfaces containing micro-structure

    Science.gov (United States)

    Lee, Y. C.; Thompson, H. M.; Gaskell, P. H.

    2009-12-01

    FILMPAR is a highly efficient and portable parallel multigrid algorithm for solving a discretised form of the lubrication approximation to three-dimensional, gravity-driven, continuous thin film free-surface flow over substrates containing micro-scale topography. While generally applicable to problems involving heterogeneous and distributed features, for illustrative purposes the algorithm is benchmarked on a distributed memory IBM BlueGene/P computing platform for the case of flow over a single trench topography, enabling direct comparison with complementary experimental data and existing serial multigrid solutions. Parallel performance is assessed as a function of the number of processors employed and shown to lead to super-linear behaviour for the production of mesh-independent solutions. In addition, the approach is used to solve for the case of flow over a complex inter-connected topographical feature and a description provided of how FILMPAR could be adapted relatively simply to solve for a wider class of related thin film flow problems. Program summary: Program title: FILMPAR Catalogue identifier: AEEL_v1_0 Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEEL_v1_0.html Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland Licensing provisions: Standard CPC licence, http://cpc.cs.qub.ac.uk/licence/licence.html No. of lines in distributed program, including test data, etc.: 530 421 No. of bytes in distributed program, including test data, etc.: 1 960 313 Distribution format: tar.gz Programming language: C++ and MPI Computer: Desktop, server Operating system: Unix/Linux Mac OS X Has the code been vectorised or parallelised?: Yes. Tested with up to 128 processors RAM: 512 MBytes Classification: 12 External routines: GNU C/C++, MPI Nature of problem: Thin film flows over functional substrates containing well-defined single and complex topographical features are of enormous significance, having a wide variety of engineering

  19. Building a parallel file system simulator

    International Nuclear Information System (INIS)

    Molina-Estolano, E; Maltzahn, C; Brandt, S A; Bent, J

    2009-01-01

    Parallel file systems are gaining in popularity in high-end computing centers as well as commercial data centers. High-end computing systems are expected to scale exponentially and to pose new challenges to their storage scalability in terms of cost and power. To address these challenges, scientists and file system designers will need a thorough understanding of the design space of parallel file systems. Yet there exist few systematic studies of parallel file system behavior at petabyte and exabyte scales. An important reason is the significant cost of getting access to large-scale hardware to test parallel file systems. To contribute to this understanding we are building a parallel file system simulator that can simulate parallel file systems at very large scale. Our goal is to simulate petabyte-scale parallel file systems on a small cluster or even a single machine in reasonable time and fidelity. With this simulator, file system experts will be able to tune existing file systems for specific workloads, scientists and file system deployment engineers will be able to better communicate workload requirements, file system designers and researchers will be able to try out design alternatives and innovations at scale, and instructors will be able to study very large-scale parallel file system behavior in the classroom. In this paper we describe our approach and provide preliminary results that are encouraging both in terms of fidelity and simulation scalability.

  20. Sparse BLIP: BLind Iterative Parallel imaging reconstruction using compressed sensing.

    Science.gov (United States)

    She, Huajun; Chen, Rong-Rong; Liang, Dong; DiBella, Edward V R; Ying, Leslie

    2014-02-01

    To develop a sensitivity-based parallel imaging reconstruction method to reconstruct iteratively both the coil sensitivities and MR image simultaneously based on their prior information. Parallel magnetic resonance imaging reconstruction problem can be formulated as a multichannel sampling problem where solutions are sought analytically. However, the channel functions given by the coil sensitivities in parallel imaging are not known exactly and the estimation error usually leads to artifacts. In this study, we propose a new reconstruction algorithm, termed Sparse BLind Iterative Parallel, for blind iterative parallel imaging reconstruction using compressed sensing. The proposed algorithm reconstructs both the sensitivity functions and the image simultaneously from undersampled data. It enforces the sparseness constraint in the image as done in compressed sensing, but is different from compressed sensing in that the sensing matrix is unknown and additional constraint is enforced on the sensitivities as well. Both phantom and in vivo imaging experiments were carried out with retrospective undersampling to evaluate the performance of the proposed method. Experiments show improvement in Sparse BLind Iterative Parallel reconstruction when compared with Sparse SENSE, JSENSE, IRGN-TV, and L1-SPIRiT reconstructions with the same number of measurements. The proposed Sparse BLind Iterative Parallel algorithm reduces the reconstruction errors when compared to the state-of-the-art parallel imaging methods. Copyright © 2013 Wiley Periodicals, Inc.

  1. Modelling, Simulation and Testing of a Reconfigurable Cable-Based Parallel Manipulator as Motion Aiding System

    Directory of Open Access Journals (Sweden)

    Gianni Castelli

    2010-01-01

    Full Text Available This paper presents results on the modelling, simulation and experimental tests of a cable-based parallel manipulator to be used as an aiding or guiding system for people with motion disabilities. There is a high level of motivation for people with a motion disability or the elderly to perform basic daily-living activities independently. Therefore, it is of great interest to design and implement safe and reliable motion assisting and guiding devices that are able to help end-users. In general, a robot for a medical application should be able to interact with a patient in safety conditions, i.e. it must not damage people or surroundings; it must be designed to guarantee high accuracy and low acceleration during the operation. Furthermore, it should not be too bulky and it should exert limited wrenches after close interaction with people. It can be advisable to have a portable system which can be easily brought into and assembled in a hospital or a domestic environment. Cable-based robotic structures can fulfil those requirements because of their main characteristics that make them light and intrinsically safe. In this paper, a reconfigurable four-cable-based parallel manipulator has been proposed as a motion assisting and guiding device to help people to accomplish a number of tasks, such as an aiding or guiding system to move the upper and lower limbs or the whole body. Modelling and simulation are presented in the ADAMS environment. Moreover, experimental tests are reported as based on an available laboratory prototype.

  2. User's guide of parallel program development environment (PPDE). The 2nd edition

    Energy Technology Data Exchange (ETDEWEB)

    Ueno, Hirokazu; Takemiya, Hiroshi; Imamura, Toshiyuki; Koide, Hiroshi; Matsuda, Katsuyuki; Higuchi, Kenji; Hirayama, Toshio [Center for Promotion of Computational Science and Engineering, Japan Atomic Energy Research Institute, Tokyo (Japan); Ohta, Hirofumi [Hitachi Ltd., Tokyo (Japan)

    2000-03-01

    The STA basic system has been enhanced to accelerate support for parallel programming on heterogeneous parallel computers, through a series of R&D efforts on parallel processing technology. The enhancement has been made by extending the functions of the PPDE, the Parallel Program Development Environment of the STA basic system. The extended PPDE provides functions for: 1) the automatic creation of a makefile and a shell script for its execution, 2) multi-tool execution, which lets tools on heterogeneous computers carry out a task on a computer with a single operation, and 3) mirror composition, which reflects the editing results of a file on one computer into all related files on the other computers. These additional functions improve the efficiency of program development across several computers. More functions have been added to the PPDE to support parallel program development. New functions were also designed to complement an HPF translator and a parallelizing support tool working together, so that a sequential program can be efficiently converted into a parallel program. This report describes the use of the extended PPDE. (author)

  3. Event monitoring of parallel computations

    Directory of Open Access Journals (Sweden)

    Gruzlikov Alexander M.

    2015-06-01

    Full Text Available The paper considers the monitoring of parallel computations for detection of abnormal events. It is assumed that computations are organized according to an event model, and monitoring is based on specific test sequences

  4. CALTRANS: A parallel, deterministic, 3D neutronics code

    Energy Technology Data Exchange (ETDEWEB)

    Carson, L.; Ferguson, J.; Rogers, J.

    1994-04-01

    Our efforts to parallelize the deterministic solution of the neutron transport equation have culminated in a new neutronics code, CALTRANS, which has full 3D capability. In this article, we describe the layout and algorithms of CALTRANS and present performance measurements of the code on a variety of platforms. Explicit implementations of the parallel algorithms of CALTRANS, using both the function calls of the Parallel Virtual Machine software package (PVM 3.2) and the Meiko CS-2 tagged message passing library (based on the Intel NX/2 interface), are provided in appendices.

  5. Parallel and Cooperative Particle Swarm Optimizer for Multimodal Problems

    Directory of Open Access Journals (Sweden)

    Geng Zhang

    2015-01-01

    Full Text Available Although the original particle swarm optimizer (PSO) method and its related variant methods show some effectiveness for solving optimization problems, they may easily get trapped in a local optimum, especially when solving complex multimodal problems. Aiming to solve this issue, this paper puts forward a novel method called the parallel and cooperative particle swarm optimizer (PCPSO). When the elements of the D-dimensional decision vector X = [x1, x2, …, xd, …, xD] do not interact, the cooperative particle swarm optimizer (CPSO) is used. Based on this, the PCPSO is presented to solve real problems. Since in real problems the dimensions cannot be split into several lower-dimensional search spaces because the elements interact, PCPSO exploits the cooperation of two parallel CPSO algorithms through orthogonal experimental design (OED) learning. First, the CPSO algorithm is used to generate two locally optimal vectors separately; then the OED is used to learn the merits of these two vectors and to create a better combination of them for further search. Experimental studies on a set of test functions show that PCPSO exhibits better robustness and converges much closer to the global optimum than several other peer algorithms.
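
    For orientation, the snippet below is a minimal global-best PSO applied to a standard multimodal benchmark (Rastrigin); it is a generic sketch only, not the paper's PCPSO or its OED learning stage, and all parameter values are illustrative assumptions.

    ```python
    import numpy as np

    def pso(f, dim, n_particles=30, iters=200, bounds=(-5.0, 5.0),
            w=0.7, c1=1.5, c2=1.5, seed=0):
        """Minimal global-best particle swarm optimizer (illustrative)."""
        rng = np.random.default_rng(seed)
        lo, hi = bounds
        x = rng.uniform(lo, hi, (n_particles, dim))     # positions
        v = np.zeros_like(x)                            # velocities
        pbest = x.copy()
        pbest_val = np.apply_along_axis(f, 1, x)
        g = pbest[np.argmin(pbest_val)].copy()          # global best position
        g_val = pbest_val.min()
        for _ in range(iters):
            r1, r2 = rng.random(x.shape), rng.random(x.shape)
            v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (g - x)
            x = np.clip(x + v, lo, hi)
            vals = np.apply_along_axis(f, 1, x)
            improved = vals < pbest_val
            pbest[improved], pbest_val[improved] = x[improved], vals[improved]
            if vals.min() < g_val:
                g_val, g = vals.min(), x[vals.argmin()].copy()
        return g, g_val

    # Rastrigin: a common multimodal test function
    rastrigin = lambda x: 10 * len(x) + np.sum(x**2 - 10 * np.cos(2 * np.pi * x))
    best, best_val = pso(rastrigin, dim=10)
    print(best_val)
    ```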

  6. The level 1 and 2 specification for parallel benchmark and a benchmark test of scalar-parallel computer SP2 based on the specifications

    International Nuclear Information System (INIS)

    Orii, Shigeo

    1998-06-01

    A benchmark specification for the performance evaluation of parallel computers for numerical analysis is proposed. The Level 1 benchmark, a conventional benchmark based on processing time, measures the performance of computers running a code. The Level 2 benchmark proposed in this report is intended to explain the reasons for the observed performance. As an example, the scalar-parallel computer SP2 is evaluated with this benchmark specification using a molecular dynamics code. As a result, the main factors limiting parallel performance are found to be the maximum bandwidth and the start-up time of communication between nodes. In particular, the start-up time is proportional not only to the number of processors but also to the number of particles. (author)

  7. Efficient Parallel Kernel Solvers for Computational Fluid Dynamics Applications

    Science.gov (United States)

    Sun, Xian-He

    1997-01-01

    Distributed-memory parallel computers dominate today's parallel computing arena. These machines, such as Intel Paragon, IBM SP2, and Cray Origin2000, have successfully delivered high performance computing power for solving some of the so-called "grand-challenge" problems. Despite initial success, parallel machines have not been widely accepted in production engineering environments due to the complexity of parallel programming. On a parallel computing system, a task has to be partitioned and distributed appropriately among processors to reduce communication cost and to attain load balance. More importantly, even with careful partitioning and mapping, the performance of an algorithm may still be unsatisfactory, since conventional sequential algorithms may be serial in nature and may not be implemented efficiently on parallel machines. In many cases, new algorithms have to be introduced to increase parallel performance. In order to achieve optimal performance, in addition to partitioning and mapping, a careful performance study should be conducted for a given application to find a good algorithm-machine combination. This process, however, is usually painful and elusive. The goal of this project is to design and develop efficient parallel algorithms for highly accurate Computational Fluid Dynamics (CFD) simulations and other engineering applications. The work plan is to 1) develop highly accurate parallel numerical algorithms, 2) conduct preliminary testing to verify the effectiveness and potential of these algorithms, and 3) incorporate newly developed algorithms into actual simulation packages. The work plan has been well achieved. Two highly accurate, efficient Poisson solvers have been developed and tested based on two different approaches: (1) adopting a mathematical geometry which has a better capacity to describe the fluid, and (2) using a compact scheme to gain high-order accuracy in the numerical discretization. The previously developed Parallel Diagonal Dominant (PDD) algorithm

  8. PALNS - A software framework for parallel large neighborhood search

    DEFF Research Database (Denmark)

    Røpke, Stefan

    2009-01-01

    This paper proposes a simple, parallel, portable software framework for the metaheuristic named large neighborhood search (LNS). The aim is to provide a framework where the user has to set up a few data structures and implement a few functions, and then the framework provides a metaheuristic where ... parallelization "comes for free". We apply the parallel LNS heuristic to two different problems: the traveling salesman problem with pickup and delivery (TSPPD) and the capacitated vehicle routing problem (CVRP)....
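
    The "few functions" a user supplies to such a framework are typically a destroy operator and a repair operator. The skeleton below sketches a serial LNS loop with a simulated-annealing acceptance criterion; it is illustrative only and does not reflect PALNS's actual API, which additionally runs many destroy/repair iterations concurrently.

    ```python
    import math
    import random

    def lns(initial, cost, destroy, repair, iters=1000, T0=1.0, cooling=0.999, seed=0):
        """Skeleton large neighborhood search (illustrative, not PALNS).
        `destroy` removes part of a solution, `repair` rebuilds it; both are
        problem-specific callbacks supplied by the user."""
        random.seed(seed)
        current = best = initial
        cur_cost = best_cost = cost(initial)
        T = T0
        for _ in range(iters):
            candidate = repair(destroy(current))
            cand_cost = cost(candidate)
            # accept improving moves, and worsening moves with Boltzmann probability
            if cand_cost < cur_cost or random.random() < math.exp((cur_cost - cand_cost) / T):
                current, cur_cost = candidate, cand_cost
                if cand_cost < best_cost:
                    best, best_cost = candidate, cand_cost
            T *= cooling
        return best, best_cost
    ```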

  9. The kpx, a program analyzer for parallelization

    International Nuclear Information System (INIS)

    Matsuyama, Yuji; Orii, Shigeo; Ota, Toshiro; Kume, Etsuo; Aikawa, Hiroshi.

    1997-03-01

    The kpx is a program analyzer developed as a common technological basis for promoting parallel processing. The kpx consists of three tools. The first is ktool, which shows how much execution time is spent in program segments. The second is ptool, which shows parallelization overhead on the Paragon system. The last is xtool, which shows parallelization overhead on the VPP system. The kpx, designed to work for any FORTRAN code on any UNIX computer, is confirmed to work well after testing on Paragon, SP2, SR2201, VPP500, VPP300, Monte-4, SX-4 and T90. (author)

  10. Parallelization of a blind deconvolution algorithm

    Science.gov (United States)

    Matson, Charles L.; Borelli, Kathy J.

    2006-09-01

    Often it is of interest to deblur imagery in order to obtain higher-resolution images. Deblurring requires knowledge of the blurring function - information that is often not available separately from the blurred imagery. Blind deconvolution algorithms overcome this problem by jointly estimating both the high-resolution image and the blurring function from the blurred imagery. Because blind deconvolution algorithms are iterative in nature, they can take minutes to days to deblur an image depending on how many frames of data are used for the deblurring and the platforms on which the algorithms are executed. Here we present our progress in parallelizing a blind deconvolution algorithm to increase its execution speed. This progress includes sub-frame parallelization and a code structure that is not specialized to a specific computer hardware architecture.

  11. The BLAZE language - A parallel language for scientific programming

    Science.gov (United States)

    Mehrotra, Piyush; Van Rosendale, John

    1987-01-01

    A Pascal-like scientific programming language, BLAZE, is described. BLAZE contains array arithmetic, forall loops, and APL-style accumulation operators, which allow natural expression of fine grained parallelism. It also employs an applicative or functional procedure invocation mechanism, which makes it easy for compilers to extract coarse grained parallelism using machine specific program restructuring. Thus BLAZE should allow one to achieve highly parallel execution on multiprocessor architectures, while still providing the user with conceptually sequential control flow. A central goal in the design of BLAZE is portability across a broad range of parallel architectures. The multiple levels of parallelism present in BLAZE code, in principle, allow a compiler to extract the types of parallelism appropriate for the given architecture while neglecting the remainder. The features of BLAZE are described and it is shown how this language would be used in typical scientific programming.

  12. The BLAZE language: A parallel language for scientific programming

    Science.gov (United States)

    Mehrotra, P.; Vanrosendale, J.

    1985-01-01

    A Pascal-like scientific programming language, Blaze, is described. Blaze contains array arithmetic, forall loops, and APL-style accumulation operators, which allow natural expression of fine grained parallelism. It also employs an applicative or functional procedure invocation mechanism, which makes it easy for compilers to extract coarse grained parallelism using machine specific program restructuring. Thus Blaze should allow one to achieve highly parallel execution on multiprocessor architectures, while still providing the user with conceptually sequential control flow. A central goal in the design of Blaze is portability across a broad range of parallel architectures. The multiple levels of parallelism present in Blaze code, in principle, allow a compiler to extract the types of parallelism appropriate for the given architecture while neglecting the remainder. The features of Blaze are described, and it is shown how this language would be used in typical scientific programming.

  13. How to interpret liver function tests

    Directory of Open Access Journals (Sweden)

    Christina Levick

    2017-05-01

    Full Text Available Careful interpretation of liver function tests within the clinical context can help elucidate the cause and severity of the underlying pathology. Predominantly raised alkaline phosphatase represents the cholestatic pattern of biliary pathology, whilst predominantly raised alanine aminotransferase and aspartate aminotransferase represent the hepatocellular pattern of hepatocellular pathology. The severity of liver dysfunction or biliary obstruction is reflected in the bilirubin level and the degree of liver synthetic function can also be indicated by the albumin level. Beyond the liver function tests, prothrombin time provides another marker of liver synthetic function and a low platelet count suggests portal hypertension.

  14. Simulation Exploration through Immersive Parallel Planes

    Energy Technology Data Exchange (ETDEWEB)

    Brunhart-Lupo, Nicholas J [National Renewable Energy Laboratory (NREL), Golden, CO (United States); Bush, Brian W [National Renewable Energy Laboratory (NREL), Golden, CO (United States); Gruchalla, Kenny M [National Renewable Energy Laboratory (NREL), Golden, CO (United States); Smith, Steve [Los Alamos Visualization Associates

    2017-05-25

    We present a visualization-driven simulation system that tightly couples systems dynamics simulations with an immersive virtual environment to allow analysts to rapidly develop and test hypotheses in a high-dimensional parameter space. To accomplish this, we generalize the two-dimensional parallel-coordinates statistical graphic as an immersive 'parallel-planes' visualization for multivariate time series emitted by simulations running in parallel with the visualization. In contrast to traditional parallel coordinates, which map the multivariate dimensions onto coordinate axes represented by a series of parallel lines, we map pairs of the multivariate dimensions onto a series of parallel rectangles. As in the case of parallel coordinates, each individual observation in the dataset is mapped to a polyline whose vertices coincide with its coordinate values. Regions of the rectangles can be 'brushed' to highlight and select observations of interest: a 'slider' control allows the user to filter the observations by their time coordinate. In an immersive virtual environment, users interact with the parallel planes using a joystick that can select regions on the planes, manipulate selection, and filter time. The brushing and selection actions are used to both explore existing data as well as to launch additional simulations corresponding to the visually selected portions of the input parameter space. As soon as the new simulations complete, their resulting observations are displayed in the virtual environment. This tight feedback loop between simulation and immersive analytics accelerates users' realization of insights about the simulation and its output.

  15. Parallelization of Reversible Ripple-carry Adders

    DEFF Research Database (Denmark)

    Thomsen, Michael Kirkedal; Axelsen, Holger Bock

    2009-01-01

    The design of fast arithmetic logic circuits is an important research topic for reversible and quantum computing. A special challenge in this setting is the computation of standard arithmetical functions without the generation of garbage. Here, we present a novel parallelization scheme ... wherein m parallel k-bit reversible ripple-carry adders are combined to form a reversible mk-bit ripple-block carry adder with logic depth O(m+k), for a minimal logic depth O(√(mk)), thus improving on the mk-bit ripple-carry adder logic depth O(…

  16. Comparison of multihardware parallel implementations for a phase unwrapping algorithm

    Science.gov (United States)

    Hernandez-Lopez, Francisco Javier; Rivera, Mariano; Salazar-Garibay, Adan; Legarda-Sáenz, Ricardo

    2018-04-01

    Phase unwrapping is an important problem in the areas of optical metrology, synthetic aperture radar (SAR) image analysis, and magnetic resonance imaging (MRI) analysis. These images are becoming larger in size and, particularly, the availability and need for processing of SAR and MRI data have increased significantly with the acquisition of remote sensing data and the popularization of magnetic resonators in clinical diagnosis. Therefore, it is important to develop faster and more accurate phase unwrapping algorithms. We propose a parallel multigrid algorithm for a phase unwrapping method named accumulation of residual maps, which builds on a serial algorithm that consists of the minimization of a cost function; the minimization is achieved by means of a serial Gauss-Seidel-type algorithm. Our algorithm also optimizes the original cost function, but unlike the original work, our algorithm is a parallel Jacobi-type scheme with alternated minimizations. This strategy is known as the chessboard type, where red pixels can be updated in parallel at the same iteration since they are independent. Similarly, black pixels can be updated in parallel in an alternating iteration. We present parallel implementations of our algorithm for different parallel multicore architectures such as multicore CPUs, the Xeon Phi coprocessor, and Nvidia graphics processing units. In all cases, we obtain superior performance of our parallel algorithm when compared with the original serial version. In addition, we present a detailed comparative performance of the developed parallel versions.
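
    The chessboard (red-black) idea referred to above can be sketched on a simpler model problem: in a 5-point stencil, points of one colour depend only on points of the other colour, so each colour can be updated fully in parallel. The sweep below is a generic illustration for a Poisson-type equation, not the authors' accumulation-of-residual-maps algorithm.

    ```python
    import numpy as np

    def red_black_sweep(u, f, h=1.0):
        """One red-black Gauss-Seidel sweep for the 5-point discretisation of
        laplacian(u) = f.  All 'red' points (i+j even) are mutually independent
        and can be updated in parallel, then likewise the 'black' points."""
        n, m = u.shape
        i, j = np.meshgrid(np.arange(1, n - 1), np.arange(1, m - 1), indexing="ij")
        for parity in (0, 1):                      # 0 = red, 1 = black
            mask = ((i + j) % 2) == parity
            ii, jj = i[mask], j[mask]
            u[ii, jj] = 0.25 * (u[ii + 1, jj] + u[ii - 1, jj]
                                + u[ii, jj + 1] + u[ii, jj - 1]
                                - h * h * f[ii, jj])
        return u

    u = np.zeros((64, 64))
    f = np.ones((64, 64))
    for _ in range(100):
        red_black_sweep(u, f, h=1.0 / 63)
    ```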

  17. Corral framework: Trustworthy and fully functional data intensive parallel astronomical pipelines

    Science.gov (United States)

    Cabral, J. B.; Sánchez, B.; Beroiz, M.; Domínguez, M.; Lares, M.; Gurovich, S.; Granitto, P.

    2017-07-01

    Data processing pipelines represent an important slice of the astronomical software library that includes chains of processes that transform raw data into valuable information via data reduction and analysis. In this work we present Corral, a Python framework for astronomical pipeline generation. Corral features a Model-View-Controller design pattern on top of an SQL Relational Database capable of handling custom data models, processing stages, and communication alerts, and also provides automatic quality and structural metrics based on unit testing. The Model-View-Controller provides concept separation between the user logic and the data models, delivering at the same time multi-processing and distributed computing capabilities. Corral represents an improvement over commonly found data processing pipelines in astronomy, since the design pattern relieves the programmer from dealing with processing flow and parallelization issues, allowing them to focus on the specific algorithms needed for the successive data transformations, and at the same time provides a broad measure of quality over the created pipeline. Corral and working examples of pipelines that use it are available to the community at https://github.com/toros-astro.

  18. Mesh Partitioning Algorithm Based on Parallel Finite Element Analysis and Its Actualization

    Directory of Open Access Journals (Sweden)

    Lei Zhang

    2013-01-01

    Full Text Available In parallel computing based on finite element analysis, domain decomposition is a key preprocessing technique. Generally, a domain decomposition of a mesh can be realized by partitioning a graph converted from the finite element mesh. This paper discusses methods for graph partitioning and ways to implement mesh partitioning. Relevant software packages are introduced, along with the data structures and key functions of Metis and ParMetis. A mesh-partitioning interface program based on these key functions is written, compiled, and tested. The results reveal characteristics that can guide users of graph partitioning algorithms and software when writing PFEM programs, and ideal partitioning results can be achieved by carrying out mesh partitioning through the program. The interface program can also be used directly by engineering researchers as a module of the PFEM software, lowering the barrier to applying graph partitioning algorithms, improving calculation efficiency, and promoting the application of graph theory and parallel computing.

  19. Stampi: a message passing library for distributed parallel computing. User's guide, second edition

    International Nuclear Information System (INIS)

    Imamura, Toshiyuki; Koide, Hiroshi; Takemiya, Hiroshi

    2000-02-01

    A new message passing library, Stampi, has been developed to enable computations spanning different kinds of parallel computers, with MPI (Message Passing Interface) as the single interface for communication. Stampi is based on the MPI-2 specification, and it realizes dynamic process creation on different machines and communication between spawned processes within the scope of MPI semantics. The main features of Stampi are summarized as follows: (i) automatic switching between external and internal communications, (ii) message routing/relaying with a routing module, (iii) dynamic process creation, (iv) support of two types of connection, master/slave and client/server, and (v) support of communication with Java applets. Vendor MPI libraries are implemented as closed systems within a single parallel machine and support neither process creation on nor communication with external machines; Stampi supports both and thus enables distributed parallel computing. Currently, Stampi has been implemented on COMPACS (COMplex PArallel Computer System) introduced in CCSE, comprising five parallel computers and one graphics workstation, and moreover on eight kinds of parallel machines, fourteen systems in total. Stampi provides MPI communication functionality on all of them. This report mainly describes the usage of Stampi. (author)
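
    Stampi's dynamic process creation follows MPI-2 semantics. The generic mpi4py example below (not Stampi's own API) shows the standard MPI-2 spawn/intercommunicator pattern on a single system; the file names and the toy workload are illustrative assumptions.

    ```python
    # parent.py -- spawn workers and collect a reduced result over the intercommunicator
    from mpi4py import MPI
    import numpy as np

    comm = MPI.COMM_SELF.Spawn("python3", args=["worker.py"], maxprocs=4)
    n = np.array(100, dtype="i")
    comm.Bcast([n, MPI.INT], root=MPI.ROOT)              # send work size to all workers
    result = np.zeros(1)
    comm.Reduce(None, [result, MPI.DOUBLE], op=MPI.SUM, root=MPI.ROOT)
    print(result[0])
    comm.Disconnect()

    # worker.py -- each spawned process handles its share and reduces back to the parent
    # from mpi4py import MPI
    # import numpy as np
    # comm = MPI.Comm.Get_parent()
    # rank, size = comm.Get_rank(), comm.Get_size()
    # n = np.array(0, dtype="i")
    # comm.Bcast([n, MPI.INT], root=0)                   # receive work size from the parent
    # local = np.array(sum(1.0 / (k + 1) for k in range(rank, int(n), size)))
    # comm.Reduce([local, MPI.DOUBLE], None, op=MPI.SUM, root=0)
    # comm.Disconnect()
    ```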

  20. Reliability allocation problem in a series-parallel system

    International Nuclear Information System (INIS)

    Yalaoui, Alice; Chu, Chengbin; Chatelet, Eric

    2005-01-01

    In order to improve system reliability, designers may introduce into a system different technologies in parallel. When each technology is composed of components in series, the configuration belongs to the series-parallel systems. This type of system has not been studied as much as the parallel-series architecture, and there exist no methods dedicated to reliability allocation in series-parallel systems with different technologies. We propose in this paper theoretical and practical results for the allocation problem in a series-parallel system. Two resolution approaches are developed. First, a one-stage problem is studied and the results are exploited for the multi-stage problem. A theoretical condition for obtaining the optimal allocation is developed. Since this condition is too restrictive, we then propose an alternative approach based on an approximated function and the results of the one-stage study. This second approach is applied to numerical examples
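
    For the configuration described above (redundant technologies in parallel, each technology being a series chain of components), the system reliability follows directly from the component reliabilities, as in this small sketch; it is illustrative only, while the paper's contribution is how to allocate reliabilities within such an expression.

    ```python
    from math import prod

    def series_parallel_reliability(branches):
        """Reliability of parallel redundant branches, each branch a series chain.
        `branches` is a list of lists of component reliabilities."""
        branch_rel = [prod(r) for r in branches]                  # series: all components must work
        return 1.0 - prod(1.0 - R for R in branch_rel)            # parallel: at least one branch works

    # two technologies in parallel, each a series of components
    print(series_parallel_reliability([[0.95, 0.9, 0.99], [0.9, 0.9]]))
    ```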

  1. Parallel computational in nuclear group constant calculation

    International Nuclear Information System (INIS)

    Su'ud, Zaki; Rustandi, Yaddi K.; Kurniadi, Rizal

    2002-01-01

    In this paper, a parallel computational method for nuclear group constant calculation using the collision probability method is discussed. The main focus is on the calculation of the collision probability matrix, which needs a large amount of computational time. The geometry treated here is that of concentric cylinders. The collision probability matrix is calculated with a semi-analytic method based on the Bickley-Naylor function. To accelerate the computation, several computers are used in parallel to solve the problem. Under Linux, parallelization is based on the PVM software with C or Fortran, while under Windows socket programming with Delphi or C++ Builder is used. The calculation results show the importance of assigning an optimal weight to each processor when the processors have different speeds
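
    The remark about optimal per-processor weights amounts to distributing work in proportion to processor speed. A naive proportional split is sketched below; it is an assumption-laden illustration, not the authors' weighting scheme.

    ```python
    def split_by_speed(n_tasks, speeds):
        """Distribute n_tasks among processors proportionally to their relative speeds."""
        total = sum(speeds)
        shares = [int(n_tasks * s / total) for s in speeds]
        # hand out any remainder to the fastest processors first
        for i in sorted(range(len(speeds)), key=lambda i: -speeds[i]):
            if sum(shares) == n_tasks:
                break
            shares[i] += 1
        return shares

    print(split_by_speed(1000, [1.0, 1.0, 2.0, 0.5]))   # [222, 222, 445, 111]
    ```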

  2. General upper bounds on the runtime of parallel evolutionary algorithms.

    Science.gov (United States)

    Lässig, Jörg; Sudholt, Dirk

    2014-01-01

    We present a general method for analyzing the runtime of parallel evolutionary algorithms with spatially structured populations. Based on the fitness-level method, it yields upper bounds on the expected parallel runtime. This allows for a rigorous estimate of the speedup gained by parallelization. Tailored results are given for common migration topologies: ring graphs, torus graphs, hypercubes, and the complete graph. Example applications for pseudo-Boolean optimization show that our method is easy to apply and that it gives powerful results. In our examples the performance guarantees improve with the density of the topology. Surprisingly, even sparse topologies such as ring graphs lead to a significant speedup for many functions while not increasing the total number of function evaluations by more than a constant factor. We also identify which number of processors leads to the best guaranteed speedups, thus giving hints on how to parameterize parallel evolutionary algorithms.
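
    As a toy instance of such spatially structured populations, the sketch below runs independent (1+1) EAs on OneMax with periodic migration along a ring topology. It is illustrative only; it does not reproduce the paper's fitness-level analysis, and all parameters are assumptions.

    ```python
    import random

    def parallel_one_plus_one_ea(n=100, islands=8, migrate_every=50, max_gens=20000, seed=1):
        """Island-model (1+1) EA on OneMax with a ring migration topology.
        Returns the generation at which some island first finds the optimum."""
        random.seed(seed)
        pop = [[random.randint(0, 1) for _ in range(n)] for _ in range(islands)]
        fit = [sum(x) for x in pop]
        for gen in range(1, max_gens + 1):
            for i in range(islands):            # islands evolve independently (parallelisable)
                child = [b ^ (random.random() < 1.0 / n) for b in pop[i]]
                cf = sum(child)
                if cf >= fit[i]:
                    pop[i], fit[i] = child, cf
            if gen % migrate_every == 0:        # ring migration: adopt the left neighbour if better
                snapshot = list(fit)
                for i in range(islands):
                    j = (i - 1) % islands
                    if snapshot[j] > fit[i]:
                        pop[i], fit[i] = pop[j][:], snapshot[j]
            if max(fit) == n:
                return gen
        return max_gens

    print(parallel_one_plus_one_ea())
    ```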

  3. Programming parallel architectures: The BLAZE family of languages

    Science.gov (United States)

    Mehrotra, Piyush

    1988-01-01

    Programming multiprocessor architectures is a critical research issue. An overview is given of the various approaches to programming these architectures that are currently being explored. It is argued that two of these approaches, interactive programming environments and functional parallel languages, are particularly attractive since they remove much of the burden of exploiting parallel architectures from the user. Also described is recent work by the author in the design of parallel languages. Research on languages for both shared and nonshared memory multiprocessors is described, as well as the relations of this work to other current language research projects.

  4. MCBooster: a tool for MC generation for massively parallel platforms

    CERN Multimedia

    Alves Junior, Antonio Augusto

    2016-01-01

    MCBooster is a header-only, C++11-compliant library for the generation of large samples of phase-space Monte Carlo events on massively parallel platforms. It was released on GitHub in the spring of 2016. The library core algorithms implement the Raubold-Lynch method; they are able to generate the full kinematics of decays with up to nine particles in the final state. The library supports the generation of sequential decays as well as the parallel evaluation of arbitrary functions over the generated events. The output of MCBooster completely accords with popular and well-tested software packages such as GENBOD (W515 from CERNLIB) and TGenPhaseSpace from the ROOT framework. MCBooster is developed on top of the Thrust library and runs on Linux systems. It deploys transparently on NVidia CUDA-enabled GPUs as well as multicore CPUs. This contribution summarizes the main features of MCBooster. A basic description of the user interface and some examples of applications are provided, along with measurements of perfor...

  5. Development of Industrial High-Speed Transfer Parallel Robot

    International Nuclear Information System (INIS)

    Kim, Byung In; Kyung, Jin Ho; Do, Hyun Min; Jo, Sang Hyun

    2013-01-01

    Parallel robots used in industry require high stiffness or high speed because of their structural characteristics. Nowadays, the importance of rapid transportation has increased in the distribution industry. In this light, an industrial parallel robot has been developed for high-speed transfer. The developed parallel robot can handle a maximum payload of 3 kg. For a payload of 0.1 kg, the trajectory cycle time is 0.3 s (come and go), and the maximum velocity is 4.5 m/s (pick & place work, adept cycle). In this motion, its maximum acceleration is very high and reaches approximately 13g. In this paper, the design, analysis, and performance test results of the developed parallel robot system are introduced

  6. Algorithms for computational fluid dynamics on parallel processors

    International Nuclear Information System (INIS)

    Van de Velde, E.F.

    1986-01-01

    A study of parallel algorithms for the numerical solution of partial differential equations arising in computational fluid dynamics is presented. The actual implementation on parallel processors of shared and nonshared memory design is discussed. The performance of these algorithms is analyzed in terms of machine efficiency, communication time, bottlenecks and software development costs. For elliptic equations, a parallel preconditioned conjugate gradient method is described, which has been used to solve pressure equations discretized with high order finite elements on irregular grids. A parallel full multigrid method and a parallel fast Poisson solver are also presented. Hyperbolic conservation laws were discretized with parallel versions of finite difference methods like the Lax-Wendroff scheme and with the Random Choice method. Techniques are developed for comparing the behavior of an algorithm on different architectures as a function of problem size and local computational effort. Effective use of these advanced architecture machines requires the use of machine dependent programming. It is shown that the portability problems can be minimized by introducing high level operations on vectors and matrices structured into program libraries

  7. The simplified spherical harmonics (SPL) methodology with space and moment decomposition in parallel environments

    International Nuclear Information System (INIS)

    Gianluca, Longoni; Alireza, Haghighat

    2003-01-01

    In recent years, the SPL (simplified spherical harmonics) equations have received renewed interest for the simulation of nuclear systems. We have derived the SPL equations starting from the even-parity form of the SN equations. The SPL equations form a system of (L+1)/2 second-order partial differential equations that can be solved with standard iterative techniques such as the Conjugate Gradient (CG) method. We discretized the SPL equations with the finite-volume approach in a 3-D Cartesian space. We developed a new 3-D general code, PenspL (Parallel Environment Neutral-particle SPL). PenspL solves both fixed source and criticality eigenvalue problems. In order to optimize the memory management, we implemented a Compressed Diagonal Storage (CDS) scheme to store the SPL matrices. PenspL includes parallel algorithms for space and moment domain decomposition. The computational load is distributed on different processors, using a mapping function which maps the 3-D Cartesian space and moments onto processors. The code is written in Fortran 90 using the Message Passing Interface (MPI) libraries for the parallel implementation of the algorithm. The code has been tested on the Pcpen cluster and the parallel performance has been assessed in terms of speed-up and parallel efficiency. (author)
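
    The iterative kernel mentioned above is the conjugate gradient method; a textbook serial, dense-matrix CG is sketched below for reference. The actual solver distributes these operations over processors and stores the matrices in compressed diagonal form, which is not reproduced here.

    ```python
    import numpy as np

    def conjugate_gradient(A, b, tol=1e-10, max_iter=1000):
        """Textbook conjugate gradient for symmetric positive-definite systems."""
        x = np.zeros_like(b)
        r = b - A @ x
        p = r.copy()
        rs = r @ r
        for _ in range(max_iter):
            Ap = A @ p
            alpha = rs / (p @ Ap)
            x += alpha * p
            r -= alpha * Ap
            rs_new = r @ r
            if np.sqrt(rs_new) < tol:
                break
            p = r + (rs_new / rs) * p
            rs = rs_new
        return x

    A = np.array([[4.0, 1.0], [1.0, 3.0]])
    b = np.array([1.0, 2.0])
    print(conjugate_gradient(A, b))   # approximately [0.0909, 0.6364]
    ```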

  8. Application of parallel preprocessors in data acquisition

    International Nuclear Information System (INIS)

    Butler, H.S.; Cooper, M.D.; Williams, R.A.; Hughes, E.B.; Rolfe, J.R.; Wilson, S.L.; Zeman, H.D.

    1981-01-01

    A data-acquisition system is being developed for a large-scale experiment at LAMPF. It will make use of four microprocessors running in parallel to acquire and preprocess data from 432 photomultiplier tubes (PMT) attached to 396 NaI crystals. The microprocessors are LSI-11/23s operating through CAMAC Auxiliary Crate Controllers (ACC). Data acquired by the microprocessors will be collected through a programmable Branch Driver (MBD) which also will read data from 52 scintillators (88 PMTs) and 728 wires comprising a drift chamber. The MBD will transfer data from each event into a PDP-11/44 for further processing and taping. The microprocessors will perform the secondary function of monitoring the calibration of the NaI PMTs. A special trigger circuit allows the system to stack data from a second event while the first is still being processed. Major components of the system were tested in April 1981. Timing measurements from this test are reported

  9. Darwin's concepts in a test tube: parallels between organismal and in vitro evolution.

    Science.gov (United States)

    Díaz Arenas, Carolina; Lehman, Niles

    2009-02-01

    The evolutionary process as imagined by Darwin 150 years ago is evident not only in nature but also in the manner in which naked nucleic acids and proteins experience the "survival of the fittest" in the test tube during in vitro evolution. This review highlights some of the most apparent evolutionary patterns, such as directional selection, purifying selection, disruptive selection, and iterative evolution (recurrence), and draws parallels between what happens in the wild with whole organisms and what happens in the lab with molecules. Advances in molecular selection techniques, particularly with catalytic RNAs and DNAs, have accelerated in the last 20 years to the point where soon any sort of complex differential hereditary event that one can ascribe to natural populations will be observable in molecular populations, and exploitation of these events can even lead to practical applications in some cases.

  10. Attempt to identify the functional areas of the cerebral cortex on CT slices parallel to the orbito-meatal line

    Energy Technology Data Exchange (ETDEWEB)

    Tanabe, Hirotaka; Okuda, Junichiro; Nishikawa, Takashi; Nishimura, Tsuyoshi (Osaka Univ. (Japan). Faculty of Medicine); Shiraishi, Junzo

    1982-06-01

    In order to identify the functional brain areas, such as Broca's area, on computed tomography slices parallel to the orbito-meatal line, the numbers of Brodmann's cortical mapping were shown on a diagram of representative brain sections parallel to the orbito-meatal line. Also, we described a method, using cerebral sulci as anatomical landmarks, for projecting lesions shown by CT scan onto the lateral brain diagram. The procedures were as follows. The distribution of lesions on CT slices was determined by the identification of major cerebral sulci and fissures, such as the Sylvian fissure, the central sulcus, and the superior frontal sulcus. Those lesions were then projected onto the lateral diagram by comparing each CT slice with the horizontal diagrams of brain sections. The method was demonstrated in three cases developing neuropsychological symptoms.

  11. Parallelization of ITOUGH2 using PVM

    International Nuclear Information System (INIS)

    Finsterle, Stefan

    1998-01-01

    ITOUGH2 inversions are computationally intensive because the forward problem must be solved many times to evaluate the objective function for different parameter combinations or to numerically calculate sensitivity coefficients. Most of these forward runs are independent from each other and can therefore be performed in parallel. Message passing based on the Parallel Virtual Machine (PVM) system has been implemented into ITOUGH2 to enable parallel processing of ITOUGH2 jobs on a heterogeneous network of Unix workstations. This report describes the PVM system and its implementation into ITOUGH2. Instructions are given for installing PVM, compiling ITOUGH2-PVM for use on a workstation cluster, preparing an ITOUGH2 input file under PVM, and executing an ITOUGH2-PVM application. Examples are discussed, demonstrating the use of ITOUGH2-PVM

  12. The R package "sperrorest" : Parallelized spatial error estimation and variable importance assessment for geospatial machine learning

    Science.gov (United States)

    Schratz, Patrick; Herrmann, Tobias; Brenning, Alexander

    2017-04-01

    Computational and statistical prediction methods such as the support vector machine have gained popularity in remote-sensing applications in recent years and are often compared to more traditional approaches like maximum-likelihood classification. However, the accuracy assessment of such predictive models in a spatial context needs to account for the presence of spatial autocorrelation in geospatial data by using spatial cross-validation and bootstrap strategies instead of their non-spatial equivalents, which are currently more widely used. The R package sperrorest by A. Brenning [IEEE International Geoscience and Remote Sensing Symposium, 1, 374 (2012)] provides a generic interface for performing (spatial) cross-validation of any statistical or machine-learning technique available in R. Since spatial statistical models as well as flexible machine-learning algorithms can be computationally expensive, parallel computing strategies are required to perform cross-validation efficiently. The most recent major release of sperrorest therefore comes with two new features (aside from improved documentation): The first one is the parallelized version of sperrorest(), parsperrorest(). This function features two parallel modes to greatly speed up cross-validation runs. Both parallel modes are platform independent and provide progress information. par.mode = 1 relies on the pbapply package and calls, depending on the platform, parallel::mclapply() or parallel::parApply() in the background. While forking is used on Unix systems, Windows systems use a cluster approach for parallel execution. par.mode = 2 uses the foreach package to perform parallelization. This method uses a different way of cluster parallelization than the parallel package does. In summary, the robustness of parsperrorest() is increased with the implementation of two independent parallel modes. A new way of partitioning the data in sperrorest is provided by partition.factor.cv(). This function gives the user the
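
    sperrorest and parsperrorest() are R functions, so the snippet below is not their API; it is only a hedged Python sketch of the general idea of running cross-validation repetitions in parallel worker processes. The fold construction and error values are placeholders.

    # Illustration only: run several (spatial) cross-validation repetitions in
    # parallel and average the resulting error estimates.
    from concurrent.futures import ProcessPoolExecutor
    import random

    def one_repetition(seed, n_samples=100, k=5):
        """Hypothetical repetition: shuffle sample indices into k folds and
        return one placeholder error value per fold."""
        rng = random.Random(seed)
        indices = list(range(n_samples))
        rng.shuffle(indices)
        folds = [indices[i::k] for i in range(k)]
        return [rng.random() for _ in folds]  # placeholder fold errors

    if __name__ == "__main__":
        with ProcessPoolExecutor() as pool:
            results = list(pool.map(one_repetition, range(10)))
        mean_error = sum(sum(r) for r in results) / sum(len(r) for r in results)
        print("mean cross-validation error (toy):", mean_error)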

  13. GaAs mixed signal multi-function X-band MMIC with 7 bit phase and amplitude control and integrated serial to parallel converter

    NARCIS (Netherlands)

    Boer, A. de; Mouthaan, K.

    2000-01-01

    The design and measured performance of a GaAs multi-function X-band MMIC for spacebased synthetic aperture radar (SAR) applications with 7-bit phase and amplitude control and integrated serial to parallel converter (including level conversion) is presented. The main application for the

  14. Identifying failure in a tree network of a parallel computer

    Science.gov (United States)

    Archer, Charles J.; Pinnow, Kurt W.; Wallenfelt, Brian P.

    2010-08-24

    Methods, parallel computers, and products are provided for identifying failure in a tree network of a parallel computer. The parallel computer includes one or more processing sets including an I/O node and a plurality of compute nodes. For each processing set embodiments include selecting a set of test compute nodes, the test compute nodes being a subset of the compute nodes of the processing set; measuring the performance of the I/O node of the processing set; measuring the performance of the selected set of test compute nodes; calculating a current test value in dependence upon the measured performance of the I/O node of the processing set, the measured performance of the set of test compute nodes, and a predetermined value for I/O node performance; and comparing the current test value with a predetermined tree performance threshold. If the current test value is below the predetermined tree performance threshold, embodiments include selecting another set of test compute nodes. If the current test value is not below the predetermined tree performance threshold, embodiments include selecting from the test compute nodes one or more potential problem nodes and testing individually potential problem nodes and links to potential problem nodes.
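
    The Python sketch below paraphrases the selection loop described in the abstract; the measurement functions, the way the current test value is combined, and the criterion for flagging potential problem nodes are all hypothetical stand-ins, not the patented method itself.

    import random

    def measure_io_node():
        return random.uniform(0.8, 1.0)        # hypothetical I/O-node performance

    def measure_compute_nodes(nodes):
        return {n: random.uniform(0.5, 1.0) for n in nodes}

    def find_potential_problem_nodes(compute_nodes, expected_io=1.0,
                                     tree_threshold=0.75, subset_size=4,
                                     max_rounds=10):
        for _ in range(max_rounds):
            test_nodes = random.sample(compute_nodes, subset_size)
            io_perf = measure_io_node()
            node_perf = measure_compute_nodes(test_nodes)
            # one plausible way to combine the measurements into a test value
            test_value = (io_perf / expected_io) * min(node_perf.values())
            if test_value < tree_threshold:
                continue                        # pick another set of test nodes
            # otherwise flag the slowest test nodes for individual testing
            worst = min(node_perf.values())
            return [n for n, p in node_perf.items() if p == worst]
        return []

    print(find_potential_problem_nodes(list(range(32))))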

  15. Practical parallel computing

    CERN Document Server

    Morse, H Stephen

    1994-01-01

    Practical Parallel Computing provides information pertinent to the fundamental aspects of high-performance parallel processing. This book discusses the development of parallel applications on a variety of equipment.Organized into three parts encompassing 12 chapters, this book begins with an overview of the technology trends that converge to favor massively parallel hardware over traditional mainframes and vector machines. This text then gives a tutorial introduction to parallel hardware architectures. Other chapters provide worked-out examples of programs using several parallel languages. Thi

  16. Parallel simulated annealing algorithms for cell placement on hypercube multiprocessors

    Science.gov (United States)

    Banerjee, Prithviraj; Jones, Mark Howard; Sargent, Jeff S.

    1990-01-01

    Two parallel algorithms for standard cell placement using simulated annealing are developed to run on distributed-memory message-passing hypercube multiprocessors. The cells can be mapped in a two-dimensional area of a chip onto processors in an n-dimensional hypercube in two ways, such that both small and large cell exchange and displacement moves can be applied. The computation of the cost function in parallel among all the processors in the hypercube is described, along with a distributed data structure that needs to be stored in the hypercube to support the parallel cost evaluation. A novel tree broadcasting strategy is used extensively for updating cell locations in the parallel environment. A dynamic parallel annealing schedule estimates the errors due to interacting parallel moves and adapts the rate of synchronization automatically. Two novel approaches in controlling error in parallel algorithms are described: heuristic cell coloring and adaptive sequence control.
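
    The sketch below is a loose analogue of one idea above: several candidate cell-exchange moves are evaluated in parallel, and only accepted moves that do not interact with one another are applied. The toy cost function, move generation, and cooling schedule are hypothetical; this is not the hypercube implementation described in the paper.

    from concurrent.futures import ProcessPoolExecutor
    import math, random

    def move_delta(args):
        placement, i, j = args
        def cost(p):  # toy cost: keep consecutive cells close together
            return sum(abs(p[k] - p[k - 1]) for k in range(1, len(p)))
        trial = placement[:]
        trial[i], trial[j] = trial[j], trial[i]
        return i, j, cost(trial) - cost(placement)

    if __name__ == "__main__":
        random.seed(0)
        placement = random.sample(range(32), 32)
        temperature = 10.0
        with ProcessPoolExecutor() as pool:
            for _ in range(50):
                moves = [(placement, random.randrange(32), random.randrange(32))
                         for _ in range(8)]
                touched = set()
                for i, j, delta in pool.map(move_delta, moves):
                    accept = delta < 0 or random.random() < math.exp(-delta / temperature)
                    if accept and i not in touched and j not in touched:
                        placement[i], placement[j] = placement[j], placement[i]
                        touched.update((i, j))
                temperature *= 0.95             # simple cooling schedule
        print("annealing finished; placement length:", len(placement))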

  17. Simulation of partially coherent light propagation using parallel computing devices

    Science.gov (United States)

    Magalhães, Tiago C.; Rebordão, José M.

    2017-08-01

    Light acquires or loses coherence, and coherence is one of the few optical observables. Spectra can be derived from coherence functions, and understanding any interferometric experiment also relies upon coherence functions. Beyond the two limiting cases (full coherence or incoherence) the coherence of light is always partial and it changes with propagation. We have implemented a code to compute the propagation of partially coherent light from the source plane to the observation plane using parallel computing devices (PCDs). In this paper, we restrict the propagation to free space only. To this end, we used the Open Computing Language (OpenCL) and the open-source toolkit PyOpenCL, which gives access to OpenCL parallel computation through Python. To test our code, we chose two coherence source models: an incoherent source and a Gaussian Schell-model source. In the former case, we considered two different source shapes: circular and rectangular. The results were compared to the theoretical values. Our implemented code allows one to choose between the PyOpenCL implementation and a standard one, i.e. using the CPU only. To test the computation time for each implementation (PyOpenCL and standard), we used several computer systems with different CPUs and GPUs. We used powers of two for the dimensions of the cross-spectral density matrix (e.g. 32^4, 64^4) and a significant speed increase is observed in the PyOpenCL implementation when compared to the standard one. This can be an important tool for studying new source models.
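
    The authors' propagation code is not reproduced here; purely as a minimal, generic example of the PyOpenCL dispatch pattern they rely on, the sketch below builds a trivial kernel (squaring an array) and runs it on whatever OpenCL device is available. Array size and kernel are arbitrary choices.

    import numpy as np
    import pyopencl as cl

    ctx = cl.create_some_context()
    queue = cl.CommandQueue(ctx)

    a = np.random.rand(1 << 20).astype(np.float32)
    mf = cl.mem_flags
    a_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=a)
    out_buf = cl.Buffer(ctx, mf.WRITE_ONLY, a.nbytes)

    program = cl.Program(ctx, """
    __kernel void square(__global const float *a, __global float *out) {
        int gid = get_global_id(0);
        out[gid] = a[gid] * a[gid];
    }
    """).build()

    program.square(queue, a.shape, None, a_buf, out_buf)

    result = np.empty_like(a)
    cl.enqueue_copy(queue, result, out_buf)
    assert np.allclose(result, a * a)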

  18. Parallel rendering

    Science.gov (United States)

    Crockett, Thomas W.

    1995-01-01

    This article provides a broad introduction to the subject of parallel rendering, encompassing both hardware and software systems. The focus is on the underlying concepts and the issues which arise in the design of parallel rendering algorithms and systems. We examine the different types of parallelism and how they can be applied in rendering applications. Concepts from parallel computing, such as data decomposition, task granularity, scalability, and load balancing, are considered in relation to the rendering problem. We also explore concepts from computer graphics, such as coherence and projection, which have a significant impact on the structure of parallel rendering algorithms. Our survey covers a number of practical considerations as well, including the choice of architectural platform, communication and memory requirements, and the problem of image assembly and display. We illustrate the discussion with numerous examples from the parallel rendering literature, representing most of the principal rendering methods currently used in computer graphics.

  19. Semen analysis and sperm function tests: How much to test?

    Directory of Open Access Journals (Sweden)

    S S Vasan

    2011-01-01

    Semen analysis as an integral part of infertility investigations is taken as a surrogate measure for male fecundity in clinical andrology, male fertility, and pregnancy risk assessments. Clearly, laboratory seminology is still very much in its infancy. Inasmuch as the creation of a conventional semen profile will always represent the foundations of male fertility evaluation, the 5th edition of the World Health Organization (WHO) manual is a definitive statement on how such assessments should be carried out and how the quality should be controlled. A major advance in this new edition of the WHO manual, resolving the most salient critique of previous editions, is the development of the first well-defined reference ranges for semen analysis based on the analysis of over 1900 recent fathers. The methodology used in the assessment of the usual variables in semen analysis is described, as are many of the less common, but very valuable, sperm function tests. Sperm function testing is used to determine if the sperm have the biologic capacity to perform the tasks necessary to reach and fertilize ova and ultimately result in live births. A variety of tests are available to evaluate different aspects of these functions. To accurately use these functional assays, the clinician must understand what the tests measure, what the indications are for the assays, and how to interpret the results to direct further testing or patient management.

  20. Nonparametric tests for equality of psychometric functions.

    Science.gov (United States)

    García-Pérez, Miguel A; Núñez-Antón, Vicente

    2017-12-07

    Many empirical studies measure psychometric functions (curves describing how observers' performance varies with stimulus magnitude) because these functions capture the effects of experimental conditions. To assess these effects, parametric curves are often fitted to the data and comparisons are carried out by testing for equality of mean parameter estimates across conditions. This approach is parametric and, thus, vulnerable to violations of the implied assumptions. Furthermore, testing for equality of means of parameters may be misleading: Psychometric functions may vary meaningfully across conditions on an observer-by-observer basis with no effect on the mean values of the estimated parameters. Alternative approaches to assess equality of psychometric functions per se are thus needed. This paper compares three nonparametric tests that are applicable in all situations of interest: The existing generalized Mantel-Haenszel test, a generalization of the Berry-Mielke test that was developed here, and a split variant of the generalized Mantel-Haenszel test also developed here. Their statistical properties (accuracy and power) are studied via simulation and the results show that all tests are indistinguishable as to accuracy but they differ non-uniformly as to power. Empirical use of the tests is illustrated via analyses of published data sets and practical recommendations are given. The computer code in MATLAB and R to conduct these tests is available as Electronic Supplemental Material.

  1. Parallel algorithms for interactive manipulation of digital terrain models

    Science.gov (United States)

    Davis, E. W.; Mcallister, D. F.; Nagaraj, V.

    1988-01-01

    Interactive three-dimensional graphics applications, such as terrain data representation and manipulation, require extensive arithmetic processing. Massively parallel machines are attractive for this application since they offer high computational rates, and grid-connected architectures provide a natural mapping for grid-based terrain models. Presented here are algorithms for data movement on the massively parallel processor (MPP) in support of pan and zoom functions over large data grids. It is an extension of earlier work that demonstrated real-time performance of graphics functions on grids that were equal in size to the physical dimensions of the MPP. When the dimensions of a data grid exceed the processing array size, data is packed in the array memory. Windows of the total data grid are interactively selected for processing. Movement of packed data is needed to distribute items across the array for efficient parallel processing. Execution time for data movement was found to exceed that for arithmetic aspects of graphics functions. Performance figures are given for routines written in MPP Pascal.
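
    As a small illustration of the packed-data idea (numpy here, not the MPP Pascal routines described above), the sketch below selects a window from a larger terrain grid and splits it into per-processor blocks; all sizes are arbitrary examples.

    import numpy as np

    terrain = np.random.rand(1024, 1024)       # full elevation grid
    array_dim = 128                            # processing-array edge size

    def window_blocks(grid, row0, col0, size, array_dim):
        """Extract a size x size window and split it into array_dim x array_dim
        blocks, one block per processing element."""
        window = grid[row0:row0 + size, col0:col0 + size]
        per_edge = size // array_dim
        return (window.reshape(per_edge, array_dim, per_edge, array_dim)
                      .swapaxes(1, 2))

    blocks = window_blocks(terrain, 256, 256, 512, array_dim)
    print(blocks.shape)                        # (4, 4, 128, 128)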

  2. Requirements for implementing real-time control functional modules on a hierarchical parallel pipelined system

    Science.gov (United States)

    Wheatley, Thomas E.; Michaloski, John L.; Lumia, Ronald

    1989-01-01

    Analysis of a robot control system leads to a broad range of processing requirements. One fundamental requirement of a robot control system is the necessity of a microcomputer system in order to provide sufficient processing capability. The use of multiple processors in a parallel architecture is beneficial for a number of reasons, including better cost performance, modular growth, increased reliability through replication, and flexibility for testing alternate control strategies via different partitioning. A survey of the progression from low level control synchronizing primitives to higher level communication tools is presented. The system communication and control mechanisms of existing robot control systems are compared to the hierarchical control model. The impact of this design methodology on the current robot control systems is explored.

  3. Reliability of a functional test battery evaluating functionality, proprioception, and strength in recreational athletes with functional ankle instability.

    Science.gov (United States)

    Sekir, U; Yildiz, Y; Hazneci, B; Ors, F; Saka, T; Aydin, T

    2008-12-01

    In contrast to the single evaluation methods used in the past, the combination of multiple tests allows one to obtain a global assessment of the ankle joint. The aim of this study was to determine the reliability of the different tests in a functional test battery. Twenty-four male recreational athletes with unilateral functional ankle instability (FAI) were recruited for this study. One component of the test battery included five different functional ability tests. These tests included a single-limb hopping course, single-legged and triple-legged hop for distance, and six-meter and cross six-meter hop for time. The ankle joint position sense and one-leg standing test were used for evaluation of proprioception and sensorimotor control. The isokinetic strengths of the ankle invertor and evertor muscles were evaluated at a velocity of 120 degrees/s. The reliability of the test battery was assessed by calculating the intraclass correlation coefficient (ICC). Each subject was tested two times, with an interval of 3-5 days between the test sessions. The ICCs for ankle functional and proprioceptive ability showed high reliability (ICCs ranging from 0.94 to 0.98). Additionally, isokinetic ankle joint inversion and eversion strength measurements represented good to high reliability (ICCs between 0.82 and 0.98). The functional test battery investigated in this study proved to be a reliable tool for the assessment of athletes with functional ankle instability. Therefore, clinicians may obtain reliable information from the functional test battery during the assessment of ankle joint performance in patients with functional ankle instability.

  4. An attempt to identify the functional areas of the cerebral cortex on CT slices parallel to the orbito-meatal line

    International Nuclear Information System (INIS)

    Tanabe, Hirotaka; Okuda, Junichiro; Nishikawa, Takashi; Nishimura, Tsuyoshi; Shiraishi, Junzo.

    1982-01-01

    In order to identify the functional brain areas, such as Broca's area, on computed tomography slices parallel to the orbito-meatal line, the numbers of Brodmann's cortical mapping were shown on a diagram of representative brain sections parallel to the orbito-meatal line. Also, we described a method, using cerebral sulci as anatomical landmarks, for projecting lesions shown by CT scan onto the lateral brain diagram. The procedures were as follows. The distribution of lesions on CT slices was determined by the identification of major cerebral sulci and fissures, such as the Sylvian fissure, the central sulcus, and the superior frontal sulcus. Those lesions were then projected onto the lateral diagram by comparing each CT slice with the horizontal diagrams of brain sections. The method was demonstrated in three cases developing neuropsychological symptoms. (author)

  5. Parallel sorting algorithms

    CERN Document Server

    Akl, Selim G

    1985-01-01

    Parallel Sorting Algorithms explains how to use parallel algorithms to sort a sequence of items on a variety of parallel computers. The book reviews the sorting problem, the parallel models of computation, parallel algorithms, and the lower bounds on the parallel sorting problems. The text also presents twenty different algorithms for architectures such as linear arrays, mesh-connected computers, and cube-connected computers. Another example where an algorithm can be applied is on the shared-memory SIMD (single instruction stream multiple data stream) computers in which the whole sequence to be sorted can fit in the

  6. Liver Function Tests in Tuberculoid Leprosy

    Directory of Open Access Journals (Sweden)

    R V Korane

    1979-01-01

    A total of 24 patients with untreated tuberculoid leprosy were taken up for study. They were the same group of patients in whom the authors have earlier reported involvement of the liver in 85% of cases. The five healthy controls studied also belonged to the same series. Liver function tests included prothrombin time, serum bilirubin, zinc sulphate turbidity, serum proteins and serum transaminases. No significant alterations in the liver function were observed. This is because the changes in the liver were so minimal and focal that they were not reflected in the various liver function tests.

  7. Ultrascalable petaflop parallel supercomputer

    Science.gov (United States)

    Blumrich, Matthias A [Ridgefield, CT; Chen, Dong [Croton On Hudson, NY; Chiu, George [Cross River, NY; Cipolla, Thomas M [Katonah, NY; Coteus, Paul W [Yorktown Heights, NY; Gara, Alan G [Mount Kisco, NY; Giampapa, Mark E [Irvington, NY; Hall, Shawn [Pleasantville, NY; Haring, Rudolf A [Cortlandt Manor, NY; Heidelberger, Philip [Cortlandt Manor, NY; Kopcsay, Gerard V [Yorktown Heights, NY; Ohmacht, Martin [Yorktown Heights, NY; Salapura, Valentina [Chappaqua, NY; Sugavanam, Krishnan [Mahopac, NY; Takken, Todd [Brewster, NY

    2010-07-20

    A massively parallel supercomputer of petaOPS-scale includes node architectures based upon System-On-a-Chip technology, where each processing node comprises a single Application Specific Integrated Circuit (ASIC) having up to four processing elements. The ASIC nodes are interconnected by multiple independent networks that optimally maximize the throughput of packet communications between nodes with minimal latency. The multiple networks may include three high-speed networks for parallel algorithm message passing including a Torus, collective network, and a Global Asynchronous network that provides global barrier and notification functions. These multiple independent networks may be collaboratively or independently utilized according to the needs or phases of an algorithm for optimizing algorithm processing performance. The use of a DMA engine is provided to facilitate message passing among the nodes without the expenditure of processing resources at the node.

  8. Parallelization for X-ray crystal structural analysis program

    Energy Technology Data Exchange (ETDEWEB)

    Watanabe, Hiroshi [Japan Atomic Energy Research Inst., Tokyo (Japan); Minami, Masayuki; Yamamoto, Akiji

    1997-10-01

    In this report we study vectorization and parallelization for an X-ray crystal structural analysis program. The target machine is the NEC SX-4, which is a distributed/shared memory type vector parallel supercomputer. X-ray crystal structural analysis is surveyed, and a new multi-dimensional discrete Fourier transform method is proposed. The new method is designed to have a very long vector length, enabling a performance 12.0 times higher than that of the original code. Besides the above-mentioned vectorization, parallelization by micro-task functions on the SX-4 reaches a 13.7 times acceleration in the multi-dimensional discrete Fourier transform part with 14 CPUs, and a 3.0 times acceleration in the whole program. In total, a 35.9 times acceleration over the original single-CPU scalar version is achieved with vectorization and parallelization on the SX-4. (author)
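
    The report's vectorized SX-4 code is not reproduced here; purely as a reminder of the operation being optimized, the snippet below performs a multi-dimensional discrete Fourier transform with numpy on a placeholder 3-D array.

    import numpy as np

    density = np.random.rand(64, 64, 64)           # placeholder 3-D data
    structure_factors = np.fft.fftn(density)       # multi-dimensional DFT
    recovered = np.fft.ifftn(structure_factors).real
    assert np.allclose(recovered, density)
    print(structure_factors.shape)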

  9. Implementation of a control system test environment in UNIX

    International Nuclear Information System (INIS)

    Brittain, C.R.; Otaduy, P.J.; Rovere, L.A.

    1990-01-01

    This paper discusses how UNIX features such as shared memory, remote procedure calls, and signalling have been used to implement a distributed computational environment ideal for the development and testing of digital control systems. The resulting environment, based on features commonly available in commercial workstations, is flexible, allows process simulation and controller development to proceed in parallel, and provides for testing and validation in a realistic environment. In addition, the use of shared memory to exchange data allows other tasks such as user interfaces and recorders to be added without affecting the process simulation or controllers. A library of functions is presented which provides a simple interface to using the features described. These functions can be used in either C or FORTRAN programs and have been tested on a network of Sun workstations and an ENCORE parallel computer. 6 refs., 2 figs
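
    The paper's library is written for C and FORTRAN on top of UNIX shared memory; the sketch below is only a rough modern analogue in Python, where a toy process simulation and a toy controller exchange state through a shared-memory block. Real use would add explicit synchronization.

    import numpy as np
    from multiprocessing import Process
    from multiprocessing.shared_memory import SharedMemory

    def controller(shm_name, steps):
        shm = SharedMemory(name=shm_name)
        state = np.ndarray((2,), dtype=np.float64, buffer=shm.buf)  # [measurement, actuation]
        for _ in range(steps):
            state[1] = -0.5 * state[0]      # toy proportional control law
        shm.close()

    if __name__ == "__main__":
        shm = SharedMemory(create=True, size=16)
        state = np.ndarray((2,), dtype=np.float64, buffer=shm.buf)
        state[:] = [1.0, 0.0]
        proc = Process(target=controller, args=(shm.name, 1000))
        proc.start()
        for _ in range(1000):               # toy first-order plant simulation
            state[0] += 0.01 * (state[1] - state[0])
        proc.join()
        print("final measurement:", float(state[0]))
        shm.close()
        shm.unlink()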

  10. The Development of Reading and Spelling in Arabic Orthography: Two Parallel Processes?

    Science.gov (United States)

    Taha, Haitham

    2016-01-01

    The parallels between reading and spelling skills in Arabic were tested. One-hundred forty-three native Arab students, with typical reading development, from second, fourth, and sixth grades were tested with reading, spelling and orthographic decision tasks. The results indicated a full parallel between the reading and spelling performances within…

  11. RELATIONSHIPS BETWEEN FUNCTIONAL MOVEMENT TESTS AND PERFORMANCE TESTS IN YOUNG ELITE MALE BASKETBALL PLAYERS.

    Science.gov (United States)

    Gonzalo-Skok, Oliver; Serna, Jorge; Rhea, Matthew R; Marín, Pedro J

    2015-10-01

    Sprinting and jumping are two common and important components of high-level sport performance. The weight-bearing dorsiflexion test (WB-DF) and Star Excursion Balance Test (SEBT) are tools developed to identify athletes at risk for lower extremity injury and may be related to running and jumping performance among athletes. The purposes of the present study were: 1) to identify any relationships between functional movement tests (WB-DF and SEBT) and performance tests (jumping, sprinting and changing direction); 2) to examine any relationships between asymmetries in functional movements and performance tests. Descriptive cohort study. Fifteen elite male basketball players (age: 15.4 ± 0.9 years) were assessed during a three-week period to determine the reliability of functional screening tools and performance tests and to examine the relationships between these tests. Relative (intraclass correlation coefficient) and absolute (coefficient of variation) reliability were used to assess the reproducibility of the tests. Significant correlations were detected between certain functional movement tests and performance tests. Both left and right excursion composite scores related to slower performance times in sprint testing, demonstrating that greater dynamic reach relates to decreased quickness and acceleration among these elite basketball athletes. The various relationships between dynamic functional movement testing, speed, and jump performance provide guidance for the strength and conditioning professional when conducting and evaluating data in an effort to improve performance and reduce risk of injury. The results of the present study suggest that these functional and performance tests do not measure the same components of human movement, and could be paired as outcome measures for the clinical and sport assessment of lower extremity function. 2b.

  12. Performance evaluation of the HEP, ELXSI and CRAY X-MP parallel processors on hydrocode test problems

    International Nuclear Information System (INIS)

    Liebrock, L.M.; McGrath, J.F.; Hicks, D.L.

    1986-01-01

    Parallel programming promises improved processing speeds for hydrocodes, magnetohydrocodes, multiphase flow codes, thermal-hydraulics codes, wavecodes and other continuum dynamics codes. This paper presents the results of some investigations of parallel algorithms on three parallel processors: the CRAY X-MP, ELXSI and the HEP computers. Introduction and Background: We report the results of investigations of parallel algorithms for computational continuum dynamics. These programs (hydrocodes, wavecodes, etc.) produce simulations of the solutions to problems arising in the motion of continua: solid dynamics, liquid dynamics, gas dynamics, plasma dynamics, multiphase flow dynamics, thermal-hydraulic dynamics and multimaterial flow dynamics. This report restricts its scope to one-dimensional algorithms such as the von Neumann-Richtmyer (1950) scheme

  13. Parallel hyperbolic PDE simulation on clusters: Cell versus GPU

    Science.gov (United States)

    Rostrup, Scott; De Sterck, Hans

    2010-12-01

    http://cpc.cs.qub.ac.uk/summaries/AEGY_v1_0.html
    Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland
    Licensing provisions: GPL v3
    No. of lines in distributed program, including test data, etc.: 59 168
    No. of bytes in distributed program, including test data, etc.: 453 409
    Distribution format: tar.gz
    Programming language: C, CUDA
    Computer: Parallel computing clusters. Individual compute nodes may consist of x86 CPU, Cell processor, or x86 CPU with attached NVIDIA GPU accelerator.
    Operating system: Linux
    Has the code been vectorised or parallelized?: Yes. Tested on 1-128 x86 CPU cores, 1-32 Cell processors, and 1-32 NVIDIA GPUs.
    RAM: Tested on problems requiring up to 4 GB per compute node.
    Classification: 12
    External routines: MPI, CUDA, IBM Cell SDK
    Nature of problem: MPI-parallel simulation of Shallow Water equations using high-resolution 2D hyperbolic equation solver on regular Cartesian grids for x86 CPU, Cell processor, and NVIDIA GPU using CUDA.
    Solution method: SWsolver provides 3 implementations of a high-resolution 2D Shallow Water equation solver on regular Cartesian grids, for CPU, Cell processor, and NVIDIA GPU. Each implementation uses MPI to divide work across a parallel computing cluster.
    Additional comments: Sub-program numdiff is used for the test run.

  14. The Acoustic and Perceptual Effects of Series and Parallel Processing

    Directory of Open Access Journals (Sweden)

    Melinda C. Anderson

    2009-01-01

    Temporal envelope (TE) cues provide a great deal of speech information. This paper explores how spectral subtraction and dynamic-range compression gain modifications affect TE fluctuations for parallel and series configurations. In parallel processing, algorithms compute gains based on the same input signal, and the gains in dB are summed. In series processing, output from the first algorithm forms the input to the second algorithm. Acoustic measurements show that the parallel arrangement produces more gain fluctuations, introducing more changes to the TE than the series configurations. Intelligibility tests for normal-hearing (NH) and hearing-impaired (HI) listeners show (1) parallel processing gives significantly poorer speech understanding than an unprocessed (UNP) signal and the series arrangement and (2) series processing and UNP yield similar results. Speech quality tests show that UNP is preferred to both parallel and series arrangements, although spectral subtraction is the most preferred. No significant differences exist in sound quality between the series and parallel arrangements, or between the NH group and the HI group. These results indicate that gain modifications affect intelligibility and sound quality differently. Listeners appear to have a higher tolerance for gain modifications with regard to intelligibility, while judgments for sound quality appear to be more affected by smaller amounts of gain modification.
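
    As an illustration of the two configurations described above (with toy gain rules rather than the study's actual spectral-subtraction and compression algorithms), the numpy sketch below sums the dB gains computed from the same input for the parallel case, and feeds the first stage's output to the second stage for the series case.

    import numpy as np

    def noise_gain_db(x):
        """Hypothetical spectral-subtraction-like rule: attenuate quiet samples."""
        return np.where(np.abs(x) < 0.1, -12.0, 0.0)

    def compression_gain_db(x):
        """Hypothetical compressor: attenuate loud samples."""
        return np.where(np.abs(x) > 0.5, -6.0, 0.0)

    def apply_db(x, gain_db):
        return x * 10.0 ** (gain_db / 20.0)

    signal = np.random.default_rng(0).uniform(-1.0, 1.0, 16000)

    # Parallel: both gains are computed from the same input and summed in dB.
    parallel_out = apply_db(signal, noise_gain_db(signal) + compression_gain_db(signal))

    # Series: the output of the first algorithm is the input to the second.
    stage1 = apply_db(signal, noise_gain_db(signal))
    series_out = apply_db(stage1, compression_gain_db(stage1))

    print(float(np.std(parallel_out)), float(np.std(series_out)))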

  15. Parallel MR imaging.

    Science.gov (United States)

    Deshmane, Anagha; Gulani, Vikas; Griswold, Mark A; Seiberlich, Nicole

    2012-07-01

    Parallel imaging is a robust method for accelerating the acquisition of magnetic resonance imaging (MRI) data, and has made possible many new applications of MR imaging. Parallel imaging works by acquiring a reduced amount of k-space data with an array of receiver coils. These undersampled data can be acquired more quickly, but the undersampling leads to aliased images. One of several parallel imaging algorithms can then be used to reconstruct artifact-free images from either the aliased images (SENSE-type reconstruction) or from the undersampled data (GRAPPA-type reconstruction). The advantages of parallel imaging in a clinical setting include faster image acquisition, which can be used, for instance, to shorten breath-hold times resulting in fewer motion-corrupted examinations. In this article the basic concepts behind parallel imaging are introduced. The relationship between undersampling and aliasing is discussed and two commonly used parallel imaging methods, SENSE and GRAPPA, are explained in detail. Examples of artifacts arising from parallel imaging are shown and ways to detect and mitigate these artifacts are described. Finally, several current applications of parallel imaging are presented and recent advancements and promising research in parallel imaging are briefly reviewed. Copyright © 2012 Wiley Periodicals, Inc.

  16. Parallelization of a spherical Sn transport theory algorithm

    International Nuclear Information System (INIS)

    Haghighat, A.

    1989-01-01

    The work described in this paper derives a parallel algorithm for an R-dependent spherical S_N transport theory algorithm and studies its performance by testing different sample problems. The S_N transport method is one of the most accurate techniques used to solve the linear Boltzmann equation. Several studies have been done on the vectorization of the S_N algorithms; however, very few studies have been performed on the parallelization of this algorithm. Weinke and Hommoto have looked at the parallel processing of the different energy groups, and Azmy recently studied the parallel processing of the inner iterations of an X-Y S_N nodal transport theory method. Both studies have reported very encouraging results, which have prompted us to look at the parallel processing of an R-dependent S_N spherical geometry algorithm. This geometry was chosen because, in spite of its simplicity, it contains the complications of the curvilinear geometries (i.e., redistribution of neutrons over the discretized angular bins)

  17. Simulation Exploration through Immersive Parallel Planes: Preprint

    Energy Technology Data Exchange (ETDEWEB)

    Brunhart-Lupo, Nicholas; Bush, Brian W.; Gruchalla, Kenny; Smith, Steve

    2016-03-01

    We present a visualization-driven simulation system that tightly couples systems dynamics simulations with an immersive virtual environment to allow analysts to rapidly develop and test hypotheses in a high-dimensional parameter space. To accomplish this, we generalize the two-dimensional parallel-coordinates statistical graphic as an immersive 'parallel-planes' visualization for multivariate time series emitted by simulations running in parallel with the visualization. In contrast to traditional parallel coordinates, which map the multivariate dimensions onto coordinate axes represented by a series of parallel lines, we map pairs of the multivariate dimensions onto a series of parallel rectangles. As in the case of parallel coordinates, each individual observation in the dataset is mapped to a polyline whose vertices coincide with its coordinate values. Regions of the rectangles can be 'brushed' to highlight and select observations of interest; a 'slider' control allows the user to filter the observations by their time coordinate. In an immersive virtual environment, users interact with the parallel planes using a joystick that can select regions on the planes, manipulate selection, and filter time. The brushing and selection actions are used both to explore existing data and to launch additional simulations corresponding to the visually selected portions of the input parameter space. As soon as the new simulations complete, their resulting observations are displayed in the virtual environment. This tight feedback loop between simulation and immersive analytics accelerates users' realization of insights about the simulation and its output.
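
    The immersive system itself is not reproduced here; for readers unfamiliar with the underlying graphic, the short sketch below draws the conventional two-dimensional parallel-coordinates plot that the parallel-planes display generalizes, using pandas and matplotlib with made-up data.

    import pandas as pd
    import matplotlib.pyplot as plt
    from pandas.plotting import parallel_coordinates

    observations = pd.DataFrame({
        "demand":   [1.0, 1.2, 0.8, 1.1],
        "price":    [30, 45, 25, 40],
        "capacity": [5, 7, 4, 6],
        "scenario": ["a", "a", "b", "b"],   # class column used to color the polylines
    })
    parallel_coordinates(observations, class_column="scenario")
    plt.savefig("parallel_coordinates.png")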

  18. Parallelism Effects and Verb Activation: The Sustained Reactivation Hypothesis

    Science.gov (United States)

    Callahan, Sarah M.; Shapiro, Lewis P.; Love, Tracy

    2010-01-01

    This study investigated the processes underlying parallelism by evaluating the activation of a parallel element (i.e., a verb) throughout "and"-coordinated sentences. Four points were tested: (1) approximately 1,600ms after the verb in the first conjunct (PP1), (2) immediately following the conjunction (PP2), (3) approximately 1,100ms after the…

  19. Parallel computing solution of Boltzmann neutron transport equation

    International Nuclear Information System (INIS)

    Ansah-Narh, T.

    2010-01-01

    The focus of the research was on developing a parallel computing algorithm for solving the eigenvalues of the Boltzmann Neutron Transport Equation (BNTE) in a slab geometry using a multi-grid approach. In response to the problem of the slow execution of serial computing when solving large problems such as the BNTE, the study focused on the design of parallel computing systems, an evolution of serial computing that uses multiple processing elements simultaneously to solve complex physical and mathematical problems. The finite element method (FEM) was used for the spatial discretization scheme, while angular discretization was accomplished by expanding the angular dependence in terms of Legendre polynomials. The eigenvalues representing the multiplication factors in the BNTE were determined by the power method. MATLAB Compiler Version 4.1 (R2009a) was used to compile the MATLAB codes of the BNTE. The implemented parallel algorithms were enabled with matlabpool, a Parallel Computing Toolbox function. The option UseParallel was set to 'always'; the default value of the option is 'never'. When those conditions held, the solvers computed estimated gradients in parallel. The parallel computing system was used to handle all the bottlenecks in the matrix generated from the finite element scheme and in each domain generated by the power method. The parallel algorithm was implemented on a Symmetric Multi-Processor (SMP) cluster machine with Intel 32-bit quad-core x86 processors. Convergence rates and timings for the algorithm on the SMP cluster machine were obtained. Numerical experiments indicated that the designed parallel algorithm could reach perfect speedup and had good stability and scalability. (au)
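
    The MATLAB/FEM implementation is not shown here; as a minimal numpy reminder of the power method mentioned above, the sketch below iterates x <- A x / ||A x|| until the dominant eigenvalue (standing in for the multiplication factor) converges. The matrix is an arbitrary example.

    import numpy as np

    def power_method(A, tol=1e-10, max_iter=1000):
        x = np.ones(A.shape[0])
        eigenvalue = 0.0
        for _ in range(max_iter):
            y = A @ x
            new_eigenvalue = np.linalg.norm(y)
            x = y / new_eigenvalue
            if abs(new_eigenvalue - eigenvalue) < tol:
                break
            eigenvalue = new_eigenvalue
        return eigenvalue, x

    A = np.array([[4.0, 1.0], [2.0, 3.0]])
    dominant, mode = power_method(A)
    print(dominant)   # approx. 5.0 for this example matrix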

  20. High-Performance Psychometrics: The Parallel-E Parallel-M Algorithm for Generalized Latent Variable Models. Research Report. ETS RR-16-34

    Science.gov (United States)

    von Davier, Matthias

    2016-01-01

    This report presents results on a parallel implementation of the expectation-maximization (EM) algorithm for multidimensional latent variable models. The developments presented here are based on code that parallelizes both the E step and the M step of the parallel-E parallel-M algorithm. Examples presented in this report include item response…

  1. New partially parallel acquisition technique in cerebral imaging: preliminary findings

    International Nuclear Information System (INIS)

    Tintera, Jaroslav; Gawehn, Joachim; Bauermann, Thomas; Vucurevic, Goran; Stoeter, Peter

    2004-01-01

    In MRI applications where short acquisition time is necessary, the increase of acquisition speed is often at the expense of image resolution and SNR. In such cases, the newly developed parallel acquisition techniques could provide images without the mentioned limitations and in reasonably shortened measurement time. A newly designed eight-channel head coil array (i-PAT coil) allowing for parallel acquisition of independently reconstructed images (GRAPPA mode) has been tested for its applicability in neuroradiology. Image homogeneity was tested in a standard phantom and healthy volunteers. BOLD signal changes were studied in a group of six volunteers using finger tapping stimulation. Phantom studies revealed an important drop of signal even after the use of a normalization filter in the center of the image and an important increase of artifact power with reduction of measurement time strongly depending on the combination of acceleration parameters. The additional application of a parallel acquisition technique such as GRAPPA decreases measurement time in the range of about 30%, but further reduction is often possible only at the expense of SNR. This technique performs best in conditions in which imaging speed is important, such as CE MRA, but time resolution still does not allow the acquisition of angiograms separating the arterial and venous phase. Significantly larger areas of BOLD activation were found using the i-PAT coil compared to the standard head coil. Because the i-PAT coil is an eight-channel surface coil array, peripheral cortical structures profit from its high SNR, as in high-resolution imaging of small cortical dysplasias and in functional activation of cortical areas imaged by BOLD contrast. In BOLD contrast imaging, susceptibility artifacts are reduced, but only if an appropriate combination of acceleration parameters is used. (orig.)

  2. New partially parallel acquisition technique in cerebral imaging: preliminary findings

    Energy Technology Data Exchange (ETDEWEB)

    Tintera, Jaroslav [Institute for Clinical and Experimental Medicine, Prague (Czech Republic); Gawehn, Joachim; Bauermann, Thomas; Vucurevic, Goran; Stoeter, Peter [University Clinic Mainz, Institute of Neuroradiology, Mainz (Germany)

    2004-12-01

    In MRI applications where short acquisition time is necessary, the increase of acquisition speed is often at the expense of image resolution and SNR. In such cases, the newly developed parallel acquisition techniques could provide images without the mentioned limitations and in reasonably shortened measurement time. A newly designed eight-channel head coil array (i-PAT coil) allowing for parallel acquisition of independently reconstructed images (GRAPPA mode) has been tested for its applicability in neuroradiology. Image homogeneity was tested in a standard phantom and healthy volunteers. BOLD signal changes were studied in a group of six volunteers using finger tapping stimulation. Phantom studies revealed an important drop of signal even after the use of a normalization filter in the center of the image and an important increase of artifact power with reduction of measurement time strongly depending on the combination of acceleration parameters. The additional application of a parallel acquisition technique such as GRAPPA decreases measurement time in the range of about 30%, but further reduction is often possible only at the expense of SNR. This technique performs best in conditions in which imaging speed is important, such as CE MRA, but time resolution still does not allow the acquisition of angiograms separating the arterial and venous phase. Significantly larger areas of BOLD activation were found using the i-PAT coil compared to the standard head coil. Because the i-PAT coil is an eight-channel surface coil array, peripheral cortical structures profit from its high SNR, as in high-resolution imaging of small cortical dysplasias and in functional activation of cortical areas imaged by BOLD contrast. In BOLD contrast imaging, susceptibility artifacts are reduced, but only if an appropriate combination of acceleration parameters is used. (orig.)

  3. Genotypic tropism testing by massively parallel sequencing: qualitative and quantitative analysis

    Directory of Open Access Journals (Sweden)

    Thiele Bernhard

    2011-05-01

    Background: Inferring viral tropism from genotype is a fast and inexpensive alternative to phenotypic testing. While being highly predictive when performed on clonal samples, sensitivity of predicting CXCR4-using (X4) variants drops substantially in clinical isolates. This is mainly attributed to minor variants not detected by standard bulk-sequencing. Massively parallel sequencing (MPS) detects single clones, thereby being much more sensitive. Using this technology we wanted to improve genotypic prediction of coreceptor usage. Methods: Plasma samples from 55 antiretroviral-treated patients tested for coreceptor usage with the Monogram Trofile Assay were sequenced with standard population-based approaches. Fourteen of these samples were selected for further analysis with MPS. Tropism was predicted from each sequence with geno2pheno[coreceptor]. Results: Prediction based on bulk-sequencing yielded 59.1% sensitivity and 90.9% specificity compared to the Trofile assay. With MPS, 7600 reads were generated on average per isolate. Minorities of sequences with high confidence in CXCR4-usage were found in all samples, irrespective of phenotype. When using the default false-positive-rate of geno2pheno[coreceptor] (10%), and defining a minority cutoff of 5%, the results were concordant in all but one isolate. Conclusions: The combination of MPS and coreceptor usage prediction results in a fast and accurate alternative to phenotypic assays. The detection of X4-viruses in all isolates suggests that coreceptor usage as well as fitness of minorities is important for therapy outcome. The high sensitivity of this technology in combination with a quantitative description of the viral population may allow implementing meaningful cutoffs for predicting response to CCR5-antagonists in the presence of X4-minorities.

  4. Genotypic tropism testing by massively parallel sequencing: qualitative and quantitative analysis.

    Science.gov (United States)

    Däumer, Martin; Kaiser, Rolf; Klein, Rolf; Lengauer, Thomas; Thiele, Bernhard; Thielen, Alexander

    2011-05-13

    Inferring viral tropism from genotype is a fast and inexpensive alternative to phenotypic testing. While being highly predictive when performed on clonal samples, sensitivity of predicting CXCR4-using (X4) variants drops substantially in clinical isolates. This is mainly attributed to minor variants not detected by standard bulk-sequencing. Massively parallel sequencing (MPS) detects single clones thereby being much more sensitive. Using this technology we wanted to improve genotypic prediction of coreceptor usage. Plasma samples from 55 antiretroviral-treated patients tested for coreceptor usage with the Monogram Trofile Assay were sequenced with standard population-based approaches. Fourteen of these samples were selected for further analysis with MPS. Tropism was predicted from each sequence with geno2pheno[coreceptor]. Prediction based on bulk-sequencing yielded 59.1% sensitivity and 90.9% specificity compared to the trofile assay. With MPS, 7600 reads were generated on average per isolate. Minorities of sequences with high confidence in CXCR4-usage were found in all samples, irrespective of phenotype. When using the default false-positive-rate of geno2pheno[coreceptor] (10%), and defining a minority cutoff of 5%, the results were concordant in all but one isolate. The combination of MPS and coreceptor usage prediction results in a fast and accurate alternative to phenotypic assays. The detection of X4-viruses in all isolates suggests that coreceptor usage as well as fitness of minorities is important for therapy outcome. The high sensitivity of this technology in combination with a quantitative description of the viral population may allow implementing meaningful cutoffs for predicting response to CCR5-antagonists in the presence of X4-minorities.
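
    The sample-level decision rule implied by the abstract can be sketched as follows; the per-read false-positive-rate (FPR) scores would come from a predictor such as geno2pheno[coreceptor], which is not reproduced here, and the read counts below are made up.

    def call_sample(read_fprs, fpr_cutoff=0.10, minority_cutoff=0.05):
        """Call a read X4 when its FPR is below fpr_cutoff; call the sample X4
        when the X4 read fraction reaches minority_cutoff."""
        x4_reads = sum(1 for fpr in read_fprs if fpr < fpr_cutoff)
        x4_fraction = x4_reads / len(read_fprs)
        return ("X4" if x4_fraction >= minority_cutoff else "R5"), x4_fraction

    toy_fprs = [0.05] * 456 + [0.50] * 7144   # 7,600 reads, 6% scored as X4
    print(call_sample(toy_fprs))              # ('X4', 0.06)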

  5. Parallel computing in plasma physics: Nonlinear instabilities

    International Nuclear Information System (INIS)

    Pohn, E.; Kamelander, G.; Shoucri, M.

    2000-01-01

    A Vlasov-Poisson system is used for studying the time evolution of the charge separation at a spatially one- as well as two-dimensional plasma edge. Ions are advanced in time using the Vlasov equation. The whole three-dimensional velocity space is considered, leading to very time-consuming fully kinetic simulations in four and five dimensions, respectively. In the 1D simulations electrons are assumed to behave adiabatically, i.e. they are Boltzmann-distributed, leading to a nonlinear Poisson equation. In the 2D simulations a gyro-kinetic approximation is used for the electrons. The plasma is assumed to be initially neutral. The simulations are performed on an equidistant grid. A constant time step is used for advancing the density distribution function in time. The time evolution of the distribution function is performed using a splitting scheme. Each dimension (x, y, v_x, v_y, v_z) of the phase space is advanced in time separately. The value of the distribution function for the next time is calculated from the value of a (in general interstitial) point at the present time (fractional shift). One-dimensional cubic-spline interpolation is used for calculating the interstitial function values. After the fractional shifts are performed for each dimension of the phase space, a whole time step for advancing the distribution function is finished. Afterwards the charge density is calculated, the Poisson equation is solved and the electric field is calculated before the next time step is performed. The fractional shift method sketched above was parallelized for p processors as follows. Considering first the shifts in the y-direction, a proper parallelization strategy is to split the grid into p disjoint v_z-slices, which are sub-grids, each containing a different 1/p-th part of the v_z range but the whole range of all other dimensions. Each processor is responsible for performing the y-shifts on a different slice, which can be done in parallel without any communication between
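
    A single fractional shift along one axis, the building block of the splitting scheme sketched above, can be illustrated as follows; this is only a one-dimensional toy using scipy's cubic-spline interpolation, not the production code.

    import numpy as np
    from scipy.interpolate import CubicSpline

    x = np.linspace(0.0, 1.0, 128)
    f = np.exp(-((x - 0.5) / 0.1) ** 2)        # sampled distribution along one axis

    def fractional_shift(values, grid, shift):
        """Evaluate a cubic spline at the (generally interstitial) points grid - shift."""
        spline = CubicSpline(grid, values)
        return spline(np.clip(grid - shift, grid[0], grid[-1]))

    f_shifted = fractional_shift(f, x, 0.37 * (x[1] - x[0]))   # shift by 0.37 cells
    print(float(f_shifted.max()))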

  6. Two- and three-dimensional nonlocal density functional theory for inhomogeneous fluids. 1. Algorithms and parallelization

    International Nuclear Information System (INIS)

    Frink, L.J.D.; Salinger, A.G.

    2000-01-01

    Fluids adsorbed near surfaces, near macromolecules, and in porous materials are inhomogeneous, exhibiting spatially varying density distributions. This inhomogeneity in the fluid plays an important role in controlling a wide variety of complex physical phenomena including wetting, self-assembly, corrosion, and molecular recognition. One of the key methods for studying the properties of inhomogeneous fluids in simple geometries has been density functional theory (DFT). However, there has been a conspicuous lack of calculations in complex two- and three-dimensional geometries. The computational difficulty arises from the need to perform nested integrals that are due to nonlocal terms in the free energy functional. These integral equations are expensive both in evaluation time and in memory requirements; however, the expense can be mitigated by intelligent algorithms and the use of parallel computers. This paper details the efforts to develop efficient numerical algorithms so that nonlocal DFT calculations in complex geometries that require two or three dimensions can be performed. The success of this implementation will enable the study of solvation effects at heterogeneous surfaces, in zeolites, in solvated (bio)polymers, and in colloidal suspensions

  7. Platelet Function Tests: Preanalytical Variables, Clinical Utility, Advantages, and Disadvantages.

    Science.gov (United States)

    Hvas, Anne-Mette; Grove, Erik Lerkevang

    2017-01-01

    Platelet function tests are mainly used in the diagnostic work-up of platelet disorders. During the last decade, the additional use of platelet function tests to evaluate the effect of antiplatelet therapy has also emerged in an attempt to identify patients with an increased risk of arterial thrombosis. Furthermore, platelet function tests are increasingly used to measure residual effect of antiplatelet therapy prior to surgery with the aim of reducing the risk of bleeding. To a limited extend, platelet function tests are also used to evaluate hyperaggregability as a potential marker of a prothrombotic state outside the setting of antiplatelet therapy. This multifaceted use of platelet function tests and the development of simpler point-of-care tests with narrower application have increased the use of platelet function testing and also facilitated the use of platelet function tests outside the highly specialized laboratories. The present chapter describes the preanalytical variables, which should be taken into account when planning platelet function testing. Also, the most widely used platelet function tests are introduced, and their clinical utility and their relative advantages and disadvantages are discussed.

  8. Fluid-Elastic Instability Tests on Parallel Triangular Tube Bundles with Different Mass Ratio Values under Increasing and Decreasing Flow Velocities

    Directory of Open Access Journals (Sweden)

    Xu Zhang

    2016-01-01

    To study the effects of increasing and decreasing flow velocities on the fluid-elastic instability of tube bundles, the responses of an elastically mounted tube in a rigid parallel triangular tube bundle with a pitch-to-diameter ratio of 1.67 were tested in a water tunnel subjected to crossflow. Aluminum and stainless steel tubes were tested, respectively. In the in-line and transverse directions, the amplitudes, power spectrum density functions, response frequencies, added mass coefficients, and other results were obtained and compared. Results show that the nonlinear hysteresis phenomenon occurred in both tube bundle vibrations. When the flow velocity is decreasing, the tubes which have been in the state of fluid-elastic instability can keep on this state for a certain flow velocity range. During this process, the response frequencies of the tubes will decrease. Furthermore, the response frequencies of the aluminum tube can decrease much more than those of the stainless steel tube. The fluid-elastic instability constants fitted for these experiments were obtained from experimental data. A deeper insight into the fluid-elastic instability of tube bundles was also obtained by synthesizing the results. This study is beneficial for designing and operating equipment with tube bundles inside, as well as for further research on the fluid-elastic instability of tube bundles.

  9. Parallelization of MCNP 4, a Monte Carlo neutron and photon transport code system, in highly parallel distributed memory type computer

    International Nuclear Information System (INIS)

    Masukawa, Fumihiro; Takano, Makoto; Naito, Yoshitaka; Yamazaki, Takao; Fujisaki, Masahide; Suzuki, Koichiro; Okuda, Motoi.

    1993-11-01

    In order to improve the accuracy and calculating speed of shielding analyses, MCNP 4, a Monte Carlo neutron and photon transport code system, has been parallelized and its efficiency measured on the AP1000, a highly parallel distributed-memory computer. The code has been analyzed statically and dynamically, and a suitable algorithm for parallelization has been determined for the shielding analysis functions of MCNP 4. This includes a strategy where a new history is assigned dynamically to an idling processor element during execution. Furthermore, to avoid congestion in communication processing, the batch concept, processing multiple histories as a unit, has been introduced. By analyzing a sample cask problem with 2,000,000 histories on the AP1000 with 512 processor elements, a parallelization efficiency of 82% is achieved, and the calculational speed has been estimated to be around 50 times as fast as that of the FACOM M-780. (author)
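
    As an illustration of the batching and dynamic assignment ideas above (Python multiprocessing, not the MCNP 4 / AP1000 code), the sketch below hands the next batch of histories to whichever worker becomes idle and accumulates the tallies; the "transport" step is a toy placeholder.

    import random
    from multiprocessing import Pool

    def run_batch(args):
        seed, histories_per_batch = args
        rng = random.Random(seed)
        # toy "transport": count histories passing a 1-in-10 survival test
        return sum(1 for _ in range(histories_per_batch) if rng.random() < 0.1)

    if __name__ == "__main__":
        total_histories = 2_000_000
        histories_per_batch = 10_000
        batches = [(seed, histories_per_batch)
                   for seed in range(total_histories // histories_per_batch)]
        with Pool(processes=8) as pool:
            # imap_unordered gives the next batch to whichever worker finishes first
            tallies = list(pool.imap_unordered(run_batch, batches))
        print("estimated survival fraction:", sum(tallies) / total_histories)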

  10. Vestibular function testing.

    LENUS (Irish Health Repository)

    Lang, E E

    2010-06-01

    Vestibular symptoms of vertigo, dizziness and dysequilibrium are common complaints which can be disabling both physically and psychologically. Routine examination of the ear nose and throat and neurological system are often normal in these patients. An accurate history and thorough clinical examination can provide a diagnosis in the majority of patients. However, in a subgroup of patients, vestibular function testing may be invaluable in arriving at a correct diagnosis and ultimately in the optimal treatment of these patients.

  11. Parallel algorithms for testing finite state machines: Generating UIO sequences

    OpenAIRE

    Hierons, RM; Turker, UC

    2016-01-01

    This paper describes an efficient parallel algorithm that uses many-core GPUs for automatically deriving Unique Input Output sequences (UIOs) from Finite State Machines. The proposed algorithm uses the global scope of the GPU's global memory through coalesced memory access and minimises the transfer between CPU and GPU memory. The results of experiments indicate that the proposed method yields considerably better results compared to a single core UIO construction algorithm. Our algorithm is s...

  12. Recent experience with testing of parallel disc gate valves under accident flow conditions

    International Nuclear Information System (INIS)

    LaPointe, P.A.; Clayton, J.K.

    1992-01-01

    This paper presents the nuclear valve industry's latest and most extensive valve qualification test program experience. The test program includes a variety of 25 different gate and globe valves. All the test valves are power operated using either air, electric, or gas/hydraulic operators. The valves are categorized in size and pressure class so as to form a group of appropriate parent valve assemblies. Parent valve assembly qualification is used as the basis for qualification of candidate valve assemblies. The parent and candidate valve assemblies are representative of a nuclear plant's safety-related valve applications. The test program was performed in accordance with ANSI B16.41-1983 'Functional Qualification Requirements for Power Operated Active Valve Assemblies for Nuclear Power Plants.' The focus of this paper is on functional valve qualification test experience and specifically flow interruption testing to Annex G of the aforementioned test standard. Results of the flow test are summarized, including the coefficient of friction for each of the gate-type valves reported. Information on valve size, pressure class, and actuator is given for all valves in the program. Although all valves performed extremely well, only selected test data are presented. The effects of the speed of operation and the effects of different fluid flow rates as they relate to the coefficient of friction between the valve disc and seat are discussed. The variation in the coefficient of friction based on other variables in the thrust equation, namely the differential pressure area, is cited.

  13. Indications and interpretation of esophageal function testing.

    Science.gov (United States)

    Gyawali, C Prakash; de Bortoli, Nicola; Clarke, John; Marinelli, Carla; Tolone, Salvatore; Roman, Sabine; Savarino, Edoardo

    2018-05-12

    Esophageal symptoms are common, and can arise from mucosal, motor, functional, and neoplastic processes, among others. Judicious use of diagnostic testing can help define the etiology of symptoms and can direct management. Endoscopy, esophageal high-resolution manometry (HRM), ambulatory pH or pH-impedance manometry, and barium radiography are commonly used for esophageal function testing; functional lumen imaging probe is an emerging option. Recent consensus guidelines have provided direction in using test findings toward defining mechanisms of esophageal symptoms. The Chicago Classification describes hierarchical steps in diagnosing esophageal motility disorders. The Lyon Consensus characterizes conclusive evidence on esophageal testing for a diagnosis of gastroesophageal reflux disease (GERD), and establishes a motor classification of GERD. Taking these recent advances into consideration, our discussion focuses primarily on the indications, technique, equipment, and interpretation of esophageal HRM and ambulatory reflux monitoring in the evaluation of esophageal symptoms, and describes indications for alternative esophageal tests. © 2018 New York Academy of Sciences.

  14. A Successful Test of Parallel Replication Teams in Teaching Research Methods

    Science.gov (United States)

    Standing, Lionel G.; Astrologo, Lisa; Benbow, Felecia F.; Cyr-Gauthier, Chelsea S.; Williams, Charlotte A.

    2016-01-01

    This paper describes the novel use of parallel student teams from a research methods course to perform a replication study, and suggests that this approach offers pedagogical benefits for both students and teachers, as well as potentially contributing to a resolution of the replication crisis in psychology today. Four teams, of five undergraduates…

  15. The Functional Task Test (FTT): An Interdisciplinary Testing Protocol to Investigate the Factors Underlying Changes in Astronaut Functional Performance

    Science.gov (United States)

    Bloomberg, J. J.; Lawrence, E. L.; Arzeno, N. M.; Buxton, R. E.; Feiveson, A. H.; Kofman, I. S.; Lee, S. M. C.; Mulavara, A. P.; Peters, B. T.; Platts, S. H.

    2011-01-01

    Exposure to space flight causes adaptations in multiple physiological systems including changes in sensorimotor, cardiovascular, and neuromuscular systems. These changes may affect a crewmember s ability to perform critical mission tasks immediately after landing on a planetary surface. The overall goal of this project is to determine the effects of space flight on functional tests that are representative of high priority exploration mission tasks and to identify the key underlying physiological factors that contribute to decrements in performance. To achieve this goal we developed an interdisciplinary testing protocol (Functional Task Test, FTT) that evaluates both astronaut functional performance and related physiological changes. Functional tests include ladder climbing, hatch opening, jump down, manual manipulation of objects and tool use, seat egress and obstacle avoidance, recovery from a fall and object translation tasks. Physiological measures include assessments of postural and gait control, dynamic visual acuity, fine motor control, plasma volume, orthostatic intolerance, upper- and lower-body muscle strength, power, endurance, control, and neuromuscular drive. Crewmembers perform this integrated test protocol before and after short (Shuttle) and long-duration (ISS) space flight. Data are collected on two sessions before flight, on landing day (Shuttle only) and 1, 6 and 30 days after landing. Preliminary results from both Shuttle and ISS crewmembers indicate decrement in performance of the functional tasks after both short and long-duration space flight. On-going data collection continues to improve the statistical power required to map changes in functional task performance to alterations in physiological systems. The information obtained from this study will be used to design and implement countermeasures that specifically target the physiological systems most responsible for the altered functional performance associated with space flight.

  16. A SPECT reconstruction method for extending parallel to non-parallel geometries

    International Nuclear Information System (INIS)

    Wen Junhai; Liang Zhengrong

    2010-01-01

    Due to its simplicity, parallel-beam geometry is usually assumed for the development of image reconstruction algorithms. The established reconstruction methodologies are then extended to fan-beam, cone-beam and other non-parallel geometries for practical application. This situation occurs for quantitative SPECT (single photon emission computed tomography) imaging in inverting the attenuated Radon transform. Novikov reported an explicit parallel-beam formula for the inversion of the attenuated Radon transform in 2000. Thereafter, a formula for fan-beam geometry was reported by Bukhgeim and Kazantsev (2002 Preprint N. 99 Sobolev Institute of Mathematics). At the same time, we presented a formula for varying focal-length fan-beam geometry. Sometimes, the reconstruction formula is so implicit that we cannot obtain the explicit reconstruction formula in the non-parallel geometries. In this work, we propose a unified reconstruction framework for extending parallel-beam geometry to any non-parallel geometry using ray-driven techniques. Studies by computer simulations demonstrated the accuracy of the presented unified reconstruction framework for extending parallel-beam to non-parallel geometries in inverting the attenuated Radon transform.
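
    A minimal ray-driven forward projector for parallel-beam geometry, ignoring attenuation and using nearest-neighbour sampling, conveys the ray-driven idea of stepping along each ray and accumulating image values. This is only a sketch under those simplifying assumptions, not the unified reconstruction framework of the record.

```python
# Toy ray-driven forward projection for 2-D parallel-beam geometry.
# Assumptions: square N x N image, pixel size 1, no attenuation,
# nearest-neighbour sampling along each ray.
import numpy as np

def forward_project(image, angles, n_bins=None, step=0.5):
    n = image.shape[0]
    n_bins = n_bins or n
    centre = (n - 1) / 2.0
    dets = np.linspace(-centre, centre, n_bins)      # detector bin offsets
    sino = np.zeros((len(angles), n_bins))
    ts = np.arange(-n, n, step)                      # samples along each ray
    for ia, theta in enumerate(angles):
        d = np.array([np.cos(theta), np.sin(theta)])   # ray direction
        o = np.array([-np.sin(theta), np.cos(theta)])  # detector axis
        for ib, s in enumerate(dets):
            pts = s * o + ts[:, None] * d              # points along the ray
            ix = np.rint(pts[:, 0] + centre).astype(int)
            iy = np.rint(pts[:, 1] + centre).astype(int)
            ok = (ix >= 0) & (ix < n) & (iy >= 0) & (iy < n)
            sino[ia, ib] = image[iy[ok], ix[ok]].sum() * step
    return sino

phantom = np.zeros((64, 64)); phantom[24:40, 28:36] = 1.0
sinogram = forward_project(phantom, np.linspace(0, np.pi, 90, endpoint=False))
print(sinogram.shape)  # (90, 64)
```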

  17. The language parallel Pascal and other aspects of the massively parallel processor

    Science.gov (United States)

    Reeves, A. P.; Bruner, J. D.

    1982-01-01

    A high level language for the Massively Parallel Processor (MPP) was designed. This language, called Parallel Pascal, is described in detail. A description of the language design, a description of the intermediate language, Parallel P-Code, and details for the MPP implementation are included. Formal descriptions of Parallel Pascal and Parallel P-Code are given. A compiler was developed which converts programs in Parallel Pascal into the intermediate Parallel P-Code language. The code generator to complete the compiler for the MPP is being developed independently. A Parallel Pascal to Pascal translator was also developed. The architecture design for a VLSI version of the MPP was completed with a description of fault tolerant interconnection networks. The memory arrangement aspects of the MPP are discussed and a survey of other high level languages is given.

  18. Parallel Atomistic Simulations

    Energy Technology Data Exchange (ETDEWEB)

    HEFFELFINGER,GRANT S.

    2000-01-18

    Algorithms developed to enable the use of atomistic molecular simulation methods with parallel computers are reviewed. Methods appropriate for bonded as well as non-bonded (and charged) interactions are included. While strategies for obtaining parallel molecular simulations have been developed for the full variety of atomistic simulation methods, molecular dynamics and Monte Carlo have received the most attention. Three main types of parallel molecular dynamics simulations have been developed, the replicated data decomposition, the spatial decomposition, and the force decomposition. For Monte Carlo simulations, parallel algorithms have been developed which can be divided into two categories, those which require a modified Markov chain and those which do not. Parallel algorithms developed for other simulation methods such as Gibbs ensemble Monte Carlo, grand canonical molecular dynamics, and Monte Carlo methods for protein structure determination are also reviewed and issues such as how to measure parallel efficiency, especially in the case of parallel Monte Carlo algorithms with modified Markov chains are discussed.
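
    The replicated-data decomposition mentioned above can be sketched in a few lines: every worker holds the full coordinate array but evaluates forces only for its own slice of atoms. This is a toy illustration (Lennard-Jones pairs, no cutoff, no periodic boundaries), not any of the reviewed production algorithms.

```python
# Toy replicated-data force decomposition: all workers see every position,
# each computes forces only for its assigned slice of atoms.
import numpy as np
from multiprocessing import Pool

def lj_forces_slice(args):
    pos, lo, hi = args                     # full positions, my atom range
    f = np.zeros((hi - lo, 3))
    for i in range(lo, hi):
        r = pos[i] - pos                   # vectors from all atoms to atom i
        r[i] = 1.0                         # placeholder to avoid divide-by-zero
        d2 = (r * r).sum(axis=1)
        d2[i] = np.inf                     # exclude self-interaction
        inv6 = d2 ** -3
        coef = 24.0 * (2.0 * inv6**2 - inv6) / d2   # LJ, epsilon = sigma = 1
        f[i - lo] = (coef[:, None] * r).sum(axis=0)
    return f

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    pos = rng.uniform(0.0, 10.0, size=(400, 3))
    nproc = 4
    bounds = np.linspace(0, len(pos), nproc + 1, dtype=int)
    tasks = [(pos, bounds[k], bounds[k + 1]) for k in range(nproc)]
    with Pool(nproc) as pool:
        forces = np.vstack(pool.map(lj_forces_slice, tasks))
    print(forces.shape)                    # (400, 3)
```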

  19. Experimental fast reactor JOYO MK-III functional test. Primary auxiliary cooling system test

    International Nuclear Information System (INIS)

    Karube, Koji; Akagi, Shinji; Terano, Toshihiro; Onuki, Osamu; Ito, Hideaki; Aoki, Hiroshi; Odo, Toshihiro

    2004-03-01

    This paper describes the results of primary auxiliary cooling system, which were done as a part of JOYO MK-III function test. The aim of the tests was to confirm the operational performance of primary auxiliary EMP and the protection system including siphon breaker of primary auxiliary cooling system. The items of the tests were: (Test No.): (Test item). 1) SKS-117: EMP start up test. 2) SKS-118-1: EMP start up test when pony motor running. 3) SKS-121: Function test of siphon breaker. The results of the tests satisfied the required performance, and demonstrated successful operation of primary auxiliary cooling system. (author)

  20. Platelet function testing in pediatric patients

    DEFF Research Database (Denmark)

    Hvas, Anne-Mette; Favaloro, Emmanuel J

    2017-01-01

    Introduction: Platelets play a key role in primary hemostasis and are also intricately linked to secondary hemostasis. Investigation of platelet function in children, especially in neonates, is seriously challenged by the volumes required to perform the majority of platelet function tests and due...

  1. Parallel integer sorting with medium and fine-scale parallelism

    Science.gov (United States)

    Dagum, Leonardo

    1993-01-01

    Two new parallel integer sorting algorithms, queue-sort and barrel-sort, are presented and analyzed in detail. These algorithms do not have optimal parallel complexity, yet they show very good performance in practice. Queue-sort is designed for fine-scale parallel architectures which allow the queueing of multiple messages to the same destination. Barrel-sort is designed for medium-scale parallel architectures with a high message passing overhead. The performance results from the implementation of queue-sort on a Connection Machine CM-2 and barrel-sort on a 128 processor iPSC/860 are given. The two implementations are found to be comparable in performance but not as good as a fully vectorized bucket sort on the Cray YMP.
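
    Neither queue-sort nor barrel-sort is reproduced here, but the general flavour of range-partitioned parallel integer sorting can be sketched as follows: scatter keys into per-worker buckets by key range, sort each bucket independently, then concatenate. The key range, worker count and data are assumptions made for the example.

```python
# Generic range-partitioned parallel integer sort (bucket-style). This is NOT
# the queue-sort or barrel-sort of the record, only an illustration of the idea.
import random
from multiprocessing import Pool

def sort_bucket(bucket):
    return sorted(bucket)

def parallel_int_sort(keys, nproc=4, key_max=2**20):
    width = (key_max + nproc - 1) // nproc
    buckets = [[] for _ in range(nproc)]
    for k in keys:                       # "scatter" keys by key range
        buckets[k // width].append(k)
    with Pool(nproc) as pool:            # each worker sorts its own range
        sorted_buckets = pool.map(sort_bucket, buckets)
    out = []
    for b in sorted_buckets:             # concatenating ranges preserves order
        out.extend(b)
    return out

if __name__ == "__main__":
    data = [random.randrange(2**20) for _ in range(100_000)]
    assert parallel_int_sort(data) == sorted(data)
```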

  2. Parallel imaging with phase scrambling.

    Science.gov (United States)

    Zaitsev, Maxim; Schultz, Gerrit; Hennig, Juergen; Gruetter, Rolf; Gallichan, Daniel

    2015-04-01

    Most existing methods for accelerated parallel imaging in MRI require additional data, which are used to derive information about the sensitivity profile of each radiofrequency (RF) channel. In this work, a method is presented to avoid the acquisition of separate coil calibration data for accelerated Cartesian trajectories. Quadratic phase is imparted to the image to spread the signals in k-space (aka phase scrambling). By rewriting the Fourier transform as a convolution operation, a window can be introduced to the convolved chirp function, allowing a low-resolution image to be reconstructed from phase-scrambled data without prominent aliasing. This image (for each RF channel) can be used to derive coil sensitivities to drive existing parallel imaging techniques. As a proof of concept, the quadratic phase was applied by introducing an offset to the x(2) - y(2) shim and the data were reconstructed using adapted versions of the image space-based sensitivity encoding and GeneRalized Autocalibrating Partially Parallel Acquisitions algorithms. The method is demonstrated in a phantom (1 × 2, 1 × 3, and 2 × 2 acceleration) and in vivo (2 × 2 acceleration) using a 3D gradient echo acquisition. Phase scrambling can be used to perform parallel imaging acceleration without acquisition of separate coil calibration data, demonstrated here for a 3D-Cartesian trajectory. Further research is required to prove the applicability to other 2D and 3D sampling schemes. © 2014 Wiley Periodicals, Inc.

  3. Spatially parallel processing of within-dimension conjunctions.

    Science.gov (United States)

    Linnell, K J; Humphreys, G W

    2001-01-01

    Within-dimension conjunction search for red-green targets amongst red-blue, and blue-green, nontargets is extremely inefficient (Wolfe et al, 1990 Journal of Experimental Psychology: Human Perception and Performance 16 879-892). We tested whether pairs of red-green conjunction targets can nevertheless be processed spatially in parallel. Participants made speeded detection responses whenever a red-green target was present. Across trials where a second identical target was present, the distribution of detection times was compatible with the assumption that targets were processed in parallel (Miller, 1982 Cognitive Psychology 14 247-279). We show that this was not an artifact of response-competition or feature-based processing. We suggest that within-dimension conjunctions can be processed spatially in parallel. Visual search for such items may be inefficient owing to within-dimension grouping between items.
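
    The parallel-processing criterion attributed to Miller (1982) bounds the redundant-target response-time distribution by the sum of the two single-target distributions. A small sketch with synthetic reaction times shows how such a check could be computed; the data and the grid of test times are assumptions, not the study's analysis.

```python
# Illustrative check of Miller's (1982) race-model bound:
#   P(RT_AB <= t) <= P(RT_A <= t) + P(RT_B <= t)  for all t.
# Synthetic reaction times; not the data or exact analysis of the study.
import numpy as np

def ecdf(sample, t):
    """Empirical cumulative distribution of `sample` evaluated at times t."""
    return np.searchsorted(np.sort(sample), t, side="right") / len(sample)

rng = np.random.default_rng(1)
rt_a  = rng.normal(520, 60, 200)   # single target A (ms)
rt_b  = rng.normal(540, 60, 200)   # single target B (ms)
rt_ab = rng.normal(470, 55, 200)   # redundant display with two targets (ms)

t = np.linspace(300, 700, 81)
violation = ecdf(rt_ab, t) - np.minimum(ecdf(rt_a, t) + ecdf(rt_b, t), 1.0)
print("max violation of the race-model bound:", violation.max())
# A positive maximum indicates faster-than-race performance, consistent with
# spatially parallel (coactive) processing of the two targets.
```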

  4. Does parallel item content on WOMAC's Pain and Function Subscales limit its ability to detect change in functional status?

    Directory of Open Access Journals (Sweden)

    Kennedy Deborah M

    2004-06-01

    Full Text Available Abstract Background Although the Western Ontario and McMaster University Osteoarthritis Index (WOMAC) is considered the leading outcome measure for patients with osteoarthritis of the lower extremity, recent work has challenged its factorial validity and the physical function subscale's ability to detect valid change when pain and function display different profiles of change. This study examined the etiology of the WOMAC's physical function subscale's limited ability to detect change in the presence of discordant changes for pain and function. We hypothesized that the duplication of some items on the WOMAC's pain and function subscales contributed to this shortcoming. Methods Two eight-item physical function scales were abstracted from the WOMAC's 17-item physical function subscale: one contained activities and themes that were duplicated on the pain subscale (SIMILAR-8); the other version avoided overlapping activities (DISSIMILAR-8). Factorial validity of the shortened measures was assessed on 310 patients awaiting hip or knee arthroplasty. The shortened measures' abilities to detect change were examined on a sample of 104 patients following primary hip or knee arthroplasty. The WOMAC and three performance measures that included activity specific pain assessments – 40 m walk test, stair test, and timed-up-and-go test – were administered preoperatively, within 16 days of hip or knee arthroplasty, and at an interval of greater than 20 days following the first post-surgical assessment. Standardized response means were used to quantify change. Results The SIMILAR-8 did not demonstrate factorial validity; however, the factorial structure of the DISSIMILAR-8 was supported. The time to complete the performance measures more than doubled between the preoperative and first postoperative assessments supporting the theory that lower extremity functional status diminished over this interval. The DISSIMILAR-8 detected this deterioration in functional

  5. High temporal resolution magnetic resonance imaging: development of a parallel three dimensional acquisition method for functional neuroimaging; Imagerie par resonance magnetique a haute resolution temporelle: developpement d'une methode d'acquisition parallele tridimensionnelle pour l'imagerie fonctionnelle cerebrale

    Energy Technology Data Exchange (ETDEWEB)

    Rabrait, C

    2007-11-15

    Echo Planar Imaging is widely used to perform data acquisition in functional neuroimaging. This sequence allows the acquisition of a set of about 30 slices, covering the whole brain, at a spatial resolution ranging from 2 to 4 mm, and a temporal resolution ranging from 1 to 2 s. It is thus well adapted to the mapping of activated brain areas but does not allow precise study of the brain dynamics. Moreover, temporal interpolation is needed in order to correct for inter-slices delays and 2-dimensional acquisition is subject to vascular in flow artifacts. To improve the estimation of the hemodynamic response functions associated with activation, this thesis aimed at developing a 3-dimensional high temporal resolution acquisition method. To do so, Echo Volume Imaging was combined with reduced field-of-view acquisition and parallel imaging. Indeed, E.V.I. allows the acquisition of a whole volume in Fourier space following a single excitation, but it requires very long echo trains. Parallel imaging and field-of-view reduction are used to reduce the echo train durations by a factor of 4, which allows the acquisition of a 3-dimensional brain volume with limited susceptibility-induced distortions and signal losses, in 200 ms. All imaging parameters have been optimized in order to reduce echo train durations and to maximize S.N.R., so that cerebral activation can be detected with a high level of confidence. Robust detection of brain activation was demonstrated with both visual and auditory paradigms. High temporal resolution hemodynamic response functions could be estimated through selective averaging of the response to the different trials of the stimulation. To further improve S.N.R., the matrix inversions required in parallel reconstruction were regularized, and the impact of the level of regularization on activation detection was investigated. Eventually, potential applications of parallel E.V.I. such as the study of non-stationary effects in the B.O.L.D. response

  6. About Parallel Programming: Paradigms, Parallel Execution and Collaborative Systems

    Directory of Open Access Journals (Sweden)

    Loredana MOCEAN

    2009-01-01

    Full Text Available In recent years, efforts have been made to delineate a stable and unified framework in which the problems of logical parallel processing can find solutions, at least at the level of imperative languages. The results obtained so far do not yet match the effort invested. This paper is intended as a small contribution to these efforts. We propose an overview of parallel programming, parallel execution and collaborative systems.

  7. Parallel Branch-and-Bound Methods for the Job Shop Scheduling

    DEFF Research Database (Denmark)

    Clausen, Jens; Perregaard, Michael

    1998-01-01

    Job-shop scheduling (JSS) problems are among the more difficult to solve in the class of NP-complete problems. The only successful approach has been branch-and-bound based algorithms, but such algorithms depend heavily on good bound functions. Much work has been done to identify such functions...... for the JSS problem, but with limited success. Even with recent methods, it is still not possible to solve problems substantially larger than 10 machines and 10 jobs. In the current study, we focus on parallel methods for solving JSS problems. We implement two different parallel branch-and-bound algorithms...

  8. Parallel computing works!

    CERN Document Server

    Fox, Geoffrey C; Messina, Guiseppe C

    2014-01-01

    A clear illustration of how parallel computers can be successfully applied to large-scale scientific computations. This book demonstrates how a variety of applications in physics, biology, mathematics and other sciences were implemented on real parallel computers to produce new scientific results. It investigates issues of fine-grained parallelism relevant for future supercomputers with particular emphasis on hypercube architecture. The authors describe how they used an experimental approach to configure different massively parallel machines, design and implement basic system software, and develop

  9. A hybrid algorithm for parallel molecular dynamics simulations

    Science.gov (United States)

    Mangiardi, Chris M.; Meyer, R.

    2017-10-01

    This article describes algorithms for the hybrid parallelization and SIMD vectorization of molecular dynamics simulations with short-range forces. The parallelization method combines domain decomposition with a thread-based parallelization approach. The goal of the work is to enable efficient simulations of very large (tens of millions of atoms) and inhomogeneous systems on many-core processors with hundreds or thousands of cores and SIMD units with large vector sizes. In order to test the efficiency of the method, simulations of a variety of configurations with up to 74 million atoms have been performed. Results are shown that were obtained on multi-core systems with Sandy Bridge and Haswell processors as well as systems with Xeon Phi many-core processors.

  10. Experience with a clustered parallel reduction machine

    NARCIS (Netherlands)

    Beemster, M.; Hartel, Pieter H.; Hertzberger, L.O.; Hofman, R.F.H.; Langendoen, K.G.; Li, L.L.; Milikowski, R.; Vree, W.G.; Barendregt, H.P.; Mulder, J.C.

    A clustered architecture has been designed to exploit divide and conquer parallelism in functional programs. The programming methodology developed for the machine is based on explicit annotations and program transformations. It has been successfully applied to a number of algorithms resulting in a

  11. Accelerated radiotherapy planners calculated by parallelization with GPUs

    International Nuclear Information System (INIS)

    Reinado, D.; Cozar, J.; Alonso, S.; Chinillach, N.; Cortina, T.; Ricos, B.; Diez, S.

    2011-01-01

    In this paper we have developed and tested a subroutine parallelization on graphics processing unit (GPU) architectures, applied to calculations with a known standard-algorithm code. The experience acquired during these tests will also be applicable to Monte Carlo (MC) calculations in radiotherapy, provided the corresponding code is available.

  12. A PC parallel port button box provides millisecond response time accuracy under Linux.

    Science.gov (United States)

    Stewart, Neil

    2006-02-01

    For psychologists, it is sometimes necessary to measure people's reaction times to the nearest millisecond. This article describes how to use the PC parallel port to receive signals from a button box to achieve millisecond response time accuracy. The workings of the parallel port, the corresponding port addresses, and a simple Linux program for controlling the port are described. A test of the speed and reliability of button box signal detection is reported. If the reader is moderately familiar with Linux, this article should provide sufficient instruction for him or her to build and test his or her own parallel port button box. This article also describes how the parallel port could be used to control an external apparatus.
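
    On Linux, the status register at base+1 can also be polled without a dedicated driver by reading /dev/port (root privileges required). The sketch below assumes LPT1 at 0x378 and a button wired to one of the status lines; both the address and the wiring are assumptions that depend on the specific hardware, and this is not the article's program.

```python
# Poll a button wired to a parallel-port status line via /dev/port (Linux).
# Assumptions: LPT1 base address 0x378, button on the BUSY line (bit 7),
# run as root. Hardware details vary; treat this as a sketch only.
import os
import time

BASE = 0x378              # assumed LPT1 base I/O address
STATUS = BASE + 1         # status register (input lines)
BUTTON_MASK = 0x80        # bit 7 = BUSY line

fd = os.open("/dev/port", os.O_RDONLY)
t0 = time.perf_counter()
while True:                               # busy-wait polling for low latency
    os.lseek(fd, STATUS, os.SEEK_SET)
    status = os.read(fd, 1)[0]
    if status & BUTTON_MASK:              # line asserted -> button pressed
        print(f"press detected after {time.perf_counter() - t0:.6f} s")
        break
os.close(fd)
```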

  13. Characterization of DNA repair phenotypes of Xeroderma pigmentosum cell lines by a paralleled in vitro test

    International Nuclear Information System (INIS)

    Raffin, A.L.

    2009-06-01

    DNA is constantly damaged modifying the genetic information for which it encodes. Several cellular mechanisms as the Base Excision Repair (BER) and the Nucleotide Excision Repair (NER) allow recovering the right DNA sequence. The Xeroderma pigmentosum is a disease characterised by a deficiency in the NER pathway. The aim of this study was to propose an efficient and fast test for the diagnosis of this disease as an alternative to the currently available UDS test. DNA repair activities of XP cell lines were quantified using in vitro miniaturized and paralleled tests in order to establish DNA repair phenotypes of XPA and XPC deficient cells. The main advantage of the tests used in this study is the simultaneous measurement of excision or excision synthesis (ES) of several lesions by only one cellular extract. We showed on one hand that the relative ES of the different lesions depend strongly on the protein concentration of the nuclear extract tested. Working at high protein concentration allowed discriminating the XP phenotype versus the control one, whereas it was impossible under a certain concentration's threshold. On the other hand, while the UVB irradiation of control cells stimulated their repair activities, this effect was not observed in XP cells. This study brings new information on the XPA and XPC protein roles during BER and NER and underlines the complexity of the regulations of DNA repair processes. (author)

  14. A Parallel Butterfly Algorithm

    KAUST Repository

    Poulson, Jack; Demanet, Laurent; Maxwell, Nicholas; Ying, Lexing

    2014-01-01

    The butterfly algorithm is a fast algorithm which approximately evaluates a discrete analogue of the integral transform (Equation Presented.) at large numbers of target points when the kernel, K(x, y), is approximately low-rank when restricted to subdomains satisfying a certain simple geometric condition. In d dimensions with O(Nd) quasi-uniformly distributed source and target points, when each appropriate submatrix of K is approximately rank-r, the running time of the algorithm is at most O(r2Nd logN). A parallelization of the butterfly algorithm is introduced which, assuming a message latency of α and per-process inverse bandwidth of β, executes in at most (Equation Presented.) time using p processes. This parallel algorithm was then instantiated in the form of the open-source DistButterfly library for the special case where K(x, y) = exp(iΦ(x, y)), where Φ(x, y) is a black-box, sufficiently smooth, real-valued phase function. Experiments on Blue Gene/Q demonstrate impressive strong-scaling results for important classes of phase functions. Using quasi-uniform sources, hyperbolic Radon transforms, and an analogue of a three-dimensional generalized Radon transform were, respectively, observed to strong-scale from 1-node/16-cores up to 1024-nodes/16,384-cores with greater than 90% and 82% efficiency, respectively. © 2014 Society for Industrial and Applied Mathematics.

  16. Summary of functional and performance test procedures

    DEFF Research Database (Denmark)

    Mitzel, Jens; Gülzow, Erich; Friedrich, K. Andreas

    Different Test Modules (TM) are defined for the functional and performance characterization of a PEMFC stack. The master document TM2.00 defines requirements and methodology for parameter variation, stability and data acquisition.

  17. An interactive parallel processor for data analysis

    International Nuclear Information System (INIS)

    Mong, J.; Logan, D.; Maples, C.; Rathbun, W.; Weaver, D.

    1984-01-01

    A parallel array of eight minicomputers has been assembled in an attempt to deal with kiloparameter data events. By exporting computer system functions to a separate processor, the authors have been able to achieve computer amplification linearly proportional to the number of executing processors

  18. Safety of pulmonary function testing

    DEFF Research Database (Denmark)

    Roberts, Cara; Ward, Simon; Walsted, Emil

    2017-01-01

    BACKGROUND: Pulmonary function testing (PFT) is a key investigation in the evaluation of individuals with respiratory symptoms; however, the safety of routine and specialised PFT testing has not been reported in a large data set. Using patient safety incident (PSI) records, we aimed to assess risk...... was rated using the NHS National Patient Safety Agency and any hospital admission reported. RESULTS: There were 119 PSIs reported from 186 000 PFT; that is, 0.6 PSIs per 1000 tests. Cardiopulmonary PSIs were 3.3 times more likely to occur than non-cardiopulmonary (95% CI 2.17 to 5.12). Syncope was the most...

  19. Correlation with liver scintigram, reticuloendothelial function test, plasma endotoxin level and liver function tests in chronic liver diseases. Multivariate analysis

    Energy Technology Data Exchange (ETDEWEB)

    Ohmoto, Kenji; Yamamoto, Shinichi; Ideguchi, Seiji and others

    1989-02-01

    Liver scintigrams with Tc-99m phytate were reviewed in a total of 64 consecutive patients, comprising 28 with chronic hepatitis and 36 with liver cirrhosis. Reticuloendothelial (RES) function, plasma endotoxin (Et) levels and findings of general liver function tests were used as reference parameters to determine the diagnostic ability of liver scintigraphy. Multivariate analyses revealed that liver scintigrams had a strong correlation with RES function and Et levels in terms of morphology of the liver and hepatic and bone marrow Tc-99m uptake. General liver function tests revealed gamma globulin to be correlated with hepatic uptake and the degree of splenomegaly on liver scintigrams; and ICG levels at 15 min to be correlated with bone marrow and splenic uptake. Accuracy of liver scintigraphy was 73% for chronic hepatitis, which was inferior to general liver function tests (83%). When both modalities were combined, diagnostic accuracy increased to 95%. Liver scintigraphy seems to be useful as a complementary approach. (Namekawa, K).

  20. Endpoint-based parallel data processing in a parallel active messaging interface of a parallel computer

    Science.gov (United States)

    Archer, Charles J.; Blocksome, Michael A.; Ratterman, Joseph D.; Smith, Brian E.

    2014-08-12

    Endpoint-based parallel data processing in a parallel active messaging interface (`PAMI`) of a parallel computer, the PAMI composed of data communications endpoints, each endpoint including a specification of data communications parameters for a thread of execution on a compute node, including specifications of a client, a context, and a task, the compute nodes coupled for data communications through the PAMI, including establishing a data communications geometry, the geometry specifying, for tasks representing processes of execution of the parallel application, a set of endpoints that are used in collective operations of the PAMI including a plurality of endpoints for one of the tasks; receiving in endpoints of the geometry an instruction for a collective operation; and executing the instruction for a collective operation through the endpoints in dependence upon the geometry, including dividing data communications operations among the plurality of endpoints for one of the tasks.

  1. Development of massively parallel quantum chemistry program SMASH

    International Nuclear Information System (INIS)

    Ishimura, Kazuya

    2015-01-01

    A massively parallel program for quantum chemistry calculations SMASH was released under the Apache License 2.0 in September 2014. The SMASH program is written in the Fortran90/95 language with MPI and OpenMP standards for parallelization. Frequently used routines, such as one- and two-electron integral calculations, are modularized to make program developments simple. The speed-up of the B3LYP energy calculation for (C150H30)2 with the cc-pVDZ basis set (4500 basis functions) was 50,499 on 98,304 cores of the K computer

  2. Ultrasound Vector Flow Imaging: Part II: Parallel Systems

    DEFF Research Database (Denmark)

    Jensen, Jørgen Arendt; Nikolov, Svetoslav Ivanov; Yu, Alfred C. H.

    2016-01-01

    The paper gives a review of the current state-of-the-art in ultrasound parallel acquisition systems for flow imaging using spherical and plane wave emissions. The imaging methods are explained along with the advantages of using these very fast and sensitive velocity estimators. These experimental... ultrasound imaging for studying brain function in animals. The paper explains the underlying acquisition and estimation methods for fast 2-D and 3-D velocity imaging and gives a number of examples. Future challenges and the potentials of parallel acquisition systems for flow imaging are also discussed....

  3. Functional and nonfunctional testing of ATM networks

    Science.gov (United States)

    Ricardo, Manuel; Ferreira, M. E. P.; Guimaraes, Francisco E.; Mamede, J.; Henriques, M.; da Silva, Jorge A.; Carrapatoso, E.

    1995-02-01

    ATM networks will support new multimedia services that will require new protocols; those services and protocols will need different test strategies and tools. In this paper, the concepts of functional and non-functional testers of ATM networks are discussed, a multimedia service and its requirements are presented, and finally a summary description is given of an ATM network and of the test tool that will be used to validate it.

  4. Web based parallel/distributed medical data mining using software agents

    Energy Technology Data Exchange (ETDEWEB)

    Kargupta, H.; Stafford, B.; Hamzaoglu, I.

    1997-12-31

    This paper describes an experimental parallel/distributed data mining system PADMA (PArallel Data Mining Agents) that uses software agents for local data accessing and analysis and a web based interface for interactive data visualization. It also presents the results of applying PADMA for detecting patterns in unstructured texts of postmortem reports and laboratory test data for Hepatitis C patients.

  5. Functional tests for myocardial ischemia

    International Nuclear Information System (INIS)

    Levinson, J.R.; Guiney, T.E.; Boucher, C.A.

    1991-01-01

    Functional tests for myocardial ischemia are numerous. Most depend upon a combination of either exercise or pharmacologic intervention with analysis of the electrocardiogram, of regional perfusion with radionuclide imaging, or of regional wall motion with radionuclide imaging or echocardiography. While each test has unique features, especially at the research level, they are generally quite similar in clinical practice, so the clinician is advised to concentrate on one or two in which local expertise is high. 22 references

  6. Concurrent particle-in-cell plasma simulation on a multi-transputer parallel computer

    International Nuclear Information System (INIS)

    Khare, A.N.; Jethra, A.; Patel, Kartik

    1992-01-01

    This report describes the parallelization of a Particle-in-Cell (PIC) plasma simulation code on a multi-transputer parallel computer. The algorithm used in the parallelization of the PIC method is described. The decomposition schemes related to the distribution of the particles among the processors are discussed. The implementation of the algorithm on a transputer network connected as a torus is presented. The solutions of the problems related to global communication of data are presented in the form of a set of generalized communication functions. The performance of the program as a function of data size and the number of transputers show that the implementation is scalable and represents an effective way of achieving high performance at acceptable cost. (author). 11 refs., 4 figs., 2 tabs., appendices
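
    The particle-decomposition idea, in which each processor deposits charge for its own block of particles and the partial grids are then combined by a global communication step, can be sketched as follows. This is a generic multiprocessing analogue with assumed grid size and particle count, not the transputer implementation described above.

```python
# Toy particle decomposition for the charge-deposition step of a 1-D PIC code:
# each worker deposits charge for its own particle block; the partial grids are
# then summed, standing in for the global communication step.
import numpy as np
from multiprocessing import Pool

NGRID = 128
L = 1.0
DX = L / NGRID

def deposit(block):
    """Nearest-grid-point charge deposition for one block of particles."""
    rho = np.zeros(NGRID)
    cells = (block / DX).astype(int) % NGRID
    np.add.at(rho, cells, 1.0)
    return rho

if __name__ == "__main__":
    rng = np.random.default_rng(42)
    x = rng.uniform(0.0, L, 100_000)            # particle positions
    blocks = np.array_split(x, 8)               # distribute particles over workers
    with Pool(8) as pool:
        rho = sum(pool.map(deposit, blocks))    # "global" reduction of partial grids
    print(rho.sum())                            # equals the particle count
```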

  7. Quantum algorithms for testing Boolean functions

    Directory of Open Access Journals (Sweden)

    Erika Andersson

    2010-06-01

    Full Text Available We discuss quantum algorithms, based on the Bernstein-Vazirani algorithm, for finding which variables a Boolean function depends on. There are 2^n possible linear Boolean functions of n variables; given a linear Boolean function, the Bernstein-Vazirani quantum algorithm can deterministically identify which one of these Boolean functions we are given using just one single function query. The same quantum algorithm can also be used to learn which input variables other types of Boolean functions depend on, with a success probability that depends on the form of the Boolean function that is tested, but does not depend on the total number of input variables. We also outline a procedure to further amplify the success probability, based on another quantum algorithm, the Grover search.
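
    In the phase-oracle picture, the Bernstein-Vazirani step amounts to applying a layer of Hadamards to the amplitudes (-1)^(s·x), which maps them onto the single basis state |s>. The small state-vector simulation below is our own illustration of that single-query behaviour, not code from the paper; the hidden string and register size are arbitrary.

```python
# State-vector sketch of the Bernstein-Vazirani idea: one phase-oracle query
# of the linear function f(x) = s.x (mod 2) followed by Hadamards reveals s.
# Our own illustration, not code from the record.
import numpy as np

def hadamard_n(vec):
    """Apply a Hadamard on every qubit of an amplitude vector of length 2**n."""
    n = int(np.log2(len(vec)))
    v = vec.reshape([2] * n).astype(float)
    for q in range(n):
        v = np.moveaxis(v, q, 0)
        v = np.stack([v[0] + v[1], v[0] - v[1]]) / np.sqrt(2.0)
        v = np.moveaxis(v, 0, q)
    return v.reshape(-1)

n, s = 5, 0b10110                          # hidden string s (assumed example)
x = np.arange(2**n)
popcount = np.array([bin(v).count("1") for v in (x & s)])
state = np.full(2**n, 1 / np.sqrt(2**n))   # uniform superposition after H^n
state *= (-1.0) ** popcount                # one phase-oracle query: (-1)^(s.x)
state = hadamard_n(state)                  # final layer of Hadamards
print(f"measured |{np.argmax(np.abs(state)):0{n}b}>, hidden s = {s:0{n}b}")
```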

  8. Kalman Filter Tracking on Parallel Architectures

    International Nuclear Information System (INIS)

    Cerati, Giuseppe; Elmer, Peter; Krutelyov, Slava; Lantz, Steven; Lefebvre, Matthieu; McDermott, Kevin; Riley, Daniel; Tadel, Matevž; Wittich, Peter; Würthwein, Frank; Yagil, Avi

    2016-01-01

    Power density constraints are limiting the performance improvements of modern CPUs. To address this we have seen the introduction of lower-power, multi-core processors such as GPGPU, ARM and Intel MIC. In order to achieve the theoretical performance gains of these processors, it will be necessary to parallelize algorithms to exploit larger numbers of lightweight cores and specialized functions like large vector units. Track finding and fitting is one of the most computationally challenging problems for event reconstruction in particle physics. At the High-Luminosity Large Hadron Collider (HL-LHC), for example, this will be by far the dominant problem. The need for greater parallelism has driven investigations of very different track finding techniques such as Cellular Automata or Hough Transforms. The most common track finding techniques in use today, however, are those based on a Kalman filter approach. Significant experience has been accumulated with these techniques on real tracking detector systems, both in the trigger and offline. They are known to provide high physics performance, are robust, and are in use today at the LHC. Given the utility of the Kalman filter in track finding, we have begun to port these algorithms to parallel architectures, namely Intel Xeon and Xeon Phi. We report here on our progress towards an end-to-end track reconstruction algorithm fully exploiting vectorization and parallelization techniques in a simplified experimental environment
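
    For reference, the basic predict/update cycle that such track fitters are built around looks like this in a simplified one-dimensional constant-velocity setting. It is a generic Kalman filter sketch with assumed noise parameters, not the vectorized Xeon/Xeon Phi tracking code described above.

```python
# Generic Kalman filter predict/update cycle for a 1-D constant-velocity
# "track" (position, velocity). Illustrative only.
import numpy as np

dt = 1.0
F = np.array([[1.0, dt], [0.0, 1.0]])      # state transition
H = np.array([[1.0, 0.0]])                 # we measure position only
Q = 1e-3 * np.eye(2)                       # process noise
R = np.array([[0.25]])                     # measurement noise

x = np.array([0.0, 0.0])                   # initial state estimate
P = np.eye(2)                              # initial covariance

rng = np.random.default_rng(7)
truth_v = 1.0
for k in range(1, 21):
    z = truth_v * k * dt + rng.normal(0.0, 0.5)       # noisy position "hit"
    # predict
    x = F @ x
    P = F @ P @ F.T + Q
    # update
    y = z - H @ x                                     # innovation
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)                    # Kalman gain
    x = x + (K @ y).ravel()
    P = (np.eye(2) - K @ H) @ P

print("estimated velocity:", x[1])   # should approach the true value 1.0
```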

  9. Processing communications events in parallel active messaging interface by awakening thread from wait state

    Science.gov (United States)

    Archer, Charles J; Blocksome, Michael A; Ratterman, Joseph D; Smith, Brian E

    2013-10-22

    Processing data communications events in a parallel active messaging interface (`PAMI`) of a parallel computer that includes compute nodes that execute a parallel application, with the PAMI including data communications endpoints, and the endpoints are coupled for data communications through the PAMI and through other data communications resources, including determining by an advance function that there are no actionable data communications events pending for its context, placing by the advance function its thread of execution into a wait state, waiting for a subsequent data communications event for the context; responsive to occurrence of a subsequent data communications event for the context, awakening by the thread from the wait state; and processing by the advance function the subsequent data communications event now pending for the context.

  10. Update on endoscopic pancreatic function testing

    Institute of Scientific and Technical Information of China (English)

    Tyler Stevens; Mansour A Parsi

    2011-01-01

    Hormone-stimulated pancreatic function tests (PFTs) are considered the gold standard for measuring pancreatic exocrine function. PFTs involve the administration of intravenous secretin or cholecystokinin, followed by collection and analysis of pancreatic secretions. Because exocrine function may decline in the earliest phase of pancreatic fibrosis, PFTs are considered accurate for diagnosing chronic pancreatitis. Unfortunately, these potentially valuable tests are infrequently performed except at specialized centers, because they are time consuming and complicated. To overcome these limitations, endoscopic PFT methods have been developed which include aspiration of pancreatic secretions through the suction channel of the endoscope. The secretin endoscopic pancreatic function test (ePFT) involves collection of duodenal aspirates at 15, 30, 45 and 60 min after secretin stimulation. A bicarbonate concentration greater than 80 mmol/L in any of the samples is considered a normal result. The secretin ePFT has demonstrated good sensitivity and specificity compared with various reference standards, including the "Dreiling tube" secretin PFT, endoscopic ultrasound, and surgical histology. Furthermore, a standard autoanalyzer can be used for bicarbonate analysis, which allows the secretin ePFT to be performed at any hospital. The secretin ePFT may complement imaging tests like endoscopic ultrasound (EUS) in the diagnosis of early chronic pancreatitis. This paper will review the literature validating the use of ePFT in the diagnosis of exocrine insufficiency and chronic pancreatitis. Newer developments will also be discussed, including the feasibility of combined EUS/ePFT, the use of cholecystokinin alone or in combination with secretin, and the discovery of new protein and lipid pancreatic juice biomarkers which may complement traditional fluid analysis.

  11. Parallel phase model : a programming model for high-end parallel machines with manycores.

    Energy Technology Data Exchange (ETDEWEB)

    Wu, Junfeng (Syracuse University, Syracuse, NY); Wen, Zhaofang; Heroux, Michael Allen; Brightwell, Ronald Brian

    2009-04-01

    This paper presents a parallel programming model, Parallel Phase Model (PPM), for next-generation high-end parallel machines based on a distributed memory architecture consisting of a networked cluster of nodes with a large number of cores on each node. PPM has a unified high-level programming abstraction that facilitates the design and implementation of parallel algorithms to exploit both the parallelism of the many cores and the parallelism at the cluster level. The programming abstraction will be suitable for expressing both fine-grained and coarse-grained parallelism. It includes a few high-level parallel programming language constructs that can be added as an extension to an existing (sequential or parallel) programming language such as C; and the implementation of PPM also includes a light-weight runtime library that runs on top of an existing network communication software layer (e.g. MPI). Design philosophy of PPM and details of the programming abstraction are also presented. Several unstructured applications that inherently require high-volume random fine-grained data accesses have been implemented in PPM with very promising results.

  12. Systematic approach for deriving feasible mappings of parallel algorithms to parallel computing platforms

    NARCIS (Netherlands)

    Arkin, Ethem; Tekinerdogan, Bedir; Imre, Kayhan M.

    2017-01-01

    The need for high-performance computing together with the increasing trend from single processor to parallel computer architectures has leveraged the adoption of parallel computing. To benefit from parallel computing power, usually parallel algorithms are defined that can be mapped and executed

  13. The simplified spherical harmonics (SP_L) methodology with space and moment decomposition in parallel environments

    Energy Technology Data Exchange (ETDEWEB)

    Gianluca, Longoni; Alireza, Haghighat [Florida University, Nuclear and Radiological Engineering Department, Gainesville, FL (United States)

    2003-07-01

    In recent years, the SP_L (simplified spherical harmonics) equations have received renewed interest for the simulation of nuclear systems. We have derived the SP_L equations starting from the even-parity form of the S_N equations. The SP_L equations form a system of (L+1)/2 second order partial differential equations that can be solved with standard iterative techniques such as the Conjugate Gradient (CG). We discretized the SP_L equations with the finite-volume approach in a 3-D Cartesian space. We developed a new 3-D general code, Pensp_L (Parallel Environment Neutral-particle SP_L). Pensp_L solves both fixed source and criticality eigenvalue problems. In order to optimize the memory management, we implemented a Compressed Diagonal Storage (CDS) to store the SP_L matrices. Pensp_L includes parallel algorithms for space and moment domain decomposition. The computational load is distributed on different processors, using a mapping function, which maps the 3-D Cartesian space and moments onto processors. The code is written in Fortran 90 using the Message Passing Interface (MPI) libraries for the parallel implementation of the algorithm. The code has been tested on the Pcpen cluster and the parallel performance has been assessed in terms of speed-up and parallel efficiency. (author)
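
    Compressed Diagonal Storage keeps only the nonzero diagonals of a banded matrix together with their offsets. A matrix-vector product in that format, driving a plain conjugate-gradient solve on a small 1-D Poisson matrix, can be sketched as below; this is a generic illustration of CDS and CG, not code from Pensp_L.

```python
# Compressed Diagonal Storage (CDS) matrix-vector product and a plain
# conjugate-gradient solve, illustrated on a 1-D Poisson matrix
# (diagonals with offsets -1, 0, +1). Generic sketch only.
import numpy as np

def cds_matvec(diags, offsets, x):
    """y = A @ x with A stored as a list of diagonals and their offsets."""
    y = np.zeros_like(x)
    n = len(x)
    for d, off in zip(diags, offsets):
        if off >= 0:
            y[: n - off] += d[: n - off] * x[off:]
        else:
            y[-off:] += d[-off:] * x[: n + off]
    return y

def cg(diags, offsets, b, tol=1e-10, maxit=1000):
    x = np.zeros_like(b)
    r = b - cds_matvec(diags, offsets, x)
    p = r.copy()
    rs = r @ r
    for _ in range(maxit):
        Ap = cds_matvec(diags, offsets, p)
        alpha = rs / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

n = 50
diags = [np.full(n, -1.0), np.full(n, 2.0), np.full(n, -1.0)]
offsets = [-1, 0, 1]
b = np.ones(n)
x = cg(diags, offsets, b)
print(np.max(np.abs(cds_matvec(diags, offsets, x) - b)))  # ~0, i.e. A x = b
```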

  14. GRADSPMHD: A parallel MHD code based on the SPH formalism

    Science.gov (United States)

    Vanaverbeke, S.; Keppens, R.; Poedts, S.

    2014-03-01

    We present GRADSPMHD, a completely Lagrangian parallel magnetohydrodynamics code based on the SPH formalism. The implementation of the equations of SPMHD in the “GRAD-h” formalism assembles known results, including the derivation of the discretized MHD equations from a variational principle, the inclusion of time-dependent artificial viscosity, resistivity and conductivity terms, as well as the inclusion of a mixed hyperbolic/parabolic correction scheme for satisfying the ∇·B = 0 constraint on the magnetic field. The code uses a tree-based formalism for neighbor finding and can optionally use the tree code for computing the self-gravity of the plasma. The structure of the code closely follows the framework of our parallel GRADSPH FORTRAN 90 code which we added previously to the CPC program library. We demonstrate the capabilities of GRADSPMHD by running 1, 2, and 3 dimensional standard benchmark tests and we find good agreement with previous work done by other researchers. The code is also applied to the problem of simulating the magnetorotational instability in 2.5D shearing box tests as well as in global simulations of magnetized accretion disks. We find good agreement with available results on this subject in the literature. Finally, we discuss the performance of the code on a parallel supercomputer with distributed memory architecture. Catalogue identifier: AERP_v1_0 Program summary URL:http://cpc.cs.qub.ac.uk/summaries/AERP_v1_0.html Program obtainable from: CPC Program Library, Queen’s University, Belfast, N. Ireland Licensing provisions: Standard CPC licence, http://cpc.cs.qub.ac.uk/licence/licence.html No. of lines in distributed program, including test data, etc.: 620503 No. of bytes in distributed program, including test data, etc.: 19837671 Distribution format: tar.gz Programming language: FORTRAN 90/MPI. Computer: HPC cluster. Operating system: Unix. Has the code been vectorized or parallelized?: Yes, parallelized using MPI. RAM: ~30 MB for a

  15. Parallel algorithms

    CERN Document Server

    Casanova, Henri; Robert, Yves

    2008-01-01

    ""…The authors of the present book, who have extensive credentials in both research and instruction in the area of parallelism, present a sound, principled treatment of parallel algorithms. … This book is very well written and extremely well designed from an instructional point of view. … The authors have created an instructive and fascinating text. The book will serve researchers as well as instructors who need a solid, readable text for a course on parallelism in computing. Indeed, for anyone who wants an understandable text from which to acquire a current, rigorous, and broad vi

  16. Pulmonary function testing in children and infants

    International Nuclear Information System (INIS)

    Vogt, B; Weiler, N; Frerichs, I; Falkenberg, C

    2014-01-01

    Pulmonary function testing is performed in children and infants with the aim of documenting lung development with age and making diagnoses of lung diseases. In children and infants with an established lung disease, pulmonary function is tested to assess the disease progression and the efficacy of therapy. It is difficult to carry out the measurements in this age group without disturbances, so obtaining results of good quality and reproducibility is challenging. Young children are often uncooperative during the examinations. This is partly related to their young age but also due to the long testing duration and the unpopular equipment. We address a variety of examination techniques for lung function assessment in children and infants in this review. We describe the measuring principles, examination procedures, clinical findings and their interpretation, as well as advantages and limitations of these methods. The comparability between devices and centres as well as the availability of reference values are still considered a challenge in many of these techniques. In recent years, new technologies have emerged allowing the assessment of lung function not only on the global level but also on the regional level. This opens new possibilities for detecting regional lung function heterogeneity that might lead to a better understanding of respiratory pathophysiology in children. (topical review)

  17. Significance tests for functional data with complex dependence structure.

    Science.gov (United States)

    Staicu, Ana-Maria; Lahiri, Soumen N; Carroll, Raymond J

    2015-01-01

    We propose an L2-norm based global testing procedure for the null hypothesis that multiple group mean functions are equal, for functional data with complex dependence structure. Specifically, we consider the setting of functional data with a multilevel structure of the form groups-clusters or subjects-units, where the unit-level profiles are spatially correlated within the cluster, and the cluster-level data are independent. Orthogonal series expansions are used to approximate the group mean functions and the test statistic is estimated using the basis coefficients. The asymptotic null distribution of the test statistic is developed, under mild regularity conditions. To our knowledge this is the first work that studies hypothesis testing, when data have such complex multilevel functional and spatial structure. Two small-sample alternatives, including a novel block bootstrap for functional data, are proposed, and their performance is examined in simulation studies. The paper concludes with an illustration of a motivating experiment.

  18. Significance tests for functional data with complex dependence structure

    KAUST Repository

    Staicu, Ana-Maria

    2015-01-01

    We propose an L2-norm based global testing procedure for the null hypothesis that multiple group mean functions are equal, for functional data with complex dependence structure. Specifically, we consider the setting of functional data with a multilevel structure of the form groups-clusters or subjects-units, where the unit-level profiles are spatially correlated within the cluster, and the cluster-level data are independent. Orthogonal series expansions are used to approximate the group mean functions and the test statistic is estimated using the basis coefficients. The asymptotic null distribution of the test statistic is developed, under mild regularity conditions. To our knowledge this is the first work that studies hypothesis testing, when data have such complex multilevel functional and spatial structure. Two small-sample alternatives, including a novel block bootstrap for functional data, are proposed, and their performance is examined in simulation studies. The paper concludes with an illustration of a motivating experiment.

  19. A parallelized three-dimensional cellular automaton model for grain growth during additive manufacturing

    Science.gov (United States)

    Lian, Yanping; Lin, Stephen; Yan, Wentao; Liu, Wing Kam; Wagner, Gregory J.

    2018-05-01

    In this paper, a parallelized 3D cellular automaton computational model is developed to predict grain morphology for solidification of metal during the additive manufacturing process. Solidification phenomena are characterized by highly localized events, such as the nucleation and growth of multiple grains. As a result, parallelization requires careful treatment of load balancing between processors as well as interprocess communication in order to maintain a high parallel efficiency. We give a detailed summary of the formulation of the model, as well as a description of the communication strategies implemented to ensure parallel efficiency. Scaling tests on a representative problem with about half a billion cells demonstrate parallel efficiency of more than 80% on 8 processors and around 50% on 64; loss of efficiency is attributable to load imbalance due to near-surface grain nucleation in this test problem. The model is further demonstrated through an additive manufacturing simulation with resulting grain structures showing reasonable agreement with those observed in experiments.
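
    A serial toy version of such a cellular automaton, in which nucleated cells carry a grain identity and capture untransformed neighbouring cells each step, conveys the local update rule that makes domain-decomposed parallelization (with its near-surface load-balance issues) attractive. The grid size, seed count and periodic neighbourhood are assumptions for illustration; this is not the published model.

```python
# Toy 2-D cellular automaton for grain growth: nucleated cells carry a grain
# id and capture untransformed von Neumann neighbours each step.
# Serial illustration only; not the parallel model of the record.
import numpy as np

rng = np.random.default_rng(3)
N, n_seeds = 200, 40
grid = np.zeros((N, N), dtype=int)              # 0 = untransformed (liquid)

seeds = rng.choice(N * N, size=n_seeds, replace=False)
grid.flat[seeds] = np.arange(1, n_seeds + 1)    # nucleation: unique grain ids

while (grid == 0).any():
    new = grid.copy()
    for shift, axis in [(1, 0), (-1, 0), (1, 1), (-1, 1)]:
        neigh = np.roll(grid, shift, axis=axis)  # periodic neighbour lookup
        capture = (new == 0) & (neigh != 0)      # liquid cell next to a grain
        new[capture] = neigh[capture]
    grid = new

print("number of grains:", len(np.unique(grid)))
```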

  1. Esophageal function testing: Billing and coding update.

    Science.gov (United States)

    Khan, A; Massey, B; Rao, S; Pandolfino, J

    2018-01-01

    Esophageal function testing is being increasingly utilized in diagnosis and management of esophageal disorders. There have been several recent technological advances in the field to allow practitioners the ability to more accurately assess and treat such conditions, but there has been a relative lack of education in the literature regarding the associated Common Procedural Terminology (CPT) codes and methods of reimbursement. This review, commissioned and supported by the American Neurogastroenterology and Motility Society Council, aims to summarize each of the CPT codes for esophageal function testing and show the trends of associated reimbursement, as well as recommend coding methods in a practical context. We also aim to encourage many of these codes to be reviewed on a gastrointestinal (GI) societal level, by providing evidence of both discrepancies in coding definitions and inadequate reimbursement in this new era of esophageal function testing. © 2017 John Wiley & Sons Ltd.

  2. Parallelizing AT with MatlabMPI

    International Nuclear Information System (INIS)

    2011-01-01

    The Accelerator Toolbox (AT) is a high-level collection of tools and scripts specifically oriented toward solving problems dealing with computational accelerator physics. It is integrated into the MATLAB environment, which provides an accessible, intuitive interface for accelerator physicists, allowing researchers to focus the majority of their efforts on simulations and calculations, rather than programming and debugging difficulties. Efforts toward parallelization of AT have been put in place to upgrade its performance to modern standards of computing. We utilized the packages MatlabMPI and pMatlab, which were developed by MIT Lincoln Laboratory, to set up a message-passing environment that could be called within MATLAB and that provided the necessary prerequisites for multithreaded processing. On local quad-core CPUs, we were able to demonstrate processor efficiencies of roughly 95% and speed increases of nearly 380%. By exploiting modern parallel computing, we demonstrated highly efficient per-processor speedups in AT's beam-tracking functions. Extrapolating from these predictions, we can expect to reduce week-long computation runtimes to less than 15 minutes. This is a huge performance improvement and has enormous implications for the future computing power of the accelerator physics group at SSRL. However, one of the downfalls of parringpass is its current lack of transparency; the pMatlab and MatlabMPI packages must first be well understood by the user before the system can be configured to run the scripts. In addition, the instantiation of argument parameters requires internal modification of the source code. Thus, parringpass cannot be directly run from the MATLAB command line, which detracts from its flexibility and user-friendliness. Future work in AT's parallelization will focus on development of external functions and scripts that can be called from within MATLAB and configured on multiple nodes, while

  3. Parallelization Issues and Particle-In-Cell Codes.

    Science.gov (United States)

    Elster, Anne Cathrine

    1994-01-01

    "Everything should be made as simple as possible, but not simpler." Albert Einstein. The field of parallel scientific computing has concentrated on parallelization of individual modules such as matrix solvers and factorizers. However, many applications involve several interacting modules. Our analyses of a particle-in-cell code modeling charged particles in an electric field, show that these accompanying dependencies affect data partitioning and lead to new parallelization strategies concerning processor, memory and cache utilization. Our test-bed, a KSR1, is a distributed memory machine with a globally shared addressing space. However, most of the new methods presented hold generally for hierarchical and/or distributed memory systems. We introduce a novel approach that uses dual pointers on the local particle arrays to keep the particle locations automatically partially sorted. Complexity and performance analyses with accompanying KSR benchmarks, have been included for both this scheme and for the traditional replicated grids approach. The latter approach maintains load-balance with respect to particles. However, our results demonstrate it fails to scale properly for problems with large grids (say, greater than 128-by-128) running on as few as 15 KSR nodes, since the extra storage and computation time associated with adding the grid copies, becomes significant. Our grid partitioning scheme, although harder to implement, does not need to replicate the whole grid. Consequently, it scales well for large problems on highly parallel systems. It may, however, require load balancing schemes for non-uniform particle distributions. Our dual pointer approach may facilitate this through dynamically partitioned grids. We also introduce hierarchical data structures that store neighboring grid-points within the same cache -line by reordering the grid indexing. This alignment produces a 25% savings in cache-hits for a 4-by-4 cache. A consideration of the input data's effect on

  4. Modulation transfer function assessment in parallel beam and fan beam collimators with square and cylindrical holes.

    Science.gov (United States)

    Khorshidi, Abdollah; Ashoor, Mansour

    2014-05-01

    This study investigates modulation transfer function (MTF) in parallel beam (PB) and fan beam (FB) collimators using the Monte Carlo method with full width at half maximum (FWHM), square and circular-shaped holes, and scatter and penetration (S + P) components. A regulation similar to the lead-to-air ratio was used for both collimators to estimate output data. The hole pattern was designed to compare FB by PB parameters. The radioactive source in air and in a water phantom placed in front of the collimators was simulated using MCNP5 code. The test results indicated that the square holes in PB (PBs) had better FWHM than did the cylindrical (PBc) holes. In contrast, the cylindrical holes in the FB (FBc) had better FWHM than the square holes. In general, the resolution of FBc was better than that of the PBc in air and scatter mediums. The S + P decreased for all collimators as the distance from the source to the collimator surface (z) increased. The FBc had a lower S + P than FBs, but PBc had a higher S + P than PBs. Of the FB and PB collimators with the identical hole shapes, PBs had a smaller S + P than FBs, and FBc had a smaller S + P than PBc. The MTF value for the FB was greater than for the PB and had increased spatial frequency; the FBc had higher MTF than the FBs and PB collimators. Estimating the FB using PB parameters and diverse hole shapes may be useful in collimator design to improve the resolution and efficiency of SPECT images.

  5. Benefits of a parallel hybrid electric architecture on medium commercial vehicles

    Energy Technology Data Exchange (ETDEWEB)

    Boot, Marco Aimo; Consano, Ludovico [Iveco S.p.A, Turin (Italy)

    2009-07-01

    Hybrid electric technology is becoming an increasingly interesting solution for medium and heavy trucks involved in urban and suburban missions. The increasing demand for gas and oil, consequent price rises and environmental concerns are driving a market that is in need of alternative solutions. For these reasons, the growth in the global hybrid market has significantly exceeded all hybrid sales forecasts. The parallel hybrid electric vehicle (PHEV) employs an additional power source (electric motogenerator) in combination with the conventional diesel engine. This architecture exploits the benefits of both power sources in order to reduce fuel consumption, increase the overall power and, above all, decrease CO2 emissions. Moreover, the emissions reduction target is led by EU Regulations and local initiatives for traffic limitations, but the real drivers for the growth of the market are demonstrable fuel economy improvements and productivity cost optimization (global efficiency). This paper presents the results achieved by Iveco in the development and testing of parallel hybrid systems applied to medium-range commercial vehicles, with the intent of evaluating functionality and driveability performance and of achieving the greatest reduction in fuel consumption and emissions in different real-world missions. The system architecture foresees one electric motor/generator and a single clutch unit. An external electrical power source for battery recharging is not necessary. The chosen configuration allows the implementation of the following functional modes: Stop and Start with Electric Launch, Hybrid Mode, Regenerative Braking Mode, Inertial Start and Creeping Mode. The software contained in the supervisor control unit has been tuned to customer-specific missions, taking into account on-road data acquisition in order to demonstrate the reliability, driveability and overall efficiency of the hybrid system. The field tests carried out in collaboration with

  6. LOOP-3, Hydraulic Stability in Heated Parallel Channels

    Energy Technology Data Exchange (ETDEWEB)

    Davies, A L [AEEW, Dorset (United Kingdom)

    1968-02-01

    1 - Nature of physical problem solved: Hydraulic stability in parallel channels. 2 - Method of solution: Calculation of transfer functions developed in reference (10 below). 3 - Restrictions on the complexity of the problem: Only due to assumptions in analysis (see ref.)

  7. A parallel solver for huge dense linear systems

    Science.gov (United States)

    Badia, J. M.; Movilla, J. L.; Climente, J. I.; Castillo, M.; Marqués, M.; Mayo, R.; Quintana-Ortí, E. S.; Planelles, J.

    2011-11-01

    HDSS (Huge Dense Linear System Solver) is a Fortran Application Programming Interface (API) that facilitates the parallel solution of very large dense systems for scientists and engineers. The API makes use of parallelism to yield an efficient solution of the systems on a wide range of parallel platforms, from clusters of processors to massively parallel multiprocessors. It exploits out-of-core strategies that leverage secondary memory in order to solve huge linear systems on the order of 100,000 equations. The API is based on the parallel linear algebra library PLAPACK and on its Out-Of-Core (OOC) extension POOCLAPACK. Both PLAPACK and POOCLAPACK use the Message Passing Interface (MPI) as the communication layer and BLAS to perform the local matrix operations. The API provides a friendly interface to users, hiding almost all the technical aspects related to the parallel execution of the code and the use of secondary memory to solve the systems. In particular, the API can automatically select the best way to store and solve the systems, depending on the dimension of the system, the number of processes and the main memory of the platform. Experimental results on several parallel platforms report high performance, reaching more than 1 TFLOPS with 64 cores to solve a system with more than 200,000 equations and more than 10,000 right-hand side vectors. New version program summary: Program title: Huge Dense System Solver (HDSS); Catalogue identifier: AEHU_v1_1; Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEHU_v1_1.html; Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland; Licensing provisions: Standard CPC licence, http://cpc.cs.qub.ac.uk/licence/licence.html; No. of lines in distributed program, including test data, etc.: 87 062; No. of bytes in distributed program, including test data, etc.: 1 069 110; Distribution format: tar.gz; Programming language: Fortran90, C; Computer: Parallel architectures: multiprocessors, computer clusters; Operating system

  8. Do functional tests predict low back pain?

    Science.gov (United States)

    Takala, E P; Viikari-Juntura, E

    2000-08-15

    A cohort of 307 nonsymptomatic workers and another cohort of 123 workers with previous episodes of low back pain were followed up for 2 years. The outcomes were measured by symptoms, medical consultations, and sick leaves due to low back disorders. To study the predictive value of a set of tests measuring the physical performance of the back in a working population. The hypothesis was that subjects with poor functional capacity are liable to back disorders. Reduced functional performance has been associated with back pain. There are few data to show whether reduced functional capacity is a cause or a consequence of pain. Mobility of the trunk in forward and side bending, maximal isokinetic trunk extension, flexion and lifting strength, and static endurance of back extension were measured. Standing balance and foot reaction time were recorded with a force plate. Clinical tests for the provocation of back or leg pain were performed. Gender, workload, age, and anthropometrics were managed as potential confounders in the analysis. Marked overlapping was seen in the measures of the subjects with different outcomes. Among the nonsymptomatic subjects, low performance in tests of mobility and standing balance was associated with future back disorders. Among workers with previous episodes of back pain, low isokinetic extension strength, poor standing balance, and positive clinical signs predicted future pain. Some associations were found between the functional tests and future low back pain. The wide variation in the results questions the value of the tests in health examinations (e.g., in screening or surveillance of low back disorders).

  9. Parallel and distributed processing in two SGBDS: A case study

    Directory of Open Access Journals (Sweden)

    Francisco Javier Moreno

    2017-04-01

    Full Text Available Context: One of the strategies for managing large volumes of data is distributed and parallel computing. Among the tools that allow applying these characteristics are some Database Management Systems (DBMS), such as Oracle, DB2, and SQL Server. Method: In this paper we present a case study in which we evaluate the performance of an SQL query in two of these DBMSs. The evaluation is done through various forms of data distribution in a computer network with different degrees of parallelism. Results: The tests of the SQL query revealed the performance differences between the two DBMSs analyzed. However, more thorough testing and a wider variety of queries are needed. Conclusions: The differences in performance between the two DBMSs analyzed show that when evaluating this aspect, it is necessary to consider the particularities of each DBMS and the degree of parallelism of the queries.

  10. Parallel processing for nonlinear dynamics simulations of structures including rotating bladed-disk assemblies

    Science.gov (United States)

    Hsieh, Shang-Hsien

    1993-01-01

    The principal objective of this research is to develop, test, and implement coarse-grained, parallel-processing strategies for nonlinear dynamic simulations of practical structural problems. There are contributions to four main areas: finite element modeling and analysis of rotational dynamics, numerical algorithms for parallel nonlinear solutions, automatic partitioning techniques to effect load-balancing among processors, and an integrated parallel analysis system.

  11. Parallelization of MCNP4 code by using simple FORTRAN algorithms

    International Nuclear Information System (INIS)

    Yazid, P.I.; Takano, Makoto; Masukawa, Fumihiro; Naito, Yoshitaka.

    1993-12-01

    Simple FORTRAN algorithms, relying only on open, close, read and write statements, together with disk files and some UNIX commands, have been applied to the parallelization of MCNP4. The code, named MCNPNFS, maintains almost all capabilities of MCNP4 in solving shielding problems. It is able to perform parallel computing on a set of any UNIX workstations connected by a network, regardless of heterogeneity in the hardware, provided that all processors produce a binary file in the same format. Further, it is confirmed that MCNPNFS can also be executed on the Monte-4 vector-parallel computer. MCNPNFS has been tested intensively by executing 5 photon-neutron benchmark problems, a spent fuel cask problem and 17 sample problems included in the original code package of MCNP4. Three different workstations, connected by a network, have been used to execute MCNPNFS in parallel. By measuring CPU time, the parallel efficiency is determined to be 58% to 99%, and 86% on average. On Monte-4, MCNPNFS has been executed using 4 processors concurrently and has achieved a parallel efficiency of 79% on average. (author)

  12. An efficient parallel algorithm for the calculation of canonical MP2 energies.

    Science.gov (United States)

    Baker, Jon; Pulay, Peter

    2002-09-01

    We present the parallel version of a previous serial algorithm for the efficient calculation of canonical MP2 energies (Pulay, P.; Saebo, S.; Wolinski, K. Chem Phys Lett 2001, 344, 543). It is based on the Saebo-Almlöf direct-integral transformation, coupled with an efficient prescreening of the AO integrals. The parallel algorithm avoids synchronization delays by spawning a second set of slaves during the bin-sort prior to the second half-transformation. Results are presented for systems with up to 2000 basis functions. MP2 energies for molecules with 400-500 basis functions can be routinely calculated to microhartree accuracy on a small number of processors (6-8) in a matter of minutes with modern PC-based parallel computers. Copyright 2002 Wiley Periodicals, Inc. J Comput Chem 23: 1150-1156, 2002

  13. Prosodic Parallelism – comparing spoken and written language

    Directory of Open Access Journals (Sweden)

    Richard Wiese

    2016-10-01

    Full Text Available The Prosodic Parallelism hypothesis claims that adjacent prosodic categories prefer identical branching of internal adjacent constituents. According to Wiese and Speyer (2015), this preference implies that feet contained in the same phonological phrase display either binary or unary branching, but not different types of branching. The seemingly free schwa-zero alternations at the end of some words in German make it possible to test this hypothesis. The hypothesis was successfully tested by conducting a corpus study which used large-scale bodies of written German. As some open questions remain, and as it is unclear whether Prosodic Parallelism is valid for the spoken modality as well, the present study extends this inquiry to spoken German. As in the previous study, the results of a corpus analysis recruiting a variety of linguistic constructions are presented. The Prosodic Parallelism hypothesis can be demonstrated to be valid for spoken German as well as for written German. The paper thus contributes to the question of whether prosodic preferences are similar between the spoken and written modes of a language. Some consequences of the results for the production of language are discussed.

  14. Development of massively parallel quantum chemistry program SMASH

    Energy Technology Data Exchange (ETDEWEB)

    Ishimura, Kazuya [Department of Theoretical and Computational Molecular Science, Institute for Molecular Science 38 Nishigo-Naka, Myodaiji, Okazaki, Aichi 444-8585 (Japan)

    2015-12-31

    A massively parallel program for quantum chemistry calculations SMASH was released under the Apache License 2.0 in September 2014. The SMASH program is written in the Fortran90/95 language with MPI and OpenMP standards for parallelization. Frequently used routines, such as one- and two-electron integral calculations, are modularized to make program developments simple. The speed-up of the B3LYP energy calculation for (C{sub 150}H{sub 30}){sub 2} with the cc-pVDZ basis set (4500 basis functions) was 50,499 on 98,304 cores of the K computer.

  15. Parallelizing an electron transport Monte Carlo simulator (MOCASIN 2.0)

    International Nuclear Information System (INIS)

    Schwetman, H.; Burdick, S.

    1988-01-01

    Electron transport simulators are tools for studying electrical properties of semiconducting materials and devices. As demands for modeling more complex devices and new materials have emerged, so have demands for more processing power. This paper documents a project to convert an electron transport simulator (MOCASIN 2.0) to a parallel processing environment. In addition to describing the conversion, the paper presents PPL, a parallel programming version of C running on a Sequent multiprocessor system. In timing tests, models that simulated the movement of 2,000 particles for 100 time steps were executed on ten processors, with a parallel efficiency of over 97%

  16. Study on MPI/OpenMP hybrid parallelism for Monte Carlo neutron transport code

    International Nuclear Information System (INIS)

    Liang Jingang; Xu Qi; Wang Kan; Liu Shiwen

    2013-01-01

    Parallel programming with a mixed mode of message-passing and shared memory has several advantages when used in a Monte Carlo neutron transport code, such as fitting the hardware of distributed-shared clusters, economizing the memory demand of Monte Carlo transport, and improving parallel performance. MPI/OpenMP hybrid parallelism was implemented based on a one-dimensional Monte Carlo neutron transport code. Some critical factors affecting the parallel performance were analyzed, and solutions were proposed for several problems such as access contention, lock contention and false sharing. The optimized code was then tested. It is shown that the hybrid parallel code can reach performance as good as that of a pure MPI parallel program, while saving a large amount of memory at the same time. Therefore, hybrid parallelism is efficient for achieving large-scale parallelization of Monte Carlo neutron transport. (authors)
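
    As a rough illustration of the hybrid scheme described above, the sketch below distributes particle histories across MPI ranks and fans the work within each rank out to shared-memory threads (standing in for OpenMP). It assumes mpi4py is available; transport() and the history count are placeholders, not part of the code discussed in the record.

        # Hybrid message-passing / shared-memory sketch: MPI between ranks,
        # a thread pool inside each rank.  transport() is a dummy kernel.
        from mpi4py import MPI
        from concurrent.futures import ThreadPoolExecutor

        def transport(history_id):
            # placeholder for one neutron history; returns a dummy tally
            return (history_id % 1000) / 1000.0

        comm = MPI.COMM_WORLD
        rank, size = comm.Get_rank(), comm.Get_size()

        n_histories = 100_000
        local = range(rank, n_histories, size)            # distributed-memory split

        with ThreadPoolExecutor(max_workers=4) as pool:   # shared-memory level
            local_tally = sum(pool.map(transport, local))

        total = comm.allreduce(local_tally, op=MPI.SUM)
        if rank == 0:
            print("mean tally:", total / n_histories)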

  17. Efficient parallel implicit methods for rotary-wing aerodynamics calculations

    Science.gov (United States)

    Wissink, Andrew M.

    Euler/Navier-Stokes Computational Fluid Dynamics (CFD) methods are commonly used for prediction of the aerodynamics and aeroacoustics of modern rotary-wing aircraft. However, their widespread application to large complex problems is limited by a lack of adequate computing power. Parallel processing offers the potential for dramatic increases in computing power, but most conventional implicit solution methods are inefficient in parallel and new techniques must be adopted to realize its potential. This work proposes alternative implicit schemes for Euler/Navier-Stokes rotary-wing calculations which are robust and efficient in parallel. The first part of this work proposes an efficient parallelizable modification of the Lower-Upper Symmetric Gauss-Seidel (LU-SGS) implicit operator used in the well-known Transonic Unsteady Rotor Navier-Stokes (TURNS) code. The new hybrid LU-SGS scheme couples the point-relaxation approach of the Data Parallel-Lower Upper Relaxation (DP-LUR) algorithm for inter-processor communication with the Symmetric Gauss-Seidel algorithm of LU-SGS for on-processor computations. With the modified operator, TURNS is implemented in parallel using the Message Passing Interface (MPI) for communication. Numerical performance and parallel efficiency are evaluated on the IBM SP2 and Thinking Machines CM-5 multiprocessors for a variety of steady-state and unsteady test cases. The hybrid LU-SGS scheme maintains the numerical performance of the original LU-SGS algorithm in all cases and shows a good degree of parallel efficiency. It exhibits greater robustness than DP-LUR for third-order upwind solutions. The second part of this work examines the use of Krylov subspace iterative solvers for the nonlinear CFD solutions. The hybrid LU-SGS scheme is used as a parallelizable preconditioner. Two iterative methods are tested, Generalized Minimum Residual (GMRES) and Orthogonal s-Step Generalized Conjugate Residual (OSGCR). The Newton method demonstrates good

  18. Parallel data grabbing card based on PCI bus RS422

    International Nuclear Information System (INIS)

    Zhang Zhenghui; Shen Ji; Wei Dongshan; Chen Ziyu

    2005-01-01

    This article briefly introduces the development of a parallel data grabbing card based on RS422 and the PCI bus. It can be applied to grab 14-bit parallel data at high speed from devices with an RS422 interface. The methods of data acquisition based on the PCI protocol, the functions and usage of the chips employed, and the ideas and principles of the hardware and software design are presented. (authors)

  19. Parallel Careers and their Consequences for Companies in Brazil

    Directory of Open Access Journals (Sweden)

    Maria Candida Baumer Azevedo

    2014-04-01

    Full Text Available Given the relevance of the need to manage parallel careers to attract and retain people in organizations, this paper provides insight into this phenomenon from an organizational perspective. The parallel career concept, introduced by Alboher (2007) and recently addressed by Schuiling (2012), has previously been examined only from the perspective of the parallel career holder (PC holder). The paper provides insight from both individual and organizational perspectives on the phenomenon of parallel careers and considers how it can function as an important tool for attracting and retaining people by contributing to human development. This paper employs a qualitative approach that includes 30 semi-structured one-on-one interviews. The organizational perspective arises from the 15 interviews with human resources (HR) executives from different companies. The individual viewpoint originates from the interviews with 15 executives who are also PC holders. An inductive content analysis approach was used to examine Brazilian companies and the Brazilian offices of multinationals. Companies that are concerned about having the best talent on their teams can benefit from a deeper understanding of parallel careers, which can be used to attract, develop, and retain talent. Limitations and directions for future research are discussed.

  20. Kinematics/statics analysis of a novel serial-parallel robotic arm with hand

    Energy Technology Data Exchange (ETDEWEB)

    Lu, Yi; Dai, Zhuohong; Ye, Nijia; Wang, Peng [Yanshan University, Hebei (China)

    2015-10-15

    A robotic arm with a fingered hand generally has multiple functions for completing various complicated operations. A novel serial-parallel robotic arm with a hand is proposed and its kinematics and statics are studied systematically. A 3D prototype of the serial-parallel robotic arm with a hand is constructed and analyzed by simulation. The serial-parallel robotic arm with a hand is composed of an upper 3RPS parallel manipulator, a lower 3SPR parallel manipulator and a hand with three finger mechanisms. Its kinematics formulae for solving the displacement, velocity and acceleration are derived. Its statics formula for solving the active/constrained forces is derived. Its reachable workspace and orientation workspace are constructed and analyzed. Finally, an analytic example is given for solving the kinematics and statics of the serial-parallel robotic arm with a hand, and the analytic solutions are verified by a simulation mechanism.

  1. Kinematics/statics analysis of a novel serial-parallel robotic arm with hand

    International Nuclear Information System (INIS)

    Lu, Yi; Dai, Zhuohong; Ye, Nijia; Wang, Peng

    2015-01-01

    A robotic arm with a fingered hand generally has multiple functions for completing various complicated operations. A novel serial-parallel robotic arm with a hand is proposed and its kinematics and statics are studied systematically. A 3D prototype of the serial-parallel robotic arm with a hand is constructed and analyzed by simulation. The serial-parallel robotic arm with a hand is composed of an upper 3RPS parallel manipulator, a lower 3SPR parallel manipulator and a hand with three finger mechanisms. Its kinematics formulae for solving the displacement, velocity and acceleration are derived. Its statics formula for solving the active/constrained forces is derived. Its reachable workspace and orientation workspace are constructed and analyzed. Finally, an analytic example is given for solving the kinematics and statics of the serial-parallel robotic arm with a hand, and the analytic solutions are verified by a simulation mechanism.

  2. Continuous path control of a 5-DOF parallel-serial hybrid robot

    International Nuclear Information System (INIS)

    Uchiyama, Takuma; Terada, Hidetsugu; Mitsuya, Hironori

    2010-01-01

    To realize de-burring with the 5-degree-of-freedom parallel-serial hybrid robot, new forward and inverse kinematic calculation methods based on the 'off-line teaching' method are proposed. This hybrid robot consists of a parallel stage section and a serial stage section. Considering this point, each section is calculated individually. A continuous path control algorithm for this hybrid robot is also proposed. To verify their usefulness, a prototype robot controlled by the proposed methods is tested. This verification includes a positioning test and a pose test. The positioning test evaluates the continuous path of the tool center point. The pose test evaluates the pose at the tool center point. As a result, it is confirmed that this hybrid robot moves correctly using the proposed methods.

  3. Parallel algorithms for mapping pipelined and parallel computations

    Science.gov (United States)

    Nicol, David M.

    1988-01-01

    Many computational problems in image processing, signal processing, and scientific computing are naturally structured for either pipelined or parallel computation. When mapping such problems onto a parallel architecture it is often necessary to aggregate an obvious problem decomposition. Even in this context the general mapping problem is known to be computationally intractable, but recent advances have been made in identifying classes of problems and architectures for which optimal solutions can be found in polynomial time. Among these, the mapping of pipelined or parallel computations onto linear array, shared memory, and host-satellite systems figures prominently. This paper extends that work first by showing how to improve existing serial mapping algorithms. These improvements have significantly lower time and space complexities: in one case a published O(nm^3) time algorithm for mapping m modules onto n processors is reduced to an O(nm log m) time complexity, and its space requirements reduced from O(nm^2) to O(m). Run time complexity is further reduced with parallel mapping algorithms based on these improvements, which run on the architecture for which they create the mappings.

  4. Functional requirements for gas characterization system computer software

    International Nuclear Information System (INIS)

    Tate, D.D.

    1996-01-01

    This document provides the Functional Requirements for the Computer Software operating the Gas Characterization System (GCS), which monitors the combustible gases in the vapor space of selected tanks. Necessary computer functions are defined to support design, testing, operation, and change control. The GCS requires several individual computers to address the control and data acquisition functions of instruments and sensors. These computers are networked for communication, and must multi-task to accommodate operation in parallel.

  5. Comparison of functional ramp walk test and 6-min walk test in healthy volunteers: A new approach in functional capacity evaluation

    Directory of Open Access Journals (Sweden)

    Manivel Arumugam

    2017-01-01

    Full Text Available Objective: Inclined surfaces or ramps are common obstacles faced by the elderly and the cardiopulmonary disabled in accessing public amenities. Ramp walking is one of the most common functional demands to be met by a common man in the industrialized world. To assess functional (uphill) walking capacity, we need a different functional stress test from the routinely used 6-min walk test (6MWT). Hence, a new 3-min steep ramp walk test (3MRWT) was constructed to meet demands similar to an uphill walk and to provide more functional stress than the routinely used 6MWT. Methodology: An observational, crossover study design was adopted for this study. Fifteen healthy participants (8 males, 7 females) performed both tests in a randomized order with a washout time of 6 h between the tests. Walking distances on both the ramp and the ground, heart rate, blood pressure, saturation (SpO2), dyspnea, and fatigue on the Borg exertion scale were compared before and after the two walk tests. Results: The average distances covered in the 6MWT were 510.5 ± 55.06 and 440.65 ± 25.08 meters and in the 3MRWT were 270.18 ± 30.8 and 230.05 ± 15.06 meters for males and females, respectively. The difference between the 3MRWT and 6MWT distances covered by the participants was statistically significant (t = 0.893). The mean differences in heart rate, saturation and perceptions were highly significant (P < 0.001). Conclusion: The study results show that the 3MRWT is valid compared with the routinely administered 6MWT and may provide greater functional stress (uphill or ramp walk capacity) in a shorter duration in healthy individuals when assessing maximal functional capacity in ramp or uphill walking.

  6. Parallel computing for homogeneous diffusion and transport equations in neutronics; Calcul parallele pour les equations de diffusion et de transport homogenes en neutronique

    Energy Technology Data Exchange (ETDEWEB)

    Pinchedez, K

    1999-06-01

    Parallel computing meets the ever-increasing requirements for neutronic computer code speed and accuracy. In this work, two different approaches have been considered. We first parallelized the sequential algorithm used by the neutronics code CRONOS developed at the French Atomic Energy Commission. The algorithm computes the dominant eigenvalue associated with PN simplified transport equations by a mixed finite element method. Several parallel algorithms have been developed on distributed memory machines. The performances of the parallel algorithms have been studied experimentally by implementation on a Cray T3D and theoretically by complexity models. A comparison of various parallel algorithms has confirmed the chosen implementations. We next applied a domain sub-division technique to the two-group diffusion eigenproblem. In the modal synthesis-based method, the global spectrum is determined from the partial spectra associated with sub-domains. Then the eigenproblem is expanded on a family composed, on the one hand, of eigenfunctions associated with the sub-domains and, on the other hand, of functions corresponding to the contribution from the interface between the sub-domains. For a 2-D homogeneous core, this modal method has been validated and its accuracy has been measured. (author)

  7. Parallel computing works

    Energy Technology Data Exchange (ETDEWEB)

    1991-10-23

    An account of the Caltech Concurrent Computation Program (C{sup 3}P), a five-year project that focused on answering the question: Can parallel computers be used to do large-scale scientific computations? As the title indicates, the question is answered in the affirmative, by implementing numerous scientific applications on real parallel computers and doing computations that produced new scientific results. In the process of doing so, C{sup 3}P helped design and build several new computers, designed and implemented basic system software, developed algorithms for frequently used mathematical computations on massively parallel machines, devised performance models and measured the performance of many computers, and created a high performance computing facility based exclusively on parallel computers. While the initial focus of C{sup 3}P was the hypercube architecture developed by C. Seitz, many of the methods developed and lessons learned have been applied successfully on other massively parallel architectures.

  8. Block-Parallel Data Analysis with DIY2

    Energy Technology Data Exchange (ETDEWEB)

    Morozov, Dmitriy [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Peterka, Tom [Argonne National Lab. (ANL), Argonne, IL (United States)

    2017-08-30

    DIY2 is a programming model and runtime for block-parallel analytics on distributed-memory machines. Its main abstraction is block-structured data parallelism: data are decomposed into blocks; blocks are assigned to processing elements (processes or threads); computation is described as iterations over these blocks, and communication between blocks is defined by reusable patterns. By expressing computation in this general form, the DIY2 runtime is free to optimize the movement of blocks between slow and fast memories (disk and flash vs. DRAM) and to concurrently execute blocks residing in memory with multiple threads. This enables the same program to execute in-core, out-of-core, serial, parallel, single-threaded, multithreaded, or combinations thereof. This paper describes the implementation of the main features of the DIY2 programming model and optimizations to improve performance. DIY2 is evaluated on benchmark test cases to establish baseline performance for several common patterns and on larger complete analysis codes running on large-scale HPC machines.

  9. 15N liver function tests - concept, validity, clinical use

    International Nuclear Information System (INIS)

    Faust, H.; Jung, K.; Krumbiegel, P.; Hirschberg, K.; Reinhardt, R.; Junghans, P.

    1987-01-01

    Several liver function tests using the oral application of a nitrogen compound labelled with 15N and the subsequent determination of 15N in a certain fraction of urine by emission spectrometry are described. Because of the key position of the liver in the metabolism of nitrogen compounds, the results of these tests allow conclusions concerning disturbances of special liver functions. Instructions for the clinical use of the '[15N]Ammonium Test', the '[15N]Hippurate Test', the '[15N]Methacetin Test', and the '[15N]Glycine Test' are given. (author)

  10. Multilevel Parallelization of AutoDock 4.2

    Directory of Open Access Journals (Sweden)

    Norgan Andrew P

    2011-04-01

    Full Text Available Abstract Background Virtual (computational) screening is an increasingly important tool for drug discovery. AutoDock is a popular open-source application for performing molecular docking, the prediction of ligand-receptor interactions. AutoDock is a serial application, though several previous efforts have parallelized various aspects of the program. In this paper, we report on a multi-level parallelization of AutoDock 4.2 (mpAD4). Results Using MPI and OpenMP, AutoDock 4.2 was parallelized for use on MPI-enabled systems and to multithread the execution of individual docking jobs. In addition, code was implemented to reduce input/output (I/O) traffic by reusing grid maps at each node from docking to docking. Performance of mpAD4 was examined on two multiprocessor computers. Conclusions Using MPI with OpenMP multithreading, mpAD4 scales with near linearity on the multiprocessor systems tested. In situations where I/O is limiting, reuse of grid maps reduces both system I/O and overall screening time. Multithreading of AutoDock's Lamarckian Genetic Algorithm with OpenMP increases the speed of execution of individual docking jobs, and when combined with MPI parallelization can significantly reduce the execution time of virtual screens. This work is significant in that mpAD4 speeds the execution of certain molecular docking workloads and allows the user to optimize the degree of system-level (MPI) and node-level (OpenMP) parallelization to best fit both workloads and computational resources.

  11. Multilevel Parallelization of AutoDock 4.2.

    Science.gov (United States)

    Norgan, Andrew P; Coffman, Paul K; Kocher, Jean-Pierre A; Katzmann, David J; Sosa, Carlos P

    2011-04-28

    Virtual (computational) screening is an increasingly important tool for drug discovery. AutoDock is a popular open-source application for performing molecular docking, the prediction of ligand-receptor interactions. AutoDock is a serial application, though several previous efforts have parallelized various aspects of the program. In this paper, we report on a multi-level parallelization of AutoDock 4.2 (mpAD4). Using MPI and OpenMP, AutoDock 4.2 was parallelized for use on MPI-enabled systems and to multithread the execution of individual docking jobs. In addition, code was implemented to reduce input/output (I/O) traffic by reusing grid maps at each node from docking to docking. Performance of mpAD4 was examined on two multiprocessor computers. Using MPI with OpenMP multithreading, mpAD4 scales with near linearity on the multiprocessor systems tested. In situations where I/O is limiting, reuse of grid maps reduces both system I/O and overall screening time. Multithreading of AutoDock's Lamarckian Genetic Algorithm with OpenMP increases the speed of execution of individual docking jobs, and when combined with MPI parallelization can significantly reduce the execution time of virtual screens. This work is significant in that mpAD4 speeds the execution of certain molecular docking workloads and allows the user to optimize the degree of system-level (MPI) and node-level (OpenMP) parallelization to best fit both workloads and computational resources.
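
    A toy sketch of the grid-map reuse idea follows: each MPI rank caches the maps it has already loaded, so consecutive docking jobs against the same receptor skip the expensive read. All names (load_grid_maps, dock, the job list) are hypothetical and only mimic the structure described above; they are not the actual mpAD4 interfaces.

        # Grid-map reuse across the docking jobs assigned to one MPI rank.
        from mpi4py import MPI

        def load_grid_maps(receptor):
            print(f"loading grid maps for {receptor}")   # costly I/O in reality
            return {"receptor": receptor}

        def dock(ligand, maps):
            return f"{ligand} docked against {maps['receptor']}"

        comm = MPI.COMM_WORLD
        rank, size = comm.Get_rank(), comm.Get_size()

        jobs = [("receptorA", f"lig{i:03d}") for i in range(32)]   # made-up screen
        cache = {}
        for receptor, ligand in jobs[rank::size]:     # simple round-robin split
            if receptor not in cache:                 # reuse maps between dockings
                cache[receptor] = load_grid_maps(receptor)
            result = dock(ligand, cache[receptor])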

  12. A Self Consistent Multiprocessor Space Charge Algorithm that is Almost Embarrassingly Parallel

    International Nuclear Information System (INIS)

    Nissen, Edward; Erdelyi, B.; Manikonda, S.L.

    2012-01-01

    We present a space charge code that is self-consistent, massively parallelizable, and requires very little communication between computer nodes, making the calculation almost embarrassingly parallel. This method is implemented in the code COSY Infinity, where the differential algebras used in this code are important to the algorithm's proper functioning. The method works by calculating the self-consistent space charge distribution using the statistical moments of the test particles and converting them into polynomial series coefficients. These coefficients are combined with differential algebraic integrals to form the potential and electric fields. The result is a map which contains the effects of space charge. This method allows for massive parallelization since its statistics-based solver does not require any binning of particles; only a vector containing the partial sums of the statistical moments needs to be passed between the nodes. All other calculations are done independently. The resulting maps can be used to analyze the system using normal form analysis, as well as to advance particles in numbers and at speeds that were previously impossible.
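
    The communication pattern described above (passing only partial sums of statistical moments) can be pictured with the short sketch below: each rank reduces its local test particles to a handful of scalars and a global reduction combines them, with no binning or particle exchange. mpi4py is assumed; the one-dimensional particle data are synthetic.

        # Each rank contributes only partial sums of moments; the reduction is
        # the only communication, everything else is computed independently.
        from mpi4py import MPI
        import random

        comm = MPI.COMM_WORLD
        rank = comm.Get_rank()

        random.seed(rank)
        x = [random.gauss(0.0, 1.0) for _ in range(10_000)]   # local particles

        n  = comm.allreduce(len(x), op=MPI.SUM)
        s1 = comm.allreduce(sum(x), op=MPI.SUM)
        s2 = comm.allreduce(sum(xi * xi for xi in x), op=MPI.SUM)

        mean = s1 / n
        variance = s2 / n - mean * mean
        if rank == 0:
            print(f"global mean {mean:+.4f}, variance {variance:.4f}")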

  13. Massive parallel electromagnetic field simulation program JEMS-FDTD design and implementation on jasmin

    International Nuclear Information System (INIS)

    Li Hanyu; Zhou Haijing; Dong Zhiwei; Liao Cheng; Chang Lei; Cao Xiaolin; Xiao Li

    2010-01-01

    A large-scale parallel electromagnetic field simulation program, JEMS-FDTD (J Electromagnetic Solver-Finite Difference Time Domain), is designed and implemented on JASMIN (J parallel Adaptive Structured Mesh applications INfrastructure). This program can simulate the propagation, radiation and coupling of electromagnetic fields by solving Maxwell's equations on a structured mesh explicitly with the FDTD method. JEMS-FDTD is able to simulate billion-mesh-scale problems on thousands of processors. In this article, the program is verified by simulating the radiation of an electric dipole. A beam waveguide is simulated to demonstrate the capability of large-scale parallel computation. A parallel performance test indicates that a high parallel efficiency is obtained. (authors)

  14. Parallel iterative solvers and preconditioners using approximate hierarchical methods

    Energy Technology Data Exchange (ETDEWEB)

    Grama, A.; Kumar, V.; Sameh, A. [Univ. of Minnesota, Minneapolis, MN (United States)

    1996-12-31

    In this paper, we report results on the performance, convergence, and accuracy of a parallel GMRES solver for Boundary Element Methods. The solver uses a hierarchical approximate matrix-vector product based on a hybrid Barnes-Hut / Fast Multipole Method. We study the impact of various accuracy parameters on the convergence and show that with minimal loss in accuracy, our solver yields significant speedups. We demonstrate the excellent parallel efficiency and scalability of our solver. The combined speedups from approximation and parallelism represent an improvement of several orders of magnitude in solution time. We also develop fast and parallelizable preconditioners for this problem. We report on the performance of an inner-outer scheme and a preconditioner based on a truncated Green's function. Experimental results on a 256 processor Cray T3D are presented.

  15. Optimal conductive constructal configurations with “parallel design”

    International Nuclear Information System (INIS)

    Eslami, M.

    2016-01-01

    Highlights: • A new parallel design is proposed for conductive cooling of heat generating rectangles. • The geometric features are optimized analytically. • The internal structure morphs as a function of available conductive material. • Thermal performance is superior to previously numerically optimized designs. - Abstract: Today, conductive volume-to-point cooling of heat generating bodies is under investigation as an alternative method for thermal management of electronic chipsets with high power density. In this paper, a new simple geometry called the "parallel design" is proposed for effective conductive cooling of rectangular heat generating bodies. This configuration tries to minimize the thermal resistance associated with the temperature drop inside the heat generating volume. The geometric features of the design are all optimized analytically and expressed with simple explicit equations. It is proved that the optimal number of parallel links is equal to the thermal conductivity ratio multiplied by the porosity (or the volume ratio). With the universal aspect ratio of H/L = 2, the total thermal resistance of the present parallel design is lower than that of the recently proposed networks of various shapes that were optimized with the help of numerical simulations, especially when more conducting material is available.

  16. Locating hardware faults in a data communications network of a parallel computer

    Science.gov (United States)

    Archer, Charles J.; Megerian, Mark G.; Ratterman, Joseph D.; Smith, Brian E.

    2010-01-12

    Hardware fault location in a data communications network of a parallel computer. Such a parallel computer includes a plurality of compute nodes and a data communications network that couples the compute nodes for data communications and organizes the compute nodes as a tree. Locating hardware faults includes identifying a next compute node as a parent node and a root of a parent test tree, identifying for each child compute node of the parent node a child test tree having the child compute node as root, running the same test suite on the parent test tree and on each child test tree, and identifying the parent compute node as having a defective link connected from the parent compute node to a child compute node if the test suite fails on the parent test tree and succeeds on all the child test trees.
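
    The isolation rule in this record (the defective link is attached to the parent when the suite fails on the parent test tree but passes on every child test tree) can be captured in a few lines. The sketch below is a toy model; run_suite() simply walks a tree over a table of link states and is not the patented diagnostic itself.

        # Toy model of the parent/child test-tree rule for locating a bad link.
        def run_suite(tree, links_ok):
            """Return True when every link reachable from this subtree works."""
            node, children = tree
            return all(links_ok.get((node, child[0]), True) and
                       run_suite(child, links_ok)
                       for child in children)

        def parent_has_bad_link(parent_tree, links_ok):
            _, children = parent_tree
            parent_fails = not run_suite(parent_tree, links_ok)
            children_pass = all(run_suite(c, links_ok) for c in children)
            return parent_fails and children_pass

        # node 0 is the parent of nodes 1 and 2; the link (0, 2) is defective
        tree = (0, [(1, []), (2, [])])
        print(parent_has_bad_link(tree, links_ok={(0, 2): False}))   # -> True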

  17. Platelet function testing: methods of assessment and clinical utility.

    LENUS (Irish Health Repository)

    Mylotte, Darren

    2012-02-01

    Platelets play a central role in the regulation of both thrombosis and haemostasis, yet tests of platelet function have, until recently, been exclusively used in the diagnosis and management of bleeding disorders. Recent advances have demonstrated the clinical utility of platelet function testing in patients with cardiovascular disease. The ex vivo measurement of response to antiplatelet therapies (aspirin and clopidogrel), by an ever-increasing array of platelet function tests, is, with some assays, predictive of adverse clinical events and thus represents an emerging area of interest for both the clinician and basic scientist. This review article will describe the advantages and disadvantages of the currently available methods of measuring platelet function and discuss both the limitations and emerging data supporting the role of platelet function studies in clinical practice.

  18. Platelet function testing: methods of assessment and clinical utility.

    LENUS (Irish Health Repository)

    Mylotte, Darren

    2011-01-01

    Platelets play a central role in the regulation of both thrombosis and haemostasis, yet tests of platelet function have, until recently, been exclusively used in the diagnosis and management of bleeding disorders. Recent advances have demonstrated the clinical utility of platelet function testing in patients with cardiovascular disease. The ex vivo measurement of response to antiplatelet therapies (aspirin and clopidogrel), by an ever-increasing array of platelet function tests, is, with some assays, predictive of adverse clinical events and thus represents an emerging area of interest for both the clinician and basic scientist. This review article will describe the advantages and disadvantages of the currently available methods of measuring platelet function and discuss both the limitations and emerging data supporting the role of platelet function studies in clinical practice.

  19. Template based parallel checkpointing in a massively parallel computer system

    Science.gov (United States)

    Archer, Charles Jens [Rochester, MN; Inglett, Todd Alan [Rochester, MN

    2009-01-13

    A method and apparatus for a template based parallel checkpoint save for a massively parallel super computer system using a parallel variation of the rsync protocol, and network broadcast. In preferred embodiments, the checkpoint data for each node is compared to a template checkpoint file that resides in the storage and that was previously produced. Embodiments herein greatly decrease the amount of data that must be transmitted and stored for faster checkpointing and increased efficiency of the computer system. Embodiments are directed to a parallel computer system with nodes arranged in a cluster with a high speed interconnect that can perform broadcast communication. The checkpoint contains a set of actual small data blocks with their corresponding checksums from all nodes in the system. The data blocks may be compressed using conventional non-lossy data compression algorithms to further reduce the overall checkpoint size.
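
    The rsync-like delta idea described above can be sketched as follows: the checkpoint image is split into fixed-size blocks, each block is checksummed, and only blocks whose checksum differs from the stored template are written. Block size and the in-memory buffers are illustrative; this is not the patented implementation.

        # Write only the checkpoint blocks that differ from a template.
        import hashlib

        BLOCK = 4096

        def checksums(data):
            return [hashlib.sha1(data[i:i + BLOCK]).hexdigest()
                    for i in range(0, len(data), BLOCK)]

        def delta_blocks(node_data, template_sums):
            """Return (index, block) pairs that differ from the template."""
            changed = []
            for i, c in enumerate(checksums(node_data)):
                if i >= len(template_sums) or c != template_sums[i]:
                    changed.append((i, node_data[i * BLOCK:(i + 1) * BLOCK]))
            return changed

        template = b"\x00" * (8 * BLOCK)
        current = bytearray(template)
        current[5 * BLOCK + 100] = 0xFF                  # one block changed

        print([i for i, _ in delta_blocks(bytes(current), checksums(template))])
        # -> [5]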

  20. HVI Ballistic Performance Characterization of Non-Parallel Walls

    Science.gov (United States)

    Bohl, William; Miller, Joshua; Christiansen, Eric

    2012-01-01

    The Double-Wall, "Whipple" Shield [1] has been the subject of many hypervelocity impact studies and has proven to be an effective shield system for Micro-Meteoroid and Orbital Debris (MMOD) impacts for spacecraft. The US modules of the International Space Station (ISS), with their "bumper shields" offset from their pressure holding rear walls provide good examples of effective on-orbit use of the double wall shield. The concentric cylinder shield configuration with its large radius of curvature relative to separation distance is easily and effectively represented for testing and analysis as a system of two parallel plates. The parallel plate double wall configuration has been heavily tested and characterized for shield performance for normal and oblique impacts for the ISS and other programs. The double wall shield and principally similar Stuffed Whipple Shield are very common shield types for MMOD protection. However, in some locations with many spacecraft designs, the rear wall cannot be modeled as being parallel or concentric with the outer bumper wall. As represented in Figure 1, there is an included angle between the two walls. And, with a cylindrical outer wall, the effective included angle constantly changes. This complicates assessment of critical spacecraft components located within outer spacecraft walls when using software tools such as NASA's BumperII. In addition, the validity of the risk assessment comes into question when using the standard double wall shield equations, especially since verification testing of every set of double wall included angles is impossible.

  1. Sensitivity Analysis of the Proximal-Based Parallel Decomposition Methods

    Directory of Open Access Journals (Sweden)

    Feng Ma

    2014-01-01

    Full Text Available The proximal-based parallel decomposition methods were recently proposed to solve structured convex optimization problems. These algorithms are eligible for parallel computation and can be used efficiently for solving large-scale separable problems. In this paper, compared with the previous theoretical results, we show that the range of the involved parameters can be enlarged while convergence can still be established. Preliminary numerical tests on the stable principal component pursuit problem testify to the advantages of the enlargement.

  2. A Note on Using Partitioning Techniques for Solving Unconstrained Optimization Problems on Parallel Systems

    Directory of Open Access Journals (Sweden)

    Mehiddin Al-Baali

    2015-12-01

    Full Text Available We deal with the design of parallel algorithms by using variable partitioning techniques to solve nonlinear optimization problems. We propose an iterative solution method that is very efficient for separable functions, our scope being to discuss its performance for general functions. Experimental results on an illustrative example have suggested some useful modifications that, even though they improve the efficiency of our parallel method, leave some questions open for further investigation.

  3. A Sparse Self-Consistent Field Algorithm and Its Parallel Implementation: Application to Density-Functional-Based Tight Binding.

    Science.gov (United States)

    Scemama, Anthony; Renon, Nicolas; Rapacioli, Mathias

    2014-06-10

    We present an algorithm and its parallel implementation for solving a self-consistent problem as encountered in Hartree-Fock or density functional theory. The algorithm takes advantage of the sparsity of matrices through the use of local molecular orbitals. The implementation allows one to exploit efficiently modern symmetric multiprocessing (SMP) computer architectures. As a first application, the algorithm is used within the density-functional-based tight binding method, for which most of the computational time is spent in the linear algebra routines (diagonalization of the Fock/Kohn-Sham matrix). We show that with this algorithm (i) single point calculations on very large systems (millions of atoms) can be performed on large SMP machines, (ii) calculations involving intermediate size systems (1000-100 000 atoms) are also strongly accelerated and can run efficiently on standard servers, and (iii) the error on the total energy due to the use of a cutoff in the molecular orbital coefficients can be controlled such that it remains smaller than the SCF convergence criterion.

  4. Performance studies of the parallel VIM code

    International Nuclear Information System (INIS)

    Shi, B.; Blomquist, R.N.

    1996-01-01

    In this paper, the authors evaluate the performance of the parallel version of the VIM Monte Carlo code on the IBM SPx at the High Performance Computing Research Facility at ANL. Three test problems with contrasting computational characteristics were used to assess effects on performance. A statistical method for estimating the inefficiencies due to load imbalance and communication is also introduced. VIM is a large-scale continuous-energy Monte Carlo radiation transport program and was parallelized using history partitioning, the master/worker approach, and the p4 message passing library. Dynamic load balancing is accomplished when the master processor assigns chunks of histories to workers that have completed a previously assigned task, accommodating variations in the lengths of histories, processor speeds, and worker loads. At the end of each batch (generation), the fission sites and tallies are sent from each worker to the master process, contributing to the parallel inefficiency. All communications are between master and workers, and are serial. The SPx is a scalable 128-node parallel supercomputer with high-performance Omega switches of 63 microsec latency and 35 MBytes/sec bandwidth. For uniform and reproducible performance, the authors used only the 120 identical regular processors (IBM RS/6000) and excluded the remaining eight planet nodes, which may be loaded by others' jobs
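
    The master/worker history partitioning described above can be outlined as in the sketch below: the master hands out chunks of histories on demand, so faster workers simply request more work. mpi4py is assumed; simulate_chunk() is a stand-in for the transport kernel, and tally collection is omitted.

        # Dynamic assignment of history chunks from a master rank to workers.
        from mpi4py import MPI

        CHUNK, N_HIST, WORK, STOP = 1000, 50_000, 1, 2

        def simulate_chunk(start, count):
            return float(count)      # stand-in for transporting `count` histories

        comm = MPI.COMM_WORLD
        rank, size = comm.Get_rank(), comm.Get_size()

        if rank == 0:                                    # master
            status, next_start, stopped = MPI.Status(), 0, 0
            while stopped < size - 1:
                comm.recv(source=MPI.ANY_SOURCE, tag=MPI.ANY_TAG, status=status)
                worker = status.Get_source()
                if next_start < N_HIST:
                    comm.send((next_start, CHUNK), dest=worker, tag=WORK)
                    next_start += CHUNK
                else:
                    comm.send(None, dest=worker, tag=STOP)
                    stopped += 1
        else:                                            # worker
            status = MPI.Status()
            while True:
                comm.send(None, dest=0, tag=WORK)        # request work
                task = comm.recv(source=0, tag=MPI.ANY_TAG, status=status)
                if status.Get_tag() == STOP:
                    break
                simulate_chunk(*task)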

  5. PLAST: parallel local alignment search tool for database comparison

    Directory of Open Access Journals (Sweden)

    Lavenier Dominique

    2009-10-01

    Full Text Available Abstract Background Sequence similarity searching is an important and challenging task in molecular biology and next-generation sequencing should further strengthen the need for faster algorithms to process such vast amounts of data. At the same time, the internal architecture of current microprocessors is tending towards more parallelism, leading to the use of chips with two, four and more cores integrated on the same die. The main purpose of this work was to design an effective algorithm to fit with the parallel capabilities of modern microprocessors. Results A parallel algorithm for comparing large genomic banks and targeting middle-range computers has been developed and implemented in PLAST software. The algorithm exploits two key parallel features of existing and future microprocessors: the SIMD programming model (SSE instruction set and the multithreading concept (multicore. Compared to multithreaded BLAST software, tests performed on an 8-processor server have shown speedup ranging from 3 to 6 with a similar level of accuracy. Conclusion A parallel algorithmic approach driven by the knowledge of the internal microprocessor architecture allows significant speedup to be obtained while preserving standard sensitivity for similarity search problems.

  6. Liver function tests using the stable isotope 15N

    International Nuclear Information System (INIS)

    Faust, H.; Jung, K.; Hirschberg, K.; Krumbiegel, P.; Junghans, P.; Reinhardt, R.; Teichmann, B.

    1988-01-01

    Several liver function tests using oral application of a nitrogen compound labelled with 15 N and the subsequent determination of 15 N in a certain fraction of urine or in the total urine by emission spectrometry are described. Because of the key function of the liver in the metabolism of nitrogen compounds, the results of these tests allow conclusions concerning some disturbances of liver functions. (author)

  7. Frontiers of massively parallel scientific computation

    International Nuclear Information System (INIS)

    Fischer, J.R.

    1987-07-01

    Practical applications using massively parallel computer hardware first appeared during the 1980s. Their development was motivated by the need for computing power orders of magnitude beyond that available today for tasks such as numerical simulation of complex physical and biological processes, generation of interactive visual displays, satellite image analysis, and knowledge based systems. Representative of the first generation of this new class of computers is the Massively Parallel Processor (MPP). A team of scientists was provided the opportunity to test and implement their algorithms on the MPP. The first results are presented. The research spans a broad variety of applications including Earth sciences, physics, signal and image processing, computer science, and graphics. The performance of the MPP was very good. Results obtained using the Connection Machine and the Distributed Array Processor (DAP) are presented

  8. Acceleration and parallelization calculation of EFEN-SP_3 method

    International Nuclear Information System (INIS)

    Yang Wen; Zheng Youqi; Wu Hongchun; Cao Liangzhi; Li Yunzhao

    2013-01-01

    Due to the fact that the exponential function expansion nodal-SP_3 (EFEN-SP_3) method needs further improvement in computational efficiency to routinely carry out PWR whole core pin-by-pin calculation, the coarse mesh acceleration and spatial parallelization were investigated in this paper. The coarse mesh acceleration was built by considering discontinuity factor on each coarse mesh interface and preserving neutron balance within each coarse mesh in space, angle and energy. The spatial parallelization based on MPI was implemented by guaranteeing load balancing and minimizing communications cost to fully take advantage of modern computing and storage capabilities. Numerical results based on a commercial nuclear power reactor demonstrate a speedup ratio of about 40 for the coarse mesh acceleration and a parallel efficiency of higher than 60% with 40 CPUs for the spatial parallelization. With these two improvements, the EFEN code can complete a PWR whole core pin-by-pin calculation with 289 × 289 × 218 meshes and 4 energy groups within 100 s by using 48 CPUs (2.40 GHz frequency). (authors)
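
    For reference, the parallel efficiency quoted above is the usual ratio of speedup to processor count; under that standard definition (not a formula taken from the paper), the reported figures imply

        E_p = \frac{S_p}{p} = \frac{T_1}{p\,T_p},
        \qquad E_{40} > 60\% \;\Rightarrow\; S_{40} = 40\,E_{40} > 24 .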

  9. Parallel computing techniques for rotorcraft aerodynamics

    Science.gov (United States)

    Ekici, Kivanc

    The modification of unsteady three-dimensional Navier-Stokes codes for application on massively parallel and distributed computing environments is investigated. The Euler/Navier-Stokes code TURNS (Transonic Unsteady Rotor Navier-Stokes) was chosen as a test bed because of its wide use by universities and industry. For the efficient implementation of TURNS on parallel computing systems, two algorithmic changes are developed. First, modifications to the implicit operator, Lower-Upper Symmetric Gauss-Seidel (LU-SGS), originally used in TURNS, are performed. Second, an inexact Newton method, coupled with a Krylov subspace iterative method (Newton-Krylov method), is applied. Both techniques have been tried previously for the Euler equations mode of the code. In this work, we have extended the methods to the Navier-Stokes mode. Several new implicit operators were tried because of convergence problems of traditional operators with the high cell aspect ratio (CAR) grids needed for viscous calculations on structured grids. Promising results for both Euler and Navier-Stokes cases are presented for these operators. For the efficient implementation of Newton-Krylov methods in the Navier-Stokes mode of TURNS, efficient preconditioners must be used. The parallel implicit operators used in the previous step are employed as preconditioners and the results are compared. The Message Passing Interface (MPI) protocol has been used because of its portability to various parallel architectures. It should be noted that the proposed methodology is general and can be applied to several other CFD codes (e.g. OVERFLOW).
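
    A toy illustration of the inexact Newton-Krylov idea (not the TURNS code) using SciPy's matrix-free solver; the 1-D reaction-diffusion residual and the diagonal preconditioner below are hypothetical stand-ins for the flow residual and the parallel implicit operators mentioned above:

        import numpy as np
        from scipy.optimize import newton_krylov
        from scipy.sparse.linalg import LinearOperator

        n = 100
        h = 1.0 / (n + 1)

        def residual(u):
            # Toy residual F(u) = u'' + exp(u) with u = 0 at both ends (Bratu-type problem).
            upad = np.concatenate(([0.0], u, [0.0]))
            return (upad[:-2] - 2.0 * upad[1:-1] + upad[2:]) / h**2 + np.exp(u)

        # Hypothetical preconditioner: inverse of the dominant Jacobian diagonal (-2/h^2).
        M = LinearOperator((n, n), matvec=lambda v: v * h**2 / (-2.0))

        # Inexact Newton outer iteration, LGMRES inner Krylov iteration, preconditioned by M.
        u = newton_krylov(residual, np.zeros(n), method='lgmres', inner_M=M, f_tol=1e-6)
        print("max residual:", np.abs(residual(u)).max())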

  10. Multibus-based parallel processor for simulation

    Science.gov (United States)

    Ogrady, E. P.; Wang, C.-H.

    1983-01-01

    A Multibus-based parallel processor simulation system is described. The system is intended to serve as a vehicle for gaining hands-on experience, testing system and application software, and evaluating parallel processor performance during development of a larger system based on the horizontal/vertical-bus interprocessor communication mechanism. The prototype system consists of up to seven Intel iSBC 86/12A single-board computers which serve as processing elements, a multiple transmission controller (MTC) designed to support system operation, and an Intel Model 225 Microcomputer Development System which serves as the user interface and input/output processor. All components are interconnected by a Multibus/IEEE 796 bus. An important characteristic of the system is that it provides a mechanism for a processing element to broadcast data to other selected processing elements. This parallel transfer capability is provided through the design of the MTC and a minor modification to the iSBC 86/12A board. The operation of the MTC, the basic hardware-level operation of the system, and pertinent details about the iSBC 86/12A and the Multibus are described.

  11. Functional Testing of Wireless Sensor Node Designs

    DEFF Research Database (Denmark)

    Virk, Kashif M.; Madsen, Jan

    2007-01-01

    Wireless sensor networks are networked embedded computer systems with stringent power, performance, cost and form-factor requirements along with numerous other constraints related to their pervasiveness and ubiquitousness. Therefore, only a systematic design methodology coupled with an efficient...... test approach can enable their conformance to design and deployment specifications. We discuss off-line, hierarchical, functional testing of complete wireless sensor nodes containing configurable logic through a combination of FPGA-based board test and Software-Based Self-Test (SBST) techniques...

  12. Using Coarrays to Parallelize Legacy Fortran Applications: Strategy and Case Study

    Directory of Open Access Journals (Sweden)

    Hari Radhakrishnan

    2015-01-01

    Full Text Available This paper summarizes a strategy for parallelizing a legacy Fortran 77 program using the object-oriented (OO) and coarray features that entered Fortran in the 2003 and 2008 standards, respectively. OO programming (OOP) facilitates the construction of an extensible suite of model-verification and performance tests that drive the development. Coarray parallel programming facilitates a rapid evolution from a serial application to a parallel application capable of running on multicore processors and many-core accelerators in shared and distributed memory. We delineate 17 code modernization steps used to refactor and parallelize the program and study the resulting performance. Our initial studies were done using the Intel Fortran compiler on a 32-core shared memory server. Scaling behavior was very poor, and profile analysis using TAU showed that the bottleneck in the performance was due to our implementation of a collective, sequential summation procedure. We were able to improve the scalability and achieve nearly linear speedup by replacing the sequential summation with a parallel, binary tree algorithm. We also tested the Cray compiler, which provides its own collective summation procedure. Intel provides no collective reductions. With Cray, the program shows linear speedup even in distributed-memory execution. We anticipate similar results with other compilers once they support the new collective procedures proposed for Fortran 2015.
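
    The binary-tree summation that replaced the sequential reduction can be sketched as follows (an mpi4py stand-in for the paper's Fortran coarray code): each rank holds a partial sum and the tree combines them in about log2(P) steps instead of P-1 sequential additions.

        # Run with e.g.: mpiexec -n 8 python tree_sum.py
        from mpi4py import MPI

        comm = MPI.COMM_WORLD
        rank, size = comm.Get_rank(), comm.Get_size()

        local = float(rank + 1)      # each rank ("image") contributes one partial sum

        # Binary-tree reduction to rank 0.
        step = 1
        while step < size:
            if rank % (2 * step) == 0:
                partner = rank + step
                if partner < size:
                    local += comm.recv(source=partner, tag=step)
            else:
                comm.send(local, dest=rank - step, tag=step)
                break
            step *= 2

        if rank == 0:
            print("tree sum =", local, "expected =", size * (size + 1) / 2)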

  13. Correlation between HRCT and pulmonary functional tests in cystic fibrosis

    International Nuclear Information System (INIS)

    Mastellari, Paola; Biggi, Simona; Lombardi, Alfonsa; Zompatori, Maurizio; Grzincich, Gianluigi; Pisi, Giovanna; Spaggiari, Cinzia

    2005-01-01

    Purpose. To compare the HRCT score by Oikonottlou and air trapping on expiratory scans with pulmonary function tests, and to evaluate which radiological criteria are more useful for predicting clinical impairment. Materials and methods. From January to September 2003, pulmonary HRCT was performed in 37 patients (23 males), aged between 7 and 41 years, with cystic fibrosis. On the same day as the CT examination they also underwent a complete functional evaluation. HRCT studies were evaluated by three radiologists blinded to the clinical data and were correlated with the lung function tests. Results. We obtained a high correlation (p=0.01) between two of the HRCT signs (extent of mucus plugging and mosaic perfusion pattern) and all function tests. Discussion. Previous studies have demonstrated good correlation between lung function tests, in particular FEV1, and HRCT signs. Our study differed from previous ones in that we analysed the correlation between lung function tests and both single and combined CT criteria. Conclusion. Our results suggest that a simplified HRCT score could be useful for evaluating patients with cystic fibrosis.

  14. A theoretical response of the electrostatic parallel plate to constant and low-frequency accelerations

    International Nuclear Information System (INIS)

    Lee, Ki Bang

    2009-01-01

    A theoretical response of an electrostatic gap-closing actuator based on parallel plates to constant and low-frequency accelerations has been derived as a function of the applied acceleration and voltage. The nonlinear equation of motion is obtained in a dimensionless form from the fact that the inertial and damping forces are neglected at a frequency much less than the resonant frequency of the parallel plate, and thereafter the nonlinear equation is solved for the stable inter-plate gap at the acceleration and voltage. From the derived solution, the pull-in acceleration is obtained as a function of the applied voltage, and the pull-in voltage is also expressed as a function of the acceleration. The closed-form solution is validated by comparison with a numerical solution. The theoretical solution is in excellent agreement with the numerical results when the actuator is exposed to a constant acceleration as well as a low-frequency acceleration. The theoretical solution and pull-in acceleration and voltage thus provide guidance to prescribe operational constraints for devices that use the parallel plate actuator and to predict the response of the electrostatic gap-closing parallel plates to constant and low-frequency acceleration
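
    For orientation, the standard lumped-parameter relations for a gap-closing parallel plate (the textbook quasi-static balance and pull-in voltage, not necessarily the paper's exact closed-form solution) read, assuming suspension stiffness k, proof mass m, initial gap g_0, plate area A and applied acceleration a:

        k\,x \;=\; m\,a \;+\; \frac{\varepsilon_0 A V^2}{2\,(g_0 - x)^2},
        \qquad
        V_{\mathrm{PI}}\big|_{a=0} \;=\; \sqrt{\frac{8\,k\,g_0^{3}}{27\,\varepsilon_0 A}}
        \quad (\text{pull-in at } x = g_0/3).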

  15. AdiosStMan: Parallelizing Casacore Table Data System using Adaptive IO System

    Science.gov (United States)

    Wang, R.; Harris, C.; Wicenec, A.

    2016-07-01

    In this paper, we investigate the Casacore Table Data System (CTDS) used in the casacore and CASA libraries, and methods to parallelize it. CTDS provides a storage manager plugin mechanism for third-party developers to design and implement their own CTDS storage managers. Having this in mind, we looked into various storage backend techniques that could enable parallel I/O for CTDS by implementing new storage managers. After carrying out benchmarks showing the excellent parallel I/O throughput of the Adaptive IO System (ADIOS), we implemented an ADIOS-based parallel CTDS storage manager. We then applied the CASA MSTransform frequency split task to verify the ADIOS Storage Manager. We also ran a series of performance tests to examine the I/O throughput in a massively parallel scenario.

  16. Model-driven product line engineering for mapping parallel algorithms to parallel computing platforms

    NARCIS (Netherlands)

    Arkin, Ethem; Tekinerdogan, Bedir

    2016-01-01

    Mapping parallel algorithms to parallel computing platforms requires several activities such as the analysis of the parallel algorithm, the definition of the logical configuration of the platform, the mapping of the algorithm to the logical configuration platform and the implementation of the

  17. Parallelization in Modern C++

    CERN Multimedia

    CERN. Geneva

    2016-01-01

    The traditionally used and well established parallel programming models OpenMP and MPI are both targeting lower level parallelism and are meant to be as language agnostic as possible. For a long time, those models were the only widely available portable options for developing parallel C++ applications beyond using plain threads. This has strongly limited the optimization capabilities of compilers, has inhibited extensibility and genericity, and has restricted the use of those models together with other, modern higher level abstractions introduced by the C++11 and C++14 standards. The recent revival of interest in the industry and wider community for the C++ language has also spurred a remarkable amount of standardization proposals and technical specifications being developed. Those efforts however have so far failed to build a vision on how to seamlessly integrate various types of parallelism, such as iterative parallel execution, task-based parallelism, asynchronous many-task execution flows, continuations...

  18. A parallel solution for high resolution histological image analysis.

    Science.gov (United States)

    Bueno, G; González, R; Déniz, O; García-Rojo, M; González-García, J; Fernández-Carrobles, M M; Vállez, N; Salido, J

    2012-10-01

    This paper describes a general methodology for developing parallel image processing algorithms based on message passing for high resolution images (on the order of several Gigabytes). These algorithms have been applied to histological images and must be executed on massively parallel processing architectures. Advances in new technologies for complete slide digitalization in pathology have been combined with developments in biomedical informatics. However, the efficient use of these digital slide systems is still a challenge. The image processing that these slides are subject to is still limited both in terms of data processed and processing methods. The work presented here focuses on the need to design and develop parallel image processing tools capable of obtaining and analyzing the entire gamut of information included in digital slides. Tools have been developed to assist pathologists in image analysis and diagnosis, and they cover low and high-level image processing methods applied to histological images. Code portability, reusability and scalability have been tested by using the following parallel computing architectures: distributed memory with massively parallel processors and two networks, InfiniBand and Myrinet, composed of 17 and 1024 nodes respectively. The parallel framework proposed is a flexible, high-performance solution, and it shows that the efficient processing of digital microscopic images is possible and may offer important benefits to pathology laboratories. Copyright © 2012 Elsevier Ireland Ltd. All rights reserved.
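
    A minimal sketch of tile-based parallel processing of a large image (shared-memory multiprocessing on one node rather than the authors' MPI framework; the per-tile threshold filter is a hypothetical placeholder):

        import numpy as np
        from multiprocessing import Pool

        def process_tile(tile):
            # Hypothetical per-tile operation, e.g. a simple intensity threshold.
            return (tile > tile.mean()).astype(np.uint8)

        def process_image_in_strips(image, n_strips=8, workers=4):
            strips = np.array_split(image, n_strips, axis=0)   # row-wise tiles
            with Pool(processes=workers) as pool:
                return np.vstack(pool.map(process_tile, strips))

        if __name__ == "__main__":
            image = np.random.default_rng(0).integers(0, 255, size=(4000, 4000), dtype=np.uint8)
            mask = process_image_in_strips(image)
            print(mask.shape, mask.dtype)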

  19. Massively parallel mathematical sieves

    Energy Technology Data Exchange (ETDEWEB)

    Montry, G.R.

    1989-01-01

    The Sieve of Eratosthenes is a well-known algorithm for finding all prime numbers in a given subset of integers. A parallel version of the Sieve is described that produces computational speedups over 800 on a hypercube with 1,024 processing elements for problems of fixed size. Computational speedups as high as 980 are achieved when the problem size per processor is fixed. The method of parallelization generalizes to other sieves and will be efficient on any ensemble architecture. We investigate two highly parallel sieves using scattered decomposition and compare their performance on a hypercube multiprocessor. A comparison of different parallelization techniques for the sieve illustrates the trade-offs necessary in the design and implementation of massively parallel algorithms for large ensemble computers.
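
    A block-decomposed version of the sieve can be sketched as follows (a shared-memory Python stand-in, not the hypercube implementation): base primes up to sqrt(N) are computed once, and each worker independently sieves one block of the range.

        import math
        from multiprocessing import Pool

        def base_primes(limit):
            # Serial sieve for primes up to sqrt(N); the result is shared by every worker.
            flags = bytearray([1]) * (limit + 1)
            flags[0:2] = b"\x00\x00"
            for p in range(2, math.isqrt(limit) + 1):
                if flags[p]:
                    flags[p * p::p] = bytearray(len(flags[p * p::p]))
            return [i for i, f in enumerate(flags) if f]

        def sieve_block(args):
            # Count primes in [low, high) using the precomputed base primes.
            low, high, base = args
            flags = bytearray([1]) * (high - low)
            for p in base:
                start = max(p * p, ((low + p - 1) // p) * p)
                flags[start - low::p] = bytearray(len(flags[start - low::p]))
            return sum(flags)

        def parallel_prime_count(n, blocks=8, workers=4):
            base = base_primes(math.isqrt(n))
            edges = [2 + i * (n - 1) // blocks for i in range(blocks + 1)]
            tasks = [(edges[i], edges[i + 1], base) for i in range(blocks)]
            with Pool(processes=workers) as pool:
                return sum(pool.map(sieve_block, tasks))

        if __name__ == "__main__":
            print(parallel_prime_count(1_000_000))   # expect 78498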

  20. Biologic variability and correlation of platelet function testing in healthy dogs.

    Science.gov (United States)

    Blois, Shauna L; Lang, Sean T; Wood, R Darren; Monteith, Gabrielle

    2015-12-01

    Platelet function tests are influenced by biologic variability, including inter-individual (CVG) and intra-individual (CVI), as well as analytic (CVA) variability. Variability in canine platelet function testing is unknown, but if excessive, would make it difficult to interpret serial results. Additionally, the correlation between platelet function tests is poor in people, but not well described in dogs. The aims were to: (1) identify the effect of variation in preanalytic factors (venipuncture, elapsed time until analysis) on platelet function tests; (2) calculate analytic and biologic variability of adenosine diphosphate (ADP) and arachidonic acid (AA)-induced thromboelastograph platelet mapping (TEG-PM), ADP-, AA-, and collagen-induced whole blood platelet aggregometry (WBA), and collagen/ADP and collagen/epinephrine platelet function analysis (PFA-CADP, PFA-CEPI); and (3) determine the correlation between these variables. In this prospective observational trial, platelet function was measured once every 7 days, for 4 consecutive weeks, in 9 healthy dogs. In addition, CBC, TEG-PM, WBA, and PFA were performed. Overall coefficients of variability ranged from 13.3% to 87.8% for the platelet function tests. Biologic variability was highest for AA-induced maximum amplitude generated during TEG-PM (MAAA; CVG = 95.3%, CVI = 60.8%). Use of population-based reference intervals (RI) was determined appropriate only for PFA-CADP (index of individuality = 10.7). There was poor correlation between most platelet function tests. Use of population-based RI appears inappropriate for most platelet function tests, and tests poorly correlate with one another. Future studies on biologic variability and correlation of platelet function tests should be performed in dogs with platelet dysfunction and those treated with antiplatelet therapy. © 2015 American Society for Veterinary Clinical Pathology.

  1. Computer-Aided Parallelizer and Optimizer

    Science.gov (United States)

    Jin, Haoqiang

    2011-01-01

    The Computer-Aided Parallelizer and Optimizer (CAPO) automates the insertion of compiler directives (see figure) to facilitate parallel processing on Shared Memory Parallel (SMP) machines. While CAPO currently is integrated seamlessly into CAPTools (developed at the University of Greenwich, now marketed as ParaWise), CAPO was independently developed at Ames Research Center as one of the components for the Legacy Code Modernization (LCM) project. The current version takes serial FORTRAN programs, performs interprocedural data dependence analysis, and generates OpenMP directives. Due to the widely supported OpenMP standard, the generated OpenMP codes have the potential to run on a wide range of SMP machines. CAPO relies on accurate interprocedural data dependence information currently provided by CAPTools. Compiler directives are generated through identification of parallel loops in the outermost level, construction of parallel regions around parallel loops and optimization of parallel regions, and insertion of directives with automatic identification of private, reduction, induction, and shared variables. Attempts also have been made to identify potential pipeline parallelism (implemented with point-to-point synchronization). Although directives are generated automatically, user interaction with the tool is still important for producing good parallel codes. A comprehensive graphical user interface is included for users to interact with the parallelization process.

  2. Classical test theory and Rasch analysis validation of the Upper Limb Functional Index in subjects with upper limb musculoskeletal disorders.

    Science.gov (United States)

    Bravini, Elisabetta; Franchignoni, Franco; Giordano, Andrea; Sartorio, Francesco; Ferriero, Giorgio; Vercelli, Stefano; Foti, Calogero

    2015-01-01

    To perform a comprehensive analysis of the psychometric properties and dimensionality of the Upper Limb Functional Index (ULFI) using both classical test theory and Rasch analysis (RA). Prospective, single-group observational design. Freestanding rehabilitation center. Convenience sample of Italian-speaking subjects with upper limb musculoskeletal disorders (N=174). Not applicable. The Italian version of the ULFI. Data were analyzed using parallel analysis, exploratory factor analysis, and RA for evaluating dimensionality, functioning of rating scale categories, item fit, hierarchy of item difficulties, and reliability indices. Parallel analysis revealed 2 factors explaining 32.5% and 10.7% of the response variance. RA confirmed the failure of the unidimensionality assumption, and 6 items out of the 25 misfitted the Rasch model. When the analysis was rerun excluding the misfitting items, the scale showed acceptable fit values, loading meaningfully to a single factor. Item separation reliability and person separation reliability were .98 and .89, respectively. Cronbach alpha was .92. RA revealed weakness of the scale concerning dimensionality and internal construct validity. However, a set of 19 ULFI items defined through the statistical process demonstrated a unidimensional structure, good psychometric properties, and clinical meaningfulness. These findings represent a useful starting point for further analyses of the tool (based on modern psychometric approaches and confirmatory factor analysis) in larger samples, including different patient populations and nationalities. Copyright © 2015 American Congress of Rehabilitation Medicine. Published by Elsevier Inc. All rights reserved.

  3. A program system for ab initio MO calculations on vector and parallel processing machines. Pt. 1

    International Nuclear Information System (INIS)

    Ernenwein, R.; Rohmer, M.M.; Benard, M.

    1990-01-01

    We present a program system for ab initio molecular orbital calculations on vector and parallel computers. The present article is devoted to the computation of one- and two-electron integrals over contracted Gaussian basis sets involving s-, p-, d- and f-type functions. The McMurchie and Davidson (MMD) algorithm has been implemented and parallelized by distributing the calculation of the 55 relevant classes of integrals over a limited number of logical tasks. All sections of the MMD algorithm have been efficiently vectorized, leading to a scalar/vector ratio of 5.8. Different algorithms are proposed and compared for an optimal vectorization of the contraction of the 'intermediate integrals' generated by the MMD formalism. Advantage is taken of the dynamic storage allocation for tuning the length of the vector loops (i.e. the size of the vectorization buffer) as a function of (i) the total memory available for the job, (ii) the number of logical tasks defined by the user (≤13), and (iii) the storage requested by each specific class of integrals. Test calculations carried out on a CRAY-2 computer show that the average number of finite integrals computed over a (s, p, d, f) CGTO basis set is about 1180000 per second and per processor. The combination of vectorization and parallelism on this 4-processor machine reduces the CPU time by a factor larger than 20 with respect to the scalar and sequential performance. (orig.)

  4. Scalable Parallel Distributed Coprocessor System for Graph Searching Problems with Massive Data

    Directory of Open Access Journals (Sweden)

    Wanrong Huang

    2017-01-01

    Full Text Available Internet applications, such as network searching, electronic commerce, and modern medical applications, produce and process massive amounts of data. Considerable data parallelism exists in the computation processes of data-intensive applications. A traversal algorithm, breadth-first search (BFS), is fundamental in many graph processing applications and metrics when a graph grows in scale. A variety of scientific programming methods have been proposed for accelerating and parallelizing BFS because of the poor temporal and spatial locality caused by inherent irregular memory access patterns. However, new parallel hardware could provide better improvement for scientific methods. To address small-world graph problems, we propose a scalable and novel field-programmable gate array-based heterogeneous multicore system for scientific programming. Each core is multithreaded for streaming processing, and the InfiniBand communication network is adopted for scalability. We design a binary search algorithm for address mapping to unify all processor addresses. Within the limits permitted by the Graph500 test bench, after testing a 1D parallel hybrid BFS algorithm, our 8-core and 8-thread-per-core system achieved superior performance and efficiency compared with the prior work under the same degree of parallelism. Our system is efficient not as a special acceleration unit but as a processor platform that deals with graph searching applications.
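
    The level-synchronous flavor of parallel BFS can be sketched as follows (a process-pool stand-in, not the FPGA system): each iteration expands chunks of the current frontier in parallel and then merges the results into the next frontier.

        from multiprocessing import Pool

        def expand(args):
            # Expand one slice of the frontier: gather all neighbours of its vertices.
            frontier_slice, adjacency = args
            out = set()
            for v in frontier_slice:
                out.update(adjacency[v])
            return out

        def chunkify(seq, n):
            # Split a list into at most n roughly equal chunks.
            k = max(1, -(-len(seq) // n))
            return [seq[i:i + k] for i in range(0, len(seq), k)]

        def parallel_bfs(adjacency, root, workers=4):
            visited, frontier, level, depth = {root}, [root], {root: 0}, 0
            with Pool(processes=workers) as pool:
                while frontier:
                    tasks = [(chunk, adjacency) for chunk in chunkify(frontier, workers)]
                    neighbours = set().union(*pool.map(expand, tasks))
                    depth += 1
                    frontier = [v for v in neighbours if v not in visited]
                    visited.update(frontier)
                    level.update({v: depth for v in frontier})
            return level

        if __name__ == "__main__":
            adj = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2, 4], 4: [3]}
            print(parallel_bfs(adj, 0))   # {0: 0, 1: 1, 2: 1, 3: 2, 4: 3}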

  5. Robust equivalent consumption-based controllers for a dual-mode diesel parallel HEV

    International Nuclear Information System (INIS)

    Finesso, Roberto; Spessa, Ezio; Venditti, Mattia

    2016-01-01

    Highlights: • Non-plug-in dual-mode parallel hybrid architecture. • Cross-validation machine-learning for robust equivalent consumption-based controllers. • Optimal control strategy based on fuel consumption, NOx and battery aging. • Impact of different equivalent consumption definitions on HEV performance. • Correlation between vehicle braking energy and SOC variation in the traction stages. - Abstract: New equivalent consumption minimization strategy (ECMS) tools have been developed and applied to identify the optimal control strategy of a dual-mode parallel hybrid electric vehicle equipped with a compression-ignition engine. In this architecture, the electric machine is coupled to the engine through either a single-speed gearbox (torque-coupling) or a planetary gear set (speed-coupling). One of the main novelties of the present study concerns the definition of the instantaneous equivalent consumption (EC) function, which takes into account not only fuel consumption (FC) and the energy flow through the electric components, but also NO_x emissions, battery aging, and the battery SOC. The EC function has been trained using a cross-validation machine-learning technique, based on a genetic algorithm, where the training data set has been selected in order to maximize performance over a testing data set. The adoption of this technique, in conjunction with the new definition of EC, has led to the identification of very robust controllers, which provide accurate control for different driving scenarios, even when the EC function is not specifically trained on the same missions over which it is tested. To this aim, a data set of fifty driving cycles and six user-defined missions, which cover a total distance of 70–100 km, has been considered as a training driving set. The ECMS controllers can be implemented in a vehicle control unit, and their performance has proved to be close to that of a dynamic programming tool, which has been used here as a benchmark.
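
    Purely to illustrate the structure of an equivalent-consumption cost combining fuel, NOx, battery aging and SOC as described above (all models, weights and numbers below are hypothetical placeholders, not the paper's trained EC function):

        from scipy.optimize import minimize_scalar

        def fuel_rate(torque_engine):                 # g/s, hypothetical engine map
            return 0.05 + 0.002 * torque_engine + 1e-5 * torque_engine**2

        def nox_rate(torque_engine):                  # g/s, hypothetical emission map
            return 1e-4 * torque_engine

        def battery_penalty(power_motor, soc, soc_target=0.6):
            aging = 1e-6 * power_motor**2             # throughput-based aging proxy
            soc_dev = 5.0 * (soc - soc_target)**2     # keep the SOC near its target
            return aging + soc_dev

        def equivalent_consumption(torque_engine, torque_demand, soc,
                                   w_nox=2.0, s_equiv=2.5e-4):
            torque_motor = torque_demand - torque_engine
            power_motor = torque_motor * 20.0         # fixed shaft speed for the toy
            return (fuel_rate(torque_engine)
                    + w_nox * nox_rate(torque_engine)
                    + s_equiv * power_motor           # equivalent fuel for electric energy
                    + battery_penalty(power_motor, soc))

        # Per-timestep decision: choose the engine torque that minimizes the cost.
        res = minimize_scalar(equivalent_consumption, bounds=(0.0, 200.0),
                              args=(150.0, 0.55), method='bounded')
        print("engine torque:", res.x, " motor torque:", 150.0 - res.x)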

  6. 33 CFR 157.12f - Workshop functional test requirements.

    Science.gov (United States)

    2010-07-01

    ... CARRYING OIL IN BULK Design, Equipment, and Installation § 157.12f Workshop functional test requirements... 33 Navigation and Navigable Waters 2 2010-07-01 2010-07-01 false Workshop functional test requirements. 157.12f Section 157.12f Navigation and Navigable Waters COAST GUARD, DEPARTMENT OF HOMELAND...

  7. Solving the Flood Propagation Problem with Newton Algorithm on Parallel Systems

    Directory of Open Access Journals (Sweden)

    Chefi Triki

    2012-04-01

    Full Text Available In this paper we propose a parallel implementation for the flood propagation method Flo2DH. The model is built on a finite element spatial approximation combined with a Newton algorithm that uses a direct LU linear solver. The parallel implementation has been developed by using the standard MPI protocol and has been tested on a set of real world problems.

  8. Photoluminescence spectra of n-doped double quantum wells in a parallel magnetic field

    International Nuclear Information System (INIS)

    Huang, D.; Lyo, S.K.

    1999-01-01

    We show that the photoluminescence (PL) line shapes from tunnel-split ground sublevels of n-doped thin double quantum wells (DQWs) are sensitively modulated by an in-plane magnetic field B∥ at low temperatures (T). The modulation is caused by the B∥-induced distortion of the electronic structure. The latter arises from the relative shift of the energy-dispersion parabolas of the two quantum wells (QWs) in k space, both in the conduction and valence bands, and formation of an anticrossing gap in the conduction band. Using a self-consistent density-functional theory, the PL spectra and the band-gap narrowing are calculated as a function of B∥, T, and the homogeneous linewidths. The PL spectra from symmetric and asymmetric DQWs are found to show strikingly different behavior. In symmetric DQWs with a high density of electrons, two PL peaks are obtained at B∥ = 0, representing the interband transitions between the pair of the upper (i.e., antisymmetric) levels and that of the lower (i.e., symmetric) levels of the ground doublets. As B∥ increases, the upper PL peak develops an N-type kink, namely a maximum followed by a minimum, and merges with the lower peak, which rises monotonically as a function of B∥ due to the diamagnetic energy. When the electron density is low, however, only a single PL peak, arising from the transitions between the lower levels, is obtained. In asymmetric DQWs, the PL spectra show mainly one dominant peak at all values of B∥. In this case, the holes are localized in one of the QWs at low T and recombine only with the electrons in the same QW. At high electron densities, the upper PL peak shows an N-type kink like in symmetric DQWs. However, the lower peak is absent at low B∥ because it arises from the inter-QW transitions. Reasonable agreement is obtained with recent

  9. Type synthesis for 4-DOF parallel press mechanism using GF set theory

    Science.gov (United States)

    He, Jun; Gao, Feng; Meng, Xiangdun; Guo, Weizhong

    2015-07-01

    Parallel mechanisms are used in large-capacity servo presses to avoid the over-constraint of traditional redundant actuation. Current research mainly focuses on performance analysis of some specific parallel press mechanisms. However, the type synthesis and evaluation of parallel press mechanisms is seldom studied, especially for four-degrees-of-freedom (DOF) press mechanisms. The type synthesis of 4-DOF parallel press mechanisms is carried out based on generalized function (GF) set theory. Five design criteria for 4-DOF parallel press mechanisms are first proposed. The general procedure of type synthesis of parallel press mechanisms is obtained, which includes number synthesis, symmetrical synthesis of constraint GF sets, decomposition of motion GF sets, and design of limbs. Nine combinations of constraint GF sets of 4-DOF parallel press mechanisms, ten combinations of GF sets of active limbs, and eleven combinations of GF sets of passive limbs are synthesized. Thirty-eight kinds of press mechanisms are presented, and different structures of kinematic limbs are then designed. Finally, the geometrical constraint complexity (GCC), kinematic pair complexity (KPC), and type complexity (TC) are proposed to evaluate the press types, and the optimal press type is achieved. General methodologies of type synthesis and evaluation for parallel press mechanisms are suggested.

  10. Parallel computing for homogeneous diffusion and transport equations in neutronics

    International Nuclear Information System (INIS)

    Pinchedez, K.

    1999-06-01

    Parallel computing meets the ever-increasing requirements for neutronic computer code speed and accuracy. In this work, two different approaches have been considered. We first parallelized the sequential algorithm used by the neutronics code CRONOS developed at the French Atomic Energy Commission. The algorithm computes the dominant eigenvalue associated with the PN simplified transport equations by a mixed finite element method. Several parallel algorithms have been developed on distributed memory machines. The performance of the parallel algorithms has been studied experimentally by implementation on a Cray T3D and theoretically by complexity models. A comparison of various parallel algorithms has confirmed the chosen implementations. We next applied a domain sub-division technique to the two-group diffusion eigenvalue problem. In the modal synthesis-based method, the global spectrum is determined from the partial spectra associated with sub-domains. The eigenvalue problem is then expanded on a family composed, on the one hand, of eigenfunctions associated with the sub-domains and, on the other hand, of functions corresponding to the contribution from the interface between the sub-domains. For a 2-D homogeneous core, this modal method has been validated and its accuracy has been measured. (author)

  11. The specification of Stampi, a message passing library for distributed parallel computing

    International Nuclear Information System (INIS)

    Imamura, Toshiyuki; Takemiya, Hiroshi; Koide, Hiroshi

    2000-03-01

    At CCSE, the Center for Promotion of Computational Science and Engineering, a new message passing library for heterogeneous and distributed parallel computing has been developed; it is called Stampi. Stampi enables us to communicate between any combination of parallel computers as well as workstations. Currently, a Stampi system is constructed from the Stampi library and Stampi/Java. It provides functions to connect a Stampi application not only with those on COMPACS, the COMplex Parallel Computer System, but also with applets which work on WWW browsers. This report summarizes the specifications of Stampi and details the development of its system. (author)

  12. A parallel buffer tree

    DEFF Research Database (Denmark)

    Sitchinava, Nodar; Zeh, Norbert

    2012-01-01

    We present the parallel buffer tree, a parallel external memory (PEM) data structure for batched search problems. This data structure is a non-trivial extension of Arge's sequential buffer tree to a private-cache multiprocessor environment and reduces the number of I/O operations by the number of...... in the optimal O(psort(N) + K/(PB)) parallel I/O complexity, where K is the size of the output reported in the process and psort(N) is the parallel I/O complexity of sorting N elements using P processors.

  13. Application Portable Parallel Library

    Science.gov (United States)

    Cole, Gary L.; Blech, Richard A.; Quealy, Angela; Townsend, Scott

    1995-01-01

    Application Portable Parallel Library (APPL) computer program is subroutine-based message-passing software library intended to provide consistent interface to variety of multiprocessor computers on market today. Minimizes effort needed to move application program from one computer to another. User develops application program once and then easily moves application program from parallel computer on which created to another parallel computer. ("Parallel computer" also includes a heterogeneous collection of networked computers). Written in C language with one FORTRAN 77 subroutine for UNIX-based computers and callable from application programs written in C language or FORTRAN 77.

  14. Parallel Algorithms and Patterns

    Energy Technology Data Exchange (ETDEWEB)

    Robey, Robert W. [Los Alamos National Lab. (LANL), Los Alamos, NM (United States)

    2016-06-16

    This is a PowerPoint presentation on parallel algorithms and patterns. A parallel algorithm is a well-defined, step-by-step computational procedure that emphasizes concurrency to solve a problem. Examples of problems include: sorting, searching, optimization, matrix operations. A parallel pattern is a computational step in a sequence of independent, potentially concurrent operations that occurs in diverse scenarios with some frequency. Examples are: reductions, prefix scans, ghost cell updates. We only touch on parallel patterns in this presentation. It really deserves its own detailed discussion which Gabe Rockefeller would like to develop.
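
    As a small illustration of the prefix-scan pattern mentioned above (not taken from the presentation), the classic three-phase blocked scan looks like this; phases 1 and 3 are the embarrassingly parallel parts:

        import numpy as np

        def blocked_prefix_scan(x, n_blocks=4):
            # Inclusive prefix sum via the three-phase pattern:
            #  1) scan each block independently (parallel),
            #  2) scan the per-block totals (small, serial),
            #  3) add each block's offset to its local scan (parallel).
            blocks = np.array_split(np.asarray(x), n_blocks)
            local = [np.cumsum(b) for b in blocks]
            totals = np.array([b[-1] if len(b) else 0 for b in local])
            offsets = np.concatenate(([0], np.cumsum(totals)[:-1]))
            return np.concatenate([b + off for b, off in zip(local, offsets)])

        x = np.arange(1, 11)
        print(blocked_prefix_scan(x))   # [ 1  3  6 10 15 21 28 36 45 55]
        print(np.cumsum(x))             # same result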

  15. Parallelization and implementation of approximate root isolation for nonlinear system by Monte Carlo

    Science.gov (United States)

    Khosravi, Ebrahim

    1998-12-01

    This dissertation solves a fundamental problem of isolating the real roots of nonlinear systems of equations by Monte Carlo, as published by Bush Jones. This algorithm requires only function values and can be applied readily to complicated systems of transcendental functions. The implementation of this sequential algorithm provides scientists with the means to utilize function analysis in mathematics or other fields of science. The algorithm, however, is so computationally intensive that the system is limited to a very small set of variables, and this will make it unfeasible for large systems of equations. A computational technique was also needed to investigate a methodology for preventing the algorithm structure from converging to the same root along different paths of computation. The research provides techniques for improving the efficiency and correctness of the algorithm. The sequential algorithm for this technique was corrected and a parallel algorithm is presented. This parallel method has been formally analyzed and is compared with other known methods of root isolation. The effectiveness, efficiency, and enhanced overall performance of the parallel processing of the program in comparison to sequential processing are discussed. The message passing model was used for this parallel processing, and it is presented and implemented on Intel/860 MIMD architecture. The parallel processing proposed in this research has been implemented in an ongoing high energy physics experiment: this algorithm has been used to track neutrinos in a Super-K detector. This experiment is located in Japan, and data can be processed on-line or off-line, locally or remotely.

  16. SP-100 nuclear assembly test: Test assembly functional requirements and system arrangement

    International Nuclear Information System (INIS)

    Fallas, T.T.; Gluck, R.; Motwani, K.; Clay, H.; O'Neill, G.

    1991-01-01

    This paper describes the functional requirements and the system that will be tested to validate the reactor, flight shield, and flight controller of the SP-100 Generic Flight System (GFS). The Nuclear Assembly Test (NAT) consists of the test article (SP-100 reactor with control devices and the flight shield) and its supporting systems. The NAT test assembly is being designed by GE. Westinghouse Hanford Company (WHC) is designing the test cell and vacuum vessel system that will contain the NAT test assembly (Renkey et al. 1989). Preliminary design reviews have been completed and the final design is under way

  17. The most frequently used tests for assessing executive functions in aging

    Directory of Open Access Journals (Sweden)

    Camila de Assis Faria

    Full Text Available There are numerous neuropsychological tests for assessing executive functions in aging, which vary according to the different domains assessed. OBJECTIVE: To present a systematic review of the most frequently used instruments for assessing executive functions in older adults with different educational levels in clinical and experimental research. METHODS: We searched for articles published in the last five years, using the PubMed database with the following terms: "neuropsychological tests", "executive functions", and "mild cognitive impairment". There was no language restriction. RESULTS: 25 articles fulfilled all the inclusion criteria. The seven neuropsychological tests most frequently used to evaluate executive functions in aging were: [1] Trail Making Test (TMT) Form B; [2] Verbal Fluency Test (VFT) - F, A and S; [3] VFT Animals category; [4] Clock Drawing Test (CDT); [5] Digits Forward and Backward subtests (WAIS-R or WAIS-III); [6] Stroop Test; and [7] Wisconsin Card Sorting Test (WCST) and its variants. The domains of executive functions most frequently assessed were: mental flexibility, verbal fluency, planning, working memory, and inhibitory control. CONCLUSION: The study identified the tests and domains of executive functions most frequently used in the last five years by research groups worldwide to evaluate older adults. These results can direct future research and help build evaluation protocols for assessing executive functions, taking into account the different educational levels and socio-demographic profiles of older adults in Brazil.

  18. Eigenvalues calculation algorithms for λ-modes determination. Parallelization approach

    Energy Technology Data Exchange (ETDEWEB)

    Vidal, V. [Universidad Politecnica de Valencia (Spain). Departamento de Sistemas Informaticos y Computacion; Verdu, G.; Munoz-Cobo, J.L. [Universidad Politecnica de Valencia (Spain). Departamento de Ingenieria Quimica y Nuclear; Ginestart, D. [Universidad Politecnica de Valencia (Spain). Departamento de Matematica Aplicada

    1997-03-01

    In this paper, we review two methods to obtain the λ-modes of a nuclear reactor, the Subspace Iteration method and Arnoldi's method, which are popular methods to solve the partial eigenvalue problem for a given matrix. In the developed application for the neutron diffusion equation we include improved acceleration techniques for both methods. Also, we propose two parallelization approaches for these methods, a coarse-grain parallelization and a fine-grain one. We have tested the developed algorithms with two realistic problems, focusing on the efficiency of the methods according to the CPU times. (author).
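
    The Arnoldi approach can be illustrated with SciPy's ARPACK wrapper (implicitly restarted Arnoldi) on a toy sparse operator; this is only a stand-in for the neutron-diffusion eigenvalue problem discussed above:

        import numpy as np
        from scipy.sparse import diags
        from scipy.sparse.linalg import eigs

        # Toy sparse operator standing in for the iteration matrix of the diffusion problem.
        n = 500
        A = diags([np.full(n - 1, 0.3), np.full(n, 1.0), np.full(n - 1, 0.3)], [-1, 0, 1])

        # Implicitly restarted Arnoldi: a few dominant eigenpairs ("lambda-modes").
        vals, vecs = eigs(A, k=3, which='LM')
        print("dominant eigenvalues:", np.sort(vals.real)[::-1])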

  19. Totally parallel multilevel algorithms

    Science.gov (United States)

    Frederickson, Paul O.

    1988-01-01

    Four totally parallel algorithms for the solution of a sparse linear system have common characteristics which become quite apparent when they are implemented on a highly parallel hypercube such as the CM2. These four algorithms are Parallel Superconvergent Multigrid (PSMG) of Frederickson and McBryan, Robust Multigrid (RMG) of Hackbusch, the FFT based Spectral Algorithm, and Parallel Cyclic Reduction. In fact, all four can be formulated as particular cases of the same totally parallel multilevel algorithm, which is referred to as TPMA. In certain cases the spectral radius of TPMA is zero, and it is recognized to be a direct algorithm. In many other cases the spectral radius, although not zero, is small enough that a single iteration per timestep keeps the local error within the required tolerance.

  20. Preliminary Study on the Enhancement of Reconstruction Speed for Emission Computed Tomography Using Parallel Processing

    International Nuclear Information System (INIS)

    Park, Min Jae; Lee, Jae Sung; Kim, Soo Mee; Kang, Ji Yeon; Lee, Dong Soo; Park, Kwang Suk

    2009-01-01

    Conventional image reconstruction uses simplified physical models of projection. However, realistic physics, for example fully 3D reconstruction, takes too long to process all the data in clinical practice and is not feasible on a common reconstruction machine because of the large memory required by complex physical models. We suggest a realistic distributed-memory model of fast reconstruction using parallel processing on personal computers to enable such large-scale techniques. Preliminary feasibility tests on virtual machines and various performance tests on a commercial supercomputer, Tachyon, were performed. The expectation maximization algorithm was tested with a common 2D projector and with realistic 3D lines of response. Since processing slowed down (by up to 6 times) after a certain number of iterations, compiler optimization was performed to maximize the efficiency of parallelization. Parallel processing of a program on multiple computers was available on Linux with MPICH and NFS. We verified that the differences between the parallel-processed and single-processed images at the same iterations were within the least significant digits of the floating-point representation, about 6 bits. Two processors showed good parallel computing efficiency (a speedup of 1.96). The slowdown phenomenon was solved by vectorization using SSE. Through this study, a realistic parallel computing system for clinical use was established, able to reconstruct with the large memory required by realistic physical models that cannot be simplified.
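
    The expectation-maximization (MLEM) update that such reconstructions iterate can be sketched in NumPy as follows (a serial toy with a random system matrix, not the authors' parallel implementation; in their setting this update is what gets distributed across machines):

        import numpy as np

        rng = np.random.default_rng(1)
        n_pix, n_bins = 64, 256
        A = rng.random((n_bins, n_pix))            # toy system (projection) matrix
        x_true = rng.random(n_pix)
        y = rng.poisson(A @ x_true * 50) / 50.0    # noisy "measured" projections

        # MLEM / EM iterations: x <- x * A^T (y / (A x)) / (A^T 1)
        sens = A.T @ np.ones(n_bins)
        x = np.ones(n_pix)
        for _ in range(50):
            x *= (A.T @ (y / (A @ x))) / sens

        print("relative error:", np.linalg.norm(x - x_true) / np.linalg.norm(x_true))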

  1. Parallel routes of human carcinoma development: implications of the age-specific incidence data.

    Directory of Open Access Journals (Sweden)

    James P Brody

    Full Text Available BACKGROUND: The multi-stage hypothesis suggests that cancers develop through a single defined series of genetic alterations. This hypothesis was first suggested over 50 years ago based upon age-specific incidence data. However, recent molecular studies of tumors indicate that multiple routes exist to the formation of cancer, not a single route. This parallel route hypothesis has not been tested with age-specific incidence data. METHODOLOGY/PRINCIPAL FINDINGS: To test the parallel route hypothesis, I formulated it in terms of a mathematical equation and then tested whether this equation was consistent with age-specific incidence data compiled by the Surveillance Epidemiology and End Results (SEER) cancer registries since 1973. I used the chi-squared goodness of fit test to measure consistency. The age-specific incidence data from most human carcinomas, including those of the colon, lung, prostate, and breast were consistent with the parallel route hypothesis. However, this hypothesis is only consistent if an immune sub-population exists, one that will never develop carcinoma. Furthermore, breast carcinoma has two distinct forms of the disease, and one of these occurs at significantly different rates in different racial groups. CONCLUSIONS/SIGNIFICANCE: I conclude that the parallel route hypothesis is consistent with the age-specific incidence data only if carcinoma occurs in a distinct sub population, while the multi-stage hypothesis is inconsistent with this data.

  2. Neural Parallel Engine: A toolbox for massively parallel neural signal processing.

    Science.gov (United States)

    Tam, Wing-Kin; Yang, Zhi

    2018-05-01

    Large-scale neural recordings provide detailed information on neuronal activities and can help elucidate the underlying neural mechanisms of the brain. However, the computational burden is also formidable when we try to process the huge data stream generated by such recordings. In this study, we report the development of Neural Parallel Engine (NPE), a toolbox for massively parallel neural signal processing on graphical processing units (GPUs). It offers a selection of the most commonly used routines in neural signal processing such as spike detection and spike sorting, including advanced algorithms such as exponential-component-power-component (EC-PC) spike detection and binary pursuit spike sorting. We also propose a new method for detecting peaks in parallel through a parallel compact operation. Our toolbox is able to offer a 5× to 110× speedup compared with its CPU counterparts depending on the algorithms. A user-friendly MATLAB interface is provided to allow easy integration of the toolbox into existing workflows. Previous efforts on GPU neural signal processing only focus on a few rudimentary algorithms, are not well-optimized and often do not provide a user-friendly programming interface to fit into existing workflows. There is a strong need for a comprehensive toolbox for massively parallel neural signal processing. A new toolbox for massively parallel neural signal processing has been created. It can offer significant speedup in processing signals from large-scale recordings up to thousands of channels. Copyright © 2018 Elsevier B.V. All rights reserved.
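
    A CPU/NumPy analogue of the threshold-plus-local-maximum peak detection followed by a compaction step (not the NPE/GPU code) can be written as:

        import numpy as np

        def detect_peaks(signal, threshold):
            # Mark samples that exceed the threshold and are local maxima, then
            # compact the boolean mask into a dense index list (the compaction step).
            s = np.asarray(signal)
            is_peak = np.zeros(s.shape, dtype=bool)
            is_peak[1:-1] = (s[1:-1] > threshold) & (s[1:-1] >= s[:-2]) & (s[1:-1] > s[2:])
            return np.flatnonzero(is_peak)

        t = np.linspace(0.0, 1.0, 1000)
        x = np.sin(2 * np.pi * 12 * t) + 0.1 * np.random.default_rng(0).standard_normal(1000)
        print(detect_peaks(x, threshold=0.8))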

  3. A possibility of parallel and anti-parallel diffraction measurements on ...

    Indian Academy of Sciences (India)

    However, a bent perfect crystal (BPC) monochromator at monochromatic focusing condition can provide a quite flat and equal resolution property at both parallel and anti-parallel positions and thus one can have a chance to use both sides for the diffraction experiment. From the data of the FWHM and the / measured ...

  4. Modulation, plasticity and pathophysiology of the parallel fiber-Purkinje cell synapse

    Directory of Open Access Journals (Sweden)

    Eriola Hoxha

    2016-11-01

    Full Text Available The parallel fiber-Purkinje cell synapse represents the point of maximal signal divergence in the cerebellar cortex, with an estimated number of about 60 billion synaptic contacts in the rat and 100,000 billion in humans. At the same time, the Purkinje cell dendritic tree is a site of remarkable convergence of more than 100,000 parallel fiber synapses. Parallel fiber activity generates fast postsynaptic currents via AMPA receptors, and slower signals, mediated by mGlu1 receptors, resulting in Purkinje cell depolarization accompanied by sharp calcium elevation within dendritic regions. Long-term depression and long-term potentiation have been widely described for the parallel fiber-Purkinje cell synapse and have been proposed as mechanisms for motor learning. The mechanisms of induction for LTP and LTD involve different signaling mechanisms within the presynaptic terminal and/or at the postsynaptic site, promoting enduring modification in neurotransmitter release and changes in responsiveness to the neurotransmitter. The parallel fiber-Purkinje cell synapse is finely modulated by several neurotransmitters, including serotonin, noradrenaline, and acetylcholine. The ability of these neuromodulators to gate LTP and LTD at the parallel fiber-Purkinje cell synapse could, at least in part, explain their effect on cerebellar-dependent learning and memory paradigms. Overall, these findings have important implications for understanding the cerebellar involvement in a series of pathological conditions, ranging from ataxia to autism. For example, parallel fiber-Purkinje cell synapse dysfunctions have been identified in several murine models of spinocerebellar ataxia (SCA) types 1, 3, 5 and 27. In some cases, the defect is specific to AMPA receptor signaling (SCA27), while in others the mGlu1 pathway is affected (SCA1, 3, 5). Interestingly, the parallel fiber-Purkinje cell synapse has been shown to be hyper-functional in a mutant mouse model of autism

  5. Multilevel parallel strategy on Monte Carlo particle transport for the large-scale full-core pin-by-pin simulations

    International Nuclear Information System (INIS)

    Zhang, B.; Li, G.; Wang, W.; Shangguan, D.; Deng, L.

    2015-01-01

    This paper introduces the multilevel hybrid parallelism strategy of the JCOGIN infrastructure for Monte Carlo particle transport in large-scale full-core pin-by-pin simulations. Particle parallelism, domain decomposition parallelism and MPI/OpenMP parallelism are designed and implemented. In testing, JMCT demonstrates the parallel scalability of JCOGIN, reaching a parallel efficiency of 80% on 120,000 cores for the pin-by-pin computation of the BEAVRS benchmark. (author)

  6. Comparative eye-tracking evaluation of scatterplots and parallel coordinates

    Directory of Open Access Journals (Sweden)

    Rudolf Netzel

    2017-06-01

    Full Text Available We investigate task performance and reading characteristics for scatterplots (Cartesian coordinates) and parallel coordinates. In a controlled eye-tracking study, we asked 24 participants to assess the relative distance of points in multidimensional space, depending on the diagram type (parallel coordinates or a horizontal collection of scatterplots), the number of data dimensions (2, 4, 6, or 8), and the relative distance between points (15%, 20%, or 25%). For a given reference point and two target points, we instructed participants to choose the target point that was closer to the reference point in multidimensional space. We present a visual scanning model that describes different strategies to solve this retrieval task for both diagram types, and propose corresponding hypotheses that we test using task completion time, accuracy, and gaze positions as dependent variables. Our results show that scatterplots outperform parallel coordinates significantly in 2 dimensions; however, the task was solved more quickly and more accurately with parallel coordinates in 8 dimensions. The eye-tracking data further shows significant differences between Cartesian and parallel coordinates, as well as between different numbers of dimensions. For parallel coordinates, there is a clear trend toward shorter fixations and longer saccades with increasing number of dimensions. Using an area-of-interest (AOI) based approach, we identify different reading strategies for each diagram type: for parallel coordinates, the participants’ gaze frequently jumped back and forth between pairs of axes, while axes were rarely focused on when viewing Cartesian coordinates. We further found that participants’ attention is biased: toward the center of the whole plot for parallel coordinates and skewed to the center/left side for Cartesian coordinates. We anticipate that these results may support the design of more effective visualizations for multidimensional data.

  7. Numerical simulation of Vlasov equation with parallel tools

    International Nuclear Information System (INIS)

    Peyroux, J.

    2005-11-01

    This project aims to make the resolution of Vlasov codes even more powerful through various parallelization tools (MPI, OpenMP, ...). A simplified test case served as a basis for constructing the parallel codes and obtaining a data-processing skeleton which, thereafter, could be re-used for increasingly complex models (more than four phase-space variables). This will make it possible to treat more realistic situations linked, for example, to the injection of ultra-short and ultra-intense pulses in inertial fusion plasmas, or to the study of the trapped-ion instability now considered responsible for the generation of turbulence in tokamak plasmas. (author)

  8. Parallel Harmony Search Based Distributed Energy Resource Optimization

    Energy Technology Data Exchange (ETDEWEB)

    Ceylan, Oguzhan [ORNL; Liu, Guodong [ORNL; Tomsovic, Kevin [University of Tennessee, Knoxville (UTK)

    2015-01-01

    This paper presents a harmony-search-based parallel optimization algorithm to minimize voltage deviations in three-phase unbalanced electrical distribution systems and to maximize the active power outputs of distributed energy resources (DR). The main contribution is to reduce the adverse impacts on the voltage profile during a day as photovoltaic (PV) output or electric vehicle (EV) charging changes throughout the day. The IEEE 123-bus distribution test system is modified by adding DRs and EVs under different load profiles. The simulation results show that by using parallel computing techniques, heuristic methods may be used as an alternative optimization tool in electrical power distribution system operation.

  9. FPGAs and wavelets on circuit testing based on current signal measurements

    International Nuclear Information System (INIS)

    Pouros, Sotirios; Vassios, Vassilios; Manolakis, Dimitrios; Bamnios, Georgios; Papakostas, Dimitrios K.; Hatzopoulos, Alkis A.; Hristov, Valentin

    2015-01-01

    The research team designed and implemented a prototype testing system using FPGAs, where test methods for analog and digital (mixed) electronics using wavelets can be incorporated. The prototype has been evaluated and the results are promising. Moreover, the usability and verification of the system’s functionality are presented. The current sensing unit is described in detail. The new automated fault testing system incorporates reconfigurability and parallel processing capabilities.

  10. Parallel and distributed processing in power system simulation and control

    Energy Technology Data Exchange (ETDEWEB)

    Falcao, Djalma M [Universidade Federal, Rio de Janeiro, RJ (Brazil). Coordenacao dos Programas de Pos-graduacao de Engenharia

    1994-12-31

    Recent advances in computer technology will certainly have a great impact on the methodologies used in power system expansion and operational planning as well as in real-time control. Parallel and distributed processing are among the new technologies that present great potential for application in these areas. Parallel computers use multiple functional or processing units to speed up computation, while distributed processing computer systems are collections of computers joined together by high-speed communication networks, with many objectives and advantages. The paper presents some ideas for the use of parallel and distributed processing in power system simulation and control. It also comments on some of the current research work in these topics and presents a summary of the work presently being developed at COPPE. (author) 53 refs., 2 figs.

  11. Parallel heat transport in integrable and chaotic magnetic fields

    Energy Technology Data Exchange (ETDEWEB)

    Castillo-Negrete, D. del; Chacon, L. [Oak Ridge National Laboratory, Oak Ridge, Tennessee 37831-8071 (United States)

    2012-05-15

    The study of transport in magnetized plasmas is a problem of fundamental interest in controlled fusion, space plasmas, and astrophysics research. Three issues make this problem particularly challenging: (i) the extreme anisotropy between the parallel (i.e., along the magnetic field), χ∥, and the perpendicular, χ⊥, conductivities (χ∥/χ⊥ may exceed 10^10 in fusion plasmas); (ii) nonlocal parallel transport in the limit of small collisionality; and (iii) magnetic field line chaos, which in general complicates (and may preclude) the construction of magnetic field line coordinates. Motivated by these issues, we present a Lagrangian Green's function method to solve the local and non-local parallel transport equation applicable to integrable and chaotic magnetic fields in arbitrary geometry. The method avoids by construction the numerical pollution issues of grid-based algorithms. The potential of the approach is demonstrated with nontrivial applications to integrable (magnetic island), weakly chaotic (Devil's staircase), and fully chaotic magnetic field configurations. For the latter, numerical solutions of the parallel heat transport equation show that the effective radial transport, with local and non-local parallel closures, is non-diffusive, thus casting doubts on the applicability of quasilinear diffusion descriptions. General conditions for the existence of non-diffusive, multivalued flux-gradient relations in the temperature evolution are derived.

  12. Evidence for parallel consolidation of motion direction and orientation into visual short-term memory.

    Science.gov (United States)

    Rideaux, Reuben; Apthorp, Deborah; Edwards, Mark

    2015-02-12

    Recent findings have indicated that the capacity to consolidate multiple items into visual short-term memory in parallel varies as a function of the type of information. That is, while color can be consolidated in parallel, evidence suggests that orientation cannot. Here we investigated the capacity to consolidate multiple motion directions in parallel and reexamined this capacity using orientation. This was achieved by determining the shortest exposure duration necessary to consolidate a single item, then examining whether two items, presented simultaneously, could be consolidated in that time. The results show that parallel consolidation of direction and orientation information is possible, and that parallel consolidation of direction appears to be limited to two items. Additionally, we demonstrate the importance of adequate separation between the feature intervals used to define items when attempting to consolidate in parallel, suggesting that when multiple items are consolidated in parallel, as opposed to serially, the resolution of representations suffers. Finally, we used facilitation of spatial attention to show that the deterioration of item resolution occurs during parallel consolidation, as opposed to storage. © 2015 ARVO.

  13. Numerical simulation of Vlasov equation with parallel tools; Simulations numeriques de l'equation de Vlasov a l'aide d'outils paralleles

    Energy Technology Data Exchange (ETDEWEB)

    Peyroux, J

    2005-11-15

    This project aims to make the resolution of Vlasov codes even more powerful through various parallelization tools (MPI, OpenMP...). A simplified test case served as a basis for constructing the parallel codes and for obtaining a data-processing skeleton which, thereafter, could be re-used for increasingly complex models (more than four variables of phase space). This will make it possible to treat more realistic situations linked, for example, to the injection of ultra-short and ultra-intense pulses in inertial fusion plasmas, or to the study of the instability of trapped ions, now considered responsible for the generation of turbulence in tokamak plasmas. (author)

  14. Optimization of parameter calculation in the process of production history matching by using Parallel Virtual Machine (PVM); Otimizacao do calculo de parametros no processo de ajuste de historicos de producao usando PVM

    Energy Technology Data Exchange (ETDEWEB)

    Vargas Cuervo, Carlos Hernan

    1997-03-01

    The main objective of this work is to develop a methodology to optimize the simultaneous computation of two parameters in the process of production history matching. This work describes a procedure to minimize an objective function established to find the values of the parameters which are modified in the process. The parameters are chosen after a sensitivity analysis. Two optimization methods are tested: a Region Search Method (MBR) and the Polytope Method. Both are based on direct search methods which do not require function derivatives. The software PVM (Parallel Virtual Machine) is used to parallelize the simulation runs, allowing the acceleration of the process and the search for multiple solutions. The validation of the methodology is applied to two reservoir models: one homogeneous and the other heterogeneous. The advantages of each method and of the parallelization are also presented. (author)

  15. Parallel k-means++

    Energy Technology Data Exchange (ETDEWEB)

    2017-04-04

    A parallelization of the k-means++ seed selection algorithm on three distinct hardware platforms: GPU, multicore CPU, and multithreaded architecture. K-means++ was developed by David Arthur and Sergei Vassilvitskii in 2007 as an extension of the k-means data clustering technique. These algorithms allow people to cluster multidimensional data, by attempting to minimize the mean distance of data points within a cluster. K-means++ improved upon traditional k-means by using a more intelligent approach to selecting the initial seeds for the clustering process. While k-means++ has become a popular alternative to traditional k-means clustering, little work has been done to parallelize this technique. We have developed original C++ code for parallelizing the algorithm on three unique hardware architectures: GPU using NVidia's CUDA/Thrust framework, multicore CPU using OpenMP, and the Cray XMT multithreaded architecture. By parallelizing the process for these platforms, we are able to perform k-means++ clustering much more quickly than it could be done before.
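
    As background for the record above, the sketch below shows the serial k-means++ seeding step in plain Python: each new seed is drawn with probability proportional to its squared distance from the nearest seed already chosen. The released software parallelizes this step in C++ for GPU, OpenMP, and Cray XMT targets, none of which is reproduced here, and the toy data are illustrative.

    ```python
    import random

    def kmeans_pp_seeds(points, k):
        """k-means++ seeding (Arthur & Vassilvitskii, 2007): sample each new seed
        with probability proportional to its squared distance to the nearest seed."""
        seeds = [random.choice(points)]
        while len(seeds) < k:
            # Squared distance from every point to its closest current seed.
            d2 = [min(sum((p - s) ** 2 for p, s in zip(pt, c)) for c in seeds)
                  for pt in points]
            r, acc = random.uniform(0, sum(d2)), 0.0
            for pt, w in zip(points, d2):
                acc += w
                if acc >= r:
                    seeds.append(pt)
                    break
        return seeds

    # Three well-separated toy clusters in 2-D.
    data = [(random.gauss(cx, 0.3), random.gauss(cy, 0.3))
            for cx, cy in [(0, 0), (5, 5), (0, 5)] for _ in range(50)]
    centers = kmeans_pp_seeds(data, k=3)
    ```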

  16. Investigation and performance tests of a new parallel plate ionization chamber with double sensitive volume for measuring diagnostic X-rays

    Energy Technology Data Exchange (ETDEWEB)

    Sharifi, B., E-mail: babak_sharifi88@yahoo.com [Graduate University of Advanced Technology, Kerman (Iran, Islamic Republic of); Zamani Zeinali, H. [Application of Radiation Research School, Nuclear Science and Technology Research Institute, AEOI, Karaj (Iran, Islamic Republic of); Soltani, J.; Negarestani, A. [Graduate University of Advanced Technology, Kerman (Iran, Islamic Republic of); Shahvar, A. [Application of Radiation Research School, Nuclear Science and Technology Research Institute, AEOI, Karaj (Iran, Islamic Republic of)

    2015-01-11

    Medical diagnostic equipment, like diagnostic radiology and mammography, requires a dosimeter with high accuracy for dosimetry of the diagnostic X-ray beam. Ionization chambers are suitable instruments for dosimetry of diagnostic-range X-ray beams because of their appropriate response and high reliability. This work introduces the design and fabrication of a new parallel plate ionization chamber with a PMMA body, graphite-coated PMMA windows (0.5 mm thick), and a graphite-foil central electrode (0.1 mm thick, density 0.7 g/cm³). This design improves upon the response characteristics of existing designs through the specific choice of materials as well as the appropriate size and arrangement of the ionization chamber components. The results of performance tests conducted at the Secondary Standard Dosimetry Laboratory in Karaj, Iran demonstrated the short- and long-term stability, the low leakage current, the low directional dependence, and the high ion collection efficiency of the design. Furthermore, FLUKA Monte Carlo simulations confirmed the low effect of the central electrode on the response of this new ionization chamber. The response characteristics of the parallel plate ionization chamber presented in this work make the instrument suitable for use as a standard dosimeter in laboratories.

  17. Using subjective percentiles and test data for estimating fragility functions

    International Nuclear Information System (INIS)

    George, L.L.; Mensing, R.W.

    1981-01-01

    Fragility functions are cumulative distribution functions (cdfs) of strengths at failure. They are needed for reliability analyses of systems such as power generation and transmission systems. Subjective opinions supplement sparse test data for estimating fragility functions. Often the opinions are opinions on the percentiles of the fragility function. Subjective percentiles are likely to be less biased than opinions on parameters of cdfs. Solutions to several problems in the estimation of fragility functions are found for subjective percentiles and test data. How subjective percentiles should be used to estimate subjective fragility functions, how subjective percentiles should be combined with test data, how fragility functions for several failure modes should be combined into a composite fragility function, and how inherent randomness and uncertainty due to lack of knowledge should be represented are considered. Subjective percentiles are treated as independent estimates of percentiles. The following are derived: least-squares parameter estimators for normal and lognormal cdfs, based on subjective percentiles (the method is applicable to any invertible cdf); a composite fragility function for combining several failure modes; estimators of variation within and between groups of experts for nonidentically distributed subjective percentiles; weighted least-squares estimators when subjective percentiles have higher variation at higher percents; and weighted least-squares and Bayes parameter estimators based on combining subjective percentiles and test data. 4 figures, 2 tables
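
    One of the building blocks described above, least-squares estimation of cdf parameters from subjective percentiles, can be illustrated by inverting the lognormal cdf so that elicited percentiles fall on a straight line in (z, ln x) coordinates. The sketch below is a minimal unweighted version under that assumption; it is not the report's weighted or Bayes estimator, and the example percentiles are hypothetical.

    ```python
    import math
    from statistics import NormalDist

    def fit_lognormal_from_percentiles(percentiles):
        """Least-squares fit of a lognormal cdf to elicited percentiles.
        Inverting F(x) = Phi((ln x - mu) / sigma) gives ln x_p = mu + sigma * z_p,
        so (mu, sigma) follow from an ordinary least-squares line through (z_p, ln x_p)."""
        z = [NormalDist().inv_cdf(p) for p, _ in percentiles]
        y = [math.log(x) for _, x in percentiles]
        n = len(z)
        zbar, ybar = sum(z) / n, sum(y) / n
        sxx = sum((zi - zbar) ** 2 for zi in z)
        sxy = sum((zi - zbar) * (yi - ybar) for zi, yi in zip(z, y))
        sigma = sxy / sxx
        mu = ybar - sigma * zbar
        return mu, sigma

    # Hypothetical expert opinions: 10th, 50th, and 90th percentiles of failure strength.
    mu, sigma = fit_lognormal_from_percentiles([(0.10, 0.8), (0.50, 1.2), (0.90, 1.9)])
    ```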

  18. Parallel magnetic resonance imaging

    International Nuclear Information System (INIS)

    Larkman, David J; Nunes, Rita G

    2007-01-01

    Parallel imaging has been the single biggest innovation in magnetic resonance imaging in the last decade. The use of multiple receiver coils to augment the time consuming Fourier encoding has reduced acquisition times significantly. This increase in speed comes at a time when other approaches to acquisition time reduction were reaching engineering and human limits. A brief summary of spatial encoding in MRI is followed by an introduction to the problem parallel imaging is designed to solve. There are a large number of parallel reconstruction algorithms; this article reviews a cross-section, SENSE, SMASH, g-SMASH and GRAPPA, selected to demonstrate the different approaches. Theoretical (the g-factor) and practical (coil design) limits to acquisition speed are reviewed. The practical implementation of parallel imaging is also discussed, in particular coil calibration. How to recognize potential failure modes and their associated artefacts are shown. Well-established applications including angiography, cardiac imaging and applications using echo planar imaging are reviewed and we discuss what makes a good application for parallel imaging. Finally, active research areas where parallel imaging is being used to improve data quality by repairing artefacted images are also reviewed. (invited topical review)

  19. Experiences in Data-Parallel Programming

    Directory of Open Access Journals (Sweden)

    Terry W. Clark

    1997-01-01

    Full Text Available To efficiently parallelize a scientific application with a data-parallel compiler requires certain structural properties in the source program, and conversely, the absence of others. A recent parallelization effort of ours reinforced this observation and motivated this correspondence. Specifically, we have transformed a Fortran 77 version of GROMOS, a popular dusty-deck program for molecular dynamics, into Fortran D, a data-parallel dialect of Fortran. During this transformation we have encountered a number of difficulties that probably are neither limited to this particular application nor do they seem likely to be addressed by improved compiler technology in the near future. Our experience with GROMOS suggests a number of points to keep in mind when developing software that may at some time in its life cycle be parallelized with a data-parallel compiler. This note presents some guidelines for engineering data-parallel applications that are compatible with Fortran D or High Performance Fortran compilers.

  20. Random number generators for large-scale parallel Monte Carlo simulations on FPGA

    Science.gov (United States)

    Lin, Y.; Wang, F.; Liu, B.

    2018-05-01

    Through parallelization, a field programmable gate array (FPGA) can achieve unprecedented speeds in large-scale parallel Monte Carlo (LPMC) simulations. FPGA presents both new constraints and new opportunities for the implementation of random number generators (RNGs), which are key elements of any Monte Carlo (MC) simulation system. Using empirical and application-based tests, this study evaluates all four RNGs used in previous FPGA-based MC studies and newly proposed FPGA implementations for two well-known high-quality RNGs that are suitable for LPMC studies on FPGA. One of the newly proposed FPGA implementations, a parallel version of the additive lagged Fibonacci generator (Parallel ALFG), is found to be the best among the evaluated RNGs in fulfilling the needs of LPMC simulations on FPGA.
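
    For reference, the additive lagged Fibonacci recurrence behind the Parallel ALFG mentioned above is x_n = (x_{n-j} + x_{n-k}) mod 2^m. The sketch below is a plain software version of that recurrence only; the lag pair, word size, and seeding are illustrative assumptions, and the FPGA-specific parallel seeding scheme from the study is not shown.

    ```python
    import random

    class AdditiveLaggedFibonacci:
        """x_n = (x_{n-j} + x_{n-k}) mod 2**m over a circular buffer of length k."""
        def __init__(self, seed_words, j=5, k=17, m=32):
            # At least one odd seed word is required for the maximal period mod 2**m.
            assert len(seed_words) == k and any(w % 2 for w in seed_words)
            self.state, self.j, self.k = list(seed_words), j, k
            self.mask, self.pos = (1 << m) - 1, 0

        def next(self):
            j, k, pos = self.j, self.k, self.pos
            # state[pos] holds x_{n-k}; the other tap sits j positions behind x_n.
            new = (self.state[pos] + self.state[(pos + k - j) % k]) & self.mask
            self.state[pos] = new
            self.pos = (pos + 1) % k
            return new

    # Seed 17 words, forcing the first one odd; draw a few 32-bit samples.
    rng = AdditiveLaggedFibonacci([random.getrandbits(32) | (i == 0) for i in range(17)])
    sample = [rng.next() for _ in range(5)]
    ```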

  1. “The Language of Gods”: The Pragmatics of Bilingual Parallelism in Ritual Ch’orti’ Maya Discourse

    Directory of Open Access Journals (Sweden)

    Kerry Hull

    2017-10-01

    Full Text Available In this study I investigate the discursive function of parallelism in the ritual speech of Ch’orti’ Maya. Specifically, I examine the exploitation of the dual lexicons of Ch’orti’ Mayan and Spanish in the production of parallel structures. Ch’orti’ ritual speech is almost universally constructed in parallelistic fashion, accomplishing at once a near hypnotic cadence when performed, while also serving various pragmatic functions. I detail the dynamic breadth of what I refer to as bilingual parallelism, i.e., parallelism that involves the pairing of synonymous terms from different languages in a distich. The effective use of parallelistic speech is said by the Ch’orti’ to be an imitation of the speech patterns of the gods themselves, thereby further explaining its importance in ceremonial contexts when speaking to gods and otherworld beings.

  2. A Powerful Test for Comparing Multiple Regression Functions.

    Science.gov (United States)

    Maity, Arnab

    2012-09-01

    In this article, we address the important problem of comparison of two or more population regression functions. Recently, Pardo-Fernández, Van Keilegom and González-Manteiga (2007) developed test statistics for simple nonparametric regression models: Y(ij) = θ(j)(Z(ij)) + σ(j)(Z(ij))∊(ij), based on empirical distributions of the errors in each population j = 1, … , J. In this paper, we propose a test for equality of the θ(j)(·) based on the concept of generalized likelihood ratio type statistics. We also generalize our test for other nonparametric regression setups, e.g, nonparametric logistic regression, where the loglikelihood for population j is any general smooth function [Formula: see text]. We describe a resampling procedure to obtain the critical values of the test. In addition, we present a simulation study to evaluate the performance of the proposed test and compare our results to those in Pardo-Fernández et al. (2007).
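
    As a rough illustration of the kind of comparison discussed above (and not the generalized likelihood ratio statistic or the paper's resampling scheme), the sketch below contrasts two groups' kernel-smoothed regression curves with an L2-type statistic and calibrates it by permuting group labels; the bandwidth, grid, and toy data are arbitrary assumptions.

    ```python
    import numpy as np

    def smooth(z_grid, z, y, h=0.3):
        """Nadaraya-Watson kernel estimate of E[Y | Z = z] on a grid."""
        w = np.exp(-0.5 * ((z_grid[:, None] - z[None, :]) / h) ** 2)
        return (w @ y) / w.sum(axis=1)

    def l2_test(z1, y1, z2, y2, n_perm=999, seed=0):
        """Permutation analogue of a two-sample test for equal regression curves:
        the statistic is the L2 distance between the two smoothed estimates."""
        rng = np.random.default_rng(seed)
        grid = np.linspace(min(z1.min(), z2.min()), max(z1.max(), z2.max()), 100)
        stat = np.mean((smooth(grid, z1, y1) - smooth(grid, z2, y2)) ** 2)
        z, y, n1 = np.concatenate([z1, z2]), np.concatenate([y1, y2]), len(z1)
        null = []
        for _ in range(n_perm):
            idx = rng.permutation(len(z))
            a, b = idx[:n1], idx[n1:]
            null.append(np.mean((smooth(grid, z[a], y[a]) - smooth(grid, z[b], y[b])) ** 2))
        return stat, (1 + sum(s >= stat for s in null)) / (n_perm + 1)

    # Two samples with the same mean curve: a large p-value is expected.
    rng0 = np.random.default_rng(42)
    z1 = rng0.uniform(0, 1, 80); y1 = np.sin(2 * np.pi * z1) + rng0.normal(0, 0.2, 80)
    z2 = rng0.uniform(0, 1, 80); y2 = np.sin(2 * np.pi * z2) + rng0.normal(0, 0.2, 80)
    print(l2_test(z1, y1, z2, y2))
    ```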

  3. Non-Cartesian parallel imaging reconstruction.

    Science.gov (United States)

    Wright, Katherine L; Hamilton, Jesse I; Griswold, Mark A; Gulani, Vikas; Seiberlich, Nicole

    2014-11-01

    Non-Cartesian parallel imaging has played an important role in reducing data acquisition time in MRI. The use of non-Cartesian trajectories can enable more efficient coverage of k-space, which can be leveraged to reduce scan times. These trajectories can be undersampled to achieve even faster scan times, but the resulting images may contain aliasing artifacts. Just as Cartesian parallel imaging can be used to reconstruct images from undersampled Cartesian data, non-Cartesian parallel imaging methods can mitigate aliasing artifacts by using additional spatial encoding information in the form of the nonhomogeneous sensitivities of multi-coil phased arrays. This review will begin with an overview of non-Cartesian k-space trajectories and their sampling properties, followed by an in-depth discussion of several selected non-Cartesian parallel imaging algorithms. Three representative non-Cartesian parallel imaging methods will be described, including Conjugate Gradient SENSE (CG SENSE), non-Cartesian generalized autocalibrating partially parallel acquisition (GRAPPA), and Iterative Self-Consistent Parallel Imaging Reconstruction (SPIRiT). After a discussion of these three techniques, several potential promising clinical applications of non-Cartesian parallel imaging will be covered. © 2014 Wiley Periodicals, Inc.

  4. Influence of Paralleling Dies and Paralleling Half-Bridges on Transient Current Distribution in Multichip Power Modules

    DEFF Research Database (Denmark)

    Li, Helong; Zhou, Wei; Wang, Xiongfei

    2018-01-01

    This paper addresses the transient current distribution in the multichip half-bridge power modules, where two types of paralleling connections with different current commutation mechanisms are considered: paralleling dies and paralleling half-bridges. It reveals that with paralleling dies, both t...

  5. Hybrid parallel computing architecture for multiview phase shifting

    Science.gov (United States)

    Zhong, Kai; Li, Zhongwei; Zhou, Xiaohui; Shi, Yusheng; Wang, Congjun

    2014-11-01

    The multiview phase-shifting method shows its powerful capability in achieving high-resolution three-dimensional (3-D) shape measurement. Unfortunately, this ability results in very high computation costs, and 3-D computations have to be processed offline. To realize real-time 3-D shape measurement, a hybrid parallel computing architecture is proposed for multiview phase shifting. In this architecture, the central processing unit can co-operate with the graphics processing unit (GPU) to achieve hybrid parallel computing. The high computation cost procedures, including lens distortion rectification, phase computation, correspondence, and 3-D reconstruction, are implemented in the GPU, and a three-layer kernel function model is designed to simultaneously realize coarse-grained and fine-grained parallel computing. Experimental results verify that the developed system can perform 50 fps (frames per second) real-time 3-D measurement with 260 K 3-D points per frame. A speedup of up to 180 times is obtained for the performance of the proposed technique using an NVIDIA GT560Ti graphics card rather than a sequential C implementation on a 3.4 GHz Intel Core i7 3770.

  6. Fast electrostatic force calculation on parallel computer clusters

    International Nuclear Information System (INIS)

    Kia, Amirali; Kim, Daejoong; Darve, Eric

    2008-01-01

    The fast multipole method (FMM) and smooth particle mesh Ewald (SPME) are well-known fast algorithms to evaluate long-range electrostatic interactions in molecular dynamics and other fields. FMM is a multi-scale method which reduces the computation cost by approximating the potential due to a group of particles at a large distance using few multipole functions. This algorithm scales like O(N) for N particles. The SPME algorithm is an O(N ln N) method which is based on an interpolation of the Fourier space part of the Ewald sum and evaluating the resulting convolutions using fast Fourier transform (FFT). Those algorithms suffer from relatively poor efficiency on large parallel machines, especially for mid-size problems around hundreds of thousands of atoms. A variation of the FMM, called PWA, based on plane wave expansions is presented in this paper. A new parallelization strategy for PWA, which takes advantage of the specific form of this expansion, is described. Its parallel efficiency is compared with SPME through detailed time measurements on two different computer clusters.
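
    The core idea referred to above, approximating the potential of a distant group of particles with a few low-order terms, can be seen in the minimal sketch below, which compares a direct Coulomb-style sum against the lowest-order (monopole) far-field approximation. This only illustrates the far-field principle shared by FMM-type methods; it is not the PWA plane-wave expansion or its parallelization, and the data are random.

    ```python
    import numpy as np

    rng = np.random.default_rng(1)
    charges = rng.uniform(0.5, 1.5, 100)                  # a far-away cluster of charges
    positions = rng.normal(loc=[50.0, 0.0, 0.0], scale=1.0, size=(100, 3))
    target = np.zeros(3)                                  # evaluation point near the origin

    # Direct sum: O(N) work per target, the cost fast methods avoid for far clusters.
    exact = np.sum(charges / np.linalg.norm(positions - target, axis=1))

    # Far-field (monopole) approximation: the whole cluster acts as one effective charge
    # placed at its charge-weighted centre; higher multipole terms refine this further.
    center = np.average(positions, axis=0, weights=charges)
    approx = charges.sum() / np.linalg.norm(center - target)

    print(exact, approx)   # the two values agree closely for this well-separated cluster
    ```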

  7. Development of a battery of functional tests for low vision.

    Science.gov (United States)

    Dougherty, Bradley E; Martin, Scott R; Kelly, Corey B; Jones, Lisa A; Raasch, Thomas W; Bullimore, Mark A

    2009-08-01

    We describe the development and evaluation of a battery of tests of functional visual performance on everyday tasks, intended to be suitable for the assessment of low vision patients. The functional test battery comprises the following: Reading rate: reading aloud 20 unrelated words for each of four print sizes (8, 4, 2, & 1 M); Telephone book: finding a name and reading the telephone number; Medicine bottle label: reading the name and dosing; Utility bill: reading the due date and amount due; Cooking instructions: reading the cooking time on a food package; Coin sorting: making a specified amount from coins placed on a table; Playing card recognition: identifying denomination and suit; and Face recognition: identifying expressions of printed, life-size faces at 1 and 3 m. All tests were timed except face and playing card recognition. Fourteen normally sighted and 24 low vision subjects were assessed with the functional test battery. Visual acuity, contrast sensitivity, and quality of life (National Eye Institute Visual Function Questionnaire 25 [NEI-VFQ 25]) were measured and the functional tests repeated. Subsequently, 23 low vision patients participated in a pilot randomized clinical trial, with half receiving low vision rehabilitation and half a delayed intervention. The functional tests were administered at enrollment and 3 months later. Normally sighted subjects could perform all tasks, but the proportion of trials performed correctly by the low vision subjects ranged from 35% for face recognition at 3 m to 95% for playing card identification. On average, low vision subjects performed three times slower than the normally sighted subjects. Timed tasks with a visual search component showed poorer repeatability. In the pilot clinical trial, low vision rehabilitation produced the greatest improvement for the medicine bottle and cooking instruction tasks. The performance of patients on these functional tests has been assessed. Some appear responsive to low vision rehabilitation.

  8. A test of conformal invariance: Correlation functions on a disk

    International Nuclear Information System (INIS)

    Badke, R.; Rittenberg, V.; Ruegg, H.

    1985-06-01

    Using conformal invariance one can derive the correlation functions of a disk from those in the half-plane. The correlation function in the half-plane is determined by the 'small' conformal invariance up to an unknown function of one variable. By measuring through the Monte Carlo method the correlation function for two different configurations, the unknown function can be eliminated and one obtains a test of conformal invariance. It is shown that the Ising and the three state Potts model pass the test for very small lattices. (orig.)

  9. Significance tests for functional data with complex dependence structure

    KAUST Repository

    Staicu, Ana-Maria; Lahiri, Soumen N.; Carroll, Raymond J.

    2015-01-01

    We propose an L (2)-norm based global testing procedure for the null hypothesis that multiple group mean functions are equal, for functional data with complex dependence structure. Specifically, we consider the setting of functional data with a

  10. Design of a planar 3-DOF parallel micromanipulator

    International Nuclear Information System (INIS)

    Lee, Jeong Jae; Dong, Yanlu; Jeon, Yong Ho; Lee, Moon Gu

    2013-01-01

    A planar three degree-of-freedom (DOF) parallel manipulator is proposed to be applied for alignment during assembly of microcomponents. It adopts a PRR (prismatic-revolute-revolute) mechanism to meet the requirements of high precision for assembly and robustness against disturbance. The mechanism was designed to have a large workspace and good dexterity because parallel mechanisms usually have a narrow range and singularity of motion compared to serial mechanisms. Inverse kinematics and a simple closed-loop algorithm of the parallel manipulator are presented to control it. Experimental tests have been carried out with high-resolution capacitance sensors to verify the performance of the mechanism. The results of the experiments show that the manipulator has a large workspace of ±1.0 mm, ±1.0 mm, and ±10 mrad in the X-, Y-, and θ-directions, respectively. This is a large workspace considering that it adopts a parallel mechanism and has a small size, 100 × 100 × 100 mm³. It also has a good precision of 2 μm, 3 μm, and 0.2 mrad in the X-, Y-, and θ-axes, respectively. These are high resolutions considering that the manipulator adopts conventional joints. The manipulator is expected to have good dexterity.

  11. Optimization Solutions for Improving the Performance of the Parallel Reduction Algorithm Using Graphics Processing Units

    Directory of Open Access Journals (Sweden)

    Ion LUNGU

    2012-01-01

    Full Text Available In this paper, we research, analyze and develop optimization solutions for the parallel reduction function using graphics processing units (GPUs) that implement the Compute Unified Device Architecture (CUDA), a modern and novel approach for improving the software performance of data processing applications and algorithms. Many of these applications and algorithms make use of the reduction function in their computational steps. After having designed the function and its algorithmic steps in CUDA, we have progressively developed and implemented optimization solutions for the reduction function. In order to confirm, test and evaluate the solutions' efficiency, we have developed a custom-tailored benchmark suite. We have analyzed the obtained experimental results regarding: the comparison of the execution time and bandwidth when using graphics processing units covering the main CUDA architectures (Tesla GT200, Fermi GF100, Kepler GK104) and a central processing unit; the influence of the data type; and the influence of the binary operator.
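
    The record above optimizes the classic GPU reduction pattern. The sketch below shows only that underlying pairwise (tree) reduction idea in NumPy, with each logarithmic step folding the upper half of the array onto the lower half; the CUDA-specific optimizations the paper benchmarks (shared memory layouts, warp-level unrolling, occupancy tuning) are not represented.

    ```python
    import numpy as np

    def tree_reduce(values, op=np.add):
        """Pairwise (tree) reduction in O(log n) steps, the pattern a CUDA block
        implements in shared memory: each step combines independent element pairs,
        so all operations within a step can run in parallel."""
        data = np.asarray(values, dtype=np.float64).copy()
        n = data.size
        while n > 1:
            half = (n + 1) // 2
            # Fold the upper half onto the lower half; a possible middle element
            # (odd n) is simply carried over to the next step.
            data[:n - half] = op(data[:n - half], data[half:n])
            n = half
        return data[0]

    assert np.isclose(tree_reduce(np.arange(1000)), np.arange(1000).sum())
    ```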

  12. Parallel Architectures and Parallel Algorithms for Integrated Vision Systems. Ph.D. Thesis

    Science.gov (United States)

    Choudhary, Alok Nidhi

    1989-01-01

    Computer vision is regarded as one of the most complex and computationally intensive problems. An integrated vision system (IVS) is a system that uses vision algorithms from all levels of processing to perform for a high level application (e.g., object recognition). An IVS normally involves algorithms from low level, intermediate level, and high level vision. Designing parallel architectures for vision systems is of tremendous interest to researchers. Several issues are addressed in parallel architectures and parallel algorithms for integrated vision systems.

  13. Pattern-Driven Automatic Parallelization

    Directory of Open Access Journals (Sweden)

    Christoph W. Kessler

    1996-01-01

    Full Text Available This article describes a knowledge-based system for automatic parallelization of a wide class of sequential numerical codes operating on vectors and dense matrices, and for execution on distributed memory message-passing multiprocessors. Its main feature is a fast and powerful pattern recognition tool that locally identifies frequently occurring computations and programming concepts in the source code. This tool also works for dusty deck codes that have been "encrypted" by former machine-specific code transformations. Successful pattern recognition guides sophisticated code transformations including local algorithm replacement such that the parallelized code need not emerge from the sequential program structure by just parallelizing the loops. It allows access to an expert's knowledge on useful parallel algorithms, available machine-specific library routines, and powerful program transformations. The partially restored program semantics also supports local array alignment, distribution, and redistribution, and allows for faster and more exact prediction of the performance of the parallelized target code than is usually possible.

  14. Data communications in a parallel active messaging interface of a parallel computer

    Science.gov (United States)

    Archer, Charles J; Blocksome, Michael A; Ratterman, Joseph D; Smith, Brian E

    2013-10-29

    Data communications in a parallel active messaging interface (`PAMI`) of a parallel computer, the parallel computer including a plurality of compute nodes that execute a parallel application, the PAMI composed of data communications endpoints, each endpoint including a specification of data communications parameters for a thread of execution on a compute node, including specifications of a client, a context, and a task, the compute nodes and the endpoints coupled for data communications through the PAMI and through data communications resources, including receiving in an origin endpoint of the PAMI a data communications instruction, the instruction characterized by an instruction type, the instruction specifying a transmission of transfer data from the origin endpoint to a target endpoint and transmitting, in accordance with the instruction type, the transfer data from the origin endpoint to the target endpoint.

  15. The STAPL Parallel Graph Library

    KAUST Repository

    Harshvardhan,; Fidel, Adam; Amato, Nancy M.; Rauchwerger, Lawrence

    2013-01-01

    This paper describes the stapl Parallel Graph Library, a high-level framework that abstracts the user from data-distribution and parallelism details and allows them to concentrate on parallel graph algorithm development. It includes a customizable

  16. Evaluation of thyroid function tests in non-thyroidal illness

    International Nuclear Information System (INIS)

    Schutte, D.P.

    1988-01-01

    Normal thyroid physiology and pathophysiology with reference to non-thyroidal illness (NTI) is reviewed, including specific disease states and drugs and their effect on thyroid function tests. The diagnostic utility of two new highly sensitive thyrotrophin (TSH) assays as screening tests for thyroid dysfunction is evaluated and compared with conventional thyroid function assays. A group of 40 patients with NTI was studied. This group was compared to a group of normal controls and a group of thyrotoxic patients. Conventional thyroid function tests yielded many values outside the reference range in the NTI group. The general pattern that emerged was decreased total triiodothyronine levels in 70% of NTI patients, normal to low thyroxine values, increased mean free thyroxine values (dialysis), low mean values for the free thyroxine index, and varying results for newer commercial assays for free thyroxine according to methodology. The TSH response to intravenous thyroliberin (TRH) was found to be blunted compared to controls. Basal TSH levels were measured with two ultrasensitive TSH assays. The immunoradiometric assay yielded fewer values outside the reference range in the NTI group than conventional thyroid function tests. This assay yielded undetectable basal TSH levels in all thyrotoxic patients and could reliably separate thyrotoxic patients from the NTI group. Basal TSH levels with ultrasensitive TSH assays correlated well with the TSH response to TRH and could obviate the need for TRH tests. Ultrasensitive TSH assays are promising first-line screening tests in NTI. 120 refs., 13 figs., 7 tabs

  17. Overview of Parallel Platforms for Common High Performance Computing

    Directory of Open Access Journals (Sweden)

    T. Fryza

    2012-04-01

    Full Text Available The paper deals with various parallel platforms used for high performance computing in the signal processing domain. More precisely, methods exploiting multicore central processing units, such as the message passing interface and OpenMP, are taken into account. The properties of the programming methods are experimentally proved in the application of a fast Fourier transform and a discrete cosine transform, and they are compared with the possibilities of MATLAB's built-in functions and of Texas Instruments digital signal processors with very long instruction word architectures. New FFT and DCT implementations were proposed and tested. The implementation phase was compared with CPU-based computing methods and with the possibilities of the Texas Instruments digital signal processing library on C6747 floating-point DSPs. The optimal combination of computing methods in the signal processing domain and the implementation of new, fast routines are proposed as well.

  18. The 1-min sit-to-stand test--A simple functional capacity test in cystic fibrosis?

    Science.gov (United States)

    Radtke, Thomas; Puhan, Milo A; Hebestreit, Helge; Kriemler, Susi

    2016-03-01

    We aimed to assess the measurement properties and the minimal important difference (MID) of the 1-min sit-to-stand (STS) test in cystic fibrosis (CF). Patients with CF were tested during a pulmonary rehabilitation program. Five STS tests were performed during the program: two tests at the beginning (STS0 and STS1) and three tests at the end (STS2a-2c). Exercise capacity, pulmonary function, health-related quality of life (HRQoL), and patient-reported health status were measured at the beginning and end of the program. We calculated the overall mean, standard deviation, coefficient of variation (CV), and intraclass correlation coefficient (ICC) of the STS test. The MID was calculated using anchor-based and distributional methods. Fourteen participants (8 female, mean age 30.4±6.1 years) were included. STS test performance increased significantly from STS0 to STS1, indicative of a learning effect. Test-retest reliability for the subsequent STS2a-2c tests was excellent (ICC 0.98, 95% CI 0.96-0.99). The estimated MID for the STS test was 5 repetitions. STS test performance was responsive to change (effect size of 0.97) and correlated with exercise capacity (r=0.63-0.73) and with the physical functioning HRQoL scale (r=0.72). The 1-min STS test appears to be a reliable, valid, and feasible test to measure functional capacity in patients with CF. Copyright © 2015 European Cystic Fibrosis Society. Published by Elsevier B.V. All rights reserved.
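
    The distributional MID methods mentioned above are commonly based on simple spread statistics, such as half the baseline standard deviation or the standard error of measurement derived from the ICC. The sketch below computes both under that assumption; the input numbers are purely illustrative and are not taken from the study.

    ```python
    import math

    def distribution_based_mid(baseline_sd, icc):
        """Two common distribution-based MID estimates: half the baseline SD and
        one standard error of measurement, SEM = SD * sqrt(1 - ICC)."""
        sem = baseline_sd * math.sqrt(1 - icc)
        return {"half_sd": 0.5 * baseline_sd, "sem": sem}

    # Illustrative numbers only (not the study's data).
    print(distribution_based_mid(baseline_sd=6.0, icc=0.98))
    ```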

  19. Parallelism and array processing

    International Nuclear Information System (INIS)

    Zacharov, V.

    1983-01-01

    Modern computing, as well as the historical development of computing, has been dominated by sequential monoprocessing. Yet there is the alternative of parallelism, where several processes may be in concurrent execution. This alternative is discussed in a series of lectures, in which the main developments involving parallelism are considered, both from the standpoint of computing systems and that of applications that can exploit such systems. The lectures seek to discuss parallelism in a historical context, and to identify all the main aspects of concurrency in computation right up to the present time. Included will be consideration of the important question as to what use parallelism might be in the field of data processing. (orig.)

  20. Functional Task Test: 3. Skeletal Muscle Performance Adaptations to Space Flight

    Science.gov (United States)

    Ryder, Jeffrey W.; Wickwire, P. J.; Buxton, R. E.; Bloomberg, J. J.; Ploutz-Snyder, L.

    2011-01-01

    The functional task test is a multi-disciplinary study investigating how space-flight induced changes to physiological systems impacts functional task performance. Impairment of neuromuscular function would be expected to negatively affect functional performance of crewmembers following exposure to microgravity. This presentation reports the results for muscle performance testing in crewmembers. Functional task performance will be presented in the abstract "Functional Task Test 1: sensory motor adaptations associated with postflight alternations in astronaut functional task performance." METHODS: Muscle performance measures were obtained in crewmembers before and after short-duration space flight aboard the Space Shuttle and long-duration International Space Station (ISS) missions. The battery of muscle performance tests included leg press and bench press measures of isometric force, isotonic power and total work. Knee extension was used for the measurement of central activation and maximal isometric force. Upper and lower body force steadiness control were measured on the bench press and knee extension machine, respectively. Tests were implemented 60 and 30 days before launch, on landing day (Shuttle crew only), and 6, 10 and 30 days after landing. Seven Space Shuttle crew and four ISS crew have completed the muscle performance testing to date. RESULTS: Preliminary results for Space Shuttle crew reveal significant reductions in the leg press performance metrics of maximal isometric force, power and total work on R+0 (pperformance metrics were observed in returning Shuttle crew and these adaptations are likely contributors to impaired functional tasks that are ambulatory in nature (See abstract Functional Task Test: 1). Interestingly, no significant changes in central activation capacity were detected. Therefore, impairments in muscle function in response to short-duration space flight are likely myocellular rather than neuromotor in nature.

  1. Test Protocols for Advanced Inverter Interoperability Functions - Appendices

    Energy Technology Data Exchange (ETDEWEB)

    Johnson, Jay Dean [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Gonzalez, Sigifredo [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Ralph, Mark E. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Ellis, Abraham [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Broderick, Robert Joseph [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)

    2013-11-01

    Distributed energy resources (DER) such as photovoltaic (PV) systems, when deployed in a large scale, are capable of influencing significantly the operation of power systems. Looking to the future, stakeholders are working on standards to make it possible to manage the potentially complex interactions between DER and the power system. In 2009, the Electric Power Research Institute (EPRI), Sandia National Laboratories (SNL) with the U.S. Department of Energy (DOE), and the Solar Electric Power Association (SEPA) initiated a large industry collaborative to identify and standardize definitions for a set of DER grid support functions. While the initial effort concentrated on grid-tied PV inverters and energy storage systems, the concepts have applicability to all DER. A partial product of this on-going effort is a reference definitions document (IEC TR 61850-90-7, Object models for power converters in distributed energy resources (DER) systems) that has become a basis for expansion of related International Electrotechnical Commission (IEC) standards, and is supported by US National Institute of Standards and Technology (NIST) Smart Grid Interoperability Panel (SGIP). Some industry-led organizations advancing communications protocols have also embraced this work. As standards continue to evolve, it is necessary to develop test protocols to independently verify that the inverters are properly executing the advanced functions. Interoperability is assured by establishing common definitions for the functions and a method to test compliance with operational requirements. This document describes test protocols developed by SNL to evaluate the electrical performance and operational capabilities of PV inverters and energy storage, as described in IEC TR 61850-90-7. While many of these functions are not now required by existing grid codes or may not be widely available commercially, the industry is rapidly moving in that direction. Interoperability issues are already apparent as

  2. General-purpose parallel simulator for quantum computing

    International Nuclear Information System (INIS)

    Niwa, Jumpei; Matsumoto, Keiji; Imai, Hiroshi

    2002-01-01

    With current technologies, it seems to be very difficult to implement quantum computers with many qubits. It is therefore of importance to simulate quantum algorithms and circuits on the existing computers. However, for a large-size problem, the simulation often requires more computational power than is available from sequential processing. Therefore, simulation methods for parallel processors are required. We have developed a general-purpose simulator for quantum algorithms/circuits on the parallel computer (Sun Enterprise4500). It can simulate algorithms/circuits with up to 30 qubits. In order to test efficiency of our proposed methods, we have simulated Shor's factorization algorithm and Grover's database search, and we have analyzed robustness of the corresponding quantum circuits in the presence of both decoherence and operational errors. The corresponding results, statistics, and analyses are presented in this paper
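
    As a minimal picture of what such a simulator evaluates (not the authors' parallelization, Shor, or Grover circuits), the sketch below applies single-qubit gates to a dense state vector; it is this 2^n-amplitude update work that grows quickly with qubit count and motivates parallel processing.

    ```python
    import numpy as np

    def apply_gate(state, gate, target, n_qubits):
        """Apply a 2x2 gate to one qubit of a 2**n state vector by reshaping the
        vector so the target qubit becomes a tensor axis and contracting it."""
        psi = state.reshape([2] * n_qubits)
        psi = np.moveaxis(psi, target, 0)
        psi = np.tensordot(gate, psi, axes=([1], [0]))
        return np.moveaxis(psi, 0, target).reshape(-1)

    n = 3
    state = np.zeros(2 ** n, dtype=complex)
    state[0] = 1.0                                   # start in |000>
    hadamard = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
    for q in range(n):                               # uniform superposition over 8 states
        state = apply_gate(state, hadamard, q, n)
    print(np.round(np.abs(state) ** 2, 3))           # each basis state has probability 1/8
    ```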

  3. A physical function test for use in the intensive care unit: validity, responsiveness, and predictive utility of the physical function ICU test (scored).

    Science.gov (United States)

    Denehy, Linda; de Morton, Natalie A; Skinner, Elizabeth H; Edbrooke, Lara; Haines, Kimberley; Warrillow, Stephen; Berney, Sue

    2013-12-01

    Several tests have recently been developed to measure changes in patient strength and functional outcomes in the intensive care unit (ICU). The original Physical Function ICU Test (PFIT) demonstrates reliability and sensitivity. The aims of this study were to further develop the original PFIT, to derive an interval score (the PFIT-s), and to test the clinimetric properties of the PFIT-s. A nested cohort study was conducted. One hundred forty-four and 116 participants performed the PFIT at ICU admission and discharge, respectively. Original test components were modified using principal component analysis. Rasch analysis examined the unidimensionality of the PFIT, and an interval score was derived. Correlations tested validity, and multiple regression analyses investigated predictive ability. Responsiveness was assessed using the effect size index (ESI), and the minimal clinically important difference (MCID) was calculated. The shoulder lift component was removed. Unidimensionality of combined admission and discharge PFIT-s scores was confirmed. The PFIT-s displayed moderate convergent validity with the Timed "Up & Go" Test (r=-.60), the Six-Minute Walk Test (r=.41), and the Medical Research Council (MRC) sum score (rho=.49). The ESI of the PFIT-s was 0.82, and the MCID was 1.5 points (interval scale range=0-10). A higher admission PFIT-s score was predictive of: an MRC score of ≥48, increased likelihood of discharge home, reduced likelihood of discharge to inpatient rehabilitation, and reduced acute care hospital length of stay. Scoring of sit-to-stand assistance required is subjective, and cadence cutpoints used may not be generalizable. The PFIT-s is a safe and inexpensive test of physical function with high clinical utility. It is valid, responsive to change, and predictive of key outcomes. It is recommended that the PFIT-s be adopted to test physical function in the ICU.

  4. Liver function tests using the stable isotope /sup 15/N

    Energy Technology Data Exchange (ETDEWEB)

    Faust, H; Jung, K; Hirschberg, K; Krumbiegel, P; Junghans, P; Reinhardt, R; Matkowitz, R; Teichmann, B

    1988-01-01

    Several liver function tests using oral application of a nitrogen compound labelled with /sup 15/N and the subsequent determination of /sup 15/N in a certain fraction of urine or in the total urine by emission spectrometry are described. Because of the key function of the liver in the metabolism of nitrogen compounds, the results of these tests allow conclusions concerning some disturbances of liver functions.

  5. Experimental characterization of a binary actuated parallel manipulator

    Science.gov (United States)

    Giuseppe, Carbone

    2016-05-01

    This paper describes the BAPAMAN (Binary Actuated Parallel MANipulator) series of parallel manipulators that has been conceived at the Laboratory of Robotics and Mechatronics (LARM). The basic common characteristics of the BAPAMAN series are described. In particular, the use of a reduced number of active degrees of freedom and of design solutions with flexural joints and Shape Memory Alloy (SMA) actuators is outlined as a way of achieving miniaturization, cost reduction, and easy operation. Given the peculiarities of the BAPAMAN architecture, specific experimental tests have been proposed and carried out with the aim of validating the proposed design and evaluating the practical operation performance and characteristics of a built prototype, in particular in terms of operation and workspace characteristics.

  6. I and C functional test facility user guide

    International Nuclear Information System (INIS)

    Kwon, Ki Chun

    1996-07-01

    The objective of the I and C functional test facility (FTF) is to validate newly developed digital control and protection algorithms, alarm reduction algorithms, the functions of operator support systems, and so on. The test facility is divided into three major parts: software, hardware, and the graphic user interface. The software consists of a mathematical model which simulates a 3-loop, 993 MWe Westinghouse pressurized water reactor plant, a supervisory module which interprets user instructions, and a data interface program. The FTF is implemented on an HP747I workstation using FORTRAN77 and C under the UNIX operating system. This User Guide provides the file structure, instructions, and program modification method, and provides initial data, a malfunction list, a process variables list, and simulation diagrams as appendices for testing the developed prototype. 12 figs. (Author)

  7. I and C functional test facility user guide

    Energy Technology Data Exchange (ETDEWEB)

    Kwon, Ki Chun [Korea Atomic Energy Research Institute, Taejon (Korea, Republic of)

    1996-07-01

    The objective of the I and C functional test facility (FTF) is to validate newly developed digital control and protection algorithms, alarm reduction algorithms, the functions of operator support systems, and so on. The test facility is divided into three major parts: software, hardware, and the graphic user interface. The software consists of a mathematical model which simulates a 3-loop, 993 MWe Westinghouse pressurized water reactor plant, a supervisory module which interprets user instructions, and a data interface program. The FTF is implemented on an HP747I workstation using FORTRAN77 and C under the UNIX operating system. This User Guide provides the file structure, instructions, and program modification method, and provides initial data, a malfunction list, a process variables list, and simulation diagrams as appendices for testing the developed prototype. 12 figs. (Author).

  8. Parallel algorithms on the ASTRA SIMD machine

    International Nuclear Information System (INIS)

    Odor, G.; Rohrbach, F.; Vesztergombi, G.; Varga, G.; Tatrai, F.

    1996-01-01

    In view of the tremendous computing power jump of modern RISC processors, the interest in parallel computing seems to be thinning out. Why use a complicated system of parallel processors if the problem can be solved by a single powerful micro-chip? It is a general law, however, that exponential growth will always end in some kind of saturation, and then parallelism will again become a hot topic. We try to prepare ourselves for this eventuality. The MPPC project started in 1990 in the heyday of parallelism and produced four ASTRA machines (presented at CHEP '92) with 4k processors (expandable to 16k) based on yesterday's chip technology (chip presented at CHEP '91). These machines now provide excellent test-beds for algorithmic developments in a complete, real environment. We are developing, for example, fast pattern-recognition algorithms which could be used in high-energy physics experiments at the LHC (planned to be operational after 2004 at CERN) for triggering and data reduction. The basic feature of our ASP (Associate String Processor) approach is to use extremely simple (thus very cheap) processor elements but in huge quantities (up to millions of processors) connected together by a very simple string-like communication chain. In this paper we present powerful algorithms based on this architecture, indicating the performance perspectives if the hardware quality reaches present or even future technology levels. (author)

  9. Mathematical Abstraction: Constructing Concept of Parallel Coordinates

    Science.gov (United States)

    Nurhasanah, F.; Kusumah, Y. S.; Sabandar, J.; Suryadi, D.

    2017-09-01

    Mathematical abstraction is an important process in teaching and learning mathematics, so pre-service mathematics teachers need to understand and experience this process. One of the theoretical-methodological frameworks for studying this process is Abstraction in Context (AiC). Based on this framework, the abstraction process comprises the observable epistemic actions Recognition, Building-With, Construction, and Consolidation, referred to as the RBC + C model. This study investigates and analyzes how pre-service mathematics teachers constructed and consolidated the concept of Parallel Coordinates in a group discussion. It uses the AiC framework to analyze the mathematical abstraction of a group of pre-service teachers consisting of four students learning Parallel Coordinates concepts. The data were collected through video recording, students’ worksheets, a test, and field notes. The result shows that the students’ prior knowledge related to the concept of the Cartesian coordinate system played a significant role in the process of constructing the Parallel Coordinates concept as new knowledge. The consolidation process was influenced by the social interaction between group members. The abstraction process that took place in this group was dominated by empirical abstraction, which emphasizes the aspect of identifying characteristics of manipulated or imagined objects during the process of recognizing and building-with.

  10. Inter-dot coupling effects on transport through correlated parallel

    Indian Academy of Sciences (India)

    Transport through a symmetric parallel coupled quantum dot system has been studied using the non-equilibrium Green function formalism. Inter-dot tunnelling with on-dot and inter-dot Coulomb repulsion is included. The transmission coefficient and a Landauer–Büttiker-like current formula are shown in terms of internal states ...

  11. Parallel External Memory Graph Algorithms

    DEFF Research Database (Denmark)

    Arge, Lars Allan; Goodrich, Michael T.; Sitchinava, Nodari

    2010-01-01

    In this paper, we study parallel I/O efficient graph algorithms in the Parallel External Memory (PEM) model, one of the private-cache chip multiprocessor (CMP) models. We study the fundamental problem of list ranking which leads to efficient solutions to problems on trees, such as computing lowest... an optimal speedup of Θ(P) in parallel I/O complexity and parallel computation time, compared to the single-processor external memory counterparts.

  12. Compiling the parallel programming language NestStep to the CELL processor

    OpenAIRE

    Holm, Magnus

    2010-01-01

    The goal of this project is to create a source-to-source compiler which will translate NestStep code to C code. The compiler's job is to replace NestStep constructs with a series of function calls to the NestStep runtime system. NestStep is a parallel programming language extension based on the BSP model. It adds constructs for parallel programming on top of an imperative programming language. For this project, only constructs extending the C language are relevant. The output code will compil...

  13. Distributed analysis functional testing using GangaRobot in the ATLAS experiment

    Science.gov (United States)

    Legger, Federica; ATLAS Collaboration

    2011-12-01

    Automated distributed analysis tests are necessary to ensure smooth operations of the ATLAS grid resources. The HammerCloud framework allows for easy definition, submission and monitoring of grid test applications. Both functional and stress test applications can be defined in HammerCloud. Stress tests are large-scale tests meant to verify the behaviour of sites under heavy load. Functional tests are light user applications running at each site with high frequency, to ensure that the site functionalities are available at all times. Success or failure rates of these tests jobs are individually monitored. Test definitions and results are stored in a database and made available to users and site administrators through a web interface. In this work we present the recent developments of the GangaRobot framework. GangaRobot monitors the outcome of functional tests, creates a blacklist of sites failing the tests, and exports the results to the ATLAS Site Status Board (SSB) and to the Service Availability Monitor (SAM), providing on the one hand a fast way to identify systematic or temporary site failures, and on the other hand allowing for an effective distribution of the work load on the available resources.
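
    To make the blacklisting idea above concrete, the toy sketch below flags sites whose recent functional test jobs fail too often. It is only an illustration of the concept, not GangaRobot's actual logic, and all site names and thresholds are made up.

    ```python
    def blacklist(results, min_jobs=10, max_failure_rate=0.5):
        """Toy site blacklisting: 'results' maps site name -> list of booleans
        (True = functional test job succeeded). Sites with enough recent jobs and
        a failure rate above the threshold are flagged. Thresholds are illustrative."""
        flagged = []
        for site, outcomes in results.items():
            if len(outcomes) >= min_jobs:
                failure_rate = 1 - sum(outcomes) / len(outcomes)
                if failure_rate > max_failure_rate:
                    flagged.append(site)
        return sorted(flagged)

    # Hypothetical recent outcomes for two fictional sites.
    recent = {"SITE_A": [True] * 18 + [False] * 2,
              "SITE_B": [False] * 9 + [True] * 3}
    print(blacklist(recent))   # ['SITE_B']
    ```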

  14. Measurement of Function Post Hip Fracture: Testing a Comprehensive Measurement Model of Physical Function.

    Science.gov (United States)

    Resnick, Barbara; Gruber-Baldini, Ann L; Hicks, Gregory; Ostir, Glen; Klinedinst, N Jennifer; Orwig, Denise; Magaziner, Jay

    2016-07-01

    Measurement of physical function post hip fracture has been conceptualized using multiple different measures. This study tested a comprehensive measurement model of physical function. This was a descriptive secondary data analysis including 168 men and 171 women post hip fracture. Using structural equation modeling, a measurement model of physical function which included grip strength, activities of daily living, instrumental activities of daily living, and performance was tested for fit at 2 and 12 months post hip fracture, and among male and female participants. Validity of the measurement model of physical function was evaluated based on how well the model explained physical activity, exercise, and social activities post hip fracture. The measurement model of physical function fit the data. The amount of variance the model or individual factors of the model explained varied depending on the activity. Decisions about the ideal way in which to measure physical function should be based on outcomes considered and participants. The measurement model of physical function is a reliable and valid method to comprehensively measure physical function across the hip fracture recovery trajectory. © 2015 Association of Rehabilitation Nurses.

  15. Anisotropic behaviour of transmission through thin superconducting NbN film in parallel magnetic field

    Energy Technology Data Exchange (ETDEWEB)

    Šindler, M., E-mail: sindler@fzu.cz [Institute of Physics ASCR, v. v. i., Cukrovarnická 10, CZ-162 53 Praha 6 (Czech Republic); Tesař, R. [Institute of Physics ASCR, v. v. i., Cukrovarnická 10, CZ-162 53 Praha 6 (Czech Republic); Faculty of Mathematics and Physics, Charles University, Ke Karlovu 3, CZ-121 16 Praha (Czech Republic); Koláček, J. [Institute of Physics ASCR, v. v. i., Cukrovarnická 10, CZ-162 53 Praha 6 (Czech Republic); Skrbek, L. [Faculty of Mathematics and Physics, Charles University, Ke Karlovu 3, CZ-121 16 Praha (Czech Republic)

    2017-02-15

    Highlights: • Transmission through thin NbN film in parallel magnetic field exhibits strong anisotropic behaviour in the terahertz range. • Response for a polarisation parallel with the applied field is given as weighted sum of superconducting and normal state contributions. • Effective medium approach fails to describe response for linear polarisation perpendicular to the applied magnetic field. - Abstract: Transmission of terahertz waves through a thin layer of the superconductor NbN deposited on an anisotropic R-cut sapphire substrate is studied as a function of temperature in a magnetic field oriented parallel with the sample. A significant difference is found between transmitted intensities of beams linearly polarised parallel with and perpendicular to the direction of applied magnetic field.

  16. Local and Nonlocal Parallel Heat Transport in General Magnetic Fields

    International Nuclear Information System (INIS)

    Castillo-Negrete, D. del; Chacon, L.

    2011-01-01

    A novel approach for the study of parallel transport in magnetized plasmas is presented. The method avoids numerical pollution issues of grid-based formulations and applies to integrable and chaotic magnetic fields with local or nonlocal parallel closures. In weakly chaotic fields, the method gives the fractal structure of the devil's staircase radial temperature profile. In fully chaotic fields, the temperature exhibits self-similar spatiotemporal evolution with a stretched-exponential scaling function for local closures and an algebraically decaying one for nonlocal closures. It is shown that, for both closures, the effective radial heat transport is incompatible with the quasilinear diffusion model.

  17. Apparatus to examine pulsed parallel field losses in large conductors

    International Nuclear Information System (INIS)

    Miller, J.R.; Shen, S.S.

    1977-01-01

    Conductors in tokamak toroidal field coils will be exposed to pulsed fields both parallel and perpendicular to the current direction. These conductors will likely be quite high capacity (10 to 20 kA) and therefore probably will be built up out of smaller units. We have previously published measurements of losses in conductors exposed to a pulsed parallel field, but those experiments necessarily used monolithic conductors of relatively small cross section because the pulse coil, a torus that surrounded the test conductor, was itself small. Here we describe an apparatus that is conceptually similar but has been scaled up to accept conductors of much larger cross section and current capacity. The apparatus consists basically of a superconducting torus that contains a movable spool to allow test samples to be wound inside without unwinding the torus. Details of apparatus design and capabilities are described and preliminary results from tests of the apparatus and from loss measurements using it are reported

  18. Sensitivity and Specificity of Clinical and Laboratory Otolith Function Tests.

    Science.gov (United States)

    Kumar, Lokesh; Thakar, Alok; Thakur, Bhaskar; Sikka, Kapil

    2017-10-01

    To evaluate clinic-based and laboratory tests of otolith function for their sensitivity and specificity in demarcating unilateral compensated complete vestibular deficit from normal. Prospective cross-sectional study. Tertiary care hospital vestibular physiology laboratory. Control group: 30 healthy adults, 20-45 years of age; case group: 15 subjects post vestibular schwannoma excision or post-labyrinthectomy with compensated unilateral complete audio-vestibular loss. Otolith function was evaluated by precise clinical testing (head tilt test, HTT; subjective visual vertical, SVV) and laboratory testing (head roll-eye counterroll, HR-ECR; cervical vestibular evoked myogenic potentials, cVEMP). Outcome measures were the sensitivity and specificity of clinical and laboratory tests in differentiating case and control subjects. Measurable test results were universally obtained with the clinical otolith tests (SVV; HTT) but not with the laboratory tests. The HR-ECR test did not yield definitive waveforms in 10% of controls and 26% of cases. cVEMP responses were absent in 10% of controls. The HTT, with a normative cutoff at 2 degrees of deviation from vertical, was 93.33% sensitive and 100% specific. The SVV test, with a normative cutoff at 1.3 degrees, was 100% sensitive and 100% specific. The laboratory tests demonstrated poorer specificities, owing primarily to significant unresponsiveness in normal controls. Clinical otolith function tests, if conducted with precision, demonstrate greater ability than laboratory testing in discriminating normal controls from cases with unilateral complete compensated vestibular dysfunction.

  19. Effects of parallel planning on agreement production.

    Science.gov (United States)

    Veenstra, Alma; Meyer, Antje S; Acheson, Daniel J

    2015-11-01

    An important issue in current psycholinguistics is how the time course of utterance planning affects the generation of grammatical structures. The current study investigated the influence of parallel activation of the components of complex noun phrases on the generation of subject-verb agreement. Specifically, the lexical interference account (Gillespie & Pearlmutter, 2011b; Solomon & Pearlmutter, 2004) predicts more agreement errors (i.e., attraction) for subject phrases in which the head and local noun mismatch in number (e.g., the apple next to the pears) when nouns are planned in parallel than when they are planned in sequence. We used a speeded picture description task that yielded sentences such as the apple next to the pears is red. The objects mentioned in the noun phrase were either semantically related or unrelated. To induce agreement errors, pictures sometimes mismatched in number. In order to manipulate the likelihood of parallel processing of the objects and to test the hypothesized relationship between parallel processing and the rate of agreement errors, the pictures were either placed close together or far apart. Analyses of the participants' eye movements and speech onset latencies indicated slower processing of the first object and stronger interference from the related (compared to the unrelated) second object in the close than in the far condition. Analyses of the agreement errors yielded an attraction effect, with more errors in mismatching than in matching conditions. However, the magnitude of the attraction effect did not differ across the close and far conditions. Thus, spatial proximity encouraged parallel processing of the pictures, which led to interference of the associated conceptual and/or lexical representation, but, contrary to the prediction, it did not lead to more attraction errors. Copyright © 2015 Elsevier B.V. All rights reserved.

  20. Parallel inter channel interaction mechanisms

    International Nuclear Information System (INIS)

    Jovic, V.; Afgan, N.; Jovic, L.

    1995-01-01

    Parallel channel interactions are examined. Results of experimental research on nonstationary flow regimes in three parallel vertical channels are presented, together with an analysis of the phenomena and of the mechanisms of parallel channel interaction under adiabatic conditions for single-phase fluid and two-phase mixture flow. (author)

  1. CIT photoheliograph functional verification unit test program

    Science.gov (United States)

    1973-01-01

    Tests of the 2/3-meter photoheliograph functional verification unit (FVU) were performed with the FVU installed in its Big Bear Solar Observatory vacuum chamber. Interferometric tests were run both in Newtonian (f/3.85) and Gregorian (f/50) configurations. Tests were run in both configurations with optical axis horizontal, vertical, and at 45 deg to attempt to determine any gravity effects on the system. Gravity effects, if present, were masked by scatter in the data associated with the system wavefront error of 0.16λ rms (λ = 6328 Å) apparently due to problems in the primary mirror. Tests showed that the redesigned secondary mirror assembly works well.

  2. A study of parallelizing O(N) Green-function-based Monte Carlo method for many fermions coupled with classical degrees of freedom

    International Nuclear Information System (INIS)

    Zhang Shixun; Yamagia, Shinichi; Yunoki, Seiji

    2013-01-01

    Models of fermions interacting with classical degrees of freedom are applied to a large variety of systems in condensed matter physics. For this class of models, Weiße [Phys. Rev. Lett. 102, 150604 (2009)] has recently proposed a very efficient numerical method, called the O(N) Green-Function-Based Monte Carlo (GFMC) method, where a kernel polynomial expansion technique is used to avoid the full numerical diagonalization of the fermion Hamiltonian matrix of size N, which usually costs O(N^3) computational complexity. Motivated by this background, in this paper we apply the GFMC method to the double exchange model in three spatial dimensions. We mainly focus on the implementation of the GFMC method using both MPI on a CPU-based cluster and Nvidia's Compute Unified Device Architecture (CUDA) programming techniques on a GPU-based (Graphics Processing Unit based) cluster. The time complexity of the algorithm and the parallel implementation details on the clusters are discussed. We also show the performance scaling for increasing Hamiltonian matrix size and increasing number of nodes, respectively. The performance evaluation indicates that for a 32^3 Hamiltonian a single GPU shows performance equivalent to more than 30 CPU cores parallelized using MPI
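
    The key ingredient of the O(N) GFMC approach, the kernel polynomial expansion, can be sketched in a few lines (an illustration, not the authors' implementation): a matrix function applied to a vector is approximated by a Chebyshev recursion that needs only matrix-vector products, which is also the operation distributed over MPI ranks or CUDA threads in the paper. The toy Hamiltonian and the expansion coefficients below are invented, and the spectrum is rescaled into [-1, 1] with a Gershgorin bound.

    ```python
    import numpy as np

    def chebyshev_apply(H, v, coeffs):
        """Approximate f(H) @ v with a Chebyshev series, using only H @ vector
        products instead of a full O(N^3) diagonalization; H must be rescaled
        so that its eigenvalues lie in [-1, 1]."""
        t_prev, t_curr = v, H @ v                  # T_0(H) v and T_1(H) v
        acc = coeffs[0] * t_prev + coeffs[1] * t_curr
        for c in coeffs[2:]:
            t_prev, t_curr = t_curr, 2.0 * (H @ t_curr) - t_prev   # T_{n+1} = 2 H T_n - T_{n-1}
            acc += c * t_curr
        return acc

    rng = np.random.default_rng(0)
    H = rng.normal(size=(200, 200)); H = (H + H.T) / 2.0           # toy symmetric Hamiltonian
    H /= np.abs(H).sum(axis=1).max()                               # Gershgorin bound keeps spectrum in [-1, 1]
    v = rng.normal(size=200)
    coeffs = np.array([1.0, 0.5, 0.25, 0.125])                     # illustrative expansion coefficients
    print(chebyshev_apply(H, v, coeffs)[:3])
    ```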

  3. A new tool for investigating the functional testing of the VOR

    Directory of Open Access Journals (Sweden)

    Paolo eColagiorgio

    2013-10-01

    Full Text Available Peripheral vestibular function may be tested quantitatively, by measuring the gain of the angular vestibulo-ocular reflex (aVOR), or functionally, by assessing how well the aVOR performs with respect to its goal of stabilizing gaze in space and thus allowing visual information to be acquired during the head movement. In recent years, several groups have developed clinical and quantitative approaches to functional testing of the vestibular system based on the ability to identify an optotype briefly displayed on screen during head rotations. Although the proposed techniques differ in terms of the parameters controlling the testing paradigm, no study has thus far dealt with understanding the role of such choices in determining the effectiveness and reliability of the testing approach. Moreover, recent work has shown that peripheral vestibular patients may produce corrective saccades during the head movement (covert saccades), yet the role of these eye movements towards reading ability during head rotations is not yet understood. Finally, no study has thus far dealt with measuring the true performance of their experimental setups, which is nonetheless likely to be crucial information for understanding the effectiveness of functional testing approaches. Thus we propose a new software and hardware research tool allowing the combined measurement of eye and head movements, together with the timing of the optotype on screen, during functional testing of the VOR based on the Head Impulse Test (HIT). The goal of such a tool is therefore to allow functional testing of the VOR while collecting the experimental data necessary to understand, for instance, (a) the effectiveness of the covert saccade strategy towards image stabilization, (b) which experimental parameters are crucial for optimizing the diagnostic power of the functional testing approach, and (c) which conditions lead to a successful reading or an error trial.

  4. Quantum tests for the linearity and permutation invariance of Boolean functions

    Energy Technology Data Exchange (ETDEWEB)

    Hillery, Mark [Department of Physics, Hunter College of the City University of New York, 695 Park Avenue, New York, New York 10021 (United States); Andersson, Erika [SUPA, School of Engineering and Physical Sciences, Heriot-Watt University, Edinburgh EH14 4AS (United Kingdom)

    2011-12-15

    The goal in function property testing is to determine whether a black-box Boolean function has a certain property or is ε-far from having that property. The performance of the algorithm is judged by how many calls need to be made to the black box in order to determine, with high probability, which of the two alternatives is the case. Here we present two quantum algorithms, the first to determine whether the function is linear and the second to determine whether it is symmetric (invariant under permutations of the arguments). Both require on the order of ε^(-2/3) calls to the oracle, which is better than known classical algorithms. In addition, in the case of linearity testing, if the function is linear, the quantum algorithm identifies which linear function it is. The linearity test combines the Bernstein-Vazirani algorithm and amplitude amplification, while the test to determine whether a function is symmetric uses projective measurements and amplitude amplification.

  5. [Eosin Y-water test for sperm function examination].

    Science.gov (United States)

    Zha, Shu-wei; Lü, Nian-qing; Xu, Hao-qin

    2015-06-01

    Based on the principles of the in vitro staining technique, hypotonic swelling test, and water test, the Eosin Y-water test method was developed to simultaneously detect the integrity of the sperm head and tail and sperm membrane structure and function. As a widely used method in clinical laboratories in China, the Eosin Y-water test is methodologically characterized by three advantages. Firstly, both the sperm head and tail can be detected at the same time, which allows easy and comprehensive assessment of membrane damage in different parts of sperm. Secondly, distilled water is used instead of the usual formula solution to simplify and standardize the test by eliminating any potential effects on the water molecules through the sperm membrane due to different osmotic pressure or different sugar proportions and electrolyte solutions. Thirdly, the test takes less time and thus can be repeated before and after treatment. This article focuses on the fundamental principles and modification of the Eosin Y-water test and its application in sperm function examination and routine semen analysis for male infertility, assessment of the quality of sperm retrieved by testicular fine needle aspiration, semen cryopreservation program development, and evaluation of sperm membrane integrity after microwave radiation.

  6. An automated system for pulmonary function testing

    Science.gov (United States)

    Mauldin, D. G.

    1974-01-01

    An experiment to quantitate pulmonary function was accepted for the space shuttle concept verification test. The single breath maneuver and the nitrogen washout are combined to reduce the test time. Parameters are defined from the forced vital capacity maneuvers. A spirometer measures the breath volume and a magnetic sector mass spectrometer provides definition of gas composition. Mass spectrometer and spirometer data are analyzed by a PDP-81 digital computer.

  7. Seeing or moving in parallel

    DEFF Research Database (Denmark)

    Christensen, Mark Schram; Ehrsson, H Henrik; Nielsen, Jens Bo

    2013-01-01

    ... adduction-abduction movements symmetrically or in parallel with real-time congruent or incongruent visual feedback of the movements. One network, consisting of bilateral superior and middle frontal gyrus and supplementary motor area (SMA), was more active when subjects performed parallel movements, whereas a different network, involving bilateral dorsal premotor cortex (PMd), primary motor cortex, and SMA, was more active when subjects viewed parallel movements while performing either symmetrical or parallel movements. Correlations between behavioral instability and brain activity were present in right lateral...

  8. Executive Functioning Profiles and Test Anxiety in College Students

    Science.gov (United States)

    O'Donnell, Patrick S.

    2017-01-01

    The current study attempted to answer whether a specific executive functioning profile for individuals with test anxiety exists and whether deficits in working memory are associated with an earlier onset of test anxiety. Two hundred eighty-four undergraduate students completed a survey on test anxiety and self-report measures of test anxiety and…

  9. Fear Control and Danger Control: A Test of the Extended Parallel Process Model (EPPM).

    Science.gov (United States)

    Witte, Kim

    1994-01-01

    Explores cognitive and emotional mechanisms underlying success and failure of fear appeals in context of AIDS prevention. Offers general support for Extended Parallel Process Model. Suggests that cognitions lead to fear appeal success (attitude, intention, or behavior changes) via danger control processes, whereas the emotion fear leads to fear…

  10. Outcomes of anatomical versus functional testing for coronary artery disease.

    Science.gov (United States)

    Douglas, Pamela S; Hoffmann, Udo; Patel, Manesh R; Mark, Daniel B; Al-Khalidi, Hussein R; Cavanaugh, Brendan; Cole, Jason; Dolor, Rowena J; Fordyce, Christopher B; Huang, Megan; Khan, Muhammad Akram; Kosinski, Andrzej S; Krucoff, Mitchell W; Malhotra, Vinay; Picard, Michael H; Udelson, James E; Velazquez, Eric J; Yow, Eric; Cooper, Lawton S; Lee, Kerry L

    2015-04-02

    Many patients have symptoms suggestive of coronary artery disease (CAD) and are often evaluated with the use of diagnostic testing, although there are limited data from randomized trials to guide care. We randomly assigned 10,003 symptomatic patients to a strategy of initial anatomical testing with the use of coronary computed tomographic angiography (CTA) or to functional testing (exercise electrocardiography, nuclear stress testing, or stress echocardiography). The composite primary end point was death, myocardial infarction, hospitalization for unstable angina, or major procedural complication. Secondary end points included invasive cardiac catheterization that did not show obstructive CAD and radiation exposure. The mean age of the patients was 60.8±8.3 years, 52.7% were women, and 87.7% had chest pain or dyspnea on exertion. The mean pretest likelihood of obstructive CAD was 53.3±21.4%. Over a median follow-up period of 25 months, a primary end-point event occurred in 164 of 4996 patients in the CTA group (3.3%) and in 151 of 5007 (3.0%) in the functional-testing group (adjusted hazard ratio, 1.04; 95% confidence interval, 0.83 to 1.29; P=0.75). CTA was associated with fewer catheterizations showing no obstructive CAD than was functional testing (3.4% vs. 4.3%, P=0.02), although more patients in the CTA group underwent catheterization within 90 days after randomization (12.2% vs. 8.1%). The median cumulative radiation exposure per patient was lower in the CTA group than in the functional-testing group (10.0 mSv vs. 11.3 mSv), but 32.6% of the patients in the functional-testing group had no exposure, so the overall exposure was higher in the CTA group (mean, 12.0 mSv vs. 10.1 mSv; P<0.001). In symptomatic patients with suspected CAD who required noninvasive testing, a strategy of initial CTA, as compared with functional testing, did not improve clinical outcomes over a median follow-up of 2 years. (Funded by the National Heart, Lung, and Blood Institute

  11. A novel parallel pipeline structure of VP9 decoder

    Science.gov (United States)

    Qin, Huabiao; Chen, Wu; Yi, Sijun; Tan, Yunfei; Yi, Huan

    2018-04-01

    To improve the efficiency of the VP9 decoder, a novel parallel pipeline structure for the VP9 decoder is presented in this paper. According to the decoding workflow, the VP9 decoder can be divided into sub-modules, including entropy decoding, inverse quantization, inverse transform, intra prediction, inter prediction, deblocking and pixel adaptive compensation. By analyzing the computing time of each module, hotspot modules are located and the causes of the decoder's low efficiency can be identified. A pipelined decoder structure is then designed using a mix of data-division and function-division parallel decoding methods. The experimental results show that this structure can greatly improve the decoding efficiency of VP9.
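
    As a rough illustration of the function-division part of such a pipeline (a sketch, not the paper's implementation), the snippet below chains two worker threads through queues so that one frame can pass through the first stage while the previous frame is still in the second; the stage functions and frame labels are placeholders, not real VP9 routines.

    ```python
    import threading, queue

    def stage(fn, q_in, q_out):
        """Run one pipeline stage: pull items, transform them, push downstream."""
        while True:
            item = q_in.get()
            if item is None:            # poison pill: shut down and propagate
                q_out.put(None)
                break
            q_out.put(fn(item))

    # placeholder per-frame steps standing in for, e.g., entropy decoding and reconstruction
    entropy_decode = lambda frame: frame + ":decoded"
    reconstruct = lambda frame: frame + ":reconstructed"

    q0, q1, q2 = queue.Queue(), queue.Queue(), queue.Queue()
    workers = [threading.Thread(target=stage, args=(entropy_decode, q0, q1)),
               threading.Thread(target=stage, args=(reconstruct, q1, q2))]
    for w in workers:
        w.start()
    for frame in ["frame0", "frame1", "frame2"]:
        q0.put(frame)
    q0.put(None)
    for result in iter(q2.get, None):   # frames stream out as they finish both stages
        print(result)
    for w in workers:
        w.join()
    ```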

  12. The numerical parallel computing of photon transport

    International Nuclear Information System (INIS)

    Huang Qingnan; Liang Xiaoguang; Zhang Lifa

    1998-12-01

    The parallel computing of photon transport is investigated; the parallel algorithm and the parallelization of programs on parallel computers with both shared and distributed memory are discussed. By analyzing the inherent structure of the mathematical and physical model of photon transport in light of the architecture of parallel computers (using a divide-and-conquer strategy, adjusting the algorithm structure of the program, breaking data dependencies, identifying parallelizable components, and creating large-grain parallel subtasks), the sequential computation of photon transport is efficiently transformed into parallel and vector computation. The program was run on various high-performance parallel computers such as the HY-1 (PVP), the Challenge (SMP) and the YH-3 (MPP), and very good parallel speedup was obtained

  13. Hypergraph partitioning implementation for parallelizing matrix-vector multiplication using CUDA GPU-based parallel computing

    Science.gov (United States)

    Murni, Bustamam, A.; Ernastuti, Handhika, T.; Kerami, D.

    2017-07-01

    Matrix-vector multiplication in real-world problems often involves large matrices of arbitrary size, so parallelization is needed to speed up a calculation that otherwise takes a long time. Graph partitioning techniques discussed in previous studies cannot be used to parallelize matrix-vector multiplication for matrices of arbitrary size, because they assume a square, symmetric matrix. Hypergraph partitioning techniques overcome this limitation of graph partitioning. This paper addresses the efficient parallelization of matrix-vector multiplication through hypergraph partitioning techniques using CUDA GPU-based parallel computing. CUDA (compute unified device architecture) is a parallel computing platform and programming model created by NVIDIA and implemented on the GPU (graphics processing unit).
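
    A minimal sketch of partitioned matrix-vector multiplication, with plain row blocks and a CPU process pool standing in for the hypergraph-partitioned CUDA implementation described above (so communication volume is not actually minimized): each worker computes the partial product for its block of rows and the pieces are concatenated.

    ```python
    import numpy as np
    from concurrent.futures import ProcessPoolExecutor

    def _block_spmv(args):
        block, x = args
        return block @ x                       # partial result for one row block

    def parallel_mv(A, x, workers=4):
        """y = A @ x with the rows of A split across worker processes."""
        blocks = np.array_split(A, workers, axis=0)
        with ProcessPoolExecutor(max_workers=workers) as pool:
            parts = pool.map(_block_spmv, [(b, x) for b in blocks])
        return np.concatenate(list(parts))

    if __name__ == "__main__":
        rng = np.random.default_rng(1)
        A, x = rng.random((1000, 800)), rng.random(800)     # non-square, non-symmetric
        assert np.allclose(parallel_mv(A, x), A @ x)
    ```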

  14. Writing parallel programs that work

    CERN Multimedia

    CERN. Geneva

    2012-01-01

    Serial algorithms typically run inefficiently on parallel machines. This may sound like an obvious statement, but it is the root cause of why parallel programming is considered to be difficult. The current state of the computer industry is still that almost all programs in existence are serial. This talk will describe the techniques used in the Intel Parallel Studio to provide a developer with the tools necessary to understand the behaviors and limitations of the existing serial programs. Once the limitations are known the developer can refactor the algorithms and reanalyze the resulting programs with the tools in the Intel Parallel Studio to create parallel programs that work. About the speaker Paul Petersen is a Sr. Principal Engineer in the Software and Solutions Group (SSG) at Intel. He received a Ph.D. degree in Computer Science from the University of Illinois in 1993. After UIUC, he was employed at Kuck and Associates, Inc. (KAI) working on auto-parallelizing compiler (KAP), and was involved in th...

  15. Parallel Framework for Cooperative Processes

    Directory of Open Access Journals (Sweden)

    Mitică Craus

    2005-01-01

    Full Text Available This paper describes the work of an object oriented framework designed to be used in the parallelization of a set of related algorithms. The idea behind the system we are describing is to have a re-usable framework for running several sequential algorithms in a parallel environment. The algorithms that the framework can be used with have several things in common: they have to run in cycles and it must be possible to split the work between several "processing units". The parallel framework uses the message-passing communication paradigm and is organized as a master-slave system. Two applications are presented: an Ant Colony Optimization (ACO) parallel algorithm for the Travelling Salesman Problem (TSP) and an Image Processing (IP) parallel algorithm for the Symmetrical Neighborhood Filter (SNF). The implementations of these applications by means of the parallel framework prove to have good performance: approximately linear speedup and low communication cost.

  16. Oxytocin: parallel processing in the social brain?

    Science.gov (United States)

    Dölen, Gül

    2015-06-01

    Early studies attempting to disentangle the network complexity of the brain exploited the accessibility of sensory receptive fields to reveal circuits made up of synapses connected both in series and in parallel. More recently, extension of this organisational principle beyond the sensory systems has been made possible by the advent of modern molecular, viral and optogenetic approaches. Here, evidence supporting parallel processing of social behaviours mediated by oxytocin is reviewed. Understanding oxytocinergic signalling from this perspective has significant implications for the design of oxytocin-based therapeutic interventions aimed at disorders such as autism, where disrupted social function is a core clinical feature. Moreover, identification of opportunities for novel technology development will require a better appreciation of the complexity of the circuit-level organisation of the social brain. © 2015 The Authors. Journal of Neuroendocrinology published by John Wiley & Sons Ltd on behalf of British Society for Neuroendocrinology.

  17. Vdebug: debugging tool for parallel scientific programs. Design report on vdebug

    International Nuclear Information System (INIS)

    Matsuda, Katsuyuki; Takemiya, Hiroshi

    2000-02-01

    We report on a debugging tool called vdebug, which supports the debugging of parallel scientific simulation programs. It is difficult to debug scientific programs with an existing debugger because the volume of data generated by such programs is too large for users to check in character form, which is how existing debuggers usually display data values. To alleviate this, we have developed vdebug, which makes it possible to check the validity of large amounts of data by displaying the values visually. Although vdebug was originally restricted to sequential programs, we have made it applicable to parallel programs by adding the ability to merge and visualize data distributed across the programs on each computer node. Now, vdebug works on seven kinds of parallel computers. In this report, we describe the design of vdebug. (author)

  18. The Permanent Magnet Operating Mechanism of Double Coil Parallel Driven at a High Speed

    Directory of Open Access Journals (Sweden)

    WEI Xau-Lao

    2017-02-01

    Full Text Available Abstract: The operating mechanism is the main part of a circuit breaker, and its quality directly influences the safe operation of the power system. To meet the continuously increasing requirements on switchgear and make the actuator close faster and with more force, this paper proposes a high-speed permanent magnet actuator driven by two coils in parallel. The working principles of the single-coil and the double-coil parallel-driven permanent magnet actuator are explained. A model is built in Ansoft and the test results are compared. In practice, we designed and produced single-coil and double-coil parallel-driven permanent magnet actuators for experimental study. The simulation and experiment results show that the double-coil parallel-driven permanent magnet actuator, compared with the single-coil actuator, has better and faster action performance. Thus, the double-coil parallel-driven permanent magnet actuator achieves a kind of optimization.

  19. Parallel computing: numerics, applications, and trends

    National Research Council Canada - National Science Library

    Trobec, Roman; Vajteršic, Marián; Zinterhof, Peter

    2009-01-01

    ... and/or distributed systems. The contributions to this book are focused on topics most concerned in the trends of today's parallel computing. These range from parallel algorithmics, programming, tools, network computing to future parallel computing. Particular attention is paid to parallel numerics: linear algebra, differential equations, numerica...

  20. Parallel Computing Strategies for Irregular Algorithms

    Science.gov (United States)

    Biswas, Rupak; Oliker, Leonid; Shan, Hongzhang; Biegel, Bryan (Technical Monitor)

    2002-01-01

    Parallel computing promises several orders of magnitude increase in our ability to solve realistic computationally-intensive problems, but relies on their efficient mapping and execution on large-scale multiprocessor architectures. Unfortunately, many important applications are irregular and dynamic in nature, making their effective parallel implementation a daunting task. Moreover, with the proliferation of parallel architectures and programming paradigms, the typical scientist is faced with a plethora of questions that must be answered in order to obtain an acceptable parallel implementation of the solution algorithm. In this paper, we consider three representative irregular applications: unstructured remeshing, sparse matrix computations, and N-body problems, and parallelize them using various popular programming paradigms on a wide spectrum of computer platforms ranging from state-of-the-art supercomputers to PC clusters. We present the underlying problems, the solution algorithms, and the parallel implementation strategies. Smart load-balancing, partitioning, and ordering techniques are used to enhance parallel performance. Overall results demonstrate the complexity of efficiently parallelizing irregular algorithms.

  1. Wada test for evaluation of language and memory function in medically intractable epilepsy

    International Nuclear Information System (INIS)

    Hong, Yong Kook; Chung, Tae Sub; Suh, Jung Ho; Kim, Dong Ik; Kim, Eun Kyung; Lee, Byung In; Huh, Kyun

    1992-01-01

    The Wada test was performed for lateralization of language and memory function, using intracarotid injection of Sodium Amytal. But the internal carotid artery (ICA) Wada test has some limitations for testing memory function. The posterior cerebral artery (PCA) Wada test has been designed to modify the ICA Wada test for testing memory function selectively. In our study, 10 patients out of 12 patients with intractable seizure underwent only the ICA Wada test and the other 2 patients underwent both the ICA and the selective PCA Wada test. In all 12 patients undergoing the ICA Wada test, we successfully localized speech and language dominance. Four of 12 patients who underwent the ICA Wada test for evaluation of memory function displayed superior memory functions in one hemisphere, but the other hemisphere also significantly contributed to memory. The selective PCA Wada test, performed in 2 patients, showed successful results of memory function test in both patients. Four of 12 patients underwent temporal lobectomy and there was no major post-operative language or memory deficits. We concluded that the ICA and PCA Wada tests are useful for preoperative evaluation of medically intractable epilepsy, and the PCA Wada test is valuable in memory evaluation in some patients who have high risk of postoperative global amnesia after temporal lobectomy following equivocal results of memory function by the ICA Wada test

  2. Wada test for evaluation of language and memory function in medically intractable epilepsy

    Energy Technology Data Exchange (ETDEWEB)

    Hong, Yong Kook; Chung, Tae Sub; Suh, Jung Ho; Kim, Dong Ik; Kim, Eun Kyung; Lee, Byung In; Huh, Kyun [College of Medicine, Yonsei University, Seoul (Korea, Republic of)

    1992-05-15

    The Wada test was performed for lateralization of language and memory function, using intracarotid injection of Sodium Amytal. But the internal carotid artery (ICA) Wada test has some limitations for testing memory function. The posterior cerebral artery (PCA) Wada test has been designed to modify the ICA Wada test for testing memory function selectively. In our study, 10 patients out of 12 patients with intractable seizure underwent only the ICA Wada test and the other 2 patients underwent both the ICA and the selective PCA Wada test. In all 12 patients undergoing the ICA Wada test, we successfully localized speech and language dominance. Four of 12 patients who underwent the ICA Wada test for evaluation of memory function displayed superior memory functions in one hemisphere, but the other hemisphere also significantly contributed to memory. The selective PCA Wada test, performed in 2 patients, showed successful results of memory function test in both patients. Four of 12 patients underwent temporal lobectomy and there was no major post-operative language or memory deficits. We concluded that the ICA and PCA Wada tests are useful for preoperative evaluation of medically intractable epilepsy, and the PCA Wada test is valuable in memory evaluation in some patients who have high risk of postoperative global amnesia after temporal lobectomy following equivocal results of memory function by the ICA Wada test.

  3. MAVL wastes containers functional demonstration and associated tests program

    International Nuclear Information System (INIS)

    Templier, J.C.

    2002-01-01

    In the framework of studies on MAVL (intermediate-level long-lived) wastes, the CEA is developing containers for medium-term interim waste storage. This program aims to produce a 'B waste container' demonstrator; a demonstrator is a container, parts of a container, or samples on which the tests are to be validated. This document presents the state of the study in three chapters: description of functions, base data and design choices; presentation of the functional demonstrators; and description of the demonstration tests. (A.L.B.)

  4. Proceedings of the workshop on Compilation of (Symbolic) Languages for Parallel Computers

    Energy Technology Data Exchange (ETDEWEB)

    Foster, I.; Tick, E. (comp.)

    1991-11-01

    This report comprises the abstracts and papers for the talks presented at the Workshop on Compilation of (Symbolic) Languages for Parallel Computers, held October 31--November 1, 1991, in San Diego. These unrefereed contributions were provided by the participants for the purpose of this workshop; many of them will be published elsewhere in peer-reviewed conferences and publications. Our goal in planning this workshop was to bring together researchers from different disciplines with common problems in compilation. In particular, we wished to encourage interaction between researchers working in compilation of symbolic languages and those working on compilation of conventional, imperative languages. The fundamental problems facing researchers interested in compilation of logic, functional, and procedural programming languages for parallel computers are essentially the same. However, differences in the basic programming paradigms have led to different communities emphasizing different species of the parallel compilation problem. For example, parallel logic and functional languages provide dataflow-like formalisms in which control dependencies are unimportant. Hence, a major focus of research in compilation has been on techniques that try to infer when sequential control flow can safely be imposed. Granularity analysis for scheduling is a related problem. The single-assignment property leads to a need for analysis of memory use in order to detect opportunities for reuse. Much of the work in each of these areas relies on the use of abstract interpretation techniques.

  5. Reactor safety impact of functional test intervals: an application of Bayesian decision theory

    International Nuclear Information System (INIS)

    Buoni, F.B.

    1978-01-01

    Functional test intervals for important nuclear reactor systems can be obtained by viewing safety assessment as a decision process and functional testing as a Bayesian learning or information process. A preposterior analysis is used as the analytical model to find the preposterior expected reliability of a system as a function of test intervals. Persistent and transitory failure models are shown to yield different results. Functional tests of systems subject to persistent failure are effective in maintaining system reliability goals. Functional testing is not effective for systems subject to transitory failure; preventive maintenance must be used. A Bayesian posterior analysis of testing data can discriminate between persistent and transitory failure. The role of functional testing is seen to be an aid in assessing the future performance of reactor systems
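
    For orientation only, the classical (non-Bayesian) approximation below shows why the test interval matters for persistent failures: a standby component with an assumed constant failure rate lam that is functionally tested every T hours has a mean unavailability of roughly lam * T / 2. The rate and the intervals are illustrative assumptions, and this is not the paper's preposterior model.

    ```python
    lam = 1.0e-5                                # assumed constant failure rate per hour (illustrative)
    intervals = [168, 730, 2190, 4380, 8760]    # weekly ... yearly functional test intervals, in hours

    # mean unavailability of a periodically tested standby component,
    # ignoring test duration and repair time: q ~ lam * T / 2
    for T in intervals:
        print(f"T = {T:5d} h  ->  mean unavailability ~ {lam * T / 2:.2e}")
    ```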

  6. The Functional Test for Agility Performance is a Reliable Quick Decision-Making Test for Skilled Water Polo Players

    Directory of Open Access Journals (Sweden)

    Tucher Guilherme

    2015-06-01

    Full Text Available The reliability of the Functional Test for Agility Performance has only been evaluated in water polo players in a small group of novice athletes. Thus, the aim of this study was to evaluate the reliability of the Functional Test for Agility Performance in skilled water polo players. Forty-two athletes (17.81 ± 3.24 years old) with a minimum of 5 years of competitive experience (7.05 ± 2.84 years) and playing at the national or international level were evaluated. The Functional Test for Agility Performance is characterized as a specific open decision-making test in which a tested player moves as quickly as possible in accordance with a pass made by another player. The time spent in the test was measured by two experienced coaches. Descriptive statistics, repeated measures analysis of variance (ANOVA), 95% limits of agreement (LOA), the intraclass correlation coefficient (ICC) and the standard error of measurement (SEM) were used for data analysis. Athletes completed the Functional Test for Agility Performance in 4.15 ± 0.47 s. The ICC value was 0.87 (95% CI = 0.80-0.92). The SEM varied between 0.24 and 0.38 s. The LOA was 1.20 s and the average CV considering each individual trial was 6%. The Functional Test for Agility Performance was shown to be a reliable quick decision-making test for skilled water polo players.

  7. Effects of Parallel Channel Interactions on Two-Phase Flow Split in ...

    African Journals Online (AJOL)

    The tests would aid the development of a realistic transient computer model for tracking the distribution of two-phase flows into the multiple parallel channels of a Nuclear Reactor, during Loss of Coolant Accidents (LOCA), and were performed at the General Electric Nuclear Energy Division Laboratory, California. The test ...

  8. On а Recursive-Parallel Algorithm for Solving the Knapsack Problem

    Directory of Open Access Journals (Sweden)

    Vladimir V. Vasilchikov

    2018-01-01

    Full Text Available In this paper, we offer an efficient parallel algorithm for solving the NP-complete Knapsack Problem in its basic, so-called 0-1 variant. To find its exact solution, algorithms belonging to the category of "branch and bound" methods have long been used. Various options for parallelizing the computations have also been used to speed up the solution, with varying degrees of efficiency. We propose here an algorithm for solving the problem based on the paradigm of recursive-parallel computations. We consider it well suited to problems of this kind, where it is difficult to break up the computations in advance into a sufficient number of subtasks of comparable complexity, since the subtasks appear dynamically at run time. We used the RPM ParLib library, developed by the author, as the main tool to program the algorithm. This library allows us to develop effective applications for parallel computing on a local network in the .NET Framework. Such applications have the ability to generate parallel branches of computation directly during program execution and dynamically redistribute work between computing modules. Any language with support for the .NET Framework can be used as a programming language in conjunction with this library. For our experiments, we developed some C# applications using this library. The main purpose of these experiments was to study the acceleration achieved by recursive-parallel computing. A detailed description of the algorithm and its testing, as well as the results obtained, are also given in the paper.
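
    The sketch below shows one possible shape of such a recursive branch-and-bound solver for the 0-1 knapsack problem, with only the two top-level branches farmed out to separate processes; it uses the Python standard library rather than the RPM ParLib/.NET machinery of the paper, and the incumbent value is not shared between branches, so pruning is weaker than in a dynamic work-redistribution scheme.

    ```python
    from concurrent.futures import ProcessPoolExecutor

    def bound(i, value, cap, items):
        """Fractional (LP) relaxation bound; items are sorted by value/weight."""
        for v, w in items[i:]:
            if w <= cap:
                value, cap = value + v, cap - w
            else:
                return value + v * cap / w
        return value

    def branch(i, value, cap, items, best):
        if i == len(items):
            return max(best, value)
        if bound(i, value, cap, items) <= best:
            return best                                    # prune this subtree
        v, w = items[i]
        if w <= cap:                                       # branch 1: take item i
            best = max(best, branch(i + 1, value + v, cap - w, items, best))
        return max(best, branch(i + 1, value, cap, items, best))   # branch 2: skip item i

    def solve(args):
        return branch(*args)

    if __name__ == "__main__":
        items = sorted([(60, 10), (100, 20), (120, 30), (40, 15)],   # (value, weight) pairs
                       key=lambda it: it[0] / it[1], reverse=True)
        cap = 50
        v0, w0 = items[0]
        tasks = [(1, 0, cap, items, 0)]                    # top-level branch: skip item 0
        if w0 <= cap:
            tasks.append((1, v0, cap - w0, items, 0))      # top-level branch: take item 0
        with ProcessPoolExecutor(max_workers=2) as pool:
            print(max(pool.map(solve, tasks)))             # 220 for this toy instance
    ```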

  9. A lightweight, flow-based toolkit for parallel and distributed bioinformatics pipelines

    Directory of Open Access Journals (Sweden)

    Cieślik Marcin

    2011-02-01

    Full Text Available Abstract Background Bioinformatic analyses typically proceed as chains of data-processing tasks. A pipeline, or 'workflow', is a well-defined protocol, with a specific structure defined by the topology of data-flow interdependencies, and a particular functionality arising from the data transformations applied at each step. In computer science, the dataflow programming (DFP) paradigm defines software systems constructed in this manner, as networks of message-passing components. Thus, bioinformatic workflows can be naturally mapped onto DFP concepts. Results To enable the flexible creation and execution of bioinformatics dataflows, we have written a modular framework for parallel pipelines in Python ('PaPy'). A PaPy workflow is created from re-usable components connected by data-pipes into a directed acyclic graph, which together define nested higher-order map functions. The successive functional transformations of input data are evaluated on flexibly pooled compute resources, either local or remote. Input items are processed in batches of adjustable size, allowing the user to tune the trade-off between parallelism and lazy evaluation (memory consumption). An add-on module ('NuBio') facilitates the creation of bioinformatics workflows by providing domain-specific data containers (e.g., for biomolecular sequences, alignments, structures) and functionality (e.g., to parse/write standard file formats). Conclusions PaPy offers a modular framework for the creation and deployment of parallel and distributed data-processing workflows. Pipelines derive their functionality from user-written, data-coupled components, so PaPy also can be viewed as a lightweight toolkit for extensible, flow-based bioinformatics data-processing. The simplicity and flexibility of distributed PaPy pipelines may help users bridge the gap between traditional desktop/workstation and grid computing. PaPy is freely distributed as open-source Python code at http://muralab.org/PaPy, and
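
    A toy pipeline in the same spirit, written against plain multiprocessing rather than PaPy's actual API: two user-written, data-coupled stages are evaluated on a pooled set of worker processes, with the second stage consuming the first stage's output in adjustable batches. The stage functions and input records are invented for illustration.

    ```python
    from multiprocessing import Pool

    def parse(record):
        """Stage 1: normalise a raw sequence record."""
        return record.strip().upper()

    def gc_content(seq):
        """Stage 2: compute a per-item statistic on the parsed item."""
        return seq, (seq.count("G") + seq.count("C")) / max(len(seq), 1)

    if __name__ == "__main__":
        records = ["acgtacgt", " ggccggcc ", "attattat"]          # invented input items
        with Pool(processes=2) as pool:
            parsed = pool.map(parse, records, chunksize=2)        # stage 1 evaluated on the pool
            for seq, gc in pool.imap(gc_content, parsed, chunksize=2):   # stage 2 streamed in batches
                print(seq, round(gc, 2))
    ```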

  10. Development and evaluation of a scheduling algorithm for parallel hardware tests at CERN

    CERN Document Server

    Galetzka, Michael

    This thesis aims at describing the problem of scheduling, evaluating different scheduling algorithms and comparing them with each other as well as with the current prototype solution. The implementation of the final solution will be delineated, as will the design considerations that led to it. The CERN Large Hadron Collider (LHC) has to deal with unprecedented stored energy, both in its particle beams and its superconducting magnet circuits. This energy could result in major equipment damage and downtime if it is not properly extracted from the machine. Before commissioning the machine with the particle beam, several thousands of tests have to be executed, analyzed and tracked to assess the proper functioning of the equipment and protection systems. These tests access the accelerator's equipment in order to verify the correct behavior of all systems, such as magnets, power converters and interlock controllers. A test could, for example, ramp the magnet to a certain energy level and then provoke an emergency...

  11. Multisensor transducer based on a parallel fiber optic digital-to-analog converter

    Directory of Open Access Journals (Sweden)

    Grechishnikov Vladimir

    2017-01-01

    Full Text Available The possibility of creating a multisensor information converter (MSPI) based on a new fiber-optic functional element, a fiber-optic digital-to-analog converter (DAC), is considered. The fiber-optic DAC provides jamming immunity combined with low weight and cost of the sensing elements. For that reason an MSPI scheme was developed based on a parallel fiber-optic DAC (Russian Federation Patent 157416). An equation for the parallel fiber-optic DAC is derived, and a general mathematical model of the proposed converter is elaborated. A method is developed for reducing conversion errors by placing the DAC transfer function between the i-th and (i + 1)-th ADC quantization levels. This model makes it possible to obtain reliable information about the technical capabilities of the converter without the need for costly experiments.

  12. MaMiCo: Software design for parallel molecular-continuum flow simulations

    KAUST Repository

    Neumann, Philipp; Flohr, Hanno; Arora, Rahul; Jarmatz, Piet; Tchipev, Nikola; Bungartz, Hans-Joachim

    2015-01-01

    The macro-micro-coupling tool (MaMiCo) was developed to ease the development of and modularize molecular-continuum simulations, retaining sequential and parallel performance. We demonstrate the functionality and performance of MaMiCo by coupling

  13. Parallelization of MRCI based on hole-particle symmetry.

    Science.gov (United States)

    Suo, Bing; Zhai, Gaohong; Wang, Yubin; Wen, Zhenyi; Hu, Xiangqian; Li, Lemin

    2005-01-15

    The parallel implementation of a multireference configuration interaction program based on hole-particle symmetry is described. The platform used for the parallelization is an Intel-architecture cluster consisting of 12 nodes, each of which is equipped with two 2.4-GHz XEON processors, 3 GB of memory, and a 36-GB disk, connected by a Gigabit Ethernet switch. The dependence of speedup on molecular symmetries and task granularities is discussed. Test calculations show that the scaling with the number of nodes is about 1.9 (for C1 and Cs), 1.65 (for C2v), and 1.55 (for D2h) when the number of nodes is doubled. The largest calculation performed on this cluster involves 5.6 × 10^8 CSFs.

  14. Patterns for Parallel Software Design

    CERN Document Server

    Ortega-Arjona, Jorge Luis

    2010-01-01

    Essential reading to understand patterns for parallel programming Software patterns have revolutionized the way we think about how software is designed, built, and documented, and the design of parallel software requires you to consider other particular design aspects and special skills. From clusters to supercomputers, success heavily depends on the design skills of software developers. Patterns for Parallel Software Design presents a pattern-oriented software architecture approach to parallel software design. This approach is not a design method in the classic sense, but a new way of managin

  15. High performance parallel I/O

    CERN Document Server

    Prabhat

    2014-01-01

    Gain Critical Insight into the Parallel I/O Ecosystem. Parallel I/O is an integral component of modern high performance computing (HPC), especially in storing and processing very large datasets to facilitate scientific discovery. Revealing the state of the art in this field, High Performance Parallel I/O draws on insights from leading practitioners, researchers, software architects, developers, and scientists who shed light on the parallel I/O ecosystem. The first part of the book explains how large-scale HPC facilities scope, configure, and operate systems, with an emphasis on choices of I/O har

  16. Performance-intensity functions of Mandarin word recognition tests in noise: test dialect and listener language effects.

    Science.gov (United States)

    Liu, Danzheng; Shi, Lu-Feng

    2013-06-01

    This study established the performance-intensity function for Beijing and Taiwan Mandarin bisyllabic word recognition tests in noise in native speakers of Wu Chinese. Effects of the test dialect and listeners' first language on psychometric variables (i.e., slope and 50%-correct threshold) were analyzed. Thirty-two normal-hearing Wu-speaking adults who used Mandarin since early childhood were compared to 16 native Mandarin-speaking adults. Both Beijing and Taiwan bisyllabic word recognition tests were presented at 8 signal-to-noise ratios (SNRs) in 4-dB steps (-12 dB to +16 dB). At each SNR, a half list (25 words) was presented in speech-spectrum noise to listeners' right ear. The order of the test, SNR, and half list was randomized across listeners. Listeners responded orally and in writing. Overall, the Wu-speaking listeners performed comparably to the Mandarin-speaking listeners on both tests. Compared to the Taiwan test, the Beijing test yielded a significantly lower threshold for both the Mandarin- and Wu-speaking listeners, as well as a significantly steeper slope for the Wu-speaking listeners. Both Mandarin tests can be used to evaluate Wu-speaking listeners. Of the 2, the Taiwan Mandarin test results in more comparable functions across listener groups. Differences in the performance-intensity function between listener groups and between tests indicate a first language and dialectal effect, respectively.
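
    A performance-intensity function of this kind is commonly summarised by fitting a sigmoid to percent-correct scores across SNRs and reading off the 50%-correct threshold and the slope. The sketch below fits a logistic function to hypothetical scores (the data points are invented, not the study's results).

    ```python
    import numpy as np
    from scipy.optimize import curve_fit

    def pi_function(snr, threshold, k):
        """Logistic performance-intensity function: percent correct vs. SNR in dB."""
        return 100.0 / (1.0 + np.exp(-k * (snr - threshold)))

    snr = np.arange(-12, 17, 4)                          # -12 to +16 dB SNR in 4-dB steps
    pct = np.array([5, 14, 32, 55, 76, 90, 96, 99])      # hypothetical percent-correct scores
    (threshold, k), _ = curve_fit(pi_function, snr, pct, p0=[0.0, 0.5])
    print(f"50%-correct threshold: {threshold:.1f} dB SNR")
    print(f"slope at threshold:    {25.0 * k:.1f} %/dB")   # derivative at the midpoint is 100*k/4
    ```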

  17. A method of paralleling computer calculation for two-dimensional kinetic plasma model

    International Nuclear Information System (INIS)

    Brazhnik, V.A.; Demchenko, V.V.; Dem'yanov, V.G.; D'yakov, V.E.; Ol'shanskij, V.V.; Panchenko, V.I.

    1987-01-01

    A method for parallel computer calculation, and the OSIRIS program complex that implements it for numerical plasma simulation by the macroparticle method, are described. The calculation can be carried out either on one computer or simultaneously on two BESM-6 computers; the latter is provided by a package of interacting programs running on each computer. Program interaction within each computer is based on the event techniques of OS DISPAK. Parallel calculation with two BESM-6 computers makes it possible to accelerate the computation by a factor of 1.5

  18. Verification of Electromagnetic Physics Models for Parallel Computing Architectures in the GeantV Project

    Energy Technology Data Exchange (ETDEWEB)

    Amadio, G.; et al.

    2017-11-22

    An intensive R&D and programming effort is required to accomplish new challenges posed by future experimental high-energy particle physics (HEP) programs. The GeantV project aims to narrow the gap between the performance of the existing HEP detector simulation software and the ideal performance achievable, exploiting the latest advances in computing technology. The project has developed a particle detector simulation prototype capable of transporting particles in parallel through complex geometries, exploiting instruction-level microparallelism (SIMD and SIMT), task-level parallelism (multithreading) and high-level parallelism (MPI), leveraging both the multi-core and the many-core opportunities. We present preliminary verification results concerning the electromagnetic (EM) physics models developed for parallel computing architectures within the GeantV project. In order to exploit the potential of vectorization and accelerators and to make the physics models effectively parallelizable, advanced sampling techniques have been implemented and tested. In this paper we introduce a set of automated statistical tests in order to verify the vectorized models by checking their consistency with the corresponding Geant4 models and to validate them against experimental data.
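
    One simple form such an automated consistency check can take (a sketch, not the GeantV test suite itself) is a Pearson chi-square comparison of binned distributions produced by the scalar and the vectorised models; the counts below are invented, and the formula assumes the two histograms have comparable total statistics.

    ```python
    import numpy as np
    from scipy.stats import chi2

    def chi2_two_hist(h1, h2):
        """Pearson chi-square between two histograms with comparable total counts."""
        h1, h2 = np.asarray(h1, dtype=float), np.asarray(h2, dtype=float)
        used = (h1 + h2) > 0
        stat = np.sum((h1[used] - h2[used]) ** 2 / (h1[used] + h2[used]))
        ndf = int(used.sum()) - 1
        return stat, ndf, chi2.sf(stat, ndf)

    # invented counts standing in for binned spectra from the scalar and vectorised models
    scalar = [120, 240, 310, 180, 90, 40, 15, 5]
    vector = [118, 252, 301, 175, 97, 36, 17, 4]
    stat, ndf, p = chi2_two_hist(scalar, vector)
    print(f"chi2/ndf = {stat:.2f}/{ndf}, p-value = {p:.3f}")
    ```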

  19. Parallel algorithms for continuum dynamics

    International Nuclear Information System (INIS)

    Hicks, D.L.; Liebrock, L.M.

    1987-01-01

    Simply porting existing parallel programs to a new parallel processor may not achieve the full speedup possible; to achieve the maximum efficiency may require redesigning the parallel algorithms for the specific architecture. The authors discuss here parallel algorithms that were developed first for the HEP processor and then ported to the CRAY X-MP/4, the ELXSI/10, and the Intel iPSC/32. Focus is mainly on the most recent parallel processing results produced, i.e., those on the Intel Hypercube. The applications are simulations of continuum dynamics in which the momentum and stress gradients are important. Examples of these are inertial confinement fusion experiments, severe breaks in the coolant system of a reactor, weapons physics, shock-wave physics. Speedup efficiencies on the Intel iPSC Hypercube are very sensitive to the ratio of communication to computation. Great care must be taken in designing algorithms for this machine to avoid global communication. This is much more critical on the iPSC than it was on the three previous parallel processors

  20. Parallel Sn iteration schemes

    International Nuclear Information System (INIS)

    Wienke, B.R.; Hiromoto, R.E.

    1986-01-01

    The iterative, multigroup, discrete ordinates (Sn) technique for solving the linear transport equation enjoys widespread usage and appeal. Serial iteration schemes and numerical algorithms developed over the years provide a timely framework for parallel extension. On the Denelcor HEP, the authors investigate three parallel iteration schemes for solving the one-dimensional Sn transport equation. The multigroup representation and serial iteration methods are also reviewed. This analysis represents a first attempt to extend serial Sn algorithms to parallel environments and provides good baseline estimates on ease of parallel implementation, relative algorithm efficiency, comparative speedup, and some future directions. The authors examine ordered and chaotic versions of these strategies, with and without concurrent rebalance and diffusion acceleration. Two strategies efficiently support high degrees of parallelization and appear to be robust parallel iteration techniques. The third strategy is a weaker parallel algorithm. Chaotic iteration, difficult to simulate on serial machines, holds promise and converges faster than ordered versions of the schemes. Actual parallel speedup and efficiency are high and payoff appears substantial