WorldWideScience

Sample records for lines substantially parallel

  1. Parallel plate transmission line transformer

    NARCIS (Netherlands)

    Voeten, S.J.; Brussaard, G.J.H.; Pemen, A.J.M.

    2011-01-01

    A Transmission Line Transformer (TLT) can be used to transform high-voltage nanosecond pulses. These transformers rely on the fact that the length of the pulse is shorter than that of the transmission lines used. This allows connecting the transmission lines in parallel at the input and in series at the output.
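
    For orientation, the basic relations of an ideal TLT built from N identical lines of characteristic impedance Z0, fed in parallel at the input and combined in series at the output, are the standard ones below (general TLT theory, not taken from this record):

```latex
Z_{\mathrm{in}} = \frac{Z_0}{N}, \qquad
Z_{\mathrm{out}} = N Z_0, \qquad
\frac{V_{\mathrm{out}}}{V_{\mathrm{in}}} = N, \qquad
\frac{Z_{\mathrm{out}}}{Z_{\mathrm{in}}} = N^{2},
```

    so an N-stage device acts as an N:1 voltage step-up with an N-squared impedance transformation, provided the pulse is shorter than the transit time of each line.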

  2. Parallel Lines

    Directory of Open Access Journals (Sweden)

    James G. Worner

    2017-05-01

    Full Text Available James Worner is an Australian-based writer and scholar currently pursuing a PhD at the University of Technology Sydney. His research seeks to expose masculinities lost in the shadow of Australia’s Anzac hegemony while exploring new opportunities for contemporary historiography. He is the recipient of the Doctoral Scholarship in Historical Consciousness at the university’s Australian Centre of Public History and will be hosted by the University of Bologna during 2017 on a doctoral research writing scholarship.   ‘Parallel Lines’ is one of a collection of stories, The Shapes of Us, exploring liminal spaces of modern life: class, gender, sexuality, race, religion and education. It looks at lives, like lines, that do not meet but which travel in proximity, simultaneously attracted and repelled. James’ short stories have been published in various journals and anthologies.

  3. Parallel field line and stream line tracing algorithms for space physics applications

    Science.gov (United States)

    Toth, G.; de Zeeuw, D.; Monostori, G.

    2004-05-01

    Field line and stream line tracing is required in various space physics applications, such as the coupling of the global magnetosphere and inner magnetosphere models, the coupling of the solar energetic particle and heliosphere models, or the modeling of comets, where the multispecies chemical equations are solved along stream lines of a steady state solution obtained with a single fluid MHD model. Tracing a vector field is an inherently serial process, which is difficult to parallelize. This is especially true when the data corresponding to the vector field is distributed over a large number of processors. We designed algorithms for the various applications, which scale well to a large number of processors. In the first algorithm the computational domain is divided into blocks. Each block is on a single processor. The algorithm follows the vector field inside the blocks, and calculates a mapping of the block surfaces. The blocks communicate the values at the coinciding surfaces, and the results are interpolated. Finally all block surfaces are defined and values inside the blocks are obtained. In the second algorithm all processors start integrating along the vector field inside the accessible volume. When the field line leaves the local subdomain, the position and other information is stored in a buffer. Periodically the processors exchange the buffers, and continue integration of the field lines until they reach a boundary. At that point the results are sent back to the originating processor. Efficiency is achieved by a careful phasing of computation and communication. In the third algorithm the results of a steady state simulation are stored on a hard drive. The vector field is contained in blocks. All processors read in all the grid and vector field data and the stream lines are integrated in parallel. If a stream line enters a block, which has already been integrated, the results can be interpolated. By a clever ordering of the blocks the execution speed can be improved.
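
    As a rough illustration of the second algorithm described above (integrate inside the local subdomain, then hand field lines that leave it to the owner of the neighbouring subdomain through periodically exchanged buffers), the toy sketch below simulates that hand-off for two "processors" tracing a 2D field. The field, step size and domain split are placeholder assumptions, not the authors' implementation:

```python
import numpy as np

# Toy 2D vector field (a uniform diagonal flow); real applications would
# interpolate a distributed MHD solution here.
def field(p):
    return np.array([1.0, 0.5])

X_MAX, Y_MAX = 4.0, 2.0          # global domain [0, X_MAX] x [0, Y_MAX]
SPLIT = 2.0                      # "processor" 0 owns x < SPLIT, 1 owns x >= SPLIT
STEP = 0.05                      # integration step (placeholder accuracy)

def owner(p):
    return 0 if p[0] < SPLIT else 1

def inside(p):
    return 0.0 <= p[0] <= X_MAX and 0.0 <= p[1] <= Y_MAX

# Each "processor" holds a buffer of (line_id, position) work items.
buffers = {0: [(i, np.array([0.0, 0.2 * i])) for i in range(5)], 1: []}
finished = {}

while any(buffers.values()):
    for rank in (0, 1):
        incoming, buffers[rank] = buffers[rank], []
        for line_id, p in incoming:
            # Integrate while the line stays in this rank's subdomain.
            while inside(p) and owner(p) == rank:
                p = p + STEP * field(p)
            if inside(p):
                # Line crossed into the other subdomain: hand it off.
                buffers[owner(p)].append((line_id, p))
            else:
                finished[line_id] = p   # reached the global boundary

for line_id, p in sorted(finished.items()):
    print(f"line {line_id} exits near ({p[0]:.2f}, {p[1]:.2f})")
```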

  4. Highly parallel line-based image coding for many cores.

    Science.gov (United States)

    Peng, Xiulian; Xu, Jizheng; Zhou, You; Wu, Feng

    2012-01-01

    Computers are developing along with a new trend from dual-core and quad-core processors to ones with tens or even hundreds of cores. Multimedia, as one of the most important applications on computers, has an urgent need for parallel coding algorithms for compression. Taking intraframe/image coding as a starting point, this paper proposes a pure line-by-line coding scheme (LBLC) to meet that need. In LBLC, an input image is processed line by line sequentially, and each line is divided into small fixed-length segments. The compression of all segments, from prediction to entropy coding, is completely independent and concurrent across many cores. Results on a general-purpose computer show that our scheme achieves a 13.9 times speedup with 15 cores at the encoder and a 10.3 times speedup at the decoder. Ideally, such a near-linear speedup relation with the number of cores can be maintained for more than 100 cores. In addition to the high parallelism, the proposed scheme performs comparably to or even better than the H.264 high profile above middle bit rates. At near-lossless coding, it outperforms H.264 by more than 10 dB. At lossless coding, up to 14% bit-rate reduction is observed compared with H.264 lossless coding at the high 4:4:4 profile.
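
    A minimal sketch of the segment-level parallelism described above is shown below, with zlib standing in for the paper's prediction and entropy-coding stages and a process pool standing in for the many cores; the segment length and test image are arbitrary assumptions:

```python
import zlib
import numpy as np
from concurrent.futures import ProcessPoolExecutor

SEG_LEN = 64  # fixed-length segments within each line (placeholder value)

def compress_segment(seg: bytes) -> bytes:
    # Stand-in for the per-segment prediction + entropy coding stage;
    # each segment is coded independently, so segments can run on any core.
    return zlib.compress(seg)

def encode_image(image: np.ndarray, workers: int = 4):
    segments = []
    for row in image:                      # process the image line by line
        raw = row.tobytes()
        segments += [raw[i:i + SEG_LEN] for i in range(0, len(raw), SEG_LEN)]
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(compress_segment, segments))

if __name__ == "__main__":
    img = np.random.default_rng(0).integers(0, 256, size=(16, 256), dtype=np.uint8)
    bitstream = encode_image(img)
    print(f"{len(bitstream)} independently coded segments")
```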

  5. On-line event reconstruction using a parallel in-memory data base

    OpenAIRE

    Argante, E; Van der Stok, P D V; Willers, Ian Malcolm

    1995-01-01

    PORS is a system designed for on-line event reconstruction in high energy physics (HEP) experiments. It uses the CPREAD reconstruction program. Central to the system is a parallel in-memory database which is used as communication medium between parallel workers. A farming control structure is implemented with PORS in a natural way. The database provides structured storage of data with a short life time. PORS serves as a case study for the construction of a methodology on how to apply parallel...

  6. Parallel inhomogeneity and the Alfven resonance. 1: Open field lines

    Science.gov (United States)

    Hansen, P. J.; Harrold, B. G.

    1994-01-01

    In light of a recent demonstration of the general nonexistence of a singularity at the Alfven resonance in cold, ideal, linearized magnetohydrodynamics, we examine the effect of a small density gradient parallel to uniform, open ambient magnetic field lines. To lowest order, energy deposition is quantitatively unaffected but occurs continuously over a thickened layer. This effect is illustrated in a numerical analysis of a plasma sheet boundary layer model with perfectly absorbing boundary conditions. Consequences of the results are discussed, both for the open field line approximation and for the ensuing closed field line analysis.

  7. Parallel transport of long mean-free-path plasma along open magnetic field lines: Parallel heat flux

    International Nuclear Information System (INIS)

    Guo Zehua; Tang Xianzhu

    2012-01-01

    In a long mean-free-path plasma where temperature anisotropy can be sustained, the parallel heat flux has two components with one associated with the parallel thermal energy and the other the perpendicular thermal energy. Due to the large deviation of the distribution function from local Maxwellian in an open field line plasma with low collisionality, the conventional perturbative calculation of the parallel heat flux closure in its local or non-local form is no longer applicable. Here, a non-perturbative calculation is presented for a collisionless plasma in a two-dimensional flux expander bounded by absorbing walls. Specifically, closures of previously unfamiliar form are obtained for ions and electrons, which relate two distinct components of the species parallel heat flux to the lower order fluid moments such as density, parallel flow, parallel and perpendicular temperatures, and the field quantities such as the magnetic field strength and the electrostatic potential. The plasma source and boundary condition at the absorbing wall enter explicitly in the closure calculation. Although the closure calculation does not take into account wave-particle interactions, the results based on passing orbits from steady-state collisionless drift-kinetic equation show remarkable agreement with fully kinetic-Maxwell simulations. As an example of the physical implications of the theory, the parallel heat flux closures are found to predict a surprising observation in the kinetic-Maxwell simulation of the 2D magnetic flux expander problem, where the parallel heat flux of the parallel thermal energy flows from low to high parallel temperature region.
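
    For reference, the two components of the parallel heat flux mentioned above are conventionally defined as velocity-space moments of the distribution function f (standard kinetic definitions; the paper's non-perturbative closure expressions are not reproduced here):

```latex
q_{\parallel\parallel} = \frac{m}{2}\int \left(v_\parallel - u_\parallel\right)^{3} f \, d^{3}v ,
\qquad
q_{\parallel\perp} = \frac{m}{2}\int \left(v_\parallel - u_\parallel\right) v_\perp^{2} \, f \, d^{3}v ,
```

    where the first component transports parallel thermal energy and the second transports perpendicular thermal energy along the magnetic field.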

  8. Parallel diffusion calculation for the PHAETON on-line multiprocessor computer

    International Nuclear Information System (INIS)

    Collart, J.M.; Fedon-Magnaud, C.; Lautard, J.J.

    1987-04-01

    The aim of the PHAETON project is the design of an on-line computer in order to increase the immediate knowledge of the main operating and safety parameters in power plants. A significant stage is the computation of the three-dimensional flux distribution. For cost and safety reasons, a computer based on a parallel microprocessor architecture has been studied. This paper presents a first approach to parallelized three-dimensional diffusion calculation. Computing software has been written and installed on a four-processor demonstrator. We present the realization in progress concerning the final equipment. 8 refs

  9. Establishing Substantial Equivalence: Transcriptomics

    Science.gov (United States)

    Baudo, María Marcela; Powers, Stephen J.; Mitchell, Rowan A. C.; Shewry, Peter R.

    Regulatory authorities in Western Europe require transgenic crops to be substantially equivalent to conventionally bred forms if they are to be approved for commercial production. One way to establish substantial equivalence is to compare the transcript profiles of developing grain and other tissues of transgenic and conventionally bred lines, in order to identify any unintended effects of the transformation process. We present detailed protocols for transcriptomic comparisons of developing wheat grain and leaf material, and illustrate their use by reference to our own studies of lines transformed to express additional gluten protein genes controlled by their own endosperm-specific promoters. The results show that the transgenes present in these lines (which included those encoding marker genes) did not have any significant unpredicted effects on the expression of endogenous genes and that the transgenic plants were therefore substantially equivalent to the corresponding parental lines.

  10. Passing in Command Line Arguments and Parallel Cluster/Multicore Batching in R with batch.

    Science.gov (United States)

    Hoffmann, Thomas J

    2011-03-01

    It is often useful to rerun a command line R script with some slight change in the parameters used to run it - a new set of parameters for a simulation, a different dataset to process, etc. The R package batch provides a means to easily pass multiple command line options, including vectors of values in the usual R format, into R. The same script can be set up to run things in parallel via different command line arguments. The R package batch also simplifies this parallel batching by allowing one to use R and an R-like syntax for arguments to spread a script across a cluster or local multicore/multiprocessor computer, with automated syntax for several popular cluster types. Finally, it provides a means to aggregate the results of multiple processes run on a cluster.
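
    The R package's own syntax is not reproduced here; as a language-neutral illustration of the same pattern (parameters passed on the command line, one run per parameter set spread over local cores), a hypothetical Python equivalent could look like the following, where run_simulation and the argument names are made up for the example:

```python
import argparse
from multiprocessing import Pool

def run_simulation(seed: int, n: int) -> float:
    # Placeholder for the actual per-run analysis.
    import random
    rng = random.Random(seed)
    return sum(rng.random() for _ in range(n)) / n

def main():
    parser = argparse.ArgumentParser(description="Rerun one analysis with different parameters.")
    parser.add_argument("--n", type=int, default=1000, help="samples per run")
    parser.add_argument("--seeds", type=int, nargs="+", default=[1, 2, 3, 4],
                        help="one run is launched per seed")
    parser.add_argument("--workers", type=int, default=4)
    args = parser.parse_args()

    # Spread the runs over local cores, then aggregate the results,
    # analogously to what the batch package does on a cluster.
    with Pool(args.workers) as pool:
        results = pool.starmap(run_simulation, [(s, args.n) for s in args.seeds])
    for seed, value in zip(args.seeds, results):
        print(f"seed={seed} mean={value:.4f}")

if __name__ == "__main__":
    main()
```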

  11. Parallel Hough Transform-Based Straight Line Detection and Its FPGA Implementation in Embedded Vision

    Directory of Open Access Journals (Sweden)

    Nam Ling

    2013-07-01

    Full Text Available Hough Transform has been widely used for straight line detection in low-definition and still images, but it suffers from execution time and resource requirements. Field Programmable Gate Arrays (FPGA) provide a competitive alternative for hardware acceleration to reap tremendous computing performance. In this paper, we propose a novel parallel Hough Transform (PHT) and FPGA architecture-associated framework for real-time straight line detection in high-definition videos. A resource-optimized Canny edge detection method with enhanced non-maximum suppression conditions is presented to suppress most possible false edges and obtain more accurate candidate edge pixels for subsequent accelerated computation. Then, a novel PHT algorithm exploiting spatial angle-level parallelism is proposed to upgrade computational accuracy by improving the minimum computational step. Moreover, the FPGA based multi-level pipelined PHT architecture optimized by spatial parallelism ensures real-time computation for 1,024 × 768 resolution videos without any off-chip memory consumption. This framework is evaluated on ALTERA DE2-115 FPGA evaluation platform at a maximum frequency of 200 MHz, and it can calculate straight line parameters in 15.59 ms on the average for one frame. Qualitative and quantitative evaluation results have validated the system performance regarding data throughput, memory bandwidth, resource, speed and robustness.

  12. Parallel Hough Transform-based straight line detection and its FPGA implementation in embedded vision.

    Science.gov (United States)

    Lu, Xiaofeng; Song, Li; Shen, Sumin; He, Kang; Yu, Songyu; Ling, Nam

    2013-07-17

    Hough Transform has been widely used for straight line detection in low-definition and still images, but it suffers from execution time and resource requirements. Field Programmable Gate Arrays (FPGA) provide a competitive alternative for hardware acceleration to reap tremendous computing performance. In this paper, we propose a novel parallel Hough Transform (PHT) and FPGA architecture-associated framework for real-time straight line detection in high-definition videos. A resource-optimized Canny edge detection method with enhanced non-maximum suppression conditions is presented to suppress most possible false edges and obtain more accurate candidate edge pixels for subsequent accelerated computation. Then, a novel PHT algorithm exploiting spatial angle-level parallelism is proposed to upgrade computational accuracy by improving the minimum computational step. Moreover, the FPGA based multi-level pipelined PHT architecture optimized by spatial parallelism ensures real-time computation for 1,024 × 768 resolution videos without any off-chip memory consumption. This framework is evaluated on ALTERA DE2-115 FPGA evaluation platform at a maximum frequency of 200 MHz, and it can calculate straight line parameters in 15.59 ms on the average for one frame. Qualitative and quantitative evaluation results have validated the system performance regarding data throughput, memory bandwidth, resource, speed and robustness.

  13. Real-Time Straight-Line Detection for XGA-Size Videos by Hough Transform with Parallelized Voting Procedures.

    Science.gov (United States)

    Guan, Jungang; An, Fengwei; Zhang, Xiangyu; Chen, Lei; Mattausch, Hans Jürgen

    2017-01-30

    The Hough Transform (HT) is a method for extracting straight lines from an edge image. The main limitations of the HT for usage in actual applications are computation time and storage requirements. This paper reports a hardware architecture for HT implementation on a Field Programmable Gate Array (FPGA) with parallelized voting procedure. The 2-dimensional accumulator array, namely the Hough space in parametric form (ρ, θ), for computing the strength of each line by a voting mechanism is mapped on a 1-dimensional array with regular increments of θ. Then, this Hough space is divided into a number of parallel parts. The computation of (ρ, θ) for the edge pixels and the voting procedure for straight-line determination are therefore executable in parallel. In addition, a synchronized initialization for the Hough space further increases the speed of straight-line detection, so that XGA video processing becomes possible. The designed prototype system has been synthesized on a DE4 platform with a Stratix-IV FPGA device. In the application of road-lane detection, the average processing speed of this HT implementation is 5.4ms per XGA-frame at 200 MHz working frequency.
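
    As a rough software illustration of the angle-level parallelism used in the records above, the sketch below splits the theta range into blocks and votes each block independently into its own accumulator slice; NumPy and a thread pool stand in for the FPGA voting units, and the 1-D accumulator mapping, fixed-point arithmetic and synchronized initialization of the papers are not modeled:

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def vote(edge_points, thetas, rho_max, rho_step=1.0):
    # Accumulate votes for one contiguous block of angles; blocks are
    # independent, so they can be voted in parallel.
    n_rho = int(2 * rho_max / rho_step) + 1
    acc = np.zeros((len(thetas), n_rho), dtype=np.int32)
    cos_t, sin_t = np.cos(thetas), np.sin(thetas)
    for x, y in edge_points:
        rho = x * cos_t + y * sin_t
        idx = np.round((rho + rho_max) / rho_step).astype(int)
        acc[np.arange(len(thetas)), idx] += 1
    return acc

if __name__ == "__main__":
    # Synthetic edge points lying on the line y = x.
    pts = [(i, i) for i in range(100)]
    rho_max = 200.0
    all_thetas = np.deg2rad(np.arange(0, 180, 1.0))
    blocks = np.array_split(all_thetas, 4)            # 4 angle-level partitions

    with ThreadPoolExecutor(max_workers=4) as pool:
        parts = list(pool.map(lambda th: vote(pts, th, rho_max), blocks))
    accumulator = np.vstack(parts)                    # reassemble the Hough space

    t_idx, r_idx = np.unravel_index(np.argmax(accumulator), accumulator.shape)
    print(f"strongest line: theta={np.rad2deg(all_thetas[t_idx]):.1f} deg, "
          f"rho={r_idx * 1.0 - rho_max:.1f}")
```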

  14. Parallel evolution under chemotherapy pressure in 29 breast cancer cell lines results in dissimilar mechanisms of resistance.

    Directory of Open Access Journals (Sweden)

    Bálint Tegze

    Full Text Available BACKGROUND: Developing chemotherapy resistant cell lines can help to identify markers of resistance. Instead of using a panel of highly heterogeneous cell lines, we assumed that a truly robust and convergent pattern of resistance can be identified in multiple parallel engineered derivatives of only a few parental cell lines. METHODS: Parallel cell populations were initiated for two breast cancer cell lines (MDA-MB-231 and MCF-7) and these were treated independently for 18 months with doxorubicin or paclitaxel. IC50 values against 4 chemotherapy agents were determined to measure cross-resistance. Chromosomal instability and karyotypic changes were determined by cytogenetics. TaqMan RT-PCR measurements were performed for resistance-candidate genes. Pgp activity was measured by FACS. RESULTS: Altogether 16 doxorubicin- and 13 paclitaxel-treated cell lines were developed, showing 2-46 fold and 3-28 fold increases in resistance, respectively. The RT-PCR and FACS analyses confirmed changes in tubulin isoform composition, TOP2A and MVP expression and activity of transport pumps (ABCB1, ABCG2). Cytogenetics showed fewer chromosomes but more structural aberrations in the resistant cells. CONCLUSION: We surpassed previous studies by developing a massive number of cell lines in parallel to investigate chemoresistance. While the heterogeneity caused evolution of multiple resistant clones with different resistance characteristics, the activation of only a few mechanisms was sufficient in one cell line to achieve resistance.

  15. A 32-channel lattice transmission line array for parallel transmit and receive MRI at 7 tesla.

    Science.gov (United States)

    Adriany, Gregor; Auerbach, Edward J; Snyder, Carl J; Gözübüyük, Ark; Moeller, Steen; Ritter, Johannes; Van de Moortele, Pierre-François; Vaughan, Tommy; Uğurbil, Kâmil

    2010-06-01

    Transmit and receive RF coil arrays have proven to be particularly beneficial for ultra-high-field MR. Transmit coil arrays enable such techniques as B1+ shimming to substantially improve transmit B1 homogeneity compared to conventional volume coil designs, and receive coil arrays offer enhanced parallel imaging performance and SNR. Concentric coil arrangements hold promise for developing transceiver arrays incorporating large numbers of coil elements. At magnetic field strengths of 7 tesla and higher, where the Larmor frequencies of interest can exceed 300 MHz, the coil array design must also overcome the problem of the coil conductor length approaching the RF wavelength. In this study, a novel concentric arrangement of resonance elements built from capacitively-shortened half-wavelength transmission lines is presented. This approach was utilized to construct an array with whole-brain coverage using 16 transceiver elements and 16 receive-only elements, resulting in a coil with a total of 16 transmit and 32 receive channels. (c) 2010 Wiley-Liss, Inc.

  16. Cubic systems with invariant affine straight lines of total parallel multiplicity seven

    Directory of Open Access Journals (Sweden)

    Alexandru Suba

    2013-12-01

    Full Text Available In this article, we study the planar cubic differential systems with invariant affine straight lines of total parallel multiplicity seven. We classify these systems according to their geometric properties encoded in the configurations of invariant straight lines. We show that there are only 17 different topological phase portraits in the Poincaré disc associated to this family of cubic systems, up to a reversal of the sense of their orbits, and we provide representatives of every class modulo an affine change of variables and a rescaling of the time variable.

  17. Airport object extraction based on visual attention mechanism and parallel line detection

    Science.gov (United States)

    Lv, Jing; Lv, Wen; Zhang, Libao

    2017-10-01

    Target extraction is one of the important aspects of remote sensing image analysis and processing, with wide applications in image compression, target tracking, target recognition and change detection. Among different targets, the airport has attracted more and more attention due to its significance in military and civilian applications. In this paper, we propose a novel and reliable airport object extraction model combining a visual attention mechanism and a parallel line detection algorithm. First, a novel saliency analysis model for remote sensing images containing airport regions is proposed to carry out statistical saliency feature analysis. The proposed model can precisely extract the most salient region and effectively suppress background interference. Then, prior geometric knowledge is analyzed and airport runways, which contain two parallel lines of similar length, are detected efficiently. Finally, we use the improved Otsu threshold segmentation method to segment and extract the airport regions from the saliency map of the remote sensing images. The experimental results demonstrate that the proposed model outperforms existing saliency analysis models and shows good performance in the detection of airports.
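
    As one concrete piece of such a pipeline, a minimal NumPy implementation of the classic (unimproved) Otsu threshold is sketched below; the saliency analysis and parallel-line runway detection stages of the paper are not reproduced, and the synthetic test image is an assumption for the example:

```python
import numpy as np

def otsu_threshold(gray: np.ndarray) -> int:
    # Classic Otsu: choose the threshold that maximizes the between-class
    # variance of the grey-level histogram.
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    prob = hist / hist.sum()
    omega = np.cumsum(prob)                    # class-0 probability per threshold
    mu = np.cumsum(prob * np.arange(256))      # cumulative mean
    mu_total = mu[-1]
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (mu_total * omega - mu) ** 2 / (omega * (1.0 - omega))
    sigma_b = np.nan_to_num(sigma_b)
    return int(np.argmax(sigma_b))

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Synthetic image: dark background plus a brighter "runway" band.
    img = rng.normal(60, 10, (128, 128))
    img[40:60, :] += 120
    img = np.clip(img, 0, 255).astype(np.uint8)
    t = otsu_threshold(img)
    mask = img > t
    print(f"Otsu threshold = {t}, foreground fraction = {mask.mean():.2f}")
```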

  18. Implementation of a microcomputer based distance relay for parallel transmission lines

    International Nuclear Information System (INIS)

    Phadke, A.G.; Jihuang, L.

    1986-01-01

    Distance relaying for parallel transmission lines is a difficult application problem with conventional phase and ground distance relays. It is known that for cross-country faults involving dissimilar phases and ground, three phase tripping may result. This paper summarizes a newly developed microcomputer based relay which is capable of classifying the cross-country fault correctly. The paper describes the principle of operation and results of laboratory tests of this relay

  19. Integrated configurable equipment selection and line balancing for mass production with serial-parallel machining systems

    Science.gov (United States)

    Battaïa, Olga; Dolgui, Alexandre; Guschinsky, Nikolai; Levin, Genrikh

    2014-10-01

    Solving equipment selection and line balancing problems together allows better line configurations to be reached and avoids local optimal solutions. This article considers jointly these two decision problems for mass production lines with serial-parallel workplaces. This study was motivated by the design of production lines based on machines with rotary or mobile tables. Nevertheless, the results are more general and can be applied to assembly and production lines with similar structures. The designers' objectives and the constraints are studied in order to suggest a relevant mathematical model and an efficient optimization approach to solve it. A real case study is used to validate the model and the developed approach.

  20. Parallel Evolution under Chemotherapy Pressure in 29 Breast Cancer Cell Lines Results in Dissimilar Mechanisms of Resistance

    DEFF Research Database (Denmark)

    Tegze, Balint; Szallasi, Zoltan Imre; Haltrich, Iren

    2012-01-01

    Background: Developing chemotherapy resistant cell lines can help to identify markers of resistance. Instead of using a panel of highly heterogeneous cell lines, we assumed that truly robust and convergent pattern of resistance can be identified in multiple parallel engineered derivatives of only...

  1. Data Parallel Line Relaxation (DPLR) Code User Manual: Acadia - Version 4.01.1

    Science.gov (United States)

    Wright, Michael J.; White, Todd; Mangini, Nancy

    2009-01-01

    The Data-Parallel Line Relaxation (DPLR) code is a computational fluid dynamics (CFD) solver that was developed at NASA Ames Research Center to help mission support teams generate high-value predictive solutions for hypersonic flow field problems. The DPLR Code Package is an MPI-based, parallel, full three-dimensional Navier-Stokes CFD solver with generalized models for finite-rate reaction kinetics, thermal and chemical non-equilibrium, accurate high-temperature transport coefficients, and ionized flow physics incorporated into the code. DPLR also includes a large selection of generalized realistic surface boundary conditions and links to enable loose coupling with external thermal protection system (TPS) material response and shock layer radiation codes.

  2. 'Ogura'-based 'CMS' lines with different nuclear backgrounds of cabbage revealed substantial diversity at morphological and molecular levels.

    Science.gov (United States)

    Parkash, Chander; Kumar, Sandeep; Singh, Rajender; Kumar, Ajay; Kumar, Satish; Dey, Shyam Sundar; Bhatia, Reeta; Kumar, Raj

    2018-01-01

    A comprehensive study of characterization and genetic diversity was carried out in 16 'Ogura'-based 'CMS' lines of cabbage using 14 agro-morphological traits and 29 SSR markers. Agro-morphological characterization revealed considerable variation in the different horticultural traits studied. The genotype ZHA-2 performed better for most of the economically important quantitative traits. Further, gross head weight (0.76), head length (0.60) and head width (0.83) showed significant positive correlations with net head weight. A dendrogram based on 10 quantitative traits exhibited considerable diversity among the different CMS lines, and principal component analysis (PCA) indicated that net and gross head weight, and head length and width, are the main components of divergence between the 16 CMS lines of cabbage. In the molecular study, a total of 58 alleles were amplified by the 29 SSR primers, averaging 2.0 alleles per locus. High mean values of Shannon's information index (0.62), expected (0.45) and observed (0.32) heterozygosity and polymorphic information content (0.35) indicated substantial polymorphism. A dendrogram based on Jaccard's similarity coefficient formed two major groups and eight sub-groups, which revealed substantial diversity among the different CMS lines. Overall, based on the agro-morphological and molecular studies, the genotypes RRMA, ZHA-2 and RCA were found to be the most divergent. Hence, they have immense potential in future breeding programs for high-yielding hybrid development in cabbage.

  3. Line filter design of parallel interleaved VSCs for high power wind energy conversion systems

    DEFF Research Database (Denmark)

    Gohil, Ghanshyamsinh Vijaysinh; Bede, Lorand; Teodorescu, Remus

    2015-01-01

    The Voltage Source Converters (VSCs) are often connected in parallel in a Wind Energy Conversion System (WECS) to match the high power rating of the modern wind turbines. The effect of the interleaved carriers on the harmonic performance of the parallel connected VSCs is analyzed in this paper...... limit. In order to achieve the desired filter performance with optimal values of the filter parameters, the use of a LC trap branch with the conventional LCL filter is proposed. The expressions for the resonant frequencies of the proposed line filter are derived and used in the design to selectively...
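
    The truncated record above omits the derived expressions; for orientation, the resonant frequency of the underlying LCL section (converter-side inductance L_c, grid-side inductance L_g, filter capacitance C_f, generic symbols rather than the paper's notation) takes the standard form

```latex
f_{\mathrm{res}} = \frac{1}{2\pi}\sqrt{\frac{L_c + L_g}{L_c\, L_g\, C_f}} ,
```

    and the proposed LC trap adds a series-resonant branch tuned near the dominant switching-harmonic cluster so that the remaining LCL components can be kept small (a description of typical trap-filter practice, not the paper's specific design equations).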

  4. Model-driven product line engineering for mapping parallel algorithms to parallel computing platforms

    NARCIS (Netherlands)

    Arkin, Ethem; Tekinerdogan, Bedir

    2016-01-01

    Mapping parallel algorithms to parallel computing platforms requires several activities such as the analysis of the parallel algorithm, the definition of the logical configuration of the platform, the mapping of the algorithm to the logical configuration platform and the implementation of the

  5. Line-plane broadcasting in a data communications network of a parallel computer

    Science.gov (United States)

    Archer, Charles J.; Berg, Jeremy E.; Blocksome, Michael A.; Smith, Brian E.

    2010-06-08

    Methods, apparatus, and products are disclosed for line-plane broadcasting in a data communications network of a parallel computer, the parallel computer comprising a plurality of compute nodes connected together through the network, the network optimized for point to point data communications and characterized by at least a first dimension, a second dimension, and a third dimension, that include: initiating, by a broadcasting compute node, a broadcast operation, including sending a message to all of the compute nodes along an axis of the first dimension for the network; sending, by each compute node along the axis of the first dimension, the message to all of the compute nodes along an axis of the second dimension for the network; and sending, by each compute node along the axis of the second dimension, the message to all of the compute nodes along an axis of the third dimension for the network.
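
    A minimal mpi4py sketch of the dimension-by-dimension broadcast described above is given below; the 2 x 2 x 2 torus shape and the payload are placeholder assumptions, and this only illustrates the line-plane idea, not the patented implementation:

```python
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
# A 2 x 2 x 2 torus; run with exactly 8 ranks, e.g. `mpiexec -n 8 python bcast3d.py`.
cart = comm.Create_cart(dims=[2, 2, 2], periods=[True, True, True], reorder=True)

# Sub-communicators containing the ranks along each single dimension.
line_x = cart.Sub([True, False, False])
line_y = cart.Sub([False, True, False])
line_z = cart.Sub([False, False, True])

# Only the originating node (cart rank 0, coordinates (0,0,0)) holds the message.
msg = np.arange(4, dtype=np.float64) if cart.Get_rank() == 0 else np.empty(4)

# Stage 1: broadcast along the first dimension (fills a line),
# Stage 2: every node on that line broadcasts along the second dimension (a plane),
# Stage 3: every node in that plane broadcasts along the third dimension (the volume).
for sub in (line_x, line_y, line_z):
    sub.Bcast(msg, root=0)

print(f"cart rank {cart.Get_rank()} received {msg}")
```

    After the three stages every rank holds the message, since rank 0 of each sub-communicator is the node with coordinate zero along that dimension.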

  6. Transmission line theory for long plasma production by radio frequency discharges between parallel-plate electrodes

    International Nuclear Information System (INIS)

    Nonaka, S.

    1991-01-01

    In order to find a radio frequency (RF) eigenmode of the waves that produce a plasma between a pair of long dielectric-covered parallel-plate RF electrodes, this paper analyzes all normal modes propagating along the electrodes by solving Maxwell's equations. The result shows that only an odd surface wave mode will produce the plasma under usual experimental conditions, which provides a basic transmission line theory for the use of such long electrodes in the on-line mass production of amorphous silicon solar cells.

  7. A study of parallelism of the occlusal plane and ala-tragus line.

    Science.gov (United States)

    Sadr, Katayoun; Sadr, Makan

    2009-01-01

    Orientation of the occlusal plane is one of the most important clinical procedures in prosthodontic rehabilitation of edentulous patients. The aim of this study was to define the best posterior reference point of the ala-tragus line for orientation of the occlusal plane in complete denture fabrication. Fifty-three dental students (27 females and 26 males) with complete natural dentition and Angle's Class I occlusal relationship were selected. The subjects were photographed in natural head position while clenching on a Fox plane. After tracing the photographs, the angles between the following lines were measured: the occlusal plane (Fox plane) and the superior border of the ala-tragus, the occlusal plane (Fox plane) and the middle of the ala-tragus, as well as the occlusal plane (Fox plane) and the inferior border of the ala-tragus. Descriptive statistics, the one-sample t-test and the independent t-test were used. A P value less than 0.05 was considered significant. There was no parallelism between the occlusal plane and the ala-tragus line with any of the three posterior ends, and the one-sample t-test showed that the angles between them were significantly different from zero. The superior border of the tragus is suggested as the posterior reference for the ala-tragus line.

  8. Wideband Dual-Polarization Patch Antenna Array With Parallel Strip Line Balun Feeding

    DEFF Research Database (Denmark)

    Zhang, Jin; Lin, Xianqi; Nie, Liying

    2016-01-01

    A wideband dual-polarization patch antenna array is proposed in this letter. The array is fed by a parallel strip line balun, which is adopted to generate a 180° phase shift over a wide frequency range. In addition, this balun has a simple structure, very small phase shift error, and good port isolation... is higher than 30 dB. The simulation and measurement results turn out to be similar. This antenna array can be used in TD-LTE base stations, and the design methods are also useful for other wideband microstrip antennas.

  9. Parallel implementation of the PHOENIX generalized stellar atmosphere program. II. Wavelength parallelization

    International Nuclear Information System (INIS)

    Baron, E.; Hauschildt, Peter H.

    1998-01-01

    We describe an important addition to the parallel implementation of our generalized nonlocal thermodynamic equilibrium (NLTE) stellar atmosphere and radiative transfer computer program PHOENIX. In a previous paper in this series we described data and task parallel algorithms we have developed for radiative transfer, spectral line opacity, and NLTE opacity and rate calculations. These algorithms divided the work spatially or by spectral lines, that is, distributing the radial zones, individual spectral lines, or characteristic rays among different processors and employ, in addition, task parallelism for logically independent functions (such as atomic and molecular line opacities). For finite, monotonic velocity fields, the radiative transfer equation is an initial value problem in wavelength, and hence each wavelength point depends upon the previous one. However, for sophisticated NLTE models of both static and moving atmospheres needed to accurately describe, e.g., novae and supernovae, the number of wavelength points is very large (200,000 - 300,000) and hence parallelization over wavelength can lead both to considerable speedup in calculation time and the ability to make use of the aggregate memory available on massively parallel supercomputers. Here, we describe an implementation of a pipelined design for the wavelength parallelization of PHOENIX, where the necessary data from the processor working on a previous wavelength point is sent to the processor working on the succeeding wavelength point as soon as it is known. Our implementation uses a MIMD design based on a relatively small number of standard message passing interface (MPI) library calls and is fully portable between serial and parallel computers. copyright 1998 The American Astronomical Society
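
    A minimal mpi4py sketch of the pipelined hand-off described above is given below: each rank owns a round-robin subset of wavelength points, waits for the state of the previous point from its upstream neighbour, solves its own point, and forwards the result. The placeholder solve and the round-robin distribution are assumptions for the example, not PHOENIX code:

```python
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()   # run with e.g. `mpiexec -n 4 ...`

def solve_wavelength_point(index, upstream_state):
    # Placeholder for the radiative-transfer solve at one wavelength point;
    # the real solver needs the converged state of the previous (bluer) point.
    return upstream_state + 1

N_POINTS = 16           # wavelength points, dealt out round-robin over the ranks
state = 0               # initial condition at the first wavelength point

for i in range(N_POINTS):
    if i % size != rank:
        continue                                   # another rank owns this point
    if i > 0 and size > 1:
        # Wait for the rank that handled the previous wavelength point.
        state = comm.recv(source=(i - 1) % size, tag=i - 1)
    state = solve_wavelength_point(i, state)
    if i < N_POINTS - 1 and size > 1:
        # Hand the updated state to the rank that owns the next point.
        comm.send(state, dest=(i + 1) % size, tag=i)

if rank == (N_POINTS - 1) % size:
    print(f"pipeline finished on rank {rank}, final state = {state}")
```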

  10. Parallel superconducting strip-line detectors: reset behaviour in the single-strip switch regime

    International Nuclear Information System (INIS)

    Casaburi, A; Heath, R M; Tanner, M G; Hadfield, R H; Cristiano, R; Ejrnaes, M; Nappi, C

    2014-01-01

    Superconducting strip-line detectors (SSLDs) are an important emerging technology for the detection of single molecules in time-of-flight mass spectrometry (TOF-MS). We present an experimental investigation of a SSLD laid out in a parallel configuration, designed to address selected single strip-lines operating in the single-strip switch regime. Fast laser pulses were tightly focused onto the device, allowing controllable nucleation of a resistive region at a specific location and study of the subsequent device response dynamics. We observed that in this regime, although the strip-line returns to the superconducting state after triggering, no effective recovery of the bias current occurs, in qualitative agreement with a phenomenological circuit simulation that we performed. Moreover, from theoretical considerations and by looking at the experimental pulse amplitude distribution histogram, we have the first confirmation of the fact that the phenomenological London model governs the current redistribution in these large area devices also after detection events. (paper)

  11. Parallelism at Cern: real-time and off-line applications in the GP-MIMD2 project

    International Nuclear Information System (INIS)

    Calafiura, P.

    1997-01-01

    A wide range of general purpose high-energy physics applications, ranging from Monte Carlo simulation to data acquisition, from interactive data analysis to on-line filtering, have been ported or developed and run in parallel on the IBM SP-2 and Meiko CS-2 large multi-processor machines at CERN. The ESPRIT project GP-MIMD2 has been a catalyst for the interest in parallel computing at CERN. The project provided the 128-processor Meiko CS-2 system that is now successfully integrated in the CERN computing environment. The CERN experiment NA48 has been involved in the GP-MIMD2 project since the beginning. NA48 physicists run, as part of their day-to-day work, simulation and analysis programs parallelized using the message passing interface MPI. The CS-2 is also a vital component of the experiment's data acquisition system and will be used to calibrate, in real time, the 13000-channel liquid krypton calorimeter. (orig.)

  12. VERY STRONG EMISSION-LINE GALAXIES IN THE WFC3 INFRARED SPECTROSCOPIC PARALLEL SURVEY AND IMPLICATIONS FOR HIGH-REDSHIFT GALAXIES

    Energy Technology Data Exchange (ETDEWEB)

    Atek, H.; Colbert, J.; Shim, H. [Spitzer Science Center, Caltech, Pasadena, CA 91125 (United States); Siana, B.; Bridge, C. [Department of Astronomy, Caltech, Pasadena, CA 91125 (United States); Scarlata, C. [Department of Astronomy, University of Minnesota-Twin Cities, Minneapolis, MN 55455 (United States); Malkan, M.; Ross, N. R. [Department of Physics and Astronomy, University of California, Los Angeles, CA (United States); McCarthy, P.; Dressler, A.; Hathi, N. P. [Observatories of the Carnegie Institution for Science, Pasadena, CA 91101 (United States); Teplitz, H. [Infrared Processing and Analysis Center, Caltech, Pasadena, CA 91125 (United States); Henry, A.; Martin, C. [Department of Physics, University of California, Santa Barbara, CA 93106 (United States); Bunker, A. J. [Department of Physics, University of Oxford, Denys Wilkinson Building, Keble Road, Oxford OX1 3RH (United Kingdom); Fosbury, R. A. E. [Space Telescope-European Coordinating Facility, Garching bei Muenchen (Germany)

    2011-12-20

    The WFC3 Infrared Spectroscopic Parallel Survey uses the Hubble Space Telescope (HST) infrared grism capabilities to obtain slitless spectra of thousands of galaxies over a wide redshift range including the peak of star formation history of the universe. We select a population of very strong emission-line galaxies with rest-frame equivalent widths (EWs) higher than 200 Å. A total of 176 objects are found over the redshift range 0.35 < z < 2.3 in the 180 arcmin² area that we have analyzed so far. This population consists of young and low-mass starbursts with high specific star formation rates (sSFR). After spectroscopic follow-up of one of these galaxies with Keck/Low Resolution Imaging Spectrometer, we report the detection at z = 0.7 of an extremely metal-poor galaxy with 12 + log(O/H) = 7.47 ± 0.11. After estimating the active galactic nucleus fraction in the sample, we show that the high-EW galaxies have higher sSFR than normal star-forming galaxies at any redshift. We find that the nebular emission lines can substantially affect the total broadband flux density with a median brightening of 0.3 mag, with some examples of line contamination producing brightening of up to 1 mag. We show that the presence of strong emission lines in low-z galaxies can mimic the color-selection criteria used in the z ≈ 8 dropout surveys. In order to effectively remove low-redshift interlopers, deep optical imaging is needed, at least 1 mag deeper than the bands in which the objects are detected. Without deep optical data, most of the interlopers cannot be ruled out in the wide shallow HST imaging surveys. Finally, we empirically demonstrate that strong nebular lines can lead to an overestimation of the mass and the age of galaxies derived from fitting of their spectral energy distribution (SED). Without removing emission lines, the age and the stellar mass estimates are overestimated by a factor of 2 on average and up to a factor of 10 for the high-EW galaxies.

  13. Operating system design of parallel computer for on-line management of nuclear pressurised water reactor cores

    International Nuclear Information System (INIS)

    Gougam, F.

    1991-04-01

    This study is part of the PHAETON project, which aims at increasing the knowledge of the safety parameters of a PWR core and reducing operating margins during the reactor cycle. The on-line system associates a simulator process, which computes the three-dimensional flux distribution, with an acquisition process that collects reactor core parameters from the central instrumentation. The 3D flux calculation is the most time consuming. So, for cost and safety reasons, the PHAETON project proposes an approach which is to parallelize the 3D diffusion calculation and to use a computer based on a parallel processor architecture. This paper presents the design of the operating system on which the application is executed. The proposed routine interface includes the main operations necessary for programming a real-time and parallel application. The primitives include task management, data transfer, synchronisation by event signalling, and rendez-vous mechanisms. The proposed primitives use standard software such as a real-time kernel and the UNIX operating system.

  14. Design of a chemical batch plant : a study of dedicated parallel lines with intermediate storage and the plant performance

    OpenAIRE

    Verbiest, Floor; Cornelissens, Trijntje; Springael, Johan

    2016-01-01

    Abstract: Production plants worldwide face huge challenges in satisfying high service levels and outperforming competition. These challenges require appropriate strategic decisions on plant design and production strategies. In this paper, we focus on multiproduct chemical batch plants, which are typically equipped with multiple production lines and intermediate storage tanks. First we extend the existing MI(N)LP design models with the concept of parallel production lines, and optimise the as...

  15. Attempt to identify the functional areas of the cerebral cortex on CT slices parallel to the orbito-meatal line

    Energy Technology Data Exchange (ETDEWEB)

    Tanabe, Hirotaka; Okuda, Junichiro; Nishikawa, Takashi; Nishimura, Tsuyoshi (Osaka Univ. (Japan). Faculty of Medicine); Shiraishi, Junzo

    1982-06-01

    In order to identify the functional brain areas, such as Broca's area, on computed tomography slices parallel to the orbito-meatal line, the numbers of Brodmann's cortical mapping were shown on a diagram of representative brain sections parallel to the orbito-meatal line. Also, we described a method, using cerebral sulci as anatomical landmarks, for projecting lesions shown by CT scan onto the lateral brain diagram. The procedures were as follows. The distribution of lesions on CT slices was determined by the identification of major cerebral sulci and fissures, such as the Sylvian fissure, the central sulcus, and the superior frontal sulcus. Those lesions were then projected onto the lateral diagram by comparing each CT slice with the horizontal diagrams of brain sections. The method was demonstrated in three cases developing neuropsychological symptoms.

  16. An attempt to identify the functional areas of the cerebral cortex on CT slices parallel to the orbito-meatal line

    International Nuclear Information System (INIS)

    Tanabe, Hirotaka; Okuda, Junichiro; Nishikawa, Takashi; Nishimura, Tsuyoshi; Shiraishi, Junzo.

    1982-01-01

    In order to identify the functional brain areas, such as Broca's area, on computed tomography slices parallel to the orbito-meatal line, the numbers of Brodmann's cortical mapping were shown on a diagram of representative brain sections parallel to the orbito-meatal line. Also, we described a method, using cerebral sulci as anatomical landmarks, for projecting lesions shown by CT scan onto the lateral brain diagram. The procedures were as follows. The distribution of lesions on CT slices was determined by the identification of major cerebral sulci and fissures, such as the Sylvian fissure, the central sulcus, and the superior frontal sulcus. Those lesions were then projected onto the lateral diagram by comparing each CT slice with the horizontal diagrams of brain sections. The method was demonstrated in three cases developing neuropsychological symptoms. (author)

  17. 600 GHz resonant mode in a parallel array of Josephson tunnel junctions connected by superconducting microstrip lines

    DEFF Research Database (Denmark)

    Kaplunenko, V. K.; Larsen, Britt Hvolbæk; Mygind, Jesper

    1994-01-01

    We report on experimental and numerical investigations of a resonant step observed at a voltage corresponding to 600 GHz in the dc current-voltage characteristic of a parallel array of 20 identical small Nb/Al2O3/Nb Josephson junctions interconnected by short sections of superconducting microstrip line. The junctions...... are mutually phase locked due to collective interaction with the line sections excited close to the half wavelength resonance. The phase locking range can be adjusted by means of an external dc magnetic field and the step size varies periodically with the magnetic field. The largest step corresponds...

  18. Advanced mathematical on-line analysis in nuclear experiments. Usage of parallel computing CUDA routines in standard root analysis

    Science.gov (United States)

    Grzeszczuk, A.; Kowalski, S.

    2015-04-01

    Compute Unified Device Architecture (CUDA) is a parallel computing platform developed by Nvidia to increase the speed of graphics processing by performing calculations in parallel. The success of this solution has opened General-Purpose Graphics Processing Unit (GPGPU) technology to applications not coupled with graphics. GPGPU systems can be applied as an effective tool for reducing the huge amount of data in pulse shape analysis measurements, either by on-line recalculation or by a very fast compression scheme. The simplified structure of the CUDA system and the programming model, illustrated with the example of an Nvidia GeForce GTX 580 card, are presented in our poster contribution, both as a stand-alone version and as a ROOT application.
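
    As a rough illustration of this kind of on-line GPU data reduction, the sketch below reduces a block of digitized pulses to a few shape parameters per event; CuPy is used as a stand-in for a hand-written CUDA kernel (with a NumPy fallback so the sketch runs without a GPU), and the pulse model and features are assumptions for the example, not the poster's ROOT/CUDA code:

```python
import numpy as np
try:
    import cupy as xp   # GPU path; assumes an NVIDIA device with CuPy installed
except ImportError:
    import numpy as xp  # CPU fallback so the sketch still runs without CUDA

def reduce_pulses(waveforms):
    """Reduce raw digitized pulses to a few shape parameters per event."""
    w = xp.asarray(waveforms, dtype=xp.float32)
    w = w - w[:, :50].mean(axis=1, keepdims=True)   # baseline subtraction
    amplitude = w.max(axis=1)
    total = w.sum(axis=1)
    tail = w[:, 200:].sum(axis=1)
    return amplitude, tail / total                  # amplitude + tail/total ratio

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # 10,000 synthetic pulses, 512 samples each (placeholder for detector data).
    t = np.arange(512)
    pulses = np.exp(-(t - 100) / 80.0) * (t >= 100) + rng.normal(0, 0.01, (10000, 512))
    amp, psd = reduce_pulses(pulses)
    print("mean amplitude:", float(amp.mean()), "mean tail fraction:", float(psd.mean()))
```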

  19. MOEA based design of decentralized controllers for LFC of interconnected power systems with nonlinearities, AC-DC parallel tie-lines and SMES units

    International Nuclear Information System (INIS)

    Ganapathy, S.; Velusami, S.

    2010-01-01

    A new design of Multi-Objective Evolutionary Algorithm based decentralized controllers for load-frequency control of interconnected power systems with Governor Dead Band and Generation Rate Constraint nonlinearities, AC-DC parallel tie-lines and Superconducting Magnetic Energy Storage (SMES) units, is proposed in this paper. The HVDC link is used as system interconnection in parallel with AC tie-line to effectively damp the frequency oscillations of AC system while the SMES unit provides bulk energy storage and release, thereby achieving combined benefits. The proposed controller satisfies two main objectives, namely, minimum Integral Squared Error of the system output and maximum closed-loop stability of the system. Simulation studies are conducted on a two area interconnected power system with nonlinearities, AC-DC tie-lines and SMES units. Results indicate that the proposed controller improves the transient responses and guarantees the closed-loop stability of the overall system even in the presence of system nonlinearities and with parameter changes.

  20. GPU-based, parallel-line, omni-directional integration of measured acceleration field to obtain the 3D pressure distribution

    Science.gov (United States)

    Wang, Jin; Zhang, Cao; Katz, Joseph

    2016-11-01

    A PIV-based method to reconstruct the volumetric pressure field by direct integration of the 3D material acceleration along multiple directions has been developed. Extending the 2D virtual-boundary omni-directional method (Omni2D, Liu & Katz, 2013), the new 3D parallel-line omni-directional method (Omni3D) integrates the material acceleration along parallel lines aligned in multiple directions. Their angles are set by a spherical virtual grid. The integration is parallelized on a Tesla K40c GPU, which reduced the computing time from three hours to one minute for a single realization. To validate its performance, this method is utilized to calculate the 3D pressure fields in isotropic turbulence and channel flow using the JHU DNS Databases (http://turbulence.pha.jhu.edu). Both integration of the DNS acceleration and of the acceleration from synthetic 3D particles are tested. Results are compared to other methods, e.g. the solution of the Pressure Poisson Equation (PPE, Ghaemi et al., 2012) with Bernoulli-based Dirichlet boundary conditions, and the Omni2D method. The error in the Omni3D prediction is uniformly low, and its sensitivity to acceleration errors is local. It agrees with the PPE/Bernoulli prediction away from the Dirichlet boundary. The Omni3D method is also applied to experimental data obtained using tomographic PIV, and the results are correlated with deformation of a compliant wall. ONR.
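
    For orientation, the relation that such omni-directional methods integrate is the standard link between the pressure gradient and the measured material acceleration (whether the viscous term is retained depends on the data; the authors' virtual-grid weighting is not reproduced here):

```latex
\nabla p = -\rho\,\frac{D\mathbf{u}}{Dt} + \mu \nabla^{2}\mathbf{u},
\qquad
p(\mathbf{x}) \approx p(\mathbf{x}_0) + \int_{\mathbf{x}_0}^{\mathbf{x}} \nabla p \cdot d\boldsymbol{\ell},
```

    with the line integral evaluated along many parallel lines whose directions sample a spherical grid, and the results averaged at each point.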

  1. Advanced mathematical on-line analysis in nuclear experiments. Usage of parallel computing CUDA routines in standard root analysis

    Directory of Open Access Journals (Sweden)

    Grzeszczuk A.

    2015-01-01

    Full Text Available Compute Unified Device Architecture (CUDA) is a parallel computing platform developed by Nvidia to increase the speed of graphics processing by performing calculations in parallel. The success of this solution has opened General-Purpose Graphics Processing Unit (GPGPU) technology to applications not coupled with graphics. GPGPU systems can be applied as an effective tool for reducing the huge amount of data in pulse shape analysis measurements, either by on-line recalculation or by a very fast compression scheme. The simplified structure of the CUDA system and the programming model, illustrated with the example of an Nvidia GeForce GTX 580 card, are presented in our poster contribution, both as a stand-alone version and as a ROOT application.

  2. Parallel mRNA, proteomics and miRNA expression analysis in cell line models of the intestine.

    Science.gov (United States)

    O'Sullivan, Finbarr; Keenan, Joanne; Aherne, Sinead; O'Neill, Fiona; Clarke, Colin; Henry, Michael; Meleady, Paula; Breen, Laura; Barron, Niall; Clynes, Martin; Horgan, Karina; Doolan, Padraig; Murphy, Richard

    2017-11-07

    To identify miRNA-regulated proteins differentially expressed between Caco2 and HT-29: two principal cell line models of the intestine. Exponentially growing Caco-2 and HT-29 cells were harvested and prepared for mRNA, miRNA and proteomic profiling. mRNA microarray profiling analysis was carried out using the Affymetrix GeneChip Human Gene 1.0 ST array. miRNA microarray profiling analysis was carried out using the Affymetrix Genechip miRNA 3.0 array. Quantitative Label-free LC-MS/MS proteomic analysis was performed using a Dionex Ultimate 3000 RSLCnano system coupled to a hybrid linear ion trap/Orbitrap mass spectrometer. Peptide identities were validated in Proteome Discoverer 2.1 and were subsequently imported into Progenesis QI software for further analysis. Hierarchical cluster analysis for all three parallel datasets (miRNA, proteomics, mRNA) was conducted in the R software environment using the Euclidean distance measure and Ward's clustering algorithm. The prediction of miRNA and oppositely correlated protein/mRNA interactions was performed using TargetScan 6.1. GO biological process, molecular function and cellular component enrichment analysis was carried out for the DE miRNA, protein and mRNA lists via the Pathway Studio 11.3 Web interface using their Mammalian database. Differential expression (DE) profiling comparing the intestinal cell lines HT-29 and Caco-2 identified 1795 Genes, 168 Proteins and 160 miRNAs as DE between the two cell lines. At the gene level, 1084 genes were upregulated and 711 were downregulated in the Caco-2 cell line relative to the HT-29 cell line. At the protein level, 57 proteins were found to be upregulated and 111 downregulated in the Caco-2 cell line relative to the HT-29 cell line. Finally, at the miRNAs level, 104 were upregulated and 56 downregulated in the Caco-2 cell line relative to the HT-29 cell line. Gene ontology (GO) analysis of the DE mRNA identified cell adhesion, migration and ECM organization, cellular lipid

  3. Parallel heater system for subsurface formations

    Science.gov (United States)

    Harris, Christopher Kelvin [Houston, TX; Karanikas, John Michael [Houston, TX; Nguyen, Scott Vinh [Houston, TX

    2011-10-25

    A heating system for a subsurface formation is disclosed. The system includes a plurality of substantially horizontally oriented or inclined heater sections located in a hydrocarbon containing layer in the formation. At least a portion of two of the heater sections are substantially parallel to each other. The ends of at least two of the heater sections in the layer are electrically coupled to a substantially horizontal, or inclined, electrical conductor oriented substantially perpendicular to the ends of the at least two heater sections.

  4. Flux-line-cutting losses in type-II superconductors

    International Nuclear Information System (INIS)

    Clem, J.R.

    1982-01-01

    Energy dissipation associated with flux-line cutting (intersection and cross-joining of adjacent nonparallel vortices) is considered theoretically. The flux-line-cutting contribution to the dissipation per unit volume, arising from mutual annihilation of transverse magnetic flux, is identified as J∥E∥, where J∥ and E∥ are the components of the current density and the electric field parallel to the magnetic induction. The dynamical behavior of the magnetic structure at the flux-line-cutting threshold is shown to be governed by a special critical-state model similar to that proposed by previous authors. The resulting flux-line-cutting critical-state model, characterized in planar geometry by a parallel critical current density Jc∥ or a critical angle gradient kc, is used to calculate predicted hysteretic ac flux-line-cutting losses in type-II superconductors in which the flux pinning is weak. The relation of the theory to previous experiments is discussed

  5. Multi-objective optimization algorithms for mixed model assembly line balancing problem with parallel workstations

    Directory of Open Access Journals (Sweden)

    Masoud Rabbani

    2016-12-01

    Full Text Available This paper deals with the mixed model assembly line (MMAL) balancing problem of type-I. In MMALs several products are made on an assembly line while the similarity of these products is high. As a result, it is possible to assemble several types of products simultaneously without any additional setup times. The problem has some particular features such as parallel workstations and precedence constraints in dynamic periods, in which each period also affects its next period. The research intends to reduce the number of workstations and maximize the workload smoothness between workstations. Dynamic periods are used to determine all variables in different periods to achieve efficient solutions. A non-dominated sorting genetic algorithm (NSGA-II) and multi-objective particle swarm optimization (MOPSO) are used to solve the problem. The proposed model is validated with GAMS software for small size problems and the performance of the foregoing algorithms is compared with each other based on some comparison metrics. The NSGA-II outperforms MOPSO with respect to some of the comparison metrics used in this paper, but in other metrics MOPSO is better than NSGA-II. Finally, conclusions and future research directions are provided.
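
    Both NSGA-II and MOPSO rank candidate line balances by Pareto dominance; the sketch below shows that core step (plain non-dominated sorting into fronts) on two placeholder objectives, workstation count and workload imbalance. It is only the ranking kernel, not the paper's full algorithms:

```python
def dominates(a, b):
    # Minimization on every objective: a dominates b if it is no worse on all
    # objectives and strictly better on at least one.
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def non_dominated_sort(points):
    """Return a list of Pareto fronts (lists of indices), best front first."""
    remaining = set(range(len(points)))
    fronts = []
    while remaining:
        front = [i for i in remaining
                 if not any(dominates(points[j], points[i]) for j in remaining if j != i)]
        fronts.append(front)
        remaining -= set(front)
    return fronts

if __name__ == "__main__":
    # Each point = (number of workstations, workload imbalance) for one balance.
    solutions = [(5, 0.30), (6, 0.10), (5, 0.25), (7, 0.05), (6, 0.40)]
    for k, front in enumerate(non_dominated_sort(solutions)):
        print(f"front {k}: {[solutions[i] for i in front]}")
```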

  6. Parallel Sn iteration schemes

    International Nuclear Information System (INIS)

    Wienke, B.R.; Hiromoto, R.E.

    1986-01-01

    The iterative, multigroup, discrete ordinates (Sn) technique for solving the linear transport equation enjoys widespread usage and appeal. Serial iteration schemes and numerical algorithms developed over the years provide a timely framework for parallel extension. On the Denelcor HEP, the authors investigate three parallel iteration schemes for solving the one-dimensional Sn transport equation. The multigroup representation and serial iteration methods are also reviewed. This analysis represents a first attempt to extend serial Sn algorithms to parallel environments and provides good baseline estimates on ease of parallel implementation, relative algorithm efficiency, comparative speedup, and some future directions. The authors examine ordered and chaotic versions of these strategies, with and without concurrent rebalance and diffusion acceleration. Two strategies efficiently support high degrees of parallelization and appear to be robust parallel iteration techniques. The third strategy is a weaker parallel algorithm. Chaotic iteration, difficult to simulate on serial machines, holds promise and converges faster than ordered versions of the schemes. Actual parallel speedup and efficiency are high and payoff appears substantial
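
    For context, the serial baseline that such schemes parallelize is the classic source iteration with transport sweeps; a minimal one-group, 1-D S2 sketch (diamond differencing, vacuum boundaries, isotropic scattering) is given below. The cross sections and mesh are arbitrary assumptions, and the ordered/chaotic parallel orderings and rebalance acceleration discussed above are not implemented:

```python
import numpy as np

def sn_source_iteration(nx=50, length=10.0, sigma_t=1.0, sigma_s=0.5, q=1.0,
                        mus=(-0.5773503, 0.5773503), weights=(1.0, 1.0),
                        tol=1e-8, max_iter=500):
    """One-group, 1-D S2 source iteration with diamond-difference sweeps."""
    dx = length / nx
    phi = np.zeros(nx)
    for it in range(max_iter):
        phi_new = np.zeros(nx)
        source = 0.5 * (sigma_s * phi + q)        # isotropic scattering + fixed source
        for mu, w in zip(mus, weights):
            psi_in = 0.0                          # vacuum boundary condition
            cells = range(nx) if mu > 0 else range(nx - 1, -1, -1)
            for i in cells:                       # transport sweep along direction mu
                psi_out = ((abs(mu) - 0.5 * sigma_t * dx) * psi_in + source[i] * dx) \
                          / (abs(mu) + 0.5 * sigma_t * dx)
                psi_avg = 0.5 * (psi_in + psi_out)   # diamond-difference cell average
                phi_new[i] += w * psi_avg
                psi_in = psi_out
        if np.max(np.abs(phi_new - phi)) < tol:
            return phi_new, it + 1
        phi = phi_new
    return phi, max_iter

if __name__ == "__main__":
    phi, iters = sn_source_iteration()
    print(f"converged in {iters} iterations, centre flux = {phi[len(phi)//2]:.4f}")
```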

  7. A multi-objective approach to the assignment of stock keeping units to unidirectional picking lines

    Directory of Open Access Journals (Sweden)

    Le Roux, G. J.

    2017-05-01

    Full Text Available An order picking system in a distribution centre consisting of parallel unidirectional picking lines is considered. The objectives are to minimise the walking distance of the pickers, the largest volume of stock on a picking line over all picking lines, the number of small packages, and the total penalty incurred for late distributions. The problem is formulated as a multi-objective multiple knapsack problem that is not solvable in a realistic time. Population-based algorithms, including the artificial bee colony algorithm and the genetic algorithm, are also implemented. The results obtained from all algorithms indicate a substantial improvement on all objectives relative to historical assignments. The genetic algorithm delivers the best performance.

  8. High-voltage isolation transformer for sub-nanosecond rise time pulses constructed with annular parallel-strip transmission lines.

    Science.gov (United States)

    Homma, Akira

    2011-07-01

    A novel annular parallel-strip transmission line was devised to construct high-voltage high-speed pulse isolation transformers. The transmission lines can easily realize stable high-voltage operation and good impedance matching between primary and secondary circuits. The time constant for the step response of the transformer was calculated by introducing a simple low-frequency equivalent circuit model. Results show that the relation between the time constant and low-cut-off frequency of the transformer conforms to the theory of the general first-order linear time-invariant system. Results also show that the test transformer composed of the new transmission lines can transmit about 600 ps rise time pulses across the dc potential difference of more than 150 kV with insertion loss of -2.5 dB. The measured effective time constant of 12 ns agreed exactly with the theoretically predicted value. For practical applications involving the delivery of synchronized trigger signals to a dc high-voltage electron gun station, the transformer described in this paper exhibited advantages over methods using fiber optic cables for the signal transfer system. This transformer has no jitter or breakdown problems that invariably occur in active circuit components.
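
    The abstract ties the transformer's measured time constant to its low cut-off frequency through the standard first-order relation τ = 1/(2π f_c). The snippet below is our own back-of-envelope check using the 12 ns value quoted above; it is not taken from the paper.

```python
import math

# First-order linear time-invariant system: tau = 1 / (2 * pi * f_c).
tau = 12e-9                          # effective time constant quoted above [s]
f_c = 1.0 / (2.0 * math.pi * tau)    # implied low cut-off frequency
print(f"low cut-off frequency ~ {f_c / 1e6:.1f} MHz")   # ~13.3 MHz
```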

  9. Mapping robust parallel multigrid algorithms to scalable memory architectures

    Science.gov (United States)

    Overman, Andrea; Vanrosendale, John

    1993-01-01

    The convergence rate of standard multigrid algorithms degenerates on problems with stretched grids or anisotropic operators. The usual cure for this is the use of line or plane relaxation. However, multigrid algorithms based on line and plane relaxation have limited and awkward parallelism and are quite difficult to map effectively to highly parallel architectures. Newer multigrid algorithms that overcome anisotropy through the use of multiple coarse grids rather than relaxation are better suited to massively parallel architectures because they require only simple point-relaxation smoothers. In this paper, we look at the parallel implementation of a V-cycle multiple semicoarsened grid (MSG) algorithm on distributed-memory architectures such as the Intel iPSC/860 and Paragon computers. The MSG algorithms provide two levels of parallelism: parallelism within the relaxation or interpolation on each grid and across the grids on each multigrid level. Both levels of parallelism must be exploited to map these algorithms effectively to parallel architectures. This paper describes a mapping of an MSG algorithm to distributed-memory architectures that demonstrates how both levels of parallelism can be exploited. The result is a robust and effective multigrid algorithm for distributed-memory machines.

  10. Parallel Implementation of the Multi-Dimensional Spectral Code SPECT3D on large 3D grids.

    Science.gov (United States)

    Golovkin, Igor E.; Macfarlane, Joseph J.; Woodruff, Pamela R.; Pereyra, Nicolas A.

    2006-10-01

    The multi-dimensional collisional-radiative, spectral analysis code SPECT3D can be used to study radiation from complex plasmas. SPECT3D can generate instantaneous and time-gated images and spectra, space-resolved and streaked spectra, which makes it a valuable tool for post-processing hydrodynamics calculations and direct comparison between simulations and experimental data. On large three dimensional grids, transporting radiation along lines of sight (LOS) requires substantial memory and CPU resources. Currently, the parallel option in SPECT3D is based on parallelization over photon frequencies and allows for a nearly linear speed-up for a variety of problems. In addition, we are introducing a new parallel mechanism that will greatly reduce memory requirements. In the new implementation, spatial domain decomposition will be utilized allowing transport along a LOS to be performed only on the mesh cells the LOS crosses. The ability to operate on a fraction of the grid is crucial for post-processing the results of large-scale three-dimensional hydrodynamics simulations. We will present a parallel implementation of the code and provide a scalability study performed on a Linux cluster.

  11. Collisionless reconnection: magnetic field line interaction

    Directory of Open Access Journals (Sweden)

    R. A. Treumann

    2012-10-01

    Full Text Available Magnetic field lines are quantum objects carrying one quantum Φ0 = 2πℏ/e of magnetic flux and have finite radius λm. Here we argue that they possess a very specific dynamical interaction. Parallel field lines repel each other. When confined to a certain area they form two-dimensional lattices of hexagonal structure. We estimate the filling factor of such an area. Anti-parallel field lines, on the other hand, attract each other. We identify the physical mechanism as being due to the action of the gauge potential field, which we determine quantum mechanically for two parallel and two anti-parallel field lines. The distortion of the quantum electrodynamic vacuum causes a cloud of virtual pairs. We calculate the virtual pair production rate from quantum electrodynamics and estimate the virtual pair cloud density, pair current and Lorentz force density acting on the field lines via the pair cloud. These properties of field line dynamics become important in collisionless reconnection, consistently explaining why and how reconnection can spontaneously set in at the field-free centre of a current sheet below the electron-inertial scale.
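
    Taking the abstract's flux quantum Φ0 = 2πℏ/e at face value (this equals h/e, i.e. twice the superconducting flux quantum h/2e), its numerical value is straightforward to evaluate; the short check below is our own illustration.

```python
from scipy.constants import h, hbar, e, pi

# Flux quantum as written in the abstract: Phi_0 = 2*pi*hbar/e = h/e.
phi0 = 2 * pi * hbar / e
print(f"Phi_0 = {phi0:.4e} Wb")            # ~4.1357e-15 Wb
print(f"h/e   = {h / e:.4e} Wb (equal)")   # identical by definition
```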

  12. Parallel Monte Carlo Search for Hough Transform

    Science.gov (United States)

    Lopes, Raul H. C.; Franqueira, Virginia N. L.; Reid, Ivan D.; Hobson, Peter R.

    2017-10-01

    We investigate the problem of line detection in digital image processing, and in particular how state-of-the-art algorithms behave in the presence of noise and whether CPU efficiency can be improved by combining Monte Carlo Tree Search, hierarchical space decomposition, and parallel computing. The starting point of the investigation is the method introduced in 1962 by Paul Hough for detecting lines in binary images. Extended in the 1970s to the detection of other shapes, what came to be known as the Hough Transform (HT) has been proposed, for example, in the context of track fitting in the LHC ATLAS and CMS projects. The Hough Transform turns the problem of line detection into one of finding the peak in a vote-counting process over cells which contain the possible parameter points of candidate lines. The detection algorithm can be computationally expensive both in the demands made upon the processor and on memory. Additionally, its effectiveness can be reduced in the presence of noise. Our first contribution is an evaluation of a variation of the Radon Transform as a way of improving the effectiveness of line detection in the presence of noise. Then, parallel algorithms for variations of the Hough Transform and of the Radon Transform for line detection are introduced. An algorithm for Parallel Monte Carlo Search applied to line detection is also introduced. Their algorithmic complexities are discussed. Finally, implementations on multi-GPU and multicore architectures are discussed.
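
    For context, a textbook serial implementation of the straight-line Hough Transform in the (ρ, θ) parameterization is sketched below. It illustrates the vote-accumulation step that the abstract's parallel algorithms target; it is not the authors' code, and the test data are made up.

```python
import numpy as np

def hough_lines(points, img_shape, n_theta=180, n_rho=200):
    """Accumulate votes in (rho, theta) space for a set of edge points.

    points    : iterable of (x, y) pixel coordinates of edge points
    img_shape : (height, width) of the source binary image
    Returns the accumulator array plus the rho and theta axes.
    """
    h, w = img_shape
    diag = np.hypot(h, w)
    thetas = np.linspace(0.0, np.pi, n_theta, endpoint=False)
    rhos = np.linspace(-diag, diag, n_rho)
    acc = np.zeros((n_rho, n_theta), dtype=np.int64)
    cos_t, sin_t = np.cos(thetas), np.sin(thetas)
    for x, y in points:
        rho = x * cos_t + y * sin_t                               # rho for every theta at once
        idx = np.clip(np.searchsorted(rhos, rho), 0, n_rho - 1)   # rho bin index per theta
        acc[idx, np.arange(n_theta)] += 1                         # one vote per (rho, theta) cell
    return acc, rhos, thetas

if __name__ == "__main__":
    pts = [(i, i) for i in range(50)]               # points on the line y = x
    acc, rhos, thetas = hough_lines(pts, (64, 64))
    r, t = np.unravel_index(acc.argmax(), acc.shape)
    print(f"peak near rho = {rhos[r]:.1f}, theta = {np.degrees(thetas[t]):.1f} deg")
```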

  13. Same-source parallel implementation of the PSU/NCAR MM5

    Energy Technology Data Exchange (ETDEWEB)

    Michalakes, J.

    1997-12-31

    The Pennsylvania State/National Center for Atmospheric Research Mesoscale Model is a limited-area model of atmospheric systems, now in its fifth generation, MM5. Designed and maintained for vector and shared-memory parallel architectures, the official version of MM5 does not run on message-passing distributed memory (DM) parallel computers. The authors describe a same-source parallel implementation of the PSU/NCAR MM5 using FLIC, the Fortran Loop and Index Converter. The resulting source is nearly line-for-line identical with the original source code. The result is an efficient distributed memory parallel option to MM5 that can be seamlessly integrated into the official version.

  14. Micromachined silicon parallel acoustic delay lines as time-delayed ultrasound detector array for real-time photoacoustic tomography

    Science.gov (United States)

    Cho, Y.; Chang, C.-C.; Wang, L. V.; Zou, J.

    2016-02-01

    This paper reports the development of a new 16-channel parallel acoustic delay line (PADL) array for real-time photoacoustic tomography (PAT). The PADLs were directly fabricated from single-crystalline silicon substrates using deep reactive ion etching. Compared with other acoustic delay lines (e.g., optical fibers), the micromachined silicon PADLs offer higher acoustic transmission efficiency, smaller form factor, easier assembly, and mass production capability. To demonstrate its real-time photoacoustic imaging capability, the silicon PADL array was interfaced with one single-element ultrasonic transducer followed by one channel of data acquisition electronics to receive 16 channels of photoacoustic signals simultaneously. A PAT image of an optically-absorbing target embedded in an optically-scattering phantom was reconstructed, which matched well with the actual size of the imaged target. Because the silicon PADL array allows a signal-to-channel reduction ratio of 16:1, it could significantly simplify the design and construction of ultrasonic receivers for real-time PAT.

  15. Micromachined silicon parallel acoustic delay lines as time-delayed ultrasound detector array for real-time photoacoustic tomography

    International Nuclear Information System (INIS)

    Cho, Y; Chang, C-C; Zou, J; Wang, L V

    2016-01-01

    This paper reports the development of a new 16-channel parallel acoustic delay line (PADL) array for real-time photoacoustic tomography (PAT). The PADLs were directly fabricated from single-crystalline silicon substrates using deep reactive ion etching. Compared with other acoustic delay lines (e.g., optical fibers), the micromachined silicon PADLs offer higher acoustic transmission efficiency, smaller form factor, easier assembly, and mass production capability. To demonstrate its real-time photoacoustic imaging capability, the silicon PADL array was interfaced with one single-element ultrasonic transducer followed by one channel of data acquisition electronics to receive 16 channels of photoacoustic signals simultaneously. A PAT image of an optically-absorbing target embedded in an optically-scattering phantom was reconstructed, which matched well with the actual size of the imaged target. Because the silicon PADL array allows a signal-to-channel reduction ratio of 16:1, it could significantly simplify the design and construction of ultrasonic receivers for real-time PAT. (paper)

  16. Automatic Loop Parallelization via Compiler Guided Refactoring

    DEFF Research Database (Denmark)

    Larsen, Per; Ladelsky, Razya; Lidman, Jacob

    For many parallel applications, performance relies not on instruction-level parallelism, but on loop-level parallelism. Unfortunately, many modern applications are written in ways that obstruct automatic loop parallelization. Since we cannot identify sufficient parallelization opportunities...... for these codes in a static, off-line compiler, we developed an interactive compilation feedback system that guides the programmer in iteratively modifying application source, thereby improving the compiler’s ability to generate loop-parallel code. We use this compilation system to modify two sequential...... benchmarks, finding that the code parallelized in this way runs up to 8.3 times faster on an octo-core Intel Xeon 5570 system and up to 12.5 times faster on a quad-core IBM POWER6 system. Benchmark performance varies significantly between the systems. This suggests that semi-automatic parallelization should...

  17. Parallel graded attention in reading: A pupillometric study

    NARCIS (Netherlands)

    Snell, Joshua; Mathot, Sebastiaan; Mirault, Jonathan; Grainger, Jonathan

    2018-01-01

    There are roughly two lines of theory to account for recent evidence that word processing is influenced by adjacent orthographic information. One line assumes that multiple words can be processed simultaneously through a parallel graded distribution of visuo-spatial attention. The other line assumes

  18. Language constructs for modular parallel programs

    Energy Technology Data Exchange (ETDEWEB)

    Foster, I.

    1996-03-01

    We describe programming language constructs that facilitate the application of modular design techniques in parallel programming. These constructs allow us to isolate resource management and processor scheduling decisions from the specification of individual modules, which can themselves encapsulate design decisions concerned with concurrence, communication, process mapping, and data distribution. This approach permits development of libraries of reusable parallel program components and the reuse of these components in different contexts. In particular, alternative mapping strategies can be explored without modifying other aspects of program logic. We describe how these constructs are incorporated in two practical parallel programming languages, PCN and Fortran M. Compilers have been developed for both languages, allowing experimentation in substantial applications.

  19. Conformal pure radiation with parallel rays

    International Nuclear Information System (INIS)

    Leistner, Thomas; Paweł Nurowski

    2012-01-01

    We define pure radiation metrics with parallel rays to be n-dimensional pseudo-Riemannian metrics that admit a parallel null line bundle K and whose Ricci tensor vanishes on vectors that are orthogonal to K. We give necessary conditions in terms of the Weyl, Cotton and Bach tensors for a pseudo-Riemannian metric to be conformal to a pure radiation metric with parallel rays. Then, we derive conditions in terms of the tractor calculus that are equivalent to the existence of a pure radiation metric with parallel rays in a conformal class. We also give analogous results for n-dimensional pseudo-Riemannian pp-waves. (paper)

  20. Digital parallel-to-series pulse-train converter

    Science.gov (United States)

    Hussey, J.

    1971-01-01

    Circuit converts number represented as two level signal on n-bit lines to series of pulses on one of two lines, depending on sign of number. Converter accepts parallel binary input data and produces number of output pulses equal to number represented by input data.
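
    A behavioral software model of the conversion described above (a parallel binary word in, a pulse count out on a sign-selected line) is sketched below; the function is hypothetical and stands in for the hardware circuit.

```python
def parallel_to_pulse_train(bits, negative):
    """Model a parallel-to-series pulse-train converter.

    bits     : parallel input word, most-significant bit first, e.g. [1, 0, 1]
    negative : sign flag selecting which of the two output lines pulses
    Returns (pulses_on_positive_line, pulses_on_negative_line).
    """
    value = 0
    for b in bits:                       # interpret the parallel word as an integer
        value = (value << 1) | (b & 1)
    return (0, value) if negative else (value, 0)

# A word representing 5 with a positive sign: five pulses on the first line.
print(parallel_to_pulse_train([1, 0, 1], negative=False))   # (5, 0)
```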

  1. Establishing Substantial Equivalence: Proteomics

    Science.gov (United States)

    Lovegrove, Alison; Salt, Louise; Shewry, Peter R.

    Wheat is a major crop in world agriculture and is consumed after processing into a range of food products. It is therefore of great importance to determine the consequences (intended and unintended) of transgenesis in wheat and whether genetically modified lines are substantially equivalent to those produced by conventional plant breeding. Proteomic analysis is one of several approaches which can be used to address these questions. Two-dimensional PAGE (2D PAGE) remains the most widely available method for proteomic analysis, but is notoriously difficult to reproduce between laboratories. We therefore describe methods which have been developed as standard operating procedures in our laboratory to ensure the reproducibility of proteomic analyses of wheat using 2D PAGE analysis of grain proteins.

  2. Strong contributions from vertical triads to helix-partner preferences in parallel coiled coils.

    Science.gov (United States)

    Steinkruger, Jay D; Bartlett, Gail J; Woolfson, Derek N; Gellman, Samuel H

    2012-09-26

    Pairing preferences in heterodimeric coiled coils are determined by complementarities among side chains that pack against one another at the helix-helix interface. However, relationships between dimer stability and interfacial residue identity are not fully understood. In the context of the "knobs-into-holes" (KIH) packing pattern, one can identify two classes of interactions between side chains from different helices: "lateral", in which a line connecting the adjacent side chains is perpendicular to the helix axes, and "vertical", in which the connecting line is parallel to the helix axes. We have previously analyzed vertical interactions in antiparallel coiled coils and found that one type of triad constellation (a'-a-a') exerts a strong effect on pairing preferences, while the other type of triad (d'-d-d') has relatively little impact on pairing tendencies. Here, we ask whether vertical interactions (d'-a-d') influence pairing in parallel coiled-coil dimers. Our results indicate that vertical interactions can exert a substantial impact on pairing specificity, and that the influence of the d'-a-d' triad depends on the lateral a' contact within the local KIH motif. Structure-informed bioinformatic analyses of protein sequences reveal trends consistent with the thermodynamic data derived from our experimental model system in suggesting that heterotriads involving Leu and Ile are preferred over homotriads involving Leu and Ile.

  3. Energy flow of electric dipole radiation in between parallel mirrors

    Science.gov (United States)

    Xu, Zhangjin; Arnoldus, Henk F.

    2017-11-01

    We have studied the energy flow patterns of the radiation emitted by an electric dipole located between parallel mirrors. It appears that the field lines of the Poynting vector (the flow lines of energy) can have very intricate structures, including many singularities and vortices. The flow line patterns depend on the distance between the mirrors, the distance of the dipole to one of the mirrors and the angle of oscillation of the dipole moment with respect to the normal of the mirror surfaces. Already for the simplest case of a dipole moment oscillating perpendicular to the mirrors, singularities appear at regular intervals along the direction of propagation (parallel to the mirrors). For a parallel dipole, vortices appear in the neighbourhood of the dipole. For a dipole oscillating at a finite angle to the surface normal, the radiation tends to swirl around the dipole before travelling off parallel to the mirrors. For relatively large mirror separations, vortices appear in the pattern. When the dipole is off-centred with respect to the midway point between the mirrors, the flow line structure becomes even more complicated, with numerous vortices in the pattern, and tiny loops near the dipole. We have also investigated the locations of the vortices and singularities, and these can be found without any specific knowledge about the flow lines. This provides an independent means of studying the propagation of dipole radiation between mirrors.

  4. Parallel constraint satisfaction in memory-based decisions.

    Science.gov (United States)

    Glöckner, Andreas; Hodges, Sara D

    2011-01-01

    Three studies sought to investigate decision strategies in memory-based decisions and to test the predictions of the parallel constraint satisfaction (PCS) model for decision making (Glöckner & Betsch, 2008). Time pressure was manipulated and the model was compared against simple heuristics (take the best and equal weight) and a weighted additive strategy. From PCS we predicted that fast intuitive decision making is based on compensatory information integration and that decision time increases and confidence decreases with increasing inconsistency in the decision task. In line with these predictions we observed a predominant usage of compensatory strategies under all time-pressure conditions and even with decision times as short as 1.7 s. For a substantial number of participants, choices and decision times were best explained by PCS, but there was also evidence for use of simple heuristics. The time-pressure manipulation did not significantly affect decision strategies. Overall, the results highlight intuitive, automatic processes in decision making and support the idea that human information-processing capabilities are less severely bounded than often assumed.

  5. On-line electrochemistry-bioaffinity screening with parallel HR-LC-MS for the generation and characterization of modified p38α kinase inhibitors.

    Science.gov (United States)

    Falck, David; de Vlieger, Jon S B; Giera, Martin; Honing, Maarten; Irth, Hubertus; Niessen, Wilfried M A; Kool, Jeroen

    2012-04-01

    In this study, an integrated approach is developed for the formation, identification and biological characterization of electrochemical conversion products of p38α mitogen-activated protein kinase inhibitors. This work demonstrates the hyphenation of an electrochemical reaction cell with a continuous-flow bioaffinity assay and parallel LC-HR-MS. Competition of the formed products with a tracer (SKF-86002) that shows fluorescence enhancement in the orthosteric binding site of the p38α kinase is the readout for bioaffinity. Parallel HR-MS(n) experiments provided information on the identity of binders and non-binders. Finally, the data produced with this on-line system were compared to electrochemical conversion products generated off-line. The electrochemical conversion of 1-{6-chloro-5-[(2R,5S)-4-(4-fluorobenzyl)-2,5-dimethylpiperazine-1-carbonyl]-3aH-indol-3-yl}-2-morpholinoethane-1,2-dione resulted in eight products, three of which showed bioaffinity in the continuous-flow p38α bioaffinity assay used. Electrochemical conversion of BIRB796 resulted, amongst others, in the formation of the reactive quinoneimine structure and its corresponding hydroquinone. Both products were detected in the p38α bioaffinity assay, which indicates binding to the p38α kinase.

  6. Three-dimensional parallel vortex rings in Bose-Einstein condensates

    International Nuclear Information System (INIS)

    Crasovan, Lucian-Cornel; Perez-Garcia, Victor M.; Danaila, Ionut; Mihalache, Dumitru; Torner, Lluis

    2004-01-01

    We construct three-dimensional structures of topological defects hosted in trapped wave fields, in the form of vortex stars, vortex cages, parallel vortex lines, perpendicular vortex rings, and parallel vortex rings, and we show that the latter exist as robust stationary, collective states of nonrotating Bose-Einstein condensates. We discuss the stability properties of excited states containing several parallel vortex rings hosted by the condensate, including their dynamical and structural stability

  7. Parallelization of the model-based iterative reconstruction algorithm DIRA

    International Nuclear Information System (INIS)

    Oertenberg, A.; Sandborg, M.; Alm Carlsson, G.; Malusek, A.; Magnusson, M.

    2016-01-01

    New paradigms for parallel programming have been devised to simplify software development on multi-core processors and many-core graphical processing units (GPU). Despite their obvious benefits, the parallelization of existing computer programs is not an easy task. In this work, the use of the Open Multiprocessing (OpenMP) and Open Computing Language (OpenCL) frameworks is considered for the parallelization of the model-based iterative reconstruction algorithm DIRA with the aim to significantly shorten the code's execution time. Selected routines were parallelized using OpenMP and OpenCL libraries; some routines were converted from MATLAB to C and optimised. Parallelization of the code with the OpenMP was easy and resulted in an overall speedup of 15 on a 16-core computer. Parallelization with OpenCL was more difficult owing to differences between the central processing unit and GPU architectures. The resulting speedup was substantially lower than the theoretical peak performance of the GPU; the cause was explained. (authors)

  8. Illustrative Line Styles for Flow Visualization

    NARCIS (Netherlands)

    Everts, Maarten H.; Bekker, Hendrik; Roerdink, Jos B. T. M.; Isenberg, Tobias

    2011-01-01

    We present a flexible illustrative line style model for the visualization of streamline data. Our model partitions view-oriented line strips into parallel bands whose basic visual properties can be controlled independently. We thus extend previous line stylization techniques specifically for

  9. Flow Visualization using Illustrative Line Styles

    NARCIS (Netherlands)

    Everts, Maarten H.; Bekker, Hendrik; Roerdink, Jos B. T. M.; Isenberg, Tobias; Bekker, Paulus

    2011-01-01

    We present a flexible illustrative line style model for the visualization of streamline data. Our model partitions view- oriented line strips into parallel bands whose basic visual properties can be controlled independently. We thus extend previous line stylization techniques specifically for

  10. 40 CFR Appendix C to Part 112 - Substantial Harm Criteria

    Science.gov (United States)

    2010-07-01

    ... to Part 112 Protection of Environment ENVIRONMENTAL PROTECTION AGENCY (CONTINUED) WATER PROGRAMS OIL POLLUTION PREVENTION Pt. 112, App. C Appendix C to Part 112—Substantial Harm Criteria 1.0 Introduction The..., except in the Gulf of Mexico. In the Gulf of Mexico, it means the area shoreward of the lines of...

  11. Eccentric vision : adverse interactions between line segments

    NARCIS (Netherlands)

    Andriessen, J.J.; Bouma, H.

    1976-01-01

    The paper deals with adverse interactions between line stimuli in eccentric vision. Both the contrast threshold and the just noticeable difference of slant have been measured for a test line as a function of the distance from a number of surrounding lines. Test lines were either parallel or perpendicular to the

  12. Similarities and differences between helminth parasites and cancer cell lines in shaping human monocytes: Insights into parallel mechanisms of immune evasion.

    Directory of Open Access Journals (Sweden)

    Prakash Babu Narasimhan

    2018-04-01

    Full Text Available A number of features at the host-parasite interface are reminiscent of those that are also observed at the host-tumor interface. Both cancer cells and parasites establish a tissue microenvironment that allows for immune evasion and may reflect functional alterations of various innate cells. Here, we investigated how the phenotype and function of human monocytes is altered by exposure to cancer cell lines and if these functional and phenotypic alterations parallel those induced by exposure to helminth parasites. Thus, human monocytes were exposed to three different cancer cell lines (breast, ovarian, or glioblastoma) or to live microfilariae (mf) of Brugia malayi-a causative agent of lymphatic filariasis. After 2 days of co-culture, monocytes exposed to cancer cell lines showed markedly upregulated expression of M1-associated (TNF-α, IL-1β), M2-associated (CCL13, CD206), Mreg-associated (IL-10, TGF-β), and angiogenesis associated (MMP9, VEGF) genes. Similar to cancer cell lines, but less dramatically, mf altered the mRNA expression of IL-1β, CCL13, TGM2 and MMP9. When surface expression of the inhibitory ligands PDL1 and PDL2 was assessed, monocytes exposed to both cancer cell lines and to live mf significantly upregulated PDL1 and PDL2 expression. In contrast to exposure to mf, exposure to cancer cell lines increased the phagocytic ability of monocytes and reduced their ability to induce T cell proliferation and to expand Granzyme A+ CD8+ T cells. Our data suggest that despite the fact that helminth parasites and cancer cell lines are extraordinarily disparate, they share the ability to alter the phenotype of human monocytes.

  13. Similarities and differences between helminth parasites and cancer cell lines in shaping human monocytes: Insights into parallel mechanisms of immune evasion.

    Science.gov (United States)

    Narasimhan, Prakash Babu; Akabas, Leor; Tariq, Sameha; Huda, Naureen; Bennuru, Sasisekhar; Sabzevari, Helen; Hofmeister, Robert; Nutman, Thomas B; Tolouei Semnani, Roshanak

    2018-04-01

    A number of features at the host-parasite interface are reminiscent of those that are also observed at the host-tumor interface. Both cancer cells and parasites establish a tissue microenvironment that allows for immune evasion and may reflect functional alterations of various innate cells. Here, we investigated how the phenotype and function of human monocytes is altered by exposure to cancer cell lines and if these functional and phenotypic alterations parallel those induced by exposure to helminth parasites. Thus, human monocytes were exposed to three different cancer cell lines (breast, ovarian, or glioblastoma) or to live microfilariae (mf) of Brugia malayi-a causative agent of lymphatic filariasis. After 2 days of co-culture, monocytes exposed to cancer cell lines showed markedly upregulated expression of M1-associated (TNF-α, IL-1β), M2-associated (CCL13, CD206), Mreg-associated (IL-10, TGF-β), and angiogenesis associated (MMP9, VEGF) genes. Similar to cancer cell lines, but less dramatically, mf altered the mRNA expression of IL-1β, CCL13, TGM2 and MMP9. When surface expression of the inhibitory ligands PDL1 and PDL2 was assessed, monocytes exposed to both cancer cell lines and to live mf significantly upregulated PDL1 and PDL2 expression. In contrast to exposure to mf, exposure to cancer cell lines increased the phagocytic ability of monocytes and reduced their ability to induce T cell proliferation and to expand Granzyme A+ CD8+ T cells. Our data suggest that despite the fact that helminth parasites and cancer cell lines are extraordinarily disparate, they share the ability to alter the phenotype of human monocytes.

  14. Semi-coarsening multigrid methods for parallel computing

    Energy Technology Data Exchange (ETDEWEB)

    Jones, J.E.

    1996-12-31

    Standard multigrid methods are not well suited for problems with anisotropic coefficients which can occur, for example, on grids that are stretched to resolve a boundary layer. There are several different modifications of the standard multigrid algorithm that yield efficient methods for anisotropic problems. In the paper, we investigate the parallel performance of these multigrid algorithms. Multigrid algorithms which work well for anisotropic problems are based on line relaxation and/or semi-coarsening. In semi-coarsening multigrid algorithms a grid is coarsened in only one of the coordinate directions unlike standard or full-coarsening multigrid algorithms where a grid is coarsened in each of the coordinate directions. When both semi-coarsening and line relaxation are used, the resulting multigrid algorithm is robust and automatic in that it requires no knowledge of the nature of the anisotropy. This is the basic multigrid algorithm whose parallel performance we investigate in the paper. The algorithm is currently being implemented on an IBM SP2 and its performance is being analyzed. In addition to looking at the parallel performance of the basic semi-coarsening algorithm, we present algorithmic modifications with potentially better parallel efficiency. One modification reduces the amount of computational work done in relaxation at the expense of using multiple coarse grids. This modification is also being implemented with the aim of comparing its performance to that of the basic semi-coarsening algorithm.
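
    To make the semi-coarsening idea concrete, the numpy sketch below restricts a 2D grid in the x-direction only (1/4-1/2-1/4 full weighting in x, y-resolution unchanged). The grid sizes and weights are illustrative and are not taken from the paper.

```python
import numpy as np

def semicoarsen_x(u):
    """Restrict a vertex-centred 2D grid in the x-direction only.

    u has shape (ny, nx) with nx = 2*k + 1. The coarse grid keeps every
    second x-point; interior coarse points get 1/4-1/2-1/4 full weighting
    in x, while the y-direction keeps its original resolution.
    """
    ny, nx = u.shape
    assert nx % 2 == 1, "expect nx = 2*k + 1 for vertex-centred coarsening"
    coarse = u[:, ::2].copy()                   # injection, including boundaries
    coarse[:, 1:-1] = (0.25 * u[:, 1:-2:2]      # left fine neighbour
                       + 0.50 * u[:, 2:-1:2]    # coincident fine point
                       + 0.25 * u[:, 3::2])     # right fine neighbour
    return coarse

if __name__ == "__main__":
    u = np.arange(5 * 9, dtype=float).reshape(5, 9)   # ny = 5, nx = 9
    print(semicoarsen_x(u).shape)                     # (5, 5): coarsened in x only
```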

  15. The ongoing investigation of high performance parallel computing in HEP

    CERN Document Server

    Peach, Kenneth J; Böck, R K; Dobinson, Robert W; Hansroul, M; Norton, Alan Robert; Willers, Ian Malcolm; Baud, J P; Carminati, F; Gagliardi, F; McIntosh, E; Metcalf, M; Robertson, L; CERN. Geneva. Detector Research and Development Committee

    1993-01-01

    Past and current exploitation of parallel computing in High Energy Physics is summarized and a list of R & D projects in this area is presented. The applicability of new parallel hardware and software to physics problems is investigated, in the light of the requirements for computing power of LHC experiments and the current trends in the computer industry. Four main themes are discussed (possibilities for a finer grain of parallelism; fine-grain communication mechanism; usable parallel programming environment; different programming models and architectures, using standard commercial products). Parallel computing technology is potentially of interest for offline and vital for real time applications in LHC. A substantial investment in applications development and evaluation of state of the art hardware and software products is needed. A solid development environment is required at an early stage, before mainline LHC program development begins.

  16. Xyce parallel electronic simulator : reference guide.

    Energy Technology Data Exchange (ETDEWEB)

    Mei, Ting; Rankin, Eric Lamont; Thornquist, Heidi K.; Santarelli, Keith R.; Fixel, Deborah A.; Coffey, Todd Stirling; Russo, Thomas V.; Schiek, Richard Louis; Warrender, Christina E.; Keiter, Eric Richard; Pawlowski, Roger Patrick

    2011-05-01

    This document is a reference guide to the Xyce Parallel Electronic Simulator, and is a companion document to the Xyce Users Guide. The focus of this document is (to the extent possible) exhaustively list device parameters, solver options, parser options, and other usage details of Xyce. This document is not intended to be a tutorial. Users who are new to circuit simulation are better served by the Xyce Users Guide. The Xyce Parallel Electronic Simulator has been written to support, in a rigorous manner, the simulation needs of the Sandia National Laboratories electrical designers. It is targeted specifically to run on large-scale parallel computing platforms but also runs well on a variety of architectures including single processor workstations. It also aims to support a variety of devices and models specific to Sandia needs. This document is intended to complement the Xyce Users Guide. It contains comprehensive, detailed information about a number of topics pertinent to the usage of Xyce. Included in this document is a netlist reference for the input-file commands and elements supported within Xyce; a command line reference, which describes the available command line arguments for Xyce; and quick-references for users of other circuit codes, such as Orcad's PSpice and Sandia's ChileSPICE.

  17. Power-flow control and stability enhancement of four parallel-operated offshore wind farms using a line-commutated HVDC link

    DEFF Research Database (Denmark)

    Wang, Li; Wang, Kuo-Hua; Lee, Wei-Jen

    2010-01-01

    This paper presents an effective control scheme using a line-commutated high-voltage direct-current (HVDC) link with a designed rectifier current regulator (RCR) to simultaneously perform both power-fluctuation mitigation and damping improvement of four parallel-operated 80-MW offshore wind farms...... delivering generated power to a large utility grid. The proposed RCR of the HVDC link is designed by using modal control theory to contribute adequate damping to the studied four offshore wind farms under various wind speeds. A systematic analysis using a frequency-domain approach based on eigenvalue...... characteristics to the studied offshore wind farms under various wind speeds but also effectively mitigate power fluctuations of the offshore wind farms under wind-speed disturbance conditions....

  18. Substantially parallel flux uncluttered rotor machines

    Science.gov (United States)

    Hsu, John S.

    2012-12-11

    A permanent magnet-less and brushless synchronous system includes a stator that generates a magnetic rotating field when sourced by polyphase alternating currents. An uncluttered rotor is positioned within the magnetic rotating field and is spaced apart from the stator. An excitation core is spaced apart from the stator and the uncluttered rotor and magnetically couples the uncluttered rotor. The brushless excitation source generates a magnet torque by inducing magnetic poles near an outer peripheral surface of the uncluttered rotor, and the stator currents also generate a reluctance torque by a reaction of the difference between the direct and quadrature magnetic paths of the uncluttered rotor. The system can be used either as a motor or a generator

  19. MPI_XSTAR: MPI-based Parallelization of the XSTAR Photoionization Program

    Science.gov (United States)

    Danehkar, Ashkbiz; Nowak, Michael A.; Lee, Julia C.; Smith, Randall K.

    2018-02-01

    We describe a program for the parallel implementation of multiple runs of XSTAR, a photoionization code that is used to predict the physical properties of an ionized gas from its emission and/or absorption lines. The parallelization program, called MPI_XSTAR, has been developed and implemented in the C++ language by using the Message Passing Interface (MPI) protocol, a conventional standard of parallel computing. We have benchmarked parallel multiprocessing executions of XSTAR, using MPI_XSTAR, against a serial execution of XSTAR, in terms of the parallelization speedup and the computing resource efficiency. Our experience indicates that the parallel execution runs significantly faster than the serial execution, however, the efficiency in terms of the computing resource usage decreases with increasing the number of processors used in the parallel computing.
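
    The strategy described above, many independent XSTAR runs farmed out over MPI ranks, can be sketched with mpi4py as follows. The parameter grid and the command line are purely illustrative placeholders; the real MPI_XSTAR driver is written in C++ and invokes the actual xstar executable.

```python
from mpi4py import MPI
import subprocess

# Hedged sketch of distributing independent photoionization runs over MPI ranks,
# in the spirit of MPI_XSTAR. The command below is a placeholder, not real XSTAR usage.
comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

# Hypothetical grid of ionization parameters, one run per value.
log_xi_grid = [round(0.1 * i, 1) for i in range(41)]     # log xi = 0.0 ... 4.0

# Round-robin assignment of runs to ranks; each run is independent of the others.
for log_xi in log_xi_grid[rank::size]:
    cmd = ["echo", f"run photoionization model at log_xi={log_xi}"]   # placeholder
    subprocess.run(cmd, check=True)

comm.Barrier()   # all ranks finish before any post-processing of the grid
```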

  20. Parallel heat transport in integrable and chaotic magnetic fields

    Energy Technology Data Exchange (ETDEWEB)

    Castillo-Negrete, D. del; Chacon, L. [Oak Ridge National Laboratory, Oak Ridge, Tennessee 37831-8071 (United States)

    2012-05-15

    The study of transport in magnetized plasmas is a problem of fundamental interest in controlled fusion, space plasmas, and astrophysics research. Three issues make this problem particularly challenging: (i) the extreme anisotropy between the parallel (i.e., along the magnetic field), χ_∥, and the perpendicular, χ_⊥, conductivities (χ_∥/χ_⊥ may exceed 10^10 in fusion plasmas); (ii) nonlocal parallel transport in the limit of small collisionality; and (iii) magnetic field line chaos, which in general complicates (and may preclude) the construction of magnetic field line coordinates. Motivated by these issues, we present a Lagrangian Green's function method to solve the local and non-local parallel transport equation applicable to integrable and chaotic magnetic fields in arbitrary geometry. The method avoids by construction the numerical pollution issues of grid-based algorithms. The potential of the approach is demonstrated with nontrivial applications to integrable (magnetic island), weakly chaotic (Devil's staircase), and fully chaotic magnetic field configurations. For the latter, numerical solutions of the parallel heat transport equation show that the effective radial transport, with local and non-local parallel closures, is non-diffusive, thus casting doubts on the applicability of quasilinear diffusion descriptions. General conditions for the existence of non-diffusive, multivalued flux-gradient relations in the temperature evolution are derived.
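
    For reference, the anisotropic heat transport equation underlying the abstract can be written in its standard textbook form (not quoted from the paper), with b̂ the unit vector along the magnetic field and S a source term:

```latex
\frac{\partial T}{\partial t}
  = \nabla \cdot \left[ \chi_\parallel \, \hat{b} \left( \hat{b} \cdot \nabla T \right)
  + \chi_\perp \left( \nabla T - \hat{b} \left( \hat{b} \cdot \nabla T \right) \right) \right] + S,
\qquad \frac{\chi_\parallel}{\chi_\perp} \gtrsim 10^{10} \ \text{in fusion plasmas.}
```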

  1. Series Transmission Line Transformer

    Science.gov (United States)

    Buckles, Robert A.; Booth, Rex; Yen, Boris T.

    2004-06-29

    A series transmission line transformer is set forth which includes two or more impedance-matched sets of at least two transmission lines, such as shielded cables, connected in parallel at one end and in series at the other in a cascading fashion. The cables are wound about a magnetic core. The series transmission line transformer (STLT) can provide higher impedance ratios and bandwidths, is scalable, and is of simpler design and construction.

  2. A position sensitive parallel plate avalanche counter

    International Nuclear Information System (INIS)

    Lombardi, M.; Tan Jilian; Potenza, R.; D'amico, V.

    1986-01-01

    A position sensitive parallel plate avalanche counter with a distributed-constant delay-line cathode (PSAC) is described. The strips formed on the printed board serve as the cathode and as the delay line for signal readout. The detector (PSAC) was operated in isobutane gas at pressures from 10 to 20 torr. The position resolution is better than 1 mm and the time resolution is about 350 ps for a 252Cf fission-spectrum source.

  3. Simulation Exploration through Immersive Parallel Planes

    Energy Technology Data Exchange (ETDEWEB)

    Brunhart-Lupo, Nicholas J [National Renewable Energy Laboratory (NREL), Golden, CO (United States); Bush, Brian W [National Renewable Energy Laboratory (NREL), Golden, CO (United States); Gruchalla, Kenny M [National Renewable Energy Laboratory (NREL), Golden, CO (United States); Smith, Steve [Los Alamos Visualization Associates

    2017-05-25

    We present a visualization-driven simulation system that tightly couples systems dynamics simulations with an immersive virtual environment to allow analysts to rapidly develop and test hypotheses in a high-dimensional parameter space. To accomplish this, we generalize the two-dimensional parallel-coordinates statistical graphic as an immersive 'parallel-planes' visualization for multivariate time series emitted by simulations running in parallel with the visualization. In contrast to traditional parallel coordinates, which map the multivariate dimensions onto coordinate axes represented by a series of parallel lines, we map pairs of the multivariate dimensions onto a series of parallel rectangles. As in the case of parallel coordinates, each individual observation in the dataset is mapped to a polyline whose vertices coincide with its coordinate values. Regions of the rectangles can be 'brushed' to highlight and select observations of interest; a 'slider' control allows the user to filter the observations by their time coordinate. In an immersive virtual environment, users interact with the parallel planes using a joystick that can select regions on the planes, manipulate selection, and filter time. The brushing and selection actions are used both to explore existing data and to launch additional simulations corresponding to the visually selected portions of the input parameter space. As soon as the new simulations complete, their resulting observations are displayed in the virtual environment. This tight feedback loop between simulation and immersive analytics accelerates users' realization of insights about the simulation and its output.
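
    As background for the 'parallel planes' generalization described above, a minimal matplotlib sketch of the conventional 2D parallel-coordinates graphic (the starting point the abstract generalizes) is given below; the data are made up.

```python
import numpy as np
import matplotlib.pyplot as plt

# Conventional parallel-coordinates plot: each observation becomes a polyline
# whose vertices sit on a series of parallel (vertical) axes, one per dimension.
rng = np.random.default_rng(0)
data = rng.random((20, 4))        # 20 hypothetical observations, 4 dimensions
dims = ["x1", "x2", "x3", "x4"]

# Normalize each dimension to [0, 1] so the axes share a common vertical scale.
norm = (data - data.min(axis=0)) / (data.max(axis=0) - data.min(axis=0))

fig, ax = plt.subplots(figsize=(6, 3))
for row in norm:
    ax.plot(range(len(dims)), row, color="tab:blue", alpha=0.4)
ax.set_xticks(range(len(dims)))
ax.set_xticklabels(dims)
ax.set_ylabel("normalized value")
plt.tight_layout()
plt.show()
```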

  4. PIMR: Parallel and Integrated Matching for Raw Data.

    Science.gov (United States)

    Li, Zhenghao; Yang, Junying; Zhao, Jiaduo; Han, Peng; Chai, Zhi

    2016-01-02

    With the trend of high-resolution imaging, computational costs of image matching have substantially increased. In order to find the compromise between accuracy and computation in real-time applications, we bring forward a fast and robust matching algorithm, named parallel and integrated matching for raw data (PIMR). This algorithm not only effectively utilizes the color information of raw data, but also designs a parallel and integrated framework to shorten the time-cost in the demosaicing stage. Experiments show that compared to existing state-of-the-art methods, the proposed algorithm yields a comparable recognition rate, while the total time-cost of imaging and matching is significantly reduced.

  5. Electric Mars: A large trans-terminator electric potential drop on closed magnetic field lines above Utopia Planitia

    Science.gov (United States)

    Collinson, Glyn; Mitchell, David; Xu, Shaosui; Glocer, Alex; Grebowsky, Joseph; Hara, Takuya; Lillis, Robert; Espley, Jared; Mazelle, Christian; Sauvaud, Jean-André; Fedorov, Andrey; Liemohn, Mike; Andersson, Laila; Jakosky, Bruce

    2017-02-01

    Parallel electric fields and their associated electric potential structures play a crucial role in ionospheric-magnetospheric interactions at any planet. Although there is abundant evidence that parallel electric fields play key roles in Martian ionospheric outflow and auroral electron acceleration, the fields themselves are challenging to directly measure due to their relatively weak nature. Using measurements by the Solar Wind Electron Analyzer instrument aboard the NASA Mars Atmosphere and Volatile EvolutioN (MAVEN) Mars Scout, we present the discovery and measurement of a substantial (Φ_Mars = 7.7 ± 0.6 V) parallel electric potential drop on closed magnetic field lines spanning the terminator from day to night above the great impact basin of Utopia Planitia, a region largely free of crustal magnetic fields. A survey of the previous 26 orbits passing over a range of longitudes revealed similar signatures on seven orbits, with a mean potential drop (Φ_Mars) of 10.9 ± 0.8 V, suggestive that although trans-terminator electric fields of comparable strength are not ubiquitous, they may be common, at least at these northerly latitudes.

  6. Electric Mars: A Large Trans-Terminator Electric Potential Drop on Closed Magnetic Field Lines Above Utopia Planitia

    Science.gov (United States)

    Collinson, Glyn; Mitchell, David; Xu, Shaosui; Glocer, Alex; Grebowsky, Joseph; Hara, Takuya; Lillis, Robert; Espley, Jared; Mazelle, Christian; Sauvaud, Jean-Andre

    2017-01-01

    Parallel electric fields and their associated electric potential structures play a crucial role in ionospheric-magnetospheric interactions at any planet. Although there is abundant evidence that parallel electric fields play key roles in Martian ionospheric outflow and auroral electron acceleration, the fields themselves are challenging to directly measure due to their relatively weak nature. Using measurements by the Solar Wind Electron Analyzer instrument aboard the NASA Mars Atmosphere and Volatile EvolutioN (MAVEN) Mars Scout, we present the discovery and measurement of a substantial (Φ_Mars = 7.7 ± 0.6 V) parallel electric potential drop on closed magnetic field lines spanning the terminator from day to night above the great impact basin of Utopia Planitia, a region largely free of crustal magnetic fields. A survey of the previous 26 orbits passing over a range of longitudes revealed similar signatures on seven orbits, with a mean potential drop (Φ_Mars) of 10.9 ± 0.8 V, suggestive that although trans-terminator electric fields of comparable strength are not ubiquitous, they may be common, at least at these northerly latitudes.

  7. Slit shaped microwave induced atmospheric pressure plasma based on a parallel plate transmission line resonator

    Science.gov (United States)

    Kang, S. K.; Seo, Y. S.; Lee, H. Wk; Aman-ur-Rehman; Kim, G. C.; Lee, J. K.

    2011-11-01

    A new type of microwave-excited atmospheric pressure plasma source, based on the principle of parallel plate transmission line resonator, is developed for the treatment of large areas in biomedical applications such as skin treatment and wound healing. A stable plasma of 20 mm width is sustained by a small microwave power source operated at a frequency of 700 MHz and a gas flow rate of 0.9 slm. Plasma impedance and plasma density of this plasma source are estimated by fitting the calculated reflection coefficient to the measured one. The estimated plasma impedance shows a decreasing trend while estimated plasma density shows an increasing trend with the increase in the input power. Plasma uniformity is confirmed by temperature and optical emission distribution measurements. Plasma temperature is sustained at less than 40 °C and abundant amounts of reactive species, which are important agents for bacteria inactivation, are detected over the entire plasma region. Large area treatment ability of this newly developed device is verified through bacteria inactivation experiment using E. coli. Sterilization experiment shows a large bacterial killing mark of 25 mm for a plasma treatment time of 10 s.

  8. (Nearly) portable PIC code for parallel computers

    International Nuclear Information System (INIS)

    Decyk, V.K.

    1993-01-01

    As part of the Numerical Tokamak Project, the author has developed a (nearly) portable, one-dimensional version of the GCPIC algorithm for particle-in-cell codes on parallel computers. This algorithm uses a spatial domain decomposition for the fields, and passes particles from one domain to another as the particles move spatially. With only minor changes, the code has been run in parallel on the Intel Delta, the Cray C-90, the IBM ES/9000 and a cluster of workstations. After a line-by-line translation into cmfortran, the code was also run on the CM-200. Impressive speeds have been achieved, both on the Intel Delta and the Cray C-90, around 30 nanoseconds per particle per time step. In addition, the author was able to isolate the data management modules, so that the physics modules were not changed much from their sequential version, and the data management modules can be used as "black boxes."

  9. Parallel GPU implementation of iterative PCA algorithms.

    Science.gov (United States)

    Andrecut, M

    2009-11-01

    Principal component analysis (PCA) is a key statistical technique for multivariate data analysis. For large data sets, the common approach to PCA computation is based on the standard NIPALS-PCA algorithm, which unfortunately suffers from loss of orthogonality, and therefore its applicability is usually limited to the estimation of the first few components. Here we present an algorithm based on Gram-Schmidt orthogonalization (called GS-PCA), which eliminates this shortcoming of NIPALS-PCA. Also, we discuss the GPU (Graphics Processing Unit) parallel implementation of both NIPALS-PCA and GS-PCA algorithms. The numerical results show that the GPU parallel optimized versions, based on CUBLAS (NVIDIA), are substantially faster (up to 12 times) than the CPU optimized versions based on CBLAS (GNU Scientific Library).
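
    A compact serial numpy sketch of the Gram-Schmidt re-orthogonalized NIPALS iteration that GS-PCA refers to is given below; it is our own minimal version, not the authors' GPU (CUBLAS) code.

```python
import numpy as np

def gs_pca(X, n_components, n_iter=100):
    """NIPALS-style power iterations with Gram-Schmidt re-orthogonalization
    against previously extracted loadings, so later components do not lose
    orthogonality (serial sketch only)."""
    X = X - X.mean(axis=0)                      # column-centre the data
    n, m = X.shape
    T = np.zeros((n, n_components))             # scores
    P = np.zeros((m, n_components))             # loadings
    R = X.copy()
    for k in range(n_components):
        t = R[:, 0].copy()                      # initial score guess
        for _ in range(n_iter):
            p = R.T @ t
            p -= P[:, :k] @ (P[:, :k].T @ p)    # Gram-Schmidt vs earlier loadings
            p /= np.linalg.norm(p)
            t = R @ p
        T[:, k], P[:, k] = t, p
        R = R - np.outer(t, p)                  # deflate the extracted component
    return T, P

if __name__ == "__main__":
    X = np.random.default_rng(1).normal(size=(200, 10))
    T, P = gs_pca(X, 3)
    print(np.round(P.T @ P, 6))                 # ~identity: loadings stay orthonormal
```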

  10. Improved magnetic field line design for TMX

    International Nuclear Information System (INIS)

    Logan, B.G.; Baldwin, D.E.; Foote, J.H.; Chargin, A.K.; Hinkle, R.E.; Hussung, R.O.; Damm, C.C.

    1977-01-01

    Optimization of the currents in the TMX magnet set leads to a field line configuration which has a central solenoidal region uniform in |B| to within 10 percent over a 2 m length. The field design has sufficient flexibility to meet all three physics objectives of the TMX experiment.

  11. Effects of parallel electron dynamics on plasma blob transport

    Energy Technology Data Exchange (ETDEWEB)

    Angus, Justin R.; Krasheninnikov, Sergei I. [University of California, San Diego, 9500 Gilman Drive, La Jolla, California 92093 (United States); Umansky, Maxim V. [Lawrence Livermore National Laboratory, 7000 East Avenue, Livermore, California 94550 (United States)

    2012-08-15

    The 3D effects on sheath connected plasma blobs that result from parallel electron dynamics are studied by allowing for the variation of blob density and potential along the magnetic field line and using collisional Ohm's law to model the parallel current density. The parallel current density from linear sheath theory, typically used in the 2D model, is implemented as parallel boundary conditions. This model includes electrostatic 3D effects, such as resistive drift waves and blob spinning, while retaining all of the fundamental 2D physics of sheath connected plasma blobs. If the growth time of unstable drift waves is comparable to the 2D advection time scale of the blob, then the blob's density gradient will be depleted, resulting in a much more diffusive blob with little radial motion. Furthermore, blob profiles that are initially varying along the field line drive the potential to a Boltzmann relation that spins the blob and thereby acts as an additional sink of the 2D potential. Basic dimensionless parameters are presented to estimate the relative importance of these two 3D effects. The deviation of blob dynamics from that predicted by 2D theory in the appropriate limits of these parameters is demonstrated by a direct comparison of 2D and 3D seeded blob simulations.

  12. Examination of Speed Contribution of Parallelization for Several Fingerprint Pre-Processing Algorithms

    Directory of Open Access Journals (Sweden)

    GORGUNOGLU, S.

    2014-05-01

    Full Text Available In the analysis of minutiae-based fingerprint systems, fingerprints need to be pre-processed. The pre-processing is carried out to enhance the quality of the fingerprint and to obtain more accurate minutiae points. Reducing the pre-processing time is important for identification and verification in real-time systems, and especially for databases holding information on large numbers of fingerprints. Parallel processing and parallel CPU computing can be considered as the distribution of processes over a multi-core processor, achieved by using parallel programming techniques. Reducing the execution time is the main objective in parallel processing. In this study, the pre-processing of a minutiae-based fingerprint system is implemented with parallel processing on multi-core computers using OpenMP, and on a graphics processor using CUDA, to improve the execution time. The execution times and speedup ratios are compared with those of a single-core processor. The results show that by using parallel processing, the execution time is substantially improved. The improvement ratios obtained for different pre-processing algorithms allowed us to make suggestions on the more suitable approaches for parallelization.
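
    The kind of data-parallel speedup measured in the abstract (OpenMP on multi-core CPUs, CUDA on GPUs) can be illustrated in Python, purely as a stand-in for the C/CUDA implementations the paper uses, by distributing a per-image enhancement routine over a process pool:

```python
from multiprocessing import Pool
import numpy as np

def enhance(fingerprint):
    """Stand-in pre-processing step: contrast-normalize one fingerprint image.
    A real pipeline (segmentation, orientation estimation, Gabor filtering,
    thinning) would be called here instead."""
    img = fingerprint.astype(np.float64)
    return (img - img.mean()) / (img.std() + 1e-9)

if __name__ == "__main__":
    # Hypothetical batch of 64 fingerprint images, 256x256 pixels each.
    batch = [np.random.default_rng(i).integers(0, 256, (256, 256)) for i in range(64)]
    with Pool() as pool:                        # one worker per CPU core by default
        enhanced = pool.map(enhance, batch)     # images are pre-processed in parallel
    print(len(enhanced), enhanced[0].shape)
```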

  13. Simulation Exploration through Immersive Parallel Planes: Preprint

    Energy Technology Data Exchange (ETDEWEB)

    Brunhart-Lupo, Nicholas; Bush, Brian W.; Gruchalla, Kenny; Smith, Steve

    2016-03-01

    We present a visualization-driven simulation system that tightly couples systems dynamics simulations with an immersive virtual environment to allow analysts to rapidly develop and test hypotheses in a high-dimensional parameter space. To accomplish this, we generalize the two-dimensional parallel-coordinates statistical graphic as an immersive 'parallel-planes' visualization for multivariate time series emitted by simulations running in parallel with the visualization. In contrast to traditional parallel coordinates, which map the multivariate dimensions onto coordinate axes represented by a series of parallel lines, we map pairs of the multivariate dimensions onto a series of parallel rectangles. As in the case of parallel coordinates, each individual observation in the dataset is mapped to a polyline whose vertices coincide with its coordinate values. Regions of the rectangles can be 'brushed' to highlight and select observations of interest; a 'slider' control allows the user to filter the observations by their time coordinate. In an immersive virtual environment, users interact with the parallel planes using a joystick that can select regions on the planes, manipulate selection, and filter time. The brushing and selection actions are used both to explore existing data and to launch additional simulations corresponding to the visually selected portions of the input parameter space. As soon as the new simulations complete, their resulting observations are displayed in the virtual environment. This tight feedback loop between simulation and immersive analytics accelerates users' realization of insights about the simulation and its output.

  14. Vectorization and parallelization of a production reactor assembly code

    International Nuclear Information System (INIS)

    Vujic, J.L.; Martin, W.R.; Michigan Univ., Ann Arbor, MI

    1991-01-01

    In order to use efficiently the new features of supercomputers, production codes, usually written 10 -20 years ago, must be tailored for modern computer architectures. We have chosen to optimize the CPM-2 code, a production reactor assembly code based on the collision probability transport method. Substantial speedup in the execution times was obtained with the parallel/vector version of the CPM-2 code. In addition, we have developed a new transfer probability method, which removes some of the modelling limitations of the collision probability method encoded in the CPM-2 code, and can fully utilize the parallel/vector architecture of a multiprocessor IBM 3090. (author)

  15. Vectorization and parallelization of a production reactor assembly code

    International Nuclear Information System (INIS)

    Vujic, J.L.; Martin, W.R.

    1991-01-01

    In order to efficiently use the new features of supercomputers, production codes, usually written 10-20 years ago, must be tailored for modern computer architectures. We have chosen to optimize the CPM-2 code, a production reactor assembly code based on the collision probability transport method. Substantial speedups in the execution times were obtained with the parallel/vector version of the CPM-2 code. In addition, we have developed a new transfer probability method, which removes some of the modelling limitations of the collision probability method encoded in the CPM-2 code, and can fully utilize the parallel/vector architecture of a multiprocessor IBM 3090. (author)

  16. Parallel Branch-and-Bound Methods for the Job Shop Scheduling

    DEFF Research Database (Denmark)

    Clausen, Jens; Perregaard, Michael

    1998-01-01

    Job-shop scheduling (JSS) problems are among the more difficult to solve in the class of NP-complete problems. The only successful approach has been branch-and-bound based algorithms, but such algorithms depend heavily on good bound functions. Much work has been done to identify such functions...... for the JSS problem, but with limited success. Even with recent methods, it is still not possible to solve problems substantially larger than 10 machines and 10 jobs. In the current study, we focus on parallel methods for solving JSS problems. We implement two different parallel branch-and-bound algorithms...

  17. Fast parallel event reconstruction

    CERN Multimedia

    CERN. Geneva

    2010-01-01

    On-line processing of the large data volumes produced in modern HEP experiments requires using the maximum capabilities of modern and future many-core CPU and GPU architectures. One such powerful feature is the SIMD instruction set, which allows packing several data items into one register and operating on all of them at once, thus achieving more operations per clock cycle. Motivated by the idea of using the SIMD unit of modern processors, the KF-based track fit has been adapted for parallelism, including memory optimization, numerical analysis, vectorization with inline operator overloading, and optimization using SDKs. The speed of the algorithm has been increased by a factor of 120,000, reaching 0.1 ms/track when running in parallel on 16 SPEs of a Cell Blade computer. Running on a Nehalem CPU with 8 cores, it shows a processing speed of 52 ns/track using the Intel Threading Building Blocks. The same KF algorithm running on an Nvidia GTX 280 in the CUDA framework provi...
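
    The record above concerns a SIMD-vectorized Kalman-filter track fit; the toy numpy sketch below only illustrates the underlying data-packing idea of updating a parameter for many tracks with one vectorized statement. The update rule and array names are invented for illustration, not the actual fit.

    import numpy as np

    n_tracks = 1_000_000
    x = np.random.rand(n_tracks)            # toy track parameter, one entry per track
    measurement = np.random.rand(n_tracks)  # toy measurements
    gain = 0.5                              # toy gain factor

    # One vectorized statement updates every track; SIMD hardware applies the
    # same idea by packing several values into one register per instruction.
    x += gain * (measurement - x)
    print(x[:3])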

  18. Parallel transport in ideal magnetohydrodynamics and applications to resistive wall modes

    International Nuclear Information System (INIS)

    Finn, J.M.; Gerwin, R.A.

    1996-01-01

    It is shown that in magnetohydrodynamics (MHD) with an ideal Ohm's law, in the presence of parallel heat flux, density gradient, temperature gradient, and parallel compression, but in the absence of perpendicular compressibility, there is an exact cancellation of the parallel transport terms. This cancellation is due to the fact that magnetic flux is advected in the presence of an ideal Ohm's law, and therefore parallel transport of temperature and density gives the same result as perpendicular advection of the same quantities. Discussions are also presented regarding parallel viscosity and parallel velocity shear, and the generalization to toroidal geometry. These results suggest that a correct generalization of the Hammett-Perkins fluid operator [G. W. Hammett and F. W. Perkins, Phys. Rev. Lett. 64, 3019 (1990)] to simulate Landau damping for electromagnetic modes must give an operator that acts on the dynamics parallel to the perturbed magnetic field lines. copyright 1996 American Institute of Physics

  19. An Integrated Inductor For Parallel Interleaved Three-Phase Voltage Source Converters

    DEFF Research Database (Denmark)

    Gohil, Ghanshyamsinh Vijaysinh; Bede, Lorand; Teodorescu, Remus

    2016-01-01

    Three-phase Voltage Source Converters (VSCs) are often connected in parallel to realize a high-current output converter system. The harmonic quality of the resultant switched output voltage can be improved by interleaving the carrier signals of these parallel-connected VSCs. As a result, the line...... of the state-of-the-art filtering solution. The performance of the integrated inductor is also verified by the experimental measurements....

  20. CS-Studio Scan System Parallelization

    Energy Technology Data Exchange (ETDEWEB)

    Kasemir, Kay [ORNL; Pearson, Matthew R [ORNL

    2015-01-01

    For several years, the Control System Studio (CS-Studio) Scan System has successfully automated the operation of beam lines at the Oak Ridge National Laboratory (ORNL) High Flux Isotope Reactor (HFIR) and Spallation Neutron Source (SNS). As it is applied to additional beam lines, we need to support simultaneous adjustments of temperatures or motor positions. While this can be implemented via virtual motors or similar logic inside the Experimental Physics and Industrial Control System (EPICS) Input/Output Controllers (IOCs), doing so requires a priori knowledge of experimenters' requirements. By adding support for the parallel control of multiple process variables (PVs) to the Scan System, we can better support ad hoc automation of experiments that benefit from such simultaneous PV adjustments.

  1. Conductance of auroral magnetic field lines

    International Nuclear Information System (INIS)

    Weimer, D.R.; Gurnett, D.A.; Goertz, C.K.

    1986-01-01

    DE-1 high-resolution double-probe electric-field data and simultaneous magnetic-field measurements are reported for two 1981 events with large electric fields which reversed over short distances. The data are presented graphically and analyzed in detail. A field-line conductance of about 1 nmho/sq m is determined for both upward and downward currents, and the ionospheric conductivity is shown, in the short-wavelength limit, to have little effect on the relationship between the (N-S) electric and (E-W) magnetic fields above the potential drop parallel to the magnetic-field lines. The results are found to be consistent with a linear relationship between the field-aligned current density and the parallel potential drop. 14 references

  2. New algorithms for parallel MRI

    International Nuclear Information System (INIS)

    Anzengruber, S; Ramlau, R; Bauer, F; Leitao, A

    2008-01-01

    Magnetic Resonance Imaging with parallel data acquisition requires algorithms for reconstructing the patient's image from a small number of measured lines of the Fourier domain (k-space). In contrast to well-known algorithms like SENSE and GRAPPA and their flavors, we consider the problem as a non-linear inverse problem. However, in order to avoid cost-intensive derivatives we use the Landweber-Kaczmarz iteration and, in order to improve the overall results, some additional sparsity constraints.
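
    As a hedged illustration of the derivative-free iteration mentioned above, the numpy sketch below implements a plain Landweber iteration x_{k+1} = x_k + w A^H (y - A x_k) for a generic linear operator A; the Kaczmarz sweeps, the coil sensitivities, and the sparsity constraints of the actual method are omitted.

    import numpy as np

    def landweber(A, y, n_iter=500, relaxation=None):
        # The relaxation parameter must satisfy 0 < w < 2 / ||A||^2 for convergence.
        if relaxation is None:
            relaxation = 1.0 / np.linalg.norm(A, 2) ** 2
        x = np.zeros(A.shape[1], dtype=A.dtype)
        for _ in range(n_iter):
            residual = y - A @ x
            x = x + relaxation * (A.conj().T @ residual)  # Landweber update: x <- x + w * A^H (y - A x)
        return x

    A = np.random.randn(64, 32) + 1j * np.random.randn(64, 32)  # toy forward operator
    x_true = np.random.randn(32)
    y = A @ x_true
    print(np.linalg.norm(landweber(A, y) - x_true))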

  3. Critical care medicine as a distinct product line with substantial financial profitability: the role of business planning.

    Science.gov (United States)

    Bekes, Carolyn E; Dellinger, R Phillip; Brooks, Daniel; Edmondson, Robert; Olivia, Christopher T; Parrillo, Joseph E

    2004-05-01

    As academic health centers face increasing financial pressures, they have adopted a more businesslike approach to planning, particularly for discrete "product" or clinical service lines. Since critical care typically has been viewed as a service provided by a hospital, and not a product line, business plans have not historically been developed to expand and promote critical care. The major focus when examining the finances of critical care has been cost reduction, not business development. We hypothesized that a critical care business plan can be developed and analyzed like other more typical product lines and that such a critical care product line can be profitable for an institution. In-depth analysis of critical care including business planning for critical care services. Regional academic health center in southern New Jersey. None. As part of an overall business planning process directed by the Board of Trustees, the critical care product line was identified by isolating revenue, expenses, and profitability associated with critical care patients. We were able to identify the major sources ("value chain") of critical care patients: the emergency room, patients who are admitted for other problems but spend time in a critical care unit, and patients transferred to our intensive care units from other hospitals. The greatest opportunity to expand the product line comes from increasing the referrals from other hospitals. A methodology was developed to identify the revenue and expenses associated with critical care, based on the analysis of past experience. With this model, we were able to demonstrate a positive contribution margin of $7 million per year related to patients transferred to the institution primarily for critical care services. This can be seen as the profit related to the product line segment of critical care. There was an additional positive contribution margin of $5.8 million attributed to the critical care portion of the hospital stay of

  4. Research on Control Strategy of Complex Systems through VSC-HVDC Grid Parallel Device

    Directory of Open Access Journals (Sweden)

    Xue Mei-Juan

    2014-07-01

    Full Text Available After grid paralleling is completed, the device can be converted into a UPFC, STATCOM, or SSSC; the conversion circuit and the transformation method realized by the corresponding switching operations are investigated. The device accomplishes grid paralleling, comprehensive control of the tie-line, and stable operation and control of the grid after paralleling. A function-selection switch matrix and the branch variables of the grid-parallel system are defined, forming a switch matrix that realizes the corresponding functions of the composite system. A criterion based on the switch matrix is formulated for selecting the control strategy, so that the corresponding function is accomplished. The grid-paralleling device, STATCOM, SSSC, and UPFC are treated together as one system, improving the stable operation and flexible control of the power system.

  5. Parallel Simulation of Three-Dimensional Free Surface Fluid Flow Problems

    International Nuclear Information System (INIS)

    BAER, THOMAS A.; SACKINGER, PHILIP A.; SUBIA, SAMUEL R.

    1999-01-01

    Simulation of viscous three-dimensional fluid flow typically involves a large number of unknowns. When free surfaces are included, the number of unknowns increases dramatically. Consequently, this class of problem is an obvious application of parallel high performance computing. We describe parallel computation of viscous, incompressible, free surface, Newtonian fluid flow problems that include dynamic contact lines. The Galerkin finite element method was used to discretize the fully-coupled governing conservation equations and a ''pseudo-solid'' mesh mapping approach was used to determine the shape of the free surface. In this approach, the finite element mesh is allowed to deform to satisfy quasi-static solid mechanics equations subject to geometric or kinematic constraints on the boundaries. As a result, nodal displacements must be included in the set of unknowns. Other issues discussed are the proper constraints appearing along the dynamic contact line in three dimensions. Issues affecting efficient parallel simulations include problem decomposition to equally distribute computational work among the processors of an SPMD computer and determination of robust, scalable preconditioners for the distributed matrix systems that must be solved. Solution continuation strategies important for serial simulations have an enhanced relevance in a parallel computing environment due to the difficulty of solving large scale systems. Parallel computations will be demonstrated on an example taken from the coating flow industry: flow in the vicinity of a slot coater edge. This is a three dimensional free surface problem possessing a contact line that advances at the web speed in one region but transitions to static behavior in another region. As such, a significant fraction of the computational time is devoted to processing boundary data. Discussion focuses on parallel speed-ups for fixed problem size, a class of problems of immediate practical importance

  6. Optimization approaches to mpi and area merging-based parallel buffer algorithm

    Directory of Open Access Journals (Sweden)

    Junfu Fan

    Full Text Available In buffer zone construction, the rasterization-based dilation method inevitably introduces errors, and the double-sided parallel line method involves a series of complex operations. In this paper, we propose a parallel buffer algorithm based on area merging and MPI (Message Passing Interface) to improve the performance of buffer analyses on large datasets. Experimental results reveal three major performance bottlenecks that significantly impact the serial and parallel buffer construction efficiencies: the area merging strategy, the task load balancing method, and the MPI inter-process results merging strategy. Corresponding optimization approaches, involving a tree-like area merging strategy, a vertex-number-oriented parallel task partition method, and an improved inter-process results merging strategy, are suggested to overcome these bottlenecks. Experiments were carried out to examine the performance of the optimized parallel algorithm. The results suggest that the optimization approaches provide high performance and processing ability for buffer construction in a cluster parallel environment. Our method could provide insights into the parallelization of spatial analysis algorithms.
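
    A minimal sketch of the per-process buffer-and-merge idea, assuming the mpi4py and shapely packages are available; the vertex-number-oriented task partition and the tree-like inter-process merging described in the paper are simplified here to a round-robin scatter and a flat gather.

    from mpi4py import MPI
    from shapely.geometry import Point
    from shapely.ops import unary_union

    comm = MPI.COMM_WORLD
    rank, size = comm.Get_rank(), comm.Get_size()

    if rank == 0:
        points = [(float(i), float(i % 10)) for i in range(1000)]
        chunks = [points[i::size] for i in range(size)]  # simple round-robin partition
    else:
        chunks = None
    my_points = comm.scatter(chunks, root=0)

    # Local buffering and local area merging on each process.
    local_merged = unary_union([Point(x, y).buffer(0.5) for x, y in my_points])

    # Inter-process results merging (flat gather; the paper uses a tree-like strategy).
    pieces = comm.gather(local_merged, root=0)
    if rank == 0:
        print("merged buffer area:", unary_union(pieces).area)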

  7. Massive hybrid parallelism for fully implicit multiphysics

    International Nuclear Information System (INIS)

    Gaston, D. R.; Permann, C. J.; Andrs, D.; Peterson, J. W.

    2013-01-01

    As hardware advances continue to modify the supercomputing landscape, traditional scientific software development practices will become more outdated, ineffective, and inefficient. The process of rewriting/retooling existing software for new architectures is a Sisyphean task, and results in substantial hours of development time, effort, and money. Software libraries which provide an abstraction of the resources provided by such architectures are therefore essential if the computational engineering and science communities are to continue to flourish in this modern computing environment. The Multiphysics Object Oriented Simulation Environment (MOOSE) framework enables complex multiphysics analysis tools to be built rapidly by scientists, engineers, and domain specialists, while also allowing them to both take advantage of current HPC architectures, and efficiently prepare for future supercomputer designs. MOOSE employs a hybrid shared-memory and distributed-memory parallel model and provides a complete and consistent interface for creating multiphysics analysis tools. In this paper, a brief discussion of the mathematical algorithms underlying the framework and the internal object-oriented hybrid parallel design are given. Representative massively parallel results from several applications areas are presented, and a brief discussion of future areas of research for the framework are provided. (authors)

  8. Massive hybrid parallelism for fully implicit multiphysics

    Energy Technology Data Exchange (ETDEWEB)

    Gaston, D. R.; Permann, C. J.; Andrs, D.; Peterson, J. W. [Idaho National Laboratory, 2525 N. Fremont Ave., Idaho Falls, ID 83415 (United States)

    2013-07-01

    As hardware advances continue to modify the supercomputing landscape, traditional scientific software development practices will become more outdated, ineffective, and inefficient. The process of rewriting/retooling existing software for new architectures is a Sisyphean task, and results in substantial hours of development time, effort, and money. Software libraries which provide an abstraction of the resources provided by such architectures are therefore essential if the computational engineering and science communities are to continue to flourish in this modern computing environment. The Multiphysics Object Oriented Simulation Environment (MOOSE) framework enables complex multiphysics analysis tools to be built rapidly by scientists, engineers, and domain specialists, while also allowing them to both take advantage of current HPC architectures, and efficiently prepare for future supercomputer designs. MOOSE employs a hybrid shared-memory and distributed-memory parallel model and provides a complete and consistent interface for creating multiphysics analysis tools. In this paper, a brief discussion of the mathematical algorithms underlying the framework and the internal object-oriented hybrid parallel design are given. Representative massively parallel results from several applications areas are presented, and a brief discussion of future areas of research for the framework are provided. (authors)

  9. MASSIVE HYBRID PARALLELISM FOR FULLY IMPLICIT MULTIPHYSICS

    Energy Technology Data Exchange (ETDEWEB)

    Cody J. Permann; David Andrs; John W. Peterson; Derek R. Gaston

    2013-05-01

    As hardware advances continue to modify the supercomputing landscape, traditional scientific software development practices will become more outdated, ineffective, and inefficient. The process of rewriting/retooling existing software for new architectures is a Sisyphean task, and results in substantial hours of development time, effort, and money. Software libraries which provide an abstraction of the resources provided by such architectures are therefore essential if the computational engineering and science communities are to continue to flourish in this modern computing environment. The Multiphysics Object Oriented Simulation Environment (MOOSE) framework enables complex multiphysics analysis tools to be built rapidly by scientists, engineers, and domain specialists, while also allowing them to both take advantage of current HPC architectures, and efficiently prepare for future supercomputer designs. MOOSE employs a hybrid shared-memory and distributed-memory parallel model and provides a complete and consistent interface for creating multiphysics analysis tools. In this paper, a brief discussion of the mathematical algorithms underlying the framework and the internal object-oriented hybrid parallel design are given. Representative massively parallel results from several applications areas are presented, and a brief discussion of future areas of research for the framework are provided.

  10. A parallel algorithm for 3D dislocation dynamics

    International Nuclear Information System (INIS)

    Wang Zhiqiang; Ghoniem, Nasr; Swaminarayan, Sriram; LeSar, Richard

    2006-01-01

    Dislocation dynamics (DD), a discrete dynamic simulation method in which dislocations are the fundamental entities, is a powerful tool for investigation of plasticity, deformation and fracture of materials at the micron length scale. However, severe computational difficulties arising from complex, long-range interactions between these curvilinear line defects limit the application of DD in the study of large-scale plastic deformation. We present here the development of a parallel algorithm for accelerated computer simulations of DD. By representing dislocations as a 3D set of dislocation particles, we show here that the problem of an interacting ensemble of dislocations can be converted to a problem of a particle ensemble, interacting with a long-range force field. A grid using binary space partitioning is constructed to keep track of node connectivity across domains. We demonstrate the computational efficiency of the parallel micro-plasticity code and discuss how O(N) methods map naturally onto the parallel data structure. Finally, we present results from applications of the parallel code to deformation in single crystal fcc metals

  11. The relation between reconnected flux, the parallel electric field, and the reconnection rate in a three-dimensional kinetic simulation of magnetic reconnection

    International Nuclear Information System (INIS)

    Wendel, D. E.; Olson, D. K.; Hesse, M.; Kuznetsova, M.; Adrian, M. L.; Aunai, N.; Karimabadi, H.; Daughton, W.

    2013-01-01

    We investigate the distribution of parallel electric fields and their relationship to the location and rate of magnetic reconnection in a large particle-in-cell simulation of 3D turbulent magnetic reconnection with open boundary conditions. The simulation's guide field geometry inhibits the formation of simple topological features such as null points. Therefore, we derive the location of potential changes in magnetic connectivity by finding the field lines that experience a large relative change between their endpoints, i.e., the quasi-separatrix layer. We find a good correspondence between the locus of changes in magnetic connectivity or the quasi-separatrix layer and the map of large gradients in the integrated parallel electric field (or quasi-potential). Furthermore, we investigate the distribution of the parallel electric field along the reconnecting field lines. We find the reconnection rate is controlled by only the low-amplitude, zeroth- and first-order trends in the parallel electric field while the contribution from fluctuations of the parallel electric field, such as electron holes, is negligible. The results impact the determination of reconnection sites and reconnection rates in models and in situ spacecraft observations of 3D turbulent reconnection. It is difficult through direct observation to isolate the loci of the reconnection parallel electric field amidst the large amplitude fluctuations. However, we demonstrate that a positive slope of the running sum of the parallel electric field along the field line as a function of field line length indicates where reconnection is occurring along the field line
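
    A small numpy sketch of the diagnostic described above, on synthetic data: accumulate the running sum of the parallel electric field along a field line (the quasi-potential) and look for stretches where it increases with arc length despite large-amplitude fluctuations. The field values and spacing are invented for illustration.

    import numpy as np

    s = np.linspace(0.0, 10.0, 1001)            # arc length along the field line
    e_par = 0.02 * np.exp(-(s - 5.0) ** 2) \
            + 0.1 * np.random.randn(s.size)     # low-amplitude trend plus large fluctuations

    ds = s[1] - s[0]
    quasi_potential = np.cumsum(e_par) * ds     # running sum of the parallel electric field

    # A positive slope of the running sum marks where the reconnection-type
    # parallel electric field accumulates despite the fluctuations.
    slope = np.gradient(quasi_potential, s)
    print("fraction of the line with positive slope:", np.mean(slope > 0))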

  12. An Integrated Inductor for Parallel Interleaved VSCs and PWM Schemes for Flux Minimization

    DEFF Research Database (Denmark)

    Gohil, Ghanshyamsinh Vijaysinh; Bede, Lorand; Teodorescu, Remus

    2015-01-01

    The interleaving of the carrier signals of the parallel Voltage Source Converters (VSCs) can reduce the harmonic content in the resultant switched output voltages. As a result, the size of the line filter inductor can be reduced. However, in addition to the line filter, an inductive filter is often...

  13. Implementation of PHENIX trigger algorithms on massively parallel computers

    International Nuclear Information System (INIS)

    Petridis, A.N.; Wohn, F.K.

    1995-01-01

    The event selection requirements of contemporary high energy and nuclear physics experiments are met by the introduction of on-line trigger algorithms which identify potentially interesting events and reduce the data acquisition rate to levels that are manageable by the electronics. Such algorithms, being parallel in nature, can be simulated off-line using massively parallel computers. The PHENIX experiment intends to investigate the possible existence of a new phase of matter called the quark gluon plasma, which has been theorized to have existed in the very early stages of the evolution of the universe, by studying collisions of heavy nuclei at ultra-relativistic energies. Such interactions can also reveal important information regarding the structure of the nucleus and mandate a thorough investigation of the simpler proton-nucleus collisions at the same energies. The complexity of PHENIX events and the need to analyze and also simulate them at rates similar to the data collection ones impose enormous computational demands. This work is a first effort to implement PHENIX trigger algorithms on parallel computers and to study the feasibility of using such machines to run the complex programs necessary for the simulation of the PHENIX detector response. Fine and coarse grain approaches have been studied and evaluated. Depending on the application, the performance of a massively parallel computer can be much better or much worse than that of a serial workstation. A comparison between single instruction and multiple instruction computers is also made and possible applications of the single instruction machines to high energy and nuclear physics experiments are outlined. copyright 1995 American Institute of Physics

  14. Analysis of Relative Parallelism Between Hamular-Incisive-Papilla Plane and Campers Plane in Edentulous Subjects: A Comparative Study.

    Science.gov (United States)

    Tambake, Deepti; Shetty, Shilpa; Satish Babu, C L; Fulari, Sangamesh G

    2014-12-01

    The study was undertaken to evaluate the parallelism between the hamular-incisive-papilla (HIP) plane and the Camper's plane, and to determine which part of the posterior reference of the tragus, i.e., the superior, middle, or inferior point used for the Camper's plane, is parallel to the HIP, using digital lateral cephalograms. Fifty edentulous subjects with well-formed ridges were selected for the study. The master casts were obtained using the standard selective pressure impression procedure. Stainless steel spherical bearings were glued to the cast at the marked points, namely the deepest point of the hamular notches and the centre of the incisive papilla. The study templates were fabricated with autopolymerizing acrylic resin. The subjects were prepared for the lateral cephalograms. Stainless steel spherical bearings were adhered to the superior, middle, and inferior points of the tragus of the ear and to the inferior border of the ala of the nose using surgical adhesive tape. The subjects, wearing the study templates, were subjected to lateral cephalograms. Cephalometric tracings were done using AutoCAD 2010 software. Lines were drawn connecting the incisive papilla and hamular notch, and connecting the stainless steel spherical bearings placed on the superior, middle, and inferior points of the tragus to the ala of the nose, i.e., Camper's line S, Camper's line M, and Camper's line I. The angles between the three Camper's lines and the HIP were measured and recorded. The highest mean angulation was recorded for Camper's line S-HIP (8.03°), followed by Camper's line M-HIP (4.60°); Camper's line I-HIP recorded the least angulation (3.80°). The HIP is parallel to the Camper's plane. The Camper's plane formed with the inferior point of the tragus as the posterior reference point is the most nearly parallel to the HIP.

  15. Massively parallel algorithms for trace-driven cache simulations

    Science.gov (United States)

    Nicol, David M.; Greenberg, Albert G.; Lubachevsky, Boris D.

    1991-01-01

    Trace-driven cache simulation is central to computer design. A trace is a very long sequence of reference lines from main memory. At the t-th instant, reference x_t is hashed into a set of cache locations, the contents of which are then compared with x_t. If at the t-th instant x_t is not present in the cache, then it is said to be a miss, and is loaded into the cache set, possibly forcing the replacement of some other memory line, and making x_t present for the (t+1)-st instant. The problem of parallel simulation of a subtrace of N references directed to a C-line cache set is considered, with the aim of determining which references are misses and related statistics. A simulation method is presented for the Least Recently Used (LRU) policy, which regardless of the set size C runs in time O(log N) using N processors on the exclusive read, exclusive write (EREW) parallel model. A simpler LRU simulation algorithm is given that runs in O(C log N) time using N/log N processors. Timings are presented of the second algorithm's implementation on the MasPar MP-1, a machine with 16384 processors. A broad class of reference-based line replacement policies is considered, which includes LRU as well as the Least Frequently Used and Random replacement policies. A simulation method is presented for any such policy that, on any trace of length N directed to a C-line set, runs in O(C log N) time with high probability using N processors on the EREW model. The algorithms are simple, have very little space overhead, and are well suited for SIMD implementation.
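
    For reference, a serial Python sketch of the LRU trace simulation that the parallel algorithms above reproduce: each reference is a hit if it is present in the C-line set and otherwise a miss that evicts the least recently used line. This is the straightforward serial baseline, not the O(log N) EREW algorithm of the paper.

    from collections import OrderedDict

    def simulate_lru(trace, cache_lines):
        cache = OrderedDict()  # keys ordered from least to most recently used
        misses = 0
        for ref in trace:
            if ref in cache:
                cache.move_to_end(ref)   # hit: mark as most recently used
            else:
                misses += 1              # miss: load, evicting the LRU line if the set is full
                if len(cache) == cache_lines:
                    cache.popitem(last=False)
                cache[ref] = True
        return misses

    trace = [1, 2, 3, 1, 4, 2, 5, 1, 2, 3]
    print(simulate_lru(trace, cache_lines=3), "misses")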

  16. User-friendly parallelization of GAUDI applications with Python

    International Nuclear Information System (INIS)

    Mato, Pere; Smith, Eoin

    2010-01-01

    GAUDI is a software framework in C++ used to build event data processing applications using a set of standard components with well-defined interfaces. Simulation, high-level trigger, reconstruction, and analysis programs used by several experiments are developed using GAUDI. These applications can be configured and driven by simple Python scripts. Given the fact that a considerable amount of existing software has been developed using serial methodology, and has existed in some cases for many years, implementation of parallelisation techniques at the framework level may offer a way of exploiting current multi-core technologies to maximize performance and reduce latencies without re-writing thousands/millions of lines of code. In the solution we have developed, the parallelization techniques are introduced to the high level Python scripts which configure and drive the applications, such that the core C++ application code requires no modification, and that end users need make only minimal changes to their scripts. The developed solution leverages existing generic Python modules that support parallel processing. Naturally, the parallel version of a given program should produce results consistent with its serial execution. The evaluation of several prototypes incorporating various parallelization techniques is presented and discussed.
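
    A hedged sketch of the script-level pattern described above, using only the Python standard library; run_subjob stands in for configuring and running one GAUDI application over a slice of events and is not part of the GAUDI API.

    from multiprocessing import Pool

    def run_subjob(event_range):
        first, last = event_range
        # Placeholder "processing": a real script would configure the application
        # for this event range, run it, and return histograms or counters.
        return {"events": last - first}

    def run_parallel(total_events, n_workers):
        step = total_events // n_workers
        ranges = [(i * step, (i + 1) * step) for i in range(n_workers)]
        with Pool(n_workers) as pool:
            partial_results = pool.map(run_subjob, ranges)
        # Merge step: the combined result must match what a serial run would produce.
        return sum(r["events"] for r in partial_results)

    if __name__ == "__main__":
        print(run_parallel(total_events=10000, n_workers=4), "events processed")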

  17. User-friendly parallelization of GAUDI applications with Python

    Energy Technology Data Exchange (ETDEWEB)

    Mato, Pere; Smith, Eoin, E-mail: pere.mato@cern.c [PH Department, CERN, 1211 Geneva 23 (Switzerland)

    2010-04-01

    GAUDI is a software framework in C++ used to build event data processing applications using a set of standard components with well-defined interfaces. Simulation, high-level trigger, reconstruction, and analysis programs used by several experiments are developed using GAUDI. These applications can be configured and driven by simple Python scripts. Given the fact that a considerable amount of existing software has been developed using serial methodology, and has existed in some cases for many years, implementation of parallelisation techniques at the framework level may offer a way of exploiting current multi-core technologies to maximize performance and reduce latencies without re-writing thousands/millions of lines of code. In the solution we have developed, the parallelization techniques are introduced to the high level Python scripts which configure and drive the applications, such that the core C++ application code requires no modification, and that end users need make only minimal changes to their scripts. The developed solution leverages existing generic Python modules that support parallel processing. Naturally, the parallel version of a given program should produce results consistent with its serial execution. The evaluation of several prototypes incorporating various parallelization techniques is presented and discussed.

  18. Characterization of DNA repair phenotypes of Xeroderma pigmentosum cell lines by a paralleled in vitro test

    International Nuclear Information System (INIS)

    Raffin, A.L.

    2009-06-01

    DNA is constantly damaged, modifying the genetic information it encodes. Several cellular mechanisms, such as Base Excision Repair (BER) and Nucleotide Excision Repair (NER), allow the correct DNA sequence to be recovered. Xeroderma pigmentosum is a disease characterised by a deficiency in the NER pathway. The aim of this study was to propose an efficient and fast test for the diagnosis of this disease as an alternative to the currently available UDS test. DNA repair activities of XP cell lines were quantified using in vitro miniaturized and paralleled tests in order to establish the DNA repair phenotypes of XPA- and XPC-deficient cells. The main advantage of the tests used in this study is the simultaneous measurement of the excision or excision synthesis (ES) of several lesions by a single cellular extract. We showed, on the one hand, that the relative ES of the different lesions depends strongly on the protein concentration of the nuclear extract tested. Working at high protein concentration allowed the XP phenotype to be discriminated from the control one, whereas this was impossible below a certain concentration threshold. On the other hand, while UVB irradiation of control cells stimulated their repair activities, this effect was not observed in XP cells. This study brings new information on the roles of the XPA and XPC proteins during BER and NER and underlines the complexity of the regulation of DNA repair processes. (author)

  19. Parallel Programming with Intel Parallel Studio XE

    CERN Document Server

    Blair-Chappell , Stephen

    2012-01-01

    Optimize code for multi-core processors with Intel's Parallel Studio. Parallel programming is rapidly becoming a "must-know" skill for developers. Yet, where to start? This teach-yourself tutorial is an ideal starting point for developers who already know Windows C and C++ and are eager to add parallelism to their code. With a focus on applying tools, techniques, and language extensions to implement parallelism, this essential resource teaches you how to write programs for multicore and leverage the power of multicore in your programs. Sharing hands-on case studies and real-world examples, the

  20. He II lines in the spectrum of zeta Puppis

    International Nuclear Information System (INIS)

    Snijders, M.A.J.; Underhill, A.B.

    1975-01-01

    Equivalent widths of He II lines in the series n=2, 3, 4 and 5 are compiled and compared with predictions from plane-parallel, static model atmospheres using a non-LTE theory of line formation. The agreement between observation and prediction for a (50,000, 4.0) model atmosphere is good for the upper members of the n=3 and the n=5 series, but the two lines of the n=2 series which are observed and the upper members of the n=4 series (4→15, 4→17, etc.) are stronger than predicted. Well-determined profiles of lines from the n=3 series indicate v sin i = 200 km s^-1. Profiles of the higher members of the n=4 series, however, do not match the predictions, the observed line cores being deeper than predicted. The n=4 level appears to be more overpopulated at moderate depths in the atmosphere than the non-LTE calculations with plane-parallel layers indicate. This may be due to an overlap of the H and He II lines in the even-even series caused by macroturbulent velocities of the hydrogen atoms and helium atoms

  1. Performance assessment of the SIMFAP parallel cluster at IFIN-HH Bucharest

    International Nuclear Information System (INIS)

    Adam, Gh.; Adam, S.; Ayriyan, A.; Dushanov, E.; Hayryan, E.; Korenkov, V.; Lutsenko, A.; Mitsyn, V.; Sapozhnikova, T.; Sapozhnikov, A; Streltsova, O.; Buzatu, F.; Dulea, M.; Vasile, I.; Sima, A.; Visan, C.; Busa, J.; Pokorny, I.

    2008-01-01

    Performance assessment and case study outputs of the parallel SIMFAP cluster at IFIN-HH Bucharest point to its effective and reliable operation. A comparison with results on the supercomputing system in LIT-JINR Dubna adds insight on resource allocation for problem solving by parallel computing. The solution of models asking for very large numbers of knots in the discretization mesh needs the migration to high performance computing based on parallel cluster architectures. The acquisition of ready-to-use parallel computing facilities being beyond limited budgetary resources, the solution at IFIN-HH was to buy the hardware and the inter-processor network, and to implement by own efforts the open software concerning both the operating system and the parallel computing standard. The present paper provides a report demonstrating the successful solution of these tasks. The implementation of the well-known HPL (High Performance LINPACK) Benchmark points to the effective and reliable operation of the cluster. The comparison of HPL outputs obtained on parallel clusters of different magnitudes shows that there is an optimum range of the order N of the linear algebraic system over which a given parallel cluster provides optimum parallel solutions. For the SIMFAP cluster, this range can be inferred to correspond to about 1 to 2 x 10^4 linear algebraic equations. For an algorithm of polynomial complexity N^α the task sharing among p processors within a parallel solution mainly follows an (N/p)^α behaviour under peak performance achievement. Thus, while the problem complexity remains the same, a substantial decrease of the coefficient of the leading order of the polynomial complexity is achieved. (authors)
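
    A tiny illustration of the idealized scaling quoted above: if a solver costs roughly N^alpha operations, sharing the work over p processors reduces the per-processor cost to roughly (N/p)^alpha, i.e. by a factor of p^alpha. The numbers below are made up for illustration.

    def per_processor_cost(n, p, alpha):
        # idealized work per processor under the (N/p)**alpha model
        return (n / p) ** alpha

    n, alpha = 2.0e4, 3  # ~2 x 10^4 equations, cubic-complexity solver
    for p in (1, 4, 16):
        print(p, "processors ->", per_processor_cost(n, p, alpha), "operations per processor")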

  2. Fabrication of Si-nozzles for parallel mechano-electrospinning direct writing

    International Nuclear Information System (INIS)

    Pan, Yanqiao; Huang, YongAn; Bu, Ningbin; Yin, Zhouping

    2013-01-01

    Nozzles with micro-scale orifices drive high-resolution printing techniques for generating micro- to nano-scale droplets/lines. This paper presents the fabrication and application of Si-nozzles in mechano-electrospinning (MES). The fabrication process mainly consists of photolithography, Au deposition, inductively coupled plasma, and polydimethylsiloxane encapsulation. A 6 wt% polyethylene oxide solution is adopted to study the electrospinning behaviour and the relations between fibre diameter and process parameters in MES. A fibre grid with 250 µm spacing can be direct-written, with fibre diameters of less than 3 µm. To improve the printing efficiency, positioning accuracy, and flexibility, a rotatable multi-nozzle is adopted. The distance between parallel lines reduces sharply from 4.927 to 0.308 mm as the rotation angle increases from 0° to 87°, and fibre grids with tunable spacing are achieved. This method paves the way for the fabrication of addressable Si-nozzle arrays for parallel MES direct writing. (paper)

  3. Supertracker: A Programmable Parallel Pipeline Arithmetic Processor For Auto-Cueing Target Processing

    Science.gov (United States)

    Mack, Harold; Reddi, S. S.

    1980-04-01

    Supertracker represents a programmable parallel pipeline computer architecture that has been designed to meet the real-time image processing requirements of auto-cueing target data processing. The prototype breadboard currently under development will be designed to perform input video preprocessing and processing for 525-line and 875-line TV format FLIR video, automatic display gain and contrast control, and automatic target cueing, classification, and tracking. The video preprocessor is capable of performing operations on full frames of video data in real time, e.g., frame integration, storage, 3 x 3 convolution, and neighborhood processing. The processor architecture is being implemented using bit-slice microprogrammable arithmetic processors operating in parallel. Each processor is capable of up to 20 million operations per second. Multiple frame memories are used for additional flexibility.

  4. The concept of parallel input/output processing for an electron linac

    International Nuclear Information System (INIS)

    Emoto, Takashi

    1993-01-01

    The instrumentation and control system of the PNC 10 MeV CW electron linac are described. A new concept of parallel input/output processing for the linac has been introduced. It is based on a substantial number of input/output processors (IOPs) used for beam control and diagnostics. The flexibility and simplicity of the hardware/software are significant advantages of this scheme. (author)

  5. Parallelized Seeded Region Growing Using CUDA

    Directory of Open Access Journals (Sweden)

    Seongjin Park

    2014-01-01

    Full Text Available This paper presents a novel method for parallelizing the seeded region growing (SRG) algorithm using Compute Unified Device Architecture (CUDA) technology, with the intention of overcoming the theoretical weakness of the SRG algorithm, whose computation time is directly proportional to the size of the segmented region. The segmentation performance of the proposed CUDA-based SRG is compared with SRG implementations on single-core CPUs, quad-core CPUs, and shader-language programming, using synthetic datasets and 20 body CT scans. Based on the experimental results, the CUDA-based SRG outperforms the other three implementations, advocating that it can substantially assist segmentation during massive CT screening tests.
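
    For comparison, a serial Python/numpy sketch of seeded region growing: starting from a seed, neighbouring pixels are absorbed while their intensity stays close to the running region mean. The CUDA method in the record parallelizes this frontier expansion; the threshold criterion below is a common simple choice, not necessarily the one used in the paper.

    import numpy as np
    from collections import deque

    def seeded_region_growing(image, seed, threshold):
        grown = np.zeros(image.shape, dtype=bool)
        grown[seed] = True
        region_sum, region_count = float(image[seed]), 1
        frontier = deque([seed])
        while frontier:
            r, c = frontier.popleft()
            for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                if 0 <= nr < image.shape[0] and 0 <= nc < image.shape[1] and not grown[nr, nc]:
                    # absorb the neighbour if it is close to the current region mean
                    if abs(float(image[nr, nc]) - region_sum / region_count) <= threshold:
                        grown[nr, nc] = True
                        region_sum += float(image[nr, nc])
                        region_count += 1
                        frontier.append((nr, nc))
        return grown

    image = np.zeros((64, 64))
    image[16:48, 16:48] = 100.0
    print(seeded_region_growing(image, seed=(32, 32), threshold=10.0).sum(), "pixels grown")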

  6. 26 CFR 1.132-4 - Line of business limitation.

    Science.gov (United States)

    2010-04-01

    ... athletic facilities. (iii) Performance of substantial services in more than one line of business. An... one line of business, such lines of business will be treated as a single line of business where and to... business. For example, assume that on the same premises an employer sells both women's apparel and jewelry...

  7. An efficient parallel stochastic simulation method for analysis of nonviral gene delivery systems

    KAUST Repository

    Kuwahara, Hiroyuki

    2011-01-01

    Gene therapy has great potential to become an effective treatment for a wide variety of diseases. One of the main challenges in making gene therapy practical in clinical settings is the development of efficient and safe mechanisms to deliver foreign DNA molecules into the nucleus of target cells. Several computational and experimental studies have shown that the design process of synthetic gene transfer vectors can be greatly enhanced by computational modeling and simulation. This paper proposes a novel, effective parallelization of the stochastic simulation algorithm (SSA) for pharmacokinetic models that characterize the rate-limiting, multi-step processes of intracellular gene delivery. While efficient parallelizations of the SSA are still an open problem in a general setting, the proposed parallel simulation method is able to substantially accelerate the next reaction selection scheme and the reaction update scheme in the SSA by exploiting and decomposing the structures of stochastic gene delivery models. This makes computationally intensive analyses such as parameter optimization and gene dosage control for specific cell types, gene vectors, and transgene expression stability substantially more practical than they could otherwise be with the standard SSA. Here, we translated the nonviral gene delivery model based on mass-action kinetics by Varga et al. [Molecular Therapy, 4(5), 2001] into a more realistic model that captures intracellular fluctuations based on stochastic chemical kinetics, and as a case study we applied our parallel simulation to this stochastic model. Our results show that our simulation method is able to increase the efficiency of statistical analysis by at least 50% in various settings. © 2011 ACM.
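
    A minimal serial Gillespie SSA for a toy two-step chain A -> B -> C, standing in for rate-limiting delivery steps; it shows the next-reaction selection and state update that the paper's method accelerates in parallel. The model and rate constants are invented, not the Varga et al. pharmacokinetic model.

    import math
    import random

    def ssa(initial_a=100, k1=0.1, k2=0.05, t_end=50.0):
        a, b, c, t = initial_a, 0, 0, 0.0
        while t < t_end:
            rates = [k1 * a, k2 * b]                 # propensities of A->B and B->C
            total = sum(rates)
            if total == 0.0:
                break                                # no reaction can fire any more
            t += -math.log(1.0 - random.random()) / total  # exponential waiting time
            if random.random() * total < rates[0]:         # next-reaction selection
                a, b = a - 1, b + 1
            else:
                b, c = b - 1, c + 1
        return a, b, c

    print(ssa())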

  8. Polarisation in the auroral red line during coordinated EISCAT Svalbard Radar/optical experiments

    Directory of Open Access Journals (Sweden)

    M. Barthélémy

    2011-06-01

    Full Text Available The polarisation of the atomic oxygen red line in the Earth's thermosphere is observed in different configurations with respect to the magnetic field line at high latitude during several coordinated Incoherent Scatter radar/optical experiment campaigns. When pointing northward with a line-of-sight nearly perpendicular to the magnetic field, we show that, as expected, the polarisation is due to precipitated electrons with characteristic energies of a few hundreds of electron Volts. When pointing toward the zenith or southward with a line-of-sight more parallel to the magnetic field, we show that the polarisation practically disappears. This confirms experimentally the predictions deduced from the recent discovery of the red line polarisation. We show that the polarisation direction is parallel to the magnetic field line during geomagnetic activity intensification and that these results are in agreement with theoretical work.

  9. A novel two-level dynamic parallel data scheme for large 3-D SN calculations

    International Nuclear Information System (INIS)

    Sjoden, G.E.; Shedlock, D.; Haghighat, A.; Yi, C.

    2005-01-01

    We introduce a new dynamic parallel memory optimization scheme for executing large scale 3-D discrete ordinates (Sn) simulations on distributed memory parallel computers. In order for parallel transport codes to be truly scalable, they must use parallel data storage, where only the variables that are locally computed are locally stored. Even with parallel data storage for the angular variables, cumulative storage requirements for large discrete ordinates calculations can be prohibitive. To address this problem, Memory Tuning has been implemented into the PENTRAN 3-D parallel discrete ordinates code as an optimized, two-level ('large' array, 'small' array) parallel data storage scheme. Memory Tuning can be described as the process of parallel data memory optimization. Memory Tuning dynamically minimizes the amount of required parallel data in allocated memory on each processor using a statistical sampling algorithm. This algorithm is based on the integral average and standard deviation of the number of fine meshes contained in each coarse mesh in the global problem. Because PENTRAN only stores the locally computed problem phase space, optimal two-level memory assignments can be unique on each node, depending upon the parallel decomposition used (hybrid combinations of angular, energy, or spatial). As demonstrated in the two large discrete ordinates models presented (a storage cask and an OECD MOX Benchmark), Memory Tuning can save a substantial amount of memory per parallel processor, allowing one to accomplish very large scale Sn computations. (authors)
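
    As a loose illustration of the two-level idea (an assumption-laden sketch, not PENTRAN's actual Memory Tuning algorithm): choose a common "large" array length from the mean and standard deviation of the fine-mesh counts per coarse mesh, and keep exact "small" allocations only for the coarse meshes that exceed it.

    import statistics

    def plan_two_level_storage(fine_meshes_per_coarse, k=1.0):
        mean = statistics.mean(fine_meshes_per_coarse)
        stdev = statistics.pstdev(fine_meshes_per_coarse)
        large_size = int(mean + k * stdev)   # shared "large" array length
        # coarse meshes exceeding the shared size keep exact "small" allocations
        small = {i: n for i, n in enumerate(fine_meshes_per_coarse) if n > large_size}
        return large_size, small

    counts = [120, 130, 125, 128, 400, 122, 127, 118]   # toy fine-mesh counts per coarse mesh
    print(plan_two_level_storage(counts))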

  10. Substantial Union or Substantial Distinction of Mind and Body in Descartes' Metaphysics

    Directory of Open Access Journals (Sweden)

    Fahime Jamei

    2009-01-01

    Full Text Available According to Descartes’ metaphysics there are two different kinds of substances in the world of creatures: “thinking substance” and “extended substance” or soul and matter. In Descartes’ philosophy the soul is equal to the mind and considered as a “thinking substance”. This immaterial substance is the essence of the human being. Body, being considered as a “matter”, is an “extended substance” and entirely distinct from the soul. The soul, therefore, exists and may be known prior to body and, not being corporeal, can exist after human death. Hence, Descartes can prove the immortality of human soul in the framework of the principle of substantial distinction. On the other hand, as a physiologist and psychologist, Descartes indeed believes in mind-body union, so that some causal interactions between mind and body show their substantial union. In this essay, the authors show that Descartes faces a serious problem in combining substantial union of mind and body with their substantial distinction; despite his efforts in introducing the idea of the pineal gland, the problem remains unsolved. Therefore it seems that as he cannot dispense with his only reason for proving the immortality of human soul, he has to hold the mind-body distinction theory in his metaphysics. Indeed, Descartes prefers to support the distinction theory rather than union theory in confronting a thesis and an antithesis stating one of two theories

  11. Substantial Union or Substantial Distinction of Mind and Body in Descartes' Metaphysics

    Directory of Open Access Journals (Sweden)

    f Jamei

    2009-06-01

    Full Text Available According to Descartes’ metaphysics there are two different kinds of substances in the world of creatures: “thinking substance” and “extended substance” or soul and matter. In Descartes’ philosophy the soul is equal to the mind and considered as a “thinking substance”. This immaterial substance is the essence of the human being. Body, being considered as a “matter”, is an “extended substance” and entirely distinct from the soul. The soul, therefore, exists and may be known prior to body and, not being corporeal, can exist after human death. Hence, Descartes can prove the immortality of human soul in the framework of the principle of substantial distinction. On the other hand, as a physiologist and psychologist, Descartes indeed believes in mind-body union, so that some causal interactions between mind and body show their substantial union. In this essay, the authors show that Descartes faces a serious problem in combining substantial union of mind and body with their substantial distinction; despite his efforts in introducing the idea of the pineal gland, the problem remains unsolved. Therefore it seems that as he cannot dispense with his only reason for proving the immortality of human soul, he has to hold the mind-body distinction theory in his metaphysics. Indeed, Descartes prefers to support the distinction theory rather than union theory in confronting a thesis and an antithesis stating one of two theories.

  12. A massively-parallel electronic-structure calculations based on real-space density functional theory

    International Nuclear Information System (INIS)

    Iwata, Jun-Ichi; Takahashi, Daisuke; Oshiyama, Atsushi; Boku, Taisuke; Shiraishi, Kenji; Okada, Susumu; Yabana, Kazuhiro

    2010-01-01

    Based on the real-space finite-difference method, we have developed a first-principles density functional program that efficiently performs large-scale calculations on massively-parallel computers. In addition to efficient parallel implementation, we also implemented several computational improvements, substantially reducing the computational costs of O(N^3) operations such as the Gram-Schmidt procedure and subspace diagonalization. Using the program on a massively-parallel computer cluster with a theoretical peak performance of several TFLOPS, we perform electronic-structure calculations for a system consisting of over 10,000 Si atoms, and obtain a self-consistent electronic-structure in a few hundred hours. We analyze in detail the costs of the program in terms of computation and of inter-node communications to clarify the efficiency, the applicability, and the possibility for further improvements.

  13. Modulations of the processing of line discontinuities under selective attention conditions?

    Science.gov (United States)

    Giersch, Anne; Fahle, Manfred

    2002-01-01

    We examined whether the processing of discontinuities involved in figure-ground segmentation, like line ends, can be modulated under selective attention conditions. Subjects decided whether a gap in collinear or parallel lines was located to the right or left. Two stimuli were displayed in immediate succession. When the gaps were on the same side, reaction times (RTs) for the second stimulus increased when collinear lines followed parallel lines, or the reverse, but only when the two stimuli shared the same orientation and location. The effect did not depend on the global form of the stimuli or on the relative orientation of the gaps. A frame drawn around collinear elements affected the results, suggesting a crucial role of the "amodal" orthogonal lines produced when line ends are aligned. Including several gaps in the first stimulus also eliminated RT variations. By contrast, RT variations remained stable across several experimental blocks and were significant for interstimulus intervals from 50 to 600 msec between the two stimuli. These results are interpreted in terms of a modulation of the processing of line ends or the production of amodal lines, arising when attention is selectively drawn to a gap.

  14. Assignment of stock keeping units to parallel unidirectional picking

    Directory of Open Access Journals (Sweden)

    Matthews, Jason

    2015-05-01

    Full Text Available An order picking system consisting of a number of parallel unidirectional picking lines is investigated. Stock keeping units (SKUs) that are grouped by product type into distributions (DBNs) are assigned daily to available picking lines. A mathematical programming formulation and its relaxations are presented. A greedy insertion and a greedy phased insertion are further introduced to obtain feasible results within usable computation times for all test cases. The walking distance of the pickers was shown to decrease by about 22 per cent compared with the current assignment approach. However, product handling and operational risk increase.

  15. Novel pulse amplifying circuits based on transmission lines of different characteristic impedance

    International Nuclear Information System (INIS)

    Belloni, F.; Doria, D.; Lorusso, A.; Nassisi, V.

    2006-01-01

    Two novel circuits used to amplify electric pulses by the coupling of transmission lines of different characteristic impedance are described. The circuits are intended for doubling voltage pulses and for doubling current pulses. The former is composed of an R_0 transmission line closed on a set of two 2R_0 storage lines connected in parallel, while the latter is composed of an R_0 transmission line closed on a set of two R_0/2 storage lines connected in series. The length of each storage line is half the input-pulse length. In both circuits, one storage line is characterized by an open extremity and the other line by a closed extremity. By connecting the storage lines to suitable load resistors, 4R_0 and R_0/4 for the circuits with parallel- and series-connected lines, respectively, twice the output pulse intensity is obtained. Such devices are very suitable for generating high-intensity voltage and/or current peaks, which are of great interest in the field of accelerators. The behaviour of both circuits has been studied theoretically and verified by computer simulations

  16. DOE-EPRI On-Line Monitoring Implementation Guidelines

    International Nuclear Information System (INIS)

    E. Davis, R. Bickford

    2003-01-01

    Industry and EPRI experience at several plants has shown on-line monitoring to be very effective in identifying out-of-calibration instrument channels or indications of equipment-degradation problems. The EPRI implementation project for on-line monitoring has demonstrated the feasibility of on-line monitoring at several participating nuclear plants. The results have been very encouraging, and substantial progress is anticipated in the coming years

  17. 24 CFR 902.79 - Substantial default.

    Science.gov (United States)

    2010-04-01

    ... 24 Housing and Urban Development 4 2010-04-01 2010-04-01 false Substantial default. 902.79 Section... PUBLIC HOUSING ASSESSMENT SYSTEM PHAS Incentives and Remedies § 902.79 Substantial default. (a) Events or conditions that constitute substantial default. The following events or conditions shall constitute...

  18. Reconstruction of multiple line source attenuation maps

    International Nuclear Information System (INIS)

    Celler, A.; Sitek, A.; Harrop, R.

    1996-01-01

    A simple configuration for a transmission source for single photon emission computed tomography (SPECT) was proposed, which utilizes a series of collimated line sources parallel to the axis of rotation of the camera. The detector is equipped with a standard parallel-hole collimator. We have demonstrated that this type of source configuration can be used to generate sufficient data for the reconstruction of the attenuation map when using 8-10 line sources spaced by 3.5-4.5 cm for a 30 x 40 cm detector at a 65 cm distance from the sources. Transmission data for a nonuniform thorax phantom were simulated, then binned and reconstructed using filtered backprojection (FBP) and iterative methods. The optimum maps are obtained with data binned into 2-3 bins and FBP reconstruction. The activity in the source was investigated for uniform and exponential activity distributions, as well as the effect of gaps and overlaps of the neighboring fan beams. A prototype of the line source has been built and the experimental verification of the technique has started

  19. Dynamic Line Rating Oncor Electric Delivery Smart Grid Program

    Energy Technology Data Exchange (ETDEWEB)

    Johnson, Justin; Smith, Cale; Young, Mike; Donohoo, Ken; Owen, Ross; Clark, Eddit; Espejo, Raul; Aivaliotis, Sandy; Stelmak, Ron; Mohr, Ron; Barba, Cristian; Gonzalez, Guillermo; Malkin, Stuart; Dimitrova, Vessela; Ragsdale, Gary; Mitchem, Sean; Jeirath, Nakul; Loomis, Joe; Trevino, Gerardo; Syracuse, Steve; Hurst, Neil; Mereness, Matt; Johnson, Chad; Bivens, Carrie

    2013-05-04

    Electric transmission lines are the lifeline of the electric utility industry, delivering its product from source to consumer. This critical infrastructure is often constrained such that there is inadequate capacity on existing transmission lines to efficiently deliver the power to meet demand in certain areas or to transport energy from high-generation areas to high-consumption regions. When this happens, the cost of the energy rises; more costly sources of power are used to meet the demand or the system operates less reliably. These economic impacts are known as congestion, and they can amount to substantial dollars for any time frame of reference: hour, day or year. There are several solutions to the transmission constraint problem, including: construction of new generation, construction of new transmission facilities, rebuilding and reconductoring of existing transmission assets, and Dynamic Line Rating (DLR). All of these options except DLR are capital intensive, have long lead times and often experience strong public and regulatory opposition. The Smart Grid Demonstration Program (SGDP) project co-funded by the Department of Energy (DOE) and Oncor Electric Delivery Company developed and deployed the most extensive and advanced DLR installation to demonstrate that DLR technology is capable of resolving many transmission capacity constraint problems with a system that is reliable, safe and very cost competitive. The SGDP DLR deployment is the first application of DLR technology to feed transmission line real-time dynamic ratings directly into the system operation’s State Estimator and load dispatch program, which optimizes the matching of generation with load demand on a security, reliability and economic basis. The integrated Dynamic Line Rating (iDLR) collects transmission line parameters at remote locations on the lines, calculates the real-time line rating based on the equivalent conductor temperature, ambient temperature and influence of wind and solar
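
    As a rough illustration of how a line rating follows from a conductor heat balance, the Python sketch below solves a greatly simplified steady-state balance for the allowable current. The film coefficient and all numeric constants are assumptions made for illustration; this is not the project's iDLR model or the full IEEE 738 method.

        import math

        def simple_line_rating_amps(t_cond=75.0, t_amb=40.0, wind_mps=0.6,
                                     r_ac_ohm_per_m=7.3e-5, diameter_m=0.0281,
                                     emissivity=0.5, solar_gain_w_per_m=15.0):
            # Steady state: I^2 * R = convective loss + radiative loss - solar gain.
            surface_per_m = math.pi * diameter_m         # conductor surface area per metre
            h = 4.0 + 4.0 * math.sqrt(wind_mps)          # crude film coefficient, W/(m^2 K)
            q_convective = h * surface_per_m * (t_cond - t_amb)
            sigma = 5.67e-8                              # Stefan-Boltzmann constant
            q_radiative = emissivity * sigma * surface_per_m * (
                (t_cond + 273.15) ** 4 - (t_amb + 273.15) ** 4)
            q_available = q_convective + q_radiative - solar_gain_w_per_m
            return math.sqrt(max(q_available, 0.0) / r_ac_ohm_per_m)

        # Higher wind speed -> more cooling -> higher allowable current.
        print(simple_line_rating_amps(wind_mps=0.6))
        print(simple_line_rating_amps(wind_mps=2.0))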

  20. Visual coherence for large-scale line-plot visualizations

    KAUST Repository

    Muigg, Philipp

    2011-06-01

    Displaying a large number of lines within a limited amount of screen space is a task that is common to many different classes of visualization techniques such as time-series visualizations, parallel coordinates, link-node diagrams, and phase-space diagrams. This paper addresses the challenging problems of cluttering and overdraw inherent to such visualizations. We generate a 2x2 tensor field during line rasterization that encodes the distribution of line orientations through each image pixel. Anisotropic diffusion of a noise texture is then used to generate a dense, coherent visualization of line orientation. In order to represent features of different scales, we employ a multi-resolution representation of the tensor field. The resulting technique can easily be applied to a wide variety of line-based visualizations. We demonstrate this for parallel coordinates, a time-series visualization, and a phase-space diagram. Furthermore, we demonstrate how to integrate a focus+context approach by incorporating a second tensor field. Our approach achieves interactive rendering performance for large data sets containing millions of data items, due to its image-based nature and ease of implementation on GPUs. Simulation results from computational fluid dynamics are used to evaluate the performance and usefulness of the proposed method. © 2011 The Author(s).
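
    A minimal NumPy sketch of the per-pixel orientation-tensor idea described above, assuming line segments given in pixel coordinates; the sampling density and data layout are arbitrary choices for illustration and not the authors' GPU implementation.

        import numpy as np

        def orientation_tensor_field(segments, shape, samples_per_segment=64):
            # Accumulate a 2x2 structure tensor per pixel: each sample along a
            # segment adds the outer product d d^T of the segment's unit direction,
            # so the dominant eigenvector per pixel follows the local line orientation.
            height, width = shape
            tensor = np.zeros((height, width, 2, 2))
            for x0, y0, x1, y1 in segments:
                d = np.array([x1 - x0, y1 - y0], dtype=float)
                length = np.hypot(d[0], d[1])
                if length == 0.0:
                    continue
                outer = np.outer(d / length, d / length)
                for t in np.linspace(0.0, 1.0, samples_per_segment):
                    x = int(round(x0 + t * (x1 - x0)))
                    y = int(round(y0 + t * (y1 - y0)))
                    if 0 <= x < width and 0 <= y < height:
                        tensor[y, x] += outer
            return tensor

        field = orientation_tensor_field([(2, 2, 60, 40), (10, 50, 55, 5)], (64, 64))
        print(field.sum(axis=(0, 1)))   # aggregate orientation statistics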

  2. Increasing the reach of forensic genetics with massively parallel sequencing.

    Science.gov (United States)

    Budowle, Bruce; Schmedes, Sarah E; Wendt, Frank R

    2017-09-01

    The field of forensic genetics has made great strides in the analysis of biological evidence related to criminal and civil matters. Moreover, the discipline has set a standard of performance and quality in the forensic sciences. The advent of massively parallel sequencing will allow the field to expand its capabilities substantially. This review describes the salient features of massively parallel sequencing and how it can impact forensic genetics. The features of this technology offer an increased number and greater variety of genetic markers that can be analyzed, higher sample throughput, and the capability of targeting different organisms, all by one unifying methodology. While there are many applications, three are described where massively parallel sequencing will have immediate impact: molecular autopsy, microbial forensics and differentiation of monozygotic twins. The intent of this review is to expose the forensic science community to the potential enhancements that have arrived or will soon arrive, and to demonstrate the continued expansion of the field of forensic genetics and its service in the investigation of legal matters.

  3. Increasing phylogenetic resolution at low taxonomic levels using massively parallel sequencing of chloroplast genomes

    Science.gov (United States)

    Matthew Parks; Richard Cronn; Aaron Liston

    2009-01-01

    We reconstruct the infrageneric phylogeny of Pinus from 37 nearly complete chloroplast genomes (an average of 109 kilobases each of an approximately 120 kilobase genome) generated using multiplexed massively parallel sequencing. We found that 30/33 ingroup nodes resolved with >95% bootstrap support; this is a substantial improvement relative...

  4. Parallel solutions of the two-group neutron diffusion equations

    International Nuclear Information System (INIS)

    Zee, K.S.; Turinsky, P.J.

    1987-01-01

    Recent efforts to adapt various numerical solution algorithms to parallel computer architectures have addressed the possibility of substantially reducing the running time of few-group neutron diffusion calculations. The authors have developed an efficient iterative parallel algorithm and an associated computer code for the rapid solution of the finite difference representation of the two-group neutron diffusion equations on the CRAY X-MP/48 supercomputer, which has multiple CPUs and vector pipelines. For realistic simulation of light water reactor cores, the code employs a macroscopic depletion model with trace capability for selected fission product transients and critical boron. In addition, moderator and fuel temperature feedback models are also incorporated into the code. The physics models used in the code were benchmarked against qualified codes and proved accurate. This work is an extension of previous work in that various feedback effects are accounted for in the system; the entire code is structured to accommodate extensive vectorization; and additional parallelism by multitasking is achieved not only for the solution of the matrix equations associated with the inner iterations but also for the other segments of the code, e.g., outer iterations
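
    The inner-iteration sweeps that such codes parallelize can be illustrated with a hedged one-dimensional, fixed-source two-group example in NumPy; the mesh, cross sections and plain Jacobi scheme below are assumptions made for illustration, not the authors' CRAY implementation.

        import numpy as np

        # Slab geometry, uniform mesh, zero-flux boundaries; constants are illustrative.
        n, h = 101, 0.5                               # mesh points, cm
        D1, D2 = 1.4, 0.4                             # group diffusion coefficients (cm)
        sig_r1, sig_a2, sig_s12 = 0.03, 0.10, 0.02    # removal, absorption, downscatter (1/cm)
        source = np.zeros(n)
        source[n // 2] = 1.0                          # fixed fast-group source

        phi1 = np.zeros(n)                            # fast flux
        phi2 = np.zeros(n)                            # thermal flux
        for _ in range(5000):                         # Jacobi-style inner iterations
            old1, old2 = phi1.copy(), phi2.copy()
            phi1[1:-1] = (source[1:-1] * h**2 + D1 * (old1[:-2] + old1[2:])) / (2 * D1 + sig_r1 * h**2)
            # Downscatter from the fast group drives the thermal group.
            phi2[1:-1] = (sig_s12 * phi1[1:-1] * h**2 + D2 * (old2[:-2] + old2[2:])) / (2 * D2 + sig_a2 * h**2)

        print(phi1.max(), phi2.max())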

  5. Automatic Thread-Level Parallelization in the Chombo AMR Library

    Energy Technology Data Exchange (ETDEWEB)

    Christen, Matthias; Keen, Noel; Ligocki, Terry; Oliker, Leonid; Shalf, John; Van Straalen, Brian; Williams, Samuel

    2011-05-26

    The increasing on-chip parallelism has substantial implications for HPC applications. Currently, hybrid programming models (typically MPI+OpenMP) are employed for mapping software to the hardware in order to leverage the hardware's architectural features. In this paper, we present an approach that automatically introduces thread-level parallelism into Chombo, a parallel adaptive mesh refinement framework for finite-difference-type PDE solvers. In Chombo, core algorithms are specified in ChomboFortran, a macro language extension to F77 that is part of the Chombo framework. This domain-specific language forms an already-used target language for automatically migrating the large number of existing algorithms to a hybrid MPI+OpenMP implementation. It also provides access to an auto-tuning methodology that enables tuning certain aspects of an algorithm to hardware characteristics. Performance measurements are presented for a few of the most relevant kernels with respect to a specific application benchmark using this technique, as well as benchmark results for the entire application. The kernel benchmarks show that, using auto-tuning, up to a factor of 11 in performance was gained with 4 threads with respect to the serial reference implementation.

  7. Hardware-Efficient On-line Learning through Pipelined Truncated-Error Backpropagation in Binary-State Networks.

    Science.gov (United States)

    Mostafa, Hesham; Pedroni, Bruno; Sheik, Sadique; Cauwenberghs, Gert

    2017-01-01

    Artificial neural networks (ANNs) trained using backpropagation are powerful learning architectures that have achieved state-of-the-art performance in various benchmarks. Significant effort has been devoted to developing custom silicon devices to accelerate inference in ANNs. Accelerating the training phase, however, has attracted relatively little attention. In this paper, we describe a hardware-efficient on-line learning technique for feedforward multi-layer ANNs that is based on pipelined backpropagation. Learning is performed in parallel with inference in the forward pass, removing the need for an explicit backward pass and requiring no extra weight lookup. By using binary state variables in the feedforward network and ternary errors in truncated-error backpropagation, the need for any multiplications in the forward and backward passes is removed, and memory requirements for the pipelining are drastically reduced. Further reduction in addition operations owing to the sparsity in the forward neural and backpropagating error signal paths contributes to highly efficient hardware implementation. For proof-of-concept validation, we demonstrate on-line learning of MNIST handwritten digit classification on a Spartan 6 FPGA interfacing with an external 1Gb DDR2 DRAM, that shows small degradation in test error performance compared to an equivalently sized binary ANN trained off-line using standard back-propagation and exact errors. Our results highlight an attractive synergy between pipelined backpropagation and binary-state networks in substantially reducing computation and memory requirements, making pipelined on-line learning practical in deep networks.
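
    A toy NumPy sketch conveying the flavour of binary activations combined with ternarized backpropagated errors; the layer sizes, thresholds and update rule are assumptions and do not reproduce the pipelined FPGA scheme of the paper.

        import numpy as np

        rng = np.random.default_rng(0)

        def binarize(x):
            return np.where(x >= 0.0, 1.0, -1.0)        # binary (+1/-1) states

        def ternarize(err, threshold=0.5):
            return np.where(err > threshold, 1.0,
                            np.where(err < -threshold, -1.0, 0.0))

        n_in, n_hidden, n_out = 8, 16, 2
        W1 = rng.normal(0.0, 0.5, (n_hidden, n_in))
        W2 = rng.normal(0.0, 0.5, (n_out, n_hidden))
        learning_rate = 0.01

        x = binarize(rng.normal(size=n_in))             # binary input vector
        target = np.array([1.0, -1.0])

        hidden = binarize(W1 @ x)                       # forward pass, binary hidden state
        output = W2 @ hidden
        err_out = ternarize(output - target)            # ternary output error
        err_hidden = ternarize(W2.T @ err_out)          # ternary error propagated backwards

        # Outer products of values in {-1, 0, +1}: the updates need no true multiplies.
        W2 -= learning_rate * np.outer(err_out, hidden)
        W1 -= learning_rate * np.outer(err_hidden, x)
        print(float(np.abs(output - target).mean()))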

  8. The parallel dynamics of drift wave turbulence in the WEGA stellarator

    Energy Technology Data Exchange (ETDEWEB)

    Marsen, S; Endler, M; Otte, M; Wagner, F, E-mail: stefan.marsen@ipp.mpg.d [Max-Planck-Institut fuer Plasmaphysik, EURATOM Association, Wendelsteinstrasse 1, 17491 Greifswald (Germany)

    2009-08-15

    The three-dimensional structure of turbulence in the edge (inside the last closed flux surface) of the WEGA stellarator is studied, focusing on the parallel dynamics. WEGA, as a small stellarator with moderate plasma parameters, offers the opportunity to study turbulence with Langmuir probes providing high spatial and temporal resolution. Multiple probes with radial, poloidal and toroidal resolution are used to measure density fluctuations. Correlation analysis is used to reconstruct a 3D picture of turbulent structures. We find that these structures originate predominantly on the low field side and have a three-dimensional character with a finite averaged parallel wavenumber. The ratio between the parallel and perpendicular wavenumber components is of the order of 10^-2. The parallel dynamics are compared at magnetic inductions of 57 and 500 mT. At 500 mT, the parallel wavelength is of the order of the field line connection length 2πRι-bar. A frequency resolved measure of k_||/k_θ shows a constant ratio in this case. At 57 mT the observed k_|| is much smaller than at 500 mT. However, the observed small average value is due to an averaging over positive and negative components pointing parallel and antiparallel to the magnetic field vector.
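
    Wavenumbers such as k_|| are commonly estimated from the cross-spectral phase between two probes; the NumPy sketch below shows only that generic estimator with made-up signal parameters, not the correlation analysis pipeline used in the paper.

        import numpy as np

        def wavenumber_from_cross_phase(signal_a, signal_b, separation_m, sample_rate_hz):
            # k = (phase by which signal_b lags signal_a at the dominant frequency) / separation
            spec_a = np.fft.rfft(signal_a)
            spec_b = np.fft.rfft(signal_b)
            cross = spec_a * np.conj(spec_b)
            dominant = np.argmax(np.abs(cross[1:])) + 1          # skip the DC bin
            freqs = np.fft.rfftfreq(len(signal_a), d=1.0 / sample_rate_hz)
            return np.angle(cross[dominant]) / separation_m, freqs[dominant]

        # Synthetic test: a 10 kHz wave with k = 0.5 rad/m seen by probes 0.2 m apart.
        fs = 1.0e6
        t = np.arange(4000) / fs
        a = np.sin(2 * np.pi * 1.0e4 * t)
        b = np.sin(2 * np.pi * 1.0e4 * t - 0.5 * 0.2)
        print(wavenumber_from_cross_phase(a, b, 0.2, fs))        # approximately (0.5, 10000.0)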

  9. Parallelization and implementation of approximate root isolation for nonlinear system by Monte Carlo

    Science.gov (United States)

    Khosravi, Ebrahim

    1998-12-01

    This dissertation solves a fundamental problem of isolating the real roots of nonlinear systems of equations by Monte Carlo, based on an algorithm published by Bush Jones. The algorithm requires only function values and can be applied readily to complicated systems of transcendental functions. The implementation of this sequential algorithm provides scientists with the means to utilize function analysis in mathematics or other fields of science. The algorithm, however, is so computationally intensive that the system is limited to a very small set of variables, which makes it unfeasible for large systems of equations. A computational technique was also needed for investigating a methodology for preventing the algorithm from converging to the same root along different paths of computation. The research provides techniques for improving the efficiency and correctness of the algorithm. The sequential algorithm for this technique was corrected and a parallel algorithm is presented. This parallel method has been formally analyzed and is compared with other known methods of root isolation. The effectiveness, efficiency and enhanced overall performance of the parallel processing of the program in comparison to sequential processing are discussed. The message passing model was used for this parallel processing, and it is presented and implemented on the Intel/860 MIMD architecture. The parallel processing proposed in this research has been implemented in an ongoing high energy physics experiment: the algorithm has been used to track neutrinos in a Super-K detector. This experiment is located in Japan, and data can be processed on-line or off-line, locally or remotely.

  10. Parallel algorithm for determining motion vectors in ice floe images by matching edge features

    Science.gov (United States)

    Manohar, M.; Ramapriyan, H. K.; Strong, J. P.

    1988-01-01

    A parallel algorithm is described to determine motion vectors of ice floes using time sequences of images of the Arctic ocean obtained from the Synthetic Aperture Radar (SAR) instrument flown on board the SEASAT spacecraft. The researchers describe a parallel algorithm, implemented on the MPP, for locating corresponding objects based on their translationally and rotationally invariant features. The algorithm first approximates the edges in the images by polygons or sets of connected straight-line segments. Each such edge structure is then reduced to a seed point. Associated with each seed point are the descriptions (lengths, orientations and sequence numbers) of the lines constituting the corresponding edge structure. A parallel matching algorithm is used to match packed arrays of such descriptions to identify corresponding seed points in the two images. The matching algorithm is designed such that fragmentation and merging of ice floes are taken into account by accepting partial matches. The technique has been demonstrated to work on synthetic test patterns and real image pairs from SEASAT in times ranging from 0.5 to 0.7 seconds for 128 x 128 images.

  11. Practical parallel computing

    CERN Document Server

    Morse, H Stephen

    1994-01-01

    Practical Parallel Computing provides information pertinent to the fundamental aspects of high-performance parallel processing. This book discusses the development of parallel applications on a variety of equipment.Organized into three parts encompassing 12 chapters, this book begins with an overview of the technology trends that converge to favor massively parallel hardware over traditional mainframes and vector machines. This text then gives a tutorial introduction to parallel hardware architectures. Other chapters provide worked-out examples of programs using several parallel languages. Thi

  12. Parallel rendering

    Science.gov (United States)

    Crockett, Thomas W.

    1995-01-01

    This article provides a broad introduction to the subject of parallel rendering, encompassing both hardware and software systems. The focus is on the underlying concepts and the issues which arise in the design of parallel rendering algorithms and systems. We examine the different types of parallelism and how they can be applied in rendering applications. Concepts from parallel computing, such as data decomposition, task granularity, scalability, and load balancing, are considered in relation to the rendering problem. We also explore concepts from computer graphics, such as coherence and projection, which have a significant impact on the structure of parallel rendering algorithms. Our survey covers a number of practical considerations as well, including the choice of architectural platform, communication and memory requirements, and the problem of image assembly and display. We illustrate the discussion with numerous examples from the parallel rendering literature, representing most of the principal rendering methods currently used in computer graphics.

  13. Vertical Line Nodes in the Superconducting Gap Structure of Sr_{2}RuO_{4}

    Directory of Open Access Journals (Sweden)

    E. Hassinger

    2017-03-01

    There is strong experimental evidence that the superconductor Sr_{2}RuO_{4} has a chiral p-wave order parameter. This symmetry does not require that the associated gap has nodes, yet specific heat, ultrasound, and thermal conductivity measurements indicate the presence of nodes in the superconducting gap structure of Sr_{2}RuO_{4}. Theoretical scenarios have been proposed to account for the existence of deep minima or accidental nodes (minima tuned to zero or below by material parameters) within a p-wave state. Other scenarios propose chiral d-wave and f-wave states, with horizontal and vertical line nodes, respectively. To elucidate the nodal structure of the gap, it is essential to know whether the lines of nodes (or minima) are vertical (parallel to the tetragonal c axis) or horizontal (perpendicular to the c axis). Here, we report thermal conductivity measurements on single crystals of Sr_{2}RuO_{4} down to 50 mK for currents parallel and perpendicular to the c axis. We find that there is substantial quasiparticle transport in the T=0 limit for both current directions. A magnetic field H immediately excites quasiparticles with velocities both in the basal plane and in the c direction. Our data down to T_{c}/30 and down to H_{c2}/100 show no evidence that the nodes are in fact deep minima. Relative to the normal state, the thermal conductivity of the superconducting state is found to be very similar for the two current directions, from H=0 to H=H_{c2}. These findings show that the gap structure of Sr_{2}RuO_{4} consists of vertical line nodes. This rules out a chiral d-wave state. Given that the c-axis dispersion (warping) of the Fermi surface in Sr_{2}RuO_{4} varies strongly from sheet to sheet, the small a-c anisotropy suggests that the line nodes are present on all three sheets of the Fermi surface. If imposed by symmetry, vertical line nodes would be inconsistent with a p-wave order parameter for Sr_{2}RuO_{4}. To reconcile the gap structure

  14. Parallel computations

    CERN Document Server

    1982-01-01

    Parallel Computations focuses on parallel computation, with emphasis on algorithms used in a variety of numerical and physical applications and for many different types of parallel computers. Topics covered range from vectorization of fast Fourier transforms (FFTs) and of the incomplete Cholesky conjugate gradient (ICCG) algorithm on the Cray-1 to calculation of table lookups and piecewise functions. Single tridiagonal linear systems and vectorized computation of reactive flow are also discussed.Comprised of 13 chapters, this volume begins by classifying parallel computers and describing techn

  15. Parallel sorting algorithms

    CERN Document Server

    Akl, Selim G

    1985-01-01

    Parallel Sorting Algorithms explains how to use parallel algorithms to sort a sequence of items on a variety of parallel computers. The book reviews the sorting problem, the parallel models of computation, parallel algorithms, and the lower bounds on the parallel sorting problems. The text also presents twenty different algorithms for architectures such as linear arrays, mesh-connected computers, and cube-connected computers. Another example where the algorithms can be applied is the shared-memory SIMD (single instruction stream, multiple data stream) computer, in which the whole sequence to be sorted can fit in the
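
    As a concrete example of the kind of algorithm covered by such texts, here is a Python simulation of odd-even transposition sort, the classic sort for a linear array of processors; the serial loop merely emulates rounds that would run in parallel, and the snippet is not taken from the book.

        def odd_even_transposition_sort(values):
            # n rounds of compare-exchange on alternating (even, odd) index pairs.
            # On a linear array of n processors every round runs fully in parallel;
            # here the rounds are simulated one after another.
            data = list(values)
            n = len(data)
            for round_index in range(n):
                start = round_index % 2
                for i in range(start, n - 1, 2):
                    if data[i] > data[i + 1]:
                        data[i], data[i + 1] = data[i + 1], data[i]
            return data

        print(odd_even_transposition_sort([7, 3, 9, 1, 4, 8, 2]))   # [1, 2, 3, 4, 7, 8, 9]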

  16. Computationally efficient implementation of combustion chemistry in parallel PDF calculations

    International Nuclear Information System (INIS)

    Lu Liuyan; Lantz, Steven R.; Ren Zhuyin; Pope, Stephen B.

    2009-01-01

    In parallel calculations of combustion processes with realistic chemistry, the serial in situ adaptive tabulation (ISAT) algorithm [S.B. Pope, Computationally efficient implementation of combustion chemistry using in situ adaptive tabulation, Combustion Theory and Modelling, 1 (1997) 41-63; L. Lu, S.B. Pope, An improved algorithm for in situ adaptive tabulation, Journal of Computational Physics 228 (2009) 361-386] substantially speeds up the chemistry calculations on each processor. To improve the parallel efficiency of large ensembles of such calculations in parallel computations, in this work, the ISAT algorithm is extended to the multi-processor environment, with the aim of minimizing the wall clock time required for the whole ensemble. Parallel ISAT strategies are developed by combining the existing serial ISAT algorithm with different distribution strategies, namely purely local processing (PLP), uniformly random distribution (URAN), and preferential distribution (PREF). The distribution strategies enable the queued load redistribution of chemistry calculations among processors using message passing. They are implemented in the software x2f_mpi, which is a Fortran 95 library for facilitating many parallel evaluations of a general vector function. The relative performance of the parallel ISAT strategies is investigated in different computational regimes via the PDF calculations of multiple partially stirred reactors burning methane/air mixtures. The results show that the performance of ISAT with a fixed distribution strategy strongly depends on certain computational regimes, based on how much memory is available and how much overlap exists between tabulated information on different processors. No one fixed strategy consistently achieves good performance in all the regimes. Therefore, an adaptive distribution strategy, which blends PLP, URAN and PREF, is devised and implemented. It yields consistently good performance in all regimes. In the adaptive parallel

  17. Study on Parallel Processing for Efficient Flexible Multibody Analysis based on Subsystem Synthesis Method

    Energy Technology Data Exchange (ETDEWEB)

    Han, Jong-Boo; Song, Hajun; Kim, Sung-Soo [Chungnam Nat’l Univ., Daejeon (Korea, Republic of)

    2017-06-15

    Flexible multibody simulations are widely used in the industry to design mechanical systems. In flexible multibody dynamics, deformation coordinates are described either relatively in the body reference frame that is floating in the space or in the inertial reference frame. Moreover, these deformation coordinates are generated based on the discretization of the body according to the finite element approach. Therefore, the formulation of the flexible multibody system always deals with a huge number of degrees of freedom and the numerical solution methods require a substantial amount of computational time. Parallel computational methods are a solution for efficient computation. However, most of the parallel computational methods are focused on the efficient solution of large-sized linear equations. For multibody analysis, we need to develop an efficient formulation that could be suitable for parallel computation. In this paper, we developed a subsystem synthesis method for a flexible multibody system and proposed efficient parallel computational schemes based on the OpenMP API in order to achieve efficient computation. Simulations of a rotating blade system, which consists of three identical blades, were carried out with two different parallel computational schemes. Actual CPU times were measured to investigate the efficiency of the proposed parallel schemes.

  18. Cosmic Shear With ACS Pure Parallels

    Science.gov (United States)

    Rhodes, Jason

    2002-07-01

    Small distortions in the shapes of background galaxies by foreground mass provide a powerful method of directly measuring the amount and distribution of dark matter. Several groups have recently detected this weak lensing by large-scale structure, also called cosmic shear. The high resolution and sensitivity of HST/ACS provide a unique opportunity to measure cosmic shear accurately on small scales. Using 260 parallel orbits in the Sloan F775W filter we will measure for the first time: the cosmic shear variance on scales [...] Omega_m^0.5, with signal-to-noise (s/n) ~20, and the mass density Omega_m with s/n=4. These measurements will be made at small angular scales where non-linear effects dominate the power spectrum, providing a test of the gravitational instability paradigm for structure formation. Measurements on these scales are not possible from the ground, because of the systematic effects induced by PSF smearing from seeing. Having many independent lines of sight reduces the uncertainty due to cosmic variance, making parallel observations ideal.

  19. Parallel Directionally Split Solver Based on Reformulation of Pipelined Thomas Algorithm

    Science.gov (United States)

    Povitsky, A.

    1998-01-01

    In this research an efficient parallel algorithm for 3-D directionally split problems is developed. The proposed algorithm is based on a reformulated version of the pipelined Thomas algorithm that starts the backward step computations immediately after the completion of the forward step computations for the first portion of lines. This algorithm has data available for other computational tasks while processors are idle from the Thomas algorithm. The proposed 3-D directionally split solver is based on the static scheduling of processors, whereby local and non-local, data-dependent and data-independent computations are scheduled while processors are idle. A theoretical model of parallelization efficiency is used to define optimal parameters of the algorithm, to show an asymptotic parallelization penalty and to obtain an optimal cover of a global domain with subdomains. It is shown by computational experiments and by the theoretical model that the proposed algorithm reduces the parallelization penalty by about a factor of two relative to the basic algorithm for the range of the number of processors (subdomains) considered and the number of grid nodes per subdomain.
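
    For reference, a plain (non-pipelined) Thomas algorithm is sketched below in Python; the reformulated algorithm of the paper overlaps such forward and backward sweeps across processors, which this minimal serial version does not attempt.

        import numpy as np

        def thomas_solve(sub, diag, sup, rhs):
            # Tridiagonal solve: forward elimination followed by back substitution.
            # sub[0] and sup[-1] are unused; all arrays have length n.
            n = len(rhs)
            c_prime = np.empty(n)
            d_prime = np.empty(n)
            c_prime[0] = sup[0] / diag[0]
            d_prime[0] = rhs[0] / diag[0]
            for i in range(1, n):
                denom = diag[i] - sub[i] * c_prime[i - 1]
                c_prime[i] = sup[i] / denom
                d_prime[i] = (rhs[i] - sub[i] * d_prime[i - 1]) / denom
            x = np.empty(n)
            x[-1] = d_prime[-1]
            for i in range(n - 2, -1, -1):
                x[i] = d_prime[i] - c_prime[i] * x[i + 1]
            return x

        # 1-D Poisson-like system: -x[i-1] + 2 x[i] - x[i+1] = 1
        n = 8
        print(thomas_solve(np.full(n, -1.0), np.full(n, 2.0), np.full(n, -1.0), np.ones(n)))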

  20. Ab initio quantum chemistry in parallel-portable tools and applications

    International Nuclear Information System (INIS)

    Harrison, R.J.; Shepard, R.; Kendall, R.A.

    1991-01-01

    In common with many of the computational sciences, ab initio chemistry faces computational constraints to which a partial solution is offered by the prospect of highly parallel computers. Ab initio codes are large and complex (O(10^5) lines of FORTRAN), representing a significant investment of communal effort. The often conflicting requirements of portability and efficiency have been successfully resolved on vector computers by reliance on matrix-oriented kernels. This proves inadequate even upon closely-coupled shared-memory parallel machines. We examine the algorithms employed during a typical sequence of calculations. Then we investigate how efficient portable parallel implementations may be derived, including the complex multi-reference singles and doubles configuration interaction algorithm. A portable toolkit, modeled after the Intel iPSC and the ANL-ACRF PARMACS, is developed, using shared memory and TCP/IP sockets. The toolkit is used as an initial platform for programs portable between LANs, Crays and true distributed-memory MIMD machines. Timings are presented. 53 refs., 4 tabs

  1. Parallel MR imaging.

    Science.gov (United States)

    Deshmane, Anagha; Gulani, Vikas; Griswold, Mark A; Seiberlich, Nicole

    2012-07-01

    Parallel imaging is a robust method for accelerating the acquisition of magnetic resonance imaging (MRI) data, and has made possible many new applications of MR imaging. Parallel imaging works by acquiring a reduced amount of k-space data with an array of receiver coils. These undersampled data can be acquired more quickly, but the undersampling leads to aliased images. One of several parallel imaging algorithms can then be used to reconstruct artifact-free images from either the aliased images (SENSE-type reconstruction) or from the undersampled data (GRAPPA-type reconstruction). The advantages of parallel imaging in a clinical setting include faster image acquisition, which can be used, for instance, to shorten breath-hold times resulting in fewer motion-corrupted examinations. In this article the basic concepts behind parallel imaging are introduced. The relationship between undersampling and aliasing is discussed and two commonly used parallel imaging methods, SENSE and GRAPPA, are explained in detail. Examples of artifacts arising from parallel imaging are shown and ways to detect and mitigate these artifacts are described. Finally, several current applications of parallel imaging are presented and recent advancements and promising research in parallel imaging are briefly reviewed. Copyright © 2012 Wiley Periodicals, Inc.
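
    To make the SENSE idea concrete, the hedged NumPy sketch below unfolds a twofold (R = 2) undersampled acquisition pixel by pixel from known coil sensitivities; the array shapes and least-squares formulation are generic textbook assumptions, not a clinical implementation.

        import numpy as np

        def sense_unfold_r2(aliased, sensitivities):
            # aliased:        (n_coils, ny // 2, nx) complex aliased coil images
            # sensitivities:  (n_coils, ny, nx) complex coil sensitivity maps
            # Each aliased pixel mixes two true pixels half a field of view apart,
            # so every (y, x) gives a small least-squares problem with two unknowns.
            n_coils, ny_half, nx = aliased.shape
            ny = 2 * ny_half
            image = np.zeros((ny, nx), dtype=complex)
            for y in range(ny_half):
                for x in range(nx):
                    A = np.stack([sensitivities[:, y, x],
                                  sensitivities[:, y + ny_half, x]], axis=1)
                    b = aliased[:, y, x]
                    pair, *_ = np.linalg.lstsq(A, b, rcond=None)
                    image[y, x] = pair[0]
                    image[y + ny_half, x] = pair[1]
            return image

        # Tiny synthetic check: 4 coils, 8x4 image, R = 2 aliasing along y.
        rng = np.random.default_rng(1)
        truth = rng.standard_normal((8, 4)) + 1j * rng.standard_normal((8, 4))
        sens = rng.standard_normal((4, 8, 4)) + 1j * rng.standard_normal((4, 8, 4))
        folded = sens[:, :4, :] * truth[None, :4, :] + sens[:, 4:, :] * truth[None, 4:, :]
        print(np.allclose(sense_unfold_r2(folded, sens), truth))   # True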

  2. High-Performance Psychometrics: The Parallel-E Parallel-M Algorithm for Generalized Latent Variable Models. Research Report. ETS RR-16-34

    Science.gov (United States)

    von Davier, Matthias

    2016-01-01

    This report presents results on a parallel implementation of the expectation-maximization (EM) algorithm for multidimensional latent variable models. The developments presented here are based on code that parallelizes both the E step and the M step of the parallel-E parallel-M algorithm. Examples presented in this report include item response…

  3. Vortex-line fluctuations in model high-temperature superconductors

    International Nuclear Information System (INIS)

    Li, Y.; Teitel, S.

    1993-01-01

    We carry out Monte Carlo simulations of the uniformly frustrated three-dimensional XY model, as a model for vortex-line fluctuations in a high-T_c superconductor in an external magnetic field. A density of vortex lines of f=1/25 is considered. We find two sharp phase transitions. The low-T superconducting phase is an ordered vortex-line lattice. The high-T normal phase is a vortex-line liquid, with much entangling, cutting, and loop excitations. An intermediate phase is found, which is characterized as a vortex-line liquid of disentangled, approximately straight, lines. In this phase, the system displays superconducting properties in the direction parallel to the magnetic field, but normal behavior in planes perpendicular to the field. A detailed analysis of the vortex structure function is carried out

  4. Superconducting coherence in a vortex line liquid

    International Nuclear Information System (INIS)

    Chen, T.; Teitel, S.

    1995-01-01

    We carry out simulations of the anisotropic uniformly frustrated 3d XY model, as a model for vortex line fluctuations in high-T_c superconductors. We compute the phase diagram as a function of temperature and anisotropy, for a fixed applied magnetic field B. We find two distinct phase transitions. Upon heating, there is first a lower T_c⊥ where the vortex line lattice melts and superconducting coherence perpendicular to the applied magnetic field vanishes. At a higher T_cz, within the vortex line liquid, superconducting coherence parallel to the applied magnetic field vanishes. For finite anisotropy, both T_c⊥ and T_cz lie well below the crossover from the vortex line liquid to the normal state

  5. Optimal parallel algorithms for problems modeled by a family of intervals

    Science.gov (United States)

    Olariu, Stephan; Schwing, James L.; Zhang, Jingyuan

    1992-01-01

    A family of intervals on the real line provides a natural model for a vast number of scheduling and VLSI problems. Recently, a number of parallel algorithms to solve a variety of practical problems on such a family of intervals have been proposed in the literature. Computational tools are developed, and it is shown how they can be used for the purpose of devising cost-optimal parallel algorithms for a number of interval-related problems including finding a largest subset of pairwise nonoverlapping intervals, a minimum dominating subset of intervals, along with algorithms to compute the shortest path between a pair of intervals and, based on the shortest path, a parallel algorithm to find the center of the family of intervals. More precisely, with an arbitrary family of n intervals as input, all algorithms run in O(log n) time using O(n) processors in the EREW-PRAM model of computation.
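
    One of the problems listed above, finding a largest subset of pairwise non-overlapping intervals, has a well-known greedy solution; the Python sketch below shows that sequential version only, as an illustration of the problem rather than the paper's O(log n)-time EREW-PRAM algorithm.

        def largest_nonoverlapping_subset(intervals):
            # Sort by right endpoint and keep every interval that starts at or
            # after the right endpoint of the last interval accepted.
            chosen = []
            last_end = float("-inf")
            for left, right in sorted(intervals, key=lambda interval: interval[1]):
                if left >= last_end:
                    chosen.append((left, right))
                    last_end = right
            return chosen

        print(largest_nonoverlapping_subset([(1, 4), (2, 3), (3, 5), (6, 8), (5, 7)]))
        # [(2, 3), (3, 5), (5, 7)]  -- touching endpoints count as non-overlapping here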

  6. Parallel and series FED microstrip array with high efficiency and low cross polarization

    Science.gov (United States)

    Huang, John (Inventor)

    1995-01-01

    A microstrip array antenna producing a vertically polarized fan beam (approximately 2 deg x 50 deg) for C-band SAR applications, with a physical area of 1.7 m by 0.17 m, comprises two rows of patch elements and employs a parallel feed to the left- and right-half sections of the rows. Each section is divided into two segments that are fed in parallel, with the elements in each segment fed in series through matched transmission lines for high efficiency. The inboard section has half the number of patch elements of the outboard section, and the outboard sections, which have a tapered distribution, use identical transmission line sections terminated with half-wavelength-long open-circuit stubs so that the remaining energy is reflected and radiated in phase. The elements of the two inboard segments of the left- and right-half sections are provided with tapered transmission lines from element to element for uniform power distribution over the central third of the entire array antenna. The two rows of array elements are excited at opposite patch feed locations with opposite (180 deg difference) phases for reduced cross-polarization.

  7. Integrated variable projection approach (IVAPA) for parallel magnetic resonance imaging.

    Science.gov (United States)

    Zhang, Qiao; Sheng, Jinhua

    2012-10-01

    Parallel magnetic resonance imaging (pMRI) is a fast method which requires algorithms for reconstructing the image from a small number of measured k-space lines. The accurate estimation of the coil sensitivity functions is still a challenging problem in parallel imaging. The joint estimation of the coil sensitivity functions and the desired image has recently been proposed to improve the situation by iteratively optimizing both the coil sensitivity functions and the image reconstruction. It regards both the coil sensitivities and the desired images as unknowns to be solved for jointly. In this paper, we propose an integrated variable projection approach (IVAPA) for pMRI, which integrates two individual processing steps (coil sensitivity estimation and image reconstruction) into a single processing step to improve the accuracy of the coil sensitivity estimation using the variable projection approach. The method is demonstrated to be able to give an optimal solution with considerably reduced artifacts for high reduction factors and a low number of auto-calibration signal (ACS) lines, and our implementation has a fast convergence rate. The performance of the proposed method is evaluated using a set of in vivo experiment data. Copyright © 2012 Elsevier Ltd. All rights reserved.

  8. A SPECT reconstruction method for extending parallel to non-parallel geometries

    International Nuclear Information System (INIS)

    Wen Junhai; Liang Zhengrong

    2010-01-01

    Due to its simplicity, parallel-beam geometry is usually assumed for the development of image reconstruction algorithms. The established reconstruction methodologies are then extended to fan-beam, cone-beam and other non-parallel geometries for practical application. This situation occurs for quantitative SPECT (single photon emission computed tomography) imaging in inverting the attenuated Radon transform. Novikov reported an explicit parallel-beam formula for the inversion of the attenuated Radon transform in 2000. Thereafter, a formula for fan-beam geometry was reported by Bukhgeim and Kazantsev (2002 Preprint N. 99 Sobolev Institute of Mathematics). At the same time, we presented a formula for varying focal-length fan-beam geometry. Sometimes, the reconstruction formula is so implicit that we cannot obtain the explicit reconstruction formula in the non-parallel geometries. In this work, we propose a unified reconstruction framework for extending parallel-beam geometry to any non-parallel geometry using ray-driven techniques. Studies by computer simulations demonstrated the accuracy of the presented unified reconstruction framework for extending parallel-beam to non-parallel geometries in inverting the attenuated Radon transform.

  9. The language parallel Pascal and other aspects of the massively parallel processor

    Science.gov (United States)

    Reeves, A. P.; Bruner, J. D.

    1982-01-01

    A high level language for the Massively Parallel Processor (MPP) was designed. This language, called Parallel Pascal, is described in detail. A description of the language design, a description of the intermediate language, Parallel P-Code, and details for the MPP implementation are included. Formal descriptions of Parallel Pascal and Parallel P-Code are given. A compiler was developed which converts programs in Parallel Pascal into the intermediate Parallel P-Code language. The code generator to complete the compiler for the MPP is being developed independently. A Parallel Pascal to Pascal translator was also developed. The architecture design for a VLSI version of the MPP was completed with a description of fault tolerant interconnection networks. The memory arrangement aspects of the MPP are discussed and a survey of other high level languages is given.

  10. Parallel Atomistic Simulations

    Energy Technology Data Exchange (ETDEWEB)

    HEFFELFINGER,GRANT S.

    2000-01-18

    Algorithms developed to enable the use of atomistic molecular simulation methods with parallel computers are reviewed. Methods appropriate for bonded as well as non-bonded (and charged) interactions are included. While strategies for obtaining parallel molecular simulations have been developed for the full variety of atomistic simulation methods, molecular dynamics and Monte Carlo have received the most attention. Three main types of parallel molecular dynamics simulations have been developed, the replicated data decomposition, the spatial decomposition, and the force decomposition. For Monte Carlo simulations, parallel algorithms have been developed which can be divided into two categories, those which require a modified Markov chain and those which do not. Parallel algorithms developed for other simulation methods such as Gibbs ensemble Monte Carlo, grand canonical molecular dynamics, and Monte Carlo methods for protein structure determination are also reviewed and issues such as how to measure parallel efficiency, especially in the case of parallel Monte Carlo algorithms with modified Markov chains are discussed.

  11. Dynamic programming in parallel boundary detection with application to ultrasound intima-media segmentation.

    Science.gov (United States)

    Zhou, Yuan; Cheng, Xinyao; Xu, Xiangyang; Song, Enmin

    2013-12-01

    Segmentation of carotid artery intima-media in longitudinal ultrasound images for measuring its thickness to predict cardiovascular diseases can be simplified as detecting two nearly parallel boundaries within a certain distance range, when plaque with irregular shapes is not considered. In this paper, we improve the implementation of two dynamic programming (DP) based approaches to parallel boundary detection, dual dynamic programming (DDP) and piecewise linear dual dynamic programming (PL-DDP). Then, a novel DP based approach, dual line detection (DLD), which translates the original 2-D curve position to a 4-D parameter space representing two line segments in a local image segment, is proposed to solve the problem while maintaining efficiency and rotation invariance. To apply the DLD to ultrasound intima-media segmentation, it is imbedded in a framework that employs an edge map obtained from multiplication of the responses of two edge detectors with different scales and a coupled snake model that simultaneously deforms the two contours for maintaining parallelism. The experimental results on synthetic images and carotid arteries of clinical ultrasound images indicate improved performance of the proposed DLD compared to DDP and PL-DDP, with respect to accuracy and efficiency. Copyright © 2013 Elsevier B.V. All rights reserved.

  12. Leibniz on the parallel postulate and the foundations of geometry the unpublished manuscripts

    CERN Document Server

    De Risi, Vincenzo

    2016-01-01

    This book offers a general introduction to the geometrical studies of Gottfried Wilhelm Leibniz (1646-1716) and his mathematical epistemology. In particular, it focuses on his theory of parallel lines and his attempts to prove the famous Parallel Postulate. Furthermore it explains the role that Leibniz’s work played in the development of non-Euclidean geometry. The first part is an overview of his epistemology of geometry and a few of his geometrical findings, which puts them in the context of the seventeenth-century studies on the foundations of geometry. It also provides a detailed mathematical and philosophical commentary on his writings on the theory of parallels, and discusses how they were received in the eighteenth century as well as their relevance for the non-Euclidean revolution in mathematics. The second part offers a collection of Leibniz’s essays on the theory of parallels and an English translation of them. While a few of these papers have already been published (in Latin) in the standard Le...

  13. P3T+: A Performance Estimator for Distributed and Parallel Programs

    Directory of Open Access Journals (Sweden)

    T. Fahringer

    2000-01-01

    Developing distributed and parallel programs on today's multiprocessor architectures is still a challenging task. Particularly distressing is the lack of effective performance tools that support the programmer in evaluating changes in code, problem and machine sizes, and target architectures. In this paper we introduce P3T+, a performance estimator for mostly regular HPF (High Performance Fortran) programs that also partially covers message passing (MPI) programs. P3T+ is unique in that it models programs, compiler code transformations, and parallel and distributed architectures. It computes at compile time a variety of performance parameters including work distribution, number of transfers, amount of data transferred, transfer times, computation times, and number of cache misses. Several novel technologies are employed to compute these parameters: loop iteration spaces, array access patterns, and data distributions are modeled by employing highly effective symbolic analysis. Communication is estimated by simulating the behavior of a communication library used by the underlying compiler. Computation times are predicted through pre-measured kernels on every target architecture of interest. We carefully model most critical architecture-specific factors such as cache line sizes, number of cache lines available, startup times, message transfer time per byte, etc. P3T+ has been implemented and is closely integrated with the Vienna High Performance Compiler (VFC) to support programmers in developing parallel and distributed applications. Experimental results for realistic kernel codes taken from real-world applications are presented to demonstrate both the accuracy and the usefulness of P3T+.

  14. Real-time trajectory optimization on parallel processors

    Science.gov (United States)

    Psiaki, Mark L.

    1993-01-01

    A parallel algorithm has been developed for rapidly solving trajectory optimization problems. The goal of the work has been to develop an algorithm suitable for real-time, on-line optimal guidance through repeated solution of a trajectory optimization problem. The algorithm has been developed on an INTEL iPSC/860 message passing parallel processor. It uses a zero-order-hold discretization of a continuous-time problem and solves the resulting nonlinear programming problem using a custom-designed augmented Lagrangian nonlinear programming algorithm. The algorithm achieves parallelism of function, derivative, and search direction calculations through the principle of domain decomposition applied along the time axis. It has been encoded and tested on three example problems: the Goddard problem, the acceleration-limited planar minimum-time-to-the-origin problem, and a National Aerospace Plane minimum-fuel ascent guidance problem. Execution times as fast as 118 sec of wall clock time have been achieved for a 128-stage Goddard problem solved on 32 processors. A 32-stage minimum-time problem has been solved in 151 sec on 32 processors. A 32-stage National Aerospace Plane problem required 2 hours when solved on 32 processors. A speed-up factor of 7.2 has been achieved by using 32 nodes instead of 1 node to solve a 64-stage Goddard problem.

  15. Path integral approach for electron transport in disturbed magnetic field lines

    Energy Technology Data Exchange (ETDEWEB)

    Kanno, Ryutaro; Nakajima, Noriyoshi; Takamaru, Hisanori

    2002-05-01

    A path integral method is developed to investigate the statistical properties of electron transport described by a Langevin equation in a statically disturbed magnetic field line structure; in particular, the transition probability of electrons strongly tied to field lines is considered. The path integral method has the advantages that 1) it does not intrinsically accumulate the growing numerical orbit error that is caused by evolving the Langevin equation with finite calculation accuracy in a chaotic field line structure, and 2) it gives a way of understanding the qualitative content of the Langevin equation and helps anticipate the statistical properties of the transport. Monte Carlo calculations of the electron distributions under the combined effects of chaotic field lines and collisions are demonstrated through some examples to illustrate the above advantages. The mathematical techniques are useful for studying the statistical properties of various phenomena described by Langevin equations in general. By using parallel random-number generators, the Monte Carlo scheme for calculating a transition probability is well suited to parallel computation. (author)
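
    A greatly simplified illustration of estimating a transition probability by Monte Carlo: the Python sketch below evolves an ensemble of one-dimensional Langevin trajectories and histograms the final positions. The drift-free diffusion model and all parameters are assumptions, not the magnetic-field-line model of the paper.

        import numpy as np

        def transition_probability_histogram(n_particles=20000, n_steps=200,
                                             dt=1.0e-3, diffusion=1.0, seed=1):
            # x_{k+1} = x_k + sqrt(2 D dt) * xi,  xi ~ N(0, 1)  (Euler-Maruyama step)
            rng = np.random.default_rng(seed)
            x = np.zeros(n_particles)
            for _ in range(n_steps):
                x += np.sqrt(2.0 * diffusion * dt) * rng.standard_normal(n_particles)
            density, edges = np.histogram(x, bins=80, density=True)
            return density, edges

        density, edges = transition_probability_histogram()
        # For pure diffusion the histogram should approach a Gaussian of variance 2*D*t.
        print(density.max(), edges[0], edges[-1])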

  16. Parallel integer sorting with medium and fine-scale parallelism

    Science.gov (United States)

    Dagum, Leonardo

    1993-01-01

    Two new parallel integer sorting algorithms, queue-sort and barrel-sort, are presented and analyzed in detail. These algorithms do not have optimal parallel complexity, yet they show very good performance in practice. Queue-sort is designed for fine-scale parallel architectures which allow the queueing of multiple messages to the same destination. Barrel-sort is designed for medium-scale parallel architectures with a high message passing overhead. The performance results from the implementation of queue-sort on a Connection Machine CM-2 and barrel-sort on a 128-processor iPSC/860 are given. The two implementations are found to be comparable in performance but not as good as a fully vectorized bucket sort on the Cray YMP.

  17. 20 CFR 404.1675 - Finding of substantial failure.

    Science.gov (United States)

    2010-04-01

    ... 20 Employees' Benefits 2 2010-04-01 2010-04-01 false Finding of substantial failure. 404.1675... DISABILITY INSURANCE (1950- ) Determinations of Disability Substantial Failure § 404.1675 Finding of substantial failure. A finding of substantial failure with respect to a State may not be made unless and until...

  18. About Parallel Programming: Paradigms, Parallel Execution and Collaborative Systems

    Directory of Open Access Journals (Sweden)

    Loredana MOCEAN

    2009-01-01

    In recent years, efforts have been made to delineate a stable and unified framework in which the problems of logical parallel processing can find solutions, at least at the level of imperative languages. The results obtained so far are not at the level of the effort invested. This paper aims to be a small contribution to these efforts. We propose an overview of parallel programming, parallel execution and collaborative systems.

  19. Parallel computing works!

    CERN Document Server

    Fox, Geoffrey C; Messina, Guiseppe C

    2014-01-01

    A clear illustration of how parallel computers can be successfully applied to large-scale scientific computations. This book demonstrates how a variety of applications in physics, biology, mathematics and other sciences were implemented on real parallel computers to produce new scientific results. It investigates issues of fine-grained parallelism relevant for future supercomputers with particular emphasis on hypercube architecture. The authors describe how they used an experimental approach to configure different massively parallel machines, design and implement basic system software, and develop

  20. 20 CFR 416.1075 - Finding of substantial failure.

    Science.gov (United States)

    2010-04-01

    ... 20 Employees' Benefits 2 2010-04-01 2010-04-01 false Finding of substantial failure. 416.1075... AGED, BLIND, AND DISABLED Determinations of Disability Substantial Failure § 416.1075 Finding of substantial failure. A finding of substantial failure with respect to a State may not be made unless and until...

  1. The effect of shadow lines on a low concentrating photovoltaic system

    NARCIS (Netherlands)

    Janssen, H.J.J.; Sonneveld, P.J.; Swinkels, G.L.A.M.; Tuijl, van B.A.J.; Zwart, de H.F.

    2011-01-01

    In order to reduce the energy losses caused by shadow lines, three options are investigated. These are: 1. the use of two types of diodes; 2. the use of an "ideal" diode based on an active bypass using MOSFETs [4]; and 3. parallel switching of a number of cells between two shadow lines. The first

  2. ExoCross: Spectra from molecular line lists

    Science.gov (United States)

    Yurchenko, Sergei N.; Al-Refaie, Ahmed; Tennyson, Jonathan

    2018-03-01

    ExoCross generates spectra and thermodynamic properties from molecular line lists in ExoMol, HITRAN, or several other formats. The code is parallelized and also shows a high degree of vectorization; it works with line profiles such as Doppler, Lorentzian and Voigt and supports several broadening schemes. ExoCross is also capable of working with the recently proposed method of super-lines. It supports calculations of lifetimes, cooling functions, specific heats and other properties. ExoCross converts between different formats, such as HITRAN, ExoMol and Phoenix, and simulates non-LTE spectra using a simple two-temperature approach. Different electronic, vibronic or vibrational bands can be simulated separately using an efficient filtering scheme based on the quantum numbers.

  3. Steady-state and time-dependent modelling of parallel transport in the scrape-off layer

    DEFF Research Database (Denmark)

    Havlickova, E.; Fundamenski, W.; Naulin, Volker

    2011-01-01

    The one-dimensional fluid code SOLF1D has been used for modelling plasma transport in the scrape-off layer (SOL) along magnetic field lines, both in steady state and under transient conditions that arise due to plasma turbulence. The presented work summarizes results of SOLF1D, with attention given to transient parallel transport, which reveals two distinct time scales due to the transport mechanisms of convection and diffusion. Time-dependent modelling combined with the effect of ballooning shows propagation of particles along the magnetic field line with Mach number up to M ≈ 1 ... temperature calculated in SOLF1D is compared with the approximative model used in the turbulence code ESEL, both for the steady-state and the turbulent SOL. Dynamics of the parallel transport are investigated for a simple transient event simulating the propagation of particles and energy to the targets from a blob

  4. Parallelization Issues and Particle-In-Cell Codes.

    Science.gov (United States)

    Elster, Anne Cathrine

    1994-01-01

    "Everything should be made as simple as possible, but not simpler." Albert Einstein. The field of parallel scientific computing has concentrated on parallelization of individual modules such as matrix solvers and factorizers. However, many applications involve several interacting modules. Our analyses of a particle-in-cell code modeling charged particles in an electric field, show that these accompanying dependencies affect data partitioning and lead to new parallelization strategies concerning processor, memory and cache utilization. Our test-bed, a KSR1, is a distributed memory machine with a globally shared addressing space. However, most of the new methods presented hold generally for hierarchical and/or distributed memory systems. We introduce a novel approach that uses dual pointers on the local particle arrays to keep the particle locations automatically partially sorted. Complexity and performance analyses with accompanying KSR benchmarks, have been included for both this scheme and for the traditional replicated grids approach. The latter approach maintains load-balance with respect to particles. However, our results demonstrate it fails to scale properly for problems with large grids (say, greater than 128-by-128) running on as few as 15 KSR nodes, since the extra storage and computation time associated with adding the grid copies, becomes significant. Our grid partitioning scheme, although harder to implement, does not need to replicate the whole grid. Consequently, it scales well for large problems on highly parallel systems. It may, however, require load balancing schemes for non-uniform particle distributions. Our dual pointer approach may facilitate this through dynamically partitioned grids. We also introduce hierarchical data structures that store neighboring grid-points within the same cache -line by reordering the grid indexing. This alignment produces a 25% savings in cache-hits for a 4-by-4 cache. A consideration of the input data's effect on

  5. Dynamic and Control Analysis of Modular Multi-Parallel Rectifiers (MMR)

    DEFF Research Database (Denmark)

    Zare, Firuz; Ghosh, Arindam; Davari, Pooya

    2017-01-01

    This paper presents dynamic analysis of a Modular Multi-Parallel Rectifier (MMR) based on state-space modelling and analysis. The proposed topology is suitable for high power application which can reduce line current harmonics emissions significantly. However, a proper controller is required...... to share and control current through each rectifier. Mathematical analysis and preliminary simulations have been carried out to verify the proposed controller under different operating conditions....

  6. Optimal task mapping in safety-critical real-time parallel systems; Placement optimal de taches pour les systemes paralleles temps-reel critiques

    Energy Technology Data Exchange (ETDEWEB)

    Aussagues, Ch

    1998-12-11

    This PhD thesis deals with the correct design of safety-critical real-time parallel systems. Such systems constitute a fundamental part of high-performance command and control systems that can be found in the nuclear domain or, more generally, in parallel embedded systems. The verification of their temporal correctness is the core of this thesis. Our contribution lies mainly in the following three points: the analysis and extension of a programming model for such real-time parallel systems; the proposal of an original method based on a new operator, the synchronized product of state-machine task graphs; and the validation of the approach by its implementation and evaluation. The work particularly addresses the main problem of optimal task mapping onto a parallel architecture such that the temporal constraints are globally guaranteed, i.e. the timeliness property is valid. The results also incorporate optimality criteria for the sizing and correct dimensioning of a parallel system, for instance the number of processing elements; these criteria are connected with operational constraints of the application domain. Our approach is based on the off-line analysis of the feasibility of the deadline-driven dynamic scheduling used to schedule tasks within one processor. From the synchronized product, a system of linear constraints is automatically generated, which allows the maximum load of a group of tasks to be calculated and their timeliness constraints to be verified. The communications, their timeliness verification and their incorporation into the mapping problem are the second main contribution of this thesis. Finally, the global solving technique dealing with both task and communication aspects has been implemented and evaluated in the framework of the OASIS project at the LETI research center at CEA/Saclay. (author) 96 refs.
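
    The thesis builds its schedulability analysis from a synchronized product and automatically generated linear constraints, which the record does not detail. As a minimal reminder of the underlying deadline-driven feasibility idea, the sketch below applies the classical utilisation test for preemptive EDF scheduling of periodic tasks (deadlines equal to periods) on a single processor; the task set is invented.

        def edf_feasible(tasks):
            """Exact EDF test for implicit-deadline periodic tasks on one processor:
            the task set is schedulable iff total utilisation does not exceed 1."""
            utilisation = sum(wcet / period for wcet, period in tasks)
            return utilisation <= 1.0, utilisation

        # Hypothetical task set: (worst-case execution time, period), same time unit.
        tasks = [(2.0, 10.0), (3.0, 15.0), (5.0, 20.0)]
        ok, u = edf_feasible(tasks)
        print(f"utilisation = {u:.2f}, schedulable under EDF: {ok}")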

  7. Veel kord regilaulu parallelismist, poeetilisest sünonüümiast ja analoogiast/ Once more on the parallelism of runosong, on the poetical synonymy and analogy

    Directory of Open Access Journals (Sweden)

    Mari Sarv

    2016-01-01

    Relying on her own previous research on runosongs and proverbs demonstrating the mutual dependency of alliteration and parallelism typical to runosong (Sarv 1999, 2000, 2003), the results of syntactic analysis of runosong texts in H. Metslang’s dissertation (1978), Juhan Peegel’s definition of poetical synonyms in runosong (Peegel 2004), and Ewald Lang’s concept of quasisynonymy (Lang 1987), the author proposes the definition of the canonical parallelism of runosong as follows: it is a grammatical verse parallelism where all or some of the syntactic elements of the main verse have corresponding parallels in the successive lines representing the same general notion, and interpreted in the context of the parallelism as semantically equivalent, irrespective of their semantic relations in the colloquial language (equivalence, synonymy, metonymy, metaphor, analogy, antonymy, hyponymy etc.). Because of this semantical equivalence, the parallel words can be selected and combined into the parallel verses according to their formal features enabling the metrical alignment and alliteration. The article also points to the problems with the classification of runosong parallelism into the analogous and synonymous by Wolfgang Steinitz (1934), widely used in the runosong discourse: although analogy and synonymy probably represent the most remarkable semantic relations between the parallel lines, it is not easy to make a clear distinction between synonymous and analogous lines (or concepts)—even in the colloquial non-poetic language the synonyms are usually not equivalent in all aspects of meaning; the regular use of poetical synonyms in runosongs makes it impossible at all—the geese, ducks, and grouses as different birds are analogous in the colloquial language, but synonymous in the runosong, all denoting the group of maidens.

  8. Controlled Compact High Voltage Power Lines

    Directory of Open Access Journals (Sweden)

    Postolati V.

    2016-04-01

    Modern overhead transmission line (OHL) constructions that differ significantly from conventional ones are being used in power grids more and more widely. Implementation of compact overhead lines equipped with FACTS devices, including phase angle regulator settings (compact controlled OHL), appears to be one of the most effective ways of power grid development. Compact controlled AC HV OHL represent a new generation of power transmission lines embodying recent advanced achievements in design solutions, including towers and insulation, together with interconnection schemes and control systems. Results of comprehensive research and development in relation to 110–500 kV compact controlled power transmission lines, together with their theoretical basis, substantiation, and methodological approaches to their practical application, are presented in this paper.

  9. Endpoint-based parallel data processing in a parallel active messaging interface of a parallel computer

    Science.gov (United States)

    Archer, Charles J.; Blocksome, Michael A.; Ratterman, Joseph D.; Smith, Brian E.

    2014-08-12

    Endpoint-based parallel data processing in a parallel active messaging interface (`PAMI`) of a parallel computer, the PAMI composed of data communications endpoints, each endpoint including a specification of data communications parameters for a thread of execution on a compute node, including specifications of a client, a context, and a task, the compute nodes coupled for data communications through the PAMI, including establishing a data communications geometry, the geometry specifying, for tasks representing processes of execution of the parallel application, a set of endpoints that are used in collective operations of the PAMI including a plurality of endpoints for one of the tasks; receiving in endpoints of the geometry an instruction for a collective operation; and executing the instruction for a collective operation through the endpoints in dependence upon the geometry, including dividing data communications operations among the plurality of endpoints for one of the tasks.

  10. Direct and iterative algorithms for the parallel solution of the one-dimensional macroscopic Navier-Stokes equations

    International Nuclear Information System (INIS)

    Doster, J.M.; Sills, E.D.

    1986-01-01

    Current efforts are under way to develop and evaluate numerical algorithms for the parallel solution of the large sparse matrix equations associated with the finite difference representation of the macroscopic Navier-Stokes equations. Previous work has shown that these equations can be cast into smaller coupled matrix equations suitable for solution utilizing multiple computer processors operating in parallel. The individual processors themselves may exhibit parallelism through the use of vector pipelines. This work has concentrated on the one-dimensional drift flux form of the Navier-Stokes equations. Direct and iterative algorithms that may be suitable for implementation on parallel computer architectures are evaluated in terms of accuracy and overall execution speed. This work has application to engineering and training simulations, on-line process control systems, and engineering workstations where increased computational speeds are required.
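
    The record does not give the authors' algorithms, so the sketch below only illustrates why iterative schemes of this kind map well onto parallel hardware: in a Jacobi sweep every unknown of the (here tridiagonal) system is updated independently from the previous iterate, so the updates can be distributed across processors or vector pipelines. The test system is a placeholder, not the drift-flux equations.

        import numpy as np

        def jacobi_tridiagonal(lower, diag, upper, rhs, iterations=200):
            """Jacobi iteration for a tridiagonal system; all components of x_new can
            be computed concurrently because they depend only on the old iterate."""
            n = len(diag)
            x = np.zeros(n)
            for _ in range(iterations):
                x_new = np.empty(n)
                x_new[0] = (rhs[0] - upper[0] * x[1]) / diag[0]
                x_new[1:-1] = (rhs[1:-1] - lower[:-1] * x[:-2] - upper[1:] * x[2:]) / diag[1:-1]
                x_new[-1] = (rhs[-1] - lower[-1] * x[-2]) / diag[-1]
                x = x_new
            return x

        n = 8                                    # small diagonally dominant test system
        lower = upper = -np.ones(n - 1)
        diag = 4.0 * np.ones(n)
        rhs = np.ones(n)
        print(jacobi_tridiagonal(lower, diag, upper, rhs))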

  11. Two-state ion heating at quasi-parallel shocks

    International Nuclear Information System (INIS)

    Thomsen, M.F.; Gosling, J.T.; Bame, S.J.; Onsager, T.G.; Russell, C.T.

    1990-01-01

    In a previous study of ion heating at quasi-parallel shocks, the authors showed a case in which the ion distributions downstream from the shock alternated between a cooler, denser, core/shoulder type and a hotter, less dense, more Maxwellian type. In this paper they further document the alternating occurrence of two different ion states downstream from several quasi-parallel shocks. Three separate lines of evidence are presented to show that the two states are not related in an evolutionary sense, but rather both are produced alternately at the shock: (1) the asymptotic downstream plasma parameters (density, ion temperature, and flow speed) are intermediate between those characterizing the two different states closer to the shock, suggesting that the asymptotic state is produced by a mixing of the two initial states; (2) examples of apparently interpenetrating (i.e., mixing) distributions can be found during transitions from one state to the other; and (3) examples of both types of distributions can be found at actual crossings of the shock ramp. The alternation between the two different types of ion distribution provides direct observational support for the idea that the dissipative dynamics of at least some quasi-parallel shocks is non-stationary and cyclic in nature, as demonstrated by recent numerical simulations. Typical cycle times between intervals of similar ion heating states are ∼2 upstream ion gyroperiods. Both the simulations and the in situ observations indicate that a process of coherent ion reflection is commonly an important part of the dissipation at quasi-parallel shocks

  12. Performance evaluation of parallel electric field tunnel field-effect transistor by a distributed-element circuit model

    Science.gov (United States)

    Morita, Yukinori; Mori, Takahiro; Migita, Shinji; Mizubayashi, Wataru; Tanabe, Akihito; Fukuda, Koichi; Matsukawa, Takashi; Endo, Kazuhiko; O'uchi, Shin-ichi; Liu, Yongxun; Masahara, Meishoku; Ota, Hiroyuki

    2014-12-01

    The performance of parallel electric field tunnel field-effect transistors (TFETs), in which band-to-band tunneling (BTBT) was initiated in-line to the gate electric field was evaluated. The TFET was fabricated by inserting an epitaxially-grown parallel-plate tunnel capacitor between heavily doped source wells and gate insulators. Analysis using a distributed-element circuit model indicated there should be a limit of the drain current caused by the self-voltage-drop effect in the ultrathin channel layer.

  13. A parallel algorithm for filtering gravitational waves from coalescing binaries

    International Nuclear Information System (INIS)

    Sathyaprakash, B.S.; Dhurandhar, S.V.

    1992-10-01

    Coalescing binary stars are perhaps the most promising sources for the observation of gravitational waves with laser interferometric gravity wave detectors. The waveform from these sources can be predicted with sufficient accuracy for matched filtering techniques to be applied. In this paper we present a parallel algorithm for detecting signals from coalescing compact binaries by the method of matched filtering. We also report the details of its implementation on a 256-node connection machine consisting of a network of transputers. The results of our analysis indicate that parallel processing is a promising approach to on-line analysis of data from gravitational wave detectors to filter out coalescing binary signals. The algorithm described is quite general in that the kernel of the algorithm is applicable to any set of matched filters. (author). 15 refs, 4 figs
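
    The sketch below shows the core operation being parallelized: correlating the data stream against one template of a filter bank via the FFT. Each template is processed independently, which is what makes the problem attractive for parallel machines. White noise is assumed for simplicity (a real search would first whiten the data with the detector noise spectrum), and all signal parameters are invented.

        import numpy as np

        def matched_filter(data, template):
            """Circular cross-correlation of data with a template, normalised by the
            template's norm (a crude SNR estimate under unit-variance white noise)."""
            corr = np.fft.irfft(np.fft.rfft(data) * np.conj(np.fft.rfft(template)), n=len(data))
            return corr / np.sqrt(np.sum(template ** 2))

        rng = np.random.default_rng(1)
        n = 4096
        t = np.arange(n)
        bank = [np.sin(2 * np.pi * f * t / n) * np.exp(-t / 2000.0) for f in (30.0, 40.0, 50.0)]
        data = 0.5 * np.roll(bank[1], 1500) + rng.normal(size=n)   # hide the middle template

        # Each filter of the bank is independent, so this loop parallelizes trivially.
        print(["%.1f" % matched_filter(data, h).max() for h in bank])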

  14. Parallel Processing of Images in Mobile Devices using BOINC

    Science.gov (United States)

    Curiel, Mariela; Calle, David F.; Santamaría, Alfredo S.; Suarez, David F.; Flórez, Leonardo

    2018-04-01

    Medical image processing helps health professionals make decisions for the diagnosis and treatment of patients. Since some algorithms for processing images require substantial amounts of resources, one could take advantage of distributed or parallel computing. A mobile grid can be an adequate computing infrastructure for this problem. A mobile grid is a grid that includes mobile devices as resource providers. In a previous step of this research, we selected BOINC as the infrastructure to build our mobile grid. However, parallel processing of images in mobile devices poses at least two important challenges: the execution of standard libraries for processing images and obtaining adequate performance when compared to desktop computers grids. By the time we started our research, the use of BOINC in mobile devices also involved two issues: a) the execution of programs in mobile devices required to modify the code to insert calls to the BOINC API, and b) the division of the image among the mobile devices as well as its merging required additional code in some BOINC components. This article presents answers to these four challenges.

  15. Parallel Processing of Images in Mobile Devices using BOINC

    Directory of Open Access Journals (Sweden)

    Curiel Mariela

    2018-04-01

    Medical image processing helps health professionals make decisions for the diagnosis and treatment of patients. Since some algorithms for processing images require substantial amounts of resources, one could take advantage of distributed or parallel computing. A mobile grid can be an adequate computing infrastructure for this problem. A mobile grid is a grid that includes mobile devices as resource providers. In a previous step of this research, we selected BOINC as the infrastructure to build our mobile grid. However, parallel processing of images in mobile devices poses at least two important challenges: the execution of standard libraries for processing images and obtaining adequate performance when compared to desktop computers grids. By the time we started our research, the use of BOINC in mobile devices also involved two issues: (a) the execution of programs in mobile devices required to modify the code to insert calls to the BOINC API, and (b) the division of the image among the mobile devices as well as its merging required additional code in some BOINC components. This article presents answers to these four challenges.
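
    As a minimal illustration of the image-division challenge mentioned in both records above, the sketch below splits an image into horizontal strips (one work unit per device), applies a placeholder per-strip operation, and merges the results back in order. The tile count and the operation are assumptions, not the authors' BOINC work-unit scheme.

        import numpy as np

        def split_into_strips(image, n_workers):
            """Divide an image into horizontal strips, one per worker device."""
            return np.array_split(image, n_workers, axis=0)

        def merge_strips(strips):
            """Reassemble processed strips in their original order."""
            return np.concatenate(strips, axis=0)

        def process(strip):
            """Placeholder work unit: simple intensity inversion."""
            return 255 - strip

        image = (np.arange(100 * 80).reshape(100, 80) % 256).astype(np.uint8)
        result = merge_strips([process(s) for s in split_into_strips(image, n_workers=4)])
        print(result.shape == image.shape)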

  16. Parallelizing AT with MatlabMPI

    International Nuclear Information System (INIS)

    2011-01-01

    The Accelerator Toolbox (AT) is a high-level collection of tools and scripts specifically oriented toward solving problems dealing with computational accelerator physics. It is integrated into the MATLAB environment, which provides an accessible, intuitive interface for accelerator physicists, allowing researchers to focus the majority of their efforts on simulations and calculations, rather than programming and debugging difficulties. Efforts toward parallelization of AT have been put in place to upgrade its performance to modern standards of computing. We utilized the packages MatlabMPI and pMatlab, which were developed by MIT Lincoln Laboratory, to set up a message-passing environment that could be called within MATLAB, which provided the necessary prerequisites for multithread processing capabilities. On local quad-core CPUs, we were able to demonstrate processor efficiencies of roughly 95% and speed increases of nearly 380%. By exploiting the efficacy of modern-day parallel computing, we were able to demonstrate incredibly efficient speed increments per processor in AT's beam-tracking functions. Extrapolating from prediction, we can expect to reduce week-long computation runtimes to less than 15 minutes. This is a huge performance improvement and has enormous implications for the future computing power of the accelerator physics group at SSRL. However, one of the downfalls of parringpass is its current lack of transparency; the pMatlab and MatlabMPI packages must first be well-understood by the user before the system can be configured to run the scripts. In addition, the instantiation of argument parameters requires internal modification of the source code. Thus, parringpass cannot be directly run from the MATLAB command line, which detracts from its flexibility and user-friendliness. Future work in AT's parallelization will focus on development of external functions and scripts that can be called from within MATLAB and configured on multiple nodes, while

  17. Experiences in the parallelization of the discrete ordinates method using OpenMP and MPI

    Energy Technology Data Exchange (ETDEWEB)

    Pautz, A. [TUV Hannover/Sachsen-Anhalt e.V. (Germany); Langenbuch, S. [Gesellschaft fur Anlagen- und Reaktorsicherheit (GRS) mbH (Germany)

    2003-07-01

    The method of Discrete Ordinates is in principle parallelizable to a high degree, since the transport 'mesh sweeps' are mutually independent for all angular directions. However, in the well-known production code Dort such a type of angular domain decomposition has to be done on a spatial line-by-line basis, causing the parallelism in the code to be very fine-grained. The construction of scalar fluxes and moments requires a large effort for inter-thread or inter-process communication. We have implemented two different parallelization approaches in Dort: firstly, we have used a shared-memory model suitable for SMP (Symmetric Multiprocessor) machines based on the standard OpenMP. The second approach uses the well-known Message Passing Interface (MPI) to establish communication between parallel processes running in a distributed-memory environment. We investigate the benefits and drawbacks of both models and show first results on performance and scaling behaviour of the parallel Dort code. (authors)

  18. Experiences in the parallelization of the discrete ordinates method using OpenMP and MPI

    International Nuclear Information System (INIS)

    Pautz, A.; Langenbuch, S.

    2003-01-01

    The method of Discrete Ordinates is in principle parallelizable to a high degree, since the transport 'mesh sweeps' are mutually independent for all angular directions. However, in the well-known production code Dort such a type of angular domain decomposition has to be done on a spatial line-by-line basis, causing the parallelism in the code to be very fine-grained. The construction of scalar fluxes and moments requires a large effort for inter-thread or inter-process communication. We have implemented two different parallelization approaches in Dort: firstly, we have used a shared-memory model suitable for SMP (Symmetric Multiprocessor) machines based on the standard OpenMP. The second approach uses the well-known Message Passing Interface (MPI) to establish communication between parallel processes running in a distributed-memory environment. We investigate the benefits and drawbacks of both models and show first results on performance and scaling behaviour of the parallel Dort code. (authors)

  19. Parallel algorithms for finding cliques in a graph

    International Nuclear Information System (INIS)

    Szabo, S

    2011-01-01

    A clique is a subgraph in a graph that is complete in the sense that each two of its nodes are connected by an edge. Finding cliques in a given graph is an important procedure in discrete mathematical modeling. The paper will show how concepts such as splitting partitions, quasi coloring, node and edge dominance are related to clique search problems. In particular we will discuss the connection with parallel clique search algorithms. These concepts also suggest practical guidelines to inspect a given graph before starting a large scale search.
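
    The paper's splitting partitions and quasi-coloring bounds are not described in the record; the sketch below only shows the baseline recursive enumeration (Bron-Kerbosch) that such techniques prune, and whose independent branches can be explored in parallel. The example graph is invented.

        def bron_kerbosch(r, p, x, adj, cliques):
            """Enumerate the maximal cliques of a graph given as a dict of adjacency sets."""
            if not p and not x:
                cliques.append(sorted(r))
                return
            for v in list(p):
                bron_kerbosch(r | {v}, p & adj[v], x & adj[v], adj, cliques)
                p.remove(v)
                x.add(v)

        edges = [(1, 2), (1, 3), (2, 3), (2, 4), (3, 4)]   # two triangles sharing edge (2, 3)
        adj = {v: set() for e in edges for v in e}
        for a, b in edges:
            adj[a].add(b)
            adj[b].add(a)

        cliques = []
        bron_kerbosch(set(), set(adj), set(), adj, cliques)
        print(sorted(cliques))   # [[1, 2, 3], [2, 3, 4]]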

  20. Geometrical reasoning in the primary school, the case of parallel lines

    OpenAIRE

    Sinclair, Nathalie; Jones, Keith

    2009-01-01

    During the primary school years, children are typically expected to develop ways of explaining their mathematical reasoning. This paper reports on ideas developed during an analysis of data from a project which involved young children (aged 5-7 years old) in a whole-class situation using dynamic geometry software (specifically Sketchpad). The focus is a classroom episode in which the children try to decide whether two lines that they know continue (but cannot see all of the continuation) will...

  1. The application of image processing in the measurement for three-light-axis parallelity of laser ranger

    Science.gov (United States)

    Wang, Yang; Wang, Qianqian

    2008-12-01

    When a laser ranger is transported or used in field operations, the transmitting axis, receiving axis and aiming axis may become non-parallel. This non-parallelism of the three light axes will degrade the range-measuring ability or prevent the laser ranger from being operated exactly, so testing and adjusting the three-light-axis parallelity during production and maintenance is important to ensure that the laser ranger can be used reliably. The paper proposes a new measurement method using digital image processing, based on a comparison of some common measurement methods for three-light-axis parallelity. It uses a large-aperture off-axis paraboloid reflector to obtain images of the laser spot and the white-light cross line, and then processes the images on the LabVIEW platform. The center of the white-light cross line is obtained by the matching arithmetic in the LabVIEW DLL, and the center of the laser spot is obtained by gray-level transformation, binarization and area filtering in turn. The software system can set the CCD, detect the off-axis paraboloid reflector, measure the parallelity of the transmitting axis and aiming axis, and control the attenuation device. The hardware system selects the SAA7111A, a programmable video decoding chip, to perform A/D conversion. A FIFO (first-in first-out) is selected as the buffer, and a USB bus is used to transmit data to the PC. The three-light-axis parallelity can be obtained from the position bias between the two centers. A device based on this method is already in use; its application proves that the method offers high precision, speed and automation.
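
    The spot-centre step described above can be illustrated with a few lines of array code: threshold the image, then take the intensity-weighted centroid of the remaining pixels. The synthetic image, threshold and spot position below are placeholders; the published system additionally locates the white-light cross-line centre and converts the pixel offset into an angular misalignment.

        import numpy as np

        def spot_centroid(image, threshold):
            """Binarise the image and return the intensity-weighted centroid (row, col)."""
            mask = image >= threshold
            if not mask.any():
                raise ValueError("no pixels above threshold")
            rows, cols = np.nonzero(mask)
            weights = image[rows, cols].astype(float)
            return np.average(rows, weights=weights), np.average(cols, weights=weights)

        rng = np.random.default_rng(2)
        y, x = np.mgrid[0:100, 0:120]
        spot = 200.0 * np.exp(-((y - 40) ** 2 + (x - 60) ** 2) / 50.0)   # spot near (40, 60)
        image = spot + rng.normal(10.0, 2.0, spot.shape)                 # noisy background
        print(spot_centroid(image, threshold=100.0))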

  2. Parallel phase model : a programming model for high-end parallel machines with manycores.

    Energy Technology Data Exchange (ETDEWEB)

    Wu, Junfeng (Syracuse University, Syracuse, NY); Wen, Zhaofang; Heroux, Michael Allen; Brightwell, Ronald Brian

    2009-04-01

    This paper presents a parallel programming model, Parallel Phase Model (PPM), for next-generation high-end parallel machines based on a distributed memory architecture consisting of a networked cluster of nodes with a large number of cores on each node. PPM has a unified high-level programming abstraction that facilitates the design and implementation of parallel algorithms to exploit both the parallelism of the many cores and the parallelism at the cluster level. The programming abstraction will be suitable for expressing both fine-grained and coarse-grained parallelism. It includes a few high-level parallel programming language constructs that can be added as an extension to an existing (sequential or parallel) programming language such as C; and the implementation of PPM also includes a light-weight runtime library that runs on top of an existing network communication software layer (e.g. MPI). Design philosophy of PPM and details of the programming abstraction are also presented. Several unstructured applications that inherently require high-volume random fine-grained data accesses have been implemented in PPM with very promising results.

  3. Systematic approach for deriving feasible mappings of parallel algorithms to parallel computing platforms

    NARCIS (Netherlands)

    Arkin, Ethem; Tekinerdogan, Bedir; Imre, Kayhan M.

    2017-01-01

    The need for high-performance computing together with the increasing trend from single processor to parallel computer architectures has leveraged the adoption of parallel computing. To benefit from parallel computing power, usually parallel algorithms are defined that can be mapped and executed

  4. Parallel algorithms

    CERN Document Server

    Casanova, Henri; Robert, Yves

    2008-01-01

    ""…The authors of the present book, who have extensive credentials in both research and instruction in the area of parallelism, present a sound, principled treatment of parallel algorithms. … This book is very well written and extremely well designed from an instructional point of view. … The authors have created an instructive and fascinating text. The book will serve researchers as well as instructors who need a solid, readable text for a course on parallelism in computing. Indeed, for anyone who wants an understandable text from which to acquire a current, rigorous, and broad vi

  5. On-line, continuous monitoring in solar cell and fuel cell manufacturing using spectral reflectance imaging

    Energy Technology Data Exchange (ETDEWEB)

    Sopori, Bhushan; Rupnowski, Przemyslaw; Ulsh, Michael

    2016-01-12

    A monitoring system 100 comprising a material transport system 104 providing for the transportation of a substantially planar material 102, 107 through the monitoring zone 103 of the monitoring system 100. The system 100 also includes a line camera 106 positioned to obtain multiple line images across a width of the material 102, 107 as it is transported through the monitoring zone 103. The system 100 further includes an illumination source 108 providing for the illumination of the material 102, 107 transported through the monitoring zone 103 such that light reflected in a direction normal to the substantially planar surface of the material 102, 107 is detected by the line camera 106. A data processing system 110 is also provided in digital communication with the line camera 106. The data processing system 110 is configured to receive data output from the line camera 106 and further configured to calculate and provide substantially contemporaneous information relating to a quality parameter of the material 102, 107. Also disclosed are methods of monitoring a quality parameter of a material.

  6. Design of a family of integrated parallel co-processors for images processing

    International Nuclear Information System (INIS)

    Court, Thierry

    1991-01-01

    The design of parallel image processing systems joining, in the same architecture, sophisticated microprocessors and specialised operators is a difficult task because of the various problems to be taken into account. The current study identifies a way of realizing such dedicated operators and interfacing them to a central unit of microprocessor type. The two guidelines of this work are the search for polyvalent, specialized and reconfigurable operators, and their connection to a system bus rather than to specialized video buses. This research work proposes an architecture of circuits dedicated to image processing and two realization proposals for them, one of which was realized in this study using silicon compiler tools. This work belongs to a larger project whose aim is the development of a high-performance, modular industrial image processing system based on the parallelization, in MIMD structures, of an elementary, autonomous image processing unit integrating a microprocessor equipped with a parallel coprocessor suited to image processing. (author) [fr

  7. The Protein Maker: an automated system for high-throughput parallel purification

    International Nuclear Information System (INIS)

    Smith, Eric R.; Begley, Darren W.; Anderson, Vanessa; Raymond, Amy C.; Haffner, Taryn E.; Robinson, John I.; Edwards, Thomas E.; Duncan, Natalie; Gerdts, Cory J.; Mixon, Mark B.; Nollert, Peter; Staker, Bart L.; Stewart, Lance J.

    2011-01-01

    The Protein Maker instrument addresses a critical bottleneck in structural genomics by allowing automated purification and buffer testing of multiple protein targets in parallel with a single instrument. Here, the use of this instrument to (i) purify multiple influenza-virus proteins in parallel for crystallization trials and (ii) identify optimal lysis-buffer conditions prior to large-scale protein purification is described. The Protein Maker is an automated purification system developed by Emerald BioSystems for high-throughput parallel purification of proteins and antibodies. This instrument allows multiple load, wash and elution buffers to be used in parallel along independent lines for up to 24 individual samples. To demonstrate its utility, its use in the purification of five recombinant PB2 C-terminal domains from various subtypes of the influenza A virus is described. Three of these constructs crystallized and one diffracted X-rays to sufficient resolution for structure determination and deposition in the Protein Data Bank. Methods for screening lysis buffers for a cytochrome P450 from a pathogenic fungus prior to upscaling expression and purification are also described. The Protein Maker has become a valuable asset within the Seattle Structural Genomics Center for Infectious Disease (SSGCID) and hence is a potentially valuable tool for a variety of high-throughput protein-purification applications

  8. A directly heated electron beam line source

    International Nuclear Information System (INIS)

    Iqbal, M.; Masood, K.; Rafiq, M.; Chaudhry, M.A.

    2002-05-01

    An electron beam line source with a cathode length of 140 mm and a high degree of beam focusing has been constructed. The design principles and basic characteristics of an electron beam line source consisting of a parallel-plate electrode geometric array, with a beam power of 35 kW, are worked out. The dimensions of the beam at the work site are 1.25 x 100 mm. The gun is designed basically for the study of the evaporation and deposition characteristics of refractory metals for laboratory use; however, it may equally be used for melting and casting of these metals. (author)

  9. 20 CFR 604.6 - Conformity and substantial compliance.

    Science.gov (United States)

    2010-04-01

    ... 20 Employees' Benefits 3 2010-04-01 2010-04-01 false Conformity and substantial compliance. 604.6... FOR ELIGIBILITY FOR UNEMPLOYMENT COMPENSATION § 604.6 Conformity and substantial compliance. (a) In... for the administration of its UC program. (b) Resolving Issues of Conformity and Substantial...

  10. Workshop on Radio Recombination Lines

    CERN Document Server

    1980-01-01

    Since their first detection 15 years ago, radio recombination lines from several elements have been observed in a wide variety of objects including HII regions, planetary nebulae, molecular clouds, the diffuse interstellar medium, and recently, other galaxies. The observations span almost the entire range from 0.1 to 100 GHz, and employ both single-dish and aperture synthesis techniques. The theory of radio recombination lines has also advanced strongly, to the point where it is perhaps one of the best-understood in astrophysics. In a parallel development, it has become possible over the last decade to study these same highly-excited atoms in the laboratory; this work provides further confirmation of the theoretical framework. However there has been continuing controversy over the astrophysical interpretation of radio recombination line observations, especially regarding the role of stimulated emission. A workshop was held in Ottawa on 24-25 August, 1979, bringing together many of the active scientist...

  11. Interactive animation of fault-tolerant parallel algorithms

    Energy Technology Data Exchange (ETDEWEB)

    Apgar, S.W.

    1992-02-01

    Animation of algorithms makes understanding them intuitively easier. This paper describes the software tool Raft (Robust Animator of Fault Tolerant Algorithms). The Raft system allows the user to animate a number of parallel algorithms which achieve fault tolerant execution. In particular, we use it to illustrate the key Write-All problem. It has an extensive user-interface which allows a choice of the number of processors, the number of elements in the Write-All array, and the adversary to control the processor failures. The novelty of the system is that the interface allows the user to create new on-line adversaries as the algorithm executes.

  12. 19 CFR 10.7 - Substantial containers or holders.

    Science.gov (United States)

    2010-04-01

    ... 19 Customs Duties 1 2010-04-01 2010-04-01 false Substantial containers or holders. 10.7 Section 10... Exported and Returned § 10.7 Substantial containers or holders. (a) Substantial containers or holders... domestic products exported and returned. When such containers or holders are imported not containing or...

  13. Parallel algorithms for mapping pipelined and parallel computations

    Science.gov (United States)

    Nicol, David M.

    1988-01-01

    Many computational problems in image processing, signal processing, and scientific computing are naturally structured for either pipelined or parallel computation. When mapping such problems onto a parallel architecture it is often necessary to aggregate an obvious problem decomposition. Even in this context the general mapping problem is known to be computationally intractable, but recent advances have been made in identifying classes of problems and architectures for which optimal solutions can be found in polynomial time. Among these, the mapping of pipelined or parallel computations onto linear array, shared memory, and host-satellite systems figures prominently. This paper extends that work first by showing how to improve existing serial mapping algorithms. These improvements have significantly lower time and space complexities: in one case a published O(nm^3) time algorithm for mapping m modules onto n processors is reduced to an O(nm log m) time complexity, and its space requirements reduced from O(nm^2) to O(m). Run time complexity is further reduced with parallel mapping algorithms based on these improvements, which run on the architecture for which they create the mappings.

  14. Nitrobenzene anti-parallel dimer formation in non-polar solvents

    Directory of Open Access Journals (Sweden)

    Toshiyuki Shikata

    2014-06-01

    We investigated the dielectric and depolarized Rayleigh scattering behaviors of nitrobenzene (NO2-Bz), a benzene mono-substituted with a planar molecular frame bearing a large electric dipole moment of 4.0 D, in solutions in non-polar solvents such as tetrachloromethane and benzene, at up to 3 THz for the dielectric measurements and 8 THz for the scattering experiments, at 20 °C. The dielectric relaxation strength of the system was substantially smaller than proportionality to the concentration in a concentrated regime and showed a Kirkwood correlation factor markedly lower than unity, gK ∼ 0.65. This observation revealed that NO2-Bz has a tendency to form dimers, (NO2-Bz)2, in configurations anti-parallel with respect to the dipole moment, with increasing concentration in the two solvents. Both the dielectric and scattering data exhibited fast and slow Debye-type relaxation modes with characteristic time constants of ∼7 and ∼50 ps in a concentrated regime (∼15 and ∼30 ps in a dilute regime), respectively. The fast mode was simply attributed to the rotational motion of (monomeric) NO2-Bz. However, the magnitude of the slow mode was proportional to the square of the concentration in the dilute regime; thus, the mode was assigned to the anti-parallel dimer, (NO2-Bz)2, dissociation process, and the slow relaxation time was attributed to the anti-parallel dimer lifetime. The concentration dependencies of both the dielectric and scattering data show that the NO2-Bz molecular processes are controlled through a chemical equilibrium between monomers and anti-parallel dimers, 2NO2-Bz ↔ (NO2-Bz)2, due to a strong dipole-dipole interaction between nitro groups.

  15. Parallel computing works

    Energy Technology Data Exchange (ETDEWEB)

    1991-10-23

    An account of the Caltech Concurrent Computation Program (C^3P), a five year project that focused on answering the question: Can parallel computers be used to do large-scale scientific computations? As the title indicates, the question is answered in the affirmative, by implementing numerous scientific applications on real parallel computers and doing computations that produced new scientific results. In the process of doing so, C^3P helped design and build several new computers, designed and implemented basic system software, developed algorithms for frequently used mathematical computations on massively parallel machines, devised performance models and measured the performance of many computers, and created a high performance computing facility based exclusively on parallel computers. While the initial focus of C^3P was the hypercube architecture developed by C. Seitz, many of the methods developed and lessons learned have been applied successfully on other massively parallel architectures.

  16. A comparison of energetic ions in the plasma depletion layer and the quasi-parallel magnetosheath

    Science.gov (United States)

    Fuselier, Stephen A.

    1994-01-01

    Energetic ion spectra measured by the Active Magnetospheric Particle Tracer Explorers/Charge Composition Explorer (AMPTE/CCE) downstream from the Earth's quasi-parallel bow shock (in the quasi-parallel magnetosheath) and in the plasma depletion layer are compared. In the latter region, energetic ions are from a single source, leakage of magnetospheric ions across the magnetopause and into the plasma depletion layer. In the former region, both the magnetospheric source and shock acceleration of the thermal solar wind population at the quasi-parallel shock can contribute to the energetic ion spectra. The relative strengths of these two energetic ion sources are determined through the comparison of spectra from the two regions. It is found that magnetospheric leakage can provide an upper limit of 35% of the total energetic H(+) population in the quasi-parallel magnetosheath near the magnetopause in the energy range from approximately 10 to approximately 80 keV/e and substantially less than this limit for the energetic He(2+) population. The rest of the energetic H(+) population and nearly all of the energetic He(2+) population are accelerated out of the thermal solar wind population through shock acceleration processes. By comparing the energetic and thermal He(2+) and H(+) populations in the quasi-parallel magnetosheath, it is found that the quasi-parallel bow shock is 2 to 3 times more efficient at accelerating He(2+) than H(+). This result is consistent with previous estimates from shock acceleration theory and simulations.

  17. Continuous path control of a 5-DOF parallel-serial hybrid robot

    International Nuclear Information System (INIS)

    Uchiyama, Takuma; Terada, Hidetsugu; Mitsuya, Hironori

    2010-01-01

    To realize de-burring using the 5-degree-of-freedom parallel-serial hybrid robot, new forward and inverse kinematic calculation methods based on the 'off-line teaching' method are proposed. This hybrid robot consists of a parallel stage section and a serial stage section; considering this point, each section is calculated individually, and a continuous path control algorithm for this hybrid robot is proposed. To verify its usefulness, a prototype robot controlled with the proposed methods is tested. This verification includes a positioning test, which evaluates the continuous path of the tool center point, and a pose test, which evaluates the pose at the tool center point. As a result, it is confirmed that this hybrid robot moves correctly using the proposed methods.

  18. Parallel magnetic resonance imaging as approximation in a reproducing kernel Hilbert space

    International Nuclear Information System (INIS)

    Athalye, Vivek; Lustig, Michael; Uecker, Martin

    2015-01-01

    In magnetic resonance imaging data samples are collected in the spatial frequency domain (k-space), typically by time-consuming line-by-line scanning on a Cartesian grid. Scans can be accelerated by simultaneous acquisition of data using multiple receivers (parallel imaging), and by using more efficient non-Cartesian sampling schemes. To understand and design k-space sampling patterns, a theoretical framework is needed to analyze how well arbitrary sampling patterns reconstruct unsampled k-space using receive coil information. As shown here, reconstruction from samples at arbitrary locations can be understood as approximation of vector-valued functions from the acquired samples and formulated using a reproducing kernel Hilbert space with a matrix-valued kernel defined by the spatial sensitivities of the receive coils. This establishes a formal connection between approximation theory and parallel imaging. Theoretical tools from approximation theory can then be used to understand reconstruction in k-space and to extend the analysis of the effects of samples selection beyond the traditional image-domain g-factor noise analysis to both noise amplification and approximation errors in k-space. This is demonstrated with numerical examples. (paper)
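
    The paper's kernel is matrix-valued and built from the measured coil sensitivities, which the record does not provide. The sketch below therefore only illustrates the generic reproducing-kernel interpolation it refers to, in one dimension with a scalar Gaussian kernel: solve (K + lambda*I) a = y on the sampled points and evaluate f(x) = sum_i a_i k(x, x_i) at unsampled locations. The kernel width, regularisation and sample data are placeholders.

        import numpy as np

        def rkhs_interpolate(sample_x, sample_y, query_x, width=0.2, reg=1e-8):
            """Minimum-norm interpolation in the RKHS of a Gaussian kernel."""
            def kernel(a, b):
                return np.exp(-np.subtract.outer(a, b) ** 2 / (2.0 * width ** 2))
            gram = kernel(sample_x, sample_x)
            coeffs = np.linalg.solve(gram + reg * np.eye(len(sample_x)), sample_y)
            return kernel(query_x, sample_x) @ coeffs

        sample_x = np.array([0.0, 0.3, 0.5, 0.9, 1.4, 2.0])   # irregular sample locations
        sample_y = np.sin(np.pi * sample_x)                   # "acquired" values
        query_x = np.linspace(0.0, 2.0, 9)
        print(np.round(rkhs_interpolate(sample_x, sample_y, query_x), 3))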

  19. Cosmic Shear With ACS Pure Parallels. Targeted Portion.

    Science.gov (United States)

    Rhodes, Jason

    2002-07-01

    Small distortions in the shapes of background galaxies by foreground mass provide a powerful method of directly measuring the amount and distribution of dark matter. Several groups have recently detected this weak lensing by large-scale structure, also called cosmic shear. The high resolution and sensitivity of HST/ACS provide a unique opportunity to measure cosmic shear accurately on small scales. Using 260 parallel orbits in Sloan i {F775W} we will measure for the first time: the cosmic shear variance on scales Omega_m^0.5, with signal-to-noise {s/n} 20, and the mass density Omega_m with s/n=4. They will be done at small angular scales where non-linear effects dominate the power spectrum, providing a test of the gravitational instability paradigm for structure formation. Measurements on these scales are not possible from the ground, because of the systematic effects induced by PSF smearing from seeing. Having many independent lines of sight reduces the uncertainty due to cosmic variance, making parallel observations ideal.

  20. Computing NLTE Opacities -- Node Level Parallel Calculation

    Energy Technology Data Exchange (ETDEWEB)

    Holladay, Daniel [Los Alamos National Lab. (LANL), Los Alamos, NM (United States)

    2015-09-11

    Presentation. The goal: to produce a robust library capable of computing reasonably accurate opacities inline with the assumption of LTE relaxed (non-LTE). Near term: demonstrate acceleration of non-LTE opacity computation. Far term (if funded): connect to application codes with in-line capability and compute opacities. Study science problems. Use efficient algorithms that expose many levels of parallelism and utilize good memory access patterns for use on advanced architectures. Portability to multiple types of hardware including multicore processors, manycore processors such as KNL, GPUs, etc. Easily coupled to radiation hydrodynamics and thermal radiative transfer codes.

  1. On field line resonances of hydromagnetic Alfven waves in dipole magnetic field

    International Nuclear Information System (INIS)

    Chen, Liu; Cowley, S.C.

    1989-07-01

    Using the dipole magnetic field model, we have developed the theory of field line resonances of hydromagnetic Alfven waves in general magnetic field geometries. In this model, the Alfven speed thus varies both perpendicular and parallel to the magnetic field. Specifically, it is found that field line resonances do persist in the dipole model. The corresponding singular solutions near the resonant field lines as well as the natural definition of standing shear Alfven eigenfunctions have also been systematically derived. 11 refs

  2. An optimal algorithm for preemptive on-line scheduling

    NARCIS (Netherlands)

    Chen, B.; Vliet, van A.; Woeginger, G.J.

    1995-01-01

    We investigate the problem of on-line scheduling of jobs on m identical parallel machines where preemption is allowed. The goal is to minimize the makespan. We derive an approximation algorithm with worst-case guarantee m^m/(m^m - (m-1)^m) for every m >= 2, which increasingly tends to e/(e-1) ≈ 1.58 as m → ∞.
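
    The reconstructed bound can be checked numerically: the sketch below evaluates m^m / (m^m - (m-1)^m) for increasing m and compares it with the limit e/(e-1).

        import math

        def worst_case_guarantee(m):
            """Competitive ratio m^m / (m^m - (m-1)^m) of the preemptive on-line algorithm."""
            return m ** m / (m ** m - (m - 1) ** m)

        for m in (2, 3, 5, 10, 100, 1000):
            print(m, round(worst_case_guarantee(m), 6))
        print("limit e/(e-1) =", round(math.e / (math.e - 1.0), 6))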

  3. The End of the Lines for OX 169: No Binary Broad-Line Region

    Science.gov (United States)

    Halpern, J. P.; Eracleous, M.

    2000-03-01

    We show that unusual Balmer emission-line profiles of the quasar OX 169, frequently described as either self-absorbed or double peaked, are actually neither. The effect is an illusion resulting from two coincidences. First, the forbidden lines are quite strong and broad. Consequently, the [N II] λ6583 line and the associated narrow-line component of Hα present the appearance of twin Hα peaks. Second, the redshift of 0.2110 brings Hβ into coincidence with Na I D at zero redshift, and ISM absorption in Na I D divides the Hβ emission line. In spectra obtained over the past decade, we see no substantial change in the character of the line profiles and no indication of intrinsic double-peaked structure. The Hγ, Mg II, and Lyα emission lines are single peaked, and all of the emission-line redshifts are consistent once they are correctly attributed to their permitted and forbidden-line identifications. A systematic shift of up to 700 km s-1 between broad and narrow lines is seen, but such differences are common and could be due to gravitational and transverse redshift in a low-inclination disk. Stockton & Farnham had called attention to an apparent tidal tail in the host galaxy of OX 169 and speculated that a recent merger had supplied the nucleus with a coalescing pair of black holes that was now revealing its existence in the form of two physically distinct broad-line regions. Although there is no longer any evidence for two broad emission-line regions in OX 169, binary black holes should form frequently in galaxy mergers, and it is still worthwhile to monitor the radial velocities of emission lines that could supply evidence of their existence in certain objects.

  4. Template based parallel checkpointing in a massively parallel computer system

    Science.gov (United States)

    Archer, Charles Jens [Rochester, MN; Inglett, Todd Alan [Rochester, MN

    2009-01-13

    A method and apparatus for a template based parallel checkpoint save for a massively parallel super computer system using a parallel variation of the rsync protocol, and network broadcast. In preferred embodiments, the checkpoint data for each node is compared to a template checkpoint file that resides in the storage and that was previously produced. Embodiments herein greatly decrease the amount of data that must be transmitted and stored for faster checkpointing and increased efficiency of the computer system. Embodiments are directed to a parallel computer system with nodes arranged in a cluster with a high speed interconnect that can perform broadcast communication. The checkpoint contains a set of actual small data blocks with their corresponding checksums from all nodes in the system. The data blocks may be compressed using conventional non-lossy data compression algorithms to further reduce the overall checkpoint size.
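
    The claim about comparing checkpoint data against a stored template can be illustrated with an rsync-like block comparison: checksum fixed-size blocks of both images and ship only the blocks whose checksums differ. The block size, hash choice and toy data below are assumptions, not details of the patented protocol.

        import hashlib

        BLOCK = 4096   # bytes per block (placeholder size)

        def block_digests(data):
            """Checksum of each fixed-size block of a checkpoint image."""
            return [hashlib.sha1(data[i:i + BLOCK]).hexdigest() for i in range(0, len(data), BLOCK)]

        def changed_blocks(template, checkpoint):
            """Indices of checkpoint blocks that differ from the template and therefore
            need to be transmitted; identical blocks are only referenced."""
            t, c = block_digests(template), block_digests(checkpoint)
            extra = list(range(len(t), len(c)))                 # blocks beyond the template
            return [i for i, (a, b) in enumerate(zip(c, t)) if a != b] + extra

        template = bytes(64 * 1024)                             # previously stored template
        checkpoint = bytearray(template)
        checkpoint[10000:10010] = b"0123456789"                 # a node changed a little state
        print(changed_blocks(template, bytes(checkpoint)))      # -> [2]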

  5. The Application of Paired Parallel Filters for Ultra-Wideband Signal Processing

    Directory of Open Access Journals (Sweden)

    S. L. Chernyshev

    2015-01-01

    The paper considers a unit in which parallel filters on regular transmission lines are attached in pairs. This connection reduces the side-line impedance at the point of connection; at the same time these lines become narrow, and the possibility of exciting higher modes in the joint is reduced. The scattering matrix of a connection of four identical lines is considered first, and from it the scattering matrix of a connection in which two side lines are terminated with filters is found. Particular cases of the reflection coefficients of different filters are considered. It is shown that only in the case of identical filters does a linear relationship remain between the reflection coefficient of the input filters and the transmission coefficient of the unit, which facilitates the solution of the synthesis problem. Restrictions on the transfer coefficient are found. On transition to the time domain, the impulse response of the connection under consideration and the expression for synthesis are defined. The paper considers an example of the implementation of matched filtering in this connection; in this case, the output signal is the half-sum of the input signal and its autocorrelation function.
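
    The stated time-domain result is easy to reproduce for a toy pulse: the unit's output is modelled as the half-sum of the input signal and its autocorrelation function. How the autocorrelation is normalised is not given in the record, so the zero-lag normalisation below is an assumption, as is the placeholder ultra-wideband pulse.

        import numpy as np

        def unit_output(signal):
            """Half-sum of the input signal and its autocorrelation (lags 0 .. n-1)."""
            n = len(signal)
            acf = np.correlate(signal, signal, mode="full")[n - 1:]
            acf = acf / acf[0]                    # assumed normalisation to unity at zero lag
            return 0.5 * (signal + acf)

        t = np.arange(64)
        pulse = np.cos(0.6 * t) * np.exp(-t / 12.0)   # placeholder short decaying oscillation
        print(np.round(unit_output(pulse)[:8], 3))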

  6. Experience with highly-parallel software for the storage system of the ATLAS Experiment at CERN

    CERN Document Server

    Colombo, T; The ATLAS collaboration

    2012-01-01

    The ATLAS experiment is observing proton-proton collisions delivered by the LHC accelerator. The ATLAS Trigger and Data Acquisition (TDAQ) system selects interesting events on-line in a three-level trigger system in order to store them at a budgeted rate of several hundred Hz. This paper focuses on the TDAQ data-logging system and in particular on the implementation and performance of a novel parallel software design. In this respect, the main challenge presented by the data-logging workload is the conflict between the largely parallel nature of the event processing, especially the recently introduced event compression, and the constraint of sequential file writing and checksum evaluation. This is further complicated by the necessity of operating in a fully data-driven mode, to cope with continuously evolving trigger and detector configurations. In this paper we report on the design of the new ATLAS on-line storage software. In particular we will discuss our development experience using recent concurrency-ori...

  7. Modeling stretched solitary waves along magnetic field lines

    Directory of Open Access Journals (Sweden)

    L. Muschietti

    2002-01-01

    A model is presented for a new type of fast solitary waves which is observed in downward current regions of the auroral zone. The three-dimensional, coherent structures are electrostatic, have a positive potential, and move along the magnetic field lines with speeds on the order of the electron drift. Their parallel potential profile is flattened and cannot fit to the Gaussian shape used in previous work. We develop a detailed BGK model which includes a flattened potential and an assumed cylindrical symmetry around a centric magnetic field line. The model envisions concentric shells of trapped electrons slowly drifting azimuthally while bouncing back and forth in the parallel direction. The electron dynamics is analysed in terms of three basic motions that occur on different time scales characterized by the cyclotron frequency Ωe, the bounce frequency ωb, and the azimuthal drift frequency ωg. The ordering Ωe >> ωb >> ωg is required. Self-consistent distribution functions are calculated in terms of approximate constants of motion. Constraints on the parameters characterizing the amplitude and shape of the stretched solitary wave are discussed.

  8. AUTOMATIC RAILWAY POWER LINE EXTRACTION USING MOBILE LASER SCANNING DATA

    Directory of Open Access Journals (Sweden)

    S. Zhang

    2016-06-01

    Research on power line extraction technology using mobile laser point clouds has important practical significance for railway power line patrol work. In this paper, we present a new method for automatically extracting railway power lines from MLS (Mobile Laser Scanning) data. Firstly, according to the spatial structure characteristics of the power lines and the trajectory, the relevant data are segmented piecewise. Then, a self-adaptive spatial region-growing method is used to extract power lines parallel to the rails. Finally, PCA (Principal Component Analysis) combined with an information entropy measure is used to judge whether a section of the power line is a junction and, if so, which type of junction it belongs to. The least squares fitting algorithm is introduced to model the power line. An evaluation of the proposed method on a complicated railway point cloud acquired by a RIEGL VMX450 MLS system shows that the proposed method is promising.
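
    The PCA-plus-entropy junction test is not spelled out in the record, but the basic PCA ingredient is simple to sketch: the eigenvalues of a neighbourhood's covariance matrix give a linearity measure that is close to 1 for points lying along a wire and much lower for planar or scattered neighbourhoods. The synthetic point sets below are placeholders.

        import numpy as np

        def linearity(points):
            """PCA linearity (l1 - l2) / l1 of a 3-D point set, with l1 >= l2 >= l3."""
            centred = points - points.mean(axis=0)
            l = np.sort(np.linalg.eigvalsh(np.cov(centred.T)))[::-1]
            return (l[0] - l[1]) / l[0]

        rng = np.random.default_rng(3)
        t = np.linspace(0.0, 10.0, 200)
        wire = np.c_[t, 0.02 * t ** 2, 5.0 + 0.01 * rng.normal(size=t.size)]   # wire-like points
        blob = rng.normal(size=(200, 3))                                       # scattered points
        print(round(linearity(wire), 3), round(linearity(blob), 3))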

  9. Conceptual design and kinematic analysis of a novel parallel robot for high-speed pick-and-place operations

    Science.gov (United States)

    Meng, Qizhi; Xie, Fugui; Liu, Xin-Jun

    2018-06-01

    This paper deals with the conceptual design, kinematic analysis and workspace identification of a novel four degrees-of-freedom (DOFs) high-speed spatial parallel robot for pick-and-place operations. The proposed spatial parallel robot consists of a base, four arms and a 1½ mobile platform. The mobile platform is a major innovation that avoids output singularity and offers the advantages of both single and double platforms. To investigate the characteristics of the robot's DOFs, a line graph method based on Grassmann line geometry is adopted in mobility analysis. In addition, the inverse kinematics is derived, and the constraint conditions to identify the correct solution are also provided. On the basis of the proposed concept, the workspace of the robot is identified using a set of presupposed parameters by taking input and output transmission index as the performance evaluation criteria.

  10. On-line reconstruction algorithms for the CBM and ALICE experiments

    International Nuclear Information System (INIS)

    Gorbunov, Sergey

    2013-01-01

    This thesis presents various algorithms which have been developed for on-line event reconstruction in the CBM experiment at GSI, Darmstadt, and the ALICE experiment at CERN, Geneva. Despite the fact that the experiments are different - CBM is a fixed-target experiment with forward geometry, while ALICE has a typical collider geometry - they share common aspects where reconstruction is concerned. The thesis describes: - general modifications to the Kalman filter method, which allow one to accelerate, improve, and simplify existing fit algorithms; - developed algorithms for track fitting in the CBM and ALICE experiments, including a new method for track extrapolation in a non-homogeneous magnetic field; - developed algorithms for primary and secondary vertex fitting in both experiments, in particular a new method for the reconstruction of decayed particles; - a developed parallel algorithm for on-line tracking in the CBM experiment; - a developed parallel algorithm for on-line tracking in the High Level Trigger of the ALICE experiment; - the realisation of the track finders on modern hardware, such as SIMD CPU registers and GPU accelerators. All the presented methods have been developed by or with the direct participation of the author.
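
    The Kalman filter at the heart of these fitting algorithms can be shown in its simplest linear form: one predict/update step for a toy straight-line track measured at unit steps along the beam direction. The noise matrices and measurements are placeholders; the thesis's contributions concern extrapolation in a non-homogeneous field and fast, parallel implementations, which this sketch does not attempt.

        import numpy as np

        def kalman_step(x, P, z, F, Q, H, R):
            """One predict/update step of a linear Kalman filter."""
            x = F @ x                                   # predict state
            P = F @ P @ F.T + Q                         # predict covariance
            S = H @ P @ H.T + R                         # innovation covariance
            K = P @ H.T @ np.linalg.inv(S)              # Kalman gain
            x = x + K @ (z - H @ x)                     # update with measurement z
            P = (np.eye(len(x)) - K @ H) @ P
            return x, P

        F = np.array([[1.0, 1.0], [0.0, 1.0]])          # state (position, slope), unit step
        Q = 1e-4 * np.eye(2)                            # process noise (e.g. multiple scattering)
        H = np.array([[1.0, 0.0]])                      # only the position is measured
        R = np.array([[0.05]])                          # measurement variance

        x, P = np.zeros(2), np.eye(2)
        for z in (0.1, 0.6, 1.1, 1.4, 2.1):             # placeholder hit positions
            x, P = kalman_step(x, P, np.array([z]), F, Q, H, R)
        print(np.round(x, 2))                           # fitted position and slope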

  11. Introduction to parallel programming

    CERN Document Server

    Brawer, Steven

    1989-01-01

    Introduction to Parallel Programming focuses on the techniques, processes, methodologies, and approaches involved in parallel programming. The book first offers information on Fortran, hardware and operating system models, and processes, shared memory, and simple parallel programs. Discussions focus on processes and processors, joining processes, shared memory, time-sharing with multiple processors, hardware, loops, passing arguments in function/subroutine calls, program structure, and arithmetic expressions. The text then elaborates on basic parallel programming techniques, barriers and race

  12. Parallelism in matrix computations

    CERN Document Server

    Gallopoulos, Efstratios; Sameh, Ahmed H

    2016-01-01

    This book is primarily intended as a research monograph that could also be used in graduate courses for the design of parallel algorithms in matrix computations. It assumes general but not extensive knowledge of numerical linear algebra, parallel architectures, and parallel programming paradigms. The book consists of four parts: (I) Basics; (II) Dense and Special Matrix Computations; (III) Sparse Matrix Computations; and (IV) Matrix functions and characteristics. Part I deals with parallel programming paradigms and fundamental kernels, including reordering schemes for sparse matrices. Part II is devoted to dense matrix computations such as parallel algorithms for solving linear systems, linear least squares, the symmetric algebraic eigenvalue problem, and the singular-value decomposition. It also deals with the development of parallel algorithms for special linear systems such as banded, Vandermonde, Toeplitz, and block Toeplitz systems. Part III addresses sparse matrix computations: (a) the development of pa...

  13. Substantial equivalence--an appropriate paradigm for the safety assessment of genetically modified foods?

    International Nuclear Information System (INIS)

    Kuiper, Harry A.; Kleter, Gijs A.; Noteborn, Hub P.J.M.; Kok, Esther J.

    2002-01-01

    Safety assessment of genetically modified food crops is based on the concept of substantial equivalence, developed by OECD and further elaborated by FAO/WHO. The concept embraces a comparative approach to identify possible differences between the genetically modified food and its traditional comparator, which is considered to be safe. The concept is not a safety assessment in itself; it identifies hazards but does not assess them. The outcome of the comparative exercise will further guide the safety assessment, which may include (immuno)toxicological and biochemical testing. Application of the concept of substantial equivalence may encounter practical difficulties: (i) the availability of near-isogenic parental lines to compare the genetically modified food with; (ii) limited availability of methods for the detection of (un)intended effects resulting from the genetic modification; and (iii) limited information on natural variations in levels of relevant crop constituents. In order to further improve the methodology for identification of unintended effects, new 'profiling' methods are recommended. Such methods will allow for the screening of potential changes in the modified host organism at different integration levels, i.e. at the genome level, during gene expression and protein translation, and at the level of cellular metabolism.

  14. Chameleon's behavior of modulable nonlinear electrical transmission line

    Science.gov (United States)

    Togueu Motcheyo, A. B.; Tchinang Tchameu, J. D.; Fewo, S. I.; Tchawoua, C.; Kofane, T. C.

    2017-12-01

    We show that a modulable discrete nonlinear transmission line can adopt a chameleon-like behavior in that, without changing its apparent structure, it can alternately become a purely right-handed or a purely left-handed line, which is different from the composite right/left-handed case. Using a quasidiscrete approximation, we derive a nonlinear Schrödinger equation that accurately predicts the carrier frequency threshold from the linear analysis. It appears that increasing the linear capacitor placed in parallel in the series branch improves the selectivity of the filter in the right-handed region, while it broadens the band-pass filter in the left-handed region. Numerical simulations of the nonlinear model confirm the forward wave in the right-handed line and the backward wave in the left-handed one.

  15. The energetic ion signature of an O-type neutral line in the geomagnetic tail

    Science.gov (United States)

    Martin, R. F., Jr.; Johnson, D. F.; Speiser, T. W.

    1991-01-01

    An energetic ion signature is presented which has the potential for remote sensing of an O-type neutral line embedded in a current sheet. A source plasma with a tailward flowing Kappa distribution yields a strongly non-Kappa distribution after interacting with the neutral line: sharp jumps, or ridges, occur in the velocity space distribution function f(v⊥, v∥) associated with both increases and decreases in f. The jumps occur when orbits are reversed in the x-direction: a reversal causing initially earthward particles (low probability in the source distribution) to be observed results in a decrease in f, while a reversal causing initially tailward particles to be observed produces an increase in f. The reversals, and hence the jumps, occur at approximately constant values of perpendicular velocity in both the positive v∥ and negative v∥ half-planes. The results were obtained using single particle simulations in a fixed magnetic field model.

  16. 16 CFR 1203.11 - Marking the impact test line.

    Science.gov (United States)

    2010-01-01

    ... Section 1203.11 Commercial Practices CONSUMER PRODUCT SAFETY COMMISSION CONSUMER PRODUCT SAFETY ACT... (HPI), with the brow parallel to the basic plane. Place a 5-kg (11-lb) preload ballast on top of the... helmet coinciding with the intersection of the surface of the helmet with the impact line planes defined...

  17. Integrated sensor array for on-line monitoring micro bioreactors

    NARCIS (Netherlands)

    Krommenhoek, E.E.

    2007-01-01

    The "Fed-batch on a chip" project, which was carried out in close cooperation with the Technical University of Delft, aims to miniaturize and parallelize micro bioreactors suitable for on-line screening of micro-organisms. This thesis describes an electrochemical sensor array which has been

  18. Parallelization in Modern C++

    CERN Multimedia

    CERN. Geneva

    2016-01-01

    The traditionally used and well established parallel programming models OpenMP and MPI are both targeting lower level parallelism and are meant to be as language agnostic as possible. For a long time, those models were the only widely available portable options for developing parallel C++ applications beyond using plain threads. This has strongly limited the optimization capabilities of compilers, has inhibited extensibility and genericity, and has restricted the use of those models together with other, modern higher level abstractions introduced by the C++11 and C++14 standards. The recent revival of interest in the industry and wider community for the C++ language has also spurred a remarkable amount of standardization proposals and technical specifications being developed. Those efforts however have so far failed to build a vision on how to seamlessly integrate various types of parallelism, such as iterative parallel execution, task-based parallelism, asynchronous many-task execution flows, continuation s...

  19. Massively parallel mathematical sieves

    Energy Technology Data Exchange (ETDEWEB)

    Montry, G.R.

    1989-01-01

    The Sieve of Eratosthenes is a well-known algorithm for finding all prime numbers in a given subset of integers. A parallel version of the Sieve is described that produces computational speedups over 800 on a hypercube with 1,024 processing elements for problems of fixed size. Computational speedups as high as 980 are achieved when the problem size per processor is fixed. The method of parallelization generalizes to other sieves and will be efficient on any ensemble architecture. We investigate two highly parallel sieves using scattered decomposition and compare their performance on a hypercube multiprocessor. A comparison of different parallelization techniques for the sieve illustrates the trade-offs necessary in the design and implementation of massively parallel algorithms for large ensemble computers.
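
    As a rough illustration of the decomposition idea described above (and only that: this is a simple segmented sieve using Python's multiprocessing, not the hypercube implementation from the report), each worker sieves its own block of the integer range using a shared set of base primes:

        import math
        from multiprocessing import Pool

        def base_primes(limit):
            """Serial sieve up to sqrt(N); these primes seed every worker."""
            is_prime = bytearray([1]) * (limit + 1)
            is_prime[0:2] = b"\x00\x00"
            for p in range(2, int(math.isqrt(limit)) + 1):
                if is_prime[p]:
                    is_prime[p * p::p] = bytearray(len(is_prime[p * p::p]))
            return [i for i, v in enumerate(is_prime) if v]

        def sieve_segment(args):
            """Sieve one block [lo, hi) using the shared base primes."""
            lo, hi, primes = args
            seg = bytearray([1]) * (hi - lo)
            for p in primes:
                start = max(p * p, ((lo + p - 1) // p) * p)
                seg[start - lo::p] = bytearray(len(seg[start - lo::p]))
            return [lo + i for i, v in enumerate(seg) if v and lo + i >= 2]

        def parallel_sieve(n, workers=4):
            primes = base_primes(int(math.isqrt(n)))
            step = (n + workers) // workers
            blocks = [(lo, min(lo + step, n + 1), primes) for lo in range(2, n + 1, step)]
            with Pool(workers) as pool:
                return [p for chunk in pool.map(sieve_segment, blocks) for p in chunk]

        if __name__ == "__main__":
            print(parallel_sieve(100))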

  20. Effect of parallel magnetic field on repetitively unipolar nanosecond pulsed dielectric barrier discharge under different pulse repetition frequencies

    Science.gov (United States)

    Liu, Yidi; Yan, Huijie; Guo, Hongfei; Fan, Zhihui; Wang, Yuying; Wu, Yun; Ren, Chunsheng

    2018-03-01

    A magnetic field, with its direction parallel to the electric field, is applied to a repetitively unipolar positive nanosecond pulsed dielectric barrier discharge. The effect of the parallel magnetic field on the plasma generated between two parallel-plate electrodes in quiescent air is experimentally studied under different pulse repetition frequencies (PRFs). It is found that a current pulse occurs only on the rising front of the voltage pulse, and that the current is increased by the parallel magnetic field at all PRFs considered. The discharge uniformity improves as the PRF decreases, and this trend is also observed in the discharge with the parallel magnetic field. Using the line-ratio technique of optical emission spectra, it is found that both the average electron density and the electron temperature at the considered PRFs are increased when the parallel magnetic field is applied. The increase in average electron density is essentially the same at all considered PRFs, whereas the increase in electron temperature is larger at higher PRFs than at lower PRFs. All of these phenomena are explained by the effect of the parallel magnetic field on the diffusion and dissipation of electrons.

  1. Computer-Aided Parallelizer and Optimizer

    Science.gov (United States)

    Jin, Haoqiang

    2011-01-01

    The Computer-Aided Parallelizer and Optimizer (CAPO) automates the insertion of compiler directives to facilitate parallel processing on Shared Memory Parallel (SMP) machines. While CAPO is currently integrated seamlessly into CAPTools (developed at the University of Greenwich, now marketed as ParaWise), CAPO was independently developed at Ames Research Center as one of the components for the Legacy Code Modernization (LCM) project. The current version takes serial FORTRAN programs, performs interprocedural data dependence analysis, and generates OpenMP directives. Due to the widely supported OpenMP standard, the generated OpenMP codes have the potential to run on a wide range of SMP machines. CAPO relies on accurate interprocedural data dependence information currently provided by CAPTools. Compiler directives are generated through identification of parallel loops at the outermost level, construction and optimization of parallel regions around those loops, and insertion of directives with automatic identification of private, reduction, induction, and shared variables. Attempts also have been made to identify potential pipeline parallelism (implemented with point-to-point synchronization). Although directives are generated automatically, user interaction with the tool is still important for producing good parallel codes. A comprehensive graphical user interface is included for users to interact with the parallelization process.

  2. Optimal data replication: A new approach to optimizing parallel EM algorithms on a mesh-connected multiprocessor for 3D PET image reconstruction

    International Nuclear Information System (INIS)

    Chen, C.M.; Lee, S.Y.

    1995-01-01

    The EM algorithm promises an estimated image with the maximal likelihood for 3D PET image reconstruction. However, due to its long computation time, the EM algorithm has not been widely used in practice. While several parallel implementations of the EM algorithm have been developed to make the EM algorithm feasible, they do not guarantee an optimal parallelization efficiency. In this paper, the authors propose a new parallel EM algorithm which maximizes the performance by optimizing data replication on a mesh-connected message-passing multiprocessor. To optimize data replication, the authors have formally derived the optimal allocation of shared data, group sizes, integration and broadcasting of replicated data as well as the scheduling of shared data accesses. The proposed parallel EM algorithm has been implemented on an iPSC/860 with 16 PEs. The experimental and theoretical results, which are consistent with each other, have shown that the proposed parallel EM algorithm could improve performance substantially over those using unoptimized data replication
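
    For orientation only, the following is a plain serial numpy sketch of the MLEM (EM) update whose forward projection and backprojection steps are what such parallel implementations distribute across processors; the random system matrix is a stand-in for a real PET geometry and this is not the authors' optimized algorithm.

        import numpy as np

        def mlem(system_matrix, measured, n_iter=20):
            """Basic MLEM iterations: image <- image * (A^T (y / (A image)) / (A^T 1)).
            Parallel versions split the rows of A (projection bins) across processors
            and combine the back-projected correction factors every iteration."""
            n_bins, n_voxels = system_matrix.shape
            image = np.ones(n_voxels)
            sensitivity = system_matrix.sum(axis=0)            # A^T 1
            for _ in range(n_iter):
                expected = system_matrix @ image               # forward projection
                ratio = np.where(expected > 0, measured / expected, 0.0)
                image *= (system_matrix.T @ ratio) / np.maximum(sensitivity, 1e-12)
            return image

        # toy problem with a random system matrix standing in for the scanner geometry
        rng = np.random.default_rng(1)
        A = rng.random((64, 16))
        true_image = rng.random(16)
        y = rng.poisson(A @ true_image * 50)
        print(mlem(A, y).round(2))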

  3. Data communications in a parallel active messaging interface of a parallel computer

    Science.gov (United States)

    Archer, Charles J; Blocksome, Michael A; Ratterman, Joseph D; Smith, Brian E

    2013-11-12

    Data communications in a parallel active messaging interface (`PAMI`) of a parallel computer composed of compute nodes that execute a parallel application, each compute node including application processors that execute the parallel application and at least one management processor dedicated to gathering information regarding data communications. The PAMI is composed of data communications endpoints, each endpoint composed of a specification of data communications parameters for a thread of execution on a compute node, including specifications of a client, a context, and a task, the compute nodes and the endpoints coupled for data communications through the PAMI and through data communications resources. Embodiments function by gathering call site statistics describing data communications resulting from execution of data communications instructions and identifying, in dependence upon the call site statistics, a data communications algorithm for use in executing a data communications instruction at a call site in the parallel application.

  4. Managing first-line failure.

    Science.gov (United States)

    Cooper, David A

    2014-01-01

    The WHO standard of care for failure of a first regimen, usually two N(t)RTIs and an NNRTI, consists of a ritonavir-boosted protease inhibitor with a change in N(t)RTIs. Until recently, there was no evidence to support these recommendations, which were based on expert opinion. Two large randomized clinical trials, SECOND LINE and EARNEST, both showed excellent response rates (>80%) for the WHO standard of care and indicated that a novel regimen of a boosted protease inhibitor with an integrase inhibitor had equal efficacy with no difference in toxicity. In EARNEST, a third arm consisting of induction with the combined protease and integrase inhibitor followed by protease inhibitor monotherapy maintenance was inferior and led to substantial (20%) protease inhibitor resistance. These studies confirm the validity of the current WHO recommendations and point to a novel public health approach of using two new classes for second-line therapy when standard first-line therapy has failed, which avoids resistance genotyping. Notwithstanding, adherence must be stressed in those failing first-line treatments. Protease inhibitor monotherapy is not suitable for a public health approach in low- and middle-income countries.

  5. A parallel buffer tree

    DEFF Research Database (Denmark)

    Sitchinava, Nodar; Zeh, Norbert

    2012-01-01

    We present the parallel buffer tree, a parallel external memory (PEM) data structure for batched search problems. This data structure is a non-trivial extension of Arge's sequential buffer tree to a private-cache multiprocessor environment and reduces the number of I/O operations by the number of... in the optimal O(psort(N) + K/PB) parallel I/O complexity, where K is the size of the output reported in the process and psort(N) is the parallel I/O complexity of sorting N elements using P processors.

  6. Application Portable Parallel Library

    Science.gov (United States)

    Cole, Gary L.; Blech, Richard A.; Quealy, Angela; Townsend, Scott

    1995-01-01

    The Application Portable Parallel Library (APPL) computer program is a subroutine-based message-passing software library intended to provide a consistent interface to a variety of multiprocessor computers on the market today. It minimizes the effort needed to move an application program from one computer to another. The user develops an application program once and then easily moves it from the parallel computer on which it was created to another parallel computer. ("Parallel computer" here also includes a heterogeneous collection of networked computers.) Written in C language with one FORTRAN 77 subroutine for UNIX-based computers and callable from application programs written in C language or FORTRAN 77.

  7. Parallel Algorithms and Patterns

    Energy Technology Data Exchange (ETDEWEB)

    Robey, Robert W. [Los Alamos National Lab. (LANL), Los Alamos, NM (United States)

    2016-06-16

    This is a PowerPoint presentation on parallel algorithms and patterns. A parallel algorithm is a well-defined, step-by-step computational procedure that emphasizes concurrency to solve a problem. Examples of such problems include sorting, searching, optimization, and matrix operations. A parallel pattern is a computational step in a sequence of independent, potentially concurrent operations that occurs in diverse scenarios with some frequency. Examples are reductions, prefix scans, and ghost cell updates. We only touch on parallel patterns in this presentation. The topic really deserves its own detailed discussion, which Gabe Rockefeller would like to develop.
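
    A small numpy illustration (an assumption of typical content, not material from the slides) of two of the named patterns, reduction and inclusive prefix scan, together with the blocked scan that is the usual parallel decomposition of the latter:

        import numpy as np

        data = np.arange(1, 17, dtype=np.int64)

        # Reduction: combine all elements with an associative operator.
        total = np.add.reduce(data)

        # Inclusive prefix scan: element i holds the sum of data[0..i].
        scan = np.cumsum(data)

        # Blocked scan, the usual parallel decomposition: each block scans locally,
        # the block totals are scanned, and the offsets are added back to each block.
        def blocked_scan(x, n_blocks=4):
            blocks = np.array_split(x, n_blocks)
            local = [np.cumsum(b) for b in blocks]                   # step 1: per-block scans
            offsets = np.cumsum([0] + [l[-1] for l in local[:-1]])   # step 2: scan of block sums
            return np.concatenate([l + off for l, off in zip(local, offsets)])  # step 3

        assert np.array_equal(scan, blocked_scan(data))
        print(total, scan)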

  8. Increased Energy Delivery for Parallel Battery Packs with No Regulated Bus

    Science.gov (United States)

    Hsu, Chung-Ti

    In this dissertation, a new approach to paralleling different battery types is presented. A method for controlling charging/discharging of different battery packs by using low-cost bi-directional switches instead of DC-DC converters is proposed. The proposed system architecture, algorithms, and control techniques allow batteries with different chemistry, voltage, and SOC to be properly charged and discharged in parallel without causing safety problems. The physical design and cost of the energy management system are substantially reduced. Additionally, specific types of failures in the maximum power point tracking (MPPT) in a photovoltaic (PV) system when tracking only the load current of a DC-DC converter are analyzed. A periodic nonlinear load current causes MPPT realized by the conventional perturb-and-observe (P&O) algorithm to become problematic. A modified MPPT algorithm is proposed; it still requires only typically measured signals, yet is suitable for both linear and periodic nonlinear loads. Moreover, for a modular DC-DC converter using several converters in parallel, the input power from PV panels is processed and distributed at the module level. Methods for properly implementing distributed MPPT are studied. A new approach to efficient MPPT under partial shading conditions is presented. The power stage architecture achieves a fast input current change rate by combining a current-adjustable converter with a few converters operating at a constant current.
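
    For context, a minimal sketch of the conventional perturb-and-observe loop that the abstract says becomes problematic under periodic nonlinear loads; the toy PV curve, step size and function names are illustrative assumptions, and this is not the dissertation's modified tracker.

        def pv_power(v):
            """Toy PV curve standing in for measured panel power (peak near 32 V)."""
            i = max(0.0, 8.0 * (1.0 - ((v - 20.0) / 22.0) ** 4))   # crude I-V shape
            return v * i

        def perturb_and_observe(power_of, v0=25.0, step=0.5, iterations=60):
            """Conventional P&O hill climbing: perturb the operating voltage and keep
            moving in whichever direction increased the measured power.  A periodic
            nonlinear load makes the power reading oscillate, which can drive this
            simple tracker away from the true maximum power point."""
            v, p = v0, power_of(v0)
            direction = +1
            for _ in range(iterations):
                v_new = v + direction * step
                p_new = power_of(v_new)
                if p_new < p:              # power dropped: reverse the perturbation
                    direction = -direction
                v, p = v_new, p_new
            return v, p

        print(perturb_and_observe(pv_power))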

  9. Parallel Evolution of Copy-Number Variation across Continents in Drosophila melanogaster

    Science.gov (United States)

    Schrider, Daniel R.; Hahn, Matthew W.; Begun, David J.

    2016-01-01

    Genetic differentiation across populations that is maintained in the presence of gene flow is a hallmark of spatially varying selection. In Drosophila melanogaster, the latitudinal clines across the eastern coasts of Australia and North America appear to be examples of this type of selection, with recent studies showing that a substantial portion of the D. melanogaster genome exhibits allele frequency differentiation with respect to latitude on both continents. As of yet there has been no genome-wide examination of differentiated copy-number variants (CNVs) in these geographic regions, despite their potential importance for phenotypic variation in Drosophila and other taxa. Here, we present an analysis of geographic variation in CNVs in D. melanogaster. We also present the first genomic analysis of geographic variation for copy-number variation in the sister species, D. simulans, in order to investigate patterns of parallel evolution in these close relatives. In D. melanogaster we find hundreds of CNVs, many of which show parallel patterns of geographic variation on both continents, lending support to the idea that they are influenced by spatially varying selection. These findings support the idea that polymorphic CNVs contribute to local adaptation in D. melanogaster. In contrast, we find very few CNVs in D. simulans that are geographically differentiated in parallel on both continents, consistent with earlier work suggesting that clinal patterns are weaker in this species. PMID:26809315

  10. Application of parallelized software architecture to an autonomous ground vehicle

    Science.gov (United States)

    Shakya, Rahul; Wright, Adam; Shin, Young Ho; Momin, Orko; Petkovsek, Steven; Wortman, Paul; Gautam, Prasanna; Norton, Adam

    2011-01-01

    This paper presents improvements made to Q, an autonomous ground vehicle designed to participate in the Intelligent Ground Vehicle Competition (IGVC). For the 2010 IGVC, Q was upgraded with a new parallelized software architecture and a new vision processor. Improvements were made to the power system, reducing the number of batteries required for operation from six to one. In previous years, a single state machine was used to execute the bulk of processing activities including sensor interfacing, data processing, path planning, navigation algorithms and motor control. This inefficient approach led to poor software performance and made the software difficult to maintain or modify. For IGVC 2010, the team implemented a modular parallel architecture using the National Instruments (NI) LabVIEW programming language. The new architecture divides all the necessary tasks - motor control, navigation, sensor data collection, etc. - into well-organized components that execute in parallel, providing considerable flexibility and facilitating efficient use of processing power. Computer vision is used to detect white lines on the ground and determine their location relative to the robot. With the new vision processor and some optimization of the image processing algorithm used last year, two frames can be acquired and processed in 70 ms. With all these improvements, Q placed 2nd in the autonomous challenge.

  11. A Parallel Compact Multi-Dimensional Numerical Algorithm with Aeroacoustics Applications

    Science.gov (United States)

    Povitsky, Alex; Morris, Philip J.

    1999-01-01

    In this study we propose a novel method to parallelize high-order compact numerical algorithms for the solution of three-dimensional PDEs (Partial Differential Equations) in a space-time domain. For this numerical integration most of the computer time is spent in computation of spatial derivatives at each stage of the Runge-Kutta temporal update. The most efficient direct method to compute spatial derivatives on a serial computer is a version of Gaussian elimination for narrow linear banded systems known as the Thomas algorithm. In a straightforward pipelined implementation of the Thomas algorithm, processors are idle due to the forward and backward recurrences of the algorithm. To utilize processors during this time, we propose to use them for either non-local data-independent computations, solving lines in the next spatial direction, or local data-dependent computations by the Runge-Kutta method. To achieve this goal, control of processor communication and computations by a static schedule is adopted. Thus, our parallel code is driven by a communication and computation schedule instead of the usual "creative programming" approach. The obtained parallelization speed-up of the novel algorithm is about twice that of the standard pipelined algorithm and close to that of the explicit DRP algorithm.
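
    For reference, a plain serial numpy version of the Thomas algorithm mentioned above; the forward elimination and back substitution loops are the recurrences whose serial data dependence idles processors in a naive pipeline. The test system is an illustrative 1-D Poisson-like matrix, not one from the paper.

        import numpy as np

        def thomas(a, b, c, d):
            """Solve a tridiagonal system with sub-diagonal a, diagonal b,
            super-diagonal c and right-hand side d (all length n; a[0], c[-1] unused)."""
            n = len(d)
            cp, dp = np.empty(n), np.empty(n)
            cp[0] = c[0] / b[0]
            dp[0] = d[0] / b[0]
            for i in range(1, n):                       # forward elimination (recurrence 1)
                denom = b[i] - a[i] * cp[i - 1]
                cp[i] = c[i] / denom
                dp[i] = (d[i] - a[i] * dp[i - 1]) / denom
            x = np.empty(n)
            x[-1] = dp[-1]
            for i in range(n - 2, -1, -1):              # back substitution (recurrence 2)
                x[i] = dp[i] - cp[i] * x[i + 1]
            return x

        # 1-D Poisson-like test system: -x[i-1] + 2 x[i] - x[i+1] = d[i]
        n = 8
        a = np.full(n, -1.0); b = np.full(n, 2.0); c = np.full(n, -1.0)
        d = np.ones(n)
        print(thomas(a, b, c, d))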

  12. Totally parallel multilevel algorithms

    Science.gov (United States)

    Frederickson, Paul O.

    1988-01-01

    Four totally parallel algorithms for the solution of a sparse linear system have common characteristics which become quite apparent when they are implemented on a highly parallel hypercube such as the CM2. These four algorithms are Parallel Superconvergent Multigrid (PSMG) of Frederickson and McBryan, Robust Multigrid (RMG) of Hackbusch, the FFT-based Spectral Algorithm, and Parallel Cyclic Reduction. In fact, all four can be formulated as particular cases of the same totally parallel multilevel algorithm, which is referred to as TPMA. In certain cases the spectral radius of TPMA is zero, and it is recognized to be a direct algorithm. In many other cases the spectral radius, although not zero, is small enough that a single iteration per timestep keeps the local error within the required tolerance.

  13. Optimal task mapping in safety-critical real-time parallel systems

    International Nuclear Information System (INIS)

    Aussagues, Ch.

    1998-01-01

    This PhD thesis deals with the correct design of safety-critical real-time parallel systems. Such systems constitute a fundamental part of high-performance command and control systems that can be found in the nuclear domain or, more generally, in parallel embedded systems. The verification of their temporal correctness is the core of this thesis. Our contribution lies mainly in the following three points: the analysis and extension of a programming model for such real-time parallel systems; the proposal of an original method based on a new operator for the synchronized product of state-machine task graphs; and the validation of the approach by its implementation and evaluation. The work particularly addresses the central problem of optimal task mapping on a parallel architecture such that the temporal constraints are globally guaranteed, i.e. the timeliness property is valid. The results also incorporate optimality criteria for the sizing and correct dimensioning of a parallel system, for instance in the number of processing elements. These criteria are connected with operational constraints of the application domain. Our approach is based on the off-line feasibility analysis of the deadline-driven dynamic scheduling used to schedule tasks inside one processor. This leads us to define the synchronized product, from which a system of linear constraints is automatically generated; this allows the maximum load of a group of tasks to be calculated and their timeliness constraints to be verified. The communications, the verification of their timeliness, and their incorporation into the mapping problem constitute the second main contribution of this thesis. Finally, the global solving technique dealing with both task and communication aspects has been implemented and evaluated in the framework of the OASIS project at the LETI research center at CEA/Saclay. (author)

  14. Parallelization of a hydrological model using the message passing interface

    Science.gov (United States)

    Wu, Yiping; Li, Tiejian; Sun, Liqun; Chen, Ji

    2013-01-01

    With increasing knowledge about natural processes, hydrological models such as the Soil and Water Assessment Tool (SWAT) are becoming larger and more complex, with increasing computation time. Additionally, other procedures such as model calibration, which may require thousands of model iterations, can increase running time and thus further impede rapid modeling and analysis. Using the widely applied SWAT as an example, this study demonstrates how to parallelize a serial hydrological model in a Windows® environment using a parallel programming technology, the Message Passing Interface (MPI). With a case study, we derived the optimal values for the two parameters (the number of processes and the corresponding percentage of work to be distributed to the master process) of the parallel SWAT (P-SWAT) on an ordinary personal computer and a workstation. Our study indicates that model execution time can be reduced by 42%–70% (or a speedup of 1.74–3.36) using multiple processes (two to five) with a proper task-distribution scheme (between the master and slave processes). Although the computation time becomes lower with an increasing number of processes (from two to five), this enhancement diminishes due to the accompanying increase in demand for message passing between the master and all slave processes. Our case study demonstrates that P-SWAT with a five-process run may reach the maximum speedup, and the performance can be quite stable (fairly independent of project size). Overall, P-SWAT can help reduce the computation time substantially for an individual model run, manual and automatic calibration procedures, and optimization of best management practices. In particular, the parallelization method we used and the scheme for deriving the optimal parameters in this study can be valuable and easily applied to other hydrological or environmental models.
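
    A hypothetical master/worker sketch, in the spirit of the scheme described above but not the P-SWAT source, in which rank 0 keeps a tunable share of the subbasins and scatters the rest to the slave processes; the work function, the subbasin count and the 15% master share are illustrative assumptions only.

        # Run with, e.g.:  mpiexec -n 4 python pswat_sketch.py
        from mpi4py import MPI
        import numpy as np

        def simulate_subbasin(sub_id):
            """Placeholder for one subbasin's hydrological computation (not SWAT)."""
            return sub_id, float(np.sqrt(np.arange(1, 20000)).sum() % 1000.0)

        comm = MPI.COMM_WORLD
        rank, size = comm.Get_rank(), comm.Get_size()

        if rank == 0:
            subbasins = np.arange(120)
            master_share = 0.15          # the "percentage of work for the master" parameter
            n_master = len(subbasins) if size == 1 else int(master_share * len(subbasins))
            rest = np.array_split(subbasins[n_master:], size - 1) if size > 1 else []
            chunks = [subbasins[:n_master].tolist()] + [c.tolist() for c in rest]
        else:
            chunks = None

        my_work = comm.scatter(chunks, root=0)                  # distribute the task lists
        my_results = [simulate_subbasin(int(s)) for s in my_work]
        all_results = comm.gather(my_results, root=0)           # collect partial results

        if rank == 0:
            merged = dict(item for part in all_results for item in part)
            print("collected", len(merged), "subbasin results on the master")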

  15. Neural Parallel Engine: A toolbox for massively parallel neural signal processing.

    Science.gov (United States)

    Tam, Wing-Kin; Yang, Zhi

    2018-05-01

    Large-scale neural recordings provide detailed information on neuronal activities and can help elicit the underlying neural mechanisms of the brain. However, the computational burden is also formidable when we try to process the huge data stream generated by such recordings. In this study, we report the development of Neural Parallel Engine (NPE), a toolbox for massively parallel neural signal processing on graphical processing units (GPUs). It offers a selection of the most commonly used routines in neural signal processing such as spike detection and spike sorting, including advanced algorithms such as exponential-component-power-component (EC-PC) spike detection and binary pursuit spike sorting. We also propose a new method for detecting peaks in parallel through a parallel compact operation. Our toolbox is able to offer a 5× to 110× speedup compared with its CPU counterparts depending on the algorithms. A user-friendly MATLAB interface is provided to allow easy integration of the toolbox into existing workflows. Previous efforts on GPU neural signal processing only focus on a few rudimentary algorithms, are not well-optimized and often do not provide a user-friendly programming interface to fit into existing workflows. There is a strong need for a comprehensive toolbox for massively parallel neural signal processing. A new toolbox for massively parallel neural signal processing has been created. It can offer significant speedup in processing signals from large-scale recordings up to thousands of channels. Copyright © 2018 Elsevier B.V. All rights reserved.
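
    As a loose illustration of the kind of data-parallel peak detection such a toolbox performs (this is a simplified numpy formulation, not the NPE code, and the threshold rule is an assumption), every comparison below is an independent elementwise operation, which is what lets the same formulation map onto thousands of GPU threads:

        import numpy as np

        def detect_peaks(signal, threshold):
            """A sample is flagged as a peak if it exceeds the threshold and is a
            local maximum with respect to its two neighbours."""
            s = np.asarray(signal)
            mid, left, right = s[1:-1], s[:-2], s[2:]
            is_peak = (mid > threshold) & (mid >= left) & (mid > right)
            return np.nonzero(is_peak)[0] + 1          # indices into the original array

        # toy trace: noise with a few embedded spikes
        rng = np.random.default_rng(7)
        trace = rng.normal(0.0, 1.0, 2000)
        trace[[200, 650, 1400]] += 12.0
        print(detect_peaks(trace, threshold=6.0))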

  16. Parallel processing in the brain’s visual form system: An fMRI study

    Directory of Open Access Journals (Sweden)

    Yoshihito eShigihara

    2014-07-01

    Full Text Available We here extend and complement our earlier time-based, magneto-encephalographic (MEG) study of the processing of forms by the visual brain (Shigihara and Zeki, 2013) with a functional magnetic resonance imaging (fMRI) study, in order to better localize the activity produced in early visual areas when subjects view simple geometric stimuli of increasing perceptual complexity (lines, angles, rhomboids) constituted from the same elements (lines). Our results show that all three categories of form activate all three visual areas with which we were principally concerned (V1, V2, V3), with angles producing the strongest and rhomboids the weakest activity in all three. The difference between the activity produced by angles and rhomboids was significant, that between lines and rhomboids was trend significant while that between lines and angles was not. Taken together with our earlier MEG results, the present ones suggest that a parallel strategy is used in processing forms, in addition to the well-documented hierarchical strategy.

  17. Nasca Lines: A Mystery wrapped in an Enigma

    OpenAIRE

    Pita, J. R. Castrejon; Pita, A. A. Castrejon; Galan, A. Sarmiento; Garcia, R. Castrejon

    2003-01-01

    We analyze the geometrical structure of the astonishing Nasca geoglyphs in terms of their fractal dimension with the idea of dating these manifestations of human cultural engagements in relation to one another. Our findings suggest that the first delineated images consist of straight, parallel lines and that having sophisticated their abilities, the Nasca artists moved on to the design of more complex structures.

  18. Nasca lines: A mystery wrapped in an enigma

    Science.gov (United States)

    Castrejón-Pita, J. R.; Castrejón-Pita, A. A.; Sarmiento-Galán, A.; Castrejón-García, R.

    2003-09-01

    We analyze the geometrical structure of the astonishing Nasca geoglyphs in terms of their fractal dimension with the idea of dating these manifestations of human cultural engagements in relation to one another. Our findings suggest that the first delineated images consist of straight, parallel lines and that having sophisticated their abilities, the Nasca artists moved on to the design of more complex structures.
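
    A small box-counting estimate of the fractal dimension of a set of 2-D points, the quantity the two Nasca records above rely on; the grid sizes and the test figure are illustrative assumptions, not the authors' procedure or data.

        import numpy as np

        def box_counting_dimension(points, sizes=(1/2, 1/4, 1/8, 1/16, 1/32, 1/64)):
            """Estimate the box-counting dimension of 2-D points scaled to the unit
            square: count occupied boxes N(s) at several box sizes s and fit
            log N(s) against log(1/s)."""
            pts = np.asarray(points, dtype=float)
            pts = (pts - pts.min(axis=0)) / np.ptp(pts, axis=0)
            counts = []
            for s in sizes:
                cells = np.floor(pts / s).astype(int)
                counts.append(len(np.unique(cells, axis=0)))
            slope, _ = np.polyfit(np.log(1.0 / np.asarray(sizes)), np.log(counts), 1)
            return slope

        # a straight "Nasca-like" line of points should give a dimension close to 1
        t = np.linspace(0.0, 1.0, 5000)
        line = np.c_[t, 0.3 * t]
        print(round(box_counting_dimension(line), 2))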

  19. A possibility of parallel and anti-parallel diffraction measurements on ...

    Indian Academy of Sciences (India)

    However, a bent perfect crystal (BPC) monochromator at monochromatic focusing condition can provide a quite flat and equal resolution property at both parallel and anti-parallel positions and thus one can have a chance to use both sides for the diffraction experiment. From the data of the FWHM and the / measured ...

  20. Parallel-fed planar dipole antenna arrays for low-observable platforms

    CERN Document Server

    Singh, Hema; Jha, Rakesh Mohan

    2016-01-01

    This book focuses on the determination of the scattering of parallel-fed planar dipole arrays in terms of reflection and transmission coefficients at different levels of the array system. In aerospace vehicles, the phased arrays are often in a planar configuration. The radar cross section (RCS) of the vehicle is mainly due to its structure and the antennas mounted over it. There can be situations where the signatures due to the antennas dominate over the structural RCS of the platform. This necessitates study of the reduction and control of antenna/array RCS. The planar dipole array is considered as a stacked linear dipole array. A systematic, step-by-step approach is used to determine the RCS pattern including the finite dimensions of dipole antenna elements. The mutual impedance between the dipole elements for the planar configuration is determined. The scattering up to the second level of couplers in the parallel feed network is taken into account. The phase shifters are modelled as delay lines. All the couplers in the feed n...

  1. Magnetic insulation in triplate and coaxial vacuum transmission lines. Report PIFR-1009

    International Nuclear Information System (INIS)

    Di Capua, M.; Pellinen, D.G.

    1980-08-01

    An experimental investigation was made of magnetically insulated transmission lines for use in an electron beam fusion accelerator. The magnetically insulated vacuum transmission lines would transfer the power pulses from many modules to a single diode region or multiple diodes to generate currents on the order of 100 MA. This approach may allow present limits on power flow through dielectric vacuum interfaces to be overcome. We have investigated symmetric parallel plate (triplate) transmission lines with a wave impedance of 24 Ω and a spacing of 1.9 cm, and coaxial transmission lines (coax) with a wave impedance of 42 Ω and a spacing of 2.9 cm

  2. Preliminary Study on the Enhancement of Reconstruction Speed for Emission Computed Tomography Using Parallel Processing

    International Nuclear Information System (INIS)

    Park, Min Jae; Lee, Jae Sung; Kim, Soo Mee; Kang, Ji Yeon; Lee, Dong Soo; Park, Kwang Suk

    2009-01-01

    Conventional image reconstruction uses simplified physical models of projection. However, reconstruction with realistic physics, for example full 3D reconstruction, takes too long to process all the data in a clinical setting and is infeasible on a common reconstruction machine because of the large memory required by complex physical models. We suggest a realistic distributed-memory model of fast reconstruction using parallel processing on personal computers to enable such large-scale techniques. Preliminary tests of feasibility on virtual machines and various performance tests on a commercial supercomputer, Tachyon, were performed. The expectation maximization algorithm with a common 2D projection and with a realistic 3D line of response was tested. Since processing slowed down (by up to 6 times) after a certain number of iterations, compiler optimization was performed to maximize the efficiency of parallelization. Parallel processing of a program on multiple computers was available on Linux with MPICH and NFS. We verified that differences between the parallel-processed image and the single-processed image at the same iterations were within the least significant bits of the floating-point representation, about 6 bits. Two processors showed good parallel computing efficiency (a speedup of 1.96). The slowdown phenomenon was solved by vectorization using SSE. Through this study, a realistic parallel computing system for the clinic was established, able to reconstruct with ample memory using realistic physical models that cannot be simplified.

  3. Conceptual design of multiple parallel switching controller

    International Nuclear Information System (INIS)

    Ugolini, D.; Yoshikawa, S.; Ozawa, K.

    1996-01-01

    This paper discusses the conceptual design and the development of a preliminary model of a multiple parallel switching (MPS) controller. The introduction of several advanced controllers has widened and improved the control capability of nonlinear dynamical systems. However, it is not possible to uniquely define a controller that always outperforms the others, and, in many situations, the controller providing the best control action depends on the operating conditions and on the intrinsic properties and behavior of the controlled dynamical system. The desire to combine the control action of several controllers with the purpose of continuously attaining the best control action has motivated the development of the MPS controller. The MPS controller consists of a number of single controllers acting in parallel and of an artificial intelligence (AI) based selecting mechanism. The AI selecting mechanism analyzes the output of each controller and implements the one providing the best control performance. An inherent property of the MPS controller is the possibility to discard unreliable controllers while still being able to perform the control action. To demonstrate the feasibility and the capability of the MPS controller, the simulation of the on-line operation control of a fast breeder reactor (FBR) evaporator is presented. (author)

  4. High-resolution brain SPECT imaging by combination of parallel and tilted detector heads.

    Science.gov (United States)

    Suzuki, Atsuro; Takeuchi, Wataru; Ishitsu, Takafumi; Morimoto, Yuichi; Kobashi, Keiji; Ueno, Yuichiro

    2015-10-01

    To improve the spatial resolution of brain single-photon emission computed tomography (SPECT), we propose a new brain SPECT system in which the detector heads are tilted towards the rotation axis so that they are closer to the brain. In addition, parallel detector heads are used to obtain the complete projection data set. We evaluated this parallel and tilted detector head system (PT-SPECT) in simulations. In the simulation study, the tilt angle of the detector heads relative to the axis was 45°. The distance from the collimator surface of the parallel detector heads to the axis was 130 mm. The distance from the collimator surface of the tilted detector heads to the origin on the axis was 110 mm. A CdTe semiconductor panel with a 1.4 mm detector pitch and a parallel-hole collimator were employed in both types of detector head. A line source phantom, cold-rod brain-shaped phantom, and cerebral blood flow phantom were evaluated. The projection data were generated by forward-projection of the phantom images using physics models, and Poisson noise at clinical levels was applied to the projection data. The ordered-subsets expectation maximization algorithm with physics models was used. We also evaluated conventional SPECT using four parallel detector heads for the sake of comparison. The evaluation of the line source phantom showed that the transaxial FWHM in the central slice for conventional SPECT ranged from 6.1 to 8.5 mm, while that for PT-SPECT ranged from 5.3 to 6.9 mm. The cold-rod brain-shaped phantom image showed that conventional SPECT could visualize up to 8-mm-diameter rods. By contrast, PT-SPECT could visualize up to 6-mm-diameter rods in upper slices of a cerebrum. The cerebral blood flow phantom image showed that the PT-SPECT system provided higher resolution at the thalamus and caudate nucleus as well as at the longitudinal fissure of the cerebrum compared with conventional SPECT. PT-SPECT provides improved image resolution at not only upper but also at

  5. Chiral filtration-induced spin/valley polarization in silicene line defects

    Science.gov (United States)

    Ren, Chongdan; Zhou, Benhu; Sun, Minglei; Wang, Sake; Li, Yunfang; Tian, Hongyu; Lu, Weitao

    2018-06-01

    The spin/valley polarization in silicene with extended line defects is investigated according to the chiral filtration mechanism. It is shown that the inner-built quantum Hall pseudo-edge states with identical chirality can serve as a chiral filter with a weak magnetic field and that the transmission process is restrained/strengthened for chiral states with reversed/identical chirality. With two parallel line defects, which act as natural chiral filtration, the filter effect is greatly enhanced, and 100% spin/valley polarization can be achieved.

  6. Parallel asynchronous systems and image processing algorithms

    Science.gov (United States)

    Coon, D. D.; Perera, A. G. U.

    1989-01-01

    A new hardware approach to implementation of image processing algorithms is described. The approach is based on silicon devices which would permit an independent analog processing channel to be dedicated to every pixel. A laminar architecture consisting of a stack of planar arrays of the device would form a two-dimensional array processor with a 2-D array of inputs located directly behind a focal plane detector array. A 2-D image data stream would propagate in neuronlike asynchronous pulse coded form through the laminar processor. Such systems would integrate image acquisition and image processing. Acquisition and processing would be performed concurrently as in natural vision systems. The research is aimed at implementation of algorithms, such as the intensity dependent summation algorithm and pyramid processing structures, which are motivated by the operation of natural vision systems. Implementation of natural vision algorithms would benefit from the use of neuronlike information coding and the laminar, 2-D parallel, vision-system-type architecture. Besides providing a neural network framework for implementation of natural vision algorithms, a 2-D parallel approach could eliminate the serial bottleneck of conventional processing systems. Conversion to serial format would occur only after raw intensity data has been substantially processed. An interesting challenge arises from the fact that the mathematical formulation of natural vision algorithms does not specify the means of implementation, so that hardware implementation poses intriguing questions involving vision science.

  7. Line-Interactive UPS for Microgrids

    DEFF Research Database (Denmark)

    Abusara, Mohammad; Guerrero, Josep M.; Sharkh, Suleiman

    2014-01-01

    Line-interactive Uninterruptible Power Supply (UPS) systems are good candidates for providing energy storage within a microgrid to help improve its reliability, economy and efficiency. In grid-connected mode, power can be imported from the grid by the UPS to charge its battery. Power can also... drooping technique to ensure seamless transfer between grid-connected and stand-alone parallel modes of operation. The drooping coefficients are chosen to limit the energy imported by the UPS when re-connecting to the grid and to give good transient response. Experimental results of a microgrid consisting...
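
    A minimal sketch, under assumed coefficients and set-points, of the frequency/voltage drooping rule commonly used to parallel inverter-based UPS units; it is not the controller or the parameter values from the paper.

        def droop_setpoints(p_out, q_out, f_nom=50.0, v_nom=230.0,
                            kp=0.001, kq=0.002, p_ref=0.0, q_ref=0.0):
            """P-f / Q-V droop: the inverter lowers its frequency as it exports more
            active power and lowers its voltage as it exports more reactive power,
            so paralleled units share load without explicit communication.  A small
            kp also limits how much power flows when re-synchronising to the grid."""
            f = f_nom - kp * (p_out - p_ref)
            v = v_nom - kq * (q_out - q_ref)
            return f, v

        # example: a unit exporting 2 kW of active and 500 var of reactive power
        print(droop_setpoints(2000.0, 500.0))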

  8. Parallel k-means++

    Energy Technology Data Exchange (ETDEWEB)

    2017-04-04

    A parallelization of the k-means++ seed selection algorithm on three distinct hardware platforms: GPU, multicore CPU, and multithreaded architecture. K-means++ was developed by David Arthur and Sergei Vassilvitskii in 2007 as an extension of the k-means data clustering technique. These algorithms allow people to cluster multidimensional data by attempting to minimize the mean distance of data points within a cluster. K-means++ improved upon traditional k-means by using a more intelligent approach to selecting the initial seeds for the clustering process. While k-means++ has become a popular alternative to traditional k-means clustering, little work has been done to parallelize this technique. We have developed original C++ code for parallelizing the algorithm on three unique hardware architectures: GPU using NVidia's CUDA/Thrust framework, multicore CPU using OpenMP, and the Cray XMT multithreaded architecture. By parallelizing the process for these platforms, we are able to perform k-means++ clustering much more quickly than it could be done before.
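
    A serial Python sketch of the k-means++ seeding rule itself (choose each new centre with probability proportional to the squared distance from the nearest centre already chosen); the per-point distance updates are the embarrassingly parallel part that GPU, multicore and XMT versions can exploit. The toy data and seed are illustrative, and this is not the record's C++ code.

        import numpy as np

        def kmeans_pp_seeds(data, k, seed=None):
            """k-means++ seeding: first centre uniform at random, each further centre
            drawn with probability proportional to the squared distance to its
            nearest already-chosen centre."""
            rng = np.random.default_rng(seed)
            n = len(data)
            centres = [data[rng.integers(n)]]
            d2 = np.sum((data - centres[0]) ** 2, axis=1)
            for _ in range(1, k):
                probs = d2 / d2.sum()
                centres.append(data[rng.choice(n, p=probs)])
                d2 = np.minimum(d2, np.sum((data - centres[-1]) ** 2, axis=1))
            return np.array(centres)

        # toy data: three well-separated blobs
        rng = np.random.default_rng(3)
        blobs = np.concatenate([rng.normal(c, 0.2, size=(100, 2)) for c in ((0, 0), (5, 5), (0, 5))])
        print(kmeans_pp_seeds(blobs, k=3, seed=0))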

  9. Parallel magnetic resonance imaging

    International Nuclear Information System (INIS)

    Larkman, David J; Nunes, Rita G

    2007-01-01

    Parallel imaging has been the single biggest innovation in magnetic resonance imaging in the last decade. The use of multiple receiver coils to augment the time consuming Fourier encoding has reduced acquisition times significantly. This increase in speed comes at a time when other approaches to acquisition time reduction were reaching engineering and human limits. A brief summary of spatial encoding in MRI is followed by an introduction to the problem parallel imaging is designed to solve. There are a large number of parallel reconstruction algorithms; this article reviews a cross-section, SENSE, SMASH, g-SMASH and GRAPPA, selected to demonstrate the different approaches. Theoretical (the g-factor) and practical (coil design) limits to acquisition speed are reviewed. The practical implementation of parallel imaging is also discussed, in particular coil calibration. How to recognize potential failure modes and their associated artefacts are shown. Well-established applications including angiography, cardiac imaging and applications using echo planar imaging are reviewed and we discuss what makes a good application for parallel imaging. Finally, active research areas where parallel imaging is being used to improve data quality by repairing artefacted images are also reviewed. (invited topical review)

  10. Experiences in Data-Parallel Programming

    Directory of Open Access Journals (Sweden)

    Terry W. Clark

    1997-01-01

    Full Text Available To efficiently parallelize a scientific application with a data-parallel compiler requires certain structural properties in the source program, and conversely, the absence of others. A recent parallelization effort of ours reinforced this observation and motivated this correspondence. Specifically, we have transformed a Fortran 77 version of GROMOS, a popular dusty-deck program for molecular dynamics, into Fortran D, a data-parallel dialect of Fortran. During this transformation we have encountered a number of difficulties that probably are neither limited to this particular application nor do they seem likely to be addressed by improved compiler technology in the near future. Our experience with GROMOS suggests a number of points to keep in mind when developing software that may at some time in its life cycle be parallelized with a data-parallel compiler. This note presents some guidelines for engineering data-parallel applications that are compatible with Fortran D or High Performance Fortran compilers.

  11. Non-Cartesian parallel imaging reconstruction.

    Science.gov (United States)

    Wright, Katherine L; Hamilton, Jesse I; Griswold, Mark A; Gulani, Vikas; Seiberlich, Nicole

    2014-11-01

    Non-Cartesian parallel imaging has played an important role in reducing data acquisition time in MRI. The use of non-Cartesian trajectories can enable more efficient coverage of k-space, which can be leveraged to reduce scan times. These trajectories can be undersampled to achieve even faster scan times, but the resulting images may contain aliasing artifacts. Just as Cartesian parallel imaging can be used to reconstruct images from undersampled Cartesian data, non-Cartesian parallel imaging methods can mitigate aliasing artifacts by using additional spatial encoding information in the form of the nonhomogeneous sensitivities of multi-coil phased arrays. This review will begin with an overview of non-Cartesian k-space trajectories and their sampling properties, followed by an in-depth discussion of several selected non-Cartesian parallel imaging algorithms. Three representative non-Cartesian parallel imaging methods will be described, including Conjugate Gradient SENSE (CG SENSE), non-Cartesian generalized autocalibrating partially parallel acquisition (GRAPPA), and Iterative Self-Consistent Parallel Imaging Reconstruction (SPIRiT). After a discussion of these three techniques, several potential promising clinical applications of non-Cartesian parallel imaging will be covered. © 2014 Wiley Periodicals, Inc.

  12. Influence of Paralleling Dies and Paralleling Half-Bridges on Transient Current Distribution in Multichip Power Modules

    DEFF Research Database (Denmark)

    Li, Helong; Zhou, Wei; Wang, Xiongfei

    2018-01-01

    This paper addresses the transient current distribution in the multichip half-bridge power modules, where two types of paralleling connections with different current commutation mechanisms are considered: paralleling dies and paralleling half-bridges. It reveals that with paralleling dies, both t...

  13. A parallel solver for huge dense linear systems

    Science.gov (United States)

    Badia, J. M.; Movilla, J. L.; Climente, J. I.; Castillo, M.; Marqués, M.; Mayo, R.; Quintana-Ortí, E. S.; Planelles, J.

    2011-11-01

    HDSS (Huge Dense Linear System Solver) is a Fortran Application Programming Interface (API) to facilitate the parallel solution of very large dense systems to scientists and engineers. The API makes use of parallelism to yield an efficient solution of the systems on a wide range of parallel platforms, from clusters of processors to massively parallel multiprocessors. It exploits out-of-core strategies to leverage the secondary memory in order to solve huge linear systems, on the order of 100,000 equations. The API is based on the parallel linear algebra library PLAPACK, and on its Out-Of-Core (OOC) extension POOCLAPACK. Both PLAPACK and POOCLAPACK use the Message Passing Interface (MPI) as the communication layer and BLAS to perform the local matrix operations. The API provides a friendly interface to the users, hiding almost all the technical aspects related to the parallel execution of the code and the use of the secondary memory to solve the systems. In particular, the API can automatically select the best way to store and solve the systems, depending on the dimension of the system, the number of processes and the main memory of the platform. Experimental results on several parallel platforms report high performance, reaching more than 1 TFLOP with 64 cores to solve a system with more than 200 000 equations and more than 10 000 right-hand side vectors. New version program summary. Program title: Huge Dense System Solver (HDSS) Catalogue identifier: AEHU_v1_1 Program summary URL:http://cpc.cs.qub.ac.uk/summaries/AEHU_v1_1.html Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland Licensing provisions: Standard CPC licence, http://cpc.cs.qub.ac.uk/licence/licence.html No. of lines in distributed program, including test data, etc.: 87 062 No. of bytes in distributed program, including test data, etc.: 1 069 110 Distribution format: tar.gz Programming language: Fortran90, C Computer: Parallel architectures: multiprocessors, computer clusters Operating system

  14. Parallelization of a beam dynamics code and first large scale radio frequency quadrupole simulations

    Directory of Open Access Journals (Sweden)

    J. Xu

    2007-01-01

    Full Text Available The design and operation support of hadron (proton and heavy-ion) linear accelerators require substantial use of beam dynamics simulation tools. The beam dynamics code TRACK was originally developed at Argonne National Laboratory (ANL) to fulfill the special requirements of the rare isotope accelerator (RIA) accelerator systems. From the beginning, the code has been developed to make it useful in the three stages of a linear accelerator project, namely, the design, commissioning, and operation of the machine. To realize this concept, the code has unique features such as end-to-end simulations from the ion source to the final beam destination and automatic procedures for tuning of a multiple charge state heavy-ion beam. The TRACK code has become a general beam dynamics code for hadron linacs and has found wide applications worldwide. Until recently, the code has remained serial except for a simple parallelization used for the simulation of multiple seeds to study the machine errors. To speed up computation, the TRACK Poisson solver has been parallelized. This paper discusses different parallel models for solving the Poisson equation, with the primary goal of extending the scalability of the code to 1024 and more processors of the new generation of supercomputers known as BlueGene (BG/L). Domain decomposition techniques have been adapted and incorporated into the parallel version of the TRACK code. To demonstrate the new capabilities of the parallelized TRACK code, the dynamics of a 45 mA proton beam represented by 10^{8} particles has been simulated through the 325 MHz radio frequency quadrupole and initial accelerator section of the proposed FNAL proton driver. The results show the benefits and advantages of large-scale parallel computing in beam dynamics simulations.
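
    To make the domain-decomposition idea concrete, here is a toy 1-D emulation of the communication pattern (Jacobi sweeps on subdomains that only need the edge, or "halo", values of their neighbours); it is not the TRACK Poisson solver, and the grid, problem and domain count are illustrative assumptions.

        import numpy as np

        def jacobi_decomposed(f, h, n_domains=4, iters=2000):
            """Solve -u'' = f on [0, 1] with u(0) = u(1) = 0 by Jacobi iteration, with
            the interior grid split into subdomains.  Each sweep, every block only
            needs the edge values of its neighbours; in a real MPI code that halo
            exchange is a pair of messages per block boundary, here it is emulated
            serially by reading from the shared previous iterate."""
            u = np.zeros(len(f))
            pieces = np.array_split(np.arange(1, len(f) - 1), n_domains)
            for _ in range(iters):
                new = u.copy()
                for idx in pieces:                     # each block: an independent "processor"
                    new[idx] = 0.5 * (u[idx - 1] + u[idx + 1] + h * h * f[idx])
                u = new
            return u

        n = 33
        x = np.linspace(0.0, 1.0, n)
        h = x[1] - x[0]
        f = np.pi ** 2 * np.sin(np.pi * x)             # exact solution is sin(pi*x)
        u = jacobi_decomposed(f, h)
        print(float(np.max(np.abs(u - np.sin(np.pi * x)))))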

  15. Parallel processing of dose calculation for external photon beam therapy

    International Nuclear Information System (INIS)

    Kunieda, Etsuo; Ando, Yutaka; Tsukamoto, Nobuhiro; Ito, Hisao; Kubo, Atsushi

    1994-01-01

    We implemented external photon beam dose calculation programs on a parallel processor system consisting of Transputers, 32-bit processors especially suitable for multi-processor configurations. Two network conformations, binary-tree and pipeline, were evaluated for rectangular and irregular field dose calculation algorithms. Although computation speed increased in proportion to the number of CPUs, substantial overhead caused by inter-processor communication occurred when a smaller computation load was delivered to each processor. On the other hand, for irregular field calculation, which requires more computation capability for each calculation point, the communication overhead remained comparatively small even when more than 50 processors were involved. Real-time responses could be expected for more complex algorithms by increasing the number of processors. (author)

  16. Circuit QED with hybrid metamaterial transmission lines

    Energy Technology Data Exchange (ETDEWEB)

    Ruloff, Stefan; Taketani, Bruno; Wilhelm, Frank [Theoretical Physics, Universitaet des Saarlandes, Saarbruecken (Germany)

    2016-07-01

    We are working on the theory of metamaterials and have obtained some interesting results. The negative refraction index causes an opposite orientation of the wave vector k and the Poynting vector S of the travelling waves. Hence the metamaterial has a falling dispersion relation ∂ω(k)/∂k < 0, implying that low frequencies correspond to short wavelengths. Metamaterials are simulated by left-handed transmission lines consisting of discrete arrays of series capacitors and parallel inductors to ground. Unusual physics arises when right- and left-handed transmission lines are coupled, forming a hybrid metamaterial transmission line. For example, if a qubit is placed in front of a hybrid metamaterial transmission line terminated in an open circuit, the spontaneous emission rate is weakened or unaffected, depending on the transition frequency of the qubit. Other research interests are the general analysis of metamaterial cavities and the mode structure of hybrid metamaterial cavities for QND readout of multi-qubit operators. In particular, the precise definition of the mode volume of a metamaterial cavity is one of our primary goals.
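
    As a hedged illustration of the falling dispersion relation mentioned above: for an idealized, loss-free, purely left-handed LC ladder (series capacitance C, shunt inductance L to ground, cell length a), the standard periodic-circuit analysis gives

        \cos(ka) \;=\; 1 + \tfrac{1}{2}\,Z_{\text{series}}\,Y_{\text{shunt}}
                  \;=\; 1 - \frac{1}{2\,\omega^{2} L C}
        \qquad\Longrightarrow\qquad
        \omega(k) \;=\; \frac{1}{2\sqrt{LC}\,\left|\sin(ka/2)\right|},
        \qquad \frac{\partial \omega}{\partial k} \,<\, 0,

    so low frequencies indeed correspond to large wave numbers (short wavelengths), the opposite of the conventional right-handed (series L, shunt C) ladder. The lumped-element assumptions are illustrative and not taken from the abstract.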

  17. Parallel Architectures and Parallel Algorithms for Integrated Vision Systems. Ph.D. Thesis

    Science.gov (United States)

    Choudhary, Alok Nidhi

    1989-01-01

    Computer vision is regarded as one of the most complex and computationally intensive problems. An integrated vision system (IVS) is a system that uses vision algorithms from all levels of processing to perform a high level application (e.g., object recognition). An IVS normally involves algorithms from low level, intermediate level, and high level vision. Designing parallel architectures for vision systems is of tremendous interest to researchers. Several issues are addressed in parallel architectures and parallel algorithms for integrated vision systems.

  18. Pattern-Driven Automatic Parallelization

    Directory of Open Access Journals (Sweden)

    Christoph W. Kessler

    1996-01-01

    Full Text Available This article describes a knowledge-based system for automatic parallelization of a wide class of sequential numerical codes operating on vectors and dense matrices, and for execution on distributed memory message-passing multiprocessors. Its main feature is a fast and powerful pattern recognition tool that locally identifies frequently occurring computations and programming concepts in the source code. This tool also works for dusty deck codes that have been "encrypted" by former machine-specific code transformations. Successful pattern recognition guides sophisticated code transformations including local algorithm replacement such that the parallelized code need not emerge from the sequential program structure by just parallelizing the loops. It allows access to an expert's knowledge on useful parallel algorithms, available machine-specific library routines, and powerful program transformations. The partially restored program semantics also supports local array alignment, distribution, and redistribution, and allows for faster and more exact prediction of the performance of the parallelized target code than is usually possible.

  19. Automated Parallel Capillary Electrophoretic System

    Science.gov (United States)

    Li, Qingbo; Kane, Thomas E.; Liu, Changsheng; Sonnenschein, Bernard; Sharer, Michael V.; Kernan, John R.

    2000-02-22

    An automated electrophoretic system is disclosed. The system employs a capillary cartridge having a plurality of capillary tubes. The cartridge has a first array of capillary ends projecting from one side of a plate. The first array of capillary ends is spaced apart in substantially the same manner as the wells of a microtitre tray of standard size. This allows one to simultaneously perform capillary electrophoresis on samples present in each of the wells of the tray. The system includes a stacked, dual-carousel arrangement to eliminate cross-contamination resulting from reuse of the same buffer tray on consecutive electrophoresis runs. The system also has a gel delivery module containing either a gel syringe driven by a stepper motor or a high-pressure chamber with a pump to quickly and uniformly deliver gel through the capillary tubes. The system further includes a multi-wavelength beam generator that produces a laser beam covering a wide range of wavelengths. An off-line capillary reconditioner thoroughly cleans a capillary cartridge to enable simultaneous execution of electrophoresis with another capillary cartridge. The streamlined nature of the off-line capillary reconditioner offers the advantage of increased system throughput with a minimal increase in system cost.

  20. Data communications in a parallel active messaging interface of a parallel computer

    Science.gov (United States)

    Archer, Charles J; Blocksome, Michael A; Ratterman, Joseph D; Smith, Brian E

    2013-10-29

    Data communications in a parallel active messaging interface (`PAMI`) of a parallel computer, the parallel computer including a plurality of compute nodes that execute a parallel application. The PAMI is composed of data communications endpoints, each endpoint including a specification of data communications parameters for a thread of execution on a compute node, including specifications of a client, a context, and a task; the compute nodes and the endpoints are coupled for data communications through the PAMI and through data communications resources. The method includes receiving in an origin endpoint of the PAMI a data communications instruction, the instruction characterized by an instruction type and specifying a transmission of transfer data from the origin endpoint to a target endpoint, and transmitting, in accordance with the instruction type, the transfer data from the origin endpoint to the target endpoint.

  1. The STAPL Parallel Graph Library

    KAUST Repository

    Harshvardhan,; Fidel, Adam; Amato, Nancy M.; Rauchwerger, Lawrence

    2013-01-01

    This paper describes the stapl Parallel Graph Library, a high-level framework that abstracts the user from data-distribution and parallelism details and allows them to concentrate on parallel graph algorithm development. It includes a customizable

  2. Parallelism and array processing

    International Nuclear Information System (INIS)

    Zacharov, V.

    1983-01-01

    Modern computing, as well as the historical development of computing, has been dominated by sequential monoprocessing. Yet there is the alternative of parallelism, where several processes may be in concurrent execution. This alternative is discussed in a series of lectures, in which the main developments involving parallelism are considered, both from the standpoint of computing systems and that of applications that can exploit such systems. The lectures seek to discuss parallelism in a historical context, and to identify all the main aspects of concurrency in computation right up to the present time. Included will be consideration of the important question as to what use parallelism might be in the field of data processing. (orig.)

  3. Net-Zero Building Technologies Create Substantial Energy Savings -

    Science.gov (United States)

    Continuum Magazine | NREL: Net-Zero Building Technologies Create Substantial Energy Savings. Only an estimated 1% of commercial buildings are built to net-zero energy criteria. Researchers work to package and share step ...

  4. Vectorization, parallelization and porting of nuclear codes (vectorization and parallelization). Progress report fiscal 1998

    International Nuclear Information System (INIS)

    Ishizuki, Shigeru; Kawai, Wataru; Nemoto, Toshiyuki; Ogasawara, Shinobu; Kume, Etsuo; Adachi, Masaaki; Kawasaki, Nobuo; Yatake, Yo-ichi

    2000-03-01

    Several computer codes in the nuclear field have been vectorized, parallelized and transported on the FUJITSU VPP500 system, the AP3000 system and the Paragon system at Center for Promotion of Computational Science and Engineering in Japan Atomic Energy Research Institute. We dealt with 12 codes in fiscal 1998. These results are reported in 3 parts, i.e., the vectorization and parallelization on vector processors part, the parallelization on scalar processors part and the porting part. In this report, we describe the vectorization and parallelization on vector processors. In this vectorization and parallelization on vector processors part, the vectorization of General Tokamak Circuit Simulation Program code GTCSP, the vectorization and parallelization of Molecular Dynamics NTV (n-particle, Temperature and Velocity) Simulation code MSP2, Eddy Current Analysis code EDDYCAL, Thermal Analysis Code for Test of Passive Cooling System by HENDEL T2 code THANPACST2 and MHD Equilibrium code SELENEJ on the VPP500 are described. In the parallelization on scalar processors part, the parallelization of Monte Carlo N-Particle Transport code MCNP4B2, Plasma Hydrodynamics code using Cubic Interpolated Propagation Method PHCIP and Vectorized Monte Carlo code (continuous energy model / multi-group model) MVP/GMVP on the Paragon are described. In the porting part, the porting of Monte Carlo N-Particle Transport code MCNP4B2 and Reactor Safety Analysis code RELAP5 on the AP3000 are described. (author)

  5. Parallel External Memory Graph Algorithms

    DEFF Research Database (Denmark)

    Arge, Lars Allan; Goodrich, Michael T.; Sitchinava, Nodari

    2010-01-01

    In this paper, we study parallel I/O efficient graph algorithms in the Parallel External Memory (PEM) model, one of the private-cache chip multiprocessor (CMP) models. We study the fundamental problem of list ranking which leads to efficient solutions to problems on trees, such as computing lowest...... an optimal speedup of Θ(P) in parallel I/O complexity and parallel computation time, compared to the single-processor external memory counterparts.

  6. Short-Range Sensor for Underwater Robot Navigation using Line-lasers and Vision

    DEFF Research Database (Denmark)

    Hansen, Peter Nicholas; Nielsen, Mikkel Cornelius; Christensen, David Johan

    2015-01-01

    This paper investigates a minimalistic laser-based range sensor, used for underwater inspection by Autonomous Underwater Vehicles (AUV). This range detection system comprises two lasers projecting vertical lines, parallel to a camera’s viewing axis, into the environment. Using both lasers...

  7. Design of multiple sequence alignment algorithms on parallel, distributed memory supercomputers.

    Science.gov (United States)

    Church, Philip C; Goscinski, Andrzej; Holt, Kathryn; Inouye, Michael; Ghoting, Amol; Makarychev, Konstantin; Reumann, Matthias

    2011-01-01

    The challenge of comparing two or more genomes that have undergone recombination and substantial amounts of segmental loss and gain has recently been addressed for small numbers of genomes. However, datasets of hundreds of genomes are now common and their sizes will only increase in the future. Multiple sequence alignment of hundreds of genomes remains an intractable problem due to quadratic increases in compute time and memory footprint. To date, most alignment algorithms are designed for commodity clusters without parallelism. Hence, we propose the design of a multiple sequence alignment algorithm on massively parallel, distributed memory supercomputers to enable research into comparative genomics on large data sets. Following the methodology of the sequential progressiveMauve algorithm, we design data structures including sequences and sorted k-mer lists on the IBM Blue Gene/P supercomputer (BG/P). Preliminary results show that we can reduce the memory footprint so that we can potentially align over 250 bacterial genomes on a single BG/P compute node. We verify our results on a dataset of E.coli, Shigella and S.pneumoniae genomes. Our implementation returns results matching those of the original algorithm but in 1/2 the time and with 1/4 the memory footprint for scaffold building. In this study, we have laid the basis for multiple sequence alignment of large-scale datasets on a massively parallel, distributed memory supercomputer, thus enabling comparison of hundreds instead of a few genome sequences within reasonable time.
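
    A minimal sketch of the sorted k-mer list data structure mentioned above and of how two such lists can be merged to find shared seeds (pure Python, not the progressiveMauve or BG/P implementation; sequences and k are toy values):

        # Build a sorted list of (k-mer, position) pairs for one sequence.
        # Sorted k-mer lists let two genomes be compared by merging the lists
        # instead of scanning all positions against each other.
        def sorted_kmer_list(seq, k):
            return sorted((seq[i:i + k], i) for i in range(len(seq) - k + 1))

        def shared_kmers(list_a, list_b):
            # Merge-style intersection of two sorted k-mer lists.
            shared, i, j = [], 0, 0
            while i < len(list_a) and j < len(list_b):
                if list_a[i][0] == list_b[j][0]:
                    shared.append((list_a[i][1], list_b[j][1]))
                    i += 1; j += 1
                elif list_a[i][0] < list_b[j][0]:
                    i += 1
                else:
                    j += 1
            return shared

        a = sorted_kmer_list("ACGTACGGA", 3)
        b = sorted_kmer_list("TTACGTAC", 3)
        print(shared_kmers(a, b))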

  8. Parallel Quasi Newton Algorithms for Large Scale Non Linear Unconstrained Optimization

    International Nuclear Information System (INIS)

    Rahman, M. A.; Basarudin, T.

    1997-01-01

    This paper discusses the Quasi-Newton (QN) method for solving non-linear unconstrained minimization problems. One important aspect of QN methods is the choice of the matrix Hk, which must be positive definite and satisfy the QN condition. Our interest here is in parallel QN methods suited to the solution of large-scale optimization problems. QN methods become less attractive for large-scale problems because of their storage and computational requirements; however, it is often the case that the Hessian is a sparse matrix. In this paper we include a mechanism for reducing the Hessian update while retaining the Hessian properties. One major motivation of our research is that a QN method may be good at solving certain types of minimization problems but degenerate in efficiency when applied to other categories of problems. For this reason, we use an algorithm containing several direction strategies which are processed in parallel. We attempt to parallelize the algorithm by exploring different search directions generated by various QN updates during the minimization process. Different line search strategies are employed simultaneously in the process of locating the minimum along each direction. The algorithm is coded in the Occam 2 language and runs on a transputer machine.
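
    As a rough illustration of the multi-direction idea (several candidate search directions explored concurrently, best step kept), here is a hypothetical Python sketch; the objective, the candidate directions and the fixed-grid line search are stand-ins, not the authors' Occam 2 code:

        # Explore several candidate search directions concurrently and keep the
        # step with the lowest objective value.
        from concurrent.futures import ThreadPoolExecutor
        import numpy as np

        def f(x):                                   # toy objective (Rosenbrock)
            return (1 - x[0])**2 + 100 * (x[1] - x[0]**2)**2

        def grad(x):
            return np.array([-2 * (1 - x[0]) - 400 * x[0] * (x[1] - x[0]**2),
                             200 * (x[1] - x[0]**2)])

        def line_search(x, d):
            # crude search over a fixed grid of step lengths along direction d
            vals = [(f(x + t * d), t) for t in np.linspace(0.0, 1.0, 101)]
            best_val, best_t = min(vals, key=lambda v: v[0])
            return best_val, x + best_t * d

        x = np.array([-1.2, 1.0])
        for _ in range(50):
            g = grad(x)
            candidates = [-g, -g / (1.0 + np.abs(g)), -np.sign(g)]   # toy "QN-like" directions
            with ThreadPoolExecutor() as pool:
                results = list(pool.map(lambda d: line_search(x, d), candidates))
            x = min(results, key=lambda r: r[0])[1]
        print(x, f(x))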

  9. Parallel hyperbolic PDE simulation on clusters: Cell versus GPU

    Science.gov (United States)

    Rostrup, Scott; De Sterck, Hans

    2010-12-01

    :http://cpc.cs.qub.ac.uk/summaries/AEGY_v1_0.html Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland Licensing provisions: GPL v3 No. of lines in distributed program, including test data, etc.: 59 168 No. of bytes in distributed program, including test data, etc.: 453 409 Distribution format: tar.gz Programming language: C, CUDA Computer: Parallel Computing Clusters. Individual compute nodes may consist of x86 CPU, Cell processor, or x86 CPU with attached NVIDIA GPU accelerator. Operating system: Linux Has the code been vectorised or parallelized?: Yes. Tested on 1-128 x86 CPU cores, 1-32 Cell Processors, and 1-32 NVIDIA GPUs. RAM: Tested on Problems requiring up to 4 GB per compute node. Classification: 12 External routines: MPI, CUDA, IBM Cell SDK Nature of problem: MPI-parallel simulation of Shallow Water equations using high-resolution 2D hyperbolic equation solver on regular Cartesian grids for x86 CPU, Cell Processor, and NVIDIA GPU using CUDA. Solution method: SWsolver provides 3 implementations of a high-resolution 2D Shallow Water equation solver on regular Cartesian grids, for CPU, Cell Processor, and NVIDIA GPU. Each implementation uses MPI to divide work across a parallel computing cluster. Additional comments: Sub-program numdiff is used for the test run.

  10. Parallel inter channel interaction mechanisms

    International Nuclear Information System (INIS)

    Jovic, V.; Afgan, N.; Jovic, L.

    1995-01-01

    Interactions between parallel channels are examined. Results of phenomenon analysis and of experimental research on non-stationary flow regimes in three parallel vertical channels are presented, covering the mechanisms of parallel channel interaction under adiabatic conditions for single-phase fluid and two-phase mixture flow. (author)

  11. Parallel paving: An algorithm for generating distributed, adaptive, all-quadrilateral meshes on parallel computers

    Energy Technology Data Exchange (ETDEWEB)

    Lober, R.R.; Tautges, T.J.; Vaughan, C.T.

    1997-03-01

    Paving is an automated mesh generation algorithm which produces all-quadrilateral elements. It can additionally generate these elements in varying sizes such that the resulting mesh adapts to a function distribution, such as an error function. While powerful, conventional paving is a very serial algorithm in its operation. Parallel paving is the extension of serial paving into parallel environments to perform the same meshing functions as conventional paving only on distributed, discretized models. This extension allows large, adaptive, parallel finite element simulations to take advantage of paving's meshing capabilities for h-remap remeshing. A significantly modified version of the CUBIT mesh generation code has been developed to host the parallel paving algorithm and demonstrate its capabilities on both two dimensional and three dimensional surface geometries and compare the resulting parallel produced meshes to conventionally paved meshes for mesh quality and algorithm performance. Sandia's "tiling" dynamic load balancing code has also been extended to work with the paving algorithm to retain parallel efficiency as subdomains undergo iterative mesh refinement.

  12. Usefulness of bone window CT images parallel to the transnasal surgical route for pituitary disorders

    International Nuclear Information System (INIS)

    Abe, T.; Kunii, N.; Ikeda, H.; Izumiyama, H.; Asahina, N.

    2003-01-01

    Before operating on 130 patients with pituitary disorders, we evaluated their bone window CT images sliced parallel to the transnasal surgical route to assess the surgical anatomy of the nasal cavity for transnasal surgery. High resolution bone window CT was performed in 3- to 5-mm slices parallel to the imaginary line connecting the inferior margin of the piriform aperture and the top of the sellar floor, i.e. parallel to the transnasal surgical route. This CT angle was useful in evaluating the width and depth of the operative field, the bony components of the nasal conchae, deviation of the nasal septum, the bony structure and mucosa in the sphenoid sinus, and the condition of the sellar floor. In patients requiring repeat surgery, the location of thin or thick nasal mucosa, residual bony septum, and inadequate sellar floor opening were easily detected. Bone window CT images sliced parallel to the transnasal surgical route provide direct visualization of the nasal anatomy for the transnasal approach. This method is helpful in determining how far to remove the sellar floor laterally, especially in cases requiring repeat surgery. (author)

  13. Seeing or moving in parallel

    DEFF Research Database (Denmark)

    Christensen, Mark Schram; Ehrsson, H Henrik; Nielsen, Jens Bo

    2013-01-01

    ... adduction-abduction movements symmetrically or in parallel with real-time congruent or incongruent visual feedback of the movements. One network, consisting of bilateral superior and middle frontal gyrus and supplementary motor area (SMA), was more active when subjects performed parallel movements, whereas a different network, involving bilateral dorsal premotor cortex (PMd), primary motor cortex, and SMA, was more active when subjects viewed parallel movements while performing either symmetrical or parallel movements. Correlations between behavioral instability and brain activity were present in right lateral...

  14. The use of parallel imaging for MRI assessment of knees in children and adolescents.

    Science.gov (United States)

    Doria, Andrea S; Chaudry, Gulraiz A; Nasui, Cristina; Rayner, Tammy; Wang, Chenghua; Moineddin, Rahim; Babyn, Paul S; White, Larry M; Sussman, Marshall S

    2010-03-01

    Parallel imaging provides faster scanning at the cost of reduced signal-to-noise ratio (SNR) and increased artifacts. To compare the diagnostic performance of two parallel MRI protocols (PPs) for assessment of pathologic knees using an 8-channel knee coil (reference standard, conventional protocol [CP]) and to characterize the SNR losses associated with parallel imaging. Two radiologists blindly interpreted 1.5 Tesla knee MRI images in 21 children (mean 13 years, range 9-18 years) with clinical indications for an MRI scan. Sagittal proton density, T2-W fat-saturated FSE, axial T2-W fat-saturated FSE, and coronal T1-W (NEX of 1,1,1) images were obtained with both CP and PP. Images were read for soft tissue and osteochondral findings. There was a 75% decrease in acquisition time using PP in comparison to CP. The CP and PP protocols fell within excellent or upper limits of substantial agreement: CP, kappa coefficient, 0.81 (95% CIs, 0.73-0.89); PP, 0.80-0.81 (0.73-0.89). The sensitivity of the two PPs was similar for assessment of soft (0.98-1.00) and osteochondral (0.89-0.94) tissues. Phantom data indicated an SNR ratio of 1.67, 1.6, and 1.51 (axial, sagittal, and coronal planes) between CP and PP scans. Parallel MRI provides a reliable assessment for pediatric knees in a significantly reduced scan time without affecting the diagnostic performance of MRI.

  15. Magnetic field line random walk in non-axisymmetric turbulence

    International Nuclear Information System (INIS)

    Tautz, R.C.; Lerche, I.

    2011-01-01

    Including a random component of a magnetic field parallel to an ambient field introduces a mean perpendicular motion to the average field line. This effect is normally not discussed because one customarily chooses at the outset to ignore such a field component in discussing random walk and diffusion of field lines. A discussion of the basic effect is given, indicating that any random magnetic field with a non-zero helicity will lead to such a non-zero perpendicular mean motion. Several exact analytic illustrations are given of the effect as well as a simple numerical illustration. -- Highlights: → For magnetic field line random walk all magnetic field components are important. → Non-vanishing magnetic helicity leads to mean perpendicular motion. → Analytically exact stream functions illustrate that the novel transverse effect exists.

  16. Online measurement for geometrical parameters of wheel set based on structure light and CUDA parallel processing

    Science.gov (United States)

    Wu, Kaihua; Shao, Zhencheng; Chen, Nian; Wang, Wenjie

    2018-01-01

    The degree of wear of the wheel-set tread is one of the main factors influencing the safety and stability of a running train. The geometrical parameters of interest mainly include flange thickness and flange height. A line-structured laser light was projected onto the wheel tread surface, and the geometrical parameters can be deduced from the profile image. An online image acquisition system was designed based on asynchronous reset of the CCD and a CUDA parallel processing unit; image acquisition is performed in hardware-interrupt mode. A high-efficiency parallel segmentation algorithm based on CUDA is proposed. The algorithm first divides the image into smaller squares, and then extracts the squares containing the target by a fusion of the k-means and STING clustering image segmentation algorithms. The segmentation time is less than 0.97 ms, a considerable acceleration ratio compared with serial CPU calculation, which greatly improves the real-time image processing capacity. When a wheel set passes at limited speed, the system, placed alongside the railway line, can measure the geometrical parameters automatically. The maximum measuring speed is 120 km/h.
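
    A rough CPU-side sketch of the tiling plus clustering step (a tiny NumPy k-means over image tiles; the CUDA kernels and the STING fusion are not reproduced here, and the tile size and threshold are hypothetical):

        # Divide an image into small square tiles and mark tiles whose brightest
        # cluster centre exceeds a threshold as containing the laser line.
        import numpy as np

        def kmeans_1d(values, k=2, iters=10):
            centres = np.linspace(values.min(), values.max(), k)
            for _ in range(iters):
                labels = np.argmin(np.abs(values[:, None] - centres[None, :]), axis=1)
                for c in range(k):
                    if np.any(labels == c):
                        centres[c] = values[labels == c].mean()
            return centres

        def laser_tiles(image, tile=16, threshold=200):
            hits = []
            for y in range(0, image.shape[0] - tile + 1, tile):
                for x in range(0, image.shape[1] - tile + 1, tile):
                    block = image[y:y + tile, x:x + tile].astype(float).ravel()
                    if kmeans_1d(block).max() > threshold:
                        hits.append((y, x))
            return hits

        img = np.zeros((64, 64), dtype=np.uint8)
        img[:, 30] = 255                      # synthetic vertical laser line
        print(laser_tiles(img))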

  17. Non-Almost Periodicity of Parallel Transports for Homogeneous Connections

    International Nuclear Information System (INIS)

    Brunnemann, Johannes; Fleischhack, Christian

    2012-01-01

    Let A be the affine space of all connections in an SU(2) principal fibre bundle over ℝ³. The set of homogeneous isotropic connections forms a line l in A. We prove that the parallel transports for general, non-straight paths in the base manifold do not depend almost periodically on l. Consequently, the embedding l ↪ A does not continuously extend to an embedding l-bar ↪ A-bar of the respective compactifications. Here, the Bohr compactification l-bar corresponds to the configuration space of homogeneous isotropic loop quantum cosmology and A-bar to that of loop quantum gravity. Analogous results are given for the anisotropic case.

  18. The numerical parallel computing of photon transport

    International Nuclear Information System (INIS)

    Huang Qingnan; Liang Xiaoguang; Zhang Lifa

    1998-12-01

    The parallel computing of photon transport is investigated; the parallel algorithm and the parallelization of programs on parallel computers, both with shared memory and with distributed memory, are discussed. By analyzing the inherent laws of the mathematical and physical model of photon transport according to the structural features of parallel computers, and by using a divide-and-conquer strategy (adjusting the algorithm structure of the program, dissolving the data dependences, finding parallelizable ingredients and creating large-grain parallel subtasks), the sequential computation of photon transport is efficiently transformed into parallel and vector computation. The program was run on various high-performance parallel computers such as the HY-1 (PVP), the Challenge (SMP) and the YH-3 (MPP), and very good parallel speedup has been obtained.

  19. Hypergraph partitioning implementation for parallelizing matrix-vector multiplication using CUDA GPU-based parallel computing

    Science.gov (United States)

    Murni, Bustamam, A.; Ernastuti, Handhika, T.; Kerami, D.

    2017-07-01

    Calculation of matrix-vector multiplication in real-world problems often involves large matrices of arbitrary size. Therefore, parallelization is needed to speed up the calculation process, which usually takes a long time. The graph partitioning techniques discussed in previous studies cannot be used to parallelize matrix-vector multiplication for matrices of arbitrary size, because graph partitioning assumes a square and symmetric matrix. Hypergraph partitioning techniques overcome this shortcoming of graph partitioning. This paper addresses the efficient parallelization of matrix-vector multiplication through hypergraph partitioning techniques using CUDA GPU-based parallel computing. CUDA (compute unified device architecture) is a parallel computing platform and programming model created by NVIDIA and implemented on the GPU (graphics processing unit).
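
    The hypergraph partitioner itself is not described in the abstract; as a minimal illustration of the partitioned multiply it feeds, here is a simple row-block decomposition in SciPy (not a hypergraph partition and not the CUDA kernel; matrix sizes and density are hypothetical):

        # Partition a sparse matrix by rows and multiply each partition with the
        # vector independently; the per-partition products are what would be
        # dispatched to GPU blocks in a CUDA implementation.
        import numpy as np
        from scipy.sparse import random as sparse_random

        A = sparse_random(1000, 800, density=0.01, format="csr", random_state=0)
        x = np.ones(800)

        parts = 4
        rows = np.array_split(np.arange(1000), parts)
        y = np.concatenate([A[r, :] @ x for r in rows])   # independent partial products

        assert np.allclose(y, A @ x)
        print(y[:5])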

  20. Writing parallel programs that work

    CERN Multimedia

    CERN. Geneva

    2012-01-01

    Serial algorithms typically run inefficiently on parallel machines. This may sound like an obvious statement, but it is the root cause of why parallel programming is considered to be difficult. The current state of the computer industry is still that almost all programs in existence are serial. This talk will describe the techniques used in the Intel Parallel Studio to provide a developer with the tools necessary to understand the behaviors and limitations of the existing serial programs. Once the limitations are known the developer can refactor the algorithms and reanalyze the resulting programs with the tools in the Intel Parallel Studio to create parallel programs that work. About the speaker Paul Petersen is a Sr. Principal Engineer in the Software and Solutions Group (SSG) at Intel. He received a Ph.D. degree in Computer Science from the University of Illinois in 1993. After UIUC, he was employed at Kuck and Associates, Inc. (KAI) working on auto-parallelizing compiler (KAP), and was involved in th...

  1. Parallel Framework for Cooperative Processes

    Directory of Open Access Journals (Sweden)

    Mitică Craus

    2005-01-01

    Full Text Available This paper describes the work of an object-oriented framework designed to be used in the parallelization of a set of related algorithms. The idea behind the system we are describing is to have a re-usable framework for running several sequential algorithms in a parallel environment. The algorithms that the framework can be used with have several things in common: they have to run in cycles, and it must be possible to split the work between several "processing units". The parallel framework uses the message-passing communication paradigm and is organized as a master-slave system. Two applications are presented: an Ant Colony Optimization (ACO) parallel algorithm for the Travelling Salesman Problem (TSP) and an Image Processing (IP) parallel algorithm for the Symmetrical Neighborhood Filter (SNF). The implementations of these applications by means of the parallel framework prove to have good performance: approximately linear speedup and low communication cost.
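
    As a rough illustration of the master-slave, cycle-based organization described above (Python multiprocessing queues standing in for the framework's message-passing layer; the chunking and the per-chunk work are hypothetical stand-ins):

        # Master-slave skeleton: the master splits each cycle's work into chunks,
        # slaves process chunks pulled from a task queue and push results back.
        from multiprocessing import Process, Queue

        def slave(tasks, results):
            while True:
                chunk = tasks.get()
                if chunk is None:              # poison pill ends the slave
                    break
                results.put(sum(v * v for v in chunk))   # stand-in "processing unit" work

        if __name__ == "__main__":
            tasks, results = Queue(), Queue()
            workers = [Process(target=slave, args=(tasks, results)) for _ in range(4)]
            for w in workers:
                w.start()
            data = list(range(1000))
            chunks = [data[i::4] for i in range(4)]
            for _ in range(3):                 # three "cycles" of cooperative work
                for c in chunks:
                    tasks.put(c)
                total = sum(results.get() for _ in chunks)
                print("cycle result:", total)
            for _ in workers:
                tasks.put(None)
            for w in workers:
                w.join()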

  2. Compiler Technology for Parallel Scientific Computation

    Directory of Open Access Journals (Sweden)

    Can Özturan

    1994-01-01

    Full Text Available There is a need for compiler technology that, given the source program, will generate efficient parallel codes for different architectures with minimal user involvement. Parallel computation is becoming indispensable in solving large-scale problems in science and engineering. Yet, the use of parallel computation is limited by the high costs of developing the needed software. To overcome this difficulty we advocate a comprehensive approach to the development of scalable architecture-independent software for scientific computation based on our experience with equational programming language (EPL). Our approach is based on a program decomposition, parallel code synthesis, and run-time support for parallel scientific computation. The program decomposition is guided by the source program annotations provided by the user. The synthesis of parallel code is based on configurations that describe the overall computation as a set of interacting components. Run-time support is provided by the compiler-generated code that redistributes computation and data during object program execution. The generated parallel code is optimized using techniques of data alignment, operator placement, wavefront determination, and memory optimization. In this article we discuss annotations, configurations, parallel code generation, and run-time support suitable for parallel programs written in the functional parallel programming language EPL and in Fortran.

  3. Parallel computing: numerics, applications, and trends

    National Research Council Canada - National Science Library

    Trobec, Roman; Vajteršic, Marián; Zinterhof, Peter

    2009-01-01

    ... and/or distributed systems. The contributions to this book are focused on topics most concerned in the trends of today's parallel computing. These range from parallel algorithmics, programming, tools, network computing to future parallel computing. Particular attention is paid to parallel numerics: linear algebra, differential equations, numerica...

  4. Parallel Computing Strategies for Irregular Algorithms

    Science.gov (United States)

    Biswas, Rupak; Oliker, Leonid; Shan, Hongzhang; Biegel, Bryan (Technical Monitor)

    2002-01-01

    Parallel computing promises several orders of magnitude increase in our ability to solve realistic computationally-intensive problems, but relies on their efficient mapping and execution on large-scale multiprocessor architectures. Unfortunately, many important applications are irregular and dynamic in nature, making their effective parallel implementation a daunting task. Moreover, with the proliferation of parallel architectures and programming paradigms, the typical scientist is faced with a plethora of questions that must be answered in order to obtain an acceptable parallel implementation of the solution algorithm. In this paper, we consider three representative irregular applications: unstructured remeshing, sparse matrix computations, and N-body problems, and parallelize them using various popular programming paradigms on a wide spectrum of computer platforms ranging from state-of-the-art supercomputers to PC clusters. We present the underlying problems, the solution algorithms, and the parallel implementation strategies. Smart load-balancing, partitioning, and ordering techniques are used to enhance parallel performance. Overall results demonstrate the complexity of efficiently parallelizing irregular algorithms.

  5. Coiled transmission line pulse generators

    Science.gov (United States)

    McDonald, Kenneth Fox

    2010-11-09

    Methods and apparatus are provided for fabricating and constructing solid dielectric "Coiled Transmission Line" pulse generators in radial or axial coiled geometries. The pour and cure fabrication process enables a wide variety of geometries and form factors. The volume between the conductors is filled with liquid blends of monomers, polymers, oligomers, and/or cross-linkers and dielectric powders; and then cured to form high field strength and high dielectric constant solid dielectric transmission lines that intrinsically produce ideal rectangular high voltage pulses when charged and switched into matched impedance loads. Voltage levels may be increased by Marx and/or Blumlein principles incorporating spark gap or, preferentially, solid state switches (such as optically triggered thyristors) which produce reliable, high repetition rate operation. Moreover, these Marxed pulse generators can be DC charged and do not require additional pulse forming circuitry, pulse forming lines, transformers, or a high voltage spark gap output switch. The apparatus accommodates a wide range of voltages, impedances, pulse durations, pulse repetition rates, and duty cycles. The resulting mobile or flight platform friendly cylindrical geometric configuration is much more compact, light-weight, and robust than conventional linear geometries, or pulse generators constructed from conventional components. Installing additional circuitry may accommodate optional pulse shape improvements. The Coiled Transmission Lines can also be connected in parallel to decrease the impedance, or in series to increase the pulse length.

  6. The Glasgow Parallel Reduction Machine: Programming Shared-memory Many-core Systems using Parallel Task Composition

    Directory of Open Access Journals (Sweden)

    Ashkan Tousimojarad

    2013-12-01

    Full Text Available We present the Glasgow Parallel Reduction Machine (GPRM), a novel, flexible framework for parallel task-composition based many-core programming. We allow the programmer to structure programs into task code, written as C++ classes, and communication code, written in a restricted subset of C++ with functional semantics and parallel evaluation. In this paper we discuss the GPRM, the virtual machine framework that enables the parallel task composition approach. We focus the discussion on GPIR, the functional language used as the intermediate representation of the bytecode running on the GPRM. Using examples in this language we show the flexibility and power of our task composition framework. We demonstrate the potential using an implementation of a merge sort algorithm on a 64-core Tilera processor, as well as on a conventional Intel quad-core processor and an AMD 48-core processor system. We also compare our framework with OpenMP tasks in a parallel pointer chasing algorithm running on the Tilera processor. Our results show that the GPRM programs outperform the corresponding OpenMP codes on all test platforms, and can greatly facilitate writing of parallel programs, in particular non-data parallel algorithms such as reductions.

  7. Streaming for Functional Data-Parallel Languages

    DEFF Research Database (Denmark)

    Madsen, Frederik Meisner

    In this thesis, we investigate streaming as a general solution to the space inefficiency commonly found in functional data-parallel programming languages. The data-parallel paradigm maps well to parallel SIMD-style hardware. However, the traditional fully materializing execution strategy...... flattening necessitates all sub-computations to materialize at the same time. For example, naive n by n matrix multiplication requires n^3 space in NESL because the algorithm contains n^3 independent scalar multiplications. For large values of n, this is completely unacceptable. We address the problem...... by extending two existing data-parallel languages: NESL and Accelerate. In the extensions we map bulk operations to data-parallel streams that can evaluate fully sequential, fully parallel or anything in between. By a dataflow, piecewise parallel execution strategy, the runtime system can adjust to any target...

  8. Systematic test on fast time resolution parallel plate avalanche counter

    International Nuclear Information System (INIS)

    Chen Yu; Li Guangwu; Gu Xianbao; Chen Yanchao; Zhang Gang; Zhang Wenhui; Yan Guohong

    2011-01-01

    A systematic test of each detection unit of the parallel plate avalanche counter (PPAC) used in the fission multi-parameter measurement was performed with a 241 Am α source to obtain the time resolution and position resolution. The detectors work at 600 Pa of flowing isobutane with -600 V on the cathode. The time resolution was obtained by the TOF method and the position resolution by the delay-line method. The time resolution of the detection units is better than 400 ps, and the position resolution is 6 mm. The results show that the requirements of the measurement are fully met. (authors)

  9. Electron temperature in field reversed configurations and theta pinches with closed magnetic field lines

    International Nuclear Information System (INIS)

    Newton, A.A.

    1986-01-01

    Field-reversed configurations (FRC) and theta pinches with trapped reversed bias field are essentially the same magnetic confinement systems using closed magnetic field lines inside an open-ended magnetic flux tube. A simple model of joule heating and parallel electron thermal conduction along the open flux lines to an external heat sink gives the electron temperature as T_e(eV) ≈ 0.05 B^(2/3)(G) L^(1/3)(cm), where B is the magnetic field and L is the coil length. This model appears to agree with measurements from present FRC experiments and past theta-pinch experiments which cover a range of 40-900 eV. The energy balance in the model is dominated by (a) parallel electron thermal conduction along the open field lines which has a steep temperature dependence, Q ∝ T_e^(7/2), and (b) the assumed rapid perpendicular transport in the plasma bulk which, in experiments to date, may be due to the small number of ion gyroradii across the plasma. (author)
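
    The quoted scaling can be evaluated directly; a one-line check of the formula (units as stated in the abstract, gauss and centimetres; the example field and coil length are arbitrary):

        # Electron temperature estimate T_e(eV) ≈ 0.05 * B^(2/3) * L^(1/3)
        # with B in gauss and L (coil length) in cm, as quoted in the abstract.
        def t_e_ev(b_gauss, l_cm):
            return 0.05 * b_gauss ** (2.0 / 3.0) * l_cm ** (1.0 / 3.0)

        print(t_e_ev(1.0e4, 100.0))   # about 108 eV for a 10 kG field and a 1 m coil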

  10. Patterns for Parallel Software Design

    CERN Document Server

    Ortega-Arjona, Jorge Luis

    2010-01-01

    Essential reading to understand patterns for parallel programming Software patterns have revolutionized the way we think about how software is designed, built, and documented, and the design of parallel software requires you to consider other particular design aspects and special skills. From clusters to supercomputers, success heavily depends on the design skills of software developers. Patterns for Parallel Software Design presents a pattern-oriented software architecture approach to parallel software design. This approach is not a design method in the classic sense, but a new way of managin

  11. High performance parallel I/O

    CERN Document Server

    Prabhat

    2014-01-01

    Gain Critical Insight into the Parallel I/O Ecosystem. Parallel I/O is an integral component of modern high performance computing (HPC), especially in storing and processing very large datasets to facilitate scientific discovery. Revealing the state of the art in this field, High Performance Parallel I/O draws on insights from leading practitioners, researchers, software architects, developers, and scientists who shed light on the parallel I/O ecosystem. The first part of the book explains how large-scale HPC facilities scope, configure, and operate systems, with an emphasis on choices of I/O har

  12. Is Monte Carlo embarrassingly parallel?

    Energy Technology Data Exchange (ETDEWEB)

    Hoogenboom, J. E. [Delft Univ. of Technology, Mekelweg 15, 2629 JB Delft (Netherlands); Delft Nuclear Consultancy, IJsselzoom 2, 2902 LB Capelle aan den IJssel (Netherlands)

    2012-07-01

    Monte Carlo is often stated as being embarrassingly parallel. However, running a Monte Carlo calculation, especially a reactor criticality calculation, in parallel using tens of processors shows a serious limitation in speedup, and the execution time may even increase beyond a certain number of processors. In this paper the main causes of the loss of efficiency when using many processors are analyzed using a simple Monte Carlo program for criticality. The basic mechanism for parallel execution is MPI. One of the bottlenecks turns out to be the rendezvous points in the parallel calculation used for synchronization and exchange of data between processors. This happens at least at the end of each cycle for fission source generation in order to collect the full fission source distribution for the next cycle and to estimate the effective multiplication factor, which is not only part of the requested results, but also input to the next cycle for population control. Basic improvements to overcome this limitation are suggested and tested. Other time losses in the parallel calculation are also identified. Moreover, the threading mechanism, which allows the parallel execution of tasks based on shared memory using OpenMP, is analyzed in detail. Recommendations are given to get the maximum efficiency out of a parallel Monte Carlo calculation. (authors)

  13. Is Monte Carlo embarrassingly parallel?

    International Nuclear Information System (INIS)

    Hoogenboom, J. E.

    2012-01-01

    Monte Carlo is often stated as being embarrassingly parallel. However, running a Monte Carlo calculation, especially a reactor criticality calculation, in parallel using tens of processors shows a serious limitation in speedup, and the execution time may even increase beyond a certain number of processors. In this paper the main causes of the loss of efficiency when using many processors are analyzed using a simple Monte Carlo program for criticality. The basic mechanism for parallel execution is MPI. One of the bottlenecks turns out to be the rendezvous points in the parallel calculation used for synchronization and exchange of data between processors. This happens at least at the end of each cycle for fission source generation in order to collect the full fission source distribution for the next cycle and to estimate the effective multiplication factor, which is not only part of the requested results, but also input to the next cycle for population control. Basic improvements to overcome this limitation are suggested and tested. Other time losses in the parallel calculation are also identified. Moreover, the threading mechanism, which allows the parallel execution of tasks based on shared memory using OpenMP, is analyzed in detail. Recommendations are given to get the maximum efficiency out of a parallel Monte Carlo calculation. (authors)
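
    The per-cycle rendezvous the author describes can be illustrated with a toy criticality loop (serial Python standing in for the MPI ranks; the "physics" is a random stand-in, not the author's program):

        # Toy criticality loop: each cycle every rank tracks its share of the
        # source particles, then all ranks must synchronize (the rendezvous) to
        # assemble the next fission source and estimate k_eff before continuing.
        import random

        def track_batch(source, ranks=8):
            shares = [source[i::ranks] for i in range(ranks)]        # work split
            fission_sites, produced = [], 0
            for share in shares:                                     # "parallel" tracking
                for x in share:
                    n_fiss = random.choice((0, 1, 1, 2))             # toy physics
                    fission_sites += [x + random.uniform(-0.1, 0.1)] * n_fiss
                    produced += n_fiss
            return fission_sites, produced / max(len(source), 1)    # new source, k estimate

        source = [random.random() for _ in range(1000)]
        for cycle in range(5):
            source, k_eff = track_batch(source)                     # rendezvous happens here
            source = random.sample(source, min(1000, len(source)))  # population control
            print(cycle, round(k_eff, 3), len(source))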

  14. 20 CFR 654.13 - Determination of areas of substantial unemployment.

    Science.gov (United States)

    2010-04-01

    20 Employees' Benefits 3 (2010-04-01). Employment and Training Administration, Department of... § 654.13 Determination of areas of substantial unemployment. An area of substantial unemployment...

  15. Parallel algorithms for continuum dynamics

    International Nuclear Information System (INIS)

    Hicks, D.L.; Liebrock, L.M.

    1987-01-01

    Simply porting existing parallel programs to a new parallel processor may not achieve the full speedup possible; to achieve the maximum efficiency may require redesigning the parallel algorithms for the specific architecture. The authors discuss here parallel algorithms that were developed first for the HEP processor and then ported to the CRAY X-MP/4, the ELXSI/10, and the Intel iPSC/32. Focus is mainly on the most recent parallel processing results produced, i.e., those on the Intel Hypercube. The applications are simulations of continuum dynamics in which the momentum and stress gradients are important. Examples of these are inertial confinement fusion experiments, severe breaks in the coolant system of a reactor, weapons physics, shock-wave physics. Speedup efficiencies on the Intel iPSC Hypercube are very sensitive to the ratio of communication to computation. Great care must be taken in designing algorithms for this machine to avoid global communication. This is much more critical on the iPSC than it was on the three previous parallel processors

  16. Correlations in the quantum theory of plasma line broadening

    International Nuclear Information System (INIS)

    Dufty, J.W.; Boercker, D.B.

    1976-01-01

    A unified theory of plasma line broadening is obtained from a quantum kinetic equation, paralleling existing results for a classical plasma. The atom-electron interactions are shielded by equilibrium electron correlation functions and a frequency dependent dielectric function. A 'ring' approximation is used to replace the classical plasma parameter expansion, for typical laboratory conditions. Atom-electron correlations are included as well as electron-electron correlations. (author)

  17. Parallel I/O measurements on a SparcCenter and a MEIKO CS2

    International Nuclear Information System (INIS)

    Panzer-Steindel, B.

    1994-01-01

    CERN is participating in a European ESPRIT project (GPMIMD2) to investigate the benefits and future prospects of parallel processing for HEP computing. One of the focal points at CERN is the parallelization of the Monte-Carlo program from the NA48 experiment. This experiment aims to measure the CP-violation parameters in the neutral kaon system to a very high precision starting in 1996. The Monte-Carlo program (NMC) from the NA48 collaboration consists of 102 subroutines with 17500 lines of FORTRAN code. It simulates the decay of neutral kaons into charged/neutral pions in their detector setup. Their main detector is a liquid krypton calorimeter with about 13000 channels. As the shower simulation of photons and pions in this detector takes a long time per event (GEANT), the NMC uses a lookup table (shower library) of previously generated showers

  18. Kinetics of transformations nucleated on random parallel planes: analytical modelling and computer simulation

    International Nuclear Information System (INIS)

    Rios, Paulo R; Assis, Weslley L S; Ribeiro, Tatiana C S; Villa, Elena

    2012-01-01

    In a classical paper, Cahn derived expressions for the kinetics of transformations nucleated on random planes and lines. He used those as a model for nucleation on the boundaries, edges and vertices of a polycrystal consisting of equiaxed grains. In this paper it is demonstrated that Cahn's expression for random planes may be used in situations beyond the scope envisaged in Cahn's original paper. For instance, we derived an expression for the kinetics of transformations nucleated on random parallel planes that is identical to that formerly obtained by Cahn considering random planes. Computer simulation of transformations nucleated on random parallel planes is carried out. It is shown that there is excellent agreement between simulated results and analytical solutions. Such an agreement is to be expected if both the simulation and the analytical solution are correct. (paper)

  19. A theoretical concept for a thermal-hydraulic 3D parallel channel core model

    International Nuclear Information System (INIS)

    Hoeld, A.

    2004-01-01

    A detailed description of the theoretical concept for the 3D thermal-hydraulic single- and two-phase flow phenomena is presented. The theoretical concept is based on important lines of development, such as the separate treatment of the mass and energy balance equations from the momentum balance equation. Another line is the establishment of a procedure for calculating the mass flow distribution into the different parallel channels, based on the fact that the sum of the pressure-drop terms over a closed loop must remain zero despite asymmetric perturbations. The concept is realized in the experimental code HERO-X3D, concentrating in a first step on an artificial BWR or PWR core which may consist of a central channel, four quadrants, and a bypass channel. (authors)
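
    A toy illustration of the flow-distribution principle (equal pressure drop across parallel channels with the channel flows summing to the imposed inlet flow; simple quadratic channel losses and made-up loss coefficients, not the HERO-X3D model):

        # Distribute a total mass flow over parallel channels such that every
        # channel sees the same pressure drop (dp_i = k_i * m_i^2) and the
        # channel flows add up to the imposed total flow.
        def distribute_flow(total_flow, k_channels, iters=100):
            dp_lo, dp_hi = 0.0, max(k_channels) * total_flow ** 2
            for _ in range(iters):                       # bisection on the common dp
                dp = 0.5 * (dp_lo + dp_hi)
                flows = [(dp / k) ** 0.5 for k in k_channels]
                if sum(flows) > total_flow:
                    dp_hi = dp
                else:
                    dp_lo = dp
            return dp, flows

        dp, flows = distribute_flow(100.0, [1.0, 2.0, 4.0, 0.5])   # hypothetical loss coefficients
        print(round(dp, 2), [round(m, 1) for m in flows])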

  20. Impact of interference on the performance of selection based parallel multiuser scheduling

    KAUST Repository

    Nam, Sungsik

    2012-02-01

    In conventional multiuser parallel scheduling schemes, every scheduled user is interfering with every other scheduled user, which limits the capacity and performance of multiuser systems, and the level of interference becomes substantial as the number of scheduled users increases. Based on the above observations, we investigate the trade-off between the system throughput and the number of scheduled users through the exact analysis of the total average sum rate capacity and the average spectral efficiency. Our analytical results can help the system designer to carefully select the appropriate number of scheduled users to maximize the overall throughput while maintaining an acceptable quality of service under certain channel conditions. © 2012 IEEE.
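
    The trade-off can be sketched with a toy interference-limited sum-rate model (a rough stand-in, not the paper's exact analysis; the SINR expression, SNR and interference levels are assumptions):

        # Toy model: each of N scheduled users sees its SNR degraded by the
        # interference from the other N-1 users; the sum rate still grows with N
        # here, but with rapidly diminishing returns as interference builds up.
        import math

        def sum_rate(n_users, snr=10.0, inr=1.0):
            sinr = snr / (1.0 + inr * (n_users - 1))      # assumed per-user SINR model
            return n_users * math.log2(1.0 + sinr)

        for n in (1, 2, 4, 8, 16):
            print(n, round(sum_rate(n), 2))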

  1. Vectorization, parallelization and porting of nuclear codes. Vectorization and parallelization. Progress report fiscal 1999

    Energy Technology Data Exchange (ETDEWEB)

    Adachi, Masaaki; Ogasawara, Shinobu; Kume, Etsuo [Japan Atomic Energy Research Inst., Tokai, Ibaraki (Japan). Tokai Research Establishment; Ishizuki, Shigeru; Nemoto, Toshiyuki; Kawasaki, Nobuo; Kawai, Wataru [Fujitsu Ltd., Tokyo (Japan); Yatake, Yo-ichi [Hitachi Ltd., Tokyo (Japan)

    2001-02-01

    Several computer codes in the nuclear field have been vectorized, parallelized and transported on the FUJITSU VPP500 system, the AP3000 system, the SX-4 system and the Paragon system at Center for Promotion of Computational Science and Engineering in Japan Atomic Energy Research Institute. We dealt with 18 codes in fiscal 1999. These results are reported in 3 parts, i.e., the vectorization and the parallelization part on vector processors, the parallelization part on scalar processors and the porting part. In this report, we describe the vectorization and parallelization on vector processors. In this vectorization and parallelization on vector processors part, the vectorization of Relativistic Molecular Orbital Calculation code RSCAT, a microscopic transport code for high energy nuclear collisions code JAM, three-dimensional non-steady thermal-fluid analysis code STREAM, Relativistic Density Functional Theory code RDFT and High Speed Three-Dimensional Nodal Diffusion code MOSRA-Light on the VPP500 system and the SX-4 system are described. (author)

  2. Parallel R-matrix computation

    International Nuclear Information System (INIS)

    Heggarty, J.W.

    1999-06-01

    For almost thirty years, sequential R-matrix computation has been used by atomic physics research groups, from around the world, to model collision phenomena involving the scattering of electrons or positrons with atomic or molecular targets. As considerable progress has been made in the understanding of fundamental scattering processes, new data, obtained from more complex calculations, is of current interest to experimentalists. Performing such calculations, however, places considerable demands on the computational resources to be provided by the target machine, in terms of both processor speed and memory requirement. Indeed, in some instances the computational requirements are so great that the proposed R-matrix calculations are intractable, even when utilising contemporary classic supercomputers. Historically, increases in the computational requirements of R-matrix computation were accommodated by porting the problem codes to a more powerful classic supercomputer. Although this approach has been successful in the past, it is no longer considered to be a satisfactory solution due to the limitations of current (and future) Von Neumann machines. As a consequence, there has been considerable interest in the high performance multicomputers, that have emerged over the last decade which appear to offer the computational resources required by contemporary R-matrix research. Unfortunately, developing codes for these machines is not as simple a task as it was to develop codes for successive classic supercomputers. The difficulty arises from the considerable differences in the computing models that exist between the two types of machine and results in the programming of multicomputers to be widely acknowledged as a difficult, time consuming and error-prone task. Nevertheless, unless parallel R-matrix computation is realised, important theoretical and experimental atomic physics research will continue to be hindered. This thesis describes work that was undertaken in

  3. Implementation and performance of parallelized elegant

    International Nuclear Information System (INIS)

    Wang, Y.; Borland, M.

    2008-01-01

    The program elegant is widely used for design and modeling of linacs for free-electron lasers and energy recovery linacs, as well as storage rings and other applications. As part of a multi-year effort, we have parallelized many aspects of the code, including single-particle dynamics, wakefields, and coherent synchrotron radiation. We report on the approach used for gradual parallelization, which proved very beneficial in getting parallel features into the hands of users quickly. We also report details of parallelization of collective effects. Finally, we discuss performance of the parallelized code in various applications.

  4. Transmission line analysis of beam deflection in a BPM stripline kicker

    International Nuclear Information System (INIS)

    Caporaso, G.J.; Chen, Yu Ju; Poole, B.

    1997-05-01

    In the usual treatment of impedances of beamline structures the electromagnetic response is computed under the assumption that the source charge trajectory is parallel to the propagation axis and is unaffected by the wake of the structure. For high energy beams of relatively low current this is generally a valid assumption. Under certain conditions the assumption of a parallel source charge trajectory is no longer valid and the effects of the changing trajectory must be included in the analysis. Here the usual transmission line analysis that has been applied to BPM type transverse kickers is extended to include the self-consistent motion of the beam in the structure

  5. Parallelizing the spectral transform method: A comparison of alternative parallel algorithms

    International Nuclear Information System (INIS)

    Foster, I.; Worley, P.H.

    1993-01-01

    The spectral transform method is a standard numerical technique for solving partial differential equations on the sphere and is widely used in global climate modeling. In this paper, we outline different approaches to parallelizing the method and describe experiments that we are conducting to evaluate the efficiency of these approaches on parallel computers. The experiments are conducted using a testbed code that solves the nonlinear shallow water equations on a sphere, but are designed to permit evaluation in the context of a global model. They allow us to evaluate the relative merits of the approaches as a function of problem size and number of processors. The results of this study are guiding ongoing work on PCCM2, a parallel implementation of the Community Climate Model developed at the National Center for Atmospheric Research

  6. Algorithms for parallel computers

    International Nuclear Information System (INIS)

    Churchhouse, R.F.

    1985-01-01

    Until relatively recently almost all the algorithms for use on computers had been designed on the (usually unstated) assumption that they were to be run on single processor, serial machines. With the introduction of vector processors, array processors and interconnected systems of mainframes, minis and micros, however, various forms of parallelism have become available. The advantage of parallelism is that it offers increased overall processing speed but it also raises some fundamental questions, including: (i) which, if any, of the existing 'serial' algorithms can be adapted for use in the parallel mode. (ii) How close to optimal can such adapted algorithms be and, where relevant, what are the convergence criteria. (iii) How can we design new algorithms specifically for parallel systems. (iv) For multi-processor systems how can we handle the software aspects of the interprocessor communications. Aspects of these questions illustrated by examples are considered in these lectures. (orig.)

  7. Field evaluations of the VDmax approach for substantiation of a 25 kGy sterilization dose and its application to other preselected doses

    International Nuclear Information System (INIS)

    Kowalski, John B.; Herring, Craig; Baryschpolec, Lisa; Reger, John; Patel, Jay; Feeney, Mary; Tallentire, Alan

    2002-01-01

    The International and European standards for radiation sterilization require evidence of the effectiveness of a minimum sterilization dose of 25 kGy but do not provide detailed guidance on how this evidence can be generated. An approach, designated VD max , has recently been described and computer evaluated to provide safe and unambiguous substantiation of a 25 kGy sterilization dose. The approach has been further developed into a practical method, which has been subjected to field evaluations at three manufacturing facilities which produce different types of medical devices. The three facilities each used a different overall evaluation strategy: Facility A used VD max for quarterly dose audits; Facility B compared VD max and Method 1 in side-by-side parallel experiments; and Facility C, a new facility at start-up, used VD max for initial substantiation of 25 kGy and subsequent quarterly dose audits. A common element at all three facilities was the use of 10 product units for irradiation in the verification dose experiment. The field evaluations of the VD max method were successful at all three facilities; they included many different types of medical devices/product families with a wide range of average bioburden and sample item portion values used in the verification dose experiments. Overall, around 500 verification dose experiments were performed and no failures were observed. In the side-by-side parallel experiments, the outcomes of the VD max experiments were consistent with the outcomes observed with Method 1. The VD max approach has been extended to sterilization doses >25 and <25 kGy; use of the VD max method for doses other than 25 kGy must await controlled field evaluations and the development of appropriate specifications/standards.

  8. Parallel processing for fluid dynamics applications

    International Nuclear Information System (INIS)

    Johnson, G.M.

    1989-01-01

    The impact of parallel processing on computational science and, in particular, on computational fluid dynamics is growing rapidly. In this paper, particular emphasis is given to developments which have occurred within the past two years. Parallel processing is defined and the reasons for its importance in high-performance computing are reviewed. Parallel computer architectures are classified according to the number and power of their processing units, their memory, and the nature of their connection scheme. Architectures which show promise for fluid dynamics applications are emphasized. Fluid dynamics problems are examined for parallelism inherent at the physical level. CFD algorithms and their mappings onto parallel architectures are discussed. Several examples are presented to document the performance of fluid dynamics applications on present-generation parallel processing devices.

  9. Parallel Evolution of Copy-Number Variation across Continents in Drosophila melanogaster.

    Science.gov (United States)

    Schrider, Daniel R; Hahn, Matthew W; Begun, David J

    2016-05-01

    Genetic differentiation across populations that is maintained in the presence of gene flow is a hallmark of spatially varying selection. In Drosophila melanogaster, the latitudinal clines across the eastern coasts of Australia and North America appear to be examples of this type of selection, with recent studies showing that a substantial portion of the D. melanogaster genome exhibits allele frequency differentiation with respect to latitude on both continents. As of yet there has been no genome-wide examination of differentiated copy-number variants (CNVs) in these geographic regions, despite their potential importance for phenotypic variation in Drosophila and other taxa. Here, we present an analysis of geographic variation in CNVs in D. melanogaster. We also present the first genomic analysis of geographic variation for copy-number variation in the sister species, D. simulans, in order to investigate patterns of parallel evolution in these close relatives. In D. melanogaster we find hundreds of CNVs, many of which show parallel patterns of geographic variation on both continents, lending support to the idea that they are influenced by spatially varying selection. These findings support the idea that polymorphic CNVs contribute to local adaptation in D. melanogaster. In contrast, we find very few CNVs in D. simulans that are geographically differentiated in parallel on both continents, consistent with earlier work suggesting that clinal patterns are weaker in this species. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  10. Parallel discrete event simulation

    NARCIS (Netherlands)

    Overeinder, B.J.; Hertzberger, L.O.; Sloot, P.M.A.; Withagen, W.J.

    1991-01-01

    In simulating applications for execution on specific computing systems, the simulation performance figures must be known in a short period of time. One basic approach to the problem of reducing the required simulation time is the exploitation of parallelism. However, in parallelizing the simulation

  11. SESOTHO trial ("Switch Either near Suppression Or THOusand") - switch to second-line versus WHO-guided standard of care for unsuppressed patients on first-line ART with viremia below 1000 copies/mL: protocol of a multicenter, parallel-group, open-label, randomized clinical trial in Lesotho, Southern Africa.

    Science.gov (United States)

    Amstutz, Alain; Nsakala, Bienvenu Lengo; Vanobberghen, Fiona; Muhairwe, Josephine; Glass, Tracy Renée; Achieng, Beatrice; Sepeka, Mamorena; Tlali, Katleho; Sao, Lebohang; Thin, Kyaw; Klimkait, Thomas; Battegay, Manuel; Labhardt, Niklaus Daniel

    2018-02-12

    The World Health Organization (WHO) recommends viral load (VL) measurement as the preferred monitoring strategy for HIV-infected individuals on antiretroviral therapy (ART) in resource-limited settings. The new WHO guidelines 2016 continue to define virologic failure as two consecutive VL ≥1000 copies/mL (at least 3 months apart) despite good adherence, triggering switch to second-line therapy. However, the threshold of 1000 copies/mL for defining virologic failure is based on low-quality evidence. Observational studies have shown that individuals with low-level viremia (measurable but below 1000 copies/mL) are at increased risk for accumulation of resistance mutations and subsequent virologic failure. The SESOTHO trial assesses a lower threshold for switch to second-line ART in patients with sustained unsuppressed VL. In this multicenter, parallel-group, open-label, randomized controlled trial conducted in Lesotho, patients on first-line ART with two consecutive unsuppressed VL measurements ≥100 copies/mL, where the second VL is between 100 and 999 copies/mL, will either be switched to second-line ART immediately (intervention group) or not be switched (standard of care, according to WHO guidelines). The primary endpoint is viral resuppression (VL < 50 copies/mL) 9 months after randomization. We will enrol 80 patients, giving us 90% power to detect a difference of 35% in viral resuppression between the groups (assuming a two-sided 5% alpha error). For our primary analysis, we will use a modified intention-to-treat set, in which patients who are lost to care, die, or cross over are considered failures to resuppress, using logistic regression models adjusted for the prespecified stratification variables. The SESOTHO trial challenges the current WHO guidelines by assessing an alternative, lower VL threshold for patients with unsuppressed VL on first-line ART. This trial will provide data to inform future WHO guidelines on the VL thresholds at which a switch to second-line ART should be recommended.
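
    The record quotes 80 patients and 90% power for a 35 percentage-point difference at a two-sided 5% alpha, but does not state the assumed control-arm resuppression proportion. The sketch below shows how such a two-proportion power figure can be checked; the baseline proportion is a hypothetical value chosen only for illustration, and statsmodels is assumed to be available.

      # Hypothetical check of a two-group power calculation for a difference in proportions.
      # The control-arm proportion below is an assumption, not a trial parameter.
      from statsmodels.stats.power import NormalIndPower
      from statsmodels.stats.proportion import proportion_effectsize

      p_control = 0.40          # assumed resuppression under standard of care (illustrative)
      p_intervention = 0.75     # assumed 35-percentage-point improvement
      effect = proportion_effectsize(p_intervention, p_control)

      power = NormalIndPower().power(effect_size=effect, nobs1=40, alpha=0.05,
                                     ratio=1.0, alternative="two-sided")
      print(f"approximate power with 40 patients per arm: {power:.2f}")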

  12. Photoluminescence spectra of n-doped double quantum wells in a parallel magnetic field

    International Nuclear Information System (INIS)

    Huang, D.; Lyo, S.K.

    1999-01-01

    We show that the photoluminescence (PL) line shapes from tunnel-split ground sublevels of n-doped thin double quantum wells (DQWs) are sensitively modulated by an in-plane magnetic field B parallel at low temperatures (T). The modulation is caused by the B parallel -induced distortion of the electronic structure. The latter arises from the relative shift of the energy-dispersion parabolas of the two quantum wells (QWs) in k space, both in the conduction and valence bands, and formation of an anticrossing gap in the conduction band. Using a self-consistent density-functional theory, the PL spectra and the band-gap narrowing are calculated as a function of B parallel, T, and the homogeneous linewidths. The PL spectra from symmetric and asymmetric DQWs are found to show strikingly different behavior. In symmetric DQWs with a high density of electrons, two PL peaks are obtained at B parallel = 0, representing the interband transitions between the pair of the upper (i.e., antisymmetric) levels and that of the lower (i.e., symmetric) levels of the ground doublets. As B parallel increases, the upper PL peak develops an N-type kink, namely a maximum followed by a minimum, and merges with the lower peak, which rises monotonically as a function of B parallel due to the diamagnetic energy. When the electron density is low, however, only a single PL peak, arising from the transitions between the lower levels, is obtained. In asymmetric DQWs, the PL spectra show mainly one dominant peak at all B parallel values. In this case, the holes are localized in one of the QWs at low T and recombine only with the electrons in the same QW. At high electron densities, the upper PL peak shows an N-type kink as in symmetric DQWs. However, the lower peak is absent at low B parallel values because it arises from the inter-QW transitions. Reasonable agreement is obtained with recent

  13. Overview of the Force Scientific Parallel Language

    Directory of Open Access Journals (Sweden)

    Gita Alaghband

    1994-01-01

    Full Text Available The Force parallel programming language designed for large-scale shared-memory multiprocessors is presented. The language provides a number of parallel constructs as extensions to the ordinary Fortran language and is implemented as a two-level macro preprocessor to support portability across shared memory multiprocessors. The global parallelism model on which the Force is based provides a powerful parallel language. The parallel constructs, generic synchronization, and freedom from process management supported by the Force have resulted in structured parallel programs that are ported to the many multiprocessors on which the Force is implemented. Two new parallel constructs for looping and functional decomposition are discussed. Several programming examples to illustrate some parallel programming approaches using the Force are also presented.

  14. The Galley Parallel File System

    Science.gov (United States)

    Nieuwejaar, Nils; Kotz, David

    1996-01-01

    Most current multiprocessor file systems are designed to use multiple disks in parallel, using the high aggregate bandwidth to meet the growing I/O requirements of parallel scientific applications. Many multiprocessor file systems provide applications with a conventional Unix-like interface, allowing the application to access multiple disks transparently. This interface conceals the parallelism within the file system, increasing the ease of programmability, but making it difficult or impossible for sophisticated programmers and libraries to use knowledge about their I/O needs to exploit that parallelism. In addition to providing an insufficient interface, most current multiprocessor file systems are optimized for a different workload than they are being asked to support. We introduce Galley, a new parallel file system that is intended to efficiently support realistic scientific multiprocessor workloads. We discuss Galley's file structure and application interface, as well as the performance advantages offered by that interface.

  15. PDDP, A Data Parallel Programming Model

    Directory of Open Access Journals (Sweden)

    Karen H. Warren

    1996-01-01

    Full Text Available PDDP, the parallel data distribution preprocessor, is a data parallel programming model for distributed memory parallel computers. PDDP implements high-performance Fortran-compatible data distribution directives and parallelism expressed by the use of Fortran 90 array syntax, the FORALL statement, and the WHERE construct. Distributed data objects belong to a global name space; other data objects are treated as local and replicated on each processor. PDDP allows the user to program in a shared memory style and generates codes that are portable to a variety of parallel machines. For interprocessor communication, PDDP uses the fastest communication primitives on each platform.

  16. Design considerations for parallel graphics libraries

    Science.gov (United States)

    Crockett, Thomas W.

    1994-01-01

    Applications which run on parallel supercomputers are often characterized by massive datasets. Converting these vast collections of numbers to visual form has proven to be a powerful aid to comprehension. For a variety of reasons, it may be desirable to provide this visual feedback at runtime. One way to accomplish this is to exploit the available parallelism to perform graphics operations in place. In order to do this, we need appropriate parallel rendering algorithms and library interfaces. This paper provides a tutorial introduction to some of the issues which arise in designing parallel graphics libraries and their underlying rendering algorithms. The focus is on polygon rendering for distributed memory message-passing systems. We illustrate our discussion with examples from PGL, a parallel graphics library which has been developed on the Intel family of parallel systems.

  17. Antiproliferative effects of small fruit juices on several cancer cell lines.

    Science.gov (United States)

    Yoshizawa, Y; Kawaii, S; Urashima, M; Fukase, T; Sato, T; Tanaka, R; Murofushi, N; Nishimura, H

    2000-01-01

    Juices prepared from small fruits, mainly growing in the northern part of Japan, were studied in an attempt to explore the feasibility of an assay that screens cytotoxic properties. Screening of 43 small fruit juices indicated that Actinidia polygama Maxim., Rosa rugosa Thunb., Vaccinium smallii A. Gray and Sorbus sambucifolia Roem. strongly inhibited the proliferation of all cancer cell lines examined and yet these juices were substantially less cytotoxic toward normal human cell lines.

  18. Conjunction of anti-parallel and component reconnection at the dayside MP: Cluster and Double Star coordinated observation on 6 April 2004

    Science.gov (United States)

    Wang, J.; Pu, Z. Y.; Fu, S. Y.; Wang, X. G.; Xiao, C. J.; Dunlop, M. W.; Wei, Y.; Bogdanova, Y. V.; Zong, Q. G.; Xie, L.

    2011-05-01

    Previous theoretical and simulation studies have suggested that anti-parallel and component reconnection can occur simultaneously on the dayside magnetopause. Certain observations have also been reported to support a global conjunct pattern of magnetic reconnection (MR). Here, we show direct evidence for the conjunction of anti-parallel and component MR using coordinated observations by Double Star TC-1 and Cluster under the same IMF conditions on 6 April 2004. The constructed global MR X-line configuration is in good agreement with the “S-shape” model.

  19. Aspects of computation on asynchronous parallel processors

    International Nuclear Information System (INIS)

    Wright, M.

    1989-01-01

    The increasing availability of asynchronous parallel processors has provided opportunities for original and useful work in scientific computing. However, the field of parallel computing is still in a highly volatile state, and researchers display a wide range of opinion about many fundamental questions such as models of parallelism, approaches for detecting and analyzing parallelism of algorithms, and tools that allow software developers and users to make effective use of diverse forms of complex hardware. This volume collects the work of researchers specializing in different aspects of parallel computing, who met to discuss the framework and the mechanics of numerical computing. The far-reaching impact of high-performance asynchronous systems is reflected in the wide variety of topics, which include scientific applications (e.g. linear algebra, lattice gauge simulation, ordinary and partial differential equations), models of parallelism, parallel language features, task scheduling, automatic parallelization techniques, tools for algorithm development in parallel environments, and system design issues.

  20. A fine adjustment mechanism of the second crystal in a double-crystal monochromator with a 3-PS parallel manipulator

    International Nuclear Information System (INIS)

    Cao Chongzhen; Gao, X.; Ma, P.; Yu, H.; Wang, F.; Huang, Y.; Liu, P.

    2005-01-01

    A novel fine adjustment mechanism for the second crystal in a double-crystal monochromator is put forward, based on a 3-PS parallel manipulator and magnetic force. Not only is the principle of finely adjusting the pitch angle and the roll angle analyzed, but the structural parameters of the permanent magnet, a key part of the fine adjustment mechanism, are also optimized. The fine adjustment mechanism with the 3-PS parallel manipulator has been applied successfully in the double-crystal monochromator of the 4W1B beam line at the Beijing Synchrotron Radiation Facility (BSRF).

  1. Parallelization of the FLAPW method

    International Nuclear Information System (INIS)

    Canning, A.; Mannstadt, W.; Freeman, A.J.

    1999-01-01

    The FLAPW (full-potential linearized-augmented plane-wave) method is one of the most accurate first-principles methods for determining electronic and magnetic properties of crystals and surfaces. Until the present work, the FLAPW method has been limited to systems of less than about one hundred atoms due to a lack of an efficient parallel implementation to exploit the power and memory of parallel computers. In this work we present an efficient parallelization of the method by division among the processors of the plane-wave components for each state. The code is also optimized for RISC (reduced instruction set computer) architectures, such as those found on most parallel computers, making full use of BLAS (basic linear algebra subprograms) wherever possible. Scaling results are presented for systems of up to 686 silicon atoms and 343 palladium atoms per unit cell, running on up to 512 processors on a CRAY T3E parallel computer.
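
    The parallelization described here distributes the plane-wave components of each state across processors, so global quantities must be assembled with collective reductions. The sketch below is not the FLAPW code; it only illustrates that pattern for a single overlap, with made-up sizes, and assumes mpi4py and NumPy are available.

      # Sketch: plane-wave coefficients of two states are distributed across MPI ranks,
      # and their overlap is formed by a local partial sum plus a global reduction.
      import numpy as np
      from mpi4py import MPI

      comm = MPI.COMM_WORLD
      rank, size = comm.Get_rank(), comm.Get_size()

      n_pw_global = 100_000                                  # illustrative basis size
      counts = [n_pw_global // size + (1 if r < n_pw_global % size else 0)
                for r in range(size)]
      local_n = counts[rank]                                 # this rank's share of components

      c1 = np.random.rand(local_n) + 1j * np.random.rand(local_n)
      c2 = np.random.rand(local_n) + 1j * np.random.rand(local_n)

      local_overlap = complex(np.vdot(c1, c2))               # partial overlap on this rank
      overlap = comm.allreduce(local_overlap, op=MPI.SUM)    # global sum across all ranks
      if rank == 0:
          print("distributed overlap <c1|c2> =", overlap)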

  2. Parallelization of the FLAPW method

    Science.gov (United States)

    Canning, A.; Mannstadt, W.; Freeman, A. J.

    2000-08-01

    The FLAPW (full-potential linearized-augmented plane-wave) method is one of the most accurate first-principles methods for determining structural, electronic and magnetic properties of crystals and surfaces. Until the present work, the FLAPW method has been limited to systems of less than about a hundred atoms due to the lack of an efficient parallel implementation to exploit the power and memory of parallel computers. In this work, we present an efficient parallelization of the method by division among the processors of the plane-wave components for each state. The code is also optimized for RISC (reduced instruction set computer) architectures, such as those found on most parallel computers, making full use of BLAS (basic linear algebra subprograms) wherever possible. Scaling results are presented for systems of up to 686 silicon atoms and 343 palladium atoms per unit cell, running on up to 512 processors on a CRAY T3E parallel supercomputer.

  3. Drainage network extraction from a high-resolution DEM using parallel programming in the .NET Framework

    Science.gov (United States)

    Du, Chao; Ye, Aizhong; Gan, Yanjun; You, Jinjun; Duan, Qinyun; Ma, Feng; Hou, Jingwen

    2017-12-01

    High-resolution Digital Elevation Models (DEMs) can be used to extract the high-accuracy drainage networks that are a prerequisite for further hydrological analysis. A higher resolution means a larger number of grid cells, and as the number of cells increases, flow-direction determination requires substantial computer resources and computing time. Parallel computing is a feasible way to resolve this problem. In this paper, we propose a parallel programming method within the .NET Framework with a C# compiler in a Windows environment. The basin is divided into sub-basins, and the sub-basins are then processed concurrently on multiple threads to calculate flow directions. The method was applied to calculate the flow direction of the Yellow River basin from the 3 arc-second resolution SRTM DEM. Drainage networks were extracted and compared with the HydroSHEDS river network to assess their accuracy. The results demonstrate that this method can calculate flow directions from high-resolution DEMs efficiently and extract high-precision continuous drainage networks.
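
    The implementation described above is written in C# on the .NET Framework; the sketch below only illustrates the same divide-and-process idea in Python (an assumed stand-in): the DEM is split into bands carrying a one-cell halo, and D8 flow directions are computed for each band in parallel. The neighbour codes and the omission of distance weighting on diagonals are simplifications.

      # Illustrative parallel D8 flow-direction computation over DEM bands with halos.
      import numpy as np
      from concurrent.futures import ProcessPoolExecutor

      # Offsets and codes for the eight D8 neighbours (ESRI-style codes assumed).
      OFFSETS = [(-1, 0, 64), (-1, 1, 128), (0, 1, 1), (1, 1, 2),
                 (1, 0, 4), (1, -1, 8), (0, -1, 16), (-1, -1, 32)]

      def d8_band(padded):
          """Flow directions for the interior of a band padded by a one-cell halo."""
          rows, cols = padded.shape
          out = np.zeros((rows - 2, cols - 2), dtype=np.int32)
          for i in range(1, rows - 1):
              for j in range(1, cols - 1):
                  best_drop, best_code = 0.0, 0
                  for di, dj, code in OFFSETS:
                      drop = padded[i, j] - padded[i + di, j + dj]   # diagonal distance ignored
                      if drop > best_drop:
                          best_drop, best_code = drop, code
                  out[i - 1, j - 1] = best_code                      # 0 marks a pit or flat cell
          return out

      if __name__ == "__main__":
          dem = np.random.rand(400, 400)                             # stand-in for a DEM tile
          padded = np.pad(dem, 1, mode="edge")
          bands = [padded[r:r + 102, :] for r in range(0, 400, 100)] # 100 rows + halo each
          with ProcessPoolExecutor() as pool:
              directions = np.vstack(list(pool.map(d8_band, bands)))
          print(directions.shape)                                    # (400, 400)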

  4. A Parallel Restoration for Black Start of Microgrids Considering Characteristics of Distributed Generations

    Directory of Open Access Journals (Sweden)

    Jing Wang

    2017-12-01

    Full Text Available The black start capability is vital for microgrids and can potentially improve the reliability of the power grid. This paper proposes a black start strategy for microgrids based on a parallel restoration strategy. Considering the characteristics of distributed generations (DGs), an evaluation model used to assess the black start capability of DGs is established by adopting the variation coefficient method. The DGs with good black start capability, selected by a diversity sequence method, are thus restored first in parallel under the constraints of the DGs and the network. During the selection of recovery paths, line weights and node importance degrees are defined, taking into account the topological importance of nodes, the importance of loads, and the backbone network restoration time. In this way, overall optimization of the reconstructed network is achieved. Finally, the simulation results verify the feasibility and effectiveness of the strategy.

  5. Nested separatrices in simple shear flows: the effect of localized disturbances on stagnation lines

    OpenAIRE

    Wilson, M.C.T.; Gaskell, P.H.; Savage, M.D.

    2005-01-01

    The effects of localized two-dimensional disturbances on the structure of shear flows featuring a stagnation line are investigated. A simple superposition of a planar Couette flow and Moffatt's [J. Fluid Mech. 18, 1-18 (1964)] streamfunction for the decay of a disturbance between infinite stationary parallel plates shows that in general the stagnation line is replaced by a chain of alternating elliptic and hyperbolic stagnation points with a separation equal to 2.78 times the half-gap between...

  6. Parallelization of 2-D lattice Boltzmann codes

    International Nuclear Information System (INIS)

    Suzuki, Soichiro; Kaburaki, Hideo; Yokokawa, Mitsuo.

    1996-03-01

    Lattice Boltzmann (LB) codes to simulate two-dimensional fluid flow are developed on the vector-parallel computer Fujitsu VPP500 and the scalar-parallel computer Intel Paragon XP/S. While a 2-D domain decomposition method is used for the scalar-parallel LB code, a 1-D domain decomposition method is used for the vector-parallel LB code so that it can be vectorized along the axis perpendicular to the direction of the decomposition. High parallel efficiencies of 95.1% for the vector-parallel calculation on 16 processors with a 1152x1152 grid and 88.6% for the scalar-parallel calculation on 100 processors with an 800x800 grid are obtained. Performance models are developed to analyze the performance of the LB codes. Our performance models show that the execution speed of the vector-parallel code is about one hundred times faster than that of the scalar-parallel code with the same number of processors, up to 100 processors. We also analyze the scalability when the available memory size of each processor element is kept at its maximum. Our performance model predicts that the execution time of the vector-parallel code increases by about 3% on 500 processors. Although the 1-D domain decomposition method generally has a drawback in interprocessor communication, the vector-parallel LB code is still suitable for large-scale and/or high-resolution simulations. (author)
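
    In a 1-D decomposition of this kind each processor owns a strip of the grid plus ghost rows that must be refreshed from its neighbours before every streaming step. The sketch below shows only that halo exchange; it is not the VPP500/Paragon code, and the sizes, names, and use of mpi4py are illustrative assumptions (the process count is assumed to divide the grid evenly).

      # Halo (ghost-row) exchange for a 1-D domain decomposition of a 2-D grid.
      import numpy as np
      from mpi4py import MPI

      comm = MPI.COMM_WORLD
      rank, size = comm.Get_rank(), comm.Get_size()

      ny_local, nx = 1152 // size, 1152            # strip of the global grid on this rank
      f = np.zeros((ny_local + 2, nx))             # two extra rows of ghost cells
      f[1:-1, :] = rank                            # stand-in for distribution functions

      up = rank - 1 if rank > 0 else MPI.PROC_NULL
      down = rank + 1 if rank < size - 1 else MPI.PROC_NULL

      # Exchange boundary rows with both neighbours (PROC_NULL makes the ends no-ops).
      comm.Sendrecv(f[1, :], dest=up, recvbuf=f[0, :], source=up)
      comm.Sendrecv(f[-2, :], dest=down, recvbuf=f[-1, :], source=down)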

  7. Parallelization of 2-D lattice Boltzmann codes

    Energy Technology Data Exchange (ETDEWEB)

    Suzuki, Soichiro; Kaburaki, Hideo; Yokokawa, Mitsuo

    1996-03-01

    Lattice Boltzmann (LB) codes to simulate two-dimensional fluid flow are developed on the vector-parallel computer Fujitsu VPP500 and the scalar-parallel computer Intel Paragon XP/S. While a 2-D domain decomposition method is used for the scalar-parallel LB code, a 1-D domain decomposition method is used for the vector-parallel LB code so that it can be vectorized along the axis perpendicular to the direction of the decomposition. High parallel efficiencies of 95.1% for the vector-parallel calculation on 16 processors with a 1152x1152 grid and 88.6% for the scalar-parallel calculation on 100 processors with an 800x800 grid are obtained. Performance models are developed to analyze the performance of the LB codes. Our performance models show that the execution speed of the vector-parallel code is about one hundred times faster than that of the scalar-parallel code with the same number of processors, up to 100 processors. We also analyze the scalability when the available memory size of each processor element is kept at its maximum. Our performance model predicts that the execution time of the vector-parallel code increases by about 3% on 500 processors. Although the 1-D domain decomposition method generally has a drawback in interprocessor communication, the vector-parallel LB code is still suitable for large-scale and/or high-resolution simulations. (author).

  8. Explorations of the implementation of a parallel IDW interpolation algorithm in a Linux cluster-based parallel GIS

    Science.gov (United States)

    Huang, Fang; Liu, Dingsheng; Tan, Xicheng; Wang, Jian; Chen, Yunping; He, Binbin

    2011-04-01

    To design and implement an open-source parallel GIS (OP-GIS) based on a Linux cluster, the parallel inverse distance weighting (IDW) interpolation algorithm has been chosen as an example to explore the working model and the principle of algorithm parallel pattern (APP), one of the parallelization patterns for OP-GIS. Based on an analysis of the serial IDW interpolation algorithm of GRASS GIS, this paper has proposed and designed a specific parallel IDW interpolation algorithm, incorporating both single process, multiple data (SPMD) and master/slave (M/S) programming modes. The main steps of the parallel IDW interpolation algorithm are: (1) the master node packages the related information, and then broadcasts it to the slave nodes; (2) each node calculates its assigned data extent along one row using the serial algorithm; (3) the master node gathers the data from all nodes; and (4) iterations continue until all rows have been processed, after which the results are outputted. According to the experiments performed in the course of this work, the parallel IDW interpolation algorithm can attain an efficiency greater than 0.93 compared with similar algorithms, which indicates that the parallel algorithm can greatly reduce processing time and maximize speed and performance.
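
    The four steps listed above follow a classic master/slave pattern. The sketch below is not the GRASS/OP-GIS implementation; it is an illustrative Python/mpi4py rendering of the same pattern, with made-up data, a cyclic row assignment instead of strict row-by-row iterations, and a fixed inverse-distance power.

      # Master/slave IDW interpolation sketch: broadcast samples, interpolate
      # assigned rows on every rank, gather the rows back on the master.
      import numpy as np
      from mpi4py import MPI

      comm = MPI.COMM_WORLD
      rank, size = comm.Get_rank(), comm.Get_size()

      # Step 1: the master packages the sample points and broadcasts them.
      samples = None
      if rank == 0:
          xy = np.random.rand(500, 2) * 100.0                    # sample locations
          z = np.sin(xy[:, 0] / 10.0) + np.cos(xy[:, 1] / 10.0)  # sample values
          samples = (xy, z)
      xy, z = comm.bcast(samples, root=0)

      # Step 2: each rank interpolates its assigned rows of a 100x100 grid.
      nrows, ncols, power = 100, 100, 2.0
      local_rows = []
      for i in range(rank, nrows, size):                         # cyclic row assignment
          row = np.empty(ncols)
          for j in range(ncols):
              d2 = (xy[:, 0] - j) ** 2 + (xy[:, 1] - i) ** 2
              w = 1.0 / np.maximum(d2, 1e-12) ** (power / 2.0)
              row[j] = np.sum(w * z) / np.sum(w)
          local_rows.append((i, row))

      # Steps 3-4: the master gathers all rows and assembles the output grid.
      gathered = comm.gather(local_rows, root=0)
      if rank == 0:
          grid = np.empty((nrows, ncols))
          for part in gathered:
              for i, row in part:
                  grid[i, :] = row
          print("interpolated grid:", grid.shape)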

  9. Parallel Monte Carlo reactor neutronics

    International Nuclear Information System (INIS)

    Blomquist, R.N.; Brown, F.B.

    1994-01-01

    The issues affecting implementation of parallel algorithms for large-scale engineering Monte Carlo neutron transport simulations are discussed. For nuclear reactor calculations, these include load balancing, recoding effort, reproducibility, domain decomposition techniques, I/O minimization, and strategies for different parallel architectures. Two codes were parallelized and tested for performance. The architectures employed include SIMD, MIMD-distributed memory, and a workstation network with uneven interactive load. Speedups linear with the number of nodes were achieved.

  10. Parallel Implicit Algorithms for CFD

    Science.gov (United States)

    Keyes, David E.

    1998-01-01

    The main goal of this project was efficient distributed parallel and workstation cluster implementations of Newton-Krylov-Schwarz (NKS) solvers for implicit Computational Fluid Dynamics (CFD). "Newton" refers to a quadratically convergent nonlinear iteration using gradient information based on the true residual, "Krylov" to an inner linear iteration that accesses the Jacobian matrix only through highly parallelizable sparse matrix-vector products, and "Schwarz" to a domain decomposition form of preconditioning the inner Krylov iterations with primarily neighbor-only exchange of data between the processors. Prior experience has established that Newton-Krylov methods are competitive solvers in the CFD context and that Krylov-Schwarz methods port well to distributed memory computers. The combination of the techniques into Newton-Krylov-Schwarz was implemented on 2D and 3D unstructured Euler codes on the parallel testbeds that used to be at LaRC and on several other parallel computers operated by other agencies or made available by the vendors. Early implementations were made directly with the Message Passing Interface (MPI), using parallel solvers we adapted from legacy NASA codes and enhanced for full NKS functionality. Later implementations were made in the framework of the PETSc library from Argonne National Laboratory, which now includes pseudo-transient continuation Newton-Krylov-Schwarz solver capability (as a result of demands we made upon PETSc during our early porting experiences). A secondary project pursued with funding from this contract was parallel implicit solvers in acoustics, specifically in the Helmholtz formulation. A 2D acoustic inverse problem has been solved in parallel within the PETSc framework.
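
    The "Krylov" component above needs the Jacobian only through matrix-vector products, which is what makes it highly parallelizable. The sketch below shows a minimal Jacobian-free Newton-Krylov iteration on a toy 1-D nonlinear problem using SciPy; the Schwarz (domain decomposition) preconditioning and all parallelism are omitted, and the problem, names, and tolerances are illustrative assumptions.

      # Jacobian-free Newton-Krylov sketch: GMRES sees the Jacobian only through
      # finite-difference matrix-vector products; no Jacobian matrix is ever formed.
      import numpy as np
      from scipy.sparse.linalg import LinearOperator, gmres

      def residual(u):
          # Toy nonlinear system: discrete 1-D Laplacian with a cubic sink and a source.
          r = np.empty_like(u)
          r[0], r[-1] = u[0], u[-1]                      # Dirichlet boundary rows
          r[1:-1] = u[:-2] - 2.0 * u[1:-1] + u[2:] - u[1:-1] ** 3 + 1.0
          return r

      def jac_vec(u, v, eps=1e-7):
          # Finite-difference approximation of J(u) @ v.
          return (residual(u + eps * v) - residual(u)) / eps

      u = np.zeros(50)
      for newton_it in range(20):
          r = residual(u)
          if np.linalg.norm(r) < 1e-8:
              break
          J = LinearOperator((u.size, u.size), matvec=lambda v: jac_vec(u, v))
          du, info = gmres(J, -r)                        # inner Krylov iteration
          u = u + du
      print("Newton iterations:", newton_it, "residual norm:", np.linalg.norm(residual(u)))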

  11. Parallel kinematics type, kinematics, and optimal design

    CERN Document Server

    Liu, Xin-Jun

    2014-01-01

    Parallel Kinematics - Type, Kinematics, and Optimal Design presents the results of 15 years' research on parallel mechanisms and parallel kinematics machines. This book covers the systematic classification of parallel mechanisms (PMs) as well as providing a large number of mechanical architectures of PMs available for use in practical applications. It focuses on the kinematic design of parallel robots. One successful application of parallel mechanisms in the field of machine tools, also called parallel kinematics machines, has become an emerging trend in advanced machine tools. The book describes not only the main aspects and important topics in parallel kinematics, but also novel concepts and approaches, e.g. type synthesis based on evolution, performance evaluation and optimization based on screw theory, a singularity model taking into account motion and force transmissibility, and others. This book is intended for researchers, scientists, engineers and postgraduates or above with interes...

  12. Experiments with parallel algorithms for combinatorial problems

    NARCIS (Netherlands)

    G.A.P. Kindervater (Gerard); H.W.J.M. Trienekens

    1985-01-01

    textabstractIn the last decade many models for parallel computation have been proposed and many parallel algorithms have been developed. However, few of these models have been realized and most of these algorithms are supposed to run on idealized, unrealistic parallel machines. The parallel machines

  13. A non-local thermodynamic equilibrium, line-blanketed synthetic spectrum of Iota Herculis - C, Al, and Si lines

    Science.gov (United States)

    Grigsby, James A.

    1991-01-01

    A non-LTE line-blanketed model stellar atmosphere is used to compute a model of I Herculis (B3 IV) with a Teff of 17,500 K and a log g of 3.75, following the conclusions of Peters and Polidan (1985). Detailed profiles of a number of lines of C, Al, and Si in the 1200-2000-A region are computed, including the resonance lines of C II, Al II, and Al III. These profiles are compared to observations obtained from the coaddition of eight IUE SWP images, using a technique developed by Leckrone and Adelman (1989). Comparison of carbon lines with a model that is underabundant in carbon by a factor of 2 relative to the sun indicates that the C abundance of Iota Her is at most one-half solar. Non-LTE effects are examined by comparing an LTE model possessing identical atmospheric parameters with the non-LTE model. Substantial differences in the populations of the model atomic states are found, but differences in the temperature structure of the two models often mask the non-LTE effects in the synthetic spectra.

  14. Parallel reservoir simulator computations

    International Nuclear Information System (INIS)

    Hemanth-Kumar, K.; Young, L.C.

    1995-01-01

    The adaptation of a reservoir simulator for parallel computations is described. The simulator was originally designed for vector processors. It performs approximately 99% of its calculations in vector/parallel mode and, relative to scalar calculations, it achieves speedups of 65 and 81 for black oil and EOS simulations, respectively, on the CRAY C-90.

  15. The STAPL Parallel Graph Library

    KAUST Repository

    Harshvardhan,

    2013-01-01

    This paper describes the stapl Parallel Graph Library, a high-level framework that abstracts the user from data-distribution and parallelism details and allows them to concentrate on parallel graph algorithm development. It includes a customizable distributed graph container and a collection of commonly used parallel graph algorithms. The library introduces pGraph pViews that separate algorithm design from the container implementation. It supports three graph processing algorithmic paradigms, level-synchronous, asynchronous and coarse-grained, and provides common graph algorithms based on them. Experimental results demonstrate improved scalability in performance and data size over existing graph libraries on more than 16,000 cores and on internet-scale graphs containing over 16 billion vertices and 250 billion edges. © Springer-Verlag Berlin Heidelberg 2013.
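
    Of the three paradigms mentioned, the level-synchronous one traverses a graph frontier by frontier with a global synchronization between levels. The sketch below is not stapl code; it is a plain-Python outline of that structure on a toy graph, with the loop that a parallel implementation would split across processors marked in a comment.

      # Level-synchronous BFS outline: expand one frontier at a time.
      from collections import defaultdict

      def level_synchronous_bfs(adjacency, source):
          level = {source: 0}
          frontier = [source]
          depth = 0
          while frontier:
              depth += 1
              next_frontier = []
              # In a parallel implementation this loop is partitioned across
              # processors, followed by a synchronization before the next level.
              for u in frontier:
                  for v in adjacency[u]:
                      if v not in level:
                          level[v] = depth
                          next_frontier.append(v)
              frontier = next_frontier
          return level

      graph = defaultdict(list, {0: [1, 2], 1: [3], 2: [3, 4], 3: [5], 4: [5], 5: []})
      print(level_synchronous_bfs(graph, 0))   # {0: 0, 1: 1, 2: 1, 3: 2, 4: 2, 5: 3}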

  16. The parallel volume at large distances

    DEFF Research Database (Denmark)

    Kampf, Jürgen

    In this paper we examine the asymptotic behavior of the parallel volume of planar non-convex bodies as the distance tends to infinity. We show that the difference between the parallel volume of the convex hull of a body and the parallel volume of the body itself tends to 0. This yields a new proof for the fact that a planar body can only have polynomial parallel volume if it is convex. Extensions to Minkowski spaces and random sets are also discussed.

  17. The parallel volume at large distances

    DEFF Research Database (Denmark)

    Kampf, Jürgen

    In this paper we examine the asymptotic behavior of the parallel volume of planar non-convex bodies as the distance tends to infinity. We show that the difference between the parallel volume of the convex hull of a body and the parallel volume of the body itself tends to 0. This yields a new proof for the fact that a planar body can only have polynomial parallel volume if it is convex. Extensions to Minkowski spaces and random sets are also discussed.

  18. Expressing Parallelism with ROOT

    Energy Technology Data Exchange (ETDEWEB)

    Piparo, D. [CERN; Tejedor, E. [CERN; Guiraud, E. [CERN; Ganis, G. [CERN; Mato, P. [CERN; Moneta, L. [CERN; Valls Pla, X. [CERN; Canal, P. [Fermilab

    2017-11-22

    The need for processing the ever-increasing amount of data generated by the LHC experiments in a more efficient way has motivated ROOT to further develop its support for parallelism. Such support is being tackled both for shared-memory and distributed-memory environments. The incarnations of the aforementioned parallelism are multi-threading, multi-processing and cluster-wide executions. In the area of multi-threading, we discuss the new implicit parallelism and related interfaces, as well as the new building blocks to safely operate with ROOT objects in a multi-threaded environment. Regarding multi-processing, we review the new MultiProc framework, comparing it with similar tools (e.g. multiprocessing module in Python). Finally, as an alternative to PROOF for cluster-wide executions, we introduce the efforts on integrating ROOT with state-of-the-art distributed data processing technologies like Spark, both in terms of programming model and runtime design (with EOS as one of the main components). For all the levels of parallelism, we discuss, based on real-life examples and measurements, how our proposals can increase the productivity of scientists.

  19. Expressing Parallelism with ROOT

    Science.gov (United States)

    Piparo, D.; Tejedor, E.; Guiraud, E.; Ganis, G.; Mato, P.; Moneta, L.; Valls Pla, X.; Canal, P.

    2017-10-01

    The need for processing the ever-increasing amount of data generated by the LHC experiments in a more efficient way has motivated ROOT to further develop its support for parallelism. Such support is being tackled both for shared-memory and distributed-memory environments. The incarnations of the aforementioned parallelism are multi-threading, multi-processing and cluster-wide executions. In the area of multi-threading, we discuss the new implicit parallelism and related interfaces, as well as the new building blocks to safely operate with ROOT objects in a multi-threaded environment. Regarding multi-processing, we review the new MultiProc framework, comparing it with similar tools (e.g. multiprocessing module in Python). Finally, as an alternative to PROOF for cluster-wide executions, we introduce the efforts on integrating ROOT with state-of-the-art distributed data processing technologies like Spark, both in terms of programming model and runtime design (with EOS as one of the main components). For all the levels of parallelism, we discuss, based on real-life examples and measurements, how our proposals can increase the productivity of scientists.

  20. Parallel hierarchical radiosity rendering

    Energy Technology Data Exchange (ETDEWEB)

    Carter, Michael [Iowa State Univ., Ames, IA (United States)

    1993-07-01

    In this dissertation, the step-by-step development of a scalable parallel hierarchical radiosity renderer is documented. First, a new look is taken at the traditional radiosity equation, and a new form is presented in which the matrix of linear system coefficients is transformed into a symmetric matrix, thereby simplifying the problem and enabling a new solution technique to be applied. Next, the state-of-the-art hierarchical radiosity methods are examined for their suitability to parallel implementation, and scalability. Significant enhancements are also discovered which both improve their theoretical foundations and improve the images they generate. The resultant hierarchical radiosity algorithm is then examined for sources of parallelism, and for an architectural mapping. Several architectural mappings are discussed. A few key algorithmic changes are suggested during the process of making the algorithm parallel. Next, the performance, efficiency, and scalability of the algorithm are analyzed. The dissertation closes with a discussion of several ideas which have the potential to further enhance the hierarchical radiosity method, or provide an entirely new forum for the application of hierarchical methods.

  1. Increased greenhouse effect substantiated through measurements

    International Nuclear Information System (INIS)

    Skartveit, Arvid

    2001-01-01

    The article presents studies on the greenhouse effect which substantiate the results from satellite measurements during the period 1970 - 1997. These show an increased effect due to the increase in the concentration of the greenhouse gases CO2, methane, CFC-11 and CFC-12 in the atmosphere.

  2. Shared Variable Oriented Parallel Precompiler for SPMD Model

    Institute of Scientific and Technical Information of China (English)

    1995-01-01

    At present, commercial parallel computer systems with distributed memory architecture are usually provided with parallel FORTRAN or parallel C compilers, which are just traditional sequential FORTRAN or C compilers expanded with communication statements. Programmers suffer from having to write parallel programs with explicit communication statements. The Shared Variable Oriented Parallel Precompiler (SVOPP) proposed in this paper can automatically generate appropriate communication statements based on shared variables for the SPMD (Single Program Multiple Data) computation model, greatly easing parallel programming while achieving high communication efficiency. The core function of the parallel C precompiler has been successfully verified on a transputer-based parallel computer. Its prominent performance shows that SVOPP is probably a breakthrough in parallel programming techniques.

  3. GRADSPMHD: A parallel MHD code based on the SPH formalism

    Science.gov (United States)

    Vanaverbeke, S.; Keppens, R.; Poedts, S.

    2014-03-01

    We present GRADSPMHD, a completely Lagrangian parallel magnetohydrodynamics code based on the SPH formalism. The implementation of the equations of SPMHD in the “GRAD-h” formalism assembles known results, including the derivation of the discretized MHD equations from a variational principle, the inclusion of time-dependent artificial viscosity, resistivity and conductivity terms, as well as the inclusion of a mixed hyperbolic/parabolic correction scheme for satisfying the ∇·B = 0 constraint on the magnetic field. The code uses a tree-based formalism for neighbor finding and can optionally use the tree code for computing the self-gravity of the plasma. The structure of the code closely follows the framework of our parallel GRADSPH FORTRAN 90 code which we added previously to the CPC program library. We demonstrate the capabilities of GRADSPMHD by running 1, 2, and 3 dimensional standard benchmark tests and we find good agreement with previous work done by other researchers. The code is also applied to the problem of simulating the magnetorotational instability in 2.5D shearing box tests as well as in global simulations of magnetized accretion disks. We find good agreement with available results on this subject in the literature. Finally, we discuss the performance of the code on a parallel supercomputer with distributed memory architecture. Catalogue identifier: AERP_v1_0 Program summary URL:http://cpc.cs.qub.ac.uk/summaries/AERP_v1_0.html Program obtainable from: CPC Program Library, Queen’s University, Belfast, N. Ireland Licensing provisions: Standard CPC licence, http://cpc.cs.qub.ac.uk/licence/licence.html No. of lines in distributed program, including test data, etc.: 620503 No. of bytes in distributed program, including test data, etc.: 19837671 Distribution format: tar.gz Programming language: FORTRAN 90/MPI. Computer: HPC cluster. Operating system: Unix. Has the code been vectorized or parallelized?: Yes, parallelized using MPI. RAM: ˜30 MB for a

  4. Evaluating parallel optimization on transputers

    Directory of Open Access Journals (Sweden)

    A.G. Chalmers

    2003-12-01

    Full Text Available The greater processing power of modern computers and the development of efficient algorithms have made it possible for operations researchers to tackle a much wider range of problems than ever before. Further improvements in processing speed can be achieved by utilising relatively inexpensive transputers to process components of an algorithm in parallel. The Davidon-Fletcher-Powell method is one of the most successful and widely used optimisation algorithms for unconstrained problems. This paper examines the algorithm and identifies the components that can be processed in parallel. The results of some experiments with these components are presented, which indicate under what conditions parallel processing with an inexpensive configuration is likely to be faster than the traditional sequential implementations. The performance of the whole algorithm with its parallel components is then compared with that of the original sequential algorithm. The implementation serves to illustrate the practicalities of speeding up typical OR algorithms in terms of difficulty, effort and cost. The results give an indication of the savings in time a given parallel implementation can be expected to yield.

  5. Programming massively parallel processors a hands-on approach

    CERN Document Server

    Kirk, David B

    2010-01-01

    Programming Massively Parallel Processors discusses basic concepts about parallel programming and GPU architecture. "Massively parallel" refers to the use of a large number of processors to perform a set of computations in a coordinated parallel way. The book details various techniques for constructing parallel programs. It also discusses the development process, performance level, floating-point format, parallel patterns, and dynamic parallelism. The book serves as a teaching guide where parallel programming is the main topic of the course. It builds on the basics of C programming for CUDA, a parallel programming environment that is supported on NVIDIA GPUs. Composed of 12 chapters, the book begins with basic information about the GPU as a parallel computer source. It also explains the main concepts of CUDA, data parallelism, and the importance of memory access efficiency using CUDA. The target audience of the book is graduate and undergraduate students from all science and engineering disciplines who ...

  6. Spatial updating grand canonical Monte Carlo algorithms for fluid simulation: generalization to continuous potentials and parallel implementation.

    Science.gov (United States)

    O'Keeffe, C J; Ren, Ruichao; Orkoulas, G

    2007-11-21

    Spatial updating grand canonical Monte Carlo algorithms are generalizations of random and sequential updating algorithms for lattice systems to continuum fluid models. The elementary steps, insertions or removals, are constructed by generating points in space either at random (random updating) or in a prescribed order (sequential updating). These algorithms have previously been developed only for systems of impenetrable spheres for which no particle overlap occurs. In this work, spatial updating grand canonical algorithms are generalized to continuous, soft-core potentials to account for overlapping configurations. Results on two- and three-dimensional Lennard-Jones fluids indicate that spatial updating grand canonical algorithms, both random and sequential, converge faster than standard grand canonical algorithms. Spatial algorithms based on sequential updating not only exhibit the fastest convergence but also are ideal for parallel implementation due to the absence of strict detailed balance and the nature of the updating that minimizes interprocessor communication. Parallel simulation results for three-dimensional Lennard-Jones fluids show a substantial reduction of simulation time for systems of moderate and large size. The efficiency improvement by parallel processing through domain decomposition is always in addition to the efficiency improvement by sequential updating.
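
    The central distinction above is how trial positions for insertion or removal attempts are generated: anywhere in the box at random, or in a prescribed sweep over regions of space. The sketch below contrasts only those two point-generation strategies; the grand canonical acceptance rule is reduced to a stub, and the box size, cell count, and other names are illustrative assumptions.

      # Random versus sequential (sweep-ordered) generation of trial positions.
      import numpy as np

      rng = np.random.default_rng(0)
      box, n_cells = 10.0, 5            # box edge length and cells per dimension (toy values)

      def attempt_move(position):
          # Stub for a grand canonical insertion/removal attempt at `position`
          # (energy change, chemical potential, and acceptance rule omitted).
          return rng.random() < 0.5

      def random_updating(n_attempts):
          accepted = 0
          for _ in range(n_attempts):
              pos = rng.random(3) * box                      # a point anywhere in the box
              accepted += attempt_move(pos)
          return accepted

      def sequential_updating(n_sweeps):
          accepted = 0
          cell = box / n_cells
          for _ in range(n_sweeps):
              for ix in range(n_cells):                      # prescribed order over cells
                  for iy in range(n_cells):
                      for iz in range(n_cells):
                          pos = (np.array([ix, iy, iz]) + rng.random(3)) * cell
                          accepted += attempt_move(pos)
          return accepted

      print(random_updating(1000), sequential_updating(8))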

  7. Hydrogen line formation in the quescent prominences

    International Nuclear Information System (INIS)

    Tsovookhuu, Ch.

    1980-01-01

    Equations of transfer and statistical equilibrium for a hydrogen atom with eight bound levels and continuum are solved simultaneously. A plane-parallel layer located perpendicular to the solar surface is taken as the geometrical model. The input parameters of the physical model are the optical thickness at the center of the Hα line, the electron temperature and concentration at the layer center, and the temperature and density gradients. Source functions, line profiles, total energies and Balmer decrements are calculated and compared with observations and with theoretical calculations made by other authors. The comparison shows that the results are quite acceptable and can be used when analyzing the spectrum and determining the physical parameters of solar prominences. The dependence of different characteristics of the line (equivalent width, central intensity, halfwidth, depth of the central depression, etc.) on the values of the initial model parameters is investigated. The line halfwidth is most sensitive to the temperature at the layer center, the central intensity to the temperature gradient, and the depth of the central depression to the electron concentration. The contributions of the primary sources responsible for the different excitation mechanisms are calculated as a function of the total optical thickness, as well as the mean probabilities of quantum escape from the medium, which can be used in parametric treatments of radiation diffusion in solar prominences. [ru]

  8. Russians and Ukrainians plan new gas line

    International Nuclear Information System (INIS)

    TREND

    2003-01-01

    This paper deals with the building of a new gas line between Russia and Ukraine. In September 2003 Gazprom and Naftohaz Ukraine signed a protocol on the conditions of Russian gas transit through Ukrainian territory for 2004. The Russian side guarantees a transported gas quantity of 127.8 billion cubic meters, of which 110 billion should continue on to Europe. The protocol supplements the long-term contract between Naftohaz and Gazprom on gas transit through Ukraine for the period 2003-2013. Both countries and their companies plan to take a concrete investment decision by the end of this year on building the new Novopskov-Uzhgorod gas line, which will run parallel to the Soyuz gas line. It would enlarge the capacity of the transportation system by 28 billion cubic meters. Construction of the roughly 1600 km long pipeline would require USD 1.6 billion for an annual capacity of about 15 billion cubic meters. If the capacity were doubled by building additional compressor stations, capital expenses would rise to USD 2.2 billion. It could be completely built in 7 years; its linear part with one compressor station could be built in 3 years. Slovensky Plynarensky Podnik (SPP) would certainly benefit from the possible realisation of the new gas line and from the consequent increase in transit. (Author)

  9. Exploiting Symmetry on Parallel Architectures.

    Science.gov (United States)

    Stiller, Lewis Benjamin

    1995-01-01

    This thesis describes techniques for the design of parallel programs that solve well-structured problems with inherent symmetry. Part I demonstrates the reduction of such problems to generalized matrix multiplication by a group-equivariant matrix. Fast techniques for this multiplication are described, including factorization, orbit decomposition, and Fourier transforms over finite groups. Our algorithms entail interaction between two symmetry groups: one arising at the software level from the problem's symmetry and the other arising at the hardware level from the processors' communication network. Part II illustrates the applicability of our symmetry-exploitation techniques by presenting a series of case studies of the design and implementation of parallel programs. First, a parallel program that solves chess endgames by factorization of an associated dihedral group-equivariant matrix is described. This code runs faster than previous serial programs, and it discovered a number of results. Second, parallel algorithms for Fourier transforms for finite groups are developed, and preliminary parallel implementations for group transforms of dihedral and of symmetric groups are described. Applications in learning, vision, pattern recognition, and statistics are proposed. Third, parallel implementations solving several computational science problems are described, including the direct n-body problem, convolutions arising from molecular biology, and some communication primitives such as broadcast and reduce. Some of our implementations ran orders of magnitude faster than previous techniques, and were used in the investigation of various physical phenomena.

  10. Advanced parallel processing with supercomputer architectures

    International Nuclear Information System (INIS)

    Hwang, K.

    1987-01-01

    This paper investigates advanced parallel processing techniques and innovative hardware/software architectures that can be applied to boost the performance of supercomputers. Critical issues on architectural choices, parallel languages, compiling techniques, resource management, concurrency control, programming environment, parallel algorithms, and performance enhancement methods are examined and the best answers are presented. The authors cover advanced processing techniques suitable for supercomputers, high-end mainframes, minisupers, and array processors. The coverage emphasizes vectorization, multitasking, multiprocessing, and distributed computing. In order to achieve these operation modes, parallel languages, smart compilers, synchronization mechanisms, load balancing methods, mapping of parallel algorithms, operating system functions, application libraries, and multidiscipline interactions are investigated to ensure high performance. At the end, they assess the potential of optical and neural technologies for developing future supercomputers.

  11. 19 CFR 134.35 - Articles substantially changed by manufacture.

    Science.gov (United States)

    2010-04-01

    § 134.35 Articles substantially changed by manufacture. (a) Articles other than goods of a NAFTA country. An article used in the United States in manufacture which results in an article having a name, character, or use differing from...

  12. Endpoint-based parallel data processing with non-blocking collective instructions in a parallel active messaging interface of a parallel computer

    Science.gov (United States)

    Archer, Charles J; Blocksome, Michael A; Cernohous, Bob R; Ratterman, Joseph D; Smith, Brian E

    2014-11-11

    Endpoint-based parallel data processing with non-blocking collective instructions in a PAMI of a parallel computer is disclosed. The PAMI is composed of data communications endpoints, each including a specification of data communications parameters for a thread of execution on a compute node, including specifications of a client, a context, and a task. The compute nodes are coupled for data communications through the PAMI. The parallel application establishes a data communications geometry specifying a set of endpoints that are used in collective operations of the PAMI by associating with the geometry a list of collective algorithms valid for use with the endpoints of the geometry; registering in each endpoint in the geometry a dispatch callback function for a collective operation; and executing without blocking, through a single one of the endpoints in the geometry, an instruction for the collective operation.

  13. SOFTWARE FOR DESIGNING PARALLEL APPLICATIONS

    Directory of Open Access Journals (Sweden)

    M. K. Bouza

    2017-01-01

    Full Text Available The objects of research are tools to support the development of parallel programs in C/C++. Methods and software which automate the process of designing parallel applications are proposed.

  14. An Introduction to Parallel Computation R

    Indian Academy of Sciences (India)

    How are they programmed? This article provides an introduction. A parallel computer is a network of processors built for ... and have been used to solve problems much faster than a single ... in parallel computer design is to select an organization which ..... The most ambitious approach to parallel computing is to develop.

  15. Building a parallel file system simulator

    International Nuclear Information System (INIS)

    Molina-Estolano, E; Maltzahn, C; Brandt, S A; Bent, J

    2009-01-01

    Parallel file systems are gaining in popularity in high-end computing centers as well as commercial data centers. High-end computing systems are expected to scale exponentially and to pose new challenges to their storage scalability in terms of cost and power. To address these challenges scientists and file system designers will need a thorough understanding of the design space of parallel file systems. Yet there exist few systematic studies of parallel file system behavior at petabyte and exabyte scale. An important reason is the significant cost of getting access to large-scale hardware to test parallel file systems. To contribute to this understanding we are building a parallel file system simulator that can simulate parallel file systems at very large scale. Our goal is to simulate petabyte-scale parallel file systems on a small cluster or even a single machine in reasonable time and with reasonable fidelity. With this simulator, file system experts will be able to tune existing file systems for specific workloads, scientists and file system deployment engineers will be able to better communicate workload requirements, file system designers and researchers will be able to try out design alternatives and innovations at scale, and instructors will be able to study very large-scale parallel file system behavior in the classroom. In this paper we describe our approach and provide preliminary results that are encouraging both in terms of fidelity and simulation scalability.

  16. Professional Parallel Programming with C# Master Parallel Extensions with NET 4

    CERN Document Server

    Hillar, Gastón

    2010-01-01

    Expert guidance for those programming today's dual-core processor PCs As PC processors explode from one or two to now eight processors, there is an urgent need for programmers to master concurrent programming. This book dives deep into the latest technologies available to programmers for creating professional parallel applications using C#, .NET 4, and Visual Studio 2010. The book covers task-based programming, coordination data structures, PLINQ, thread pools, asynchronous programming model, and more. It also teaches other parallel programming techniques, such as SIMD and vectorization. Teach

  17. The Potsdam Parallel Ice Sheet Model (PISM-PIK) - Part 1: Model description

    Science.gov (United States)

    Winkelmann, R.; Martin, M. A.; Haseloff, M.; Albrecht, T.; Bueler, E.; Khroulev, C.; Levermann, A.

    2011-09-01

    We present the Potsdam Parallel Ice Sheet Model (PISM-PIK), developed at the Potsdam Institute for Climate Impact Research to be used for simulations of large-scale ice sheet-shelf systems. It is derived from the Parallel Ice Sheet Model (Bueler and Brown, 2009). Velocities are calculated by superposition of two shallow stress balance approximations within the entire ice covered region: the shallow ice approximation (SIA) is dominant in grounded regions and accounts for shear deformation parallel to the geoid. The plug-flow type shallow shelf approximation (SSA) dominates the velocity field in ice shelf regions and serves as a basal sliding velocity in grounded regions. Ice streams can be identified diagnostically as regions with a significant contribution of membrane stresses to the local momentum balance. All lateral boundaries in PISM-PIK are free to evolve, including the grounding line and ice fronts. Ice shelf margins in particular are modeled using Neumann boundary conditions for the SSA equations, reflecting a hydrostatic stress imbalance along the vertical calving face. The ice front position is modeled using a subgrid-scale representation of calving front motion (Albrecht et al., 2011) and a physically-motivated calving law based on horizontal spreading rates. The model is tested in experiments from the Marine Ice Sheet Model Intercomparison Project (MISMIP). A dynamic equilibrium simulation of Antarctica under present-day conditions is presented in Martin et al. (2011).

  18. Parallelization for first principles electronic state calculation program

    International Nuclear Information System (INIS)

    Watanabe, Hiroshi; Oguchi, Tamio.

    1997-03-01

    In this report we study the parallelization of a first-principles electronic state calculation program. The target machines are the NEC SX-4 for shared-memory parallelization and the FUJITSU VPP300 for distributed-memory parallelization. The features of each parallel machine are surveyed, and parallelization methods suitable for each are proposed. It is shown that a 1.60-times speedup is achieved with 2-CPU parallelization on the SX-4 and a 4.97-times speedup is achieved with 12-PE parallelization on the VPP300. (author)

  19. Parallel computation

    International Nuclear Information System (INIS)

    Jejcic, A.; Maillard, J.; Maurel, G.; Silva, J.; Wolff-Bacha, F.

    1997-01-01

    The work in the field of parallel processing has developed as research activities using several numerical Monte Carlo simulations related to basic or applied current problems of nuclear and particle physics. For the applications utilizing the GEANT code, development or improvement work was done on the parts simulating low-energy physical phenomena like radiation, transport and interaction. The problem of actinide burning by means of accelerators was approached using a simulation with the GEANT code. A program of neutron tracking in the range of low energies up to the thermal region has been developed. It is coupled to the GEANT code and permits, in a single pass, the simulation of a hybrid reactor core receiving a proton burst. Other works in this field refer to simulations for nuclear medicine applications such as the development of biological probes, the evaluation and characterization of gamma cameras (collimators, crystal thickness), and methods for dosimetric calculations. In particular, these calculations are suited for a geometrical parallelization approach especially adapted to parallel machines of the TN310 type. Other works mentioned in the same field refer to simulation of electron channelling in crystals and simulation of the beam-beam interaction effect in colliders. The GEANT code was also used to simulate the operation of germanium detectors designed for natural and artificial radioactivity monitoring of the environment.

  20. Coil extensions improve line shapes by removing field distortions

    Science.gov (United States)

    Conradi, Mark S.; Altobelli, Stephen A.; McDowell, Andrew F.

    2018-06-01

    The static magnetic susceptibility of the rf coil can substantially distort the field B0 and be a dominant source of line broadening. A scaling argument shows that this may be a particular problem in microcoil NMR. We propose coil extensions to reduce the distortion. The actual rf coil is extended to a much longer overall length by abutted coil segments that do not carry rf current. The result is a long and nearly uniform sheath of copper wire, in terms of the static susceptibility. The line shape improvement is demonstrated at 43.9 MHz and in simulation calculations.

  1. Neoclassical parallel flow calculation in the presence of external parallel momentum sources in Heliotron J

    Energy Technology Data Exchange (ETDEWEB)

    Nishioka, K.; Nakamura, Y. [Graduate School of Energy Science, Kyoto University, Gokasho, Uji, Kyoto 611-0011 (Japan); Nishimura, S. [National Institute for Fusion Science, 322-6 Oroshi-cho, Toki, Gifu 509-5292 (Japan); Lee, H. Y. [Korea Advanced Institute of Science and Technology, Daejeon 305-701 (Korea, Republic of); Kobayashi, S.; Mizuuchi, T.; Nagasaki, K.; Okada, H.; Minami, T.; Kado, S.; Yamamoto, S.; Ohshima, S.; Konoshima, S.; Sano, F. [Institute of Advanced Energy, Kyoto University, Gokasho, Uji, Kyoto 611-0011 (Japan)

    2016-03-15

    A moment approach to calculate neoclassical transport in non-axisymmetric torus plasmas composed of multiple ion species is extended to include the external parallel momentum sources due to unbalanced tangential neutral beam injections (NBIs). The momentum sources that are included in the parallel momentum balance are calculated from the collision operators of background particles with fast ions. This method is applied to clarify the physical mechanism of the neoclassical parallel ion flows and the multi-ion species effect on them in Heliotron J NBI plasmas. It is found that the parallel ion flow can be determined by the balance between the parallel viscosity and the external momentum source in the region where the external source is much larger than the thermodynamic-force-driven source in collisional plasmas. This is because the friction between C{sup 6+} and D{sup +} prevents a large difference between the C{sup 6+} and D{sup +} flow velocities in such plasmas. The C{sup 6+} flow velocities, which are measured by the charge exchange recombination spectroscopy system, are numerically evaluated with this method. It is shown that the experimentally measured C{sup 6+} impurity flow velocities do not clearly contradict the neoclassical estimates, and the dependence of the parallel flow velocities on the magnetic field ripples is consistent between the two.

  2. Increasing phylogenetic resolution at low taxonomic levels using massively parallel sequencing of chloroplast genomes

    Directory of Open Access Journals (Sweden)

    Cronn Richard

    2009-12-01

    Full Text Available Abstract Background Molecular evolutionary studies share the common goal of elucidating historical relationships, and the common challenge of adequately sampling taxa and characters. Particularly at low taxonomic levels, recent divergence, rapid radiations, and conservative genome evolution yield limited sequence variation, and dense taxon sampling is often desirable. Recent advances in massively parallel sequencing make it possible to rapidly obtain large amounts of sequence data, and multiplexing makes extensive sampling of megabase sequences feasible. Is it possible to efficiently apply massively parallel sequencing to increase phylogenetic resolution at low taxonomic levels? Results We reconstruct the infrageneric phylogeny of Pinus from 37 nearly-complete chloroplast genomes (average 109 kilobases each of an approximately 120 kilobase genome) generated using multiplexed massively parallel sequencing. 30/33 ingroup nodes resolved with ≥ 95% bootstrap support; this is a substantial improvement relative to prior studies, and shows massively parallel sequencing-based strategies can produce sufficient high quality sequence to reach support levels originally proposed for the phylogenetic bootstrap. Resampling simulations show that at least the entire plastome is necessary to fully resolve Pinus, particularly in rapidly radiating clades. Meta-analysis of 99 published infrageneric phylogenies shows that whole plastome analysis should provide similar gains across a range of plant genera. A disproportionate amount of phylogenetic information resides in two loci (ycf1, ycf2), highlighting their unusual evolutionary properties. Conclusion Plastome sequencing is now an efficient option for increasing phylogenetic resolution at lower taxonomic levels in plant phylogenetic and population genetic analyses. With continuing improvements in sequencing capacity, the strategies herein should revolutionize efforts requiring dense taxon and character sampling.

  3. 21 CFR 514.4 - Substantial evidence.

    Science.gov (United States)

    2010-04-01

    ... adequate and well-controlled studies, such as a study in a target species, study in laboratory animals... and conditions of use. Substantial evidence of effectiveness of a new animal drug shall demonstrate that the new animal drug is effective for each intended use and associated conditions of use for and...

  4. Structural Properties of G,T-Parallel Duplexes

    Directory of Open Access Journals (Sweden)

    Anna Aviñó

    2010-01-01

    Full Text Available The structure of G,T-parallel-stranded duplexes of DNA carrying similar amounts of adenine and guanine residues is studied by means of molecular dynamics (MD simulations and UV- and CD spectroscopies. In addition the impact of the substitution of adenine by 8-aminoadenine and guanine by 8-aminoguanine is analyzed. The presence of 8-aminoadenine and 8-aminoguanine stabilizes the parallel duplex structure. Binding of these oligonucleotides to their target polypyrimidine sequences to form the corresponding G,T-parallel triplex was not observed. Instead, when unmodified parallel-stranded duplexes were mixed with their polypyrimidine target, an interstrand Watson-Crick duplex was formed. As predicted by theoretical calculations parallel-stranded duplexes carrying 8-aminopurines did not bind to their target. The preference for the parallel-duplex over the Watson-Crick antiparallel duplex is attributed to the strong stabilization of the parallel duplex produced by the 8-aminopurines. Theoretical studies show that the isomorphism of the triads is crucial for the stability of the parallel triplex.

  5. Parallel operation of primary sodium pumps in FBTR

    International Nuclear Information System (INIS)

    Athmalingam, S.; Ellappan, T.R.; Vaidyanathan, G.; Chetal, S.C.; Bhoje, S.B.

    1994-01-01

    Sodium pumps used in the primary main circuit of Fast Breeder Test Reactor (FBTR) are centrifugal pumps. These pumps have a free level of sodium with a cover gas above it to simplify the pump seal arrangement. The sodium level in the pumps will vary based on the flow. The minimum level is governed by consideration of gas entrainment and net positive suction head (NPSH) to the pump while the maximum level is limited by sodium entering the pump tank gas line. There is a special feature in these pumps in that a small portion of the pump outlet sodium flow is led back into the suction chamber to maintain level and avoid gas entrainment. A control valve in this line helps in controlling the level at the desired value. With parallel operation of two sodium pumps a study was conducted to find the regions of safe operation of the two pumps. The purpose of this paper is to give the various design features and methodology of the analysis to arrive at the limiting condition of operation for the different operating states of the two pumps and the effect of pump speed variations on the fluctuations in sodium flows. (author). 6 figs

  6. High-speed parallel solution of the neutron diffusion equation with the hierarchical domain decomposition boundary element method incorporating parallel communications

    International Nuclear Information System (INIS)

    Tsuji, Masashi; Chiba, Gou

    2000-01-01

    A hierarchical domain decomposition boundary element method (HDD-BEM) for solving the multiregion neutron diffusion equation (NDE) has been fully parallelized, both for numerical computations and for data communications, to accomplish a high parallel efficiency on distributed memory message passing parallel computers. Data exchanges between node processors that are repeated during iteration processes of HDD-BEM are implemented, without any intervention of the host processor that was used to supervise parallel processing in the conventional parallelized HDD-BEM (P-HDD-BEM). Thus, the parallel processing can be executed with only cooperative operations of node processors. The communication overhead was even the dominant time consuming part in the conventional P-HDD-BEM, and the parallelization efficiency decreased steeply with the increase of the number of processors. With the parallel data communication, the efficiency is affected only by the number of boundary elements assigned to decomposed subregions, and the communication overhead can be drastically reduced. This feature can be particularly advantageous in the analysis of three-dimensional problems where a large number of processors are required. The proposed P-HDD-BEM offers a promising solution to the deterioration problem of parallel efficiency and opens a new path to parallel computations of NDEs on distributed memory message passing parallel computers. (author)

  7. Parallel education: what is it?

    OpenAIRE

    Amos, Michelle Peta

    2017-01-01

    In the history of education it has long been discussed that single-sex and coeducation are the two models of education present in schools. With the introduction of parallel schools over the last 15 years, there has been very little research into this 'new model'. Many people do not understand what it means for a school to be parallel or they confuse a parallel model with co-education, due to the presence of both boys and girls within the one institution. Therefore, the main obj...

  8. Parallel computing of physical maps--a comparative study in SIMD and MIMD parallelism.

    Science.gov (United States)

    Bhandarkar, S M; Chirravuri, S; Arnold, J

    1996-01-01

    Ordering clones from a genomic library into physical maps of whole chromosomes presents a central computational problem in genetics. Chromosome reconstruction via clone ordering is usually isomorphic to the NP-complete Optimal Linear Arrangement problem. Parallel SIMD and MIMD algorithms for simulated annealing based on Markov chain distribution are proposed and applied to the problem of chromosome reconstruction via clone ordering. Perturbation methods and problem-specific annealing heuristics are proposed and described. The SIMD algorithms are implemented on a 2048 processor MasPar MP-2 system which is an SIMD 2-D toroidal mesh architecture whereas the MIMD algorithms are implemented on an 8 processor Intel iPSC/860 which is an MIMD hypercube architecture. A comparative analysis of the various SIMD and MIMD algorithms is presented in which the convergence, speedup, and scalability characteristics of the various algorithms are analyzed and discussed. On a fine-grained, massively parallel SIMD architecture with a low synchronization overhead such as the MasPar MP-2, a parallel simulated annealing algorithm based on multiple periodically interacting searches performs the best. For a coarse-grained MIMD architecture with high synchronization overhead such as the Intel iPSC/860, a parallel simulated annealing algorithm based on multiple independent searches yields the best results. In either case, distribution of clonal data across multiple processors is shown to exacerbate the tendency of the parallel simulated annealing algorithm to get trapped in a local optimum.
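
    The MIMD result above (multiple independent searches perform best when synchronization is expensive) can be illustrated with a small sketch. The cost function, weights and cooling schedule below are toy stand-ins for the clone-ordering/Optimal Linear Arrangement problem, not the paper's implementation, and Python's multiprocessing pool stands in for the MIMD processors.

```python
# Sketch of the "multiple independent searches" strategy: several simulated
# annealing runs with different seeds, keeping the best result.
import math
import random
from multiprocessing import Pool

def cost(order, weights):
    # Weighted sum of distances between element positions; lower is better.
    pos = {elem: idx for idx, elem in enumerate(order)}
    return sum(w * abs(pos[i] - pos[j]) for (i, j), w in weights.items())

def anneal(args):
    seed, weights, n, steps = args
    rng = random.Random(seed)
    order = list(range(n))
    rng.shuffle(order)
    cur = cost(order, weights)
    best, best_cost, temp = order[:], cur, 1.0
    for _ in range(steps):
        i, j = rng.sample(range(n), 2)
        order[i], order[j] = order[j], order[i]          # propose a swap
        new = cost(order, weights)
        if new <= cur or rng.random() < math.exp(-(new - cur) / temp):
            cur = new                                    # accept (Metropolis criterion)
            if cur < best_cost:
                best, best_cost = order[:], cur
        else:
            order[i], order[j] = order[j], order[i]      # reject: undo the swap
        temp = max(temp * 0.999, 1e-6)                   # geometric cooling
    return best_cost, best

if __name__ == "__main__":
    n = 12
    weights = {(i, (i + 3) % n): 1.0 for i in range(n)}  # toy adjacency data
    tasks = [(seed, weights, n, 5000) for seed in range(4)]
    with Pool(4) as pool:                                # four independent searches
        results = pool.map(anneal, tasks)
    print(min(results)[0])                               # best cost over all searches
```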

  9. On synchronous parallel computations with independent probabilistic choice

    International Nuclear Information System (INIS)

    Reif, J.H.

    1984-01-01

    This paper introduces probabilistic choice to synchronous parallel machine models; in particular parallel RAMs. The power of probabilistic choice in parallel computations is illustrated by parallelizing some known probabilistic sequential algorithms. The authors characterize the computational complexity of time, space, and processor bounded probabilistic parallel RAMs in terms of the computational complexity of probabilistic sequential RAMs. They show that parallelism uniformly speeds up time bounded probabilistic sequential RAM computations by nearly a quadratic factor. They also show that probabilistic choice can be eliminated from parallel computations by introducing nonuniformity

  10. Automatic Parallelization Tool: Classification of Program Code for Parallel Computing

    Directory of Open Access Journals (Sweden)

    Mustafa Basthikodi

    2016-04-01

    Full Text Available Performance growth of single-core processors has come to a halt in the past decade, but was re-enabled by the introduction of parallelism in processors. Multicore frameworks along with graphical processing units have broadened the reach of parallelism. Several compilers have been updated to address the emerging challenges of synchronization and threading. Appropriate program and algorithm classification is of great advantage to software engineers seeking opportunities for effective parallelization. In the present work we investigated current approaches to the species-based classification of algorithms; related work on classification is discussed along with a comparison of the issues that challenge classification. A set of algorithms was chosen which match the structure of different issues and perform a given task. We tested these algorithms using existing automatic species-extraction tools along with the Bones compiler. We added functionalities to the existing tool, providing a more detailed characterization. The contributions of our work include support for pointer arithmetic, conditional and incremental statements, user-defined types, constants and mathematical functions. With this, we can retain significant information which is not captured by the original species of algorithms. We implemented the new capabilities in the tool, enabling automatic characterization of program code.

  11. Numeric algorithms for parallel processors computer architectures with applications to the few-groups neutron diffusion equations

    International Nuclear Information System (INIS)

    Zee, S.K.

    1987-01-01

    A numeric algorithm and an associated computer code were developed for the rapid solution of the finite-difference method representation of the few-group neutron-diffusion equations on parallel computers. Applications of the numeric algorithm on both SIMD (vector pipeline) and MIMD/SIMD (multi-CPU/vector pipeline) architectures were explored. The algorithm was successfully implemented in the two-group, 3-D neutron diffusion computer code named DIFPAR3D (DIFfusion PARallel 3-Dimension). Numerical-solution techniques used in the code include the Chebyshev polynomial acceleration technique in conjunction with the power method of outer iteration. For inner iterations, a parallel form of red-black (cyclic) line SOR with automated determination of group dependent relaxation factors and iteration numbers required to achieve specified inner iteration error tolerance is incorporated. The code employs a macroscopic depletion model with trace capability for selected fission products' transients and critical boron. In addition to this, moderator and fuel temperature feedback models are also incorporated into the DIFPAR3D code, for realistic simulation of power reactor cores. The physics models used were proven acceptable in separate benchmarking studies
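
    The red-black ordering mentioned for the inner iterations is what makes each sweep parallel: all "red" points depend only on "black" neighbours and vice versa. The sketch below shows the point (not line) variant on a 2-D Poisson problem with a fixed relaxation factor; it is an illustration of the ordering only, not the DIFPAR3D inner solver.

```python
# Red-black point SOR on a 2-D Poisson problem.  Within one parity sweep every
# update is independent of the others, which is what exposes parallelism.
import numpy as np

def redblack_sor(f, h, omega=1.5, sweeps=200):
    u = np.zeros_like(f)                           # homogeneous Dirichlet boundary
    n, m = f.shape
    for _ in range(sweeps):
        for parity in (0, 1):                      # 0 = red sweep, 1 = black sweep
            for i in range(1, n - 1):
                for j in range(1, m - 1):
                    if (i + j) % 2 != parity:
                        continue
                    gs = 0.25 * (u[i - 1, j] + u[i + 1, j] +
                                 u[i, j - 1] + u[i, j + 1] - h * h * f[i, j])
                    u[i, j] += omega * (gs - u[i, j])   # over-relaxed update
    return u

if __name__ == "__main__":
    n = 33
    f = np.full((n, n), -1.0)                      # constant source term
    u = redblack_sor(f, h=1.0 / (n - 1))
    print(round(float(u.max()), 4))                # peak of the computed solution
```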

  12. Resistor Combinations for Parallel Circuits.

    Science.gov (United States)

    McTernan, James P.

    1978-01-01

    To help simplify both teaching and learning of parallel circuits, a high school electricity/electronics teacher presents and illustrates the use of tables of values for parallel resistive circuits in which total resistances are whole numbers. (MF)
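
    The underlying arithmetic is the reciprocal rule 1/R_total = 1/R1 + 1/R2 + ...; the short script below generates whole-number pairs in the spirit of the tables described (the resistor range is an arbitrary illustration, not taken from the article).

```python
# Parallel resistance: 1/R_total = 1/R1 + 1/R2 + ...
# List pairs of whole-number resistors whose parallel combination is also a whole number.
from fractions import Fraction

def parallel(*resistors):
    return 1 / sum(Fraction(1, r) for r in resistors)

whole_number_pairs = [
    (r1, r2, parallel(r1, r2))
    for r1 in range(1, 61)
    for r2 in range(r1, 61)
    if parallel(r1, r2).denominator == 1
]

for r1, r2, total in whole_number_pairs[:5]:
    print(f"{r1} ohm || {r2} ohm = {total} ohm")   # e.g. 3 || 6 = 2, 4 || 12 = 3
```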

  13. Parallelization methods study of thermal-hydraulics codes

    International Nuclear Information System (INIS)

    Gaudart, Catherine

    2000-01-01

    The variety of parallelization methods and machines leads to a wide selection for programmers. In this study we suggest, in an industrial context, some solutions based on the experience acquired with different parallelization methods. The study concerns several scientific codes which simulate a large variety of thermal-hydraulics phenomena. A bibliography on parallelization methods and a first analysis of the codes showed the difficulty of applying our process to the whole set of applications under study. It was therefore necessary to identify and extract a representative part of these applications and parallelization methods. The linear solver part of the codes emerged as the natural candidate. On this particular part several parallelization methods have been used. From these developments one can estimate the work required for a novice programmer to parallelize an application, and the impact of the development constraints. The parallelization methods tested are the numerical library PETSc, the parallelizer PAF, the language HPF, the formalism PEI and the communication libraries MPI and PVM. In order to test several methods on different applications and to respect the constraint of minimizing the modifications to the codes, a tool called SPS (Server of Parallel Solvers) was developed. We describe the different constraints on the optimization of codes in an industrial context, present the solutions provided by the SPS tool, show the development of the linear solver part with the tested parallelization methods, and finally compare the results against the imposed criteria. (author) [fr

  14. Implications of the stagnation line model for energy input through the dayside magnetopause

    International Nuclear Information System (INIS)

    Pudovkin, M.I.; Semenov, V.S.; Heyn, M.F.; Biernat, H.K.

    1986-01-01

    Based on the formation of a stagnation line at the magnetopause, the electromagnetic energy transport from the solar wind into the dayside magnetosphere is analyzed. The resulting energy flux is proportional to v∞B∞ sin²(θ∞ − φ∞), where v∞ and B∞ are the solar wind speed and magnetic field and θ∞ − φ∞ is the angle between the IMF and the stagnation line projected into interplanetary space. A stagnation line parallel to the separator gives approximately the sin⁴(θ∞/2) energy flux dependence of Akasofu's epsilon-index
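
    As a rough illustration, the two angular coupling functions named above can be compared numerically; the solar-wind speed, field strength and stagnation-line orientation below are assumed illustrative values, not results from the paper.

```python
# Compare the stagnation-line coupling shape, ~ v*B*sin^2(theta - phi),
# with the Akasofu-like shape, ~ sin^4(theta/2), over a range of IMF clock angles.
import math

v = 400e3                    # solar wind speed [m/s] (assumed)
B = 5e-9                     # IMF magnitude [T] (assumed)
phi = math.radians(20.0)     # projected stagnation-line orientation (assumed)

for theta_deg in (0, 45, 90, 135, 180):
    theta = math.radians(theta_deg)
    stag = v * B * math.sin(theta - phi) ** 2
    akasofu_shape = math.sin(theta / 2) ** 4
    print(f"theta={theta_deg:3d} deg   stagnation~{stag:.2e}   sin^4(theta/2)={akasofu_shape:.3f}")
```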

  15. Workspace Analysis for Parallel Robot

    Directory of Open Access Journals (Sweden)

    Ying Sun

    2013-05-01

    Full Text Available As a comparatively new type of robot, the parallel robot possesses many advantages that the serial robot does not, such as high rigidity, great load-carrying capacity, small error, high precision, small self-weight/load ratio, good dynamic behavior and easy control; hence its range of application domains keeps expanding. In order to find the workspace of a parallel mechanism, a numerical boundary-searching algorithm based on the inverse kinematic solution and the limitation of link lengths is introduced. This paper analyses the position workspace and orientation workspace of a six-degree-of-freedom parallel robot. The results show that changing the length of the branches of the parallel mechanism is the main means of increasing or decreasing its workspace, and that the radius of the moving platform has no effect on the size of the workspace but will change its position.

  16. Massively Parallel Finite Element Programming

    KAUST Repository

    Heister, Timo; Kronbichler, Martin; Bangerth, Wolfgang

    2010-01-01

    Today's large finite element simulations require parallel algorithms to scale on clusters with thousands or tens of thousands of processor cores. We present data structures and algorithms to take advantage of the power of high performance computers in generic finite element codes. Existing generic finite element libraries often restrict the parallelization to parallel linear algebra routines. This is a limiting factor when solving on more than a few hundreds of cores. We describe routines for distributed storage of all major components coupled with efficient, scalable algorithms. We give an overview of our effort to enable the modern and generic finite element library deal.II to take advantage of the power of large clusters. In particular, we describe the construction of a distributed mesh and develop algorithms to fully parallelize the finite element calculation. Numerical results demonstrate good scalability. © 2010 Springer-Verlag.

  17. Massively Parallel Finite Element Programming

    KAUST Repository

    Heister, Timo

    2010-01-01

    Today's large finite element simulations require parallel algorithms to scale on clusters with thousands or tens of thousands of processor cores. We present data structures and algorithms to take advantage of the power of high performance computers in generic finite element codes. Existing generic finite element libraries often restrict the parallelization to parallel linear algebra routines. This is a limiting factor when solving on more than a few hundreds of cores. We describe routines for distributed storage of all major components coupled with efficient, scalable algorithms. We give an overview of our effort to enable the modern and generic finite element library deal.II to take advantage of the power of large clusters. In particular, we describe the construction of a distributed mesh and develop algorithms to fully parallelize the finite element calculation. Numerical results demonstrate good scalability. © 2010 Springer-Verlag.

  18. A 3D gyrokinetic particle-in-cell simulation of fusion plasma microturbulence on parallel computers

    Science.gov (United States)

    Williams, T. J.

    1992-12-01

    One of the grand challenge problems now supported by HPCC is the Numerical Tokamak Project. A goal of this project is the study of low-frequency micro-instabilities in tokamak plasmas, which are believed to cause energy loss via turbulent thermal transport across the magnetic field lines. An important tool in this study is gyrokinetic particle-in-cell (PIC) simulation. Gyrokinetic, as opposed to fully-kinetic, methods are particularly well suited to the task because they are optimized to study the frequency and wavelength domain of the microinstabilities. Furthermore, many researchers now employ low-noise delta(f) methods to greatly reduce statistical noise by modelling only the perturbation of the gyrokinetic distribution function from a fixed background, not the entire distribution function. In spite of the increased efficiency of these improved algorithms over conventional PIC algorithms, gyrokinetic PIC simulations of tokamak micro-turbulence are still highly demanding of computer power--even fully-vectorized codes on vector supercomputers. For this reason, we have worked for several years to redevelop these codes on massively parallel computers. We have developed 3D gyrokinetic PIC simulation codes for SIMD and MIMD parallel processors, using control-parallel, data-parallel, and domain-decomposition message-passing (DDMP) programming paradigms. This poster summarizes our earlier work on codes for the Connection Machine and BBN TC2000 and our development of a generic DDMP code for distributed-memory parallel machines. We discuss the memory-access issues which are of key importance in writing parallel PIC codes, with special emphasis on issues peculiar to gyrokinetic PIC. We outline the domain decompositions in our new DDMP code and discuss the interplay of different domain decompositions suited for the particle-pushing and field-solution components of the PIC algorithm.

  19. Collectively loading an application in a parallel computer

    Science.gov (United States)

    Aho, Michael E.; Attinella, John E.; Gooding, Thomas M.; Miller, Samuel J.; Mundy, Michael B.

    2016-01-05

    Collectively loading an application in a parallel computer, the parallel computer comprising a plurality of compute nodes, including: identifying, by a parallel computer control system, a subset of compute nodes in the parallel computer to execute a job; selecting, by the parallel computer control system, one of the subset of compute nodes in the parallel computer as a job leader compute node; retrieving, by the job leader compute node from computer memory, an application for executing the job; and broadcasting, by the job leader to the subset of compute nodes in the parallel computer, the application for executing the job.
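
    The load sequence described (pick a subset of nodes, select a job leader, have the leader retrieve the application, then broadcast it) maps naturally onto a collective broadcast. The sketch below uses MPI via mpi4py as an analogy, not the actual parallel computer control system; the in-memory byte string stands in for the application image the leader would retrieve.

```python
# Analogy for the described load sequence: a "job leader" rank obtains the
# application image and broadcasts it to the other compute nodes.
from mpi4py import MPI

comm = MPI.COMM_WORLD
leader = 0                                   # rank selected as job leader (assumption)

if comm.Get_rank() == leader:
    # Stand-in for "retrieving the application for executing the job";
    # a real system would read the executable image from storage here.
    image = bytes(range(256)) * 4
else:
    image = None

image = comm.bcast(image, root=leader)       # collective broadcast to the node subset
print(f"rank {comm.Get_rank()} holds an image of {len(image)} bytes")
```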

  20. Productive Parallel Programming: The PCN Approach

    Directory of Open Access Journals (Sweden)

    Ian Foster

    1992-01-01

    Full Text Available We describe the PCN programming system, focusing on those features designed to improve the productivity of scientists and engineers using parallel supercomputers. These features include a simple notation for the concise specification of concurrent algorithms, the ability to incorporate existing Fortran and C code into parallel applications, facilities for reusing parallel program components, a portable toolkit that allows applications to be developed on a workstation or small parallel computer and run unchanged on supercomputers, and integrated debugging and performance analysis tools. We survey representative scientific applications and identify problem classes for which PCN has proved particularly useful.

  1. Final Report: Migration Mechanisms for Large-scale Parallel Applications

    Energy Technology Data Exchange (ETDEWEB)

    Jason Nieh

    2009-10-30

    Process migration is the ability to transfer a process from one machine to another. It is a useful facility in distributed computing environments, especially as computing devices become more pervasive and Internet access becomes more ubiquitous. The potential benefits of process migration, among others, are fault resilience by migrating processes off of faulty hosts, data access locality by migrating processes closer to the data, better system response time by migrating processes closer to users, dynamic load balancing by migrating processes to less loaded hosts, and improved service availability and administration by migrating processes before host maintenance so that applications can continue to run with minimal downtime. Although process migration provides substantial potential benefits and many approaches have been considered, achieving transparent process migration functionality has been difficult in practice. To address this problem, our work has designed, implemented, and evaluated new and powerful transparent process checkpoint-restart and migration mechanisms for desktop, server, and parallel applications that operate across heterogeneous cluster and mobile computing environments. A key aspect of this work has been to introduce lightweight operating system virtualization to provide processes with private, virtual namespaces that decouple and isolate processes from dependencies on the host operating system instance. This decoupling enables processes to be transparently checkpointed and migrated without modifying, recompiling, or relinking applications or the operating system. Building on this lightweight operating system virtualization approach, we have developed novel technologies that enable (1) coordinated, consistent checkpoint-restart and migration of multiple processes, (2) fast checkpointing of process and file system state to enable restart of multiple parallel execution environments and time travel, (3) process migration across heterogeneous

  2. Parallel-In-Time For Moving Meshes

    Energy Technology Data Exchange (ETDEWEB)

    Falgout, R. D. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Manteuffel, T. A. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Southworth, B. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Schroder, J. B. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)

    2016-02-04

    With steadily growing computational resources available, scientists must develop effective ways to utilize the increased resources. High performance, highly parallel software has become a standard. However, until recent years parallelism has focused primarily on the spatial domain. When solving a space-time partial differential equation (PDE), this leads to a sequential bottleneck in the temporal dimension, particularly when taking a large number of time steps. The XBraid parallel-in-time library was developed as a practical way to add temporal parallelism to existing sequential codes with only minor modifications. In this work, a rezoning-type moving mesh is applied to a diffusion problem and formulated in a parallel-in-time framework. Tests and scaling studies are run using XBraid and demonstrate excellent results for the simple model problem considered herein.

  3. Integrated Task And Data Parallel Programming: Language Design

    Science.gov (United States)

    Grimshaw, Andrew S.; West, Emily A.

    1998-01-01

    This research investigates the combination of task and data parallel language constructs within a single programming language. There are a number of applications that exhibit properties which would be well served by such an integrated language. Examples include global climate models, aircraft design problems, and multidisciplinary design optimization problems. Our approach incorporates data parallel language constructs into an existing, object-oriented, task parallel language. The language will support creation and manipulation of parallel classes and objects of both types (task parallel and data parallel). Ultimately, the language will allow data parallel and task parallel classes to be used either as building blocks or managers of parallel objects of either type, thus allowing the development of single and multi-paradigm parallel applications. 1995 Research Accomplishments: In February I presented a paper at Frontiers '95 describing the design of the data parallel language subset. During the spring I wrote and defended my dissertation proposal. Since that time I have developed a runtime model for the language subset. I have begun implementing the model and hand-coding simple examples which demonstrate the language subset. I have identified an astrophysical fluid flow application which will validate the data parallel language subset. 1996 Research Agenda: Milestones for the coming year include implementing a significant portion of the data parallel language subset over the Legion system. Using simple hand-coded methods, I plan to demonstrate (1) concurrent task and data parallel objects and (2) task parallel objects managing both task and data parallel objects. My next steps will focus on constructing a compiler and implementing the fluid flow application with the language. Concurrently, I will conduct a search for a real-world application exhibiting both task and data parallelism within the same program. Additional 1995 Activities: During the fall I collaborated

  4. Scheduling Additional Train Unit Services on Rail Transit Lines

    OpenAIRE

    Zhibin Jiang; Yuyan Tan; Özgür Yalçınkaya

    2014-01-01

    This paper deals with the problem of scheduling additional train unit (TU) services in a double parallel rail transit line, and a mixed integer programming (MIP) model is formulated for integration strategies of new trains connected by TUs with the objective of obtaining higher frequencies in some special sections and special time periods due to mass passenger volumes. We took timetable scheduling and TUs scheduling as an integrated optimization model with two objectives: minimizing travel ti...

  5. Performance of the Galley Parallel File System

    Science.gov (United States)

    Nieuwejaar, Nils; Kotz, David

    1996-01-01

    As the input/output (I/O) needs of parallel scientific applications increase, file systems for multiprocessors are being designed to provide applications with parallel access to multiple disks. Many parallel file systems present applications with a conventional Unix-like interface that allows the application to access multiple disks transparently. This interface conceals the parallelism within the file system, which increases the ease of programmability, but makes it difficult or impossible for sophisticated programmers and libraries to use knowledge about their I/O needs to exploit that parallelism. Furthermore, most current parallel file systems are optimized for a different workload than they are being asked to support. We introduce Galley, a new parallel file system that is intended to efficiently support realistic parallel workloads. Initial experiments, reported in this paper, indicate that Galley is capable of providing high-performance I/O to applications that access data in patterns that have been observed to be common.

  6. Application of escape probability to line transfer in laser-produced plasmas

    International Nuclear Information System (INIS)

    Lee, Y.T.; London, R.A.; Zimmerman, G.B.; Haglestein, P.L.

    1989-01-01

    In this paper the authors apply the escape probability method to treat transfer of optically thick lines in laser-produced plasmas in plane-parallel geometry. They investigate the effect of self-absorption on the ionization balance and ion level populations. In addition, they calculate this effect on the laser gains in an exploding foil target heated by an optical laser. Due to the large ion streaming motion in laser-produced plasmas, absorption of an emitted photon occurs only over the length in which the Doppler shift is equal to the line width. They find that the escape probability calculated with the Doppler shift is larger than the escape probability for a static plasma. Therefore, the ion streaming motion contributes significantly to the line transfer process in laser-produced plasmas. As examples, they have applied escape probability to calculate transfer of optically thick lines in both ablating slab and exploding foil targets under irradiation of a high-power optical laser
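
    For orientation, the standard Sobolev-type escape probability shows how a velocity gradient (which limits the length over which a photon can be reabsorbed) increases photon escape; this is a generic textbook form, not necessarily the exact expression used by the authors.

```python
# Sobolev-type escape probability, beta(tau) = (1 - exp(-tau)) / tau.
# Large streaming velocities reduce the effective optical depth tau over which
# a line photon can be reabsorbed, so beta approaches 1 (free escape).
import math

def escape_probability(tau):
    if tau < 1e-8:                     # optically thin limit: beta -> 1
        return 1.0
    return (1.0 - math.exp(-tau)) / tau

for tau in (0.01, 0.1, 1.0, 10.0, 100.0):
    print(f"tau = {tau:7.2f}   beta = {escape_probability(tau):.4f}")
```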

  7. Unified Singularity Modeling and Reconfiguration of 3rTPS Metamorphic Parallel Mechanisms with Parallel Constraint Screws

    Directory of Open Access Journals (Sweden)

    Yufeng Zhuang

    2015-01-01

    Full Text Available This paper presents a unified singularity modeling and reconfiguration analysis of variable topologies of a class of metamorphic parallel mechanisms with parallel constraint screws. The new parallel mechanisms consist of three reconfigurable rTPS limbs that have two working phases stemming from the reconfigurable Hooke (rT joint. While one phase has full mobility, the other supplies a constraint force to the platform. Based on these, the platform constraint screw systems show that the new metamorphic parallel mechanisms have four topologies by altering the limb phases with mobility change among 1R2T (one rotation with two translations, 2R2T, and 3R2T and mobility 6. Geometric conditions of the mechanism design are investigated with some special topologies illustrated considering the limb arrangement. Following this and the actuation scheme analysis, a unified Jacobian matrix is formed using screw theory to include the change between geometric constraints and actuation constraints in the topology reconfiguration. Various singular configurations are identified by analyzing screw dependency in the Jacobian matrix. The work in this paper provides basis for singularity-free workspace analysis and optimal design of the class of metamorphic parallel mechanisms with parallel constraint screws which shows simple geometric constraints with potential simple kinematics and dynamics properties.

  8. Stability of arsenic peptides in plant extracts: off-line versus on-line parallel elemental and molecular mass spectrometric detection for liquid chromatographic separation.

    Science.gov (United States)

    Bluemlein, Katharina; Raab, Andrea; Feldmann, Jörg

    2009-01-01

    The instability of metal and metalloid complexes during analytical processes has always been a source of uncertainty regarding their speciation in plant extracts. Two different speciation protocols were compared regarding the analysis of arsenic phytochelatin (As(III)PC) complexes in fresh plant material. As the final step for separation/detection both methods used RP-HPLC simultaneously coupled to ICP-MS and ES-MS. However, one method was the often-used off-line approach using two-dimensional separation, i.e. a pre-cleaning step using size-exclusion chromatography with subsequent fraction collection and freeze-drying prior to the analysis using RP-HPLC-ICP-MS and/or ES-MS. This approach revealed that less than 2% of the total arsenic was bound to peptides such as phytochelatins in the root extract of an arsenate-exposed Thunbergia alata, whereas the direct on-line method showed that 83% of arsenic was bound to peptides, mainly as As(III)PC(3) and (GS)As(III)PC(2). Key analytical factors were identified which destabilise the As(III)PCs. The low pH of the mobile phase (0.1% formic acid) using RP-HPLC-ICP-MS/ES-MS stabilises the arsenic peptide complexes in the plant extract as well as the free peptide concentration, as shown by the kinetic disintegration study of the model compound As(III)(GS)(3) at pH 2.2 and 3.8. However, half-lives of only a few hours were determined for the arsenic glutathione complex. Although As(III)PC(3) showed a ten times higher half-life (23 h) in a plant extract, the pre-cleaning step with subsequent fractionation in a mobile phase of pH 5.6 contributes to the destabilisation of the arsenic peptides in the off-line method. Furthermore, it was found that during a freeze-drying process more than 90% of an As(III)PC(3) complex and smaller free peptides such as PC(2) and PC(3) can be lost. Although the two-dimensional off-line method has been used successfully for other metal complexes, it is concluded here that the fractionation and

  9. Development of high-resolution x-ray CT system using parallel beam geometry

    Energy Technology Data Exchange (ETDEWEB)

    Yoneyama, Akio, E-mail: akio.yoneyama.bu@hitachi.com; Baba, Rika [Central Research Laboratory, Hitachi Ltd., Hatoyama, Saitama (Japan); Hyodo, Kazuyuki [Institute of Materials Science, High Energy Accelerator Research Organization, Tsukuba, Ibaraki (Japan); Takeda, Tohoru [School of Allied Health Sciences, Kitasato University, Sagamihara, Kanagawa (Japan); Nakano, Haruhisa; Maki, Koutaro [Department of Orthodontics, School of Dentistry Showa University, Ota-ku, Tokyo (Japan); Sumitani, Kazushi; Hirai, Yasuharu [Kyushu Synchrotron Light Research Center, Tosu, Saga (Japan)

    2016-01-28

    For fine three-dimensional observations of large biomedical and organic material samples, we developed a high-resolution X-ray CT system. The system consists of a sample positioner, a 5-μm scintillator, microscopy lenses, and a water-cooled sCMOS detector. Parallel beam geometry was adopted to attain a field of view of a few mm square. A fine three-dimensional image of birch branch was obtained using a 9-keV X-ray at BL16XU of SPring-8 in Japan. The spatial resolution estimated from the line profile of a sectional image was about 3 μm.

  10. Methodology for substantiation of the fast reactor fuel element serviceability

    International Nuclear Information System (INIS)

    Tsykanov, V.A.; Maershin, A.A.

    1988-01-01

    Methodological aspects of substantiating fast reactor fuel element serviceability are presented. The choice of the experimental program and the strategies for its realization, aimed at solving the problem in a short time while taking into account the available experimental means, are substantiated. Factors determining fuel element serviceability depending on parameters and operational conditions are considered. A methodological approach recommending separate study of these factors is described, which points to the possibility of acquiring the data required for the development of calculational models and for the substantiation of fuel element serviceability in pilot and experimental reactors. It is shown that such special-purpose data are more useful for the substantiation of fuel element serviceability and the development of analytical methods than unsubstantial and expensive complex tests of fuel elements and fuel assemblies, which should be conducted only at the final stages for the improvement of the structure as a whole

  11. Large rainfall changes consistently projected over substantial areas of tropical land

    Science.gov (United States)

    Chadwick, Robin; Good, Peter; Martin, Gill; Rowell, David P.

    2016-02-01

    Many tropical countries are exceptionally vulnerable to changes in rainfall patterns, with floods or droughts often severely affecting human life and health, food and water supplies, ecosystems and infrastructure. There is widespread disagreement among climate model projections of how and where rainfall will change over tropical land at the regional scales relevant to impacts, with different models predicting the position of current tropical wet and dry regions to shift in different ways. Here we show that despite uncertainty in the location of future rainfall shifts, climate models consistently project that large rainfall changes will occur for a considerable proportion of tropical land over the twenty-first century. The area of semi-arid land affected by large changes under a higher emissions scenario is likely to be greater than during even the most extreme regional wet or dry periods of the twentieth century, such as the Sahel drought of the late 1960s to 1990s. Substantial changes are projected to occur by mid-century--earlier than previously expected--and to intensify in line with global temperature rise. Therefore, current climate projections contain quantitative, decision-relevant information on future regional rainfall changes, particularly with regard to climate change mitigation policy.

  12. Fast ℓ1-SPIRiT Compressed Sensing Parallel Imaging MRI: Scalable Parallel Implementation and Clinically Feasible Runtime

    Science.gov (United States)

    Murphy, Mark; Alley, Marcus; Demmel, James; Keutzer, Kurt; Vasanawala, Shreyas; Lustig, Michael

    2012-01-01

    We present ℓ1-SPIRiT, a simple algorithm for auto calibrating parallel imaging (acPI) and compressed sensing (CS) that permits an efficient implementation with clinically-feasible runtimes. We propose a CS objective function that minimizes cross-channel joint sparsity in the Wavelet domain. Our reconstruction minimizes this objective via iterative soft-thresholding, and integrates naturally with iterative Self-Consistent Parallel Imaging (SPIRiT). Like many iterative MRI reconstructions, ℓ1-SPIRiT’s image quality comes at a high computational cost. Excessively long runtimes are a barrier to the clinical use of any reconstruction approach, and thus we discuss our approach to efficiently parallelizing ℓ1-SPIRiT and to achieving clinically-feasible runtimes. We present parallelizations of ℓ1-SPIRiT for both multi-GPU systems and multi-core CPUs, and discuss the software optimization and parallelization decisions made in our implementation. The performance of these alternatives depends on the processor architecture, the size of the image matrix, and the number of parallel imaging channels. Fundamentally, achieving fast runtime requires the correct trade-off between cache usage and parallelization overheads. We demonstrate image quality via a case from our clinical experimentation, using a custom 3DFT Spoiled Gradient Echo (SPGR) sequence with up to 8× acceleration via poisson-disc undersampling in the two phase-encoded directions. PMID:22345529
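
    The core numerical kernel named above is iterative soft-thresholding. The sketch below shows that kernel in isolation on a toy 1-D sparse signal; the identity data-consistency step is a placeholder for the SPIRiT consistency and wavelet operators of the real reconstruction, so this is an illustration of the thresholding loop, not the ℓ1-SPIRiT code.

```python
# Iterative soft-thresholding (ISTA-style) with a complex-valued shrinkage operator.
import numpy as np

def soft_threshold(x, lam):
    # Shrink magnitudes by lam while keeping the phase of each complex sample.
    mag = np.abs(x)
    return np.where(mag > lam, (1 - lam / np.maximum(mag, 1e-12)) * x, 0)

def ista(y, lam=0.05, step=1.0, iters=50):
    x = np.zeros_like(y)
    for _ in range(iters):
        grad = x - y                       # gradient of 0.5*||x - y||^2 (placeholder data term)
        x = soft_threshold(x - step * grad, step * lam)
    return x

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    signal = np.zeros(256, dtype=complex)
    signal[::32] = 1.0 + 0.5j              # sparse ground truth (8 spikes)
    noisy = signal + 0.05 * (rng.standard_normal(256) + 1j * rng.standard_normal(256))
    recovered = ista(noisy)
    print(np.count_nonzero(np.abs(recovered) > 0.1))   # roughly the 8 true spikes survive
```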

  13. Parallel and non-parallel laminar mixed convection flow in an inclined tube: The effect of the boundary conditions

    International Nuclear Information System (INIS)

    Barletta, A.

    2008-01-01

    The necessary condition for the onset of parallel flow in the fully developed region of an inclined duct is applied to the case of a circular tube. Parallel flow in inclined ducts is an uncommon regime, since in most cases buoyancy tends to produce the onset of secondary flow. The present study shows how proper thermal boundary conditions may preserve parallel flow regime. Mixed convection flow is studied for a special non-axisymmetric thermal boundary condition that, with a proper choice of a switch parameter, may be compatible with parallel flow. More precisely, a circumferentially variable heat flux distribution is prescribed on the tube wall, expressed as a sinusoidal function of the azimuthal coordinate θ with period 2π. A π/2 rotation in the position of the maximum heat flux, achieved by setting the switch parameter, may allow or not the existence of parallel flow. Two cases are considered corresponding to parallel and non-parallel flow. In the first case, the governing balance equations allow a simple analytical solution. On the contrary, in the second case, the local balance equations are solved numerically by employing a finite element method

  14. Parallel programming with Easy Java Simulations

    Science.gov (United States)

    Esquembre, F.; Christian, W.; Belloni, M.

    2018-01-01

    Nearly all of today's processors are multicore, and ideally programming and algorithm development utilizing the entire processor should be introduced early in the computational physics curriculum. Parallel programming is often not introduced because it requires a new programming environment and uses constructs that are unfamiliar to many teachers. We describe how we decrease the barrier to parallel programming by using a java-based programming environment to treat problems in the usual undergraduate curriculum. We use the easy java simulations programming and authoring tool to create the program's graphical user interface together with objects based on those developed by Kaminsky [Building Parallel Programs (Course Technology, Boston, 2010)] to handle common parallel programming tasks. Shared-memory parallel implementations of physics problems, such as time evolution of the Schrödinger equation, are available as source code and as ready-to-run programs from the AAPT-ComPADRE digital library.

  15. Tracing Magnetic Fields With The Polarization Of Submillimeter Lines

    Science.gov (United States)

    Zhang, Heshou; Yan, Huirong

    2017-10-01

    Magnetic fields play important roles in many astrophysical processes. However, there is no universal diagnostic for the magnetic fields in the interstellar medium (ISM) and each magnetic tracer has its limitation. Any new detection method is thus valuable. Theoretical studies have shown that submillimeter fine-structure lines are polarized due to atomic alignment by Ultraviolet (UV) photon-excitation, which opens up a new avenue to probe interstellar magnetic fields. The method is applicable to all radiative-excitation dominant region, e.g., H II Regions, PDRs. The polarization of the submillimeter fine-structure lines induced by atomic alignment could be substantial and the applicability of using the spectro-polarimetry of atomic lines to trace magnetic fields has been supported by synthetic observations of simulated ISM in our recent paper. Our results demonstrate that the polarization of submillimeter atomic lines is a powerful magnetic tracer and add great value to the observational studies of the submilimeter astronomy.

  16. Parallelism and Scalability in an Image Processing Application

    DEFF Research Database (Denmark)

    Rasmussen, Morten Sleth; Stuart, Matthias Bo; Karlsson, Sven

    2008-01-01

    The recent trends in processor architecture show that parallel processing is moving into new areas of computing in the form of many-core desktop processors and multi-processor system-on-chip. This means that parallel processing is required in application areas that traditionally have not used parallel programs. This paper investigates parallelism and scalability of an embedded image processing application. The major challenges faced when parallelizing the application were to extract enough parallelism from the application and to reduce load imbalance. The application has limited immediately...

  17. Parallelism and Scalability in an Image Processing Application

    DEFF Research Database (Denmark)

    Rasmussen, Morten Sleth; Stuart, Matthias Bo; Karlsson, Sven

    2009-01-01

    The recent trends in processor architecture show that parallel processing is moving into new areas of computing in the form of many-core desktop processors and multi-processor system-on-chips. This means that parallel processing is required in application areas that traditionally have not used parallel programs. This paper investigates parallelism and scalability of an embedded image processing application. The major challenges faced when parallelizing the application were to extract enough parallelism from the application and to reduce load imbalance. The application has limited immediately...

  18. Parallel auto-correlative statistics with VTK.

    Energy Technology Data Exchange (ETDEWEB)

    Pebay, Philippe Pierre; Bennett, Janine Camille

    2013-08-01

    This report summarizes existing statistical engines in VTK and presents both the serial and parallel auto-correlative statistics engines. It is a sequel to [PT08, BPRT09b, PT09, BPT09, PT10] which studied the parallel descriptive, correlative, multi-correlative, principal component analysis, contingency, k-means, and order statistics engines. The ease of use of the new parallel auto-correlative statistics engine is illustrated by the means of C++ code snippets and algorithm verification is provided. This report justifies the design of the statistics engines with parallel scalability in mind, and provides scalability and speed-up analysis results for the autocorrelative statistics engine.
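
    The statistic computed by the engine is the auto-correlation of a series at a set of time lags. The short sketch below shows the serial computation only; the parallel engine described in the report distributes and aggregates the underlying sums across ranks, which is omitted here. The series and lags are illustrative.

```python
# Auto-correlation at a set of time lags (lagged, mean-removed inner products
# normalized by the lag-0 value).  Serial sketch only.
import numpy as np

def autocorrelation(x, lags):
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)
    return {lag: (float(np.dot(x[:-lag], x[lag:]) / denom) if lag else 1.0) for lag in lags}

if __name__ == "__main__":
    t = np.arange(2000)
    series = np.sin(2 * np.pi * t / 50) + 0.3 * np.random.default_rng(1).standard_normal(t.size)
    print(autocorrelation(series, lags=[0, 1, 10, 25, 50]))   # lag 50 ~ one full period
```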

  19. Mouse-tracking evidence for parallel anticipatory option evaluation.

    Science.gov (United States)

    Cranford, Edward A; Moss, Jarrod

    2017-12-23

    In fast-paced, dynamic tasks, the ability to anticipate the future outcome of a sequence of events is crucial to quickly selecting an appropriate course of action among multiple alternative options. There are two classes of theories that describe how anticipation occurs. Serial theories assume options are generated and evaluated one at a time, in order of quality, whereas parallel theories assume simultaneous generation and evaluation. The present research examined the option evaluation process during a task designed to be analogous to prior anticipation tasks, but within the domain of narrative text comprehension. Prior research has relied on indirect, off-line measurement of the option evaluation process during anticipation tasks. Because the movement of the hand can provide a window into underlying cognitive processes, online metrics such as continuous mouse tracking provide more fine-grained measurements of cognitive processing as it occurs in real time. In this study, participants listened to three-sentence stories and predicted the protagonists' final action by moving a mouse toward one of three possible options. Each story was presented with either one (control condition) or two (distractor condition) plausible ending options. Results seem most consistent with a parallel option evaluation process because initial mouse trajectories deviated further from the best option in the distractor condition compared to the control condition. It is difficult to completely rule out all possible serial processing accounts, although the results do place constraints on the time frame in which a serial processing explanation must operate.

  20. CALCULATION METHOD OF ELECTRIC POWER LINES MAGNETIC FIELD STRENGTH BASED ON CYLINDRICAL SPATIAL HARMONICS

    Directory of Open Access Journals (Sweden)

    A.V. Erisov

    2016-05-01

    Purpose. To simplify the design relations used to determine the magnetic field strength of electric power lines and to assess their environmental safety. Methodology. The magnetic field of transmission lines is described by means of spatial harmonic analysis in a cylindrical coordinate system. Results. For engineering calculations, the magnetic field of electric power lines is described with sufficient accuracy by its first spatial harmonic. Originality. The influence of the transmission-line tower design on the magnitude of the magnetic field and on the width of the land-alienation strip can be determined in a substantially simpler way. Practical value. Environmentally sound design of electric power lines with respect to the magnetic field level.
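    The record gives no formulas, so the sketch below is only a generic illustration of the underlying physics, not the paper's harmonic expressions: each phase conductor contributes a field of magnitude I/(2*pi*r), and the phasor sum over the three phases gives the field strength at a point. The geometry and currents are invented for the example.

    ```python
    # Sketch: magnetic field strength near an overhead line as the phasor sum of
    # the fields of its phase conductors, H = I / (2*pi*r) per conductor.
    # Conductor positions and currents below are illustrative assumptions.
    import numpy as np

    def field_strength(point, conductors):
        """RMS magnetic field strength (A/m) at `point`, given RMS phase currents.

        conductors: list of (x, y, complex_phasor_current_A) tuples.
        """
        px, py = point
        H = np.zeros(2, dtype=complex)
        for cx, cy, I in conductors:
            dx, dy = px - cx, py - cy
            r2 = dx * dx + dy * dy
            # Long straight conductor: magnitude I/(2*pi*r), direction perpendicular to r.
            H += I / (2 * np.pi * r2) * np.array([-dy, dx])
        return np.sqrt(np.sum(np.abs(H) ** 2))

    if __name__ == "__main__":
        I = 300.0                      # A (RMS) per phase, illustrative
        a = np.exp(2j * np.pi / 3)     # 120-degree phase rotation
        phases = [(-4.0, 12.0, I), (0.0, 12.0, I * a), (4.0, 12.0, I * a**2)]
        for x in (0.0, 10.0, 20.0, 30.0):
            print(f"x = {x:5.1f} m   H = {field_strength((x, 1.0), phases):.3f} A/m")
    ```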

  1. MAGNETICALLY DOMINATED PARALLEL INTERSTELLAR FILAMENTS IN THE INFRARED DARK CLOUD G14.225-0.506

    International Nuclear Information System (INIS)

    Santos, Fábio P.; Busquet, Gemma; Girart, Josep Miquel; Franco, Gabriel A. P.; Zhang, Qizhou

    2016-01-01

    The infrared dark cloud G14.225-0.506 (IRDC G14.2) displays a remarkable complex of parallel dense molecular filaments projected on the plane of the sky. Previous studies of dust emission and molecular lines have speculated whether magnetic fields could have played an important role in the formation of such elongated structures, which are hosts to numerous young stellar sources. In this work we have conducted a vast polarimetric survey at optical and near-infrared wavelengths in order to study the morphology of magnetic field lines in IRDC G14.2 through the observation of background stars. The orientation of interstellar polarization, which traces magnetic field lines, is perpendicular to most of the filamentary features within the cloud. Additionally, the larger-scale molecular cloud as a whole exhibits an elongated shape also perpendicular to magnetic fields. Estimates of magnetic field strengths indicate values in the range 320–550 μG, which allow sub-Alfvénic conditions, but do not prevent the gravitational collapse of hub–filament structures, which in general are close to the critical state. These characteristics suggest that magnetic fields played the main role in regulating the collapse from large to small scales, leading to the formation of series of parallel elongated structures. The morphology is also consistent with numerical simulations that show how gravitational instabilities develop when subjected to strong magnetic fields. Finally, the results corroborate the hypothesis that strong support from internal magnetic fields might explain why the cloud seems to be contracting on a timescale 2–3 times longer than what is expected from a free-fall collapse.

  2. MAGNETICALLY DOMINATED PARALLEL INTERSTELLAR FILAMENTS IN THE INFRARED DARK CLOUD G14.225-0.506

    Energy Technology Data Exchange (ETDEWEB)

    Santos, Fábio P. [Department of Physics and Astronomy, Northwestern University, 2145 Sheridan Road, Evanston, IL 60208 (United States); Busquet, Gemma; Girart, Josep Miquel [Institut de Ciències de l’Espai (CSIC-IEEC), Campus UAB, Carrer de Can Magrans, S/N E-08193 Bellaterra, Catalunya (Spain); Franco, Gabriel A. P. [Departamento de Física—ICEx—UFMG, Caixa Postal 702, 30.123-970 Belo Horizonte, MG (Brazil); Zhang, Qizhou, E-mail: fabiops@northwestern.edu, E-mail: busquet@ice.cat, E-mail: girart@ice.cat, E-mail: franco@fisica.ufmg.br, E-mail: qzhang@cfa.harvard.edu [Harvard-Smithsonian Center for Astrophysics, 60, Garden Street, Cambridge, MA 02138 (United States)

    2016-12-01

    The infrared dark cloud G14.225-0.506 (IRDC G14.2) displays a remarkable complex of parallel dense molecular filaments projected on the plane of the sky. Previous studies of dust emission and molecular lines have speculated whether magnetic fields could have played an important role in the formation of such elongated structures, which are hosts to numerous young stellar sources. In this work we have conducted a vast polarimetric survey at optical and near-infrared wavelengths in order to study the morphology of magnetic field lines in IRDC G14.2 through the observation of background stars. The orientation of interstellar polarization, which traces magnetic field lines, is perpendicular to most of the filamentary features within the cloud. Additionally, the larger-scale molecular cloud as a whole exhibits an elongated shape also perpendicular to magnetic fields. Estimates of magnetic field strengths indicate values in the range 320–550 μG, which allow sub-Alfvénic conditions, but do not prevent the gravitational collapse of hub–filament structures, which in general are close to the critical state. These characteristics suggest that magnetic fields played the main role in regulating the collapse from large to small scales, leading to the formation of series of parallel elongated structures. The morphology is also consistent with numerical simulations that show how gravitational instabilities develop when subjected to strong magnetic fields. Finally, the results corroborate the hypothesis that strong support from internal magnetic fields might explain why the cloud seems to be contracting on a timescale 2–3 times longer than what is expected from a free-fall collapse.
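    The record does not spell out how the 320–550 μG estimates were obtained. Polarimetric surveys of this kind commonly use the Davis-Chandrasekhar-Fermi relation, so the following is a generic sketch of that estimate with invented input values, not the authors' calculation.

    ```python
    # Sketch: Davis-Chandrasekhar-Fermi estimate of the plane-of-sky field strength,
    # B_pos ~ Q * sqrt(4*pi*rho) * sigma_v / sigma_theta   (CGS units).
    # The input values below are illustrative assumptions, not taken from the paper.
    import math

    def dcf_field_microgauss(n_h2_cm3, sigma_v_kms, sigma_theta_deg, Q=0.5):
        mu = 2.8                       # mean molecular weight per H2 (including He)
        m_h = 1.6726e-24               # proton mass, g
        rho = mu * m_h * n_h2_cm3      # mass density, g cm^-3
        sigma_v = sigma_v_kms * 1e5    # velocity dispersion, cm s^-1
        sigma_theta = math.radians(sigma_theta_deg)   # polarization-angle dispersion
        b_gauss = Q * math.sqrt(4 * math.pi * rho) * sigma_v / sigma_theta
        return b_gauss * 1e6           # microgauss

    if __name__ == "__main__":
        print(f"B_pos ~ {dcf_field_microgauss(1e4, 0.7, 10.0):.0f} uG")
    ```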

  3. Parallel plasma fluid turbulence calculations

    International Nuclear Information System (INIS)

    Leboeuf, J.N.; Carreras, B.A.; Charlton, L.A.; Drake, J.B.; Lynch, V.E.; Newman, D.E.; Sidikman, K.L.; Spong, D.A.

    1994-01-01

    The study of plasma turbulence and transport is a complex problem of critical importance for fusion-relevant plasmas. To this day, the fluid treatment of plasma dynamics is the best approach to realistic physics at the high resolution required for certain experimentally relevant calculations. Core and edge turbulence in a magnetic fusion device have been modeled using state-of-the-art, nonlinear, three-dimensional, initial-value fluid and gyrofluid codes. Parallel implementation of these models on diverse platforms--vector parallel (National Energy Research Supercomputer Center's CRAY Y-MP C90), massively parallel (Intel Paragon XP/S 35), and serial parallel (clusters of high-performance workstations using the Parallel Virtual Machine protocol)--offers a variety of paths to high resolution and significant improvements in real-time efficiency, each with its own advantages. The largest and most efficient calculations have been performed at the 200 Mword memory limit on the C90 in dedicated mode, where an overlap of 12 to 13 out of a maximum of 16 processors has been achieved with a gyrofluid model of core fluctuations. The richness of the physics captured by these calculations is commensurate with the increased resolution and efficiency and is limited only by the ingenuity brought to the analysis of the massive amounts of data generated

  4. A task parallel implementation of fast multipole methods

    KAUST Repository

    Taura, Kenjiro

    2012-11-01

    This paper describes a task parallel implementation of ExaFMM, an open source implementation of fast multipole methods (FMM), using a lightweight task parallel library MassiveThreads. Although there have been many attempts at parallelizing FMM, experiences have almost exclusively been limited to formulations based on flat homogeneous parallel loops. FMM in fact contains operations that cannot be readily expressed in such conventional but restrictive models. We show that task parallelism, or parallel recursions in particular, allows us to parallelize all operations of FMM naturally and scalably. Moreover it allows us to parallelize a "mutual interaction" for force/potential evaluation, which is roughly twice as efficient as a more conventional, unidirectional force/potential evaluation. The net result is an open source FMM that is clearly among the fastest single node implementations, including those on GPUs; with a million particles on a 32-core Sandy Bridge 2.20 GHz node, it completes a single time step including tree construction and force/potential evaluation in 65 milliseconds. The study clearly showcases both programmability and performance benefits of flexible parallel constructs over more monolithic parallel loops. © 2012 IEEE.
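    The paper's point is that FMM traversals map naturally onto parallel recursion. The sketch below shows the same fork-join shape on a toy tree reduction, parallelizing over top-level subtrees and recursing serially below; it is a generic Python illustration, not ExaFMM or MassiveThreads code.

    ```python
    # Sketch of fork-join task parallelism over a tree: top-level subtrees are
    # evaluated as independent tasks, deeper recursion stays serial.  This mirrors
    # the shape of task-parallel tree traversals, not ExaFMM's actual kernels.
    from concurrent.futures import ProcessPoolExecutor
    from dataclasses import dataclass, field
    from typing import List

    @dataclass
    class Node:
        value: float = 0.0
        children: List["Node"] = field(default_factory=list)

    def subtree_sum(node: Node) -> float:
        """Serial recursive reduction over one subtree."""
        return node.value + sum(subtree_sum(c) for c in node.children)

    def parallel_tree_sum(root: Node, workers: int = 4) -> float:
        """Fork one task per top-level child, then join the partial sums."""
        with ProcessPoolExecutor(max_workers=workers) as pool:
            partials = pool.map(subtree_sum, root.children)
        return root.value + sum(partials)

    if __name__ == "__main__":
        # A small 3-level tree with value 1.0 at every node: 1 + 4 + 16 nodes.
        root = Node(1.0, [Node(1.0, [Node(1.0) for _ in range(4)]) for _ in range(4)])
        print(parallel_tree_sum(root))   # 21.0
    ```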

  5. LS1 Report: The cryogenic line goes through the scanner

    CERN Multimedia

    CERN Bulletin

    2013-01-01

    In spite of the complexity of LS1, with many different activities taking place in parallel and sometimes overlapping, the dashboard shows that work is progressing on schedule. This week, teams have started X-raying the cryogenic line to examine its condition in minute detail.   The LS1 schedule is pretty unfathomable for those who don't work in the tunnels or installations, but if you look down all the columns and stop at the line indicating today’s date, you can see that all of the priority and critical items are bang on time, like a Swiss watch. More specifically: the SMACC project in the LHC is on schedule, with a new testing phase for the interconnections which have already been consolidated; preparations are under way for the cable replacement campaign at Point 1 of the SPS (about 20% of the cables will not be replaced as they are completely unused); and the demineralised water distribution line is back in service, as are the electrical substations for the 400 and 66 kV line...

  6. Second derivative parallel block backward differentiation type ...

    African Journals Online (AJOL)

    Second derivative parallel block backward differentiation type formulas for Stiff ODEs. ... and the methods are inherently parallel and can be distributed over parallel processors. They are ...

  7. Parallelization of quantum molecular dynamics simulation code

    International Nuclear Information System (INIS)

    Kato, Kaori; Kunugi, Tomoaki; Shibahara, Masahiko; Kotake, Susumu

    1998-02-01

    A quantum molecular dynamics simulation code has been developed at the Kansai Research Establishment for analyzing the thermalization of photon energies in molecules and materials. The code has been parallelized for both a scalar massively parallel computer (Intel Paragon XP/S75) and a vector parallel computer (Fujitsu VPP300/12). Scalable speed-up was obtained on both machines by distributing the work to processor units by division of the particle group. On the Intel Paragon XP/S75, high parallel performance was achieved by distributing the work not only by particle group but also over the fine-grained per-particle calculations. (author)

  8. A Parallel Approach to Fractal Image Compression

    OpenAIRE

    Lubomir Dedera

    2004-01-01

    The paper deals with a parallel approach to coding and decoding algorithms in fractal image compression and presents experimental results comparing sequential and parallel algorithms from the point of view of both achieved coding and decoding time and effectiveness of parallelization.

  9. Differences Between Distributed and Parallel Systems

    Energy Technology Data Exchange (ETDEWEB)

    Brightwell, R.; Maccabe, A.B.; Rissen, R.

    1998-10-01

    Distributed systems have been studied for twenty years and are now coming into wider use as fast networks and powerful workstations become more readily available. In many respects a massively parallel computer resembles a network of workstations and it is tempting to port a distributed operating system to such a machine. However, there are significant differences between these two environments and a parallel operating system is needed to get the best performance out of a massively parallel system. This report characterizes the differences between distributed systems, networks of workstations, and massively parallel systems and analyzes the impact of these differences on operating system design. In the second part of the report, we introduce Puma, an operating system specifically developed for massively parallel systems. We describe Puma portals, the basic building blocks for message passing paradigms implemented on top of Puma, and show how the differences observed in the first part of the report have influenced the design and implementation of Puma.

  10. Parallel processing from applications to systems

    CERN Document Server

    Moldovan, Dan I

    1993-01-01

    This text provides one of the broadest presentations of parallel processing available, including the structure of parallel processors and parallel algorithms. The emphasis is on mapping algorithms to highly parallel computers, with extensive coverage of array and multiprocessor architectures. Early chapters provide insightful coverage on the analysis of parallel algorithms and program transformations, effectively integrating a variety of material previously scattered throughout the literature. Theory and practice are well balanced across diverse topics in this concise presentation. For exceptional cla ...

  11. A survey of parallel multigrid algorithms

    Science.gov (United States)

    Chan, Tony F.; Tuminaro, Ray S.

    1987-01-01

    A typical multigrid algorithm applied to well-behaved linear-elliptic partial-differential equations (PDEs) is described. Criteria for designing and evaluating parallel algorithms are presented. Before evaluating the performance of some parallel multigrid algorithms, consideration is given to some theoretical complexity results for solving PDEs in parallel and for executing the multigrid algorithm. The effect of mapping and load imbalance on the parallel efficiency of the algorithm is studied.
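    To make concrete the ingredients that the surveyed parallel variants distribute (smoothing, restriction, prolongation, coarse-grid correction), here is a minimal serial V-cycle for the 1D Poisson problem; it is a generic textbook sketch, not one of the parallel algorithms from the survey.

    ```python
    # Sketch: a serial multigrid V-cycle for -u'' = f on a uniform 1D grid,
    # showing the smoother / restrict / prolong structure that parallel
    # multigrid algorithms distribute across processors.
    import numpy as np

    def jacobi(u, f, h, sweeps=3, omega=2/3):
        """Weighted Jacobi smoothing sweeps."""
        for _ in range(sweeps):
            u[1:-1] += omega * 0.5 * (u[:-2] + u[2:] - 2*u[1:-1] + h*h*f[1:-1])
        return u

    def residual(u, f, h):
        r = np.zeros_like(u)
        r[1:-1] = f[1:-1] + (u[:-2] - 2*u[1:-1] + u[2:]) / (h*h)
        return r

    def restrict(r):      # full weighting: fine -> coarse
        return np.concatenate(([0.0], 0.25*r[1:-2:2] + 0.5*r[2:-1:2] + 0.25*r[3::2], [0.0]))

    def prolong(e):       # linear interpolation: coarse -> fine
        fine = np.zeros(2*len(e) - 1)
        fine[::2] = e
        fine[1::2] = 0.5 * (e[:-1] + e[1:])
        return fine

    def v_cycle(u, f, h):
        if len(u) <= 3:                       # coarsest grid: solve directly
            u[1] = 0.5 * (h*h*f[1] + u[0] + u[2])
            return u
        u = jacobi(u, f, h)                   # pre-smooth
        rc = restrict(residual(u, f, h))      # restrict the residual
        ec = v_cycle(np.zeros_like(rc), rc, 2*h)
        u += prolong(ec)                      # coarse-grid correction
        return jacobi(u, f, h)                # post-smooth

    if __name__ == "__main__":
        n = 129                               # 2^k + 1 grid points
        h = 1.0 / (n - 1)
        x = np.linspace(0.0, 1.0, n)
        f = np.pi**2 * np.sin(np.pi * x)      # exact solution is sin(pi x)
        u = np.zeros(n)
        for _ in range(10):
            u = v_cycle(u, f, h)
        print("max error:", np.abs(u - np.sin(np.pi * x)).max())
    ```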

  12. A Parallel, Multi-Scale Watershed-Hydrologic-Inundation Model with Adaptively Switching Mesh for Capturing Flooding and Lake Dynamics

    Science.gov (United States)

    Ji, X.; Shen, C.

    2017-12-01

    Flood inundation presents substantial societal hazards and also changes biogeochemistry for systems like the Amazon. It is often expensive to simulate high-resolution flood inundation and propagation in a long-term watershed-scale model. Due to the Courant-Friedrichs-Lewy (CFL) restriction, high resolution and large local flow velocity both demand prohibitively small time steps even for parallel codes. Here we develop a parallel surface-subsurface process-based model enhanced by multi-resolution meshes that are adaptively switched on or off. The high-resolution overland flow meshes are enabled only when the flood wave invades the floodplains. This model applies a semi-implicit, semi-Lagrangian (SISL) scheme in solving the dynamic wave equations and, with the assistance of the multi-mesh method, it also adaptively chooses the dynamic wave equation only in areas of deep inundation. Therefore, the model achieves a balance between accuracy and computational cost.
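    The CFL restriction referred to above ties the largest stable explicit time step to the mesh spacing and the local flow speed, which is why refining the floodplain mesh is expensive. A one-line sketch with invented values:

    ```python
    # Sketch: the CFL restriction that motivates the adaptive meshes above.
    # dt <= C * dx / |u|  ->  halving dx, or doubling the flow speed, halves dt.
    def cfl_time_step(dx_m, velocity_ms, courant=0.7):
        """Largest stable explicit time step (s) for spacing dx and flow speed u."""
        return courant * dx_m / abs(velocity_ms)

    if __name__ == "__main__":
        for dx in (30.0, 10.0, 2.0):     # coarse hillslope mesh vs fine floodplain mesh
            print(dx, "m ->", round(cfl_time_step(dx, velocity_ms=3.0), 3), "s")
    ```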

  13. Parallel computing by Monte Carlo codes MVP/GMVP

    International Nuclear Information System (INIS)

    Nagaya, Yasunobu; Nakagawa, Masayuki; Mori, Takamasa

    2001-01-01

    General-purpose Monte Carlo codes MVP/GMVP are well-vectorized and thus enable us to perform high-speed Monte Carlo calculations. In order to achieve further speedups, we parallelized the codes on different types of parallel computing platforms or by using the standard parallelization library MPI. The platforms used for benchmark calculations are a distributed-memory vector-parallel computer Fujitsu VPP500, a distributed-memory massively parallel computer Intel Paragon and distributed-memory scalar-parallel computers Hitachi SR2201 and IBM SP2. As is generally the case, linear speedup could be obtained for large-scale problems, but parallelization efficiency decreased as the batch size per processing element (PE) became smaller. It was also found that the statistical uncertainty for assembly powers was less than 0.1% in the PWR full-core calculation with more than 10 million histories, which took about 1.5 hours with massively parallel computing. (author)

  14. The parallel processing of EGS4 code on distributed memory scalar parallel computer: Intel Paragon XP/S15-256

    Energy Technology Data Exchange (ETDEWEB)

    Takemiya, Hiroshi; Ohta, Hirofumi; Honma, Ichirou

    1996-03-01

    The parallelization of the electromagnetic cascade Monte Carlo simulation code EGS4 on the distributed-memory scalar parallel computer Intel Paragon XP/S15-256 is described. EGS4 has the feature that the calculation time for one incident particle differs greatly from particle to particle because of the dynamic generation of secondary particles and the different behavior of each particle. Granularity for parallel processing, the parallel programming model and the algorithm of parallel random number generation are discussed, and two kinds of method, which allocate particles dynamically or statically, are used for the purpose of realizing high-speed parallel processing of this code. Among the four problems chosen for performance evaluation, the speedup factors for three problems reached nearly 100 times with 128 processors. It has been found that when both the calculation time for each incident particle and its dispersion are large, it is preferable to use the dynamic particle allocation method, which can average the load over the processors. It has also been found that when they are small, it is preferable to use the static particle allocation method, which reduces the communication overhead. Moreover, it is pointed out that to get accurate results it is necessary to use double precision variables in the EGS4 code. Finally, the workflow of program parallelization is analyzed and tools for program parallelization, drawing on the experience of the EGS4 parallelization, are discussed. (author).

  15. Fast parallel tandem mass spectral library searching using GPU hardware acceleration.

    Science.gov (United States)

    Baumgardner, Lydia Ashleigh; Shanmugam, Avinash Kumar; Lam, Henry; Eng, Jimmy K; Martin, Daniel B

    2011-06-03

    Mass spectrometry-based proteomics is a maturing discipline of biologic research that is experiencing substantial growth. Instrumentation has steadily improved over time with the advent of faster and more sensitive instruments collecting ever larger data files. Consequently, the computational process of matching a peptide fragmentation pattern to its sequence, traditionally accomplished by sequence database searching and more recently also by spectral library searching, has become a bottleneck in many mass spectrometry experiments. In both of these methods, the main rate-limiting step is the comparison of an acquired spectrum with all potential matches from a spectral library or sequence database. This is a highly parallelizable process because the core computational element can be represented as a simple but arithmetically intense multiplication of two vectors. In this paper, we present a proof of concept project taking advantage of the massively parallel computing available on graphics processing units (GPUs) to distribute and accelerate the process of spectral assignment using spectral library searching. This program, which we have named FastPaSS (for Fast Parallelized Spectral Searching), is implemented in CUDA (Compute Unified Device Architecture) from NVIDIA, which allows direct access to the processors in an NVIDIA GPU. Our efforts demonstrate the feasibility of GPU computing for spectral assignment, through implementation of the validated spectral searching algorithm SpectraST in the CUDA environment.
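    The scoring step described above, comparing one acquired spectrum against every library entry, reduces to many dot products once the spectra are binned to a common m/z grid; that vector multiplication is the operation the paper offloads to the GPU. The CPU sketch below shows the same batched formulation with random stand-in data; it is not the FastPaSS or SpectraST code.

    ```python
    # Sketch: the core spectral-library scoring step as one batched operation.
    # Library spectra and the query share a common m/z binning, so the scores for
    # all candidates reduce to a single matrix-vector product.  Random stand-ins
    # replace real spectra here.
    import numpy as np

    def dot_scores(library, query):
        """Normalized dot-product score of `query` against every library row."""
        lib = library / np.linalg.norm(library, axis=1, keepdims=True)
        q = query / np.linalg.norm(query)
        return lib @ q

    if __name__ == "__main__":
        rng = np.random.default_rng(1)
        library = rng.random((10_000, 1_000))     # 10k spectra, 1k m/z bins
        query = rng.random(1_000)
        scores = dot_scores(library, query)
        print("best match:", scores.argmax(), "score:", round(float(scores.max()), 4))
    ```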

  16. Simulating Hydrologic Flow and Reactive Transport with PFLOTRAN and PETSc on Emerging Fine-Grained Parallel Computer Architectures

    Science.gov (United States)

    Mills, R. T.; Rupp, K.; Smith, B. F.; Brown, J.; Knepley, M.; Zhang, H.; Adams, M.; Hammond, G. E.

    2017-12-01

    As the high-performance computing community pushes towards the exascale horizon, power and heat considerations have driven the increasing importance and prevalence of fine-grained parallelism in new computer architectures. High-performance computing centers have become increasingly reliant on GPGPU accelerators and "manycore" processors such as the Intel Xeon Phi line, and 512-bit SIMD registers have even been introduced in the latest generation of Intel's mainstream Xeon server processors. The high degree of fine-grained parallelism and more complicated memory hierarchy considerations of such "manycore" processors present several challenges to existing scientific software. Here, we consider how the massively parallel, open-source hydrologic flow and reactive transport code PFLOTRAN - and the underlying Portable, Extensible Toolkit for Scientific Computation (PETSc) library on which it is built - can best take advantage of such architectures. We will discuss some key features of these novel architectures and our code optimizations and algorithmic developments targeted at them, and present experiences drawn from working with a wide range of PFLOTRAN benchmark problems on these architectures.

  17. Towards a streaming model for nested data parallelism

    DEFF Research Database (Denmark)

    Madsen, Frederik Meisner; Filinski, Andrzej

    2013-01-01

    The language-integrated cost semantics for nested data parallelism pioneered by NESL provides an intuitive, high-level model for predicting performance and scalability of parallel algorithms with reasonable accuracy. However, this predictability, obtained through a uniform, parallelism-flattening ... -processable in a streaming fashion. This semantics is directly compatible with previously proposed piecewise execution models for nested data parallelism, but allows the expected space usage to be reasoned about directly at the source-language level. The language definition and implementation are still very much work ...

  18. Development of a Stereo Vision Measurement System for a 3D Three-Axial Pneumatic Parallel Mechanism Robot Arm

    Directory of Open Access Journals (Sweden)

    Chien-Lun Hou

    2011-02-01

    In this paper, a stereo vision 3D position measurement system for a three-axial pneumatic parallel mechanism robot arm is presented. The stereo vision 3D position measurement system aims to measure the 3D trajectories of the end-effector of the robot arm. To track the end-effector of the robot arm, the circle detection algorithm is used to detect the desired target and the SAD algorithm is used to track the moving target and to search the corresponding target location along the conjugate epipolar line in the stereo pair. After camera calibration, both intrinsic and extrinsic parameters of the stereo rig can be obtained, so images can be rectified according to the camera parameters. Thus, through the epipolar rectification, the stereo matching process is reduced to a horizontal search along the conjugate epipolar line. Finally, 3D trajectories of the end-effector are computed by stereo triangulation. The experimental results show that the stereo vision 3D position measurement system proposed in this paper can successfully track and measure the fifth-order polynomial trajectory and sinusoidal trajectory of the end-effector of the three-axial pneumatic parallel mechanism robot arm.
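    After rectification, a match found along the epipolar line gives a disparity, and triangulation then reduces to a closed-form expression. The sketch below shows that last step for a rectified pair; the camera parameters are invented, not those of the paper's stereo rig.

    ```python
    # Sketch: depth from disparity for a rectified stereo pair, the final step of
    # the pipeline described above.  Camera parameters here are illustrative only.
    def triangulate(u_left, u_right, v, focal_px, baseline_m, cx, cy):
        """Return the 3D point (X, Y, Z) in metres for one matched pixel pair."""
        disparity = u_left - u_right            # pixels, > 0 for a rectified pair
        Z = focal_px * baseline_m / disparity   # depth along the optical axis
        X = (u_left - cx) * Z / focal_px
        Y = (v - cy) * Z / focal_px
        return X, Y, Z

    if __name__ == "__main__":
        point = triangulate(u_left=652.0, u_right=610.0, v=300.0,
                            focal_px=800.0, baseline_m=0.12, cx=640.0, cy=360.0)
        print("X=%.3f m  Y=%.3f m  Z=%.3f m" % point)
    ```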

  19. Massively Parallel Computing: A Sandia Perspective

    Energy Technology Data Exchange (ETDEWEB)

    Dosanjh, Sudip S.; Greenberg, David S.; Hendrickson, Bruce; Heroux, Michael A.; Plimpton, Steve J.; Tomkins, James L.; Womble, David E.

    1999-05-06

    The computing power available to scientists and engineers has increased dramatically in the past decade, due in part to progress in making massively parallel computing practical and available. The expectation for these machines has been great. The reality is that progress has been slower than expected. Nevertheless, massively parallel computing is beginning to realize its potential for enabling significant break-throughs in science and engineering. This paper provides a perspective on the state of the field, colored by the authors' experiences using large scale parallel machines at Sandia National Laboratories. We address trends in hardware, system software and algorithms, and we also offer our view of the forces shaping the parallel computing industry.

  20. Parallel Algorithms for the Exascale Era

    Energy Technology Data Exchange (ETDEWEB)

    Robey, Robert W. [Los Alamos National Laboratory

    2016-10-19

    New parallel algorithms are needed to reach the Exascale level of parallelism with millions of cores. We look at some of the research developed by students in projects at LANL. The research blends ideas from the early days of computing while weaving in the fresh approach brought by students new to the field of high performance computing. We look at reproducibility of global sums and why it is important to parallel computing. Next we look at how the concept of hashing has led to the development of more scalable algorithms suitable for next-generation parallel computers. Nearly all of this work has been done by undergraduates and published in leading scientific journals.
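    On the reproducibility of global sums mentioned above: floating-point addition is not associative, so the bits of a parallel reduction depend on how the ranks partition and combine the data. The sketch below demonstrates the effect and one generic remedy (an exactly rounded sum); it illustrates the problem, not the specific LANL algorithms.

    ```python
    # Sketch: why parallel global sums are not reproducible, and one remedy.
    # The result of a hierarchical sum depends on the partitioning; an exactly
    # rounded summation makes the answer independent of the combination order.
    import math, random

    def simulated_reduction(data, nranks):
        """Per-rank partial sums followed by a sum of the partials."""
        chunk = (len(data) + nranks - 1) // nranks
        partials = [sum(data[i:i + chunk]) for i in range(0, len(data), chunk)]
        return sum(partials)

    if __name__ == "__main__":
        random.seed(0)
        data = [random.uniform(-1e12, 1e12) for _ in range(100_000)]
        print("4 ranks :", repr(simulated_reduction(data, 4)))
        print("8 ranks :", repr(simulated_reduction(data, 8)))   # usually differs in the last bits
        print("fsum    :", repr(math.fsum(data)), "(identical for any ordering)")
    ```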

  1. A Parallel Approach to Fractal Image Compression

    Directory of Open Access Journals (Sweden)

    Lubomir Dedera

    2004-01-01

    The paper deals with a parallel approach to coding and decoding algorithms in fractal image compression and presents experimental results comparing sequential and parallel algorithms from the point of view of both achieved coding and decoding time and effectiveness of parallelization.

  2. A parallelization study of the general purpose Monte Carlo code MCNP4 on a distributed memory highly parallel computer

    International Nuclear Information System (INIS)

    Yamazaki, Takao; Fujisaki, Masahide; Okuda, Motoi; Takano, Makoto; Masukawa, Fumihiro; Naito, Yoshitaka

    1993-01-01

    The general-purpose Monte Carlo code MCNP4 has been implemented on the Fujitsu AP1000 distributed-memory highly parallel computer. Parallelization techniques developed and studied are reported. A shielding analysis function of the MCNP4 code is parallelized in this study. A technique to map each history dynamically to a processor and to map the control process to a fixed processor was applied. The efficiency of the parallelized code is up to 80% for a typical practical problem with 512 processors. These results demonstrate the advantages of a highly parallel computer over conventional computers in the field of shielding analysis by the Monte Carlo method. (orig.)

  3. Performance Analysis of Parallel Mathematical Subroutine library PARCEL

    International Nuclear Information System (INIS)

    Yamada, Susumu; Shimizu, Futoshi; Kobayashi, Kenichi; Kaburaki, Hideo; Kishida, Norio

    2000-01-01

    The parallel mathematical subroutine library PARCEL (Parallel Computing Elements) has been developed by the Japan Atomic Energy Research Institute so that typical parallelized mathematical routines can be used easily in application programs on distributed-memory parallel computers. PARCEL includes routines for linear equations, eigenvalue problems, pseudo-random number generation, and fast Fourier transforms. Performance results show that the linear-equation routines achieve good parallel efficiency on vector, as well as scalar, parallel computers. A comparison of the efficiency results with the PETSc (Portable, Extensible Toolkit for Scientific Computation) library is also reported. (author)

  4. Applications of the parallel computing system using network

    International Nuclear Information System (INIS)

    Ido, Shunji; Hasebe, Hiroki

    1994-01-01

    Parallel programming is applied to multiple processors connected via Ethernet. Data exchange between tasks located on the individual processing elements is realized in two ways. One uses sockets, a standard library on recent UNIX operating systems. The other uses the Parallel Virtual Machine (PVM), free network-connection software developed by ORNL that allows many workstations connected to a network to be used as a parallel computer. This paper discusses the suitability of parallel computing over a network of UNIX workstations, and compares it with specialized parallel systems (Transputer and iPSC/860), for a Monte Carlo simulation, which generally exhibits a high parallelization ratio. (author)

  5. Balanced, parallel operation of flashlamps

    International Nuclear Information System (INIS)

    Carder, B.M.; Merritt, B.T.

    1979-01-01

    A new energy store, the Compensated Pulsed Alternator (CPA), promises to be a cost effective substitute for capacitors to drive flashlamps that pump large Nd:glass lasers. Because the CPA is large and discrete, it will be necessary that it drive many parallel flashlamp circuits, presenting a problem in equal current distribution. Current division to +- 20% between parallel flashlamps has been achieved, but this is marginal for laser pumping. A method is presented here that provides equal current sharing to about 1%, and it includes fused protection against short circuit faults. The method was tested with eight parallel circuits, including both open-circuit and short-circuit fault tests

  6. Bayer image parallel decoding based on GPU

    Science.gov (United States)

    Hu, Rihui; Xu, Zhiyong; Wei, Yuxing; Sun, Shaohua

    2012-11-01

    In photoelectric tracking systems, Bayer images are traditionally decoded on the CPU. However, this becomes too slow when the images are large, for example 2K×2K×16 bit. In order to accelerate Bayer image decoding, this paper introduces a parallel speedup method for NVIDIA's Graphics Processing Unit (GPU), which supports the CUDA architecture. The decoding procedure can be divided into three parts: the first is the serial part, the second is the task-parallelism part, and the last is the data-parallelism part, comprising inverse quantization, the inverse discrete wavelet transform (IDWT) and image post-processing. To reduce the execution time, the task-parallelism part is optimized with OpenMP techniques. The data-parallelism part gains its efficiency by executing on the GPU as a CUDA parallel program. The optimization techniques include instruction optimization, shared memory access optimization, coalesced memory access optimization and texture memory optimization. In particular, the IDWT is significantly accelerated by rewriting the two-dimensional serial IDWT as a one-dimensional parallel IDWT. In experiments with a 1K×1K×16 bit Bayer image, the data-parallelism part is more than 10 times faster than the CPU-based implementation. Finally, a CPU+GPU heterogeneous decompression system was designed. The experimental results show that it achieves a 3 to 5 times speed increase compared with the serial CPU method.

  7. Refinement of Parallel and Reactive Programs

    OpenAIRE

    Back, R. J. R.

    1992-01-01

    We show how to apply the refinement calculus to stepwise refinement of parallel and reactive programs. We use action systems as our basic program model. Action systems are sequential programs which can be implemented in a parallel fashion. Hence refinement calculus methods, originally developed for sequential programs, carry over to the derivation of parallel programs. Refinement of reactive programs is handled by data refinement techniques originally developed for the sequential refinement c...

  8. Comparative analysis of the serial/parallel numerical calculation of boiling channels thermohydraulics; Analisis comparativo del calculo numerico serie/paralelo de la termohidraulica de canales con ebullicion

    Energy Technology Data Exchange (ETDEWEB)

    Cecenas F, M., E-mail: mcf@iie.org.mx [Instituto Nacional de Electricidad y Energias Limpias, Reforma 113, Col. Palmira, 62490 Cuernavaca, Morelos (Mexico)

    2017-09-15

    A parallel channel model with boiling and point neutron kinetics is used to compare the implementation of its programming in C language through a conventional scheme and through a parallel programming scheme. In both cases the subroutines written in C are practically the same, but they differ in the way the execution of the tasks that calculate the different channels is controlled. Parallel Virtual Machine is used for the parallel solution, which allows the passing of messages between tasks to control convergence and to transfer the variables of interest between the tasks that run simultaneously on a platform equipped with a multi-core microprocessor. For some problems defined as a study case, such as the one presented in this paper, a computer with two cores can reduce the computation time to 54-56% of the time required by the same program in its conventional sequential version. Similarly, a processor with four cores can reduce the time to 22-33% of the execution time of the conventional serial version. Such substantial reductions in computation time are very encouraging for all those applications that can be parallelized and whose execution time is an important factor. (Author)
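    As a rough analogue of this channel-per-task decomposition (the original is C plus PVM message passing, not reproduced here), the sketch below distributes independent channel calculations over a process pool and reports the measured speedup; the "channel model" is a stand-in iteration, not the thermohydraulic equations.

    ```python
    # Rough analogue of a channel-per-task decomposition: independent channel
    # calculations distributed over a process pool, with the speedup reported.
    import time
    from concurrent.futures import ProcessPoolExecutor

    def solve_channel(channel_id, steps=200_000):
        """Stand-in for one channel's thermohydraulic iteration."""
        x = 0.1 * channel_id
        for _ in range(steps):
            x = 0.5 * (x + 2.0 / (x + 1.0))   # arbitrary fixed-point iteration
        return channel_id, x

    if __name__ == "__main__":
        channels = list(range(8))

        t0 = time.perf_counter()
        serial = [solve_channel(c) for c in channels]
        t_serial = time.perf_counter() - t0

        t0 = time.perf_counter()
        with ProcessPoolExecutor(max_workers=4) as pool:
            parallel = list(pool.map(solve_channel, channels))
        t_parallel = time.perf_counter() - t0

        assert serial == parallel             # identical results, different schedule
        print(f"speedup with 4 workers: {t_serial / t_parallel:.2f}x")
    ```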

  9. 26 CFR 1.42-7 - Substantially bond-financed buildings. [Reserved

    Science.gov (United States)

    2010-04-01

    26 CFR 1.42-7: Substantially bond-financed buildings. [Reserved] Internal Revenue Service, Department of the Treasury; Income Taxes; Credits Against Tax (2010-04-01 edition).

  10. Portable parallel programming in a Fortran environment

    International Nuclear Information System (INIS)

    May, E.N.

    1989-01-01

    Experience using the Argonne-developed PARMACs macro package to implement a portable parallel programming environment is described. Fortran programs with intrinsic parallelism of coarse and medium granularity are easily converted to parallel programs which are portable among a number of commercially available parallel processors in the class of shared-memory bus-based and local-memory network based MIMD processors. The parallelism is implemented using standard UNIX (tm) tools and a small number of easily understood synchronization concepts (monitors and message-passing techniques) to construct and coordinate multiple cooperating processes on one or many processors. Benchmark results are presented for parallel computers such as the Alliant FX/8, the Encore MultiMax, the Sequent Balance, the Intel iPSC/2 Hypercube and a network of Sun 3 workstations. These parallel machines are typical MIMD types with from 8 to 30 processors, each rated at from 1 to 10 MIPS processing power. The demonstration code used for this work is a Monte Carlo simulation of the response to photons of a ''nearly realistic'' lead, iron and plastic electromagnetic and hadronic calorimeter, using the EGS4 code system. 6 refs., 2 figs., 2 tabs

  11. Structured Parallel Programming Patterns for Efficient Computation

    CERN Document Server

    McCool, Michael; Robison, Arch

    2012-01-01

    Programming is now parallel programming. Much as structured programming revolutionized traditional serial programming decades ago, a new kind of structured programming, based on patterns, is relevant to parallel programming today. Parallel computing experts and industry insiders Michael McCool, Arch Robison, and James Reinders describe how to design and implement maintainable and efficient parallel algorithms using a pattern-based approach. They present both theory and practice, and give detailed concrete examples using multiple programming models. Examples are primarily given using two of th

  12. A Tutorial on Parallel and Concurrent Programming in Haskell

    Science.gov (United States)

    Peyton Jones, Simon; Singh, Satnam

    This practical tutorial introduces the features available in Haskell for writing parallel and concurrent programs. We first describe how to write semi-explicit parallel programs by using annotations to express opportunities for parallelism and to help control the granularity of parallelism for effective execution on modern operating systems and processors. We then describe the mechanisms provided by Haskell for writing explicitly parallel programs with a focus on the use of software transactional memory to help share information between threads. Finally, we show how nested data parallelism can be used to write deterministically parallel programs which allows programmers to use rich data types in data parallel programs which are automatically transformed into flat data parallel versions for efficient execution on multi-core processors.

  13. 3-D Hybrid Simulation of Quasi-Parallel Bow Shock and Its Effects on the Magnetosphere

    International Nuclear Information System (INIS)

    Lin, Y.; Wang, X.Y.

    2005-01-01

    A three-dimensional (3-D) global-scale hybrid simulation is carried out for the structure of the quasi-parallel bow shock, in particular the foreshock waves and pressure pulses. The wave evolution and interaction with the dayside magnetosphere are discussed. It is shown that diamagnetic cavities are generated in the turbulent foreshock due to the ion beam plasma interaction, and these compressional pulses lead to strong surface perturbations at the magnetopause and Alfven waves/field line resonance in the magnetosphere

  14. Parallel Computing Using Web Servers and "Servlets".

    Science.gov (United States)

    Lo, Alfred; Bloor, Chris; Choi, Y. K.

    2000-01-01

    Describes parallel computing and presents inexpensive ways to implement a virtual parallel computer with multiple Web servers. Highlights include performance measurement of parallel systems; models for using Java and intranet technology including single server, multiple clients and multiple servers, single client; and a comparison of CGI (common…

  15. Current distribution characteristics of superconducting parallel circuits

    International Nuclear Information System (INIS)

    Mori, K.; Suzuki, Y.; Hara, N.; Kitamura, M.; Tominaka, T.

    1994-01-01

    In order to increase the current-carrying capacity of the current path of a superconducting magnet system, parallel circuits such as insulated multi-strand cables or parallel persistent current switches (PCS) are used. In superconducting parallel circuits of an insulated multi-strand cable or a parallel persistent current switch (PCS), the current distribution during the current sweep, the persistent mode, and the quench process was investigated. Two methods were used to measure the current distribution. (1) Each strand was surrounded by a pure iron core with an air gap, in which a Hall probe was located. The accuracy of this method was degraded by the magnetic hysteresis of the iron. (2) A Rogowski coil without iron was used for the current measurement of each path in a 4-parallel PCS. As a result, it was shown that the current distribution characteristics of a parallel PCS are very similar to those of an insulated multi-strand cable during the quench process.

  16. Parallel hierarchical global illumination

    Energy Technology Data Exchange (ETDEWEB)

    Snell, Quinn O. [Iowa State Univ., Ames, IA (United States)

    1997-10-08

    Solving the global illumination problem is equivalent to determining the intensity of every wavelength of light in all directions at every point in a given scene. The complexity of the problem has led researchers to use approximation methods for solving the problem on serial computers. Rather than using an approximation method, such as backward ray tracing or radiosity, the authors have chosen to solve the Rendering Equation by direct simulation of light transport from the light sources. This paper presents an algorithm that solves the Rendering Equation to any desired accuracy, and can be run in parallel on distributed memory or shared memory computer systems with excellent scaling properties. It appears superior in both speed and physical correctness to recent published methods involving bidirectional ray tracing or hybrid treatments of diffuse and specular surfaces. Like progressive radiosity methods, it dynamically refines the geometry decomposition where required, but does so without the excessive storage requirements for ray histories. The algorithm, called Photon, produces a scene which converges to the global illumination solution. This amounts to a huge task for a 1997-vintage serial computer, but using the power of a parallel supercomputer significantly reduces the time required to generate a solution. Currently, Photon can be run on most parallel environments from a shared memory multiprocessor to a parallel supercomputer, as well as on clusters of heterogeneous workstations.

  17. 6th International Parallel Tools Workshop

    CERN Document Server

    Brinkmann, Steffen; Gracia, José; Resch, Michael; Nagel, Wolfgang

    2013-01-01

    The latest advances in the High Performance Computing hardware have significantly raised the level of available compute performance. At the same time, the growing hardware capabilities of modern supercomputing architectures have caused an increasing complexity of the parallel application development. Despite numerous efforts to improve and simplify parallel programming, there is still a lot of manual debugging and  tuning work required. This process  is supported by special software tools, facilitating debugging, performance analysis, and optimization and thus  making a major contribution to the development of  robust and efficient parallel software. This book introduces a selection of the tools, which were presented and discussed at the 6th International Parallel Tools Workshop, held in Stuttgart, Germany, 25-26 September 2012.

  18. Angular parallelization of a curvilinear Sn transport theory method

    International Nuclear Information System (INIS)

    Haghighat, A.

    1991-01-01

    In this paper a parallel algorithm for angular domain decomposition (or parallelization) of an r-dependent spherical S n transport theory method is derived. The parallel formulation is incorporated into TWOTRAN-II using the IBM Parallel Fortran compiler and implemented on an IBM 3090/400 (with four processors). The behavior of the parallel algorithm for different physical problems is studied, and it is concluded that the parallel algorithm behaves differently in the presence of a fission source as opposed to the absence of a fission source; this is attributed to the relative contributions of the source and the angular redistribution terms in the S n algorithm. Further, the parallel performance of the algorithm is measured for various problem sizes and different combinations of angular subdomains or processors. Poor parallel efficiencies between ∼35 and 50% are achieved in situations where the relative difference of parallel to serial iterations is ∼50%. High parallel efficiencies between ∼60% and 90% are obtained in situations where the relative difference of parallel to serial iterations is <35%

  19. Combining Compile-Time and Run-Time Parallelization

    Directory of Open Access Journals (Sweden)

    Sungdo Moon

    1999-01-01

    This paper demonstrates that significant improvements to automatic parallelization technology require that existing systems be extended in two ways: (1) they must combine high-quality compile-time analysis with low-cost run-time testing; and (2) they must take control flow into account during analysis. We support this claim with the results of an experiment that measures the safety of parallelization at run time for loops left unparallelized by the Stanford SUIF compiler's automatic parallelization system. We present results of measurements on programs from two benchmark suites – SPECFP95 and NAS sample benchmarks – which identify inherently parallel loops in these programs that are missed by the compiler. We characterize remaining parallelization opportunities, and find that most of the loops require run-time testing, analysis of control flow, or some combination of the two. We present a new compile-time analysis technique that can be used to parallelize most of these remaining loops. This technique is designed to not only improve the results of compile-time parallelization, but also to produce low-cost, directed run-time tests that allow the system to defer binding of parallelization until run-time when safety cannot be proven statically. We call this approach predicated array data-flow analysis. We augment array data-flow analysis, which the compiler uses to identify independent and privatizable arrays, by associating predicates with array data-flow values. Predicated array data-flow analysis allows the compiler to derive "optimistic" data-flow values guarded by predicates; these predicates can be used to derive a run-time test guaranteeing the safety of parallelization.

  20. Parallelization characteristics of the DeCART code

    International Nuclear Information System (INIS)

    Cho, J. Y.; Joo, H. G.; Kim, H. Y.; Lee, C. C.; Chang, M. H.; Zee, S. Q.

    2003-12-01

    This report is to describe the parallelization characteristics of the DeCART code and also examine its parallel performance. Parallel computing algorithms are implemented to DeCART to reduce the tremendous computational burden and memory requirement involved in the three-dimensional whole core transport calculation. In the parallelization of the DeCART code, the axial domain decomposition is first realized by using MPI (Message Passing Interface), and then the azimuthal angle domain decomposition by using either MPI or OpenMP. When using the MPI for both the axial and the angle domain decomposition, the concept of MPI grouping is employed for convenient communication in each communication world. For the parallel computation, most of all the computing modules except for the thermal hydraulic module are parallelized. These parallelized computing modules include the MOC ray tracing, CMFD, NEM, region-wise cross section preparation and cell homogenization modules. For the distributed allocation, most of all the MOC and CMFD/NEM variables are allocated only for the assigned planes, which reduces the required memory by a ratio of the number of the assigned planes to the number of all planes. The parallel performance of the DeCART code is evaluated by solving two problems, a rodded variation of the C5G7 MOX three-dimensional benchmark problem and a simplified three-dimensional SMART PWR core problem. In the aspect of parallel performance, the DeCART code shows a good speedup of about 40.1 and 22.4 in the ray tracing module and about 37.3 and 20.2 in the total computing time when using 48 CPUs on the IBM Regatta and 24 CPUs on the LINUX cluster, respectively. In the comparison between the MPI and OpenMP, OpenMP shows a somewhat better performance than MPI. Therefore, it is concluded that the first priority in the parallel computation of the DeCART code is in the axial domain decomposition by using MPI, and then in the angular domain using OpenMP, and finally the angular

  1. PSHED: a simplified approach to developing parallel programs

    International Nuclear Information System (INIS)

    Mahajan, S.M.; Ramesh, K.; Rajesh, K.; Somani, A.; Goel, M.

    1992-01-01

    This paper presents a simplified approach in the form of a tree-structured computational model for parallel application programs. An attempt is made to provide a standard user interface to execute programs on BARC Parallel Processing System (BPPS), a scalable distributed memory multiprocessor. The interface package called PSHED provides a basic framework for representing and executing parallel programs on different parallel architectures. The PSHED package incorporates concepts from a broad range of previous research in programming environments and parallel computations. (author). 6 refs

  2. Parallel evolutionary computation in bioinformatics applications.

    Science.gov (United States)

    Pinho, Jorge; Sobral, João Luis; Rocha, Miguel

    2013-05-01

    A large number of optimization problems within the field of Bioinformatics require methods able to handle its inherent complexity (e.g. NP-hard problems) and also demand increased computational efforts. In this context, the use of parallel architectures is a necessity. In this work, we propose ParJECoLi, a Java based library that offers a large set of metaheuristic methods (such as Evolutionary Algorithms) and also addresses the issue of its efficient execution on a wide range of parallel architectures. The proposed approach focuses on the easiness of use, making the adaptation to distinct parallel environments (multicore, cluster, grid) transparent to the user. Indeed, this work shows how the development of the optimization library can proceed independently of its adaptation for several architectures, making use of Aspect-Oriented Programming. The pluggable nature of parallelism related modules allows the user to easily configure its environment, adding parallelism modules to the base source code when needed. The performance of the platform is validated with two case studies within biological model optimization. Copyright © 2012 Elsevier Ireland Ltd. All rights reserved.
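    A common way to exploit the parallel backends such a library targets is to evaluate the population's fitness concurrently in each generation. The sketch below shows that master-worker pattern on a toy objective with Python's multiprocessing; it is a generic illustration, not ParJECoLi's Java/AOP implementation.

    ```python
    # Sketch: master-worker parallelization of fitness evaluation in a simple
    # evolutionary algorithm.  The objective and parameters are toy examples.
    import random
    from concurrent.futures import ProcessPoolExecutor

    def fitness(candidate):
        """Toy objective: minimise the sphere function."""
        return sum(x * x for x in candidate)

    def evolve(generations=20, pop_size=64, dim=10, workers=4, seed=0):
        rng = random.Random(seed)
        pop = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(pop_size)]
        with ProcessPoolExecutor(max_workers=workers) as pool:
            for _ in range(generations):
                scores = list(pool.map(fitness, pop))        # parallel evaluation
                ranked = [p for _, p in sorted(zip(scores, pop))]
                parents = ranked[: pop_size // 2]             # truncation selection
                pop = parents + [[x + rng.gauss(0, 0.3) for x in rng.choice(parents)]
                                 for _ in range(pop_size - len(parents))]
        return min(map(fitness, pop))

    if __name__ == "__main__":
        print("best fitness:", round(evolve(), 4))
    ```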

  3. PARALLEL IMPORT: REALITY FOR RUSSIA

    Directory of Open Access Journals (Sweden)

    Т. А. Сухопарова

    2014-01-01

    The problem of parallel import is an urgent question today. Legalization of parallel import in Russia is expedient; this statement is based on an analysis of opposing expert opinions. At the same time, it is necessary to consider the negative consequences of such a decision and to apply remedies to minimize them.

  4. Multitasking TORT Under UNICOS: Parallel Performance Models and Measurements

    International Nuclear Information System (INIS)

    Azmy, Y.Y.; Barnett, D.A.

    1999-01-01

    The existing parallel algorithms in the TORT discrete ordinates code were updated to function in a UNICOS environment. A performance model for the parallel overhead was derived for the existing algorithms. The largest contributors to the parallel overhead were identified and a new algorithm was developed. A parallel overhead model was also derived for the new algorithm. The parallel performance models were compared with applications of the code to two TORT standard test problems and a large production problem. The parallel performance models agree well with the measured parallel overhead

  5. Multitasking TORT under UNICOS: Parallel performance models and measurements

    International Nuclear Information System (INIS)

    Barnett, A.; Azmy, Y.Y.

    1999-01-01

    The existing parallel algorithms in the TORT discrete ordinates code were updated to function in a UNICOS environment. A performance model for the parallel overhead was derived for the existing algorithms. The largest contributors to the parallel overhead were identified and a new algorithm was developed. A parallel overhead model was also derived for the new algorithm. The parallel performance models were compared with applications of the code to two TORT standard test problems and a large production problem. The parallel performance models agree well with the measured parallel overhead
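    The reports above fit analytic models of parallel overhead to measured run times. A minimal model of that kind, with invented coefficients rather than TORT's measured ones, splits the run time into a serial part, a perfectly divided part and an overhead term that grows with the processor count:

    ```python
    # Sketch: a minimal parallel performance model, T(P) = T_serial + T_par/P + c*P.
    # Coefficients are invented; real models are fitted to measured overheads.
    def predicted_time(P, t_serial=2.0, t_parallel=120.0, t_overhead_per_proc=0.15):
        return t_serial + t_parallel / P + t_overhead_per_proc * P

    if __name__ == "__main__":
        t1 = predicted_time(1)
        for P in (1, 2, 4, 8, 16, 32):
            tP = predicted_time(P)
            print(f"P={P:2d}  time={tP:7.2f}  speedup={t1 / tP:5.2f}  "
                  f"efficiency={t1 / (P * tP):.2f}")
    ```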

  6. Efficient method to design RF pulses for parallel excitation MRI using gridding and conjugate gradient.

    Science.gov (United States)

    Feng, Shuo; Ji, Jim

    2014-04-01

    Parallel excitation (pTx) techniques with multiple transmit channels have been widely used in high field MRI imaging to shorten the RF pulse duration and/or reduce the specific absorption rate (SAR). However, the efficiency of pulse design still needs substantial improvement for practical real-time applications. In this paper, we present a detailed description of a fast pulse design method with Fourier domain gridding and a conjugate gradient method. Simulation results show that the proposed method can design pTx pulses at an efficiency 10 times higher than that of the conventional conjugate-gradient based method, without reducing the accuracy of the desired excitation patterns.
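    The record gives no equations, but the conjugate gradient step it mentions is typically applied to the least-squares problem relating the RF samples to the target excitation pattern. The sketch below runs conjugate gradient on the normal equations of a random stand-in system matrix; it is a generic CGNR illustration, not the authors' gridding-accelerated design code.

    ```python
    # Sketch: conjugate gradient on the normal equations A^H A x = A^H b, the kind
    # of iterative least-squares solve used in pTx pulse design.  A is a random
    # stand-in for the system matrix mapping RF samples to the excitation pattern.
    import numpy as np

    def cg_normal_equations(A, b, iters=100, tol=1e-10):
        x = np.zeros(A.shape[1], dtype=A.dtype)
        r = A.conj().T @ (b - A @ x)          # residual of the normal equations
        p = r.copy()
        rs_old = np.vdot(r, r).real
        for _ in range(iters):
            Ap = A.conj().T @ (A @ p)
            alpha = rs_old / np.vdot(p, Ap).real
            x += alpha * p
            r -= alpha * Ap
            rs_new = np.vdot(r, r).real
            if np.sqrt(rs_new) < tol:
                break
            p = r + (rs_new / rs_old) * p
            rs_old = rs_new
        return x

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        A = rng.normal(size=(400, 120)) + 1j * rng.normal(size=(400, 120))
        b = rng.normal(size=400) + 1j * rng.normal(size=400)
        x = cg_normal_equations(A, b)
        print("normal-equations residual:",
              np.linalg.norm(A.conj().T @ (A @ x - b)))
    ```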

  7. Parallel artificial liquid membrane extraction

    DEFF Research Database (Denmark)

    Gjelstad, Astrid; Rasmussen, Knut Einar; Parmer, Marthe Petrine

    2013-01-01

    This paper reports development of a new approach towards analytical liquid-liquid-liquid membrane extraction termed parallel artificial liquid membrane extraction. A donor plate and acceptor plate create a sandwich, in which each sample (human plasma) and acceptor solution is separated by an artificial liquid membrane. Parallel artificial liquid membrane extraction is a modification of hollow-fiber liquid-phase microextraction, where the hollow fibers are replaced by flat membranes in a 96-well plate format.

  8. Radiosensitivity of different human tumor cells lines grown as multicellular spheroids determined from growth curves and survival data

    International Nuclear Information System (INIS)

    Schwachoefer, J.H.C.; Crooijmans, R.P.; van Gasteren, J.J.; Hoogenhout, J.; Jerusalem, C.R.; Kal, H.B.; Theeuwes, A.G.

    1989-01-01

    Five human tumor cell lines were grown as multicellular tumor spheroids (MTS) to determine whether multicellular tumor spheroids derived from different types of tumors would show tumor-type dependent differences in response to single-dose irradiation, and whether these differences paralleled clinical behavior. Multicellular tumor spheroids of two neuroblastoma, one lung adenocarcinoma, one melanoma, and a squamous cell carcinoma of the oral tongue, were studied in terms of growth delay, calculated cell survival, and spheroid control dose 50 (SCD50). Growth delay and cell survival analysis for the tumor cell lines showed sensitivities that correlated well with clinical behavior of the tumor types of origin. Similar to other studies on melanoma multicellular tumor spheroids, our SCD50 results for the melanoma cell line deviated from the general pattern of sensitivity. This might be due to the location of surviving cells, which prohibits proliferation of surviving cells and hence growth of melanoma multicellular tumor spheroids. This study demonstrates that radiosensitivity of human tumor cell lines can be evaluated in terms of growth delay, calculated cell survival, and SCD50 when grown as multicellular tumor spheroids. The sensitivity established from these evaluations parallels clinical behavior, thus offering a unique tool for the in vitro analysis of human tumor radiosensitivity

  9. Measurement of magnetically insulated line voltage using a Thomson Parabola Charged Particle Analyser

    International Nuclear Information System (INIS)

    Stanley, T.D.; Stinnett, R.W.

    1981-01-01

    The absence of direct measurements of magnetically insulated line voltage necessitated reliance on inferred voltages based on theoretical calculation and current measurements. This paper presents some of the first direct measurements of magnetically insulated transmission line peak voltages. These measurements were made on the Sandia National Laboratories HydraMITE facility. The peak voltage is measured by observing the energy of negative ions produced at the line cathode and accelerated through the line voltage. The ion energy and the charge-to-mass ratio are measured using the Thomson Parabola mass spectrometry technique. This technique uses parallel E and B fields to deflect the ions. The deflected ions are detected using a microchannel plate coupled to a phosphor screen and photographic film. The Thomson Parabola results are compared to Faraday Cup measurements and to calculated voltages based on current measurements. In addition, the significance of observed positive ions is discussed

  10. WHY IS NON-THERMAL LINE BROADENING OF SPECTRAL LINES IN THE LOWER TRANSITION REGION OF THE SUN INDEPENDENT OF SPATIAL RESOLUTION?

    International Nuclear Information System (INIS)

    De Pontieu, B.; Martinez-Sykora, J.; McIntosh, S.; Peter, H.; Pereira, T. M. D.

    2015-01-01

    Spectral observations of the solar transition region (TR) and corona show broadening of spectral lines beyond what is expected from thermal and instrumental broadening. The remaining non-thermal broadening is significant (5–30 km s⁻¹) and correlated with intensity. Here we study spectra of the TR Si IV 1403 Å line obtained at high resolution with the Interface Region Imaging Spectrograph (IRIS). We find that the large improvement in spatial resolution (0.33″) of IRIS compared to previous spectrographs (2″) does not resolve the non-thermal line broadening, which, in most regions, remains at pre-IRIS levels of about 20 km s⁻¹. This invariance to spatial resolution indicates that the processes behind the broadening occur along the line-of-sight (LOS) and/or on spatial scales (perpendicular to the LOS) smaller than 250 km. Both effects appear to play a role. Comparison with IRIS chromospheric observations shows that, in regions where the LOS is more parallel to the field, magneto-acoustic shocks driven from below impact the TR and can lead to significant non-thermal line broadening. This scenario is supported by MHD simulations. While these do not show enough non-thermal line broadening, they do reproduce the long-known puzzling correlation between non-thermal line broadening and intensity. This correlation is caused by the shocks, but only if non-equilibrium ionization is taken into account. In regions where the LOS is more perpendicular to the field, the prevalence of small-scale twist is likely to play a significant role in explaining the invariance and correlation with intensity.

  11. Massively parallel multicanonical simulations

    Science.gov (United States)

    Gross, Jonathan; Zierenberg, Johannes; Weigel, Martin; Janke, Wolfhard

    2018-03-01

    Generalized-ensemble Monte Carlo simulations such as the multicanonical method and similar techniques are among the most efficient approaches for simulations of systems undergoing discontinuous phase transitions or with rugged free-energy landscapes. As Markov chain methods, they are inherently serial computationally. It was demonstrated recently, however, that a combination of independent simulations that communicate weight updates at variable intervals allows for the efficient utilization of parallel computational resources for multicanonical simulations. Implementing this approach for the many-thread architecture provided by current generations of graphics processing units (GPUs), we show how it can be efficiently employed with of the order of 10⁴ parallel walkers and beyond, thus constituting a versatile tool for Monte Carlo simulations in the era of massively parallel computing. We provide the fully documented source code for the approach applied to the paradigmatic example of the two-dimensional Ising model as starting point and reference for practitioners in the field.
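
    The scheme described here can be reduced to a short loop: independent walkers sample with the same multicanonical weights, their energy histograms are merged, and the merged histogram drives the next weight update. The sketch below is a hypothetical pure-Python analogue for a toy one-dimensional integer "energy", not the authors' GPU implementation; it uses the simplest weight update, W(E) <- W(E)/H(E), and a process pool in place of GPU threads.

```python
# Hypothetical sketch of parallel multicanonical weight iteration: independent
# walkers run with identical weights and only their histograms are merged.
import math
import random
from multiprocessing import Pool

E_MIN, E_MAX = 0, 100

def run_walker(args):
    """One independent walker: Metropolis sampling with multicanonical weights."""
    log_w, n_steps, seed = args
    rng = random.Random(seed)
    E = rng.randint(E_MIN, E_MAX)
    hist = [0] * (E_MAX + 1)
    for _ in range(n_steps):
        E_new = min(E_MAX, max(E_MIN, E + rng.choice((-1, 1))))
        # accept with probability min(1, W(E_new)/W(E))
        if rng.random() < math.exp(min(0.0, log_w[E_new] - log_w[E])):
            E = E_new
        hist[E] += 1
    return hist

if __name__ == "__main__":
    log_w = [0.0] * (E_MAX + 1)                      # start from flat weights
    n_walkers, n_steps = 8, 20000
    with Pool(n_walkers) as pool:
        for it in range(10):
            jobs = [(log_w, n_steps, 1000 * it + k) for k in range(n_walkers)]
            merged = [sum(h) for h in zip(*pool.map(run_walker, jobs))]
            # simplest multicanonical update: log W <- log W - log H
            log_w = [lw - math.log(h) if h > 0 else lw
                     for lw, h in zip(log_w, merged)]
            flatness = min(merged) / (sum(merged) / len(merged))
            print(f"iteration {it}: histogram flatness {flatness:.2f}")
```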

  12. Parallel thermal radiation transport in two dimensions

    International Nuclear Information System (INIS)

    Smedley-Stevenson, R.P.; Ball, S.R.

    2003-01-01

    This paper describes the distributed memory parallel implementation of a deterministic thermal radiation transport algorithm in a 2-dimensional ALE hydrodynamics code. The parallel algorithm consists of a variety of components which are combined in order to produce a state-of-the-art computational capability, capable of solving large thermal radiation transport problems using Blue-Oak, the 3 teraflop MPP (massively parallel processing) computing facility at AWE (United Kingdom). Particular aspects of the parallel algorithm are described together with examples of the performance on some challenging applications. (author)

  13. Parallel thermal radiation transport in two dimensions

    Energy Technology Data Exchange (ETDEWEB)

    Smedley-Stevenson, R.P.; Ball, S.R. [AWE Aldermaston (United Kingdom)

    2003-07-01

    This paper describes the distributed memory parallel implementation of a deterministic thermal radiation transport algorithm in a 2-dimensional ALE hydrodynamics code. The parallel algorithm consists of a variety of components which are combined in order to produce a state-of-the-art computational capability, capable of solving large thermal radiation transport problems using Blue-Oak, the 3 teraflop MPP (massively parallel processing) computing facility at AWE (United Kingdom). Particular aspects of the parallel algorithm are described together with examples of the performance on some challenging applications. (author)

  14. Parallel processing for artificial intelligence 1

    CERN Document Server

    Kanal, LN; Kumar, V; Suttner, CB

    1994-01-01

    Parallel processing for AI problems is of great current interest because of its potential for alleviating the computational demands of AI procedures. The articles in this book consider parallel processing for problems in several areas of artificial intelligence: image processing, knowledge representation in semantic networks, production rules, mechanization of logic, constraint satisfaction, parsing of natural language, data filtering and data mining. The publication is divided into six sections. The first addresses parallel computing for processing and understanding images. The second discus

  15. Multimoded rf delay line distribution system for the Next Linear Collider

    Directory of Open Access Journals (Sweden)

    S. G. Tantawi

    2002-03-01

    The delay line distribution system is an alternative to conventional pulse compression, which enhances the peak power of rf sources while matching the long pulse of those sources to the shorter filling time of accelerator structures. We present an implementation of this scheme that combines pairs of parallel delay lines of the system into single lines. The power of several sources is combined into a single waveguide delay line using a multimode launcher. The output mode of the launcher is determined by the phase coding of the input signals. The combined power is extracted from the delay line using mode-selective extractors, each of which extracts a single mode. Hence, the phase coding of the sources controls the output port of the combined power. The power is then fed to the local accelerator structures. We present a detailed design of such a system, including several implementation methods for the launchers, extractors, and ancillary high power rf components. The system is designed so that it can handle the 600 MW peak power required by the Next Linear Collider design while maintaining high efficiency.

  16. Comparison of parallel viscosity with neoclassical theory

    International Nuclear Information System (INIS)

    Ida, K.; Nakajima, N.

    1996-04-01

    Toroidal rotation profiles are measured with charge exchange spectroscopy for plasma heated with tangential NBI in the CHS heliotron/torsatron device in order to estimate the parallel viscosity. The parallel viscosity derived from the toroidal rotation velocity shows good agreement with the neoclassical parallel viscosity plus the perpendicular viscosity (μ⊥ = 2 m²/s). (author)

  17. Adapting algorithms to massively parallel hardware

    CERN Document Server

    Sioulas, Panagiotis

    2016-01-01

    In recent years, the trend in computing has shifted from delivering processors with faster clock speeds to increasing the number of cores per processor. This marks a paradigm shift towards parallel programming in which applications are programmed to exploit the power provided by multi-cores. Usually there is a gain in terms of the time-to-solution and the memory footprint. Specifically, this trend has sparked an interest towards massively parallel systems that can provide a large number of processors, and possibly computing nodes, as in the GPUs and MPPAs (Massively Parallel Processor Arrays). In this project, the focus was on two distinct computing problems: k-d tree searches and track seeding cellular automata. The goal was to adapt the algorithms to parallel systems and evaluate their performance in different cases.
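
    Of the two problems mentioned, the k-d tree searches map naturally onto data parallelism because every query is independent. The sketch below is illustrative only and does not reproduce the project's GPU/MPPA code; it fans many nearest-neighbour queries against one small pure-Python k-d tree out over a CPU process pool, which follows the same data-parallel pattern.

```python
# Illustrative sketch: a tiny k-d tree plus many independent nearest-neighbour
# queries distributed over a process pool (CPU stand-in for GPU/MPPA threads).
import random
from functools import partial
from multiprocessing import Pool

def dist2(a, b):
    """Squared Euclidean distance between two points."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def build_kdtree(points, depth=0):
    """Recursively build a k-d tree stored as nested (point, left, right) tuples."""
    if not points:
        return None
    axis = depth % len(points[0])
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return (points[mid],
            build_kdtree(points[:mid], depth + 1),
            build_kdtree(points[mid + 1:], depth + 1))

def nearest(node, target, depth=0, best=None):
    """Standard nearest-neighbour search with hyperplane pruning."""
    if node is None:
        return best
    point, left, right = node
    if best is None or dist2(point, target) < dist2(best, target):
        best = point
    axis = depth % len(target)
    near, far = (left, right) if target[axis] < point[axis] else (right, left)
    best = nearest(near, target, depth + 1, best)
    if (target[axis] - point[axis]) ** 2 < dist2(best, target):
        best = nearest(far, target, depth + 1, best)
    return best

if __name__ == "__main__":
    rng = random.Random(0)
    cloud = [(rng.random(), rng.random(), rng.random()) for _ in range(5000)]
    queries = [(rng.random(), rng.random(), rng.random()) for _ in range(1000)]
    tree = build_kdtree(cloud)
    with Pool() as pool:                              # each query is independent
        hits = pool.map(partial(nearest, tree), queries)
    print("nearest point to first query:", hits[0])
```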

  18. Implementing Shared Memory Parallelism in MCBEND

    Directory of Open Access Journals (Sweden)

    Bird Adam

    2017-01-01

    MCBEND is a general purpose radiation transport Monte Carlo code from AMEC Foster Wheeler's ANSWERS® Software Service. MCBEND is well established in the UK shielding community for radiation shielding and dosimetry assessments. The existing MCBEND parallel capability effectively involves running the same calculation on many processors. This works very well except when the memory requirements of a model restrict the number of instances of a calculation that will fit on a machine. To utilise parallel hardware more effectively, OpenMP has been used to implement shared memory parallelism in MCBEND. This paper describes the reasoning behind the choice of OpenMP, notes some of the challenges of multi-threading an established code such as MCBEND, and assesses the performance of the parallel method implemented in MCBEND.
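
    The memory problem described above, where each replicated calculation holds its own copy of the same model, is exactly what shared-memory parallelism avoids. MCBEND uses OpenMP threads; purely to illustrate the idea in Python (a hypothetical sketch, not MCBEND code), the snippet below places one read-only data table in shared memory once and lets several worker processes attach to it instead of copying it.

```python
# Hypothetical illustration of the shared-memory idea: one large read-only
# table is stored once and attached to (not copied) by every worker process.
import array
import random
import struct
from multiprocessing import Process, shared_memory

N_VALUES = 1_000_000       # stand-in for a large model that would not fit many times over

def worker(shm_name, n_samples, seed):
    shm = shared_memory.SharedMemory(name=shm_name)   # attach, no private copy
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        idx = rng.randrange(N_VALUES)
        (value,) = struct.unpack_from("d", shm.buf, idx * 8)
        total += value
    print(f"worker {seed}: mean sampled table entry = {total / n_samples:.1f}")
    shm.close()

if __name__ == "__main__":
    table = array.array("d", (float(i) for i in range(N_VALUES)))
    shm = shared_memory.SharedMemory(create=True, size=N_VALUES * 8)
    shm.buf[:N_VALUES * 8] = table.tobytes()           # fill the shared block once
    workers = [Process(target=worker, args=(shm.name, 100_000, s)) for s in range(4)]
    for p in workers:
        p.start()
    for p in workers:
        p.join()
    shm.close()
    shm.unlink()
```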

  19. Parallel Task Processing on a Multicore Platform in a PC-based Control System for Parallel Kinematics

    Directory of Open Access Journals (Sweden)

    Harald Michalik

    2009-02-01

    Multicore platforms have one physical processor chip with multiple cores interconnected via a chip-level bus. Because they deliver greater computing power through concurrency and offer greater system density, multicore platforms are well suited to address the performance bottleneck encountered in PC-based control systems for parallel kinematic robots with heavy CPU load. Heavy-load control tasks are generated by new control approaches that include features like singularity prediction, structure control algorithms, vision data integration and similar tasks. In this paper we introduce the parallel task scheduling extension of a communication architecture specially tailored for the development of PC-based control of parallel kinematics. The scheduling is specially designed for processing on a multicore platform. It breaks down the serial task processing of the robot control cycle and extends it with parallel task processing paths in order to enhance the overall control performance.
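
    As a rough illustration of the scheduling idea (all function names and numbers below are hypothetical stand-ins, not the authors' architecture), the heavy, mutually independent tasks of one control cycle can be fanned out to a small thread pool and joined before the actuator command is computed:

```python
# All task functions are hypothetical stand-ins for the heavy jobs named in the
# abstract (singularity prediction, vision data integration, structure control);
# the point is only the fan-out/join pattern inside one control cycle.
import math
import time
from concurrent.futures import ThreadPoolExecutor

def predict_singularity(q):        # stand-in: closeness to a singular pose
    time.sleep(0.002)
    return abs(math.sin(sum(q)))

def integrate_vision_data(q):      # stand-in: pose correction from a camera
    time.sleep(0.003)
    return [qi + 1e-3 for qi in q]

def structure_control(q):          # stand-in: structure/vibration control term
    time.sleep(0.002)
    return [0.1 * qi for qi in q]

def control_cycle(q, pool):
    # serial part of the cycle, then the independent heavy tasks in parallel
    f_sing = pool.submit(predict_singularity, q)
    f_vision = pool.submit(integrate_vision_data, q)
    f_struct = pool.submit(structure_control, q)
    if f_sing.result() > 0.99:     # join results and compute the command serially
        return q                   # hold position near a singular configuration
    corrected = f_vision.result()
    damping = f_struct.result()
    return [c - d for c, d in zip(corrected, damping)]

if __name__ == "__main__":
    q = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]             # six joint coordinates
    with ThreadPoolExecutor(max_workers=3) as pool:
        start = time.perf_counter()
        for _ in range(100):
            q = control_cycle(q, pool)
        print(f"100 control cycles in {time.perf_counter() - start:.3f} s")
```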

  20. Parallel fabrication of macroporous scaffolds.

    Science.gov (United States)

    Dobos, Andrew; Grandhi, Taraka Sai Pavan; Godeshala, Sudhakar; Meldrum, Deirdre R; Rege, Kaushal

    2018-07-01

    Scaffolds generated from naturally occurring and synthetic polymers have been investigated in several applications because of their biocompatibility and tunable chemo-mechanical properties. Existing methods for generation of 3D polymeric scaffolds typically cannot be parallelized, suffer from low throughputs, and do not allow for quick and easy removal of the fragile structures that are formed. Current molds used in hydrogel and scaffold fabrication using solvent casting and porogen leaching are often single-use and do not facilitate 3D scaffold formation in parallel. Here, we describe a simple device and related approaches for the parallel fabrication of macroporous scaffolds. This approach was employed for the generation of macroporous and non-macroporous materials in parallel, in higher throughput and allowed for easy retrieval of these 3D scaffolds once formed. In addition, macroporous scaffolds with interconnected as well as non-interconnected pores were generated, and the versatility of this approach was employed for the generation of 3D scaffolds from diverse materials including an aminoglycoside-derived cationic hydrogel ("Amikagel"), poly(lactic-co-glycolic acid) or PLGA, and collagen. Macroporous scaffolds generated using the device were investigated for plasmid DNA binding and cell loading, indicating the use of this approach for developing materials for different applications in biotechnology. Our results demonstrate that the device-based approach is a simple technology for generating scaffolds in parallel, which can enhance the toolbox of current fabrication techniques. © 2018 Wiley Periodicals, Inc.

  1. Antiproliferative effects of the readily extractable fractions prepared from various citrus juices on several cancer cell lines.

    Science.gov (United States)

    Kawaii, S; Tomono, Y; Katase, E; Ogawa, K; Yano, M

    1999-07-01

    To eliminate the masking effect of flavonoid glycosides, which comprise approximately 70% of a conventionally prepared sample, the readily extractable fraction of Citrus juice, prepared by adsorption on HP-20 resin and elution from the resin with ethanol and acetone, was subjected to antiproliferative tests against several cancer cell lines. Screening of 34 Citrus juices indicated that King (Citrus nobilis) strongly inhibited proliferation of all cancer cell lines examined. Sweet lime and Kabuchi inhibited three of the four cancer cell lines. In contrast, these samples were substantially less cytotoxic toward normal human cell lines.

  2. Event parallelism: Distributed memory parallel computing for high energy physics experiments

    International Nuclear Information System (INIS)

    Nash, T.

    1989-05-01

    This paper describes the present and expected future development of distributed memory parallel computers for high energy physics experiments. It covers the use of event parallel microprocessor farms, particularly at Fermilab, including both ACP multiprocessors and farms of MicroVAXES. These systems have proven very cost effective in the past. A case is made for moving to the more open environment of UNIX and RISC processors. The 2nd Generation ACP Multiprocessor System, which is based on powerful RISC systems, is described. Given the promise of still more extraordinary increases in processor performance, a new emphasis on point to point, rather than bussed, communication will be required. Developments in this direction are described. 6 figs

  3. Event parallelism: Distributed memory parallel computing for high energy physics experiments

    International Nuclear Information System (INIS)

    Nash, T.

    1989-01-01

    This paper describes the present and expected future development of distributed memory parallel computers for high energy physics experiments. It covers the use of event parallel microprocessor farms, particularly at Fermilab, including both ACP multiprocessors and farms of MicroVAXES. These systems have proven very cost effective in the past. A case is made for moving to the more open environment of UNIX and RISC processors. The 2nd Generation ACP Multiprocessor System, which is based on powerful RISC systems, is described. Given the promise of still more extraordinary increases in processor performance, a new emphasis on point to point, rather than bussed, communication will be required. Developments in this direction are described. (orig.)

  4. Event parallelism: Distributed memory parallel computing for high energy physics experiments

    Science.gov (United States)

    Nash, Thomas

    1989-12-01

    This paper describes the present and expected future development of distributed memory parallel computers for high energy physics experiments. It covers the use of event parallel microprocessor farms, particularly at Fermilab, including both ACP multiprocessors and farms of MicroVAXES. These systems have proven very cost effective in the past. A case is made for moving to the more open environment of UNIX and RISC processors. The 2nd Generation ACP Multiprocessor System, which is based on powerful RISC systems, is described. Given the promise of still more extraordinary increases in processor performance, a new emphasis on point to point, rather than bussed, communication will be required. Developments in this direction are described.

  5. Researching the Parallel Process in Supervision and Psychotherapy

    DEFF Research Database (Denmark)

    Jacobsen, Claus Haugaard

    Reflects upon how to do process research in supervision and in the parallel process. A single case study is presented illustrating how a study on parallel process can be carried out.

  6. Micro-XRD Stress And Texture Study Of Inlaid Copper Lines - Influence Of ILD, Liner And Etch Stop Layer

    International Nuclear Information System (INIS)

    Prinz, H.; Zienert, I.; Rinderknecht, J.; Geisler, H.; Zschech, E.; Besser, P.

    2004-01-01

    The influence of the ILD, liner and etch stop layer on the room temperature stress state of copper line test structures was examined by micro-XRD. Test structures consisted of large arrays of parallel lines with line widths of 0.18 μm and 1.8 μm. All these parameters have an influence on the room temperature stress state, with the variation of the liner and the ILD showing the largest effects. The change from a full low-k stack to a hybrid stack, where SiO2 ILD is used for the 'via layer' only and low-k material for the 'line layer', results in completely different parameter dependencies. The relationship between copper microstructure and the resulting stress in copper lines is discussed.

  7. Microprocessor event analysis in parallel with Camac data acquisition

    International Nuclear Information System (INIS)

    Cords, D.; Eichler, R.; Riege, H.

    1981-01-01

    The Plessey MIPROC-16 microprocessor (16 bits, 250 ns execution time) has been connected to a Camac system (GEC-ELLIOTT System Crate) and shares the Camac access with a Nord-10S computer. Interfaces have been designed and tested for execution of Camac cycles, communication with the Nord-10S computer and DMA transfer from Camac to the MIPROC-16 memory. The system is used in the JADE data-acquisition system at PETRA, where it receives the data from the detector in parallel with the Nord-10S computer via DMA through the indirect-data-channel mode. The microprocessor performs an on-line analysis of events and the result of various checks is appended to the event. In case of spurious triggers or clear beam-gas events, the Nord-10S buffer will be reset and the event omitted from further processing. (orig.)

  8. Facing regulatory challenges of on-line hemodiafiltration.

    Science.gov (United States)

    Kümmerle, Wolfgang

    2011-01-01

    On-line hemodiafiltration (on-line HDF) is the result of a vision that triggered multifarious changes in very different areas. Driven by the idea of offering better medical treatment for renal patients, technological innovations were developed and established that also constituted new challenges in the field of regulatory affairs. The existing regulations predominantly addressed the quality and safety of those products needed to perform dialysis treatment which were supplied by industrial manufacturers. However, the complexity of the treatment system required for the provision of on-line fluids demanded a holistic approach encompassing all components involved. Hence, focus was placed not only on single products, but much more on their interfacing, and the clinical infrastructure, in particular, had to undergo substantial changes. The overall understanding of the interaction between such factors, quite different in their nature, was crucial for overcoming the arising regulatory obstacles. This essay describes the evolution of the on-line HDF procedure from the regulatory point of view. A simplified diagram demonstrates the path taken from the former regulatory understanding to the realization of necessary changes. That achievement was only possible through 'management of preview' and consequent promotion of technical and medical innovations as well as regulatory re-evaluations. Copyright © 2011 S. Karger AG, Basel.

  9. Development of parallel/serial program analyzing tool

    International Nuclear Information System (INIS)

    Watanabe, Hiroshi; Nagao, Saichi; Takigawa, Yoshio; Kumakura, Toshimasa

    1999-03-01

    The Japan Atomic Energy Research Institute has been developing 'KMtool', a parallel/serial program analyzing tool, in order to promote the parallelization of science and engineering computation programs. KMtool analyzes the performance of programs written in FORTRAN77 and MPI, and it reduces the effort required for parallelization. This paper describes the development purpose, design, utilization and evaluation of KMtool. (author)

  10. Parallel programming practical aspects, models and current limitations

    CERN Document Server

    Tarkov, Mikhail S

    2014-01-01

    Parallel programming is designed for the use of parallel computer systems for solving time-consuming problems that cannot be solved on a sequential computer in a reasonable time. These problems can be divided into two classes: (1) processing large data arrays (including processing images and signals in real time), and (2) simulation of complex physical processes and chemical reactions. For each of these classes, prospective methods are designed for solving problems. For data processing, one of the most promising technologies is the use of artificial neural networks. The particle-in-cell method and cellular automata are very useful for simulation. Problems of scalability of parallel algorithms and the transfer of existing parallel programs to future parallel computers are very acute now. An important task is to optimize the use of the equipment (including the CPU cache) of parallel computers. Along with parallelizing information processing, it is essential to ensure the processing reliability by the relevant organization ...

  11. Trends in on-line data processing

    International Nuclear Information System (INIS)

    Masetti, M.

    1981-01-01

    The development of integrated circuits has been characterized by an exponential growth in the number of gates on a single chip, a growth that will continue in the coming years. In parallel, the price per bit is dropping with more or less the same law. As a consequence of this a few statements can be made: the present 16-bit minicomputer in a small configuration is going to be substituted by a 16-bit microcomputer, and the 16-bit microcomputer in a powerful configuration by a 32-bit midi that also offers a virtual memory facility. Fully programmable or microcoded powerful devices like the LASS hardware processor or MICE will allow an efficient on-line filter. Higher computing speed can be achieved by a multiprocessor configuration which can be insensitive to hardware failures. Therefore we are moving towards an integrated on-line computing system with much higher computing power than now, and the present distinction between on-line and off-line will no longer be so sharp. As more processing can be performed on-line, fast high-quality feedback can be provided for the experiment. In the years to come the trend towards more processing power, at a lower price, and assembled in the same hardware volume will continue for at least five years; at the same time the future large high-energy physics experiments at LEP will be carried out within a wide international collaboration. In this environment methods must be found for a large fraction of the work to be distributed amongst the collaborators. To accomplish this aim it is necessary to introduce common standard practices concerning both hardware and software, in such a way that the separate parts, developed by the collaborators, will be plug-compatible. (orig.)

  12. The method of lines solution of discrete ordinates method for non-grey media

    International Nuclear Information System (INIS)

    Cayan, Fatma Nihan; Selcuk, Nevin

    2007-01-01

    A radiation code based on method of lines (MOL) solution of discrete ordinates method (DOM) for radiative heat transfer in non-grey absorbing-emitting media was developed by incorporation of a gas spectral radiative property model, namely wide band correlated-k (WBCK) model, which is compatible with MOL solution of DOM. Predictive accuracy of the code was evaluated by applying it to 1-D parallel plate and 2-D axisymmetric cylindrical enclosure problems containing absorbing-emitting medium and benchmarking its predictions against line-by-line solutions available in the literature. Comparisons reveal that MOL solution of DOM with WBCK model produces accurate results for radiative heat fluxes and source terms and can be used with confidence in conjunction with computational fluid dynamics codes based on the same approach
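
    As a brief sketch of the underlying formulation (standard DOM/MOL relations, not equations quoted from the paper): the DOM replaces the angular integral of the radiative transfer equation by a quadrature over ordinates with direction cosines (\mu_m, \xi_m) and weights w_m, and the MOL appends a pseudo-time derivative so that, after spatial discretization, a system of ODEs is integrated to steady state,

    \[ \frac{\partial I_m}{\partial t} + \mu_m \frac{\partial I_m}{\partial x} + \xi_m \frac{\partial I_m}{\partial y} = \kappa\,(I_b - I_m), \qquad m = 1,\dots,M . \]

    The radiative source term entering the energy equation is then recovered from the quadrature, \( -\nabla \cdot \mathbf{q}_r = \kappa \left( \sum_{m=1}^{M} w_m I_m - 4\pi I_b \right) \). In the non-grey WBCK treatment, \(\kappa\) and \(I_b\) are evaluated at each spectral quadrature point of every wide band and the resulting quantities are summed with the corresponding band weights.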

  13. Mixed-time parallel evolution in multiple quantum NMR experiments: sensitivity and resolution enhancement in heteronuclear NMR

    International Nuclear Information System (INIS)

    Ying Jinfa; Chill, Jordan H.; Louis, John M.; Bax, Ad

    2007-01-01

    A new strategy is demonstrated that simultaneously enhances sensitivity and resolution in three- or higher-dimensional heteronuclear multiple quantum NMR experiments. The approach, referred to as mixed-time parallel evolution (MT-PARE), utilizes evolution of chemical shifts of the spins participating in the multiple quantum coherence in parallel, thereby reducing signal losses relative to sequential evolution. The signal in a given PARE dimension, t₁, is of a non-decaying constant-time nature for a duration that depends on the length of t₂, and vice versa, prior to the onset of conventional exponential decay. Line shape simulations for the ¹H-¹⁵N PARE indicate that this strategy significantly enhances both sensitivity and resolution in the indirect ¹H dimension, and that the unusual signal decay profile results in acceptable line shapes. Incorporation of the MT-PARE approach into a 3D HMQC-NOESY experiment for measurement of HN-HN NOEs in KcsA in SDS micelles at 50 °C was found to increase the experimental sensitivity by a factor of 1.7±0.3 with a concomitant resolution increase in the indirectly detected ¹H dimension. The method is also demonstrated for a situation in which homonuclear ¹³C-¹³C decoupling is required while measuring weak H3'-2'OH NOEs in an RNA oligomer.

  14. 26 CFR 1.528-4 - Substantiality test.

    Science.gov (United States)

    2010-04-01

    ... 26 Internal Revenue 7 2010-04-01 2010-04-01 true Substantiality test. 1.528-4 Section 1.528-4 Internal Revenue INTERNAL REVENUE SERVICE, DEPARTMENT OF THE TREASURY (CONTINUED) INCOME TAX (CONTINUED... residence. Units which are used for purposes auxiliary to residential use (such as laundry areas, swimming...

  15. Open | SpeedShop: An Open Source Infrastructure for Parallel Performance Analysis

    Directory of Open Access Journals (Sweden)

    Martin Schulz

    2008-01-01

    Over the last decades a large number of performance tools have been developed to analyze and optimize high performance applications. Their acceptance by end users, however, has been slow: each tool alone is often limited in scope and comes with widely varying interfaces and workflow constraints, requiring different changes in the often complex build and execution infrastructure of the target application. We started the Open | SpeedShop project about 3 years ago to overcome these limitations and provide efficient, easy to apply, and integrated performance analysis for parallel systems. Open | SpeedShop has two different faces: it provides an interoperable tool set covering the most common analysis steps as well as a comprehensive plugin infrastructure for building new tools. In both cases, the tools can be deployed to large-scale parallel applications using DPCL/Dyninst for distributed binary instrumentation. Further, all tools developed within or on top of Open | SpeedShop are accessible through multiple fully equivalent interfaces including an easy-to-use GUI as well as an interactive command line interface, reducing the usage threshold for those tools.

  16. Spin-orbit torques for current parallel and perpendicular to a domain wall

    International Nuclear Information System (INIS)

    Schulz, Tomek; Lee, Kyujoon; Karnad, Gurucharan V.; Alejos, Oscar; Martinez, Eduardo; Moretti, Simone; Hals, Kjetil M. D.; Garcia, Karin; Ravelosona, Dafiné; Vila, Laurent; Lo Conte, Roberto; Kläui, Mathias; Ocker, Berthold; Brataas, Arne

    2015-01-01

    We report field- and current-induced domain wall (DW) depinning experiments in Ta\Co₂₀Fe₆₀B₂₀\MgO nanowires through a Hall cross geometry. While purely field-induced depinning shows no angular dependence on in-plane fields, the effect of the current depends crucially on the internal DW structure, which we manipulate by an external magnetic in-plane field. We show depinning measurements for a current sent parallel to the DW and compare its depinning efficiency with the conventional case of current flowing perpendicularly to the DW. We find that the maximum efficiency is similar for both current directions within the error bars, which is in line with a dominating damping-like spin-orbit torque (SOT) and indicates that no large additional torques arise for currents perpendicular to the DW. Finally, we find a varying dependence of the maximum depinning efficiency angle for different DWs and pinning levels. This emphasizes the importance of our full angular scans compared with previously used measurements for just two field directions (parallel and perpendicular to the DW) to determine the real torque strength and shows the sensitivity of the SOT to the precise DW structure and pinning sites.

  17. Parallelization of Subchannel Analysis Code MATRA

    International Nuclear Information System (INIS)

    Kim, Seongjin; Hwang, Daehyun; Kwon, Hyouk

    2014-01-01

    A stand-alone calculation with the MATRA code already requires appreciable computing time for thermal margin calculations, and a considerably longer time is needed to solve whole-core pin-by-pin problems. In addition, the computation speed of the MATRA code must be improved to satisfy the overall performance requirements of multi-physics coupling calculations. Therefore, a parallel approach to improve and optimize the computability of the MATRA code is proposed and verified in this study. The parallel algorithm is embodied in the MATRA code using the MPI communication method, and the modification of the previous code structure was minimized. The improvement is confirmed by comparing the results of the single- and multiple-processor algorithms. The speedup and efficiency are also evaluated when increasing the number of processors. The parallel algorithm was implemented in the subchannel code MATRA using MPI. The performance of the parallel algorithm was verified by comparing the results with those from MATRA run on a single processor. It is also noted that the performance of the MATRA code was greatly improved by implementing the parallel algorithm for the 1/8-core and whole-core problems.
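
    For reference, the speedup and parallel efficiency evaluated in such scaling studies are normally defined as (standard definitions, not figures taken from this record)

    \[ S(p) = \frac{T_1}{T_p}, \qquad E(p) = \frac{S(p)}{p}, \]

    where \(T_1\) is the run time on a single processor and \(T_p\) the run time on \(p\) processors; an efficiency close to 1 means the additional processors are being used effectively.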

  18. A new beam emission polarimetry diagnostic for measuring the magnetic field line angle at the plasma edge of ASDEX Upgrade.

    Science.gov (United States)

    Viezzer, E; Dux, R; Dunne, M G

    2016-11-01

    A new edge beam emission polarimetry diagnostic dedicated to the measurement of the magnetic field line angle has been installed on the ASDEX Upgrade tokamak. The new diagnostic relies on the motional Stark effect and is based on the simultaneous measurement of the polarization direction of the linearly polarized π (parallel to the electric field) and σ (perpendicular to the electric field) lines of the Balmer line Dα. The technical properties of the system are described. The calibration procedures are discussed and first measurements are presented.

  19. Broadcasting a message in a parallel computer

    Science.gov (United States)

    Berg, Jeremy E [Rochester, MN; Faraj, Ahmad A [Rochester, MN

    2011-08-02

    Methods, systems, and products are disclosed for broadcasting a message in a parallel computer. The parallel computer includes a plurality of compute nodes connected together using a data communications network. The data communications network is optimized for point-to-point data communications and is characterized by at least two dimensions. The compute nodes are organized into at least one operational group of compute nodes for collective parallel operations of the parallel computer. One compute node of the operational group is assigned to be a logical root. Broadcasting a message in a parallel computer includes: establishing a Hamiltonian path along all of the compute nodes in at least one plane of the data communications network and in the operational group; and broadcasting, by the logical root to the remaining compute nodes, the logical root's message along the established Hamiltonian path.
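
    The broadcast pattern described, establishing a Hamiltonian path through the mesh and forwarding the root's message hop by hop, can be sketched with mpi4py. The snippet below is a hypothetical illustration (a boustrophedon path on a small 2-D mesh, not the disclosed implementation), assuming the number of ranks is a multiple of the chosen mesh width; run it with, e.g., mpiexec -n 8 python snake_bcast.py (the file name is arbitrary).

```python
# Hypothetical mpi4py sketch of a Hamiltonian-path broadcast: ranks are ordered
# along a boustrophedon ("snake") path through a 2-D mesh and the root's
# message is forwarded hop by hop along that path.
from mpi4py import MPI

def snake_order(width, height):
    """Hamiltonian path through a width x height mesh: left-to-right on even
    rows, right-to-left on odd rows."""
    path = []
    for y in range(height):
        xs = range(width) if y % 2 == 0 else range(width - 1, -1, -1)
        path.extend(y * width + x for x in xs)
    return path

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()
WIDTH = 4                              # assumed mesh width; size must be a multiple of it
path = snake_order(WIDTH, size // WIDTH)
pos = path.index(rank)                 # this rank's position along the path

if pos == 0:
    message = "broadcast payload"      # the logical root starts the pipeline
else:
    message = comm.recv(source=path[pos - 1], tag=0)
if pos + 1 < len(path):
    comm.send(message, dest=path[pos + 1], tag=0)

print(f"rank {rank} (path position {pos}) received: {message!r}")
```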

  20. Parallel programming with Python

    CERN Document Server

    Palach, Jan

    2014-01-01

    A fast, easy-to-follow and clear tutorial to help you develop parallel computing systems using Python. Along with explaining the fundamentals, the book will also introduce you to slightly advanced concepts and will help you in implementing these techniques in the real world. If you are an experienced Python programmer and are willing to utilize the available computing resources by parallelizing applications in a simple way, then this book is for you. You are required to have a basic knowledge of Python development to get the most out of this book.
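
    A minimal example of the kind of pattern such a tutorial covers (illustrative only, not code taken from the book): distributing independent CPU-bound work across worker processes with multiprocessing.Pool.

```python
# Distribute independent CPU-bound tasks over a pool of worker processes.
from multiprocessing import Pool

def count_primes(limit):
    """Count primes below `limit` by trial division (deliberately CPU-bound)."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count

if __name__ == "__main__":
    limits = [20_000, 30_000, 40_000, 50_000]
    with Pool(processes=4) as pool:
        results = pool.map(count_primes, limits)   # each limit handled by a worker process
    for limit, count in zip(limits, results):
        print(f"{count} primes below {limit}")
```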