sequential parallel comparison: Topics by WorldWideScience.org

Sample records for sequential parallel comparison

Research on parallel algorithm for sequential pattern mining

Science.gov (United States)

Zhou, Lijuan; Qin, Bai; Wang, Yu; Hao, Zhongxiao

2008-03-01

Sequential pattern mining is the mining of frequent sequences related to time or other orders from the sequence database. Its initial motivation is to discover the laws of customer purchasing in a time section by finding the frequent sequences. In recent years, sequential pattern mining has become an important direction of data mining, and its application field is no longer confined to business databases but has extended to new data sources such as the Web and to advanced science fields such as DNA analysis. The data used in sequential pattern mining have two characteristics: massive volume and distributed storage. Most existing sequential pattern mining algorithms have not considered both characteristics together. Based on these traits and on parallel computing theory, this paper puts forward a new distributed parallel algorithm, SPP (Sequential Pattern Parallel). The algorithm abides by the principle of pattern reduction and utilizes the divide-and-conquer strategy for parallelization. The first parallel task is to construct frequent item sets applying frequent concept and search space partition theory, and the second task is to construct frequent sequences using the depth-first search method at each processor. The algorithm only needs to access the database twice and doesn't generate candidate sequences, which reduces the access time and improves the mining efficiency. Based on a random data generation procedure and the different information structures designed, this paper simulated the SPP algorithm in a concrete parallel environment and implemented the AprioriAll algorithm. The experiments demonstrate that, compared with AprioriAll, the SPP algorithm has excellent speedup and efficiency.
A solution for automatic parallelization of sequential assembly code

Directory of Open Access Journals (Sweden)

Kovačević Đorđe

2013-01-01

Since modern multicore processors can execute existing sequential programs only on a single core, there is a strong need for automatic parallelization of program code. Relying on existing algorithms, this paper describes a new software tool for parallelization of sequential assembly code. The main goal of this paper is to develop a parallelizer which reads sequential assembler code and outputs parallelized code for a MIPS processor with multiple cores. The idea is the following: the parser translates the assembler input file into program objects suitable for further processing. After that, static single assignment is performed. Based on the data flow graph, the parallelization algorithm distributes instructions across different cores. Once the sequential code is parallelized by the parallelization algorithm, registers are allocated with the algorithm for linear allocation, and the final result is distributed assembler code for each of the cores. In the paper we evaluate the speedup of the matrix multiplication example, which was processed by the parallelizer of assembly code. The result is almost linear speedup of code execution, which increases with the number of cores. The speedup on two cores is 1.99, while on 16 cores the speedup is 13.88.
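The two speedup figures quoted above can be converted into parallel efficiency (speedup divided by core count) and into the Karp-Flatt experimentally determined serial fraction. The following is a minimal sketch that only does arithmetic on the numbers in the abstract; the function names are illustrative and are not taken from the paper.

```python
def efficiency(speedup, cores):
    """Parallel efficiency: fraction of ideal linear speedup achieved."""
    return speedup / cores


def karp_flatt(speedup, cores):
    """Karp-Flatt metric: serial fraction implied by a measured speedup."""
    return (1.0 / speedup - 1.0 / cores) / (1.0 - 1.0 / cores)


# Speedups reported in the abstract, keyed by core count.
reported = {2: 1.99, 16: 13.88}

for cores, s in reported.items():
    print(cores, round(efficiency(s, cores), 4), round(karp_flatt(s, cores), 4))
```

Under these definitions the efficiency is about 0.995 on two cores and about 0.87 on 16 cores, which is consistent with the abstract's description of the speedup as almost linear.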
A path-level exact parallelization strategy for sequential simulation

Science.gov (United States)

Peredo, Oscar F.; Baeza, Daniel; Ortiz, Julián M.; Herrero, José R.

2018-01-01

Sequential Simulation is a well known method in geostatistical modelling. Following the Bayesian approach for simulation of conditionally dependent random events, Sequential Indicator Simulation (SIS) method draws simulated values for K categories (categorical case) or classes defined by K different thresholds (continuous case). Similarly, Sequential Gaussian Simulation (SGS) method draws simulated values from a multivariate Gaussian field. In this work, a path-level approach to parallelize SIS and SGS methods is presented. A first stage of re-arrangement of the simulation path is performed, followed by a second stage of parallel simulation for non-conflicting nodes. A key advantage of the proposed parallelization method is to generate identical realizations as with the original non-parallelized methods. Case studies are presented using two sequential simulation codes from GSLIB: SISIM and SGSIM. Execution time and speedup results are shown for large-scale domains, with many categories and maximum kriging neighbours in each case, achieving high speedup results in the best scenarios using 16 threads of execution in a single machine.
Efficient sequential and parallel algorithms for record linkage.

Science.gov (United States)

Mamun, Abdullah-Al; Mi, Tian; Aseltine, Robert; Rajasekaran, Sanguthevar

2014-01-01

Integrating data from multiple sources is a crucial and challenging problem. Even though there exist numerous algorithms for record linkage or deduplication, they suffer from either large time needs or restrictions on the number of datasets that they can integrate. In this paper we report efficient sequential and parallel algorithms for record linkage which handle any number of datasets and outperform previous algorithms. Our algorithms employ hierarchical clustering algorithms as the basis. A key idea that we use is radix sorting on certain attributes to eliminate identical records before any further processing. Another novel idea is to form a graph that links similar records and find the connected components. Our sequential and parallel algorithms have been tested on a real dataset of 1,083,878 records and synthetic datasets ranging in size from 50,000 to 9,000,000 records. Our sequential algorithm runs at least two times faster, for any dataset, than the previous best-known algorithm, the two-phase algorithm using faster computation of the edit distance (TPA (FCED)). The speedups obtained by our parallel algorithm are almost linear. For example, we get a speedup of 7.5 with 8 cores (residing in a single node), 14.1 with 16 cores (residing in two nodes), and 26.4 with 32 cores (residing in four nodes). We have compared the performance of our sequential algorithm with TPA (FCED) and found that our algorithm outperforms the previous one. The accuracy is the same as that of this previous best-known algorithm.
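Two of the ideas named in the abstract, removing identical records first and then linking similar records in a graph whose connected components form the clusters, can be sketched as follows. This is an illustration under assumptions, not the authors' implementation: a plain sort replaces their radix sort, similarity is taken to be edit distance at most `max_dist`, all pairs are compared, and every name is hypothetical.

```python
from itertools import combinations


def edit_distance(a, b):
    """Levenshtein distance using a rolling dynamic-programming row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def link_records(records, max_dist=1):
    """Cluster records: deduplicate, link similar pairs, return components."""
    unique = sorted(set(records))        # assumption: plain sort, not radix sort
    parent = list(range(len(unique)))    # union-find forest over unique records

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # Add an edge (union) for every pair of sufficiently similar records.
    for i, j in combinations(range(len(unique)), 2):
        if edit_distance(unique[i], unique[j]) <= max_dist:
            parent[find(i)] = find(j)

    clusters = {}
    for i, rec in enumerate(unique):
        clusters.setdefault(find(i), []).append(rec)
    return sorted(clusters.values())


names = ["john smith", "jon smith", "john smith", "mary jones", "mary jonas"]
print(link_records(names))
```

The all-pairs comparison is quadratic and is only meant to make the graph construction visible; the abstract's speedups come from techniques this sketch does not reproduce.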
Sequential and parallel image restoration: neural network implementations.

Science.gov (United States)

Figueiredo, M T; Leitao, J N

1994-01-01

Sequential and parallel image restoration algorithms and their implementations on neural networks are proposed. For images degraded by linear blur and contaminated by additive white Gaussian noise, maximum a posteriori (MAP) estimation and regularization theory lead to the same high dimension convex optimization problem. The commonly adopted strategy (in using neural networks for image restoration) is to map the objective function of the optimization problem into the energy of a predefined network, taking advantage of its energy minimization properties. Departing from this approach, we propose neural implementations of iterative minimization algorithms which are first proved to converge. The developed schemes are based on modified Hopfield (1985) networks of graded elements, with both sequential and parallel updating schedules. An algorithm supported on a fully standard Hopfield network (binary elements and zero autoconnections) is also considered. Robustness with respect to finite numerical precision is studied, and examples with real images are presented.
The parallel-sequential field subtraction technique for coherent nonlinear ultrasonic imaging

Science.gov (United States)

Cheng, Jingwei; Potter, Jack N.; Drinkwater, Bruce W.

2018-06-01

Nonlinear imaging techniques have recently emerged which have the potential to detect cracks at a much earlier stage than was previously possible and have sensitivity to partially closed defects. This study explores a coherent imaging technique based on the subtraction of two modes of focusing: parallel, in which the elements are fired together with a delay law, and sequential, in which elements are fired independently. In parallel focusing, a high intensity ultrasonic beam is formed in the specimen at the focal point. However, in sequential focusing only low intensity signals from individual elements enter the sample, and the full matrix of transmit-receive signals is recorded and post-processed to form an image. Under linear elastic assumptions, both parallel and sequential images are expected to be identical. Here we measure the difference between these images and use this to characterise the nonlinearity of small closed fatigue cracks. In particular, we monitor the change in relative phase and amplitude at the fundamental frequencies for each focal point and use this nonlinear coherent imaging metric to form images of the spatial distribution of nonlinearity. The results suggest the subtracted image can suppress linear features (e.g. back wall or large scatterers) effectively when instrumentation noise compensation is applied, thereby allowing damage to be detected at an early stage (c. 15% of fatigue life) and reliably quantified in later fatigue life.
From sequential to parallel programming with patterns

CERN Document Server

CERN. Geneva

2018-01-01

To increase both performance and efficiency, our programming models need to adapt to better exploit modern processors. The classic idioms and patterns for programming, such as loops, branches or recursion, are the pillars of almost every code and are well known among all programmers. These patterns all have in common that they are sequential in nature. Embracing parallel programming patterns, which allow us to program for multi- and many-core hardware in a natural way, greatly simplifies the task of designing a program that scales and performs on modern hardware, independently of the programming language used, and in a generic way.
On Modeling Large-Scale Multi-Agent Systems with Parallel, Sequential and Genuinely Asynchronous Cellular Automata

International Nuclear Information System (INIS)

Tosic, P.T.

2011-01-01

We study certain types of Cellular Automata (CA) viewed as an abstraction of large-scale Multi-Agent Systems (MAS). We argue that the classical CA model needs to be modified in several important respects in order to become a relevant and sufficiently general model for large-scale MAS, so that the generalized model can capture many important MAS properties at the level of agent ensembles and their long-term collective behavior patterns. We specifically focus on the issue of inter-agent communication in CA, and propose sequential cellular automata (SCA) as the first step, and genuinely Asynchronous Cellular Automata (ACA) as the ultimate deterministic CA-based abstract models for large-scale MAS made of simple reactive agents. We first formulate deterministic and nondeterministic versions of sequential CA, and then summarize some interesting configuration space properties (i.e., possible behaviors) of a restricted class of sequential CA. In particular, we compare and contrast those properties of sequential CA with the corresponding properties of the classical (that is, parallel and perfectly synchronous) CA with the same restricted class of update rules. We analytically demonstrate failure of the studied sequential CA models to simulate all possible behaviors of perfectly synchronous parallel CA, even for a very restricted class of non-linear totalistic node update rules. The lesson learned is that the interleaving semantics of concurrency, when applied to sequential CA, is not refined enough to adequately capture the perfect synchrony of parallel CA updates. Last but not least, we outline what would be an appropriate CA-like abstraction for large-scale distributed computing insofar as the inter-agent communication model is concerned, and in that context we propose genuinely asynchronous CA. (author)
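The difference between synchronous parallel and sequential updating can be seen on a very small example. The sketch below is an illustration, not the paper's construction or proof: it assumes a ring of cells, a radius-1 majority rule (one simple totalistic rule), and a fixed left-to-right sweep order for the sequential case. All names are hypothetical.

```python
def majority(left, me, right):
    """Totalistic rule: next state is 1 when at least two of three inputs are 1."""
    return 1 if left + me + right >= 2 else 0


def parallel_step(cells):
    """Perfectly synchronous update: every cell reads the old configuration."""
    n = len(cells)
    return [majority(cells[i - 1], cells[i], cells[(i + 1) % n]) for i in range(n)]


def sequential_sweep(cells, order=None):
    """Sequential update: cells change one at a time and see earlier changes."""
    cells = list(cells)
    n = len(cells)
    for i in (order if order is not None else range(n)):
        cells[i] = majority(cells[i - 1], cells[i], cells[(i + 1) % n])
    return cells


start = [0, 1, 0, 1]
print(parallel_step(start))                  # [1, 0, 1, 0]
print(parallel_step(parallel_step(start)))   # [0, 1, 0, 1]
print(sequential_sweep(start))               # [1, 1, 1, 1]
```

From the alternating configuration the parallel automaton oscillates with period two, while the sequential sweep reaches a fixed point. This is one concrete instance of a parallel behavior that the sequential schedule shown here does not reproduce.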
Efficient sequential and parallel algorithms for finding edit distance based motifs.

Science.gov (United States)

Pal, Soumitra; Xiao, Peng; Rajasekaran, Sanguthevar

2016-08-18

Motif search is an important step in extracting meaningful patterns from biological data. The general problem of motif search is intractable and there is a pressing need to develop efficient, exact and approximation algorithms to solve this problem. In this paper, we present several novel, exact, sequential and parallel algorithms for solving the (l,d) Edit-distance-based Motif Search (EMS) problem: given two integers l,d and n biological strings, find all strings of length l that appear in each input string with at most d errors of types substitution, insertion and deletion. One popular technique to solve the problem is to explore for each input string the set of all possible l-mers that belong to the d-neighborhood of any substring of the input string and output those which are common for all input strings. We introduce a novel and provably efficient neighborhood exploration technique. We show that it is enough to consider the candidates in the neighborhood which are at a distance exactly d. We compactly represent these candidate motifs using wildcard characters and efficiently explore them with very few repetitions. Our sequential algorithm uses a trie based data structure to efficiently store and sort the candidate motifs. Our parallel algorithm in a multi-core shared memory setting uses arrays for storing and a novel modification of radix-sort for sorting the candidate motifs. The algorithms for EMS are customarily evaluated on several challenging instances such as (8,1), (12,2), (16,3), (20,4), and so on. The best previously known algorithm, EMS1, is sequential and in an estimated 3 days solves up to instance (16,3). Our sequential algorithms are more than 20 times faster on (16,3). On other hard instances such as (9,2), (11,3), (13,4), our algorithms are much faster. Our parallel algorithm has more than 600% scaling performance while using 16 threads. Our algorithms have pushed up the state-of-the-art of EMS solvers and we believe that the techniques introduced in
Air-side performance of a parallel-flow parallel-fin (PF²) heat exchanger in sequential frosting

Energy Technology Data Exchange (ETDEWEB)

Zhang, Ping [Zhejiang Vocational College of Commerce, Hangzhou, Binwen Road 470 (China); Department of Mechanical Science and Engineering, University of Illinois at Urbana-Champaign, 1206 West Green Street, Urbana, IL 61801 (United States); Hrnjak, P.S. [Department of Mechanical Science and Engineering, University of Illinois at Urbana-Champaign, 1206 West Green Street, Urbana, IL 61801 (United States)

2010-09-15

The thermal-hydraulic performance in periodic frosting conditions is experimentally studied for the parallel-flow parallel-fin heat exchanger, henceforth referred to as a PF² heat exchanger, a new style of heat exchanger that uses louvered bent fins on flat tubes to enhance water drainage when the flat tubes are horizontal. Typically, it takes a few frosting/defrosting cycles to come to repeatable conditions. The criterion for the initiation of defrost and a sufficiently long defrost period are determined for the test PF² heat exchanger and test condition. The effects of blower operation on the pressure drop, frost accumulation, water retention, and capacity in time are compared under the conditions of 15 sequential frosting cycles. Pressure drop across the heat exchanger and overall heat transfer coefficient are quantified under frost conditions as functions of the air humidity and air face velocity. The performances of two types of flat-tube heat exchangers, PF² heat exchanger and conventional parallel-flow serpentine-fin (PFSF) heat exchanger, are compared and the results obtained are presented. (author)
OPTIMIZATION OF AGGREGATION AND SEQUENTIAL-PARALLEL EXECUTION MODES OF INTERSECTING OPERATION SETS

Directory of Open Access Journals (Sweden)

G. М. Levin

2016-01-01

A mathematical model and a method for the problem of optimization of aggregation and of sequential-parallel execution modes of intersecting operation sets are proposed. The proposed method is based on a two-level decomposition scheme. At the top level the variant of aggregation for groups of operations is selected, and at the lower level the execution modes of operations are optimized for a fixed version of aggregation.
Parameter sampling capabilities of sequential and simultaneous data assimilation: I. Analytical comparison

International Nuclear Information System (INIS)

Fossum, Kristian; Mannseth, Trond

2014-01-01

We assess the parameter sampling capabilities of some Bayesian, ensemble-based, joint state-parameter (JS) estimation methods. The forward model is assumed to be non-chaotic and have nonlinear components, and the emphasis is on results obtained for the parameters in the state-parameter vector. A variety of approximate sampling methods exist, and a number of numerical comparisons between such methods have been performed. Often, more than one of the defining characteristics vary from one method to another, so it can be difficult to point out which characteristic of the more successful method in such a comparison was decisive. In this study, we single out one defining characteristic for comparison; whether or not data are assimilated sequentially or simultaneously. The current paper is concerned with analytical investigations into this issue. We carefully select one sequential and one simultaneous JS method for the comparison. We also design a corresponding pair of pure parameter estimation methods, and we show how the JS methods and the parameter estimation methods are pairwise related. It is shown that the sequential and the simultaneous parameter estimation methods are equivalent for one particular combination of observations with different degrees of nonlinearity. Strong indications are presented for why one may expect the sequential parameter estimation method to outperform the simultaneous parameter estimation method for all other combinations of observations. Finally, the conditions for when similar relations can be expected to hold between the corresponding JS methods are discussed. A companion paper, part II (Fossum and Mannseth 2014 Inverse Problems 30 114003), is concerned with statistical analysis of results from a range of numerical experiments involving sequential and simultaneous JS estimation, where the design of the numerical investigation is motivated by our findings in the current paper. (paper)
Refinement of Parallel and Reactive Programs

OpenAIRE

Back, R. J. R.

1992-01-01

We show how to apply the refinement calculus to stepwise refinement of parallel and reactive programs. We use action systems as our basic program model. Action systems are sequential programs which can be implemented in a parallel fashion. Hence refinement calculus methods, originally developed for sequential programs, carry over to the derivation of parallel programs. Refinement of reactive programs is handled by data refinement techniques originally developed for the sequential refinement c...
On synchronous parallel computations with independent probabilistic choice

International Nuclear Information System (INIS)

Reif, J.H.

1984-01-01

This paper introduces probabilistic choice to synchronous parallel machine models; in particular parallel RAMs. The power of probabilistic choice in parallel computations is illustrated by parallelizing some known probabilistic sequential algorithms. The authors characterize the computational complexity of time, space, and processor bounded probabilistic parallel RAMs in terms of the computational complexity of probabilistic sequential RAMs. They show that parallelism uniformly speeds up time bounded probabilistic sequential RAM computations by nearly a quadratic factor. They also show that probabilistic choice can be eliminated from parallel computations by introducing nonuniformity.
A sequential/parallel track selector

CERN Document Server

Bertolino, F; Bressani, Tullio; Chiavassa, E; Costa, S; Dellacasa, G; Gallio, M; Musso, A

1980-01-01

A medium speed (approximately 1 μs) hardware pre-analyzer for the selection of events detected in four planes of drift chambers in the magnetic field of the Omicron Spectrometer at the CERN SC is described. Specific geometrical criteria determine patterns of hits in the four planes of vertical wires that have to be recognized and that are stored as patterns of '1's in random access memories. Pairs of good hits are found sequentially, then the RAMs are used as look-up tables. (6 refs).
Comparison Of Hybrid Sorting Algorithms Implemented On Different Parallel Hardware Platforms

Directory of Open Access Journals (Sweden)

Dominik Zurek

2013-01-01

Sorting is a common problem in computer science. There are many well-known sorting algorithms designed for sequential execution on a single processor. Recent hardware platforms make it possible to create widely parallel algorithms: standard processors consist of multiple cores, and hardware accelerators such as GPUs are available. Graphics cards, with their parallel architecture, offer new possibilities for speeding up many algorithms. In this paper we describe the results of implementing a few different sorting algorithms on GPU cards and multicore processors. A hybrid algorithm is then presented which consists of parts executed on both platforms, a standard CPU and a GPU.
Exact parallel maximum clique algorithm for general and protein graphs.

Science.gov (United States)

Depolli, Matjaž; Konc, Janez; Rozman, Kati; Trobec, Roman; Janežič, Dušanka

2013-09-23

A new exact parallel maximum clique algorithm MaxCliquePara, which finds the maximum clique (the fully connected subgraph) in undirected general and protein graphs, is presented. First, a new branch and bound algorithm for finding a maximum clique on a single computer core, which builds on ideas presented in two published state of the art sequential algorithms is implemented. The new sequential MaxCliqueSeq algorithm is faster than the reference algorithms on both DIMACS benchmark graphs as well as on protein-derived product graphs used for protein structural comparisons. Next, the MaxCliqueSeq algorithm is parallelized by splitting the branch-and-bound search tree to multiple cores, resulting in MaxCliquePara algorithm. The ability to exploit all cores efficiently makes the new parallel MaxCliquePara algorithm markedly superior to other tested algorithms. On a 12-core computer, the parallelization provides up to 2 orders of magnitude faster execution on the large DIMACS benchmark graphs and up to an order of magnitude faster execution on protein product graphs. The algorithms are freely accessible on http://commsys.ijs.si/~matjaz/maxclique.
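A bare-bones branch-and-bound search for a maximum clique is sketched below to show the kind of search tree the abstract refers to. It is not MaxCliqueSeq or MaxCliquePara: the only pruning rule is the simple size bound (current clique plus remaining candidates cannot beat the best found), and the function name and graph encoding are assumptions made for this illustration.

```python
def max_clique(adj):
    """Return one maximum clique of an undirected graph.

    adj maps each vertex to the set of its neighbours.
    """
    best = []

    def expand(clique, candidates):
        nonlocal best
        if len(clique) > len(best):
            best = clique[:]
        while candidates:
            # Bound: even taking every candidate cannot beat the incumbent.
            if len(clique) + len(candidates) <= len(best):
                return
            v = candidates.pop()
            # Branch: keep only candidates adjacent to v.
            expand(clique + [v], [u for u in candidates if u in adj[v]])

    expand([], sorted(adj))
    return sorted(best)


graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
print(max_clique(graph))  # [0, 1, 2]
```

Each iteration of the top-level loop explores an independent subtree, which is where a parallel version could hand branches to different cores; how the published algorithm actually splits and balances the tree is not described in the abstract.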
Parallel genetic algorithms with migration for the hybrid flow shop scheduling problem

Directory of Open Access Journals (Sweden)

K. Belkadi

2006-01-01

This paper addresses scheduling problems in hybrid flow shop-like systems with a migration parallel genetic algorithm (PGA_MIG). This parallel genetic algorithm model allows genetic diversity by the application of selection and reproduction mechanisms nearer to nature. The space structure of the population is modified by dividing it into disjoint subpopulations. From time to time, individuals are exchanged between the different subpopulations (migration). Influence of parameters and dedicated strategies are studied. These parameters are the number of independent subpopulations, the interconnection topology between subpopulations, the choice/replacement strategy of the migrant individuals, and the migration frequency. A comparison between the sequential and parallel versions of the genetic algorithm (GA) is provided. This comparison relates to the quality of the solution and the execution time of the two versions. The efficiency of the parallel model highly depends on the parameters and especially on the migration frequency. In the same way, this parallel model gives a significant improvement of computational time if it is implemented on a parallel architecture which offers an acceptable number of processors (as many processors as subpopulations).
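The parameters listed in the abstract (number of subpopulations, interconnection topology, choice/replacement strategy, migration frequency) map directly onto a small island-model GA. The sketch below is a toy under stated assumptions, not PGA_MIG: the problem is OneMax rather than flow shop scheduling, the topology is a ring, the best individual of each island replaces the worst of its neighbour, and the islands are iterated in a single process. All names and default values are hypothetical.

```python
import random


def fitness(bits):
    """OneMax: number of 1 bits (a stand-in objective, not a scheduling cost)."""
    return sum(bits)


def evolve(pop, rng, mutation_rate=0.02):
    """One generation: elitism, binary tournaments, one-point crossover, mutation."""
    children = [max(pop, key=fitness)[:]]           # keep the island's best
    while len(children) < len(pop):
        p1 = max(rng.sample(pop, 2), key=fitness)
        p2 = max(rng.sample(pop, 2), key=fitness)
        cut = rng.randrange(1, len(p1))
        child = p1[:cut] + p2[cut:]
        child = [b ^ 1 if rng.random() < mutation_rate else b for b in child]
        children.append(child)
    return children


def island_ga(n_islands=4, pop_size=20, n_bits=32, generations=60,
              migration_every=10, seed=0):
    rng = random.Random(seed)
    islands = [[[rng.randint(0, 1) for _ in range(n_bits)]
                for _ in range(pop_size)] for _ in range(n_islands)]
    for gen in range(1, generations + 1):
        # Islands evolve independently between migrations, so this loop is
        # the part that could run on one processor per subpopulation.
        islands = [evolve(pop, rng) for pop in islands]
        if gen % migration_every == 0:
            # Ring topology: island k receives the best of island k - 1.
            migrants = [max(pop, key=fitness)[:] for pop in islands]
            for k, pop in enumerate(islands):
                worst = min(range(len(pop)), key=lambda i: fitness(pop[i]))
                pop[worst] = migrants[(k - 1) % n_islands]
    return max((ind for pop in islands for ind in pop), key=fitness)


best = island_ga()
print(fitness(best), "of", len(best))
```

Lowering `migration_every` couples the islands more tightly, while raising it lets them drift apart, which is one way to experiment with the sensitivity to migration frequency that the abstract reports.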
Using Hadoop MapReduce for Parallel Genetic Algorithms: A Comparison of the Global, Grid and Island Models.

Science.gov (United States)

Ferrucci, Filomena; Salza, Pasquale; Sarro, Federica

2017-06-29

The need to improve the scalability of Genetic Algorithms (GAs) has motivated the research on Parallel Genetic Algorithms (PGAs), and different technologies and approaches have been used. Hadoop MapReduce represents one of the most mature technologies to develop parallel algorithms. Based on the fact that parallel algorithms introduce communication overhead, the aim of the present work is to understand if, and possibly when, the parallel GAs solutions using Hadoop MapReduce show better performance than sequential versions in terms of execution time. Moreover, we are interested in understanding which PGA model can be most effective among the global, grid, and island models. We empirically assessed the performance of these three parallel models with respect to a sequential GA on a software engineering problem, evaluating the execution time and the achieved speedup. We also analysed the behaviour of the parallel models in relation to the overhead produced by the use of Hadoop MapReduce and the GAs' computational effort, which gives a more machine-independent measure of these algorithms. We exploited three problem instances to differentiate the computation load and three cluster configurations based on 2, 4, and 8 parallel nodes. Moreover, we estimated the costs of the execution of the experimentation on a potential cloud infrastructure, based on the pricing of the major commercial cloud providers. The empirical study revealed that the use of PGA based on the island model outperforms the other parallel models and the sequential GA for all the considered instances and clusters. Using 2, 4, and 8 nodes, the island model achieves an average speedup over the three datasets of 1.8, 3.4, and 7.0 times, respectively. Hadoop MapReduce has a set of different constraints that need to be considered during the design and the implementation of parallel algorithms. The overhead of data store (i.e., HDFS) accesses, communication, and latency requires solutions that reduce data store
Parallelism in computations in quantum and statistical mechanics

International Nuclear Information System (INIS)

Clementi, E.; Corongiu, G.; Detrich, J.H.

1985-01-01

Often very fundamental biochemical and biophysical problems defy simulations because of limitations in today's computers. We present and discuss a distributed system composed of two IBM 4341s and/or an IBM 4381 as front-end processors and ten FPS-164 attached array processors. This parallel system - called LCAP - has presently a peak performance of about 110 Mflops; extensions to higher performance are discussed. Presently, the system applications use a modified version of VM/SP as the operating system; a description of the modifications is given. Three application programs have been migrated from sequential to parallel: a molecular quantum mechanical, a Metropolis-Monte Carlo and a molecular dynamics program. The parallel codes are briefly outlined. Use of these parallel codes has already opened up new capabilities for our research. The very positive performance comparisons with today's supercomputers allow us to conclude that parallel computers and programming, of the type we have considered, represent a pragmatic answer to many computationally intensive problems. (orig.)

Pattern-Driven Automatic Parallelization

Directory of Open Access Journals (Sweden)

Christoph W. Kessler

1996-01-01

This article describes a knowledge-based system for automatic parallelization of a wide class of sequential numerical codes operating on vectors and dense matrices, and for execution on distributed memory message-passing multiprocessors. Its main feature is a fast and powerful pattern recognition tool that locally identifies frequently occurring computations and programming concepts in the source code. This tool also works for dusty deck codes that have been "encrypted" by former machine-specific code transformations. Successful pattern recognition guides sophisticated code transformations including local algorithm replacement such that the parallelized code need not emerge from the sequential program structure by just parallelizing the loops. It allows access to an expert's knowledge on useful parallel algorithms, available machine-specific library routines, and powerful program transformations. The partially restored program semantics also supports local array alignment, distribution, and redistribution, and allows for faster and more exact prediction of the performance of the parallelized target code than is usually possible.
A parallel approach to the stable marriage problem

DEFF Research Database (Denmark)

Larsen, Jesper

1997-01-01

This paper describes two parallel algorithms for the stable marriage problem implemented on a MIMD parallel computer. The algorithms are tested against sequential algorithms on randomly generated and worst-case instances. The results clearly show that the combination of a very simple problem...... and a commercial MIMD system results in parallel algorithms which are not competitive with sequential algorithms with respect to practical performance. 1 Introduction In 1962 the Stable Marriage Problem was....
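The abstract does not say which sequential algorithms served as the baseline. The classic sequential method for this problem is the Gale-Shapley deferred-acceptance algorithm, sketched here as an assumption about what such a baseline looks like. Complete preference lists and equally sized groups are assumed, and the names are hypothetical.

```python
def gale_shapley(men_prefs, women_prefs):
    """Men-proposing deferred acceptance; returns a dict man -> woman."""
    # rank[w][m] is the position of m in w's list (lower is better).
    rank = {w: {m: r for r, m in enumerate(prefs)}
            for w, prefs in women_prefs.items()}
    next_choice = {m: 0 for m in men_prefs}   # index of next woman to propose to
    engaged = {}                              # woman -> man
    free = list(men_prefs)
    while free:
        m = free.pop()
        w = men_prefs[m][next_choice[m]]
        next_choice[m] += 1
        if w not in engaged:
            engaged[w] = m
        elif rank[w][m] < rank[w][engaged[w]]:
            free.append(engaged[w])           # w trades up, old partner is free
            engaged[w] = m
        else:
            free.append(m)                    # w rejects m
    return {m: w for w, m in engaged.items()}


def is_stable(matching, men_prefs, women_prefs):
    """True when no man and woman both prefer each other to their partners."""
    husband = {w: m for m, w in matching.items()}
    for m, prefs in men_prefs.items():
        for w in prefs:
            if w == matching[m]:
                break                         # remaining women are less preferred
            if women_prefs[w].index(m) < women_prefs[w].index(husband[w]):
                return False
    return True


men = {"a": ["x", "y"], "b": ["x", "y"]}
women = {"x": ["b", "a"], "y": ["a", "b"]}
match = gale_shapley(men, women)
print(match, is_stable(match, men, women))
```

Each proposal is a few dictionary lookups, so the sequential algorithm does very little work per step. That is consistent with the abstract's finding that a parallel machine has little to gain on such a simple problem.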
Comparison of Sequential and Variational Data Assimilation

Science.gov (United States)

Alvarado Montero, Rodolfo; Schwanenberg, Dirk; Weerts, Albrecht

2017-04-01

Data assimilation is a valuable tool to improve model state estimates by combining measured observations with model simulations. It has recently gained significant attention due to its potential in using remote sensing products to improve operational hydrological forecasts and for reanalysis purposes. This has been supported by the application of sequential techniques such as the Ensemble Kalman Filter, which require no additional features within the modeling process, i.e. they can use arbitrary black-box models. Alternatively, variational techniques rely on optimization algorithms to minimize a pre-defined objective function. This function describes the trade-off between the amount of noise introduced into the system and the mismatch between simulated and observed variables. While sequential techniques have been commonly applied to hydrological processes, variational techniques are seldom used. In our belief, this is mainly attributed to the required computation of first order sensitivities by algorithmic differentiation techniques and related model enhancements, but also to a lack of comparison between both techniques. We contribute to filling this gap and present the results from the assimilation of streamflow data in two basins located in Germany and Canada. The assimilation introduces noise to precipitation and temperature to produce better initial estimates of an HBV model. The results are computed for a hindcast period and assessed using lead time performance metrics. The study concludes with a discussion of the main features of each technique and their advantages/disadvantages in hydrological applications.
Parallelization and implementation of approximate root isolation for nonlinear system by Monte Carlo

Science.gov (United States)

Khosravi, Ebrahim

1998-12-01

This dissertation addresses the fundamental problem of isolating the real roots of nonlinear systems of equations using a Monte Carlo algorithm published by Bush Jones. This algorithm requires only function values and can be applied readily to complicated systems of transcendental functions. The implementation of this sequential algorithm provides scientists with the means to utilize function analysis in mathematics or other fields of science. The algorithm, however, is so computationally intensive that it is limited to a very small set of variables, which makes it infeasible for large systems of equations. A computational technique was also needed to prevent the algorithm from converging to the same root along different paths of computation. The research provides techniques for improving the efficiency and correctness of the algorithm. The sequential algorithm for this technique was corrected and a parallel algorithm is presented. This parallel method has been formally analyzed and is compared with other known methods of root isolation. The effectiveness, efficiency, and enhanced overall performance of the parallel program in comparison to sequential processing are discussed. The message passing model was used for this parallel processing, and it is presented and implemented on Intel/860 MIMD architecture. The parallel processing proposed in this research has been implemented in an ongoing high energy physics experiment: this algorithm has been used to track neutrinos in a super K detector. This experiment is located in Japan, and data can be processed on-line or off-line, locally or remotely.
Work-Efficient Parallel Skyline Computation for the GPU

DEFF Research Database (Denmark)

Bøgh, Kenneth Sejdenfaden; Chester, Sean; Assent, Ira

2015-01-01

offers the potential for parallelizing skyline computation across thousands of cores. However, attempts to port skyline algorithms to the GPU have prioritized throughput and failed to outperform sequential algorithms. In this paper, we introduce a new skyline algorithm, designed for the GPU, that uses...... a global, static partitioning scheme. With the partitioning, we can permit controlled branching to exploit transitive relationships and avoid most point-to-point comparisons. The result is a non-traditional GPU algorithm, SkyAlign, that prioritizes work-efficiency and respectable throughput, rather than...
Efficient sequential and parallel algorithms for planted motif search.

Science.gov (United States)

Nicolae, Marius; Rajasekaran, Sanguthevar

2014-01-31

Motif searching is an important step in the detection of rare events occurring in a set of DNA or protein sequences. One formulation of the problem is known as (l,d)-motif search or Planted Motif Search (PMS). In PMS we are given two integers l and d and n biological sequences. We want to find all sequences of length l that appear in each of the input sequences with at most d mismatches. The PMS problem is NP-complete. PMS algorithms are typically evaluated on certain instances considered challenging. Despite ample research in the area, a considerable performance gap exists because many state of the art algorithms have large runtimes even for moderately challenging instances. This paper presents a fast exact parallel PMS algorithm called PMS8. PMS8 is the first algorithm to solve the challenging (l,d) instances (25,10) and (26,11). PMS8 is also efficient on instances with larger l and d such as (50,21). We include a comparison of PMS8 with several state of the art algorithms on multiple problem instances. This paper also presents necessary and sufficient conditions for 3 l-mers to have a common d-neighbor. The program is freely available at http://engr.uconn.edu/~man09004/PMS8/. We present PMS8, an efficient exact algorithm for Planted Motif Search. PMS8 introduces novel ideas for generating common neighborhoods. We have also implemented a parallel version for this algorithm. PMS8 can solve instances not solved by any previous algorithms.
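The (l,d) problem statement above can be made concrete with a brute-force sketch. This is not PMS8 and uses none of its pruning ideas; it simply enumerates every candidate l-mer over an assumed DNA alphabet and keeps those that occur in each input sequence with at most d mismatches. The function names are illustrative, and the approach is only practical for very small l.

```python
from itertools import product


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def occurs_within(motif, seq, d):
    """True if some window of seq is within d mismatches of motif."""
    l = len(motif)
    return any(hamming(motif, seq[i:i + l]) <= d
               for i in range(len(seq) - l + 1))


def planted_motif_search(seqs, l, d, alphabet="ACGT"):
    """Exhaustively list all l-mers that appear in every sequence
    with at most d mismatches (exponential in l)."""
    motifs = []
    for cand in product(alphabet, repeat=l):
        motif = "".join(cand)
        if all(occurs_within(motif, s, d) for s in seqs):
            motifs.append(motif)
    return motifs
```

For example, with the three toy sequences "ACGTAC", "TTACGA" and "GACGTT", the call planted_motif_search(seqs, 3, 0) finds the single exact common 3-mer "ACG".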
A Comparison of Sequential and GPU Implementations of Iterative Methods to Compute Reachability Probabilities

Directory of Open Access Journals (Sweden)

Elise Cormie-Bowins

2012-10-01

We consider the problem of computing reachability probabilities: given a Markov chain, an initial state of the Markov chain, and a set of goal states of the Markov chain, what is the probability of reaching any of the goal states from the initial state? This problem can be reduced to solving a linear equation Ax = b for x, where A is a matrix and b is a vector. We consider two iterative methods to solve the linear equation: the Jacobi method and the biconjugate gradient stabilized (BiCGStab) method. For both methods, a sequential and a parallel version have been implemented. The parallel versions have been implemented on the compute unified device architecture (CUDA) so that they can be run on an NVIDIA graphics processing unit (GPU). From our experiments we conclude that as the size of the matrix increases, the CUDA implementations outperform the sequential implementations. Furthermore, the BiCGStab method performs better than the Jacobi method for dense matrices, whereas the Jacobi method does better for sparse ones. Since the reachability probabilities problem plays a key role in probabilistic model checking, we also compared the implementations for matrices obtained from a probabilistic model checker. Our experiments support the conjecture by Bosnacki et al. that the Jacobi method is superior to Krylov subspace methods, a class to which the BiCGStab method belongs, for probabilistic model checking.
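A minimal sequential sketch of the Jacobi iteration for this problem is shown below. It is not the authors' implementation (theirs also has a CUDA version); it assumes the chain is given as a dense row-stochastic matrix P (a list of lists) and that goal is a set of state indices. For each non-goal state i the update is x_i = (sum over j != i of P[i][j] * x_j) / (1 - P[i][i]), where goal states are held at 1. The function name and parameters are illustrative.

```python
def reachability_jacobi(P, goal, tol=1e-12, max_iter=10_000):
    """Approximate the probability of reaching a goal state from every
    state of a Markov chain using Jacobi iteration."""
    n = len(P)
    # Goal states have probability 1; start everything else at 0.
    x = [1.0 if s in goal else 0.0 for s in range(n)]
    for _ in range(max_iter):
        new = x[:]
        for i in range(n):
            if i in goal:
                continue
            diag = 1.0 - P[i][i]
            if diag == 0.0:
                # Absorbing non-goal state: the goal is never reached.
                new[i] = 0.0
                continue
            # Jacobi step: only values from the previous sweep are used,
            # so every i could be updated independently (in parallel).
            off = sum(P[i][j] * x[j] for j in range(n) if j != i)
            new[i] = off / diag
        if max(abs(a - b) for a, b in zip(new, x)) < tol:
            return new
        x = new
    return x
```

Because each component of the new vector depends only on the previous sweep, the inner loop over i is the part that maps naturally onto GPU threads.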
An Evaluation of Parallel Synchronous and Conservative Asynchronous Logic-Level Simulations

Directory of Open Access Journals (Sweden)

Ausif Mahmood

1996-01-01

a circuit remain fixed during the entire simulation. We remove this limitation and, by extending the analyses to multi-input, multi-output circuits with an arbitrary number of input events, show that the conservative asynchronous simulation extracts more parallelism and executes faster than synchronous simulation in general. Our conclusions are supported by a comparison of the idealized execution times of synchronous and conservative asynchronous algorithms on ISCAS combinational and sequential benchmark circuits.
A Parallel Approach to Fractal Image Compression

OpenAIRE

Lubomir Dedera

2004-01-01

The paper deals with a parallel approach to coding and decoding algorithms in fractal image compression and presents experimental results comparing sequential and parallel algorithms from the point of view of both achieved coding and decoding time and effectiveness of parallelization.
Leveraging Parallel Data Processing Frameworks with Verified Lifting

Directory of Open Access Journals (Sweden)

Maaz Bin Safeer Ahmad

2016-11-01

Many parallel data frameworks have been proposed in recent years that let sequential programs access parallel processing. To capitalize on the benefits of such frameworks, existing code must often be rewritten to the domain-specific languages that each framework supports. This rewriting, which is tedious and error-prone, also requires developers to choose the framework that best optimizes performance given a specific workload. This paper describes Casper, a novel compiler that automatically retargets sequential Java code for execution on Hadoop, a parallel data processing framework that implements the MapReduce paradigm. Given a sequential code fragment, Casper uses verified lifting to infer a high-level summary expressed in our program specification language that is then compiled for execution on Hadoop. We demonstrate that Casper automatically translates Java benchmarks into Hadoop. The translated results execute on average 3.3x faster than the sequential implementations and scale better, as well, to larger datasets.
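To illustrate the kind of rewrite involved, the sketch below shows a sequential accumulation loop next to an equivalent map/reduce formulation. It is written in Python rather than Java, it is not Casper output, and it does not use Casper's specification language; the function names are hypothetical. The point is only that the loop body splits into an independent per-element map step and an associative reduce step, which is the shape a MapReduce framework can distribute.

```python
from functools import reduce


def sum_of_squares_sequential(values):
    """Plain sequential loop with a running accumulator."""
    total = 0
    for v in values:
        total += v * v
    return total


def sum_of_squares_map_reduce(values):
    """Same computation as a map step followed by a reduce step."""
    mapped = map(lambda v: v * v, values)           # independent per element
    return reduce(lambda a, b: a + b, mapped, 0)    # associative combine
```

Both functions return the same result for any list of numbers; only the second exposes the structure needed for parallel execution.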
Robustness of the Sequential Lineup Advantage

Science.gov (United States)

Gronlund, Scott D.; Carlson, Curt A.; Dailey, Sarah B.; Goodsell, Charles A.

2009-01-01

A growing movement in the United States and around the world involves promoting the advantages of conducting an eyewitness lineup in a sequential manner. We conducted a large study (N = 2,529) that included 24 comparisons of sequential versus simultaneous lineups. A liberal statistical criterion revealed only 2 significant sequential lineup…
Multi-Stage Recognition of Speech Emotion Using Sequential Forward Feature Selection

Directory of Open Access Journals (Sweden)

Liogienė Tatjana

2016-07-01

The intensive research of speech emotion recognition introduced a huge collection of speech emotion features. Large feature sets complicate the speech emotion recognition task. Among various feature selection and transformation techniques for one-stage classification, multiple classifier systems were proposed. The main idea of multiple classifiers is to arrange the emotion classification process in stages. Besides parallel and serial cases, the hierarchical arrangement of multi-stage classification is most widely used for speech emotion recognition. In this paper, we present a sequential-forward-feature-selection-based multi-stage classification scheme. The Sequential Forward Selection (SFS) and Sequential Floating Forward Selection (SFFS) techniques were employed for every stage of the multi-stage classification scheme. Experimental testing of the proposed scheme was performed using the German and Lithuanian emotional speech datasets. Sequential-feature-selection-based multi-stage classification outperformed the single-stage scheme by 12–42 % for different emotion sets. The multi-stage scheme has shown higher robustness to the growth of the emotion set. The decrease in recognition rate with the increase in emotion set for the multi-stage scheme was lower by 10–20 % in comparison with the single-stage case. Differences in SFS and SFFS employment for feature selection were negligible.
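Sequential Forward Selection can be sketched as a greedy loop: start from an empty set and repeatedly add the single feature that most improves a scoring function. The sketch below is generic and is not the paper's multi-stage scheme; the score callable stands in for whatever classifier accuracy estimate is used, and all names are illustrative assumptions. SFFS, which additionally tries removing features after each addition, is not shown.

```python
def sequential_forward_selection(features, score, k):
    """Greedily pick up to k features, at each step adding the feature
    whose inclusion gives the highest score for the selected subset."""
    selected = []
    remaining = list(features)
    while remaining and len(selected) < k:
        best = max(remaining, key=lambda f: score(selected + [f]))
        selected.append(best)
        remaining.remove(best)
    return selected
```

With a toy score in which two features carry redundant information, the greedy search picks the strongest feature first and then skips its redundant twin in favour of a complementary one.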
Evaluating parallel optimization on transputers

Directory of Open Access Journals (Sweden)

A.G. Chalmers

2003-12-01

The faster processing power of modern computers and the development of efficient algorithms have made it possible for operations researchers to tackle a much wider range of problems than ever before. Further improvements in processing speed can be achieved utilising relatively inexpensive transputers to process components of an algorithm in parallel. The Davidon-Fletcher-Powell method is one of the most successful and widely used optimisation algorithms for unconstrained problems. This paper examines the algorithm and identifies the components that can be processed in parallel. The results of some experiments with these components are presented, which indicate under what conditions parallel processing with an inexpensive configuration is likely to be faster than the traditional sequential implementations. The performance of the whole algorithm with its parallel components is then compared with the original sequential algorithm. The implementation serves to illustrate the practicalities of speeding up typical OR algorithms in terms of difficulty, effort and cost. The results give an indication of the savings in time a given parallel implementation can be expected to yield.
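The abstract does not say which components were parallelized, so the following is only a sketch of one standard building block: the Davidon-Fletcher-Powell rank-two update of the inverse Hessian approximation H, given the step s = x_new - x_old and the gradient change y = g_new - g_old. The update is H + (s s^T)/(s^T y) - (H y y^T H)/(y^T H y), and H is assumed symmetric. Each entry of the result depends only on s, y and the vector H y, which is why such matrix updates are natural candidates for parallel evaluation. The function name is illustrative.

```python
def dfp_update(H, s, y):
    """Return the DFP-updated inverse Hessian approximation.

    H is a symmetric n x n matrix (list of lists); s and y are
    length-n lists. Requires s.y != 0 and y.H.y != 0.
    """
    n = len(s)
    Hy = [sum(H[i][j] * y[j] for j in range(n)) for i in range(n)]
    sy = sum(s[i] * y[i] for i in range(n))
    yHy = sum(y[i] * Hy[i] for i in range(n))
    # Every (i, j) entry is independent of the others.
    return [[H[i][j] + s[i] * s[j] / sy - Hy[i] * Hy[j] / yHy
             for j in range(n)]
            for i in range(n)]
```

A quick sanity check on any implementation is the secant condition: the updated matrix applied to y must reproduce s.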
A parallel buffer tree

DEFF Research Database (Denmark)

Sitchinava, Nodar; Zeh, Norbert

2012-01-01

We present the parallel buffer tree, a parallel external memory (PEM) data structure for batched search problems. This data structure is a non-trivial extension of Arge's sequential buffer tree to a private-cache multiprocessor environment and reduces the number of I/O operations by the number of...... in the optimal O(psortN + K/PB) parallel I/O complexity, where K is the size of the output reported in the process and psortN is the parallel I/O complexity of sorting N elements using P processors....
Breast conserving treatment for breast cancer: dosimetric comparison of sequential versus simultaneous integrated photon boost.

Science.gov (United States)

Van Parijs, Hilde; Reynders, Truus; Heuninckx, Karina; Verellen, Dirk; Storme, Guy; De Ridder, Mark

2014-01-01

Breast conserving surgery followed by whole breast irradiation is widely accepted as standard of care for early breast cancer. Addition of a boost dose to the initial tumor area further reduces local recurrences. We investigated the dosimetric benefits of a simultaneously integrated boost (SIB) compared to a sequential boost to hypofractionate the boost volume, while maintaining normofractionation on the breast. For 10 patients 4 treatment plans were deployed, 1 with a sequential photon boost, and 3 with different SIB techniques: on a conventional linear accelerator, helical TomoTherapy, and static TomoDirect. Dosimetric comparison was performed. PTV-coverage was good in all techniques. Conformity was better with all SIB techniques compared to sequential boost (P = 0.0001). There was less dose spilling to the ipsilateral breast outside the PTVboost (P = 0.04). The dose to the organs at risk (OAR) was not influenced by SIB compared to sequential boost. Helical TomoTherapy showed a higher mean dose to the contralateral breast, but less than 5 Gy for each patient. SIB showed less dose spilling within the breast and equal dose to OAR compared to sequential boost. Both helical TomoTherapy and the conventional technique delivered acceptable dosimetry. SIB seems a safe alternative and can be implemented in clinical routine.
Parallelization and automatic data distribution for nuclear reactor simulations

Energy Technology Data Exchange (ETDEWEB)

Liebrock, L.M. [Liebrock-Hicks Research, Calumet, MI (United States)

1997-07-01

Detailed attempts at realistic nuclear reactor simulations currently take many times real time to execute on high performance workstations. Even the fastest sequential machine can not run these simulations fast enough to ensure that the best corrective measure is used during a nuclear accident to prevent a minor malfunction from becoming a major catastrophe. Since sequential computers have nearly reached the speed of light barrier, these simulations will have to be run in parallel to make significant improvements in speed. In physical reactor plants, parallelism abounds. Fluids flow, controls change, and reactions occur in parallel with only adjacent components directly affecting each other. These do not occur in the sequentialized manner, with global instantaneous effects, that is often used in simulators. Development of parallel algorithms that more closely approximate the real-world operation of a reactor may, in addition to speeding up the simulations, actually improve the accuracy and reliability of the predictions generated. Three types of parallel architecture (shared memory machines, distributed memory multicomputers, and distributed networks) are briefly reviewed as targets for parallelization of nuclear reactor simulation. Various parallelization models (loop-based model, shared memory model, functional model, data parallel model, and a combined functional and data parallel model) are discussed along with their advantages and disadvantages for nuclear reactor simulation. A variety of tools are introduced for each of the models. Emphasis is placed on the data parallel model as the primary focus for two-phase flow simulation. Tools to support data parallel programming for multiple component applications and special parallelization considerations are also discussed.
Parallel computing for homogeneous diffusion and transport equations in neutronics; Calcul parallele pour les equations de diffusion et de transport homogenes en neutronique

Energy Technology Data Exchange (ETDEWEB)

Pinchedez, K

1999-06-01

Parallel computing meets the ever-increasing requirements for neutronic computer code speed and accuracy. In this work, two different approaches have been considered. We first parallelized the sequential algorithm used by the neutronics code CRONOS developed at the French Atomic Energy Commission. The algorithm computes the dominant eigenvalue associated with PN simplified transport equations by a mixed finite element method. Several parallel algorithms have been developed on distributed memory machines. The performances of the parallel algorithms have been studied experimentally by implementation on a T3D Cray and theoretically by complexity models. A comparison of various parallel algorithms has confirmed the chosen implementations. We next applied a domain sub-division technique to the two-group diffusion Eigen problem. In the modal synthesis-based method, the global spectrum is determined from the partial spectra associated with sub-domains. Then the Eigen problem is expanded on a family composed, on the one hand, from eigenfunctions associated with the sub-domains and, on the other hand, from functions corresponding to the contribution from the interface between the sub-domains. For a 2-D homogeneous core, this modal method has been validated and its accuracy has been measured. (author)
Breast Conserving Treatment for Breast Cancer: Dosimetric Comparison of Sequential versus Simultaneous Integrated Photon Boost

Directory of Open Access Journals (Sweden)

Hilde Van Parijs

2014-01-01

Background. Breast conserving surgery followed by whole breast irradiation is widely accepted as standard of care for early breast cancer. Addition of a boost dose to the initial tumor area further reduces local recurrences. We investigated the dosimetric benefits of a simultaneously integrated boost (SIB) compared to a sequential boost to hypofractionate the boost volume, while maintaining normofractionation on the breast. Methods. For 10 patients 4 treatment plans were deployed, 1 with a sequential photon boost, and 3 with different SIB techniques: on a conventional linear accelerator, helical TomoTherapy, and static TomoDirect. Dosimetric comparison was performed. Results. PTV-coverage was good in all techniques. Conformity was better with all SIB techniques compared to sequential boost (P = 0.0001). There was less dose spilling to the ipsilateral breast outside the PTVboost (P = 0.04). The dose to the organs at risk (OAR) was not influenced by SIB compared to sequential boost. Helical TomoTherapy showed a higher mean dose to the contralateral breast, but less than 5 Gy for each patient. Conclusions. SIB showed less dose spilling within the breast and equal dose to OAR compared to sequential boost. Both helical TomoTherapy and the conventional technique delivered acceptable dosimetry. SIB seems a safe alternative and can be implemented in clinical routine.

Fast Evaluation of Segmentation Quality with Parallel Computing

Directory of Open Access Journals (Sweden)

Henry Cruz

2017-01-01

In digital image processing and computer vision, a fairly frequent task is the performance comparison of different algorithms on enormous image databases. This task is usually time-consuming and tedious, such that any kind of tool to simplify this work is welcome. To achieve an efficient and more practical handling of a normally tedious evaluation, we implemented the automatic detection system with the help of MATLAB®’s Parallel Computing Toolbox™. The key parts of the system have been parallelized to achieve simultaneous execution and analysis of segmentation algorithms on the one hand and the evaluation of detection accuracy for the nonforested regions, as a case study, on the other hand. As a positive side effect, CPU usage was reduced and processing time was significantly decreased by 68.54% compared to sequential processing (i.e., executing the system with each algorithm one by one).
A highly scalable massively parallel fast marching method for the Eikonal equation

Science.gov (United States)

Yang, Jianming; Stern, Frederick

2017-03-01

The fast marching method is a widely used numerical method for solving the Eikonal equation arising from a variety of scientific and engineering fields. It has long been deemed inherently sequential, and an efficient parallel algorithm applicable to large-scale practical applications is not available in the literature. In this study, we present a highly scalable massively parallel implementation of the fast marching method using a domain decomposition approach. Central to this algorithm is a novel restarted narrow band approach that coordinates the frequency of communications and the amount of computations extra to a sequential run for achieving an unprecedented parallel performance. Within each restart, the narrow band fast marching method is executed; simple synchronous local exchanges and global reductions are adopted for communicating updated data in the overlapping regions between neighboring subdomains and getting the latest front status, respectively. The independence of front characteristics is exploited through special data structures and augmented status tags to extract the masked parallelism within the fast marching method. The efficiency, flexibility, and applicability of the parallel algorithm are demonstrated through several examples. These problems are extensively tested on six grids with up to 1 billion points using different numbers of processes ranging from 1 to 65536. Remarkable parallel speedups are achieved using tens of thousands of processes. Detailed pseudo-codes for both the sequential and parallel algorithms are provided to illustrate the simplicity of the parallel implementation and its similarity to the sequential narrow band fast marching algorithm.
Parallel computing for homogeneous diffusion and transport equations in neutronics

International Nuclear Information System (INIS)

Pinchedez, K.

1999-06-01

Parallel computing meets the ever-increasing requirements for neutronic computer code speed and accuracy. In this work, two different approaches have been considered. We first parallelized the sequential algorithm used by the neutronics code CRONOS developed at the French Atomic Energy Commission. The algorithm computes the dominant eigenvalue associated with PN simplified transport equations by a mixed finite element method. Several parallel algorithms have been developed on distributed memory machines. The performances of the parallel algorithms have been studied experimentally by implementation on a T3D Cray and theoretically by complexity models. A comparison of various parallel algorithms has confirmed the chosen implementations. We next applied a domain sub-division technique to the two-group diffusion Eigen problem. In the modal synthesis-based method, the global spectrum is determined from the partial spectra associated with sub-domains. Then the Eigen problem is expanded on a family composed, on the one hand, from eigenfunctions associated with the sub-domains and, on the other hand, from functions corresponding to the contribution from the interface between the sub-domains. For a 2-D homogeneous core, this modal method has been validated and its accuracy has been measured. (author)
Comparison of ERBS orbit determination accuracy using batch least-squares and sequential methods

Science.gov (United States)

Oza, D. H.; Jones, T. L.; Fabien, S. M.; Mistretta, G. D.; Hart, R. C.; Doll, C. E.

1991-10-01

The Flight Dynamics Div. (FDD) at NASA-Goddard commissioned a study to develop the Real Time Orbit Determination/Enhanced (RTOD/E) system as a prototype system for sequential orbit determination of spacecraft on a DOS based personal computer (PC). An overview is presented of RTOD/E capabilities and the results are presented of a study to compare the orbit determination accuracy for a Tracking and Data Relay Satellite System (TDRSS) user spacecraft obtained using RTOD/E on a PC with the accuracy of an established batch least squares system, the Goddard Trajectory Determination System (GTDS), operating on a mainframe computer. RTOD/E was used to perform sequential orbit determination for the Earth Radiation Budget Satellite (ERBS), and the Goddard Trajectory Determination System (GTDS) was used to perform the batch least squares orbit determination. The estimated ERBS ephemerides were obtained for the Aug. 16 to 22, 1989, timeframe, during which intensive TDRSS tracking data for ERBS were available. Independent assessments were made to examine the consistencies of results obtained by the batch and sequential methods. Comparisons were made between the forward filtered RTOD/E orbit solutions and definitive GTDS orbit solutions for ERBS; the solution differences were less than 40 meters after the filter had reached steady state.
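The difference between batch and sequential estimation can be illustrated with a deliberately tiny example: estimating a single constant from noisy measurements. This is not the RTOD/E or GTDS algorithm, and real orbit determination involves dynamics, measurement models and covariance handling that are omitted here. The batch least-squares estimate of a constant uses all data at once (it is the mean), while the sequential estimator refines its value one measurement at a time and ends at the same answer for this simple case. The function names are illustrative.

```python
def batch_estimate(measurements):
    """Least-squares estimate of a constant from all data at once."""
    return sum(measurements) / len(measurements)


def sequential_estimate(measurements):
    """Update the estimate after each new measurement and return the
    history of estimates (one per measurement)."""
    estimate = 0.0
    history = []
    for k, z in enumerate(measurements, start=1):
        # Correct the current estimate by a fraction 1/k of the residual.
        estimate += (z - estimate) / k
        history.append(estimate)
    return history
```

The sequential form yields an estimate as soon as the first measurement arrives, which is the property that makes filters attractive for real-time use, whereas the batch form needs the whole data arc before producing a solution.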
Optimizing trial design in pharmacogenetics research: comparing a fixed parallel group, group sequential, and adaptive selection design on sample size requirements.

Science.gov (United States)

Boessen, Ruud; van der Baan, Frederieke; Groenwold, Rolf; Egberts, Antoine; Klungel, Olaf; Grobbee, Diederick; Knol, Mirjam; Roes, Kit

2013-01-01

Two-stage clinical trial designs may be efficient in pharmacogenetics research when there is some but inconclusive evidence of effect modification by a genomic marker. Two-stage designs allow stopping early for efficacy or futility and can offer the additional opportunity to enrich the study population to a specific patient subgroup after an interim analysis. This study compared sample size requirements for fixed parallel group, group sequential, and adaptive selection designs with equal overall power and control of the family-wise type I error rate. The designs were evaluated across scenarios that defined the effect sizes in the marker positive and marker negative subgroups and the prevalence of marker positive patients in the overall study population. Effect sizes were chosen to reflect realistic planning scenarios, where at least some effect is present in the marker negative subgroup. In addition, scenarios were considered in which the assumed 'true' subgroup effects (i.e., the postulated effects) differed from those hypothesized at the planning stage. As expected, both two-stage designs generally required fewer patients than a fixed parallel group design, and the advantage increased as the difference between subgroups increased. The adaptive selection design added little further reduction in sample size, as compared with the group sequential design, when the postulated effect sizes were equal to those hypothesized at the planning stage. However, when the postulated effects deviated strongly in favor of enrichment, the comparative advantage of the adaptive selection design increased, which precisely reflects the adaptive nature of the design. Copyright © 2013 John Wiley & Sons, Ltd.
Comparison of ablation centration after bilateral sequential versus simultaneous LASIK.

Science.gov (United States)

Lin, Jane-Ming; Tsai, Yi-Yu

2005-01-01

To compare ablation centration after bilateral sequential and simultaneous myopic LASIK. A retrospective randomized case series was performed of 670 eyes of 335 consecutive patients who had undergone either bilateral sequential (group 1) or simultaneous (group 2) myopic LASIK between July 2000 and July 2001 at the China Medical University Hospital, Taichung, Taiwan. The ablation centrations of the first and second eyes in the two groups were compared 3 months postoperatively. Of 670 eyes, 274 eyes (137 patients) comprised the sequential group and 396 eyes (198 patients) comprised the simultaneous group. Three months post-operatively, 220 eyes of 110 patients (80%) in the sequential group and 236 eyes of 118 patients (60%) in the simultaneous group provided topographic data for centration analysis. For the first eyes, mean decentration was 0.39 +/- 0.26 mm in the sequential group and 0.41 +/- 0.19 mm in the simultaneous group (P = .30). For the second eyes, mean decentration was 0.28 +/- 0.23 mm in the sequential group and 0.30 +/- 0.21 mm in the simultaneous group (P = .36). Decentration in the second eyes significantly improved in both groups (group 1, P = .02; group 2, P sequential group and 0.32 +/- 0.18 mm in the simultaneous group (P = .33). The difference of ablation center angles between the first and second eyes was 43.2 sequential group and 45.1 +/- 50.8 degrees in the simultaneous group (P = .42). Simultaneous bilateral LASIK is comparable to sequential surgery in ablation centration.
Parallelism and array processing

International Nuclear Information System (INIS)

Zacharov, V.

1983-01-01

Modern computing, as well as the historical development of computing, has been dominated by sequential monoprocessing. Yet there is the alternative of parallelism, where several processes may be in concurrent execution. This alternative is discussed in a series of lectures, in which the main developments involving parallelism are considered, both from the standpoint of computing systems and that of applications that can exploit such systems. The lectures seek to discuss parallelism in a historical context, and to identify all the main aspects of concurrency in computation right up to the present time. Included will be consideration of the important question as to what use parallelism might be in the field of data processing. (orig.)
Comparison of Sequential Regimen and Standard Therapy for Helicobacter pylori Eradication in Patients with Dyspepsia

Directory of Open Access Journals (Sweden)

Gh. Roshanaei

2013-10-01

Introduction & Objective: Some studies have reported successful eradication rates using sequential therapy, but more recent studies performed in Asia did not find a similar benefit. Because previous research comparing standard triple-drug therapy with the sequential regimen has been inconsistent, we decided to compare these treatments in Persian patients. Materials & Methods: This study is a randomized clinical trial performed in one hundred and forty patients suffering from dyspepsia with an indication for H. pylori eradication between November 2010 and March 2012. Patients were randomized into two equal groups. The patients in the first group (standard) were treated with omeprazole capsule 20 mg BID, amoxicillin capsule 1 g BID, and clarithromycin tablet 500 mg BID for 14 days, while the patients in the second group (sequential) were treated with omeprazole capsule 20 mg for 10 days, amoxicillin capsule 1 g BID for 5 days, then clarithromycin tablet 500 mg and tinidazole tablet 500 mg BID for the other 5 days. Four to six weeks after the treatment, we compared the eradication of H. pylori between the two groups using the C14 urea breath test. Results: H. pylori infection was successfully cured in 57/70 (81.43%) with the 10-day sequential therapy and in 60/70 (85.75%) with the standard fourteen-day triple therapy. Conclusion: We detected no significant differences between the 10-day sequential eradication therapy for H. pylori and the 14-day standard triple treatment among the patients. (Sci J Hamadan Univ Med Sci 2013; 20 (3):184-193)
A parallel model for SQL astronomical databases based on solid state storage. Application to the Gaia Archive PostgreSQL database

Science.gov (United States)

González-Núñez, J.; Gutiérrez-Sánchez, R.; Salgado, J.; Segovia, J. C.; Merín, B.; Aguado-Agelet, F.

2017-07-01

Query planning and optimisation algorithms in most popular relational databases were developed at a time when hard disk drives were the only storage technology available. The advent of devices with higher parallel random access capacity, such as solid state disks, opens up the way for intra-machine parallel computing over large datasets. We describe a two-phase parallel model for the implementation of heavy analytical processes in single-instance PostgreSQL astronomical databases. This model is particularised to address two frequent astronomical problems, density maps and crossmatch computation with Quad Tree Cube (Q3C) indexes. They are implemented as part of the relational database infrastructure for the Gaia Archive and their performance is assessed. An improvement by a factor of 28.40 in comparison to sequential execution is observed in the reference implementation for a histogram computation. Speedup ratios of 3.7 and 4.0 are attained for the reference positional crossmatches considered. We observe large performance enhancements over sequential execution for both CPU-intensive and disk-access-intensive computations, suggesting these methods might be useful with the growing data volumes in astronomy.
Comparison of two percutaneous tracheostomy techniques, guide wire dilating forceps and Ciaglia Blue Rhino: a sequential cohort study.

NARCIS (Netherlands)

Fikkers, B.G.; Staatsen, M; Lardenoije, S.G.; Hoogen, F.J.A. van den; Hoeven, J.G. van der

2004-01-01

INTRODUCTION: To evaluate and compare the peri-operative and postoperative complications of the two most frequently used percutaneous tracheostomy techniques, namely guide wire dilating forceps (GWDF) and Ciaglia Blue Rhino (CBR). METHODS: A sequential cohort study with comparison of short-term and
The numerical parallel computing of photon transport

International Nuclear Information System (INIS)

Huang Qingnan; Liang Xiaoguang; Zhang Lifa

1998-12-01

The parallel computing of photon transport is investigated; the parallel algorithm and the parallelization of programs on parallel computers with both shared memory and distributed memory are discussed. By analyzing the inherent laws of the mathematical and physical model of photon transport according to the structural features of parallel computers, using the 'divide and conquer' strategy, adjusting the algorithm structure of the program, dissolving the data dependencies, finding parallelizable components and creating large-grain parallel subtasks, the sequential computing of photon transport is efficiently transformed into parallel and vector computing. The program was run on various HP parallel computers such as the HY-1 (PVP), the Challenge (SMP) and the YH-3 (MPP), and very good parallel speedup was obtained.
Sequential combination of k-t principle component analysis (PCA) and partial parallel imaging: k-t PCA GROWL.

Science.gov (United States)

Qi, Haikun; Huang, Feng; Zhou, Hongmei; Chen, Huijun

2017-03-01

k-t principle component analysis (k-t PCA) is a distinguished method for high spatiotemporal resolution dynamic MRI. To further improve the accuracy of k-t PCA, a combination with partial parallel imaging (PPI), k-t PCA/SENSE, has been tested. However, k-t PCA/SENSE suffers from long reconstruction time and limited improvement. This study aims to improve the combination of k-t PCA and PPI on both reconstruction speed and accuracy. A sequential combination scheme called k-t PCA GROWL (GRAPPA operator for wider readout line) was proposed. The GRAPPA operator was performed before k-t PCA to extend each readout line into a wider band, which improved the condition of the encoding matrix in the following k-t PCA reconstruction. k-t PCA GROWL was tested and compared with k-t PCA and k-t PCA/SENSE on cardiac imaging. k-t PCA GROWL consistently resulted in better image quality compared with k-t PCA/SENSE at high acceleration factors for both retrospectively and prospectively undersampled cardiac imaging, with a much lower computation cost. The improvement in image quality became greater with the increase of acceleration factor. By sequentially combining the GRAPPA operator and k-t PCA, the proposed k-t PCA GROWL method outperformed k-t PCA/SENSE in both reconstruction speed and accuracy, suggesting that k-t PCA GROWL is a better combination scheme than k-t PCA/SENSE. Magn Reson Med 77:1058-1067, 2017. © 2016 International Society for Magnetic Resonance in Medicine.
Parallel phase model : a programming model for high-end parallel machines with manycores.

Energy Technology Data Exchange (ETDEWEB)

Wu, Junfeng (Syracuse University, Syracuse, NY); Wen, Zhaofang; Heroux, Michael Allen; Brightwell, Ronald Brian

2009-04-01

This paper presents a parallel programming model, Parallel Phase Model (PPM), for next-generation high-end parallel machines based on a distributed memory architecture consisting of a networked cluster of nodes with a large number of cores on each node. PPM has a unified high-level programming abstraction that facilitates the design and implementation of parallel algorithms to exploit both the parallelism of the many cores and the parallelism at the cluster level. The programming abstraction will be suitable for expressing both fine-grained and coarse-grained parallelism. It includes a few high-level parallel programming language constructs that can be added as an extension to an existing (sequential or parallel) programming language such as C; and the implementation of PPM also includes a light-weight runtime library that runs on top of an existing network communication software layer (e.g. MPI). Design philosophy of PPM and details of the programming abstraction are also presented. Several unstructured applications that inherently require high-volume random fine-grained data accesses have been implemented in PPM with very promising results.
An Automatic Instruction-Level Parallelization of Machine Code

Directory of Open Access Journals (Sweden)

MARINKOVIC, V.

2018-02-01

Prevailing multicores and novel manycores have made the parallelization of embedded software, which is still written as sequential code, a great challenge of the modern day. In this paper, automatic code parallelization is considered, focusing on developing a parallelization tool at the binary level as well as on the validation of this approach. A novel instruction-level parallelization algorithm for assembly code is developed; it uses the register names after SSA to find independent blocks of code and then schedules the independent blocks using METIS to achieve good load balance. The sequential consistency is verified and the validation is done by measuring the program execution time on the target architecture. Great speedup, taken as the performance measure in the validation process, and optimal load balancing are achieved for multicore RISC processors with 2 to 16 cores (e.g. MIPS, MicroBlaze, etc.). In particular, for 16 cores, the average speedup is 7.92x, while in some cases it reaches 14x. The approach to automatic parallelization provided by this paper is useful to researchers and developers in the area of parallelization as the basis for further optimizations, as the back-end of a compiler, or as the code parallelization tool for an embedded system.
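
The speedup figures quoted in this abstract can be put in perspective with the usual definition of parallel efficiency (speedup divided by core count). The sketch below is only a worked calculation on the two numbers reported for 16 cores; the function name `parallel_efficiency` is a hypothetical helper, not something taken from the paper.

```python
def parallel_efficiency(speedup: float, cores: int) -> float:
    """Return parallel efficiency, defined as speedup / number of cores."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return speedup / cores


# Figures reported in the abstract for the 16-core configuration.
average = parallel_efficiency(7.92, 16)   # average speedup of 7.92x
best = parallel_efficiency(14.0, 16)      # best reported case, 14x

print(f"average efficiency: {average:.3f}, best case: {best:.3f}")
```

Under this definition, the average case uses roughly half of the ideal 16x capacity, while the best case comes close to it.
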
Parallel SN transport calculations on a transputer network

International Nuclear Information System (INIS)

Kim, Yong Hee; Cho, Nam Zin

1994-01-01

A parallel computing algorithm for neutron transport problems has been implemented on a transputer network, and two reactor benchmark problems (a fixed-source problem and an eigenvalue problem) are solved. We have shown that the parallel calculations provided a significant reduction in execution time over the sequential calculations.
Compiling Scientific Programs for Scalable Parallel Systems

National Research Council Canada - National Science Library

Kennedy, Ken

2001-01-01

...). The research performed in this project included new techniques for recognizing implicit parallelism in sequential programs, a powerful and precise set-based framework for analysis and transformation...
Streaming for Functional Data-Parallel Languages

DEFF Research Database (Denmark)

Madsen, Frederik Meisner

In this thesis, we investigate streaming as a general solution to the space inefficiency commonly found in functional data-parallel programming languages. The data-parallel paradigm maps well to parallel SIMD-style hardware. However, the traditional fully materializing execution strategy...... by extending two existing data-parallel languages: NESL and Accelerate. In the extensions we map bulk operations to data-parallel streams that can evaluate fully sequential, fully parallel or anything in between. By a dataflow, piecewise parallel execution strategy, the runtime system can adjust to any target...... flattening necessitates all sub-computations to materialize at the same time. For example, naive n by n matrix multiplication requires n^3 space in NESL because the algorithm contains n^3 independent scalar multiplications. For large values of n, this is completely unacceptable. We address the problem...
Shared Variable Oriented Parallel Precompiler for SPMD Model

Institute of Scientific and Technical Information of China (English)

None

1995-01-01

At present, commercial parallel computer systems with distributed memory architecture are usually provided with parallel FORTRAN or parallel C compilers, which are just traditional sequential FORTRAN or C compilers extended with communication statements. Programmers suffer from having to write parallel programs with communication statements. The Shared Variable Oriented Parallel Precompiler (SVOPP) proposed in this paper can automatically generate appropriate communication statements based on shared variables for the SPMD (Single Program Multiple Data) computation model and greatly eases parallel programming while retaining high communication efficiency. The core function of the parallel C precompiler has been successfully verified on a transputer-based parallel computer. Its prominent performance shows that SVOPP is probably a breakthrough in parallel programming technique.
A multithreaded parallel implementation of a dynamic programming algorithm for sequence comparison.

Science.gov (United States)

Martins, W S; Del Cuvillo, J B; Useche, F J; Theobald, K B; Gao, G R

2001-01-01

This paper discusses the issues involved in implementing a dynamic programming algorithm for biological sequence comparison on a general-purpose parallel computing platform based on a fine-grain event-driven multithreaded program execution model. Fine-grain multithreading permits efficient parallelism exploitation in this application both by taking advantage of asynchronous point-to-point synchronizations and communication with low overheads and by effectively tolerating latency through the overlapping of computation and communication. We have implemented our scheme on EARTH, a fine-grain event-driven multithreaded execution and architecture model which has been ported to a number of parallel machines with off-the-shelf processors. Our experimental results show that the dynamic programming algorithm can be efficiently implemented on EARTH systems with high performance (e.g., speedup of 90 on 120 nodes), good programmability and reasonable cost.
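
The reason dynamic programming for sequence comparison lends itself to fine-grain parallelism can be illustrated with a wavefront (anti-diagonal) sweep. The sketch below is an assumption-laden stand-in: it uses unit-cost edit distance rather than whatever scoring scheme the paper uses, it runs sequentially, and it does not show the EARTH programming interface. The function name `edit_distance_wavefront` is hypothetical.

```python
def edit_distance_wavefront(a: str, b: str) -> int:
    """Unit-cost edit distance, filled in anti-diagonal (wavefront) order."""
    m, n = len(a), len(b)
    # dp[i][j] holds the edit distance between a[:i] and b[:j].
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i
    for j in range(n + 1):
        dp[0][j] = j

    # Cells with the same d = i + j depend only on diagonals d-1 and d-2,
    # so every cell on one anti-diagonal could be computed concurrently.
    for d in range(2, m + n + 1):
        lo = max(1, d - n)
        hi = min(m, d - 1)
        for i in range(lo, hi + 1):  # independent iterations
            j = d - i
            cost = 0 if a[i - 1] == b[j - 1] else 1
            dp[i][j] = min(
                dp[i - 1][j] + 1,         # deletion
                dp[i][j - 1] + 1,         # insertion
                dp[i - 1][j - 1] + cost,  # match or substitution
            )
    return dp[m][n]


print(edit_distance_wavefront("kitten", "sitting"))
```

In a multithreaded setting, the inner loop is where work would be split across threads, with point-to-point synchronization between neighbouring blocks rather than a global barrier per diagonal.
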

Automatic Loop Parallelization via Compiler Guided Refactoring

DEFF Research Database (Denmark)

Larsen, Per; Ladelsky, Razya; Lidman, Jacob

For many parallel applications, performance relies not on instruction-level parallelism, but on loop-level parallelism. Unfortunately, many modern applications are written in ways that obstruct automatic loop parallelization. Since we cannot identify sufficient parallelization opportunities...... for these codes in a static, off-line compiler, we developed an interactive compilation feedback system that guides the programmer in iteratively modifying application source, thereby improving the compiler’s ability to generate loop-parallel code. We use this compilation system to modify two sequential...... benchmarks, finding that the code parallelized in this way runs up to 8.3 times faster on an octo-core Intel Xeon 5570 system and up to 12.5 times faster on a quad-core IBM POWER6 system. Benchmark performance varies significantly between the systems. This suggests that semi-automatic parallelization should...
New parallel SOR method by domain partitioning

Energy Technology Data Exchange (ETDEWEB)

Xie, Dexuan [Courant Inst. of Mathematical Sciences New York Univ., NY (United States)]

1996-12-31

In this paper, we propose and analyze a new parallel SOR method, the PSOR method, formulated by using domain partitioning together with an interprocessor data-communication technique. For the 5-point approximation to the Poisson equation on a square, we show that the ordering of the PSOR based on the strip partition leads to a consistently ordered matrix, and hence the PSOR and the SOR using the row-wise ordering have the same convergence rate. However, in general, the ordering used in PSOR may not be "consistently ordered". So, there is a need to analyze the convergence of PSOR directly. In this paper, we present a PSOR theory, and show that the PSOR method can have the same asymptotic rate of convergence as the corresponding sequential SOR method for a wide class of linear systems in which the matrix is "consistently ordered". Finally, we demonstrate the parallel performance of the PSOR method on four different message passing multiprocessors (a KSR1, the Intel Delta, an Intel Paragon and an IBM SP2), along with a comparison with the point Red-Black and four-color SOR methods.
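
For orientation, the point Red-Black SOR method that the abstract uses as a comparison baseline can be sketched as follows. This is not the PSOR method itself, whose partitioning and communication scheme is not described here. The sketch assumes the model problem -Δu = f on the unit square with zero boundary values, and the relaxation factor is the standard model-problem choice 2 / (1 + sin(πh)); both are assumptions, as are the names `red_black_sor` and `max_residual`.

```python
import math


def red_black_sor(f, n, omega, sweeps):
    """Red-black SOR for the 5-point Poisson stencil, zero boundary values.

    f and the returned u are (n+2) x (n+2) grids; h = 1 / (n + 1).
    """
    h2 = (1.0 / (n + 1)) ** 2
    u = [[0.0] * (n + 2) for _ in range(n + 2)]
    for _ in range(sweeps):
        for color in (0, 1):  # all points of one color are independent
            for i in range(1, n + 1):
                for j in range(1, n + 1):
                    if (i + j) % 2 != color:
                        continue
                    gauss_seidel = 0.25 * (
                        u[i - 1][j] + u[i + 1][j]
                        + u[i][j - 1] + u[i][j + 1]
                        + h2 * f[i][j]
                    )
                    u[i][j] += omega * (gauss_seidel - u[i][j])
    return u


def max_residual(u, f, n):
    """Largest absolute residual of the discrete equations."""
    h2 = (1.0 / (n + 1)) ** 2
    return max(
        abs(4.0 * u[i][j] - u[i - 1][j] - u[i + 1][j]
            - u[i][j - 1] - u[i][j + 1] - h2 * f[i][j])
        for i in range(1, n + 1)
        for j in range(1, n + 1)
    )


n = 8
f = [[1.0] * (n + 2) for _ in range(n + 2)]
omega = 2.0 / (1.0 + math.sin(math.pi / (n + 1)))
u = red_black_sor(f, n, omega, sweeps=200)
print(max_residual(u, f, n))
```

Each color sweep touches only points whose four neighbours have the other color, which is why a red-black ordering parallelizes naturally.
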
Parallel Framework for Cooperative Processes

Directory of Open Access Journals (Sweden)

Mitică Craus

2005-01-01

This paper describes an object-oriented framework designed to be used in the parallelization of a set of related algorithms. The idea behind the system we are describing is to have a reusable framework for running several sequential algorithms in a parallel environment. The algorithms that the framework can be used with have several things in common: they have to run in cycles, and it should be possible to split the work between several "processing units". The parallel framework uses the message-passing communication paradigm and is organized as a master-slave system. Two applications are presented: an Ant Colony Optimization (ACO) parallel algorithm for the Travelling Salesman Problem (TSP) and an Image Processing (IP) parallel algorithm for the Symmetrical Neighborhood Filter (SNF). The implementations of these applications by means of the parallel framework prove to have good performance: approximately linear speedup and low communication cost.
Sequential data access with Oracle and Hadoop: a performance comparison

International Nuclear Information System (INIS)

Baranowski, Zbigniew; Canali, Luca; Grancher, Eric

2014-01-01

The Hadoop framework has proven to be an effective and popular approach for dealing with 'Big Data' and, thanks to its scaling ability and optimised storage access, Hadoop Distributed File System-based projects such as MapReduce or HBase are seen as candidates to replace traditional relational database management systems whenever scalable speed of data processing is a priority. But do these projects deliver in practice? Does migrating to Hadoop's 'shared nothing' architecture really improve data access throughput? And, if so, at what cost? The authors answer these questions, addressing cost/performance as well as raw performance, based on a performance comparison between an Oracle-based relational database and Hadoop's distributed solutions like MapReduce or HBase for sequential data access. A key feature of our approach is the use of an unbiased data model, as certain data models can significantly favour one of the technologies tested.
An Alternative Algorithm for Computing Watersheds on Shared Memory Parallel Computers

NARCIS (Netherlands)

Meijster, A.; Roerdink, J.B.T.M.

1995-01-01

In this paper a parallel implementation of a watershed algorithm is proposed. The algorithm can easily be implemented on shared memory parallel computers. The watershed transform is generally considered to be inherently sequential since the discrete watershed of an image is defined using recursion.
Parallel-In-Time For Moving Meshes

Energy Technology Data Exchange (ETDEWEB)

Falgout, R. D. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Manteuffel, T. A. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Southworth, B. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Schroder, J. B. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)

2016-02-04

With steadily growing computational resources available, scientists must develop effective ways to utilize the increased resources. High performance, highly parallel software has become a standard. However, until recent years parallelism has focused primarily on the spatial domain. When solving a space-time partial differential equation (PDE), this leads to a sequential bottleneck in the temporal dimension, particularly when taking a large number of time steps. The XBraid parallel-in-time library was developed as a practical way to add temporal parallelism to existing sequential codes with only minor modifications. In this work, a rezoning-type moving mesh is applied to a diffusion problem and formulated in a parallel-in-time framework. Tests and scaling studies are run using XBraid and demonstrate excellent results for the simple model problem considered herein.
Learning and Parallelization Boost Constraint Search

Science.gov (United States)

Yun, Xi

2013-01-01

Constraint satisfaction problems are a powerful way to abstract and represent academic and real-world problems from both artificial intelligence and operations research. A constraint satisfaction problem is typically addressed by a sequential constraint solver running on a single processor. Rather than construct a new, parallel solver, this work…
Parallelizing More Loops with Compiler Guided Refactoring

DEFF Research Database (Denmark)

Larsen, Per; Ladelsky, Razya; Lidman, Jacob

2012-01-01

an interactive compilation feedback system that guides programmers in iteratively modifying their application source code. This helps leverage the compiler’s ability to generate loop-parallel code. We employ our system to modify two sequential benchmarks dealing with image processing and edge detection...
Parallel community climate model: Description and user's guide

Energy Technology Data Exchange (ETDEWEB)

Drake, J.B.; Flanery, R.E.; Semeraro, B.D.; Worley, P.H. [and others]

1996-07-15

This report gives an overview of a parallel version of the NCAR Community Climate Model, CCM2, implemented for MIMD massively parallel computers using a message-passing programming paradigm. The parallel implementation was developed on an Intel iPSC/860 with 128 processors and on the Intel Delta with 512 processors, and the initial target platform for the production version of the code is the Intel Paragon with 2048 processors. Because the implementation uses standard, portable message-passing libraries, the code has been easily ported to other multiprocessors supporting a message-passing programming paradigm. The parallelization strategy used is to decompose the problem domain into geographical patches and assign each processor the computation associated with a distinct subset of the patches. With this decomposition, the physics calculations involve only grid points and data local to a processor and are performed in parallel. Using parallel algorithms developed for the semi-Lagrangian transport, the fast Fourier transform and the Legendre transform, both physics and dynamics are computed in parallel with minimal data movement and modest change to the original CCM2 source code. Sequential or parallel history tapes are written and input files (in history tape format) are read sequentially by the parallel code to promote compatibility with production use of the model on other computer systems. A validation exercise has been performed with the parallel code and is detailed along with some performance numbers on the Intel Paragon and the IBM SP2. A discussion of reproducibility of results is included. A user's guide for the PCCM2 version 2.1 on the various parallel machines completes the report. Procedures for compilation, setup and execution are given. A discussion of code internals is included for those who may wish to modify and use the program in their own research.
Spatial updating grand canonical Monte Carlo algorithms for fluid simulation: generalization to continuous potentials and parallel implementation.

Science.gov (United States)

O'Keeffe, C J; Ren, Ruichao; Orkoulas, G

2007-11-21

Spatial updating grand canonical Monte Carlo algorithms are generalizations of random and sequential updating algorithms for lattice systems to continuum fluid models. The elementary steps, insertions or removals, are constructed by generating points in space either at random (random updating) or in a prescribed order (sequential updating). These algorithms have previously been developed only for systems of impenetrable spheres for which no particle overlap occurs. In this work, spatial updating grand canonical algorithms are generalized to continuous, soft-core potentials to account for overlapping configurations. Results on two- and three-dimensional Lennard-Jones fluids indicate that spatial updating grand canonical algorithms, both random and sequential, converge faster than standard grand canonical algorithms. Spatial algorithms based on sequential updating not only exhibit the fastest convergence but also are ideal for parallel implementation due to the absence of strict detailed balance and the nature of the updating that minimizes interprocessor communication. Parallel simulation results for three-dimensional Lennard-Jones fluids show a substantial reduction of simulation time for systems of moderate and large size. The efficiency improvement by parallel processing through domain decomposition is always in addition to the efficiency improvement by sequential updating.
Fast Parallel Computation of Polynomials Using Few Processors

DEFF Research Database (Denmark)

Valiant, Leslie G.; Skyum, Sven; Berkowitz, S.

1983-01-01

It is shown that any multivariate polynomial of degree $d$ that can be computed sequentially in $C$ steps can be computed in parallel in $O((\log d)(\log C + \log d))$ steps using only $(Cd)^{O(1)}$ processors....
Parallel computation for biological sequence comparison: comparing a portable model to the native model for the Intel Hypercube.

Science.gov (United States)

Nadkarni, P M; Miller, P L

1991-01-01

A parallel program for inter-database sequence comparison was developed on the Intel Hypercube using two models of parallel programming. One version was built using machine-specific Hypercube parallel programming commands. The other version was built using Linda, a machine-independent parallel programming language. The two versions of the program provide a case study comparing these two approaches to parallelization in an important biological application area. Benchmark tests with both programs gave comparable results with a small number of processors. As the number of processors was increased, the Linda version was somewhat less efficient. The Linda version was also run without change on Network Linda, a virtual parallel machine running on a network of desktop workstations.
Fast parallel computation of polynomials using few processors

DEFF Research Database (Denmark)

Valiant, Leslie; Skyum, Sven

1981-01-01

It is shown that any multivariate polynomial that can be computed sequentially in $C$ steps and has degree $d$ can be computed in parallel in $O((\log d)(\log C + \log d))$ steps using only $(Cd)^{O(1)}$ processors....
Stiffness analysis and comparison of a Biglide parallel grinder with alternative spatial modular parallelograms

DEFF Research Database (Denmark)

Wu, Guanglei; Zou, Ping

2017-01-01

This paper deals with the stiffness modeling, analysis and comparison of a Biglide parallel grinder with two alternative modular parallelograms. It turns out that the Cartesian stiffness matrix of the manipulator has the property that it can be decoupled into two homogeneous matrices, correspondi...
Java-Based Coupling for Parallel Predictive-Adaptive Domain Decomposition

Directory of Open Access Journals (Sweden)

Cécile Germain‐Renaud

1999-01-01

Adaptive domain decomposition exemplifies the problem of integrating heterogeneous software components with intermediate coupling granularity. This paper describes an experiment where a data-parallel (HPF) client interfaces with a sequential computation server through Java. We show that seamless integration of data-parallelism is possible, but requires most of the tools from the Java palette: Java Native Interface (JNI), Remote Method Invocation (RMI), callbacks and threads.
Application of Pfortran and Co-Array Fortran in the Parallelization of the GROMOS96 Molecular Dynamics Module

Directory of Open Access Journals (Sweden)

Piotr Bała

2001-01-01

After at least a decade of parallel tool development, parallelization of scientific applications remains a significant undertaking. Typically parallelization is a specialized activity supported only partially by the programming tool set, with the programmer involved with parallel issues in addition to sequential ones. The details of concern range from algorithm design down to low-level data movement details. The aim of parallel programming tools is to automate the latter without sacrificing performance and portability, allowing the programmer to focus on algorithm specification and development. We present our use of two similar parallelization tools, Pfortran and Cray's Co-Array Fortran, in the parallelization of the GROMOS96 molecular dynamics module. Our parallelization started from the GROMOS96 distribution's shared-memory implementation of the replicated algorithm, but used little of that existing parallel structure. Consequently, our parallelization was close to starting with the sequential version. We found the intuitive extensions to Pfortran and Co-Array Fortran helpful in the rapid parallelization of the project. We present performance figures for both the Pfortran and Co-Array Fortran parallelizations showing linear speedup within the range expected by these parallelization methods.
Comparison of parallel viscosity with neoclassical theory

International Nuclear Information System (INIS)

Ida, K.; Nakajima, N.

1996-04-01

Toroidal rotation profiles are measured with charge exchange spectroscopy for plasma heated with tangential NBI in the CHS heliotron/torsatron device to estimate the parallel viscosity. The parallel viscosity derived from the toroidal rotation velocity shows good agreement with the neoclassical parallel viscosity plus the perpendicular viscosity (μ⊥ = 2 m²/s). (author)
Process Creation and Full Sequential Composition in a Name-Passing Calculus

NARCIS (Netherlands)

Gehrke, Thomas; Rensink, Arend

This paper presents a first attempt to formulate a process calculus featuring process creation and sequential composition, instead of the more usual parallel composition and action prefixing, in a setting where mobility is achieved by communicating channel names. We discuss the questions of scope
Parallel programming practical aspects, models and current limitations

CERN Document Server

Tarkov, Mikhail S

2014-01-01

Parallel programming is designed for the use of parallel computer systems for solving time-consuming problems that cannot be solved on a sequential computer in a reasonable time. These problems can be divided into two classes: 1. Processing large data arrays (including processing images and signals in real time); 2. Simulation of complex physical processes and chemical reactions. For each of these classes, prospective methods are designed for solving problems. For data processing, one of the most promising technologies is the use of artificial neural networks. The particle-in-cell method and cellular automata are very useful for simulation. Problems of scalability of parallel algorithms and the transfer of existing parallel programs to future parallel computers are very acute now. An important task is to optimize the use of the equipment (including the CPU cache) of parallel computers. Along with parallelizing information processing, it is essential to ensure the processing reliability by the relevant organization ...
Parallelizing the spectral transform method: A comparison of alternative parallel algorithms

International Nuclear Information System (INIS)

Foster, I.; Worley, P.H.

1993-01-01

The spectral transform method is a standard numerical technique for solving partial differential equations on the sphere and is widely used in global climate modeling. In this paper, we outline different approaches to parallelizing the method and describe experiments that we are conducting to evaluate the efficiency of these approaches on parallel computers. The experiments are conducted using a testbed code that solves the nonlinear shallow water equations on a sphere, but are designed to permit evaluation in the context of a global model. They allow us to evaluate the relative merits of the approaches as a function of problem size and number of processors. The results of this study are guiding ongoing work on PCCM2, a parallel implementation of the Community Climate Model developed at the National Center for Atmospheric Research

Computation of watersheds based on parallel graph algorithms

NARCIS (Netherlands)

Meijster, A.; Roerdink, J.B.T.M.; Maragos, P; Schafer, RW; Butt, MA

1996-01-01

In this paper the implementation of a parallel watershed algorithm is described. The algorithm has been implemented on a Cray J932, which is a shared memory architecture with 32 processors. The watershed transform has generally been considered to be inherently sequential, but recently a few research
Simple and flexible SAS and SPSS programs for analyzing lag-sequential categorical data.

Science.gov (United States)

O'Connor, B P

1999-11-01

This paper describes simple and flexible programs for analyzing lag-sequential categorical data, using SAS and SPSS. The programs read a stream of codes and produce a variety of lag-sequential statistics, including transitional frequencies, expected transitional frequencies, transitional probabilities, adjusted residuals, z values, Yule's Q values, likelihood ratio tests of stationarity across time and homogeneity across groups or segments, transformed kappas for unidirectional dependence, bidirectional dependence, parallel and nonparallel dominance, and significance levels based on both parametric and randomization tests.
The BLAZE language - A parallel language for scientific programming

Science.gov (United States)

Mehrotra, Piyush; Van Rosendale, John

1987-01-01

A Pascal-like scientific programming language, BLAZE, is described. BLAZE contains array arithmetic, forall loops, and APL-style accumulation operators, which allow natural expression of fine grained parallelism. It also employs an applicative or functional procedure invocation mechanism, which makes it easy for compilers to extract coarse grained parallelism using machine specific program restructuring. Thus BLAZE should allow one to achieve highly parallel execution on multiprocessor architectures, while still providing the user with conceptually sequential control flow. A central goal in the design of BLAZE is portability across a broad range of parallel architectures. The multiple levels of parallelism present in BLAZE code, in principle, allow a compiler to extract the types of parallelism appropriate for the given architecture while neglecting the remainder. The features of BLAZE are described and it is shown how this language would be used in typical scientific programming.
The BLAZE language: A parallel language for scientific programming

Science.gov (United States)

Mehrotra, P.; Vanrosendale, J.

1985-01-01

A Pascal-like scientific programming language, Blaze, is described. Blaze contains array arithmetic, forall loops, and APL-style accumulation operators, which allow natural expression of fine grained parallelism. It also employs an applicative or functional procedure invocation mechanism, which makes it easy for compilers to extract coarse grained parallelism using machine specific program restructuring. Thus Blaze should allow one to achieve highly parallel execution on multiprocessor architectures, while still providing the user with conceptually sequential control flow. A central goal in the design of Blaze is portability across a broad range of parallel architectures. The multiple levels of parallelism present in Blaze code, in principle, allow a compiler to extract the types of parallelism appropriate for the given architecture while neglecting the remainder. The features of Blaze are described and it is shown how this language would be used in typical scientific programming.
Reliability-Based Optimization of Series Systems of Parallel Systems

DEFF Research Database (Denmark)

Enevoldsen, I.; Sørensen, John Dalsgaard

Reliability-based design of structural systems is considered. Especially systems where the reliability model is a series system of parallel systems are analysed. A sensitivity analysis for this class of problems is presented. Direct and sequential optimization procedures to solve the optimization...
Data parallel sorting for particle simulation

Science.gov (United States)

Dagum, Leonardo

1992-01-01

Sorting on a parallel architecture is a communications intensive event which can incur a high penalty in applications where it is required. In the case of particle simulation, only integer sorting is necessary, and sequential implementations easily attain the minimum performance bound of O(N) for N particles. Parallel implementations, however, have to cope with the parallel sorting problem which, in addition to incurring a heavy communications cost, can make the minimum performance bound difficult to attain. This paper demonstrates how the sorting problem in a particle simulation can be reduced to a merging problem, and describes an efficient data parallel algorithm to solve this merging problem in a particle simulation. The new algorithm is shown to be optimal under conditions usual for particle simulation, and its fieldwise implementation on the Connection Machine is analyzed in detail. The new algorithm is about four times faster than a fieldwise implementation of radix sort on the Connection Machine.
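The sequential O(N) bound mentioned above is typically reached with a counting sort over integer keys. The Python sketch below shows that sequential baseline only, not the paper's data parallel merging algorithm; the function name and the idea of sorting particles by an integer cell index are assumptions made for illustration.

```python
def sort_particles_by_cell(cell_ids, num_cells):
    """Return particle indices ordered by cell id, using a stable counting sort."""
    counts = [0] * num_cells
    for c in cell_ids:
        counts[c] += 1
    # Exclusive prefix sum: first output slot for each cell.
    starts = [0] * num_cells
    for c in range(1, num_cells):
        starts[c] = starts[c - 1] + counts[c - 1]
    order = [0] * len(cell_ids)
    for particle, c in enumerate(cell_ids):
        order[starts[c]] = particle   # place particle in the next free slot of its cell
        starts[c] += 1
    return order

cell_ids = [2, 0, 1, 2, 0]            # hypothetical cell index of each particle
order = sort_particles_by_cell(cell_ids, num_cells=3)
print(order)                          # → [1, 4, 2, 0, 3]
```

The work is proportional to the number of particles plus the number of cells, which is linear in N when the cell count does not exceed the particle count.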
Benefits of Parallel I/O in Ab Initio Nuclear Physics Calculations

International Nuclear Information System (INIS)

Laghave, Nikhil; Sosonkina, Masha; Maris, Pieter; Vary, James P.

2009-01-01

Many modern scientific applications rely on highly parallel calculations, which scale to tens of thousands of processors. However, most applications do not concentrate on parallelizing input/output operations. In particular, sequential I/O has been identified as a bottleneck for the highly scalable MFDn (Many Fermion Dynamics for nuclear structure) code performing ab initio nuclear structure calculations. In this paper, we develop interfaces and parallel I/O procedures to use a well-known parallel I/O library in MFDn. As a result, we gain efficient input/output of large datasets along with their portability and ease of use in the downstream processing.
Comparison of human embryomorphokinetic parameters in sequential or global culture media.

Science.gov (United States)

Kazdar, Nadia; Brugnon, Florence; Bouche, Cyril; Jouve, Guilhem; Veau, Ségolène; Drapier, Hortense; Rousseau, Chloé; Pimentel, Céline; Viard, Patricia; Belaud-Rotureau, Marc-Antoine; Ravel, Célia

2017-08-01

A prospective study on randomized patients was conducted to determine how morphokinetic parameters are altered in embryos grown in sequential versus global culture media. Eleven morphokinetic parameters of 160 single embryos transferred were analyzed by time lapse imaging involving two University-affiliated in vitro fertilization (IVF) centers. We found that the fading of the two pronuclei occurred earlier in global (22.56±2.15 hpi) versus sequential media (23.63±2.71 hpi; p=0.0297). Likewise, the first cleavage started earlier at 24.52±2.33 hpi vs 25.76±2.95 hpi (p=0.0158). Also, the first cytokinesis was shorter in global medium, lasting 18±10.2 minutes in global versus 36±37.8 minutes in sequential culture medium (p culture medium. Our study highlights the need to adapt morphokinetic analysis according to the type of media used to best support human early embryo development.
A Parallel Saturation Algorithm on Shared Memory Architectures

Science.gov (United States)

Ezekiel, Jonathan; Siminiceanu

2007-01-01

Symbolic state-space generators are notoriously hard to parallelize. However, the Saturation algorithm implemented in the SMART verification tool differs from other sequential symbolic state-space generators in that it exploits the locality of firing events in asynchronous system models. This paper explores whether event locality can be utilized to efficiently parallelize Saturation on shared-memory architectures. Conceptually, we propose to parallelize the firing of events within a decision diagram node, which is technically realized via a thread pool. We discuss the challenges involved in our parallel design and conduct experimental studies on its prototypical implementation. On a dual-processor dual core PC, our studies show speed-ups for several example models, e.g., of up to 50% for a Kanban model, when compared to running our algorithm only on a single core.
A soft sensor for bioprocess control based on sequential filtering of metabolic heat signals.

Science.gov (United States)

Paulsson, Dan; Gustavsson, Robert; Mandenius, Carl-Fredrik

2014-09-26

Soft sensors are the combination of robust on-line sensor signals with mathematical models for deriving additional process information. Here, we apply this principle to a microbial recombinant protein production process in a bioreactor by exploiting bio-calorimetric methodology. Temperature sensor signals from the cooling system of the bioreactor were used for estimating the metabolic heat of the microbial culture and from that the specific growth rate and active biomass concentration were derived. By applying sequential digital signal filtering, the soft sensor was made more robust for industrial practice with cultures generating low metabolic heat in environments with high noise level. The estimated specific growth rate signal obtained from the three stage sequential filter allowed controlled feeding of substrate during the fed-batch phase of the production process. The biomass and growth rate estimates from the soft sensor were also compared with an alternative sensor probe and a capacitance on-line sensor, for the same variables. The comparison showed similar or better sensitivity and lower variability for the metabolic heat soft sensor suggesting that using permanent temperature sensors of a bioreactor is a realistic and inexpensive alternative for monitoring and control. However, both alternatives are easy to implement in a soft sensor, alone or in parallel.
A Soft Sensor for Bioprocess Control Based on Sequential Filtering of Metabolic Heat Signals

Directory of Open Access Journals (Sweden)

Dan Paulsson

2014-09-01

Full Text Available Soft sensors are the combination of robust on-line sensor signals with mathematical models for deriving additional process information. Here, we apply this principle to a microbial recombinant protein production process in a bioreactor by exploiting bio-calorimetric methodology. Temperature sensor signals from the cooling system of the bioreactor were used for estimating the metabolic heat of the microbial culture and from that the specific growth rate and active biomass concentration were derived. By applying sequential digital signal filtering, the soft sensor was made more robust for industrial practice with cultures generating low metabolic heat in environments with high noise level. The estimated specific growth rate signal obtained from the three stage sequential filter allowed controlled feeding of substrate during the fed-batch phase of the production process. The biomass and growth rate estimates from the soft sensor were also compared with an alternative sensor probe and a capacitance on-line sensor, for the same variables. The comparison showed similar or better sensitivity and lower variability for the metabolic heat soft sensor suggesting that using permanent temperature sensors of a bioreactor is a realistic and inexpensive alternative for monitoring and control. However, both alternatives are easy to implement in a soft sensor, alone or in parallel.
Simulation Study of Real Time 3-D Synthetic Aperture Sequential Beamforming for Ultrasound Imaging

DEFF Research Database (Denmark)

Hemmsen, Martin Christian; Rasmussen, Morten Fischer; Stuart, Matthias Bo

2014-01-01

in the main system. The real-time imaging capability is achieved using a synthetic aperture beamforming technique, utilizing the transmit events to generate a set of virtual elements that in combination can generate an image. The two core capabilities in combination is named Synthetic Aperture Sequential......This paper presents a new beamforming method for real-time three-dimensional (3-D) ultrasound imaging using a 2-D matrix transducer. To obtain images with sufficient resolution and contrast, several thousand elements are needed. The proposed method reduces the required channel count from...... Beamforming (SASB). Simulations are performed to evaluate the image quality of the presented method in comparison to Parallel beamforming utilizing 16 receive beamformers. As indicators for image quality the detail resolution and Cystic resolution are determined for a set of scatterers at a depth of 90mm...
Untitled

Indian Academy of Sciences (India)

$G_s = \dfrac{\text{run time of sequential algorithm on 1 CPU}}{\text{run time of parallel algorithm on } N \text{ CPUs}}$. This measure only shows the efficiency of the parallelization in terms of the algorithm itself but not a comparison with the best sequential algorithm. Thus a more realistic measure is defined as the speedup of the parallel algorithm against ...
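The two measures contrasted above, the ratio of one-CPU run time to N-CPU run time and the speedup against the best sequential algorithm, can be sketched in Python as follows. The function names and the timings are hypothetical and serve only to show how the two ratios differ.

```python
def speedup(t_reference, t_parallel):
    """Ratio of a reference sequential run time to the parallel run time."""
    return t_reference / t_parallel

def efficiency(t_reference, t_parallel, n_cpus):
    """Speedup per CPU; 1.0 would mean ideal linear scaling."""
    return speedup(t_reference, t_parallel) / n_cpus

# Hypothetical timings in seconds.
t_same_algorithm_1cpu = 120.0   # the parallelized algorithm run on 1 CPU
t_best_sequential = 90.0        # the best known sequential algorithm
t_parallel_8cpu = 20.0          # the parallel algorithm on 8 CPUs

relative = speedup(t_same_algorithm_1cpu, t_parallel_8cpu)   # 6.0
absolute = speedup(t_best_sequential, t_parallel_8cpu)       # 4.5
print(relative, absolute, efficiency(t_same_algorithm_1cpu, t_parallel_8cpu, 8))
```

With these numbers the first ratio overstates the benefit, since the best sequential algorithm is faster than the parallel algorithm run on a single CPU.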
A Hybrid Shared-Memory Parallel Max-Tree Algorithm for Extreme Dynamic-Range Images.

Science.gov (United States)

Moschini, Ugo; Meijster, Arnold; Wilkinson, Michael H F

2018-03-01

Max-trees, or component trees, are graph structures that represent the connected components of an image in a hierarchical way. Nowadays, many application fields rely on images with high-dynamic range or floating point values. Efficient sequential algorithms exist to build trees and compute attributes for images of any bit depth. However, we show that the current parallel algorithms perform poorly already with integers at bit depths higher than 16 bits per pixel. We propose a parallel method combining the two worlds of flooding and merging max-tree algorithms. First, a pilot max-tree of a quantized version of the image is built in parallel using a flooding method. Later, this structure is used in a parallel leaf-to-root approach to compute efficiently the final max-tree and to drive the merging of the sub-trees computed by the threads. We present an analysis of the performance both on simulated and actual 2D images and 3D volumes. Execution times are about better than the fastest sequential algorithm and speed-up goes up to on 64 threads.
Using Coarrays to Parallelize Legacy Fortran Applications: Strategy and Case Study

Directory of Open Access Journals (Sweden)

Hari Radhakrishnan

2015-01-01

Full Text Available This paper summarizes a strategy for parallelizing a legacy Fortran 77 program using the object-oriented (OO) and coarray features that entered Fortran in the 2003 and 2008 standards, respectively. OO programming (OOP) facilitates the construction of an extensible suite of model-verification and performance tests that drive the development. Coarray parallel programming facilitates a rapid evolution from a serial application to a parallel application capable of running on multicore processors and many-core accelerators in shared and distributed memory. We delineate 17 code modernization steps used to refactor and parallelize the program and study the resulting performance. Our initial studies were done using the Intel Fortran compiler on a 32-core shared memory server. Scaling behavior was very poor, and profile analysis using TAU showed that the bottleneck in the performance was due to our implementation of a collective, sequential summation procedure. We were able to improve the scalability and achieve nearly linear speedup by replacing the sequential summation with a parallel, binary tree algorithm. We also tested the Cray compiler, which provides its own collective summation procedure. Intel provides no collective reductions. With Cray, the program shows linear speedup even in distributed-memory execution. We anticipate similar results with other compilers once they support the new collective procedures proposed for Fortran 2015.
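The idea of replacing a sequential summation with a binary tree algorithm can be illustrated with a small Python sketch. It is not the authors' coarray Fortran code: the function name `tree_sum` is hypothetical, and the pairwise additions are performed one after another here, whereas in a parallel run each pair in a round could be handled by a different image or process.

```python
def tree_sum(values):
    """Sum values by pairwise combination and report the number of rounds used."""
    partial = list(values)
    rounds = 0
    while len(partial) > 1:
        # All pairs in one round are independent and could be added concurrently.
        paired = [partial[i] + partial[i + 1] for i in range(0, len(partial) - 1, 2)]
        if len(partial) % 2:
            paired.append(partial[-1])   # odd element carries over to the next round
        partial = paired
        rounds += 1
    return (partial[0] if partial else 0), rounds

total, rounds = tree_sum(range(1, 33))   # one hypothetical contribution per core
print(total, rounds)                     # → 528 5
```

For 32 contributions the tree needs 5 rounds instead of 31 dependent additions, which is the source of the improved scaling of a tree reduction.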
Comparing and Optimising Parallel Haskell Implementations for Multicore Machines

DEFF Research Database (Denmark)

Berthold, Jost; Marlow, Simon; Hammond, Kevin

2009-01-01

In this paper, we investigate the differences and tradeoffs imposed by two parallel Haskell dialects running on multicore machines. GpH and Eden are both constructed using the highly-optimising sequential GHC compiler, and share thread scheduling, and other elements, from a common code base. The ...
Development of a parallelization strategy for the VARIANT code

International Nuclear Information System (INIS)

Hanebutte, U.R.; Khalil, H.S.; Palmiotti, G.; Tatsumi, M.

1996-01-01

The VARIANT code solves the multigroup steady-state neutron diffusion and transport equation in three-dimensional Cartesian and hexagonal geometries using the variational nodal method. VARIANT consists of four major parts that must be executed sequentially: input handling, calculation of response matrices, solution algorithm (i.e. inner-outer iteration), and output of results. The objective of the parallelization effort was to reduce the overall computing time by distributing the work of the two computationally intensive (sequential) tasks, the coupling coefficient calculation and the iterative solver, equally among a group of processors. This report describes the code's calculations and gives performance results on one of the benchmark problems used to test the code. The performance analysis in the IBM SPx system shows good efficiency for well-load-balanced programs. Even for relatively small problem sizes, respectable efficiencies are seen for the SPx. An extension to achieve a higher degree of parallelism will be addressed in future work. 7 refs., 1 tab
MaMiCo: Software design for parallel molecular-continuum flow simulations

KAUST Repository

Neumann, Philipp; Flohr, Hanno; Arora, Rahul; Jarmatz, Piet; Tchipev, Nikola; Bungartz, Hans-Joachim

2015-01-01

The macro-micro-coupling tool (MaMiCo) was developed to ease the development of and modularize molecular-continuum simulations, retaining sequential and parallel performance. We demonstrate the functionality and performance of MaMiCo by coupling
Comparisons of memory for nonverbal auditory and visual sequential stimuli.

Science.gov (United States)

McFarland, D J; Cacace, A T

1995-01-01

Properties of auditory and visual sensory memory were compared by examining subjects' recognition performance of randomly generated binary auditory sequential frequency patterns and binary visual sequential color patterns within a forced-choice paradigm. Experiment 1 demonstrated serial-position effects in auditory and visual modalities consisting of both primacy and recency effects. Experiment 2 found that retention of auditory and visual information was remarkably similar when assessed across a 10s interval. Experiments 3 and 4, taken together, showed that the recency effect in sensory memory is affected more by the type of response required (recognition vs. reproduction) than by the sensory modality employed. These studies suggest that auditory and visual sensory memory stores for nonverbal stimuli share similar properties with respect to serial-position effects and persistence over time.
In Vivo Evaluation of Synthetic Aperture Sequential Beamforming

DEFF Research Database (Denmark)

Hemmsen, Martin Christian; Hansen, Peter Møller; Lange, Theis

2012-01-01

Ultrasound in vivo imaging using synthetic aperture sequential beamformation (SASB) is compared with conventional imaging in a double blinded study using side-by-side comparisons. The objective is to evaluate if the image quality in terms of penetration depth, spatial resolution, contrast...

On Coding the States of Sequential Machines with the Use of Partition Pairs

DEFF Research Database (Denmark)

Zahle, Torben U.

1966-01-01

This article introduces a new technique of making state assignment for sequential machines. The technique is in line with the approach used by Hartmanis [1], Stearns and Hartmanis [3], and Curtis [4]. It parallels the work of Dolotta and McCluskey [7], although it was developed independently...
Mathematical Methods and Algorithms of Mobile Parallel Computing on the Base of Multi-core Processors

Directory of Open Access Journals (Sweden)

Alexander B. Bakulev

2012-11-01

Full Text Available This article deals with mathematical models and algorithms that provide mobility for the parallel representation of sequential programs in a high-level language. It presents a formal model for managing operating-environment processes, based on the proposed model of parallel program representation, which describes the computation process on multi-core processors.
A comparison of energetic ions in the plasma depletion layer and the quasi-parallel magnetosheath

Science.gov (United States)

Fuselier, Stephen A.

1994-01-01

Energetic ion spectra measured by the Active Magnetospheric Particle Tracer Explorers/Charge Composition Explorer (AMPTE/CCE) downstream from the Earth's quasi-parallel bow shock (in the quasi-parallel magnetosheath) and in the plasma depletion layer are compared. In the latter region, energetic ions are from a single source, leakage of magnetospheric ions across the magnetopause and into the plasma depletion layer. In the former region, both the magnetospheric source and shock acceleration of the thermal solar wind population at the quasi-parallel shock can contribute to the energetic ion spectra. The relative strengths of these two energetic ion sources are determined through the comparison of spectra from the two regions. It is found that magnetospheric leakage can provide an upper limit of 35% of the total energetic H(+) population in the quasi-parallel magnetosheath near the magnetopause in the energy range from approximately 10 to approximately 80 keV/e and substantially less than this limit for the energetic He(2+) population. The rest of the energetic H(+) population and nearly all of the energetic He(2+) population are accelerated out of the thermal solar wind population through shock acceleration processes. By comparing the energetic and thermal He(2+) and H(+) populations in the quasi-parallel magnetosheath, it is found that the quasi-parallel bow shock is 2 to 3 times more efficient at accelerating He(2+) than H(+). This result is consistent with previous estimates from shock acceleration theory and simulations.
A parallel implementation of 3-d CT image reconstruction on a hypercube multiprocessor

International Nuclear Information System (INIS)

Chen, C.M.; Lee, S.Y.; Cho, Z.H.

1990-01-01

In this paper, the authors describe how image reconstruction in computerized tomography (CT) can be parallelized on a message-passing multiprocessor. In particular, the results obtained from parallel implementation of 3-D CT image reconstruction for parallel beam geometries on the Intel hypercube, iPSC/2, are presented. A two stage pipelining approach is employed for filtering (convolution) and backprojection. The conventional sequential convolution algorithm is modified such that the symmetry of the filter kernel is fully utilized for parallelization. In the backprojection stage, the 3-D incremental algorithm, the authors' recently developed backprojection scheme which is shown to be faster than conventional algorithm, is parallelized
Comparison of likelihood testing procedures for parallel systems with covariances

International Nuclear Information System (INIS)

Ayman Baklizi; Isa Daud; Noor Akma Ibrahim

1998-01-01

In this paper we investigate and compare the behavior of the likelihood ratio, Rao, and Wald statistics for testing hypotheses on the parameters of the simple linear regression model based on parallel systems with covariances. These statistics are asymptotically equivalent (Barndorff-Nielsen and Cox, 1994). However, their relative performances in finite samples are generally unknown. A Monte Carlo experiment is conducted to simulate the sizes and the powers of these statistics for complete samples and in the presence of time censoring. Comparisons of the statistics are made according to the attainment of the assumed size of the test and their powers at various points in the parameter space. The results show that the likelihood ratio statistic appears to have the best performance in terms of the attainment of the assumed size of the test. Power comparisons show that the Rao statistic has some advantage over the Wald statistic in almost all of the space of alternatives, while the likelihood ratio statistic occupies either the first or the last position in terms of power. Overall, the likelihood ratio statistic appears to be more appropriate to the model under study, especially for small sample sizes.
A sequential adaptation technique and its application to the Mark 12 IFF system

Science.gov (United States)

Bailey, John S.; Mallett, John D.; Sheppard, Duane J.; Warner, F. Neal; Adams, Robert

1986-07-01

Sequential adaptation uses only two sets of receivers, correlators, and A/D converters which are time multiplexed to effect spatial adaptation in a system with (N) adaptive degrees of freedom. This technique can substantially reduce the hardware cost over what is realizable in a parallel architecture. A three channel L-band version of the sequential adapter was built and tested for use with the MARK XII IFF (identify friend or foe) system. In this system the sequentially determined adaptive weights were obtained digitally but implemented at RF. As a result, many of the post RF hardware induced sources of error that normally limit cancellation, such as receiver mismatch, are removed by the feedback property. The result is a system that can yield high levels of cancellation and be readily retrofitted to currently fielded equipment.
Polarization control of direct (non-sequential) two-photon double ionization of He

International Nuclear Information System (INIS)

Pronin, E A; Manakov, N L; Marmo, S I; Starace, Anthony F

2007-01-01

An ab initio parametrization of the doubly-differential cross section (DDCS) for two-photon double ionization (TPDI) from an $s^2$ subshell of an atom in a $^1S_0$ state is presented. Analysis of the elliptic dichroism (ED) effect in the DDCS for TPDI of He and its comparison with the same effect in the concurrent process of sequential double ionization shows their qualitative and quantitative differences, thus providing a means to control and to distinguish sequential and non-sequential processes by measuring the relative ED parameter.
Program For Parallel Discrete-Event Simulation

Science.gov (United States)

Beckman, Brian C.; Blume, Leo R.; Geiselman, John S.; Presley, Matthew T.; Wedel, John J., Jr.; Bellenot, Steven F.; Diloreto, Michael; Hontalas, Philip J.; Reiher, Peter L.; Weiland, Frederick P.

1991-01-01

User does not have to add any special logic to aid in synchronization. Time Warp Operating System (TWOS) computer program is special-purpose operating system designed to support parallel discrete-event simulation. Complete implementation of Time Warp mechanism. Supports only simulations and other computations designed for virtual time. Time Warp Simulator (TWSIM) subdirectory contains sequential simulation engine interface-compatible with TWOS. TWOS and TWSIM written in, and support simulations in, C programming language.
GPU-Based Point Cloud Superpositioning for Structural Comparisons of Protein Binding Sites.

Science.gov (United States)

Leinweber, Matthias; Fober, Thomas; Freisleben, Bernd

2018-01-01

In this paper, we present a novel approach to solve the labeled point cloud superpositioning problem for performing structural comparisons of protein binding sites. The solution is based on a parallel evolution strategy that operates on large populations and runs on GPU hardware. The proposed evolution strategy reduces the likelihood of getting stuck in a local optimum of the multimodal real-valued optimization problem represented by labeled point cloud superpositioning. The performance of the GPU-based parallel evolution strategy is compared to a previously proposed CPU-based sequential approach for labeled point cloud superpositioning, indicating that the GPU-based parallel evolution strategy leads to qualitatively better results and significantly shorter runtimes, with speed improvements of up to a factor of 1,500 for large populations. Binary classification tests based on the ATP, NADH, and FAD protein subsets of CavBase, a database containing putative binding sites, show average classification rate improvements from about 92 percent (CPU) to 96 percent (GPU). Further experiments indicate that the proposed GPU-based labeled point cloud superpositioning approach can be superior to traditional protein comparison approaches based on sequence alignments.
Parallel Sequential Monte Carlo for Efficient Density Combination: The Deco Matlab Toolbox

DEFF Research Database (Denmark)

Casarin, Roberto; Grassi, Stefano; Ravazzolo, Francesco

This paper presents the Matlab package DeCo (Density Combination) which is based on the paper by Billio et al. (2013) where a constructive Bayesian approach is presented for combining predictive densities originating from different models or other sources of information. The combination weights...... for standard CPU computing and for Graphical Process Unit (GPU) parallel computing. For the GPU implementation we use the Matlab parallel computing toolbox and show how to use General Purposes GPU computing almost effortless. This GPU implementation comes with a speed up of the execution time up to seventy...... times compared to a standard CPU Matlab implementation on a multicore CPU. We show the use of the package and the computational gain of the GPU version, through some simulation experiments and empirical applications....
Design of multiple sequence alignment algorithms on parallel, distributed memory supercomputers.

Science.gov (United States)

Church, Philip C; Goscinski, Andrzej; Holt, Kathryn; Inouye, Michael; Ghoting, Amol; Makarychev, Konstantin; Reumann, Matthias

2011-01-01

The challenge of comparing two or more genomes that have undergone recombination and substantial amounts of segmental loss and gain has recently been addressed for small numbers of genomes. However, datasets of hundreds of genomes are now common and their sizes will only increase in the future. Multiple sequence alignment of hundreds of genomes remains an intractable problem due to quadratic increases in compute time and memory footprint. To date, most alignment algorithms are designed for commodity clusters without parallelism. Hence, we propose the design of a multiple sequence alignment algorithm on massively parallel, distributed memory supercomputers to enable research into comparative genomics on large data sets. Following the methodology of the sequential progressiveMauve algorithm, we design data structures including sequences and sorted k-mer lists on the IBM Blue Gene/P supercomputer (BG/P). Preliminary results show that we can reduce the memory footprint so that we can potentially align over 250 bacterial genomes on a single BG/P compute node. We verify our results on a dataset of E.coli, Shigella and S.pneumoniae genomes. Our implementation returns results matching those of the original algorithm but in 1/2 the time and with 1/4 the memory footprint for scaffold building. In this study, we have laid the basis for multiple sequence alignment of large-scale datasets on a massively parallel, distributed memory supercomputer, thus enabling comparison of hundreds instead of a few genome sequences within reasonable time.
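To make the notion of a sorted k-mer list concrete, the Python sketch below builds such lists for two short sequences and merges them to find shared k-mers. It is an illustration only, not the progressiveMauve or Blue Gene/P data structure: the function names and the toy sequences are hypothetical, and repeated k-mers are paired one-to-one rather than handled as a real aligner would.

```python
def sorted_kmer_list(sequence, k):
    """Return (k-mer, position) pairs sorted lexicographically by k-mer."""
    kmers = [(sequence[i:i + k], i) for i in range(len(sequence) - k + 1)]
    return sorted(kmers)

def shared_kmers(seq_a, seq_b, k):
    """Merge two sorted k-mer lists and report k-mers present in both sequences."""
    a, b = sorted_kmer_list(seq_a, k), sorted_kmer_list(seq_b, k)
    i = j = 0
    matches = []
    while i < len(a) and j < len(b):
        if a[i][0] < b[j][0]:
            i += 1
        elif a[i][0] > b[j][0]:
            j += 1
        else:
            # Same k-mer in both lists: record it with its position in each sequence.
            matches.append((a[i][0], a[i][1], b[j][1]))
            i += 1
            j += 1
    return matches

matches = shared_kmers("ACGTAC", "GTACG", k=3)
print(matches)   # → [('ACG', 0, 2), ('GTA', 2, 0), ('TAC', 3, 1)]
```

Because both lists are sorted, the merge visits each entry once, so candidate anchors between two genomes can be found without comparing every position against every other.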
Speeding Up the String Comparison of the IDS Snort using Parallel Programming: A Systematic Literature Review on the Parallelized Aho-Corasick Algorithm

Directory of Open Access Journals (Sweden)

SILVA JUNIOR,J. B.

2016-12-01

Full Text Available The Intrusion Detection System (IDS) needs to compare the contents of all packets arriving at the network interface with a set of signatures for indicating possible attacks, a task that consumes much CPU processing time. In order to alleviate this problem, some researchers have tried to parallelize the IDS's comparison engine, transferring execution from the CPU to GPU. This paper identifies and maps the parallelization features of the Aho-Corasick algorithm, which is used in Snort to compare patterns, in order to show this algorithm's implementation and execution issues, as well as optimization techniques for the Aho-Corasick machine. We have found 147 papers from important computer science publications databases, and have mapped them. We selected 22 and analyzed them in order to find our results. Our analysis of the papers showed, among other results, that parallelization of the AC algorithm is a new task and the authors have focused on the State Transition Table as the most common way to implement the algorithm on the GPU. Furthermore, we found that some techniques that speed up the algorithm and reduce the required machine storage space are highly used, such as running the algorithm on the fastest memories and mechanisms for reducing the number of nodes and bit mapping.
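The State Transition Table form of the Aho-Corasick machine mentioned above can be sketched on the CPU in Python. This is neither Snort's implementation nor any of the reviewed GPU versions: the function names are hypothetical, the table is stored as one dictionary per state rather than a dense array, and the alphabet is assumed to contain every character that occurs in the patterns.

```python
from collections import deque

def build_state_transition_table(patterns, alphabet):
    """Return (table, outputs); table[state][symbol] gives the next state directly."""
    trie, outputs = [{}], [[]]
    for pattern in patterns:                      # phase 1: build the keyword trie
        state = 0
        for ch in pattern:
            if ch not in trie[state]:
                trie.append({})
                outputs.append([])
                trie[state][ch] = len(trie) - 1
            state = trie[state][ch]
        outputs[state].append(pattern)

    table = [dict.fromkeys(alphabet, 0) for _ in trie]
    fail = [0] * len(trie)
    queue = deque()
    for ch, nxt in trie[0].items():
        table[0][ch] = nxt
        queue.append(nxt)
    while queue:                                  # phase 2: breadth-first fill of the table
        state = queue.popleft()
        for ch in alphabet:
            if ch in trie[state]:
                nxt = trie[state][ch]
                fail[nxt] = table[fail[state]][ch]
                outputs[nxt] = outputs[nxt] + outputs[fail[nxt]]
                table[state][ch] = nxt
                queue.append(nxt)
            else:
                # Failure transitions are folded into the table, so matching never backtracks.
                table[state][ch] = table[fail[state]][ch]
    return table, outputs

def search(text, table, outputs):
    """Scan text once and return (start position, pattern) for every match."""
    state, hits = 0, []
    for pos, ch in enumerate(text):
        state = table[state].get(ch, 0)           # symbols outside the alphabet reset to state 0
        hits.extend((pos - len(p) + 1, p) for p in outputs[state])
    return hits

patterns = ["he", "she", "his", "hers"]
table, outputs = build_state_transition_table(patterns, sorted(set("".join(patterns))))
hits = search("ushers", table, outputs)
print(hits)   # → [(1, 'she'), (2, 'he'), (2, 'hers')]
```

Each input symbol costs exactly one table lookup, which is the property that makes the table form attractive for data-parallel hardware, at the price of a larger memory footprint than the trie with failure links.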
Efficient multitasking: parallel versus serial processing of multiple tasks.

Science.gov (United States)

Fischer, Rico; Plessow, Franziska

2015-01-01

In the context of performance optimizations in multitasking, a central debate has unfolded in multitasking research around whether cognitive processes related to different tasks proceed only sequentially (one at a time), or can operate in parallel (simultaneously). This review features a discussion of theoretical considerations and empirical evidence regarding parallel versus serial task processing in multitasking. In addition, we highlight how methodological differences and theoretical conceptions determine the extent to which parallel processing in multitasking can be detected, to guide their employment in future research. Parallel and serial processing of multiple tasks are not mutually exclusive. Therefore, questions focusing exclusively on either task-processing mode are too simplified. We review empirical evidence and demonstrate that shifting between more parallel and more serial task processing critically depends on the conditions under which multiple tasks are performed. We conclude that efficient multitasking is reflected by the ability of individuals to adjust multitasking performance to environmental demands by flexibly shifting between different processing strategies of multiple task-component scheduling.
Event-shape analysis: Sequential versus simultaneous multifragment emission

International Nuclear Information System (INIS)

Cebra, D.A.; Howden, S.; Karn, J.; Nadasen, A.; Ogilvie, C.A.; Vander Molen, A.; Westfall, G.D.; Wilson, W.K.; Winfield, J.S.; Norbeck, E.

1990-01-01

The Michigan State University 4π array has been used to select central-impact-parameter events from the reaction ⁴⁰Ar + ⁵¹V at incident energies from 35 to 85 MeV/nucleon. The event shape in momentum space is an observable which is shown to be sensitive to the dynamics of the fragmentation process. A comparison of the experimental event-shape distribution to sequential- and simultaneous-decay predictions suggests that a transition in the breakup process may have occurred. At 35 MeV/nucleon, a sequential-decay simulation reproduces the data. For the higher energies, the experimental distributions fall between the two contrasting predictions.
Parallel computing in cluster of GPU applied to a problem of nuclear engineering

International Nuclear Information System (INIS)

Moraes, Sergio Ricardo S.; Heimlich, Adino; Resende, Pedro

2013-01-01

Cluster computing has been widely used as a low-cost alternative for parallel processing in scientific applications. With the use of the Message-Passing Interface (MPI) protocol, development became even more accessible and widespread in the scientific community. A more recent trend is the use of the Graphics Processing Unit (GPU), a powerful co-processor able to perform hundreds of instructions in parallel, reaching hundreds of times the processing capacity of a CPU. However, a standard PC does not, in general, allow more than two GPUs. Hence, this work proposes the development and evaluation of a hybrid low-cost parallel approach to the solution of a typical nuclear engineering problem. The idea is to use cluster parallelism technology (MPI) together with GPU programming techniques (CUDA, Compute Unified Device Architecture) to simulate neutron transport through a slab using the Monte Carlo method. Using a cluster of four quad-core computers with 2 GPUs each, programs were developed using MPI and CUDA technologies. Experiments applying different configurations, from 1 to 8 GPUs, have been performed, and the results were compared with the sequential (non-parallel) version. A speed up of about 2.000 times has been observed when comparing the 8-GPU version with the sequential version. The results presented here are discussed and analyzed with the objective of outlining the gains and possible limitations of the proposed approach. (author)
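As an illustration of why this problem parallelizes well, here is a small Python sketch of a one-dimensional Monte Carlo slab-transmission estimate split into independent batches. It is not the authors' MPI/CUDA code: the physical model (mono-energetic neutrons, normal incidence, isotropic scattering), the function names, and the parameters `sigma_t` and `scatter_prob` are all assumptions chosen to keep the example short. Each batch uses its own seed and shares no state with the others, which is the property that would let batches be assigned to separate MPI ranks or GPUs.

```python
import math
import random


def run_batch(n_histories, thickness, sigma_t, scatter_prob, seed):
    """Count neutrons transmitted through a 1-D slab in one independent batch."""
    rng = random.Random(seed)
    transmitted = 0
    for _ in range(n_histories):
        x, mu = 0.0, 1.0  # start on the left face, travelling along +x
        while True:
            # Sample the distance to the next collision from an exponential law;
            # 1 - random() lies in (0, 1], so the logarithm is always defined.
            x += mu * (-math.log(1.0 - rng.random()) / sigma_t)
            if x >= thickness:
                transmitted += 1      # left through the far face
                break
            if x < 0.0:
                break                 # escaped back through the near face
            if rng.random() >= scatter_prob:
                break                 # absorbed at the collision site
            mu = 2.0 * rng.random() - 1.0  # isotropic scattering: new direction cosine
    return transmitted


def transmission(n_batches, n_per_batch, thickness, sigma_t, scatter_prob, base_seed=0):
    """Combine independent batches into a single transmission probability."""
    total = sum(
        run_batch(n_per_batch, thickness, sigma_t, scatter_prob, base_seed + b)
        for b in range(n_batches)
    )
    return total / (n_batches * n_per_batch)


# Purely absorbing slab: the estimate should be close to exp(-sigma_t * thickness).
print(transmission(8, 5000, thickness=1.0, sigma_t=1.0, scatter_prob=0.0))
```

For a purely absorbing slab the exact answer is exp(-sigma_t * thickness), which gives a simple check on the sampling before scattering is switched on.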
Automatic parallelization of while-Loops using speculative execution

International Nuclear Information System (INIS)

Collard, J.F.

1995-01-01

Automatic parallelization of imperative sequential programs has focused on nests of for-loops. The most recent techniques consist in finding an affine mapping with respect to the loop indices to simultaneously capture the temporal and spatial properties of the parallelized program. Such a mapping is usually called a "space-time transformation." This work describes an extension of these techniques to while-loops using speculative execution. We show that space-time transformations are a good framework for summing up previous restructuring techniques for while-loops, such as pipelining. Moreover, we show that these transformations can be derived and applied automatically.
Graph Transformation and Designing Parallel Sparse Matrix Algorithms beyond Data Dependence Analysis

Directory of Open Access Journals (Sweden)

H.X. Lin

2004-01-01

Algorithms are often parallelized based on data dependence analysis, either manually or by means of parallel compilers. Some vector/matrix computations with simple data dependence structures (data parallelism), such as matrix-vector products, can be easily parallelized. For problems with more complicated data dependence structures, parallelization is less straightforward. The data dependence graph is a powerful means for designing and analyzing parallel algorithms. However, for sparse matrix computations, parallelization based solely on exploiting the existing parallelism in an algorithm does not always give satisfactory results. For example, the conventional Gaussian elimination algorithm for the solution of a tri-diagonal system is inherently sequential, so algorithms have to be designed specially for parallel computation. After briefly reviewing different parallelization approaches, a powerful graph formalism for designing parallel algorithms is introduced. This formalism will be discussed using a tri-diagonal system as an example. Its application to general matrix computations is also discussed. Its power in designing parallel algorithms beyond the ability of data dependence analysis is shown by means of a new algorithm called ACER (Alternating Cyclic Elimination and Reduction algorithm).
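The remark that conventional Gaussian elimination on a tri-diagonal system is inherently sequential can be seen directly in code. The sketch below is the standard Thomas algorithm in Python, not the ACER algorithm proposed in the paper; the function name `thomas_solve` and the list-based storage of the three diagonals are assumptions for illustration. Every iteration of the forward sweep reads values produced by the previous iteration, so the loop cannot be split across processors without restructuring the algorithm.

```python
def thomas_solve(lower, diag, upper, rhs):
    """Solve a tri-diagonal system; lower[0] and upper[-1] are ignored."""
    n = len(diag)
    c = [0.0] * n  # modified super-diagonal
    d = [0.0] * n  # modified right-hand side
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    # Forward elimination: step i needs c[i - 1] and d[i - 1] (loop-carried dependence).
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    # Back substitution: step i needs x[i + 1], again a chain of dependences.
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


# Second-difference matrix with diagonals (-1, 2, -1) and known solution [1, 2, 3, 4].
solution = thomas_solve([0.0, -1.0, -1.0, -1.0],
                        [2.0, 2.0, 2.0, 2.0],
                        [-1.0, -1.0, -1.0, 0.0],
                        [0.0, 0.0, 0.0, 5.0])
print(solution)
```

The two dependence chains of length n are what data dependence analysis alone cannot remove, which is the motivation the abstract gives for designing a different algorithm rather than parallelizing this one.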
Characterizing and Mitigating Work Time Inflation in Task Parallel Programs

Directory of Open Access Journals (Sweden)

Stephen L. Olivier

2013-01-01

Task parallelism raises the level of abstraction in shared memory parallel programming to simplify the development of complex applications. However, task parallel applications can exhibit poor performance due to thread idleness, scheduling overheads, and work time inflation (additional time spent by threads in a multithreaded computation beyond the time required to perform the same work in a sequential computation). We identify the contributions of each factor to lost efficiency in various task parallel OpenMP applications and diagnose the causes of work time inflation in those applications. Increased data access latency can cause significant work time inflation in NUMA systems. Our locality framework for task parallel OpenMP programs mitigates this cause of work time inflation. Our extensions to the Qthreads library demonstrate that locality-aware scheduling can improve performance up to 3X compared to the Intel OpenMP task scheduler.
Massively parallel mathematical sieves

Energy Technology Data Exchange (ETDEWEB)

Montry, G.R.

1989-01-01

The Sieve of Eratosthenes is a well-known algorithm for finding all prime numbers in a given subset of integers. A parallel version of the Sieve is described that produces computational speedups over 800 on a hypercube with 1,024 processing elements for problems of fixed size. Computational speedups as high as 980 are achieved when the problem size per processor is fixed. The method of parallelization generalizes to other sieves and will be efficient on any ensemble architecture. We investigate two highly parallel sieves using scattered decomposition and compare their performance on a hypercube multiprocessor. A comparison of different parallelization techniques for the sieve illustrates the trade-offs necessary in the design and implementation of massively parallel algorithms for large ensemble computers.
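A rough idea of how a sieve can be decomposed is sketched below in Python. This uses a contiguous block decomposition (a segmented sieve), which is an assumption chosen for simplicity and is not necessarily the scattered decomposition studied in the paper; the function names are hypothetical. Once the small base primes are known, each block can be sieved without reference to any other block, so each call to `sieve_segment` could run on a separate processing element.

```python
import math


def base_primes(limit):
    """Ordinary Sieve of Eratosthenes for all primes <= limit."""
    if limit < 2:
        return []
    flags = [True] * (limit + 1)
    flags[0] = flags[1] = False
    for p in range(2, math.isqrt(limit) + 1):
        if flags[p]:
            for m in range(p * p, limit + 1, p):
                flags[m] = False
    return [i for i, is_prime in enumerate(flags) if is_prime]


def sieve_segment(lo, hi, primes):
    """Primes in [lo, hi), given every prime p with p * p < hi."""
    flags = [True] * (hi - lo)
    for p in primes:
        if p * p >= hi:
            break
        # First multiple of p inside the block, but never below p * p.
        start = max(p * p, ((lo + p - 1) // p) * p)
        for m in range(start, hi, p):
            flags[m - lo] = False
    return [lo + i for i, is_prime in enumerate(flags) if is_prime and lo + i >= 2]


def segmented_primes(n, n_workers):
    """Primes <= n, with 0..n split into about n_workers contiguous blocks."""
    primes = base_primes(math.isqrt(n))
    size = -(-(n + 1) // n_workers)  # ceiling division
    blocks = [(lo, min(lo + size, n + 1)) for lo in range(0, n + 1, size)]
    result = []
    for lo, hi in blocks:  # the iterations are independent of one another
        result.extend(sieve_segment(lo, hi, primes))
    return result


print(segmented_primes(30, 4))
```

The blocks are processed in a plain loop here to keep the sketch deterministic and dependency-free; the point is only that no block reads another block's flags.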
Data driven parallelism in experimental high energy physics applications

International Nuclear Information System (INIS)

Pohl, M.

1987-01-01

I present global design principles for the implementation of high energy physics data analysis code on sequential and parallel processors with mixed shared and local memory. Potential parallelism in the structure of high energy physics tasks is identified with granularity varying from a few times 10^8 instructions all the way down to a few times 10^4 instructions. It follows the hierarchical structure of detector and data acquisition systems. To take advantage of this, while preserving the necessary portability of the code, I propose a computational model with purely data driven concurrency in Single Program Multiple Data (SPMD) mode. The task granularity is defined by varying the granularity of the central data structure manipulated. Concurrent processes coordinate themselves asynchronously using simple lock constructs on parts of the data structure. Load balancing among processes occurs naturally. The scheme allows the internal layout of the data structure to be mapped closely onto the layout of local and shared memory in a parallel architecture. It thus allows the application to be optimized with respect to synchronization as well as data transport overheads. I present a coarse top level design for a portable implementation of this scheme on sequential machines, multiprocessor mainframes (e.g. IBM 3090), tightly coupled multiprocessors (e.g. RP-3) and loosely coupled processor arrays (e.g. LCAP, Emulating Processor Farms). (orig.)
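The coordination scheme described here, identical processes claiming parts of a shared data structure through simple locks, can be illustrated with a small Python sketch. The abstract does not give the actual design, so everything below is an assumption: the central data structure is modelled as a list of chunks, each chunk has its own lock, and holding a lock is used as the marker that a chunk has been claimed. The names `run_spmd` and `work` are hypothetical.

```python
import threading


def run_spmd(chunks, work, n_workers):
    """Run the same worker loop n_workers times; chunks are claimed via per-chunk locks."""
    locks = [threading.Lock() for _ in chunks]
    results = [None] * len(chunks)
    claimed_by = [None] * len(chunks)

    def worker(rank):
        for i, chunk in enumerate(chunks):
            # Non-blocking acquire: a chunk that another worker already holds is skipped.
            if locks[i].acquire(blocking=False):
                claimed_by[i] = rank
                results[i] = work(chunk)
                # The lock is deliberately kept: holding it marks the chunk as taken.

    threads = [threading.Thread(target=worker, args=(rank,)) for rank in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, claimed_by


results, claimed_by = run_spmd([[1, 2], [3], [4, 5, 6]], sum, n_workers=2)
print(results)
```

No worker is assigned chunks in advance; a faster worker simply claims more of them, which is one way to read the abstract's statement that load balancing occurs naturally. Which worker processes which chunk varies from run to run, but each chunk is processed exactly once.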

Data driven parallelism in experimental high energy physics applications

Science.gov (United States)

Pohl, Martin

1987-08-01

I present global design principles for the implementation of High Energy Physics data analysis code on sequential and parallel processors with mixed shared and local memory. Potential parallelism in the structure of High Energy Physics tasks is identified with granularity varying from a few times 10^8 instructions all the way down to a few times 10^4 instructions. It follows the hierarchical structure of detector and data acquisition systems. To take advantage of this, while preserving the necessary portability of the code, I propose a computational model with purely data driven concurrency in Single Program Multiple Data (SPMD) mode. The task granularity is defined by varying the granularity of the central data structure manipulated. Concurrent processes coordinate themselves asynchronously using simple lock constructs on parts of the data structure. Load balancing among processes occurs naturally. The scheme allows the internal layout of the data structure to be mapped closely onto the layout of local and shared memory in a parallel architecture. It thus allows the application to be optimized with respect to synchronization as well as data transport overheads. I present a coarse top level design for a portable implementation of this scheme on sequential machines, multiprocessor mainframes (e.g. IBM 3090), tightly coupled multiprocessors (e.g. RP-3) and loosely coupled processor arrays (e.g. LCAP, Emulating Processor Farms).
Microwave Ablation: Comparison of Simultaneous and Sequential Activation of Multiple Antennas in Liver Model Systems.

Science.gov (United States)

Harari, Colin M; Magagna, Michelle; Bedoya, Mariajose; Lee, Fred T; Lubner, Meghan G; Hinshaw, J Louis; Ziemlewicz, Timothy; Brace, Christopher L

2016-01-01

To compare microwave ablation zones created by using sequential or simultaneous power delivery in ex vivo and in vivo liver tissue. All procedures were approved by the institutional animal care and use committee. Microwave ablations were performed in both ex vivo and in vivo liver models with a 2.45-GHz system capable of powering up to three antennas simultaneously. Two- and three-antenna arrays were evaluated in each model. Sequential and simultaneous ablations were created by delivering power (50 W ex vivo, 65 W in vivo) for 5 minutes per antenna (10 and 15 minutes total ablation time for sequential ablations, 5 minutes for simultaneous ablations). Thirty-two ablations were performed in ex vivo bovine livers (eight per group) and 28 in the livers of eight swine in vivo (seven per group). Ablation zone size and circularity metrics were determined from ablations excised postmortem. Mixed effects modeling was used to evaluate the influence of power delivery, number of antennas, and tissue type. On average, ablations created by using the simultaneous power delivery technique were larger than those with the sequential technique (P < …). Simultaneous ablations were also more circular than sequential ablations (P = .0001). Larger and more circular ablations were achieved with three antennas compared with two antennas (P < …). Simultaneous power delivery creates larger, more confluent ablations with greater temperatures than those created with sequential power delivery. © RSNA, 2015.
A parallel algorithm for 3D particle tracking and Lagrangian trajectory reconstruction

International Nuclear Information System (INIS)

Barker, Douglas; Zhang, Yuanhui; Lifflander, Jonathan; Arya, Anshu

2012-01-01

Particle-tracking methods are widely used in fluid mechanics and multi-target tracking research because of their unique ability to reconstruct long trajectories with high spatial and temporal resolution. Researchers have recently demonstrated 3D tracking of several objects in real time, but as the number of objects is increased, real-time tracking becomes impossible due to data transfer and processing bottlenecks. This problem may be solved by using parallel processing. In this paper, a parallel-processing framework has been developed based on frame decomposition and is programmed using the asynchronous object-oriented Charm++ paradigm. This framework can be a key step in achieving a scalable Lagrangian measurement system for particle-tracking velocimetry and may lead to real-time measurement capabilities. The parallel tracking algorithm was evaluated with three data sets including the particle image velocimetry standard 3D images data set #352, a uniform data set for optimal parallel performance and a computational-fluid-dynamics-generated non-uniform data set to test trajectory reconstruction accuracy, consistency with the sequential version and scalability to more than 500 processors. The algorithm showed strong scaling up to 512 processors and no inherent limits of scalability were seen. Ultimately, up to a 200-fold speedup is observed compared to the serial algorithm when 256 processors were used. The parallel algorithm is adaptable and could be easily modified to use any sequential tracking algorithm, which inputs frames of 3D particle location data and outputs particle trajectories
Foreword to Special Issue on "The Difference between Concurrent and Sequential Computation'' of Mathematical Structures

DEFF Research Database (Denmark)

Aceto, Luca; Longo, Giuseppe; Victor, Björn

2003-01-01

tarpit, and argued that some of the most crucial distinctions in computing methodology, such as sequential versus parallel, deterministic versus non-deterministic, local versus distributed disappear if all one sees in computation is pure symbol pushing. How can we express formally the difference between...
Parallel Computing Using Web Servers and "Servlets".

Science.gov (United States)

Lo, Alfred; Bloor, Chris; Choi, Y. K.

2000-01-01

Describes parallel computing and presents inexpensive ways to implement a virtual parallel computer with multiple Web servers. Highlights include performance measurement of parallel systems; models for using Java and intranet technology including single server, multiple clients and multiple servers, single client; and a comparison of CGI (common…
Parallel Multi-cycle LES of an Optical Pent-roof DISI Engine Under Motored Operating Conditions

Energy Technology Data Exchange (ETDEWEB)

Van Dam, Noah; Sjöberg, Magnus; Zeng, Wei; Som, Sibendu

2017-10-15

The use of Large-eddy Simulations (LES) has increased due to their ability to resolve the turbulent fluctuations of engine flows and capture the resulting cycle-to-cycle variability. One drawback of LES, however, is the requirement to run multiple engine cycles to obtain the necessary cycle statistics for full validation. The standard method of obtaining the cycles, running a single simulation through many engine cycles sequentially, can take a long time to complete. Recently, a new strategy has been proposed by our research group to reduce the amount of time necessary to simulate the many engine cycles by running individual engine cycle simulations in parallel. With modern large computing systems this has the potential to reduce the amount of time necessary for a full set of simulated engine cycles to finish by up to an order of magnitude. In this paper, the Parallel Perturbation Methodology (PPM) is used to simulate up to 35 engine cycles of an optically accessible, pent-roof Direct-injection Spark-ignition (DISI) engine at two different motored engine operating conditions, one throttled and one un-throttled. Comparisons are made against corresponding sequential-cycle simulations to verify the similarity of results using either methodology. Mean results from the PPM approach are very similar to sequential-cycle results, with less than 0.5% difference in pressure and a magnitude structure index (MSI) of 0.95. Differences in cycle-to-cycle variability (CCV) predictions are larger, but close to the statistical uncertainty in the measurement for the number of cycles simulated. PPM LES results were also compared against experimental data. Mean quantities such as pressure or mean velocities were typically matched to within 5-10%. Pressure CCVs were under-predicted, mostly due to the lack of any perturbations in the pressure boundary conditions between cycles. Velocity CCVs for the simulations had the same average magnitude as experiments, but the experimental data showed
Dosimetric comparison of standard three-dimensional conformal radiotherapy followed by intensity-modulated radiotherapy boost schedule (sequential IMRT plan) with simultaneous integrated boost-IMRT (SIB IMRT) treatment plan in patients with localized carcinoma prostate.

Science.gov (United States)

Bansal, A; Kapoor, R; Singh, S K; Kumar, N; Oinam, A S; Sharma, S C

2012-07-01

Dosimetric and radiobiological comparison of two radiation schedules in localized carcinoma prostate: Standard Three-Dimensional Conformal Radiotherapy (3DCRT) followed by Intensity Modulated Radiotherapy (IMRT) boost (sequential-IMRT) with Simultaneous Integrated Boost IMRT (SIB-IMRT). Thirty patients were enrolled. In all, the target consisted of PTV P + SV (prostate and seminal vesicles) and PTV LN (lymph nodes), where PTV refers to planning target volume, and the critical structures included bladder, rectum and small bowel. All patients were treated with the sequential-IMRT plan, but for dosimetric comparison, an SIB-IMRT plan was also created. The prescription dose to PTV P + SV was 74 Gy in both strategies but with a different dose per fraction; the dose to PTV LN, however, was 50 Gy delivered in 25 fractions over 5 weeks for sequential-IMRT and 54 Gy delivered in 27 fractions over 5.5 weeks for SIB-IMRT. The treatment plans were compared in terms of dose-volume histograms. Also, Tumor Control Probability (TCP) and Normal Tissue Complication Probability (NTCP) obtained with the two plans were compared. The volume of rectum receiving 70 Gy or more (V > 70 Gy) was reduced to 18.23% with SIB-IMRT from 22.81% with sequential-IMRT. SIB-IMRT reduced the mean doses to both bladder and rectum by 13% and 17%, respectively, as compared to sequential-IMRT. NTCP of 0.86 ± 0.75% and 0.01 ± 0.02% for the bladder, 5.87 ± 2.58% and 4.31 ± 2.61% for the rectum and 8.83 ± 7.08% and 8.25 ± 7.98% for the bowel was seen with sequential-IMRT and SIB-IMRT plans respectively. For equal PTV coverage, SIB-IMRT markedly reduced doses to critical structures and should therefore be considered as the strategy for dose escalation. SIB-IMRT achieves lower NTCP than sequential-IMRT.
Cost-effectiveness of simultaneous versus sequential surgery in head and neck reconstruction.

Science.gov (United States)

Wong, Kevin K; Enepekides, Danny J; Higgins, Kevin M

2011-02-01

To determine whether simultaneous (ablation and reconstruction overlaps by two teams) head and neck reconstruction is cost effective compared to sequentially (ablation followed by reconstruction) performed surgery. Case-controlled study. Tertiary care hospital. Oncology patients undergoing free flap reconstruction of the head and neck. A match paired comparison study was performed with a retrospective chart review examining the total time of surgery for sequential and simultaneous surgery. Nine patients were selected for both the sequential and simultaneous groups. Sequential head and neck reconstruction patients were pair matched with patients who had undergone similar oncologic ablative or reconstructive procedures performed in a simultaneous fashion. A detailed cost analysis using the microcosting method was then undertaken looking at the direct costs of the surgeons, anesthesiologist, operating room, and nursing. On average, simultaneous surgery required 3 hours 15 minutes less operating time, leading to a cost savings of approximately $1200/case when compared to sequential surgery. This represents approximately a 15% reduction in the cost of the entire operation. Simultaneous head and neck reconstruction is more cost effective when compared to sequential surgery.
Sequential vs simultaneous encoding of spatial information: a comparison between the blind and the sighted.

Science.gov (United States)

Ruotolo, Francesco; Ruggiero, Gennaro; Vinciguerra, Michela; Iachini, Tina

2012-02-01

The aim of this research is to assess whether the crucial factor in determining the characteristics of blind people's spatial mental images is concerned with the visual impairment per se or the processing style that the dominant perceptual modalities used to acquire spatial information impose, i.e. simultaneous (vision) vs sequential (kinaesthesis). Participants were asked to learn six positions in a large parking area via movement alone (congenitally blind, adventitiously blind, blindfolded sighted) or with vision plus movement (simultaneous sighted, sequential sighted), and then to mentally scan between positions in the path. The crucial manipulation concerned the sequential sighted group. Their visual exploration was made sequential by putting visual obstacles within the pathway in such a way that they could not see simultaneously the positions along the pathway. The results revealed a significant time/distance linear relation in all tested groups. However, the linear component was lower in sequential sighted and blind participants, especially congenital. Sequential sighted and congenitally blind participants showed an almost overlapping performance. Differences between groups became evident when mentally scanning farther distances (more than 5m). This threshold effect could be revealing of processing limitations due to the need of integrating and updating spatial information. Overall, the results suggest that the characteristics of the processing style rather than the visual impairment per se affect blind people's spatial mental images. Copyright © 2011 Elsevier B.V. All rights reserved.
A Parallel Algorithm for Connected Component Labelling of Gray-scale Images on Homogeneous Multicore Architectures

International Nuclear Information System (INIS)

Niknam, Mehdi; Thulasiraman, Parimala; Camorlinga, Sergio

2010-01-01

Connected component labelling is an essential step in image processing. We provide a parallel version of Suzuki's sequential connected component algorithm in order to speed up the labelling process. We also modify the algorithm to enable labelling of gray-scale images. Because of the data dependencies in the algorithm, we used a pipeline-like method to exploit parallelism. The parallel algorithm achieved a speedup of 2.5 for an image size of 256 x 256 pixels using 4 processing threads.
Comparison of Pre-Analytical FFPE Sample Preparation Methods and Their Impact on Massively Parallel Sequencing in Routine Diagnostics

Science.gov (United States)

Heydt, Carina; Fassunke, Jana; Künstlinger, Helen; Ihle, Michaela Angelika; König, Katharina; Heukamp, Lukas Carl; Schildhaus, Hans-Ulrich; Odenthal, Margarete; Büttner, Reinhard; Merkelbach-Bruse, Sabine

2014-01-01

Over the last years, massively parallel sequencing has rapidly evolved and has now transitioned into molecular pathology routine laboratories. It is an attractive platform for analysing multiple genes at the same time with very little input material. Therefore, the need for high quality DNA obtained from automated DNA extraction systems has increased, especially to those laboratories which are dealing with formalin-fixed paraffin-embedded (FFPE) material and high sample throughput. This study evaluated five automated FFPE DNA extraction systems as well as five DNA quantification systems using the three most common techniques, UV spectrophotometry, fluorescent dye-based quantification and quantitative PCR, on 26 FFPE tissue samples. Additionally, the effects on downstream applications were analysed to find the most suitable pre-analytical methods for massively parallel sequencing in routine diagnostics. The results revealed that the Maxwell 16 from Promega (Mannheim, Germany) seems to be the superior system for DNA extraction from FFPE material. The extracts had a 1.3–24.6-fold higher DNA concentration in comparison to the other extraction systems, a higher quality and were most suitable for downstream applications. The comparison of the five quantification methods showed intermethod variations but all methods could be used to estimate the right amount for PCR amplification and for massively parallel sequencing. Interestingly, the best results in massively parallel sequencing were obtained with a DNA input of 15 ng determined by the NanoDrop 2000c spectrophotometer (Thermo Fisher Scientific, Waltham, MA, USA). No difference could be detected in mutation analysis based on the results of the quantification methods. These findings emphasise, that it is particularly important to choose the most reliable and constant DNA extraction system, especially when using small biopsies and low elution volumes, and that all common DNA quantification techniques can be used for
Comparison of pre-analytical FFPE sample preparation methods and their impact on massively parallel sequencing in routine diagnostics.

Directory of Open Access Journals (Sweden)

Carina Heydt

Over the last years, massively parallel sequencing has rapidly evolved and has now transitioned into molecular pathology routine laboratories. It is an attractive platform for analysing multiple genes at the same time with very little input material. Therefore, the need for high quality DNA obtained from automated DNA extraction systems has increased, especially for those laboratories which are dealing with formalin-fixed paraffin-embedded (FFPE) material and high sample throughput. This study evaluated five automated FFPE DNA extraction systems as well as five DNA quantification systems using the three most common techniques, UV spectrophotometry, fluorescent dye-based quantification and quantitative PCR, on 26 FFPE tissue samples. Additionally, the effects on downstream applications were analysed to find the most suitable pre-analytical methods for massively parallel sequencing in routine diagnostics. The results revealed that the Maxwell 16 from Promega (Mannheim, Germany) seems to be the superior system for DNA extraction from FFPE material. The extracts had a 1.3-24.6-fold higher DNA concentration in comparison to the other extraction systems, a higher quality and were most suitable for downstream applications. The comparison of the five quantification methods showed intermethod variations but all methods could be used to estimate the right amount for PCR amplification and for massively parallel sequencing. Interestingly, the best results in massively parallel sequencing were obtained with a DNA input of 15 ng determined by the NanoDrop 2000c spectrophotometer (Thermo Fisher Scientific, Waltham, MA, USA). No difference could be detected in mutation analysis based on the results of the quantification methods. These findings emphasise that it is particularly important to choose the most reliable and constant DNA extraction system, especially when using small biopsies and low elution volumes, and that all common DNA quantification techniques can
A scalable method for parallelizing sampling-based motion planning algorithms

KAUST Repository

Jacobs, Sam Ade; Manavi, Kasra; Burgos, Juan; Denny, Jory; Thomas, Shawna; Amato, Nancy M.

2012-01-01

This paper describes a scalable method for parallelizing sampling-based motion planning algorithms. It subdivides configuration space (C-space) into (possibly overlapping) regions and independently, in parallel, uses standard (sequential) sampling-based planners to construct roadmaps in each region. Next, in parallel, regional roadmaps in adjacent regions are connected to form a global roadmap. By subdividing the space and restricting the locality of connection attempts, we reduce the work and inter-processor communication associated with nearest neighbor calculation, a critical bottleneck for scalability in existing parallel motion planning methods. We show that our method is general enough to handle a variety of planning schemes, including the widely used Probabilistic Roadmap (PRM) and Rapidly-exploring Random Trees (RRT) algorithms. We compare our approach to two other existing parallel algorithms and demonstrate that our approach achieves better and more scalable performance. Our approach achieves almost linear scalability on a 2400 core LINUX cluster and on a 153,216 core Cray XE6 petascale machine. © 2012 IEEE.
A scalable method for parallelizing sampling-based motion planning algorithms

KAUST Repository

Jacobs, Sam Ade

2012-05-01

This paper describes a scalable method for parallelizing sampling-based motion planning algorithms. It subdivides configuration space (C-space) into (possibly overlapping) regions and independently, in parallel, uses standard (sequential) sampling-based planners to construct roadmaps in each region. Next, in parallel, regional roadmaps in adjacent regions are connected to form a global roadmap. By subdividing the space and restricting the locality of connection attempts, we reduce the work and inter-processor communication associated with nearest neighbor calculation, a critical bottleneck for scalability in existing parallel motion planning methods. We show that our method is general enough to handle a variety of planning schemes, including the widely used Probabilistic Roadmap (PRM) and Rapidly-exploring Random Trees (RRT) algorithms. We compare our approach to two other existing parallel algorithms and demonstrate that our approach achieves better and more scalable performance. Our approach achieves almost linear scalability on a 2400 core LINUX cluster and on a 153,216 core Cray XE6 petascale machine. © 2012 IEEE.
Power stability methods for parallel systems

International Nuclear Information System (INIS)

Wallach, Y.

1988-01-01

Parallel-processing systems are already commercially available. This paper shows that if one of them, the Alternating Sequential Parallel (ASP) system, is applied to network stability calculations, it will lead to a higher speed of solution. The ASP system is first described and is then shown to be cheaper, more reliable and more available than other parallel systems. Also, no deadlock need be feared and the speedup is normally very high. A number of ASP systems have already been assembled (the SMS systems, Topps, DIRMU etc.). At present, an IBM Local Area Network is being modified so that it too can work in the ASP mode. Existing ASP systems were programmed in Fortran or assembly language. Since newer systems (e.g. DIRMU) are programmed in Modula-2, this language can be used. Stability analysis is based on solving nonlinear differential and algebraic equations. The algorithm for solving the nonlinear differential equations on ASP is described and programmed in Modula-2. The speedup is computed and is shown to be almost optimal.
Objective and subjective measures of simultaneous vs sequential bilateral cochlear implants in adults : A randomized clinical trial

NARCIS (Netherlands)

Kraaijenga, Véronique J.C.; Ramakers, Geerte G.J.; Smulders, Yvette E.; Van Zon, Alice; Stegeman, Inge; Smit, Adriana L.; Stokroos, Robert J.; Hendrice, Nadia; Free, Rolien H.; Maat, Bert; Frijns, Johan H M; Briaire, Jeroen J; Mylanus, Emmanuel A M; Huinck, Wendy J.; van Zanten, Gijsbert A.; Grolman, Wilko

2017-01-01

IMPORTANCE: To date, no randomized clinical trial on the comparison between simultaneous and sequential bilateral cochlear implants (BiCIs) has been performed. OBJECTIVE: To investigate the hearing capabilities and the self-reported benefits of simultaneous BiCIs compared with those of sequential
Objective and Subjective Measures of Simultaneous vs Sequential Bilateral Cochlear Implants in Adults: A Randomized Clinical Trial

NARCIS (Netherlands)

Kraaijenga, V.J.; Ramakers, G.G.; Smulders, Y.E.; Zon, A. van; Stegeman, I.; Smit, A.L.; Stokroos, R.J.; Hendrice, N.; Free, R.H.; Maat, B.; Frijns, J.H.; Briaire, J.J.; Mylanus, E.A.M.; Huinck, W.J.; Zanten, G.A.; Grolman, W.

2017-01-01

Importance: To date, no randomized clinical trial on the comparison between simultaneous and sequential bilateral cochlear implants (BiCIs) has been performed. Objective: To investigate the hearing capabilities and the self-reported benefits of simultaneous BiCIs compared with those of sequential
Parallelization of a Quantum-Classic Hybrid Model For Nanoscale Semiconductor Devices

Directory of Open Access Journals (Sweden)

Oscar Salas

2011-07-01

The expensive reengineering of sequential software and the difficulty of parallel programming are two of the many technical and economic obstacles to the wide use of HPC. We investigate the possibility of rapidly improving the performance of a serial numerical code for simulating the transport of charged carriers in a Double-Gate MOSFET. We introduce the Drift-Diffusion-Schrödinger-Poisson (DDSP) model and study a rapid parallelization strategy for the numerical procedure on shared-memory architectures.
The specificity of learned parallelism in dual-memory retrieval.

Science.gov (United States)

Strobach, Tilo; Schubert, Torsten; Pashler, Harold; Rickard, Timothy

2014-05-01

Retrieval of two responses from one visually presented cue occurs sequentially at the outset of dual-retrieval practice. Exclusively for subjects who adopt a mode of grouping (i.e., synchronizing) their response execution, however, reaction times after dual-retrieval practice indicate a shift to learned retrieval parallelism (e.g., Nino & Rickard, in Journal of Experimental Psychology: Learning, Memory, and Cognition, 29, 373-388, 2003). In the present study, we investigated how this learned parallelism is achieved and why it appears to occur only for subjects who group their responses. Two main accounts were considered: a task-level versus a cue-level account. The task-level account assumes that learned retrieval parallelism occurs at the level of the task as a whole and is not limited to practiced cues. Grouping response execution may thus promote a general shift to parallel retrieval following practice. The cue-level account states that learned retrieval parallelism is specific to practiced cues. This type of parallelism may result from cue-specific response chunking that occurs uniquely as a consequence of grouped response execution. The results of two experiments favored the second account and were best interpreted in terms of a structural bottleneck model.
OpenMP Issues Arising in the Development of Parallel BLAS and LAPACK Libraries

Directory of Open Access Journals (Sweden)

C. Addison

2003-01-01

Dense linear algebra libraries need to cope efficiently with a range of input problem sizes and shapes. Inherently this means that parallel implementations have to exploit parallelism wherever it is present. While OpenMP allows relatively fine grain parallelism to be exploited in a shared memory environment it currently lacks features to make it easy to partition computation over multiple array indices or to overlap sequential and parallel computations. The inherent flexible nature of shared memory paradigms such as OpenMP poses other difficulties when it becomes necessary to optimise performance across successive parallel library calls. Notions borrowed from distributed memory paradigms, such as explicit data distributions, help address some of these problems, but the focus on data rather than work distribution appears misplaced in an SMP context.

A comparison of high-order explicit Runge–Kutta, extrapolation, and deferred correction methods in serial and parallel

KAUST Repository

Ketcheson, David I.

2014-06-13

We compare the three main types of high-order one-step initial value solvers: extrapolation, spectral deferred correction, and embedded Runge–Kutta pairs. We consider orders four through twelve, including both serial and parallel implementations. We cast extrapolation and deferred correction methods as fixed-order Runge–Kutta methods, providing a natural framework for the comparison. The stability and accuracy properties of the methods are analyzed by theoretical measures, and these are compared with the results of numerical tests. In serial, the eighth-order pair of Prince and Dormand (DOP8) is most efficient. But other high-order methods can be more efficient than DOP8 when implemented in parallel. This is demonstrated by comparing a parallelized version of the well-known ODEX code with the (serial) DOP853 code. For an N-body problem with N = 400, the experimental extrapolation code is as fast as the tuned Runge–Kutta pair at loose tolerances, and is up to two times as fast at tight tolerances.
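To illustrate why extrapolation methods lend themselves to parallel execution, the following is a minimal sketch, not the code compared in the abstract above. It assumes explicit Euler as the base method and the harmonic step sequence 1, 2, ..., order (both chosen here for brevity); the function names are hypothetical. Each entry of the first extrapolation column is an independent integration over the same step, so those runs could be assigned to separate workers before the cheap Aitken-Neville combination.

```python
import math


def euler_substeps(f, t0, y0, h, n):
    """Advance y' = f(t, y) from t0 to t0 + h with n explicit Euler substeps."""
    t, y = t0, y0
    dt = h / n
    for _ in range(n):
        y = y + dt * f(t, y)
        t += dt
    return y


def extrapolated_step(f, t0, y0, h, order):
    """One extrapolated Euler step of the given order (illustrative only).

    The base runs with n = 1..order substeps are mutually independent,
    which is the part a parallel implementation could distribute.
    """
    # First column of the extrapolation table: independent base integrations.
    table = [euler_substeps(f, t0, y0, h, n) for n in range(1, order + 1)]
    # Aitken-Neville recursion for an error expansion in powers of h,
    # with substep counts n_j = j + 1 (0-based j). Updating j downward
    # keeps table[j - 1] at its previous-column value.
    for k in range(1, order):
        for j in range(order - 1, k - 1, -1):
            ratio = (j + 1) / (j + 1 - k)
            table[j] = table[j] + (table[j] - table[j - 1]) / (ratio - 1.0)
    return table[-1]


if __name__ == "__main__":
    approx = extrapolated_step(lambda t, y: y, 0.0, 1.0, 0.1, 4)
    print(approx, math.exp(0.1))
```

For the test problem y' = y the fourth-order result agrees with exp(0.1) to roughly the size of the first neglected Taylor term.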
Advances in randomized parallel computing

CERN Document Server

Rajasekaran, Sanguthevar

1999-01-01

The technique of randomization has been employed to solve numerous problems of computing both sequentially and in parallel. Examples of randomized algorithms that are asymptotically better than their deterministic counterparts in solving various fundamental problems abound. Randomized algorithms have the advantages of simplicity and better performance both in theory and often in practice. This book is a collection of articles written by renowned experts in the area of randomized parallel computing. A brief introduction to randomized algorithms: In the analysis of algorithms, at least three different measures of performance can be used: the best case, the worst case, and the average case. Often, the average case run time of an algorithm is much smaller than the worst case. For instance, the worst case run time of Hoare's quicksort is O(n^2), whereas its average case run time is only O(n log n). The average case analysis is conducted with an assumption on the input space. The assumption made to arrive at t...
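The quicksort remark above can be made concrete with a short sketch. This is an illustrative randomized variant, not code from the book: choosing the pivot uniformly at random moves the randomness from the input to the algorithm, so the expected O(n log n) behaviour no longer depends on an assumption about the input distribution. The function name and the optional `rng` parameter are assumptions made for this example.

```python
import random


def randomized_quicksort(items, rng=None):
    """Return a sorted copy of items, using a uniformly random pivot."""
    if rng is None:
        rng = random.Random()
    if len(items) <= 1:
        return list(items)
    pivot = items[rng.randrange(len(items))]
    # Three-way partition so that repeated keys do not cause deep recursion.
    smaller = [x for x in items if x < pivot]
    equal = [x for x in items if x == pivot]
    larger = [x for x in items if x > pivot]
    return randomized_quicksort(smaller, rng) + equal + randomized_quicksort(larger, rng)


if __name__ == "__main__":
    print(randomized_quicksort([5, 3, 8, 1, 3, 9, 2]))
```

An already sorted input, which is the classic bad case for a fixed first-element pivot, is handled like any other input here, because the pivot position is drawn independently of the data.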
Eyewitness accuracy rates in sequential and simultaneous lineup presentations: a meta-analytic comparison.

Science.gov (United States)

Steblay, N; Dysart, J; Fulero, S; Lindsay, R C

2001-10-01

Most police lineups use a simultaneous presentation technique in which eyewitnesses view all lineup members at the same time. Lindsay and Wells (R. C. L. Lindsay & G. L. Wells, 1985) devised an alternative procedure, the sequential lineup, in which witnesses view one lineup member at a time and decide whether or not that person is the perpetrator prior to viewing the next lineup member. The present work uses the technique of meta-analysis to compare the accuracy rates of these presentation styles. Twenty-three papers were located (9 published and 14 unpublished), providing 30 tests of the hypothesis and including 4,145 participants. Results showed that identification of perpetrators from target-present lineups occurs at a higher rate from simultaneous than from sequential lineups. However, this difference largely disappears when moderator variables approximating real world conditions are considered. Also, correct rejection rates were significantly higher for sequential than simultaneous lineups and this difference is maintained or increased by greater approximation to real world conditions. Implications of these findings are discussed.
Synthetic Aperture Sequential Beamforming implemented on multi-core platforms

DEFF Research Database (Denmark)

Kjeldsen, Thomas; Lassen, Lee; Hemmsen, Martin Christian

2014-01-01

This paper compares several computational approaches to Synthetic Aperture Sequential Beamforming (SASB) targeting consumer level parallel processors such as multi-core CPUs and GPUs. The proposed implementations demonstrate that ultrasound imaging using SASB can be executed in real-time with ...... per second) on an Intel Core i7 2600 CPU with an AMD HD7850 and an NVIDIA GTX680 GPU. The fastest CPU and GPU implementations use 14% and 1.3% of the real-time budget of 62 ms/frame, respectively. The maximum achieved processing rate is 1265 frames/s....
General-purpose parallel simulator for quantum computing

International Nuclear Information System (INIS)

Niwa, Jumpei; Matsumoto, Keiji; Imai, Hiroshi

2002-01-01

With current technologies, it seems to be very difficult to implement quantum computers with many qubits. It is therefore important to simulate quantum algorithms and circuits on existing computers. However, for a large-size problem, the simulation often requires more computational power than is available from sequential processing. Therefore, simulation methods for parallel processors are required. We have developed a general-purpose simulator for quantum algorithms/circuits on a parallel computer (Sun Enterprise4500). It can simulate algorithms/circuits with up to 30 qubits. To test the efficiency of our proposed methods, we have simulated Shor's factorization algorithm and Grover's database search, and we have analyzed the robustness of the corresponding quantum circuits in the presence of both decoherence and operational errors. The corresponding results, statistics, and analyses are presented in this paper.
A node linkage approach for sequential pattern mining.

Directory of Open Access Journals (Sweden)

Osvaldo Navarro

Sequential Pattern Mining is a widely addressed problem in data mining, with applications such as analyzing Web usage, examining purchase behavior, and text mining, among others. Nevertheless, with the dramatic increase in data volume, the current approaches prove inefficient when dealing with large input datasets, a large number of different symbols and low minimum supports. In this paper, we propose a new sequential pattern mining algorithm, which follows a pattern-growth scheme to discover sequential patterns. Unlike most pattern growth algorithms, our approach does not build a data structure to represent the input dataset, but instead accesses the required sequences through pseudo-projection databases, achieving better runtime and reducing memory requirements. Our algorithm traverses the search space in a depth-first fashion and only preserves in memory a pattern node linkage and the pseudo-projections required for the branch being explored at the time. Experimental results show that our new approach, the Node Linkage Depth-First Traversal algorithm (NLDFT), has better performance and scalability in comparison with state-of-the-art algorithms.
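The idea of pseudo-projection mentioned above can be sketched briefly. The following is a generic depth-first pattern-growth miner over sequences of single items, written under simplifying assumptions; it is not the NLDFT algorithm and does not model its node linkage. A projected database is represented only as a list of (sequence index, offset) pairs into the original data, so no suffix is ever copied. All names are hypothetical.

```python
def mine_sequences(db, min_support):
    """Return {pattern tuple: support} for frequent subsequences of db.

    db is a list of sequences (strings or lists of hashable, sortable items).
    A pseudo-projection is a list of (sequence index, start offset) pairs.
    """
    patterns = {}

    def grow(prefix, projection):
        # For each item, record where the remainder of each supporting
        # sequence starts after the item's first occurrence past the offset.
        extensions = {}
        for seq_id, start in projection:
            seen = set()
            seq = db[seq_id]
            for pos in range(start, len(seq)):
                item = seq[pos]
                if item not in seen:
                    seen.add(item)
                    extensions.setdefault(item, []).append((seq_id, pos + 1))
        # Depth-first: only the projections on the current branch are alive.
        for item, next_projection in sorted(extensions.items()):
            if len(next_projection) >= min_support:
                pattern = prefix + (item,)
                patterns[pattern] = len(next_projection)
                grow(pattern, next_projection)

    grow((), [(i, 0) for i in range(len(db))])
    return patterns


if __name__ == "__main__":
    for pattern, support in sorted(mine_sequences(["abc", "acb", "ab"], 2).items()):
        print("".join(pattern), support)
```

Because each recursive call receives only offsets, memory use is bounded by the projections along the branch being explored rather than by copies of the projected sequences.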
Simultaneous sequential monitoring of efficacy and safety led to masking of effects.

Science.gov (United States)

van Eekelen, Rik; de Hoop, Esther; van der Tweel, Ingeborg

2016-08-01

Usually, sequential designs for clinical trials are applied on the primary (=efficacy) outcome. In practice, other outcomes (e.g., safety) will also be monitored and influence the decision whether to stop a trial early. Implications of simultaneous monitoring on trial decision making are yet unclear. This study examines what happens to the type I error, power, and required sample sizes when one efficacy outcome and one correlated safety outcome are monitored simultaneously using sequential designs. We conducted a simulation study in the framework of a two-arm parallel clinical trial. Interim analyses on two outcomes were performed independently and simultaneously on the same data sets using four sequential monitoring designs, including O'Brien-Fleming and Triangular Test boundaries. Simulations differed in values for correlations and true effect sizes. When an effect was present in both outcomes, competition was introduced, which decreased power (e.g., from 80% to 60%). Futility boundaries for the efficacy outcome reduced overall type I errors as well as power for the safety outcome. Monitoring two correlated outcomes, given that both are essential for early trial termination, leads to masking of true effects. Careful consideration of scenarios must be taken into account when designing sequential trials. Simulation results can help guide trial design. Copyright © 2016 Elsevier Inc. All rights reserved.
Sequential decisions: a computational comparison of observational and reinforcement accounts.

Directory of Open Access Journals (Sweden)

Nazanin Mohammadi Sepahvand

Right brain damaged patients show impairments in sequential decision making tasks with which healthy people have no difficulty. We hypothesized that this difficulty could be due to the failure of right brain damaged patients to develop well-matched models of the world. Our motivation is the idea that to navigate uncertainty, humans use models of the world to direct the decisions they make when interacting with their environment. The better the model is, the better their decisions are. To explore the model building and updating process in humans and the basis for impairment after brain injury, we used a computational model of non-stationary sequence learning. RELPH (Reinforcement and Entropy Learned Pruned Hypothesis space) was able to qualitatively and quantitatively reproduce the results of left and right brain damaged patient groups and healthy controls playing a sequential version of Rock, Paper, Scissors. Our results suggest that, in general, humans employ a sub-optimal reinforcement based learning method rather than an objectively better statistical learning approach, and that differences between right brain damaged and healthy control groups can be explained by different exploration policies, rather than qualitatively different learning mechanisms.
Comparison of simultaneous and sequential SPECT imaging for discrimination tasks in assessment of cardiac defects.

Science.gov (United States)

Trott, C M; Ouyang, J; El Fakhri, G

2010-11-21

Simultaneous rest perfusion/fatty-acid metabolism studies have the potential to replace sequential rest/stress perfusion studies for the assessment of cardiac function. Simultaneous acquisition has the benefits of increased signal and lack of need for patient stress, but is complicated by cross-talk between the two radionuclide signals. We consider a simultaneous rest (99m)Tc-sestamibi/(123)I-BMIPP imaging protocol in place of the commonly used sequential rest/stress (99m)Tc-sestamibi protocol. The theoretical precision with which the severity of a cardiac defect and the transmural extent of infarct can be measured is computed for simultaneous and sequential SPECT imaging, and their performance is compared for discriminating (1) degrees of defect severity and (2) sub-endocardial from transmural defects. We consider cardiac infarcts for which reduced perfusion and metabolism are observed. From an information perspective, simultaneous imaging is found to yield comparable or improved performance compared with sequential imaging for discriminating both severity of defect and transmural extent of infarct, for three defects of differing location and size.
Comparison of simultaneous and sequential SPECT imaging for discrimination tasks in assessment of cardiac defects

International Nuclear Information System (INIS)

Trott, C M; Ouyang, J; El Fakhri, G

2010-01-01

Simultaneous rest perfusion/fatty-acid metabolism studies have the potential to replace sequential rest/stress perfusion studies for the assessment of cardiac function. Simultaneous acquisition has the benefits of increased signal and lack of need for patient stress, but is complicated by cross-talk between the two radionuclide signals. We consider a simultaneous rest (99m)Tc-sestamibi/(123)I-BMIPP imaging protocol in place of the commonly used sequential rest/stress (99m)Tc-sestamibi protocol. The theoretical precision with which the severity of a cardiac defect and the transmural extent of infarct can be measured is computed for simultaneous and sequential SPECT imaging, and their performance is compared for discriminating (1) degrees of defect severity and (2) sub-endocardial from transmural defects. We consider cardiac infarcts for which reduced perfusion and metabolism are observed. From an information perspective, simultaneous imaging is found to yield comparable or improved performance compared with sequential imaging for discriminating both severity of defect and transmural extent of infarct, for three defects of differing location and size.
PLAST: parallel local alignment search tool for database comparison

Directory of Open Access Journals (Sweden)

Lavenier Dominique

2009-10-01

Background: Sequence similarity searching is an important and challenging task in molecular biology, and next-generation sequencing should further strengthen the need for faster algorithms to process such vast amounts of data. At the same time, the internal architecture of current microprocessors is tending towards more parallelism, leading to the use of chips with two, four and more cores integrated on the same die. The main purpose of this work was to design an effective algorithm to fit with the parallel capabilities of modern microprocessors. Results: A parallel algorithm for comparing large genomic banks and targeting middle-range computers has been developed and implemented in PLAST software. The algorithm exploits two key parallel features of existing and future microprocessors: the SIMD programming model (SSE instruction set) and the multithreading concept (multicore). Compared to multithreaded BLAST software, tests performed on an 8-processor server have shown speedup ranging from 3 to 6 with a similar level of accuracy. Conclusion: A parallel algorithmic approach driven by the knowledge of the internal microprocessor architecture allows significant speedup to be obtained while preserving standard sensitivity for similarity search problems.
Parallel discrete event simulation: A shared memory approach

Science.gov (United States)

Reed, Daniel A.; Malony, Allen D.; Mccredie, Bradley D.

1987-01-01

With traditional event list techniques, evaluating a detailed discrete event simulation model can often require hours or even days of computation time. Parallel simulation mimics the interacting servers and queues of a real system by assigning each simulated entity to a processor. By eliminating the event list and maintaining only sufficient synchronization to ensure causality, parallel simulation can potentially provide speedups that are linear in the number of processors. A set of shared memory experiments is presented using the Chandy-Misra distributed simulation algorithm to simulate networks of queues. Parameters include queueing network topology and routing probabilities, number of processors, and assignment of network nodes to processors. These experiments show that Chandy-Misra distributed simulation is a questionable alternative to sequential simulation of most queueing network models.
Assessing potential forest and steel inter-industry residue utilisation by sequential chemical extraction

Energy Technology Data Exchange (ETDEWEB)

Makela, M.

2012-10-15

Traditional process industries in Finland and abroad are facing an emerging waste disposal problem due to recent regulatory developments, which have increased the costs of landfill disposal and the difficulty of acquiring new sites. For large manufacturers, such as the forest and ferrous metals industries, symbiotic cooperation of formerly separate industrial sectors could enable the utilisation of waste-labeled residues in manufacturing novel residue-derived materials suitable for replacing commercial virgin alternatives. Such efforts would allow transforming the current linear resource use and disposal models to more cyclical ones and thus attain savings in valuable materials and energy resources. The work described in this thesis was aimed at utilising forest and carbon steel industry residues in the experimental manufacture of novel residue-derived materials technically and environmentally suitable for amending agricultural or forest soil properties. Single and sequential chemical extractions were used to compare the pseudo-total concentrations of trace elements in the manufactured amendment samples to relevant Finnish statutory limit values for the use of fertilizer products and to assess respective potential availability under natural conditions. In addition, the quality of analytical work and the suitability of sequential extraction in the analysis of an industrial solid sample were respectively evaluated through the analysis of a certified reference material and by X-ray diffraction of parallel sequential extraction residues. According to the acquired data, the incorporation of both forest and steel industry residues, such as fly ashes, lime wastes, green liquor dregs, sludges and slags, led to amendment liming capacities (34.9-38.3%, Ca equiv., d.w.) comparable to relevant commercial alternatives. Only the first experimental samples showed increased concentrations of pseudo-total cadmium and chromium, of which the latter was specified as the trivalent Cr(III). Based on
Comparison of capacitive and radio frequency resonator sensors for monitoring parallelized droplet microfluidic production

KAUST Repository

Conchouso Gonzalez, David

2016-06-28

Scaled-up production of microfluidic droplets, through the parallelization of hundreds of droplet generators, has received a lot of attention as a way to bring novel multiphase microfluidics research to industrial applications. However, apart from droplet generation, other significant challenges relevant to this goal have never been discussed. Examples include monitoring systems, high-throughput processing of droplets and quality control procedures, among others. In this paper, we present and compare capacitive and radio frequency (RF) resonator sensors as two candidates that can measure the dielectric properties of emulsions in microfluidic channels. By placing several of these sensors in a parallelization device, the stability of the droplet generation at different locations can be compared, and potential malfunctions can be detected. This strategy enables, for the first time, the monitoring of scaled-up microfluidic droplet production. Both sensors were prototyped and characterized using emulsions with droplets of 100-150 μm in diameter, which were generated in parallelization devices at water-in-oil volume fractions (φ) between 11.1% and 33.3%. Using these sensors, we were able to accurately measure increments as small as 2.4% in the water volume fraction of the emulsions. Although both methods rely on the dielectric properties of the emulsions, the main advantage of the RF resonator sensors is that they can be designed to resonate at multiple frequencies of the broadband transmission line. Consequently, with careful design, two or more sensors can be parallelized and read out by a single signal. Finally, a comparison between these sensors based on their sensitivity, readout cost and simplicity, and design flexibility is also discussed. © 2016 The Royal Society of Chemistry.
Comparison of multihardware parallel implementations for a phase unwrapping algorithm

Science.gov (United States)

Hernandez-Lopez, Francisco Javier; Rivera, Mariano; Salazar-Garibay, Adan; Legarda-Sáenz, Ricardo

2018-04-01

Phase unwrapping is an important problem in the areas of optical metrology, synthetic aperture radar (SAR) image analysis, and magnetic resonance imaging (MRI) analysis. These images are becoming larger in size and, particularly, the availability and need for processing of SAR and MRI data have increased significantly with the acquisition of remote sensing data and the popularization of magnetic resonators in clinical diagnosis. Therefore, it is important to develop faster and more accurate phase unwrapping algorithms. We propose a parallel multigrid algorithm for a phase unwrapping method named accumulation of residual maps, which builds on a serial algorithm that minimizes a cost function by means of a serial Gauss-Seidel type algorithm. Our algorithm also optimizes the original cost function, but unlike the original work, it is a parallel Jacobi-class algorithm with alternated minimizations. This strategy is known as the chessboard type, where red pixels can be updated in parallel in the same iteration since they are independent. Similarly, black pixels can be updated in parallel in an alternating iteration. We present parallel implementations of our algorithm for different parallel multicore architectures such as multicore CPUs, the Xeon Phi coprocessor, and Nvidia graphics processing units. In all cases, our parallel algorithm outperforms the original serial version. In addition, we present a detailed performance comparison of the developed parallel versions.
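The chessboard (red/black) update order described above can be illustrated on a much simpler problem. The sketch below is an assumption-laden stand-in: it relaxes a Laplace-type averaging problem on a small grid, not the accumulation-of-residual-maps cost function, and it runs serially. Its purpose is only to show that every cell of one color depends solely on cells of the other color, so all cells of one color could be updated concurrently. Function names are hypothetical.

```python
def checkerboard_sweep(grid, color):
    """Update interior cells with (i + j) % 2 == color in place.

    Each updated cell reads only its four neighbours, which all have the
    opposite color, so the updates within one sweep are independent.
    """
    rows, cols = len(grid), len(grid[0])
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            if (i + j) % 2 == color:
                grid[i][j] = 0.25 * (
                    grid[i - 1][j] + grid[i + 1][j] + grid[i][j - 1] + grid[i][j + 1]
                )


def relax(grid, iterations):
    """Alternate 'red' (color 0) and 'black' (color 1) sweeps."""
    for _ in range(iterations):
        checkerboard_sweep(grid, 0)
        checkerboard_sweep(grid, 1)
    return grid


if __name__ == "__main__":
    # 5 x 5 grid: top boundary held at 1.0, all other boundaries at 0.0.
    grid = [[0.0] * 5 for _ in range(5)]
    grid[0] = [1.0] * 5
    relax(grid, 200)
    print(grid[2][2])
```

In a parallel implementation, the inner double loop of each sweep is the part that would be distributed across threads or GPU work items, with a synchronization point between the red and black sweeps.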
A hybrid parallel framework for the cellular Potts model simulations

Energy Technology Data Exchange (ETDEWEB)

Jiang, Yi [Los Alamos National Laboratory; He, Kejing [SOUTH CHINA UNIV; Dong, Shoubin [SOUTH CHINA UNIV

2009-01-01

The Cellular Potts Model (CPM) has been widely used for biological simulations. However, most current implementations are either sequential or approximated, and cannot be used for large scale complex 3D simulation. In this paper we present a hybrid parallel framework for CPM simulations. The time-consuming PDE solving, cell division, and cell reaction operations are distributed to clusters using the Message Passing Interface (MPI). The Monte Carlo lattice update is parallelized on shared-memory SMP systems using OpenMP. Because the Monte Carlo lattice update is much faster than the PDE solving and SMP systems are more and more common, this hybrid approach achieves good performance and high accuracy at the same time. Based on the parallel Cellular Potts Model, we studied avascular tumor growth using a multiscale model. The application and performance analysis show that the hybrid parallel framework is quite efficient. The hybrid parallel CPM can be used for the large scale simulation (~10^8 sites) of complex collective behavior of numerous cells (~10^6).
A Comparison of Ultimate Loads from Fully and Sequentially Coupled Analyses

Energy Technology Data Exchange (ETDEWEB)

Wendt, Fabian F [National Renewable Energy Laboratory (NREL), Golden, CO (United States); Damiani, Rick R [National Renewable Energy Laboratory (NREL), Golden, CO (United States)

2017-11-14

This poster summarizes the scope and preliminary results of a study conducted for the Bureau of Safety and Environmental Enforcement aimed at quantifying differences between two modeling approaches (fully coupled and sequentially coupled) through aero-hydro-servo-elastic simulations of two offshore wind turbines on a monopile and jacket substructure.
Testing New Programming Paradigms with NAS Parallel Benchmarks

Science.gov (United States)

Jin, H.; Frumkin, M.; Schultz, M.; Yan, J.

2000-01-01

Over the past decade, high performance computing has evolved rapidly, not only in hardware architectures but also with increasing complexity of real applications. Technologies have been developing to aim at scaling up to thousands of processors on both distributed and shared memory systems. Development of parallel programs on these computers is always a challenging task. Today, writing parallel programs with message passing (e.g. MPI) is the most popular way of achieving scalability and high performance. However, writing message passing programs is difficult and error prone. In recent years, new effort has been made in defining new parallel programming paradigms. The best examples are HPF (based on data parallelism) and OpenMP (based on shared memory parallelism). Both provide simple and clear extensions to sequential programs, thus greatly simplifying the tedious tasks encountered in writing message passing programs. HPF is independent of memory hierarchy; however, due to the immaturity of compiler technology, its performance is still questionable. Although use of parallel compiler directives is not new, OpenMP offers a portable solution in the shared-memory domain. Another important development involves the tremendous progress in the internet and its associated technology. Although still in its infancy, Java promises portability in a heterogeneous environment and offers the possibility to "compile once and run anywhere." In light of testing these new technologies, we implemented new parallel versions of the NAS Parallel Benchmarks (NPBs) with HPF and OpenMP directives, and extended the work with Java and Java-threads. The purpose of this study is to examine the effectiveness of alternative programming paradigms. NPBs consist of five kernels and three simulated applications that mimic the computation and data movement of large scale computational fluid dynamics (CFD) applications. We started with the serial version included in NPB2.3. Optimization of memory and cache usage
A New Approach of Parallelism and Load Balance for the Apriori Algorithm

Directory of Open Access Journals (Sweden)

BOLINA, A. C.

2013-06-01

The main goal of data mining is to discover relevant information in digital content. The Apriori algorithm is widely used for this purpose, but its sequential version performs poorly when executed over large volumes of data. One solution to this problem is a parallel implementation of the algorithm, and among the Apriori-based parallel implementations presented in the literature, DPA (Distributed Parallel Apriori) [10] stands out. This paper presents the DMTA (Distributed Multithread Apriori) algorithm, which is based on DPA and exploits thread-level parallelism in order to increase performance. In addition, DMTA can be executed over heterogeneous hardware platforms with different numbers of cores. The results showed that DMTA outperforms DPA, balances load among processes and threads, and is effective on current multicore architectures.
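For readers unfamiliar with the sequential baseline, the following is a minimal sketch of level-wise Apriori in plain Python. It is a textbook-style illustration under simplifying assumptions, not DPA or DMTA, and the function name is hypothetical. The support-counting step, marked in the comments, is the expensive pass over the transactions that parallel variants typically split across processes or threads.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset: count} for itemsets found in >= min_support transactions."""
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count single items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(frequent)

    k = 2
    while frequent:
        prev = list(frequent)
        # Join step: unions of frequent (k-1)-itemsets that have exactly k items.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in frequent for s in combinations(c, k - 1))
        }
        # Support counting: one pass over all transactions per level.
        # This loop is the natural unit of work to partition in parallel.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        frequent = {s: c for s, c in counts.items() if c >= min_support}
        result.update(frequent)
        k += 1
    return result


if __name__ == "__main__":
    data = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}]
    for itemset, count in sorted(apriori(data, 2).items(), key=lambda kv: sorted(kv[0])):
        print(sorted(itemset), count)
```

Partitioning the transactions and summing per-partition counts gives the same supports as the single pass shown here, which is why the counting step distributes naturally.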
Distortion product otoacoustic emissions: comparison of sequential vs. simultaneous presentation of primary tones.

Science.gov (United States)

Kumar, U Ajith; Maruthy, Sandeep; Chandrakant, Vishwakarma

2009-03-01

Distortion product otoacoustic emissions (DPOAEs) are one form of evoked otoacoustic emissions. DPOAEs provide frequency-specific information about hearing status in the mid- and high-frequency regions. In most screening protocols, however, TEOAEs are preferred because they require less time than DPOAEs: in DPOAE testing, each stimulus is presented one after the other and the responses are analyzed. The Grason and Stadler Incorporation 60 (GSI-60) offers simultaneous presentation of four sets of primary tones at a time and checks for the DPOAE. In this mode of presentation, all the pairs are presented at once and the response to each is then extracted separately, whereas in sequential mode the primaries are presented in an orderly fashion, one after the other. In this article, simultaneous and sequential protocols were used to compare distortion product otoacoustic emission amplitude, noise floor, and administration time in individuals with normal hearing and with mild sensorineural (SN) hearing loss. In the simultaneous protocol, four sets of primary tones (i.e. 8 tones) were presented together, whereas in the sequential presentation mode one set of primary tones was presented each time. The simultaneous protocol was completed in less than half the time required for the sequential protocol. The two techniques yielded similar results at frequencies above 1000 Hz only in the normal hearing group. In the SN hearing loss group, simultaneous presentation yielded significantly higher noise floors and distortion product amplitudes. This result challenges the use of the simultaneous presentation technique in neonatal hearing screening programmes and in other pathologies. The discrepancy between the two protocols may be due to changes in the biomechanical processes of the cochlea and/or to higher distortion/noise produced by the system during the simultaneous presentation mode.

Parallel Simulation of Loosely Timed SystemC/TLM Programs: Challenges Raised by an Industrial Case Study

Directory of Open Access Journals (Sweden)

Denis Becker

2016-05-01

Transaction level models of systems-on-chip in SystemC are commonly used in the industry to provide an early simulation environment. The SystemC standard imposes coroutine semantics for the scheduling of simulated processes, to ensure determinism and reproducibility of simulations. However, because of this, sequential implementations have, for a long time, been the only option available, and still now the reference implementation is sequential. With the increasing size and complexity of models, and the multiplication of computation cores on recent machines, the parallelization of SystemC simulations is a major research concern. There have been several proposals for SystemC parallelization, but most of them are limited to cycle-accurate models. In this paper we focus on loosely timed models, which are commonly used in the industry. We present an industrial context and show that, unfortunately, most of the existing approaches for SystemC parallelization can fundamentally not apply in this context. We support this claim with a set of measurements performed on a platform used in production at STMicroelectronics. This paper surveys existing techniques, presents a visualization and profiling tool and identifies unsolved challenges in the parallelization of SystemC models at transaction level.
Algorithms for the Construction of Parallel Tests by Zero-One Programming. Project Psychometric Aspects of Item Banking No. 7. Research Report 86-7.

Science.gov (United States)

Boekkooi-Timminga, Ellen

Nine methods for automated test construction are described. All are based on the concepts of information from item response theory. Two general kinds of methods for the construction of parallel tests are presented: (1) sequential test design; and (2) simultaneous test design. Sequential design implies that the tests are constructed one after the…
Vdebug: debugging tool for parallel scientific programs. Design report on vdebug

International Nuclear Information System (INIS)

Matsuda, Katsuyuki; Takemiya, Hiroshi

2000-02-01

We report on a debugging tool called vdebug, which supports debugging work for parallel scientific simulation programs. It is difficult to debug scientific programs with an existing debugger, because existing debuggers usually show data values as characters and the volume of data generated by the programs is too large for users to check in that form. To alleviate this, we have developed vdebug, which enables users to check the validity of large amounts of data by showing the data values visually. Although the targets of vdebug had been restricted to sequential programs, we have made it applicable to parallel programs by implementing a function that merges and visualizes data distributed over the programs on each computer node. vdebug now works on seven kinds of parallel computers. In this report, we describe the design of vdebug. (author)
Sequential search leads to faster, more efficient fragment-based de novo protein structure prediction.

Science.gov (United States)

de Oliveira, Saulo H P; Law, Eleanor C; Shi, Jiye; Deane, Charlotte M

2018-04-01

Most current de novo structure prediction methods randomly sample protein conformations and thus require large amounts of computational resource. Here, we consider a sequential sampling strategy, building on ideas from recent experimental work which shows that many proteins fold cotranslationally. We have investigated whether a pseudo-greedy search approach, which begins sequentially from one of the termini, can improve the performance and accuracy of de novo protein structure prediction. We observed that our sequential approach converges when fewer than 20 000 decoys have been produced, fewer than commonly expected. Using our software, SAINT2, we also compared the run time and quality of models produced in a sequential fashion against a standard, non-sequential approach. Sequential prediction produces an individual decoy 1.5-2.5 times faster than non-sequential prediction. When considering the quality of the best model, sequential prediction led to a better model being produced for 31 out of 41 soluble protein validation cases and for 18 out of 24 transmembrane protein cases. Correct models (TM-Score > 0.5) were produced for 29 of these cases by the sequential mode and for only 22 by the non-sequential mode. Our comparison reveals that a sequential search strategy can be used to drastically reduce computational time of de novo protein structure prediction and improve accuracy. Data are available for download from: http://opig.stats.ox.ac.uk/resources. SAINT2 is available for download from: https://github.com/sauloho/SAINT2. Contact: saulo.deoliveira@dtc.ox.ac.uk. Supplementary data are available at Bioinformatics online.
Multitasking TORT Under UNICOS: Parallel Performance Models and Measurements

International Nuclear Information System (INIS)

Azmy, Y.Y.; Barnett, D.A.

1999-01-01

The existing parallel algorithms in the TORT discrete ordinates code were updated to function in a UNICOS environment. A performance model for the parallel overhead was derived for the existing algorithms. The largest contributors to the parallel overhead were identified and a new algorithm was developed. A parallel overhead model was also derived for the new algorithm. The results of the comparison of parallel performance models were compared to applications of the code to two TORT standard test problems and a large production problem. The parallel performance models agree well with the measured parallel overhead.
Multitasking TORT under UNICOS: Parallel performance models and measurements

International Nuclear Information System (INIS)

Barnett, A.; Azmy, Y.Y.

1999-01-01

The existing parallel algorithms in the TORT discrete ordinates code were updated to function in a UNICOS environment. A performance model for the parallel overhead was derived for the existing algorithms. The largest contributors to the parallel overhead were identified and a new algorithm was developed. A parallel overhead model was also derived for the new algorithm. The results of the comparison of parallel performance models were compared to applications of the code to two TORT standard test problems and a large production problem. The parallel performance models agree well with the measured parallel overhead
Automatic synthesis of sequential control schemes

International Nuclear Information System (INIS)

Klein, I.

1993-01-01

Of all hard- and software developed for industrial control purposes, the majority is devoted to sequential, or binary valued, control and only a minor part to classical linear control. Typically, the sequential parts of the controller are invoked during startup and shut-down to bring the system into its normal operating region and into some safe standby region, respectively. Despite its importance, fairly little theoretical research has been devoted to this area, and sequential control programs are therefore still created manually without much theoretical support to obtain a systematic approach. We propose a method to create sequential control programs automatically. The main idea is to spend some effort off-line modelling the plant, and from this model generate the control strategy, that is, the plan. The plant is modelled using action structures, thereby concentrating on the actions instead of the states of the plant. In general the planning problem shows exponential complexity in the number of state variables. However, by focusing on the actions, we can identify problem classes as well as algorithms such that the planning complexity is reduced to polynomial complexity. We prove that these algorithms are sound, i.e., the generated solution will solve the stated problem, and complete, i.e., if the algorithms fail, then no solution exists. The algorithms generate a plan as a set of actions and a partial order on this set specifying the execution order. The generated plan is proven to be minimal and maximally parallel. For a larger class of problems we propose a method to split the original problem into a number of simple problems that can each be solved using one of the presented algorithms. It is also shown how a plan can be translated into a GRAFCET chart, and to illustrate these ideas we have implemented a planning tool, i.e., a system that is able to automatically create control schemes. Such a tool can of course also be used on-line if it is fast enough. This
Digital intermediate frequency QAM modulator using parallel processing

Science.gov (United States)

Pao, Hsueh-Yuan [Livermore, CA; Tran, Binh-Nien [San Ramon, CA

2008-05-27

The digital Intermediate Frequency (IF) modulator applies to various modulation types and offers a simple and low cost method to implement a high-speed digital IF modulator using field programmable gate arrays (FPGAs). The architecture eliminates multipliers and sequential processing by storing the pre-computed modulated cosine and sine carriers in ROM look-up-tables (LUTs). The high-speed input data stream is parallel processed using the corresponding LUTs, which reduces the main processing speed, allowing the use of low cost FPGAs.
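The look-up-table idea in this abstract can be illustrated in software. The following is a minimal sketch, not the patented architecture: it assumes 16-QAM amplitude levels, one carrier cycle per symbol and eight samples per symbol, and all names (`COS_LUT`, `SIN_LUT`, `modulate`) are hypothetical. The amplitude-scaled cosine and sine carriers are computed once, so the per-symbol work is only table reads and a subtraction.

```python
import math

# Hypothetical parameters chosen for illustration only.
SAMPLES_PER_SYMBOL = 8            # carrier samples stored per symbol period
LEVELS = (-3, -1, 1, 3)           # per-axis amplitude levels of 16-QAM (assumed)


def _carrier_table(wave):
    """Precompute level * wave(phase) for every level and sample index."""
    return {
        level: [level * wave(2 * math.pi * k / SAMPLES_PER_SYMBOL)
                for k in range(SAMPLES_PER_SYMBOL)]
        for level in LEVELS
    }


# Software stand-ins for the ROM look-up tables: built once, then read-only.
COS_LUT = _carrier_table(math.cos)
SIN_LUT = _carrier_table(math.sin)


def modulate(symbols):
    """Map (I, Q) level pairs to IF samples using table reads and subtraction."""
    samples = []
    for i_level, q_level in symbols:
        cos_row = COS_LUT[i_level]
        sin_row = SIN_LUT[q_level]
        # s[k] = I*cos(phase_k) - Q*sin(phase_k); both products come from the tables.
        samples.extend(c - s for c, s in zip(cos_row, sin_row))
    return samples


def modulate_direct(symbols):
    """Reference version that multiplies at run time, for comparison only."""
    samples = []
    for i_level, q_level in symbols:
        for k in range(SAMPLES_PER_SYMBOL):
            phase = 2 * math.pi * k / SAMPLES_PER_SYMBOL
            samples.append(i_level * math.cos(phase) - q_level * math.sin(phase))
    return samples
```

Because each symbol's samples depend only on that symbol, several symbols could be looked up side by side, which is the kind of parallel processing the abstract describes.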
Performance Analysis of Parallel Mathematical Subroutine library PARCEL

International Nuclear Information System (INIS)

Yamada, Susumu; Shimizu, Futoshi; Kobayashi, Kenichi; Kaburaki, Hideo; Kishida, Norio

2000-01-01

The parallel mathematical subroutine library PARCEL (Parallel Computing Elements) has been developed by the Japan Atomic Energy Research Institute to make typical parallelized mathematical codes easy to use in application problems on distributed parallel computers. PARCEL includes routines for linear equations, eigenvalue problems, pseudo-random number generation, and fast Fourier transforms. The performance results show that the linear equation routines achieve good parallelization efficiency on vector as well as scalar parallel computers. A comparison of the efficiency results with those of the PETSc (Portable, Extensible Toolkit for Scientific Computation) library is also reported. (author)
Applications of the parallel computing system using network

International Nuclear Information System (INIS)

Ido, Shunji; Hasebe, Hiroki

1994-01-01

Parallel programming is applied to multiple processors connected by Ethernet. Data exchange between the tasks located on each processing element is realized in two ways. One is sockets, a standard library on recent UNIX operating systems. The other is Parallel Virtual Machine (PVM), free network-connection software developed by ORNL that allows many workstations connected to a network to be used as a parallel computer. This paper discusses the availability of parallel computing using a network of UNIX workstations, and compares it with specialized parallel systems (Transputer and iPSC/860) in a Monte Carlo simulation, which generally shows a high parallelization ratio. (author)
Parallel combinations of pre-ionized low jitter spark gaps

International Nuclear Information System (INIS)

Fitzsimmons, W.A.; Rosocha, L.A.

1979-01-01

The properties of 10 to 30 kV four-electrode field emission pre-ionized triggered spark gaps have been studied. A mid-plane off-axis trigger electrode is biased at +V_0/2, and a field emission point is located adjacent to and biased at the grounded cathode potential. Simultaneous application of a rapid -V_0 trigger pulse to both electrodes results in the rapid sequential closing of the anode-trigger and trigger-cathode gaps. The observed jitter is about 1.5 ns. Parallel operation of these gaps (up to 10 so far) connected to a common capacitive load has been studied. A simple theory that predicts the number of gaps that may be expected to operate in parallel is discussed.
Sequential Power-Dependence Theory

NARCIS (Netherlands)

Buskens, Vincent; Rijt, Arnout van de

2008-01-01

Existing methods for predicting resource divisions in laboratory exchange networks do not take into account the sequential nature of the experimental setting. We extend network exchange theory by considering sequential exchange. We prove that Sequential Power-Dependence Theory—unlike
A Comparison Study on Motion/Force Transmissibility of Two Typical 3-DOF Parallel Manipulators: The Sprint Z3 and A3 Tool Heads

Directory of Open Access Journals (Sweden)

Xiang Chen

2014-01-01

This paper presents a comparison study of two important three-degree-of-freedom (DOF) parallel manipulators, the Sprint Z3 head and the A3 head, both commonly used in industry. As an initial step, the inverse kinematics are derived and an analysis of two classes of limbs is carried out via screw theory. For comparison, three transmission indices are then defined to describe their motion/force transmission performance. Based on the same main parameters, the compared results reveal some distinct characteristics in addition to the similarities between the two parallel manipulators. To a certain extent, the A3 head outperforms the common Sprint Z3 head, providing a new and satisfactory option for a machine tool head in industry.
Evaluation of degree of readsorption of radionuclides during sequential extraction in soil: comparison between batch and dynamic extraction systems

DEFF Research Database (Denmark)

Petersen, Roongrat; Hansen, Elo Harald; Hou, Xiaolin

Sequential extraction techniques have been widely used to fractionate metals in solid samples (soils, sediments, solid wastes, etc.) due to their leachability. The results are useful for obtaining information about bioavailability, potential mobility and transport of element in natural environments. However, the techniques have an important problem with redistribution as a result of readsorption of dissolved analytes onto the remaining solids phases during extraction. Many authors have demonstrated the readsorption problem and inaccuracy from it. In our previous work, a dynamic extraction system developed in our laboratory for heavy metal fractionation has shown the reduction of readsorption problem in comparison with the batch techniques. Moreover, the system shows many advantages over the batch system such as speed of extraction, simple procedure, fully automatic, less risk of contamination...
Plane-Based Sampling for Ray Casting Algorithm in Sequential Medical Images

Science.gov (United States)

Lin, Lili; Chen, Shengyong; Shao, Yan; Gu, Zichun

2013-01-01

This paper proposes a plane-based sampling method to improve the traditional Ray Casting Algorithm (RCA) for the fast reconstruction of a three-dimensional biomedical model from sequential images. In the novel method, the optical properties of all sampling points depend on the intersection points when a ray travels through an equidistant parallel plane cluster of the volume dataset. The results show that the method improves the rendering speed by more than three times compared with the conventional algorithm while preserving image quality. PMID:23424608
User's guide of parallel program development environment (PPDE). The 2nd edition

International Nuclear Information System (INIS)

Ueno, Hirokazu; Takemiya, Hiroshi; Imamura, Toshiyuki; Koide, Hiroshi; Matsuda, Katsuyuki; Higuchi, Kenji; Hirayama, Toshio; Ohta, Hirofumi

2000-03-01

The STA basic system has been enhanced to accelerate support for parallel programming on heterogeneous parallel computers, through a series of R and D on the technology of parallel processing. The enhancement has been made by extending the functions of the PPDE, the Parallel Program Development Environment in the STA basic system. The extended PPDE provides: 1) automatic creation of a 'makefile' and a shell script file for its execution, 2) multi-tool execution, which makes the tools on heterogeneous computers execute a task with one operation on a computer, and 3) mirror composition, which reflects the editing results of a file on one computer into all related files on other computers. These additional functions will enhance the work efficiency of program development across several computers. More functions have been added to the PPDE to provide help for parallel program development. New functions were also designed to complement an HPF translator and a parallelizing support tool when working together, so that a sequential program is efficiently converted to a parallel program. This report describes the use of the extended PPDE. (author)
Hybrid parallel computing architecture for multiview phase shifting

Science.gov (United States)

Zhong, Kai; Li, Zhongwei; Zhou, Xiaohui; Shi, Yusheng; Wang, Congjun

2014-11-01

The multiview phase-shifting method shows its powerful capability in achieving high resolution three-dimensional (3-D) shape measurement. Unfortunately, this ability results in very high computation costs and 3-D computations have to be processed offline. To realize real-time 3-D shape measurement, a hybrid parallel computing architecture is proposed for multiview phase shifting. In this architecture, the central processing unit can co-operate with the graphics processing unit (GPU) to achieve hybrid parallel computing. The procedures with high computation cost, including lens distortion rectification, phase computation, correspondence, and 3-D reconstruction, are implemented on the GPU, and a three-layer kernel function model is designed to simultaneously realize coarse-grained and fine-grained parallel computing. Experimental results verify that the developed system can perform real-time 3-D measurement at 50 fps (frames per second) with 260 K 3-D points per frame. A speedup of up to 180 times is obtained for the proposed technique using an NVIDIA GT560Ti graphics card rather than a sequential C implementation on a 3.4 GHz Intel Core i7 3770.
P-HS-SFM: a parallel harmony search algorithm for the reproduction of experimental data in the continuous microscopic crowd dynamic models

Science.gov (United States)

Jaber, Khalid Mohammad; Alia, Osama Moh'd.; Shuaib, Mohammed Mahmod

2018-03-01

Finding the optimal parameters that can reproduce experimental data (such as the velocity-density relation and the specific flow rate) is a very important component of the validation and calibration of microscopic crowd dynamic models. Heavy computational demand during parameter search is a known limitation that exists in a previously developed model known as the Harmony Search-Based Social Force Model (HS-SFM). In this paper, a parallel-based mechanism is proposed to reduce the computational time and memory resource utilisation required to find these parameters. More specifically, two MATLAB-based multicore techniques (parfor and create independent jobs) using shared memory are developed by taking advantage of the multithreading capabilities of parallel computing, resulting in a new framework called the Parallel Harmony Search-Based Social Force Model (P-HS-SFM). The experimental results show that the parfor-based P-HS-SFM achieved a better computational time of about 26 h, an efficiency improvement of approximately 54% and a speedup factor of 2.196 times in comparison with the HS-SFM sequential processor. The performance of the P-HS-SFM using the create independent jobs approach is also comparable to parfor with a computational time of 26.8 h, an efficiency improvement of about 30% and a speedup of 2.137 times.
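The figures quoted above can be related through the usual definitions of speedup and relative time saved. The sketch below assumes those textbook definitions; the abstract states neither the sequential run time nor the number of cores, so the sequential time here is only what the reported numbers imply, and the worker count passed to `efficiency` is purely hypothetical.

```python
def speedup(t_sequential, t_parallel):
    """Classical speedup: sequential run time divided by parallel run time."""
    return t_sequential / t_parallel


def implied_sequential_time(t_parallel, s):
    """Sequential run time implied by a reported parallel time and speedup."""
    return t_parallel * s


def time_reduction(s):
    """Fraction of the sequential run time saved at speedup s, i.e. 1 - 1/s."""
    return 1.0 - 1.0 / s


def efficiency(s, workers):
    """Parallel efficiency: speedup per worker (worker count is an assumption)."""
    return s / workers


# Reported for the parfor variant: about 26 h and a speedup of 2.196.
t_seq = implied_sequential_time(26.0, 2.196)   # roughly 57 h
saved = time_reduction(2.196)                  # roughly 0.54
```

Under these definitions the time saved at a speedup of 2.196 is roughly 54%, in line with the figure quoted for parfor; whether the authors computed their "efficiency improvement" this way is an assumption.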
Introduction of Parallel GPGPU Acceleration Algorithms for the Solution of Radiative Transfer

Science.gov (United States)

Godoy, William F.; Liu, Xu

2011-01-01

General-purpose computing on graphics processing units (GPGPU) is a recent technique that allows the parallel graphics processing unit (GPU) to accelerate calculations performed sequentially by the central processing unit (CPU). To introduce GPGPU to radiative transfer, the Gauss-Seidel solution of the well-known expressions for 1-D and 3-D homogeneous, isotropic media is selected as a test case. Different algorithms are introduced to balance memory and GPU-CPU communication, critical aspects of GPGPU. Results show that speed-ups of one to two orders of magnitude are obtained when compared to sequential solutions. The underlying value of GPGPU is its potential extension in radiative solvers (e.g., Monte Carlo, discrete ordinates) at a minimal learning curve.
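For readers unfamiliar with the Gauss-Seidel iteration named in this abstract, here is a minimal sequential sketch for a generic dense linear system Ax = b. It is not the radiative transfer formulation or the GPU algorithm of the paper; the function name, tolerance and test matrix are illustrative assumptions.

```python
def gauss_seidel(a, b, tol=1e-10, max_iter=10_000):
    """Solve a x = b iteratively; a is a list of rows, b a list of floats.

    Convergence is guaranteed for, e.g., strictly diagonally dominant matrices.
    """
    n = len(b)
    x = [0.0] * n
    for _ in range(max_iter):
        max_change = 0.0
        for i in range(n):
            # Uses entries of x already updated in this sweep (j < i),
            # which is what distinguishes Gauss-Seidel from Jacobi.
            s = sum(a[i][j] * x[j] for j in range(n) if j != i)
            new_xi = (b[i] - s) / a[i][i]
            max_change = max(max_change, abs(new_xi - x[i]))
            x[i] = new_xi
        if max_change < tol:
            return x
    raise RuntimeError("Gauss-Seidel did not converge")


# Small diagonally dominant example: 4x + y = 1, 2x + 3y = 2.
solution = gauss_seidel([[4.0, 1.0], [2.0, 3.0]], [1.0, 2.0])
```

Each update in a sweep depends on values computed earlier in the same sweep, so the inner loop is inherently sequential; this dependency is one reason a GPU version needs a different arrangement of the work.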
A massively parallel algorithm for the collision probability calculations in the Apollo-II code using the PVM library

International Nuclear Information System (INIS)

Stankovski, Z.

1995-01-01

The collision probability method in neutron transport, as applied to 2D geometries, consumes a great amount of computer time: for a typical 2D assembly calculation, about 90% of the computing time is consumed in the collision probability evaluations. Consequently RZ or 3D calculations became prohibitive. In this paper we present a simple but efficient parallel algorithm based on the message passing host/node programming model. Parallelization was applied to the energy group treatment. Such an approach permits parallelization of the existing code, requiring only limited modifications. Sequential/parallel computer portability is preserved, which is a necessary condition for an industrial code. Sequential performances are also preserved. The algorithm is implemented on a CRAY 90 coupled to a 128 processor T3D computer, a 16 processor IBM SP1 and a network of workstations, using the Public Domain PVM library. The tests were executed for a 2D geometry with the standard 99-group library. All results were very satisfactory, the best ones with IBM SP1. Because of heterogeneity of the workstation network, we did not ask high performances for this architecture. The same source code was used for all computers. A more impressive advantage of this algorithm will appear in the calculations of the SAPHYR project (with the future fine multigroup library of about 8000 groups) with a massively parallel computer, using several hundreds of processors. (author). 5 refs., 6 figs., 2 tabs

A massively parallel algorithm for the collision probability calculations in the Apollo-II code using the PVM library

International Nuclear Information System (INIS)

Stankovski, Z.

1995-01-01

The collision probability method in neutron transport, as applied to 2D geometries, consumes a great amount of computer time: for a typical 2D assembly calculation, about 90% of the computing time is consumed in the collision probability evaluations. Consequently RZ or 3D calculations became prohibitive. In this paper the author presents a simple but efficient parallel algorithm based on the message passing host/node programming model. Parallelization was applied to the energy group treatment. Such an approach permits parallelization of the existing code, requiring only limited modifications. Sequential/parallel computer portability is preserved, which is a necessary condition for an industrial code. Sequential performances are also preserved. The algorithm is implemented on a CRAY 90 coupled to a 128 processor T3D computer, a 16 processor IBM SP1 and a network of workstations, using the Public Domain PVM library. The tests were executed for a 2D geometry with the standard 99-group library. All results were very satisfactory, the best ones with IBM SP1. Because of heterogeneity of the workstation network, the author did not ask high performances for this architecture. The same source code was used for all computers. A more impressive advantage of this algorithm will appear in the calculations of the SAPHYR project (with the future fine multigroup library of about 8000 groups) with a massively parallel computer, using several hundreds of processors.
A fast and accurate online sequential learning algorithm for feedforward networks.

Science.gov (United States)

Liang, Nan-Ying; Huang, Guang-Bin; Saratchandran, P; Sundararajan, N

2006-11-01

In this paper, we develop an online sequential learning algorithm for single hidden layer feedforward networks (SLFNs) with additive or radial basis function (RBF) hidden nodes in a unified framework. The algorithm is referred to as online sequential extreme learning machine (OS-ELM) and can learn data one-by-one or chunk-by-chunk (a block of data) with fixed or varying chunk size. The activation functions for additive nodes in OS-ELM can be any bounded nonconstant piecewise continuous functions and the activation functions for RBF nodes can be any integrable piecewise continuous functions. In OS-ELM, the parameters of hidden nodes (the input weights and biases of additive nodes or the centers and impact factors of RBF nodes) are randomly selected and the output weights are analytically determined based on the sequentially arriving data. The algorithm uses the ideas of ELM of Huang et al. developed for batch learning which has been shown to be extremely fast with generalization performance better than other batch training methods. Apart from selecting the number of hidden nodes, no other control parameters have to be manually chosen. Detailed performance comparison of OS-ELM is done with other popular sequential learning algorithms on benchmark problems drawn from the regression, classification and time series prediction areas. The results show that the OS-ELM is faster than the other sequential algorithms and produces better generalization performance.
Objective and Subjective Measures of Simultaneous vs Sequential Bilateral Cochlear Implants in Adults: A Randomized Clinical Trial

NARCIS (Netherlands)

Kraaijenga, Véronique J C; Ramakers, Geerte G J; Smulders, Yvette E; van Zon, Alice; Stegeman, Inge; Smit, Adriana L; Stokroos, Robert J; Hendrice, Nadia; Free, Rolien H; Maat, Bert; Frijns, Johan H M; Briaire, Jeroen J; Mylanus, E A M; Huinck, Wendy J; Van Zanten, Gijsbert A; Grolman, Wilko

IMPORTANCE To date, no randomized clinical trial on the comparison between simultaneous and sequential bilateral cochlear implants (BiCIs) has been performed. OBJECTIVE To investigate the hearing capabilities and the self-reported benefits of simultaneous BiCIs compared with those of sequential
Parallel Algorithms for Graph Optimization using Tree Decompositions

Energy Technology Data Exchange (ETDEWEB)

Sullivan, Blair D [ORNL; Weerapurage, Dinesh P [ORNL; Groer, Christopher S [ORNL

2012-06-01

Although many $\mathcal{NP}$-hard graph optimization problems can be solved in polynomial time on graphs of bounded tree-width, the adoption of these techniques into mainstream scientific computation has been limited due to the high memory requirements of the necessary dynamic programming tables and excessive runtimes of sequential implementations. This work addresses both challenges by proposing a set of new parallel algorithms for all steps of a tree decomposition-based approach to solve the maximum weighted independent set problem. A hybrid OpenMP/MPI implementation includes a highly scalable parallel dynamic programming algorithm leveraging the MADNESS task-based runtime, and computational results demonstrate scaling. This work enables a significant expansion of the scale of graphs on which exact solutions to maximum weighted independent set can be obtained, and forms a framework for solving additional graph optimization problems with similar techniques.
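The dynamic programming idea behind this approach is easiest to see in the simplest bounded tree-width case, where the graph itself is a tree (tree-width 1). The sketch below is a sequential illustration of that special case only, not the paper's parallel algorithm or its tree decomposition tables; the function name and the example graph are assumptions.

```python
def max_weight_independent_set_tree(adj, weight, root=0):
    """Maximum total weight of an independent set in a tree.

    adj maps each vertex to a list of neighbours; weight maps vertex to weight.
    """
    # Iterative DFS so that every child appears after its parent in `order`.
    parent = {root: None}
    order = []
    stack = [root]
    while stack:
        v = stack.pop()
        order.append(v)
        for u in adj[v]:
            if u != parent[v]:
                parent[u] = v
                stack.append(u)

    take = {}   # best weight in the subtree of v when v is in the set
    skip = {}   # best weight in the subtree of v when v is not in the set
    for v in reversed(order):          # children are processed before parents
        take[v] = weight[v]
        skip[v] = 0
        for u in adj[v]:
            if u != parent[v]:
                take[v] += skip[u]                 # children of a chosen vertex must be skipped
                skip[v] += max(take[u], skip[u])   # otherwise each child is free to choose
    return max(take[root], skip[root])


# Path 0-1-2-3 with weights 3, 2, 5, 4: the best choice is {0, 2}.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
best = max_weight_independent_set_tree(path, {0: 3, 1: 2, 2: 5, 3: 4})
```

For general graphs of bounded tree-width, the two values per vertex become a table per bag of the tree decomposition, with one entry for each independent subset of the bag; the size of these tables is the memory cost the abstract refers to.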
A Globally Convergent Parallel SSLE Algorithm for Inequality Constrained Optimization

Directory of Open Access Journals (Sweden)

Zhijun Luo

2014-01-01

A new parallel variable distribution algorithm, based on an interior-point sequential system of linear equations (SSLE) algorithm, is proposed for solving inequality-constrained optimization problems whose constraints are block-separable. Each iteration of this algorithm only needs to solve three systems of linear equations with the same coefficient matrix to obtain the descent direction. Furthermore, under certain conditions, global convergence is achieved.
Competence and Praxis: Sequential Analysis in German Sociology

Directory of Open Access Journals (Sweden)

Kai-Olaf Maiwald

2005-09-01

In German social research nowadays most qualitative methodologies employ sequential analysis. This article explores the similarities and differences in conceptualising and practising this method. First, the working consensus, conceived as a shared set of methodological assumptions, is explicated. Second, with regard to three major paradigms of qualitative research in Germany (conversation analysis, objective hermeneutics, and hermeneutic sociology of knowledge), the different ways of doing sequential analysis are investigated to locate the points of departure from a working consensus. It is argued that differences arise from different case-perspectives and, relative to that, from different modes of introducing general knowledge, i.e. knowledge that is not specific for the analysed case, into the interpretation. An important notion to emerge from the comparison is the distinction between competence and praxis. URN: urn:nbn:de:0114-fqs0503310
Three-dimensional classical-ensemble modeling of non-sequential double ionization

International Nuclear Information System (INIS)

Haan, S.L.; Breen, L.; Tannor, D.; Panfili, R.; Ho, Phay J.; Eberly, J.H.

2005-01-01

We have been using 1d ensembles of classical two-electron atoms to simulate helium atoms that are exposed to pulses of intense laser radiation. In this talk we discuss the challenges in setting up a 3d classical ensemble that can mimic the quantum ground state of helium. We then report studies in which each one of 500,000 two-electron trajectories is followed in 3d through a ten-cycle (25 fs) 780 nm laser pulse. We examine double-ionization yield for various intensities, finding the familiar knee structure. We consider the momentum spread of outcoming electrons in directions both parallel and perpendicular to the direction of laser polarization, and find results that are consistent with experiment. We examine individual trajectories and recollision processes that lead to double ionization, considering the best phases of the laser cycle for recollision events and looking at the possible time delay between recollision and emergence. We consider also the number of recollision events, and find that multiple recollisions are common in the classical ensemble. We investigate which collisional processes lead to various final electron momenta. We conclude with comments regarding the ability of classical mechanics to describe non-sequential double ionization, and a quick summary of similarities and differences between 1d and 3d classical double ionization using energy-trajectory comparisons. Refs. 3 (author)
Parallel R-matrix computation

International Nuclear Information System (INIS)

Heggarty, J.W.

1999-06-01

For almost thirty years, sequential R-matrix computation has been used by atomic physics research groups, from around the world, to model collision phenomena involving the scattering of electrons or positrons with atomic or molecular targets. As considerable progress has been made in the understanding of fundamental scattering processes, new data, obtained from more complex calculations, is of current interest to experimentalists. Performing such calculations, however, places considerable demands on the computational resources to be provided by the target machine, in terms of both processor speed and memory requirement. Indeed, in some instances the computational requirements are so great that the proposed R-matrix calculations are intractable, even when utilising contemporary classic supercomputers. Historically, increases in the computational requirements of R-matrix computation were accommodated by porting the problem codes to a more powerful classic supercomputer. Although this approach has been successful in the past, it is no longer considered to be a satisfactory solution due to the limitations of current (and future) Von Neumann machines. As a consequence, there has been considerable interest in the high performance multicomputers, that have emerged over the last decade which appear to offer the computational resources required by contemporary R-matrix research. Unfortunately, developing codes for these machines is not as simple a task as it was to develop codes for successive classic supercomputers. The difficulty arises from the considerable differences in the computing models that exist between the two types of machine and results in the programming of multicomputers to be widely acknowledged as a difficult, time consuming and error-prone task. Nevertheless, unless parallel R-matrix computation is realised, important theoretical and experimental atomic physics research will continue to be hindered. This thesis describes work that was undertaken in
Intra-individual diagnostic image quality and organ-specific-radiation dose comparison between spiral cCT with iterative image reconstruction and z-axis automated tube current modulation and sequential cCT

International Nuclear Information System (INIS)

Wenz, Holger; Maros, Máté E.; Meyer, Mathias; Gawlitza, Joshua; Förster, Alex; Haubenreisser, Holger; Kurth, Stefan; Schoenberg, Stefan O.; Groden, Christoph; Henzler, Thomas

2016-01-01

• Superiority of spiral versus sequential cCT in image quality and organ-specific radiation dose. • Spiral cCT: lower organ-specific radiation dose in the eye lens compared to tilted sequential cCT. • State-of-the-art IR spiral cCT techniques have significant advantages over sequential cCT techniques. To prospectively evaluate image quality and organ-specific radiation dose of spiral cranial CT (cCT) combined with automated tube current modulation (ATCM) and iterative image reconstruction (IR) in comparison to sequential tilted cCT reconstructed with filtered back projection (FBP) without ATCM. 31 patients with a previously performed tilted non-contrast enhanced sequential cCT acquisition on a 4-slice CT system with only FBP reconstruction and no ATCM were prospectively enrolled in this study for a clinically indicated cCT scan. All spiral cCT examinations were performed on a 3rd generation dual-source CT system using ATCM in z-axis direction. Images were reconstructed using both FBP and IR (level 1–5). A Monte-Carlo-simulation-based analysis was used to compare organ-specific radiation dose. Subjective image quality for various anatomic structures was evaluated using a 4-point Likert scale and objective image quality was evaluated by comparing signal-to-noise ratios (SNR). Spiral cCT led to a significantly lower (p < 0.05) organ-specific radiation dose in all targets including the eye lens. Subjective image quality of spiral cCT datasets with an IR reconstruction level 5 was rated significantly higher compared to the sequential cCT acquisitions (p < 0.0001). Consecutive mean SNR was significantly higher in all spiral datasets (FBP, IR 1–5) when compared to sequential cCT with a mean
In vivo comparison of simultaneous versus sequential injection technique for thermochemical ablation in a porcine model.

Science.gov (United States)

Cressman, Erik N K; Shenoi, Mithun M; Edelman, Theresa L; Geeslin, Matthew G; Hennings, Leah J; Zhang, Yan; Iaizzo, Paul A; Bischof, John C

2012-01-01

To investigate simultaneous and sequential injection thermochemical ablation in a porcine model, and compare them to sham and acid-only ablation. This IACUC-approved study involved 11 pigs in an acute setting. Ultrasound was used to guide placement of a thermocouple probe and coaxial device designed for thermochemical ablation. Solutions of 10 M acetic acid and NaOH were used in the study. Four injections per pig were performed in identical order at a total rate of 4 mL/min: saline sham, simultaneous, sequential, and acid only. Volume and sphericity of zones of coagulation were measured. Fixed specimens were examined by H&E stain. Average coagulation volumes were 11.2 mL (simultaneous), 19.0 mL (sequential) and 4.4 mL (acid). The highest temperature, 81.3°C, was obtained with simultaneous injection. Average temperatures were 61.1°C (simultaneous), 47.7°C (sequential) and 39.5°C (acid only). Sphericity coefficients (0.83-0.89) had no statistically significant difference among conditions. Thermochemical ablation produced substantial volumes of coagulated tissues relative to the amounts of reagents injected, considerably greater than acid alone in either technique employed. The largest volumes were obtained with sequential injection, yet this came at a price in one case of cardiac arrest. Simultaneous injection yielded the highest recorded temperatures and may be tolerated as well as or better than acid injection alone. Although this pilot study did not show a clear advantage for either sequential or simultaneous methods, the results indicate that thermochemical ablation is attractive for further investigation with regard to both safety and efficacy.
Implementations of BLAST for parallel computers.

Science.gov (United States)

Jülich, A

1995-02-01

The BLAST sequence comparison programs have been ported to a variety of parallel computers: the shared-memory machine Cray Y-MP 8/864 and the distributed-memory architectures Intel iPSC/860 and nCUBE. Additionally, the programs were ported to run on workstation clusters. We explain the parallelization techniques and consider the pros and cons of these methods. The BLAST programs are very well suited for parallelization for a moderate number of processors. We illustrate our results using the program blastp as an example. As input data for blastp, a 799 residue protein query sequence and the protein database PIR were used.
Sequential and Parallel Algorithms for Finding a Maximum Convex Polygon

DEFF Research Database (Denmark)

Fischer, Paul

1997-01-01

This paper investigates the problem where one is given a finite set of n points in the plane each of which is labeled either "positive" or "negative". We consider bounded convex polygons, the vertices of which are positive points and which do not contain any negative point. It is shown how such a polygon which is maximal with respect to area can be found in time O(n³ log n). With the same running time one can also find such a polygon which contains a maximum number of positive points. If, in addition, the number of vertices of the polygon is restricted to be at most M, then the running time becomes O(M n³ log n). It is also shown how to find a maximum convex polygon which contains a given point in time O(n³ log n). Two parallel algorithms for the basic problem are also presented. The first one runs in time O(n log n) using O(n²) processors, the second one has polylogarithmic time but needs O...
Comparison of some parallelization strategies of thermalhydraulic codes on GPUs

International Nuclear Information System (INIS)

Jendoubi, T.; Bergeaud, V.; Geay, A.

2013-01-01

Modern supercomputer architecture is now often based on hybrid concepts combining distributed-memory parallelism, shared-memory parallelism and GPUs (Graphics Processing Units). In this work, we propose a new approach to take advantage of these graphics cards in thermohydraulics algorithms. (authors)
User's guide of parallel program development environment (PPDE). The 2nd edition

Energy Technology Data Exchange (ETDEWEB)

Ueno, Hirokazu; Takemiya, Hiroshi; Imamura, Toshiyuki; Koide, Hiroshi; Matsuda, Katsuyuki; Higuchi, Kenji; Hirayama, Toshio [Center for Promotion of Computational Science and Engineering, Japan Atomic Energy Research Institute, Tokyo (Japan); Ohta, Hirofumi [Hitachi Ltd., Tokyo (Japan)

2000-03-01

The STA basic system has been enhanced to accelerate support for parallel programming on heterogeneous parallel computers, through a series of R&D efforts on the technology of parallel processing. The enhancement has been made by extending the function of the PPDE, the Parallel Program Development Environment in the STA basic system. The extended PPDE provides: 1) automatic creation of a 'makefile' and a shell script file for its execution, 2) multi-tool execution, which lets the tools on heterogeneous computers execute a task on a computer with one operation, and 3) mirror composition, which reflects the editing results of a file on one computer into all related files on other computers. These additional functions will enhance the work efficiency for program development on some computers. More functions have been added to the PPDE to provide help for parallel program development. New functions were also designed to complement an HPF translator and a parallelizing support tool when working together so that a sequential program is efficiently converted to a parallel program. This report describes the use of the extended PPDE. (author)
Hybrid shared/distributed parallelism for 3D characteristics transport solvers

International Nuclear Information System (INIS)

Dahmani, M.; Roy, R.

2005-01-01

In this paper, we present a new hybrid parallel model for solving large-scale 3-dimensional neutron transport problems used in nuclear reactor simulations. Large heterogeneous reactor problems, like the ones that occur when simulating CANDU cores, have remained computationally intensive and impractical for routine applications on single-node or even vector computers. Based on the characteristics method, this new model is designed to solve the transport equation after distributing the calculation load on a network of shared memory multi-processors. The tracks are either generated on the fly at each characteristics sweep or stored in sequential files. Load balancing is taken into account by estimating the calculation load of tracks and by distributing batches of uniform load on each node of the network. Moreover, the communication overhead can be predicted after benchmarking the latency and bandwidth using an appropriate network test suite. These models are useful for predicting the performance of the parallel applications and for analyzing the scalability of the parallel systems. (authors)
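The load-balancing and communication-cost ideas in the abstract above can be illustrated with a minimal Python sketch. The abstract does not say how batches are formed, so the greedy "largest cost first" heuristic used here is an assumption, as is the linear latency-plus-size-over-bandwidth message model; the names `balance_tracks` and `message_time` are hypothetical and are not taken from the paper.

```python
import heapq


def balance_tracks(track_costs, n_nodes):
    """Assign tracks to nodes so that per-node loads are roughly uniform.

    Assumed heuristic (not from the paper): visit tracks from most to least
    expensive and give each one to the node with the smallest load so far.
    """
    heap = [(0.0, node) for node in range(n_nodes)]  # (accumulated load, node)
    heapq.heapify(heap)
    batches = [[] for _ in range(n_nodes)]
    order = sorted(range(len(track_costs)),
                   key=lambda i: track_costs[i], reverse=True)
    for i in order:
        load, node = heapq.heappop(heap)  # least-loaded node
        batches[node].append(i)
        heapq.heappush(heap, (load + track_costs[i], node))
    loads = [sum(track_costs[i] for i in batch) for batch in batches]
    return batches, loads


def message_time(n_bytes, latency_s, bandwidth_bytes_per_s):
    """Assumed linear cost model: fixed latency plus transfer time."""
    return latency_s + n_bytes / bandwidth_bytes_per_s


# Estimated costs of six tracks, split over two nodes.
batches, loads = balance_tracks([5, 3, 3, 2, 2, 1], n_nodes=2)
```

In practice the latency and bandwidth arguments would come from a network benchmark, as the abstract describes; the values used for testing here are arbitrary.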
(Nearly) portable PIC code for parallel computers

International Nuclear Information System (INIS)

Decyk, V.K.

1993-01-01

As part of the Numerical Tokamak Project, the author has developed a (nearly) portable, one dimensional version of the GCPIC algorithm for particle-in-cell codes on parallel computers. This algorithm uses a spatial domain decomposition for the fields, and passes particles from one domain to another as the particles move spatially. With only minor changes, the code has been run in parallel on the Intel Delta, the Cray C-90, the IBM ES/9000 and a cluster of workstations. After a line by line translation into cmfortran, the code was also run on the CM-200. Impressive speeds have been achieved, both on the Intel Delta and the Cray C-90, around 30 nanoseconds per particle per time step. In addition, the author was able to isolate the data management modules, so that the physics modules were not changed much from their sequential version, and the data management modules can be used as "black boxes."
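The particle-passing step of a one-dimensional spatial domain decomposition, as described in the abstract above, might look roughly like the following Python sketch. It emulates the domains as lists inside a single process, assumes equal-width domains on a periodic line, and omits the field solve entirely; `owner` and `push_and_exchange` are hypothetical names, and this is not the author's GCPIC code.

```python
def owner(x, length, n_domains):
    """Index of the domain that owns position x on a periodic line [0, length)."""
    return int((x % length) / length * n_domains) % n_domains


def push_and_exchange(domains, length, dt):
    """Advance every particle, then hand it to the domain that now owns it.

    `domains` is a list of particle lists; each particle is (position, velocity).
    """
    n = len(domains)
    outgoing = [[] for _ in range(n)]  # particles leaving for each destination
    for d, particles in enumerate(domains):
        kept = []
        for x, v in particles:
            x_new = (x + v * dt) % length  # periodic boundary (an assumption)
            dest = owner(x_new, length, n)
            (kept if dest == d else outgoing[dest]).append((x_new, v))
        domains[d] = kept
    for d in range(n):
        domains[d].extend(outgoing[d])  # "receive" the migrated particles
    return domains


# Four domains of width 1.0; three particles, all of which change domain.
domains = [[(0.5, 1.0)], [(1.5, -1.0)], [], [(3.5, 1.0)]]
domains = push_and_exchange(domains, length=4.0, dt=1.0)
```

On a real distributed-memory machine the `outgoing` lists would be sent as messages between processors; keeping that exchange in one routine is in the spirit of the "black box" data management modules the abstract mentions.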
Parallelized preconditioned BiCGStab solution of sparse linear system equations in F-COBRA-TF

International Nuclear Information System (INIS)

Geemert, Rene van; Glück, Markus; Riedmann, Michael; Gabriel, Harry

2011-01-01

Recently, the in-house development of a preconditioned and parallelized BiCGStab solver has been pursued successfully in AREVA’s advanced sub-channel code F-COBRA-TF. This solver can be run either in a sequential computation mode on a single CPU, or in a parallel computation mode on multiple parallel CPUs. The developed procedure enables the computation of several thousands of successive sparse linear system solutions in F-COBRA-TF with acceptable wall clock run times. The current paper provides general information about F-COBRA-TF in terms of modeling capabilities and application areas, and points out where the relevance arises for the efficient iterative solution of sparse linear systems. Furthermore, the preconditioning and parallelization strategies in the developed BiCGStab iterative solution approach are discussed. The paper is concluded with a number of verification examples. (author)
Porting Gravitational Wave Signal Extraction to Parallel Virtual Machine (PVM)

Science.gov (United States)

Thirumalainambi, Rajkumar; Thompson, David E.; Redmon, Jeffery

2009-01-01

Laser Interferometer Space Antenna (LISA) is a planned NASA-ESA mission to be launched around 2012. The Gravitational Wave detection is fundamentally the determination of frequency, source parameters, and waveform amplitude derived in a specific order from the interferometric time-series of the rotating LISA spacecraft. The LISA Science Team has developed a Mock LISA Data Challenge intended to promote the testing of complicated nested search algorithms to detect the 100-1 millihertz frequency signals at amplitudes of 10E-21. However, it has become clear that sequential search of the parameters is very time consuming and ultra-sensitive; hence, a new strategy has been developed. Parallelization of existing sequential search algorithms of Gravitational Wave signal identification consists of decomposing sequential search loops, beginning with outermost loops and working inward. In this process, the main challenge is to detect interdependencies among loops and to partition the loops so as to preserve concurrency. Existing parallel programs are based upon either shared memory or distributed memory paradigms. In PVM, master and node programs are used to execute parallelization and process spawning. The PVM can handle process management and process addressing schemes using a virtual machine configuration. The task scheduling and the messaging and signaling can be implemented efficiently for the LISA Gravitational Wave search process using a master and 6 nodes. This approach is accomplished using a server that is available at NASA Ames Research Center, and has been dedicated to the LISA Data Challenge Competition. Historically, gravitational wave and source identification parameters have taken around 7 days in this dedicated single thread Linux based server. Using PVM approach, the parameter extraction problem can be reduced to within a day. The low frequency computation and a proxy signal-to-noise ratio are calculated in separate nodes that are controlled by the master
Efficient Parallel Kernel Solvers for Computational Fluid Dynamics Applications

Science.gov (United States)

Sun, Xian-He

1997-01-01

Distributed-memory parallel computers dominate today's parallel computing arena. These machines, such as Intel Paragon, IBM SP2, and Cray Origin2000, have successfully delivered high performance computing power for solving some of the so-called "grand-challenge" problems. Despite initial success, parallel machines have not been widely accepted in production engineering environments due to the complexity of parallel programming. On a parallel computing system, a task has to be partitioned and distributed appropriately among processors to reduce communication cost and to attain load balance. More importantly, even with careful partitioning and mapping, the performance of an algorithm may still be unsatisfactory, since conventional sequential algorithms may be serial in nature and may not be implemented efficiently on parallel machines. In many cases, new algorithms have to be introduced to increase parallel performance. In order to achieve optimal performance, in addition to partitioning and mapping, a careful performance study should be conducted for a given application to find a good algorithm-machine combination. This process, however, is usually painful and elusive. The goal of this project is to design and develop efficient parallel algorithms for highly accurate Computational Fluid Dynamics (CFD) simulations and other engineering applications. The work plan is 1) developing highly accurate parallel numerical algorithms, 2) conducting preliminary testing to verify the effectiveness and potential of these algorithms, 3) incorporating newly developed algorithms into actual simulation packages. The work plan has been achieved. Two highly accurate, efficient Poisson solvers have been developed and tested based on two different approaches: (1) Adopting a mathematical geometry which has a better capacity to describe the fluid, (2) Using compact scheme to gain high order accuracy in numerical discretization. The previously developed Parallel Diagonal Dominant (PDD) algorithm
Graphics Processing Unit Enhanced Parallel Document Flocking Clustering

Energy Technology Data Exchange (ETDEWEB)

Cui, Xiaohui [ORNL; Potok, Thomas E [ORNL; ST Charles, Jesse Lee [ORNL

2010-01-01

Analyzing and clustering documents is a complex problem. One explored method of solving this problem borrows from nature, imitating the flocking behavior of birds. One limitation of this method of document clustering is its O(n²) complexity. As the number of documents grows, it becomes increasingly difficult to generate results in a reasonable amount of time. In the last few years, the graphics processing unit (GPU) has received attention for its ability to solve highly-parallel and semi-parallel problems much faster than the traditional sequential processor. In this paper, we have conducted research to exploit this architecture and apply its strengths to the flocking based document clustering problem. Using the CUDA platform from NVIDIA, we developed a document flocking implementation to be run on the NVIDIA GEFORCE GPU. Performance gains ranged from thirty-six to nearly sixty times improvement of the GPU over the CPU implementation.

BLAST in Grid (BiG): A Grid-Enabled Software Architecture and Implementation of Parallel and Sequential BLAST

International Nuclear Information System (INIS)

Aparicio, G.; Blanquer, I.; Hernandez, V.; Segrelles, D.

2007-01-01

The integration of high-performance computing tools is a key issue in biomedical research. Many computer-based applications, such as BLAST, have been migrated to high-performance computers to deal with their computing and storage needs. However, the use of clusters and computing farms presents problems in scalability. The use of a higher layer of parallelism that splits the task into highly independent long jobs that can be executed in parallel can improve the performance while maintaining the efficiency. Grid technologies combined with parallel computing resources are an important enabling technology. This work presents a software architecture for executing BLAST in an International Grid Infrastructure that guarantees security, scalability and fault tolerance. The software architecture is modular and adaptable to many other high-throughput applications, both inside the field of bio computing and outside. (Author)
Accuracy of respiratory motion measurement of 4D-MRI: A comparison between cine and sequential acquisition.

Science.gov (United States)

Liu, Yilin; Yin, Fang-Fang; Rhee, DongJoo; Cai, Jing

2016-01-01

The authors have recently developed a cine-mode T2*/T1-weighted 4D-MRI technique and a sequential-mode T2-weighted 4D-MRI technique for imaging respiratory motion. This study aims at investigating which 4D-MRI image acquisition mode, cine or sequential, provides more accurate measurement of organ motion during respiration. A 4D digital extended cardiac-torso (XCAT) human phantom with a hypothesized tumor was used to simulate the image acquisition and the 4D-MRI reconstruction. The respiratory motion was controlled by the given breathing signal profiles. The tumor was manipulated to move continuously with the surrounding tissue. The motion trajectories were measured from both sequential- and cine-mode 4D-MRI images. The measured trajectories were compared with the average trajectory calculated from the input profiles, which was used as the reference. The error in 4D-MRI tumor motion trajectory (E) was determined. In addition, the corresponding respiratory motion amplitudes of all the selected 2D images for 4D reconstruction were recorded. Each amplitude was compared with the amplitude of its associated bin on the average breathing curve. The mean differences from the average breathing curve across all slice positions (D) were calculated. A total of 500 simulated respiratory profiles with a wide range of irregularity (Ir) were used to investigate the relationship between D and Ir. Furthermore, statistical analysis of E and D using XCAT controlled by 20 cancer patients' breathing profiles was conducted. A Wilcoxon signed-rank test was conducted to compare the two modes. D increased faster for cine-mode (D = 1.17 × Ir + 0.23) than sequential-mode (D = 0.47 × Ir + 0.23) as irregularity increased. For the XCAT study using 20 cancer patients' breathing profiles, the median E values were significantly different: 0.12 and 0.10 cm for cine- and sequential-modes, respectively, with a p-value of 0.02. The median D values were significantly different: 0.47 and 0.24 cm for cine
PaFlexPepDock: parallel ab-initio docking of peptides onto their receptors with full flexibility based on Rosetta.

Science.gov (United States)

Li, Haiou; Lu, Liyao; Chen, Rong; Quan, Lijun; Xia, Xiaoyan; Lü, Qiang

2014-01-01

Structural information related to protein-peptide complexes can be very useful for novel drug discovery and design. The computational docking of protein and peptide can supplement the structural information on protein-peptide interactions obtained experimentally. Protein-peptide docking in this paper can be described as three processes that occur in parallel: ab-initio peptide folding, peptide docking with its receptor, and refinement of some flexible areas of the receptor as the peptide is approaching. Several existing methods have been used to sample the degrees of freedom in the three processes, which are usually triggered in an organized sequential scheme. In this paper, we propose a parallel approach that combines all three processes during the docking of a folding peptide with a flexible receptor. This approach mimics the actual protein-peptide docking process in a parallel way, and is expected to deliver better performance than sequential approaches. We used 22 unbound protein-peptide docking examples to evaluate our method. Our analysis of the results showed that the explicit refinement of the flexible areas of the receptor facilitated more accurate modeling of the interfaces of the complexes, while combining all of the moves in parallel helped construct energy funnels for predictions.
PaFlexPepDock: parallel ab-initio docking of peptides onto their receptors with full flexibility based on Rosetta.

Directory of Open Access Journals (Sweden)

Haiou Li

Full Text Available Structural information related to protein-peptide complexes can be very useful for novel drug discovery and design. The computational docking of protein and peptide can supplement the structural information on protein-peptide interactions obtained experimentally. Protein-peptide docking in this paper can be described as three processes that occur in parallel: ab-initio peptide folding, peptide docking with its receptor, and refinement of some flexible areas of the receptor as the peptide is approaching. Several existing methods have been used to sample the degrees of freedom in the three processes, which are usually triggered in an organized sequential scheme. In this paper, we propose a parallel approach that combines all three processes during the docking of a folding peptide with a flexible receptor. This approach mimics the actual protein-peptide docking process in a parallel way, and is expected to deliver better performance than sequential approaches. We used 22 unbound protein-peptide docking examples to evaluate our method. Our analysis of the results showed that the explicit refinement of the flexible areas of the receptor facilitated more accurate modeling of the interfaces of the complexes, while combining all of the moves in parallel helped construct energy funnels for predictions.
Just-in-Time Compilation-Inspired Methodology for Parallelization of Compute Intensive Java Code

Directory of Open Access Journals (Sweden)

GHULAM MUSTAFA

2017-01-01

Full Text Available Compute intensive programs generally consume a significant fraction of execution time in a small amount of repetitive code. Such repetitive code is commonly known as hotspot code. We observed that compute intensive hotspots often possess exploitable loop level parallelism. A JIT (Just-in-Time) compiler profiles a running program to identify its hotspots. Hotspots are then translated into native code, for efficient execution. Using a similar approach, we propose a methodology to identify hotspots and exploit their parallelization potential on multicore systems. The proposed methodology selects and parallelizes each DOALL loop that is either contained in a hotspot method or calls a hotspot method. The methodology could be integrated in the front-end of a JIT compiler to parallelize sequential code, just before native translation. However, compilation to native code is out of scope of this work. As a case study, we analyze eighteen JGF (Java Grande Forum) benchmarks to determine the parallelization potential of hotspots. Eight benchmarks demonstrate a speedup of up to 7.6x on an 8-core system.
GRAPES: a software for parallel searching on biological graphs targeting multi-core architectures.

Directory of Open Access Journals (Sweden)

Rosalba Giugno

Full Text Available Biological applications, from genomics to ecology, deal with graphs that represent the structure of interactions. Analyzing such data requires searching for subgraphs in collections of graphs. This task is computationally expensive. Even though multicore architectures, from commodity computers to more advanced symmetric multiprocessing (SMP), offer scalable computing power, currently published software implementations for indexing and graph matching are fundamentally sequential. As a consequence, such software implementations (i) do not fully exploit available parallel computing power and (ii) do not scale with respect to the size of graphs in the database. We present GRAPES, software for parallel searching on databases of large biological graphs. GRAPES implements a parallel version of well-established graph searching algorithms, and introduces new strategies which naturally lead to a faster parallel searching system especially for large graphs. GRAPES decomposes graphs into subcomponents that can be efficiently searched in parallel. We show the performance of GRAPES on representative biological datasets containing antiviral chemical compounds, DNA, RNA, proteins, protein contact maps and protein interactions networks.
Just-in-time compilation-inspired methodology for parallelization of compute intensive java code

International Nuclear Information System (INIS)

Mustafa, G.; Ghani, M.U.

2017-01-01

Compute intensive programs generally consume a significant fraction of execution time in a small amount of repetitive code. Such repetitive code is commonly known as hotspot code. We observed that compute intensive hotspots often possess exploitable loop level parallelism. A JIT (Just-in-Time) compiler profiles a running program to identify its hotspots. Hotspots are then translated into native code, for efficient execution. Using a similar approach, we propose a methodology to identify hotspots and exploit their parallelization potential on multicore systems. The proposed methodology selects and parallelizes each DOALL loop that is either contained in a hotspot method or calls a hotspot method. The methodology could be integrated in the front-end of a JIT compiler to parallelize sequential code, just before native translation. However, compilation to native code is out of scope of this work. As a case study, we analyze eighteen JGF (Java Grande Forum) benchmarks to determine the parallelization potential of hotspots. Eight benchmarks demonstrate a speedup of up to 7.6x on an 8-core system. (author)
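A DOALL loop, as referred to in the abstract above, is a loop whose iterations do not depend on one another and can therefore run in any order or concurrently. The following Python sketch shows only that structure; the paper targets Java and a JIT-style front-end, which this does not reproduce. `hotspot` and `doall` are hypothetical names, and because CPython threads share a global interpreter lock, this version illustrates the shape of the transformation rather than the speedups reported.

```python
from concurrent.futures import ThreadPoolExecutor


def hotspot(i):
    """Stand-in for a compute-intensive loop body; depends only on i."""
    return sum(k * k for k in range(i))


def doall(body, n_iterations, n_workers=8):
    """Run a loop with mutually independent iterations across worker threads."""
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # map() returns results in iteration order, like the sequential loop.
        return list(pool.map(body, range(n_iterations)))


sequential = [hotspot(i) for i in range(100)]  # original sequential loop
parallel = doall(hotspot, 100)                 # same iterations, run as a DOALL
```

The check that matters for a DOALL transformation is that both versions produce identical results, which holds here because no iteration reads or writes state used by another.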
Development of a flow method for the determination of phosphate in estuarine and freshwaters-Comparison of flow cells in spectrophotometric sequential injection analysis

International Nuclear Information System (INIS)

Mesquita, Raquel B.R.; Ferreira, M. Teresa S.O.B.; Toth, Ildiko V.; Bordalo, Adriano A.; McKelvie, Ian D.; Rangel, Antonio O.S.S.

2011-01-01

Highlights: → Sequential injection determination of phosphate in estuarine and freshwaters. → Alternative spectrophotometric flow cells are compared. → Minimization of schlieren effect was assessed. → Proposed method can cope with wide salinity ranges. → Multi-reflective cell shows clear advantages. - Abstract: A sequential injection system with dual analytical line was developed and applied in the comparison of two different detection systems, viz. a conventional spectrophotometer with a commercial flow cell, and a multi-reflective flow cell coupled with a photometric detector under the same experimental conditions. The study was based on the spectrophotometric determination of phosphate using the molybdenum-blue chemistry. The two alternative flow cells were compared in terms of their response to variation of sample salinity, susceptibility to interferences and to refractive index changes. The developed method was applied to the determination of phosphate in natural waters (estuarine, river, well and ground waters). The achieved detection limit (0.007 μM PO₄³⁻) is consistent with the requirement of the target water samples, and a wide quantification range (0.024-9.5 μM) was achieved using both detection systems.
Development of a flow method for the determination of phosphate in estuarine and freshwaters-Comparison of flow cells in spectrophotometric sequential injection analysis

Energy Technology Data Exchange (ETDEWEB)

Mesquita, Raquel B.R. [CBQF/Escola Superior de Biotecnologia, Universidade Catolica Portuguesa, R. Dr. Antonio Bernardino de Almeida, 4200-072 Porto (Portugal); Laboratory of Hydrobiology, Institute of Biomedical Sciences Abel Salazar (ICBAS) and Institute of Marine Research (CIIMAR), Universidade do Porto, Lg. Abel Salazar 2, 4099-003 Porto (Portugal); Ferreira, M. Teresa S.O.B. [CBQF/Escola Superior de Biotecnologia, Universidade Catolica Portuguesa, R. Dr. Antonio Bernardino de Almeida, 4200-072 Porto (Portugal); Toth, Ildiko V. [REQUIMTE, Departamento de Quimica, Faculdade de Farmacia, Universidade de Porto, Rua Anibal Cunha, 164, 4050-047 Porto (Portugal); Bordalo, Adriano A. [Laboratory of Hydrobiology, Institute of Biomedical Sciences Abel Salazar (ICBAS) and Institute of Marine Research (CIIMAR), Universidade do Porto, Lg. Abel Salazar 2, 4099-003 Porto (Portugal); McKelvie, Ian D. [School of Chemistry, University of Melbourne, Victoria 3010 (Australia); Rangel, Antonio O.S.S., E-mail: aorangel@esb.ucp.pt [CBQF/Escola Superior de Biotecnologia, Universidade Catolica Portuguesa, R. Dr. Antonio Bernardino de Almeida, 4200-072 Porto (Portugal)

2011-09-02

Highlights: → Sequential injection determination of phosphate in estuarine and freshwaters. → Alternative spectrophotometric flow cells are compared. → Minimization of schlieren effect was assessed. → Proposed method can cope with wide salinity ranges. → Multi-reflective cell shows clear advantages. - Abstract: A sequential injection system with dual analytical line was developed and applied in the comparison of two different detection systems, viz. a conventional spectrophotometer with a commercial flow cell, and a multi-reflective flow cell coupled with a photometric detector under the same experimental conditions. The study was based on the spectrophotometric determination of phosphate using the molybdenum-blue chemistry. The two alternative flow cells were compared in terms of their response to variation of sample salinity, susceptibility to interferences and to refractive index changes. The developed method was applied to the determination of phosphate in natural waters (estuarine, river, well and ground waters). The achieved detection limit (0.007 μM PO₄³⁻) is consistent with the requirement of the target water samples, and a wide quantification range (0.024-9.5 μM) was achieved using both detection systems.
A Parallel Sweeping Preconditioner for Heterogeneous 3D Helmholtz Equations

KAUST Repository

Poulson, Jack

2013-05-02

A parallelization of a sweeping preconditioner for three-dimensional Helmholtz equations without large cavities is introduced and benchmarked for several challenging velocity models. The setup and application costs of the sequential preconditioner are shown to be $O(\gamma^2 N^{4/3})$ and $O(\gamma N \log N)$, where $\gamma(\omega)$ denotes the modestly frequency-dependent number of grid points per perfectly matched layer. Several computational and memory improvements are introduced relative to using black-box sparse-direct solvers for the auxiliary problems, and competitive runtimes and iteration counts are reported for high-frequency problems distributed over thousands of cores. Two open-source packages are released along with this paper: Parallel Sweeping Preconditioner (PSP) and the underlying distributed multifrontal solver, Clique. © 2013 Society for Industrial and Applied Mathematics.
OpenMP parallelization of a gridded SWAT (SWATG)

Science.gov (United States)

Zhang, Ying; Hou, Jinliang; Cao, Yongpan; Gu, Juan; Huang, Chunlin

2017-12-01

Large-scale, long-term and high spatial resolution simulation is a common issue in environmental modeling. A Gridded Hydrologic Response Unit (HRU)-based Soil and Water Assessment Tool (SWATG) that integrates a grid modeling scheme with different spatial representations also presents such problems. The long run time limits applications of very high resolution, large-scale watershed modeling. The OpenMP (Open Multi-Processing) parallel application interface is integrated with SWATG (called SWATGP) to accelerate grid modeling at the HRU level. Such a parallel implementation takes better advantage of the computational power of a shared-memory computer system. We conducted two experiments at multiple temporal and spatial scales of hydrological modeling using SWATG and SWATGP on a high-end server. At 500-m resolution, SWATGP was found to be up to nine times faster than SWATG in modeling a roughly 2000 km² watershed with a 1-CPU, 15-thread configuration. The study results demonstrate that parallel models save considerable time relative to traditional sequential simulation runs. Parallel computations of environmental models are beneficial for model applications, especially at large spatial and temporal scales and at high resolutions. The proposed SWATGP model is thus a promising tool for large-scale and high-resolution water resources research and management, in addition to offering data fusion and model coupling ability.
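As a rough reading of the figures quoted in this abstract, speedup and parallel efficiency can be worked out as below. This is only a sketch: it treats "up to nine times faster" as an upper bound on the speedup $S$ and assumes the 15 threads are the worker count $p$; the symbols $T_{\text{seq}}$, $T_{\text{par}}$, $S$, $E$ and $p$ are introduced here and do not appear in the record.

```latex
\[
S = \frac{T_{\text{seq}}}{T_{\text{par}}} \le 9,
\qquad
E = \frac{S}{p} \le \frac{9}{15} = 0.6 .
\]
```

Under these assumptions, each thread would be doing useful work for at most about 60% of the run.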
Rubus: A compiler for seamless and extensible parallelism

Science.gov (United States)

Adnan, Muhammad; Aslam, Faisal; Sarwar, Syed Mansoor

2017-01-01

Nowadays, a typical processor may have multiple processing cores on a single chip. Furthermore, a special purpose processing unit called Graphic Processing Unit (GPU), originally designed for 2D/3D games, is now available for general purpose use in computers and mobile devices. However, the traditional programming languages which were designed to work with machines having single core CPUs, cannot utilize the parallelism available on multi-core processors efficiently. Therefore, to exploit the extraordinary processing power of multi-core processors, researchers are working on new tools and techniques to facilitate parallel programming. To this end, languages like CUDA and OpenCL have been introduced, which can be used to write code with parallelism. The main shortcoming of these languages is that programmer needs to specify all the complex details manually in order to parallelize the code across multiple cores. Therefore, the code written in these languages is difficult to understand, debug and maintain. Furthermore, to parallelize legacy code can require rewriting a significant portion of code in CUDA or OpenCL, which can consume significant time and resources. Thus, the amount of parallelism achieved is proportional to the skills of the programmer and the time spent in code optimizations. This paper proposes a new open source compiler, Rubus, to achieve seamless parallelism. The Rubus compiler relieves the programmer from manually specifying the low-level details. It analyses and transforms a sequential program into a parallel program automatically, without any user intervention. This achieves massive speedup and better utilization of the underlying hardware without a programmer’s expertise in parallel programming. For five different benchmarks, on average a speedup of 34.54 times has been achieved by Rubus as compared to Java on a basic GPU having only 96 cores. Whereas, for a matrix multiplication benchmark the average execution speedup of 84 times has been
Rubus: A compiler for seamless and extensible parallelism.

Directory of Open Access Journals (Sweden)

Muhammad Adnan

Nowadays, a typical processor may have multiple processing cores on a single chip. Furthermore, a special purpose processing unit called Graphic Processing Unit (GPU), originally designed for 2D/3D games, is now available for general purpose use in computers and mobile devices. However, the traditional programming languages which were designed to work with machines having single core CPUs, cannot utilize the parallelism available on multi-core processors efficiently. Therefore, to exploit the extraordinary processing power of multi-core processors, researchers are working on new tools and techniques to facilitate parallel programming. To this end, languages like CUDA and OpenCL have been introduced, which can be used to write code with parallelism. The main shortcoming of these languages is that programmer needs to specify all the complex details manually in order to parallelize the code across multiple cores. Therefore, the code written in these languages is difficult to understand, debug and maintain. Furthermore, to parallelize legacy code can require rewriting a significant portion of code in CUDA or OpenCL, which can consume significant time and resources. Thus, the amount of parallelism achieved is proportional to the skills of the programmer and the time spent in code optimizations. This paper proposes a new open source compiler, Rubus, to achieve seamless parallelism. The Rubus compiler relieves the programmer from manually specifying the low-level details. It analyses and transforms a sequential program into a parallel program automatically, without any user intervention. This achieves massive speedup and better utilization of the underlying hardware without a programmer's expertise in parallel programming. For five different benchmarks, on average a speedup of 34.54 times has been achieved by Rubus as compared to Java on a basic GPU having only 96 cores. Whereas, for a matrix multiplication benchmark the average execution speedup of 84
Parallel discrete event simulation using shared memory

Science.gov (United States)

Reed, Daniel A.; Malony, Allen D.; Mccredie, Bradley D.

1988-01-01

With traditional event-list techniques, evaluating a detailed discrete-event simulation-model can often require hours or even days of computation time. By eliminating the event list and maintaining only sufficient synchronization to ensure causality, parallel simulation can potentially provide speedups that are linear in the numbers of processors. A set of shared-memory experiments, using the Chandy-Misra distributed-simulation algorithm, to simulate networks of queues is presented. Parameters of the study include queueing network topology and routing probabilities, number of processors, and assignment of network nodes to processors. These experiments show that Chandy-Misra distributed simulation is a questionable alternative to sequential-simulation of most queueing network models.
Modelling sequentially scored item responses

NARCIS (Netherlands)

Akkermans, W.

2000-01-01

The sequential model can be used to describe the variable resulting from a sequential scoring process. In this paper two more item response models are investigated with respect to their suitability for sequential scoring: the partial credit model and the graded response model. The investigation is
Comments on the parallelization efficiency of the Sunway TaihuLight supercomputer

OpenAIRE

Végh, János

2016-01-01

In the world of supercomputers, the large number of processors makes it necessary to minimize the inefficiencies of parallelization, which appear as a sequential part of the program from the point of view of Amdahl's law. The recently suggested new figure of merit is applied to the recently presented supercomputer, and the timeline of "Top 500" supercomputers is scrutinized using the metric. It is demonstrated that, in addition to the computing performance and power consumption, the new supercomputer i...
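For orientation, the textbook form of Amdahl's law that this abstract refers to can be sketched as follows. This is not the paper's "new figure of merit", which the truncated abstract does not define; the function names `amdahl_speedup` and `effective_serial_fraction` are hypothetical and chosen here for illustration only.

```python
def amdahl_speedup(serial_fraction: float, processors: int) -> float:
    """Textbook Amdahl speedup for a program whose sequential part is
    `serial_fraction` of the single-processor run time."""
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must lie in [0, 1]")
    if processors < 1:
        raise ValueError("processors must be at least 1")
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / processors)


def effective_serial_fraction(speedup: float, processors: int) -> float:
    """Invert Amdahl's law: the sequential fraction that would explain a
    measured speedup on `processors` processors (requires processors > 1)."""
    if processors < 2:
        raise ValueError("need at least 2 processors to invert the law")
    return (processors / speedup - 1.0) / (processors - 1.0)


if __name__ == "__main__":
    # Even a 1% sequential part caps the achievable speedup below 100.
    for p in (10, 1_000, 1_000_000):
        print(p, round(amdahl_speedup(0.01, p), 2))
```

The inversion makes the abstract's point concrete: with very many processors, any parallelization overhead behaves like a small sequential fraction that dominates the achievable speedup.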
Relative resilience to noise of standard and sequential approaches to measurement-based quantum computation

Science.gov (United States)

Gallagher, C. B.; Ferraro, A.

2018-05-01

A possible alternative to the standard model of measurement-based quantum computation (MBQC) is offered by the sequential model of MBQC—a particular class of quantum computation via ancillae. Although these two models are equivalent under ideal conditions, their relative resilience to noise in practical conditions is not yet known. We analyze this relationship for various noise models in the ancilla preparation and in the entangling-gate implementation. The comparison of the two models is performed utilizing both the gate infidelity and the diamond distance as figures of merit. Our results show that in the majority of instances the sequential model outperforms the standard one in regard to a universal set of operations for quantum computation. Further investigation is made into the performance of sequential MBQC in experimental scenarios, thus setting benchmarks for possible cavity-QED implementations.
Sequential Interval Estimation of a Location Parameter with Fixed Width in the Nonregular Case

OpenAIRE

Koike, Ken-ichi

2007-01-01

For a location-scale parameter family of distributions with a finite support, a sequential confidence interval with a fixed width is obtained for the location parameter, and its asymptotic consistency and efficiency are shown. Some comparisons with the Chow-Robbins procedure are also done.
Automatic Parallelization Tool: Classification of Program Code for Parallel Computing

Directory of Open Access Journals (Sweden)

Mustafa Basthikodi

2016-04-01

Performance growth of single-core processors has come to a halt in the past decade, but was re-enabled by the introduction of parallelism in processors. Multicore frameworks, along with Graphical Processing Units, have broadly enhanced parallelism. A couple of compilers have been updated to address the emerging challenges of synchronization and threading. Appropriate program and algorithm classifications would greatly help software engineers find opportunities for effective parallelization. In the present work we investigated current species for the classification of algorithms; related work on classification is discussed, along with a comparison of the issues that challenge the classification. A set of algorithms is chosen whose structure matches different issues and which perform a given task. We tested these algorithms using existing automatic species extraction tools along with the Bones compiler. We added functionality to an existing tool, providing a more detailed characterization. The contributions of our work include support for pointer arithmetic, conditional and incremental statements, user-defined types, constants and mathematical functions. With this, we can retain significant data that is not captured by the original species of algorithms. We implemented these new ideas in the tool, enabling automatic characterization of program code.
Multi-agent sequential hypothesis testing

KAUST Repository

Kim, Kwang-Ki K.

2014-12-15

This paper considers multi-agent sequential hypothesis testing and presents a framework for strategic learning in sequential games with explicit consideration of both temporal and spatial coordination. The associated Bayes risk functions explicitly incorporate costs of taking private/public measurements, costs of time-difference and disagreement in actions of agents, and costs of false declaration/choices in the sequential hypothesis testing. The corresponding sequential decision processes have well-defined value functions with respect to (a) the belief states for the case of conditional independent private noisy measurements that are also assumed to be independent identically distributed over time, and (b) the information states for the case of correlated private noisy measurements. A sequential investment game of strategic coordination and delay is also discussed as an application of the proposed strategic learning rules.

Decomposition and parallelization strategies for solving large-scale MDO problems

Energy Technology Data Exchange (ETDEWEB)

Grauer, M.; Eschenauer, H.A. [Research Center for Multidisciplinary Analyses and Applied Structural Optimization, FOMAAS, Univ. of Siegen (Germany)

2007-07-01

During previous years, structural optimization has been recognized as a useful tool within the disciplines of engineering and economics. However, the optimization of large-scale systems or structures is impeded by an immense solution effort. This was the reason to start a joint research and development (R&D) project between the Institute of Mechanics and Control Engineering and the Information and Decision Sciences Institute within the Research Center for Multidisciplinary Analyses and Applied Structural Optimization (FOMAAS) on cluster computing for parallel and distributed solution of multidisciplinary optimization (MDO) problems based on the OpTiX-Workbench. Here the focus of attention is on coarse-grained parallelization and its implementation on clusters of workstations. A further point of emphasis is the development of a parallel decomposition strategy called PARDEC, for the solution of very complex optimization problems which cannot be solved efficiently by sequential integrated optimization. The use of the OpTiX-Workbench together with the FEM ground water simulation system FEFLOW is shown for a special water management problem. (orig.)
High-performance parallel approaches for three-dimensional light detection and ranging point clouds gridding

Science.gov (United States)

Rizki, Permata Nur Miftahur; Lee, Heezin; Lee, Minsu; Oh, Sangyoon

2017-01-01

With the rapid advance of remote sensing technology, the amount of three-dimensional point-cloud data has increased extraordinarily, requiring faster processing in the construction of digital elevation models. There have been several attempts to accelerate the computation using parallel methods; however, little attention has been given to investigating different approaches for selecting the most suited parallel programming model for a given computing environment. We present our findings and insights identified by implementing three popular high-performance parallel approaches (message passing interface, MapReduce, and GPGPU) on time demanding but accurate kriging interpolation. The performances of the approaches are compared by varying the size of the grid and input data. In our empirical experiment, we demonstrate the significant acceleration by all three approaches compared to a C-implemented sequential-processing method. In addition, we also discuss the pros and cons of each method in terms of usability, complexity infrastructure, and platform limitation to give readers a better understanding of utilizing those parallel approaches for gridding purposes.
Sequential charged particle reaction

International Nuclear Information System (INIS)

Hori, Jun-ichi; Ochiai, Kentaro; Sato, Satoshi; Yamauchi, Michinori; Nishitani, Takeo

2004-01-01

The effective cross sections for producing the sequential reaction products in F82H, pure vanadium and LiF with respect to 14.9-MeV neutrons were obtained and compared with estimated values. Since the sequential reactions depend on the behavior of the secondary charged particles, the effective cross sections depend on the target nuclei and the material composition. The effective cross sections were also estimated by using the EAF-libraries and compared with the experimental ones. There were large discrepancies between estimated and experimental values. Additionally, we showed the contribution of the sequential reaction to the induced activity and dose rate in the boundary region with water. The present study clarifies that the sequential reactions are of great importance in evaluating the dose rates around the surface of the cooling pipe and the activated corrosion products. (author)
User's guide of parallel program development environment (PPDE). The 2nd edition

Energy Technology Data Exchange (ETDEWEB)

Ueno, Hirokazu; Takemiya, Hiroshi; Imamura, Toshiyuki; Koide, Hiroshi; Matsuda, Katsuyuki; Higuchi, Kenji; Hirayama, Toshio [Center for Promotion of Computational Science and Engineering, Japan Atomic Energy Research Institute, Tokyo (Japan); Ohta, Hirofumi [Hitachi Ltd., Tokyo (Japan)

2000-03-01

The STA basic system has been enhanced to accelerate support for parallel programming on heterogeneous parallel computers, through a series of R&D efforts on parallel processing technology. The enhancement extends the functions of the PPDE (Parallel Program Development Environment) in the STA basic system. The extended PPDE provides: 1) automatic creation of a 'makefile' and a shell script file for its execution, 2) multi-tool execution, which lets tools on heterogeneous computers carry out a task on a computer with one operation, and 3) mirror composition, which reflects the editing results of a file on one computer into all related files on other computers. These additional functions will improve work efficiency for program development on some computers. More functions have been added to the PPDE to provide help for parallel program development. New functions were also designed to complement an HPF translator and a parallelizing support tool when working together, so that a sequential program is efficiently converted to a parallel program. This report describes the use of the extended PPDE. (author)
Performance of a Sequential and Parallel Computational Fluid Dynamic (CFD) Solver on a Missile Body Configuration

National Research Council Canada - National Science Library

Hisley, Dixie

1999-01-01

.... The goals of this report are: (1) to investigate the performance of message passing and loop level parallelization techniques, as they were implemented in the computational fluid dynamics (CFD...
External parallel sorting with multiprocessor computers

International Nuclear Information System (INIS)

Comanceau, S.I.

1984-01-01

This article describes methods of external sorting in which the entire main computer memory is used for the internal sorting of entries, forming out of them sorted segments of the greatest possible size, and outputting them to external memories. The obtained segments are merged into larger segments until all entries form one ordered segment. The described methods are suitable for sequential files stored on magnetic tape. The needs of the sorting algorithm can be met by using the relatively slow peripheral storage devices (e.g., tapes, disks, drums). The efficiency of the external sorting methods is determined by calculating the total sorting time as a function of the number of entries to be sorted and the number of parallel processors participating in the sorting process
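The two phases described in this abstract (memory-sized sorted segments, then repeated merging until one segment remains) can be sketched as below. This is an illustrative single-process sketch, not the article's multiprocessor method: Python lists stand in for tape files, and the names `make_runs`, `merge_pass`, `external_sort` and `memory_capacity` are hypothetical.

```python
import heapq
from typing import Iterable, List


def make_runs(records: Iterable[int], memory_capacity: int) -> List[List[int]]:
    """Phase 1: fill the available memory, sort it internally, and write
    the result out as one sorted segment (run)."""
    if memory_capacity < 1:
        raise ValueError("memory_capacity must be at least 1")
    runs: List[List[int]] = []
    buffer: List[int] = []
    for record in records:
        buffer.append(record)
        if len(buffer) == memory_capacity:
            runs.append(sorted(buffer))
            buffer = []
    if buffer:  # last, possibly shorter, segment
        runs.append(sorted(buffer))
    return runs


def merge_pass(runs: List[List[int]]) -> List[List[int]]:
    """Phase 2, one pass: merge neighbouring segments pairwise, reading
    each input sequentially as a tape drive would."""
    merged: List[List[int]] = []
    for i in range(0, len(runs), 2):
        merged.append(list(heapq.merge(*runs[i:i + 2])))
    return merged


def external_sort(records: Iterable[int], memory_capacity: int) -> List[int]:
    """Repeat merge passes until all entries form one ordered segment."""
    runs = make_runs(records, memory_capacity)
    while len(runs) > 1:
        runs = merge_pass(runs)
    return runs[0] if runs else []


if __name__ == "__main__":
    print(external_sort([5, 3, 8, 1, 9, 2, 7], memory_capacity=3))
```

In a multiprocessor setting, the independent pairwise merges within one pass are the natural units to distribute across processors, although how the article assigns them is not stated in the abstract.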
Heuristic and optimal policy computations in the human brain during sequential decision-making.

Science.gov (United States)

Korn, Christoph W; Bach, Dominik R

2018-01-23

Optimal decisions across extended time horizons require value calculations over multiple probabilistic future states. Humans may circumvent such complex computations by resorting to easy-to-compute heuristics that approximate optimal solutions. To probe the potential interplay between heuristic and optimal computations, we develop a novel sequential decision-making task, framed as virtual foraging in which participants have to avoid virtual starvation. Rewards depend only on final outcomes over five-trial blocks, necessitating planning over five sequential decisions and probabilistic outcomes. Here, we report model comparisons demonstrating that participants primarily rely on the best available heuristic but also use the normatively optimal policy. FMRI signals in medial prefrontal cortex (MPFC) relate to heuristic and optimal policies and associated choice uncertainties. Crucially, reaction times and dorsal MPFC activity scale with discrepancies between heuristic and optimal policies. Thus, sequential decision-making in humans may emerge from integration between heuristic and optimal policies, implemented by controllers in MPFC.
Treatment planning in radiosurgery: parallel Monte Carlo simulation software

Energy Technology Data Exchange (ETDEWEB)

Scielzo, G [Galliera Hospitals, Genova (Italy). Dept. of Hospital Physics; Grillo Ruggieri, F [Galliera Hospitals, Genova (Italy) Dept. for Radiation Therapy; Modesti, M; Felici, R [Electronic Data System, Rome (Italy); Surridge, M [University of Southampton (United Kingdom). Parallel Application Centre

1995-12-01

The main objective of this research was to evaluate the possibility of direct Monte Carlo simulation for accurate dosimetry with short computation time. We made use of: a graphics workstation, a linear accelerator, and water, PMMA and anthropomorphic phantoms, for validation purposes; ionometric, film and thermo-luminescent techniques, for dosimetry; and a treatment planning system, for comparison. Benchmarking results suggest that short computing times can be obtained with the parallel version of EGS4 that was developed. Parallelism was obtained by assigning simulated incident photons to separate processors, and the development of a parallel random number generator was necessary. Validation consisted of phantom irradiation and comparison of predicted and measured values, which showed good agreement in PDD and dose profiles. Experiments on anthropomorphic phantoms (with inhomogeneities) were carried out, and these values are being compared with results obtained with the conventional treatment planning system.
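The parallelization pattern named in this abstract, splitting incident photons across workers that each draw from their own random stream, can be illustrated with a toy problem. The sketch below is not EGS4 and not the authors' generator: it estimates transmission through a uniform slab, uses threads only to show the structure, and seeds each worker as `base_seed + worker_id`, which is a simple assumption rather than a rigorous parallel random number generator. All names and parameters are hypothetical.

```python
import math
import random
from concurrent.futures import ThreadPoolExecutor


def simulate_worker(worker_id: int, histories: int, mu: float,
                    thickness: float, base_seed: int) -> int:
    """Track `histories` photons with a private random stream and count
    those whose first free path exceeds the slab thickness."""
    rng = random.Random(base_seed + worker_id)  # one stream per worker (assumption)
    transmitted = 0
    for _ in range(histories):
        # Sample an exponential free path with attenuation coefficient mu.
        free_path = -math.log(1.0 - rng.random()) / mu
        if free_path > thickness:
            transmitted += 1
    return transmitted


def transmitted_fraction(total_histories: int, workers: int, mu: float = 1.0,
                         thickness: float = 1.0, base_seed: int = 12345) -> float:
    """Split the photon histories across workers and combine their tallies."""
    share, extra = divmod(total_histories, workers)
    counts = [share + (1 if w < extra else 0) for w in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        tallies = pool.map(simulate_worker, range(workers), counts,
                           [mu] * workers, [thickness] * workers,
                           [base_seed] * workers)
        total = sum(tallies)
    return total / total_histories


if __name__ == "__main__":
    estimate = transmitted_fraction(40_000, workers=4)
    print(estimate, math.exp(-1.0))  # estimate vs. analytic exp(-mu * thickness)
```

Because every worker owns its generator and seed, the combined tally does not depend on scheduling order, which is the property a parallel random number generator has to preserve at scale.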
Eyewitness confidence in simultaneous and sequential lineups: a criterion shift account for sequential mistaken identification overconfidence.

Science.gov (United States)

Dobolyi, David G; Dodson, Chad S

2013-12-01

Confidence judgments for eyewitness identifications play an integral role in determining guilt during legal proceedings. Past research has shown that confidence in positive identifications is strongly associated with accuracy. Using a standard lineup recognition paradigm, we investigated accuracy using signal detection and ROC analyses, along with the tendency to choose a face with both simultaneous and sequential lineups. We replicated past findings of reduced rates of choosing with sequential as compared to simultaneous lineups, but notably found an accuracy advantage in favor of simultaneous lineups. Moreover, our analysis of the confidence-accuracy relationship revealed two key findings. First, we observed a sequential mistaken identification overconfidence effect: despite an overall reduction in false alarms, confidence for false alarms that did occur was higher with sequential lineups than with simultaneous lineups, with no differences in confidence for correct identifications. This sequential mistaken identification overconfidence effect is an expected byproduct of the use of a more conservative identification criterion with sequential than with simultaneous lineups. Second, we found a steady drop in confidence for mistaken identifications (i.e., foil identifications and false alarms) from the first to the last face in sequential lineups, whereas confidence in and accuracy of correct identifications remained relatively stable. Overall, we observed that sequential lineups are both less accurate and produce higher confidence false identifications than do simultaneous lineups. Given the increasing prominence of sequential lineups in our legal system, our data argue for increased scrutiny and possibly a wholesale reevaluation of this lineup format. PsycINFO Database Record (c) 2013 APA, all rights reserved.
Improving image quality of parallel phase-shifting digital holography

International Nuclear Information System (INIS)

Awatsuji, Yasuhiro; Tahara, Tatsuki; Kaneko, Atsushi; Koyama, Takamasa; Nishio, Kenzo; Ura, Shogo; Kubota, Toshihiro; Matoba, Osamu

2008-01-01

The authors propose parallel two-step phase-shifting digital holography to improve the image quality of parallel phase-shifting digital holography. The proposed technique doubles the effective number of pixels of the hologram in comparison to the conventional parallel four-step technique. The increase in the number of pixels makes it possible to improve the quality of the image reconstructed by parallel phase-shifting digital holography. A numerical simulation and a preliminary experiment of the proposed technique were conducted, and the effectiveness of the technique was confirmed. The proposed technique is more practical than conventional parallel phase-shifting digital holography, because the composition of the digital holographic system based on the proposed technique is simpler.
Parallel computation with molecular-motor-propelled agents in nanofabricated networks.

Science.gov (United States)

Nicolau, Dan V; Lard, Mercy; Korten, Till; van Delft, Falco C M J M; Persson, Malin; Bengtsson, Elina; Månsson, Alf; Diez, Stefan; Linke, Heiner; Nicolau, Dan V

2016-03-08

The combinatorial nature of many important mathematical problems, including nondeterministic-polynomial-time (NP)-complete problems, places a severe limitation on the problem size that can be solved with conventional, sequentially operating electronic computers. There have been significant efforts in conceiving parallel-computation approaches in the past, for example: DNA computation, quantum computation, and microfluidics-based computation. However, these approaches have not proven, so far, to be scalable and practical from a fabrication and operational perspective. Here, we report the foundations of an alternative parallel-computation system in which a given combinatorial problem is encoded into a graphical, modular network that is embedded in a nanofabricated planar device. Exploring the network in a parallel fashion using a large number of independent, molecular-motor-propelled agents then solves the mathematical problem. This approach uses orders of magnitude less energy than conventional computers, thus addressing issues related to power consumption and heat dissipation. We provide a proof-of-concept demonstration of such a device by solving, in a parallel fashion, the small instance {2, 5, 9} of the subset sum problem, which is a benchmark NP-complete problem. Finally, we discuss the technical advances necessary to make our system scalable with presently available technology.
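For reference, the small subset sum instance {2, 5, 9} mentioned in this abstract can be enumerated on a conventional computer as sketched below. The sketch only lists which totals are reachable; it says nothing about how the nanofabricated network encodes the problem, and the function name `reachable_sums` is hypothetical.

```python
from typing import Iterable, Set


def reachable_sums(values: Iterable[int]) -> Set[int]:
    """Return every total obtainable by adding up some subset of `values`
    (the empty subset contributes 0)."""
    sums = {0}
    for value in values:
        # Each element is either skipped (keep old sums) or included (shifted sums).
        sums |= {s + value for s in sums}
    return sums


if __name__ == "__main__":
    print(sorted(reachable_sums([2, 5, 9])))
    # prints [0, 2, 5, 7, 9, 11, 14, 16]
```

The set can at most double with each added element, which is the combinatorial growth that motivates exploring all include/skip choices in parallel.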
Comparison of sequential and single extraction in order to estimate environmental impact of metals from fly ash

Directory of Open Access Journals (Sweden)

Tasić Aleksandra M.

2016-01-01

The aim of this paper was to simulate leaching of metals from fly ash in different environmental conditions using ultrasound and microwave-assisted extraction techniques. Single-agent extraction and sequential extraction procedures were used to determine the levels of leaching of different metals. The concentrations of metals (Al, Fe, Mn, Cd, Co, Cr, Ni, Pb, Cu, As, Be) in fly ash extracts were measured by Inductively Coupled Plasma-Atomic Emission Spectrometry. Single-agent extractions of metals were conducted during sonication times of 10, 20, 30, 40 and 50 min. Single-agent extraction with deionized water was also undertaken by exposing samples to microwave radiation at a temperature of 50°C. The sequential extraction was undertaken according to the BCR procedure, which was modified and applied to study the partitioning of metals in coal fly ash. The microwave-assisted sequential extraction was performed at different extraction temperatures: 50, 100 and 150°C. The partitioning of metals between the individual fractions was investigated and discussed. The efficiency of the extraction process for each step was examined. In addition, the results of the microwave-assisted sequential extraction are compared to the results obtained by the standard ASTM method. The mobility of most elements contained in fly ash is markedly pH sensitive. [Project of the Ministry of Science of the Republic of Serbia, Nos. 172030, 176006 and III43009]
Time-dependent resonant tunnelling for parallel-coupled double quantum dots

International Nuclear Information System (INIS)

Dong Bing; Djuric, Ivana; Cui, H L; Lei, X L

2004-01-01

We derive the quantum rate equations for an Aharonov-Bohm interferometer with two vertically coupled quantum dots embedded in each of two arms by means of the nonequilibrium Green function in the sequential tunnelling regime. Based on these equations, we investigate time-dependent resonant tunnelling under a small amplitude irradiation and find that the resonant photon-assisted tunnelling peaks in photocurrent demonstrate a combination behaviour of Fano and Lorentzian resonances due to the interference effect between the two pathways in this parallel configuration, which is controllable by threading the magnetic flux inside this device
Parallel symbolic state-space exploration is difficult, but what is the alternative?

Directory of Open Access Journals (Sweden)

Gianfranco Ciardo

2009-12-01

State-space exploration is an essential step in many modeling and analysis problems. Its goal is to find the states reachable from the initial state of a discrete-state model described. The state space can be used to answer important questions, e.g., "Is there a dead state?" and "Can N become negative?", or as a starting point for sophisticated investigations expressed in temporal logic. Unfortunately, the state space is often so large that ordinary explicit data structures and sequential algorithms cannot cope, prompting the exploration of (1) parallel approaches using multiple processors, from simple workstation networks to shared-memory supercomputers, to satisfy large memory and runtime requirements and (2) symbolic approaches using decision diagrams to encode the large structured sets and relations manipulated during state-space generation. Both approaches have merits and limitations. Parallel explicit state-space generation is challenging, but almost linear speedup can be achieved; however, the analysis is ultimately limited by the memory and processors available. Symbolic methods are a heuristic that can efficiently encode many, but not all, functions over a structured and exponentially large domain; here the pitfalls are subtler: their performance varies widely depending on the class of decision diagram chosen, the state variable order, and obscure algorithmic parameters. As symbolic approaches are often much more efficient than explicit ones for many practical models, we argue for the need to parallelize symbolic state-space generation algorithms, so that we can realize the advantage of both approaches. This is a challenging endeavor, as the most efficient symbolic algorithm, Saturation, is inherently sequential. We conclude by discussing challenges, efforts, and promising directions toward this goal.
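The explicit, sequential baseline that this abstract contrasts with parallel and symbolic methods can be sketched as a breadth-first search over reachable states. The sketch assumes a hypothetical toy model (tokens moving from one place to another); it is not Saturation and uses no decision diagrams, and the names `explore`, `dead_states` and `move_token` are introduced here for illustration.

```python
from collections import deque
from typing import Callable, Hashable, Iterable, Set

State = Hashable


def explore(initial: State,
            successors: Callable[[State], Iterable[State]]) -> Set[State]:
    """Explicit state-space generation: breadth-first search that stores
    every reachable state in an ordinary set."""
    seen = {initial}
    frontier = deque([initial])
    while frontier:
        state = frontier.popleft()
        for nxt in successors(state):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return seen


def dead_states(reachable: Set[State],
                successors: Callable[[State], Iterable[State]]) -> Set[State]:
    """Answer "Is there a dead state?": reachable states with no successor."""
    return {s for s in reachable if not list(successors(s))}


def move_token(state):
    """Hypothetical model: a state is (a, b); one token may move from a to b."""
    a, b = state
    return [(a - 1, b + 1)] if a > 0 else []


if __name__ == "__main__":
    space = explore((2, 0), move_token)
    print(sorted(space), dead_states(space, move_token))
```

The `seen` set is exactly the explicit data structure whose memory footprint, according to the abstract, motivates both the parallel and the symbolic alternatives.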
Parallel-Sequential Texture Analysis

NARCIS (Netherlands)

van den Broek, Egon; Singh, Sameer; Singh, Maneesha; van Rikxoort, Eva M.; Apte, Chid; Perner, Petra

2005-01-01

Color induced texture analysis is explored, using two texture analysis techniques: the co-occurrence matrix and the color correlogram as well as color histograms. Several quantization schemes for six color spaces and the human-based 11 color quantization scheme have been applied. The VisTex texture
Prosodic structure as a parallel to musical structure

Directory of Open Access Journals (Sweden)

Christopher Cullen Heffner

2015-12-01

What structural properties do language and music share? Although early speculation identified a wide variety of possibilities, the literature has largely focused on the parallels between musical structure and syntactic structure. Here, we argue that parallels between musical structure and prosodic structure deserve more attention. We review the evidence for a link between musical and prosodic structure and find it to be strong. In fact, certain elements of prosodic structure may provide a parsimonious comparison with musical structure without sacrificing empirical findings related to the parallels between language and music. We then develop several predictions related to such a hypothesis.
Computational cost estimates for parallel shared memory isogeometric multi-frontal solvers

KAUST Repository

Woźniak, Maciej; Kuźnik, Krzysztof M.; Paszyński, Maciej R.; Calo, Victor M.; Pardo, D.

2014-01-01

In this paper we present computational cost estimates for parallel shared memory isogeometric multi-frontal solvers. The estimates show that the ideal isogeometric shared memory parallel direct solver scales as $O(p^2 \log(N/p))$ for one dimensional problems, $O(N p^2)$ for two dimensional problems, and $O(N^{4/3} p^2)$ for three dimensional problems, where N is the number of degrees of freedom, and p is the polynomial order of approximation. The computational costs of the shared memory parallel isogeometric direct solver are compared with those corresponding to the sequential isogeometric direct solver, the latter being equal to $O(N p^2)$ for the one dimensional case, $O(N^{1.5} p^3)$ for the two dimensional case, and $O(N^2 p^3)$ for the three dimensional case. The shared memory version significantly reduces the cost scaling in terms of both N and p. Theoretical estimates are compared with numerical experiments performed with linear, quadratic, cubic, quartic, and quintic B-splines, in one and two spatial dimensions. © 2014 Elsevier Ltd. All rights reserved.
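To make the asymptotic estimates quoted above concrete, the following sketch evaluates the leading-order terms and their ratio. It is only illustrative: big-O estimates omit constant factors, so the ratio indicates a trend and not a measured speedup. The function names and the use of the natural logarithm are assumptions of this example.

```python
import math


def parallel_cost(n, p, dim):
    """Leading-order shared-memory parallel cost (constants omitted)."""
    if dim == 1:
        return p**2 * math.log(n / p)  # log base is an assumption
    if dim == 2:
        return n * p**2
    if dim == 3:
        return n ** (4 / 3) * p**2
    raise ValueError("dim must be 1, 2, or 3")


def sequential_cost(n, p, dim):
    """Leading-order sequential cost (constants omitted)."""
    if dim == 1:
        return n * p**2
    if dim == 2:
        return n**1.5 * p**3
    if dim == 3:
        return n**2 * p**3
    raise ValueError("dim must be 1, 2, or 3")


def estimated_ratio(n, p, dim):
    """Sequential-to-parallel ratio of the leading-order terms."""
    return sequential_cost(n, p, dim) / parallel_cost(n, p, dim)


ratio_2d = estimated_ratio(10_000, 2, 2)  # equals sqrt(N) * p in 2D
ratio_3d = estimated_ratio(1_000, 3, 3)   # equals N^(2/3) * p in 3D
```

In two dimensions the ratio simplifies to N^(1/2) p, and in three dimensions to N^(2/3) p, which is the reduction in both N and p that the abstract describes.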
Computational cost estimates for parallel shared memory isogeometric multi-frontal solvers

KAUST Repository

Woźniak, Maciej

2014-06-01

In this paper we present computational cost estimates for parallel shared memory isogeometric multi-frontal solvers. The estimates show that the ideal isogeometric shared memory parallel direct solver scales as $O(p^2 \log(N/p))$ for one dimensional problems, $O(N p^2)$ for two dimensional problems, and $O(N^{4/3} p^2)$ for three dimensional problems, where N is the number of degrees of freedom, and p is the polynomial order of approximation. The computational costs of the shared memory parallel isogeometric direct solver are compared with those corresponding to the sequential isogeometric direct solver, the latter being equal to $O(N p^2)$ for the one dimensional case, $O(N^{1.5} p^3)$ for the two dimensional case, and $O(N^2 p^3)$ for the three dimensional case. The shared memory version significantly reduces the cost scaling in terms of both N and p. Theoretical estimates are compared with numerical experiments performed with linear, quadratic, cubic, quartic, and quintic B-splines, in one and two spatial dimensions. © 2014 Elsevier Ltd. All rights reserved.
Remarks on sequential designs in risk assessment

International Nuclear Information System (INIS)

Seidenfeld, T.

1982-01-01

The special merits of sequential designs are reviewed in light of particular challenges that attend risk assessment for human populations. The kinds of "statistical inference" are distinguished, and the design problem pursued is the clash between the Neyman-Pearson and Bayesian programs of sequential design. The value of sequential designs is discussed, and the Neyman-Pearson vs. Bayesian sequential designs are probed in particular. Finally, warnings about sequential designs are considered, especially in relation to utilitarianism.
Comparison of three sequential extraction procedures to describe metal fractionation in anaerobic granular sludges

NARCIS (Netherlands)

Hullebusch, van E.D.; Sudarno, S.; Zandvoort, M.H.; Lens, P.N.L.

2005-01-01

In the last few decades, several sequential extraction procedures have been developed to quantify the chemical status of metals in the solid phase. In this study, three extraction techniques (modified [A. Tessier, P.G.C. Campbell, M. Bisson, Anal. Chem. 51 (1979) 844]: [R.C. Stover, L.E. Sommers,

Cross-sectional versus sequential quality indicators of risk factor management in patients with type 2 diabetes

NARCIS (Netherlands)

Voorham, Jaco; Denig, Petra; Wolffenbuttel, Bruce H. R.; Haaijer-Ruskamp, Flora M.

Background: The fairness of quality assessment methods is under debate. Quality indicators incorporating the longitudinal nature of care have been advocated but their usefulness in comparison to more commonly used cross-sectional measures is not clear. Aims: To compare cross-sectional and sequential
Sequential lineup laps and eyewitness accuracy.

Science.gov (United States)

Steblay, Nancy K; Dietrich, Hannah L; Ryan, Shannon L; Raczynski, Jeanette L; James, Kali A

2011-08-01

Police practice of double-blind sequential lineups prompts a question about the efficacy of repeated viewings (laps) of the sequential lineup. Two laboratory experiments confirmed the presence of a sequential lap effect: an increase in witness lineup picks from first to second lap, when the culprit was a stranger. The second lap produced more errors than correct identifications. In Experiment 2, lineup diagnosticity was significantly higher for sequential lineup procedures that employed a single versus double laps. Witnesses who elected to view a second lap made significantly more errors than witnesses who chose to stop after one lap or those who were required to view two laps. Witnesses with prior exposure to the culprit did not exhibit a sequential lap effect.
Costs of achieving live birth from assisted reproductive technology: a comparison of sequential single and double embryo transfer approaches.

Science.gov (United States)

Crawford, Sara; Boulet, Sheree L; Mneimneh, Allison S; Perkins, Kiran M; Jamieson, Denise J; Zhang, Yujia; Kissin, Dmitry M

2016-02-01

To assess treatment and pregnancy/infant-associated medical costs and birth outcomes for assisted reproductive technology (ART) cycles in a subset of patients using elective double embryo transfer (ET) and to project the difference in costs and outcomes had the cycles instead been sequential single ETs (fresh followed by frozen if the fresh ET did not result in live birth). Retrospective cohort study using 2012 and 2013 data from the National ART Surveillance System. Infertility treatment centers. Fresh, autologous double ETs performed in 2012 among ART patients younger than 35 years of age with no prior ART use who cryopreserved at least one embryo. Sequential single and double ETs. Actual live birth rates and estimated ART treatment and pregnancy/infant-associated medical costs for double ET cycles started in 2012 and projected ART treatment and pregnancy/infant-associated medical costs if the double ET cycles had been performed as sequential single ETs. The estimated total ART treatment and pregnancy/infant-associated medical costs were $580.9 million for 10,001 double ETs started in 2012. If performed as sequential single ETs, estimated costs would have decreased by $195.0 million to $386.0 million, and live birth rates would have increased from 57.7% to 68.0%. Sequential single ETs, when clinically appropriate, can reduce total ART treatment and pregnancy/infant-associated medical costs by reducing multiple births without lowering live birth rates. Published by Elsevier Inc.
Effects of neostriatal 6-OHDA lesion on performance in a rat sequential reaction time task.

Science.gov (United States)

Domenger, D; Schwarting, R K W

2008-10-31

Work in humans and monkeys has provided evidence that the basal ganglia, and the neurotransmitter dopamine therein, play an important role in sequential learning and performance. Compared to primates, experimental work in rodents is rather sparse, largely due to the fact that tasks comparable to the human ones, especially serial reaction time tasks (SRTT), had been lacking until recently. We have developed a rat model of the SRTT, which allows the study of neural correlates of sequential performance and motor sequence execution. Here, we report the effects of dopaminergic neostriatal lesions, performed using bilateral 6-hydroxydopamine injections, on performance of well-trained rats tested in our SRTT. Sequential behavior was measured in two ways: first, the effects of small violations of otherwise well-trained sequences were examined as a measure of attention and automation. Second, sequential versus random performance was compared as a measure of sequential learning. Neurochemically, the lesions led to sub-total dopamine depletions in the neostriatum, which ranged around 60% in the lateral, and around 40% in the medial neostriatum. These lesions led to a general instrumental impairment in terms of reduced speed (response latencies) and response rate, and these deficits were correlated with the degree of striatal dopamine loss. Furthermore, the violation test indicated that the lesion group conducted less automated responses. The comparison of random versus sequential responding showed that the lesion group did not retain its superior sequential performance in terms of speed, whereas they did in terms of accuracy. Also, rats with lesions did not improve further in overall performance as compared to pre-lesion values, whereas controls did. These results support previous results that neostriatal dopamine is involved in instrumental behaviour in general. Also, these lesions are not sufficient to completely abolish sequential performance, at least when acquired
On the Organization of Parallel Operation of Some Algorithms for Finding the Shortest Path on a Graph on a Computer System with Multiple Instruction Stream and Single Data Stream

Directory of Open Access Journals (Sweden)

V. E. Podol'skii

2015-01-01

The paper considers implementing the Bellman-Ford and Lee algorithms to find the shortest graph path on a computer system with multiple instruction stream and single data stream (MISD). The MISD computer is a computer that executes commands of arithmetic-logical processing (on the CPU) and commands of structures processing (on the structures processor) in parallel on a single data stream. Transformation of sequential programs into MISD programs is a labor-intensive process because it requires the stream of arithmetic-logical processing to be manually separated from that of structures processing. Algorithms based on the processing of data structures (e.g., algorithms on graphs) show high performance on a MISD computer. The Bellman-Ford and Lee algorithms for finding the shortest path on a graph are representatives of these algorithms. They are applied in robotics for automatic planning of robot movement in situ. Modifications of the Bellman-Ford and Lee algorithms for finding the shortest graph path in coprocessor MISD mode, and the parallel MISD modification of these algorithms, were first obtained in this article. Thus, this article continues a series of studies on the transformation of sequential algorithms into MISD ones (Dijkstra's and Ford-Fulkerson's algorithms) and has a pronouncedly applied nature. The article also presents the analysis results of the Bellman-Ford and Lee algorithms in MISD mode. The paper formulates the basic trends of a technique for parallelization of algorithms into an arithmetic-logical processing stream and a structures processing stream. Among the key areas for future research, development of a mathematical approach to provide a subsequently formalized and automated process of parallelizing sequential algorithms between the CPU and structures processor is highlighted. Among the mathematical models that can be used in future studies there are graph models of algorithms (e.g., dependency graph of a program). Due to the high
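For readers unfamiliar with the sequential starting point, a minimal textbook version of the Bellman-Ford algorithm is sketched below. This is the standard single-processor formulation, not the MISD modification developed in the article; the function name and the edge-list representation are choices made for this example.

```python
import math


def bellman_ford(num_vertices, edges, source):
    """Single-source shortest paths on a directed, weighted graph.

    `edges` is a list of (u, v, weight) tuples with vertices numbered
    0..num_vertices-1. Raises ValueError if a negative cycle is
    reachable from `source`.
    """
    dist = [math.inf] * num_vertices
    dist[source] = 0
    # Relax every edge up to |V| - 1 times; stop early once stable.
    for _ in range(num_vertices - 1):
        changed = False
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                changed = True
        if not changed:
            break
    # One more pass: any further improvement implies a negative cycle.
    for u, v, w in edges:
        if dist[u] + w < dist[v]:
            raise ValueError("negative cycle reachable from source")
    return dist


example_edges = [(0, 1, 4), (0, 2, 1), (2, 1, 2), (1, 3, 1)]
example_dist = bellman_ford(4, example_edges, 0)
```

The inner loop mixes arithmetic (the comparison and addition) with data-structure traversal (walking the edge list), which is the kind of separation into two streams that the abstract describes as manual work.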
Using UPPAAL to Analyze an MPEG-2 Algorithm

DEFF Research Database (Denmark)

Cambronero, M. Emilia; Ravn, Anders Peter; Valero, Valentin

2005-01-01

The performance of a parallel algorithm for an MPEG-2 encoding is analyzed using timed automata models in the UppAal tool. We have constructed both a sequential model of MPEG-2, and a parallel model of MPEG-2 and then, a comparison of the results obtained for both models is made. We show how...
Analysis of a parallel multigrid algorithm

Science.gov (United States)

Chan, Tony F.; Tuminaro, Ray S.

1989-01-01

The parallel multigrid algorithm of Frederickson and McBryan (1987) is considered. This algorithm uses multiple coarse-grid problems (instead of one problem) in the hope of accelerating convergence and is found to have a close relationship to traditional multigrid methods. Specifically, the parallel coarse-grid correction operator is identical to a traditional multigrid coarse-grid correction operator, except that the mixing of high and low frequencies caused by aliasing error is removed. Appropriate relaxation operators can be chosen to take advantage of this property. Comparisons between the standard multigrid and the new method are made.
Influence of Sequential vs. Simultaneous Dual-Task Exercise Training on Cognitive Function in Older Adults.

Science.gov (United States)

Tait, Jamie L; Duckham, Rachel L; Milte, Catherine M; Main, Luana C; Daly, Robin M

2017-01-01

Emerging research indicates that exercise combined with cognitive training may improve cognitive function in older adults. Typically these programs have incorporated sequential training, where exercise and cognitive training are undertaken separately. However, simultaneous or dual-task training, where cognitive and/or motor training are performed simultaneously with exercise, may offer greater benefits. This review summary provides an overview of the effects of combined simultaneous vs. sequential training on cognitive function in older adults. Based on the available evidence, there are inconsistent findings with regard to the cognitive benefits of sequential training in comparison to cognitive or exercise training alone. In contrast, simultaneous training interventions, particularly multimodal exercise programs in combination with secondary tasks regulated by sensory cues, have significantly improved cognition in both healthy older and clinical populations. However, further research is needed to determine the optimal characteristics of a successful simultaneous training program for optimizing cognitive function in older people.
Research in Parallel Algorithms and Software for Computational Aerosciences

Science.gov (United States)

Domel, Neal D.

1996-01-01

Phase 1 is complete for the development of a computational fluid dynamics (CFD) parallel code with automatic grid generation and adaptation for the Euler analysis of flow over complex geometries. SPLITFLOW, an unstructured Cartesian grid code developed at Lockheed Martin Tactical Aircraft Systems, has been modified for a distributed memory/massively parallel computing environment. The parallel code is operational on an SGI network, Cray J90 and C90 vector machines, SGI Power Challenge, and Cray T3D and IBM SP2 massively parallel machines. Parallel Virtual Machine (PVM) is the message passing protocol for portability to various architectures. A domain decomposition technique was developed which enforces dynamic load balancing to improve solution speed and memory requirements. A host/node algorithm distributes the tasks. The solver parallelizes very well, and scales with the number of processors. Partially parallelized and non-parallelized tasks consume most of the wall clock time in a very fine grain environment. Timing comparisons on a Cray C90 demonstrate that Parallel SPLITFLOW runs 2.4 times faster on 8 processors than its non-parallel counterpart autotasked over 8 processors.
A Parallel Processing Algorithm for Remote Sensing Classification

Science.gov (United States)

Gualtieri, J. Anthony

2005-01-01

A current thread in parallel computation is the use of cluster computers created by networking a few to thousands of commodity general-purpose workstation-level computers using the Linux operating system. For example, on the Medusa cluster at NASA/GSFC, this provides supercomputing performance, 130 Gflops (Linpack Benchmark), at moderate cost, $370K. However, to be useful for scientific computing in the area of Earth science, issues of ease of programming, access to existing scientific libraries, and portability of existing code need to be considered. In this paper, I address these issues in the context of tools for rendering earth science remote sensing data into useful products. In particular, I focus on a problem that can be decomposed into a set of independent tasks, which on a serial computer would be performed sequentially, but with a cluster computer can be performed in parallel, giving an obvious speedup. To make the ideas concrete, I consider the problem of classifying hyperspectral imagery where some ground truth is available to train the classifier. In particular I will use the Support Vector Machine (SVM) approach as applied to hyperspectral imagery. The approach will be to introduce notions about parallel computation and then to restrict the development to the SVM problem. Pseudocode (an outline of the computation) will be described and then details specific to the implementation will be given. Then timing results will be reported to show what speedups are possible using parallel computation. The paper will close with a discussion of the results.
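The decomposition into independent tasks described above can be sketched on a single machine with Python's standard `concurrent.futures` module. This is a toy stand-in, not the paper's cluster implementation: `classify_pixel` is a hypothetical threshold rule used in place of a trained SVM, and a thread pool is used only to keep the example self-contained. For CPU-bound work, a process pool or a message-passing library on a cluster would be the more realistic choice.

```python
from concurrent.futures import ThreadPoolExecutor


def classify_pixel(spectrum, threshold=0.5):
    """Hypothetical stand-in for a trained classifier:
    label a pixel 1 if its mean band value exceeds `threshold`, else 0."""
    return 1 if sum(spectrum) / len(spectrum) > threshold else 0


def classify_serial(pixels):
    """Process the independent tasks one after another."""
    return [classify_pixel(p) for p in pixels]


def classify_parallel(pixels, max_workers=4):
    """Process the same independent tasks concurrently.
    `map` returns results in input order, so the output matches the serial run."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(classify_pixel, pixels))


pixels = [[0.1, 0.2], [0.9, 0.8], [0.5, 0.5], [0.7, 0.4]]
serial_labels = classify_serial(pixels)
parallel_labels = classify_parallel(pixels)
```

Because no task depends on another task's result, the two functions must return identical labels; only the wall-clock time can differ.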
Comparison of 250 MHz R10K Origin 2000 and 400 MHz Origin 2000 Using NAS Parallel Benchmarks

Science.gov (United States)

Turney, Raymond D.; Thigpen, William W. (Technical Monitor)

2001-01-01

This report describes results of benchmark tests on Steger, a 250 MHz Origin 2000 system with R10K processors, currently installed at the NASA Ames National Advanced Supercomputing (NAS) facility. For comparison purposes, the tests were also run on Lomax, a 400 MHz Origin 2000 with R12K processors. The BT, LU, and SP application benchmarks in the NAS Parallel Benchmark Suite and the kernel benchmark FT were chosen to measure system performance. Having been written to measure performance on Computational Fluid Dynamics applications, these benchmarks are assumed appropriate to represent the NAS workload. Since the NAS runs both message passing (MPI) and shared-memory, compiler directive type codes, both MPI and OpenMP versions of the benchmarks were used. The MPI versions used were the latest official release of the NAS Parallel Benchmarks, version 2.3. The OpenMP versions used were PBN3b2, a beta version that is in the process of being released. NPB 2.3 and PBN3b2 are technically different benchmarks, and NPB results are not directly comparable to PBN results.
Multi-agent sequential hypothesis testing

KAUST Repository

Kim, Kwang-Ki K.; Shamma, Jeff S.

2014-01-01

incorporate costs of taking private/public measurements, costs of time-difference and disagreement in actions of agents, and costs of false declaration/choices in the sequential hypothesis testing. The corresponding sequential decision processes have well
Low jitter spark gap switch for repetitively pulsed parallel capacitor banks

International Nuclear Information System (INIS)

Rohwein, G.J.

1980-01-01

A two-section air insulated spark gap has been developed for switching multi-kilojoule plus-minus charged parallel capacitor banks which operate continuously at pulse rates up to 20 pps. The switch operates with less than 2 ns jitter, recovers its dielectric strength within 2 to 5 ms and has not shown degraded performance in sequential test runs totaling over a million shots. Its estimated life with copper electrodes is > 10^7 shots. All preliminary tests indicate that the switch is suitable for continuous running multi-kilojoule systems operating to at least 20 pps.
A comparison of an algorithm for automated sequential beam orientation selection (Cycle) with simulated annealing

International Nuclear Information System (INIS)

Woudstra, Evert; Heijmen, Ben J M; Storchi, Pascal R M

2008-01-01

Some time ago we developed and published a new deterministic algorithm (called Cycle) for automatic selection of beam orientations in radiotherapy. This algorithm is a plan generation process aiming at the prescribed PTV dose within hard dose and dose-volume constraints. The algorithm allows a large number of input orientations to be used and selects only the most efficient orientations, surviving the selection process. Efficiency is determined by a score function and is more or less equal to the extent of uninhibited access to the PTV for a specific beam during the selection process. In this paper we compare the capabilities of fast-simulated annealing (FSA) and Cycle for cases where local optima are supposed to be present. Five pancreas and five oesophagus cases previously treated in our institute were selected for this comparison. Plans were generated for FSA and Cycle, using the same hard dose and dose-volume constraints, and the largest possible achieved PTV doses as obtained from these algorithms were compared. The largest achieved PTV dose values were generally very similar for the two algorithms. In some cases FSA resulted in a slightly higher PTV dose than Cycle, at the cost of switching on substantially more beam orientations than Cycle. In other cases, when Cycle generated the solution with the highest PTV dose using only a limited number of non-zero weight beams, FSA seemed to have some difficulty in switching off the unfavourable directions. Cycle was faster than FSA, especially for large-dimensional feasible spaces. In conclusion, for the cases studied in this paper, we have found that despite the inherent drawback of sequential search as used by Cycle (where Cycle could probably get trapped in a local optimum), Cycle is nevertheless able to find comparable or sometimes slightly better treatment plans in comparison with FSA (which in theory finds the global optimum) especially in large-dimensional beam weight spaces
Sequential stochastic optimization

CERN Document Server

Cairoli, Renzo

1996-01-01

Sequential Stochastic Optimization provides mathematicians and applied researchers with a well-developed framework in which stochastic optimization problems can be formulated and solved. Offering much material that is either new or has never before appeared in book form, it lucidly presents a unified theory of optimal stopping and optimal sequential control of stochastic processes. This book has been carefully organized so that little prior knowledge of the subject is assumed; its only prerequisites are a standard graduate course in probability theory and some familiarity with discrete-paramet
Simultaneous optimization of sequential IMRT plans

International Nuclear Information System (INIS)

Popple, Richard A.; Prellop, Perri B.; Spencer, Sharon A.; Santos, Jennifer F. de los; Duan, Jun; Fiveash, John B.; Brezovich, Ivan A.

2005-01-01

plans was equivalent to the independently optimized plans actually used for treatment. Tolerance doses of the critical structures were respected for the plan sum; however, the dose to critical structures for the individual initial and boost plans was different between the simultaneously optimized and the independently optimized plans. In conclusion, we have demonstrated a method for optimization of initial and boost plans that treat volume reductions using the same dose per fraction. The method is efficient, as it avoids the iterative approach necessitated by currently available TPSs, and is generalizable to more than two treatment phases. Comparison with clinical plans developed independently suggests that current manual techniques for planning sequential treatments may be suboptimal
Pthreads vs MPI Parallel Performance of Angular-Domain Decomposed S

International Nuclear Information System (INIS)

Azmy, Y.Y.; Barnett, D.A.

2000-01-01

Two programming models for parallelizing the Angular Domain Decomposition (ADD) of the discrete ordinates (S_n) approximation of the neutron transport equation are examined. These are the shared memory model based on the POSIX threads (Pthreads) standard, and the message passing model based on the Message Passing Interface (MPI) standard. These standard libraries are available on most multiprocessor platforms, thus making the resulting parallel codes widely portable. The question is: on a fixed platform, and for a particular code solving a given test problem, which of the two programming models delivers better parallel performance? Such comparison is possible on Symmetric Multi-Processors (SMP) architectures in which several CPUs physically share a common memory, and in addition are capable of emulating message passing functionality. Implementation of the two-dimensional S_n Arbitrarily High Order Transport (AHOT) code for solving neutron transport problems using these two parallelization models is described. Measured parallel performance of each model on the COMPAQ AlphaServer 8400 and the SGI Origin 2000 platforms is described, and comparison of the observed speedup for the two programming models is reported. For the case presented in this paper it appears that the MPI implementation scales better than the Pthreads implementation on both platforms.
Numerical investigation of two interacting parallel thruster-plumes and comparison to experiment

Science.gov (United States)

Grabe, Martin; Holz, André; Ziegenhagen, Stefan; Hannemann, Klaus

2014-12-01

Clusters of orbital thrusters are an attractive option to achieve graduated thrust levels and increased redundancy with available hardware, but the heavily under-expanded plumes of chemical attitude control thrusters placed in close proximity will interact, leading to a local amplification of downstream fluxes and of back-flow onto the spacecraft. The interaction of two similar, parallel, axi-symmetric cold-gas model thrusters has recently been studied in the DLR High-Vacuum Plume Test Facility STG under space-like vacuum conditions, employing a Patterson-type impact pressure probe with slot orifice. We reproduce a selection of these experiments numerically, and emphasise that a comparison of numerical results to the measured data is not straightforward. The signal of the probe used in the experiments must be interpreted according to the degree of rarefaction and local flow Mach number, and both vary dramatically throughout the flow-field. We present a procedure to reconstruct the probe signal by post-processing the numerically obtained flow-field data and show that agreement with the experimental results is then improved. Features of the investigated cold-gas thruster plume interaction are discussed on the basis of the numerical results.
Exploring the sequential lineup advantage using WITNESS.

Science.gov (United States)

Goodsell, Charles A; Gronlund, Scott D; Carlson, Curt A

2010-12-01

Advocates claim that the sequential lineup is an improvement over simultaneous lineup procedures, but no formal (quantitatively specified) explanation exists for why it is better. The computational model WITNESS (Clark, Appl Cogn Psychol 17:629-654, 2003) was used to develop theoretical explanations for the sequential lineup advantage. In its current form, WITNESS produced a sequential advantage only by pairing conservative sequential choosing with liberal simultaneous choosing. However, this combination failed to approximate four extant experiments that exhibited large sequential advantages. Two of these experiments became the focus of our efforts because the data were uncontaminated by likely suspect position effects. Decision-based and memory-based modifications to WITNESS approximated the data and produced a sequential advantage. The next step is to evaluate the proposed explanations and modify public policy recommendations accordingly.
Sequential and simultaneous choices: testing the diet selection and sequential choice models.

Science.gov (United States)

Freidin, Esteban; Aw, Justine; Kacelnik, Alex

2009-03-01

We investigate simultaneous and sequential choices in starlings, using Charnov's Diet Choice Model (DCM) and Shapiro, Siller and Kacelnik's Sequential Choice Model (SCM) to integrate function and mechanism. During a training phase, starlings encountered one food-related option per trial (A, B or R) in random sequence and with equal probability. A and B delivered food rewards after programmed delays (shorter for A), while R ('rejection') moved directly to the next trial without reward. In this phase we measured latencies to respond. In a later, choice, phase, birds encountered the pairs A-B, A-R and B-R, the first implementing a simultaneous choice and the second and third sequential choices. The DCM predicts when R should be chosen to maximize intake rate, and SCM uses latencies of the training phase to predict choices between any pair of options in the choice phase. The predictions of both models coincided, and both successfully predicted the birds' preferences. The DCM does not deal with partial preferences, while the SCM does, and experimental results were strongly correlated to this model's predictions. We believe that the SCM may expose a very general mechanism of animal choice, and that its wider domain of success reflects the greater ecological significance of sequential over simultaneous choices.
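One way to picture how latencies from single-option trials could predict pairwise choices is a simple race between latency samples: the option whose sampled latency is shorter gets chosen. The sketch below estimates that probability from two lists of observed latencies. It is an illustrative assumption about the mechanism, with a hypothetical function name and made-up numbers, and should not be read as the authors' exact formulation of the SCM.

```python
def race_choice_probability(latencies_a, latencies_b):
    """Estimate P(option A is chosen over option B) as the fraction of
    (a, b) latency pairs in which A's latency is shorter.
    Ties are split evenly (an assumption of this sketch)."""
    wins = 0.0
    for a in latencies_a:
        for b in latencies_b:
            if a < b:
                wins += 1.0
            elif a == b:
                wins += 0.5
    return wins / (len(latencies_a) * len(latencies_b))


# Made-up latencies in seconds, purely for illustration.
p_a_over_b = race_choice_probability([1.0, 3.0], [2.0, 4.0])
```

Because the estimate is a probability and not an all-or-nothing rule, a model of this kind can express partial preferences.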

MCBooster: a tool for MC generation for massively parallel platforms

CERN Multimedia

Alves Junior, Antonio Augusto

2016-01-01

MCBooster is a header-only, C++11-compliant library for the generation of large samples of phase-space Monte Carlo events on massively parallel platforms. It was released on GitHub in the spring of 2016. The library core algorithms implement the Raubold-Lynch method; they are able to generate the full kinematics of decays with up to nine particles in the final state. The library supports the generation of sequential decays as well as the parallel evaluation of arbitrary functions over the generated events. The output of MCBooster completely accords with popular and well-tested software packages such as GENBOD (W515 from CERNLIB) and TGenPhaseSpace from the ROOT framework. MCBooster is developed on top of the Thrust library and runs on Linux systems. It deploys transparently on NVidia CUDA-enabled GPUs as well as multicore CPUs. This contribution summarizes the main features of MCBooster. A basic description of the user interface and some examples of applications are provided, along with measurements of perfor...
Sequential memory: Binding dynamics

Science.gov (United States)

Afraimovich, Valentin; Gong, Xue; Rabinovich, Mikhail

2015-10-01

Temporal order memories are critical for everyday animal and human functioning. Experiments and our own experience show that the binding or association of various features of an event together and the maintaining of multimodality events in sequential order are the key components of any sequential memories: episodic, semantic, working, etc. We study the robustness of binding sequential dynamics based on our previously introduced model in the form of generalized Lotka-Volterra equations. In the phase space of the model, there exists a multi-dimensional binding heteroclinic network consisting of saddle equilibrium points and heteroclinic trajectories joining them. We prove here the robustness of the binding sequential dynamics, i.e., the feasibility phenomenon for coupled heteroclinic networks: for each collection of successive heteroclinic trajectories inside the unified networks, there is an open set of initial points such that the trajectory going through each of them follows the prescribed collection, staying in a small neighborhood of it. We also show that the symbolic complexity function of the system restricted to this neighborhood is a polynomial of degree L - 1, where L is the number of modalities.
Thermodynamic performance analysis of sequential Carnot cycles using heat sources with finite heat capacity

International Nuclear Information System (INIS)

Park, Hansaem; Kim, Min Soo

2014-01-01

The maximum efficiency of a heat engine can be estimated by using a Carnot cycle. Although the Carnot cycle serves very well as an efficiency reference, its application is limited to the case of infinite heat reservoirs, which is not very realistic. Moreover, since one of the recent key issues is to produce maximum work from low-temperature, finite heat sources, often associated with renewable energy, more advanced theoretical cycles that can serve as a new standard, and research on them, are necessary. Therefore, in this paper, a sequential Carnot cycle, where multiple Carnot cycles are connected in parallel, is studied. The cycle adopts a finite heat source, which has a certain initial temperature and heat capacity, and an infinite heat sink, which is assumed to be ambient air. Heat transfer processes in the cycle occur with a temperature difference between a heat reservoir and a cycle. In order to resolve the heat transfer rate in those processes, the product of an overall heat transfer coefficient and a heat transfer area is introduced. Using these conditions, the performance of a sequential Carnot cycle is calculated analytically. Furthermore, to enhance the work output of the cycle, an optimization study is also conducted numerically. - Highlights: • Modified sequential Carnot cycles are proposed for evaluating low grade heat sources. • Performance of sequential Carnot cycles is calculated analytically. • Optimization study for the cycle is conducted with numerical solver. • Maximum work from a heat source under a certain condition is obtained by equations
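The idea of extracting work from a finite heat source with a chain of Carnot cycles can be illustrated with a short numerical sketch. It is not the model of the paper: it assumes fully reversible stages and ignores the finite heat transfer coefficient and area that the abstract introduces. The function names, the staging scheme, and the numbers are hypothetical. Each stage cools the source by an equal temperature step and runs a Carnot cycle whose hot side sits at the source temperature leaving that stage; the total approaches the textbook reversible limit C[(T_h - T_0) - T_0 ln(T_h/T_0)] as the number of stages grows.

```python
import math


def staged_carnot_work(heat_capacity, t_source, t_sink, stages):
    """Total work from `stages` ideal Carnot cycles cooling a finite source.

    Assumption (not from the paper): every stage is reversible and its hot
    side operates at the temperature the source has when it leaves the stage.
    """
    dt = (t_source - t_sink) / stages
    work = 0.0
    for i in range(stages):
        t_hot = t_source - (i + 1) * dt      # source temperature after this stage
        heat_in = heat_capacity * dt         # heat released by the source in this stage
        work += heat_in * (1.0 - t_sink / t_hot)  # Carnot efficiency at t_hot
    return work


def reversible_limit(heat_capacity, t_source, t_sink):
    """Maximum work when the source is cooled reversibly down to the sink temperature."""
    return heat_capacity * ((t_source - t_sink) - t_sink * math.log(t_source / t_sink))


if __name__ == "__main__":
    c, th, t0 = 1000.0, 500.0, 300.0  # J/K, K, K (illustrative values only)
    for n in (1, 10, 100, 10000):
        print(n, staged_carnot_work(c, th, t0, n))
    print("limit", reversible_limit(c, th, t0))
```

Under these assumptions a single stage yields no work, because its hot side would sit at the sink temperature, and adding stages raises the total monotonically toward the limit.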
Sequential Probability Ratio Tests: Conservative and Robust

NARCIS (Netherlands)

Kleijnen, J.P.C.; Shi, Wen

2017-01-01

In practice, most computers generate simulation outputs sequentially, so it is attractive to analyze these outputs through sequential statistical methods such as sequential probability ratio tests (SPRTs). We investigate several SPRTs for choosing between two hypothesized values for the mean output
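For orientation, the basic form of a sequential probability ratio test for choosing between two hypothesized means can be sketched as follows. This is the textbook Wald SPRT for normally distributed outputs with a known standard deviation, which is an assumption made here for brevity; it is not one of the specific variants investigated in the paper. The function name and default error rates are hypothetical.

```python
import math


def sprt_normal_mean(samples, mu0, mu1, sigma, alpha=0.05, beta=0.05):
    """Wald's SPRT for H0: mean = mu0 versus H1: mean = mu1.

    Assumes independent normal observations with known standard deviation
    `sigma` (an illustrative simplification). Returns (decision, n_used).
    """
    upper = math.log((1.0 - beta) / alpha)   # accept H1 at or above this bound
    lower = math.log(beta / (1.0 - alpha))   # accept H0 at or below this bound
    llr = 0.0
    n = 0
    for n, x in enumerate(samples, start=1):
        # Log-likelihood ratio contribution of one normal observation.
        llr += (mu1 - mu0) * (x - (mu0 + mu1) / 2.0) / sigma ** 2
        if llr >= upper:
            return "accept H1", n
        if llr <= lower:
            return "accept H0", n
    return "undecided", n


if __name__ == "__main__":
    # Outputs arrive one at a time; the test stops as soon as a bound is crossed.
    print(sprt_normal_mean([1.0] * 20, mu0=0.0, mu1=1.0, sigma=1.0))
```

The point of the sequential form is that the sample size is not fixed in advance: the loop consumes simulation outputs as they are produced and stops at the first boundary crossing.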
Performance assessment of the SIMFAP parallel cluster at IFIN-HH Bucharest

International Nuclear Information System (INIS)

Adam, Gh.; Adam, S.; Ayriyan, A.; Dushanov, E.; Hayryan, E.; Korenkov, V.; Lutsenko, A.; Mitsyn, V.; Sapozhnikova, T.; Sapozhnikov, A; Streltsova, O.; Buzatu, F.; Dulea, M.; Vasile, I.; Sima, A.; Visan, C.; Busa, J.; Pokorny, I.

2008-01-01

Performance assessment and case study outputs of the parallel SIMFAP cluster at IFIN-HH Bucharest point to its effective and reliable operation. A comparison with results on the supercomputing system in LIT-JINR Dubna adds insight on resource allocation for problem solving by parallel computing. The solution of models asking for very large numbers of knots in the discretization mesh needs the migration to high performance computing based on parallel cluster architectures. The acquisition of ready-to-use parallel computing facilities being beyond limited budgetary resources, the solution at IFIN-HH was to buy the hardware and the inter-processor network, and to implement by own efforts the open software concerning both the operating system and the parallel computing standard. The present paper provides a report demonstrating the successful solution of these tasks. The implementation of the well-known HPL (High Performance LINPACK) Benchmark points to the effective and reliable operation of the cluster. The comparison of HPL outputs obtained on parallel clusters of different magnitudes shows that there is an optimum range of the order N of the linear algebraic system over which a given parallel cluster provides optimum parallel solutions. For the SIMFAP cluster, this range can be inferred to correspond to about 1 to 2 × 10^4 linear algebraic equations. For an algorithm of polynomial complexity N^α, the task sharing among p processors within a parallel solution mainly follows an (N/p)^α behaviour under peak performance achievement. Thus, while the problem complexity remains the same, a substantial decrease of the coefficient of the leading order of the polynomial complexity is achieved. (authors)
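The scaling statement at the end of this abstract can be made concrete with a small arithmetic sketch. It simply evaluates the stated (N/p)^α form at face value; the function names and the sample values of N, p, and α are hypothetical and are not taken from the paper.

```python
def per_processor_cost(n, p, alpha, coeff=1.0):
    """Cost model coeff * (n / p) ** alpha, as stated in the abstract."""
    return coeff * (n / p) ** alpha


def leading_coefficient_reduction(p, alpha):
    """Factor by which the coefficient of n ** alpha shrinks: (n/p)^alpha = n^alpha / p^alpha."""
    return p ** alpha


if __name__ == "__main__":
    n, p, alpha = 20000, 8, 3  # illustrative values only
    serial = per_processor_cost(n, 1, alpha)
    shared = per_processor_cost(n, p, alpha)
    print(serial / shared, leading_coefficient_reduction(p, alpha))
```

The exponent α is unchanged by the sharing; only the coefficient in front of N^α is divided by p^α, which is the sense in which the complexity stays the same while the leading coefficient decreases.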
Comparing cluster-level dynamic treatment regimens using sequential, multiple assignment, randomized trials: Regression estimation and sample size considerations.

Science.gov (United States)

NeCamp, Timothy; Kilbourne, Amy; Almirall, Daniel

2017-08-01

Cluster-level dynamic treatment regimens can be used to guide sequential treatment decision-making at the cluster level in order to improve outcomes at the individual or patient-level. In a cluster-level dynamic treatment regimen, the treatment is potentially adapted and re-adapted over time based on changes in the cluster that could be impacted by prior intervention, including aggregate measures of the individuals or patients that compose it. Cluster-randomized sequential multiple assignment randomized trials can be used to answer multiple open questions preventing scientists from developing high-quality cluster-level dynamic treatment regimens. In a cluster-randomized sequential multiple assignment randomized trial, sequential randomizations occur at the cluster level and outcomes are observed at the individual level. This manuscript makes two contributions to the design and analysis of cluster-randomized sequential multiple assignment randomized trials. First, a weighted least squares regression approach is proposed for comparing the mean of a patient-level outcome between the cluster-level dynamic treatment regimens embedded in a sequential multiple assignment randomized trial. The regression approach facilitates the use of baseline covariates which is often critical in the analysis of cluster-level trials. Second, sample size calculators are derived for two common cluster-randomized sequential multiple assignment randomized trial designs for use when the primary aim is a between-dynamic treatment regimen comparison of the mean of a continuous patient-level outcome. The methods are motivated by the Adaptive Implementation of Effective Programs Trial which is, to our knowledge, the first-ever cluster-randomized sequential multiple assignment randomized trial in psychiatry.
Simultaneous activation of parallel sensory pathways promotes a grooming sequence in Drosophila

Science.gov (United States)

Hampel, Stefanie; McKellar, Claire E

2017-01-01

A central model that describes how behavioral sequences are produced features a neural architecture that readies different movements simultaneously, and a mechanism where prioritized suppression between the movements determines their sequential performance. We previously described a model whereby suppression drives a Drosophila grooming sequence that is induced by simultaneous activation of different sensory pathways that each elicit a distinct movement (Seeds et al., 2014). Here, we confirm this model using transgenic expression to identify and optogenetically activate sensory neurons that elicit specific grooming movements. Simultaneous activation of different sensory pathways elicits a grooming sequence that resembles the naturally induced sequence. Moreover, the sequence proceeds after the sensory excitation is terminated, indicating that a persistent trace of this excitation induces the next grooming movement once the previous one is performed. This reveals a mechanism whereby parallel sensory inputs can be integrated and stored to elicit a delayed and sequential grooming response. PMID:28887878
Comparison of phase-constrained parallel MRI approaches: Analogies and differences.

Science.gov (United States)

Blaimer, Martin; Heim, Marius; Neumann, Daniel; Jakob, Peter M; Kannengiesser, Stephan; Breuer, Felix A

2016-03-01

Phase-constrained parallel MRI approaches have the potential for significantly improving the image quality of accelerated MRI scans. The purpose of this study was to investigate the properties of two different phase-constrained parallel MRI formulations, namely the standard phase-constrained approach and the virtual conjugate coil (VCC) concept utilizing conjugate k-space symmetry. Both formulations were combined with image-domain algorithms (SENSE) and a mathematical analysis was performed. Furthermore, the VCC concept was combined with k-space algorithms (GRAPPA and ESPIRiT) for image reconstruction. In vivo experiments were conducted to illustrate analogies and differences between the individual methods. Furthermore, a simple method of improving the signal-to-noise ratio by modifying the sampling scheme was implemented. For SENSE, the VCC concept was mathematically equivalent to the standard phase-constrained formulation and therefore yielded identical results. In conjunction with k-space algorithms, the VCC concept provided more robust results when only a limited amount of calibration data were available. Additionally, VCC-GRAPPA reconstructed images provided spatial phase information with full resolution. Although both phase-constrained parallel MRI formulations are very similar conceptually, there exist important differences between image-domain and k-space domain reconstructions regarding the calibration robustness and the availability of high-resolution phase information. © 2015 Wiley Periodicals, Inc.
Sequential lineup presentation: Patterns and policy

OpenAIRE

Lindsay, R C L; Mansour, Jamal K; Beaudry, J L; Leach, A-M; Bertrand, M I

2009-01-01

Sequential lineups were offered as an alternative to the traditional simultaneous lineup. Sequential lineups reduce incorrect lineup selections; however, the accompanying loss of correct identifications has resulted in controversy regarding adoption of the technique. We discuss the procedure and research relevant to (1) the pattern of results found using sequential versus simultaneous lineups; (2) reasons (theory) for differences in witness responses; (3) two methodological issues; and (4) im...
Sequential Classification of Palm Gestures Based on A* Algorithm and MLP Neural Network for Quadrocopter Control

Directory of Open Access Journals (Sweden)

Wodziński Marek

2017-06-01

This paper presents an alternative approach to sequential data classification, based on traditional machine learning algorithms (neural networks, principal component analysis, multivariate Gaussian anomaly detector) and finding the shortest path in a directed acyclic graph, using the A* algorithm with a regression-based heuristic. Palm gestures were used as an example of the sequential data and a quadrocopter was the controlled object. The study includes creation of a conceptual model and practical construction of a system using the GPU to ensure real-time operation. The results present the classification accuracy of chosen gestures and a comparison of the computation time between the CPU- and GPU-based solutions.
A new parallel algorithm and its simulation on hypercube simulator for low pass digital image filtering using systolic array

International Nuclear Information System (INIS)

Al-Hallaq, A.; Amin, S.

1998-01-01

This paper introduces a new parallel algorithm and its simulation on a hypercube simulator for low pass digital image filtering using a systolic array. This new algorithm is faster than the old one (Amin, 1988). This is due to the fact that the old algorithm carries out the addition operations in a sequential mode, whereas in our new design these addition operations are divided into two groups, which can be performed in parallel. One group is performed on one half of the systolic array and the other on the second half, that is, by folding. This parallelism reduces the time required for the whole process by almost a quarter of the time of the old algorithm. (authors). 18 refs., 3 figs
Sequential Product of Quantum Effects: An Overview

Science.gov (United States)

Gudder, Stan

2010-12-01

This article presents an overview for the theory of sequential products of quantum effects. We first summarize some of the highlights of this relatively recent field of investigation and then provide some new results. We begin by discussing sequential effect algebras which are effect algebras endowed with a sequential product satisfying certain basic conditions. We then consider sequential products of (discrete) quantum measurements. We next treat transition effect matrices (TEMs) and their associated sequential product. A TEM is a matrix whose entries are effects and whose rows form quantum measurements. We show that TEMs can be employed for the study of quantum Markov chains. Finally, we prove some new results concerning TEMs and vector densities.
Optimal Sequential Rules for Computer-Based Instruction.

Science.gov (United States)

Vos, Hans J.

1998-01-01

Formulates sequential rules for adapting the appropriate amount of instruction to learning needs in the context of computer-based instruction. Topics include Bayesian decision theory, threshold and linear-utility structure, psychometric model, optimal sequential number of test questions, and an empirical example of sequential instructional…
Comparison of Software Technologies for Vectorization and Parallelization

CERN Document Server

Lazzaro, Alfio; Nowak, Andrzej; Valsan, Liviu

2012-01-01

This paper demonstrates how modern software development methodologies can be used to give an existing sequential application a considerable performance speed-up on modern x86 server systems. Whereas, in the past, speed-up was directly linked to the increase in clock frequency when moving to a more modern system, current x86 servers present a plethora of “performance dimensions” that need to be harnessed with great care. The application we used is a real-life data analysis example in C++ analyzing High Energy Physics data. The key software methods used are OpenMP, Intel Threading Building Blocks (TBB), Intel Cilk Plus, and the auto-vectorization capability of the Intel compiler (Composer XE). Somewhat surprisingly, the Message Passing Interface (MPI) is successfully added, although our focus is on single-node rather than multi-node performance optimization. The paper underlines the importance of algorithmic redesign in order to optimize each performance dimension and links this to close control of the memo...
Dual-volume excitation and parallel reconstruction for J-difference-edited MR spectroscopy

DEFF Research Database (Denmark)

Oeltzschner, Georg; Puts, Nicolaas A J; Chan, Kimberly L

2017-01-01

successfully reconstructed with a mean in vivo g-factor of 1.025 (typical voxel-center separation: 7-8 cm). MEGA-PRIAM experiments showed higher signal-to-noise ratio than sequential single-voxel experiments of the same total duration (mean improvement 1.38 ± 0.24). CONCLUSIONS: Simultaneous acquisition of J......PURPOSE: To develop J-difference editing with parallel reconstruction in accelerated multivoxel (PRIAM) for simultaneous measurement in two separate brain regions of γ-aminobutyric acid (GABA) or glutathione. METHODS: PRIAM separates signals from two simultaneously excited voxels using receiver...
Quantum Inequalities and Sequential Measurements

International Nuclear Information System (INIS)

Candelpergher, B.; Grandouz, T.; Rubinx, J.L.

2011-01-01

In this article, the peculiar context of sequential measurements is chosen in order to analyze the quantum specificity in the two most famous examples of Heisenberg and Bell inequalities: Results are found at some interesting variance with customary textbook materials, where the context of initial state re-initialization is described. A key-point of the analysis is the possibility of defining Joint Probability Distributions for sequential random variables associated to quantum operators. Within the sequential context, it is shown that Joint Probability Distributions can be defined in situations where not all of the quantum operators (corresponding to random variables) do commute two by two. (authors)
Online Sequential Projection Vector Machine with Adaptive Data Mean Update.

Science.gov (United States)

Chen, Lin; Jia, Ji-Ting; Zhang, Qiong; Deng, Wan-Yu; Wei, Wei

2016-01-01

We propose a simple online learning algorithm especially for high-dimensional data. The algorithm is referred to as online sequential projection vector machine (OSPVM) which derives from projection vector machine and can learn from data in one-by-one or chunk-by-chunk mode. In OSPVM, data centering, dimension reduction, and neural network training are integrated seamlessly. In particular, the model parameters including (1) the projection vectors for dimension reduction, (2) the input weights, biases, and output weights, and (3) the number of hidden nodes can be updated simultaneously. Moreover, only one parameter, the number of hidden nodes, needs to be determined manually, and this makes it easy to use in real applications. Performance comparison was made on various high-dimensional classification problems for OSPVM against other fast online algorithms including budgeted stochastic gradient descent (BSGD) approach, adaptive multihyperplane machine (AMM), primal estimated subgradient solver (Pegasos), online sequential extreme learning machine (OSELM), and SVD + OSELM (feature selection based on SVD is performed before OSELM). The results obtained demonstrated the superior generalization performance and efficiency of the OSPVM.
Online Sequential Projection Vector Machine with Adaptive Data Mean Update

Directory of Open Access Journals (Sweden)

Lin Chen

2016-01-01

We propose a simple online learning algorithm especially for high-dimensional data. The algorithm is referred to as online sequential projection vector machine (OSPVM) which derives from projection vector machine and can learn from data in one-by-one or chunk-by-chunk mode. In OSPVM, data centering, dimension reduction, and neural network training are integrated seamlessly. In particular, the model parameters including (1) the projection vectors for dimension reduction, (2) the input weights, biases, and output weights, and (3) the number of hidden nodes can be updated simultaneously. Moreover, only one parameter, the number of hidden nodes, needs to be determined manually, and this makes it easy to use in real applications. Performance comparison was made on various high-dimensional classification problems for OSPVM against other fast online algorithms including budgeted stochastic gradient descent (BSGD) approach, adaptive multihyperplane machine (AMM), primal estimated subgradient solver (Pegasos), online sequential extreme learning machine (OSELM), and SVD + OSELM (feature selection based on SVD is performed before OSELM). The results obtained demonstrated the superior generalization performance and efficiency of the OSPVM.
A Fast parallel tridiagonal algorithm for a class of CFD applications

Science.gov (United States)

Moitra, Stuti; Sun, Xian-He

1996-01-01

The parallel diagonal dominant (PDD) algorithm is an efficient tridiagonal solver. This paper presents for study a variation of the PDD algorithm, the reduced PDD algorithm. The new algorithm maintains the minimum communication provided by the PDD algorithm, but has a reduced operation count. The PDD algorithm also has a smaller operation count than the conventional sequential algorithm for many applications. Accuracy analysis is provided for the reduced PDD algorithm for symmetric Toeplitz tridiagonal (STT) systems. Implementation results on Langley's Intel Paragon and IBM SP2 show that both the PDD and reduced PDD algorithms are efficient and scalable.
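As a point of reference for the comparison in this abstract, a conventional sequential tridiagonal solver can be sketched as below. The abstract does not name the sequential baseline; the Thomas algorithm (forward elimination followed by back substitution) is assumed here as the usual choice. This is not the PDD or reduced PDD algorithm, and the function names and the sample system are hypothetical.

```python
def thomas_solve(lower, diag, upper, rhs):
    """Solve a tridiagonal system with the sequential Thomas algorithm.

    All four lists have length n. lower[i] and upper[i] are the sub- and
    super-diagonal entries of row i; lower[0] and upper[n-1] are ignored.
    No pivoting is done, so the matrix is assumed diagonally dominant.
    """
    n = len(diag)
    c_prime = [0.0] * n
    d_prime = [0.0] * n
    c_prime[0] = upper[0] / diag[0]
    d_prime[0] = rhs[0] / diag[0]
    # Forward sweep: eliminate the sub-diagonal row by row.
    for i in range(1, n):
        denom = diag[i] - lower[i] * c_prime[i - 1]
        c_prime[i] = upper[i] / denom if i < n - 1 else 0.0
        d_prime[i] = (rhs[i] - lower[i] * d_prime[i - 1]) / denom
    # Back substitution: each unknown depends on the one after it.
    x = [0.0] * n
    x[-1] = d_prime[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d_prime[i] - c_prime[i] * x[i + 1]
    return x


def solve_symmetric_toeplitz(a, b, rhs):
    """Symmetric Toeplitz tridiagonal case: constant diagonal a, off-diagonals b."""
    n = len(rhs)
    return thomas_solve([b] * n, [a] * n, [b] * n, rhs)


if __name__ == "__main__":
    # 4x4 system with diagonal 4 and off-diagonals 1.
    print(solve_symmetric_toeplitz(4.0, 1.0, [6.0, 12.0, 18.0, 19.0]))
```

Both sweeps carry a dependency from one row to the next, which is why this baseline does not parallelize directly and why partitioned schemes such as PDD are of interest.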
COMPARISON OF PARALLEL AND SERIES HYBRID POWERTRAINS FOR TRANSIT BUS APPLICATION

Energy Technology Data Exchange (ETDEWEB)

Gao, Zhiming [ORNL; Daw, C Stuart [ORNL; Smith, David E [ORNL; Jones, Perry T [ORNL; LaClair, Tim J [ORNL; Parks, II, James E [ORNL

2016-01-01

The fuel economy and emissions of both conventional and hybrid buses equipped with emissions aftertreatment were evaluated via computational simulation for six representative city bus drive cycles. Both series and parallel configurations for the hybrid case were studied. The simulation results indicate that series hybrid buses have the greatest overall advantage in fuel economy. The series and parallel hybrid buses were predicted to produce similar CO and HC tailpipe emissions but were also predicted to have reduced NOx tailpipe emissions compared to the conventional bus in higher speed cycles. For the New York bus cycle (NYBC), which has the lowest average speed among the cycles evaluated, the series bus tailpipe emissions were somewhat higher than they were for the conventional bus, while the parallel hybrid bus had significantly lower tailpipe emissions. All three bus powertrains were found to require periodic active DPF regeneration to maintain PM control. Plug-in operation of series hybrid buses appears to offer significant fuel economy benefits and is easily employed due to the relatively large battery capacity that is typical of the series hybrid configuration.

A CS1 pedagogical approach to parallel thinking

Science.gov (United States)

Rague, Brian William

Almost all collegiate programs in Computer Science offer an introductory course in programming primarily devoted to communicating the foundational principles of software design and development. The ACM designates this introduction to computer programming course for first-year students as CS1, during which methodologies for solving problems within a discrete computational context are presented. Logical thinking is highlighted, guided primarily by a sequential approach to algorithm development and made manifest by typically using the latest, commercially successful programming language. In response to the most recent developments in accessible multicore computers, instructors of these introductory classes may wish to include training on how to design workable parallel code. Novel issues arise when programming concurrent applications which can make teaching these concepts to beginning programmers a seemingly formidable task. Student comprehension of design strategies related to parallel systems should be monitored to ensure an effective classroom experience. This research investigated the feasibility of integrating parallel computing concepts into the first-year CS classroom. To quantitatively assess student comprehension of parallel computing, an experimental educational study using a two-factor mixed group design was conducted to evaluate two instructional interventions in addition to a control group: (1) topic lecture only, and (2) topic lecture with laboratory work using a software visualization Parallel Analysis Tool (PAT) specifically designed for this project. A new evaluation instrument developed for this study, the Perceptions of Parallelism Survey (PoPS), was used to measure student learning regarding parallel systems. The results from this educational study show a statistically significant main effect among the repeated measures, implying that student comprehension levels of parallel concepts as measured by the PoPS improve immediately after the delivery of
READ-EVAL-PRINT in Parallel and Asynchronous Proof-checking

Directory of Open Access Journals (Sweden)

Makarius Wenzel

2013-07-01

The LCF tradition of interactive theorem proving, which was started by Milner in the 1970s, appears to be tied to the classic READ-EVAL-PRINT-LOOP of sequential and synchronous evaluation of prover commands. We break up this loop and retrofit the read-eval-print phases into a model of parallel and asynchronous proof processing. Thus we explain some key concepts of the Isabelle/Scala approach to prover interaction and integration, and the Isabelle/jEdit Prover IDE as front-end technology. We hope to open up the scientific discussion about non-trivial interaction models for ITP systems again, and help get other old-school proof assistants on a similar track.
Statistical 3D damage accumulation model for ion implant simulators

CERN Document Server

Hernandez-Mangas, J M; Enriquez, L E; Bailon, L; Barbolla, J; Jaraiz, M

2003-01-01

A statistical 3D damage accumulation model, based on the modified Kinchin-Pease formula, for ion implant simulation has been included in our physically based ion implantation code. It has only one fitting parameter for electronic stopping and uses 3D electron density distributions for different types of targets including compound semiconductors. Also, a statistical noise reduction mechanism based on the dose division is used. The model has been adapted to be run under parallel execution in order to speed up the calculation in 3D structures. Sequential ion implantation has been modelled including previous damage profiles. It can also simulate the implantation of molecular and cluster projectiles. Comparisons of simulated doping profiles with experimental SIMS profiles are presented. Also comparisons between simulated amorphization and experimental RBS profiles are shown. An analysis of sequential versus parallel processing is provided.
Statistical 3D damage accumulation model for ion implant simulators

International Nuclear Information System (INIS)

Hernandez-Mangas, J.M.; Lazaro, J.; Enriquez, L.; Bailon, L.; Barbolla, J.; Jaraiz, M.

2003-01-01

A statistical 3D damage accumulation model, based on the modified Kinchin-Pease formula, for ion implant simulation has been included in our physically based ion implantation code. It has only one fitting parameter for electronic stopping and uses 3D electron density distributions for different types of targets including compound semiconductors. Also, a statistical noise reduction mechanism based on the dose division is used. The model has been adapted to be run under parallel execution in order to speed up the calculation in 3D structures. Sequential ion implantation has been modelled including previous damage profiles. It can also simulate the implantation of molecular and cluster projectiles. Comparisons of simulated doping profiles with experimental SIMS profiles are presented. Also comparisons between simulated amorphization and experimental RBS profiles are shown. An analysis of sequential versus parallel processing is provided
Acceleration of cardiovascular MRI using parallel imaging: basic principles, practical considerations, clinical applications and future directions

International Nuclear Information System (INIS)

Niendorf, T.; Sodickson, D.

2006-01-01

Cardiovascular Magnetic Resonance (CVMR) imaging has proven to be of clinical value for non-invasive diagnostic imaging of cardiovascular diseases. CVMR requires rapid imaging; however, the speed of conventional MRI is fundamentally limited due to its sequential approach to image acquisition, in which data points are collected one after the other in the presence of sequentially-applied magnetic field gradients and radiofrequency pulses. Parallel MRI uses arrays of radiofrequency coils to acquire multiple data points simultaneously, and thereby to increase imaging speed and efficiency beyond the limits of purely gradient-based approaches. The resulting improvements in imaging speed can be used in various ways, including shortening long examinations, improving spatial resolution and anatomic coverage, improving temporal resolution, enhancing image quality, overcoming physiological constraints, detecting and correcting for physiologic motion, and streamlining work flow. Examples of these strategies will be provided in this review, after some of the fundamentals of parallel imaging methods now in use for cardiovascular MRI are outlined. The emphasis will rest upon basic principles and clinical state-of-the-art cardiovascular MRI applications. In addition, practical aspects such as signal-to-noise ratio considerations, tailored parallel imaging protocols and potential artifacts will be discussed, and current trends and future directions will be explored. (orig.)
Analysis of IDR(s) Family of Solvers for Reservoir Simulations on Different Parallel Architectures

Directory of Open Access Journals (Sweden)

Seignole Vincent

2016-09-01

The present contribution provides a detailed analysis of several realizations of the IDR(s) family of solvers from several angles: robustness, performance, and implementation on different parallel environments, with regard to a sequential IDR(s) implementation, tested on several industrial, geologically and structurally coherent 3D field-case reservoir models. This work is the result of continuous efforts towards improving the response time of Storengy's three-dimensional reservoir simulator named Multi, dedicated to gas-storage applications.
A non overlapping parallel domain decomposition method applied to the simplified transport equations

International Nuclear Information System (INIS)

Lathuiliere, B.; Barrault, M.; Ramet, P.; Roman, J.

2009-01-01

A reactivity computation requires computing the highest eigenvalue of a generalized eigenvalue problem. An inverse power algorithm is commonly used. Very fine models are difficult to tackle with our sequential solver, based on the simplified transport equations, in terms of memory consumption and computational time. We therefore propose a non-overlapping domain decomposition method for the approximate solution of the linear system to be solved at each inverse power iteration. Our method requires little development effort, as the inner multigroup solver can be reused without modification, and allows us to adapt the numerical resolution (mesh, finite element order) locally. Numerical results are obtained by a parallel implementation of the method on two different cases with a pin-by-pin discretization. These results are analyzed in terms of memory consumption and parallel efficiency. (authors)
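The outer iteration mentioned in this abstract can be sketched in a few lines. The sketch assumes the generalized problem has the form A x = (1/k) B x and that the dominant k is positive, so that repeatedly solving A y = B x and normalizing converges to it. The small dense matrices, the Gaussian elimination solver, and all names are illustrative assumptions; the paper replaces the linear solve with a domain decomposition method, which is not reproduced here.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting (dense, small n)."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def matvec(a, x):
    """Dense matrix-vector product."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]


def inverse_power(a, b, iterations=200, tol=1e-12):
    """Dominant k of a x = (1/k) b x, found by iterating x <- a^{-1} b x."""
    x = [1.0] * len(a)
    k = 0.0
    for _ in range(iterations):
        y = solve_linear(a, matvec(b, x))   # one linear solve per outer iteration
        k_new = max(abs(v) for v in y)      # x has max-norm 1, so this estimates k
        x = [v / k_new for v in y]
        if abs(k_new - k) < tol:
            return k_new, x
        k = k_new
    return k, x


if __name__ == "__main__":
    a = [[2.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 2.0]]
    b = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    print(inverse_power(a, b)[0])
```

The linear solve inside the loop dominates the cost, which is why the method described above targets that step for parallelization while leaving the outer iteration unchanged.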
Physico-chemical and viscoelastic properties of high pressure homogenized lemon peel fiber fraction suspensions obtained after sequential pectin extraction

NARCIS (Netherlands)

Willemsen, K.L.D.D.; Panozzo, A.; Moelants, K.; Debon, S.J.J.; Desmet, C.; Cardinaels, R.M.; Moldenaers, P.; Wallecan, J.; Hendrickx, M.E.G.

2017-01-01

The viscoelastic properties of high pressure homogenized lemon peel cell wall fiber suspensions, obtained after sequential selective pectin extraction, were investigated in the current study. For comparison, a general pectin extraction was additionally performed on lemon peel under acid thermal
The effect of plasma fluctuations on parallel transport parameters in the SOL

DEFF Research Database (Denmark)

Havlíčková, E.; Fundamenski, W.; Naulin, Volker

2011-01-01

The effect of plasma fluctuations due to turbulence at the outboard midplane on parallel transport properties is investigated. Time-dependent fluctuating signals at different radial locations are used to study the effect of signal statistics. Further, a computational analysis of parallel transport...... to a comparison of steady-state and time-dependent modelling....
Data-parallel tomographic reconstruction : A comparison of filtered backprojection and direct Fourier reconstruction

NARCIS (Netherlands)

Roerdink, J.B.T.M.; Westenberg, M.A

1998-01-01

We consider the parallelization of two standard 2D reconstruction algorithms, filtered backprojection and direct Fourier reconstruction, using the data-parallel programming style. The algorithms are implemented on a Connection Machine CM-5 with 16 processors and a peak performance of 2 Gflop/s.
Sequential Generalized Transforms on Function Space

Directory of Open Access Journals (Sweden)

Jae Gil Choi

2013-01-01

We define two sequential transforms on the function space C_{a,b}[0,T] induced by a generalized Brownian motion process. We then establish the existence of the sequential transforms for functionals in a Banach algebra of functionals on C_{a,b}[0,T]. We also establish that any one of these transforms acts like an inverse transform of the other transform. Finally, we give some remarks about certain relations between our sequential transforms and other well-known transforms on C_{a,b}[0,T].
Improvements to parallel plate flow chambers to reduce reagent and cellular requirements

Directory of Open Access Journals (Sweden)

Larson Richard S

2001-09-01

Background: The parallel plate flow chamber has become a mainstay for examination of leukocytes under physiologic flow conditions. Several design modifications have occurred over the years, yet a comparison of these different designs has not been performed. In addition, the reagent requirements of many designs prohibit the study of rare leukocyte populations and require large amounts of reagents. Results: In this study, we evaluate modifications to a newer parallel plate flow chamber design in comparison to the original parallel plate flow chamber described by Lawrence et al. We show that modifications in the chamber size, internal tubing diameters, injection valves, and a recirculation design may dramatically reduce the cellular and reagent requirements without altering measurements. Conclusions: These modifications are simple and easily implemented so that study of rare leukocyte subsets using scarce or expensive reagents can occur.
Forced Sequence Sequential Decoding

DEFF Research Database (Denmark)

Jensen, Ole Riis; Paaske, Erik

1998-01-01

We describe a new concatenated decoding scheme based on iterations between an inner sequentially decoded convolutional code of rate R=1/4 and memory M=23, and block interleaved outer Reed-Solomon (RS) codes with nonuniform profile. With this scheme decoding with good performance is possible as low...... as Eb/N0=0.6 dB, which is about 1.25 dB below the signal-to-noise ratio (SNR) that marks the cutoff rate for the full system. Accounting for about 0.45 dB due to the outer codes, sequential decoding takes place at about 1.7 dB below the SNR cutoff rate for the convolutional code. This is possible since...... the iteration process provides the sequential decoders with side information that allows a smaller average load and minimizes the probability of computational overflow. Analytical results for the probability that the first RS word is decoded after C computations are presented. These results are supported...
A fast and efficient method for sequential cone-beam tomography

International Nuclear Information System (INIS)

Koehler, Th.; Proksa, R.; Grass, M.

2001-01-01

Sequential cone-beam tomography is a method that uses data of two or more parallel circular trajectories of a cone-beam scanner to reconstruct the object function. We propose a condition for the data acquisition that ensures that all object points between two successive circles are irradiated over an angular span of the x-ray source position of exactly 360 deg. in total as seen along the rotation axis. A fast and efficient approximate reconstruction method for the proposed acquisition is presented which uses data from exactly 360 deg. for every object point. It is based on the Tent-FDK method, which was recently developed for single circular cone-beam CT. The measurement geometry does not provide sufficient data for exact reconstruction, but it is shown that the proposed reconstruction method provides satisfactory image quality for small cone angles.
Time-resolved echo-shared parallel MRA of the lung: observer preference study of image quality in comparison with non-echo-shared sequences

International Nuclear Information System (INIS)

Fink, C.; Puderbach, M.; Zaporozhan, J.; Plathow, C.; Kauczor, H.-U.; Ley, S.

2005-01-01

The aim of this study was to evaluate the image quality of time-resolved echo-shared parallel MRA of the lung. The pulmonary vasculature of nine patients (seven females, two males; median age: 44 years) with pulmonary disease was examined using a time-resolved MRA sequence combining echo sharing with parallel imaging (time-resolved echo-shared angiography technique, or TREAT). The sharpness of the vessel borders, conspicuousness of peripheral lung vessels, artifact level, and overall image quality of TREAT were assessed independently by four readers in a side-by-side comparison with non-echo-shared time-resolved parallel MRA data (pMRA) previously acquired in the same patients. Furthermore, the SNR of pulmonary arteries (PA) and veins (PV) achieved with both pulse sequences was compared. The mean voxel size of TREAT MRA was decreased by 24% compared with the non-echo-shared MRA. Regarding the sharpness of the vessel borders, conspicuousness of peripheral lung vessels, and overall image quality, the TREAT sequence was rated superior in 75-76% of all cases. When the TREAT images were preferred over the pMRA images, the advantage was rated as major in 61-71% of all cases. The level of artifacts was not increased with the TREAT sequence. The mean interobserver agreement for all categories ranged between fair (artifact level) and good (overall image quality). The maximum SNR of TREAT did not differ from non-echo-shared parallel MRA (PA: TREAT: 273±45; pMRA: 280±71; PV: TREAT: 273±33; pMRA: 258±62). TREAT achieves a higher spatial resolution than non-echo-shared parallel MRA, which is also perceived as an improved image quality. (orig.)
SequenceL: Automated Parallel Algorithms Derived from CSP-NT Computational Laws

Science.gov (United States)

Cooke, Daniel; Rushton, Nelson

2013-01-01

With the introduction of new parallel architectures like the Cell and multicore chips from IBM, Intel, AMD, and ARM, as well as the petascale processing available for high-end computing, a larger number of programmers will need to write parallel codes. Adding the parallel control structure to the sequence, selection, and iterative control constructs increases the complexity of code development, which often results in increased development costs and decreased reliability. SequenceL is a high-level programming language, that is, a programming language that is closer to a human's way of thinking than to a machine's. Historically, high-level languages have resulted in decreased development costs and increased reliability, at the expense of performance. In recent applications at JSC and in industry, SequenceL has demonstrated the usual advantages of high-level programming in terms of low cost and high reliability. SequenceL programs, however, have run at speeds typically comparable with, and in many cases faster than, their counterparts written in C and C++ when run on single-core processors. Moreover, SequenceL is able to generate parallel executables automatically for multicore hardware, gaining parallel speedups without any extra effort from the programmer beyond what is required to write the sequential/single-core code. A SequenceL-to-C++ translator has been developed that automatically renders readable multithreaded C++ from a combination of a SequenceL program and sample data input. The SequenceL language is based on two fundamental computational laws, Consume-Simplify-Produce (CSP) and Normalize-Transpose (NT), which enable it to automate the creation of parallel algorithms from high-level code that has no annotations of parallelism whatsoever. In our anecdotal experience, SequenceL development has been in every case less costly than development of the same algorithm in sequential (that is, single-core, single process) C or C++, and an order of magnitude less
Sequential probability ratio controllers for safeguards radiation monitors

International Nuclear Information System (INIS)

Fehlau, P.E.; Coop, K.L.; Nixon, K.V.

1984-01-01

Sequential hypothesis tests applied to nuclear safeguards accounting methods make the methods more sensitive to detecting diversion. The sequential tests also improve transient signal detection in safeguards radiation monitors. This paper describes three microprocessor control units with sequential probability-ratio tests for detecting transient increases in radiation intensity. The control units are designed for three specific applications: low-intensity monitoring with Poisson probability ratios, higher intensity gamma-ray monitoring where fixed counting intervals are shortened by sequential testing, and monitoring moving traffic where the sequential technique responds to variable-duration signals. The fixed-interval controller shortens a customary 50-s monitoring time to an average of 18 s, making the monitoring delay less bothersome. The controller for monitoring moving vehicles benefits from the sequential technique by maintaining more than half its sensitivity when the normal passage speed doubles.
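The Poisson probability-ratio test mentioned in this abstract can be illustrated with a minimal sketch of Wald's sequential probability ratio test. This is not the controllers' firmware: the function name `poisson_sprt`, the background and alarm count rates, and the error probabilities `alpha` and `beta` are all assumptions chosen for illustration.

```python
import math

def poisson_sprt(counts, bkg_rate, alarm_rate, alpha=0.01, beta=0.05):
    """Wald sequential probability ratio test on per-interval Poisson counts.

    H0: mean counts per interval = bkg_rate
    H1: mean counts per interval = alarm_rate (assumed > bkg_rate)
    Returns (decision, intervals_used), where decision is
    "alarm", "background" or "undecided".
    """
    upper = math.log((1 - beta) / alpha)   # accept H1 at or above this
    lower = math.log(beta / (1 - alpha))   # accept H0 at or below this
    llr = 0.0                              # cumulative log-likelihood ratio
    for n, k in enumerate(counts, start=1):
        # Log-likelihood ratio contributed by one Poisson observation k.
        llr += k * math.log(alarm_rate / bkg_rate) - (alarm_rate - bkg_rate)
        if llr >= upper:
            return "alarm", n
        if llr <= lower:
            return "background", n
    return "undecided", len(counts)

# Hypothetical usage: 10 counts per interval expected from background,
# 15 counts per interval taken as the alarm level.
print(poisson_sprt([10, 11, 9, 10, 10, 10], 10.0, 15.0))
```

Because the test stops as soon as either boundary is crossed, a clearly background-like or clearly elevated signal is decided in fewer intervals than a fixed-length count would need, which is the effect the abstract reports.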
Proceedings of the workshop on Compilation of (Symbolic) Languages for Parallel Computers

Energy Technology Data Exchange (ETDEWEB)

Foster, I.; Tick, E. (comp.)

1991-11-01

This report comprises the abstracts and papers for the talks presented at the Workshop on Compilation of (Symbolic) Languages for Parallel Computers, held October 31--November 1, 1991, in San Diego. These unrefereed contributions were provided by the participants for the purpose of this workshop; many of them will be published elsewhere in peer-reviewed conferences and publications. Our goal in planning this workshop was to bring together researchers from different disciplines with common problems in compilation. In particular, we wished to encourage interaction between researchers working in compilation of symbolic languages and those working on compilation of conventional, imperative languages. The fundamental problems facing researchers interested in compilation of logic, functional, and procedural programming languages for parallel computers are essentially the same. However, differences in the basic programming paradigms have led to different communities emphasizing different species of the parallel compilation problem. For example, parallel logic and functional languages provide dataflow-like formalisms in which control dependencies are unimportant. Hence, a major focus of research in compilation has been on techniques that try to infer when sequential control flow can safely be imposed. Granularity analysis for scheduling is a related problem. The single-assignment property leads to a need for analysis of memory use in order to detect opportunities for reuse. Much of the work in each of these areas relies on the use of abstract interpretation techniques.
Biased lineups: sequential presentation reduces the problem.

Science.gov (United States)

Lindsay, R C; Lea, J A; Nosworthy, G J; Fulford, J A; Hector, J; LeVan, V; Seabrook, C

1991-12-01

Biased lineups have been shown to significantly increase false, but not correct, identification rates (Lindsay, Wallbridge, & Drennan, 1987; Lindsay & Wells, 1980; Malpass & Devine, 1981). Lindsay and Wells (1985) found that sequential lineup presentation reduced false identification rates, presumably by reducing reliance on relative judgment processes. Five staged-crime experiments were conducted to examine the effect of lineup biases and sequential presentation on eyewitness recognition accuracy. Sequential lineup presentation significantly reduced false identification rates from fair lineups as well as from lineups biased with regard to foil similarity, instructions, or witness attire, and from lineups biased in all of these ways. The results support recommendations that police present lineups sequentially.
Parallelization of one image compression method. Wavelet, Transform, Vector Quantization and Huffman Coding

International Nuclear Information System (INIS)

Moravie, Philippe

1997-01-01

Today, in the digitized satellite image domain, the need for high-dimension images is increasing considerably. To transmit or store such images (more than 6000 by 6000 pixels), we need to reduce their data volume, and so we have to use real-time image compression techniques. The large amount of computation required by image compression algorithms rules out common sequential processors in favor of parallel computers. The study presented here deals with the parallelization of a very efficient image compression scheme based on three techniques: Wavelet Transform (WT), Vector Quantization (VQ) and Entropic Coding (EC). First, we studied and implemented the parallelism of each algorithm in order to determine the architectural characteristics needed for real-time image compression. Then, we defined eight parallel architectures: 3 for the Mallat algorithm (WT), 3 for Tree-Structured Vector Quantization (VQ) and 2 for Huffman Coding (EC). As our system has to be multi-purpose, we chose 3 global architectures among all of the 3x3x2 systems available. Because, for technological reasons, real time is not always reached (for all combinations of compression parameters), we also defined and evaluated two algorithmic optimizations: fixed-point precision and merging entropic coding into vector quantization. As a result, we defined a new multi-purpose multi-SMIMD parallel machine able to compress digitized satellite images in real time. The question of the best-suited architecture for real-time image compression was answered by presenting 3 parallel machines, one of which is multi-purpose and embedded and might be used for other on-board applications. (author) [fr

A parallel approximate string matching under Levenshtein distance on graphics processing units using warp-shuffle operations.

Directory of Open Access Journals (Sweden)

ThienLuan Ho

Approximate string matching with k-differences has a number of practical applications, ranging from pattern recognition to computational biology. This paper proposes an efficient memory-access algorithm for parallel approximate string matching with k-differences on Graphics Processing Units (GPUs). In the proposed algorithm, all threads in the same GPU warp share data using warp-shuffle operations instead of accessing the shared memory. Moreover, we implement the proposed algorithm by exploiting the memory structure of GPUs to optimize its performance. Experimental results for real DNA packages revealed that the proposed algorithm and its implementation achieved speedups of up to 122.64 and 1.53 times over the sequential algorithm on a CPU and a previous parallel approximate string matching algorithm on GPUs, respectively.
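For orientation, the k-differences problem itself can be stated as a short sequential dynamic program. The sketch below is the textbook column-by-column recurrence under Levenshtein distance, not the warp-shuffle GPU algorithm of the paper; the function name `k_difference_matches` and the sample strings are illustrative assumptions.

```python
def k_difference_matches(pattern, text, k):
    """Return 1-based end positions in text where pattern matches
    some substring with at most k edits (Levenshtein distance)."""
    m = len(pattern)
    prev = list(range(m + 1))      # column for the empty text prefix: D[i][0] = i
    ends = []
    for j, tc in enumerate(text, start=1):
        cur = [0] * (m + 1)        # D[0][j] = 0: a match may start anywhere
        for i in range(1, m + 1):
            cost = 0 if pattern[i - 1] == tc else 1
            cur[i] = min(prev[i - 1] + cost,  # match or substitution
                         prev[i] + 1,         # extra character in the text
                         cur[i - 1] + 1)      # pattern character missing from the text
        if cur[m] <= k:
            ends.append(j)
        prev = cur
    return ends

# Hypothetical usage on a short DNA-like string.
print(k_difference_matches("ACGT", "TTACGTTT", 1))
```

Each column depends only on the previous one, so memory use is proportional to the pattern length; parallel variants differ mainly in how the cells of this table are distributed across threads.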
Timed sequential chemotherapy of cytoxan-refractory multiple myeloma with cytoxan and adriamycin based on induced tumor proliferation.

Science.gov (United States)

Karp, J E; Humphrey, R L; Burke, P J

1981-03-01

Malignant plasma cell proliferation and induced humoral stimulatory activity (HSA) occur in vivo at a predictable time following drug administration. Sequential sera from 11 patients with poor-risk multiple myeloma (MM) undergoing treatment with Cytoxan (CY) 2400 mg/sq m were assayed for their in vitro effects on malignant bone marrow plasma cell tritiated thymidine (3HTdR) incorporation. Peak HSA was detected on day 9 following CY. Sequential changes in marrow malignant plasma cell 3HTdR-labeling indices (LI) paralleled changes in serum activity, with peak LI occurring at the time of peak HSA. An in vitro model of chemotherapy demonstrated that malignant plasma cell proliferation was enhanced by HSA, as determined by 3HTdR incorporation assay, 3HTdR LI, and tumor cell counts, and that stimulated plasma cells were more sensitive to cytotoxic effects of adriamycin (ADR) than were cells cultured in autologous pretreatment serum. Based on these studies, we designed a clinical trial to treat 12 CY-refractory poor-risk patients with MM in which ADR (60 mg/sq m) was administered at the time of peak HSA and residual tumor cell LI (day 9) following initial CY, 2400 mg/sq m (CY1ADR9). Eight of 12 (67%) responded to timed sequential chemotherapy with a greater than 50% decrement in monoclonal protein marker and a median survival projected to be greater than 8 mo duration (range 4-21+ mo). These clinical results using timed sequential CY1ADR9 compare favorably with results obtained using ADR in nonsequential chemotherapeutic regimens.
Lineup composition, suspect position, and the sequential lineup advantage.

Science.gov (United States)

Carlson, Curt A; Gronlund, Scott D; Clark, Steven E

2008-06-01

N. M. Steblay, J. Dysart, S. Fulero, and R. C. L. Lindsay (2001) argued that sequential lineups reduce the likelihood of mistaken eyewitness identification. Experiment 1 replicated the design of R. C. L. Lindsay and G. L. Wells (1985), the first study to show the sequential lineup advantage. However, the innocent suspect was chosen at a lower rate in the simultaneous lineup, and no sequential lineup advantage was found. This led the authors to hypothesize that protection from a sequential lineup might emerge only when an innocent suspect stands out from the other lineup members. In Experiment 2, participants viewed a simultaneous or sequential lineup with either the guilty suspect or 1 of 3 innocent suspects. Lineup fairness was varied to influence the degree to which a suspect stood out. A sequential lineup advantage was found only for the unfair lineups. Additional analyses of suspect position in the sequential lineups showed an increase in the diagnosticity of suspect identifications as the suspect was placed later in the sequential lineup. These results suggest that the sequential lineup advantage is dependent on lineup composition and suspect position. (c) 2008 APA, all rights reserved
Limited angle tomographic breast imaging: A comparison of parallel beam and pinhole collimation

International Nuclear Information System (INIS)

Wessell, D.E.; Kadrmas, D.J.; Frey, E.C.

1996-01-01

Results from clinical trials have suggested no improvement in lesion detection with parallel hole SPECT scintimammography (SM) with Tc-99m over parallel hole planar SM. In this initial investigation, we have elucidated some of the unique requirements of SPECT SM. With these requirements in mind, we have begun to develop practical data acquisition and reconstruction strategies that can reduce image artifacts and improve image quality. In this paper we investigate limited angle orbits for both parallel hole and pinhole SPECT SM. Singular Value Decomposition (SVD) is used to analyze the artifacts associated with the limited angle orbits. Maximum likelihood expectation maximization (MLEM) reconstructions are then used to examine the effects of attenuation compensation on the quality of the reconstructed image. All simulations are performed using the 3D-MCAT breast phantom. The results of these simulation studies demonstrate that limited angle SPECT SM is feasible, that attenuation correction is needed for accurate reconstructions, and that pinhole SPECT SM may have an advantage over parallel hole SPECT SM in terms of improved image quality and reduced image artifacts.
Efficient parallel and out of core algorithms for constructing large bi-directed de Bruijn graphs

Directory of Open Access Journals (Sweden)

Vaughn Matthew

2010-11-01

Background: Assembling genomic sequences from a set of overlapping reads is one of the most fundamental problems in computational biology. Algorithms addressing the assembly problem fall into two broad categories, based on the data structures which they employ. The first class uses an overlap/string graph and the second type uses a de Bruijn graph. However, with the recent advances in short read sequencing technology, de Bruijn graph based algorithms seem to play a vital role in practice. Efficient algorithms for building these massive de Bruijn graphs are essential in large sequencing projects based on short reads. In an earlier work, an O(n/p) time parallel algorithm has been given for this problem. Here n is the size of the input and p is the number of processors. This algorithm enumerates all possible bi-directed edges which can overlap with a node and ends up generating Θ(nΣ) messages (Σ being the size of the alphabet). Results: In this paper we present a Θ(n/p) time parallel algorithm with a communication complexity that is equal to that of parallel sorting and is not sensitive to Σ. The generality of our algorithm makes it very easy to extend it even to the out-of-core model, and in this case it has an optimal I/O complexity of Θ(n log(n/B) / (B log(M/B))) (M being the main memory size and B being the size of the disk block). We demonstrate the scalability of our parallel algorithm on an SGI/Altix computer. A comparison of our algorithm with the previous approaches reveals that our algorithm is faster, both asymptotically and practically. We demonstrate the scalability of our sequential out-of-core algorithm by comparing it with the algorithm used by VELVET to build the bi-directed de Bruijn graph. Our experiments reveal that our algorithm can build the graph with a constant amount of memory, which clearly outperforms VELVET. We also provide efficient algorithms for the bi-directed chain compaction problem. Conclusions: The bi
Efficient parallel and out of core algorithms for constructing large bi-directed de Bruijn graphs.

Science.gov (United States)

Kundeti, Vamsi K; Rajasekaran, Sanguthevar; Dinh, Hieu; Vaughn, Matthew; Thapar, Vishal

2010-11-15

Assembling genomic sequences from a set of overlapping reads is one of the most fundamental problems in computational biology. Algorithms addressing the assembly problem fall into two broad categories, based on the data structures which they employ. The first class uses an overlap/string graph and the second type uses a de Bruijn graph. However, with the recent advances in short read sequencing technology, de Bruijn graph based algorithms seem to play a vital role in practice. Efficient algorithms for building these massive de Bruijn graphs are essential in large sequencing projects based on short reads. In an earlier work, an O(n/p) time parallel algorithm has been given for this problem. Here n is the size of the input and p is the number of processors. This algorithm enumerates all possible bi-directed edges which can overlap with a node and ends up generating Θ(nΣ) messages (Σ being the size of the alphabet). In this paper we present a Θ(n/p) time parallel algorithm with a communication complexity that is equal to that of parallel sorting and is not sensitive to Σ. The generality of our algorithm makes it very easy to extend it even to the out-of-core model, and in this case it has an optimal I/O complexity of Θ(n log(n/B) / (B log(M/B))) (M being the main memory size and B being the size of the disk block). We demonstrate the scalability of our parallel algorithm on an SGI/Altix computer. A comparison of our algorithm with the previous approaches reveals that our algorithm is faster, both asymptotically and practically. We demonstrate the scalability of our sequential out-of-core algorithm by comparing it with the algorithm used by VELVET to build the bi-directed de Bruijn graph. Our experiments reveal that our algorithm can build the graph with a constant amount of memory, which clearly outperforms VELVET. We also provide efficient algorithms for the bi-directed chain compaction problem. The bi-directed de Bruijn graph is a fundamental data structure for
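To make the data structure concrete, here is a minimal in-memory sketch of a bi-directed de Bruijn graph in which each k-mer is merged with its reverse complement and each observed (k+1)-mer contributes one edge with an orientation flag at both ends. It is a naive sequential construction, not the sorting-based parallel or out-of-core algorithm of the paper; the function names are hypothetical, and an odd k is assumed so that no k-mer equals its own reverse complement.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def reverse_complement(seq):
    return "".join(COMPLEMENT[base] for base in reversed(seq))

def canonical(kmer):
    """Return (canonical form, True if kmer already is the canonical form)."""
    rc = reverse_complement(kmer)
    return (kmer, True) if kmer <= rc else (rc, False)

def bidirected_edges(reads, k):
    """Collect one edge per distinct (k+1)-mer, identified up to strand.

    An edge is (node_u, u_forward, node_v, v_forward); reading the same
    (k+1)-mer on the opposite strand yields the same edge. Assumes odd k.
    """
    edges = set()
    for read in reads:
        for i in range(len(read) - k):
            u, u_fwd = canonical(read[i:i + k])
            v, v_fwd = canonical(read[i + 1:i + k + 1])
            edge = (u, u_fwd, v, v_fwd)
            flipped = (v, not v_fwd, u, not u_fwd)  # same edge seen from the other strand
            edges.add(min(edge, flipped))
    return edges

# Hypothetical usage: a read and its reverse complement give the same edge set.
print(bidirected_edges(["AACG"], 3) == bidirected_edges(["CGTT"], 3))
```

This version keeps the whole edge set in memory, which is exactly the cost that the sorting-based and out-of-core formulations described in the abstract are designed to avoid.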
STOCHSIMGPU: parallel stochastic simulation for the Systems Biology Toolbox 2 for MATLAB

KAUST Repository

Klingbeil, G.

2011-02-25

Motivation: The importance of stochasticity in biological systems is becoming increasingly recognized and the computational cost of biologically realistic stochastic simulations urgently requires development of efficient software. We present a new software tool STOCHSIMGPU that exploits graphics processing units (GPUs) for parallel stochastic simulations of biological/chemical reaction systems and show that significant gains in efficiency can be made. It is integrated into MATLAB and works with the Systems Biology Toolbox 2 (SBTOOLBOX2) for MATLAB. Results: The GPU-based parallel implementation of the Gillespie stochastic simulation algorithm (SSA), the logarithmic direct method (LDM) and the next reaction method (NRM) is approximately 85 times faster than the sequential implementation of the NRM on a central processing unit (CPU). Using our software does not require any changes to the user's models, since it acts as a direct replacement of the stochastic simulation software of the SBTOOLBOX2. © The Author 2011. Published by Oxford University Press. All rights reserved.
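The Gillespie stochastic simulation algorithm named in this abstract can be sketched sequentially in its direct-method form. This is a CPU-side illustration in Python, not STOCHSIMGPU's GPU code or the SBTOOLBOX2 interface; the function name `gillespie_direct`, the reversible A <-> B example and its rate constants are assumptions made for the sketch.

```python
import math
import random

def gillespie_direct(x0, stoich, propensity, t_end, rng):
    """Gillespie direct method for one trajectory.

    x0:         initial molecule counts
    stoich:     one state-change vector per reaction
    propensity: function mapping the state to a list of reaction propensities
    Returns a list of (time, state) pairs.
    """
    t, x = 0.0, list(x0)
    trajectory = [(t, tuple(x))]
    while True:
        a = propensity(x)
        a_total = sum(a)
        if a_total <= 0.0:
            break                                     # no reaction can fire
        t += -math.log(1.0 - rng.random()) / a_total  # exponential waiting time
        if t > t_end:
            break
        # Choose reaction j with probability a[j] / a_total.
        threshold = rng.random() * a_total
        j, acc = 0, a[0]
        while acc <= threshold and j < len(a) - 1:
            j += 1
            acc += a[j]
        x = [xi + d for xi, d in zip(x, stoich[j])]
        trajectory.append((t, tuple(x)))
    return trajectory

# Hypothetical usage: A -> B at rate 1.0 per molecule, B -> A at rate 0.5.
rng = random.Random(0)
traj = gillespie_direct([50, 0], [(-1, 1), (1, -1)],
                        lambda x: [1.0 * x[0], 0.5 * x[1]], 5.0, rng)
print(len(traj), traj[-1])
```

Independent trajectories share no state, so many of them can be run side by side; that independence is what makes this class of simulation a natural fit for GPUs.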
Tradable permit allocations and sequential choice

Energy Technology Data Exchange (ETDEWEB)

MacKenzie, Ian A. [Centre for Economic Research, ETH Zuerich, Zurichbergstrasse 18, 8092 Zuerich (Switzerland)

2011-01-15

This paper investigates initial allocation choices in an international tradable pollution permit market. For two sovereign governments, we compare allocation choices that are either simultaneously or sequentially announced. We show sequential allocation announcements result in higher (lower) aggregate emissions when announcements are strategic substitutes (complements). Whether allocation announcements are strategic substitutes or complements depends on the relationship between the follower's damage function and governments' abatement costs. When the marginal damage function is relatively steep (flat), allocation announcements are strategic substitutes (complements). For quadratic abatement costs and damages, sequential announcements provide a higher level of aggregate emissions. (author)
Characterization of a sequential pipeline approach to automatic tissue segmentation from brain MR Images

International Nuclear Information System (INIS)

Hou, Zujun; Huang, Su

2008-01-01

Quantitative analysis of gray matter and white matter in brain magnetic resonance imaging (MRI) is valuable for neuroradiology and clinical practice. Submission of large collections of MRI scans to pipeline processing is increasingly important. We characterized this process and suggest several improvements. To investigate tissue segmentation from brain MR images through a sequential approach, a pipeline that consecutively executes denoising, skull/scalp removal, intensity inhomogeneity correction and intensity-based classification was developed. The denoising phase employs a 3D extension of the Bayes-Shrink method. The inhomogeneity is corrected by an improvement of the method of Dawant et al. with automatic generation of reference points. The N3 method has also been evaluated. Subsequently the brain tissue is segmented into cerebrospinal fluid, gray matter and white matter by a generalized Otsu thresholding technique. Intensive comparisons with other sequential or iterative methods have been carried out using simulated and real images. The sequential approach, with judicious selection of the algorithm in each stage, is not only advantageous in speed but can also attain segmentation at least as accurate as that of iterative methods under a variety of noise or inhomogeneity levels. A sequential approach to tissue segmentation, which consecutively executes the wavelet shrinkage denoising, scalp/skull removal, inhomogeneity correction and intensity-based classification, was developed to automatically segment the brain tissue into CSF, GM and WM from brain MR images. This approach is advantageous in several common applications, compared with other pipeline methods. (orig.)
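The intensity-based classification step builds on Otsu thresholding. As a point of reference, the classic two-class Otsu method can be sketched as below; the generalized variant the authors use for three tissue classes is not specified in this abstract, so the sketch covers only the standard single-threshold case. The function name and the toy intensities are assumptions.

```python
def otsu_threshold(values, levels=256):
    """Classic two-class Otsu threshold for integer intensities in [0, levels).

    Returns the threshold t that maximizes the between-class variance,
    with class 0 holding values <= t and class 1 holding values > t.
    """
    hist = [0] * levels
    for v in values:
        hist[v] += 1
    total = len(values)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_score = 0, -1.0
    w0 = 0       # number of samples in class 0
    sum0 = 0     # intensity sum of class 0
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue             # class 0 still empty
        w1 = total - w0
        if w1 == 0:
            break                # class 1 empty from here on
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        score = w0 * w1 * (mean0 - mean1) ** 2   # proportional to between-class variance
        if score > best_score:
            best_score, best_t = score, t
    return best_t

# Hypothetical usage on a toy bimodal intensity sample.
print(otsu_threshold([10] * 50 + [200] * 50))
```

Separating three classes such as CSF, gray matter and white matter requires two thresholds, which turns the single loop into a search over threshold pairs or a recursive application of the same criterion.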
A comparison of the interactions between sequential Ga-P and Ga-As diffusions in silicon

International Nuclear Information System (INIS)

Jones, C.L.; Willoughby, A.F.W.

1976-01-01

Investigations of the interactions between sequential gallium-phosphorus and gallium-arsenic diffusions have been made using radiotracer profiling techniques. Gallium diffusions were first carried out using the isotope 67Ga diffused from a solid gallium oxide source, and subsequently phosphorus or arsenic was diffused into the same surface. The effect of phosphorus diffusion of high surface concentration was found to be a large enhancement (up to a factor of 100) in the diffusion coefficient of the tail of the gallium profile, while similar arsenic diffusion produced either a small enhancement or a retardation, depending on the conditions used. In addition, the diffusion of both phosphorus and arsenic produced a pronounced dip in the gallium profiles, which is discussed in terms of the built-in electric field produced during the emitter diffusions. The differences between the positions of the dips produced by phosphorus and arsenic are explained by the differences in their profile shape and hence in the electric field distribution. In the case of arsenic, the dip is located at the steeply falling front of the arsenic profile, which resolves discrepancies in previous studies of boron-arsenic sequential diffusions. (author)
Encoding Sequential Information in Semantic Space Models: Comparing Holographic Reduced Representation and Random Permutation

Directory of Open Access Journals (Sweden)

Gabriel Recchia

2015-01-01

Circular convolution and random permutation have each been proposed as neurally plausible binding operators capable of encoding sequential information in semantic memory. We perform several controlled comparisons of circular convolution and random permutation as means of encoding paired associates as well as encoding sequential information. Random permutations outperformed convolution with respect to the number of paired associates that can be reliably stored in a single memory trace. Performance was equal on semantic tasks when using a small corpus, but random permutations were ultimately capable of achieving superior performance due to their higher scalability to large corpora. Finally, “noisy” permutations in which units are mapped to other units arbitrarily (no one-to-one mapping) perform nearly as well as true permutations. These findings increase the neurological plausibility of random permutations and highlight their utility in vector space models of semantics.
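The two binding operators compared in this abstract can be illustrated in a few lines of plain Python. The sketch below stores two pairs in one trace with circular convolution and decodes one of them with circular correlation, then shows that a random permutation is exactly invertible. It is not the paper's experimental setup: the dimensionality, seed, helper names and the way pairs are superposed are all assumptions for illustration.

```python
import random

def circular_convolution(a, b):
    n = len(a)
    return [sum(a[j] * b[(i - j) % n] for j in range(n)) for i in range(n)]

def circular_correlation(a, c):
    """Approximate inverse of binding: a noisy b from c = a (*) b."""
    n = len(a)
    return [sum(a[j] * c[(i + j) % n] for j in range(n)) for i in range(n)]

def permute(v, perm):
    return [v[p] for p in perm]

def unpermute(v, perm):
    out = [0.0] * len(v)
    for i, p in enumerate(perm):
        out[p] = v[i]
    return out

def cosine(u, v):
    dot = sum(x * y for x, y in zip(u, v))
    return dot / (sum(x * x for x in u) * sum(y * y for y in v)) ** 0.5

def random_vector(n, rng):
    return [rng.gauss(0.0, 1.0 / n ** 0.5) for _ in range(n)]

rng = random.Random(1)
n = 256                                    # assumed dimensionality
a, b, c, d = (random_vector(n, rng) for _ in range(4))

# Convolution: superpose two bound pairs, then cue with a to retrieve b.
trace = [x + y for x, y in zip(circular_convolution(a, b),
                               circular_convolution(c, d))]
decoded = circular_correlation(a, trace)
print("similarity to b:", round(cosine(decoded, b), 2))
print("similarity to d:", round(cosine(decoded, d), 2))

# Permutation: a fixed random shuffle of vector positions marks a role or
# position, and can be undone exactly.
perm = list(range(n))
rng.shuffle(perm)
print("round trip exact:", unpermute(permute(a, perm), perm) == a)
```

Convolution decoding is only approximate and gets noisier as more pairs are superposed, whereas a permutation is undone without loss; the abstract's capacity comparison concerns how these differences play out when many associations share one trace.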
A Sequential Kriging reliability analysis method with characteristics of adaptive sampling regions and parallelizability

International Nuclear Information System (INIS)

Wen, Zhixun; Pei, Haiqing; Liu, Hai; Yue, Zhufeng

2016-01-01

The sequential Kriging reliability analysis (SKRA) method has been developed in recent years for nonlinear implicit response functions which are expensive to evaluate. This type of method includes EGRA, the efficient global reliability analysis method, and AK-MCS, the active learning reliability method combining a Kriging model and Monte Carlo simulation. The purpose of this paper is to improve SKRA by adaptive sampling regions and parallelizability. The adaptive sampling regions strategy is proposed to avoid selecting samples in regions where the probability density is so low that the accuracy of these regions has negligible effects on the results. The size of the sampling regions is adapted according to the failure probability calculated in the last iteration. Two parallel strategies, aimed at selecting multiple sample points at a time, are introduced and compared. The improvement is verified through several troublesome examples. - Highlights: • The ISKRA method improves the efficiency of SKRA. • Adaptive sampling regions strategy reduces the number of needed samples. • The two parallel strategies reduce the number of needed iterations. • The accuracy of the optimal value impacts the number of samples significantly.
Applying the minimax principle to sequential mastery testing

NARCIS (Netherlands)

Vos, Hendrik J.

2002-01-01

The purpose of this paper is to derive optimal rules for sequential mastery tests. In a sequential mastery test, the decision is to classify a subject as a master, a nonmaster, or to continue sampling and administering another random item. The framework of minimax sequential decision theory (minimum
Classical and sequential limit analysis revisited

Science.gov (United States)

Leblond, Jean-Baptiste; Kondo, Djimédo; Morin, Léo; Remmal, Almahdi

2018-04-01

Classical limit analysis applies to ideal plastic materials, and within a linearized geometrical framework implying small displacements and strains. Sequential limit analysis was proposed as a heuristic extension to materials exhibiting strain hardening, and within a fully general geometrical framework involving large displacements and strains. The purpose of this paper is to study and clearly state the precise conditions permitting such an extension. This is done by comparing the evolution equations of the full elastic-plastic problem, the equations of classical limit analysis, and those of sequential limit analysis. The main conclusion is that, whereas classical limit analysis applies to materials exhibiting elasticity - in the absence of hardening and within a linearized geometrical framework -, sequential limit analysis, to be applicable, strictly prohibits the presence of elasticity - although it tolerates strain hardening and large displacements and strains. For a given mechanical situation, the relevance of sequential limit analysis therefore essentially depends upon the importance of the elastic-plastic coupling in the specific case considered.
Synthesis of sequential control algorithms for pneumatic drives controlled by monostable valves

Directory of Open Access Journals (Sweden)

Ł. Dworzak

2009-07-01

Application of the Grafpol method [1] for synthesising sequential control algorithms for pneumatic drives controlled by monostable valves is presented. The developed principles simplify the MTS method of programming production processes in the scope of the memory realisation [2]. As a result, the time needed to synthesise the schematic equation can be significantly reduced in comparison to the network transformation method [3]. The resulting schematic equation provides a basis for writing a PLC application program in any language defined in IEC 61131-3.
Simultaneous versus sequential penetrating keratoplasty and cataract surgery.

Science.gov (United States)

Hayashi, Ken; Hayashi, Hideyuki

2006-10-01

To compare the surgical outcomes of simultaneous penetrating keratoplasty and cataract surgery with those of sequential surgery. Thirty-nine eyes of 39 patients scheduled for simultaneous keratoplasty and cataract surgery and 23 eyes of 23 patients scheduled for sequential keratoplasty and secondary phacoemulsification surgery were recruited. Refractive error, regular and irregular corneal astigmatism determined by Fourier analysis, and endothelial cell loss were studied at 1 week and 3, 6, and 12 months after combined surgery in the simultaneous surgery group or after subsequent phacoemulsification surgery in the sequential surgery group. At 3 and more months after surgery, mean refractive error was significantly greater in the simultaneous surgery group than in the sequential surgery group, although no difference was seen at 1 week. The refractive error at 12 months was within 2 D of that targeted in 15 eyes (39%) in the simultaneous surgery group and within 2 D in 16 eyes (70%) in the sequential surgery group; the incidence was significantly greater in the sequential group (P = 0.0344). The regular and irregular astigmatism was not significantly different between the groups at 3 and more months after surgery. No significant difference was also found in the percentage of endothelial cell loss between the groups. Although corneal astigmatism and endothelial cell loss were not different, refractive error from target refraction was greater after simultaneous keratoplasty and cataract surgery than after sequential surgery, indicating a better outcome after sequential surgery than after simultaneous surgery.
A Parallel Algebraic Multigrid Solver on Graphics Processing Units

KAUST Repository

Haase, Gundolf; Liebmann, Manfred; Douglas, Craig C.; Plank, Gernot

2010-01-01

-vector multiplication scheme underlying the PCG-AMG algorithm is presented for the many-core GPU architecture. A performance comparison of the parallel solver shows that a single Nvidia Tesla C1060 GPU board delivers the performance of a sixteen-node Infiniband cluster
Parallel generation of architecture on the GPU

KAUST Repository

Steinberger, Markus

2014-05-01

In this paper, we present a novel approach for the parallel evaluation of procedural shape grammars on the graphics processing unit (GPU). Unlike previous approaches that are either limited in the kind of shapes they allow, the amount of parallelism they can take advantage of, or both, our method supports state of the art procedural modeling including stochasticity and context-sensitivity. To increase parallelism, we explicitly express independence in the grammar, reduce inter-rule dependencies required for context-sensitive evaluation, and introduce intra-rule parallelism. Our rule scheduling scheme avoids unnecessary back and forth between CPU and GPU and reduces round trips to slow global memory by dynamically grouping rules in on-chip shared memory. Our GPU shape grammar implementation is multiple orders of magnitude faster than the standard in CPU-based rule evaluation, while offering equal expressive power. In comparison to the state of the art in GPU shape grammar derivation, our approach is nearly 50 times faster, while adding support for geometric context-sensitivity. © 2014 The Author(s) Computer Graphics Forum © 2014 The Eurographics Association and John Wiley & Sons Ltd. Published by John Wiley & Sons Ltd.
Trial Sequential Methods for Meta-Analysis

Science.gov (United States)

Kulinskaya, Elena; Wood, John

2014-01-01

Statistical methods for sequential meta-analysis have applications also for the design of new trials. Existing methods are based on group sequential methods developed for single trials and start with the calculation of a required information size. This works satisfactorily within the framework of fixed effects meta-analysis, but conceptual…
Sequentially pulsed traveling wave accelerator

Science.gov (United States)

Caporaso, George J [Livermore, CA; Nelson, Scott D [Patterson, CA; Poole, Brian R [Tracy, CA

2009-08-18

A sequentially pulsed traveling wave compact accelerator having two or more pulse forming lines each with a switch for producing a short acceleration pulse along a short length of a beam tube, and a trigger mechanism for sequentially triggering the switches so that a traveling axial electric field is produced along the beam tube in synchronism with an axially traversing pulsed beam of charged particles to serially impart energy to the particle beam.

Scalability of Parallel Scientific Applications on the Cloud

Directory of Open Access Journals (Sweden)

Satish Narayana Srirama

2011-01-01

Cloud computing, with its promise of virtually infinite resources, seems well suited to solving resource-greedy scientific computing problems. To study the effects of moving parallel scientific applications onto the cloud, we deployed several benchmark applications, such as matrix–vector operations and the NAS parallel benchmarks, as well as DOUG (Domain decomposition On Unstructured Grids), on the cloud. DOUG is an open source software package for parallel iterative solution of very large sparse systems of linear equations. The detailed analysis of DOUG on the cloud showed that parallel applications benefit a lot and scale reasonably on the cloud. We could also observe the limitations of the cloud and how it compares with a cluster in terms of performance. However, to run scientific applications efficiently on the cloud infrastructure, the applications must be reduced to frameworks that can successfully exploit the cloud resources, like the MapReduce framework. Several iterative and embarrassingly parallel algorithms are reduced to the MapReduce model and their performance is measured and analyzed. The analysis showed that Hadoop MapReduce has significant problems with iterative methods, while it is well suited to embarrassingly parallel algorithms. Scientific computing often uses iterative methods to solve large problems. Thus, for scientific computing on the cloud, this paper highlights the need for better frameworks or optimizations for MapReduce.
Fundamental physics issues of multilevel logic in developing a parallel processor.

Science.gov (United States)

Bandyopadhyay, Anirban; Miki, Kazushi

2007-06-01

In the last century, physical on and off switches were equated with the two decisions 0 and 1, so that all information could be expressed in binary digits and physically realized as switches connected in a circuit. Apart from significantly increasing memory density, having more possible choices in a given space makes pattern logic a reality, and manipulating patterns would allow logic to be controlled, generating a new kind of processor. Von Neumann's computer is based on sequential logic, processing bits one by one. But because pattern logic is generated on a surface, viewing the whole pattern at once is truly parallel processing. Following von Neumann's and Shannon's fundamental thermodynamic approaches, we have built a compatible model based on a series of single-molecule multibit logic systems of 4-12 bits in a UHV-STM. Multilevel communication and pattern formation on their monolayers are experimentally verified. Furthermore, the developed intelligent monolayer is trained by an artificial neural network. The fundamental weak interactions needed to build a truly parallel processor are thus explored here physically and theoretically.
Novel 2D-sequential color code system employing Image Sensor Communications for Optical Wireless Communications

Directory of Open Access Journals (Sweden)

Trang Nguyen

2016-06-01

The IEEE 802.15.7r1 Optical Wireless Communications Task Group (TG7r1), also known as the revision of the IEEE 802.15.7 Visible Light Communication standard targeting the commercial usage of visible light communication systems, is of interest in this paper. The paper is mainly concerned with Image Sensor Communications (ISC) of TG7r1; however, the major challenge facing ISC, as addressed in the Technical Consideration Document (TCD) of TG7r1, is Image Sensor Compatibility among the variety of different commercial cameras on the market. One of the most challenging but interesting compatibility requirements is the need to support the verified presence of frame rate variation. This paper proposes a novel design for a 2D-sequential color code. Compared to a QR-code-based sequential transmission, the proposed 2D-sequential code overcomes the above challenge in that it is compatible with different frame rate variations and different shutter operations, and can mitigate the rolling effect as well as the rotating effect while effectively minimizing transmission overhead. Practical implementations are demonstrated and a performance comparison is presented.
An Efficient System Based On Closed Sequential Patterns for Web Recommendations

OpenAIRE

Utpala Niranjan; R.B.V. Subramanyam; V-Khana

2010-01-01

Sequential pattern mining, since its introduction has received considerable attention among the researchers with broad applications. The sequential pattern algorithms generally face problems when mining long sequential patterns or while using very low support threshold. One possible solution of such problems is by mining the closed sequential patterns, which is a condensed representation of sequential patterns. Recently, several researchers have utilized the sequential pattern discovery for d...
An approach to multicore parallelism using functional programming: A case study based on Presburger Arithmetic

DEFF Research Database (Denmark)

Dung, Phan Anh; Hansen, Michael Reichhardt

2015-01-01

In this paper we investigate multicore parallelism in the context of functional programming by means of two quantifier-elimination procedures for Presburger Arithmetic: one is based on Cooper’s algorithm and the other is based on the Omega Test. We first develop correct-by-construction prototype...... platform executing on an 8-core machine. A speedup of approximately 4 was obtained for Cooper’s algorithm and a speedup of approximately 6 was obtained for the exact-shadow part of the Omega Test. The considered procedures are complex, memory-intense algorithms on huge formula trees and the case study...... reveals more general applicable techniques and guideline for deriving parallel algorithms from sequential ones in the context of data-intensive tree algorithms. The obtained insights should apply for any strict and impure functional programming language. Furthermore, the results obtained for the exact...
Algébrico: Parte II - Algoritmo Paralelo

Directory of Open Access Journals (Sweden)

Fabio Henrique Pereira

2007-01-01

In this work, a new parallel wavelet-based algorithm for the Algebraic Multigrid Method (PWAMG) is presented. A variation of the standard parallel implementation of discrete wavelet transforms is used in the construction of a hierarchy of matrices and of intergrid transfer operators for Algebraic Multigrid. The PWAMG method has been tested as a parallel solver for the two-dimensional Poisson equation, for different numbers of finite difference mesh nodes, and comparisons are made with the sequential version of this method.
A compositional reservoir simulator on distributed memory parallel computers

International Nuclear Information System (INIS)

Rame, M.; Delshad, M.

1995-01-01

This paper presents the application of distributed memory parallel computers to field scale reservoir simulations using a parallel version of UTCHEM, The University of Texas Chemical Flooding Simulator. The model is a general purpose, highly vectorized chemical compositional simulator that can simulate a wide range of displacement processes at both field and laboratory scales. The original simulator was modified to run on both distributed memory parallel machines (Intel iPSC/960 and Delta, Connection Machine 5, Kendall Square 1 and 2, and CRAY T3D) and a cluster of workstations. A domain decomposition approach has been taken towards parallelization of the code. A portion of the discrete reservoir model is assigned to each processor by a set-up routine that attempts a data layout as even as possible from the load-balance standpoint. Each of these subdomains is extended so that data can be shared between adjacent processors for stencil computation. The added routines that make parallel execution possible are written in a modular fashion that makes porting to new parallel platforms straightforward. Results on the distributed memory computing performance of the parallel simulator are presented for field scale applications such as tracer flood and polymer flood. A comparison with the wall-clock times for the same problems on a vector supercomputer is also presented.
Massive Exploration of Perturbed Conditions of the Blood Coagulation Cascade through GPU Parallelization

Directory of Open Access Journals (Sweden)

Paolo Cazzaniga

2014-01-01

high-performance computing solutions is motivated by the need to perform large numbers of in silico analyses to study the behavior of biological systems in different conditions, which requires computing power that usually exceeds the capability of standard desktop computers. In this work we present coagSODA, a CUDA-powered computational tool that was purposely developed for the analysis of a large mechanistic model of the blood coagulation cascade (BCC), defined according to both mass-action kinetics and Hill functions. coagSODA allows the execution of parallel simulations of the dynamics of the BCC by automatically deriving the system of ordinary differential equations and then exploiting the numerical integration algorithm LSODA. We present the biological results achieved with a massive exploration of perturbed conditions of the BCC, carried out with one-dimensional and bi-dimensional parameter sweep analysis, and show that GPU-accelerated parallel simulations of this model can achieve up to a 181× speedup compared to the corresponding sequential simulations.
Asymptotically optimum multialternative sequential procedures for discernment of processes minimizing average length of observations

Science.gov (United States)

Fishman, M. M.

1985-01-01

The problem of multialternative sequential discernment of processes is formulated in terms of conditionally optimum procedures minimizing the average length of observations, without any probabilistic assumptions about any one occurring process, rather than in terms of Bayes procedures minimizing the average risk. The problem is to find the procedure that will transform inequalities into equalities. The problem is formulated for various models of signal observation and data processing: (1) discernment of signals from background interference by a multichannel system; (2) discernment of pulse sequences with unknown time delay; (3) discernment of harmonic signals with unknown frequency. An asymptotically optimum sequential procedure is constructed which compares the statistics of the likelihood ratio with the mean-weighted likelihood ratio and estimates the upper bound for conditional average lengths of observations. This procedure is shown to remain valid as the upper bound for the probability of erroneous partial solutions decreases approaching zero and the number of hypotheses increases approaching infinity. It also remains valid under certain special constraints on the probability such as a threshold. A comparison with a fixed-length procedure reveals that this sequential procedure decreases the length of observations to one quarter, on the average, when the probability of erroneous partial solutions is low.
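As an illustration of the general idea of comparing a likelihood-ratio statistic against thresholds while observations arrive, here is a minimal Python sketch of Wald's classical two-hypothesis sequential probability ratio test for a Gaussian mean. This is a simpler, swapped-in technique and not the multialternative procedure of the abstract; the function name, the Gaussian model, and the default error rates are assumptions made only for this example.

```python
import math

def sprt_gaussian(samples, mu0, mu1, sigma, alpha=0.01, beta=0.01):
    """Wald's SPRT for H0: mean = mu0 versus H1: mean = mu1 (known sigma).

    Illustrative sketch only: returns (decision, number_of_samples_used).
    """
    # Wald's approximate stopping thresholds on the log-likelihood ratio.
    upper = math.log((1.0 - beta) / alpha)   # cross it -> accept H1
    lower = math.log(beta / (1.0 - alpha))   # cross it -> accept H0
    llr = 0.0
    for n, x in enumerate(samples, start=1):
        # Log-likelihood ratio increment for one Gaussian observation.
        llr += (mu1 - mu0) * (x - 0.5 * (mu0 + mu1)) / sigma ** 2
        if llr >= upper:
            return "H1", n
        if llr <= lower:
            return "H0", n
    # Ran out of data before either threshold was crossed.
    return "undecided", len(samples)

decision, used = sprt_gaussian([1.0] * 20, mu0=0.0, mu1=1.0, sigma=1.0)
```

The test stops as soon as the accumulated evidence is strong enough, which is why a sequential procedure can need fewer observations on average than a fixed-length one.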
Exploiting Thread Parallelism for Ocean Modeling on Cray XC Supercomputers

Energy Technology Data Exchange (ETDEWEB)

Sarje, Abhinav [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Jacobsen, Douglas W. [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Williams, Samuel W. [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Ringler, Todd [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Oliker, Leonid [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)

2016-05-01

The incorporation of increasing core counts in modern processors used to build state-of-the-art supercomputers is driving application development towards exploitation of thread parallelism, in addition to distributed memory parallelism, with the goal of delivering efficient high-performance codes. In this work we describe the exploitation of threading and our experiences with it with respect to a real-world ocean modeling application code, MPAS-Ocean. We present detailed performance analysis and comparisons of various approaches and configurations for threading on the Cray XC series supercomputers.
Stiffness Analysis and Comparison of 3-PPR Planar Parallel Manipulators with Actuation Compliance

DEFF Research Database (Denmark)

Wu, Guanglei; Bai, Shaoping; Kepler, Jørgen Asbøl

2012-01-01

In this paper, the stiffness of 3-PPR planar parallel manipulator (PPM) is analyzed with the consideration of nonlinear actuation compliance. The characteristics of the stiffness matrix pertaining to the planar parallel manipulators are analyzed and discussed. Graphic representation of the stiffn...... of the stiffness characteristics by means of translational and rotational stiffness mapping is developed. The developed method is illustrated with an unsymmetrical 3-PPR PPM, being compared with its structure-symmetrical counterpart....
BCR-701: A review of 10-years of sequential extraction analyses

International Nuclear Information System (INIS)

Sutherland, Ross A.

2010-01-01

A detailed quantitative analysis was performed on data presented in the literature that focused on the sequential extraction of cadmium (Cd), chromium (Cr), copper (Cu), nickel (Ni), lead (Pb) and zinc (Zn) from the certified reference material BCR-701 (lake sediment) using the three-step harmonized BCR procedure. The accuracy of data reported in the literature, including precision and different measures of trueness, was assessed relative to the certified values for BCR-701. Forty data sets were accepted following extreme outlier removal, and statistically summarized with measures of central tendency, dispersion, and distribution form. In general, literature data were similar in their measurement precision to the expert laboratories used to certify the trace element contents in BCR-701. The overall median precision for literature reported data was 10% (range 6-19%), compared to 9% (range 4-33%) for the certifying laboratories. One measure of literature data trueness was assessed via a confirmatory approach using a robust bootstrap method. Only 22% of the comparisons indicated significantly different (all were lower) concentrations reported in the literature compared to certified values. The question of whether the differences are practically significant for environmental studies is raised. Bias was computed as a measure of trueness, and literature data were more frequently negatively biased, indicating lower concentrations reported in the literature for the six trace elements for the three-step sequential procedure compared to the certified values. However, 95% confidence intervals about the average bias for the 18 comparisons indicated only four instances when a mean bias of 0 (i.e., measured = certified) was not included, suggesting a statistical difference. Finally, Z-scores incorporating a Horwitz-type function were used to assess the general trueness of laboratory data. Of the 468 laboratory Z-score values computed, 92% were considered to be satisfactory, 5% were
Mixing modes in a population-based interview survey: comparison of a sequential and a concurrent mixed-mode design for public health research.

Science.gov (United States)

Mauz, Elvira; von der Lippe, Elena; Allen, Jennifer; Schilling, Ralph; Müters, Stephan; Hoebel, Jens; Schmich, Patrick; Wetzstein, Matthias; Kamtsiuris, Panagiotis; Lange, Cornelia

2018-01-01

Population-based surveys currently face the problem of decreasing response rates. Mixed-mode designs are now being implemented more often to account for this, to improve sample composition and to reduce overall costs. This study examines whether a concurrent or sequential mixed-mode design achieves better results on a number of indicators of survey quality. Data were obtained from a population-based health interview survey of adults in Germany that was conducted as a methodological pilot study as part of the German Health Update (GEDA). Participants were randomly allocated to one of two surveys; each of the surveys had a different design. In the concurrent mixed-mode design (n = 617) two types of self-administered questionnaires (SAQ-Web and SAQ-Paper) and computer-assisted telephone interviewing were offered simultaneously to the respondents along with the invitation to participate. In the sequential mixed-mode design (n = 561), SAQ-Web was initially provided, followed by SAQ-Paper, with an option for a telephone interview being sent out together with the reminders at a later date. Finally, this study compared the response rates, sample composition, health indicators, item non-response, the scope of fieldwork and the costs of both designs. No systematic differences were identified between the two mixed-mode designs in terms of response rates, the socio-demographic characteristics of the achieved samples, or the prevalence rates of the health indicators under study. The sequential design gained a higher rate of online respondents. Very few telephone interviews were conducted for either design. With regard to data quality, the sequential design (which had more online respondents) showed less item non-response. There were minor differences between the designs in terms of their costs. Postage and printing costs were lower in the concurrent design, but labour costs were lower in the sequential design. No differences in health indicators were found between
A Parallel Algebraic Multigrid Solver on Graphics Processing Units

KAUST Repository

Haase, Gundolf

2010-01-01

The paper presents a multi-GPU implementation of the preconditioned conjugate gradient algorithm with an algebraic multigrid preconditioner (PCG-AMG) for an elliptic model problem on a 3D unstructured grid. An efficient parallel sparse matrix-vector multiplication scheme underlying the PCG-AMG algorithm is presented for the many-core GPU architecture. A performance comparison of the parallel solver shows that a single Nvidia Tesla C1060 GPU board delivers the performance of a sixteen-node Infiniband cluster and a multi-GPU configuration with eight GPUs is about 100 times faster than a typical server CPU core. © 2010 Springer-Verlag.
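The sparse matrix-vector product named in the abstract above is the kernel that dominates each conjugate gradient iteration. The following is a minimal sequential Python sketch of that product using the compressed sparse row (CSR) layout; the CSR choice, the function name, and the small test matrix are assumptions for illustration, since the abstract does not say which storage scheme the paper's GPU kernel uses.

```python
def csr_matvec(row_ptr, col_idx, values, x):
    """Compute y = A @ x for a matrix A stored in CSR form.

    row_ptr[i]..row_ptr[i+1] delimits the nonzeros of row i;
    col_idx and values hold their column indices and entries.
    """
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y[i] = acc  # each row is independent of every other row
    return y

# Hypothetical 3x3 tridiagonal example: [[4,-1,0],[-1,4,-1],[0,-1,4]]
row_ptr = [0, 2, 5, 7]
col_idx = [0, 1, 0, 1, 2, 1, 2]
values = [4.0, -1.0, -1.0, 4.0, -1.0, -1.0, 4.0]
y = csr_matvec(row_ptr, col_idx, values, [1.0, 2.0, 3.0])
```

Because every output row depends only on the input vector, the outer loop can be distributed across threads, which is what makes this kernel a natural fit for many-core hardware.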
Hybrid parallelization of the XTOR-2F code for the simulation of two-fluid MHD instabilities in tokamaks

Science.gov (United States)

Marx, Alain; Lütjens, Hinrich

2017-03-01

A hybrid MPI/OpenMP parallel version of the XTOR-2F code [Lütjens and Luciani, J. Comput. Phys. 229 (2010) 8130] solving the two-fluid MHD equations in full tokamak geometry by means of an iterative Newton-Krylov matrix-free method has been developed. The present work shows that the code has been parallelized significantly despite the numerical profile of the problem solved by XTOR-2F, i.e. a discretization with pseudo-spectral representations in all angular directions, the stiffness of the two-fluid stability problem in tokamaks, and the use of a direct LU decomposition to invert the physical pre-conditioner at every Krylov iteration of the solver. The execution time of the parallelized version is an order of magnitude smaller than that of the sequential one for low resolution cases, with an increasing speedup when the discretization mesh is refined. Moreover, it allows simulations to be performed at higher resolutions that were previously ruled out by memory limitations.
Improved Parallel Three-List Algorithm for the Knapsack Problem without Memory Conflicts

Institute of Scientific and Technical Information of China (English)

Pan Jun; Li Kenli; Li Qinghua

2006-01-01

Based on the two-list algorithm and the parallel three-list algorithm, an improved parallel three-list algorithm for the knapsack problem is proposed, in which divide and conquer and parallel merging without memory conflicts are adopted. To find a solution for the n-element knapsack problem, the proposed algorithm needs O(2^{3n/8}) time when O(2^{3n/8}) shared memory units and O(2^{n/4}) processors are available. Comparisons between the proposed algorithm and 10 existing algorithms show that the improved parallel three-list algorithm is the first exclusive-read exclusive-write (EREW) parallel algorithm that can solve knapsack instances in less than O(2^{n/2}) time when the available hardware resource is smaller than O(2^{n/2}), and hence improves on previous results.
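For context, the sequential two-list algorithm that the abstract above builds on can be sketched in a few lines of Python. This sketch assumes the exact-sum (subset-sum) form of the knapsack problem with positive integer weights; the function names and the sample instance are hypothetical, and the parallel three-list algorithm itself is not reproduced here.

```python
def subset_sums(weights):
    """Return (sum, bitmask) pairs for every subset of `weights`."""
    sums = [(0, 0)]
    for i, w in enumerate(weights):
        sums += [(s + w, m | (1 << i)) for s, m in sums]
    return sums

def two_list_knapsack(weights, target):
    """Find indices whose weights add up to `target`, or None.

    Splits the items in two halves, lists all 2^(n/2) subset sums of
    each half, and scans the two sorted lists from opposite ends.
    """
    half = len(weights) // 2
    a = sorted(subset_sums(weights[:half]))                # ascending
    b = sorted(subset_sums(weights[half:]), reverse=True)  # descending
    i = j = 0
    while i < len(a) and j < len(b):
        total = a[i][0] + b[j][0]
        if total == target:
            mask = a[i][1] | (b[j][1] << half)
            return [k for k in range(len(weights)) if (mask >> k) & 1]
        if total < target:
            i += 1   # need a larger sum from the ascending list
        else:
            j += 1   # need a smaller sum from the descending list
    return None

weights = [3, 34, 4, 12, 5, 2]
picked = two_list_knapsack(weights, 9)
```

Both lists hold 2^{n/2} entries, which is the time and memory baseline that the resource bounds quoted in the abstract are measured against.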
Parallelization of pressure equation solver for incompressible N-S equations

International Nuclear Information System (INIS)

Ichihara, Kiyoshi; Yokokawa, Mitsuo; Kaburaki, Hideo.

1996-03-01

A pressure equation solver in a code for 3-dimensional incompressible flow analysis has been parallelized using the red-black SOR method and the PCG method on the Fujitsu VPP500, a vector parallel computer with distributed memory. To compare scalability, the solver using the red-black SOR method has also been parallelized on the Intel Paragon, a scalar parallel computer with distributed memory. The scalability of the red-black SOR method was lost on both the VPP500 and the Paragon when the number of processor elements was increased. The reason for the non-scalability on both systems is the increasing communication time between processor elements. In addition, parallelization by DO-loop division lowers the vectorization efficiency on the VPP500. For an effective implementation on the VPP500, a large-scale problem with very long vectorized DO-loops in the parallel program should be solved. The PCG method with the red-black SOR method applied to incomplete LU factorization (red-black PCG) takes more iteration steps than the normal PCG method with forward and backward substitution, despite having the same number of floating point operations in a DO-loop of the incomplete LU factorization. The parallelized red-black PCG method has fewer merits than the parallelized red-black SOR method when the computational region has fewer grid points, because the red-black PCG method achieves low vectorization efficiency. (author)
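The red-black SOR method referred to in the abstract above colors the grid like a checkerboard so that all points of one color can be updated independently. Below is a minimal sequential Python sketch for a 2D Poisson problem with zero boundary values; the 2D setting, the grid size, the relaxation factor, and the test right-hand side are assumptions chosen for illustration and are not taken from the paper, which treats a 3-dimensional pressure equation.

```python
def red_black_sor(f, h, omega=1.5, sweeps=200):
    """Solve -laplace(u) = f on a square grid with u = 0 on the boundary.

    f is an n-by-n list of lists of source values (boundary entries are
    ignored), h is the mesh width, and omega is the relaxation factor.
    """
    n = len(f)
    u = [[0.0] * n for _ in range(n)]
    for _ in range(sweeps):
        for color in (0, 1):  # 0 = "red" points, 1 = "black" points
            for i in range(1, n - 1):
                for j in range(1, n - 1):
                    if (i + j) % 2 != color:
                        continue
                    # Neighbours all have the other colour, so updates
                    # within one colour do not depend on each other.
                    gs = 0.25 * (u[i - 1][j] + u[i + 1][j] +
                                 u[i][j - 1] + u[i][j + 1] +
                                 h * h * f[i][j])
                    u[i][j] += omega * (gs - u[i][j])
    return u

# Hypothetical test problem with known solution u = x(1-x)y(1-y).
n, h = 9, 1.0 / 8
f = [[2 * (j * h) * (1 - j * h) + 2 * (i * h) * (1 - i * h)
      for j in range(n)] for i in range(n)]
u = red_black_sor(f, h)
```

The two inner loops over one color are the part that can be split across processors or vectorized; each color change requires exchanging boundary values between neighbouring subdomains, which is the communication cost the abstract identifies as limiting scalability.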
Smoldyn on graphics processing units: massively parallel Brownian dynamics simulations.

Science.gov (United States)

Dematté, Lorenzo

2012-01-01

Space is a very important aspect in the simulation of biochemical systems; recently, the need for simulation algorithms able to cope with space has become more and more compelling. Complex and detailed models of biochemical systems need to deal with the movement of single molecules and particles, taking into consideration localized fluctuations, transportation phenomena, and diffusion. A common drawback of spatial models lies in their complexity: models can become very large, and their simulation can be time consuming, especially if we want to capture the system's behavior reliably using stochastic methods in conjunction with a high spatial resolution. In order to deliver on the promise of systems biology to understand a system as a whole, we need to scale up the size of the models we are able to simulate, moving from sequential to parallel simulation algorithms. In this paper, we analyze Smoldyn, a widely used algorithm for stochastic simulation of chemical reactions with spatial resolution and single molecule detail, and we propose an alternative, innovative implementation that exploits the parallelism of Graphics Processing Units (GPUs). The implementation executes the most computationally demanding steps (computation of diffusion, unimolecular and bimolecular reactions, as well as the most common cases of molecule-surface interaction) on the GPU, computing them in parallel on each molecule of the system. The implementation offers good speed-ups and real-time, high-quality graphics output.
Comparison of concurrent chemoradiotherapy versus sequential radiochemotherapy in patients with completely resected non-small cell lung cancer

Energy Technology Data Exchange (ETDEWEB)

Kim, Hwan Ik; Noh, O Kyu; Oh, Young Taek; Chun, Mi Son; Kim, Sang Won; Cho, O Yeon; Heo, Jae Sung [Ajou University School of Medicine, Suwon (Korea, Republic of)

2016-09-15

Our institution has implemented two different adjuvant protocols in treating patients with non-small cell lung cancer (NSCLC): chemotherapy followed by concurrent chemoradiotherapy (CT-CCRT) and sequential postoperative radiotherapy (PORT) followed by postoperative chemotherapy (POCT). We aimed to compare the clinical outcomes between the two adjuvant protocols. From March 1997 to October 2012, 68 patients were treated with CT-CCRT (n = 25) and sequential PORT followed by POCT (RT-CT; n = 43). The CT-CCRT protocol consisted of 2 cycles of cisplatin-based POCT followed by PORT concurrently with 2 cycles of POCT. The RT-CT protocol consisted of PORT followed by 4 cycles of cisplatin-based POCT. PORT was administered using conventional fractionation with a dose of 50.4–60 Gy. We compared the outcomes between the two adjuvant protocols and analyzed the clinical factors affecting survival. Median follow-up time was 43.9 months (range, 3.2 to 74.0 months), and the 5-year overall survival (OS), locoregional recurrence-free survival (LRFS), and distant metastasis-free survival (DMFS) were 53.9%, 68.2%, and 51.0%, respectively. There were no significant differences in OS (p = 0.074), LRFS (p = 0.094), and DMFS (p = 0.490) between the two protocols. In multivariable analyses, adjuvant protocol remained a significant prognostic factor for LRFS, favouring CT-CCRT over RT-CT (hazard ratio [HR] = 3.506, p = 0.046), but not for OS (HR = 0.647, p = 0.229). The CT-CCRT protocol improved LRFS compared with the RT-CT protocol in patients with completely resected NSCLC, but did not improve OS. Further studies are warranted to evaluate the benefit of the CCRT strategy compared with the sequential strategy.
Discrimination between sequential and simultaneous virtual channels with electrical hearing.

Science.gov (United States)

Landsberger, David; Galvin, John J

2011-09-01

In cochlear implants (CIs), simultaneous or sequential stimulation of adjacent electrodes can produce intermediate pitch percepts between those of the component electrodes. However, it is unclear whether simultaneous and sequential virtual channels (VCs) can be discriminated. In this study, CI users were asked to discriminate simultaneous and sequential VCs; discrimination was measured for monopolar (MP) and bipolar + 1 stimulation (BP + 1), i.e., relatively broad and focused stimulation modes. For sequential VCs, the interpulse interval (IPI) varied between 0.0 and 1.8 ms. All stimuli were presented at comfortably loud, loudness-balanced levels at a 250 pulse per second per electrode (ppse) stimulation rate. On average, CI subjects were able to reliably discriminate between sequential and simultaneous VCs. While there was no significant effect of IPI or stimulation mode on VC discrimination, some subjects exhibited better VC discrimination with BP + 1 stimulation. Subjects' discrimination between sequential and simultaneous VCs was correlated with electrode discrimination, suggesting that spatial selectivity may influence perception of sequential VCs. To maintain equal loudness, sequential VC amplitudes were nearly double those of simultaneous VCs, presumably resulting in a broader spread of excitation. These results suggest that perceptual differences between simultaneous and sequential VCs might be explained by differences in the spread of excitation. © 2011 Acoustical Society of America

Comparison of three-stage sequential extraction and toxicity characteristic leaching tests to evaluate metal mobility in mining wastes

International Nuclear Information System (INIS)

Margui, E.; Salvado, V.; Queralt, I.; Hidalgo, M.

2004-01-01

Abandoned mining sites contain residues from ore processing operations that are characterised by high concentrations of heavy metals. The form in which a metal exists strongly influences its mobility and, thus, the effects on the environment. Operational methods of speciation analysis, such as the use of sequential extraction procedures, are commonly applied. In this work, the modified three-stage sequential extraction procedure proposed by the BCR (now the Standards, Measurements and Testing Programme) was applied for the fractionation of Ni, Zn, Pb and Cd in mining wastes from old Pb-Zn mining areas located in the Val d'Aran (NE Spain) and Cartagena (SE Spain). Analyses of the extracts were performed by inductively coupled plasma atomic emission spectrometry and electrothermal atomic absorption spectrometry. The procedure was evaluated by using a certified reference material, BCR-701. The results of the partitioning study indicate that more easily mobilised forms (acid exchangeable) were predominant for Cd and Zn, particularly in the sample from Cartagena. In contrast, the largest amount of lead was associated with the iron and manganese oxide fractions. On the other hand, the applicability of lixiviation tests commonly used to evaluate the leaching of toxic species from landfill disposal (US-EPA Toxicity Characteristic Leaching Procedure and DIN 38414-S4) to mining wastes was also investigated and the obtained results compared with the information on metal mobility derivable from the application of the three-stage sequential extraction procedure
Fractionation of potentially toxic elements in urban soils from five European cities by means of a harmonised sequential extraction procedure

International Nuclear Information System (INIS)

Davidson, Christine M.; Urquhart, Graham J.; Ajmone-Marsan, Franco; Biasioli, Mattia; Costa Duarte, Armando da; Diaz-Barrientos, Encarnacion; Grcman, Helena; Hossack, Iain; Hursthouse, Andrew S.; Madrid, Luis; Rodrigues, Sonia; Zupan, Marko

2006-01-01

The revised (four-step) BCR sequential extraction procedure has been applied to fractionate the chromium, copper, iron, manganese, nickel, lead and zinc contents in urban soil samples from public-access areas in five European cities. A preliminary inter-laboratory comparison was conducted and showed that data obtained by different laboratories participating in the study were sufficiently harmonious for comparisons to be made between cities and land types (e.g. parks, roadside, riverbanks, etc.). Analyte recoveries by sequential extraction, with respect to direct aqua regia digestion, were generally acceptable (100 ± 15%). Iron, nickel and, at most sites, chromium were found mainly in association with the residual phase of the soil matrix. Copper was present in the reducible, oxidisable and residual fractions, whilst zinc was found in all four sequential extracts. Manganese was strongly associated with reducible material as, in some cities, was lead. This is of concern because high lead concentrations were present in some soils (>500 mg kg⁻¹) and the potential exists for remobilisation under reducing conditions. As would be expected, extractable metal contents were generally highest in older, more heavily industrialised cities. Copper, lead and zinc showed marked (and often correlated) variations in concentrations between sites within the same city whereas manganese and, especially, iron, did not. No overall relationships were, however, found between analyte concentrations and land use, nor between analyte partitioning and land use.
Sequential versus simultaneous market delineation

DEFF Research Database (Denmark)

Haldrup, Niels; Møllgaard, Peter; Kastberg Nielsen, Claus

2005-01-01

Delineation of the relevant market forms a pivotal part of most antitrust cases. The standard approach is sequential. First the product market is delineated, then the geographical market is defined. Demand and supply substitution in both the product dimension and the geographical dimension... and geographical markets. Using a unique data set for prices of Norwegian and Scottish salmon, we propose a methodology for simultaneous market delineation and we demonstrate that compared to a sequential approach conclusions will be reversed. JEL: C3, K21, L41, Q22. Keywords: Relevant market, econometric delineation...
Designing a parallel evolutionary algorithm for inferring gene networks on the cloud computing environment.

Science.gov (United States)

Lee, Wei-Po; Hsiao, Yu-Ting; Hwang, Wei-Che

2014-01-16

To improve the tedious task of reconstructing gene networks through testing experimentally the possible interactions between genes, it becomes a trend to adopt the automated reverse engineering procedure instead. Some evolutionary algorithms have been suggested for deriving network parameters. However, to infer large networks by the evolutionary algorithm, it is necessary to address two important issues: premature convergence and high computational cost. To tackle the former problem and to enhance the performance of traditional evolutionary algorithms, it is advisable to use parallel model evolutionary algorithms. To overcome the latter and to speed up the computation, it is advocated to adopt the mechanism of cloud computing as a promising solution: most popular is the method of MapReduce programming model, a fault-tolerant framework to implement parallel algorithms for inferring large gene networks. This work presents a practical framework to infer large gene networks, by developing and parallelizing a hybrid GA-PSO optimization method. Our parallel method is extended to work with the Hadoop MapReduce programming model and is executed in different cloud computing environments. To evaluate the proposed approach, we use a well-known open-source software GeneNetWeaver to create several yeast S. cerevisiae sub-networks and use them to produce gene profiles. Experiments have been conducted and the results have been analyzed. They show that our parallel approach can be successfully used to infer networks with desired behaviors and the computation time can be largely reduced. Parallel population-based algorithms can effectively determine network parameters and they perform better than the widely-used sequential algorithms in gene network inference. These parallel algorithms can be distributed to the cloud computing environment to speed up the computation. By coupling the parallel model population-based optimization method and the parallel computational framework, high
Parallel Programming with Intel Parallel Studio XE

CERN Document Server

Blair-Chappell, Stephen

2012-01-01

Optimize code for multi-core processors with Intel's Parallel Studio. Parallel programming is rapidly becoming a "must-know" skill for developers. Yet, where to start? This teach-yourself tutorial is an ideal starting point for developers who already know Windows C and C++ and are eager to add parallelism to their code. With a focus on applying tools, techniques, and language extensions to implement parallelism, this essential resource teaches you how to write programs for multicore and leverage the power of multicore in your programs. Sharing hands-on case studies and real-world examples, the
Enhancing parallelism of tile bidiagonal transformation on multicore architectures using tree reduction

KAUST Repository

Ltaief, Hatem

2012-01-01

The objective of this paper is to enhance the parallelism of the tile bidiagonal transformation using tree reduction on multicore architectures. First introduced by Ltaief et al. [LAPACK Working Note #247, 2011], the bidiagonal transformation using tile algorithms with a two-stage approach has shown very promising results on square matrices. However, for tall and skinny matrices, the inherent problem of processing the panel in a domino-like fashion generates unnecessary sequential tasks. By using tree reduction, the panel is horizontally split, which creates another dimension of parallelism and engenders many concurrent tasks to be dynamically scheduled on the available cores. The results reported in this paper are very encouraging. The new tile bidiagonal transformation, targeting tall and skinny matrices, outperforms the state-of-the-art numerical linear algebra libraries LAPACK V3.2 and Intel MKL ver. 10.3 by up to 29-fold speedup and the standard two-stage PLASMA BRD by up to 20-fold speedup, on an eight socket hexa-core AMD Opteron multicore shared-memory system. © 2012 Springer-Verlag.
BCYCLIC: A parallel block tridiagonal matrix cyclic solver

Science.gov (United States)

Hirshman, S. P.; Perumalla, K. S.; Lynch, V. E.; Sanchez, R.

2010-09-01

A block tridiagonal matrix is factored with minimal fill-in using a cyclic reduction algorithm that is easily parallelized. Storage of the factored blocks allows the application of the inverse to multiple right-hand sides which may not be known at factorization time. Scalability with the number of block rows is achieved with cyclic reduction, while scalability with the block size is achieved using multithreaded routines (OpenMP, GotoBLAS) for block matrix manipulation. This dual scalability is a noteworthy feature of this new solver, as well as its ability to efficiently handle arbitrary (non-powers-of-2) block row and processor numbers. Comparison with a state-of-the art parallel sparse solver is presented. It is expected that this new solver will allow many physical applications to optimally use the parallel resources on current supercomputers. Example usage of the solver in magneto-hydrodynamic (MHD), three-dimensional equilibrium solvers for high-temperature fusion plasmas is cited.
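The cyclic reduction idea named in the abstract above can be illustrated on the much simpler scalar tridiagonal case. The following is only a minimal sketch under stated assumptions, not the BCYCLIC solver: the function name `cyclic_reduction` and the array layout are hypothetical, the entries are scalars rather than blocks, nothing is stored for reuse with later right-hand sides, and no pivoting is done. Each pass eliminates the odd-indexed unknowns; every reduced row depends only on its two neighbours, which is why the rows within a pass could be computed in parallel.

```python
def cyclic_reduction(a, b, c, d):
    """Solve a scalar tridiagonal system by recursive cyclic reduction.

    Row i reads a[i]*x[i-1] + b[i]*x[i] + c[i]*x[i+1] = d[i].
    Assumes a[0] == 0 and c[-1] == 0 and that no pivot becomes zero
    (illustrative sketch; no pivoting, scalar entries only).
    """
    n = len(b)
    if n == 1:
        return [d[0] / b[0]]

    # Reduction: keep even rows, eliminate odd-indexed unknowns.
    # Each even row only touches its neighbours, so this loop is
    # the part that could run in parallel.
    ra, rb, rc, rd = [], [], [], []
    for i in range(0, n, 2):
        has_left = i > 0
        has_right = i + 1 < n
        alpha = a[i] / b[i - 1] if has_left else 0.0
        gamma = c[i] / b[i + 1] if has_right else 0.0
        ra.append(-alpha * a[i - 1] if has_left else 0.0)
        rc.append(-gamma * c[i + 1] if has_right else 0.0)
        rb.append(b[i]
                  - (alpha * c[i - 1] if has_left else 0.0)
                  - (gamma * a[i + 1] if has_right else 0.0))
        rd.append(d[i]
                  - (alpha * d[i - 1] if has_left else 0.0)
                  - (gamma * d[i + 1] if has_right else 0.0))

    # Solve the half-size system for the even-indexed unknowns.
    x = [0.0] * n
    x[0::2] = cyclic_reduction(ra, rb, rc, rd)

    # Back-substitution: each odd unknown follows from its two neighbours.
    for i in range(1, n, 2):
        right = c[i] * x[i + 1] if i + 1 < n else 0.0
        x[i] = (d[i] - a[i] * x[i - 1] - right) / b[i]
    return x


if __name__ == "__main__":
    # 5x5 system (a non-power-of-2 size) whose exact solution is 1..5.
    a = [0.0, 1.0, 1.0, 1.0, 1.0]
    b = [4.0, 4.0, 4.0, 4.0, 4.0]
    c = [1.0, 1.0, 1.0, 1.0, 0.0]
    d = [6.0, 12.0, 18.0, 24.0, 24.0]
    print(cyclic_reduction(a, b, c, d))
```

In a block version, the divisions would become solves with the diagonal blocks, which is where multithreaded dense routines would come in.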
Group-sequential analysis may allow for early trial termination

DEFF Research Database (Denmark)

Gerke, Oke; Vilstrup, Mie H; Halekoh, Ulrich

2017-01-01

BACKGROUND: Group-sequential testing is widely used in pivotal therapeutic, but rarely in diagnostic research, although it may save studies, time, and costs. The purpose of this paper was to demonstrate a group-sequential analysis strategy in an intra-observer study on quantitative FDG-PET/CT mea...
Tissue P Systems With Channel States Working in the Flat Maximally Parallel Way.

Science.gov (United States)

Song, Bosheng; Perez-Jimenez, Mario J; Paun, Gheorghe; Pan, Linqiang

2016-10-01

Tissue P systems with channel states are a class of bio-inspired parallel computational models, where rules are used in a sequential manner (on each channel, at most one rule can be used at each step). In this work, tissue P systems with channel states working in a flat maximally parallel way are considered, where at each step, on each channel, a maximal set of applicable rules that pass from a given state to a unique next state, is chosen and each rule in the set is applied once. The computational power of such P systems is investigated. Specifically, it is proved that tissue P systems with channel states and antiport rules of length two are able to compute Parikh sets of finite languages, and such P systems with one cell and noncooperative symport rules can compute at least all Parikh sets of matrix languages. Some Turing universality results are also provided. Moreover, the NP-complete problem SAT is solved by tissue P systems with channel states, cell division and noncooperative symport rules working in the flat maximally parallel way; nevertheless, if channel states are not used, then such P systems working in the flat maximally parallel way can solve only tractable problems. These results show that channel states provide a frontier of tractability between efficiency and non-efficiency in the framework of tissue P systems with cell division (assuming P ≠ NP ).
Sequential logic analysis and synthesis

CERN Document Server

Cavanagh, Joseph

2007-01-01

Until now, there was no single resource for actual digital system design. Using both basic and advanced concepts, Sequential Logic: Analysis and Synthesis offers a thorough exposition of the analysis and synthesis of both synchronous and asynchronous sequential machines. With 25 years of experience in designing computing equipment, the author stresses the practical design of state machines. He clearly delineates each step of the structured and rigorous design principles that can be applied to practical applications. The book begins by reviewing the analysis of combinatorial logic and Boolean a
An experimental study of two-phase flow instability on two parallel channel with low steam quality

International Nuclear Information System (INIS)

Jiang Shengyao; Wu shaorong; Bo Jinhai; Yao Meisheng; Han Bing; Zhang Youjie

1988-01-01

Experimental results on two-phase flow instability in natural circulation through two parallel channels with low steam quality are presented. The instability in a single channel is compared with that in parallel channels. The effect of unequal inlet resistance coefficients and unequal power on the parallel-channel instability is described, and the behaviour of the instability with equal exit steam quality in the two channels is investigated.
Parallelization characteristics of the DeCART code

International Nuclear Information System (INIS)

Cho, J. Y.; Joo, H. G.; Kim, H. Y.; Lee, C. C.; Chang, M. H.; Zee, S. Q.

2003-12-01

This report describes the parallelization characteristics of the DeCART code and examines its parallel performance. Parallel computing algorithms are implemented in DeCART to reduce the tremendous computational burden and memory requirement involved in the three-dimensional whole-core transport calculation. In the parallelization of the DeCART code, the axial domain decomposition is realized first by using MPI (Message Passing Interface), and then the azimuthal angle domain decomposition by using either MPI or OpenMP. When MPI is used for both the axial and the angle domain decomposition, the concept of MPI grouping is employed for convenient communication in each communication world. For the parallel computation, almost all of the computing modules except for the thermal hydraulic module are parallelized. These parallelized computing modules include the MOC ray tracing, CMFD, NEM, region-wise cross section preparation and cell homogenization modules. For the distributed allocation, almost all of the MOC and CMFD/NEM variables are allocated only for the assigned planes, which reduces the required memory by a ratio of the number of the assigned planes to the number of all planes. The parallel performance of the DeCART code is evaluated by solving two problems, a rodded variation of the C5G7 MOX three-dimensional benchmark problem and a simplified three-dimensional SMART PWR core problem. In terms of parallel performance, the DeCART code shows a good speedup of about 40.1 and 22.4 in the ray tracing module and about 37.3 and 20.2 in the total computing time when using 48 CPUs on the IBM Regatta and 24 CPUs on the LINUX cluster, respectively. In the comparison between MPI and OpenMP, OpenMP shows a somewhat better performance than MPI. Therefore, it is concluded that the first priority in the parallel computation of the DeCART code is in the axial domain decomposition by using MPI, and then in the angular domain using OpenMP, and finally the angular
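The speedup figures quoted in the abstract above can be turned into parallel efficiencies with one line of arithmetic, efficiency = speedup / number of CPUs. The short sketch below does only that; the helper name `parallel_efficiency` and the dictionary layout are illustrative assumptions, and the four (speedup, CPU count) pairs are the ones stated in the abstract.

```python
def parallel_efficiency(speedup: float, n_cpus: int) -> float:
    """Parallel efficiency, defined as speedup divided by the CPU count."""
    if n_cpus <= 0:
        raise ValueError("n_cpus must be positive")
    return speedup / n_cpus


# (speedup, CPUs) pairs as quoted in the abstract; labels are illustrative.
reported = {
    ("IBM Regatta", "ray tracing"): (40.1, 48),
    ("LINUX cluster", "ray tracing"): (22.4, 24),
    ("IBM Regatta", "total time"): (37.3, 48),
    ("LINUX cluster", "total time"): (20.2, 24),
}

for (machine, part), (speedup, cpus) in reported.items():
    eff = parallel_efficiency(speedup, cpus)
    print(f"{machine:13s} {part:11s} speedup {speedup:5.1f} "
          f"on {cpus:2d} CPUs -> efficiency {eff:.1%}")
```

On these numbers the ray tracing module runs at roughly 84% (48 CPUs) and 93% (24 CPUs) efficiency, and the total computing time at roughly 78% and 84%.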
Comparison of Coregistration Accuracy of Pelvic Structures Between Sequential and Simultaneous Imaging During Hybrid PET/MRI in Patients with Bladder Cancer.

Science.gov (United States)

Rosenkrantz, Andrew B; Balar, Arjun V; Huang, William C; Jackson, Kimberly; Friedman, Kent P

2015-08-01

The aim of this study was to compare coregistration of the bladder wall, bladder masses, and pelvic lymph nodes between sequential and simultaneous PET and MRI acquisitions obtained during hybrid (18)F-FDG PET/MRI performed using a diuresis protocol in bladder cancer patients. Six bladder cancer patients underwent (18)F-FDG hybrid PET/MRI, including IV Lasix administration and oral hydration, before imaging to achieve bladder clearance. Axial T2-weighted imaging (T2WI) was obtained approximately 40 minutes before PET ("sequential") and concurrently with PET ("simultaneous"). Three-dimensional spatial coordinates of the bladder wall, bladder masses, and pelvic lymph nodes were recorded for PET and T2WI. Distances between these locations on PET and T2WI sequences were computed and used to compare in-plane (x-y plane) and through-plane (z-axis) misregistration relative to PET between T2WI acquisitions. The bladder increased in volume between T2WI acquisitions (sequential, 176 [139] mL; simultaneous, 255 [146] mL). Four patients exhibited a bladder mass, all with increased activity (SUV, 9.5-38.4). Seven pelvic lymph nodes in 4 patients showed increased activity (SUV, 2.2-9.9). The bladder wall exhibited substantially less misregistration relative to PET for simultaneous, compared with sequential, acquisitions in in-plane (2.8 [3.1] mm vs 7.4 [9.1] mm) and through-plane (1.7 [2.2] mm vs 5.7 [9.6] mm) dimensions. Bladder masses exhibited slightly decreased misregistration for simultaneous, compared with sequential, acquisitions in in-plane (2.2 [1.4] mm vs 2.6 [1.9] mm) and through-plane (0.0 [0.0] mm vs 0.3 [0.8] mm) dimensions. FDG-avid lymph nodes exhibited slightly decreased in-plane misregistration (1.1 [0.8] mm vs 2.5 [0.6] mm), although identical through-plane misregistration (4.0 [1.9] mm vs 4.0 [2.8] mm). Using hybrid PET/MRI, simultaneous imaging substantially improved bladder wall coregistration and slightly improved coregistration of bladder masses and
A Novel Parallel Algorithm for Edit Distance Computation

Directory of Open Access Journals (Sweden)

Muhammad Murtaza Yousaf

2018-01-01

The edit distance between two sequences is the minimum number of weighted transformation-operations that are required to transform one string into the other. The weighted transformation-operations are insert, remove, and substitute. A dynamic programming solution for finding the edit distance exists, but it becomes computationally intensive when the strings become very long. This work presents a novel parallel algorithm to solve the edit distance problem of string matching. The algorithm is based on resolving dependencies in the dynamic programming solution of the problem, and it is able to compute each row of the edit distance table in parallel. In this way, it becomes possible to compute the complete table in min(m,n) iterations for strings of size m and n, whereas the state-of-the-art parallel algorithm solves the problem in max(m,n) iterations. The proposed algorithm also increases the amount of parallelism in each of its iterations. The algorithm is also capable of exploiting spatial locality in its implementation. Additionally, the algorithm works in a load-balanced way that further improves its performance. The algorithm is implemented for multicore systems having shared memory. Implementation of the algorithm in OpenMP shows linear speedup and better execution time as compared to the state-of-the-art parallel approach. The efficiency of the algorithm is also shown to be better than that of its competitor.
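One way to see how a whole row of the edit distance table can be made independent of its left neighbour is sketched below. This is not necessarily the paper's algorithm; it is a commonly used reformulation, assumed here for illustration, with unit costs and a hypothetical function name `edit_distance_rowwise`. Step 1 uses only the previous row, so every column is independent; step 2 folds in the left-neighbour dependency as a prefix minimum, which is an associative scan and therefore also parallelizable. The sketch itself runs sequentially.

```python
from itertools import accumulate


def edit_distance_rowwise(a: str, b: str) -> int:
    """Unit-cost edit distance, computed one full row at a time.

    Illustrative sketch: the in-row dependency D[i][j-1] + 1 is
    replaced by a prefix minimum so that a row depends only on the
    previous row.
    """
    prev = list(range(len(b) + 1))  # row 0: distance from "" to b[:j]
    for i, ch in enumerate(a, start=1):
        # Step 1: candidates that use only the previous row
        # (delete from a, or substitute/match). Independent per column.
        t = [i] + [
            min(prev[j] + 1, prev[j - 1] + (ch != b[j - 1]))
            for j in range(1, len(b) + 1)
        ]
        # Step 2: D[i][j] = min over k <= j of t[k] + (j - k)
        #                 = j + prefix_min(t[k] - k).
        shifted = accumulate((t[k] - k for k in range(len(t))), min)
        prev = [m + j for j, m in enumerate(shifted)]
    return prev[-1]


if __name__ == "__main__":
    print(edit_distance_rowwise("kitten", "sitting"))
```

Iterating over the shorter string in the outer loop keeps the number of row steps at min(m,n), in line with the iteration count the abstract reports.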
Scientific programming on massively parallel processor CP-PACS

International Nuclear Information System (INIS)

Boku, Taisuke

1998-01-01

The massively parallel processor CP-PACS targets a range of problems in computational physics, and its architecture was designed to support various kinds of numerical processing. This report outlines the CP-PACS, gives a programming example using the Kernel CG benchmark from the NAS Parallel Benchmarks, version 1, and describes two major architectural features of the CP-PACS: the pseudo vector processing mechanism and the tuning of parallel scientific and technical computation on the three-dimensional hyper crossbar network. The CP-PACS uses processing units (PUs) based on a RISC processor extended with a pseudo vector processor. Pseudo vector processing is realized as loop processing with scalar instructions. The features of the interconnection network of the PUs are explained. The algorithm of the NPB version 1 Kernel CG is shown. The most time-consuming part of the main loop is the matrix-vector product (matvec), and its parallel processing is explained. The CPU computation time is determined. The performance evaluation covers execution time, short-vector processing on the pseudo vector processor based on a slide window, and a comparison with other parallel computers. (K.I.)
Performance Analysis of Video Transmission Using Sequential Distortion Minimization Method for Digital Video Broadcasting Terrestrial

Directory of Open Access Journals (Sweden)

Novita Astin

2016-12-01

This paper presents the transmission of a Digital Video Broadcasting system with streaming video at a resolution of 640x480 at different IQ rates and modulations. In video transmission, distortion often occurs, so the received video has poor quality. Key-frame selection algorithms are flexible with respect to changes in the video, but these methods omit the temporal information of a video sequence. To minimize distortion between the original video and the received video, we added a sequential distortion minimization algorithm. Its aim was to create a new video, better than the original video, without significant loss of content between the original video and the received video, fixed sequentially. The reliability of video transmission was observed based on a constellation diagram, with the best result at an IQ rate of 2 MHz and 8 QAM modulation. The best video transmission was also investigated with and without SEDIM (Sequential Distortion Minimization Method). The experimental results showed that, with SEDIM, the average PSNR (Peak Signal to Noise Ratio) of video transmission increased from 19.855 dB to 48.386 dB and the average SSIM (Structural Similarity) increased by 10.49%. The experimental results and comparison showed that the proposed method performed well. A USRP board was used as the RF front-end at 2.2 GHz.
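For readers unfamiliar with the PSNR metric reported above, the standard definition is PSNR = 10 log10(MAX² / MSE), where MSE is the mean squared error between original and received samples and MAX is the peak sample value. The sketch below applies that definition to flat lists of pixel values; the function name `psnr` and the 8-bit peak of 255 are assumptions for illustration, not details taken from the paper.

```python
import math


def psnr(original, received, max_value=255):
    """Peak signal-to-noise ratio in dB between two equal-length
    sequences of pixel values (illustrative sketch; 8-bit peak assumed)."""
    if len(original) != len(received) or len(original) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((o - r) ** 2 for o, r in zip(original, received)) / len(original)
    if mse == 0:
        return math.inf  # identical signals: no distortion
    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    frame_sent = [100, 100, 100, 100]
    frame_recv = [110, 90, 110, 90]   # every pixel is off by 10
    print(f"{psnr(frame_sent, frame_recv):.2f} dB")
```

Because the scale is logarithmic, each 10 dB gain in PSNR corresponds to a tenfold reduction in mean squared error.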
Structural Consistency, Consistency, and Sequential Rationality.

OpenAIRE

Kreps, David M; Ramey, Garey

1987-01-01

Sequential equilibria comprise consistent beliefs and a sequentially rational strategy profile. Consistent beliefs are limits of Bayes rational beliefs for sequences of strategies that approach the equilibrium strategy. Beliefs are structurally consistent if they are rationalized by some single conjecture concerning opponents' strategies. Consistent beliefs are not necessarily structurally consistent, notwithstanding a claim by Kreps and Robert Wilson (1982). Moreover, the spirit of stru...
Aging in Movement Representations for Sequential Finger Movements: A Comparison between Young-, Middle-Aged, and Older Adults

Science.gov (United States)

Cacola, Priscila; Roberson, Jerroed; Gabbard, Carl

2013-01-01

Studies show that as we enter older adulthood (greater than 64 years), our ability to mentally represent action in the form of using motor imagery declines. Using a chronometry paradigm to compare the movement duration of imagined and executed movements, we tested young-, middle-aged, and older adults on their ability to perform sequential finger…
Constraint treatment techniques and parallel algorithms for multibody dynamic analysis. Ph.D. Thesis

Science.gov (United States)

Chiou, Jin-Chern

1990-01-01

Computational procedures for kinematic and dynamic analysis of three-dimensional multibody dynamic (MBD) systems are developed from the differential-algebraic equations (DAEs) viewpoint. Constraint violations during the time integration process are minimized, and penalty constraint stabilization techniques and partitioning schemes are developed. The governing equations of motion are treated with a two-stage staggered explicit-implicit numerical algorithm that takes advantage of a partitioned solution procedure. A robust and parallelizable integration algorithm is developed. This algorithm uses a two-stage staggered central difference algorithm to integrate the translational coordinates and the angular velocities. The angular orientations of bodies in MBD systems are then obtained by using an implicit algorithm via the kinematic relationship between Euler parameters and angular velocities. It is shown that the combination of the present solution procedures yields a computationally more accurate solution. To speed up the computational procedures, a parallel implementation of the present constraint treatment techniques and the two-stage staggered explicit-implicit numerical algorithm was efficiently carried out. The DAEs and the constraint treatment techniques were transformed into arrowhead matrices, from which a Schur complement form was derived. By fully exploiting sparse matrix structural analysis techniques, a parallel preconditioned conjugate gradient numerical algorithm is used to solve the system equations written in Schur complement form. A software testbed was designed and implemented on both sequential and parallel computers. This testbed was used to demonstrate the robustness and efficiency of the constraint treatment techniques, the accuracy of the two-stage staggered explicit-implicit numerical algorithm, and the speedup of the Schur-complement-based parallel preconditioned conjugate gradient algorithm on a parallel computer.
Distributed and parallel approach for handle and perform huge datasets

Science.gov (United States)

Konopko, Joanna

2015-12-01

Big Data refers to dynamic, large and disparate volumes of data that come from many different, mutually uncorrelated sources (tools, machines, sensors, mobile devices). It requires new, innovative and scalable technology to collect, host and analytically process the vast amount of data. A proper architecture for systems that process huge data sets is needed. In this paper, distributed and parallel system architectures are compared using the example of the MapReduce (MR) Hadoop platform and a parallel database platform (DBMS). This paper also analyzes the problem of processing and handling valuable information from petabytes of data. Both paradigms, MapReduce and parallel DBMS, are described and compared. A hybrid architecture approach is also proposed, which could be used to solve the analyzed problem of storing and processing Big Data.
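The MapReduce programming model named in the abstract above can be illustrated in a few lines. The sketch below is a single-process stand-in written in plain Python, not Hadoop code: the function names `map_phase`, `shuffle`, `reduce_phase` and `run_job` are hypothetical, and a word count is used only because it is the customary minimal example. In a real deployment the map and reduce calls would run on different nodes and the shuffle would move data across the network.

```python
from collections import defaultdict


def map_phase(record: str):
    """Map: emit a (key, value) pair for every word in one input record."""
    for word in record.split():
        yield word.lower(), 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values of one key into a single result."""
    return key, sum(values)


def run_job(records):
    """Run map, shuffle and reduce in sequence (illustrative sketch)."""
    pairs = (pair for record in records for pair in map_phase(record))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(run_job(["big data needs big systems", "data systems scale"]))
```

A parallel DBMS would typically express the same aggregation declaratively, as a grouped count, and leave the partitioning to the query planner; that contrast is the kind of difference the abstract's comparison concerns.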

Design and Transmission Analysis of an Asymmetrical Spherical Parallel Manipulator

DEFF Research Database (Denmark)

Wu, Guanglei; Caro, Stéphane; Wang, Jiawei

2015-01-01

This paper presents an asymmetrical spherical parallel manipulator and its transmissibility analysis. This manipulator contains a center shaft to both generate a decoupled unlimited-torsion motion and support the mobile platform for high positioning accuracy. This work addresses the transmission analysis and optimal design of the proposed manipulator based on its kinematic analysis. The input and output transmission indices of the manipulator are defined for its optimum design based on the virtual coefficient between the transmission wrenches and twist screws. The sets of optimal parameters are identified and the distribution of the transmission index is visualized. Moreover, a comparative study of its performance against that of symmetrical spherical parallel manipulators is conducted, and the comparison shows the advantages of the proposed manipulator with respect to its spherical parallel...
Parallelizing flow-accumulation calculations on graphics processing units—From iterative DEM preprocessing algorithm to recursive multiple-flow-direction algorithm

Science.gov (United States)

Qin, Cheng-Zhi; Zhan, Lijun

2012-06-01

As one of the important tasks in digital terrain analysis, the calculation of flow accumulations from gridded digital elevation models (DEMs) usually involves two steps in a real application: (1) using an iterative DEM preprocessing algorithm to remove the depressions and flat areas commonly contained in real DEMs, and (2) using a recursive flow-direction algorithm to calculate the flow accumulation for every cell in the DEM. Because both algorithms are computationally intensive, quick calculation of the flow accumulations from a DEM (especially for a large area) presents a practical challenge to personal computer (PC) users. In recent years, rapid increases in hardware capacity of the graphics processing units (GPUs) provided in modern PCs have made it possible to meet this challenge in a PC environment. Parallel computing on GPUs using a compute-unified-device-architecture (CUDA) programming model has been explored to speed up the execution of the single-flow-direction algorithm (SFD). However, the parallel implementation on a GPU of the multiple-flow-direction (MFD) algorithm, which generally performs better than the SFD algorithm, has not been reported. Moreover, GPU-based parallelization of the DEM preprocessing step in the flow-accumulation calculations has not been addressed. This paper proposes a parallel approach to calculate flow accumulations (including both iterative DEM preprocessing and a recursive MFD algorithm) on a CUDA-compatible GPU. For the parallelization of an MFD algorithm (MFD-md), two different parallelization strategies using a GPU are explored. The first parallelization strategy, which has been used in the existing parallel SFD algorithm on GPU, has the problem of computing redundancy. Therefore, we designed a parallelization strategy based on graph theory. The application results show that the proposed parallel approach to calculate flow accumulations on a GPU performs much faster than either sequential algorithms or other parallel GPU
Efficient parallel algorithms for string editing and related problems

Science.gov (United States)

Apostolico, Alberto; Atallah, Mikhail J.; Larmore, Lawrence; Mcfaddin, H. S.

1988-01-01

The string editing problem for input strings x and y consists of transforming x into y by performing a series of weighted edit operations on x of overall minimum cost. An edit operation on x can be the deletion of a symbol from x, the insertion of a symbol in x, or the substitution of a symbol of x with another symbol. This problem has a well-known O(|x||y|) time sequential solution (25). Efficient Parallel Random Access Machine (PRAM) algorithms for the string editing problem are given. If m = min(|x|, |y|) and n = max(|x|, |y|), then the CREW bound is O(log m log n) time with O(mn/log m) processors. In all algorithms, space is O(mn).
Generalized infimum and sequential product of quantum effects

International Nuclear Information System (INIS)

Li Yuan; Sun Xiuhong; Chen Zhengli

2007-01-01

The quantum effects for a physical system can be described by the set E(H) of positive operators on a complex Hilbert space H that are bounded above by the identity operator I. For A, B ∈ E(H), the sequential product A ∘ B = A^(1/2) B A^(1/2) was proposed as a model for sequential quantum measurements. A nice investigation of the properties of the sequential product was carried out in [Gudder, S. and Nagy, G., 'Sequential quantum measurements', J. Math. Phys. 42, 5212 (2001)]. In this note, we extend some results of this reference. In particular, a gap in the proof of Theorem 3.2 in this reference is overcome. In addition, some properties of the generalized infimum A ⊓ B are studied.
Parallel Processing and Bio-inspired Computing for Biomedical Image Registration

Directory of Open Access Journals (Sweden)

Silviu Ioan Bejinariu

2014-07-01

Image Registration (IR) is an optimization problem that computes the optimal parameters of a geometric transform used to overlay one or more source images onto a given model by maximizing a similarity measure. In this paper the use of bio-inspired optimization algorithms in image registration is analyzed. Results obtained by means of three different algorithms are compared: the Bacterial Foraging Optimization Algorithm (BFOA), the Genetic Algorithm (GA) and the Clonal Selection Algorithm (CSA). Depending on the image type, the registration may be area based, which is slow but more precise, or feature based, which is faster. In this paper a feature-based approach built on the Scale Invariant Feature Transform (SIFT) is proposed. Finally, results obtained using sequential and parallel implementations on multi-core systems for area-based and feature-based image registration are compared.
Fast and accurate non-sequential protein structure alignment using a new asymmetric linear sum assignment heuristic.

Science.gov (United States)

Brown, Peter; Pullan, Wayne; Yang, Yuedong; Zhou, Yaoqi

2016-02-01

The three-dimensional tertiary structure of a protein at near-atomic resolution provides insight into its function and evolution. As protein structure decides its functionality, similarity in structure usually implies similarity in function. As such, structure alignment techniques are often useful in the classification of protein function. Given the rapidly growing rate of new, experimentally determined structures being made available from repositories such as the Protein Data Bank, fast and accurate computational structure comparison tools are required. This paper presents SPalignNS, a non-sequential protein structure alignment tool using a novel asymmetrical greedy search technique. The performance of SPalignNS was evaluated against existing sequential and non-sequential structure alignment methods by performing trials with commonly used datasets. These benchmark datasets used to gauge alignment accuracy include (i) 9538 pairwise alignments implied by the HOMSTRAD database of homologous proteins; (ii) a subset of 64 difficult alignments from set (i) that have low structure similarity; (iii) 199 pairwise alignments of proteins with similar structure but different topology; and (iv) a subset of 20 pairwise alignments from the RIPC set. SPalignNS is shown to achieve greater alignment accuracy (lower or comparable root-mean squared distance with increased structure overlap coverage) for all datasets, and the highest agreement with reference alignments from the challenging dataset (iv) above, when compared with both sequentially constrained alignments and other non-sequential alignments. SPalignNS was implemented in C++. The source code, binary executable, and a web server version are freely available at: http://sparks-lab.org.
Sequential analysis in neonatal research-systematic review.

Science.gov (United States)

Lava, Sebastiano A G; Elie, Valéry; Ha, Phuong Thi Viet; Jacqz-Aigrain, Evelyne

2018-05-01

As more new drugs are discovered, traditional designs reach their limits. Ten years after the adoption of the European Paediatric Regulation, we performed a systematic review on the US National Library of Medicine and Excerpta Medica database of sequential trials involving newborns. Out of 326 identified scientific reports, 21 trials were included. They enrolled 2832 patients, of whom 2099 were analyzed: the median number of neonates included per trial was 48 (IQR 22-87), and median gestational age was 28.7 (IQR 27.9-30.9) weeks. Eighteen trials used sequential techniques to determine sample size, while 3 used continual reassessment methods for dose-finding. In 16 studies reporting sufficient data, the sequential design reduced the number of enrolled neonates by a median of 24 (31%) patients with respect to a traditional trial, a reduction that was not statistically significant (IQR -4.75 to 136.5, p = 0.0674). When the number of neonates finally included in the analysis was considered, the difference became significant: 35 (57%) patients (IQR 10 to 136.5, p = 0.0033). Sequential trial designs have not been frequently used in Neonatology. They might potentially be able to reduce the number of patients in drug trials, although this is not always the case. What is known: • In evaluating rare diseases in fragile populations, traditional designs reach their limits. About 20% of pediatric trials are discontinued, mainly because of recruitment problems. What is new: • Sequential trials involving newborns were infrequently used and only a few (n = 21) are available for analysis. • The sequential design reduced the number of enrolled neonates by a median of 24 (31%) patients, although not significantly (IQR -4.75 to 136.5, p = 0.0674).
Effects of rainfall on water quality in six sequentially disposed fishponds with continuous water flow

Directory of Open Access Journals (Sweden)

LH. Sipaúba-Tavares

An investigation was carried out during the rainy period in six semi-intensive production fish ponds in which water flowed from one pond to another without undergoing any treatment. Eight sampling sites were assigned at pond outlets during the rainy period (December-February). The lowest and highest values of the physical and chemical water parameters occurred in pond P1 (a site near the springs) and in pond P4 (a critical site that received allochthonous material from the other ponds and also from frog culture ponds), respectively. The sequential layout of the ponds caused concentration of nutrients, chlorophyll-a and conductivity. Seasonal rains increased the water flow in the ponds and, consequently, carried more particles and other dissolved material from one fish pond to another. Silting increased limnological variables from P3 to P6. Although the results suggest that rainfall positively affected the ponds' water quality during the period under analysis, the analyzed systems are aligned in a sequential layout with constant water flow from fish ponds and parallel tanks without any previous treatment, so care has to be taken that an increase in rain-induced water flow does not have a contrary effect in the fish ponds investigated.
Group-sequential analysis may allow for early trial termination

DEFF Research Database (Denmark)

Gerke, Oke; Vilstrup, Mie H; Halekoh, Ulrich

2017-01-01

BACKGROUND: Group-sequential testing is widely used in pivotal therapeutic, but rarely in diagnostic research, although it may save studies, time, and costs. The purpose of this paper was to demonstrate a group-sequential analysis strategy in an intra-observer study on quantitative FDG-PET/CT mea...... assumed to be normally distributed, and sequential one-sided hypothesis tests on the population standard deviation of the differences against a hypothesised value of 1.5 were performed, employing an alpha spending function. The fixed-sample analysis (N = 45) was compared with the group-sequential analysis...... strategies comprising one (at N = 23), two (at N = 15, 30), or three interim analyses (at N = 11, 23, 34), respectively, which were defined post hoc. RESULTS: When performing interim analyses with one third and two thirds of patients, sufficient agreement could be concluded after the first interim analysis...
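To make the idea of an alpha spending function concrete, the sketch below computes how much of the overall type I error would be "spent" at each look for the three-interim strategy (N = 11, 23, 34, then the final analysis at N = 45). The abstract does not say which spending function or overall alpha was used, so the two textbook Lan-DeMets forms and the value alpha = 0.05 are assumptions made purely for illustration.

```python
from math import e, log, sqrt
from statistics import NormalDist


def obf_type_spending(t, alpha):
    """O'Brien-Fleming-type spending: 2 - 2 * Phi(z_{alpha/2} / sqrt(t))."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return 2 - 2 * NormalDist().cdf(z / sqrt(t))


def pocock_type_spending(t, alpha):
    """Pocock-type spending: alpha * ln(1 + (e - 1) * t)."""
    return alpha * log(1 + (e - 1) * t)


def spending_schedule(looks, n_max, alpha, spend):
    """Cumulative and per-look alpha at information fractions n / n_max."""
    cumulative = [spend(n / n_max, alpha) for n in looks]
    increments = [cumulative[0]] + [
        later - earlier for earlier, later in zip(cumulative, cumulative[1:])
    ]
    return cumulative, increments


looks = [11, 23, 34, 45]  # interim looks from the abstract plus the final N
for name, spend in [("OBF-type", obf_type_spending),
                    ("Pocock-type", pocock_type_spending)]:
    cumulative, _ = spending_schedule(looks, 45, 0.05, spend)
    print(name, [round(a, 5) for a in cumulative])
```

Both functions spend exactly alpha by the final look; the O'Brien-Fleming-type form spends very little early, which makes early stopping harder but leaves the final test close to the fixed-sample one.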
Parallel algorithms for unconstrained optimization by multisplitting with inexact subspace search - the abstract

Energy Technology Data Exchange (ETDEWEB)

Renaut, R.; He, Q. [Arizona State Univ., Tempe, AZ (United States)

1994-12-31

A new parallel iterative algorithm for unconstrained optimization by multisplitting is proposed. In this algorithm the original problem is split into a set of small optimization subproblems which are solved using well-known sequential algorithms. These algorithms are iterative in nature, e.g. the DFP variable metric method. Here the authors use sequential algorithms based on an inexact subspace search, which is an extension of the usual idea of an inexact line search. Essentially, the idea of the inexact line search for nonlinear minimization is that at each iteration only an approximate minimum is found in the line search direction. Hence, by inexact subspace search, the authors mean that, instead of finding the minimum of the subproblem at each iteration, they do an incomplete downhill search to give an approximate minimum. Some convergence and numerical results for this algorithm will be presented. Further, the original theory will be generalized to the situation with a singular Hessian. Applications for nonlinear least squares problems will be presented. Experimental results will be presented for implementations on an Intel iPSC/860 Hypercube with 64 nodes as well as on the Intel Paragon.
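The inexact line search that the abstract generalizes can be illustrated with the common backtracking (Armijo) rule: shrink the step until the function has decreased "enough", rather than locating the exact minimizer along the direction. The sketch below is one standard variant, not the authors' subspace search; the function name, the parameters `shrink` and `c`, and the test function are all assumptions chosen for the example.

```python
def backtracking_line_search(f, grad, x, direction,
                             step=1.0, shrink=0.5, c=1e-4, max_iter=60):
    """Return (step, new_point) satisfying the Armijo sufficient-decrease test."""
    fx = f(x)
    # Directional derivative of f at x along the search direction.
    slope = sum(g * d for g, d in zip(grad(x), direction))
    if slope >= 0:
        raise ValueError("direction is not a descent direction")
    for _ in range(max_iter):
        trial = [xi + step * di for xi, di in zip(x, direction)]
        # Accept as soon as the decrease is at least a fraction c of the
        # decrease predicted by the linear model; no exact minimum is sought.
        if f(trial) <= fx + c * step * slope:
            return step, trial
        step *= shrink
    raise RuntimeError("no acceptable step found")


def f(v):
    return v[0] ** 2 + 10 * v[1] ** 2


def grad(v):
    return [2 * v[0], 20 * v[1]]


x0 = [1.0, 1.0]
steepest_descent = [-g for g in grad(x0)]
step, x1 = backtracking_line_search(f, grad, x0, steepest_descent)
print(step, x1, f(x1))
```

An inexact subspace search in the sense of the abstract would apply the same "good enough decrease" idea to a subproblem over several directions at once instead of a single line.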
Comparison between four dissimilar solar panel configurations

Science.gov (United States)

Suleiman, K.; Ali, U. A.; Yusuf, Ibrahim; Koko, A. D.; Bala, S. I.

2017-12-01

Several studies on photovoltaic systems have focused on how they operate and on the energy required to operate them. Little attention has been paid to their configurations, the modeling of mean time to system failure, availability, cost benefit, and comparisons of parallel and series-parallel designs. In this research work, four system configurations were studied. Configuration I consists of two sub-components arranged in parallel with 24 V each, configuration II consists of four sub-components arranged logically in parallel with 12 V each, configuration III consists of four sub-components arranged in series-parallel with 8 V each, and configuration IV has six sub-components with 6 V each arranged in series-parallel. Comparative analysis was made using the Chapman-Kolmogorov method. Explicit expressions for mean time to system failure, steady-state availability and cost-benefit analysis were derived, and the configurations were compared on that basis. A ranking method was used to determine the optimal configuration of the systems. Analytical and numerical results for system availability and mean time to system failure were obtained, and configuration I was found to be the optimal configuration.
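For readers unfamiliar with the quantities being compared, the following sketch evaluates steady-state availability and mean time to failure for idealized parallel and series arrangements. It uses textbook formulas for independent units with exponential failure and repair times and is much simpler than the Chapman-Kolmogorov models in the paper; the rates used are hypothetical and are not taken from the study.

```python
def unit_availability(failure_rate, repair_rate):
    """Steady-state availability of one repairable unit: mu / (lambda + mu)."""
    return repair_rate / (failure_rate + repair_rate)


def parallel_availability(availabilities):
    """A parallel block is down only when every unit is down."""
    all_down = 1.0
    for a in availabilities:
        all_down *= 1.0 - a
    return 1.0 - all_down


def series_availability(availabilities):
    """A series block is up only when every unit is up."""
    all_up = 1.0
    for a in availabilities:
        all_up *= a
    return all_up


def parallel_mttf(n_units, failure_rate):
    """MTTF of n identical non-repairable units in parallel: H_n / lambda."""
    return sum(1.0 / k for k in range(1, n_units + 1)) / failure_rate


a = unit_availability(failure_rate=1.0, repair_rate=9.0)  # hypothetical rates
print(parallel_availability([a, a]))   # two units in parallel
print(series_availability([parallel_availability([a, a])] * 2))  # series-parallel
print(parallel_mttf(2, failure_rate=0.5))
```

Under these simplifying assumptions, adding parallel units raises availability while adding series stages lowers it, which is the trade-off a ranking of configurations has to weigh against cost.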
GPU: the biggest key processor for AI and parallel processing

Science.gov (United States)

Baji, Toru

2017-07-01

Two types of processors exist in the market. One is the conventional CPU and the other is the Graphics Processing Unit (GPU). A typical CPU is composed of 1 to 8 cores, while a GPU has thousands of cores. The CPU is good for sequential processing, while the GPU is good at accelerating software with heavily parallel execution. The GPU was initially dedicated to 3D graphics. However, from 2006, when GPUs started to adopt general-purpose cores, it was noticed that this architecture can be used as a general-purpose massively parallel processor. NVIDIA developed a software framework, Compute Unified Device Architecture (CUDA), that makes it possible to easily program the GPU for these applications. With CUDA, GPUs started to be widely used in workstations and supercomputers. Recently two key technologies have been highlighted in the industry: Artificial Intelligence (AI) and autonomous driving cars. AI requires massively parallel operations to train many-layer neural networks. With CPUs alone, it was impossible to finish the training in a practical time. The latest multi-GPU system with P100 makes it possible to finish the training in a few hours. For autonomous driving cars, TOPS-class performance is required to implement perception, localization and path-planning processing, and again an SoC with an integrated GPU will play a key role there. In this paper, the evolution of the GPU, which is one of the biggest commercial devices requiring state-of-the-art fabrication technology, will be introduced. An overview of key GPU-demanding applications like the ones described above will also be introduced.
A Survey of Multi-Objective Sequential Decision-Making

NARCIS (Netherlands)

Roijers, D.M.; Vamplew, P.; Whiteson, S.; Dazeley, R.

2013-01-01

Sequential decision-making problems with multiple objectives arise naturally in practice and pose unique challenges for research in decision-theoretic planning and learning, which has largely focused on single-objective settings. This article surveys algorithms designed for sequential
Sequential lineups: shift in criterion or decision strategy?

Science.gov (United States)

Gronlund, Scott D

2004-04-01

R. C. L. Lindsay and G. L. Wells (1985) argued that a sequential lineup enhanced discriminability because it elicited use of an absolute decision strategy. E. B. Ebbesen and H. D. Flowe (2002) argued that a sequential lineup led witnesses to adopt a more conservative response criterion, thereby affecting bias, not discriminability. Height was encoded as absolute (e.g., 6 ft [1.83 m] tall) or relative (e.g., taller than). If a sequential lineup elicited an absolute decision strategy, the principle of transfer-appropriate processing predicted that performance should be best when height was encoded absolutely. Conversely, if a simultaneous lineup elicited a relative decision strategy, performance should be best when height was encoded relatively. The predicted interaction was observed, providing direct evidence for the decision strategies explanation of what happens when witnesses view a sequential lineup.
Using parallel computing in modeling and optimization of mineral ...

African Journals Online (AJOL)

To solve the ultimate pit limit problem, one must then find a subgraph of a graph whose sum of weights is maximal. One possible approach to this problem is to use genetic algorithms. We use a ... Details of the implementation of a parallel genetic algorithm for searching open pit limits are provided. Comparison with ...
Experience with highly-parallel software for the storage system of the ATLAS Experiment at CERN

CERN Document Server

Colombo, T; The ATLAS collaboration

2012-01-01

The ATLAS experiment is observing proton-proton collisions delivered by the LHC accelerator. The ATLAS Trigger and Data Acquisition (TDAQ) system selects interesting events on-line in a three-level trigger system in order to store them at a budgeted rate of several hundred Hz. This paper focuses on the TDAQ data-logging system and in particular on the implementation and performance of a novel parallel software design. In this respect, the main challenge presented by the data-logging workload is the conflict between the largely parallel nature of the event processing, especially the recently introduced event compression, and the constraint of sequential file writing and checksum evaluation. This is further complicated by the necessity of operating in a fully data-driven mode, to cope with continuously evolving trigger and detector configurations. In this paper we report on the design of the new ATLAS on-line storage software. In particular we will discuss our development experience using recent concurrency-ori...
How to Read the Tractatus Sequentially

Directory of Open Access Journals (Sweden)

Tim Kraft

2016-11-01

One of the unconventional features of Wittgenstein’s Tractatus Logico-Philosophicus is its use of an elaborate and detailed numbering system. Recently, Bazzocchi, Hacker and Kuusela have argued that the numbering system means that the Tractatus must be read and interpreted not as a sequentially ordered book, but as a text with a two-dimensional, tree-like structure. Apart from being able to explain how the Tractatus was composed, the tree reading allegedly solves exegetical issues both on the local level (e. g. how 4.02 fits into the series of remarks surrounding it) and the global level (e. g. relation between ontology and picture theory, solipsism and the eye analogy, resolute and irresolute readings). This paper defends the sequential reading against the tree reading. After presenting the challenges generated by the numbering system and the two accounts as attempts to solve them, it is argued that Wittgenstein’s own explanation of the numbering system, anaphoric references within the Tractatus and the exegetical issues mentioned above do not favour the tree reading, but a version of the sequential reading. This reading maintains that the remarks of the Tractatus form a sequential chain: The role of the numbers is to indicate how remarks on different levels are interconnected to form a concise, surveyable and unified whole.
A two-level parallel direct search implementation for arbitrarily sized objective functions

Energy Technology Data Exchange (ETDEWEB)

Hutchinson, S.A.; Shadid, N.; Moffat, H.K. [Sandia National Labs., Albuquerque, NM (United States)] [and others

1994-12-31

In the past, many optimization schemes for massively parallel computers have attempted to achieve parallel efficiency using one of two methods. In the case of large and expensive objective function calculations, the optimization itself may be run in serial and the objective function calculations parallelized. In contrast, if the objective function calculations are relatively inexpensive and can be performed on a single processor, then the actual optimization routine itself may be parallelized. In this paper, a scheme based upon the Parallel Direct Search (PDS) technique is presented which allows the objective function calculations to be done on an arbitrarily large number ($p_2$) of processors. If $p$, the number of processors available, is greater than or equal to $2p_2$, then the optimization may be parallelized as well. This allows for efficient use of computational resources since the objective function calculations can be performed on the number of processors that allow for peak parallel efficiency and then further speedup may be achieved by parallelizing the optimization. Results are presented for an optimization problem which involves the solution of a PDE using a finite-element algorithm as part of the objective function calculation. The optimum number of processors for the finite-element calculations is less than $p/2$. Thus, the PDS method is also parallelized. Performance comparisons are given for an nCUBE 2 implementation.
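The two-level condition in the abstract can be read as simple bookkeeping: each objective function evaluation occupies a group of p_2 processors, and the optimization level can run in parallel once at least two such groups fit into the p available processors. The sketch below encodes only that arithmetic; the function name and the returned fields are hypothetical, and how PDS actually schedules its search points across groups is not described in the abstract.

```python
def plan_two_level(p, p2):
    """Split p processors into groups of p2 for concurrent objective evaluations."""
    if p2 < 1 or p < p2:
        raise ValueError("need at least p2 >= 1 processors per evaluation and p >= p2")
    groups = p // p2
    return {
        "groups": groups,                        # evaluations that can run at once
        "parallel_optimization": p >= 2 * p2,    # condition stated in the abstract
        "idle_processors": p - groups * p2,      # processors left over
    }


print(plan_two_level(p=64, p2=16))
print(plan_two_level(p=24, p2=16))
```

With 64 processors and 16 per evaluation, four evaluations can proceed concurrently; with 24 processors only one group fits, so the optimization level stays serial and 8 processors are unused.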
A minimax procedure in the context of sequential mastery testing

NARCIS (Netherlands)

Vos, Hendrik J.

1999-01-01

The purpose of this paper is to derive optimal rules for sequential mastery tests. In a sequential mastery test, the decision is to classify a subject as a master or a nonmaster, or to continue sampling and administering another random test item. The framework of minimax sequential decision theory
COMPARISON BETWEEN TEST METHODS TO DETERMINE WOOD EMBEDMENT STRENGTH PARALLEL TO THE GRAIN

Directory of Open Access Journals (Sweden)

Diego Henrique de Almeida

This study compares the test methods according to the ABNT NBR 7190:1997, EN 383:2007, ASTM D5764:2007, EUROCODE 5:2004 and NDS:2001 standards in order to provide support for establishing a new test method for determining the embedment strength of wood parallel to the grain. Parallel-to-grain tests were carried out for six wood species (Schizolobium amazonicum; Pinus elliottii; Pinus oocarpa; Hymenaea spp.; Lyptus®, a hybrid of Eucalyptus grandis and Eucalyptus urophylla; and Goupia glabra) using four diameters (8 mm, 10 mm, 12 mm and 16 mm) for the metal pin fasteners (bolts). The experimental results obtained according to the EN 383:2007 standard were closer to the values specified for the design of metal-dowel connections in ABNT NBR 7190:1997, which are taken to be equal to the compression strength parallel to the grain. The use of the maximum embedment force or the force causing a displacement of 5 mm between the bolt and the test piece as criteria for determining embedment strength in EN 383:2007 appears to be more appropriate than the criteria used by the Brazilian and American standards.

Multichannel, sequential or combined X-ray spectrometry

International Nuclear Information System (INIS)

Florestan, J.

1979-01-01

The strengths and weaknesses of X-ray spectrometers are evaluated for the sequential and multichannel categories. The multichannel X-ray spectrometer has the advantage of time coherency, and its results may be more reproducible; on the other hand, some spatial incoherency limits its low-percentage and trace applications, especially when backgrounds are highly variable. In this last case, the sequential X-ray spectrometer regains its usefulness.
Induction of simultaneous and sequential malolactic fermentation in durian wine.

Science.gov (United States)

Taniasuri, Fransisca; Lee, Pin-Rou; Liu, Shao-Quan

2016-08-02

This study reports for the first time the impact of malolactic fermentation (MLF) induced by Oenococcus oeni and its inoculation strategies (simultaneous vs. sequential) on the fermentation performance as well as the aroma compound profile of durian wine. Simultaneous inoculation of O. oeni and Saccharomyces cerevisiae had no negative impact on the growth and fermentation kinetics of S. cerevisiae as compared to sequential fermentation. Simultaneous MLF did not lead to an excessive increase in volatile acidity as compared to sequential MLF. The kinetic changes of organic acids (i.e. malic, lactic, succinic, acetic and α-ketoglutaric acids) varied with simultaneous and sequential MLF relative to yeast alone. MLF, regardless of inoculation mode, resulted in higher production of fermentation-derived volatiles as compared to the control (alcoholic fermentation only), including esters, volatile fatty acids, and terpenes, except for higher alcohols. Most indigenous volatile sulphur compounds in durian were decreased to trace levels, with little difference among the control, simultaneous and sequential MLF. Among the different wines, the wine with simultaneous MLF had higher concentrations of terpenes and acetate esters, while the wine with sequential MLF had increased concentrations of medium- and long-chain ethyl esters. Relative to alcoholic fermentation only, both simultaneous and sequential MLF reduced acetaldehyde substantially, with sequential MLF being more effective. These findings illustrate that MLF is an effective and novel way of modulating the volatile and aroma compound profile of durian wine.
Highly parallel line-based image coding for many cores.

Science.gov (United States)

Peng, Xiulian; Xu, Jizheng; Zhou, You; Wu, Feng

2012-01-01

Computer processors are evolving from dual-core and quad-core designs to ones with tens or even hundreds of cores. Multimedia, as one of the most important applications in computers, urgently needs parallel coding algorithms for compression. Taking intraframe/image coding as a starting point, this paper proposes a pure line-by-line coding scheme (LBLC) to meet the need. In LBLC, an input image is processed line by line sequentially, and each line is divided into small fixed-length segments. The compression of all segments, from prediction to entropy coding, is completely independent and concurrent across many cores. Results on a general-purpose computer show that our scheme can get a 13.9 times speedup with 15 cores at the encoder and a 10.3 times speedup at the decoder. Ideally, such a near-linear relation between speedup and the number of cores can be kept for more than 100 cores. In addition to the high parallelism, the proposed scheme performs comparably to or even better than the H.264 high profile above middle bit rates. At near-lossless coding, it outperforms H.264 by more than 10 dB. At lossless coding, up to 14% bit-rate reduction is observed compared with H.264 lossless coding at the high 4:4:4 profile.
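The reported speedups can be turned into two standard scalability figures: parallel efficiency (speedup divided by core count) and the Karp-Flatt experimentally determined serial fraction. The short sketch below does this arithmetic; it assumes that the decoder figure was also measured on 15 cores, which the abstract implies but does not state explicitly.

```python
def parallel_efficiency(speedup, cores):
    """Fraction of ideal linear speedup actually achieved."""
    return speedup / cores


def karp_flatt_serial_fraction(speedup, cores):
    """Experimentally determined serial fraction: (1/S - 1/n) / (1 - 1/n)."""
    return (1.0 / speedup - 1.0 / cores) / (1.0 - 1.0 / cores)


# Speedups quoted in the abstract; 15 cores for the decoder is an assumption.
for name, speedup in [("encoder", 13.9), ("decoder", 10.3)]:
    eff = parallel_efficiency(speedup, 15)
    serial = karp_flatt_serial_fraction(speedup, 15)
    print(f"{name}: efficiency {eff:.3f}, serial fraction {serial:.4f}")
```

An encoder efficiency above 0.9 is consistent with the claim of near-linear scaling, while the lower decoder figure suggests a larger non-parallel share of the work on that side.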
Sequential Banking.

OpenAIRE

Bizer, David S; DeMarzo, Peter M

1992-01-01

The authors study environments in which agents may borrow sequentially from more than one lender. Although debt is prioritized, additional lending imposes an externality on prior debt because, with moral hazard, the probability of repayment of prior loans decreases. Equilibrium interest rates are higher than they would be if borrowers could commit to borrow from at most one bank. Even though the loan terms are less favorable than they would be under commitment, the indebtedness of borrowers i...
Bioinformatics algorithm based on a parallel implementation of a machine learning approach using transducers

International Nuclear Information System (INIS)

Roche-Lima, Abiel; Thulasiram, Ruppa K

2012-01-01

Finite automata in which each transition is augmented with an output label in addition to the familiar input label are called finite-state transducers. Transducers have been used to analyze some fundamental issues in bioinformatics. Weighted finite-state transducers have been proposed for pairwise alignment of DNA and protein sequences, as well as for developing kernels for computational biology. Machine learning algorithms for conditional transducers have been implemented and used for DNA sequence analysis. Transducer learning algorithms are based on conditional probability computation, which is carried out using techniques such as pair-database creation, normalization (with Maximum-Likelihood normalization) and parameter optimization (with Expectation-Maximization, EM). These techniques are intrinsically computationally costly, even more so when applied to bioinformatics, because the databases are large. In this work, we describe a parallel implementation of an algorithm to learn conditional transducers using these techniques. The algorithm is oriented to bioinformatics applications, such as alignments, phylogenetic trees, and other genome evolution studies. Several experiments were carried out using the parallel and sequential algorithms on Westgrid (specifically, on the Breezy cluster). The results show that the parallel algorithm is scalable, because execution times are reduced considerably when the data size parameter is increased. A second experiment varied the precision parameter; in this case, the parallel algorithm again gave smaller execution times. Finally, the number of threads used to execute the parallel algorithm on the Breezy cluster was varied. In this last experiment, speedup increased considerably when more threads were used; however, it levels off for 16 or more threads.
A sequential vesicle pool model with a single release sensor and a ca(2+)-dependent priming catalyst effectively explains ca(2+)-dependent properties of neurosecretion

DEFF Research Database (Denmark)

Walter, Alexander M; da Silva Pinheiro, Paulo César; Verhage, Matthijs

2013-01-01

identified. We here propose a Sequential Pool Model (SPM), assuming a novel Ca(2+)-dependent action: a Ca(2+)-dependent catalyst that accelerates both forward and reverse priming reactions. While both models account for fast fusion from the Readily-Releasable Pool (RRP) under control of synaptotagmin-1...... the simultaneous changes in release rate and amplitude seen when mutating the SNARE-complex. Finally, it can account for the loss of fast- and the persistence of slow release in the synaptotagmin-1 knockout by assuming that the RRP is depleted, leading to slow and Ca(2+)-dependent fusion from the NRP. We conclude...... that the elusive 'alternative Ca(2+) sensor' for slow release might be the upstream priming catalyst, and that a sequential model effectively explains Ca(2+)-dependent properties of secretion without assuming parallel pools or sensors....
A comparison of the electrochemical recovery of palladium using a parallel flat plate flow-by reactor and a rotating cylinder electrode reactor

International Nuclear Information System (INIS)

Terrazas-Rodriguez, J.E.; Gutierrez-Granados, S.; Alatorre-Ordaz, M.A.; Ponce de Leon, C.; Walsh, F.C.

2011-01-01

The production of catalytic converters generates large amounts of waste water containing Pd²⁺, Rh³⁺ and Nd³⁺ ions. The electrochemical treatment of these solutions offers an economic and effective alternative to recover the precious metals in comparison with other traditional metal recovery technologies. The separation of palladium from this mixture of metal ions by catalytic deposition was carried out using a rotating cylinder electrode reactor (RCER) and a parallel plate reactor (FM01-LC) with the same cathode area (64 cm²) and electrolyte volume (300 cm³). The study was carried out at mean linear flow velocities of 1.27 -1 (120 e /v -1 (7390 2+ ions in the parallel plate electrode reactor was 35% while the recovery of 97% of Pd²⁺ in the RCER was 62%. The volumetric energy consumption during the electrolysis was 0.56 kW h m⁻³ and 2.1 kW h m⁻³ for the RCER and the FM01-LC reactors, respectively. Using a three-dimensional stainless steel electrode in the FM01-LC laboratory reactor, 99% of palladium ions were recovered after 30 min of electrolysis while in the RCER, 120 min were necessary.
Equivalence between quantum simultaneous games and quantum sequential games

OpenAIRE

Kobayashi, Naoki

2007-01-01

A framework for discussing relationships between different types of games is proposed. Within the framework, quantum simultaneous games, finite quantum simultaneous games, quantum sequential games, and finite quantum sequential games are defined. In addition, a notion of equivalence between two games is defined. Finally, the following three theorems are shown: (1) For any quantum simultaneous game G, there exists a quantum sequential game equivalent to G. (2) For any finite quantum simultaneo...
Accounting for Heterogeneous Returns in Sequential Schooling Decisions

NARCIS (Netherlands)

Zamarro, G.

2006-01-01

This paper presents a method for estimating returns to schooling that takes into account that returns may be heterogeneous among agents and that educational decisions are made sequentially. A sequential decision model is interesting because it explicitly considers that the level of education of each
Simultaneous Versus Sequential Ptosis and Strabismus Surgery in Children.

Science.gov (United States)

Revere, Karen E; Binenbaum, Gil; Li, Jonathan; Mills, Monte D; Katowitz, William R; Katowitz, James A

The authors sought to compare the clinical outcomes of simultaneous versus sequential ptosis and strabismus surgery in children. Retrospective, single-center cohort study of children requiring both ptosis and strabismus surgery on the same eye. Simultaneous surgeries were performed during a single anesthetic event; sequential surgeries were performed at least 7 weeks apart. Outcomes were ptosis surgery success (margin reflex distance 1 ≥ 2 mm, good eyelid contour, and good eyelid crease); strabismus surgery success (ocular alignment within 10 prism diopters of orthophoria and/or improved head position); surgical complications; and reoperations. Fifty-six children were studied, 38 had simultaneous surgery and 18 sequential. Strabismus surgery was performed first in 38/38 simultaneous and 6/18 sequential cases. Mean age at first surgery was 64 months, with mean follow up 27 months. A total of 75% of children had congenital ptosis; 64% had comitant strabismus. A majority of ptosis surgeries were frontalis sling (59%) or Fasanella-Servat (30%) procedures. There were no significant differences between simultaneous and sequential groups with regards to surgical success rates, complications, or reoperations (all p > 0.28). In the first comparative study of simultaneous versus sequential ptosis and strabismus surgery, no advantage for sequential surgery was seen. Despite a theoretical risk of postoperative eyelid malposition or complications when surgeries were performed in a combined manner, the rate of such outcomes was not increased with simultaneous surgeries. Performing ptosis and strabismus surgery together appears to be clinically effective and safe, and reduces anesthesia exposure during childhood.
Sequential antimicrobial therapy: comparison of the views of microbiologists and pharmacists.

Science.gov (United States)

Smyth, E T; Tillotson, G S

1998-07-01

Sequential antimicrobial therapy (SAT) is arousing keen interest in microbiologists and pharmacists. In an attempt to obtain information from these groups regarding the use of SAT in hospitals, an anonymized postal survey was carried out. A SAT questionnaire was circulated to consultant medical microbiologists, clinical microbiologists, and heads of pharmacy departments within the British Isles. Four hundred and forty-seven microbiologists and pharmacists returned completed questionnaires, giving a response rate of 29%. Just over half of medical microbiologists (MM) and pharmacists (PH) indicated that SAT was used in their institution in respiratory medicine, geriatrics, surgery and, significantly, to a lesser degree in paediatrics. The most common infections treated were pneumonia, bronchitis and wound infection. However, there were significant differences between MM and PH, with MM favouring greater use of SAT in peritonitis (P=0.03), septicaemia (PUTI) (P<0.01), and PH favouring use in bronchitis (P<0.01). The ability to take oral fluids or a recognition of no potential absorption problems were key criteria in the decision process leading to the institution of SAT by MM and PH. Significantly more MM favoured employing criteria such as temperature <38 degrees C (P<0.01), no requirement for high tissue concentrations (P=0.02) and evidence of response to i.v. antimicrobial therapy (P<0.01) than PH. The most frequently "switched" antimicrobials were metronidazole, ciprofloxacin and co-amoxiclav. There were more than five times as many MM reporting the use of clindamycin than PH (P<0.01), whereas nearly twice as many PH cited use of cefuroxime (P<0.01). Of those hospitals not employing SAT, most MM and PH concurred that the commonest reason to institute SAT was financial, followed by convenience to patients and staff. However, more PH than MM indicated that protocols (P<0.01) and a reduction in i.v. complications (P<0.01) were important to them. In promoting SAT, MM
SPRINT: A new parallel framework for R

Directory of Open Access Journals (Sweden)

Scharinger Florian

2008-12-01

Background: Microarray analysis allows the simultaneous measurement of thousands to millions of genes or sequences across tens to thousands of different samples. The analysis of the resulting data tests the limits of existing bioinformatics computing infrastructure. A solution to this issue is to use High Performance Computing (HPC) systems, which contain many processors and more memory than desktop computer systems. Many biostatisticians use R to process the data gleaned from microarray analysis and there is even a dedicated group of packages, Bioconductor, for this purpose. However, to exploit HPC systems, R must be able to utilise the multiple processors available on these systems. There are existing modules that enable R to use multiple processors, but these are either difficult to use for the HPC novice or cannot be used to solve certain classes of problems. A method of exploiting HPC systems, using R, but without recourse to mastering parallel programming paradigms is therefore necessary to analyse genomic data to its fullest. Results: We have designed and built a prototype framework that allows the addition of parallelised functions to R to enable the easy exploitation of HPC systems. The Simple Parallel R INTerface (SPRINT) is a wrapper around such parallelised functions. Their use requires very little modification to existing sequential R scripts and no expertise in parallel computing. As an example we created a function that carries out the computation of a pairwise calculated correlation matrix. This performs well with SPRINT. When executed using SPRINT on an HPC resource of eight processors, this computation takes less than one third of the time R needs to complete it on one processor. Conclusion: SPRINT allows the biostatistician to concentrate on the research problems rather than the computation, while still allowing exploitation of HPC systems. It is easy to use and with further development will become more useful as more
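The pairwise correlation matrix mentioned in this record is a natural candidate for row-wise parallelisation, because each row can be computed independently. The following Python sketch illustrates that idea only; it is not SPRINT's interface (SPRINT is an R framework), and the names `pearson`, `correlation_row`, and `parallel_correlation` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def correlation_row(task):
    """Compute one row of the matrix: row i against every row."""
    i, rows = task
    return [pearson(rows[i], other) for other in rows]


def parallel_correlation(rows, workers=4):
    """Distribute the rows of the correlation matrix over a worker pool."""
    tasks = [(i, rows) for i in range(len(rows))]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves task order, so row i of the result is row i
        # of the correlation matrix.
        return list(pool.map(correlation_row, tasks))


matrix = parallel_correlation([[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [3.0, 2.0, 1.0]])
```

A thread pool keeps the sketch portable, but pure-Python arithmetic in CPython is unlikely to speed up under threads; a process pool or an MPI-style launcher would be needed to see gains of the kind the record reports.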
Forced Sequence Sequential Decoding

DEFF Research Database (Denmark)

Jensen, Ole Riis

In this thesis we describe a new concatenated decoding scheme based on iterations between an inner sequentially decoded convolutional code of rate R=1/4 and memory M=23, and block interleaved outer Reed-Solomon codes with non-uniform profile. With this scheme decoding with good performance is possible as low as Eb/No=0.6 dB, which is about 1.7 dB below the signal-to-noise ratio that marks the cut-off rate for the convolutional code. This is possible since the iteration process provides the sequential decoders with side information that allows a smaller average load and minimizes the probability of computational overflow. Analytical results for the probability that the first Reed-Solomon word is decoded after C computations are presented. This is supported by simulation results that are also extended to other parameters.
Practical parallel computing

CERN Document Server

Morse, H Stephen

1994-01-01

Practical Parallel Computing provides information pertinent to the fundamental aspects of high-performance parallel processing. This book discusses the development of parallel applications on a variety of equipment. Organized into three parts encompassing 12 chapters, this book begins with an overview of the technology trends that converge to favor massively parallel hardware over traditional mainframes and vector machines. This text then gives a tutorial introduction to parallel hardware architectures. Other chapters provide worked-out examples of programs using several parallel languages. Thi
A new scheduling algorithm for parallel sparse LU factorization with static pivoting

Energy Technology Data Exchange (ETDEWEB)

Grigori, Laura; Li, Xiaoye S.

2002-08-20

In this paper we present a static scheduling algorithm for parallel sparse LU factorization with static pivoting. The algorithm is divided into mapping and scheduling phases, using the symmetric pruned graphs of L' and U to represent dependencies. The scheduling algorithm is designed for driving the parallel execution of the factorization on a distributed-memory architecture. Experimental results and comparisons with SuperLU_DIST are reported after applying this algorithm on real world application matrices on an IBM SP RS/6000 distributed memory machine.
Comparison of PET/CT with Sequential PET/MRI Using an MR-Compatible Mobile PET System.

Science.gov (United States)

Nakamoto, Ryusuke; Nakamoto, Yuji; Ishimori, Takayoshi; Fushimi, Yasutaka; Kido, Aki; Togashi, Kaori

2018-05-01

The current study tested a newly developed flexible PET (fxPET) scanner prototype. This fxPET system involves dual arc-shaped detectors based on silicon photomultipliers that are designed to fit existing MRI devices, allowing us to obtain fused PET and MR images by sequential PET and MR scanning. This prospective study sought to evaluate the image quality, lesion detection rate, and quantitative values of fxPET in comparison with conventional whole-body (WB) PET and to assess the accuracy of registration. Methods: Seventeen patients with suspected or known malignant tumors were analyzed. Approximately 1 h after intravenous injection of 18F-FDG, WB PET/CT was performed, followed by fxPET and MRI. For reconstruction of fxPET images, MRI-based attenuation correction was applied. The quality of fxPET images was visually assessed, and the number of detected lesions was compared between the 2 imaging methods. SUVmax and maximum average SUV within a 1 cm³ spheric volume (SUVpeak) of lesions were also compared. In addition, the magnitude of misregistration between fxPET and MR images was evaluated. Results: The image quality of fxPET was acceptable for diagnosis of malignant tumors. There was no significant difference in detectability of malignant lesions between fxPET and WB PET (P > 0.05). However, the fxPET system did not exhibit superior performance to the WB PET system. There were strong positive correlations between the 2 imaging modalities in SUVmax (ρ = 0.88) and SUVpeak (ρ = 0.81). SUVmax and SUVpeak measured with fxPET were approximately 1.1-fold greater than measured with WB PET. The average misregistration between fxPET and MR images was 5.5 ± 3.4 mm. Conclusion: Our preliminary data indicate that running an fxPET scanner near an existing MRI system provides visually and quantitatively acceptable fused PET/MR images for diagnosis of malignant lesions. © 2018 by the Society of Nuclear Medicine and Molecular Imaging.
Computational fluid dynamics on a massively parallel computer

Science.gov (United States)

Jespersen, Dennis C.; Levit, Creon

1989-01-01

A finite difference code was implemented for the compressible Navier-Stokes equations on the Connection Machine, a massively parallel computer. The code is based on the ARC2D/ARC3D program and uses the implicit factored algorithm of Beam and Warming. The code uses odd-even elimination to solve linear systems. Timings and computation rates are given for the code, and a comparison is made with a Cray XMP.
Parallel rendering

Science.gov (United States)

Crockett, Thomas W.

1995-01-01

This article provides a broad introduction to the subject of parallel rendering, encompassing both hardware and software systems. The focus is on the underlying concepts and the issues which arise in the design of parallel rendering algorithms and systems. We examine the different types of parallelism and how they can be applied in rendering applications. Concepts from parallel computing, such as data decomposition, task granularity, scalability, and load balancing, are considered in relation to the rendering problem. We also explore concepts from computer graphics, such as coherence and projection, which have a significant impact on the structure of parallel rendering algorithms. Our survey covers a number of practical considerations as well, including the choice of architectural platform, communication and memory requirements, and the problem of image assembly and display. We illustrate the discussion with numerous examples from the parallel rendering literature, representing most of the principal rendering methods currently used in computer graphics.
Reading Remediation Based on Sequential and Simultaneous Processing.

Science.gov (United States)

Gunnison, Judy; And Others

1982-01-01

The theory postulating a dichotomy between sequential and simultaneous processing is reviewed and its implications for remediating reading problems are reviewed. Research is cited on sequential-simultaneous processing for early and advanced reading. A list of remedial strategies based on the processing dichotomy addresses decoding and lexical…
Parallel computations

CERN Document Server

1982-01-01

Parallel Computations focuses on parallel computation, with emphasis on algorithms used in a variety of numerical and physical applications and for many different types of parallel computers. Topics covered range from vectorization of fast Fourier transforms (FFTs) and of the incomplete Cholesky conjugate gradient (ICCG) algorithm on the Cray-1 to calculation of table lookups and piecewise functions. Single tridiagonal linear systems and vectorized computation of reactive flow are also discussed. Comprised of 13 chapters, this volume begins by classifying parallel computers and describing techn

STOCHSIMGPU: parallel stochastic simulation for the Systems Biology Toolbox 2 for MATLAB.

Science.gov (United States)

Klingbeil, Guido; Erban, Radek; Giles, Mike; Maini, Philip K

2011-04-15

The importance of stochasticity in biological systems is becoming increasingly recognized and the computational cost of biologically realistic stochastic simulations urgently requires development of efficient software. We present a new software tool STOCHSIMGPU that exploits graphics processing units (GPUs) for parallel stochastic simulations of biological/chemical reaction systems and show that significant gains in efficiency can be made. It is integrated into MATLAB and works with the Systems Biology Toolbox 2 (SBTOOLBOX2) for MATLAB. The GPU-based parallel implementation of the Gillespie stochastic simulation algorithm (SSA), the logarithmic direct method (LDM) and the next reaction method (NRM) is approximately 85 times faster than the sequential implementation of the NRM on a central processing unit (CPU). Using our software does not require any changes to the user's models, since it acts as a direct replacement of the stochastic simulation software of the SBTOOLBOX2. The software is open source under the GPL v3 and available at http://www.maths.ox.ac.uk/cmb/STOCHSIMGPU. The web site also contains supplementary information. Contact: klingbeil@maths.ox.ac.uk. Supplementary data are available at Bioinformatics online.
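For readers unfamiliar with the Gillespie SSA named in this record, the following minimal Python sketch simulates a single trajectory with the classic direct method on a CPU. It is not STOCHSIMGPU code (which is MATLAB/GPU based); the function name `gillespie_direct`, its parameters, and the decay example are assumptions made for illustration. Many such independent trajectories are what a parallel implementation would run side by side.

```python
import random
from math import log


def gillespie_direct(x, stoich, propensities, t_end, rng):
    """Simulate one trajectory with Gillespie's direct method.

    x            -- list of initial molecule counts
    stoich       -- stoich[j] is the state change caused by reaction j
    propensities -- function mapping a state to a list of reaction rates
    t_end        -- simulated time at which to stop
    rng          -- a random.Random instance
    """
    t, x = 0.0, list(x)
    while True:
        a = propensities(x)
        a0 = sum(a)
        if a0 <= 0.0:
            break  # no reaction can fire any more
        # Waiting time is exponential with rate a0; 1 - random() lies in
        # (0, 1], which avoids log(0).
        t += -log(1.0 - rng.random()) / a0
        if t > t_end:
            break
        # Choose reaction j with probability a[j] / a0.
        threshold = rng.random() * a0
        acc = 0.0
        for j, aj in enumerate(a):
            acc += aj
            if threshold < acc:
                break
        x = [xi + d for xi, d in zip(x, stoich[j])]
    return x


# Hypothetical example: conversion A -> B with rate 0.1 per molecule of A.
final_state = gillespie_direct(
    [50, 0], [[-1, 1]], lambda s: [0.1 * s[0]], 5.0, random.Random(2)
)
```

Because trajectories do not interact, distributing them over many cores or GPU threads needs no communication apart from collecting the results.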
Sequential sampling: a novel method in farm animal welfare assessment.

Science.gov (United States)

Heath, C A E; Main, D C J; Mullan, S; Haskell, M J; Browne, W J

2016-02-01

Lameness in dairy cows is an important welfare issue. As part of a welfare assessment, herd level lameness prevalence can be estimated from scoring a sample of animals, where higher levels of accuracy are associated with larger sample sizes. As the financial cost is related to the number of cows sampled, smaller samples are preferred. Sequential sampling schemes have been used for informing decision making in clinical trials. Sequential sampling involves taking samples in stages, where sampling can stop early depending on the estimated lameness prevalence. When welfare assessment is used for a pass/fail decision, a similar approach could be applied to reduce the overall sample size. The sampling schemes proposed here apply the principles of sequential sampling within a diagnostic testing framework. This study develops three sequential sampling schemes of increasing complexity to classify 80 fully assessed UK dairy farms, each with known lameness prevalence. Using the Welfare Quality herd-size-based sampling scheme, the first 'basic' scheme involves two sampling events. At the first sampling event half the Welfare Quality sample size is drawn, and then depending on the outcome, sampling either stops or is continued and the same number of animals is sampled again. In the second 'cautious' scheme, an adaptation is made to ensure that correctly classifying a farm as 'bad' is done with greater certainty. The third scheme is the only scheme to go beyond lameness as a binary measure and investigates the potential for increasing accuracy by incorporating the number of severely lame cows into the decision. The three schemes are evaluated with respect to accuracy and average sample size by running 100 000 simulations for each scheme, and a comparison is made with the fixed size Welfare Quality herd-size-based sampling scheme. All three schemes performed almost as well as the fixed size scheme but with much smaller average sample sizes. For the third scheme, an overall
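The 'basic' two-stage scheme described in this record can be sketched in Python as follows. The record does not give the stopping rule, so the `threshold` and `margin` parameters, the function name `two_stage_classify`, and the pass/fail rule are all assumptions made for illustration, not the authors' values.

```python
import random


def two_stage_classify(herd, full_sample_size, threshold, margin, rng):
    """Classify a herd as 'pass' or 'fail' on lameness prevalence.

    herd             -- list of booleans, True meaning the cow is lame;
                        must contain at least full_sample_size animals
    full_sample_size -- size of the fixed (single-stage) sample
    threshold        -- assumed prevalence cut-off for failing
    margin           -- assumed band around the cut-off in which the
                        first stage is considered inconclusive
    Returns (decision, number_of_cows_scored).
    """
    half = full_sample_size // 2
    sample = rng.sample(herd, 2 * half)  # drawn without replacement
    first_stage = sample[:half]
    p1 = sum(first_stage) / half
    # Stop early when the first-stage estimate is clearly on one side.
    if p1 <= threshold - margin:
        return "pass", half
    if p1 >= threshold + margin:
        return "fail", half
    # Otherwise score the same number of animals again and decide on
    # the pooled estimate.
    p2 = sum(sample) / (2 * half)
    return ("fail" if p2 > threshold else "pass"), 2 * half


decision, scored = two_stage_classify(
    [False] * 100, full_sample_size=40, threshold=0.2, margin=0.1,
    rng=random.Random(0),
)
```

Averaging `scored` over many simulated herds gives the average sample size that the study compares with the fixed-size scheme.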
Left auditory cortex is involved in pairwise comparisons of the direction of frequency modulated tones

Directory of Open Access Journals (Sweden)

Nicole Angenstein

2013-07-01

Evaluating series of complex sounds like those in speech and music requires sequential comparisons to extract task-relevant relations between subsequent sounds. With the present functional magnetic resonance imaging (fMRI) study, we investigated whether sequential comparison of a specific acoustic feature within pairs of tones leads to a change in lateralized processing in the auditory cortex of humans. For this we used the active categorization of the direction (up versus down) of slow frequency modulated (FM) tones. Several studies suggest that this task is mainly processed in the right auditory cortex. These studies, however, tested only the categorization of the FM direction of each individual tone. In the present study we ask the question whether the right lateralized processing changes when, in addition, the FM direction is compared within pairs of successive tones. For this we use an experimental approach involving contralateral noise presentation in order to explore the contributions made by the left and right auditory cortex in the completion of the auditory task. This method has already been applied to confirm the right-lateralized processing of the FM direction of individual tones. In the present study, the subjects were required to perform, in addition, a sequential comparison of the FM-direction in pairs of tones. The results suggest a division of labor between the two hemispheres such that the FM direction of each individual tone is mainly processed in the right auditory cortex whereas the sequential comparison of this feature between tones in a pair is probably performed in the left auditory cortex.
Algorithm comparison and benchmarking using a parallel spectra transform shallow water model

Energy Technology Data Exchange (ETDEWEB)

Worley, P.H. [Oak Ridge National Lab., TN (United States); Foster, I.T.; Toonen, B. [Argonne National Lab., IL (United States)

1995-04-01

In recent years, a number of computer vendors have produced supercomputers based on a massively parallel processing (MPP) architecture. These computers have been shown to be competitive in performance with conventional vector supercomputers for some applications. As spectral weather and climate models are heavy users of vector supercomputers, it is interesting to determine how these models perform on MPPs, and which MPPs are best suited to the execution of spectral models. The benchmarking of MPPs is complicated by the fact that different algorithms may be more efficient on different architectures. Hence, a comprehensive benchmarking effort must answer two related questions: which algorithm is most efficient on each computer and how do the most efficient algorithms compare on different computers. In general, these are difficult questions to answer because of the high cost associated with implementing and evaluating a range of different parallel algorithms on each MPP platform.
Parallel sorting algorithms

CERN Document Server

Akl, Selim G

1985-01-01

Parallel Sorting Algorithms explains how to use parallel algorithms to sort a sequence of items on a variety of parallel computers. The book reviews the sorting problem, the parallel models of computation, parallel algorithms, and the lower bounds on the parallel sorting problems. The text also presents twenty different algorithms for architectures such as linear arrays, mesh-connected computers, and cube-connected computers. Another example where an algorithm can be applied is the shared-memory SIMD (single instruction stream multiple data stream) computers in which the whole sequence to be sorted can fit in the
Sequential and Parallel Attack Tree Modelling

NARCIS (Netherlands)

Arnold, Florian; Guck, Dennis; Kumar, Rajesh; Stoelinga, Mariëlle Ida Antoinette; Koornneef, Floor; van Gulijk, Coen

The intricacy of socio-technical systems requires a careful planning and utilisation of security resources to ensure uninterrupted, secure and reliable services. Even though many studies have been conducted to understand and model the behaviour of a potential attacker, the detection of crucial
Fast robot kinematics modeling by using a parallel simulator (PSIM)

International Nuclear Information System (INIS)

El-Gazzar, H.M.; Ayad, N.M.A.

2002-01-01

High-speed computers are strongly needed not only for solving scientific and engineering problems, but also for numerous industrial applications. Such applications include computer-aided design, oil exploration, weather prediction, space applications and safety of nuclear reactors. The rapid development in VLSI technology makes it possible to implement time-consuming algorithms in real-time situations. Parallel processing approaches can now be used to reduce the processing time for models of high mathematical complexity, such as the kinematics modeling of a robot manipulator. This system is used to construct and evaluate the performance and cost effectiveness of several proposed methods to solve the Jacobian algorithm. Parallelism is introduced to the algorithms by using different task allocations and dividing the whole job into subtasks. Detailed analysis is performed and results are obtained for the case of six DOF (degree of freedom) robot arms (Stanford Arm). Execution-time comparisons between von Neumann (uniprocessor) and parallel processor architectures using the parallel simulator package (PSIM) are presented. The results favour the parallel techniques, with improvements of at least fifty percent. Further studies are needed to determine the optimum number of processors.
Fast robot kinematics modeling by using a parallel simulator (PSIM)

Energy Technology Data Exchange (ETDEWEB)

El-Gazzar, H M; Ayad, N M.A. [Atomic Energy Authority, Reactor Dept., Computer and Control Lab., P.O. Box no 13759 (Egypt)

2002-09-15

High-speed computers are strongly needed not only for solving scientific and engineering problems, but also for numerous industrial applications. Such applications include computer-aided design, oil exploration, weather prediction, space applications and safety of nuclear reactors. The rapid development in VLSI technology makes it possible to implement time-consuming algorithms in real-time situations. Parallel processing approaches can now be used to reduce the processing time for models of high mathematical complexity, such as the kinematics modeling of a robot manipulator. This system is used to construct and evaluate the performance and cost effectiveness of several proposed methods to solve the Jacobian algorithm. Parallelism is introduced to the algorithms by using different task allocations and dividing the whole job into subtasks. Detailed analysis is performed and results are obtained for the case of six DOF (degree of freedom) robot arms (Stanford Arm). Execution-time comparisons between von Neumann (uniprocessor) and parallel processor architectures using the parallel simulator package (PSIM) are presented. The results favour the parallel techniques, with improvements of at least fifty percent. Further studies are needed to determine the optimum number of processors.
Toward a model framework of generalized parallel componential processing of multi-symbol numbers.

Science.gov (United States)

Huber, Stefan; Cornelsen, Sonja; Moeller, Korbinian; Nuerk, Hans-Christoph

2015-05-01

In this article, we propose and evaluate a new model framework of parallel componential multi-symbol number processing, generalizing the idea of parallel componential processing of multi-digit numbers to the case of negative numbers by considering the polarity signs similar to single digits. In a first step, we evaluated this account by defining and investigating a sign-decade compatibility effect for the comparison of positive and negative numbers, which extends the unit-decade compatibility effect in 2-digit number processing. Then, we evaluated whether the model is capable of accounting for previous findings in negative number processing. In a magnitude comparison task, in which participants had to single out the larger of 2 integers, we observed a reliable sign-decade compatibility effect with prolonged reaction times for incompatible (e.g., -97 vs. +53; in which the number with the larger decade digit has the smaller, i.e., negative polarity sign) as compared with sign-decade compatible number pairs (e.g., -53 vs. +97). Moreover, an analysis of participants' eye fixation behavior corroborated our model of parallel componential processing of multi-symbol numbers. These results are discussed in light of concurrent theoretical notions about negative number processing. On the basis of the present results, we propose a generalized integrated model framework of parallel componential multi-symbol processing. (c) 2015 APA, all rights reserved.
Comparison between state graphs and fault trees for sequential and repairable systems

International Nuclear Information System (INIS)

Soussan, D.; Saignes, P.

1996-01-01

In the French PSA (Probabilistic Safety Assessment) 1300 for the 1300 MWe PWR plants carried out by EDF, sequential and repairable systems are modeled with state graphs. This method is particularly convenient for modeling dynamic systems with long-term missions but makes the models hard to trace and understand. With the objective of providing elements for rewriting PSA 1300 with only boolean models, EDF has asked CEA to participate in a methodological study. The aim is to carry out a feasibility study of the transposition of state graph models into fault trees on the Component Cooling System and Essential Service Water System (CCS/ESWS) and to draw up a methodological guide for transposition. The study carried out on CCS/ESWS involves two main axes: quantification of cold source loss (as an accident sequence initiating event, called H1); quantification of the CCS/ESWS missions in accident sequences. The subject of this article is to show that this transformation is applicable with minimum distortions of the results and to determine the hypotheses, the conditions and the limits of application of this conversion. (authors). 2 refs
The Impact of Embedded Story Structures versus Sequential Story Structures on Critical Thinking of Iranian Intermediate EFL Learners

Directory of Open Access Journals (Sweden)

Sara Samadi

2016-09-01

Confirming the constructive effects of reading comprehension on critical thinking, this paper attempted to investigate the impact of story structures on critical thinking of Iranian EFL learners. In doing so, the researcher utilized a quasi-experimental design with 60 intermediate students who were divided into two experimental groups, an embedded story structures group and a sequential story structures group. After taking PET, a critical thinking questionnaire was employed as a pre-test. The two groups received 16 sessions of treatment. All participants received a similar amount of instruction, but one group was given embedded short stories and the other group sequential short stories. To compare the two groups, they received a parallel critical thinking questionnaire as a post-test. The two null hypotheses in this study were rejected due to different performance of the two groups. Statistical results did not support the superiority of either structure. Therefore, the researcher was not able to suggest which structure caused a better or higher impact on critical thinking. However, the findings reveal that teaching story structures in an EFL context can develop critical thinking of intermediate EFL learners. The study has some implications for test-designers, teachers, and students.
High performance shallow water kernels for parallel overland flow simulations based on FullSWOF2D

KAUST Repository

Wittmann, Roland

2017-01-25

We describe code optimization and parallelization procedures applied to the sequential overland flow solver FullSWOF2D. Major difficulties when simulating overland flows comprise dealing with high resolution datasets of large scale areas which cannot be computed on a single node either due to limited amount of memory or due to too many (time step) iterations resulting from the CFL condition. We address these issues in terms of two major contributions. First, we demonstrate a generic step-by-step transformation of the second order finite volume scheme in FullSWOF2D towards MPI parallelization. Second, the computational kernels are optimized by the use of templates and a portable vectorization approach. We discuss the load imbalance of the flux computation due to dry and wet cells and propose a solution using an efficient cell counting approach. Finally, scalability results are shown for different test scenarios along with a flood simulation benchmark using the Shaheen II supercomputer.
C-quence: a tool for analyzing qualitative sequential data.

Science.gov (United States)

Duncan, Starkey; Collier, Nicholson T

2002-02-01

C-quence is a software application that matches sequential patterns of qualitative data specified by the user and calculates the rate of occurrence of these patterns in a data set. Although it was designed to facilitate analyses of face-to-face interaction, it is applicable to any data set involving categorical data and sequential information. C-quence queries are constructed using a graphical user interface. The program does not limit the complexity of the sequential patterns specified by the user.
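The kind of query this record describes, matching a sequential pattern of categorical codes and reporting its rate of occurrence, can be illustrated with a short Python sketch. C-quence's own query language and its definition of "rate" are not given here, so the function `pattern_rate`, the restriction to contiguous matches, the per-window rate, and the event codes are all assumptions.

```python
def pattern_rate(events, pattern):
    """Count contiguous occurrences of `pattern` in `events`.

    Returns (matches, rate), where rate is the number of matches divided
    by the number of positions at which the pattern could start.
    """
    events, pattern = list(events), list(pattern)
    n, m = len(events), len(pattern)
    if m == 0 or n < m:
        return 0, 0.0
    windows = n - m + 1
    matches = sum(1 for i in range(windows) if events[i:i + m] == pattern)
    return matches, matches / windows


# Hypothetical coded interaction: each item is one observed action.
coded = ["gaze", "speak", "smile", "gaze", "speak", "nod"]
matches, rate = pattern_rate(coded, ["gaze", "speak"])
```

Patterns with gaps or alternatives would require a richer matcher, such as a regular expression over the event codes.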
Simultaneous vs sequential bilateral cataract surgery for infants with congenital cataracts: Visual outcomes, adverse events, and economic costs.

Science.gov (United States)

Dave, Hreem; Phoenix, Vidya; Becker, Edmund R; Lambert, Scott R

2010-08-01

To compare the incidence of adverse events and visual outcomes and to compare the economic costs of sequential vs simultaneous bilateral cataract surgery for infants with congenital cataracts. Retrospective review of simultaneous vs sequential bilateral cataract surgery for infants with congenital cataracts who underwent cataract surgery when 6 months or younger at our institution. Records were available for 10 children who underwent sequential surgery at a mean age of 49 days for the first eye and 17 children who underwent simultaneous surgery at a mean age of 68 days (P = .25). We found a similar incidence of adverse events between the 2 treatment groups. Intraoperative or postoperative complications occurred in 14 eyes. The most common postoperative complication was glaucoma. No eyes developed endophthalmitis. The mean (SD) absolute interocular difference in logMAR visual acuities between the 2 treatment groups was 0.47 (0.76) for the sequential group and 0.44 (0.40) for the simultaneous group (P = .92). Payments for the hospital, drugs, supplies, and professional services were on average 21.9% lower per patient in the simultaneous group. Simultaneous bilateral cataract surgery for infants with congenital cataracts is associated with a 21.9% reduction in medical payments and no discernible difference in the incidence of adverse events or visual outcomes. However, our small sample size limits our ability to make meaningful comparisons of the relative risks and visual benefits of the 2 procedures.
Top-down attention affects sequential regularity representation in the human visual system.

Science.gov (United States)

Kimura, Motohiro; Widmann, Andreas; Schröger, Erich

2010-08-01

Recent neuroscience studies using visual mismatch negativity (visual MMN), an event-related brain potential (ERP) index of memory-mismatch processes in the visual sensory system, have shown that although sequential regularities embedded in successive visual stimuli can be automatically represented in the visual sensory system, an existence of sequential regularity itself does not guarantee that the sequential regularity will be automatically represented. In the present study, we investigated the effects of top-down attention on sequential regularity representation in the visual sensory system. Our results showed that a sequential regularity (SSSSD) embedded in a modified oddball sequence where infrequent deviant (D) and frequent standard stimuli (S) differing in luminance were regularly presented (SSSSDSSSSDSSSSD...) was represented in the visual sensory system only when participants attended the sequential regularity in luminance, but not when participants ignored the stimuli or simply attended the dimension of luminance per se. This suggests that top-down attention affects sequential regularity representation in the visual sensory system and that top-down attention is a prerequisite for particular sequential regularities to be represented. Copyright 2010 Elsevier B.V. All rights reserved.
SPRINT: A Tool to Generate Concurrent Transaction-Level Models from Sequential Code

Directory of Open Access Journals (Sweden)

Richard Stahl

2007-01-01

A high-level concurrent model such as a SystemC transaction-level model can provide early feedback during the exploration of implementation alternatives for state-of-the-art signal processing applications like video codecs on a multiprocessor platform. However, the creation of such a model starting from sequential code is a time-consuming and error-prone task. It is typically done only once, if at all, for a given design. This lack of exploration of the design space often leads to a suboptimal implementation. To support our systematic C-based design flow, we have developed a tool to generate a concurrent SystemC transaction-level model for user-selected task boundaries. Using this tool, different parallelization alternatives have been evaluated during the design of an MPEG-4 simple profile encoder and an embedded zero-tree coder. Generation plus evaluation of an alternative was possible in less than six minutes. This is fast enough to allow extensive exploration of the design space.
A suppression hierarchy among competing motor programs drives sequential grooming in Drosophila.

Science.gov (United States)

Seeds, Andrew M; Ravbar, Primoz; Chung, Phuong; Hampel, Stefanie; Midgley, Frank M; Mensh, Brett D; Simpson, Julie H

2014-08-19

Motor sequences are formed through the serial execution of different movements, but how nervous systems implement this process remains largely unknown. We determined the organizational principles governing how dirty fruit flies groom their bodies with sequential movements. Using genetically targeted activation of neural subsets, we drove distinct motor programs that clean individual body parts. This enabled competition experiments revealing that the motor programs are organized into a suppression hierarchy; motor programs that occur first suppress those that occur later. Cleaning one body part reduces the sensory drive to its motor program, which relieves suppression of the next movement, allowing the grooming sequence to progress down the hierarchy. A model featuring independently evoked cleaning movements activated in parallel, but selected serially through hierarchical suppression, was successful in reproducing the grooming sequence. This provides the first example of an innate motor sequence implemented by the prevailing model for generating human action sequences. Copyright © 2014, Seeds et al.
Capacity Analysis for Parallel Runway through Agent-Based Simulation

Directory of Open Access Journals (Sweden)

Yang Peng

2013-01-01

Parallel runways are the mainstream structure of China's hub airports. Because the runway is often the bottleneck of an airport, evaluating its capacity is of great importance to airport management. This study outlines a model, multiagent architecture, implementation approach, and software prototype of a simulation system for evaluating runway capacity. Agent Unified Modeling Language (AUML) is applied to illustrate the inbound and departing procedures of planes and to design the agent-based model. The model is evaluated experimentally, and its quality is studied in comparison with models created with SIMMOD and Arena. The results seem to be highly efficient, so the method can be applied to parallel runway capacity evaluation, and the model offers favorable flexibility and extensibility.
Mining compressing sequential problems

NARCIS (Netherlands)

Hoang, T.L.; Mörchen, F.; Fradkin, D.; Calders, T.G.K.

2012-01-01

Compression based pattern mining has been successfully applied to many data mining tasks. We propose an approach based on the minimum description length principle to extract sequential patterns that compress a database of sequences well. We show that mining compressing patterns is NP-Hard and
Fast sequential Monte Carlo methods for counting and optimization

CERN Document Server

Rubinstein, Reuven Y; Vaisman, Radislav

2013-01-01

A comprehensive account of the theory and application of Monte Carlo methods. Based on years of research in efficient Monte Carlo methods for estimation of rare-event probabilities, counting problems, and combinatorial optimization, Fast Sequential Monte Carlo Methods for Counting and Optimization is a complete illustration of fast sequential Monte Carlo techniques. The book provides an accessible overview of current work in the field of Monte Carlo methods, specifically sequential Monte Carlo techniques, for solving abstract counting and optimization problems. Written by authorities in the

Computing sequential equilibria for two-player games

DEFF Research Database (Denmark)

Miltersen, Peter Bro

2006-01-01

Koller, Megiddo and von Stengel showed how to efficiently compute minimax strategies for two-player extensive-form zero-sum games with imperfect information but perfect recall using linear programming and avoiding conversion to normal form. Their algorithm has been used by AI researchers for constructing prescriptive strategies for concrete, often fairly large games. Koller and Pfeffer pointed out that the strategies obtained by the algorithm are not necessarily sequentially rational and that this deficiency is often problematic for the practical applications. We show how to remove this deficiency by modifying the linear programs constructed by Koller, Megiddo and von Stengel so that pairs of strategies forming a sequential equilibrium are computed. In particular, we show that a sequential equilibrium for a two-player zero-sum game with imperfect information but perfect recall can be found in polynomial...
Computing Sequential Equilibria for Two-Player Games

DEFF Research Database (Denmark)

Miltersen, Peter Bro; Sørensen, Troels Bjerre

2006-01-01

Koller, Megiddo and von Stengel showed how to efficiently compute minimax strategies for two-player extensive-form zero-sum games with imperfect information but perfect recall using linear programming and avoiding conversion to normal form. Koller and Pfeffer pointed out that the strategies obtained by the algorithm are not necessarily sequentially rational and that this deficiency is often problematic for the practical applications. We show how to remove this deficiency by modifying the linear programs constructed by Koller, Megiddo and von Stengel so that pairs of strategies forming a sequential equilibrium are computed. In particular, we show that a sequential equilibrium for a two-player zero-sum game with imperfect information but perfect recall can be found in polynomial time. In addition, the equilibrium we find is normal-form perfect. Our technique generalizes to general-sum games...
Fast parallel algorithms that compute transitive closure of a fuzzy relation

Science.gov (United States)

Kreinovich, Vladik YA.

1993-01-01

The notion of a transitive closure of a fuzzy relation is very useful for clustering in pattern recognition, for fuzzy databases, etc. The original algorithm proposed by L. Zadeh (1971) requires the computation time $O(n^4)$, where $n$ is the number of elements in the relation. In 1974, J. C. Dunn proposed an $O(n^2)$ algorithm. Since we must compute $n(n-1)/2$ different values $s(a, b)$ ($a \neq b$) that represent the fuzzy relation, and we need at least one computational step to compute each of these values, we cannot compute all of them in less than $O(n^2)$ steps. So, Dunn's algorithm is in this sense optimal. For small $n$, it is ok. However, for big $n$ (e.g., for big databases), it is still a lot, so it would be desirable to decrease the computation time (this problem was formulated by J. Bezdek). Since this decrease cannot be done on a sequential computer, the only way to do it is to use a computer with several processors working in parallel. We show that on a parallel computer, transitive closure can be computed in time $O((\log_2 n)^2)$.
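To make the quantity in this abstract concrete, the following minimal sketch computes a max-min transitive closure by repeated squaring. It is not Zadeh's, Dunn's, or the paper's parallel algorithm: the max-min composition, the function names, and the example matrix are all assumptions chosen for illustration. Every entry of one squaring can be computed independently, and about log2(n) squarings are needed, which is one plausible route to a polylogarithmic parallel time.

```python
def maxmin_compose(r, s):
    """Max-min composition: result[i][j] = max over k of min(r[i][k], s[k][j])."""
    n = len(r)
    return [[max(min(r[i][k], s[k][j]) for k in range(n)) for j in range(n)]
            for i in range(n)]


def transitive_closure(r):
    """Max-min transitive closure of a square fuzzy relation by repeated squaring."""
    n = len(r)
    closure = [row[:] for row in r]
    rounds = 0
    # After t rounds, closure covers all chains of length up to 2**t;
    # chains longer than n never improve a max-min value.
    while (1 << rounds) < n:
        squared = maxmin_compose(closure, closure)
        closure = [[max(closure[i][j], squared[i][j]) for j in range(n)]
                   for i in range(n)]
        rounds += 1
    return closure


# Hypothetical 3-element similarity relation (values are illustrative only).
relation = [
    [1.0, 0.8, 0.0],
    [0.8, 1.0, 0.4],
    [0.0, 0.4, 1.0],
]
closed = transitive_closure(relation)
```

In the example, elements 0 and 2 are unrelated directly but become related at level 0.4 through element 1, because the weakest link of the chain 0-1-2 is 0.4.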
Parallel MR imaging.

Science.gov (United States)

Deshmane, Anagha; Gulani, Vikas; Griswold, Mark A; Seiberlich, Nicole

2012-07-01

Parallel imaging is a robust method for accelerating the acquisition of magnetic resonance imaging (MRI) data, and has made possible many new applications of MR imaging. Parallel imaging works by acquiring a reduced amount of k-space data with an array of receiver coils. These undersampled data can be acquired more quickly, but the undersampling leads to aliased images. One of several parallel imaging algorithms can then be used to reconstruct artifact-free images from either the aliased images (SENSE-type reconstruction) or from the undersampled data (GRAPPA-type reconstruction). The advantages of parallel imaging in a clinical setting include faster image acquisition, which can be used, for instance, to shorten breath-hold times resulting in fewer motion-corrupted examinations. In this article the basic concepts behind parallel imaging are introduced. The relationship between undersampling and aliasing is discussed and two commonly used parallel imaging methods, SENSE and GRAPPA, are explained in detail. Examples of artifacts arising from parallel imaging are shown and ways to detect and mitigate these artifacts are described. Finally, several current applications of parallel imaging are presented and recent advancements and promising research in parallel imaging are briefly reviewed. Copyright © 2012 Wiley Periodicals, Inc.
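The relationship between undersampling and aliasing mentioned in this abstract can be illustrated in one dimension. The sketch below is an assumption-laden toy, not SENSE or GRAPPA: it uses a plain discrete Fourier transform as a stand-in for k-space, ignores receiver coil sensitivities entirely, and all names are hypothetical. Keeping every second k-space sample halves the field of view, so each reconstructed pixel is the sum of two object pixels half a field of view apart.

```python
import cmath


def dft(x):
    """Forward discrete Fourier transform (a stand-in for k-space acquisition)."""
    n = len(x)
    return [sum(x[m] * cmath.exp(-2j * cmath.pi * k * m / n) for m in range(n))
            for k in range(n)]


def idft(spectrum):
    """Inverse discrete Fourier transform."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * m / n) for k in range(n)) / n
            for m in range(n)]


def undersample_and_reconstruct(obj, factor=2):
    """Keep every `factor`-th k-space sample and inverse-transform the reduced set."""
    kspace = dft(obj)
    reduced = kspace[::factor]
    return [abs(v) for v in idft(reduced)]


# Hypothetical 8-pixel object; pixels 2 and 6 are half a field of view apart.
obj = [0, 1, 2, 0, 0, 0, 3, 0]
aliased = undersample_and_reconstruct(obj, factor=2)
```

Here the fourth-from-last object pixel (value 3) folds onto pixel 2 (value 2), giving a combined value of 5 in the aliased result. Parallel imaging methods use the differing coil sensitivities to separate such overlapping pixels, which this sketch does not attempt.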
A massively parallel corpus: the Bible in 100 languages.

Science.gov (United States)

Christodouloupoulos, Christos; Steedman, Mark

We describe the creation of a massively parallel corpus based on 100 translations of the Bible. We discuss some of the difficulties in acquiring and processing the raw material as well as the potential of the Bible as a corpus for natural language processing. Finally we present a statistical analysis of the corpora collected and a detailed comparison between the English translation and other English corpora.
High-Performance Psychometrics: The Parallel-E Parallel-M Algorithm for Generalized Latent Variable Models. Research Report. ETS RR-16-34

Science.gov (United States)

von Davier, Matthias

2016-01-01

This report presents results on a parallel implementation of the expectation-maximization (EM) algorithm for multidimensional latent variable models. The developments presented here are based on code that parallelizes both the E step and the M step of the parallel-E parallel-M algorithm. Examples presented in this report include item response…
Sensitivity Analysis in Sequential Decision Models.

Science.gov (United States)

Chen, Qiushi; Ayer, Turgay; Chhatwal, Jagpreet

2017-02-01

Sequential decision problems are frequently encountered in medical decision making, which are commonly solved using Markov decision processes (MDPs). Modeling guidelines recommend conducting sensitivity analyses in decision-analytic models to assess the robustness of the model results against the uncertainty in model parameters. However, standard methods of conducting sensitivity analyses cannot be directly applied to sequential decision problems because this would require evaluating all possible decision sequences, typically in the order of trillions, which is not practically feasible. As a result, most MDP-based modeling studies do not examine confidence in their recommended policies. In this study, we provide an approach to estimate uncertainty and confidence in the results of sequential decision models. First, we provide a probabilistic univariate method to identify the most sensitive parameters in MDPs. Second, we present a probabilistic multivariate approach to estimate the overall confidence in the recommended optimal policy considering joint uncertainty in the model parameters. We provide a graphical representation, which we call a policy acceptability curve, to summarize the confidence in the optimal policy by incorporating stakeholders' willingness to accept the base case policy. For a cost-effectiveness analysis, we provide an approach to construct a cost-effectiveness acceptability frontier, which shows the most cost-effective policy as well as the confidence in that for a given willingness to pay threshold. We demonstrate our approach using a simple MDP case study. We developed a method to conduct sensitivity analysis in sequential decision models, which could increase the credibility of these models among stakeholders.
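As a rough illustration of estimating confidence in a recommended policy, the sketch below samples an uncertain parameter, re-solves a tiny MDP each time, and reports the fraction of samples under which the base-case policy stays optimal. Everything here is an assumption for illustration: the two-state treatment model, its rewards, the discount factor, and the function names are hypothetical, and the willingness-to-accept threshold that the abstract builds into its policy acceptability curve is not modeled.

```python
import random

GAMMA = 0.95  # assumed discount factor


def build_model(p_cure):
    """Toy MDP. States: 0 = sick, 1 = healthy. Actions: 0 = wait, 1 = treat."""
    transitions = [
        [[1.0, 0.0], [0.0, 1.0]],              # wait: state unchanged
        [[1.0 - p_cure, p_cure], [0.0, 1.0]],  # treat: sick -> healthy w.p. p_cure
    ]
    rewards = [
        [0.0, 1.0],    # wait: reward 1 per period while healthy
        [-2.0, 1.0],   # treat: cost 2 while sick
    ]
    return transitions, rewards


def q_value(transitions, rewards, v, a, s):
    return rewards[a][s] + GAMMA * sum(transitions[a][s][t] * v[t] for t in range(len(v)))


def solve_mdp(transitions, rewards, iters=300):
    """Value iteration followed by greedy policy extraction."""
    n_actions, n_states = len(rewards), len(rewards[0])
    v = [0.0] * n_states
    for _ in range(iters):
        v = [max(q_value(transitions, rewards, v, a, s) for a in range(n_actions))
             for s in range(n_states)]
    policy = []
    for s in range(n_states):
        q = [q_value(transitions, rewards, v, a, s) for a in range(n_actions)]
        policy.append(q.index(max(q)))  # ties go to the lower action index
    return policy


def policy_acceptability(base_p, sampler, n_samples=200, seed=0):
    """Fraction of sampled parameter values for which the base-case policy is optimal."""
    rng = random.Random(seed)
    base_policy = solve_mdp(*build_model(base_p))
    hits = sum(solve_mdp(*build_model(sampler(rng))) == base_policy
               for _ in range(n_samples))
    return hits / n_samples


base_policy = solve_mdp(*build_model(0.3))
```

A call such as `policy_acceptability(0.3, lambda rng: rng.uniform(0.0, 0.4))` would then estimate confidence in the base-case policy when the cure probability is only known to lie in an assumed range.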
The sequential structure of brain activation predicts skill.

Science.gov (United States)

Anderson, John R; Bothell, Daniel; Fincham, Jon M; Moon, Jungaa

2016-01-29

In an fMRI study, participants were trained to play a complex video game. They were scanned early and then again after substantial practice. While better players showed greater activation in one region (right dorsal striatum) their relative skill was better diagnosed by considering the sequential structure of whole brain activation. Using a cognitive model that played this game, we extracted a characterization of the mental states that are involved in playing a game and the statistical structure of the transitions among these states. There was a strong correspondence between this measure of sequential structure and the skill of different players. Using multi-voxel pattern analysis, it was possible to recognize, with relatively high accuracy, the cognitive states participants were in during particular scans. We used the sequential structure of these activation-recognized states to predict the skill of individual players. These findings indicate that important features about information-processing strategies can be identified from a model-based analysis of the sequential structure of brain activation. Copyright © 2015 Elsevier Ltd. All rights reserved.
A one-sided sequential test

Energy Technology Data Exchange (ETDEWEB)

Racz, A.; Lux, I. [Hungarian Academy of Sciences, Budapest (Hungary). Atomic Energy Research Inst.

1996-04-16

The applicability of the classical sequential probability ratio testing (SPRT) for early failure detection problems is limited by the fact that there is an extra time delay between the occurrence of the failure and its first recognition. Chien and Adams developed a method to minimize this time for the case when the problem can be formulated as testing the mean value of a Gaussian signal. In our paper we propose a procedure that can be applied for both mean and variance testing and that minimizes the time delay. The method is based on a special parametrization of the classical SPRT. The one-sided sequential tests (OSST) can reproduce the results of the Chien-Adams test when applied for mean values. (author).
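For orientation, the classical SPRT that this abstract takes as its starting point can be sketched as follows for the mean of a Gaussian signal with known variance. This is Wald's two-sided test with his usual threshold approximations, not the one-sided OSST or the Chien-Adams method; the function name, parameter names, and default error rates are assumptions.

```python
import math


def sprt_gaussian_mean(samples, mu0, mu1, sigma, alpha=0.01, beta=0.01):
    """Wald's SPRT for H0: mean = mu0 against H1: mean = mu1, known sigma.

    Returns (decision, number_of_samples_used).
    """
    upper = math.log((1 - beta) / alpha)   # accept H1 at or above this
    lower = math.log(beta / (1 - alpha))   # accept H0 at or below this
    llr = 0.0
    for n, x in enumerate(samples, start=1):
        # Log-likelihood ratio increment for one Gaussian observation.
        llr += (mu1 - mu0) / sigma ** 2 * (x - (mu0 + mu1) / 2)
        if llr >= upper:
            return "H1", n
        if llr <= lower:
            return "H0", n
    return "undecided", len(samples)


shifted = sprt_gaussian_mean([1.0] * 20, mu0=0.0, mu1=1.0, sigma=1.0)
nominal = sprt_gaussian_mean([0.0] * 20, mu0=0.0, mu1=1.0, sigma=1.0)
```

With these idealized noise-free inputs each observation moves the log-likelihood ratio by 0.5, so a decision is reached after 10 samples in either direction. The delay between a failure and its detection that the abstract discusses arises because the accumulated ratio may sit far from the alarm threshold when the failure occurs.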
Functional Improvement after Photothrombotic Stroke in Rats Is Associated with Different Patterns of Dendritic Plasticity after G-CSF Treatment and G-CSF Treatment Combined with Concomitant or Sequential Constraint-Induced Movement Therapy.

Directory of Open Access Journals (Sweden)

Katrin Frauenknecht

We have previously shown that granulocyte-colony stimulating factor (G-CSF) treatment alone, or in combination with constraint-induced movement therapy (CIMT) either sequentially or concomitantly, results in significantly improved sensorimotor recovery after photothrombotic stroke in rats in comparison to untreated control animals. CIMT alone did not result in any significant differences compared to the control group (Diederich et al., Stroke, 2012;43:185-192). Using a subset of rat brains from this former experiment, the present study was designed to evaluate whether dendritic plasticity would parallel improved functional outcomes. Five treatment groups were analyzed (n = 6 each): (i) ischemic control (saline); (ii) CIMT (CIMT between post-stroke days 2 and 11); (iii) G-CSF (10 μg/kg G-CSF daily between post-stroke days 2 and 11); (iv) combined concurrent group (CIMT plus G-CSF); and (v) combined sequential group (CIMT between post-stroke days 2 and 11; 10 μg/kg G-CSF daily between post-stroke days 12 and 21, respectively). After impregnation of rat brains with a modified Golgi-Cox protocol, layer V pyramidal neurons in the peri-infarct cortex as well as the corresponding contralateral cortex were analyzed. Surprisingly, animals with a similar degree of behavioral recovery exhibited quite different patterns of dendritic plasticity in both peri-lesional and contralesional areas. The cause of these patterns is not easy to explain, but it puts into perspective the simple assumption that increased dendritic complexity after stroke necessarily results in improved functional outcome.
High performance parallel backprojection on FPGA

Energy Technology Data Exchange (ETDEWEB)

Pfanner, Florian; Knaup, Michael; Kachelriess, Marc [Erlangen-Nuernberg Univ., Erlangen (Germany). Inst. of Medical Physics (IMP)

2011-07-01

Reconstruction of tomographic images, i.e., images from a Computed Tomography scanner, is a very time-consuming task. Most of the computing power is needed for the backprojection step. A closer inspection shows that the backprojection algorithm is easy to parallelize. FPGAs are able to execute many operations at the same time, so a highly parallel algorithm is a requirement for powerful acceleration. To maximize the data flow rate, we realized the backprojection in a pipelined structure with a data throughput of one clock cycle. Due to the hardware limitations of the FPGA, it is not possible to reconstruct the image as a whole, so it is necessary to split up the image and reconstruct the parts separately. Despite that, a reconstruction of 512 projections into a $512^2$ image is calculated within 13 ms on a Virtex 5 FPGA. To save hardware resources, we use fixed-point arithmetic with an accuracy of 23 bits for the calculation. A comparison of the resulting image with an image calculated with floating-point arithmetic on a CPU shows no differences between the two. (orig.)
Mining Emerging Sequential Patterns for Activity Recognition in Body Sensor Networks

DEFF Research Database (Denmark)

Gu, Tao; Wang, Liang; Chen, Hanhua

2010-01-01

Body Sensor Networks offer many applications in healthcare, well-being and entertainment. One of the emerging applications is recognizing activities of daily living. In this paper, we introduce a novel knowledge pattern named Emerging Sequential Pattern (ESP), a sequential pattern that discovers significant class differences, to recognize both simple (i.e., sequential) and complex (i.e., interleaved and concurrent) activities. Based on ESPs, we build our complex activity models directly upon the sequential model to recognize both activity types. We conduct comprehensive empirical studies to evaluate...
Objective and Subjective Measures of Simultaneous vs Sequential Bilateral Cochlear Implants in Adults: A Randomized Clinical Trial.

Science.gov (United States)

Kraaijenga, Véronique J C; Ramakers, Geerte G J; Smulders, Yvette E; van Zon, Alice; Stegeman, Inge; Smit, Adriana L; Stokroos, Robert J; Hendrice, Nadia; Free, Rolien H; Maat, Bert; Frijns, Johan H M; Briaire, Jeroen J; Mylanus, E A M; Huinck, Wendy J; Van Zanten, Gijsbert A; Grolman, Wilko

2017-09-01

To date, no randomized clinical trial on the comparison between simultaneous and sequential bilateral cochlear implants (BiCIs) has been performed. To investigate the hearing capabilities and the self-reported benefits of simultaneous BiCIs compared with those of sequential BiCIs. A multicenter randomized clinical trial was conducted between January 12, 2010, and September 2, 2012, at 5 tertiary referral centers among 40 participants eligible for BiCIs. Main inclusion criteria were postlingual severe to profound hearing loss, age 18 to 70 years, and a maximum duration of 10 years without hearing aid use in both ears. Data analysis was conducted from May 24 to June 12, 2016. The simultaneous BiCI group received 2 cochlear implants during 1 surgical procedure. The sequential BiCI group received 2 cochlear implants with an interval of 2 years between implants. First, the results 1 year after receiving simultaneous BiCIs were compared with the results 1 year after receiving sequential BiCIs. Second, the results of 3 years of follow-up for both groups were compared separately. The primary outcome measure was speech intelligibility in noise from straight ahead. Secondary outcome measures were speech intelligibility in noise from spatially separated sources, speech intelligibility in silence, localization capabilities, and self-reported benefits assessed with various hearing and quality of life questionnaires. Nineteen participants were randomized to receive simultaneous BiCIs (11 women and 8 men; median age, 52 years [interquartile range, 36-63 years]), and another 19 participants were randomized to undergo sequential BiCIs (8 women and 11 men; median age, 54 years [interquartile range, 43-64 years]). Three patients did not receive a second cochlear implant and were unavailable for follow-up. Comparable results were found 1 year after simultaneous or sequential BiCIs for speech intelligibility in noise from straight ahead (difference, 0.9 dB [95% CI, -3.1 to 4.4 dB]) and
Impact of controlling the sum of error probability in the sequential probability ratio test

Directory of Open Access Journals (Sweden)

Bijoy Kumarr Pradhan

2013-05-01

A generalized modified method is proposed to control the sum of error probabilities in the sequential probability ratio test. The method minimizes the weighted average of the two average sample numbers under a simple null hypothesis and a simple alternative hypothesis, with the restriction that the sum of error probabilities is a pre-assigned constant, to find the optimal sample size. Finally, a comparison is made with the optimal sample size found from the fixed sample size procedure. The results are applied to the cases in which the random variate follows a normal law as well as a Bernoullian law.
Discrimination between sequential and simultaneous virtual channels with electrical hearing

OpenAIRE

Landsberger, David; Galvin, John J.

2011-01-01

In cochlear implants (CIs), simultaneous or sequential stimulation of adjacent electrodes can produce intermediate pitch percepts between those of the component electrodes. However, it is unclear whether simultaneous and sequential virtual channels (VCs) can be discriminated. In this study, CI users were asked to discriminate simultaneous and sequential VCs; discrimination was measured for monopolar (MP) and bipolar + 1 stimulation (BP + 1), i.e., relatively broad and focused stimulation mode...
Parallel algorithms for nuclear reactor analysis via domain decomposition method

International Nuclear Information System (INIS)

Kim, Yong Hee

1995-02-01

the number of inner level iterations are limited. The analysis shows that mixed pseudo-boundary conditions have superior convergence properties if the pseudo-boundary parameters are optimally chosen. DN(or ND) conditions can be efficiently accelerated via under-relaxation concept, where DN(or ND) means that Dirichlet and Neumann conditions are independently imposed on neighbouring pseudo-boundaries. However, exact realization of such schemes is not practical since complete inner iteration is required. It is shown that limiting the number of inner iterations is equivalent to the under-relaxation concept, however, limiting the number of inner level iterations in MM scheme requires more outer iterations. Consequently, DN (or ND) algorithm with under-relaxation and MM algorithm may provide similar parallel performance in practical implementation, if the numerical solver used is not extraordinarily efficient. The parallel Schwarz algorithm is applied to two types of reactor benchmark problems: fixed source problems and eigenvalue problems. Several results of parallel computation for the problems are reported and compared with those of sequential computations. The results show that very high speedup can be achieved in fixed source problems in spite of the small problem size and that relatively high speedup, although lower than that of fixed source problems, can be obtained in eigenvalue problems
Hybrid Computerized Adaptive Testing: From Group Sequential Design to Fully Sequential Design

Science.gov (United States)

Wang, Shiyu; Lin, Haiyan; Chang, Hua-Hua; Douglas, Jeff

2016-01-01

Computerized adaptive testing (CAT) and multistage testing (MST) have become two of the most popular modes in large-scale computer-based sequential testing. Though most designs of CAT and MST exhibit strength and weakness in recent large-scale implementations, there is no simple answer to the question of which design is better because different…
Sequential dependencies in magnitude scaling of loudness

DEFF Research Database (Denmark)

Joshi, Suyash Narendra; Jesteadt, Walt

2013-01-01

Ten normally hearing listeners used a programmable sone-potentiometer knob to adjust the level of a 1000-Hz sinusoid to match the loudness of numbers presented to them in a magnitude production task. Three different power-law exponents (0.15, 0.30, and 0.60) and a log-law with equal steps in dB were used to program the sone-potentiometer. The knob settings systematically influenced the form of the loudness function. Time series analysis was used to assess the sequential dependencies in the data, which increased with increasing exponent and were greatest for the log-law. It would be possible, therefore, to choose knob properties that minimized these dependencies. When the sequential dependencies were removed from the data, the slope of the loudness functions did not change, but the variability decreased. Sequential dependencies were only present when the level of the tone on the previous trial...
Visual short-term memory for sequential arrays.

Science.gov (United States)

Kumar, Arjun; Jiang, Yuhong

2005-04-01

The capacity of visual short-term memory (VSTM) for a single visual display has been investigated in past research, but VSTM for multiple sequential arrays has been explored only recently. In this study, we investigate the capacity of VSTM across two sequential arrays separated by a variable stimulus onset asynchrony (SOA). VSTM for spatial locations (Experiment 1), colors (Experiments 2-4), orientations (Experiments 3 and 4), and conjunction of color and orientation (Experiment 4) were tested, with the SOA across the two sequential arrays varying from 100 to 1,500 msec. We find that VSTM for the trailing array is much better than VSTM for the leading array, but when averaged across the two arrays VSTM has a constant capacity independent of the SOA. We suggest that multiple displays compete for retention in VSTM and that separating information into two temporally discrete groups does not enhance the overall capacity of VSTM.
The target-to-foils shift in simultaneous and sequential lineups.

Science.gov (United States)

Clark, Steven E; Davey, Sherrie L

2005-04-01

A theoretical cornerstone in eyewitness identification research is the proposition that witnesses, in making decisions from standard simultaneous lineups, make relative judgments. The present research considers two sources of support for this proposal. An experiment by G. L. Wells (1993) showed that if the target is removed from a lineup, witnesses shift their responses to pick foils, rather than rejecting the lineups, a result we will term a target-to-foils shift. Additional empirical support is provided by results from sequential lineups which typically show higher accuracy than simultaneous lineups, presumably because of a decrease in the use of relative judgments in making identification decisions. The combination of these two lines of research suggests that the target-to-foils shift should be reduced in sequential lineups relative to simultaneous lineups. Results of two experiments showed an overall advantage for sequential lineups, but also showed a target-to-foils shift equal in size for simultaneous and sequential lineups. Additional analyses indicated that the target-to-foils shift in sequential lineups was moderated in part by an order effect and was produced with (Experiment 2) or without (Experiment 1) a shift in decision criterion. This complex pattern of results suggests that more work is needed to understand the processes which underlie decisions in simultaneous and sequential lineups.

On-board landmark navigation and attitude reference parallel processor system

Science.gov (United States)

Gilbert, L. E.; Mahajan, D. T.

1978-01-01

An approach to autonomous navigation and attitude reference for earth observing spacecraft is described along with the landmark identification technique based on a sequential similarity detection algorithm (SSDA). Laboratory experiments undertaken to determine if better than one pixel accuracy in registration can be achieved consistent with onboard processor timing and capacity constraints are included. The SSDA is implemented using a multi-microprocessor system including synchronization logic and chip library. The data is processed in parallel stages, effectively reducing the time to match the small known image within a larger image as seen by the onboard image system. Shared memory is incorporated in the system to help communicate intermediate results among microprocessors. The functions include finding mean values and summation of absolute differences over the image search area. The hardware is a low power, compact unit suitable to onboard application with the flexibility to provide for different parameters depending upon the environment.
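The core idea of a sequential similarity detection algorithm, as summarized in this abstract, is to accumulate absolute differences between a small known image and each candidate window, abandoning a window as soon as its running error exceeds a limit. The sketch below is a sequential, single-processor illustration under stated assumptions: the threshold policy, tie handling, function names, and test image are hypothetical, and it does not reproduce the multi-microprocessor staging or the mean-value normalization described for the hardware.

```python
def window_error(image, template, row, col, limit):
    """Summed absolute difference at (row, col); None if it exceeds `limit` early."""
    err = 0
    for i, template_row in enumerate(template):
        for j, t in enumerate(template_row):
            err += abs(image[row + i][col + j] - t)
            if err > limit:
                return None  # abandon this window without finishing the sum
    return err


def ssda_match(image, template, threshold):
    """Return (row, col, error) of the best window, or None if none is within threshold."""
    t_rows, t_cols = len(template), len(template[0])
    best = None
    limit = threshold
    for r in range(len(image) - t_rows + 1):
        for c in range(len(image[0]) - t_cols + 1):
            err = window_error(image, template, r, c, limit)
            if err is not None and (best is None or err < best[2]):
                best = (r, c, err)
                limit = err  # later windows are abandoned once they are worse
    return best


# Hypothetical 4x4 search area with the 2x2 landmark embedded at row 1, column 2.
image = [
    [0, 0, 0, 0],
    [0, 0, 5, 6],
    [0, 0, 7, 8],
    [0, 0, 0, 0],
]
template = [[5, 6], [7, 8]]
match = ssda_match(image, template, threshold=10)
```

Because most mismatched windows are rejected after one or two pixels, the work per window is far smaller than a full correlation, which is the property that makes the approach attractive under onboard timing constraints.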
A SPECT reconstruction method for extending parallel to non-parallel geometries

International Nuclear Information System (INIS)

Wen Junhai; Liang Zhengrong

2010-01-01

Due to its simplicity, parallel-beam geometry is usually assumed for the development of image reconstruction algorithms. The established reconstruction methodologies are then extended to fan-beam, cone-beam and other non-parallel geometries for practical application. This situation occurs for quantitative SPECT (single photon emission computed tomography) imaging in inverting the attenuated Radon transform. Novikov reported an explicit parallel-beam formula for the inversion of the attenuated Radon transform in 2000. Thereafter, a formula for fan-beam geometry was reported by Bukhgeim and Kazantsev (2002 Preprint N. 99 Sobolev Institute of Mathematics). At the same time, we presented a formula for varying focal-length fan-beam geometry. Sometimes, the reconstruction formula is so implicit that we cannot obtain the explicit reconstruction formula in the non-parallel geometries. In this work, we propose a unified reconstruction framework for extending parallel-beam geometry to any non-parallel geometry using ray-driven techniques. Studies by computer simulations demonstrated the accuracy of the presented unified reconstruction framework for extending parallel-beam to non-parallel geometries in inverting the attenuated Radon transform.
Prospectively ECG-triggered sequential dual-source coronary CT angiography in patients with atrial fibrillation: comparison with retrospectively ECG-gated helical CT

Energy Technology Data Exchange (ETDEWEB)

Xu, Lei; Yang, Lin; Zhang, Zhaoqi [Capital Medical University, Department of Radiology, Beijing Anzhen Hospital, Beijing (China); Wang, Yining; Jin, Zhengyu [Chinese Academy of Medical Sciences, Department of Radiology, Peking Union Medical College Hospital, Beijing (China); Zhang, Longjiang; Lu, Guangming [Nanjing University, Department of Medical Imaging, Jinling Hospital, Clinical School of Medical College, Nanjing, Jiangsu (China)

2013-07-15

To investigate the feasibility of applying prospectively ECG-triggered sequential coronary CT angiography (CCTA) to patients with atrial fibrillation (AF) and evaluate the image quality and radiation dose compared with a retrospectively ECG-gated helical protocol. 100 patients with persistent AF were enrolled. Fifty patients were randomly assigned to a prospective protocol and the other patients to a retrospective protocol using a second-generation dual-source CT (DS-CT). Image quality was evaluated using a four-point grading scale (1 = excellent, 2 = good, 3 = moderate, 4 = poor) by two reviewers on a per-segment basis. The coronary artery segments were considered non-diagnostic with a quality score of 4. The radiation dose was evaluated. Diagnostic segment rate in the prospective group was 99.4 % (642/646 segments), while that in the retrospective group was 96.5 % (604/626 segments) (P < 0.001). Effective dose was 4.29 ± 1.86 and 11.95 ± 5.34 mSv for each of the two protocols (P < 0.001), which was a 64 % reduction in the radiation dose for prospective sequential imaging compared with retrospective helical imaging. In AF patients, prospectively ECG-triggered sequential CCTA is feasible using second-generation DS-CT and can decrease >60 % radiation exposure compared with retrospectively ECG-gated helical imaging while improving diagnostic image quality. (orig.)
Improving matrix-vector product performance and multi-level preconditioning for the parallel PCG package

Energy Technology Data Exchange (ETDEWEB)

McLay, R.T.; Carey, G.F.

1996-12-31

In this study we consider parallel solution of sparse linear systems arising from discretized PDEs. As part of our continuing work on our parallel PCG Solver package, we have made improvements in two areas. The first is improving the performance of the matrix-vector product. Here on regular finite-difference grids, we are able to use the cache memory more efficiently for smaller domains or where there are multiple degrees of freedom. The second problem of interest in the present work is the construction of preconditioners in the context of the parallel PCG solver we are developing. Here the problem is partitioned over a set of processor subdomains and the matrix-vector product for PCG is carried out in parallel for overlapping grid subblocks. For problems of scaled speedup, the actual rate of convergence of the unpreconditioned system deteriorates as the mesh is refined. Multigrid and subdomain strategies provide a logical approach to resolving the problem. We consider the parallel trade-offs between communication and computation and provide a complexity analysis of a representative algorithm. Some preliminary calculations using the parallel package and comparisons with other preconditioners are provided together with parallel performance results.
The language parallel Pascal and other aspects of the massively parallel processor

Science.gov (United States)

Reeves, A. P.; Bruner, J. D.

1982-01-01

A high level language for the Massively Parallel Processor (MPP) was designed. This language, called Parallel Pascal, is described in detail. A description of the language design, a description of the intermediate language, Parallel P-Code, and details for the MPP implementation are included. Formal descriptions of Parallel Pascal and Parallel P-Code are given. A compiler was developed which converts programs in Parallel Pascal into the intermediate Parallel P-Code language. The code generator to complete the compiler for the MPP is being developed independently. A Parallel Pascal to Pascal translator was also developed. The architecture design for a VLSI version of the MPP was completed with a description of fault tolerant interconnection networks. The memory arrangement aspects of the MPP are discussed and a survey of other high level languages is given.
Parallel Atomistic Simulations

Energy Technology Data Exchange (ETDEWEB)

HEFFELFINGER,GRANT S.

2000-01-18

Algorithms developed to enable the use of atomistic molecular simulation methods with parallel computers are reviewed. Methods appropriate for bonded as well as non-bonded (and charged) interactions are included. While strategies for obtaining parallel molecular simulations have been developed for the full variety of atomistic simulation methods, molecular dynamics and Monte Carlo have received the most attention. Three main types of parallel molecular dynamics simulations have been developed, the replicated data decomposition, the spatial decomposition, and the force decomposition. For Monte Carlo simulations, parallel algorithms have been developed which can be divided into two categories, those which require a modified Markov chain and those which do not. Parallel algorithms developed for other simulation methods such as Gibbs ensemble Monte Carlo, grand canonical molecular dynamics, and Monte Carlo methods for protein structure determination are also reviewed and issues such as how to measure parallel efficiency, especially in the case of parallel Monte Carlo algorithms with modified Markov chains are discussed.
Dynamics-based sequential memory: Winnerless competition of patterns

International Nuclear Information System (INIS)

Seliger, Philip; Tsimring, Lev S.; Rabinovich, Mikhail I.

2003-01-01

We introduce a biologically motivated dynamical principle of sequential memory which is based on winnerless competition (WLC) of event images. This mechanism is implemented in a two-layer neural model of sequential spatial memory. We present the learning dynamics which leads to the formation of a WLC network. After learning, the system is capable of associative retrieval of prerecorded sequences of patterns
Sequential, progressive, equal-power, reflective beam-splitter arrays

Science.gov (United States)

Manhart, Paul K.

2017-11-01

The equations to calculate equal-power reflectivity of a sequential series of beam splitters are presented. Non-sequential optical design examples are offered for uniform illumination using diode lasers. Objects built using Boolean operators and Swept Surfaces can reflect light into predefined elevation and azimuth angles. Analysis of the illumination patterns for the array is also presented.
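The abstract does not reproduce the equations themselves. As a rough illustration, the Python sketch below uses the standard idealized result for a chain of N lossless splitters in which each element must reflect the same power: the k-th element (counting from 1) needs reflectivity 1/(N - k + 1). The lossless assumption and the function names are ours, not the paper's.

```python
from fractions import Fraction


def equal_power_reflectivities(n):
    """Reflectivity of each of n lossless splitters in sequence
    (assumed ideal: no absorption, transmission = 1 - reflectivity)."""
    # With 0-based index k, the splitter sees (n - k) remaining equal shares.
    return [Fraction(1, n - k) for k in range(n)]


def reflected_powers(reflectivities, p_in=Fraction(1)):
    """Power reflected out of the beam at each splitter."""
    powers = []
    remaining = p_in
    for r in reflectivities:
        powers.append(remaining * r)   # share reflected here
        remaining *= 1 - r             # what continues to the next splitter
    return powers


reflectivities = equal_power_reflectivities(4)
powers = reflected_powers(reflectivities)
print(reflectivities)
print(powers)
```

For four splitters this gives reflectivities 1/4, 1/3, 1/2 and 1, and each element reflects one quarter of the input power. Exact fractions are used so the equality can be checked without rounding.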
Comparison of sequential vs same-day simultaneous collagen cross-linking and topography-guided PRK for treatment of keratoconus.

Science.gov (United States)

Kanellopoulos, Anastasios John

2009-09-01

The safety and efficacy of corneal collagen cross-linking (CXL) and topography-guided photorefractive keratectomy (PRK) using a different sequence and timing were evaluated in consecutive keratoconus cases. This study included a total of 325 eyes with keratoconus. Eyes were divided into two groups. The first group (n=127 eyes) underwent CXL with subsequent topography-guided PRK performed 6 months later (sequential group) and the second group (n=198 eyes) underwent CXL and PRK in a combined procedure on the same day (simultaneous group). Statistical differences were examined for pre- to postoperative changes in uncorrected (UCVA, logMAR) and best-spectacle-corrected visual acuity (BSCVA, logMAR), manifest refraction spherical equivalent (MRSE), keratometry (K), topography, central corneal thickness, endothelial cell count, corneal haze, and ectatic progression. Mean follow-up was 36±18 months (range: 24 to 68 months). At last follow-up in the sequential group, the mean UCVA improved from 0.9±0.3 logMAR to 0.49±0.25 logMAR, and mean BSCVA from 0.41±0.25 logMAR to 0.16±0.22 logMAR. Mean reduction in spherical equivalent refraction was 2.50±1.20 diopters (D), mean haze score was 1.2±0.5, and mean reduction in K was 2.75±1.30 D. In the simultaneous group, mean UCVA improved from 0.96±0.2 logMAR to 0.3±0.2 logMAR, and mean BSCVA from 0.39±0.3 logMAR to 0.11±0.16 logMAR. Mean reduction in spherical equivalent refraction was 3.20±1.40 D, mean haze score was 0.5±0.3, and mean reduction in K was 3.50±1.3 D. Endothelial cell count preoperatively and at last follow-up was unchanged (P ...). Same-day simultaneous PRK and CXL appears to be superior to sequential CXL with later PRK in the visual rehabilitation of progressing keratoconus. Copyright 2009, SLACK Incorporated.
Basal ganglia and cortical networks for sequential ordering and rhythm of complex movements

Directory of Open Access Journals (Sweden)

Jeffery G. Bednark

2015-07-01

Voluntary actions require the concurrent engagement and coordinated control of complex temporal (e.g. rhythm) and ordinal motor processes. Using high-resolution functional magnetic resonance imaging (fMRI) and multi-voxel pattern analysis (MVPA), we sought to determine the degree to which these complex motor processes are dissociable in basal ganglia and cortical networks. We employed three different finger-tapping tasks that differed in the demand on the sequential temporal rhythm or sequential ordering of submovements. Our results demonstrate that sequential rhythm and sequential order tasks were partially dissociable based on activation differences. The sequential rhythm task activated a widespread network centered around the SMA and basal-ganglia regions including the dorsomedial putamen and caudate nucleus, while the sequential order task preferentially activated a fronto-parietal network. There was also extensive overlap between sequential rhythm and sequential order tasks, with both tasks commonly activating bilateral premotor, supplementary motor, and superior/inferior parietal cortical regions, as well as regions of the caudate/putamen of the basal ganglia and the ventro-lateral thalamus. Importantly, within the cortical regions that were active for both complex movements, MVPA could accurately classify different patterns of activation for the sequential rhythm and sequential order tasks. In the basal ganglia, however, overlapping activation for the sequential rhythm and sequential order tasks, which was found in classic motor circuits of the putamen and ventro-lateral thalamus, could not be accurately differentiated by MVPA. Overall, our results highlight the convergent architecture of the motor system, where complex motor information that is spatially distributed in the cortex converges into a more compact representation in the basal ganglia.
The sequential price of anarchy for atomic congestion games

NARCIS (Netherlands)

de Jong, Jasper; Uetz, Marc Jochen; Liu, Tie-Yan; Qi, Qi; Ye, Yinyu

2014-01-01

In situations without central coordination, the price of anarchy relates the quality of any Nash equilibrium to the quality of a global optimum. Instead of assuming that all players choose their actions simultaneously, we consider games where players choose their actions sequentially. The sequential
Development of real-time visualization system for Computational Fluid Dynamics on parallel computers

International Nuclear Information System (INIS)

Muramatsu, Kazuhiro; Otani, Takayuki; Matsumoto, Hideki; Takei, Toshifumi; Doi, Shun

1998-03-01

A real-time visualization system for computational fluid dynamics (CFD) was developed for a network connecting a parallel computing server and a client terminal. Using the system, a user can visualize the results of a CFD simulation running on the parallel computer from a client terminal while the computation is in progress on the server. Using a GUI (Graphical User Interface) on the client terminal, the user is also able to change parameters of the analysis and visualization while the calculation is running. The system carries out both the CFD simulation and the generation of pixel image data on the parallel computer, and compresses the data. As a result, the amount of data sent from the parallel computer to the client is much smaller than without compression, so images appear on the client quickly. Parallelization of image data generation is based on the Owner Computation Rule. The GUI on the client is built as a Java applet. Real-time visualization is thus possible on any client PC on which a Web browser is installed. (author)
Enhancing Application Performance Using Mini-Apps: Comparison of Hybrid Parallel Programming Paradigms

Science.gov (United States)

Lawson, Gary; Sosonkina, Masha; Baurle, Robert; Hammond, Dana

2017-01-01

In many fields, real-world applications for High Performance Computing have already been developed. For these applications to stay up-to-date, new parallel strategies must be explored to yield the best performance; however, restructuring or modifying a real-world application may be daunting depending on the size of the code. In this case, a mini-app may be employed to quickly explore such options without modifying the entire code. In this work, several mini-apps have been created to enhance a real-world application performance, namely the VULCAN code for complex flow analysis developed at the NASA Langley Research Center. These mini-apps explore hybrid parallel programming paradigms with Message Passing Interface (MPI) for distributed memory access and either Shared MPI (SMPI) or OpenMP for shared memory accesses. Performance testing shows that MPI+SMPI yields the best execution performance, while requiring the largest number of code changes. A maximum speedup of 23 was measured for MPI+SMPI, but only 11 was measured for MPI+OpenMP.
Native Frames: Disentangling Sequential from Concerted Three-Body Fragmentation

Science.gov (United States)

Rajput, Jyoti; Severt, T.; Berry, Ben; Jochim, Bethany; Feizollah, Peyman; Kaderiya, Balram; Zohrabi, M.; Ablikim, U.; Ziaee, Farzaneh; Raju P., Kanaka; Rolles, D.; Rudenko, A.; Carnes, K. D.; Esry, B. D.; Ben-Itzhak, I.

2018-03-01

A key question concerning the three-body fragmentation of polyatomic molecules is the distinction of sequential and concerted mechanisms, i.e., the stepwise or simultaneous cleavage of bonds. Using laser-driven fragmentation of OCS into O⁺ + C⁺ + S⁺ and employing coincidence momentum imaging, we demonstrate a novel method that enables the clear separation of sequential and concerted breakup. The separation is accomplished by analyzing the three-body fragmentation in the native frame associated with each step and taking advantage of the rotation of the intermediate molecular fragment, CO²⁺ or CS²⁺, before its unimolecular dissociation. This native-frame method works for any projectile (electrons, ions, or photons), provides details on each step of the sequential breakup, and enables the retrieval of the relevant spectra for sequential and concerted breakup separately. Specifically, this allows the determination of the branching ratio of all these processes in OCS³⁺ breakup. Moreover, we find that the first step of sequential breakup is tightly aligned along the laser polarization and identify the likely electronic states of the intermediate dication that undergo unimolecular dissociation in the second step. Finally, the separated concerted breakup spectra show clearly that the central carbon atom is preferentially ejected perpendicular to the laser field.
A Comparative Taxonomy of Parallel Algorithms for RNA Secondary Structure Prediction

Science.gov (United States)

Al-Khatib, Ra’ed M.; Abdullah, Rosni; Rashid, Nur’Aini Abdul

2010-01-01

RNA molecules have been found to play crucial roles in numerous biological and medical procedures and processes. RNA structure determination has become a major problem in biology. Recently, computer scientists have empowered biologists with RNA secondary structures that ease the understanding of RNA functions and roles. Detecting RNA secondary structure is an NP-hard problem, especially in pseudoknotted RNA structures. The detection process is also time-consuming; as a result, an alternative approach such as using parallel architectures is a desirable option. The main goal of this paper is to conduct an intensive investigation of parallel methods used in the literature to solve the demanding issues related to RNA secondary structure prediction methods. Then, we introduce a new taxonomy for the parallel RNA folding methods. Based on this proposed taxonomy, a systematic and scientific comparison is performed among these existing methods. PMID:20458364
Campbell and moment measures for finite sequential spatial processes

NARCIS (Netherlands)

M.N.M. van Lieshout (Marie-Colette)

2006-01-01

We define moment and Campbell measures for sequential spatial processes, prove a Campbell-Mecke theorem, and relate the results to their counterparts in the theory of point processes. In particular, we show that any finite sequential spatial process model can be derived as the vector
Parallel integer sorting with medium and fine-scale parallelism

Science.gov (United States)

Dagum, Leonardo

1993-01-01

Two new parallel integer sorting algorithms, queue-sort and barrel-sort, are presented and analyzed in detail. These algorithms do not have optimal parallel complexity, yet they show very good performance in practice. Queue-sort is designed for fine-scale parallel architectures which allow the queueing of multiple messages to the same destination. Barrel-sort is designed for medium-scale parallel architectures with a high message passing overhead. The performance results from the implementation of queue-sort on a Connection Machine CM-2 and barrel-sort on a 128 processor iPSC/860 are given. The two implementations are found to be comparable in performance but not as good as a fully vectorized bucket sort on the Cray YMP.
Sequential Dependencies in Driving

Science.gov (United States)

Doshi, Anup; Tran, Cuong; Wilder, Matthew H.; Mozer, Michael C.; Trivedi, Mohan M.

2012-01-01

The effect of recent experience on current behavior has been studied extensively in simple laboratory tasks. We explore the nature of sequential effects in the more naturalistic setting of automobile driving. Driving is a safety-critical task in which delayed response times may have severe consequences. Using a realistic driving simulator, we find…
Comparison of combined use of fluconazole and clotrimazole with the sequential dose of fluconazole in the treatment of recurrent Candida vaginitis

Directory of Open Access Journals (Sweden)

Tayebeh Gharibi

2009-09-01

Background: Fluconazole is a systemic antifungal agent, and clotrimazole vaginal cream is a topical agent against Candida albicans. This study compared two regimens for the treatment of recurrent Candida vaginitis: sequential doses of fluconazole combined with vaginal clotrimazole, and sequential doses of fluconazole alone. Methods: A double-blind randomized clinical trial was carried out on 80 married women (20-45 years old) with chronic vaginal candidiasis. The patients were divided into two groups (40 in each). The first group received two doses of fluconazole (at zero time and 72 hours later) along with clotrimazole vaginal cream 1% for 7 days. The second group received only two doses of fluconazole (at zero time and 72 hours later). The patients were examined at 2 and 6 weeks after the treatment. Results: The signs and symptoms of disease (itching, erythema, excoriation, edema and fissure) in both groups were significantly decreased two weeks after the treatment (P = 0.00). The final examination of both groups also showed that the treatment was more effective in the first group than in the second group; the difference was statistically significant (P<0.05). Conclusion: The data show that adding topical clotrimazole to the treatment of patients with recurrent Candida vaginitis is more effective.
About Parallel Programming: Paradigms, Parallel Execution and Collaborative Systems

Directory of Open Access Journals (Sweden)

Loredana MOCEAN

2009-01-01

In recent years, efforts have been made to define a stable and unified framework in which the problems of logical parallel processing can be solved, at least at the level of imperative languages. The results obtained so far do not match the effort invested. This paper aims to make a small contribution to these efforts. We present an overview of parallel programming, parallel execution and collaborative systems.

Parallel computing works!

CERN Document Server

Fox, Geoffrey C; Messina, Guiseppe C

2014-01-01

A clear illustration of how parallel computers can be successfully applied to large-scale scientific computations. This book demonstrates how a variety of applications in physics, biology, mathematics and other sciences were implemented on real parallel computers to produce new scientific results. It investigates issues of fine-grained parallelism relevant for future supercomputers with particular emphasis on hypercube architecture. The authors describe how they used an experimental approach to configure different massively parallel machines, design and implement basic system software, and develop
Effects of parallel electron dynamics on plasma blob transport

Energy Technology Data Exchange (ETDEWEB)

Angus, Justin R.; Krasheninnikov, Sergei I. [University of California, San Diego, 9500 Gilman Drive, La Jolla, California 92093 (United States); Umansky, Maxim V. [Lawrence Livermore National Laboratory, 7000 East Avenue, Livermore, California 94550 (United States)

2012-08-15

The 3D effects on sheath connected plasma blobs that result from parallel electron dynamics are studied by allowing for the variation of blob density and potential along the magnetic field line and using collisional Ohm's law to model the parallel current density. The parallel current density from linear sheath theory, typically used in the 2D model, is implemented as parallel boundary conditions. This model includes electrostatic 3D effects, such as resistive drift waves and blob spinning, while retaining all of the fundamental 2D physics of sheath connected plasma blobs. If the growth time of unstable drift waves is comparable to the 2D advection time scale of the blob, then the blob's density gradient will be depleted, resulting in a much more diffusive blob with little radial motion. Furthermore, blob profiles that are initially varying along the field line drive the potential to a Boltzmann relation that spins the blob and thereby acts as an additional sink of the 2D potential. Basic dimensionless parameters are presented to estimate the relative importance of these two 3D effects. The deviation of blob dynamics from that predicted by 2D theory in the appropriate limits of these parameters is demonstrated by a direct comparison of 2D and 3D seeded blob simulations.
Study and simulation of a parallel numerical processing machine

International Nuclear Information System (INIS)

Bel Hadj, Slaheddine

1981-12-01

This study was carried out with a view to implementing the NEPTUNIX package (software for solving very large algebraic-differential equation systems) on a minicomputer. With the aim of increasing system performance, previous research showed the need to reduce the execution time of certain frequently used numerical computation tasks, and demonstrated the feasibility of handling these tasks with efficient parallel algorithms. The present work deals with the study and simulation of a parallel-architecture processor adapted to the fast execution of these algorithms. A minicomputer connected to such a parallel processor has greatly extended computing power. The architecture of a parallel numerical processor, based on VLSI microprocessors and co-processors, is then described. Its design aims at the best cost/performance ratio. The last part deals with the simulation of the processor with the 'CHAMBOR' program. Results show a 30-fold increase in speed compared with execution on a MITRA 15 minicomputer. Moreover, the importance of conflicts, mainly in access to a shared resource, is evaluated. Although this implementation was designed with a dedicated application in mind, other uses could be envisaged, particularly for the simulation of nuclear reactors: operator guiding systems, behavioural studies under accident conditions, etc. (author) [fr
Comparative analysis of the serial/parallel numerical calculation of boiling channels thermohydraulics; Analisis comparativo del calculo numerico serie/paralelo de la termohidraulica de canales con ebullicion

Energy Technology Data Exchange (ETDEWEB)

Cecenas F, M., E-mail: mcf@iie.org.mx [Instituto Nacional de Electricidad y Energias Limpias, Reforma 113, Col. Palmira, 62490 Cuernavaca, Morelos (Mexico)

2017-09-15

A parallel channel model with boiling and point neutron kinetics is used to compare its implementation in the C language through a conventional scheme and through a parallel programming scheme. In both cases the subroutines written in C are practically the same, but they differ in the way they control the execution of the tasks that calculate the different channels. Parallel Virtual Machine is used for the parallel solution, which allows messages to be passed between tasks to control convergence and to transfer the variables of interest between the tasks that run simultaneously on a platform equipped with a multi-core microprocessor. For some problems defined as a case study, such as the one presented in this paper, a computer with two cores can reduce the computation time to 54-56% of the time required by the same program in its conventional sequential version. Similarly, a processor with four cores can reduce the time to 22-33% of the execution time of the conventional serial version. These substantial reductions in computation time are encouraging for all applications that can be parallelized and whose execution time is an important factor. (Author)
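The time fractions quoted in this abstract translate into speedup and parallel efficiency using the usual definitions (speedup = serial time / parallel time, efficiency = speedup / number of cores). The Python sketch below applies those definitions to the reported ranges; the function name is illustrative and the definitions are an assumption about how one would interpret the numbers, not a statement of the author's own analysis.

```python
def speedup_and_efficiency(time_fraction, cores):
    """Speedup and efficiency when the parallel run takes
    `time_fraction` of the serial run time on `cores` cores."""
    speedup = 1.0 / time_fraction
    efficiency = speedup / cores
    return speedup, efficiency


# Time fractions reported in the abstract, as (best, worst) per core count.
reported = {2: (0.54, 0.56), 4: (0.22, 0.33)}

results = {}
for cores, fractions in reported.items():
    results[cores] = [speedup_and_efficiency(f, cores) for f in fractions]
    for f, (s, e) in zip(fractions, results[cores]):
        print(f"{cores} cores, {f:.0%} of serial time: "
              f"speedup {s:.2f}, efficiency {e:.2f}")
```

Under these definitions the two-core runs correspond to speedups of roughly 1.8 to 1.9. The four-core range spans roughly 3.0 to 4.5, so its lower time bound of 22% implies an efficiency slightly above 1.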
Comparison of electrorheological characteristics obtained for two geometries: parallel plates and concentric cylinders

Czech Academy of Sciences Publication Activity Database

Peer, Petra; Filip, Petr; Stěnička, M.; Pavlínek, V.

2014-01-01

Vol. 59, No. 3 (2014), pp. 221-235 ISSN 0001-7043 R&D Projects: GA ČR(CZ) GAP105/11/2342 Institutional support: RVO:67985874 Keywords : electrorheology * parallel plates * concentric cylinders * silicone oil * PANI powders Subject RIV: BK - Fluid Dynamics
Study on High Performance of MPI-Based Parallel FDTD from WorkStation to Super Computer Platform

Directory of Open Access Journals (Sweden)

Z. L. He

2012-01-01

The parallel FDTD method is applied to analyze electromagnetic problems of electrically large targets on a supercomputer. It is well known that the more processors are used, the less computing time is consumed. Nevertheless, with the same number of processors, computing efficiency is affected by the scheme of the MPI virtual topology. The influence of different virtual topology schemes on the parallel performance of parallel FDTD is therefore studied in detail. General rules are presented on how to obtain the highest efficiency of the parallel FDTD algorithm by optimizing the MPI virtual topology. To show the validity of the presented method, several numerical results are given. Various comparisons are made and some useful conclusions are summarized.
Comparisons of Energy Management Methods for a Parallel Plug-In Hybrid Electric Vehicle between the Convex Optimization and Dynamic Programming

Directory of Open Access Journals (Sweden)

Renxin Xiao

2018-01-01

This paper presents a comparison study of energy management methods for a parallel plug-in hybrid electric vehicle (PHEV). Based on detailed analysis of the vehicle driveline, quadratic convex functions are presented to describe the nonlinear relationship between engine fuel rate and battery charging power at different vehicle speeds and driveline power demands. The engine-on power threshold is estimated by the simulated annealing (SA) algorithm, and the battery power command is obtained by convex optimization with the target of improving fuel economy, and is compared with the dynamic programming (DP) based method and the charge-depleting/charge-sustaining (CD/CS) method. In addition, the proposed control methods are discussed at different initial battery state of charge (SOC) values to extend the application. Simulation results validate that the proposed strategy based on convex optimization can reduce fuel consumption and noticeably reduce the computational burden.
Portable, parallel, reusable Krylov space codes

Energy Technology Data Exchange (ETDEWEB)

Smith, B.; Gropp, W. [Argonne National Lab., IL (United States)

1994-12-31

Krylov space accelerators are an important component of many algorithms for the iterative solution of linear systems. Each Krylov space method has its own particular advantages and disadvantages; therefore it is desirable to have a variety of them available, all with an identical, easy-to-use interface. A common complaint application programmers have with available software libraries for the iterative solution of linear systems is that they require the programmer to use the data structures provided by the library. The library is not able to work with the data structures of the application code. Hence, application programmers find themselves constantly recoding the Krylov space algorithms. The Krylov space package (KSP) is a data-structure-neutral implementation of a variety of Krylov space methods including preconditioned conjugate gradient, GMRES, BiCG-Stab, transpose-free QMR and CGS. Unlike all other software libraries for linear systems that the authors are aware of, KSP will work with any application code's data structures, in Fortran or C. Due to its data-structure-neutral design, KSP runs unchanged on both sequential and parallel machines. KSP has been tested on workstations, the Intel i860 and Paragon, Thinking Machines CM-5 and the IBM SP1.
Framework for sequential approximate optimization

NARCIS (Netherlands)

Jacobs, J.H.; Etman, L.F.P.; Keulen, van F.; Rooda, J.E.

2004-01-01

An object-oriented framework for Sequential Approximate Optimization (SAO) is proposed. The framework aims to provide an open environment for the specification and implementation of SAO strategies. The framework is based on the Python programming language and contains a toolbox of Python
A Survey of Multi-Objective Sequential Decision-Making

OpenAIRE

Roijers, D.M.; Vamplew, P.; Whiteson, S.; Dazeley, R.

2013-01-01

Sequential decision-making problems with multiple objectives arise naturally in practice and pose unique challenges for research in decision-theoretic planning and learning, which has largely focused on single-objective settings. This article surveys algorithms designed for sequential decision-making problems with multiple objectives. Though there is a growing body of literature on this subject, little of it makes explicit under what circumstances special methods are needed to solve multi-obj...
A Sequential Multiplicative Extended Kalman Filter for Attitude Estimation Using Vector Observations

Science.gov (United States)

Qin, Fangjun; Jiang, Sai; Zha, Feng

2018-01-01

In this paper, a sequential multiplicative extended Kalman filter (SMEKF) is proposed for attitude estimation using vector observations. In the proposed SMEKF, each of the vector observations is processed sequentially to update the attitude, which can make the measurement model linearization more accurate for the next vector observation. This is the main difference to Murrell’s variation of the MEKF, which does not update the attitude estimate during the sequential procedure. Meanwhile, the covariance is updated after all the vector observations have been processed, which is used to account for the special characteristics of the reset operation necessary for the attitude update. This is the main difference to the traditional sequential EKF, which updates the state covariance at each step of the sequential procedure. The numerical simulation study demonstrates that the proposed SMEKF has more consistent and accurate performance in a wide range of initial estimate errors compared to the MEKF and its traditional sequential forms. PMID:29751538
A Sequential Multiplicative Extended Kalman Filter for Attitude Estimation Using Vector Observations

Directory of Open Access Journals (Sweden)

Fangjun Qin

2018-05-01

In this paper, a sequential multiplicative extended Kalman filter (SMEKF) is proposed for attitude estimation using vector observations. In the proposed SMEKF, each of the vector observations is processed sequentially to update the attitude, which can make the measurement model linearization more accurate for the next vector observation. This is the main difference to Murrell’s variation of the MEKF, which does not update the attitude estimate during the sequential procedure. Meanwhile, the covariance is updated after all the vector observations have been processed, which is used to account for the special characteristics of the reset operation necessary for the attitude update. This is the main difference to the traditional sequential EKF, which updates the state covariance at each step of the sequential procedure. The numerical simulation study demonstrates that the proposed SMEKF has more consistent and accurate performance in a wide range of initial estimate errors compared to the MEKF and its traditional sequential forms.
Asynchronous Operators of Sequential Logic Venjunction & Sequention

CERN Document Server

Vasyukevich, Vadim

2011-01-01

This book is dedicated to new mathematical instruments assigned for logical modeling of the memory of digital devices. The case in point is logic-dynamical operation named venjunction and venjunctive function as well as sequention and sequentional function. Venjunction and sequention operate within the framework of sequential logic. In a form of the corresponding equations, they organically fit analytical expressions of Boolean algebra. Thus, a sort of symbiosis is formed using elements of asynchronous sequential logic on the one hand and combinational logic on the other hand. So, asynchronous
Endpoint-based parallel data processing in a parallel active messaging interface of a parallel computer

Science.gov (United States)

Archer, Charles J.; Blocksome, Michael A.; Ratterman, Joseph D.; Smith, Brian E.

2014-08-12

Endpoint-based parallel data processing in a parallel active messaging interface (`PAMI`) of a parallel computer, the PAMI composed of data communications endpoints, each endpoint including a specification of data communications parameters for a thread of execution on a compute node, including specifications of a client, a context, and a task, the compute nodes coupled for data communications through the PAMI, including establishing a data communications geometry, the geometry specifying, for tasks representing processes of execution of the parallel application, a set of endpoints that are used in collective operations of the PAMI including a plurality of endpoints for one of the tasks; receiving in endpoints of the geometry an instruction for a collective operation; and executing the instruction for a collective operation through the endpoints in dependence upon the geometry, including dividing data communications operations among the plurality of endpoints for one of the tasks.
Human visual system automatically encodes sequential regularities of discrete events.

Science.gov (United States)

Kimura, Motohiro; Schröger, Erich; Czigler, István; Ohira, Hideki

2010-06-01

For our adaptive behavior in a dynamically changing environment, an essential task of the brain is to automatically encode sequential regularities inherent in the environment into a memory representation. Recent studies in neuroscience have suggested that sequential regularities embedded in discrete sensory events are automatically encoded into a memory representation at the level of the sensory system. This notion is largely supported by evidence from investigations using auditory mismatch negativity (auditory MMN), an event-related brain potential (ERP) correlate of an automatic memory-mismatch process in the auditory sensory system. However, it is still largely unclear whether or not this notion can be generalized to other sensory modalities. The purpose of the present study was to investigate the contribution of the visual sensory system to the automatic encoding of sequential regularities using visual mismatch negativity (visual MMN), an ERP correlate of an automatic memory-mismatch process in the visual sensory system. To this end, we conducted a sequential analysis of visual MMN in an oddball sequence consisting of infrequent deviant and frequent standard stimuli, and tested whether the underlying memory representation of visual MMN generation contains only a sensory memory trace of standard stimuli (trace-mismatch hypothesis) or whether it also contains sequential regularities extracted from the repetitive standard sequence (regularity-violation hypothesis). The results showed that visual MMN was elicited by first deviant (deviant stimuli following at least one standard stimulus), second deviant (deviant stimuli immediately following first deviant), and first standard (standard stimuli immediately following first deviant), but not by second standard (standard stimuli immediately following first standard). These results are consistent with the regularity-violation hypothesis, suggesting that the visual sensory system automatically encodes sequential
A Bayesian Theory of Sequential Causal Learning and Abstract Transfer.

Science.gov (United States)

Lu, Hongjing; Rojas, Randall R; Beckers, Tom; Yuille, Alan L

2016-03-01

Two key research issues in the field of causal learning are how people acquire causal knowledge when observing data that are presented sequentially, and the level of abstraction at which learning takes place. Does sequential causal learning solely involve the acquisition of specific cause-effect links, or do learners also acquire knowledge about abstract causal constraints? Recent empirical studies have revealed that experience with one set of causal cues can dramatically alter subsequent learning and performance with entirely different cues, suggesting that learning involves abstract transfer, and such transfer effects involve sequential presentation of distinct sets of causal cues. It has been demonstrated that pre-training (or even post-training) can modulate classic causal learning phenomena such as forward and backward blocking. To account for these effects, we propose a Bayesian theory of sequential causal learning. The theory assumes that humans are able to consider and use several alternative causal generative models, each instantiating a different causal integration rule. Model selection is used to decide which integration rule to use in a given learning environment in order to infer causal knowledge from sequential data. Detailed computer simulations demonstrate that humans rely on the abstract characteristics of outcome variables (e.g., binary vs. continuous) to select a causal integration rule, which in turn alters causal learning in a variety of blocking and overshadowing paradigms. When the nature of the outcome variable is ambiguous, humans select the model that yields the best fit with the recent environment, and then apply it to subsequent learning tasks. Based on sequential patterns of cue-outcome co-occurrence, the theory can account for a range of phenomena in sequential causal learning, including various blocking effects, primacy effects in some experimental conditions, and apparently abstract transfer of causal knowledge. Copyright © 2015
Impact of Diagrams on Recalling Sequential Elements in Expository Texts.

Science.gov (United States)

Guri-Rozenblit, Sarah

1988-01-01

Examines the instructional effectiveness of abstract diagrams on recall of sequential relations in social science textbooks. Concludes that diagrams assist significantly the recall of sequential relations in a text and decrease significantly the rate of order mistakes. (RS)
Sequential vs alternate chemo-radiotherapy in advanced head and neck tumors

International Nuclear Information System (INIS)

Corvo', R.; Merlano, M.; Grimaldi, A.; Rosso, R.; Vitale, V.; Scarpati, D.; Santelli, A.; Scasso, F.

1988-01-01

Between 1983 and 1986, a multicenter randomized study was conducted to compare a sequential program of induction chemotherapy (CT) followed by radiotherapy (RT), Arm A, with an alternation of cycles of CT with 3 courses of RT (20 Gy/10 fractions up to a total dose of 60 Gy), Arm B, in advanced head and neck cancer patients. The same CT (VBM: Vinblastine, Bleomycin, Methotrexate) was used on both arms; one hundred and sixteen patients (pts) entered the study, 55 in Arm A, 61 in Arm B. Forty-five pts had stage III and 71 stage IV cancers. The two arms are fully comparable. Up to October 1987, 116 pts are evaluable for survival, while 112 are evaluable for toxicity and 105 for response. In 21 patients (10 in Arm A, 11 in B) the association CT-RT was followed by surgery. Response analysis shows 14 complete responses in Arm A and 30 in Arm B (p≤0.03). The median disease-free survival and median overall survival are also statistically different, with an advantage for Arm B (33 vs 22 weeks, p≤0.0007, and 59 vs 38 weeks, p<0.03 respectively). The actual overall survival of complete responders at 50 months is 43% (B) and 21% (A). Toxicity (mainly stage III-IV mucositis) is higher in Arm B (30% vs 4%). This experience demonstrates the advantages of alternate over sequential CT-RT. A comparison of this cyclic association with RT alone is in progress.
QR-decomposition based SENSE reconstruction using parallel architecture.

Science.gov (United States)

Ullah, Irfan; Nisar, Habab; Raza, Haseeb; Qasim, Malik; Inam, Omair; Omer, Hammad

2018-04-01

Magnetic Resonance Imaging (MRI) is a powerful medical imaging technique that provides essential clinical information about the human body. One major limitation of MRI is its long scan time. Implementation of advanced MRI algorithms on a parallel architecture (to exploit inherent parallelism) has great potential to reduce the scan time. Sensitivity Encoding (SENSE) is a Parallel Magnetic Resonance Imaging (pMRI) algorithm that utilizes receiver coil sensitivities to reconstruct MR images from the acquired under-sampled k-space data. At the heart of SENSE lies inversion of a rectangular encoding matrix. This work presents a novel implementation of a GPU-based SENSE algorithm, which employs QR decomposition for the inversion of the rectangular encoding matrix. For a fair comparison, the performance of the proposed GPU-based SENSE reconstruction is evaluated against single-core and multi-core CPU implementations using OpenMP. Several experiments against various acceleration factors (AFs) are performed using multichannel (8, 12 and 30) phantom and in-vivo human head and cardiac datasets. Experimental results show that the GPU significantly reduces the computation time of SENSE reconstruction as compared to multi-core CPU (approximately 12x speedup) and single-core CPU (approximately 53x speedup) without any degradation in the quality of the reconstructed images. Copyright © 2018 Elsevier Ltd. All rights reserved.
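The idea of inverting a rectangular matrix through QR decomposition can be sketched as a least-squares solve. The following is a small CPU-only illustration, not the GPU implementation from the paper: it assumes a full-column-rank matrix given as a list of rows, uses modified Gram-Schmidt for the factorization, and the names `qr_mgs` and `solve_least_squares` are hypothetical.

```python
def qr_mgs(A):
    """Thin QR factorization by modified Gram-Schmidt.

    A is an m x n matrix (list of rows, m >= n, full column rank).
    Returns Q (m x n, orthonormal columns) and R (n x n, upper triangular).
    Entries may be real or complex.
    """
    m, n = len(A), len(A[0])
    Q = [list(row) for row in A]
    R = [[0.0] * n for _ in range(n)]
    for j in range(n):
        # Remove the components of column j along the earlier columns.
        for k in range(j):
            R[k][j] = sum(Q[i][k].conjugate() * Q[i][j] for i in range(m))
            for i in range(m):
                Q[i][j] -= R[k][j] * Q[i][k]
        # Normalize what is left of column j.
        R[j][j] = sum(abs(Q[i][j]) ** 2 for i in range(m)) ** 0.5
        for i in range(m):
            Q[i][j] /= R[j][j]
    return Q, R


def solve_least_squares(A, y):
    """Solve min ||A x - y|| via A = Q R, then back-substitute R x = Q^H y."""
    Q, R = qr_mgs(A)
    m, n = len(A), len(A[0])
    b = [sum(Q[i][j].conjugate() * y[i] for i in range(m)) for j in range(n)]
    x = [0.0] * n
    for j in reversed(range(n)):
        tail = sum(R[j][k] * x[k] for k in range(j + 1, n))
        x[j] = (b[j] - tail) / R[j][j]
    return x
```

In a SENSE-style setting, `A` would play the role of the encoding matrix and `y` the aliased pixel values across coils; that mapping is an assumption made for illustration.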
Quantum Probability Zero-One Law for Sequential Terminal Events

Science.gov (United States)

Rehder, Wulf

1980-07-01

On the basis of the Jauch-Piron quantum probability calculus a zero-one law for sequential terminal events is proven, and the significance of certain crucial axioms in the quantum probability calculus is discussed. The result shows that the Jauch-Piron set of axioms is appropriate for the non-Boolean algebra of sequential events.

Concatenated coding system with iterated sequential inner decoding

DEFF Research Database (Denmark)

Jensen, Ole Riis; Paaske, Erik

1995-01-01

We describe a concatenated coding system with iterated sequential inner decoding. The system uses convolutional codes of very long constraint length and operates on iterations between an inner Fano decoder and an outer Reed-Solomon decoder...
Parallel optimization of IDW interpolation algorithm on multicore platform

Science.gov (United States)

Guan, Xuefeng; Wu, Huayi

2009-10-01

Due to increasing power consumption, heat dissipation, and other physical issues, central processing unit (CPU) architectures have shifted rapidly to multicore in recent years. A multicore processor packages multiple processor cores on the same chip, which not only offers increased performance, but also presents significant challenges to application developers. In the GIS field, most current algorithms were implemented serially and cannot fully exploit the parallelism available on such multicore platforms. In this paper, we choose the Inverse Distance Weighted (IDW) spatial interpolation algorithm as an example to study how to optimize current serial GIS algorithms on multicore platforms in order to maximize performance speedup. With the help of OpenMP, a threading methodology is introduced to split and share the whole interpolation work among processor cores. After parallel optimization, the execution time of the interpolation algorithm is greatly reduced and good performance speedup is achieved. For example, the performance speedup on an Intel Xeon 5310 is 1.943 with 2 execution threads and 3.695 with 4 execution threads. An additional output comparison between pre-optimization and post-optimization is carried out and shows that parallel optimization does not affect the final interpolation result.
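The work-splitting idea in this abstract can be sketched as follows. This is a minimal illustration, not the authors' OpenMP code: it assumes 2D points, a power parameter of 2, and contiguous chunks of query points, and the function names are hypothetical. Note that threads in CPython will not speed up this pure-Python arithmetic; the sketch only shows how the query points are divided and that the split leaves the result unchanged.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def idw_point(q, samples, power=2.0):
    """IDW estimate at query point q from (x, y, value) samples."""
    num = den = 0.0
    for x, y, value in samples:
        d = math.hypot(q[0] - x, q[1] - y)
        if d == 0.0:
            return value  # query coincides with a sample point
        w = 1.0 / d ** power
        num += w * value
        den += w
    return num / den


def idw_serial(queries, samples, power=2.0):
    """Interpolate every query point one after another."""
    return [idw_point(q, samples, power) for q in queries]


def idw_parallel(queries, samples, workers=4, power=2.0):
    """Split the query points into contiguous chunks, one per worker."""
    size = max(1, math.ceil(len(queries) / workers))
    chunks = [queries[i:i + size] for i in range(0, len(queries), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = pool.map(lambda c: idw_serial(c, samples, power), chunks)
        return [v for part in parts for v in part]
```

Dividing the reported speedups by the thread counts gives a parallel efficiency of roughly 0.97 with 2 threads (1.943 / 2) and roughly 0.92 with 4 threads (3.695 / 4).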
Parallel factor analysis PARAFAC of process affected water

Energy Technology Data Exchange (ETDEWEB)

Ewanchuk, A.M.; Ulrich, A.C.; Sego, D. [Alberta Univ., Edmonton, AB (Canada). Dept. of Civil and Environmental Engineering; Alostaz, M. [Thurber Engineering Ltd., Calgary, AB (Canada)

2010-07-01

A parallel factor analysis (PARAFAC) of oil sands process-affected water was presented. Naphthenic acids (NA) are traditionally described as monobasic carboxylic acids. Research has indicated that oil sands NA do not fit classical definitions of NA. Oil sands organic acids have toxic and corrosive properties. When analyzed by fluorescence technology, oil sands process-affected water displays a characteristic peak at 290 nm excitation and approximately 346 nm emission. In this study, PARAFAC was used to decompose process-affected water multi-way data into components representing analytes, chemical compounds, and groups of compounds. Water samples from various oil sands operations were analyzed in order to obtain excitation-emission matrices (EEMs). The EEMs were then arranged into a large matrix in order of decreasing process-affected water content for PARAFAC. Data were divided into 5 components. A comparison with commercially prepared NA samples suggested that oil sands NA is fundamentally different. Further research is needed to determine what each of the 5 components represents. tabs., figs.
Lineup Composition, Suspect Position, and the Sequential Lineup Advantage

Science.gov (United States)

Carlson, Curt A.; Gronlund, Scott D.; Clark, Steven E.

2008-01-01

N. M. Steblay, J. Dysart, S. Fulero, and R. C. L. Lindsay (2001) argued that sequential lineups reduce the likelihood of mistaken eyewitness identification. Experiment 1 replicated the design of R. C. L. Lindsay and G. L. Wells (1985), the first study to show the sequential lineup advantage. However, the innocent suspect was chosen at a lower rate…
Trial Sequential Analysis in systematic reviews with meta-analysis

Directory of Open Access Journals (Sweden)

Jørn Wetterslev

2017-03-01

Background: Most meta-analyses in systematic reviews, including Cochrane ones, do not have sufficient statistical power to detect or refute even large intervention effects. This is why a meta-analysis ought to be regarded as an interim analysis on its way towards a required information size. The results of the meta-analyses should relate the total number of randomised participants to the estimated required meta-analytic information size accounting for statistical diversity. When the number of participants and the corresponding number of trials in a meta-analysis are insufficient, the use of the traditional 95% confidence interval or the 5% statistical significance threshold will lead to too many false positive conclusions (type I errors) and too many false negative conclusions (type II errors). Methods: We developed a methodology for interpreting meta-analysis results, using generally accepted, valid evidence on how to adjust thresholds for significance in randomised clinical trials when the required sample size has not been reached. Results: The Lan-DeMets trial sequential monitoring boundaries in Trial Sequential Analysis offer adjusted confidence intervals and restricted thresholds for statistical significance when the diversity-adjusted required information size and the corresponding number of required trials for the meta-analysis have not been reached. Trial Sequential Analysis provides a frequentistic approach to control both type I and type II errors. We define the required information size and the corresponding number of required trials in a meta-analysis and the diversity (D2) measure of heterogeneity. We explain the reasons for using Trial Sequential Analysis of meta-analysis when the actual information size fails to reach the required information size. We present examples drawn from traditional meta-analyses using unadjusted naïve 95% confidence intervals and 5% thresholds for statistical significance. Spurious conclusions in
Analysis of Parallel Burn Without Crossfeed TSTO RLV Architectures and Comparison to Parallel Burn With Crossfeed and Series Burn Architectures

Science.gov (United States)

Smith, Garrett; Phillips, Alan

2002-01-01

There are currently three dominant TSTO class architectures. These are Series Burn (SB), Parallel Burn with crossfeed (PBw/cf), and Parallel Burn without crossfeed (PBncf). The goal of this study was to determine what factors uniquely affect PBncf architectures, how each of these factors interact, and to determine from a performance perspective whether a PBncf vehicle could be competitive with a PBw/cf or SB vehicle using equivalent technology and assumptions. In all cases, performance was evaluated on a relative basis for a fixed payload and mission by comparing gross and dry vehicle masses of a closed vehicle. Propellant combinations studied were LOX: LH2 propelled orbiter and booster (HH) and LOX: Kerosene booster with LOX: LH2 orbiter (KH). The study conclusions were: 1) a PBncf orbiter should be throttled as deeply as possible after launch until the staging point. 2) a detailed structural model is essential to accurate architecture analysis and evaluation. 3) a PBncf TSTO architecture is feasible for systems that stage at Mach 7. 3a) HH architectures can achieve a mass growth relative to PBw/cf of ratio and to the position of the orbiter required to align the nozzle heights at liftoff. 5) thrust to weight ratios of 1.3 at liftoff and between 1.0 and 0.9 when staging at Mach 7 appear to be close to ideal for PBncf vehicles. 6) performance for all vehicles studied is better when staged at Mach 7 instead of Mach 5. The study showed that a Series Burn architecture has the lowest gross mass for HH cases, and has the lowest dry mass for KH cases. The potential disadvantages of SB are the required use of an air-start for the orbiter engines and potential CG control issues. A Parallel Burn with crossfeed architecture solves both these problems, but the mechanics of a large bipropellant crossfeed system pose significant technical difficulties. Parallel Burn without crossfeed vehicles start both booster and orbiter engines on the ground and thus avoid both the risk of
Heat accumulation during sequential cortical bone drilling.

Science.gov (United States)

Palmisano, Andrew C; Tai, Bruce L; Belmont, Barry; Irwin, Todd A; Shih, Albert; Holmes, James R

2016-03-01

Significant research exists regarding heat production during single-hole bone drilling. No published data exist regarding repetitive sequential drilling. This study elucidates the phenomenon of heat accumulation for sequential drilling with both Kirschner wires (K wires) and standard two-flute twist drills. It was hypothesized that cumulative heat would result in a higher temperature with each subsequent drill pass. Nine holes in a 3 × 3 array were drilled sequentially on moistened cadaveric tibia bone kept at body temperature (about 37 °C). Four thermocouples were placed at the center of four adjacent holes and 2 mm below the surface. A battery-driven hand drill guided by a servo-controlled motion system was used. Six samples were drilled with each tool (2.0 mm K wire and 2.0 and 2.5 mm standard drills). K wire drilling increased temperature from 5 °C at the first hole to 20 °C at holes 6 through 9. A similar trend was found in standard drills with less significant increments. The maximum temperatures of both tools increased from drill sizes was found to be insignificant (P > 0.05). In conclusion, heat accumulated during sequential drilling, with size difference being insignificant. K wire produced more heat than its twist-drill counterparts. This study has demonstrated the heat accumulation phenomenon and its significant effect on temperature. Maximizing the drilling field and reducing the number of drill passes may decrease bone injury. © 2015 Orthopaedic Research Society. Published by Wiley Periodicals, Inc.
Systematic approach for deriving feasible mappings of parallel algorithms to parallel computing platforms

NARCIS (Netherlands)

Arkin, Ethem; Tekinerdogan, Bedir; Imre, Kayhan M.

2017-01-01

The need for high-performance computing together with the increasing trend from single processor to parallel computer architectures has leveraged the adoption of parallel computing. To benefit from parallel computing power, usually parallel algorithms are defined that can be mapped and executed
Investigation of Mitochondrial Dysfunction by Sequential Microplate-Based Respiration Measurements from Intact and Permeabilized Neurons

Science.gov (United States)

Clerc, Pascaline; Polster, Brian M.

2012-01-01

Mitochondrial dysfunction is a component of many neurodegenerative conditions. Measurement of oxygen consumption from intact neurons enables evaluation of mitochondrial bioenergetics under conditions that are more physiologically realistic compared to isolated mitochondria. However, mechanistic analysis of mitochondrial function in cells is complicated by changing energy demands and lack of substrate control. Here we describe a technique for sequentially measuring respiration from intact and saponin-permeabilized cortical neurons on single microplates. This technique allows control of substrates to individual electron transport chain complexes following permeabilization, as well as side-by-side comparisons to intact cells. To illustrate the utility of the technique, we demonstrate that inhibition of respiration by the drug KB-R7943 in intact neurons is relieved by delivery of the complex II substrate succinate, but not by complex I substrates, via acute saponin permeabilization. In contrast, methyl succinate, a putative cell permeable complex II substrate, failed to rescue respiration in intact neurons and was a poor complex II substrate in permeabilized cells. Sequential measurements of intact and permeabilized cell respiration should be particularly useful for evaluating indirect mitochondrial toxicity due to drugs or cellular signaling events which cannot be readily studied using isolated mitochondria. PMID:22496810
Sequential Measurement of Intermodal Variability in Public Transportation PM2.5 and CO Exposure Concentrations.

Science.gov (United States)

Che, W W; Frey, H Christopher; Lau, Alexis K H

2016-08-16

A sequential measurement method is demonstrated for quantifying the variability in exposure concentration during public transportation. This method was applied in Hong Kong by measuring PM2.5 and CO concentrations along a route connecting 13 transportation-related microenvironments within 3-4 h. The study design takes into account ventilation, proximity to local sources, area-wide air quality, and meteorological conditions. Portable instruments were compacted into a backpack to facilitate measurement under crowded transportation conditions and to quantify personal exposure by sampling at nose level. The route included stops next to three roadside monitors to enable comparison of fixed site and exposure concentrations. PM2.5 exposure concentrations were correlated with the roadside monitors, despite differences in averaging time, detection method, and sampling location. Although highly correlated in temporal trend, PM2.5 concentrations varied significantly among microenvironments, with mean concentration ratios versus roadside monitor ranging from 0.5 for MTR train to 1.3 for bus terminal. Measured inter-run variability provides insight regarding the sample size needed to discriminate between microenvironments with increased statistical significance. The study results illustrate the utility of sequential measurement of microenvironments and policy-relevant insights for exposure mitigation and management.
Dihydroazulene photoswitch operating in sequential tunneling regime

DEFF Research Database (Denmark)

Broman, Søren Lindbæk; Lara-Avila, Samuel; Thisted, Christine Lindbjerg

2012-01-01

to electrodes so that the electron transport goes by sequential tunneling. To assure weak coupling, the DHA switching kernel is modified by incorporating p-MeSC6H4 end-groups. Molecules are prepared by Suzuki cross-couplings on suitable halogenated derivatives of DHA. The synthesis presents an expansion of our......, incorporating a p-MeSC6H4 anchoring group in one end, has been placed in a silver nanogap. Conductance measurements justify that transport through both DHA (high resistivity) and VHF (low resistivity) forms goes by sequential tunneling. The switching is fairly reversible and reenterable; after more than 20 ON...
Parallel algorithms

CERN Document Server

Casanova, Henri; Robert, Yves

2008-01-01

"…The authors of the present book, who have extensive credentials in both research and instruction in the area of parallelism, present a sound, principled treatment of parallel algorithms. … This book is very well written and extremely well designed from an instructional point of view. … The authors have created an instructive and fascinating text. The book will serve researchers as well as instructors who need a solid, readable text for a course on parallelism in computing. Indeed, for anyone who wants an understandable text from which to acquire a current, rigorous, and broad vi
A Trust-region-based Sequential Quadratic Programming Algorithm

DEFF Research Database (Denmark)

Henriksen, Lars Christian; Poulsen, Niels Kjølstad

This technical note documents the trust-region-based sequential quadratic programming algorithm used in other works by the authors. The algorithm seeks to minimize a convex nonlinear cost function subject to linear inequality constraints and nonlinear equality constraints.
A Parallel Algorithm for the Counting of Ellipses Present in Conglomerates Using GPU

Directory of Open Access Journals (Sweden)

Reyes Yam-Uicab

2018-01-01

Detecting and counting elliptical objects is an interesting problem in digital image processing. There are real-world applications of this problem in various disciplines. Solving this problem is harder when there is occlusion among the elliptical objects, since in general these objects are considered as part of the bigger object (conglomerate). The solution to this problem focuses on the detection and segmentation of the precise number of occluded elliptical objects, while omitting all noninteresting objects. There are a variety of computational approximations that focus on this problem; however, such approximations are not accurate when there is occlusion. This paper presents an algorithm designed to solve this problem, specifically, to detect, segment, and count elliptical objects of a specific size when these are in occlusion with other objects within the conglomerate. Our algorithm deals with a time-consuming combinatorial process. To optimize the execution time of our algorithm, we implemented a parallel GPU version with CUDA-C, which experimentally improved the detection of occluded objects, as well as lowering processing times compared to the sequential version of the method. Comparative test results with another method featured in the literature showed improved detection of objects in occlusion when using the proposed parallel method.
Synthetic Aperture Sequential Beamforming

DEFF Research Database (Denmark)

Kortbek, Jacob; Jensen, Jørgen Arendt; Gammelmark, Kim Løkke

2008-01-01

A synthetic aperture focusing (SAF) technique denoted Synthetic Aperture Sequential Beamforming (SASB) suitable for 2D and 3D imaging is presented. The technique differs from prior art of SAF in the sense that SAF is performed on pre-beamformed data contrary to channel data. The objective is to improve and obtain a more range independent lateral resolution compared to conventional dynamic receive focusing (DRF) without compromising frame rate. SASB is a two-stage procedure using two separate beamformers. First a set of B-mode image lines using a single focal point in both transmit and receive is stored. The second stage applies the focused image lines from the first stage as input data. The SASB method has been investigated using simulations in Field II and by off-line processing of data acquired with a commercial scanner. The performance of SASB with a static image object is compared with DRF...
Evaluation Using Sequential Trials Methods.

Science.gov (United States)

Cohen, Mark E.; Ralls, Stephen A.

1986-01-01

Although dental school faculty as well as practitioners are interested in evaluating products and procedures used in clinical practice, research design and statistical analysis can sometimes pose problems. Sequential trials methods provide an analytical structure that is both easy to use and statistically valid. (Author/MLW)
Attack Trees with Sequential Conjunction

NARCIS (Netherlands)

Jhawar, Ravi; Kordy, Barbara; Mauw, Sjouke; Radomirović, Sasa; Trujillo-Rasua, Rolando

2015-01-01

We provide the first formal foundation of SAND attack trees which are a popular extension of the well-known attack trees. The SAND attack tree formalism increases the expressivity of attack trees by introducing the sequential conjunctive operator SAND. This operator enables the modeling of
Parallel SOL transport in MAST and JET: the impact of the mirror force

International Nuclear Information System (INIS)

Kirk, A; Fundamenski, W; Ahn, J-W; Counsell, G

2003-01-01

Interpretative modelling of the SOL plasma in conventional (JET) and tight (MAST) aspect ratio devices has been performed using OSM2/EIRENE. A detailed comparison has been made of the solutions of the fluid equations and one key issue uncovered by this modelling is the significance of the mirror force for the spherical tokamak (ST) SOL. This force is proportional to ∇∥B/B, which is typically a factor 10 larger in an ST due to the low aspect ratio. This term leads to changes in the charged particle velocity distributions near regions with large ∇∥B/B representing an effective, upstream particle and momentum source. The modelling performed in this paper indicates that exclusion of the ∇∥B term may lead to incorrect conclusions on, for example, the upstream density, especially in STs.
PARAMO: a PARAllel predictive MOdeling platform for healthcare analytic research using electronic health records.

Science.gov (United States)

Ng, Kenney; Ghoting, Amol; Steinhubl, Steven R; Stewart, Walter F; Malin, Bradley; Sun, Jimeng

2014-04-01

Healthcare analytics research increasingly involves the construction of predictive models for disease targets across varying patient cohorts using electronic health records (EHRs). To facilitate this process, it is critical to support a pipeline of tasks: (1) cohort construction, (2) feature construction, (3) cross-validation, (4) feature selection, and (5) classification. To develop an appropriate model, it is necessary to compare and refine models derived from a diversity of cohorts, patient-specific features, and statistical frameworks. The goal of this work is to develop and evaluate a predictive modeling platform that can be used to simplify and expedite this process for health data. To support this goal, we developed a PARAllel predictive MOdeling (PARAMO) platform which (1) constructs a dependency graph of tasks from specifications of predictive modeling pipelines, (2) schedules the tasks in a topological ordering of the graph, and (3) executes those tasks in parallel. We implemented this platform using Map-Reduce to enable independent tasks to run in parallel in a cluster computing environment. Different task scheduling preferences are also supported. We assess the performance of PARAMO on various workloads using three datasets derived from the EHR systems in place at Geisinger Health System and Vanderbilt University Medical Center and an anonymous longitudinal claims database. We demonstrate significant gains in computational efficiency against a standard approach. In particular, PARAMO can build 800 different models on a 300,000 patient data set in 3 h in parallel compared to 9 days if running sequentially. This work demonstrates that an efficient parallel predictive modeling platform can be developed for EHR data. This platform can facilitate large-scale modeling endeavors and speed up the research workflow and reuse of health information. This platform is only a first step and provides the foundation for our ultimate goal of building analytic pipelines
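The three steps named in this abstract (build a dependency graph, schedule in topological order, run independent tasks in parallel) can be sketched on a single machine as follows. This is only an illustration under stated assumptions: it uses Python's standard `graphlib` and a thread pool rather than Map-Reduce, and the function `run_pipeline` and the task names in the usage note are hypothetical, not part of PARAMO.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from graphlib import TopologicalSorter


def run_pipeline(tasks, deps, workers=4):
    """Run tasks in dependency order, executing independent ones in parallel.

    tasks: dict mapping task name -> zero-argument callable
    deps:  dict mapping task name -> set of names that must finish first
    Returns a dict mapping task name -> the callable's return value.
    """
    # Step 1: dependency graph (node -> its prerequisites).
    sorter = TopologicalSorter({name: deps.get(name, set()) for name in tasks})
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    results, running = {}, {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while sorter.is_active():
            # Step 2: every task whose prerequisites are done is ready now.
            for name in sorter.get_ready():
                running[pool.submit(tasks[name])] = name
            # Step 3: wait for at least one running task, then unlock successors.
            done, _ = wait(running, return_when=FIRST_COMPLETED)
            for future in done:
                name = running.pop(future)
                results[name] = future.result()
                sorter.done(name)
    return results
```

A caller might pass `deps = {"features": {"cohort"}, "classification": {"features"}}` so that feature construction waits for cohort construction. For scale, the reported 9 days versus 3 h corresponds to roughly a 72-fold reduction in wall-clock time (216 h / 3 h).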
An investigation of methods for free-field comparison calibration of measurement microphones

DEFF Research Database (Denmark)

Barrera-Figueroa, Salvador; Moreno Pescador, Guillermo; Jacobsen, Finn

2010-01-01

Free-field comparison calibration of measurement microphones requires that a calibrated reference microphone and a test microphone are exposed to the same sound pressure in a free field. The output voltages of the microphones can be measured either sequentially or simultaneously. The sequential method requires the sound field to have good temporal stability. The simultaneous method requires instead that the sound pressure is the same in the positions where the microphones are placed. In this paper the results of the application of the two methods are compared. A third combined method...
