WorldWideScience

Sample records for massively parallel nucleotide

  1. Massively parallel mathematical sieves

    Energy Technology Data Exchange (ETDEWEB)

    Montry, G.R.

    1989-01-01

    The Sieve of Eratosthenes is a well-known algorithm for finding all prime numbers in a given subset of integers. A parallel version of the Sieve is described that produces computational speedups over 800 on a hypercube with 1,024 processing elements for problems of fixed size. Computational speedups as high as 980 are achieved when the problem size per processor is fixed. The method of parallelization generalizes to other sieves and will be efficient on any ensemble architecture. We investigate two highly parallel sieves using scattered decomposition and compare their performance on a hypercube multiprocessor. A comparison of different parallelization techniques for the sieve illustrates the trade-offs necessary in the design and implementation of massively parallel algorithms for large ensemble computers.

  2. Massively parallel multicanonical simulations

    Science.gov (United States)

    Gross, Jonathan; Zierenberg, Johannes; Weigel, Martin; Janke, Wolfhard

    2018-03-01

    Generalized-ensemble Monte Carlo simulations such as the multicanonical method and similar techniques are among the most efficient approaches for simulations of systems undergoing discontinuous phase transitions or with rugged free-energy landscapes. As Markov chain methods, they are inherently serial computationally. It was demonstrated recently, however, that a combination of independent simulations that communicate weight updates at variable intervals allows for the efficient utilization of parallel computational resources for multicanonical simulations. Implementing this approach for the many-thread architecture provided by current generations of graphics processing units (GPUs), we show how it can be efficiently employed with of the order of 104 parallel walkers and beyond, thus constituting a versatile tool for Monte Carlo simulations in the era of massively parallel computing. We provide the fully documented source code for the approach applied to the paradigmatic example of the two-dimensional Ising model as starting point and reference for practitioners in the field.

  3. Massively Parallel QCD

    International Nuclear Information System (INIS)

    Soltz, R; Vranas, P; Blumrich, M; Chen, D; Gara, A; Giampap, M; Heidelberger, P; Salapura, V; Sexton, J; Bhanot, G

    2007-01-01

    The theory of the strong nuclear force, Quantum Chromodynamics (QCD), can be numerically simulated from first principles on massively-parallel supercomputers using the method of Lattice Gauge Theory. We describe the special programming requirements of lattice QCD (LQCD) as well as the optimal supercomputer hardware architectures that it suggests. We demonstrate these methods on the BlueGene massively-parallel supercomputer and argue that LQCD and the BlueGene architecture are a natural match. This can be traced to the simple fact that LQCD is a regular lattice discretization of space into lattice sites while the BlueGene supercomputer is a discretization of space into compute nodes, and that both are constrained by requirements of locality. This simple relation is both technologically important and theoretically intriguing. The main result of this paper is the speedup of LQCD using up to 131,072 CPUs on the largest BlueGene/L supercomputer. The speedup is perfect with sustained performance of about 20% of peak. This corresponds to a maximum of 70.5 sustained TFlop/s. At these speeds LQCD and BlueGene are poised to produce the next generation of strong interaction physics theoretical results

  4. Massively parallel quantum computer simulator

    NARCIS (Netherlands)

    De Raedt, K.; Michielsen, K.; De Raedt, H.; Trieu, B.; Arnold, G.; Richter, M.; Lippert, Th.; Watanabe, H.; Ito, N.

    2007-01-01

    We describe portable software to simulate universal quantum computers on massive parallel Computers. We illustrate the use of the simulation software by running various quantum algorithms on different computer architectures, such as a IBM BlueGene/L, a IBM Regatta p690+, a Hitachi SR11000/J1, a Cray

  5. Massively Parallel Finite Element Programming

    KAUST Repository

    Heister, Timo

    2010-01-01

    Today\\'s large finite element simulations require parallel algorithms to scale on clusters with thousands or tens of thousands of processor cores. We present data structures and algorithms to take advantage of the power of high performance computers in generic finite element codes. Existing generic finite element libraries often restrict the parallelization to parallel linear algebra routines. This is a limiting factor when solving on more than a few hundreds of cores. We describe routines for distributed storage of all major components coupled with efficient, scalable algorithms. We give an overview of our effort to enable the modern and generic finite element library deal.II to take advantage of the power of large clusters. In particular, we describe the construction of a distributed mesh and develop algorithms to fully parallelize the finite element calculation. Numerical results demonstrate good scalability. © 2010 Springer-Verlag.

  6. Massively Parallel Finite Element Programming

    KAUST Repository

    Heister, Timo; Kronbichler, Martin; Bangerth, Wolfgang

    2010-01-01

    Today's large finite element simulations require parallel algorithms to scale on clusters with thousands or tens of thousands of processor cores. We present data structures and algorithms to take advantage of the power of high performance computers in generic finite element codes. Existing generic finite element libraries often restrict the parallelization to parallel linear algebra routines. This is a limiting factor when solving on more than a few hundreds of cores. We describe routines for distributed storage of all major components coupled with efficient, scalable algorithms. We give an overview of our effort to enable the modern and generic finite element library deal.II to take advantage of the power of large clusters. In particular, we describe the construction of a distributed mesh and develop algorithms to fully parallelize the finite element calculation. Numerical results demonstrate good scalability. © 2010 Springer-Verlag.

  7. A Massively Parallel Face Recognition System

    Directory of Open Access Journals (Sweden)

    Lahdenoja Olli

    2007-01-01

    Full Text Available We present methods for processing the LBPs (local binary patterns with a massively parallel hardware, especially with CNN-UM (cellular nonlinear network-universal machine. In particular, we present a framework for implementing a massively parallel face recognition system, including a dedicated highly accurate algorithm suitable for various types of platforms (e.g., CNN-UM and digital FPGA. We study in detail a dedicated mixed-mode implementation of the algorithm and estimate its implementation cost in the view of its performance and accuracy restrictions.

  8. A Massively Parallel Face Recognition System

    Directory of Open Access Journals (Sweden)

    Ari Paasio

    2006-12-01

    Full Text Available We present methods for processing the LBPs (local binary patterns with a massively parallel hardware, especially with CNN-UM (cellular nonlinear network-universal machine. In particular, we present a framework for implementing a massively parallel face recognition system, including a dedicated highly accurate algorithm suitable for various types of platforms (e.g., CNN-UM and digital FPGA. We study in detail a dedicated mixed-mode implementation of the algorithm and estimate its implementation cost in the view of its performance and accuracy restrictions.

  9. Massively parallel diffuse optical tomography

    Energy Technology Data Exchange (ETDEWEB)

    Sandusky, John V.; Pitts, Todd A.

    2017-09-05

    Diffuse optical tomography systems and methods are described herein. In a general embodiment, the diffuse optical tomography system comprises a plurality of sensor heads, the plurality of sensor heads comprising respective optical emitter systems and respective sensor systems. A sensor head in the plurality of sensors heads is caused to act as an illuminator, such that its optical emitter system transmits a transillumination beam towards a portion of a sample. Other sensor heads in the plurality of sensor heads act as observers, detecting portions of the transillumination beam that radiate from the sample in the fields of view of the respective sensory systems of the other sensor heads. Thus, sensor heads in the plurality of sensors heads generate sensor data in parallel.

  10. Massively Parallel Computing: A Sandia Perspective

    Energy Technology Data Exchange (ETDEWEB)

    Dosanjh, Sudip S.; Greenberg, David S.; Hendrickson, Bruce; Heroux, Michael A.; Plimpton, Steve J.; Tomkins, James L.; Womble, David E.

    1999-05-06

    The computing power available to scientists and engineers has increased dramatically in the past decade, due in part to progress in making massively parallel computing practical and available. The expectation for these machines has been great. The reality is that progress has been slower than expected. Nevertheless, massively parallel computing is beginning to realize its potential for enabling significant break-throughs in science and engineering. This paper provides a perspective on the state of the field, colored by the authors' experiences using large scale parallel machines at Sandia National Laboratories. We address trends in hardware, system software and algorithms, and we also offer our view of the forces shaping the parallel computing industry.

  11. Adapting algorithms to massively parallel hardware

    CERN Document Server

    Sioulas, Panagiotis

    2016-01-01

    In the recent years, the trend in computing has shifted from delivering processors with faster clock speeds to increasing the number of cores per processor. This marks a paradigm shift towards parallel programming in which applications are programmed to exploit the power provided by multi-cores. Usually there is gain in terms of the time-to-solution and the memory footprint. Specifically, this trend has sparked an interest towards massively parallel systems that can provide a large number of processors, and possibly computing nodes, as in the GPUs and MPPAs (Massively Parallel Processor Arrays). In this project, the focus was on two distinct computing problems: k-d tree searches and track seeding cellular automata. The goal was to adapt the algorithms to parallel systems and evaluate their performance in different cases.

  12. Massively parallel evolutionary computation on GPGPUs

    CERN Document Server

    Tsutsui, Shigeyoshi

    2013-01-01

    Evolutionary algorithms (EAs) are metaheuristics that learn from natural collective behavior and are applied to solve optimization problems in domains such as scheduling, engineering, bioinformatics, and finance. Such applications demand acceptable solutions with high-speed execution using finite computational resources. Therefore, there have been many attempts to develop platforms for running parallel EAs using multicore machines, massively parallel cluster machines, or grid computing environments. Recent advances in general-purpose computing on graphics processing units (GPGPU) have opened u

  13. Massively parallel sequencing of forensic STRs

    DEFF Research Database (Denmark)

    Parson, Walther; Ballard, David; Budowle, Bruce

    2016-01-01

    The DNA Commission of the International Society for Forensic Genetics (ISFG) is reviewing factors that need to be considered ahead of the adoption by the forensic community of short tandem repeat (STR) genotyping by massively parallel sequencing (MPS) technologies. MPS produces sequence data that...

  14. Massively parallel Fokker-Planck code ALLAp

    International Nuclear Information System (INIS)

    Batishcheva, A.A.; Krasheninnikov, S.I.; Craddock, G.G.; Djordjevic, V.

    1996-01-01

    The recently developed for workstations Fokker-Planck code ALLA simulates the temporal evolution of 1V, 2V and 1D2V collisional edge plasmas. In this work we present the results of code parallelization on the CRI T3D massively parallel platform (ALLAp version). Simultaneously we benchmark the 1D2V parallel vesion against an analytic self-similar solution of the collisional kinetic equation. This test is not trivial as it demands a very strong spatial temperature and density variation within the simulation domain. (orig.)

  15. Frontiers of massively parallel scientific computation

    International Nuclear Information System (INIS)

    Fischer, J.R.

    1987-07-01

    Practical applications using massively parallel computer hardware first appeared during the 1980s. Their development was motivated by the need for computing power orders of magnitude beyond that available today for tasks such as numerical simulation of complex physical and biological processes, generation of interactive visual displays, satellite image analysis, and knowledge based systems. Representative of the first generation of this new class of computers is the Massively Parallel Processor (MPP). A team of scientists was provided the opportunity to test and implement their algorithms on the MPP. The first results are presented. The research spans a broad variety of applications including Earth sciences, physics, signal and image processing, computer science, and graphics. The performance of the MPP was very good. Results obtained using the Connection Machine and the Distributed Array Processor (DAP) are presented

  16. Impact analysis on a massively parallel computer

    International Nuclear Information System (INIS)

    Zacharia, T.; Aramayo, G.A.

    1994-01-01

    Advanced mathematical techniques and computer simulation play a major role in evaluating and enhancing the design of beverage cans, industrial, and transportation containers for improved performance. Numerical models are used to evaluate the impact requirements of containers used by the Department of Energy (DOE) for transporting radioactive materials. Many of these models are highly compute-intensive. An analysis may require several hours of computational time on current supercomputers despite the simplicity of the models being studied. As computer simulations and materials databases grow in complexity, massively parallel computers have become important tools. Massively parallel computational research at the Oak Ridge National Laboratory (ORNL) and its application to the impact analysis of shipping containers is briefly described in this paper

  17. Template based parallel checkpointing in a massively parallel computer system

    Science.gov (United States)

    Archer, Charles Jens [Rochester, MN; Inglett, Todd Alan [Rochester, MN

    2009-01-13

    A method and apparatus for a template based parallel checkpoint save for a massively parallel super computer system using a parallel variation of the rsync protocol, and network broadcast. In preferred embodiments, the checkpoint data for each node is compared to a template checkpoint file that resides in the storage and that was previously produced. Embodiments herein greatly decrease the amount of data that must be transmitted and stored for faster checkpointing and increased efficiency of the computer system. Embodiments are directed to a parallel computer system with nodes arranged in a cluster with a high speed interconnect that can perform broadcast communication. The checkpoint contains a set of actual small data blocks with their corresponding checksums from all nodes in the system. The data blocks may be compressed using conventional non-lossy data compression algorithms to further reduce the overall checkpoint size.

  18. Massively Parallel Algorithms for Solution of Schrodinger Equation

    Science.gov (United States)

    Fijany, Amir; Barhen, Jacob; Toomerian, Nikzad

    1994-01-01

    In this paper massively parallel algorithms for solution of Schrodinger equation are developed. Our results clearly indicate that the Crank-Nicolson method, in addition to its excellent numerical properties, is also highly suitable for massively parallel computation.

  19. A Massively Parallel Code for Polarization Calculations

    Science.gov (United States)

    Akiyama, Shizuka; Höflich, Peter

    2001-03-01

    We present an implementation of our Monte-Carlo radiation transport method for rapidly expanding, NLTE atmospheres for massively parallel computers which utilizes both the distributed and shared memory models. This allows us to take full advantage of the fast communication and low latency inherent to nodes with multiple CPUs, and to stretch the limits of scalability with the number of nodes compared to a version which is based on the shared memory model. Test calculations on a local 20-node Beowulf cluster with dual CPUs showed an improved scalability by about 40%.

  20. Massive Asynchronous Parallelization of Sparse Matrix Factorizations

    Energy Technology Data Exchange (ETDEWEB)

    Chow, Edmond [Georgia Inst. of Technology, Atlanta, GA (United States)

    2018-01-08

    Solving sparse problems is at the core of many DOE computational science applications. We focus on the challenge of developing sparse algorithms that can fully exploit the parallelism in extreme-scale computing systems, in particular systems with massive numbers of cores per node. Our approach is to express a sparse matrix factorization as a large number of bilinear constraint equations, and then solving these equations via an asynchronous iterative method. The unknowns in these equations are the matrix entries of the factorization that is desired.

  1. Neural nets for massively parallel optimization

    Science.gov (United States)

    Dixon, Laurence C. W.; Mills, David

    1992-07-01

    To apply massively parallel processing systems to the solution of large scale optimization problems it is desirable to be able to evaluate any function f(z), z (epsilon) Rn in a parallel manner. The theorem of Cybenko, Hecht Nielsen, Hornik, Stinchcombe and White, and Funahasi shows that this can be achieved by a neural network with one hidden layer. In this paper we address the problem of the number of nodes required in the layer to achieve a given accuracy in the function and gradient values at all points within a given n dimensional interval. The type of activation function needed to obtain nonsingular Hessian matrices is described and a strategy for obtaining accurate minimal networks presented.

  2. Massive hybrid parallelism for fully implicit multiphysics

    International Nuclear Information System (INIS)

    Gaston, D. R.; Permann, C. J.; Andrs, D.; Peterson, J. W.

    2013-01-01

    As hardware advances continue to modify the supercomputing landscape, traditional scientific software development practices will become more outdated, ineffective, and inefficient. The process of rewriting/retooling existing software for new architectures is a Sisyphean task, and results in substantial hours of development time, effort, and money. Software libraries which provide an abstraction of the resources provided by such architectures are therefore essential if the computational engineering and science communities are to continue to flourish in this modern computing environment. The Multiphysics Object Oriented Simulation Environment (MOOSE) framework enables complex multiphysics analysis tools to be built rapidly by scientists, engineers, and domain specialists, while also allowing them to both take advantage of current HPC architectures, and efficiently prepare for future supercomputer designs. MOOSE employs a hybrid shared-memory and distributed-memory parallel model and provides a complete and consistent interface for creating multiphysics analysis tools. In this paper, a brief discussion of the mathematical algorithms underlying the framework and the internal object-oriented hybrid parallel design are given. Representative massively parallel results from several applications areas are presented, and a brief discussion of future areas of research for the framework are provided. (authors)

  3. Massive hybrid parallelism for fully implicit multiphysics

    Energy Technology Data Exchange (ETDEWEB)

    Gaston, D. R.; Permann, C. J.; Andrs, D.; Peterson, J. W. [Idaho National Laboratory, 2525 N. Fremont Ave., Idaho Falls, ID 83415 (United States)

    2013-07-01

    As hardware advances continue to modify the supercomputing landscape, traditional scientific software development practices will become more outdated, ineffective, and inefficient. The process of rewriting/retooling existing software for new architectures is a Sisyphean task, and results in substantial hours of development time, effort, and money. Software libraries which provide an abstraction of the resources provided by such architectures are therefore essential if the computational engineering and science communities are to continue to flourish in this modern computing environment. The Multiphysics Object Oriented Simulation Environment (MOOSE) framework enables complex multiphysics analysis tools to be built rapidly by scientists, engineers, and domain specialists, while also allowing them to both take advantage of current HPC architectures, and efficiently prepare for future supercomputer designs. MOOSE employs a hybrid shared-memory and distributed-memory parallel model and provides a complete and consistent interface for creating multiphysics analysis tools. In this paper, a brief discussion of the mathematical algorithms underlying the framework and the internal object-oriented hybrid parallel design are given. Representative massively parallel results from several applications areas are presented, and a brief discussion of future areas of research for the framework are provided. (authors)

  4. MASSIVE HYBRID PARALLELISM FOR FULLY IMPLICIT MULTIPHYSICS

    Energy Technology Data Exchange (ETDEWEB)

    Cody J. Permann; David Andrs; John W. Peterson; Derek R. Gaston

    2013-05-01

    As hardware advances continue to modify the supercomputing landscape, traditional scientific software development practices will become more outdated, ineffective, and inefficient. The process of rewriting/retooling existing software for new architectures is a Sisyphean task, and results in substantial hours of development time, effort, and money. Software libraries which provide an abstraction of the resources provided by such architectures are therefore essential if the computational engineering and science communities are to continue to flourish in this modern computing environment. The Multiphysics Object Oriented Simulation Environment (MOOSE) framework enables complex multiphysics analysis tools to be built rapidly by scientists, engineers, and domain specialists, while also allowing them to both take advantage of current HPC architectures, and efficiently prepare for future supercomputer designs. MOOSE employs a hybrid shared-memory and distributed-memory parallel model and provides a complete and consistent interface for creating multiphysics analysis tools. In this paper, a brief discussion of the mathematical algorithms underlying the framework and the internal object-oriented hybrid parallel design are given. Representative massively parallel results from several applications areas are presented, and a brief discussion of future areas of research for the framework are provided.

  5. Massively parallel Fokker-Planck calculations

    International Nuclear Information System (INIS)

    Mirin, A.A.

    1990-01-01

    This paper reports that the Fokker-Planck package FPPAC, which solves the complete nonlinear multispecies Fokker-Planck collision operator for a plasma in two-dimensional velocity space, has been rewritten for the Connection Machine 2. This has involved allocation of variables either to the front end or the CM2, minimization of data flow, and replacement of Cray-optimized algorithms with ones suitable for a massively parallel architecture. Calculations have been carried out on various Connection Machines throughout the country. Results and timings on these machines have been compared to each other and to those on the static memory Cray-2. For large problem size, the Connection Machine 2 is found to be cost-efficient

  6. Computational chaos in massively parallel neural networks

    Science.gov (United States)

    Barhen, Jacob; Gulati, Sandeep

    1989-01-01

    A fundamental issue which directly impacts the scalability of current theoretical neural network models to massively parallel embodiments, in both software as well as hardware, is the inherent and unavoidable concurrent asynchronicity of emerging fine-grained computational ensembles and the possible emergence of chaotic manifestations. Previous analyses attributed dynamical instability to the topology of the interconnection matrix, to parasitic components or to propagation delays. However, researchers have observed the existence of emergent computational chaos in a concurrently asynchronous framework, independent of the network topology. Researcher present a methodology enabling the effective asynchronous operation of large-scale neural networks. Necessary and sufficient conditions guaranteeing concurrent asynchronous convergence are established in terms of contracting operators. Lyapunov exponents are computed formally to characterize the underlying nonlinear dynamics. Simulation results are presented to illustrate network convergence to the correct results, even in the presence of large delays.

  7. Climate models on massively parallel computers

    International Nuclear Information System (INIS)

    Vitart, F.; Rouvillois, P.

    1993-01-01

    First results got on massively parallel computers (Multiple Instruction Multiple Data and Simple Instruction Multiple Data) allow to consider building of coupled models with high resolutions. This would make possible simulation of thermoaline circulation and other interaction phenomena between atmosphere and ocean. The increasing of computers powers, and then the improvement of resolution will go us to revise our approximations. Then hydrostatic approximation (in ocean circulation) will not be valid when the grid mesh will be of a dimension lower than a few kilometers: We shall have to find other models. The expert appraisement got in numerical analysis at the Center of Limeil-Valenton (CEL-V) will be used again to imagine global models taking in account atmosphere, ocean, ice floe and biosphere, allowing climate simulation until a regional scale

  8. Massively parallel computation of conservation laws

    Energy Technology Data Exchange (ETDEWEB)

    Garbey, M [Univ. Claude Bernard, Villeurbanne (France); Levine, D [Argonne National Lab., IL (United States)

    1990-01-01

    The authors present a new method for computing solutions of conservation laws based on the use of cellular automata with the method of characteristics. The method exploits the high degree of parallelism available with cellular automata and retains important features of the method of characteristics. It yields high numerical accuracy and extends naturally to adaptive meshes and domain decomposition methods for perturbed conservation laws. They describe the method and its implementation for a Dirichlet problem with a single conservation law for the one-dimensional case. Numerical results for the one-dimensional law with the classical Burgers nonlinearity or the Buckley-Leverett equation show good numerical accuracy outside the neighborhood of the shocks. The error in the area of the shocks is of the order of the mesh size. The algorithm is well suited for execution on both massively parallel computers and vector machines. They present timing results for an Alliant FX/8, Connection Machine Model 2, and CRAY X-MP.

  9. Massively Parallel Dimension Independent Adaptive Metropolis

    KAUST Repository

    Chen, Yuxin

    2015-05-14

    This work considers black-box Bayesian inference over high-dimensional parameter spaces. The well-known and widely respected adaptive Metropolis (AM) algorithm is extended herein to asymptotically scale uniformly with respect to the underlying parameter dimension, by respecting the variance, for Gaussian targets. The result- ing algorithm, referred to as the dimension-independent adaptive Metropolis (DIAM) algorithm, also shows improved performance with respect to adaptive Metropolis on non-Gaussian targets. This algorithm is further improved, and the possibility of probing high-dimensional targets is enabled, via GPU-accelerated numerical libraries and periodically synchronized concurrent chains (justified a posteriori). Asymptoti- cally in dimension, this massively parallel dimension-independent adaptive Metropolis (MPDIAM) GPU implementation exhibits a factor of four improvement versus the CPU-based Intel MKL version alone, which is itself already a factor of three improve- ment versus the serial version. The scaling to multiple CPUs and GPUs exhibits a form of strong scaling in terms of the time necessary to reach a certain convergence criterion, through a combination of longer time per sample batch (weak scaling) and yet fewer necessary samples to convergence. This is illustrated by e ciently sampling from several Gaussian and non-Gaussian targets for dimension d 1000.

  10. Multiplexed microsatellite recovery using massively parallel sequencing

    Science.gov (United States)

    Jennings, T.N.; Knaus, B.J.; Mullins, T.D.; Haig, S.M.; Cronn, R.C.

    2011-01-01

    Conservation and management of natural populations requires accurate and inexpensive genotyping methods. Traditional microsatellite, or simple sequence repeat (SSR), marker analysis remains a popular genotyping method because of the comparatively low cost of marker development, ease of analysis and high power of genotype discrimination. With the availability of massively parallel sequencing (MPS), it is now possible to sequence microsatellite-enriched genomic libraries in multiplex pools. To test this approach, we prepared seven microsatellite-enriched, barcoded genomic libraries from diverse taxa (two conifer trees, five birds) and sequenced these on one lane of the Illumina Genome Analyzer using paired-end 80-bp reads. In this experiment, we screened 6.1 million sequences and identified 356958 unique microreads that contained di- or trinucleotide microsatellites. Examination of four species shows that our conversion rate from raw sequences to polymorphic markers compares favourably to Sanger- and 454-based methods. The advantage of multiplexed MPS is that the staggering capacity of modern microread sequencing is spread across many libraries; this reduces sample preparation and sequencing costs to less than $400 (USD) per species. This price is sufficiently low that microsatellite libraries could be prepared and sequenced for all 1373 organisms listed as 'threatened' and 'endangered' in the United States for under $0.5M (USD).

  11. Programming massively parallel processors a hands-on approach

    CERN Document Server

    Kirk, David B

    2010-01-01

    Programming Massively Parallel Processors discusses basic concepts about parallel programming and GPU architecture. ""Massively parallel"" refers to the use of a large number of processors to perform a set of computations in a coordinated parallel way. The book details various techniques for constructing parallel programs. It also discusses the development process, performance level, floating-point format, parallel patterns, and dynamic parallelism. The book serves as a teaching guide where parallel programming is the main topic of the course. It builds on the basics of C programming for CUDA, a parallel programming environment that is supported on NVI- DIA GPUs. Composed of 12 chapters, the book begins with basic information about the GPU as a parallel computer source. It also explains the main concepts of CUDA, data parallelism, and the importance of memory access efficiency using CUDA. The target audience of the book is graduate and undergraduate students from all science and engineering disciplines who ...

  12. Statistical method to compare massive parallel sequencing pipelines.

    Science.gov (United States)

    Elsensohn, M H; Leblay, N; Dimassi, S; Campan-Fournier, A; Labalme, A; Roucher-Boulez, F; Sanlaville, D; Lesca, G; Bardel, C; Roy, P

    2017-03-01

    Today, sequencing is frequently carried out by Massive Parallel Sequencing (MPS) that cuts drastically sequencing time and expenses. Nevertheless, Sanger sequencing remains the main validation method to confirm the presence of variants. The analysis of MPS data involves the development of several bioinformatic tools, academic or commercial. We present here a statistical method to compare MPS pipelines and test it in a comparison between an academic (BWA-GATK) and a commercial pipeline (TMAP-NextGENe®), with and without reference to a gold standard (here, Sanger sequencing), on a panel of 41 genes in 43 epileptic patients. This method used the number of variants to fit log-linear models for pairwise agreements between pipelines. To assess the heterogeneity of the margins and the odds ratios of agreement, four log-linear models were used: a full model, a homogeneous-margin model, a model with single odds ratio for all patients, and a model with single intercept. Then a log-linear mixed model was fitted considering the biological variability as a random effect. Among the 390,339 base-pairs sequenced, TMAP-NextGENe® and BWA-GATK found, on average, 2253.49 and 1857.14 variants (single nucleotide variants and indels), respectively. Against the gold standard, the pipelines had similar sensitivities (63.47% vs. 63.42%) and close but significantly different specificities (99.57% vs. 99.65%; p < 0.001). Same-trend results were obtained when only single nucleotide variants were considered (99.98% specificity and 76.81% sensitivity for both pipelines). The method allows thus pipeline comparison and selection. It is generalizable to all types of MPS data and all pipelines.

  13. RAMA: A file system for massively parallel computers

    Science.gov (United States)

    Miller, Ethan L.; Katz, Randy H.

    1993-01-01

    This paper describes a file system design for massively parallel computers which makes very efficient use of a few disks per processor. This overcomes the traditional I/O bottleneck of massively parallel machines by storing the data on disks within the high-speed interconnection network. In addition, the file system, called RAMA, requires little inter-node synchronization, removing another common bottleneck in parallel processor file systems. Support for a large tertiary storage system can easily be integrated in lo the file system; in fact, RAMA runs most efficiently when tertiary storage is used.

  14. A discrete ordinate response matrix method for massively parallel computers

    International Nuclear Information System (INIS)

    Hanebutte, U.R.; Lewis, E.E.

    1991-01-01

    A discrete ordinate response matrix method is formulated for the solution of neutron transport problems on massively parallel computers. The response matrix formulation eliminates iteration on the scattering source. The nodal matrices which result from the diamond-differenced equations are utilized in a factored form which minimizes memory requirements and significantly reduces the required number of algorithm utilizes massive parallelism by assigning each spatial node to a processor. The algorithm is accelerated effectively by a synthetic method in which the low-order diffusion equations are also solved by massively parallel red/black iterations. The method has been implemented on a 16k Connection Machine-2, and S 8 and S 16 solutions have been obtained for fixed-source benchmark problems in X--Y geometry

  15. Solving the Stokes problem on a massively parallel computer

    DEFF Research Database (Denmark)

    Axelsson, Owe; Barker, Vincent A.; Neytcheva, Maya

    2001-01-01

    boundary value problem for each velocity component, are solved by the conjugate gradient method with a preconditioning based on the algebraic multi‐level iteration (AMLI) technique. The velocity is found from the computed pressure. The method is optimal in the sense that the computational work...... is proportional to the number of unknowns. Further, it is designed to exploit a massively parallel computer with distributed memory architecture. Numerical experiments on a Cray T3E computer illustrate the parallel performance of the method....

  16. Performance of Air Pollution Models on Massively Parallel Computers

    DEFF Research Database (Denmark)

    Brown, John; Hansen, Per Christian; Wasniewski, Jerzy

    1996-01-01

    To compare the performance and use of three massively parallel SIMD computers, we implemented a large air pollution model on the computers. Using a realistic large-scale model, we gain detailed insight about the performance of the three computers when used to solve large-scale scientific problems...

  17. PARALLEL SPATIOTEMPORAL SPECTRAL CLUSTERING WITH MASSIVE TRAJECTORY DATA

    Directory of Open Access Journals (Sweden)

    Y. Z. Gu

    2017-09-01

    Full Text Available Massive trajectory data contains wealth useful information and knowledge. Spectral clustering, which has been shown to be effective in finding clusters, becomes an important clustering approaches in the trajectory data mining. However, the traditional spectral clustering lacks the temporal expansion on the algorithm and limited in its applicability to large-scale problems due to its high computational complexity. This paper presents a parallel spatiotemporal spectral clustering based on multiple acceleration solutions to make the algorithm more effective and efficient, the performance is proved due to the experiment carried out on the massive taxi trajectory dataset in Wuhan city, China.

  18. A Programming Model for Massive Data Parallelism with Data Dependencies

    International Nuclear Information System (INIS)

    Cui, Xiaohui; Mueller, Frank; Potok, Thomas E.; Zhang, Yongpeng

    2009-01-01

    Accelerating processors can often be more cost and energy effective for a wide range of data-parallel computing problems than general-purpose processors. For graphics processor units (GPUs), this is particularly the case when program development is aided by environments such as NVIDIA s Compute Unified Device Architecture (CUDA), which dramatically reduces the gap between domain-specific architectures and general purpose programming. Nonetheless, general-purpose GPU (GPGPU) programming remains subject to several restrictions. Most significantly, the separation of host (CPU) and accelerator (GPU) address spaces requires explicit management of GPU memory resources, especially for massive data parallelism that well exceeds the memory capacity of GPUs. One solution to this problem is to transfer data between the GPU and host memories frequently. In this work, we investigate another approach. We run massively data-parallel applications on GPU clusters. We further propose a programming model for massive data parallelism with data dependencies for this scenario. Experience from micro benchmarks and real-world applications shows that our model provides not only ease of programming but also significant performance gains

  19. The language parallel Pascal and other aspects of the massively parallel processor

    Science.gov (United States)

    Reeves, A. P.; Bruner, J. D.

    1982-01-01

    A high level language for the Massively Parallel Processor (MPP) was designed. This language, called Parallel Pascal, is described in detail. A description of the language design, a description of the intermediate language, Parallel P-Code, and details for the MPP implementation are included. Formal descriptions of Parallel Pascal and Parallel P-Code are given. A compiler was developed which converts programs in Parallel Pascal into the intermediate Parallel P-Code language. The code generator to complete the compiler for the MPP is being developed independently. A Parallel Pascal to Pascal translator was also developed. The architecture design for a VLSI version of the MPP was completed with a description of fault tolerant interconnection networks. The memory arrangement aspects of the MPP are discussed and a survey of other high level languages is given.

  20. Massively parallel sparse matrix function calculations with NTPoly

    Science.gov (United States)

    Dawson, William; Nakajima, Takahito

    2018-04-01

    We present NTPoly, a massively parallel library for computing the functions of sparse, symmetric matrices. The theory of matrix functions is a well developed framework with a wide range of applications including differential equations, graph theory, and electronic structure calculations. One particularly important application area is diagonalization free methods in quantum chemistry. When the input and output of the matrix function are sparse, methods based on polynomial expansions can be used to compute matrix functions in linear time. We present a library based on these methods that can compute a variety of matrix functions. Distributed memory parallelization is based on a communication avoiding sparse matrix multiplication algorithm. OpenMP task parallellization is utilized to implement hybrid parallelization. We describe NTPoly's interface and show how it can be integrated with programs written in many different programming languages. We demonstrate the merits of NTPoly by performing large scale calculations on the K computer.

  1. Development of massively parallel quantum chemistry program SMASH

    Energy Technology Data Exchange (ETDEWEB)

    Ishimura, Kazuya [Department of Theoretical and Computational Molecular Science, Institute for Molecular Science 38 Nishigo-Naka, Myodaiji, Okazaki, Aichi 444-8585 (Japan)

    2015-12-31

    A massively parallel program for quantum chemistry calculations SMASH was released under the Apache License 2.0 in September 2014. The SMASH program is written in the Fortran90/95 language with MPI and OpenMP standards for parallelization. Frequently used routines, such as one- and two-electron integral calculations, are modularized to make program developments simple. The speed-up of the B3LYP energy calculation for (C{sub 150}H{sub 30}){sub 2} with the cc-pVDZ basis set (4500 basis functions) was 50,499 on 98,304 cores of the K computer.

  2. Development of massively parallel quantum chemistry program SMASH

    International Nuclear Information System (INIS)

    Ishimura, Kazuya

    2015-01-01

    A massively parallel program for quantum chemistry calculations SMASH was released under the Apache License 2.0 in September 2014. The SMASH program is written in the Fortran90/95 language with MPI and OpenMP standards for parallelization. Frequently used routines, such as one- and two-electron integral calculations, are modularized to make program developments simple. The speed-up of the B3LYP energy calculation for (C 150 H 30 ) 2 with the cc-pVDZ basis set (4500 basis functions) was 50,499 on 98,304 cores of the K computer

  3. Analysis of multigrid methods on massively parallel computers: Architectural implications

    Science.gov (United States)

    Matheson, Lesley R.; Tarjan, Robert E.

    1993-01-01

    We study the potential performance of multigrid algorithms running on massively parallel computers with the intent of discovering whether presently envisioned machines will provide an efficient platform for such algorithms. We consider the domain parallel version of the standard V cycle algorithm on model problems, discretized using finite difference techniques in two and three dimensions on block structured grids of size 10(exp 6) and 10(exp 9), respectively. Our models of parallel computation were developed to reflect the computing characteristics of the current generation of massively parallel multicomputers. These models are based on an interconnection network of 256 to 16,384 message passing, 'workstation size' processors executing in an SPMD mode. The first model accomplishes interprocessor communications through a multistage permutation network. The communication cost is a logarithmic function which is similar to the costs in a variety of different topologies. The second model allows single stage communication costs only. Both models were designed with information provided by machine developers and utilize implementation derived parameters. With the medium grain parallelism of the current generation and the high fixed cost of an interprocessor communication, our analysis suggests an efficient implementation requires the machine to support the efficient transmission of long messages, (up to 1000 words) or the high initiation cost of a communication must be significantly reduced through an alternative optimization technique. Furthermore, with variable length message capability, our analysis suggests the low diameter multistage networks provide little or no advantage over a simple single stage communications network.

  4. Computational fluid dynamics on a massively parallel computer

    Science.gov (United States)

    Jespersen, Dennis C.; Levit, Creon

    1989-01-01

    A finite difference code was implemented for the compressible Navier-Stokes equations on the Connection Machine, a massively parallel computer. The code is based on the ARC2D/ARC3D program and uses the implicit factored algorithm of Beam and Warming. The codes uses odd-even elimination to solve linear systems. Timings and computation rates are given for the code, and a comparison is made with a Cray XMP.

  5. A massively parallel corpus: the Bible in 100 languages.

    Science.gov (United States)

    Christodouloupoulos, Christos; Steedman, Mark

    We describe the creation of a massively parallel corpus based on 100 translations of the Bible. We discuss some of the difficulties in acquiring and processing the raw material as well as the potential of the Bible as a corpus for natural language processing. Finally we present a statistical analysis of the corpora collected and a detailed comparison between the English translation and other English corpora.

  6. Image processing with massively parallel computer Quadrics Q1

    International Nuclear Information System (INIS)

    Della Rocca, A.B.; La Porta, L.; Ferriani, S.

    1995-05-01

    Aimed to evaluate the image processing capabilities of the massively parallel computer Quadrics Q1, a convolution algorithm that has been implemented is described in this report. At first the discrete convolution mathematical definition is recalled together with the main Q1 h/w and s/w features. Then the different codification forms of the algorythm are described and the Q1 performances are compared with those obtained by different computers. Finally, the conclusions report on main results and suggestions

  7. Increasing the reach of forensic genetics with massively parallel sequencing.

    Science.gov (United States)

    Budowle, Bruce; Schmedes, Sarah E; Wendt, Frank R

    2017-09-01

    The field of forensic genetics has made great strides in the analysis of biological evidence related to criminal and civil matters. More so, the discipline has set a standard of performance and quality in the forensic sciences. The advent of massively parallel sequencing will allow the field to expand its capabilities substantially. This review describes the salient features of massively parallel sequencing and how it can impact forensic genetics. The features of this technology offer increased number and types of genetic markers that can be analyzed, higher throughput of samples, and the capability of targeting different organisms, all by one unifying methodology. While there are many applications, three are described where massively parallel sequencing will have immediate impact: molecular autopsy, microbial forensics and differentiation of monozygotic twins. The intent of this review is to expose the forensic science community to the potential enhancements that have or are soon to arrive and demonstrate the continued expansion the field of forensic genetics and its service in the investigation of legal matters.

  8. The 2nd Symposium on the Frontiers of Massively Parallel Computations

    Science.gov (United States)

    Mills, Ronnie (Editor)

    1988-01-01

    Programming languages, computer graphics, neural networks, massively parallel computers, SIMD architecture, algorithms, digital terrain models, sort computation, simulation of charged particle transport on the massively parallel processor and image processing are among the topics discussed.

  9. Engineering-Based Thermal CFD Simulations on Massive Parallel Systems

    KAUST Repository

    Frisch, Jérôme

    2015-05-22

    The development of parallel Computational Fluid Dynamics (CFD) codes is a challenging task that entails efficient parallelization concepts and strategies in order to achieve good scalability values when running those codes on modern supercomputers with several thousands to millions of cores. In this paper, we present a hierarchical data structure for massive parallel computations that supports the coupling of a Navier–Stokes-based fluid flow code with the Boussinesq approximation in order to address complex thermal scenarios for energy-related assessments. The newly designed data structure is specifically designed with the idea of interactive data exploration and visualization during runtime of the simulation code; a major shortcoming of traditional high-performance computing (HPC) simulation codes. We further show and discuss speed-up values obtained on one of Germany’s top-ranked supercomputers with up to 140,000 processes and present simulation results for different engineering-based thermal problems.

  10. Neural Parallel Engine: A toolbox for massively parallel neural signal processing.

    Science.gov (United States)

    Tam, Wing-Kin; Yang, Zhi

    2018-05-01

    Large-scale neural recordings provide detailed information on neuronal activities and can help elicit the underlying neural mechanisms of the brain. However, the computational burden is also formidable when we try to process the huge data stream generated by such recordings. In this study, we report the development of Neural Parallel Engine (NPE), a toolbox for massively parallel neural signal processing on graphical processing units (GPUs). It offers a selection of the most commonly used routines in neural signal processing such as spike detection and spike sorting, including advanced algorithms such as exponential-component-power-component (EC-PC) spike detection and binary pursuit spike sorting. We also propose a new method for detecting peaks in parallel through a parallel compact operation. Our toolbox is able to offer a 5× to 110× speedup compared with its CPU counterparts depending on the algorithms. A user-friendly MATLAB interface is provided to allow easy integration of the toolbox into existing workflows. Previous efforts on GPU neural signal processing only focus on a few rudimentary algorithms, are not well-optimized and often do not provide a user-friendly programming interface to fit into existing workflows. There is a strong need for a comprehensive toolbox for massively parallel neural signal processing. A new toolbox for massively parallel neural signal processing has been created. It can offer significant speedup in processing signals from large-scale recordings up to thousands of channels. Copyright © 2018 Elsevier B.V. All rights reserved.

  11. Scientific programming on massively parallel processor CP-PACS

    International Nuclear Information System (INIS)

    Boku, Taisuke

    1998-01-01

    The massively parallel processor CP-PACS takes various problems of calculation physics as the object, and it has been designed so that its architecture has been devised to do various numerical processings. In this report, the outline of the CP-PACS and the example of programming in the Kernel CG benchmark in NAS Parallel Benchmarks, version 1, are shown, and the pseudo vector processing mechanism and the parallel processing tuning of scientific and technical computation utilizing the three-dimensional hyper crossbar net, which are two great features of the architecture of the CP-PACS are described. As for the CP-PACS, the PUs based on RISC processor and added with pseudo vector processor are used. Pseudo vector processing is realized as the loop processing by scalar command. The features of the connection net of PUs are explained. The algorithm of the NPB version 1 Kernel CG is shown. The part that takes the time for processing most in the main loop is the product of matrix and vector (matvec), and the parallel processing of the matvec is explained. The time for the computation by the CPU is determined. As the evaluation of the performance, the evaluation of the time for execution, the short vector processing of pseudo vector processor based on slide window, and the comparison with other parallel computers are reported. (K.I.)

  12. Routing performance analysis and optimization within a massively parallel computer

    Science.gov (United States)

    Archer, Charles Jens; Peters, Amanda; Pinnow, Kurt Walter; Swartz, Brent Allen

    2013-04-16

    An apparatus, program product and method optimize the operation of a massively parallel computer system by, in part, receiving actual performance data concerning an application executed by the plurality of interconnected nodes, and analyzing the actual performance data to identify an actual performance pattern. A desired performance pattern may be determined for the application, and an algorithm may be selected from among a plurality of algorithms stored within a memory, the algorithm being configured to achieve the desired performance pattern based on the actual performance data.

  13. Implementation of PHENIX trigger algorithms on massively parallel computers

    International Nuclear Information System (INIS)

    Petridis, A.N.; Wohn, F.K.

    1995-01-01

    The event selection requirements of contemporary high energy and nuclear physics experiments are met by the introduction of on-line trigger algorithms which identify potentially interesting events and reduce the data acquisition rate to levels that are manageable by the electronics. Such algorithms being parallel in nature can be simulated off-line using massively parallel computers. The PHENIX experiment intends to investigate the possible existence of a new phase of matter called the quark gluon plasma which has been theorized to have existed in very early stages of the evolution of the universe by studying collisions of heavy nuclei at ultra-relativistic energies. Such interactions can also reveal important information regarding the structure of the nucleus and mandate a thorough investigation of the simpler proton-nucleus collisions at the same energies. The complexity of PHENIX events and the need to analyze and also simulate them at rates similar to the data collection ones imposes enormous computation demands. This work is a first effort to implement PHENIX trigger algorithms on parallel computers and to study the feasibility of using such machines to run the complex programs necessary for the simulation of the PHENIX detector response. Fine and coarse grain approaches have been studied and evaluated. Depending on the application the performance of a massively parallel computer can be much better or much worse than that of a serial workstation. A comparison between single instruction and multiple instruction computers is also made and possible applications of the single instruction machines to high energy and nuclear physics experiments are outlined. copyright 1995 American Institute of Physics

  14. Micro-mechanical Simulations of Soils using Massively Parallel Supercomputers

    Directory of Open Access Journals (Sweden)

    David W. Washington

    2004-06-01

    Full Text Available In this research a computer program, Trubal version 1.51, based on the Discrete Element Method was converted to run on a Connection Machine (CM-5,a massively parallel supercomputer with 512 nodes, to expedite the computational times of simulating Geotechnical boundary value problems. The dynamic memory algorithm in Trubal program did not perform efficiently in CM-2 machine with the Single Instruction Multiple Data (SIMD architecture. This was due to the communication overhead involving global array reductions, global array broadcast and random data movement. Therefore, a dynamic memory algorithm in Trubal program was converted to a static memory arrangement and Trubal program was successfully converted to run on CM-5 machines. The converted program was called "TRUBAL for Parallel Machines (TPM." Simulating two physical triaxial experiments and comparing simulation results with Trubal simulations validated the TPM program. With a 512 nodes CM-5 machine TPM produced a nine-fold speedup demonstrating the inherent parallelism within algorithms based on the Discrete Element Method.

  15. Representing and computing regular languages on massively parallel networks

    Energy Technology Data Exchange (ETDEWEB)

    Miller, M.I.; O' Sullivan, J.A. (Electronic Systems and Research Lab., of Electrical Engineering, Washington Univ., St. Louis, MO (US)); Boysam, B. (Dept. of Electrical, Computer and Systems Engineering, Rensselaer Polytechnic Inst., Troy, NY (US)); Smith, K.R. (Dept. of Electrical Engineering, Southern Illinois Univ., Edwardsville, IL (US))

    1991-01-01

    This paper proposes a general method for incorporating rule-based constraints corresponding to regular languages into stochastic inference problems, thereby allowing for a unified representation of stochastic and syntactic pattern constraints. The authors' approach first established the formal connection of rules to Chomsky grammars, and generalizes the original work of Shannon on the encoding of rule-based channel sequences to Markov chains of maximum entropy. This maximum entropy probabilistic view leads to Gibb's representations with potentials which have their number of minima growing at precisely the exponential rate that the language of deterministically constrained sequences grow. These representations are coupled to stochastic diffusion algorithms, which sample the language-constrained sequences by visiting the energy minima according to the underlying Gibbs' probability law. The coupling to stochastic search methods yields the all-important practical result that fully parallel stochastic cellular automata may be derived to generate samples from the rule-based constraint sets. The production rules and neighborhood state structure of the language of sequences directly determines the necessary connection structures of the required parallel computing surface. Representations of this type have been mapped to the DAP-510 massively-parallel processor consisting of 1024 mesh-connected bit-serial processing elements for performing automated segmentation of electron-micrograph images.

  16. Enhanced memory architecture for massively parallel vision chip

    Science.gov (United States)

    Chen, Zhe; Yang, Jie; Liu, Liyuan; Wu, Nanjian

    2015-04-01

    Local memory architecture plays an important role in high performance massively parallel vision chip. In this paper, we propose an enhanced memory architecture with compact circuit area designed in a full-custom flow. The memory consists of separate master-stage static latches and shared slave-stage dynamic latches. We use split transmission transistors on the input data path to enhance tolerance for charge sharing and to achieve random read/write capabilities. The memory is designed in a 0.18 μm CMOS process. The area overhead of the memory achieves 16.6 μm2/bit. Simulation results show that the maximum operating frequency reaches 410 MHz and the corresponding peak dynamic power consumption for a 64-bit memory unit is 190 μW under 1.8 V supply voltage.

  17. A Computational Fluid Dynamics Algorithm on a Massively Parallel Computer

    Science.gov (United States)

    Jespersen, Dennis C.; Levit, Creon

    1989-01-01

    The discipline of computational fluid dynamics is demanding ever-increasing computational power to deal with complex fluid flow problems. We investigate the performance of a finite-difference computational fluid dynamics algorithm on a massively parallel computer, the Connection Machine. Of special interest is an implicit time-stepping algorithm; to obtain maximum performance from the Connection Machine, it is necessary to use a nonstandard algorithm to solve the linear systems that arise in the implicit algorithm. We find that the Connection Machine ran achieve very high computation rates on both explicit and implicit algorithms. The performance of the Connection Machine puts it in the same class as today's most powerful conventional supercomputers.

  18. PUMA: An Operating System for Massively Parallel Systems

    Directory of Open Access Journals (Sweden)

    Stephen R. Wheat

    1994-01-01

    Full Text Available This article presents an overview of PUMA (Performance-oriented, User-managed Messaging Architecture, a message-passing kernel for massively parallel systems. Message passing in PUMA is based on portals – an opening in the address space of an application process. Once an application process has established a portal, other processes can write values into the portal using a simple send operation. Because messages are written directly into the address space of the receiving process, there is no need to buffer messages in the PUMA kernel and later copy them into the applications address space. PUMA consists of two components: the quintessential kernel (Q-Kernel and the process control thread (PCT. Although the PCT provides management decisions, the Q-Kernel controls access and implements the policies specified by the PCT.

  19. Massively parallel performance of neutron transport response matrix algorithms

    International Nuclear Information System (INIS)

    Hanebutte, U.R.; Lewis, E.E.

    1993-01-01

    Massively parallel red/black response matrix algorithms for the solution of within-group neutron transport problems are implemented on the Connection Machines-2, 200 and 5. The response matrices are dericed from the diamond-differences and linear-linear nodal discrete ordinate and variational nodal P 3 approximations. The unaccelerated performance of the iterative procedure is examined relative to the maximum rated performances of the machines. The effects of processor partitions size, of virtual processor ratio and of problems size are examined in detail. For the red/black algorithm, the ratio of inter-node communication to computing times is found to be quite small, normally of the order of ten percent or less. Performance increases with problems size and with virtual processor ratio, within the memeory per physical processor limitation. Algorithm adaptation to courser grain machines is straight-forward, with total computing time being virtually inversely proportional to the number of physical processors. (orig.)

  20. Beam dynamics calculations and particle tracking using massively parallel processors

    International Nuclear Information System (INIS)

    Ryne, R.D.; Habib, S.

    1995-01-01

    During the past decade massively parallel processors (MPPs) have slowly gained acceptance within the scientific community. At present these machines typically contain a few hundred to one thousand off-the-shelf microprocessors and a total memory of up to 32 GBytes. The potential performance of these machines is illustrated by the fact that a month long job on a high end workstation might require only a few hours on an MPP. The acceptance of MPPs has been slow for a variety of reasons. For example, some algorithms are not easily parallelizable. Also, in the past these machines were difficult to program. But in recent years the development of Fortran-like languages such as CM Fortran and High Performance Fortran have made MPPs much easier to use. In the following we will describe how MPPs can be used for beam dynamics calculations and long term particle tracking

  1. Tolerating correlated failures in Massively Parallel Stream Processing Engines

    DEFF Research Database (Denmark)

    Su, L.; Zhou, Y.

    2016-01-01

    Fault-tolerance techniques for stream processing engines can be categorized into passive and active approaches. A typical passive approach periodically checkpoints a processing task's runtime states and can recover a failed task by restoring its runtime state using its latest checkpoint. On the o......Fault-tolerance techniques for stream processing engines can be categorized into passive and active approaches. A typical passive approach periodically checkpoints a processing task's runtime states and can recover a failed task by restoring its runtime state using its latest checkpoint....... On the other hand, an active approach usually employs backup nodes to run replicated tasks. Upon failure, the active replica can take over the processing of the failed task with minimal latency. However, both approaches have their own inadequacies in Massively Parallel Stream Processing Engines (MPSPE...

  2. Massively parallel fabrication of repetitive nanostructures: nanolithography for nanoarrays

    International Nuclear Information System (INIS)

    Luttge, Regina

    2009-01-01

    This topical review provides an overview of nanolithographic techniques for nanoarrays. Using patterning techniques such as lithography, normally we aim for a higher order architecture similarly to functional systems in nature. Inspired by the wealth of complexity in nature, these architectures are translated into technical devices, for example, found in integrated circuitry or other systems in which structural elements work as discrete building blocks in microdevices. Ordered artificial nanostructures (arrays of pillars, holes and wires) have shown particular properties and bring about the opportunity to modify and tune the device operation. Moreover, these nanostructures deliver new applications, for example, the nanoscale control of spin direction within a nanomagnet. Subsequently, we can look for applications where this unique property of the smallest manufactured element is repetitively used such as, for example with respect to spin, in nanopatterned magnetic media for data storage. These nanostructures are generally called nanoarrays. Most of these applications require massively parallel produced nanopatterns which can be directly realized by laser interference (areas up to 4 cm 2 are easily achieved with a Lloyd's mirror set-up). In this topical review we will further highlight the application of laser interference as a tool for nanofabrication, its limitations and ultimate advantages towards a variety of devices including nanostructuring for photonic crystal devices, high resolution patterned media and surface modifications of medical implants. The unique properties of nanostructured surfaces have also found applications in biomedical nanoarrays used either for diagnostic or functional assays including catalytic reactions on chip. Bio-inspired templated nanoarrays will be presented in perspective to other massively parallel nanolithography techniques currently discussed in the scientific literature. (topical review)

  3. Massively Parallel Single-Molecule Manipulation Using Centrifugal Force

    Science.gov (United States)

    Wong, Wesley; Halvorsen, Ken

    2011-03-01

    Precise manipulation of single molecules has led to remarkable insights in physics, chemistry, biology, and medicine. However, two issues that have impeded the widespread adoption of these techniques are equipment cost and the laborious nature of making measurements one molecule at a time. To meet these challenges, we have developed an approach that enables massively parallel single- molecule force measurements using centrifugal force. This approach is realized in the centrifuge force microscope, an instrument in which objects in an orbiting sample are subjected to a calibration-free, macroscopically uniform force- field while their micro-to-nanoscopic motions are observed. We demonstrate high- throughput single-molecule force spectroscopy with this technique by performing thousands of rupture experiments in parallel, characterizing force-dependent unbinding kinetics of an antibody-antigen pair in minutes rather than days. Currently, we are taking steps to integrate high-resolution detection, fluorescence, temperature control and a greater dynamic range in force. With significant benefits in efficiency, cost, simplicity, and versatility, single-molecule centrifugation has the potential to expand single-molecule experimentation to a wider range of researchers and experimental systems.

  4. Massive parallel 3D PIC simulation of negative ion extraction

    Science.gov (United States)

    Revel, Adrien; Mochalskyy, Serhiy; Montellano, Ivar Mauricio; Wünderlich, Dirk; Fantz, Ursel; Minea, Tiberiu

    2017-09-01

    The 3D PIC-MCC code ONIX is dedicated to modeling Negative hydrogen/deuterium Ion (NI) extraction and co-extraction of electrons from radio-frequency driven, low pressure plasma sources. It provides valuable insight on the complex phenomena involved in the extraction process. In previous calculations, a mesh size larger than the Debye length was used, implying numerical electron heating. Important steps have been achieved in terms of computation performance and parallelization efficiency allowing successful massive parallel calculations (4096 cores), imperative to resolve the Debye length. In addition, the numerical algorithms have been improved in terms of grid treatment, i.e., the electric field near the complex geometry boundaries (plasma grid) is calculated more accurately. The revised model preserves the full 3D treatment, but can take advantage of a highly refined mesh. ONIX was used to investigate the role of the mesh size, the re-injection scheme for lost particles (extracted or wall absorbed), and the electron thermalization process on the calculated extracted current and plasma characteristics. It is demonstrated that all numerical schemes give the same NI current distribution for extracted ions. Concerning the electrons, the pair-injection technique is found well-adapted to simulate the sheath in front of the plasma grid.

  5. CHOLLA: A NEW MASSIVELY PARALLEL HYDRODYNAMICS CODE FOR ASTROPHYSICAL SIMULATION

    Energy Technology Data Exchange (ETDEWEB)

    Schneider, Evan E.; Robertson, Brant E. [Steward Observatory, University of Arizona, 933 North Cherry Avenue, Tucson, AZ 85721 (United States)

    2015-04-15

    We present Computational Hydrodynamics On ParaLLel Architectures (Cholla ), a new three-dimensional hydrodynamics code that harnesses the power of graphics processing units (GPUs) to accelerate astrophysical simulations. Cholla models the Euler equations on a static mesh using state-of-the-art techniques, including the unsplit Corner Transport Upwind algorithm, a variety of exact and approximate Riemann solvers, and multiple spatial reconstruction techniques including the piecewise parabolic method (PPM). Using GPUs, Cholla evolves the fluid properties of thousands of cells simultaneously and can update over 10 million cells per GPU-second while using an exact Riemann solver and PPM reconstruction. Owing to the massively parallel architecture of GPUs and the design of the Cholla code, astrophysical simulations with physically interesting grid resolutions (≳256{sup 3}) can easily be computed on a single device. We use the Message Passing Interface library to extend calculations onto multiple devices and demonstrate nearly ideal scaling beyond 64 GPUs. A suite of test problems highlights the physical accuracy of our modeling and provides a useful comparison to other codes. We then use Cholla to simulate the interaction of a shock wave with a gas cloud in the interstellar medium, showing that the evolution of the cloud is highly dependent on its density structure. We reconcile the computed mixing time of a turbulent cloud with a realistic density distribution destroyed by a strong shock with the existing analytic theory for spherical cloud destruction by describing the system in terms of its median gas density.

  6. CHOLLA: A NEW MASSIVELY PARALLEL HYDRODYNAMICS CODE FOR ASTROPHYSICAL SIMULATION

    International Nuclear Information System (INIS)

    Schneider, Evan E.; Robertson, Brant E.

    2015-01-01

    We present Computational Hydrodynamics On ParaLLel Architectures (Cholla ), a new three-dimensional hydrodynamics code that harnesses the power of graphics processing units (GPUs) to accelerate astrophysical simulations. Cholla models the Euler equations on a static mesh using state-of-the-art techniques, including the unsplit Corner Transport Upwind algorithm, a variety of exact and approximate Riemann solvers, and multiple spatial reconstruction techniques including the piecewise parabolic method (PPM). Using GPUs, Cholla evolves the fluid properties of thousands of cells simultaneously and can update over 10 million cells per GPU-second while using an exact Riemann solver and PPM reconstruction. Owing to the massively parallel architecture of GPUs and the design of the Cholla code, astrophysical simulations with physically interesting grid resolutions (≳256 3 ) can easily be computed on a single device. We use the Message Passing Interface library to extend calculations onto multiple devices and demonstrate nearly ideal scaling beyond 64 GPUs. A suite of test problems highlights the physical accuracy of our modeling and provides a useful comparison to other codes. We then use Cholla to simulate the interaction of a shock wave with a gas cloud in the interstellar medium, showing that the evolution of the cloud is highly dependent on its density structure. We reconcile the computed mixing time of a turbulent cloud with a realistic density distribution destroyed by a strong shock with the existing analytic theory for spherical cloud destruction by describing the system in terms of its median gas density

  7. MCBooster: a tool for MC generation for massively parallel platforms

    CERN Multimedia

    Alves Junior, Antonio Augusto

    2016-01-01

    MCBooster is a header-only, C++11-compliant library for the generation of large samples of phase-space Monte Carlo events on massively parallel platforms. It was released on GitHub in the spring of 2016. The library core algorithms implement the Raubold-Lynch method; they are able to generate the full kinematics of decays with up to nine particles in the final state. The library supports the generation of sequential decays as well as the parallel evaluation of arbitrary functions over the generated events. The output of MCBooster completely accords with popular and well-tested software packages such as GENBOD (W515 from CERNLIB) and TGenPhaseSpace from the ROOT framework. MCBooster is developed on top of the Thrust library and runs on Linux systems. It deploys transparently on NVidia CUDA-enabled GPUs as well as multicore CPUs. This contribution summarizes the main features of MCBooster. A basic description of the user interface and some examples of applications are provided, along with measurements of perfor...

  8. cellGPU: Massively parallel simulations of dynamic vertex models

    Science.gov (United States)

    Sussman, Daniel M.

    2017-10-01

    Vertex models represent confluent tissue by polygonal or polyhedral tilings of space, with the individual cells interacting via force laws that depend on both the geometry of the cells and the topology of the tessellation. This dependence on the connectivity of the cellular network introduces several complications to performing molecular-dynamics-like simulations of vertex models, and in particular makes parallelizing the simulations difficult. cellGPU addresses this difficulty and lays the foundation for massively parallelized, GPU-based simulations of these models. This article discusses its implementation for a pair of two-dimensional models, and compares the typical performance that can be expected between running cellGPU entirely on the CPU versus its performance when running on a range of commercial and server-grade graphics cards. By implementing the calculation of topological changes and forces on cells in a highly parallelizable fashion, cellGPU enables researchers to simulate time- and length-scales previously inaccessible via existing single-threaded CPU implementations. Program Files doi:http://dx.doi.org/10.17632/6j2cj29t3r.1 Licensing provisions: MIT Programming language: CUDA/C++ Nature of problem: Simulations of off-lattice "vertex models" of cells, in which the interaction forces depend on both the geometry and the topology of the cellular aggregate. Solution method: Highly parallelized GPU-accelerated dynamical simulations in which the force calculations and the topological features can be handled on either the CPU or GPU. Additional comments: The code is hosted at https://gitlab.com/dmsussman/cellGPU, with documentation additionally maintained at http://dmsussman.gitlab.io/cellGPUdocumentation

  9. Massively Parallel Interrogation of Aptamer Sequence, Structure and Function

    Energy Technology Data Exchange (ETDEWEB)

    Fischer, N O; Tok, J B; Tarasow, T M

    2008-02-08

    Optimization of high affinity reagents is a significant bottleneck in medicine and the life sciences. The ability to synthetically create thousands of permutations of a lead high-affinity reagent and survey the properties of individual permutations in parallel could potentially relieve this bottleneck. Aptamers are single stranded oligonucleotides affinity reagents isolated by in vitro selection processes and as a class have been shown to bind a wide variety of target molecules. Methodology/Principal Findings. High density DNA microarray technology was used to synthesize, in situ, arrays of approximately 3,900 aptamer sequence permutations in triplicate. These sequences were interrogated on-chip for their ability to bind the fluorescently-labeled cognate target, immunoglobulin E, resulting in the parallel execution of thousands of experiments. Fluorescence intensity at each array feature was well resolved and shown to be a function of the sequence present. The data demonstrated high intra- and interchip correlation between the same features as well as among the sequence triplicates within a single array. Consistent with aptamer mediated IgE binding, fluorescence intensity correlated strongly with specific aptamer sequences and the concentration of IgE applied to the array. The massively parallel sequence-function analyses provided by this approach confirmed the importance of a consensus sequence found in all 21 of the original IgE aptamer sequences and support a common stem:loop structure as being the secondary structure underlying IgE binding. The microarray application, data and results presented illustrate an efficient, high information content approach to optimizing aptamer function. It also provides a foundation from which to better understand and manipulate this important class of high affinity biomolecules.

  10. Massively parallel interrogation of aptamer sequence, structure and function.

    Directory of Open Access Journals (Sweden)

    Nicholas O Fischer

    Full Text Available BACKGROUND: Optimization of high affinity reagents is a significant bottleneck in medicine and the life sciences. The ability to synthetically create thousands of permutations of a lead high-affinity reagent and survey the properties of individual permutations in parallel could potentially relieve this bottleneck. Aptamers are single stranded oligonucleotides affinity reagents isolated by in vitro selection processes and as a class have been shown to bind a wide variety of target molecules. METHODOLOGY/PRINCIPAL FINDINGS: High density DNA microarray technology was used to synthesize, in situ, arrays of approximately 3,900 aptamer sequence permutations in triplicate. These sequences were interrogated on-chip for their ability to bind the fluorescently-labeled cognate target, immunoglobulin E, resulting in the parallel execution of thousands of experiments. Fluorescence intensity at each array feature was well resolved and shown to be a function of the sequence present. The data demonstrated high intra- and inter-chip correlation between the same features as well as among the sequence triplicates within a single array. Consistent with aptamer mediated IgE binding, fluorescence intensity correlated strongly with specific aptamer sequences and the concentration of IgE applied to the array. CONCLUSION AND SIGNIFICANCE: The massively parallel sequence-function analyses provided by this approach confirmed the importance of a consensus sequence found in all 21 of the original IgE aptamer sequences and support a common stem:loop structure as being the secondary structure underlying IgE binding. The microarray application, data and results presented illustrate an efficient, high information content approach to optimizing aptamer function. It also provides a foundation from which to better understand and manipulate this important class of high affinity biomolecules.

  11. Cloud identification using genetic algorithms and massively parallel computation

    Science.gov (United States)

    Buckles, Bill P.; Petry, Frederick E.

    1996-01-01

    As a Guest Computational Investigator under the NASA administered component of the High Performance Computing and Communication Program, we implemented a massively parallel genetic algorithm on the MasPar SIMD computer. Experiments were conducted using Earth Science data in the domains of meteorology and oceanography. Results obtained in these domains are competitive with, and in most cases better than, similar problems solved using other methods. In the meteorological domain, we chose to identify clouds using AVHRR spectral data. Four cloud speciations were used although most researchers settle for three. Results were remarkedly consistent across all tests (91% accuracy). Refinements of this method may lead to more timely and complete information for Global Circulation Models (GCMS) that are prevalent in weather forecasting and global environment studies. In the oceanographic domain, we chose to identify ocean currents from a spectrometer having similar characteristics to AVHRR. Here the results were mixed (60% to 80% accuracy). Given that one is willing to run the experiment several times (say 10), then it is acceptable to claim the higher accuracy rating. This problem has never been successfully automated. Therefore, these results are encouraging even though less impressive than the cloud experiment. Successful conclusion of an automated ocean current detection system would impact coastal fishing, naval tactics, and the study of micro-climates. Finally we contributed to the basic knowledge of GA (genetic algorithm) behavior in parallel environments. We developed better knowledge of the use of subpopulations in the context of shared breeding pools and the migration of individuals. Rigorous experiments were conducted based on quantifiable performance criteria. While much of the work confirmed current wisdom, for the first time we were able to submit conclusive evidence. The software developed under this grant was placed in the public domain. An extensive user

  12. Massively parallel de novo protein design for targeted therapeutics

    KAUST Repository

    Chevalier, Aaron; Silva, Daniel-Adriano; Rocklin, Gabriel J.; Hicks, Derrick R.; Vergara, Renan; Murapa, Patience; Bernard, Steffen M.; Zhang, Lu; Lam, Kwok-Ho; Yao, Guorui; Bahl, Christopher D.; Miyashita, Shin-Ichiro; Goreshnik, Inna; Fuller, James T.; Koday, Merika T.; Jenkins, Cody M.; Colvin, Tom; Carter, Lauren; Bohn, Alan; Bryan, Cassie M.; Ferná ndez-Velasco, D. Alejandro; Stewart, Lance; Dong, Min; Huang, Xuhui; Jin, Rongsheng; Wilson, Ian A.; Fuller, Deborah H.; Baker, David

    2017-01-01

    De novo protein design holds promise for creating small stable proteins with shapes customized to bind therapeutic targets. We describe a massively parallel approach for designing, manufacturing and screening mini-protein binders, integrating large-scale computational design, oligonucleotide synthesis, yeast display screening and next-generation sequencing. We designed and tested 22,660 mini-proteins of 37-43 residues that target influenza haemagglutinin and botulinum neurotoxin B, along with 6,286 control sequences to probe contributions to folding and binding, and identified 2,618 high-affinity binders. Comparison of the binding and non-binding design sets, which are two orders of magnitude larger than any previously investigated, enabled the evaluation and improvement of the computational model. Biophysical characterization of a subset of the binder designs showed that they are extremely stable and, unlike antibodies, do not lose activity after exposure to high temperatures. The designs elicit little or no immune response and provide potent prophylactic and therapeutic protection against influenza, even after extensive repeated dosing.

  13. Massively parallel de novo protein design for targeted therapeutics

    KAUST Repository

    Chevalier, Aaron

    2017-09-26

    De novo protein design holds promise for creating small stable proteins with shapes customized to bind therapeutic targets. We describe a massively parallel approach for designing, manufacturing and screening mini-protein binders, integrating large-scale computational design, oligonucleotide synthesis, yeast display screening and next-generation sequencing. We designed and tested 22,660 mini-proteins of 37-43 residues that target influenza haemagglutinin and botulinum neurotoxin B, along with 6,286 control sequences to probe contributions to folding and binding, and identified 2,618 high-affinity binders. Comparison of the binding and non-binding design sets, which are two orders of magnitude larger than any previously investigated, enabled the evaluation and improvement of the computational model. Biophysical characterization of a subset of the binder designs showed that they are extremely stable and, unlike antibodies, do not lose activity after exposure to high temperatures. The designs elicit little or no immune response and provide potent prophylactic and therapeutic protection against influenza, even after extensive repeated dosing.

  14. An FPGA-Based Massively Parallel Neuromorphic Cortex Simulator.

    Science.gov (United States)

    Wang, Runchun M; Thakur, Chetan S; van Schaik, André

    2018-01-01

    This paper presents a massively parallel and scalable neuromorphic cortex simulator designed for simulating large and structurally connected spiking neural networks, such as complex models of various areas of the cortex. The main novelty of this work is the abstraction of a neuromorphic architecture into clusters represented by minicolumns and hypercolumns, analogously to the fundamental structural units observed in neurobiology. Without this approach, simulating large-scale fully connected networks needs prohibitively large memory to store look-up tables for point-to-point connections. Instead, we use a novel architecture, based on the structural connectivity in the neocortex, such that all the required parameters and connections can be stored in on-chip memory. The cortex simulator can be easily reconfigured for simulating different neural networks without any change in hardware structure by programming the memory. A hierarchical communication scheme allows one neuron to have a fan-out of up to 200 k neurons. As a proof-of-concept, an implementation on one Altera Stratix V FPGA was able to simulate 20 million to 2.6 billion leaky-integrate-and-fire (LIF) neurons in real time. We verified the system by emulating a simplified auditory cortex (with 100 million neurons). This cortex simulator achieved a low power dissipation of 1.62 μW per neuron. With the advent of commercially available FPGA boards, our system offers an accessible and scalable tool for the design, real-time simulation, and analysis of large-scale spiking neural networks.

  15. Massively parallel de novo protein design for targeted therapeutics

    Science.gov (United States)

    Chevalier, Aaron; Silva, Daniel-Adriano; Rocklin, Gabriel J.; Hicks, Derrick R.; Vergara, Renan; Murapa, Patience; Bernard, Steffen M.; Zhang, Lu; Lam, Kwok-Ho; Yao, Guorui; Bahl, Christopher D.; Miyashita, Shin-Ichiro; Goreshnik, Inna; Fuller, James T.; Koday, Merika T.; Jenkins, Cody M.; Colvin, Tom; Carter, Lauren; Bohn, Alan; Bryan, Cassie M.; Fernández-Velasco, D. Alejandro; Stewart, Lance; Dong, Min; Huang, Xuhui; Jin, Rongsheng; Wilson, Ian A.; Fuller, Deborah H.; Baker, David

    2018-01-01

    De novo protein design holds promise for creating small stable proteins with shapes customized to bind therapeutic targets. We describe a massively parallel approach for designing, manufacturing and screening mini-protein binders, integrating large-scale computational design, oligonucleotide synthesis, yeast display screening and next-generation sequencing. We designed and tested 22,660 mini-proteins of 37–43 residues that target influenza haemagglutinin and botulinum neurotoxin B, along with 6,286 control sequences to probe contributions to folding and binding, and identified 2,618 high-affinity binders. Comparison of the binding and non-binding design sets, which are two orders of magnitude larger than any previously investigated, enabled the evaluation and improvement of the computational model. Biophysical characterization of a subset of the binder designs showed that they are extremely stable and, unlike antibodies, do not lose activity after exposure to high temperatures. The designs elicit little or no immune response and provide potent prophylactic and therapeutic protection against influenza, even after extensive repeated dosing. PMID:28953867

  16. An FPGA-Based Massively Parallel Neuromorphic Cortex Simulator

    Directory of Open Access Journals (Sweden)

    Runchun M. Wang

    2018-04-01

    Full Text Available This paper presents a massively parallel and scalable neuromorphic cortex simulator designed for simulating large and structurally connected spiking neural networks, such as complex models of various areas of the cortex. The main novelty of this work is the abstraction of a neuromorphic architecture into clusters represented by minicolumns and hypercolumns, analogously to the fundamental structural units observed in neurobiology. Without this approach, simulating large-scale fully connected networks needs prohibitively large memory to store look-up tables for point-to-point connections. Instead, we use a novel architecture, based on the structural connectivity in the neocortex, such that all the required parameters and connections can be stored in on-chip memory. The cortex simulator can be easily reconfigured for simulating different neural networks without any change in hardware structure by programming the memory. A hierarchical communication scheme allows one neuron to have a fan-out of up to 200 k neurons. As a proof-of-concept, an implementation on one Altera Stratix V FPGA was able to simulate 20 million to 2.6 billion leaky-integrate-and-fire (LIF neurons in real time. We verified the system by emulating a simplified auditory cortex (with 100 million neurons. This cortex simulator achieved a low power dissipation of 1.62 μW per neuron. With the advent of commercially available FPGA boards, our system offers an accessible and scalable tool for the design, real-time simulation, and analysis of large-scale spiking neural networks.

  17. A massively parallel strategy for STR marker development, capture, and genotyping.

    Science.gov (United States)

    Kistler, Logan; Johnson, Stephen M; Irwin, Mitchell T; Louis, Edward E; Ratan, Aakrosh; Perry, George H

    2017-09-06

    Short tandem repeat (STR) variants are highly polymorphic markers that facilitate powerful population genetic analyses. STRs are especially valuable in conservation and ecological genetic research, yielding detailed information on population structure and short-term demographic fluctuations. Massively parallel sequencing has not previously been leveraged for scalable, efficient STR recovery. Here, we present a pipeline for developing STR markers directly from high-throughput shotgun sequencing data without a reference genome, and an approach for highly parallel target STR recovery. We employed our approach to capture a panel of 5000 STRs from a test group of diademed sifakas (Propithecus diadema, n = 3), endangered Malagasy rainforest lemurs, and we report extremely efficient recovery of targeted loci-97.3-99.6% of STRs characterized with ≥10x non-redundant sequence coverage. We then tested our STR capture strategy on P. diadema fecal DNA, and report robust initial results and suggestions for future implementations. In addition to STR targets, this approach also generates large, genome-wide single nucleotide polymorphism (SNP) panels from flanking regions. Our method provides a cost-effective and scalable solution for rapid recovery of large STR and SNP datasets in any species without needing a reference genome, and can be used even with suboptimal DNA more easily acquired in conservation and ecological studies. Published by Oxford University Press on behalf of Nucleic Acids Research 2017.

  18. Engineering-Based Thermal CFD Simulations on Massive Parallel Systems

    KAUST Repository

    Frisch, Jé rô me; Mundani, Ralf-Peter; Rank, Ernst; van Treeck, Christoph

    2015-01-01

    The development of parallel Computational Fluid Dynamics (CFD) codes is a challenging task that entails efficient parallelization concepts and strategies in order to achieve good scalability values when running those codes on modern supercomputers

  19. Massively Parallel Computing at Sandia and Its Application to National Defense

    National Research Council Canada - National Science Library

    Dosanjh, Sudip

    1991-01-01

    Two years ago, researchers at Sandia National Laboratories showed that a massively parallel computer with 1024 processors could solve scientific problems more than 1000 times faster than a single processor...

  20. SWAMP+: multiple subsequence alignment using associative massive parallelism

    Energy Technology Data Exchange (ETDEWEB)

    Steinfadt, Shannon Irene [Los Alamos National Laboratory; Baker, Johnnie W [KENT STATE UNIV.

    2010-10-18

    A new parallel algorithm SWAMP+ incorporates the Smith-Waterman sequence alignment on an associative parallel model known as ASC. It is a highly sensitive parallel approach that expands traditional pairwise sequence alignment. This is the first parallel algorithm to provide multiple non-overlapping, non-intersecting subsequence alignments with the accuracy of Smith-Waterman. The efficient algorithm provides multiple alignments similar to BLAST while creating a better workflow for the end users. The parallel portions of the code run in O(m+n) time using m processors. When m = n, the algorithmic analysis becomes O(n) with a coefficient of two, yielding a linear speedup. Implementation of the algorithm on the SIMD ClearSpeed CSX620 confirms this theoretical linear speedup with real timings.

  1. Massively parallel red-black algorithms for x-y-z response matrix equations

    International Nuclear Information System (INIS)

    Hanebutte, U.R.; Laurin-Kovitz, K.; Lewis, E.E.

    1992-01-01

    Recently, both discrete ordinates and spherical harmonic (S n and P n ) methods have been cast in the form of response matrices. In x-y geometry, massively parallel algorithms have been developed to solve the resulting response matrix equations on the Connection Machine family of parallel computers, the CM-2, CM-200, and CM-5. These algorithms utilize two-cycle iteration on a red-black checkerboard. In this work we examine the use of massively parallel red-black algorithms to solve response matric equations in three dimensions. This longer term objective is to utilize massively parallel algorithms to solve S n and/or P n response matrix problems. In this exploratory examination, however, we consider the simple 6 x 6 response matrices that are derivable from fine-mesh diffusion approximations in three dimensions

  2. Smoldyn on graphics processing units: massively parallel Brownian dynamics simulations.

    Science.gov (United States)

    Dematté, Lorenzo

    2012-01-01

    Space is a very important aspect in the simulation of biochemical systems; recently, the need for simulation algorithms able to cope with space is becoming more and more compelling. Complex and detailed models of biochemical systems need to deal with the movement of single molecules and particles, taking into consideration localized fluctuations, transportation phenomena, and diffusion. A common drawback of spatial models lies in their complexity: models can become very large, and their simulation could be time consuming, especially if we want to capture the systems behavior in a reliable way using stochastic methods in conjunction with a high spatial resolution. In order to deliver the promise done by systems biology to be able to understand a system as whole, we need to scale up the size of models we are able to simulate, moving from sequential to parallel simulation algorithms. In this paper, we analyze Smoldyn, a widely diffused algorithm for stochastic simulation of chemical reactions with spatial resolution and single molecule detail, and we propose an alternative, innovative implementation that exploits the parallelism of Graphics Processing Units (GPUs). The implementation executes the most computational demanding steps (computation of diffusion, unimolecular, and bimolecular reaction, as well as the most common cases of molecule-surface interaction) on the GPU, computing them in parallel on each molecule of the system. The implementation offers good speed-ups and real time, high quality graphics output

  3. A massively parallel discrete ordinates response matrix method for neutron transport

    International Nuclear Information System (INIS)

    Hanebutte, U.R.; Lewis, E.E.

    1992-01-01

    In this paper a discrete ordinates response matrix method is formulated with anisotropic scattering for the solution of neutron transport problems on massively parallel computers. The response matrix formulation eliminates iteration on the scattering source. The nodal matrices that result from the diamond-differenced equations are utilized in a factored form that minimizes memory requirements and significantly reduces the number of arithmetic operations required per node. The red-black solution algorithm utilizes massive parallelism by assigning each spatial node to one or more processors. The algorithm is accelerated by a synthetic method in which the low-order diffusion equations are also solved by massively parallel red-black iterations. The method is implemented on a 16K Connection Machine-2, and S 8 and S 16 solutions are obtained for fixed-source benchmark problems in x-y geometry

  4. Massively Parallel Sort-Merge Joins in Main Memory Multi-Core Database Systems

    OpenAIRE

    Albutiu, Martina-Cezara; Kemper, Alfons; Neumann, Thomas

    2012-01-01

    Two emerging hardware trends will dominate the database system technology in the near future: increasing main memory capacities of several TB per server and massively parallel multi-core processing. Many algorithmic and control techniques in current database technology were devised for disk-based systems where I/O dominated the performance. In this work we take a new look at the well-known sort-merge join which, so far, has not been in the focus of research in scalable massively parallel mult...

  5. Implementation of a Monte Carlo algorithm for neutron transport on a massively parallel SIMD machine

    International Nuclear Information System (INIS)

    Baker, R.S.

    1992-01-01

    We present some results from the recent adaptation of a vectorized Monte Carlo algorithm to a massively parallel architecture. The performance of the algorithm on a single processor Cray Y-MP and a Thinking Machine Corporations CM-2 and CM-200 is compared for several test problems. The results show that significant speedups are obtainable for vectorized Monte Carlo algorithms on massively parallel machines, even when the algorithms are applied to realistic problems which require extensive variance reduction. However, the architecture of the Connection Machine does place some limitations on the regime in which the Monte Carlo algorithm may be expected to perform well

  6. Implementation of a Monte Carlo algorithm for neutron transport on a massively parallel SIMD machine

    International Nuclear Information System (INIS)

    Baker, R.S.

    1993-01-01

    We present some results from the recent adaptation of a vectorized Monte Carlo algorithm to a massively parallel architecture. The performance of the algorithm on a single processor Cray Y-MP and a Thinking Machine Corporations CM-2 and CM-200 is compared for several test problems. The results show that significant speedups are obtainable for vectorized Monte Carlo algorithms on massively parallel machines, even when the algorithms are applied to realistic problems which require extensive variance reduction. However, the architecture of the Connection Machine does place some limitations on the regime in which the Monte Carlo algorithm may be expected to perform well. (orig.)

  7. MASSIVELY PARALLEL LATENT SEMANTIC ANALYSES USING A GRAPHICS PROCESSING UNIT

    Energy Technology Data Exchange (ETDEWEB)

    Cavanagh, J.; Cui, S.

    2009-01-01

    Latent Semantic Analysis (LSA) aims to reduce the dimensions of large term-document datasets using Singular Value Decomposition. However, with the ever-expanding size of datasets, current implementations are not fast enough to quickly and easily compute the results on a standard PC. A graphics processing unit (GPU) can solve some highly parallel problems much faster than a traditional sequential processor or central processing unit (CPU). Thus, a deployable system using a GPU to speed up large-scale LSA processes would be a much more effective choice (in terms of cost/performance ratio) than using a PC cluster. Due to the GPU’s application-specifi c architecture, harnessing the GPU’s computational prowess for LSA is a great challenge. We presented a parallel LSA implementation on the GPU, using NVIDIA® Compute Unifi ed Device Architecture and Compute Unifi ed Basic Linear Algebra Subprograms software. The performance of this implementation is compared to traditional LSA implementation on a CPU using an optimized Basic Linear Algebra Subprograms library. After implementation, we discovered that the GPU version of the algorithm was twice as fast for large matrices (1 000x1 000 and above) that had dimensions not divisible by 16. For large matrices that did have dimensions divisible by 16, the GPU algorithm ran fi ve to six times faster than the CPU version. The large variation is due to architectural benefi ts of the GPU for matrices divisible by 16. It should be noted that the overall speeds for the CPU version did not vary from relative normal when the matrix dimensions were divisible by 16. Further research is needed in order to produce a fully implementable version of LSA. With that in mind, the research we presented shows that the GPU is a viable option for increasing the speed of LSA, in terms of cost/performance ratio.

  8. Massively parallel algorithms for trace-driven cache simulations

    Science.gov (United States)

    Nicol, David M.; Greenberg, Albert G.; Lubachevsky, Boris D.

    1991-01-01

    Trace driven cache simulation is central to computer design. A trace is a very long sequence of reference lines from main memory. At the t(exp th) instant, reference x sub t is hashed into a set of cache locations, the contents of which are then compared with x sub t. If at the t sup th instant x sub t is not present in the cache, then it is said to be a miss, and is loaded into the cache set, possibly forcing the replacement of some other memory line, and making x sub t present for the (t+1) sup st instant. The problem of parallel simulation of a subtrace of N references directed to a C line cache set is considered, with the aim of determining which references are misses and related statistics. A simulation method is presented for the Least Recently Used (LRU) policy, which regradless of the set size C runs in time O(log N) using N processors on the exclusive read, exclusive write (EREW) parallel model. A simpler LRU simulation algorithm is given that runs in O(C log N) time using N/log N processors. Timings are presented of the second algorithm's implementation on the MasPar MP-1, a machine with 16384 processors. A broad class of reference based line replacement policies are considered, which includes LRU as well as the Least Frequently Used and Random replacement policies. A simulation method is presented for any such policy that on any trace of length N directed to a C line set runs in the O(C log N) time with high probability using N processors on the EREW model. The algorithms are simple, have very little space overhead, and are well suited for SIMD implementation.

  9. A massively parallel sequencing approach uncovers ancient origins and high genetic variability of endangered Przewalski's horses.

    Science.gov (United States)

    Goto, Hiroki; Ryder, Oliver A; Fisher, Allison R; Schultz, Bryant; Kosakovsky Pond, Sergei L; Nekrutenko, Anton; Makova, Kateryna D

    2011-01-01

    The endangered Przewalski's horse is the closest relative of the domestic horse and is the only true wild horse species surviving today. The question of whether Przewalski's horse is the direct progenitor of domestic horse has been hotly debated. Studies of DNA diversity within Przewalski's horses have been sparse but are urgently needed to ensure their successful reintroduction to the wild. In an attempt to resolve the controversy surrounding the phylogenetic position and genetic diversity of Przewalski's horses, we used massively parallel sequencing technology to decipher the complete mitochondrial and partial nuclear genomes for all four surviving maternal lineages of Przewalski's horses. Unlike single-nucleotide polymorphism (SNP) typing usually affected by ascertainment bias, the present method is expected to be largely unbiased. Three mitochondrial haplotypes were discovered-two similar ones, haplotypes I/II, and one substantially divergent from the other two, haplotype III. Haplotypes I/II versus III did not cluster together on a phylogenetic tree, rejecting the monophyly of Przewalski's horse maternal lineages, and were estimated to split 0.117-0.186 Ma, significantly preceding horse domestication. In the phylogeny based on autosomal sequences, Przewalski's horses formed a monophyletic clade, separate from the Thoroughbred domestic horse lineage. Our results suggest that Przewalski's horses have ancient origins and are not the direct progenitors of domestic horses. The analysis of the vast amount of sequence data presented here suggests that Przewalski's and domestic horse lineages diverged at least 0.117 Ma but since then have retained ancestral genetic polymorphism and/or experienced gene flow.

  10. First massively parallel algorithm to be implemented in Apollo-II code

    International Nuclear Information System (INIS)

    Stankovski, Z.

    1994-01-01

    The collision probability (CP) method in neutron transport, as applied to arbitrary 2D XY geometries, like the TDT module in APOLLO-II, is very time consuming. Consequently RZ or 3D extensions became prohibitive. Fortunately, this method is very suitable for parallelization. Massively parallel computer architectures, especially MIMD machines, bring a new breath to this method. In this paper we present a CM5 implementation of the CP method. Parallelization is applied to the energy groups, using the CMMD message passing library. In our case we use 32 processors for the standard 99-group APOLLIB-II library. The real advantage of this algorithm will appear in the calculation of the future fine multigroup library (about 8000 groups) of the SAPHYR project with a massively parallel computer (to the order of hundreds of processors). (author). 3 tabs., 4 figs., 4 refs

  11. First massively parallel algorithm to be implemented in APOLLO-II code

    International Nuclear Information System (INIS)

    Stankovski, Z.

    1994-01-01

    The collision probability method in neutron transport, as applied to arbitrary 2-dimensional geometries, like the two dimensional transport module in APOLLO-II is very time consuming. Consequently 3-dimensional extension became prohibitive. Fortunately, this method is very suitable for parallelization. Massively parallel computer architectures, especially MIMD machines, bring a new breath to this method. In this paper we present a CM5 implementation of the collision probability method. Parallelization is applied to the energy groups, using the CMMD massage passing library. In our case we used 32 processors for the standard 99-group APOLLIB-II library. The real advantage of this algorithm will appear in the calculation of the future multigroup library (about 8000 groups) of the SAPHYR project with a massively parallel computer (to the order of hundreds of processors). (author). 4 refs., 4 figs., 3 tabs

  12. I - Template Metaprogramming for Massively Parallel Scientific Computing - Expression Templates

    CERN Multimedia

    CERN. Geneva

    2016-01-01

    Large scale scientific computing raises questions on different levels ranging from the fomulation of the problems to the choice of the best algorithms and their implementation for a specific platform. There are similarities in these different topics that can be exploited by modern-style C++ template metaprogramming techniques to produce readable, maintainable and generic code. Traditional low-level code tend to be fast but platform-dependent, and it obfuscates the meaning of the algorithm. On the other hand, object-oriented approach is nice to read, but may come with an inherent performance penalty. These lectures aim to present he basics of the Expression Template (ET) idiom which allows us to keep the object-oriented approach without sacrificing performance. We will in particular show to to enhance ET to include SIMD vectorization. We will then introduce techniques for abstracting iteration, and introduce thread-level parallelism for use in heavy data-centric loads. We will show to to apply these methods i...

  13. Monte Carlo simulations of quantum systems on massively parallel supercomputers

    International Nuclear Information System (INIS)

    Ding, H.Q.

    1993-01-01

    A large class of quantum physics applications uses operator representations that are discrete integers by nature. This class includes magnetic properties of solids, interacting bosons modeling superfluids and Cooper pairs in superconductors, and Hubbard models for strongly correlated electrons systems. This kind of application typically uses integer data representations and the resulting algorithms are dominated entirely by integer operations. The authors implemented an efficient algorithm for one such application on the Intel Touchstone Delta and iPSC/860. The algorithm uses a multispin coding technique which allows significant data compactification and efficient vectorization of Monte Carlo updates. The algorithm regularly switches between two data decompositions, corresponding naturally to different Monte Carlo updating processes and observable measurements such that only nearest-neighbor communications are needed within a given decomposition. On 128 nodes of Intel Delta, this algorithm updates 183 million spins per second (compared to 21 million on CM-2 and 6.2 million on a Cray Y-MP). A systematic performance analysis shows a better than 90% efficiency in the parallel implementation

  14. Massively Parallel Polar Decomposition on Distributed-Memory Systems

    KAUST Repository

    Ltaief, Hatem

    2018-01-01

    We present a high-performance implementation of the Polar Decomposition (PD) on distributed-memory systems. Building upon on the QR-based Dynamically Weighted Halley (QDWH) algorithm, the key idea lies in finding the best rational approximation for the scalar sign function, which also corresponds to the polar factor for symmetric matrices, to further accelerate the QDWH convergence. Based on the Zolotarev rational functions—introduced by Zolotarev (ZOLO) in 1877— this new PD algorithm ZOLO-PD converges within two iterations even for ill-conditioned matrices, instead of the original six iterations needed for QDWH. ZOLO-PD uses the property of Zolotarev functions that optimality is maintained when two functions are composed in an appropriate manner. The resulting ZOLO-PD has a convergence rate up to seventeen, in contrast to the cubic convergence rate for QDWH. This comes at the price of higher arithmetic costs and memory footprint. These extra floating-point operations can, however, be processed in an embarrassingly parallel fashion. We demonstrate performance using up to 102, 400 cores on two supercomputers. We demonstrate that, in the presence of a large number of processing units, ZOLO-PD is able to outperform QDWH by up to 2.3X speedup, especially in situations where QDWH runs out of work, for instance, in the strong scaling mode of operation.

  15. Optimization of multi-phase compressible lattice Boltzmann codes on massively parallel multi-core systems

    NARCIS (Netherlands)

    Biferale, L.; Mantovani, F.; Pivanti, M.; Pozzati, F.; Sbragaglia, M.; Schifano, S.F.; Toschi, F.; Tripiccione, R.

    2011-01-01

    We develop a Lattice Boltzmann code for computational fluid-dynamics and optimize it for massively parallel systems based on multi-core processors. Our code describes 2D multi-phase compressible flows. We analyze the performance bottlenecks that we find as we gradually expose a larger fraction of

  16. Massively-parallel best subset selection for ordinary least-squares regression

    DEFF Research Database (Denmark)

    Gieseke, Fabian; Polsterer, Kai Lars; Mahabal, Ashish

    2017-01-01

    Selecting an optimal subset of k out of d features for linear regression models given n training instances is often considered intractable for feature spaces with hundreds or thousands of dimensions. We propose an efficient massively-parallel implementation for selecting such optimal feature...

  17. Increasing phylogenetic resolution at low taxonomic levels using massively parallel sequencing of chloroplast genomes

    Science.gov (United States)

    Matthew Parks; Richard Cronn; Aaron Liston

    2009-01-01

    We reconstruct the infrageneric phylogeny of Pinus from 37 nearly-complete chloroplast genomes (average 109 kilobases each of an approximately 120 kilobase genome) generated using multiplexed massively parallel sequencing. We found that 30/33 ingroup nodes resolved wlth > 95-percent bootstrap support; this is a substantial improvement relative...

  18. Design and performance characterization of electronic structure calculations on massively parallel supercomputers

    DEFF Research Database (Denmark)

    Romero, N. A.; Glinsvad, Christian; Larsen, Ask Hjorth

    2013-01-01

    Density function theory (DFT) is the most widely employed electronic structure method because of its favorable scaling with system size and accuracy for a broad range of molecular and condensed-phase systems. The advent of massively parallel supercomputers has enhanced the scientific community...

  19. Implementation of QR up- and downdating on a massively parallel |computer

    DEFF Research Database (Denmark)

    Bendtsen, Claus; Hansen, Per Christian; Madsen, Kaj

    1995-01-01

    We describe an implementation of QR up- and downdating on a massively parallel computer (the Connection Machine CM-200) and show that the algorithm maps well onto the computer. In particular, we show how the use of corrected semi-normal equations for downdating can be efficiently implemented. We...... also illustrate the use of our algorithms in a new LP algorithm....

  20. ARTS - adaptive runtime system for massively parallel systems. Final report; ARTS - optimale Ausfuehrungsunterstuetzung fuer komplexe Anwendungen auf massiv parallelen Systemen. Teilprojekt: Parallele Stroemungsmechanik. Abschlussbericht

    Energy Technology Data Exchange (ETDEWEB)

    Gentzsch, W.; Ferstl, F.; Paap, H.G.; Riedel, E.

    1998-03-20

    In the ARTS project, system software has been developed to support smog and fluid dynamic applications on massively parallel systems. The aim is to implement and test specific software structures within an adaptive run-time system to separate the parallel core algorithms of the applications from the platform independent runtime aspects. Only slight modifications is existing Fortran and C code are necessary to integrate the application code into the new object oriented parallel integrated ARTS framework. The OO-design offers easy control, re-use and adaptation of the system services, resulting in a dramatic decrease in development time of the application and in ease of maintainability of the application software in the future. (orig.) [Deutsch] Im Projekt ARTS wird Basissoftware zur Unterstuetzung von Anwendungen aus den Bereichen Smoganalyse und Stroemungsmechanik auf massiv parallelen Systemen entwickelt und optimiert. Im Vordergrund steht die Erprobung geeigneter Strukturen, um systemnahe Funktionalitaeten in einer Laufzeitumgebung anzusiedeln und dadurch die parallelen Kernalgorithmen der Anwendungsprogramme von den plattformunabhaengigen Laufzeitaspekten zu trennen. Es handelt sich dabei um herkoemmlich strukturierten Fortran-Code, der unter minimalen Aenderungen auch weiterhin nutzbar sein muss, sowie um objektbasiert entworfenen C-Code, der die volle Funktionalitaet der ARTS-Plattform ausnutzen kann. Ein objektorientiertes Design erlaubt eine einfache Kontrolle, Wiederverwendung und Adaption der vom System vorgegebenen Basisdienste. Daraus resultiert ein deutlich reduzierter Entwicklungs- und Laufzeitaufwand fuer die Anwendung. ARTS schafft eine integrierende Plattform, die moderne Technologien aus dem Bereich objektorientierter Laufzeitsysteme mit praxisrelevanten Anforderungen aus dem Bereich des wissenschaftlichen Hoechstleistungsrechnens kombiniert. (orig.)

  1. Reduced complexity and latency for a massive MIMO system using a parallel detection algorithm

    Directory of Open Access Journals (Sweden)

    Shoichi Higuchi

    2017-09-01

    Full Text Available In recent years, massive MIMO systems have been widely researched to realize high-speed data transmission. Since massive MIMO systems use a large number of antennas, these systems require huge complexity to detect the signal. In this paper, we propose a novel detection method for massive MIMO using parallel detection with maximum likelihood detection with QR decomposition and M-algorithm (QRM-MLD to reduce the complexity and latency. The proposed scheme obtains an R matrix after permutation of an H matrix and QR decomposition. The R matrix is also eliminated using a Gauss–Jordan elimination method. By using a modified R matrix, the proposed method can detect the transmitted signal using parallel detection. From the simulation results, the proposed scheme can achieve a reduced complexity and latency with a little degradation of the bit error rate (BER performance compared with the conventional method.

  2. A massively-parallel electronic-structure calculations based on real-space density functional theory

    International Nuclear Information System (INIS)

    Iwata, Jun-Ichi; Takahashi, Daisuke; Oshiyama, Atsushi; Boku, Taisuke; Shiraishi, Kenji; Okada, Susumu; Yabana, Kazuhiro

    2010-01-01

    Based on the real-space finite-difference method, we have developed a first-principles density functional program that efficiently performs large-scale calculations on massively-parallel computers. In addition to efficient parallel implementation, we also implemented several computational improvements, substantially reducing the computational costs of O(N 3 ) operations such as the Gram-Schmidt procedure and subspace diagonalization. Using the program on a massively-parallel computer cluster with a theoretical peak performance of several TFLOPS, we perform electronic-structure calculations for a system consisting of over 10,000 Si atoms, and obtain a self-consistent electronic-structure in a few hundred hours. We analyze in detail the costs of the program in terms of computation and of inter-node communications to clarify the efficiency, the applicability, and the possibility for further improvements.

  3. A highly scalable massively parallel fast marching method for the Eikonal equation

    Science.gov (United States)

    Yang, Jianming; Stern, Frederick

    2017-03-01

    The fast marching method is a widely used numerical method for solving the Eikonal equation arising from a variety of scientific and engineering fields. It is long deemed inherently sequential and an efficient parallel algorithm applicable to large-scale practical applications is not available in the literature. In this study, we present a highly scalable massively parallel implementation of the fast marching method using a domain decomposition approach. Central to this algorithm is a novel restarted narrow band approach that coordinates the frequency of communications and the amount of computations extra to a sequential run for achieving an unprecedented parallel performance. Within each restart, the narrow band fast marching method is executed; simple synchronous local exchanges and global reductions are adopted for communicating updated data in the overlapping regions between neighboring subdomains and getting the latest front status, respectively. The independence of front characteristics is exploited through special data structures and augmented status tags to extract the masked parallelism within the fast marching method. The efficiency, flexibility, and applicability of the parallel algorithm are demonstrated through several examples. These problems are extensively tested on six grids with up to 1 billion points using different numbers of processes ranging from 1 to 65536. Remarkable parallel speedups are achieved using tens of thousands of processes. Detailed pseudo-codes for both the sequential and parallel algorithms are provided to illustrate the simplicity of the parallel implementation and its similarity to the sequential narrow band fast marching algorithm.

  4. Proxy-equation paradigm: A strategy for massively parallel asynchronous computations

    Science.gov (United States)

    Mittal, Ankita; Girimaji, Sharath

    2017-09-01

    Massively parallel simulations of transport equation systems call for a paradigm change in algorithm development to achieve efficient scalability. Traditional approaches require time synchronization of processing elements (PEs), which severely restricts scalability. Relaxing synchronization requirement introduces error and slows down convergence. In this paper, we propose and develop a novel "proxy equation" concept for a general transport equation that (i) tolerates asynchrony with minimal added error, (ii) preserves convergence order and thus, (iii) expected to scale efficiently on massively parallel machines. The central idea is to modify a priori the transport equation at the PE boundaries to offset asynchrony errors. Proof-of-concept computations are performed using a one-dimensional advection (convection) diffusion equation. The results demonstrate the promise and advantages of the present strategy.

  5. Developing a Massively Parallel Forward Projection Radiography Model for Large-Scale Industrial Applications

    Energy Technology Data Exchange (ETDEWEB)

    Bauerle, Matthew [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)

    2014-08-01

    This project utilizes Graphics Processing Units (GPUs) to compute radiograph simulations for arbitrary objects. The generation of radiographs, also known as the forward projection imaging model, is computationally intensive and not widely utilized. The goal of this research is to develop a massively parallel algorithm that can compute forward projections for objects with a trillion voxels (3D pixels). To achieve this end, the data are divided into blocks that can each t into GPU memory. The forward projected image is also divided into segments to allow for future parallelization and to avoid needless computations.

  6. Intelligent trigger by massively parallel processors for high energy physics experiments

    International Nuclear Information System (INIS)

    Rohrbach, F.; Vesztergombi, G.

    1992-01-01

    The CERN-MPPC collaboration concentrates its effort on the development of machines based on massive parallelism with thousands of integrated processing elements, arranged in a string. Seven applications are under detailed studies within the collaboration: three for LHC, one for SSC, two for fixed target high energy physics at CERN and one for HDTV. Preliminary results are presented. They show that the objectives should be reached with the use of the ASP architecture. (author)

  7. Massively Parallel Sort-Merge Joins in Main Memory Multi-Core Database Systems

    OpenAIRE

    Martina-Cezara Albutiu, Alfons Kemper, Thomas Neumann

    2012-01-01

    Two emerging hardware trends will dominate the database system technology in the near future: increasing main memory capacities of several TB per server and massively parallel multi-core processing. Many algorithmic and control techniques in current database technology were devised for disk-based systems where I/O dominated the performance. In this work we take a new look at the well-known sort-merge join which, so far, has not been in the focus of research ...

  8. Increasing phylogenetic resolution at low taxonomic levels using massively parallel sequencing of chloroplast genomes

    Directory of Open Access Journals (Sweden)

    Cronn Richard

    2009-12-01

    Full Text Available Abstract Background Molecular evolutionary studies share the common goal of elucidating historical relationships, and the common challenge of adequately sampling taxa and characters. Particularly at low taxonomic levels, recent divergence, rapid radiations, and conservative genome evolution yield limited sequence variation, and dense taxon sampling is often desirable. Recent advances in massively parallel sequencing make it possible to rapidly obtain large amounts of sequence data, and multiplexing makes extensive sampling of megabase sequences feasible. Is it possible to efficiently apply massively parallel sequencing to increase phylogenetic resolution at low taxonomic levels? Results We reconstruct the infrageneric phylogeny of Pinus from 37 nearly-complete chloroplast genomes (average 109 kilobases each of an approximately 120 kilobase genome generated using multiplexed massively parallel sequencing. 30/33 ingroup nodes resolved with ≥ 95% bootstrap support; this is a substantial improvement relative to prior studies, and shows massively parallel sequencing-based strategies can produce sufficient high quality sequence to reach support levels originally proposed for the phylogenetic bootstrap. Resampling simulations show that at least the entire plastome is necessary to fully resolve Pinus, particularly in rapidly radiating clades. Meta-analysis of 99 published infrageneric phylogenies shows that whole plastome analysis should provide similar gains across a range of plant genera. A disproportionate amount of phylogenetic information resides in two loci (ycf1, ycf2, highlighting their unusual evolutionary properties. Conclusion Plastome sequencing is now an efficient option for increasing phylogenetic resolution at lower taxonomic levels in plant phylogenetic and population genetic analyses. With continuing improvements in sequencing capacity, the strategies herein should revolutionize efforts requiring dense taxon and character sampling

  9. Targeted capture massively parallel sequencing analysis of LCIS and invasive lobular cancer: Repertoire of somatic genetic alterations and clonal relationships.

    Science.gov (United States)

    Sakr, Rita A; Schizas, Michail; Carniello, Jose V Scarpa; Ng, Charlotte K Y; Piscuoglio, Salvatore; Giri, Dilip; Andrade, Victor P; De Brot, Marina; Lim, Raymond S; Towers, Russell; Weigelt, Britta; Reis-Filho, Jorge S; King, Tari A

    2016-02-01

    Lobular carcinoma in situ (LCIS) has been proposed as a non-obligate precursor of invasive lobular carcinoma (ILC). Here we sought to define the repertoire of somatic genetic alterations in pure LCIS and in synchronous LCIS and ILC using targeted massively parallel sequencing. DNA samples extracted from microdissected LCIS, ILC and matched normal breast tissue or peripheral blood from 30 patients were subjected to massively parallel sequencing targeting all exons of 273 genes, including the genes most frequently mutated in breast cancer and DNA repair-related genes. Single nucleotide variants and insertions and deletions were identified using state-of-the-art bioinformatics approaches. The constellation of somatic mutations found in LCIS (n = 34) and ILC (n = 21) were similar, with the most frequently mutated genes being CDH1 (56% and 66%, respectively), PIK3CA (41% and 52%, respectively) and CBFB (12% and 19%, respectively). Among 19 LCIS and ILC synchronous pairs, 14 (74%) had at least one identical mutation in common, including identical PIK3CA and CDH1 mutations. Paired analysis of independent foci of LCIS from 3 breasts revealed at least one common mutation in each of the 3 pairs (CDH1, PIK3CA, CBFB and PKHD1L1). LCIS and ILC have a similar repertoire of somatic mutations, with PIK3CA and CDH1 being the most frequently mutated genes. The presence of identical mutations between LCIS-LCIS and LCIS-ILC pairs demonstrates that LCIS is a clonal neoplastic lesion, and provides additional evidence that at least some LCIS are non-obligate precursors of ILC. Copyright © 2015 Federation of European Biochemical Societies. Published by Elsevier B.V. All rights reserved.

  10. Massively parallel Monte Carlo for many-particle simulations on GPUs

    Energy Technology Data Exchange (ETDEWEB)

    Anderson, Joshua A.; Jankowski, Eric [Department of Chemical Engineering, University of Michigan, Ann Arbor, MI 48109 (United States); Grubb, Thomas L. [Department of Materials Science and Engineering, University of Michigan, Ann Arbor, MI 48109 (United States); Engel, Michael [Department of Chemical Engineering, University of Michigan, Ann Arbor, MI 48109 (United States); Glotzer, Sharon C., E-mail: sglotzer@umich.edu [Department of Chemical Engineering, University of Michigan, Ann Arbor, MI 48109 (United States); Department of Materials Science and Engineering, University of Michigan, Ann Arbor, MI 48109 (United States)

    2013-12-01

    Current trends in parallel processors call for the design of efficient massively parallel algorithms for scientific computing. Parallel algorithms for Monte Carlo simulations of thermodynamic ensembles of particles have received little attention because of the inherent serial nature of the statistical sampling. In this paper, we present a massively parallel method that obeys detailed balance and implement it for a system of hard disks on the GPU. We reproduce results of serial high-precision Monte Carlo runs to verify the method. This is a good test case because the hard disk equation of state over the range where the liquid transforms into the solid is particularly sensitive to small deviations away from the balance conditions. On a Tesla K20, our GPU implementation executes over one billion trial moves per second, which is 148 times faster than on a single Intel Xeon E5540 CPU core, enables 27 times better performance per dollar, and cuts energy usage by a factor of 13. With this improved performance we are able to calculate the equation of state for systems of up to one million hard disks. These large system sizes are required in order to probe the nature of the melting transition, which has been debated for the last forty years. In this paper we present the details of our computational method, and discuss the thermodynamics of hard disks separately in a companion paper.

  11. III - Template Metaprogramming for massively parallel scientific computing - Templates for Iteration; Thread-level Parallelism

    CERN Multimedia

    CERN. Geneva

    2016-01-01

    Large scale scientific computing raises questions on different levels ranging from the fomulation of the problems to the choice of the best algorithms and their implementation for a specific platform. There are similarities in these different topics that can be exploited by modern-style C++ template metaprogramming techniques to produce readable, maintainable and generic code. Traditional low-level code tend to be fast but platform-dependent, and it obfuscates the meaning of the algorithm. On the other hand, object-oriented approach is nice to read, but may come with an inherent performance penalty. These lectures aim to present he basics of the Expression Template (ET) idiom which allows us to keep the object-oriented approach without sacrificing performance. We will in particular show to to enhance ET to include SIMD vectorization. We will then introduce techniques for abstracting iteration, and introduce thread-level parallelism for use in heavy data-centric loads. We will show to to apply these methods i...

  12. Massively parallel data processing for quantitative total flow imaging with optical coherence microscopy and tomography

    Science.gov (United States)

    Sylwestrzak, Marcin; Szlag, Daniel; Marchand, Paul J.; Kumar, Ashwin S.; Lasser, Theo

    2017-08-01

    We present an application of massively parallel processing of quantitative flow measurements data acquired using spectral optical coherence microscopy (SOCM). The need for massive signal processing of these particular datasets has been a major hurdle for many applications based on SOCM. In view of this difficulty, we implemented and adapted quantitative total flow estimation algorithms on graphics processing units (GPU) and achieved a 150 fold reduction in processing time when compared to a former CPU implementation. As SOCM constitutes the microscopy counterpart to spectral optical coherence tomography (SOCT), the developed processing procedure can be applied to both imaging modalities. We present the developed DLL library integrated in MATLAB (with an example) and have included the source code for adaptations and future improvements. Catalogue identifier: AFBT_v1_0 Program summary URL:http://cpc.cs.qub.ac.uk/summaries/AFBT_v1_0.html Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland Licensing provisions: GNU GPLv3 No. of lines in distributed program, including test data, etc.: 913552 No. of bytes in distributed program, including test data, etc.: 270876249 Distribution format: tar.gz Programming language: CUDA/C, MATLAB. Computer: Intel x64 CPU, GPU supporting CUDA technology. Operating system: 64-bit Windows 7 Professional. Has the code been vectorized or parallelized?: Yes, CPU code has been vectorized in MATLAB, CUDA code has been parallelized. RAM: Dependent on users parameters, typically between several gigabytes and several tens of gigabytes Classification: 6.5, 18. Nature of problem: Speed up of data processing in optical coherence microscopy Solution method: Utilization of GPU for massively parallel data processing Additional comments: Compiled DLL library with source code and documentation, example of utilization (MATLAB script with raw data) Running time: 1,8 s for one B-scan (150 × faster in comparison to the CPU

  13. Grinding up Wheat: a Massive Loss of Nucleotide Diversity Since Domestication

    DEFF Research Database (Denmark)

    Haudry, Anabelle; Cenci, Alberto; Ravel, Catherine

    2007-01-01

    Several demographic and selective events occurred during the domestication of wheat from the allotetraploid wild emmer (Triticum turgidum ssp. dicoccoides). Cultivated wheat has since been affected by other historical events. We analyzed nucleotide diversity at 21 loci in a sample of 101 individu...

  14. Visualizing Network Traffic to Understand the Performance of Massively Parallel Simulations

    KAUST Repository

    Landge, A. G.

    2012-12-01

    The performance of massively parallel applications is often heavily impacted by the cost of communication among compute nodes. However, determining how to best use the network is a formidable task, made challenging by the ever increasing size and complexity of modern supercomputers. This paper applies visualization techniques to aid parallel application developers in understanding the network activity by enabling a detailed exploration of the flow of packets through the hardware interconnect. In order to visualize this large and complex data, we employ two linked views of the hardware network. The first is a 2D view, that represents the network structure as one of several simplified planar projections. This view is designed to allow a user to easily identify trends and patterns in the network traffic. The second is a 3D view that augments the 2D view by preserving the physical network topology and providing a context that is familiar to the application developers. Using the massively parallel multi-physics code pF3D as a case study, we demonstrate that our tool provides valuable insight that we use to explain and optimize pF3D-s performance on an IBM Blue Gene/P system. © 1995-2012 IEEE.

  15. Massive Parallelism of Monte-Carlo Simulation on Low-End Hardware using Graphic Processing Units

    Energy Technology Data Exchange (ETDEWEB)

    Mburu, Joe Mwangi; Hah, Chang Joo Hah [KEPCO International Nuclear Graduate School, Ulsan (Korea, Republic of)

    2014-05-15

    Within the past decade, research has been done on utilizing GPU massive parallelization in core simulation with impressive results but unfortunately, not much commercial application has been done in the nuclear field especially in reactor core simulation. The purpose of this paper is to give an introductory concept on the topic and illustrate the potential of exploiting the massive parallel nature of GPU computing on a simple monte-carlo simulation with very minimal hardware specifications. To do a comparative analysis, a simple two dimension monte-carlo simulation is implemented for both the CPU and GPU in order to evaluate performance gain based on the computing devices. The heterogeneous platform utilized in this analysis is done on a slow notebook with only 1GHz processor. The end results are quite surprising whereby high speedups obtained are almost a factor of 10. In this work, we have utilized heterogeneous computing in a GPU-based approach in applying potential high arithmetic intensive calculation. By applying a complex monte-carlo simulation on GPU platform, we have speed up the computational process by almost a factor of 10 based on one million neutrons. This shows how easy, cheap and efficient it is in using GPU in accelerating scientific computing and the results should encourage in exploring further this avenue especially in nuclear reactor physics simulation where deterministic and stochastic calculations are quite favourable in parallelization.

  16. Massive Parallelism of Monte-Carlo Simulation on Low-End Hardware using Graphic Processing Units

    International Nuclear Information System (INIS)

    Mburu, Joe Mwangi; Hah, Chang Joo Hah

    2014-01-01

    Within the past decade, research has been done on utilizing GPU massive parallelization in core simulation with impressive results but unfortunately, not much commercial application has been done in the nuclear field especially in reactor core simulation. The purpose of this paper is to give an introductory concept on the topic and illustrate the potential of exploiting the massive parallel nature of GPU computing on a simple monte-carlo simulation with very minimal hardware specifications. To do a comparative analysis, a simple two dimension monte-carlo simulation is implemented for both the CPU and GPU in order to evaluate performance gain based on the computing devices. The heterogeneous platform utilized in this analysis is done on a slow notebook with only 1GHz processor. The end results are quite surprising whereby high speedups obtained are almost a factor of 10. In this work, we have utilized heterogeneous computing in a GPU-based approach in applying potential high arithmetic intensive calculation. By applying a complex monte-carlo simulation on GPU platform, we have speed up the computational process by almost a factor of 10 based on one million neutrons. This shows how easy, cheap and efficient it is in using GPU in accelerating scientific computing and the results should encourage in exploring further this avenue especially in nuclear reactor physics simulation where deterministic and stochastic calculations are quite favourable in parallelization

  17. Time-dependent density-functional theory in massively parallel computer architectures: the OCTOPUS project.

    Science.gov (United States)

    Andrade, Xavier; Alberdi-Rodriguez, Joseba; Strubbe, David A; Oliveira, Micael J T; Nogueira, Fernando; Castro, Alberto; Muguerza, Javier; Arruabarrena, Agustin; Louie, Steven G; Aspuru-Guzik, Alán; Rubio, Angel; Marques, Miguel A L

    2012-06-13

    Octopus is a general-purpose density-functional theory (DFT) code, with a particular emphasis on the time-dependent version of DFT (TDDFT). In this paper we present the ongoing efforts to achieve the parallelization of octopus. We focus on the real-time variant of TDDFT, where the time-dependent Kohn-Sham equations are directly propagated in time. This approach has great potential for execution in massively parallel systems such as modern supercomputers with thousands of processors and graphics processing units (GPUs). For harvesting the potential of conventional supercomputers, the main strategy is a multi-level parallelization scheme that combines the inherent scalability of real-time TDDFT with a real-space grid domain-partitioning approach. A scalable Poisson solver is critical for the efficiency of this scheme. For GPUs, we show how using blocks of Kohn-Sham states provides the required level of data parallelism and that this strategy is also applicable for code optimization on standard processors. Our results show that real-time TDDFT, as implemented in octopus, can be the method of choice for studying the excited states of large molecular systems in modern parallel architectures.

  18. Scalable Parallel Distributed Coprocessor System for Graph Searching Problems with Massive Data

    Directory of Open Access Journals (Sweden)

    Wanrong Huang

    2017-01-01

    Full Text Available The Internet applications, such as network searching, electronic commerce, and modern medical applications, produce and process massive data. Considerable data parallelism exists in computation processes of data-intensive applications. A traversal algorithm, breadth-first search (BFS, is fundamental in many graph processing applications and metrics when a graph grows in scale. A variety of scientific programming methods have been proposed for accelerating and parallelizing BFS because of the poor temporal and spatial locality caused by inherent irregular memory access patterns. However, new parallel hardware could provide better improvement for scientific methods. To address small-world graph problems, we propose a scalable and novel field-programmable gate array-based heterogeneous multicore system for scientific programming. The core is multithread for streaming processing. And the communication network InfiniBand is adopted for scalability. We design a binary search algorithm to address mapping to unify all processor addresses. Within the limits permitted by the Graph500 test bench after 1D parallel hybrid BFS algorithm testing, our 8-core and 8-thread-per-core system achieved superior performance and efficiency compared with the prior work under the same degree of parallelism. Our system is efficient not as a special acceleration unit but as a processor platform that deals with graph searching applications.

  19. Time-dependent density-functional theory in massively parallel computer architectures: the octopus project

    Science.gov (United States)

    Andrade, Xavier; Alberdi-Rodriguez, Joseba; Strubbe, David A.; Oliveira, Micael J. T.; Nogueira, Fernando; Castro, Alberto; Muguerza, Javier; Arruabarrena, Agustin; Louie, Steven G.; Aspuru-Guzik, Alán; Rubio, Angel; Marques, Miguel A. L.

    2012-06-01

    Octopus is a general-purpose density-functional theory (DFT) code, with a particular emphasis on the time-dependent version of DFT (TDDFT). In this paper we present the ongoing efforts to achieve the parallelization of octopus. We focus on the real-time variant of TDDFT, where the time-dependent Kohn-Sham equations are directly propagated in time. This approach has great potential for execution in massively parallel systems such as modern supercomputers with thousands of processors and graphics processing units (GPUs). For harvesting the potential of conventional supercomputers, the main strategy is a multi-level parallelization scheme that combines the inherent scalability of real-time TDDFT with a real-space grid domain-partitioning approach. A scalable Poisson solver is critical for the efficiency of this scheme. For GPUs, we show how using blocks of Kohn-Sham states provides the required level of data parallelism and that this strategy is also applicable for code optimization on standard processors. Our results show that real-time TDDFT, as implemented in octopus, can be the method of choice for studying the excited states of large molecular systems in modern parallel architectures.

  20. Time-dependent density-functional theory in massively parallel computer architectures: the octopus project

    International Nuclear Information System (INIS)

    Andrade, Xavier; Aspuru-Guzik, Alán; Alberdi-Rodriguez, Joseba; Rubio, Angel; Strubbe, David A; Louie, Steven G; Oliveira, Micael J T; Nogueira, Fernando; Castro, Alberto; Muguerza, Javier; Arruabarrena, Agustin; Marques, Miguel A L

    2012-01-01

    Octopus is a general-purpose density-functional theory (DFT) code, with a particular emphasis on the time-dependent version of DFT (TDDFT). In this paper we present the ongoing efforts to achieve the parallelization of octopus. We focus on the real-time variant of TDDFT, where the time-dependent Kohn-Sham equations are directly propagated in time. This approach has great potential for execution in massively parallel systems such as modern supercomputers with thousands of processors and graphics processing units (GPUs). For harvesting the potential of conventional supercomputers, the main strategy is a multi-level parallelization scheme that combines the inherent scalability of real-time TDDFT with a real-space grid domain-partitioning approach. A scalable Poisson solver is critical for the efficiency of this scheme. For GPUs, we show how using blocks of Kohn-Sham states provides the required level of data parallelism and that this strategy is also applicable for code optimization on standard processors. Our results show that real-time TDDFT, as implemented in octopus, can be the method of choice for studying the excited states of large molecular systems in modern parallel architectures. (topical review)

  1. Massively parallel DNA sequencing facilitates diagnosis of patients with Usher syndrome type 1.

    Directory of Open Access Journals (Sweden)

    Hidekane Yoshimura

    Full Text Available Usher syndrome is an autosomal recessive disorder manifesting hearing loss, retinitis pigmentosa and vestibular dysfunction, and having three clinical subtypes. Usher syndrome type 1 is the most severe subtype due to its profound hearing loss, lack of vestibular responses, and retinitis pigmentosa that appears in prepuberty. Six of the corresponding genes have been identified, making early diagnosis through DNA testing possible, with many immediate and several long-term advantages for patients and their families. However, the conventional genetic techniques, such as direct sequence analysis, are both time-consuming and expensive. Targeted exon sequencing of selected genes using the massively parallel DNA sequencing technology will potentially enable us to systematically tackle previously intractable monogenic disorders and improve molecular diagnosis. Using this technique combined with direct sequence analysis, we screened 17 unrelated Usher syndrome type 1 patients and detected probable pathogenic variants in the 16 of them (94.1% who carried at least one mutation. Seven patients had the MYO7A mutation (41.2%, which is the most common type in Japanese. Most of the mutations were detected by only the massively parallel DNA sequencing. We report here four patients, who had probable pathogenic mutations in two different Usher syndrome type 1 genes, and one case of MYO7A/PCDH15 digenic inheritance. This is the first report of Usher syndrome mutation analysis using massively parallel DNA sequencing and the frequency of Usher syndrome type 1 genes in Japanese. Mutation screening using this technique has the power to quickly identify mutations of many causative genes while maintaining cost-benefit performance. In addition, the simultaneous mutation analysis of large numbers of genes is useful for detecting mutations in different genes that are possibly disease modifiers or of digenic inheritance.

  2. Massively parallel DNA sequencing facilitates diagnosis of patients with Usher syndrome type 1.

    Science.gov (United States)

    Yoshimura, Hidekane; Iwasaki, Satoshi; Nishio, Shin-Ya; Kumakawa, Kozo; Tono, Tetsuya; Kobayashi, Yumiko; Sato, Hiroaki; Nagai, Kyoko; Ishikawa, Kotaro; Ikezono, Tetsuo; Naito, Yasushi; Fukushima, Kunihiro; Oshikawa, Chie; Kimitsuki, Takashi; Nakanishi, Hiroshi; Usami, Shin-Ichi

    2014-01-01

    Usher syndrome is an autosomal recessive disorder manifesting hearing loss, retinitis pigmentosa and vestibular dysfunction, and having three clinical subtypes. Usher syndrome type 1 is the most severe subtype due to its profound hearing loss, lack of vestibular responses, and retinitis pigmentosa that appears in prepuberty. Six of the corresponding genes have been identified, making early diagnosis through DNA testing possible, with many immediate and several long-term advantages for patients and their families. However, the conventional genetic techniques, such as direct sequence analysis, are both time-consuming and expensive. Targeted exon sequencing of selected genes using the massively parallel DNA sequencing technology will potentially enable us to systematically tackle previously intractable monogenic disorders and improve molecular diagnosis. Using this technique combined with direct sequence analysis, we screened 17 unrelated Usher syndrome type 1 patients and detected probable pathogenic variants in the 16 of them (94.1%) who carried at least one mutation. Seven patients had the MYO7A mutation (41.2%), which is the most common type in Japanese. Most of the mutations were detected by only the massively parallel DNA sequencing. We report here four patients, who had probable pathogenic mutations in two different Usher syndrome type 1 genes, and one case of MYO7A/PCDH15 digenic inheritance. This is the first report of Usher syndrome mutation analysis using massively parallel DNA sequencing and the frequency of Usher syndrome type 1 genes in Japanese. Mutation screening using this technique has the power to quickly identify mutations of many causative genes while maintaining cost-benefit performance. In addition, the simultaneous mutation analysis of large numbers of genes is useful for detecting mutations in different genes that are possibly disease modifiers or of digenic inheritance.

  3. Detection of reverse transcriptase termination sites using cDNA ligation and massive parallel sequencing

    DEFF Research Database (Denmark)

    Kielpinski, Lukasz J; Boyd, Mette; Sandelin, Albin

    2013-01-01

    Detection of reverse transcriptase termination sites is important in many different applications, such as structural probing of RNAs, rapid amplification of cDNA 5' ends (5' RACE), cap analysis of gene expression, and detection of RNA modifications and protein-RNA cross-links. The throughput...... of these methods can be increased by applying massive parallel sequencing technologies.Here, we describe a versatile method for detection of reverse transcriptase termination sites based on ligation of an adapter to the 3' end of cDNA with bacteriophage TS2126 RNA ligase (CircLigase™). In the following PCR...

  4. From Massively Parallel Algorithms and Fluctuating Time Horizons to Nonequilibrium Surface Growth

    International Nuclear Information System (INIS)

    Korniss, G.; Toroczkai, Z.; Novotny, M. A.; Rikvold, P. A.

    2000-01-01

    We study the asymptotic scaling properties of a massively parallel algorithm for discrete-event simulations where the discrete events are Poisson arrivals. The evolution of the simulated time horizon is analogous to a nonequilibrium surface. Monte Carlo simulations and a coarse-grained approximation indicate that the macroscopic landscape in the steady state is governed by the Edwards-Wilkinson Hamiltonian. Since the efficiency of the algorithm corresponds to the density of local minima in the associated surface, our results imply that the algorithm is asymptotically scalable. (c) 2000 The American Physical Society

  5. Passive and partially active fault tolerance for massively parallel stream processing engines

    DEFF Research Database (Denmark)

    Su, Li; Zhou, Yongluan

    2018-01-01

    . On the other hand, an active approach usually employs backup nodes to run replicated tasks. Upon failure, the active replica can take over the processing of the failed task with minimal latency. However, both approaches have their own inadequacies in Massively Parallel Stream Processing Engines (MPSPE...... also propose effective and efficient algorithms to optimize a partially active replication plan to maximize the quality of tentative outputs. We implemented PPA on top of Storm, an open-source MPSPE and conducted extensive experiments using both real and synthetic datasets to verify the effectiveness...

  6. Block iterative restoration of astronomical images with the massively parallel processor

    International Nuclear Information System (INIS)

    Heap, S.R.; Lindler, D.J.

    1987-01-01

    A method is described for algebraic image restoration capable of treating astronomical images. For a typical 500 x 500 image, direct algebraic restoration would require the solution of a 250,000 x 250,000 linear system. The block iterative approach is used to reduce the problem to solving 4900 121 x 121 linear systems. The algorithm was implemented on the Goddard Massively Parallel Processor, which can solve a 121 x 121 system in approximately 0.06 seconds. Examples are shown of the results for various astronomical images

  7. Massive parallel sequencing in sarcoma pathobiology: state of the art and perspectives.

    Science.gov (United States)

    Brenca, Monica; Maestro, Roberta

    2015-01-01

    Sarcomas are an aggressive and highly heterogeneous group of mesenchymal malignancies with different morphologies and clinical behavior. Current therapeutic strategies remain unsatisfactory. Cytogenetic and molecular characterization of these tumors is resulting in the breakdown of the classical histopathological categories into molecular subgroups that better define sarcoma pathobiology and pave the way to more precise diagnostic criteria and novel therapeutic opportunities. The purpose of this short review is to summarize the state-of-the-art on the exploitation of massive parallel sequencing technologies, also known as next generation sequencing, in the elucidation of sarcoma pathobiology and to discuss how these applications may impact on diagnosis, prognosis and therapy of these tumors.

  8. A Massively Parallel Solver for the Mechanical Harmonic Analysis of Accelerator Cavities

    International Nuclear Information System (INIS)

    2015-01-01

    ACE3P is a 3D massively parallel simulation suite that developed at SLAC National Accelerator Laboratory that can perform coupled electromagnetic, thermal and mechanical study. Effectively utilizing supercomputer resources, ACE3P has become a key simulation tool for particle accelerator R and D. A new frequency domain solver to perform mechanical harmonic response analysis of accelerator components is developed within the existing parallel framework. This solver is designed to determine the frequency response of the mechanical system to external harmonic excitations for time-efficient accurate analysis of the large-scale problems. Coupled with the ACE3P electromagnetic modules, this capability complements a set of multi-physics tools for a comprehensive study of microphonics in superconducting accelerating cavities in order to understand the RF response and feedback requirements for the operational reliability of a particle accelerator. (auth)

  9. Implementation, capabilities, and benchmarking of Shift, a massively parallel Monte Carlo radiation transport code

    International Nuclear Information System (INIS)

    Pandya, Tara M.; Johnson, Seth R.; Evans, Thomas M.; Davidson, Gregory G.; Hamilton, Steven P.; Godfrey, Andrew T.

    2015-01-01

    This paper discusses the implementation, capabilities, and validation of Shift, a massively parallel Monte Carlo radiation transport package developed and maintained at Oak Ridge National Laboratory. It has been developed to scale well from laptop to small computing clusters to advanced supercomputers. Special features of Shift include hybrid capabilities for variance reduction such as CADIS and FW-CADIS, and advanced parallel decomposition and tally methods optimized for scalability on supercomputing architectures. Shift has been validated and verified against various reactor physics benchmarks and compares well to other state-of-the-art Monte Carlo radiation transport codes such as MCNP5, CE KENO-VI, and OpenMC. Some specific benchmarks used for verification and validation include the CASL VERA criticality test suite and several Westinghouse AP1000 ® problems. These benchmark and scaling studies show promising results

  10. Animated computer graphics models of space and earth sciences data generated via the massively parallel processor

    Science.gov (United States)

    Treinish, Lloyd A.; Gough, Michael L.; Wildenhain, W. David

    1987-01-01

    The capability was developed of rapidly producing visual representations of large, complex, multi-dimensional space and earth sciences data sets via the implementation of computer graphics modeling techniques on the Massively Parallel Processor (MPP) by employing techniques recently developed for typically non-scientific applications. Such capabilities can provide a new and valuable tool for the understanding of complex scientific data, and a new application of parallel computing via the MPP. A prototype system with such capabilities was developed and integrated into the National Space Science Data Center's (NSSDC) Pilot Climate Data System (PCDS) data-independent environment for computer graphics data display to provide easy access to users. While developing these capabilities, several problems had to be solved independently of the actual use of the MPP, all of which are outlined.

  11. Scalable and massively parallel Monte Carlo photon transport simulations for heterogeneous computing platforms.

    Science.gov (United States)

    Yu, Leiming; Nina-Paravecino, Fanny; Kaeli, David; Fang, Qianqian

    2018-01-01

    We present a highly scalable Monte Carlo (MC) three-dimensional photon transport simulation platform designed for heterogeneous computing systems. Through the development of a massively parallel MC algorithm using the Open Computing Language framework, this research extends our existing graphics processing unit (GPU)-accelerated MC technique to a highly scalable vendor-independent heterogeneous computing environment, achieving significantly improved performance and software portability. A number of parallel computing techniques are investigated to achieve portable performance over a wide range of computing hardware. Furthermore, multiple thread-level and device-level load-balancing strategies are developed to obtain efficient simulations using multiple central processing units and GPUs. (2018) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE).

  12. DGDFT: A massively parallel method for large scale density functional theory calculations.

    Science.gov (United States)

    Hu, Wei; Lin, Lin; Yang, Chao

    2015-09-28

    We describe a massively parallel implementation of the recently developed discontinuous Galerkin density functional theory (DGDFT) method, for efficient large-scale Kohn-Sham DFT based electronic structure calculations. The DGDFT method uses adaptive local basis (ALB) functions generated on-the-fly during the self-consistent field iteration to represent the solution to the Kohn-Sham equations. The use of the ALB set provides a systematic way to improve the accuracy of the approximation. By using the pole expansion and selected inversion technique to compute electron density, energy, and atomic forces, we can make the computational complexity of DGDFT scale at most quadratically with respect to the number of electrons for both insulating and metallic systems. We show that for the two-dimensional (2D) phosphorene systems studied here, using 37 basis functions per atom allows us to reach an accuracy level of 1.3 × 10(-4) Hartree/atom in terms of the error of energy and 6.2 × 10(-4) Hartree/bohr in terms of the error of atomic force, respectively. DGDFT can achieve 80% parallel efficiency on 128,000 high performance computing cores when it is used to study the electronic structure of 2D phosphorene systems with 3500-14 000 atoms. This high parallel efficiency results from a two-level parallelization scheme that we will describe in detail.

  13. Massively Parallel and Scalable Implicit Time Integration Algorithms for Structural Dynamics

    Science.gov (United States)

    Farhat, Charbel

    1997-01-01

    Explicit codes are often used to simulate the nonlinear dynamics of large-scale structural systems, even for low frequency response, because the storage and CPU requirements entailed by the repeated factorizations traditionally found in implicit codes rapidly overwhelm the available computing resources. With the advent of parallel processing, this trend is accelerating because of the following additional facts: (a) explicit schemes are easier to parallelize than implicit ones, and (b) explicit schemes induce short range interprocessor communications that are relatively inexpensive, while the factorization methods used in most implicit schemes induce long range interprocessor communications that often ruin the sought-after speed-up. However, the time step restriction imposed by the Courant stability condition on all explicit schemes cannot yet be offset by the speed of the currently available parallel hardware. Therefore, it is essential to develop efficient alternatives to direct methods that are also amenable to massively parallel processing because implicit codes using unconditionally stable time-integration algorithms are computationally more efficient when simulating the low-frequency dynamics of aerospace structures.

  14. DGDFT: A massively parallel method for large scale density functional theory calculations

    Energy Technology Data Exchange (ETDEWEB)

    Hu, Wei, E-mail: whu@lbl.gov; Yang, Chao, E-mail: cyang@lbl.gov [Computational Research Division, Lawrence Berkeley National Laboratory, Berkeley, California 94720 (United States); Lin, Lin, E-mail: linlin@math.berkeley.edu [Computational Research Division, Lawrence Berkeley National Laboratory, Berkeley, California 94720 (United States); Department of Mathematics, University of California, Berkeley, California 94720 (United States)

    2015-09-28

    We describe a massively parallel implementation of the recently developed discontinuous Galerkin density functional theory (DGDFT) method, for efficient large-scale Kohn-Sham DFT based electronic structure calculations. The DGDFT method uses adaptive local basis (ALB) functions generated on-the-fly during the self-consistent field iteration to represent the solution to the Kohn-Sham equations. The use of the ALB set provides a systematic way to improve the accuracy of the approximation. By using the pole expansion and selected inversion technique to compute electron density, energy, and atomic forces, we can make the computational complexity of DGDFT scale at most quadratically with respect to the number of electrons for both insulating and metallic systems. We show that for the two-dimensional (2D) phosphorene systems studied here, using 37 basis functions per atom allows us to reach an accuracy level of 1.3 × 10{sup −4} Hartree/atom in terms of the error of energy and 6.2 × 10{sup −4} Hartree/bohr in terms of the error of atomic force, respectively. DGDFT can achieve 80% parallel efficiency on 128,000 high performance computing cores when it is used to study the electronic structure of 2D phosphorene systems with 3500-14 000 atoms. This high parallel efficiency results from a two-level parallelization scheme that we will describe in detail.

  15. DGDFT: A massively parallel method for large scale density functional theory calculations

    International Nuclear Information System (INIS)

    Hu, Wei; Yang, Chao; Lin, Lin

    2015-01-01

    We describe a massively parallel implementation of the recently developed discontinuous Galerkin density functional theory (DGDFT) method, for efficient large-scale Kohn-Sham DFT based electronic structure calculations. The DGDFT method uses adaptive local basis (ALB) functions generated on-the-fly during the self-consistent field iteration to represent the solution to the Kohn-Sham equations. The use of the ALB set provides a systematic way to improve the accuracy of the approximation. By using the pole expansion and selected inversion technique to compute electron density, energy, and atomic forces, we can make the computational complexity of DGDFT scale at most quadratically with respect to the number of electrons for both insulating and metallic systems. We show that for the two-dimensional (2D) phosphorene systems studied here, using 37 basis functions per atom allows us to reach an accuracy level of 1.3 × 10 −4 Hartree/atom in terms of the error of energy and 6.2 × 10 −4 Hartree/bohr in terms of the error of atomic force, respectively. DGDFT can achieve 80% parallel efficiency on 128,000 high performance computing cores when it is used to study the electronic structure of 2D phosphorene systems with 3500-14 000 atoms. This high parallel efficiency results from a two-level parallelization scheme that we will describe in detail

  16. The Fortran-P Translator: Towards Automatic Translation of Fortran 77 Programs for Massively Parallel Processors

    Directory of Open Access Journals (Sweden)

    Matthew O'keefe

    1995-01-01

    Full Text Available Massively parallel processors (MPPs hold the promise of extremely high performance that, if realized, could be used to study problems of unprecedented size and complexity. One of the primary stumbling blocks to this promise has been the lack of tools to translate application codes to MPP form. In this article we show how applications codes written in a subset of Fortran 77, called Fortran-P, can be translated to achieve good performance on several massively parallel machines. This subset can express codes that are self-similar, where the algorithm applied to the global data domain is also applied to each subdomain. We have found many codes that match the Fortran-P programming style and have converted them using our tools. We believe a self-similar coding style will accomplish what a vectorizable style has accomplished for vector machines by allowing the construction of robust, user-friendly, automatic translation systems that increase programmer productivity and generate fast, efficient code for MPPs.

  17. Decreasing Data Analytics Time: Hybrid Architecture MapReduce-Massive Parallel Processing for a Smart Grid

    Directory of Open Access Journals (Sweden)

    Abdeslam Mehenni

    2017-03-01

    Full Text Available As our populations grow in a world of limited resources enterprise seek ways to lighten our load on the planet. The idea of modifying consumer behavior appears as a foundation for smart grids. Enterprise demonstrates the value available from deep analysis of electricity consummation histories, consumers’ messages, and outage alerts, etc. Enterprise mines massive structured and unstructured data. In a nutshell, smart grids result in a flood of data that needs to be analyzed, for better adjust to demand and give customers more ability to delve into their power consumption. Simply put, smart grids will increasingly have a flexible data warehouse attached to them. The key driver for the adoption of data management strategies is clearly the need to handle and analyze the large amounts of information utilities are now faced with. New approaches to data integration are nauseating moment; Hadoop is in fact now being used by the utility to help manage the huge growth in data whilst maintaining coherence of the Data Warehouse. In this paper we define a new Meter Data Management System Architecture repository that differ with three leaders MDMS, where we use MapReduce programming model for ETL and Parallel DBMS in Query statements(Massive Parallel Processing MPP.

  18. Massively parallel whole genome amplification for single-cell sequencing using droplet microfluidics.

    Science.gov (United States)

    Hosokawa, Masahito; Nishikawa, Yohei; Kogawa, Masato; Takeyama, Haruko

    2017-07-12

    Massively parallel single-cell genome sequencing is required to further understand genetic diversities in complex biological systems. Whole genome amplification (WGA) is the first step for single-cell sequencing, but its throughput and accuracy are insufficient in conventional reaction platforms. Here, we introduce single droplet multiple displacement amplification (sd-MDA), a method that enables massively parallel amplification of single cell genomes while maintaining sequence accuracy and specificity. Tens of thousands of single cells are compartmentalized in millions of picoliter droplets and then subjected to lysis and WGA by passive droplet fusion in microfluidic channels. Because single cells are isolated in compartments, their genomes are amplified to saturation without contamination. This enables the high-throughput acquisition of contamination-free and cell specific sequence reads from single cells (21,000 single-cells/h), resulting in enhancement of the sequence data quality compared to conventional methods. This method allowed WGA of both single bacterial cells and human cancer cells. The obtained sequencing coverage rivals those of conventional techniques with superior sequence quality. In addition, we also demonstrate de novo assembly of uncultured soil bacteria and obtain draft genomes from single cell sequencing. This sd-MDA is promising for flexible and scalable use in single-cell sequencing.

  19. Massively Parallel, Molecular Analysis Platform Developed Using a CMOS Integrated Circuit With Biological Nanopores

    Science.gov (United States)

    Roever, Stefan

    2012-01-01

    A massively parallel, low cost molecular analysis platform will dramatically change the nature of protein, molecular and genomics research, DNA sequencing, and ultimately, molecular diagnostics. An integrated circuit (IC) with 264 sensors was fabricated using standard CMOS semiconductor processing technology. Each of these sensors is individually controlled with precision analog circuitry and is capable of single molecule measurements. Under electronic and software control, the IC was used to demonstrate the feasibility of creating and detecting lipid bilayers and biological nanopores using wild type α-hemolysin. The ability to dynamically create bilayers over each of the sensors will greatly accelerate pore development and pore mutation analysis. In addition, the noise performance of the IC was measured to be 30fA(rms). With this noise performance, single base detection of DNA was demonstrated using α-hemolysin. The data shows that a single molecule, electrical detection platform using biological nanopores can be operationalized and can ultimately scale to millions of sensors. Such a massively parallel platform will revolutionize molecular analysis and will completely change the field of molecular diagnostics in the future.

  20. A Faster Parallel Algorithm and Efficient Multithreaded Implementations for Evaluating Betweenness Centrality on Massive Datasets

    Energy Technology Data Exchange (ETDEWEB)

    Madduri, Kamesh; Ediger, David; Jiang, Karl; Bader, David A.; Chavarria-Miranda, Daniel

    2009-02-15

    We present a new lock-free parallel algorithm for computing betweenness centralityof massive small-world networks. With minor changes to the data structures, ouralgorithm also achieves better spatial cache locality compared to previous approaches. Betweenness centrality is a key algorithm kernel in HPCS SSCA#2, a benchmark extensively used to evaluate the performance of emerging high-performance computing architectures for graph-theoretic computations. We design optimized implementations of betweenness centrality and the SSCA#2 benchmark for two hardware multithreaded systems: a Cray XMT system with the Threadstorm processor, and a single-socket Sun multicore server with the UltraSPARC T2 processor. For a small-world network of 134 million vertices and 1.073 billion edges, the 16-processor XMT system and the 8-core Sun Fire T5120 server achieve TEPS scores (an algorithmic performance count for the SSCA#2 benchmark) of 160 million and 90 million respectively, which corresponds to more than a 2X performance improvement over the previous parallel implementations. To better characterize the performance of these multithreaded systems, we correlate the SSCA#2 performance results with data from the memory-intensive STREAM and RandomAccess benchmarks. Finally, we demonstrate the applicability of our implementation to analyze massive real-world datasets by computing approximate betweenness centrality for a large-scale IMDb movie-actor network.

  1. Application of Massively Parallel Sequencing in the Clinical Diagnostic Testing of Inherited Cardiac Conditions

    Directory of Open Access Journals (Sweden)

    Ivone U. S. Leong

    2014-06-01

    Full Text Available Sudden cardiac death in people between the ages of 1–40 years is a devastating event and is frequently caused by several heritable cardiac disorders. These disorders include cardiac ion channelopathies, such as long QT syndrome, catecholaminergic polymorphic ventricular tachycardia and Brugada syndrome and cardiomyopathies, such as hypertrophic cardiomyopathy and arrhythmogenic right ventricular cardiomyopathy. Through careful molecular genetic evaluation of DNA from sudden death victims, the causative gene mutation can be uncovered, and the rest of the family can be screened and preventative measures implemented in at-risk individuals. The current screening approach in most diagnostic laboratories uses Sanger-based sequencing; however, this method is time consuming and labour intensive. The development of massively parallel sequencing has made it possible to produce millions of sequence reads simultaneously and is potentially an ideal approach to screen for mutations in genes that are associated with sudden cardiac death. This approach offers mutation screening at reduced cost and turnaround time. Here, we will review the current commercially available enrichment kits, massively parallel sequencing (MPS platforms, downstream data analysis and its application to sudden cardiac death in a diagnostic environment.

  2. MADmap: A Massively Parallel Maximum-Likelihood Cosmic Microwave Background Map-Maker

    Energy Technology Data Exchange (ETDEWEB)

    Cantalupo, Christopher; Borrill, Julian; Jaffe, Andrew; Kisner, Theodore; Stompor, Radoslaw

    2009-06-09

    MADmap is a software application used to produce maximum-likelihood images of the sky from time-ordered data which include correlated noise, such as those gathered by Cosmic Microwave Background (CMB) experiments. It works efficiently on platforms ranging from small workstations to the most massively parallel supercomputers. Map-making is a critical step in the analysis of all CMB data sets, and the maximum-likelihood approach is the most accurate and widely applicable algorithm; however, it is a computationally challenging task. This challenge will only increase with the next generation of ground-based, balloon-borne and satellite CMB polarization experiments. The faintness of the B-mode signal that these experiments seek to measure requires them to gather enormous data sets. MADmap is already being run on up to O(1011) time samples, O(108) pixels and O(104) cores, with ongoing work to scale to the next generation of data sets and supercomputers. We describe MADmap's algorithm based around a preconditioned conjugate gradient solver, fast Fourier transforms and sparse matrix operations. We highlight MADmap's ability to address problems typically encountered in the analysis of realistic CMB data sets and describe its application to simulations of the Planck and EBEX experiments. The massively parallel and distributed implementation is detailed and scaling complexities are given for the resources required. MADmap is capable of analysing the largest data sets now being collected on computing resources currently available, and we argue that, given Moore's Law, MADmap will be capable of reducing the most massive projected data sets.

  3. ASSET: Analysis of Sequences of Synchronous Events in Massively Parallel Spike Trains

    Science.gov (United States)

    Canova, Carlos; Denker, Michael; Gerstein, George; Helias, Moritz

    2016-01-01

    With the ability to observe the activity from large numbers of neurons simultaneously using modern recording technologies, the chance to identify sub-networks involved in coordinated processing increases. Sequences of synchronous spike events (SSEs) constitute one type of such coordinated spiking that propagates activity in a temporally precise manner. The synfire chain was proposed as one potential model for such network processing. Previous work introduced a method for visualization of SSEs in massively parallel spike trains, based on an intersection matrix that contains in each entry the degree of overlap of active neurons in two corresponding time bins. Repeated SSEs are reflected in the matrix as diagonal structures of high overlap values. The method as such, however, leaves the task of identifying these diagonal structures to visual inspection rather than to a quantitative analysis. Here we present ASSET (Analysis of Sequences of Synchronous EvenTs), an improved, fully automated method which determines diagonal structures in the intersection matrix by a robust mathematical procedure. The method consists of a sequence of steps that i) assess which entries in the matrix potentially belong to a diagonal structure, ii) cluster these entries into individual diagonal structures and iii) determine the neurons composing the associated SSEs. We employ parallel point processes generated by stochastic simulations as test data to demonstrate the performance of the method under a wide range of realistic scenarios, including different types of non-stationarity of the spiking activity and different correlation structures. Finally, the ability of the method to discover SSEs is demonstrated on complex data from large network simulations with embedded synfire chains. Thus, ASSET represents an effective and efficient tool to analyze massively parallel spike data for temporal sequences of synchronous activity. PMID:27420734

  4. TIA: algorithms for development of identity-linked SNP islands for analysis by massively parallel DNA sequencing.

    Science.gov (United States)

    Farris, M Heath; Scott, Andrew R; Texter, Pamela A; Bartlett, Marta; Coleman, Patricia; Masters, David

    2018-04-11

    Single nucleotide polymorphisms (SNPs) located within the human genome have been shown to have utility as markers of identity in the differentiation of DNA from individual contributors. Massively parallel DNA sequencing (MPS) technologies and human genome SNP databases allow for the design of suites of identity-linked target regions, amenable to sequencing in a multiplexed and massively parallel manner. Therefore, tools are needed for leveraging the genotypic information found within SNP databases for the discovery of genomic targets that can be evaluated on MPS platforms. The SNP island target identification algorithm (TIA) was developed as a user-tunable system to leverage SNP information within databases. Using data within the 1000 Genomes Project SNP database, human genome regions were identified that contain globally ubiquitous identity-linked SNPs and that were responsive to targeted resequencing on MPS platforms. Algorithmic filters were used to exclude target regions that did not conform to user-tunable SNP island target characteristics. To validate the accuracy of TIA for discovering these identity-linked SNP islands within the human genome, SNP island target regions were amplified from 70 contributor genomic DNA samples using the polymerase chain reaction. Multiplexed amplicons were sequenced using the Illumina MiSeq platform, and the resulting sequences were analyzed for SNP variations. 166 putative identity-linked SNPs were targeted in the identified genomic regions. Of the 309 SNPs that provided discerning power across individual SNP profiles, 74 previously undefined SNPs were identified during evaluation of targets from individual genomes. Overall, DNA samples of 70 individuals were uniquely identified using a subset of the suite of identity-linked SNP islands. TIA offers a tunable genome search tool for the discovery of targeted genomic regions that are scalable in the population frequency and numbers of SNPs contained within the SNP island regions

  5. Application of Raptor-M3G to reactor dosimetry problems on massively parallel architectures - 026

    International Nuclear Information System (INIS)

    Longoni, G.

    2010-01-01

    The solution of complex 3-D radiation transport problems requires significant resources both in terms of computation time and memory availability. Therefore, parallel algorithms and multi-processor architectures are required to solve efficiently large 3-D radiation transport problems. This paper presents the application of RAPTOR-M3G (Rapid Parallel Transport Of Radiation - Multiple 3D Geometries) to reactor dosimetry problems. RAPTOR-M3G is a newly developed parallel computer code designed to solve the discrete ordinates (SN) equations on multi-processor computer architectures. This paper presents the results for a reactor dosimetry problem using a 3-D model of a commercial 2-loop pressurized water reactor (PWR). The accuracy and performance of RAPTOR-M3G will be analyzed and the numerical results obtained from the calculation will be compared directly to measurements of the neutron field in the reactor cavity air gap. The parallel performance of RAPTOR-M3G on massively parallel architectures, where the number of computing nodes is in the order of hundreds, will be analyzed up to four hundred processors. The performance results will be presented based on two supercomputing architectures: the POPLE supercomputer operated by the Pittsburgh Supercomputing Center and the Westinghouse computer cluster. The Westinghouse computer cluster is equipped with a standard Ethernet network connection and an InfiniBand R interconnects capable of a bandwidth in excess of 20 GBit/sec. Therefore, the impact of the network architecture on RAPTOR-M3G performance will be analyzed as well. (authors)

  6. User's Guide for TOUGH2-MP - A Massively Parallel Version of the TOUGH2 Code

    International Nuclear Information System (INIS)

    Earth Sciences Division; Zhang, Keni; Zhang, Keni; Wu, Yu-Shu; Pruess, Karsten

    2008-01-01

    TOUGH2-MP is a massively parallel (MP) version of the TOUGH2 code, designed for computationally efficient parallel simulation of isothermal and nonisothermal flows of multicomponent, multiphase fluids in one, two, and three-dimensional porous and fractured media. In recent years, computational requirements have become increasingly intensive in large or highly nonlinear problems for applications in areas such as radioactive waste disposal, CO2 geological sequestration, environmental assessment and remediation, reservoir engineering, and groundwater hydrology. The primary objective of developing the parallel-simulation capability is to significantly improve the computational performance of the TOUGH2 family of codes. The particular goal for the parallel simulator is to achieve orders-of-magnitude improvement in computational time for models with ever-increasing complexity. TOUGH2-MP is designed to perform parallel simulation on multi-CPU computational platforms. An earlier version of TOUGH2-MP (V1.0) was based on the TOUGH2 Version 1.4 with EOS3, EOS9, and T2R3D modules, a software previously qualified for applications in the Yucca Mountain project, and was designed for execution on CRAY T3E and IBM SP supercomputers. The current version of TOUGH2-MP (V2.0) includes all fluid property modules of the standard version TOUGH2 V2.0. It provides computationally efficient capabilities using supercomputers, Linux clusters, or multi-core PCs, and also offers many user-friendly features. The parallel simulator inherits all process capabilities from V2.0 together with additional capabilities for handling fractured media from V1.4. This report provides a quick starting guide on how to set up and run the TOUGH2-MP program for users with a basic knowledge of running the (standard) version TOUGH2 code. The report also gives a brief technical description of the code, including a discussion of parallel methodology, code structure, as well as mathematical and numerical methods used

  7. Screening of a Brassica napus bacterial artificial chromosome library using highly parallel single nucleotide polymorphism assays

    Science.gov (United States)

    2013-01-01

    Background Efficient screening of bacterial artificial chromosome (BAC) libraries with polymerase chain reaction (PCR)-based markers is feasible provided that a multidimensional pooling strategy is implemented. Single nucleotide polymorphisms (SNPs) can be screened in multiplexed format, therefore this marker type lends itself particularly well for medium- to high-throughput applications. Combining the power of multiplex-PCR assays with a multidimensional pooling system may prove to be especially challenging in a polyploid genome. In polyploid genomes two classes of SNPs need to be distinguished, polymorphisms between accessions (intragenomic SNPs) and those differentiating between homoeologous genomes (intergenomic SNPs). We have assessed whether the highly parallel Illumina GoldenGate® Genotyping Assay is suitable for the screening of a BAC library of the polyploid Brassica napus genome. Results A multidimensional screening platform was developed for a Brassica napus BAC library which is composed of almost 83,000 clones. Intragenomic and intergenomic SNPs were included in Illumina’s GoldenGate® Genotyping Assay and both SNP classes were used successfully for screening of the multidimensional BAC pools of the Brassica napus library. An optimized scoring method is proposed which is especially valuable for SNP calling of intergenomic SNPs. Validation of the genotyping results by independent methods revealed a success of approximately 80% for the multiplex PCR-based screening regardless of whether intra- or intergenomic SNPs were evaluated. Conclusions Illumina’s GoldenGate® Genotyping Assay can be efficiently used for screening of multidimensional Brassica napus BAC pools. SNP calling was specifically tailored for the evaluation of BAC pool screening data. The developed scoring method can be implemented independently of plant reference samples. It is demonstrated that intergenomic SNPs represent a powerful tool for BAC library screening of a polyploid genome

  8. Simultaneous digital quantification and fluorescence-based size characterization of massively parallel sequencing libraries.

    Science.gov (United States)

    Laurie, Matthew T; Bertout, Jessica A; Taylor, Sean D; Burton, Joshua N; Shendure, Jay A; Bielas, Jason H

    2013-08-01

    Due to the high cost of failed runs and suboptimal data yields, quantification and determination of fragment size range are crucial steps in the library preparation process for massively parallel sequencing (or next-generation sequencing). Current library quality control methods commonly involve quantification using real-time quantitative PCR and size determination using gel or capillary electrophoresis. These methods are laborious and subject to a number of significant limitations that can make library calibration unreliable. Herein, we propose and test an alternative method for quality control of sequencing libraries using droplet digital PCR (ddPCR). By exploiting a correlation we have discovered between droplet fluorescence and amplicon size, we achieve the joint quantification and size determination of target DNA with a single ddPCR assay. We demonstrate the accuracy and precision of applying this method to the preparation of sequencing libraries.

  9. A scalable approach to modeling groundwater flow on massively parallel computers

    International Nuclear Information System (INIS)

    Ashby, S.F.; Falgout, R.D.; Tompson, A.F.B.

    1995-12-01

    We describe a fully scalable approach to the simulation of groundwater flow on a hierarchy of computing platforms, ranging from workstations to massively parallel computers. Specifically, we advocate the use of scalable conceptual models in which the subsurface model is defined independently of the computational grid on which the simulation takes place. We also describe a scalable multigrid algorithm for computing the groundwater flow velocities. We axe thus able to leverage both the engineer's time spent developing the conceptual model and the computing resources used in the numerical simulation. We have successfully employed this approach at the LLNL site, where we have run simulations ranging in size from just a few thousand spatial zones (on workstations) to more than eight million spatial zones (on the CRAY T3D)-all using the same conceptual model

  10. Massively parallel computing and the search for jets and black holes at the LHC

    Energy Technology Data Exchange (ETDEWEB)

    Halyo, V., E-mail: vhalyo@gmail.com; LeGresley, P.; Lujan, P.

    2014-04-21

    Massively parallel computing at the LHC could be the next leap necessary to reach an era of new discoveries at the LHC after the Higgs discovery. Scientific computing is a critical component of the LHC experiment, including operation, trigger, LHC computing GRID, simulation, and analysis. One way to improve the physics reach of the LHC is to take advantage of the flexibility of the trigger system by integrating coprocessors based on Graphics Processing Units (GPUs) or the Many Integrated Core (MIC) architecture into its server farm. This cutting edge technology provides not only the means to accelerate existing algorithms, but also the opportunity to develop new algorithms that select events in the trigger that previously would have evaded detection. In this paper we describe new algorithms that would allow us to select in the trigger new topological signatures that include non-prompt jet and black hole-like objects in the silicon tracker.

  11. Massively parallel simulations of strong electronic correlations: Realistic Coulomb vertex and multiplet effects

    Science.gov (United States)

    Baumgärtel, M.; Ghanem, K.; Kiani, A.; Koch, E.; Pavarini, E.; Sims, H.; Zhang, G.

    2017-07-01

    We discuss the efficient implementation of general impurity solvers for dynamical mean-field theory. We show that both Lanczos and quantum Monte Carlo in different flavors (Hirsch-Fye, continuous-time hybridization- and interaction-expansion) exhibit excellent scaling on massively parallel supercomputers. We apply these algorithms to simulate realistic model Hamiltonians including the full Coulomb vertex, crystal-field splitting, and spin-orbit interaction. We discuss how to remove the sign problem in the presence of non-diagonal crystal-field and hybridization matrices. We show how to extract the physically observable quantities from imaginary time data, in particular correlation functions and susceptibilities. Finally, we present benchmarks and applications for representative correlated systems.

  12. Quantification of massively parallel sequencing libraries - a comparative study of eight methods

    DEFF Research Database (Denmark)

    Hussing, Christian; Kampmann, Marie-Louise; Mogensen, Helle Smidt

    2018-01-01

    Quantification of massively parallel sequencing libraries is important for acquisition of monoclonal beads or clusters prior to clonal amplification and to avoid large variations in library coverage when multiple samples are included in one sequencing analysis. No gold standard for quantification...... estimates followed by Qubit and electrophoresis-based instruments (Bioanalyzer, TapeStation, GX Touch, and Fragment Analyzer), while SYBR Green and TaqMan based qPCR assays gave the lowest estimates. qPCR gave more accurate predictions of sequencing coverage than Qubit and TapeStation did. Costs, time......-consumption, workflow simplicity, and ability to quantify multiple samples are discussed. Technical specifications, advantages, and disadvantages of the various methods are pointed out....

  13. Computations on the massively parallel processor at the Goddard Space Flight Center

    Science.gov (United States)

    Strong, James P.

    1991-01-01

    Described are four significant algorithms implemented on the massively parallel processor (MPP) at the Goddard Space Flight Center. Two are in the area of image analysis. Of the other two, one is a mathematical simulation experiment and the other deals with the efficient transfer of data between distantly separated processors in the MPP array. The first algorithm presented is the automatic determination of elevations from stereo pairs. The second algorithm solves mathematical logistic equations capable of producing both ordered and chaotic (or random) solutions. This work can potentially lead to the simulation of artificial life processes. The third algorithm is the automatic segmentation of images into reasonable regions based on some similarity criterion, while the fourth is an implementation of a bitonic sort of data which significantly overcomes the nearest neighbor interconnection constraints on the MPP for transferring data between distant processors.

  14. Phase space simulation of collisionless stellar systems on the massively parallel processor

    International Nuclear Information System (INIS)

    White, R.L.

    1987-01-01

    A numerical technique for solving the collisionless Boltzmann equation describing the time evolution of a self gravitating fluid in phase space was implemented on the Massively Parallel Processor (MPP). The code performs calculations for a two dimensional phase space grid (with one space and one velocity dimension). Some results from calculations are presented. The execution speed of the code is comparable to the speed of a single processor of a Cray-XMP. Advantages and disadvantages of the MPP architecture for this type of problem are discussed. The nearest neighbor connectivity of the MPP array does not pose a significant obstacle. Future MPP-like machines should have much more local memory and easier access to staging memory and disks in order to be effective for this type of problem

  15. GPAW - massively parallel electronic structure calculations with Python-based software

    DEFF Research Database (Denmark)

    Enkovaara, Jussi; Romero, Nichols A.; Shende, Sameer

    2011-01-01

    of the productivity enhancing features together with a good numerical performance. We have used this approach in implementing an electronic structure simulation software GPAW using the combination of Python and C programming languages. While the chosen approach works well in standard workstations and Unix...... popular choice. While dynamic, interpreted languages, such as Python, can increase the effciency of programmer, they cannot compete directly with the raw performance of compiled languages. However, by using an interpreted language together with a compiled language, it is possible to have most...... environments, massively parallel supercomputing systems can present some challenges in porting, debugging and profiling the software. In this paper we describe some details of the implementation and discuss the advantages and challenges of the combined Python/C approach. We show that despite the challenges...

  16. Detection of arboviruses and other micro-organisms in experimentally infected mosquitoes using massively parallel sequencing.

    Directory of Open Access Journals (Sweden)

    Sonja Hall-Mendelin

    Full Text Available Human disease incidence attributed to arbovirus infection is increasing throughout the world, with effective control interventions limited by issues of sustainability, insecticide resistance and the lack of effective vaccines. Several promising control strategies are currently under development, such as the release of mosquitoes trans-infected with virus-blocking Wolbachia bacteria. Implementation of any control program is dependent on effective virus surveillance and a thorough understanding of virus-vector interactions. Massively parallel sequencing has enormous potential for providing comprehensive genomic information that can be used to assess many aspects of arbovirus ecology, as well as to evaluate novel control strategies. To demonstrate proof-of-principle, we analyzed Aedes aegypti or Aedes albopictus experimentally infected with dengue, yellow fever or chikungunya viruses. Random amplification was used to prepare sufficient template for sequencing on the Personal Genome Machine. Viral sequences were present in all infected mosquitoes. In addition, in most cases, we were also able to identify the mosquito species and mosquito micro-organisms, including the bacterial endosymbiont Wolbachia. Importantly, naturally occurring Wolbachia strains could be differentiated from strains that had been trans-infected into the mosquito. The method allowed us to assemble near full-length viral genomes and detect other micro-organisms without prior sequence knowledge, in a single reaction. This is a step toward the application of massively parallel sequencing as an arbovirus surveillance tool. It has the potential to provide insight into virus transmission dynamics, and has applicability to the post-release monitoring of Wolbachia in mosquito populations.

  17. Massively parallel simulator of optical coherence tomography of inhomogeneous turbid media.

    Science.gov (United States)

    Malektaji, Siavash; Lima, Ivan T; Escobar I, Mauricio R; Sherif, Sherif S

    2017-10-01

    An accurate and practical simulator for Optical Coherence Tomography (OCT) could be an important tool to study the underlying physical phenomena in OCT such as multiple light scattering. Recently, many researchers have investigated simulation of OCT of turbid media, e.g., tissue, using Monte Carlo methods. The main drawback of these earlier simulators is the long computational time required to produce accurate results. We developed a massively parallel simulator of OCT of inhomogeneous turbid media that obtains both Class I diffusive reflectivity, due to ballistic and quasi-ballistic scattered photons, and Class II diffusive reflectivity due to multiply scattered photons. This Monte Carlo-based simulator is implemented on graphic processing units (GPUs), using the Compute Unified Device Architecture (CUDA) platform and programming model, to exploit the parallel nature of propagation of photons in tissue. It models an arbitrary shaped sample medium as a tetrahedron-based mesh and uses an advanced importance sampling scheme. This new simulator speeds up simulations of OCT of inhomogeneous turbid media by about two orders of magnitude. To demonstrate this result, we have compared the computation times of our new parallel simulator and its serial counterpart using two samples of inhomogeneous turbid media. We have shown that our parallel implementation reduced simulation time of OCT of the first sample medium from 407 min to 92 min by using a single GPU card, to 12 min by using 8 GPU cards and to 7 min by using 16 GPU cards. For the second sample medium, the OCT simulation time was reduced from 209 h to 35.6 h by using a single GPU card, and to 4.65 h by using 8 GPU cards, and to only 2 h by using 16 GPU cards. Therefore our new parallel simulator is considerably more practical to use than its central processing unit (CPU)-based counterpart. Our new parallel OCT simulator could be a practical tool to study the different physical phenomena underlying OCT

  18. Multi-mode sensor processing on a dynamically reconfigurable massively parallel processor array

    Science.gov (United States)

    Chen, Paul; Butts, Mike; Budlong, Brad; Wasson, Paul

    2008-04-01

    This paper introduces a novel computing architecture that can be reconfigured in real time to adapt on demand to multi-mode sensor platforms' dynamic computational and functional requirements. This 1 teraOPS reconfigurable Massively Parallel Processor Array (MPPA) has 336 32-bit processors. The programmable 32-bit communication fabric provides streamlined inter-processor connections with deterministically high performance. Software programmability, scalability, ease of use, and fast reconfiguration time (ranging from microseconds to milliseconds) are the most significant advantages over FPGAs and DSPs. This paper introduces the MPPA architecture, its programming model, and methods of reconfigurability. An MPPA platform for reconfigurable computing is based on a structural object programming model. Objects are software programs running concurrently on hundreds of 32-bit RISC processors and memories. They exchange data and control through a network of self-synchronizing channels. A common application design pattern on this platform, called a work farm, is a parallel set of worker objects, with one input and one output stream. Statically configured work farms with homogeneous and heterogeneous sets of workers have been used in video compression and decompression, network processing, and graphics applications.

  19. Massively parallel electrical conductivity imaging of the subsurface: Applications to hydrocarbon exploration

    Science.gov (United States)

    Newman, Gregory A.; Commer, Michael

    2009-07-01

    Three-dimensional (3D) geophysical imaging is now receiving considerable attention for electrical conductivity mapping of potential offshore oil and gas reservoirs. The imaging technology employs controlled source electromagnetic (CSEM) and magnetotelluric (MT) fields and treats geological media exhibiting transverse anisotropy. Moreover when combined with established seismic methods, direct imaging of reservoir fluids is possible. Because of the size of the 3D conductivity imaging problem, strategies are required exploiting computational parallelism and optimal meshing. The algorithm thus developed has been shown to scale to tens of thousands of processors. In one imaging experiment, 32,768 tasks/processors on the IBM Watson Research Blue Gene/L supercomputer were successfully utilized. Over a 24 hour period we were able to image a large scale field data set that previously required over four months of processing time on distributed clusters based on Intel or AMD processors utilizing 1024 tasks on an InfiniBand fabric. Electrical conductivity imaging using massively parallel computational resources produces results that cannot be obtained otherwise and are consistent with timeframes required for practical exploration problems.

  20. Massively parallel electrical conductivity imaging of the subsurface: Applications to hydrocarbon exploration

    International Nuclear Information System (INIS)

    Newman, Gregory A; Commer, Michael

    2009-01-01

    Three-dimensional (3D) geophysical imaging is now receiving considerable attention for electrical conductivity mapping of potential offshore oil and gas reservoirs. The imaging technology employs controlled source electromagnetic (CSEM) and magnetotelluric (MT) fields and treats geological media exhibiting transverse anisotropy. Moreover when combined with established seismic methods, direct imaging of reservoir fluids is possible. Because of the size of the 3D conductivity imaging problem, strategies are required exploiting computational parallelism and optimal meshing. The algorithm thus developed has been shown to scale to tens of thousands of processors. In one imaging experiment, 32,768 tasks/processors on the IBM Watson Research Blue Gene/L supercomputer were successfully utilized. Over a 24 hour period we were able to image a large scale field data set that previously required over four months of processing time on distributed clusters based on Intel or AMD processors utilizing 1024 tasks on an InfiniBand fabric. Electrical conductivity imaging using massively parallel computational resources produces results that cannot be obtained otherwise and are consistent with timeframes required for practical exploration problems.

  1. Massive Exploration of Perturbed Conditions of the Blood Coagulation Cascade through GPU Parallelization

    Directory of Open Access Journals (Sweden)

    Paolo Cazzaniga

    2014-01-01

    high-performance computing solutions is motivated by the need of performing large numbers of in silico analysis to study the behavior of biological systems in different conditions, which necessitate a computing power that usually overtakes the capability of standard desktop computers. In this work we present coagSODA, a CUDA-powered computational tool that was purposely developed for the analysis of a large mechanistic model of the blood coagulation cascade (BCC, defined according to both mass-action kinetics and Hill functions. coagSODA allows the execution of parallel simulations of the dynamics of the BCC by automatically deriving the system of ordinary differential equations and then exploiting the numerical integration algorithm LSODA. We present the biological results achieved with a massive exploration of perturbed conditions of the BCC, carried out with one-dimensional and bi-dimensional parameter sweep analysis, and show that GPU-accelerated parallel simulations of this model can increase the computational performances up to a 181× speedup compared to the corresponding sequential simulations.

  2. My-Forensic-Loci-queries (MyFLq) framework for analysis of forensic STR data generated by massive parallel sequencing.

    Science.gov (United States)

    Van Neste, Christophe; Vandewoestyne, Mado; Van Criekinge, Wim; Deforce, Dieter; Van Nieuwerburgh, Filip

    2014-03-01

    Forensic scientists are currently investigating how to transition from capillary electrophoresis (CE) to massive parallel sequencing (MPS) for analysis of forensic DNA profiles. MPS offers several advantages over CE such as virtually unlimited multiplexy of loci, combining both short tandem repeat (STR) and single nucleotide polymorphism (SNP) loci, small amplicons without constraints of size separation, more discrimination power, deep mixture resolution and sample multiplexing. We present our bioinformatic framework My-Forensic-Loci-queries (MyFLq) for analysis of MPS forensic data. For allele calling, the framework uses a MySQL reference allele database with automatically determined regions of interest (ROIs) by a generic maximal flanking algorithm which makes it possible to use any STR or SNP forensic locus. Python scripts were designed to automatically make allele calls starting from raw MPS data. We also present a method to assess the usefulness and overall performance of a forensic locus with respect to MPS, as well as methods to estimate whether an unknown allele, which sequence is not present in the MySQL database, is in fact a new allele or a sequencing error. The MyFLq framework was applied to an Illumina MiSeq dataset of a forensic Illumina amplicon library, generated from multilocus STR polymerase chain reaction (PCR) on both single contributor samples and multiple person DNA mixtures. Although the multilocus PCR was not yet optimized for MPS in terms of amplicon length or locus selection, the results show excellent results for most loci. The results show a high signal-to-noise ratio, correct allele calls, and a low limit of detection for minor DNA contributors in mixed DNA samples. Technically, forensic MPS affords great promise for routine implementation in forensic genomics. The method is also applicable to adjacent disciplines such as molecular autopsy in legal medicine and in mitochondrial DNA research. Copyright © 2013 The Authors. Published by

  3. Revealing the Physics of Galactic Winds Through Massively-Parallel Hydrodynamics Simulations

    Science.gov (United States)

    Schneider, Evan Elizabeth

    This thesis documents the hydrodynamics code Cholla and a numerical study of multiphase galactic winds. Cholla is a massively-parallel, GPU-based code designed for astrophysical simulations that is freely available to the astrophysics community. A static-mesh Eulerian code, Cholla is ideally suited to carrying out massive simulations (> 20483 cells) that require very high resolution. The code incorporates state-of-the-art hydrodynamics algorithms including third-order spatial reconstruction, exact and linearized Riemann solvers, and unsplit integration algorithms that account for transverse fluxes on multidimensional grids. Operator-split radiative cooling and a dual-energy formalism for high mach number flows are also included. An extensive test suite demonstrates Cholla's superior ability to model shocks and discontinuities, while the GPU-native design makes the code extremely computationally efficient - speeds of 5-10 million cell updates per GPU-second are typical on current hardware for 3D simulations with all of the aforementioned physics. The latter half of this work comprises a comprehensive study of the mixing between a hot, supernova-driven wind and cooler clouds representative of those observed in multiphase galactic winds. Both adiabatic and radiatively-cooling clouds are investigated. The analytic theory of cloud-crushing is applied to the problem, and adiabatic turbulent clouds are found to be mixed with the hot wind on similar timescales as the classic spherical case (4-5 t cc) with an appropriate rescaling of the cloud-crushing time. Radiatively cooling clouds survive considerably longer, and the differences in evolution between turbulent and spherical clouds cannot be reconciled with a simple rescaling. The rapid incorporation of low-density material into the hot wind implies efficient mass-loading of hot phases of galactic winds. At the same time, the extreme compression of high-density cloud material leads to long-lived but slow-moving clumps

  4. Convergence analysis of a class of massively parallel direction splitting algorithms for the Navier-Stokes equations in simple domains

    KAUST Repository

    Guermond, Jean-Luc; Minev, Peter D.; Salgado, Abner J.

    2012-01-01

    We provide a convergence analysis for a new fractional timestepping technique for the incompressible Navier-Stokes equations based on direction splitting. This new technique is of linear complexity, unconditionally stable and convergent, and suitable for massive parallelization. © 2012 American Mathematical Society.

  5. A quantitative assessment of the Hadoop framework for analyzing massively parallel DNA sequencing data.

    Science.gov (United States)

    Siretskiy, Alexey; Sundqvist, Tore; Voznesenskiy, Mikhail; Spjuth, Ola

    2015-01-01

    New high-throughput technologies, such as massively parallel sequencing, have transformed the life sciences into a data-intensive field. The most common e-infrastructure for analyzing this data consists of batch systems that are based on high-performance computing resources; however, the bioinformatics software that is built on this platform does not scale well in the general case. Recently, the Hadoop platform has emerged as an interesting option to address the challenges of increasingly large datasets with distributed storage, distributed processing, built-in data locality, fault tolerance, and an appealing programming methodology. In this work we introduce metrics and report on a quantitative comparison between Hadoop and a single node of conventional high-performance computing resources for the tasks of short read mapping and variant calling. We calculate efficiency as a function of data size and observe that the Hadoop platform is more efficient for biologically relevant data sizes in terms of computing hours for both split and un-split data files. We also quantify the advantages of the data locality provided by Hadoop for NGS problems, and show that a classical architecture with network-attached storage will not scale when computing resources increase in numbers. Measurements were performed using ten datasets of different sizes, up to 100 gigabases, using the pipeline implemented in Crossbow. To make a fair comparison, we implemented an improved preprocessor for Hadoop with better performance for splittable data files. For improved usability, we implemented a graphical user interface for Crossbow in a private cloud environment using the CloudGene platform. All of the code and data in this study are freely available as open source in public repositories. From our experiments we can conclude that the improved Hadoop pipeline scales better than the same pipeline on high-performance computing resources, we also conclude that Hadoop is an economically viable

  6. Wideband aperture array using RF channelizers and massively parallel digital 2D IIR filterbank

    Science.gov (United States)

    Sengupta, Arindam; Madanayake, Arjuna; Gómez-García, Roberto; Engeberg, Erik D.

    2014-05-01

    Wideband receive-mode beamforming applications in wireless location, electronically-scanned antennas for radar, RF sensing, microwave imaging and wireless communications require digital aperture arrays that offer a relatively constant far-field beam over several octaves of bandwidth. Several beamforming schemes including the well-known true time-delay and the phased array beamformers have been realized using either finite impulse response (FIR) or fast Fourier transform (FFT) digital filter-sum based techniques. These beamforming algorithms offer the desired selectivity at the cost of a high computational complexity and frequency-dependant far-field array patterns. A novel approach to receiver beamforming is the use of massively parallel 2-D infinite impulse response (IIR) fan filterbanks for the synthesis of relatively frequency independent RF beams at an order of magnitude lower multiplier complexity compared to FFT or FIR filter based conventional algorithms. The 2-D IIR filterbanks demand fast digital processing that can support several octaves of RF bandwidth, fast analog-to-digital converters (ADCs) for RF-to-bits type direct conversion of wideband antenna element signals. Fast digital implementation platforms that can realize high-precision recursive filter structures necessary for real-time beamforming, at RF radio bandwidths, are also desired. We propose a novel technique that combines a passive RF channelizer, multichannel ADC technology, and single-phase massively parallel 2-D IIR digital fan filterbanks, realized at low complexity using FPGA and/or ASIC technology. There exists native support for a larger bandwidth than the maximum clock frequency of the digital implementation technology. We also strive to achieve More-than-Moore throughput by processing a wideband RF signal having content with N-fold (B = N Fclk/2) bandwidth compared to the maximum clock frequency Fclk Hz of the digital VLSI platform under consideration. Such increase in bandwidth is

  7. Comparison of Pre-Analytical FFPE Sample Preparation Methods and Their Impact on Massively Parallel Sequencing in Routine Diagnostics

    Science.gov (United States)

    Heydt, Carina; Fassunke, Jana; Künstlinger, Helen; Ihle, Michaela Angelika; König, Katharina; Heukamp, Lukas Carl; Schildhaus, Hans-Ulrich; Odenthal, Margarete; Büttner, Reinhard; Merkelbach-Bruse, Sabine

    2014-01-01

    Over the last years, massively parallel sequencing has rapidly evolved and has now transitioned into molecular pathology routine laboratories. It is an attractive platform for analysing multiple genes at the same time with very little input material. Therefore, the need for high quality DNA obtained from automated DNA extraction systems has increased, especially to those laboratories which are dealing with formalin-fixed paraffin-embedded (FFPE) material and high sample throughput. This study evaluated five automated FFPE DNA extraction systems as well as five DNA quantification systems using the three most common techniques, UV spectrophotometry, fluorescent dye-based quantification and quantitative PCR, on 26 FFPE tissue samples. Additionally, the effects on downstream applications were analysed to find the most suitable pre-analytical methods for massively parallel sequencing in routine diagnostics. The results revealed that the Maxwell 16 from Promega (Mannheim, Germany) seems to be the superior system for DNA extraction from FFPE material. The extracts had a 1.3–24.6-fold higher DNA concentration in comparison to the other extraction systems, a higher quality and were most suitable for downstream applications. The comparison of the five quantification methods showed intermethod variations but all methods could be used to estimate the right amount for PCR amplification and for massively parallel sequencing. Interestingly, the best results in massively parallel sequencing were obtained with a DNA input of 15 ng determined by the NanoDrop 2000c spectrophotometer (Thermo Fisher Scientific, Waltham, MA, USA). No difference could be detected in mutation analysis based on the results of the quantification methods. These findings emphasise, that it is particularly important to choose the most reliable and constant DNA extraction system, especially when using small biopsies and low elution volumes, and that all common DNA quantification techniques can be used for

  8. Comparison of pre-analytical FFPE sample preparation methods and their impact on massively parallel sequencing in routine diagnostics.

    Directory of Open Access Journals (Sweden)

    Carina Heydt

    Full Text Available Over the last years, massively parallel sequencing has rapidly evolved and has now transitioned into molecular pathology routine laboratories. It is an attractive platform for analysing multiple genes at the same time with very little input material. Therefore, the need for high quality DNA obtained from automated DNA extraction systems has increased, especially to those laboratories which are dealing with formalin-fixed paraffin-embedded (FFPE material and high sample throughput. This study evaluated five automated FFPE DNA extraction systems as well as five DNA quantification systems using the three most common techniques, UV spectrophotometry, fluorescent dye-based quantification and quantitative PCR, on 26 FFPE tissue samples. Additionally, the effects on downstream applications were analysed to find the most suitable pre-analytical methods for massively parallel sequencing in routine diagnostics. The results revealed that the Maxwell 16 from Promega (Mannheim, Germany seems to be the superior system for DNA extraction from FFPE material. The extracts had a 1.3-24.6-fold higher DNA concentration in comparison to the other extraction systems, a higher quality and were most suitable for downstream applications. The comparison of the five quantification methods showed intermethod variations but all methods could be used to estimate the right amount for PCR amplification and for massively parallel sequencing. Interestingly, the best results in massively parallel sequencing were obtained with a DNA input of 15 ng determined by the NanoDrop 2000c spectrophotometer (Thermo Fisher Scientific, Waltham, MA, USA. No difference could be detected in mutation analysis based on the results of the quantification methods. These findings emphasise, that it is particularly important to choose the most reliable and constant DNA extraction system, especially when using small biopsies and low elution volumes, and that all common DNA quantification techniques can

  9. Hierarchical Image Segmentation of Remotely Sensed Data using Massively Parallel GNU-LINUX Software

    Science.gov (United States)

    Tilton, James C.

    2003-01-01

    A hierarchical set of image segmentations is a set of several image segmentations of the same image at different levels of detail in which the segmentations at coarser levels of detail can be produced from simple merges of regions at finer levels of detail. In [1], Tilton, et a1 describes an approach for producing hierarchical segmentations (called HSEG) and gave a progress report on exploiting these hierarchical segmentations for image information mining. The HSEG algorithm is a hybrid of region growing and constrained spectral clustering that produces a hierarchical set of image segmentations based on detected convergence points. In the main, HSEG employs the hierarchical stepwise optimization (HSWO) approach to region growing, which was described as early as 1989 by Beaulieu and Goldberg. The HSWO approach seeks to produce segmentations that are more optimized than those produced by more classic approaches to region growing (e.g. Horowitz and T. Pavlidis, [3]). In addition, HSEG optionally interjects between HSWO region growing iterations, merges between spatially non-adjacent regions (i.e., spectrally based merging or clustering) constrained by a threshold derived from the previous HSWO region growing iteration. While the addition of constrained spectral clustering improves the utility of the segmentation results, especially for larger images, it also significantly increases HSEG s computational requirements. To counteract this, a computationally efficient recursive, divide-and-conquer, implementation of HSEG (RHSEG) was devised, which includes special code to avoid processing artifacts caused by RHSEG s recursive subdivision of the image data. The recursive nature of RHSEG makes for a straightforward parallel implementation. This paper describes the HSEG algorithm, its recursive formulation (referred to as RHSEG), and the implementation of RHSEG using massively parallel GNU-LINUX software. Results with Landsat TM data are included comparing RHSEG with classic

  10. Practical tools to implement massive parallel pyrosequencing of PCR products in next generation molecular diagnostics.

    Directory of Open Access Journals (Sweden)

    Kim De Leeneer

    Full Text Available Despite improvements in terms of sequence quality and price per basepair, Sanger sequencing remains restricted to screening of individual disease genes. The development of massively parallel sequencing (MPS technologies heralded an era in which molecular diagnostics for multigenic disorders becomes reality. Here, we outline different PCR amplification based strategies for the screening of a multitude of genes in a patient cohort. We performed a thorough evaluation in terms of set-up, coverage and sequencing variants on the data of 10 GS-FLX experiments (over 200 patients. Crucially, we determined the actual coverage that is required for reliable diagnostic results using MPS, and provide a tool to calculate the number of patients that can be screened in a single run. Finally, we provide an overview of factors contributing to false negative or false positive mutation calls and suggest ways to maximize sensitivity and specificity, both important in a routine setting. By describing practical strategies for screening of multigenic disorders in a multitude of samples and providing answers to questions about minimum required coverage, the number of patients that can be screened in a single run and the factors that may affect sensitivity and specificity we hope to facilitate the implementation of MPS technology in molecular diagnostics.

  11. Massively parallel unsupervised single-particle cryo-EM data clustering via statistical manifold learning.

    Science.gov (United States)

    Wu, Jiayi; Ma, Yong-Bei; Congdon, Charles; Brett, Bevin; Chen, Shuobing; Xu, Yaofang; Ouyang, Qi; Mao, Youdong

    2017-01-01

    Structural heterogeneity in single-particle cryo-electron microscopy (cryo-EM) data represents a major challenge for high-resolution structure determination. Unsupervised classification may serve as the first step in the assessment of structural heterogeneity. However, traditional algorithms for unsupervised classification, such as K-means clustering and maximum likelihood optimization, may classify images into wrong classes with decreasing signal-to-noise-ratio (SNR) in the image data, yet demand increased computational costs. Overcoming these limitations requires further development of clustering algorithms for high-performance cryo-EM data processing. Here we introduce an unsupervised single-particle clustering algorithm derived from a statistical manifold learning framework called generative topographic mapping (GTM). We show that unsupervised GTM clustering improves classification accuracy by about 40% in the absence of input references for data with lower SNRs. Applications to several experimental datasets suggest that our algorithm can detect subtle structural differences among classes via a hierarchical clustering strategy. After code optimization over a high-performance computing (HPC) environment, our software implementation was able to generate thousands of reference-free class averages within hours in a massively parallel fashion, which allows a significant improvement on ab initio 3D reconstruction and assists in the computational purification of homogeneous datasets for high-resolution visualization.

  12. A massive parallel sequencing workflow for diagnostic genetic testing of mismatch repair genes

    Science.gov (United States)

    Hansen, Maren F; Neckmann, Ulrike; Lavik, Liss A S; Vold, Trine; Gilde, Bodil; Toft, Ragnhild K; Sjursen, Wenche

    2014-01-01

    The purpose of this study was to develop a massive parallel sequencing (MPS) workflow for diagnostic analysis of mismatch repair (MMR) genes using the GS Junior system (Roche). A pathogenic variant in one of four MMR genes, (MLH1, PMS2, MSH6, and MSH2), is the cause of Lynch Syndrome (LS), which mainly predispose to colorectal cancer. We used an amplicon-based sequencing method allowing specific and preferential amplification of the MMR genes including PMS2, of which several pseudogenes exist. The amplicons were pooled at different ratios to obtain coverage uniformity and maximize the throughput of a single-GS Junior run. In total, 60 previously identified and distinct variants (substitutions and indels), were sequenced by MPS and successfully detected. The heterozygote detection range was from 19% to 63% and dependent on sequence context and coverage. We were able to distinguish between false-positive and true-positive calls in homopolymeric regions by cross-sample comparison and evaluation of flow signal distributions. In addition, we filtered variants according to a predefined status, which facilitated variant annotation. Our study shows that implementation of MPS in routine diagnostics of LS can accelerate sample throughput and reduce costs without compromising sensitivity, compared to Sanger sequencing. PMID:24689082

  13. Integrated massively parallel sequencing of 15 autosomal STRs and Amelogenin using a simplified library preparation approach.

    Science.gov (United States)

    Xue, Jian; Wu, Riga; Pan, Yajiao; Wang, Shunxia; Qu, Baowang; Qin, Ying; Shi, Yuequn; Zhang, Chuchu; Li, Ran; Zhang, Liyan; Zhou, Cheng; Sun, Hongyu

    2018-04-02

    Massively parallel sequencing (MPS) technologies, also termed as next-generation sequencing (NGS), are becoming increasingly popular in study of short tandem repeats (STR). However, current library preparation methods are usually based on ligation or two-round PCR that requires more steps, making it time-consuming (about 2 days), laborious and expensive. In this study, a 16-plex STR typing system was designed with fusion primer strategy based on the Ion Torrent S5 XL platform which could effectively resolve the above challenges for forensic DNA database-type samples (bloodstains, saliva stains, etc.). The efficiency of this system was tested in 253 Han Chinese participants. The libraries were prepared without DNA isolation and adapter ligation, and the whole process only required approximately 5 h. The proportion of thoroughly genotyped samples in which all the 16 loci were successfully genotyped was 86% (220/256). Of the samples, 99.7% showed 100% concordance between NGS-based STR typing and capillary electrophoresis (CE)-based STR typing. The inconsistency might have been caused by off-ladder alleles and mutations in primer binding sites. Overall, this panel enabled the large-scale genotyping of the DNA samples with controlled quality and quantity because it is a simple, operation-friendly process flow that saves labor, time and costs. © 2018 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  14. Characterization of the Zoarces viviparus liver transcriptome using massively parallel pyrosequencing

    Directory of Open Access Journals (Sweden)

    Asker Noomi

    2009-07-01

    Full Text Available Abstract Background The teleost Zoarces viviparus (eelpout lives along the coasts of Northern Europe and has long been an established model organism for marine ecology and environmental monitoring. The scarce information about this species genome has however restrained the use of efficient molecular-level assays, such as gene expression microarrays. Results In the present study we present the first comprehensive characterization of the Zoarces viviparus liver transcriptome. From 400,000 reads generated by massively parallel pyrosequencing, more than 50,000 pieces of putative transcripts were assembled, annotated and functionally classified. The data was estimated to cover roughly 40% of the total transcriptome and homologues for about half of the genes of Gasterosteus aculeatus (stickleback were identified. The sequence data was consequently used to design an oligonucleotide microarray for large-scale gene expression analysis. Conclusion Our results show that one run using a Genome Sequencer FLX from 454 Life Science/Roche generates enough genomic information for adequate de novo assembly of a large number of genes in a higher vertebrate. The generated sequence data, including the validated microarray probes, are publicly available to promote genome-wide research in Zoarces viviparus.

  15. Massively parallel unsupervised single-particle cryo-EM data clustering via statistical manifold learning.

    Directory of Open Access Journals (Sweden)

    Jiayi Wu

    Full Text Available Structural heterogeneity in single-particle cryo-electron microscopy (cryo-EM data represents a major challenge for high-resolution structure determination. Unsupervised classification may serve as the first step in the assessment of structural heterogeneity. However, traditional algorithms for unsupervised classification, such as K-means clustering and maximum likelihood optimization, may classify images into wrong classes with decreasing signal-to-noise-ratio (SNR in the image data, yet demand increased computational costs. Overcoming these limitations requires further development of clustering algorithms for high-performance cryo-EM data processing. Here we introduce an unsupervised single-particle clustering algorithm derived from a statistical manifold learning framework called generative topographic mapping (GTM. We show that unsupervised GTM clustering improves classification accuracy by about 40% in the absence of input references for data with lower SNRs. Applications to several experimental datasets suggest that our algorithm can detect subtle structural differences among classes via a hierarchical clustering strategy. After code optimization over a high-performance computing (HPC environment, our software implementation was able to generate thousands of reference-free class averages within hours in a massively parallel fashion, which allows a significant improvement on ab initio 3D reconstruction and assists in the computational purification of homogeneous datasets for high-resolution visualization.

  16. GRay: A MASSIVELY PARALLEL GPU-BASED CODE FOR RAY TRACING IN RELATIVISTIC SPACETIMES

    Energy Technology Data Exchange (ETDEWEB)

    Chan, Chi-kwan; Psaltis, Dimitrios; Özel, Feryal [Department of Astronomy, University of Arizona, 933 N. Cherry Ave., Tucson, AZ 85721 (United States)

    2013-11-01

    We introduce GRay, a massively parallel integrator designed to trace the trajectories of billions of photons in a curved spacetime. This graphics-processing-unit (GPU)-based integrator employs the stream processing paradigm, is implemented in CUDA C/C++, and runs on nVidia graphics cards. The peak performance of GRay using single-precision floating-point arithmetic on a single GPU exceeds 300 GFLOP (or 1 ns per photon per time step). For a realistic problem, where the peak performance cannot be reached, GRay is two orders of magnitude faster than existing central-processing-unit-based ray-tracing codes. This performance enhancement allows more effective searches of large parameter spaces when comparing theoretical predictions of images, spectra, and light curves from the vicinities of compact objects to observations. GRay can also perform on-the-fly ray tracing within general relativistic magnetohydrodynamic algorithms that simulate accretion flows around compact objects. Making use of this algorithm, we calculate the properties of the shadows of Kerr black holes and the photon rings that surround them. We also provide accurate fitting formulae of their dependencies on black hole spin and observer inclination, which can be used to interpret upcoming observations of the black holes at the center of the Milky Way, as well as M87, with the Event Horizon Telescope.

  17. Frequency of Usher syndrome type 1 in deaf children by massively parallel DNA sequencing.

    Science.gov (United States)

    Yoshimura, Hidekane; Miyagawa, Maiko; Kumakawa, Kozo; Nishio, Shin-Ya; Usami, Shin-Ichi

    2016-05-01

    Usher syndrome type 1 (USH1) is the most severe of the three USH subtypes due to its profound hearing loss, absent vestibular response and retinitis pigmentosa appearing at a prepubescent age. Six causative genes have been identified for USH1, making early diagnosis and therapy possible through DNA testing. Targeted exon sequencing of selected genes using massively parallel DNA sequencing (MPS) technology enables clinicians to systematically tackle previously intractable monogenic disorders and improve molecular diagnosis. Using MPS along with direct sequence analysis, we screened 227 unrelated non-syndromic deaf children and detected recessive mutations in USH1 causative genes in five patients (2.2%): three patients harbored MYO7A mutations and one each carried CDH23 or PCDH15 mutations. As indicated by an earlier genotype-phenotype correlation study of the CDH23 and PCDH15 genes, we considered the latter two patients to have USH1. Based on clinical findings, it was also highly likely that one patient with MYO7A mutations possessed USH1 due to a late onset age of walking. This first report describing the frequency (1.3-2.2%) of USH1 among non-syndromic deaf children highlights the importance of comprehensive genetic testing for early disease diagnosis.

  18. Molecular diagnosis of glycogen storage disease and disorders with overlapping clinical symptoms by massive parallel sequencing.

    Science.gov (United States)

    Vega, Ana I; Medrano, Celia; Navarrete, Rosa; Desviat, Lourdes R; Merinero, Begoña; Rodríguez-Pombo, Pilar; Vitoria, Isidro; Ugarte, Magdalena; Pérez-Cerdá, Celia; Pérez, Belen

    2016-10-01

    Glycogen storage disease (GSD) is an umbrella term for a group of genetic disorders that involve the abnormal metabolism of glycogen; to date, 23 types of GSD have been identified. The nonspecific clinical presentation of GSD and the lack of specific biomarkers mean that Sanger sequencing is now widely relied on for making a diagnosis. However, this gene-by-gene sequencing technique is both laborious and costly, which is a consequence of the number of genes to be sequenced and the large size of some genes. This work reports the use of massive parallel sequencing to diagnose patients at our laboratory in Spain using either a customized gene panel (targeted exome sequencing) or the Illumina Clinical-Exome TruSight One Gene Panel (clinical exome sequencing (CES)). Sequence variants were matched against biochemical and clinical hallmarks. Pathogenic mutations were detected in 23 patients. Twenty-two mutations were recognized (mostly loss-of-function mutations), including 11 that were novel in GSD-associated genes. In addition, CES detected five patients with mutations in ALDOB, LIPA, NKX2-5, CPT2, or ANO5. Although these genes are not involved in GSD, they are associated with overlapping phenotypic characteristics such as hepatic, muscular, and cardiac dysfunction. These results show that next-generation sequencing, in combination with the detection of biochemical and clinical hallmarks, provides an accurate, high-throughput means of making genetic diagnoses of GSD and related diseases.Genet Med 18 10, 1037-1043.

  19. A safe an easy method for building consensus HIV sequences from 454 massively parallel sequencing data.

    Science.gov (United States)

    Fernández-Caballero Rico, Jose Ángel; Chueca Porcuna, Natalia; Álvarez Estévez, Marta; Mosquera Gutiérrez, María Del Mar; Marcos Maeso, María Ángeles; García, Federico

    2018-02-01

    To show how to generate a consensus sequence from the information of massive parallel sequences data obtained from routine HIV anti-retroviral resistance studies, and that may be suitable for molecular epidemiology studies. Paired Sanger (Trugene-Siemens) and next-generation sequencing (NGS) (454 GSJunior-Roche) HIV RT and protease sequences from 62 patients were studied. NGS consensus sequences were generated using Mesquite, using 10%, 15%, and 20% thresholds. Molecular evolutionary genetics analysis (MEGA) was used for phylogenetic studies. At a 10% threshold, NGS-Sanger sequences from 17/62 patients were phylogenetically related, with a median bootstrap-value of 88% (IQR83.5-95.5). Association increased to 36/62 sequences, median bootstrap 94% (IQR85.5-98)], using a 15% threshold. Maximum association was at the 20% threshold, with 61/62 sequences associated, and a median bootstrap value of 99% (IQR98-100). A safe method is presented to generate consensus sequences from HIV-NGS data at 20% threshold, which will prove useful for molecular epidemiological studies. Copyright © 2016 Elsevier España, S.L.U. and Sociedad Española de Enfermedades Infecciosas y Microbiología Clínica. All rights reserved.

  20. High fidelity thermal-hydraulic analysis using CFD and massively parallel computers

    International Nuclear Information System (INIS)

    Weber, D.P.; Wei, T.Y.C.; Brewster, R.A.; Rock, Daniel T.; Rizwan-uddin

    2000-01-01

    Thermal-hydraulic analyses play an important role in design and reload analysis of nuclear power plants. These analyses have historically relied on early generation computational fluid dynamics capabilities, originally developed in the 1960s and 1970s. Over the last twenty years, however, dramatic improvements in both computational fluid dynamics codes in the commercial sector and in computing power have taken place. These developments offer the possibility of performing large scale, high fidelity, core thermal hydraulics analysis. Such analyses will allow a determination of the conservatism employed in traditional design approaches and possibly justify the operation of nuclear power systems at higher powers without compromising safety margins. The objective of this work is to demonstrate such a large scale analysis approach using a state of the art CFD code, STAR-CD, and the computing power of massively parallel computers, provided by IBM. A high fidelity representation of a current generation PWR was analyzed with the STAR-CD CFD code and the results were compared to traditional analyses based on the VIPRE code. Current design methodology typically involves a simplified representation of the assemblies, where a single average pin is used in each assembly to determine the hot assembly from a whole core analysis. After determining this assembly, increased refinement is used in the hot assembly, and possibly some of its neighbors, to refine the analysis for purposes of calculating DNBR. This latter calculation is performed with sub-channel codes such as VIPRE. The modeling simplifications that are used involve the approximate treatment of surrounding assemblies and coarse representation of the hot assembly, where the subchannel is the lowest level of discretization. In the high fidelity analysis performed in this study, both restrictions have been removed. Within the hot assembly, several hundred thousand to several million computational zones have been used, to

  1. Application of massively parallel sequencing to genetic diagnosis in multiplex families with idiopathic sensorineural hearing impairment.

    Directory of Open Access Journals (Sweden)

    Chen-Chi Wu

    Full Text Available Despite the clinical utility of genetic diagnosis to address idiopathic sensorineural hearing impairment (SNHI, the current strategy for screening mutations via Sanger sequencing suffers from the limitation that only a limited number of DNA fragments associated with common deafness mutations can be genotyped. Consequently, a definitive genetic diagnosis cannot be achieved in many families with discernible family history. To investigate the diagnostic utility of massively parallel sequencing (MPS, we applied the MPS technique to 12 multiplex families with idiopathic SNHI in which common deafness mutations had previously been ruled out. NimbleGen sequence capture array was designed to target all protein coding sequences (CDSs and 100 bp of the flanking sequence of 80 common deafness genes. We performed MPS on the Illumina HiSeq2000, and applied BWA, SAMtools, Picard, GATK, Variant Tools, ANNOVAR, and IGV for bioinformatics analyses. Initial data filtering with allele frequencies (0.95 prioritized 5 indels (insertions/deletions and 36 missense variants in the 12 multiplex families. After further validation by Sanger sequencing, segregation pattern, and evolutionary conservation of amino acid residues, we identified 4 variants in 4 different genes, which might lead to SNHI in 4 families compatible with autosomal dominant inheritance. These included GJB2 p.R75Q, MYO7A p.T381M, KCNQ4 p.S680F, and MYH9 p.E1256K. Among them, KCNQ4 p.S680F and MYH9 p.E1256K were novel. In conclusion, MPS allows genetic diagnosis in multiplex families with idiopathic SNHI by detecting mutations in relatively uncommon deafness genes.

  2. Gene discovery using massively parallel pyrosequencing to develop ESTs for the flesh fly Sarcophaga crassipalpis

    Directory of Open Access Journals (Sweden)

    Hahn Daniel A

    2009-05-01

    Full Text Available Abstract Background Flesh flies in the genus Sarcophaga are important models for investigating endocrinology, diapause, cold hardiness, reproduction, and immunity. Despite the prominence of Sarcophaga flesh flies as models for insect physiology and biochemistry, and in forensic studies, little genomic or transcriptomic data are available for members of this genus. We used massively parallel pyrosequencing on the Roche 454-FLX platform to produce a substantial EST dataset for the flesh fly Sarcophaga crassipalpis. To maximize sequence diversity, we pooled RNA extracted from whole bodies of all life stages and normalized the cDNA pool after reverse transcription. Results We obtained 207,110 ESTs with an average read length of 241 bp. These reads assembled into 20,995 contigs and 31,056 singletons. Using BLAST searches of the NR and NT databases we were able to identify 11,757 unique gene elements (ES. crassipalpis unigenes among GO Biological Process functional groups with that of the Drosophila melanogaster transcriptome suggests that our ESTs are broadly representative of the flesh fly transcriptome. Insertion and deletion errors in 454 sequencing present a serious hurdle to comparative transcriptome analysis. Aided by a new approach to correcting for these errors, we performed a comparative analysis of genetic divergence across GO categories among S. crassipalpis, D. melanogaster, and Anopheles gambiae. The results suggest that non-synonymous substitutions occur at similar rates across categories, although genes related to response to stimuli may evolve slightly faster. In addition, we identified over 500 potential microsatellite loci and more than 12,000 SNPs among our ESTs. Conclusion Our data provides the first large-scale EST-project for flesh flies, a much-needed resource for exploring this model species. In addition, we identified a large number of potential microsatellite and SNP markers that could be used in population and systematic

  3. QuASAR-MPRA: accurate allele-specific analysis for massively parallel reporter assays.

    Science.gov (United States)

    Kalita, Cynthia A; Moyerbrailean, Gregory A; Brown, Christopher; Wen, Xiaoquan; Luca, Francesca; Pique-Regi, Roger

    2018-03-01

    The majority of the human genome is composed of non-coding regions containing regulatory elements such as enhancers, which are crucial for controlling gene expression. Many variants associated with complex traits are in these regions, and may disrupt gene regulatory sequences. Consequently, it is important to not only identify true enhancers but also to test if a variant within an enhancer affects gene regulation. Recently, allele-specific analysis in high-throughput reporter assays, such as massively parallel reporter assays (MPRAs), have been used to functionally validate non-coding variants. However, we are still missing high-quality and robust data analysis tools for these datasets. We have further developed our method for allele-specific analysis QuASAR (quantitative allele-specific analysis of reads) to analyze allele-specific signals in barcoded read counts data from MPRA. Using this approach, we can take into account the uncertainty on the original plasmid proportions, over-dispersion, and sequencing errors. The provided allelic skew estimate and its standard error also simplifies meta-analysis of replicate experiments. Additionally, we show that a beta-binomial distribution better models the variability present in the allelic imbalance of these synthetic reporters and results in a test that is statistically well calibrated under the null. Applying this approach to the MPRA data, we found 602 SNPs with significant (false discovery rate 10%) allele-specific regulatory function in LCLs. We also show that we can combine MPRA with QuASAR estimates to validate existing experimental and computational annotations of regulatory variants. Our study shows that with appropriate data analysis tools, we can improve the power to detect allelic effects in high-throughput reporter assays. http://github.com/piquelab/QuASAR/tree/master/mpra. fluca@wayne.edu or rpique@wayne.edu. Supplementary data are available online at Bioinformatics. © The Author (2017). Published by

  4. Comprehensive microRNA profiling in B-cells of human centenarians by massively parallel sequencing

    Directory of Open Access Journals (Sweden)

    Gombar Saurabh

    2012-07-01

    Full Text Available Abstract Background MicroRNAs (miRNAs are small, non-coding RNAs that regulate gene expression and play a critical role in development, homeostasis, and disease. Despite their demonstrated roles in age-associated pathologies, little is known about the role of miRNAs in human aging and longevity. Results We employed massively parallel sequencing technology to identify miRNAs expressed in B-cells from Ashkenazi Jewish centenarians, i.e., those living to a hundred and a human model of exceptional longevity, and younger controls without a family history of longevity. With data from 26.7 million reads comprising 9.4 × 108 bp from 3 centenarian and 3 control individuals, we discovered a total of 276 known miRNAs and 8 unknown miRNAs ranging several orders of magnitude in expression levels, a typical characteristics of saturated miRNA-sequencing. A total of 22 miRNAs were found to be significantly upregulated, with only 2 miRNAs downregulated, in centenarians as compared to controls. Gene Ontology analysis of the predicted and validated targets of the 24 differentially expressed miRNAs indicated enrichment of functional pathways involved in cell metabolism, cell cycle, cell signaling, and cell differentiation. A cross sectional expression analysis of the differentially expressed miRNAs in B-cells from Ashkenazi Jewish individuals between the 50th and 100th years of age indicated that expression levels of miR-363* declined significantly with age. Centenarians, however, maintained the youthful expression level. This result suggests that miR-363* may be a candidate longevity-associated miRNA. Conclusion Our comprehensive miRNA data provide a resource for further studies to identify genetic pathways associated with aging and longevity in humans.

  5. Genotypic tropism testing by massively parallel sequencing: qualitative and quantitative analysis

    Directory of Open Access Journals (Sweden)

    Thiele Bernhard

    2011-05-01

    Full Text Available Abstract Background Inferring viral tropism from genotype is a fast and inexpensive alternative to phenotypic testing. While being highly predictive when performed on clonal samples, sensitivity of predicting CXCR4-using (X4 variants drops substantially in clinical isolates. This is mainly attributed to minor variants not detected by standard bulk-sequencing. Massively parallel sequencing (MPS detects single clones thereby being much more sensitive. Using this technology we wanted to improve genotypic prediction of coreceptor usage. Methods Plasma samples from 55 antiretroviral-treated patients tested for coreceptor usage with the Monogram Trofile Assay were sequenced with standard population-based approaches. Fourteen of these samples were selected for further analysis with MPS. Tropism was predicted from each sequence with geno2pheno[coreceptor]. Results Prediction based on bulk-sequencing yielded 59.1% sensitivity and 90.9% specificity compared to the trofile assay. With MPS, 7600 reads were generated on average per isolate. Minorities of sequences with high confidence in CXCR4-usage were found in all samples, irrespective of phenotype. When using the default false-positive-rate of geno2pheno[coreceptor] (10%, and defining a minority cutoff of 5%, the results were concordant in all but one isolate. Conclusions The combination of MPS and coreceptor usage prediction results in a fast and accurate alternative to phenotypic assays. The detection of X4-viruses in all isolates suggests that coreceptor usage as well as fitness of minorities is important for therapy outcome. The high sensitivity of this technology in combination with a quantitative description of the viral population may allow implementing meaningful cutoffs for predicting response to CCR5-antagonists in the presence of X4-minorities.

  6. Genotypic tropism testing by massively parallel sequencing: qualitative and quantitative analysis.

    Science.gov (United States)

    Däumer, Martin; Kaiser, Rolf; Klein, Rolf; Lengauer, Thomas; Thiele, Bernhard; Thielen, Alexander

    2011-05-13

    Inferring viral tropism from genotype is a fast and inexpensive alternative to phenotypic testing. While being highly predictive when performed on clonal samples, sensitivity of predicting CXCR4-using (X4) variants drops substantially in clinical isolates. This is mainly attributed to minor variants not detected by standard bulk-sequencing. Massively parallel sequencing (MPS) detects single clones thereby being much more sensitive. Using this technology we wanted to improve genotypic prediction of coreceptor usage. Plasma samples from 55 antiretroviral-treated patients tested for coreceptor usage with the Monogram Trofile Assay were sequenced with standard population-based approaches. Fourteen of these samples were selected for further analysis with MPS. Tropism was predicted from each sequence with geno2pheno[coreceptor]. Prediction based on bulk-sequencing yielded 59.1% sensitivity and 90.9% specificity compared to the trofile assay. With MPS, 7600 reads were generated on average per isolate. Minorities of sequences with high confidence in CXCR4-usage were found in all samples, irrespective of phenotype. When using the default false-positive-rate of geno2pheno[coreceptor] (10%), and defining a minority cutoff of 5%, the results were concordant in all but one isolate. The combination of MPS and coreceptor usage prediction results in a fast and accurate alternative to phenotypic assays. The detection of X4-viruses in all isolates suggests that coreceptor usage as well as fitness of minorities is important for therapy outcome. The high sensitivity of this technology in combination with a quantitative description of the viral population may allow implementing meaningful cutoffs for predicting response to CCR5-antagonists in the presence of X4-minorities.

  7. Optimizing a massive parallel sequencing workflow for quantitative miRNA expression analysis.

    Directory of Open Access Journals (Sweden)

    Francesca Cordero

    Full Text Available BACKGROUND: Massive Parallel Sequencing methods (MPS can extend and improve the knowledge obtained by conventional microarray technology, both for mRNAs and short non-coding RNAs, e.g. miRNAs. The processing methods used to extract and interpret the information are an important aspect of dealing with the vast amounts of data generated from short read sequencing. Although the number of computational tools for MPS data analysis is constantly growing, their strengths and weaknesses as part of a complex analytical pipe-line have not yet been well investigated. PRIMARY FINDINGS: A benchmark MPS miRNA dataset, resembling a situation in which miRNAs are spiked in biological replication experiments was assembled by merging a publicly available MPS spike-in miRNAs data set with MPS data derived from healthy donor peripheral blood mononuclear cells. Using this data set we observed that short reads counts estimation is strongly under estimated in case of duplicates miRNAs, if whole genome is used as reference. Furthermore, the sensitivity of miRNAs detection is strongly dependent by the primary tool used in the analysis. Within the six aligners tested, specifically devoted to miRNA detection, SHRiMP and MicroRazerS show the highest sensitivity. Differential expression estimation is quite efficient. Within the five tools investigated, two of them (DESseq, baySeq show a very good specificity and sensitivity in the detection of differential expression. CONCLUSIONS: The results provided by our analysis allow the definition of a clear and simple analytical optimized workflow for miRNAs digital quantitative analysis.

  8. Optimizing a massive parallel sequencing workflow for quantitative miRNA expression analysis.

    Science.gov (United States)

    Cordero, Francesca; Beccuti, Marco; Arigoni, Maddalena; Donatelli, Susanna; Calogero, Raffaele A

    2012-01-01

    Massive Parallel Sequencing methods (MPS) can extend and improve the knowledge obtained by conventional microarray technology, both for mRNAs and short non-coding RNAs, e.g. miRNAs. The processing methods used to extract and interpret the information are an important aspect of dealing with the vast amounts of data generated from short read sequencing. Although the number of computational tools for MPS data analysis is constantly growing, their strengths and weaknesses as part of a complex analytical pipe-line have not yet been well investigated. A benchmark MPS miRNA dataset, resembling a situation in which miRNAs are spiked in biological replication experiments was assembled by merging a publicly available MPS spike-in miRNAs data set with MPS data derived from healthy donor peripheral blood mononuclear cells. Using this data set we observed that short reads counts estimation is strongly under estimated in case of duplicates miRNAs, if whole genome is used as reference. Furthermore, the sensitivity of miRNAs detection is strongly dependent by the primary tool used in the analysis. Within the six aligners tested, specifically devoted to miRNA detection, SHRiMP and MicroRazerS show the highest sensitivity. Differential expression estimation is quite efficient. Within the five tools investigated, two of them (DESseq, baySeq) show a very good specificity and sensitivity in the detection of differential expression. The results provided by our analysis allow the definition of a clear and simple analytical optimized workflow for miRNAs digital quantitative analysis.

  9. Performance evaluations of advanced massively parallel platforms based on gyrokinetic toroidal five-dimensional Eulerian code GT5D

    International Nuclear Information System (INIS)

    Idomura, Yasuhiro; Jolliet, Sebastien

    2010-01-01

    A gyrokinetic toroidal five dimensional Eulerian code GT5D is ported on six advanced massively parallel platforms and comprehensive benchmark tests are performed. A parallelisation technique based on physical properties of the gyrokinetic equation is presented. By extending the parallelisation technique with a hybrid parallel model, the scalability of the code is improved on platforms with multi-core processors. In the benchmark tests, a good salability is confirmed up to several thousands cores on every platforms, and the maximum sustained performance of ∼18.6 Tflops is achieved using 16384 cores of BX900. (author)

  10. Global transcriptional profiling of the toxic dinoflagellate Alexandrium fundyense using Massively Parallel Signature Sequencing

    Directory of Open Access Journals (Sweden)

    Anderson Donald M

    2006-04-01

    Full Text Available Abstract Background Dinoflagellates are one of the most important classes of marine and freshwater algae, notable both for their functional diversity and ecological significance. They occur naturally as free-living cells, as endosymbionts of marine invertebrates and are well known for their involvement in "red tides". Dinoflagellates are also notable for their unusual genome content and structure, which suggests that the organization and regulation of dinoflagellate genes may be very different from that of most eukaryotes. To investigate the content and regulation of the dinoflagellate genome, we performed a global analysis of the transcriptome of the toxic dinoflagellate Alexandrium fundyense under nitrate- and phosphate-limited conditions using Massively Parallel Signature Sequencing (MPSS. Results Data from the two MPSS libraries showed that the number of unique signatures found in A. fundyense cells is similar to that of humans and Arabidopsis thaliana, two eukaryotes that have been extensively analyzed using this method. The general distribution, abundance and expression patterns of the A. fundyense signatures were also quite similar to other eukaryotes, and at least 10% of the A. fundyense signatures were differentially expressed between the two conditions. RACE amplification and sequencing of a subset of signatures showed that multiple signatures arose from sequence variants of a single gene. Single signatures also mapped to different sequence variants of the same gene. Conclusion The MPSS data presented here provide a quantitative view of the transcriptome and its regulation in these unusual single-celled eukaryotes. The observed signature abundance and distribution in Alexandrium is similar to that of other eukaryotes that have been analyzed using MPSS. Results of signature mapping via RACE indicate that many signatures result from sequence variants of individual genes. These data add to the growing body of evidence for widespread gene

  11. Resolution of the neutron transport equation by massively parallel computer in the Cronos code

    International Nuclear Information System (INIS)

    Zardini, D.M.

    1996-01-01

    The feasibility of neutron transport problems parallel resolution by CRONOS code's SN module is here studied. In this report we give the first data about the parallel resolution by angular variable decomposition of the transport equation. Problems about parallel resolution by spatial variable decomposition and memory stage limits are also explained here. (author)

  12. Massive parallel electromagnetic field simulation program JEMS-FDTD design and implementation on jasmin

    International Nuclear Information System (INIS)

    Li Hanyu; Zhou Haijing; Dong Zhiwei; Liao Cheng; Chang Lei; Cao Xiaolin; Xiao Li

    2010-01-01

    A large-scale parallel electromagnetic field simulation program JEMS-FDTD(J Electromagnetic Solver-Finite Difference Time Domain) is designed and implemented on JASMIN (J parallel Adaptive Structured Mesh applications INfrastructure). This program can simulate propagation, radiation, couple of electromagnetic field by solving Maxwell equations on structured mesh explicitly with FDTD method. JEMS-FDTD is able to simulate billion-mesh-scale problems on thousands of processors. In this article, the program is verified by simulating the radiation of an electric dipole. A beam waveguide is simulated to demonstrate the capability of large scale parallel computation. A parallel performance test indicates that a high parallel efficiency is obtained. (authors)

  13. Space-charge-dominated beam dynamics simulations using the massively parallel processors (MPPs) of the Cray T3D

    International Nuclear Information System (INIS)

    Liu, H.

    1996-01-01

    Computer simulations using the multi-particle code PARMELA with a three-dimensional point-by-point space charge algorithm have turned out to be very helpful in supporting injector commissioning and operations at Thomas Jefferson National Accelerator Facility (Jefferson Lab, formerly called CEBAF). However, this algorithm, which defines a typical N 2 problem in CPU time scaling, is very time-consuming when N, the number of macro-particles, is large. Therefore, it is attractive to use massively parallel processors (MPPs) to speed up the simulations. Motivated by this, the authors modified the space charge subroutine for using the MPPs of the Cray T3D. The techniques used to parallelize and optimize the code on the T3D are discussed in this paper. The performance of the code on the T3D is examined in comparison with a Parallel Vector Processing supercomputer of the Cray C90 and an HP 735/15 high-end workstation

  14. Introduction to massively-parallel computing in high-energy physics

    CERN Document Server

    AUTHOR|(CDS)2083520

    1993-01-01

    Ever since computers were first used for scientific and numerical work, there has existed an "arms race" between the technical development of faster computing hardware, and the desires of scientists to solve larger problems in shorter time-scales. However, the vast leaps in processor performance achieved through advances in semi-conductor science have reached a hiatus as the technology comes up against the physical limits of the speed of light and quantum effects. This has lead all high performance computer manufacturers to turn towards a parallel architecture for their new machines. In these lectures we will introduce the history and concepts behind parallel computing, and review the various parallel architectures and software environments currently available. We will then introduce programming methodologies that allow efficient exploitation of parallel machines, and present case studies of the parallelization of typical High Energy Physics codes for the two main classes of parallel computing architecture (S...

  15. Parallel processing is good for your scientific codes...But massively parallel processing is so much better

    International Nuclear Information System (INIS)

    Thomas, B.; Domain, Ch.; Souffez, Y.; Eon-Duval, P.

    1998-01-01

    Harnessing the power of many computers, to solve concurrently difficult scientific problems, is one of the most innovative trend in High Performance Computing. At EDF, we have invested in parallel computing and have achieved significant results. First we improved the processing speed of strategic codes, in order to extend their scope. Then we turned to numerical simulations at the atomic scale. These computations, we never dreamt of before, provided us with a better understanding of metallurgic phenomena. More precisely we were able to trace defects in alloys that are used in nuclear power plants. (author)

  16. Development and application of a massively parallel KKR Green function method for large scale systems

    Energy Technology Data Exchange (ETDEWEB)

    Thiess, Alexander R.

    2011-12-19

    In this thesis we present the development of the self-consistent, full-potential Korringa-Kohn-Rostoker (KKR) Green function method KKRnano for calculating the electronic properties, magnetic interactions, and total energy including all electrons on the basis of the density functional theory (DFT) on high-end massively parallelized high-performance computers for supercells containing thousands of atoms without sacrifice of accuracy. KKRnano was used for the following two applications. The first application is centered in the field of dilute magnetic semiconductors. In this field a new promising material combination was identified: gadolinium doped gallium nitride which shows ferromagnetic ordering of colossal magnetic moments above room temperature. It quickly turned out that additional extrinsic defects are inducing the striking properties. However, the question which kind of extrinsic defects are present in experimental samples is still unresolved. In order to shed light on this open question, we perform extensive studies of the most promising candidates: interstitial nitrogen and oxygen, as well as gallium vacancies. By analyzing the pairwise magnetic coupling among defects it is shown that nitrogen and oxygen interstitials cannot support thermally stable ferromagnetic order. Gallium vacancies, on the other hand, facilitate an important coupling mechanism. The vacancies are found to induce large magnetic moments on all surrounding nitrogen sites, which then couple ferromagnetically both among themselves and with the gadolinium dopants. Based on a statistical evaluation it can be concluded that already small concentrations of gallium vacancies can lead to a distinct long-range ferromagnetic ordering. Beyond this important finding we present further indications, from which we infer that gallium vacancies likely cause the striking ferromagnetic coupling of colossal magnetic moments in GaN:Gd. The second application deals with the phase-change material germanium

  17. Development and application of a massively parallel KKR Green function method for large scale systems

    International Nuclear Information System (INIS)

    Thiess, Alexander R.

    2011-01-01

    In this thesis we present the development of the self-consistent, full-potential Korringa-Kohn-Rostoker (KKR) Green function method KKRnano for calculating the electronic properties, magnetic interactions, and total energy including all electrons on the basis of the density functional theory (DFT) on high-end massively parallelized high-performance computers for supercells containing thousands of atoms without sacrifice of accuracy. KKRnano was used for the following two applications. The first application is centered in the field of dilute magnetic semiconductors. In this field a new promising material combination was identified: gadolinium doped gallium nitride which shows ferromagnetic ordering of colossal magnetic moments above room temperature. It quickly turned out that additional extrinsic defects are inducing the striking properties. However, the question which kind of extrinsic defects are present in experimental samples is still unresolved. In order to shed light on this open question, we perform extensive studies of the most promising candidates: interstitial nitrogen and oxygen, as well as gallium vacancies. By analyzing the pairwise magnetic coupling among defects it is shown that nitrogen and oxygen interstitials cannot support thermally stable ferromagnetic order. Gallium vacancies, on the other hand, facilitate an important coupling mechanism. The vacancies are found to induce large magnetic moments on all surrounding nitrogen sites, which then couple ferromagnetically both among themselves and with the gadolinium dopants. Based on a statistical evaluation it can be concluded that already small concentrations of gallium vacancies can lead to a distinct long-range ferromagnetic ordering. Beyond this important finding we present further indications, from which we infer that gallium vacancies likely cause the striking ferromagnetic coupling of colossal magnetic moments in GaN:Gd. The second application deals with the phase-change material germanium

  18. Feasibility studies for a high energy physics MC program on massive parallel platforms

    International Nuclear Information System (INIS)

    Bertolotto, L.M.; Peach, K.J.; Apostolakis, J.; Bruschini, C.E.; Calafiura, P.; Gagliardi, F.; Metcalf, M.; Norton, A.; Panzer-Steindel, B.

    1994-01-01

    The parallelization of a Monte Carlo program for the NA48 experiment is presented. As a first step, a task farming structure was realized. Based on this, a further step, making use of a distributed database for showers in the electro-magnetic calorimeter, was implemented. Further possibilities for using parallel processing for a quasi-real time calibration of the calorimeter are described

  19. Massively parallel read mapping on GPUs with the q-group index and PEANUT

    NARCIS (Netherlands)

    J. Köster (Johannes); S. Rahmann (Sven)

    2014-01-01

    textabstractWe present the q-group index, a novel data structure for read mapping tailored towards graphics processing units (GPUs) with a small memory footprint and efficient parallel algorithms for querying and building. On top of the q-group index we introduce PEANUT, a highly parallel GPU-based

  20. PFLOTRAN User Manual: A Massively Parallel Reactive Flow and Transport Model for Describing Surface and Subsurface Processes

    Energy Technology Data Exchange (ETDEWEB)

    Lichtner, Peter C. [OFM Research, Redmond, WA (United States); Hammond, Glenn E. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Lu, Chuan [Idaho National Lab. (INL), Idaho Falls, ID (United States); Karra, Satish [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Bisht, Gautam [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Andre, Benjamin [National Center for Atmospheric Research, Boulder, CO (United States); Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Mills, Richard [Intel Corporation, Portland, OR (United States); Univ. of Tennessee, Knoxville, TN (United States); Kumar, Jitendra [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)

    2015-01-20

    PFLOTRAN solves a system of generally nonlinear partial differential equations describing multi-phase, multicomponent and multiscale reactive flow and transport in porous materials. The code is designed to run on massively parallel computing architectures as well as workstations and laptops (e.g. Hammond et al., 2011). Parallelization is achieved through domain decomposition using the PETSc (Portable Extensible Toolkit for Scientific Computation) libraries for the parallelization framework (Balay et al., 1997). PFLOTRAN has been developed from the ground up for parallel scalability and has been run on up to 218 processor cores with problem sizes up to 2 billion degrees of freedom. Written in object oriented Fortran 90, the code requires the latest compilers compatible with Fortran 2003. At the time of this writing this requires gcc 4.7.x, Intel 12.1.x and PGC compilers. As a requirement of running problems with a large number of degrees of freedom, PFLOTRAN allows reading input data that is too large to fit into memory allotted to a single processor core. The current limitation to the problem size PFLOTRAN can handle is the limitation of the HDF5 file format used for parallel IO to 32 bit integers. Noting that 232 = 4; 294; 967; 296, this gives an estimate of the maximum problem size that can be currently run with PFLOTRAN. Hopefully this limitation will be remedied in the near future.

  1. A massively parallel algorithm for the collision probability calculations in the Apollo-II code using the PVM library

    International Nuclear Information System (INIS)

    Stankovski, Z.

    1995-01-01

    The collision probability method in neutron transport, as applied to 2D geometries, consume a great amount of computer time, for a typical 2D assembly calculation evaluations. Consequently RZ or 3D calculations became prohibitive. In this paper we present a simple but efficient parallel algorithm based on the message passing host/node programing model. Parallelization was applied to the energy group treatment. Such approach permits parallelization of the existing code, requiring only limited modifications. Sequential/parallel computer portability is preserved, witch is a necessary condition for a industrial code. Sequential performances are also preserved. The algorithm is implemented on a CRAY 90 coupled to a 128 processor T3D computer, a 16 processor IBM SP1 and a network of workstations, using the Public Domain PVM library. The tests were executed for a 2D geometry with the standard 99-group library. All results were very satisfactory, the best ones with IBM SP1. Because of heterogeneity of the workstation network, we did ask high performances for this architecture. The same source code was used for all computers. A more impressive advantage of this algorithm will appear in the calculations of the SAPHYR project (with the future fine multigroup library of about 8000 groups) with a massively parallel computer, using several hundreds of processors. (author). 5 refs., 6 figs., 2 tabs

  2. A massively parallel algorithm for the collision probability calculations in the Apollo-II code using the PVM library

    International Nuclear Information System (INIS)

    Stankovski, Z.

    1995-01-01

    The collision probability method in neutron transport, as applied to 2D geometries, consume a great amount of computer time, for a typical 2D assembly calculation about 90% of the computing time is consumed in the collision probability evaluations. Consequently RZ or 3D calculations became prohibitive. In this paper the author presents a simple but efficient parallel algorithm based on the message passing host/node programmation model. Parallelization was applied to the energy group treatment. Such approach permits parallelization of the existing code, requiring only limited modifications. Sequential/parallel computer portability is preserved, which is a necessary condition for a industrial code. Sequential performances are also preserved. The algorithm is implemented on a CRAY 90 coupled to a 128 processor T3D computer, a 16 processor IBM SPI and a network of workstations, using the Public Domain PVM library. The tests were executed for a 2D geometry with the standard 99-group library. All results were very satisfactory, the best ones with IBM SPI. Because of heterogeneity of the workstation network, the author did not ask high performances for this architecture. The same source code was used for all computers. A more impressive advantage of this algorithm will appear in the calculations of the SAPHYR project (with the future fine multigroup library of about 8000 groups) with a massively parallel computer, using several hundreds of processors

  3. Massively parallel computation of PARASOL code on the Origin 3800 system

    International Nuclear Information System (INIS)

    Hosokawa, Masanari; Takizuka, Tomonori

    2001-10-01

    The divertor particle simulation code named PARASOL simulates open-field plasmas between divertor walls self-consistently by using an electrostatic PIC method and a binary collision Monte Carlo model. The PARASOL parallelized with MPI-1.1 for scalar parallel computer worked on Intel Paragon XP/S system. A system SGI Origin 3800 was newly installed (May, 2001). The parallel programming was improved at this switchover. As a result of the high-performance new hardware and this improvement, the PARASOL is speeded up by about 60 times with the same number of processors. (author)

  4. Hybrid massively parallel fast sweeping method for static Hamilton–Jacobi equations

    Energy Technology Data Exchange (ETDEWEB)

    Detrixhe, Miles, E-mail: mdetrixhe@engineering.ucsb.edu [Department of Mechanical Engineering (United States); University of California Santa Barbara, Santa Barbara, CA, 93106 (United States); Gibou, Frédéric, E-mail: fgibou@engineering.ucsb.edu [Department of Mechanical Engineering (United States); University of California Santa Barbara, Santa Barbara, CA, 93106 (United States); Department of Computer Science (United States); Department of Mathematics (United States)

    2016-10-01

    The fast sweeping method is a popular algorithm for solving a variety of static Hamilton–Jacobi equations. Fast sweeping algorithms for parallel computing have been developed, but are severely limited. In this work, we present a multilevel, hybrid parallel algorithm that combines the desirable traits of two distinct parallel methods. The fine and coarse grained components of the algorithm take advantage of heterogeneous computer architecture common in high performance computing facilities. We present the algorithm and demonstrate its effectiveness on a set of example problems including optimal control, dynamic games, and seismic wave propagation. We give results for convergence, parallel scaling, and show state-of-the-art speedup values for the fast sweeping method.

  5. Hybrid massively parallel fast sweeping method for static Hamilton–Jacobi equations

    International Nuclear Information System (INIS)

    Detrixhe, Miles; Gibou, Frédéric

    2016-01-01

    The fast sweeping method is a popular algorithm for solving a variety of static Hamilton–Jacobi equations. Fast sweeping algorithms for parallel computing have been developed, but are severely limited. In this work, we present a multilevel, hybrid parallel algorithm that combines the desirable traits of two distinct parallel methods. The fine and coarse grained components of the algorithm take advantage of heterogeneous computer architecture common in high performance computing facilities. We present the algorithm and demonstrate its effectiveness on a set of example problems including optimal control, dynamic games, and seismic wave propagation. We give results for convergence, parallel scaling, and show state-of-the-art speedup values for the fast sweeping method.

  6. A Parallel Adaboost-Backpropagation Neural Network for Massive Image Dataset Classification

    Science.gov (United States)

    Cao, Jianfang; Chen, Lichao; Wang, Min; Shi, Hao; Tian, Yun

    2016-01-01

    Image classification uses computers to simulate human understanding and cognition of images by automatically categorizing images. This study proposes a faster image classification approach that parallelizes the traditional Adaboost-Backpropagation (BP) neural network using the MapReduce parallel programming model. First, we construct a strong classifier by assembling the outputs of 15 BP neural networks (which are individually regarded as weak classifiers) based on the Adaboost algorithm. Second, we design Map and Reduce tasks for both the parallel Adaboost-BP neural network and the feature extraction algorithm. Finally, we establish an automated classification model by building a Hadoop cluster. We use the Pascal VOC2007 and Caltech256 datasets to train and test the classification model. The results are superior to those obtained using traditional Adaboost-BP neural network or parallel BP neural network approaches. Our approach increased the average classification accuracy rate by approximately 14.5% and 26.0% compared to the traditional Adaboost-BP neural network and parallel BP neural network, respectively. Furthermore, the proposed approach requires less computation time and scales very well as evaluated by speedup, sizeup and scaleup. The proposed approach may provide a foundation for automated large-scale image classification and demonstrates practical value. PMID:27905520

  7. A Novel Algorithm for Solving the Multidimensional Neutron Transport Equation on Massively Parallel Architectures

    Energy Technology Data Exchange (ETDEWEB)

    Azmy, Yousry

    2014-06-10

    We employ the Integral Transport Matrix Method (ITMM) as the kernel of new parallel solution methods for the discrete ordinates approximation of the within-group neutron transport equation. The ITMM abandons the repetitive mesh sweeps of the traditional source iterations (SI) scheme in favor of constructing stored operators that account for the direct coupling factors among all the cells' fluxes and between the cells' and boundary surfaces' fluxes. The main goals of this work are to develop the algorithms that construct these operators and employ them in the solution process, determine the most suitable way to parallelize the entire procedure, and evaluate the behavior and parallel performance of the developed methods with increasing number of processes, P. The fastest observed parallel solution method, Parallel Gauss-Seidel (PGS), was used in a weak scaling comparison with the PARTISN transport code, which uses the source iteration (SI) scheme parallelized with the Koch-baker-Alcouffe (KBA) method. Compared to the state-of-the-art SI-KBA with diffusion synthetic acceleration (DSA), this new method- even without acceleration/preconditioning-is completitive for optically thick problems as P is increased to the tens of thousands range. For the most optically thick cells tested, PGS reduced execution time by an approximate factor of three for problems with more than 130 million computational cells on P = 32,768. Moreover, the SI-DSA execution times's trend rises generally more steeply with increasing P than the PGS trend. Furthermore, the PGS method outperforms SI for the periodic heterogeneous layers (PHL) configuration problems. The PGS method outperforms SI and SI-DSA on as few as P = 16 for PHL problems and reduces execution time by a factor of ten or more for all problems considered with more than 2 million computational cells on P = 4.096.

  8. 'Iconic' tracking algorithms for high energy physics using the TRAX-I massively parallel processor

    International Nuclear Information System (INIS)

    Vesztergombi, G.

    1989-01-01

    TRAX-I, a cost-effective parallel microcomputer, applying associative string processor (ASP) architecture with 16 K parallel processing elements, is being built by Aspex Microsystems Ltd. (UK). When applied to the tracking problem of very complex events with several hundred tracks, the large number of processors allows one to dedicate one or more processors to each wire (in MWPC), each pixel (in digitized images from streamer chambers or other visual detectors), or each pad (in TPC) to perform very efficient pattern recognition. Some linear tracking algorithms based on this ''ionic'' representation are presented. (orig.)

  9. 'Iconic' tracking algorithms for high energy physics using the TRAX-I massively parallel processor

    International Nuclear Information System (INIS)

    Vestergombi, G.

    1989-11-01

    TRAX-I, a cost-effective parallel microcomputer, applying Associative String Processor (ASP) architecture with 16 K parallel processing elements, is being built by Aspex Microsystems Ltd. (UK). When applied to the tracking problem of very complex events with several hundred tracks, the large number of processors allows one to dedicate one or more processors to each wire (in MWPC), each pixel (in digitized images from streamer chambers or other visual detectors), or each pad (in TPC) to perform very efficient pattern recognition. Some linear tracking algorithms based on this 'iconic' representation are presented. (orig.)

  10. Use of massively parallel computing to improve modelling accuracy within the nuclear sector

    Directory of Open Access Journals (Sweden)

    L M Evans

    2016-06-01

    This work presents recent advancements in three techniques: Uncertainty quantification (UQ; Cellular automata finite element (CAFE; Image based finite element methods (IBFEM. Case studies are presented demonstrating their suitability for use in nuclear engineering made possible by advancements in parallel computing hardware that is projected to be available for industry within the next decade costing of the order of $100k.

  11. OpenCL Implementation of a Parallel Universal Kriging Algorithm for Massive Spatial Data Interpolation on Heterogeneous Systems

    Directory of Open Access Journals (Sweden)

    Fang Huang

    2016-06-01

    Full Text Available In some digital Earth engineering applications, spatial interpolation algorithms are required to process and analyze large amounts of data. Due to its powerful computing capacity, heterogeneous computing has been used in many applications for data processing in various fields. In this study, we explore the design and implementation of a parallel universal kriging spatial interpolation algorithm using the OpenCL programming model on heterogeneous computing platforms for massive Geo-spatial data processing. This study focuses primarily on transforming the hotspots in serial algorithms, i.e., the universal kriging interpolation function, into the corresponding kernel function in OpenCL. We also employ parallelization and optimization techniques in our implementation to improve the code performance. Finally, based on the results of experiments performed on two different high performance heterogeneous platforms, i.e., an NVIDIA graphics processing unit system and an Intel Xeon Phi system (MIC, we show that the parallel universal kriging algorithm can achieve the highest speedup of up to 40× with a single computing device and the highest speedup of up to 80× with multiple devices.

  12. Solution of the within-group multidimensional discrete ordinates transport equations on massively parallel architectures

    Science.gov (United States)

    Zerr, Robert Joseph

    2011-12-01

    The integral transport matrix method (ITMM) has been used as the kernel of new parallel solution methods for the discrete ordinates approximation of the within-group neutron transport equation. The ITMM abandons the repetitive mesh sweeps of the traditional source iterations (SI) scheme in favor of constructing stored operators that account for the direct coupling factors among all the cells and between the cells and boundary surfaces. The main goals of this work were to develop the algorithms that construct these operators and employ them in the solution process, determine the most suitable way to parallelize the entire procedure, and evaluate the behavior and performance of the developed methods for increasing number of processes. This project compares the effectiveness of the ITMM with the SI scheme parallelized with the Koch-Baker-Alcouffe (KBA) method. The primary parallel solution method involves a decomposition of the domain into smaller spatial sub-domains, each with their own transport matrices, and coupled together via interface boundary angular fluxes. Each sub-domain has its own set of ITMM operators and represents an independent transport problem. Multiple iterative parallel solution methods have investigated, including parallel block Jacobi (PBJ), parallel red/black Gauss-Seidel (PGS), and parallel GMRES (PGMRES). The fastest observed parallel solution method, PGS, was used in a weak scaling comparison with the PARTISN code. Compared to the state-of-the-art SI-KBA with diffusion synthetic acceleration (DSA), this new method without acceleration/preconditioning is not competitive for any problem parameters considered. The best comparisons occur for problems that are difficult for SI DSA, namely highly scattering and optically thick. SI DSA execution time curves are generally steeper than the PGS ones. However, until further testing is performed it cannot be concluded that SI DSA does not outperform the ITMM with PGS even on several thousand or tens of

  13. Efficient numerical methods for fluid- and electrodynamics on massively parallel systems

    Energy Technology Data Exchange (ETDEWEB)

    Zudrop, Jens

    2016-07-01

    In the last decade, computer technology has evolved rapidly. Modern high performance computing systems offer a tremendous amount of computing power in the range of a few peta floating point operations per second. In contrast, numerical software development is much slower and most existing simulation codes cannot exploit the full computing power of these systems. Partially, this is due to the numerical methods themselves and partially it is related to bottlenecks within the parallelization concept and its data structures. The goal of the thesis is the development of numerical algorithms and corresponding data structures to remedy both kinds of parallelization bottlenecks. The approach is based on a co-design of the numerical schemes (including numerical analysis) and their realizations in algorithms and software. Various kinds of applications, from multicomponent flows (Lattice Boltzmann Method) to electrodynamics (Discontinuous Galerkin Method) to embedded geometries (Octree), are considered and efficiency of the developed approaches is demonstrated for large scale simulations.

  14. Massively parallel amplicon sequencing reveals isotype-specific variability of antimicrobial peptide transcripts in Mytilus galloprovincialis.

    Directory of Open Access Journals (Sweden)

    Umberto Rosani

    Full Text Available BACKGROUND: Effective innate responses against potential pathogens are essential in the living world and possibly contributed to the evolutionary success of invertebrates. Taken together, antimicrobial peptide (AMP precursors of defensin, mytilin, myticin and mytimycin can represent about 40% of the hemocyte transcriptome in mussels injected with viral-like and bacterial preparations, and unique profiles of myticin C variants are expressed in single mussels. Based on amplicon pyrosequencing, we have ascertained and compared the natural and Vibrio-induced diversity of AMP transcripts in mussel hemocytes from three European regions. METHODOLOGY/PRINCIPAL FINDINGS: Hemolymph was collected from mussels farmed in the coastal regions of Palavas (France, Vigo (Spain and Venice (Italy. To represent the AMP families known in M. galloprovincialis, nine transcript sequences have been selected, amplified from hemocyte RNA and subjected to pyrosequencing. Hemolymph from farmed (offshore and wild (lagoon Venice mussels, both injected with 10(7 Vibrio cells, were similarly processed. Amplicon pyrosequencing emphasized the AMP transcript diversity, with Single Nucleotide Changes (SNC minimal for mytilin B/C and maximal for arthropod-like defensin and myticin C. Ratio of non-synonymous vs. synonymous changes also greatly differed between AMP isotypes. Overall, each amplicon revealed similar levels of nucleotidic variation across geographical regions, with two main sequence patterns confirmed for mytimycin and no substantial changes after immunostimulation. CONCLUSIONS/SIGNIFICANCE: Barcoding and bidirectional pyrosequencing allowed us to map and compare the transcript diversity of known mussel AMPs. Though most of the genuine cds variation was common to the analyzed samples we could estimate from 9 to 106 peptide variants in hemolymph pools representing 100 mussels, depending on the AMP isoform and sampling site. In this study, no prevailing SNC patterns related

  15. ESPRIT-Forest: Parallel clustering of massive amplicon sequence data in subquadratic time.

    Science.gov (United States)

    Cai, Yunpeng; Zheng, Wei; Yao, Jin; Yang, Yujie; Mai, Volker; Mao, Qi; Sun, Yijun

    2017-04-01

    The rapid development of sequencing technology has led to an explosive accumulation of genomic sequence data. Clustering is often the first step to perform in sequence analysis, and hierarchical clustering is one of the most commonly used approaches for this purpose. However, it is currently computationally expensive to perform hierarchical clustering of extremely large sequence datasets due to its quadratic time and space complexities. In this paper we developed a new algorithm called ESPRIT-Forest for parallel hierarchical clustering of sequences. The algorithm achieves subquadratic time and space complexity and maintains a high clustering accuracy comparable to the standard method. The basic idea is to organize sequences into a pseudo-metric based partitioning tree for sub-linear time searching of nearest neighbors, and then use a new multiple-pair merging criterion to construct clusters in parallel using multiple threads. The new algorithm was tested on the human microbiome project (HMP) dataset, currently one of the largest published microbial 16S rRNA sequence dataset. Our experiment demonstrated that with the power of parallel computing it is now compu- tationally feasible to perform hierarchical clustering analysis of tens of millions of sequences. The software is available at http://www.acsu.buffalo.edu/∼yijunsun/lab/ESPRIT-Forest.html.

  16. Investigation and experimental validation of the contribution of optical interconnects in the SYMPHONIE massively parallel computer

    International Nuclear Information System (INIS)

    Scheer, Patrick

    1998-01-01

    Progress in microelectronics lead to electronic circuits which are increasingly integrated, with an operating frequency and an inputs/outputs count larger than the ones supported by printed circuit board and back-plane technologies. As a result, distributed systems with several boards cannot fully exploit the performance of integrated circuits. In synchronous parallel computers, the situation is worsen since the overall system performances rely on the efficiency of electrical interconnects between the integrated circuits which include the processing elements (PE). The study of a real parallel computer named SYMPHONIE shows for instance that the system operating frequency is far smaller than the capabilities of the microelectronics technology used for the PE implementation. Optical interconnections may cancel these limitations by providing more efficient connections between the PE. Especially, free-space optical interconnections based on vertical-cavity surface-emitting lasers (VCSEL), micro-lens and PIN photodiodes are compatible with the required features of the PE communications. Zero bias modulation of VCSEL with CMOS-compatible digital signals is studied and experimentally demonstrated. A model of the propagation of truncated gaussian beams through micro-lenses is developed. It is then used to optimise the geometry of the detection areas. A dedicated mechanical system is also proposed and implemented for integrating free-space optical interconnects in a standard electronic environment, representative of the one of parallel computer systems. A specially designed demonstrator provides the experimental validation of the above physical concepts. (author) [fr

  17. Identification of the bovine Arachnomelia mutation by massively parallel sequencing implicates sulfite oxidase (SUOX in bone development.

    Directory of Open Access Journals (Sweden)

    Cord Drögemüller

    2010-08-01

    Full Text Available Arachnomelia is a monogenic recessive defect of skeletal development in cattle. The causative mutation was previously mapped to a ∼7 Mb interval on chromosome 5. Here we show that array-based sequence capture and massively parallel sequencing technology, combined with the typical family structure in livestock populations, facilitates the identification of the causative mutation. We re-sequenced the entire critical interval in a healthy partially inbred cow carrying one copy of the critical chromosome segment in its ancestral state and one copy of the same segment with the arachnomelia mutation, and we detected a single heterozygous position. The genetic makeup of several partially inbred cattle provides extremely strong support for the causality of this mutation. The mutation represents a single base insertion leading to a premature stop codon in the coding sequence of the SUOX gene and is perfectly associated with the arachnomelia phenotype. Our findings suggest an important role for sulfite oxidase in bone development.

  18. Identification of the Bovine Arachnomelia Mutation by Massively Parallel Sequencing Implicates Sulfite Oxidase (SUOX) in Bone Development

    Science.gov (United States)

    Drögemüller, Cord; Tetens, Jens; Sigurdsson, Snaevar; Gentile, Arcangelo; Testoni, Stefania; Lindblad-Toh, Kerstin; Leeb, Tosso

    2010-01-01

    Arachnomelia is a monogenic recessive defect of skeletal development in cattle. The causative mutation was previously mapped to a ∼7 Mb interval on chromosome 5. Here we show that array-based sequence capture and massively parallel sequencing technology, combined with the typical family structure in livestock populations, facilitates the identification of the causative mutation. We re-sequenced the entire critical interval in a healthy partially inbred cow carrying one copy of the critical chromosome segment in its ancestral state and one copy of the same segment with the arachnomelia mutation, and we detected a single heterozygous position. The genetic makeup of several partially inbred cattle provides extremely strong support for the causality of this mutation. The mutation represents a single base insertion leading to a premature stop codon in the coding sequence of the SUOX gene and is perfectly associated with the arachnomelia phenotype. Our findings suggest an important role for sulfite oxidase in bone development. PMID:20865119

  19. Massively parallel sequencing and the emergence of forensic genomics: Defining the policy and legal issues for law enforcement.

    Science.gov (United States)

    Scudder, Nathan; McNevin, Dennis; Kelty, Sally F; Walsh, Simon J; Robertson, James

    2018-03-01

    Use of DNA in forensic science will be significantly influenced by new technology in coming years. Massively parallel sequencing and forensic genomics will hasten the broadening of forensic DNA analysis beyond short tandem repeats for identity towards a wider array of genetic markers, in applications as diverse as predictive phenotyping, ancestry assignment, and full mitochondrial genome analysis. With these new applications come a range of legal and policy implications, as forensic science touches on areas as diverse as 'big data', privacy and protected health information. Although these applications have the potential to make a more immediate and decisive forensic intelligence contribution to criminal investigations, they raise policy issues that will require detailed consideration if this potential is to be realised. The purpose of this paper is to identify the scope of the issues that will confront forensic and user communities. Copyright © 2017 The Chartered Society of Forensic Sciences. All rights reserved.

  20. Whole-Genome Analyses of Korean Native and Holstein Cattle Breeds by Massively Parallel Sequencing

    Science.gov (United States)

    Stothard, Paul; Chung, Won-Hyong; Jeon, Heoyn-Jeong; Miller, Stephen P.; Choi, So-Young; Lee, Jeong-Koo; Yang, Bokyoung; Lee, Kyung-Tai; Han, Kwang-Jin; Kim, Hyeong-Cheol; Jeong, Dongkee; Oh, Jae-Don; Kim, Namshin; Kim, Tae-Hun; Lee, Hak-Kyo; Lee, Sung-Jin

    2014-01-01

    A main goal of cattle genomics is to identify DNA differences that account for variations in economically important traits. In this study, we performed whole-genome analyses of three important cattle breeds in Korea—Hanwoo, Jeju Heugu, and Korean Holstein—using the Illumina HiSeq 2000 sequencing platform. We achieved 25.5-, 29.6-, and 29.5-fold coverage of the Hanwoo, Jeju Heugu, and Korean Holstein genomes, respectively, and identified a total of 10.4 million single nucleotide polymorphisms (SNPs), of which 54.12% were found to be novel. We also detected 1,063,267 insertions–deletions (InDels) across the genomes (78.92% novel). Annotations of the datasets identified a total of 31,503 nonsynonymous SNPs and 859 frameshift InDels that could affect phenotypic variations in traits of interest. Furthermore, genome-wide copy number variation regions (CNVRs) were detected by comparing the Hanwoo, Jeju Heugu, and previously published Chikso genomes against that of Korean Holstein. A total of 992, 284, and 1881 CNVRs, respectively, were detected throughout the genome. Moreover, 53, 65, 45, and 82 putative regions of homozygosity (ROH) were identified in Hanwoo, Jeju Heugu, Chikso, and Korean Holstein respectively. The results of this study provide a valuable foundation for further investigations to dissect the molecular mechanisms underlying variation in economically important traits in cattle and to develop genetic markers for use in cattle breeding. PMID:24992012

  1. 3D multiphysics modeling of superconducting cavities with a massively parallel simulation suite

    Directory of Open Access Journals (Sweden)

    Oleksiy Kononenko

    2017-10-01

    Full Text Available Radiofrequency cavities based on superconducting technology are widely used in particle accelerators for various applications. The cavities usually have high quality factors and hence narrow bandwidths, so the field stability is sensitive to detuning from the Lorentz force and external loads, including vibrations and helium pressure variations. If not properly controlled, the detuning can result in a serious performance degradation of a superconducting accelerator, so an understanding of the underlying detuning mechanisms can be very helpful. Recent advances in the simulation suite ace3p have enabled realistic multiphysics characterization of such complex accelerator systems on supercomputers. In this paper, we present the new capabilities in ace3p for large-scale 3D multiphysics modeling of superconducting cavities, in particular, a parallel eigensolver for determining mechanical resonances, a parallel harmonic response solver to calculate the response of a cavity to external vibrations, and a numerical procedure to decompose mechanical loads, such as from the Lorentz force or piezoactuators, into the corresponding mechanical modes. These capabilities have been used to do an extensive rf-mechanical analysis of dressed TESLA-type superconducting cavities. The simulation results and their implications for the operational stability of the Linac Coherent Light Source-II are discussed.

  2. A massively parallel method of characteristic neutral particle transport code for GPUs

    International Nuclear Information System (INIS)

    Boyd, W. R.; Smith, K.; Forget, B.

    2013-01-01

    Over the past 20 years, parallel computing has enabled computers to grow ever larger and more powerful while scientific applications have advanced in sophistication and resolution. This trend is being challenged, however, as the power consumption for conventional parallel computing architectures has risen to unsustainable levels and memory limitations have come to dominate compute performance. Heterogeneous computing platforms, such as Graphics Processing Units (GPUs), are an increasingly popular paradigm for solving these issues. This paper explores the applicability of GPUs for deterministic neutron transport. A 2D method of characteristics (MOC) code - OpenMOC - has been developed with solvers for both shared memory multi-core platforms as well as GPUs. The multi-threading and memory locality methodologies for the GPU solver are presented. Performance results for the 2D C5G7 benchmark demonstrate 25-35 x speedup for MOC on the GPU. The lessons learned from this case study will provide the basis for further exploration of MOC on GPUs as well as design decisions for hardware vendors exploring technologies for the next generation of machines for scientific computing. (authors)

  3. A new class of massively parallel direction splitting for the incompressible Navier–Stokes equations

    KAUST Repository

    Guermond, J.L.

    2011-06-01

    We introduce in this paper a new direction splitting algorithm for solving the incompressible Navier-Stokes equations. The main originality of the method consists of using the operator (I-∂xx)(I-∂yy)(I-∂zz) for approximating the pressure correction instead of the Poisson operator as done in all the contemporary projection methods. The complexity of the proposed algorithm is significantly lower than that of projection methods, and it is shown the have the same stability properties as the Poisson-based pressure-correction techniques, either in standard or rotational form. The first-order (in time) version of the method is proved to have the same convergence properties as the classical first-order projection techniques. Numerical tests reveal that the second-order version of the method has the same convergence rate as its second-order projection counterpart as well. The method is suitable for parallel implementation and preliminary tests show excellent parallel performance on a distributed memory cluster of up to 1024 processors. The method has been validated on the three-dimensional lid-driven cavity flow using grids composed of up to 2×109 points. © 2011 Elsevier B.V.

  4. Multichannel microformulators for massively parallel machine learning and automated design of biological experiments

    Science.gov (United States)

    Wikswo, John; Kolli, Aditya; Shankaran, Harish; Wagoner, Matthew; Mettetal, Jerome; Reiserer, Ronald; Gerken, Gregory; Britt, Clayton; Schaffer, David

    Genetic, proteomic, and metabolic networks describing biological signaling can have 102 to 103 nodes. Transcriptomics and mass spectrometry can quantify 104 different dynamical experimental variables recorded from in vitro experiments with a time resolution approaching 1 s. It is difficult to infer metabolic and signaling models from such massive data sets, and it is unlikely that causality can be determined simply from observed temporal correlations. There is a need to design and apply specific system perturbations, which will be difficult to perform manually with 10 to 102 externally controlled variables. Machine learning and optimal experimental design can select an experiment that best discriminates between multiple conflicting models, but a remaining problem is to control in real time multiple variables in the form of concentrations of growth factors, toxins, nutrients and other signaling molecules. With time-division multiplexing, a microfluidic MicroFormulator (μF) can create in real time complex mixtures of reagents in volumes suitable for biological experiments. Initial 96-channel μF implementations control the exposure profile of cells in a 96-well plate to different temporal profiles of drugs; future experiments will include challenge compounds. Funded in part by AstraZeneca, NIH/NCATS HHSN271201600009C and UH3TR000491, and VIIBRE.

  5. Massive parallelization of a 3D finite difference electromagnetic forward solution using domain decomposition methods on multiple CUDA enabled GPUs

    Science.gov (United States)

    Schultz, A.

    2010-12-01

    3D forward solvers lie at the core of inverse formulations used to image the variation of electrical conductivity within the Earth's interior. This property is associated with variations in temperature, composition, phase, presence of volatiles, and in specific settings, the presence of groundwater, geothermal resources, oil/gas or minerals. The high cost of 3D solutions has been a stumbling block to wider adoption of 3D methods. Parallel algorithms for modeling frequency domain 3D EM problems have not achieved wide scale adoption, with emphasis on fairly coarse grained parallelism using MPI and similar approaches. The communications bandwidth as well as the latency required to send and receive network communication packets is a limiting factor in implementing fine grained parallel strategies, inhibiting wide adoption of these algorithms. Leading Graphics Processor Unit (GPU) companies now produce GPUs with hundreds of GPU processor cores per die. The footprint, in silicon, of the GPU's restricted instruction set is much smaller than the general purpose instruction set required of a CPU. Consequently, the density of processor cores on a GPU can be much greater than on a CPU. GPUs also have local memory, registers and high speed communication with host CPUs, usually through PCIe type interconnects. The extremely low cost and high computational power of GPUs provides the EM geophysics community with an opportunity to achieve fine grained (i.e. massive) parallelization of codes on low cost hardware. The current generation of GPUs (e.g. NVidia Fermi) provides 3 billion transistors per chip die, with nearly 500 processor cores and up to 6 GB of fast (DDR5) GPU memory. This latest generation of GPU supports fast hardware double precision (64 bit) floating point operations of the type required for frequency domain EM forward solutions. Each Fermi GPU board can sustain nearly 1 TFLOP in double precision, and multiple boards can be installed in the host computer system. We

  6. Unified Lambert Tool for Massively Parallel Applications in Space Situational Awareness

    Science.gov (United States)

    Woollands, Robyn M.; Read, Julie; Hernandez, Kevin; Probe, Austin; Junkins, John L.

    2018-03-01

    This paper introduces a parallel-compiled tool that combines several of our recently developed methods for solving the perturbed Lambert problem using modified Chebyshev-Picard iteration. This tool (unified Lambert tool) consists of four individual algorithms, each of which is unique and better suited for solving a particular type of orbit transfer. The first is a Keplerian Lambert solver, which is used to provide a good initial guess (warm start) for solving the perturbed problem. It is also used to determine the appropriate algorithm to call for solving the perturbed problem. The arc length or true anomaly angle spanned by the transfer trajectory is the parameter that governs the automated selection of the appropriate perturbed algorithm, and is based on the respective algorithm convergence characteristics. The second algorithm solves the perturbed Lambert problem using the modified Chebyshev-Picard iteration two-point boundary value solver. This algorithm does not require a Newton-like shooting method and is the most efficient of the perturbed solvers presented herein, however the domain of convergence is limited to about a third of an orbit and is dependent on eccentricity. The third algorithm extends the domain of convergence of the modified Chebyshev-Picard iteration two-point boundary value solver to about 90% of an orbit, through regularization with the Kustaanheimo-Stiefel transformation. This is the second most efficient of the perturbed set of algorithms. The fourth algorithm uses the method of particular solutions and the modified Chebyshev-Picard iteration initial value solver for solving multiple revolution perturbed transfers. This method does require "shooting" but differs from Newton-like shooting methods in that it does not require propagation of a state transition matrix. The unified Lambert tool makes use of the General Mission Analysis Tool and we use it to compute thousands of perturbed Lambert trajectories in parallel on the Space Situational

  7. Mn-silicide nanostructures aligned on massively parallel silicon nano-ribbons

    International Nuclear Information System (INIS)

    De Padova, Paola; Ottaviani, Carlo; Ronci, Fabio; Colonna, Stefano; Quaresima, Claudio; Cricenti, Antonio; Olivieri, Bruno; Dávila, Maria E; Hennies, Franz; Pietzsch, Annette; Shariati, Nina; Le Lay, Guy

    2013-01-01

    The growth of Mn nanostructures on a 1D grating of silicon nano-ribbons is investigated at atomic scale by means of scanning tunneling microscopy, low energy electron diffraction and core level photoelectron spectroscopy. The grating of silicon nano-ribbons represents an atomic scale template that can be used in a surface-driven route to control the combination of Si with Mn in the development of novel materials for spintronics devices. The Mn atoms show a preferential adsorption site on silicon atoms, forming one-dimensional nanostructures. They are parallel oriented with respect to the surface Si array, which probably predetermines the diffusion pathways of the Mn atoms during the process of nanostructure formation.

  8. A massively parallel GPU-accelerated model for analysis of fully nonlinear free surface waves

    DEFF Research Database (Denmark)

    Engsig-Karup, Allan Peter; Madsen, Morten G.; Glimberg, Stefan Lemvig

    2011-01-01

    -storage flexible-order accurate finite difference method that is known to be efficient and scalable on a CPU core (single thread). To achieve parallel performance of the relatively complex numerical model, we investigate a new trend in high-performance computing where many-core GPUs are utilized as high......-throughput co-processors to the CPU. We describe and demonstrate how this approach makes it possible to do fast desktop computations for large nonlinear wave problems in numerical wave tanks (NWTs) with close to 50/100 million total grid points in double/ single precision with 4 GB global device memory...... available. A new code base has been developed in C++ and compute unified device architecture C and is found to improve the runtime more than an order in magnitude in double precision arithmetic for the same accuracy over an existing CPU (single thread) Fortran 90 code when executed on a single modern GPU...

  9. Neptune: An astrophysical smooth particle hydrodynamics code for massively parallel computer architectures

    Science.gov (United States)

    Sandalski, Stou

    Smooth particle hydrodynamics is an efficient method for modeling the dynamics of fluids. It is commonly used to simulate astrophysical processes such as binary mergers. We present a newly developed GPU accelerated smooth particle hydrodynamics code for astrophysical simulations. The code is named neptune after the Roman god of water. It is written in OpenMP parallelized C++ and OpenCL and includes octree based hydrodynamic and gravitational acceleration. The design relies on object-oriented methodologies in order to provide a flexible and modular framework that can be easily extended and modified by the user. Several pre-built scenarios for simulating collisions of polytropes and black-hole accretion are provided. The code is released under the MIT Open Source license and publicly available at http://code.google.com/p/neptune-sph/.

  10. 3D streamers simulation in a pin to plane configuration using massively parallel computing

    Science.gov (United States)

    Plewa, J.-M.; Eichwald, O.; Ducasse, O.; Dessante, P.; Jacobs, C.; Renon, N.; Yousfi, M.

    2018-03-01

    This paper concerns the 3D simulation of corona discharge using high performance computing (HPC) managed with the message passing interface (MPI) library. In the field of finite volume methods applied on non-adaptive mesh grids and in the case of a specific 3D dynamic benchmark test devoted to streamer studies, the great efficiency of the iterative R&B SOR and BiCGSTAB methods versus the direct MUMPS method was clearly demonstrated in solving the Poisson equation using HPC resources. The optimization of the parallelization and the resulting scalability was undertaken as a function of the HPC architecture for a number of mesh cells ranging from 8 to 512 million and a number of cores ranging from 20 to 1600. The R&B SOR method remains at least about four times faster than the BiCGSTAB method and requires significantly less memory for all tested situations. The R&B SOR method was then implemented in a 3D MPI parallelized code that solves the classical first order model of an atmospheric pressure corona discharge in air. The 3D code capabilities were tested by following the development of one, two and four coplanar streamers generated by initial plasma spots for 6 ns. The preliminary results obtained allowed us to follow in detail the formation of the tree structure of a corona discharge and the effects of the mutual interactions between the streamers in terms of streamer velocity, trajectory and diameter. The computing time for 64 million of mesh cells distributed over 1000 cores using the MPI procedures is about 30 min ns-1, regardless of the number of streamers.

  11. Running ATLAS workloads within massively parallel distributed applications using Athena Multi-Process framework (AthenaMP)

    CERN Document Server

    Calafiura, Paolo; The ATLAS collaboration; Seuster, Rolf; Tsulaia, Vakhtang; van Gemmeren, Peter

    2015-01-01

    AthenaMP is a multi-process version of the ATLAS reconstruction and data analysis framework Athena. By leveraging Linux fork and copy-on-write, it allows the sharing of memory pages between event processors running on the same compute node with little to no change in the application code. Originally targeted to optimize the memory footprint of reconstruction jobs, AthenaMP has demonstrated that it can reduce the memory usage of certain confugurations of ATLAS production jobs by a factor of 2. AthenaMP has also evolved to become the parallel event-processing core of the recently developed ATLAS infrastructure for fine-grained event processing (Event Service) which allows to run AthenaMP inside massively parallel distributed applications on hundreds of compute nodes simultaneously. We present the architecture of AthenaMP, various strategies implemented by AthenaMP for scheduling workload to worker processes (for example: Shared Event Queue and Shared Distributor of Event Tokens) and the usage of AthenaMP in the...

  12. Running ATLAS workloads within massively parallel distributed applications using Athena Multi-Process framework (AthenaMP)

    CERN Document Server

    Calafiura, Paolo; Seuster, Rolf; Tsulaia, Vakhtang; van Gemmeren, Peter

    2015-01-01

    AthenaMP is a multi-process version of the ATLAS reconstruction, simulation and data analysis framework Athena. By leveraging Linux fork and copy-on-write, it allows for sharing of memory pages between event processors running on the same compute node with little to no change in the application code. Originally targeted to optimize the memory footprint of reconstruction jobs, AthenaMP has demonstrated that it can reduce the memory usage of certain configurations of ATLAS production jobs by a factor of 2. AthenaMP has also evolved to become the parallel event-processing core of the recently developed ATLAS infrastructure for fine-grained event processing (Event Service) which allows to run AthenaMP inside massively parallel distributed applications on hundreds of compute nodes simultaneously. We present the architecture of AthenaMP, various strategies implemented by AthenaMP for scheduling workload to worker processes (for example: Shared Event Queue and Shared Distributor of Event Tokens) and the usage of Ath...

  13. Magnetohydrodynamics: Parallel computation of the dynamics of thermonuclear and astrophysical plasmas. 1. Annual report of massively parallel computing pilot project 93MPR05

    International Nuclear Information System (INIS)

    1994-08-01

    This is the first annual report of the MPP pilot project 93MPR05. In this pilot project four research groups with different, complementary backgrounds collaborate with the aim to develop new algorithms and codes to simulate the magnetohydrodynamics of thermonuclear and astrophysical plasmas on massively parallel machines. The expected speed-up is required to simulate the dynamics of the hot plasmas of interest which are characterized by very large magnetic Reynolds numbers and, hence, require high spatial and temporal resolutions (for details see section 1). The four research groups that collaborated to produce the results reported here are: The MHD group of Prof. Dr. J.P. Goedbloed at the FOM-Institute for Plasma Physics 'Rijnhuizen' in Nieuwegein, the group of Prof. Dr. H. van der Vorst at the Mathematics Institute of Utrecht University, the group of Prof. Dr. A.G. Hearn at the Astronomical Institute of Utrecht University, and the group of Dr. Ir. H.J.J. te Riele at the CWI in Amsterdam. The full project team met frequently during this first project year to discuss progress reports, current problems, etc. (see section 2). The main results of the first project year are: - Proof of the scalability of typical linear and nonlinear MHD codes - development and testing of a parallel version of the Arnoldi algorithm - development and testing of alternative methods for solving large non-Hermitian eigenvalue problems - porting of the 3D nonlinear semi-implicit time evolution code HERA to an MPP system. The steps that were scheduled to reach these intended results are given in section 3. (orig./WL)

  14. Magnetohydrodynamics: Parallel computation of the dynamics of thermonuclear and astrophysical plasmas. 1. Annual report of massively parallel computing pilot project 93MPR05

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    1994-08-01

    This is the first annual report of the MPP pilot project 93MPR05. In this pilot project four research groups with different, complementary backgrounds collaborate with the aim to develop new algorithms and codes to simulate the magnetohydrodynamics of thermonuclear and astrophysical plasmas on massively parallel machines. The expected speed-up is required to simulate the dynamics of the hot plasmas of interest which are characterized by very large magnetic Reynolds numbers and, hence, require high spatial and temporal resolutions (for details see section 1). The four research groups that collaborated to produce the results reported here are: The MHD group of Prof. Dr. J.P. Goedbloed at the FOM-Institute for Plasma Physics `Rijnhuizen` in Nieuwegein, the group of Prof. Dr. H. van der Vorst at the Mathematics Institute of Utrecht University, the group of Prof. Dr. A.G. Hearn at the Astronomical Institute of Utrecht University, and the group of Dr. Ir. H.J.J. te Riele at the CWI in Amsterdam. The full project team met frequently during this first project year to discuss progress reports, current problems, etc. (see section 2). The main results of the first project year are: - Proof of the scalability of typical linear and nonlinear MHD codes - development and testing of a parallel version of the Arnoldi algorithm - development and testing of alternative methods for solving large non-Hermitian eigenvalue problems - porting of the 3D nonlinear semi-implicit time evolution code HERA to an MPP system. The steps that were scheduled to reach these intended results are given in section 3. (orig./WL).

  15. Identification and analysis of common bean (Phaseolus vulgaris L. transcriptomes by massively parallel pyrosequencing

    Directory of Open Access Journals (Sweden)

    Thimmapuram Jyothi

    2011-10-01

    Full Text Available Abstract Background Common bean (Phaseolus vulgaris is the most important food legume in the world. Although this crop is very important to both the developed and developing world as a means of dietary protein supply, resources available in common bean are limited. Global transcriptome analysis is important to better understand gene expression, genetic variation, and gene structure annotation in addition to other important features. However, the number and description of common bean sequences are very limited, which greatly inhibits genome and transcriptome research. Here we used 454 pyrosequencing to obtain a substantial transcriptome dataset for common bean. Results We obtained 1,692,972 reads with an average read length of 207 nucleotides (nt. These reads were assembled into 59,295 unigenes including 39,572 contigs and 19,723 singletons, in addition to 35,328 singletons less than 100 bp. Comparing the unigenes to common bean ESTs deposited in GenBank, we found that 53.40% or 31,664 of these unigenes had no matches to this dataset and can be considered as new common bean transcripts. Functional annotation of the unigenes carried out by Gene Ontology assignments from hits to Arabidopsis and soybean indicated coverage of a broad range of GO categories. The common bean unigenes were also compared to the bean bacterial artificial chromosome (BAC end sequences, and a total of 21% of the unigenes (12,724 including 9,199 contigs and 3,256 singletons match to the 8,823 BAC-end sequences. In addition, a large number of simple sequence repeats (SSRs and transcription factors were also identified in this study. Conclusions This work provides the first large scale identification of the common bean transcriptome derived by 454 pyrosequencing. This research has resulted in a 150% increase in the number of Phaseolus vulgaris ESTs. The dataset obtained through this analysis will provide a platform for functional genomics in common bean and related legumes and

  16. II - Template Metaprogramming for Massively Parallel Scientific Computing - Vectorization with Expression Templates

    CERN Multimedia

    CERN. Geneva

    2016-01-01

    Large scale scientific computing raises questions on different levels ranging from the fomulation of the problems to the choice of the best algorithms and their implementation for a specific platform. There are similarities in these different topics that can be exploited by modern-style C++ template metaprogramming techniques to produce readable, maintainable and generic code. Traditional low-level code tend to be fast but platform-dependent, and it obfuscates the meaning of the algorithm. On the other hand, object-oriented approach is nice to read, but may come with an inherent performance penalty. These lectures aim to present he basics of the Expression Template (ET) idiom which allows us to keep the object-oriented approach without sacrificing performance. We will in particular show to to enhance ET to include SIMD vectorization. We will then introduce techniques for abstracting iteration, and introduce thread-level parallelism for use in heavy data-centric loads. We will show to to apply these methods i...

  17. Massively parallel Monte Carlo. Experiences running nuclear simulations on a large condor cluster

    International Nuclear Information System (INIS)

    Tickner, James; O'Dwyer, Joel; Roach, Greg; Uher, Josef; Hitchen, Greg

    2010-01-01

    The trivially-parallel nature of Monte Carlo (MC) simulations make them ideally suited for running on a distributed, heterogeneous computing environment. We report on the setup and operation of a large, cycle-harvesting Condor computer cluster, used to run MC simulations of nuclear instruments ('jobs') on approximately 4,500 desktop PCs. Successful operation must balance the competing goals of maximizing the availability of machines for running jobs whilst minimizing the impact on users' PC performance. This requires classification of jobs according to anticipated run-time and priority and careful optimization of the parameters used to control job allocation to host machines. To maximize use of a large Condor cluster, we have created a powerful suite of tools to handle job submission and analysis, as the manual creation, submission and evaluation of large numbers (hundred to thousands) of jobs would be too arduous. We describe some of the key aspects of this suite, which has been interfaced to the well-known MCNP and EGSnrc nuclear codes and our in-house PHOTON optical MC code. We report on our practical experiences of operating our Condor cluster and present examples of several large-scale instrument design problems that have been solved using this tool. (author)

  18. Algorithms and data structures for massively parallel generic adaptive finite element codes

    KAUST Repository

    Bangerth, Wolfgang

    2011-12-01

    Today\\'s largest supercomputers have 100,000s of processor cores and offer the potential to solve partial differential equations discretized by billions of unknowns. However, the complexity of scaling to such large machines and problem sizes has so far prevented the emergence of generic software libraries that support such computations, although these would lower the threshold of entry and enable many more applications to benefit from large-scale computing. We are concerned with providing this functionality for mesh-adaptive finite element computations. We assume the existence of an "oracle" that implements the generation and modification of an adaptive mesh distributed across many processors, and that responds to queries about its structure. Based on querying the oracle, we develop scalable algorithms and data structures for generic finite element methods. Specifically, we consider the parallel distribution of mesh data, global enumeration of degrees of freedom, constraints, and postprocessing. Our algorithms remove the bottlenecks that typically limit large-scale adaptive finite element analyses. We demonstrate scalability of complete finite element workflows on up to 16,384 processors. An implementation of the proposed algorithms, based on the open source software p4est as mesh oracle, is provided under an open source license through the widely used deal.II finite element software library. © 2011 ACM 0098-3500/2011/12-ART10 $10.00.

  19. Massively Parallel Geostatistical Inversion of Coupled Processes in Heterogeneous Porous Media

    Science.gov (United States)

    Ngo, A.; Schwede, R. L.; Li, W.; Bastian, P.; Ippisch, O.; Cirpka, O. A.

    2012-04-01

    another level of parallelization has been added.

  20. Massively-parallel FDTD simulations to address mask electromagnetic effects in hyper-NA immersion lithography

    Science.gov (United States)

    Tirapu Azpiroz, Jaione; Burr, Geoffrey W.; Rosenbluth, Alan E.; Hibbs, Michael

    2008-03-01

    In the Hyper-NA immersion lithography regime, the electromagnetic response of the reticle is known to deviate in a complicated manner from the idealized Thin-Mask-like behavior. Already, this is driving certain RET choices, such as the use of polarized illumination and the customization of reticle film stacks. Unfortunately, full 3-D electromagnetic mask simulations are computationally intensive. And while OPC-compatible mask electromagnetic field (EMF) models can offer a reasonable tradeoff between speed and accuracy for full-chip OPC applications, full understanding of these complex physical effects demands higher accuracy. Our paper describes recent advances in leveraging High Performance Computing as a critical step towards lithographic modeling of the full manufacturing process. In this paper, highly accurate full 3-D electromagnetic simulation of very large mask layouts are conducted in parallel with reasonable turnaround time, using a Blue- Gene/L supercomputer and a Finite-Difference Time-Domain (FDTD) code developed internally within IBM. A 3-D simulation of a large 2-D layout spanning 5μm×5μm at the wafer plane (and thus (20μm×20μm×0.5μm at the mask) results in a simulation with roughly 12.5GB of memory (grid size of 10nm at the mask, single-precision computation, about 30 bytes/grid point). FDTD is flexible and easily parallelizable to enable full simulations of such large layout in approximately an hour using one BlueGene/L "midplane" containing 512 dual-processor nodes with 256MB of memory per processor. Our scaling studies on BlueGene/L demonstrate that simulations up to 100μm × 100μm at the mask can be computed in a few hours. Finally, we will show that the use of a subcell technique permits accurate simulation of features smaller than the grid discretization, thus improving on the tradeoff between computational complexity and simulation accuracy. We demonstrate the correlation of the real and quadrature components that comprise the

  1. Inter-laboratory evaluation of the EUROFORGEN Global ancestry-informative SNP panel by massively parallel sequencing using the Ion PGM™

    DEFF Research Database (Denmark)

    Eduardoff, M; Gross, T E; Santos, C

    2016-01-01

    Seq™ PCR primers was designed for the Global AIM-SNPs to perform massively parallel sequencing using the Ion PGM™ system. This study assessed individual SNP genotyping precision using the Ion PGM™, the forensic sensitivity of the multiplex using dilution series, degraded DNA plus simple mixtures...

  2. Detection and Evaluation of Spatio-Temporal Spike Patterns in Massively Parallel Spike Train Data with SPADE

    Directory of Open Access Journals (Sweden)

    Pietro Quaglio

    2017-05-01

    Full Text Available Repeated, precise sequences of spikes are largely considered a signature of activation of cell assemblies. These repeated sequences are commonly known under the name of spatio-temporal patterns (STPs. STPs are hypothesized to play a role in the communication of information in the computational process operated by the cerebral cortex. A variety of statistical methods for the detection of STPs have been developed and applied to electrophysiological recordings, but such methods scale poorly with the current size of available parallel spike train recordings (more than 100 neurons. In this work, we introduce a novel method capable of overcoming the computational and statistical limits of existing analysis techniques in detecting repeating STPs within massively parallel spike trains (MPST. We employ advanced data mining techniques to efficiently extract repeating sequences of spikes from the data. Then, we introduce and compare two alternative approaches to distinguish statistically significant patterns from chance sequences. The first approach uses a measure known as conceptual stability, of which we investigate a computationally cheap approximation for applications to such large data sets. The second approach is based on the evaluation of pattern statistical significance. In particular, we provide an extension to STPs of a method we recently introduced for the evaluation of statistical significance of synchronous spike patterns. The performance of the two approaches is evaluated in terms of computational load and statistical power on a variety of artificial data sets that replicate specific features of experimental data. Both methods provide an effective and robust procedure for detection of STPs in MPST data. The method based on significance evaluation shows the best overall performance, although at a higher computational cost. We name the novel procedure the spatio-temporal Spike PAttern Detection and Evaluation (SPADE analysis.

  3. Modular and efficient ozone systems based on massively parallel chemical processing in microchannel plasma arrays: performance and commercialization

    Science.gov (United States)

    Kim, M.-H.; Cho, J. H.; Park, S.-J.; Eden, J. G.

    2017-08-01

    Plasmachemical systems based on the production of a specific molecule (O3) in literally thousands of microchannel plasmas simultaneously have been demonstrated, developed and engineered over the past seven years, and commercialized. At the heart of this new plasma technology is the plasma chip, a flat aluminum strip fabricated by photolithographic and wet chemical processes and comprising 24-48 channels, micromachined into nanoporous aluminum oxide, with embedded electrodes. By integrating 4-6 chips into a module, the mass output of an ozone microplasma system is scaled linearly with the number of modules operating in parallel. A 115 g/hr (2.7 kg/day) ozone system, for example, is realized by the combined output of 18 modules comprising 72 chips and 1,800 microchannels. The implications of this plasma processing architecture for scaling ozone production capability, and reducing capital and service costs when introducing redundancy into the system, are profound. In contrast to conventional ozone generator technology, microplasma systems operate reliably (albeit with reduced output) in ambient air and humidity levels up to 90%, a characteristic attributable to the water adsorption/desorption properties and electrical breakdown strength of nanoporous alumina. Extensive testing has documented chip and system lifetimes (MTBF) beyond 5,000 hours, and efficiencies >130 g/kWh when oxygen is the feedstock gas. Furthermore, the weight and volume of microplasma systems are a factor of 3-10 lower than those for conventional ozone systems of comparable output. Massively-parallel plasmachemical processing offers functionality, performance, and commercial value beyond that afforded by conventional technology, and is currently in operation in more than 30 countries worldwide.

  4. Massively parallel sequencing and genome-wide copy number analysis revealed a clonal relationship in benign metastasizing leiomyoma.

    Science.gov (United States)

    Wu, Ren-Chin; Chao, An-Shine; Lee, Li-Yu; Lin, Gigin; Chen, Shu-Jen; Lu, Yen-Jung; Huang, Huei-Jean; Yen, Chi-Feng; Han, Chien Min; Lee, Yun-Shien; Wang, Tzu-Hao; Chao, Angel

    2017-07-18

    Benign metastasizing leiomyoma (BML) is a rare disease entity typically presenting as multiple extrauterine leiomyomas associated with a uterine leiomyoma. It has been hypothesized that the extrauterine leiomyomata represent distant metastasis of the uterine leiomyoma. To date, the only molecular evidence supporting this hypothesis was derived from clonality analyses based on X-chromosome inactivation assays. Here, we sought to address this issue by examining paired specimens of synchronous pulmonary and uterine leiomyomata from three patients using targeted massively parallel sequencing and molecular inversion probe array analysis for detecting somatic mutations and copy number aberrations. We detected identical non-hot-spot somatic mutations and similar patterns of copy number aberrations (CNAs) in paired pulmonary and uterine leiomyomata from two patients, indicating the clonal relationship between pulmonary and uterine leiomyomata. In addition to loss of chromosome 22q found in the literature, we identified additional recurrent CNAs including losses of chromosome 3q and 11q. In conclusion, our findings of the clonal relationship between synchronous pulmonary and uterine leiomyomas support the hypothesis that BML represents a condition wherein a uterine leiomyoma disseminates to distant extrauterine locations.

  5. LiNbO3: A photovoltaic substrate for massive parallel manipulation and patterning of nano-objects

    International Nuclear Information System (INIS)

    Carrascosa, M.; García-Cabañes, A.; Jubera, M.; Ramiro, J. B.; Agulló-López, F.

    2015-01-01

    The application of evanescent photovoltaic (PV) fields, generated by visible illumination of Fe:LiNbO 3 substrates, for parallel massive trapping and manipulation of micro- and nano-objects is critically reviewed. The technique has been often referred to as photovoltaic or photorefractive tweezers. The main advantage of the new method is that the involved electrophoretic and/or dielectrophoretic forces do not require any electrodes and large scale manipulation of nano-objects can be easily achieved using the patterning capabilities of light. The paper describes the experimental techniques for particle trapping and the main reported experimental results obtained with a variety of micro- and nano-particles (dielectric and conductive) and different illumination configurations (single beam, holographic geometry, and spatial light modulator projection). The report also pays attention to the physical basis of the method, namely, the coupling of the evanescent photorefractive fields to the dielectric response of the nano-particles. The role of a number of physical parameters such as the contrast and spatial periodicities of the illumination pattern or the particle deposition method is discussed. Moreover, the main properties of the obtained particle patterns in relation to potential applications are summarized, and first demonstrations reviewed. Finally, the PV method is discussed in comparison to other patterning strategies, such as those based on the pyroelectric response and the electric fields associated to domain poling of ferroelectric materials

  6. Massively parallel sequencing and genome-wide copy number analysis revealed a clonal relationship in benign metastasizing leiomyoma

    Science.gov (United States)

    Lee, Li-Yu; Lin, Gigin; Chen, Shu-Jen; Lu, Yen-Jung; Huang, Huei-Jean; Yen, Chi-Feng; Han, Chien Min; Lee, Yun-Shien; Wang, Tzu-Hao; Chao, Angel

    2017-01-01

    Benign metastasizing leiomyoma (BML) is a rare disease entity typically presenting as multiple extrauterine leiomyomas associated with a uterine leiomyoma. It has been hypothesized that the extrauterine leiomyomata represent distant metastasis of the uterine leiomyoma. To date, the only molecular evidence supporting this hypothesis was derived from clonality analyses based on X-chromosome inactivation assays. Here, we sought to address this issue by examining paired specimens of synchronous pulmonary and uterine leiomyomata from three patients using targeted massively parallel sequencing and molecular inversion probe array analysis for detecting somatic mutations and copy number aberrations. We detected identical non-hot-spot somatic mutations and similar patterns of copy number aberrations (CNAs) in paired pulmonary and uterine leiomyomata from two patients, indicating the clonal relationship between pulmonary and uterine leiomyomata. In addition to loss of chromosome 22q found in the literature, we identified additional recurrent CNAs including losses of chromosome 3q and 11q. In conclusion, our findings of the clonal relationship between synchronous pulmonary and uterine leiomyomas support the hypothesis that BML represents a condition wherein a uterine leiomyoma disseminates to distant extrauterine locations. PMID:28533481

  7. Massively parallel E-beam inspection: enabling next-generation patterned defect inspection for wafer and mask manufacturing

    Science.gov (United States)

    Malloy, Matt; Thiel, Brad; Bunday, Benjamin D.; Wurm, Stefan; Mukhtar, Maseeh; Quoi, Kathy; Kemen, Thomas; Zeidler, Dirk; Eberle, Anna Lena; Garbowski, Tomasz; Dellemann, Gregor; Peters, Jan Hendrik

    2015-03-01

    SEMATECH aims to identify and enable disruptive technologies to meet the ever-increasing demands of semiconductor high volume manufacturing (HVM). As such, a program was initiated in 2012 focused on high-speed e-beam defect inspection as a complement, and eventual successor, to bright field optical patterned defect inspection [1]. The primary goal is to enable a new technology to overcome the key gaps that are limiting modern day inspection in the fab; primarily, throughput and sensitivity to detect ultra-small critical defects. The program specifically targets revolutionary solutions based on massively parallel e-beam technologies, as opposed to incremental improvements to existing e-beam and optical inspection platforms. Wafer inspection is the primary target, but attention is also being paid to next generation mask inspection. During the first phase of the multi-year program multiple technologies were reviewed, a down-selection was made to the top candidates, and evaluations began on proof of concept systems. A champion technology has been selected and as of late 2014 the program has begun to move into the core technology maturation phase in order to enable eventual commercialization of an HVM system. Performance data from early proof of concept systems will be shown along with roadmaps to achieving HVM performance. SEMATECH's vision for moving from early-stage development to commercialization will be shown, including plans for development with industry leading technology providers.

  8. A bumpy ride on the diagnostic bench of massive parallel sequencing, the case of the mitochondrial genome.

    Directory of Open Access Journals (Sweden)

    Kim Vancampenhout

    Full Text Available The advent of massive parallel sequencing (MPS has revolutionized the field of human molecular genetics, including the diagnostic study of mitochondrial (mt DNA dysfunction. The analysis of the complete mitochondrial genome using MPS platforms is now common and will soon outrun conventional sequencing. However, the development of a robust and reliable protocol is rather challenging. A previous pilot study for the re-sequencing of human mtDNA revealed an uneven coverage, affecting predominantly part of the plus strand. In an attempt to address this problem, we undertook a comparative study of standard and modified protocols for the Ion Torrent PGM system. We could not improve strand representation by altering the recommended shearing methodology of the standard workflow or omitting the DNA polymerase amplification step from the library construction process. However, we were able to associate coverage bias of the plus strand with a specific sequence motif. Additionally, we compared coverage and variant calling across technologies. The same samples were also sequenced on a MiSeq device which showed that coverage and heteroplasmic variant calling were much improved.

  9. Enabling inspection solutions for future mask technologies through the development of massively parallel E-Beam inspection

    Science.gov (United States)

    Malloy, Matt; Thiel, Brad; Bunday, Benjamin D.; Wurm, Stefan; Jindal, Vibhu; Mukhtar, Maseeh; Quoi, Kathy; Kemen, Thomas; Zeidler, Dirk; Eberle, Anna Lena; Garbowski, Tomasz; Dellemann, Gregor; Peters, Jan Hendrik

    2015-09-01

    The new device architectures and materials being introduced for sub-10nm manufacturing, combined with the complexity of multiple patterning and the need for improved hotspot detection strategies, have pushed current wafer inspection technologies to their limits. In parallel, gaps in mask inspection capability are growing as new generations of mask technologies are developed to support these sub-10nm wafer manufacturing requirements. In particular, the challenges associated with nanoimprint and extreme ultraviolet (EUV) mask inspection require new strategies that enable fast inspection at high sensitivity. The tradeoffs between sensitivity and throughput for optical and e-beam inspection are well understood. Optical inspection offers the highest throughput and is the current workhorse of the industry for both wafer and mask inspection. E-beam inspection offers the highest sensitivity but has historically lacked the throughput required for widespread adoption in the manufacturing environment. It is unlikely that continued incremental improvements to either technology will meet tomorrow's requirements, and therefore a new inspection technology approach is required; one that combines the high-throughput performance of optical with the high-sensitivity capabilities of e-beam inspection. To support the industry in meeting these challenges SUNY Poly SEMATECH has evaluated disruptive technologies that can meet the requirements for high volume manufacturing (HVM), for both the wafer fab [1] and the mask shop. Highspeed massively parallel e-beam defect inspection has been identified as the leading candidate for addressing the key gaps limiting today's patterned defect inspection techniques. As of late 2014 SUNY Poly SEMATECH completed a review, system analysis, and proof of concept evaluation of multiple e-beam technologies for defect inspection. A champion approach has been identified based on a multibeam technology from Carl Zeiss. This paper includes a discussion on the

  10. Massively parallel sequencing and targeted exomes in familial kidney disease can diagnose underlying genetic disorders.

    Science.gov (United States)

    Mallett, Andrew J; McCarthy, Hugh J; Ho, Gladys; Holman, Katherine; Farnsworth, Elizabeth; Patel, Chirag; Fletcher, Jeffery T; Mallawaarachchi, Amali; Quinlan, Catherine; Bennetts, Bruce; Alexander, Stephen I

    2017-12-01

    Inherited kidney disease encompasses a broad range of disorders, with both multiple genes contributing to specific phenotypes and single gene defects having multiple clinical presentations. Advances in sequencing capacity may allow a genetic diagnosis for familial renal disease, by testing the increasing number of known causative genes. However, there has been limited translation of research findings of causative genes into clinical settings. Here, we report the results of a national accredited diagnostic genetic service for familial renal disease. An expert multidisciplinary team developed a targeted exomic sequencing approach with ten curated multigene panels (207 genes) and variant assessment individualized to the patient's phenotype. A genetic diagnosis (pathogenic genetic variant[s]) was identified in 58 of 135 families referred in two years. The genetic diagnosis rate was similar between families with a pediatric versus adult proband (46% vs 40%), although significant differences were found in certain panels such as atypical hemolytic uremic syndrome (88% vs 17%). High diagnostic rates were found for Alport syndrome (22 of 27) and tubular disorders (8 of 10), whereas the monogenic diagnostic rate for congenital anomalies of the kidney and urinary tract was one of 13. Quality reporting was aided by a strong clinical renal and genetic multidisciplinary committee review. Importantly, for a diagnostic service, few variants of uncertain significance were found with this targeted, phenotype-based approach. Thus, use of targeted massively parallel sequencing approaches in inherited kidney disease has a significant capacity to diagnose the underlying genetic disorder across most renal phenotypes. Copyright © 2017 International Society of Nephrology. Published by Elsevier Inc. All rights reserved.

  11. Application of affymetrix array and massively parallel signature sequencing for identification of genes involved in prostate cancer progression

    International Nuclear Information System (INIS)

    Oudes, Asa J; Roach, Jared C; Walashek, Laura S; Eichner, Lillian J; True, Lawrence D; Vessella, Robert L; Liu, Alvin Y

    2005-01-01

    Affymetrix GeneChip Array and Massively Parallel Signature Sequencing (MPSS) are two high throughput methodologies used to profile transcriptomes. Each method has certain strengths and weaknesses; however, no comparison has been made between the data derived from Affymetrix arrays and MPSS. In this study, two lineage-related prostate cancer cell lines, LNCaP and C4-2, were used for transcriptome analysis with the aim of identifying genes associated with prostate cancer progression. Affymetrix GeneChip array and MPSS analyses were performed. Data was analyzed with GeneSpring 6.2 and in-house perl scripts. Expression array results were verified with RT-PCR. Comparison of the data revealed that both technologies detected genes the other did not. In LNCaP, 3,180 genes were only detected by Affymetrix and 1,169 genes were only detected by MPSS. Similarly, in C4-2, 4,121 genes were only detected by Affymetrix and 1,014 genes were only detected by MPSS. Analysis of the combined transcriptomes identified 66 genes unique to LNCaP cells and 33 genes unique to C4-2 cells. Expression analysis of these genes in prostate cancer specimens showed CA1 to be highly expressed in bone metastasis but not expressed in primary tumor and EPHA7 to be expressed in normal prostate and primary tumor but not bone metastasis. Our data indicates that transcriptome profiling with a single methodology will not fully assess the expression of all genes in a cell line. A combination of transcription profiling technologies such as DNA array and MPSS provides a more robust means to assess the expression profile of an RNA sample. Finally, genes that were differentially expressed in cell lines were also differentially expressed in primary prostate cancer and its metastases

  12. Accelerating Monte Carlo Molecular Simulations Using Novel Extrapolation Schemes Combined with Fast Database Generation on Massively Parallel Machines

    KAUST Repository

    Amir, Sahar Z.

    2013-05-01

    We introduce an efficient thermodynamically consistent technique to extrapolate and interpolate normalized Canonical NVT ensemble averages like pressure and energy for Lennard-Jones (L-J) fluids. Preliminary results show promising applicability in oil and gas modeling, where accurate determination of thermodynamic properties in reservoirs is challenging. The thermodynamic interpolation and thermodynamic extrapolation schemes predict ensemble averages at different thermodynamic conditions from expensively simulated data points. The methods reweight and reconstruct previously generated database values of Markov chains at neighboring temperature and density conditions. To investigate the efficiency of these methods, two databases corresponding to different combinations of normalized density and temperature are generated. One contains 175 Markov chains with 10,000,000 MC cycles each and the other contains 3000 Markov chains with 61,000,000 MC cycles each. For such massive database creation, two algorithms to parallelize the computations have been investigated. The accuracy of the thermodynamic extrapolation scheme is investigated with respect to classical interpolation and extrapolation. Finally, thermodynamic interpolation benefiting from four neighboring Markov chains points is implemented and compared with previous schemes. The thermodynamic interpolation scheme using knowledge from the four neighboring points proves to be more accurate than the thermodynamic extrapolation from the closest point only, while both thermodynamic extrapolation and thermodynamic interpolation are more accurate than the classical interpolation and extrapolation. The investigated extrapolation scheme has great potential in oil and gas reservoir modeling.That is, such a scheme has the potential to speed up the MCMC thermodynamic computation to be comparable with conventional Equation of State approaches in efficiency. In particular, this makes it applicable to large-scale optimization of L

  13. Genomic Characterization of Non–Small-Cell Lung Cancer in African Americans by Targeted Massively Parallel Sequencing

    Science.gov (United States)

    Araujo, Luiz H.; Timmers, Cynthia; Bell, Erica Hlavin; Shilo, Konstantin; Lammers, Philip E.; Zhao, Weiqiang; Natarajan, Thanemozhi G.; Miller, Clinton J.; Zhang, Jianying; Yilmaz, Ayse S.; Liu, Tom; Coombes, Kevin; Amann, Joseph; Carbone, David P.

    2015-01-01

    Purpose Technologic advances have enabled the comprehensive analysis of genetic perturbations in non–small-cell lung cancer (NSCLC); however, African Americans have often been underrepresented in these studies. This ethnic group has higher lung cancer incidence and mortality rates, and some studies have suggested a lower incidence of epidermal growth factor receptor mutations. Herein, we report the most in-depth molecular profile of NSCLC in African Americans to date. Methods A custom panel was designed to cover the coding regions of 81 NSCLC-related genes and 40 ancestry-informative markers. Clinical samples were sequenced on a massively parallel sequencing instrument, and anaplastic lymphoma kinase translocation was evaluated by fluorescent in situ hybridization. Results The study cohort included 99 patients (61% males, 94% smokers) comprising 31 squamous and 68 nonsquamous cell carcinomas. We detected 227 nonsilent variants in the coding sequence, including 24 samples with nonoverlapping, classic driver alterations. The frequency of driver mutations was not significantly different from that of whites, and no association was found between genetic ancestry and the presence of somatic mutations. Copy number alteration analysis disclosed distinguishable amplifications in the 3q chromosome arm in squamous cell carcinomas and pointed toward a handful of targetable alterations. We also found frequent SMARCA4 mutations and protein loss, mostly in driver-negative tumors. Conclusion Our data suggest that African American ancestry may not be significantly different from European/white background for the presence of somatic driver mutations in NSCLC. Furthermore, we demonstrated that using a comprehensive genotyping approach could identify numerous targetable alterations, with potential impact on therapeutic decisions. PMID:25918285

  14. Massively parallel signature sequencing and bioinformatics analysis identifies up-regulation of TGFBI and SOX4 in human glioblastoma.

    Directory of Open Access Journals (Sweden)

    Biaoyang Lin

    Full Text Available BACKGROUND: A comprehensive network-based understanding of molecular pathways abnormally altered in glioblastoma multiforme (GBM is essential for developing effective therapeutic approaches for this deadly disease. METHODOLOGY/PRINCIPAL FINDINGS: Applying a next generation sequencing technology, massively parallel signature sequencing (MPSS, we identified a total of 4535 genes that are differentially expressed between normal brain and GBM tissue. The expression changes of three up-regulated genes, CHI3L1, CHI3L2, and FOXM1, and two down-regulated genes, neurogranin and L1CAM, were confirmed by quantitative PCR. Pathway analysis revealed that TGF- beta pathway related genes were significantly up-regulated in GBM tumor samples. An integrative pathway analysis of the TGF beta signaling network identified two alternative TGF-beta signaling pathways mediated by SOX4 (sex determining region Y-box 4 and TGFBI (Transforming growth factor beta induced. Quantitative RT-PCR and immunohistochemistry staining demonstrated that SOX4 and TGFBI expression is elevated in GBM tissues compared with normal brain tissues at both the RNA and protein levels. In vitro functional studies confirmed that TGFBI and SOX4 expression is increased by TGF-beta stimulation and decreased by a specific inhibitor of TGF-beta receptor 1 kinase. CONCLUSIONS/SIGNIFICANCE: Our MPSS database for GBM and normal brain tissues provides a useful resource for the scientific community. The identification of non-SMAD mediated TGF-beta signaling pathways acting through SOX4 and TGFBI (GENE ID:7045 in GBM indicates that these alternative pathways should be considered, in addition to the canonical SMAD mediated pathway, in the development of new therapeutic strategies targeting TGF-beta signaling in GBM. Finally, the construction of an extended TGF-beta signaling network with overlaid gene expression changes between GBM and normal brain extends our understanding of the biology of GBM.

  15. Development and application of a 6.5 million feature Affymetrix Genechip® for massively parallel discovery of single position polymorphisms in lettuce (Lactuca spp.)

    OpenAIRE

    Stoffel, Kevin; Kozik, Alexander; Ashrafi, Hamid; Cui, Xinping; Tan, Xiaoping; Hill, Theresa; Reyes-Chin-Wo, Sebastian; Truco, Maria-Jose; Michelmore, Richard W; Van Deynze, Allen

    2012-01-01

    Abstract Background High-resolution genetic maps are needed in many crops to help characterize the genetic diversity that determines agriculturally important traits. Hybridization to microarrays to detect single feature polymorphisms is a powerful technique for marker discovery and genotyping because of its highly parallel nature. However, microarrays designed for gene expression analysis rarely provide sufficient gene coverage for optimal detection of nucleotide polymorphisms, which limits u...

  16. Three-dimensional gyrokinetic particle-in-cell simulation of plasmas on a massively parallel computer: Final report on LDRD Core Competency Project, FY 1991--FY 1993

    International Nuclear Information System (INIS)

    Byers, J.A.; Williams, T.J.; Cohen, B.I.; Dimits, A.M.

    1994-01-01

    One of the programs of the Magnetic fusion Energy (MFE) Theory and computations Program is studying the anomalous transport of thermal energy across the field lines in the core of a tokamak. We use the method of gyrokinetic particle-in-cell simulation in this study. For this LDRD project we employed massively parallel processing, new algorithms, and new algorithms, and new formal techniques to improve this research. Specifically, we sought to take steps toward: researching experimentally-relevant parameters in our simulations, learning parallel computing to have as a resource for our group, and achieving a 100 x speedup over our starting-point Cray2 simulation code's performance

  17. Significant Association between Sulfate-Reducing Bacteria and Uranium-Reducing Microbial Communities as Revealed by a Combined Massively Parallel Sequencing-Indicator Species Approach▿ †

    OpenAIRE

    Cardenas, Erick; Wu, Wei-Min; Leigh, Mary Beth; Carley, Jack; Carroll, Sue; Gentry, Terry; Luo, Jian; Watson, David; Gu, Baohua; Ginder-Vogel, Matthew; Kitanidis, Peter K.; Jardine, Philip M.; Zhou, Jizhong; Criddle, Craig S.; Marsh, Terence L.

    2010-01-01

    Massively parallel sequencing has provided a more affordable and high-throughput method to study microbial communities, although it has mostly been used in an exploratory fashion. We combined pyrosequencing with a strict indicator species statistical analysis to test if bacteria specifically responded to ethanol injection that successfully promoted dissimilatory uranium(VI) reduction in the subsurface of a uranium contamination plume at the Oak Ridge Field Research Center in Tennessee. Remedi...

  18. Massively Parallel Assimilation of TOGA/TAO and Topex/Poseidon Measurements into a Quasi Isopycnal Ocean General Circulation Model Using an Ensemble Kalman Filter

    Science.gov (United States)

    Keppenne, Christian L.; Rienecker, Michele; Borovikov, Anna Y.; Suarez, Max

    1999-01-01

    A massively parallel ensemble Kalman filter (EnKF)is used to assimilate temperature data from the TOGA/TAO array and altimetry from TOPEX/POSEIDON into a Pacific basin version of the NASA Seasonal to Interannual Prediction Project (NSIPP)ls quasi-isopycnal ocean general circulation model. The EnKF is an approximate Kalman filter in which the error-covariance propagation step is modeled by the integration of multiple instances of a numerical model. An estimate of the true error covariances is then inferred from the distribution of the ensemble of model state vectors. This inplementation of the filter takes advantage of the inherent parallelism in the EnKF algorithm by running all the model instances concurrently. The Kalman filter update step also occurs in parallel by having each processor process the observations that occur in the region of physical space for which it is responsible. The massively parallel data assimilation system is validated by withholding some of the data and then quantifying the extent to which the withheld information can be inferred from the assimilation of the remaining data. The distributions of the forecast and analysis error covariances predicted by the ENKF are also examined.

  19. Three pillars for achieving quantum mechanical molecular dynamics simulations of huge systems: Divide-and-conquer, density-functional tight-binding, and massively parallel computation.

    Science.gov (United States)

    Nishizawa, Hiroaki; Nishimura, Yoshifumi; Kobayashi, Masato; Irle, Stephan; Nakai, Hiromi

    2016-08-05

    The linear-scaling divide-and-conquer (DC) quantum chemical methodology is applied to the density-functional tight-binding (DFTB) theory to develop a massively parallel program that achieves on-the-fly molecular reaction dynamics simulations of huge systems from scratch. The functions to perform large scale geometry optimization and molecular dynamics with DC-DFTB potential energy surface are implemented to the program called DC-DFTB-K. A novel interpolation-based algorithm is developed for parallelizing the determination of the Fermi level in the DC method. The performance of the DC-DFTB-K program is assessed using a laboratory computer and the K computer. Numerical tests show the high efficiency of the DC-DFTB-K program, a single-point energy gradient calculation of a one-million-atom system is completed within 60 s using 7290 nodes of the K computer. © 2016 Wiley Periodicals, Inc. © 2016 Wiley Periodicals, Inc.

  20. Inter-laboratory evaluation of the EUROFORGEN Global ancestry-informative SNP panel by massively parallel sequencing using the Ion PGM™.

    Science.gov (United States)

    Eduardoff, M; Gross, T E; Santos, C; de la Puente, M; Ballard, D; Strobl, C; Børsting, C; Morling, N; Fusco, L; Hussing, C; Egyed, B; Souto, L; Uacyisrael, J; Syndercombe Court, D; Carracedo, Á; Lareu, M V; Schneider, P M; Parson, W; Phillips, C; Parson, W; Phillips, C

    2016-07-01

    The EUROFORGEN Global ancestry-informative SNP (AIM-SNPs) panel is a forensic multiplex of 128 markers designed to differentiate an individual's ancestry from amongst the five continental population groups of Africa, Europe, East Asia, Native America, and Oceania. A custom multiplex of AmpliSeq™ PCR primers was designed for the Global AIM-SNPs to perform massively parallel sequencing using the Ion PGM™ system. This study assessed individual SNP genotyping precision using the Ion PGM™, the forensic sensitivity of the multiplex using dilution series, degraded DNA plus simple mixtures, and the ancestry differentiation power of the final panel design, which required substitution of three original ancestry-informative SNPs with alternatives. Fourteen populations that had not been previously analyzed were genotyped using the custom multiplex and these studies allowed assessment of genotyping performance by comparison of data across five laboratories. Results indicate a low level of genotyping error can still occur from sequence misalignment caused by homopolymeric tracts close to the target SNP, despite careful scrutiny of candidate SNPs at the design stage. Such sequence misalignment required the exclusion of component SNP rs2080161 from the Global AIM-SNPs panel. However, the overall genotyping precision and sensitivity of this custom multiplex indicates the Ion PGM™ assay for the Global AIM-SNPs is highly suitable for forensic ancestry analysis with massively parallel sequencing. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.

  1. Dissecting Cell-Type Composition and Activity-Dependent Transcriptional State in Mammalian Brains by Massively Parallel Single-Nucleus RNA-Seq.

    Science.gov (United States)

    Hu, Peng; Fabyanic, Emily; Kwon, Deborah Y; Tang, Sheng; Zhou, Zhaolan; Wu, Hao

    2017-12-07

    Massively parallel single-cell RNA sequencing can precisely resolve cellular diversity in a high-throughput manner at low cost, but unbiased isolation of intact single cells from complex tissues such as adult mammalian brains is challenging. Here, we integrate sucrose-gradient-assisted purification of nuclei with droplet microfluidics to develop a highly scalable single-nucleus RNA-seq approach (sNucDrop-seq), which is free of enzymatic dissociation and nucleus sorting. By profiling ∼18,000 nuclei isolated from cortical tissues of adult mice, we demonstrate that sNucDrop-seq not only accurately reveals neuronal and non-neuronal subtype composition with high sensitivity but also enables in-depth analysis of transient transcriptional states driven by neuronal activity, at single-cell resolution, in vivo. Copyright © 2017 Elsevier Inc. All rights reserved.

  2. MCBooster: a library for fast Monte Carlo generation of phase-space decays on massively parallel platforms.

    Science.gov (United States)

    Alves Júnior, A. A.; Sokoloff, M. D.

    2017-10-01

    MCBooster is a header-only, C++11-compliant library that provides routines to generate and perform calculations on large samples of phase space Monte Carlo events. To achieve superior performance, MCBooster is capable to perform most of its calculations in parallel using CUDA- and OpenMP-enabled devices. MCBooster is built on top of the Thrust library and runs on Linux systems. This contribution summarizes the main features of MCBooster. A basic description of the user interface and some examples of applications are provided, along with measurements of performance in a variety of environments

  3. Tinker-HP: a massively parallel molecular dynamics package for multiscale simulations of large complex systems with advanced point dipole polarizable force fields.

    Science.gov (United States)

    Lagardère, Louis; Jolly, Luc-Henri; Lipparini, Filippo; Aviat, Félix; Stamm, Benjamin; Jing, Zhifeng F; Harger, Matthew; Torabifard, Hedieh; Cisneros, G Andrés; Schnieders, Michael J; Gresh, Nohad; Maday, Yvon; Ren, Pengyu Y; Ponder, Jay W; Piquemal, Jean-Philip

    2018-01-28

    We present Tinker-HP, a massively MPI parallel package dedicated to classical molecular dynamics (MD) and to multiscale simulations, using advanced polarizable force fields (PFF) encompassing distributed multipoles electrostatics. Tinker-HP is an evolution of the popular Tinker package code that conserves its simplicity of use and its reference double precision implementation for CPUs. Grounded on interdisciplinary efforts with applied mathematics, Tinker-HP allows for long polarizable MD simulations on large systems up to millions of atoms. We detail in the paper the newly developed extension of massively parallel 3D spatial decomposition to point dipole polarizable models as well as their coupling to efficient Krylov iterative and non-iterative polarization solvers. The design of the code allows the use of various computer systems ranging from laboratory workstations to modern petascale supercomputers with thousands of cores. Tinker-HP proposes therefore the first high-performance scalable CPU computing environment for the development of next generation point dipole PFFs and for production simulations. Strategies linking Tinker-HP to Quantum Mechanics (QM) in the framework of multiscale polarizable self-consistent QM/MD simulations are also provided. The possibilities, performances and scalability of the software are demonstrated via benchmarks calculations using the polarizable AMOEBA force field on systems ranging from large water boxes of increasing size and ionic liquids to (very) large biosystems encompassing several proteins as well as the complete satellite tobacco mosaic virus and ribosome structures. For small systems, Tinker-HP appears to be competitive with the Tinker-OpenMM GPU implementation of Tinker. As the system size grows, Tinker-HP remains operational thanks to its access to distributed memory and takes advantage of its new algorithmic enabling for stable long timescale polarizable simulations. Overall, a several thousand-fold acceleration over

  4. Influence of nucleotide modifications at the C2' position on the Hoogsteen base-paired parallel-stranded duplex of poly(A) RNA.

    Science.gov (United States)

    Copp, William; Denisov, Alexey Y; Xie, Jingwei; Noronha, Anne M; Liczner, Christopher; Safaee, Nozhat; Wilds, Christopher J; Gehring, Kalle

    2017-09-29

    Polyadenylate (poly(A)) has the ability to form a parallel duplex with Hoogsteen adenine:adenine base pairs at low pH or in the presence of ammonium ions. In order to evaluate the potential of this structural motif for nucleic acid-based nanodevices, we characterized the effects on duplex stability of substitutions of the ribose sugar with 2'-deoxyribose, 2'-O-methyl-ribose, 2'-deoxy-2'-fluoro-ribose, arabinose and 2'-deoxy-2'-fluoro-arabinose. Deoxyribose substitutions destabilized the poly(A) duplex both at low pH and in the presence of ammonium ions: no duplex formation could be detected with poly(A) DNA oligomers. Other sugar C2' modifications gave a variety of effects. Arabinose and 2'-deoxy-2'-fluoro-arabinose nucleotides strongly destabilized poly(A) duplex formation. In contrast, 2'-O-methyl and 2'-deoxy-2'-fluoro-ribo modifications were stabilizing either at pH 4 or in the presence of ammonium ions. The differential effect suggests they could be used to design molecules selectively responsive to pH or ammonium ions. To understand the destabilization by deoxyribose, we determined the structures of poly(A) duplexes with a single DNA residue by nuclear magnetic resonance spectroscopy and X-ray crystallography. The structures revealed minor structural perturbations suggesting that the combination of sugar pucker propensity, hydrogen bonding, pKa shifts and changes in hydration determine duplex stability. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  5. Rapid profiling of the antigen regions recognized by serum antibodies using massively parallel sequencing of antigen-specific libraries.

    KAUST Repository

    Domina, Maria; Lanza Cariccio, Veronica; Benfatto, Salvatore; D'Aliberti, Deborah; Venza, Mario; Borgogni, Erica; Castellino, Flora; Biondo, Carmelo; D'Andrea, Daniel; Grassi, Luigi; Tramontano, Anna; Teti, Giuseppe; Felici, Franco; Beninati, Concetta

    2014-01-01

    There is a need for techniques capable of identifying the antigenic epitopes targeted by polyclonal antibody responses during deliberate or natural immunization. Although successful, traditional phage library screening is laborious and can map only some of the epitopes. To accelerate and improve epitope identification, we have employed massive sequencing of phage-displayed antigen-specific libraries using the Illumina MiSeq platform. This enabled us to precisely identify the regions of a model antigen, the meningococcal NadA virulence factor, targeted by serum antibodies in vaccinated individuals and to rank hundreds of antigenic fragments according to their immunoreactivity. We found that next generation sequencing can significantly empower the analysis of antigen-specific libraries by allowing simultaneous processing of dozens of library/serum combinations in less than two days, including the time required for antibody-mediated library selection. Moreover, compared with traditional plaque picking, the new technology (named Phage-based Representation OF Immuno-Ligand Epitope Repertoire or PROFILER) provides superior resolution in epitope identification. PROFILER seems ideally suited to streamline and guide rational antigen design, adjuvant selection, and quality control of newly produced vaccines. Furthermore, this method is also susceptible to find important applications in other fields covered by traditional quantitative serology.

  6. Rapid profiling of the antigen regions recognized by serum antibodies using massively parallel sequencing of antigen-specific libraries.

    Directory of Open Access Journals (Sweden)

    Maria Domina

    Full Text Available There is a need for techniques capable of identifying the antigenic epitopes targeted by polyclonal antibody responses during deliberate or natural immunization. Although successful, traditional phage library screening is laborious and can map only some of the epitopes. To accelerate and improve epitope identification, we have employed massive sequencing of phage-displayed antigen-specific libraries using the Illumina MiSeq platform. This enabled us to precisely identify the regions of a model antigen, the meningococcal NadA virulence factor, targeted by serum antibodies in vaccinated individuals and to rank hundreds of antigenic fragments according to their immunoreactivity. We found that next generation sequencing can significantly empower the analysis of antigen-specific libraries by allowing simultaneous processing of dozens of library/serum combinations in less than two days, including the time required for antibody-mediated library selection. Moreover, compared with traditional plaque picking, the new technology (named Phage-based Representation OF Immuno-Ligand Epitope Repertoire or PROFILER provides superior resolution in epitope identification. PROFILER seems ideally suited to streamline and guide rational antigen design, adjuvant selection, and quality control of newly produced vaccines. Furthermore, this method is also susceptible to find important applications in other fields covered by traditional quantitative serology.

  7. Rapid profiling of the antigen regions recognized by serum antibodies using massively parallel sequencing of antigen-specific libraries.

    KAUST Repository

    Domina, Maria

    2014-12-04

    There is a need for techniques capable of identifying the antigenic epitopes targeted by polyclonal antibody responses during deliberate or natural immunization. Although successful, traditional phage library screening is laborious and can map only some of the epitopes. To accelerate and improve epitope identification, we have employed massive sequencing of phage-displayed antigen-specific libraries using the Illumina MiSeq platform. This enabled us to precisely identify the regions of a model antigen, the meningococcal NadA virulence factor, targeted by serum antibodies in vaccinated individuals and to rank hundreds of antigenic fragments according to their immunoreactivity. We found that next generation sequencing can significantly empower the analysis of antigen-specific libraries by allowing simultaneous processing of dozens of library/serum combinations in less than two days, including the time required for antibody-mediated library selection. Moreover, compared with traditional plaque picking, the new technology (named Phage-based Representation OF Immuno-Ligand Epitope Repertoire or PROFILER) provides superior resolution in epitope identification. PROFILER seems ideally suited to streamline and guide rational antigen design, adjuvant selection, and quality control of newly produced vaccines. Furthermore, this method is also susceptible to find important applications in other fields covered by traditional quantitative serology.

  8. Massively parallel signal processing using the graphics processing unit for real-time brain-computer interface feature extraction

    Directory of Open Access Journals (Sweden)

    J. Adam Wilson

    2009-07-01

    Full Text Available The clock speeds of modern computer processors have nearly plateaued in the past five years. Consequently, neural prosthetic systems that rely on processing large quantities of data in a short period of time face a bottleneck, in that it may not be possible to process all of the data recorded from an electrode array with high channel counts and bandwidth, such as electrocorticographic grids or other implantable systems. Therefore, in this study a method of using the processing capabilities of a graphics card (GPU was developed for real-time neural signal processing of a brain-computer interface (BCI. The NVIDIA CUDA system was used to offload processing to the GPU, which is capable of running many operations in parallel, potentially greatly increasing the speed of existing algorithms. The BCI system records many channels of data, which are processed and translated into a control signal, such as the movement of a computer cursor. This signal processing chain involves computing a matrix-matrix multiplication (i.e., a spatial filter, followed by calculating the power spectral density on every channel using an auto-regressive method, and finally classifying appropriate features for control. In this study, the first two computationally-intensive steps were implemented on the GPU, and the speed was compared to both the current implementation and a CPU-based implementation that uses multi-threading. Significant performance gains were obtained with GPU processing: the current implementation processed 1000 channels in 933 ms, while the new GPU method took only 27 ms, an improvement of nearly 35 times.

  9. Massively Parallel Signal Processing using the Graphics Processing Unit for Real-Time Brain-Computer Interface Feature Extraction.

    Science.gov (United States)

    Wilson, J Adam; Williams, Justin C

    2009-01-01

    The clock speeds of modern computer processors have nearly plateaued in the past 5 years. Consequently, neural prosthetic systems that rely on processing large quantities of data in a short period of time face a bottleneck, in that it may not be possible to process all of the data recorded from an electrode array with high channel counts and bandwidth, such as electrocorticographic grids or other implantable systems. Therefore, in this study a method of using the processing capabilities of a graphics card [graphics processing unit (GPU)] was developed for real-time neural signal processing of a brain-computer interface (BCI). The NVIDIA CUDA system was used to offload processing to the GPU, which is capable of running many operations in parallel, potentially greatly increasing the speed of existing algorithms. The BCI system records many channels of data, which are processed and translated into a control signal, such as the movement of a computer cursor. This signal processing chain involves computing a matrix-matrix multiplication (i.e., a spatial filter), followed by calculating the power spectral density on every channel using an auto-regressive method, and finally classifying appropriate features for control. In this study, the first two computationally intensive steps were implemented on the GPU, and the speed was compared to both the current implementation and a central processing unit-based implementation that uses multi-threading. Significant performance gains were obtained with GPU processing: the current implementation processed 1000 channels of 250 ms in 933 ms, while the new GPU method took only 27 ms, an improvement of nearly 35 times.

  10. Development and application of a 6.5 million feature Affymetrix Genechip® for massively parallel discovery of single position polymorphisms in lettuce (Lactuca spp.

    Directory of Open Access Journals (Sweden)

    Stoffel Kevin

    2012-05-01

    Full Text Available Abstract Background High-resolution genetic maps are needed in many crops to help characterize the genetic diversity that determines agriculturally important traits. Hybridization to microarrays to detect single feature polymorphisms is a powerful technique for marker discovery and genotyping because of its highly parallel nature. However, microarrays designed for gene expression analysis rarely provide sufficient gene coverage for optimal detection of nucleotide polymorphisms, which limits utility in species with low rates of polymorphism such as lettuce (Lactuca sativa. Results We developed a 6.5 million feature Affymetrix GeneChip® for efficient polymorphism discovery and genotyping, as well as for analysis of gene expression in lettuce. Probes on the microarray were designed from 26,809 unigenes from cultivated lettuce and an additional 8,819 unigenes from four related species (L. serriola, L. saligna, L. virosa and L. perennis. Where possible, probes were tiled with a 2 bp stagger, alternating on each DNA strand; providing an average of 187 probes covering approximately 600 bp for each of over 35,000 unigenes; resulting in up to 13 fold redundancy in coverage per nucleotide. We developed protocols for hybridization of genomic DNA to the GeneChip® and refined custom algorithms that utilized coverage from multiple, high quality probes to detect single position polymorphisms in 2 bp sliding windows across each unigene. This allowed us to detect greater than 18,000 polymorphisms between the parental lines of our core mapping population, as well as numerous polymorphisms between cultivated lettuce and wild species in the lettuce genepool. Using marker data from our diversity panel comprised of 52 accessions from the five species listed above, we were able to separate accessions by species using both phylogenetic and principal component analyses. Additionally, we estimated the diversity between different types of cultivated lettuce and

  11. Development and application of a 6.5 million feature Affymetrix Genechip® for massively parallel discovery of single position polymorphisms in lettuce (Lactuca spp.).

    Science.gov (United States)

    Stoffel, Kevin; van Leeuwen, Hans; Kozik, Alexander; Caldwell, David; Ashrafi, Hamid; Cui, Xinping; Tan, Xiaoping; Hill, Theresa; Reyes-Chin-Wo, Sebastian; Truco, Maria-Jose; Michelmore, Richard W; Van Deynze, Allen

    2012-05-14

    High-resolution genetic maps are needed in many crops to help characterize the genetic diversity that determines agriculturally important traits. Hybridization to microarrays to detect single feature polymorphisms is a powerful technique for marker discovery and genotyping because of its highly parallel nature. However, microarrays designed for gene expression analysis rarely provide sufficient gene coverage for optimal detection of nucleotide polymorphisms, which limits utility in species with low rates of polymorphism such as lettuce (Lactuca sativa). We developed a 6.5 million feature Affymetrix GeneChip® for efficient polymorphism discovery and genotyping, as well as for analysis of gene expression in lettuce. Probes on the microarray were designed from 26,809 unigenes from cultivated lettuce and an additional 8,819 unigenes from four related species (L. serriola, L. saligna, L. virosa and L. perennis). Where possible, probes were tiled with a 2 bp stagger, alternating on each DNA strand; providing an average of 187 probes covering approximately 600 bp for each of over 35,000 unigenes; resulting in up to 13 fold redundancy in coverage per nucleotide. We developed protocols for hybridization of genomic DNA to the GeneChip® and refined custom algorithms that utilized coverage from multiple, high quality probes to detect single position polymorphisms in 2 bp sliding windows across each unigene. This allowed us to detect greater than 18,000 polymorphisms between the parental lines of our core mapping population, as well as numerous polymorphisms between cultivated lettuce and wild species in the lettuce genepool. Using marker data from our diversity panel comprised of 52 accessions from the five species listed above, we were able to separate accessions by species using both phylogenetic and principal component analyses. Additionally, we estimated the diversity between different types of cultivated lettuce and distinguished morphological types. By hybridizing

  12. Massively parallel implementations of coupled-cluster methods for electron spin resonance spectra. I. Isotropic hyperfine coupling tensors in large radicals

    Energy Technology Data Exchange (ETDEWEB)

    Verma, Prakash; Morales, Jorge A., E-mail: jorge.morales@ttu.edu [Department of Chemistry and Biochemistry, Texas Tech University, P.O. Box 41061, Lubbock, Texas 79409-1061 (United States); Perera, Ajith [Department of Chemistry and Biochemistry, Texas Tech University, P.O. Box 41061, Lubbock, Texas 79409-1061 (United States); Department of Chemistry, Quantum Theory Project, University of Florida, Gainesville, Florida 32611 (United States)

    2013-11-07

    Coupled cluster (CC) methods provide highly accurate predictions of molecular properties, but their high computational cost has precluded their routine application to large systems. Fortunately, recent computational developments in the ACES III program by the Bartlett group [the OED/ERD atomic integral package, the super instruction processor, and the super instruction architecture language] permit overcoming that limitation by providing a framework for massively parallel CC implementations. In that scheme, we are further extending those parallel CC efforts to systematically predict the three main electron spin resonance (ESR) tensors (A-, g-, and D-tensors) to be reported in a series of papers. In this paper inaugurating that series, we report our new ACES III parallel capabilities that calculate isotropic hyperfine coupling constants in 38 neutral, cationic, and anionic radicals that include the {sup 11}B, {sup 17}O, {sup 9}Be, {sup 19}F, {sup 1}H, {sup 13}C, {sup 35}Cl, {sup 33}S,{sup 14}N, {sup 31}P, and {sup 67}Zn nuclei. Present parallel calculations are conducted at the Hartree-Fock (HF), second-order many-body perturbation theory [MBPT(2)], CC singles and doubles (CCSD), and CCSD with perturbative triples [CCSD(T)] levels using Roos augmented double- and triple-zeta atomic natural orbitals basis sets. HF results consistently overestimate isotropic hyperfine coupling constants. However, inclusion of electron correlation effects in the simplest way via MBPT(2) provides significant improvements in the predictions, but not without occasional failures. In contrast, CCSD results are consistently in very good agreement with experimental results. Inclusion of perturbative triples to CCSD via CCSD(T) leads to small improvements in the predictions, which might not compensate for the extra computational effort at a non-iterative N{sup 7}-scaling in CCSD(T). The importance of these accurate computations of isotropic hyperfine coupling constants to elucidate

  13. Opportunities and challenges for the integration of massively parallel genomic sequencing into clinical practice: lessons from the ClinSeq project.

    Science.gov (United States)

    Biesecker, Leslie G

    2012-04-01

    The debate surrounding the return of results from high-throughput genomic interrogation encompasses many important issues including ethics, law, economics, and social policy. As well, the debate is also informed by the molecular, genetic, and clinical foundations of the emerging field of clinical genomics, which is based on this new technology. This article outlines the main biomedical considerations of sequencing technologies and demonstrates some of the early clinical experiences with the technology to enable the debate to stay focused on real-world practicalities. These experiences are based on early data from the ClinSeq project, which is a project to pilot the use of massively parallel sequencing in a clinical research context with a major aim to develop modes of returning results to individual subjects. The study has enrolled >900 subjects and generated exome sequence data on 572 subjects. These data are beginning to be interpreted and returned to the subjects, which provides examples of the potential usefulness and pitfalls of clinical genomics. There are numerous genetic results that can be readily derived from a genome including rare, high-penetrance traits, and carrier states. However, much work needs to be done to develop the tools and resources for genomic interpretation. The main lesson learned is that a genome sequence may be better considered as a health-care resource, rather than a test, one that can be interpreted and used over the lifetime of the patient.

  14. Towards anatomic scale agent-based modeling with a massively parallel spatially explicit general-purpose model of enteric tissue (SEGMEnT_HPC).

    Science.gov (United States)

    Cockrell, Robert Chase; Christley, Scott; Chang, Eugene; An, Gary

    2015-01-01

    Perhaps the greatest challenge currently facing the biomedical research community is the ability to integrate highly detailed cellular and molecular mechanisms to represent clinical disease states as a pathway to engineer effective therapeutics. This is particularly evident in the representation of organ-level pathophysiology in terms of abnormal tissue structure, which, through histology, remains a mainstay in disease diagnosis and staging. As such, being able to generate anatomic scale simulations is a highly desirable goal. While computational limitations have previously constrained the size and scope of multi-scale computational models, advances in the capacity and availability of high-performance computing (HPC) resources have greatly expanded the ability of computational models of biological systems to achieve anatomic, clinically relevant scale. Diseases of the intestinal tract are exemplary examples of pathophysiological processes that manifest at multiple scales of spatial resolution, with structural abnormalities present at the microscopic, macroscopic and organ-levels. In this paper, we describe a novel, massively parallel computational model of the gut, the Spatially Explicitly General-purpose Model of Enteric Tissue_HPC (SEGMEnT_HPC), which extends an existing model of the gut epithelium, SEGMEnT, in order to create cell-for-cell anatomic scale simulations. We present an example implementation of SEGMEnT_HPC that simulates the pathogenesis of ileal pouchitis, and important clinical entity that affects patients following remedial surgery for ulcerative colitis.

  15. Identification of the first homozygous 1-bp deletion in GDF9 gene leading to primary ovarian insufficiency by using targeted massively parallel sequencing.

    Science.gov (United States)

    França, M M; Funari, M F A; Nishi, M Y; Narcizo, A M; Domenice, S; Costa, E M F; Lerario, A M; Mendonca, B B

    2018-02-01

    Targeted massively parallel sequencing (TMPS) has been used in genetic diagnosis for Mendelian disorders. In the past few years, the TMPS has identified new and already described genes associated with primary ovarian insufficiency (POI) phenotype. Here, we performed a targeted gene sequencing to find a genetic diagnosis in idiopathic cases of Brazilian POI cohort. A custom SureSelect XT DNA target enrichment panel was designed and the sequencing was performed on Illumina NextSeq sequencer. We identified 1 homozygous 1-bp deletion variant (c.783delC) in the GDF9 gene in 1 patient with POI. The variant was confirmed and segregated using Sanger sequencing. The c.783delC GDF9 variant changed an amino acid creating a premature termination codon (p.Ser262Hisfs*2). This variant was not present in all public databases (ExAC/gnomAD, NHLBI/EVS and 1000Genomes). Moreover, it was absent in 400 alleles from fertile Brazilian women screened by Sanger sequencing. The patient's mother and her unaffected sister carried the c.783delC variant in a heterozygous state, as expected for an autosomal recessive inheritance. Here, the TMPS identified the first homozygous 1-bp deletion variant in GDF9. This finding reveals a novel inheritance pattern of pathogenic variant in GDF9 associated with POI, thus improving the genetic diagnosis of this disorder. © 2017 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  16. Sampling the Denatured State of Polypeptides in Water, Urea, and Guanidine Chloride to Strict Equilibrium Conditions with the Help of Massively Parallel Computers.

    Science.gov (United States)

    Meloni, Roberto; Camilloni, Carlo; Tiana, Guido

    2014-02-11

    The denatured state of polypeptides and proteins, stabilized by chemical denaturants like urea and guanidine chloride, displays residual secondary structure when studied by nuclear-magnetic-resonance spectroscopy. However, these experimental techniques are weakly sensitive, and thus molecular-dynamics simulations can be useful to complement the experimental findings. To sample the denatured state, we made use of massively-parallel computers and of a variant of the replica exchange algorithm, in which the different branches, connected with unbiased replicas, favor the formation and disruption of local secondary structure. The algorithm is applied to the second hairpin of GB1 in water, in urea, and in guanidine chloride. We show with the help of different criteria that the simulations converge to equilibrium. It results that urea and guanidine chloride, besides inducing some polyproline-II structure, have different effect on the hairpin. Urea disrupts completely the native region and stabilizes a state which resembles a random coil, while guanidine chloride has a milder effect.

  17. Design and implementation of an integrated architecture for massive parallel data treatment of analogue signals supplied by silicon detectors of very high spatial resolution

    International Nuclear Information System (INIS)

    Michel, J.

    1993-02-01

    This doctorate thesis studies an integrated architecture designed to a parallel massive treatment of analogue signals supplied by silicon detectors of very high spatial resolution. The first chapter is an introduction presenting the general outline and the triggering conditions of the spectrometer. Chapter two describes the operational structure of a microvertex detector made of Si micro-plates associated to the measuring chains. Information preconditioning is related to the pre-amplification stage, to the pile-up effects and to the reduction in the time characteristic due to the high counting rates. The chapter three describes the architecture of the analogue delay buffer, makes an analysis of the intrinsic noise and presents the operational testings and input/output control operations. The fourth chapter is devoted to the description of the analogue pulse shape processor and gives also the testings and the corresponding measurements on the circuit. Finally, the chapter five deals with the simplest modeling of the entire conditioning chain. Also, the testings and measuring procedures are here discussed. In conclusion the author presents some prospects for improving the signal-to-noise ratio by summation of the de-convoluted micro-paths. 78 refs., 78 figs., 1 annexe

  18. Parallel inversion of a massive ERT data set to characterize deep vadose zone contamination beneath former nuclear waste infiltration galleries at the Hanford Site B-Complex (Invited)

    Science.gov (United States)

    Johnson, T.; Rucker, D. F.; Wellman, D.

    2013-12-01

    revealed the general footprint of vadose zone contamination beneath infiltration galleries. In 2011, the USDOE commissioned an effort to re-invert the B-Complex ERT data as a whole using a recently developed massively parallel 3D ERT inversion code. The computational mesh included approximately 1.085 million elements and closely honored the 37m of topographic relief as determined by LiDAR imaging. The water table and tank boundaries were also incorporated into the mesh to facilitate regularization disconnects, enabling sharp conductivity contrasts where they occur naturally without penalty. The data were inverted using 1024 processors, requiring 910 Gb of memory and 11.5 hours of computation time. The imaging results revealed previously unrealized detail concerning the distribution and behavior of contaminants migrating through the vadose zone, and are currently being used by site cleanup operators and regulators to understand the origin of a groundwater nitrate plume emerging from one of the infiltration galleries. The results overall demonstrate the utility of high performance computing, unstructured meshing, and custom regularization constraints for optimal processing of massive ERT data sets enabled by modern ERT survey hardware.

  19. POU4F3 mutation screening in Japanese hearing loss patients: Massively parallel DNA sequencing-based analysis identified novel variants associated with autosomal dominant hearing loss.

    Directory of Open Access Journals (Sweden)

    Tomohiro Kitano

    Full Text Available A variant in a transcription factor gene, POU4F3, is responsible for autosomal dominant nonsyndromic hereditary hearing loss, DFNA15. To date, 14 variants, including a whole deletion of POU4F3, have been reported to cause HL in various ethnic groups. In the present study, genetic screening for POU4F3 variants was carried out for a large series of Japanese hearing loss (HL patients to clarify the prevalence and clinical characteristics of DFNA15 in the Japanese population. Massively parallel DNA sequencing of 68 target candidate genes was utilized in 2,549 unrelated Japanese HL patients (probands to identify genomic variations responsible for HL. The detailed clinical features in patients with POU4F3 variants were collected from medical charts and analyzed. Novel 12 POU4F3 likely pathogenic variants (six missense variants, three frameshift variants, and three nonsense variants were successfully identified in 15 probands (2.5% among 602 families exhibiting autosomal dominant HL, whereas no variants were detected in the other 1,947 probands with autosomal recessive or inheritance pattern unknown HL. To obtain the audiovestibular configuration of the patients harboring POU4F3 variants, we collected audiograms and vestibular symptoms of the probands and their affected family members. Audiovestibular phenotypes in a total of 24 individuals from the 15 families possessing variants were characterized by progressive HL, with a large variation in the onset age and severity with or without vestibular symptoms observed. Pure-tone audiograms indicated the most prevalent configuration as mid-frequency HL type followed by high-frequency HL type, with asymmetry observed in approximately 20% of affected individuals. Analysis of the relationship between age and pure-tone average suggested that individuals with truncating variants showed earlier onset and slower progression of HL than did those with non-truncating variants. The present study showed that variants

  20. FDSTools: A software package for analysis of massively parallel sequencing data with the ability to recognise and correct STR stutter and other PCR or sequencing noise.

    Science.gov (United States)

    Hoogenboom, Jerry; van der Gaag, Kristiaan J; de Leeuw, Rick H; Sijen, Titia; de Knijff, Peter; Laros, Jeroen F J

    2017-03-01

    Massively parallel sequencing (MPS) is on the advent of a broad scale application in forensic research and casework. The improved capabilities to analyse evidentiary traces representing unbalanced mixtures is often mentioned as one of the major advantages of this technique. However, most of the available software packages that analyse forensic short tandem repeat (STR) sequencing data are not well suited for high throughput analysis of such mixed traces. The largest challenge is the presence of stutter artefacts in STR amplifications, which are not readily discerned from minor contributions. FDSTools is an open-source software solution developed for this purpose. The level of stutter formation is influenced by various aspects of the sequence, such as the length of the longest uninterrupted stretch occurring in an STR. When MPS is used, STRs are evaluated as sequence variants that each have particular stutter characteristics which can be precisely determined. FDSTools uses a database of reference samples to determine stutter and other systemic PCR or sequencing artefacts for each individual allele. In addition, stutter models are created for each repeating element in order to predict stutter artefacts for alleles that are not included in the reference set. This information is subsequently used to recognise and compensate for the noise in a sequence profile. The result is a better representation of the true composition of a sample. Using Promega Powerseq™ Auto System data from 450 reference samples and 31 two-person mixtures, we show that the FDSTools correction module decreases stutter ratios above 20% to below 3%. Consequently, much lower levels of contributions in the mixed traces are detected. FDSTools contains modules to visualise the data in an interactive format allowing users to filter data with their own preferred thresholds. Copyright © 2016 The Authors. Published by Elsevier B.V. All rights reserved.

  1. Significant Association between Sulfate-Reducing Bacteria and Uranium-Reducing Microbial Communities as Revealed by a Combined Massively Parallel Sequencing-Indicator Species Approach▿ †

    Science.gov (United States)

    Cardenas, Erick; Wu, Wei-Min; Leigh, Mary Beth; Carley, Jack; Carroll, Sue; Gentry, Terry; Luo, Jian; Watson, David; Gu, Baohua; Ginder-Vogel, Matthew; Kitanidis, Peter K.; Jardine, Philip M.; Zhou, Jizhong; Criddle, Craig S.; Marsh, Terence L.; Tiedje, James M.

    2010-01-01

    Massively parallel sequencing has provided a more affordable and high-throughput method to study microbial communities, although it has mostly been used in an exploratory fashion. We combined pyrosequencing with a strict indicator species statistical analysis to test if bacteria specifically responded to ethanol injection that successfully promoted dissimilatory uranium(VI) reduction in the subsurface of a uranium contamination plume at the Oak Ridge Field Research Center in Tennessee. Remediation was achieved with a hydraulic flow control consisting of an inner loop, where ethanol was injected, and an outer loop for flow-field protection. This strategy reduced uranium concentrations in groundwater to levels below 0.126 μM and created geochemical gradients in electron donors from the inner-loop injection well toward the outer loop and downgradient flow path. Our analysis with 15 sediment samples from the entire test area found significant indicator species that showed a high degree of adaptation to the three different hydrochemical-created conditions. Castellaniella and Rhodanobacter characterized areas with low pH, heavy metals, and low bioactivity, while sulfate-, Fe(III)-, and U(VI)-reducing bacteria (Desulfovibrio, Anaeromyxobacter, and Desulfosporosinus) were indicators of areas where U(VI) reduction occurred. The abundance of these bacteria, as well as the Fe(III) and U(VI) reducer Geobacter, correlated with the hydraulic connectivity to the substrate injection site, suggesting that the selected populations were a direct response to electron donor addition by the groundwater flow path. A false-discovery-rate approach was implemented to discard false-positive results by chance, given the large amount of data compared. PMID:20729318

  2. Significant association between sulfate-reducing bacteria and uranium-reducing microbial communities as revealed by a combined massively parallel sequencing-indicator species approach.

    Science.gov (United States)

    Cardenas, Erick; Wu, Wei-Min; Leigh, Mary Beth; Carley, Jack; Carroll, Sue; Gentry, Terry; Luo, Jian; Watson, David; Gu, Baohua; Ginder-Vogel, Matthew; Kitanidis, Peter K; Jardine, Philip M; Zhou, Jizhong; Criddle, Craig S; Marsh, Terence L; Tiedje, James M

    2010-10-01

    Massively parallel sequencing has provided a more affordable and high-throughput method to study microbial communities, although it has mostly been used in an exploratory fashion. We combined pyrosequencing with a strict indicator species statistical analysis to test if bacteria specifically responded to ethanol injection that successfully promoted dissimilatory uranium(VI) reduction in the subsurface of a uranium contamination plume at the Oak Ridge Field Research Center in Tennessee. Remediation was achieved with a hydraulic flow control consisting of an inner loop, where ethanol was injected, and an outer loop for flow-field protection. This strategy reduced uranium concentrations in groundwater to levels below 0.126 μM and created geochemical gradients in electron donors from the inner-loop injection well toward the outer loop and downgradient flow path. Our analysis with 15 sediment samples from the entire test area found significant indicator species that showed a high degree of adaptation to the three different hydrochemical-created conditions. Castellaniella and Rhodanobacter characterized areas with low pH, heavy metals, and low bioactivity, while sulfate-, Fe(III)-, and U(VI)-reducing bacteria (Desulfovibrio, Anaeromyxobacter, and Desulfosporosinus) were indicators of areas where U(VI) reduction occurred. The abundance of these bacteria, as well as the Fe(III) and U(VI) reducer Geobacter, correlated with the hydraulic connectivity to the substrate injection site, suggesting that the selected populations were a direct response to electron donor addition by the groundwater flow path. A false-discovery-rate approach was implemented to discard false-positive results by chance, given the large amount of data compared.

  3. Quaternary Morphodynamics of Fluvial Dispersal Systems Revealed: The Fly River, PNG, and the Sunda Shelf, SE Asia, simulated with the Massively Parallel GPU-based Model 'GULLEM'

    Science.gov (United States)

    Aalto, R. E.; Lauer, J. W.; Darby, S. E.; Best, J.; Dietrich, W. E.

    2015-12-01

    During glacial-marine transgressions vast volumes of sediment are deposited due to the infilling of lowland fluvial systems and shallow shelves, material that is removed during ensuing regressions. Modelling these processes would illuminate system morphodynamics, fluxes, and 'complexity' in response to base level change, yet such problems are computationally formidable. Environmental systems are characterized by strong interconnectivity, yet traditional supercomputers have slow inter-node communication -- whereas rapidly advancing Graphics Processing Unit (GPU) technology offers vastly higher (>100x) bandwidths. GULLEM (GpU-accelerated Lowland Landscape Evolution Model) employs massively parallel code to simulate coupled fluvial-landscape evolution for complex lowland river systems over large temporal and spatial scales. GULLEM models the accommodation space carved/infilled by representing a range of geomorphic processes, including: river & tributary incision within a multi-directional flow regime, non-linear diffusion, glacial-isostatic flexure, hydraulic geometry, tectonic deformation, sediment production, transport & deposition, and full 3D tracking of all resulting stratigraphy. Model results concur with the Holocene dynamics of the Fly River, PNG -- as documented with dated cores, sonar imaging of floodbasin stratigraphy, and the observations of topographic remnants from LGM conditions. Other supporting research was conducted along the Mekong River, the largest fluvial system of the Sunda Shelf. These and other field data provide tantalizing empirical glimpses into the lowland landscapes of large rivers during glacial-interglacial transitions, observations that can be explored with this powerful numerical model. GULLEM affords estimates for the timing and flux budgets within the Fly and Sunda Systems, illustrating complex internal system responses to the external forcing of sea level and climate. Furthermore, GULLEM can be applied to most ANY fluvial system to

  4. Genome-wide massively parallel sequencing of formaldehyde fixed-paraffin embedded (FFPE tumor tissues for copy-number- and mutation-analysis.

    Directory of Open Access Journals (Sweden)

    Michal R Schweiger

    Full Text Available BACKGROUND: Cancer re-sequencing programs rely on DNA isolated from fresh snap frozen tissues, the preparation of which is combined with additional preservation efforts. Tissue samples at pathology departments are routinely stored as formalin-fixed and paraffin-embedded (FFPE samples and their use would open up access to a variety of clinical trials. However, FFPE preparation is incompatible with many down-stream molecular biology techniques such as PCR based amplification methods and gene expression studies. METHODOLOGY/PRINCIPAL FINDINGS: Here we investigated the sample quality requirements of FFPE tissues for massively parallel short-read sequencing approaches. We evaluated key variables of pre-fixation, fixation related and post-fixation processes that occur in routine medical service (e.g. degree of autolysis, duration of fixation and of storage. We also investigated the influence of tissue storage time on sequencing quality by using material that was up to 18 years old. Finally, we analyzed normal and tumor breast tissues using the Sequencing by Synthesis technique (Illumina Genome Analyzer, Solexa to simultaneously localize genome-wide copy number alterations and to detect genomic variations such as substitutions and point-deletions and/or insertions in FFPE tissue samples. CONCLUSIONS/SIGNIFICANCE: The application of second generation sequencing techniques on small amounts of FFPE material opens up the possibility to analyze tissue samples which have been collected during routine clinical work as well as in the context of clinical trials. This is in particular important since FFPE samples are amply available from surgical tumor resections and histopathological diagnosis, and comprise tissue from precursor lesions, primary tumors, lymphogenic and/or hematogenic metastases. Large-scale studies using this tissue material will result in a better prediction of the prognosis of cancer patients and the early identification of patients which

  5. Touch imprint cytology with massively parallel sequencing (TIC-seq): a simple and rapid method to snapshot genetic alterations in tumors.

    Science.gov (United States)

    Amemiya, Kenji; Hirotsu, Yosuke; Goto, Taichiro; Nakagomi, Hiroshi; Mochizuki, Hitoshi; Oyama, Toshio; Omata, Masao

    2016-12-01

    Identifying genetic alterations in tumors is critical for molecular targeting of therapy. In the clinical setting, formalin-fixed paraffin-embedded (FFPE) tissue is usually employed for genetic analysis. However, DNA extracted from FFPE tissue is often not suitable for analysis because of its low levels and poor quality. Additionally, FFPE sample preparation is time-consuming. To provide early treatment for cancer patients, a more rapid and robust method is required for precision medicine. We present a simple method for genetic analysis, called touch imprint cytology combined with massively paralleled sequencing (touch imprint cytology [TIC]-seq), to detect somatic mutations in tumors. We prepared FFPE tissues and TIC specimens from tumors in nine lung cancer patients and one patient with breast cancer. We found that the quality and quantity of TIC DNA was higher than that of FFPE DNA, which requires microdissection to enrich DNA from target tissues. Targeted sequencing using a next-generation sequencer obtained sufficient sequence data using TIC DNA. Most (92%) somatic mutations in lung primary tumors were found to be consistent between TIC and FFPE DNA. We also applied TIC DNA to primary and metastatic tumor tissues to analyze tumor heterogeneity in a breast cancer patient, and showed that common and distinct mutations among primary and metastatic sites could be classified into two distinct histological subtypes. TIC-seq is an alternative and feasible method to analyze genomic alterations in tumors by simply touching the cut surface of specimens to slides. © 2016 The Authors. Cancer Medicine published by John Wiley & Sons Ltd.

  6. Nucleotide Metabolism

    DEFF Research Database (Denmark)

    Martinussen, Jan; Willemoës, M.; Kilstrup, Mogens

    2011-01-01

    Metabolic pathways are connected through their utilization of nucleotides as supplier of energy, allosteric effectors, and their role in activation of intermediates. Therefore, any attempt to exploit a given living organism in a biotechnological process will have an impact on nucleotide metabolis...

  7. SIERRA Mechanics, an emerging massively parallel HPC capability, for use in coupled THMC analyses of HLW repositories in clay/shale

    International Nuclear Information System (INIS)

    Bean, J.E.; Sanchez, M.; Arguello, J.G.

    2012-01-01

    Document available in extended abstract form only. Because, until recently, U.S. efforts had been focused on the volcanic tuff site at Yucca Mountain, radioactive waste disposal in U.S. clay/shale formations has not been considered for many years. However, advances in multi-physics computational modeling and research into clay mineralogy continue to improve the scientific basis for assessing nuclear waste repository performance in such formations. Disposal of high-level radioactive waste (HLW) in suitable clay/shale formations is attractive because the material is essentially impermeable and self-sealing, conditions are chemically reducing, and sorption tends to prevent radionuclide transport. Vertically and laterally extensive shale and clay formations exist in multiple locations in the contiguous 48 states. This paper describes an emerging massively parallel (MP) high performance computing (HPC) capability - SIERRA Mechanics - that is applicable to the simulation of coupled-physics processes occurring within a potential clay/shale repository for disposal of HLW within the U.S. The SIERRA Mechanics code development project has been underway at Sandia National Laboratories for approximately the past decade under the auspices of the U.S. Department of Energy's Advanced Scientific Computing (ASC) program. SIERRA Mechanics was designed and developed from its inception to run on the latest and most sophisticated massively parallel computing hardware, with the capability to span the hardware range from single workstations to systems with thousands of processors. The foundation of SIERRA Mechanics is the SIERRA tool-kit, which provides finite element application-code services such as: (1) mesh and field data management, both parallel and distributed; (2) transfer operators for mapping field variables from one mechanics application to another; (3) a solution controller for code coupling; and (4) included third party libraries (e.g., solver libraries, communications

  8. High prevalence of HIV-1 transmitted drug-resistance mutations from proviral DNA massively parallel sequencing data of therapy-naïve chronically infected Brazilian blood donors.

    Directory of Open Access Journals (Sweden)

    Rodrigo Pessôa

    Full Text Available An improved understanding of the prevalence of low-abundance transmitted drug-resistance mutations (TDRM in therapy-naïve HIV-1-infected patients may help determine which patients are the best candidates for therapy. In this study, we aimed to obtain a comprehensive picture of the evolving HIV-1 TDRM across the massive parallel sequences (MPS of the viral entire proviral genome in a well-characterized Brazilian blood donor naïve to antiretroviral drugs.The MPS data from 128 samples used in the analysis were sourced from Brazilian blood donors and were previously classified by less-sensitive (LS or "detuned" enzyme immunoassay as non-recent or longstanding HIV-1 infections. The Stanford HIV Resistance Database (HIVDBv 6.2 and IAS-USA mutation lists were used to interpret the pattern of drug resistance. The minority variants with TDRM were identified using a threshold of ≥ 1.0% and ≤ 20% of the reads sequenced. The rate of TDRM in the MPS data of the proviral genome were compared with the corresponding published consensus sequences of their plasma viruses.No TDRM were detected in the integrase or envelope regions. The overall prevalence of TDRM in the protease (PR and reverse transcriptase (RT regions of the HIV-1 pol gene was 44.5% (57/128, including any mutations to the nucleoside analogue reverse transcriptase inhibitors (NRTI and non-nucleoside analogue reverse transcriptase inhibitors (NNRTI. Of the 57 subjects, 43 (75.4% harbored a minority variant containing at least one clinically relevant TDRM. Among the 43 subjects, 33 (76.7% had detectable minority resistant variants to NRTIs, 6 (13.9% to NNRTIs, and 16 (37.2% to PR inhibitors. The comparison of viral sequences in both sources, plasma and cells, would have detected 48 DNA provirus disclosed TDRM by MPS previously missed by plasma bulk analysis.Our findings revealed a high prevalence of TDRM found in this group, as the use of MPS drastically increased the detection of these

  9. Highly parallel and short-acting amplification with locus-specific primers to detect single nucleotide polymorphisms by the DigiTag2 assay.

    Directory of Open Access Journals (Sweden)

    Nao Nishida

    Full Text Available The DigiTag2 assay enables analysis of a set of 96 SNPs using Kapa 2GFast HotStart DNA polymerase with a new protocol that has a total running time of about 7 hours, which is 6 hours shorter than the previous protocol. Quality parameters (conversion rate, call rate, reproducibility and concordance were at the same levels as when genotype calls were acquired using the previous protocol. Multiplex PCR with 192 pairs of locus-specific primers was available for target preparation in the DigiTag2 assay without the optimization of reaction conditions, and quality parameters had the same levels as those acquired with 96-plex PCR. The locus-specific primers were able to achieve sufficient (concentration of target amplicon ≥5 nM and specific (concentration of unexpected amplicons <2 nM amplification within 2 hours, were also able to achieve detectable amplifications even when working in a 96-plex or 192-plex form. The improved DigiTag2 assay will be an efficient platform for screening an intermediate number of SNPs (tens to hundreds of sites in the replication analysis after genome-wide association study. Moreover, highly parallel and short-acting amplification with locus-specific primers may thus facilitate widespread application to other PCR-based assays.

  10. Start-up flow in a three-dimensional lid-driven cavity by means of a massively parallel direction splitting algorithm

    KAUST Repository

    Guermond, J. L.; Minev, P. D.

    2011-01-01

    The purpose of this paper is to validate a new highly parallelizable direction splitting algorithm. The parallelization capabilities of this algorithm are illustrated by providing a highly accurate solution for the start-up flow in a three

  11. Start-up flow in a three-dimensional lid-driven cavity by means of a massively parallel direction splitting algorithm

    KAUST Repository

    Guermond, J. L.

    2011-05-04

    The purpose of this paper is to validate a new highly parallelizable direction splitting algorithm. The parallelization capabilities of this algorithm are illustrated by providing a highly accurate solution for the start-up flow in a three-dimensional impulsively started lid-driven cavity of aspect ratio 1×1×2 at Reynolds numbers 1000 and 5000. The computations are done in parallel (up to 1024 processors) on adapted grids of up to 2 billion nodes in three space dimensions. Velocity profiles are given at dimensionless times t=4, 8, and 12; at least four digits are expected to be correct at Re=1000. © 2011 John Wiley & Sons, Ltd.

  12. Development and application of a 6.5 million feature affymetrix genechip® for massively parallel discovery of single position polymorphisms in lettuce (Lactuca spp.)

    OpenAIRE

    Stoffel, Kevin; van Leeuwen, Hans; Kozik, Alexander; Caldwell, David; Ashrafi, Hamid; Cui, Xinping; Tan, Xiaoping; Hill, Theresa; Reyes-Chin-Wo, Sebastian; Truco, Maria-Jose; Michelmore, Richard W; Van Deynze, Allen

    2012-01-01

    Abstract Background High-resolution genetic maps are needed in many crops to help characterize the genetic diversity that determines agriculturally important traits. Hybridization to microarrays to detect single feature polymorphisms is a powerful technique for marker discovery and genotyping because of its highly parallel nature. However, microarrays designed for gene expression analysis rarely provide sufficient gene coverage for optimal detection o...

  13. A hybrid, massively parallel implementation of a genetic algorithm for optimization of the impact performance of a metal/polymer composite plate

    KAUST Repository

    Narayanan, Kiran

    2012-07-17

    A hybrid parallelization method composed of a coarse-grained genetic algorithm (GA) and fine-grained objective function evaluations is implemented on a heterogeneous computational resource consisting of 16 IBM Blue Gene/P racks, a single x86 cluster node and a high-performance file system. The GA iterator is coupled with a finite-element (FE) analysis code developed in house to facilitate computational steering in order to calculate the optimal impact velocities of a projectile colliding with a polyurea/structural steel composite plate. The FE code is capable of capturing adiabatic shear bands and strain localization, which are typically observed in high-velocity impact applications, and it includes several constitutive models of plasticity, viscoelasticity and viscoplasticity for metals and soft materials, which allow simulation of ductile fracture by void growth. A strong scaling study of the FE code was conducted to determine the optimum number of processes run in parallel. The relative efficiency of the hybrid, multi-level parallelization method is studied in order to determine the parameters for the parallelization. Optimal impact velocities of the projectile calculated using the proposed approach, are reported. © The Author(s) 2012.

  14. A hybrid, massively parallel implementation of a genetic algorithm for optimization of the impact performance of a metal/polymer composite plate

    KAUST Repository

    Narayanan, Kiran; Mora Cordova, Angel; Allsopp, Nicholas; El Sayed, Tamer S.

    2012-01-01

    A hybrid parallelization method composed of a coarse-grained genetic algorithm (GA) and fine-grained objective function evaluations is implemented on a heterogeneous computational resource consisting of 16 IBM Blue Gene/P racks, a single x86 cluster

  15. Parallel computing works!

    CERN Document Server

    Fox, Geoffrey C; Messina, Guiseppe C

    2014-01-01

    A clear illustration of how parallel computers can be successfully appliedto large-scale scientific computations. This book demonstrates how avariety of applications in physics, biology, mathematics and other scienceswere implemented on real parallel computers to produce new scientificresults. It investigates issues of fine-grained parallelism relevant forfuture supercomputers with particular emphasis on hypercube architecture. The authors describe how they used an experimental approach to configuredifferent massively parallel machines, design and implement basic systemsoftware, and develop

  16. Molecular characterization of human T-cell lymphotropic virus type 1 full and partial genomes by Illumina massively parallel sequencing technology.

    Directory of Open Access Journals (Sweden)

    Rodrigo Pessôa

    Full Text Available BACKGROUND: Here, we report on the partial and full-length genomic (FLG variability of HTLV-1 sequences from 90 well-characterized subjects, including 48 HTLV-1 asymptomatic carriers (ACs, 35 HTLV-1-associated myelopathy/tropical spastic paraparesis (HAM/TSP and 7 adult T-cell leukemia/lymphoma (ATLL patients, using an Illumina paired-end protocol. METHODS: Blood samples were collected from 90 individuals, and DNA was extracted from the PBMCs to measure the proviral load and to amplify the HTLV-1 FLG from two overlapping fragments. The amplified PCR products were subjected to deep sequencing. The sequencing data were assembled, aligned, and mapped against the HTLV-1 genome with sufficient genetic resemblance and utilized for further phylogenetic analysis. RESULTS: A high-throughput sequencing-by-synthesis instrument was used to obtain an average of 3210- and 5200-fold coverage of the partial (n = 14 and FLG (n = 76 data from the HTLV-1 strains, respectively. The results based on the phylogenetic trees of consensus sequences from partial and FLGs revealed that 86 (95.5% individuals were infected with the transcontinental sub-subtypes of the cosmopolitan subtype (aA and that 4 individuals (4.5% were infected with the Japanese sub-subtypes (aB. A comparison of the nucleotide and amino acids of the FLG between the three clinical settings yielded no correlation between the sequenced genotype and clinical outcomes. The evolutionary relationships among the HTLV sequences were inferred from nucleotide sequence, and the results are consistent with the hypothesis that there were multiple introductions of the transcontinental subtype in Brazil. CONCLUSIONS: This study has increased the number of subtype aA full-length genomes from 8 to 81 and HTLV-1 aB from 2 to 5 sequences. The overall data confirmed that the cosmopolitan transcontinental sub-subtypes were the most prevalent in the Brazilian population. It is hoped that this valuable genomic data

  17. Molecular characterization of human T-cell lymphotropic virus type 1 full and partial genomes by Illumina massively parallel sequencing technology.

    Science.gov (United States)

    Pessôa, Rodrigo; Watanabe, Jaqueline Tomoko; Nukui, Youko; Pereira, Juliana; Casseb, Jorge; Kasseb, Jorge; de Oliveira, Augusto César Penalva; Segurado, Aluisio Cotrim; Sanabani, Sabri Saeed

    2014-01-01

    Here, we report on the partial and full-length genomic (FLG) variability of HTLV-1 sequences from 90 well-characterized subjects, including 48 HTLV-1 asymptomatic carriers (ACs), 35 HTLV-1-associated myelopathy/tropical spastic paraparesis (HAM/TSP) and 7 adult T-cell leukemia/lymphoma (ATLL) patients, using an Illumina paired-end protocol. Blood samples were collected from 90 individuals, and DNA was extracted from the PBMCs to measure the proviral load and to amplify the HTLV-1 FLG from two overlapping fragments. The amplified PCR products were subjected to deep sequencing. The sequencing data were assembled, aligned, and mapped against the HTLV-1 genome with sufficient genetic resemblance and utilized for further phylogenetic analysis. A high-throughput sequencing-by-synthesis instrument was used to obtain an average of 3210- and 5200-fold coverage of the partial (n = 14) and FLG (n = 76) data from the HTLV-1 strains, respectively. The results based on the phylogenetic trees of consensus sequences from partial and FLGs revealed that 86 (95.5%) individuals were infected with the transcontinental sub-subtypes of the cosmopolitan subtype (aA) and that 4 individuals (4.5%) were infected with the Japanese sub-subtypes (aB). A comparison of the nucleotide and amino acids of the FLG between the three clinical settings yielded no correlation between the sequenced genotype and clinical outcomes. The evolutionary relationships among the HTLV sequences were inferred from nucleotide sequence, and the results are consistent with the hypothesis that there were multiple introductions of the transcontinental subtype in Brazil. This study has increased the number of subtype aA full-length genomes from 8 to 81 and HTLV-1 aB from 2 to 5 sequences. The overall data confirmed that the cosmopolitan transcontinental sub-subtypes were the most prevalent in the Brazilian population. It is hoped that this valuable genomic data will add to our current understanding of the

  18. Exome Capture and Massively Parallel Sequencing Identifies a Novel HPSE2 Mutation in a Saudi Arabian Child with Ochoa (Urofacial) Syndrome

    Science.gov (United States)

    Al Badr, Wisam; Al Bader, Suha; Otto, Edgar; Hildebrandt, Friedhelm; Ackley, Todd; Peng, Weiping; Xu, Jishu; Li, Jun; Owens, Kailey M.; Bloom, David; Innis, Jeffrey W.

    2011-01-01

    We describe a child of Middle Eastern descent by first-cousin mating with idiopathic neurogenic bladder and high grade vesicoureteral reflux at 1 year of age, whose characteristic facial grimace led to the diagnosis of Ochoa (Urofacial) syndrome at age 5 years. We used homozygosity mapping, exome capture and paired end sequencing to identify the disease causing mutation in the proband. We reviewed the literature with respect to the urologic manifestations of Ochoa syndrome. A large region of marker homozygosity was observed at 10q24, consistent with known autosomal recessive inheritance, family consanguinity and previous genetic mapping in other families with Ochoa syndrome. A homozygous mutation was identified in the proband in HPSE2: c.1374_1378delTGTGC, a deletion of 5 nucleotides in exon 10 that is predicted to lead to a frameshift followed by replacement of 132 C-terminal amino acids with 153 novel amino acids (p.Ala458Alafsdel132ins153). This mutation is novel relative to very recently published mutations in HPSE2 in other families. Early intervention and recognition of Ochoa syndrome with control of risk factors and close surveillance will decrease complications and renal failure. PMID:21450525

  19. Massive Gravity

    OpenAIRE

    de Rham, Claudia

    2014-01-01

    We review recent progress in massive gravity. We start by showing how different theories of massive gravity emerge from a higher-dimensional theory of general relativity, leading to the Dvali–Gabadadze–Porrati model (DGP), cascading gravity, and ghost-free massive gravity. We then explore their theoretical and phenomenological consistency, proving the absence of Boulware–Deser ghosts and reviewing the Vainshtein mechanism and the cosmological solutions in these models. Finally, we present alt...

  20. Practical parallel computing

    CERN Document Server

    Morse, H Stephen

    1994-01-01

    Practical Parallel Computing provides information pertinent to the fundamental aspects of high-performance parallel processing. This book discusses the development of parallel applications on a variety of equipment.Organized into three parts encompassing 12 chapters, this book begins with an overview of the technology trends that converge to favor massively parallel hardware over traditional mainframes and vector machines. This text then gives a tutorial introduction to parallel hardware architectures. Other chapters provide worked-out examples of programs using several parallel languages. Thi

  1. A hybrid genetic linkage map of two ecologically and morphologically divergent Midas cichlid fishes (Amphilophus spp.) obtained by massively parallel DNA sequencing (ddRADSeq).

    Science.gov (United States)

    Recknagel, Hans; Elmer, Kathryn R; Meyer, Axel

    2013-01-01

    Cichlid fishes are an excellent model system for studying speciation and the formation of adaptive radiations because of their tremendous species richness and astonishing phenotypic diversity. Most research has focused on African rift lake fishes, although Neotropical cichlid species display much variability as well. Almost one dozen species of the Midas cichlid species complex (Amphilophus spp.) have been described so far and have formed repeated adaptive radiations in several Nicaraguan crater lakes. Here we apply double-digest restriction-site associated DNA sequencing to obtain a high-density linkage map of an interspecific cross between the benthic Amphilophus astorquii and the limnetic Amphilophus zaliosus, which are sympatric species endemic to Crater Lake Apoyo, Nicaragua. A total of 755 RAD markers were genotyped in 343 F(2) hybrids. The map resolved 25 linkage groups and spans a total distance of 1427 cM with an average marker spacing distance of 1.95 cM, almost matching the total number of chromosomes (n = 24) in these species. Regions of segregation distortion were identified in five linkage groups. Based on the pedigree of parents to F(2) offspring, we calculated a genome-wide mutation rate of 6.6 × 10(-8) mutations per nucleotide per generation. This genetic map will facilitate the mapping of ecomorphologically relevant adaptive traits in the repeated phenotypes that evolved within the Midas cichlid lineage and, as the first linkage map of a Neotropical cichlid, facilitate comparative genomic analyses between African cichlids, Neotropical cichlids and other teleost fishes.

  2. Massive branes

    International Nuclear Information System (INIS)

    Bergshoeff, E.; Ortin, T.

    1998-01-01

    We investigate the effective world-volume theories of branes in a background given by (the bosonic sector of) 10-dimensional massive IIA supergravity (''''massive branes'''') and their M-theoretic origin. In the case of the solitonic 5-brane of type IIA superstring theory the construction of the Wess-Zumino term in the world-volume action requires a dualization of the massive Neveu-Schwarz/Neveu-Schwarz target space 2-form field. We find that, in general, the effective world-volume theory of massive branes contains new world-volume fields that are absent in the massless case, i.e. when the mass parameter m of massive IIA supergravity is set to zero. We show how these new world-volume fields can be introduced in a systematic way. (orig.)

  3. User's guide of TOUGH2-EGS-MP: A Massively Parallel Simulator with Coupled Geomechanics for Fluid and Heat Flow in Enhanced Geothermal Systems VERSION 1.0

    Energy Technology Data Exchange (ETDEWEB)

    Xiong, Yi [Colorado School of Mines, Golden, CO (United States); Fakcharoenphol, Perapon [Colorado School of Mines, Golden, CO (United States); Wang, Shihao [Colorado School of Mines, Golden, CO (United States); Winterfeld, Philip H. [Colorado School of Mines, Golden, CO (United States); Zhang, Keni [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Wu, Yu-Shu [Colorado School of Mines, Golden, CO (United States)

    2013-12-01

    TOUGH2-EGS-MP is a parallel numerical simulation program coupling geomechanics with fluid and heat flow in fractured and porous media, and is applicable for simulation of enhanced geothermal systems (EGS). TOUGH2-EGS-MP is based on the TOUGH2-MP code, the massively parallel version of TOUGH2. In TOUGH2-EGS-MP, the fully-coupled flow-geomechanics model is developed from linear elastic theory for thermo-poro-elastic systems and is formulated in terms of mean normal stress as well as pore pressure and temperature. Reservoir rock properties such as porosity and permeability depend on rock deformation, and the relationships between these two, obtained from poro-elasticity theories and empirical correlations, are incorporated into the simulation. This report provides the user with detailed information on the TOUGH2-EGS-MP mathematical model and instructions for using it for Thermal-Hydrological-Mechanical (THM) simulations. The mathematical model includes the fluid and heat flow equations, geomechanical equation, and discretization of those equations. In addition, the parallel aspects of the code, such as domain partitioning and communication between processors, are also included. Although TOUGH2-EGS-MP has the capability for simulating fluid and heat flows coupled with geomechanical effects, it is up to the user to select the specific coupling process, such as THM or only TH, in a simulation. There are several example problems illustrating applications of this program. These example problems are described in detail and their input data are presented. Their results demonstrate that this program can be used for field-scale geothermal reservoir simulation in porous and fractured media with fluid and heat flow coupled with geomechanical effects.

  4. Reconstruction of the 1997/1998 El Nino from TOPEX/POSEIDON and TOGA/TAO Data Using a Massively Parallel Pacific-Ocean Model and Ensemble Kalman Filter

    Science.gov (United States)

    Keppenne, C. L.; Rienecker, M.; Borovikov, A. Y.

    1999-01-01

    Two massively parallel data assimilation systems in which the model forecast-error covariances are estimated from the distribution of an ensemble of model integrations are applied to the assimilation of 97-98 TOPEX/POSEIDON altimetry and TOGA/TAO temperature data into a Pacific basin version the NASA Seasonal to Interannual Prediction Project (NSIPP)ls quasi-isopycnal ocean general circulation model. in the first system, ensemble of model runs forced by an ensemble of atmospheric model simulations is used to calculate asymptotic error statistics. The data assimilation then occurs in the reduced phase space spanned by the corresponding leading empirical orthogonal functions. The second system is an ensemble Kalman filter in which new error statistics are computed during each assimilation cycle from the time-dependent ensemble distribution. The data assimilation experiments are conducted on NSIPP's 512-processor CRAY T3E. The two data assimilation systems are validated by withholding part of the data and quantifying the extent to which the withheld information can be inferred from the assimilation of the remaining data. The pros and cons of each system are discussed.

  5. Parallel computing works

    Energy Technology Data Exchange (ETDEWEB)

    1991-10-23

    An account of the Caltech Concurrent Computation Program (C{sup 3}P), a five year project that focused on answering the question: Can parallel computers be used to do large-scale scientific computations '' As the title indicates, the question is answered in the affirmative, by implementing numerous scientific applications on real parallel computers and doing computations that produced new scientific results. In the process of doing so, C{sup 3}P helped design and build several new computers, designed and implemented basic system software, developed algorithms for frequently used mathematical computations on massively parallel machines, devised performance models and measured the performance of many computers, and created a high performance computing facility based exclusively on parallel computers. While the initial focus of C{sup 3}P was the hypercube architecture developed by C. Seitz, many of the methods developed and lessons learned have been applied successfully on other massively parallel architectures.

  6. NUCLEOTIDES IN INFANT FEEDING

    Directory of Open Access Journals (Sweden)

    L.G. Mamonova

    2007-01-01

    Full Text Available The article reviews the application of nucleotides-metabolites, playing a key role in many biological processes, for the infant feeding. The researcher provides the date on the nucleotides in the women's milk according to the lactation stages. She also analyzes the foreign experience in feeding newborns with nucleotides-containing milk formulas. The article gives a comparison of nucleotides in the adapted formulas represented in the domestic market of the given products.Key words: children, feeding, nucleotides.

  7. Massively parallel self-consistent-field calculations

    International Nuclear Information System (INIS)

    Tilson, J.L.

    1994-01-01

    The advent of supercomputers with many computational nodes each with its own independent memory makes possible extremely fast computations. The author's work, as part of the US High Performance Computing and Communications Program (HPCCP), is focused on the development of electronic structure techniques for the solution of Grand Challenge-size molecules containing hundreds of atoms. Their efforts have resulted in a fully scalable Direct-SCF program that is portable and efficient. This code, named NWCHEM, is built around a distributed-data model. This distributed data is managed by a software package called Global Arrays developed within the HPCCP. They present performance results for Direct-SCF calculations of interest to the consortium

  8. Massively Parallel Dimension Independent Adaptive Metropolis

    KAUST Repository

    Chen, Yuxin

    2015-01-01

    parameter dimension, by respecting the variance, for Gaussian targets. The result- ing algorithm, referred to as the dimension-independent adaptive Metropolis (DIAM) algorithm, also shows improved performance with respect to adaptive Metropolis on non

  9. COSMOS: Python library for massively parallel workflows.

    Science.gov (United States)

    Gafni, Erik; Luquette, Lovelace J; Lancaster, Alex K; Hawkins, Jared B; Jung, Jae-Yoon; Souilmi, Yassine; Wall, Dennis P; Tonellato, Peter J

    2014-10-15

    Efficient workflows to shepherd clinically generated genomic data through the multiple stages of a next-generation sequencing pipeline are of critical importance in translational biomedical science. Here we present COSMOS, a Python library for workflow management that allows formal description of pipelines and partitioning of jobs. In addition, it includes a user interface for tracking the progress of jobs, abstraction of the queuing system and fine-grained control over the workflow. Workflows can be created on traditional computing clusters as well as cloud-based services. Source code is available for academic non-commercial research purposes. Links to code and documentation are provided at http://lpm.hms.harvard.edu and http://wall-lab.stanford.edu. dpwall@stanford.edu or peter_tonellato@hms.harvard.edu. Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press.

  10. Associative Networks on a Massively Parallel Computer.

    Science.gov (United States)

    1985-10-01

    lgbt (as a group of numbers, in this case), but this only leads to sensible queries when a statistical function is applied: "What is the largest salary...34.*"* . •.,. 64 the siW~pe operations being used during ascend, each movement step costs the same as executing an operation

  11. Event analysis using a massively parallel processor

    International Nuclear Information System (INIS)

    Bale, A.; Gerelle, E.; Messersmith, J.; Warren, R.; Hoek, J.

    1990-01-01

    This paper describes a system for performing histogramming of n-tuple data at interactive rates using a commercial SIMD processor array connected to a work-station running the well-known Physics Analysis Workstation software (PAW). Results indicate that an order of magnitude performance improvement over current RISC technology is easily achievable

  12. A task parallel implementation of fast multipole methods

    KAUST Repository

    Taura, Kenjiro; Nakashima, Jun; Yokota, Rio; Maruyama, Naoya

    2012-01-01

    This paper describes a task parallel implementation of ExaFMM, an open source implementation of fast multipole methods (FMM), using a lightweight task parallel library MassiveThreads. Although there have been many attempts on parallelizing FMM

  13. Differences Between Distributed and Parallel Systems

    Energy Technology Data Exchange (ETDEWEB)

    Brightwell, R.; Maccabe, A.B.; Rissen, R.

    1998-10-01

    Distributed systems have been studied for twenty years and are now coming into wider use as fast networks and powerful workstations become more readily available. In many respects a massively parallel computer resembles a network of workstations and it is tempting to port a distributed operating system to such a machine. However, there are significant differences between these two environments and a parallel operating system is needed to get the best performance out of a massively parallel system. This report characterizes the differences between distributed systems, networks of workstations, and massively parallel systems and analyzes the impact of these differences on operating system design. In the second part of the report, we introduce Puma, an operating system specifically developed for massively parallel systems. We describe Puma portals, the basic building blocks for message passing paradigms implemented on top of Puma, and show how the differences observed in the first part of the report have influenced the design and implementation of Puma.

  14. Main: Nucleotide Analysis [KOME

    Lifescience Database Archive (English)

    Full Text Available Nucleotide Analysis Japonica genome blast search result Result of blastn search against jap...onica genome sequence kome_japonica_genome_blast_search_result.zip kome_japonica_genome_blast_search_result ...

  15. Cyclic nucleotides and radioresistnace

    International Nuclear Information System (INIS)

    Kulinskij, V.I.; Mikheeva, G.A.; Zel'manovich, B.M.

    1982-01-01

    The addition of glucose to meat-peptone broth does not change the radiosensitizing effect (RSE) of cAMP at the logarithmic phase (LP) and the radioprotective effect (RPE) at the stationary phase (SP), but sensitization, characteristic of cGMP, disappears in SP and turns into RPE in LP. Introduction of glucose into the broth for 20 min eliminates all the effects of both cyclic nucleotides in the cya + strain while cya - mutant exhibits RSE. RSE of both cyclic nucleotides is only manifested on minimal media. These data brought confirmation of the dependence of the influence of cyclic media. These data brought confirmation of the dependence of the influence of cyclic nucleotides on radioresistance upon the metabolic status of the cell [ru

  16. New massive gravity

    NARCIS (Netherlands)

    Bergshoeff, Eric A.; Hohm, Olaf; Townsend, Paul K.

    2012-01-01

    We present a brief review of New Massive Gravity, which is a unitary theory of massive gravitons in three dimensions obtained by considering a particular combination of the Einstein-Hilbert and curvature squared terms.

  17. Single Nucleotide Polymorphism

    DEFF Research Database (Denmark)

    Børsting, Claus; Pereira, Vania; Andersen, Jeppe Dyrberg

    2014-01-01

    Single nucleotide polymorphisms (SNPs) are the most frequent DNA sequence variations in the genome. They have been studied extensively in the last decade with various purposes in mind. In this chapter, we will discuss the advantages and disadvantages of using SNPs for human identification...... of SNPs. This will allow acquisition of more information from the sample materials and open up for new possibilities as well as new challenges....

  18. The arabidopsis cyclic nucleotide interactome

    KAUST Repository

    Donaldson, Lara Elizabeth; Meier, Stuart Kurt; Gehring, Christoph A

    2016-01-01

    Cyclic nucleotides have been shown to play important signaling roles in many physiological processes in plants including photosynthesis and defence. Despite this, little is known about cyclic nucleotide-dependent signaling mechanisms

  19. Parallel rendering

    Science.gov (United States)

    Crockett, Thomas W.

    1995-01-01

    This article provides a broad introduction to the subject of parallel rendering, encompassing both hardware and software systems. The focus is on the underlying concepts and the issues which arise in the design of parallel rendering algorithms and systems. We examine the different types of parallelism and how they can be applied in rendering applications. Concepts from parallel computing, such as data decomposition, task granularity, scalability, and load balancing, are considered in relation to the rendering problem. We also explore concepts from computer graphics, such as coherence and projection, which have a significant impact on the structure of parallel rendering algorithms. Our survey covers a number of practical considerations as well, including the choice of architectural platform, communication and memory requirements, and the problem of image assembly and display. We illustrate the discussion with numerous examples from the parallel rendering literature, representing most of the principal rendering methods currently used in computer graphics.

  20. Parallel computations

    CERN Document Server

    1982-01-01

    Parallel Computations focuses on parallel computation, with emphasis on algorithms used in a variety of numerical and physical applications and for many different types of parallel computers. Topics covered range from vectorization of fast Fourier transforms (FFTs) and of the incomplete Cholesky conjugate gradient (ICCG) algorithm on the Cray-1 to calculation of table lookups and piecewise functions. Single tridiagonal linear systems and vectorized computation of reactive flow are also discussed.Comprised of 13 chapters, this volume begins by classifying parallel computers and describing techn

  1. Ultrascalable petaflop parallel supercomputer

    Science.gov (United States)

    Blumrich, Matthias A [Ridgefield, CT; Chen, Dong [Croton On Hudson, NY; Chiu, George [Cross River, NY; Cipolla, Thomas M [Katonah, NY; Coteus, Paul W [Yorktown Heights, NY; Gara, Alan G [Mount Kisco, NY; Giampapa, Mark E [Irvington, NY; Hall, Shawn [Pleasantville, NY; Haring, Rudolf A [Cortlandt Manor, NY; Heidelberger, Philip [Cortlandt Manor, NY; Kopcsay, Gerard V [Yorktown Heights, NY; Ohmacht, Martin [Yorktown Heights, NY; Salapura, Valentina [Chappaqua, NY; Sugavanam, Krishnan [Mahopac, NY; Takken, Todd [Brewster, NY

    2010-07-20

    A massively parallel supercomputer of petaOPS-scale includes node architectures based upon System-On-a-Chip technology, where each processing node comprises a single Application Specific Integrated Circuit (ASIC) having up to four processing elements. The ASIC nodes are interconnected by multiple independent networks that optimally maximize the throughput of packet communications between nodes with minimal latency. The multiple networks may include three high-speed networks for parallel algorithm message passing including a Torus, collective network, and a Global Asynchronous network that provides global barrier and notification functions. These multiple independent networks may be collaboratively or independently utilized according to the needs or phases of an algorithm for optimizing algorithm processing performance. The use of a DMA engine is provided to facilitate message passing among the nodes without the expenditure of processing resources at the node.

  2. In silico screening for inhibitors of p-glycoprotein that target the nucleotide binding domains.

    Science.gov (United States)

    Brewer, Frances K; Follit, Courtney A; Vogel, Pia D; Wise, John G

    2014-12-01

    Multidrug resistances and the failure of chemotherapies are often caused by the expression or overexpression of ATP-binding cassette transporter proteins such as the multidrug resistance protein, P-glycoprotein (P-gp). P-gp is expressed in the plasma membrane of many cell types and protects cells from accumulation of toxins. P-gp uses ATP hydrolysis to catalyze the transport of a broad range of mostly hydrophobic compounds across the plasma membrane and out of the cell. During cancer chemotherapy, the administration of therapeutics often selects for cells which overexpress P-gp, thereby creating populations of cancer cells resistant to a variety of chemically unrelated chemotherapeutics. The present study describes extremely high-throughput, massively parallel in silico ligand docking studies aimed at identifying reversible inhibitors of ATP hydrolysis that target the nucleotide-binding domains of P-gp. We used a structural model of human P-gp that we obtained from molecular dynamics experiments as the protein target for ligand docking. We employed a novel approach of subtractive docking experiments that identified ligands that bound predominantly to the nucleotide-binding domains but not the drug-binding domains of P-gp. Four compounds were found that inhibit ATP hydrolysis by P-gp. Using electron spin resonance spectroscopy, we showed that at least three of these compounds affected nucleotide binding to the transporter. These studies represent a successful proof of principle demonstrating the potential of targeted approaches for identifying specific inhibitors of P-gp. Copyright © 2014 by The American Society for Pharmacology and Experimental Therapeutics.

  3. Parallelization of quantum molecular dynamics simulation code

    International Nuclear Information System (INIS)

    Kato, Kaori; Kunugi, Tomoaki; Shibahara, Masahiko; Kotake, Susumu

    1998-02-01

    A quantum molecular dynamics simulation code has been developed for the analysis of the thermalization of photon energies in the molecule or materials in Kansai Research Establishment. The simulation code is parallelized for both Scalar massively parallel computer (Intel Paragon XP/S75) and Vector parallel computer (Fujitsu VPP300/12). Scalable speed-up has been obtained with a distribution to processor units by division of particle group in both parallel computers. As a result of distribution to processor units not only by particle group but also by the particles calculation that is constructed with fine calculations, highly parallelization performance is achieved in Intel Paragon XP/S75. (author)

  4. Parallel algorithms

    CERN Document Server

    Casanova, Henri; Robert, Yves

    2008-01-01

    ""…The authors of the present book, who have extensive credentials in both research and instruction in the area of parallelism, present a sound, principled treatment of parallel algorithms. … This book is very well written and extremely well designed from an instructional point of view. … The authors have created an instructive and fascinating text. The book will serve researchers as well as instructors who need a solid, readable text for a course on parallelism in computing. Indeed, for anyone who wants an understandable text from which to acquire a current, rigorous, and broad vi

  5. Massive Conformal Gravity

    International Nuclear Information System (INIS)

    Faria, F. F.

    2014-01-01

    We construct a massive theory of gravity that is invariant under conformal transformations. The massive action of the theory depends on the metric tensor and a scalar field, which are considered the only field variables. We find the vacuum field equations of the theory and analyze its weak-field approximation and Newtonian limit.

  6. Scalable Strategies for Computing with Massive Data

    Directory of Open Access Journals (Sweden)

    Michael Kane

    2013-11-01

    Full Text Available This paper presents two complementary statistical computing frameworks that address challenges in parallel processing and the analysis of massive data. First, the foreach package allows users of the R programming environment to define parallel loops that may be run sequentially on a single machine, in parallel on a symmetric multiprocessing (SMP machine, or in cluster environments without platform-specific code. Second, the bigmemory package implements memory- and file-mapped data structures that provide (a access to arbitrarily large data while retaining a look and feel that is familiar to R users and (b data structures that are shared across processor cores in order to support efficient parallel computing techniques. Although these packages may be used independently, this paper shows how they can be used in combination to address challenges that have effectively been beyond the reach of researchers who lack specialized software development skills or expensive hardware.

  7. Cellular automata a parallel model

    CERN Document Server

    Mazoyer, J

    1999-01-01

    Cellular automata can be viewed both as computational models and modelling systems of real processes. This volume emphasises the first aspect. In articles written by leading researchers, sophisticated massive parallel algorithms (firing squad, life, Fischer's primes recognition) are treated. Their computational power and the specific complexity classes they determine are surveyed, while some recent results in relation to chaos from a new dynamic systems point of view are also presented. Audience: This book will be of interest to specialists of theoretical computer science and the parallelism challenge.

  8. Parallel plasma fluid turbulence calculations

    International Nuclear Information System (INIS)

    Leboeuf, J.N.; Carreras, B.A.; Charlton, L.A.; Drake, J.B.; Lynch, V.E.; Newman, D.E.; Sidikman, K.L.; Spong, D.A.

    1994-01-01

    The study of plasma turbulence and transport is a complex problem of critical importance for fusion-relevant plasmas. To this day, the fluid treatment of plasma dynamics is the best approach to realistic physics at the high resolution required for certain experimentally relevant calculations. Core and edge turbulence in a magnetic fusion device have been modeled using state-of-the-art, nonlinear, three-dimensional, initial-value fluid and gyrofluid codes. Parallel implementation of these models on diverse platforms--vector parallel (National Energy Research Supercomputer Center's CRAY Y-MP C90), massively parallel (Intel Paragon XP/S 35), and serial parallel (clusters of high-performance workstations using the Parallel Virtual Machine protocol)--offers a variety of paths to high resolution and significant improvements in real-time efficiency, each with its own advantages. The largest and most efficient calculations have been performed at the 200 Mword memory limit on the C90 in dedicated mode, where an overlap of 12 to 13 out of a maximum of 16 processors has been achieved with a gyrofluid model of core fluctuations. The richness of the physics captured by these calculations is commensurate with the increased resolution and efficiency and is limited only by the ingenuity brought to the analysis of the massive amounts of data generated

  9. The arabidopsis cyclic nucleotide interactome

    KAUST Repository

    Donaldson, Lara Elizabeth

    2016-05-11

    Background Cyclic nucleotides have been shown to play important signaling roles in many physiological processes in plants including photosynthesis and defence. Despite this, little is known about cyclic nucleotide-dependent signaling mechanisms in plants since the downstream target proteins remain unknown. This is largely due to the fact that bioinformatics searches fail to identify plant homologs of protein kinases and phosphodiesterases that are the main targets of cyclic nucleotides in animals. Methods An affinity purification technique was used to identify cyclic nucleotide binding proteins in Arabidopsis thaliana. The identified proteins were subjected to a computational analysis that included a sequence, transcriptional co-expression and functional annotation analysis in order to assess their potential role in plant cyclic nucleotide signaling. Results A total of twelve cyclic nucleotide binding proteins were identified experimentally including key enzymes in the Calvin cycle and photorespiration pathway. Importantly, eight of the twelve proteins were shown to contain putative cyclic nucleotide binding domains. Moreover, the identified proteins are post-translationally modified by nitric oxide, transcriptionally co-expressed and annotated to function in hydrogen peroxide signaling and the defence response. The activity of one of these proteins, GLYGOLATE OXIDASE 1, a photorespiratory enzyme that produces hydrogen peroxide in response to Pseudomonas, was shown to be repressed by a combination of cGMP and nitric oxide treatment. Conclusions We propose that the identified proteins function together as points of cross-talk between cyclic nucleotide, nitric oxide and reactive oxygen species signaling during the defence response.

  10. Topological massive sigma models

    International Nuclear Information System (INIS)

    Lambert, N.D.

    1995-01-01

    In this paper we construct topological sigma models which include a potential and are related to twisted massive supersymmetric sigma models. Contrary to a previous construction these models have no central charge and do not require the manifold to admit a Killing vector. We use the topological massive sigma model constructed here to simplify the calculation of the observables. Lastly it is noted that this model can be viewed as interpolating between topological massless sigma models and topological Landau-Ginzburg models. ((orig.))

  11. Massive neutrinos in astrophysics

    International Nuclear Information System (INIS)

    Qadir, A.

    1982-08-01

    Massive neutrinos are among the big hopes of cosmologists. If they happen to have the right mass they can close the Universe, explain the motion of galaxies in clusters, provide galactic halos and even, possibly, explain galaxy formation. Tremaine and Gunn have argued that massive neutrinos cannot do all these things. I will explain, here, what some of us believe is wrong with their arguments. (author)

  12. Parallel thermal radiation transport in two dimensions

    International Nuclear Information System (INIS)

    Smedley-Stevenson, R.P.; Ball, S.R.

    2003-01-01

    This paper describes the distributed memory parallel implementation of a deterministic thermal radiation transport algorithm in a 2-dimensional ALE hydrodynamics code. The parallel algorithm consists of a variety of components which are combined in order to produce a state of the art computational capability, capable of solving large thermal radiation transport problems using Blue-Oak, the 3 Tera-Flop MPP (massive parallel processors) computing facility at AWE (United Kingdom). Particular aspects of the parallel algorithm are described together with examples of the performance on some challenging applications. (author)

  13. Parallel thermal radiation transport in two dimensions

    Energy Technology Data Exchange (ETDEWEB)

    Smedley-Stevenson, R.P.; Ball, S.R. [AWE Aldermaston (United Kingdom)

    2003-07-01

    This paper describes the distributed memory parallel implementation of a deterministic thermal radiation transport algorithm in a 2-dimensional ALE hydrodynamics code. The parallel algorithm consists of a variety of components which are combined in order to produce a state of the art computational capability, capable of solving large thermal radiation transport problems using Blue-Oak, the 3 Tera-Flop MPP (massive parallel processors) computing facility at AWE (United Kingdom). Particular aspects of the parallel algorithm are described together with examples of the performance on some challenging applications. (author)

  14. Massive graviton geons

    Science.gov (United States)

    Aoki, Katsuki; Maeda, Kei-ichi; Misonoh, Yosuke; Okawa, Hirotada

    2018-02-01

    We find vacuum solutions such that massive gravitons are confined in a local spacetime region by their gravitational energy in asymptotically flat spacetimes in the context of the bigravity theory. We call such self-gravitating objects massive graviton geons. The basic equations can be reduced to the Schrödinger-Poisson equations with the tensor "wave function" in the Newtonian limit. We obtain a nonspherically symmetric solution with j =2 , ℓ=0 as well as a spherically symmetric solution with j =0 , ℓ=2 in this system where j is the total angular momentum quantum number and ℓ is the orbital angular momentum quantum number, respectively. The energy eigenvalue of the Schrödinger equation in the nonspherical solution is smaller than that in the spherical solution. We then study the perturbative stability of the spherical solution and find that there is an unstable mode in the quadrupole mode perturbations which may be interpreted as the transition mode to the nonspherical solution. The results suggest that the nonspherically symmetric solution is the ground state of the massive graviton geon. The massive graviton geons may decay in time due to emissions of gravitational waves but this timescale can be quite long when the massive gravitons are nonrelativistic and then the geons can be long-lived. We also argue possible prospects of the massive graviton geons: applications to the ultralight dark matter scenario, nonlinear (in)stability of the Minkowski spacetime, and a quantum transition of the spacetime.

  15. Rapid Genome-wide Single Nucleotide Polymorphism Discovery in Soybean and Rice via Deep Resequencing of Reduced Representation Libraries with the Illumina Genome Analyzer

    Directory of Open Access Journals (Sweden)

    Stéphane Deschamps

    2010-07-01

    Full Text Available Massively parallel sequencing platforms have allowed for the rapid discovery of single nucleotide polymorphisms (SNPs among related genotypes within a species. We describe the creation of reduced representation libraries (RRLs using an initial digestion of nuclear genomic DNA with a methylation-sensitive restriction endonuclease followed by a secondary digestion with the 4bp-restriction endonuclease This strategy allows for the enrichment of hypomethylated genomic DNA, which has been shown to be rich in genic sequences, and the digestion with serves to increase the number of common loci resequenced between individuals. Deep resequencing of these RRLs performed with the Illumina Genome Analyzer led to the identification of 2618 SNPs in rice and 1682 SNPs in soybean for two representative genotypes in each of the species. A subset of these SNPs was validated via Sanger sequencing, exhibiting validation rates of 96.4 and 97.0%, in rice ( and soybean (, respectively. Comparative analysis of the read distribution relative to annotated genes in the reference genome assemblies indicated that the RRL strategy was primarily sampling within genic regions for both species. The massively parallel sequencing of methylation-sensitive RRLs for genome-wide SNP discovery can be applied across a wide range of plant species having sufficient reference genomic sequence.

  16. Time complexity analysis for distributed memory computers: implementation of parallel conjugate gradient method

    NARCIS (Netherlands)

    Hoekstra, A.G.; Sloot, P.M.A.; Haan, M.J.; Hertzberger, L.O.; van Leeuwen, J.

    1991-01-01

    New developments in Computer Science, both hardware and software, offer researchers, such as physicists, unprecedented possibilities to solve their computational intensive problems.However, full exploitation of e.g. new massively parallel computers, parallel languages or runtime environments

  17. Parallel computation

    International Nuclear Information System (INIS)

    Jejcic, A.; Maillard, J.; Maurel, G.; Silva, J.; Wolff-Bacha, F.

    1997-01-01

    The work in the field of parallel processing has developed as research activities using several numerical Monte Carlo simulations related to basic or applied current problems of nuclear and particle physics. For the applications utilizing the GEANT code development or improvement works were done on parts simulating low energy physical phenomena like radiation, transport and interaction. The problem of actinide burning by means of accelerators was approached using a simulation with the GEANT code. A program of neutron tracking in the range of low energies up to the thermal region has been developed. It is coupled to the GEANT code and permits in a single pass the simulation of a hybrid reactor core receiving a proton burst. Other works in this field refers to simulations for nuclear medicine applications like, for instance, development of biological probes, evaluation and characterization of the gamma cameras (collimators, crystal thickness) as well as the method for dosimetric calculations. Particularly, these calculations are suited for a geometrical parallelization approach especially adapted to parallel machines of the TN310 type. Other works mentioned in the same field refer to simulation of the electron channelling in crystals and simulation of the beam-beam interaction effect in colliders. The GEANT code was also used to simulate the operation of germanium detectors designed for natural and artificial radioactivity monitoring of environment

  18. A computational fluid dynamics algorithm on a massively parallel computer

    International Nuclear Information System (INIS)

    Jespersen, D.C.; Levit, C.

    1989-01-01

    The implementation and performance of a finite-difference algorithm for the compressible Navier-Stokes equations in two or three dimensions on the Connection Machine are described. This machine is a single-instruction multiple-data machine with up to 65536 physical processors. The implicit portion of the algorithm is of particular interest. Running times and megadrop rates are given for two- and three-dimensional problems. Included are comparisons with the standard codes on a Cray X-MP/48. 15 refs

  19. Ocean Modeling and Visualization on Massively Parallel Computer

    Science.gov (United States)

    Chao, Yi; Li, P. Peggy; Wang, Ping; Katz, Daniel S.; Cheng, Benny N.

    1997-01-01

    Climate modeling is one of the grand challenges of computational science, and ocean modeling plays an important role in both understanding the current climatic conditions and predicting future climate change.

  20. SNP detection for massively parallel whole-genome resequencing

    DEFF Research Database (Denmark)

    Li, Ruiqiang; Li, Yingrui; Fang, Xiaodong

    2009-01-01

    -genome or target region resequencing. Here, we have developed a consensus-calling and SNP-detection method for sequencing-by-synthesis Illumina Genome Analyzer technology. We designed this method by carefully considering the data quality, alignment, and experimental errors common to this technology. All...... of this information was integrated into a single quality score for each base under Bayesian theory to measure the accuracy of consensus calling. We tested this methodology using a large-scale human resequencing data set of 36x coverage and assembled a high-quality nonrepetitive consensus sequence for 92.......25% of the diploid autosomes and 88.07% of the haploid X chromosome. Comparison of the consensus sequence with Illumina human 1M BeadChip genotyped alleles from the same DNA sample showed that 98.6% of the 37,933 genotyped alleles on the X chromosome and 98% of 999,981 genotyped alleles on autosomes were covered...

  1. Approximate inference for spatial functional data on massively parallel processors

    DEFF Research Database (Denmark)

    Raket, Lars Lau; Markussen, Bo

    2014-01-01

    With continually increasing data sizes, the relevance of the big n problem of classical likelihood approaches is greater than ever. The functional mixed-effects model is a well established class of models for analyzing functional data. Spatial functional data in a mixed-effects setting...... in linear time. An extremely efficient GPU implementation is presented, and the proposed methods are illustrated by conducting a classical statistical analysis of 2D chromatography data consisting of more than 140 million spatially correlated observation points....

  2. Massively Parallel Post-Packaging for Microelectromechanical Systems (MEMS)

    National Research Council Canada - National Science Library

    Lin, Liwei

    2003-01-01

    ...) demonstrations and characterizations of post-fabrication device trimming. In summary, we were able to develop several new localized bonding processes, including eutectic bonding, fusion bonding, solder bonding, chemical vapor deposition (CVD...

  3. Alternative stitching method for massively parallel e-beam lithography

    Science.gov (United States)

    Brandt, Pieter; Tranquillin, Céline; Wieland, Marco; Bayle, Sébastien; Milléquant, Matthieu; Renault, Guillaume

    2015-07-01

    In this study, a stitching method other than soft edge (SE) and smart boundary (SB) is introduced and benchmarked against SE. The method is based on locally enhanced exposure latitude without throughput cost, making use of the fact that the two beams that pass through the stitching region can deposit up to 2× the nominal dose. The method requires a complex proximity effect correction that takes a preset stitching dose profile into account. Although the principle of the presented stitching method can be multibeam (lithography) systems in general, in this study, the MAPPER FLX 1200 tool is specifically considered. For the latter tool at a metal clip at minimum half-pitch of 32 nm, the stitching method effectively mitigates beam-to-beam (B2B) position errors such that they do not induce an increase in critical dimension uniformity (CDU). In other words, the same CDU can be realized inside the stitching region as outside the stitching region. For the SE method, the CDU inside is 0.3 nm higher than outside the stitching region. A 5-nm direct overlay impact from the B2B position errors cannot be reduced by a stitching strategy.

  4. Lagrange constraint neural networks for massive pixel parallel image demixing

    Science.gov (United States)

    Szu, Harold H.; Hsu, Charles C.

    2002-03-01

    We have shown that the remote sensing optical imaging to achieve detailed sub-pixel decomposition is a unique application of blind source separation (BSS) that is truly linear of far away weak signal, instantaneous speed of light without delay, and along the line of sight without multiple paths. In early papers, we have presented a direct application of statistical mechanical de-mixing method called Lagrange Constraint Neural Network (LCNN). While the BSAO algorithm (using a posteriori MaxEnt ANN and neighborhood pixel average) is not acceptable for remote sensing, a mirror symmetric LCNN approach is all right assuming a priori MaxEnt for unknown sources to be averaged over the source statistics (not neighborhood pixel data) in a pixel-by-pixel independent fashion. LCNN reduces the computation complexity, save a great number of memory devices, and cut the cost of implementation. The Landsat system is designed to measure the radiation to deduce surface conditions and materials. For any given material, the amount of emitted and reflected radiation varies by the wavelength. In practice, a single pixel of a Landsat image has seven channels receiving 0.1 to 12 microns of radiation from the ground within a 20x20 meter footprint containing a variety of radiation materials. A-priori LCNN algorithm provides the spatial-temporal variation of mixture that is hardly de-mixable by other a-posteriori BSS or ICA methods. We have already compared the Landsat remote sensing using both methods in WCCI 2002 Hawaii. Unfortunately the absolute benchmark is not possible because of lacking of the ground truth. We will arbitrarily mix two incoherent sampled images as the ground truth. However, the constant total probability of co-located sources within the pixel footprint is necessary for the remote sensing constraint (since on a clear day the total reflecting energy is constant in neighborhood receiving pixel sensors), we have to normalized two image pixel-by-pixel as well. Then, the result is indeed as expected.

  5. Nyx: Adaptive mesh, massively-parallel, cosmological simulation code

    Science.gov (United States)

    Almgren, Ann; Beckner, Vince; Friesen, Brian; Lukic, Zarija; Zhang, Weiqun

    2017-12-01

    Nyx code solves equations of compressible hydrodynamics on an adaptive grid hierarchy coupled with an N-body treatment of dark matter. The gas dynamics in Nyx use a finite volume methodology on an adaptive set of 3-D Eulerian grids; dark matter is represented as discrete particles moving under the influence of gravity. Particles are evolved via a particle-mesh method, using Cloud-in-Cell deposition/interpolation scheme. Both baryonic and dark matter contribute to the gravitational field. In addition, Nyx includes physics for accurately modeling the intergalactic medium; in optically thin limits and assuming ionization equilibrium, the code calculates heating and cooling processes of the primordial-composition gas in an ionizing ultraviolet background radiation field.

  6. Massively Parallel Optical-to-Electronic Data Transfer

    Science.gov (United States)

    1992-07-27

    is an angle of 0.43 degrees (0.75 mrad). A TeO2 Bragg cell with an acoustic velocity of 6.16 x 10 cm/sec and a center frequency of 60 Mhz has an...laminated face to face and attached to a glass substrate. As shown in Figure 5-1, the diffracted 632.8 nm power exhibits a true maximum because, unlike the...in Photopolymers An attempt was made to stabilize the geometry of the photopolymer by infiltrating a solution of the photopolymer into a porus glass

  7. Programming a massively parallel, computation universal system: static behavior

    Energy Technology Data Exchange (ETDEWEB)

    Lapedes, A.; Farber, R.

    1986-01-01

    In previous work by the authors, the ''optimum finding'' properties of Hopfield neural nets were applied to the nets themselves to create a ''neural compiler.'' This was done in such a way that the problem of programming the attractors of one neural net (called the Slave net) was expressed as an optimization problem that was in turn solved by a second neural net (the Master net). In this series of papers that approach is extended to programming nets that contain interneurons (sometimes called ''hidden neurons''), and thus deals with nets capable of universal computation. 22 refs.

  8. A programmable method for massively parallel targeted sequencing

    Science.gov (United States)

    Hopmans, Erik S.; Natsoulis, Georges; Bell, John M.; Grimes, Susan M.; Sieh, Weiva; Ji, Hanlee P.

    2014-01-01

    We have developed a targeted resequencing approach referred to as Oligonucleotide-Selective Sequencing. In this study, we report a series of significant improvements and novel applications of this method whereby the surface of a sequencing flow cell is modified in situ to capture specific genomic regions of interest from a sample and then sequenced. These improvements include a fully automated targeted sequencing platform through the use of a standard Illumina cBot fluidics station. Targeting optimization increased the yield of total on-target sequencing data 2-fold compared to the previous iteration, while simultaneously increasing the percentage of reads that could be mapped to the human genome. The described assays cover up to 1421 genes with a total coverage of 5.5 Megabases (Mb). We demonstrate a 10-fold abundance uniformity of greater than 90% in 1 log distance from the median and a targeting rate of up to 95%. We also sequenced continuous genomic loci up to 1.5 Mb while simultaneously genotyping SNPs and genes. Variants with low minor allele fraction were sensitively detected at levels of 5%. Finally, we determined the exact breakpoint sequence of cancer rearrangements. Overall, this approach has high performance for selective sequencing of genome targets, configuration flexibility and variant calling accuracy. PMID:24782526

  9. Massively Parallel Polar Decomposition on Distributed-Memory Systems

    KAUST Repository

    Ltaief, Hatem; Sukkari, Dalal E.; Esposito, Aniello; Nakatsukasa, Yuji; Keyes, David E.

    2018-01-01

    We present a high-performance implementation of the Polar Decomposition (PD) on distributed-memory systems. Building upon on the QR-based Dynamically Weighted Halley (QDWH) algorithm, the key idea lies in finding the best rational approximation

  10. Massively parallel sequencing and analysis of the Necator americanus transcriptome.

    Directory of Open Access Journals (Sweden)

    Cinzia Cantacessi

    2010-05-01

    Full Text Available The blood-feeding hookworm Necator americanus infects hundreds of millions of people worldwide. In order to elucidate fundamental molecular biological aspects of this hookworm, the transcriptome of the adult stage of Necator americanus was explored using next-generation sequencing and bioinformatic analyses.A total of 19,997 contigs were assembled from the sequence data; 6,771 of these contigs had known orthologues in the free-living nematode Caenorhabditis elegans, and most of them encoded proteins with WD40 repeats (10.6%, proteinase inhibitors (7.8% or calcium-binding EF-hand proteins (6.7%. Bioinformatic analyses inferred that the C. elegans homologues are involved mainly in biological pathways linked to ribosome biogenesis (70%, oxidative phosphorylation (63% and/or proteases (60%; most of these molecules were predicted to be involved in more than one biological pathway. Comparative analyses of the transcriptomes of N. americanus and the canine hookworm, Ancylostoma caninum, revealed qualitative and quantitative differences. For instance, proteinase inhibitors were inferred to be highly represented in the former species, whereas SCP/Tpx-1/Ag5/PR-1/Sc7 proteins ( = SCP/TAPS or Ancylostoma-secreted proteins were predominant in the latter. In N. americanus, essential molecules were predicted using a combination of orthology mapping and functional data available for C. elegans. Further analyses allowed the prioritization of 18 predicted drug targets which did not have homologues in the human host. These candidate targets were inferred to be linked to mitochondrial (e.g., processing proteins or amino acid metabolism (e.g., asparagine t-RNA synthetase.This study has provided detailed insights into the transcriptome of the adult stage of N. americanus and examines similarities and differences between this species and A. caninum. Future efforts should focus on comparative transcriptomic and proteomic investigations of the other predominant human hookworm, A. duodenale, for both fundamental and applied purposes, including the prevalidation of anti-hookworm drug targets.

  11. Optimisation of a parallel ocean general circulation model

    OpenAIRE

    M. I. Beare; D. P. Stevens

    1997-01-01

    International audience; This paper presents the development of a general-purpose parallel ocean circulation model, for use on a wide range of computer platforms, from traditional scalar machines to workstation clusters and massively parallel processors. Parallelism is provided, as a modular option, via high-level message-passing routines, thus hiding the technical intricacies from the user. An initial implementation highlights that the parallel efficiency of the model is adversely affected by...

  12. Antinociceptive effect of purine nucleotides.

    Science.gov (United States)

    Mello, C F; Begnini, J; De-La-Vega, D D; Lopes, F P; Schwartz, C C; Jimenez-Bernal, R E; Bellot, R G; Frussa-Filho, R

    1996-10-01

    The antinociceptive effect of purine nucleotides administered systematically (sc) was determined using the formalin and writhing tests in adult male albino mice. The mechanisms underlying nucleotide-induced antinociception were investigated by preinjecting the animals (sc) with specific antagonists for opioid (naloxone, 1 mg/kg), purinergic P1 (caffeine, 5, 10, of 30 mg/kg); theophylline, 10 mg/kg) or purinergic P2 receptors (suramin, 100 mg/kg; Coomassie blue, 30-300 mg/kg; quinidine, 10 mg/kg). Adenosine, adenosine monophosphate (AMP), diphosphate (ADP) and triphosphate (ATP) caused a reduction in the number of writhes and in the time of licking the formalin-injected paw. Naloxone had no effect on adenosine- or adenine nucleotide-induced antinociception. Caffeine (30 mg/kg) and theophylline (10 mg/kg) reversed the antinociceptive action of adenosine and adenine nucleotide derivatives in both tests. P2 antagonists did not reverse adenine nucleotide-induced antinociception. These results suggest that antinociceptive effect of adenine nucleotides is mediated by adenosine.

  13. Epidemiology of Massive Transfusion

    DEFF Research Database (Denmark)

    Halmin, Märit; Chiesa, Flaminia; Vasan, Senthil K

    2016-01-01

    in Sweden from 1987 and in Denmark from 1996. A total of 92,057 patients were included. Patients were followed until the end of 2012. MEASUREMENTS AND MAIN RESULTS: Descriptive statistics were used to characterize the patients and indications. Post transfusion mortality was expressed as crude 30-day...... mortality and as long-term mortality using the Kaplan-Meier method and using standardized mortality ratios. The incidence of massive transfusion was higher in Denmark (4.5 per 10,000) than in Sweden (2.5 per 10,000). The most common indication for massive transfusion was major surgery (61.2%) followed...

  14. Topologically massive supergravity

    Directory of Open Access Journals (Sweden)

    S. Deser

    1983-01-01

    Full Text Available The locally supersymmetric extension of three-dimensional topologically massive gravity is constructed. Its fermionic part is the sum of the (dynamically trivial Rarita-Schwinger action and a gauge-invariant topological term, of second derivative order, analogous to the gravitational one. It is ghost free and represents a single massive spin 3/2 excitation. The fermion-gravity coupling is minimal and the invariance is under the usual supergravity transformations. The system's energy, as well as that of the original topological gravity, is therefore positive.

  15. Epidemiology of massive transfusion

    DEFF Research Database (Denmark)

    Halmin, M A; Chiesa, F; Vasan, S K

    2015-01-01

    and to describe characteristics and mortality of massively transfused patients. Methods: We performed a retrospective cohort study based on the Scandinavian Donations and Transfusions (SCANDAT2) database, linking data on blood donation, blood components and transfused patients with inpatient- and population.......4% among women transfused for obstetrical bleeding. Mortality increased gradually with age and among all patients massively transfused at age 80 years, only 26% were alive [TABLE PRESENTED] after 5 years. The relative mortality, early after transfusion, was high and decreased with time since transfusion...

  16. Parallel R

    CERN Document Server

    McCallum, Ethan

    2011-01-01

    It's tough to argue with R as a high-quality, cross-platform, open source statistical software product-unless you're in the business of crunching Big Data. This concise book introduces you to several strategies for using R to analyze large datasets. You'll learn the basics of Snow, Multicore, Parallel, and some Hadoop-related tools, including how to find them, how to use them, when they work well, and when they don't. With these packages, you can overcome R's single-threaded nature by spreading work across multiple CPUs, or offloading work to multiple machines to address R's memory barrier.

  17. Design considerations for parallel graphics libraries

    Science.gov (United States)

    Crockett, Thomas W.

    1994-01-01

    Applications which run on parallel supercomputers are often characterized by massive datasets. Converting these vast collections of numbers to visual form has proven to be a powerful aid to comprehension. For a variety of reasons, it may be desirable to provide this visual feedback at runtime. One way to accomplish this is to exploit the available parallelism to perform graphics operations in place. In order to do this, we need appropriate parallel rendering algorithms and library interfaces. This paper provides a tutorial introduction to some of the issues which arise in designing parallel graphics libraries and their underlying rendering algorithms. The focus is on polygon rendering for distributed memory message-passing systems. We illustrate our discussion with examples from PGL, a parallel graphics library which has been developed on the Intel family of parallel systems.

  18. Radiology in massive hemoptysis

    International Nuclear Information System (INIS)

    Marini, M.; Castro, J.M.; Gayol, A.; Aguilera, C.; Blanco, M.; Beraza, A.; Torres, J.

    1995-01-01

    We have reviewed our experience in diseases involving massive hemoptysis, systematizing the most common causes which include tuberculosis, bronchiectasis and cancer of the lung. Other less frequent causes, such as arteriovenous fistula, Aspergilloma, aneurysm, etc.; are also evaluated, and the most demonstrative images of each produced by the most precise imaging methods for their assessment are presented

  19. Massive Supergravity and Deconstruction

    CERN Document Server

    Gregoire, T; Shadmi, Y; Gregoire, Thomas; Schwartz, Matthew D; Shadmi, Yael

    2004-01-01

    We present a simple superfield Lagrangian for massive supergravity. It comprises the minimal supergravity Lagrangian with interactions as well as mass terms for the metric superfield and the chiral compensator. This is the natural generalization of the Fierz-Pauli Lagrangian for massive gravity which comprises mass terms for the metric and its trace. We show that the on-shell bosonic and fermionic fields are degenerate and have the appropriate spins: 2, 3/2, 3/2 and 1. We then study this interacting Lagrangian using goldstone superfields. We find that a chiral multiplet of goldstones gets a kinetic term through mixing, just as the scalar goldstone does in the non-supersymmetric case. This produces Planck scale (Mpl) interactions with matter and all the discontinuities and unitarity bounds associated with massive gravity. In particular, the scale of strong coupling is (Mpl m^4)^1/5, where m is the multiplet's mass. Next, we consider applications of massive supergravity to deconstruction. We estimate various qu...

  20. Update on massive transfusion.

    Science.gov (United States)

    Pham, H P; Shaz, B H

    2013-12-01

    Massive haemorrhage requires massive transfusion (MT) to maintain adequate circulation and haemostasis. For optimal management of massively bleeding patients, regardless of aetiology (trauma, obstetrical, surgical), effective preparation and communication between transfusion and other laboratory services and clinical teams are essential. A well-defined MT protocol is a valuable tool to delineate how blood products are ordered, prepared, and delivered; determine laboratory algorithms to use as transfusion guidelines; and outline duties and facilitate communication between involved personnel. In MT patients, it is crucial to practice damage control resuscitation and to administer blood products early in the resuscitation. Trauma patients are often admitted with early trauma-induced coagulopathy (ETIC), which is associated with mortality; the aetiology of ETIC is likely multifactorial. Current data support that trauma patients treated with higher ratios of plasma and platelet to red blood cell transfusions have improved outcomes, but further clinical investigation is needed. Additionally, tranexamic acid has been shown to decrease the mortality in trauma patients requiring MT. Greater use of cryoprecipitate or fibrinogen concentrate might be beneficial in MT patients from obstetrical causes. The risks and benefits for other therapies (prothrombin complex concentrate, recombinant activated factor VII, or whole blood) are not clearly defined in MT patients. Throughout the resuscitation, the patient should be closely monitored and both metabolic and coagulation abnormalities corrected. Further studies are needed to clarify the optimal ratios of blood products, treatment based on underlying clinical disorder, use of alternative therapies, and integration of laboratory testing results in the management of massively bleeding patients.

  1. Massive antenatal fetomaternal hemorrhage

    DEFF Research Database (Denmark)

    Dziegiel, Morten Hanefeld; Koldkjaer, Ole; Berkowicz, Adela

    2005-01-01

    Massive fetomaternal hemorrhage (FMH) can lead to life-threatening anemia. Quantification based on flow cytometry with anti-hemoglobin F (HbF) is applicable in all cases but underestimation of large fetal bleeds has been reported. A large FMH from an ABO-compatible fetus allows an estimation...

  2. COLA with massive neutrinos

    Energy Technology Data Exchange (ETDEWEB)

    Wright, Bill S.; Winther, Hans A.; Koyama, Kazuya, E-mail: bill.wright@port.ac.uk, E-mail: hans.winther@port.ac.uk, E-mail: kazuya.koyama@port.ac.uk [Institute of Cosmology and Gravitation, University of Portsmouth, Dennis Sciama Building, Burnaby Road, Portsmouth, Hampshire, PO1 3FX (United Kingdom)

    2017-10-01

    The effect of massive neutrinos on the growth of cold dark matter perturbations acts as a scale-dependent Newton's constant and leads to scale-dependent growth factors just as we often find in models of gravity beyond General Relativity. We show how to compute growth factors for ΛCDM and general modified gravity cosmologies combined with massive neutrinos in Lagrangian perturbation theory for use in COLA and extensions thereof. We implement this together with the grid-based massive neutrino method of Brandbyge and Hannestad in MG-PICOLA and compare COLA simulations to full N -body simulations of ΛCDM and f ( R ) gravity with massive neutrinos. Our implementation is computationally cheap if the underlying cosmology already has scale-dependent growth factors and it is shown to be able to produce results that match N -body to percent level accuracy for both the total and CDM matter power-spectra up to k ∼< 1 h /Mpc.

  3. Parallel Lines

    Directory of Open Access Journals (Sweden)

    James G. Worner

    2017-05-01

    Full Text Available James Worner is an Australian-based writer and scholar currently pursuing a PhD at the University of Technology Sydney. His research seeks to expose masculinities lost in the shadow of Australia’s Anzac hegemony while exploring new opportunities for contemporary historiography. He is the recipient of the Doctoral Scholarship in Historical Consciousness at the university’s Australian Centre of Public History and will be hosted by the University of Bologna during 2017 on a doctoral research writing scholarship.   ‘Parallel Lines’ is one of a collection of stories, The Shapes of Us, exploring liminal spaces of modern life: class, gender, sexuality, race, religion and education. It looks at lives, like lines, that do not meet but which travel in proximity, simultaneously attracted and repelled. James’ short stories have been published in various journals and anthologies.

  4. Massive propagators in instanton fields

    International Nuclear Information System (INIS)

    Brown, L.S.; Lee, C.

    1978-01-01

    Green's functions for massive spinor and vector particles propagating in a self-dual but otherwise arbitrary non-Abelian gauge field are shown to be completely determined by the corresponding Green's functions of massive scalar particles

  5. Permutations of massive vacua

    Energy Technology Data Exchange (ETDEWEB)

    Bourget, Antoine [Department of Physics, Universidad de Oviedo, Avenida Calvo Sotelo 18, 33007 Oviedo (Spain); Troost, Jan [Laboratoire de Physique Théorique de l’É cole Normale Supérieure, CNRS,PSL Research University, Sorbonne Universités, 75005 Paris (France)

    2017-05-09

    We discuss the permutation group G of massive vacua of four-dimensional gauge theories with N=1 supersymmetry that arises upon tracing loops in the space of couplings. We concentrate on superconformal N=4 and N=2 theories with N=1 supersymmetry preserving mass deformations. The permutation group G of massive vacua is the Galois group of characteristic polynomials for the vacuum expectation values of chiral observables. We provide various techniques to effectively compute characteristic polynomials in given theories, and we deduce the existence of varying symmetry breaking patterns of the duality group depending on the gauge algebra and matter content of the theory. Our examples give rise to interesting field extensions of spaces of modular forms.

  6. Massive stars in galaxies

    International Nuclear Information System (INIS)

    Humphreys, R.M.

    1987-01-01

    The relationship between the morphologic type of a galaxy and the evolution of its massive stars is explored, reviewing observational results for nearby galaxies. The data are presented in diagrams, and it is found that the massive-star populations of most Sc spiral galaxies and irregular galaxies are similar, while those of Sb spirals such as M 31 and M 81 may be affected by morphology (via differences in the initial mass function or star-formation rate). Consideration is also given to the stability-related upper luminosity limit in the H-R diagram of hypergiant stars (attributed to radiation pressure in hot stars and turbulence in cool stars) and the goals of future observation campaigns. 88 references

  7. Massive Open Online Courses

    Directory of Open Access Journals (Sweden)

    Tharindu Rekha Liyanagunawardena

    2015-01-01

    Full Text Available Massive Open Online Courses (MOOCs are a new addition to the open educational provision. They are offered mainly by prestigious universities on various commercial and non-commercial MOOC platforms allowing anyone who is interested to experience the world class teaching practiced in these universities. MOOCs have attracted wide interest from around the world. However, learner demographics in MOOCs suggest that some demographic groups are underrepresented. At present MOOCs seem to be better serving the continuous professional development sector.

  8. Evolution of massive stars

    International Nuclear Information System (INIS)

    Loore, C. de

    1984-01-01

    The evolution of stars with masses larger than 15 sun masses is reviewed. These stars have large convective cores and lose a substantial fraction of their matter by stellar wind. The treatment of convection and the parameterisation of the stellar wind mass loss are analysed within the context of existing disagreements between theory and observation. The evolution of massive close binaries and the origin of Wolf-Rayet Stars and X-ray binaries is also sketched. (author)

  9. Parallel algorithms and cluster computing

    CERN Document Server

    Hoffmann, Karl Heinz

    2007-01-01

    This book presents major advances in high performance computing as well as major advances due to high performance computing. It contains a collection of papers in which results achieved in the collaboration of scientists from computer science, mathematics, physics, and mechanical engineering are presented. From the science problems to the mathematical algorithms and on to the effective implementation of these algorithms on massively parallel and cluster computers we present state-of-the-art methods and technology as well as exemplary results in these fields. This book shows that problems which seem superficially distinct become intimately connected on a computational level.

  10. From parallel to distributed computing for reactive scattering calculations

    International Nuclear Information System (INIS)

    Lagana, A.; Gervasi, O.; Baraglia, R.

    1994-01-01

    Some reactive scattering codes have been ported on different innovative computer architectures ranging from massively parallel machines to clustered workstations. The porting has required a drastic restructuring of the codes to single out computationally decoupled cpu intensive subsections. The suitability of different theoretical approaches for parallel and distributed computing restructuring is discussed and the efficiency of related algorithms evaluated

  11. Parallel Implicit Algorithms for CFD

    Science.gov (United States)

    Keyes, David E.

    1998-01-01

    The main goal of this project was efficient distributed parallel and workstation cluster implementations of Newton-Krylov-Schwarz (NKS) solvers for implicit Computational Fluid Dynamics (CFD.) "Newton" refers to a quadratically convergent nonlinear iteration using gradient information based on the true residual, "Krylov" to an inner linear iteration that accesses the Jacobian matrix only through highly parallelizable sparse matrix-vector products, and "Schwarz" to a domain decomposition form of preconditioning the inner Krylov iterations with primarily neighbor-only exchange of data between the processors. Prior experience has established that Newton-Krylov methods are competitive solvers in the CFD context and that Krylov-Schwarz methods port well to distributed memory computers. The combination of the techniques into Newton-Krylov-Schwarz was implemented on 2D and 3D unstructured Euler codes on the parallel testbeds that used to be at LaRC and on several other parallel computers operated by other agencies or made available by the vendors. Early implementations were made directly in Massively Parallel Integration (MPI) with parallel solvers we adapted from legacy NASA codes and enhanced for full NKS functionality. Later implementations were made in the framework of the PETSC library from Argonne National Laboratory, which now includes pseudo-transient continuation Newton-Krylov-Schwarz solver capability (as a result of demands we made upon PETSC during our early porting experiences). A secondary project pursued with funding from this contract was parallel implicit solvers in acoustics, specifically in the Helmholtz formulation. A 2D acoustic inverse problem has been solved in parallel within the PETSC framework.

  12. A Parallel Software Pipeline for DMET Microarray Genotyping Data Analysis

    Directory of Open Access Journals (Sweden)

    Giuseppe Agapito

    2018-06-01

    Full Text Available Personalized medicine is an aspect of the P4 medicine (predictive, preventive, personalized and participatory based precisely on the customization of all medical characters of each subject. In personalized medicine, the development of medical treatments and drugs is tailored to the individual characteristics and needs of each subject, according to the study of diseases at different scales from genotype to phenotype scale. To make concrete the goal of personalized medicine, it is necessary to employ high-throughput methodologies such as Next Generation Sequencing (NGS, Genome-Wide Association Studies (GWAS, Mass Spectrometry or Microarrays, that are able to investigate a single disease from a broader perspective. A side effect of high-throughput methodologies is the massive amount of data produced for each single experiment, that poses several challenges (e.g., high execution time and required memory to bioinformatic software. Thus a main requirement of modern bioinformatic softwares, is the use of good software engineering methods and efficient programming techniques, able to face those challenges, that include the use of parallel programming and efficient and compact data structures. This paper presents the design and the experimentation of a comprehensive software pipeline, named microPipe, for the preprocessing, annotation and analysis of microarray-based Single Nucleotide Polymorphism (SNP genotyping data. A use case in pharmacogenomics is presented. The main advantages of using microPipe are: the reduction of errors that may happen when trying to make data compatible among different tools; the possibility to analyze in parallel huge datasets; the easy annotation and integration of data. microPipe is available under Creative Commons license, and is freely downloadable for academic and not-for-profit institutions.

  13. Nucleotide excision repair in yeast

    NARCIS (Netherlands)

    Eijk, Patrick van

    2012-01-01

    Nucleotide Excision Repair (NER) is a conserved DNA repair pathway capable of removing a broad spectrum of DNA damage. In human cells a defect in NER leads to the disorder Xeroderma pigmentosum (XP). The yeast Saccharomyces cerevisiae is an excellent model organism to study the mechanism of NER. The

  14. GPU Parallel Bundle Block Adjustment

    Directory of Open Access Journals (Sweden)

    ZHENG Maoteng

    2017-09-01

    Full Text Available To deal with massive data in photogrammetry, we introduce the GPU parallel computing technology. The preconditioned conjugate gradient and inexact Newton method are also applied to decrease the iteration times while solving the normal equation. A brand new workflow of bundle adjustment is developed to utilize GPU parallel computing technology. Our method can avoid the storage and inversion of the big normal matrix, and compute the normal matrix in real time. The proposed method can not only largely decrease the memory requirement of normal matrix, but also largely improve the efficiency of bundle adjustment. It also achieves the same accuracy as the conventional method. Preliminary experiment results show that the bundle adjustment of a dataset with about 4500 images and 9 million image points can be done in only 1.5 minutes while achieving sub-pixel accuracy.

  15. Introduction to massive neutrinos

    International Nuclear Information System (INIS)

    Kayser, B.

    1984-01-01

    We discuss the theoretical ideas which make it natural to expect that neutrinos do indeed have mass. Then we focus on the physical consequences of neutrino mass, including neutrino oscillation and other phenomena whose observation would be very interesting, and would serve to demonstrate that neutrinos are indeed massive. We comment on the legitimacy of comparing results from different types of experiments. Finally, we consider the question of whether neutrinos are their own antiparticles. We explain what this question means, discuss the nature of a neutrino which is its own antiparticles, and consider how one might determine experimentally whether neutrinos are their own antiparticles or not

  16. Phases of massive gravity

    CERN Document Server

    Dubovsky, S L

    2004-01-01

    We systematically study the most general Lorentz-violating graviton mass invariant under three-dimensional Eucledian group using the explicitly covariant language. We find that at general values of mass parameters the massive graviton has six propagating degrees of freedom, and some of them are ghosts or lead to rapid classical instabilities. However, there is a number of different regions in the mass parameter space where massive gravity can be described by a consistent low-energy effective theory with cutoff $\\sim\\sqrt{mM_{Pl}}$ free of rapid instabilities and vDVZ discontinuity. Each of these regions is characterized by certain fine-tuning relations between mass parameters, generalizing the Fierz--Pauli condition. In some cases the required fine-tunings are consequences of the existence of the subgroups of the diffeomorphism group that are left unbroken by the graviton mass. We found two new cases, when the resulting theories have a property of UV insensitivity, i.e. remain well behaved after inclusion of ...

  17. Minimal massive 3D gravity

    International Nuclear Information System (INIS)

    Bergshoeff, Eric; Merbis, Wout; Hohm, Olaf; Routh, Alasdair J; Townsend, Paul K

    2014-01-01

    We present an alternative to topologically massive gravity (TMG) with the same ‘minimal’ bulk properties; i.e. a single local degree of freedom that is realized as a massive graviton in linearization about an anti-de Sitter (AdS) vacuum. However, in contrast to TMG, the new ‘minimal massive gravity’ has both a positive energy graviton and positive central charges for the asymptotic AdS-boundary conformal algebra. (paper)

  18. Massive Galileon positivity bounds

    Science.gov (United States)

    de Rham, Claudia; Melville, Scott; Tolley, Andrew J.; Zhou, Shuang-Yong

    2017-09-01

    The EFT coefficients in any gapped, scalar, Lorentz invariant field theory must satisfy positivity requirements if there is to exist a local, analytic Wilsonian UV completion. We apply these bounds to the tree level scattering amplitudes for a massive Galileon. The addition of a mass term, which does not spoil the non-renormalization theorem of the Galileon and preserves the Galileon symmetry at loop level, is necessary to satisfy the lowest order positivity bound. We further show that a careful choice of successively higher derivative corrections are necessary to satisfy the higher order positivity bounds. There is then no obstruction to a local UV completion from considerations of tree level 2-to-2 scattering alone. To demonstrate this we give an explicit example of such a UV completion.

  19. A visualization of null geodesics for the bonnor massive dipole

    Directory of Open Access Journals (Sweden)

    G. Andree Oliva Mercado

    2015-08-01

    Full Text Available In this work we simulate null geodesics for the Bonnor massive dipole metric by implementing a symbolic-numerical algorithm in Sage and Python. This program is also capable of visualizing in 3D, in principle, the geodesics for any given metric. Geodesics are launched from a common point, collectively forming a cone of light beams, simulating a solid-angle section of a point source in front of a massive object with a magnetic field. Parallel light beams also were considered, and their bending due to the curvature of the space-time was simulated.

  20. Lemon : An MPI parallel I/O library for data encapsulation using LIME

    NARCIS (Netherlands)

    Deuzeman, Albert; Reker, Siebren; Urbach, Carsten

    We introduce Lemon, an MPI parallel I/O library that provides efficient parallel I/O of both binary and metadata on massively parallel architectures. Motivated by the demands of the lattice Quantum Chromodynamics community, the data is stored in the SciDAC Lattice QCD Interchange Message

  1. Parallel adaptive simulations on unstructured meshes

    International Nuclear Information System (INIS)

    Shephard, M S; Jansen, K E; Sahni, O; Diachin, L A

    2007-01-01

    This paper discusses methods being developed by the ITAPS center to support the execution of parallel adaptive simulations on unstructured meshes. The paper first outlines the ITAPS approach to the development of interoperable mesh, geometry and field services to support the needs of SciDAC application in these areas. The paper then demonstrates the ability of unstructured adaptive meshing methods built on such interoperable services to effectively solve important physics problems. Attention is then focused on ITAPs' developing ability to solve adaptive unstructured mesh problems on massively parallel computers

  2. Parallel preconditioning techniques for sparse CG solvers

    Energy Technology Data Exchange (ETDEWEB)

    Basermann, A.; Reichel, B.; Schelthoff, C. [Central Institute for Applied Mathematics, Juelich (Germany)

    1996-12-31

    Conjugate gradient (CG) methods to solve sparse systems of linear equations play an important role in numerical methods for solving discretized partial differential equations. The large size and the condition of many technical or physical applications in this area result in the need for efficient parallelization and preconditioning techniques of the CG method. In particular for very ill-conditioned matrices, sophisticated preconditioner are necessary to obtain both acceptable convergence and accuracy of CG. Here, we investigate variants of polynomial and incomplete Cholesky preconditioners that markedly reduce the iterations of the simply diagonally scaled CG and are shown to be well suited for massively parallel machines.

  3. Single nucleotide polymorphism isolated from a novel EST dataset in garden asparagus (Asparagus officinalis L.).

    Science.gov (United States)

    Mercati, Francesco; Riccardi, Paolo; Leebens-Mack, Jim; Abenavoli, Maria Rosa; Falavigna, Agostino; Sunseri, Francesco

    2013-04-01

    Single nucleotide polymorphisms (SNPs) and simple sequence repeats (SSR) are abundant and evenly distributed co-dominant molecular markers in plant genomes. SSRs are valuable for marker assisted breeding and positional cloning of genes associated traits of interest. Although several high throughput platforms have been developed to identify SNP and SSR markers for analysis of segregant plant populations, breeding in garden asparagus (Asparagus officinalis L.) has been limited by a low content of such markers. In this study massively parallel GS-FLX pyro-sequencing technology (454 Life Sciences) has been used to sequence and compare transcriptome from two genotypes: a rust tolerant male (1770) and a susceptible female (G190). A total of 122,963 and 99,368 sequence reads, with an average length of 245.7bp, have been recovered from accessions 1770 and 190 respectively. A computational pipeline has been used to predict and visually inspect putative SNPs and SSR sequences. Analysis of Gene Ontology (GO) slim annotation assignments for all assembled uniscripts indicated that the 24,403 assemblies represent genes from a broad array of functions. Further, over 1800 putative SNPs and 1000 SSRs were detected. One hundred forty-four SNPs together with 60 selected SSRs were validated and used to develop a preliminary genetic map by using a large BC(1) population, derived from 1770 and G190. The abundance of SNPs and SSRs provides a foundation for the development of saturated genetic maps and their utilization in assisted asparagus breeding programs. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.

  4. Research in Parallel Algorithms and Software for Computational Aerosciences

    Science.gov (United States)

    Domel, Neal D.

    1996-01-01

    Phase 1 is complete for the development of a computational fluid dynamics CFD) parallel code with automatic grid generation and adaptation for the Euler analysis of flow over complex geometries. SPLITFLOW, an unstructured Cartesian grid code developed at Lockheed Martin Tactical Aircraft Systems, has been modified for a distributed memory/massively parallel computing environment. The parallel code is operational on an SGI network, Cray J90 and C90 vector machines, SGI Power Challenge, and Cray T3D and IBM SP2 massively parallel machines. Parallel Virtual Machine (PVM) is the message passing protocol for portability to various architectures. A domain decomposition technique was developed which enforces dynamic load balancing to improve solution speed and memory requirements. A host/node algorithm distributes the tasks. The solver parallelizes very well, and scales with the number of processors. Partially parallelized and non-parallelized tasks consume most of the wall clock time in a very fine grain environment. Timing comparisons on a Cray C90 demonstrate that Parallel SPLITFLOW runs 2.4 times faster on 8 processors than its non-parallel counterpart autotasked over 8 processors.

  5. Optimisation of a parallel ocean general circulation model

    Science.gov (United States)

    Beare, M. I.; Stevens, D. P.

    1997-10-01

    This paper presents the development of a general-purpose parallel ocean circulation model, for use on a wide range of computer platforms, from traditional scalar machines to workstation clusters and massively parallel processors. Parallelism is provided, as a modular option, via high-level message-passing routines, thus hiding the technical intricacies from the user. An initial implementation highlights that the parallel efficiency of the model is adversely affected by a number of factors, for which optimisations are discussed and implemented. The resulting ocean code is portable and, in particular, allows science to be achieved on local workstations that could otherwise only be undertaken on state-of-the-art supercomputers.

  6. Parallel Programming with Intel Parallel Studio XE

    CERN Document Server

    Blair-Chappell , Stephen

    2012-01-01

    Optimize code for multi-core processors with Intel's Parallel Studio Parallel programming is rapidly becoming a "must-know" skill for developers. Yet, where to start? This teach-yourself tutorial is an ideal starting point for developers who already know Windows C and C++ and are eager to add parallelism to their code. With a focus on applying tools, techniques, and language extensions to implement parallelism, this essential resource teaches you how to write programs for multicore and leverage the power of multicore in your programs. Sharing hands-on case studies and real-world examples, the

  7. AMIDST: Analysis of MassIve Data STreams

    DEFF Research Database (Denmark)

    Masegosa, Andres; Martinez, Ana Maria; Borchani, Hanen

    2015-01-01

    The Analysis of MassIve Data STreams (AMIDST) Java toolbox provides a collection of scalable and parallel algorithms for inference and learning of hybrid Bayesian networks from data streams. The toolbox, available at http://amidst.github.io/toolbox/ under the Apache Software License version 2.......0, also efficiently leverages existing functionalities and algorithms by interfacing to software tools such as HUGIN and MOA....

  8. The Destructive Birth of Massive Stars and Massive Star Clusters

    Science.gov (United States)

    Rosen, Anna; Krumholz, Mark; McKee, Christopher F.; Klein, Richard I.; Ramirez-Ruiz, Enrico

    2017-01-01

    Massive stars play an essential role in the Universe. They are rare, yet the energy and momentum they inject into the interstellar medium with their intense radiation fields dwarfs the contribution by their vastly more numerous low-mass cousins. Previous theoretical and observational studies have concluded that the feedback associated with massive stars' radiation fields is the dominant mechanism regulating massive star and massive star cluster (MSC) formation. Therefore detailed simulation of the formation of massive stars and MSCs, which host hundreds to thousands of massive stars, requires an accurate treatment of radiation. For this purpose, we have developed a new, highly accurate hybrid radiation algorithm that properly treats the absorption of the direct radiation field from stars and the re-emission and processing by interstellar dust. We use our new tool to perform a suite of three-dimensional radiation-hydrodynamic simulations of the formation of massive stars and MSCs. For individual massive stellar systems, we simulate the collapse of massive pre-stellar cores with laminar and turbulent initial conditions and properly resolve regions where we expect instabilities to grow. We find that mass is channeled to the massive stellar system via gravitational and Rayleigh-Taylor (RT) instabilities. For laminar initial conditions, proper treatment of the direct radiation field produces later onset of RT instability, but does not suppress it entirely provided the edges of the radiation-dominated bubbles are adequately resolved. RT instabilities arise immediately for turbulent pre-stellar cores because the initial turbulence seeds the instabilities. To model MSC formation, we simulate the collapse of a dense, turbulent, magnetized Mcl = 106 M⊙ molecular cloud. We find that the influence of the magnetic pressure and radiative feedback slows down star formation. Furthermore, we find that star formation is suppressed along dense filaments where the magnetic field is

  9. Parallel computing of a climate model on the dawn 1000 by domain decomposition method

    Science.gov (United States)

    Bi, Xunqiang

    1997-12-01

    In this paper the parallel computing of a grid-point nine-level atmospheric general circulation model on the Dawn 1000 is introduced. The model was developed by the Institute of Atmospheric Physics (IAP), Chinese Academy of Sciences (CAS). The Dawn 1000 is a MIMD massive parallel computer made by National Research Center for Intelligent Computer (NCIC), CAS. A two-dimensional domain decomposition method is adopted to perform the parallel computing. The potential ways to increase the speed-up ratio and exploit more resources of future massively parallel supercomputation are also discussed.

  10. Parallel sorting algorithms

    CERN Document Server

    Akl, Selim G

    1985-01-01

    Parallel Sorting Algorithms explains how to use parallel algorithms to sort a sequence of items on a variety of parallel computers. The book reviews the sorting problem, the parallel models of computation, parallel algorithms, and the lower bounds on the parallel sorting problems. The text also presents twenty different algorithms, such as linear arrays, mesh-connected computers, cube-connected computers. Another example where algorithm can be applied is on the shared-memory SIMD (single instruction stream multiple data stream) computers in which the whole sequence to be sorted can fit in the

  11. Massive gravity from bimetric gravity

    International Nuclear Information System (INIS)

    Baccetti, Valentina; Martín-Moruno, Prado; Visser, Matt

    2013-01-01

    We discuss the subtle relationship between massive gravity and bimetric gravity, focusing particularly on the manner in which massive gravity may be viewed as a suitable limit of bimetric gravity. The limiting procedure is more delicate than currently appreciated. Specifically, this limiting procedure should not unnecessarily constrain the background metric, which must be externally specified by the theory of massive gravity itself. The fact that in bimetric theories one always has two sets of metric equations of motion continues to have an effect even in the massive gravity limit, leading to additional constraints besides the one set of equations of motion naively expected. Thus, since solutions of bimetric gravity in the limit of vanishing kinetic term are also solutions of massive gravity, but the contrary statement is not necessarily true, there is no complete continuity in the parameter space of the theory. In particular, we study the massive cosmological solutions which are continuous in the parameter space, showing that many interesting cosmologies belong to this class. (paper)

  12. Nucleotide Selectivity in Abiotic RNA Polymerization Reactions

    Science.gov (United States)

    Coari, Kristin M.; Martin, Rebecca C.; Jain, Kopal; McGown, Linda B.

    2017-09-01

    In order to establish an RNA world on early Earth, the nucleotides must form polymers through chemical rather than biochemical reactions. The polymerization products must be long enough to perform catalytic functions, including self-replication, and to preserve genetic information. These functions depend not only on the length of the polymers, but also on their sequences. To date, studies of abiotic RNA polymerization generally have focused on routes to polymerization of a single nucleotide and lengths of the homopolymer products. Less work has been done the selectivity of the reaction toward incorporation of some nucleotides over others in nucleotide mixtures. Such information is an essential step toward understanding the chemical evolution of RNA. To address this question, in the present work RNA polymerization reactions were performed in the presence of montmorillonite clay catalyst. The nucleotides included the monophosphates of adenosine, cytosine, guanosine, uridine and inosine. Experiments included reactions of mixtures of an imidazole-activated nucleotide (ImpX) with one or more unactivated nucleotides (XMP), of two or more ImpX, and of XMP that were activated in situ in the polymerization reaction itself. The reaction products were analyzed using matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) to identify the lengths and nucleotide compositions of the polymerization products. The results show that the extent of polymerization, the degree of heteropolymerization vs. homopolymerization, and the composition of the polymeric products all vary among the different nucleotides and depend upon which nucleotides and how many different nucleotides are present in the mixture.

  13. Nucleotide Selectivity in Abiotic RNA Polymerization Reactions.

    Science.gov (United States)

    Coari, Kristin M; Martin, Rebecca C; Jain, Kopal; McGown, Linda B

    2017-09-01

    In order to establish an RNA world on early Earth, the nucleotides must form polymers through chemical rather than biochemical reactions. The polymerization products must be long enough to perform catalytic functions, including self-replication, and to preserve genetic information. These functions depend not only on the length of the polymers, but also on their sequences. To date, studies of abiotic RNA polymerization generally have focused on routes to polymerization of a single nucleotide and lengths of the homopolymer products. Less work has been done the selectivity of the reaction toward incorporation of some nucleotides over others in nucleotide mixtures. Such information is an essential step toward understanding the chemical evolution of RNA. To address this question, in the present work RNA polymerization reactions were performed in the presence of montmorillonite clay catalyst. The nucleotides included the monophosphates of adenosine, cytosine, guanosine, uridine and inosine. Experiments included reactions of mixtures of an imidazole-activated nucleotide (ImpX) with one or more unactivated nucleotides (XMP), of two or more ImpX, and of XMP that were activated in situ in the polymerization reaction itself. The reaction products were analyzed using matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) to identify the lengths and nucleotide compositions of the polymerization products. The results show that the extent of polymerization, the degree of heteropolymerization vs. homopolymerization, and the composition of the polymeric products all vary among the different nucleotides and depend upon which nucleotides and how many different nucleotides are present in the mixture.

  14. Introduction to parallel programming

    CERN Document Server

    Brawer, Steven

    1989-01-01

    Introduction to Parallel Programming focuses on the techniques, processes, methodologies, and approaches involved in parallel programming. The book first offers information on Fortran, hardware and operating system models, and processes, shared memory, and simple parallel programs. Discussions focus on processes and processors, joining processes, shared memory, time-sharing with multiple processors, hardware, loops, passing arguments in function/subroutine calls, program structure, and arithmetic expressions. The text then elaborates on basic parallel programming techniques, barriers and race

  15. Nucleotide sequence preservation of human mitochondrial DNA

    International Nuclear Information System (INIS)

    Monnat, R.J. Jr.; Loeb, L.A.

    1985-01-01

    Recombinant DNA techniques have been used to quantitate the amount of nucleotide sequence divergence in the mitochondrial DNA population of individual normal humans. Mitochondrial DNA was isolated from the peripheral blood lymphocytes of five normal humans and cloned in M13 mp11; 49 kilobases of nucleotide sequence information was obtained from 248 independently isolated clones from the five normal donors. Both between- and within-individual differences were identified. Between-individual differences were identified in approximately = to 1/200 nucleotides. In contrast, only one within-individual difference was identified in 49 kilobases of nucleotide sequence information. This high degree of mitochondrial nucleotide sequence homogeneity in human somatic cells is in marked contrast to the rapid evolutionary divergence of human mitochondrial DNA and suggests the existence of mechanisms for the concerted preservation of mammalian mitochondrial DNA sequences in single organisms

  16. Parallel computing by Monte Carlo codes MVP/GMVP

    International Nuclear Information System (INIS)

    Nagaya, Yasunobu; Nakagawa, Masayuki; Mori, Takamasa

    2001-01-01

    General-purpose Monte Carlo codes MVP/GMVP are well-vectorized and thus enable us to perform high-speed Monte Carlo calculations. In order to achieve more speedups, we parallelized the codes on the different types of parallel computing platforms or by using a standard parallelization library MPI. The platforms used for benchmark calculations are a distributed-memory vector-parallel computer Fujitsu VPP500, a distributed-memory massively parallel computer Intel paragon and a distributed-memory scalar-parallel computer Hitachi SR2201, IBM SP2. As mentioned generally, linear speedup could be obtained for large-scale problems but parallelization efficiency decreased as the batch size per a processing element(PE) was smaller. It was also found that the statistical uncertainty for assembly powers was less than 0.1% by the PWR full-core calculation with more than 10 million histories and it took about 1.5 hours by massively parallel computing. (author)

  17. Holographically viable extensions of topologically massive and minimal massive gravity?

    Science.gov (United States)

    Altas, Emel; Tekin, Bayram

    2016-01-01

    Recently [E. Bergshoeff et al., Classical Quantum Gravity 31, 145008 (2014)], an extension of the topologically massive gravity (TMG) in 2 +1 dimensions, dubbed as minimal massive gravity (MMG), which is free of the bulk-boundary unitarity clash that inflicts the former theory and all the other known three-dimensional theories, was found. Field equations of MMG differ from those of TMG at quadratic terms in the curvature that do not come from the variation of an action depending on the metric alone. Here we show that MMG is a unique theory and there does not exist a deformation of TMG or MMG at the cubic and quartic order (and beyond) in the curvature that is consistent at the level of the field equations. The only extension of TMG with the desired bulk and boundary properties having a single massive degree of freedom is MMG.

  18. Massive Submucosal Ganglia in Colonic Inertia.

    Science.gov (United States)

    Naemi, Kaveh; Stamos, Michael J; Wu, Mark Li-Cheng

    2018-02-01

    - Colonic inertia is a debilitating form of primary chronic constipation with unknown etiology and diagnostic criteria, often requiring pancolectomy. We have occasionally observed massively enlarged submucosal ganglia containing at least 20 perikarya, in addition to previously described giant ganglia with greater than 8 perikarya, in cases of colonic inertia. These massively enlarged ganglia have yet to be formally recognized. - To determine whether such "massive submucosal ganglia," defined as ganglia harboring at least 20 perikarya, characterize colonic inertia. - We retrospectively reviewed specimens from colectomies of patients with colonic inertia and compared the prevalence of massive submucosal ganglia occurring in this setting to the prevalence of massive submucosal ganglia occurring in a set of control specimens from patients lacking chronic constipation. - Seven of 8 specimens affected by colonic inertia harbored 1 to 4 massive ganglia, for a total of 11 massive ganglia. One specimen lacked massive ganglia but had limited sampling and nearly massive ganglia. Massive ganglia occupied both superficial and deep submucosal plexus. The patient with 4 massive ganglia also had 1 mitotically active giant ganglion. Only 1 massive ganglion occupied the entire set of 10 specimens from patients lacking chronic constipation. - We performed the first, albeit distinctly small, study of massive submucosal ganglia and showed that massive ganglia may be linked to colonic inertia. Further, larger studies are necessary to determine whether massive ganglia are pathogenetic or secondary phenomena, and whether massive ganglia or mitotically active ganglia distinguish colonic inertia from other types of chronic constipation.

  19. A Generic Mesh Data Structure with Parallel Applications

    Science.gov (United States)

    Cochran, William Kenneth, Jr.

    2009-01-01

    High performance, massively-parallel multi-physics simulations are built on efficient mesh data structures. Most data structures are designed from the bottom up, focusing on the implementation of linear algebra routines. In this thesis, we explore a top-down approach to design, evaluating the various needs of many aspects of simulation, not just…

  20. Parallel Atomistic Simulations

    Energy Technology Data Exchange (ETDEWEB)

    HEFFELFINGER,GRANT S.

    2000-01-18

    Algorithms developed to enable the use of atomistic molecular simulation methods with parallel computers are reviewed. Methods appropriate for bonded as well as non-bonded (and charged) interactions are included. While strategies for obtaining parallel molecular simulations have been developed for the full variety of atomistic simulation methods, molecular dynamics and Monte Carlo have received the most attention. Three main types of parallel molecular dynamics simulations have been developed, the replicated data decomposition, the spatial decomposition, and the force decomposition. For Monte Carlo simulations, parallel algorithms have been developed which can be divided into two categories, those which require a modified Markov chain and those which do not. Parallel algorithms developed for other simulation methods such as Gibbs ensemble Monte Carlo, grand canonical molecular dynamics, and Monte Carlo methods for protein structure determination are also reviewed and issues such as how to measure parallel efficiency, especially in the case of parallel Monte Carlo algorithms with modified Markov chains are discussed.

  1. Tiling as a Durable Abstraction for Parallelism and Data Locality

    Energy Technology Data Exchange (ETDEWEB)

    Unat, Didem [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Chan, Cy P. [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Zhang, Weiqun [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Bell, John [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Shalf, John [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)

    2013-11-18

    Tiling is a useful loop transformation for expressing parallelism and data locality. Automated tiling transformations that preserve data-locality are increasingly important due to hardware trends towards massive parallelism and the increasing costs of data movement relative to the cost of computing. We propose TiDA as a durable tiling abstraction that centralizes parameterized tiling information within array data types with minimal changes to the source code. The data layout information can be used by the compiler and runtime to automatically manage parallelism, optimize data locality, and schedule tasks intelligently. In this study, we present the design features and early interface of TiDA along with some preliminary results.

  2. Key Technologies in Massive MIMO

    Directory of Open Access Journals (Sweden)

    Hu Qiang

    2018-01-01

    Full Text Available The explosive growth of wireless data traffic in the future fifth generation mobile communication system (5G has led researchers to develop new disruptive technologies. As an extension of traditional MIMO technology, massive MIMO can greatly improve the throughput rate and energy efficiency, and can effectively improve the link reliability and data transmission rate, which is an important research direction of 5G wireless communication. Massive MIMO technology is nearly three years to get a new technology of rapid development and it through a lot of increasing the number of antenna communication, using very duplex communication mode, make the system spectrum efficiency to an unprecedented height.

  3. Hunting for a massive neutrino

    CERN Document Server

    AUTHOR|(CDS)2108802

    1997-01-01

    A great effort is devoted by many groups of physicists all over the world to give an answer to the following question: Is the neutrino massive ? This question has profound implications with particle physics, astrophysics and cosmology, in relation to the so-called Dark Matter puzzle. The neutrino oscillation process, in particular, can only occur if the neutrino is massive. An overview of the neutrino mass measurements, of the oscillation formalism and experiments will be given, also in connection with the present experimental programme at CERN with the two experiments CHORUS and NOMAD.

  4. Parallelization in Modern C++

    CERN Multimedia

    CERN. Geneva

    2016-01-01

    The traditionally used and well established parallel programming models OpenMP and MPI are both targeting lower level parallelism and are meant to be as language agnostic as possible. For a long time, those models were the only widely available portable options for developing parallel C++ applications beyond using plain threads. This has strongly limited the optimization capabilities of compilers, has inhibited extensibility and genericity, and has restricted the use of those models together with other, modern higher level abstractions introduced by the C++11 and C++14 standards. The recent revival of interest in the industry and wider community for the C++ language has also spurred a remarkable amount of standardization proposals and technical specifications being developed. Those efforts however have so far failed to build a vision on how to seamlessly integrate various types of parallelism, such as iterative parallel execution, task-based parallelism, asynchronous many-task execution flows, continuation s...

  5. Parallelism in matrix computations

    CERN Document Server

    Gallopoulos, Efstratios; Sameh, Ahmed H

    2016-01-01

    This book is primarily intended as a research monograph that could also be used in graduate courses for the design of parallel algorithms in matrix computations. It assumes general but not extensive knowledge of numerical linear algebra, parallel architectures, and parallel programming paradigms. The book consists of four parts: (I) Basics; (II) Dense and Special Matrix Computations; (III) Sparse Matrix Computations; and (IV) Matrix functions and characteristics. Part I deals with parallel programming paradigms and fundamental kernels, including reordering schemes for sparse matrices. Part II is devoted to dense matrix computations such as parallel algorithms for solving linear systems, linear least squares, the symmetric algebraic eigenvalue problem, and the singular-value decomposition. It also deals with the development of parallel algorithms for special linear systems such as banded ,Vandermonde ,Toeplitz ,and block Toeplitz systems. Part III addresses sparse matrix computations: (a) the development of pa...

  6. Massive Neurofibroma of the Breast

    African Journals Online (AJOL)

    Valued eMachines Customer

    Neurofibromas are benign nerve sheath tumors that are extremely rare in the breast. We report a massive ... plexiform breast neurofibromas may transform into a malignant peripheral nerve sheath tumor1. We present a case .... Breast neurofibroma. http://www.breast-cancer.ca/type/breast-neurofibroma.htm. August 2011. 2.

  7. Cleaning Massive Sonar Point Clouds

    DEFF Research Database (Denmark)

    Arge, Lars Allan; Larsen, Kasper Green; Mølhave, Thomas

    2010-01-01

    We consider the problem of automatically cleaning massive sonar data point clouds, that is, the problem of automatically removing noisy points that for example appear as a result of scans of (shoals of) fish, multiple reflections, scanner self-reflections, refraction in gas bubbles, and so on. We...

  8. Topologically Massive Higher Spin Gravity

    NARCIS (Netherlands)

    Bagchi, A.; Lal, S.; Saha, A.; Sahoo, B.

    2011-01-01

    We look at the generalisation of topologically massive gravity (TMG) to higher spins, specifically spin-3. We find a special "chiral" point for the spin-three, analogous to the spin-two example, which actually coincides with the usual spin-two chiral point. But in contrast to usual TMG, there is the

  9. Supernovae from massive AGB stars

    NARCIS (Netherlands)

    Poelarends, A.J.T.; Izzard, R.G.; Herwig, F.; Langer, N.; Heger, A.

    2006-01-01

    We present new computations of the final fate of massive AGB-stars. These stars form ONeMg cores after a phase of carbon burning and are called Super AGB stars (SAGB). Detailed stellar evolutionary models until the thermally pulsing AGB were computed using three di erent stellar evolution codes. The

  10. Nucleotide diversity and phylogenetic relationships among ...

    Indian Academy of Sciences (India)

    2017-03-03

    Mar 3, 2017 ... 2Department of Botany, D. S. B. Campus, Kumaun University, Nainital 263 001, India ... Rana T. S. 2017 Nucleotide diversity and phylogenetic relationships ... Anderson and Park 1989). ..... Edgewood Press, Edgewood, USA.

  11. Nucleotide excision repair in the test tube.

    NARCIS (Netherlands)

    N.G.J. Jaspers (Nicolaas); J.H.J. Hoeijmakers (Jan)

    1995-01-01

    textabstractThe eukaryotic nucleotide excision-repair pathway has been reconstituted in vitro, an achievement that should hasten the full enzymological characterization of this highly complex DNA-repair pathway.

  12. Parallelized Seeded Region Growing Using CUDA

    Directory of Open Access Journals (Sweden)

    Seongjin Park

    2014-01-01

    Full Text Available This paper presents a novel method for parallelizing the seeded region growing (SRG algorithm using Compute Unified Device Architecture (CUDA technology, with intention to overcome the theoretical weakness of SRG algorithm of its computation time being directly proportional to the size of a segmented region. The segmentation performance of the proposed CUDA-based SRG is compared with SRG implementations on single-core CPUs, quad-core CPUs, and shader language programming, using synthetic datasets and 20 body CT scans. Based on the experimental results, the CUDA-based SRG outperforms the other three implementations, advocating that it can substantially assist the segmentation during massive CT screening tests.

  13. An efficient implementation of a backpropagation learning algorithm on quadrics parallel supercomputer

    International Nuclear Information System (INIS)

    Taraglio, S.; Massaioli, F.

    1995-08-01

    A parallel implementation of a library to build and train Multi Layer Perceptrons via the Back Propagation algorithm is presented. The target machine is the SIMD massively parallel supercomputer Quadrics. Performance measures are provided on three different machines with different number of processors, for two network examples. A sample source code is given

  14. A parallel buffer tree

    DEFF Research Database (Denmark)

    Sitchinava, Nodar; Zeh, Norbert

    2012-01-01

    We present the parallel buffer tree, a parallel external memory (PEM) data structure for batched search problems. This data structure is a non-trivial extension of Arge's sequential buffer tree to a private-cache multiprocessor environment and reduces the number of I/O operations by the number of...... in the optimal OhOf(psortN + K/PB) parallel I/O complexity, where K is the size of the output reported in the process and psortN is the parallel I/O complexity of sorting N elements using P processors....

  15. Parallel MR imaging.

    Science.gov (United States)

    Deshmane, Anagha; Gulani, Vikas; Griswold, Mark A; Seiberlich, Nicole

    2012-07-01

    Parallel imaging is a robust method for accelerating the acquisition of magnetic resonance imaging (MRI) data, and has made possible many new applications of MR imaging. Parallel imaging works by acquiring a reduced amount of k-space data with an array of receiver coils. These undersampled data can be acquired more quickly, but the undersampling leads to aliased images. One of several parallel imaging algorithms can then be used to reconstruct artifact-free images from either the aliased images (SENSE-type reconstruction) or from the undersampled data (GRAPPA-type reconstruction). The advantages of parallel imaging in a clinical setting include faster image acquisition, which can be used, for instance, to shorten breath-hold times resulting in fewer motion-corrupted examinations. In this article the basic concepts behind parallel imaging are introduced. The relationship between undersampling and aliasing is discussed and two commonly used parallel imaging methods, SENSE and GRAPPA, are explained in detail. Examples of artifacts arising from parallel imaging are shown and ways to detect and mitigate these artifacts are described. Finally, several current applications of parallel imaging are presented and recent advancements and promising research in parallel imaging are briefly reviewed. Copyright © 2012 Wiley Periodicals, Inc.

  16. Parallel Algorithms and Patterns

    Energy Technology Data Exchange (ETDEWEB)

    Robey, Robert W. [Los Alamos National Lab. (LANL), Los Alamos, NM (United States)

    2016-06-16

    This is a powerpoint presentation on parallel algorithms and patterns. A parallel algorithm is a well-defined, step-by-step computational procedure that emphasizes concurrency to solve a problem. Examples of problems include: Sorting, searching, optimization, matrix operations. A parallel pattern is a computational step in a sequence of independent, potentially concurrent operations that occurs in diverse scenarios with some frequency. Examples are: Reductions, prefix scans, ghost cell updates. We only touch on parallel patterns in this presentation. It really deserves its own detailed discussion which Gabe Rockefeller would like to develop.

  17. Application Portable Parallel Library

    Science.gov (United States)

    Cole, Gary L.; Blech, Richard A.; Quealy, Angela; Townsend, Scott

    1995-01-01

    Application Portable Parallel Library (APPL) computer program is subroutine-based message-passing software library intended to provide consistent interface to variety of multiprocessor computers on market today. Minimizes effort needed to move application program from one computer to another. User develops application program once and then easily moves application program from parallel computer on which created to another parallel computer. ("Parallel computer" also include heterogeneous collection of networked computers). Written in C language with one FORTRAN 77 subroutine for UNIX-based computers and callable from application programs written in C language or FORTRAN 77.

  18. A task parallel implementation of fast multipole methods

    KAUST Repository

    Taura, Kenjiro

    2012-11-01

    This paper describes a task parallel implementation of ExaFMM, an open source implementation of fast multipole methods (FMM), using a lightweight task parallel library MassiveThreads. Although there have been many attempts on parallelizing FMM, experiences have almost exclusively been limited to formulation based on flat homogeneous parallel loops. FMM in fact contains operations that cannot be readily expressed in such conventional but restrictive models. We show that task parallelism, or parallel recursions in particular, allows us to parallelize all operations of FMM naturally and scalably. Moreover it allows us to parallelize a \\'\\'mutual interaction\\'\\' for force/potential evaluation, which is roughly twice as efficient as a more conventional, unidirectional force/potential evaluation. The net result is an open source FMM that is clearly among the fastest single node implementations, including those on GPUs; with a million particles on a 32 cores Sandy Bridge 2.20GHz node, it completes a single time step including tree construction and force/potential evaluation in 65 milliseconds. The study clearly showcases both programmability and performance benefits of flexible parallel constructs over more monolithic parallel loops. © 2012 IEEE.

  19. Parallel Evolutionary Optimization for Neuromorphic Network Training

    Energy Technology Data Exchange (ETDEWEB)

    Schuman, Catherine D [ORNL; Disney, Adam [University of Tennessee (UT); Singh, Susheela [North Carolina State University (NCSU), Raleigh; Bruer, Grant [University of Tennessee (UT); Mitchell, John Parker [University of Tennessee (UT); Klibisz, Aleksander [University of Tennessee (UT); Plank, James [University of Tennessee (UT)

    2016-01-01

    One of the key impediments to the success of current neuromorphic computing architectures is the issue of how best to program them. Evolutionary optimization (EO) is one promising programming technique; in particular, its wide applicability makes it especially attractive for neuromorphic architectures, which can have many different characteristics. In this paper, we explore different facets of EO on a spiking neuromorphic computing model called DANNA. We focus on the performance of EO in the design of our DANNA simulator, and on how to structure EO on both multicore and massively parallel computing systems. We evaluate how our parallel methods impact the performance of EO on Titan, the U.S.'s largest open science supercomputer, and BOB, a Beowulf-style cluster of Raspberry Pi's. We also focus on how to improve the EO by evaluating commonality in higher performing neural networks, and present the result of a study that evaluates the EO performed by Titan.

  20. Computation and parallel implementation for early vision

    Science.gov (United States)

    Gualtieri, J. Anthony

    1990-01-01

    The problem of early vision is to transform one or more retinal illuminance images-pixel arrays-to image representations built out of such primitive visual features such as edges, regions, disparities, and clusters. These transformed representations form the input to later vision stages that perform higher level vision tasks including matching and recognition. Researchers developed algorithms for: (1) edge finding in the scale space formulation; (2) correlation methods for computing matches between pairs of images; and (3) clustering of data by neural networks. These algorithms are formulated for parallel implementation of SIMD machines, such as the Massively Parallel Processor, a 128 x 128 array processor with 1024 bits of local memory per processor. For some cases, researchers can show speedups of three orders of magnitude over serial implementations.

  1. Massive lepton pair production in massive quantum electrodynamics

    International Nuclear Information System (INIS)

    Raychaudhuri, P.

    1976-01-01

    The pp → l + +l - +x inclusive interaction has been studied at high energies in terms of the massive quantum electrodynamics. The differential cross-section (dsigma/dQ 2 ) is derived and proves to be proportional to Q -4 , where Q-mass of the lepton pair. Basic features of the cross-section are demonstrated to be consistent with the Drell-Yan model

  2. MassiveNuS: cosmological massive neutrino simulations

    Science.gov (United States)

    Liu, Jia; Bird, Simeon; Zorrilla Matilla, José Manuel; Hill, J. Colin; Haiman, Zoltán; Madhavacheril, Mathew S.; Petri, Andrea; Spergel, David N.

    2018-03-01

    The non-zero mass of neutrinos suppresses the growth of cosmic structure on small scales. Since the level of suppression depends on the sum of the masses of the three active neutrino species, the evolution of large-scale structure is a promising tool to constrain the total mass of neutrinos and possibly shed light on the mass hierarchy. In this work, we investigate these effects via a large suite of N-body simulations that include massive neutrinos using an analytic linear-response approximation: the Cosmological Massive Neutrino Simulations (MassiveNuS). The simulations include the effects of radiation on the background expansion, as well as the clustering of neutrinos in response to the nonlinear dark matter evolution. We allow three cosmological parameters to vary: the neutrino mass sum Mν in the range of 0–0.6 eV, the total matter density Ωm, and the primordial power spectrum amplitude As. The rms density fluctuation in spheres of 8 comoving Mpc/h (σ8) is a derived parameter as a result. Our data products include N-body snapshots, halo catalogues, merger trees, ray-traced galaxy lensing convergence maps for four source redshift planes between zs=1–2.5, and ray-traced cosmic microwave background lensing convergence maps. We describe the simulation procedures and code validation in this paper. The data are publicly available at http://columbialensing.org.

  3. Spacetime structure of massive Majorana particles and massive gravitino

    Energy Technology Data Exchange (ETDEWEB)

    Ahluwalia, D.V.; Kirchbach, M. [Theoretical Physics Group, Facultad de Fisica, Universidad Autonoma de Zacatecas, A.P. 600, 98062 Zacatecas (Mexico)

    2003-07-01

    The profound difference between Dirac and Majorana particles is traced back to the possibility of having physically different constructs in the (1/2, 0) 0 (0,1/2) representation space. Contrary to Dirac particles, Majorana-particle propagators are shown to differ from the simple linear {gamma} {mu} p{sub {mu}}, structure. Furthermore, neither Majorana particles, nor their antiparticles can be associated with a well defined arrow of time. The inevitable consequence of this peculiarity is the particle-antiparticle metamorphosis giving rise to neutrinoless double beta decay, on the one side, and enabling spin-1/2 fields to act as gauge fields, gauginos, on the other side. The second part of the lecture notes is devoted to massive gravitino. We argue that a spin measurement in the rest frame for an unpolarized ensemble of massive gravitino, associated with the spinor-vector [(1/2, 0) 0 (0,1/2)] 0 (1/2,1/2) representation space, would yield the results 3/2 with probability one half, and 1/2 with probability one half. The latter is distributed uniformly, i.e. as 1/4, among the two spin-1/2+ and spin-1/2- states of opposite parities. From that we draw the conclusion that the massive gravitino should be interpreted as a particle of multiple spin. (Author)

  4. Parallel discrete event simulation

    NARCIS (Netherlands)

    Overeinder, B.J.; Hertzberger, L.O.; Sloot, P.M.A.; Withagen, W.J.

    1991-01-01

    In simulating applications for execution on specific computing systems, the simulation performance figures must be known in a short period of time. One basic approach to the problem of reducing the required simulation time is the exploitation of parallelism. However, in parallelizing the simulation

  5. Parallel reservoir simulator computations

    International Nuclear Information System (INIS)

    Hemanth-Kumar, K.; Young, L.C.

    1995-01-01

    The adaptation of a reservoir simulator for parallel computations is described. The simulator was originally designed for vector processors. It performs approximately 99% of its calculations in vector/parallel mode and relative to scalar calculations it achieves speedups of 65 and 81 for black oil and EOS simulations, respectively on the CRAY C-90

  6. Totally parallel multilevel algorithms

    Science.gov (United States)

    Frederickson, Paul O.

    1988-01-01

    Four totally parallel algorithms for the solution of a sparse linear system have common characteristics which become quite apparent when they are implemented on a highly parallel hypercube such as the CM2. These four algorithms are Parallel Superconvergent Multigrid (PSMG) of Frederickson and McBryan, Robust Multigrid (RMG) of Hackbusch, the FFT based Spectral Algorithm, and Parallel Cyclic Reduction. In fact, all four can be formulated as particular cases of the same totally parallel multilevel algorithm, which are referred to as TPMA. In certain cases the spectral radius of TPMA is zero, and it is recognized to be a direct algorithm. In many other cases the spectral radius, although not zero, is small enough that a single iteration per timestep keeps the local error within the required tolerance.

  7. Minimal theory of massive gravity

    International Nuclear Information System (INIS)

    De Felice, Antonio; Mukohyama, Shinji

    2016-01-01

    We propose a new theory of massive gravity with only two propagating degrees of freedom. While the homogeneous and isotropic background cosmology and the tensor linear perturbations around it are described by exactly the same equations as those in the de Rham–Gabadadze–Tolley (dRGT) massive gravity, the scalar and vector gravitational degrees of freedom are absent in the new theory at the fully nonlinear level. Hence the new theory provides a stable nonlinear completion of the self-accelerating cosmological solution that was originally found in the dRGT theory. The cosmological solution in the other branch, often called the normal branch, is also rendered stable in the new theory and, for the first time, makes it possible to realize an effective equation-of-state parameter different from (either larger or smaller than) −1 without introducing any extra degrees of freedom.

  8. Spin-3 topologically massive gravity

    Energy Technology Data Exchange (ETDEWEB)

    Chen Bin, E-mail: bchen01@pku.edu.cn [Department of Physics, and State Key Laboratory of Nuclear Physics and Technology, Peking University, Beijing 100871 (China); Center for High Energy Physics, Peking University, Beijing 100871 (China); Long Jiang, E-mail: longjiang0301@gmail.com [Department of Physics, and State Key Laboratory of Nuclear Physics and Technology, Peking University, Beijing 100871 (China); Wu Junbao, E-mail: wujb@ihep.ac.cn [Institute of High Energy Physics, and Theoretical Physics Center for Science Facilities, Chinese Academy of Sciences, Beijing 100049 (China)

    2011-11-24

    In this Letter, we study the spin-3 topologically massive gravity (TMG), paying special attention to its properties at the chiral point. We propose an action describing the higher spin fields coupled to TMG. We discuss the traceless spin-3 fluctuations around the AdS{sub 3} vacuum and find that there is an extra local massive mode, besides the left-moving and right-moving boundary massless modes. At the chiral point, such extra mode becomes massless and degenerates with the left-moving mode. We show that at the chiral point the only degrees of freedom in the theory are the boundary right-moving graviton and spin-3 field. We conjecture that spin-3 chiral gravity with generalized Brown-Henneaux boundary condition is holographically dual to 2D chiral CFT with classical W{sub 3} algebra and central charge c{sub R}=3l/G.

  9. Minimal theory of massive gravity

    Directory of Open Access Journals (Sweden)

    Antonio De Felice

    2016-01-01

    Full Text Available We propose a new theory of massive gravity with only two propagating degrees of freedom. While the homogeneous and isotropic background cosmology and the tensor linear perturbations around it are described by exactly the same equations as those in the de Rham–Gabadadze–Tolley (dRGT massive gravity, the scalar and vector gravitational degrees of freedom are absent in the new theory at the fully nonlinear level. Hence the new theory provides a stable nonlinear completion of the self-accelerating cosmological solution that was originally found in the dRGT theory. The cosmological solution in the other branch, often called the normal branch, is also rendered stable in the new theory and, for the first time, makes it possible to realize an effective equation-of-state parameter different from (either larger or smaller than −1 without introducing any extra degrees of freedom.

  10. Search of massive star formation with COMICS

    Science.gov (United States)

    Okamoto, Yoshiko K.

    2004-04-01

    Mid-infrared observations is useful for studies of massive star formation. Especially COMICS offers powerful tools: imaging survey of the circumstellar structures of forming massive stars such as massive disks and cavity structures, mass estimate from spectroscopy of fine structure lines, and high dispersion spectroscopy to census gas motion around formed stars. COMICS will open the next generation infrared studies of massive star formation.

  11. The physics of massive neutrinos

    CERN Document Server

    Kayser, Boris; Perrier, Frederic

    1989-01-01

    This book explains the physics and phenomenology of massive neutrinos. The authors argue that neutrino mass is not unlikely and consider briefly the search for evidence of this mass in decay processes before they examine the physics and phenomenology of neutrino oscillation. The physics of Majorana neutrinos (neutrinos which are their own antiparticles) is then discussed. This volume requires of the reader only a knowledge of quantum mechanics and of very elementary quantum field theory.

  12. Molecular dynamics beyonds the limits: Massive scaling on 72 racks of a BlueGene/P and supercooled glass dynamics of a 1 billion particles system

    KAUST Repository

    Allsopp, Nicholas; Ruocco, Giancarlo; Fratalocchi, Andrea

    2012-01-01

    We report scaling results on the world's largest supercomputer of our recently developed Billions-Body Molecular Dynamics (BBMD) package, which was especially designed for massively parallel simulations of the short-range atomic dynamics

  13. A parallel solution for high resolution histological image analysis.

    Science.gov (United States)

    Bueno, G; González, R; Déniz, O; García-Rojo, M; González-García, J; Fernández-Carrobles, M M; Vállez, N; Salido, J

    2012-10-01

    This paper describes a general methodology for developing parallel image processing algorithms based on message passing for high resolution images (on the order of several Gigabytes). These algorithms have been applied to histological images and must be executed on massively parallel processing architectures. Advances in new technologies for complete slide digitalization in pathology have been combined with developments in biomedical informatics. However, the efficient use of these digital slide systems is still a challenge. The image processing that these slides are subject to is still limited both in terms of data processed and processing methods. The work presented here focuses on the need to design and develop parallel image processing tools capable of obtaining and analyzing the entire gamut of information included in digital slides. Tools have been developed to assist pathologists in image analysis and diagnosis, and they cover low and high-level image processing methods applied to histological images. Code portability, reusability and scalability have been tested by using the following parallel computing architectures: distributed memory with massive parallel processors and two networks, INFINIBAND and Myrinet, composed of 17 and 1024 nodes respectively. The parallel framework proposed is flexible, high performance solution and it shows that the efficient processing of digital microscopic images is possible and may offer important benefits to pathology laboratories. Copyright © 2012 Elsevier Ireland Ltd. All rights reserved.

  14. Scalable parallel prefix solvers for discrete ordinates transport

    International Nuclear Information System (INIS)

    Pautz, S.; Pandya, T.; Adams, M.

    2009-01-01

    The well-known 'sweep' algorithm for inverting the streaming-plus-collision term in first-order deterministic radiation transport calculations has some desirable numerical properties. However, it suffers from parallel scaling issues caused by a lack of concurrency. The maximum degree of concurrency, and thus the maximum parallelism, grows more slowly than the problem size for sweeps-based solvers. We investigate a new class of parallel algorithms that involves recasting the streaming-plus-collision problem in prefix form and solving via cyclic reduction. This method, although computationally more expensive at low levels of parallelism than the sweep algorithm, offers better theoretical scalability properties. Previous work has demonstrated this approach for one-dimensional calculations; we show how to extend it to multidimensional calculations. Notably, for multiple dimensions it appears that this approach is limited to long-characteristics discretizations; other discretizations cannot be cast in prefix form. We implement two variants of the algorithm within the radlib/SCEPTRE transport code library at Sandia National Laboratories and show results on two different massively parallel systems. Both the 'forward' and 'symmetric' solvers behave similarly, scaling well to larger degrees of parallelism then sweeps-based solvers. We do observe some issues at the highest levels of parallelism (relative to the system size) and discuss possible causes. We conclude that this approach shows good potential for future parallel systems, but the parallel scalability will depend heavily on the architecture of the communication networks of these systems. (authors)

  15. Mapping robust parallel multigrid algorithms to scalable memory architectures

    Science.gov (United States)

    Overman, Andrea; Vanrosendale, John

    1993-01-01

    The convergence rate of standard multigrid algorithms degenerates on problems with stretched grids or anisotropic operators. The usual cure for this is the use of line or plane relaxation. However, multigrid algorithms based on line and plane relaxation have limited and awkward parallelism and are quite difficult to map effectively to highly parallel architectures. Newer multigrid algorithms that overcome anisotropy through the use of multiple coarse grids rather than relaxation are better suited to massively parallel architectures because they require only simple point-relaxation smoothers. In this paper, we look at the parallel implementation of a V-cycle multiple semicoarsened grid (MSG) algorithm on distributed-memory architectures such as the Intel iPSC/860 and Paragon computers. The MSG algorithms provide two levels of parallelism: parallelism within the relaxation or interpolation on each grid and across the grids on each multigrid level. Both levels of parallelism must be exploited to map these algorithms effectively to parallel architectures. This paper describes a mapping of an MSG algorithm to distributed-memory architectures that demonstrates how both levels of parallelism can be exploited. The result is a robust and effective multigrid algorithm for distributed-memory machines.

  16. A Nucleotide Phosphatase Activity in the Nucleotide Binding Domain of an Orphan Resistance Protein from Rice*

    Science.gov (United States)

    Fenyk, Stepan; de San Eustaquio Campillo, Alba; Pohl, Ehmke; Hussey, Patrick J.; Cann, Martin J.

    2012-01-01

    Plant resistance proteins (R-proteins) are key components of the plant immune system activated in response to a plethora of different pathogens. R-proteins are P-loop NTPase superfamily members, and current models describe their main function as ATPases in defense signaling pathways. Here we show that a subset of R-proteins have evolved a new function to combat pathogen infection. This subset of R-proteins possesses a nucleotide phosphatase activity in the nucleotide-binding domain. Related R-proteins that fall in the same phylogenetic clade all show the same nucleotide phosphatase activity indicating a conserved function within at least a subset of R-proteins. R-protein nucleotide phosphatases catalyze the production of nucleoside from nucleotide with the nucleotide monophosphate as the preferred substrate. Mutation of conserved catalytic residues substantially reduced activity consistent with the biochemistry of P-loop NTPases. Kinetic analysis, analytical gel filtration, and chemical cross-linking demonstrated that the nucleotide-binding domain was active as a multimer. Nuclear magnetic resonance and nucleotide analogues identified the terminal phosphate bond as the target of a reaction that utilized a metal-mediated nucleophilic attack by water on the phosphoester. In conclusion, we have identified a group of R-proteins with a unique function. This biochemical activity appears to have co-evolved with plants in signaling pathways designed to resist pathogen attack. PMID:22157756

  17. A nucleotide phosphatase activity in the nucleotide binding domain of an orphan resistance protein from rice.

    Science.gov (United States)

    Fenyk, Stepan; Campillo, Alba de San Eustaquio; Pohl, Ehmke; Hussey, Patrick J; Cann, Martin J

    2012-02-03

    Plant resistance proteins (R-proteins) are key components of the plant immune system activated in response to a plethora of different pathogens. R-proteins are P-loop NTPase superfamily members, and current models describe their main function as ATPases in defense signaling pathways. Here we show that a subset of R-proteins have evolved a new function to combat pathogen infection. This subset of R-proteins possesses a nucleotide phosphatase activity in the nucleotide-binding domain. Related R-proteins that fall in the same phylogenetic clade all show the same nucleotide phosphatase activity indicating a conserved function within at least a subset of R-proteins. R-protein nucleotide phosphatases catalyze the production of nucleoside from nucleotide with the nucleotide monophosphate as the preferred substrate. Mutation of conserved catalytic residues substantially reduced activity consistent with the biochemistry of P-loop NTPases. Kinetic analysis, analytical gel filtration, and chemical cross-linking demonstrated that the nucleotide-binding domain was active as a multimer. Nuclear magnetic resonance and nucleotide analogues identified the terminal phosphate bond as the target of a reaction that utilized a metal-mediated nucleophilic attack by water on the phosphoester. In conclusion, we have identified a group of R-proteins with a unique function. This biochemical activity appears to have co-evolved with plants in signaling pathways designed to resist pathogen attack.

  18. Tryton Supercomputer Capabilities for Analysis of Massive Data Streams

    Directory of Open Access Journals (Sweden)

    Krawczyk Henryk

    2015-09-01

    Full Text Available The recently deployed supercomputer Tryton, located in the Academic Computer Center of Gdansk University of Technology, provides great means for massive parallel processing. Moreover, the status of the Center as one of the main network nodes in the PIONIER network enables the fast and reliable transfer of data produced by miscellaneous devices scattered in the area of the whole country. The typical examples of such data are streams containing radio-telescope and satellite observations. Their analysis, especially with real-time constraints, can be challenging and requires the usage of dedicated software components. We propose a solution for such parallel analysis using the supercomputer, supervised by the KASKADA platform, which with the conjunction with immerse 3D visualization techniques can be used to solve problems such as pulsar detection and chronometric or oil-spill simulation on the sea surface.

  19. Vaidya spacetime in massive gravity's rainbow

    Directory of Open Access Journals (Sweden)

    Yaghoub Heydarzade

    2017-11-01

    Full Text Available In this paper, we will analyze the energy dependent deformation of massive gravity using the formalism of massive gravity's rainbow. So, we will use the Vainshtein mechanism and the dRGT mechanism for the energy dependent massive gravity, and thus analyze a ghost free theory of massive gravity's rainbow. We study the energy dependence of a time-dependent geometry, by analyzing the radiating Vaidya solution in this theory of massive gravity's rainbow. The energy dependent deformation of this Vaidya metric will be performed using suitable rainbow functions.

  20. Algorithms for parallel computers

    International Nuclear Information System (INIS)

    Churchhouse, R.F.

    1985-01-01

    Until relatively recently almost all the algorithms for use on computers had been designed on the (usually unstated) assumption that they were to be run on single processor, serial machines. With the introduction of vector processors, array processors and interconnected systems of mainframes, minis and micros, however, various forms of parallelism have become available. The advantage of parallelism is that it offers increased overall processing speed but it also raises some fundamental questions, including: (i) which, if any, of the existing 'serial' algorithms can be adapted for use in the parallel mode. (ii) How close to optimal can such adapted algorithms be and, where relevant, what are the convergence criteria. (iii) How can we design new algorithms specifically for parallel systems. (iv) For multi-processor systems how can we handle the software aspects of the interprocessor communications. Aspects of these questions illustrated by examples are considered in these lectures. (orig.)

  1. Parallelism and array processing

    International Nuclear Information System (INIS)

    Zacharov, V.

    1983-01-01

    Modern computing, as well as the historical development of computing, has been dominated by sequential monoprocessing. Yet there is the alternative of parallelism, where several processes may be in concurrent execution. This alternative is discussed in a series of lectures, in which the main developments involving parallelism are considered, both from the standpoint of computing systems and that of applications that can exploit such systems. The lectures seek to discuss parallelism in a historical context, and to identify all the main aspects of concurrency in computation right up to the present time. Included will be consideration of the important question as to what use parallelism might be in the field of data processing. (orig.)

  2. STABLE ISOTOPE GEOCHEMISTRY OF MASSIVE ICE

    Directory of Open Access Journals (Sweden)

    Yurij K. Vasil’chuk

    2016-01-01

    Full Text Available The paper summarises stable-isotope research on massive ice in the Russian and North American Arctic, and includes the latest understanding of massive-ice formation. A new classification of massive-ice complexes is proposed, encompassing the range and variabilityof massive ice. It distinguishes two new categories of massive-ice complexes: homogeneousmassive-ice complexes have a similar structure, properties and genesis throughout, whereasheterogeneous massive-ice complexes vary spatially (in their structure and properties andgenetically within a locality and consist of two or more homogeneous massive-ice bodies.Analysis of pollen and spores in massive ice from Subarctic regions and from ice and snow cover of Arctic ice caps assists with interpretation of the origin of massive ice. Radiocarbon ages of massive ice and host sediments are considered together with isotope values of heavy oxygen and deuterium from massive ice plotted at a uniform scale in order to assist interpretation and correlation of the ice.

  3. Parallel computing of physical maps--a comparative study in SIMD and MIMD parallelism.

    Science.gov (United States)

    Bhandarkar, S M; Chirravuri, S; Arnold, J

    1996-01-01

    Ordering clones from a genomic library into physical maps of whole chromosomes presents a central computational problem in genetics. Chromosome reconstruction via clone ordering is usually isomorphic to the NP-complete Optimal Linear Arrangement problem. Parallel SIMD and MIMD algorithms for simulated annealing based on Markov chain distribution are proposed and applied to the problem of chromosome reconstruction via clone ordering. Perturbation methods and problem-specific annealing heuristics are proposed and described. The SIMD algorithms are implemented on a 2048 processor MasPar MP-2 system which is an SIMD 2-D toroidal mesh architecture whereas the MIMD algorithms are implemented on an 8 processor Intel iPSC/860 which is an MIMD hypercube architecture. A comparative analysis of the various SIMD and MIMD algorithms is presented in which the convergence, speedup, and scalability characteristics of the various algorithms are analyzed and discussed. On a fine-grained, massively parallel SIMD architecture with a low synchronization overhead such as the MasPar MP-2, a parallel simulated annealing algorithm based on multiple periodically interacting searches performs the best. For a coarse-grained MIMD architecture with high synchronization overhead such as the Intel iPSC/860, a parallel simulated annealing algorithm based on multiple independent searches yields the best results. In either case, distribution of clonal data across multiple processors is shown to exacerbate the tendency of the parallel simulated annealing algorithm to get trapped in a local optimum.

  4. Parallel magnetic resonance imaging

    International Nuclear Information System (INIS)

    Larkman, David J; Nunes, Rita G

    2007-01-01

    Parallel imaging has been the single biggest innovation in magnetic resonance imaging in the last decade. The use of multiple receiver coils to augment the time consuming Fourier encoding has reduced acquisition times significantly. This increase in speed comes at a time when other approaches to acquisition time reduction were reaching engineering and human limits. A brief summary of spatial encoding in MRI is followed by an introduction to the problem parallel imaging is designed to solve. There are a large number of parallel reconstruction algorithms; this article reviews a cross-section, SENSE, SMASH, g-SMASH and GRAPPA, selected to demonstrate the different approaches. Theoretical (the g-factor) and practical (coil design) limits to acquisition speed are reviewed. The practical implementation of parallel imaging is also discussed, in particular coil calibration. How to recognize potential failure modes and their associated artefacts are shown. Well-established applications including angiography, cardiac imaging and applications using echo planar imaging are reviewed and we discuss what makes a good application for parallel imaging. Finally, active research areas where parallel imaging is being used to improve data quality by repairing artefacted images are also reviewed. (invited topical review)

  5. Spacetime structure of massive Majorana particles and massive gravitino

    CERN Document Server

    Ahluwalia, D V

    2003-01-01

    The profound difference between Dirac and Majorana particles is traced back to the possibility of having physically different constructs in the (1/2, 0) 0 (0,1/2) representation space. Contrary to Dirac particles, Majorana-particle propagators are shown to differ from the simple linear gamma mu p submu, structure. Furthermore, neither Majorana particles, nor their antiparticles can be associated with a well defined arrow of time. The inevitable consequence of this peculiarity is the particle-antiparticle metamorphosis giving rise to neutrinoless double beta decay, on the one side, and enabling spin-1/2 fields to act as gauge fields, gauginos, on the other side. The second part of the lecture notes is devoted to massive gravitino. We argue that a spin measurement in the rest frame for an unpolarized ensemble of massive gravitino, associated with the spinor-vector [(1/2, 0) 0 (0,1/2)] 0 (1/2,1/2) representation space, would yield the results 3/2 with probability one half, and 1/2 with probability one half. The ...

  6. The evolution of massive stars

    International Nuclear Information System (INIS)

    Loore, C. de

    1980-01-01

    The evolution of stars with masses between 15 M 0 and 100 M 0 is considered. Stars in this mass range lose a considerable fraction of their matter during their evolution. The treatment of convection, semi-convection and the influence of mass loss by stellar winds at different evolutionary phases are analysed as well as the adopted opacities. Evolutionary sequences computed by various groups are examined and compared with observations, and the advanced evolution of a 15 M 0 and a 25 M 0 star from zero-age main sequence (ZAMS) through iron collapse is discussed. The effect of centrifugal forces on stellar wind mass loss and the influence of rotation on evolutionary models is examined. As a consequence of the outflow of matter deeper layers show up and when the mass loss rates are large enough layers with changed composition, due to interior nuclear reactions, appear on the surface. The evolution of massive close binaries as well during the phase of mass loss by stellar wind as during the mass exchange and mass loss phase due to Roche lobe overflow is treated in detail, and the value of the parameters governing mass and angular momentum losses are discussed. The problem of the Wolf-Rayet stars, their origin and the possibilities of their production either as single stars or as massive binaries is examined. Finally, the origin of X-ray binaries is discussed and the scenario for the formation of these objects (starting from massive ZAMS close binaries, through Wolf-Rayet binaries leading to OB-stars with a compact companion after a supernova explosion) is reviewed and completed, including stellar wind mass loss. (orig.)

  7. Massive stars, successes and challenges

    OpenAIRE

    Meynet, Georges; Maeder, André; Georgy, Cyril; Ekström, Sylvia; Eggenberger, Patrick; Barblan, Fabio; Song, Han Feng

    2017-01-01

    We give a brief overview of where we stand with respect to some old and new questions bearing on how massive stars evolve and end their lifetime. We focus on the following key points that are further discussed by other contributions during this conference: convection, mass losses, rotation, magnetic field and multiplicity. For purpose of clarity, each of these processes are discussed on its own but we have to keep in mind that they are all interacting between them offering a large variety of ...

  8. Massive stars, successes and challenges

    Science.gov (United States)

    Meynet, Georges; Maeder, André; Georgy, Cyril; Ekström, Sylvia; Eggenberger, Patrick; Barblan, Fabio; Song, Han Feng

    2017-11-01

    We give a brief overview of where we stand with respect to some old and new questions bearing on how massive stars evolve and end their lifetime. We focus on the following key points that are further discussed by other contributions during this conference: convection, mass losses, rotation, magnetic field and multiplicity. For purpose of clarity, each of these processes are discussed on its own but we have to keep in mind that they are all interacting between them offering a large variety of outputs, some of them still to be discovered.

  9. Optimisation of a parallel ocean general circulation model

    Directory of Open Access Journals (Sweden)

    M. I. Beare

    1997-10-01

    Full Text Available This paper presents the development of a general-purpose parallel ocean circulation model, for use on a wide range of computer platforms, from traditional scalar machines to workstation clusters and massively parallel processors. Parallelism is provided, as a modular option, via high-level message-passing routines, thus hiding the technical intricacies from the user. An initial implementation highlights that the parallel efficiency of the model is adversely affected by a number of factors, for which optimisations are discussed and implemented. The resulting ocean code is portable and, in particular, allows science to be achieved on local workstations that could otherwise only be undertaken on state-of-the-art supercomputers.

  10. Optimisation of a parallel ocean general circulation model

    Directory of Open Access Journals (Sweden)

    M. I. Beare

    Full Text Available This paper presents the development of a general-purpose parallel ocean circulation model, for use on a wide range of computer platforms, from traditional scalar machines to workstation clusters and massively parallel processors. Parallelism is provided, as a modular option, via high-level message-passing routines, thus hiding the technical intricacies from the user. An initial implementation highlights that the parallel efficiency of the model is adversely affected by a number of factors, for which optimisations are discussed and implemented. The resulting ocean code is portable and, in particular, allows science to be achieved on local workstations that could otherwise only be undertaken on state-of-the-art supercomputers.

  11. Beam dynamics simulations using a parallel version of PARMILA

    International Nuclear Information System (INIS)

    Ryne, R.D.

    1996-01-01

    The computer code PARMILA has been the primary tool for the design of proton and ion linacs in the United States for nearly three decades. Previously it was sufficient to perform simulations with of order 10000 particles, but recently the need to perform high resolution halo studies for next-generation, high intensity linacs has made it necessary to perform simulations with of order 100 million particles. With the advent of massively parallel computers such simulations are now within reach. Parallel computers already make it possible, for example, to perform beam dynamics calculations with tens of millions of particles, requiring over 10 GByte of core memory, in just a few hours. Also, parallel computers are becoming easier to use thanks to the availability of mature, Fortran-like languages such as Connection Machine Fortran and High Performance Fortran. We will describe our experience developing a parallel version of PARMILA and the performance of the new code

  12. Beam dynamics simulations using a parallel version of PARMILA

    International Nuclear Information System (INIS)

    Ryne, Robert

    1996-01-01

    The computer code PARMILA has been the primary tool for the design of proton and ion linacs in the United States for nearly three decades. Previously it was sufficient to perform simulations with of order 10000 particles, but recently the need to perform high resolution halo studies for next-generation, high intensity linacs has made it necessary to perform simulations with of order 100 million particles. With the advent of massively parallel computers such simulations are now within reach. Parallel computers already make it possible, for example, to perform beam dynamics calculations with tens of millions of particles, requiring over 10 GByte of core memory, in just a few hours. Also, parallel computers are becoming easier to use thanks to the availability of mature, Fortran-like languages such as Connection Machine Fortran and High Performance Fortran. We will describe our experience developing a parallel version of PARMILA and the performance of the new code. (author)

  13. An efficient parallel algorithm for matrix-vector multiplication

    Energy Technology Data Exchange (ETDEWEB)

    Hendrickson, B.; Leland, R.; Plimpton, S.

    1993-03-01

    The multiplication of a vector by a matrix is the kernel computation of many algorithms in scientific computation. A fast parallel algorithm for this calculation is therefore necessary if one is to make full use of the new generation of parallel supercomputers. This paper presents a high performance, parallel matrix-vector multiplication algorithm that is particularly well suited to hypercube multiprocessors. For an n x n matrix on p processors, the communication cost of this algorithm is O(n/[radical]p + log(p)), independent of the matrix sparsity pattern. The performance of the algorithm is demonstrated by employing it as the kernel in the well-known NAS conjugate gradient benchmark, where a run time of 6.09 seconds was observed. This is the best published performance on this benchmark achieved to date using a massively parallel supercomputer.

  14. DIMACS Workshop on Interconnection Networks and Mapping, and Scheduling Parallel Computations

    CERN Document Server

    Rosenberg, Arnold L; Sotteau, Dominique; NSF Science and Technology Center in Discrete Mathematics and Theoretical Computer Science; Interconnection networks and mapping and scheduling parallel computations

    1995-01-01

    The interconnection network is one of the most basic components of a massively parallel computer system. Such systems consist of hundreds or thousands of processors interconnected to work cooperatively on computations. One of the central problems in parallel computing is the task of mapping a collection of processes onto the processors and routing network of a parallel machine. Once this mapping is done, it is critical to schedule computations within and communication among processor from universities and laboratories, as well as practitioners involved in the design, implementation, and application of massively parallel systems. Focusing on interconnection networks of parallel architectures of today and of the near future , the book includes topics such as network topologies,network properties, message routing, network embeddings, network emulation, mappings, and efficient scheduling. inputs for a process are available where and when the process is scheduled to be computed. This book contains the refereed pro...

  15. The STAPL Parallel Graph Library

    KAUST Repository

    Harshvardhan,; Fidel, Adam; Amato, Nancy M.; Rauchwerger, Lawrence

    2013-01-01

    This paper describes the stapl Parallel Graph Library, a high-level framework that abstracts the user from data-distribution and parallelism details and allows them to concentrate on parallel graph algorithm development. It includes a customizable

  16. Parallel implementation of the PHOENIX generalized stellar atmosphere program. II. Wavelength parallelization

    International Nuclear Information System (INIS)

    Baron, E.; Hauschildt, Peter H.

    1998-01-01

    We describe an important addition to the parallel implementation of our generalized nonlocal thermodynamic equilibrium (NLTE) stellar atmosphere and radiative transfer computer program PHOENIX. In a previous paper in this series we described data and task parallel algorithms we have developed for radiative transfer, spectral line opacity, and NLTE opacity and rate calculations. These algorithms divided the work spatially or by spectral lines, that is, distributing the radial zones, individual spectral lines, or characteristic rays among different processors and employ, in addition, task parallelism for logically independent functions (such as atomic and molecular line opacities). For finite, monotonic velocity fields, the radiative transfer equation is an initial value problem in wavelength, and hence each wavelength point depends upon the previous one. However, for sophisticated NLTE models of both static and moving atmospheres needed to accurately describe, e.g., novae and supernovae, the number of wavelength points is very large (200,000 - 300,000) and hence parallelization over wavelength can lead both to considerable speedup in calculation time and the ability to make use of the aggregate memory available on massively parallel supercomputers. Here, we describe an implementation of a pipelined design for the wavelength parallelization of PHOENIX, where the necessary data from the processor working on a previous wavelength point is sent to the processor working on the succeeding wavelength point as soon as it is known. Our implementation uses a MIMD design based on a relatively small number of standard message passing interface (MPI) library calls and is fully portable between serial and parallel computers. copyright 1998 The American Astronomical Society

  17. Distributed Parallel Endmember Extraction of Hyperspectral Data Based on Spark

    Directory of Open Access Journals (Sweden)

    Zebin Wu

    2016-01-01

    Full Text Available Due to the increasing dimensionality and volume of remotely sensed hyperspectral data, the development of acceleration techniques for massive hyperspectral image analysis approaches is a very important challenge. Cloud computing offers many possibilities of distributed processing of hyperspectral datasets. This paper proposes a novel distributed parallel endmember extraction method based on iterative error analysis that utilizes cloud computing principles to efficiently process massive hyperspectral data. The proposed method takes advantage of technologies including MapReduce programming model, Hadoop Distributed File System (HDFS, and Apache Spark to realize distributed parallel implementation for hyperspectral endmember extraction, which significantly accelerates the computation of hyperspectral processing and provides high throughput access to large hyperspectral data. The experimental results, which are obtained by extracting endmembers of hyperspectral datasets on a cloud computing platform built on a cluster, demonstrate the effectiveness and computational efficiency of the proposed method.

  18. Parallel Tensor Compression for Large-Scale Scientific Data.

    Energy Technology Data Exchange (ETDEWEB)

    Kolda, Tamara G. [Sandia National Lab. (SNL-CA), Livermore, CA (United States); Ballard, Grey [Sandia National Lab. (SNL-CA), Livermore, CA (United States); Austin, Woody Nathan [Univ. of Texas, Austin, TX (United States)

    2015-10-01

    As parallel computing trends towards the exascale, scientific data produced by high-fidelity simulations are growing increasingly massive. For instance, a simulation on a three-dimensional spatial grid with 512 points per dimension that tracks 64 variables per grid point for 128 time steps yields 8 TB of data. By viewing the data as a dense five way tensor, we can compute a Tucker decomposition to find inherent low-dimensional multilinear structure, achieving compression ratios of up to 10000 on real-world data sets with negligible loss in accuracy. So that we can operate on such massive data, we present the first-ever distributed memory parallel implementation for the Tucker decomposition, whose key computations correspond to parallel linear algebra operations, albeit with nonstandard data layouts. Our approach specifies a data distribution for tensors that avoids any tensor data redistribution, either locally or in parallel. We provide accompanying analysis of the computation and communication costs of the algorithms. To demonstrate the compression and accuracy of the method, we apply our approach to real-world data sets from combustion science simulations. We also provide detailed performance results, including parallel performance in both weak and strong scaling experiments.

  19. Solid holography and massive gravity

    International Nuclear Information System (INIS)

    Alberte, Lasma; Baggioli, Matteo; Khmelnitsky, Andrei; Pujolàs, Oriol

    2016-01-01

    Momentum dissipation is an important ingredient in condensed matter physics that requires a translation breaking sector. In the bottom-up gauge/gravity duality, this implies that the gravity dual is massive. We start here a systematic analysis of holographic massive gravity (HMG) theories, which admit field theory dual interpretations and which, therefore, might store interesting condensed matter applications. We show that there are many phases of HMG that are fully consistent effective field theories and which have been left overlooked in the literature. The most important distinction between the different HMG phases is that they can be clearly separated into solids and fluids. This can be done both at the level of the unbroken spacetime symmetries as well as concerning the elastic properties of the dual materials. We extract the modulus of rigidity of the solid HMG black brane solutions and show how it relates to the graviton mass term. We also consider the implications of the different HMGs on the electric response. We show that the types of response that can be consistently described within this framework is much wider than what is captured by the narrow class of models mostly considered so far.

  20. Solid holography and massive gravity

    Energy Technology Data Exchange (ETDEWEB)

    Alberte, Lasma [Abdus Salam International Centre for Theoretical Physics,Strada Costiera 11, 34151, Trieste (Italy); Baggioli, Matteo [Institut de Física d’Altes Energies (IFAE),The Barcelona Institute of Science and Technology (BIST), Campus UAB, 08193 Bellaterra, Barcelona (Spain); Department of Physics, Institute for Condensed Matter Theory, University of Illinois,1110 W. Green Street, Urbana, IL 61801 (United States); Khmelnitsky, Andrei [Abdus Salam International Centre for Theoretical Physics, Strada Costiera 11, 34151, Trieste (Italy); Pujolàs, Oriol [Institut de Física d’Altes Energies (IFAE),The Barcelona Institute of Science and Technology (BIST), Campus UAB, 08193 Bellaterra, Barcelona (Spain)

    2016-02-17

    Momentum dissipation is an important ingredient in condensed matter physics that requires a translation breaking sector. In the bottom-up gauge/gravity duality, this implies that the gravity dual is massive. We start here a systematic analysis of holographic massive gravity (HMG) theories, which admit field theory dual interpretations and which, therefore, might store interesting condensed matter applications. We show that there are many phases of HMG that are fully consistent effective field theories and which have been left overlooked in the literature. The most important distinction between the different HMG phases is that they can be clearly separated into solids and fluids. This can be done both at the level of the unbroken spacetime symmetries as well as concerning the elastic properties of the dual materials. We extract the modulus of rigidity of the solid HMG black brane solutions and show how it relates to the graviton mass term. We also consider the implications of the different HMGs on the electric response. We show that the types of response that can be consistently described within this framework is much wider than what is captured by the narrow class of models mostly considered so far.

  1. CUDAMPF: a multi-tiered parallel framework for accelerating protein sequence search in HMMER on CUDA-enabled GPU.

    Science.gov (United States)

    Jiang, Hanyu; Ganesan, Narayan

    2016-02-27

    HMMER software suite is widely used for analysis of homologous protein and nucleotide sequences with high sensitivity. The latest version of hmmsearch in HMMER 3.x, utilizes heuristic-pipeline which consists of MSV/SSV (Multiple/Single ungapped Segment Viterbi) stage, P7Viterbi stage and the Forward scoring stage to accelerate homology detection. Since the latest version is highly optimized for performance on modern multi-core CPUs with SSE capabilities, only a few acceleration attempts report speedup. However, the most compute intensive tasks within the pipeline (viz., MSV/SSV and P7Viterbi stages) still stand to benefit from the computational capabilities of massively parallel processors. A Multi-Tiered Parallel Framework (CUDAMPF) implemented on CUDA-enabled GPUs presented here, offers a finer-grained parallelism for MSV/SSV and Viterbi algorithms. We couple SIMT (Single Instruction Multiple Threads) mechanism with SIMD (Single Instructions Multiple Data) video instructions with warp-synchronism to achieve high-throughput processing and eliminate thread idling. We also propose a hardware-aware optimal allocation scheme of scarce resources like on-chip memory and caches in order to boost performance and scalability of CUDAMPF. In addition, runtime compilation via NVRTC available with CUDA 7.0 is incorporated into the presented framework that not only helps unroll innermost loop to yield upto 2 to 3-fold speedup than static compilation but also enables dynamic loading and switching of kernels depending on the query model size, in order to achieve optimal performance. CUDAMPF is designed as a hardware-aware parallel framework for accelerating computational hotspots within the hmmsearch pipeline as well as other sequence alignment applications. It achieves significant speedup by exploiting hierarchical parallelism on single GPU and takes full advantage of limited resources based on their own performance features. In addition to exceeding performance of other

  2. New Parallel Algorithms for Landscape Evolution Model

    Science.gov (United States)

    Jin, Y.; Zhang, H.; Shi, Y.

    2017-12-01

    Most landscape evolution models (LEM) developed in the last two decades solve the diffusion equation to simulate the transportation of surface sediments. This numerical approach is difficult to parallelize due to the computation of drainage area for each node, which needs huge amount of communication if run in parallel. In order to overcome this difficulty, we developed two parallel algorithms for LEM with a stream net. One algorithm handles the partition of grid with traditional methods and applies an efficient global reduction algorithm to do the computation of drainage areas and transport rates for the stream net; the other algorithm is based on a new partition algorithm, which partitions the nodes in catchments between processes first, and then partitions the cells according to the partition of nodes. Both methods focus on decreasing communication between processes and take the advantage of massive computing techniques, and numerical experiments show that they are both adequate to handle large scale problems with millions of cells. We implemented the two algorithms in our program based on the widely used finite element library deal.II, so that it can be easily coupled with ASPECT.

  3. Parallel community climate model: Description and user`s guide

    Energy Technology Data Exchange (ETDEWEB)

    Drake, J.B.; Flanery, R.E.; Semeraro, B.D.; Worley, P.H. [and others

    1996-07-15

    This report gives an overview of a parallel version of the NCAR Community Climate Model, CCM2, implemented for MIMD massively parallel computers using a message-passing programming paradigm. The parallel implementation was developed on an Intel iPSC/860 with 128 processors and on the Intel Delta with 512 processors, and the initial target platform for the production version of the code is the Intel Paragon with 2048 processors. Because the implementation uses a standard, portable message-passing libraries, the code has been easily ported to other multiprocessors supporting a message-passing programming paradigm. The parallelization strategy used is to decompose the problem domain into geographical patches and assign each processor the computation associated with a distinct subset of the patches. With this decomposition, the physics calculations involve only grid points and data local to a processor and are performed in parallel. Using parallel algorithms developed for the semi-Lagrangian transport, the fast Fourier transform and the Legendre transform, both physics and dynamics are computed in parallel with minimal data movement and modest change to the original CCM2 source code. Sequential or parallel history tapes are written and input files (in history tape format) are read sequentially by the parallel code to promote compatibility with production use of the model on other computer systems. A validation exercise has been performed with the parallel code and is detailed along with some performance numbers on the Intel Paragon and the IBM SP2. A discussion of reproducibility of results is included. A user`s guide for the PCCM2 version 2.1 on the various parallel machines completes the report. Procedures for compilation, setup and execution are given. A discussion of code internals is included for those who may wish to modify and use the program in their own research.

  4. SPINning parallel systems software

    International Nuclear Information System (INIS)

    Matlin, O.S.; Lusk, E.; McCune, W.

    2002-01-01

    We describe our experiences in using Spin to verify parts of the Multi Purpose Daemon (MPD) parallel process management system. MPD is a distributed collection of processes connected by Unix network sockets. MPD is dynamic processes and connections among them are created and destroyed as MPD is initialized, runs user processes, recovers from faults, and terminates. This dynamic nature is easily expressible in the Spin/Promela framework but poses performance and scalability challenges. We present here the results of expressing some of the parallel algorithms of MPD and executing both simulation and verification runs with Spin

  5. Parallel programming with Python

    CERN Document Server

    Palach, Jan

    2014-01-01

    A fast, easy-to-follow and clear tutorial to help you develop Parallel computing systems using Python. Along with explaining the fundamentals, the book will also introduce you to slightly advanced concepts and will help you in implementing these techniques in the real world. If you are an experienced Python programmer and are willing to utilize the available computing resources by parallelizing applications in a simple way, then this book is for you. You are required to have a basic knowledge of Python development to get the most of this book.

  6. The massive transformation in Ti-Al alloys: mechanistic observations

    International Nuclear Information System (INIS)

    Zhang, X.D.; Godfrey, S.; Weaver, M.; Strangwood, M.; Kaufman, M.J.; Loretto, M.H.

    1996-01-01

    The massive α→γ m transformation, as observed using analytical transmission electron microscopy, in Ti-49Al, Ti-48Al-2Nb-2Mn, Ti-55Al-25Ta and Ti-50Al-20Ta alloys is described. Conventional solution heating and quenching experiments have been combined with the more rapid quenching possible using electron beam melting in order to provide further insight into the early stages of the transformation of these alloys. It is shown that the γ develops first at grain boundaries as lamellae in one of the grains and that these lamellae intersect and spread into the adjacent grain in a massive manner. Consequently, there is no orientation relationship between the massive gamma (γ m ) and the grain being consumed whereas there is the expected relation between the γ m and the first grain which is inherited from the lamellae. It is further shown that the γ m grows as an f.c.c. phase after initially growing with the L1 0 structure. Furthermore, it is shown that the massive f.c.c. phase then orders to the L1 0 structure producing APDB-like defects which are actually thin 90 degree domains separating adjacent domains that have the same orientation yet are out of phase. The advancing γ m interface tends to facet parallel either to one of its four {111} planes or to the basal plane in the grain being consumed by impinging on existing γ lamellae. Thin microtwins and α 2 platelets then form in the γ m presumably due, respectively, to transformation stresses and supersaturation of the γ m with titanium for alloys containing ∼48% Al; indeed, there is a local depletion in aluminium across the α 2 platelets as determined using fine probe microanalysis

  7. On maximal massive 3D supergravity

    OpenAIRE

    Bergshoeff , Eric A; Hohm , Olaf; Rosseel , Jan; Townsend , Paul K

    2010-01-01

    ABSTRACT We construct, at the linearized level, the three-dimensional (3D) N = 4 supersymmetric " general massive supergravity " and the maximally supersymmetric N = 8 " new massive supergravity ". We also construct the maximally supersymmetric linearized N = 7 topologically massive supergravity, although we expect N = 6 to be maximal at the non-linear level. (Bergshoeff, Eric A) (Hohm, Olaf) (Rosseel, Jan) P.K.Townsend@da...

  8. On the singularities of massive superstring amplitudes

    International Nuclear Information System (INIS)

    Foda, O.

    1987-01-01

    Superstring one-loop amplitudes with massive external states are shown to be in general ill-defined due to internal on-shell propagators. However, we argue that since any massive string state (in the uncompactified theory) has a finite lifetime to decay into massless particles, such amplitudes are not terms in the perturbative expansion of physical S-matrix elements: These can be defined only with massless external states. Consistent massive amplitudes repuire an off-shell formalism. (orig.)

  9. On the singularities of massive superstring amplitudes

    Energy Technology Data Exchange (ETDEWEB)

    Foda, O.

    1987-06-04

    Superstring one-loop amplitudes with massive external states are shown to be in general ill-defined due to internal on-shell propagators. However, we argue that since any massive string state (in the uncompactified theory) has a finite lifetime to decay into massless particles, such amplitudes are not terms in the perturbative expansion of physical S-matrix elements: These can be defined only with massless external states. Consistent massive amplitudes repuire an off-shell formalism.

  10. Light weakly interacting massive particles

    Science.gov (United States)

    Gelmini, Graciela B.

    2017-08-01

    Light weakly interacting massive particles (WIMPs) are dark matter particle candidates with weak scale interaction with the known particles, and mass in the GeV to tens of GeV range. Hints of light WIMPs have appeared in several dark matter searches in the last decade. The unprecedented possible coincidence into tantalizingly close regions of mass and cross section of four separate direct detection experimental hints and a potential indirect detection signal in gamma rays from the galactic center, aroused considerable interest in our field. Even if these hints did not so far result in a discovery, they have had a significant impact in our field. Here we review the evidence for and against light WIMPs as dark matter candidates and discuss future relevant experiments and observations.

  11. Massive postpartum right renal hemorrhage.

    Science.gov (United States)

    Kiracofe, H L; Peterson, N

    1975-06-01

    All reported cases of massive postpartum right renal hemorrhage have involved healthy young primigravidas and blacks have predominated (4 of 7 women). Coagulopathies and underlying renal disease have been absent. Hematuria was painless in 5 of 8 cases. Hemorrhage began within 24 hours in 1 case, within 48 hours in 4 cases and 4 days post partum in 3 cases. Our first case is the only report in which hemorrhage has occurred in a primipara. Failure of closure or reopening of pyelovenous channels is suggested as the pathogenesis. The hemorrhage has been self-limiting, requiring no more than 1,500 cc whole blood replacement. Bleeding should stop spontaneously, and rapid renal pelvic clot lysis should follow with maintenance of adequate urine output and Foley catheter bladder decompression. To date surgical intervention has not been necessary.

  12. Cosmological attractors in massive gravity

    CERN Document Server

    Dubovsky, S; Tkachev, I I

    2005-01-01

    We study Lorentz-violating models of massive gravity which preserve rotations and are invariant under time-dependent shifts of the spatial coordinates. In the linear approximation the Newtonian potential in these models has an extra ``confining'' term proportional to the distance from the source. We argue that during cosmological expansion the Universe may be driven to an attractor point with larger symmetry which includes particular simultaneous dilatations of time and space coordinates. The confining term in the potential vanishes as one approaches the attractor. In the vicinity of the attractor the extra contribution is present in the Friedmann equation which, in a certain range of parameters, gives rise to the cosmic acceleration.

  13. Massive Black Holes and Galaxies

    CERN Multimedia

    CERN. Geneva

    2016-01-01

    Evidence has been accumulating for several decades that many galaxies harbor central mass concentrations that may be in the form of black holes with masses between a few million to a few billion time the mass of the Sun. I will discuss measurements over the last two decades, employing adaptive optics imaging and spectroscopy on large ground-based telescopes that prove the existence of such a massive black hole in the Center of our Milky Way, beyond any reasonable doubt. These data also provide key insights into its properties and environment. Most recently, a tidally disrupting cloud of gas has been discovered on an almost radial orbit that reached its peri-distance of ~2000 Schwarzschild radii in 2014, promising to be a valuable tool for exploring the innermost accretion zone. Future interferometric studies of the Galactic Center Black hole promise to be able to test gravity in its strong field limit.

  14. Stable massive particles at colliders

    Energy Technology Data Exchange (ETDEWEB)

    Fairbairn, M.; /Stockholm U.; Kraan, A.C.; /Pennsylvania U.; Milstead, D.A.; /Stockholm U.; Sjostrand, T.; /Lund U.; Skands, P.; /Fermilab; Sloan, T.; /Lancaster U.

    2006-11-01

    We review the theoretical motivations and experimental status of searches for stable massive particles (SMPs) which could be sufficiently long-lived as to be directly detected at collider experiments. The discovery of such particles would address a number of important questions in modern physics including the origin and composition of dark matter in the universe and the unification of the fundamental forces. This review describes the techniques used in SMP-searches at collider experiments and the limits so far obtained on the production of SMPs which possess various colour, electric and magnetic charge quantum numbers. We also describe theoretical scenarios which predict SMPs, the phenomenology needed to model their production at colliders and interactions with matter. In addition, the interplay between collider searches and open questions in cosmology such as dark matter composition are addressed.

  15. Expressing Parallelism with ROOT

    Energy Technology Data Exchange (ETDEWEB)

    Piparo, D. [CERN; Tejedor, E. [CERN; Guiraud, E. [CERN; Ganis, G. [CERN; Mato, P. [CERN; Moneta, L. [CERN; Valls Pla, X. [CERN; Canal, P. [Fermilab

    2017-11-22

    The need for processing the ever-increasing amount of data generated by the LHC experiments in a more efficient way has motivated ROOT to further develop its support for parallelism. Such support is being tackled both for shared-memory and distributed-memory environments. The incarnations of the aforementioned parallelism are multi-threading, multi-processing and cluster-wide executions. In the area of multi-threading, we discuss the new implicit parallelism and related interfaces, as well as the new building blocks to safely operate with ROOT objects in a multi-threaded environment. Regarding multi-processing, we review the new MultiProc framework, comparing it with similar tools (e.g. multiprocessing module in Python). Finally, as an alternative to PROOF for cluster-wide executions, we introduce the efforts on integrating ROOT with state-of-the-art distributed data processing technologies like Spark, both in terms of programming model and runtime design (with EOS as one of the main components). For all the levels of parallelism, we discuss, based on real-life examples and measurements, how our proposals can increase the productivity of scientists.

  16. Expressing Parallelism with ROOT

    Science.gov (United States)

    Piparo, D.; Tejedor, E.; Guiraud, E.; Ganis, G.; Mato, P.; Moneta, L.; Valls Pla, X.; Canal, P.

    2017-10-01

    The need for processing the ever-increasing amount of data generated by the LHC experiments in a more efficient way has motivated ROOT to further develop its support for parallelism. Such support is being tackled both for shared-memory and distributed-memory environments. The incarnations of the aforementioned parallelism are multi-threading, multi-processing and cluster-wide executions. In the area of multi-threading, we discuss the new implicit parallelism and related interfaces, as well as the new building blocks to safely operate with ROOT objects in a multi-threaded environment. Regarding multi-processing, we review the new MultiProc framework, comparing it with similar tools (e.g. multiprocessing module in Python). Finally, as an alternative to PROOF for cluster-wide executions, we introduce the efforts on integrating ROOT with state-of-the-art distributed data processing technologies like Spark, both in terms of programming model and runtime design (with EOS as one of the main components). For all the levels of parallelism, we discuss, based on real-life examples and measurements, how our proposals can increase the productivity of scientists.

  17. Parallel Fast Legendre Transform

    NARCIS (Netherlands)

    Alves de Inda, M.; Bisseling, R.H.; Maslen, D.K.

    1998-01-01

    We discuss a parallel implementation of a fast algorithm for the discrete polynomial Legendre transform We give an introduction to the DriscollHealy algorithm using polynomial arithmetic and present experimental results on the eciency and accuracy of our implementation The algorithms were

  18. Practical parallel programming

    CERN Document Server

    Bauer, Barr E

    2014-01-01

    This is the book that will teach programmers to write faster, more efficient code for parallel processors. The reader is introduced to a vast array of procedures and paradigms on which actual coding may be based. Examples and real-life simulations using these devices are presented in C and FORTRAN.

  19. Parallel hierarchical radiosity rendering

    Energy Technology Data Exchange (ETDEWEB)

    Carter, Michael [Iowa State Univ., Ames, IA (United States)

    1993-07-01

    In this dissertation, the step-by-step development of a scalable parallel hierarchical radiosity renderer is documented. First, a new look is taken at the traditional radiosity equation, and a new form is presented in which the matrix of linear system coefficients is transformed into a symmetric matrix, thereby simplifying the problem and enabling a new solution technique to be applied. Next, the state-of-the-art hierarchical radiosity methods are examined for their suitability to parallel implementation, and scalability. Significant enhancements are also discovered which both improve their theoretical foundations and improve the images they generate. The resultant hierarchical radiosity algorithm is then examined for sources of parallelism, and for an architectural mapping. Several architectural mappings are discussed. A few key algorithmic changes are suggested during the process of making the algorithm parallel. Next, the performance, efficiency, and scalability of the algorithm are analyzed. The dissertation closes with a discussion of several ideas which have the potential to further enhance the hierarchical radiosity method, or provide an entirely new forum for the application of hierarchical methods.

  20. Parallel universes beguile science

    CERN Multimedia

    2007-01-01

    A staple of mind-bending science fiction, the possibility of multiple universes has long intrigued hard-nosed physicists, mathematicians and cosmologists too. We may not be able -- as least not yet -- to prove they exist, many serious scientists say, but there are plenty of reasons to think that parallel dimensions are more than figments of eggheaded imagination.

  1. Parallel k-means++

    Energy Technology Data Exchange (ETDEWEB)

    2017-04-04

    A parallelization of the k-means++ seed selection algorithm on three distinct hardware platforms: GPU, multicore CPU, and multithreaded architecture. K-means++ was developed by David Arthur and Sergei Vassilvitskii in 2007 as an extension of the k-means data clustering technique. These algorithms allow people to cluster multidimensional data, by attempting to minimize the mean distance of data points within a cluster. K-means++ improved upon traditional k-means by using a more intelligent approach to selecting the initial seeds for the clustering process. While k-means++ has become a popular alternative to traditional k-means clustering, little work has been done to parallelize this technique. We have developed original C++ code for parallelizing the algorithm on three unique hardware architectures: GPU using NVidia's CUDA/Thrust framework, multicore CPU using OpenMP, and the Cray XMT multithreaded architecture. By parallelizing the process for these platforms, we are able to perform k-means++ clustering much more quickly than it could be done before.

  2. Parallel plate detectors

    International Nuclear Information System (INIS)

    Gardes, D.; Volkov, P.

    1981-01-01

    A 5x3cm 2 (timing only) and a 15x5cm 2 (timing and position) parallel plate avalanche counters (PPAC) are considered. The theory of operation and timing resolution is given. The measurement set-up and the curves of experimental results illustrate the possibilities of the two counters [fr

  3. Parallel hierarchical global illumination

    Energy Technology Data Exchange (ETDEWEB)

    Snell, Quinn O. [Iowa State Univ., Ames, IA (United States)

    1997-10-08

    Solving the global illumination problem is equivalent to determining the intensity of every wavelength of light in all directions at every point in a given scene. The complexity of the problem has led researchers to use approximation methods for solving the problem on serial computers. Rather than using an approximation method, such as backward ray tracing or radiosity, the authors have chosen to solve the Rendering Equation by direct simulation of light transport from the light sources. This paper presents an algorithm that solves the Rendering Equation to any desired accuracy, and can be run in parallel on distributed memory or shared memory computer systems with excellent scaling properties. It appears superior in both speed and physical correctness to recent published methods involving bidirectional ray tracing or hybrid treatments of diffuse and specular surfaces. Like progressive radiosity methods, it dynamically refines the geometry decomposition where required, but does so without the excessive storage requirements for ray histories. The algorithm, called Photon, produces a scene which converges to the global illumination solution. This amounts to a huge task for a 1997-vintage serial computer, but using the power of a parallel supercomputer significantly reduces the time required to generate a solution. Currently, Photon can be run on most parallel environments from a shared memory multiprocessor to a parallel supercomputer, as well as on clusters of heterogeneous workstations.

  4. The International Nucleotide Sequence Database Collaboration.

    Science.gov (United States)

    Cochrane, Guy; Karsch-Mizrachi, Ilene; Nakamura, Yasukazu

    2011-01-01

    Under the International Nucleotide Sequence Database Collaboration (INSDC; http://www.insdc.org), globally comprehensive public domain nucleotide sequence is captured, preserved and presented. The partners of this long-standing collaboration work closely together to provide data formats and conventions that enable consistent data submission to their databases and support regular data exchange around the globe. Clearly defined policy and governance in relation to free access to data and relationships with journal publishers have positioned INSDC databases as a key provider of the scientific record and a core foundation for the global bioinformatics data infrastructure. While growth in sequence data volumes comes no longer as a surprise to INSDC partners, the uptake of next-generation sequencing technology by mainstream science that we have witnessed in recent years brings a step-change to growth, necessarily making a clear mark on INSDC strategy. In this article, we introduce the INSDC, outline data growth patterns and comment on the challenges of increased growth.

  5. Bacterial nucleotide-based second messengers.

    Science.gov (United States)

    Pesavento, Christina; Hengge, Regine

    2009-04-01

    In all domains of life nucleotide-based second messengers transduce signals originating from changes in the environment or in intracellular conditions into appropriate cellular responses. In prokaryotes cyclic di-GMP has emerged as an important and ubiquitous second messenger regulating bacterial life-style transitions relevant for biofilm formation, virulence, and many other bacterial functions. This review describes similarities and differences in the architecture of the cAMP, (p)ppGpp, and c-di-GMP signaling systems and their underlying signaling principles. Moreover, recent advances in c-di-GMP-mediated signaling will be presented and the integration of c-di-GMP signaling with other nucleotide-based signaling systems will be discussed.

  6. Nucleotide Manipulatives to Illustrate the Central Dogma

    OpenAIRE

    Sonja B. Yung; Todd P. Primm

    2015-01-01

    The central dogma is a core concept that is critical for introductory biology and microbiology students to master. However, students often struggle to conceptualize the processes involved, and fail to move beyond simply memorizing the basic facts. To encourage critical thinking, we have designed a set of magnetic nucleotide manipulatives that allow students to model DNA structure, along with the processes of replication, transcription, and translation.

  7. Nucleotide Manipulatives to Illustrate the Central Dogma

    Directory of Open Access Journals (Sweden)

    Sonja B. Yung

    2015-08-01

    Full Text Available The central dogma is a core concept that is critical for introductory biology and microbiology students to master. However, students often struggle to conceptualize the processes involved, and fail to move beyond simply memorizing the basic facts. To encourage critical thinking, we have designed a set of magnetic nucleotide manipulatives that allow students to model DNA structure, along with the processes of replication, transcription, and translation.

  8. Histone displacement during nucleotide excision repair

    DEFF Research Database (Denmark)

    Dinant, C.; Bartek, J.; Bekker-Jensen, S.

    2012-01-01

    Nucleotide excision repair (NER) is an important DNA repair mechanism required for cellular resistance against UV light and toxic chemicals such as those found in tobacco smoke. In living cells, NER efficiently detects and removes DNA lesions within the large nuclear macromolecular complex called...... of histone variants and histone displacement (including nucleosome sliding). Here we review current knowledge, and speculate about current unknowns, regarding those chromatin remodeling activities that physically displace histones before, during and after NER....

  9. Pyrrolidine nucleotide analogs with a tunable conformation

    Czech Academy of Sciences Publication Activity Database

    Poštová Slavětínská, Lenka; Rejman, Dominik; Pohl, Radek

    2014-01-01

    Roč. 10, Aug 22 (2014), s. 1967-1980 ISSN 1860-5397 R&D Projects: GA ČR GA13-24880S Institutional support: RVO:61388963 Keywords : conformation * NMR * nucleic acids * nucleotide analog * phosphonic acid * pseudorotation * pyrrolidine Subject RIV: CC - Organic Chemistry Impact factor: 2.762, year: 2014 http://www.beilstein-journals.org/bjoc/single/articleFullText.htm?publicId=1860-5397-10-205

  10. Parallel Algorithms for Switching Edges in Heterogeneous Graphs.

    Science.gov (United States)

    Bhuiyan, Hasanuzzaman; Khan, Maleq; Chen, Jiangzhuo; Marathe, Madhav

    2017-06-01

    An edge switch is an operation on a graph (or network) where two edges are selected randomly and one of their end vertices are swapped with each other. Edge switch operations have important applications in graph theory and network analysis, such as in generating random networks with a given degree sequence, modeling and analyzing dynamic networks, and in studying various dynamic phenomena over a network. The recent growth of real-world networks motivates the need for efficient parallel algorithms. The dependencies among successive edge switch operations and the requirement to keep the graph simple (i.e., no self-loops or parallel edges) as the edges are switched lead to significant challenges in designing a parallel algorithm. Addressing these challenges requires complex synchronization and communication among the processors leading to difficulties in achieving a good speedup by parallelization. In this paper, we present distributed memory parallel algorithms for switching edges in massive networks. These algorithms provide good speedup and scale well to a large number of processors. A harmonic mean speedup of 73.25 is achieved on eight different networks with 1024 processors. One of the steps in our edge switch algorithms requires the computation of multinomial random variables in parallel. This paper presents the first non-trivial parallel algorithm for the problem, achieving a speedup of 925 using 1024 processors.

  11. Vacuum ultraviolet photoionization of carbohydrates and nucleotides

    Energy Technology Data Exchange (ETDEWEB)

    Shin, Joong-Won, E-mail: jshin@govst.edu [Division of Science, Governors State University, University Park, Illinois 60484-0975 (United States); Department of Chemistry, Colorado State University, Fort Collins, Colorado 80523-1872 (United States); Bernstein, Elliot R., E-mail: erb@lamar.colostate.edu [Department of Chemistry, Colorado State University, Fort Collins, Colorado 80523-1872 (United States)

    2014-01-28

    Carbohydrates (2-deoxyribose, ribose, and xylose) and nucleotides (adenosine-, cytidine-, guanosine-, and uridine-5{sup ′}-monophosphate) are generated in the gas phase, and ionized with vacuum ultraviolet photons (VUV, 118.2 nm). The observed time of flight mass spectra of the carbohydrate fragmentation are similar to those observed [J.-W. Shin, F. Dong, M. Grisham, J. J. Rocca, and E. R. Bernstein, Chem. Phys. Lett. 506, 161 (2011)] for 46.9 nm photon ionization, but with more intensity in higher mass fragment ions. The tendency of carbohydrate ions to fragment extensively following ionization seemingly suggests that nucleic acids might undergo radiation damage as a result of carbohydrate, rather than nucleobase fragmentation. VUV photoionization of nucleotides (monophosphate-carbohydrate-nucleobase), however, shows that the carbohydrate-nucleobase bond is the primary fragmentation site for these species. Density functional theory (DFT) calculations indicate that the removed carbohydrate electrons by the 118.2 nm photons are associated with endocyclic C–C and C–O ring centered orbitals: loss of electron density in the ring bonds of the nascent ion can thus account for the observed fragmentation patterns following carbohydrate ionization. DFT calculations also indicate that electrons removed from nucleotides under these same conditions are associated with orbitals involved with the nucleobase-saccharide linkage electron density. The calculations give a general mechanism and explanation of the experimental results.

  12. Vacuum ultraviolet photoionization of carbohydrates and nucleotides

    International Nuclear Information System (INIS)

    Shin, Joong-Won; Bernstein, Elliot R.

    2014-01-01

    Carbohydrates (2-deoxyribose, ribose, and xylose) and nucleotides (adenosine-, cytidine-, guanosine-, and uridine-5 ′ -monophosphate) are generated in the gas phase, and ionized with vacuum ultraviolet photons (VUV, 118.2 nm). The observed time of flight mass spectra of the carbohydrate fragmentation are similar to those observed [J.-W. Shin, F. Dong, M. Grisham, J. J. Rocca, and E. R. Bernstein, Chem. Phys. Lett. 506, 161 (2011)] for 46.9 nm photon ionization, but with more intensity in higher mass fragment ions. The tendency of carbohydrate ions to fragment extensively following ionization seemingly suggests that nucleic acids might undergo radiation damage as a result of carbohydrate, rather than nucleobase fragmentation. VUV photoionization of nucleotides (monophosphate-carbohydrate-nucleobase), however, shows that the carbohydrate-nucleobase bond is the primary fragmentation site for these species. Density functional theory (DFT) calculations indicate that the removed carbohydrate electrons by the 118.2 nm photons are associated with endocyclic C–C and C–O ring centered orbitals: loss of electron density in the ring bonds of the nascent ion can thus account for the observed fragmentation patterns following carbohydrate ionization. DFT calculations also indicate that electrons removed from nucleotides under these same conditions are associated with orbitals involved with the nucleobase-saccharide linkage electron density. The calculations give a general mechanism and explanation of the experimental results

  13. Identification of cyclic nucleotide gated channels using regular expressions

    KAUST Repository

    Zelman, Alice K.; Dawe, Adam Sean; Berkowitz, Gerald A.

    2013-01-01

    Cyclic nucleotide-gated channels (CNGCs) are nonselective cation channels found in plants, animals, and some bacteria. They have a six-transmembrane/one- pore structure, a cytosolic cyclic nucleotide-binding domain, and a cytosolic calmodulin

  14. Effects of hypokinesia on cyclic nucleotides and hormonal regulation ...

    African Journals Online (AJOL)

    PTH), calcitonin (CT), cyclic nucleotides (cAMP, cGMP) and calcium in the blood of rats, while in urine - phosphate, calcium and cyclic nucleotides. Design: Laboratory based experiment. Setting: Laboratory in the Department of Biochemistry, ...

  15. Massive Star Burps, Then Explodes

    Science.gov (United States)

    2007-04-01

    Berkeley -- In a galaxy far, far away, a massive star suffered a nasty double whammy. On Oct. 20, 2004, Japanese amateur astronomer Koichi Itagaki saw the star let loose an outburst so bright that it was initially mistaken for a supernova. The star survived, but for only two years. On Oct. 11, 2006, professional and amateur astronomers witnessed the star actually blowing itself to smithereens as Supernova 2006jc. Swift UVOT Image Swift UVOT Image (Credit: NASA / Swift / S.Immler) "We have never observed a stellar outburst and then later seen the star explode," says University of California, Berkeley, astronomer Ryan Foley. His group studied the event with ground-based telescopes, including the 10-meter (32.8-foot) W. M. Keck telescopes in Hawaii. Narrow helium spectral lines showed that the supernova's blast wave ran into a slow-moving shell of material, presumably the progenitor's outer layers ejected just two years earlier. If the spectral lines had been caused by the supernova's fast-moving blast wave, the lines would have been much broader. artistic rendering This artistic rendering depicts two years in the life of a massive blue supergiant star, which burped and spewed a shell of gas, then, two years later, exploded. When the supernova slammed into the shell of gas, X-rays were produced. (Credit: NASA/Sonoma State Univ./A.Simonnet) Another group, led by Stefan Immler of NASA's Goddard Space Flight Center, Greenbelt, Md., monitored SN 2006jc with NASA's Swift satellite and Chandra X-ray Observatory. By observing how the supernova brightened in X-rays, a result of the blast wave slamming into the outburst ejecta, they could measure the amount of gas blown off in the 2004 outburst: about 0.01 solar mass, the equivalent of about 10 Jupiters. "The beautiful aspect of our SN 2006jc observations is that although they were obtained in different parts of the electromagnetic spectrum, in the optical and in X-rays, they lead to the same conclusions," says Immler. "This

  16. Parallel computing techniques for rotorcraft aerodynamics

    Science.gov (United States)

    Ekici, Kivanc

    The modification of unsteady three-dimensional Navier-Stokes codes for application on massively parallel and distributed computing environments is investigated. The Euler/Navier-Stokes code TURNS (Transonic Unsteady Rotor Navier-Stokes) was chosen as a test bed because of its wide use by universities and industry. For the efficient implementation of TURNS on parallel computing systems, two algorithmic changes are developed. First, main modifications to the implicit operator, Lower-Upper Symmetric Gauss Seidel (LU-SGS) originally used in TURNS, is performed. Second, application of an inexact Newton method, coupled with a Krylov subspace iterative method (Newton-Krylov method) is carried out. Both techniques have been tried previously for the Euler equations mode of the code. In this work, we have extended the methods to the Navier-Stokes mode. Several new implicit operators were tried because of convergence problems of traditional operators with the high cell aspect ratio (CAR) grids needed for viscous calculations on structured grids. Promising results for both Euler and Navier-Stokes cases are presented for these operators. For the efficient implementation of Newton-Krylov methods to the Navier-Stokes mode of TURNS, efficient preconditioners must be used. The parallel implicit operators used in the previous step are employed as preconditioners and the results are compared. The Message Passing Interface (MPI) protocol has been used because of its portability to various parallel architectures. It should be noted that the proposed methodology is general and can be applied to several other CFD codes (e.g. OVERFLOW).

  17. An effective theory of massive gauge bosons

    International Nuclear Information System (INIS)

    Doria, R.M.; Helayel Neto, J.A.

    1986-01-01

    The coupling of a group-valued massive scalar field to a gauge field through a symmetric rank-2 field strenght is studied. By considering energies very small compared with the mass of the scalar and invoking the decoupling theorem, one is left with a low-energy effective theory describing a dynamics of massive vector fields. (Author) [pt

  18. On the singularities of massive superstring amplitudes

    NARCIS (Netherlands)

    Foda, O.

    1987-01-01

    Superstring one-loop amplitudes with massive external states are shown to be in general ill-defined due to internal on-shell propagators. However, we argue that since any massive string state (in the uncompactified theory) has a finite lifetime to decay into massless particles, such amplitudes are

  19. Massive vector fields and black holes

    International Nuclear Information System (INIS)

    Frolov, V.P.

    1977-04-01

    A massive vector field inside the event horizon created by the static sources located outside the black hole is investigated. It is shown that the back reaction of such a field on the metric near r = 0 cannot be neglected. The possibility of the space-time structure changing near r = 0 due to the external massive field is discussed

  20. Management of massive haemoptysis | Adegboye | Nigerian Journal ...

    African Journals Online (AJOL)

    Background: This study compares two management techniques in the treatment of massive haemotysis. Method: All patients with massive haemoptysis treated between January 1969 and December 1980 (group 1) were retrospectively reviewed and those prospectively treated between January 1981 and August 1999 ...