robust large-scale parallel: Topics by WorldWideScience.org

Sample records for robust large-scale parallel

Robust large-scale parallel nonlinear solvers for simulations.

Energy Technology Data Exchange (ETDEWEB)

Bader, Brett William; Pawlowski, Roger Patrick; Kolda, Tamara Gibson (Sandia National Laboratories, Livermore, CA)

2005-11-01

This report documents research to develop robust and efficient solution techniques for solving large-scale systems of nonlinear equations. The most widely used method for solving systems of nonlinear equations is Newton's method. While much research has been devoted to augmenting Newton-based solvers (usually with globalization techniques), little has been devoted to exploring the application of different models. Our research has been directed at evaluating techniques using different models than Newton's method: a lower order model, Broyden's method, and a higher order model, the tensor method. We have developed large-scale versions of each of these models and have demonstrated their use in important applications at Sandia. Broyden's method replaces the Jacobian with an approximation, allowing codes that cannot evaluate a Jacobian or have an inaccurate Jacobian to converge to a solution. Limited-memory methods, which have been successful in optimization, allow us to extend this approach to large-scale problems. We compare the robustness and efficiency of Newton's method, modified Newton's method, Jacobian-free Newton-Krylov method, and our limited-memory Broyden method. Comparisons are carried out for large-scale applications of fluid flow simulations and electronic circuit simulations. Results show that, in cases where the Jacobian was inaccurate or could not be computed, Broyden's method converged in some cases where Newton's method failed to converge. We identify conditions where Broyden's method can be more efficient than Newton's method. We also present modifications to a large-scale tensor method, originally proposed by Bouaricha, for greater efficiency, better robustness, and wider applicability. Tensor methods are an alternative to Newton-based methods and are based on computing a step based on a local quadratic model rather than a linear model. The advantage of Bouaricha's method is that it can use any
Parameter estimation in large-scale systems biology models: a parallel and self-adaptive cooperative strategy.

Science.gov (United States)

Penas, David R; González, Patricia; Egea, Jose A; Doallo, Ramón; Banga, Julio R

2017-01-21

The development of large-scale kinetic models is one of the current key issues in computational systems biology and bioinformatics. Here we consider the problem of parameter estimation in nonlinear dynamic models. Global optimization methods can be used to solve this type of problems but the associated computational cost is very large. Moreover, many of these methods need the tuning of a number of adjustable search parameters, requiring a number of initial exploratory runs and therefore further increasing the computation times. Here we present a novel parallel method, self-adaptive cooperative enhanced scatter search (saCeSS), to accelerate the solution of this class of problems. The method is based on the scatter search optimization metaheuristic and incorporates several key new mechanisms: (i) asynchronous cooperation between parallel processes, (ii) coarse and fine-grained parallelism, and (iii) self-tuning strategies. The performance and robustness of saCeSS is illustrated by solving a set of challenging parameter estimation problems, including medium and large-scale kinetic models of the bacterium E. coli, bakerés yeast S. cerevisiae, the vinegar fly D. melanogaster, Chinese Hamster Ovary cells, and a generic signal transduction network. The results consistently show that saCeSS is a robust and efficient method, allowing very significant reduction of computation times with respect to several previous state of the art methods (from days to minutes, in several cases) even when only a small number of processors is used. The new parallel cooperative method presented here allows the solution of medium and large scale parameter estimation problems in reasonable computation times and with small hardware requirements. Further, the method includes self-tuning mechanisms which facilitate its use by non-experts. We believe that this new method can play a key role in the development of large-scale and even whole-cell dynamic models.
Parallel clustering algorithm for large-scale biological data sets.

Science.gov (United States)

Wang, Minchao; Zhang, Wu; Ding, Wang; Dai, Dongbo; Zhang, Huiran; Xie, Hao; Chen, Luonan; Guo, Yike; Xie, Jiang

2014-01-01

Recent explosion of biological data brings a great challenge for the traditional clustering algorithms. With increasing scale of data sets, much larger memory and longer runtime are required for the cluster identification problems. The affinity propagation algorithm outperforms many other classical clustering algorithms and is widely applied into the biological researches. However, the time and space complexity become a great bottleneck when handling the large-scale data sets. Moreover, the similarity matrix, whose constructing procedure takes long runtime, is required before running the affinity propagation algorithm, since the algorithm clusters data sets based on the similarities between data pairs. Two types of parallel architectures are proposed in this paper to accelerate the similarity matrix constructing procedure and the affinity propagation algorithm. The memory-shared architecture is used to construct the similarity matrix, and the distributed system is taken for the affinity propagation algorithm, because of its large memory size and great computing capacity. An appropriate way of data partition and reduction is designed in our method, in order to minimize the global communication cost among processes. A speedup of 100 is gained with 128 cores. The runtime is reduced from serval hours to a few seconds, which indicates that parallel algorithm is capable of handling large-scale data sets effectively. The parallel affinity propagation also achieves a good performance when clustering large-scale gene data (microarray) and detecting families in large protein superfamilies.
A new asynchronous parallel algorithm for inferring large-scale gene regulatory networks.

Directory of Open Access Journals (Sweden)

Xiangyun Xiao

Full Text Available The reconstruction of gene regulatory networks (GRNs from high-throughput experimental data has been considered one of the most important issues in systems biology research. With the development of high-throughput technology and the complexity of biological problems, we need to reconstruct GRNs that contain thousands of genes. However, when many existing algorithms are used to handle these large-scale problems, they will encounter two important issues: low accuracy and high computational cost. To overcome these difficulties, the main goal of this study is to design an effective parallel algorithm to infer large-scale GRNs based on high-performance parallel computing environments. In this study, we proposed a novel asynchronous parallel framework to improve the accuracy and lower the time complexity of large-scale GRN inference by combining splitting technology and ordinary differential equation (ODE-based optimization. The presented algorithm uses the sparsity and modularity of GRNs to split whole large-scale GRNs into many small-scale modular subnetworks. Through the ODE-based optimization of all subnetworks in parallel and their asynchronous communications, we can easily obtain the parameters of the whole network. To test the performance of the proposed approach, we used well-known benchmark datasets from Dialogue for Reverse Engineering Assessments and Methods challenge (DREAM, experimentally determined GRN of Escherichia coli and one published dataset that contains more than 10 thousand genes to compare the proposed approach with several popular algorithms on the same high-performance computing environments in terms of both accuracy and time complexity. The numerical results demonstrate that our parallel algorithm exhibits obvious superiority in inferring large-scale GRNs.
A new asynchronous parallel algorithm for inferring large-scale gene regulatory networks.

Science.gov (United States)

Xiao, Xiangyun; Zhang, Wei; Zou, Xiufen

2015-01-01

The reconstruction of gene regulatory networks (GRNs) from high-throughput experimental data has been considered one of the most important issues in systems biology research. With the development of high-throughput technology and the complexity of biological problems, we need to reconstruct GRNs that contain thousands of genes. However, when many existing algorithms are used to handle these large-scale problems, they will encounter two important issues: low accuracy and high computational cost. To overcome these difficulties, the main goal of this study is to design an effective parallel algorithm to infer large-scale GRNs based on high-performance parallel computing environments. In this study, we proposed a novel asynchronous parallel framework to improve the accuracy and lower the time complexity of large-scale GRN inference by combining splitting technology and ordinary differential equation (ODE)-based optimization. The presented algorithm uses the sparsity and modularity of GRNs to split whole large-scale GRNs into many small-scale modular subnetworks. Through the ODE-based optimization of all subnetworks in parallel and their asynchronous communications, we can easily obtain the parameters of the whole network. To test the performance of the proposed approach, we used well-known benchmark datasets from Dialogue for Reverse Engineering Assessments and Methods challenge (DREAM), experimentally determined GRN of Escherichia coli and one published dataset that contains more than 10 thousand genes to compare the proposed approach with several popular algorithms on the same high-performance computing environments in terms of both accuracy and time complexity. The numerical results demonstrate that our parallel algorithm exhibits obvious superiority in inferring large-scale GRNs.
Hierarchical approach to optimization of parallel matrix multiplication on large-scale platforms

KAUST Repository

Hasanov, Khalid

2014-03-04

© 2014, Springer Science+Business Media New York. Many state-of-the-art parallel algorithms, which are widely used in scientific applications executed on high-end computing systems, were designed in the twentieth century with relatively small-scale parallelism in mind. Indeed, while in 1990s a system with few hundred cores was considered a powerful supercomputer, modern top supercomputers have millions of cores. In this paper, we present a hierarchical approach to optimization of message-passing parallel algorithms for execution on large-scale distributed-memory systems. The idea is to reduce the communication cost by introducing hierarchy and hence more parallelism in the communication scheme. We apply this approach to SUMMA, the state-of-the-art parallel algorithm for matrix–matrix multiplication, and demonstrate both theoretically and experimentally that the modified Hierarchical SUMMA significantly improves the communication cost and the overall performance on large-scale platforms.
Fast Simulation of Large-Scale Floods Based on GPU Parallel Computing

Directory of Open Access Journals (Sweden)

Qiang Liu

2018-05-01

Full Text Available Computing speed is a significant issue of large-scale flood simulations for real-time response to disaster prevention and mitigation. Even today, most of the large-scale flood simulations are generally run on supercomputers due to the massive amounts of data and computations necessary. In this work, a two-dimensional shallow water model based on an unstructured Godunov-type finite volume scheme was proposed for flood simulation. To realize a fast simulation of large-scale floods on a personal computer, a Graphics Processing Unit (GPU-based, high-performance computing method using the OpenACC application was adopted to parallelize the shallow water model. An unstructured data management method was presented to control the data transportation between the GPU and CPU (Central Processing Unit with minimum overhead, and then both computation and data were offloaded from the CPU to the GPU, which exploited the computational capability of the GPU as much as possible. The parallel model was validated using various benchmarks and real-world case studies. The results demonstrate that speed-ups of up to one order of magnitude can be achieved in comparison with the serial model. The proposed parallel model provides a fast and reliable tool with which to quickly assess flood hazards in large-scale areas and, thus, has a bright application prospect for dynamic inundation risk identification and disaster assessment.
Parallel Solution of Robust Nonlinear Model Predictive Control Problems in Batch Crystallization

Directory of Open Access Journals (Sweden)

Yankai Cao

2016-06-01

Full Text Available Representing the uncertainties with a set of scenarios, the optimization problem resulting from a robust nonlinear model predictive control (NMPC strategy at each sampling instance can be viewed as a large-scale stochastic program. This paper solves these optimization problems using the parallel Schur complement method developed to solve stochastic programs on distributed and shared memory machines. The control strategy is illustrated with a case study of a multidimensional unseeded batch crystallization process. For this application, a robust NMPC based on min–max optimization guarantees satisfaction of all state and input constraints for a set of uncertainty realizations, and also provides better robust performance compared with open-loop optimal control, nominal NMPC, and robust NMPC minimizing the expected performance at each sampling instance. The performance of robust NMPC can be improved by generating optimization scenarios using Bayesian inference. With the efficient parallel solver, the solution time of one optimization problem is reduced from 6.7 min to 0.5 min, allowing for real-time application.
Parallel Optimization of Polynomials for Large-scale Problems in Stability and Control

Science.gov (United States)

Kamyar, Reza

In this thesis, we focus on some of the NP-hard problems in control theory. Thanks to the converse Lyapunov theory, these problems can often be modeled as optimization over polynomials. To avoid the problem of intractability, we establish a trade off between accuracy and complexity. In particular, we develop a sequence of tractable optimization problems --- in the form of Linear Programs (LPs) and/or Semi-Definite Programs (SDPs) --- whose solutions converge to the exact solution of the NP-hard problem. However, the computational and memory complexity of these LPs and SDPs grow exponentially with the progress of the sequence - meaning that improving the accuracy of the solutions requires solving SDPs with tens of thousands of decision variables and constraints. Setting up and solving such problems is a significant challenge. The existing optimization algorithms and software are only designed to use desktop computers or small cluster computers --- machines which do not have sufficient memory for solving such large SDPs. Moreover, the speed-up of these algorithms does not scale beyond dozens of processors. This in fact is the reason we seek parallel algorithms for setting-up and solving large SDPs on large cluster- and/or super-computers. We propose parallel algorithms for stability analysis of two classes of systems: 1) Linear systems with a large number of uncertain parameters; 2) Nonlinear systems defined by polynomial vector fields. First, we develop a distributed parallel algorithm which applies Polya's and/or Handelman's theorems to some variants of parameter-dependent Lyapunov inequalities with parameters defined over the standard simplex. The result is a sequence of SDPs which possess a block-diagonal structure. We then develop a parallel SDP solver which exploits this structure in order to map the computation, memory and communication to a distributed parallel environment. Numerical tests on a supercomputer demonstrate the ability of the algorithm to
Visual analysis of inter-process communication for large-scale parallel computing.

Science.gov (United States)

Muelder, Chris; Gygi, Francois; Ma, Kwan-Liu

2009-01-01

In serial computation, program profiling is often helpful for optimization of key sections of code. When moving to parallel computation, not only does the code execution need to be considered but also communication between the different processes which can induce delays that are detrimental to performance. As the number of processes increases, so does the impact of the communication delays on performance. For large-scale parallel applications, it is critical to understand how the communication impacts performance in order to make the code more efficient. There are several tools available for visualizing program execution and communications on parallel systems. These tools generally provide either views which statistically summarize the entire program execution or process-centric views. However, process-centric visualizations do not scale well as the number of processes gets very large. In particular, the most common representation of parallel processes is a Gantt char t with a row for each process. As the number of processes increases, these charts can become difficult to work with and can even exceed screen resolution. We propose a new visualization approach that affords more scalability and then demonstrate it on systems running with up to 16,384 processes.
Large-Scale Parallel Finite Element Analysis of the Stress Singular Problems

International Nuclear Information System (INIS)

Noriyuki Kushida; Hiroshi Okuda; Genki Yagawa

2002-01-01

In this paper, the convergence behavior of large-scale parallel finite element method for the stress singular problems was investigated. The convergence behavior of iterative solvers depends on the efficiency of the pre-conditioners. However, efficiency of pre-conditioners may be influenced by the domain decomposition that is necessary for parallel FEM. In this study the following results were obtained: Conjugate gradient method without preconditioning and the diagonal scaling preconditioned conjugate gradient method were not influenced by the domain decomposition as expected. symmetric successive over relaxation method preconditioned conjugate gradient method converged 6% faster as maximum if the stress singular area was contained in one sub-domain. (authors)
Application of parallel computing techniques to a large-scale reservoir simulation

International Nuclear Information System (INIS)

Zhang, Keni; Wu, Yu-Shu; Ding, Chris; Pruess, Karsten

2001-01-01

Even with the continual advances made in both computational algorithms and computer hardware used in reservoir modeling studies, large-scale simulation of fluid and heat flow in heterogeneous reservoirs remains a challenge. The problem commonly arises from intensive computational requirement for detailed modeling investigations of real-world reservoirs. This paper presents the application of a massive parallel-computing version of the TOUGH2 code developed for performing large-scale field simulations. As an application example, the parallelized TOUGH2 code is applied to develop a three-dimensional unsaturated-zone numerical model simulating flow of moisture, gas, and heat in the unsaturated zone of Yucca Mountain, Nevada, a potential repository for high-level radioactive waste. The modeling approach employs refined spatial discretization to represent the heterogeneous fractured tuffs of the system, using more than a million 3-D gridblocks. The problem of two-phase flow and heat transfer within the model domain leads to a total of 3,226,566 linear equations to be solved per Newton iteration. The simulation is conducted on a Cray T3E-900, a distributed-memory massively parallel computer. Simulation results indicate that the parallel computing technique, as implemented in the TOUGH2 code, is very efficient. The reliability and accuracy of the model results have been demonstrated by comparing them to those of small-scale (coarse-grid) models. These comparisons show that simulation results obtained with the refined grid provide more detailed predictions of the future flow conditions at the site, aiding in the assessment of proposed repository performance
Parallelizing Gene Expression Programming Algorithm in Enabling Large-Scale Classification

Directory of Open Access Journals (Sweden)

Lixiong Xu

2017-01-01

Full Text Available As one of the most effective function mining algorithms, Gene Expression Programming (GEP algorithm has been widely used in classification, pattern recognition, prediction, and other research fields. Based on the self-evolution, GEP is able to mine an optimal function for dealing with further complicated tasks. However, in big data researches, GEP encounters low efficiency issue due to its long time mining processes. To improve the efficiency of GEP in big data researches especially for processing large-scale classification tasks, this paper presents a parallelized GEP algorithm using MapReduce computing model. The experimental results show that the presented algorithm is scalable and efficient for processing large-scale classification tasks.
Parallel Quasi Newton Algorithms for Large Scale Non Linear Unconstrained Optimization

International Nuclear Information System (INIS)

Rahman, M. A.; Basarudin, T.

1997-01-01

This paper discusses about Quasi Newton (QN) method to solve non-linear unconstrained minimization problems. One of many important of QN method is choice of matrix Hk. to be positive definite and satisfies to QN method. Our interest here is the parallel QN methods which will suite for the solution of large-scale optimization problems. The QN methods became less attractive in large-scale problems because of the storage and computational requirements. How ever, it is often the case that the Hessian is space matrix. In this paper we include the mechanism of how to reduce the Hessian update and hold the Hessian properties.One major reason of our research is that the QN method may be good in solving certain type of minimization problems, but it is efficiency degenerate when is it applied to solve other category of problems. For this reason, we use an algorithm containing several direction strategies which are processed in parallel. We shall attempt to parallelized algorithm by exploring different search directions which are generated by various QN update during the minimization process. The different line search strategies will be employed simultaneously in the process of locating the minimum along each direction.The code of algorithm will be written in Occam language 2 which is run on the transputer machine
Large-Scale, Parallel, Multi-Sensor Data Fusion in the Cloud

Science.gov (United States)

Wilson, B. D.; Manipon, G.; Hua, H.

2012-12-01

NASA's Earth Observing System (EOS) is an ambitious facility for studying global climate change. The mandate now is to combine measurements from the instruments on the "A-Train" platforms (AIRS, AMSR-E, MODIS, MISR, MLS, and CloudSat) and other Earth probes to enable large-scale studies of climate change over periods of years to decades. However, moving from predominantly single-instrument studies to a multi-sensor, measurement-based model for long-duration analysis of important climate variables presents serious challenges for large-scale data mining and data fusion. For example, one might want to compare temperature and water vapor retrievals from one instrument (AIRS) to another instrument (MODIS), and to a model (ECMWF), stratify the comparisons using a classification of the "cloud scenes" from CloudSat, and repeat the entire analysis over years of AIRS data. To perform such an analysis, one must discover & access multiple datasets from remote sites, find the space/time "matchups" between instruments swaths and model grids, understand the quality flags and uncertainties for retrieved physical variables, assemble merged datasets, and compute fused products for further scientific and statistical analysis. To efficiently assemble such decade-scale datasets in a timely manner, we are utilizing Elastic Computing in the Cloud and parallel map/reduce-based algorithms. "SciReduce" is a Hadoop-like parallel analysis system, programmed in parallel python, that is designed from the ground up for Earth science. SciReduce executes inside VMWare images and scales to any number of nodes in the Cloud. Unlike Hadoop, in which simple tuples (keys & values) are passed between the map and reduce functions, SciReduce operates on bundles of named numeric arrays, which can be passed in memory or serialized to disk in netCDF4 or HDF5. Thus, SciReduce uses the native datatypes (geolocated grids, swaths, and points) that geo-scientists are familiar with. We are deploying within Sci
Parallel Framework for Dimensionality Reduction of Large-Scale Datasets

Directory of Open Access Journals (Sweden)

Sai Kiranmayee Samudrala

2015-01-01

Full Text Available Dimensionality reduction refers to a set of mathematical techniques used to reduce complexity of the original high-dimensional data, while preserving its selected properties. Improvements in simulation strategies and experimental data collection methods are resulting in a deluge of heterogeneous and high-dimensional data, which often makes dimensionality reduction the only viable way to gain qualitative and quantitative understanding of the data. However, existing dimensionality reduction software often does not scale to datasets arising in real-life applications, which may consist of thousands of points with millions of dimensions. In this paper, we propose a parallel framework for dimensionality reduction of large-scale data. We identify key components underlying the spectral dimensionality reduction techniques, and propose their efficient parallel implementation. We show that the resulting framework can be used to process datasets consisting of millions of points when executed on a 16,000-core cluster, which is beyond the reach of currently available methods. To further demonstrate applicability of our framework we perform dimensionality reduction of 75,000 images representing morphology evolution during manufacturing of organic solar cells in order to identify how processing parameters affect morphology evolution.
Streaming Parallel GPU Acceleration of Large-Scale filter-based Spiking Neural Networks

NARCIS (Netherlands)

L.P. Slazynski (Leszek); S.M. Bohte (Sander)

2012-01-01

htmlabstractThe arrival of graphics processing (GPU) cards suitable for massively parallel computing promises a↵ordable large-scale neural network simulation previously only available at supercomputing facil- ities. While the raw numbers suggest that GPUs may outperform CPUs by at least an order of
Random number generators for large-scale parallel Monte Carlo simulations on FPGA

Science.gov (United States)

Lin, Y.; Wang, F.; Liu, B.

2018-05-01

Through parallelization, field programmable gate array (FPGA) can achieve unprecedented speeds in large-scale parallel Monte Carlo (LPMC) simulations. FPGA presents both new constraints and new opportunities for the implementations of random number generators (RNGs), which are key elements of any Monte Carlo (MC) simulation system. Using empirical and application based tests, this study evaluates all of the four RNGs used in previous FPGA based MC studies and newly proposed FPGA implementations for two well-known high-quality RNGs that are suitable for LPMC studies on FPGA. One of the newly proposed FPGA implementations: a parallel version of additive lagged Fibonacci generator (Parallel ALFG) is found to be the best among the evaluated RNGs in fulfilling the needs of LPMC simulations on FPGA.
Exploiting multi-scale parallelism for large scale numerical modelling of laser wakefield accelerators

International Nuclear Information System (INIS)

Fonseca, R A; Vieira, J; Silva, L O; Fiuza, F; Davidson, A; Tsung, F S; Mori, W B

2013-01-01

A new generation of laser wakefield accelerators (LWFA), supported by the extreme accelerating fields generated in the interaction of PW-Class lasers and underdense targets, promises the production of high quality electron beams in short distances for multiple applications. Achieving this goal will rely heavily on numerical modelling to further understand the underlying physics and identify optimal regimes, but large scale modelling of these scenarios is computationally heavy and requires the efficient use of state-of-the-art petascale supercomputing systems. We discuss the main difficulties involved in running these simulations and the new developments implemented in the OSIRIS framework to address these issues, ranging from multi-dimensional dynamic load balancing and hybrid distributed/shared memory parallelism to the vectorization of the PIC algorithm. We present the results of the OASCR Joule Metric program on the issue of large scale modelling of LWFA, demonstrating speedups of over 1 order of magnitude on the same hardware. Finally, scalability to over ∼10 6 cores and sustained performance over ∼2 P Flops is demonstrated, opening the way for large scale modelling of LWFA scenarios. (paper)
Decomposition and parallelization strategies for solving large-scale MDO problems

Energy Technology Data Exchange (ETDEWEB)

Grauer, M.; Eschenauer, H.A. [Research Center for Multidisciplinary Analyses and Applied Structural Optimization, FOMAAS, Univ. of Siegen (Germany)

2007-07-01

During previous years, structural optimization has been recognized as a useful tool within the discriptiones of engineering and economics. However, the optimization of large-scale systems or structures is impeded by an immense solution effort. This was the reason to start a joint research and development (R and D) project between the Institute of Mechanics and Control Engineering and the Information and Decision Sciences Institute within the Research Center for Multidisciplinary Analyses and Applied Structural Optimization (FOMAAS) on cluster computing for parallel and distributed solution of multidisciplinary optimization (MDO) problems based on the OpTiX-Workbench. Here the focus of attention will be put on coarsegrained parallelization and its implementation on clusters of workstations. A further point of emphasis was laid on the development of a parallel decomposition strategy called PARDEC, for the solution of very complex optimization problems which cannot be solved efficiently by sequential integrated optimization. The use of the OptiX-Workbench together with the FEM ground water simulation system FEFLOW is shown for a special water management problem. (orig.)

Large-scale parallel genome assembler over cloud computing environment.

Science.gov (United States)

Das, Arghya Kusum; Koppa, Praveen Kumar; Goswami, Sayan; Platania, Richard; Park, Seung-Jong

2017-06-01

The size of high throughput DNA sequencing data has already reached the terabyte scale. To manage this huge volume of data, many downstream sequencing applications started using locality-based computing over different cloud infrastructures to take advantage of elastic (pay as you go) resources at a lower cost. However, the locality-based programming model (e.g. MapReduce) is relatively new. Consequently, developing scalable data-intensive bioinformatics applications using this model and understanding the hardware environment that these applications require for good performance, both require further research. In this paper, we present a de Bruijn graph oriented Parallel Giraph-based Genome Assembler (GiGA), as well as the hardware platform required for its optimal performance. GiGA uses the power of Hadoop (MapReduce) and Giraph (large-scale graph analysis) to achieve high scalability over hundreds of compute nodes by collocating the computation and data. GiGA achieves significantly higher scalability with competitive assembly quality compared to contemporary parallel assemblers (e.g. ABySS and Contrail) over traditional HPC cluster. Moreover, we show that the performance of GiGA is significantly improved by using an SSD-based private cloud infrastructure over traditional HPC cluster. We observe that the performance of GiGA on 256 cores of this SSD-based cloud infrastructure closely matches that of 512 cores of traditional HPC cluster.
Hierarchical modeling and robust synthesis for the preliminary design of large scale complex systems

Science.gov (United States)

Koch, Patrick Nathan

Large-scale complex systems are characterized by multiple interacting subsystems and the analysis of multiple disciplines. The design and development of such systems inevitably requires the resolution of multiple conflicting objectives. The size of complex systems, however, prohibits the development of comprehensive system models, and thus these systems must be partitioned into their constituent parts. Because simultaneous solution of individual subsystem models is often not manageable iteration is inevitable and often excessive. In this dissertation these issues are addressed through the development of a method for hierarchical robust preliminary design exploration to facilitate concurrent system and subsystem design exploration, for the concurrent generation of robust system and subsystem specifications for the preliminary design of multi-level, multi-objective, large-scale complex systems. This method is developed through the integration and expansion of current design techniques: (1) Hierarchical partitioning and modeling techniques for partitioning large-scale complex systems into more tractable parts, and allowing integration of subproblems for system synthesis, (2) Statistical experimentation and approximation techniques for increasing both the efficiency and the comprehensiveness of preliminary design exploration, and (3) Noise modeling techniques for implementing robust preliminary design when approximate models are employed. The method developed and associated approaches are illustrated through their application to the preliminary design of a commercial turbofan turbine propulsion system; the turbofan system-level problem is partitioned into engine cycle and configuration design and a compressor module is integrated for more detailed subsystem-level design exploration, improving system evaluation.
Parallel supercomputing: Advanced methods, algorithms, and software for large-scale linear and nonlinear problems

Energy Technology Data Exchange (ETDEWEB)

Carey, G.F.; Young, D.M.

1993-12-31

The program outlined here is directed to research on methods, algorithms, and software for distributed parallel supercomputers. Of particular interest are finite element methods and finite difference methods together with sparse iterative solution schemes for scientific and engineering computations of very large-scale systems. Both linear and nonlinear problems will be investigated. In the nonlinear case, applications with bifurcation to multiple solutions will be considered using continuation strategies. The parallelizable numerical methods of particular interest are a family of partitioning schemes embracing domain decomposition, element-by-element strategies, and multi-level techniques. The methods will be further developed incorporating parallel iterative solution algorithms with associated preconditioners in parallel computer software. The schemes will be implemented on distributed memory parallel architectures such as the CRAY MPP, Intel Paragon, the NCUBE3, and the Connection Machine. We will also consider other new architectures such as the Kendall-Square (KSQ) and proposed machines such as the TERA. The applications will focus on large-scale three-dimensional nonlinear flow and reservoir problems with strong convective transport contributions. These are legitimate grand challenge class computational fluid dynamics (CFD) problems of significant practical interest to DOE. The methods developed and algorithms will, however, be of wider interest.
Implementation of highly parallel and large scale GW calculations within the OpenAtom software

Science.gov (United States)

Ismail-Beigi, Sohrab

The need to describe electronic excitations with better accuracy than provided by band structures produced by Density Functional Theory (DFT) has been a long-term enterprise for the computational condensed matter and materials theory communities. In some cases, appropriate theoretical frameworks have existed for some time but have been difficult to apply widely due to computational cost. For example, the GW approximation incorporates a great deal of important non-local and dynamical electronic interaction effects but has been too computationally expensive for routine use in large materials simulations. OpenAtom is an open source massively parallel ab initiodensity functional software package based on plane waves and pseudopotentials (http://charm.cs.uiuc.edu/OpenAtom/) that takes advantage of the Charm + + parallel framework. At present, it is developed via a three-way collaboration, funded by an NSF SI2-SSI grant (ACI-1339804), between Yale (Ismail-Beigi), IBM T. J. Watson (Glenn Martyna) and the University of Illinois at Urbana Champaign (Laxmikant Kale). We will describe the project and our current approach towards implementing large scale GW calculations with OpenAtom. Potential applications of large scale parallel GW software for problems involving electronic excitations in semiconductor and/or metal oxide systems will be also be pointed out.
Parallel Motion Simulation of Large-Scale Real-Time Crowd in a Hierarchical Environmental Model

Directory of Open Access Journals (Sweden)

Xin Wang

2012-01-01

Full Text Available This paper presents a parallel real-time crowd simulation method based on a hierarchical environmental model. A dynamical model of the complex environment should be constructed to simulate the state transition and propagation of individual motions. By modeling of a virtual environment where virtual crowds reside, we employ different parallel methods on a topological layer, a path layer and a perceptual layer. We propose a parallel motion path matching method based on the path layer and a parallel crowd simulation method based on the perceptual layer. The large-scale real-time crowd simulation becomes possible with these methods. Numerical experiments are carried out to demonstrate the methods and results.
Very Large-Scale Neighborhoods with Performance Guarantees for Minimizing Makespan on Parallel Machines

NARCIS (Netherlands)

Brueggemann, T.; Hurink, Johann L.; Vredeveld, T.; Woeginger, Gerhard

2006-01-01

We study the problem of minimizing the makespan on m parallel machines. We introduce a very large-scale neighborhood of exponential size (in the number of machines) that is based on a matching in a complete graph. The idea is to partition the jobs assigned to the same machine into two sets. This
DGDFT: A massively parallel method for large scale density functional theory calculations.

Science.gov (United States)

Hu, Wei; Lin, Lin; Yang, Chao

2015-09-28

We describe a massively parallel implementation of the recently developed discontinuous Galerkin density functional theory (DGDFT) method, for efficient large-scale Kohn-Sham DFT based electronic structure calculations. The DGDFT method uses adaptive local basis (ALB) functions generated on-the-fly during the self-consistent field iteration to represent the solution to the Kohn-Sham equations. The use of the ALB set provides a systematic way to improve the accuracy of the approximation. By using the pole expansion and selected inversion technique to compute electron density, energy, and atomic forces, we can make the computational complexity of DGDFT scale at most quadratically with respect to the number of electrons for both insulating and metallic systems. We show that for the two-dimensional (2D) phosphorene systems studied here, using 37 basis functions per atom allows us to reach an accuracy level of 1.3 × 10(-4) Hartree/atom in terms of the error of energy and 6.2 × 10(-4) Hartree/bohr in terms of the error of atomic force, respectively. DGDFT can achieve 80% parallel efficiency on 128,000 high performance computing cores when it is used to study the electronic structure of 2D phosphorene systems with 3500-14 000 atoms. This high parallel efficiency results from a two-level parallelization scheme that we will describe in detail.
DGDFT: A massively parallel method for large scale density functional theory calculations

International Nuclear Information System (INIS)

Hu, Wei; Yang, Chao; Lin, Lin

2015-01-01

We describe a massively parallel implementation of the recently developed discontinuous Galerkin density functional theory (DGDFT) method, for efficient large-scale Kohn-Sham DFT based electronic structure calculations. The DGDFT method uses adaptive local basis (ALB) functions generated on-the-fly during the self-consistent field iteration to represent the solution to the Kohn-Sham equations. The use of the ALB set provides a systematic way to improve the accuracy of the approximation. By using the pole expansion and selected inversion technique to compute electron density, energy, and atomic forces, we can make the computational complexity of DGDFT scale at most quadratically with respect to the number of electrons for both insulating and metallic systems. We show that for the two-dimensional (2D) phosphorene systems studied here, using 37 basis functions per atom allows us to reach an accuracy level of 1.3 × 10 −4 Hartree/atom in terms of the error of energy and 6.2 × 10 −4 Hartree/bohr in terms of the error of atomic force, respectively. DGDFT can achieve 80% parallel efficiency on 128,000 high performance computing cores when it is used to study the electronic structure of 2D phosphorene systems with 3500-14 000 atoms. This high parallel efficiency results from a two-level parallelization scheme that we will describe in detail
DGDFT: A massively parallel method for large scale density functional theory calculations

Energy Technology Data Exchange (ETDEWEB)

Hu, Wei, E-mail: whu@lbl.gov; Yang, Chao, E-mail: cyang@lbl.gov [Computational Research Division, Lawrence Berkeley National Laboratory, Berkeley, California 94720 (United States); Lin, Lin, E-mail: linlin@math.berkeley.edu [Computational Research Division, Lawrence Berkeley National Laboratory, Berkeley, California 94720 (United States); Department of Mathematics, University of California, Berkeley, California 94720 (United States)

2015-09-28

We describe a massively parallel implementation of the recently developed discontinuous Galerkin density functional theory (DGDFT) method, for efficient large-scale Kohn-Sham DFT based electronic structure calculations. The DGDFT method uses adaptive local basis (ALB) functions generated on-the-fly during the self-consistent field iteration to represent the solution to the Kohn-Sham equations. The use of the ALB set provides a systematic way to improve the accuracy of the approximation. By using the pole expansion and selected inversion technique to compute electron density, energy, and atomic forces, we can make the computational complexity of DGDFT scale at most quadratically with respect to the number of electrons for both insulating and metallic systems. We show that for the two-dimensional (2D) phosphorene systems studied here, using 37 basis functions per atom allows us to reach an accuracy level of 1.3 × 10{sup −4} Hartree/atom in terms of the error of energy and 6.2 × 10{sup −4} Hartree/bohr in terms of the error of atomic force, respectively. DGDFT can achieve 80% parallel efficiency on 128,000 high performance computing cores when it is used to study the electronic structure of 2D phosphorene systems with 3500-14 000 atoms. This high parallel efficiency results from a two-level parallelization scheme that we will describe in detail.
Efficient graph-based dynamic load-balancing for parallel large-scale agent-based traffic simulation

NARCIS (Netherlands)

Xu, Y.; Cai, W.; Aydt, H.; Lees, M.; Tolk, A.; Diallo, S.Y.; Ryzhov, I.O.; Yilmaz, L.; Buckley, S.; Miller, J.A.

2014-01-01

One of the issues of parallelizing large-scale agent-based traffic simulations is partitioning and load-balancing. Traffic simulations are dynamic applications where the distribution of workload in the spatial domain constantly changes. Dynamic load-balancing at run-time has shown better efficiency
Event-triggered decentralized robust model predictive control for constrained large-scale interconnected systems

Directory of Open Access Journals (Sweden)

Ling Lu

2016-12-01

Full Text Available This paper considers the problem of event-triggered decentralized model predictive control (MPC for constrained large-scale linear systems subject to additive bounded disturbances. The constraint tightening method is utilized to formulate the MPC optimization problem. The local predictive control law for each subsystem is determined aperiodically by relevant triggering rule which allows a considerable reduction of the computational load. And then, the robust feasibility and closed-loop stability are proved and it is shown that every subsystem state will be driven into a robust invariant set. Finally, the effectiveness of the proposed approach is illustrated via numerical simulations.
Parallel Tensor Compression for Large-Scale Scientific Data.

Energy Technology Data Exchange (ETDEWEB)

Kolda, Tamara G. [Sandia National Lab. (SNL-CA), Livermore, CA (United States); Ballard, Grey [Sandia National Lab. (SNL-CA), Livermore, CA (United States); Austin, Woody Nathan [Univ. of Texas, Austin, TX (United States)

2015-10-01

As parallel computing trends towards the exascale, scientific data produced by high-fidelity simulations are growing increasingly massive. For instance, a simulation on a three-dimensional spatial grid with 512 points per dimension that tracks 64 variables per grid point for 128 time steps yields 8 TB of data. By viewing the data as a dense five way tensor, we can compute a Tucker decomposition to find inherent low-dimensional multilinear structure, achieving compression ratios of up to 10000 on real-world data sets with negligible loss in accuracy. So that we can operate on such massive data, we present the first-ever distributed memory parallel implementation for the Tucker decomposition, whose key computations correspond to parallel linear algebra operations, albeit with nonstandard data layouts. Our approach specifies a data distribution for tensors that avoids any tensor data redistribution, either locally or in parallel. We provide accompanying analysis of the computation and communication costs of the algorithms. To demonstrate the compression and accuracy of the method, we apply our approach to real-world data sets from combustion science simulations. We also provide detailed performance results, including parallel performance in both weak and strong scaling experiments.
Multilevel parallel strategy on Monte Carlo particle transport for the large-scale full-core pin-by-pin simulations

International Nuclear Information System (INIS)

Zhang, B.; Li, G.; Wang, W.; Shangguan, D.; Deng, L.

2015-01-01

This paper introduces the Strategy of multilevel hybrid parallelism of JCOGIN Infrastructure on Monte Carlo Particle Transport for the large-scale full-core pin-by-pin simulations. The particle parallelism, domain decomposition parallelism and MPI/OpenMP parallelism are designed and implemented. By the testing, JMCT presents the parallel scalability of JCOGIN, which reaches the parallel efficiency 80% on 120,000 cores for the pin-by-pin computation of the BEAVRS benchmark. (author)
A review of parallel computing for large-scale remote sensing image mosaicking

OpenAIRE

Chen, Lajiao; Ma, Yan; Liu, Peng; Wei, Jingbo; Jie, Wei; He, Jijun

2015-01-01

Interest in image mosaicking has been spurred by a wide variety of research and management needs. However, for large-scale applications, remote sensing image mosaicking usually requires significant computational capabilities. Several studies have attempted to apply parallel computing to improve image mosaicking algorithms and to speed up calculation process. The state of the art of this field has not yet been summarized, which is, however, essential for a better understanding and for further ...
On Modeling Large-Scale Multi-Agent Systems with Parallel, Sequential and Genuinely Asynchronous Cellular Automata

International Nuclear Information System (INIS)

Tosic, P.T.

2011-01-01

We study certain types of Cellular Automata (CA) viewed as an abstraction of large-scale Multi-Agent Systems (MAS). We argue that the classical CA model needs to be modified in several important respects, in order to become a relevant and sufficiently general model for the large-scale MAS, and so that thus generalized model can capture many important MAS properties at the level of agent ensembles and their long-term collective behavior patterns. We specifically focus on the issue of inter-agent communication in CA, and propose sequential cellular automata (SCA) as the first step, and genuinely Asynchronous Cellular Automata (ACA) as the ultimate deterministic CA-based abstract models for large-scale MAS made of simple reactive agents. We first formulate deterministic and nondeterministic versions of sequential CA, and then summarize some interesting configuration space properties (i.e., possible behaviors) of a restricted class of sequential CA. In particular, we compare and contrast those properties of sequential CA with the corresponding properties of the classical (that is, parallel and perfectly synchronous) CA with the same restricted class of update rules. We analytically demonstrate failure of the studied sequential CA models to simulate all possible behaviors of perfectly synchronous parallel CA, even for a very restricted class of non-linear totalistic node update rules. The lesson learned is that the interleaving semantics of concurrency, when applied to sequential CA, is not refined enough to adequately capture the perfect synchrony of parallel CA updates. Last but not least, we outline what would be an appropriate CA-like abstraction for large-scale distributed computing insofar as the inter-agent communication model is concerned, and in that context we propose genuinely asynchronous CA. (author)
LDRD final report : robust analysis of large-scale combinatorial applications.

Energy Technology Data Exchange (ETDEWEB)

Carr, Robert D.; Morrison, Todd (University of Colorado, Denver, CO); Hart, William Eugene; Benavides, Nicolas L. (Santa Clara University, Santa Clara, CA); Greenberg, Harvey J. (University of Colorado, Denver, CO); Watson, Jean-Paul; Phillips, Cynthia Ann

2007-09-01

Discrete models of large, complex systems like national infrastructures and complex logistics frameworks naturally incorporate many modeling uncertainties. Consequently, there is a clear need for optimization techniques that can robustly account for risks associated with modeling uncertainties. This report summarizes the progress of the Late-Start LDRD 'Robust Analysis of Largescale Combinatorial Applications'. This project developed new heuristics for solving robust optimization models, and developed new robust optimization models for describing uncertainty scenarios.
Visual Data-Analytics of Large-Scale Parallel Discrete-Event Simulations

Energy Technology Data Exchange (ETDEWEB)

Ross, Caitlin; Carothers, Christopher D.; Mubarak, Misbah; Carns, Philip; Ross, Robert; Li, Jianping Kelvin; Ma, Kwan-Liu

2016-11-13

Parallel discrete-event simulation (PDES) is an important tool in the codesign of extreme-scale systems because PDES provides a cost-effective way to evaluate designs of highperformance computing systems. Optimistic synchronization algorithms for PDES, such as Time Warp, allow events to be processed without global synchronization among the processing elements. A rollback mechanism is provided when events are processed out of timestamp order. Although optimistic synchronization protocols enable the scalability of large-scale PDES, the performance of the simulations must be tuned to reduce the number of rollbacks and provide an improved simulation runtime. To enable efficient large-scale optimistic simulations, one has to gain insight into the factors that affect the rollback behavior and simulation performance. We developed a tool for ROSS model developers that gives them detailed metrics on the performance of their large-scale optimistic simulations at varying levels of simulation granularity. Model developers can use this information for parameter tuning of optimistic simulations in order to achieve better runtime and fewer rollbacks. In this work, we instrument the ROSS optimistic PDES framework to gather detailed statistics about the simulation engine. We have also developed an interactive visualization interface that uses the data collected by the ROSS instrumentation to understand the underlying behavior of the simulation engine. The interface connects real time to virtual time in the simulation and provides the ability to view simulation data at different granularities. We demonstrate the usefulness of our framework by performing a visual analysis of the dragonfly network topology model provided by the CODES simulation framework built on top of ROSS. The instrumentation needs to minimize overhead in order to accurately collect data about the simulation performance. To ensure that the instrumentation does not introduce unnecessary overhead, we perform a
Large-Scale, Parallel, Multi-Sensor Atmospheric Data Fusion Using Cloud Computing

Science.gov (United States)

Wilson, B. D.; Manipon, G.; Hua, H.; Fetzer, E. J.

2013-12-01

NASA's Earth Observing System (EOS) is an ambitious facility for studying global climate change. The mandate now is to combine measurements from the instruments on the 'A-Train' platforms (AIRS, AMSR-E, MODIS, MISR, MLS, and CloudSat) and other Earth probes to enable large-scale studies of climate change over decades. Moving to multi-sensor, long-duration analyses of important climate variables presents serious challenges for large-scale data mining and fusion. For example, one might want to compare temperature and water vapor retrievals from one instrument (AIRS) to another (MODIS), and to a model (MERRA), stratify the comparisons using a classification of the 'cloud scenes' from CloudSat, and repeat the entire analysis over 10 years of data. To efficiently assemble such datasets, we are utilizing Elastic Computing in the Cloud and parallel map/reduce-based algorithms. However, these problems are Data Intensive computing so the data transfer times and storage costs (for caching) are key issues. SciReduce is a Hadoop-like parallel analysis system, programmed in parallel python, that is designed from the ground up for Earth science. SciReduce executes inside VMWare images and scales to any number of nodes in the Cloud. Unlike Hadoop, SciReduce operates on bundles of named numeric arrays, which can be passed in memory or serialized to disk in netCDF4 or HDF5. Figure 1 shows the architecture of the full computational system, with SciReduce at the core. Multi-year datasets are automatically 'sharded' by time and space across a cluster of nodes so that years of data (millions of files) can be processed in a massively parallel way. Input variables (arrays) are pulled on-demand into the Cloud using OPeNDAP URLs or other subsetting services, thereby minimizing the size of the cached input and intermediate datasets. We are using SciReduce to automate the production of multiple versions of a ten-year A-Train water vapor climatology under a NASA MEASURES grant. We will
Parallel Computational Fluid Dynamics 2007 : Implementations and Experiences on Large Scale and Grid Computing

CERN Document Server

2009-01-01

At the 19th Annual Conference on Parallel Computational Fluid Dynamics held in Antalya, Turkey, in May 2007, the most recent developments and implementations of large-scale and grid computing were presented. This book, comprised of the invited and selected papers of this conference, details those advances, which are of particular interest to CFD and CFD-related communities. It also offers the results related to applications of various scientific and engineering problems involving flows and flow-related topics. Intended for CFD researchers and graduate students, this book is a state-of-the-art presentation of the relevant methodology and implementation techniques of large-scale computing.
SQDFT: Spectral Quadrature method for large-scale parallel O(N) Kohn-Sham calculations at high temperature

Science.gov (United States)

Suryanarayana, Phanish; Pratapa, Phanisri P.; Sharma, Abhiraj; Pask, John E.

2018-03-01

We present SQDFT: a large-scale parallel implementation of the Spectral Quadrature (SQ) method for O(N) Kohn-Sham Density Functional Theory (DFT) calculations at high temperature. Specifically, we develop an efficient and scalable finite-difference implementation of the infinite-cell Clenshaw-Curtis SQ approach, in which results for the infinite crystal are obtained by expressing quantities of interest as bilinear forms or sums of bilinear forms, that are then approximated by spatially localized Clenshaw-Curtis quadrature rules. We demonstrate the accuracy of SQDFT by showing systematic convergence of energies and atomic forces with respect to SQ parameters to reference diagonalization results, and convergence with discretization to established planewave results, for both metallic and insulating systems. We further demonstrate that SQDFT achieves excellent strong and weak parallel scaling on computer systems consisting of tens of thousands of processors, with near perfect O(N) scaling with system size and wall times as low as a few seconds per self-consistent field iteration. Finally, we verify the accuracy of SQDFT in large-scale quantum molecular dynamics simulations of aluminum at high temperature.

Large-scale parallel configuration interaction. II. Two- and four-component double-group general active space implementation with application to BiH

DEFF Research Database (Denmark)

Knecht, Stefan; Jensen, Hans Jørgen Aagaard; Fleig, Timo

2010-01-01

We present a parallel implementation of a large-scale relativistic double-group configuration interaction CIprogram. It is applicable with a large variety of two- and four-component Hamiltonians. The parallel algorithm is based on a distributed data model in combination with a static load balanci...
Parallel Dynamic Analysis of a Large-Scale Water Conveyance Tunnel under Seismic Excitation Using ALE Finite-Element Method

Directory of Open Access Journals (Sweden)

Xiaoqing Wang

2016-01-01

Full Text Available Parallel analyses about the dynamic responses of a large-scale water conveyance tunnel under seismic excitation are presented in this paper. A full three-dimensional numerical model considering the water-tunnel-soil coupling is established and adopted to investigate the tunnel’s dynamic responses. The movement and sloshing of the internal water are simulated using the multi-material Arbitrary Lagrangian Eulerian (ALE method. Nonlinear fluid–structure interaction (FSI between tunnel and inner water is treated by using the penalty method. Nonlinear soil-structure interaction (SSI between soil and tunnel is dealt with by using the surface to surface contact algorithm. To overcome computing power limitations and to deal with such a large-scale calculation, a parallel algorithm based on the modified recursive coordinate bisection (MRCB considering the balance of SSI and FSI loads is proposed and used. The whole simulation is accomplished on Dawning 5000 A using the proposed MRCB based parallel algorithm optimized to run on supercomputers. The simulation model and the proposed approaches are validated by comparison with the added mass method. Dynamic responses of the tunnel are analyzed and the parallelism is discussed. Besides, factors affecting the dynamic responses are investigated. Better speedup and parallel efficiency show the scalability of the parallel method and the analysis results can be used to aid in the design of water conveyance tunnels.
An efficient implementation of 3D high-resolution imaging for large-scale seismic data with GPU/CPU heterogeneous parallel computing

Science.gov (United States)

Xu, Jincheng; Liu, Wei; Wang, Jin; Liu, Linong; Zhang, Jianfeng

2018-02-01

De-absorption pre-stack time migration (QPSTM) compensates for the absorption and dispersion of seismic waves by introducing an effective Q parameter, thereby making it an effective tool for 3D, high-resolution imaging of seismic data. Although the optimal aperture obtained via stationary-phase migration reduces the computational cost of 3D QPSTM and yields 3D stationary-phase QPSTM, the associated computational efficiency is still the main problem in the processing of 3D, high-resolution images for real large-scale seismic data. In the current paper, we proposed a division method for large-scale, 3D seismic data to optimize the performance of stationary-phase QPSTM on clusters of graphics processing units (GPU). Then, we designed an imaging point parallel strategy to achieve an optimal parallel computing performance. Afterward, we adopted an asynchronous double buffering scheme for multi-stream to perform the GPU/CPU parallel computing. Moreover, several key optimization strategies of computation and storage based on the compute unified device architecture (CUDA) were adopted to accelerate the 3D stationary-phase QPSTM algorithm. Compared with the initial GPU code, the implementation of the key optimization steps, including thread optimization, shared memory optimization, register optimization and special function units (SFU), greatly improved the efficiency. A numerical example employing real large-scale, 3D seismic data showed that our scheme is nearly 80 times faster than the CPU-QPSTM algorithm. Our GPU/CPU heterogeneous parallel computing framework significant reduces the computational cost and facilitates 3D high-resolution imaging for large-scale seismic data.
Hierarchical approach to optimization of parallel matrix multiplication on large-scale platforms

KAUST Repository

Hasanov, Khalid; Quintin, Jean-Noë l; Lastovetsky, Alexey

2014-01-01

-scale parallelism in mind. Indeed, while in 1990s a system with few hundred cores was considered a powerful supercomputer, modern top supercomputers have millions of cores. In this paper, we present a hierarchical approach to optimization of message-passing parallel
A Robust Parallel Algorithm for Combinatorial Compressed Sensing

Science.gov (United States)

Mendoza-Smith, Rodrigo; Tanner, Jared W.; Wechsung, Florian

2018-04-01

In previous work two of the authors have shown that a vector $x \\in \\mathbb{R}^n$ with at most $k Parallel-$\\ell_0$ decoding algorithm, where $\\mathrm{nnz}(A)$ denotes the number of nonzero entries in $A \\in \\mathbb{R}^{m \\times n}$. In this paper we present the Robust-$\\ell_0$ decoding algorithm, which robustifies Parallel-$\\ell_0$ when the sketch $Ax$ is corrupted by additive noise. This robustness is achieved by approximating the asymptotic posterior distribution of values in the sketch given its corrupted measurements. We provide analytic expressions that approximate these posteriors under the assumptions that the nonzero entries in the signal and the noise are drawn from continuous distributions. Numerical experiments presented show that Robust-$\\ell_0$ is superior to existing greedy and combinatorial compressed sensing algorithms in the presence of small to moderate signal-to-noise ratios in the setting of Gaussian signals and Gaussian additive noise.
Large-scale modeling of epileptic seizures: scaling properties of two parallel neuronal network simulation algorithms.

Science.gov (United States)

Pesce, Lorenzo L; Lee, Hyong C; Hereld, Mark; Visser, Sid; Stevens, Rick L; Wildeman, Albert; van Drongelen, Wim

2013-01-01

Our limited understanding of the relationship between the behavior of individual neurons and large neuronal networks is an important limitation in current epilepsy research and may be one of the main causes of our inadequate ability to treat it. Addressing this problem directly via experiments is impossibly complex; thus, we have been developing and studying medium-large-scale simulations of detailed neuronal networks to guide us. Flexibility in the connection schemas and a complete description of the cortical tissue seem necessary for this purpose. In this paper we examine some of the basic issues encountered in these multiscale simulations. We have determined the detailed behavior of two such simulators on parallel computer systems. The observed memory and computation-time scaling behavior for a distributed memory implementation were very good over the range studied, both in terms of network sizes (2,000 to 400,000 neurons) and processor pool sizes (1 to 256 processors). Our simulations required between a few megabytes and about 150 gigabytes of RAM and lasted between a few minutes and about a week, well within the capability of most multinode clusters. Therefore, simulations of epileptic seizures on networks with millions of cells should be feasible on current supercomputers.
Large-Scale Modeling of Epileptic Seizures: Scaling Properties of Two Parallel Neuronal Network Simulation Algorithms

Directory of Open Access Journals (Sweden)

Lorenzo L. Pesce

2013-01-01

Full Text Available Our limited understanding of the relationship between the behavior of individual neurons and large neuronal networks is an important limitation in current epilepsy research and may be one of the main causes of our inadequate ability to treat it. Addressing this problem directly via experiments is impossibly complex; thus, we have been developing and studying medium-large-scale simulations of detailed neuronal networks to guide us. Flexibility in the connection schemas and a complete description of the cortical tissue seem necessary for this purpose. In this paper we examine some of the basic issues encountered in these multiscale simulations. We have determined the detailed behavior of two such simulators on parallel computer systems. The observed memory and computation-time scaling behavior for a distributed memory implementation were very good over the range studied, both in terms of network sizes (2,000 to 400,000 neurons and processor pool sizes (1 to 256 processors. Our simulations required between a few megabytes and about 150 gigabytes of RAM and lasted between a few minutes and about a week, well within the capability of most multinode clusters. Therefore, simulations of epileptic seizures on networks with millions of cells should be feasible on current supercomputers.
MOCC: A Fast and Robust Correlation-Based Method for Interest Point Matching under Large Scale Changes

Science.gov (United States)

Zhao, Feng; Huang, Qingming; Wang, Hao; Gao, Wen

2010-12-01

Similarity measures based on correlation have been used extensively for matching tasks. However, traditional correlation-based image matching methods are sensitive to rotation and scale changes. This paper presents a fast correlation-based method for matching two images with large rotation and significant scale changes. Multiscale oriented corner correlation (MOCC) is used to evaluate the degree of similarity between the feature points. The method is rotation invariant and capable of matching image pairs with scale changes up to a factor of 7. Moreover, MOCC is much faster in comparison with the state-of-the-art matching methods. Experimental results on real images show the robustness and effectiveness of the proposed method.
MOCC: A Fast and Robust Correlation-Based Method for Interest Point Matching under Large Scale Changes

Directory of Open Access Journals (Sweden)

Wang Hao

2010-01-01

Full Text Available Similarity measures based on correlation have been used extensively for matching tasks. However, traditional correlation-based image matching methods are sensitive to rotation and scale changes. This paper presents a fast correlation-based method for matching two images with large rotation and significant scale changes. Multiscale oriented corner correlation (MOCC is used to evaluate the degree of similarity between the feature points. The method is rotation invariant and capable of matching image pairs with scale changes up to a factor of 7. Moreover, MOCC is much faster in comparison with the state-of-the-art matching methods. Experimental results on real images show the robustness and effectiveness of the proposed method.
Node-based finite element method for large-scale adaptive fluid analysis in parallel environments

International Nuclear Information System (INIS)

Toshimitsu, Fujisawa; Genki, Yagawa

2003-01-01

In this paper, a FEM-based (finite element method) mesh free method with a probabilistic node generation technique is presented. In the proposed method, all computational procedures, from the mesh generation to the solution of a system of equations, can be performed fluently in parallel in terms of nodes. Local finite element mesh is generated robustly around each node, even for harsh boundary shapes such as cracks. The algorithm and the data structure of finite element calculation are based on nodes, and parallel computing is realized by dividing a system of equations by the row of the global coefficient matrix. In addition, the node-based finite element method is accompanied by a probabilistic node generation technique, which generates good-natured points for nodes of finite element mesh. Furthermore, the probabilistic node generation technique can be performed in parallel environments. As a numerical example of the proposed method, we perform a compressible flow simulation containing strong shocks. Numerical simulations with frequent mesh refinement, which are required for such kind of analysis, can effectively be performed on parallel processors by using the proposed method. (authors)
Node-based finite element method for large-scale adaptive fluid analysis in parallel environments

Energy Technology Data Exchange (ETDEWEB)

Toshimitsu, Fujisawa [Tokyo Univ., Collaborative Research Center of Frontier Simulation Software for Industrial Science, Institute of Industrial Science (Japan); Genki, Yagawa [Tokyo Univ., Department of Quantum Engineering and Systems Science (Japan)

2003-07-01

In this paper, a FEM-based (finite element method) mesh free method with a probabilistic node generation technique is presented. In the proposed method, all computational procedures, from the mesh generation to the solution of a system of equations, can be performed fluently in parallel in terms of nodes. Local finite element mesh is generated robustly around each node, even for harsh boundary shapes such as cracks. The algorithm and the data structure of finite element calculation are based on nodes, and parallel computing is realized by dividing a system of equations by the row of the global coefficient matrix. In addition, the node-based finite element method is accompanied by a probabilistic node generation technique, which generates good-natured points for nodes of finite element mesh. Furthermore, the probabilistic node generation technique can be performed in parallel environments. As a numerical example of the proposed method, we perform a compressible flow simulation containing strong shocks. Numerical simulations with frequent mesh refinement, which are required for such kind of analysis, can effectively be performed on parallel processors by using the proposed method. (authors)
Robust Parallel Machine Scheduling Problem with Uncertainties and Sequence-Dependent Setup Time

Directory of Open Access Journals (Sweden)

Hongtao Hu

2016-01-01

Full Text Available A parallel machine scheduling problem in plastic production is studied in this paper. In this problem, the processing time and arrival time are uncertain but lie in their respective intervals. In addition, each job must be processed together with a mold while jobs which belong to one family can share the same mold. Therefore, time changing mold is required for two consecutive jobs that belong to different families, which is known as sequence-dependent setup time. This paper aims to identify a robust schedule by min–max regret criterion. It is proved that the scenario incurring maximal regret for each feasible solution lies in finite extreme scenarios. A mixed integer linear programming formulation and an exact algorithm are proposed to solve the problem. Moreover, a modified artificial bee colony algorithm is developed to solve large-scale problems. The performance of the presented algorithm is evaluated through extensive computational experiments and the results show that the proposed algorithm surpasses the exact method in terms of objective value and computational time.
Eighth SIAM conference on parallel processing for scientific computing: Final program and abstracts

Energy Technology Data Exchange (ETDEWEB)

NONE

1997-12-31

This SIAM conference is the premier forum for developments in parallel numerical algorithms, a field that has seen very lively and fruitful developments over the past decade, and whose health is still robust. Themes for this conference were: combinatorial optimization; data-parallel languages; large-scale parallel applications; message-passing; molecular modeling; parallel I/O; parallel libraries; parallel software tools; parallel compilers; particle simulations; problem-solving environments; and sparse matrix computations.
Parallelization of a beam dynamics code and first large scale radio frequency quadrupole simulations

Directory of Open Access Journals (Sweden)

J. Xu

2007-01-01

Full Text Available The design and operation support of hadron (proton and heavy-ion linear accelerators require substantial use of beam dynamics simulation tools. The beam dynamics code TRACK has been originally developed at Argonne National Laboratory (ANL to fulfill the special requirements of the rare isotope accelerator (RIA accelerator systems. From the beginning, the code has been developed to make it useful in the three stages of a linear accelerator project, namely, the design, commissioning, and operation of the machine. To realize this concept, the code has unique features such as end-to-end simulations from the ion source to the final beam destination and automatic procedures for tuning of a multiple charge state heavy-ion beam. The TRACK code has become a general beam dynamics code for hadron linacs and has found wide applications worldwide. Until recently, the code has remained serial except for a simple parallelization used for the simulation of multiple seeds to study the machine errors. To speed up computation, the TRACK Poisson solver has been parallelized. This paper discusses different parallel models for solving the Poisson equation with the primary goal to extend the scalability of the code onto 1024 and more processors of the new generation of supercomputers known as BlueGene (BG/L. Domain decomposition techniques have been adapted and incorporated into the parallel version of the TRACK code. To demonstrate the new capabilities of the parallelized TRACK code, the dynamics of a 45 mA proton beam represented by 10^{8} particles has been simulated through the 325 MHz radio frequency quadrupole and initial accelerator section of the proposed FNAL proton driver. The results show the benefits and advantages of large-scale parallel computing in beam dynamics simulations.
Real-time simulation of large-scale floods

Science.gov (United States)

Liu, Q.; Qin, Y.; Li, G. D.; Liu, Z.; Cheng, D. J.; Zhao, Y. H.

2016-08-01

According to the complex real-time water situation, the real-time simulation of large-scale floods is very important for flood prevention practice. Model robustness and running efficiency are two critical factors in successful real-time flood simulation. This paper proposed a robust, two-dimensional, shallow water model based on the unstructured Godunov- type finite volume method. A robust wet/dry front method is used to enhance the numerical stability. An adaptive method is proposed to improve the running efficiency. The proposed model is used for large-scale flood simulation on real topography. Results compared to those of MIKE21 show the strong performance of the proposed model.
Mapping robust parallel multigrid algorithms to scalable memory architectures

Science.gov (United States)

Overman, Andrea; Vanrosendale, John

1993-01-01

The convergence rate of standard multigrid algorithms degenerates on problems with stretched grids or anisotropic operators. The usual cure for this is the use of line or plane relaxation. However, multigrid algorithms based on line and plane relaxation have limited and awkward parallelism and are quite difficult to map effectively to highly parallel architectures. Newer multigrid algorithms that overcome anisotropy through the use of multiple coarse grids rather than relaxation are better suited to massively parallel architectures because they require only simple point-relaxation smoothers. In this paper, we look at the parallel implementation of a V-cycle multiple semicoarsened grid (MSG) algorithm on distributed-memory architectures such as the Intel iPSC/860 and Paragon computers. The MSG algorithms provide two levels of parallelism: parallelism within the relaxation or interpolation on each grid and across the grids on each multigrid level. Both levels of parallelism must be exploited to map these algorithms effectively to parallel architectures. This paper describes a mapping of an MSG algorithm to distributed-memory architectures that demonstrates how both levels of parallelism can be exploited. The result is a robust and effective multigrid algorithm for distributed-memory machines.
Distributed parallel cooperative coevolutionary multi-objective large-scale immune algorithm for deployment of wireless sensor networks

DEFF Research Database (Denmark)

Cao, Bin; Zhao, Jianwei; Yang, Po

2018-01-01

-objective evolutionary algorithms the Cooperative Coevolutionary Generalized Differential Evolution 3, the Cooperative Multi-objective Differential Evolution and the Nondominated Sorting Genetic Algorithm III, the proposed algorithm addresses the deployment optimization problem efficiently and effectively.......Using immune algorithms is generally a time-intensive process especially for problems with a large number of variables. In this paper, we propose a distributed parallel cooperative coevolutionary multi-objective large-scale immune algorithm that is implemented using the message passing interface...... (MPI). The proposed algorithm is composed of three layers: objective, group and individual layers. First, for each objective in the multi-objective problem to be addressed, a subpopulation is used for optimization, and an archive population is used to optimize all the objectives. Second, the large...
A Decentralized Multivariable Robust Adaptive Voltage and Speed Regulator for Large-Scale Power Systems

Science.gov (United States)

Okou, Francis A.; Akhrif, Ouassima; Dessaint, Louis A.; Bouchard, Derrick

2013-05-01

This papter introduces a decentralized multivariable robust adaptive voltage and frequency regulator to ensure the stability of large-scale interconnnected generators. Interconnection parameters (i.e. load, line and transormer parameters) are assumed to be unknown. The proposed design approach requires the reformulation of conventiaonal power system models into a multivariable model with generator terminal voltages as state variables, and excitation and turbine valve inputs as control signals. This model, while suitable for the application of modern control methods, introduces problems with regards to current design techniques for large-scale systems. Interconnection terms, which are treated as perturbations, do not meet the common matching condition assumption. A new adaptive method for a certain class of large-scale systems is therefore introduces that does not require the matching condition. The proposed controller consists of nonlinear inputs that cancel some nonlinearities of the model. Auxiliary controls with linear and nonlinear components are used to stabilize the system. They compensate unknown parametes of the model by updating both the nonlinear component gains and excitation parameters. The adaptation algorithms involve the sigma-modification approach for auxiliary control gains, and the projection approach for excitation parameters to prevent estimation drift. The computation of the matrix-gain of the controller linear component requires the resolution of an algebraic Riccati equation and helps to solve the perturbation-mismatching problem. A realistic power system is used to assess the proposed controller performance. The results show that both stability and transient performance are considerably improved following a severe contingency.
Secure access control and large scale robust representation for online multimedia event detection.

Science.gov (United States)

Liu, Changyu; Lu, Bin; Li, Huiling

2014-01-01

We developed an online multimedia event detection (MED) system. However, there are a secure access control issue and a large scale robust representation issue when we want to integrate traditional event detection algorithms into the online environment. For the first issue, we proposed a tree proxy-based and service-oriented access control (TPSAC) model based on the traditional role based access control model. Verification experiments were conducted on the CloudSim simulation platform, and the results showed that the TPSAC model is suitable for the access control of dynamic online environments. For the second issue, inspired by the object-bank scene descriptor, we proposed a 1000-object-bank (1000OBK) event descriptor. Feature vectors of the 1000OBK were extracted from response pyramids of 1000 generic object detectors which were trained on standard annotated image datasets, such as the ImageNet dataset. A spatial bag of words tiling approach was then adopted to encode these feature vectors for bridging the gap between the objects and events. Furthermore, we performed experiments in the context of event classification on the challenging TRECVID MED 2012 dataset, and the results showed that the robust 1000OBK event descriptor outperforms the state-of-the-art approaches.
Secure Access Control and Large Scale Robust Representation for Online Multimedia Event Detection

Directory of Open Access Journals (Sweden)

Changyu Liu

2014-01-01

Full Text Available We developed an online multimedia event detection (MED system. However, there are a secure access control issue and a large scale robust representation issue when we want to integrate traditional event detection algorithms into the online environment. For the first issue, we proposed a tree proxy-based and service-oriented access control (TPSAC model based on the traditional role based access control model. Verification experiments were conducted on the CloudSim simulation platform, and the results showed that the TPSAC model is suitable for the access control of dynamic online environments. For the second issue, inspired by the object-bank scene descriptor, we proposed a 1000-object-bank (1000OBK event descriptor. Feature vectors of the 1000OBK were extracted from response pyramids of 1000 generic object detectors which were trained on standard annotated image datasets, such as the ImageNet dataset. A spatial bag of words tiling approach was then adopted to encode these feature vectors for bridging the gap between the objects and events. Furthermore, we performed experiments in the context of event classification on the challenging TRECVID MED 2012 dataset, and the results showed that the robust 1000OBK event descriptor outperforms the state-of-the-art approaches.

A novel two-level dynamic parallel data scheme for large 3-D SN calculations

International Nuclear Information System (INIS)

Sjoden, G.E.; Shedlock, D.; Haghighat, A.; Yi, C.

2005-01-01

We introduce a new dynamic parallel memory optimization scheme for executing large scale 3-D discrete ordinates (Sn) simulations on distributed memory parallel computers. In order for parallel transport codes to be truly scalable, they must use parallel data storage, where only the variables that are locally computed are locally stored. Even with parallel data storage for the angular variables, cumulative storage requirements for large discrete ordinates calculations can be prohibitive. To address this problem, Memory Tuning has been implemented into the PENTRAN 3-D parallel discrete ordinates code as an optimized, two-level ('large' array, 'small' array) parallel data storage scheme. Memory Tuning can be described as the process of parallel data memory optimization. Memory Tuning dynamically minimizes the amount of required parallel data in allocated memory on each processor using a statistical sampling algorithm. This algorithm is based on the integral average and standard deviation of the number of fine meshes contained in each coarse mesh in the global problem. Because PENTRAN only stores the locally computed problem phase space, optimal two-level memory assignments can be unique on each node, depending upon the parallel decomposition used (hybrid combinations of angular, energy, or spatial). As demonstrated in the two large discrete ordinates models presented (a storage cask and an OECD MOX Benchmark), Memory Tuning can save a substantial amount of memory per parallel processor, allowing one to accomplish very large scale Sn computations. (authors)
Parallel simulation of tsunami inundation on a large-scale supercomputer

Science.gov (United States)

Oishi, Y.; Imamura, F.; Sugawara, D.

2013-12-01

An accurate prediction of tsunami inundation is important for disaster mitigation purposes. One approach is to approximate the tsunami wave source through an instant inversion analysis using real-time observation data (e.g., Tsushima et al., 2009) and then use the resulting wave source data in an instant tsunami inundation simulation. However, a bottleneck of this approach is the large computational cost of the non-linear inundation simulation and the computational power of recent massively parallel supercomputers is helpful to enable faster than real-time execution of a tsunami inundation simulation. Parallel computers have become approximately 1000 times faster in 10 years (www.top500.org), and so it is expected that very fast parallel computers will be more and more prevalent in the near future. Therefore, it is important to investigate how to efficiently conduct a tsunami simulation on parallel computers. In this study, we are targeting very fast tsunami inundation simulations on the K computer, currently the fastest Japanese supercomputer, which has a theoretical peak performance of 11.2 PFLOPS. One computing node of the K computer consists of 1 CPU with 8 cores that share memory, and the nodes are connected through a high-performance torus-mesh network. The K computer is designed for distributed-memory parallel computation, so we have developed a parallel tsunami model. Our model is based on TUNAMI-N2 model of Tohoku University, which is based on a leap-frog finite difference method. A grid nesting scheme is employed to apply high-resolution grids only at the coastal regions. To balance the computation load of each CPU in the parallelization, CPUs are first allocated to each nested layer in proportion to the number of grid points of the nested layer. Using CPUs allocated to each layer, 1-D domain decomposition is performed on each layer. In the parallel computation, three types of communication are necessary: (1) communication to adjacent neighbours for the
Parallel integer sorting with medium and fine-scale parallelism

Science.gov (United States)

Dagum, Leonardo

1993-01-01

Two new parallel integer sorting algorithms, queue-sort and barrel-sort, are presented and analyzed in detail. These algorithms do not have optimal parallel complexity, yet they show very good performance in practice. Queue-sort designed for fine-scale parallel architectures which allow the queueing of multiple messages to the same destination. Barrel-sort is designed for medium-scale parallel architectures with a high message passing overhead. The performance results from the implementation of queue-sort on a Connection Machine CM-2 and barrel-sort on a 128 processor iPSC/860 are given. The two implementations are found to be comparable in performance but not as good as a fully vectorized bucket sort on the Cray YMP.
Parallel continuous simulated tempering and its applications in large-scale molecular simulations

Energy Technology Data Exchange (ETDEWEB)

Zang, Tianwu; Yu, Linglin; Zhang, Chong [Applied Physics Program and Department of Bioengineering, Rice University, Houston, Texas 77005 (United States); Ma, Jianpeng, E-mail: jpma@bcm.tmc.edu [Applied Physics Program and Department of Bioengineering, Rice University, Houston, Texas 77005 (United States); Verna and Marrs McLean Department of Biochemistry and Molecular Biology, Baylor College of Medicine, One Baylor Plaza, BCM-125, Houston, Texas 77030 (United States)

2014-07-28

In this paper, we introduce a parallel continuous simulated tempering (PCST) method for enhanced sampling in studying large complex systems. It mainly inherits the continuous simulated tempering (CST) method in our previous studies [C. Zhang and J. Ma, J. Chem. Phys. 130, 194112 (2009); C. Zhang and J. Ma, J. Chem. Phys. 132, 244101 (2010)], while adopts the spirit of parallel tempering (PT), or replica exchange method, by employing multiple copies with different temperature distributions. Differing from conventional PT methods, despite the large stride of total temperature range, the PCST method requires very few copies of simulations, typically 2–3 copies, yet it is still capable of maintaining a high rate of exchange between neighboring copies. Furthermore, in PCST method, the size of the system does not dramatically affect the number of copy needed because the exchange rate is independent of total potential energy, thus providing an enormous advantage over conventional PT methods in studying very large systems. The sampling efficiency of PCST was tested in two-dimensional Ising model, Lennard-Jones liquid and all-atom folding simulation of a small globular protein trp-cage in explicit solvent. The results demonstrate that the PCST method significantly improves sampling efficiency compared with other methods and it is particularly effective in simulating systems with long relaxation time or correlation time. We expect the PCST method to be a good alternative to parallel tempering methods in simulating large systems such as phase transition and dynamics of macromolecules in explicit solvent.
Large-scale self-assembled zirconium phosphate smectic layers via a simple spray-coating process

Science.gov (United States)

Wong, Minhao; Ishige, Ryohei; White, Kevin L.; Li, Peng; Kim, Daehak; Krishnamoorti, Ramanan; Gunther, Robert; Higuchi, Takeshi; Jinnai, Hiroshi; Takahara, Atsushi; Nishimura, Riichi; Sue, Hung-Jue

2014-04-01

The large-scale assembly of asymmetric colloidal particles is used in creating high-performance fibres. A similar concept is extended to the manufacturing of thin films of self-assembled two-dimensional crystal-type materials with enhanced and tunable properties. Here we present a spray-coating method to manufacture thin, flexible and transparent epoxy films containing zirconium phosphate nanoplatelets self-assembled into a lamellar arrangement aligned parallel to the substrate. The self-assembled mesophase of zirconium phosphate nanoplatelets is stabilized by epoxy pre-polymer and exhibits rheology favourable towards large-scale manufacturing. The thermally cured film forms a mechanically robust coating and shows excellent gas barrier properties at both low- and high humidity levels as a result of the highly aligned and overlapping arrangement of nanoplatelets. This work shows that the large-scale ordering of high aspect ratio nanoplatelets is easier to achieve than previously thought and may have implications in the technological applications for similar materials.
Fast Simulation of Large-Scale Floods Based on GPU Parallel Computing

OpenAIRE

Qiang Liu; Yi Qin; Guodong Li

2018-01-01

Computing speed is a significant issue of large-scale flood simulations for real-time response to disaster prevention and mitigation. Even today, most of the large-scale flood simulations are generally run on supercomputers due to the massive amounts of data and computations necessary. In this work, a two-dimensional shallow water model based on an unstructured Godunov-type finite volume scheme was proposed for flood simulation. To realize a fast simulation of large-scale floods on a personal...
Decentralized control of large-scale systems: Fixed modes, sensitivity and parametric robustness. Ph.D. Thesis - Universite Paul Sabatier, 1985

Science.gov (United States)

Tarras, A.

1987-01-01

The problem of stabilization/pole placement under structural constraints of large scale linear systems is discussed. The existence of a solution to this problem is expressed in terms of fixed modes. The aim is to provide a bibliographic survey of the available results concerning the fixed modes (characterization, elimination, control structure selection to avoid them, control design in their absence) and to present the author's contribution to this problem which can be summarized by the use of the mode sensitivity concept to detect or to avoid them, the use of vibrational control to stabilize them, and the addition of parametric robustness considerations to design an optimal decentralized robust control.
Parallel Algorithm for Incremental Betweenness Centrality on Large Graphs

KAUST Repository

Jamour, Fuad Tarek

2017-10-17

Betweenness centrality quantifies the importance of nodes in a graph in many applications, including network analysis, community detection and identification of influential users. Typically, graphs in such applications evolve over time. Thus, the computation of betweenness centrality should be performed incrementally. This is challenging because updating even a single edge may trigger the computation of all-pairs shortest paths in the entire graph. Existing approaches cannot scale to large graphs: they either require excessive memory (i.e., quadratic to the size of the input graph) or perform unnecessary computations rendering them prohibitively slow. We propose iCentral; a novel incremental algorithm for computing betweenness centrality in evolving graphs. We decompose the graph into biconnected components and prove that processing can be localized within the affected components. iCentral is the first algorithm to support incremental betweeness centrality computation within a graph component. This is done efficiently, in linear space; consequently, iCentral scales to large graphs. We demonstrate with real datasets that the serial implementation of iCentral is up to 3.7 times faster than existing serial methods. Our parallel implementation that scales to large graphs, is an order of magnitude faster than the state-of-the-art parallel algorithm, while using an order of magnitude less computational resources.
MapReduce Based Parallel Neural Networks in Enabling Large Scale Machine Learning.

Science.gov (United States)

Liu, Yang; Yang, Jie; Huang, Yuan; Xu, Lixiong; Li, Siguang; Qi, Man

2015-01-01

Artificial neural networks (ANNs) have been widely used in pattern recognition and classification applications. However, ANNs are notably slow in computation especially when the size of data is large. Nowadays, big data has received a momentum from both industry and academia. To fulfill the potentials of ANNs for big data applications, the computation process must be speeded up. For this purpose, this paper parallelizes neural networks based on MapReduce, which has become a major computing model to facilitate data intensive applications. Three data intensive scenarios are considered in the parallelization process in terms of the volume of classification data, the size of the training data, and the number of neurons in the neural network. The performance of the parallelized neural networks is evaluated in an experimental MapReduce computer cluster from the aspects of accuracy in classification and efficiency in computation.
Large-scale computing with Quantum Espresso

International Nuclear Information System (INIS)

Giannozzi, P.; Cavazzoni, C.

2009-01-01

This paper gives a short introduction to Quantum Espresso: a distribution of software for atomistic simulations in condensed-matter physics, chemical physics, materials science, and to its usage in large-scale parallel computing.
MapReduce Based Parallel Neural Networks in Enabling Large Scale Machine Learning

Directory of Open Access Journals (Sweden)

Yang Liu

2015-01-01

Full Text Available Artificial neural networks (ANNs have been widely used in pattern recognition and classification applications. However, ANNs are notably slow in computation especially when the size of data is large. Nowadays, big data has received a momentum from both industry and academia. To fulfill the potentials of ANNs for big data applications, the computation process must be speeded up. For this purpose, this paper parallelizes neural networks based on MapReduce, which has become a major computing model to facilitate data intensive applications. Three data intensive scenarios are considered in the parallelization process in terms of the volume of classification data, the size of the training data, and the number of neurons in the neural network. The performance of the parallelized neural networks is evaluated in an experimental MapReduce computer cluster from the aspects of accuracy in classification and efficiency in computation.
Parallel Computing in SCALE

International Nuclear Information System (INIS)

DeHart, Mark D.; Williams, Mark L.; Bowman, Stephen M.

2010-01-01

The SCALE computational architecture has remained basically the same since its inception 30 years ago, although constituent modules and capabilities have changed significantly. This SCALE concept was intended to provide a framework whereby independent codes can be linked to provide a more comprehensive capability than possible with the individual programs - allowing flexibility to address a wide variety of applications. However, the current system was designed originally for mainframe computers with a single CPU and with significantly less memory than today's personal computers. It has been recognized that the present SCALE computation system could be restructured to take advantage of modern hardware and software capabilities, while retaining many of the modular features of the present system. Preliminary work is being done to define specifications and capabilities for a more advanced computational architecture. This paper describes the state of current SCALE development activities and plans for future development. With the release of SCALE 6.1 in 2010, a new phase of evolutionary development will be available to SCALE users within the TRITON and NEWT modules. The SCALE (Standardized Computer Analyses for Licensing Evaluation) code system developed by Oak Ridge National Laboratory (ORNL) provides a comprehensive and integrated package of codes and nuclear data for a wide range of applications in criticality safety, reactor physics, shielding, isotopic depletion and decay, and sensitivity/uncertainty (S/U) analysis. Over the last three years, since the release of version 5.1 in 2006, several important new codes have been introduced within SCALE, and significant advances applied to existing codes. Many of these new features became available with the release of SCALE 6.0 in early 2009. However, beginning with SCALE 6.1, a first generation of parallel computing is being introduced. In addition to near-term improvements, a plan for longer term SCALE enhancement
Robust stability analysis of large power systems using the structured singular value theory

Energy Technology Data Exchange (ETDEWEB)

Castellanos, R.; Sarmiento, H. [Instituto de Investigaciones Electricas, Cuernavaca, Morelos (Mexico); Messina, A.R. [Cinvestav, Graduate Program in Electrical Engineering, Guadalajara, Jalisco (Mexico)

2005-07-01

This paper examines the application of structured singular value (SSV) theory to analyse robust stability of complex power systems with respect to a set of structured uncertainties. Based on SSV theory and the frequency sweep method, techniques for robust analysis of large-scale power systems are developed. The main interest is focused on determining robust stability for varying operating conditions and uncertainties in the structure of the power system. The applicability of the proposed techniques is verified through simulation studies on a large-scale power system. In particular, results for the system are considered for a wide range of uncertainties of operating conditions. Specifically, the developed technique is used to estimate the effect of variations in the parameters of a major system inter-tie on the nominal stability of a critical inter-area mode. (Author)
Topology Optimization of Large Scale Stokes Flow Problems

DEFF Research Database (Denmark)

Aage, Niels; Poulsen, Thomas Harpsøe; Gersborg-Hansen, Allan

2008-01-01

This note considers topology optimization of large scale 2D and 3D Stokes flow problems using parallel computations. We solve problems with up to 1.125.000 elements in 2D and 128.000 elements in 3D on a shared memory computer consisting of Sun UltraSparc IV CPUs.......This note considers topology optimization of large scale 2D and 3D Stokes flow problems using parallel computations. We solve problems with up to 1.125.000 elements in 2D and 128.000 elements in 3D on a shared memory computer consisting of Sun UltraSparc IV CPUs....
A concurrent visualization system for large-scale unsteady simulations. Parallel vector performance on an NEC SX-4

International Nuclear Information System (INIS)

Takei, Toshifumi; Doi, Shun; Matsumoto, Hideki; Muramatsu, Kazuhiro

2000-01-01

We have developed a concurrent visualization system RVSLIB (Real-time Visual Simulation Library). This paper shows the effectiveness of the system when it is applied to large-scale unsteady simulations, for which the conventional post-processing approach may no longer work, on high-performance parallel vector supercomputers. The system performs almost all of the visualization tasks on a computation server and uses compressed visualized image data for efficient communication between the server and the user terminal. We have introduced several techniques, including vectorization and parallelization, into the system to minimize the computational costs of the visualization tools. The performance of RVSLIB was evaluated by using an actual CFD code on an NEC SX-4. The computational time increase due to the concurrent visualization was at most 3% for a smaller (1.6 million) grid and less than 1% for a larger (6.2 million) one. (author)
Concurrent Programming Using Actors: Exploiting Large-Scale Parallelism,

Science.gov (United States)

1985-10-07

ORGANIZATION NAME AND ADDRESS 10. PROGRAM ELEMENT. PROJECT. TASK* Artificial Inteligence Laboratory AREA Is WORK UNIT NUMBERS 545 Technology Square...D-R162 422 CONCURRENT PROGRMMIZNG USING f"OS XL?ITP TEH l’ LARGE-SCALE PARALLELISH(U) NASI AC E Al CAMBRIDGE ARTIFICIAL INTELLIGENCE L. G AGHA ET AL...RESOLUTION TEST CHART N~ATIONAL BUREAU OF STANDA.RDS - -96 A -E. __ _ __ __’ .,*- - -- •. - MASSACHUSETTS INSTITUTE OF TECHNOLOGY ARTIFICIAL
Robust AIC with High Breakdown Scale Estimate

Directory of Open Access Journals (Sweden)

Shokrya Saleh

2014-01-01

Full Text Available Akaike Information Criterion (AIC based on least squares (LS regression minimizes the sum of the squared residuals; LS is sensitive to outlier observations. Alternative criterion, which is less sensitive to outlying observation, has been proposed; examples are robust AIC (RAIC, robust Mallows Cp (RCp, and robust Bayesian information criterion (RBIC. In this paper, we propose a robust AIC by replacing the scale estimate with a high breakdown point estimate of scale. The robustness of the proposed methods is studied through its influence function. We show that, the proposed robust AIC is effective in selecting accurate models in the presence of outliers and high leverage points, through simulated and real data examples.
Parallel Index and Query for Large Scale Data Analysis

Energy Technology Data Exchange (ETDEWEB)

Chou, Jerry; Wu, Kesheng; Ruebel, Oliver; Howison, Mark; Qiang, Ji; Prabhat,; Austin, Brian; Bethel, E. Wes; Ryne, Rob D.; Shoshani, Arie

2011-07-18

Modern scientific datasets present numerous data management and analysis challenges. State-of-the-art index and query technologies are critical for facilitating interactive exploration of large datasets, but numerous challenges remain in terms of designing a system for process- ing general scientific datasets. The system needs to be able to run on distributed multi-core platforms, efficiently utilize underlying I/O infrastructure, and scale to massive datasets. We present FastQuery, a novel software framework that address these challenges. FastQuery utilizes a state-of-the-art index and query technology (FastBit) and is designed to process mas- sive datasets on modern supercomputing platforms. We apply FastQuery to processing of a massive 50TB dataset generated by a large scale accelerator modeling code. We demonstrate the scalability of the tool to 11,520 cores. Motivated by the scientific need to search for inter- esting particles in this dataset, we use our framework to reduce search time from hours to tens of seconds.
Large-scale Intelligent Transporation Systems simulation

Energy Technology Data Exchange (ETDEWEB)

Ewing, T.; Canfield, T.; Hannebutte, U.; Levine, D.; Tentner, A.

1995-06-01

A prototype computer system has been developed which defines a high-level architecture for a large-scale, comprehensive, scalable simulation of an Intelligent Transportation System (ITS) capable of running on massively parallel computers and distributed (networked) computer systems. The prototype includes the modelling of instrumented ``smart`` vehicles with in-vehicle navigation units capable of optimal route planning and Traffic Management Centers (TMC). The TMC has probe vehicle tracking capabilities (display position and attributes of instrumented vehicles), and can provide 2-way interaction with traffic to provide advisories and link times. Both the in-vehicle navigation module and the TMC feature detailed graphical user interfaces to support human-factors studies. The prototype has been developed on a distributed system of networked UNIX computers but is designed to run on ANL`s IBM SP-X parallel computer system for large scale problems. A novel feature of our design is that vehicles will be represented by autonomus computer processes, each with a behavior model which performs independent route selection and reacts to external traffic events much like real vehicles. With this approach, one will be able to take advantage of emerging massively parallel processor (MPP) systems.
Optical technologies for data communication in large parallel systems

International Nuclear Information System (INIS)

Ritter, M B; Vlasov, Y; Kash, J A; Benner, A

2011-01-01

Large, parallel systems have greatly aided scientific computation and data collection, but performance scaling now relies on chip and system-level parallelism. This has happened because power density limits have caused processor frequency growth to stagnate, driving the new multi-core architecture paradigm, which would seem to provide generations of performance increases as transistors scale. However, this paradigm will be constrained by electrical I/O bandwidth limits; first off the processor card, then off the processor module itself. We will present best-estimates of these limits, then show how optical technologies can help provide more bandwidth to allow continued system scaling. We will describe the current status of optical transceiver technology which is already being used to exceed off-board electrical bandwidth limits, then present work on silicon nanophotonic transceivers and 3D integration technologies which, taken together, promise to allow further increases in off-module and off-card bandwidth. Finally, we will show estimated limits of nanophotonic links and discuss breakthroughs that are needed for further progress, and will speculate on whether we will reach Exascale-class machine performance at affordable powers.

Optical technologies for data communication in large parallel systems

Energy Technology Data Exchange (ETDEWEB)

Ritter, M B; Vlasov, Y; Kash, J A [IBM T.J. Watson Research Center, Yorktown Heights, NY (United States); Benner, A, E-mail: mritter@us.ibm.com [IBM Poughkeepsie, Poughkeepsie, NY (United States)

2011-01-15

Large, parallel systems have greatly aided scientific computation and data collection, but performance scaling now relies on chip and system-level parallelism. This has happened because power density limits have caused processor frequency growth to stagnate, driving the new multi-core architecture paradigm, which would seem to provide generations of performance increases as transistors scale. However, this paradigm will be constrained by electrical I/O bandwidth limits; first off the processor card, then off the processor module itself. We will present best-estimates of these limits, then show how optical technologies can help provide more bandwidth to allow continued system scaling. We will describe the current status of optical transceiver technology which is already being used to exceed off-board electrical bandwidth limits, then present work on silicon nanophotonic transceivers and 3D integration technologies which, taken together, promise to allow further increases in off-module and off-card bandwidth. Finally, we will show estimated limits of nanophotonic links and discuss breakthroughs that are needed for further progress, and will speculate on whether we will reach Exascale-class machine performance at affordable powers.
Regional-scale calculation of the LS factor using parallel processing

Science.gov (United States)

Liu, Kai; Tang, Guoan; Jiang, Ling; Zhu, A.-Xing; Yang, Jianyi; Song, Xiaodong

2015-05-01

With the increase of data resolution and the increasing application of USLE over large areas, the existing serial implementation of algorithms for computing the LS factor is becoming a bottleneck. In this paper, a parallel processing model based on message passing interface (MPI) is presented for the calculation of the LS factor, so that massive datasets at a regional scale can be processed efficiently. The parallel model contains algorithms for calculating flow direction, flow accumulation, drainage network, slope, slope length and the LS factor. According to the existence of data dependence, the algorithms are divided into local algorithms and global algorithms. Parallel strategy are designed according to the algorithm characters including the decomposition method for maintaining the integrity of the results, optimized workflow for reducing the time taken for exporting the unnecessary intermediate data and a buffer-communication-computation strategy for improving the communication efficiency. Experiments on a multi-node system show that the proposed parallel model allows efficient calculation of the LS factor at a regional scale with a massive dataset.
Robust Visual Tracking Using the Bidirectional Scale Estimation

Directory of Open Access Journals (Sweden)

An Zhiyong

2017-01-01

Full Text Available Object tracking with robust scale estimation is a challenging task in computer vision. This paper presents a novel tracking algorithm that learns the translation and scale filters with a complementary scheme. The translation filter is constructed using the ridge regression and multidimensional features. A robust scale filter is constructed by the bidirectional scale estimation, including the forward scale and backward scale. Firstly, we learn the scale filter using the forward tracking information. Then the forward scale and backward scale can be estimated using the respective scale filter. Secondly, a conservative strategy is adopted to compromise the forward and backward scales. Finally, the scale filter is updated based on the final scale estimation. It is effective to update scale filter since the stable scale estimation can improve the performance of scale filter. To reveal the effectiveness of our tracker, experiments are performed on 32 sequences with significant scale variation and on the benchmark dataset with 50 challenging videos. Our results show that the proposed tracker outperforms several state-of-the-art trackers in terms of robustness and accuracy.
A cloud-based framework for large-scale traditional Chinese medical record retrieval.

Science.gov (United States)

Liu, Lijun; Liu, Li; Fu, Xiaodong; Huang, Qingsong; Zhang, Xianwen; Zhang, Yin

2018-01-01

Electronic medical records are increasingly common in medical practice. The secondary use of medical records has become increasingly important. It relies on the ability to retrieve the complete information about desired patient populations. How to effectively and accurately retrieve relevant medical records from large- scale medical big data is becoming a big challenge. Therefore, we propose an efficient and robust framework based on cloud for large-scale Traditional Chinese Medical Records (TCMRs) retrieval. We propose a parallel index building method and build a distributed search cluster, the former is used to improve the performance of index building, and the latter is used to provide high concurrent online TCMRs retrieval. Then, a real-time multi-indexing model is proposed to ensure the latest relevant TCMRs are indexed and retrieved in real-time, and a semantics-based query expansion method and a multi- factor ranking model are proposed to improve retrieval quality. Third, we implement a template-based visualization method for displaying medical reports. The proposed parallel indexing method and distributed search cluster can improve the performance of index building and provide high concurrent online TCMRs retrieval. The multi-indexing model can ensure the latest relevant TCMRs are indexed and retrieved in real-time. The semantics expansion method and the multi-factor ranking model can enhance retrieval quality. The template-based visualization method can enhance the availability and universality, where the medical reports are displayed via friendly web interface. In conclusion, compared with the current medical record retrieval systems, our system provides some advantages that are useful in improving the secondary use of large-scale traditional Chinese medical records in cloud environment. The proposed system is more easily integrated with existing clinical systems and be used in various scenarios. Copyright © 2017. Published by Elsevier Inc.
Efficient numerical methods for the large-scale, parallel solution of elastoplastic contact problems

KAUST Repository

Frohne, Jö rg; Heister, Timo; Bangerth, Wolfgang

2015-01-01

© 2016 John Wiley & Sons, Ltd. Quasi-static elastoplastic contact problems are ubiquitous in many industrial processes and other contexts, and their numerical simulation is consequently of great interest in accurately describing and optimizing production processes. The key component in these simulations is the solution of a single load step of a time iteration. From a mathematical perspective, the problems to be solved in each time step are characterized by the difficulties of variational inequalities for both the plastic behavior and the contact problem. Computationally, they also often lead to very large problems. In this paper, we present and evaluate a complete set of methods that are (1) designed to work well together and (2) allow for the efficient solution of such problems. In particular, we use adaptive finite element meshes with linear and quadratic elements, a Newton linearization of the plasticity, active set methods for the contact problem, and multigrid-preconditioned linear solvers. Through a sequence of numerical experiments, we show the performance of these methods. This includes highly accurate solutions of a three-dimensional benchmark problem and scaling our methods in parallel to 1024 cores and more than a billion unknowns.
Efficient numerical methods for the large-scale, parallel solution of elastoplastic contact problems

KAUST Repository

Frohne, Jörg

2015-08-06

© 2016 John Wiley & Sons, Ltd. Quasi-static elastoplastic contact problems are ubiquitous in many industrial processes and other contexts, and their numerical simulation is consequently of great interest in accurately describing and optimizing production processes. The key component in these simulations is the solution of a single load step of a time iteration. From a mathematical perspective, the problems to be solved in each time step are characterized by the difficulties of variational inequalities for both the plastic behavior and the contact problem. Computationally, they also often lead to very large problems. In this paper, we present and evaluate a complete set of methods that are (1) designed to work well together and (2) allow for the efficient solution of such problems. In particular, we use adaptive finite element meshes with linear and quadratic elements, a Newton linearization of the plasticity, active set methods for the contact problem, and multigrid-preconditioned linear solvers. Through a sequence of numerical experiments, we show the performance of these methods. This includes highly accurate solutions of a three-dimensional benchmark problem and scaling our methods in parallel to 1024 cores and more than a billion unknowns.
Neurite, a finite difference large scale parallel program for the simulation of electrical signal propagation in neurites under mechanical loading.

Directory of Open Access Journals (Sweden)

Julián A García-Grajales

Full Text Available With the growing body of research on traumatic brain injury and spinal cord injury, computational neuroscience has recently focused its modeling efforts on neuronal functional deficits following mechanical loading. However, in most of these efforts, cell damage is generally only characterized by purely mechanistic criteria, functions of quantities such as stress, strain or their corresponding rates. The modeling of functional deficits in neurites as a consequence of macroscopic mechanical insults has been rarely explored. In particular, a quantitative mechanically based model of electrophysiological impairment in neuronal cells, Neurite, has only very recently been proposed. In this paper, we present the implementation details of this model: a finite difference parallel program for simulating electrical signal propagation along neurites under mechanical loading. Following the application of a macroscopic strain at a given strain rate produced by a mechanical insult, Neurite is able to simulate the resulting neuronal electrical signal propagation, and thus the corresponding functional deficits. The simulation of the coupled mechanical and electrophysiological behaviors requires computational expensive calculations that increase in complexity as the network of the simulated cells grows. The solvers implemented in Neurite--explicit and implicit--were therefore parallelized using graphics processing units in order to reduce the burden of the simulation costs of large scale scenarios. Cable Theory and Hodgkin-Huxley models were implemented to account for the electrophysiological passive and active regions of a neurite, respectively, whereas a coupled mechanical model accounting for the neurite mechanical behavior within its surrounding medium was adopted as a link between electrophysiology and mechanics. This paper provides the details of the parallel implementation of Neurite, along with three different application examples: a long myelinated axon
Large-scale linear programs in planning and prediction.

Science.gov (United States)

2017-06-01

Large-scale linear programs are at the core of many traffic-related optimization problems in both planning and prediction. Moreover, many of these involve significant uncertainty, and hence are modeled using either chance constraints, or robust optim...
Large-Scale Parallel Viscous Flow Computations using an Unstructured Multigrid Algorithm

Science.gov (United States)

Mavriplis, Dimitri J.

1999-01-01

The development and testing of a parallel unstructured agglomeration multigrid algorithm for steady-state aerodynamic flows is discussed. The agglomeration multigrid strategy uses a graph algorithm to construct the coarse multigrid levels from the given fine grid, similar to an algebraic multigrid approach, but operates directly on the non-linear system using the FAS (Full Approximation Scheme) approach. The scalability and convergence rate of the multigrid algorithm are examined on the SGI Origin 2000 and the Cray T3E. An argument is given which indicates that the asymptotic scalability of the multigrid algorithm should be similar to that of its underlying single grid smoothing scheme. For medium size problems involving several million grid points, near perfect scalability is obtained for the single grid algorithm, while only a slight drop-off in parallel efficiency is observed for the multigrid V- and W-cycles, using up to 128 processors on the SGI Origin 2000, and up to 512 processors on the Cray T3E. For a large problem using 25 million grid points, good scalability is observed for the multigrid algorithm using up to 1450 processors on a Cray T3E, even when the coarsest grid level contains fewer points than the total number of processors.
Speedup predictions on large scientific parallel programs

International Nuclear Information System (INIS)

Williams, E.; Bobrowicz, F.

1985-01-01

How much speedup can we expect for large scientific parallel programs running on supercomputers. For insight into this problem we extend the parallel processing environment currently existing on the Cray X-MP (a shared memory multiprocessor with at most four processors) to a simulated N-processor environment, where N greater than or equal to 1. Several large scientific parallel programs from Los Alamos National Laboratory were run in this simulated environment, and speedups were predicted. A speedup of 14.4 on 16 processors was measured for one of the three most used codes at the Laboratory
A Parallel Solver for Large-Scale Markov Chains

Czech Academy of Sciences Publication Activity Database

Benzi, M.; Tůma, Miroslav

2002-01-01

Roč. 41, - (2002), s. 135-153 ISSN 0168-9274 R&D Projects: GA AV ČR IAA2030801; GA ČR GA101/00/1035 Keywords : parallel preconditioning * iterative methods * discrete Markov chains * generalized inverses * singular matrices * graph partitioning * AINV * Bi-CGSTAB Subject RIV: BA - General Mathematics Impact factor: 0.504, year: 2002
Parallel time domain solvers for electrically large transient scattering problems

KAUST Repository

Liu, Yang

2014-09-26

Marching on in time (MOT)-based integral equation solvers represent an increasingly appealing avenue for analyzing transient electromagnetic interactions with large and complex structures. MOT integral equation solvers for analyzing electromagnetic scattering from perfect electrically conducting objects are obtained by enforcing electric field boundary conditions and implicitly time advance electric surface current densities by iteratively solving sparse systems of equations at all time steps. Contrary to finite difference and element competitors, these solvers apply to nonlinear and multi-scale structures comprising geometrically intricate and deep sub-wavelength features residing atop electrically large platforms. Moreover, they are high-order accurate, stable in the low- and high-frequency limits, and applicable to conducting and penetrable structures represented by highly irregular meshes. This presentation reviews some recent advances in the parallel implementations of time domain integral equation solvers, specifically those that leverage multilevel plane-wave time-domain algorithm (PWTD) on modern manycore computer architectures including graphics processing units (GPUs) and distributed memory supercomputers. The GPU-based implementation achieves at least one order of magnitude speedups compared to serial implementations while the distributed parallel implementation are highly scalable to thousands of compute-nodes. A distributed parallel PWTD kernel has been adopted to solve time domain surface/volume integral equations (TDSIE/TDVIE) for analyzing transient scattering from large and complex-shaped perfectly electrically conducting (PEC)/dielectric objects involving ten million/tens of millions of spatial unknowns.
Accelerating large-scale protein structure alignments with graphics processing units

Directory of Open Access Journals (Sweden)

Pang Bin

2012-02-01

Full Text Available Abstract Background Large-scale protein structure alignment, an indispensable tool to structural bioinformatics, poses a tremendous challenge on computational resources. To ensure structure alignment accuracy and efficiency, efforts have been made to parallelize traditional alignment algorithms in grid environments. However, these solutions are costly and of limited accessibility. Others trade alignment quality for speedup by using high-level characteristics of structure fragments for structure comparisons. Findings We present ppsAlign, a parallel protein structure Alignment framework designed and optimized to exploit the parallelism of Graphics Processing Units (GPUs. As a general-purpose GPU platform, ppsAlign could take many concurrent methods, such as TM-align and Fr-TM-align, into the parallelized algorithm design. We evaluated ppsAlign on an NVIDIA Tesla C2050 GPU card, and compared it with existing software solutions running on an AMD dual-core CPU. We observed a 36-fold speedup over TM-align, a 65-fold speedup over Fr-TM-align, and a 40-fold speedup over MAMMOTH. Conclusions ppsAlign is a high-performance protein structure alignment tool designed to tackle the computational complexity issues from protein structural data. The solution presented in this paper allows large-scale structure comparisons to be performed using massive parallel computing power of GPU.
Integration experiences and performance studies of A COTS parallel archive systems

Energy Technology Data Exchange (ETDEWEB)

Chen, Hsing-bung [Los Alamos National Laboratory; Scott, Cody [Los Alamos National Laboratory; Grider, Bary [Los Alamos National Laboratory; Torres, Aaron [Los Alamos National Laboratory; Turley, Milton [Los Alamos National Laboratory; Sanchez, Kathy [Los Alamos National Laboratory; Bremer, John [Los Alamos National Laboratory

2010-01-01

Current and future Archive Storage Systems have been asked to (a) scale to very high bandwidths, (b) scale in metadata performance, (c) support policy-based hierarchical storage management capability, (d) scale in supporting changing needs of very large data sets, (e) support standard interface, and (f) utilize commercial-off-the-shelf(COTS) hardware. Parallel file systems have been asked to do the same thing but at one or more orders of magnitude faster in performance. Archive systems continue to move closer to file systems in their design due to the need for speed and bandwidth, especially metadata searching speeds such as more caching and less robust semantics. Currently the number of extreme highly scalable parallel archive solutions is very small especially those that will move a single large striped parallel disk file onto many tapes in parallel. We believe that a hybrid storage approach of using COTS components and innovative software technology can bring new capabilities into a production environment for the HPC community much faster than the approach of creating and maintaining a complete end-to-end unique parallel archive software solution. In this paper, we relay our experience of integrating a global parallel file system and a standard backup/archive product with a very small amount of additional code to provide a scalable, parallel archive. Our solution has a high degree of overlap with current parallel archive products including (a) doing parallel movement to/from tape for a single large parallel file, (b) hierarchical storage management, (c) ILM features, (d) high volume (non-single parallel file) archives for backup/archive/content management, and (e) leveraging all free file movement tools in Linux such as copy, move, ls, tar, etc. We have successfully applied our working COTS Parallel Archive System to the current world's first petaflop/s computing system, LANL's Roadrunner, and demonstrated its capability to address requirements of
Integration experiments and performance studies of a COTS parallel archive system

Energy Technology Data Exchange (ETDEWEB)

Chen, Hsing-bung [Los Alamos National Laboratory; Scott, Cody [Los Alamos National Laboratory; Grider, Gary [Los Alamos National Laboratory; Torres, Aaron [Los Alamos National Laboratory; Turley, Milton [Los Alamos National Laboratory; Sanchez, Kathy [Los Alamos National Laboratory; Bremer, John [Los Alamos National Laboratory

2010-06-16

Current and future Archive Storage Systems have been asked to (a) scale to very high bandwidths, (b) scale in metadata performance, (c) support policy-based hierarchical storage management capability, (d) scale in supporting changing needs of very large data sets, (e) support standard interface, and (f) utilize commercial-off-the-shelf (COTS) hardware. Parallel file systems have been asked to do the same thing but at one or more orders of magnitude faster in performance. Archive systems continue to move closer to file systems in their design due to the need for speed and bandwidth, especially metadata searching speeds such as more caching and less robust semantics. Currently the number of extreme highly scalable parallel archive solutions is very small especially those that will move a single large striped parallel disk file onto many tapes in parallel. We believe that a hybrid storage approach of using COTS components and innovative software technology can bring new capabilities into a production environment for the HPC community much faster than the approach of creating and maintaining a complete end-to-end unique parallel archive software solution. In this paper, we relay our experience of integrating a global parallel file system and a standard backup/archive product with a very small amount of additional code to provide a scalable, parallel archive. Our solution has a high degree of overlap with current parallel archive products including (a) doing parallel movement to/from tape for a single large parallel file, (b) hierarchical storage management, (c) ILM features, (d) high volume (non-single parallel file) archives for backup/archive/content management, and (e) leveraging all free file movement tools in Linux such as copy, move, Is, tar, etc. We have successfully applied our working COTS Parallel Archive System to the current world's first petafiop/s computing system, LANL's Roadrunner machine, and demonstrated its capability to address
Robust equivalent consumption-based controllers for a dual-mode diesel parallel HEV

International Nuclear Information System (INIS)

Finesso, Roberto; Spessa, Ezio; Venditti, Mattia

2016-01-01

Highlights: • Non-plug-in dual-mode parallel hybrid architecture. • Cross-validation machine-learning for robust equivalent consumption-based controllers. • Optimal control strategy based on fuel consumption, NOx and battery aging. • Impact of different equivalent consumption definitions on HEV performance. • Correlation between vehicle braking energy and SOC variation in the traction stages. - Abstract: New equivalent consumption minimization strategy (ECMS) tools have been developed and applied to identify the optimal control strategy of a dual-mode parallel hybrid electric vehicle equipped with a compression-ignition engine. In this architecture, the electric machine is coupled to the engine through either a single-speed gearbox (torque-coupling) or a planetary gear set (speed-coupling). One of the main novelties of the present study concerns the definition of the instantaneous equivalent consumption (EC) function, which takes into account not only fuel consumption (FC) and the energy flow through the electric components, but also NO_x emissions, battery aging, and the battery SOC. The EC function has been trained using a cross-validation machine-learning technique, based on a genetic algorithm, where the training data set has been selected in order to maximize performances over a testing data set. The adoption of this technique, in conjunction with the new definition of EC, have led to the identification of very robust controllers, which provide an accurate control for different driving scenarios, even when the EC function is not specifically trained on the same missions over which it is tested. To this aim, a data set of fifty driving cycles and six user-defined missions, which cover a total distance of 70–100 km, has been considered as a training driving set. The ECMS controllers can be implemented in a vehicle control unit, and their performance has resulted to be close to that of a dynamic programming tool, which has here been used as benchmark
Finite element analysis of multi-material models using a balancing domain decomposition method combined with the diagonal scaling preconditioner

International Nuclear Information System (INIS)

Ogino, Masao

2016-01-01

Actual problems in science and industrial applications are modeled by multi-materials and large-scale unstructured mesh, and the finite element analysis has been widely used to solve such problems on the parallel computer. However, for large-scale problems, the iterative methods for linear finite element equations suffer from slow or no convergence. Therefore, numerical methods having both robust convergence and scalable parallel efficiency are in great demand. The domain decomposition method is well known as an iterative substructuring method, and is an efficient approach for parallel finite element methods. Moreover, the balancing preconditioner achieves robust convergence. However, in case of problems consisting of very different materials, the convergence becomes bad. There are some research to solve this issue, however not suitable for cases of complex shape and composite materials. In this study, to improve convergence of the balancing preconditioner for multi-materials, a balancing preconditioner combined with the diagonal scaling preconditioner, called Scaled-BDD method, is proposed. Some numerical results are included which indicate that the proposed method has robust convergence for the number of subdomains and shows high performances compared with the original balancing preconditioner. (author)
Research on precision grinding technology of large scale and ultra thin optics

Science.gov (United States)

Zhou, Lian; Wei, Qiancai; Li, Jie; Chen, Xianhua; Zhang, Qinghua

2018-03-01

The flatness and parallelism error of large scale and ultra thin optics have an important influence on the subsequent polishing efficiency and accuracy. In order to realize the high precision grinding of those ductile elements, the low deformation vacuum chuck was designed first, which was used for clamping the optics with high supporting rigidity in the full aperture. Then the optics was planar grinded under vacuum adsorption. After machining, the vacuum system was turned off. The form error of optics was on-machine measured using displacement sensor after elastic restitution. The flatness would be convergenced with high accuracy by compensation machining, whose trajectories were integrated with the measurement result. For purpose of getting high parallelism, the optics was turned over and compensation grinded using the form error of vacuum chuck. Finally, the grinding experiment of large scale and ultra thin fused silica optics with aperture of 430mm×430mm×10mm was performed. The best P-V flatness of optics was below 3 μm, and parallelism was below 3 ″. This machining technique has applied in batch grinding of large scale and ultra thin optics.
Leveraging human oversight and intervention in large-scale parallel processing of open-source data

Science.gov (United States)

Casini, Enrico; Suri, Niranjan; Bradshaw, Jeffrey M.

2015-05-01

The popularity of cloud computing along with the increased availability of cheap storage have led to the necessity of elaboration and transformation of large volumes of open-source data, all in parallel. One way to handle such extensive volumes of information properly is to take advantage of distributed computing frameworks like Map-Reduce. Unfortunately, an entirely automated approach that excludes human intervention is often unpredictable and error prone. Highly accurate data processing and decision-making can be achieved by supporting an automatic process through human collaboration, in a variety of environments such as warfare, cyber security and threat monitoring. Although this mutual participation seems easily exploitable, human-machine collaboration in the field of data analysis presents several challenges. First, due to the asynchronous nature of human intervention, it is necessary to verify that once a correction is made, all the necessary reprocessing is done in chain. Second, it is often needed to minimize the amount of reprocessing in order to optimize the usage of resources due to limited availability. In order to improve on these strict requirements, this paper introduces improvements to an innovative approach for human-machine collaboration in the processing of large amounts of open-source data in parallel.
Faster Parallel Traversal of Scale Free Graphs at Extreme Scale with Vertex Delegates

KAUST Repository

Pearce, Roger

2014-11-01

© 2014 IEEE. At extreme scale, irregularities in the structure of scale-free graphs such as social network graphs limit our ability to analyze these important and growing datasets. A key challenge is the presence of high-degree vertices (hubs), that leads to parallel workload and storage imbalances. The imbalances occur because existing partitioning techniques are not able to effectively partition high-degree vertices. We present techniques to distribute storage, computation, and communication of hubs for extreme scale graphs in distributed memory supercomputers. To balance the hub processing workload, we distribute hub data structures and related computation among a set of delegates. The delegates coordinate using highly optimized, yet portable, asynchronous broadcast and reduction operations. We demonstrate scalability of our new algorithmic technique using Breadth-First Search (BFS), Single Source Shortest Path (SSSP), K-Core Decomposition, and Page-Rank on synthetically generated scale-free graphs. Our results show excellent scalability on large scale-free graphs up to 131K cores of the IBM BG/P, and outperform the best known Graph500 performance on BG/P Intrepid by 15%

Faster Parallel Traversal of Scale Free Graphs at Extreme Scale with Vertex Delegates

KAUST Repository

Pearce, Roger; Gokhale, Maya; Amato, Nancy M.

2014-01-01

© 2014 IEEE. At extreme scale, irregularities in the structure of scale-free graphs such as social network graphs limit our ability to analyze these important and growing datasets. A key challenge is the presence of high-degree vertices (hubs), that leads to parallel workload and storage imbalances. The imbalances occur because existing partitioning techniques are not able to effectively partition high-degree vertices. We present techniques to distribute storage, computation, and communication of hubs for extreme scale graphs in distributed memory supercomputers. To balance the hub processing workload, we distribute hub data structures and related computation among a set of delegates. The delegates coordinate using highly optimized, yet portable, asynchronous broadcast and reduction operations. We demonstrate scalability of our new algorithmic technique using Breadth-First Search (BFS), Single Source Shortest Path (SSSP), K-Core Decomposition, and Page-Rank on synthetically generated scale-free graphs. Our results show excellent scalability on large scale-free graphs up to 131K cores of the IBM BG/P, and outperform the best known Graph500 performance on BG/P Intrepid by 15%
BFAST: an alignment tool for large scale genome resequencing.

Directory of Open Access Journals (Sweden)

Nils Homer

2009-11-01

Full Text Available The new generation of massively parallel DNA sequencers, combined with the challenge of whole human genome resequencing, result in the need for rapid and accurate alignment of billions of short DNA sequence reads to a large reference genome. Speed is obviously of great importance, but equally important is maintaining alignment accuracy of short reads, in the 25-100 base range, in the presence of errors and true biological variation.We introduce a new algorithm specifically optimized for this task, as well as a freely available implementation, BFAST, which can align data produced by any of current sequencing platforms, allows for user-customizable levels of speed and accuracy, supports paired end data, and provides for efficient parallel and multi-threaded computation on a computer cluster. The new method is based on creating flexible, efficient whole genome indexes to rapidly map reads to candidate alignment locations, with arbitrary multiple independent indexes allowed to achieve robustness against read errors and sequence variants. The final local alignment uses a Smith-Waterman method, with gaps to support the detection of small indels.We compare BFAST to a selection of large-scale alignment tools -- BLAT, MAQ, SHRiMP, and SOAP -- in terms of both speed and accuracy, using simulated and real-world datasets. We show BFAST can achieve substantially greater sensitivity of alignment in the context of errors and true variants, especially insertions and deletions, and minimize false mappings, while maintaining adequate speed compared to other current methods. We show BFAST can align the amount of data needed to fully resequence a human genome, one billion reads, with high sensitivity and accuracy, on a modest computer cluster in less than 24 hours. BFAST is available at (http://bfast.sourceforge.net.
A Robust Photogrammetric Processing Method of Low-Altitude UAV Images

Directory of Open Access Journals (Sweden)

Mingyao Ai

2015-02-01

Full Text Available Low-altitude Unmanned Aerial Vehicles (UAV images which include distortion, illumination variance, and large rotation angles are facing multiple challenges of image orientation and image processing. In this paper, a robust and convenient photogrammetric approach is proposed for processing low-altitude UAV images, involving a strip management method to automatically build a standardized regional aerial triangle (AT network, a parallel inner orientation algorithm, a ground control points (GCPs predicting method, and an improved Scale Invariant Feature Transform (SIFT method to produce large number of evenly distributed reliable tie points for bundle adjustment (BA. A multi-view matching approach is improved to produce Digital Surface Models (DSM and Digital Orthophoto Maps (DOM for 3D visualization. Experimental results show that the proposed approach is robust and feasible for photogrammetric processing of low-altitude UAV images and 3D visualization of products.
Computational challenges of large-scale, long-time, first-principles molecular dynamics

International Nuclear Information System (INIS)

Kent, P R C

2008-01-01

Plane wave density functional calculations have traditionally been able to use the largest available supercomputing resources. We analyze the scalability of modern projector-augmented wave implementations to identify the challenges in performing molecular dynamics calculations of large systems containing many thousands of electrons. Benchmark calculations on the Cray XT4 demonstrate that global linear-algebra operations are the primary reason for limited parallel scalability. Plane-wave related operations can be made sufficiently scalable. Improving parallel linear-algebra performance is an essential step to reaching longer timescales in future large-scale molecular dynamics calculations
Hierarchical Parallel Matrix Multiplication on Large-Scale Distributed Memory Platforms

KAUST Repository

Quintin, Jean-Noel

2013-10-01

Matrix multiplication is a very important computation kernel both in its own right as a building block of many scientific applications and as a popular representative for other scientific applications. Cannon\\'s algorithm which dates back to 1969 was the first efficient algorithm for parallel matrix multiplication providing theoretically optimal communication cost. However this algorithm requires a square number of processors. In the mid-1990s, the SUMMA algorithm was introduced. SUMMA overcomes the shortcomings of Cannon\\'s algorithm as it can be used on a nonsquare number of processors as well. Since then the number of processors in HPC platforms has increased by two orders of magnitude making the contribution of communication in the overall execution time more significant. Therefore, the state of the art parallel matrix multiplication algorithms should be revisited to reduce the communication cost further. This paper introduces a new parallel matrix multiplication algorithm, Hierarchical SUMMA (HSUMMA), which is a redesign of SUMMA. Our algorithm reduces the communication cost of SUMMA by introducing a two-level virtual hierarchy into the two-dimensional arrangement of processors. Experiments on an IBM BlueGene/P demonstrate the reduction of communication cost up to 2.08 times on 2048 cores and up to 5.89 times on 16384 cores. © 2013 IEEE.
Hierarchical Parallel Matrix Multiplication on Large-Scale Distributed Memory Platforms

KAUST Repository

Quintin, Jean-Noel; Hasanov, Khalid; Lastovetsky, Alexey

2013-01-01

Matrix multiplication is a very important computation kernel both in its own right as a building block of many scientific applications and as a popular representative for other scientific applications. Cannon's algorithm which dates back to 1969 was the first efficient algorithm for parallel matrix multiplication providing theoretically optimal communication cost. However this algorithm requires a square number of processors. In the mid-1990s, the SUMMA algorithm was introduced. SUMMA overcomes the shortcomings of Cannon's algorithm as it can be used on a nonsquare number of processors as well. Since then the number of processors in HPC platforms has increased by two orders of magnitude making the contribution of communication in the overall execution time more significant. Therefore, the state of the art parallel matrix multiplication algorithms should be revisited to reduce the communication cost further. This paper introduces a new parallel matrix multiplication algorithm, Hierarchical SUMMA (HSUMMA), which is a redesign of SUMMA. Our algorithm reduces the communication cost of SUMMA by introducing a two-level virtual hierarchy into the two-dimensional arrangement of processors. Experiments on an IBM BlueGene/P demonstrate the reduction of communication cost up to 2.08 times on 2048 cores and up to 5.89 times on 16384 cores. © 2013 IEEE.
Building a parallel file system simulator

International Nuclear Information System (INIS)

Molina-Estolano, E; Maltzahn, C; Brandt, S A; Bent, J

2009-01-01

Parallel file systems are gaining in popularity in high-end computing centers as well as commercial data centers. High-end computing systems are expected to scale exponentially and to pose new challenges to their storage scalability in terms of cost and power. To address these challenges scientists and file system designers will need a thorough understanding of the design space of parallel file systems. Yet there exist few systematic studies of parallel file system behavior at petabyte- and exabyte scale. An important reason is the significant cost of getting access to large-scale hardware to test parallel file systems. To contribute to this understanding we are building a parallel file system simulator that can simulate parallel file systems at very large scale. Our goal is to simulate petabyte-scale parallel file systems on a small cluster or even a single machine in reasonable time and fidelity. With this simulator, file system experts will be able to tune existing file systems for specific workloads, scientists and file system deployment engineers will be able to better communicate workload requirements, file system designers and researchers will be able to try out design alternatives and innovations at scale, and instructors will be able to study very large-scale parallel file system behavior in the class room. In this paper we describe our approach and provide preliminary results that are encouraging both in terms of fidelity and simulation scalability.
Robust and rapid algorithms facilitate large-scale whole genome sequencing downstream analysis in an integrative framework.

Science.gov (United States)

Li, Miaoxin; Li, Jiang; Li, Mulin Jun; Pan, Zhicheng; Hsu, Jacob Shujui; Liu, Dajiang J; Zhan, Xiaowei; Wang, Junwen; Song, Youqiang; Sham, Pak Chung

2017-05-19

Whole genome sequencing (WGS) is a promising strategy to unravel variants or genes responsible for human diseases and traits. However, there is a lack of robust platforms for a comprehensive downstream analysis. In the present study, we first proposed three novel algorithms, sequence gap-filled gene feature annotation, bit-block encoded genotypes and sectional fast access to text lines to address three fundamental problems. The three algorithms then formed the infrastructure of a robust parallel computing framework, KGGSeq, for integrating downstream analysis functions for whole genome sequencing data. KGGSeq has been equipped with a comprehensive set of analysis functions for quality control, filtration, annotation, pathogenic prediction and statistical tests. In the tests with whole genome sequencing data from 1000 Genomes Project, KGGSeq annotated several thousand more reliable non-synonymous variants than other widely used tools (e.g. ANNOVAR and SNPEff). It took only around half an hour on a small server with 10 CPUs to access genotypes of ∼60 million variants of 2504 subjects, while a popular alternative tool required around one day. KGGSeq's bit-block genotype format used 1.5% or less space to flexibly represent phased or unphased genotypes with multiple alleles and achieved a speed of over 1000 times faster to calculate genotypic correlation. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
Large Scale Simulations of the Euler Equations on GPU Clusters

KAUST Repository

Liebmann, Manfred; Douglas, Craig C.; Haase, Gundolf; Horvá th, Zoltá n

2010-01-01

The paper investigates the scalability of a parallel Euler solver, using the Vijayasundaram method, on a GPU cluster with 32 Nvidia Geforce GTX 295 boards. The aim of this research is to enable large scale fluid dynamics simulations with up to one
Final Report: Migration Mechanisms for Large-scale Parallel Applications

Energy Technology Data Exchange (ETDEWEB)

Jason Nieh

2009-10-30

Process migration is the ability to transfer a process from one machine to another. It is a useful facility in distributed computing environments, especially as computing devices become more pervasive and Internet access becomes more ubiquitous. The potential benefits of process migration, among others, are fault resilience by migrating processes off of faulty hosts, data access locality by migrating processes closer to the data, better system response time by migrating processes closer to users, dynamic load balancing by migrating processes to less loaded hosts, and improved service availability and administration by migrating processes before host maintenance so that applications can continue to run with minimal downtime. Although process migration provides substantial potential benefits and many approaches have been considered, achieving transparent process migration functionality has been difficult in practice. To address this problem, our work has designed, implemented, and evaluated new and powerful transparent process checkpoint-restart and migration mechanisms for desktop, server, and parallel applications that operate across heterogeneous cluster and mobile computing environments. A key aspect of this work has been to introduce lightweight operating system virtualization to provide processes with private, virtual namespaces that decouple and isolate processes from dependencies on the host operating system instance. This decoupling enables processes to be transparently checkpointed and migrated without modifying, recompiling, or relinking applications or the operating system. Building on this lightweight operating system virtualization approach, we have developed novel technologies that enable (1) coordinated, consistent checkpoint-restart and migration of multiple processes, (2) fast checkpointing of process and file system state to enable restart of multiple parallel execution environments and time travel, (3) process migration across heterogeneous
Performance characteristics of hybrid MPI/OpenMP implementations of NAS parallel benchmarks SP and BT on large-scale multicore supercomputers

KAUST Repository

Wu, Xingfu; Taylor, Valerie

2011-01-01

The NAS Parallel Benchmarks (NPB) are well-known applications with the fixed algorithms for evaluating parallel systems and tools. Multicore supercomputers provide a natural programming paradigm for hybrid programs, whereby OpenMP can be used with the data sharing with the multicores that comprise a node and MPI can be used with the communication between nodes. In this paper, we use SP and BT benchmarks of MPI NPB 3.3 as a basis for a comparative approach to implement hybrid MPI/OpenMP versions of SP and BT. In particular, we can compare the performance of the hybrid SP and BT with the MPI counterparts on large-scale multicore supercomputers. Our performance results indicate that the hybrid SP outperforms the MPI SP by up to 20.76%, and the hybrid BT outperforms the MPI BT by up to 8.58% on up to 10,000 cores on BlueGene/P at Argonne National Laboratory and Jaguar (Cray XT4/5) at Oak Ridge National Laboratory. We also use performance tools and MPI trace libraries available on these supercomputers to further investigate the performance characteristics of the hybrid SP and BT.
Performance characteristics of hybrid MPI/OpenMP implementations of NAS parallel benchmarks SP and BT on large-scale multicore supercomputers

KAUST Repository

Wu, Xingfu

2011-03-29

The NAS Parallel Benchmarks (NPB) are well-known applications with the fixed algorithms for evaluating parallel systems and tools. Multicore supercomputers provide a natural programming paradigm for hybrid programs, whereby OpenMP can be used with the data sharing with the multicores that comprise a node and MPI can be used with the communication between nodes. In this paper, we use SP and BT benchmarks of MPI NPB 3.3 as a basis for a comparative approach to implement hybrid MPI/OpenMP versions of SP and BT. In particular, we can compare the performance of the hybrid SP and BT with the MPI counterparts on large-scale multicore supercomputers. Our performance results indicate that the hybrid SP outperforms the MPI SP by up to 20.76%, and the hybrid BT outperforms the MPI BT by up to 8.58% on up to 10,000 cores on BlueGene/P at Argonne National Laboratory and Jaguar (Cray XT4/5) at Oak Ridge National Laboratory. We also use performance tools and MPI trace libraries available on these supercomputers to further investigate the performance characteristics of the hybrid SP and BT.
Parallel Multivariate Spatio-Temporal Clustering of Large Ecological Datasets on Hybrid Supercomputers

Energy Technology Data Exchange (ETDEWEB)

Sreepathi, Sarat [ORNL; Kumar, Jitendra [ORNL; Mills, Richard T. [Argonne National Laboratory; Hoffman, Forrest M. [ORNL; Sripathi, Vamsi [Intel Corporation; Hargrove, William Walter [United States Department of Agriculture (USDA), United States Forest Service (USFS)

2017-09-01

A proliferation of data from vast networks of remote sensing platforms (satellites, unmanned aircraft systems (UAS), airborne etc.), observational facilities (meteorological, eddy covariance etc.), state-of-the-art sensors, and simulation models offer unprecedented opportunities for scientific discovery. Unsupervised classification is a widely applied data mining approach to derive insights from such data. However, classification of very large data sets is a complex computational problem that requires efficient numerical algorithms and implementations on high performance computing (HPC) platforms. Additionally, increasing power, space, cooling and efficiency requirements has led to the deployment of hybrid supercomputing platforms with complex architectures and memory hierarchies like the Titan system at Oak Ridge National Laboratory. The advent of such accelerated computing architectures offers new challenges and opportunities for big data analytics in general and specifically, large scale cluster analysis in our case. Although there is an existing body of work on parallel cluster analysis, those approaches do not fully meet the needs imposed by the nature and size of our large data sets. Moreover, they had scaling limitations and were mostly limited to traditional distributed memory computing platforms. We present a parallel Multivariate Spatio-Temporal Clustering (MSTC) technique based on k-means cluster analysis that can target hybrid supercomputers like Titan. We developed a hybrid MPI, CUDA and OpenACC implementation that can utilize both CPU and GPU resources on computational nodes. We describe performance results on Titan that demonstrate the scalability and efficacy of our approach in processing large ecological data sets.
Parallel real-time visualization system for large-scale simulation. Application to WSPEEDI

International Nuclear Information System (INIS)

Muramatsu, Kazuhiro; Otani, Takayuki; Kitabata, Hideyuki; Matsumoto, Hideki; Takei, Toshifumi; Doi, Shun

2000-01-01

The real-time visualization system, PATRAS (PArallel TRAcking Steering system) has been developed on parallel computing servers. The system performs almost all of the visualization tasks on a parallel computing server, and uses image data compression technique for efficient communication between the server and the client terminal. Therefore, the system realizes high performance concurrent visualization in an internet computing environment. The experience in applying PATRAS to WSPEEDI (Worldwide version of System for Prediction Environmental Emergency Dose Information) is reported. The application of PATRAS to WSPEEDI enables users to understand behaviours of radioactive tracers from different release points easily and quickly. (author)
Evolution favors protein mutational robustness in sufficiently large populations

Directory of Open Access Journals (Sweden)

Venturelli Ophelia S

2007-07-01

Full Text Available Abstract Background An important question is whether evolution favors properties such as mutational robustness or evolvability that do not directly benefit any individual, but can influence the course of future evolution. Functionally similar proteins can differ substantially in their robustness to mutations and capacity to evolve new functions, but it has remained unclear whether any of these differences might be due to evolutionary selection for these properties. Results Here we use laboratory experiments to demonstrate that evolution favors protein mutational robustness if the evolving population is sufficiently large. We neutrally evolve cytochrome P450 proteins under identical selection pressures and mutation rates in populations of different sizes, and show that proteins from the larger and thus more polymorphic population tend towards higher mutational robustness. Proteins from the larger population also evolve greater stability, a biophysical property that is known to enhance both mutational robustness and evolvability. The excess mutational robustness and stability is well described by mathematical theory, and can be quantitatively related to the way that the proteins occupy their neutral network. Conclusion Our work is the first experimental demonstration of the general tendency of evolution to favor mutational robustness and protein stability in highly polymorphic populations. We suggest that this phenomenon could contribute to the mutational robustness and evolvability of viruses and bacteria that exist in large populations.
Scheduling Parallel Jobs Using Migration and Consolidation in the Cloud

Directory of Open Access Journals (Sweden)

Xiaocheng Liu

2012-01-01

Full Text Available An increasing number of high performance computing parallel applications leverages the power of the cloud for parallel processing. How to schedule the parallel applications to improve the quality of service is the key to the successful host of parallel applications in the cloud. The large scale of the cloud makes the parallel job scheduling more complicated as even simple parallel job scheduling problem is NP-complete. In this paper, we propose a parallel job scheduling algorithm named MEASY. MEASY adopts migration and consolidation to enhance the most popular EASY scheduling algorithm. Our extensive experiments on well-known workloads show that our algorithm takes very good care of the quality of service. For two common parallel job scheduling objectives, our algorithm produces an up to 41.1% and an average of 23.1% improvement on the average response time; an up to 82.9% and an average of 69.3% improvement on the average slowdown. Our algorithm is robust even in terms that it allows inaccurate CPU usage estimation and high migration cost. Our approach involves trivial modification on EASY and requires no additional technique; it is practical and effective in the cloud environment.
Parallel multiple instance learning for extremely large histopathology image analysis.

Science.gov (United States)

Xu, Yan; Li, Yeshu; Shen, Zhengyang; Wu, Ziwei; Gao, Teng; Fan, Yubo; Lai, Maode; Chang, Eric I-Chao

2017-08-03

Histopathology images are critical for medical diagnosis, e.g., cancer and its treatment. A standard histopathology slice can be easily scanned at a high resolution of, say, 200,000×200,000 pixels. These high resolution images can make most existing imaging processing tools infeasible or less effective when operated on a single machine with limited memory, disk space and computing power. In this paper, we propose an algorithm tackling this new emerging "big data" problem utilizing parallel computing on High-Performance-Computing (HPC) clusters. Experimental results on a large-scale data set (1318 images at a scale of 10 billion pixels each) demonstrate the efficiency and effectiveness of the proposed algorithm for low-latency real-time applications. The framework proposed an effective and efficient system for extremely large histopathology image analysis. It is based on the multiple instance learning formulation for weakly-supervised learning for image classification, segmentation and clustering. When a max-margin concept is adopted for different clusters, we obtain further improvement in clustering performance.
Balancing modern Power System with large scale of wind power

DEFF Research Database (Denmark)

Basit, Abdul; Altin, Müfit; Hansen, Anca Daniela

2014-01-01

Power system operators must ensure robust, secure and reliable power system operation even with a large scale integration of wind power. Electricity generated from the intermittent wind in large propor-tion may impact on the control of power system balance and thus deviations in the power system...... frequency in small or islanded power systems or tie line power flows in interconnected power systems. Therefore, the large scale integration of wind power into the power system strongly concerns the secure and stable grid operation. To ensure the stable power system operation, the evolving power system has...... to be analysed with improved analytical tools and techniques. This paper proposes techniques for the active power balance control in future power systems with the large scale wind power integration, where power balancing model provides the hour-ahead dispatch plan with reduced planning horizon and the real time...
Parallel Scaling Characteristics of Selected NERSC User ProjectCodes

Energy Technology Data Exchange (ETDEWEB)

Skinner, David; Verdier, Francesca; Anand, Harsh; Carter,Jonathan; Durst, Mark; Gerber, Richard

2005-03-05

This report documents parallel scaling characteristics of NERSC user project codes between Fiscal Year 2003 and the first half of Fiscal Year 2004 (Oct 2002-March 2004). The codes analyzed cover 60% of all the CPU hours delivered during that time frame on seaborg, a 6080 CPU IBM SP and the largest parallel computer at NERSC. The scale in terms of concurrency and problem size of the workload is analyzed. Drawing on batch queue logs, performance data and feedback from researchers we detail the motivations, benefits, and challenges of implementing highly parallel scientific codes on current NERSC High Performance Computing systems. An evaluation and outlook of the NERSC workload for Allocation Year 2005 is presented.
Multi Scale Finite Element Analyses By Using SEM-EBSD Crystallographic Modeling and Parallel Computing

International Nuclear Information System (INIS)

Nakamachi, Eiji

2005-01-01

A crystallographic homogenization procedure is introduced to the conventional static-explicit and dynamic-explicit finite element formulation to develop a multi scale - double scale - analysis code to predict the plastic strain induced texture evolution, yield loci and formability of sheet metal. The double-scale structure consists of a crystal aggregation - micro-structure - and a macroscopic elastic plastic continuum. At first, we measure crystal morphologies by using SEM-EBSD apparatus, and define a unit cell of micro structure, which satisfy the periodicity condition in the real scale of polycrystal. Next, this crystallographic homogenization FE code is applied to 3N pure-iron and 'Benchmark' aluminum A6022 polycrystal sheets. It reveals that the initial crystal orientation distribution - the texture - affects very much to a plastic strain induced texture and anisotropic hardening evolutions and sheet deformation. Since, the multi-scale finite element analysis requires a large computation time, a parallel computing technique by using PC cluster is developed for a quick calculation. In this parallelization scheme, a dynamic workload balancing technique is introduced for quick and efficient calculations

SCALE Continuous-Energy Monte Carlo Depletion with Parallel KENO in TRITON

International Nuclear Information System (INIS)

Goluoglu, Sedat; Bekar, Kursat B.; Wiarda, Dorothea

2012-01-01

The TRITON sequence of the SCALE code system is a powerful and robust tool for performing multigroup (MG) reactor physics analysis using either the 2-D deterministic solver NEWT or the 3-D Monte Carlo transport code KENO. However, as with all MG codes, the accuracy of the results depends on the accuracy of the MG cross sections that are generated and/or used. While SCALE resonance self-shielding modules provide rigorous resonance self-shielding, they are based on 1-D models and therefore 2-D or 3-D effects such as heterogeneity of the lattice structures may render final MG cross sections inaccurate. Another potential drawback to MG Monte Carlo depletion is the need to perform resonance self-shielding calculations at each depletion step for each fuel segment that is being depleted. The CPU time and memory required for self-shielding calculations can often eclipse the resources needed for the Monte Carlo transport. This summary presents the results of the new continuous-energy (CE) calculation mode in TRITON. With the new capability, accurate reactor physics analyses can be performed for all types of systems using the SCALE Monte Carlo code KENO as the CE transport solver. In addition, transport calculations can be performed in parallel mode on multiple processors.
Robust Cell Detection for Large-Scale 3D Microscopy Using GPU-Accelerated Iterative Voting

Directory of Open Access Journals (Sweden)

Leila Saadatifard

2018-04-01

Full Text Available High-throughput imaging techniques, such as Knife-Edge Scanning Microscopy (KESM,are capable of acquiring three-dimensional whole-organ images at sub-micrometer resolution. These images are challenging to segment since they can exceed several terabytes (TB in size, requiring extremely fast and fully automated algorithms. Staining techniques are limited to contrast agents that can be applied to large samples and imaged in a single pass. This requires maximizing the number of structures labeled in a single channel, resulting in images that are densely packed with spatial features. In this paper, we propose a three-dimensional approach for locating cells based on iterative voting. Due to the computational complexity of this algorithm, a highly efficient GPU implementation is required to make it practical on large data sets. The proposed algorithm has a limited number of input parameters and is highly parallel.
Parallel Auxiliary Space AMG Solver for $H(div)$ Problems

Energy Technology Data Exchange (ETDEWEB)

Kolev, Tzanio V. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Vassilevski, Panayot S. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)

2012-12-18

We present a family of scalable preconditioners for matrices arising in the discretization of $H(div)$ problems using the lowest order Raviart--Thomas finite elements. Our approach belongs to the class of “auxiliary space''--based methods and requires only the finite element stiffness matrix plus some minimal additional discretization information about the topology and orientation of mesh entities. Also, we provide a detailed algebraic description of the theory, parallel implementation, and different variants of this parallel auxiliary space divergence solver (ADS) and discuss its relations to the Hiptmair--Xu (HX) auxiliary space decomposition of $H(div)$ [SIAM J. Numer. Anal., 45 (2007), pp. 2483--2509] and to the auxiliary space Maxwell solver AMS [J. Comput. Math., 27 (2009), pp. 604--623]. Finally, an extensive set of numerical experiments demonstrates the robustness and scalability of our implementation on large-scale $H(div)$ problems with large jumps in the material coefficients.
Robust Optimal Adaptive Control Method with Large Adaptive Gain

Science.gov (United States)

Nguyen, Nhan T.

2009-01-01

In the presence of large uncertainties, a control system needs to be able to adapt rapidly to regain performance. Fast adaptation is referred to the implementation of adaptive control with a large adaptive gain to reduce the tracking error rapidly. However, a large adaptive gain can lead to high-frequency oscillations which can adversely affect robustness of an adaptive control law. A new adaptive control modification is presented that can achieve robust adaptation with a large adaptive gain without incurring high-frequency oscillations as with the standard model-reference adaptive control. The modification is based on the minimization of the Y2 norm of the tracking error, which is formulated as an optimal control problem. The optimality condition is used to derive the modification using the gradient method. The optimal control modification results in a stable adaptation and allows a large adaptive gain to be used for better tracking while providing sufficient stability robustness. Simulations were conducted for a damaged generic transport aircraft with both standard adaptive control and the adaptive optimal control modification technique. The results demonstrate the effectiveness of the proposed modification in tracking a reference model while maintaining a sufficient time delay margin.
Large Scale GW Calculations on the Cori System

Science.gov (United States)

Deslippe, Jack; Del Ben, Mauro; da Jornada, Felipe; Canning, Andrew; Louie, Steven

The NERSC Cori system, powered by 9000+ Intel Xeon-Phi processors, represents one of the largest HPC systems for open-science in the United States and the world. We discuss the optimization of the GW methodology for this system, including both node level and system-scale optimizations. We highlight multiple large scale (thousands of atoms) case studies and discuss both absolute application performance and comparison to calculations on more traditional HPC architectures. We find that the GW method is particularly well suited for many-core architectures due to the ability to exploit a large amount of parallelism across many layers of the system. This work was supported by the U.S. Department of Energy, Office of Science, Basic Energy Sciences, Materials Sciences and Engineering Division, as part of the Computational Materials Sciences Program.
Large scale electrolysers

International Nuclear Information System (INIS)

B Bello; M Junker

2006-01-01

Hydrogen production by water electrolysis represents nearly 4 % of the world hydrogen production. Future development of hydrogen vehicles will require large quantities of hydrogen. Installation of large scale hydrogen production plants will be needed. In this context, development of low cost large scale electrolysers that could use 'clean power' seems necessary. ALPHEA HYDROGEN, an European network and center of expertise on hydrogen and fuel cells, has performed for its members a study in 2005 to evaluate the potential of large scale electrolysers to produce hydrogen in the future. The different electrolysis technologies were compared. Then, a state of art of the electrolysis modules currently available was made. A review of the large scale electrolysis plants that have been installed in the world was also realized. The main projects related to large scale electrolysis were also listed. Economy of large scale electrolysers has been discussed. The influence of energy prices on the hydrogen production cost by large scale electrolysis was evaluated. (authors)
Algorithms for large scale singular value analysis of spatially variant tomography systems

International Nuclear Information System (INIS)

Cao-Huu, Tuan; Brownell, G.; Lachiver, G.

1996-01-01

The problem of determining the eigenvalues of large matrices occurs often in the design and analysis of modem tomography systems. As there is an interest in solving systems containing an ever-increasing number of variables, current research effort is being made to create more robust solvers which do not depend on some special feature of the matrix for convergence (e.g. block circulant), and to improve the speed of already known and understood solvers so that solving even larger systems in a reasonable time becomes viable. Our standard techniques for singular value analysis are based on sparse matrix factorization and are not applicable when the input matrices are large because the algorithms cause too much fill. Fill refers to the increase of non-zero elements in the LU decomposition of the original matrix A (the system matrix). So we have developed iterative solutions that are based on sparse direct methods. Data motion and preconditioning techniques are critical for performance. This conference paper describes our algorithmic approaches for large scale singular value analysis of spatially variant imaging systems, and in particular of PCR2, a cylindrical three-dimensional PET imager 2 built at the Massachusetts General Hospital (MGH) in Boston. We recommend the desirable features and challenges for the next generation of parallel machines for optimal performance of our solver
Parallel computing works!

CERN Document Server

Fox, Geoffrey C; Messina, Guiseppe C

2014-01-01

A clear illustration of how parallel computers can be successfully appliedto large-scale scientific computations. This book demonstrates how avariety of applications in physics, biology, mathematics and other scienceswere implemented on real parallel computers to produce new scientificresults. It investigates issues of fine-grained parallelism relevant forfuture supercomputers with particular emphasis on hypercube architecture. The authors describe how they used an experimental approach to configuredifferent massively parallel machines, design and implement basic systemsoftware, and develop
Some scale-free networks could be robust under selective node attacks

Science.gov (United States)

Zheng, Bojin; Huang, Dan; Li, Deyi; Chen, Guisheng; Lan, Wenfei

2011-04-01

It is a mainstream idea that scale-free network would be fragile under the selective attacks. Internet is a typical scale-free network in the real world, but it never collapses under the selective attacks of computer viruses and hackers. This phenomenon is different from the deduction of the idea above because this idea assumes the same cost to delete an arbitrary node. Hence this paper discusses the behaviors of the scale-free network under the selective node attack with different cost. Through the experiments on five complex networks, we show that the scale-free network is possibly robust under the selective node attacks; furthermore, the more compact the network is, and the larger the average degree is, then the more robust the network is; with the same average degrees, the more compact the network is, the more robust the network is. This result would enrich the theory of the invulnerability of the network, and can be used to build robust social, technological and biological networks, and also has the potential to find the target of drugs.
A parallel orbital-updating based plane-wave basis method for electronic structure calculations

International Nuclear Information System (INIS)

Pan, Yan; Dai, Xiaoying; Gironcoli, Stefano de; Gong, Xin-Gao; Rignanese, Gian-Marco; Zhou, Aihui

2017-01-01

Highlights: • Propose three parallel orbital-updating based plane-wave basis methods for electronic structure calculations. • These new methods can avoid the generating of large scale eigenvalue problems and then reduce the computational cost. • These new methods allow for two-level parallelization which is particularly interesting for large scale parallelization. • Numerical experiments show that these new methods are reliable and efficient for large scale calculations on modern supercomputers. - Abstract: Motivated by the recently proposed parallel orbital-updating approach in real space method , we propose a parallel orbital-updating based plane-wave basis method for electronic structure calculations, for solving the corresponding eigenvalue problems. In addition, we propose two new modified parallel orbital-updating methods. Compared to the traditional plane-wave methods, our methods allow for two-level parallelization, which is particularly interesting for large scale parallelization. Numerical experiments show that these new methods are more reliable and efficient for large scale calculations on modern supercomputers.
WImpiBLAST: web interface for mpiBLAST to help biologists perform large-scale annotation using high performance computing.

Directory of Open Access Journals (Sweden)

Parichit Sharma

Full Text Available The function of a newly sequenced gene can be discovered by determining its sequence homology with known proteins. BLAST is the most extensively used sequence analysis program for sequence similarity search in large databases of sequences. With the advent of next generation sequencing technologies it has now become possible to study genes and their expression at a genome-wide scale through RNA-seq and metagenome sequencing experiments. Functional annotation of all the genes is done by sequence similarity search against multiple protein databases. This annotation task is computationally very intensive and can take days to obtain complete results. The program mpiBLAST, an open-source parallelization of BLAST that achieves superlinear speedup, can be used to accelerate large-scale annotation by using supercomputers and high performance computing (HPC clusters. Although many parallel bioinformatics applications using the Message Passing Interface (MPI are available in the public domain, researchers are reluctant to use them due to lack of expertise in the Linux command line and relevant programming experience. With these limitations, it becomes difficult for biologists to use mpiBLAST for accelerating annotation. No web interface is available in the open-source domain for mpiBLAST. We have developed WImpiBLAST, a user-friendly open-source web interface for parallel BLAST searches. It is implemented in Struts 1.3 using a Java backbone and runs atop the open-source Apache Tomcat Server. WImpiBLAST supports script creation and job submission features and also provides a robust job management interface for system administrators. It combines script creation and modification features with job monitoring and management through the Torque resource manager on a Linux-based HPC cluster. Use case information highlights the acceleration of annotation analysis achieved by using WImpiBLAST. Here, we describe the WImpiBLAST web interface features and architecture
WImpiBLAST: web interface for mpiBLAST to help biologists perform large-scale annotation using high performance computing.

Science.gov (United States)

Sharma, Parichit; Mantri, Shrikant S

2014-01-01

The function of a newly sequenced gene can be discovered by determining its sequence homology with known proteins. BLAST is the most extensively used sequence analysis program for sequence similarity search in large databases of sequences. With the advent of next generation sequencing technologies it has now become possible to study genes and their expression at a genome-wide scale through RNA-seq and metagenome sequencing experiments. Functional annotation of all the genes is done by sequence similarity search against multiple protein databases. This annotation task is computationally very intensive and can take days to obtain complete results. The program mpiBLAST, an open-source parallelization of BLAST that achieves superlinear speedup, can be used to accelerate large-scale annotation by using supercomputers and high performance computing (HPC) clusters. Although many parallel bioinformatics applications using the Message Passing Interface (MPI) are available in the public domain, researchers are reluctant to use them due to lack of expertise in the Linux command line and relevant programming experience. With these limitations, it becomes difficult for biologists to use mpiBLAST for accelerating annotation. No web interface is available in the open-source domain for mpiBLAST. We have developed WImpiBLAST, a user-friendly open-source web interface for parallel BLAST searches. It is implemented in Struts 1.3 using a Java backbone and runs atop the open-source Apache Tomcat Server. WImpiBLAST supports script creation and job submission features and also provides a robust job management interface for system administrators. It combines script creation and modification features with job monitoring and management through the Torque resource manager on a Linux-based HPC cluster. Use case information highlights the acceleration of annotation analysis achieved by using WImpiBLAST. Here, we describe the WImpiBLAST web interface features and architecture, explain design
Large-scale hydrogen production using nuclear reactors

Energy Technology Data Exchange (ETDEWEB)

Ryland, D.; Stolberg, L.; Kettner, A.; Gnanapragasam, N.; Suppiah, S. [Atomic Energy of Canada Limited, Chalk River, ON (Canada)

2014-07-01

For many years, Atomic Energy of Canada Limited (AECL) has been studying the feasibility of using nuclear reactors, such as the Supercritical Water-cooled Reactor, as an energy source for large scale hydrogen production processes such as High Temperature Steam Electrolysis and the Copper-Chlorine thermochemical cycle. Recent progress includes the augmentation of AECL's experimental capabilities by the construction of experimental systems to test high temperature steam electrolysis button cells at ambient pressure and temperatures up to 850{sup o}C and CuCl/HCl electrolysis cells at pressures up to 7 bar and temperatures up to 100{sup o}C. In parallel, detailed models of solid oxide electrolysis cells and the CuCl/HCl electrolysis cell are being refined and validated using experimental data. Process models are also under development to assess options for economic integration of these hydrogen production processes with nuclear reactors. Options for large-scale energy storage, including hydrogen storage, are also under study. (author)
PALNS - A software framework for parallel large neighborhood search

DEFF Research Database (Denmark)

Røpke, Stefan

2009-01-01

This paper propose a simple, parallel, portable software framework for the metaheuristic named large neighborhood search (LNS). The aim is to provide a framework where the user has to set up a few data structures and implement a few functions and then the framework provides a metaheuristic where ...... parallelization "comes for free". We apply the parallel LNS heuristic to two different problems: the traveling salesman problem with pickup and delivery (TSPPD) and the capacitated vehicle routing problem (CVRP)....
Large scale simulations of lattice QCD thermodynamics on Columbia Parallel Supercomputers

International Nuclear Information System (INIS)

Ohta, Shigemi

1989-01-01

The Columbia Parallel Supercomputer project aims at the construction of a parallel processing, multi-gigaflop computer optimized for numerical simulations of lattice QCD. The project has three stages; 16-node, 1/4GF machine completed in April 1985, 64-node, 1GF machine completed in August 1987, and 256-node, 16GF machine now under construction. The machines all share a common architecture; a two dimensional torus formed from a rectangular array of N 1 x N 2 independent and identical processors. A processor is capable of operating in a multi-instruction multi-data mode, except for periods of synchronous interprocessor communication with its four nearest neighbors. Here the thermodynamics simulations on the two working machines are reported. (orig./HSI)
Using Agent Base Models to Optimize Large Scale Network for Large System Inventories

Science.gov (United States)

Shameldin, Ramez Ahmed; Bowling, Shannon R.

2010-01-01

The aim of this paper is to use Agent Base Models (ABM) to optimize large scale network handling capabilities for large system inventories and to implement strategies for the purpose of reducing capital expenses. The models used in this paper either use computational algorithms or procedure implementations developed by Matlab to simulate agent based models in a principal programming language and mathematical theory using clusters, these clusters work as a high performance computational performance to run the program in parallel computational. In both cases, a model is defined as compilation of a set of structures and processes assumed to underlie the behavior of a network system.
Enabling parallel simulation of large-scale HPC network systems

International Nuclear Information System (INIS)

Mubarak, Misbah; Carothers, Christopher D.; Ross, Robert B.; Carns, Philip

2016-01-01

Here, with the increasing complexity of today’s high-performance computing (HPC) architectures, simulation has become an indispensable tool for exploring the design space of HPC systems—in particular, networks. In order to make effective design decisions, simulations of these systems must possess the following properties: (1) have high accuracy and fidelity, (2) produce results in a timely manner, and (3) be able to analyze a broad range of network workloads. Most state-of-the-art HPC network simulation frameworks, however, are constrained in one or more of these areas. In this work, we present a simulation framework for modeling two important classes of networks used in today’s IBM and Cray supercomputers: torus and dragonfly networks. We use the Co-Design of Multi-layer Exascale Storage Architecture (CODES) simulation framework to simulate these network topologies at a flit-level detail using the Rensselaer Optimistic Simulation System (ROSS) for parallel discrete-event simulation. Our simulation framework meets all the requirements of a practical network simulation and can assist network designers in design space exploration. First, it uses validated and detailed flit-level network models to provide an accurate and high-fidelity network simulation. Second, instead of relying on serial time-stepped or traditional conservative discrete-event simulations that limit simulation scalability and efficiency, we use the optimistic event-scheduling capability of ROSS to achieve efficient and scalable HPC network simulations on today’s high-performance cluster systems. Third, our models give network designers a choice in simulating a broad range of network workloads, including HPC application workloads using detailed network traces, an ability that is rarely offered in parallel with high-fidelity network simulations
Vacuum Large Current Parallel Transfer Numerical Analysis

Directory of Open Access Journals (Sweden)

Enyuan Dong

2014-01-01

Full Text Available The stable operation and reliable breaking of large generator current are a difficult problem in power system. It can be solved successfully by the parallel interrupters and proper timing sequence with phase-control technology, in which the strategy of breaker’s control is decided by the time of both the first-opening phase and second-opening phase. The precise transfer current’s model can provide the proper timing sequence to break the generator circuit breaker. By analysis of the transfer current’s experiments and data, the real vacuum arc resistance and precise correctional model in the large transfer current’s process are obtained in this paper. The transfer time calculated by the correctional model of transfer current is very close to the actual transfer time. It can provide guidance for planning proper timing sequence and breaking the vacuum generator circuit breaker with the parallel interrupters.
Performance Characteristics of Hybrid MPI/OpenMP Implementations of NAS Parallel Benchmarks SP and BT on Large-Scale Multicore Clusters

KAUST Repository

Wu, X.; Taylor, V.

2011-01-01

The NAS Parallel Benchmarks (NPB) are well-known applications with fixed algorithms for evaluating parallel systems and tools. Multicore clusters provide a natural programming paradigm for hybrid programs, whereby OpenMP can be used with the data sharing with the multicores that comprise a node, and MPI can be used with the communication between nodes. In this paper, we use Scalar Pentadiagonal (SP) and Block Tridiagonal (BT) benchmarks of MPI NPB 3.3 as a basis for a comparative approach to implement hybrid MPI/OpenMP versions of SP and BT. In particular, we can compare the performance of the hybrid SP and BT with the MPI counterparts on large-scale multicore clusters, Intrepid (BlueGene/P) at Argonne National Laboratory and Jaguar (Cray XT4/5) at Oak Ridge National Laboratory. Our performance results indicate that the hybrid SP outperforms the MPI SP by up to 20.76 %, and the hybrid BT outperforms the MPI BT by up to 8.58 % on up to 10 000 cores on Intrepid and Jaguar. We also use performance tools and MPI trace libraries available on these clusters to further investigate the performance characteristics of the hybrid SP and BT. © 2011 The Author. Published by Oxford University Press on behalf of The British Computer Society. All rights reserved.
Performance Characteristics of Hybrid MPI/OpenMP Implementations of NAS Parallel Benchmarks SP and BT on Large-Scale Multicore Clusters

KAUST Repository

Wu, X.

2011-07-18

The NAS Parallel Benchmarks (NPB) are well-known applications with fixed algorithms for evaluating parallel systems and tools. Multicore clusters provide a natural programming paradigm for hybrid programs, whereby OpenMP can be used with the data sharing with the multicores that comprise a node, and MPI can be used with the communication between nodes. In this paper, we use Scalar Pentadiagonal (SP) and Block Tridiagonal (BT) benchmarks of MPI NPB 3.3 as a basis for a comparative approach to implement hybrid MPI/OpenMP versions of SP and BT. In particular, we can compare the performance of the hybrid SP and BT with the MPI counterparts on large-scale multicore clusters, Intrepid (BlueGene/P) at Argonne National Laboratory and Jaguar (Cray XT4/5) at Oak Ridge National Laboratory. Our performance results indicate that the hybrid SP outperforms the MPI SP by up to 20.76 %, and the hybrid BT outperforms the MPI BT by up to 8.58 % on up to 10 000 cores on Intrepid and Jaguar. We also use performance tools and MPI trace libraries available on these clusters to further investigate the performance characteristics of the hybrid SP and BT. © 2011 The Author. Published by Oxford University Press on behalf of The British Computer Society. All rights reserved.

Development of design technology on thermal-hydraulic performance in tight-lattice rod bundle. 4. Large paralleled simulation by the advanced two-fluid model code

International Nuclear Information System (INIS)

Misawa, Takeharu; Yoshida, Hiroyuki; Akimoto, Hajime

2008-01-01

In Japan Atomic Energy Agency (JAEA), the Innovative Water Reactor for Flexible Fuel Cycle (FLWR) has been developed. For thermal design of FLWR, it is necessary to develop analytical method to predict boiling transition of FLWR. Japan Atomic Energy Agency (JAEA) has been developing three-dimensional two-fluid model analysis code ACE-3D, which adopts boundary fitted coordinate system to simulate complex shape channel flow. In this paper, as a part of development of ACE-3D to apply to rod bundle analysis, introduction of parallelization to ACE-3D and assessments of ACE-3D are shown. In analysis of large-scale domain such as a rod bundle, even two-fluid model requires large number of computational cost, which exceeds upper limit of memory amount of 1 CPU. Therefore, parallelization was introduced to ACE-3D to divide data amount for analysis of large-scale domain among large number of CPUs, and it is confirmed that analysis of large-scale domain such as a rod bundle can be performed by parallel computation with keeping parallel computation performance even using large number of CPUs. ACE-3D adopts two-phase flow models, some of which are dependent upon channel geometry. Therefore, analyses in the domains, which simulate individual subchannel and 37 rod bundle, are performed, and compared with experiments. It is confirmed that the results obtained by both analyses using ACE-3D show agreement with past experimental result qualitatively. (author)
Parallel Implementation of the Multi-Dimensional Spectral Code SPECT3D on large 3D grids.

Science.gov (United States)

Golovkin, Igor E.; Macfarlane, Joseph J.; Woodruff, Pamela R.; Pereyra, Nicolas A.

2006-10-01

The multi-dimensional collisional-radiative, spectral analysis code SPECT3D can be used to study radiation from complex plasmas. SPECT3D can generate instantaneous and time-gated images and spectra, space-resolved and streaked spectra, which makes it a valuable tool for post-processing hydrodynamics calculations and direct comparison between simulations and experimental data. On large three dimensional grids, transporting radiation along lines of sight (LOS) requires substantial memory and CPU resources. Currently, the parallel option in SPECT3D is based on parallelization over photon frequencies and allows for a nearly linear speed-up for a variety of problems. In addition, we are introducing a new parallel mechanism that will greatly reduce memory requirements. In the new implementation, spatial domain decomposition will be utilized allowing transport along a LOS to be performed only on the mesh cells the LOS crosses. The ability to operate on a fraction of the grid is crucial for post-processing the results of large-scale three-dimensional hydrodynamics simulations. We will present a parallel implementation of the code and provide a scalability study performed on a Linux cluster.
Parallel algorithms for large-scale biological sequence alignment on Xeon-Phi based clusters.

Science.gov (United States)

Lan, Haidong; Chan, Yuandong; Xu, Kai; Schmidt, Bertil; Peng, Shaoliang; Liu, Weiguo

2016-07-19

Computing alignments between two or more sequences are common operations frequently performed in computational molecular biology. The continuing growth of biological sequence databases establishes the need for their efficient parallel implementation on modern accelerators. This paper presents new approaches to high performance biological sequence database scanning with the Smith-Waterman algorithm and the first stage of progressive multiple sequence alignment based on the ClustalW heuristic on a Xeon Phi-based compute cluster. Our approach uses a three-level parallelization scheme to take full advantage of the compute power available on this type of architecture; i.e. cluster-level data parallelism, thread-level coarse-grained parallelism, and vector-level fine-grained parallelism. Furthermore, we re-organize the sequence datasets and use Xeon Phi shuffle operations to improve I/O efficiency. Evaluations show that our method achieves a peak overall performance up to 220 GCUPS for scanning real protein sequence databanks on a single node consisting of two Intel E5-2620 CPUs and two Intel Xeon Phi 7110P cards. It also exhibits good scalability in terms of sequence length and size, and number of compute nodes for both database scanning and multiple sequence alignment. Furthermore, the achieved performance is highly competitive in comparison to optimized Xeon Phi and GPU implementations. Our implementation is available at https://github.com/turbo0628/LSDBS-mpi .
THE EFFECT OF INTERMITTENT GYRO-SCALE SLAB TURBULENCE ON PARALLEL AND PERPENDICULAR COSMIC-RAY TRANSPORT

International Nuclear Information System (INIS)

Le Roux, J. A.

2011-01-01

Earlier work based on nonlinear guiding center (NLGC) theory suggested that perpendicular cosmic-ray transport is diffusive when cosmic rays encounter random three-dimensional magnetohydrodynamic turbulence dominated by uniform two-dimensional (2D) turbulence with a minor uniform slab turbulence component. In this approach large-scale perpendicular cosmic-ray transport is due to cosmic rays microscopically diffusing along the meandering magnetic field dominated by 2D turbulence because of gyroresonant interactions with slab turbulence. However, turbulence in the solar wind is intermittent and it has been suggested that intermittent turbulence might be responsible for the observation of 'dropout' events in solar energetic particle fluxes on small scales. In a previous paper le Roux et al. suggested, using NLGC theory as a basis, that if gyro-scale slab turbulence is intermittent, large-scale perpendicular cosmic-ray transport in weak uniform 2D turbulence will be superdiffusive or subdiffusive depending on the statistical characteristics of the intermittent slab turbulence. In this paper we expand and refine our previous work further by investigating how both parallel and perpendicular transport are affected by intermittent slab turbulence for weak as well as strong uniform 2D turbulence. The main new finding is that both parallel and perpendicular transport are the net effect of an interplay between diffusive and nondiffusive (superdiffusive or subdiffusive) transport effects as a consequence of this intermittency.
THE EFFECT OF INTERMITTENT GYRO-SCALE SLAB TURBULENCE ON PARALLEL AND PERPENDICULAR COSMIC-RAY TRANSPORT

Energy Technology Data Exchange (ETDEWEB)

Le Roux, J. A. [Department of Physics, University of Alabama in Huntsville, Huntsville, AL 35899 (United States)

2011-12-10

Earlier work based on nonlinear guiding center (NLGC) theory suggested that perpendicular cosmic-ray transport is diffusive when cosmic rays encounter random three-dimensional magnetohydrodynamic turbulence dominated by uniform two-dimensional (2D) turbulence with a minor uniform slab turbulence component. In this approach large-scale perpendicular cosmic-ray transport is due to cosmic rays microscopically diffusing along the meandering magnetic field dominated by 2D turbulence because of gyroresonant interactions with slab turbulence. However, turbulence in the solar wind is intermittent and it has been suggested that intermittent turbulence might be responsible for the observation of 'dropout' events in solar energetic particle fluxes on small scales. In a previous paper le Roux et al. suggested, using NLGC theory as a basis, that if gyro-scale slab turbulence is intermittent, large-scale perpendicular cosmic-ray transport in weak uniform 2D turbulence will be superdiffusive or subdiffusive depending on the statistical characteristics of the intermittent slab turbulence. In this paper we expand and refine our previous work further by investigating how both parallel and perpendicular transport are affected by intermittent slab turbulence for weak as well as strong uniform 2D turbulence. The main new finding is that both parallel and perpendicular transport are the net effect of an interplay between diffusive and nondiffusive (superdiffusive or subdiffusive) transport effects as a consequence of this intermittency.
Development of a Robust and Efficient Parallel Solver for Unsteady Turbomachinery Flows

Science.gov (United States)

West, Jeff; Wright, Jeffrey; Thakur, Siddharth; Luke, Ed; Grinstead, Nathan

2012-01-01

The traditional design and analysis practice for advanced propulsion systems relies heavily on expensive full-scale prototype development and testing. Over the past decade, use of high-fidelity analysis and design tools such as CFD early in the product development cycle has been identified as one way to alleviate testing costs and to develop these devices better, faster and cheaper. In the design of advanced propulsion systems, CFD plays a major role in defining the required performance over the entire flight regime, as well as in testing the sensitivity of the design to the different modes of operation. Increased emphasis is being placed on developing and applying CFD models to simulate the flow field environments and performance of advanced propulsion systems. This necessitates the development of next generation computational tools which can be used effectively and reliably in a design environment. The turbomachinery simulation capability presented here is being developed in a computational tool called Loci-STREAM [1]. It integrates proven numerical methods for generalized grids and state-of-the-art physical models in a novel rule-based programming framework called Loci [2] which allows: (a) seamless integration of multidisciplinary physics in a unified manner, and (b) automatic handling of massively parallel computing. The objective is to be able to routinely simulate problems involving complex geometries requiring large unstructured grids and complex multidisciplinary physics. An immediate application of interest is simulation of unsteady flows in rocket turbopumps, particularly in cryogenic liquid rocket engines. The key components of the overall methodology presented in this paper are the following: (a) high fidelity unsteady simulation capability based on Detached Eddy Simulation (DES) in conjunction with second-order temporal discretization, (b) compliance with Geometric Conservation Law (GCL) in order to maintain conservative property on moving meshes for
A parallel form of the Gudjonsson Suggestibility Scale.

Science.gov (United States)

Gudjonsson, G H

1987-09-01

The purpose of this study is twofold: (1) to present a parallel form of the Gudjonsson Suggestibility Scale (GSS, Form 1); (2) to study test-retest reliabilities of interrogative suggestibility. Three groups of subjects were administered the two suggestibility scales in a counterbalanced order. Group 1 (28 normal subjects) and Group 2 (32 'forensic' patients) completed both scales within the same testing session, whereas Group 3 (30 'forensic' patients) completed the two scales between one week and eight months apart. All the correlations were highly significant, giving support for high 'temporal consistency' of interrogative suggestibility.
Constructing sites on a large scale

DEFF Research Database (Denmark)

Braae, Ellen Marie; Tietjen, Anne

2011-01-01

Since the 1990s, the regional scale has regained importance in urban and landscape design. In parallel, the focus in design tasks has shifted from master plans for urban extension to strategic urban transformation projects. A prominent example of a contemporary spatial development approach...... for setting the design brief in a large scale urban landscape in Norway, the Jaeren region around the city of Stavanger. In this paper, we first outline the methodological challenges and then present and discuss the proposed method based on our teaching experiences. On this basis, we discuss aspects...... is the IBA Emscher Park in the Ruhr area in Germany. Over a 10 years period (1988-1998), more than a 100 local transformation projects contributed to the transformation from an industrial to a post-industrial region. The current paradigm of planning by projects reinforces the role of the design disciplines...
Large scale parallel FEM computations of far/near stress field changes in rocks

Czech Academy of Sciences Publication Activity Database

Blaheta, Radim; Byczanski, Petr; Jakl, Ondřej; Kohut, Roman; Kolcun, Alexej; Krečmer, Karel; Starý, Jiří

2006-01-01

Roč. 22, č. 4 (2006), s. 449-459 ISSN 0167-739X R&D Projects: GA ČR(CZ) GA105/02/0492; GA AV ČR(CZ) 1ET400300415 Institutional research plan: CEZ:AV0Z30860518 Keywords : large scale finite element analysis Subject RIV: BA - General Mathematics Impact factor: 0.722, year: 2006
High performance parallel I/O

CERN Document Server

Prabhat

2014-01-01

Gain Critical Insight into the Parallel I/O EcosystemParallel I/O is an integral component of modern high performance computing (HPC), especially in storing and processing very large datasets to facilitate scientific discovery. Revealing the state of the art in this field, High Performance Parallel I/O draws on insights from leading practitioners, researchers, software architects, developers, and scientists who shed light on the parallel I/O ecosystem.The first part of the book explains how large-scale HPC facilities scope, configure, and operate systems, with an emphasis on choices of I/O har
Large-scale pool fires

Directory of Open Access Journals (Sweden)

Steinhaus Thomas

2007-01-01

Full Text Available A review of research into the burning behavior of large pool fires and fuel spill fires is presented. The features which distinguish such fires from smaller pool fires are mainly associated with the fire dynamics at low source Froude numbers and the radiative interaction with the fire source. In hydrocarbon fires, higher soot levels at increased diameters result in radiation blockage effects around the perimeter of large fire plumes; this yields lower emissive powers and a drastic reduction in the radiative loss fraction; whilst there are simplifying factors with these phenomena, arising from the fact that soot yield can saturate, there are other complications deriving from the intermittency of the behavior, with luminous regions of efficient combustion appearing randomly in the outer surface of the fire according the turbulent fluctuations in the fire plume. Knowledge of the fluid flow instabilities, which lead to the formation of large eddies, is also key to understanding the behavior of large-scale fires. Here modeling tools can be effectively exploited in order to investigate the fluid flow phenomena, including RANS- and LES-based computational fluid dynamics codes. The latter are well-suited to representation of the turbulent motions, but a number of challenges remain with their practical application. Massively-parallel computational resources are likely to be necessary in order to be able to adequately address the complex coupled phenomena to the level of detail that is necessary.
Cooperative parallel adaptive neighbourhood search for the disjunctively constrained knapsack problem

Science.gov (United States)

Quan, Zhe; Wu, Lei

2017-09-01

This article investigates the use of parallel computing for solving the disjunctively constrained knapsack problem. The proposed parallel computing model can be viewed as a cooperative algorithm based on a multi-neighbourhood search. The cooperation system is composed of a team manager and a crowd of team members. The team members aim at applying their own search strategies to explore the solution space. The team manager collects the solutions from the members and shares the best one with them. The performance of the proposed method is evaluated on a group of benchmark data sets. The results obtained are compared to those reached by the best methods from the literature. The results show that the proposed method is able to provide the best solutions in most cases. In order to highlight the robustness of the proposed parallel computing model, a new set of large-scale instances is introduced. Encouraging results have been obtained.
A Topology Visualization Early Warning Distribution Algorithm for Large-Scale Network Security Incidents

Directory of Open Access Journals (Sweden)

Hui He

2013-01-01

Full Text Available It is of great significance to research the early warning system for large-scale network security incidents. It can improve the network system’s emergency response capabilities, alleviate the cyber attacks’ damage, and strengthen the system’s counterattack ability. A comprehensive early warning system is presented in this paper, which combines active measurement and anomaly detection. The key visualization algorithm and technology of the system are mainly discussed. The large-scale network system’s plane visualization is realized based on the divide and conquer thought. First, the topology of the large-scale network is divided into some small-scale networks by the MLkP/CR algorithm. Second, the sub graph plane visualization algorithm is applied to each small-scale network. Finally, the small-scale networks’ topologies are combined into a topology based on the automatic distribution algorithm of force analysis. As the algorithm transforms the large-scale network topology plane visualization problem into a series of small-scale network topology plane visualization and distribution problems, it has higher parallelism and is able to handle the display of ultra-large-scale network topology.
GPU-based large-scale visualization

KAUST Repository

Hadwiger, Markus

2013-11-19

Recent advances in image and volume acquisition as well as computational advances in simulation have led to an explosion of the amount of data that must be visualized and analyzed. Modern techniques combine the parallel processing power of GPUs with out-of-core methods and data streaming to enable the interactive visualization of giga- and terabytes of image and volume data. A major enabler for interactivity is making both the computational and the visualization effort proportional to the amount of data that is actually visible on screen, decoupling it from the full data size. This leads to powerful display-aware multi-resolution techniques that enable the visualization of data of almost arbitrary size. The course consists of two major parts: An introductory part that progresses from fundamentals to modern techniques, and a more advanced part that discusses details of ray-guided volume rendering, novel data structures for display-aware visualization and processing, and the remote visualization of large online data collections. You will learn how to develop efficient GPU data structures and large-scale visualizations, implement out-of-core strategies and concepts such as virtual texturing that have only been employed recently, as well as how to use modern multi-resolution representations. These approaches reduce the GPU memory requirements of extremely large data to a working set size that fits into current GPUs. You will learn how to perform ray-casting of volume data of almost arbitrary size and how to render and process gigapixel images using scalable, display-aware techniques. We will describe custom virtual texturing architectures as well as recent hardware developments in this area. We will also describe client/server systems for distributed visualization, on-demand data processing and streaming, and remote visualization. We will describe implementations using OpenGL as well as CUDA, exploiting parallelism on GPUs combined with additional asynchronous
Phenomenology of two-dimensional stably stratified turbulence under large-scale forcing

KAUST Repository

Kumar, Abhishek; Verma, Mahendra K.; Sukhatme, Jai

2017-01-01

In this paper, we characterise the scaling of energy spectra, and the interscale transfer of energy and enstrophy, for strongly, moderately and weakly stably stratified two-dimensional (2D) turbulence, restricted in a vertical plane, under large-scale random forcing. In the strongly stratified case, a large-scale vertically sheared horizontal flow (VSHF) coexists with small scale turbulence. The VSHF consists of internal gravity waves and the turbulent flow has a kinetic energy (KE) spectrum that follows an approximate k−3 scaling with zero KE flux and a robust positive enstrophy flux. The spectrum of the turbulent potential energy (PE) also approximately follows a k−3 power-law and its flux is directed to small scales. For moderate stratification, there is no VSHF and the KE of the turbulent flow exhibits Bolgiano–Obukhov scaling that transitions from a shallow k−11/5 form at large scales, to a steeper approximate k−3 scaling at small scales. The entire range of scales shows a strong forward enstrophy flux, and interestingly, large (small) scales show an inverse (forward) KE flux. The PE flux in this regime is directed to small scales, and the PE spectrum is characterised by an approximate k−1.64 scaling. Finally, for weak stratification, KE is transferred upscale and its spectrum closely follows a k−2.5 scaling, while PE exhibits a forward transfer and its spectrum shows an approximate k−1.6 power-law. For all stratification strengths, the total energy always flows from large to small scales and almost all the spectral indicies are well explained by accounting for the scale-dependent nature of the corresponding flux.
Phenomenology of two-dimensional stably stratified turbulence under large-scale forcing

KAUST Repository

Kumar, Abhishek

2017-01-11

In this paper, we characterise the scaling of energy spectra, and the interscale transfer of energy and enstrophy, for strongly, moderately and weakly stably stratified two-dimensional (2D) turbulence, restricted in a vertical plane, under large-scale random forcing. In the strongly stratified case, a large-scale vertically sheared horizontal flow (VSHF) coexists with small scale turbulence. The VSHF consists of internal gravity waves and the turbulent flow has a kinetic energy (KE) spectrum that follows an approximate k−3 scaling with zero KE flux and a robust positive enstrophy flux. The spectrum of the turbulent potential energy (PE) also approximately follows a k−3 power-law and its flux is directed to small scales. For moderate stratification, there is no VSHF and the KE of the turbulent flow exhibits Bolgiano–Obukhov scaling that transitions from a shallow k−11/5 form at large scales, to a steeper approximate k−3 scaling at small scales. The entire range of scales shows a strong forward enstrophy flux, and interestingly, large (small) scales show an inverse (forward) KE flux. The PE flux in this regime is directed to small scales, and the PE spectrum is characterised by an approximate k−1.64 scaling. Finally, for weak stratification, KE is transferred upscale and its spectrum closely follows a k−2.5 scaling, while PE exhibits a forward transfer and its spectrum shows an approximate k−1.6 power-law. For all stratification strengths, the total energy always flows from large to small scales and almost all the spectral indicies are well explained by accounting for the scale-dependent nature of the corresponding flux.
Large scale tracking of stem cells using sparse coding and coupled graphs

DEFF Research Database (Denmark)

Vestergaard, Jacob Schack; Dahl, Anders Lindbjerg; Holm, Peter

Stem cell tracking is an inherently large scale problem. The challenge is to identify and track hundreds or thousands of cells over a time period of several weeks. This requires robust methods that can leverage the knowledge of specialists on the field. The tracking pipeline presented here consists...
Large-scale simulations of error-prone quantum computation devices

International Nuclear Information System (INIS)

Trieu, Doan Binh

2009-01-01

The theoretical concepts of quantum computation in the idealized and undisturbed case are well understood. However, in practice, all quantum computation devices do suffer from decoherence effects as well as from operational imprecisions. This work assesses the power of error-prone quantum computation devices using large-scale numerical simulations on parallel supercomputers. We present the Juelich Massively Parallel Ideal Quantum Computer Simulator (JUMPIQCS), that simulates a generic quantum computer on gate level. It comprises an error model for decoherence and operational errors. The robustness of various algorithms in the presence of noise has been analyzed. The simulation results show that for large system sizes and long computations it is imperative to actively correct errors by means of quantum error correction. We implemented the 5-, 7-, and 9-qubit quantum error correction codes. Our simulations confirm that using error-prone correction circuits with non-fault-tolerant quantum error correction will always fail, because more errors are introduced than being corrected. Fault-tolerant methods can overcome this problem, provided that the single qubit error rate is below a certain threshold. We incorporated fault-tolerant quantum error correction techniques into JUMPIQCS using Steane's 7-qubit code and determined this threshold numerically. Using the depolarizing channel as the source of decoherence, we find a threshold error rate of (5.2±0.2) x 10 -6 . For Gaussian distributed operational over-rotations the threshold lies at a standard deviation of 0.0431±0.0002. We can conclude that quantum error correction is especially well suited for the correction of operational imprecisions and systematic over-rotations. For realistic simulations of specific quantum computation devices we need to extend the generic model to dynamic simulations, i.e. time-dependent Hamiltonian simulations of realistic hardware models. We focus on today's most advanced technology, i
Multi-area market clearing in wind-integrated interconnected power systems: A fast parallel decentralized method

International Nuclear Information System (INIS)

Doostizadeh, Meysam; Aminifar, Farrokh; Lesani, Hamid; Ghasemi, Hassan

2016-01-01

Highlights: • A parallel-decentralized multi-area energy & reserve clearance model is proposed. • A fictitious area and joint variables coordinate & parallelize area market models. • Adjustable intervals of random variables compromise optimality and robustness. • The stochastic nature of problem is tackled in an efficient deterministic manner. • The model is compact and applicable in multi-area real-scale systems. - Abstract: The growing evolution of regional electricity markets and proliferation of wind power penetration underline the prominence of coordinated operation of interconnected regional power systems. This paper develops a parallel decentralized methodology for multi-area energy and reserve clearance under wind power uncertainty. Preserving the independency of regional markets while fully taking the advantages of interconnection is a salient feature of the new model. Additionally, the parallel procedure simultaneously clears regional markets for the sake of acceleration particularly in large-scale systems. In order to achieve the optimal solution in a distributed fashion, the augmented Lagrangian relaxation along with alternative direction method of multipliers are applied. The wind power intermittency and uncertainty are tackled through the interval optimization approach. Opposed to the conventional wisdom, adjustable intervals, as subsets of conventional predefined intervals, are introduced here to compromise the cost and conservatism of the solution. The confidence level approach is employed to accommodate the stochastic nature of wind power in a computationally efficient deterministic manner. The effectiveness and robustness of the proposed method are evaluated through several case studies on a two-area 6-bus and the modified three-area IEEE 118-bus test systems.
Imprint of non-linear effects on HI intensity mapping on large scales

Energy Technology Data Exchange (ETDEWEB)

Umeh, Obinna, E-mail: umeobinna@gmail.com [Department of Physics and Astronomy, University of the Western Cape, Cape Town 7535 (South Africa)

2017-06-01

Intensity mapping of the HI brightness temperature provides a unique way of tracing large-scale structures of the Universe up to the largest possible scales. This is achieved by using a low angular resolution radio telescopes to detect emission line from cosmic neutral Hydrogen in the post-reionization Universe. We use general relativistic perturbation theory techniques to derive for the first time the full expression for the HI brightness temperature up to third order in perturbation theory without making any plane-parallel approximation. We use this result and the renormalization prescription for biased tracers to study the impact of nonlinear effects on the power spectrum of HI brightness temperature both in real and redshift space. We show how mode coupling at nonlinear order due to nonlinear bias parameters and redshift space distortion terms modulate the power spectrum on large scales. The large scale modulation may be understood to be due to the effective bias parameter and effective shot noise.

Large-scale climatic anomalies affect marine predator foraging behaviour and demography

Science.gov (United States)

Bost, Charles A.; Cotté, Cedric; Terray, Pascal; Barbraud, Christophe; Bon, Cécile; Delord, Karine; Gimenez, Olivier; Handrich, Yves; Naito, Yasuhiko; Guinet, Christophe; Weimerskirch, Henri

2015-10-01

Determining the links between the behavioural and population responses of wild species to environmental variations is critical for understanding the impact of climate variability on ecosystems. Using long-term data sets, we show how large-scale climatic anomalies in the Southern Hemisphere affect the foraging behaviour and population dynamics of a key marine predator, the king penguin. When large-scale subtropical dipole events occur simultaneously in both subtropical Southern Indian and Atlantic Oceans, they generate tropical anomalies that shift the foraging zone southward. Consequently the distances that penguins foraged from the colony and their feeding depths increased and the population size decreased. This represents an example of a robust and fast impact of large-scale climatic anomalies affecting a marine predator through changes in its at-sea behaviour and demography, despite lack of information on prey availability. Our results highlight a possible behavioural mechanism through which climate variability may affect population processes.
Statistical analyses of scatterplots to identify important factors in large-scale simulations, 2: robustness of techniques

International Nuclear Information System (INIS)

Kleijnen, J.P.C.; Helton, J.C.

1999-01-01

The robustness of procedures for identifying patterns in scatterplots generated in Monte Carlo sensitivity analyses is investigated. These procedures are based on attempts to detect increasingly complex patterns in the scatterplots under consideration and involve the identification of (i) linear relationships with correlation coefficients, (ii) monotonic relationships with rank correlation coefficients, (iii) trends in central tendency as defined by means, medians and the Kruskal-Wallis statistic, (iv) trends in variability as defined by variances and interquartile ranges, and (v) deviations from randomness as defined by the chi-square statistic. The following two topics related to the robustness of these procedures are considered for a sequence of example analyses with a large model for two-phase fluid flow: the presence of Type I and Type II errors, and the stability of results obtained with independent Latin hypercube samples. Observations from analysis include: (i) Type I errors are unavoidable, (ii) Type II errors can occur when inappropriate analysis procedures are used, (iii) physical explanations should always be sought for why statistical procedures identify variables as being important, and (iv) the identification of important variables tends to be stable for independent Latin hypercube samples
Large Scale Parallel DNA Detection by Two-Dimensional Solid-State Multipore Systems.

Science.gov (United States)

Athreya, Nagendra Bala Murali; Sarathy, Aditya; Leburton, Jean-Pierre

2018-04-23

We describe a scalable device design of a dense array of multiple nanopores made from nanoscale semiconductor materials to detect and identify translocations of many biomolecules in a massively parallel detection scheme. We use molecular dynamics coupled to nanoscale device simulations to illustrate the ability of this device setup to uniquely identify DNA parallel translocations. We show that the transverse sheet currents along membranes are immune to the crosstalk effects arising from simultaneous translocations of biomolecules through multiple pores, due to their ability to sense only the local potential changes. We also show that electronic sensing across the nanopore membrane offers a higher detection resolution compared to ionic current blocking technique in a multipore setup, irrespective of the irregularities that occur while fabricating the nanopores in a two-dimensional membrane.
MacroBac: New Technologies for Robust and Efficient Large-Scale Production of Recombinant Multiprotein Complexes.

Science.gov (United States)

Gradia, Scott D; Ishida, Justin P; Tsai, Miaw-Sheue; Jeans, Chris; Tainer, John A; Fuss, Jill O

2017-01-01

Recombinant expression of large, multiprotein complexes is essential and often rate limiting for determining structural, biophysical, and biochemical properties of DNA repair, replication, transcription, and other key cellular processes. Baculovirus-infected insect cell expression systems are especially well suited for producing large, human proteins recombinantly, and multigene baculovirus systems have facilitated studies of multiprotein complexes. In this chapter, we describe a multigene baculovirus system called MacroBac that uses a Biobricks-type assembly method based on restriction and ligation (Series 11) or ligation-independent cloning (Series 438). MacroBac cloning and assembly is efficient and equally well suited for either single subcloning reactions or high-throughput cloning using 96-well plates and liquid handling robotics. MacroBac vectors are polypromoter with each gene flanked by a strong polyhedrin promoter and an SV40 poly(A) termination signal that minimize gene order expression level effects seen in many polycistronic assemblies. Large assemblies are robustly achievable, and we have successfully assembled as many as 10 genes into a single MacroBac vector. Importantly, we have observed significant increases in expression levels and quality of large, multiprotein complexes using a single, multigene, polypromoter virus rather than coinfection with multiple, single-gene viruses. Given the importance of characterizing functional complexes, we believe that MacroBac provides a critical enabling technology that may change the way that structural, biophysical, and biochemical research is done. © 2017 Elsevier Inc. All rights reserved.
Reliability-Based Robustness Analysis for a Croatian Sports Hall

DEFF Research Database (Denmark)

Čizmar, Dean; Kirkegaard, Poul Henning; Sørensen, John Dalsgaard

2011-01-01

This paper presents a probabilistic approach for structural robustness assessment for a timber structure built a few years ago. The robustness analysis is based on a structural reliability based framework for robustness and a simplified mechanical system modelling of a timber truss system....... A complex timber structure with a large number of failure modes is modelled with only a few dominant failure modes. First, a component based robustness analysis is performed based on the reliability indices of the remaining elements after the removal of selected critical elements. The robustness...... is expressed and evaluated by a robustness index. Next, the robustness is assessed using system reliability indices where the probabilistic failure model is modelled by a series system of parallel systems....
Developing a Massively Parallel Forward Projection Radiography Model for Large-Scale Industrial Applications

Energy Technology Data Exchange (ETDEWEB)

Bauerle, Matthew [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)

2014-08-01

This project utilizes Graphics Processing Units (GPUs) to compute radiograph simulations for arbitrary objects. The generation of radiographs, also known as the forward projection imaging model, is computationally intensive and not widely utilized. The goal of this research is to develop a massively parallel algorithm that can compute forward projections for objects with a trillion voxels (3D pixels). To achieve this end, the data are divided into blocks that can each t into GPU memory. The forward projected image is also divided into segments to allow for future parallelization and to avoid needless computations.
Large-scale parallel uncontracted multireference-averaged quadratic coupled cluster: the ground state of the chromium dimer revisited.

Science.gov (United States)

Müller, Thomas

2009-11-12

The accurate prediction of the potential energy function of the X1Sigmag+ state of Cr2 is a remarkable challenge; large differential electron correlation effects, significant scalar relativistic contributions, the need for large flexible basis sets containing g functions, the importance of semicore valence electron correlation, and its multireference nature pose considerable obstacles. So far, the only reasonable successful approaches were based on multireference perturbation theory (MRPT). Recently, there was some controversy in the literature about the role of error compensation and systematic defects of various MRPT implementations that cannot be easily overcome. A detailed basis set study of the potential energy function is presented, adopting a variational method. The method of choice for this electron-rich target with up to 28 correlated electrons is fully uncontracted multireference-averaged quadratic coupled cluster (MR-AQCC), which shares the flexibility of the multireference configuration interaction (MRCI) approach and is, in addition, approximately size-extensive (0.02 eV in error as compared to the MRCI value of 1.37 eV for two noninteracting chromium atoms). The best estimate for De arrives at 1.48 eV and agrees well with the experimental data of 1.47 +/- 0.056 eV. At the estimated CBS limit, the equilibrium bond distance (1.685 A) and vibrational frequency (459 cm-1) are in agreement with experiment (1.679 A, 481 cm-1). Large basis sets and reference configuration spaces invariably result in huge wave function expansions (here, up to 2.8 billion configuration state functions), and efficient parallel implementations of the method are crucial. Hence, relevant details on implementation and general performance of the parallel program code are discussed as well.
Understanding the faint red galaxy population using large-scale clustering measurements from SDSS DR7

OpenAIRE

Ross, Ashley; Tojeiro, Rita; Percival, Will

2011-01-01

We use data from the SDSS to investigate the evolution of the large-scale galaxy bias as a function of luminosity for red galaxies. We carefully consider correlation functions of galaxies selected from both photometric and spectroscopic data, and cross-correlations between them, to obtain multiple measurements of the large-scale bias. We find, for our most robust analyses, a strong increase in bias with luminosity for the most luminous galaxies, an intermediate regime where bias does not evol...
Non-parametric co-clustering of large scale sparse bipartite networks on the GPU

DEFF Research Database (Denmark)

Hansen, Toke Jansen; Mørup, Morten; Hansen, Lars Kai

2011-01-01

of row and column clusters from a hypothesis space of an infinite number of clusters. To reach large scale applications of co-clustering we exploit that parameter inference for co-clustering is well suited for parallel computing. We develop a generic GPU framework for efficient inference on large scale...... sparse bipartite networks and achieve a speedup of two orders of magnitude compared to estimation based on conventional CPUs. In terms of scalability we find for networks with more than 100 million links that reliable inference can be achieved in less than an hour on a single GPU. To efficiently manage...
A New Tool for Intelligent Parallel Processing of Radar/SAR Remotely Sensed Imagery

Directory of Open Access Journals (Sweden)

A. Castillo Atoche

2013-01-01

Full Text Available A novel parallel tool for large-scale image enhancement/reconstruction and postprocessing of radar/SAR sensor systems is addressed. The proposed parallel tool performs the following intelligent processing steps: image formation, for the application of different system-level effects of image degradation with a particular remote sensing (RS system and simulation of random noising effects, enhancement/reconstruction by employing nonparametric robust high-resolution techniques, and image postprocessing using the fuzzy anisotropic diffusion technique which incorporates a better edge-preserving noise removal effect and faster diffusion process. This innovative tool allows the processing of high-resolution images provided with different radar/SAR sensor systems as required by RS endusers for environmental monitoring, risk prevention, and resource management. To verify the performance implementation of the proposed parallel framework, the processing steps are developed and specifically tested on graphic processing units (GPU, achieving considerable speedups compared to the serial version of the same techniques implemented in C language.
Neural Parallel Engine: A toolbox for massively parallel neural signal processing.

Science.gov (United States)

Tam, Wing-Kin; Yang, Zhi

2018-05-01

Large-scale neural recordings provide detailed information on neuronal activities and can help elicit the underlying neural mechanisms of the brain. However, the computational burden is also formidable when we try to process the huge data stream generated by such recordings. In this study, we report the development of Neural Parallel Engine (NPE), a toolbox for massively parallel neural signal processing on graphical processing units (GPUs). It offers a selection of the most commonly used routines in neural signal processing such as spike detection and spike sorting, including advanced algorithms such as exponential-component-power-component (EC-PC) spike detection and binary pursuit spike sorting. We also propose a new method for detecting peaks in parallel through a parallel compact operation. Our toolbox is able to offer a 5× to 110× speedup compared with its CPU counterparts depending on the algorithms. A user-friendly MATLAB interface is provided to allow easy integration of the toolbox into existing workflows. Previous efforts on GPU neural signal processing only focus on a few rudimentary algorithms, are not well-optimized and often do not provide a user-friendly programming interface to fit into existing workflows. There is a strong need for a comprehensive toolbox for massively parallel neural signal processing. A new toolbox for massively parallel neural signal processing has been created. It can offer significant speedup in processing signals from large-scale recordings up to thousands of channels. Copyright © 2018 Elsevier B.V. All rights reserved.
OpenMP parallelization of a gridded SWAT (SWATG)

Science.gov (United States)

Zhang, Ying; Hou, Jinliang; Cao, Yongpan; Gu, Juan; Huang, Chunlin

2017-12-01

Large-scale, long-term and high spatial resolution simulation is a common issue in environmental modeling. A Gridded Hydrologic Response Unit (HRU)-based Soil and Water Assessment Tool (SWATG) that integrates grid modeling scheme with different spatial representations also presents such problems. The time-consuming problem affects applications of very high resolution large-scale watershed modeling. The OpenMP (Open Multi-Processing) parallel application interface is integrated with SWATG (called SWATGP) to accelerate grid modeling based on the HRU level. Such parallel implementation takes better advantage of the computational power of a shared memory computer system. We conducted two experiments at multiple temporal and spatial scales of hydrological modeling using SWATG and SWATGP on a high-end server. At 500-m resolution, SWATGP was found to be up to nine times faster than SWATG in modeling over a roughly 2000 km2 watershed with 1 CPU and a 15 thread configuration. The study results demonstrate that parallel models save considerable time relative to traditional sequential simulation runs. Parallel computations of environmental models are beneficial for model applications, especially at large spatial and temporal scales and at high resolutions. The proposed SWATGP model is thus a promising tool for large-scale and high-resolution water resources research and management in addition to offering data fusion and model coupling ability.
Robust mode space approach for atomistic modeling of realistically large nanowire transistors

Science.gov (United States)

Huang, Jun Z.; Ilatikhameneh, Hesameddin; Povolotskyi, Michael; Klimeck, Gerhard

2018-01-01

Nanoelectronic transistors have reached 3D length scales in which the number of atoms is countable. Truly atomistic device representations are needed to capture the essential functionalities of the devices. Atomistic quantum transport simulations of realistically extended devices are, however, computationally very demanding. The widely used mode space (MS) approach can significantly reduce the numerical cost, but a good MS basis is usually very hard to obtain for atomistic full-band models. In this work, a robust and parallel algorithm is developed to optimize the MS basis for atomistic nanowires. This enables engineering-level, reliable tight binding non-equilibrium Green's function simulation of nanowire metal-oxide-semiconductor field-effect transistor (MOSFET) with a realistic cross section of 10 nm × 10 nm using a small computer cluster. This approach is applied to compare the performance of InGaAs and Si nanowire n-type MOSFETs (nMOSFETs) with various channel lengths and cross sections. Simulation results with full-band accuracy indicate that InGaAs nanowire nMOSFETs have no drive current advantage over their Si counterparts for cross sections up to about 10 nm × 10 nm.
Robust estimation for partially linear models with large-dimensional covariates.

Science.gov (United States)

Zhu, LiPing; Li, RunZe; Cui, HengJian

2013-10-01

We are concerned with robust estimation procedures to estimate the parameters in partially linear models with large-dimensional covariates. To enhance the interpretability, we suggest implementing a noncon-cave regularization method in the robust estimation procedure to select important covariates from the linear component. We establish the consistency for both the linear and the nonlinear components when the covariate dimension diverges at the rate of [Formula: see text], where n is the sample size. We show that the robust estimate of linear component performs asymptotically as well as its oracle counterpart which assumes the baseline function and the unimportant covariates were known a priori. With a consistent estimator of the linear component, we estimate the nonparametric component by a robust local linear regression. It is proved that the robust estimate of nonlinear component performs asymptotically as well as if the linear component were known in advance. Comprehensive simulation studies are carried out and an application is presented to examine the finite-sample performance of the proposed procedures.
Robust Manufacturing Control

CERN Document Server

2013-01-01

This contributed volume collects research papers, presented at the CIRP Sponsored Conference Robust Manufacturing Control: Innovative and Interdisciplinary Approaches for Global Networks (RoMaC 2012, Jacobs University, Bremen, Germany, June 18th-20th 2012). These research papers present the latest developments and new ideas focusing on robust manufacturing control for global networks. Today, Global Production Networks (i.e. the nexus of interconnected material and information flows through which products and services are manufactured, assembled and distributed) are confronted with and expected to adapt to: sudden and unpredictable large-scale changes of important parameters which are occurring more and more frequently, event propagation in networks with high degree of interconnectivity which leads to unforeseen fluctuations, and non-equilibrium states which increasingly characterize daily business. These multi-scale changes deeply influence logistic target achievement and call for robust planning and control ...
Reduce, reuse, recycle for robust cluster-state generation

International Nuclear Information System (INIS)

Horsman, Clare; Brown, Katherine L.; Kendon, Vivien M.; Munro, William J.

2011-01-01

Efficient generation of cluster states is crucial for engineering large-scale measurement-based quantum computers. Hybrid matter-optical systems offer a robust, scalable path to this goal. Such systems have an ancilla which acts as a bus connecting the qubits. We show that by generating the cluster in smaller sections of interlocking bricks, reusing one ancilla per brick, the cluster can be produced with maximal efficiency, requiring fewer than half the operations compared with no bus reuse. By reducing the time required to prepare sections of the cluster, bus reuse more than doubles the size of the computational workspace that can be used before decoherence effects dominate. A row of buses in parallel provides fully scalable cluster-state generation requiring only 20 controlled-phase gates per bus use.
Large-scale micromagnetics simulations with dipolar interaction using all-to-all communications

Directory of Open Access Journals (Sweden)

Hiroshi Tsukahara

2016-05-01

Full Text Available We implement on our micromagnetics simulator low-complexity parallel fast-Fourier-transform algorithms, which reduces the frequency of all-to-all communications from six to two times. Almost all the computation time of micromagnetics simulation is taken up by the calculation of the magnetostatic field which can be calculated using the fast Fourier transform method. The results show that the simulation time is decreased with good scalability, even if the micromagentics simulation is performed using 8192 physical cores. This high parallelization effect enables large-scale micromagentics simulation using over one billion to be performed. Because massively parallel computing is needed to simulate the magnetization dynamics of real permanent magnets composed of many micron-sized grains, it is expected that our simulator reveals how magnetization dynamics influences the coercivity of the permanent magnet.
MOCC: A Fast and Robust Correlation-Based Method for Interest Point Matching under Large Scale Changes

OpenAIRE

Wang Hao; Gao Wen; Huang Qingming; Zhao Feng

2010-01-01

Similarity measures based on correlation have been used extensively for matching tasks. However, traditional correlation-based image matching methods are sensitive to rotation and scale changes. This paper presents a fast correlation-based method for matching two images with large rotation and significant scale changes. Multiscale oriented corner correlation (MOCC) is used to evaluate the degree of similarity between the feature points. The method is rotation invariant and capable of matchin...
Visual coherence for large-scale line-plot visualizations

KAUST Repository

Muigg, Philipp

2011-06-01

Displaying a large number of lines within a limited amount of screen space is a task that is common to many different classes of visualization techniques such as time-series visualizations, parallel coordinates, link-node diagrams, and phase-space diagrams. This paper addresses the challenging problems of cluttering and overdraw inherent to such visualizations. We generate a 2x2 tensor field during line rasterization that encodes the distribution of line orientations through each image pixel. Anisotropic diffusion of a noise texture is then used to generate a dense, coherent visualization of line orientation. In order to represent features of different scales, we employ a multi-resolution representation of the tensor field. The resulting technique can easily be applied to a wide variety of line-based visualizations. We demonstrate this for parallel coordinates, a time-series visualization, and a phase-space diagram. Furthermore, we demonstrate how to integrate a focus+context approach by incorporating a second tensor field. Our approach achieves interactive rendering performance for large data sets containing millions of data items, due to its image-based nature and ease of implementation on GPUs. Simulation results from computational fluid dynamics are used to evaluate the performance and usefulness of the proposed method. © 2011 The Author(s).
Visual coherence for large-scale line-plot visualizations

KAUST Repository

Muigg, Philipp; Hadwiger, Markus; Doleisch, Helmut; Grö ller, Eduard M.

2011-01-01

Displaying a large number of lines within a limited amount of screen space is a task that is common to many different classes of visualization techniques such as time-series visualizations, parallel coordinates, link-node diagrams, and phase-space diagrams. This paper addresses the challenging problems of cluttering and overdraw inherent to such visualizations. We generate a 2x2 tensor field during line rasterization that encodes the distribution of line orientations through each image pixel. Anisotropic diffusion of a noise texture is then used to generate a dense, coherent visualization of line orientation. In order to represent features of different scales, we employ a multi-resolution representation of the tensor field. The resulting technique can easily be applied to a wide variety of line-based visualizations. We demonstrate this for parallel coordinates, a time-series visualization, and a phase-space diagram. Furthermore, we demonstrate how to integrate a focus+context approach by incorporating a second tensor field. Our approach achieves interactive rendering performance for large data sets containing millions of data items, due to its image-based nature and ease of implementation on GPUs. Simulation results from computational fluid dynamics are used to evaluate the performance and usefulness of the proposed method. © 2011 The Author(s).

Traffic Flow Prediction Model for Large-Scale Road Network Based on Cloud Computing

Directory of Open Access Journals (Sweden)

Zhaosheng Yang

2014-01-01

Full Text Available To increase the efficiency and precision of large-scale road network traffic flow prediction, a genetic algorithm-support vector machine (GA-SVM model based on cloud computing is proposed in this paper, which is based on the analysis of the characteristics and defects of genetic algorithm and support vector machine. In cloud computing environment, firstly, SVM parameters are optimized by the parallel genetic algorithm, and then this optimized parallel SVM model is used to predict traffic flow. On the basis of the traffic flow data of Haizhu District in Guangzhou City, the proposed model was verified and compared with the serial GA-SVM model and parallel GA-SVM model based on MPI (message passing interface. The results demonstrate that the parallel GA-SVM model based on cloud computing has higher prediction accuracy, shorter running time, and higher speedup.
Finding Tropical Cyclones on a Cloud Computing Cluster: Using Parallel Virtualization for Large-Scale Climate Simulation Analysis

Energy Technology Data Exchange (ETDEWEB)

Hasenkamp, Daren; Sim, Alexander; Wehner, Michael; Wu, Kesheng

2010-09-30

Extensive computing power has been used to tackle issues such as climate changes, fusion energy, and other pressing scientific challenges. These computations produce a tremendous amount of data; however, many of the data analysis programs currently only run a single processor. In this work, we explore the possibility of using the emerging cloud computing platform to parallelize such sequential data analysis tasks. As a proof of concept, we wrap a program for analyzing trends of tropical cyclones in a set of virtual machines (VMs). This approach allows the user to keep their familiar data analysis environment in the VMs, while we provide the coordination and data transfer services to ensure the necessary input and output are directed to the desired locations. This work extensively exercises the networking capability of the cloud computing systems and has revealed a number of weaknesses in the current cloud system software. In our tests, we are able to scale the parallel data analysis job to a modest number of VMs and achieve a speedup that is comparable to running the same analysis task using MPI. However, compared to MPI based parallelization, the cloud-based approach has a number of advantages. The cloud-based approach is more flexible because the VMs can capture arbitrary software dependencies without requiring the user to rewrite their programs. The cloud-based approach is also more resilient to failure; as long as a single VM is running, it can make progress while as soon as one MPI node fails the whole analysis job fails. In short, this initial work demonstrates that a cloud computing system is a viable platform for distributed scientific data analyses traditionally conducted on dedicated supercomputing systems.
Finding Tropical Cyclones on a Cloud Computing Cluster: Using Parallel Virtualization for Large-Scale Climate Simulation Analysis

International Nuclear Information System (INIS)

Hasenkamp, Daren; Sim, Alexander; Wehner, Michael; Wu, Kesheng

2010-01-01

Extensive computing power has been used to tackle issues such as climate changes, fusion energy, and other pressing scientific challenges. These computations produce a tremendous amount of data; however, many of the data analysis programs currently only run a single processor. In this work, we explore the possibility of using the emerging cloud computing platform to parallelize such sequential data analysis tasks. As a proof of concept, we wrap a program for analyzing trends of tropical cyclones in a set of virtual machines (VMs). This approach allows the user to keep their familiar data analysis environment in the VMs, while we provide the coordination and data transfer services to ensure the necessary input and output are directed to the desired locations. This work extensively exercises the networking capability of the cloud computing systems and has revealed a number of weaknesses in the current cloud system software. In our tests, we are able to scale the parallel data analysis job to a modest number of VMs and achieve a speedup that is comparable to running the same analysis task using MPI. However, compared to MPI based parallelization, the cloud-based approach has a number of advantages. The cloud-based approach is more flexible because the VMs can capture arbitrary software dependencies without requiring the user to rewrite their programs. The cloud-based approach is also more resilient to failure; as long as a single VM is running, it can make progress while as soon as one MPI node fails the whole analysis job fails. In short, this initial work demonstrates that a cloud computing system is a viable platform for distributed scientific data analyses traditionally conducted on dedicated supercomputing systems.
Large-scale solar purchasing

International Nuclear Information System (INIS)

1999-01-01

The principal objective of the project was to participate in the definition of a new IEA task concerning solar procurement (''the Task'') and to assess whether involvement in the task would be in the interest of the UK active solar heating industry. The project also aimed to assess the importance of large scale solar purchasing to UK active solar heating market development and to evaluate the level of interest in large scale solar purchasing amongst potential large scale purchasers (in particular housing associations and housing developers). A further aim of the project was to consider means of stimulating large scale active solar heating purchasing activity within the UK. (author)
Parallel Monte Carlo reactor neutronics

International Nuclear Information System (INIS)

Blomquist, R.N.; Brown, F.B.

1994-01-01

The issues affecting implementation of parallel algorithms for large-scale engineering Monte Carlo neutron transport simulations are discussed. For nuclear reactor calculations, these include load balancing, recoding effort, reproducibility, domain decomposition techniques, I/O minimization, and strategies for different parallel architectures. Two codes were parallelized and tested for performance. The architectures employed include SIMD, MIMD-distributed memory, and workstation network with uneven interactive load. Speedups linear with the number of nodes were achieved
Large-scale simulations of error-prone quantum computation devices

Energy Technology Data Exchange (ETDEWEB)

Trieu, Doan Binh

2009-07-01

The theoretical concepts of quantum computation in the idealized and undisturbed case are well understood. However, in practice, all quantum computation devices do suffer from decoherence effects as well as from operational imprecisions. This work assesses the power of error-prone quantum computation devices using large-scale numerical simulations on parallel supercomputers. We present the Juelich Massively Parallel Ideal Quantum Computer Simulator (JUMPIQCS), that simulates a generic quantum computer on gate level. It comprises an error model for decoherence and operational errors. The robustness of various algorithms in the presence of noise has been analyzed. The simulation results show that for large system sizes and long computations it is imperative to actively correct errors by means of quantum error correction. We implemented the 5-, 7-, and 9-qubit quantum error correction codes. Our simulations confirm that using error-prone correction circuits with non-fault-tolerant quantum error correction will always fail, because more errors are introduced than being corrected. Fault-tolerant methods can overcome this problem, provided that the single qubit error rate is below a certain threshold. We incorporated fault-tolerant quantum error correction techniques into JUMPIQCS using Steane's 7-qubit code and determined this threshold numerically. Using the depolarizing channel as the source of decoherence, we find a threshold error rate of (5.2{+-}0.2) x 10{sup -6}. For Gaussian distributed operational over-rotations the threshold lies at a standard deviation of 0.0431{+-}0.0002. We can conclude that quantum error correction is especially well suited for the correction of operational imprecisions and systematic over-rotations. For realistic simulations of specific quantum computation devices we need to extend the generic model to dynamic simulations, i.e. time-dependent Hamiltonian simulations of realistic hardware models. We focus on today's most advanced
Robust haptic large distance telemanipulation for ITER

International Nuclear Information System (INIS)

Heck, D.J.F.; Heemskerk, C.J.M.; Koning, J.F.; Abbasi, A.; Nijmeijer, H.

2013-01-01

Highlights: • ITER remote handling maintenance can be controlled safely over a large distance. • Bilateral teleoperation experiments were performed in a local network. • Wave variables make the controller robust against constant communication delays. • Master and slave position synchronization guaranteed by proportional action. -- Abstract: During shutdowns, maintenance crews are expected to work in 24/6 shifts to perform critical remote handling maintenance tasks on the ITER system. In this article, we investigate the possibility to safely perform these haptic maintenance tasks remotely from control stations located anywhere around the world. To guarantee stability in time delayed bilateral teleoperation, the symmetric position tracking controller using wave variables is selected. This algorithm guarantees robustness against communication delays, can eliminate wave reflections and provide position synchronization of the master and slave devices. Experiments have been conducted under realistic local network bandwidth, latency and jitter constraints. They show sufficient transparency even for substantial communication delays
Robust haptic large distance telemanipulation for ITER

Energy Technology Data Exchange (ETDEWEB)

Heck, D.J.F., E-mail: d.j.f.heck@tue.nl [Eindhoven University of Technology, Department of Mechanical Engineering, Eindhoven (Netherlands); Heemskerk, C.J.M.; Koning, J.F. [Heemskerk Innovative Technologies, Sassenheim (Netherlands); Abbasi, A.; Nijmeijer, H. [Eindhoven University of Technology, Department of Mechanical Engineering, Eindhoven (Netherlands)

2013-10-15

Highlights: • ITER remote handling maintenance can be controlled safely over a large distance. • Bilateral teleoperation experiments were performed in a local network. • Wave variables make the controller robust against constant communication delays. • Master and slave position synchronization guaranteed by proportional action. -- Abstract: During shutdowns, maintenance crews are expected to work in 24/6 shifts to perform critical remote handling maintenance tasks on the ITER system. In this article, we investigate the possibility to safely perform these haptic maintenance tasks remotely from control stations located anywhere around the world. To guarantee stability in time delayed bilateral teleoperation, the symmetric position tracking controller using wave variables is selected. This algorithm guarantees robustness against communication delays, can eliminate wave reflections and provide position synchronization of the master and slave devices. Experiments have been conducted under realistic local network bandwidth, latency and jitter constraints. They show sufficient transparency even for substantial communication delays.
Massive parallel electromagnetic field simulation program JEMS-FDTD design and implementation on jasmin

International Nuclear Information System (INIS)

Li Hanyu; Zhou Haijing; Dong Zhiwei; Liao Cheng; Chang Lei; Cao Xiaolin; Xiao Li

2010-01-01

A large-scale parallel electromagnetic field simulation program JEMS-FDTD(J Electromagnetic Solver-Finite Difference Time Domain) is designed and implemented on JASMIN (J parallel Adaptive Structured Mesh applications INfrastructure). This program can simulate propagation, radiation, couple of electromagnetic field by solving Maxwell equations on structured mesh explicitly with FDTD method. JEMS-FDTD is able to simulate billion-mesh-scale problems on thousands of processors. In this article, the program is verified by simulating the radiation of an electric dipole. A beam waveguide is simulated to demonstrate the capability of large scale parallel computation. A parallel performance test indicates that a high parallel efficiency is obtained. (authors)
Detecting differential protein expression in large-scale population proteomics

Energy Technology Data Exchange (ETDEWEB)

Ryu, Soyoung; Qian, Weijun; Camp, David G.; Smith, Richard D.; Tompkins, Ronald G.; Davis, Ronald W.; Xiao, Wenzhong

2014-06-17

Mass spectrometry-based high-throughput quantitative proteomics shows great potential in clinical biomarker studies, identifying and quantifying thousands of proteins in biological samples. However, methods are needed to appropriately handle issues/challenges unique to mass spectrometry data in order to detect as many biomarker proteins as possible. One issue is that different mass spectrometry experiments generate quite different total numbers of quantified peptides, which can result in more missing peptide abundances in an experiment with a smaller total number of quantified peptides. Another issue is that the quantification of peptides is sometimes absent, especially for less abundant peptides and such missing values contain the information about the peptide abundance. Here, we propose a Significance Analysis for Large-scale Proteomics Studies (SALPS) that handles missing peptide intensity values caused by the two mechanisms mentioned above. Our model has a robust performance in both simulated data and proteomics data from a large clinical study. Because varying patients’ sample qualities and deviating instrument performances are not avoidable for clinical studies performed over the course of several years, we believe that our approach will be useful to analyze large-scale clinical proteomics data.
A Parallel, Finite-Volume Algorithm for Large-Eddy Simulation of Turbulent Flows

Science.gov (United States)

Bui, Trong T.

1999-01-01

A parallel, finite-volume algorithm has been developed for large-eddy simulation (LES) of compressible turbulent flows. This algorithm includes piecewise linear least-square reconstruction, trilinear finite-element interpolation, Roe flux-difference splitting, and second-order MacCormack time marching. Parallel implementation is done using the message-passing programming model. In this paper, the numerical algorithm is described. To validate the numerical method for turbulence simulation, LES of fully developed turbulent flow in a square duct is performed for a Reynolds number of 320 based on the average friction velocity and the hydraulic diameter of the duct. Direct numerical simulation (DNS) results are available for this test case, and the accuracy of this algorithm for turbulence simulations can be ascertained by comparing the LES solutions with the DNS results. The effects of grid resolution, upwind numerical dissipation, and subgrid-scale dissipation on the accuracy of the LES are examined. Comparison with DNS results shows that the standard Roe flux-difference splitting dissipation adversely affects the accuracy of the turbulence simulation. For accurate turbulence simulations, only 3-5 percent of the standard Roe flux-difference splitting dissipation is needed.
On the validity and robustness of the scale error phenomenon in early childhood.

Science.gov (United States)

DeLoache, Judy S; LoBue, Vanessa; Vanderborght, Mieke; Chiong, Cynthia

2013-02-01

Scale errors is a term referring to very young children's serious efforts to perform actions on miniature replica objects that are impossible due to great differences in the size of the child's body and the size of the target objects. We report three studies providing further documentation of scale errors and investigating the validity and robustness of the phenomenon. In the first, we establish that 2-year-olds' behavior in response to prompts to "pretend" with miniature replica objects differs dramatically from scale errors. The second and third studies address the robustness of the phenomenon and its relative imperviousness to attempts to influence the rate of scale errors. Copyright © 2012 Elsevier Inc. All rights reserved.
Visual attention mitigates information loss in small- and large-scale neural codes

Science.gov (United States)

Sprague, Thomas C; Saproo, Sameer; Serences, John T

2015-01-01

Summary The visual system transforms complex inputs into robust and parsimonious neural codes that efficiently guide behavior. Because neural communication is stochastic, the amount of encoded visual information necessarily decreases with each synapse. This constraint requires processing sensory signals in a manner that protects information about relevant stimuli from degradation. Such selective processing – or selective attention – is implemented via several mechanisms, including neural gain and changes in tuning properties. However, examining each of these effects in isolation obscures their joint impact on the fidelity of stimulus feature representations by large-scale population codes. Instead, large-scale activity patterns can be used to reconstruct representations of relevant and irrelevant stimuli, providing a holistic understanding about how neuron-level modulations collectively impact stimulus encoding. PMID:25769502
Scalable multi-objective control for large scale water resources systems under uncertainty

Science.gov (United States)

Giuliani, Matteo; Quinn, Julianne; Herman, Jonathan; Castelletti, Andrea; Reed, Patrick

2016-04-01

The use of mathematical models to support the optimal management of environmental systems is rapidly expanding over the last years due to advances in scientific knowledge of the natural processes, efficiency of the optimization techniques, and availability of computational resources. However, undergoing changes in climate and society introduce additional challenges for controlling these systems, ultimately motivating the emergence of complex models to explore key causal relationships and dependencies on uncontrolled sources of variability. In this work, we contribute a novel implementation of the evolutionary multi-objective direct policy search (EMODPS) method for controlling environmental systems under uncertainty. The proposed approach combines direct policy search (DPS) with hierarchical parallelization of multi-objective evolutionary algorithms (MOEAs) and offers a threefold advantage: the DPS simulation-based optimization can be combined with any simulation model and does not add any constraint on modeled information, allowing the use of exogenous information in conditioning the decisions. Moreover, the combination of DPS and MOEAs prompts the generation or Pareto approximate set of solutions for up to 10 objectives, thus overcoming the decision biases produced by cognitive myopia, where narrow or restrictive definitions of optimality strongly limit the discovery of decision relevant alternatives. Finally, the use of large-scale MOEAs parallelization improves the ability of the designed solutions in handling the uncertainty due to severe natural variability. The proposed approach is demonstrated on a challenging water resources management problem represented by the optimal control of a network of four multipurpose water reservoirs in the Red River basin (Vietnam). As part of the medium-long term energy and food security national strategy, four large reservoirs have been constructed on the Red River tributaries, which are mainly operated for hydropower
TOPOLOGY OF A LARGE-SCALE STRUCTURE AS A TEST OF MODIFIED GRAVITY

International Nuclear Information System (INIS)

Wang Xin; Chen Xuelei; Park, Changbom

2012-01-01

The genus of the isodensity contours is a robust measure of the topology of a large-scale structure, and it is relatively insensitive to nonlinear gravitational evolution, galaxy bias, and redshift-space distortion. We show that the growth of density fluctuations is scale dependent even in the linear regime in some modified gravity theories, which opens a new possibility of testing the theories observationally. We propose to use the genus of the isodensity contours, an intrinsic measure of the topology of the large-scale structure, as a statistic to be used in such tests. In Einstein's general theory of relativity, density fluctuations grow at the same rate on all scales in the linear regime, and the genus per comoving volume is almost conserved as structures grow homologously, so we expect that the genus-smoothing-scale relation is basically time independent. However, in some modified gravity models where structures grow with different rates on different scales, the genus-smoothing-scale relation should change over time. This can be used to test the gravity models with large-scale structure observations. We study the cases of the f(R) theory, DGP braneworld theory as well as the parameterized post-Friedmann models. We also forecast how the modified gravity models can be constrained with optical/IR or redshifted 21 cm radio surveys in the near future.
The build up of the correlation between halo spin and the large-scale structure

Science.gov (United States)

Wang, Peng; Kang, Xi

2018-01-01

Both simulations and observations have confirmed that the spin of haloes/galaxies is correlated with the large-scale structure (LSS) with a mass dependence such that the spin of low-mass haloes/galaxies tend to be parallel with the LSS, while that of massive haloes/galaxies tend to be perpendicular with the LSS. It is still unclear how this mass dependence is built up over time. We use N-body simulations to trace the evolution of the halo spin-LSS correlation and find that at early times the spin of all halo progenitors is parallel with the LSS. As time goes on, mass collapsing around massive halo is more isotropic, especially the recent mass accretion along the slowest collapsing direction is significant and it brings the halo spin to be perpendicular with the LSS. Adopting the fractional anisotropy (FA) parameter to describe the degree of anisotropy of the large-scale environment, we find that the spin-LSS correlation is a strong function of the environment such that a higher FA (more anisotropic environment) leads to an aligned signal, and a lower anisotropy leads to a misaligned signal. In general, our results show that the spin-LSS correlation is a combined consequence of mass flow and halo growth within the cosmic web. Our predicted environmental dependence between spin and large-scale structure can be further tested using galaxy surveys.
Large-Scale Optimization for Bayesian Inference in Complex Systems

Energy Technology Data Exchange (ETDEWEB)

Willcox, Karen [MIT; Marzouk, Youssef [MIT

2013-11-12

The SAGUARO (Scalable Algorithms for Groundwater Uncertainty Analysis and Robust Optimization) Project focused on the development of scalable numerical algorithms for large-scale Bayesian inversion in complex systems that capitalize on advances in large-scale simulation-based optimization and inversion methods. The project was a collaborative effort among MIT, the University of Texas at Austin, Georgia Institute of Technology, and Sandia National Laboratories. The research was directed in three complementary areas: efficient approximations of the Hessian operator, reductions in complexity of forward simulations via stochastic spectral approximations and model reduction, and employing large-scale optimization concepts to accelerate sampling. The MIT--Sandia component of the SAGUARO Project addressed the intractability of conventional sampling methods for large-scale statistical inverse problems by devising reduced-order models that are faithful to the full-order model over a wide range of parameter values; sampling then employs the reduced model rather than the full model, resulting in very large computational savings. Results indicate little effect on the computed posterior distribution. On the other hand, in the Texas--Georgia Tech component of the project, we retain the full-order model, but exploit inverse problem structure (adjoint-based gradients and partial Hessian information of the parameter-to-observation map) to implicitly extract lower dimensional information on the posterior distribution; this greatly speeds up sampling methods, so that fewer sampling points are needed. We can think of these two approaches as ``reduce then sample'' and ``sample then reduce.'' In fact, these two approaches are complementary, and can be used in conjunction with each other. Moreover, they both exploit deterministic inverse problem structure, in the form of adjoint-based gradient and Hessian information of the underlying parameter-to-observation map, to
The TeraShake Computational Platform for Large-Scale Earthquake Simulations

Science.gov (United States)

Cui, Yifeng; Olsen, Kim; Chourasia, Amit; Moore, Reagan; Maechling, Philip; Jordan, Thomas

Geoscientific and computer science researchers with the Southern California Earthquake Center (SCEC) are conducting a large-scale, physics-based, computationally demanding earthquake system science research program with the goal of developing predictive models of earthquake processes. The computational demands of this program continue to increase rapidly as these researchers seek to perform physics-based numerical simulations of earthquake processes for larger meet the needs of this research program, a multiple-institution team coordinated by SCEC has integrated several scientific codes into a numerical modeling-based research tool we call the TeraShake computational platform (TSCP). A central component in the TSCP is a highly scalable earthquake wave propagation simulation program called the TeraShake anelastic wave propagation (TS-AWP) code. In this chapter, we describe how we extended an existing, stand-alone, wellvalidated, finite-difference, anelastic wave propagation modeling code into the highly scalable and widely used TS-AWP and then integrated this code into the TeraShake computational platform that provides end-to-end (initialization to analysis) research capabilities. We also describe the techniques used to enhance the TS-AWP parallel performance on TeraGrid supercomputers, as well as the TeraShake simulations phases including input preparation, run time, data archive management, and visualization. As a result of our efforts to improve its parallel efficiency, the TS-AWP has now shown highly efficient strong scaling on over 40K processors on IBM’s BlueGene/L Watson computer. In addition, the TSCP has developed into a computational system that is useful to many members of the SCEC community for performing large-scale earthquake simulations.
Robust Face Recognition via Multi-Scale Patch-Based Matrix Regression.

Directory of Open Access Journals (Sweden)

Guangwei Gao

Full Text Available In many real-world applications such as smart card solutions, law enforcement, surveillance and access control, the limited training sample size is the most fundamental problem. By making use of the low-rank structural information of the reconstructed error image, the so-called nuclear norm-based matrix regression has been demonstrated to be effective for robust face recognition with continuous occlusions. However, the recognition performance of nuclear norm-based matrix regression degrades greatly in the face of the small sample size problem. An alternative solution to tackle this problem is performing matrix regression on each patch and then integrating the outputs from all patches. However, it is difficult to set an optimal patch size across different databases. To fully utilize the complementary information from different patch scales for the final decision, we propose a multi-scale patch-based matrix regression scheme based on which the ensemble of multi-scale outputs can be achieved optimally. Extensive experiments on benchmark face databases validate the effectiveness and robustness of our method, which outperforms several state-of-the-art patch-based face recognition algorithms.
Scalable Parallel Methods for Analyzing Metagenomics Data at Extreme Scale

Energy Technology Data Exchange (ETDEWEB)

Daily, Jeffrey A. [Washington State Univ., Pullman, WA (United States)

2015-05-01

The field of bioinformatics and computational biology is currently experiencing a data revolution. The exciting prospect of making fundamental biological discoveries is fueling the rapid development and deployment of numerous cost-effective, high-throughput next-generation sequencing technologies. The result is that the DNA and protein sequence repositories are being bombarded with new sequence information. Databases are continuing to report a Moore’s law-like growth trajectory in their database sizes, roughly doubling every 18 months. In what seems to be a paradigm-shift, individual projects are now capable of generating billions of raw sequence data that need to be analyzed in the presence of already annotated sequence information. While it is clear that data-driven methods, such as sequencing homology detection, are becoming the mainstay in the field of computational life sciences, the algorithmic advancements essential for implementing complex data analytics at scale have mostly lagged behind. Sequence homology detection is central to a number of bioinformatics applications including genome sequencing and protein family characterization. Given millions of sequences, the goal is to identify all pairs of sequences that are highly similar (or “homologous”) on the basis of alignment criteria. While there are optimal alignment algorithms to compute pairwise homology, their deployment for large-scale is currently not feasible; instead, heuristic methods are used at the expense of quality. In this dissertation, we present the design and evaluation of a parallel implementation for conducting optimal homology detection on distributed memory supercomputers. Our approach uses a combination of techniques from asynchronous load balancing (viz. work stealing, dynamic task counters), data replication, and exact-matching filters to achieve homology detection at scale. Results for a collection of 2.56M sequences show parallel efficiencies of ~75-100% on up to 8K cores

Scalable Parallel Methods for Analyzing Metagenomics Data at Extreme Scale

International Nuclear Information System (INIS)

Daily, Jeffrey A.

2015-01-01

The field of bioinformatics and computational biology is currently experiencing a data revolution. The exciting prospect of making fundamental biological discoveries is fueling the rapid development and deployment of numerous cost-effective, high-throughput next-generation sequencing technologies. The result is that the DNA and protein sequence repositories are being bombarded with new sequence information. Databases are continuing to report a Moore's law-like growth trajectory in their database sizes, roughly doubling every 18 months. In what seems to be a paradigm-shift, individual projects are now capable of generating billions of raw sequence data that need to be analyzed in the presence of already annotated sequence information. While it is clear that data-driven methods, such as sequencing homology detection, are becoming the mainstay in the field of computational life sciences, the algorithmic advancements essential for implementing complex data analytics at scale have mostly lagged behind. Sequence homology detection is central to a number of bioinformatics applications including genome sequencing and protein family characterization. Given millions of sequences, the goal is to identify all pairs of sequences that are highly similar (or 'homologous') on the basis of alignment criteria. While there are optimal alignment algorithms to compute pairwise homology, their deployment for large-scale is currently not feasible; instead, heuristic methods are used at the expense of quality. In this dissertation, we present the design and evaluation of a parallel implementation for conducting optimal homology detection on distributed memory supercomputers. Our approach uses a combination of techniques from asynchronous load balancing (viz. work stealing, dynamic task counters), data replication, and exact-matching filters to achieve homology detection at scale. Results for a collection of 2.56M sequences show parallel efficiencies of ~75-100% on up to 8K
Parallel Computation of RCS of Electrically Large Platform with Coatings Modeled with NURBS Surfaces

Directory of Open Access Journals (Sweden)

Ying Yan

2012-01-01

Full Text Available The significance of Radar Cross Section (RCS in the military applications makes its prediction an important problem. This paper uses large-scale parallel Physical Optics (PO to realize the fast computation of RCS to electrically large targets, which are modeled by Non-Uniform Rational B-Spline (NURBS surfaces and coated with dielectric materials. Some numerical examples are presented to validate this paper’s method. In addition, 1024 CPUs are used in Shanghai Supercomputer Center (SSC to perform the simulation of a model with the maximum electrical size 1966.7 λ for the first time in China. From which, it can be found that this paper’s method can greatly speed the calculation and is capable of solving the real-life problem of RCS prediction.
Adaptive robust trajectory tracking control of a parallel manipulator driven by pneumatic cylinders

Directory of Open Access Journals (Sweden)

Ce Shang

2016-04-01

Full Text Available Due to the compressibility of air, non-linear characteristics, and parameter uncertainties of pneumatic elements, the position control of a pneumatic cylinder or parallel platform is still very difficult while comparing with the systems driven by electric or hydraulic power. In this article, based on the basic dynamic model and descriptions of thermal processes, a controller integrated with online parameter estimation is proposed to improve the performance of a pneumatic cylinder controlled by a proportional valve. The trajectory tracking error is significantly decreased by applying this method. Moreover, the algorithm is expanded to the problem of posture trajectory tracking for the three-revolute prismatic spherical pneumatic parallel manipulator. Lyapunov’s method is used to give the proof of stability of the controller. Using NI-CompactRio, NI-PXI, and Veristand platform as the realistic controller hardware and data interactive environment, the adaptive robust control algorithm is applied to the physical system successfully. Experimental results and data analysis showed that the posture error of the platform could be about 0.5%–0.7% of the desired trajectory amplitude. By integrating this method to the mechatronic system, the pneumatic servo solutions can be much more competitive in the industrial market of position and posture control.
Parallel Simulation of Three-Dimensional Free Surface Fluid Flow Problems

International Nuclear Information System (INIS)

BAER, THOMAS A.; SACKINGER, PHILIP A.; SUBIA, SAMUEL R.

1999-01-01

Simulation of viscous three-dimensional fluid flow typically involves a large number of unknowns. When free surfaces are included, the number of unknowns increases dramatically. Consequently, this class of problem is an obvious application of parallel high performance computing. We describe parallel computation of viscous, incompressible, free surface, Newtonian fluid flow problems that include dynamic contact fines. The Galerkin finite element method was used to discretize the fully-coupled governing conservation equations and a ''pseudo-solid'' mesh mapping approach was used to determine the shape of the free surface. In this approach, the finite element mesh is allowed to deform to satisfy quasi-static solid mechanics equations subject to geometric or kinematic constraints on the boundaries. As a result, nodal displacements must be included in the set of unknowns. Other issues discussed are the proper constraints appearing along the dynamic contact line in three dimensions. Issues affecting efficient parallel simulations include problem decomposition to equally distribute computational work among a SPMD computer and determination of robust, scalable preconditioners for the distributed matrix systems that must be solved. Solution continuation strategies important for serial simulations have an enhanced relevance in a parallel coquting environment due to the difficulty of solving large scale systems. Parallel computations will be demonstrated on an example taken from the coating flow industry: flow in the vicinity of a slot coater edge. This is a three dimensional free surface problem possessing a contact line that advances at the web speed in one region but transitions to static behavior in another region. As such, a significant fraction of the computational time is devoted to processing boundary data. Discussion focuses on parallel speed ups for fixed problem size, a class of problems of immediate practical importance
Integrated fringe projection 3D scanning system for large-scale metrology based on laser tracker

Science.gov (United States)

Du, Hui; Chen, Xiaobo; Zhou, Dan; Guo, Gen; Xi, Juntong

2017-10-01

Large scale components exist widely in advance manufacturing industry,3D profilometry plays a pivotal role for the quality control. This paper proposes a flexible, robust large-scale 3D scanning system by integrating a robot with a binocular structured light scanner and a laser tracker. The measurement principle and system construction of the integrated system are introduced. And a mathematical model is established for the global data fusion. Subsequently, a flexible and robust method and mechanism is introduced for the establishment of the end coordination system. Based on this method, a virtual robot noumenon is constructed for hand-eye calibration. And then the transformation matrix between end coordination system and world coordination system is solved. Validation experiment is implemented for verifying the proposed algorithms. Firstly, hand-eye transformation matrix is solved. Then a car body rear is measured for 16 times for the global data fusion algorithm verification. And the 3D shape of the rear is reconstructed successfully.
An Experiment of Robust Parallel Algorithm for the Eigenvalue problem of a Multigroup Neutron Diffusion based on modified FETI-DP

Energy Technology Data Exchange (ETDEWEB)

Chang, Jonghwa [Korea Atomic Energy Research Institute, Daejeon (Korea, Republic of)

2014-05-15

Parallelization of Monte Carlo simulation is widely adpoted. There are also several parallel algorithms developed for the SN transport theory using the parallel wave sweeping algorithm and for the CPM using parallel ray tracing. For practical purpose of reactor physics application, the thermal feedback and burnup effects on the multigroup cross section should be considered. In this respect, the domain decomposition method(DDM) is suitable for distributing the expensive cross section calculation work. Parallel transport code and diffusion code based on the Raviart-Thomas mixed finite element method was developed. However most of the developed methods rely on the heuristic convergence of flux and current at the domain interfaces. Convergence was not attained in some cases. Mechanical stress computation community has also work on the DDM to solve the stress-strain equation using the finite element methods. The most successful domain decomposition method in terms of robustness is FETI-DP. We have modified the original FETI-DP to solve the eigenvalue problem for the multigroup diffusion problem in this study.
Performance of Air Pollution Models on Massively Parallel Computers

DEFF Research Database (Denmark)

Brown, John; Hansen, Per Christian; Wasniewski, Jerzy

1996-01-01

To compare the performance and use of three massively parallel SIMD computers, we implemented a large air pollution model on the computers. Using a realistic large-scale model, we gain detailed insight about the performance of the three computers when used to solve large-scale scientific problems...
An inertia-free filter line-search algorithm for large-scale nonlinear programming

Energy Technology Data Exchange (ETDEWEB)

Chiang, Nai-Yuan; Zavala, Victor M.

2016-02-15

We present a filter line-search algorithm that does not require inertia information of the linear system. This feature enables the use of a wide range of linear algebra strategies and libraries, which is essential to tackle large-scale problems on modern computing architectures. The proposed approach performs curvature tests along the search step to detect negative curvature and to trigger convexification. We prove that the approach is globally convergent and we implement the approach within a parallel interior-point framework to solve large-scale and highly nonlinear problems. Our numerical tests demonstrate that the inertia-free approach is as efficient as inertia detection via symmetric indefinite factorizations. We also demonstrate that the inertia-free approach can lead to reductions in solution time because it reduces the amount of convexification needed.
The parallel volume at large distances

DEFF Research Database (Denmark)

Kampf, Jürgen

In this paper we examine the asymptotic behavior of the parallel volume of planar non-convex bodies as the distance tends to infinity. We show that the difference between the parallel volume of the convex hull of a body and the parallel volume of the body itself tends to . This yields a new proof...... for the fact that a planar body can only have polynomial parallel volume, if it is convex. Extensions to Minkowski spaces and random sets are also discussed....
The parallel volume at large distances

DEFF Research Database (Denmark)

Kampf, Jürgen

In this paper we examine the asymptotic behavior of the parallel volume of planar non-convex bodies as the distance tends to infinity. We show that the difference between the parallel volume of the convex hull of a body and the parallel volume of the body itself tends to 0. This yields a new proof...... for the fact that a planar body can only have polynomial parallel volume, if it is convex. Extensions to Minkowski spaces and random sets are also discussed....
On distributed wavefront reconstruction for large-scale adaptive optics systems.

Science.gov (United States)

de Visser, Cornelis C; Brunner, Elisabeth; Verhaegen, Michel

2016-05-01

The distributed-spline-based aberration reconstruction (D-SABRE) method is proposed for distributed wavefront reconstruction with applications to large-scale adaptive optics systems. D-SABRE decomposes the wavefront sensor domain into any number of partitions and solves a local wavefront reconstruction problem on each partition using multivariate splines. D-SABRE accuracy is within 1% of a global approach with a speedup that scales quadratically with the number of partitions. The D-SABRE is compared to the distributed cumulative reconstruction (CuRe-D) method in open-loop and closed-loop simulations using the YAO adaptive optics simulation tool. D-SABRE accuracy exceeds CuRe-D for low levels of decomposition, and D-SABRE proved to be more robust to variations in the loop gain.
Energy transfers in large-scale and small-scale dynamos

Science.gov (United States)

Samtaney, Ravi; Kumar, Rohit; Verma, Mahendra

2015-11-01

We present the energy transfers, mainly energy fluxes and shell-to-shell energy transfers in small-scale dynamo (SSD) and large-scale dynamo (LSD) using numerical simulations of MHD turbulence for Pm = 20 (SSD) and for Pm = 0.2 on 10243 grid. For SSD, we demonstrate that the magnetic energy growth is caused by nonlocal energy transfers from the large-scale or forcing-scale velocity field to small-scale magnetic field. The peak of these energy transfers move towards lower wavenumbers as dynamo evolves, which is the reason for the growth of the magnetic fields at the large scales. The energy transfers U2U (velocity to velocity) and B2B (magnetic to magnetic) are forward and local. For LSD, we show that the magnetic energy growth takes place via energy transfers from large-scale velocity field to large-scale magnetic field. We observe forward U2U and B2B energy flux, similar to SSD.
Visual attention mitigates information loss in small- and large-scale neural codes.

Science.gov (United States)

Sprague, Thomas C; Saproo, Sameer; Serences, John T

2015-04-01

The visual system transforms complex inputs into robust and parsimonious neural codes that efficiently guide behavior. Because neural communication is stochastic, the amount of encoded visual information necessarily decreases with each synapse. This constraint requires that sensory signals are processed in a manner that protects information about relevant stimuli from degradation. Such selective processing--or selective attention--is implemented via several mechanisms, including neural gain and changes in tuning properties. However, examining each of these effects in isolation obscures their joint impact on the fidelity of stimulus feature representations by large-scale population codes. Instead, large-scale activity patterns can be used to reconstruct representations of relevant and irrelevant stimuli, thereby providing a holistic understanding about how neuron-level modulations collectively impact stimulus encoding. Copyright © 2015 Elsevier Ltd. All rights reserved.
Time delay effects on large-scale MR damper based semi-active control strategies

International Nuclear Information System (INIS)

Cha, Y-J; Agrawal, A K; Dyke, S J

2013-01-01

This paper presents a detailed investigation on the robustness of large-scale 200 kN MR damper based semi-active control strategies in the presence of time delays in the control system. Although the effects of time delay on stability and performance degradation of an actively controlled system have been investigated extensively by many researchers, degradation in the performance of semi-active systems due to time delay has yet to be investigated. Since semi-active systems are inherently stable, instability problems due to time delay are unlikely to arise. This paper investigates the effects of time delay on the performance of a building with a large-scale MR damper, using numerical simulations of near- and far-field earthquakes. The MR damper is considered to be controlled by four different semi-active control algorithms, namely (i) clipped-optimal control (COC), (ii) decentralized output feedback polynomial control (DOFPC), (iii) Lyapunov control, and (iv) simple-passive control (SPC). It is observed that all controllers except for the COC are significantly robust with respect to time delay. On the other hand, the clipped-optimal controller should be integrated with a compensator to improve the performance in the presence of time delay. (paper)
Large-scale Cosmic-Ray Anisotropy as a Probe of Interstellar Turbulence

Energy Technology Data Exchange (ETDEWEB)

Giacinti, Gwenael; Kirk, John G. [Max-Planck-Institut für Kernphysik, Postfach 103980, D-69029 Heidelberg (Germany)

2017-02-01

We calculate the large-scale cosmic-ray (CR) anisotropies predicted for a range of Goldreich–Sridhar (GS) and isotropic models of interstellar turbulence, and compare them with IceTop data. In general, the predicted CR anisotropy is not a pure dipole; the cold spots reported at 400 TeV and 2 PeV are consistent with a GS model that contains a smooth deficit of parallel-propagating waves and a broad resonance function, though some other possibilities cannot, as yet, be ruled out. In particular, isotropic fast magnetosonic wave turbulence can match the observations at high energy, but cannot accommodate an energy dependence in the shape of the CR anisotropy. Our findings suggest that improved data on the large-scale CR anisotropy could provide a valuable probe of the properties—notably the power-spectrum—of the interstellar turbulence within a few tens of parsecs from Earth.
A comparison of parallel dust and fibre measurements of airborne chrysotile asbestos in a large mine and processing factories in the Russian Federation

NARCIS (Netherlands)

Feletto, Eleonora; Schonfeld, Sara J; Kovalevskiy, Evgeny V; Bukhtiyarov, Igor V; Kashanskiy, Sergey V; Moissonnier, Monika; Straif, Kurt; Kromhout, Hans

2017-01-01

INTRODUCTION: Historic dust concentrations are available in a large-scale cohort study of workers in a chrysotile mine and processing factories in Asbest, Russian Federation. Parallel dust (gravimetric) and fibre (phase-contrast optical microscopy) concentrations collected in 1995, 2007 and 2013/14
Solving Large Quadratic|Assignment Problems in Parallel

DEFF Research Database (Denmark)

Clausen, Jens; Perregaard, Michael

1997-01-01

and recalculation of bounds between branchings when used in a parallel Branch-and-Bound algorithm. The algorithm has been implemented on a 16-processor MEIKO Computing Surface with Intel i860 processors. Computational results from the solution of a number of large QAPs, including the classical Nugent 20...... processors, and have hence not been ideally suited for computations essentially involving non-vectorizable computations on integers.In this paper we investigate the combination of one of the best bound functions for a Branch-and-Bound algorithm (the Gilmore-Lawler bound) and various testing, variable binding...
Scaling up the Fabrication of Mechanically-Robust Carbon Nanofiber Foams

Directory of Open Access Journals (Sweden)

William Curtin

2016-02-01

Full Text Available This work aimed to identify and address the main challenges associated with fabricating large samples of carbon foams composed of interwoven networks of carbon nanofibers. Solutions to two difficulties related with the process of fabricating carbon foams, maximum foam size and catalyst cost, were developed. First, a simple physical method was invented to scale-up the constrained formation of fibrous nanostructures process (CoFFiN to fabricate relatively large foams. Specifically, a gas deflector system capable of maintaining conditions supportive of carbon nanofiber foam growth throughout a relatively large mold was developed. ANSYS CFX models were used to simulate the gas flow paths with and without deflectors; the data generated proved to be a very useful tool for the deflector design. Second, a simple method for selectively leaching the Pd catalyst material trapped in the foam during growth was successfully tested. Multiple techniques, including scanning electron microscopy, surface area measurements, and mechanical testing, were employed to characterize the foams generated in this study. All results confirmed that the larger foam samples preserve the basic characteristics: their interwoven nanofiber microstructure forms a low-density tridimensional solid with viscoelastic behavior. Fiber growth mechanisms are also discussed. Larger samples of mechanically-robust carbon nanofiber foams will enable the use of these materials as strain sensors, shock absorbers, selective absorbents for environmental remediation and electrodes for energy storage devices, among other applications.
Massively Parallel Finite Element Programming

KAUST Repository

Heister, Timo; Kronbichler, Martin; Bangerth, Wolfgang

2010-01-01

Today's large finite element simulations require parallel algorithms to scale on clusters with thousands or tens of thousands of processor cores. We present data structures and algorithms to take advantage of the power of high performance computers in generic finite element codes. Existing generic finite element libraries often restrict the parallelization to parallel linear algebra routines. This is a limiting factor when solving on more than a few hundreds of cores. We describe routines for distributed storage of all major components coupled with efficient, scalable algorithms. We give an overview of our effort to enable the modern and generic finite element library deal.II to take advantage of the power of large clusters. In particular, we describe the construction of a distributed mesh and develop algorithms to fully parallelize the finite element calculation. Numerical results demonstrate good scalability. © 2010 Springer-Verlag.
Massively Parallel Finite Element Programming

KAUST Repository

Heister, Timo

2010-01-01

Today\\'s large finite element simulations require parallel algorithms to scale on clusters with thousands or tens of thousands of processor cores. We present data structures and algorithms to take advantage of the power of high performance computers in generic finite element codes. Existing generic finite element libraries often restrict the parallelization to parallel linear algebra routines. This is a limiting factor when solving on more than a few hundreds of cores. We describe routines for distributed storage of all major components coupled with efficient, scalable algorithms. We give an overview of our effort to enable the modern and generic finite element library deal.II to take advantage of the power of large clusters. In particular, we describe the construction of a distributed mesh and develop algorithms to fully parallelize the finite element calculation. Numerical results demonstrate good scalability. © 2010 Springer-Verlag.

HAlign-II: efficient ultra-large multiple sequence alignment and phylogenetic tree reconstruction with distributed and parallel computing.

Science.gov (United States)

Wan, Shixiang; Zou, Quan

2017-01-01

Multiple sequence alignment (MSA) plays a key role in biological sequence analyses, especially in phylogenetic tree construction. Extreme increase in next-generation sequencing results in shortage of efficient ultra-large biological sequence alignment approaches for coping with different sequence types. Distributed and parallel computing represents a crucial technique for accelerating ultra-large (e.g. files more than 1 GB) sequence analyses. Based on HAlign and Spark distributed computing system, we implement a highly cost-efficient and time-efficient HAlign-II tool to address ultra-large multiple biological sequence alignment and phylogenetic tree construction. The experiments in the DNA and protein large scale data sets, which are more than 1GB files, showed that HAlign II could save time and space. It outperformed the current software tools. HAlign-II can efficiently carry out MSA and construct phylogenetic trees with ultra-large numbers of biological sequences. HAlign-II shows extremely high memory efficiency and scales well with increases in computing resource. THAlign-II provides a user-friendly web server based on our distributed computing infrastructure. HAlign-II with open-source codes and datasets was established at http://lab.malab.cn/soft/halign.
Group Centric Networking: Large Scale Over the Air Testing of Group Centric Networking

Science.gov (United States)

2016-11-01

Large Scale Over-the-Air Testing of Group Centric Networking Logan Mercer, Greg Kuperman, Andrew Hunter, Brian Proulx MIT Lincoln Laboratory...performance of Group Centric Networking (GCN), a networking protocol developed for robust and scalable communications in lossy networks where users are...devices, and the ad-hoc nature of the network . Group Centric Networking (GCN) is a proposed networking protocol that addresses challenges specific to
Large-scale hydrology in Europe : observed patterns and model performance

Energy Technology Data Exchange (ETDEWEB)

Gudmundsson, Lukas

2011-06-15

In a changing climate, terrestrial water storages are of great interest as water availability impacts key aspects of ecosystem functioning. Thus, a better understanding of the variations of wet and dry periods will contribute to fully grasp processes of the earth system such as nutrient cycling and vegetation dynamics. Currently, river runoff from small, nearly natural, catchments is one of the few variables of the terrestrial water balance that is regularly monitored with detailed spatial and temporal coverage on large scales. River runoff, therefore, provides a foundation to approach European hydrology with respect to observed patterns on large scales, with regard to the ability of models to capture these.The analysis of observed river flow from small catchments, focused on the identification and description of spatial patterns of simultaneous temporal variations of runoff. These are dominated by large-scale variations of climatic variables but also altered by catchment processes. It was shown that time series of annual low, mean and high flows follow the same atmospheric drivers. The observation that high flows are more closely coupled to large scale atmospheric drivers than low flows, indicates the increasing influence of catchment properties on runoff under dry conditions. Further, it was shown that the low-frequency variability of European runoff is dominated by two opposing centres of simultaneous variations, such that dry years in the north are accompanied by wet years in the south.Large-scale hydrological models are simplified representations of our current perception of the terrestrial water balance on large scales. Quantification of the models strengths and weaknesses is the prerequisite for a reliable interpretation of simulation results. Model evaluations may also enable to detect shortcomings with model assumptions and thus enable a refinement of the current perception of hydrological systems. The ability of a multi model ensemble of nine large-scale
A new robust adaptive controller for vibration control of active engine mount subjected to large uncertainties

International Nuclear Information System (INIS)

Fakhari, Vahid; Choi, Seung-Bok; Cho, Chang-Hyun

2015-01-01

This work presents a new robust model reference adaptive control (MRAC) for vibration control caused from vehicle engine using an electromagnetic type of active engine mount. Vibration isolation performances of the active mount associated with the robust controller are evaluated in the presence of large uncertainties. As a first step, an active mount with linear solenoid actuator is prepared and its dynamic model is identified via experimental test. Subsequently, a new robust MRAC based on the gradient method with σ-modification is designed by selecting a proper reference model. In designing the robust adaptive control, structured (parametric) uncertainties in the stiffness of the passive part of the mount and in damping ratio of the active part of the mount are considered to investigate the robustness of the proposed controller. Experimental and simulation results are presented to evaluate performance focusing on the robustness behavior of the controller in the face of large uncertainties. The obtained results show that the proposed controller can sufficiently provide the robust vibration control performance even in the presence of large uncertainties showing an effective vibration isolation. (paper)
Solving very large scattering problems using a parallel PWTD-enhanced surface integral equation solver

KAUST Repository

Liu, Yang

2013-07-01

The computational complexity and memory requirements of multilevel plane wave time domain (PWTD)-accelerated marching-on-in-time (MOT)-based surface integral equation (SIE) solvers scale as O(NtNs(log 2)Ns) and O(Ns 1.5); here N t and Ns denote numbers of temporal and spatial basis functions discretizing the current [Shanker et al., IEEE Trans. Antennas Propag., 51, 628-641, 2003]. In the past, serial versions of these solvers have been successfully applied to the analysis of scattering from perfect electrically conducting as well as homogeneous penetrable targets involving up to Ns ≈ 0.5 × 106 and Nt ≈ 10 3. To solve larger problems, parallel PWTD-enhanced MOT solvers are called for. Even though a simple parallelization strategy was demonstrated in the context of electromagnetic compatibility analysis [M. Lu et al., in Proc. IEEE Int. Symp. AP-S, 4, 4212-4215, 2004], by and large, progress in this area has been slow. The lack of progress can be attributed wholesale to difficulties associated with the construction of a scalable PWTD kernel. © 2013 IEEE.
Large-scale data analytics

CERN Document Server

Gkoulalas-Divanis, Aris

2014-01-01

Provides cutting-edge research in large-scale data analytics from diverse scientific areas Surveys varied subject areas and reports on individual results of research in the field Shares many tips and insights into large-scale data analytics from authors and editors with long-term experience and specialization in the field
Three-dimensional all-speed CFD code for safety analysis of nuclear reactor containment: Status of GASFLOW parallelization, model development, validation and application

Energy Technology Data Exchange (ETDEWEB)

Xiao, Jianjun, E-mail: jianjun.xiao@kit.edu [Institute of Nuclear and Energy Technologies, Karlsruhe Institute of Technology, P.O. Box 3640, 76021 Karlsruhe (Germany); Travis, John R., E-mail: jack_travis@comcast.com [Engineering and Scientific Software Inc., 3010 Old Pecos Trail, Santa Fe, NM 87505 (United States); Royl, Peter, E-mail: peter.royl@partner.kit.edu [Institute of Nuclear and Energy Technologies, Karlsruhe Institute of Technology, P.O. Box 3640, 76021 Karlsruhe (Germany); Necker, Gottfried, E-mail: gottfried.necker@partner.kit.edu [Institute of Nuclear and Energy Technologies, Karlsruhe Institute of Technology, P.O. Box 3640, 76021 Karlsruhe (Germany); Svishchev, Anatoly, E-mail: anatoly.svishchev@kit.edu [Institute of Nuclear and Energy Technologies, Karlsruhe Institute of Technology, P.O. Box 3640, 76021 Karlsruhe (Germany); Jordan, Thomas, E-mail: thomas.jordan@kit.edu [Institute of Nuclear and Energy Technologies, Karlsruhe Institute of Technology, P.O. Box 3640, 76021 Karlsruhe (Germany)

2016-05-15

Highlights: • 3-D scalable semi-implicit pressure-based CFD code for containment safety analysis. • Robust solution algorithm valid for all-speed flows. • Well validated and widely used CFD code for hydrogen safety analysis. • Code applied in various types of nuclear reactor containments. • Parallelization enables high-fidelity models in large scale containment simulations. - Abstract: GASFLOW is a three dimensional semi-implicit all-speed CFD code which can be used to predict fluid dynamics, chemical kinetics, heat and mass transfer, aerosol transportation and other related phenomena involved in postulated accidents in nuclear reactor containments. The main purpose of the paper is to give a brief review on recent GASFLOW code development, validations and applications in the field of nuclear safety. GASFLOW code has been well validated by international experimental benchmarks, and has been widely applied to hydrogen safety analysis in various types of nuclear power plants in European and Asian countries, which have been summarized in this paper. Furthermore, four benchmark tests of a lid-driven cavity flow, low Mach number jet flow, 1-D shock tube and supersonic flow over a forward-facing step are presented in order to demonstrate the accuracy and wide-ranging capability of ICE’d ALE solution algorithm for all-speed flows. GASFLOW has been successfully parallelized using the paradigms of Message Passing Interface (MPI) and domain decomposition. The parallel version, GASFLOW-MPI, adds great value to large scale containment simulations by enabling high-fidelity models, including more geometric details and more complex physics. It will be helpful for the nuclear safety engineers to better understand the hydrogen safety related physical phenomena during the severe accident, to optimize the design of the hydrogen risk mitigation systems and to fulfill the licensing requirements by the nuclear regulatory authorities. GASFLOW-MPI is targeting a high
Towards Large-area Field-scale Operational Evapotranspiration for Water Use Mapping

Science.gov (United States)

Senay, G. B.; Friedrichs, M.; Morton, C.; Huntington, J. L.; Verdin, J.

2017-12-01

Field-scale evapotranspiration (ET) estimates are needed for improving surface and groundwater use and water budget studies. Ideally, field-scale ET estimates would be at regional to national levels and cover long time periods. As a result of large data storage and computational requirements associated with processing field-scale satellite imagery such as Landsat, numerous challenges remain to develop operational ET estimates over large areas for detailed water use and availability studies. However, the combination of new science, data availability, and cloud computing technology is enabling unprecedented capabilities for ET mapping. To demonstrate this capability, we used Google's Earth Engine cloud computing platform to create nationwide annual ET estimates with 30-meter resolution Landsat ( 16,000 images) and gridded weather data using the Operational Simplified Surface Energy Balance (SSEBop) model in support of the National Water Census, a USGS research program designed to build decision support capacity for water management agencies and other natural resource managers. By leveraging Google's Earth Engine Application Programming Interface (API) and developing software in a collaborative, open-platform environment, we rapidly advance from research towards applications for large-area field-scale ET mapping. Cloud computing of the Landsat image archive combined with other satellite, climate, and weather data, is creating never imagined opportunities for assessing ET model behavior and uncertainty, and ultimately providing the ability for more robust operational monitoring and assessment of water use at field-scales.
Robust object tracking combining color and scale invariant features

Science.gov (United States)

Zhang, Shengping; Yao, Hongxun; Gao, Peipei

2010-07-01

Object tracking plays a very important role in many computer vision applications. However its performance will significantly deteriorate due to some challenges in complex scene, such as pose and illumination changes, clustering background and so on. In this paper, we propose a robust object tracking algorithm which exploits both global color and local scale invariant (SIFT) features in a particle filter framework. Due to the expensive computation cost of SIFT features, the proposed tracker adopts a speed-up variation of SIFT, SURF, to extract local features. Specially, the proposed method first finds matching points between the target model and target candidate, than the weight of the corresponding particle based on scale invariant features is computed as the the proportion of matching points of that particle to matching points of all particles, finally the weight of the particle is obtained by combining weights of color and SURF features with a probabilistic way. The experimental results on a variety of challenging videos verify that the proposed method is robust to pose and illumination changes and is significantly superior to the standard particle filter tracker and the mean shift tracker.
Synchronization Techniques in Parallel Discrete Event Simulation

OpenAIRE

Lindén, Jonatan

2018-01-01

Discrete event simulation is an important tool for evaluating system models in many fields of science and engineering. To improve the performance of large-scale discrete event simulations, several techniques to parallelize discrete event simulation have been developed. In parallel discrete event simulation, the work of a single discrete event simulation is distributed over multiple processing elements. A key challenge in parallel discrete event simulation is to ensure that causally dependent ...
The multilevel fast multipole algorithm (MLFMA) for solving large-scale computational electromagnetics problems

CERN Document Server

Ergul, Ozgur

2014-01-01

The Multilevel Fast Multipole Algorithm (MLFMA) for Solving Large-Scale Computational Electromagnetic Problems provides a detailed and instructional overview of implementing MLFMA. The book: Presents a comprehensive treatment of the MLFMA algorithm, including basic linear algebra concepts, recent developments on the parallel computation, and a number of application examplesCovers solutions of electromagnetic problems involving dielectric objects and perfectly-conducting objectsDiscusses applications including scattering from airborne targets, scattering from red
Disinformative data in large-scale hydrological modelling

Directory of Open Access Journals (Sweden)

A. Kauffeldt

2013-07-01

Full Text Available Large-scale hydrological modelling has become an important tool for the study of global and regional water resources, climate impacts, and water-resources management. However, modelling efforts over large spatial domains are fraught with problems of data scarcity, uncertainties and inconsistencies between model forcing and evaluation data. Model-independent methods to screen and analyse data for such problems are needed. This study aimed at identifying data inconsistencies in global datasets using a pre-modelling analysis, inconsistencies that can be disinformative for subsequent modelling. The consistency between (i basin areas for different hydrographic datasets, and (ii between climate data (precipitation and potential evaporation and discharge data, was examined in terms of how well basin areas were represented in the flow networks and the possibility of water-balance closure. It was found that (i most basins could be well represented in both gridded basin delineations and polygon-based ones, but some basins exhibited large area discrepancies between flow-network datasets and archived basin areas, (ii basins exhibiting too-high runoff coefficients were abundant in areas where precipitation data were likely affected by snow undercatch, and (iii the occurrence of basins exhibiting losses exceeding the potential-evaporation limit was strongly dependent on the potential-evaporation data, both in terms of numbers and geographical distribution. Some inconsistencies may be resolved by considering sub-grid variability in climate data, surface-dependent potential-evaporation estimates, etc., but further studies are needed to determine the reasons for the inconsistencies found. Our results emphasise the need for pre-modelling data analysis to identify dataset inconsistencies as an important first step in any large-scale study. Applying data-screening methods before modelling should also increase our chances to draw robust conclusions from subsequent
A Model of Parallel Kinematics for Machine Calibration

DEFF Research Database (Denmark)

Pedersen, David Bue; Bæk Nielsen, Morten; Kløve Christensen, Simon

2016-01-01

Parallel kinematics have been adopted by more than 25 manufacturers of high-end desktop 3D printers [Wohlers Report (2015), p.118] as well as by research projects such as the WASP project [WASP (2015)], a 12 meter tall linear delta robot for Additive Manufacture of large-scale components for cons......Parallel kinematics have been adopted by more than 25 manufacturers of high-end desktop 3D printers [Wohlers Report (2015), p.118] as well as by research projects such as the WASP project [WASP (2015)], a 12 meter tall linear delta robot for Additive Manufacture of large-scale components...
Large Scale Simulations of the Euler Equations on GPU Clusters

KAUST Repository

Liebmann, Manfred

2010-08-01

The paper investigates the scalability of a parallel Euler solver, using the Vijayasundaram method, on a GPU cluster with 32 Nvidia Geforce GTX 295 boards. The aim of this research is to enable large scale fluid dynamics simulations with up to one billion elements. We investigate communication protocols for the GPU cluster to compensate for the slow Gigabit Ethernet network between the GPU compute nodes and to maintain overall efficiency. A diesel engine intake-port and a nozzle, meshed in different resolutions, give good real world examples for the scalability tests on the GPU cluster. © 2010 IEEE.
The Transition to Large-scale Cosmic Homogeneity in the WiggleZ Dark Energy Survey

Science.gov (United States)

Scrimgeour, Morag; Davis, T.; Blake, C.; James, B.; Poole, G. B.; Staveley-Smith, L.; Dark Energy Survey, WiggleZ

2013-01-01

The most fundamental assumption of the standard cosmological model (ΛCDM) is that the universe is homogeneous on large scales. This is clearly not true on small scales, where clusters and voids exist, and some studies seem to suggest that galaxies follow a fractal distribution up to very large scales 200 h-1 Mpc or more), whereas the ΛCDM model predicts transition to homogeneity at scales of ~100 h-1 Mpc. Any cosmological measurements made below the scale of homogeneity (such as the power spectrum) could be misleading, so it is crucial to measure the scale of homogeneity in the Universe. We have used the WiggleZ Dark Energy Survey to make the largest volume measurement to date of the transition to homogeneity in the galaxy distribution. WiggleZ is a UV-selected spectroscopic survey of ~200,000 luminous blue galaxies up to z=1, made with the Anglo-Australian Telescope. We have corrected for survey incompleteness using random catalogues that account for the various survey selection criteria, and tested the robustness of our results using a suite of fractal mock catalogues. The large volume and depth of WiggleZ allows us to probe the transition of the galaxy distribution to homogeneity on large scales and over several epochs, and see if this is consistent with a ΛCDM prediction.
[Parallel virtual reality visualization of extreme large medical datasets].

Science.gov (United States)

Tang, Min

2010-04-01

On the basis of a brief description of grid computing, the essence and critical techniques of parallel visualization of extreme large medical datasets are discussed in connection with Intranet and common-configuration computers of hospitals. In this paper are introduced several kernel techniques, including the hardware structure, software framework, load balance and virtual reality visualization. The Maximum Intensity Projection algorithm is realized in parallel using common PC cluster. In virtual reality world, three-dimensional models can be rotated, zoomed, translated and cut interactively and conveniently through the control panel built on virtual reality modeling language (VRML). Experimental results demonstrate that this method provides promising and real-time results for playing the role in of a good assistant in making clinical diagnosis.
Large-scale grid management

International Nuclear Information System (INIS)

Langdal, Bjoern Inge; Eggen, Arnt Ove

2003-01-01

The network companies in the Norwegian electricity industry now have to establish a large-scale network management, a concept essentially characterized by (1) broader focus (Broad Band, Multi Utility,...) and (2) bigger units with large networks and more customers. Research done by SINTEF Energy Research shows so far that the approaches within large-scale network management may be structured according to three main challenges: centralization, decentralization and out sourcing. The article is part of a planned series
Parallel computing works

Energy Technology Data Exchange (ETDEWEB)

1991-10-23

An account of the Caltech Concurrent Computation Program (C{sup 3}P), a five year project that focused on answering the question: Can parallel computers be used to do large-scale scientific computations '' As the title indicates, the question is answered in the affirmative, by implementing numerous scientific applications on real parallel computers and doing computations that produced new scientific results. In the process of doing so, C{sup 3}P helped design and build several new computers, designed and implemented basic system software, developed algorithms for frequently used mathematical computations on massively parallel machines, devised performance models and measured the performance of many computers, and created a high performance computing facility based exclusively on parallel computers. While the initial focus of C{sup 3}P was the hypercube architecture developed by C. Seitz, many of the methods developed and lessons learned have been applied successfully on other massively parallel architectures.
Research of the effectiveness of parallel multithreaded realizations of interpolation methods for scaling raster images

Science.gov (United States)

Vnukov, A. A.; Shershnev, M. B.

2018-01-01

The aim of this work is the software implementation of three image scaling algorithms using parallel computations, as well as the development of an application with a graphical user interface for the Windows operating system to demonstrate the operation of algorithms and to study the relationship between system performance, algorithm execution time and the degree of parallelization of computations. Three methods of interpolation were studied, formalized and adapted to scale images. The result of the work is a program for scaling images by different methods. Comparison of the quality of scaling by different methods is given.
Large-scale generation of human iPSC-derived neural stem cells/early neural progenitor cells and their neuronal differentiation.

Science.gov (United States)

D'Aiuto, Leonardo; Zhi, Yun; Kumar Das, Dhanjit; Wilcox, Madeleine R; Johnson, Jon W; McClain, Lora; MacDonald, Matthew L; Di Maio, Roberto; Schurdak, Mark E; Piazza, Paolo; Viggiano, Luigi; Sweet, Robert; Kinchington, Paul R; Bhattacharjee, Ayantika G; Yolken, Robert; Nimgaonka, Vishwajit L; Nimgaonkar, Vishwajit L

2014-01-01

Induced pluripotent stem cell (iPSC)-based technologies offer an unprecedented opportunity to perform high-throughput screening of novel drugs for neurological and neurodegenerative diseases. Such screenings require a robust and scalable method for generating large numbers of mature, differentiated neuronal cells. Currently available methods based on differentiation of embryoid bodies (EBs) or directed differentiation of adherent culture systems are either expensive or are not scalable. We developed a protocol for large-scale generation of neuronal stem cells (NSCs)/early neural progenitor cells (eNPCs) and their differentiation into neurons. Our scalable protocol allows robust and cost-effective generation of NSCs/eNPCs from iPSCs. Following culture in neurobasal medium supplemented with B27 and BDNF, NSCs/eNPCs differentiate predominantly into vesicular glutamate transporter 1 (VGLUT1) positive neurons. Targeted mass spectrometry analysis demonstrates that iPSC-derived neurons express ligand-gated channels and other synaptic proteins and whole-cell patch-clamp experiments indicate that these channels are functional. The robust and cost-effective differentiation protocol described here for large-scale generation of NSCs/eNPCs and their differentiation into neurons paves the way for automated high-throughput screening of drugs for neurological and neurodegenerative diseases.

Highly parallel machines and future of scientific computing

International Nuclear Information System (INIS)

Singh, G.S.

1992-01-01

Computing requirement of large scale scientific computing has always been ahead of what state of the art hardware could supply in the form of supercomputers of the day. And for any single processor system the limit to increase in the computing power was realized a few years back itself. Now with the advent of parallel computing systems the availability of machines with the required computing power seems a reality. In this paper the author tries to visualize the future large scale scientific computing in the penultimate decade of the present century. The author summarized trends in parallel computers and emphasize the need for a better programming environment and software tools for optimal performance. The author concludes this paper with critique on parallel architectures, software tools and algorithms. (author). 10 refs., 2 tabs
Data-driven process decomposition and robust online distributed modelling for large-scale processes

Science.gov (United States)

Shu, Zhang; Lijuan, Li; Lijuan, Yao; Shipin, Yang; Tao, Zou

2018-02-01

With the increasing attention of networked control, system decomposition and distributed models show significant importance in the implementation of model-based control strategy. In this paper, a data-driven system decomposition and online distributed subsystem modelling algorithm was proposed for large-scale chemical processes. The key controlled variables are first partitioned by affinity propagation clustering algorithm into several clusters. Each cluster can be regarded as a subsystem. Then the inputs of each subsystem are selected by offline canonical correlation analysis between all process variables and its controlled variables. Process decomposition is then realised after the screening of input and output variables. When the system decomposition is finished, the online subsystem modelling can be carried out by recursively block-wise renewing the samples. The proposed algorithm was applied in the Tennessee Eastman process and the validity was verified.
Large Scale Community Detection Using a Small World Model

Directory of Open Access Journals (Sweden)

Ranjan Kumar Behera

2017-11-01

Full Text Available In a social network, small or large communities within the network play a major role in deciding the functionalities of the network. Despite of diverse definitions, communities in the network may be defined as the group of nodes that are more densely connected as compared to nodes outside the group. Revealing such hidden communities is one of the challenging research problems. A real world social network follows small world phenomena, which indicates that any two social entities can be reachable in a small number of steps. In this paper, nodes are mapped into communities based on the random walk in the network. However, uncovering communities in large-scale networks is a challenging task due to its unprecedented growth in the size of social networks. A good number of community detection algorithms based on random walk exist in literature. In addition, when large-scale social networks are being considered, these algorithms are observed to take considerably longer time. In this work, with an objective to improve the efficiency of algorithms, parallel programming framework like Map-Reduce has been considered for uncovering the hidden communities in social network. The proposed approach has been compared with some standard existing community detection algorithms for both synthetic and real-world datasets in order to examine its performance, and it is observed that the proposed algorithm is more efficient than the existing ones.
Planning under uncertainty solving large-scale stochastic linear programs

Energy Technology Data Exchange (ETDEWEB)

Infanger, G. [Stanford Univ., CA (United States). Dept. of Operations Research]|[Technische Univ., Vienna (Austria). Inst. fuer Energiewirtschaft

1992-12-01

For many practical problems, solutions obtained from deterministic models are unsatisfactory because they fail to hedge against certain contingencies that may occur in the future. Stochastic models address this shortcoming, but up to recently seemed to be intractable due to their size. Recent advances both in solution algorithms and in computer technology now allow us to solve important and general classes of practical stochastic problems. We show how large-scale stochastic linear programs can be efficiently solved by combining classical decomposition and Monte Carlo (importance) sampling techniques. We discuss the methodology for solving two-stage stochastic linear programs with recourse, present numerical results of large problems with numerous stochastic parameters, show how to efficiently implement the methodology on a parallel multi-computer and derive the theory for solving a general class of multi-stage problems with dependency of the stochastic parameters within a stage and between different stages.
Ethics of large-scale change

OpenAIRE

Arler, Finn

2006-01-01

The subject of this paper is long-term large-scale changes in human society. Some very significant examples of large-scale change are presented: human population growth, human appropriation of land and primary production, the human use of fossil fuels, and climate change. The question is posed, which kind of attitude is appropriate when dealing with large-scale changes like these from an ethical point of view. Three kinds of approaches are discussed: Aldo Leopold's mountain thinking, th...
Massively parallel Monte Carlo. Experiences running nuclear simulations on a large condor cluster

International Nuclear Information System (INIS)

Tickner, James; O'Dwyer, Joel; Roach, Greg; Uher, Josef; Hitchen, Greg

2010-01-01

The trivially-parallel nature of Monte Carlo (MC) simulations make them ideally suited for running on a distributed, heterogeneous computing environment. We report on the setup and operation of a large, cycle-harvesting Condor computer cluster, used to run MC simulations of nuclear instruments ('jobs') on approximately 4,500 desktop PCs. Successful operation must balance the competing goals of maximizing the availability of machines for running jobs whilst minimizing the impact on users' PC performance. This requires classification of jobs according to anticipated run-time and priority and careful optimization of the parameters used to control job allocation to host machines. To maximize use of a large Condor cluster, we have created a powerful suite of tools to handle job submission and analysis, as the manual creation, submission and evaluation of large numbers (hundred to thousands) of jobs would be too arduous. We describe some of the key aspects of this suite, which has been interfaced to the well-known MCNP and EGSnrc nuclear codes and our in-house PHOTON optical MC code. We report on our practical experiences of operating our Condor cluster and present examples of several large-scale instrument design problems that have been solved using this tool. (author)
Towards a Database System for Large-scale Analytics on Strings

KAUST Repository

Sahli, Majed A.

2015-07-23

Recent technological advances are causing an explosion in the production of sequential data. Biological sequences, web logs and time series are represented as strings. Currently, strings are stored, managed and queried in an ad-hoc fashion because they lack a standardized data model and query language. String queries are computationally demanding, especially when strings are long and numerous. Existing approaches cannot handle the growing number of strings produced by environmental, healthcare, bioinformatic, and space applications. There is a trade- off between performing analytics efficiently and scaling to thousands of cores to finish in reasonable times. In this thesis, we introduce a data model that unifies the input and output representations of core string operations. We define a declarative query language for strings where operators can be pipelined to form complex queries. A rich set of core string operators is described to support string analytics. We then demonstrate a database system for string analytics based on our model and query language. In particular, we propose the use of a novel data structure augmented by efficient parallel computation to strike a balance between preprocessing overheads and query execution times. Next, we delve into repeated motifs extraction as a core string operation for large-scale string analytics. Motifs are frequent patterns used, for example, to identify biological functionality, periodic trends, or malicious activities. Statistical approaches are fast but inexact while combinatorial methods are sound but slow. We introduce ACME, a combinatorial repeated motifs extractor. We study the spatial and temporal locality of motif extraction and devise a cache-aware search space traversal technique. ACME is the only method that scales to gigabyte- long strings, handles large alphabets, and supports interesting motif types with minimal overhead. While ACME is cache-efficient, it is limited by being serial. We devise a lightweight
NonLinear Parallel OPtimization Tool, Phase II

Data.gov (United States)

National Aeronautics and Space Administration — The technological advancement proposed is a novel large-scale Noninear Parallel OPtimization Tool (NLPAROPT). This software package will eliminate the computational...
The linearly scaling 3D fragment method for large scale electronic structure calculations

Energy Technology Data Exchange (ETDEWEB)

Zhao Zhengji [National Energy Research Scientific Computing Center (NERSC) (United States); Meza, Juan; Shan Hongzhang; Strohmaier, Erich; Bailey, David; Wang Linwang [Computational Research Division, Lawrence Berkeley National Laboratory (United States); Lee, Byounghak, E-mail: ZZhao@lbl.go [Physics Department, Texas State University (United States)

2009-07-01

The linearly scaling three-dimensional fragment (LS3DF) method is an O(N) ab initio electronic structure method for large-scale nano material simulations. It is a divide-and-conquer approach with a novel patching scheme that effectively cancels out the artificial boundary effects, which exist in all divide-and-conquer schemes. This method has made ab initio simulations of thousand-atom nanosystems feasible in a couple of hours, while retaining essentially the same accuracy as the direct calculation methods. The LS3DF method won the 2008 ACM Gordon Bell Prize for algorithm innovation. Our code has reached 442 Tflop/s running on 147,456 processors on the Cray XT5 (Jaguar) at OLCF, and has been run on 163,840 processors on the Blue Gene/P (Intrepid) at ALCF, and has been applied to a system containing 36,000 atoms. In this paper, we will present the recent parallel performance results of this code, and will apply the method to asymmetric CdSe/CdS core/shell nanorods, which have potential applications in electronic devices and solar cells.
Design of a planar 3-DOF parallel micromanipulator

International Nuclear Information System (INIS)

Lee, Jeong Jae; Dong, Yanlu; Jeon, Yong Ho; Lee, Moon Gu

2013-01-01

A planar three degree-of-freedom (DOF) parallel manipulator is proposed to be applied for alignment during assembly of microcomponents. It adopts a PRR (prismatic-revolute-revolute) mechanism to meet the requirements of high precision for assembly and robustness against disturbance. The mechanism was designed to have a large workspace and good dexterity because parallel mechanisms usually have a narrow range and singularity of motion compared to serial mechanisms. Inverse kinematics and a simple closed-loop algorithm of the parallel manipulator are presented to control it. Experimental tests have been carried out with high-resolution capacitance sensors to verify the performance of the mechanism. The results of experiments show that the manipulator has a large workspace of ±1.0 mm, ±1.0 mm, and ±10 mrad in the X-, Y-, and θ-directions, respectively. This is a large workspace when considering it adopts a parallel mechanism and has a small size, 100 ´ 100 ´ 100 mm3 . It also has a good precision of 2 μm, 3 μm, and 0.2 mrad, in the X-, Y-, and θ- axes, respectively. These are high resolutions considering the manipulator adopts conventional joints. The manipulator is expected to have good dexterity.
An Efficient Parallel Multi-Scale Segmentation Method for Remote Sensing Imagery

Directory of Open Access Journals (Sweden)

Haiyan Gu

2018-04-01

Full Text Available Remote sensing (RS image segmentation is an essential step in geographic object-based image analysis (GEOBIA to ultimately derive “meaningful objects”. While many segmentation methods exist, most of them are not efficient for large data sets. Thus, the goal of this research is to develop an efficient parallel multi-scale segmentation method for RS imagery by combining graph theory and the fractal net evolution approach (FNEA. Specifically, a minimum spanning tree (MST algorithm in graph theory is proposed to be combined with a minimum heterogeneity rule (MHR algorithm that is used in FNEA. The MST algorithm is used for the initial segmentation while the MHR algorithm is used for object merging. An efficient implementation of the segmentation strategy is presented using data partition and the “reverse searching-forward processing” chain based on message passing interface (MPI parallel technology. Segmentation results of the proposed method using images from multiple sensors (airborne, SPECIM AISA EAGLE II, WorldView-2, RADARSAT-2 and different selected landscapes (residential/industrial, residential/agriculture covering four test sites indicated its efficiency in accuracy and speed. We conclude that the proposed method is applicable and efficient for the segmentation of a variety of RS imagery (airborne optical, satellite optical, SAR, high-spectral, while the accuracy is comparable with that of the FNEA method.
Large-scale visualization system for grid environment

International Nuclear Information System (INIS)

Suzuki, Yoshio

2007-01-01

Center for Computational Science and E-systems of Japan Atomic Energy Agency (CCSE/JAEA) has been conducting R and Ds of distributed computing (grid computing) environments: Seamless Thinking Aid (STA), Information Technology Based Laboratory (ITBL) and Atomic Energy Grid InfraStructure (AEGIS). In these R and Ds, we have developed the visualization technology suitable for the distributed computing environment. As one of the visualization tools, we have developed the Parallel Support Toolkit (PST) which can execute the visualization process parallely on a computer. Now, we improve PST to be executable simultaneously on multiple heterogeneous computers using Seamless Thinking Aid Message Passing Interface (STAMPI). STAMPI, we have developed in these R and Ds, is the MPI library executable on a heterogeneous computing environment. The improvement realizes the visualization of extremely large-scale data and enables more efficient visualization processes in a distributed computing environment. (author)
A Parallel Distributed-Memory Particle Method Enables Acquisition-Rate Segmentation of Large Fluorescence Microscopy Images.

Science.gov (United States)

Afshar, Yaser; Sbalzarini, Ivo F

2016-01-01

Modern fluorescence microscopy modalities, such as light-sheet microscopy, are capable of acquiring large three-dimensional images at high data rate. This creates a bottleneck in computational processing and analysis of the acquired images, as the rate of acquisition outpaces the speed of processing. Moreover, images can be so large that they do not fit the main memory of a single computer. We address both issues by developing a distributed parallel algorithm for segmentation of large fluorescence microscopy images. The method is based on the versatile Discrete Region Competition algorithm, which has previously proven useful in microscopy image segmentation. The present distributed implementation decomposes the input image into smaller sub-images that are distributed across multiple computers. Using network communication, the computers orchestrate the collectively solving of the global segmentation problem. This not only enables segmentation of large images (we test images of up to 10(10) pixels), but also accelerates segmentation to match the time scale of image acquisition. Such acquisition-rate image segmentation is a prerequisite for the smart microscopes of the future and enables online data compression and interactive experiments.
Structural Quality of Service in Large-Scale Networks

DEFF Research Database (Denmark)

Pedersen, Jens Myrup

, telephony and data. To meet the requirements of the different applications, and to handle the increased vulnerability to failures, the ability to design robust networks providing good Quality of Service is crucial. However, most planning of large-scale networks today is ad-hoc based, leading to highly...... complex networks lacking predictability and global structural properties. The thesis applies the concept of Structural Quality of Service to formulate desirable global properties, and it shows how regular graph structures can be used to obtain such properties.......Digitalization has created the base for co-existence and convergence in communications, leading to an increasing use of multi service networks. This is for example seen in the Fiber To The Home implementations, where a single fiber is used for virtually all means of communication, including TV...
A parallel nearly implicit time-stepping scheme

OpenAIRE

Botchev, Mike A.; van der Vorst, Henk A.

2001-01-01

Across-the-space parallelism still remains the most mature, convenient and natural way to parallelize large scale problems. One of the major problems here is that implicit time stepping is often difficult to parallelize due to the structure of the system. Approximate implicit schemes have been suggested to circumvent the problem. These schemes have attractive stability properties and they are also very well parallelizable. The purpose of this article is to give an overall assessment of the pa...
Robust hybrid name disambiguation framework for large databases

KAUST Repository

Zhu, Jia

2013-10-26

In many databases, science bibliography database for example, name attribute is the most commonly chosen identifier to identify entities. However, names are often ambiguous and not always unique which cause problems in many fields. Name disambiguation is a non-trivial task in data management that aims to properly distinguish different entities which share the same name, particularly for large databases like digital libraries, as only limited information can be used to identify authors\\' name. In digital libraries, ambiguous author names occur due to the existence of multiple authors with the same name or different name variations for the same person. Also known as name disambiguation, most of the previous works to solve this issue often employ hierarchical clustering approaches based on information inside the citation records, e.g. co-authors and publication titles. In this paper, we focus on proposing a robust hybrid name disambiguation framework that is not only applicable for digital libraries but also can be easily extended to other application based on different data sources. We propose a web pages genre identification component to identify the genre of a web page, e.g. whether the page is a personal homepage. In addition, we propose a re-clustering model based on multidimensional scaling that can further improve the performance of name disambiguation. We evaluated our approach on known corpora, and the favorable experiment results indicated that our proposed framework is feasible. © 2013 Akadémiai Kiadó, Budapest, Hungary.
Robust hybrid name disambiguation framework for large databases

KAUST Repository

Zhu, Jia; Yang, Yi; Xie, Qing; Wang, Liwei; Hassan, Saeed-Ul

2013-01-01

In many databases, science bibliography database for example, name attribute is the most commonly chosen identifier to identify entities. However, names are often ambiguous and not always unique which cause problems in many fields. Name disambiguation is a non-trivial task in data management that aims to properly distinguish different entities which share the same name, particularly for large databases like digital libraries, as only limited information can be used to identify authors' name. In digital libraries, ambiguous author names occur due to the existence of multiple authors with the same name or different name variations for the same person. Also known as name disambiguation, most of the previous works to solve this issue often employ hierarchical clustering approaches based on information inside the citation records, e.g. co-authors and publication titles. In this paper, we focus on proposing a robust hybrid name disambiguation framework that is not only applicable for digital libraries but also can be easily extended to other application based on different data sources. We propose a web pages genre identification component to identify the genre of a web page, e.g. whether the page is a personal homepage. In addition, we propose a re-clustering model based on multidimensional scaling that can further improve the performance of name disambiguation. We evaluated our approach on known corpora, and the favorable experiment results indicated that our proposed framework is feasible. © 2013 Akadémiai Kiadó, Budapest, Hungary.
Overview of the Force Scientific Parallel Language

Directory of Open Access Journals (Sweden)

Gita Alaghband

1994-01-01

Full Text Available The Force parallel programming language designed for large-scale shared-memory multiprocessors is presented. The language provides a number of parallel constructs as extensions to the ordinary Fortran language and is implemented as a two-level macro preprocessor to support portability across shared memory multiprocessors. The global parallelism model on which the Force is based provides a powerful parallel language. The parallel constructs, generic synchronization, and freedom from process management supported by the Force has resulted in structured parallel programs that are ported to the many multiprocessors on which the Force is implemented. Two new parallel constructs for looping and functional decomposition are discussed. Several programming examples to illustrate some parallel programming approaches using the Force are also presented.
Large-Scale Cubic-Scaling Random Phase Approximation Correlation Energy Calculations Using a Gaussian Basis.

Science.gov (United States)

Wilhelm, Jan; Seewald, Patrick; Del Ben, Mauro; Hutter, Jürg

2016-12-13

We present an algorithm for computing the correlation energy in the random phase approximation (RPA) in a Gaussian basis requiring [Formula: see text] operations and [Formula: see text] memory. The method is based on the resolution of the identity (RI) with the overlap metric, a reformulation of RI-RPA in the Gaussian basis, imaginary time, and imaginary frequency integration techniques, and the use of sparse linear algebra. Additional memory reduction without extra computations can be achieved by an iterative scheme that overcomes the memory bottleneck of canonical RPA implementations. We report a massively parallel implementation that is the key for the application to large systems. Finally, cubic-scaling RPA is applied to a thousand water molecules using a correlation-consistent triple-ζ quality basis.
Robustizing Circuit Optimization using Huber Functions

DEFF Research Database (Denmark)

Bandler, John W.; Biernacki, Radek M.; Chen, Steve H.

1993-01-01

The authors introduce a novel approach to 'robustizing' microwave circuit optimization using Huber functions, both two-sided and one-sided. They compare Huber optimization with l/sub 1/, l/sub 2/, and minimax methods in the presence of faults, large and small measurement errors, bad starting poin......, a preliminary optimization by selecting a small number of dominant variables. It is demonstrated, through multiplexer optimization, that the one-sided Huber function can be more effective and efficient than minimax in overcoming a bad starting point.......The authors introduce a novel approach to 'robustizing' microwave circuit optimization using Huber functions, both two-sided and one-sided. They compare Huber optimization with l/sub 1/, l/sub 2/, and minimax methods in the presence of faults, large and small measurement errors, bad starting points......, and statistical uncertainties. They demonstrate FET statistical modeling, multiplexer optimization, analog fault location, and data fitting. They extend the Huber concept by introducing a 'one-sided' Huber function for large-scale optimization. For large-scale problems, the designer often attempts, by intuition...

3D large-scale calculations using the method of characteristics

International Nuclear Information System (INIS)

Dahmani, M.; Roy, R.; Koclas, J.

2004-01-01

An overview of the computational requirements and the numerical developments made in order to be able to solve 3D large-scale problems using the characteristics method will be presented. To accelerate the MCI solver, efficient acceleration techniques were implemented and parallelization was performed. However, for the very large problems, the size of the tracking file used to store the tracks can still become prohibitive and exceed the capacity of the machine. The new 3D characteristics solver MCG will now be introduced. This methodology is dedicated to solve very large 3D problems (a part or a whole core) without spatial homogenization. In order to eliminate the input/output problems occurring when solving these large problems, we define a new computing scheme that requires more CPU resources than the usual one, based on sweeps over large tracking files. The huge capacity of storage needed in some problems and the related I/O queries needed by the characteristics solver are replaced by on-the-fly recalculation of tracks at each iteration step. Using this technique, large 3D problems are no longer I/O-bound, and distributed CPU resources can be efficiently used. (author)
A Proactive Complex Event Processing Method for Large-Scale Transportation Internet of Things

OpenAIRE

Wang, Yongheng; Cao, Kening

2014-01-01

The Internet of Things (IoT) provides a new way to improve the transportation system. The key issue is how to process the numerous events generated by IoT. In this paper, a proactive complex event processing method is proposed for large-scale transportation IoT. Based on a multilayered adaptive dynamic Bayesian model, a Bayesian network structure learning algorithm using search-and-score is proposed to support accurate predictive analytics. A parallel Markov decision processes model is design...
State-of-the-Art in GPU-Based Large-Scale Volume Visualization

KAUST Repository

Beyer, Johanna

2015-05-01

This survey gives an overview of the current state of the art in GPU techniques for interactive large-scale volume visualization. Modern techniques in this field have brought about a sea change in how interactive visualization and analysis of giga-, tera- and petabytes of volume data can be enabled on GPUs. In addition to combining the parallel processing power of GPUs with out-of-core methods and data streaming, a major enabler for interactivity is making both the computational and the visualization effort proportional to the amount and resolution of data that is actually visible on screen, i.e. \\'output-sensitive\\' algorithms and system designs. This leads to recent output-sensitive approaches that are \\'ray-guided\\', \\'visualization-driven\\' or \\'display-aware\\'. In this survey, we focus on these characteristics and propose a new categorization of GPU-based large-scale volume visualization techniques based on the notions of actual output-resolution visibility and the current working set of volume bricks-the current subset of data that is minimally required to produce an output image of the desired display resolution. Furthermore, we discuss the differences and similarities of different rendering and data traversal strategies in volume rendering by putting them into a common context-the notion of address translation. For our purposes here, we view parallel (distributed) visualization using clusters as an orthogonal set of techniques that we do not discuss in detail but that can be used in conjunction with what we present in this survey. © 2015 The Eurographics Association and John Wiley & Sons Ltd.
State-of-the-Art in GPU-Based Large-Scale Volume Visualization

KAUST Repository

Beyer, Johanna; Hadwiger, Markus; Pfister, Hanspeter

2015-01-01

This survey gives an overview of the current state of the art in GPU techniques for interactive large-scale volume visualization. Modern techniques in this field have brought about a sea change in how interactive visualization and analysis of giga-, tera- and petabytes of volume data can be enabled on GPUs. In addition to combining the parallel processing power of GPUs with out-of-core methods and data streaming, a major enabler for interactivity is making both the computational and the visualization effort proportional to the amount and resolution of data that is actually visible on screen, i.e. 'output-sensitive' algorithms and system designs. This leads to recent output-sensitive approaches that are 'ray-guided', 'visualization-driven' or 'display-aware'. In this survey, we focus on these characteristics and propose a new categorization of GPU-based large-scale volume visualization techniques based on the notions of actual output-resolution visibility and the current working set of volume bricks-the current subset of data that is minimally required to produce an output image of the desired display resolution. Furthermore, we discuss the differences and similarities of different rendering and data traversal strategies in volume rendering by putting them into a common context-the notion of address translation. For our purposes here, we view parallel (distributed) visualization using clusters as an orthogonal set of techniques that we do not discuss in detail but that can be used in conjunction with what we present in this survey. © 2015 The Eurographics Association and John Wiley & Sons Ltd.
Stability of Large Parallel Tunnels Excavated in Weak Rocks: A Case Study

Science.gov (United States)

Ding, Xiuli; Weng, Yonghong; Zhang, Yuting; Xu, Tangjin; Wang, Tuanle; Rao, Zhiwen; Qi, Zufang

2017-09-01

Diversion tunnels are important structures for hydropower projects but are always placed in locations with less favorable geological conditions than those in which other structures are placed. Because diversion tunnels are usually large and closely spaced, the rock pillar between adjacent tunnels in weak rocks is affected on both sides, and conventional support measures may not be adequate to achieve the required stability. Thus, appropriate reinforcement support measures are needed, and the design philosophy regarding large parallel tunnels in weak rocks should be updated. This paper reports a recent case in which two large parallel diversion tunnels are excavated. The rock masses are thin- to ultra-thin-layered strata coated with phyllitic films, which significantly decrease the soundness and strength of the strata and weaken the rocks. The behaviors of the surrounding rock masses under original (and conventional) support measures are detailed in terms of rock mass deformation, anchor bolt stress, and the extent of the excavation disturbed zone (EDZ), as obtained from safety monitoring and field testing. In situ observed phenomena and their interpretation are also included. The sidewall deformations exhibit significant time-dependent characteristics, and large magnitudes are recorded. The stresses in the anchor bolts are small, but the extents of the EDZs are large. The stability condition under the original support measures is evaluated as poor. To enhance rock mass stability, attempts are made to reinforce support design and improve safety monitoring programs. The main feature of these attempts is the use of prestressed cables that run through the rock pillar between the parallel tunnels. The efficacy of reinforcement support measures is verified by further safety monitoring data and field test results. Numerical analysis is constantly performed during the construction process to provide a useful reference for decision making. The calculated deformations are in
Robust Distributed Kalman Filter for Wireless Sensor Networks with Uncertain Communication Channels

Directory of Open Access Journals (Sweden)

Du Yong Kim

2012-01-01

Full Text Available We address a state estimation problem over a large-scale sensor network with uncertain communication channel. Consensus protocol is usually used to adapt a large-scale sensor network. However, when certain parts of communication channels are broken down, the accuracy performance is seriously degraded. Specifically, outliers in the channel or temporal disconnection are avoided via proposed method for the practical implementation of the distributed estimation over large-scale sensor networks. We handle this practical challenge by using adaptive channel status estimator and robust L1-norm Kalman filter in design of the processor of the individual sensor node. Then, they are incorporated into the consensus algorithm in order to achieve the robust distributed state estimation. The robust property of the proposed algorithm enables the sensor network to selectively weight sensors of normal conditions so that the filter can be practically useful.
Final Report: Large-Scale Optimization for Bayesian Inference in Complex Systems

Energy Technology Data Exchange (ETDEWEB)

Ghattas, Omar [The University of Texas at Austin

2013-10-15

The SAGUARO (Scalable Algorithms for Groundwater Uncertainty Analysis and Robust Optimiza- tion) Project focuses on the development of scalable numerical algorithms for large-scale Bayesian inversion in complex systems that capitalize on advances in large-scale simulation-based optimiza- tion and inversion methods. Our research is directed in three complementary areas: efficient approximations of the Hessian operator, reductions in complexity of forward simulations via stochastic spectral approximations and model reduction, and employing large-scale optimization concepts to accelerate sampling. Our efforts are integrated in the context of a challenging testbed problem that considers subsurface reacting flow and transport. The MIT component of the SAGUARO Project addresses the intractability of conventional sampling methods for large-scale statistical inverse problems by devising reduced-order models that are faithful to the full-order model over a wide range of parameter values; sampling then employs the reduced model rather than the full model, resulting in very large computational savings. Results indicate little effect on the computed posterior distribution. On the other hand, in the Texas-Georgia Tech component of the project, we retain the full-order model, but exploit inverse problem structure (adjoint-based gradients and partial Hessian information of the parameter-to- observation map) to implicitly extract lower dimensional information on the posterior distribution; this greatly speeds up sampling methods, so that fewer sampling points are needed. We can think of these two approaches as "reduce then sample" and "sample then reduce." In fact, these two approaches are complementary, and can be used in conjunction with each other. Moreover, they both exploit deterministic inverse problem structure, in the form of adjoint-based gradient and Hessian information of the underlying parameter-to-observation map, to achieve their speedups.
The Parallel System for Integrating Impact Models and Sectors (pSIMS)

Science.gov (United States)

Elliott, Joshua; Kelly, David; Chryssanthacopoulos, James; Glotter, Michael; Jhunjhnuwala, Kanika; Best, Neil; Wilde, Michael; Foster, Ian

2014-01-01

We present a framework for massively parallel climate impact simulations: the parallel System for Integrating Impact Models and Sectors (pSIMS). This framework comprises a) tools for ingesting and converting large amounts of data to a versatile datatype based on a common geospatial grid; b) tools for translating this datatype into custom formats for site-based models; c) a scalable parallel framework for performing large ensemble simulations, using any one of a number of different impacts models, on clusters, supercomputers, distributed grids, or clouds; d) tools and data standards for reformatting outputs to common datatypes for analysis and visualization; and e) methodologies for aggregating these datatypes to arbitrary spatial scales such as administrative and environmental demarcations. By automating many time-consuming and error-prone aspects of large-scale climate impacts studies, pSIMS accelerates computational research, encourages model intercomparison, and enhances reproducibility of simulation results. We present the pSIMS design and use example assessments to demonstrate its multi-model, multi-scale, and multi-sector versatility.
Adaptive Scaling of Cluster Boundaries for Large-Scale Social Media Data Clustering.

Science.gov (United States)

Meng, Lei; Tan, Ah-Hwee; Wunsch, Donald C

2016-12-01

The large scale and complex nature of social media data raises the need to scale clustering techniques to big data and make them capable of automatically identifying data clusters with few empirical settings. In this paper, we present our investigation and three algorithms based on the fuzzy adaptive resonance theory (Fuzzy ART) that have linear computational complexity, use a single parameter, i.e., the vigilance parameter to identify data clusters, and are robust to modest parameter settings. The contribution of this paper lies in two aspects. First, we theoretically demonstrate how complement coding, commonly known as a normalization method, changes the clustering mechanism of Fuzzy ART, and discover the vigilance region (VR) that essentially determines how a cluster in the Fuzzy ART system recognizes similar patterns in the feature space. The VR gives an intrinsic interpretation of the clustering mechanism and limitations of Fuzzy ART. Second, we introduce the idea of allowing different clusters in the Fuzzy ART system to have different vigilance levels in order to meet the diverse nature of the pattern distribution of social media data. To this end, we propose three vigilance adaptation methods, namely, the activation maximization (AM) rule, the confliction minimization (CM) rule, and the hybrid integration (HI) rule. With an initial vigilance value, the resulting clustering algorithms, namely, the AM-ART, CM-ART, and HI-ART, can automatically adapt the vigilance values of all clusters during the learning epochs in order to produce better cluster boundaries. Experiments on four social media data sets show that AM-ART, CM-ART, and HI-ART are more robust than Fuzzy ART to the initial vigilance value, and they usually achieve better or comparable performance and much faster speed than the state-of-the-art clustering algorithms that also do not require a predefined number of clusters.
Political consultation and large-scale research

International Nuclear Information System (INIS)

Bechmann, G.; Folkers, H.

1977-01-01

Large-scale research and policy consulting have an intermediary position between sociological sub-systems. While large-scale research coordinates science, policy, and production, policy consulting coordinates science, policy and political spheres. In this very position, large-scale research and policy consulting lack of institutional guarantees and rational back-ground guarantee which are characteristic for their sociological environment. This large-scale research can neither deal with the production of innovative goods under consideration of rentability, nor can it hope for full recognition by the basis-oriented scientific community. Policy consulting knows neither the competence assignment of the political system to make decisions nor can it judge succesfully by the critical standards of the established social science, at least as far as the present situation is concerned. This intermediary position of large-scale research and policy consulting has, in three points, a consequence supporting the thesis which states that this is a new form of institutionalization of science: These are: 1) external control, 2) the organization form, 3) the theoretical conception of large-scale research and policy consulting. (orig.) [de
Large-scale HTS bulks for magnetic application

International Nuclear Information System (INIS)

Werfel, Frank N.; Floegel-Delor, Uta; Riedel, Thomas; Goebel, Bernd; Rothfeld, Rolf; Schirrmeister, Peter; Wippich, Dieter

2013-01-01

Highlights: ► ATZ Company has constructed about 130 HTS magnet systems. ► Multi-seeded YBCO bulks joint the way for large-scale application. ► Levitation platforms demonstrate “superconductivity” to a great public audience (100 years anniversary). ► HTS magnetic bearings show forces up to 1 t. ► Modular HTS maglev vacuum cryostats are tested for train demonstrators in Brazil, China and Germany. -- Abstract: ATZ Company has constructed about 130 HTS magnet systems using high-Tc bulk magnets. A key feature in scaling-up is the fabrication of YBCO melts textured multi-seeded large bulks with three to eight seeds. Except of levitation, magnetization, trapped field and hysteresis, we review system engineering parameters of HTS magnetic linear and rotational bearings like compactness, cryogenics, power density, efficiency and robust construction. We examine mobile compact YBCO bulk magnet platforms cooled with LN 2 and Stirling cryo-cooler for demonstrator use. Compact cryostats for Maglev train operation contain 24 pieces of 3-seed bulks and can levitate 2500–3000 N at 10 mm above a permanent magnet (PM) track. The effective magnetic distance of the thermally insulated bulks is 2 mm only; the stored 2.5 l LN 2 allows more than 24 h operation without refilling. 34 HTS Maglev vacuum cryostats are manufactured tested and operate in Germany, China and Brazil. The magnetic levitation load to weight ratio is more than 15, and by group assembling the HTS cryostats under vehicles up to 5 t total loads levitated above a magnetic track is achieved
Large-scale HTS bulks for magnetic application

Energy Technology Data Exchange (ETDEWEB)

Werfel, Frank N., E-mail: werfel@t-online.de [Adelwitz Technologiezentrum GmbH (ATZ), Rittergut Adelwitz 16, 04886 Arzberg-Adelwitz (Germany); Floegel-Delor, Uta; Riedel, Thomas; Goebel, Bernd; Rothfeld, Rolf; Schirrmeister, Peter; Wippich, Dieter [Adelwitz Technologiezentrum GmbH (ATZ), Rittergut Adelwitz 16, 04886 Arzberg-Adelwitz (Germany)

2013-01-15

Highlights: ► ATZ Company has constructed about 130 HTS magnet systems. ► Multi-seeded YBCO bulks joint the way for large-scale application. ► Levitation platforms demonstrate “superconductivity” to a great public audience (100 years anniversary). ► HTS magnetic bearings show forces up to 1 t. ► Modular HTS maglev vacuum cryostats are tested for train demonstrators in Brazil, China and Germany. -- Abstract: ATZ Company has constructed about 130 HTS magnet systems using high-Tc bulk magnets. A key feature in scaling-up is the fabrication of YBCO melts textured multi-seeded large bulks with three to eight seeds. Except of levitation, magnetization, trapped field and hysteresis, we review system engineering parameters of HTS magnetic linear and rotational bearings like compactness, cryogenics, power density, efficiency and robust construction. We examine mobile compact YBCO bulk magnet platforms cooled with LN{sub 2} and Stirling cryo-cooler for demonstrator use. Compact cryostats for Maglev train operation contain 24 pieces of 3-seed bulks and can levitate 2500–3000 N at 10 mm above a permanent magnet (PM) track. The effective magnetic distance of the thermally insulated bulks is 2 mm only; the stored 2.5 l LN{sub 2} allows more than 24 h operation without refilling. 34 HTS Maglev vacuum cryostats are manufactured tested and operate in Germany, China and Brazil. The magnetic levitation load to weight ratio is more than 15, and by group assembling the HTS cryostats under vehicles up to 5 t total loads levitated above a magnetic track is achieved.
Large-scale multimedia modeling applications

International Nuclear Information System (INIS)

Droppo, J.G. Jr.; Buck, J.W.; Whelan, G.; Strenge, D.L.; Castleton, K.J.; Gelston, G.M.

1995-08-01

Over the past decade, the US Department of Energy (DOE) and other agencies have faced increasing scrutiny for a wide range of environmental issues related to past and current practices. A number of large-scale applications have been undertaken that required analysis of large numbers of potential environmental issues over a wide range of environmental conditions and contaminants. Several of these applications, referred to here as large-scale applications, have addressed long-term public health risks using a holistic approach for assessing impacts from potential waterborne and airborne transport pathways. Multimedia models such as the Multimedia Environmental Pollutant Assessment System (MEPAS) were designed for use in such applications. MEPAS integrates radioactive and hazardous contaminants impact computations for major exposure routes via air, surface water, ground water, and overland flow transport. A number of large-scale applications of MEPAS have been conducted to assess various endpoints for environmental and human health impacts. These applications are described in terms of lessons learned in the development of an effective approach for large-scale applications
Kalman Filter Tracking on Parallel Architectures

International Nuclear Information System (INIS)

Cerati, Giuseppe; Elmer, Peter; Krutelyov, Slava; Lantz, Steven; Lefebvre, Matthieu; McDermott, Kevin; Riley, Daniel; Tadel, Matevž; Wittich, Peter; Würthwein, Frank; Yagil, Avi

2016-01-01

Power density constraints are limiting the performance improvements of modern CPUs. To address this we have seen the introduction of lower-power, multi-core processors such as GPGPU, ARM and Intel MIC. In order to achieve the theoretical performance gains of these processors, it will be necessary to parallelize algorithms to exploit larger numbers of lightweight cores and specialized functions like large vector units. Track finding and fitting is one of the most computationally challenging problems for event reconstruction in particle physics. At the High-Luminosity Large Hadron Collider (HL-LHC), for example, this will be by far the dominant problem. The need for greater parallelism has driven investigations of very different track finding techniques such as Cellular Automata or Hough Transforms. The most common track finding techniques in use today, however, are those based on a Kalman filter approach. Significant experience has been accumulated with these techniques on real tracking detector systems, both in the trigger and offline. They are known to provide high physics performance, are robust, and are in use today at the LHC. Given the utility of the Kalman filter in track finding, we have begun to port these algorithms to parallel architectures, namely Intel Xeon and Xeon Phi. We report here on our progress towards an end-to-end track reconstruction algorithm fully exploiting vectorization and parallelization techniques in a simplified experimental environment
H∞ Robust Control of a Large-Piston MEMS Micromirror for Compact Fourier Transform Spectrometer Systems

Directory of Open Access Journals (Sweden)

Huipeng Chen

2018-02-01

Full Text Available Incorporating linear-scanning micro-electro-mechanical systems (MEMS micromirrors into Fourier transform spectral acquisition systems can greatly reduce the size of the spectrometer equipment, making portable Fourier transform spectrometers (FTS possible. How to minimize the tilting of the MEMS mirror plate during its large linear scan is a major problem in this application. In this work, an FTS system has been constructed based on a biaxial MEMS micromirror with a large-piston displacement of 180 μm, and a biaxial H∞ robust controller is designed. Compared with open-loop control and proportional-integral-derivative (PID closed-loop control, H∞ robust control has good stability and robustness. The experimental results show that the stable scanning displacement reaches 110.9 μm under the H∞ robust control, and the tilting angle of the MEMS mirror plate in that full scanning range falls within ±0.0014°. Without control, the FTS system cannot generate meaningful spectra. In contrast, the FTS yields a clean spectrum with a full width at half maximum (FWHM spectral linewidth of 96 cm−1 under the H∞ robust control. Moreover, the FTS system can maintain good stability and robustness under various driving conditions.
H∞ Robust Control of a Large-Piston MEMS Micromirror for Compact Fourier Transform Spectrometer Systems.

Science.gov (United States)

Chen, Huipeng; Li, Mengyuan; Zhang, Yi; Xie, Huikai; Chen, Chang; Peng, Zhangming; Su, Shaohui

2018-02-08

Incorporating linear-scanning micro-electro-mechanical systems (MEMS) micromirrors into Fourier transform spectral acquisition systems can greatly reduce the size of the spectrometer equipment, making portable Fourier transform spectrometers (FTS) possible. How to minimize the tilting of the MEMS mirror plate during its large linear scan is a major problem in this application. In this work, an FTS system has been constructed based on a biaxial MEMS micromirror with a large-piston displacement of 180 μm, and a biaxial H∞ robust controller is designed. Compared with open-loop control and proportional-integral-derivative (PID) closed-loop control, H∞ robust control has good stability and robustness. The experimental results show that the stable scanning displacement reaches 110.9 μm under the H∞ robust control, and the tilting angle of the MEMS mirror plate in that full scanning range falls within ±0.0014°. Without control, the FTS system cannot generate meaningful spectra. In contrast, the FTS yields a clean spectrum with a full width at half maximum (FWHM) spectral linewidth of 96 cm -1 under the H∞ robust control. Moreover, the FTS system can maintain good stability and robustness under various driving conditions.
Scale-adaptive Local Patches for Robust Visual Object Tracking

Directory of Open Access Journals (Sweden)

Kang Sun

2014-04-01

Full Text Available This paper discusses the problem of robustly tracking objects which undergo rapid and dramatic scale changes. To remove the weakness of global appearance models, we present a novel scheme that combines object’s global and local appearance features. The local feature is a set of local patches that geometrically constrain the changes in the target’s appearance. In order to adapt to the object’s geometric deformation, the local patches could be removed and added online. The addition of these patches is constrained by the global features such as color, texture and motion. The global visual features are updated via the stable local patches during tracking. To deal with scale changes, we adapt the scale of patches in addition to adapting the object bound box. We evaluate our method by comparing it to several state-of-the-art trackers on publicly available datasets. The experimental results on challenging sequences confirm that, by using this scale-adaptive local patches and global properties, our tracker outperforms the related trackers in many cases by having smaller failure rate as well as better accuracy.
Toward robust nanogenerators using aluminum substrate

Energy Technology Data Exchange (ETDEWEB)

Lee, Sangmin; Xu, Chen; Lee, Minbaek; Lin, Long [School of Material Science and Engineering, Georgia Institute of Technology, Atlanta, Georgia (United States); Hong, Jung-Il [School of Material Science and Engineering, Georgia Institute of Technology, Atlanta, Georgia (United States); Department of Emerging Materials Science, Daegu Gyeongbuk Institute of Science and Technology, Daegu (Korea, Republic of); Kim, Dongseob; Hwang, Woonbong [Department of Mechanical Engineering, Pohang University of Science and Technology, San 31, Hyoja, Namgu, Pohang, Gyungbuk (Korea, Republic of); Wang, Zhong Lin [School of Material Science and Engineering, Georgia Institute of Technology, Atlanta, Georgia (United States); Beijing Institue of Nanoenergy and Nanosystems, Chinese Academy of Sciences, Beijing (China)

2012-08-22

Nanogenerators (NG) have been developed to harvest mechanical energy from environmental sources such as vibration, human motion, or movement of automobiles. We demonstrate a robust and large-area NG based on a cost-effective Al substrate with the capability to be easily integrated in series and parallel for high-output performance. The output voltage and current density of the three-dimensionally integrated NG device reaches up to 3 V and 195 nA under human walking conditions. (Copyright copyright 2012 WILEY-VCH Verlag GmbH and Co. KGaA, Weinheim)
Decentralized Large-Scale Power Balancing

DEFF Research Database (Denmark)

Halvgaard, Rasmus; Jørgensen, John Bagterp; Poulsen, Niels Kjølstad

2013-01-01

problem is formulated as a centralized large-scale optimization problem but is then decomposed into smaller subproblems that are solved locally by each unit connected to an aggregator. For large-scale systems the method is faster than solving the full problem and can be distributed to include an arbitrary...
Large Scale Earth's Bow Shock with Northern IMF as Simulated by PIC Code in Parallel with MHD Model

Science.gov (United States)

Baraka, Suleiman

2016-06-01

In this paper, we propose a 3D kinetic model (particle-in-cell, PIC) for the description of the large scale Earth's bow shock. The proposed version is stable and does not require huge or extensive computer resources. Because PIC simulations work with scaled plasma and field parameters, we also propose to validate our code by comparing its results with the available MHD simulations under same scaled solar wind (SW) and (IMF) conditions. We report new results from the two models. In both codes the Earth's bow shock position is found to be ≈14.8 R E along the Sun-Earth line, and ≈29 R E on the dusk side. Those findings are consistent with past in situ observations. Both simulations reproduce the theoretical jump conditions at the shock. However, the PIC code density and temperature distributions are inflated and slightly shifted sunward when compared to the MHD results. Kinetic electron motions and reflected ions upstream may cause this sunward shift. Species distributions in the foreshock region are depicted within the transition of the shock (measured ≈2 c/ ω pi for Θ Bn = 90° and M MS = 4.7) and in the downstream. The size of the foot jump in the magnetic field at the shock is measured to be (1.7 c/ ω pi ). In the foreshocked region, the thermal velocity is found equal to 213 km s-1 at 15 R E and is equal to 63 km s -1 at 12 R E (magnetosheath region). Despite the large cell size of the current version of the PIC code, it is powerful to retain macrostructure of planets magnetospheres in very short time, thus it can be used for pedagogical test purposes. It is also likely complementary with MHD to deepen our understanding of the large scale magnetosphere.

Zebrafish whole-adult-organism chemogenomics for large-scale predictive and discovery chemical biology.

Directory of Open Access Journals (Sweden)

Siew Hong Lam

2008-07-01

Full Text Available The ability to perform large-scale, expression-based chemogenomics on whole adult organisms, as in invertebrate models (worm and fly, is highly desirable for a vertebrate model but its feasibility and potential has not been demonstrated. We performed expression-based chemogenomics on the whole adult organism of a vertebrate model, the zebrafish, and demonstrated its potential for large-scale predictive and discovery chemical biology. Focusing on two classes of compounds with wide implications to human health, polycyclic (halogenated aromatic hydrocarbons [P(HAHs] and estrogenic compounds (ECs, we generated robust prediction models that can discriminate compounds of the same class from those of different classes in two large independent experiments. The robust expression signatures led to the identification of biomarkers for potent aryl hydrocarbon receptor (AHR and estrogen receptor (ER agonists, respectively, and were validated in multiple targeted tissues. Knowledge-based data mining of human homologs of zebrafish genes revealed highly conserved chemical-induced biological responses/effects, health risks, and novel biological insights associated with AHR and ER that could be inferred to humans. Thus, our study presents an effective, high-throughput strategy of capturing molecular snapshots of chemical-induced biological states of a whole adult vertebrate that provides information on biomarkers of effects, deregulated signaling pathways, and possible affected biological functions, perturbed physiological systems, and increased health risks. These findings place zebrafish in a strategic position to bridge the wide gap between cell-based and rodent models in chemogenomics research and applications, especially in preclinical drug discovery and toxicology.
Robust scaling laws for energy confinement time, including radiated fraction, in Tokamaks

Science.gov (United States)

Murari, A.; Peluso, E.; Gaudio, P.; Gelfusa, M.

2017-12-01

In recent years, the limitations of scalings in power-law form that are obtained from traditional log regression have become increasingly evident in many fields of research. Given the wide gap in operational space between present-day and next-generation devices, robustness of the obtained models in guaranteeing reasonable extrapolability is a major issue. In this paper, a new technique, called symbolic regression, is reviewed, refined, and applied to the ITPA database for extracting scaling laws of the energy-confinement time at different radiated fraction levels. The main advantage of this new methodology is its ability to determine the most appropriate mathematical form of the scaling laws to model the available databases without the restriction of their having to be power laws. In a completely new development, this technique is combined with the concept of geodesic distance on Gaussian manifolds so as to take into account the error bars in the measurements and provide more reliable models. Robust scaling laws, including radiated fractions as regressor, have been found; they are not in power-law form, and are significantly better than the traditional scalings. These scaling laws, including radiated fractions, extrapolate quite differently to ITER, and therefore they require serious consideration. On the other hand, given the limitations of the existing databases, dedicated experimental investigations will have to be carried out to fully understand the impact of radiated fractions on the confinement in metallic machines and in the next generation of devices.
Automating large-scale reactor systems

International Nuclear Information System (INIS)

Kisner, R.A.

1985-01-01

This paper conveys a philosophy for developing automated large-scale control systems that behave in an integrated, intelligent, flexible manner. Methods for operating large-scale systems under varying degrees of equipment degradation are discussed, and a design approach that separates the effort into phases is suggested. 5 refs., 1 fig
Parallel analysis tools and new visualization techniques for ultra-large climate data set

Energy Technology Data Exchange (ETDEWEB)

Middleton, Don [National Center for Atmospheric Research, Boulder, CO (United States); Haley, Mary [National Center for Atmospheric Research, Boulder, CO (United States)

2014-12-10

ParVis was a project funded under LAB 10-05: “Earth System Modeling: Advanced Scientific Visualization of Ultra-Large Climate Data Sets”. Argonne was the lead lab with partners at PNNL, SNL, NCAR and UC-Davis. This report covers progress from January 1st, 2013 through Dec 1st, 2014. Two previous reports covered the period from Summer, 2010, through September 2011 and October 2011 through December 2012, respectively. While the project was originally planned to end on April 30, 2013, personnel and priority changes allowed many of the institutions to continue work through FY14 using existing funds. A primary focus of ParVis was introducing parallelism to climate model analysis to greatly reduce the time-to-visualization for ultra-large climate data sets. Work in the first two years was conducted on two tracks with different time horizons: one track to provide immediate help to climate scientists already struggling to apply their analysis to existing large data sets and another focused on building a new data-parallel library and tool for climate analysis and visualization that will give the field a platform for performing analysis and visualization on ultra-large datasets for the foreseeable future. In the final 2 years of the project, we focused mostly on the new data-parallel library and associated tools for climate analysis and visualization.
A Parallel Distributed-Memory Particle Method Enables Acquisition-Rate Segmentation of Large Fluorescence Microscopy Images

Science.gov (United States)

Afshar, Yaser; Sbalzarini, Ivo F.

2016-01-01

Modern fluorescence microscopy modalities, such as light-sheet microscopy, are capable of acquiring large three-dimensional images at high data rate. This creates a bottleneck in computational processing and analysis of the acquired images, as the rate of acquisition outpaces the speed of processing. Moreover, images can be so large that they do not fit the main memory of a single computer. We address both issues by developing a distributed parallel algorithm for segmentation of large fluorescence microscopy images. The method is based on the versatile Discrete Region Competition algorithm, which has previously proven useful in microscopy image segmentation. The present distributed implementation decomposes the input image into smaller sub-images that are distributed across multiple computers. Using network communication, the computers orchestrate the collectively solving of the global segmentation problem. This not only enables segmentation of large images (we test images of up to 1010 pixels), but also accelerates segmentation to match the time scale of image acquisition. Such acquisition-rate image segmentation is a prerequisite for the smart microscopes of the future and enables online data compression and interactive experiments. PMID:27046144
A Parallel Distributed-Memory Particle Method Enables Acquisition-Rate Segmentation of Large Fluorescence Microscopy Images.

Directory of Open Access Journals (Sweden)

Yaser Afshar

Full Text Available Modern fluorescence microscopy modalities, such as light-sheet microscopy, are capable of acquiring large three-dimensional images at high data rate. This creates a bottleneck in computational processing and analysis of the acquired images, as the rate of acquisition outpaces the speed of processing. Moreover, images can be so large that they do not fit the main memory of a single computer. We address both issues by developing a distributed parallel algorithm for segmentation of large fluorescence microscopy images. The method is based on the versatile Discrete Region Competition algorithm, which has previously proven useful in microscopy image segmentation. The present distributed implementation decomposes the input image into smaller sub-images that are distributed across multiple computers. Using network communication, the computers orchestrate the collectively solving of the global segmentation problem. This not only enables segmentation of large images (we test images of up to 10(10 pixels, but also accelerates segmentation to match the time scale of image acquisition. Such acquisition-rate image segmentation is a prerequisite for the smart microscopes of the future and enables online data compression and interactive experiments.
Synchronization Of Parallel Discrete Event Simulations

Science.gov (United States)

Steinman, Jeffrey S.

1992-01-01

Adaptive, parallel, discrete-event-simulation-synchronization algorithm, Breathing Time Buckets, developed in Synchronous Parallel Environment for Emulation and Discrete Event Simulation (SPEEDES) operating system. Algorithm allows parallel simulations to process events optimistically in fluctuating time cycles that naturally adapt while simulation in progress. Combines best of optimistic and conservative synchronization strategies while avoiding major disadvantages. Algorithm processes events optimistically in time cycles adapting while simulation in progress. Well suited for modeling communication networks, for large-scale war games, for simulated flights of aircraft, for simulations of computer equipment, for mathematical modeling, for interactive engineering simulations, and for depictions of flows of information.
Massively Parallel Computing: A Sandia Perspective

Energy Technology Data Exchange (ETDEWEB)

Dosanjh, Sudip S.; Greenberg, David S.; Hendrickson, Bruce; Heroux, Michael A.; Plimpton, Steve J.; Tomkins, James L.; Womble, David E.

1999-05-06

The computing power available to scientists and engineers has increased dramatically in the past decade, due in part to progress in making massively parallel computing practical and available. The expectation for these machines has been great. The reality is that progress has been slower than expected. Nevertheless, massively parallel computing is beginning to realize its potential for enabling significant break-throughs in science and engineering. This paper provides a perspective on the state of the field, colored by the authors' experiences using large scale parallel machines at Sandia National Laboratories. We address trends in hardware, system software and algorithms, and we also offer our view of the forces shaping the parallel computing industry.
Implicit solvers for large-scale nonlinear problems

International Nuclear Information System (INIS)

Keyes, David E; Reynolds, Daniel R; Woodward, Carol S

2006-01-01

Computational scientists are grappling with increasingly complex, multi-rate applications that couple such physical phenomena as fluid dynamics, electromagnetics, radiation transport, chemical and nuclear reactions, and wave and material propagation in inhomogeneous media. Parallel computers with large storage capacities are paving the way for high-resolution simulations of coupled problems; however, hardware improvements alone will not prove enough to enable simulations based on brute-force algorithmic approaches. To accurately capture nonlinear couplings between dynamically relevant phenomena, often while stepping over rapid adjustments to quasi-equilibria, simulation scientists are increasingly turning to implicit formulations that require a discrete nonlinear system to be solved for each time step or steady state solution. Recent advances in iterative methods have made fully implicit formulations a viable option for solution of these large-scale problems. In this paper, we overview one of the most effective iterative methods, Newton-Krylov, for nonlinear systems and point to software packages with its implementation. We illustrate the method with an example from magnetically confined plasma fusion and briefly survey other areas in which implicit methods have bestowed important advantages, such as allowing high-order temporal integration and providing a pathway to sensitivity analyses and optimization. Lastly, we overview algorithm extensions under development motivated by current SciDAC applications
Efficient parallel implicit methods for rotary-wing aerodynamics calculations

Science.gov (United States)

Wissink, Andrew M.

Euler/Navier-Stokes Computational Fluid Dynamics (CFD) methods are commonly used for prediction of the aerodynamics and aeroacoustics of modern rotary-wing aircraft. However, their widespread application to large complex problems is limited lack of adequate computing power. Parallel processing offers the potential for dramatic increases in computing power, but most conventional implicit solution methods are inefficient in parallel and new techniques must be adopted to realize its potential. This work proposes alternative implicit schemes for Euler/Navier-Stokes rotary-wing calculations which are robust and efficient in parallel. The first part of this work proposes an efficient parallelizable modification of the Lower Upper-Symmetric Gauss Seidel (LU-SGS) implicit operator used in the well-known Transonic Unsteady Rotor Navier Stokes (TURNS) code. The new hybrid LU-SGS scheme couples a point-relaxation approach of the Data Parallel-Lower Upper Relaxation (DP-LUR) algorithm for inter-processor communication with the Symmetric Gauss Seidel algorithm of LU-SGS for on-processor computations. With the modified operator, TURNS is implemented in parallel using Message Passing Interface (MPI) for communication. Numerical performance and parallel efficiency are evaluated on the IBM SP2 and Thinking Machines CM-5 multi-processors for a variety of steady-state and unsteady test cases. The hybrid LU-SGS scheme maintains the numerical performance of the original LU-SGS algorithm in all cases and shows a good degree of parallel efficiency. It experiences a higher degree of robustness than DP-LUR for third-order upwind solutions. The second part of this work examines use of Krylov subspace iterative solvers for the nonlinear CFD solutions. The hybrid LU-SGS scheme is used as a parallelizable preconditioner. Two iterative methods are tested, Generalized Minimum Residual (GMRES) and Orthogonal s-Step Generalized Conjugate Residual (OSGCR). The Newton method demonstrates good
Real-Time Large Scale 3d Reconstruction by Fusing Kinect and Imu Data

Science.gov (United States)

Huai, J.; Zhang, Y.; Yilmaz, A.

2015-08-01

Kinect-style RGB-D cameras have been used to build large scale dense 3D maps for indoor environments. These maps can serve many purposes such as robot navigation, and augmented reality. However, to generate dense 3D maps of large scale environments is still very challenging. In this paper, we present a mapping system for 3D reconstruction that fuses measurements from a Kinect and an inertial measurement unit (IMU) to estimate motion. Our major achievements include: (i) Large scale consistent 3D reconstruction is realized by volume shifting and loop closure; (ii) The coarse-to-fine iterative closest point (ICP) algorithm, the SIFT odometry, and IMU odometry are combined to robustly and precisely estimate pose. In particular, ICP runs routinely to track the Kinect motion. If ICP fails in planar areas, the SIFT odometry provides incremental motion estimate. If both ICP and the SIFT odometry fail, e.g., upon abrupt motion or inadequate features, the incremental motion is estimated by the IMU. Additionally, the IMU also observes the roll and pitch angles which can reduce long-term drift of the sensor assembly. In experiments on a consumer laptop, our system estimates motion at 8Hz on average while integrating color images to the local map and saving volumes of meshes concurrently. Moreover, it is immune to tracking failures, and has smaller drift than the state-of-the-art systems in large scale reconstruction.
Challenges in Managing Trustworthy Large-scale Digital Science

Science.gov (United States)

Evans, B. J. K.

2017-12-01

The increased use of large-scale international digital science has opened a number of challenges for managing, handling, using and preserving scientific information. The large volumes of information are driven by three main categories - model outputs including coupled models and ensembles, data products that have been processing to a level of usability, and increasingly heuristically driven data analysis. These data products are increasingly the ones that are usable by the broad communities, and far in excess of the raw instruments data outputs. The data, software and workflows are then shared and replicated to allow broad use at an international scale, which places further demands of infrastructure to support how the information is managed reliably across distributed resources. Users necessarily rely on these underlying "black boxes" so that they are productive to produce new scientific outcomes. The software for these systems depend on computational infrastructure, software interconnected systems, and information capture systems. This ranges from the fundamentals of the reliability of the compute hardware, system software stacks and libraries, and the model software. Due to these complexities and capacity of the infrastructure, there is an increased emphasis of transparency of the approach and robustness of the methods over the full reproducibility. Furthermore, with large volume data management, it is increasingly difficult to store the historical versions of all model and derived data. Instead, the emphasis is on the ability to access the updated products and the reliability by which both previous outcomes are still relevant and can be updated for the new information. We will discuss these challenges and some of the approaches underway that are being used to address these issues.
SWAP-Assembler 2: Optimization of De Novo Genome Assembler at Large Scale

Energy Technology Data Exchange (ETDEWEB)

Meng, Jintao; Seo, Sangmin; Balaji, Pavan; Wei, Yanjie; Wang, Bingqiang; Feng, Shengzhong

2016-08-16

In this paper, we analyze and optimize the most time-consuming steps of the SWAP-Assembler, a parallel genome assembler, so that it can scale to a large number of cores for huge genomes with the size of sequencing data ranging from terabyes to petabytes. According to the performance analysis results, the most time-consuming steps are input parallelization, k-mer graph construction, and graph simplification (edge merging). For the input parallelization, the input data is divided into virtual fragments with nearly equal size, and the start position and end position of each fragment are automatically separated at the beginning of the reads. In k-mer graph construction, in order to improve the communication efficiency, the message size is kept constant between any two processes by proportionally increasing the number of nucleotides to the number of processes in the input parallelization step for each round. The memory usage is also decreased because only a small part of the input data is processed in each round. With graph simplification, the communication protocol reduces the number of communication loops from four to two loops and decreases the idle communication time. The optimized assembler is denoted as SWAP-Assembler 2 (SWAP2). In our experiments using a 1000 Genomes project dataset of 4 terabytes (the largest dataset ever used for assembling) on the supercomputer Mira, the results show that SWAP2 scales to 131,072 cores with an efficiency of 40%. We also compared our work with both the HipMER assembler and the SWAP-Assembler. On the Yanhuang dataset of 300 gigabytes, SWAP2 shows a 3X speedup and 4X better scalability compared with the HipMer assembler and is 45 times faster than the SWAP-Assembler. The SWAP2 software is available at https://sourceforge.net/projects/swapassembler.
An efficient heuristic versus a robust hybrid meta-heuristic for general framework of serial-parallel redundancy problem

International Nuclear Information System (INIS)

Sadjadi, Seyed Jafar; Soltani, R.

2009-01-01

We present a heuristic approach to solve a general framework of serial-parallel redundancy problem where the reliability of the system is maximized subject to some general linear constraints. The complexity of the redundancy problem is generally considered to be NP-Hard and the optimal solution is not normally available. Therefore, to evaluate the performance of the proposed method, a hybrid genetic algorithm is also implemented whose parameters are calibrated via Taguchi's robust design method. Then, various test problems are solved and the computational results indicate that the proposed heuristic approach could provide us some promising reliabilities, which are fairly close to optimal solutions in a reasonable amount of time.
Alignment between galaxies and large-scale structure

International Nuclear Information System (INIS)

Faltenbacher, A.; Li Cheng; White, Simon D. M.; Jing, Yi-Peng; Mao Shude; Wang Jie

2009-01-01

Based on the Sloan Digital Sky Survey DR6 (SDSS) and the Millennium Simulation (MS), we investigate the alignment between galaxies and large-scale structure. For this purpose, we develop two new statistical tools, namely the alignment correlation function and the cos(2θ)-statistic. The former is a two-dimensional extension of the traditional two-point correlation function and the latter is related to the ellipticity correlation function used for cosmic shear measurements. Both are based on the cross correlation between a sample of galaxies with orientations and a reference sample which represents the large-scale structure. We apply the new statistics to the SDSS galaxy catalog. The alignment correlation function reveals an overabundance of reference galaxies along the major axes of red, luminous (L ∼ * ) galaxies out to projected separations of 60 h- 1 Mpc. The signal increases with central galaxy luminosity. No alignment signal is detected for blue galaxies. The cos(2θ)-statistic yields very similar results. Starting from a MS semi-analytic galaxy catalog, we assign an orientation to each red, luminous and central galaxy, based on that of the central region of the host halo (with size similar to that of the stellar galaxy). As an alternative, we use the orientation of the host halo itself. We find a mean projected misalignment between a halo and its central region of ∼ 25 deg. The misalignment decreases slightly with increasing luminosity of the central galaxy. Using the orientations and luminosities of the semi-analytic galaxies, we repeat our alignment analysis on mock surveys of the MS. Agreement with the SDSS results is good if the central orientations are used. Predictions using the halo orientations as proxies for central galaxy orientations overestimate the observed alignment by more than a factor of 2. Finally, the large volume of the MS allows us to generate a two-dimensional map of the alignment correlation function, which shows the reference
Optimization of large-scale industrial systems : an emerging method

Energy Technology Data Exchange (ETDEWEB)

Hammache, A.; Aube, F.; Benali, M.; Cantave, R. [Natural Resources Canada, Varennes, PQ (Canada). CANMET Energy Technology Centre

2006-07-01

This paper reviewed optimization methods of large-scale industrial production systems and presented a novel systematic multi-objective and multi-scale optimization methodology. The methodology was based on a combined local optimality search with global optimality determination, and advanced system decomposition and constraint handling. The proposed method focused on the simultaneous optimization of the energy, economy and ecology aspects of industrial systems (E{sup 3}-ISO). The aim of the methodology was to provide guidelines for decision-making strategies. The approach was based on evolutionary algorithms (EA) with specifications including hybridization of global optimality determination with a local optimality search; a self-adaptive algorithm to account for the dynamic changes of operating parameters and design variables occurring during the optimization process; interactive optimization; advanced constraint handling and decomposition strategy; and object-oriented programming and parallelization techniques. Flowcharts of the working principles of the basic EA were presented. It was concluded that the EA uses a novel decomposition and constraint handling technique to enhance the Pareto solution search procedure for multi-objective problems. 6 refs., 9 figs.
Parallel Computing Strategies for Irregular Algorithms

Science.gov (United States)

Biswas, Rupak; Oliker, Leonid; Shan, Hongzhang; Biegel, Bryan (Technical Monitor)

2002-01-01

Parallel computing promises several orders of magnitude increase in our ability to solve realistic computationally-intensive problems, but relies on their efficient mapping and execution on large-scale multiprocessor architectures. Unfortunately, many important applications are irregular and dynamic in nature, making their effective parallel implementation a daunting task. Moreover, with the proliferation of parallel architectures and programming paradigms, the typical scientist is faced with a plethora of questions that must be answered in order to obtain an acceptable parallel implementation of the solution algorithm. In this paper, we consider three representative irregular applications: unstructured remeshing, sparse matrix computations, and N-body problems, and parallelize them using various popular programming paradigms on a wide spectrum of computer platforms ranging from state-of-the-art supercomputers to PC clusters. We present the underlying problems, the solution algorithms, and the parallel implementation strategies. Smart load-balancing, partitioning, and ordering techniques are used to enhance parallel performance. Overall results demonstrate the complexity of efficiently parallelizing irregular algorithms.
Efficient parallel and out of core algorithms for constructing large bi-directed de Bruijn graphs

Directory of Open Access Journals (Sweden)

Vaughn Matthew

2010-11-01

-directed de Bruijn graph is a fundamental data structure for any sequence assembly program based on Eulerian approach. Our algorithms for constructing Bi-directed de Bruijn graphs are efficient in parallel and out of core settings. These algorithms can be used in building large scale bi-directed de Bruijn graphs. Furthermore, our algorithms do not employ any all-to-all communications in a parallel setting and perform better than the prior algorithms. Finally our out-of-core algorithm is extremely memory efficient and can replace the existing graph construction algorithm in VELVET.
Efficient parallel and out of core algorithms for constructing large bi-directed de Bruijn graphs.

Science.gov (United States)

Kundeti, Vamsi K; Rajasekaran, Sanguthevar; Dinh, Hieu; Vaughn, Matthew; Thapar, Vishal

2010-11-15

any sequence assembly program based on Eulerian approach. Our algorithms for constructing Bi-directed de Bruijn graphs are efficient in parallel and out of core settings. These algorithms can be used in building large scale bi-directed de Bruijn graphs. Furthermore, our algorithms do not employ any all-to-all communications in a parallel setting and perform better than the prior algorithms. Finally our out-of-core algorithm is extremely memory efficient and can replace the existing graph construction algorithm in VELVET.
Large-scale structure after COBE: Peculiar velocities and correlations of cold dark matter halos

Science.gov (United States)

Zurek, Wojciech H.; Quinn, Peter J.; Salmon, John K.; Warren, Michael S.

1994-01-01

Large N-body simulations on parallel supercomputers allow one to simultaneously investigate large-scale structure and the formation of galactic halos with unprecedented resolution. Our study shows that the masses as well as the spatial distribution of halos on scales of tens of megaparsecs in a cold dark matter (CDM) universe with the spectrum normalized to the anisotropies detected by Cosmic Background Explorer (COBE) is compatible with the observations. We also show that the average value of the relative pairwise velocity dispersion sigma(sub v) - used as a principal argument against COBE-normalized CDM models-is significantly lower for halos than for individual particles. When the observational methods of extracting sigma(sub v) are applied to the redshift catalogs obtained from the numerical experiments, estimates differ significantly between different observation-sized samples and overlap observational estimates obtained following the same procedure.

A hybrid parallel framework for the cellular Potts model simulations

Energy Technology Data Exchange (ETDEWEB)

Jiang, Yi [Los Alamos National Laboratory; He, Kejing [SOUTH CHINA UNIV; Dong, Shoubin [SOUTH CHINA UNIV

2009-01-01

The Cellular Potts Model (CPM) has been widely used for biological simulations. However, most current implementations are either sequential or approximated, which can't be used for large scale complex 3D simulation. In this paper we present a hybrid parallel framework for CPM simulations. The time-consuming POE solving, cell division, and cell reaction operation are distributed to clusters using the Message Passing Interface (MPI). The Monte Carlo lattice update is parallelized on shared-memory SMP system using OpenMP. Because the Monte Carlo lattice update is much faster than the POE solving and SMP systems are more and more common, this hybrid approach achieves good performance and high accuracy at the same time. Based on the parallel Cellular Potts Model, we studied the avascular tumor growth using a multiscale model. The application and performance analysis show that the hybrid parallel framework is quite efficient. The hybrid parallel CPM can be used for the large scale simulation ({approx}10{sup 8} sites) of complex collective behavior of numerous cells ({approx}10{sup 6}).
Large scale three-dimensional topology optimisation of heat sinks cooled by natural convection

DEFF Research Database (Denmark)

Alexandersen, Joe; Sigmund, Ole; Aage, Niels

2016-01-01

the Bousinessq approximation. The fully coupled non-linear multiphysics system is solved using stabilised trilinear equal-order finite elements in a parallel framework allowing for the optimisation of large scale problems with order of 20-330 million state degrees of freedom. The flow is assumed to be laminar...... topologies verify prior conclusions regarding fin length/thickness ratios and Biot numbers, but also indicate that carefully tailored and complex geometries may improve cooling behaviour considerably compared to simple heat fin geometries. (C) 2016 Elsevier Ltd. All rights reserved....
Robust-yet-fragile nature of interdependent networks

Science.gov (United States)

Tan, Fei; Xia, Yongxiang; Wei, Zhi

2015-05-01

Interdependent networks have been shown to be extremely vulnerable based on the percolation model. Parshani et al. [Europhys. Lett. 92, 68002 (2010), 10.1209/0295-5075/92/68002] further indicated that the more intersimilar networks are, the more robust they are to random failures. When traffic load is considered, how do the coupling patterns impact cascading failures in interdependent networks? This question has been largely unexplored until now. In this paper, we address this question by investigating the robustness of interdependent Erdös-Rényi random graphs and Barabási-Albert scale-free networks under either random failures or intentional attacks. It is found that interdependent Erdös-Rényi random graphs are robust yet fragile under either random failures or intentional attacks. Interdependent Barabási-Albert scale-free networks, however, are only robust yet fragile under random failures but fragile under intentional attacks. We further analyze the interdependent communication network and power grid and achieve similar results. These results advance our understanding of how interdependency shapes network robustness.
Xyce parallel electronic simulator release notes.

Energy Technology Data Exchange (ETDEWEB)

Keiter, Eric R; Hoekstra, Robert John; Mei, Ting; Russo, Thomas V.; Schiek, Richard Louis; Thornquist, Heidi K.; Rankin, Eric Lamont; Coffey, Todd S; Pawlowski, Roger P; Santarelli, Keith R.

2010-05-01

The Xyce Parallel Electronic Simulator has been written to support, in a rigorous manner, the simulation needs of the Sandia National Laboratories electrical designers. Specific requirements include, among others, the ability to solve extremely large circuit problems by supporting large-scale parallel computing platforms, improved numerical performance and object-oriented code design and implementation. The Xyce release notes describe: Hardware and software requirements New features and enhancements Any defects fixed since the last release Current known defects and defect workarounds For up-to-date information not available at the time these notes were produced, please visit the Xyce web page at http://www.cs.sandia.gov/xyce.
The Software Reliability of Large Scale Integration Circuit and Very Large Scale Integration Circuit

OpenAIRE

Artem Ganiyev; Jan Vitasek

2010-01-01

This article describes evaluation method of faultless function of large scale integration circuits (LSI) and very large scale integration circuits (VLSI). In the article there is a comparative analysis of factors which determine faultless of integrated circuits, analysis of already existing methods and model of faultless function evaluation of LSI and VLSI. The main part describes a proposed algorithm and program for analysis of fault rate in LSI and VLSI circuits.
Scaling Behavior of Dilute Polymer Solutions Confined between Parallel Plates

NARCIS (Netherlands)

Vliet, J.H. van; Luyten, M.C.; Brinke, G. ten

1992-01-01

The average size and shape of a polymer coil confined in a slit between two parallel plates depends on the distance L between the plates. On the basis of numerical results, four different regimes can be distingubhed. For large values of L the coil is essentially unconfined. For intermediate values
Ecological niche modeling as a new paradigm for large-scale investigations of diversity and distribution of birds

Science.gov (United States)

A. Townsend Peterson; Daniel A. Kluza

2005-01-01

Large-scale assessments of the distribution and diversity of birds have been challenged by the need for a robust methodology for summarizing or predicting species' geographic distributions (e.g. Beard et al. 1999, Manel et al. 1999, Saveraid et al. 2001). Methodologies used in such studies have at times been inappropriate, or even more frequently limited in their...
Sparse deconvolution for the large-scale ill-posed inverse problem of impact force reconstruction

Science.gov (United States)

Qiao, Baijie; Zhang, Xingwu; Gao, Jiawei; Liu, Ruonan; Chen, Xuefeng

2017-01-01

Most previous regularization methods for solving the inverse problem of force reconstruction are to minimize the l2-norm of the desired force. However, these traditional regularization methods such as Tikhonov regularization and truncated singular value decomposition, commonly fail to solve the large-scale ill-posed inverse problem in moderate computational cost. In this paper, taking into account the sparse characteristic of impact force, the idea of sparse deconvolution is first introduced to the field of impact force reconstruction and a general sparse deconvolution model of impact force is constructed. Second, a novel impact force reconstruction method based on the primal-dual interior point method (PDIPM) is proposed to solve such a large-scale sparse deconvolution model, where minimizing the l2-norm is replaced by minimizing the l1-norm. Meanwhile, the preconditioned conjugate gradient algorithm is used to compute the search direction of PDIPM with high computational efficiency. Finally, two experiments including the small-scale or medium-scale single impact force reconstruction and the relatively large-scale consecutive impact force reconstruction are conducted on a composite wind turbine blade and a shell structure to illustrate the advantage of PDIPM. Compared with Tikhonov regularization, PDIPM is more efficient, accurate and robust whether in the single impact force reconstruction or in the consecutive impact force reconstruction.
Coupling graph perturbation theory with scalable parallel algorithms for large-scale enumeration of maximal cliques in biological graphs

International Nuclear Information System (INIS)

Samatova, N F; Schmidt, M C; Hendrix, W; Breimyer, P; Thomas, K; Park, B-H

2008-01-01

Data-driven construction of predictive models for biological systems faces challenges from data intensity, uncertainty, and computational complexity. Data-driven model inference is often considered a combinatorial graph problem where an enumeration of all feasible models is sought. The data-intensive and the NP-hard nature of such problems, however, challenges existing methods to meet the required scale of data size and uncertainty, even on modern supercomputers. Maximal clique enumeration (MCE) in a graph derived from such biological data is often a rate-limiting step in detecting protein complexes in protein interaction data, finding clusters of co-expressed genes in microarray data, or identifying clusters of orthologous genes in protein sequence data. We report two key advances that address this challenge. We designed and implemented the first (to the best of our knowledge) parallel MCE algorithm that scales linearly on thousands of processors running MCE on real-world biological networks with thousands and hundreds of thousands of vertices. In addition, we proposed and developed the Graph Perturbation Theory (GPT) that establishes a foundation for efficiently solving the MCE problem in perturbed graphs, which model the uncertainty in the data. GPT formulates necessary and sufficient conditions for detecting the differences between the sets of maximal cliques in the original and perturbed graphs and reduces the enumeration time by more than 80% compared to complete recomputation
Apparatus to examine pulsed parallel field losses in large conductors

International Nuclear Information System (INIS)

Miller, J.R.; Shen, S.S.

1977-01-01

Conductors in tokamak toroidal field coils will be exposed to pulsed fields both parallel and perpendicular to the current direction. These conductors will likely be quite high capacity (10 to 20 kA) and therefore probably will be built up out of smaller units. We have previously published measurements of losses in conductors exposed to a pulsed parallel field, but those experiments necessarily used monolithic conductors of relatively small cross section because the pulse coil, a torus that surrounded the test conductor, was itself small. Here we describe an apparatus that is conceptually similar but has been scaled up to accept conductors of much larger cross section and current capacity. The apparatus consists basically of a superconducting torus that contains a movable spool to allow test samples to be wound inside without unwinding the torus. Details of apparatus design and capabilities are described and preliminary results from tests of the apparatus and from loss measurements using it are reported
Accurate and Efficient Parallel Implementation of an Effective Linear-Scaling Direct Random Phase Approximation Method.

Science.gov (United States)

Graf, Daniel; Beuerle, Matthias; Schurkus, Henry F; Luenser, Arne; Savasci, Gökcen; Ochsenfeld, Christian

2018-05-08

An efficient algorithm for calculating the random phase approximation (RPA) correlation energy is presented that is as accurate as the canonical molecular orbital resolution-of-the-identity RPA (RI-RPA) with the important advantage of an effective linear-scaling behavior (instead of quartic) for large systems due to a formulation in the local atomic orbital space. The high accuracy is achieved by utilizing optimized minimax integration schemes and the local Coulomb metric attenuated by the complementary error function for the RI approximation. The memory bottleneck of former atomic orbital (AO)-RI-RPA implementations ( Schurkus, H. F.; Ochsenfeld, C. J. Chem. Phys. 2016 , 144 , 031101 and Luenser, A.; Schurkus, H. F.; Ochsenfeld, C. J. Chem. Theory Comput. 2017 , 13 , 1647 - 1655 ) is addressed by precontraction of the large 3-center integral matrix with the Cholesky factors of the ground state density reducing the memory requirements of that matrix by a factor of [Formula: see text]. Furthermore, we present a parallel implementation of our method, which not only leads to faster RPA correlation energy calculations but also to a scalable decrease in memory requirements, opening the door for investigations of large molecules even on small- to medium-sized computing clusters. Although it is known that AO methods are highly efficient for extended systems, where sparsity allows for reaching the linear-scaling regime, we show that our work also extends the applicability when considering highly delocalized systems for which no linear scaling can be achieved. As an example, the interlayer distance of two covalent organic framework pore fragments (comprising 384 atoms in total) is analyzed.
Phylogenetic distribution of large-scale genome patchiness

Directory of Open Access Journals (Sweden)

Hackenberg Michael

2008-04-01

Full Text Available Abstract Background The phylogenetic distribution of large-scale genome structure (i.e. mosaic compositional patchiness has been explored mainly by analytical ultracentrifugation of bulk DNA. However, with the availability of large, good-quality chromosome sequences, and the recently developed computational methods to directly analyze patchiness on the genome sequence, an evolutionary comparative analysis can be carried out at the sequence level. Results The local variations in the scaling exponent of the Detrended Fluctuation Analysis are used here to analyze large-scale genome structure and directly uncover the characteristic scales present in genome sequences. Furthermore, through shuffling experiments of selected genome regions, computationally-identified, isochore-like regions were identified as the biological source for the uncovered large-scale genome structure. The phylogenetic distribution of short- and large-scale patchiness was determined in the best-sequenced genome assemblies from eleven eukaryotic genomes: mammals (Homo sapiens, Pan troglodytes, Mus musculus, Rattus norvegicus, and Canis familiaris, birds (Gallus gallus, fishes (Danio rerio, invertebrates (Drosophila melanogaster and Caenorhabditis elegans, plants (Arabidopsis thaliana and yeasts (Saccharomyces cerevisiae. We found large-scale patchiness of genome structure, associated with in silico determined, isochore-like regions, throughout this wide phylogenetic range. Conclusion Large-scale genome structure is detected by directly analyzing DNA sequences in a wide range of eukaryotic chromosome sequences, from human to yeast. In all these genomes, large-scale patchiness can be associated with the isochore-like regions, as directly detected in silico at the sequence level.
Managing large-scale models: DBS

International Nuclear Information System (INIS)

1981-05-01

A set of fundamental management tools for developing and operating a large scale model and data base system is presented. Based on experience in operating and developing a large scale computerized system, the only reasonable way to gain strong management control of such a system is to implement appropriate controls and procedures. Chapter I discusses the purpose of the book. Chapter II classifies a broad range of generic management problems into three groups: documentation, operations, and maintenance. First, system problems are identified then solutions for gaining management control are disucssed. Chapters III, IV, and V present practical methods for dealing with these problems. These methods were developed for managing SEAS but have general application for large scale models and data bases
Large Scale Self-Organizing Information Distribution System

National Research Council Canada - National Science Library

Low, Steven

2005-01-01

This project investigates issues in "large-scale" networks. Here "large-scale" refers to networks with large number of high capacity nodes and transmission links, and shared by a large number of users...
Self-assembly of highly fluorescent semiconductor nanorods into large scale smectic liquid crystal structures by coffee stain evaporation dynamics

International Nuclear Information System (INIS)

Nobile, Concetta; Carbone, Luigi; Fiore, Angela; Cingolani, Roberto; Manna, Liberato; Krahne, Roman

2009-01-01

We deposit droplets of nanorods dispersed in solvents on substrate surfaces and let the solvent evaporate. We find that strong contact line pinning leads to dense nanorod deposition inside coffee stain fringes, where we observe large scale lateral ordering of the nanorods with the long axis of the rods oriented parallel to the contact line. We observe birefringence of these coffee stain fringes by polarized microscopy and we find the direction of the extraordinary refractive index parallel to the long axis of the nanorods.
Development of a large-scale general purpose two-phase flow analysis code

International Nuclear Information System (INIS)

Terasaka, Haruo; Shimizu, Sensuke

2001-01-01

A general purpose three-dimensional two-phase flow analysis code has been developed for solving large-scale problems in industrial fields. The code uses a two-fluid model to describe the conservation equations for two-phase flow in order to be applicable to various phenomena. Complicated geometrical conditions are modeled by FAVOR method in structured grid systems, and the discretization equations are solved by a modified SIMPLEST scheme. To reduce computing time a matrix solver for the pressure correction equation is parallelized with OpenMP. Results of numerical examples show that the accurate solutions can be obtained efficiently and stably. (author)
Highly uniform parallel microfabrication using a large numerical aperture system

Energy Technology Data Exchange (ETDEWEB)

Zhang, Zi-Yu; Su, Ya-Hui, E-mail: ustcsyh@ahu.edu.cn, E-mail: dongwu@ustc.edu.cn [School of Electrical Engineering and Automation, Anhui University, Hefei 230601 (China); Zhang, Chen-Chu; Hu, Yan-Lei; Wang, Chao-Wei; Li, Jia-Wen; Chu, Jia-Ru; Wu, Dong, E-mail: ustcsyh@ahu.edu.cn, E-mail: dongwu@ustc.edu.cn [CAS Key Laboratory of Mechanical Behavior and Design of Materials, Department of Precision Machinery and Precision Instrumentation, University of Science and Technology of China, Hefei 230026 (China)

2016-07-11

In this letter, we report an improved algorithm to produce accurate phase patterns for generating highly uniform diffraction-limited multifocal arrays in a large numerical aperture objective system. It is shown that based on the original diffraction integral, the uniformity of the diffraction-limited focal arrays can be improved from ∼75% to >97%, owing to the critical consideration of the aperture function and apodization effect associated with a large numerical aperture objective. The experimental results, e.g., 3 × 3 arrays of square and triangle, seven microlens arrays with high uniformity, further verify the advantage of the improved algorithm. This algorithm enables the laser parallel processing technology to realize uniform microstructures and functional devices in the microfabrication system with a large numerical aperture objective.
Robust nonlinear PID-like fuzzy logic control of a planar parallel (2PRP-PPR) manipulator.

Science.gov (United States)

Londhe, P S; Singh, Yogesh; Santhakumar, M; Patre, B M; Waghmare, L M

2016-07-01

In this paper, a robust nonlinear proportional-integral-derivative (PID)-like fuzzy control scheme is presented and applied to complex trajectory tracking control of a 2PRP-PPR (P-prismatic, R-revolute) planar parallel manipulator (motion platform) with three degrees-of-freedom (DOF) in the presence of parameter uncertainties and external disturbances. The proposed control law consists of mainly two parts: first part uses a feed forward term to enhance the control activity and estimated perturbed term to compensate for the unknown effects namely external disturbances and unmodeled dynamics, and the second part uses a PID-like fuzzy logic control as a feedback portion to enhance the overall closed-loop stability of the system. Experimental results are presented to show the effectiveness of the proposed control scheme. Copyright © 2016 ISA. Published by Elsevier Ltd. All rights reserved.
Visual Interfaces for Parallel Simulations (VIPS), Phase I

Data.gov (United States)

National Aeronautics and Space Administration — Configuring the 3D geometry and physics of large scale parallel physics simulations is increasingly complex. Given the investment in time and effort to run these...
Large scale structure and baryogenesis

International Nuclear Information System (INIS)

Kirilova, D.P.; Chizhov, M.V.

2001-08-01

We discuss a possible connection between the large scale structure formation and the baryogenesis in the universe. An update review of the observational indications for the presence of a very large scale 120h -1 Mpc in the distribution of the visible matter of the universe is provided. The possibility to generate a periodic distribution with the characteristic scale 120h -1 Mpc through a mechanism producing quasi-periodic baryon density perturbations during inflationary stage, is discussed. The evolution of the baryon charge density distribution is explored in the framework of a low temperature boson condensate baryogenesis scenario. Both the observed very large scale of a the visible matter distribution in the universe and the observed baryon asymmetry value could naturally appear as a result of the evolution of a complex scalar field condensate, formed at the inflationary stage. Moreover, for some model's parameters a natural separation of matter superclusters from antimatter ones can be achieved. (author)

A Parallel Algorithm for Connected Component Labelling of Gray-scale Images on Homogeneous Multicore Architectures

International Nuclear Information System (INIS)

Niknam, Mehdi; Thulasiraman, Parimala; Camorlinga, Sergio

2010-01-01

Connected component labelling is an essential step in image processing. We provide a parallel version of Suzuki's sequential connected component algorithm in order to speed up the labelling process. Also, we modify the algorithm to enable labelling gray-scale images. Due to the data dependencies in the algorithm we used a method similar to pipeline to exploit parallelism. The parallel algorithm method achieved a speedup of 2.5 for image size of 256 x 256 pixels using 4 processing threads.
Automatic management software for large-scale cluster system

International Nuclear Information System (INIS)

Weng Yunjian; Chinese Academy of Sciences, Beijing; Sun Gongxing

2007-01-01

At present, the large-scale cluster system faces to the difficult management. For example the manager has large work load. It needs to cost much time on the management and the maintenance of large-scale cluster system. The nodes in large-scale cluster system are very easy to be chaotic. Thousands of nodes are put in big rooms so that some managers are very easy to make the confusion with machines. How do effectively carry on accurate management under the large-scale cluster system? The article introduces ELFms in the large-scale cluster system. Furthermore, it is proposed to realize the large-scale cluster system automatic management. (authors)
A review of advanced small-scale parallel bioreactor technology for accelerated process development: current state and future need.

Science.gov (United States)

Bareither, Rachel; Pollard, David

2011-01-01

The pharmaceutical and biotech industries face continued pressure to reduce development costs and accelerate process development. This challenge occurs alongside the need for increased upstream experimentation to support quality by design initiatives and the pursuit of predictive models from systems biology. A small scale system enabling multiple reactions in parallel (n ≥ 20), with automated sampling and integrated to purification, would provide significant improvement (four to fivefold) to development timelines. State of the art attempts to pursue high throughput process development include shake flasks, microfluidic reactors, microtiter plates and small-scale stirred reactors. The limitations of these systems are compared to desired criteria to mimic large scale commercial processes. The comparison shows that significant technological improvement is still required to provide automated solutions that can speed upstream process development. Copyright © 2010 American Institute of Chemical Engineers (AIChE).
Large-Scale Constraint-Based Pattern Mining

Science.gov (United States)

Zhu, Feida

2009-01-01

We studied the problem of constraint-based pattern mining for three different data formats, item-set, sequence and graph, and focused on mining patterns of large sizes. Colossal patterns in each data formats are studied to discover pruning properties that are useful for direct mining of these patterns. For item-set data, we observed robustness of…
Robust and scalable optical one-way quantum computation

International Nuclear Information System (INIS)

Wang Hefeng; Yang Chuiping; Nori, Franco

2010-01-01

We propose an efficient approach for deterministically generating scalable cluster states with photons. This approach involves unitary transformations performed on atoms coupled to optical cavities. Its operation cost scales linearly with the number of qubits in the cluster state, and photon qubits are encoded such that single-qubit operations can be easily implemented by using linear optics. Robust optical one-way quantum computation can be performed since cluster states can be stored in atoms and then transferred to photons that can be easily operated and measured. Therefore, this proposal could help in performing robust large-scale optical one-way quantum computation.
Large-Scale Production of Nanographite by Tube-Shear Exfoliation in Water.

Directory of Open Access Journals (Sweden)

Nicklas Blomquist

Full Text Available The number of applications based on graphene, few-layer graphene, and nanographite is rapidly increasing. A large-scale process for production of these materials is critically needed to achieve cost-effective commercial products. Here, we present a novel process to mechanically exfoliate industrial quantities of nanographite from graphite in an aqueous environment with low energy consumption and at controlled shear conditions. This process, based on hydrodynamic tube shearing, produced nanometer-thick and micrometer-wide flakes of nanographite with a production rate exceeding 500 gh-1 with an energy consumption about 10 Whg-1. In addition, to facilitate large-area coating, we show that the nanographite can be mixed with nanofibrillated cellulose in the process to form highly conductive, robust and environmentally friendly composites. This composite has a sheet resistance below 1.75 Ω/sq and an electrical resistivity of 1.39×10-4 Ωm and may find use in several applications, from supercapacitors and batteries to printed electronics and solar cells. A batch of 100 liter was processed in less than 4 hours. The design of the process allow scaling to even larger volumes and the low energy consumption indicates a low-cost process.
Robustness analysis of a parallel two-box digital polynomial predistorter for an SOA-based CO-OFDM system

Science.gov (United States)

Diouf, C.; Younes, M.; Noaja, A.; Azou, S.; Telescu, M.; Morel, P.; Tanguy, N.

2017-11-01

The linearization performance of various digital baseband pre-distortion schemes is evaluated in this paper for a coherent optical OFDM (CO-OFDM) transmitter employing a semiconductor optical amplifier (SOA). In particular, the benefits of using a parallel two-box (PTB) behavioral model, combining a static nonlinear function with a memory polynomial (MP) model, is investigated for mitigating the system nonlinearities and compared to the memoryless and MP models. Moreover, the robustness of the predistorters under different operating conditions and system uncertainties is assessed based on a precise SOA physical model. The PTB scheme proves to be the most effective linearization technique for the considered setup, with an excellent performance-complexity tradeoff over a wide range of conditions.
Parallel finite elements with domain decomposition and its pre-processing

International Nuclear Information System (INIS)

Yoshida, A.; Yagawa, G.; Hamada, S.

1993-01-01

This paper describes a parallel finite element analysis using a domain decomposition method, and the pre-processing for the parallel calculation. Computer simulations are about to replace experiments in various fields, and the scale of model to be simulated tends to be extremely large. On the other hand, computational environment has drastically changed in these years. Especially, parallel processing on massively parallel computers or computer networks is considered to be promising techniques. In order to achieve high efficiency on such parallel computation environment, large granularity of tasks, a well-balanced workload distribution are key issues. It is also important to reduce the cost of pre-processing in such parallel FEM. From the point of view, the authors developed the domain decomposition FEM with the automatic and dynamic task-allocation mechanism and the automatic mesh generation/domain subdivision system for it. (author)
Analysis of Parallel Algorithms on SMP Node and Cluster of Workstations Using Parallel Programming Models with New Tile-based Method for Large Biological Datasets.

Science.gov (United States)

Shrimankar, D D; Sathe, S R

2016-01-01

Sequence alignment is an important tool for describing the relationships between DNA sequences. Many sequence alignment algorithms exist, differing in efficiency, in their models of the sequences, and in the relationship between sequences. The focus of this study is to obtain an optimal alignment between two sequences of biological data, particularly DNA sequences. The algorithm is discussed with particular emphasis on time, speedup, and efficiency optimizations. Parallel programming presents a number of critical challenges to application developers. Today's supercomputer often consists of clusters of SMP nodes. Programming paradigms such as OpenMP and MPI are used to write parallel codes for such architectures. However, the OpenMP programs cannot be scaled for more than a single SMP node. However, programs written in MPI can have more than single SMP nodes. But such a programming paradigm has an overhead of internode communication. In this work, we explore the tradeoffs between using OpenMP and MPI. We demonstrate that the communication overhead incurs significantly even in OpenMP loop execution and increases with the number of cores participating. We also demonstrate a communication model to approximate the overhead from communication in OpenMP loops. Our results are astonishing and interesting to a large variety of input data files. We have developed our own load balancing and cache optimization technique for message passing model. Our experimental results show that our own developed techniques give optimum performance of our parallel algorithm for various sizes of input parameter, such as sequence size and tile size, on a wide variety of multicore architectures.
Analysis of Parallel Algorithms on SMP Node and Cluster of Workstations Using Parallel Programming Models with New Tile-based Method for Large Biological Datasets

Science.gov (United States)

Shrimankar, D. D.; Sathe, S. R.

2016-01-01

Sequence alignment is an important tool for describing the relationships between DNA sequences. Many sequence alignment algorithms exist, differing in efficiency, in their models of the sequences, and in the relationship between sequences. The focus of this study is to obtain an optimal alignment between two sequences of biological data, particularly DNA sequences. The algorithm is discussed with particular emphasis on time, speedup, and efficiency optimizations. Parallel programming presents a number of critical challenges to application developers. Today’s supercomputer often consists of clusters of SMP nodes. Programming paradigms such as OpenMP and MPI are used to write parallel codes for such architectures. However, the OpenMP programs cannot be scaled for more than a single SMP node. However, programs written in MPI can have more than single SMP nodes. But such a programming paradigm has an overhead of internode communication. In this work, we explore the tradeoffs between using OpenMP and MPI. We demonstrate that the communication overhead incurs significantly even in OpenMP loop execution and increases with the number of cores participating. We also demonstrate a communication model to approximate the overhead from communication in OpenMP loops. Our results are astonishing and interesting to a large variety of input data files. We have developed our own load balancing and cache optimization technique for message passing model. Our experimental results show that our own developed techniques give optimum performance of our parallel algorithm for various sizes of input parameter, such as sequence size and tile size, on a wide variety of multicore architectures. PMID:27932868
Parallel-In-Time For Moving Meshes

Energy Technology Data Exchange (ETDEWEB)

Falgout, R. D. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Manteuffel, T. A. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Southworth, B. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Schroder, J. B. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)

2016-02-04

With steadily growing computational resources available, scientists must develop e ective ways to utilize the increased resources. High performance, highly parallel software has be- come a standard. However until recent years parallelism has focused primarily on the spatial domain. When solving a space-time partial di erential equation (PDE), this leads to a sequential bottleneck in the temporal dimension, particularly when taking a large number of time steps. The XBraid parallel-in-time library was developed as a practical way to add temporal parallelism to existing se- quential codes with only minor modi cations. In this work, a rezoning-type moving mesh is applied to a di usion problem and formulated in a parallel-in-time framework. Tests and scaling studies are run using XBraid and demonstrate excellent results for the simple model problem considered herein.
Large scale network-centric distributed systems

CERN Document Server

Sarbazi-Azad, Hamid

2014-01-01

A highly accessible reference offering a broad range of topics and insights on large scale network-centric distributed systems Evolving from the fields of high-performance computing and networking, large scale network-centric distributed systems continues to grow as one of the most important topics in computing and communication and many interdisciplinary areas. Dealing with both wired and wireless networks, this book focuses on the design and performance issues of such systems. Large Scale Network-Centric Distributed Systems provides in-depth coverage ranging from ground-level hardware issu
General-purpose parallel simulator for quantum computing

International Nuclear Information System (INIS)

Niwa, Jumpei; Matsumoto, Keiji; Imai, Hiroshi

2002-01-01

With current technologies, it seems to be very difficult to implement quantum computers with many qubits. It is therefore of importance to simulate quantum algorithms and circuits on the existing computers. However, for a large-size problem, the simulation often requires more computational power than is available from sequential processing. Therefore, simulation methods for parallel processors are required. We have developed a general-purpose simulator for quantum algorithms/circuits on the parallel computer (Sun Enterprise4500). It can simulate algorithms/circuits with up to 30 qubits. In order to test efficiency of our proposed methods, we have simulated Shor's factorization algorithm and Grover's database search, and we have analyzed robustness of the corresponding quantum circuits in the presence of both decoherence and operational errors. The corresponding results, statistics, and analyses are presented in this paper
Large-Scale Outflows in Seyfert Galaxies

Science.gov (United States)

Colbert, E. J. M.; Baum, S. A.

1995-12-01

\\catcode`\\@=11 \\ialign{m @th#1hfil ##hfil \\crcr#2\\crcr\\sim\\crcr}}} \\catcode`\\@=12 Highly collimated outflows extend out to Mpc scales in many radio-loud active galaxies. In Seyfert galaxies, which are radio-quiet, the outflows extend out to kpc scales and do not appear to be as highly collimated. In order to study the nature of large-scale (>~1 kpc) outflows in Seyferts, we have conducted optical, radio and X-ray surveys of a distance-limited sample of 22 edge-on Seyfert galaxies. Results of the optical emission-line imaging and spectroscopic survey imply that large-scale outflows are present in >~{{1} /{4}} of all Seyferts. The radio (VLA) and X-ray (ROSAT) surveys show that large-scale radio and X-ray emission is present at about the same frequency. Kinetic luminosities of the outflows in Seyferts are comparable to those in starburst-driven superwinds. Large-scale radio sources in Seyferts appear diffuse, but do not resemble radio halos found in some edge-on starburst galaxies (e.g. M82). We discuss the feasibility of the outflows being powered by the active nucleus (e.g. a jet) or a circumnuclear starburst.
GENESIS: a hybrid-parallel and multi-scale molecular dynamics simulator with enhanced sampling algorithms for biomolecular and cellular simulations.

Science.gov (United States)

Jung, Jaewoon; Mori, Takaharu; Kobayashi, Chigusa; Matsunaga, Yasuhiro; Yoda, Takao; Feig, Michael; Sugita, Yuji

2015-07-01

GENESIS (Generalized-Ensemble Simulation System) is a new software package for molecular dynamics (MD) simulations of macromolecules. It has two MD simulators, called ATDYN and SPDYN. ATDYN is parallelized based on an atomic decomposition algorithm for the simulations of all-atom force-field models as well as coarse-grained Go-like models. SPDYN is highly parallelized based on a domain decomposition scheme, allowing large-scale MD simulations on supercomputers. Hybrid schemes combining OpenMP and MPI are used in both simulators to target modern multicore computer architectures. Key advantages of GENESIS are (1) the highly parallel performance of SPDYN for very large biological systems consisting of more than one million atoms and (2) the availability of various REMD algorithms (T-REMD, REUS, multi-dimensional REMD for both all-atom and Go-like models under the NVT, NPT, NPAT, and NPγT ensembles). The former is achieved by a combination of the midpoint cell method and the efficient three-dimensional Fast Fourier Transform algorithm, where the domain decomposition space is shared in real-space and reciprocal-space calculations. Other features in SPDYN, such as avoiding concurrent memory access, reducing communication times, and usage of parallel input/output files, also contribute to the performance. We show the REMD simulation results of a mixed (POPC/DMPC) lipid bilayer as a real application using GENESIS. GENESIS is released as free software under the GPLv2 licence and can be easily modified for the development of new algorithms and molecular models. WIREs Comput Mol Sci 2015, 5:310-323. doi: 10.1002/wcms.1220.
Large-scale transportation network congestion evolution prediction using deep learning theory.

Science.gov (United States)

Ma, Xiaolei; Yu, Haiyang; Wang, Yunpeng; Wang, Yinhai

2015-01-01

Understanding how congestion at one location can cause ripples throughout large-scale transportation network is vital for transportation researchers and practitioners to pinpoint traffic bottlenecks for congestion mitigation. Traditional studies rely on either mathematical equations or simulation techniques to model traffic congestion dynamics. However, most of the approaches have limitations, largely due to unrealistic assumptions and cumbersome parameter calibration process. With the development of Intelligent Transportation Systems (ITS) and Internet of Things (IoT), transportation data become more and more ubiquitous. This triggers a series of data-driven research to investigate transportation phenomena. Among them, deep learning theory is considered one of the most promising techniques to tackle tremendous high-dimensional data. This study attempts to extend deep learning theory into large-scale transportation network analysis. A deep Restricted Boltzmann Machine and Recurrent Neural Network architecture is utilized to model and predict traffic congestion evolution based on Global Positioning System (GPS) data from taxi. A numerical study in Ningbo, China is conducted to validate the effectiveness and efficiency of the proposed method. Results show that the prediction accuracy can achieve as high as 88% within less than 6 minutes when the model is implemented in a Graphic Processing Unit (GPU)-based parallel computing environment. The predicted congestion evolution patterns can be visualized temporally and spatially through a map-based platform to identify the vulnerable links for proactive congestion mitigation.
SCALE INTERACTION IN A MIXING LAYER. THE ROLE OF THE LARGE-SCALE GRADIENTS

KAUST Repository

Fiscaletti, Daniele

2015-08-23

The interaction between scales is investigated in a turbulent mixing layer. The large-scale amplitude modulation of the small scales already observed in other works depends on the crosswise location. Large-scale positive fluctuations correlate with a stronger activity of the small scales on the low speed-side of the mixing layer, and a reduced activity on the high speed-side. However, from physical considerations we would expect the scales to interact in a qualitatively similar way within the flow and across different turbulent flows. Therefore, instead of the large-scale fluctuations, the large-scale gradients modulation of the small scales has been additionally investigated.
A Web-based Distributed Voluntary Computing Platform for Large Scale Hydrological Computations

Science.gov (United States)

Demir, I.; Agliamzanov, R.

2014-12-01

Distributed volunteer computing can enable researchers and scientist to form large parallel computing environments to utilize the computing power of the millions of computers on the Internet, and use them towards running large scale environmental simulations and models to serve the common good of local communities and the world. Recent developments in web technologies and standards allow client-side scripting languages to run at speeds close to native application, and utilize the power of Graphics Processing Units (GPU). Using a client-side scripting language like JavaScript, we have developed an open distributed computing framework that makes it easy for researchers to write their own hydrologic models, and run them on volunteer computers. Users will easily enable their websites for visitors to volunteer sharing their computer resources to contribute running advanced hydrological models and simulations. Using a web-based system allows users to start volunteering their computational resources within seconds without installing any software. The framework distributes the model simulation to thousands of nodes in small spatial and computational sizes. A relational database system is utilized for managing data connections and queue management for the distributed computing nodes. In this paper, we present a web-based distributed volunteer computing platform to enable large scale hydrological simulations and model runs in an open and integrated environment.
Distributed Robustness Analysis of Interconnected Uncertain Systems Using Chordal Decomposition

DEFF Research Database (Denmark)

Pakazad, Sina Khoshfetrat; Hansson, Anders; Andersen, Martin Skovgaard

2014-01-01

Large-scale interconnected uncertain systems commonly have large state and uncertainty dimensions. Aside from the heavy computational cost of performing robust stability analysis in a centralized manner, privacy requirements in the network can also introduce further issues. In this paper, we util...
An efficient method based on the uniformity principle for synthesis of large-scale heat exchanger networks

International Nuclear Information System (INIS)

Zhang, Chunwei; Cui, Guomin; Chen, Shang

2016-01-01

Highlights: • Two dimensionless uniformity factors are presented to heat exchange network. • The grouping of process streams reduces the computational complexity of large-scale HENS problems. • The optimal sub-network can be obtained by Powell particle swarm optimization algorithm. • The method is illustrated by a case study involving 39 process streams, with a better solution. - Abstract: The optimal design of large-scale heat exchanger networks is a difficult task due to the inherent non-linear characteristics and the combinatorial nature of heat exchangers. To solve large-scale heat exchanger network synthesis (HENS) problems, two dimensionless uniformity factors to describe the heat exchanger network (HEN) uniformity in terms of the temperature difference and the accuracy of process stream grouping are deduced. Additionally, a novel algorithm that combines deterministic and stochastic optimizations to obtain an optimal sub-network with a suitable heat load for a given group of streams is proposed, and is named the Powell particle swarm optimization (PPSO). As a result, the synthesis of large-scale heat exchanger networks is divided into two corresponding sub-parts, namely, the grouping of process streams and the optimization of sub-networks. This approach reduces the computational complexity and increases the efficiency of the proposed method. The robustness and effectiveness of the proposed method are demonstrated by solving a large-scale HENS problem involving 39 process streams, and the results obtained are better than those previously published in the literature.

A new large-scale manufacturing platform for complex biopharmaceuticals.

Science.gov (United States)

Vogel, Jens H; Nguyen, Huong; Giovannini, Roberto; Ignowski, Jolene; Garger, Steve; Salgotra, Anil; Tom, Jennifer

2012-12-01

Complex biopharmaceuticals, such as recombinant blood coagulation factors, are addressing critical medical needs and represent a growing multibillion-dollar market. For commercial manufacturing of such, sometimes inherently unstable, molecules it is important to minimize product residence time in non-ideal milieu in order to obtain acceptable yields and consistently high product quality. Continuous perfusion cell culture allows minimization of residence time in the bioreactor, but also brings unique challenges in product recovery, which requires innovative solutions. In order to maximize yield, process efficiency, facility and equipment utilization, we have developed, scaled-up and successfully implemented a new integrated manufacturing platform in commercial scale. This platform consists of a (semi-)continuous cell separation process based on a disposable flow path and integrated with the upstream perfusion operation, followed by membrane chromatography on large-scale adsorber capsules in rapid cycling mode. Implementation of the platform at commercial scale for a new product candidate led to a yield improvement of 40% compared to the conventional process technology, while product quality has been shown to be more consistently high. Over 1,000,000 L of cell culture harvest have been processed with 100% success rate to date, demonstrating the robustness of the new platform process in GMP manufacturing. While membrane chromatography is well established for polishing in flow-through mode, this is its first commercial-scale application for bind/elute chromatography in the biopharmaceutical industry and demonstrates its potential in particular for manufacturing of potent, low-dose biopharmaceuticals. Copyright © 2012 Wiley Periodicals, Inc.
A Parallel, Multi-Scale Watershed-Hydrologic-Inundation Model with Adaptively Switching Mesh for Capturing Flooding and Lake Dynamics

Science.gov (United States)

Ji, X.; Shen, C.

2017-12-01

Flood inundation presents substantial societal hazards and also changes biogeochemistry for systems like the Amazon. It is often expensive to simulate high-resolution flood inundation and propagation in a long-term watershed-scale model. Due to the Courant-Friedrichs-Lewy (CFL) restriction, high resolution and large local flow velocity both demand prohibitively small time steps even for parallel codes. Here we develop a parallel surface-subsurface process-based model enhanced by multi-resolution meshes that are adaptively switched on or off. The high-resolution overland flow meshes are enabled only when the flood wave invades to floodplains. This model applies semi-implicit, semi-Lagrangian (SISL) scheme in solving dynamic wave equations, and with the assistant of the multi-mesh method, it also adaptively chooses the dynamic wave equation only in the area of deep inundation. Therefore, the model achieves a balance between accuracy and computational cost.
Parallel MR imaging.

Science.gov (United States)

Deshmane, Anagha; Gulani, Vikas; Griswold, Mark A; Seiberlich, Nicole

2012-07-01

Parallel imaging is a robust method for accelerating the acquisition of magnetic resonance imaging (MRI) data, and has made possible many new applications of MR imaging. Parallel imaging works by acquiring a reduced amount of k-space data with an array of receiver coils. These undersampled data can be acquired more quickly, but the undersampling leads to aliased images. One of several parallel imaging algorithms can then be used to reconstruct artifact-free images from either the aliased images (SENSE-type reconstruction) or from the undersampled data (GRAPPA-type reconstruction). The advantages of parallel imaging in a clinical setting include faster image acquisition, which can be used, for instance, to shorten breath-hold times resulting in fewer motion-corrupted examinations. In this article the basic concepts behind parallel imaging are introduced. The relationship between undersampling and aliasing is discussed and two commonly used parallel imaging methods, SENSE and GRAPPA, are explained in detail. Examples of artifacts arising from parallel imaging are shown and ways to detect and mitigate these artifacts are described. Finally, several current applications of parallel imaging are presented and recent advancements and promising research in parallel imaging are briefly reviewed. Copyright © 2012 Wiley Periodicals, Inc.
Dissecting the large-scale galactic conformity

Science.gov (United States)

Seo, Seongu

2018-01-01

Galactic conformity is an observed phenomenon that galaxies located in the same region have similar properties such as star formation rate, color, gas fraction, and so on. The conformity was first observed among galaxies within in the same halos (“one-halo conformity”). The one-halo conformity can be readily explained by mutual interactions among galaxies within a halo. Recent observations however further witnessed a puzzling connection among galaxies with no direct interaction. In particular, galaxies located within a sphere of ~5 Mpc radius tend to show similarities, even though the galaxies do not share common halos with each other ("two-halo conformity" or “large-scale conformity”). Using a cosmological hydrodynamic simulation, Illustris, we investigate the physical origin of the two-halo conformity and put forward two scenarios. First, back-splash galaxies are likely responsible for the large-scale conformity. They have evolved into red galaxies due to ram-pressure stripping in a given galaxy cluster and happen to reside now within a ~5 Mpc sphere. Second, galaxies in strong tidal field induced by large-scale structure also seem to give rise to the large-scale conformity. The strong tides suppress star formation in the galaxies. We discuss the importance of the large-scale conformity in the context of galaxy evolution.
Fast electrostatic force calculation on parallel computer clusters

International Nuclear Information System (INIS)

Kia, Amirali; Kim, Daejoong; Darve, Eric

2008-01-01

The fast multipole method (FMM) and smooth particle mesh Ewald (SPME) are well known fast algorithms to evaluate long range electrostatic interactions in molecular dynamics and other fields. FMM is a multi-scale method which reduces the computation cost by approximating the potential due to a group of particles at a large distance using few multipole functions. This algorithm scales like O(N) for N particles. SPME algorithm is an O(NlnN) method which is based on an interpolation of the Fourier space part of the Ewald sum and evaluating the resulting convolutions using fast Fourier transform (FFT). Those algorithms suffer from relatively poor efficiency on large parallel machines especially for mid-size problems around hundreds of thousands of atoms. A variation of the FMM, called PWA, based on plane wave expansions is presented in this paper. A new parallelization strategy for PWA, which takes advantage of the specific form of this expansion, is described. Its parallel efficiency is compared with SPME through detail time measurements on two different computer clusters
Analysis of passive scalar advection in parallel shear flows: Sorting of modes at intermediate time scales

Science.gov (United States)

Camassa, Roberto; McLaughlin, Richard M.; Viotti, Claudio

2010-11-01

The time evolution of a passive scalar advected by parallel shear flows is studied for a class of rapidly varying initial data. Such situations are of practical importance in a wide range of applications from microfluidics to geophysics. In these contexts, it is well-known that the long-time evolution of the tracer concentration is governed by Taylor's asymptotic theory of dispersion. In contrast, we focus here on the evolution of the tracer at intermediate time scales. We show how intermediate regimes can be identified before Taylor's, and in particular, how the Taylor regime can be delayed indefinitely by properly manufactured initial data. A complete characterization of the sorting of these time scales and their associated spatial structures is presented. These analytical predictions are compared with highly resolved numerical simulations. Specifically, this comparison is carried out for the case of periodic variations in the streamwise direction on the short scale with envelope modulations on the long scales, and show how this structure can lead to "anomalously" diffusive transients in the evolution of the scalar onto the ultimate regime governed by Taylor dispersion. Mathematically, the occurrence of these transients can be viewed as a competition in the asymptotic dominance between large Péclet (Pe) numbers and the long/short scale aspect ratios (LVel/LTracer≡k), two independent nondimensional parameters of the problem. We provide analytical predictions of the associated time scales by a modal analysis of the eigenvalue problem arising in the separation of variables of the governing advection-diffusion equation. The anomalous time scale in the asymptotic limit of large k Pe is derived for the short scale periodic structure of the scalar's initial data, for both exactly solvable cases and in general with WKBJ analysis. In particular, the exactly solvable sawtooth flow is especially important in that it provides a short cut to the exact solution to the
Large scale structure from viscous dark matter

CERN Document Server

Blas, Diego; Garny, Mathias; Tetradis, Nikolaos; Wiedemann, Urs Achim

2015-01-01

Cosmological perturbations of sufficiently long wavelength admit a fluid dynamic description. We consider modes with wavevectors below a scale $k_m$ for which the dynamics is only mildly non-linear. The leading effect of modes above that scale can be accounted for by effective non-equilibrium viscosity and pressure terms. For mildly non-linear scales, these mainly arise from momentum transport within the ideal and cold but inhomogeneous fluid, while momentum transport due to more microscopic degrees of freedom is suppressed. As a consequence, concrete expressions with no free parameters, except the matching scale $k_m$, can be derived from matching evolution equations to standard cosmological perturbation theory. Two-loop calculations of the matter power spectrum in the viscous theory lead to excellent agreement with $N$-body simulations up to scales $k=0.2 \\, h/$Mpc. The convergence properties in the ultraviolet are better than for standard perturbation theory and the results are robust with respect to varia...
Adapting to large-scale changes in Advanced Placement Biology, Chemistry, and Physics: the impact of online teacher communities

Science.gov (United States)

Frumin, Kim; Dede, Chris; Fischer, Christian; Foster, Brandon; Lawrenz, Frances; Eisenkraft, Arthur; Fishman, Barry; Jurist Levy, Abigail; McCoy, Ayana

2018-03-01

Over the past decade, the field of teacher professional learning has coalesced around core characteristics of high quality professional development experiences (e.g. Borko, Jacobs, & Koellner, 2010. Contemporary approaches to teacher professional development. In P. L. Peterson, E. Baker, & B. McGaw (Eds.), International encyclopedia of education (Vol. 7, pp. 548-556). Oxford: Elsevier.; Darling-Hammond, Hyler, & Gardner, 2017. Effective teacher professional development. Palo Alto, CA: Learning Policy Institute). Many countries have found these advances of great interest because of a desire to build teacher capacity in science education and across the full curriculum. This paper continues this progress by examining the role and impact of an online professional development community within the top-down, large-scale curriculum and assessment revision of Advanced Placement (AP) Biology, Chemistry, and Physics. This paper is part of a five-year, longitudinal, U.S. National Science Foundation-funded project to study the relative effectiveness of various types of professional development in enabling teachers to adapt to the revised AP course goals and exams. Of the many forms of professional development our research has examined, preliminary analyses indicated that participation in the College Board's online AP Teacher Community (APTC) - where teachers can discuss teaching strategies, share resources, and connect with each other - had positive, direct, and statistically significant association with teacher self-reported shifts in practice and with gains in student AP scores (Fishman et al., 2014). This study explored how usage of the online APTC might be useful to teachers and examined a more robust estimate of these effects. Findings from the experience of AP teachers may be valuable in supporting other large-scale curriculum changes, such as the U.S. Next Generation Science Standards or Common Core Standards, as well as parallel curricular shifts in other countries.
Cosmic Shear With ACS Pure Parallels

Science.gov (United States)

Rhodes, Jason

2002-07-01

Small distortions in the shapes of background galaxies by foreground mass provide a powerful method of directly measuring the amount and distribution of dark matter. Several groups have recently detected this weak lensing by large-scale structure, also called cosmic shear. The high resolution and sensitivity of HST/ACS provide a unique opportunity to measure cosmic shear accurately on small scales. Using 260 parallel orbits in Sloan textiti {F775W} we will measure for the first time: beginlistosetlength sep0cm setlengthemsep0cm setlengthopsep0cm em the cosmic shear variance on scales Omega_m^0.5, with signal-to-noise {s/n} 20, and the mass density Omega_m with s/n=4. They will be done at small angular scales where non-linear effects dominate the power spectrum, providing a test of the gravitational instability paradigm for structure formation. Measurements on these scales are not possible from the ground, because of the systematic effects induced by PSF smearing from seeing. Having many independent lines of sight reduces the uncertainty due to cosmic variance, making parallel observations ideal.
Large-scale perspective as a challenge

NARCIS (Netherlands)

Plomp, M.G.A.

2012-01-01

1. Scale forms a challenge for chain researchers: when exactly is something ‘large-scale’? What are the underlying factors (e.g. number of parties, data, objects in the chain, complexity) that determine this? It appears to be a continuum between small- and large-scale, where positioning on that
Algorithm 896: LSA: Algorithms for Large-Scale Optimization

Czech Academy of Sciences Publication Activity Database

Lukšan, Ladislav; Matonoha, Ctirad; Vlček, Jan

2009-01-01

Roč. 36, č. 3 (2009), 16-1-16-29 ISSN 0098-3500 R&D Pro jects: GA AV ČR IAA1030405; GA ČR GP201/06/P397 Institutional research plan: CEZ:AV0Z10300504 Keywords : algorithms * design * large-scale optimization * large-scale nonsmooth optimization * large-scale nonlinear least squares * large-scale nonlinear minimax * large-scale systems of nonlinear equations * sparse pro blems * partially separable pro blems * limited-memory methods * discrete Newton methods * quasi-Newton methods * primal interior-point methods Subject RIV: BB - Applied Statistics, Operational Research Impact factor: 1.904, year: 2009
Large-scale production of diesel-like biofuels - process design as an inherent part of microorganism development.

Science.gov (United States)

Cuellar, Maria C; Heijnen, Joseph J; van der Wielen, Luuk A M

2013-06-01

Industrial biotechnology is playing an important role in the transition to a bio-based economy. Currently, however, industrial implementation is still modest, despite the advances made in microorganism development. Given that the fuels and commodity chemicals sectors are characterized by tight economic margins, we propose to address overall process design and efficiency at the start of bioprocess development. While current microorganism development is targeted at product formation and product yield, addressing process design at the start of bioprocess development means that microorganism selection can also be extended to other critical targets for process technology and process scale implementation, such as enhancing cell separation or increasing cell robustness at operating conditions that favor the overall process. In this paper we follow this approach for the microbial production of diesel-like biofuels. We review current microbial routes with both oleaginous and engineered microorganisms. For the routes leading to extracellular production, we identify the process conditions for large scale operation. The process conditions identified are finally translated to microorganism development targets. We show that microorganism development should be directed at anaerobic production, increasing robustness at extreme process conditions and tailoring cell surface properties. All the same time, novel process configurations integrating fermentation and product recovery, cell reuse and low-cost technologies for product separation are mandatory. This review provides a state-of-the-art summary of the latest challenges in large-scale production of diesel-like biofuels. Copyright © 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Scale interactions in a mixing layer – the role of the large-scale gradients

KAUST Repository

Fiscaletti, D.

2016-02-15

© 2016 Cambridge University Press. The interaction between the large and the small scales of turbulence is investigated in a mixing layer, at a Reynolds number based on the Taylor microscale of , via direct numerical simulations. The analysis is performed in physical space, and the local vorticity root-mean-square (r.m.s.) is taken as a measure of the small-scale activity. It is found that positive large-scale velocity fluctuations correspond to large vorticity r.m.s. on the low-speed side of the mixing layer, whereas, they correspond to low vorticity r.m.s. on the high-speed side. The relationship between large and small scales thus depends on position if the vorticity r.m.s. is correlated with the large-scale velocity fluctuations. On the contrary, the correlation coefficient is nearly constant throughout the mixing layer and close to unity if the vorticity r.m.s. is correlated with the large-scale velocity gradients. Therefore, the small-scale activity appears closely related to large-scale gradients, while the correlation between the small-scale activity and the large-scale velocity fluctuations is shown to reflect a property of the large scales. Furthermore, the vorticity from unfiltered (small scales) and from low pass filtered (large scales) velocity fields tend to be aligned when examined within vortical tubes. These results provide evidence for the so-called \\'scale invariance\\' (Meneveau & Katz, Annu. Rev. Fluid Mech., vol. 32, 2000, pp. 1-32), and suggest that some of the large-scale characteristics are not lost at the small scales, at least at the Reynolds number achieved in the present simulation.
Parallel processing for artificial intelligence 2

CERN Document Server

Kumar, V; Suttner, CB

1994-01-01

With the increasing availability of parallel machines and the raising of interest in large scale and real world applications, research on parallel processing for Artificial Intelligence (AI) is gaining greater importance in the computer science environment. Many applications have been implemented and delivered but the field is still considered to be in its infancy. This book assembles diverse aspects of research in the area, providing an overview of the current state of technology. It also aims to promote further growth across the discipline. Contributions have been grouped according to their
Efficient Computation of Sparse Matrix Functions for Large-Scale Electronic Structure Calculations: The CheSS Library.

Science.gov (United States)

Mohr, Stephan; Dawson, William; Wagner, Michael; Caliste, Damien; Nakajima, Takahito; Genovese, Luigi

2017-10-10

We present CheSS, the "Chebyshev Sparse Solvers" library, which has been designed to solve typical problems arising in large-scale electronic structure calculations using localized basis sets. The library is based on a flexible and efficient expansion in terms of Chebyshev polynomials and presently features the calculation of the density matrix, the calculation of matrix powers for arbitrary powers, and the extraction of eigenvalues in a selected interval. CheSS is able to exploit the sparsity of the matrices and scales linearly with respect to the number of nonzero entries, making it well-suited for large-scale calculations. The approach is particularly adapted for setups leading to small spectral widths of the involved matrices and outperforms alternative methods in this regime. By coupling CheSS to the DFT code BigDFT, we show that such a favorable setup is indeed possible in practice. In addition, the approach based on Chebyshev polynomials can be massively parallelized, and CheSS exhibits excellent scaling up to thousands of cores even for relatively small matrix sizes.
Large-scale matrix-handling subroutines 'ATLAS'

International Nuclear Information System (INIS)

Tsunematsu, Toshihide; Takeda, Tatsuoki; Fujita, Keiichi; Matsuura, Toshihiko; Tahara, Nobuo

1978-03-01

Subroutine package ''ATLAS'' has been developed for handling large-scale matrices. The package is composed of four kinds of subroutines, i.e., basic arithmetic routines, routines for solving linear simultaneous equations and for solving general eigenvalue problems and utility routines. The subroutines are useful in large scale plasma-fluid simulations. (auth.)
Multiscale registration of remote sensing image using robust SIFT features in Steerable-Domain

Directory of Open Access Journals (Sweden)

Xiangzeng Liu

2011-12-01

Full Text Available This paper proposes a multiscale registration technique using robust Scale Invariant Feature Transform (SIFT features in Steerable-Domain, which can deal with the large variations of scale, rotation and illumination between images. First, a new robust SIFT descriptor is presented, which is invariant under affine transformation. Then, an adaptive similarity measure is developed according to the robust SIFT descriptor and the adaptive normalized cross correlation of feature point’s neighborhood. Finally, the corresponding feature points can be determined by the adaptive similarity measure in Steerable-Domain of the two input images, and the final refined transformation parameters determined by using gradual optimization are adopted to achieve the registration results. Quantitative comparisons of our algorithm with the related methods show a significant improvement in the presence of large scale, rotation changes, and illumination contrast. The effectiveness of the proposed method is demonstrated by the experimental results.
Large Scale Flutter Data for Design of Rotating Blades Using Navier-Stokes Equations

Science.gov (United States)

Guruswamy, Guru P.

2012-01-01

A procedure to compute flutter boundaries of rotating blades is presented; a) Navier-Stokes equations. b) Frequency domain method compatible with industry practice. Procedure is initially validated: a) Unsteady loads with flapping wing experiment. b) Flutter boundary with fixed wing experiment. Large scale flutter computation is demonstrated for rotating blade: a) Single job submission script. b) Flutter boundary in 24 hour wall clock time with 100 cores. c) Linearly scalable with number of cores. Tested with 1000 cores that produced data in 25 hrs for 10 flutter boundaries. Further wall-clock speed-up is possible by performing parallel computations within each case.
SCALES: SEVIRI and GERB CaL/VaL area for large-scale field experiments

Science.gov (United States)

Lopez-Baeza, Ernesto; Belda, Fernando; Bodas, Alejandro; Crommelynck, Dominique; Dewitte, Steven; Domenech, Carlos; Gimeno, Jaume F.; Harries, John E.; Jorge Sanchez, Joan; Pineda, Nicolau; Pino, David; Rius, Antonio; Saleh, Kauzar; Tarruella, Ramon; Velazquez, Almudena

2004-02-01

The main objective of the SCALES Project is to exploit the unique opportunity offered by the recent launch of the first European METEOSAT Second Generation geostationary satellite (MSG-1) to generate and validate new radiation budget and cloud products provided by the GERB (Geostationary Earth Radiation Budget) instrument. SCALES" specific objectives are: (i) definition and characterization of a large reasonably homogeneous area compatible to GERB pixel size (around 50 x 50 km2), (ii) validation of GERB TOA radiances and fluxes derived by means of angular distribution models, (iii) development of algorithms to estimate surface net radiation from GERB TOA measurements, and (iv) development of accurate methodologies to measure radiation flux divergence and analyze its influence on the thermal regime and dynamics of the atmosphere, also using GERB data. SCALES is highly innovative: it focuses on a new and unique space instrument and develops a new specific validation methodology for low resolution sensors that is based on the use of a robust reference meteorological station (Valencia Anchor Station) around which 3D high resolution meteorological fields are obtained from the MM5 Meteorological Model. During the 1st GERB Ground Validation Campaign (18th-24th June, 2003), CERES instruments on Aqua and Terra provided additional radiance measurements to support validation efforts. CERES instruments operated in the PAPS mode (Programmable Azimuth Plane Scanning) focusing the station. Ground measurements were taken by lidar, sun photometer, GPS precipitable water content, radiosounding ascents, Anchor Station operational meteorological measurements at 2m and 15m., 4 radiation components at 2m, and mobile stations to characterize a large area. In addition, measurements during LANDSAT overpasses on June 14th and 30th were also performed. These activities were carried out within the GIST (GERB International Science Team) framework, during GERB Commissioning Period.
Parallel computing by Monte Carlo codes MVP/GMVP

International Nuclear Information System (INIS)

Nagaya, Yasunobu; Nakagawa, Masayuki; Mori, Takamasa

2001-01-01

General-purpose Monte Carlo codes MVP/GMVP are well-vectorized and thus enable us to perform high-speed Monte Carlo calculations. In order to achieve more speedups, we parallelized the codes on the different types of parallel computing platforms or by using a standard parallelization library MPI. The platforms used for benchmark calculations are a distributed-memory vector-parallel computer Fujitsu VPP500, a distributed-memory massively parallel computer Intel paragon and a distributed-memory scalar-parallel computer Hitachi SR2201, IBM SP2. As mentioned generally, linear speedup could be obtained for large-scale problems but parallelization efficiency decreased as the batch size per a processing element(PE) was smaller. It was also found that the statistical uncertainty for assembly powers was less than 0.1% by the PWR full-core calculation with more than 10 million histories and it took about 1.5 hours by massively parallel computing. (author)

Large-scale solar heat

Energy Technology Data Exchange (ETDEWEB)

Tolonen, J.; Konttinen, P.; Lund, P. [Helsinki Univ. of Technology, Otaniemi (Finland). Dept. of Engineering Physics and Mathematics

1998-12-31

In this project a large domestic solar heating system was built and a solar district heating system was modelled and simulated. Objectives were to improve the performance and reduce costs of a large-scale solar heating system. As a result of the project the benefit/cost ratio can be increased by 40 % through dimensioning and optimising the system at the designing stage. (orig.)
Parallel Implementation and Scaling of an Adaptive Mesh Discrete Ordinates Algorithm for Transport

International Nuclear Information System (INIS)

Howell, L H

2004-01-01

Block-structured adaptive mesh refinement (AMR) uses a mesh structure built up out of locally-uniform rectangular grids. In the BoxLib parallel framework used by the Raptor code, each processor operates on one or more of these grids at each refinement level. The decomposition of the mesh into grids and the distribution of these grids among processors may change every few timesteps as a calculation proceeds. Finer grids use smaller timesteps than coarser grids, requiring additional work to keep the system synchronized and ensure conservation between different refinement levels. In a paper for NECDC 2002 I presented preliminary results on implementation of parallel transport sweeps on the AMR mesh, conjugate gradient acceleration, accuracy of the AMR solution, and scalar speedup of the AMR algorithm compared to a uniform fully-refined mesh. This paper continues with a more in-depth examination of the parallel scaling properties of the scheme, both in single-level and multi-level calculations. Both sweeping and setup costs are considered. The algorithm scales with acceptable performance to several hundred processors. Trends suggest, however, that this is the limit for efficient calculations with traditional transport sweeps, and that modifications to the sweep algorithm will be increasingly needed as job sizes in the thousands of processors become common
Multiple Independent File Parallel I/O with HDF5

Energy Technology Data Exchange (ETDEWEB)

Miller, M. C.

2016-07-13

The HDF5 library has supported the I/O requirements of HPC codes at Lawrence Livermore National Labs (LLNL) since the late 90’s. In particular, HDF5 used in the Multiple Independent File (MIF) parallel I/O paradigm has supported LLNL code’s scalable I/O requirements and has recently been gainfully used at scales as large as O(10⁶) parallel tasks.
ROBUST: an interactive FORTRAN-77 package for exploratory data analysis using parametric, ROBUST and nonparametric location and scale estimates, data transformations, normality tests, and outlier assessment

Science.gov (United States)

Rock, N. M. S.

ROBUST calculates 53 statistics, plus significance levels for 6 hypothesis tests, on each of up to 52 variables. These together allow the following properties of the data distribution for each variable to be examined in detail: (1) Location. Three means (arithmetic, geometric, harmonic) are calculated, together with the midrange and 19 high-performance robust L-, M-, and W-estimates of location (combined, adaptive, trimmed estimates, etc.) (2) Scale. The standard deviation is calculated along with the H-spread/2 (≈ semi-interquartile range), the mean and median absolute deviations from both mean and median, and a biweight scale estimator. The 23 location and 6 scale estimators programmed cover all possible degrees of robustness. (3) Normality: Distributions are tested against the null hypothesis that they are normal, using the 3rd (√ h1) and 4th ( b 2) moments, Geary's ratio (mean deviation/standard deviation), Filliben's probability plot correlation coefficient, and a more robust test based on the biweight scale estimator. These statistics collectively are sensitive to most usual departures from normality. (4) Presence of outliers. The maximum and minimum values are assessed individually or jointly using Grubbs' maximum Studentized residuals, Harvey's and Dixon's criteria, and the Studentized range. For a single input variable, outliers can be either winsorized or eliminated and all estimates recalculated iteratively as desired. The following data-transformations also can be applied: linear, log 10, generalized Box Cox power (including log, reciprocal, and square root), exponentiation, and standardization. For more than one variable, all results are tabulated in a single run of ROBUST. Further options are incorporated to assess ratios (of two variables) as well as discrete variables, and be concerned with missing data. Cumulative S-plots (for assessing normality graphically) also can be generated. The mutual consistency or inconsistency of all these measures
Reducing computational costs in large scale 3D EIT by using a sparse Jacobian matrix with block-wise CGLS reconstruction

International Nuclear Information System (INIS)

Yang, C L; Wei, H Y; Soleimani, M; Adler, A

2013-01-01

Electrical impedance tomography (EIT) is a fast and cost-effective technique to provide a tomographic conductivity image of a subject from boundary current–voltage data. This paper proposes a time and memory efficient method for solving a large scale 3D EIT inverse problem using a parallel conjugate gradient (CG) algorithm. The 3D EIT system with a large number of measurement data can produce a large size of Jacobian matrix; this could cause difficulties in computer storage and the inversion process. One of challenges in 3D EIT is to decrease the reconstruction time and memory usage, at the same time retaining the image quality. Firstly, a sparse matrix reduction technique is proposed using thresholding to set very small values of the Jacobian matrix to zero. By adjusting the Jacobian matrix into a sparse format, the element with zeros would be eliminated, which results in a saving of memory requirement. Secondly, a block-wise CG method for parallel reconstruction has been developed. The proposed method has been tested using simulated data as well as experimental test samples. Sparse Jacobian with a block-wise CG enables the large scale EIT problem to be solved efficiently. Image quality measures are presented to quantify the effect of sparse matrix reduction in reconstruction results. (paper)
Reducing computational costs in large scale 3D EIT by using a sparse Jacobian matrix with block-wise CGLS reconstruction.

Science.gov (United States)

Yang, C L; Wei, H Y; Adler, A; Soleimani, M

2013-06-01

Electrical impedance tomography (EIT) is a fast and cost-effective technique to provide a tomographic conductivity image of a subject from boundary current-voltage data. This paper proposes a time and memory efficient method for solving a large scale 3D EIT inverse problem using a parallel conjugate gradient (CG) algorithm. The 3D EIT system with a large number of measurement data can produce a large size of Jacobian matrix; this could cause difficulties in computer storage and the inversion process. One of challenges in 3D EIT is to decrease the reconstruction time and memory usage, at the same time retaining the image quality. Firstly, a sparse matrix reduction technique is proposed using thresholding to set very small values of the Jacobian matrix to zero. By adjusting the Jacobian matrix into a sparse format, the element with zeros would be eliminated, which results in a saving of memory requirement. Secondly, a block-wise CG method for parallel reconstruction has been developed. The proposed method has been tested using simulated data as well as experimental test samples. Sparse Jacobian with a block-wise CG enables the large scale EIT problem to be solved efficiently. Image quality measures are presented to quantify the effect of sparse matrix reduction in reconstruction results.
A Pervasive Parallel Processing Framework for Data Visualization and Analysis at Extreme Scale

Energy Technology Data Exchange (ETDEWEB)

Moreland, Kenneth [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Geveci, Berk [Kitware, Inc., Clifton Park, NY (United States)

2014-11-01

The evolution of the computing world from teraflop to petaflop has been relatively effortless, with several of the existing programming models scaling effectively to the petascale. The migration to exascale, however, poses considerable challenges. All industry trends infer that the exascale machine will be built using processors containing hundreds to thousands of cores per chip. It can be inferred that efficient concurrency on exascale machines requires a massive amount of concurrent threads, each performing many operations on a localized piece of data. Currently, visualization libraries and applications are based off what is known as the visualization pipeline. In the pipeline model, algorithms are encapsulated as filters with inputs and outputs. These filters are connected by setting the output of one component to the input of another. Parallelism in the visualization pipeline is achieved by replicating the pipeline for each processing thread. This works well for today’s distributed memory parallel computers but cannot be sustained when operating on processors with thousands of cores. Our project investigates a new visualization framework designed to exhibit the pervasive parallelism necessary for extreme scale machines. Our framework achieves this by defining algorithms in terms of worklets, which are localized stateless operations. Worklets are atomic operations that execute when invoked unlike filters, which execute when a pipeline request occurs. The worklet design allows execution on a massive amount of lightweight threads with minimal overhead. Only with such fine-grained parallelism can we hope to fill the billions of threads we expect will be necessary for efficient computation on an exascale machine.
Probes of large-scale structure in the Universe

International Nuclear Information System (INIS)

Suto, Yasushi; Gorski, K.; Juszkiewicz, R.; Silk, J.

1988-01-01

Recent progress in observational techniques has made it possible to confront quantitatively various models for the large-scale structure of the Universe with detailed observational data. We develop a general formalism to show that the gravitational instability theory for the origin of large-scale structure is now capable of critically confronting observational results on cosmic microwave background radiation angular anisotropies, large-scale bulk motions and large-scale clumpiness in the galaxy counts. (author)
Totally parallel multilevel algorithms

Science.gov (United States)

Frederickson, Paul O.

1988-01-01

Four totally parallel algorithms for the solution of a sparse linear system have common characteristics which become quite apparent when they are implemented on a highly parallel hypercube such as the CM2. These four algorithms are Parallel Superconvergent Multigrid (PSMG) of Frederickson and McBryan, Robust Multigrid (RMG) of Hackbusch, the FFT based Spectral Algorithm, and Parallel Cyclic Reduction. In fact, all four can be formulated as particular cases of the same totally parallel multilevel algorithm, which are referred to as TPMA. In certain cases the spectral radius of TPMA is zero, and it is recognized to be a direct algorithm. In many other cases the spectral radius, although not zero, is small enough that a single iteration per timestep keeps the local error within the required tolerance.
Micrometer and nanometer-scale parallel patterning of ceramic and organic-inorganic hybrid materials

NARCIS (Netherlands)

ten Elshof, Johan E.; Khan, Sajid; Göbel, Ole

2010-01-01

This review gives an overview of the progress made in recent years in the development of low-cost parallel patterning techniques for ceramic materials, silica, and organic–inorganic silsesquioxane-based hybrids from wet-chemical solutions and suspensions on the micrometer and nanometer-scale. The
Dynamic Analysis and Vibration Attenuation of Cable-Driven Parallel Manipulators for Large Workspace Applications

Directory of Open Access Journals (Sweden)

Jingli Du

2013-01-01

Full Text Available Cable-driven parallel manipulators are one of the best solutions to achieving large workspace since flexible cables can be easily stored on reels. However, due to the negligible flexural stiffness of cables, long cables will unavoidably vibrate during operation for large workspace applications. In this paper a finite element model for cable-driven parallel manipulators is proposed to mimic small amplitude vibration of cables around their desired position. Output feedback of the cable tension variation at the end of the end-effector is utilized to design the vibration attenuation controller which aims at attenuating the vibration of cables by slightly varying the cable length, thus decreasing its effect on the end-effector. When cable vibration is attenuated, motion controller could be designed for implementing precise large motion to track given trajectories. A numerical example is presented to demonstrate the dynamic model and the control algorithm.
Large-scale grid management; Storskala Nettforvaltning

Energy Technology Data Exchange (ETDEWEB)

Langdal, Bjoern Inge; Eggen, Arnt Ove

2003-07-01

The network companies in the Norwegian electricity industry now have to establish a large-scale network management, a concept essentially characterized by (1) broader focus (Broad Band, Multi Utility,...) and (2) bigger units with large networks and more customers. Research done by SINTEF Energy Research shows so far that the approaches within large-scale network management may be structured according to three main challenges: centralization, decentralization and out sourcing. The article is part of a planned series.
Japanese large-scale interferometers

CERN Document Server

Kuroda, K; Miyoki, S; Ishizuka, H; Taylor, C T; Yamamoto, K; Miyakawa, O; Fujimoto, M K; Kawamura, S; Takahashi, R; Yamazaki, T; Arai, K; Tatsumi, D; Ueda, A; Fukushima, M; Sato, S; Shintomi, T; Yamamoto, A; Suzuki, T; Saitô, Y; Haruyama, T; Sato, N; Higashi, Y; Uchiyama, T; Tomaru, T; Tsubono, K; Ando, M; Takamori, A; Numata, K; Ueda, K I; Yoneda, H; Nakagawa, K; Musha, M; Mio, N; Moriwaki, S; Somiya, K; Araya, A; Kanda, N; Telada, S; Sasaki, M; Tagoshi, H; Nakamura, T; Tanaka, T; Ohara, K

2002-01-01

The objective of the TAMA 300 interferometer was to develop advanced technologies for kilometre scale interferometers and to observe gravitational wave events in nearby galaxies. It was designed as a power-recycled Fabry-Perot-Michelson interferometer and was intended as a step towards a final interferometer in Japan. The present successful status of TAMA is presented. TAMA forms a basis for LCGT (large-scale cryogenic gravitational wave telescope), a 3 km scale cryogenic interferometer to be built in the Kamioka mine in Japan, implementing cryogenic mirror techniques. The plan of LCGT is schematically described along with its associated R and D.
Robust Algebraic Multilevel Methods and Algorithms

CERN Document Server

Kraus, Johannes

2009-01-01

This book deals with algorithms for the solution of linear systems of algebraic equations with large-scale sparse matrices, with a focus on problems that are obtained after discretization of partial differential equations using finite element methods. Provides a systematic presentation of the recent advances in robust algebraic multilevel methods. Can be used for advanced courses on the topic.
Improving predictions of large scale soil carbon dynamics: Integration of fine-scale hydrological and biogeochemical processes, scaling, and benchmarking

Science.gov (United States)

Riley, W. J.; Dwivedi, D.; Ghimire, B.; Hoffman, F. M.; Pau, G. S. H.; Randerson, J. T.; Shen, C.; Tang, J.; Zhu, Q.

2015-12-01

Numerical model representations of decadal- to centennial-scale soil-carbon dynamics are a dominant cause of uncertainty in climate change predictions. Recent attempts by some Earth System Model (ESM) teams to integrate previously unrepresented soil processes (e.g., explicit microbial processes, abiotic interactions with mineral surfaces, vertical transport), poor performance of many ESM land models against large-scale and experimental manipulation observations, and complexities associated with spatial heterogeneity highlight the nascent nature of our community's ability to accurately predict future soil carbon dynamics. I will present recent work from our group to develop a modeling framework to integrate pore-, column-, watershed-, and global-scale soil process representations into an ESM (ACME), and apply the International Land Model Benchmarking (ILAMB) package for evaluation. At the column scale and across a wide range of sites, observed depth-resolved carbon stocks and their 14C derived turnover times can be explained by a model with explicit representation of two microbial populations, a simple representation of mineralogy, and vertical transport. Integrating soil and plant dynamics requires a 'process-scaling' approach, since all aspects of the multi-nutrient system cannot be explicitly resolved at ESM scales. I will show that one approach, the Equilibrium Chemistry Approximation, improves predictions of forest nitrogen and phosphorus experimental manipulations and leads to very different global soil carbon predictions. Translating model representations from the site- to ESM-scale requires a spatial scaling approach that either explicitly resolves the relevant processes, or more practically, accounts for fine-resolution dynamics at coarser scales. To that end, I will present recent watershed-scale modeling work that applies reduced order model methods to accurately scale fine-resolution soil carbon dynamics to coarse-resolution simulations. Finally, we
Stormbow: A Cloud-Based Tool for Reads Mapping and Expression Quantification in Large-Scale RNA-Seq Studies.

Science.gov (United States)

Zhao, Shanrong; Prenger, Kurt; Smith, Lance

2013-01-01

RNA-Seq is becoming a promising replacement to microarrays in transcriptome profiling and differential gene expression study. Technical improvements have decreased sequencing costs and, as a result, the size and number of RNA-Seq datasets have increased rapidly. However, the increasing volume of data from large-scale RNA-Seq studies poses a practical challenge for data analysis in a local environment. To meet this challenge, we developed Stormbow, a cloud-based software package, to process large volumes of RNA-Seq data in parallel. The performance of Stormbow has been tested by practically applying it to analyse 178 RNA-Seq samples in the cloud. In our test, it took 6 to 8 hours to process an RNA-Seq sample with 100 million reads, and the average cost was $3.50 per sample. Utilizing Amazon Web Services as the infrastructure for Stormbow allows us to easily scale up to handle large datasets with on-demand computational resources. Stormbow is a scalable, cost effective, and open-source based tool for large-scale RNA-Seq data analysis. Stormbow can be freely downloaded and can be used out of box to process Illumina RNA-Seq datasets.
Large-scale transportation network congestion evolution prediction using deep learning theory.

Directory of Open Access Journals (Sweden)

Xiaolei Ma

Full Text Available Understanding how congestion at one location can cause ripples throughout large-scale transportation network is vital for transportation researchers and practitioners to pinpoint traffic bottlenecks for congestion mitigation. Traditional studies rely on either mathematical equations or simulation techniques to model traffic congestion dynamics. However, most of the approaches have limitations, largely due to unrealistic assumptions and cumbersome parameter calibration process. With the development of Intelligent Transportation Systems (ITS and Internet of Things (IoT, transportation data become more and more ubiquitous. This triggers a series of data-driven research to investigate transportation phenomena. Among them, deep learning theory is considered one of the most promising techniques to tackle tremendous high-dimensional data. This study attempts to extend deep learning theory into large-scale transportation network analysis. A deep Restricted Boltzmann Machine and Recurrent Neural Network architecture is utilized to model and predict traffic congestion evolution based on Global Positioning System (GPS data from taxi. A numerical study in Ningbo, China is conducted to validate the effectiveness and efficiency of the proposed method. Results show that the prediction accuracy can achieve as high as 88% within less than 6 minutes when the model is implemented in a Graphic Processing Unit (GPU-based parallel computing environment. The predicted congestion evolution patterns can be visualized temporally and spatially through a map-based platform to identify the vulnerable links for proactive congestion mitigation.
Compiler Technology for Parallel Scientific Computation

Directory of Open Access Journals (Sweden)

Can Özturan

1994-01-01

Full Text Available There is a need for compiler technology that, given the source program, will generate efficient parallel codes for different architectures with minimal user involvement. Parallel computation is becoming indispensable in solving large-scale problems in science and engineering. Yet, the use of parallel computation is limited by the high costs of developing the needed software. To overcome this difficulty we advocate a comprehensive approach to the development of scalable architecture-independent software for scientific computation based on our experience with equational programming language (EPL. Our approach is based on a program decomposition, parallel code synthesis, and run-time support for parallel scientific computation. The program decomposition is guided by the source program annotations provided by the user. The synthesis of parallel code is based on configurations that describe the overall computation as a set of interacting components. Run-time support is provided by the compiler-generated code that redistributes computation and data during object program execution. The generated parallel code is optimized using techniques of data alignment, operator placement, wavefront determination, and memory optimization. In this article we discuss annotations, configurations, parallel code generation, and run-time support suitable for parallel programs written in the functional parallel programming language EPL and in Fortran.
Survey and research for the enhancement of large-scale technology development 2. How large-scale technology development should be in the future; Ogata gijutsu kaihatsu suishin no tame no chosa kenkyu. 2. Kongo no ogata gijutsu kaihatsu no arikata

Energy Technology Data Exchange (ETDEWEB)

NONE

1981-03-01

A survey is conducted over the subject matter by holding interviews with people, employed with the entrusted businesses participating in the large-scale industrial technology development system, who are engaged in the development of industrial technologies, and with people of experience or academic background involved in the project enhancement effort. Needs of improvement are pointed out that the competition principle based for example on parallel development be introduced; that research-on-research be practiced for effective task institution; midway evaluation be substantiated since prior evaluation is difficult; efforts be made to organize new industries utilizing the fruits of large-scale industrial technology for the creation of markets, not to induce economic conflicts; that transfer of technologies be enhanced from the private sector to public sector. Studies are made about the review of research conducting systems; utilization of the power of private sector research and development efforts; enlightening about industrial proprietorship; and the diffusion of large-scale project systems. In this connection, problems are pointed out, requests are submitted, and remedial measures and suggestions are presented. (NEDO)
Large-scale HTS bulks for magnetic application

Science.gov (United States)

Werfel, Frank N.; Floegel-Delor, Uta; Riedel, Thomas; Goebel, Bernd; Rothfeld, Rolf; Schirrmeister, Peter; Wippich, Dieter

2013-01-01

ATZ Company has constructed about 130 HTS magnet systems using high-Tc bulk magnets. A key feature in scaling-up is the fabrication of YBCO melts textured multi-seeded large bulks with three to eight seeds. Except of levitation, magnetization, trapped field and hysteresis, we review system engineering parameters of HTS magnetic linear and rotational bearings like compactness, cryogenics, power density, efficiency and robust construction. We examine mobile compact YBCO bulk magnet platforms cooled with LN2 and Stirling cryo-cooler for demonstrator use. Compact cryostats for Maglev train operation contain 24 pieces of 3-seed bulks and can levitate 2500-3000 N at 10 mm above a permanent magnet (PM) track. The effective magnetic distance of the thermally insulated bulks is 2 mm only; the stored 2.5 l LN2 allows more than 24 h operation without refilling. 34 HTS Maglev vacuum cryostats are manufactured tested and operate in Germany, China and Brazil. The magnetic levitation load to weight ratio is more than 15, and by group assembling the HTS cryostats under vehicles up to 5 t total loads levitated above a magnetic track is achieved.

pcircle - A Suite of Scalable Parallel File System Tools

Energy Technology Data Exchange (ETDEWEB)

2015-10-01

Most of the software related to file system are written for conventional local file system, they are serialized and can't take advantage of the benefit of a large scale parallel file system. "pcircle" software builds on top of ubiquitous MPI in cluster computing environment and "work-stealing" pattern to provide a scalable, high-performance suite of file system tools. In particular - it implemented parallel data copy and parallel data checksumming, with advanced features such as async progress report, checkpoint and restart, as well as integrity checking.
Robust design of large-displacement compliant mechanisms

DEFF Research Database (Denmark)

Lazarov, Boyan Stefanov; Schevenels, M.; Sigmund, Ole

2011-01-01

The aim of this article is to introduce a new topology optimisation formulation for optimal robust design of Micro Electro Mechanical Systems. Mesh independence in topology optimisation is most often ensured by using filtering techniques, which result in transition grey regions difficult to inter...... in nearly black and white mechanism designs, robust with respect to uncertainties in the production process, i.e. without any hinges or small details which can create manufacturing difficulties....
Large scale model testing

International Nuclear Information System (INIS)

Brumovsky, M.; Filip, R.; Polachova, H.; Stepanek, S.

1989-01-01

Fracture mechanics and fatigue calculations for WWER reactor pressure vessels were checked by large scale model testing performed using large testing machine ZZ 8000 (with a maximum load of 80 MN) at the SKODA WORKS. The results are described from testing the material resistance to fracture (non-ductile). The testing included the base materials and welded joints. The rated specimen thickness was 150 mm with defects of a depth between 15 and 100 mm. The results are also presented of nozzles of 850 mm inner diameter in a scale of 1:3; static, cyclic, and dynamic tests were performed without and with surface defects (15, 30 and 45 mm deep). During cyclic tests the crack growth rate in the elastic-plastic region was also determined. (author). 6 figs., 2 tabs., 5 refs
Hi-Corrector: a fast, scalable and memory-efficient package for normalizing large-scale Hi-C data.

Science.gov (United States)

Li, Wenyuan; Gong, Ke; Li, Qingjiao; Alber, Frank; Zhou, Xianghong Jasmine

2015-03-15

Genome-wide proximity ligation assays, e.g. Hi-C and its variant TCC, have recently become important tools to study spatial genome organization. Removing biases from chromatin contact matrices generated by such techniques is a critical preprocessing step of subsequent analyses. The continuing decline of sequencing costs has led to an ever-improving resolution of the Hi-C data, resulting in very large matrices of chromatin contacts. Such large-size matrices, however, pose a great challenge on the memory usage and speed of its normalization. Therefore, there is an urgent need for fast and memory-efficient methods for normalization of Hi-C data. We developed Hi-Corrector, an easy-to-use, open source implementation of the Hi-C data normalization algorithm. Its salient features are (i) scalability-the software is capable of normalizing Hi-C data of any size in reasonable times; (ii) memory efficiency-the sequential version can run on any single computer with very limited memory, no matter how little; (iii) fast speed-the parallel version can run very fast on multiple computing nodes with limited local memory. The sequential version is implemented in ANSI C and can be easily compiled on any system; the parallel version is implemented in ANSI C with the MPI library (a standardized and portable parallel environment designed for solving large-scale scientific problems). The package is freely available at http://zhoulab.usc.edu/Hi-Corrector/. © The Author 2014. Published by Oxford University Press.
Confirmation of general relativity on large scales from weak lensing and galaxy velocities

Science.gov (United States)

Reyes, Reinabelle; Mandelbaum, Rachel; Seljak, Uros; Baldauf, Tobias; Gunn, James E.; Lombriser, Lucas; Smith, Robert E.

2010-03-01

Although general relativity underlies modern cosmology, its applicability on cosmological length scales has yet to be stringently tested. Such a test has recently been proposed, using a quantity, EG, that combines measures of large-scale gravitational lensing, galaxy clustering and structure growth rate. The combination is insensitive to `galaxy bias' (the difference between the clustering of visible galaxies and invisible dark matter) and is thus robust to the uncertainty in this parameter. Modified theories of gravity generally predict values of EG different from the general relativistic prediction because, in these theories, the `gravitational slip' (the difference between the two potentials that describe perturbations in the gravitational metric) is non-zero, which leads to changes in the growth of structure and the strength of the gravitational lensing effect. Here we report that EG = 0.39+/-0.06 on length scales of tens of megaparsecs, in agreement with the general relativistic prediction of EG~0.4. The measured value excludes a model within the tensor-vector-scalar gravity theory, which modifies both Newtonian and Einstein gravity. However, the relatively large uncertainty still permits models within f() theory, which is an extension of general relativity. A fivefold decrease in uncertainty is needed to rule out these models.
Why small-scale cannabis growers stay small: five mechanisms that prevent small-scale growers from going large scale.

Science.gov (United States)

Hammersvik, Eirik; Sandberg, Sveinung; Pedersen, Willy

2012-11-01

Over the past 15-20 years, domestic cultivation of cannabis has been established in a number of European countries. New techniques have made such cultivation easier; however, the bulk of growers remain small-scale. In this study, we explore the factors that prevent small-scale growers from increasing their production. The study is based on 1 year of ethnographic fieldwork and qualitative interviews conducted with 45 Norwegian cannabis growers, 10 of whom were growing on a large-scale and 35 on a small-scale. The study identifies five mechanisms that prevent small-scale indoor growers from going large-scale. First, large-scale operations involve a number of people, large sums of money, a high work-load and a high risk of detection, and thus demand a higher level of organizational skills than for small growing operations. Second, financial assets are needed to start a large 'grow-site'. Housing rent, electricity, equipment and nutrients are expensive. Third, to be able to sell large quantities of cannabis, growers need access to an illegal distribution network and knowledge of how to act according to black market norms and structures. Fourth, large-scale operations require advanced horticultural skills to maximize yield and quality, which demands greater skills and knowledge than does small-scale cultivation. Fifth, small-scale growers are often embedded in the 'cannabis culture', which emphasizes anti-commercialism, anti-violence and ecological and community values. Hence, starting up large-scale production will imply having to renegotiate or abandon these values. Going from small- to large-scale cannabis production is a demanding task-ideologically, technically, economically and personally. The many obstacles that small-scale growers face and the lack of interest and motivation for going large-scale suggest that the risk of a 'slippery slope' from small-scale to large-scale growing is limited. Possible political implications of the findings are discussed. Copyright
Distributed large-scale dimensional metrology new insights

CERN Document Server

Franceschini, Fiorenzo; Maisano, Domenico

2011-01-01

Focuses on the latest insights into and challenges of distributed large scale dimensional metrology Enables practitioners to study distributed large scale dimensional metrology independently Includes specific examples of the development of new system prototypes
Large-Scale Analysis of Auditory Segregation Behavior Crowdsourced via a Smartphone App.

Science.gov (United States)

Teki, Sundeep; Kumar, Sukhbinder; Griffiths, Timothy D

2016-01-01

The human auditory system is adept at detecting sound sources of interest from a complex mixture of several other simultaneous sounds. The ability to selectively attend to the speech of one speaker whilst ignoring other speakers and background noise is of vital biological significance-the capacity to make sense of complex 'auditory scenes' is significantly impaired in aging populations as well as those with hearing loss. We investigated this problem by designing a synthetic signal, termed the 'stochastic figure-ground' stimulus that captures essential aspects of complex sounds in the natural environment. Previously, we showed that under controlled laboratory conditions, young listeners sampled from the university subject pool (n = 10) performed very well in detecting targets embedded in the stochastic figure-ground signal. Here, we presented a modified version of this cocktail party paradigm as a 'game' featured in a smartphone app (The Great Brain Experiment) and obtained data from a large population with diverse demographical patterns (n = 5148). Despite differences in paradigms and experimental settings, the observed target-detection performance by users of the app was robust and consistent with our previous results from the psychophysical study. Our results highlight the potential use of smartphone apps in capturing robust large-scale auditory behavioral data from normal healthy volunteers, which can also be extended to study auditory deficits in clinical populations with hearing impairments and central auditory disorders.
Large-Scale Analysis of Auditory Segregation Behavior Crowdsourced via a Smartphone App.

Directory of Open Access Journals (Sweden)

Sundeep Teki

Full Text Available The human auditory system is adept at detecting sound sources of interest from a complex mixture of several other simultaneous sounds. The ability to selectively attend to the speech of one speaker whilst ignoring other speakers and background noise is of vital biological significance-the capacity to make sense of complex 'auditory scenes' is significantly impaired in aging populations as well as those with hearing loss. We investigated this problem by designing a synthetic signal, termed the 'stochastic figure-ground' stimulus that captures essential aspects of complex sounds in the natural environment. Previously, we showed that under controlled laboratory conditions, young listeners sampled from the university subject pool (n = 10 performed very well in detecting targets embedded in the stochastic figure-ground signal. Here, we presented a modified version of this cocktail party paradigm as a 'game' featured in a smartphone app (The Great Brain Experiment and obtained data from a large population with diverse demographical patterns (n = 5148. Despite differences in paradigms and experimental settings, the observed target-detection performance by users of the app was robust and consistent with our previous results from the psychophysical study. Our results highlight the potential use of smartphone apps in capturing robust large-scale auditory behavioral data from normal healthy volunteers, which can also be extended to study auditory deficits in clinical populations with hearing impairments and central auditory disorders.
Robust Stability of Scaled-Four-Channel Teleoperation with Internet Time-Varying Delays

Directory of Open Access Journals (Sweden)

Emma Delgado

2016-04-01

Full Text Available We describe the application of a generic stability framework for a teleoperation system under time-varying delay conditions, as addressed in a previous work, to a scaled-four-channel (γ-4C control scheme. Described is how varying delays are dealt with by means of dynamic encapsulation, giving rise to mu-test conditions for robust stability and offering an appealing frequency technique to deal with the stability robustness of the architecture. We discuss ideal transparency problems and we adapt classical solutions so that controllers are proper, without single or double differentiators, and thus avoid the negative effects of noise. The control scheme was fine-tuned and tested for complete stability to zero of the whole state, while seeking a practical solution to the trade-off between stability and transparency in the Internet-based teleoperation. These ideas were tested on an Internet-based application with two Omni devices at remote laboratory locations via simulations and real remote experiments that achieved robust stability, while performing well in terms of position synchronization and force transparency.
Robust Stability of Scaled-Four-Channel Teleoperation with Internet Time-Varying Delays.

Science.gov (United States)

Delgado, Emma; Barreiro, Antonio; Falcón, Pablo; Díaz-Cacho, Miguel

2016-04-26

We describe the application of a generic stability framework for a teleoperation system under time-varying delay conditions, as addressed in a previous work, to a scaled-four-channel (γ-4C) control scheme. Described is how varying delays are dealt with by means of dynamic encapsulation, giving rise to mu-test conditions for robust stability and offering an appealing frequency technique to deal with the stability robustness of the architecture. We discuss ideal transparency problems and we adapt classical solutions so that controllers are proper, without single or double differentiators, and thus avoid the negative effects of noise. The control scheme was fine-tuned and tested for complete stability to zero of the whole state, while seeking a practical solution to the trade-off between stability and transparency in the Internet-based teleoperation. These ideas were tested on an Internet-based application with two Omni devices at remote laboratory locations via simulations and real remote experiments that achieved robust stability, while performing well in terms of position synchronization and force transparency.
High-Speed Interrogation for Large-Scale Fiber Bragg Grating Sensing.

Science.gov (United States)

Hu, Chenyuan; Bai, Wei

2018-02-24

A high-speed interrogation scheme for large-scale fiber Bragg grating (FBG) sensing arrays is presented. This technique employs parallel computing and pipeline control to modulate incident light and demodulate the reflected sensing signal. One Electro-optic modulator (EOM) and one semiconductor optical amplifier (SOA) were used to generate a phase delay to filter reflected spectrum form multiple candidate FBGs with the same optical path difference (OPD). Experimental results showed that the fastest interrogation delay time for the proposed method was only about 27.2 us for a single FBG interrogation, and the system scanning period was only limited by the optical transmission delay in the sensing fiber owing to the multiple simultaneous central wavelength calculations. Furthermore, the proposed FPGA-based technique had a verified FBG wavelength demodulation stability of ±1 pm without average processing.
SCALE INTERACTION IN A MIXING LAYER. THE ROLE OF THE LARGE-SCALE GRADIENTS

KAUST Repository

Fiscaletti, Daniele; Attili, Antonio; Bisetti, Fabrizio; Elsinga, Gerrit E.

2015-01-01

from physical considerations we would expect the scales to interact in a qualitatively similar way within the flow and across different turbulent flows. Therefore, instead of the large-scale fluctuations, the large-scale gradients modulation of the small scales has been additionally investigated.
The Immersive Virtual Reality Lab: Possibilities for Remote Experimental Manipulations of Autonomic Activity on a Large Scale

Directory of Open Access Journals (Sweden)

Joshua Juvrud

2018-05-01

Full Text Available There is a need for large-scale remote data collection in a controlled environment, and the in-home availability of virtual reality (VR and the commercial availability of eye tracking for VR present unique and exciting opportunities for researchers. We propose and provide a proof-of-concept assessment of a robust system for large-scale in-home testing using consumer products that combines psychophysiological measures and VR, here referred to as a Virtual Lab. For the first time, this method is validated by correlating autonomic responses, skin conductance response (SCR, and pupillary dilation, in response to a spider, a beetle, and a ball using commercially available VR. Participants demonstrated greater SCR and pupillary responses to the spider, and the effect was dependent on the proximity of the stimuli to the participant, with a stronger response when the spider was close to the virtual self. We replicated these effects across two experiments and in separate physical room contexts to mimic variability in home environment. Together, these findings demonstrate the utility of pupil dilation as a marker of autonomic arousal and the feasibility to assess this in commercially available VR hardware and support a robust Virtual Lab tool for massive remote testing.
The Immersive Virtual Reality Lab: Possibilities for Remote Experimental Manipulations of Autonomic Activity on a Large Scale.

Science.gov (United States)

Juvrud, Joshua; Gredebäck, Gustaf; Åhs, Fredrik; Lerin, Nils; Nyström, Pär; Kastrati, Granit; Rosén, Jörgen

2018-01-01

There is a need for large-scale remote data collection in a controlled environment, and the in-home availability of virtual reality (VR) and the commercial availability of eye tracking for VR present unique and exciting opportunities for researchers. We propose and provide a proof-of-concept assessment of a robust system for large-scale in-home testing using consumer products that combines psychophysiological measures and VR, here referred to as a Virtual Lab. For the first time, this method is validated by correlating autonomic responses, skin conductance response (SCR), and pupillary dilation, in response to a spider, a beetle, and a ball using commercially available VR. Participants demonstrated greater SCR and pupillary responses to the spider, and the effect was dependent on the proximity of the stimuli to the participant, with a stronger response when the spider was close to the virtual self. We replicated these effects across two experiments and in separate physical room contexts to mimic variability in home environment. Together, these findings demonstrate the utility of pupil dilation as a marker of autonomic arousal and the feasibility to assess this in commercially available VR hardware and support a robust Virtual Lab tool for massive remote testing.
Parallelism, fractal geometry and other aspects of computational mathematics

International Nuclear Information System (INIS)

Churchhouse, R.F.

1991-01-01

In some fields such as meteorology, theoretical physics, quantum chemistry and hydrodynamics there are problems which involve so much computation that computers of the power of a thousand times a Cray 2 could be fully utilised if they were available. Since it is unlikely that uniprocessors of such power will be available, such large scale problems could be solved by using systems of computers running in parallel. This approach, of course, requires to find appropriate algorithms for the solution of such problems which can efficiently make use of a large number of computers working in parallel. 11 refs, 10 figs, 1 tab
Parallel scalability of Hartree-Fock calculations

Science.gov (United States)

Chow, Edmond; Liu, Xing; Smelyanskiy, Mikhail; Hammond, Jeff R.

2015-03-01

Quantum chemistry is increasingly performed using large cluster computers consisting of multiple interconnected nodes. For a fixed molecular problem, the efficiency of a calculation usually decreases as more nodes are used, due to the cost of communication between the nodes. This paper empirically investigates the parallel scalability of Hartree-Fock calculations. The construction of the Fock matrix and the density matrix calculation are analyzed separately. For the former, we use a parallelization of Fock matrix construction based on a static partitioning of work followed by a work stealing phase. For the latter, we use density matrix purification from the linear scaling methods literature, but without using sparsity. When using large numbers of nodes for moderately sized problems, density matrix computations are network-bandwidth bound, making purification methods potentially faster than eigendecomposition methods.
A remote-control datalogger for large-scale resistivity surveys and robust processing of its signals using a software lock-in approach

Science.gov (United States)

Oppermann, Frank; Günther, Thomas

2018-02-01

We present a new versatile datalogger that can be used for a wide range of possible applications in geosciences. It is adjustable in signal strength and sampling frequency, battery saving and can remotely be controlled over a Global System for Mobile Communication (GSM) connection so that it saves running costs, particularly in monitoring experiments. The internet connection allows for checking functionality, controlling schedules and optimizing pre-amplification. We mainly use it for large-scale electrical resistivity tomography (ERT), where it independently registers voltage time series on three channels, while a square-wave current is injected. For the analysis of this time series we present a new approach that is based on the lock-in (LI) method, mainly known from electronic circuits. The method searches the working point (phase) using three different functions based on a mask signal, and determines the amplitude using a direct current (DC) correlation function. We use synthetic data with different types of noise to compare the new method with existing approaches, i.e. selective stacking and a modified fast Fourier transformation (FFT)-based approach that assumes a 1/f noise characteristics. All methods give comparable results, but the LI is better than the well-established stacking method. The FFT approach can be even better but only if the noise strictly follows the assumed characteristics. If overshoots are present in the data, which is typical in the field, FFT performs worse even with good data, which is why we conclude that the new LI approach is the most robust solution. This is also proved by a field data set from a long 2-D ERT profile.
Trends in large-scale testing of reactor structures

International Nuclear Information System (INIS)

Blejwas, T.E.

2003-01-01

Large-scale tests of reactor structures have been conducted at Sandia National Laboratories since the late 1970s. This paper describes a number of different large-scale impact tests, pressurization tests of models of containment structures, and thermal-pressure tests of models of reactor pressure vessels. The advantages of large-scale testing are evident, but cost, in particular limits its use. As computer models have grown in size, such as number of degrees of freedom, the advent of computer graphics has made possible very realistic representation of results - results that may not accurately represent reality. A necessary condition to avoiding this pitfall is the validation of the analytical methods and underlying physical representations. Ironically, the immensely larger computer models sometimes increase the need for large-scale testing, because the modeling is applied to increasing more complex structural systems and/or more complex physical phenomena. Unfortunately, the cost of large-scale tests is a disadvantage that will likely severely limit similar testing in the future. International collaborations may provide the best mechanism for funding future programs with large-scale tests. (author)
Hybrid MPI-OpenMP Parallelism in the ONETEP Linear-Scaling Electronic Structure Code: Application to the Delamination of Cellulose Nanofibrils.

Science.gov (United States)

Wilkinson, Karl A; Hine, Nicholas D M; Skylaris, Chris-Kriton

2014-11-11

We present a hybrid MPI-OpenMP implementation of Linear-Scaling Density Functional Theory within the ONETEP code. We illustrate its performance on a range of high performance computing (HPC) platforms comprising shared-memory nodes with fast interconnect. Our work has focused on applying OpenMP parallelism to the routines which dominate the computational load, attempting where possible to parallelize different loops from those already parallelized within MPI. This includes 3D FFT box operations, sparse matrix algebra operations, calculation of integrals, and Ewald summation. While the underlying numerical methods are unchanged, these developments represent significant changes to the algorithms used within ONETEP to distribute the workload across CPU cores. The new hybrid code exhibits much-improved strong scaling relative to the MPI-only code and permits calculations with a much higher ratio of cores to atoms. These developments result in a significantly shorter time to solution than was possible using MPI alone and facilitate the application of the ONETEP code to systems larger than previously feasible. We illustrate this with benchmark calculations from an amyloid fibril trimer containing 41,907 atoms. We use the code to study the mechanism of delamination of cellulose nanofibrils when undergoing sonification, a process which is controlled by a large number of interactions that collectively determine the structural properties of the fibrils. Many energy evaluations were needed for these simulations, and as these systems comprise up to 21,276 atoms this would not have been feasible without the developments described here.

Large Scale Flood Risk Analysis using a New Hyper-resolution Population Dataset

Science.gov (United States)

Smith, A.; Neal, J. C.; Bates, P. D.; Quinn, N.; Wing, O.

2017-12-01

Here we present the first national scale flood risk analyses, using high resolution Facebook Connectivity Lab population data and data from a hyper resolution flood hazard model. In recent years the field of large scale hydraulic modelling has been transformed by new remotely sensed datasets, improved process representation, highly efficient flow algorithms and increases in computational power. These developments have allowed flood risk analysis to be undertaken in previously unmodeled territories and from continental to global scales. Flood risk analyses are typically conducted via the integration of modelled water depths with an exposure dataset. Over large scales and in data poor areas, these exposure data typically take the form of a gridded population dataset, estimating population density using remotely sensed data and/or locally available census data. The local nature of flooding dictates that for robust flood risk analysis to be undertaken both hazard and exposure data should sufficiently resolve local scale features. Global flood frameworks are enabling flood hazard data to produced at 90m resolution, resulting in a mis-match with available population datasets which are typically more coarsely resolved. Moreover, these exposure data are typically focused on urban areas and struggle to represent rural populations. In this study we integrate a new population dataset with a global flood hazard model. The population dataset was produced by the Connectivity Lab at Facebook, providing gridded population data at 5m resolution, representing a resolution increase over previous countrywide data sets of multiple orders of magnitude. Flood risk analysis undertaken over a number of developing countries are presented, along with a comparison of flood risk analyses undertaken using pre-existing population datasets.
Assessing Programming Costs of Explicit Memory Localization on a Large Scale Shared Memory Multiprocessor

Directory of Open Access Journals (Sweden)

Silvio Picano

1992-01-01

Full Text Available We present detailed experimental work involving a commercially available large scale shared memory multiple instruction stream-multiple data stream (MIMD parallel computer having a software controlled cache coherence mechanism. To make effective use of such an architecture, the programmer is responsible for designing the program's structure to match the underlying multiprocessors capabilities. We describe the techniques used to exploit our multiprocessor (the BBN TC2000 on a network simulation program, showing the resulting performance gains and the associated programming costs. We show that an efficient implementation relies heavily on the user's ability to explicitly manage the memory system.
Remote collaboration system based on large scale simulation

International Nuclear Information System (INIS)

Kishimoto, Yasuaki; Sugahara, Akihiro; Li, J.Q.

2008-01-01

Large scale simulation using super-computer, which generally requires long CPU time and produces large amount of data, has been extensively studied as a third pillar in various advanced science fields in parallel to theory and experiment. Such a simulation is expected to lead new scientific discoveries through elucidation of various complex phenomena, which are hardly identified only by conventional theoretical and experimental approaches. In order to assist such large simulation studies for which many collaborators working at geographically different places participate and contribute, we have developed a unique remote collaboration system, referred to as SIMON (simulation monitoring system), which is based on client-server system control introducing an idea of up-date processing, contrary to that of widely used post-processing. As a key ingredient, we have developed a trigger method, which transmits various requests for the up-date processing from the simulation (client) running on a super-computer to a workstation (server). Namely, the simulation running on a super-computer actively controls the timing of up-date processing. The server that has received the requests from the ongoing simulation such as data transfer, data analyses, and visualizations, etc. starts operations according to the requests during the simulation. The server makes the latest results available to web browsers, so that the collaborators can monitor the results at any place and time in the world. By applying the system to a specific simulation project of laser-matter interaction, we have confirmed that the system works well and plays an important role as a collaboration platform on which many collaborators work with one another
Plasma turbulence driven by transversely large-scale standing shear Alfvén waves

International Nuclear Information System (INIS)

Singh, Nagendra; Rao, Sathyanarayan

2012-01-01

Using two-dimensional particle-in-cell simulations, we study generation of turbulence consisting of transversely small-scale dispersive Alfvén and electrostatic waves when plasma is driven by a large-scale standing shear Alfvén wave (LS-SAW). The standing wave is set up by reflecting a propagating LS-SAW. The ponderomotive force of the standing wave generates transversely large-scale density modifications consisting of density cavities and enhancements. The drifts of the charged particles driven by the ponderomotive force and those directly caused by the fields of the standing LS-SAW generate non-thermal features in the plasma. Parametric instabilities driven by the inherent plasma nonlinearities associated with the LS-SAW in combination with the non-thermal features generate small-scale electromagnetic and electrostatic waves, yielding a broad frequency spectrum ranging from below the source frequency of the LS-SAW to ion cyclotron and lower hybrid frequencies and beyond. The power spectrum of the turbulence has peaks at distinct perpendicular wave numbers (k ⊥ ) lying in the range d e −1 -6d e −1 , d e being the electron inertial length, suggesting non-local parametric decay from small to large k ⊥ . The turbulence spectrum encompassing both electromagnetic and electrostatic fluctuations is also broadband in parallel wave number (k || ). In a standing-wave supported density cavity, the ratio of the perpendicular electric to magnetic field amplitude is R(k ⊥ ) = |E ⊥ (k ⊥ )/|B ⊥ (k ⊥ )| ≪ V A for k ⊥ d e A is the Alfvén velocity. The characteristic features of the broadband plasma turbulence are compared with those available from satellite observations in space plasmas.
A parallel algorithm for 3D dislocation dynamics

International Nuclear Information System (INIS)

Wang Zhiqiang; Ghoniem, Nasr; Swaminarayan, Sriram; LeSar, Richard

2006-01-01

Dislocation dynamics (DD), a discrete dynamic simulation method in which dislocations are the fundamental entities, is a powerful tool for investigation of plasticity, deformation and fracture of materials at the micron length scale. However, severe computational difficulties arising from complex, long-range interactions between these curvilinear line defects limit the application of DD in the study of large-scale plastic deformation. We present here the development of a parallel algorithm for accelerated computer simulations of DD. By representing dislocations as a 3D set of dislocation particles, we show here that the problem of an interacting ensemble of dislocations can be converted to a problem of a particle ensemble, interacting with a long-range force field. A grid using binary space partitioning is constructed to keep track of node connectivity across domains. We demonstrate the computational efficiency of the parallel micro-plasticity code and discuss how O(N) methods map naturally onto the parallel data structure. Finally, we present results from applications of the parallel code to deformation in single crystal fcc metals
Large Scale Computations in Air Pollution Modelling

DEFF Research Database (Denmark)

Zlatev, Z.; Brandt, J.; Builtjes, P. J. H.

Proceedings of the NATO Advanced Research Workshop on Large Scale Computations in Air Pollution Modelling, Sofia, Bulgaria, 6-10 July 1998......Proceedings of the NATO Advanced Research Workshop on Large Scale Computations in Air Pollution Modelling, Sofia, Bulgaria, 6-10 July 1998...
Large-Scale 3D Printing: The Way Forward

Science.gov (United States)

Jassmi, Hamad Al; Najjar, Fady Al; Ismail Mourad, Abdel-Hamid

2018-03-01

Research on small-scale 3D printing has rapidly evolved, where numerous industrial products have been tested and successfully applied. Nonetheless, research on large-scale 3D printing, directed to large-scale applications such as construction and automotive manufacturing, yet demands a great a great deal of efforts. Large-scale 3D printing is considered an interdisciplinary topic and requires establishing a blended knowledge base from numerous research fields including structural engineering, materials science, mechatronics, software engineering, artificial intelligence and architectural engineering. This review article summarizes key topics of relevance to new research trends on large-scale 3D printing, particularly pertaining (1) technological solutions of additive construction (i.e. the 3D printers themselves), (2) materials science challenges, and (3) new design opportunities.
Growth Limits in Large Scale Networks

DEFF Research Database (Denmark)

Knudsen, Thomas Phillip

limitations. The rising complexity of network management with the convergence of communications platforms is shown as problematic for both automatic management feasibility and for manpower resource management. In the fourth step the scope is extended to include the present society with the DDN project as its......The Subject of large scale networks is approached from the perspective of the network planner. An analysis of the long term planning problems is presented with the main focus on the changing requirements for large scale networks and the potential problems in meeting these requirements. The problems...... the fundamental technological resources in network technologies are analysed for scalability. Here several technological limits to continued growth are presented. The third step involves a survey of major problems in managing large scale networks given the growth of user requirements and the technological...
Accelerating sustainability in large-scale facilities

CERN Multimedia

Marina Giampietro

2011-01-01

Scientific research centres and large-scale facilities are intrinsically energy intensive, but how can big science improve its energy management and eventually contribute to the environmental cause with new cleantech? CERN’s commitment to providing tangible answers to these questions was sealed in the first workshop on energy management for large scale scientific infrastructures held in Lund, Sweden, on the 13-14 October. Participants at the energy management for large scale scientific infrastructures workshop. The workshop, co-organised with the European Spallation Source (ESS) and the European Association of National Research Facilities (ERF), tackled a recognised need for addressing energy issues in relation with science and technology policies. It brought together more than 150 representatives of Research Infrastrutures (RIs) and energy experts from Europe and North America. “Without compromising our scientific projects, we can ...
Large scale reflood test

International Nuclear Information System (INIS)

Hirano, Kemmei; Murao, Yoshio

1980-01-01

The large-scale reflood test with a view to ensuring the safety of light water reactors was started in fiscal 1976 based on the special account act for power source development promotion measures by the entrustment from the Science and Technology Agency. Thereafter, to establish the safety of PWRs in loss-of-coolant accidents by joint international efforts, the Japan-West Germany-U.S. research cooperation program was started in April, 1980. Thereupon, the large-scale reflood test is now included in this program. It consists of two tests using a cylindrical core testing apparatus for examining the overall system effect and a plate core testing apparatus for testing individual effects. Each apparatus is composed of the mock-ups of pressure vessel, primary loop, containment vessel and ECCS. The testing method, the test results and the research cooperation program are described. (J.P.N.)
Multi-Agent System Supporting Automated Large-Scale Photometric Computations

Directory of Open Access Journals (Sweden)

Adam Sȩdziwy

2016-02-01

Full Text Available The technologies related to green energy, smart cities and similar areas being dynamically developed in recent years, face frequently problems of a computational nature rather than a technological one. The example is the ability of accurately predicting the weather conditions for PV farms or wind turbines. Another group of issues is related to the complexity of the computations required to obtain an optimal setup of a solution being designed. In this article, we present the case representing the latter group of problems, namely designing large-scale power-saving lighting installations. The term “large-scale” refers to an entire city area, containing tens of thousands of luminaires. Although a simple power reduction for a single street, giving limited savings, is relatively easy, it becomes infeasible for tasks covering thousands of luminaires described by precise coordinates (instead of simplified layouts. To overcome this critical issue, we propose introducing a formal representation of a computing problem and applying a multi-agent system to perform design-related computations in parallel. The important measure introduced in the article indicating optimization progress is entropy. It also allows for terminating optimization when the solution is satisfying. The article contains the results of real-life calculations being made with the help of the presented approach.
MPIGeneNet: Parallel Calculation of Gene Co-Expression Networks on Multicore Clusters.

Science.gov (United States)

Gonzalez-Dominguez, Jorge; Martin, Maria J

2017-10-10

In this work we present MPIGeneNet, a parallel tool that applies Pearson's correlation and Random Matrix Theory to construct gene co-expression networks. It is based on the state-of-the-art sequential tool RMTGeneNet, which provides networks with high robustness and sensitivity at the expenses of relatively long runtimes for large scale input datasets. MPIGeneNet returns the same results as RMTGeneNet but improves the memory management, reduces the I/O cost, and accelerates the two most computationally demanding steps of co-expression network construction by exploiting the compute capabilities of common multicore CPU clusters. Our performance evaluation on two different systems using three typical input datasets shows that MPIGeneNet is significantly faster than RMTGeneNet. As an example, our tool is up to 175.41 times faster on a cluster with eight nodes, each one containing two 12-core Intel Haswell processors. Source code of MPIGeneNet, as well as a reference manual, are available at https://sourceforge.net/projects/mpigenenet/.
A Parallel Computational Model for Multichannel Phase Unwrapping Problem

Science.gov (United States)

Imperatore, Pasquale; Pepe, Antonio; Lanari, Riccardo

2015-05-01

In this paper, a parallel model for the solution of the computationally intensive multichannel phase unwrapping (MCh-PhU) problem is proposed. Firstly, the Extended Minimum Cost Flow (EMCF) algorithm for solving MCh-PhU problem is revised within the rigorous mathematical framework of the discrete calculus ; thus permitting to capture its topological structure in terms of meaningful discrete differential operators. Secondly, emphasis is placed on those methodological and practical aspects, which lead to a parallel reformulation of the EMCF algorithm. Thus, a novel dual-level parallel computational model, in which the parallelism is hierarchically implemented at two different (i.e., process and thread) levels, is presented. The validity of our approach has been demonstrated through a series of experiments that have revealed a significant speedup. Therefore, the attained high-performance prototype is suitable for the solution of large-scale phase unwrapping problems in reasonable time frames, with a significant impact on the systematic exploitation of the existing, and rapidly growing, large archives of SAR data.
Large Scale Cosmological Anomalies and Inhomogeneous Dark Energy

Directory of Open Access Journals (Sweden)

Leandros Perivolaropoulos

2014-01-01

Full Text Available A wide range of large scale observations hint towards possible modifications on the standard cosmological model which is based on a homogeneous and isotropic universe with a small cosmological constant and matter. These observations, also known as “cosmic anomalies” include unexpected Cosmic Microwave Background perturbations on large angular scales, large dipolar peculiar velocity flows of galaxies (“bulk flows”, the measurement of inhomogenous values of the fine structure constant on cosmological scales (“alpha dipole” and other effects. The presence of the observational anomalies could either be a large statistical fluctuation in the context of ΛCDM or it could indicate a non-trivial departure from the cosmological principle on Hubble scales. Such a departure is very much constrained by cosmological observations for matter. For dark energy however there are no significant observational constraints for Hubble scale inhomogeneities. In this brief review I discuss some of the theoretical models that can naturally lead to inhomogeneous dark energy, their observational constraints and their potential to explain the large scale cosmic anomalies.
Preparing laboratory and real-world EEG data for large-scale analysis: A containerized approach

Directory of Open Access Journals (Sweden)

Nima eBigdely-Shamlo

2016-03-01

Full Text Available Large-scale analysis of EEG and other physiological measures promises new insights into brain processes and more accurate and robust brain-computer interface (BCI models.. However, the absence of standard-ized vocabularies for annotating events in a machine understandable manner, the welter of collection-specific data organizations, the diffi-culty in moving data across processing platforms, and the unavailability of agreed-upon standards for preprocessing have prevented large-scale analyses of EEG. Here we describe a containerized approach and freely available tools we have developed to facilitate the process of an-notating, packaging, and preprocessing EEG data collections to enable data sharing, archiving, large-scale machine learning/data mining and (meta-analysis. The EEG Study Schema (ESS comprises three data Levels, each with its own XML-document schema and file/folder convention, plus a standardized (PREP pipeline to move raw (Data Level 1 data to a basic preprocessed state (Data Level 2 suitable for application of a large class of EEG analysis methods. Researchers can ship a study as a single unit and operate on its data using a standardized interface. ESS does not require a central database and provides all the metadata data necessary to execute a wide variety of EEG processing pipelines. The primary focus of ESS is automated in-depth analysis and meta-analysis EEG studies. However, ESS can also encapsulate meta-information for the other modalities such as eye tracking, that are in-creasingly used in both laboratory and real-world neuroimaging. ESS schema and tools are freely available at eegstudy.org, and a central cata-log of over 850 GB of existing data in ESS format is available at study-catalog.org. These tools and resources are part of a larger effort to ena-ble data sharing at sufficient scale for researchers to engage in truly large-scale EEG analysis and data mining (BigEEG.org.
Large-scale patterns in Rayleigh-Benard convection

International Nuclear Information System (INIS)

Hardenberg, J. von; Parodi, A.; Passoni, G.; Provenzale, A.; Spiegel, E.A.

2008-01-01

Rayleigh-Benard convection at large Rayleigh number is characterized by the presence of intense, vertically moving plumes. Both laboratory and numerical experiments reveal that the rising and descending plumes aggregate into separate clusters so as to produce large-scale updrafts and downdrafts. The horizontal scales of the aggregates reported so far have been comparable to the horizontal extent of the containers, but it has not been clear whether that represents a limitation imposed by domain size. In this work, we present numerical simulations of convection at sufficiently large aspect ratio to ascertain whether there is an intrinsic saturation scale for the clustering process when that ratio is large enough. From a series of simulations of Rayleigh-Benard convection with Rayleigh numbers between 10 5 and 10 8 and with aspect ratios up to 12π, we conclude that the clustering process has a finite horizontal saturation scale with at most a weak dependence on Rayleigh number in the range studied
Robust multi-site MR data processing: iterative optimization of bias correction, tissue classification, and registration.

Science.gov (United States)

Young Kim, Eun; Johnson, Hans J

2013-01-01

A robust multi-modal tool, for automated registration, bias correction, and tissue classification, has been implemented for large-scale heterogeneous multi-site longitudinal MR data analysis. This work focused on improving the an iterative optimization framework between bias-correction, registration, and tissue classification inspired from previous work. The primary contributions are robustness improvements from incorporation of following four elements: (1) utilize multi-modal and repeated scans, (2) incorporate high-deformable registration, (3) use extended set of tissue definitions, and (4) use of multi-modal aware intensity-context priors. The benefits of these enhancements were investigated by a series of experiments with both simulated brain data set (BrainWeb) and by applying to highly-heterogeneous data from a 32 site imaging study with quality assessments through the expert visual inspection. The implementation of this tool is tailored for, but not limited to, large-scale data processing with great data variation with a flexible interface. In this paper, we describe enhancements to a joint registration, bias correction, and the tissue classification, that improve the generalizability and robustness for processing multi-modal longitudinal MR scans collected at multi-sites. The tool was evaluated by using both simulated and simulated and human subject MRI images. With these enhancements, the results showed improved robustness for large-scale heterogeneous MRI processing.
FFTLasso: Large-Scale LASSO in the Fourier Domain

KAUST Repository

Bibi, Adel Aamer

2017-11-09

In this paper, we revisit the LASSO sparse representation problem, which has been studied and used in a variety of different areas, ranging from signal processing and information theory to computer vision and machine learning. In the vision community, it found its way into many important applications, including face recognition, tracking, super resolution, image denoising, to name a few. Despite advances in efficient sparse algorithms, solving large-scale LASSO problems remains a challenge. To circumvent this difficulty, people tend to downsample and subsample the problem (e.g. via dimensionality reduction) to maintain a manageable sized LASSO, which usually comes at the cost of losing solution accuracy. This paper proposes a novel circulant reformulation of the LASSO that lifts the problem to a higher dimension, where ADMM can be efficiently applied to its dual form. Because of this lifting, all optimization variables are updated using only basic element-wise operations, the most computationally expensive of which is a 1D FFT. In this way, there is no need for a linear system solver nor matrix-vector multiplication. Since all operations in our FFTLasso method are element-wise, the subproblems are completely independent and can be trivially parallelized (e.g. on a GPU). The attractive computational properties of FFTLasso are verified by extensive experiments on synthetic and real data and on the face recognition task. They demonstrate that FFTLasso scales much more effectively than a state-of-the-art solver.
FFTLasso: Large-Scale LASSO in the Fourier Domain

KAUST Repository

Bibi, Adel Aamer; Itani, Hani; Ghanem, Bernard

2017-01-01

In this paper, we revisit the LASSO sparse representation problem, which has been studied and used in a variety of different areas, ranging from signal processing and information theory to computer vision and machine learning. In the vision community, it found its way into many important applications, including face recognition, tracking, super resolution, image denoising, to name a few. Despite advances in efficient sparse algorithms, solving large-scale LASSO problems remains a challenge. To circumvent this difficulty, people tend to downsample and subsample the problem (e.g. via dimensionality reduction) to maintain a manageable sized LASSO, which usually comes at the cost of losing solution accuracy. This paper proposes a novel circulant reformulation of the LASSO that lifts the problem to a higher dimension, where ADMM can be efficiently applied to its dual form. Because of this lifting, all optimization variables are updated using only basic element-wise operations, the most computationally expensive of which is a 1D FFT. In this way, there is no need for a linear system solver nor matrix-vector multiplication. Since all operations in our FFTLasso method are element-wise, the subproblems are completely independent and can be trivially parallelized (e.g. on a GPU). The attractive computational properties of FFTLasso are verified by extensive experiments on synthetic and real data and on the face recognition task. They demonstrate that FFTLasso scales much more effectively than a state-of-the-art solver.
Robust and Effective Component-based Banknote Recognition by SURF Features.

Science.gov (United States)

Hasanuzzaman, Faiz M; Yang, Xiaodong; Tian, YingLi

2011-01-01

Camera-based computer vision technology is able to assist visually impaired people to automatically recognize banknotes. A good banknote recognition algorithm for blind or visually impaired people should have the following features: 1) 100% accuracy, and 2) robustness to various conditions in different environments and occlusions. Most existing algorithms of banknote recognition are limited to work for restricted conditions. In this paper we propose a component-based framework for banknote recognition by using Speeded Up Robust Features (SURF). The component-based framework is effective in collecting more class-specific information and robust in dealing with partial occlusion and viewpoint changes. Furthermore, the evaluation of SURF demonstrates its effectiveness in handling background noise, image rotation, scale, and illumination changes. To authenticate the robustness and generalizability of the proposed approach, we have collected a large dataset of banknotes from a variety of conditions including occlusion, cluttered background, rotation, and changes of illumination, scaling, and viewpoints. The proposed algorithm achieves 100% recognition rate on our challenging dataset.

The PREP Pipeline: Standardized preprocessing for large-scale EEG analysis

Directory of Open Access Journals (Sweden)

Nima eBigdelys Shamlo

2015-06-01

Full Text Available The technology to collect brain imaging and physiological measures has become portable and ubiquitous, opening the possibility of large-scale analysis of real-world human imaging. By its nature, such data is large and complex, making automated processing essential. This paper shows how lack of attention to the very early stages of an EEG preprocessing pipeline can reduce the signal-to-noise ratio and introduce unwanted artifacts into the data, particularly for computations done in single precision. We demonstrate that ordinary average referencing improves the signal-to-noise ratio, but that noisy channels can contaminate the results. We also show that identification of noisy channels depends on the reference and examine the complex interaction of filtering, noisy channel identification, and referencing. We introduce a multi-stage robust referencing scheme to deal with the noisy channel-reference interaction. We propose a standardized early-stage EEG processing pipeline (PREP and discuss the application of the pipeline to more than 600 EEG datasets. The pipeline includes an automatically generated report for each dataset processed. Users can download the PREP pipeline as a freely available MATLAB library from http://eegstudy.org/prepcode/.
The PREP pipeline: standardized preprocessing for large-scale EEG analysis.

Science.gov (United States)

Bigdely-Shamlo, Nima; Mullen, Tim; Kothe, Christian; Su, Kyung-Min; Robbins, Kay A

2015-01-01

The technology to collect brain imaging and physiological measures has become portable and ubiquitous, opening the possibility of large-scale analysis of real-world human imaging. By its nature, such data is large and complex, making automated processing essential. This paper shows how lack of attention to the very early stages of an EEG preprocessing pipeline can reduce the signal-to-noise ratio and introduce unwanted artifacts into the data, particularly for computations done in single precision. We demonstrate that ordinary average referencing improves the signal-to-noise ratio, but that noisy channels can contaminate the results. We also show that identification of noisy channels depends on the reference and examine the complex interaction of filtering, noisy channel identification, and referencing. We introduce a multi-stage robust referencing scheme to deal with the noisy channel-reference interaction. We propose a standardized early-stage EEG processing pipeline (PREP) and discuss the application of the pipeline to more than 600 EEG datasets. The pipeline includes an automatically generated report for each dataset processed. Users can download the PREP pipeline as a freely available MATLAB library from http://eegstudy.org/prepcode.
Benefits of Parallel I/O in Ab Initio Nuclear Physics Calculations

International Nuclear Information System (INIS)

Laghave, Nikhil; Sosonkina, Masha; Maris, Pieter; Vary, James P.

2009-01-01

Many modern scientific applications rely on highly parallel calculations, which scale to 10's of thousands processors. However, most applications do not concentrate on parallelizing input/output operations. In particular, sequential I/O has been identified as a bottleneck for the highly scalable MFDn (Many Fermion Dynamics for nuclear structure) code performing ab initio nuclear structure calculations. In this paper, we develop interfaces and parallel I/O procedures to use a well-known parallel I/O library in MFDn. As a result, we gain efficient input/output of large datasets along with their portability and ease of use in the downstream processing.
Development of large scale fusion plasma simulation and storage grid on JAERI Origin3800 system

International Nuclear Information System (INIS)

Idomura, Yasuhiro; Wang, Xin

2003-01-01

Under the Numerical EXperiment of Tokamak (NEXT) research project, various fluid, particle, and hybrid codes have been developed. These codes require a computational environment which consists of high performance processors, high speed storage system, and high speed parallelized visualization system. In this paper, the performance of the JAERI Origin3800 system is examined from a point of view of these requests. In the performance tests, it is shown that the representative particle and fluid codes operate with 15 - 40% of processing efficiency up to 512 processors. A storage area network (SAN) provides high speed parallel data transfer. A parallel visualization system enables order to magnitude faster visualization of a large scale simulation data compared with the previous graphic workstations. Accordingly, an extremely advanced simulation environment is realized on the JAERI Origin3800 system. Recently, development of a storage grid is underway in order to improve a computational environment of remote users. The storage grid is constructed by a combination of SAN and a wavelength division multiplexer (WDM). The preliminary tests show that compared with the existing data transfer methods, it enables dramatically high speed data transfer ∼100 Gbps over a wide area network. (author)
Manufacturing test of large scale hollow capsule and long length cladding in the large scale oxide dispersion strengthened (ODS) martensitic steel

International Nuclear Information System (INIS)

Narita, Takeshi; Ukai, Shigeharu; Kaito, Takeji; Ohtsuka, Satoshi; Fujiwara, Masayuki

2004-04-01

Mass production capability of oxide dispersion strengthened (ODS) martensitic steel cladding (9Cr) has being evaluated in the Phase II of the Feasibility Studies on Commercialized Fast Reactor Cycle System. The cost for manufacturing mother tube (raw materials powder production, mechanical alloying (MA) by ball mill, canning, hot extrusion, and machining) is a dominant factor in the total cost for manufacturing ODS ferritic steel cladding. In this study, the large-sale 9Cr-ODS martensitic steel mother tube which is made with a large-scale hollow capsule, and long length claddings were manufactured, and the applicability of these processes was evaluated. Following results were obtained in this study. (1) Manufacturing the large scale mother tube in the dimension of 32 mm OD, 21 mm ID, and 2 m length has been successfully carried out using large scale hollow capsule. This mother tube has a high degree of accuracy in size. (2) The chemical composition and the micro structure of the manufactured mother tube are similar to the existing mother tube manufactured by a small scale can. And the remarkable difference between the bottom and top sides in the manufactured mother tube has not been observed. (3) The long length cladding has been successfully manufactured from the large scale mother tube which was made using a large scale hollow capsule. (4) For reducing the manufacturing cost of the ODS steel claddings, manufacturing process of the mother tubes using a large scale hollow capsules is promising. (author)
Large amplitude parallel propagating electromagnetic oscillitons

International Nuclear Information System (INIS)

Cattaert, Tom; Verheest, Frank

2005-01-01

Earlier systematic nonlinear treatments of parallel propagating electromagnetic waves have been given within a fluid dynamic approach, in a frame where the nonlinear structures are stationary and various constraining first integrals can be obtained. This has lead to the concept of oscillitons that has found application in various space plasmas. The present paper differs in three main aspects from the previous studies: first, the invariants are derived in the plasma frame, as customary in the Sagdeev method, thus retaining in Maxwell's equations all possible effects. Second, a single differential equation is obtained for the parallel fluid velocity, in a form reminiscent of the Sagdeev integrals, hence allowing a fully nonlinear discussion of the oscilliton properties, at such amplitudes as the underlying Mach number restrictions allow. Third, the transition to weakly nonlinear whistler oscillitons is done in an analytical rather than a numerical fashion
Quantum Monte Carlo for large chemical systems: implementing efficient strategies for peta scale platforms and beyond

International Nuclear Information System (INIS)

Scemama, Anthony; Caffarel, Michel; Oseret, Emmanuel; Jalby, William

2013-01-01

Various strategies to implement efficiently quantum Monte Carlo (QMC) simulations for large chemical systems are presented. These include: (i) the introduction of an efficient algorithm to calculate the computationally expensive Slater matrices. This novel scheme is based on the use of the highly localized character of atomic Gaussian basis functions (not the molecular orbitals as usually done), (ii) the possibility of keeping the memory footprint minimal, (iii) the important enhancement of single-core performance when efficient optimization tools are used, and (iv) the definition of a universal, dynamic, fault-tolerant, and load-balanced framework adapted to all kinds of computational platforms (massively parallel machines, clusters, or distributed grids). These strategies have been implemented in the QMC-Chem code developed at Toulouse and illustrated with numerical applications on small peptides of increasing sizes (158, 434, 1056, and 1731 electrons). Using 10-80 k computing cores of the Curie machine (GENCI-TGCC-CEA, France), QMC-Chem has been shown to be capable of running at the peta scale level, thus demonstrating that for this machine a large part of the peak performance can be achieved. Implementation of large-scale QMC simulations for future exa scale platforms with a comparable level of efficiency is expected to be feasible. (authors)
Amplification of large-scale magnetic field in nonhelical magnetohydrodynamics

KAUST Repository

Kumar, Rohit

2017-08-11

It is typically assumed that the kinetic and magnetic helicities play a crucial role in the growth of large-scale dynamo. In this paper, we demonstrate that helicity is not essential for the amplification of large-scale magnetic field. For this purpose, we perform nonhelical magnetohydrodynamic (MHD) simulation, and show that the large-scale magnetic field can grow in nonhelical MHD when random external forcing is employed at scale 1/10 the box size. The energy fluxes and shell-to-shell transfer rates computed using the numerical data show that the large-scale magnetic energy grows due to the energy transfers from the velocity field at the forcing scales.
Robust visual tracking via multiscale deep sparse networks

Science.gov (United States)

Wang, Xin; Hou, Zhiqiang; Yu, Wangsheng; Xue, Yang; Jin, Zefenfen; Dai, Bo

2017-04-01

In visual tracking, deep learning with offline pretraining can extract more intrinsic and robust features. It has significant success solving the tracking drift in a complicated environment. However, offline pretraining requires numerous auxiliary training datasets and is considerably time-consuming for tracking tasks. To solve these problems, a multiscale sparse networks-based tracker (MSNT) under the particle filter framework is proposed. Based on the stacked sparse autoencoders and rectifier linear unit, the tracker has a flexible and adjustable architecture without the offline pretraining process and exploits the robust and powerful features effectively only through online training of limited labeled data. Meanwhile, the tracker builds four deep sparse networks of different scales, according to the target's profile type. During tracking, the tracker selects the matched tracking network adaptively in accordance with the initial target's profile type. It preserves the inherent structural information more efficiently than the single-scale networks. Additionally, a corresponding update strategy is proposed to improve the robustness of the tracker. Extensive experimental results on a large scale benchmark dataset show that the proposed method performs favorably against state-of-the-art methods in challenging environments.
Developing eThread Pipeline Using SAGA-Pilot Abstraction for Large-Scale Structural Bioinformatics

Directory of Open Access Journals (Sweden)

Anjani Ragothaman

2014-01-01

Full Text Available While most of computational annotation approaches are sequence-based, threading methods are becoming increasingly attractive because of predicted structural information that could uncover the underlying function. However, threading tools are generally compute-intensive and the number of protein sequences from even small genomes such as prokaryotes is large typically containing many thousands, prohibiting their application as a genome-wide structural systems biology tool. To leverage its utility, we have developed a pipeline for eThread—a meta-threading protein structure modeling tool, that can use computational resources efficiently and effectively. We employ a pilot-based approach that supports seamless data and task-level parallelism and manages large variation in workload and computational requirements. Our scalable pipeline is deployed on Amazon EC2 and can efficiently select resources based upon task requirements. We present runtime analysis to characterize computational complexity of eThread and EC2 infrastructure. Based on results, we suggest a pathway to an optimized solution with respect to metrics such as time-to-solution or cost-to-solution. Our eThread pipeline can scale to support a large number of sequences and is expected to be a viable solution for genome-scale structural bioinformatics and structure-based annotation, particularly, amenable for small genomes such as prokaryotes. The developed pipeline is easily extensible to other types of distributed cyberinfrastructure.
Hydrometeorological variability on a large french catchment and its relation to large-scale circulation across temporal scales

Science.gov (United States)

Massei, Nicolas; Dieppois, Bastien; Fritier, Nicolas; Laignel, Benoit; Debret, Maxime; Lavers, David; Hannah, David

2015-04-01

In the present context of global changes, considerable efforts have been deployed by the hydrological scientific community to improve our understanding of the impacts of climate fluctuations on water resources. Both observational and modeling studies have been extensively employed to characterize hydrological changes and trends, assess the impact of climate variability or provide future scenarios of water resources. In the aim of a better understanding of hydrological changes, it is of crucial importance to determine how and to what extent trends and long-term oscillations detectable in hydrological variables are linked to global climate oscillations. In this work, we develop an approach associating large-scale/local-scale correlation, enmpirical statistical downscaling and wavelet multiresolution decomposition of monthly precipitation and streamflow over the Seine river watershed, and the North Atlantic sea level pressure (SLP) in order to gain additional insights on the atmospheric patterns associated with the regional hydrology. We hypothesized that: i) atmospheric patterns may change according to the different temporal wavelengths defining the variability of the signals; and ii) definition of those hydrological/circulation relationships for each temporal wavelength may improve the determination of large-scale predictors of local variations. The results showed that the large-scale/local-scale links were not necessarily constant according to time-scale (i.e. for the different frequencies characterizing the signals), resulting in changing spatial patterns across scales. This was then taken into account by developing an empirical statistical downscaling (ESD) modeling approach which integrated discrete wavelet multiresolution analysis for reconstructing local hydrometeorological processes (predictand : precipitation and streamflow on the Seine river catchment) based on a large-scale predictor (SLP over the Euro-Atlantic sector) on a monthly time-step. This approach
Superconducting materials for large scale applications

International Nuclear Information System (INIS)

Dew-Hughes, D.

1975-01-01

Applications of superconductors capable of carrying large current densities in large-scale electrical devices are examined. Discussions are included on critical current density, superconducting materials available, and future prospects for improved superconducting materials. (JRD)
Large-scale parallel configuration interaction. I. Nonrelativisticand scalar-relativistic general active space implementationwith application to (Rb-Ba)+

DEFF Research Database (Denmark)

Knecht, Stefan; Jensen, Hans Jørgen Aagaard; Fleig, Timo

2008-01-01

We present a parallel implementation of a string-driven general active space configuration interaction program for nonrelativistic and scalar-relativistic electronic-structure calculations. The code has been modularly incorporated in the DIRAC quantum chemistry program package. The implementation...
Large-scale influences in near-wall turbulence.

Science.gov (United States)

Hutchins, Nicholas; Marusic, Ivan

2007-03-15

Hot-wire data acquired in a high Reynolds number facility are used to illustrate the need for adequate scale separation when considering the coherent structure in wall-bounded turbulence. It is found that a large-scale motion in the log region becomes increasingly comparable in energy to the near-wall cycle as the Reynolds number increases. Through decomposition of fluctuating velocity signals, it is shown that this large-scale motion has a distinct modulating influence on the small-scale energy (akin to amplitude modulation). Reassessment of DNS data, in light of these results, shows similar trends, with the rate and intensity of production due to the near-wall cycle subject to a modulating influence from the largest-scale motions.
A Set of Annotation Interfaces for Alignment of Parallel Corpora

Directory of Open Access Journals (Sweden)

Singh Anil Kumar

2014-09-01

Full Text Available Annotation interfaces for parallel corpora which fit in well with other tools can be very useful. We describe a set of annotation interfaces which fulfill this criterion. This set includes a sentence alignment interface, two different word or word group alignment interfaces and an initial version of a parallel syntactic annotation alignment interface. These tools can be used for manual alignment, or they can be used to correct automatic alignments. Manual alignment can be performed in combination with certain kinds of linguistic annotation. Most of these interfaces use a representation called the Shakti Standard Format that has been found to be very robust and has been used for large and successful projects. It ties together the different interfaces, so that the data created by them is portable across all tools which support this representation. The existence of a query language for data stored in this representation makes it possible to build tools that allow easy search and modification of annotated parallel data.
PKI security in large-scale healthcare networks.

Science.gov (United States)

Mantas, Georgios; Lymberopoulos, Dimitrios; Komninos, Nikos

2012-06-01

During the past few years a lot of PKI (Public Key Infrastructures) infrastructures have been proposed for healthcare networks in order to ensure secure communication services and exchange of data among healthcare professionals. However, there is a plethora of challenges in these healthcare PKI infrastructures. Especially, there are a lot of challenges for PKI infrastructures deployed over large-scale healthcare networks. In this paper, we propose a PKI infrastructure to ensure security in a large-scale Internet-based healthcare network connecting a wide spectrum of healthcare units geographically distributed within a wide region. Furthermore, the proposed PKI infrastructure facilitates the trust issues that arise in a large-scale healthcare network including multi-domain PKI infrastructures.
Large Scale Software Building with CMake in ATLAS

Science.gov (United States)

Elmsheuser, J.; Krasznahorkay, A.; Obreshkov, E.; Undrus, A.; ATLAS Collaboration

2017-10-01

The offline software of the ATLAS experiment at the Large Hadron Collider (LHC) serves as the platform for detector data reconstruction, simulation and analysis. It is also used in the detector’s trigger system to select LHC collision events during data taking. The ATLAS offline software consists of several million lines of C++ and Python code organized in a modular design of more than 2000 specialized packages. Because of different workflows, many stable numbered releases are in parallel production use. To accommodate specific workflow requests, software patches with modified libraries are distributed on top of existing software releases on a daily basis. The different ATLAS software applications also require a flexible build system that strongly supports unit and integration tests. Within the last year this build system was migrated to CMake. A CMake configuration has been developed that allows one to easily set up and build the above mentioned software packages. This also makes it possible to develop and test new and modified packages on top of existing releases. The system also allows one to detect and execute partial rebuilds of the release based on single package changes. The build system makes use of CPack for building RPM packages out of the software releases, and CTest for running unit and integration tests. We report on the migration and integration of the ATLAS software to CMake and show working examples of this large scale project in production.
Cosmic Shear With ACS Pure Parallels. Targeted Portion.

Science.gov (United States)

Rhodes, Jason

2002-07-01

Small distortions in the shapes of background galaxies by foreground mass provide a powerful method of directly measuring the amount and distribution of dark matter. Several groups have recently detected this weak lensing by large-scale structure, also called cosmic shear. The high resolution and sensitivity of HST/ACS provide a unique opportunity to measure cosmic shear accurately on small scales. Using 260 parallel orbits in Sloan i {F775W} we will measure for the first time: the cosmic shear variance on scales Omega_m^0.5, with signal-to-noise {s/n} 20, and the mass density Omega_m with s/n=4. They will be done at small angular scales where non-linear effects dominate the power spectrum, providing a test of the gravitational instability paradigm for structure formation. Measurements on these scales are not possible from the ground, because of the systematic effects induced by PSF smearing from seeing. Having many independent lines of sight reduces the uncertainty due to cosmic variance, making parallel observations ideal.
Robustness Analysis of Real Network Topologies Under Multiple Failure Scenarios

DEFF Research Database (Denmark)

Manzano, M.; Marzo, J. L.; Calle, E.

2012-01-01

on topological characteristics. Recently approaches also consider the services supported by such networks. In this paper we carry out a robustness analysis of five real backbone telecommunication networks under defined multiple failure scenarios, taking into account the consequences of the loss of established......Nowadays the ubiquity of telecommunication networks, which underpin and fulfill key aspects of modern day living, is taken for granted. Significant large-scale failures have occurred in the last years affecting telecommunication networks. Traditionally, network robustness analysis has been focused...... connections. Results show which networks are more robust in response to a specific type of failure....
Emerging large-scale solar heating applications

International Nuclear Information System (INIS)

Wong, W.P.; McClung, J.L.

2009-01-01

Currently the market for solar heating applications in Canada is dominated by outdoor swimming pool heating, make-up air pre-heating and domestic water heating in homes, commercial and institutional buildings. All of these involve relatively small systems, except for a few air pre-heating systems on very large buildings. Together these applications make up well over 90% of the solar thermal collectors installed in Canada during 2007. These three applications, along with the recent re-emergence of large-scale concentrated solar thermal for generating electricity, also dominate the world markets. This paper examines some emerging markets for large scale solar heating applications, with a focus on the Canadian climate and market. (author)

Emerging large-scale solar heating applications

Energy Technology Data Exchange (ETDEWEB)

Wong, W.P.; McClung, J.L. [Science Applications International Corporation (SAIC Canada), Ottawa, Ontario (Canada)

2009-07-01

Currently the market for solar heating applications in Canada is dominated by outdoor swimming pool heating, make-up air pre-heating and domestic water heating in homes, commercial and institutional buildings. All of these involve relatively small systems, except for a few air pre-heating systems on very large buildings. Together these applications make up well over 90% of the solar thermal collectors installed in Canada during 2007. These three applications, along with the recent re-emergence of large-scale concentrated solar thermal for generating electricity, also dominate the world markets. This paper examines some emerging markets for large scale solar heating applications, with a focus on the Canadian climate and market. (author)
Evaluation of Penalized and Nonpenalized Methods for Disease Prediction with Large-Scale Genetic Data

Directory of Open Access Journals (Sweden)

Sungho Won

2015-01-01

Full Text Available Owing to recent improvement of genotyping technology, large-scale genetic data can be utilized to identify disease susceptibility loci and this successful finding has substantially improved our understanding of complex diseases. However, in spite of these successes, most of the genetic effects for many complex diseases were found to be very small, which have been a big hurdle to build disease prediction model. Recently, many statistical methods based on penalized regressions have been proposed to tackle the so-called “large P and small N” problem. Penalized regressions including least absolute selection and shrinkage operator (LASSO and ridge regression limit the space of parameters, and this constraint enables the estimation of effects for very large number of SNPs. Various extensions have been suggested, and, in this report, we compare their accuracy by applying them to several complex diseases. Our results show that penalized regressions are usually robust and provide better accuracy than the existing methods for at least diseases under consideration.
Parallelization of 2-D lattice Boltzmann codes

International Nuclear Information System (INIS)

Suzuki, Soichiro; Kaburaki, Hideo; Yokokawa, Mitsuo.

1996-03-01

Lattice Boltzmann (LB) codes to simulate two dimensional fluid flow are developed on vector parallel computer Fujitsu VPP500 and scalar parallel computer Intel Paragon XP/S. While a 2-D domain decomposition method is used for the scalar parallel LB code, a 1-D domain decomposition method is used for the vector parallel LB code to be vectorized along with the axis perpendicular to the direction of the decomposition. High parallel efficiency of 95.1% by the vector parallel calculation on 16 processors with 1152x1152 grid and 88.6% by the scalar parallel calculation on 100 processors with 800x800 grid are obtained. The performance models are developed to analyze the performance of the LB codes. It is shown by our performance models that the execution speed of the vector parallel code is about one hundred times faster than that of the scalar parallel code with the same number of processors up to 100 processors. We also analyze the scalability in keeping the available memory size of one processor element at maximum. Our performance model predicts that the execution time of the vector parallel code increases about 3% on 500 processors. Although the 1-D domain decomposition method has in general a drawback in the interprocessor communication, the vector parallel LB code is still suitable for the large scale and/or high resolution simulations. (author)
Parallelization of 2-D lattice Boltzmann codes

Energy Technology Data Exchange (ETDEWEB)

Suzuki, Soichiro; Kaburaki, Hideo; Yokokawa, Mitsuo

1996-03-01

Lattice Boltzmann (LB) codes to simulate two dimensional fluid flow are developed on vector parallel computer Fujitsu VPP500 and scalar parallel computer Intel Paragon XP/S. While a 2-D domain decomposition method is used for the scalar parallel LB code, a 1-D domain decomposition method is used for the vector parallel LB code to be vectorized along with the axis perpendicular to the direction of the decomposition. High parallel efficiency of 95.1% by the vector parallel calculation on 16 processors with 1152x1152 grid and 88.6% by the scalar parallel calculation on 100 processors with 800x800 grid are obtained. The performance models are developed to analyze the performance of the LB codes. It is shown by our performance models that the execution speed of the vector parallel code is about one hundred times faster than that of the scalar parallel code with the same number of processors up to 100 processors. We also analyze the scalability in keeping the available memory size of one processor element at maximum. Our performance model predicts that the execution time of the vector parallel code increases about 3% on 500 processors. Although the 1-D domain decomposition method has in general a drawback in the interprocessor communication, the vector parallel LB code is still suitable for the large scale and/or high resolution simulations. (author).
Sensitivity Analysis of the Proximal-Based Parallel Decomposition Methods

Directory of Open Access Journals (Sweden)

Feng Ma

2014-01-01

Full Text Available The proximal-based parallel decomposition methods were recently proposed to solve structured convex optimization problems. These algorithms are eligible for parallel computation and can be used efficiently for solving large-scale separable problems. In this paper, compared with the previous theoretical results, we show that the range of the involved parameters can be enlarged while the convergence can be still established. Preliminary numerical tests on stable principal component pursuit problem testify to the advantages of the enlargement.
Hybrid parallel execution model for logic-based specification languages

CERN Document Server

Tsai, Jeffrey J P

2001-01-01

Parallel processing is a very important technique for improving the performance of various software development and maintenance activities. The purpose of this book is to introduce important techniques for parallel executation of high-level specifications of software systems. These techniques are very useful for the construction, analysis, and transformation of reliable large-scale and complex software systems. Contents: Current Approaches; Overview of the New Approach; FRORL Requirements Specification Language and Its Decomposition; Rewriting and Data Dependency, Control Flow Analysis of a Lo
Three-point phase correlations: A new measure of non-linear large-scale structure

CERN Document Server

Wolstenhulme, Richard; Obreschkow, Danail

2015-01-01

We derive an analytical expression for a novel large-scale structure observable: the line correlation function. The line correlation function, which is constructed from the three-point correlation function of the phase of the density field, is a robust statistical measure allowing the extraction of information in the non-linear and non-Gaussian regime. We show that, in perturbation theory, the line correlation is sensitive to the coupling kernel F_2, which governs the non-linear gravitational evolution of the density field. We compare our analytical expression with results from numerical simulations and find a very good agreement for separations r>20 Mpc/h. Fitting formulae for the power spectrum and the non-linear coupling kernel at small scales allow us to extend our prediction into the strongly non-linear regime. We discuss the advantages of the line correlation relative to standard statistical measures like the bispectrum. Unlike the latter, the line correlation is independent of the linear bias. Furtherm...
Power oscillation suppression by robust SMES in power system with large wind power penetration

International Nuclear Information System (INIS)

Ngamroo, Issarachai; Cuk Supriyadi, A.N.; Dechanupaprittha, Sanchai; Mitani, Yasunori

2009-01-01

The large penetration of wind farm into interconnected power systems may cause the severe problem of tie-line power oscillations. To suppress power oscillations, the superconducting magnetic energy storage (SMES) which is able to control active and reactive powers simultaneously, can be applied. On the other hand, several generating and loading conditions, variation of system parameters, etc., cause uncertainties in the system. The SMES controller designed without considering system uncertainties may fail to suppress power oscillations. To enhance the robustness of SMES controller against system uncertainties, this paper proposes a robust control design of SMES by taking system uncertainties into account. The inverse additive perturbation is applied to represent the unstructured system uncertainties and included in power system modeling. The configuration of active and reactive power controllers is the first-order lead-lag compensator with single input feedback. To tune the controller parameters, the optimization problem is formulated based on the enhancement of robust stability margin. The particle swarm optimization is used to solve the problem and achieve the controller parameters. Simulation studies in the six-area interconnected power system with wind farms confirm the robustness of the proposed SMES under various operating conditions
Power oscillation suppression by robust SMES in power system with large wind power penetration

Science.gov (United States)

Ngamroo, Issarachai; Cuk Supriyadi, A. N.; Dechanupaprittha, Sanchai; Mitani, Yasunori

2009-01-01

The large penetration of wind farm into interconnected power systems may cause the severe problem of tie-line power oscillations. To suppress power oscillations, the superconducting magnetic energy storage (SMES) which is able to control active and reactive powers simultaneously, can be applied. On the other hand, several generating and loading conditions, variation of system parameters, etc., cause uncertainties in the system. The SMES controller designed without considering system uncertainties may fail to suppress power oscillations. To enhance the robustness of SMES controller against system uncertainties, this paper proposes a robust control design of SMES by taking system uncertainties into account. The inverse additive perturbation is applied to represent the unstructured system uncertainties and included in power system modeling. The configuration of active and reactive power controllers is the first-order lead-lag compensator with single input feedback. To tune the controller parameters, the optimization problem is formulated based on the enhancement of robust stability margin. The particle swarm optimization is used to solve the problem and achieve the controller parameters. Simulation studies in the six-area interconnected power system with wind farms confirm the robustness of the proposed SMES under various operating conditions.
Large-scale regions of antimatter

International Nuclear Information System (INIS)

Grobov, A. V.; Rubin, S. G.

2015-01-01

Amodified mechanism of the formation of large-scale antimatter regions is proposed. Antimatter appears owing to fluctuations of a complex scalar field that carries a baryon charge in the inflation era
Large-scale regions of antimatter

Energy Technology Data Exchange (ETDEWEB)

Grobov, A. V., E-mail: alexey.grobov@gmail.com; Rubin, S. G., E-mail: sgrubin@mephi.ru [National Research Nuclear University MEPhI (Russian Federation)

2015-07-15

Amodified mechanism of the formation of large-scale antimatter regions is proposed. Antimatter appears owing to fluctuations of a complex scalar field that carries a baryon charge in the inflation era.
Effects of baryons on the statistical properties of large scale structure of the Universe

International Nuclear Information System (INIS)

Guillet, T.

2010-01-01

Observations of weak gravitational lensing will provide strong constraints on the cosmic expansion history and the growth rate of large scale structure, yielding clues to the properties and nature of dark energy. Their interpretation is impacted by baryonic physics, which are expected to modify the total matter distribution at small scales. My work has focused on determining and modeling the impact of baryons on the statistics of the large scale matter distribution in the Universe. Using numerical simulations, I have extracted the effect of baryons on the power spectrum, variance and skewness of the total density field as predicted by these simulations. I have shown that a model based on the halo model construction, featuring a concentrated central component to account for cool condensed baryons, is able to reproduce accurately, and down to very small scales, the measured amplifications of both the variance and skewness of the density field. Because of well-known issues with baryons in current cosmological simulations, I have extended the central component model to rely on as many observation-based ingredients as possible. As an application, I have studied the effect of baryons on the predictions of the upcoming Euclid weak lensing survey. During the course of this work, I have also worked at developing and extending the RAMSES code, in particular by developing a parallel self-gravity solver, which offers significant performance gains, in particular for the simulation of some astrophysical setups such as isolated galaxy or cluster simulations. (author) [fr
Large-Scale Analysis of Art Proportions

DEFF Research Database (Denmark)

Jensen, Karl Kristoffer

2014-01-01

While literature often tries to impute mathematical constants into art, this large-scale study (11 databases of paintings and photos, around 200.000 items) shows a different truth. The analysis, consisting of the width/height proportions, shows a value of rarely if ever one (square) and with majo......While literature often tries to impute mathematical constants into art, this large-scale study (11 databases of paintings and photos, around 200.000 items) shows a different truth. The analysis, consisting of the width/height proportions, shows a value of rarely if ever one (square...
The Expanded Large Scale Gap Test

Science.gov (United States)

1987-03-01

NSWC TR 86-32 DTIC THE EXPANDED LARGE SCALE GAP TEST BY T. P. LIDDIARD D. PRICE RESEARCH AND TECHNOLOGY DEPARTMENT ’ ~MARCH 1987 Ap~proved for public...arises, to reduce the spread in the LSGT 50% gap value.) The worst charges, such as those with the highest or lowest densities, the largest re-pressed...Arlington, VA 22217 PE 62314N INS3A 1 RJ14E31 7R4TBK 11 TITLE (Include Security CIlmsilficatiorn The Expanded Large Scale Gap Test . 12. PEIRSONAL AUTHOR() T
Efficient parallel iterative solvers for the solution of large dense linear systems arising from the boundary element method in electromagnetism

International Nuclear Information System (INIS)

Alleon, G.; Carpentieri, B.; Du, I.S.; Giraud, L.; Langou, J.; Martin, E.

2003-01-01

The boundary element method has become a popular tool for the solution of Maxwell's equations in electromagnetism. It discretizes only the surface of the radiating object and gives rise to linear systems that are smaller in size compared to those arising from finite element or finite difference discretizations. However, these systems are prohibitively demanding in terms of memory for direct methods and challenging to solve by iterative methods. In this paper we address the iterative solution via preconditioned Krylov methods of electromagnetic scattering problems expressed in an integral formulation, with main focus on the design of the pre-conditioner. We consider an approximate inverse method based on the Frobenius-norm minimization with a pattern prescribed in advance. The pre-conditioner is constructed from a sparse approximation of the dense coefficient matrix, and the patterns both for the pre-conditioner and for the coefficient matrix are computed a priori using geometric information from the mesh. We describe the implementation of the approximate inverse in an out-of-core parallel code that uses multipole techniques for the matrix-vector products, and show results on the numerical scalability of our method on systems of size up to one million unknowns. We propose an embedded iterative scheme based on the GMRES method and combined with multipole techniques, aimed at improving the robustness of the approximate inverse for large problems. We prove by numerical experiments that the proposed scheme enables the solution of very large and difficult problems efficiently at reduced computational and memory cost. Finally we perform a preliminary study on a spectral two-level pre-conditioner to enhance the robustness of our method. This numerical technique exploits spectral information of the preconditioned systems to build a low rank-update of the pre-conditioner. (authors)
Efficient parallel iterative solvers for the solution of large dense linear systems arising from the boundary element method in electromagnetism

Energy Technology Data Exchange (ETDEWEB)

Alleon, G. [EADS-CCR, 31 - Blagnac (France); Carpentieri, B.; Du, I.S.; Giraud, L.; Langou, J.; Martin, E. [Cerfacs, 31 - Toulouse (France)

2003-07-01

The boundary element method has become a popular tool for the solution of Maxwell's equations in electromagnetism. It discretizes only the surface of the radiating object and gives rise to linear systems that are smaller in size compared to those arising from finite element or finite difference discretizations. However, these systems are prohibitively demanding in terms of memory for direct methods and challenging to solve by iterative methods. In this paper we address the iterative solution via preconditioned Krylov methods of electromagnetic scattering problems expressed in an integral formulation, with main focus on the design of the pre-conditioner. We consider an approximate inverse method based on the Frobenius-norm minimization with a pattern prescribed in advance. The pre-conditioner is constructed from a sparse approximation of the dense coefficient matrix, and the patterns both for the pre-conditioner and for the coefficient matrix are computed a priori using geometric information from the mesh. We describe the implementation of the approximate inverse in an out-of-core parallel code that uses multipole techniques for the matrix-vector products, and show results on the numerical scalability of our method on systems of size up to one million unknowns. We propose an embedded iterative scheme based on the GMRES method and combined with multipole techniques, aimed at improving the robustness of the approximate inverse for large problems. We prove by numerical experiments that the proposed scheme enables the solution of very large and difficult problems efficiently at reduced computational and memory cost. Finally we perform a preliminary study on a spectral two-level pre-conditioner to enhance the robustness of our method. This numerical technique exploits spectral information of the preconditioned systems to build a low rank-update of the pre-conditioner. (authors)
Modified stress intensity factor as a crack growth parameter applicable under large scale yielding conditions

International Nuclear Information System (INIS)

Yasuoka, Tetsuo; Mizutani, Yoshihiro; Todoroki, Akira

2014-01-01

High-temperature water stress corrosion cracking has high tensile stress sensitivity, and its growth rate has been evaluated using the stress intensity factor, which is a linear fracture mechanics parameter. Stress corrosion cracking mainly occurs and propagates around welded metals or heat-affected zones. These regions have complex residual stress distributions and yield strength distributions because of input heat effects. The authors previously reported that the stress intensity factor becomes inapplicable when steep residual stress distributions or yield strength distributions occur along the crack propagation path, because small-scale yielding conditions deviate around those distributions. Here, when the stress intensity factor is modified by considering these distributions, the modified stress intensity factor may be used for crack growth evaluation for large-scale yielding. The authors previously proposed a modified stress intensity factor incorporating the stress distribution or yield strength distribution in front of the crack using the rate of change of stress intensity factor and yield strength. However, the applicable range of modified stress intensity factor for large-scale yielding was not clarified. In this study, the range was analytically investigated by comparison with the J-integral solution. A three-point bending specimen with parallel surface crack was adopted as the analytical model and the stress intensity factor, modified stress intensity factor and equivalent stress intensity factor derived from the J-integral were calculated and compared under large-scale yielding conditions. The modified stress intensity was closer to the equivalent stress intensity factor when compared with the stress intensity factor. If deviation from the J-integral solution is acceptable up to 2%, the modified stress intensity factor is applicable up to 30% of the J-integral limit, while the stress intensity factor is applicable up to 10%. These results showed that
Global robust stability of delayed neural networks: Estimating upper limit of norm of delayed connection weight matrix

International Nuclear Information System (INIS)

Singh, Vimal

2007-01-01

The question of estimating the upper limit of -parallel B -parallel 2 , which is a key step in some recently reported global robust stability criteria for delayed neural networks, is revisited ( B denotes the delayed connection weight matrix). Recently, Cao, Huang, and Qu have given an estimate of the upper limit of -parallel B -parallel 2 . In the present paper, an alternative estimate of the upper limit of -parallel B -parallel 2 is highlighted. It is shown that the alternative estimate may yield some new global robust stability results
SMARTS: Exploiting Temporal Locality and Parallelism through Vertical Execution

International Nuclear Information System (INIS)

Beckman, P.; Crotinger, J.; Karmesin, S.; Malony, A.; Oldehoeft, R.; Shende, S.; Smith, S.; Vajracharya, S.

1999-01-01

In the solution of large-scale numerical prob- lems, parallel computing is becoming simultaneously more important and more difficult. The complex organization of today's multiprocessors with several memory hierarchies has forced the scientific programmer to make a choice between simple but unscalable code and scalable but extremely com- plex code that does not port to other architectures. This paper describes how the SMARTS runtime system and the POOMA C++ class library for high-performance scientific computing work together to exploit data parallelism in scientific applications while hiding the details of manag- ing parallelism and data locality from the user. We present innovative algorithms, based on the macro -dataflow model, for detecting data parallelism and efficiently executing data- parallel statements on shared-memory multiprocessors. We also desclibe how these algorithms can be implemented on clusters of SMPS
SMARTS: Exploiting Temporal Locality and Parallelism through Vertical Execution

Energy Technology Data Exchange (ETDEWEB)

Beckman, P.; Crotinger, J.; Karmesin, S.; Malony, A.; Oldehoeft, R.; Shende, S.; Smith, S.; Vajracharya, S.

1999-01-04

In the solution of large-scale numerical prob- lems, parallel computing is becoming simultaneously more important and more difficult. The complex organization of today's multiprocessors with several memory hierarchies has forced the scientific programmer to make a choice between simple but unscalable code and scalable but extremely com- plex code that does not port to other architectures. This paper describes how the SMARTS runtime system and the POOMA C++ class library for high-performance scientific computing work together to exploit data parallelism in scientific applications while hiding the details of manag- ing parallelism and data locality from the user. We present innovative algorithms, based on the macro -dataflow model, for detecting data parallelism and efficiently executing data- parallel statements on shared-memory multiprocessors. We also desclibe how these algorithms can be implemented on clusters of SMPS.

An integrated system for large scale scanning of nuclear emulsions

Energy Technology Data Exchange (ETDEWEB)

Bozza, Cristiano, E-mail: kryss@sa.infn.it [University of Salerno and INFN, via Ponte Don Melillo, Fisciano 84084 (Italy); D’Ambrosio, Nicola [Laboratori Nazionali del Gran Sasso, S.S. 17 BIS km 18.910, Assergi (AQ) 67010 (Italy); De Lellis, Giovanni [University of Napoli and INFN, Complesso Universitario di Monte Sant' Angelo, via Cintia Ed. G, Napoli 80126 (Italy); De Serio, Marilisa [University of Bari and INFN, via E. Orabona 4, Bari 70125 (Italy); Di Capua, Francesco [INFN Napoli, Complesso Universitario di Monte Sant' Angelo, via Cintia Ed. G, Napoli 80126 (Italy); Di Crescenzo, Antonia [University of Napoli and INFN, Complesso Universitario di Monte Sant' Angelo, via Cintia Ed. G, Napoli 80126 (Italy); Di Ferdinando, Donato [INFN Bologna, viale B. Pichat 6/2, Bologna 40127 (Italy); Di Marco, Natalia [Laboratori Nazionali del Gran Sasso, S.S. 17 BIS km 18.910, Assergi (AQ) 67010 (Italy); Esposito, Luigi Salvatore [Laboratori Nazionali del Gran Sasso, now at CERN, Geneva (Switzerland); Fini, Rosa Anna [INFN Bari, via E. Orabona 4, Bari 70125 (Italy); Giacomelli, Giorgio [University of Bologna and INFN, viale B. Pichat 6/2, Bologna 40127 (Italy); Grella, Giuseppe [University of Salerno and INFN, via Ponte Don Melillo, Fisciano 84084 (Italy); Ieva, Michela [University of Bari and INFN, via E. Orabona 4, Bari 70125 (Italy); Kose, Umut [INFN Padova, via Marzolo 8, Padova (PD) 35131 (Italy); Longhin, Andrea; Mauri, Nicoletta [INFN Laboratori Nazionali di Frascati, via E. Fermi 40, Frascati (RM) 00044 (Italy); Medinaceli, Eduardo [University of Padova and INFN, via Marzolo 8, Padova (PD) 35131 (Italy); Monacelli, Piero [University of L' Aquila and INFN, via Vetoio Loc. Coppito, L' Aquila (AQ) 67100 (Italy); Muciaccia, Maria Teresa; Pastore, Alessandra [University of Bari and INFN, via E. Orabona 4, Bari 70125 (Italy); and others

2013-03-01

The European Scanning System, developed to analyse nuclear emulsions at high speed, has been completed with the development of a high level software infrastructure to automate and support large-scale emulsion scanning. In one year, an average installation is capable of performing data-taking and online analysis on a total surface ranging from few m{sup 2} to tens of m{sup 2}, acquiring many billions of tracks, corresponding to several TB. This paper focuses on the procedures that have been implemented and on their impact on physics measurements. The system proved robust, reliable, fault-tolerant and user-friendly, and seldom needs assistance. A dedicated relational Data Base system is the backbone of the whole infrastructure, storing data themselves and not only catalogues of data files, as in common practice, being a unique case in high-energy physics DAQ systems. The logical organisation of the system is described and a summary is given of the physics measurement that are readily available by automated processing.
An integrated system for large scale scanning of nuclear emulsions

International Nuclear Information System (INIS)

Bozza, Cristiano; D’Ambrosio, Nicola; De Lellis, Giovanni; De Serio, Marilisa; Di Capua, Francesco; Di Crescenzo, Antonia; Di Ferdinando, Donato; Di Marco, Natalia; Esposito, Luigi Salvatore; Fini, Rosa Anna; Giacomelli, Giorgio; Grella, Giuseppe; Ieva, Michela; Kose, Umut; Longhin, Andrea; Mauri, Nicoletta; Medinaceli, Eduardo; Monacelli, Piero; Muciaccia, Maria Teresa; Pastore, Alessandra

2013-01-01

The European Scanning System, developed to analyse nuclear emulsions at high speed, has been completed with the development of a high level software infrastructure to automate and support large-scale emulsion scanning. In one year, an average installation is capable of performing data-taking and online analysis on a total surface ranging from few m 2 to tens of m 2 , acquiring many billions of tracks, corresponding to several TB. This paper focuses on the procedures that have been implemented and on their impact on physics measurements. The system proved robust, reliable, fault-tolerant and user-friendly, and seldom needs assistance. A dedicated relational Data Base system is the backbone of the whole infrastructure, storing data themselves and not only catalogues of data files, as in common practice, being a unique case in high-energy physics DAQ systems. The logical organisation of the system is described and a summary is given of the physics measurement that are readily available by automated processing
Large scale and big data processing and management

CERN Document Server

Sakr, Sherif

2014-01-01

Large Scale and Big Data: Processing and Management provides readers with a central source of reference on the data management techniques currently available for large-scale data processing. Presenting chapters written by leading researchers, academics, and practitioners, it addresses the fundamental challenges associated with Big Data processing tools and techniques across a range of computing environments.The book begins by discussing the basic concepts and tools of large-scale Big Data processing and cloud computing. It also provides an overview of different programming models and cloud-bas
Scalability of Parallel Scientific Applications on the Cloud

Directory of Open Access Journals (Sweden)

Satish Narayana Srirama

2011-01-01

Full Text Available Cloud computing, with its promise of virtually infinite resources, seems to suit well in solving resource greedy scientific computing problems. To study the effects of moving parallel scientific applications onto the cloud, we deployed several benchmark applications like matrix–vector operations and NAS parallel benchmarks, and DOUG (Domain decomposition On Unstructured Grids on the cloud. DOUG is an open source software package for parallel iterative solution of very large sparse systems of linear equations. The detailed analysis of DOUG on the cloud showed that parallel applications benefit a lot and scale reasonable on the cloud. We could also observe the limitations of the cloud and its comparison with cluster in terms of performance. However, for efficiently running the scientific applications on the cloud infrastructure, the applications must be reduced to frameworks that can successfully exploit the cloud resources, like the MapReduce framework. Several iterative and embarrassingly parallel algorithms are reduced to the MapReduce model and their performance is measured and analyzed. The analysis showed that Hadoop MapReduce has significant problems with iterative methods, while it suits well for embarrassingly parallel algorithms. Scientific computing often uses iterative methods to solve large problems. Thus, for scientific computing on the cloud, this paper raises the necessity for better frameworks or optimizations for MapReduce.
Large scale cluster computing workshop

International Nuclear Information System (INIS)

Dane Skow; Alan Silverman

2002-01-01

Recent revolutions in computer hardware and software technologies have paved the way for the large-scale deployment of clusters of commodity computers to address problems heretofore the domain of tightly coupled SMP processors. Near term projects within High Energy Physics and other computing communities will deploy clusters of scale 1000s of processors and be used by 100s to 1000s of independent users. This will expand the reach in both dimensions by an order of magnitude from the current successful production facilities. The goals of this workshop were: (1) to determine what tools exist which can scale up to the cluster sizes foreseen for the next generation of HENP experiments (several thousand nodes) and by implication to identify areas where some investment of money or effort is likely to be needed. (2) To compare and record experimences gained with such tools. (3) To produce a practical guide to all stages of planning, installing, building and operating a large computing cluster in HENP. (4) To identify and connect groups with similar interest within HENP and the larger clustering community
LCL-Filter Design for Robust Active Damping in Grid-Connected Converters

DEFF Research Database (Denmark)

Pena-Alzola, Rafael; Liserre, Marco; Blaabjerg, Frede

2014-01-01

in the grid inductance may compromise system stability, and this problem is more severe for parallel converters. This situation, typical of rural areas with solar and wind resources, calls for robust LCL-filter design. This paper proposes a design procedure with remarkable results under severe grid inductance......Grid-connected converters employ LCL-filters, instead of simple inductors, because they allow lower inductances while reducing cost and size. Active damping, without dissipative elements, is preferred to passive damping for solving the associated stability problems. However, large variations...
Short assessment of the Big Five: robust across survey methods except telephone interviewing

OpenAIRE

Lang, Frieder R.; John, Dennis; Lüdtke, Oliver; Schupp, Jürgen; Wagner, Gert G.

2011-01-01

We examined measurement invariance and age-related robustness of a short 15-item Big Five Inventory (BFI–S) of personality dimensions, which is well suited for applications in large-scale multidisciplinary surveys. The BFI–S was assessed in three different interviewing conditions: computer-assisted or paper-assisted face-to-face interviewing, computer-assisted telephone interviewing, and a self-administered questionnaire. Randomized probability samples from a large-scale German panel survey a...
Iterative algorithms for large sparse linear systems on parallel computers

Science.gov (United States)

Adams, L. M.

1982-01-01

Algorithms for assembling in parallel the sparse system of linear equations that result from finite difference or finite element discretizations of elliptic partial differential equations, such as those that arise in structural engineering are developed. Parallel linear stationary iterative algorithms and parallel preconditioned conjugate gradient algorithms are developed for solving these systems. In addition, a model for comparing parallel algorithms on array architectures is developed and results of this model for the algorithms are given.
Large-Scale Agriculture and Outgrower Schemes in Ethiopia

DEFF Research Database (Denmark)

Wendimu, Mengistu Assefa

, the impact of large-scale agriculture and outgrower schemes on productivity, household welfare and wages in developing countries is highly contentious. Chapter 1 of this thesis provides an introduction to the study, while also reviewing the key debate in the contemporary land ‘grabbing’ and historical large...... sugarcane outgrower scheme on household income and asset stocks. Chapter 5 examines the wages and working conditions in ‘formal’ large-scale and ‘informal’ small-scale irrigated agriculture. The results in Chapter 2 show that moisture stress, the use of untested planting materials, and conflict over land...... commands a higher wage than ‘formal’ large-scale agriculture, while rather different wage determination mechanisms exist in the two sectors. Human capital characteristics (education and experience) partly explain the differences in wages within the formal sector, but play no significant role...
Distributed redundancy and robustness in complex systems

KAUST Repository

Randles, Martin

2011-03-01

The uptake and increasing prevalence of Web 2.0 applications, promoting new large-scale and complex systems such as Cloud computing and the emerging Internet of Services/Things, requires tools and techniques to analyse and model methods to ensure the robustness of these new systems. This paper reports on assessing and improving complex system resilience using distributed redundancy, termed degeneracy in biological systems, to endow large-scale complicated computer systems with the same robustness that emerges in complex biological and natural systems. However, in order to promote an evolutionary approach, through emergent self-organisation, it is necessary to specify the systems in an \\'open-ended\\' manner where not all states of the system are prescribed at design-time. In particular an observer system is used to select robust topologies, within system components, based on a measurement of the first non-zero Eigen value in the Laplacian spectrum of the components\\' network graphs; also known as the algebraic connectivity. It is shown, through experimentation on a simulation, that increasing the average algebraic connectivity across the components, in a network, leads to an increase in the variety of individual components termed distributed redundancy; the capacity for structurally distinct components to perform an identical function in a particular context. The results are applied to a specific application where active clustering of like services is used to aid load balancing in a highly distributed network. Using the described procedure is shown to improve performance and distribute redundancy. © 2010 Elsevier Inc.
Neural nets for massively parallel optimization

Science.gov (United States)

Dixon, Laurence C. W.; Mills, David

1992-07-01

To apply massively parallel processing systems to the solution of large scale optimization problems it is desirable to be able to evaluate any function f(z), z (epsilon) Rn in a parallel manner. The theorem of Cybenko, Hecht Nielsen, Hornik, Stinchcombe and White, and Funahasi shows that this can be achieved by a neural network with one hidden layer. In this paper we address the problem of the number of nodes required in the layer to achieve a given accuracy in the function and gradient values at all points within a given n dimensional interval. The type of activation function needed to obtain nonsingular Hessian matrices is described and a strategy for obtaining accurate minimal networks presented.
Economically viable large-scale hydrogen liquefaction

Science.gov (United States)

Cardella, U.; Decker, L.; Klein, H.

2017-02-01

The liquid hydrogen demand, particularly driven by clean energy applications, will rise in the near future. As industrial large scale liquefiers will play a major role within the hydrogen supply chain, production capacity will have to increase by a multiple of today’s typical sizes. The main goal is to reduce the total cost of ownership for these plants by increasing energy efficiency with innovative and simple process designs, optimized in capital expenditure. New concepts must ensure a manageable plant complexity and flexible operability. In the phase of process development and selection, a dimensioning of key equipment for large scale liquefiers, such as turbines and compressors as well as heat exchangers, must be performed iteratively to ensure technological feasibility and maturity. Further critical aspects related to hydrogen liquefaction, e.g. fluid properties, ortho-para hydrogen conversion, and coldbox configuration, must be analysed in detail. This paper provides an overview on the approach, challenges and preliminary results in the development of efficient as well as economically viable concepts for large-scale hydrogen liquefaction.
Topological acoustic polaritons: robust sound manipulation at the subwavelength scale

International Nuclear Information System (INIS)

Yves, Simon; Fleury, Romain; Lemoult, Fabrice; Fink, Mathias; Lerosey, Geoffroy

2017-01-01

Topological insulators, a hallmark of condensed matter physics, have recently reached the classical realm of acoustic waves. A remarkable property of time-reversal invariant topological insulators is the presence of unidirectional spin-polarized propagation along their edges, a property that could lead to a wealth of new opportunities in the ability to guide and manipulate sound. Here, we demonstrate and study the possibility to induce topologically non-trivial acoustic states at the deep subwavelength scale, in a structured two-dimensional metamaterial composed of Helmholtz resonators. Radically different from previous designs based on non-resonant sonic crystals, our proposal enables robust sound manipulation on a surface along predefined, subwavelength pathways of arbitrary shapes. (paper)
Topological acoustic polaritons: robust sound manipulation at the subwavelength scale

Science.gov (United States)

Yves, Simon; Fleury, Romain; Lemoult, Fabrice; Fink, Mathias; Lerosey, Geoffroy

2017-07-01

Topological insulators, a hallmark of condensed matter physics, have recently reached the classical realm of acoustic waves. A remarkable property of time-reversal invariant topological insulators is the presence of unidirectional spin-polarized propagation along their edges, a property that could lead to a wealth of new opportunities in the ability to guide and manipulate sound. Here, we demonstrate and study the possibility to induce topologically non-trivial acoustic states at the deep subwavelength scale, in a structured two-dimensional metamaterial composed of Helmholtz resonators. Radically different from previous designs based on non-resonant sonic crystals, our proposal enables robust sound manipulation on a surface along predefined, subwavelength pathways of arbitrary shapes.
Large scale chromatographic separations using continuous displacement chromatography (CDC)

International Nuclear Information System (INIS)

Taniguchi, V.T.; Doty, A.W.; Byers, C.H.

1988-01-01

A process for large scale chromatographic separations using a continuous chromatography technique is described. The process combines the advantages of large scale batch fixed column displacement chromatography with conventional analytical or elution continuous annular chromatography (CAC) to enable large scale displacement chromatography to be performed on a continuous basis (CDC). Such large scale, continuous displacement chromatography separations have not been reported in the literature. The process is demonstrated with the ion exchange separation of a binary lanthanide (Nd/Pr) mixture. The process is, however, applicable to any displacement chromatography separation that can be performed using conventional batch, fixed column chromatography
Large-scale functional purification of recombinant HIV-1 capsid.

Directory of Open Access Journals (Sweden)

Magdeleine Hung

Full Text Available During human immunodeficiency virus type-1 (HIV-1 virion maturation, capsid proteins undergo a major rearrangement to form a conical core that protects the viral nucleoprotein complexes. Mutations in the capsid sequence that alter the stability of the capsid core are deleterious to viral infectivity and replication. Recently, capsid assembly has become an attractive target for the development of a new generation of anti-retroviral agents. Drug screening efforts and subsequent structural and mechanistic studies require gram quantities of active, homogeneous and pure protein. Conventional means of laboratory purification of Escherichia coli expressed recombinant capsid protein rely on column chromatography steps that are not amenable to large-scale production. Here we present a function-based purification of wild-type and quadruple mutant capsid proteins, which relies on the inherent propensity of capsid protein to polymerize and depolymerize. This method does not require the packing of sizable chromatography columns and can generate double-digit gram quantities of functionally and biochemically well-behaved proteins with greater than 98% purity. We have used the purified capsid protein to characterize two known assembly inhibitors in our in-house developed polymerization assay and to measure their binding affinities. Our capsid purification procedure provides a robust method for purifying large quantities of a key protein in the HIV-1 life cycle, facilitating identification of the next generation anti-HIV agents.
Large-Scale, Multi-Sensor Atmospheric Data Fusion Using Hybrid Cloud Computing

Science.gov (United States)

Wilson, B. D.; Manipon, G.; Hua, H.; Fetzer, E. J.

2015-12-01

NASA's Earth Observing System (EOS) is an ambitious facility for studying global climate change. The mandate now is to combine measurements from the instruments on the "A-Train" platforms (AIRS, MODIS, MLS, and CloudSat) and other Earth probes to enable large-scale studies of climate change over decades. Moving to multi-sensor, long-duration presents serious challenges for large-scale data mining and fusion. For example, one might want to compare temperature and water vapor retrievals from one instrument (AIRS) to another (MODIS), and to a model (ECMWF), stratify the comparisons using a classification of the "cloud scenes" from CloudSat, and repeat the entire analysis over 10 years of data. HySDS is a Hybrid-Cloud Science Data System that has been developed and applied under NASA AIST, MEaSUREs, and ACCESS grants. HySDS uses the SciFlow workflow engine to partition analysis workflows into parallel tasks (e.g. segmenting by time or space) that are pushed into a durable job queue. The tasks are "pulled" from the queue by worker Virtual Machines (VM's) and executed in an on-premise Cloud (Eucalyptus or OpenStack) or at Amazon in the public Cloud or govCloud. In this way, years of data (millions of files) can be processed in a massively parallel way. Input variables (arrays) are pulled on-demand into the Cloud using OPeNDAP URLs or other subsetting services, thereby minimizing the size of the transferred data. We are using HySDS to automate the production of multiple versions of a ten-year A-Train water vapor climatology under a MEASURES grant. We will present the architecture of HySDS, describe the achieved "clock time" speedups in fusing datasets on our own nodes and in the Amazon Cloud, and discuss the Cloud cost tradeoffs for storage, compute, and data transfer. Our system demonstrates how one can pull A-Train variables (Levels 2 & 3) on-demand into the Amazon Cloud, and cache only those variables that are heavily used, so that any number of compute jobs can be
Large Scale Processes and Extreme Floods in Brazil

Science.gov (United States)

Ribeiro Lima, C. H.; AghaKouchak, A.; Lall, U.

2016-12-01

Persistent large scale anomalies in the atmospheric circulation and ocean state have been associated with heavy rainfall and extreme floods in water basins of different sizes across the world. Such studies have emerged in the last years as a new tool to improve the traditional, stationary based approach in flood frequency analysis and flood prediction. Here we seek to advance previous studies by evaluating the dominance of large scale processes (e.g. atmospheric rivers/moisture transport) over local processes (e.g. local convection) in producing floods. We consider flood-prone regions in Brazil as case studies and the role of large scale climate processes in generating extreme floods in such regions is explored by means of observed streamflow, reanalysis data and machine learning methods. The dynamics of the large scale atmospheric circulation in the days prior to the flood events are evaluated based on the vertically integrated moisture flux and its divergence field, which are interpreted in a low-dimensional space as obtained by machine learning techniques, particularly supervised kernel principal component analysis. In such reduced dimensional space, clusters are obtained in order to better understand the role of regional moisture recycling or teleconnected moisture in producing floods of a given magnitude. The convective available potential energy (CAPE) is also used as a measure of local convection activities. We investigate for individual sites the exceedance probability in which large scale atmospheric fluxes dominate the flood process. Finally, we analyze regional patterns of floods and how the scaling law of floods with drainage area responds to changes in the climate forcing mechanisms (e.g. local vs large scale).
Computing in Large-Scale Dynamic Systems

NARCIS (Netherlands)

Pruteanu, A.S.

2013-01-01

Software applications developed for large-scale systems have always been difficult to de- velop due to problems caused by the large number of computing devices involved. Above a certain network size (roughly one hundred), necessary services such as code updating, topol- ogy discovery and data
Rapid Large Scale Reprocessing of the ODI Archive using the QuickReduce Pipeline

Science.gov (United States)

Gopu, A.; Kotulla, R.; Young, M. D.; Hayashi, S.; Harbeck, D.; Liu, W.; Henschel, R.

2015-09-01

The traditional model of astronomers collecting their observations as raw instrument data is being increasingly replaced by astronomical observatories serving standard calibrated data products to observers and to the public at large once proprietary restrictions are lifted. For this model to be effective, observatories need the ability to periodically re-calibrate archival data products as improved master calibration products or pipeline improvements become available, and also to allow users to rapidly calibrate their data on-the-fly. Traditional astronomy pipelines are heavily I/O dependent and do not scale with increasing data volumes. In this paper, we present the One Degree Imager - Portal, Pipeline and Archive (ODI-PPA) calibration pipeline framework which integrates the efficient and parallelized QuickReduce pipeline to enable a large number of simultaneous, parallel data reduction jobs - initiated by operators AND/OR users - while also ensuring rapid processing times and full data provenance. Our integrated pipeline system allows re-processing of the entire ODI archive (˜15,000 raw science frames, ˜3.0 TB compressed) within ˜18 hours using twelve 32-core compute nodes on the Big Red II supercomputer. Our flexible, fast, easy to operate, and highly scalable framework improves access to ODI data, in particular when data rates double with an upgraded focal plane (scheduled for 2015), and also serve as a template for future data processing infrastructure across the astronomical community and beyond.

Fires in large scale ventilation systems

International Nuclear Information System (INIS)

Gregory, W.S.; Martin, R.A.; White, B.W.; Nichols, B.D.; Smith, P.R.; Leslie, I.H.; Fenton, D.L.; Gunaji, M.V.; Blythe, J.P.

1991-01-01

This paper summarizes the experience gained simulating fires in large scale ventilation systems patterned after ventilation systems found in nuclear fuel cycle facilities. The series of experiments discussed included: (1) combustion aerosol loading of 0.61x0.61 m HEPA filters with the combustion products of two organic fuels, polystyrene and polymethylemethacrylate; (2) gas dynamic and heat transport through a large scale ventilation system consisting of a 0.61x0.61 m duct 90 m in length, with dampers, HEPA filters, blowers, etc.; (3) gas dynamic and simultaneous transport of heat and solid particulate (consisting of glass beads with a mean aerodynamic diameter of 10μ) through the large scale ventilation system; and (4) the transport of heat and soot, generated by kerosene pool fires, through the large scale ventilation system. The FIRAC computer code, designed to predict fire-induced transients in nuclear fuel cycle facility ventilation systems, was used to predict the results of experiments (2) through (4). In general, the results of the predictions were satisfactory. The code predictions for the gas dynamics, heat transport, and particulate transport and deposition were within 10% of the experimentally measured values. However, the code was less successful in predicting the amount of soot generation from kerosene pool fires, probably due to the fire module of the code being a one-dimensional zone model. The experiments revealed a complicated three-dimensional combustion pattern within the fire room of the ventilation system. Further refinement of the fire module within FIRAC is needed. (orig.)
Robust synthesis for real-time systems

DEFF Research Database (Denmark)

Larsen, Kim Guldstrand; Legay, Axel; Traonouez, Luois-Marie

2014-01-01

Specification theories for real-time systems allow reasoning about interfaces and their implementation models, using a set of operators that includes satisfaction, refinement, logical and parallel composition. To make such theories applicable throughout the entire design process from an abstract...... of introducing small perturbations into formal models. We address this problem of robust implementations in timed specification theories. We first consider a fixed perturbation and study the robustness of timed specifications with respect to the operators of the theory. To this end we synthesize robust...... specification to an implementation, we need to reason about the possibility to effectively implement the theoretical specifications on physical systems, despite their limited precision. In the literature, this implementation problem has been linked to the robustness problem that analyzes the consequences...
Energy-scales convergence for optimal and robust quantum transport in photosynthetic complexes

Energy Technology Data Exchange (ETDEWEB)

Mohseni, M. [Google Research, Venice, California 90291 (United States); Research Laboratory of Electronics, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139 (United States); Shabani, A. [Department of Chemistry, Princeton University, Princeton, New Jersey 08544 (United States); Department of Chemistry, University of California at Berkeley, Berkeley, California 94720 (United States); Lloyd, S. [Department of Mechanical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139 (United States); Rabitz, H. [Department of Chemistry, Princeton University, Princeton, New Jersey 08544 (United States)

2014-01-21

Underlying physical principles for the high efficiency of excitation energy transfer in light-harvesting complexes are not fully understood. Notably, the degree of robustness of these systems for transporting energy is not known considering their realistic interactions with vibrational and radiative environments within the surrounding solvent and scaffold proteins. In this work, we employ an efficient technique to estimate energy transfer efficiency of such complex excitonic systems. We observe that the dynamics of the Fenna-Matthews-Olson (FMO) complex leads to optimal and robust energy transport due to a convergence of energy scales among all important internal and external parameters. In particular, we show that the FMO energy transfer efficiency is optimum and stable with respect to important parameters of environmental interactions including reorganization energy λ, bath frequency cutoff γ, temperature T, and bath spatial correlations. We identify the ratio of k{sub B}λT/ℏγ⁢g as a single key parameter governing quantum transport efficiency, where g is the average excitonic energy gap.
Energy-scales convergence for optimal and robust quantum transport in photosynthetic complexes

International Nuclear Information System (INIS)

Mohseni, M.; Shabani, A.; Lloyd, S.; Rabitz, H.

2014-01-01

Underlying physical principles for the high efficiency of excitation energy transfer in light-harvesting complexes are not fully understood. Notably, the degree of robustness of these systems for transporting energy is not known considering their realistic interactions with vibrational and radiative environments within the surrounding solvent and scaffold proteins. In this work, we employ an efficient technique to estimate energy transfer efficiency of such complex excitonic systems. We observe that the dynamics of the Fenna-Matthews-Olson (FMO) complex leads to optimal and robust energy transport due to a convergence of energy scales among all important internal and external parameters. In particular, we show that the FMO energy transfer efficiency is optimum and stable with respect to important parameters of environmental interactions including reorganization energy λ, bath frequency cutoff γ, temperature T, and bath spatial correlations. We identify the ratio of k B λT/ℏγ⁢g as a single key parameter governing quantum transport efficiency, where g is the average excitonic energy gap
Large-scale Complex IT Systems

OpenAIRE

Sommerville, Ian; Cliff, Dave; Calinescu, Radu; Keen, Justin; Kelly, Tim; Kwiatkowska, Marta; McDermid, John; Paige, Richard

2011-01-01

This paper explores the issues around the construction of large-scale complex systems which are built as 'systems of systems' and suggests that there are fundamental reasons, derived from the inherent complexity in these systems, why our current software engineering methods and techniques cannot be scaled up to cope with the engineering challenges of constructing such systems. It then goes on to propose a research and education agenda for software engineering that identifies the major challen...
Large-scale complex IT systems

OpenAIRE

Sommerville, Ian; Cliff, Dave; Calinescu, Radu; Keen, Justin; Kelly, Tim; Kwiatkowska, Marta; McDermid, John; Paige, Richard

2012-01-01

12 pages, 2 figures This paper explores the issues around the construction of large-scale complex systems which are built as 'systems of systems' and suggests that there are fundamental reasons, derived from the inherent complexity in these systems, why our current software engineering methods and techniques cannot be scaled up to cope with the engineering challenges of constructing such systems. It then goes on to propose a research and education agenda for software engineering that ident...
First Mile Challenges for Large-Scale IoT

KAUST Repository

Bader, Ahmed; Elsawy, Hesham; Gharbieh, Mohammad; Alouini, Mohamed-Slim; Adinoyi, Abdulkareem; Alshaalan, Furaih

2017-01-01

The Internet of Things is large-scale by nature. This is not only manifested by the large number of connected devices, but also by the sheer scale of spatial traffic intensity that must be accommodated, primarily in the uplink direction. To that end
MARVIN: Distributed reasoning over large-scale Semantic Web data

NARCIS (Netherlands)

Oren, E.; Kotoulas, S.; Anadiotis, G.; Siebes, R.M.; ten Teije, A.C.M.; van Harmelen, F.A.H.

2009-01-01

Many Semantic Web problems are difficult to solve through common divide-and-conquer strategies, since they are hard to partition. We present Marvin, a parallel and distributed platform for processing large amounts of RDF data, on a network of loosely coupled peers. We present our divide-conquer-swap
Intelligent spatial ecosystem modeling using parallel processors

International Nuclear Information System (INIS)

Maxwell, T.; Costanza, R.

1993-01-01

Spatial modeling of ecosystems is essential if one's modeling goals include developing a relatively realistic description of past behavior and predictions of the impacts of alternative management policies on future ecosystem behavior. Development of these models has been limited in the past by the large amount of input data required and the difficulty of even large mainframe serial computers in dealing with large spatial arrays. These two limitations have begun to erode with the increasing availability of remote sensing data and GIS systems to manipulate it, and the development of parallel computer systems which allow computation of large, complex, spatial arrays. Although many forms of dynamic spatial modeling are highly amenable to parallel processing, the primary focus in this project is on process-based landscape models. These models simulate spatial structure by first compartmentalizing the landscape into some geometric design and then describing flows within compartments and spatial processes between compartments according to location-specific algorithms. The authors are currently building and running parallel spatial models at the regional scale for the Patuxent River region in Maryland, the Everglades in Florida, and Barataria Basin in Louisiana. The authors are also planning a project to construct a series of spatially explicit linked ecological and economic simulation models aimed at assessing the long-term potential impacts of global climate change
A scalable PC-based parallel computer for lattice QCD

International Nuclear Information System (INIS)

Fodor, Z.; Katz, S.D.; Pappa, G.

2003-01-01

A PC-based parallel computer for medium/large scale lattice QCD simulations is suggested. The Eoetvoes Univ., Inst. Theor. Phys. cluster consists of 137 Intel P4-1.7GHz nodes. Gigabit Ethernet cards are used for nearest neighbor communication in a two-dimensional mesh. The sustained performance for dynamical staggered (wilson) quarks on large lattices is around 70(110) GFlops. The exceptional price/performance ratio is below $1/Mflop
A scalable PC-based parallel computer for lattice QCD

International Nuclear Information System (INIS)

Fodor, Z.; Papp, G.

2002-09-01

A PC-based parallel computer for medium/large scale lattice QCD simulations is suggested. The Eoetvoes Univ., Inst. Theor. Phys. cluster consists of 137 Intel P4-1.7 GHz nodes. Gigabit Ethernet cards are used for nearest neighbor communication in a two-dimensional mesh. The sustained performance for dynamical staggered(wilson) quarks on large lattices is around 70(110) GFlops. The exceptional price/performance ratio is below $1/Mflop. (orig.)
Parallel preconditioning techniques for sparse CG solvers

Energy Technology Data Exchange (ETDEWEB)

Basermann, A.; Reichel, B.; Schelthoff, C. [Central Institute for Applied Mathematics, Juelich (Germany)

1996-12-31

Conjugate gradient (CG) methods to solve sparse systems of linear equations play an important role in numerical methods for solving discretized partial differential equations. The large size and the condition of many technical or physical applications in this area result in the need for efficient parallelization and preconditioning techniques of the CG method. In particular for very ill-conditioned matrices, sophisticated preconditioner are necessary to obtain both acceptable convergence and accuracy of CG. Here, we investigate variants of polynomial and incomplete Cholesky preconditioners that markedly reduce the iterations of the simply diagonally scaled CG and are shown to be well suited for massively parallel machines.
Large scale access tests and online interfaces to ATLAS conditions databases

International Nuclear Information System (INIS)

Amorim, A; Lopes, L; Pereira, P; Simoes, J; Soloviev, I; Burckhart, D; Schmitt, J V D; Caprini, M; Kolos, S

2008-01-01

The access of the ATLAS Trigger and Data Acquisition (TDAQ) system to the ATLAS Conditions Databases sets strong reliability and performance requirements on the database storage and access infrastructures. Several applications were developed to support the integration of Conditions database access with the online services in TDAQ, including the interface to the Information Services (IS) and to the TDAQ Configuration Databases. The information storage requirements were the motivation for the ONline A Synchronous Interface to COOL (ONASIC) from the Information Service (IS) to LCG/COOL databases. ONASIC avoids the possible backpressure from Online Database servers by managing a local cache. In parallel, OKS2COOL was developed to store Configuration Databases into an Offline Database with history record. The DBStressor application was developed to test and stress the access to the Conditions database using the LCG/COOL interface while operating in an integrated way as a TDAQ application. The performance scaling of simultaneous Conditions database read accesses was studied in the context of the ATLAS High Level Trigger large computing farms. A large set of tests were performed involving up to 1000 computing nodes that simultaneously accessed the LCG central database server infrastructure at CERN
Parallel computing simulation of fluid flow in the unsaturated zone of Yucca Mountain, Nevada

International Nuclear Information System (INIS)

Zhang, Keni; Wu, Yu-Shu; Bodvarsson, G.S.

2001-01-01

This paper presents the application of parallel computing techniques to large-scale modeling of fluid flow in the unsaturated zone (UZ) at Yucca Mountain, Nevada. In this study, parallel computing techniques, as implemented into the TOUGH2 code, are applied in large-scale numerical simulations on a distributed-memory parallel computer. The modeling study has been conducted using an over-one-million-cell three-dimensional numerical model, which incorporates a wide variety of field data for the highly heterogeneous fractured formation at Yucca Mountain. The objective of this study is to analyze the impact of various surface infiltration scenarios (under current and possible future climates) on flow through the UZ system, using various hydrogeological conceptual models with refined grids. The results indicate that the one-million-cell models produce better resolution results and reveal some flow patterns that cannot be obtained using coarse-grid modeling models
Prospects for large scale electricity storage in Denmark

DEFF Research Database (Denmark)

Krog Ekman, Claus; Jensen, Søren Højgaard

2010-01-01

In a future power systems with additional wind power capacity there will be an increased need for large scale power management as well as reliable balancing and reserve capabilities. Different technologies for large scale electricity storage provide solutions to the different challenges arising w...
Robust Distributed Model Predictive Load Frequency Control of Interconnected Power System

Directory of Open Access Journals (Sweden)

Xiangjie Liu

2013-01-01

Full Text Available Considering the load frequency control (LFC of large-scale power system, a robust distributed model predictive control (RDMPC is presented. The system uncertainty according to power system parameter variation alone with the generation rate constraints (GRC is included in the synthesis procedure. The entire power system is composed of several control areas, and the problem is formulated as convex optimization problem with linear matrix inequalities (LMI that can be solved efficiently. It minimizes an upper bound on a robust performance objective for each subsystem. Simulation results show good dynamic response and robustness in the presence of power system dynamic uncertainties.
Multi-format all-optical processing based on a large-scale, hybridly integrated photonic circuit.

Science.gov (United States)

Bougioukos, M; Kouloumentas, Ch; Spyropoulou, M; Giannoulis, G; Kalavrouziotis, D; Maziotis, A; Bakopoulos, P; Harmon, R; Rogers, D; Harrison, J; Poustie, A; Maxwell, G; Avramopoulos, H

2011-06-06

We investigate through numerical studies and experiments the performance of a large scale, silica-on-silicon photonic integrated circuit for multi-format regeneration and wavelength-conversion. The circuit encompasses a monolithically integrated array of four SOAs inside two parallel Mach-Zehnder structures, four delay interferometers and a large number of silica waveguides and couplers. Exploiting phase-incoherent techniques, the circuit is capable of processing OOK signals at variable bit rates, DPSK signals at 22 or 44 Gb/s and DQPSK signals at 44 Gbaud. Simulation studies reveal the wavelength-conversion potential of the circuit with enhanced regenerative capabilities for OOK and DPSK modulation formats and acceptable quality degradation for DQPSK format. Regeneration of 22 Gb/s OOK signals with amplified spontaneous emission (ASE) noise and DPSK data signals degraded with amplitude, phase and ASE noise is experimentally validated demonstrating a power penalty improvement up to 1.5 dB.
Primal-dual convex optimization in large deformation diffeomorphic metric mapping: LDDMM meets robust regularizers

Science.gov (United States)

Hernandez, Monica

2017-12-01

This paper proposes a method for primal-dual convex optimization in variational large deformation diffeomorphic metric mapping problems formulated with robust regularizers and robust image similarity metrics. The method is based on Chambolle and Pock primal-dual algorithm for solving general convex optimization problems. Diagonal preconditioning is used to ensure the convergence of the algorithm to the global minimum. We consider three robust regularizers liable to provide acceptable results in diffeomorphic registration: Huber, V-Huber and total generalized variation. The Huber norm is used in the image similarity term. The primal-dual equations are derived for the stationary and the non-stationary parameterizations of diffeomorphisms. The resulting algorithms have been implemented for running in the GPU using Cuda. For the most memory consuming methods, we have developed a multi-GPU implementation. The GPU implementations allowed us to perform an exhaustive evaluation study in NIREP and LPBA40 databases. The experiments showed that, for all the considered regularizers, the proposed method converges to diffeomorphic solutions while better preserving discontinuities at the boundaries of the objects compared to baseline diffeomorphic registration methods. In most cases, the evaluation showed a competitive performance for the robust regularizers, close to the performance of the baseline diffeomorphic registration methods.
KINETIC ALFVÉN WAVE GENERATION BY LARGE-SCALE PHASE MIXING

International Nuclear Information System (INIS)

Vásconez, C. L.; Pucci, F.; Valentini, F.; Servidio, S.; Malara, F.; Matthaeus, W. H.

2015-01-01

One view of the solar wind turbulence is that the observed highly anisotropic fluctuations at spatial scales near the proton inertial length d p may be considered as kinetic Alfvén waves (KAWs). In the present paper, we show how phase mixing of large-scale parallel-propagating Alfvén waves is an efficient mechanism for the production of KAWs at wavelengths close to d p and at a large propagation angle with respect to the magnetic field. Magnetohydrodynamic (MHD), Hall magnetohydrodynamic (HMHD), and hybrid Vlasov–Maxwell (HVM) simulations modeling the propagation of Alfvén waves in inhomogeneous plasmas are performed. In the linear regime, the role of dispersive effects is singled out by comparing MHD and HMHD results. Fluctuations produced by phase mixing are identified as KAWs through a comparison of polarization of magnetic fluctuations and wave-group velocity with analytical linear predictions. In the nonlinear regime, a comparison of HMHD and HVM simulations allows us to point out the role of kinetic effects in shaping the proton-distribution function. We observe the generation of temperature anisotropy with respect to the local magnetic field and the production of field-aligned beams. The regions where the proton-distribution function highly departs from thermal equilibrium are located inside the shear layers, where the KAWs are excited, this suggesting that the distortions of the proton distribution are driven by a resonant interaction of protons with KAW fluctuations. Our results are relevant in configurations where magnetic-field inhomogeneities are present, as, for example, in the solar corona, where the presence of Alfvén waves has been ascertained
KINETIC ALFVÉN WAVE GENERATION BY LARGE-SCALE PHASE MIXING

Energy Technology Data Exchange (ETDEWEB)

Vásconez, C. L.; Pucci, F.; Valentini, F.; Servidio, S.; Malara, F. [Dipartimento di Fisica, Università della Calabria, I-87036, Rende (CS) (Italy); Matthaeus, W. H. [Department of Physics and Astronomy, University of Delaware, DE 19716 (United States)

2015-12-10

One view of the solar wind turbulence is that the observed highly anisotropic fluctuations at spatial scales near the proton inertial length d{sub p} may be considered as kinetic Alfvén waves (KAWs). In the present paper, we show how phase mixing of large-scale parallel-propagating Alfvén waves is an efficient mechanism for the production of KAWs at wavelengths close to d{sub p} and at a large propagation angle with respect to the magnetic field. Magnetohydrodynamic (MHD), Hall magnetohydrodynamic (HMHD), and hybrid Vlasov–Maxwell (HVM) simulations modeling the propagation of Alfvén waves in inhomogeneous plasmas are performed. In the linear regime, the role of dispersive effects is singled out by comparing MHD and HMHD results. Fluctuations produced by phase mixing are identified as KAWs through a comparison of polarization of magnetic fluctuations and wave-group velocity with analytical linear predictions. In the nonlinear regime, a comparison of HMHD and HVM simulations allows us to point out the role of kinetic effects in shaping the proton-distribution function. We observe the generation of temperature anisotropy with respect to the local magnetic field and the production of field-aligned beams. The regions where the proton-distribution function highly departs from thermal equilibrium are located inside the shear layers, where the KAWs are excited, this suggesting that the distortions of the proton distribution are driven by a resonant interaction of protons with KAW fluctuations. Our results are relevant in configurations where magnetic-field inhomogeneities are present, as, for example, in the solar corona, where the presence of Alfvén waves has been ascertained.

Some links on this page may take you to non-federal websites. Their policies may differ from this site.