WorldWideScience

Sample records for large-scale scientific computations

  1. Large-scale computation at PSI scientific achievements and future requirements

    International Nuclear Information System (INIS)

    Adelmann, A.; Markushin, V.

    2008-11-01

    ' (SNSP-HPCN) is discussing this complex. Scientific results which are made possible by PSI's engagement at CSCS (named Horizon) are summarised and PSI's future high-performance computing requirements are evaluated. The data collected shows the current situation and a 5 year extrapolation of the users' needs with respect to HPC resources is made. In consequence this report can serve as a basis for future strategic decisions with respect to a non-existing HPC road-map for PSI. PSI's institutional HPC area started hardware-wise approximately in 1999 with the assembly of a 32-processor LINUX cluster called Merlin. Merlin was upgraded several times, lastly in 2007. The Merlin cluster at PSI is used for small scale parallel jobs, and is the only general purpose computing system at PSI. Several dedicated small scale clusters followed the Merlin scheme. Many of the clusters are used to analyse data from experiments at PSI or CERN, because dedicated clusters are most efficient. The intellectual and financial involvement of the procurement (including a machine update in 2007) results in a PSI share of 25 % of the available computing resources at CSCS. The (over) usage of available computing resources by PSI scientists is demonstrated. We actually get more computing cycles than we have paid for. The reason is the fair share policy that is implemented on the Horizon machine. This policy allows us to get cycles, with a low priority, even when our bi-monthly share is used. Five important observations can be drawn from the analysis of the scientific output and the survey of future requirements of main PSI HPC users: (1) High Performance Computing is a main pillar in many important PSI research areas; (2) there is a lack in the order of 10 times the current computing resources (measured in available core-hours per year); (3) there is a trend to use in the order of 600 processors per average production run; (4) the disk and tape storage growth is dramatic; (5) small HPC clusters located

  2. Large-scale computation at PSI scientific achievements and future requirements

    Energy Technology Data Exchange (ETDEWEB)

    Adelmann, A.; Markushin, V

    2008-11-15

    and Networking' (SNSP-HPCN) is discussing this complex. Scientific results which are made possible by PSI's engagement at CSCS (named Horizon) are summarised and PSI's future high-performance computing requirements are evaluated. The data collected shows the current situation and a 5 year extrapolation of the users' needs with respect to HPC resources is made. In consequence this report can serve as a basis for future strategic decisions with respect to a non-existing HPC road-map for PSI. PSI's institutional HPC area started hardware-wise approximately in 1999 with the assembly of a 32-processor LINUX cluster called Merlin. Merlin was upgraded several times, lastly in 2007. The Merlin cluster at PSI is used for small scale parallel jobs, and is the only general purpose computing system at PSI. Several dedicated small scale clusters followed the Merlin scheme. Many of the clusters are used to analyse data from experiments at PSI or CERN, because dedicated clusters are most efficient. The intellectual and financial involvement of the procurement (including a machine update in 2007) results in a PSI share of 25 % of the available computing resources at CSCS. The (over) usage of available computing resources by PSI scientists is demonstrated. We actually get more computing cycles than we have paid for. The reason is the fair share policy that is implemented on the Horizon machine. This policy allows us to get cycles, with a low priority, even when our bi-monthly share is used. Five important observations can be drawn from the analysis of the scientific output and the survey of future requirements of main PSI HPC users: (1) High Performance Computing is a main pillar in many important PSI research areas; (2) there is a lack in the order of 10 times the current computing resources (measured in available core-hours per year); (3) there is a trend to use in the order of 600 processors per average production run; (4) the disk and tape storage growth

  3. Parallel Computational Fluid Dynamics 2007 : Implementations and Experiences on Large Scale and Grid Computing

    CERN Document Server

    2009-01-01

    At the 19th Annual Conference on Parallel Computational Fluid Dynamics held in Antalya, Turkey, in May 2007, the most recent developments and implementations of large-scale and grid computing were presented. This book, comprised of the invited and selected papers of this conference, details those advances, which are of particular interest to CFD and CFD-related communities. It also offers the results related to applications of various scientific and engineering problems involving flows and flow-related topics. Intended for CFD researchers and graduate students, this book is a state-of-the-art presentation of the relevant methodology and implementation techniques of large-scale computing.

  4. XVIS: Visualization for the Extreme-Scale Scientific-Computation Ecosystem Final Scientific/Technical Report

    Energy Technology Data Exchange (ETDEWEB)

    Geveci, Berk [Kitware, Inc., Clifton Park, NY (United States); Maynard, Robert [Kitware, Inc., Clifton Park, NY (United States)

    2017-10-27

    The XVis project brings together the key elements of research to enable scientific discovery at extreme scale. Scientific computing will no longer be purely about how fast computations can be performed. Energy constraints, processor changes, and I/O limitations necessitate significant changes in both the software applications used in scientific computation and the ways in which scientists use them. Components for modeling, simulation, analysis, and visualization must work together in a computational ecosystem, rather than working independently as they have in the past. The XVis project brought together collaborators from predominant DOE projects for visualization on accelerators and combining their respective features into a new visualization toolkit called VTK-m.

  5. Exploiting Data Sparsity for Large-Scale Matrix Computations

    KAUST Repository

    Akbudak, Kadir

    2018-02-24

    Exploiting data sparsity in dense matrices is an algorithmic bridge between architectures that are increasingly memory-austere on a per-core basis and extreme-scale applications. The Hierarchical matrix Computations on Manycore Architectures (HiCMA) library tackles this challenging problem by achieving significant reductions in time to solution and memory footprint, while preserving a specified accuracy requirement of the application. HiCMA provides a high-performance implementation on distributed-memory systems of one of the most widely used matrix factorization in large-scale scientific applications, i.e., the Cholesky factorization. It employs the tile low-rank data format to compress the dense data-sparse off-diagonal tiles of the matrix. It then decomposes the matrix computations into interdependent tasks and relies on the dynamic runtime system StarPU for asynchronous out-of-order scheduling, while allowing high user-productivity. Performance comparisons and memory footprint on matrix dimensions up to eleven million show a performance gain and memory saving of more than an order of magnitude for both metrics on thousands of cores, against state-of-the-art open-source and vendor optimized numerical libraries. This represents an important milestone in enabling large-scale matrix computations toward solving big data problems in geospatial statistics for climate/weather forecasting applications.

  6. Exploiting Data Sparsity for Large-Scale Matrix Computations

    KAUST Repository

    Akbudak, Kadir; Ltaief, Hatem; Mikhalev, Aleksandr; Charara, Ali; Keyes, David E.

    2018-01-01

    Exploiting data sparsity in dense matrices is an algorithmic bridge between architectures that are increasingly memory-austere on a per-core basis and extreme-scale applications. The Hierarchical matrix Computations on Manycore Architectures (HiCMA) library tackles this challenging problem by achieving significant reductions in time to solution and memory footprint, while preserving a specified accuracy requirement of the application. HiCMA provides a high-performance implementation on distributed-memory systems of one of the most widely used matrix factorization in large-scale scientific applications, i.e., the Cholesky factorization. It employs the tile low-rank data format to compress the dense data-sparse off-diagonal tiles of the matrix. It then decomposes the matrix computations into interdependent tasks and relies on the dynamic runtime system StarPU for asynchronous out-of-order scheduling, while allowing high user-productivity. Performance comparisons and memory footprint on matrix dimensions up to eleven million show a performance gain and memory saving of more than an order of magnitude for both metrics on thousands of cores, against state-of-the-art open-source and vendor optimized numerical libraries. This represents an important milestone in enabling large-scale matrix computations toward solving big data problems in geospatial statistics for climate/weather forecasting applications.

  7. Large-scale computation in solid state physics - Recent developments and prospects

    International Nuclear Information System (INIS)

    DeVreese, J.T.

    1985-01-01

    During the past few years an increasing interest in large-scale computation is developing. Several initiatives were taken to evaluate and exploit the potential of ''supercomputers'' like the CRAY-1 (or XMP) or the CYBER-205. In the U.S.A., there first appeared the Lax report in 1982 and subsequently (1984) the National Science Foundation in the U.S.A. announced a program to promote large-scale computation at the universities. Also, in Europe several CRAY- and CYBER-205 systems have been installed. Although the presently available mainframes are the result of a continuous growth in speed and memory, they might have induced a discontinuous transition in the evolution of the scientific method; between theory and experiment a third methodology, ''computational science'', has become or is becoming operational

  8. Research on the Construction Management and Sustainable Development of Large-Scale Scientific Facilities in China

    Science.gov (United States)

    Guiquan, Xi; Lin, Cong; Xuehui, Jin

    2018-05-01

    As an important platform for scientific and technological development, large -scale scientific facilities are the cornerstone of technological innovation and a guarantee for economic and social development. Researching management of large-scale scientific facilities can play a key role in scientific research, sociology and key national strategy. This paper reviews the characteristics of large-scale scientific facilities, and summarizes development status of China's large-scale scientific facilities. At last, the construction, management, operation and evaluation of large-scale scientific facilities is analyzed from the perspective of sustainable development.

  9. Technologies for Large Data Management in Scientific Computing

    CERN Document Server

    Pace, A

    2014-01-01

    In recent years, intense usage of computing has been the main strategy of investigations in several scientific research projects. The progress in computing technology has opened unprecedented opportunities for systematic collection of experimental data and the associated analysis that were considered impossible only few years ago. This paper focusses on the strategies in use: it reviews the various components that are necessary for an effective solution that ensures the storage, the long term preservation, and the worldwide distribution of large quantities of data that are necessary in a large scientific research project. The paper also mentions several examples of data management solutions used in High Energy Physics for the CERN Large Hadron Collider (LHC) experiments in Geneva, Switzerland which generate more than 30,000 terabytes of data every year that need to be preserved, analyzed, and made available to a community of several tenth of thousands scientists worldwide.

  10. Large Scale Computations in Air Pollution Modelling

    DEFF Research Database (Denmark)

    Zlatev, Z.; Brandt, J.; Builtjes, P. J. H.

    Proceedings of the NATO Advanced Research Workshop on Large Scale Computations in Air Pollution Modelling, Sofia, Bulgaria, 6-10 July 1998......Proceedings of the NATO Advanced Research Workshop on Large Scale Computations in Air Pollution Modelling, Sofia, Bulgaria, 6-10 July 1998...

  11. Large scale cluster computing workshop

    International Nuclear Information System (INIS)

    Dane Skow; Alan Silverman

    2002-01-01

    Recent revolutions in computer hardware and software technologies have paved the way for the large-scale deployment of clusters of commodity computers to address problems heretofore the domain of tightly coupled SMP processors. Near term projects within High Energy Physics and other computing communities will deploy clusters of scale 1000s of processors and be used by 100s to 1000s of independent users. This will expand the reach in both dimensions by an order of magnitude from the current successful production facilities. The goals of this workshop were: (1) to determine what tools exist which can scale up to the cluster sizes foreseen for the next generation of HENP experiments (several thousand nodes) and by implication to identify areas where some investment of money or effort is likely to be needed. (2) To compare and record experimences gained with such tools. (3) To produce a practical guide to all stages of planning, installing, building and operating a large computing cluster in HENP. (4) To identify and connect groups with similar interest within HENP and the larger clustering community

  12. The Large-Scale Structure of Scientific Method

    Science.gov (United States)

    Kosso, Peter

    2009-01-01

    The standard textbook description of the nature of science describes the proposal, testing, and acceptance of a theoretical idea almost entirely in isolation from other theories. The resulting model of science is a kind of piecemeal empiricism that misses the important network structure of scientific knowledge. Only the large-scale description of…

  13. Highly parallel machines and future of scientific computing

    International Nuclear Information System (INIS)

    Singh, G.S.

    1992-01-01

    Computing requirement of large scale scientific computing has always been ahead of what state of the art hardware could supply in the form of supercomputers of the day. And for any single processor system the limit to increase in the computing power was realized a few years back itself. Now with the advent of parallel computing systems the availability of machines with the required computing power seems a reality. In this paper the author tries to visualize the future large scale scientific computing in the penultimate decade of the present century. The author summarized trends in parallel computers and emphasize the need for a better programming environment and software tools for optimal performance. The author concludes this paper with critique on parallel architectures, software tools and algorithms. (author). 10 refs., 2 tabs

  14. Large Scale Computing and Storage Requirements for Nuclear Physics Research

    Energy Technology Data Exchange (ETDEWEB)

    Gerber, Richard A.; Wasserman, Harvey J.

    2012-03-02

    IThe National Energy Research Scientific Computing Center (NERSC) is the primary computing center for the DOE Office of Science, serving approximately 4,000 users and hosting some 550 projects that involve nearly 700 codes for a wide variety of scientific disciplines. In addition to large-scale computing resources NERSC provides critical staff support and expertise to help scientists make the most efficient use of these resources to advance the scientific mission of the Office of Science. In May 2011, NERSC, DOE’s Office of Advanced Scientific Computing Research (ASCR) and DOE’s Office of Nuclear Physics (NP) held a workshop to characterize HPC requirements for NP research over the next three to five years. The effort is part of NERSC’s continuing involvement in anticipating future user needs and deploying necessary resources to meet these demands. The workshop revealed several key requirements, in addition to achieving its goal of characterizing NP computing. The key requirements include: 1. Larger allocations of computational resources at NERSC; 2. Visualization and analytics support; and 3. Support at NERSC for the unique needs of experimental nuclear physicists. This report expands upon these key points and adds others. The results are based upon representative samples, called “case studies,” of the needs of science teams within NP. The case studies were prepared by NP workshop participants and contain a summary of science goals, methods of solution, current and future computing requirements, and special software and support needs. Participants were also asked to describe their strategy for computing in the highly parallel, “multi-core” environment that is expected to dominate HPC architectures over the next few years. The report also includes a section with NERSC responses to the workshop findings. NERSC has many initiatives already underway that address key workshop findings and all of the action items are aligned with NERSC strategic plans.

  15. The TeraShake Computational Platform for Large-Scale Earthquake Simulations

    Science.gov (United States)

    Cui, Yifeng; Olsen, Kim; Chourasia, Amit; Moore, Reagan; Maechling, Philip; Jordan, Thomas

    Geoscientific and computer science researchers with the Southern California Earthquake Center (SCEC) are conducting a large-scale, physics-based, computationally demanding earthquake system science research program with the goal of developing predictive models of earthquake processes. The computational demands of this program continue to increase rapidly as these researchers seek to perform physics-based numerical simulations of earthquake processes for larger meet the needs of this research program, a multiple-institution team coordinated by SCEC has integrated several scientific codes into a numerical modeling-based research tool we call the TeraShake computational platform (TSCP). A central component in the TSCP is a highly scalable earthquake wave propagation simulation program called the TeraShake anelastic wave propagation (TS-AWP) code. In this chapter, we describe how we extended an existing, stand-alone, wellvalidated, finite-difference, anelastic wave propagation modeling code into the highly scalable and widely used TS-AWP and then integrated this code into the TeraShake computational platform that provides end-to-end (initialization to analysis) research capabilities. We also describe the techniques used to enhance the TS-AWP parallel performance on TeraGrid supercomputers, as well as the TeraShake simulations phases including input preparation, run time, data archive management, and visualization. As a result of our efforts to improve its parallel efficiency, the TS-AWP has now shown highly efficient strong scaling on over 40K processors on IBM’s BlueGene/L Watson computer. In addition, the TSCP has developed into a computational system that is useful to many members of the SCEC community for performing large-scale earthquake simulations.

  16. Advanced I/O for large-scale scientific applications

    International Nuclear Information System (INIS)

    Klasky, Scott; Schwan, Karsten; Oldfield, Ron A.; Lofstead, Gerald F. II

    2010-01-01

    As scientific simulations scale to use petascale machines and beyond, the data volumes generated pose a dual problem. First, with increasing machine sizes, the careful tuning of IO routines becomes more and more important to keep the time spent in IO acceptable. It is not uncommon, for instance, to have 20% of an application's runtime spent performing IO in a 'tuned' system. Careful management of the IO routines can move that to 5% or even less in some cases. Second, the data volumes are so large, on the order of 10s to 100s of TB, that trying to discover the scientifically valid contributions requires assistance at runtime to both organize and annotate the data. Waiting for offline processing is not feasible due both to the impact on the IO system and the time required. To reduce this load and improve the ability of scientists to use the large amounts of data being produced, new techniques for data management are required. First, there is a need for techniques for efficient movement of data from the compute space to storage. These techniques should understand the underlying system infrastructure and adapt to changing system conditions. Technologies include aggregation networks, data staging nodes for a closer parity to the IO subsystem, and autonomic IO routines that can detect system bottlenecks and choose different approaches, such as splitting the output into multiple targets, staggering output processes. Such methods must be end-to-end, meaning that even with properly managed asynchronous techniques, it is still essential to properly manage the later synchronous interaction with the storage system to maintain acceptable performance. Second, for the data being generated, annotations and other metadata must be incorporated to help the scientist understand output data for the simulation run as a whole, to select data and data features without concern for what files or other storage technologies were employed. All of these features should be attained while

  17. Large Scale Computing and Storage Requirements for Basic Energy Sciences Research

    Energy Technology Data Exchange (ETDEWEB)

    Gerber, Richard; Wasserman, Harvey

    2011-03-31

    The National Energy Research Scientific Computing Center (NERSC) is the leading scientific computing facility supporting research within the Department of Energy's Office of Science. NERSC provides high-performance computing (HPC) resources to approximately 4,000 researchers working on about 400 projects. In addition to hosting large-scale computing facilities, NERSC provides the support and expertise scientists need to effectively and efficiently use HPC systems. In February 2010, NERSC, DOE's Office of Advanced Scientific Computing Research (ASCR) and DOE's Office of Basic Energy Sciences (BES) held a workshop to characterize HPC requirements for BES research through 2013. The workshop was part of NERSC's legacy of anticipating users future needs and deploying the necessary resources to meet these demands. Workshop participants reached a consensus on several key findings, in addition to achieving the workshop's goal of collecting and characterizing computing requirements. The key requirements for scientists conducting research in BES are: (1) Larger allocations of computational resources; (2) Continued support for standard application software packages; (3) Adequate job turnaround time and throughput; and (4) Guidance and support for using future computer architectures. This report expands upon these key points and presents others. Several 'case studies' are included as significant representative samples of the needs of science teams within BES. Research teams scientific goals, computational methods of solution, current and 2013 computing requirements, and special software and support needs are summarized in these case studies. Also included are researchers strategies for computing in the highly parallel, 'multi-core' environment that is expected to dominate HPC architectures over the next few years. NERSC has strategic plans and initiatives already underway that address key workshop findings. This report includes a

  18. Computing in Large-Scale Dynamic Systems

    NARCIS (Netherlands)

    Pruteanu, A.S.

    2013-01-01

    Software applications developed for large-scale systems have always been difficult to de- velop due to problems caused by the large number of computing devices involved. Above a certain network size (roughly one hundred), necessary services such as code updating, topol- ogy discovery and data

  19. Large-scale computing with Quantum Espresso

    International Nuclear Information System (INIS)

    Giannozzi, P.; Cavazzoni, C.

    2009-01-01

    This paper gives a short introduction to Quantum Espresso: a distribution of software for atomistic simulations in condensed-matter physics, chemical physics, materials science, and to its usage in large-scale parallel computing.

  20. XVis: Visualization for the Extreme-Scale Scientific-Computation Ecosystem: Year-end report FY17.

    Energy Technology Data Exchange (ETDEWEB)

    Moreland, Kenneth D. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Pugmire, David [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Rogers, David [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Childs, Hank [Univ. of Oregon, Eugene, OR (United States); Ma, Kwan-Liu [Univ. of California, Davis, CA (United States); Geveci, Berk [Kitware, Inc., Clifton Park, NY (United States)

    2017-10-01

    The XVis project brings together the key elements of research to enable scientific discovery at extreme scale. Scientific computing will no longer be purely about how fast computations can be performed. Energy constraints, processor changes, and I/O limitations necessitate significant changes in both the software applications used in scientific computation and the ways in which scientists use them. Components for modeling, simulation, analysis, and visualization must work together in a computational ecosystem, rather than working independently as they have in the past. This project provides the necessary research and infrastructure for scientific discovery in this new computational ecosystem by addressing four interlocking challenges: emerging processor technology, in situ integration, usability, and proxy analysis.

  1. Large Scale Computing and Storage Requirements for High Energy Physics

    International Nuclear Information System (INIS)

    Gerber, Richard A.; Wasserman, Harvey

    2010-01-01

    The National Energy Research Scientific Computing Center (NERSC) is the leading scientific computing facility for the Department of Energy's Office of Science, providing high-performance computing (HPC) resources to more than 3,000 researchers working on about 400 projects. NERSC provides large-scale computing resources and, crucially, the support and expertise needed for scientists to make effective use of them. In November 2009, NERSC, DOE's Office of Advanced Scientific Computing Research (ASCR), and DOE's Office of High Energy Physics (HEP) held a workshop to characterize the HPC resources needed at NERSC to support HEP research through the next three to five years. The effort is part of NERSC's legacy of anticipating users needs and deploying resources to meet those demands. The workshop revealed several key points, in addition to achieving its goal of collecting and characterizing computing requirements. The chief findings: (1) Science teams need access to a significant increase in computational resources to meet their research goals; (2) Research teams need to be able to read, write, transfer, store online, archive, analyze, and share huge volumes of data; (3) Science teams need guidance and support to implement their codes on future architectures; and (4) Projects need predictable, rapid turnaround of their computational jobs to meet mission-critical time constraints. This report expands upon these key points and includes others. It also presents a number of case studies as representative of the research conducted within HEP. Workshop participants were asked to codify their requirements in this case study format, summarizing their science goals, methods of solution, current and three-to-five year computing requirements, and software and support needs. Participants were also asked to describe their strategy for computing in the highly parallel, multi-core environment that is expected to dominate HPC architectures over the next few years. The report includes

  2. Personalized Opportunistic Computing for CMS at Large Scale

    CERN Multimedia

    CERN. Geneva

    2015-01-01

    **Douglas Thain** is an Associate Professor of Computer Science and Engineering at the University of Notre Dame, where he designs large scale distributed computing systems to power the needs of advanced science and...

  3. Parallel Tensor Compression for Large-Scale Scientific Data.

    Energy Technology Data Exchange (ETDEWEB)

    Kolda, Tamara G. [Sandia National Lab. (SNL-CA), Livermore, CA (United States); Ballard, Grey [Sandia National Lab. (SNL-CA), Livermore, CA (United States); Austin, Woody Nathan [Univ. of Texas, Austin, TX (United States)

    2015-10-01

    As parallel computing trends towards the exascale, scientific data produced by high-fidelity simulations are growing increasingly massive. For instance, a simulation on a three-dimensional spatial grid with 512 points per dimension that tracks 64 variables per grid point for 128 time steps yields 8 TB of data. By viewing the data as a dense five way tensor, we can compute a Tucker decomposition to find inherent low-dimensional multilinear structure, achieving compression ratios of up to 10000 on real-world data sets with negligible loss in accuracy. So that we can operate on such massive data, we present the first-ever distributed memory parallel implementation for the Tucker decomposition, whose key computations correspond to parallel linear algebra operations, albeit with nonstandard data layouts. Our approach specifies a data distribution for tensors that avoids any tensor data redistribution, either locally or in parallel. We provide accompanying analysis of the computation and communication costs of the algorithms. To demonstrate the compression and accuracy of the method, we apply our approach to real-world data sets from combustion science simulations. We also provide detailed performance results, including parallel performance in both weak and strong scaling experiments.

  4. Large Scale Computing and Storage Requirements for High Energy Physics

    Energy Technology Data Exchange (ETDEWEB)

    Gerber, Richard A.; Wasserman, Harvey

    2010-11-24

    The National Energy Research Scientific Computing Center (NERSC) is the leading scientific computing facility for the Department of Energy's Office of Science, providing high-performance computing (HPC) resources to more than 3,000 researchers working on about 400 projects. NERSC provides large-scale computing resources and, crucially, the support and expertise needed for scientists to make effective use of them. In November 2009, NERSC, DOE's Office of Advanced Scientific Computing Research (ASCR), and DOE's Office of High Energy Physics (HEP) held a workshop to characterize the HPC resources needed at NERSC to support HEP research through the next three to five years. The effort is part of NERSC's legacy of anticipating users needs and deploying resources to meet those demands. The workshop revealed several key points, in addition to achieving its goal of collecting and characterizing computing requirements. The chief findings: (1) Science teams need access to a significant increase in computational resources to meet their research goals; (2) Research teams need to be able to read, write, transfer, store online, archive, analyze, and share huge volumes of data; (3) Science teams need guidance and support to implement their codes on future architectures; and (4) Projects need predictable, rapid turnaround of their computational jobs to meet mission-critical time constraints. This report expands upon these key points and includes others. It also presents a number of case studies as representative of the research conducted within HEP. Workshop participants were asked to codify their requirements in this case study format, summarizing their science goals, methods of solution, current and three-to-five year computing requirements, and software and support needs. Participants were also asked to describe their strategy for computing in the highly parallel, multi-core environment that is expected to dominate HPC architectures over the next few years

  5. Scientific Discovery through Advanced Computing in Plasma Science

    Science.gov (United States)

    Tang, William

    2005-03-01

    Advanced computing is generally recognized to be an increasingly vital tool for accelerating progress in scientific research during the 21st Century. For example, the Department of Energy's ``Scientific Discovery through Advanced Computing'' (SciDAC) Program was motivated in large measure by the fact that formidable scientific challenges in its research portfolio could best be addressed by utilizing the combination of the rapid advances in super-computing technology together with the emergence of effective new algorithms and computational methodologies. The imperative is to translate such progress into corresponding increases in the performance of the scientific codes used to model complex physical systems such as those encountered in high temperature plasma research. If properly validated against experimental measurements and analytic benchmarks, these codes can provide reliable predictive capability for the behavior of a broad range of complex natural and engineered systems. This talk reviews recent progress and future directions for advanced simulations with some illustrative examples taken from the plasma science applications area. Significant recent progress has been made in both particle and fluid simulations of fine-scale turbulence and large-scale dynamics, giving increasingly good agreement between experimental observations and computational modeling. This was made possible by the combination of access to powerful new computational resources together with innovative advances in analytic and computational methods for developing reduced descriptions of physics phenomena spanning a huge range in time and space scales. In particular, the plasma science community has made excellent progress in developing advanced codes for which computer run-time and problem size scale well with the number of processors on massively parallel machines (MPP's). A good example is the effective usage of the full power of multi-teraflop (multi-trillion floating point computations

  6. A large-scale computer facility for computational aerodynamics

    International Nuclear Information System (INIS)

    Bailey, F.R.; Balhaus, W.F.

    1985-01-01

    The combination of computer system technology and numerical modeling have advanced to the point that computational aerodynamics has emerged as an essential element in aerospace vehicle design methodology. To provide for further advances in modeling of aerodynamic flow fields, NASA has initiated at the Ames Research Center the Numerical Aerodynamic Simulation (NAS) Program. The objective of the Program is to develop a leading-edge, large-scale computer facility, and make it available to NASA, DoD, other Government agencies, industry and universities as a necessary element in ensuring continuing leadership in computational aerodynamics and related disciplines. The Program will establish an initial operational capability in 1986 and systematically enhance that capability by incorporating evolving improvements in state-of-the-art computer system technologies as required to maintain a leadership role. This paper briefly reviews the present and future requirements for computational aerodynamics and discusses the Numerical Aerodynamic Simulation Program objectives, computational goals, and implementation plans

  7. Software Engineering for Scientific Computer Simulations

    Science.gov (United States)

    Post, Douglass E.; Henderson, Dale B.; Kendall, Richard P.; Whitney, Earl M.

    2004-11-01

    Computer simulation is becoming a very powerful tool for analyzing and predicting the performance of fusion experiments. Simulation efforts are evolving from including only a few effects to many effects, from small teams with a few people to large teams, and from workstations and small processor count parallel computers to massively parallel platforms. Successfully making this transition requires attention to software engineering issues. We report on the conclusions drawn from a number of case studies of large scale scientific computing projects within DOE, academia and the DoD. The major lessons learned include attention to sound project management including setting reasonable and achievable requirements, building a good code team, enforcing customer focus, carrying out verification and validation and selecting the optimum computational mathematics approaches.

  8. A Web-based Distributed Voluntary Computing Platform for Large Scale Hydrological Computations

    Science.gov (United States)

    Demir, I.; Agliamzanov, R.

    2014-12-01

    Distributed volunteer computing can enable researchers and scientist to form large parallel computing environments to utilize the computing power of the millions of computers on the Internet, and use them towards running large scale environmental simulations and models to serve the common good of local communities and the world. Recent developments in web technologies and standards allow client-side scripting languages to run at speeds close to native application, and utilize the power of Graphics Processing Units (GPU). Using a client-side scripting language like JavaScript, we have developed an open distributed computing framework that makes it easy for researchers to write their own hydrologic models, and run them on volunteer computers. Users will easily enable their websites for visitors to volunteer sharing their computer resources to contribute running advanced hydrological models and simulations. Using a web-based system allows users to start volunteering their computational resources within seconds without installing any software. The framework distributes the model simulation to thousands of nodes in small spatial and computational sizes. A relational database system is utilized for managing data connections and queue management for the distributed computing nodes. In this paper, we present a web-based distributed volunteer computing platform to enable large scale hydrological simulations and model runs in an open and integrated environment.

  9. Learning from large scale neural simulations

    DEFF Research Database (Denmark)

    Serban, Maria

    2017-01-01

    Large-scale neural simulations have the marks of a distinct methodology which can be fruitfully deployed to advance scientific understanding of the human brain. Computer simulation studies can be used to produce surrogate observational data for better conceptual models and new how...

  10. XVis: Visualization for the Extreme-Scale Scientific-Computation Ecosystem: Year-end report FY15 Q4.

    Energy Technology Data Exchange (ETDEWEB)

    Moreland, Kenneth D. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Sewell, Christopher [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Childs, Hank [Univ. of Oregon, Eugene, OR (United States); Ma, Kwan-Liu [Univ. of California, Davis, CA (United States); Geveci, Berk [Kitware, Inc., Clifton Park, NY (United States); Meredith, Jeremy [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)

    2015-12-01

    The XVis project brings together the key elements of research to enable scientific discovery at extreme scale. Scientific computing will no longer be purely about how fast computations can be performed. Energy constraints, processor changes, and I/O limitations necessitate significant changes in both the software applications used in scientific computation and the ways in which scientists use them. Components for modeling, simulation, analysis, and visualization must work together in a computational ecosystem, rather than working independently as they have in the past. This project provides the necessary research and infrastructure for scientific discovery in this new computational ecosystem by addressing four interlocking challenges: emerging processor technology, in situ integration, usability, and proxy analysis.

  11. XVis: Visualization for the Extreme-Scale Scientific-Computation Ecosystem: Mid-year report FY17 Q2

    Energy Technology Data Exchange (ETDEWEB)

    Moreland, Kenneth D. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Pugmire, David [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Rogers, David [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Childs, Hank [Univ. of Oregon, Eugene, OR (United States); Ma, Kwan-Liu [Univ. of California, Davis, CA (United States); Geveci, Berk [Kitware Inc., Clifton Park, NY (United States)

    2017-05-01

    The XVis project brings together the key elements of research to enable scientific discovery at extreme scale. Scientific computing will no longer be purely about how fast computations can be performed. Energy constraints, processor changes, and I/O limitations necessitate significant changes in both the software applications used in scientific computation and the ways in which scientists use them. Components for modeling, simulation, analysis, and visualization must work together in a computational ecosystem, rather than working independently as they have in the past. This project provides the necessary research and infrastructure for scientific discovery in this new computational ecosystem by addressing four interlocking challenges: emerging processor technology, in situ integration, usability, and proxy analysis.

  12. XVis: Visualization for the Extreme-Scale Scientific-Computation Ecosystem. Mid-year report FY16 Q2

    Energy Technology Data Exchange (ETDEWEB)

    Moreland, Kenneth D.; Sewell, Christopher (LANL); Childs, Hank (U of Oregon); Ma, Kwan-Liu (UC Davis); Geveci, Berk (Kitware); Meredith, Jeremy (ORNL)

    2016-05-01

    The XVis project brings together the key elements of research to enable scientific discovery at extreme scale. Scientific computing will no longer be purely about how fast computations can be performed. Energy constraints, processor changes, and I/O limitations necessitate significant changes in both the software applications used in scientific computation and the ways in which scientists use them. Components for modeling, simulation, analysis, and visualization must work together in a computational ecosystem, rather than working independently as they have in the past. This project provides the necessary research and infrastructure for scientific discovery in this new computational ecosystem by addressing four interlocking challenges: emerging processor technology, in situ integration, usability, and proxy analysis.

  13. Large-Scale Assessment, Rationality, and Scientific Management: The Case of No Child Left Behind

    Science.gov (United States)

    Roach, Andrew T.; Frank, Jennifer

    2007-01-01

    This article examines the ways in which NCLB and the movement towards large-scale assessment systems are based on Weber's concept of formal rationality and tradition of scientific management. Building on these ideas, the authors use Ritzer's McDonaldization thesis to examine some of the core features of large-scale assessment and accountability…

  14. Fast Simulation of Large-Scale Floods Based on GPU Parallel Computing

    OpenAIRE

    Qiang Liu; Yi Qin; Guodong Li

    2018-01-01

    Computing speed is a significant issue of large-scale flood simulations for real-time response to disaster prevention and mitigation. Even today, most of the large-scale flood simulations are generally run on supercomputers due to the massive amounts of data and computations necessary. In this work, a two-dimensional shallow water model based on an unstructured Godunov-type finite volume scheme was proposed for flood simulation. To realize a fast simulation of large-scale floods on a personal...

  15. Comparing the performance of SIMD computers by running large air pollution models

    DEFF Research Database (Denmark)

    Brown, J.; Hansen, Per Christian; Wasniewski, J.

    1996-01-01

    To compare the performance and use of three massively parallel SIMD computers, we implemented a large air pollution model on these computers. Using a realistic large-scale model, we gained detailed insight about the performance of the computers involved when used to solve large-scale scientific...... problems that involve several types of numerical computations. The computers used in our study are the Connection Machines CM-200 and CM-5, and the MasPar MP-2216...

  16. Topic 14+16: High-performance and scientific applications and extreme-scale computing (Introduction)

    KAUST Repository

    Downes, Turlough P.

    2013-01-01

    As our understanding of the world around us increases it becomes more challenging to make use of what we already know, and to increase our understanding still further. Computational modeling and simulation have become critical tools in addressing this challenge. The requirements of high-resolution, accurate modeling have outstripped the ability of desktop computers and even small clusters to provide the necessary compute power. Many applications in the scientific and engineering domains now need very large amounts of compute time, while other applications, particularly in the life sciences, frequently have large data I/O requirements. There is thus a growing need for a range of high performance applications which can utilize parallel compute systems effectively, which have efficient data handling strategies and which have the capacity to utilise current and future systems. The High Performance and Scientific Applications topic aims to highlight recent progress in the use of advanced computing and algorithms to address the varied, complex and increasing challenges of modern research throughout both the "hard" and "soft" sciences. This necessitates being able to use large numbers of compute nodes, many of which are equipped with accelerators, and to deal with difficult I/O requirements. © 2013 Springer-Verlag.

  17. A Combined Ethical and Scientific Analysis of Large-scale Tests of Solar Climate Engineering

    Science.gov (United States)

    Ackerman, T. P.

    2017-12-01

    Our research group recently published an analysis of the combined ethical and scientific issues surrounding large-scale testing of stratospheric aerosol injection (SAI; Lenferna et al., 2017, Earth's Future). We are expanding this study in two directions. The first is extending this same analysis to other geoengineering techniques, particularly marine cloud brightening (MCB). MCB has substantial differences to SAI in this context because MCB can be tested over significantly smaller areas of the planet and, following injection, has a much shorter lifetime of weeks as opposed to years for SAI. We examine issues such as the role of intent, the lesser of two evils, and the nature of consent. In addition, several groups are currently considering climate engineering governance tools such as a code of ethics and a registry. We examine how these tools might influence climate engineering research programs and, specifically, large-scale testing. The second direction of expansion is asking whether ethical and scientific issues associated with large-scale testing are so significant that they effectively preclude moving ahead with climate engineering research and testing. Some previous authors have suggested that no research should take place until these issues are resolved. We think this position is too draconian and consider a more nuanced version of this argument. We note, however, that there are serious questions regarding the ability of the scientific research community to move to the point of carrying out large-scale tests.

  18. Fast Simulation of Large-Scale Floods Based on GPU Parallel Computing

    Directory of Open Access Journals (Sweden)

    Qiang Liu

    2018-05-01

    Full Text Available Computing speed is a significant issue of large-scale flood simulations for real-time response to disaster prevention and mitigation. Even today, most of the large-scale flood simulations are generally run on supercomputers due to the massive amounts of data and computations necessary. In this work, a two-dimensional shallow water model based on an unstructured Godunov-type finite volume scheme was proposed for flood simulation. To realize a fast simulation of large-scale floods on a personal computer, a Graphics Processing Unit (GPU-based, high-performance computing method using the OpenACC application was adopted to parallelize the shallow water model. An unstructured data management method was presented to control the data transportation between the GPU and CPU (Central Processing Unit with minimum overhead, and then both computation and data were offloaded from the CPU to the GPU, which exploited the computational capability of the GPU as much as possible. The parallel model was validated using various benchmarks and real-world case studies. The results demonstrate that speed-ups of up to one order of magnitude can be achieved in comparison with the serial model. The proposed parallel model provides a fast and reliable tool with which to quickly assess flood hazards in large-scale areas and, thus, has a bright application prospect for dynamic inundation risk identification and disaster assessment.

  19. Extreme Scale Computing for First-Principles Plasma Physics Research

    Energy Technology Data Exchange (ETDEWEB)

    Chang, Choogn-Seock [Princeton University

    2011-10-12

    World superpowers are in the middle of the “Computnik” race. US Department of Energy (and National Nuclear Security Administration) wishes to launch exascale computer systems into the scientific (and national security) world by 2018. The objective is to solve important scientific problems and to predict the outcomes using the most fundamental scientific laws, which would not be possible otherwise. Being chosen into the next “frontier” group can be of great benefit to a scientific discipline. An extreme scale computer system requires different types of algorithms and programming philosophy from those we have been accustomed to. Only a handful of scientific codes are blessed to be capable of scalable usage of today’s largest computers in operation at petascale (using more than 100,000 cores concurrently). Fortunately, a few magnetic fusion codes are competing well in this race using the “first principles” gyrokinetic equations.These codes are beginning to study the fusion plasma dynamics in full-scale realistic diverted device geometry in natural nonlinear multiscale, including the large scale neoclassical and small scale turbulence physics, but excluding some ultra fast dynamics. In this talk, most of the above mentioned topics will be introduced at executive level. Representative properties of the extreme scale computers, modern programming exercises to take advantage of them, and different philosophies in the data flows and analyses will be presented. Examples of the multi-scale multi-physics scientific discoveries made possible by solving the gyrokinetic equations on extreme scale computers will be described. Future directions into “virtual tokamak experiments” will also be discussed.

  20. Large-scale computing techniques for complex system simulations

    CERN Document Server

    Dubitzky, Werner; Schott, Bernard

    2012-01-01

    Complex systems modeling and simulation approaches are being adopted in a growing number of sectors, including finance, economics, biology, astronomy, and many more. Technologies ranging from distributed computing to specialized hardware are explored and developed to address the computational requirements arising in complex systems simulations. The aim of this book is to present a representative overview of contemporary large-scale computing technologies in the context of complex systems simulations applications. The intention is to identify new research directions in this field and

  1. Compiler Technology for Parallel Scientific Computation

    Directory of Open Access Journals (Sweden)

    Can Özturan

    1994-01-01

    Full Text Available There is a need for compiler technology that, given the source program, will generate efficient parallel codes for different architectures with minimal user involvement. Parallel computation is becoming indispensable in solving large-scale problems in science and engineering. Yet, the use of parallel computation is limited by the high costs of developing the needed software. To overcome this difficulty we advocate a comprehensive approach to the development of scalable architecture-independent software for scientific computation based on our experience with equational programming language (EPL. Our approach is based on a program decomposition, parallel code synthesis, and run-time support for parallel scientific computation. The program decomposition is guided by the source program annotations provided by the user. The synthesis of parallel code is based on configurations that describe the overall computation as a set of interacting components. Run-time support is provided by the compiler-generated code that redistributes computation and data during object program execution. The generated parallel code is optimized using techniques of data alignment, operator placement, wavefront determination, and memory optimization. In this article we discuss annotations, configurations, parallel code generation, and run-time support suitable for parallel programs written in the functional parallel programming language EPL and in Fortran.

  2. Computing the universe: how large-scale simulations illuminate galaxies and dark energy

    Science.gov (United States)

    O'Shea, Brian

    2015-04-01

    High-performance and large-scale computing is absolutely to understanding astronomical objects such as stars, galaxies, and the cosmic web. This is because these are structures that operate on physical, temporal, and energy scales that cannot be reasonably approximated in the laboratory, and whose complexity and nonlinearity often defies analytic modeling. In this talk, I show how the growth of computing platforms over time has facilitated our understanding of astrophysical and cosmological phenomena, focusing primarily on galaxies and large-scale structure in the Universe.

  3. Proceedings of the meeting on large scale computer simulation research

    International Nuclear Information System (INIS)

    2004-04-01

    The meeting to summarize the collaboration activities for FY2003 on the Large Scale Computer Simulation Research was held January 15-16, 2004 at Theory and Computer Simulation Research Center, National Institute for Fusion Science. Recent simulation results, methodologies and other related topics were presented. (author)

  4. Performance modeling of hybrid MPI/OpenMP scientific applications on large-scale multicore supercomputers

    KAUST Repository

    Wu, Xingfu; Taylor, Valerie

    2013-01-01

    In this paper, we present a performance modeling framework based on memory bandwidth contention time and a parameterized communication model to predict the performance of OpenMP, MPI and hybrid applications with weak scaling on three large-scale multicore supercomputers: IBM POWER4, POWER5+ and BlueGene/P, and analyze the performance of these MPI, OpenMP and hybrid applications. We use STREAM memory benchmarks and Intel's MPI benchmarks to provide initial performance analysis and model validation of MPI and OpenMP applications on these multicore supercomputers because the measured sustained memory bandwidth can provide insight into the memory bandwidth that a system should sustain on scientific applications with the same amount of workload per core. In addition to using these benchmarks, we also use a weak-scaling hybrid MPI/OpenMP large-scale scientific application: Gyrokinetic Toroidal Code (GTC) in magnetic fusion to validate our performance model of the hybrid application on these multicore supercomputers. The validation results for our performance modeling method show less than 7.77% error rate in predicting the performance of hybrid MPI/OpenMP GTC on up to 512 cores on these multicore supercomputers. © 2013 Elsevier Inc.

  5. Performance modeling of hybrid MPI/OpenMP scientific applications on large-scale multicore supercomputers

    KAUST Repository

    Wu, Xingfu

    2013-12-01

    In this paper, we present a performance modeling framework based on memory bandwidth contention time and a parameterized communication model to predict the performance of OpenMP, MPI and hybrid applications with weak scaling on three large-scale multicore supercomputers: IBM POWER4, POWER5+ and BlueGene/P, and analyze the performance of these MPI, OpenMP and hybrid applications. We use STREAM memory benchmarks and Intel\\'s MPI benchmarks to provide initial performance analysis and model validation of MPI and OpenMP applications on these multicore supercomputers because the measured sustained memory bandwidth can provide insight into the memory bandwidth that a system should sustain on scientific applications with the same amount of workload per core. In addition to using these benchmarks, we also use a weak-scaling hybrid MPI/OpenMP large-scale scientific application: Gyrokinetic Toroidal Code (GTC) in magnetic fusion to validate our performance model of the hybrid application on these multicore supercomputers. The validation results for our performance modeling method show less than 7.77% error rate in predicting the performance of hybrid MPI/OpenMP GTC on up to 512 cores on these multicore supercomputers. © 2013 Elsevier Inc.

  6. Large scale computing in theoretical physics: Example QCD

    International Nuclear Information System (INIS)

    Schilling, K.

    1986-01-01

    The limitations of the classical mathematical analysis of Newton and Leibniz appear to be more and more overcome by the power of modern computers. Large scale computing techniques - which resemble closely the methods used in simulations within statistical mechanics - allow to treat nonlinear systems with many degrees of freedom such as field theories in nonperturbative situations, where analytical methods do fail. The computation of the hadron spectrum within the framework of lattice QCD sets a demanding goal for the application of supercomputers in basic science. It requires both big computer capacities and clever algorithms to fight all the numerical evils that one encounters in the Euclidean world. The talk will attempt to describe both the computer aspects and the present state of the art of spectrum calculations within lattice QCD. (orig.)

  7. Direct Computation of Sound Radiation by Jet Flow Using Large-scale Equations

    Science.gov (United States)

    Mankbadi, R. R.; Shih, S. H.; Hixon, D. R.; Povinelli, L. A.

    1995-01-01

    Jet noise is directly predicted using large-scale equations. The computational domain is extended in order to directly capture the radiated field. As in conventional large-eddy-simulations, the effect of the unresolved scales on the resolved ones is accounted for. Special attention is given to boundary treatment to avoid spurious modes that can render the computed fluctuations totally unacceptable. Results are presented for a supersonic jet at Mach number 2.1.

  8. Large-scale theoretical calculations in molecular science - design of a large computer system for molecular science and necessary conditions for future computers

    Energy Technology Data Exchange (ETDEWEB)

    Kashiwagi, H [Institute for Molecular Science, Okazaki, Aichi (Japan)

    1982-06-01

    A large computer system was designed and established for molecular science under the leadership of molecular scientists. Features of the computer system are an automated operation system and an open self-service system. Large-scale theoretical calculations have been performed to solve many problems in molecular science, using the computer system. Necessary conditions for future computers are discussed on the basis of this experience.

  9. Large-scale theoretical calculations in molecular science - design of a large computer system for molecular science and necessary conditions for future computers

    International Nuclear Information System (INIS)

    Kashiwagi, H.

    1982-01-01

    A large computer system was designed and established for molecular science under the leadership of molecular scientists. Features of the computer system are an automated operation system and an open self-service system. Large-scale theoretical calculations have been performed to solve many problems in molecular science, using the computer system. Necessary conditions for future computers are discussed on the basis of this experience. (orig.)

  10. Large scale particle simulations in a virtual memory computer

    International Nuclear Information System (INIS)

    Gray, P.C.; Million, R.; Wagner, J.S.; Tajima, T.

    1983-01-01

    Virtual memory computers are capable of executing large-scale particle simulations even when the memory requirements exceeds the computer core size. The required address space is automatically mapped onto slow disc memory the the operating system. When the simulation size is very large, frequent random accesses to slow memory occur during the charge accumulation and particle pushing processes. Assesses to slow memory significantly reduce the excecution rate of the simulation. We demonstrate in this paper that with the proper choice of sorting algorithm, a nominal amount of sorting to keep physically adjacent particles near particles with neighboring array indices can reduce random access to slow memory, increase the efficiency of the I/O system, and hence, reduce the required computing time. (orig.)

  11. Large-scale particle simulations in a virtual-memory computer

    International Nuclear Information System (INIS)

    Gray, P.C.; Wagner, J.S.; Tajima, T.; Million, R.

    1982-08-01

    Virtual memory computers are capable of executing large-scale particle simulations even when the memory requirements exceed the computer core size. The required address space is automatically mapped onto slow disc memory by the operating system. When the simulation size is very large, frequent random accesses to slow memory occur during the charge accumulation and particle pushing processes. Accesses to slow memory significantly reduce the execution rate of the simulation. We demonstrate in this paper that with the proper choice of sorting algorithm, a nominal amount of sorting to keep physically adjacent particles near particles with neighboring array indices can reduce random access to slow memory, increase the efficiency of the I/O system, and hence, reduce the required computing time

  12. The Convergence of High Performance Computing and Large Scale Data Analytics

    Science.gov (United States)

    Duffy, D.; Bowen, M. K.; Thompson, J. H.; Yang, C. P.; Hu, F.; Wills, B.

    2015-12-01

    As the combinations of remote sensing observations and model outputs have grown, scientists are increasingly burdened with both the necessity and complexity of large-scale data analysis. Scientists are increasingly applying traditional high performance computing (HPC) solutions to solve their "Big Data" problems. While this approach has the benefit of limiting data movement, the HPC system is not optimized to run analytics, which can create problems that permeate throughout the HPC environment. To solve these issues and to alleviate some of the strain on the HPC environment, the NASA Center for Climate Simulation (NCCS) has created the Advanced Data Analytics Platform (ADAPT), which combines both HPC and cloud technologies to create an agile system designed for analytics. Large, commonly used data sets are stored in this system in a write once/read many file system, such as Landsat, MODIS, MERRA, and NGA. High performance virtual machines are deployed and scaled according to the individual scientist's requirements specifically for data analysis. On the software side, the NCCS and GMU are working with emerging commercial technologies and applying them to structured, binary scientific data in order to expose the data in new ways. Native NetCDF data is being stored within a Hadoop Distributed File System (HDFS) enabling storage-proximal processing through MapReduce while continuing to provide accessibility of the data to traditional applications. Once the data is stored within HDFS, an additional indexing scheme is built on top of the data and placed into a relational database. This spatiotemporal index enables extremely fast mappings of queries to data locations to dramatically speed up analytics. These are some of the first steps toward a single unified platform that optimizes for both HPC and large-scale data analysis, and this presentation will elucidate the resulting and necessary exascale architectures required for future systems.

  13. Large Spatial Scale Ground Displacement Mapping through the P-SBAS Processing of Sentinel-1 Data on a Cloud Computing Environment

    Science.gov (United States)

    Casu, F.; Bonano, M.; de Luca, C.; Lanari, R.; Manunta, M.; Manzo, M.; Zinno, I.

    2017-12-01

    Since its launch in 2014, the Sentinel-1 (S1) constellation has played a key role on SAR data availability and dissemination all over the World. Indeed, the free and open access data policy adopted by the European Copernicus program together with the global coverage acquisition strategy, make the Sentinel constellation as a game changer in the Earth Observation scenario. Being the SAR data become ubiquitous, the technological and scientific challenge is focused on maximizing the exploitation of such huge data flow. In this direction, the use of innovative processing algorithms and distributed computing infrastructures, such as the Cloud Computing platforms, can play a crucial role. In this work we present a Cloud Computing solution for the advanced interferometric (DInSAR) processing chain based on the Parallel SBAS (P-SBAS) approach, aimed at processing S1 Interferometric Wide Swath (IWS) data for the generation of large spatial scale deformation time series in efficient, automatic and systematic way. Such a DInSAR chain ingests Sentinel 1 SLC images and carries out several processing steps, to finally compute deformation time series and mean deformation velocity maps. Different parallel strategies have been designed ad hoc for each processing step of the P-SBAS S1 chain, encompassing both multi-core and multi-node programming techniques, in order to maximize the computational efficiency achieved within a Cloud Computing environment and cut down the relevant processing times. The presented P-SBAS S1 processing chain has been implemented on the Amazon Web Services platform and a thorough analysis of the attained parallel performances has been performed to identify and overcome the major bottlenecks to the scalability. The presented approach is used to perform national-scale DInSAR analyses over Italy, involving the processing of more than 3000 S1 IWS images acquired from both ascending and descending orbits. Such an experiment confirms the big advantage of

  14. Performance Modeling of Hybrid MPI/OpenMP Scientific Applications on Large-scale Multicore Cluster Systems

    KAUST Repository

    Wu, Xingfu; Taylor, Valerie

    2011-01-01

    In this paper, we present a performance modeling framework based on memory bandwidth contention time and a parameterized communication model to predict the performance of OpenMP, MPI and hybrid applications with weak scaling on three large-scale multicore clusters: IBM POWER4, POWER5+ and Blue Gene/P, and analyze the performance of these MPI, OpenMP and hybrid applications. We use STREAM memory benchmarks to provide initial performance analysis and model validation of MPI and OpenMP applications on these multicore clusters because the measured sustained memory bandwidth can provide insight into the memory bandwidth that a system should sustain on scientific applications with the same amount of workload per core. In addition to using these benchmarks, we also use a weak-scaling hybrid MPI/OpenMP large-scale scientific application: Gyro kinetic Toroidal Code in magnetic fusion to validate our performance model of the hybrid application on these multicore clusters. The validation results for our performance modeling method show less than 7.77% error rate in predicting the performance of hybrid MPI/OpenMP GTC on up to 512 cores on these multicore clusters. © 2011 IEEE.

  15. Performance Modeling of Hybrid MPI/OpenMP Scientific Applications on Large-scale Multicore Cluster Systems

    KAUST Repository

    Wu, Xingfu

    2011-08-01

    In this paper, we present a performance modeling framework based on memory bandwidth contention time and a parameterized communication model to predict the performance of OpenMP, MPI and hybrid applications with weak scaling on three large-scale multicore clusters: IBM POWER4, POWER5+ and Blue Gene/P, and analyze the performance of these MPI, OpenMP and hybrid applications. We use STREAM memory benchmarks to provide initial performance analysis and model validation of MPI and OpenMP applications on these multicore clusters because the measured sustained memory bandwidth can provide insight into the memory bandwidth that a system should sustain on scientific applications with the same amount of workload per core. In addition to using these benchmarks, we also use a weak-scaling hybrid MPI/OpenMP large-scale scientific application: Gyro kinetic Toroidal Code in magnetic fusion to validate our performance model of the hybrid application on these multicore clusters. The validation results for our performance modeling method show less than 7.77% error rate in predicting the performance of hybrid MPI/OpenMP GTC on up to 512 cores on these multicore clusters. © 2011 IEEE.

  16. Do large-scale assessments measure students' ability to integrate scientific knowledge?

    Science.gov (United States)

    Lee, Hee-Sun

    2010-03-01

    Large-scale assessments are used as means to diagnose the current status of student achievement in science and compare students across schools, states, and countries. For efficiency, multiple-choice items and dichotomously-scored open-ended items are pervasively used in large-scale assessments such as Trends in International Math and Science Study (TIMSS). This study investigated how well these items measure secondary school students' ability to integrate scientific knowledge. This study collected responses of 8400 students to 116 multiple-choice and 84 open-ended items and applied an Item Response Theory analysis based on the Rasch Partial Credit Model. Results indicate that most multiple-choice items and dichotomously-scored open-ended items can be used to determine whether students have normative ideas about science topics, but cannot measure whether students integrate multiple pieces of relevant science ideas. Only when the scoring rubric is redesigned to capture subtle nuances of student open-ended responses, open-ended items become a valid and reliable tool to assess students' knowledge integration ability.

  17. Optimization and large scale computation of an entropy-based moment closure

    Science.gov (United States)

    Kristopher Garrett, C.; Hauck, Cory; Hill, Judith

    2015-12-01

    We present computational advances and results in the implementation of an entropy-based moment closure, MN, in the context of linear kinetic equations, with an emphasis on heterogeneous and large-scale computing platforms. Entropy-based closures are known in several cases to yield more accurate results than closures based on standard spectral approximations, such as PN, but the computational cost is generally much higher and often prohibitive. Several optimizations are introduced to improve the performance of entropy-based algorithms over previous implementations. These optimizations include the use of GPU acceleration and the exploitation of the mathematical properties of spherical harmonics, which are used as test functions in the moment formulation. To test the emerging high-performance computing paradigm of communication bound simulations, we present timing results at the largest computational scales currently available. These results show, in particular, load balancing issues in scaling the MN algorithm that do not appear for the PN algorithm. We also observe that in weak scaling tests, the ratio in time to solution of MN to PN decreases.

  18. Exploring HPCS languages in scientific computing

    International Nuclear Information System (INIS)

    Barrett, R F; Alam, S R; Almeida, V F d; Bernholdt, D E; Elwasif, W R; Kuehn, J A; Poole, S W; Shet, A G

    2008-01-01

    As computers scale up dramatically to tens and hundreds of thousands of cores, develop deeper computational and memory hierarchies, and increased heterogeneity, developers of scientific software are increasingly challenged to express complex parallel simulations effectively and efficiently. In this paper, we explore the three languages developed under the DARPA High-Productivity Computing Systems (HPCS) program to help address these concerns: Chapel, Fortress, and X10. These languages provide a variety of features not found in currently popular HPC programming environments and make it easier to express powerful computational constructs, leading to new ways of thinking about parallel programming. Though the languages and their implementations are not yet mature enough for a comprehensive evaluation, we discuss some of the important features, and provide examples of how they can be used in scientific computing. We believe that these characteristics will be important to the future of high-performance scientific computing, whether the ultimate language of choice is one of the HPCS languages or something else

  19. Exploring HPCS languages in scientific computing

    Science.gov (United States)

    Barrett, R. F.; Alam, S. R.; Almeida, V. F. d.; Bernholdt, D. E.; Elwasif, W. R.; Kuehn, J. A.; Poole, S. W.; Shet, A. G.

    2008-07-01

    As computers scale up dramatically to tens and hundreds of thousands of cores, develop deeper computational and memory hierarchies, and increased heterogeneity, developers of scientific software are increasingly challenged to express complex parallel simulations effectively and efficiently. In this paper, we explore the three languages developed under the DARPA High-Productivity Computing Systems (HPCS) program to help address these concerns: Chapel, Fortress, and X10. These languages provide a variety of features not found in currently popular HPC programming environments and make it easier to express powerful computational constructs, leading to new ways of thinking about parallel programming. Though the languages and their implementations are not yet mature enough for a comprehensive evaluation, we discuss some of the important features, and provide examples of how they can be used in scientific computing. We believe that these characteristics will be important to the future of high-performance scientific computing, whether the ultimate language of choice is one of the HPCS languages or something else.

  20. Large scale electronic structure calculations in the study of the condensed phase

    NARCIS (Netherlands)

    van Dam, H.J.J.; Guest, M.F.; Sherwood, P.; Thomas, J.M.H.; van Lenthe, J.H.; van Lingen, J.N.J.; Bailey, C.L.; Bush, I.J.

    2006-01-01

    We consider the role that large-scale electronic structure computations can now play in the modelling of the condensed phase. To structure our analysis, we consider four distict ways in which today's scientific targets can be re-scoped to take advantage of advances in computing resources: 1. time to

  1. High-Resiliency and Auto-Scaling of Large-Scale Cloud Computing for OCO-2 L2 Full Physics Processing

    Science.gov (United States)

    Hua, H.; Manipon, G.; Starch, M.; Dang, L. B.; Southam, P.; Wilson, B. D.; Avis, C.; Chang, A.; Cheng, C.; Smyth, M.; McDuffie, J. L.; Ramirez, P.

    2015-12-01

    Next generation science data systems are needed to address the incoming flood of data from new missions such as SWOT and NISAR where data volumes and data throughput rates are order of magnitude larger than present day missions. Additionally, traditional means of procuring hardware on-premise are already limited due to facilities capacity constraints for these new missions. Existing missions, such as OCO-2, may also require high turn-around time for processing different science scenarios where on-premise and even traditional HPC computing environments may not meet the high processing needs. We present our experiences on deploying a hybrid-cloud computing science data system (HySDS) for the OCO-2 Science Computing Facility to support large-scale processing of their Level-2 full physics data products. We will explore optimization approaches to getting best performance out of hybrid-cloud computing as well as common issues that will arise when dealing with large-scale computing. Novel approaches were utilized to do processing on Amazon's spot market, which can potentially offer ~10X costs savings but with an unpredictable computing environment based on market forces. We will present how we enabled high-tolerance computing in order to achieve large-scale computing as well as operational cost savings.

  2. NASA's Information Power Grid: Large Scale Distributed Computing and Data Management

    Science.gov (United States)

    Johnston, William E.; Vaziri, Arsi; Hinke, Tom; Tanner, Leigh Ann; Feiereisen, William J.; Thigpen, William; Tang, Harry (Technical Monitor)

    2001-01-01

    Large-scale science and engineering are done through the interaction of people, heterogeneous computing resources, information systems, and instruments, all of which are geographically and organizationally dispersed. The overall motivation for Grids is to facilitate the routine interactions of these resources in order to support large-scale science and engineering. Multi-disciplinary simulations provide a good example of a class of applications that are very likely to require aggregation of widely distributed computing, data, and intellectual resources. Such simulations - e.g. whole system aircraft simulation and whole system living cell simulation - require integrating applications and data that are developed by different teams of researchers frequently in different locations. The research team's are the only ones that have the expertise to maintain and improve the simulation code and/or the body of experimental data that drives the simulations. This results in an inherently distributed computing and data management environment.

  3. Towards Portable Large-Scale Image Processing with High-Performance Computing.

    Science.gov (United States)

    Huo, Yuankai; Blaber, Justin; Damon, Stephen M; Boyd, Brian D; Bao, Shunxing; Parvathaneni, Prasanna; Noguera, Camilo Bermudez; Chaganti, Shikha; Nath, Vishwesh; Greer, Jasmine M; Lyu, Ilwoo; French, William R; Newton, Allen T; Rogers, Baxter P; Landman, Bennett A

    2018-05-03

    High-throughput, large-scale medical image computing demands tight integration of high-performance computing (HPC) infrastructure for data storage, job distribution, and image processing. The Vanderbilt University Institute for Imaging Science (VUIIS) Center for Computational Imaging (CCI) has constructed a large-scale image storage and processing infrastructure that is composed of (1) a large-scale image database using the eXtensible Neuroimaging Archive Toolkit (XNAT), (2) a content-aware job scheduling platform using the Distributed Automation for XNAT pipeline automation tool (DAX), and (3) a wide variety of encapsulated image processing pipelines called "spiders." The VUIIS CCI medical image data storage and processing infrastructure have housed and processed nearly half-million medical image volumes with Vanderbilt Advanced Computing Center for Research and Education (ACCRE), which is the HPC facility at the Vanderbilt University. The initial deployment was natively deployed (i.e., direct installations on a bare-metal server) within the ACCRE hardware and software environments, which lead to issues of portability and sustainability. First, it could be laborious to deploy the entire VUIIS CCI medical image data storage and processing infrastructure to another HPC center with varying hardware infrastructure, library availability, and software permission policies. Second, the spiders were not developed in an isolated manner, which has led to software dependency issues during system upgrades or remote software installation. To address such issues, herein, we describe recent innovations using containerization techniques with XNAT/DAX which are used to isolate the VUIIS CCI medical image data storage and processing infrastructure from the underlying hardware and software environments. The newly presented XNAT/DAX solution has the following new features: (1) multi-level portability from system level to the application level, (2) flexible and dynamic software

  4. Large-scale data analytics

    CERN Document Server

    Gkoulalas-Divanis, Aris

    2014-01-01

    Provides cutting-edge research in large-scale data analytics from diverse scientific areas Surveys varied subject areas and reports on individual results of research in the field Shares many tips and insights into large-scale data analytics from authors and editors with long-term experience and specialization in the field

  5. Accelerating sustainability in large-scale facilities

    CERN Multimedia

    Marina Giampietro

    2011-01-01

    Scientific research centres and large-scale facilities are intrinsically energy intensive, but how can big science improve its energy management and eventually contribute to the environmental cause with new cleantech? CERN’s commitment to providing tangible answers to these questions was sealed in the first workshop on energy management for large scale scientific infrastructures held in Lund, Sweden, on the 13-14 October.   Participants at the energy management for large scale scientific infrastructures workshop. The workshop, co-organised with the European Spallation Source (ESS) and  the European Association of National Research Facilities (ERF), tackled a recognised need for addressing energy issues in relation with science and technology policies. It brought together more than 150 representatives of Research Infrastrutures (RIs) and energy experts from Europe and North America. “Without compromising our scientific projects, we can ...

  6. Application of parallel computing techniques to a large-scale reservoir simulation

    International Nuclear Information System (INIS)

    Zhang, Keni; Wu, Yu-Shu; Ding, Chris; Pruess, Karsten

    2001-01-01

    Even with the continual advances made in both computational algorithms and computer hardware used in reservoir modeling studies, large-scale simulation of fluid and heat flow in heterogeneous reservoirs remains a challenge. The problem commonly arises from intensive computational requirement for detailed modeling investigations of real-world reservoirs. This paper presents the application of a massive parallel-computing version of the TOUGH2 code developed for performing large-scale field simulations. As an application example, the parallelized TOUGH2 code is applied to develop a three-dimensional unsaturated-zone numerical model simulating flow of moisture, gas, and heat in the unsaturated zone of Yucca Mountain, Nevada, a potential repository for high-level radioactive waste. The modeling approach employs refined spatial discretization to represent the heterogeneous fractured tuffs of the system, using more than a million 3-D gridblocks. The problem of two-phase flow and heat transfer within the model domain leads to a total of 3,226,566 linear equations to be solved per Newton iteration. The simulation is conducted on a Cray T3E-900, a distributed-memory massively parallel computer. Simulation results indicate that the parallel computing technique, as implemented in the TOUGH2 code, is very efficient. The reliability and accuracy of the model results have been demonstrated by comparing them to those of small-scale (coarse-grid) models. These comparisons show that simulation results obtained with the refined grid provide more detailed predictions of the future flow conditions at the site, aiding in the assessment of proposed repository performance

  7. Deterministic sensitivity and uncertainty analysis for large-scale computer models

    International Nuclear Information System (INIS)

    Worley, B.A.; Pin, F.G.; Oblow, E.M.; Maerker, R.E.; Horwedel, J.E.; Wright, R.Q.

    1988-01-01

    This paper presents a comprehensive approach to sensitivity and uncertainty analysis of large-scale computer models that is analytic (deterministic) in principle and that is firmly based on the model equations. The theory and application of two systems based upon computer calculus, GRESS and ADGEN, are discussed relative to their role in calculating model derivatives and sensitivities without a prohibitive initial manpower investment. Storage and computational requirements for these two systems are compared for a gradient-enhanced version of the PRESTO-II computer model. A Deterministic Uncertainty Analysis (DUA) method that retains the characteristics of analytically computing result uncertainties based upon parameter probability distributions is then introduced and results from recent studies are shown. 29 refs., 4 figs., 1 tab

  8. A Combined Eulerian-Lagrangian Data Representation for Large-Scale Applications.

    Science.gov (United States)

    Sauer, Franz; Xie, Jinrong; Ma, Kwan-Liu

    2017-10-01

    The Eulerian and Lagrangian reference frames each provide a unique perspective when studying and visualizing results from scientific systems. As a result, many large-scale simulations produce data in both formats, and analysis tasks that simultaneously utilize information from both representations are becoming increasingly popular. However, due to their fundamentally different nature, drawing correlations between these data formats is a computationally difficult task, especially in a large-scale setting. In this work, we present a new data representation which combines both reference frames into a joint Eulerian-Lagrangian format. By reorganizing Lagrangian information according to the Eulerian simulation grid into a "unit cell" based approach, we can provide an efficient out-of-core means of sampling, querying, and operating with both representations simultaneously. We also extend this design to generate multi-resolution subsets of the full data to suit the viewer's needs and provide a fast flow-aware trajectory construction scheme. We demonstrate the effectiveness of our method using three large-scale real world scientific datasets and provide insight into the types of performance gains that can be achieved.

  9. Multi-Agent System Supporting Automated Large-Scale Photometric Computations

    Directory of Open Access Journals (Sweden)

    Adam Sȩdziwy

    2016-02-01

    Full Text Available The technologies related to green energy, smart cities and similar areas being dynamically developed in recent years, face frequently problems of a computational nature rather than a technological one. The example is the ability of accurately predicting the weather conditions for PV farms or wind turbines. Another group of issues is related to the complexity of the computations required to obtain an optimal setup of a solution being designed. In this article, we present the case representing the latter group of problems, namely designing large-scale power-saving lighting installations. The term “large-scale” refers to an entire city area, containing tens of thousands of luminaires. Although a simple power reduction for a single street, giving limited savings, is relatively easy, it becomes infeasible for tasks covering thousands of luminaires described by precise coordinates (instead of simplified layouts. To overcome this critical issue, we propose introducing a formal representation of a computing problem and applying a multi-agent system to perform design-related computations in parallel. The important measure introduced in the article indicating optimization progress is entropy. It also allows for terminating optimization when the solution is satisfying. The article contains the results of real-life calculations being made with the help of the presented approach.

  10. Efficient Feature-Driven Visualization of Large-Scale Scientific Data

    Energy Technology Data Exchange (ETDEWEB)

    Lu, Aidong

    2012-12-12

    Very large, complex scientific data acquired in many research areas creates critical challenges for scientists to understand, analyze, and organize their data. The objective of this project is to expand the feature extraction and analysis capabilities to develop powerful and accurate visualization tools that can assist domain scientists with their requirements in multiple phases of scientific discovery. We have recently developed several feature-driven visualization methods for extracting different data characteristics of volumetric datasets. Our results verify the hypothesis in the proposal and will be used to develop additional prototype systems.

  11. Large Scale Computing and Storage Requirements for Fusion Energy Sciences: Target 2017

    Energy Technology Data Exchange (ETDEWEB)

    Gerber, Richard

    2014-05-02

    The National Energy Research Scientific Computing Center (NERSC) is the primary computing center for the DOE Office of Science, serving approximately 4,500 users working on some 650 projects that involve nearly 600 codes in a wide variety of scientific disciplines. In March 2013, NERSC, DOE?s Office of Advanced Scientific Computing Research (ASCR) and DOE?s Office of Fusion Energy Sciences (FES) held a review to characterize High Performance Computing (HPC) and storage requirements for FES research through 2017. This report is the result.

  12. Emerging Nanophotonic Applications Explored with Advanced Scientific Parallel Computing

    Science.gov (United States)

    Meng, Xiang

    The domain of nanoscale optical science and technology is a combination of the classical world of electromagnetics and the quantum mechanical regime of atoms and molecules. Recent advancements in fabrication technology allows the optical structures to be scaled down to nanoscale size or even to the atomic level, which are far smaller than the wavelength they are designed for. These nanostructures can have unique, controllable, and tunable optical properties and their interactions with quantum materials can have important near-field and far-field optical response. Undoubtedly, these optical properties can have many important applications, ranging from the efficient and tunable light sources, detectors, filters, modulators, high-speed all-optical switches; to the next-generation classical and quantum computation, and biophotonic medical sensors. This emerging research of nanoscience, known as nanophotonics, is a highly interdisciplinary field requiring expertise in materials science, physics, electrical engineering, and scientific computing, modeling and simulation. It has also become an important research field for investigating the science and engineering of light-matter interactions that take place on wavelength and subwavelength scales where the nature of the nanostructured matter controls the interactions. In addition, the fast advancements in the computing capabilities, such as parallel computing, also become as a critical element for investigating advanced nanophotonic devices. This role has taken on even greater urgency with the scale-down of device dimensions, and the design for these devices require extensive memory and extremely long core hours. Thus distributed computing platforms associated with parallel computing are required for faster designs processes. Scientific parallel computing constructs mathematical models and quantitative analysis techniques, and uses the computing machines to analyze and solve otherwise intractable scientific challenges. In

  13. Deterministic sensitivity and uncertainty analysis for large-scale computer models

    International Nuclear Information System (INIS)

    Worley, B.A.; Pin, F.G.; Oblow, E.M.; Maerker, R.E.; Horwedel, J.E.; Wright, R.Q.

    1988-01-01

    The fields of sensitivity and uncertainty analysis have traditionally been dominated by statistical techniques when large-scale modeling codes are being analyzed. These methods are able to estimate sensitivities, generate response surfaces, and estimate response probability distributions given the input parameter probability distributions. Because the statistical methods are computationally costly, they are usually applied only to problems with relatively small parameter sets. Deterministic methods, on the other hand, are very efficient and can handle large data sets, but generally require simpler models because of the considerable programming effort required for their implementation. The first part of this paper reports on the development and availability of two systems, GRESS and ADGEN, that make use of computer calculus compilers to automate the implementation of deterministic sensitivity analysis capability into existing computer models. This automation removes the traditional limitation of deterministic sensitivity methods. This second part of the paper describes a deterministic uncertainty analysis method (DUA) that uses derivative information as a basis to propagate parameter probability distributions to obtain result probability distributions. This paper is applicable to low-level radioactive waste disposal system performance assessment

  14. The scaling issue: scientific opportunities

    Science.gov (United States)

    Orbach, Raymond L.

    2009-07-01

    A brief history of the Leadership Computing Facility (LCF) initiative is presented, along with the importance of SciDAC to the initiative. The initiative led to the initiation of the Innovative and Novel Computational Impact on Theory and Experiment program (INCITE), open to all researchers in the US and abroad, and based solely on scientific merit through peer review, awarding sizeable allocations (typically millions of processor-hours per project). The development of the nation's LCFs has enabled available INCITE processor-hours to double roughly every eight months since its inception in 2004. The 'top ten' LCF accomplishments in 2009 illustrate the breadth of the scientific program, while the 75 million processor hours allocated to American business since 2006 highlight INCITE contributions to US competitiveness. The extrapolation of INCITE processor hours into the future brings new possibilities for many 'classic' scaling problems. Complex systems and atomic displacements to cracks are but two examples. However, even with increasing computational speeds, the development of theory, numerical representations, algorithms, and efficient implementation are required for substantial success, exhibiting the crucial role that SciDAC will play.

  15. The scaling issue: scientific opportunities

    International Nuclear Information System (INIS)

    Orbach, Raymond L

    2009-01-01

    A brief history of the Leadership Computing Facility (LCF) initiative is presented, along with the importance of SciDAC to the initiative. The initiative led to the initiation of the Innovative and Novel Computational Impact on Theory and Experiment program (INCITE), open to all researchers in the US and abroad, and based solely on scientific merit through peer review, awarding sizeable allocations (typically millions of processor-hours per project). The development of the nation's LCFs has enabled available INCITE processor-hours to double roughly every eight months since its inception in 2004. The 'top ten' LCF accomplishments in 2009 illustrate the breadth of the scientific program, while the 75 million processor hours allocated to American business since 2006 highlight INCITE contributions to US competitiveness. The extrapolation of INCITE processor hours into the future brings new possibilities for many 'classic' scaling problems. Complex systems and atomic displacements to cracks are but two examples. However, even with increasing computational speeds, the development of theory, numerical representations, algorithms, and efficient implementation are required for substantial success, exhibiting the crucial role that SciDAC will play.

  16. Designing Scientific Software for Heterogeneous Computing

    DEFF Research Database (Denmark)

    Glimberg, Stefan Lemvig

    , algorithms and data structures must be designed to utilize the underlying parallel architecture. The architectural changes in hardware design within the last decade, from single to multi and many-core architectures, require software developers to identify and properly implement methods that both exploit...... makes parallel software design applicable, but also a challenge for scientific software developers at all levels. We have developed a generic C++ library for fast prototyping of large-scale PDEs solvers based on flexible-order finite difference approximations on structured regular grids. The library...... is designed with a high abstraction interface to improve developer productivity. The library is based on modern template-based design concepts as described in Glimberg, Engsig-Karup, Nielsen & Dammann (2013). The library utilizes heterogeneous CPU/GPU environments in order to maximize computational throughput...

  17. Large-scale parallel genome assembler over cloud computing environment.

    Science.gov (United States)

    Das, Arghya Kusum; Koppa, Praveen Kumar; Goswami, Sayan; Platania, Richard; Park, Seung-Jong

    2017-06-01

    The size of high throughput DNA sequencing data has already reached the terabyte scale. To manage this huge volume of data, many downstream sequencing applications started using locality-based computing over different cloud infrastructures to take advantage of elastic (pay as you go) resources at a lower cost. However, the locality-based programming model (e.g. MapReduce) is relatively new. Consequently, developing scalable data-intensive bioinformatics applications using this model and understanding the hardware environment that these applications require for good performance, both require further research. In this paper, we present a de Bruijn graph oriented Parallel Giraph-based Genome Assembler (GiGA), as well as the hardware platform required for its optimal performance. GiGA uses the power of Hadoop (MapReduce) and Giraph (large-scale graph analysis) to achieve high scalability over hundreds of compute nodes by collocating the computation and data. GiGA achieves significantly higher scalability with competitive assembly quality compared to contemporary parallel assemblers (e.g. ABySS and Contrail) over traditional HPC cluster. Moreover, we show that the performance of GiGA is significantly improved by using an SSD-based private cloud infrastructure over traditional HPC cluster. We observe that the performance of GiGA on 256 cores of this SSD-based cloud infrastructure closely matches that of 512 cores of traditional HPC cluster.

  18. Scientific Grand Challenges: Challenges in Climate Change Science and the Role of Computing at the Extreme Scale

    Energy Technology Data Exchange (ETDEWEB)

    Khaleel, Mohammad A.; Johnson, Gary M.; Washington, Warren M.

    2009-07-02

    The U.S. Department of Energy (DOE) Office of Biological and Environmental Research (BER) in partnership with the Office of Advanced Scientific Computing Research (ASCR) held a workshop on the challenges in climate change science and the role of computing at the extreme scale, November 6-7, 2008, in Bethesda, Maryland. At the workshop, participants identified the scientific challenges facing the field of climate science and outlined the research directions of highest priority that should be pursued to meet these challenges. Representatives from the national and international climate change research community as well as representatives from the high-performance computing community attended the workshop. This group represented a broad mix of expertise. Of the 99 participants, 6 were from international institutions. Before the workshop, each of the four panels prepared a white paper, which provided the starting place for the workshop discussions. These four panels of workshop attendees devoted to their efforts the following themes: Model Development and Integrated Assessment; Algorithms and Computational Environment; Decadal Predictability and Prediction; Data, Visualization, and Computing Productivity. The recommendations of the panels are summarized in the body of this report.

  19. Cerebral methodology based computing to estimate real phenomena from large-scale nuclear simulation

    International Nuclear Information System (INIS)

    Suzuki, Yoshio

    2011-01-01

    Our final goal is to estimate real phenomena from large-scale nuclear simulations by using computing processes. Large-scale simulations mean that they include scale variety and physical complexity so that corresponding experiments and/or theories do not exist. In nuclear field, it is indispensable to estimate real phenomena from simulations in order to improve the safety and security of nuclear power plants. Here, the analysis of uncertainty included in simulations is needed to reveal sensitivity of uncertainty due to randomness, to reduce the uncertainty due to lack of knowledge and to lead a degree of certainty by verification and validation (V and V) and uncertainty quantification (UQ) processes. To realize this, we propose 'Cerebral Methodology based Computing (CMC)' as computing processes with deductive and inductive approaches by referring human reasoning processes. Our idea is to execute deductive and inductive simulations contrasted with deductive and inductive approaches. We have established its prototype system and applied it to a thermal displacement analysis of a nuclear power plant. The result shows that our idea is effective to reduce the uncertainty and to get the degree of certainty. (author)

  20. A review of parallel computing for large-scale remote sensing image mosaicking

    OpenAIRE

    Chen, Lajiao; Ma, Yan; Liu, Peng; Wei, Jingbo; Jie, Wei; He, Jijun

    2015-01-01

    Interest in image mosaicking has been spurred by a wide variety of research and management needs. However, for large-scale applications, remote sensing image mosaicking usually requires significant computational capabilities. Several studies have attempted to apply parallel computing to improve image mosaicking algorithms and to speed up calculation process. The state of the art of this field has not yet been summarized, which is, however, essential for a better understanding and for further ...

  1. Large-scale computer networks and the future of legal knowledge-based systems

    NARCIS (Netherlands)

    Leenes, R.E.; Svensson, Jorgen S.; Hage, J.C.; Bench-Capon, T.J.M.; Cohen, M.J.; van den Herik, H.J.

    1995-01-01

    In this paper we investigate the relation between legal knowledge-based systems and large-scale computer networks such as the Internet. On the one hand, researchers of legal knowledge-based systems have claimed huge possibilities, but despite the efforts over the last twenty years, the number of

  2. Molecular Science Computing Facility Scientific Challenges: Linking Across Scales

    Energy Technology Data Exchange (ETDEWEB)

    De Jong, Wibe A.; Windus, Theresa L.

    2005-07-01

    The purpose of this document is to define the evolving science drivers for performing environmental molecular research at the William R. Wiley Environmental Molecular Sciences Laboratory (EMSL) and to provide guidance associated with the next-generation high-performance computing center that must be developed at EMSL's Molecular Science Computing Facility (MSCF) in order to address this critical research. The MSCF is the pre-eminent computing facility?supported by the U.S. Department of Energy's (DOE's) Office of Biological and Environmental Research (BER)?tailored to provide the fastest time-to-solution for current computational challenges in chemistry and biology, as well as providing the means for broad research in the molecular and environmental sciences. The MSCF provides integral resources and expertise to emerging EMSL Scientific Grand Challenges and Collaborative Access Teams that are designed to leverage the multiple integrated research capabilities of EMSL, thereby creating a synergy between computation and experiment to address environmental molecular science challenges critical to DOE and the nation.

  3. Visual analysis of inter-process communication for large-scale parallel computing.

    Science.gov (United States)

    Muelder, Chris; Gygi, Francois; Ma, Kwan-Liu

    2009-01-01

    In serial computation, program profiling is often helpful for optimization of key sections of code. When moving to parallel computation, not only does the code execution need to be considered but also communication between the different processes which can induce delays that are detrimental to performance. As the number of processes increases, so does the impact of the communication delays on performance. For large-scale parallel applications, it is critical to understand how the communication impacts performance in order to make the code more efficient. There are several tools available for visualizing program execution and communications on parallel systems. These tools generally provide either views which statistically summarize the entire program execution or process-centric views. However, process-centric visualizations do not scale well as the number of processes gets very large. In particular, the most common representation of parallel processes is a Gantt char t with a row for each process. As the number of processes increases, these charts can become difficult to work with and can even exceed screen resolution. We propose a new visualization approach that affords more scalability and then demonstrate it on systems running with up to 16,384 processes.

  4. A Polar Rover for Large-Scale Scientific Surveys: Design, Implementation and Field Test Results

    Directory of Open Access Journals (Sweden)

    Yuqing He

    2015-10-01

    Full Text Available Exploration of polar regions is of great importance to scientific research. Unfortunately, due to the harsh environment, most of the regions on the Antarctic continent are still unreachable for humankind. Therefore, in 2011, the Chinese National Antarctic Research Expedition (CHINARE launched a project to design a rover to conduct large-scale scientific surveys on the Antarctic. The main challenges for the rover are twofold: one is the mobility, i.e., how to make a rover that could survive the harsh environment and safely move on the uneven, icy and snowy terrain; the other is the autonomy, in that the robot should be able to move at a relatively high speed with little or no human intervention so that it can explore a large region in a limit time interval under the communication constraints. In this paper, the corresponding techniques, especially the polar rover's design and autonomous navigation algorithms, are introduced in detail. Subsequently, an experimental report of the fields tests on the Antarctic is given to show some preliminary evaluation of the rover. Finally, experiences and existing challenging problems are summarized.

  5. Large-scale simulations of error-prone quantum computation devices

    International Nuclear Information System (INIS)

    Trieu, Doan Binh

    2009-01-01

    The theoretical concepts of quantum computation in the idealized and undisturbed case are well understood. However, in practice, all quantum computation devices do suffer from decoherence effects as well as from operational imprecisions. This work assesses the power of error-prone quantum computation devices using large-scale numerical simulations on parallel supercomputers. We present the Juelich Massively Parallel Ideal Quantum Computer Simulator (JUMPIQCS), that simulates a generic quantum computer on gate level. It comprises an error model for decoherence and operational errors. The robustness of various algorithms in the presence of noise has been analyzed. The simulation results show that for large system sizes and long computations it is imperative to actively correct errors by means of quantum error correction. We implemented the 5-, 7-, and 9-qubit quantum error correction codes. Our simulations confirm that using error-prone correction circuits with non-fault-tolerant quantum error correction will always fail, because more errors are introduced than being corrected. Fault-tolerant methods can overcome this problem, provided that the single qubit error rate is below a certain threshold. We incorporated fault-tolerant quantum error correction techniques into JUMPIQCS using Steane's 7-qubit code and determined this threshold numerically. Using the depolarizing channel as the source of decoherence, we find a threshold error rate of (5.2±0.2) x 10 -6 . For Gaussian distributed operational over-rotations the threshold lies at a standard deviation of 0.0431±0.0002. We can conclude that quantum error correction is especially well suited for the correction of operational imprecisions and systematic over-rotations. For realistic simulations of specific quantum computation devices we need to extend the generic model to dynamic simulations, i.e. time-dependent Hamiltonian simulations of realistic hardware models. We focus on today's most advanced technology, i

  6. Challenges in Managing Trustworthy Large-scale Digital Science

    Science.gov (United States)

    Evans, B. J. K.

    2017-12-01

    The increased use of large-scale international digital science has opened a number of challenges for managing, handling, using and preserving scientific information. The large volumes of information are driven by three main categories - model outputs including coupled models and ensembles, data products that have been processing to a level of usability, and increasingly heuristically driven data analysis. These data products are increasingly the ones that are usable by the broad communities, and far in excess of the raw instruments data outputs. The data, software and workflows are then shared and replicated to allow broad use at an international scale, which places further demands of infrastructure to support how the information is managed reliably across distributed resources. Users necessarily rely on these underlying "black boxes" so that they are productive to produce new scientific outcomes. The software for these systems depend on computational infrastructure, software interconnected systems, and information capture systems. This ranges from the fundamentals of the reliability of the compute hardware, system software stacks and libraries, and the model software. Due to these complexities and capacity of the infrastructure, there is an increased emphasis of transparency of the approach and robustness of the methods over the full reproducibility. Furthermore, with large volume data management, it is increasingly difficult to store the historical versions of all model and derived data. Instead, the emphasis is on the ability to access the updated products and the reliability by which both previous outcomes are still relevant and can be updated for the new information. We will discuss these challenges and some of the approaches underway that are being used to address these issues.

  7. The multilevel fast multipole algorithm (MLFMA) for solving large-scale computational electromagnetics problems

    CERN Document Server

    Ergul, Ozgur

    2014-01-01

    The Multilevel Fast Multipole Algorithm (MLFMA) for Solving Large-Scale Computational Electromagnetic Problems provides a detailed and instructional overview of implementing MLFMA. The book: Presents a comprehensive treatment of the MLFMA algorithm, including basic linear algebra concepts, recent developments on the parallel computation, and a number of application examplesCovers solutions of electromagnetic problems involving dielectric objects and perfectly-conducting objectsDiscusses applications including scattering from airborne targets, scattering from red

  8. Scientific Services on the Cloud

    Science.gov (United States)

    Chapman, David; Joshi, Karuna P.; Yesha, Yelena; Halem, Milt; Yesha, Yaacov; Nguyen, Phuong

    Scientific Computing was one of the first every applications for parallel and distributed computation. To this date, scientific applications remain some of the most compute intensive, and have inspired creation of petaflop compute infrastructure such as the Oak Ridge Jaguar and Los Alamos RoadRunner. Large dedicated hardware infrastructure has become both a blessing and a curse to the scientific community. Scientists are interested in cloud computing for much the same reason as businesses and other professionals. The hardware is provided, maintained, and administrated by a third party. Software abstraction and virtualization provide reliability, and fault tolerance. Graduated fees allow for multi-scale prototyping and execution. Cloud computing resources are only a few clicks away, and by far the easiest high performance distributed platform to gain access to. There may still be dedicated infrastructure for ultra-scale science, but the cloud can easily play a major part of the scientific computing initiative.

  9. Scientific Computing in Electrical Engineering

    CERN Document Server

    Amrhein, Wolfgang; Zulehner, Walter

    2018-01-01

    This collection of selected papers presented at the 11th International Conference on Scientific Computing in Electrical Engineering (SCEE), held in St. Wolfgang, Austria, in 2016, showcases the state of the art in SCEE. The aim of the SCEE 2016 conference was to bring together scientists from academia and industry, mathematicians, electrical engineers, computer scientists, and physicists, and to promote intensive discussions on industrially relevant mathematical problems, with an emphasis on the modeling and numerical simulation of electronic circuits and devices, electromagnetic fields, and coupled problems. The focus in methodology was on model order reduction and uncertainty quantification. This extensive reference work is divided into six parts: Computational Electromagnetics, Circuit and Device Modeling and Simulation, Coupled Problems and Multi‐Scale Approaches in Space and Time, Mathematical and Computational Methods Including Uncertainty Quantification, Model Order Reduction, and Industrial Applicat...

  10. Computer-supported analysis of scientific measurements

    NARCIS (Netherlands)

    de Jong, Hidde

    1998-01-01

    In the past decade, large-scale databases and knowledge bases have become available to researchers working in a range of scientific disciplines. In many cases these databases and knowledge bases contain measurements of properties of physical objects which have been obtained in experiments or at

  11. Large Scale Beam-beam Simulations for the CERN LHC using Distributed Computing

    CERN Document Server

    Herr, Werner; McIntosh, E; Schmidt, F

    2006-01-01

    We report on a large scale simulation of beam-beam effects for the CERN Large Hadron Collider (LHC). The stability of particles which experience head-on and long-range beam-beam effects was investigated for different optical configurations and machine imperfections. To cover the interesting parameter space required computing resources not available at CERN. The necessary resources were available in the LHC@home project, based on the BOINC platform. At present, this project makes more than 60000 hosts available for distributed computing. We shall discuss our experience using this system during a simulation campaign of more than six months and describe the tools and procedures necessary to ensure consistent results. The results from this extended study are presented and future plans are discussed.

  12. Large scale network-centric distributed systems

    CERN Document Server

    Sarbazi-Azad, Hamid

    2014-01-01

    A highly accessible reference offering a broad range of topics and insights on large scale network-centric distributed systems Evolving from the fields of high-performance computing and networking, large scale network-centric distributed systems continues to grow as one of the most important topics in computing and communication and many interdisciplinary areas. Dealing with both wired and wireless networks, this book focuses on the design and performance issues of such systems. Large Scale Network-Centric Distributed Systems provides in-depth coverage ranging from ground-level hardware issu

  13. Large scale inverse problems computational methods and applications in the earth sciences

    CERN Document Server

    Scheichl, Robert; Freitag, Melina A; Kindermann, Stefan

    2013-01-01

    This book is thesecond volume of three volume series recording the ""Radon Special Semester 2011 on Multiscale Simulation & Analysis in Energy and the Environment"" taking place in Linz, Austria, October 3-7, 2011. The volume addresses the common ground in the mathematical and computational procedures required for large-scale inverse problems and data assimilation in forefront applications.

  14. Performance Characteristics of Hybrid MPI/OpenMP Scientific Applications on a Large-Scale Multithreaded BlueGene/Q Supercomputer

    KAUST Repository

    Wu, Xingfu; Taylor, Valerie

    2013-01-01

    In this paper, we investigate the performance characteristics of five hybrid MPI/OpenMP scientific applications (two NAS Parallel benchmarks Multi-Zone SP-MZ and BT-MZ, an earthquake simulation PEQdyna, an aerospace application PMLB and a 3D particle-in-cell application GTC) on a large-scale multithreaded Blue Gene/Q supercomputer at Argonne National laboratory, and quantify the performance gap resulting from using different number of threads per node. We use performance tools and MPI profile and trace libraries available on the supercomputer to analyze and compare the performance of these hybrid scientific applications with increasing the number OpenMP threads per node, and find that increasing the number of threads to some extent saturates or worsens performance of these hybrid applications. For the strong-scaling hybrid scientific applications such as SP-MZ, BT-MZ, PEQdyna and PLMB, using 32 threads per node results in much better application efficiency than using 64 threads per node, and as increasing the number of threads per node, the FPU (Floating Point Unit) percentage decreases, and the MPI percentage (except PMLB) and IPC (Instructions per cycle) per core (except BT-MZ) increase. For the weak-scaling hybrid scientific application such as GTC, the performance trend (relative speedup) is very similar with increasing number of threads per node no matter how many nodes (32, 128, 512) are used. © 2013 IEEE.

  15. Performance Characteristics of Hybrid MPI/OpenMP Scientific Applications on a Large-Scale Multithreaded BlueGene/Q Supercomputer

    KAUST Repository

    Wu, Xingfu

    2013-07-01

    In this paper, we investigate the performance characteristics of five hybrid MPI/OpenMP scientific applications (two NAS Parallel benchmarks Multi-Zone SP-MZ and BT-MZ, an earthquake simulation PEQdyna, an aerospace application PMLB and a 3D particle-in-cell application GTC) on a large-scale multithreaded Blue Gene/Q supercomputer at Argonne National laboratory, and quantify the performance gap resulting from using different number of threads per node. We use performance tools and MPI profile and trace libraries available on the supercomputer to analyze and compare the performance of these hybrid scientific applications with increasing the number OpenMP threads per node, and find that increasing the number of threads to some extent saturates or worsens performance of these hybrid applications. For the strong-scaling hybrid scientific applications such as SP-MZ, BT-MZ, PEQdyna and PLMB, using 32 threads per node results in much better application efficiency than using 64 threads per node, and as increasing the number of threads per node, the FPU (Floating Point Unit) percentage decreases, and the MPI percentage (except PMLB) and IPC (Instructions per cycle) per core (except BT-MZ) increase. For the weak-scaling hybrid scientific application such as GTC, the performance trend (relative speedup) is very similar with increasing number of threads per node no matter how many nodes (32, 128, 512) are used. © 2013 IEEE.

  16. An efficient and novel computation method for simulating diffraction patterns from large-scale coded apertures on large-scale focal plane arrays

    Science.gov (United States)

    Shrekenhamer, Abraham; Gottesman, Stephen R.

    2012-10-01

    A novel and memory efficient method for computing diffraction patterns produced on large-scale focal planes by largescale Coded Apertures at wavelengths where diffraction effects are significant has been developed and tested. The scheme, readily implementable on portable computers, overcomes the memory limitations of present state-of-the-art simulation codes such as Zemax. The method consists of first calculating a set of reference complex field (amplitude and phase) patterns on the focal plane produced by a single (reference) central hole, extending to twice the focal plane array size, with one such pattern for each Line-of-Sight (LOS) direction and wavelength in the scene, and with the pattern amplitude corresponding to the square-root of the spectral irradiance from each such LOS direction in the scene at selected wavelengths. Next the set of reference patterns is transformed to generate pattern sets for other holes. The transformation consists of a translational pattern shift corresponding to each hole's position offset and an electrical phase shift corresponding to each hole's position offset and incoming radiance's direction and wavelength. The set of complex patterns for each direction and wavelength is then summed coherently and squared for each detector to yield a set of power patterns unique for each direction and wavelength. Finally the set of power patterns is summed to produce the full waveband diffraction pattern from the scene. With this tool researchers can now efficiently simulate diffraction patterns produced from scenes by large-scale Coded Apertures onto large-scale focal plane arrays to support the development and optimization of coded aperture masks and image reconstruction algorithms.

  17. Large Scale Computing for the Modelling of Whole Brain Connectivity

    DEFF Research Database (Denmark)

    Albers, Kristoffer Jon

    organization of the brain in continuously increasing resolution. From these images, networks of structural and functional connectivity can be constructed. Bayesian stochastic block modelling provides a prominent data-driven approach for uncovering the latent organization, by clustering the networks into groups...... of neurons. Relying on Markov Chain Monte Carlo (MCMC) simulations as the workhorse in Bayesian inference however poses significant computational challenges, especially when modelling networks at the scale and complexity supported by high-resolution whole-brain MRI. In this thesis, we present how to overcome...... these computational limitations and apply Bayesian stochastic block models for un-supervised data-driven clustering of whole-brain connectivity in full image resolution. We implement high-performance software that allows us to efficiently apply stochastic blockmodelling with MCMC sampling on large complex networks...

  18. Computational models of consumer confidence from large-scale online attention data: crowd-sourcing econometrics.

    Science.gov (United States)

    Dong, Xianlei; Bollen, Johan

    2015-01-01

    Economies are instances of complex socio-technical systems that are shaped by the interactions of large numbers of individuals. The individual behavior and decision-making of consumer agents is determined by complex psychological dynamics that include their own assessment of present and future economic conditions as well as those of others, potentially leading to feedback loops that affect the macroscopic state of the economic system. We propose that the large-scale interactions of a nation's citizens with its online resources can reveal the complex dynamics of their collective psychology, including their assessment of future system states. Here we introduce a behavioral index of Chinese Consumer Confidence (C3I) that computationally relates large-scale online search behavior recorded by Google Trends data to the macroscopic variable of consumer confidence. Our results indicate that such computational indices may reveal the components and complex dynamics of consumer psychology as a collective socio-economic phenomenon, potentially leading to improved and more refined economic forecasting.

  19. Computational models of consumer confidence from large-scale online attention data: crowd-sourcing econometrics.

    Directory of Open Access Journals (Sweden)

    Xianlei Dong

    Full Text Available Economies are instances of complex socio-technical systems that are shaped by the interactions of large numbers of individuals. The individual behavior and decision-making of consumer agents is determined by complex psychological dynamics that include their own assessment of present and future economic conditions as well as those of others, potentially leading to feedback loops that affect the macroscopic state of the economic system. We propose that the large-scale interactions of a nation's citizens with its online resources can reveal the complex dynamics of their collective psychology, including their assessment of future system states. Here we introduce a behavioral index of Chinese Consumer Confidence (C3I that computationally relates large-scale online search behavior recorded by Google Trends data to the macroscopic variable of consumer confidence. Our results indicate that such computational indices may reveal the components and complex dynamics of consumer psychology as a collective socio-economic phenomenon, potentially leading to improved and more refined economic forecasting.

  20. National Energy Research Scientific Computing Center (NERSC): Advancing the frontiers of computational science and technology

    Energy Technology Data Exchange (ETDEWEB)

    Hules, J. [ed.

    1996-11-01

    National Energy Research Scientific Computing Center (NERSC) provides researchers with high-performance computing tools to tackle science`s biggest and most challenging problems. Founded in 1974 by DOE/ER, the Controlled Thermonuclear Research Computer Center was the first unclassified supercomputer center and was the model for those that followed. Over the years the center`s name was changed to the National Magnetic Fusion Energy Computer Center and then to NERSC; it was relocated to LBNL. NERSC, one of the largest unclassified scientific computing resources in the world, is the principal provider of general-purpose computing services to DOE/ER programs: Magnetic Fusion Energy, High Energy and Nuclear Physics, Basic Energy Sciences, Health and Environmental Research, and the Office of Computational and Technology Research. NERSC users are a diverse community located throughout US and in several foreign countries. This brochure describes: the NERSC advantage, its computational resources and services, future technologies, scientific resources, and computational science of scale (interdisciplinary research over a decade or longer; examples: combustion in engines, waste management chemistry, global climate change modeling).

  1. Large-scale simulations of error-prone quantum computation devices

    Energy Technology Data Exchange (ETDEWEB)

    Trieu, Doan Binh

    2009-07-01

    The theoretical concepts of quantum computation in the idealized and undisturbed case are well understood. However, in practice, all quantum computation devices do suffer from decoherence effects as well as from operational imprecisions. This work assesses the power of error-prone quantum computation devices using large-scale numerical simulations on parallel supercomputers. We present the Juelich Massively Parallel Ideal Quantum Computer Simulator (JUMPIQCS), that simulates a generic quantum computer on gate level. It comprises an error model for decoherence and operational errors. The robustness of various algorithms in the presence of noise has been analyzed. The simulation results show that for large system sizes and long computations it is imperative to actively correct errors by means of quantum error correction. We implemented the 5-, 7-, and 9-qubit quantum error correction codes. Our simulations confirm that using error-prone correction circuits with non-fault-tolerant quantum error correction will always fail, because more errors are introduced than being corrected. Fault-tolerant methods can overcome this problem, provided that the single qubit error rate is below a certain threshold. We incorporated fault-tolerant quantum error correction techniques into JUMPIQCS using Steane's 7-qubit code and determined this threshold numerically. Using the depolarizing channel as the source of decoherence, we find a threshold error rate of (5.2{+-}0.2) x 10{sup -6}. For Gaussian distributed operational over-rotations the threshold lies at a standard deviation of 0.0431{+-}0.0002. We can conclude that quantum error correction is especially well suited for the correction of operational imprecisions and systematic over-rotations. For realistic simulations of specific quantum computation devices we need to extend the generic model to dynamic simulations, i.e. time-dependent Hamiltonian simulations of realistic hardware models. We focus on today's most advanced

  2. Large-Scale, Parallel, Multi-Sensor Atmospheric Data Fusion Using Cloud Computing

    Science.gov (United States)

    Wilson, B. D.; Manipon, G.; Hua, H.; Fetzer, E. J.

    2013-12-01

    NASA's Earth Observing System (EOS) is an ambitious facility for studying global climate change. The mandate now is to combine measurements from the instruments on the 'A-Train' platforms (AIRS, AMSR-E, MODIS, MISR, MLS, and CloudSat) and other Earth probes to enable large-scale studies of climate change over decades. Moving to multi-sensor, long-duration analyses of important climate variables presents serious challenges for large-scale data mining and fusion. For example, one might want to compare temperature and water vapor retrievals from one instrument (AIRS) to another (MODIS), and to a model (MERRA), stratify the comparisons using a classification of the 'cloud scenes' from CloudSat, and repeat the entire analysis over 10 years of data. To efficiently assemble such datasets, we are utilizing Elastic Computing in the Cloud and parallel map/reduce-based algorithms. However, these problems are Data Intensive computing so the data transfer times and storage costs (for caching) are key issues. SciReduce is a Hadoop-like parallel analysis system, programmed in parallel python, that is designed from the ground up for Earth science. SciReduce executes inside VMWare images and scales to any number of nodes in the Cloud. Unlike Hadoop, SciReduce operates on bundles of named numeric arrays, which can be passed in memory or serialized to disk in netCDF4 or HDF5. Figure 1 shows the architecture of the full computational system, with SciReduce at the core. Multi-year datasets are automatically 'sharded' by time and space across a cluster of nodes so that years of data (millions of files) can be processed in a massively parallel way. Input variables (arrays) are pulled on-demand into the Cloud using OPeNDAP URLs or other subsetting services, thereby minimizing the size of the cached input and intermediate datasets. We are using SciReduce to automate the production of multiple versions of a ten-year A-Train water vapor climatology under a NASA MEASURES grant. We will

  3. Scientific Grand Challenges: Discovery In Basic Energy Sciences: The Role of Computing at the Extreme Scale - August 13-15, 2009, Washington, D.C.

    Energy Technology Data Exchange (ETDEWEB)

    Galli, Giulia [Univ. of California, Davis, CA (United States). Workshop Chair; Dunning, Thom [Univ. of Illinois, Urbana, IL (United States). Workshop Chair

    2009-08-13

    The U.S. Department of Energy’s (DOE) Office of Basic Energy Sciences (BES) and Office of Advanced Scientific Computing Research (ASCR) workshop in August 2009 on extreme-scale computing provided a forum for more than 130 researchers to explore the needs and opportunities that will arise due to expected dramatic advances in computing power over the next decade. This scientific community firmly believes that the development of advanced theoretical tools within chemistry, physics, and materials science—combined with the development of efficient computational techniques and algorithms—has the potential to revolutionize the discovery process for materials and molecules with desirable properties. Doing so is necessary to meet the energy and environmental challenges of the 21st century as described in various DOE BES Basic Research Needs reports. Furthermore, computational modeling and simulation are a crucial complement to experimental studies, particularly when quantum mechanical processes controlling energy production, transformations, and storage are not directly observable and/or controllable. Many processes related to the Earth’s climate and subsurface need better modeling capabilities at the molecular level, which will be enabled by extreme-scale computing.

  4. Large Scale Computing and Storage Requirements for Biological and Environmental Research

    Energy Technology Data Exchange (ETDEWEB)

    DOE Office of Science, Biological and Environmental Research Program Office (BER),

    2009-09-30

    In May 2009, NERSC, DOE's Office of Advanced Scientific Computing Research (ASCR), and DOE's Office of Biological and Environmental Research (BER) held a workshop to characterize HPC requirements for BER-funded research over the subsequent three to five years. The workshop revealed several key points, in addition to achieving its goal of collecting and characterizing computing requirements. Chief among them: scientific progress in BER-funded research is limited by current allocations of computational resources. Additionally, growth in mission-critical computing -- combined with new requirements for collaborative data manipulation and analysis -- will demand ever increasing computing, storage, network, visualization, reliability and service richness from NERSC. This report expands upon these key points and adds others. It also presents a number of"case studies" as significant representative samples of the needs of science teams within BER. Workshop participants were asked to codify their requirements in this"case study" format, summarizing their science goals, methods of solution, current and 3-5 year computing requirements, and special software and support needs. Participants were also asked to describe their strategy for computing in the highly parallel,"multi-core" environment that is expected to dominate HPC architectures over the next few years.

  5. Scientific computer simulation review

    International Nuclear Information System (INIS)

    Kaizer, Joshua S.; Heller, A. Kevin; Oberkampf, William L.

    2015-01-01

    Before the results of a scientific computer simulation are used for any purpose, it should be determined if those results can be trusted. Answering that question of trust is the domain of scientific computer simulation review. There is limited literature that focuses on simulation review, and most is specific to the review of a particular type of simulation. This work is intended to provide a foundation for a common understanding of simulation review. This is accomplished through three contributions. First, scientific computer simulation review is formally defined. This definition identifies the scope of simulation review and provides the boundaries of the review process. Second, maturity assessment theory is developed. This development clarifies the concepts of maturity criteria, maturity assessment sets, and maturity assessment frameworks, which are essential for performing simulation review. Finally, simulation review is described as the application of a maturity assessment framework. This is illustrated through evaluating a simulation review performed by the U.S. Nuclear Regulatory Commission. In making these contributions, this work provides a means for a more objective assessment of a simulation’s trustworthiness and takes the next step in establishing scientific computer simulation review as its own field. - Highlights: • We define scientific computer simulation review. • We develop maturity assessment theory. • We formally define a maturity assessment framework. • We describe simulation review as the application of a maturity framework. • We provide an example of a simulation review using a maturity framework

  6. ASCR Cybersecurity for Scientific Computing Integrity - Research Pathways and Ideas Workshop

    Energy Technology Data Exchange (ETDEWEB)

    Peisert, Sean [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Univ. of California, Davis, CA (United States); Potok, Thomas E. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Jones, Todd [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)

    2015-06-03

    At the request of the U.S. Department of Energy's (DOE) Office of Science (SC) Advanced Scientific Computing Research (ASCR) program office, a workshop was held June 2-3, 2015, in Gaithersburg, MD, to identify potential long term (10 to +20 year) cybersecurity fundamental basic research and development challenges, strategies and roadmap facing future high performance computing (HPC), networks, data centers, and extreme-scale scientific user facilities. This workshop was a follow-on to the workshop held January 7-9, 2015, in Rockville, MD, that examined higher level ideas about scientific computing integrity specific to the mission of the DOE Office of Science. Issues included research computation and simulation that takes place on ASCR computing facilities and networks, as well as network-connected scientific instruments, such as those run by various DOE Office of Science programs. Workshop participants included researchers and operational staff from DOE national laboratories, as well as academic researchers and industry experts. Participants were selected based on the submission of abstracts relating to the topics discussed in the previous workshop report [1] and also from other ASCR reports, including "Abstract Machine Models and Proxy Architectures for Exascale Computing" [27], the DOE "Preliminary Conceptual Design for an Exascale Computing Initiative" [28], and the January 2015 machine learning workshop [29]. The workshop was also attended by several observers from DOE and other government agencies. The workshop was divided into three topic areas: (1) Trustworthy Supercomputing, (2) Extreme-Scale Data, Knowledge, and Analytics for Understanding and Improving Cybersecurity, and (3) Trust within High-end Networking and Data Centers. Participants were divided into three corresponding teams based on the category of their abstracts. The workshop began with a series of talks from the program manager and workshop chair, followed by the leaders for each of the

  7. Standing Together for Reproducibility in Large-Scale Computing: Report on reproducibility@XSEDE

    OpenAIRE

    James, Doug; Wilkins-Diehr, Nancy; Stodden, Victoria; Colbry, Dirk; Rosales, Carlos; Fahey, Mark; Shi, Justin; Silva, Rafael F.; Lee, Kyo; Roskies, Ralph; Loewe, Laurence; Lindsey, Susan; Kooper, Rob; Barba, Lorena; Bailey, David

    2014-01-01

    This is the final report on reproducibility@xsede, a one-day workshop held in conjunction with XSEDE14, the annual conference of the Extreme Science and Engineering Discovery Environment (XSEDE). The workshop's discussion-oriented agenda focused on reproducibility in large-scale computational research. Two important themes capture the spirit of the workshop submissions and discussions: (1) organizational stakeholders, especially supercomputer centers, are in a unique position to promote, enab...

  8. Political consultation and large-scale research

    International Nuclear Information System (INIS)

    Bechmann, G.; Folkers, H.

    1977-01-01

    Large-scale research and policy consulting have an intermediary position between sociological sub-systems. While large-scale research coordinates science, policy, and production, policy consulting coordinates science, policy and political spheres. In this very position, large-scale research and policy consulting lack of institutional guarantees and rational back-ground guarantee which are characteristic for their sociological environment. This large-scale research can neither deal with the production of innovative goods under consideration of rentability, nor can it hope for full recognition by the basis-oriented scientific community. Policy consulting knows neither the competence assignment of the political system to make decisions nor can it judge succesfully by the critical standards of the established social science, at least as far as the present situation is concerned. This intermediary position of large-scale research and policy consulting has, in three points, a consequence supporting the thesis which states that this is a new form of institutionalization of science: These are: 1) external control, 2) the organization form, 3) the theoretical conception of large-scale research and policy consulting. (orig.) [de

  9. Trends in large-scale testing of reactor structures

    International Nuclear Information System (INIS)

    Blejwas, T.E.

    2003-01-01

    Large-scale tests of reactor structures have been conducted at Sandia National Laboratories since the late 1970s. This paper describes a number of different large-scale impact tests, pressurization tests of models of containment structures, and thermal-pressure tests of models of reactor pressure vessels. The advantages of large-scale testing are evident, but cost, in particular limits its use. As computer models have grown in size, such as number of degrees of freedom, the advent of computer graphics has made possible very realistic representation of results - results that may not accurately represent reality. A necessary condition to avoiding this pitfall is the validation of the analytical methods and underlying physical representations. Ironically, the immensely larger computer models sometimes increase the need for large-scale testing, because the modeling is applied to increasing more complex structural systems and/or more complex physical phenomena. Unfortunately, the cost of large-scale tests is a disadvantage that will likely severely limit similar testing in the future. International collaborations may provide the best mechanism for funding future programs with large-scale tests. (author)

  10. Really Large Scale Computer Graphic Projection Using Lasers and Laser Substitutes

    Science.gov (United States)

    Rother, Paul

    1989-07-01

    This paper reflects on past laser projects to display vector scanned computer graphic images onto very large and irregular surfaces. Since the availability of microprocessors and high powered visible lasers, very large scale computer graphics projection have become a reality. Due to the independence from a focusing lens, lasers easily project onto distant and irregular surfaces and have been used for amusement parks, theatrical performances, concert performances, industrial trade shows and dance clubs. Lasers have been used to project onto mountains, buildings, 360° globes, clouds of smoke and water. These methods have proven successful in installations at: Epcot Theme Park in Florida; Stone Mountain Park in Georgia; 1984 Olympics in Los Angeles; hundreds of Corporate trade shows and thousands of musical performances. Using new ColorRayTM technology, the use of costly and fragile lasers is no longer necessary. Utilizing fiber optic technology, the functionality of lasers can be duplicated for new and exciting projection possibilities. The use of ColorRayTM technology has enjoyed worldwide recognition in conjunction with Pink Floyd and George Michaels' world wide tours.

  11. Rainbow: a tool for large-scale whole-genome sequencing data analysis using cloud computing.

    Science.gov (United States)

    Zhao, Shanrong; Prenger, Kurt; Smith, Lance; Messina, Thomas; Fan, Hongtao; Jaeger, Edward; Stephens, Susan

    2013-06-27

    Technical improvements have decreased sequencing costs and, as a result, the size and number of genomic datasets have increased rapidly. Because of the lower cost, large amounts of sequence data are now being produced by small to midsize research groups. Crossbow is a software tool that can detect single nucleotide polymorphisms (SNPs) in whole-genome sequencing (WGS) data from a single subject; however, Crossbow has a number of limitations when applied to multiple subjects from large-scale WGS projects. The data storage and CPU resources that are required for large-scale whole genome sequencing data analyses are too large for many core facilities and individual laboratories to provide. To help meet these challenges, we have developed Rainbow, a cloud-based software package that can assist in the automation of large-scale WGS data analyses. Here, we evaluated the performance of Rainbow by analyzing 44 different whole-genome-sequenced subjects. Rainbow has the capacity to process genomic data from more than 500 subjects in two weeks using cloud computing provided by the Amazon Web Service. The time includes the import and export of the data using Amazon Import/Export service. The average cost of processing a single sample in the cloud was less than 120 US dollars. Compared with Crossbow, the main improvements incorporated into Rainbow include the ability: (1) to handle BAM as well as FASTQ input files; (2) to split large sequence files for better load balance downstream; (3) to log the running metrics in data processing and monitoring multiple Amazon Elastic Compute Cloud (EC2) instances; and (4) to merge SOAPsnp outputs for multiple individuals into a single file to facilitate downstream genome-wide association studies. Rainbow is a scalable, cost-effective, and open-source tool for large-scale WGS data analysis. For human WGS data sequenced by either the Illumina HiSeq 2000 or HiSeq 2500 platforms, Rainbow can be used straight out of the box. Rainbow is available

  12. Collaborative e-Science Experiments and Scientific Workflows

    NARCIS (Netherlands)

    Belloum, A.; Inda, M.A.; Vasunin, D.; Korkhov, V.; Zhao, Z.; Rauwerda, H.; Breit, T.M.; Bubak, M.; Hertzberger, L.O.

    2011-01-01

    Recent advances in Internet and grid technologies have greatly enhanced scientific experiments' life cycle. In addition to compute- and data-intensive tasks, large-scale collaborations involving geographically distributed scientists and e-infrastructure are now possible. Scientific workflows, which

  13. Information Power Grid: Distributed High-Performance Computing and Large-Scale Data Management for Science and Engineering

    Science.gov (United States)

    Johnston, William E.; Gannon, Dennis; Nitzberg, Bill

    2000-01-01

    ) Coupling large-scale computing and data systems to scientific and engineering instruments (e.g., realtime interaction with experiments through real-time data analysis and interpretation presented to the experimentalist in ways that allow direct interaction with the experiment (instead of just with instrument control); (5) Highly interactive, augmented reality and virtual reality remote collaborations (e.g., Ames / Boeing Remote Help Desk providing field maintenance use of coupled video and NDI to a remote, on-line airframe structures expert who uses this data to index into detailed design databases, and returns 3D internal aircraft geometry to the field); (5) Single computational problems too large for any single system (e.g. the rotocraft reference calculation). Grids also have the potential to provide pools of resources that could be called on in extraordinary / rapid response situations (such as disaster response) because they can provide common interfaces and access mechanisms, standardized management, and uniform user authentication and authorization, for large collections of distributed resources (whether or not they normally function in concert). IPG development and deployment is addressing requirements obtained by analyzing a number of different application areas, in particular from the NASA Aero-Space Technology Enterprise. This analysis has focussed primarily on two types of users: the scientist / design engineer whose primary interest is problem solving (e.g. determining wing aerodynamic characteristics in many different operating environments), and whose primary interface to IPG will be through various sorts of problem solving frameworks. The second type of user is the tool designer: the computational scientists who convert physics and mathematics into code that can simulate the physical world. These are the two primary users of IPG, and they have rather different requirements. The results of the analysis of the needs of these two types of users provides a

  14. An efficient implementation of 3D high-resolution imaging for large-scale seismic data with GPU/CPU heterogeneous parallel computing

    Science.gov (United States)

    Xu, Jincheng; Liu, Wei; Wang, Jin; Liu, Linong; Zhang, Jianfeng

    2018-02-01

    De-absorption pre-stack time migration (QPSTM) compensates for the absorption and dispersion of seismic waves by introducing an effective Q parameter, thereby making it an effective tool for 3D, high-resolution imaging of seismic data. Although the optimal aperture obtained via stationary-phase migration reduces the computational cost of 3D QPSTM and yields 3D stationary-phase QPSTM, the associated computational efficiency is still the main problem in the processing of 3D, high-resolution images for real large-scale seismic data. In the current paper, we proposed a division method for large-scale, 3D seismic data to optimize the performance of stationary-phase QPSTM on clusters of graphics processing units (GPU). Then, we designed an imaging point parallel strategy to achieve an optimal parallel computing performance. Afterward, we adopted an asynchronous double buffering scheme for multi-stream to perform the GPU/CPU parallel computing. Moreover, several key optimization strategies of computation and storage based on the compute unified device architecture (CUDA) were adopted to accelerate the 3D stationary-phase QPSTM algorithm. Compared with the initial GPU code, the implementation of the key optimization steps, including thread optimization, shared memory optimization, register optimization and special function units (SFU), greatly improved the efficiency. A numerical example employing real large-scale, 3D seismic data showed that our scheme is nearly 80 times faster than the CPU-QPSTM algorithm. Our GPU/CPU heterogeneous parallel computing framework significant reduces the computational cost and facilitates 3D high-resolution imaging for large-scale seismic data.

  15. Pascal-SC a computer language for scientific computation

    CERN Document Server

    Bohlender, Gerd; von Gudenberg, Jürgen Wolff; Rheinboldt, Werner; Siewiorek, Daniel

    1987-01-01

    Perspectives in Computing, Vol. 17: Pascal-SC: A Computer Language for Scientific Computation focuses on the application of Pascal-SC, a programming language developed as an extension of standard Pascal, in scientific computation. The publication first elaborates on the introduction to Pascal-SC, a review of standard Pascal, and real floating-point arithmetic. Discussions focus on optimal scalar product, standard functions, real expressions, program structure, simple extensions, real floating-point arithmetic, vector and matrix arithmetic, and dynamic arrays. The text then examines functions a

  16. Unified Access Architecture for Large-Scale Scientific Datasets

    Science.gov (United States)

    Karna, Risav

    2014-05-01

    Data-intensive sciences have to deploy diverse large scale database technologies for data analytics as scientists have now been dealing with much larger volume than ever before. While array databases have bridged many gaps between the needs of data-intensive research fields and DBMS technologies (Zhang 2011), invocation of other big data tools accompanying these databases is still manual and separate the database management's interface. We identify this as an architectural challenge that will increasingly complicate the user's work flow owing to the growing number of useful but isolated and niche database tools. Such use of data analysis tools in effect leaves the burden on the user's end to synchronize the results from other data manipulation analysis tools with the database management system. To this end, we propose a unified access interface for using big data tools within large scale scientific array database using the database queries themselves to embed foreign routines belonging to the big data tools. Such an invocation of foreign data manipulation routines inside a query into a database can be made possible through a user-defined function (UDF). UDFs that allow such levels of freedom as to call modules from another language and interface back and forth between the query body and the side-loaded functions would be needed for this purpose. For the purpose of this research we attempt coupling of four widely used tools Hadoop (hadoop1), Matlab (matlab1), R (r1) and ScaLAPACK (scalapack1) with UDF feature of rasdaman (Baumann 98), an array-based data manager, for investigating this concept. The native array data model used by an array-based data manager provides compact data storage and high performance operations on ordered data such as spatial data, temporal data, and matrix-based data for linear algebra operations (scidbusr1). Performances issues arising due to coupling of tools with different paradigms, niche functionalities, separate processes and output

  17. Enabling systematic, harmonised and large-scale biofilms data computation: the Biofilms Experiment Workbench.

    Science.gov (United States)

    Pérez-Rodríguez, Gael; Glez-Peña, Daniel; Azevedo, Nuno F; Pereira, Maria Olívia; Fdez-Riverola, Florentino; Lourenço, Anália

    2015-03-01

    Biofilms are receiving increasing attention from the biomedical community. Biofilm-like growth within human body is considered one of the key microbial strategies to augment resistance and persistence during infectious processes. The Biofilms Experiment Workbench is a novel software workbench for the operation and analysis of biofilms experimental data. The goal is to promote the interchange and comparison of data among laboratories, providing systematic, harmonised and large-scale data computation. The workbench was developed with AIBench, an open-source Java desktop application framework for scientific software development in the domain of translational biomedicine. Implementation favours free and open-source third-parties, such as the R statistical package, and reaches for the Web services of the BiofOmics database to enable public experiment deposition. First, we summarise the novel, free, open, XML-based interchange format for encoding biofilms experimental data. Then, we describe the execution of common scenarios of operation with the new workbench, such as the creation of new experiments, the importation of data from Excel spreadsheets, the computation of analytical results, the on-demand and highly customised construction of Web publishable reports, and the comparison of results between laboratories. A considerable and varied amount of biofilms data is being generated, and there is a critical need to develop bioinformatics tools that expedite the interchange and comparison of microbiological and clinical results among laboratories. We propose a simple, open-source software infrastructure which is effective, extensible and easy to understand. The workbench is freely available for non-commercial use at http://sing.ei.uvigo.es/bew under LGPL license. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.

  18. Berkeley Lab Computing Sciences: Accelerating Scientific Discovery

    International Nuclear Information System (INIS)

    Hules, John A.

    2008-01-01

    Scientists today rely on advances in computer science, mathematics, and computational science, as well as large-scale computing and networking facilities, to increase our understanding of ourselves, our planet, and our universe. Berkeley Lab's Computing Sciences organization researches, develops, and deploys new tools and technologies to meet these needs and to advance research in such areas as global climate change, combustion, fusion energy, nanotechnology, biology, and astrophysics

  19. Full-color large-scaled computer-generated holograms using RGB color filters.

    Science.gov (United States)

    Tsuchiyama, Yasuhiro; Matsushima, Kyoji

    2017-02-06

    A technique using RGB color filters is proposed for creating high-quality full-color computer-generated holograms (CGHs). The fringe of these CGHs is composed of more than a billion pixels. The CGHs reconstruct full-parallax three-dimensional color images with a deep sensation of depth caused by natural motion parallax. The simulation technique as well as the principle and challenges of high-quality full-color reconstruction are presented to address the design of filter properties suitable for large-scaled CGHs. Optical reconstructions of actual fabricated full-color CGHs are demonstrated in order to verify the proposed techniques.

  20. Large scale and big data processing and management

    CERN Document Server

    Sakr, Sherif

    2014-01-01

    Large Scale and Big Data: Processing and Management provides readers with a central source of reference on the data management techniques currently available for large-scale data processing. Presenting chapters written by leading researchers, academics, and practitioners, it addresses the fundamental challenges associated with Big Data processing tools and techniques across a range of computing environments.The book begins by discussing the basic concepts and tools of large-scale Big Data processing and cloud computing. It also provides an overview of different programming models and cloud-bas

  1. Phylogenetic distribution of large-scale genome patchiness

    Directory of Open Access Journals (Sweden)

    Hackenberg Michael

    2008-04-01

    Full Text Available Abstract Background The phylogenetic distribution of large-scale genome structure (i.e. mosaic compositional patchiness has been explored mainly by analytical ultracentrifugation of bulk DNA. However, with the availability of large, good-quality chromosome sequences, and the recently developed computational methods to directly analyze patchiness on the genome sequence, an evolutionary comparative analysis can be carried out at the sequence level. Results The local variations in the scaling exponent of the Detrended Fluctuation Analysis are used here to analyze large-scale genome structure and directly uncover the characteristic scales present in genome sequences. Furthermore, through shuffling experiments of selected genome regions, computationally-identified, isochore-like regions were identified as the biological source for the uncovered large-scale genome structure. The phylogenetic distribution of short- and large-scale patchiness was determined in the best-sequenced genome assemblies from eleven eukaryotic genomes: mammals (Homo sapiens, Pan troglodytes, Mus musculus, Rattus norvegicus, and Canis familiaris, birds (Gallus gallus, fishes (Danio rerio, invertebrates (Drosophila melanogaster and Caenorhabditis elegans, plants (Arabidopsis thaliana and yeasts (Saccharomyces cerevisiae. We found large-scale patchiness of genome structure, associated with in silico determined, isochore-like regions, throughout this wide phylogenetic range. Conclusion Large-scale genome structure is detected by directly analyzing DNA sequences in a wide range of eukaryotic chromosome sequences, from human to yeast. In all these genomes, large-scale patchiness can be associated with the isochore-like regions, as directly detected in silico at the sequence level.

  2. Development of computational infrastructure to support hyper-resolution large-ensemble hydrology simulations from local-to-continental scales

    Data.gov (United States)

    National Aeronautics and Space Administration — Development of computational infrastructure to support hyper-resolution large-ensemble hydrology simulations from local-to-continental scales A move is currently...

  3. Hierarchical approach to optimization of parallel matrix multiplication on large-scale platforms

    KAUST Repository

    Hasanov, Khalid

    2014-03-04

    © 2014, Springer Science+Business Media New York. Many state-of-the-art parallel algorithms, which are widely used in scientific applications executed on high-end computing systems, were designed in the twentieth century with relatively small-scale parallelism in mind. Indeed, while in 1990s a system with few hundred cores was considered a powerful supercomputer, modern top supercomputers have millions of cores. In this paper, we present a hierarchical approach to optimization of message-passing parallel algorithms for execution on large-scale distributed-memory systems. The idea is to reduce the communication cost by introducing hierarchy and hence more parallelism in the communication scheme. We apply this approach to SUMMA, the state-of-the-art parallel algorithm for matrix–matrix multiplication, and demonstrate both theoretically and experimentally that the modified Hierarchical SUMMA significantly improves the communication cost and the overall performance on large-scale platforms.

  4. The Roles of Sparse Direct Methods in Large-scale Simulations

    Energy Technology Data Exchange (ETDEWEB)

    Li, Xiaoye S.; Gao, Weiguo; Husbands, Parry J.R.; Yang, Chao; Ng, Esmond G.

    2005-06-27

    Sparse systems of linear equations and eigen-equations arise at the heart of many large-scale, vital simulations in DOE. Examples include the Accelerator Science and Technology SciDAC (Omega3P code, electromagnetic problem), the Center for Extended Magnetohydrodynamic Modeling SciDAC(NIMROD and M3D-C1 codes, fusion plasma simulation). The Terascale Optimal PDE Simulations (TOPS)is providing high-performance sparse direct solvers, which have had significant impacts on these applications. Over the past several years, we have been working closely with the other SciDAC teams to solve their large, sparse matrix problems arising from discretization of the partial differential equations. Most of these systems are very ill-conditioned, resulting in extremely poor convergence deployed our direct methods techniques in these applications, which achieved significant scientific results as well as performance gains. These successes were made possible through the SciDAC model of computer scientists and application scientists working together to take full advantage of terascale computing systems and new algorithms research.

  5. The Roles of Sparse Direct Methods in Large-scale Simulations

    International Nuclear Information System (INIS)

    Li, Xiaoye S.; Gao, Weiguo; Husbands, Parry J.R.; Yang, Chao; Ng, Esmond G.

    2005-01-01

    Sparse systems of linear equations and eigen-equations arise at the heart of many large-scale, vital simulations in DOE. Examples include the Accelerator Science and Technology SciDAC (Omega3P code, electromagnetic problem), the Center for Extended Magnetohydrodynamic Modeling SciDAC(NIMROD and M3D-C1 codes, fusion plasma simulation). The Terascale Optimal PDE Simulations (TOPS)is providing high-performance sparse direct solvers, which have had significant impacts on these applications. Over the past several years, we have been working closely with the other SciDAC teams to solve their large, sparse matrix problems arising from discretization of the partial differential equations. Most of these systems are very ill-conditioned, resulting in extremely poor convergence deployed our direct methods techniques in these applications, which achieved significant scientific results as well as performance gains. These successes were made possible through the SciDAC model of computer scientists and application scientists working together to take full advantage of terascale computing systems and new algorithms research

  6. Performance of Air Pollution Models on Massively Parallel Computers

    DEFF Research Database (Denmark)

    Brown, John; Hansen, Per Christian; Wasniewski, Jerzy

    1996-01-01

    To compare the performance and use of three massively parallel SIMD computers, we implemented a large air pollution model on the computers. Using a realistic large-scale model, we gain detailed insight about the performance of the three computers when used to solve large-scale scientific problems...

  7. High-End Scientific Computing

    Science.gov (United States)

    EPA uses high-end scientific computing, geospatial services and remote sensing/imagery analysis to support EPA's mission. The Center for Environmental Computing (CEC) assists the Agency's program offices and regions to meet staff needs in these areas.

  8. 3D fast adaptive correlation imaging for large-scale gravity data based on GPU computation

    Science.gov (United States)

    Chen, Z.; Meng, X.; Guo, L.; Liu, G.

    2011-12-01

    In recent years, large scale gravity data sets have been collected and employed to enhance gravity problem-solving abilities of tectonics studies in China. Aiming at the large scale data and the requirement of rapid interpretation, previous authors have carried out a lot of work, including the fast gradient module inversion and Euler deconvolution depth inversion ,3-D physical property inversion using stochastic subspaces and equivalent storage, fast inversion using wavelet transforms and a logarithmic barrier method. So it can be say that 3-D gravity inversion has been greatly improved in the last decade. Many authors added many different kinds of priori information and constraints to deal with nonuniqueness using models composed of a large number of contiguous cells of unknown property and obtained good results. However, due to long computation time, instability and other shortcomings, 3-D physical property inversion has not been widely applied to large-scale data yet. In order to achieve 3-D interpretation with high efficiency and precision for geological and ore bodies and obtain their subsurface distribution, there is an urgent need to find a fast and efficient inversion method for large scale gravity data. As an entirely new geophysical inversion method, 3D correlation has a rapid development thanks to the advantage of requiring no a priori information and demanding small amount of computer memory. This method was proposed to image the distribution of equivalent excess masses of anomalous geological bodies with high resolution both longitudinally and transversely. In order to tranform the equivalence excess masses into real density contrasts, we adopt the adaptive correlation imaging for gravity data. After each 3D correlation imaging, we change the equivalence into density contrasts according to the linear relationship, and then carry out forward gravity calculation for each rectangle cells. Next, we compare the forward gravity data with real data, and

  9. Initial explorations of ARM processors for scientific computing

    International Nuclear Information System (INIS)

    Abdurachmanov, David; Elmer, Peter; Eulisse, Giulio; Muzaffar, Shahzad

    2014-01-01

    Power efficiency is becoming an ever more important metric for both high performance and high throughput computing. Over the course of next decade it is expected that flops/watt will be a major driver for the evolution of computer architecture. Servers with large numbers of ARM processors, already ubiquitous in mobile computing, are a promising alternative to traditional x86-64 computing. We present the results of our initial investigations into the use of ARM processors for scientific computing applications. In particular we report the results from our work with a current generation ARMv7 development board to explore ARM-specific issues regarding the software development environment, operating system, performance benchmarks and issues for porting High Energy Physics software

  10. Lightweight computational steering of very large scale molecular dynamics simulations

    International Nuclear Information System (INIS)

    Beazley, D.M.

    1996-01-01

    We present a computational steering approach for controlling, analyzing, and visualizing very large scale molecular dynamics simulations involving tens to hundreds of millions of atoms. Our approach relies on extensible scripting languages and an easy to use tool for building extensions and modules. The system is extremely easy to modify, works with existing C code, is memory efficient, and can be used from inexpensive workstations and networks. We demonstrate how we have used this system to manipulate data from production MD simulations involving as many as 104 million atoms running on the CM-5 and Cray T3D. We also show how this approach can be used to build systems that integrate common scripting languages (including Tcl/Tk, Perl, and Python), simulation code, user extensions, and commercial data analysis packages

  11. Topology Optimization of Large Scale Stokes Flow Problems

    DEFF Research Database (Denmark)

    Aage, Niels; Poulsen, Thomas Harpsøe; Gersborg-Hansen, Allan

    2008-01-01

    This note considers topology optimization of large scale 2D and 3D Stokes flow problems using parallel computations. We solve problems with up to 1.125.000 elements in 2D and 128.000 elements in 3D on a shared memory computer consisting of Sun UltraSparc IV CPUs.......This note considers topology optimization of large scale 2D and 3D Stokes flow problems using parallel computations. We solve problems with up to 1.125.000 elements in 2D and 128.000 elements in 3D on a shared memory computer consisting of Sun UltraSparc IV CPUs....

  12. Traffic Flow Prediction Model for Large-Scale Road Network Based on Cloud Computing

    Directory of Open Access Journals (Sweden)

    Zhaosheng Yang

    2014-01-01

    Full Text Available To increase the efficiency and precision of large-scale road network traffic flow prediction, a genetic algorithm-support vector machine (GA-SVM model based on cloud computing is proposed in this paper, which is based on the analysis of the characteristics and defects of genetic algorithm and support vector machine. In cloud computing environment, firstly, SVM parameters are optimized by the parallel genetic algorithm, and then this optimized parallel SVM model is used to predict traffic flow. On the basis of the traffic flow data of Haizhu District in Guangzhou City, the proposed model was verified and compared with the serial GA-SVM model and parallel GA-SVM model based on MPI (message passing interface. The results demonstrate that the parallel GA-SVM model based on cloud computing has higher prediction accuracy, shorter running time, and higher speedup.

  13. On-demand Overlay Networks for Large Scientific Data Transfers

    Energy Technology Data Exchange (ETDEWEB)

    Ramakrishnan, Lavanya [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Guok, Chin [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Jackson, Keith [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Kissel, Ezra [Univ. of Delaware, Newark, DE (United States); Swany, D. Martin [Univ. of Delaware, Newark, DE (United States); Agarwal, Deborah [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)

    2009-10-12

    Large scale scientific data transfers are central to scientific processes. Data from large experimental facilities have to be moved to local institutions for analysis or often data needs to be moved between local clusters and large supercomputing centers. In this paper, we propose and evaluate a network overlay architecture to enable highthroughput, on-demand, coordinated data transfers over wide-area networks. Our work leverages Phoebus and On-demand Secure Circuits and AdvanceReservation System (OSCARS) to provide high performance wide-area network connections. OSCARS enables dynamic provisioning of network paths with guaranteed bandwidth and Phoebus enables the coordination and effective utilization of the OSCARS network paths. Our evaluation shows that this approach leads to improved end-to-end data transfer throughput with minimal overheads. The achievedthroughput using our overlay was limited only by the ability of the end hosts to sink the data.

  14. Practical scientific computing

    CERN Document Server

    Muhammad, A

    2011-01-01

    Scientific computing is about developing mathematical models, numerical methods and computer implementations to study and solve real problems in science, engineering, business and even social sciences. Mathematical modelling requires deep understanding of classical numerical methods. This essential guide provides the reader with sufficient foundations in these areas to venture into more advanced texts. The first section of the book presents numEclipse, an open source tool for numerical computing based on the notion of MATLAB®. numEclipse is implemented as a plug-in for Eclipse, a leading integ

  15. Large-Scale Data Collection Metadata Management at the National Computation Infrastructure

    Science.gov (United States)

    Wang, J.; Evans, B. J. K.; Bastrakova, I.; Ryder, G.; Martin, J.; Duursma, D.; Gohar, K.; Mackey, T.; Paget, M.; Siddeswara, G.

    2014-12-01

    Data Collection management has become an essential activity at the National Computation Infrastructure (NCI) in Australia. NCI's partners (CSIRO, Bureau of Meteorology, Australian National University, and Geoscience Australia), supported by the Australian Government and Research Data Storage Infrastructure (RDSI), have established a national data resource that is co-located with high-performance computing. This paper addresses the metadata management of these data assets over their lifetime. NCI manages 36 data collections (10+ PB) categorised as earth system sciences, climate and weather model data assets and products, earth and marine observations and products, geosciences, terrestrial ecosystem, water management and hydrology, astronomy, social science and biosciences. The data is largely sourced from NCI partners, the custodians of many of the national scientific records, and major research community organisations. The data is made available in a HPC and data-intensive environment - a ~56000 core supercomputer, virtual labs on a 3000 core cloud system, and data services. By assembling these large national assets, new opportunities have arisen to harmonise the data collections, making a powerful cross-disciplinary resource.To support the overall management, a Data Management Plan (DMP) has been developed to record the workflows, procedures, the key contacts and responsibilities. The DMP has fields that can be exported to the ISO19115 schema and to the collection level catalogue of GeoNetwork. The subset or file level metadata catalogues are linked with the collection level through parent-child relationship definition using UUID. A number of tools have been developed that support interactive metadata management, bulk loading of data, and support for computational workflows or data pipelines. NCI creates persistent identifiers for each of the assets. The data collection is tracked over its lifetime, and the recognition of the data providers, data owners, data

  16. Using Relational Reasoning to Learn about Scientific Phenomena at Unfamiliar Scales

    Science.gov (United States)

    Resnick, Ilyse; Davatzes, Alexandra; Newcombe, Nora S.; Shipley, Thomas F.

    2017-01-01

    Many scientific theories and discoveries involve reasoning about extreme scales, removed from human experience, such as time in geology and size in nanoscience. Thus, understanding scale is central to science, technology, engineering, and mathematics. Unfortunately, novices have trouble understanding and comparing sizes of unfamiliar large and…

  17. Large-scale multimedia modeling applications

    International Nuclear Information System (INIS)

    Droppo, J.G. Jr.; Buck, J.W.; Whelan, G.; Strenge, D.L.; Castleton, K.J.; Gelston, G.M.

    1995-08-01

    Over the past decade, the US Department of Energy (DOE) and other agencies have faced increasing scrutiny for a wide range of environmental issues related to past and current practices. A number of large-scale applications have been undertaken that required analysis of large numbers of potential environmental issues over a wide range of environmental conditions and contaminants. Several of these applications, referred to here as large-scale applications, have addressed long-term public health risks using a holistic approach for assessing impacts from potential waterborne and airborne transport pathways. Multimedia models such as the Multimedia Environmental Pollutant Assessment System (MEPAS) were designed for use in such applications. MEPAS integrates radioactive and hazardous contaminants impact computations for major exposure routes via air, surface water, ground water, and overland flow transport. A number of large-scale applications of MEPAS have been conducted to assess various endpoints for environmental and human health impacts. These applications are described in terms of lessons learned in the development of an effective approach for large-scale applications

  18. Large-Scale, Multi-Sensor Atmospheric Data Fusion Using Hybrid Cloud Computing

    Science.gov (United States)

    Wilson, B. D.; Manipon, G.; Hua, H.; Fetzer, E. J.

    2015-12-01

    NASA's Earth Observing System (EOS) is an ambitious facility for studying global climate change. The mandate now is to combine measurements from the instruments on the "A-Train" platforms (AIRS, MODIS, MLS, and CloudSat) and other Earth probes to enable large-scale studies of climate change over decades. Moving to multi-sensor, long-duration presents serious challenges for large-scale data mining and fusion. For example, one might want to compare temperature and water vapor retrievals from one instrument (AIRS) to another (MODIS), and to a model (ECMWF), stratify the comparisons using a classification of the "cloud scenes" from CloudSat, and repeat the entire analysis over 10 years of data. HySDS is a Hybrid-Cloud Science Data System that has been developed and applied under NASA AIST, MEaSUREs, and ACCESS grants. HySDS uses the SciFlow workflow engine to partition analysis workflows into parallel tasks (e.g. segmenting by time or space) that are pushed into a durable job queue. The tasks are "pulled" from the queue by worker Virtual Machines (VM's) and executed in an on-premise Cloud (Eucalyptus or OpenStack) or at Amazon in the public Cloud or govCloud. In this way, years of data (millions of files) can be processed in a massively parallel way. Input variables (arrays) are pulled on-demand into the Cloud using OPeNDAP URLs or other subsetting services, thereby minimizing the size of the transferred data. We are using HySDS to automate the production of multiple versions of a ten-year A-Train water vapor climatology under a MEASURES grant. We will present the architecture of HySDS, describe the achieved "clock time" speedups in fusing datasets on our own nodes and in the Amazon Cloud, and discuss the Cloud cost tradeoffs for storage, compute, and data transfer. Our system demonstrates how one can pull A-Train variables (Levels 2 & 3) on-demand into the Amazon Cloud, and cache only those variables that are heavily used, so that any number of compute jobs can be

  19. Scientific computing in electrical engineering SCEE 2010

    Energy Technology Data Exchange (ETDEWEB)

    Michielsen, Bastiaan [Office National d' Etudes et de Recherches Aerospatiales (ONERA), 31 - Toulouse (France); Poirier, Jean-Rene (eds.) [LAPLACE-ENSEEIHT, Toulouse (France)

    2012-07-01

    Selected from papers presented at the 8th Scientific Computation in Electrical Engineering conference in Toulouse in 2010, the contributions to this volume cover every angle of numerically modelling electronic and electrical systems, including computational electromagnetics, circuit theory and simulation and device modelling. On computational electromagnetics, the chapters examine cutting-edge material ranging from low-frequency electrical machine modelling problems to issues in high-frequency scattering. Regarding circuit theory and simulation, the book details the most advanced techniques for modelling networks with many thousands of components. Modelling devices at microscopic levels is covered by a number of fundamental mathematical physics papers, while numerous papers on model order reduction help engineers and systems designers to bring their modelling of industrial-scale systems within the reach of present-day computational power. Complementing these more specific papers, the volume also contains a selection of mathematical methods which can be used in any application domain. (orig.)

  20. Remote collaboration system based on large scale simulation

    International Nuclear Information System (INIS)

    Kishimoto, Yasuaki; Sugahara, Akihiro; Li, J.Q.

    2008-01-01

    Large scale simulation using super-computer, which generally requires long CPU time and produces large amount of data, has been extensively studied as a third pillar in various advanced science fields in parallel to theory and experiment. Such a simulation is expected to lead new scientific discoveries through elucidation of various complex phenomena, which are hardly identified only by conventional theoretical and experimental approaches. In order to assist such large simulation studies for which many collaborators working at geographically different places participate and contribute, we have developed a unique remote collaboration system, referred to as SIMON (simulation monitoring system), which is based on client-server system control introducing an idea of up-date processing, contrary to that of widely used post-processing. As a key ingredient, we have developed a trigger method, which transmits various requests for the up-date processing from the simulation (client) running on a super-computer to a workstation (server). Namely, the simulation running on a super-computer actively controls the timing of up-date processing. The server that has received the requests from the ongoing simulation such as data transfer, data analyses, and visualizations, etc. starts operations according to the requests during the simulation. The server makes the latest results available to web browsers, so that the collaborators can monitor the results at any place and time in the world. By applying the system to a specific simulation project of laser-matter interaction, we have confirmed that the system works well and plays an important role as a collaboration platform on which many collaborators work with one another

  1. DOE High Performance Computing Operational Review (HPCOR): Enabling Data-Driven Scientific Discovery at HPC Facilities

    Energy Technology Data Exchange (ETDEWEB)

    Gerber, Richard; Allcock, William; Beggio, Chris; Campbell, Stuart; Cherry, Andrew; Cholia, Shreyas; Dart, Eli; England, Clay; Fahey, Tim; Foertter, Fernanda; Goldstone, Robin; Hick, Jason; Karelitz, David; Kelly, Kaki; Monroe, Laura; Prabhat,; Skinner, David; White, Julia

    2014-10-17

    U.S. Department of Energy (DOE) High Performance Computing (HPC) facilities are on the verge of a paradigm shift in the way they deliver systems and services to science and engineering teams. Research projects are producing a wide variety of data at unprecedented scale and level of complexity, with community-specific services that are part of the data collection and analysis workflow. On June 18-19, 2014 representatives from six DOE HPC centers met in Oakland, CA at the DOE High Performance Operational Review (HPCOR) to discuss how they can best provide facilities and services to enable large-scale data-driven scientific discovery at the DOE national laboratories. The report contains findings from that review.

  2. Large-scale Intelligent Transporation Systems simulation

    Energy Technology Data Exchange (ETDEWEB)

    Ewing, T.; Canfield, T.; Hannebutte, U.; Levine, D.; Tentner, A.

    1995-06-01

    A prototype computer system has been developed which defines a high-level architecture for a large-scale, comprehensive, scalable simulation of an Intelligent Transportation System (ITS) capable of running on massively parallel computers and distributed (networked) computer systems. The prototype includes the modelling of instrumented ``smart`` vehicles with in-vehicle navigation units capable of optimal route planning and Traffic Management Centers (TMC). The TMC has probe vehicle tracking capabilities (display position and attributes of instrumented vehicles), and can provide 2-way interaction with traffic to provide advisories and link times. Both the in-vehicle navigation module and the TMC feature detailed graphical user interfaces to support human-factors studies. The prototype has been developed on a distributed system of networked UNIX computers but is designed to run on ANL`s IBM SP-X parallel computer system for large scale problems. A novel feature of our design is that vehicles will be represented by autonomus computer processes, each with a behavior model which performs independent route selection and reacts to external traffic events much like real vehicles. With this approach, one will be able to take advantage of emerging massively parallel processor (MPP) systems.

  3. Computers and Computation. Readings from Scientific American.

    Science.gov (United States)

    Fenichel, Robert R.; Weizenbaum, Joseph

    A collection of articles from "Scientific American" magazine has been put together at this time because the current period in computer science is one of consolidation rather than innovation. A few years ago, computer science was moving so swiftly that even the professional journals were more archival than informative; but today it is…

  4. Computer network access to scientific information systems for minority universities

    Science.gov (United States)

    Thomas, Valerie L.; Wakim, Nagi T.

    1993-08-01

    The evolution of computer networking technology has lead to the establishment of a massive networking infrastructure which interconnects various types of computing resources at many government, academic, and corporate institutions. A large segment of this infrastructure has been developed to facilitate information exchange and resource sharing within the scientific community. The National Aeronautics and Space Administration (NASA) supports both the development and the application of computer networks which provide its community with access to many valuable multi-disciplinary scientific information systems and on-line databases. Recognizing the need to extend the benefits of this advanced networking technology to the under-represented community, the National Space Science Data Center (NSSDC) in the Space Data and Computing Division at the Goddard Space Flight Center has developed the Minority University-Space Interdisciplinary Network (MU-SPIN) Program: a major networking and education initiative for Historically Black Colleges and Universities (HBCUs) and Minority Universities (MUs). In this paper, we will briefly explain the various components of the MU-SPIN Program while highlighting how, by providing access to scientific information systems and on-line data, it promotes a higher level of collaboration among faculty and students and NASA scientists.

  5. Deterministic methods for sensitivity and uncertainty analysis in large-scale computer models

    International Nuclear Information System (INIS)

    Worley, B.A.; Oblow, E.M.; Pin, F.G.; Maerker, R.E.; Horwedel, J.E.; Wright, R.Q.; Lucius, J.L.

    1987-01-01

    The fields of sensitivity and uncertainty analysis are dominated by statistical techniques when large-scale modeling codes are being analyzed. This paper reports on the development and availability of two systems, GRESS and ADGEN, that make use of computer calculus compilers to automate the implementation of deterministic sensitivity analysis capability into existing computer models. This automation removes the traditional limitation of deterministic sensitivity methods. The paper describes a deterministic uncertainty analysis method (DUA) that uses derivative information as a basis to propagate parameter probability distributions to obtain result probability distributions. The paper demonstrates the deterministic approach to sensitivity and uncertainty analysis as applied to a sample problem that models the flow of water through a borehole. The sample problem is used as a basis to compare the cumulative distribution function of the flow rate as calculated by the standard statistical methods and the DUA method. The DUA method gives a more accurate result based upon only two model executions compared to fifty executions in the statistical case

  6. UNEDF: Advanced Scientific Computing Collaboration Transforms the Low-Energy Nuclear Many-Body Problem

    International Nuclear Information System (INIS)

    Nam, H; Stoitsov, M; Nazarewicz, W; Hagen, G; Kortelainen, M; Pei, J C; Bulgac, A; Maris, P; Vary, J P; Roche, K J; Schunck, N; Thompson, I; Wild, S M

    2012-01-01

    The demands of cutting-edge science are driving the need for larger and faster computing resources. With the rapidly growing scale of computing systems and the prospect of technologically disruptive architectures to meet these needs, scientists face the challenge of effectively using complex computational resources to advance scientific discovery. Multi-disciplinary collaborating networks of researchers with diverse scientific backgrounds are needed to address these complex challenges. The UNEDF SciDAC collaboration of nuclear theorists, applied mathematicians, and computer scientists is developing a comprehensive description of nuclei and their reactions that delivers maximum predictive power with quantified uncertainties. This paper describes UNEDF and identifies attributes that classify it as a successful computational collaboration. We illustrate significant milestones accomplished by UNEDF through integrative solutions using the most reliable theoretical approaches, most advanced algorithms, and leadership-class computational resources.

  7. Finding Tropical Cyclones on a Cloud Computing Cluster: Using Parallel Virtualization for Large-Scale Climate Simulation Analysis

    Energy Technology Data Exchange (ETDEWEB)

    Hasenkamp, Daren; Sim, Alexander; Wehner, Michael; Wu, Kesheng

    2010-09-30

    Extensive computing power has been used to tackle issues such as climate changes, fusion energy, and other pressing scientific challenges. These computations produce a tremendous amount of data; however, many of the data analysis programs currently only run a single processor. In this work, we explore the possibility of using the emerging cloud computing platform to parallelize such sequential data analysis tasks. As a proof of concept, we wrap a program for analyzing trends of tropical cyclones in a set of virtual machines (VMs). This approach allows the user to keep their familiar data analysis environment in the VMs, while we provide the coordination and data transfer services to ensure the necessary input and output are directed to the desired locations. This work extensively exercises the networking capability of the cloud computing systems and has revealed a number of weaknesses in the current cloud system software. In our tests, we are able to scale the parallel data analysis job to a modest number of VMs and achieve a speedup that is comparable to running the same analysis task using MPI. However, compared to MPI based parallelization, the cloud-based approach has a number of advantages. The cloud-based approach is more flexible because the VMs can capture arbitrary software dependencies without requiring the user to rewrite their programs. The cloud-based approach is also more resilient to failure; as long as a single VM is running, it can make progress while as soon as one MPI node fails the whole analysis job fails. In short, this initial work demonstrates that a cloud computing system is a viable platform for distributed scientific data analyses traditionally conducted on dedicated supercomputing systems.

  8. Finding Tropical Cyclones on a Cloud Computing Cluster: Using Parallel Virtualization for Large-Scale Climate Simulation Analysis

    International Nuclear Information System (INIS)

    Hasenkamp, Daren; Sim, Alexander; Wehner, Michael; Wu, Kesheng

    2010-01-01

    Extensive computing power has been used to tackle issues such as climate changes, fusion energy, and other pressing scientific challenges. These computations produce a tremendous amount of data; however, many of the data analysis programs currently only run a single processor. In this work, we explore the possibility of using the emerging cloud computing platform to parallelize such sequential data analysis tasks. As a proof of concept, we wrap a program for analyzing trends of tropical cyclones in a set of virtual machines (VMs). This approach allows the user to keep their familiar data analysis environment in the VMs, while we provide the coordination and data transfer services to ensure the necessary input and output are directed to the desired locations. This work extensively exercises the networking capability of the cloud computing systems and has revealed a number of weaknesses in the current cloud system software. In our tests, we are able to scale the parallel data analysis job to a modest number of VMs and achieve a speedup that is comparable to running the same analysis task using MPI. However, compared to MPI based parallelization, the cloud-based approach has a number of advantages. The cloud-based approach is more flexible because the VMs can capture arbitrary software dependencies without requiring the user to rewrite their programs. The cloud-based approach is also more resilient to failure; as long as a single VM is running, it can make progress while as soon as one MPI node fails the whole analysis job fails. In short, this initial work demonstrates that a cloud computing system is a viable platform for distributed scientific data analyses traditionally conducted on dedicated supercomputing systems.

  9. Parallel Index and Query for Large Scale Data Analysis

    Energy Technology Data Exchange (ETDEWEB)

    Chou, Jerry; Wu, Kesheng; Ruebel, Oliver; Howison, Mark; Qiang, Ji; Prabhat,; Austin, Brian; Bethel, E. Wes; Ryne, Rob D.; Shoshani, Arie

    2011-07-18

    Modern scientific datasets present numerous data management and analysis challenges. State-of-the-art index and query technologies are critical for facilitating interactive exploration of large datasets, but numerous challenges remain in terms of designing a system for process- ing general scientific datasets. The system needs to be able to run on distributed multi-core platforms, efficiently utilize underlying I/O infrastructure, and scale to massive datasets. We present FastQuery, a novel software framework that address these challenges. FastQuery utilizes a state-of-the-art index and query technology (FastBit) and is designed to process mas- sive datasets on modern supercomputing platforms. We apply FastQuery to processing of a massive 50TB dataset generated by a large scale accelerator modeling code. We demonstrate the scalability of the tool to 11,520 cores. Motivated by the scientific need to search for inter- esting particles in this dataset, we use our framework to reduce search time from hours to tens of seconds.

  10. Scientific computing

    CERN Document Server

    Trangenstein, John A

    2017-01-01

    This is the third of three volumes providing a comprehensive presentation of the fundamentals of scientific computing. This volume discusses topics that depend more on calculus than linear algebra, in order to prepare the reader for solving differential equations. This book and its companions show how to determine the quality of computational results, and how to measure the relative efficiency of competing methods. Readers learn how to determine the maximum attainable accuracy of algorithms, and how to select the best method for computing problems. This book also discusses programming in several languages, including C++, Fortran and MATLAB. There are 90 examples, 200 exercises, 36 algorithms, 40 interactive JavaScript programs, 91 references to software programs and 1 case study. Topics are introduced with goals, literature references and links to public software. There are descriptions of the current algorithms in GSLIB and MATLAB. This book could be used for a second course in numerical methods, for either ...

  11. Development and application of a computer model for large-scale flame acceleration experiments

    International Nuclear Information System (INIS)

    Marx, K.D.

    1987-07-01

    A new computational model for large-scale premixed flames is developed and applied to the simulation of flame acceleration experiments. The primary objective is to circumvent the necessity for resolving turbulent flame fronts; this is imperative because of the relatively coarse computational grids which must be used in engineering calculations. The essence of the model is to artificially thicken the flame by increasing the appropriate diffusivities and decreasing the combustion rate, but to do this in such a way that the burn velocity varies with pressure, temperature, and turbulence intensity according to prespecified phenomenological characteristics. The model is particularly aimed at implementation in computer codes which simulate compressible flows. To this end, it is applied to the two-dimensional simulation of hydrogen-air flame acceleration experiments in which the flame speeds and gas flow velocities attain or exceed the speed of sound in the gas. It is shown that many of the features of the flame trajectories and pressure histories in the experiments are simulated quite well by the model. Using the comparison of experimental and computational results as a guide, some insight is developed into the processes which occur in such experiments. 34 refs., 25 figs., 4 tabs

  12. JINR CLOUD SERVICE FOR SCIENTIFIC AND ENGINEERING COMPUTATIONS

    Directory of Open Access Journals (Sweden)

    Nikita A. Balashov

    2018-03-01

    Full Text Available Pretty often small research scientific groups do not have access to powerful enough computational resources required for their research work to be productive. Global computational infrastructures used by large scientific collaborations can be challenging for small research teams because of bureaucracy overhead as well as usage complexity of underlying tools. Some researchers buy a set of powerful servers to cover their own needs in computational resources. A drawback of such approach is a necessity to take care about proper hosting environment for these hardware and maintenance which requires a certain level of expertise. Moreover a lot of time such resources may be underutilized because а researcher needs to spend a certain amount of time to prepare computations and to analyze results as well as he doesn’t always need all resources of modern multi-core CPUs servers. The JINR cloud team developed a service which provides an access for scientists of small research groups from JINR and its Member State organizations to computational resources via problem-oriented (i.e. application-specific web-interface. It allows a scientist to focus on his research domain by interacting with the service in a convenient way via browser and abstracting away from underlying infrastructure as well as its maintenance. A user just sets a required values for his job via web-interface and specify a location for uploading a result. The computational workloads are done on the virtual machines deployed in the JINR cloud infrastructure.

  13. Systematic control of large computer programs

    International Nuclear Information System (INIS)

    Goedbloed, J.P.; Klieb, L.

    1986-07-01

    A package of CCL, UPDATE, and FORTRAN procedures is described which facilitates the systematic control and development of large scientific computer programs. The package provides a general tool box for this purpose which contains many conveniences for the systematic administration of files, editing, reformating of line printer output files, etc. In addition, a small number of procedures is devoted to the problem of structured development of a large computer program which is used by a group of scientists. The essence of the method is contained in three procedures N, R, and X for the creation of a new UPDATE program library, its revision, and execution, resp., and a procedure REVISE which provides a joint editor - UPDATE session which combines the advantages of the two systems, viz. speed and rigor. (Auth.)

  14. Towards a Scalable and Adaptive Application Support Platform for Large-Scale Distributed E-Sciences in High-Performance Network Environments

    Energy Technology Data Exchange (ETDEWEB)

    Wu, Chase Qishi [New Jersey Inst. of Technology, Newark, NJ (United States); Univ. of Memphis, TN (United States); Zhu, Michelle Mengxia [Southern Illinois Univ., Carbondale, IL (United States)

    2016-06-06

    The advent of large-scale collaborative scientific applications has demonstrated the potential for broad scientific communities to pool globally distributed resources to produce unprecedented data acquisition, movement, and analysis. System resources including supercomputers, data repositories, computing facilities, network infrastructures, storage systems, and display devices have been increasingly deployed at national laboratories and academic institutes. These resources are typically shared by large communities of users over Internet or dedicated networks and hence exhibit an inherent dynamic nature in their availability, accessibility, capacity, and stability. Scientific applications using either experimental facilities or computation-based simulations with various physical, chemical, climatic, and biological models feature diverse scientific workflows as simple as linear pipelines or as complex as a directed acyclic graphs, which must be executed and supported over wide-area networks with massively distributed resources. Application users oftentimes need to manually configure their computing tasks over networks in an ad hoc manner, hence significantly limiting the productivity of scientists and constraining the utilization of resources. The success of these large-scale distributed applications requires a highly adaptive and massively scalable workflow platform that provides automated and optimized computing and networking services. This project is to design and develop a generic Scientific Workflow Automation and Management Platform (SWAMP), which contains a web-based user interface specially tailored for a target application, a set of user libraries, and several easy-to-use computing and networking toolkits for application scientists to conveniently assemble, execute, monitor, and control complex computing workflows in heterogeneous high-performance network environments. SWAMP will enable the automation and management of the entire process of scientific

  15. Amplification of large-scale magnetic field in nonhelical magnetohydrodynamics

    KAUST Repository

    Kumar, Rohit

    2017-08-11

    It is typically assumed that the kinetic and magnetic helicities play a crucial role in the growth of large-scale dynamo. In this paper, we demonstrate that helicity is not essential for the amplification of large-scale magnetic field. For this purpose, we perform nonhelical magnetohydrodynamic (MHD) simulation, and show that the large-scale magnetic field can grow in nonhelical MHD when random external forcing is employed at scale 1/10 the box size. The energy fluxes and shell-to-shell transfer rates computed using the numerical data show that the large-scale magnetic energy grows due to the energy transfers from the velocity field at the forcing scales.

  16. Parallel supercomputing: Advanced methods, algorithms, and software for large-scale linear and nonlinear problems

    Energy Technology Data Exchange (ETDEWEB)

    Carey, G.F.; Young, D.M.

    1993-12-31

    The program outlined here is directed to research on methods, algorithms, and software for distributed parallel supercomputers. Of particular interest are finite element methods and finite difference methods together with sparse iterative solution schemes for scientific and engineering computations of very large-scale systems. Both linear and nonlinear problems will be investigated. In the nonlinear case, applications with bifurcation to multiple solutions will be considered using continuation strategies. The parallelizable numerical methods of particular interest are a family of partitioning schemes embracing domain decomposition, element-by-element strategies, and multi-level techniques. The methods will be further developed incorporating parallel iterative solution algorithms with associated preconditioners in parallel computer software. The schemes will be implemented on distributed memory parallel architectures such as the CRAY MPP, Intel Paragon, the NCUBE3, and the Connection Machine. We will also consider other new architectures such as the Kendall-Square (KSQ) and proposed machines such as the TERA. The applications will focus on large-scale three-dimensional nonlinear flow and reservoir problems with strong convective transport contributions. These are legitimate grand challenge class computational fluid dynamics (CFD) problems of significant practical interest to DOE. The methods developed and algorithms will, however, be of wider interest.

  17. Using Amazon's Elastic Compute Cloud to scale CMS' compute hardware dynamically.

    CERN Document Server

    Melo, Andrew Malone

    2011-01-01

    Large international scientific collaborations such as the Compact Muon Solenoid (CMS) experiment at the Large Hadron Collider have traditionally addressed their data reduction and analysis needs by building and maintaining dedicated computational infrastructure. Emerging cloud-computing services such as Amazon's Elastic Compute Cloud (EC2) offer short-term CPU and storage resources with costs based on usage. These services allow experiments to purchase computing resources as needed, without significant prior planning and without long term investments in facilities and their management. We have demonstrated that services such as EC2 can successfully be integrated into the production-computing model of CMS, and find that they work very well as worker nodes. The cost-structure and transient nature of EC2 services makes them inappropriate for some CMS production services and functions. We also found that the resources are not truely on-demand as limits and caps on usage are imposed. Our trial workflows allow us t...

  18. Large-scale networks in engineering and life sciences

    CERN Document Server

    Findeisen, Rolf; Flockerzi, Dietrich; Reichl, Udo; Sundmacher, Kai

    2014-01-01

    This edited volume provides insights into and tools for the modeling, analysis, optimization, and control of large-scale networks in the life sciences and in engineering. Large-scale systems are often the result of networked interactions between a large number of subsystems, and their analysis and control are becoming increasingly important. The chapters of this book present the basic concepts and theoretical foundations of network theory and discuss its applications in different scientific areas such as biochemical reactions, chemical production processes, systems biology, electrical circuits, and mobile agents. The aim is to identify common concepts, to understand the underlying mathematical ideas, and to inspire discussions across the borders of the various disciplines.  The book originates from the interdisciplinary summer school “Large Scale Networks in Engineering and Life Sciences” hosted by the International Max Planck Research School Magdeburg, September 26-30, 2011, and will therefore be of int...

  19. RXY/DRXY-a postprocessing graphical system for scientific computation

    International Nuclear Information System (INIS)

    Jin Qijie

    1990-01-01

    Scientific computing require computer graphical function for its visualization. The developing objects and functions of a postprocessing graphical system for scientific computation are described, and also briefly described its implementation

  20. Seismic safety in conducting large-scale blasts

    Science.gov (United States)

    Mashukov, I. V.; Chaplygin, V. V.; Domanov, V. P.; Semin, A. A.; Klimkin, M. A.

    2017-09-01

    In mining enterprises to prepare hard rocks for excavation a drilling and blasting method is used. With the approach of mining operations to settlements the negative effect of large-scale blasts increases. To assess the level of seismic impact of large-scale blasts the scientific staff of Siberian State Industrial University carried out expertise for coal mines and iron ore enterprises. Determination of the magnitude of surface seismic vibrations caused by mass explosions was performed using seismic receivers, an analog-digital converter with recording on a laptop. The registration results of surface seismic vibrations during production of more than 280 large-scale blasts at 17 mining enterprises in 22 settlements are presented. The maximum velocity values of the Earth’s surface vibrations are determined. The safety evaluation of seismic effect was carried out according to the permissible value of vibration velocity. For cases with exceedance of permissible values recommendations were developed to reduce the level of seismic impact.

  1. Large-Scale Sentinel-1 Processing for Solid Earth Science and Urgent Response using Cloud Computing and Machine Learning

    Science.gov (United States)

    Hua, H.; Owen, S. E.; Yun, S. H.; Agram, P. S.; Manipon, G.; Starch, M.; Sacco, G. F.; Bue, B. D.; Dang, L. B.; Linick, J. P.; Malarout, N.; Rosen, P. A.; Fielding, E. J.; Lundgren, P.; Moore, A. W.; Liu, Z.; Farr, T.; Webb, F.; Simons, M.; Gurrola, E. M.

    2017-12-01

    With the increased availability of open SAR data (e.g. Sentinel-1 A/B), new challenges are being faced with processing and analyzing the voluminous SAR datasets to make geodetic measurements. Upcoming SAR missions such as NISAR are expected to generate close to 100TB per day. The Advanced Rapid Imaging and Analysis (ARIA) project can now generate geocoded unwrapped phase and coherence products from Sentinel-1 TOPS mode data in an automated fashion, using the ISCE software. This capability is currently being exercised on various study sites across the United States and around the globe, including Hawaii, Central California, Iceland and South America. The automated and large-scale SAR data processing and analysis capabilities use cloud computing techniques to speed the computations and provide scalable processing power and storage. Aspects such as how to processing these voluminous SLCs and interferograms at global scales, keeping up with the large daily SAR data volumes, and how to handle the voluminous data rates are being explored. Scene-partitioning approaches in the processing pipeline help in handling global-scale processing up to unwrapped interferograms with stitching done at a late stage. We have built an advanced science data system with rapid search functions to enable access to the derived data products. Rapid image processing of Sentinel-1 data to interferograms and time series is already being applied to natural hazards including earthquakes, floods, volcanic eruptions, and land subsidence due to fluid withdrawal. We will present the status of the ARIA science data system for generating science-ready data products and challenges that arise from being able to process SAR datasets to derived time series data products at large scales. For example, how do we perform large-scale data quality screening on interferograms? What approaches can be used to minimize compute, storage, and data movement costs for time series analysis in the cloud? We will also

  2. Large-scale Labeled Datasets to Fuel Earth Science Deep Learning Applications

    Science.gov (United States)

    Maskey, M.; Ramachandran, R.; Miller, J.

    2017-12-01

    Deep learning has revolutionized computer vision and natural language processing with various algorithms scaled using high-performance computing. However, generic large-scale labeled datasets such as the ImageNet are the fuel that drives the impressive accuracy of deep learning results. Large-scale labeled datasets already exist in domains such as medical science, but creating them in the Earth science domain is a challenge. While there are ways to apply deep learning using limited labeled datasets, there is a need in the Earth sciences for creating large-scale labeled datasets for benchmarking and scaling deep learning applications. At the NASA Marshall Space Flight Center, we are using deep learning for a variety of Earth science applications where we have encountered the need for large-scale labeled datasets. We will discuss our approaches for creating such datasets and why these datasets are just as valuable as deep learning algorithms. We will also describe successful usage of these large-scale labeled datasets with our deep learning based applications.

  3. Computational Cosmology: from the Early Universe to the Large Scale Structure

    Directory of Open Access Journals (Sweden)

    Peter Anninos

    1998-09-01

    Full Text Available In order to account for the observable Universe, any comprehensive theory or model of cosmology must draw from many disciplines of physics, including gauge theories of strong and weak interactions, the hydrodynamics and microphysics of baryonic matter, electromagnetic fields, and spacetime curvature, for example. Although it is difficult to incorporate all these physical elements into a single complete model of our Universe, advances in computing methods and technologies have contributed significantly towards our understanding of cosmological models, the Universe, and astrophysical processes within them. A sample of numerical calculations addressing specific issues in cosmology are reviewed in this article: from the Big Bang singularity dynamics to the fundamental interactions of gravitational waves; from the quark--hadron phase transition to the large scale structure of the Universe. The emphasis, although not exclusively, is on thosecalculations designed to test different models of cosmology against the observed Universe.

  4. Computational Cosmology: From the Early Universe to the Large Scale Structure.

    Science.gov (United States)

    Anninos, Peter

    2001-01-01

    In order to account for the observable Universe, any comprehensive theory or model of cosmology must draw from many disciplines of physics, including gauge theories of strong and weak interactions, the hydrodynamics and microphysics of baryonic matter, electromagnetic fields, and spacetime curvature, for example. Although it is difficult to incorporate all these physical elements into a single complete model of our Universe, advances in computing methods and technologies have contributed significantly towards our understanding of cosmological models, the Universe, and astrophysical processes within them. A sample of numerical calculations (and numerical methods applied to specific issues in cosmology are reviewed in this article: from the Big Bang singularity dynamics to the fundamental interactions of gravitational waves; from the quark-hadron phase transition to the large scale structure of the Universe. The emphasis, although not exclusively, is on those calculations designed to test different models of cosmology against the observed Universe.

  5. OPENING REMARKS: SciDAC: Scientific Discovery through Advanced Computing

    Science.gov (United States)

    Strayer, Michael

    2005-01-01

    Good morning. Welcome to SciDAC 2005 and San Francisco. SciDAC is all about computational science and scientific discovery. In a large sense, computational science characterizes SciDAC and its intent is change. It transforms both our approach and our understanding of science. It opens new doors and crosses traditional boundaries while seeking discovery. In terms of twentieth century methodologies, computational science may be said to be transformational. There are a number of examples to this point. First are the sciences that encompass climate modeling. The application of computational science has in essence created the field of climate modeling. This community is now international in scope and has provided precision results that are challenging our understanding of our environment. A second example is that of lattice quantum chromodynamics. Lattice QCD, while adding precision and insight to our fundamental understanding of strong interaction dynamics, has transformed our approach to particle and nuclear science. The individual investigator approach has evolved to teams of scientists from different disciplines working side-by-side towards a common goal. SciDAC is also undergoing a transformation. This meeting is a prime example. Last year it was a small programmatic meeting tracking progress in SciDAC. This year, we have a major computational science meeting with a variety of disciplines and enabling technologies represented. SciDAC 2005 should position itself as a new corner stone for Computational Science and its impact on science. As we look to the immediate future, FY2006 will bring a new cycle to SciDAC. Most of the program elements of SciDAC will be re-competed in FY2006. The re-competition will involve new instruments for computational science, new approaches for collaboration, as well as new disciplines. There will be new opportunities for virtual experiments in carbon sequestration, fusion, and nuclear power and nuclear waste, as well as collaborations

  6. Scientific computing vol II - eigenvalues and optimization

    CERN Document Server

    Trangenstein, John A

    2017-01-01

    This is the second of three volumes providing a comprehensive presentation of the fundamentals of scientific computing. This volume discusses more advanced topics than volume one, and is largely not a prerequisite for volume three. This book and its companions show how to determine the quality of computational results, and how to measure the relative efficiency of competing methods. Readers learn how to determine the maximum attainable accuracy of algorithms, and how to select the best method for computing problems. This book also discusses programming in several languages, including C++, Fortran and MATLAB. There are 49 examples, 110 exercises, 66 algorithms, 24 interactive JavaScript programs, 77 references to software programs and 1 case study. Topics are introduced with goals, literature references and links to public software. There are descriptions of the current algorithms in LAPACK, GSLIB and MATLAB. This book could be used for a second course in numerical methods, for either upper level undergraduate...

  7. I - Template Metaprogramming for Massively Parallel Scientific Computing - Expression Templates

    CERN Multimedia

    CERN. Geneva

    2016-01-01

    Large scale scientific computing raises questions on different levels ranging from the fomulation of the problems to the choice of the best algorithms and their implementation for a specific platform. There are similarities in these different topics that can be exploited by modern-style C++ template metaprogramming techniques to produce readable, maintainable and generic code. Traditional low-level code tend to be fast but platform-dependent, and it obfuscates the meaning of the algorithm. On the other hand, object-oriented approach is nice to read, but may come with an inherent performance penalty. These lectures aim to present he basics of the Expression Template (ET) idiom which allows us to keep the object-oriented approach without sacrificing performance. We will in particular show to to enhance ET to include SIMD vectorization. We will then introduce techniques for abstracting iteration, and introduce thread-level parallelism for use in heavy data-centric loads. We will show to to apply these methods i...

  8. ScalaLab and GroovyLab: Comparing Scala and Groovy for Scientific Computing

    Directory of Open Access Journals (Sweden)

    Stergios Papadimitriou

    2015-01-01

    Full Text Available ScalaLab and GroovyLab are both MATLAB-like environments for the Java Virtual Machine. ScalaLab is based on the Scala programming language and GroovyLab is based on the Groovy programming language. They present similar user interfaces and functionality to the user. They also share the same set of Java scientific libraries and of native code libraries. From the programmer's point of view though, they have significant differences. This paper compares some aspects of the two environments and highlights some of the strengths and weaknesses of Scala versus Groovy for scientific computing. The discussion also examines some aspects of the dilemma of using dynamic typing versus static typing for scientific programming. The performance of the Java platform is continuously improved at a fast pace. Today Java can effectively support demanding high-performance computing and scales well on multicore platforms. Thus, both systems can challenge the performance of the traditional C/C++/Fortran scientific code with an easier to use and more productive programming environment.

  9. Scientific and Computational Challenges of the Fusion Simulation Program (FSP)

    International Nuclear Information System (INIS)

    Tang, William M.

    2011-01-01

    This paper highlights the scientific and computational challenges facing the Fusion Simulation Program (FSP) a major national initiative in the United States with the primary objective being to enable scientific discovery of important new plasma phenomena with associated understanding that emerges only upon integration. This requires developing a predictive integrated simulation capability for magnetically-confined fusion plasmas that are properly validated against experiments in regimes relevant for producing practical fusion energy. It is expected to provide a suite of advanced modeling tools for reliably predicting fusion device behavior with comprehensive and targeted science-based simulations of nonlinearly-coupled phenomena in the core plasma, edge plasma, and wall region on time and space scales required for fusion energy production. As such, it will strive to embody the most current theoretical and experimental understanding of magnetic fusion plasmas and to provide a living framework for the simulation of such plasmas as the associated physics understanding continues to advance over the next several decades. Substantive progress on answering the outstanding scientific questions in the field will drive the FSP toward its ultimate goal of developing the ability to predict the behavior of plasma discharges in toroidal magnetic fusion devices with high physics fidelity on all relevant time and space scales. From a computational perspective, this will demand computing resources in the petascale range and beyond together with the associated multi-core algorithmic formulation needed to address burning plasma issues relevant to ITER - a multibillion dollar collaborative experiment involving seven international partners representing over half the world's population. Even more powerful exascale platforms will be needed to meet the future challenges of designing a demonstration fusion reactor (DEMO). Analogous to other major applied physics modeling projects (e

  10. Scientific and computational challenges of the fusion simulation program (FSP)

    International Nuclear Information System (INIS)

    Tang, William M.

    2011-01-01

    This paper highlights the scientific and computational challenges facing the Fusion Simulation Program (FSP) - a major national initiative in the United States with the primary objective being to enable scientific discovery of important new plasma phenomena with associated understanding that emerges only upon integration. This requires developing a predictive integrated simulation capability for magnetically-confined fusion plasmas that are properly validated against experiments in regimes relevant for producing practical fusion energy. It is expected to provide a suite of advanced modeling tools for reliably predicting fusion device behavior with comprehensive and targeted science-based simulations of nonlinearly-coupled phenomena in the core plasma, edge plasma, and wall region on time and space scales required for fusion energy production. As such, it will strive to embody the most current theoretical and experimental understanding of magnetic fusion plasmas and to provide a living framework for the simulation of such plasmas as the associated physics understanding continues to advance over the next several decades. Substantive progress on answering the outstanding scientific questions in the field will drive the FSP toward its ultimate goal of developing the ability to predict the behavior of plasma discharges in toroidal magnetic fusion devices with high physics fidelity on all relevant time and space scales. From a computational perspective, this will demand computing resources in the petascale range and beyond together with the associated multi-core algorithmic formulation needed to address burning plasma issues relevant to ITER - a multibillion dollar collaborative experiment involving seven international partners representing over half the world's population. Even more powerful exascale platforms will be needed to meet the future challenges of designing a demonstration fusion reactor (DEMO). Analogous to other major applied physics modeling projects (e

  11. Resolute large scale mining company contribution to health services of

    African Journals Online (AJOL)

    Resolute large scale mining company contribution to health services of Lusu ... in terms of socio economic, health, education, employment, safe drinking water, ... The data were analyzed using Scientific Package for Social Science (SPSS).

  12. Towards Building a High Performance Spatial Query System for Large Scale Medical Imaging Data.

    Science.gov (United States)

    Aji, Ablimit; Wang, Fusheng; Saltz, Joel H

    2012-11-06

    Support of high performance queries on large volumes of scientific spatial data is becoming increasingly important in many applications. This growth is driven by not only geospatial problems in numerous fields, but also emerging scientific applications that are increasingly data- and compute-intensive. For example, digital pathology imaging has become an emerging field during the past decade, where examination of high resolution images of human tissue specimens enables more effective diagnosis, prediction and treatment of diseases. Systematic analysis of large-scale pathology images generates tremendous amounts of spatially derived quantifications of micro-anatomic objects, such as nuclei, blood vessels, and tissue regions. Analytical pathology imaging provides high potential to support image based computer aided diagnosis. One major requirement for this is effective querying of such enormous amount of data with fast response, which is faced with two major challenges: the "big data" challenge and the high computation complexity. In this paper, we present our work towards building a high performance spatial query system for querying massive spatial data on MapReduce. Our framework takes an on demand index building approach for processing spatial queries and a partition-merge approach for building parallel spatial query pipelines, which fits nicely with the computing model of MapReduce. We demonstrate our framework on supporting multi-way spatial joins for algorithm evaluation and nearest neighbor queries for microanatomic objects. To reduce query response time, we propose cost based query optimization to mitigate the effect of data skew. Our experiments show that the framework can efficiently support complex analytical spatial queries on MapReduce.

  13. 5th International Conference on High Performance Scientific Computing

    CERN Document Server

    Hoang, Xuan; Rannacher, Rolf; Schlöder, Johannes

    2014-01-01

    This proceedings volume gathers a selection of papers presented at the Fifth International Conference on High Performance Scientific Computing, which took place in Hanoi on March 5-9, 2012. The conference was organized by the Institute of Mathematics of the Vietnam Academy of Science and Technology (VAST), the Interdisciplinary Center for Scientific Computing (IWR) of Heidelberg University, Ho Chi Minh City University of Technology, and the Vietnam Institute for Advanced Study in Mathematics. The contributions cover the broad interdisciplinary spectrum of scientific computing and present recent advances in theory, development of methods, and practical applications. Subjects covered include mathematical modeling; numerical simulation; methods for optimization and control; parallel computing; software development; and applications of scientific computing in physics, mechanics and biomechanics, material science, hydrology, chemistry, biology, biotechnology, medicine, sports, psychology, transport, logistics, com...

  14. 3rd International Conference on High Performance Scientific Computing

    CERN Document Server

    Kostina, Ekaterina; Phu, Hoang; Rannacher, Rolf

    2008-01-01

    This proceedings volume contains a selection of papers presented at the Third International Conference on High Performance Scientific Computing held at the Hanoi Institute of Mathematics, Vietnamese Academy of Science and Technology (VAST), March 6-10, 2006. The conference has been organized by the Hanoi Institute of Mathematics, Interdisciplinary Center for Scientific Computing (IWR), Heidelberg, and its International PhD Program ``Complex Processes: Modeling, Simulation and Optimization'', and Ho Chi Minh City University of Technology. The contributions cover the broad interdisciplinary spectrum of scientific computing and present recent advances in theory, development of methods, and applications in practice. Subjects covered are mathematical modelling, numerical simulation, methods for optimization and control, parallel computing, software development, applications of scientific computing in physics, chemistry, biology and mechanics, environmental and hydrology problems, transport, logistics and site loca...

  15. 6th International Conference on High Performance Scientific Computing

    CERN Document Server

    Phu, Hoang; Rannacher, Rolf; Schlöder, Johannes

    2017-01-01

    This proceedings volume highlights a selection of papers presented at the Sixth International Conference on High Performance Scientific Computing, which took place in Hanoi, Vietnam on March 16-20, 2015. The conference was jointly organized by the Heidelberg Institute of Theoretical Studies (HITS), the Institute of Mathematics of the Vietnam Academy of Science and Technology (VAST), the Interdisciplinary Center for Scientific Computing (IWR) at Heidelberg University, and the Vietnam Institute for Advanced Study in Mathematics, Ministry of Education The contributions cover a broad, interdisciplinary spectrum of scientific computing and showcase recent advances in theory, methods, and practical applications. Subjects covered numerical simulation, methods for optimization and control, parallel computing, and software development, as well as the applications of scientific computing in physics, mechanics, biomechanics and robotics, material science, hydrology, biotechnology, medicine, transport, scheduling, and in...

  16. Advancements in Large-Scale Data/Metadata Management for Scientific Data.

    Science.gov (United States)

    Guntupally, K.; Devarakonda, R.; Palanisamy, G.; Frame, M. T.

    2017-12-01

    Scientific data often comes with complex and diverse metadata which are critical for data discovery and users. The Online Metadata Editor (OME) tool, which was developed by an Oak Ridge National Laboratory team, effectively manages diverse scientific datasets across several federal data centers, such as DOE's Atmospheric Radiation Measurement (ARM) Data Center and USGS's Core Science Analytics, Synthesis, and Libraries (CSAS&L) project. This presentation will focus mainly on recent developments and future strategies for refining OME tool within these centers. The ARM OME is a standard based tool (https://www.archive.arm.gov/armome) that allows scientists to create and maintain metadata about their data products. The tool has been improved with new workflows that help metadata coordinators and submitting investigators to submit and review their data more efficiently. The ARM Data Center's newly upgraded Data Discovery Tool (http://www.archive.arm.gov/discovery) uses rich metadata generated by the OME to enable search and discovery of thousands of datasets, while also providing a citation generator and modern order-delivery techniques like Globus (using GridFTP), Dropbox and THREDDS. The Data Discovery Tool also supports incremental indexing, which allows users to find new data as and when they are added. The USGS CSAS&L search catalog employs a custom version of the OME (https://www1.usgs.gov/csas/ome), which has been upgraded with high-level Federal Geographic Data Committee (FGDC) validations and the ability to reserve and mint Digital Object Identifiers (DOIs). The USGS's Science Data Catalog (SDC) (https://data.usgs.gov/datacatalog) allows users to discover a myriad of science data holdings through a web portal. Recent major upgrades to the SDC and ARM Data Discovery Tool include improved harvesting performance and migration using new search software, such as Apache Solr 6.0 for serving up data/metadata to scientific communities. Our presentation will highlight

  17. Computational Cosmology: from the Early Universe to the Large Scale Structure

    Directory of Open Access Journals (Sweden)

    Anninos Peter

    2001-01-01

    Full Text Available In order to account for the observable Universe, any comprehensive theory or model of cosmology must draw from many disciplines of physics, including gauge theories of strong and weak interactions, the hydrodynamics and microphysics of baryonic matter, electromagnetic fields, and spacetime curvature, for example. Although it is difficult to incorporate all these physical elements into a single complete model of our Universe, advances in computing methods and technologies have contributed significantly towards our understanding of cosmological models, the Universe, and astrophysical processes within them. A sample of numerical calculations (and numerical methods applied to specific issues in cosmology are reviewed in this article: from the Big Bang singularity dynamics to the fundamental interactions of gravitational waves; from the quark-hadron phase transition to the large scale structure of the Universe. The emphasis, although not exclusively, is on those calculations designed to test different models of cosmology against the observed Universe.

  18. Paul Scherrer Institute Scientific and Technical Report 2000. Volume VI: Large Research Facilities

    International Nuclear Information System (INIS)

    Foroughi, Fereydoun; Bercher, Renate; Buechli, Carmen; Zumkeller, Lotty

    2001-01-01

    The PSI Department Large Research Facilities (GFA) joins the efforts to provide an excellent research environment to Swiss and foreign research groups on the experimental facilities driven by our high intensity proton accelerator complex. Its divisions care for the running, maintenance and enhancement of the accelerator complex, the primary proton beamlines, the targets and the secondary beams as well as the neutron spallation source SINQ. The division for technical support and coordination provides for technical support to the research facility complementary to the basic logistic available from the department for logistics and marketing. Besides running the facilities, the staff of the department is also involved in theoretical and experimental research projects. Some of them address basic scientific questions mainly concerning the properties of micro- or nanostructured materials: experiments as well as large scale computer simulations of molecular dynamics were performed to investigate nonclassical materials properties. Others are related to improvements or extensions of the capabilities of our facilities. We also report on intriguing results from applications of the neutron capture radiography, the prompt gamma activation method and the isotope production facility at SINQ

  19. Paul Scherrer Institute Scientific and Technical Report 2000. Volume VI: Large Research Facilities

    Energy Technology Data Exchange (ETDEWEB)

    Foroughi, Fereydoun; Bercher, Renate; Buechli, Carmen; Zumkeller, Lotty [eds.

    2001-07-01

    The PSI Department Large Research Facilities (GFA) joins the efforts to provide an excellent research environment to Swiss and foreign research groups on the experimental facilities driven by our high intensity proton accelerator complex. Its divisions care for the running, maintenance and enhancement of the accelerator complex, the primary proton beamlines, the targets and the secondary beams as well as the neutron spallation source SINQ. The division for technical support and coordination provides for technical support to the research facility complementary to the basic logistic available from the department for logistics and marketing. Besides running the facilities, the staff of the department is also involved in theoretical and experimental research projects. Some of them address basic scientific questions mainly concerning the properties of micro- or nanostructured materials: experiments as well as large scale computer simulations of molecular dynamics were performed to investigate nonclassical materials properties. Others are related to improvements or extensions of the capabilities of our facilities. We also report on intriguing results from applications of the neutron capture radiography, the prompt gamma activation method and the isotope production facility at SINQ.

  20. International Symposium on Scientific Computing, Computer Arithmetic and Validated Numerics

    CERN Document Server

    DEVELOPMENTS IN RELIABLE COMPUTING

    1999-01-01

    The SCAN conference, the International Symposium on Scientific Com­ puting, Computer Arithmetic and Validated Numerics, takes place bian­ nually under the joint auspices of GAMM (Gesellschaft fiir Angewandte Mathematik und Mechanik) and IMACS (International Association for Mathematics and Computers in Simulation). SCAN-98 attracted more than 100 participants from 21 countries all over the world. During the four days from September 22 to 25, nine highlighted, plenary lectures and over 70 contributed talks were given. These figures indicate a large participation, which was partly caused by the attraction of the organizing country, Hungary, but also the effec­ tive support system have contributed to the success. The conference was substantially supported by the Hungarian Research Fund OTKA, GAMM, the National Technology Development Board OMFB and by the J6zsef Attila University. Due to this funding, it was possible to subsidize the participation of over 20 scientists, mainly from Eastern European countries. I...

  1. Computational Techniques for Model Predictive Control of Large-Scale Systems with Continuous-Valued and Discrete-Valued Inputs

    Directory of Open Access Journals (Sweden)

    Koichi Kobayashi

    2013-01-01

    Full Text Available We propose computational techniques for model predictive control of large-scale systems with both continuous-valued control inputs and discrete-valued control inputs, which are a class of hybrid systems. In the proposed method, we introduce the notion of virtual control inputs, which are obtained by relaxing discrete-valued control inputs to continuous variables. In online computation, first, we find continuous-valued control inputs and virtual control inputs minimizing a cost function. Next, using the obtained virtual control inputs, only discrete-valued control inputs at the current time are computed in each subsystem. In addition, we also discuss the effect of quantization errors. Finally, the effectiveness of the proposed method is shown by a numerical example. The proposed method enables us to reduce and decentralize the computation load.

  2. Extreme Scale Computing to Secure the Nation

    Energy Technology Data Exchange (ETDEWEB)

    Brown, D L; McGraw, J R; Johnson, J R; Frincke, D

    2009-11-10

    absence of nuclear testing, a progam to: (1) Support a focused, multifaceted program to increase the understanding of the enduring stockpile; (2) Predict, detect, and evaluate potential problems of the aging of the stockpile; (3) Refurbish and re-manufacture weapons and components, as required; and (4) Maintain the science and engineering institutions needed to support the nation's nuclear deterrent, now and in the future'. This program continues to fulfill its national security mission by adding significant new capabilities for producing scientific results through large-scale computational simulation coupled with careful experimentation, including sub-critical nuclear experiments permitted under the CTBT. To develop the computational science and the computational horsepower needed to support its mission, SBSS initiated the Accelerated Strategic Computing Initiative, later renamed the Advanced Simulation & Computing (ASC) program (sidebar: 'History of ASC Computing Program Computing Capability'). The modern 3D computational simulation capability of the ASC program supports the assessment and certification of the current nuclear stockpile through calibration with past underground test (UGT) data. While an impressive accomplishment, continued evolution of national security mission requirements will demand computing resources at a significantly greater scale than we have today. In particular, continued observance and potential Senate confirmation of the Comprehensive Test Ban Treaty (CTBT) together with the U.S administration's promise for a significant reduction in the size of the stockpile and the inexorable aging and consequent refurbishment of the stockpile all demand increasing refinement of our computational simulation capabilities. Assessment of the present and future stockpile with increased confidence of the safety and reliability without reliance upon calibration with past or future test data is a long-term goal of the ASC program. This

  3. Large scale electrolysers

    International Nuclear Information System (INIS)

    B Bello; M Junker

    2006-01-01

    Hydrogen production by water electrolysis represents nearly 4 % of the world hydrogen production. Future development of hydrogen vehicles will require large quantities of hydrogen. Installation of large scale hydrogen production plants will be needed. In this context, development of low cost large scale electrolysers that could use 'clean power' seems necessary. ALPHEA HYDROGEN, an European network and center of expertise on hydrogen and fuel cells, has performed for its members a study in 2005 to evaluate the potential of large scale electrolysers to produce hydrogen in the future. The different electrolysis technologies were compared. Then, a state of art of the electrolysis modules currently available was made. A review of the large scale electrolysis plants that have been installed in the world was also realized. The main projects related to large scale electrolysis were also listed. Economy of large scale electrolysers has been discussed. The influence of energy prices on the hydrogen production cost by large scale electrolysis was evaluated. (authors)

  4. Report of the Working Group on Large-Scale Computing in Aeronautics.

    Science.gov (United States)

    1984-06-01

    function and the use of drawings. In the hardware area, comtemporary large computer installations are quite powerful in terms of speed of computation as...critical to the competitive advantage of that member. He might then be willing to make them available to less advanced members under some business

  5. Scientific and computational challenges of the fusion simulation project (FSP)

    International Nuclear Information System (INIS)

    Tang, W M

    2008-01-01

    This paper highlights the scientific and computational challenges facing the Fusion Simulation Project (FSP). The primary objective is to develop advanced software designed to use leadership-class computers for carrying out multiscale physics simulations to provide information vital to delivering a realistic integrated fusion simulation model with unprecedented physics fidelity. This multiphysics capability will be unprecedented in that in the current FES applications domain, the largest-scale codes are used to carry out first-principles simulations of mostly individual phenomena in realistic 3D geometry while the integrated models are much smaller-scale, lower-dimensionality codes with significant empirical elements used for modeling and designing experiments. The FSP is expected to be the most up-to-date embodiment of the theoretical and experimental understanding of magnetically confined thermonuclear plasmas and to provide a living framework for the simulation of such plasmas as the associated physics understanding continues to advance over the next several decades. Substantive progress on answering the outstanding scientific questions in the field will drive the FSP toward its ultimate goal of developing a reliable ability to predict the behavior of plasma discharges in toroidal magnetic fusion devices on all relevant time and space scales. From a computational perspective, the fusion energy science application goal to produce high-fidelity, whole-device modeling capabilities will demand computing resources in the petascale range and beyond, together with the associated multicore algorithmic formulation needed to address burning plasma issues relevant to ITER - a multibillion dollar collaborative device involving seven international partners representing over half the world's population. Even more powerful exascale platforms will be needed to meet the future challenges of designing a demonstration fusion reactor (DEMO). Analogous to other major applied physics

  6. [Adverse Effect Predictions Based on Computational Toxicology Techniques and Large-scale Databases].

    Science.gov (United States)

    Uesawa, Yoshihiro

    2018-01-01

     Understanding the features of chemical structures related to the adverse effects of drugs is useful for identifying potential adverse effects of new drugs. This can be based on the limited information available from post-marketing surveillance, assessment of the potential toxicities of metabolites and illegal drugs with unclear characteristics, screening of lead compounds at the drug discovery stage, and identification of leads for the discovery of new pharmacological mechanisms. This present paper describes techniques used in computational toxicology to investigate the content of large-scale spontaneous report databases of adverse effects, and it is illustrated with examples. Furthermore, volcano plotting, a new visualization method for clarifying the relationships between drugs and adverse effects via comprehensive analyses, will be introduced. These analyses may produce a great amount of data that can be applied to drug repositioning.

  7. Parallel computing works

    Energy Technology Data Exchange (ETDEWEB)

    1991-10-23

    An account of the Caltech Concurrent Computation Program (C{sup 3}P), a five year project that focused on answering the question: Can parallel computers be used to do large-scale scientific computations '' As the title indicates, the question is answered in the affirmative, by implementing numerous scientific applications on real parallel computers and doing computations that produced new scientific results. In the process of doing so, C{sup 3}P helped design and build several new computers, designed and implemented basic system software, developed algorithms for frequently used mathematical computations on massively parallel machines, devised performance models and measured the performance of many computers, and created a high performance computing facility based exclusively on parallel computers. While the initial focus of C{sup 3}P was the hypercube architecture developed by C. Seitz, many of the methods developed and lessons learned have been applied successfully on other massively parallel architectures.

  8. High-integrity software, computation and the scientific method

    International Nuclear Information System (INIS)

    Hatton, L.

    2012-01-01

    Computation rightly occupies a central role in modern science. Datasets are enormous and the processing implications of some algorithms are equally staggering. With the continuing difficulties in quantifying the results of complex computations, it is of increasing importance to understand its role in the essentially Popperian scientific method. In this paper, some of the problems with computation, for example the long-term unquantifiable presence of undiscovered defect, problems with programming languages and process issues will be explored with numerous examples. One of the aims of the paper is to understand the implications of trying to produce high-integrity software and the limitations which still exist. Unfortunately Computer Science itself suffers from an inability to be suitably critical of its practices and has operated in a largely measurement-free vacuum since its earliest days. Within computer science itself, this has not been so damaging in that it simply leads to unconstrained creativity and a rapid turnover of new technologies. In the applied sciences however which have to depend on computational results, such unquantifiability significantly undermines trust. It is time this particular demon was put to rest. (author)

  9. Progresses in application of computational ?uid dynamic methods to large scale wind turbine aerodynamics?

    Institute of Scientific and Technical Information of China (English)

    Zhenyu ZHANG; Ning ZHAO; Wei ZHONG; Long WANG; Bofeng XU

    2016-01-01

    The computational ?uid dynamics (CFD) methods are applied to aerody-namic problems for large scale wind turbines. The progresses including the aerodynamic analyses of wind turbine pro?les, numerical ?ow simulation of wind turbine blades, evalu-ation of aerodynamic performance, and multi-objective blade optimization are discussed. Based on the CFD methods, signi?cant improvements are obtained to predict two/three-dimensional aerodynamic characteristics of wind turbine airfoils and blades, and the vorti-cal structure in their wake ?ows is accurately captured. Combining with a multi-objective genetic algorithm, a 1.5 MW NH-1500 optimized blade is designed with high e?ciency in wind energy conversion.

  10. DOE Advanced Scientific Computing Advisory Subcommittee (ASCAC) Report: Top Ten Exascale Research Challenges

    Energy Technology Data Exchange (ETDEWEB)

    Lucas, Robert [University of Southern California, Information Sciences Institute; Ang, James [Sandia National Laboratories; Bergman, Keren [Columbia University; Borkar, Shekhar [Intel; Carlson, William [Institute for Defense Analyses; Carrington, Laura [University of California, San Diego; Chiu, George [IBM; Colwell, Robert [DARPA; Dally, William [NVIDIA; Dongarra, Jack [University of Tennessee; Geist, Al [Oak Ridge National Laboratory; Haring, Rud [IBM; Hittinger, Jeffrey [Lawrence Livermore National Laboratory; Hoisie, Adolfy [Pacific Northwest National Laboratory; Klein, Dean Micron; Kogge, Peter [University of Notre Dame; Lethin, Richard [Reservoir Labs; Sarkar, Vivek [Rice University; Schreiber, Robert [Hewlett Packard; Shalf, John [Lawrence Berkeley National Laboratory; Sterling, Thomas [Indiana University; Stevens, Rick [Argonne National Laboratory; Bashor, Jon [Lawrence Berkeley National Laboratory; Brightwell, Ron [Sandia National Laboratories; Coteus, Paul [IBM; Debenedictus, Erik [Sandia National Laboratories; Hiller, Jon [Science and Technology Associates; Kim, K. H. [IBM; Langston, Harper [Reservoir Labs; Murphy, Richard Micron; Webster, Clayton [Oak Ridge National Laboratory; Wild, Stefan [Argonne National Laboratory; Grider, Gary [Los Alamos National Laboratory; Ross, Rob [Argonne National Laboratory; Leyffer, Sven [Argonne National Laboratory; Laros III, James [Sandia National Laboratories

    2014-02-10

    Exascale computing systems are essential for the scientific fields that will transform the 21st century global economy, including energy, biotechnology, nanotechnology, and materials science. Progress in these fields is predicated on the ability to perform advanced scientific and engineering simulations, and analyze the deluge of data. On July 29, 2013, ASCAC was charged by Patricia Dehmer, the Acting Director of the Office of Science, to assemble a subcommittee to provide advice on exascale computing. This subcommittee was directed to return a list of no more than ten technical approaches (hardware and software) that will enable the development of a system that achieves the Department's goals for exascale computing. Numerous reports over the past few years have documented the technical challenges and the non¬-viability of simply scaling existing computer designs to reach exascale. The technical challenges revolve around energy consumption, memory performance, resilience, extreme concurrency, and big data. Drawing from these reports and more recent experience, this ASCAC subcommittee has identified the top ten computing technology advancements that are critical to making a capable, economically viable, exascale system.

  11. Sensitivity analysis for large-scale problems

    Science.gov (United States)

    Noor, Ahmed K.; Whitworth, Sandra L.

    1987-01-01

    The development of efficient techniques for calculating sensitivity derivatives is studied. The objective is to present a computational procedure for calculating sensitivity derivatives as part of performing structural reanalysis for large-scale problems. The scope is limited to framed type structures. Both linear static analysis and free-vibration eigenvalue problems are considered.

  12. Scientific Grand Challenges: Forefront Questions in Nuclear Science and the Role of High Performance Computing

    International Nuclear Information System (INIS)

    Khaleel, Mohammad A.

    2009-01-01

    This report is an account of the deliberations and conclusions of the workshop on 'Forefront Questions in Nuclear Science and the Role of High Performance Computing' held January 26-28, 2009, co-sponsored by the U.S. Department of Energy (DOE) Office of Nuclear Physics (ONP) and the DOE Office of Advanced Scientific Computing (ASCR). Representatives from the national and international nuclear physics communities, as well as from the high performance computing community, participated. The purpose of this workshop was to (1) identify forefront scientific challenges in nuclear physics and then determine which-if any-of these could be aided by high performance computing at the extreme scale; (2) establish how and why new high performance computing capabilities could address issues at the frontiers of nuclear science; (3) provide nuclear physicists the opportunity to influence the development of high performance computing; and (4) provide the nuclear physics community with plans for development of future high performance computing capability by DOE ASCR.

  13. Scientific Grand Challenges: Forefront Questions in Nuclear Science and the Role of High Performance Computing

    Energy Technology Data Exchange (ETDEWEB)

    Khaleel, Mohammad A.

    2009-10-01

    This report is an account of the deliberations and conclusions of the workshop on "Forefront Questions in Nuclear Science and the Role of High Performance Computing" held January 26-28, 2009, co-sponsored by the U.S. Department of Energy (DOE) Office of Nuclear Physics (ONP) and the DOE Office of Advanced Scientific Computing (ASCR). Representatives from the national and international nuclear physics communities, as well as from the high performance computing community, participated. The purpose of this workshop was to 1) identify forefront scientific challenges in nuclear physics and then determine which-if any-of these could be aided by high performance computing at the extreme scale; 2) establish how and why new high performance computing capabilities could address issues at the frontiers of nuclear science; 3) provide nuclear physicists the opportunity to influence the development of high performance computing; and 4) provide the nuclear physics community with plans for development of future high performance computing capability by DOE ASCR.

  14. Large-scale simulations with distributed computing: Asymptotic scaling of ballistic deposition

    International Nuclear Information System (INIS)

    Farnudi, Bahman; Vvedensky, Dimitri D

    2011-01-01

    Extensive kinetic Monte Carlo simulations are reported for ballistic deposition (BD) in (1 + 1) dimensions. The large system sizes L observed for the onset of asymptotic scaling (L ≅ 2 12 ) explains the widespread discrepancies in previous reports for exponents of BD in one and likely in higher dimensions. The exponents obtained directly from our simulations, α = 0.499 ± 0.004 and β = 0.336 ± 0.004, capture the exact values α = 1/2 and β = 1/3 for the one-dimensional Kardar-Parisi-Zhang equation. An analysis of our simulations suggests a criterion for identifying the onset of true asymptotic scaling, which enables a more informed evaluation of exponents for BD in higher dimensions. These simulations were made possible by the Simulation through Social Networking project at the Institute for Advanced Studies in Basic Sciences in 2007, which was re-launched in November 2010.

  15. Using Agent Base Models to Optimize Large Scale Network for Large System Inventories

    Science.gov (United States)

    Shameldin, Ramez Ahmed; Bowling, Shannon R.

    2010-01-01

    The aim of this paper is to use Agent Base Models (ABM) to optimize large scale network handling capabilities for large system inventories and to implement strategies for the purpose of reducing capital expenses. The models used in this paper either use computational algorithms or procedure implementations developed by Matlab to simulate agent based models in a principal programming language and mathematical theory using clusters, these clusters work as a high performance computational performance to run the program in parallel computational. In both cases, a model is defined as compilation of a set of structures and processes assumed to underlie the behavior of a network system.

  16. Scalability of Parallel Scientific Applications on the Cloud

    Directory of Open Access Journals (Sweden)

    Satish Narayana Srirama

    2011-01-01

    Full Text Available Cloud computing, with its promise of virtually infinite resources, seems to suit well in solving resource greedy scientific computing problems. To study the effects of moving parallel scientific applications onto the cloud, we deployed several benchmark applications like matrix–vector operations and NAS parallel benchmarks, and DOUG (Domain decomposition On Unstructured Grids on the cloud. DOUG is an open source software package for parallel iterative solution of very large sparse systems of linear equations. The detailed analysis of DOUG on the cloud showed that parallel applications benefit a lot and scale reasonable on the cloud. We could also observe the limitations of the cloud and its comparison with cluster in terms of performance. However, for efficiently running the scientific applications on the cloud infrastructure, the applications must be reduced to frameworks that can successfully exploit the cloud resources, like the MapReduce framework. Several iterative and embarrassingly parallel algorithms are reduced to the MapReduce model and their performance is measured and analyzed. The analysis showed that Hadoop MapReduce has significant problems with iterative methods, while it suits well for embarrassingly parallel algorithms. Scientific computing often uses iterative methods to solve large problems. Thus, for scientific computing on the cloud, this paper raises the necessity for better frameworks or optimizations for MapReduce.

  17. On the Large-Scaling Issues of Cloud-based Applications for Earth Science Dat

    Science.gov (United States)

    Hua, H.

    2016-12-01

    Next generation science data systems are needed to address the incoming flood of data from new missions such as NASA's SWOT and NISAR where its SAR data volumes and data throughput rates are order of magnitude larger than present day missions. Existing missions, such as OCO-2, may also require high turn-around time for processing different science scenarios where on-premise and even traditional HPC computing environments may not meet the high processing needs. Additionally, traditional means of procuring hardware on-premise are already limited due to facilities capacity constraints for these new missions. Experiences have shown that to embrace efficient cloud computing approaches for large-scale science data systems requires more than just moving existing code to cloud environments. At large cloud scales, we need to deal with scaling and cost issues. We present our experiences on deploying multiple instances of our hybrid-cloud computing science data system (HySDS) to support large-scale processing of Earth Science data products. We will explore optimization approaches to getting best performance out of hybrid-cloud computing as well as common issues that will arise when dealing with large-scale computing. Novel approaches were utilized to do processing on Amazon's spot market, which can potentially offer 75%-90% costs savings but with an unpredictable computing environment based on market forces.

  18. Penalized Estimation in Large-Scale Generalized Linear Array Models

    DEFF Research Database (Denmark)

    Lund, Adam; Vincent, Martin; Hansen, Niels Richard

    2017-01-01

    Large-scale generalized linear array models (GLAMs) can be challenging to fit. Computation and storage of its tensor product design matrix can be impossible due to time and memory constraints, and previously considered design matrix free algorithms do not scale well with the dimension...

  19. Proceedings of joint meeting of the 6th simulation science symposium and the NIFS collaboration research 'large scale computer simulation'

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    2003-03-01

    Joint meeting of the 6th Simulation Science Symposium and the NIFS Collaboration Research 'Large Scale Computer Simulation' was held on December 12-13, 2002 at National Institute for Fusion Science, with the aim of promoting interdisciplinary collaborations in various fields of computer simulations. The present meeting attended by more than 40 people consists of the 11 invited and 22 contributed papers, of which topics were extended not only to fusion science but also to related fields such as astrophysics, earth science, fluid dynamics, molecular dynamics, computer science etc. (author)

  20. Scientific computing an introduction using Maple and Matlab

    CERN Document Server

    Gander, Walter; Kwok, Felix

    2014-01-01

    Scientific computing is the study of how to use computers effectively to solve problems that arise from the mathematical modeling of phenomena in science and engineering. It is based on mathematics, numerical and symbolic/algebraic computations and visualization. This book serves as an introduction to both the theory and practice of scientific computing, with each chapter presenting the basic algorithms that serve as the workhorses of many scientific codes; we explain both the theory behind these algorithms and how they must be implemented in order to work reliably in finite-precision arithmetic. The book includes many programs written in Matlab and Maple – Maple is often used to derive numerical algorithms, whereas Matlab is used to implement them. The theory is developed in such a way that students can learn by themselves as they work through the text. Each chapter contains numerous examples and problems to help readers understand the material “hands-on”.

  1. Computational domain length and Reynolds number effects on large-scale coherent motions in turbulent pipe flow

    Science.gov (United States)

    Feldmann, Daniel; Bauer, Christian; Wagner, Claus

    2018-03-01

    We present results from direct numerical simulations (DNS) of turbulent pipe flow at shear Reynolds numbers up to Reτ = 1500 using different computational domains with lengths up to ?. The objectives are to analyse the effect of the finite size of the periodic pipe domain on large flow structures in dependency of Reτ and to assess a minimum ? required for relevant turbulent scales to be captured and a minimum Reτ for very large-scale motions (VLSM) to be analysed. Analysing one-point statistics revealed that the mean velocity profile is invariant for ?. The wall-normal location at which deviations occur in shorter domains changes strongly with increasing Reτ from the near-wall region to the outer layer, where VLSM are believed to live. The root mean square velocity profiles exhibit domain length dependencies for pipes shorter than 14R and 7R depending on Reτ. For all Reτ, the higher-order statistical moments show only weak dependencies and only for the shortest domain considered here. However, the analysis of one- and two-dimensional pre-multiplied energy spectra revealed that even for larger ?, not all physically relevant scales are fully captured, even though the aforementioned statistics are in good agreement with the literature. We found ? to be sufficiently large to capture VLSM-relevant turbulent scales in the considered range of Reτ based on our definition of an integral energy threshold of 10%. The requirement to capture at least 1/10 of the global maximum energy level is justified by a 14% increase of the streamwise turbulence intensity in the outer region between Reτ = 720 and 1500, which can be related to VLSM-relevant length scales. Based on this scaling anomaly, we found Reτ⪆1500 to be a necessary minimum requirement to investigate VLSM-related effects in pipe flow, even though the streamwise energy spectra does not yet indicate sufficient scale separation between the most energetic and the very long motions.

  2. Introduction to the LaRC central scientific computing complex

    Science.gov (United States)

    Shoosmith, John N.

    1993-01-01

    The computers and associated equipment that make up the Central Scientific Computing Complex of the Langley Research Center are briefly described. The electronic networks that provide access to the various components of the complex and a number of areas that can be used by Langley and contractors staff for special applications (scientific visualization, image processing, software engineering, and grid generation) are also described. Flight simulation facilities that use the central computers are described. Management of the complex, procedures for its use, and available services and resources are discussed. This document is intended for new users of the complex, for current users who wish to keep appraised of changes, and for visitors who need to understand the role of central scientific computers at Langley.

  3. Large-scale exact diagonalizations reveal low-momentum scales of nuclei

    Science.gov (United States)

    Forssén, C.; Carlsson, B. D.; Johansson, H. T.; Sääf, D.; Bansal, A.; Hagen, G.; Papenbrock, T.

    2018-03-01

    Ab initio methods aim to solve the nuclear many-body problem with controlled approximations. Virtually exact numerical solutions for realistic interactions can only be obtained for certain special cases such as few-nucleon systems. Here we extend the reach of exact diagonalization methods to handle model spaces with dimension exceeding 1010 on a single compute node. This allows us to perform no-core shell model (NCSM) calculations for 6Li in model spaces up to Nmax=22 and to reveal the 4He+d halo structure of this nucleus. Still, the use of a finite harmonic-oscillator basis implies truncations in both infrared (IR) and ultraviolet (UV) length scales. These truncations impose finite-size corrections on observables computed in this basis. We perform IR extrapolations of energies and radii computed in the NCSM and with the coupled-cluster method at several fixed UV cutoffs. It is shown that this strategy enables information gain also from data that is not fully UV converged. IR extrapolations improve the accuracy of relevant bound-state observables for a range of UV cutoffs, thus making them profitable tools. We relate the momentum scale that governs the exponential IR convergence to the threshold energy for the first open decay channel. Using large-scale NCSM calculations we numerically verify this small-momentum scale of finite nuclei.

  4. A computationally efficient tool for assessing the depth resolution in large-scale potential-field inversion

    DEFF Research Database (Denmark)

    Paoletti, Valeria; Hansen, Per Christian; Hansen, Mads Friis

    2014-01-01

    In potential-field inversion, careful management of singular value decomposition components is crucial for obtaining information about the source distribution with respect to depth. In principle, the depth-resolution plot provides a convenient visual tool for this analysis, but its computational...... on memory and computing time. We used the ApproxDRP to study retrievable depth resolution in inversion of the gravity field of the Neapolitan Volcanic Area. Our main contribution is the combined use of the Lanczos bidiagonalization algorithm, established in the scientific computing community, and the depth...

  5. Parallel Computing in SCALE

    International Nuclear Information System (INIS)

    DeHart, Mark D.; Williams, Mark L.; Bowman, Stephen M.

    2010-01-01

    The SCALE computational architecture has remained basically the same since its inception 30 years ago, although constituent modules and capabilities have changed significantly. This SCALE concept was intended to provide a framework whereby independent codes can be linked to provide a more comprehensive capability than possible with the individual programs - allowing flexibility to address a wide variety of applications. However, the current system was designed originally for mainframe computers with a single CPU and with significantly less memory than today's personal computers. It has been recognized that the present SCALE computation system could be restructured to take advantage of modern hardware and software capabilities, while retaining many of the modular features of the present system. Preliminary work is being done to define specifications and capabilities for a more advanced computational architecture. This paper describes the state of current SCALE development activities and plans for future development. With the release of SCALE 6.1 in 2010, a new phase of evolutionary development will be available to SCALE users within the TRITON and NEWT modules. The SCALE (Standardized Computer Analyses for Licensing Evaluation) code system developed by Oak Ridge National Laboratory (ORNL) provides a comprehensive and integrated package of codes and nuclear data for a wide range of applications in criticality safety, reactor physics, shielding, isotopic depletion and decay, and sensitivity/uncertainty (S/U) analysis. Over the last three years, since the release of version 5.1 in 2006, several important new codes have been introduced within SCALE, and significant advances applied to existing codes. Many of these new features became available with the release of SCALE 6.0 in early 2009. However, beginning with SCALE 6.1, a first generation of parallel computing is being introduced. In addition to near-term improvements, a plan for longer term SCALE enhancement

  6. OPENING REMARKS: Scientific Discovery through Advanced Computing

    Science.gov (United States)

    Strayer, Michael

    2006-01-01

    as the national and regional electricity grid, carbon sequestration, virtual engineering, and the nuclear fuel cycle. The successes of the first five years of SciDAC have demonstrated the power of using advanced computing to enable scientific discovery. One measure of this success could be found in the President’s State of the Union address in which President Bush identified ‘supercomputing’ as a major focus area of the American Competitiveness Initiative. Funds were provided in the FY 2007 President’s Budget request to increase the size of the NERSC-5 procurement to between 100-150 teraflops, to upgrade the LCF Cray XT3 at Oak Ridge to 250 teraflops and acquire a 100 teraflop IBM BlueGene/P to establish the Leadership computing facility at Argonne. We believe that we are on a path to establish a petascale computing resource for open science by 2009. We must develop software tools, packages, and libraries as well as the scientific application software that will scale to hundreds of thousands of processors. Computer scientists from universities and the DOE’s national laboratories will be asked to collaborate on the development of the critical system software components such as compilers, light-weight operating systems and file systems. Standing up these large machines will not be business as usual for ASCR. We intend to develop a series of interconnected projects that identify cost, schedule, risks, and scope for the upgrades at the LCF at Oak Ridge, the establishment of the LCF at Argonne, and the development of the software to support these high-end computers. The critical first step in defining the scope of the project is to identify a set of early application codes for each leadership class computing facility. These codes will have access to the resources during the commissioning phase of the facility projects and will be part of the acceptance tests for the machines. Applications will be selected, in part, by breakthrough science, scalability, and

  7. Using Amazon's Elastic Compute Cloud to dynamically scale CMS computational resources

    International Nuclear Information System (INIS)

    Evans, D; Fisk, I; Holzman, B; Pordes, R; Tiradani, A; Melo, A; Sheldon, P; Metson, S

    2011-01-01

    Large international scientific collaborations such as the Compact Muon Solenoid (CMS) experiment at the Large Hadron Collider have traditionally addressed their data reduction and analysis needs by building and maintaining dedicated computational infrastructure. Emerging cloud computing services such as Amazon's Elastic Compute Cloud (EC2) offer short-term CPU and storage resources with costs based on usage. These services allow experiments to purchase computing resources as needed, without significant prior planning and without long term investments in facilities and their management. We have demonstrated that services such as EC2 can successfully be integrated into the production-computing model of CMS, and find that they work very well as worker nodes. The cost-structure and transient nature of EC2 services makes them inappropriate for some CMS production services and functions. We also found that the resources are not truely 'on-demand' as limits and caps on usage are imposed. Our trial workflows allow us to make a cost comparison between EC2 resources and dedicated CMS resources at a University, and conclude that it is most cost effective to purchase dedicated resources for the 'base-line' needs of experiments such as CMS. However, if the ability to use cloud computing resources is built into an experiment's software framework before demand requires their use, cloud computing resources make sense for bursting during times when spikes in usage are required.

  8. Parallel computing works!

    CERN Document Server

    Fox, Geoffrey C; Messina, Guiseppe C

    2014-01-01

    A clear illustration of how parallel computers can be successfully appliedto large-scale scientific computations. This book demonstrates how avariety of applications in physics, biology, mathematics and other scienceswere implemented on real parallel computers to produce new scientificresults. It investigates issues of fine-grained parallelism relevant forfuture supercomputers with particular emphasis on hypercube architecture. The authors describe how they used an experimental approach to configuredifferent massively parallel machines, design and implement basic systemsoftware, and develop

  9. Low-Complexity Transmit Antenna Selection and Beamforming for Large-Scale MIMO Communications

    Directory of Open Access Journals (Sweden)

    Kun Qian

    2014-01-01

    Full Text Available Transmit antenna selection plays an important role in large-scale multiple-input multiple-output (MIMO communications, but optimal large-scale MIMO antenna selection is a technical challenge. Exhaustive search is often employed in antenna selection, but it cannot be efficiently implemented in large-scale MIMO communication systems due to its prohibitive high computation complexity. This paper proposes a low-complexity interactive multiple-parameter optimization method for joint transmit antenna selection and beamforming in large-scale MIMO communication systems. The objective is to jointly maximize the channel outrage capacity and signal-to-noise (SNR performance and minimize the mean square error in transmit antenna selection and minimum variance distortionless response (MVDR beamforming without exhaustive search. The effectiveness of all the proposed methods is verified by extensive simulation results. It is shown that the required antenna selection processing time of the proposed method does not increase along with the increase of selected antennas, but the computation complexity of conventional exhaustive search method will significantly increase when large-scale antennas are employed in the system. This is particularly useful in antenna selection for large-scale MIMO communication systems.

  10. Large-scale pool fires

    Directory of Open Access Journals (Sweden)

    Steinhaus Thomas

    2007-01-01

    Full Text Available A review of research into the burning behavior of large pool fires and fuel spill fires is presented. The features which distinguish such fires from smaller pool fires are mainly associated with the fire dynamics at low source Froude numbers and the radiative interaction with the fire source. In hydrocarbon fires, higher soot levels at increased diameters result in radiation blockage effects around the perimeter of large fire plumes; this yields lower emissive powers and a drastic reduction in the radiative loss fraction; whilst there are simplifying factors with these phenomena, arising from the fact that soot yield can saturate, there are other complications deriving from the intermittency of the behavior, with luminous regions of efficient combustion appearing randomly in the outer surface of the fire according the turbulent fluctuations in the fire plume. Knowledge of the fluid flow instabilities, which lead to the formation of large eddies, is also key to understanding the behavior of large-scale fires. Here modeling tools can be effectively exploited in order to investigate the fluid flow phenomena, including RANS- and LES-based computational fluid dynamics codes. The latter are well-suited to representation of the turbulent motions, but a number of challenges remain with their practical application. Massively-parallel computational resources are likely to be necessary in order to be able to adequately address the complex coupled phenomena to the level of detail that is necessary.

  11. Using CyberShake Workflows to Manage Big Seismic Hazard Data on Large-Scale Open-Science HPC Resources

    Science.gov (United States)

    Callaghan, S.; Maechling, P. J.; Juve, G.; Vahi, K.; Deelman, E.; Jordan, T. H.

    2015-12-01

    The CyberShake computational platform, developed by the Southern California Earthquake Center (SCEC), is an integrated collection of scientific software and middleware that performs 3D physics-based probabilistic seismic hazard analysis (PSHA) for Southern California. CyberShake integrates large-scale and high-throughput research codes to produce probabilistic seismic hazard curves for individual locations of interest and hazard maps for an entire region. A recent CyberShake calculation produced about 500,000 two-component seismograms for each of 336 locations, resulting in over 300 million synthetic seismograms in a Los Angeles-area probabilistic seismic hazard model. CyberShake calculations require a series of scientific software programs. Early computational stages produce data used as inputs by later stages, so we describe CyberShake calculations using a workflow definition language. Scientific workflow tools automate and manage the input and output data and enable remote job execution on large-scale HPC systems. To satisfy the requests of broad impact users of CyberShake data, such as seismologists, utility companies, and building code engineers, we successfully completed CyberShake Study 15.4 in April and May 2015, calculating a 1 Hz urban seismic hazard map for Los Angeles. We distributed the calculation between the NSF Track 1 system NCSA Blue Waters, the DOE Leadership-class system OLCF Titan, and USC's Center for High Performance Computing. This study ran for over 5 weeks, burning about 1.1 million node-hours and producing over half a petabyte of data. The CyberShake Study 15.4 results doubled the maximum simulated seismic frequency from 0.5 Hz to 1.0 Hz as compared to previous studies, representing a factor of 16 increase in computational complexity. We will describe how our workflow tools supported splitting the calculation across multiple systems. We will explain how we modified CyberShake software components, including GPU implementations and

  12. Large-scale structure of the Universe

    International Nuclear Information System (INIS)

    Doroshkevich, A.G.

    1978-01-01

    The problems, discussed at the ''Large-scale Structure of the Universe'' symposium are considered on a popular level. Described are the cell structure of galaxy distribution in the Universe, principles of mathematical galaxy distribution modelling. The images of cell structures, obtained after reprocessing with the computer are given. Discussed are three hypothesis - vortical, entropic, adiabatic, suggesting various processes of galaxy and galaxy clusters origin. A considerable advantage of the adiabatic hypothesis is recognized. The relict radiation, as a method of direct studying the processes taking place in the Universe is considered. The large-scale peculiarities and small-scale fluctuations of the relict radiation temperature enable one to estimate the turbance properties at the pre-galaxy stage. The discussion of problems, pertaining to studying the hot gas, contained in galaxy clusters, the interactions within galaxy clusters and with the inter-galaxy medium, is recognized to be a notable contribution into the development of theoretical and observational cosmology

  13. High-performance scientific computing in the cloud

    Science.gov (United States)

    Jorissen, Kevin; Vila, Fernando; Rehr, John

    2011-03-01

    Cloud computing has the potential to open up high-performance computational science to a much broader class of researchers, owing to its ability to provide on-demand, virtualized computational resources. However, before such approaches can become commonplace, user-friendly tools must be developed that hide the unfamiliar cloud environment and streamline the management of cloud resources for many scientific applications. We have recently shown that high-performance cloud computing is feasible for parallelized x-ray spectroscopy calculations. We now present benchmark results for a wider selection of scientific applications focusing on electronic structure and spectroscopic simulation software in condensed matter physics. These applications are driven by an improved portable interface that can manage virtual clusters and run various applications in the cloud. We also describe a next generation of cluster tools, aimed at improved performance and a more robust cluster deployment. Supported by NSF grant OCI-1048052.

  14. A high performance scientific cloud computing environment for materials simulations

    OpenAIRE

    Jorissen, Kevin; Vila, Fernando D.; Rehr, John J.

    2011-01-01

    We describe the development of a scientific cloud computing (SCC) platform that offers high performance computation capability. The platform consists of a scientific virtual machine prototype containing a UNIX operating system and several materials science codes, together with essential interface tools (an SCC toolset) that offers functionality comparable to local compute clusters. In particular, our SCC toolset provides automatic creation of virtual clusters for parallel computing, including...

  15. An interactive display system for large-scale 3D models

    Science.gov (United States)

    Liu, Zijian; Sun, Kun; Tao, Wenbing; Liu, Liman

    2018-04-01

    With the improvement of 3D reconstruction theory and the rapid development of computer hardware technology, the reconstructed 3D models are enlarging in scale and increasing in complexity. Models with tens of thousands of 3D points or triangular meshes are common in practical applications. Due to storage and computing power limitation, it is difficult to achieve real-time display and interaction with large scale 3D models for some common 3D display software, such as MeshLab. In this paper, we propose a display system for large-scale 3D scene models. We construct the LOD (Levels of Detail) model of the reconstructed 3D scene in advance, and then use an out-of-core view-dependent multi-resolution rendering scheme to realize the real-time display of the large-scale 3D model. With the proposed method, our display system is able to render in real time while roaming in the reconstructed scene and 3D camera poses can also be displayed. Furthermore, the memory consumption can be significantly decreased via internal and external memory exchange mechanism, so that it is possible to display a large scale reconstructed scene with over millions of 3D points or triangular meshes in a regular PC with only 4GB RAM.

  16. II - Template Metaprogramming for Massively Parallel Scientific Computing - Vectorization with Expression Templates

    CERN Multimedia

    CERN. Geneva

    2016-01-01

    Large scale scientific computing raises questions on different levels ranging from the fomulation of the problems to the choice of the best algorithms and their implementation for a specific platform. There are similarities in these different topics that can be exploited by modern-style C++ template metaprogramming techniques to produce readable, maintainable and generic code. Traditional low-level code tend to be fast but platform-dependent, and it obfuscates the meaning of the algorithm. On the other hand, object-oriented approach is nice to read, but may come with an inherent performance penalty. These lectures aim to present he basics of the Expression Template (ET) idiom which allows us to keep the object-oriented approach without sacrificing performance. We will in particular show to to enhance ET to include SIMD vectorization. We will then introduce techniques for abstracting iteration, and introduce thread-level parallelism for use in heavy data-centric loads. We will show to to apply these methods i...

  17. Scientific applications of symbolic computation

    International Nuclear Information System (INIS)

    Hearn, A.C.

    1976-02-01

    The use of symbolic computation systems for problem solving in scientific research is reviewed. The nature of the field is described, and particular examples are considered from celestial mechanics, quantum electrodynamics and general relativity. Symbolic integration and some more recent applications of algebra systems are also discussed [fr

  18. Optimization of large-scale heterogeneous system-of-systems models.

    Energy Technology Data Exchange (ETDEWEB)

    Parekh, Ojas; Watson, Jean-Paul; Phillips, Cynthia Ann; Siirola, John; Swiler, Laura Painton; Hough, Patricia Diane (Sandia National Laboratories, Livermore, CA); Lee, Herbert K. H. (University of California, Santa Cruz, Santa Cruz, CA); Hart, William Eugene; Gray, Genetha Anne (Sandia National Laboratories, Livermore, CA); Woodruff, David L. (University of California, Davis, Davis, CA)

    2012-01-01

    Decision makers increasingly rely on large-scale computational models to simulate and analyze complex man-made systems. For example, computational models of national infrastructures are being used to inform government policy, assess economic and national security risks, evaluate infrastructure interdependencies, and plan for the growth and evolution of infrastructure capabilities. A major challenge for decision makers is the analysis of national-scale models that are composed of interacting systems: effective integration of system models is difficult, there are many parameters to analyze in these systems, and fundamental modeling uncertainties complicate analysis. This project is developing optimization methods to effectively represent and analyze large-scale heterogeneous system of systems (HSoS) models, which have emerged as a promising approach for describing such complex man-made systems. These optimization methods enable decision makers to predict future system behavior, manage system risk, assess tradeoffs between system criteria, and identify critical modeling uncertainties.

  19. FPS scientific and supercomputers computers in chemistry

    International Nuclear Information System (INIS)

    Curington, I.J.

    1987-01-01

    FPS Array Processors, scientific computers, and highly parallel supercomputers are used in nearly all aspects of compute-intensive computational chemistry. A survey is made of work utilizing this equipment, both published and current research. The relationship of the computer architecture to computational chemistry is discussed, with specific reference to Molecular Dynamics, Quantum Monte Carlo simulations, and Molecular Graphics applications. Recent installations of the FPS T-Series are highlighted, and examples of Molecular Graphics programs running on the FPS-5000 are shown

  20. Towards Large-area Field-scale Operational Evapotranspiration for Water Use Mapping

    Science.gov (United States)

    Senay, G. B.; Friedrichs, M.; Morton, C.; Huntington, J. L.; Verdin, J.

    2017-12-01

    Field-scale evapotranspiration (ET) estimates are needed for improving surface and groundwater use and water budget studies. Ideally, field-scale ET estimates would be at regional to national levels and cover long time periods. As a result of large data storage and computational requirements associated with processing field-scale satellite imagery such as Landsat, numerous challenges remain to develop operational ET estimates over large areas for detailed water use and availability studies. However, the combination of new science, data availability, and cloud computing technology is enabling unprecedented capabilities for ET mapping. To demonstrate this capability, we used Google's Earth Engine cloud computing platform to create nationwide annual ET estimates with 30-meter resolution Landsat ( 16,000 images) and gridded weather data using the Operational Simplified Surface Energy Balance (SSEBop) model in support of the National Water Census, a USGS research program designed to build decision support capacity for water management agencies and other natural resource managers. By leveraging Google's Earth Engine Application Programming Interface (API) and developing software in a collaborative, open-platform environment, we rapidly advance from research towards applications for large-area field-scale ET mapping. Cloud computing of the Landsat image archive combined with other satellite, climate, and weather data, is creating never imagined opportunities for assessing ET model behavior and uncertainty, and ultimately providing the ability for more robust operational monitoring and assessment of water use at field-scales.

  1. HPCToolkit: performance tools for scientific computing

    Energy Technology Data Exchange (ETDEWEB)

    Tallent, N; Mellor-Crummey, J; Adhianto, L; Fagan, M; Krentel, M [Department of Computer Science, Rice University, Houston, TX 77005 (United States)

    2008-07-15

    As part of the U.S. Department of Energy's Scientific Discovery through Advanced Computing (SciDAC) program, science teams are tackling problems that require simulation and modeling on petascale computers. As part of activities associated with the SciDAC Center for Scalable Application Development Software (CScADS) and the Performance Engineering Research Institute (PERI), Rice University is building software tools for performance analysis of scientific applications on the leadership-class platforms. In this poster abstract, we briefly describe the HPCToolkit performance tools and how they can be used to pinpoint bottlenecks in SPMD and multi-threaded parallel codes. We demonstrate HPCToolkit's utility by applying it to two SciDAC applications: the S3D code for simulation of turbulent combustion and the MFDn code for ab initio calculations of microscopic structure of nuclei.

  2. HPCToolkit: performance tools for scientific computing

    International Nuclear Information System (INIS)

    Tallent, N; Mellor-Crummey, J; Adhianto, L; Fagan, M; Krentel, M

    2008-01-01

    As part of the U.S. Department of Energy's Scientific Discovery through Advanced Computing (SciDAC) program, science teams are tackling problems that require simulation and modeling on petascale computers. As part of activities associated with the SciDAC Center for Scalable Application Development Software (CScADS) and the Performance Engineering Research Institute (PERI), Rice University is building software tools for performance analysis of scientific applications on the leadership-class platforms. In this poster abstract, we briefly describe the HPCToolkit performance tools and how they can be used to pinpoint bottlenecks in SPMD and multi-threaded parallel codes. We demonstrate HPCToolkit's utility by applying it to two SciDAC applications: the S3D code for simulation of turbulent combustion and the MFDn code for ab initio calculations of microscopic structure of nuclei

  3. A full scale approximation of covariance functions for large spatial data sets

    KAUST Repository

    Sang, Huiyan

    2011-10-10

    Gaussian process models have been widely used in spatial statistics but face tremendous computational challenges for very large data sets. The model fitting and spatial prediction of such models typically require O(n 3) operations for a data set of size n. Various approximations of the covariance functions have been introduced to reduce the computational cost. However, most existing approximations cannot simultaneously capture both the large- and the small-scale spatial dependence. A new approximation scheme is developed to provide a high quality approximation to the covariance function at both the large and the small spatial scales. The new approximation is the summation of two parts: a reduced rank covariance and a compactly supported covariance obtained by tapering the covariance of the residual of the reduced rank approximation. Whereas the former part mainly captures the large-scale spatial variation, the latter part captures the small-scale, local variation that is unexplained by the former part. By combining the reduced rank representation and sparse matrix techniques, our approach allows for efficient computation for maximum likelihood estimation, spatial prediction and Bayesian inference. We illustrate the new approach with simulated and real data sets. © 2011 Royal Statistical Society.

  4. A full scale approximation of covariance functions for large spatial data sets

    KAUST Repository

    Sang, Huiyan; Huang, Jianhua Z.

    2011-01-01

    Gaussian process models have been widely used in spatial statistics but face tremendous computational challenges for very large data sets. The model fitting and spatial prediction of such models typically require O(n 3) operations for a data set of size n. Various approximations of the covariance functions have been introduced to reduce the computational cost. However, most existing approximations cannot simultaneously capture both the large- and the small-scale spatial dependence. A new approximation scheme is developed to provide a high quality approximation to the covariance function at both the large and the small spatial scales. The new approximation is the summation of two parts: a reduced rank covariance and a compactly supported covariance obtained by tapering the covariance of the residual of the reduced rank approximation. Whereas the former part mainly captures the large-scale spatial variation, the latter part captures the small-scale, local variation that is unexplained by the former part. By combining the reduced rank representation and sparse matrix techniques, our approach allows for efficient computation for maximum likelihood estimation, spatial prediction and Bayesian inference. We illustrate the new approach with simulated and real data sets. © 2011 Royal Statistical Society.

  5. Simulation of large scale air detritiation operations by computer modeling and bench-scale experimentation

    International Nuclear Information System (INIS)

    Clemmer, R.G.; Land, R.H.; Maroni, V.A.; Mintz, J.M.

    1978-01-01

    Although some experience has been gained in the design and construction of 0.5 to 5 m 3 /s air-detritiation systems, little information is available on the performance of these systems under realistic conditions. Recently completed studies at ANL have attempted to provide some perspective on this subject. A time-dependent computer model was developed to study the effects of various reaction and soaking mechanisms that could occur in a typically-sized fusion reactor building (approximately 10 5 m 3 ) following a range of tritium releases (2 to 200 g). In parallel with the computer study, a small (approximately 50 liter) test chamber was set up to investigate cleanup characteristics under conditions which could also be simulated with the computer code. Whereas results of computer analyses indicated that only approximately 10 -3 percent of the tritium released to an ambient enclosure should be converted to tritiated water, the bench-scale experiments gave evidence of conversions to water greater than 1%. Furthermore, although the amounts (both calculated and observed) of soaked-in tritium are usually only a very small fraction of the total tritium release, the soaked tritium is significant, in that its continuous return to the enclosure extends the cleanup time beyond the predicted value in the absence of any soaking mechanisms

  6. A novel combined SLAM based on RBPF-SLAM and EIF-SLAM for mobile system sensing in a large scale environment.

    Science.gov (United States)

    He, Bo; Zhang, Shujing; Yan, Tianhong; Zhang, Tao; Liang, Yan; Zhang, Hongjin

    2011-01-01

    Mobile autonomous systems are very important for marine scientific investigation and military applications. Many algorithms have been studied to deal with the computational efficiency problem required for large scale simultaneous localization and mapping (SLAM) and its related accuracy and consistency. Among these methods, submap-based SLAM is a more effective one. By combining the strength of two popular mapping algorithms, the Rao-Blackwellised particle filter (RBPF) and extended information filter (EIF), this paper presents a combined SLAM-an efficient submap-based solution to the SLAM problem in a large scale environment. RBPF-SLAM is used to produce local maps, which are periodically fused into an EIF-SLAM algorithm. RBPF-SLAM can avoid linearization of the robot model during operating and provide a robust data association, while EIF-SLAM can improve the whole computational speed, and avoid the tendency of RBPF-SLAM to be over-confident. In order to further improve the computational speed in a real time environment, a binary-tree-based decision-making strategy is introduced. Simulation experiments show that the proposed combined SLAM algorithm significantly outperforms currently existing algorithms in terms of accuracy and consistency, as well as the computing efficiency. Finally, the combined SLAM algorithm is experimentally validated in a real environment by using the Victoria Park dataset.

  7. A Novel Combined SLAM Based on RBPF-SLAM and EIF-SLAM for Mobile System Sensing in a Large Scale Environment

    Directory of Open Access Journals (Sweden)

    Hongjin Zhang

    2011-10-01

    Full Text Available Mobile autonomous systems are very important for marine scientific investigation and military applications. Many algorithms have been studied to deal with the computational efficiency problem required for large scale Simultaneous Localization and Mapping (SLAM and its related accuracy and consistency. Among these methods, submap-based SLAM is a more effective one. By combining the strength of two popular mapping algorithms, the Rao-Blackwellised particle filter (RBPF and extended information filter (EIF, this paper presents a Combined SLAM—an efficient submap-based solution to the SLAM problem in a large scale environment. RBPF-SLAM is used to produce local maps, which are periodically fused into an EIF-SLAM algorithm. RBPF-SLAM can avoid linearization of the robot model during operating and provide a robust data association, while EIF-SLAM can improve the whole computational speed, and avoid the tendency of RBPF-SLAM to be over-confident. In order to further improve the computational speed in a real time environment, a binary-tree-based decision-making strategy is introduced. Simulation experiments show that the proposed Combined SLAM algorithm significantly outperforms currently existing algorithms in terms of accuracy and consistency, as well as the computing efficiency. Finally, the Combined SLAM algorithm is experimentally validated in a real environment by using the Victoria Park dataset.

  8. Enabling Large-Scale Biomedical Analysis in the Cloud

    Directory of Open Access Journals (Sweden)

    Ying-Chih Lin

    2013-01-01

    Full Text Available Recent progress in high-throughput instrumentations has led to an astonishing growth in both volume and complexity of biomedical data collected from various sources. The planet-size data brings serious challenges to the storage and computing technologies. Cloud computing is an alternative to crack the nut because it gives concurrent consideration to enable storage and high-performance computing on large-scale data. This work briefly introduces the data intensive computing system and summarizes existing cloud-based resources in bioinformatics. These developments and applications would facilitate biomedical research to make the vast amount of diversification data meaningful and usable.

  9. Large-scale visualization system for grid environment

    International Nuclear Information System (INIS)

    Suzuki, Yoshio

    2007-01-01

    Center for Computational Science and E-systems of Japan Atomic Energy Agency (CCSE/JAEA) has been conducting R and Ds of distributed computing (grid computing) environments: Seamless Thinking Aid (STA), Information Technology Based Laboratory (ITBL) and Atomic Energy Grid InfraStructure (AEGIS). In these R and Ds, we have developed the visualization technology suitable for the distributed computing environment. As one of the visualization tools, we have developed the Parallel Support Toolkit (PST) which can execute the visualization process parallely on a computer. Now, we improve PST to be executable simultaneously on multiple heterogeneous computers using Seamless Thinking Aid Message Passing Interface (STAMPI). STAMPI, we have developed in these R and Ds, is the MPI library executable on a heterogeneous computing environment. The improvement realizes the visualization of extremely large-scale data and enables more efficient visualization processes in a distributed computing environment. (author)

  10. State of the Art in Large-Scale Soil Moisture Monitoring

    Science.gov (United States)

    Ochsner, Tyson E.; Cosh, Michael Harold; Cuenca, Richard H.; Dorigo, Wouter; Draper, Clara S.; Hagimoto, Yutaka; Kerr, Yan H.; Larson, Kristine M.; Njoku, Eni Gerald; Small, Eric E.; hide

    2013-01-01

    Soil moisture is an essential climate variable influencing land atmosphere interactions, an essential hydrologic variable impacting rainfall runoff processes, an essential ecological variable regulating net ecosystem exchange, and an essential agricultural variable constraining food security. Large-scale soil moisture monitoring has advanced in recent years creating opportunities to transform scientific understanding of soil moisture and related processes. These advances are being driven by researchers from a broad range of disciplines, but this complicates collaboration and communication. For some applications, the science required to utilize large-scale soil moisture data is poorly developed. In this review, we describe the state of the art in large-scale soil moisture monitoring and identify some critical needs for research to optimize the use of increasingly available soil moisture data. We review representative examples of 1) emerging in situ and proximal sensing techniques, 2) dedicated soil moisture remote sensing missions, 3) soil moisture monitoring networks, and 4) applications of large-scale soil moisture measurements. Significant near-term progress seems possible in the use of large-scale soil moisture data for drought monitoring. Assimilation of soil moisture data for meteorological or hydrologic forecasting also shows promise, but significant challenges related to model structures and model errors remain. Little progress has been made yet in the use of large-scale soil moisture observations within the context of ecological or agricultural modeling. Opportunities abound to advance the science and practice of large-scale soil moisture monitoring for the sake of improved Earth system monitoring, modeling, and forecasting.

  11. How the Internet Will Help Large-Scale Assessment Reinvent Itself

    Directory of Open Access Journals (Sweden)

    Randy Elliot Bennett

    2001-02-01

    Full Text Available Large-scale assessment in the United States is undergoing enormous pressure to change. That pressure stems from many causes. Depending upon the type of test, the issues precipitating change include an outmoded cognitive-scientific basis for test design; a mismatch with curriculum; the differential performance of population groups; a lack of information to help individuals improve; and inefficiency. These issues provide a strong motivation to reconceptualize both the substance and the business of large-scale assessment. At the same time, advances in technology, measurement, and cognitive science are providing the means to make that reconceptualization a reality. The thesis of this paper is that the largest facilitating factor will be technological, in particular the Internet. In the same way that it is already helping to revolutionize commerce, education, and even social interaction, the Internet will help revolutionize the business and substance of large-scale assessment.

  12. Topic 14+16: High-performance and scientific applications and extreme-scale computing (Introduction)

    KAUST Repository

    Downes, Turlough P.; Roller, Sabine P.; Seitsonen, Ari Paavo; Valcke, Sophie; Keyes, David E.; Sawley, Marie Christine; Schulthess, Thomas C.; Shalf, John M.

    2013-01-01

    and algorithms to address the varied, complex and increasing challenges of modern research throughout both the "hard" and "soft" sciences. This necessitates being able to use large numbers of compute nodes, many of which are equipped with accelerators

  13. A high performance scientific cloud computing environment for materials simulations

    Science.gov (United States)

    Jorissen, K.; Vila, F. D.; Rehr, J. J.

    2012-09-01

    We describe the development of a scientific cloud computing (SCC) platform that offers high performance computation capability. The platform consists of a scientific virtual machine prototype containing a UNIX operating system and several materials science codes, together with essential interface tools (an SCC toolset) that offers functionality comparable to local compute clusters. In particular, our SCC toolset provides automatic creation of virtual clusters for parallel computing, including tools for execution and monitoring performance, as well as efficient I/O utilities that enable seamless connections to and from the cloud. Our SCC platform is optimized for the Amazon Elastic Compute Cloud (EC2). We present benchmarks for prototypical scientific applications and demonstrate performance comparable to local compute clusters. To facilitate code execution and provide user-friendly access, we have also integrated cloud computing capability in a JAVA-based GUI. Our SCC platform may be an alternative to traditional HPC resources for materials science or quantum chemistry applications.

  14. Stabilization Algorithms for Large-Scale Problems

    DEFF Research Database (Denmark)

    Jensen, Toke Koldborg

    2006-01-01

    The focus of the project is on stabilization of large-scale inverse problems where structured models and iterative algorithms are necessary for computing approximate solutions. For this purpose, we study various iterative Krylov methods and their abilities to produce regularized solutions. Some......-curve. This heuristic is implemented as a part of a larger algorithm which is developed in collaboration with G. Rodriguez and P. C. Hansen. Last, but not least, a large part of the project has, in different ways, revolved around the object-oriented Matlab toolbox MOORe Tools developed by PhD Michael Jacobsen. New...

  15. Dimensioning storage and computing clusters for efficient High Throughput Computing

    CERN Multimedia

    CERN. Geneva

    2012-01-01

    Scientific experiments are producing huge amounts of data, and they continue increasing the size of their datasets and the total volume of data. These data are then processed by researchers belonging to large scientific collaborations, with the Large Hadron Collider being a good example. The focal point of Scientific Data Centres has shifted from coping efficiently with PetaByte scale storage to deliver quality data processing throughput. The dimensioning of the internal components in High Throughput Computing (HTC) data centers is of crucial importance to cope with all the activities demanded by the experiments, both the online (data acceptance) and the offline (data processing, simulation and user analysis). This requires a precise setup involving disk and tape storage services, a computing cluster and the internal networking to prevent bottlenecks, overloads and undesired slowness that lead to losses cpu cycles and batch jobs failures. In this paper we point out relevant features for running a successful s...

  16. Dimensioning storage and computing clusters for efficient high throughput computing

    International Nuclear Information System (INIS)

    Accion, E; Bria, A; Bernabeu, G; Caubet, M; Delfino, M; Espinal, X; Merino, G; Lopez, F; Martinez, F; Planas, E

    2012-01-01

    Scientific experiments are producing huge amounts of data, and the size of their datasets and total volume of data continues increasing. These data are then processed by researchers belonging to large scientific collaborations, with the Large Hadron Collider being a good example. The focal point of scientific data centers has shifted from efficiently coping with PetaByte scale storage to deliver quality data processing throughput. The dimensioning of the internal components in High Throughput Computing (HTC) data centers is of crucial importance to cope with all the activities demanded by the experiments, both the online (data acceptance) and the offline (data processing, simulation and user analysis). This requires a precise setup involving disk and tape storage services, a computing cluster and the internal networking to prevent bottlenecks, overloads and undesired slowness that lead to losses cpu cycles and batch jobs failures. In this paper we point out relevant features for running a successful data storage and processing service in an intensive HTC environment.

  17. Advanced scientific computational methods and their applications to nuclear technologies. (4) Overview of scientific computational methods, introduction of continuum simulation methods and their applications (4)

    International Nuclear Information System (INIS)

    Sekimura, Naoto; Okita, Taira

    2006-01-01

    Scientific computational methods have advanced remarkably with the progress of nuclear development. They have played the role of weft connecting each realm of nuclear engineering and then an introductory course of advanced scientific computational methods and their applications to nuclear technologies were prepared in serial form. This is the fourth issue showing the overview of scientific computational methods with the introduction of continuum simulation methods and their applications. Simulation methods on physical radiation effects on materials are reviewed based on the process such as binary collision approximation, molecular dynamics, kinematic Monte Carlo method, reaction rate method and dislocation dynamics. (T. Tanaka)

  18. Collaborative Visualization for Large-Scale Accelerator Electromagnetic Modeling (Final Report)

    International Nuclear Information System (INIS)

    Schroeder, William J.

    2011-01-01

    This report contains the comprehensive summary of the work performed on the SBIR Phase II, Collaborative Visualization for Large-Scale Accelerator Electromagnetic Modeling at Kitware Inc. in collaboration with Stanford Linear Accelerator Center (SLAC). The goal of the work was to develop collaborative visualization tools for large-scale data as illustrated in the figure below. The solutions we proposed address the typical problems faced by geographicallyand organizationally-separated research and engineering teams, who produce large data (either through simulation or experimental measurement) and wish to work together to analyze and understand their data. Because the data is large, we expect that it cannot be easily transported to each team member's work site, and that the visualization server must reside near the data. Further, we also expect that each work site has heterogeneous resources: some with large computing clients, tiled (or large) displays and high bandwidth; others sites as simple as a team member on a laptop computer. Our solution is based on the open-source, widely used ParaView large-data visualization application. We extended this tool to support multiple collaborative clients who may locally visualize data, and then periodically rejoin and synchronize with the group to discuss their findings. Options for managing session control, adding annotation, and defining the visualization pipeline, among others, were incorporated. We also developed and deployed a Web visualization framework based on ParaView that enables the Web browser to act as a participating client in a collaborative session. The ParaView Web Visualization framework leverages various Web technologies including WebGL, JavaScript, Java and Flash to enable interactive 3D visualization over the web using ParaView as the visualization server. We steered the development of this technology by teaming with the SLAC National Accelerator Laboratory. SLAC has a computationally-intensive problem

  19. Collaborative Visualization for Large-Scale Accelerator Electromagnetic Modeling (Final Report)

    Energy Technology Data Exchange (ETDEWEB)

    William J. Schroeder

    2011-11-13

    This report contains the comprehensive summary of the work performed on the SBIR Phase II, Collaborative Visualization for Large-Scale Accelerator Electromagnetic Modeling at Kitware Inc. in collaboration with Stanford Linear Accelerator Center (SLAC). The goal of the work was to develop collaborative visualization tools for large-scale data as illustrated in the figure below. The solutions we proposed address the typical problems faced by geographicallyand organizationally-separated research and engineering teams, who produce large data (either through simulation or experimental measurement) and wish to work together to analyze and understand their data. Because the data is large, we expect that it cannot be easily transported to each team member's work site, and that the visualization server must reside near the data. Further, we also expect that each work site has heterogeneous resources: some with large computing clients, tiled (or large) displays and high bandwidth; others sites as simple as a team member on a laptop computer. Our solution is based on the open-source, widely used ParaView large-data visualization application. We extended this tool to support multiple collaborative clients who may locally visualize data, and then periodically rejoin and synchronize with the group to discuss their findings. Options for managing session control, adding annotation, and defining the visualization pipeline, among others, were incorporated. We also developed and deployed a Web visualization framework based on ParaView that enables the Web browser to act as a participating client in a collaborative session. The ParaView Web Visualization framework leverages various Web technologies including WebGL, JavaScript, Java and Flash to enable interactive 3D visualization over the web using ParaView as the visualization server. We steered the development of this technology by teaming with the SLAC National Accelerator Laboratory. SLAC has a computationally

  20. Speedup predictions on large scientific parallel programs

    International Nuclear Information System (INIS)

    Williams, E.; Bobrowicz, F.

    1985-01-01

    How much speedup can we expect for large scientific parallel programs running on supercomputers. For insight into this problem we extend the parallel processing environment currently existing on the Cray X-MP (a shared memory multiprocessor with at most four processors) to a simulated N-processor environment, where N greater than or equal to 1. Several large scientific parallel programs from Los Alamos National Laboratory were run in this simulated environment, and speedups were predicted. A speedup of 14.4 on 16 processors was measured for one of the three most used codes at the Laboratory

  1. Fires in large scale ventilation systems

    International Nuclear Information System (INIS)

    Gregory, W.S.; Martin, R.A.; White, B.W.; Nichols, B.D.; Smith, P.R.; Leslie, I.H.; Fenton, D.L.; Gunaji, M.V.; Blythe, J.P.

    1991-01-01

    This paper summarizes the experience gained simulating fires in large scale ventilation systems patterned after ventilation systems found in nuclear fuel cycle facilities. The series of experiments discussed included: (1) combustion aerosol loading of 0.61x0.61 m HEPA filters with the combustion products of two organic fuels, polystyrene and polymethylemethacrylate; (2) gas dynamic and heat transport through a large scale ventilation system consisting of a 0.61x0.61 m duct 90 m in length, with dampers, HEPA filters, blowers, etc.; (3) gas dynamic and simultaneous transport of heat and solid particulate (consisting of glass beads with a mean aerodynamic diameter of 10μ) through the large scale ventilation system; and (4) the transport of heat and soot, generated by kerosene pool fires, through the large scale ventilation system. The FIRAC computer code, designed to predict fire-induced transients in nuclear fuel cycle facility ventilation systems, was used to predict the results of experiments (2) through (4). In general, the results of the predictions were satisfactory. The code predictions for the gas dynamics, heat transport, and particulate transport and deposition were within 10% of the experimentally measured values. However, the code was less successful in predicting the amount of soot generation from kerosene pool fires, probably due to the fire module of the code being a one-dimensional zone model. The experiments revealed a complicated three-dimensional combustion pattern within the fire room of the ventilation system. Further refinement of the fire module within FIRAC is needed. (orig.)

  2. Component-based software for high-performance scientific computing

    Energy Technology Data Exchange (ETDEWEB)

    Alexeev, Yuri; Allan, Benjamin A; Armstrong, Robert C; Bernholdt, David E; Dahlgren, Tamara L; Gannon, Dennis; Janssen, Curtis L; Kenny, Joseph P; Krishnan, Manojkumar; Kohl, James A; Kumfert, Gary; McInnes, Lois Curfman; Nieplocha, Jarek; Parker, Steven G; Rasmussen, Craig; Windus, Theresa L

    2005-01-01

    Recent advances in both computational hardware and multidisciplinary science have given rise to an unprecedented level of complexity in scientific simulation software. This paper describes an ongoing grass roots effort aimed at addressing complexity in high-performance computing through the use of Component-Based Software Engineering (CBSE). Highlights of the benefits and accomplishments of the Common Component Architecture (CCA) Forum and SciDAC ISIC are given, followed by an illustrative example of how the CCA has been applied to drive scientific discovery in quantum chemistry. Thrusts for future research are also described briefly.

  3. Component-based software for high-performance scientific computing

    International Nuclear Information System (INIS)

    Alexeev, Yuri; Allan, Benjamin A; Armstrong, Robert C; Bernholdt, David E; Dahlgren, Tamara L; Gannon, Dennis; Janssen, Curtis L; Kenny, Joseph P; Krishnan, Manojkumar; Kohl, James A; Kumfert, Gary; McInnes, Lois Curfman; Nieplocha, Jarek; Parker, Steven G; Rasmussen, Craig; Windus, Theresa L

    2005-01-01

    Recent advances in both computational hardware and multidisciplinary science have given rise to an unprecedented level of complexity in scientific simulation software. This paper describes an ongoing grass roots effort aimed at addressing complexity in high-performance computing through the use of Component-Based Software Engineering (CBSE). Highlights of the benefits and accomplishments of the Common Component Architecture (CCA) Forum and SciDAC ISIC are given, followed by an illustrative example of how the CCA has been applied to drive scientific discovery in quantum chemistry. Thrusts for future research are also described briefly

  4. Reliability in Warehouse-Scale Computing: Why Low Latency Matters

    DEFF Research Database (Denmark)

    Nannarelli, Alberto

    2015-01-01

    , the limiting factor of these warehouse-scale data centers is the power dissipation. Power is dissipated not only in the computation itself, but also in heat removal (fans, air conditioning, etc.) to keep the temperature of the devices within the operating ranges. The need to keep the temperature low within......Warehouse sized buildings are nowadays hosting several types of large computing systems: from supercomputers to large clusters of servers to provide the infrastructure to the cloud. Although the main target, especially for high-performance computing, is still to achieve high throughput...

  5. Large Scale Document Inversion using a Multi-threaded Computing System.

    Science.gov (United States)

    Jung, Sungbo; Chang, Dar-Jen; Park, Juw Won

    2017-06-01

    Current microprocessor architecture is moving towards multi-core/multi-threaded systems. This trend has led to a surge of interest in using multi-threaded computing devices, such as the Graphics Processing Unit (GPU), for general purpose computing. We can utilize the GPU in computation as a massive parallel coprocessor because the GPU consists of multiple cores. The GPU is also an affordable, attractive, and user-programmable commodity. Nowadays a lot of information has been flooded into the digital domain around the world. Huge volume of data, such as digital libraries, social networking services, e-commerce product data, and reviews, etc., is produced or collected every moment with dramatic growth in size. Although the inverted index is a useful data structure that can be used for full text searches or document retrieval, a large number of documents will require a tremendous amount of time to create the index. The performance of document inversion can be improved by multi-thread or multi-core GPU. Our approach is to implement a linear-time, hash-based, single program multiple data (SPMD), document inversion algorithm on the NVIDIA GPU/CUDA programming platform utilizing the huge computational power of the GPU, to develop high performance solutions for document indexing. Our proposed parallel document inversion system shows 2-3 times faster performance than a sequential system on two different test datasets from PubMed abstract and e-commerce product reviews. •Information systems➝Information retrieval • Computing methodologies➝Massively parallel and high-performance simulations.

  6. A Disciplined Architectural Approach to Scaling Data Analysis for Massive, Scientific Data

    Science.gov (United States)

    Crichton, D. J.; Braverman, A. J.; Cinquini, L.; Turmon, M.; Lee, H.; Law, E.

    2014-12-01

    Data collections across remote sensing and ground-based instruments in astronomy, Earth science, and planetary science are outpacing scientists' ability to analyze them. Furthermore, the distribution, structure, and heterogeneity of the measurements themselves pose challenges that limit the scalability of data analysis using traditional approaches. Methods for developing science data processing pipelines, distribution of scientific datasets, and performing analysis will require innovative approaches that integrate cyber-infrastructure, algorithms, and data into more systematic approaches that can more efficiently compute and reduce data, particularly distributed data. This requires the integration of computer science, machine learning, statistics and domain expertise to identify scalable architectures for data analysis. The size of data returned from Earth Science observing satellites and the magnitude of data from climate model output, is predicted to grow into the tens of petabytes challenging current data analysis paradigms. This same kind of growth is present in astronomy and planetary science data. One of the major challenges in data science and related disciplines defining new approaches to scaling systems and analysis in order to increase scientific productivity and yield. Specific needs include: 1) identification of optimized system architectures for analyzing massive, distributed data sets; 2) algorithms for systematic analysis of massive data sets in distributed environments; and 3) the development of software infrastructures that are capable of performing massive, distributed data analysis across a comprehensive data science framework. NASA/JPL has begun an initiative in data science to address these challenges. Our goal is to evaluate how scientific productivity can be improved through optimized architectural topologies that identify how to deploy and manage the access, distribution, computation, and reduction of massive, distributed data, while

  7. Integrating multiple scientific computing needs via a Private Cloud infrastructure

    International Nuclear Information System (INIS)

    Bagnasco, S; Berzano, D; Brunetti, R; Lusso, S; Vallero, S

    2014-01-01

    In a typical scientific computing centre, diverse applications coexist and share a single physical infrastructure. An underlying Private Cloud facility eases the management and maintenance of heterogeneous use cases such as multipurpose or application-specific batch farms, Grid sites catering to different communities, parallel interactive data analysis facilities and others. It allows to dynamically and efficiently allocate resources to any application and to tailor the virtual machines according to the applications' requirements. Furthermore, the maintenance of large deployments of complex and rapidly evolving middleware and application software is eased by the use of virtual images and contextualization techniques; for example, rolling updates can be performed easily and minimizing the downtime. In this contribution we describe the Private Cloud infrastructure at the INFN-Torino Computer Centre, that hosts a full-fledged WLCG Tier-2 site and a dynamically expandable PROOF-based Interactive Analysis Facility for the ALICE experiment at the CERN LHC and several smaller scientific computing applications. The Private Cloud building blocks include the OpenNebula software stack, the GlusterFS filesystem (used in two different configurations for worker- and service-class hypervisors) and the OpenWRT Linux distribution (used for network virtualization). A future integration into a federated higher-level infrastructure is made possible by exposing commonly used APIs like EC2 and by using mainstream contextualization tools like CloudInit.

  8. Proceedings of joint meeting of the 6th simulation science symposium and the NIFS collaboration research 'large scale computer simulation'

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    2003-03-01

    Joint meeting of the 6th Simulation Science Symposium and the NIFS Collaboration Research 'Large Scale Computer Simulation' was held on December 12-13, 2002 at National Institute for Fusion Science, with the aim of promoting interdisciplinary collaborations in various fields of computer simulations. The present meeting attended by more than 40 people consists of the 11 invited and 22 contributed papers, of which topics were extended not only to fusion science but also to related fields such as astrophysics, earth science, fluid dynamics, molecular dynamics, computer science etc. (author)

  9. Interactive exploration of large-scale time-varying data using dynamic tracking graphs

    KAUST Repository

    Widanagamaachchi, W.

    2012-10-01

    Exploring and analyzing the temporal evolution of features in large-scale time-varying datasets is a common problem in many areas of science and engineering. One natural representation of such data is tracking graphs, i.e., constrained graph layouts that use one spatial dimension to indicate time and show the "tracks" of each feature as it evolves, merges or disappears. However, for practical data sets creating the corresponding optimal graph layouts that minimize the number of intersections can take hours to compute with existing techniques. Furthermore, the resulting graphs are often unmanageably large and complex even with an ideal layout. Finally, due to the cost of the layout, changing the feature definition, e.g. by changing an iso-value, or analyzing properly adjusted sub-graphs is infeasible. To address these challenges, this paper presents a new framework that couples hierarchical feature definitions with progressive graph layout algorithms to provide an interactive exploration of dynamically constructed tracking graphs. Our system enables users to change feature definitions on-the-fly and filter features using arbitrary attributes while providing an interactive view of the resulting tracking graphs. Furthermore, the graph display is integrated into a linked view system that provides a traditional 3D view of the current set of features and allows a cross-linked selection to enable a fully flexible spatio-temporal exploration of data. We demonstrate the utility of our approach with several large-scale scientific simulations from combustion science. © 2012 IEEE.

  10. Large-scale stochasticity in Hamiltonian systems

    International Nuclear Information System (INIS)

    Escande, D.F.

    1982-01-01

    Large scale stochasticity (L.S.S.) in Hamiltonian systems is defined on the paradigm Hamiltonian H(v,x,t) =v 2 /2-M cos x-P cos k(x-t) which describes the motion of one particle in two electrostatic waves. A renormalization transformation Tsub(r) is described which acts as a microscope that focusses on a given KAM (Kolmogorov-Arnold-Moser) torus in phase space. Though approximate, Tsub(r) yields the threshold of L.S.S. in H with an error of 5-10%. The universal behaviour of KAM tori is predicted: for instance the scale invariance of KAM tori and the critical exponent of the Lyapunov exponent of Cantori. The Fourier expansion of KAM tori is computed and several conjectures by L. Kadanoff and S. Shenker are proved. Chirikov's standard mapping for stochastic layers is derived in a simpler way and the width of the layers is computed. A simpler renormalization scheme for these layers is defined. A Mathieu equation for describing the stability of a discrete family of cycles is derived. When combined with Tsub(r), it allows to prove the link between KAM tori and nearby cycles, conjectured by J. Greene and, in particular, to compute the mean residue of a torus. The fractal diagrams defined by G. Schmidt are computed. A sketch of a methodology for computing the L.S.S. threshold in any two-degree-of-freedom Hamiltonian system is given. (Auth.)

  11. Accelerating large-scale phase-field simulations with GPU

    Directory of Open Access Journals (Sweden)

    Xiaoming Shi

    2017-10-01

    Full Text Available A new package for accelerating large-scale phase-field simulations was developed by using GPU based on the semi-implicit Fourier method. The package can solve a variety of equilibrium equations with different inhomogeneity including long-range elastic, magnetostatic, and electrostatic interactions. Through using specific algorithm in Compute Unified Device Architecture (CUDA, Fourier spectral iterative perturbation method was integrated in GPU package. The Allen-Cahn equation, Cahn-Hilliard equation, and phase-field model with long-range interaction were solved based on the algorithm running on GPU respectively to test the performance of the package. From the comparison of the calculation results between the solver executed in single CPU and the one on GPU, it was found that the speed on GPU is enormously elevated to 50 times faster. The present study therefore contributes to the acceleration of large-scale phase-field simulations and provides guidance for experiments to design large-scale functional devices.

  12. Large-scale, high-performance and cloud-enabled multi-model analytics experiments in the context of the Earth System Grid Federation

    Science.gov (United States)

    Fiore, S.; Płóciennik, M.; Doutriaux, C.; Blanquer, I.; Barbera, R.; Williams, D. N.; Anantharaj, V. G.; Evans, B. J. K.; Salomoni, D.; Aloisio, G.

    2017-12-01

    time this contribution is being written, the proposed testbed represents the first implementation of a distributed large-scale, multi-model experiment in the ESGF/CMIP context, joining together server-side approaches for scientific data analysis, HPDA frameworks, end-to-end workflow management, and cloud computing.

  13. Load Balancing Scientific Applications

    Energy Technology Data Exchange (ETDEWEB)

    Pearce, Olga Tkachyshyn [Texas A & M Univ., College Station, TX (United States)

    2014-12-01

    The largest supercomputers have millions of independent processors, and concurrency levels are rapidly increasing. For ideal efficiency, developers of the simulations that run on these machines must ensure that computational work is evenly balanced among processors. Assigning work evenly is challenging because many large modern parallel codes simulate behavior of physical systems that evolve over time, and their workloads change over time. Furthermore, the cost of imbalanced load increases with scale because most large-scale scientific simulations today use a Single Program Multiple Data (SPMD) parallel programming model, and an increasing number of processors will wait for the slowest one at the synchronization points. To address load imbalance, many large-scale parallel applications use dynamic load balance algorithms to redistribute work evenly. The research objective of this dissertation is to develop methods to decide when and how to load balance the application, and to balance it effectively and affordably. We measure and evaluate the computational load of the application, and develop strategies to decide when and how to correct the imbalance. Depending on the simulation, a fast, local load balance algorithm may be suitable, or a more sophisticated and expensive algorithm may be required. We developed a model for comparison of load balance algorithms for a specific state of the simulation that enables the selection of a balancing algorithm that will minimize overall runtime.

  14. Evaluation of Kirkwood-Buff integrals via finite size scaling: a large scale molecular dynamics study

    Science.gov (United States)

    Dednam, W.; Botha, A. E.

    2015-01-01

    Solvation of bio-molecules in water is severely affected by the presence of co-solvent within the hydration shell of the solute structure. Furthermore, since solute molecules can range from small molecules, such as methane, to very large protein structures, it is imperative to understand the detailed structure-function relationship on the microscopic level. For example, it is useful know the conformational transitions that occur in protein structures. Although such an understanding can be obtained through large-scale molecular dynamic simulations, it is often the case that such simulations would require excessively large simulation times. In this context, Kirkwood-Buff theory, which connects the microscopic pair-wise molecular distributions to global thermodynamic properties, together with the recently developed technique, called finite size scaling, may provide a better method to reduce system sizes, and hence also the computational times. In this paper, we present molecular dynamics trial simulations of biologically relevant low-concentration solvents, solvated by aqueous co-solvent solutions. In particular we compare two different methods of calculating the relevant Kirkwood-Buff integrals. The first (traditional) method computes running integrals over the radial distribution functions, which must be obtained from large system-size NVT or NpT simulations. The second, newer method, employs finite size scaling to obtain the Kirkwood-Buff integrals directly by counting the particle number fluctuations in small, open sub-volumes embedded within a larger reservoir that can be well approximated by a much smaller simulation cell. In agreement with previous studies, which made a similar comparison for aqueous co-solvent solutions, without the additional solvent, we conclude that the finite size scaling method is also applicable to the present case, since it can produce computationally more efficient results which are equivalent to the more costly radial distribution

  15. Evaluation of Kirkwood-Buff integrals via finite size scaling: a large scale molecular dynamics study

    International Nuclear Information System (INIS)

    Dednam, W; Botha, A E

    2015-01-01

    Solvation of bio-molecules in water is severely affected by the presence of co-solvent within the hydration shell of the solute structure. Furthermore, since solute molecules can range from small molecules, such as methane, to very large protein structures, it is imperative to understand the detailed structure-function relationship on the microscopic level. For example, it is useful know the conformational transitions that occur in protein structures. Although such an understanding can be obtained through large-scale molecular dynamic simulations, it is often the case that such simulations would require excessively large simulation times. In this context, Kirkwood-Buff theory, which connects the microscopic pair-wise molecular distributions to global thermodynamic properties, together with the recently developed technique, called finite size scaling, may provide a better method to reduce system sizes, and hence also the computational times. In this paper, we present molecular dynamics trial simulations of biologically relevant low-concentration solvents, solvated by aqueous co-solvent solutions. In particular we compare two different methods of calculating the relevant Kirkwood-Buff integrals. The first (traditional) method computes running integrals over the radial distribution functions, which must be obtained from large system-size NVT or NpT simulations. The second, newer method, employs finite size scaling to obtain the Kirkwood-Buff integrals directly by counting the particle number fluctuations in small, open sub-volumes embedded within a larger reservoir that can be well approximated by a much smaller simulation cell. In agreement with previous studies, which made a similar comparison for aqueous co-solvent solutions, without the additional solvent, we conclude that the finite size scaling method is also applicable to the present case, since it can produce computationally more efficient results which are equivalent to the more costly radial distribution

  16. A new asynchronous parallel algorithm for inferring large-scale gene regulatory networks.

    Directory of Open Access Journals (Sweden)

    Xiangyun Xiao

    Full Text Available The reconstruction of gene regulatory networks (GRNs from high-throughput experimental data has been considered one of the most important issues in systems biology research. With the development of high-throughput technology and the complexity of biological problems, we need to reconstruct GRNs that contain thousands of genes. However, when many existing algorithms are used to handle these large-scale problems, they will encounter two important issues: low accuracy and high computational cost. To overcome these difficulties, the main goal of this study is to design an effective parallel algorithm to infer large-scale GRNs based on high-performance parallel computing environments. In this study, we proposed a novel asynchronous parallel framework to improve the accuracy and lower the time complexity of large-scale GRN inference by combining splitting technology and ordinary differential equation (ODE-based optimization. The presented algorithm uses the sparsity and modularity of GRNs to split whole large-scale GRNs into many small-scale modular subnetworks. Through the ODE-based optimization of all subnetworks in parallel and their asynchronous communications, we can easily obtain the parameters of the whole network. To test the performance of the proposed approach, we used well-known benchmark datasets from Dialogue for Reverse Engineering Assessments and Methods challenge (DREAM, experimentally determined GRN of Escherichia coli and one published dataset that contains more than 10 thousand genes to compare the proposed approach with several popular algorithms on the same high-performance computing environments in terms of both accuracy and time complexity. The numerical results demonstrate that our parallel algorithm exhibits obvious superiority in inferring large-scale GRNs.

  17. A new asynchronous parallel algorithm for inferring large-scale gene regulatory networks.

    Science.gov (United States)

    Xiao, Xiangyun; Zhang, Wei; Zou, Xiufen

    2015-01-01

    The reconstruction of gene regulatory networks (GRNs) from high-throughput experimental data has been considered one of the most important issues in systems biology research. With the development of high-throughput technology and the complexity of biological problems, we need to reconstruct GRNs that contain thousands of genes. However, when many existing algorithms are used to handle these large-scale problems, they will encounter two important issues: low accuracy and high computational cost. To overcome these difficulties, the main goal of this study is to design an effective parallel algorithm to infer large-scale GRNs based on high-performance parallel computing environments. In this study, we proposed a novel asynchronous parallel framework to improve the accuracy and lower the time complexity of large-scale GRN inference by combining splitting technology and ordinary differential equation (ODE)-based optimization. The presented algorithm uses the sparsity and modularity of GRNs to split whole large-scale GRNs into many small-scale modular subnetworks. Through the ODE-based optimization of all subnetworks in parallel and their asynchronous communications, we can easily obtain the parameters of the whole network. To test the performance of the proposed approach, we used well-known benchmark datasets from Dialogue for Reverse Engineering Assessments and Methods challenge (DREAM), experimentally determined GRN of Escherichia coli and one published dataset that contains more than 10 thousand genes to compare the proposed approach with several popular algorithms on the same high-performance computing environments in terms of both accuracy and time complexity. The numerical results demonstrate that our parallel algorithm exhibits obvious superiority in inferring large-scale GRNs.

  18. Hydrometeorological variability on a large french catchment and its relation to large-scale circulation across temporal scales

    Science.gov (United States)

    Massei, Nicolas; Dieppois, Bastien; Fritier, Nicolas; Laignel, Benoit; Debret, Maxime; Lavers, David; Hannah, David

    2015-04-01

    In the present context of global changes, considerable efforts have been deployed by the hydrological scientific community to improve our understanding of the impacts of climate fluctuations on water resources. Both observational and modeling studies have been extensively employed to characterize hydrological changes and trends, assess the impact of climate variability or provide future scenarios of water resources. In the aim of a better understanding of hydrological changes, it is of crucial importance to determine how and to what extent trends and long-term oscillations detectable in hydrological variables are linked to global climate oscillations. In this work, we develop an approach associating large-scale/local-scale correlation, enmpirical statistical downscaling and wavelet multiresolution decomposition of monthly precipitation and streamflow over the Seine river watershed, and the North Atlantic sea level pressure (SLP) in order to gain additional insights on the atmospheric patterns associated with the regional hydrology. We hypothesized that: i) atmospheric patterns may change according to the different temporal wavelengths defining the variability of the signals; and ii) definition of those hydrological/circulation relationships for each temporal wavelength may improve the determination of large-scale predictors of local variations. The results showed that the large-scale/local-scale links were not necessarily constant according to time-scale (i.e. for the different frequencies characterizing the signals), resulting in changing spatial patterns across scales. This was then taken into account by developing an empirical statistical downscaling (ESD) modeling approach which integrated discrete wavelet multiresolution analysis for reconstructing local hydrometeorological processes (predictand : precipitation and streamflow on the Seine river catchment) based on a large-scale predictor (SLP over the Euro-Atlantic sector) on a monthly time-step. This approach

  19. Software Defects, Scientific Computation and the Scientific Method

    CERN Multimedia

    CERN. Geneva

    2011-01-01

    Computation has rapidly grown in the last 50 years so that in many scientific areas it is the dominant partner in the practice of science. Unfortunately, unlike the experimental sciences, it does not adhere well to the principles of the scientific method as espoused by, for example, the philosopher Karl Popper. Such principles are built around the notions of deniability and reproducibility. Although much research effort has been spent on measuring the density of software defects, much less has been spent on the more difficult problem of measuring their effect on the output of a program. This talk explores these issues with numerous examples suggesting how this situation might be improved to match the demands of modern science. Finally it develops a theoretical model based on an amalgam of statistical mechanics and Hartley/Shannon information theory which suggests that software systems have strong implementation independent behaviour and supports the widely observed phenomenon that defects clust...

  20. Large Scale Document Inversion using a Multi-threaded Computing System

    Science.gov (United States)

    Jung, Sungbo; Chang, Dar-Jen; Park, Juw Won

    2018-01-01

    Current microprocessor architecture is moving towards multi-core/multi-threaded systems. This trend has led to a surge of interest in using multi-threaded computing devices, such as the Graphics Processing Unit (GPU), for general purpose computing. We can utilize the GPU in computation as a massive parallel coprocessor because the GPU consists of multiple cores. The GPU is also an affordable, attractive, and user-programmable commodity. Nowadays a lot of information has been flooded into the digital domain around the world. Huge volume of data, such as digital libraries, social networking services, e-commerce product data, and reviews, etc., is produced or collected every moment with dramatic growth in size. Although the inverted index is a useful data structure that can be used for full text searches or document retrieval, a large number of documents will require a tremendous amount of time to create the index. The performance of document inversion can be improved by multi-thread or multi-core GPU. Our approach is to implement a linear-time, hash-based, single program multiple data (SPMD), document inversion algorithm on the NVIDIA GPU/CUDA programming platform utilizing the huge computational power of the GPU, to develop high performance solutions for document indexing. Our proposed parallel document inversion system shows 2-3 times faster performance than a sequential system on two different test datasets from PubMed abstract and e-commerce product reviews. CCS Concepts •Information systems➝Information retrieval • Computing methodologies➝Massively parallel and high-performance simulations.

  1. BioPig: a Hadoop-based analytic toolkit for large-scale sequence data.

    Science.gov (United States)

    Nordberg, Henrik; Bhatia, Karan; Wang, Kai; Wang, Zhong

    2013-12-01

    The recent revolution in sequencing technologies has led to an exponential growth of sequence data. As a result, most of the current bioinformatics tools become obsolete as they fail to scale with data. To tackle this 'data deluge', here we introduce the BioPig sequence analysis toolkit as one of the solutions that scale to data and computation. We built BioPig on the Apache's Hadoop MapReduce system and the Pig data flow language. Compared with traditional serial and MPI-based algorithms, BioPig has three major advantages: first, BioPig's programmability greatly reduces development time for parallel bioinformatics applications; second, testing BioPig with up to 500 Gb sequences demonstrates that it scales automatically with size of data; and finally, BioPig can be ported without modification on many Hadoop infrastructures, as tested with Magellan system at National Energy Research Scientific Computing Center and the Amazon Elastic Compute Cloud. In summary, BioPig represents a novel program framework with the potential to greatly accelerate data-intensive bioinformatics analysis.

  2. Distributed computing and nuclear reactor analysis

    International Nuclear Information System (INIS)

    Brown, F.B.; Derstine, K.L.; Blomquist, R.N.

    1994-01-01

    Large-scale scientific and engineering calculations for nuclear reactor analysis can now be carried out effectively in a distributed computing environment, at costs far lower than for traditional mainframes. The distributed computing environment must include support for traditional system services, such as a queuing system for batch work, reliable filesystem backups, and parallel processing capabilities for large jobs. All ANL computer codes for reactor analysis have been adapted successfully to a distributed system based on workstations and X-terminals. Distributed parallel processing has been demonstrated to be effective for long-running Monte Carlo calculations

  3. Novel algorithm of large-scale simultaneous linear equations

    International Nuclear Information System (INIS)

    Fujiwara, T; Hoshi, T; Yamamoto, S; Sogabe, T; Zhang, S-L

    2010-01-01

    We review our recently developed methods of solving large-scale simultaneous linear equations and applications to electronic structure calculations both in one-electron theory and many-electron theory. This is the shifted COCG (conjugate orthogonal conjugate gradient) method based on the Krylov subspace, and the most important issue for applications is the shift equation and the seed switching method, which greatly reduce the computational cost. The applications to nano-scale Si crystals and the double orbital extended Hubbard model are presented.

  4. Large-scale inverse model analyses employing fast randomized data reduction

    Science.gov (United States)

    Lin, Youzuo; Le, Ellen B.; O'Malley, Daniel; Vesselinov, Velimir V.; Bui-Thanh, Tan

    2017-08-01

    When the number of observations is large, it is computationally challenging to apply classical inverse modeling techniques. We have developed a new computationally efficient technique for solving inverse problems with a large number of observations (e.g., on the order of 107 or greater). Our method, which we call the randomized geostatistical approach (RGA), is built upon the principal component geostatistical approach (PCGA). We employ a data reduction technique combined with the PCGA to improve the computational efficiency and reduce the memory usage. Specifically, we employ a randomized numerical linear algebra technique based on a so-called "sketching" matrix to effectively reduce the dimension of the observations without losing the information content needed for the inverse analysis. In this way, the computational and memory costs for RGA scale with the information content rather than the size of the calibration data. Our algorithm is coded in Julia and implemented in the MADS open-source high-performance computational framework (http://mads.lanl.gov). We apply our new inverse modeling method to invert for a synthetic transmissivity field. Compared to a standard geostatistical approach (GA), our method is more efficient when the number of observations is large. Most importantly, our method is capable of solving larger inverse problems than the standard GA and PCGA approaches. Therefore, our new model inversion method is a powerful tool for solving large-scale inverse problems. The method can be applied in any field and is not limited to hydrogeological applications such as the characterization of aquifer heterogeneity.

  5. Subgrid-scale models for large-eddy simulation of rotating turbulent channel flows

    Science.gov (United States)

    Silvis, Maurits H.; Bae, Hyunji Jane; Trias, F. Xavier; Abkar, Mahdi; Moin, Parviz; Verstappen, Roel

    2017-11-01

    We aim to design subgrid-scale models for large-eddy simulation of rotating turbulent flows. Rotating turbulent flows form a challenging test case for large-eddy simulation due to the presence of the Coriolis force. The Coriolis force conserves the total kinetic energy while transporting it from small to large scales of motion, leading to the formation of large-scale anisotropic flow structures. The Coriolis force may also cause partial flow laminarization and the occurrence of turbulent bursts. Many subgrid-scale models for large-eddy simulation are, however, primarily designed to parametrize the dissipative nature of turbulent flows, ignoring the specific characteristics of transport processes. We, therefore, propose a new subgrid-scale model that, in addition to the usual dissipative eddy viscosity term, contains a nondissipative nonlinear model term designed to capture transport processes, such as those due to rotation. We show that the addition of this nonlinear model term leads to improved predictions of the energy spectra of rotating homogeneous isotropic turbulence as well as of the Reynolds stress anisotropy in spanwise-rotating plane-channel flows. This work is financed by the Netherlands Organisation for Scientific Research (NWO) under Project Number 613.001.212.

  6. Large-scale solar purchasing

    International Nuclear Information System (INIS)

    1999-01-01

    The principal objective of the project was to participate in the definition of a new IEA task concerning solar procurement (''the Task'') and to assess whether involvement in the task would be in the interest of the UK active solar heating industry. The project also aimed to assess the importance of large scale solar purchasing to UK active solar heating market development and to evaluate the level of interest in large scale solar purchasing amongst potential large scale purchasers (in particular housing associations and housing developers). A further aim of the project was to consider means of stimulating large scale active solar heating purchasing activity within the UK. (author)

  7. Localization Algorithm Based on a Spring Model (LASM for Large Scale Wireless Sensor Networks

    Directory of Open Access Journals (Sweden)

    Shuai Li

    2008-03-01

    Full Text Available A navigation method for a lunar rover based on large scale wireless sensornetworks is proposed. To obtain high navigation accuracy and large exploration area, highnode localization accuracy and large network scale are required. However, thecomputational and communication complexity and time consumption are greatly increasedwith the increase of the network scales. A localization algorithm based on a spring model(LASM method is proposed to reduce the computational complexity, while maintainingthe localization accuracy in large scale sensor networks. The algorithm simulates thedynamics of physical spring system to estimate the positions of nodes. The sensor nodesare set as particles with masses and connected with neighbor nodes by virtual springs. Thevirtual springs will force the particles move to the original positions, the node positionscorrespondingly, from the randomly set positions. Therefore, a blind node position can bedetermined from the LASM algorithm by calculating the related forces with the neighbornodes. The computational and communication complexity are O(1 for each node, since thenumber of the neighbor nodes does not increase proportionally with the network scale size.Three patches are proposed to avoid local optimization, kick out bad nodes and deal withnode variation. Simulation results show that the computational and communicationcomplexity are almost constant despite of the increase of the network scale size. The time consumption has also been proven to remain almost constant since the calculation steps arealmost unrelated with the network scale size.

  8. Computational biology in the cloud: methods and new insights from computing at scale.

    Science.gov (United States)

    Kasson, Peter M

    2013-01-01

    The past few years have seen both explosions in the size of biological data sets and the proliferation of new, highly flexible on-demand computing capabilities. The sheer amount of information available from genomic and metagenomic sequencing, high-throughput proteomics, experimental and simulation datasets on molecular structure and dynamics affords an opportunity for greatly expanded insight, but it creates new challenges of scale for computation, storage, and interpretation of petascale data. Cloud computing resources have the potential to help solve these problems by offering a utility model of computing and storage: near-unlimited capacity, the ability to burst usage, and cheap and flexible payment models. Effective use of cloud computing on large biological datasets requires dealing with non-trivial problems of scale and robustness, since performance-limiting factors can change substantially when a dataset grows by a factor of 10,000 or more. New computing paradigms are thus often needed. The use of cloud platforms also creates new opportunities to share data, reduce duplication, and to provide easy reproducibility by making the datasets and computational methods easily available.

  9. Robust large-scale parallel nonlinear solvers for simulations.

    Energy Technology Data Exchange (ETDEWEB)

    Bader, Brett William; Pawlowski, Roger Patrick; Kolda, Tamara Gibson (Sandia National Laboratories, Livermore, CA)

    2005-11-01

    This report documents research to develop robust and efficient solution techniques for solving large-scale systems of nonlinear equations. The most widely used method for solving systems of nonlinear equations is Newton's method. While much research has been devoted to augmenting Newton-based solvers (usually with globalization techniques), little has been devoted to exploring the application of different models. Our research has been directed at evaluating techniques using different models than Newton's method: a lower order model, Broyden's method, and a higher order model, the tensor method. We have developed large-scale versions of each of these models and have demonstrated their use in important applications at Sandia. Broyden's method replaces the Jacobian with an approximation, allowing codes that cannot evaluate a Jacobian or have an inaccurate Jacobian to converge to a solution. Limited-memory methods, which have been successful in optimization, allow us to extend this approach to large-scale problems. We compare the robustness and efficiency of Newton's method, modified Newton's method, Jacobian-free Newton-Krylov method, and our limited-memory Broyden method. Comparisons are carried out for large-scale applications of fluid flow simulations and electronic circuit simulations. Results show that, in cases where the Jacobian was inaccurate or could not be computed, Broyden's method converged in some cases where Newton's method failed to converge. We identify conditions where Broyden's method can be more efficient than Newton's method. We also present modifications to a large-scale tensor method, originally proposed by Bouaricha, for greater efficiency, better robustness, and wider applicability. Tensor methods are an alternative to Newton-based methods and are based on computing a step based on a local quadratic model rather than a linear model. The advantage of Bouaricha's method is that it can use any

  10. Challenges in scaling NLO generators to leadership computers

    Science.gov (United States)

    Benjamin, D.; Childers, JT; Hoeche, S.; LeCompte, T.; Uram, T.

    2017-10-01

    Exascale computing resources are roughly a decade away and will be capable of 100 times more computing than current supercomputers. In the last year, Energy Frontier experiments crossed a milestone of 100 million core-hours used at the Argonne Leadership Computing Facility, Oak Ridge Leadership Computing Facility, and NERSC. The Fortran-based leading-order parton generator called Alpgen was successfully scaled to millions of threads to achieve this level of usage on Mira. Sherpa and MadGraph are next-to-leading order generators used heavily by LHC experiments for simulation. Integration times for high-multiplicity or rare processes can take a week or more on standard Grid machines, even using all 16-cores. We will describe our ongoing work to scale the Sherpa generator to thousands of threads on leadership-class machines and reduce run-times to less than a day. This work allows the experiments to leverage large-scale parallel supercomputers for event generation today, freeing tens of millions of grid hours for other work, and paving the way for future applications (simulation, reconstruction) on these and future supercomputers.

  11. A Computing Environment to Support Repeatable Scientific Big Data Experimentation of World-Wide Scientific Literature

    Energy Technology Data Exchange (ETDEWEB)

    Schlicher, Bob G [ORNL; Kulesz, James J [ORNL; Abercrombie, Robert K [ORNL; Kruse, Kara L [ORNL

    2015-01-01

    A principal tenant of the scientific method is that experiments must be repeatable and relies on ceteris paribus (i.e., all other things being equal). As a scientific community, involved in data sciences, we must investigate ways to establish an environment where experiments can be repeated. We can no longer allude to where the data comes from, we must add rigor to the data collection and management process from which our analysis is conducted. This paper describes a computing environment to support repeatable scientific big data experimentation of world-wide scientific literature, and recommends a system that is housed at the Oak Ridge National Laboratory in order to provide value to investigators from government agencies, academic institutions, and industry entities. The described computing environment also adheres to the recently instituted digital data management plan mandated by multiple US government agencies, which involves all stages of the digital data life cycle including capture, analysis, sharing, and preservation. It particularly focuses on the sharing and preservation of digital research data. The details of this computing environment are explained within the context of cloud services by the three layer classification of Software as a Service , Platform as a Service , and Infrastructure as a Service .

  12. TensorFlow: A system for large-scale machine learning

    OpenAIRE

    Abadi, Martín; Barham, Paul; Chen, Jianmin; Chen, Zhifeng; Davis, Andy; Dean, Jeffrey; Devin, Matthieu; Ghemawat, Sanjay; Irving, Geoffrey; Isard, Michael; Kudlur, Manjunath; Levenberg, Josh; Monga, Rajat; Moore, Sherry; Murray, Derek G.

    2016-01-01

    TensorFlow is a machine learning system that operates at large scale and in heterogeneous environments. TensorFlow uses dataflow graphs to represent computation, shared state, and the operations that mutate that state. It maps the nodes of a dataflow graph across many machines in a cluster, and within a machine across multiple computational devices, including multicore CPUs, general-purpose GPUs, and custom designed ASICs known as Tensor Processing Units (TPUs). This architecture gives flexib...

  13. III - Template Metaprogramming for massively parallel scientific computing - Templates for Iteration; Thread-level Parallelism

    CERN Multimedia

    CERN. Geneva

    2016-01-01

    Large scale scientific computing raises questions on different levels ranging from the fomulation of the problems to the choice of the best algorithms and their implementation for a specific platform. There are similarities in these different topics that can be exploited by modern-style C++ template metaprogramming techniques to produce readable, maintainable and generic code. Traditional low-level code tend to be fast but platform-dependent, and it obfuscates the meaning of the algorithm. On the other hand, object-oriented approach is nice to read, but may come with an inherent performance penalty. These lectures aim to present he basics of the Expression Template (ET) idiom which allows us to keep the object-oriented approach without sacrificing performance. We will in particular show to to enhance ET to include SIMD vectorization. We will then introduce techniques for abstracting iteration, and introduce thread-level parallelism for use in heavy data-centric loads. We will show to to apply these methods i...

  14. Building a High Performance Computing Infrastructure for Novosibirsk Scientific Center

    International Nuclear Information System (INIS)

    Adakin, A; Chubarov, D; Nikultsev, V; Belov, S; Kaplin, V; Sukharev, A; Zaytsev, A; Kalyuzhny, V; Kuchin, N; Lomakin, S

    2011-01-01

    Novosibirsk Scientific Center (NSC), also known worldwide as Akademgorodok, is one of the largest Russian scientific centers hosting Novosibirsk State University (NSU) and more than 35 research organizations of the Siberian Branch of Russian Academy of Sciences including Budker Institute of Nuclear Physics (BINP), Institute of Computational Technologies (ICT), and Institute of Computational Mathematics and Mathematical Geophysics (ICM and MG). Since each institute has specific requirements on the architecture of the computing farms involved in its research field, currently we've got several computing facilities hosted by NSC institutes, each optimized for the particular set of tasks, of which the largest are the NSU Supercomputer Center, Siberian Supercomputer Center (ICM and MG), and a Grid Computing Facility of BINP. Recently a dedicated optical network with the initial bandwidth of 10 Gbps connecting these three facilities was built in order to make it possible to share the computing resources among the research communities of participating institutes, thus providing a common platform for building the computing infrastructure for various scientific projects. Unification of the computing infrastructure is achieved by extensive use of virtualization technologies based on XEN and KVM platforms. The solution implemented was tested thoroughly within the computing environment of KEDR detector experiment which is being carried out at BINP, and foreseen to be applied to the use cases of other HEP experiments in the upcoming future.

  15. Large-Scale Optimization for Bayesian Inference in Complex Systems

    Energy Technology Data Exchange (ETDEWEB)

    Willcox, Karen [MIT; Marzouk, Youssef [MIT

    2013-11-12

    The SAGUARO (Scalable Algorithms for Groundwater Uncertainty Analysis and Robust Optimization) Project focused on the development of scalable numerical algorithms for large-scale Bayesian inversion in complex systems that capitalize on advances in large-scale simulation-based optimization and inversion methods. The project was a collaborative effort among MIT, the University of Texas at Austin, Georgia Institute of Technology, and Sandia National Laboratories. The research was directed in three complementary areas: efficient approximations of the Hessian operator, reductions in complexity of forward simulations via stochastic spectral approximations and model reduction, and employing large-scale optimization concepts to accelerate sampling. The MIT--Sandia component of the SAGUARO Project addressed the intractability of conventional sampling methods for large-scale statistical inverse problems by devising reduced-order models that are faithful to the full-order model over a wide range of parameter values; sampling then employs the reduced model rather than the full model, resulting in very large computational savings. Results indicate little effect on the computed posterior distribution. On the other hand, in the Texas--Georgia Tech component of the project, we retain the full-order model, but exploit inverse problem structure (adjoint-based gradients and partial Hessian information of the parameter-to-observation map) to implicitly extract lower dimensional information on the posterior distribution; this greatly speeds up sampling methods, so that fewer sampling points are needed. We can think of these two approaches as ``reduce then sample'' and ``sample then reduce.'' In fact, these two approaches are complementary, and can be used in conjunction with each other. Moreover, they both exploit deterministic inverse problem structure, in the form of adjoint-based gradient and Hessian information of the underlying parameter-to-observation map, to

  16. Computational challenges of large-scale, long-time, first-principles molecular dynamics

    International Nuclear Information System (INIS)

    Kent, P R C

    2008-01-01

    Plane wave density functional calculations have traditionally been able to use the largest available supercomputing resources. We analyze the scalability of modern projector-augmented wave implementations to identify the challenges in performing molecular dynamics calculations of large systems containing many thousands of electrons. Benchmark calculations on the Cray XT4 demonstrate that global linear-algebra operations are the primary reason for limited parallel scalability. Plane-wave related operations can be made sufficiently scalable. Improving parallel linear-algebra performance is an essential step to reaching longer timescales in future large-scale molecular dynamics calculations

  17. DOE Advanced Scientific Computing Advisory Committee (ASCAC) Subcommittee Report on Scientific and Technical Information

    Energy Technology Data Exchange (ETDEWEB)

    Hey, Tony [eScience Institute, University of Washington; Agarwal, Deborah [Lawrence Berkeley National Laboratory; Borgman, Christine [University of California, Los Angeles; Cartaro, Concetta [SLAC National Accelerator Laboratory; Crivelli, Silvia [Lawrence Berkeley National Laboratory; Van Dam, Kerstin Kleese [Pacific Northwest National Laboratory; Luce, Richard [University of Oklahoma; Arjun, Shankar [CADES, Oak Ridge National Laboratory; Trefethen, Anne [University of Oxford; Wade, Alex [Microsoft Research, Microsoft Corporation; Williams, Dean [Lawrence Livermore National Laboratory

    2015-09-04

    The Advanced Scientific Computing Advisory Committee (ASCAC) was charged to form a standing subcommittee to review the Department of Energy’s Office of Scientific and Technical Information (OSTI) and to begin by assessing the quality and effectiveness of OSTI’s recent and current products and services and to comment on its mission and future directions in the rapidly changing environment for scientific publication and data. The Committee met with OSTI staff and reviewed available products, services and other materials. This report summaries their initial findings and recommendations.

  18. Multi-level discriminative dictionary learning with application to large scale image classification.

    Science.gov (United States)

    Shen, Li; Sun, Gang; Huang, Qingming; Wang, Shuhui; Lin, Zhouchen; Wu, Enhua

    2015-10-01

    The sparse coding technique has shown flexibility and capability in image representation and analysis. It is a powerful tool in many visual applications. Some recent work has shown that incorporating the properties of task (such as discrimination for classification task) into dictionary learning is effective for improving the accuracy. However, the traditional supervised dictionary learning methods suffer from high computation complexity when dealing with large number of categories, making them less satisfactory in large scale applications. In this paper, we propose a novel multi-level discriminative dictionary learning method and apply it to large scale image classification. Our method takes advantage of hierarchical category correlation to encode multi-level discriminative information. Each internal node of the category hierarchy is associated with a discriminative dictionary and a classification model. The dictionaries at different layers are learnt to capture the information of different scales. Moreover, each node at lower layers also inherits the dictionary of its parent, so that the categories at lower layers can be described with multi-scale information. The learning of dictionaries and associated classification models is jointly conducted by minimizing an overall tree loss. The experimental results on challenging data sets demonstrate that our approach achieves excellent accuracy and competitive computation cost compared with other sparse coding methods for large scale image classification.

  19. Large-scale retrieval for medical image analytics: A comprehensive review.

    Science.gov (United States)

    Li, Zhongyu; Zhang, Xiaofan; Müller, Henning; Zhang, Shaoting

    2018-01-01

    Over the past decades, medical image analytics was greatly facilitated by the explosion of digital imaging techniques, where huge amounts of medical images were produced with ever-increasing quality and diversity. However, conventional methods for analyzing medical images have achieved limited success, as they are not capable to tackle the huge amount of image data. In this paper, we review state-of-the-art approaches for large-scale medical image analysis, which are mainly based on recent advances in computer vision, machine learning and information retrieval. Specifically, we first present the general pipeline of large-scale retrieval, summarize the challenges/opportunities of medical image analytics on a large-scale. Then, we provide a comprehensive review of algorithms and techniques relevant to major processes in the pipeline, including feature representation, feature indexing, searching, etc. On the basis of existing work, we introduce the evaluation protocols and multiple applications of large-scale medical image retrieval, with a variety of exploratory and diagnostic scenarios. Finally, we discuss future directions of large-scale retrieval, which can further improve the performance of medical image analysis. Copyright © 2017 Elsevier B.V. All rights reserved.

  20. The Potential of the Cell Processor for Scientific Computing

    Energy Technology Data Exchange (ETDEWEB)

    Williams, Samuel; Shalf, John; Oliker, Leonid; Husbands, Parry; Kamil, Shoaib; Yelick, Katherine

    2005-10-14

    The slowing pace of commodity microprocessor performance improvements combined with ever-increasing chip power demands has become of utmost concern to computational scientists. As a result, the high performance computing community is examining alternative architectures that address the limitations of modern cache-based designs. In this work, we examine the potential of the using the forth coming STI Cell processor as a building block for future high-end computing systems. Our work contains several novel contributions. We are the first to present quantitative Cell performance data on scientific kernels and show direct comparisons against leading superscalar (AMD Opteron), VLIW (IntelItanium2), and vector (Cray X1) architectures. Since neither Cell hardware nor cycle-accurate simulators are currently publicly available, we develop both analytical models and simulators to predict kernel performance. Our work also explores the complexity of mapping several important scientific algorithms onto the Cells unique architecture. Additionally, we propose modest microarchitectural modifications that could significantly increase the efficiency of double-precision calculations. Overall results demonstrate the tremendous potential of the Cell architecture for scientific computations in terms of both raw performance and power efficiency.

  1. Reliability Lessons Learned From GPU Experience With The Titan Supercomputer at Oak Ridge Leadership Computing Facility

    Energy Technology Data Exchange (ETDEWEB)

    Gallarno, George [Christian Brothers University; Rogers, James H [ORNL; Maxwell, Don E [ORNL

    2015-01-01

    The high computational capability of graphics processing units (GPUs) is enabling and driving the scientific discovery process at large-scale. The world s second fastest supercomputer for open science, Titan, has more than 18,000 GPUs that computational scientists use to perform scientific simu- lations and data analysis. Understanding of GPU reliability characteristics, however, is still in its nascent stage since GPUs have only recently been deployed at large-scale. This paper presents a detailed study of GPU errors and their impact on system operations and applications, describing experiences with the 18,688 GPUs on the Titan supercom- puter as well as lessons learned in the process of efficient operation of GPUs at scale. These experiences are helpful to HPC sites which already have large-scale GPU clusters or plan to deploy GPUs in the future.

  2. The Y2K program for scientific-analysis computer programs at AECL

    International Nuclear Information System (INIS)

    Popovic, J.; Gaver, C.; Chapman, D.

    1999-01-01

    The evaluation of scientific-analysis computer programs for year-2000 compliance is part of AECL' s year-2000 (Y2K) initiative, which addresses both the infrastructure systems at AECL and AECL's products and services. This paper describes the Y2K-compliance program for scientific-analysis computer codes. This program involves the integrated evaluation of the computer hardware, middleware, and third-party software in addition to the scientific codes developed in-house. The project involves several steps: the assessment of the scientific computer programs for Y2K compliance, performing any required corrective actions, porting the programs to Y2K-compliant platforms, and verification of the programs after porting. Some programs or program versions, deemed no longer required in the year 2000 and beyond, will be retired and archived. (author)

  3. The Y2K program for scientific-analysis computer programs at AECL

    International Nuclear Information System (INIS)

    Popovic, J.; Gaver, C.; Chapman, D.

    1999-01-01

    The evaluation of scientific analysis computer programs for year-2000 compliance is part of AECL's year-2000 (Y2K) initiative, which addresses both the infrastructure systems at AECL and AECL's products and services. This paper describes the Y2K-compliance program for scientific-analysis computer codes. This program involves the integrated evaluation of the computer hardware, middleware, and third-party software in addition to the scientific codes developed in-house. The project involves several steps: the assessment of the scientific computer programs for Y2K compliance, performing any required corrective actions, porting the programs to Y2K-compliant platforms, and verification of the programs after porting. Some programs or program versions, deemed no longer required in the year 2000 and beyond, will be retired and archived. (author)

  4. Evolutionary Hierarchical Multi-Criteria Metaheuristics for Scheduling in Large-Scale Grid Systems

    CERN Document Server

    Kołodziej, Joanna

    2012-01-01

    One of the most challenging issues in modelling today's large-scale computational systems is to effectively manage highly parametrised distributed environments such as computational grids, clouds, ad hoc networks and P2P networks. Next-generation computational grids must provide a wide range of services and high performance computing infrastructures. Various types of information and data processed in the large-scale dynamic grid environment may be incomplete, imprecise, and fragmented, which complicates the specification of proper evaluation criteria and which affects both the availability of resources and the final collective decisions of users. The complexity of grid architectures and grid management may also contribute towards higher energy consumption. All of these issues necessitate the development of intelligent resource management techniques, which are capable of capturing all of this complexity and optimising meaningful metrics for a wide range of grid applications.   This book covers hot topics in t...

  5. Less is more: regularization perspectives on large scale machine learning

    CERN Multimedia

    CERN. Geneva

    2017-01-01

    Deep learning based techniques provide a possible solution at the expanse of theoretical guidance and, especially, of computational requirements. It is then a key challenge for large scale machine learning to devise approaches guaranteed to be accurate and yet computationally efficient. In this talk, we will consider a regularization perspectives on machine learning appealing to classical ideas in linear algebra and inverse problems to scale-up dramatically nonparametric methods such as kernel methods, often dismissed because of prohibitive costs. Our analysis derives optimal theoretical guarantees while providing experimental results at par or out-performing state of the art approaches.

  6. The 12-th INS scientific computational programs

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    1996-06-01

    This issue is the collection of the paper on INS scientific computational programs. Separate abstracts were presented for 3 of the papers in this report. The remaining 5 were considered outside the subject scope of INIS. (J.P.N.)

  7. Educational NASA Computational and Scientific Studies (enCOMPASS)

    Science.gov (United States)

    Memarsadeghi, Nargess

    2013-01-01

    Educational NASA Computational and Scientific Studies (enCOMPASS) is an educational project of NASA Goddard Space Flight Center aimed at bridging the gap between computational objectives and needs of NASA's scientific research, missions, and projects, and academia's latest advances in applied mathematics and computer science. enCOMPASS achieves this goal via bidirectional collaboration and communication between NASA and academia. Using developed NASA Computational Case Studies in university computer science/engineering and applied mathematics classes is a way of addressing NASA's goals of contributing to the Science, Technology, Education, and Math (STEM) National Objective. The enCOMPASS Web site at http://encompass.gsfc.nasa.gov provides additional information. There are currently nine enCOMPASS case studies developed in areas of earth sciences, planetary sciences, and astrophysics. Some of these case studies have been published in AIP and IEEE's Computing in Science and Engineering magazines. A few university professors have used enCOMPASS case studies in their computational classes and contributed their findings to NASA scientists. In these case studies, after introducing the science area, the specific problem, and related NASA missions, students are first asked to solve a known problem using NASA data and past approaches used and often published in a scientific/research paper. Then, after learning about the NASA application and related computational tools and approaches for solving the proposed problem, students are given a harder problem as a challenge for them to research and develop solutions for. This project provides a model for NASA scientists and engineers on one side, and university students, faculty, and researchers in computer science and applied mathematics on the other side, to learn from each other's areas of work, computational needs and solutions, and the latest advances in research and development. This innovation takes NASA science and

  8. The cognitive dynamics of computer science cost-effective large scale software development

    CERN Document Server

    De Gyurky, Szabolcs Michael; John Wiley & Sons

    2006-01-01

    This book has three major objectives: To propose an ontology for computer software; To provide a methodology for development of large software systems to cost and schedule that is based on the ontology; To offer an alternative vision regarding the development of truly autonomous systems.

  9. Scientific Computing and Apple's Intel Transition

    CERN Document Server

    CERN. Geneva

    2006-01-01

    Intel's published processor roadmap and how it may affect the future of personal and scientific computing About the speaker: Eric Albert is Senior Software Engineer in Apple's Core Technologies group. During Mac OS X's transition to Intel processors he has worked on almost every part of the operating system, from the OS kernel and compiler tools to appli...

  10. Advanced scientific computational methods and their applications of nuclear technologies. (1) Overview of scientific computational methods, introduction of continuum simulation methods and their applications (1)

    International Nuclear Information System (INIS)

    Oka, Yoshiaki; Okuda, Hiroshi

    2006-01-01

    Scientific computational methods have advanced remarkably with the progress of nuclear development. They have played the role of weft connecting each realm of nuclear engineering and then an introductory course of advanced scientific computational methods and their applications to nuclear technologies were prepared in serial form. This is the first issue showing their overview and introduction of continuum simulation methods. Finite element method as their applications is also reviewed. (T. Tanaka)

  11. Similitude and scaling of large structural elements: Case study

    Directory of Open Access Journals (Sweden)

    M. Shehadeh

    2015-06-01

    Full Text Available Scaled down models are widely used for experimental investigations of large structures due to the limitation in the capacities of testing facilities along with the expenses of the experimentation. The modeling accuracy depends upon the model material properties, fabrication accuracy and loading techniques. In the present work the Buckingham π theorem is used to develop the relations (i.e. geometry, loading and properties between the model and a large structural element as that is present in the huge existing petroleum oil drilling rigs. The model is to be designed, loaded and treated according to a set of similitude requirements that relate the model to the large structural element. Three independent scale factors which represent three fundamental dimensions, namely mass, length and time need to be selected for designing the scaled down model. Numerical prediction of the stress distribution within the model and its elastic deformation under steady loading is to be made. The results are compared with those obtained from the full scale structure numerical computations. The effect of scaled down model size and material on the accuracy of the modeling technique is thoroughly examined.

  12. A novel artificial fish swarm algorithm for solving large-scale reliability-redundancy application problem.

    Science.gov (United States)

    He, Qiang; Hu, Xiangtao; Ren, Hong; Zhang, Hongqi

    2015-11-01

    A novel artificial fish swarm algorithm (NAFSA) is proposed for solving large-scale reliability-redundancy allocation problem (RAP). In NAFSA, the social behaviors of fish swarm are classified in three ways: foraging behavior, reproductive behavior, and random behavior. The foraging behavior designs two position-updating strategies. And, the selection and crossover operators are applied to define the reproductive ability of an artificial fish. For the random behavior, which is essentially a mutation strategy, the basic cloud generator is used as the mutation operator. Finally, numerical results of four benchmark problems and a large-scale RAP are reported and compared. NAFSA shows good performance in terms of computational accuracy and computational efficiency for large scale RAP. Copyright © 2015 ISA. Published by Elsevier Ltd. All rights reserved.

  13. Scientific Computing Kernels on the Cell Processor

    Energy Technology Data Exchange (ETDEWEB)

    Williams, Samuel W.; Shalf, John; Oliker, Leonid; Kamil, Shoaib; Husbands, Parry; Yelick, Katherine

    2007-04-04

    The slowing pace of commodity microprocessor performance improvements combined with ever-increasing chip power demands has become of utmost concern to computational scientists. As a result, the high performance computing community is examining alternative architectures that address the limitations of modern cache-based designs. In this work, we examine the potential of using the recently-released STI Cell processor as a building block for future high-end computing systems. Our work contains several novel contributions. First, we introduce a performance model for Cell and apply it to several key scientific computing kernels: dense matrix multiply, sparse matrix vector multiply, stencil computations, and 1D/2D FFTs. The difficulty of programming Cell, which requires assembly level intrinsics for the best performance, makes this model useful as an initial step in algorithm design and evaluation. Next, we validate the accuracy of our model by comparing results against published hardware results, as well as our own implementations on a 3.2GHz Cell blade. Additionally, we compare Cell performance to benchmarks run on leading superscalar (AMD Opteron), VLIW (Intel Itanium2), and vector (Cray X1E) architectures. Our work also explores several different mappings of the kernels and demonstrates a simple and effective programming model for Cell's unique architecture. Finally, we propose modest microarchitectural modifications that could significantly increase the efficiency of double-precision calculations. Overall results demonstrate the tremendous potential of the Cell architecture for scientific computations in terms of both raw performance and power efficiency.

  14. On the use of Cloud Computing and Machine Learning for Large-Scale SAR Science Data Processing and Quality Assessment Analysi

    Science.gov (United States)

    Hua, H.

    2016-12-01

    Geodetic imaging is revolutionizing geophysics, but the scope of discovery has been limited by labor-intensive technological implementation of the analyses. The Advanced Rapid Imaging and Analysis (ARIA) project has proven capability to automate SAR data processing and analysis. Existing and upcoming SAR missions such as Sentinel-1A/B and NISAR are also expected to generate massive amounts of SAR data. This has brought to the forefront the need for analytical tools for SAR quality assessment (QA) on the large volumes of SAR data-a critical step before higher-level time series and velocity products can be reliably generated. Initially leveraging an advanced hybrid-cloud computing science data system for performing large-scale processing, machine learning approaches were augmented for automated analysis of various quality metrics. Machine learning-based user-training of features, cross-validation, prediction models were integrated into our cloud-based science data processing flow to enable large-scale and high-throughput QA analytics for enabling improvements to the production quality of geodetic data products.

  15. Massive Cloud Computing Processing of P-SBAS Time Series for Displacement Analyses at Large Spatial Scale

    Science.gov (United States)

    Casu, F.; de Luca, C.; Lanari, R.; Manunta, M.; Zinno, I.

    2016-12-01

    A methodology for computing surface deformation time series and mean velocity maps of large areas is presented. Our approach relies on the availability of a multi-temporal set of Synthetic Aperture Radar (SAR) data collected from ascending and descending orbits over an area of interest, and also permits to estimate the vertical and horizontal (East-West) displacement components of the Earth's surface. The adopted methodology is based on an advanced Cloud Computing implementation of the Differential SAR Interferometry (DInSAR) Parallel Small Baseline Subset (P-SBAS) processing chain which allows the unsupervised processing of large SAR data volumes, from the raw data (level-0) imagery up to the generation of DInSAR time series and maps. The presented solution, which is highly scalable, has been tested on the ascending and descending ENVISAT SAR archives, which have been acquired over a large area of Southern California (US) that extends for about 90.000 km2. Such an input dataset has been processed in parallel by exploiting 280 computing nodes of the Amazon Web Services Cloud environment. Moreover, to produce the final mean deformation velocity maps of the vertical and East-West displacement components of the whole investigated area, we took also advantage of the information available from external GPS measurements that permit to account for possible regional trends not easily detectable by DInSAR and to refer the P-SBAS measurements to an external geodetic datum. The presented results clearly demonstrate the effectiveness of the proposed approach that paves the way to the extensive use of the available ERS and ENVISAT SAR data archives. Furthermore, the proposed methodology can be particularly suitable to deal with the very huge data flow provided by the Sentinel-1 constellation, thus permitting to extend the DInSAR analyses at a nearly global scale. This work is partially supported by: the DPC-CNR agreement, the EPOS-IP project and the ESA GEP project.

  16. Auto-Scaling of Geo-Based Image Processing in an OpenStack Cloud Computing Environment

    OpenAIRE

    Sanggoo Kang; Kiwon Lee

    2016-01-01

    Cloud computing is a base platform for the distribution of large volumes of data and high-performance image processing on the Web. Despite wide applications in Web-based services and their many benefits, geo-spatial applications based on cloud computing technology are still developing. Auto-scaling realizes automatic scalability, i.e., the scale-out and scale-in processing of virtual servers in a cloud computing environment. This study investigates the applicability of auto-scaling to geo-bas...

  17. Towards an integrated multiscale simulation of turbulent clouds on PetaScale computers

    International Nuclear Information System (INIS)

    Wang Lianping; Ayala, Orlando; Parishani, Hossein; Gao, Guang R; Kambhamettu, Chandra; Li Xiaoming; Rossi, Louis; Orozco, Daniel; Torres, Claudio; Grabowski, Wojciech W; Wyszogrodzki, Andrzej A; Piotrowski, Zbigniew

    2011-01-01

    The development of precipitating warm clouds is affected by several effects of small-scale air turbulence including enhancement of droplet-droplet collision rate by turbulence, entrainment and mixing at the cloud edges, and coupling of mechanical and thermal energies at various scales. Large-scale computation is a viable research tool for quantifying these multiscale processes. Specifically, top-down large-eddy simulations (LES) of shallow convective clouds typically resolve scales of turbulent energy-containing eddies while the effects of turbulent cascade toward viscous dissipation are parameterized. Bottom-up hybrid direct numerical simulations (HDNS) of cloud microphysical processes resolve fully the dissipation-range flow scales but only partially the inertial subrange scales. it is desirable to systematically decrease the grid length in LES and increase the domain size in HDNS so that they can be better integrated to address the full range of scales and their coupling. In this paper, we discuss computational issues and physical modeling questions in expanding the ranges of scales realizable in LES and HDNS, and in bridging LES and HDNS. We review our on-going efforts in transforming our simulation codes towards PetaScale computing, in improving physical representations in LES and HDNS, and in developing better methods to analyze and interpret the simulation results.

  18. Large-scale road safety programmes in low- and middle-income countries: an opportunity to generate evidence.

    Science.gov (United States)

    Hyder, Adnan A; Allen, Katharine A; Peters, David H; Chandran, Aruna; Bishai, David

    2013-01-01

    The growing burden of road traffic injuries, which kill over 1.2 million people yearly, falls mostly on low- and middle-income countries (LMICs). Despite this, evidence generation on the effectiveness of road safety interventions in LMIC settings remains scarce. This paper explores a scientific approach for evaluating road safety programmes in LMICs and introduces such a road safety multi-country initiative, the Road Safety in 10 Countries Project (RS-10). By building on existing evaluation frameworks, we develop a scientific approach for evaluating large-scale road safety programmes in LMIC settings. This also draws on '13 lessons' of large-scale programme evaluation: defining the evaluation scope; selecting study sites; maintaining objectivity; developing an impact model; utilising multiple data sources; using multiple analytic techniques; maximising external validity; ensuring an appropriate time frame; the importance of flexibility and a stepwise approach; continuous monitoring; providing feedback to implementers, policy-makers; promoting the uptake of evaluation results; and understanding evaluation costs. The use of relatively new approaches for evaluation of real-world programmes allows for the production of relevant knowledge. The RS-10 project affords an important opportunity to scientifically test these approaches for a real-world, large-scale road safety evaluation and generate new knowledge for the field of road safety.

  19. Large scale particle image velocimetry with helium filled soap bubbles

    Energy Technology Data Exchange (ETDEWEB)

    Bosbach, Johannes; Kuehn, Matthias; Wagner, Claus [German Aerospace Center (DLR), Institute of Aerodynamics and Flow Technology, Goettingen (Germany)

    2009-03-15

    The application of particle image velocimetry (PIV) to measurement of flows on large scales is a challenging necessity especially for the investigation of convective air flows. Combining helium filled soap bubbles as tracer particles with high power quality switched solid state lasers as light sources allows conducting PIV on scales of the order of several square meters. The technique was applied to mixed convection in a full scale double aisle aircraft cabin mock-up for validation of computational fluid dynamics simulations. (orig.)

  20. Large scale particle image velocimetry with helium filled soap bubbles

    Science.gov (United States)

    Bosbach, Johannes; Kühn, Matthias; Wagner, Claus

    2009-03-01

    The application of Particle Image Velocimetry (PIV) to measurement of flows on large scales is a challenging necessity especially for the investigation of convective air flows. Combining helium filled soap bubbles as tracer particles with high power quality switched solid state lasers as light sources allows conducting PIV on scales of the order of several square meters. The technique was applied to mixed convection in a full scale double aisle aircraft cabin mock-up for validation of Computational Fluid Dynamics simulations.

  1. Learning SciPy for numerical and scientific computing

    CERN Document Server

    Silva

    2013-01-01

    A step-by-step practical tutorial with plenty of examples on research-based problems from various areas of science, that prove how simple, yet effective, it is to provide solutions based on SciPy. This book is targeted at anyone with basic knowledge of Python, a somewhat advanced command of mathematics/physics, and an interest in engineering or scientific applications---this is broadly what we refer to as scientific computing.This book will be of critical importance to programmers and scientists who have basic Python knowledge and would like to be able to do scientific and numerical computatio

  2. The application of cloud computing to scientific workflows: a study of cost and performance.

    Science.gov (United States)

    Berriman, G Bruce; Deelman, Ewa; Juve, Gideon; Rynge, Mats; Vöckler, Jens-S

    2013-01-28

    The current model of transferring data from data centres to desktops for analysis will soon be rendered impractical by the accelerating growth in the volume of science datasets. Processing will instead often take place on high-performance servers co-located with data. Evaluations of how new technologies such as cloud computing would support such a new distributed computing model are urgently needed. Cloud computing is a new way of purchasing computing and storage resources on demand through virtualization technologies. We report here the results of investigations of the applicability of commercial cloud computing to scientific computing, with an emphasis on astronomy, including investigations of what types of applications can be run cheaply and efficiently on the cloud, and an example of an application well suited to the cloud: processing a large dataset to create a new science product.

  3. Methods of Scientific Research: Teaching Scientific Creativity at Scale

    Science.gov (United States)

    Robbins, Dennis; Ford, K. E. Saavik

    2016-01-01

    We present a scaling-up plan for AstroComNYC's Methods of Scientific Research (MSR), a course designed to improve undergraduate students' understanding of science practices. The course format and goals, notably the open-ended, hands-on, investigative nature of the curriculum are reviewed. We discuss how the course's interactive pedagogical techniques empower students to learn creativity within the context of experimental design and control of variables thinking. To date the course has been offered to a limited numbers of students in specific programs. The goals of broadly implementing MSR is to reach more students and early in their education—with the specific purpose of supporting and improving retention of students pursuing STEM careers. However, we also discuss challenges in preserving the effectiveness of the teaching and learning experience at scale.

  4. Scientific Computing in the CH Programming Language

    Directory of Open Access Journals (Sweden)

    Harry H. Cheng

    1993-01-01

    Full Text Available We have developed a general-purpose block-structured interpretive programming Ianguage. The syntax and semantics of this language called CH are similar to C. CH retains most features of C from the scientific computing point of view. In this paper, the extension of C to CH for numerical computation of real numbers will be described. Metanumbers of −0.0, 0.0, Inf, −Inf, and NaN are introduced in CH. Through these metanumbers, the power of the IEEE 754 arithmetic standard is easily available to the programmer. These metanumbers are extended to commonly used mathematical functions in the spirit of the IEEE 754 standard and ANSI C. The definitions for manipulation of these metanumbers in I/O; arithmetic, relational, and logic operations; and built-in polymorphic mathematical functions are defined. The capabilities of bitwise, assignment, address and indirection, increment and decrement, as well as type conversion operations in ANSI C are extended in CH. In this paper, mainly new linguistic features of CH in comparison to C will be described. Example programs programmed in CH with metanumbers and polymorphic mathematical functions will demonstrate capabilities of CH in scientific computing.

  5. Large Scale GW Calculations on the Cori System

    Science.gov (United States)

    Deslippe, Jack; Del Ben, Mauro; da Jornada, Felipe; Canning, Andrew; Louie, Steven

    The NERSC Cori system, powered by 9000+ Intel Xeon-Phi processors, represents one of the largest HPC systems for open-science in the United States and the world. We discuss the optimization of the GW methodology for this system, including both node level and system-scale optimizations. We highlight multiple large scale (thousands of atoms) case studies and discuss both absolute application performance and comparison to calculations on more traditional HPC architectures. We find that the GW method is particularly well suited for many-core architectures due to the ability to exploit a large amount of parallelism across many layers of the system. This work was supported by the U.S. Department of Energy, Office of Science, Basic Energy Sciences, Materials Sciences and Engineering Division, as part of the Computational Materials Sciences Program.

  6. High Performance Computing and Storage Requirements for Biological and Environmental Research Target 2017

    Energy Technology Data Exchange (ETDEWEB)

    Gerber, Richard [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). National Energy Research Scientific Computing Center (NERSC); Wasserman, Harvey [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). National Energy Research Scientific Computing Center (NERSC)

    2013-05-01

    The National Energy Research Scientific Computing Center (NERSC) is the primary computing center for the DOE Office of Science, serving approximately 4,500 users working on some 650 projects that involve nearly 600 codes in a wide variety of scientific disciplines. In addition to large-­scale computing and storage resources NERSC provides support and expertise that help scientists make efficient use of its systems. The latest review revealed several key requirements, in addition to achieving its goal of characterizing BER computing and storage needs.

  7. Recent developments in large-scale ozone generation with dielectric barrier discharges

    Science.gov (United States)

    Lopez, Jose L.

    2014-10-01

    Large-scale ozone generation for industrial applications has been entirely based on the creation of microplasmas or microdischarges created using dielectric barrier discharge (DBD) reactors. Although versions of DBD generated ozone have been in continuous use for over a hundred years especially in water treatment, recent changes in environmental awareness and sustainability have lead to a surge of ozone generating facilities throughout the world. As a result of this enhanced global usage of this environmental cleaning application various new discoveries have emerged in the science and technology of ozone generation. This presentation will describe some of the most recent breakthrough developments in large-scale ozone generation while further addressing some of the current scientific and engineering challenges of this technology.

  8. Development of Best Practices for Large-scale Data Management Infrastructure

    NARCIS (Netherlands)

    S. Stadtmüller; H.F. Mühleisen (Hannes); C. Bizer; M.L. Kersten (Martin); J.A. de Rijke (Arjen); F.E. Groffen (Fabian); Y. Zhang (Ying); G. Ladwig; A. Harth; M Trampus

    2012-01-01

    htmlabstractThe amount of available data for processing is constantly increasing and becomes more diverse. We collect our experiences on deploying large-scale data management tools on local-area clusters or cloud infrastructures and provide guidance to use these computing and storage

  9. Web-based visualization of very large scientific astronomy imagery

    Science.gov (United States)

    Bertin, E.; Pillay, R.; Marmo, C.

    2015-04-01

    Visualizing and navigating through large astronomy images from a remote location with current astronomy display tools can be a frustrating experience in terms of speed and ergonomics, especially on mobile devices. In this paper, we present a high performance, versatile and robust client-server system for remote visualization and analysis of extremely large scientific images. Applications of this work include survey image quality control, interactive data query and exploration, citizen science, as well as public outreach. The proposed software is entirely open source and is designed to be generic and applicable to a variety of datasets. It provides access to floating point data at terabyte scales, with the ability to precisely adjust image settings in real-time. The proposed clients are light-weight, platform-independent web applications built on standard HTML5 web technologies and compatible with both touch and mouse-based devices. We put the system to the test and assess the performance of the system and show that a single server can comfortably handle more than a hundred simultaneous users accessing full precision 32 bit astronomy data.

  10. Energy transfers in large-scale and small-scale dynamos

    Science.gov (United States)

    Samtaney, Ravi; Kumar, Rohit; Verma, Mahendra

    2015-11-01

    We present the energy transfers, mainly energy fluxes and shell-to-shell energy transfers in small-scale dynamo (SSD) and large-scale dynamo (LSD) using numerical simulations of MHD turbulence for Pm = 20 (SSD) and for Pm = 0.2 on 10243 grid. For SSD, we demonstrate that the magnetic energy growth is caused by nonlocal energy transfers from the large-scale or forcing-scale velocity field to small-scale magnetic field. The peak of these energy transfers move towards lower wavenumbers as dynamo evolves, which is the reason for the growth of the magnetic fields at the large scales. The energy transfers U2U (velocity to velocity) and B2B (magnetic to magnetic) are forward and local. For LSD, we show that the magnetic energy growth takes place via energy transfers from large-scale velocity field to large-scale magnetic field. We observe forward U2U and B2B energy flux, similar to SSD.

  11. Research initiatives for plug-and-play scientific computing

    International Nuclear Information System (INIS)

    McInnes, Lois Curfman; Dahlgren, Tamara; Nieplocha, Jarek; Bernholdt, David; Allan, Ben; Armstrong, Rob; Chavarria, Daniel; Elwasif, Wael; Gorton, Ian; Kenny, Joe; Krishan, Manoj; Malony, Allen; Norris, Boyana; Ray, Jaideep; Shende, Sameer

    2007-01-01

    This paper introduces three component technology initiatives within the SciDAC Center for Technology for Advanced Scientific Component Software (TASCS) that address ever-increasing productivity challenges in creating, managing, and applying simulation software to scientific discovery. By leveraging the Common Component Architecture (CCA), a new component standard for high-performance scientific computing, these initiatives tackle difficulties at different but related levels in the development of component-based scientific software: (1) deploying applications on massively parallel and heterogeneous architectures, (2) investigating new approaches to the runtime enforcement of behavioral semantics, and (3) developing tools to facilitate dynamic composition, substitution, and reconfiguration of component implementations and parameters, so that application scientists can explore tradeoffs among factors such as accuracy, reliability, and performance

  12. Large-scale simulations of plastic neural networks on neuromorphic hardware

    Directory of Open Access Journals (Sweden)

    James Courtney Knight

    2016-04-01

    Full Text Available SpiNNaker is a digital, neuromorphic architecture designed for simulating large-scale spiking neural networks at speeds close to biological real-time. Rather than using bespoke analog or digital hardware, the basic computational unit of a SpiNNaker system is a general-purpose ARM processor, allowing it to be programmed to simulate a wide variety of neuron and synapse models. This flexibility is particularly valuable in the study of biological plasticity phenomena. A recently proposed learning rule based on the Bayesian Confidence Propagation Neural Network (BCPNN paradigm offers a generic framework for modeling the interaction of different plasticity mechanisms using spiking neurons. However, it can be computationally expensive to simulate large networks with BCPNN learning since it requires multiple state variables for each synapse, each of which needs to be updated every simulation time-step. We discuss the trade-offs in efficiency and accuracy involved in developing an event-based BCPNN implementation for SpiNNaker based on an analytical solution to the BCPNN equations, and detail the steps taken to fit this within the limited computational and memory resources of the SpiNNaker architecture. We demonstrate this learning rule by learning temporal sequences of neural activity within a recurrent attractor network which we simulate at scales of up to 20000 neurons and 51200000 plastic synapses: the largest plastic neural network ever to be simulated on neuromorphic hardware. We also run a comparable simulation on a Cray XC-30 supercomputer system and find that, if it is to match the run-time of our SpiNNaker simulation, the super computer system uses approximately more power. This suggests that cheaper, more power efficient neuromorphic systems are becoming useful discovery tools in the study of plasticity in large-scale brain models.

  13. Large scale CMB anomalies from thawing cosmic strings

    Energy Technology Data Exchange (ETDEWEB)

    Ringeval, Christophe [Centre for Cosmology, Particle Physics and Phenomenology, Institute of Mathematics and Physics, Louvain University, 2 Chemin du Cyclotron, 1348 Louvain-la-Neuve (Belgium); Yamauchi, Daisuke; Yokoyama, Jun' ichi [Research Center for the Early Universe (RESCEU), Graduate School of Science, The University of Tokyo, Tokyo 113-0033 (Japan); Bouchet, François R., E-mail: christophe.ringeval@uclouvain.be, E-mail: yamauchi@resceu.s.u-tokyo.ac.jp, E-mail: yokoyama@resceu.s.u-tokyo.ac.jp, E-mail: bouchet@iap.fr [Institut d' Astrophysique de Paris, UMR 7095-CNRS, Université Pierre et Marie Curie, 98bis boulevard Arago, 75014 Paris (France)

    2016-02-01

    Cosmic strings formed during inflation are expected to be either diluted over super-Hubble distances, i.e., invisible today, or to have crossed our past light cone very recently. We discuss the latter situation in which a few strings imprint their signature in the Cosmic Microwave Background (CMB) Anisotropies after recombination. Being almost frozen in the Hubble flow, these strings are quasi static and evade almost all of the previously derived constraints on their tension while being able to source large scale anisotropies in the CMB sky. Using a local variance estimator on thousand of numerically simulated Nambu-Goto all sky maps, we compute the expected signal and show that it can mimic a dipole modulation at large angular scales while being negligible at small angles. Interestingly, such a scenario generically produces one cold spot from the thawing of a cosmic string loop. Mixed with anisotropies of inflationary origin, we find that a few strings of tension GU = O(1) × 10{sup −6} match the amplitude of the dipole modulation reported in the Planck satellite measurements and could be at the origin of other large scale anomalies.

  14. Performance evaluation of scientific programs on advanced architecture computers

    International Nuclear Information System (INIS)

    Walker, D.W.; Messina, P.; Baille, C.F.

    1988-01-01

    Recently a number of advanced architecture machines have become commercially available. These new machines promise better cost-performance then traditional computers, and some of them have the potential of competing with current supercomputers, such as the Cray X/MP, in terms of maximum performance. This paper describes an on-going project to evaluate a broad range of advanced architecture computers using a number of complete scientific application programs. The computers to be evaluated include distributed- memory machines such as the NCUBE, INTEL and Caltech/JPL hypercubes, and the MEIKO computing surface, shared-memory, bus architecture machines such as the Sequent Balance and the Alliant, very long instruction word machines such as the Multiflow Trace 7/200 computer, traditional supercomputers such as the Cray X.MP and Cray-2, and SIMD machines such as the Connection Machine. Currently 11 application codes from a number of scientific disciplines have been selected, although it is not intended to run all codes on all machines. Results are presented for two of the codes (QCD and missile tracking), and future work is proposed

  15. Exploiting multi-scale parallelism for large scale numerical modelling of laser wakefield accelerators

    International Nuclear Information System (INIS)

    Fonseca, R A; Vieira, J; Silva, L O; Fiuza, F; Davidson, A; Tsung, F S; Mori, W B

    2013-01-01

    A new generation of laser wakefield accelerators (LWFA), supported by the extreme accelerating fields generated in the interaction of PW-Class lasers and underdense targets, promises the production of high quality electron beams in short distances for multiple applications. Achieving this goal will rely heavily on numerical modelling to further understand the underlying physics and identify optimal regimes, but large scale modelling of these scenarios is computationally heavy and requires the efficient use of state-of-the-art petascale supercomputing systems. We discuss the main difficulties involved in running these simulations and the new developments implemented in the OSIRIS framework to address these issues, ranging from multi-dimensional dynamic load balancing and hybrid distributed/shared memory parallelism to the vectorization of the PIC algorithm. We present the results of the OASCR Joule Metric program on the issue of large scale modelling of LWFA, demonstrating speedups of over 1 order of magnitude on the same hardware. Finally, scalability to over ∼10 6 cores and sustained performance over ∼2 P Flops is demonstrated, opening the way for large scale modelling of LWFA scenarios. (paper)

  16. Large-Scale, Parallel, Multi-Sensor Data Fusion in the Cloud

    Science.gov (United States)

    Wilson, B. D.; Manipon, G.; Hua, H.

    2012-12-01

    NASA's Earth Observing System (EOS) is an ambitious facility for studying global climate change. The mandate now is to combine measurements from the instruments on the "A-Train" platforms (AIRS, AMSR-E, MODIS, MISR, MLS, and CloudSat) and other Earth probes to enable large-scale studies of climate change over periods of years to decades. However, moving from predominantly single-instrument studies to a multi-sensor, measurement-based model for long-duration analysis of important climate variables presents serious challenges for large-scale data mining and data fusion. For example, one might want to compare temperature and water vapor retrievals from one instrument (AIRS) to another instrument (MODIS), and to a model (ECMWF), stratify the comparisons using a classification of the "cloud scenes" from CloudSat, and repeat the entire analysis over years of AIRS data. To perform such an analysis, one must discover & access multiple datasets from remote sites, find the space/time "matchups" between instruments swaths and model grids, understand the quality flags and uncertainties for retrieved physical variables, assemble merged datasets, and compute fused products for further scientific and statistical analysis. To efficiently assemble such decade-scale datasets in a timely manner, we are utilizing Elastic Computing in the Cloud and parallel map/reduce-based algorithms. "SciReduce" is a Hadoop-like parallel analysis system, programmed in parallel python, that is designed from the ground up for Earth science. SciReduce executes inside VMWare images and scales to any number of nodes in the Cloud. Unlike Hadoop, in which simple tuples (keys & values) are passed between the map and reduce functions, SciReduce operates on bundles of named numeric arrays, which can be passed in memory or serialized to disk in netCDF4 or HDF5. Thus, SciReduce uses the native datatypes (geolocated grids, swaths, and points) that geo-scientists are familiar with. We are deploying within Sci

  17. Algebraic mesh generation for large scale viscous-compressible aerodynamic simulation

    International Nuclear Information System (INIS)

    Smith, R.E.

    1984-01-01

    Viscous-compressible aerodynamic simulation is the numerical solution of the compressible Navier-Stokes equations and associated boundary conditions. Boundary-fitted coordinate systems are well suited for the application of finite difference techniques to the Navier-Stokes equations. An algebraic approach to boundary-fitted coordinate systems is one where an explicit functional relation describes a mesh on which a solution is obtained. This approach has the advantage of rapid-precise mesh control. The basic mathematical structure of three algebraic mesh generation techniques is described. They are transfinite interpolation, the multi-surface method, and the two-boundary technique. The Navier-Stokes equations are transformed to a computational coordinate system where boundary-fitted coordinates can be applied. Large-scale computation implies that there is a large number of mesh points in the coordinate system. Computation of viscous compressible flow using boundary-fitted coordinate systems and the application of this computational philosophy on a vector computer are presented

  18. GISpark: A Geospatial Distributed Computing Platform for Spatiotemporal Big Data

    Science.gov (United States)

    Wang, S.; Zhong, E.; Wang, E.; Zhong, Y.; Cai, W.; Li, S.; Gao, S.

    2016-12-01

    Geospatial data are growing exponentially because of the proliferation of cost effective and ubiquitous positioning technologies such as global remote-sensing satellites and location-based devices. Analyzing large amounts of geospatial data can provide great value for both industrial and scientific applications. Data- and compute- intensive characteristics inherent in geospatial big data increasingly pose great challenges to technologies of data storing, computing and analyzing. Such challenges require a scalable and efficient architecture that can store, query, analyze, and visualize large-scale spatiotemporal data. Therefore, we developed GISpark - a geospatial distributed computing platform for processing large-scale vector, raster and stream data. GISpark is constructed based on the latest virtualized computing infrastructures and distributed computing architecture. OpenStack and Docker are used to build multi-user hosting cloud computing infrastructure for GISpark. The virtual storage systems such as HDFS, Ceph, MongoDB are combined and adopted for spatiotemporal data storage management. Spark-based algorithm framework is developed for efficient parallel computing. Within this framework, SuperMap GIScript and various open-source GIS libraries can be integrated into GISpark. GISpark can also integrated with scientific computing environment (e.g., Anaconda), interactive computing web applications (e.g., Jupyter notebook), and machine learning tools (e.g., TensorFlow/Orange). The associated geospatial facilities of GISpark in conjunction with the scientific computing environment, exploratory spatial data analysis tools, temporal data management and analysis systems make up a powerful geospatial computing tool. GISpark not only provides spatiotemporal big data processing capacity in the geospatial field, but also provides spatiotemporal computational model and advanced geospatial visualization tools that deals with other domains related with spatial property. We

  19. Results of research and development in large-scale research centers as an innovation source for firms

    International Nuclear Information System (INIS)

    Theenhaus, R.

    1978-01-01

    The twelve large-scale research centres of the Federal Republic of Germany with their 16,000 employees represent a considerable scientific and technical potential. Cooperation with industry with regard to large-scale projects has already become very close and the know-how flow as well as the contributions to innovation connected therewith are largely established. The first successful steps to utilizing the results of basic research, of spin off and those within the frame of research and development as well as the fulfilling of services are encouraging. However, there is a number of detail problems which can only be solved between all parties concerned, in particular between industry and all large-scale research centres. (orig./RW) [de

  20. Large-scale automated image analysis for computational profiling of brain tissue surrounding implanted neuroprosthetic devices using Python.

    Science.gov (United States)

    Rey-Villamizar, Nicolas; Somasundar, Vinay; Megjhani, Murad; Xu, Yan; Lu, Yanbin; Padmanabhan, Raghav; Trett, Kristen; Shain, William; Roysam, Badri

    2014-01-01

    In this article, we describe the use of Python for large-scale automated server-based bio-image analysis in FARSIGHT, a free and open-source toolkit of image analysis methods for quantitative studies of complex and dynamic tissue microenvironments imaged by modern optical microscopes, including confocal, multi-spectral, multi-photon, and time-lapse systems. The core FARSIGHT modules for image segmentation, feature extraction, tracking, and machine learning are written in C++, leveraging widely used libraries including ITK, VTK, Boost, and Qt. For solving complex image analysis tasks, these modules must be combined into scripts using Python. As a concrete example, we consider the problem of analyzing 3-D multi-spectral images of brain tissue surrounding implanted neuroprosthetic devices, acquired using high-throughput multi-spectral spinning disk step-and-repeat confocal microscopy. The resulting images typically contain 5 fluorescent channels. Each channel consists of 6000 × 10,000 × 500 voxels with 16 bits/voxel, implying image sizes exceeding 250 GB. These images must be mosaicked, pre-processed to overcome imaging artifacts, and segmented to enable cellular-scale feature extraction. The features are used to identify cell types, and perform large-scale analysis for identifying spatial distributions of specific cell types relative to the device. Python was used to build a server-based script (Dell 910 PowerEdge servers with 4 sockets/server with 10 cores each, 2 threads per core and 1TB of RAM running on Red Hat Enterprise Linux linked to a RAID 5 SAN) capable of routinely handling image datasets at this scale and performing all these processing steps in a collaborative multi-user multi-platform environment. Our Python script enables efficient data storage and movement between computers and storage servers, logs all the processing steps, and performs full multi-threaded execution of all codes, including open and closed-source third party libraries.

  1. Large-scale automated image analysis for computational profiling of brain tissue surrounding implanted neuroprosthetic devices using Python

    Directory of Open Access Journals (Sweden)

    Nicolas eRey-Villamizar

    2014-04-01

    Full Text Available In this article, we describe use of Python for large-scale automated server-based bio-image analysis in FARSIGHT, a free and open-source toolkit of image analysis methods for quantitative studies of complex and dynamic tissue microenvironments imaged by modern optical microscopes including confocal, multi-spectral, multi-photon, and time-lapse systems. The core FARSIGHT modules for image segmentation, feature extraction, tracking, and machine learning are written in C++, leveraging widely used libraries including ITK, VTK, Boost, and Qt. For solving complex image analysis task, these modules must be combined into scripts using Python. As a concrete example, we consider the problem of analyzing 3-D multi-spectral brain tissue images surrounding implanted neuroprosthetic devices, acquired using high-throughput multi-spectral spinning disk step-and-repeat confocal microscopy. The resulting images typically contain 5 fluorescent channels, 6,000$times$10,000$times$500 voxels with 16 bits/voxel, implying image sizes exceeding 250GB. These images must be mosaicked, pre-processed to overcome imaging artifacts, and segmented to enable cellular-scale feature extraction. The features are used to identify cell types, and perform large-scale analytics for identifying spatial distributions of specific cell types relative to the device. Python was used to build a server-based script (Dell 910 PowerEdge servers with 4 sockets/server with 10 cores each, 2 threads per core and 1TB of RAM running on Red Hat Enterprise Linux linked to a RAID 5 SAN capable of routinely handling image datasets at this scale and performing all these processing steps in a collaborative multi-user multi-platform environment consisting. Our Python script enables efficient data storage and movement between compute and storage servers, logging all processing steps, and performs full multi-threaded execution of all codes, including open and closed-source third party libraries.

  2. Reducing computational costs in large scale 3D EIT by using a sparse Jacobian matrix with block-wise CGLS reconstruction

    International Nuclear Information System (INIS)

    Yang, C L; Wei, H Y; Soleimani, M; Adler, A

    2013-01-01

    Electrical impedance tomography (EIT) is a fast and cost-effective technique to provide a tomographic conductivity image of a subject from boundary current–voltage data. This paper proposes a time and memory efficient method for solving a large scale 3D EIT inverse problem using a parallel conjugate gradient (CG) algorithm. The 3D EIT system with a large number of measurement data can produce a large size of Jacobian matrix; this could cause difficulties in computer storage and the inversion process. One of challenges in 3D EIT is to decrease the reconstruction time and memory usage, at the same time retaining the image quality. Firstly, a sparse matrix reduction technique is proposed using thresholding to set very small values of the Jacobian matrix to zero. By adjusting the Jacobian matrix into a sparse format, the element with zeros would be eliminated, which results in a saving of memory requirement. Secondly, a block-wise CG method for parallel reconstruction has been developed. The proposed method has been tested using simulated data as well as experimental test samples. Sparse Jacobian with a block-wise CG enables the large scale EIT problem to be solved efficiently. Image quality measures are presented to quantify the effect of sparse matrix reduction in reconstruction results. (paper)

  3. Reducing computational costs in large scale 3D EIT by using a sparse Jacobian matrix with block-wise CGLS reconstruction.

    Science.gov (United States)

    Yang, C L; Wei, H Y; Adler, A; Soleimani, M

    2013-06-01

    Electrical impedance tomography (EIT) is a fast and cost-effective technique to provide a tomographic conductivity image of a subject from boundary current-voltage data. This paper proposes a time and memory efficient method for solving a large scale 3D EIT inverse problem using a parallel conjugate gradient (CG) algorithm. The 3D EIT system with a large number of measurement data can produce a large size of Jacobian matrix; this could cause difficulties in computer storage and the inversion process. One of challenges in 3D EIT is to decrease the reconstruction time and memory usage, at the same time retaining the image quality. Firstly, a sparse matrix reduction technique is proposed using thresholding to set very small values of the Jacobian matrix to zero. By adjusting the Jacobian matrix into a sparse format, the element with zeros would be eliminated, which results in a saving of memory requirement. Secondly, a block-wise CG method for parallel reconstruction has been developed. The proposed method has been tested using simulated data as well as experimental test samples. Sparse Jacobian with a block-wise CG enables the large scale EIT problem to be solved efficiently. Image quality measures are presented to quantify the effect of sparse matrix reduction in reconstruction results.

  4. GPU-Accelerated Sparse Matrix Solvers for Large-Scale Simulations, Phase II

    Data.gov (United States)

    National Aeronautics and Space Administration — At the heart of scientific computing and numerical analysis are linear algebra solvers. In scientific computing, the focus is on the partial differential equations...

  5. N286.7-99, A Canadian standard specifying software quality management system requirements for analytical, scientific, and design computer programs and its implementation at AECL

    International Nuclear Information System (INIS)

    Abel, R.

    2000-01-01

    Analytical, scientific, and design computer programs (referred to in this paper as 'scientific computer programs') are developed for use in a large number of ways by the user-engineer to support and prove engineering calculations and assumptions. These computer programs are subject to frequent modifications inherent in their application and are often used for critical calculations and analysis relative to safety and functionality of equipment and systems. N286.7-99(4) was developed to establish appropriate quality management system requirements to deal with the development, modification, and application of scientific computer programs. N286.7-99 provides particular guidance regarding the treatment of legacy codes

  6. Large scale solar district heating. Evaluation, modelling and designing - Appendices

    Energy Technology Data Exchange (ETDEWEB)

    Heller, A.

    2000-07-01

    The appendices present the following: A) Cad-drawing of the Marstal CSHP design. B) Key values - large-scale solar heating in Denmark. C) Monitoring - a system description. D) WMO-classification of pyranometers (solarimeters). E) The computer simulation model in TRNSYS. F) Selected papers from the author. (EHS)

  7. Dynamic Modeling, Optimization, and Advanced Control for Large Scale Biorefineries

    DEFF Research Database (Denmark)

    Prunescu, Remus Mihail

    with a complex conversion route. Computational fluid dynamics is used to model transport phenomena in large reactors capturing tank profiles, and delays due to plug flows. This work publishes for the first time demonstration scale real data for validation showing that the model library is suitable...

  8. Multidimensional Environmental Data Resource Brokering on Computational Grids and Scientific Clouds

    Science.gov (United States)

    Montella, Raffaele; Giunta, Giulio; Laccetti, Giuliano

    Grid computing has widely evolved over the past years, and its capabilities have found their way even into business products and are no longer relegated to scientific applications. Today, grid computing technology is not restricted to a set of specific grid open source or industrial products, but rather it is comprised of a set of capabilities virtually within any kind of software to create shared and highly collaborative production environments. These environments are focused on computational (workload) capabilities and the integration of information (data) into those computational capabilities. An active grid computing application field is the fully virtualization of scientific instruments in order to increase their availability and decrease operational and maintaining costs. Computational and information grids allow to manage real-world objects in a service-oriented way using industrial world-spread standards.

  9. Neighborhood communication paradigm to increase scalability in large-scale dynamic scientific applications

    KAUST Repository

    Ovcharenko, Aleksandr

    2012-03-01

    This paper introduces a general-purpose communication package built on top of MPI which is aimed at improving inter-processor communications independently of the supercomputer architecture being considered. The package is developed to support parallel applications that rely on computation characterized by large number of messages of various sizes, often small, that are focused within processor neighborhoods. In some cases, such as solvers having static mesh partitions, the number and size of messages are known a priori. However, in other cases such as mesh adaptation, the messages evolve and vary in number and size and include the dynamic movement of partition objects. The current package provides a utility for dynamic applications based on two key attributes that are: (i) explicit consideration of the neighborhood communication pattern to avoid many-to-many calls and also to reduce the number of collective calls to a minimum, and (ii) use of non-blocking MPI functions along with message packing to manage message flow control and reduce the number and time of communication calls. The test application demonstrated is parallel unstructured mesh adaptation. Results on IBM Blue Gene/P and Cray XE6 computers show that the use of neighborhood-based communication control leads to scalable results when executing generally imbalanced mesh adaptation runs. © 2011 Elsevier B.V. All rights reserved.

  10. Computer simulations for the nano-scale

    International Nuclear Information System (INIS)

    Stich, I.

    2007-01-01

    A review of methods for computations for the nano-scale is presented. The paper should provide a convenient starting point into computations for the nano-scale as well as a more in depth presentation for those already working in the field of atomic/molecular-scale modeling. The argument is divided in chapters covering the methods for description of the (i) electrons, (ii) ions, and (iii) techniques for efficient solving of the underlying equations. A fairly broad view is taken covering the Hartree-Fock approximation, density functional techniques and quantum Monte-Carlo techniques for electrons. The customary quantum chemistry methods, such as post Hartree-Fock techniques, are only briefly mentioned. Description of both classical and quantum ions is presented. The techniques cover Ehrenfest, Born-Oppenheimer, and Car-Parrinello dynamics. The strong and weak points of both principal and technical nature are analyzed. In the second part we introduce a number of applications to demonstrate the different approximations and techniques introduced in the first part. They cover a wide range of applications such as non-simple liquids, surfaces, molecule-surface interactions, applications in nano technology, etc. These more in depth presentations, while certainly not exhaustive, should provide information on technical aspects of the simulations, typical parameters used, and ways of analysis of the huge amounts of data generated in these large-scale supercomputer simulations. (author)

  11. Statistical physics of fracture: scientific discovery through high-performance computing

    International Nuclear Information System (INIS)

    Kumar, Phani; Nukala, V V; Simunovic, Srdan; Mills, Richard T

    2006-01-01

    The paper presents the state-of-the-art algorithmic developments for simulating the fracture of disordered quasi-brittle materials using discrete lattice systems. Large scale simulations are often required to obtain accurate scaling laws; however, due to computational complexity, the simulations using the traditional algorithms were limited to small system sizes. We have developed two algorithms: a multiple sparse Cholesky downdating scheme for simulating 2D random fuse model systems, and a block-circulant preconditioner for simulating 2D random fuse model systems. Using these algorithms, we were able to simulate fracture of largest ever lattice system sizes (L = 1024 in 2D, and L = 64 in 3D) with extensive statistical sampling. Our recent simulations on 1024 processors of Cray-XT3 and IBM Blue-Gene/L have further enabled us to explore fracture of 3D lattice systems of size L = 200, which is a significant computational achievement. These largest ever numerical simulations have enhanced our understanding of physics of fracture; in particular, we analyze damage localization and its deviation from percolation behavior, scaling laws for damage density, universality of fracture strength distribution, size effect on the mean fracture strength, and finally the scaling of crack surface roughness

  12. Ontology-Driven Discovery of Scientific Computational Entities

    Science.gov (United States)

    Brazier, Pearl W.

    2010-01-01

    Many geoscientists use modern computational resources, such as software applications, Web services, scientific workflows and datasets that are readily available on the Internet, to support their research and many common tasks. These resources are often shared via human contact and sometimes stored in data portals; however, they are not necessarily…

  13. Report of the Workshop on Petascale Systems Integration for LargeScale Facilities

    Energy Technology Data Exchange (ETDEWEB)

    Kramer, William T.C.; Walter, Howard; New, Gary; Engle, Tom; Pennington, Rob; Comes, Brad; Bland, Buddy; Tomlison, Bob; Kasdorf, Jim; Skinner, David; Regimbal, Kevin

    2007-10-01

    There are significant issues regarding Large Scale System integration that are not being addressed in other forums such as current research portfolios or vendor user groups. Unfortunately, the issues in the area of large-scale system integration often fall into a netherworld; not research, not facilities, not procurement, not operations, not user services. Taken together, these issues along with the impact of sub-optimal integration technology means the time required to deploy, integrate and stabilize large scale system may consume up to 20 percent of the useful life of such systems. Improving the state of the art for large scale systems integration has potential to increase the scientific productivity of these systems. Sites have significant expertise, but there are no easy ways to leverage this expertise among them . Many issues inhibit the sharing of information, including available time and effort, as well as issues with sharing proprietary information. Vendors also benefit in the long run from the solutions to issues detected during site testing and integration. There is a great deal of enthusiasm for making large scale system integration a full-fledged partner along with the other major thrusts supported by funding agencies in the definition, design, and use of a petascale systems. Integration technology and issues should have a full 'seat at the table' as petascale and exascale initiatives and programs are planned. The workshop attendees identified a wide range of issues and suggested paths forward. Pursuing these with funding opportunities and innovation offers the opportunity to dramatically improve the state of large scale system integration.

  14. The Anatomy of Big Data Computing

    OpenAIRE

    Kune, Raghavendra; Konugurthi, Pramodkumar; Agarwal, Arun; Chillarige, Raghavendra Rao; Buyya, Rajkumar

    2015-01-01

    Advances in information technology and its widespread growth in several areas of business, engineering, medical and scientific studies are resulting in information/data explosion. Knowledge discovery and decision making from such rapidly growing voluminous data is a challenging task in terms of data organization and processing, which is an emerging trend known as Big Data Computing; a new paradigm which combines large scale compute, new data intensive techniques and mathematical models to bui...

  15. Accelerating scientific discovery : 2007 annual report.

    Energy Technology Data Exchange (ETDEWEB)

    Beckman, P.; Dave, P.; Drugan, C.

    2008-11-14

    guidance for applications that are transitioning to petascale as well as to produce software that facilitates their development, such as the MPICH library, which provides a portable and efficient implementation of the MPI standard--the prevalent programming model for large-scale scientific applications--and the PETSc toolkit that provides a programming paradigm that eases the development of many scientific applications on high-end computers.

  16. VisualRank: applying PageRank to large-scale image search.

    Science.gov (United States)

    Jing, Yushi; Baluja, Shumeet

    2008-11-01

    Because of the relative ease in understanding and processing text, commercial image-search systems often rely on techniques that are largely indistinguishable from text-search. Recently, academic studies have demonstrated the effectiveness of employing image-based features to provide alternative or additional signals. However, it remains uncertain whether such techniques will generalize to a large number of popular web queries, and whether the potential improvement to search quality warrants the additional computational cost. In this work, we cast the image-ranking problem into the task of identifying "authority" nodes on an inferred visual similarity graph and propose VisualRank to analyze the visual link structures among images. The images found to be "authorities" are chosen as those that answer the image-queries well. To understand the performance of such an approach in a real system, we conducted a series of large-scale experiments based on the task of retrieving images for 2000 of the most popular products queries. Our experimental results show significant improvement, in terms of user satisfaction and relevancy, in comparison to the most recent Google Image Search results. Maintaining modest computational cost is vital to ensuring that this procedure can be used in practice; we describe the techniques required to make this system practical for large scale deployment in commercial search engines.

  17. Super-computer architecture

    CERN Document Server

    Hockney, R W

    1977-01-01

    This paper examines the design of the top-of-the-range, scientific, number-crunching computers. The market for such computers is not as large as that for smaller machines, but on the other hand it is by no means negligible. The present work-horse machines in this category are the CDC 7600 and IBM 360/195, and over fifty of the former machines have been sold. The types of installation that form the market for such machines are not only the major scientific research laboratories in the major countries-such as Los Alamos, CERN, Rutherford laboratory-but also major universities or university networks. It is also true that, as with sports cars, innovations made to satisfy the top of the market today often become the standard for the medium-scale computer of tomorrow. Hence there is considerable interest in examining present developments in this area. (0 refs).

  18. Python for Scientific Computing Education: Modeling of Queueing Systems

    Directory of Open Access Journals (Sweden)

    Vladimiras Dolgopolovas

    2014-01-01

    Full Text Available In this paper, we present the methodology for the introduction to scientific computing based on model-centered learning. We propose multiphase queueing systems as a basis for learning objects. We use Python and parallel programming for implementing the models and present the computer code and results of stochastic simulations.

  19. Large Scale Landform Mapping Using Lidar DEM

    Directory of Open Access Journals (Sweden)

    Türkay Gökgöz

    2015-08-01

    Full Text Available In this study, LIDAR DEM data was used to obtain a primary landform map in accordance with a well-known methodology. This primary landform map was generalized using the Focal Statistics tool (Majority, considering the minimum area condition in cartographic generalization in order to obtain landform maps at 1:1000 and 1:5000 scales. Both the primary and the generalized landform maps were verified visually with hillshaded DEM and an orthophoto. As a result, these maps provide satisfactory visuals of the landforms. In order to show the effect of generalization, the area of each landform in both the primary and the generalized maps was computed. Consequently, landform maps at large scales could be obtained with the proposed methodology, including generalization using LIDAR DEM.

  20. Large-scale grid management

    International Nuclear Information System (INIS)

    Langdal, Bjoern Inge; Eggen, Arnt Ove

    2003-01-01

    The network companies in the Norwegian electricity industry now have to establish a large-scale network management, a concept essentially characterized by (1) broader focus (Broad Band, Multi Utility,...) and (2) bigger units with large networks and more customers. Research done by SINTEF Energy Research shows so far that the approaches within large-scale network management may be structured according to three main challenges: centralization, decentralization and out sourcing. The article is part of a planned series

  1. Large-Scale Compute-Intensive Analysis via a Combined In-situ and Co-scheduling Workflow Approach

    Energy Technology Data Exchange (ETDEWEB)

    Messer, Bronson [ORNL; Sewell, Christopher [Los Alamos National Laboratory (LANL); Heitmann, Katrin [ORNL; Finkel, Dr. Hal J [Argonne National Laboratory (ANL); Fasel, Patricia [Los Alamos National Laboratory (LANL); Zagaris, George [Lawrence Livermore National Laboratory (LLNL); Pope, Adrian [Los Alamos National Laboratory (LANL); Habib, Salman [ORNL; Parete-Koon, Suzanne T [ORNL

    2015-01-01

    Large-scale simulations can produce tens of terabytes of data per analysis cycle, complicating and limiting the efficiency of workflows. Traditionally, outputs are stored on the file system and analyzed in post-processing. With the rapidly increasing size and complexity of simulations, this approach faces an uncertain future. Trending techniques consist of performing the analysis in situ, utilizing the same resources as the simulation, and/or off-loading subsets of the data to a compute-intensive analysis system. We introduce an analysis framework developed for HACC, a cosmological N-body code, that uses both in situ and co-scheduling approaches for handling Petabyte-size outputs. An initial in situ step is used to reduce the amount of data to be analyzed, and to separate out the data-intensive tasks handled off-line. The analysis routines are implemented using the PISTON/VTK-m framework, allowing a single implementation of an algorithm that simultaneously targets a variety of GPU, multi-core, and many-core architectures.

  2. Side effects of problem-solving strategies in large-scale nutrition science: towards a diversification of health.

    Science.gov (United States)

    Penders, Bart; Vos, Rein; Horstman, Klasien

    2009-11-01

    Solving complex problems in large-scale research programmes requires cooperation and division of labour. Simultaneously, large-scale problem solving also gives rise to unintended side effects. Based upon 5 years of researching two large-scale nutrigenomic research programmes, we argue that problems are fragmented in order to be solved. These sub-problems are given priority for practical reasons and in the process of solving them, various changes are introduced in each sub-problem. Combined with additional diversity as a result of interdisciplinarity, this makes reassembling the original and overall goal of the research programme less likely. In the case of nutrigenomics and health, this produces a diversification of health. As a result, the public health goal of contemporary nutrition science is not reached in the large-scale research programmes we studied. Large-scale research programmes are very successful in producing scientific publications and new knowledge; however, in reaching their political goals they often are less successful.

  3. Scaling predictive modeling in drug development with cloud computing.

    Science.gov (United States)

    Moghadam, Behrooz Torabi; Alvarsson, Jonathan; Holm, Marcus; Eklund, Martin; Carlsson, Lars; Spjuth, Ola

    2015-01-26

    Growing data sets with increased time for analysis is hampering predictive modeling in drug discovery. Model building can be carried out on high-performance computer clusters, but these can be expensive to purchase and maintain. We have evaluated ligand-based modeling on cloud computing resources where computations are parallelized and run on the Amazon Elastic Cloud. We trained models on open data sets of varying sizes for the end points logP and Ames mutagenicity and compare with model building parallelized on a traditional high-performance computing cluster. We show that while high-performance computing results in faster model building, the use of cloud computing resources is feasible for large data sets and scales well within cloud instances. An additional advantage of cloud computing is that the costs of predictive models can be easily quantified, and a choice can be made between speed and economy. The easy access to computational resources with no up-front investments makes cloud computing an attractive alternative for scientists, especially for those without access to a supercomputer, and our study shows that it enables cost-efficient modeling of large data sets on demand within reasonable time.

  4. National Laboratory for Advanced Scientific Visualization at UNAM - Mexico

    Science.gov (United States)

    Manea, Marina; Constantin Manea, Vlad; Varela, Alfredo

    2016-04-01

    In 2015, the National Autonomous University of Mexico (UNAM) joined the family of Universities and Research Centers where advanced visualization and computing plays a key role to promote and advance missions in research, education, community outreach, as well as business-oriented consulting. This initiative provides access to a great variety of advanced hardware and software resources and offers a range of consulting services that spans a variety of areas related to scientific visualization, among which are: neuroanatomy, embryonic development, genome related studies, geosciences, geography, physics and mathematics related disciplines. The National Laboratory for Advanced Scientific Visualization delivers services through three main infrastructure environments: the 3D fully immersive display system Cave, the high resolution parallel visualization system Powerwall, the high resolution spherical displays Earth Simulator. The entire visualization infrastructure is interconnected to a high-performance-computing-cluster (HPCC) called ADA in honor to Ada Lovelace, considered to be the first computer programmer. The Cave is an extra large 3.6m wide room with projected images on the front, left and right, as well as floor walls. Specialized crystal eyes LCD-shutter glasses provide a strong stereo depth perception, and a variety of tracking devices allow software to track the position of a user's hand, head and wand. The Powerwall is designed to bring large amounts of complex data together through parallel computing for team interaction and collaboration. This system is composed by 24 (6x4) high-resolution ultra-thin (2 mm) bezel monitors connected to a high-performance GPU cluster. The Earth Simulator is a large (60") high-resolution spherical display used for global-scale data visualization like geophysical, meteorological, climate and ecology data. The HPCC-ADA, is a 1000+ computing core system, which offers parallel computing resources to applications that requires

  5. Full-Scale Approximations of Spatio-Temporal Covariance Models for Large Datasets

    KAUST Repository

    Zhang, Bohai; Sang, Huiyan; Huang, Jianhua Z.

    2014-01-01

    of dataset and application of such models is not feasible for large datasets. This article extends the full-scale approximation (FSA) approach by Sang and Huang (2012) to the spatio-temporal context to reduce computational complexity. A reversible jump Markov

  6. Parallel clustering algorithm for large-scale biological data sets.

    Science.gov (United States)

    Wang, Minchao; Zhang, Wu; Ding, Wang; Dai, Dongbo; Zhang, Huiran; Xie, Hao; Chen, Luonan; Guo, Yike; Xie, Jiang

    2014-01-01

    Recent explosion of biological data brings a great challenge for the traditional clustering algorithms. With increasing scale of data sets, much larger memory and longer runtime are required for the cluster identification problems. The affinity propagation algorithm outperforms many other classical clustering algorithms and is widely applied into the biological researches. However, the time and space complexity become a great bottleneck when handling the large-scale data sets. Moreover, the similarity matrix, whose constructing procedure takes long runtime, is required before running the affinity propagation algorithm, since the algorithm clusters data sets based on the similarities between data pairs. Two types of parallel architectures are proposed in this paper to accelerate the similarity matrix constructing procedure and the affinity propagation algorithm. The memory-shared architecture is used to construct the similarity matrix, and the distributed system is taken for the affinity propagation algorithm, because of its large memory size and great computing capacity. An appropriate way of data partition and reduction is designed in our method, in order to minimize the global communication cost among processes. A speedup of 100 is gained with 128 cores. The runtime is reduced from serval hours to a few seconds, which indicates that parallel algorithm is capable of handling large-scale data sets effectively. The parallel affinity propagation also achieves a good performance when clustering large-scale gene data (microarray) and detecting families in large protein superfamilies.

  7. Extraction of drainage networks from large terrain datasets using high throughput computing

    Science.gov (United States)

    Gong, Jianya; Xie, Jibo

    2009-02-01

    Advanced digital photogrammetry and remote sensing technology produces large terrain datasets (LTD). How to process and use these LTD has become a big challenge for GIS users. Extracting drainage networks, which are basic for hydrological applications, from LTD is one of the typical applications of digital terrain analysis (DTA) in geographical information applications. Existing serial drainage algorithms cannot deal with large data volumes in a timely fashion, and few GIS platforms can process LTD beyond the GB size. High throughput computing (HTC), a distributed parallel computing mode, is proposed to improve the efficiency of drainage networks extraction from LTD. Drainage network extraction using HTC involves two key issues: (1) how to decompose the large DEM datasets into independent computing units and (2) how to merge the separate outputs into a final result. A new decomposition method is presented in which the large datasets are partitioned into independent computing units using natural watershed boundaries instead of using regular 1-dimensional (strip-wise) and 2-dimensional (block-wise) decomposition. Because the distribution of drainage networks is strongly related to watershed boundaries, the new decomposition method is more effective and natural. The method to extract natural watershed boundaries was improved by using multi-scale DEMs instead of single-scale DEMs. A HTC environment is employed to test the proposed methods with real datasets.

  8. RGCA: A Reliable GPU Cluster Architecture for Large-Scale Internet of Things Computing Based on Effective Performance-Energy Optimization.

    Science.gov (United States)

    Fang, Yuling; Chen, Qingkui; Xiong, Neal N; Zhao, Deyu; Wang, Jingjuan

    2017-08-04

    This paper aims to develop a low-cost, high-performance and high-reliability computing system to process large-scale data using common data mining algorithms in the Internet of Things (IoT) computing environment. Considering the characteristics of IoT data processing, similar to mainstream high performance computing, we use a GPU (Graphics Processing Unit) cluster to achieve better IoT services. Firstly, we present an energy consumption calculation method (ECCM) based on WSNs. Then, using the CUDA (Compute Unified Device Architecture) Programming model, we propose a Two-level Parallel Optimization Model (TLPOM) which exploits reasonable resource planning and common compiler optimization techniques to obtain the best blocks and threads configuration considering the resource constraints of each node. The key to this part is dynamic coupling Thread-Level Parallelism (TLP) and Instruction-Level Parallelism (ILP) to improve the performance of the algorithms without additional energy consumption. Finally, combining the ECCM and the TLPOM, we use the Reliable GPU Cluster Architecture (RGCA) to obtain a high-reliability computing system considering the nodes' diversity, algorithm characteristics, etc. The results show that the performance of the algorithms significantly increased by 34.1%, 33.96% and 24.07% for Fermi, Kepler and Maxwell on average with TLPOM and the RGCA ensures that our IoT computing system provides low-cost and high-reliability services.

  9. Sparse deconvolution for the large-scale ill-posed inverse problem of impact force reconstruction

    Science.gov (United States)

    Qiao, Baijie; Zhang, Xingwu; Gao, Jiawei; Liu, Ruonan; Chen, Xuefeng

    2017-01-01

    Most previous regularization methods for solving the inverse problem of force reconstruction are to minimize the l2-norm of the desired force. However, these traditional regularization methods such as Tikhonov regularization and truncated singular value decomposition, commonly fail to solve the large-scale ill-posed inverse problem in moderate computational cost. In this paper, taking into account the sparse characteristic of impact force, the idea of sparse deconvolution is first introduced to the field of impact force reconstruction and a general sparse deconvolution model of impact force is constructed. Second, a novel impact force reconstruction method based on the primal-dual interior point method (PDIPM) is proposed to solve such a large-scale sparse deconvolution model, where minimizing the l2-norm is replaced by minimizing the l1-norm. Meanwhile, the preconditioned conjugate gradient algorithm is used to compute the search direction of PDIPM with high computational efficiency. Finally, two experiments including the small-scale or medium-scale single impact force reconstruction and the relatively large-scale consecutive impact force reconstruction are conducted on a composite wind turbine blade and a shell structure to illustrate the advantage of PDIPM. Compared with Tikhonov regularization, PDIPM is more efficient, accurate and robust whether in the single impact force reconstruction or in the consecutive impact force reconstruction.

  10. BILGO: Bilateral greedy optimization for large scale semidefinite programming

    KAUST Repository

    Hao, Zhifeng; Yuan, Ganzhao; Ghanem, Bernard

    2013-01-01

    Many machine learning tasks (e.g. metric and manifold learning problems) can be formulated as convex semidefinite programs. To enable the application of these tasks on a large-scale, scalability and computational efficiency are considered as desirable properties for a practical semidefinite programming algorithm. In this paper, we theoretically analyze a new bilateral greedy optimization (denoted BILGO) strategy in solving general semidefinite programs on large-scale datasets. As compared to existing methods, BILGO employs a bilateral search strategy during each optimization iteration. In such an iteration, the current semidefinite matrix solution is updated as a bilateral linear combination of the previous solution and a suitable rank-1 matrix, which can be efficiently computed from the leading eigenvector of the descent direction at this iteration. By optimizing for the coefficients of the bilateral combination, BILGO reduces the cost function in every iteration until the KKT conditions are fully satisfied, thus, it tends to converge to a global optimum. In fact, we prove that BILGO converges to the global optimal solution at a rate of O(1/k), where k is the iteration counter. The algorithm thus successfully combines the efficiency of conventional rank-1 update algorithms and the effectiveness of gradient descent. Moreover, BILGO can be easily extended to handle low rank constraints. To validate the effectiveness and efficiency of BILGO, we apply it to two important machine learning tasks, namely Mahalanobis metric learning and maximum variance unfolding. Extensive experimental results clearly demonstrate that BILGO can solve large-scale semidefinite programs efficiently.

  11. BILGO: Bilateral greedy optimization for large scale semidefinite programming

    KAUST Repository

    Hao, Zhifeng

    2013-10-03

    Many machine learning tasks (e.g. metric and manifold learning problems) can be formulated as convex semidefinite programs. To enable the application of these tasks on a large-scale, scalability and computational efficiency are considered as desirable properties for a practical semidefinite programming algorithm. In this paper, we theoretically analyze a new bilateral greedy optimization (denoted BILGO) strategy in solving general semidefinite programs on large-scale datasets. As compared to existing methods, BILGO employs a bilateral search strategy during each optimization iteration. In such an iteration, the current semidefinite matrix solution is updated as a bilateral linear combination of the previous solution and a suitable rank-1 matrix, which can be efficiently computed from the leading eigenvector of the descent direction at this iteration. By optimizing for the coefficients of the bilateral combination, BILGO reduces the cost function in every iteration until the KKT conditions are fully satisfied, thus, it tends to converge to a global optimum. In fact, we prove that BILGO converges to the global optimal solution at a rate of O(1/k), where k is the iteration counter. The algorithm thus successfully combines the efficiency of conventional rank-1 update algorithms and the effectiveness of gradient descent. Moreover, BILGO can be easily extended to handle low rank constraints. To validate the effectiveness and efficiency of BILGO, we apply it to two important machine learning tasks, namely Mahalanobis metric learning and maximum variance unfolding. Extensive experimental results clearly demonstrate that BILGO can solve large-scale semidefinite programs efficiently.

  12. The OSG Open Facility: an on-ramp for opportunistic scientific computing

    Science.gov (United States)

    Jayatilaka, B.; Levshina, T.; Sehgal, C.; Gardner, R.; Rynge, M.; Würthwein, F.

    2017-10-01

    The Open Science Grid (OSG) is a large, robust computing grid that started primarily as a collection of sites associated with large HEP experiments such as ATLAS, CDF, CMS, and DZero, but has evolved in recent years to a much larger user and resource platform. In addition to meeting the US LHC community’s computational needs, the OSG continues to be one of the largest providers of distributed high-throughput computing (DHTC) to researchers from a wide variety of disciplines via the OSG Open Facility. The Open Facility consists of OSG resources that are available opportunistically to users other than resource owners and their collaborators. In the past two years, the Open Facility has doubled its annual throughput to over 200 million wall hours. More than half of these resources are used by over 100 individual researchers from over 60 institutions in fields such as biology, medicine, math, economics, and many others. Over 10% of these individual users utilized in excess of 1 million computational hours each in the past year. The largest source of these cycles is temporary unused capacity at institutions affiliated with US LHC computational sites. An increasing fraction, however, comes from university HPC clusters and large national infrastructure supercomputers offering unused capacity. Such expansions have allowed the OSG to provide ample computational resources to both individual researchers and small groups as well as sizable international science collaborations such as LIGO, AMS, IceCube, and sPHENIX. Opening up access to the Fermilab FabrIc for Frontier Experiments (FIFE) project has also allowed experiments such as mu2e and NOvA to make substantial use of Open Facility resources, the former with over 40 million wall hours in a year. We present how this expansion was accomplished as well as future plans for keeping the OSG Open Facility at the forefront of enabling scientific research by way of DHTC.

  13. The OSG Open Facility: An On-Ramp for Opportunistic Scientific Computing

    Energy Technology Data Exchange (ETDEWEB)

    Jayatilaka, B. [Fermilab; Levshina, T. [Fermilab; Sehgal, C. [Fermilab; Gardner, R. [Chicago U.; Rynge, M. [USC - ISI, Marina del Rey; Würthwein, F. [UC, San Diego

    2017-11-22

    The Open Science Grid (OSG) is a large, robust computing grid that started primarily as a collection of sites associated with large HEP experiments such as ATLAS, CDF, CMS, and DZero, but has evolved in recent years to a much larger user and resource platform. In addition to meeting the US LHC community’s computational needs, the OSG continues to be one of the largest providers of distributed high-throughput computing (DHTC) to researchers from a wide variety of disciplines via the OSG Open Facility. The Open Facility consists of OSG resources that are available opportunistically to users other than resource owners and their collaborators. In the past two years, the Open Facility has doubled its annual throughput to over 200 million wall hours. More than half of these resources are used by over 100 individual researchers from over 60 institutions in fields such as biology, medicine, math, economics, and many others. Over 10% of these individual users utilized in excess of 1 million computational hours each in the past year. The largest source of these cycles is temporary unused capacity at institutions affiliated with US LHC computational sites. An increasing fraction, however, comes from university HPC clusters and large national infrastructure supercomputers offering unused capacity. Such expansions have allowed the OSG to provide ample computational resources to both individual researchers and small groups as well as sizable international science collaborations such as LIGO, AMS, IceCube, and sPHENIX. Opening up access to the Fermilab FabrIc for Frontier Experiments (FIFE) project has also allowed experiments such as mu2e and NOvA to make substantial use of Open Facility resources, the former with over 40 million wall hours in a year. We present how this expansion was accomplished as well as future plans for keeping the OSG Open Facility at the forefront of enabling scientific research by way of DHTC.

  14. Scientific computing with MATLAB and Octave

    CERN Document Server

    Quarteroni, Alfio; Gervasio, Paola

    2014-01-01

    This textbook is an introduction to Scientific Computing, in which several numerical methods for the computer-based solution of certain classes of mathematical problems are illustrated. The authors show how to compute the zeros, the extrema, and the integrals of continuous functions, solve linear systems, approximate functions using polynomials and construct accurate approximations for the solution of ordinary and partial differential equations. To make the format concrete and appealing, the programming environments Matlab and Octave are adopted as faithful companions. The book contains the solutions to several problems posed in exercises and examples, often originating from important applications. At the end of each chapter, a specific section is devoted to subjects which were not addressed in the book and contains bibliographical references for a more comprehensive treatment of the material. From the review: ".... This carefully written textbook, the third English edition, contains substantial new developme...

  15. Large scale statistics for computational verification of grain growth simulations with experiments

    International Nuclear Information System (INIS)

    Demirel, Melik C.; Kuprat, Andrew P.; George, Denise C.; Straub, G.K.; Misra, Amit; Alexander, Kathleen B.; Rollett, Anthony D.

    2002-01-01

    It is known that by controlling microstructural development, desirable properties of materials can be achieved. The main objective of our research is to understand and control interface dominated material properties, and finally, to verify experimental results with computer simulations. We have previously showed a strong similarity between small-scale grain growth experiments and anisotropic three-dimensional simulations obtained from the Electron Backscattered Diffraction (EBSD) measurements. Using the same technique, we obtained 5170-grain data from an Aluminum-film (120 (micro)m thick) with a columnar grain structure. Experimentally obtained starting microstructure and grain boundary properties are input for the three-dimensional grain growth simulation. In the computational model, minimization of the interface energy is the driving force for the grain boundary motion. The computed evolved microstructure is compared with the final experimental microstructure, after annealing at 550 C. Characterization of the structures and properties of grain boundary networks (GBN) to produce desirable microstructures is one of the fundamental problems in interface science. There is an ongoing research for the development of new experimental and analytical techniques in order to obtain and synthesize information related to GBN. The grain boundary energy and mobility data were characterized by Electron Backscattered Diffraction (EBSD) technique and Atomic Force Microscopy (AFM) observations (i.e., for ceramic MgO and for the metal Al). Grain boundary energies are extracted from triple junction (TJ) geometry considering the local equilibrium condition at TJ's. Relative boundary mobilities were also extracted from TJ's through a statistical/multiscale analysis. Additionally, there are recent theoretical developments of grain boundary evolution in microstructures. In this paper, a new technique for three-dimensional grain growth simulations was used to simulate interface migration

  16. An assessment of future computer system needs for large-scale computation

    Science.gov (United States)

    Lykos, P.; White, J.

    1980-01-01

    Data ranging from specific computer capability requirements to opinions about the desirability of a national computer facility are summarized. It is concluded that considerable attention should be given to improving the user-machine interface. Otherwise, increased computer power may not improve the overall effectiveness of the machine user. Significant improvement in throughput requires highly concurrent systems plus the willingness of the user community to develop problem solutions for that kind of architecture. An unanticipated result was the expression of need for an on-going cross-disciplinary users group/forum in order to share experiences and to more effectively communicate needs to the manufacturers.

  17. The composing technique of fast and large scale nuclear data acquisition and control system with single chip microcomputers and PC computers

    International Nuclear Information System (INIS)

    Xu Zurun; Wu Shiying; Liu Haitao; Yao Yangsen; Wang Yingguan; Yang Chaowen

    1998-01-01

    The technique of employing single-chip microcomputers and PC computers to compose a fast and large scale nuclear data acquisition and control system was discussed in detail. The optimum composition mode of this kind of system, the acquisition and control circuit unit based on single-chip microcomputers, the real-time communication methods and the software composition under the Windows 3.2 were also described. One, two and three dimensional spectra measured by this system were demonstrated

  18. The composing technique of fast and large scale nuclear data acquisition and control system with single chip microcomputers and PC computers

    International Nuclear Information System (INIS)

    Xu Zurun; Wu Shiying; Liu Haitao; Yao Yangsen; Wang Yingguan; Yang Chaowen

    1997-01-01

    The technique of employing single-chip microcomputers and PC computers to compose a fast and large scale nuclear data acquisition and control system was discussed in detail. The optimum composition mode of this kind of system, the acquisition and control circuit unit based on single-chip microcomputers, the real-time communication methods and the software composition under the Windows 3.2 were also described. One, two and three dimensional spectra measured by this system were demonstrated

  19. Center for Technology for Advanced Scientific Component Software (TASCS)

    Energy Technology Data Exchange (ETDEWEB)

    Damevski, Kostadin [Virginia State Univ., Petersburg, VA (United States)

    2009-03-30

    A resounding success of the Scientific Discover through Advanced Computing (SciDAC) program is that high-performance computational science is now universally recognized as a critical aspect of scientific discovery [71], complementing both theoretical and experimental research. As scientific communities prepare to exploit unprecedened computing capabilities of emerging leadership-class machines for multi-model simulations at the extreme scale [72], it is more important than ever to address the technical and social challenges of geographically distributed teams that combine expertise in domain science, applied mathematics, and computer science to build robust and flexible codes that can incorporate changes over time. The Center for Technology for Advanced Scientific Component Software (TASCS) tackles these issues by exploiting component-based software development to facilitate collaborative hig-performance scientific computing.

  20. GRID : unlimited computing power on your desktop Conference MT17

    CERN Multimedia

    2001-01-01

    The Computational GRID is an analogy to the electrical power grid for computing resources. It decouples the provision of computing, data, and networking from its use, it allows large-scale pooling and sharing of resources distributed world-wide. Every computer, from a desktop to a mainframe or supercomputer, can provide computing power or data for the GRID. The final objective is to plug your computer into the wall and have direct access to huge computing resources immediately, just like plugging-in a lamp to get instant light. The GRID will facilitate world-wide scientific collaborations on an unprecedented scale. It will provide transparent access to major distributed resources of computer power, data, information, and collaborations.

  1. Large scale computing in the Energy Research Programs

    International Nuclear Information System (INIS)

    1991-05-01

    The Energy Research Supercomputer Users Group (ERSUG) comprises all investigators using resources of the Department of Energy Office of Energy Research supercomputers. At the December 1989 meeting held at Florida State University (FSU), the ERSUG executive committee determined that the continuing rapid advances in computational sciences and computer technology demanded a reassessment of the role computational science should play in meeting DOE's commitments. Initial studies were to be performed for four subdivisions: (1) Basic Energy Sciences (BES) and Applied Mathematical Sciences (AMS), (2) Fusion Energy, (3) High Energy and Nuclear Physics, and (4) Health and Environmental Research. The first two subgroups produced formal subreports that provided a basis for several sections of this report. Additional information provided in the AMS/BES is included as Appendix C in an abridged form that eliminates most duplication. Additionally, each member of the executive committee was asked to contribute area-specific assessments; these assessments are included in the next section. In the following sections, brief assessments are given for specific areas, a conceptual model is proposed that the entire computational effort for energy research is best viewed as one giant nation-wide computer, and then specific recommendations are made for the appropriate evolution of the system

  2. Commercial applications of large-scale Research and Development computer simulation technologies

    International Nuclear Information System (INIS)

    Kuok Mee Ling; Pascal Chen; Wen Ho Lee

    1998-01-01

    The potential commercial applications of two large-scale R and D computer simulation technologies are presented. One such technology is based on the numerical solution of the hydrodynamics equations, and is embodied in the two-dimensional Eulerian code EULE2D, which solves the hydrodynamic equations with various models for the equation of state (EOS), constitutive relations and fracture mechanics. EULE2D is an R and D code originally developed to design and analyze conventional munitions for anti-armor penetrations such as shaped charges, explosive formed projectiles, and kinetic energy rods. Simulated results agree very well with actual experiments. A commercial application presented here is the design and simulation of shaped charges for oil and gas well bore perforation. The other R and D simulation technology is based on the numerical solution of Maxwell's partial differential equations of electromagnetics in space and time, and is implemented in the three-dimensional code FDTD-SPICE, which solves Maxwell's equations in the time domain with finite-differences in the three spatial dimensions and calls SPICE for information when nonlinear active devices are involved. The FDTD method has been used in the radar cross-section modeling of military aircrafts and many other electromagnetic phenomena. The coupling of FDTD method with SPICE, a popular circuit and device simulation program, provides a powerful tool for the simulation and design of microwave and millimeter-wave circuits containing nonlinear active semiconductor devices. A commercial application of FDTD-SPICE presented here is the simulation of a two-element active antenna system. The simulation results and the experimental measurements are in excellent agreement. (Author)

  3. Large scale Brownian dynamics of confined suspensions of rigid particles

    Science.gov (United States)

    Sprinkle, Brennan; Balboa Usabiaga, Florencio; Patankar, Neelesh A.; Donev, Aleksandar

    2017-12-01

    We introduce methods for large-scale Brownian Dynamics (BD) simulation of many rigid particles of arbitrary shape suspended in a fluctuating fluid. Our method adds Brownian motion to the rigid multiblob method [F. Balboa Usabiaga et al., Commun. Appl. Math. Comput. Sci. 11(2), 217-296 (2016)] at a cost comparable to the cost of deterministic simulations. We demonstrate that we can efficiently generate deterministic and random displacements for many particles using preconditioned Krylov iterative methods, if kernel methods to efficiently compute the action of the Rotne-Prager-Yamakawa (RPY) mobility matrix and its "square" root are available for the given boundary conditions. These kernel operations can be computed with near linear scaling for periodic domains using the positively split Ewald method. Here we study particles partially confined by gravity above a no-slip bottom wall using a graphical processing unit implementation of the mobility matrix-vector product, combined with a preconditioned Lanczos iteration for generating Brownian displacements. We address a major challenge in large-scale BD simulations, capturing the stochastic drift term that arises because of the configuration-dependent mobility. Unlike the widely used Fixman midpoint scheme, our methods utilize random finite differences and do not require the solution of resistance problems or the computation of the action of the inverse square root of the RPY mobility matrix. We construct two temporal schemes which are viable for large-scale simulations, an Euler-Maruyama traction scheme and a trapezoidal slip scheme, which minimize the number of mobility problems to be solved per time step while capturing the required stochastic drift terms. We validate and compare these schemes numerically by modeling suspensions of boomerang-shaped particles sedimented near a bottom wall. Using the trapezoidal scheme, we investigate the steady-state active motion in dense suspensions of confined microrollers, whose

  4. Parameter estimation in large-scale systems biology models: a parallel and self-adaptive cooperative strategy.

    Science.gov (United States)

    Penas, David R; González, Patricia; Egea, Jose A; Doallo, Ramón; Banga, Julio R

    2017-01-21

    The development of large-scale kinetic models is one of the current key issues in computational systems biology and bioinformatics. Here we consider the problem of parameter estimation in nonlinear dynamic models. Global optimization methods can be used to solve this type of problems but the associated computational cost is very large. Moreover, many of these methods need the tuning of a number of adjustable search parameters, requiring a number of initial exploratory runs and therefore further increasing the computation times. Here we present a novel parallel method, self-adaptive cooperative enhanced scatter search (saCeSS), to accelerate the solution of this class of problems. The method is based on the scatter search optimization metaheuristic and incorporates several key new mechanisms: (i) asynchronous cooperation between parallel processes, (ii) coarse and fine-grained parallelism, and (iii) self-tuning strategies. The performance and robustness of saCeSS is illustrated by solving a set of challenging parameter estimation problems, including medium and large-scale kinetic models of the bacterium E. coli, bakerés yeast S. cerevisiae, the vinegar fly D. melanogaster, Chinese Hamster Ovary cells, and a generic signal transduction network. The results consistently show that saCeSS is a robust and efficient method, allowing very significant reduction of computation times with respect to several previous state of the art methods (from days to minutes, in several cases) even when only a small number of processors is used. The new parallel cooperative method presented here allows the solution of medium and large scale parameter estimation problems in reasonable computation times and with small hardware requirements. Further, the method includes self-tuning mechanisms which facilitate its use by non-experts. We believe that this new method can play a key role in the development of large-scale and even whole-cell dynamic models.

  5. TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems

    OpenAIRE

    Abadi, Martín; Agarwal, Ashish; Barham, Paul; Brevdo, Eugene; Chen, Zhifeng; Citro, Craig; Corrado, Greg S.; Davis, Andy; Dean, Jeffrey; Devin, Matthieu; Ghemawat, Sanjay; Goodfellow, Ian; Harp, Andrew; Irving, Geoffrey; Isard, Michael

    2016-01-01

    TensorFlow is an interface for expressing machine learning algorithms, and an implementation for executing such algorithms. A computation expressed using TensorFlow can be executed with little or no change on a wide variety of heterogeneous systems, ranging from mobile devices such as phones and tablets up to large-scale distributed systems of hundreds of machines and thousands of computational devices such as GPU cards. The system is flexible and can be used to express a wide variety of algo...

  6. Inferring Large-Scale Terrestrial Water Storage Through GRACE and GPS Data Fusion in Cloud Computing Environments

    Science.gov (United States)

    Rude, C. M.; Li, J. D.; Gowanlock, M.; Herring, T.; Pankratius, V.

    2016-12-01

    Surface subsidence due to depletion of groundwater can lead to permanent compaction of aquifers and damaged infrastructure. However, studies of such effects on a large scale are challenging and compute intensive because they involve fusing a variety of data sets beyond direct measurements from groundwater wells, such as gravity change measurements from the Gravity Recovery and Climate Experiment (GRACE) or surface displacements measured by GPS receivers. Our work therefore leverages Amazon cloud computing to enable these types of analyses spanning the entire continental US. Changes in groundwater storage are inferred from surface displacements measured by GPS receivers stationed throughout the country. Receivers located on bedrock are anti-correlated with changes in water levels from elastic deformation due to loading, while stations on aquifers correlate with groundwater changes due to poroelastic expansion and compaction. Correlating linearly detrended equivalent water thickness measurements from GRACE with linearly detrended and Kalman filtered vertical displacements of GPS stations located throughout the United States helps compensate for the spatial and temporal limitations of GRACE. Our results show that the majority of GPS stations are negatively correlated with GRACE in a statistically relevant way, as most GPS stations are located on bedrock in order to provide stable reference locations and measure geophysical processes such as tectonic deformations. Additionally, stations located on the Central Valley California aquifer show statistically significant positive correlations. Through the identification of positive and negative correlations, deformation phenomena can be classified as loading or poroelastic expansion due to changes in groundwater. This method facilitates further studies of terrestrial water storage on a global scale. This work is supported by NASA AIST-NNX15AG84G (PI: V. Pankratius) and Amazon.

  7. Improving Large-scale Storage System Performance via Topology-aware and Balanced Data Placement

    Energy Technology Data Exchange (ETDEWEB)

    Wang, Feiyi [ORNL; Oral, H Sarp [ORNL; Vazhkudai, Sudharshan S [ORNL

    2014-01-01

    With the advent of big data, the I/O subsystems of large-scale compute clusters are becoming a center of focus, with more applications putting greater demands on end-to-end I/O performance. These subsystems are often complex in design. They comprise of multiple hardware and software layers to cope with the increasing capacity, capability and scalability requirements of data intensive applications. The sharing nature of storage resources and the intrinsic interactions across these layers make it to realize user-level, end-to-end performance gains a great challenge. We propose a topology-aware resource load balancing strategy to improve per-application I/O performance. We demonstrate the effectiveness of our algorithm on an extreme-scale compute cluster, Titan, at the Oak Ridge Leadership Computing Facility (OLCF). Our experiments with both synthetic benchmarks and a real-world application show that, even under congestion, our proposed algorithm can improve large-scale application I/O performance significantly, resulting in both the reduction of application run times and higher resolution simulation runs.

  8. The Large-Scale Biosphere-Atmosphere Experiment in Amazonia: Analyzing Regional Land Use Change Effects.

    Science.gov (United States)

    Michael Keller; Maria Assunção Silva-Dias; Daniel C. Nepstad; Meinrat O. Andreae

    2004-01-01

    The Large-Scale Biosphere-Atmosphere Experiment in Amazonia (LBA) is a multi-disciplinary, multinational scientific project led by Brazil. LBA researchers seek to understand Amazonia in its global context especially with regard to regional and global climate. Current development activities in Amazonia including deforestation, logging, cattle ranching, and agriculture...

  9. Ethics of large-scale change

    OpenAIRE

    Arler, Finn

    2006-01-01

      The subject of this paper is long-term large-scale changes in human society. Some very significant examples of large-scale change are presented: human population growth, human appropriation of land and primary production, the human use of fossil fuels, and climate change. The question is posed, which kind of attitude is appropriate when dealing with large-scale changes like these from an ethical point of view. Three kinds of approaches are discussed: Aldo Leopold's mountain thinking, th...

  10. WImpiBLAST: web interface for mpiBLAST to help biologists perform large-scale annotation using high performance computing.

    Directory of Open Access Journals (Sweden)

    Parichit Sharma

    Full Text Available The function of a newly sequenced gene can be discovered by determining its sequence homology with known proteins. BLAST is the most extensively used sequence analysis program for sequence similarity search in large databases of sequences. With the advent of next generation sequencing technologies it has now become possible to study genes and their expression at a genome-wide scale through RNA-seq and metagenome sequencing experiments. Functional annotation of all the genes is done by sequence similarity search against multiple protein databases. This annotation task is computationally very intensive and can take days to obtain complete results. The program mpiBLAST, an open-source parallelization of BLAST that achieves superlinear speedup, can be used to accelerate large-scale annotation by using supercomputers and high performance computing (HPC clusters. Although many parallel bioinformatics applications using the Message Passing Interface (MPI are available in the public domain, researchers are reluctant to use them due to lack of expertise in the Linux command line and relevant programming experience. With these limitations, it becomes difficult for biologists to use mpiBLAST for accelerating annotation. No web interface is available in the open-source domain for mpiBLAST. We have developed WImpiBLAST, a user-friendly open-source web interface for parallel BLAST searches. It is implemented in Struts 1.3 using a Java backbone and runs atop the open-source Apache Tomcat Server. WImpiBLAST supports script creation and job submission features and also provides a robust job management interface for system administrators. It combines script creation and modification features with job monitoring and management through the Torque resource manager on a Linux-based HPC cluster. Use case information highlights the acceleration of annotation analysis achieved by using WImpiBLAST. Here, we describe the WImpiBLAST web interface features and architecture

  11. WImpiBLAST: web interface for mpiBLAST to help biologists perform large-scale annotation using high performance computing.

    Science.gov (United States)

    Sharma, Parichit; Mantri, Shrikant S

    2014-01-01

    The function of a newly sequenced gene can be discovered by determining its sequence homology with known proteins. BLAST is the most extensively used sequence analysis program for sequence similarity search in large databases of sequences. With the advent of next generation sequencing technologies it has now become possible to study genes and their expression at a genome-wide scale through RNA-seq and metagenome sequencing experiments. Functional annotation of all the genes is done by sequence similarity search against multiple protein databases. This annotation task is computationally very intensive and can take days to obtain complete results. The program mpiBLAST, an open-source parallelization of BLAST that achieves superlinear speedup, can be used to accelerate large-scale annotation by using supercomputers and high performance computing (HPC) clusters. Although many parallel bioinformatics applications using the Message Passing Interface (MPI) are available in the public domain, researchers are reluctant to use them due to lack of expertise in the Linux command line and relevant programming experience. With these limitations, it becomes difficult for biologists to use mpiBLAST for accelerating annotation. No web interface is available in the open-source domain for mpiBLAST. We have developed WImpiBLAST, a user-friendly open-source web interface for parallel BLAST searches. It is implemented in Struts 1.3 using a Java backbone and runs atop the open-source Apache Tomcat Server. WImpiBLAST supports script creation and job submission features and also provides a robust job management interface for system administrators. It combines script creation and modification features with job monitoring and management through the Torque resource manager on a Linux-based HPC cluster. Use case information highlights the acceleration of annotation analysis achieved by using WImpiBLAST. Here, we describe the WImpiBLAST web interface features and architecture, explain design

  12. NASA's computer science research program

    Science.gov (United States)

    Larsen, R. L.

    1983-01-01

    Following a major assessment of NASA's computing technology needs, a new program of computer science research has been initiated by the Agency. The program includes work in concurrent processing, management of large scale scientific databases, software engineering, reliable computing, and artificial intelligence. The program is driven by applications requirements in computational fluid dynamics, image processing, sensor data management, real-time mission control and autonomous systems. It consists of university research, in-house NASA research, and NASA's Research Institute for Advanced Computer Science (RIACS) and Institute for Computer Applications in Science and Engineering (ICASE). The overall goal is to provide the technical foundation within NASA to exploit advancing computing technology in aerospace applications.

  13. Self-* and Adaptive Mechanisms for Large Scale Distributed Systems

    Science.gov (United States)

    Fragopoulou, P.; Mastroianni, C.; Montero, R.; Andrjezak, A.; Kondo, D.

    Large-scale distributed computing systems and infrastructure, such as Grids, P2P systems and desktop Grid platforms, are decentralized, pervasive, and composed of a large number of autonomous entities. The complexity of these systems is such that human administration is nearly impossible and centralized or hierarchical control is highly inefficient. These systems need to run on highly dynamic environments, where content, network topologies and workloads are continuously changing. Moreover, they are characterized by the high degree of volatility of their components and the need to provide efficient service management and to handle efficiently large amounts of data. This paper describes some of the areas for which adaptation emerges as a key feature, namely, the management of computational Grids, the self-management of desktop Grid platforms and the monitoring and healing of complex applications. It also elaborates on the use of bio-inspired algorithms to achieve self-management. Related future trends and challenges are described.

  14. High-Precision Computation and Mathematical Physics

    International Nuclear Information System (INIS)

    Bailey, David H.; Borwein, Jonathan M.

    2008-01-01

    At the present time, IEEE 64-bit floating-point arithmetic is sufficiently accurate for most scientific applications. However, for a rapidly growing body of important scientific computing applications, a higher level of numeric precision is required. Such calculations are facilitated by high-precision software packages that include high-level language translation modules to minimize the conversion effort. This paper presents a survey of recent applications of these techniques and provides some analysis of their numerical requirements. These applications include supernova simulations, climate modeling, planetary orbit calculations, Coulomb n-body atomic systems, scattering amplitudes of quarks, gluons and bosons, nonlinear oscillator theory, Ising theory, quantum field theory and experimental mathematics. We conclude that high-precision arithmetic facilities are now an indispensable component of a modern large-scale scientific computing environment.

  15. RAPPORT: running scientific high-performance computing applications on the cloud.

    Science.gov (United States)

    Cohen, Jeremy; Filippis, Ioannis; Woodbridge, Mark; Bauer, Daniela; Hong, Neil Chue; Jackson, Mike; Butcher, Sarah; Colling, David; Darlington, John; Fuchs, Brian; Harvey, Matt

    2013-01-28

    Cloud computing infrastructure is now widely used in many domains, but one area where there has been more limited adoption is research computing, in particular for running scientific high-performance computing (HPC) software. The Robust Application Porting for HPC in the Cloud (RAPPORT) project took advantage of existing links between computing researchers and application scientists in the fields of bioinformatics, high-energy physics (HEP) and digital humanities, to investigate running a set of scientific HPC applications from these domains on cloud infrastructure. In this paper, we focus on the bioinformatics and HEP domains, describing the applications and target cloud platforms. We conclude that, while there are many factors that need consideration, there is no fundamental impediment to the use of cloud infrastructure for running many types of HPC applications and, in some cases, there is potential for researchers to benefit significantly from the flexibility offered by cloud platforms.

  16. Computer-based data acquisition system in the Large Coil Test Facility

    International Nuclear Information System (INIS)

    Gould, S.S.; Layman, L.R.; Million, D.L.

    1983-01-01

    The utilization of computers for data acquisition and control is of paramount importance on large-scale fusion experiments because they feature the ability to acquire data from a large number of sensors at various sample rates and provide for flexible data interpretation, presentation, reduction, and analysis. In the Large Coil Test Facility (LCTF) a Digital Equipment Corporation (DEC) PDP-11/60 host computer with the DEC RSX-11M operating system coordinates the activities of five DEC LSI-11/23 front-end processors (FEPs) via direct memory access (DMA) communication links. This provides host control of scheduled data acquisition and FEP event-triggered data collection tasks. Four of the five FEPs have no operating system

  17. Parallelizing Gene Expression Programming Algorithm in Enabling Large-Scale Classification

    Directory of Open Access Journals (Sweden)

    Lixiong Xu

    2017-01-01

    Full Text Available As one of the most effective function mining algorithms, Gene Expression Programming (GEP algorithm has been widely used in classification, pattern recognition, prediction, and other research fields. Based on the self-evolution, GEP is able to mine an optimal function for dealing with further complicated tasks. However, in big data researches, GEP encounters low efficiency issue due to its long time mining processes. To improve the efficiency of GEP in big data researches especially for processing large-scale classification tasks, this paper presents a parallelized GEP algorithm using MapReduce computing model. The experimental results show that the presented algorithm is scalable and efficient for processing large-scale classification tasks.

  18. Large-scale ground motion simulation using GPGPU

    Science.gov (United States)

    Aoi, S.; Maeda, T.; Nishizawa, N.; Aoki, T.

    2012-12-01

    Huge computation resources are required to perform large-scale ground motion simulations using 3-D finite difference method (FDM) for realistic and complex models with high accuracy. Furthermore, thousands of various simulations are necessary to evaluate the variability of the assessment caused by uncertainty of the assumptions of the source models for future earthquakes. To conquer the problem of restricted computational resources, we introduced the use of GPGPU (General purpose computing on graphics processing units) which is the technique of using a GPU as an accelerator of the computation which has been traditionally conducted by the CPU. We employed the CPU version of GMS (Ground motion Simulator; Aoi et al., 2004) as the original code and implemented the function for GPU calculation using CUDA (Compute Unified Device Architecture). GMS is a total system for seismic wave propagation simulation based on 3-D FDM scheme using discontinuous grids (Aoi&Fujiwara, 1999), which includes the solver as well as the preprocessor tools (parameter generation tool) and postprocessor tools (filter tool, visualization tool, and so on). The computational model is decomposed in two horizontal directions and each decomposed model is allocated to a different GPU. We evaluated the performance of our newly developed GPU version of GMS on the TSUBAME2.0 which is one of the Japanese fastest supercomputer operated by the Tokyo Institute of Technology. First we have performed a strong scaling test using the model with about 22 million grids and achieved 3.2 and 7.3 times of the speed-up by using 4 and 16 GPUs. Next, we have examined a weak scaling test where the model sizes (number of grids) are increased in proportion to the degree of parallelism (number of GPUs). The result showed almost perfect linearity up to the simulation with 22 billion grids using 1024 GPUs where the calculation speed reached to 79.7 TFlops and about 34 times faster than the CPU calculation using the same number

  19. Data management, code deployment, and scientific visualization to enhance scientific discovery in fusion research through advanced computing

    International Nuclear Information System (INIS)

    Schissel, D.P.; Finkelstein, A.; Foster, I.T.; Fredian, T.W.; Greenwald, M.J.; Hansen, C.D.; Johnson, C.R.; Keahey, K.; Klasky, S.A.; Li, K.; McCune, D.C.; Peng, Q.; Stevens, R.; Thompson, M.R.

    2002-01-01

    The long-term vision of the Fusion Collaboratory described in this paper is to transform fusion research and accelerate scientific understanding and innovation so as to revolutionize the design of a fusion energy source. The Collaboratory will create and deploy collaborative software tools that will enable more efficient utilization of existing experimental facilities and more effective integration of experiment, theory, and modeling. The computer science research necessary to create the Collaboratory is centered on three activities: security, remote and distributed computing, and scientific visualization. It is anticipated that the presently envisioned Fusion Collaboratory software tools will require 3 years to complete

  20. Final Report - A DEEPER LOOK AT THE VISUALIZATION OF SCIENTIFIC DISCOVERY IN THE FEDERAL CONTEXT

    Energy Technology Data Exchange (ETDEWEB)

    Cozzens, Susan; Lane, Julia

    2008-09-12

    The visualization of scientific discovery has reached an intriguing point of development. Researchers in the field are producing fascinating representations that are catching the attention of program officers and policymakers. The accomplishments are being driven by the availability of both large scale data sets and the computing power and algorithms to analyze them.

  1. Accelerating large-scale protein structure alignments with graphics processing units

    Directory of Open Access Journals (Sweden)

    Pang Bin

    2012-02-01

    Full Text Available Abstract Background Large-scale protein structure alignment, an indispensable tool to structural bioinformatics, poses a tremendous challenge on computational resources. To ensure structure alignment accuracy and efficiency, efforts have been made to parallelize traditional alignment algorithms in grid environments. However, these solutions are costly and of limited accessibility. Others trade alignment quality for speedup by using high-level characteristics of structure fragments for structure comparisons. Findings We present ppsAlign, a parallel protein structure Alignment framework designed and optimized to exploit the parallelism of Graphics Processing Units (GPUs. As a general-purpose GPU platform, ppsAlign could take many concurrent methods, such as TM-align and Fr-TM-align, into the parallelized algorithm design. We evaluated ppsAlign on an NVIDIA Tesla C2050 GPU card, and compared it with existing software solutions running on an AMD dual-core CPU. We observed a 36-fold speedup over TM-align, a 65-fold speedup over Fr-TM-align, and a 40-fold speedup over MAMMOTH. Conclusions ppsAlign is a high-performance protein structure alignment tool designed to tackle the computational complexity issues from protein structural data. The solution presented in this paper allows large-scale structure comparisons to be performed using massive parallel computing power of GPU.

  2. Implementation of Grid-computing Framework for Simulation in Multi-scale Structural Analysis

    Directory of Open Access Journals (Sweden)

    Data Iranata

    2010-05-01

    Full Text Available A new grid-computing framework for simulation in multi-scale structural analysis is presented. Two levels of parallel processing will be involved in this framework: multiple local distributed computing environments connected by local network to form a grid-based cluster-to-cluster distributed computing environment. To successfully perform the simulation, a large-scale structural system task is decomposed into the simulations of a simplified global model and several detailed component models using various scales. These correlated multi-scale structural system tasks are distributed among clusters and connected together in a multi-level hierarchy and then coordinated over the internet. The software framework for supporting the multi-scale structural simulation approach is also presented. The program architecture design allows the integration of several multi-scale models as clients and servers under a single platform. To check its feasibility, a prototype software system has been designed and implemented to perform the proposed concept. The simulation results show that the software framework can increase the speedup performance of the structural analysis. Based on this result, the proposed grid-computing framework is suitable to perform the simulation of the multi-scale structural analysis.

  3. Automated Topographic Change Detection via Dem Differencing at Large Scales Using The Arcticdem Database

    Science.gov (United States)

    Candela, S. G.; Howat, I.; Noh, M. J.; Porter, C. C.; Morin, P. J.

    2016-12-01

    In the last decade, high resolution satellite imagery has become an increasingly accessible tool for geoscientists to quantify changes in the Arctic land surface due to geophysical, ecological and anthropomorphic processes. However, the trade off between spatial coverage and spatial-temporal resolution has limited detailed, process-level change detection over large (i.e. continental) scales. The ArcticDEM project utilized over 300,000 Worldview image pairs to produce a nearly 100% coverage elevation model (above 60°N) offering the first polar, high spatial - high resolution (2-8m by region) dataset, often with multiple repeats in areas of particular interest to geo-scientists. A dataset of this size (nearly 250 TB) offers endless new avenues of scientific inquiry, but quickly becomes unmanageable computationally and logistically for the computing resources available to the average scientist. Here we present TopoDiff, a framework for a generalized. automated workflow that requires minimal input from the end user about a study site, and utilizes cloud computing resources to provide a temporally sorted and differenced dataset, ready for geostatistical analysis. This hands-off approach allows the end user to focus on the science, without having to manage thousands of files, or petabytes of data. At the same time, TopoDiff provides a consistent and accurate workflow for image sorting, selection, and co-registration enabling cross-comparisons between research projects.

  4. Final Report: Large-Scale Optimization for Bayesian Inference in Complex Systems

    Energy Technology Data Exchange (ETDEWEB)

    Ghattas, Omar [The University of Texas at Austin

    2013-10-15

    The SAGUARO (Scalable Algorithms for Groundwater Uncertainty Analysis and Robust Optimiza- tion) Project focuses on the development of scalable numerical algorithms for large-scale Bayesian inversion in complex systems that capitalize on advances in large-scale simulation-based optimiza- tion and inversion methods. Our research is directed in three complementary areas: efficient approximations of the Hessian operator, reductions in complexity of forward simulations via stochastic spectral approximations and model reduction, and employing large-scale optimization concepts to accelerate sampling. Our efforts are integrated in the context of a challenging testbed problem that considers subsurface reacting flow and transport. The MIT component of the SAGUARO Project addresses the intractability of conventional sampling methods for large-scale statistical inverse problems by devising reduced-order models that are faithful to the full-order model over a wide range of parameter values; sampling then employs the reduced model rather than the full model, resulting in very large computational savings. Results indicate little effect on the computed posterior distribution. On the other hand, in the Texas-Georgia Tech component of the project, we retain the full-order model, but exploit inverse problem structure (adjoint-based gradients and partial Hessian information of the parameter-to- observation map) to implicitly extract lower dimensional information on the posterior distribution; this greatly speeds up sampling methods, so that fewer sampling points are needed. We can think of these two approaches as "reduce then sample" and "sample then reduce." In fact, these two approaches are complementary, and can be used in conjunction with each other. Moreover, they both exploit deterministic inverse problem structure, in the form of adjoint-based gradient and Hessian information of the underlying parameter-to-observation map, to achieve their speedups.

  5. Dose monitoring in large-scale flowing aqueous media

    International Nuclear Information System (INIS)

    Kuruca, C.N.

    1995-01-01

    The Miami Electron Beam Research Facility (EBRF) has been in operation for six years. The EBRF houses a 1.5 MV, 75 KW DC scanned electron beam. Experiments have been conducted to evaluate the effectiveness of high-energy electron irradiation in the removal of toxic organic chemicals from contaminated water and the disinfection of various wastewater streams. The large-scale plant operates at approximately 450 L/min (120 gal/min). The radiation dose absorbed by the flowing aqueous streams is estimated by measuring the difference in water temperature before and after it passes in front of the beam. Temperature measurements are made using resistance temperature devices (RTDs) and recorded by computer along with other operating parameters. Estimated dose is obtained from the measured temperature differences using the specific heat of water. This presentation will discuss experience with this measurement system, its application to different water presentation devices, sources of error, and the advantages and disadvantages of its use in large-scale process applications

  6. The InSAR Scientific Computing Environment (ISCE): A Python Framework for Earth Science

    Science.gov (United States)

    Rosen, P. A.; Gurrola, E. M.; Agram, P. S.; Sacco, G. F.; Lavalle, M.

    2015-12-01

    The InSAR Scientific Computing Environment (ISCE, funded by NASA ESTO) provides a modern computing framework for geodetic image processing of InSAR data from a diverse array of radar satellites and aircraft. ISCE is both a modular, flexible, and extensible framework for building software components and applications as well as a toolbox of applications for processing raw or focused InSAR and Polarimetric InSAR data. The ISCE framework contains object-oriented Python components layered to construct Python InSAR components that manage legacy Fortran/C InSAR programs. Components are independently configurable in a layered manner to provide maximum control. Polymorphism is used to define a workflow in terms of abstract facilities for each processing step that are realized by specific components at run-time. This enables a single workflow to work on either raw or focused data from all sensors. ISCE can serve as the core of a production center to process Level-0 radar data to Level-3 products, but is amenable to interactive processing approaches that allow scientists to experiment with data to explore new ways of doing science with InSAR data. The NASA-ISRO SAR (NISAR) Mission will deliver data of unprecedented quantity and quality, making possible global-scale studies in climate research, natural hazards, and Earth's ecosystems. ISCE is planned as the foundational element in processing NISAR data, enabling a new class of analyses that take greater advantage of the long time and large spatial scales of these new data. NISAR will be but one mission in a constellation of radar satellites in the future delivering such data. ISCE currently supports all publicly available strip map mode space-borne SAR data since ERS and is expected to include support for upcoming missions. ISCE has been incorporated into two prototype cloud-based systems that have demonstrated its elasticity in addressing larger data processing problems in a "production" context and its ability to be

  7. Scholarly literature and the press: scientific impact and social perception of physics computing

    CERN Document Server

    Pia, Maria Grazia; Bell, Zane W; Dressendorfer, Paul V

    2014-01-01

    The broad coverage of the search for the Higgs boson in the mainstream media is a relative novelty for high energy physics (HEP) research, whose achievements have traditionally been limited to scholarly literature. This paper illustrates the results of a scientometric analysis of HEP computing in scientific literature, institutional media and the press, and a comparative overview of similar metrics concerning representative particle physics measurements. The picture emerging from these scientometric data documents the scientific impact and social perception of HEP computing. The results of this analysis suggest that improved communication of the scientific and social role of HEP computing would be beneficial to the high energy physics community.

  8. Large-scale simulation of ductile fracture process of microstructured materials

    International Nuclear Information System (INIS)

    Tian Rong; Wang Chaowei

    2011-01-01

    The promise of computational science in the extreme-scale computing era is to reduce and decompose macroscopic complexities into microscopic simplicities with the expense of high spatial and temporal resolution of computing. In materials science and engineering, the direct combination of 3D microstructure data sets and 3D large-scale simulations provides unique opportunity for the development of a comprehensive understanding of nano/microstructure-property relationships in order to systematically design materials with specific desired properties. In the paper, we present a framework simulating the ductile fracture process zone in microstructural detail. The experimentally reconstructed microstructural data set is directly embedded into a FE mesh model to improve the simulation fidelity of microstructure effects on fracture toughness. To the best of our knowledge, it is for the first time that the linking of fracture toughness to multiscale microstructures in a realistic 3D numerical model in a direct manner is accomplished. (author)

  9. What makes computational open source software libraries successful?

    International Nuclear Information System (INIS)

    Bangerth, Wolfgang; Heister, Timo

    2013-01-01

    Software is the backbone of scientific computing. Yet, while we regularly publish detailed accounts about the results of scientific software, and while there is a general sense of which numerical methods work well, our community is largely unaware of best practices in writing the large-scale, open source scientific software upon which our discipline rests. This is particularly apparent in the commonly held view that writing successful software packages is largely the result of simply ‘being a good programmer’ when in fact there are many other factors involved, for example the social skill of community building. In this paper, we consider what we have found to be the necessary ingredients for successful scientific software projects and, in particular, for software libraries upon which the vast majority of scientific codes are built today. In particular, we discuss the roles of code, documentation, communities, project management and licenses. We also briefly comment on the impact on academic careers of engaging in software projects. (paper)

  10. What makes computational open source software libraries successful?

    Science.gov (United States)

    Bangerth, Wolfgang; Heister, Timo

    2013-01-01

    Software is the backbone of scientific computing. Yet, while we regularly publish detailed accounts about the results of scientific software, and while there is a general sense of which numerical methods work well, our community is largely unaware of best practices in writing the large-scale, open source scientific software upon which our discipline rests. This is particularly apparent in the commonly held view that writing successful software packages is largely the result of simply ‘being a good programmer’ when in fact there are many other factors involved, for example the social skill of community building. In this paper, we consider what we have found to be the necessary ingredients for successful scientific software projects and, in particular, for software libraries upon which the vast majority of scientific codes are built today. In particular, we discuss the roles of code, documentation, communities, project management and licenses. We also briefly comment on the impact on academic careers of engaging in software projects.

  11. Fast selection of miRNA candidates based on large-scale pre-computed MFE sets of randomized sequences.

    Science.gov (United States)

    Warris, Sven; Boymans, Sander; Muiser, Iwe; Noback, Michiel; Krijnen, Wim; Nap, Jan-Peter

    2014-01-13

    Small RNAs are important regulators of genome function, yet their prediction in genomes is still a major computational challenge. Statistical analyses of pre-miRNA sequences indicated that their 2D structure tends to have a minimal free energy (MFE) significantly lower than MFE values of equivalently randomized sequences with the same nucleotide composition, in contrast to other classes of non-coding RNA. The computation of many MFEs is, however, too intensive to allow for genome-wide screenings. Using a local grid infrastructure, MFE distributions of random sequences were pre-calculated on a large scale. These distributions follow a normal distribution and can be used to determine the MFE distribution for any given sequence composition by interpolation. It allows on-the-fly calculation of the normal distribution for any candidate sequence composition. The speedup achieved makes genome-wide screening with this characteristic of a pre-miRNA sequence practical. Although this particular property alone will not be able to distinguish miRNAs from other sequences sufficiently discriminative, the MFE-based P-value should be added to the parameters of choice to be included in the selection of potential miRNA candidates for experimental verification.

  12. Large scale analysis of signal reachability.

    Science.gov (United States)

    Todor, Andrei; Gabr, Haitham; Dobra, Alin; Kahveci, Tamer

    2014-06-15

    Major disorders, such as leukemia, have been shown to alter the transcription of genes. Understanding how gene regulation is affected by such aberrations is of utmost importance. One promising strategy toward this objective is to compute whether signals can reach to the transcription factors through the transcription regulatory network (TRN). Due to the uncertainty of the regulatory interactions, this is a #P-complete problem and thus solving it for very large TRNs remains to be a challenge. We develop a novel and scalable method to compute the probability that a signal originating at any given set of source genes can arrive at any given set of target genes (i.e., transcription factors) when the topology of the underlying signaling network is uncertain. Our method tackles this problem for large networks while providing a provably accurate result. Our method follows a divide-and-conquer strategy. We break down the given network into a sequence of non-overlapping subnetworks such that reachability can be computed autonomously and sequentially on each subnetwork. We represent each interaction using a small polynomial. The product of these polynomials express different scenarios when a signal can or cannot reach to target genes from the source genes. We introduce polynomial collapsing operators for each subnetwork. These operators reduce the size of the resulting polynomial and thus the computational complexity dramatically. We show that our method scales to entire human regulatory networks in only seconds, while the existing methods fail beyond a few tens of genes and interactions. We demonstrate that our method can successfully characterize key reachability characteristics of the entire transcriptions regulatory networks of patients affected by eight different subtypes of leukemia, as well as those from healthy control samples. All the datasets and code used in this article are available at bioinformatics.cise.ufl.edu/PReach/scalable.htm. © The Author 2014

  13. Developing eThread Pipeline Using SAGA-Pilot Abstraction for Large-Scale Structural Bioinformatics

    Directory of Open Access Journals (Sweden)

    Anjani Ragothaman

    2014-01-01

    Full Text Available While most of computational annotation approaches are sequence-based, threading methods are becoming increasingly attractive because of predicted structural information that could uncover the underlying function. However, threading tools are generally compute-intensive and the number of protein sequences from even small genomes such as prokaryotes is large typically containing many thousands, prohibiting their application as a genome-wide structural systems biology tool. To leverage its utility, we have developed a pipeline for eThread—a meta-threading protein structure modeling tool, that can use computational resources efficiently and effectively. We employ a pilot-based approach that supports seamless data and task-level parallelism and manages large variation in workload and computational requirements. Our scalable pipeline is deployed on Amazon EC2 and can efficiently select resources based upon task requirements. We present runtime analysis to characterize computational complexity of eThread and EC2 infrastructure. Based on results, we suggest a pathway to an optimized solution with respect to metrics such as time-to-solution or cost-to-solution. Our eThread pipeline can scale to support a large number of sequences and is expected to be a viable solution for genome-scale structural bioinformatics and structure-based annotation, particularly, amenable for small genomes such as prokaryotes. The developed pipeline is easily extensible to other types of distributed cyberinfrastructure.

  14. GPU-based large-scale visualization

    KAUST Repository

    Hadwiger, Markus

    2013-11-19

    Recent advances in image and volume acquisition as well as computational advances in simulation have led to an explosion of the amount of data that must be visualized and analyzed. Modern techniques combine the parallel processing power of GPUs with out-of-core methods and data streaming to enable the interactive visualization of giga- and terabytes of image and volume data. A major enabler for interactivity is making both the computational and the visualization effort proportional to the amount of data that is actually visible on screen, decoupling it from the full data size. This leads to powerful display-aware multi-resolution techniques that enable the visualization of data of almost arbitrary size. The course consists of two major parts: An introductory part that progresses from fundamentals to modern techniques, and a more advanced part that discusses details of ray-guided volume rendering, novel data structures for display-aware visualization and processing, and the remote visualization of large online data collections. You will learn how to develop efficient GPU data structures and large-scale visualizations, implement out-of-core strategies and concepts such as virtual texturing that have only been employed recently, as well as how to use modern multi-resolution representations. These approaches reduce the GPU memory requirements of extremely large data to a working set size that fits into current GPUs. You will learn how to perform ray-casting of volume data of almost arbitrary size and how to render and process gigapixel images using scalable, display-aware techniques. We will describe custom virtual texturing architectures as well as recent hardware developments in this area. We will also describe client/server systems for distributed visualization, on-demand data processing and streaming, and remote visualization. We will describe implementations using OpenGL as well as CUDA, exploiting parallelism on GPUs combined with additional asynchronous

  15. Hi-Corrector: a fast, scalable and memory-efficient package for normalizing large-scale Hi-C data.

    Science.gov (United States)

    Li, Wenyuan; Gong, Ke; Li, Qingjiao; Alber, Frank; Zhou, Xianghong Jasmine

    2015-03-15

    Genome-wide proximity ligation assays, e.g. Hi-C and its variant TCC, have recently become important tools to study spatial genome organization. Removing biases from chromatin contact matrices generated by such techniques is a critical preprocessing step of subsequent analyses. The continuing decline of sequencing costs has led to an ever-improving resolution of the Hi-C data, resulting in very large matrices of chromatin contacts. Such large-size matrices, however, pose a great challenge on the memory usage and speed of its normalization. Therefore, there is an urgent need for fast and memory-efficient methods for normalization of Hi-C data. We developed Hi-Corrector, an easy-to-use, open source implementation of the Hi-C data normalization algorithm. Its salient features are (i) scalability-the software is capable of normalizing Hi-C data of any size in reasonable times; (ii) memory efficiency-the sequential version can run on any single computer with very limited memory, no matter how little; (iii) fast speed-the parallel version can run very fast on multiple computing nodes with limited local memory. The sequential version is implemented in ANSI C and can be easily compiled on any system; the parallel version is implemented in ANSI C with the MPI library (a standardized and portable parallel environment designed for solving large-scale scientific problems). The package is freely available at http://zhoulab.usc.edu/Hi-Corrector/. © The Author 2014. Published by Oxford University Press.

  16. Lattice QCD - a challenge in large scale computing

    International Nuclear Information System (INIS)

    Schilling, K.

    1987-01-01

    The computation of the hadron spectrum within the framework of lattice QCD sets a demanding goal for the application of supercomputers in basic science. It requires both big computer capacities and clever algorithms to fight all the numerical evils that one encounters in the Euclidean space-time-world. The talk will attempt to introduce to the present state of the art of spectrum calculations by lattice simulations. (orig.)

  17. A quasi-Newton algorithm for large-scale nonlinear equations

    Directory of Open Access Journals (Sweden)

    Linghua Huang

    2017-02-01

    Full Text Available Abstract In this paper, the algorithm for large-scale nonlinear equations is designed by the following steps: (i a conjugate gradient (CG algorithm is designed as a sub-algorithm to obtain the initial points of the main algorithm, where the sub-algorithm’s initial point does not have any restrictions; (ii a quasi-Newton algorithm with the initial points given by sub-algorithm is defined as main algorithm, where a new nonmonotone line search technique is presented to get the step length α k $\\alpha_{k}$ . The given nonmonotone line search technique can avoid computing the Jacobian matrix. The global convergence and the 1 + q $1+q$ -order convergent rate of the main algorithm are established under suitable conditions. Numerical results show that the proposed method is competitive with a similar method for large-scale problems.

  18. Findings and Challenges in Fine-Resolution Large-Scale Hydrological Modeling

    Science.gov (United States)

    Her, Y. G.

    2017-12-01

    Fine-resolution large-scale (FL) modeling can provide the overall picture of the hydrological cycle and transport while taking into account unique local conditions in the simulation. It can also help develop water resources management plans consistent across spatial scales by describing the spatial consequences of decisions and hydrological events extensively. FL modeling is expected to be common in the near future as global-scale remotely sensed data are emerging, and computing resources have been advanced rapidly. There are several spatially distributed models available for hydrological analyses. Some of them rely on numerical methods such as finite difference/element methods (FDM/FEM), which require excessive computing resources (implicit scheme) to manipulate large matrices or small simulation time intervals (explicit scheme) to maintain the stability of the solution, to describe two-dimensional overland processes. Others make unrealistic assumptions such as constant overland flow velocity to reduce the computational loads of the simulation. Thus, simulation efficiency often comes at the expense of precision and reliability in FL modeling. Here, we introduce a new FL continuous hydrological model and its application to four watersheds in different landscapes and sizes from 3.5 km2 to 2,800 km2 at the spatial resolution of 30 m on an hourly basis. The model provided acceptable accuracy statistics in reproducing hydrological observations made in the watersheds. The modeling outputs including the maps of simulated travel time, runoff depth, soil water content, and groundwater recharge, were animated, visualizing the dynamics of hydrological processes occurring in the watersheds during and between storm events. Findings and challenges were discussed in the context of modeling efficiency, accuracy, and reproducibility, which we found can be improved by employing advanced computing techniques and hydrological understandings, by using remotely sensed hydrological

  19. Final Report: Quantification of Uncertainty in Extreme Scale Computations (QUEST)

    Energy Technology Data Exchange (ETDEWEB)

    Marzouk, Youssef [Massachusetts Inst. of Technology (MIT), Cambridge, MA (United States); Conrad, Patrick [Massachusetts Inst. of Technology (MIT), Cambridge, MA (United States); Bigoni, Daniele [Massachusetts Inst. of Technology (MIT), Cambridge, MA (United States); Parno, Matthew [Massachusetts Inst. of Technology (MIT), Cambridge, MA (United States)

    2017-06-09

    QUEST (\\url{www.quest-scidac.org}) is a SciDAC Institute that is focused on uncertainty quantification (UQ) in large-scale scientific computations. Our goals are to (1) advance the state of the art in UQ mathematics, algorithms, and software; and (2) provide modeling, algorithmic, and general UQ expertise, together with software tools, to other SciDAC projects, thereby enabling and guiding a broad range of UQ activities in their respective contexts. QUEST is a collaboration among six institutions (Sandia National Laboratories, Los Alamos National Laboratory, the University of Southern California, Massachusetts Institute of Technology, the University of Texas at Austin, and Duke University) with a history of joint UQ research. Our vision encompasses all aspects of UQ in leadership-class computing. This includes the well-founded setup of UQ problems; characterization of the input space given available data/information; local and global sensitivity analysis; adaptive dimensionality and order reduction; forward and inverse propagation of uncertainty; handling of application code failures, missing data, and hardware/software fault tolerance; and model inadequacy, comparison, validation, selection, and averaging. The nature of the UQ problem requires the seamless combination of data, models, and information across this landscape in a manner that provides a self-consistent quantification of requisite uncertainties in predictions from computational models. Accordingly, our UQ methods and tools span an interdisciplinary space across applied math, information theory, and statistics. The MIT QUEST effort centers on statistical inference and methods for surrogate or reduced-order modeling. MIT personnel have been responsible for the development of adaptive sampling methods, methods for approximating computationally intensive models, and software for both forward uncertainty propagation and statistical inverse problems. A key software product of the MIT QUEST effort is the MIT

  20. Discriminative Hierarchical K-Means Tree for Large-Scale Image Classification.

    Science.gov (United States)

    Chen, Shizhi; Yang, Xiaodong; Tian, Yingli

    2015-09-01

    A key challenge in large-scale image classification is how to achieve efficiency in terms of both computation and memory without compromising classification accuracy. The learning-based classifiers achieve the state-of-the-art accuracies, but have been criticized for the computational complexity that grows linearly with the number of classes. The nonparametric nearest neighbor (NN)-based classifiers naturally handle large numbers of categories, but incur prohibitively expensive computation and memory costs. In this brief, we present a novel classification scheme, i.e., discriminative hierarchical K-means tree (D-HKTree), which combines the advantages of both learning-based and NN-based classifiers. The complexity of the D-HKTree only grows sublinearly with the number of categories, which is much better than the recent hierarchical support vector machines-based methods. The memory requirement is the order of magnitude less than the recent Naïve Bayesian NN-based approaches. The proposed D-HKTree classification scheme is evaluated on several challenging benchmark databases and achieves the state-of-the-art accuracies, while with significantly lower computation cost and memory requirement.

  1. Auto-Scaling of Geo-Based Image Processing in an OpenStack Cloud Computing Environment

    Directory of Open Access Journals (Sweden)

    Sanggoo Kang

    2016-08-01

    Full Text Available Cloud computing is a base platform for the distribution of large volumes of data and high-performance image processing on the Web. Despite wide applications in Web-based services and their many benefits, geo-spatial applications based on cloud computing technology are still developing. Auto-scaling realizes automatic scalability, i.e., the scale-out and scale-in processing of virtual servers in a cloud computing environment. This study investigates the applicability of auto-scaling to geo-based image processing algorithms by comparing the performance of a single virtual server and multiple auto-scaled virtual servers under identical experimental conditions. In this study, the cloud computing environment is built with OpenStack, and four algorithms from the Orfeo toolbox are used for practical geo-based image processing experiments. The auto-scaling results from all experimental performance tests demonstrate applicable significance with respect to cloud utilization concerning response time. Auto-scaling contributes to the development of web-based satellite image application services using cloud-based technologies.

  2. Large Data at Small Universities: Astronomical processing using a computer classroom

    Science.gov (United States)

    Fuller, Nathaniel James; Clarkson, William I.; Fluharty, Bill; Belanger, Zach; Dage, Kristen

    2016-06-01

    The use of large computing clusters for astronomy research is becoming more commonplace as datasets expand, but access to these required resources is sometimes difficult for research groups working at smaller Universities. As an alternative to purchasing processing time on an off-site computing cluster, or purchasing dedicated hardware, we show how one can easily build a crude on-site cluster by utilizing idle cycles on instructional computers in computer-lab classrooms. Since these computers are maintained as part of the educational mission of the University, the resource impact on the investigator is generally low.By using open source Python routines, it is possible to have a large number of desktop computers working together via a local network to sort through large data sets. By running traditional analysis routines in an “embarrassingly parallel” manner, gains in speed are accomplished without requiring the investigator to learn how to write routines using highly specialized methodology. We demonstrate this concept here applied to 1. photometry of large-format images and 2. Statistical significance-tests for X-ray lightcurve analysis. In these scenarios, we see a speed-up factor which scales almost linearly with the number of cores in the cluster. Additionally, we show that the usage of the cluster does not severely limit performance for a local user, and indeed the processing can be performed while the computers are in use for classroom purposes.

  3. Blueprint for a microwave trapped ion quantum computer.

    Science.gov (United States)

    Lekitsch, Bjoern; Weidt, Sebastian; Fowler, Austin G; Mølmer, Klaus; Devitt, Simon J; Wunderlich, Christof; Hensinger, Winfried K

    2017-02-01

    The availability of a universal quantum computer may have a fundamental impact on a vast number of research fields and on society as a whole. An increasingly large scientific and industrial community is working toward the realization of such a device. An arbitrarily large quantum computer may best be constructed using a modular approach. We present a blueprint for a trapped ion-based scalable quantum computer module, making it possible to create a scalable quantum computer architecture based on long-wavelength radiation quantum gates. The modules control all operations as stand-alone units, are constructed using silicon microfabrication techniques, and are within reach of current technology. To perform the required quantum computations, the modules make use of long-wavelength radiation-based quantum gate technology. To scale this microwave quantum computer architecture to a large size, we present a fully scalable design that makes use of ion transport between different modules, thereby allowing arbitrarily many modules to be connected to construct a large-scale device. A high error-threshold surface error correction code can be implemented in the proposed architecture to execute fault-tolerant operations. With appropriate adjustments, the proposed modules are also suitable for alternative trapped ion quantum computer architectures, such as schemes using photonic interconnects.

  4. The Emergence of Large-Scale Computer Assisted Summative Examination Facilities in Higher Education

    NARCIS (Netherlands)

    Draaijer, S.; Warburton, W. I.

    2014-01-01

    A case study is presented of VU University Amsterdam where a dedicated large-scale CAA examination facility was established. In the facility, 385 students can take an exam concurrently. The case study describes the change factors and processes leading up to the decision by the institution to

  5. An Axiomatic Analysis Approach for Large-Scale Disaster-Tolerant Systems Modeling

    Directory of Open Access Journals (Sweden)

    Theodore W. Manikas

    2011-02-01

    Full Text Available Disaster tolerance in computing and communications systems refers to the ability to maintain a degree of functionality throughout the occurrence of a disaster. We accomplish the incorporation of disaster tolerance within a system by simulating various threats to the system operation and identifying areas for system redesign. Unfortunately, extremely large systems are not amenable to comprehensive simulation studies due to the large computational complexity requirements. To address this limitation, an axiomatic approach that decomposes a large-scale system into smaller subsystems is developed that allows the subsystems to be independently modeled. This approach is implemented using a data communications network system example. The results indicate that the decomposition approach produces simulation responses that are similar to the full system approach, but with greatly reduced simulation time.

  6. Exploiting large-scale correlations to detect continuous gravitational waves.

    Science.gov (United States)

    Pletsch, Holger J; Allen, Bruce

    2009-10-30

    Fully coherent searches (over realistic ranges of parameter space and year-long observation times) for unknown sources of continuous gravitational waves are computationally prohibitive. Less expensive hierarchical searches divide the data into shorter segments which are analyzed coherently, then detection statistics from different segments are combined incoherently. The novel method presented here solves the long-standing problem of how best to do the incoherent combination. The optimal solution exploits large-scale parameter-space correlations in the coherent detection statistic. Application to simulated data shows dramatic sensitivity improvements compared with previously available (ad hoc) methods, increasing the spatial volume probed by more than 2 orders of magnitude at lower computational cost.

  7. Application of Logic Models in a Large Scientific Research Program

    Science.gov (United States)

    O'Keefe, Christine M.; Head, Richard J.

    2011-01-01

    It is the purpose of this article to discuss the development and application of a logic model in the context of a large scientific research program within the Commonwealth Scientific and Industrial Research Organisation (CSIRO). CSIRO is Australia's national science agency and is a publicly funded part of Australia's innovation system. It conducts…

  8. Sentinel-1 data massive processing for large scale DInSAR analyses within Cloud Computing environments through the P-SBAS approach

    Science.gov (United States)

    Lanari, Riccardo; Bonano, Manuela; Buonanno, Sabatino; Casu, Francesco; De Luca, Claudio; Fusco, Adele; Manunta, Michele; Manzo, Mariarosaria; Pepe, Antonio; Zinno, Ivana

    2017-04-01

    -core programming techniques. Currently, Cloud Computing environments make available large collections of computing resources and storage that can be effectively exploited through the presented S1 P-SBAS processing chain to carry out interferometric analyses at a very large scale, in reduced time. This allows us to deal also with the problems connected to the use of S1 P-SBAS chain in operational contexts, related to hazard monitoring and risk prevention and mitigation, where handling large amounts of data represents a challenging task. As a significant experimental result we performed a large spatial scale SBAS analysis relevant to the Central and Southern Italy by exploiting the Amazon Web Services Cloud Computing platform. In particular, we processed in parallel 300 S1 acquisitions covering the Italian peninsula from Lazio to Sicily through the presented S1 P-SBAS processing chain, generating 710 interferograms, thus finally obtaining the displacement time series of the whole processed area. This work has been partially supported by the CNR-DPC agreement, the H2020 EPOS-IP project (GA 676564) and the ESA GEP project.

  9. A method of orbital analysis for large-scale first-principles simulations

    Energy Technology Data Exchange (ETDEWEB)

    Ohwaki, Tsukuru [Advanced Materials Laboratory, Nissan Research Center, Nissan Motor Co., Ltd., 1 Natsushima-cho, Yokosuka, Kanagawa 237-8523 (Japan); Otani, Minoru [Nanosystem Research Institute, National Institute of Advanced Industrial Science and Technology (AIST), Tsukuba, Ibaraki 305-8568 (Japan); Ozaki, Taisuke [Research Center for Simulation Science (RCSS), Japan Advanced Institute of Science and Technology (JAIST), 1-1 Asahidai, Nomi, Ishikawa 923-1292 (Japan)

    2014-06-28

    An efficient method of calculating the natural bond orbitals (NBOs) based on a truncation of the entire density matrix of a whole system is presented for large-scale density functional theory calculations. The method recovers an orbital picture for O(N) electronic structure methods which directly evaluate the density matrix without using Kohn-Sham orbitals, thus enabling quantitative analysis of chemical reactions in large-scale systems in the language of localized Lewis-type chemical bonds. With the density matrix calculated by either an exact diagonalization or O(N) method, the computational cost is O(1) for the calculation of NBOs associated with a local region where a chemical reaction takes place. As an illustration of the method, we demonstrate how an electronic structure in a local region of interest can be analyzed by NBOs in a large-scale first-principles molecular dynamics simulation for a liquid electrolyte bulk model (propylene carbonate + LiBF{sub 4})

  10. A method of orbital analysis for large-scale first-principles simulations

    International Nuclear Information System (INIS)

    Ohwaki, Tsukuru; Otani, Minoru; Ozaki, Taisuke

    2014-01-01

    An efficient method of calculating the natural bond orbitals (NBOs) based on a truncation of the entire density matrix of a whole system is presented for large-scale density functional theory calculations. The method recovers an orbital picture for O(N) electronic structure methods which directly evaluate the density matrix without using Kohn-Sham orbitals, thus enabling quantitative analysis of chemical reactions in large-scale systems in the language of localized Lewis-type chemical bonds. With the density matrix calculated by either an exact diagonalization or O(N) method, the computational cost is O(1) for the calculation of NBOs associated with a local region where a chemical reaction takes place. As an illustration of the method, we demonstrate how an electronic structure in a local region of interest can be analyzed by NBOs in a large-scale first-principles molecular dynamics simulation for a liquid electrolyte bulk model (propylene carbonate + LiBF 4 )

  11. Scientific computing and algorithms in industrial simulations projects and products of Fraunhofer SCAI

    CERN Document Server

    Schüller, Anton; Schweitzer, Marc

    2017-01-01

    The contributions gathered here provide an overview of current research projects and selected software products of the Fraunhofer Institute for Algorithms and Scientific Computing SCAI. They show the wide range of challenges that scientific computing currently faces, the solutions it offers, and its important role in developing applications for industry. Given the exciting field of applied collaborative research and development it discusses, the book will appeal to scientists, practitioners, and students alike. The Fraunhofer Institute for Algorithms and Scientific Computing SCAI combines excellent research and application-oriented development to provide added value for our partners. SCAI develops numerical techniques, parallel algorithms and specialized software tools to support and optimize industrial simulations. Moreover, it implements custom software solutions for production and logistics, and offers calculations on high-performance computers. Its services and products are based on state-of-the-art metho...

  12. Domain analysis of computational science - Fifty years of a scientific computing group

    Energy Technology Data Exchange (ETDEWEB)

    Tanaka, M.

    2010-02-23

    I employed bibliometric- and historical-methods to study the domain of the Scientific Computing group at Brookhaven National Laboratory (BNL) for an extended period of fifty years, from 1958 to 2007. I noted and confirmed the growing emergence of interdisciplinarity within the group. I also identified a strong, consistent mathematics and physics orientation within it.

  13. Planning under uncertainty solving large-scale stochastic linear programs

    Energy Technology Data Exchange (ETDEWEB)

    Infanger, G. [Stanford Univ., CA (United States). Dept. of Operations Research]|[Technische Univ., Vienna (Austria). Inst. fuer Energiewirtschaft

    1992-12-01

    For many practical problems, solutions obtained from deterministic models are unsatisfactory because they fail to hedge against certain contingencies that may occur in the future. Stochastic models address this shortcoming, but up to recently seemed to be intractable due to their size. Recent advances both in solution algorithms and in computer technology now allow us to solve important and general classes of practical stochastic problems. We show how large-scale stochastic linear programs can be efficiently solved by combining classical decomposition and Monte Carlo (importance) sampling techniques. We discuss the methodology for solving two-stage stochastic linear programs with recourse, present numerical results of large problems with numerous stochastic parameters, show how to efficiently implement the methodology on a parallel multi-computer and derive the theory for solving a general class of multi-stage problems with dependency of the stochastic parameters within a stage and between different stages.

  14. New tools to aid in scientific computing and visualization

    International Nuclear Information System (INIS)

    Wallace, M.G.; Christian-Frear, T.L.

    1992-01-01

    In this paper, two computer programs are described which aid in the pre- and post-processing of computer generated data. CoMeT (Computational Mechanics Toolkit) is a customizable, interactive, graphical, menu-driven program that provides the analyst with a consistent user-friendly interface to analysis codes. Trans Vol (Transparent Volume Visualization) is a specialized tool for the scientific three-dimensional visualization of complex solids by the technique of volume rendering. Both tools are described in basic detail along with an application example concerning the simulation of contaminant migration from an underground nuclear repository

  15. Solving Large-Scale Computational Problems Using Insights from Statistical Physics

    Energy Technology Data Exchange (ETDEWEB)

    Selman, Bart [Cornell University

    2012-02-29

    Many challenging problems in computer science and related fields can be formulated as constraint satisfaction problems. Such problems consist of a set of discrete variables and a set of constraints between those variables, and represent a general class of so-called NP-complete problems. The goal is to find a value assignment to the variables that satisfies all constraints, generally requiring a search through and exponentially large space of variable-value assignments. Models for disordered systems, as studied in statistical physics, can provide important new insights into the nature of constraint satisfaction problems. Recently, work in this area has resulted in the discovery of a new method for solving such problems, called the survey propagation (SP) method. With SP, we can solve problems with millions of variables and constraints, an improvement of two orders of magnitude over previous methods.

  16. 3D large-scale calculations using the method of characteristics

    International Nuclear Information System (INIS)

    Dahmani, M.; Roy, R.; Koclas, J.

    2004-01-01

    An overview of the computational requirements and the numerical developments made in order to be able to solve 3D large-scale problems using the characteristics method will be presented. To accelerate the MCI solver, efficient acceleration techniques were implemented and parallelization was performed. However, for the very large problems, the size of the tracking file used to store the tracks can still become prohibitive and exceed the capacity of the machine. The new 3D characteristics solver MCG will now be introduced. This methodology is dedicated to solve very large 3D problems (a part or a whole core) without spatial homogenization. In order to eliminate the input/output problems occurring when solving these large problems, we define a new computing scheme that requires more CPU resources than the usual one, based on sweeps over large tracking files. The huge capacity of storage needed in some problems and the related I/O queries needed by the characteristics solver are replaced by on-the-fly recalculation of tracks at each iteration step. Using this technique, large 3D problems are no longer I/O-bound, and distributed CPU resources can be efficiently used. (author)

  17. Decentralized Large-Scale Power Balancing

    DEFF Research Database (Denmark)

    Halvgaard, Rasmus; Jørgensen, John Bagterp; Poulsen, Niels Kjølstad

    2013-01-01

    problem is formulated as a centralized large-scale optimization problem but is then decomposed into smaller subproblems that are solved locally by each unit connected to an aggregator. For large-scale systems the method is faster than solving the full problem and can be distributed to include an arbitrary...

  18. Google Earth Engine: a new cloud-computing platform for global-scale earth observation data and analysis

    Science.gov (United States)

    Moore, R. T.; Hansen, M. C.

    2011-12-01

    Google Earth Engine is a new technology platform that enables monitoring and measurement of changes in the earth's environment, at planetary scale, on a large catalog of earth observation data. The platform offers intrinsically-parallel computational access to thousands of computers in Google's data centers. Initial efforts have focused primarily on global forest monitoring and measurement, in support of REDD+ activities in the developing world. The intent is to put this platform into the hands of scientists and developing world nations, in order to advance the broader operational deployment of existing scientific methods, and strengthen the ability for public institutions and civil society to better understand, manage and report on the state of their natural resources. Earth Engine currently hosts online nearly the complete historical Landsat archive of L5 and L7 data collected over more than twenty-five years. Newly-collected Landsat imagery is downloaded from USGS EROS Center into Earth Engine on a daily basis. Earth Engine also includes a set of historical and current MODIS data products. The platform supports generation, on-demand, of spatial and temporal mosaics, "best-pixel" composites (for example to remove clouds and gaps in satellite imagery), as well as a variety of spectral indices. Supervised learning methods are available over the Landsat data catalog. The platform also includes a new application programming framework, or "API", that allows scientists access to these computational and data resources, to scale their current algorithms or develop new ones. Under the covers of the Google Earth Engine API is an intrinsically-parallel image-processing system. Several forest monitoring applications powered by this API are currently in development and expected to be operational in 2011. Combining science with massive data and technology resources in a cloud-computing framework can offer advantages of computational speed, ease-of-use and collaboration, as

  19. A Classification Framework for Large-Scale Face Recognition Systems

    OpenAIRE

    Zhou, Ziheng; Deravi, Farzin

    2009-01-01

    This paper presents a generic classification framework for large-scale face recognition systems. Within the framework, a data sampling strategy is proposed to tackle the data imbalance when image pairs are sampled from thousands of face images for preparing a training dataset. A modified kernel Fisher discriminant classifier is proposed to make it computationally feasible to train the kernel-based classification method using tens of thousands of training samples. The framework is tested in an...

  20. Automating large-scale reactor systems

    International Nuclear Information System (INIS)

    Kisner, R.A.

    1985-01-01

    This paper conveys a philosophy for developing automated large-scale control systems that behave in an integrated, intelligent, flexible manner. Methods for operating large-scale systems under varying degrees of equipment degradation are discussed, and a design approach that separates the effort into phases is suggested. 5 refs., 1 fig

  1. Scholarly literature and the press: scientific impact and social perception of physics computing

    International Nuclear Information System (INIS)

    Pia, M G; Basaglia, T; Bell, Z W; Dressendorfer, P V

    2014-01-01

    The broad coverage of the search for the Higgs boson in the mainstream media is a relative novelty for high energy physics (HEP) research, whose achievements have traditionally been limited to scholarly literature. This paper illustrates the results of a scientometric analysis of HEP computing in scientific literature, institutional media and the press, and a comparative overview of similar metrics concerning representative particle physics measurements. The picture emerging from these scientometric data documents the relationship between the scientific impact and the social perception of HEP physics research versus that of HEP computing. The results of this analysis suggest that improved communication of the scientific and social role of HEP computing via press releases from the major HEP laboratories would be beneficial to the high energy physics community.

  2. Streaming Parallel GPU Acceleration of Large-Scale filter-based Spiking Neural Networks

    NARCIS (Netherlands)

    L.P. Slazynski (Leszek); S.M. Bohte (Sander)

    2012-01-01

    htmlabstractThe arrival of graphics processing (GPU) cards suitable for massively parallel computing promises a↵ordable large-scale neural network simulation previously only available at supercomputing facil- ities. While the raw numbers suggest that GPUs may outperform CPUs by at least an order of

  3. International scientific seminar «Chronicle of Nature – a common database for scientific analysis and joint planning of scientific publications»

    Directory of Open Access Journals (Sweden)

    Juri P. Kurhinen

    2016-05-01

    Full Text Available Provides information about the results of the international scienti fic seminar «Сhronicle of Nature – a common database for scientific analysis and joint planning of scientific publications», held at Findland-Russian project «Linking environmental change to biodiversity change: large scale analysis оf Eurasia ecosystem».

  4. Data-flow oriented visual programming libraries for scientific computing

    NARCIS (Netherlands)

    Maubach, J.M.L.; Drenth, W.D.; Sloot, P.M.A.

    2002-01-01

    The growing release of scientific computational software does not seem to aid the implementation of complex numerical algorithms. Released libraries lack a common standard interface with regard to for instance finite element, difference or volume discretizations. And, libraries written in standard

  5. National Energy Research Scientific Computing Center 2007 Annual Report

    Energy Technology Data Exchange (ETDEWEB)

    Hules, John A.; Bashor, Jon; Wang, Ucilia; Yarris, Lynn; Preuss, Paul

    2008-10-23

    This report presents highlights of the research conducted on NERSC computers in a variety of scientific disciplines during the year 2007. It also reports on changes and upgrades to NERSC's systems and services aswell as activities of NERSC staff.

  6. Power suppression at large scales in string inflation

    Energy Technology Data Exchange (ETDEWEB)

    Cicoli, Michele [Dipartimento di Fisica ed Astronomia, Università di Bologna, via Irnerio 46, Bologna, 40126 (Italy); Downes, Sean; Dutta, Bhaskar, E-mail: mcicoli@ictp.it, E-mail: sddownes@physics.tamu.edu, E-mail: dutta@physics.tamu.edu [Mitchell Institute for Fundamental Physics and Astronomy, Department of Physics and Astronomy, Texas A and M University, College Station, TX, 77843-4242 (United States)

    2013-12-01

    We study a possible origin of the anomalous suppression of the power spectrum at large angular scales in the cosmic microwave background within the framework of explicit string inflationary models where inflation is driven by a closed string modulus parameterizing the size of the extra dimensions. In this class of models the apparent power loss at large scales is caused by the background dynamics which involves a sharp transition from a fast-roll power law phase to a period of Starobinsky-like slow-roll inflation. An interesting feature of this class of string inflationary models is that the number of e-foldings of inflation is inversely proportional to the string coupling to a positive power. Therefore once the string coupling is tuned to small values in order to trust string perturbation theory, enough e-foldings of inflation are automatically obtained without the need of extra tuning. Moreover, in the less tuned cases the sharp transition responsible for the power loss takes place just before the last 50-60 e-foldings of inflation. We illustrate these general claims in the case of Fibre Inflation where we study the strength of this transition in terms of the attractor dynamics, finding that it induces a pivot from a blue to a redshifted power spectrum which can explain the apparent large scale power loss. We compute the effects of this pivot for example cases and demonstrate how magnitude and duration of this effect depend on model parameters.

  7. Multidimensional scaling for large genomic data sets

    Directory of Open Access Journals (Sweden)

    Lu Henry

    2008-04-01

    Full Text Available Abstract Background Multi-dimensional scaling (MDS is aimed to represent high dimensional data in a low dimensional space with preservation of the similarities between data points. This reduction in dimensionality is crucial for analyzing and revealing the genuine structure hidden in the data. For noisy data, dimension reduction can effectively reduce the effect of noise on the embedded structure. For large data set, dimension reduction can effectively reduce information retrieval complexity. Thus, MDS techniques are used in many applications of data mining and gene network research. However, although there have been a number of studies that applied MDS techniques to genomics research, the number of analyzed data points was restricted by the high computational complexity of MDS. In general, a non-metric MDS method is faster than a metric MDS, but it does not preserve the true relationships. The computational complexity of most metric MDS methods is over O(N2, so that it is difficult to process a data set of a large number of genes N, such as in the case of whole genome microarray data. Results We developed a new rapid metric MDS method with a low computational complexity, making metric MDS applicable for large data sets. Computer simulation showed that the new method of split-and-combine MDS (SC-MDS is fast, accurate and efficient. Our empirical studies using microarray data on the yeast cell cycle showed that the performance of K-means in the reduced dimensional space is similar to or slightly better than that of K-means in the original space, but about three times faster to obtain the clustering results. Our clustering results using SC-MDS are more stable than those in the original space. Hence, the proposed SC-MDS is useful for analyzing whole genome data. Conclusion Our new method reduces the computational complexity from O(N3 to O(N when the dimension of the feature space is far less than the number of genes N, and it successfully

  8. The Effect of Large Scale Salinity Gradient on Langmuir Turbulence

    Science.gov (United States)

    Fan, Y.; Jarosz, E.; Yu, Z.; Jensen, T.; Sullivan, P. P.; Liang, J.

    2017-12-01

    Langmuir circulation (LC) is believed to be one of the leading order causes of turbulent mixing in the upper ocean. It is important for momentum and heat exchange across the mixed layer (ML) and directly impact the dynamics and thermodynamics in the upper ocean and lower atmosphere including the vertical distributions of chemical, biological, optical, and acoustic properties. Based on Craik and Leibovich (1976) theory, large eddy simulation (LES) models have been developed to simulate LC in the upper ocean, yielding new insights that could not be obtained from field observations and turbulent closure models. Due its high computational cost, LES models are usually limited to small domain sizes and cannot resolve large-scale flows. Furthermore, most LES models used in the LC simulations use periodic boundary conditions in the horizontal direction, which assumes the physical properties (i.e. temperature and salinity) and expected flow patterns in the area of interest are of a periodically repeating nature so that the limited small LES domain is representative for the larger area. Using periodic boundary condition can significantly reduce computational effort in problems, and it is a good assumption for isotropic shear turbulence. However, LC is anisotropic (McWilliams et al 1997) and was observed to be modulated by crosswind tidal currents (Kukulka et al 2011). Using symmetrical domains, idealized LES studies also indicate LC could interact with oceanic fronts (Hamlington et al 2014) and standing internal waves (Chini and Leibovich, 2005). The present study expands our previous LES modeling investigations of Langmuir turbulence to the real ocean conditions with large scale environmental motion that features fresh water inflow into the study region. Large scale gradient forcing is introduced to the NCAR LES model through scale separation analysis. The model is applied to a field observation in the Gulf of Mexico in July, 2016 when the measurement site was impacted by

  9. Large-scale modeling of epileptic seizures: scaling properties of two parallel neuronal network simulation algorithms.

    Science.gov (United States)

    Pesce, Lorenzo L; Lee, Hyong C; Hereld, Mark; Visser, Sid; Stevens, Rick L; Wildeman, Albert; van Drongelen, Wim

    2013-01-01

    Our limited understanding of the relationship between the behavior of individual neurons and large neuronal networks is an important limitation in current epilepsy research and may be one of the main causes of our inadequate ability to treat it. Addressing this problem directly via experiments is impossibly complex; thus, we have been developing and studying medium-large-scale simulations of detailed neuronal networks to guide us. Flexibility in the connection schemas and a complete description of the cortical tissue seem necessary for this purpose. In this paper we examine some of the basic issues encountered in these multiscale simulations. We have determined the detailed behavior of two such simulators on parallel computer systems. The observed memory and computation-time scaling behavior for a distributed memory implementation were very good over the range studied, both in terms of network sizes (2,000 to 400,000 neurons) and processor pool sizes (1 to 256 processors). Our simulations required between a few megabytes and about 150 gigabytes of RAM and lasted between a few minutes and about a week, well within the capability of most multinode clusters. Therefore, simulations of epileptic seizures on networks with millions of cells should be feasible on current supercomputers.

  10. Large-Scale Modeling of Epileptic Seizures: Scaling Properties of Two Parallel Neuronal Network Simulation Algorithms

    Directory of Open Access Journals (Sweden)

    Lorenzo L. Pesce

    2013-01-01

    Full Text Available Our limited understanding of the relationship between the behavior of individual neurons and large neuronal networks is an important limitation in current epilepsy research and may be one of the main causes of our inadequate ability to treat it. Addressing this problem directly via experiments is impossibly complex; thus, we have been developing and studying medium-large-scale simulations of detailed neuronal networks to guide us. Flexibility in the connection schemas and a complete description of the cortical tissue seem necessary for this purpose. In this paper we examine some of the basic issues encountered in these multiscale simulations. We have determined the detailed behavior of two such simulators on parallel computer systems. The observed memory and computation-time scaling behavior for a distributed memory implementation were very good over the range studied, both in terms of network sizes (2,000 to 400,000 neurons and processor pool sizes (1 to 256 processors. Our simulations required between a few megabytes and about 150 gigabytes of RAM and lasted between a few minutes and about a week, well within the capability of most multinode clusters. Therefore, simulations of epileptic seizures on networks with millions of cells should be feasible on current supercomputers.

  11. Theoretical science and the future of large scale computing

    International Nuclear Information System (INIS)

    Wilson, K.G.

    1983-01-01

    The author describes the application of computer simulation to physical problems. In this connection the FORTRAN language is considered. Furthermore the application of computer networks is described whereby the processing of experimental data is considered. (HSI).

  12. The Software Reliability of Large Scale Integration Circuit and Very Large Scale Integration Circuit

    OpenAIRE

    Artem Ganiyev; Jan Vitasek

    2010-01-01

    This article describes evaluation method of faultless function of large scale integration circuits (LSI) and very large scale integration circuits (VLSI). In the article there is a comparative analysis of factors which determine faultless of integrated circuits, analysis of already existing methods and model of faultless function evaluation of LSI and VLSI. The main part describes a proposed algorithm and program for analysis of fault rate in LSI and VLSI circuits.

  13. Large scale and cloud-based multi-model analytics experiments on climate change data in the Earth System Grid Federation

    Science.gov (United States)

    Fiore, Sandro; Płóciennik, Marcin; Doutriaux, Charles; Blanquer, Ignacio; Barbera, Roberto; Donvito, Giacinto; Williams, Dean N.; Anantharaj, Valentine; Salomoni, Davide D.; Aloisio, Giovanni

    2017-04-01

    In many scientific domains such as climate, data is often n-dimensional and requires tools that support specialized data types and primitives to be properly stored, accessed, analysed and visualized. Moreover, new challenges arise in large-scale scenarios and eco-systems where petabytes (PB) of data can be available and data can be distributed and/or replicated, such as the Earth System Grid Federation (ESGF) serving the Coupled Model Intercomparison Project, Phase 5 (CMIP5) experiment, providing access to 2.5PB of data for the Intergovernmental Panel on Climate Change (IPCC) Fifth Assessment Report (AR5). A case study on climate models intercomparison data analysis addressing several classes of multi-model experiments is being implemented in the context of the EU H2020 INDIGO-DataCloud project. Such experiments require the availability of large amount of data (multi-terabyte order) related to the output of several climate models simulations as well as the exploitation of scientific data management tools for large-scale data analytics. More specifically, the talk discusses in detail a use case on precipitation trend analysis in terms of requirements, architectural design solution, and infrastructural implementation. The experiment has been tested and validated on CMIP5 datasets, in the context of a large scale distributed testbed across EU and US involving three ESGF sites (LLNL, ORNL, and CMCC) and one central orchestrator site (PSNC). The general "environment" of the case study relates to: (i) multi-model data analysis inter-comparison challenges; (ii) addressed on CMIP5 data; and (iii) which are made available through the IS-ENES/ESGF infrastructure. The added value of the solution proposed in the INDIGO-DataCloud project are summarized in the following: (i) it implements a different paradigm (from client- to server-side); (ii) it intrinsically reduces data movement; (iii) it makes lightweight the end-user setup; (iv) it fosters re-usability (of data, final

  14. Large scale parallel FEM computations of far/near stress field changes in rocks

    Czech Academy of Sciences Publication Activity Database

    Blaheta, Radim; Byczanski, Petr; Jakl, Ondřej; Kohut, Roman; Kolcun, Alexej; Krečmer, Karel; Starý, Jiří

    2006-01-01

    Roč. 22, č. 4 (2006), s. 449-459 ISSN 0167-739X R&D Projects: GA ČR(CZ) GA105/02/0492; GA AV ČR(CZ) 1ET400300415 Institutional research plan: CEZ:AV0Z30860518 Keywords : large scale finite element analysis Subject RIV: BA - General Mathematics Impact factor: 0.722, year: 2006

  15. Distributed simulation of large computer systems

    International Nuclear Information System (INIS)

    Marzolla, M.

    2001-01-01

    Sequential simulation of large complex physical systems is often regarded as a computationally expensive task. In order to speed-up complex discrete-event simulations, the paradigm of Parallel and Distributed Discrete Event Simulation (PDES) has been introduced since the late 70s. The authors analyze the applicability of PDES to the modeling and analysis of large computer system; such systems are increasingly common in the area of High Energy and Nuclear Physics, because many modern experiments make use of large 'compute farms'. Some feasibility tests have been performed on a prototype distributed simulator

  16. An industrial perspective on bioreactor scale-down: what we can learn from combined large-scale bioprocess and model fluid studies.

    Science.gov (United States)

    Noorman, Henk

    2011-08-01

    For industrial bioreactor design, operation, control and optimization, the scale-down approach is often advocated to efficiently generate data on a small scale, and effectively apply suggested improvements to the industrial scale. In all cases it is important to ensure that the scale-down conditions are representative of the real large-scale bioprocess. Progress is hampered by limited detailed and local information from large-scale bioprocesses. Complementary to real fermentation studies, physical aspects of model fluids such as air-water in large bioreactors provide useful information with limited effort and cost. Still, in industrial practice, investments of time, capital and resources often prohibit systematic work, although, in the end, savings obtained in this way are trivial compared to the expenses that result from real process disturbances, batch failures, and non-flyers with loss of business opportunity. Here we try to highlight what can be learned from real large-scale bioprocess in combination with model fluid studies, and to provide suitable computation tools to overcome data restrictions. Focus is on a specific well-documented case for a 30-m(3) bioreactor. Areas for further research from an industrial perspective are also indicated. Copyright © 2011 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  17. Implicit solvers for large-scale nonlinear problems

    International Nuclear Information System (INIS)

    Keyes, David E; Reynolds, Daniel R; Woodward, Carol S

    2006-01-01

    Computational scientists are grappling with increasingly complex, multi-rate applications that couple such physical phenomena as fluid dynamics, electromagnetics, radiation transport, chemical and nuclear reactions, and wave and material propagation in inhomogeneous media. Parallel computers with large storage capacities are paving the way for high-resolution simulations of coupled problems; however, hardware improvements alone will not prove enough to enable simulations based on brute-force algorithmic approaches. To accurately capture nonlinear couplings between dynamically relevant phenomena, often while stepping over rapid adjustments to quasi-equilibria, simulation scientists are increasingly turning to implicit formulations that require a discrete nonlinear system to be solved for each time step or steady state solution. Recent advances in iterative methods have made fully implicit formulations a viable option for solution of these large-scale problems. In this paper, we overview one of the most effective iterative methods, Newton-Krylov, for nonlinear systems and point to software packages with its implementation. We illustrate the method with an example from magnetically confined plasma fusion and briefly survey other areas in which implicit methods have bestowed important advantages, such as allowing high-order temporal integration and providing a pathway to sensitivity analyses and optimization. Lastly, we overview algorithm extensions under development motivated by current SciDAC applications

  18. Data management strategies for multinational large-scale systems biology projects.

    Science.gov (United States)

    Wruck, Wasco; Peuker, Martin; Regenbrecht, Christian R A

    2014-01-01

    Good accessibility of publicly funded research data is essential to secure an open scientific system and eventually becomes mandatory [Wellcome Trust will Penalise Scientists Who Don't Embrace Open Access. The Guardian 2012]. By the use of high-throughput methods in many research areas from physics to systems biology, large data collections are increasingly important as raw material for research. Here, we present strategies worked out by international and national institutions targeting open access to publicly funded research data via incentives or obligations to share data. Funding organizations such as the British Wellcome Trust therefore have developed data sharing policies and request commitment to data management and sharing in grant applications. Increased citation rates are a profound argument for sharing publication data. Pre-publication sharing might be rewarded by a data citation credit system via digital object identifiers (DOIs) which have initially been in use for data objects. Besides policies and incentives, good practice in data management is indispensable. However, appropriate systems for data management of large-scale projects for example in systems biology are hard to find. Here, we give an overview of a selection of open-source data management systems proved to be employed successfully in large-scale projects.

  19. SciSpark's SRDD : A Scientific Resilient Distributed Dataset for Multidimensional Data

    Science.gov (United States)

    Palamuttam, R. S.; Wilson, B. D.; Mogrovejo, R. M.; Whitehall, K. D.; Mattmann, C. A.; McGibbney, L. J.; Ramirez, P.

    2015-12-01

    Remote sensing data and climate model output are multi-dimensional arrays of massive sizes locked away in heterogeneous file formats (HDF5/4, NetCDF 3/4) and metadata models (HDF-EOS, CF) making it difficult to perform multi-stage, iterative science processing since each stage requires writing and reading data to and from disk. We have developed SciSpark, a robust Big Data framework, that extends ApacheTM Spark for scaling scientific computations. Apache Spark improves the map-reduce implementation in ApacheTM Hadoop for parallel computing on a cluster, by emphasizing in-memory computation, "spilling" to disk only as needed, and relying on lazy evaluation. Central to Spark is the Resilient Distributed Dataset (RDD), an in-memory distributed data structure that extends the functional paradigm provided by the Scala programming language. However, RDDs are ideal for tabular or unstructured data, and not for highly dimensional data. The SciSpark project introduces the Scientific Resilient Distributed Dataset (sRDD), a distributed-computing array structure which supports iterative scientific algorithms for multidimensional data. SciSpark processes data stored in NetCDF and HDF files by partitioning them across time or space and distributing the partitions among a cluster of compute nodes. We show usability and extensibility of SciSpark by implementing distributed algorithms for geospatial operations on large collections of multi-dimensional grids. In particular we address the problem of scaling an automated method for finding Mesoscale Convective Complexes. SciSpark provides a tensor interface to support the pluggability of different matrix libraries. We evaluate performance of the various matrix libraries in distributed pipelines, such as Nd4jTM and BreezeTM. We detail the architecture and design of SciSpark, our efforts to integrate climate science algorithms, parallel ingest and partitioning (sharding) of A-Train satellite observations from model grids. These

  20. Effect of Variable Spatial Scales on USLE-GIS Computations

    Science.gov (United States)

    Patil, R. J.; Sharma, S. K.

    2017-12-01

    Use of appropriate spatial scale is very important in Universal Soil Loss Equation (USLE) based spatially distributed soil erosion modelling. This study aimed at assessment of annual rates of soil erosion at different spatial scales/grid sizes and analysing how changes in spatial scales affect USLE-GIS computations using simulation and statistical variabilities. Efforts have been made in this study to recommend an optimum spatial scale for further USLE-GIS computations for management and planning in the study area. The present research study was conducted in Shakkar River watershed, situated in Narsinghpur and Chhindwara districts of Madhya Pradesh, India. Remote Sensing and GIS techniques were integrated with Universal Soil Loss Equation (USLE) to predict spatial distribution of soil erosion in the study area at four different spatial scales viz; 30 m, 50 m, 100 m, and 200 m. Rainfall data, soil map, digital elevation model (DEM) and an executable C++ program, and satellite image of the area were used for preparation of the thematic maps for various USLE factors. Annual rates of soil erosion were estimated for 15 years (1992 to 2006) at four different grid sizes. The statistical analysis of four estimated datasets showed that sediment loss dataset at 30 m spatial scale has a minimum standard deviation (2.16), variance (4.68), percent deviation from observed values (2.68 - 18.91 %), and highest coefficient of determination (R2 = 0.874) among all the four datasets. Thus, it is recommended to adopt this spatial scale for USLE-GIS computations in the study area due to its minimum statistical variability and better agreement with the observed sediment loss data. This study also indicates large scope for use of finer spatial scales in spatially distributed soil erosion modelling.

  1. Managing large-scale models: DBS

    International Nuclear Information System (INIS)

    1981-05-01

    A set of fundamental management tools for developing and operating a large scale model and data base system is presented. Based on experience in operating and developing a large scale computerized system, the only reasonable way to gain strong management control of such a system is to implement appropriate controls and procedures. Chapter I discusses the purpose of the book. Chapter II classifies a broad range of generic management problems into three groups: documentation, operations, and maintenance. First, system problems are identified then solutions for gaining management control are disucssed. Chapters III, IV, and V present practical methods for dealing with these problems. These methods were developed for managing SEAS but have general application for large scale models and data bases

  2. Large Scale Self-Organizing Information Distribution System

    National Research Council Canada - National Science Library

    Low, Steven

    2005-01-01

    This project investigates issues in "large-scale" networks. Here "large-scale" refers to networks with large number of high capacity nodes and transmission links, and shared by a large number of users...

  3. MicroEcos: Micro-Scale Explorations of Large-Scale Late Pleistocene Ecosystems

    Science.gov (United States)

    Gellis, B. S.

    2017-12-01

    Pollen data can inform the reconstruction of early-floral environments by providing data for artistic representations of what early-terrestrial ecosystems looked like, and how existing terrestrial landscapes have evolved. For example, what did the Bighorn Basin look like when large ice sheets covered modern Canada, the Yellowstone Plateau had an ice cap, and the Bighorn Mountains were mantled with alpine glaciers? MicroEcos is an immersive, multimedia project that aims to strengthen human-nature connections through the understanding and appreciation of biological ecosystems. Collected pollen data elucidates flora that are visible in the fossil record - associated with the Late-Pleistocene - and have been illustrated and described in botanical literature. It aims to make scientific data accessible and interesting to all audiences through a series of interactive-digital sculptures, large-scale photography and field-based videography. While this project is driven by scientific data, it is rooted in deeply artistic and outreach-based practices, which include broad artistic practices, e.g.: digital design, illustration, photography, video and sound design. Using 3D modeling and printing technology MicroEcos centers around a series of 3D-printed models of the Last Canyon rock shelter on the Wyoming and Montana border, Little Windy Hill pond site in Wyoming's Medicine Bow National Forest, and Natural Trap Cave site in Wyoming's Big Horn Basin. These digital, interactive-3D sculpture provide audiences with glimpses of three-dimensional Late-Pleistocene environments, and helps create dialogue of how grass, sagebrush, and spruce based ecosystems form. To help audiences better contextualize how MicroEcos bridges notions of time, space, and place, modern photography and videography of the Last Canyon, Little Windy Hill and Natural Trap Cave sites surround these 3D-digital reconstructions.

  4. Center for Center for Technology for Advanced Scientific Component Software (TASCS)

    Energy Technology Data Exchange (ETDEWEB)

    Kostadin, Damevski [Virginia State Univ., Petersburg, VA (United States)

    2015-01-25

    A resounding success of the Scientific Discovery through Advanced Computing (SciDAC) program is that high-performance computational science is now universally recognized as a critical aspect of scientific discovery [71], complementing both theoretical and experimental research. As scientific communities prepare to exploit unprecedented computing capabilities of emerging leadership-class machines for multi-model simulations at the extreme scale [72], it is more important than ever to address the technical and social challenges of geographically distributed teams that combine expertise in domain science, applied mathematics, and computer science to build robust and flexible codes that can incorporate changes over time. The Center for Technology for Advanced Scientific Component Software (TASCS)1 tackles these these issues by exploiting component-based software development to facilitate collaborative high-performance scientific computing.

  5. Large-Scale Cubic-Scaling Random Phase Approximation Correlation Energy Calculations Using a Gaussian Basis.

    Science.gov (United States)

    Wilhelm, Jan; Seewald, Patrick; Del Ben, Mauro; Hutter, Jürg

    2016-12-13

    We present an algorithm for computing the correlation energy in the random phase approximation (RPA) in a Gaussian basis requiring [Formula: see text] operations and [Formula: see text] memory. The method is based on the resolution of the identity (RI) with the overlap metric, a reformulation of RI-RPA in the Gaussian basis, imaginary time, and imaginary frequency integration techniques, and the use of sparse linear algebra. Additional memory reduction without extra computations can be achieved by an iterative scheme that overcomes the memory bottleneck of canonical RPA implementations. We report a massively parallel implementation that is the key for the application to large systems. Finally, cubic-scaling RPA is applied to a thousand water molecules using a correlation-consistent triple-ζ quality basis.

  6. A large-scale evaluation of computational protein function prediction

    NARCIS (Netherlands)

    Radivojac, P.; Clark, W.T.; Oron, T.R.; Schnoes, A.M.; Wittkop, T.; Kourmpetis, Y.A.I.; Dijk, van A.D.J.; Friedberg, I.

    2013-01-01

    Automated annotation of protein function is challenging. As the number of sequenced genomes rapidly grows, the overwhelming majority of protein products can only be annotated computationally. If computational predictions are to be relied upon, it is crucial that the accuracy of these methods be

  7. A Dynamic Optimization Strategy for the Operation of Large Scale Seawater Reverses Osmosis System

    Directory of Open Access Journals (Sweden)

    Aipeng Jiang

    2014-01-01

    Full Text Available In this work, an efficient strategy was proposed for efficient solution of the dynamic model of SWRO system. Since the dynamic model is formulated by a set of differential-algebraic equations, simultaneous strategies based on collocations on finite element were used to transform the DAOP into large scale nonlinear programming problem named Opt2. Then, simulation of RO process and storage tanks was carried element by element and step by step with fixed control variables. All the obtained values of these variables then were used as the initial value for the optimal solution of SWRO system. Finally, in order to accelerate the computing efficiency and at the same time to keep enough accuracy for the solution of Opt2, a simple but efficient finite element refinement rule was used to reduce the scale of Opt2. The proposed strategy was applied to a large scale SWRO system with 8 RO plants and 4 storage tanks as case study. Computing result shows that the proposed strategy is quite effective for optimal operation of the large scale SWRO system; the optimal problem can be successfully solved within decades of iterations and several minutes when load and other operating parameters fluctuate.

  8. DOE Advanced Scientific Computing Advisory Committee (ASCAC) Report: Exascale Computing Initiative Review

    Energy Technology Data Exchange (ETDEWEB)

    Reed, Daniel [University of Iowa; Berzins, Martin [University of Utah; Pennington, Robert; Sarkar, Vivek [Rice University; Taylor, Valerie [Texas A& M University

    2015-08-01

    On November 19, 2014, the Advanced Scientific Computing Advisory Committee (ASCAC) was charged with reviewing the Department of Energy’s conceptual design for the Exascale Computing Initiative (ECI). In particular, this included assessing whether there are significant gaps in the ECI plan or areas that need to be given priority or extra management attention. Given the breadth and depth of previous reviews of the technical challenges inherent in exascale system design and deployment, the subcommittee focused its assessment on organizational and management issues, considering technical issues only as they informed organizational or management priorities and structures. This report presents the observations and recommendations of the subcommittee.

  9. Scientific computing vol III - approximation and integration

    CERN Document Server

    Trangenstein, John A

    2017-01-01

    This is the third of three volumes providing a comprehensive presentation of the fundamentals of scientific computing. This volume discusses topics that depend more on calculus than linear algebra, in order to prepare the reader for solving differential equations. This book and its companions show how to determine the quality of computational results, and how to measure the relative efficiency of competing methods. Readers learn how to determine the maximum attainable accuracy of algorithms, and how to select the best method for computing problems. This book also discusses programming in several languages, including C++, Fortran and MATLAB. There are 90 examples, 200 exercises, 36 algorithms, 40 interactive JavaScript programs, 91 references to software programs and 1 case study. Topics are introduced with goals, literature references and links to public software. There are descriptions of the current algorithms in GSLIB and MATLAB. This book could be used for a second course in numerical methods, for either ...

  10. Multigrid preconditioned conjugate-gradient method for large-scale wave-front reconstruction.

    Science.gov (United States)

    Gilles, Luc; Vogel, Curtis R; Ellerbroek, Brent L

    2002-09-01

    We introduce a multigrid preconditioned conjugate-gradient (MGCG) iterative scheme for computing open-loop wave-front reconstructors for extreme adaptive optics systems. We present numerical simulations for a 17-m class telescope with n = 48756 sensor measurement grid points within the aperture, which indicate that our MGCG method has a rapid convergence rate for a wide range of subaperture average slope measurement signal-to-noise ratios. The total computational cost is of order n log n. Hence our scheme provides for fast wave-front simulation and control in large-scale adaptive optics systems.

  11. Inference of functional properties from large-scale analysis of enzyme superfamilies.

    Science.gov (United States)

    Brown, Shoshana D; Babbitt, Patricia C

    2012-01-02

    As increasingly large amounts of data from genome and other sequencing projects become available, new approaches are needed to determine the functions of the proteins these genes encode. We show how large-scale computational analysis can help to address this challenge by linking functional information to sequence and structural similarities using protein similarity networks. Network analyses using three functionally diverse enzyme superfamilies illustrate the use of these approaches for facile updating and comparison of available structures for a large superfamily, for creation of functional hypotheses for metagenomic sequences, and to summarize the limits of our functional knowledge about even well studied superfamilies.

  12. Large scale structure and baryogenesis

    International Nuclear Information System (INIS)

    Kirilova, D.P.; Chizhov, M.V.

    2001-08-01

    We discuss a possible connection between the large scale structure formation and the baryogenesis in the universe. An update review of the observational indications for the presence of a very large scale 120h -1 Mpc in the distribution of the visible matter of the universe is provided. The possibility to generate a periodic distribution with the characteristic scale 120h -1 Mpc through a mechanism producing quasi-periodic baryon density perturbations during inflationary stage, is discussed. The evolution of the baryon charge density distribution is explored in the framework of a low temperature boson condensate baryogenesis scenario. Both the observed very large scale of a the visible matter distribution in the universe and the observed baryon asymmetry value could naturally appear as a result of the evolution of a complex scalar field condensate, formed at the inflationary stage. Moreover, for some model's parameters a natural separation of matter superclusters from antimatter ones can be achieved. (author)

  13. Automatic management software for large-scale cluster system

    International Nuclear Information System (INIS)

    Weng Yunjian; Chinese Academy of Sciences, Beijing; Sun Gongxing

    2007-01-01

    At present, the large-scale cluster system faces to the difficult management. For example the manager has large work load. It needs to cost much time on the management and the maintenance of large-scale cluster system. The nodes in large-scale cluster system are very easy to be chaotic. Thousands of nodes are put in big rooms so that some managers are very easy to make the confusion with machines. How do effectively carry on accurate management under the large-scale cluster system? The article introduces ELFms in the large-scale cluster system. Furthermore, it is proposed to realize the large-scale cluster system automatic management. (authors)

  14. A frequency-domain implementation of a sliding-window traffic sign detector for large scale panoramic datasets

    NARCIS (Netherlands)

    Creusen, I.M.; Hazelhoff, L.; With, de P.H.N.

    2013-01-01

    In large-scale automatic traffic sign surveying systems, the primary computational effort is concentrated at the traffic sign detection stage. This paper focuses on reducing the computational load of particularly the sliding window object detection algorithm which is employed for traffic sign

  15. Foundational perspectives on causality in large-scale brain networks

    Science.gov (United States)

    Mannino, Michael; Bressler, Steven L.

    2015-12-01

    A profusion of recent work in cognitive neuroscience has been concerned with the endeavor to uncover causal influences in large-scale brain networks. However, despite the fact that many papers give a nod to the important theoretical challenges posed by the concept of causality, this explosion of research has generally not been accompanied by a rigorous conceptual analysis of the nature of causality in the brain. This review provides both a descriptive and prescriptive account of the nature of causality as found within and between large-scale brain networks. In short, it seeks to clarify the concept of causality in large-scale brain networks both philosophically and scientifically. This is accomplished by briefly reviewing the rich philosophical history of work on causality, especially focusing on contributions by David Hume, Immanuel Kant, Bertrand Russell, and Christopher Hitchcock. We go on to discuss the impact that various interpretations of modern physics have had on our understanding of causality. Throughout all this, a central focus is the distinction between theories of deterministic causality (DC), whereby causes uniquely determine their effects, and probabilistic causality (PC), whereby causes change the probability of occurrence of their effects. We argue that, given the topological complexity of its large-scale connectivity, the brain should be considered as a complex system and its causal influences treated as probabilistic in nature. We conclude that PC is well suited for explaining causality in the brain for three reasons: (1) brain causality is often mutual; (2) connectional convergence dictates that only rarely is the activity of one neuronal population uniquely determined by another one; and (3) the causal influences exerted between neuronal populations may not have observable effects. A number of different techniques are currently available to characterize causal influence in the brain. Typically, these techniques quantify the statistical

  16. Interactive Visualization of Large-Scale Hydrological Data using Emerging Technologies in Web Systems and Parallel Programming

    Science.gov (United States)

    Demir, I.; Krajewski, W. F.

    2013-12-01

    As geoscientists are confronted with increasingly massive datasets from environmental observations to simulations, one of the biggest challenges is having the right tools to gain scientific insight from the data and communicate the understanding to stakeholders. Recent developments in web technologies make it easy to manage, visualize and share large data sets with general public. Novel visualization techniques and dynamic user interfaces allow users to interact with data, and modify the parameters to create custom views of the data to gain insight from simulations and environmental observations. This requires developing new data models and intelligent knowledge discovery techniques to explore and extract information from complex computational simulations or large data repositories. Scientific visualization will be an increasingly important component to build comprehensive environmental information platforms. This presentation provides an overview of the trends and challenges in the field of scientific visualization, and demonstrates information visualization and communication tools developed within the light of these challenges.

  17. Advanced computations in plasma physics

    International Nuclear Information System (INIS)

    Tang, W.M.

    2002-01-01

    Scientific simulation in tandem with theory and experiment is an essential tool for understanding complex plasma behavior. In this paper we review recent progress and future directions for advanced simulations in magnetically confined plasmas with illustrative examples chosen from magnetic confinement research areas such as microturbulence, magnetohydrodynamics, magnetic reconnection, and others. Significant recent progress has been made in both particle and fluid simulations of fine-scale turbulence and large-scale dynamics, giving increasingly good agreement between experimental observations and computational modeling. This was made possible by innovative advances in analytic and computational methods for developing reduced descriptions of physics phenomena spanning widely disparate temporal and spatial scales together with access to powerful new computational resources. In particular, the fusion energy science community has made excellent progress in developing advanced codes for which computer run-time and problem size scale well with the number of processors on massively parallel machines (MPP's). A good example is the effective usage of the full power of multi-teraflop (multi-trillion floating point computations per second) MPP's to produce three-dimensional, general geometry, nonlinear particle simulations which have accelerated progress in understanding the nature of turbulence self-regulation by zonal flows. It should be emphasized that these calculations, which typically utilized billions of particles for thousands of time-steps, would not have been possible without access to powerful present generation MPP computers and the associated diagnostic and visualization capabilities. In general, results from advanced simulations provide great encouragement for being able to include increasingly realistic dynamics to enable deeper physics insights into plasmas in both natural and laboratory environments. The associated scientific excitement should serve to

  18. Large-scale agent-based social simulation : A study on epidemic prediction and control

    NARCIS (Netherlands)

    Zhang, M.

    2016-01-01

    Large-scale agent-based social simulation is gradually proving to be a versatile methodological approach for studying human societies, which could make contributions from policy making in social science, to distributed artificial intelligence and agent technology in computer science, and to theory

  19. Non-parametric co-clustering of large scale sparse bipartite networks on the GPU

    DEFF Research Database (Denmark)

    Hansen, Toke Jansen; Mørup, Morten; Hansen, Lars Kai

    2011-01-01

    of row and column clusters from a hypothesis space of an infinite number of clusters. To reach large scale applications of co-clustering we exploit that parameter inference for co-clustering is well suited for parallel computing. We develop a generic GPU framework for efficient inference on large scale...... sparse bipartite networks and achieve a speedup of two orders of magnitude compared to estimation based on conventional CPUs. In terms of scalability we find for networks with more than 100 million links that reliable inference can be achieved in less than an hour on a single GPU. To efficiently manage...

  20. PathlinesExplorer — Image-based exploration of large-scale pathline fields

    KAUST Repository

    Nagoor, Omniah H.

    2015-10-25

    PathlinesExplorer is a novel image-based tool, which has been designed to visualize large scale pathline fields on a single computer [7]. PathlinesExplorer integrates explorable images (EI) technique [4] with order-independent transparency (OIT) method [2]. What makes this method different is that it allows users to handle large data on a single workstation. Although it is a view-dependent method, PathlinesExplorer combines both exploration and modification of visual aspects without re-accessing the original huge data. Our approach is based on constructing a per-pixel linked list data structure in which each pixel contains a list of pathline segments. With this view-dependent method, it is possible to filter, color-code, and explore large-scale flow data in real-time. In addition, optimization techniques such as early-ray termination and deferred shading are applied, which further improves the performance and scalability of our approach.

  1. Solving large scale structure in ten easy steps with COLA

    Energy Technology Data Exchange (ETDEWEB)

    Tassev, Svetlin [Department of Astrophysical Sciences, Princeton University, 4 Ivy Lane, Princeton, NJ 08544 (United States); Zaldarriaga, Matias [School of Natural Sciences, Institute for Advanced Study, Olden Lane, Princeton, NJ 08540 (United States); Eisenstein, Daniel J., E-mail: stassev@cfa.harvard.edu, E-mail: matiasz@ias.edu, E-mail: deisenstein@cfa.harvard.edu [Center for Astrophysics, Harvard University, 60 Garden Street, Cambridge, MA 02138 (United States)

    2013-06-01

    We present the COmoving Lagrangian Acceleration (COLA) method: an N-body method for solving for Large Scale Structure (LSS) in a frame that is comoving with observers following trajectories calculated in Lagrangian Perturbation Theory (LPT). Unlike standard N-body methods, the COLA method can straightforwardly trade accuracy at small-scales in order to gain computational speed without sacrificing accuracy at large scales. This is especially useful for cheaply generating large ensembles of accurate mock halo catalogs required to study galaxy clustering and weak lensing, as those catalogs are essential for performing detailed error analysis for ongoing and future surveys of LSS. As an illustration, we ran a COLA-based N-body code on a box of size 100 Mpc/h with particles of mass ≈ 5 × 10{sup 9}M{sub s}un/h. Running the code with only 10 timesteps was sufficient to obtain an accurate description of halo statistics down to halo masses of at least 10{sup 11}M{sub s}un/h. This is only at a modest speed penalty when compared to mocks obtained with LPT. A standard detailed N-body run is orders of magnitude slower than our COLA-based code. The speed-up we obtain with COLA is due to the fact that we calculate the large-scale dynamics exactly using LPT, while letting the N-body code solve for the small scales, without requiring it to capture exactly the internal dynamics of halos. Achieving a similar level of accuracy in halo statistics without the COLA method requires at least 3 times more timesteps than when COLA is employed.

  2. HAlign-II: efficient ultra-large multiple sequence alignment and phylogenetic tree reconstruction with distributed and parallel computing.

    Science.gov (United States)

    Wan, Shixiang; Zou, Quan

    2017-01-01

    Multiple sequence alignment (MSA) plays a key role in biological sequence analyses, especially in phylogenetic tree construction. Extreme increase in next-generation sequencing results in shortage of efficient ultra-large biological sequence alignment approaches for coping with different sequence types. Distributed and parallel computing represents a crucial technique for accelerating ultra-large (e.g. files more than 1 GB) sequence analyses. Based on HAlign and Spark distributed computing system, we implement a highly cost-efficient and time-efficient HAlign-II tool to address ultra-large multiple biological sequence alignment and phylogenetic tree construction. The experiments in the DNA and protein large scale data sets, which are more than 1GB files, showed that HAlign II could save time and space. It outperformed the current software tools. HAlign-II can efficiently carry out MSA and construct phylogenetic trees with ultra-large numbers of biological sequences. HAlign-II shows extremely high memory efficiency and scales well with increases in computing resource. THAlign-II provides a user-friendly web server based on our distributed computing infrastructure. HAlign-II with open-source codes and datasets was established at http://lab.malab.cn/soft/halign.

  3. Scientific Grand Challenges: Crosscutting Technologies for Computing at the Exascale - February 2-4, 2010, Washington, D.C.

    Energy Technology Data Exchange (ETDEWEB)

    Khaleel, Mohammad A.

    2011-02-06

    The goal of the "Scientific Grand Challenges - Crosscutting Technologies for Computing at the Exascale" workshop in February 2010, jointly sponsored by the U.S. Department of Energy’s Office of Advanced Scientific Computing Research and the National Nuclear Security Administration, was to identify the elements of a research and development agenda that will address these challenges and create a comprehensive exascale computing environment. This exascale computing environment will enable the science applications identified in the eight previously held Scientific Grand Challenges Workshop Series.

  4. Large-Scale Outflows in Seyfert Galaxies

    Science.gov (United States)

    Colbert, E. J. M.; Baum, S. A.

    1995-12-01

    \\catcode`\\@=11 \\ialign{m @th#1hfil ##hfil \\crcr#2\\crcr\\sim\\crcr}}} \\catcode`\\@=12 Highly collimated outflows extend out to Mpc scales in many radio-loud active galaxies. In Seyfert galaxies, which are radio-quiet, the outflows extend out to kpc scales and do not appear to be as highly collimated. In order to study the nature of large-scale (>~1 kpc) outflows in Seyferts, we have conducted optical, radio and X-ray surveys of a distance-limited sample of 22 edge-on Seyfert galaxies. Results of the optical emission-line imaging and spectroscopic survey imply that large-scale outflows are present in >~{{1} /{4}} of all Seyferts. The radio (VLA) and X-ray (ROSAT) surveys show that large-scale radio and X-ray emission is present at about the same frequency. Kinetic luminosities of the outflows in Seyferts are comparable to those in starburst-driven superwinds. Large-scale radio sources in Seyferts appear diffuse, but do not resemble radio halos found in some edge-on starburst galaxies (e.g. M82). We discuss the feasibility of the outflows being powered by the active nucleus (e.g. a jet) or a circumnuclear starburst.

  5. Using technology to support investigations in the electronic age: tracking hackers to large scale international computer fraud

    Science.gov (United States)

    McFall, Steve

    1994-03-01

    With the increase in business automation and the widespread availability and low cost of computer systems, law enforcement agencies have seen a corresponding increase in criminal acts involving computers. The examination of computer evidence is a new field of forensic science with numerous opportunities for research and development. Research is needed to develop new software utilities to examine computer storage media, expert systems capable of finding criminal activity in large amounts of data, and to find methods of recovering data from chemically and physically damaged computer storage media. In addition, defeating encryption and password protection of computer files is also a topic requiring more research and development.

  6. Are large-scale flow experiments informing the science and management of freshwater ecosystems?

    Science.gov (United States)

    Olden, Julian D.; Konrad, Christopher P.; Melis, Theodore S.; Kennard, Mark J.; Freeman, Mary C.; Mims, Meryl C.; Bray, Erin N.; Gido, Keith B.; Hemphill, Nina P.; Lytle, David A.; McMullen, Laura E.; Pyron, Mark; Robinson, Christopher T.; Schmidt, John C.; Williams, John G.

    2013-01-01

    Greater scientific knowledge, changing societal values, and legislative mandates have emphasized the importance of implementing large-scale flow experiments (FEs) downstream of dams. We provide the first global assessment of FEs to evaluate their success in advancing science and informing management decisions. Systematic review of 113 FEs across 20 countries revealed that clear articulation of experimental objectives, while not universally practiced, was crucial for achieving management outcomes and changing dam-operating policies. Furthermore, changes to dam operations were three times less likely when FEs were conducted primarily for scientific purposes. Despite the recognized importance of riverine flow regimes, four-fifths of FEs involved only discrete flow events. Over three-quarters of FEs documented both abiotic and biotic outcomes, but only one-third examined multiple taxonomic responses, thus limiting how FE results can inform holistic dam management. Future FEs will present new opportunities to advance scientifically credible water policies.

  7. Amplify scientific discovery with artificial intelligence

    Energy Technology Data Exchange (ETDEWEB)

    Gil, Yolanda; Greaves, Mark T.; Hendler, James; Hirsch, Hyam

    2014-10-10

    Computing innovations have fundamentally changed many aspects of scientific inquiry. For example, advances in robotics, high-end computing, networking, and databases now underlie much of what we do in science such as gene sequencing, general number crunching, sharing information between scientists, and analyzing large amounts of data. As computing has evolved at a rapid pace, so too has its impact in science, with the most recent computing innovations repeatedly being brought to bear to facilitate new forms of inquiry. Recently, advances in Artificial Intelligence (AI) have deeply penetrated many consumer sectors, including for example Apple’s Siri™ speech recognition system, real-time automated language translation services, and a new generation of self-driving cars and self-navigating drones. However, AI has yet to achieve comparable levels of penetration in scientific inquiry, despite its tremendous potential in aiding computers to help scientists tackle tasks that require scientific reasoning. We contend that advances in AI will transform the practice of science as we are increasingly able to effectively and jointly harness human and machine intelligence in the pursuit of major scientific challenges.

  8. A method to mine workflows from provenance for assisting scientific workflow composition

    NARCIS (Netherlands)

    Zeng, R.; He, X.; Aalst, van der W.M.P.

    2011-01-01

    Scientific workflows have recently emerged as a new paradigm for representing and managing complex distributed scientific computations and are used to accelerate the pace of scientific discovery. In many disciplines, individual workflows are large and complicated due to the large quantities of data

  9. FFTLasso: Large-Scale LASSO in the Fourier Domain

    KAUST Repository

    Bibi, Adel Aamer

    2017-11-09

    In this paper, we revisit the LASSO sparse representation problem, which has been studied and used in a variety of different areas, ranging from signal processing and information theory to computer vision and machine learning. In the vision community, it found its way into many important applications, including face recognition, tracking, super resolution, image denoising, to name a few. Despite advances in efficient sparse algorithms, solving large-scale LASSO problems remains a challenge. To circumvent this difficulty, people tend to downsample and subsample the problem (e.g. via dimensionality reduction) to maintain a manageable sized LASSO, which usually comes at the cost of losing solution accuracy. This paper proposes a novel circulant reformulation of the LASSO that lifts the problem to a higher dimension, where ADMM can be efficiently applied to its dual form. Because of this lifting, all optimization variables are updated using only basic element-wise operations, the most computationally expensive of which is a 1D FFT. In this way, there is no need for a linear system solver nor matrix-vector multiplication. Since all operations in our FFTLasso method are element-wise, the subproblems are completely independent and can be trivially parallelized (e.g. on a GPU). The attractive computational properties of FFTLasso are verified by extensive experiments on synthetic and real data and on the face recognition task. They demonstrate that FFTLasso scales much more effectively than a state-of-the-art solver.

  10. FFTLasso: Large-Scale LASSO in the Fourier Domain

    KAUST Repository

    Bibi, Adel Aamer; Itani, Hani; Ghanem, Bernard

    2017-01-01

    In this paper, we revisit the LASSO sparse representation problem, which has been studied and used in a variety of different areas, ranging from signal processing and information theory to computer vision and machine learning. In the vision community, it found its way into many important applications, including face recognition, tracking, super resolution, image denoising, to name a few. Despite advances in efficient sparse algorithms, solving large-scale LASSO problems remains a challenge. To circumvent this difficulty, people tend to downsample and subsample the problem (e.g. via dimensionality reduction) to maintain a manageable sized LASSO, which usually comes at the cost of losing solution accuracy. This paper proposes a novel circulant reformulation of the LASSO that lifts the problem to a higher dimension, where ADMM can be efficiently applied to its dual form. Because of this lifting, all optimization variables are updated using only basic element-wise operations, the most computationally expensive of which is a 1D FFT. In this way, there is no need for a linear system solver nor matrix-vector multiplication. Since all operations in our FFTLasso method are element-wise, the subproblems are completely independent and can be trivially parallelized (e.g. on a GPU). The attractive computational properties of FFTLasso are verified by extensive experiments on synthetic and real data and on the face recognition task. They demonstrate that FFTLasso scales much more effectively than a state-of-the-art solver.

  11. Automated and Assistive Tools for Accelerated Code migration of Scientific Computing on to Heterogeneous MultiCore Systems

    Science.gov (United States)

    2017-04-13

    AFRL-AFOSR-UK-TR-2017-0029 Automated and Assistive Tools for Accelerated Code migration of Scientific Computing on to Heterogeneous MultiCore Systems ...2012, “ Automated and Assistive Tools for Accelerated Code migration of Scientific Computing on to Heterogeneous MultiCore Systems .” 2. The objective...2012 - 01/25/2015 4. TITLE AND SUBTITLE Automated and Assistive Tools for Accelerated Code migration of Scientific Computing on to Heterogeneous

  12. SCALE INTERACTION IN A MIXING LAYER. THE ROLE OF THE LARGE-SCALE GRADIENTS

    KAUST Repository

    Fiscaletti, Daniele

    2015-08-23

    The interaction between scales is investigated in a turbulent mixing layer. The large-scale amplitude modulation of the small scales already observed in other works depends on the crosswise location. Large-scale positive fluctuations correlate with a stronger activity of the small scales on the low speed-side of the mixing layer, and a reduced activity on the high speed-side. However, from physical considerations we would expect the scales to interact in a qualitatively similar way within the flow and across different turbulent flows. Therefore, instead of the large-scale fluctuations, the large-scale gradients modulation of the small scales has been additionally investigated.

  13. From Data to Knowledge to Discoveries: Artificial Intelligence and Scientific Workflows

    Directory of Open Access Journals (Sweden)

    Yolanda Gil

    2009-01-01

    Full Text Available Scientific computing has entered a new era of scale and sharing with the arrival of cyberinfrastructure facilities for computational experimentation. A key emerging concept is scientific workflows, which provide a declarative representation of complex scientific applications that can be automatically managed and executed in distributed shared resources. In the coming decades, computational experimentation will push the boundaries of current cyberinfrastructure in terms of inter-disciplinary scope and integrative models of scientific phenomena under study. This paper argues that knowledge-rich workflow environments will provide necessary capabilities for that vision by assisting scientists to validate and vet complex analysis processes and by automating important aspects of scientific exploration and discovery.

  14. Tracking of large-scale structures in turbulent channel with direct numerical simulation of low Prandtl number passive scalar

    Science.gov (United States)

    Tiselj, Iztok

    2014-12-01

    Channel flow DNS (Direct Numerical Simulation) at friction Reynolds number 180 and with passive scalars of Prandtl numbers 1 and 0.01 was performed in various computational domains. The "normal" size domain was ˜2300 wall units long and ˜750 wall units wide; size taken from the similar DNS of Moser et al. The "large" computational domain, which is supposed to be sufficient to describe the largest structures of the turbulent flows was 3 times longer and 3 times wider than the "normal" domain. The "very large" domain was 6 times longer and 6 times wider than the "normal" domain. All simulations were performed with the same spatial and temporal resolution. Comparison of the standard and large computational domains shows the velocity field statistics (mean velocity, root-mean-square (RMS) fluctuations, and turbulent Reynolds stresses) that are within 1%-2%. Similar agreement is observed for Pr = 1 temperature fields and can be observed also for the mean temperature profiles at Pr = 0.01. These differences can be attributed to the statistical uncertainties of the DNS. However, second-order moments, i.e., RMS temperature fluctuations of standard and large computational domains at Pr = 0.01 show significant differences of up to 20%. Stronger temperature fluctuations in the "large" and "very large" domains confirm the existence of the large-scale structures. Their influence is more or less invisible in the main velocity field statistics or in the statistics of the temperature fields at Prandtl numbers around 1. However, these structures play visible role in the temperature fluctuations at low Prandtl number, where high temperature diffusivity effectively smears the small-scale structures in the thermal field and enhances the relative contribution of large-scales. These large thermal structures represent some kind of an echo of the large scale velocity structures: the highest temperature-velocity correlations are not observed between the instantaneous temperatures and

  15. Structural problems of public participation in large-scale projects with environmental impact

    International Nuclear Information System (INIS)

    Bechmann, G.

    1989-01-01

    Four items are discussed showing that the problems involved through participation of the public in large-scale projects with environmental impact cannot be solved satisfactorily without suitable modification of the existing legal framework. The problematic items are: the status of the electric utilities as a quasi public enterprise; informal preliminary negotiations; the penetration of scientific argumentation into administrative decisions; the procedural concept. The paper discusses the fundamental issue of the problem-adequate design of the procedure and develops suggestions for a cooperative participation design. (orig./HSCH) [de

  16. An efficient method based on the uniformity principle for synthesis of large-scale heat exchanger networks

    International Nuclear Information System (INIS)

    Zhang, Chunwei; Cui, Guomin; Chen, Shang

    2016-01-01

    Highlights: • Two dimensionless uniformity factors are presented to heat exchange network. • The grouping of process streams reduces the computational complexity of large-scale HENS problems. • The optimal sub-network can be obtained by Powell particle swarm optimization algorithm. • The method is illustrated by a case study involving 39 process streams, with a better solution. - Abstract: The optimal design of large-scale heat exchanger networks is a difficult task due to the inherent non-linear characteristics and the combinatorial nature of heat exchangers. To solve large-scale heat exchanger network synthesis (HENS) problems, two dimensionless uniformity factors to describe the heat exchanger network (HEN) uniformity in terms of the temperature difference and the accuracy of process stream grouping are deduced. Additionally, a novel algorithm that combines deterministic and stochastic optimizations to obtain an optimal sub-network with a suitable heat load for a given group of streams is proposed, and is named the Powell particle swarm optimization (PPSO). As a result, the synthesis of large-scale heat exchanger networks is divided into two corresponding sub-parts, namely, the grouping of process streams and the optimization of sub-networks. This approach reduces the computational complexity and increases the efficiency of the proposed method. The robustness and effectiveness of the proposed method are demonstrated by solving a large-scale HENS problem involving 39 process streams, and the results obtained are better than those previously published in the literature.

  17. Particle physics and polyedra proximity calculation for hazard simulations in large-scale industrial plants

    Science.gov (United States)

    Plebe, Alice; Grasso, Giorgio

    2016-12-01

    This paper describes a system developed for the simulation of flames inside an open-source 3D computer graphic software, Blender, with the aim of analyzing in virtual reality scenarios of hazards in large-scale industrial plants. The advantages of Blender are of rendering at high resolution the very complex structure of large industrial plants, and of embedding a physical engine based on smoothed particle hydrodynamics. This particle system is used to evolve a simulated fire. The interaction of this fire with the components of the plant is computed using polyhedron separation distance, adopting a Voronoi-based strategy that optimizes the number of feature distance computations. Results on a real oil and gas refining industry are presented.

  18. Large-scale machine learning and evaluation platform for real-time traffic surveillance

    Science.gov (United States)

    Eichel, Justin A.; Mishra, Akshaya; Miller, Nicholas; Jankovic, Nicholas; Thomas, Mohan A.; Abbott, Tyler; Swanson, Douglas; Keller, Joel

    2016-09-01

    In traffic engineering, vehicle detectors are trained on limited datasets, resulting in poor accuracy when deployed in real-world surveillance applications. Annotating large-scale high-quality datasets is challenging. Typically, these datasets have limited diversity; they do not reflect the real-world operating environment. There is a need for a large-scale, cloud-based positive and negative mining process and a large-scale learning and evaluation system for the application of automatic traffic measurements and classification. The proposed positive and negative mining process addresses the quality of crowd sourced ground truth data through machine learning review and human feedback mechanisms. The proposed learning and evaluation system uses a distributed cloud computing framework to handle data-scaling issues associated with large numbers of samples and a high-dimensional feature space. The system is trained using AdaBoost on 1,000,000 Haar-like features extracted from 70,000 annotated video frames. The trained real-time vehicle detector achieves an accuracy of at least 95% for 1/2 and about 78% for 19/20 of the time when tested on ˜7,500,000 video frames. At the end of 2016, the dataset is expected to have over 1 billion annotated video frames.

  19. Dissecting the large-scale galactic conformity

    Science.gov (United States)

    Seo, Seongu

    2018-01-01

    Galactic conformity is an observed phenomenon that galaxies located in the same region have similar properties such as star formation rate, color, gas fraction, and so on. The conformity was first observed among galaxies within in the same halos (“one-halo conformity”). The one-halo conformity can be readily explained by mutual interactions among galaxies within a halo. Recent observations however further witnessed a puzzling connection among galaxies with no direct interaction. In particular, galaxies located within a sphere of ~5 Mpc radius tend to show similarities, even though the galaxies do not share common halos with each other ("two-halo conformity" or “large-scale conformity”). Using a cosmological hydrodynamic simulation, Illustris, we investigate the physical origin of the two-halo conformity and put forward two scenarios. First, back-splash galaxies are likely responsible for the large-scale conformity. They have evolved into red galaxies due to ram-pressure stripping in a given galaxy cluster and happen to reside now within a ~5 Mpc sphere. Second, galaxies in strong tidal field induced by large-scale structure also seem to give rise to the large-scale conformity. The strong tides suppress star formation in the galaxies. We discuss the importance of the large-scale conformity in the context of galaxy evolution.

  20. An evaluation of multi-probe locality sensitive hashing for computing similarities over web-scale query logs.

    Directory of Open Access Journals (Sweden)

    Graham Cormode

    Full Text Available Many modern applications of AI such as web search, mobile browsing, image processing, and natural language processing rely on finding similar items from a large database of complex objects. Due to the very large scale of data involved (e.g., users' queries from commercial search engines, computing such near or nearest neighbors is a non-trivial task, as the computational cost grows significantly with the number of items. To address this challenge, we adopt Locality Sensitive Hashing (a.k.a, LSH methods and evaluate four variants in a distributed computing environment (specifically, Hadoop. We identify several optimizations which improve performance, suitable for deployment in very large scale settings. The experimental results demonstrate our variants of LSH achieve the robust performance with better recall compared with "vanilla" LSH, even when using the same amount of space.

  1. Advances and challenges in computational plasma science

    International Nuclear Information System (INIS)

    Tang, W M; Chan, V S

    2005-01-01

    Scientific simulation, which provides a natural bridge between theory and experiment, is an essential tool for understanding complex plasma behaviour. Recent advances in simulations of magnetically confined plasmas are reviewed in this paper, with illustrative examples, chosen from associated research areas such as microturbulence, magnetohydrodynamics and other topics. Progress has been stimulated, in particular, by the exponential growth of computer speed along with significant improvements in computer technology. The advances in both particle and fluid simulations of fine-scale turbulence and large-scale dynamics have produced increasingly good agreement between experimental observations and computational modelling. This was enabled by two key factors: (a) innovative advances in analytic and computational methods for developing reduced descriptions of physics phenomena spanning widely disparate temporal and spatial scales and (b) access to powerful new computational resources. Excellent progress has been made in developing codes for which computer run-time and problem-size scale well with the number of processors on massively parallel processors (MPPs). Examples include the effective usage of the full power of multi-teraflop (multi-trillion floating point computations per second) MPPs to produce three-dimensional, general geometry, nonlinear particle simulations that have accelerated advances in understanding the nature of turbulence self-regulation by zonal flows. These calculations, which typically utilized billions of particles for thousands of time-steps, would not have been possible without access to powerful present generation MPP computers and the associated diagnostic and visualization capabilities. In looking towards the future, the current results from advanced simulations provide great encouragement for being able to include increasingly realistic dynamics to enable deeper physics insights into plasmas in both natural and laboratory environments. This

  2. Inference of Functional Properties from Large-scale Analysis of Enzyme Superfamilies*

    Science.gov (United States)

    Brown, Shoshana D.; Babbitt, Patricia C.

    2012-01-01

    As increasingly large amounts of data from genome and other sequencing projects become available, new approaches are needed to determine the functions of the proteins these genes encode. We show how large-scale computational analysis can help to address this challenge by linking functional information to sequence and structural similarities using protein similarity networks. Network analyses using three functionally diverse enzyme superfamilies illustrate the use of these approaches for facile updating and comparison of available structures for a large superfamily, for creation of functional hypotheses for metagenomic sequences, and to summarize the limits of our functional knowledge about even well studied superfamilies. PMID:22069325

  3. Algorithm 873: LSTRS: MATLAB Software for Large-Scale Trust-Region Subproblems and Regularization

    DEFF Research Database (Denmark)

    Rojas Larrazabal, Marielba de la Caridad; Santos, Sandra A.; Sorensen, Danny C.

    2008-01-01

    A MATLAB 6.0 implementation of the LSTRS method is resented. LSTRS was described in Rojas, M., Santos, S.A., and Sorensen, D.C., A new matrix-free method for the large-scale trust-region subproblem, SIAM J. Optim., 11(3):611-646, 2000. LSTRS is designed for large-scale quadratic problems with one...... at each step. LSTRS relies on matrix-vector products only and has low and fixed storage requirements, features that make it suitable for large-scale computations. In the MATLAB implementation, the Hessian matrix of the quadratic objective function can be specified either explicitly, or in the form...... of a matrix-vector multiplication routine. Therefore, the implementation preserves the matrix-free nature of the method. A description of the LSTRS method and of the MATLAB software, version 1.2, is presented. Comparisons with other techniques and applications of the method are also included. A guide...

  4. Computer simulations and the changing face of scientific experimentation

    CERN Document Server

    Duran, Juan M

    2013-01-01

    Computer simulations have become a central tool for scientific practice. Their use has replaced, in many cases, standard experimental procedures. This goes without mentioning cases where the target system is empirical but there are no techniques for direct manipulation of the system, such as astronomical observation. To these cases, computer simulations have proved to be of central importance. The question about their use and implementation, therefore, is not only a technical one but represents a challenge for the humanities as well. In this volume, scientists, historians, and philosophers joi

  5. Grid computing the European Data Grid Project

    CERN Document Server

    Segal, B; Gagliardi, F; Carminati, F

    2000-01-01

    The goal of this project is the development of a novel environment to support globally distributed scientific exploration involving multi- PetaByte datasets. The project will devise and develop middleware solutions and testbeds capable of scaling to handle many PetaBytes of distributed data, tens of thousands of resources (processors, disks, etc.), and thousands of simultaneous users. The scale of the problem and the distribution of the resources and user community preclude straightforward replication of the data at different sites, while the aim of providing a general purpose application environment precludes distributing the data using static policies. We will construct this environment by combining and extending newly emerging "Grid" technologies to manage large distributed datasets in addition to computational elements. A consequence of this project will be the emergence of fundamental new modes of scientific exploration, as access to fundamental scientific data is no longer constrained to the producer of...

  6. Solving Large Scale Nonlinear Eigenvalue Problem in Next-Generation Accelerator Design

    Energy Technology Data Exchange (ETDEWEB)

    Liao, Ben-Shan; Bai, Zhaojun; /UC, Davis; Lee, Lie-Quan; Ko, Kwok; /SLAC

    2006-09-28

    A number of numerical methods, including inverse iteration, method of successive linear problem and nonlinear Arnoldi algorithm, are studied in this paper to solve a large scale nonlinear eigenvalue problem arising from finite element analysis of resonant frequencies and external Q{sub e} values of a waveguide loaded cavity in the next-generation accelerator design. They present a nonlinear Rayleigh-Ritz iterative projection algorithm, NRRIT in short and demonstrate that it is the most promising approach for a model scale cavity design. The NRRIT algorithm is an extension of the nonlinear Arnoldi algorithm due to Voss. Computational challenges of solving such a nonlinear eigenvalue problem for a full scale cavity design are outlined.

  7. High performance computing and communications: Advancing the frontiers of information technology

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    1997-12-31

    This report, which supplements the President`s Fiscal Year 1997 Budget, describes the interagency High Performance Computing and Communications (HPCC) Program. The HPCC Program will celebrate its fifth anniversary in October 1996 with an impressive array of accomplishments to its credit. Over its five-year history, the HPCC Program has focused on developing high performance computing and communications technologies that can be applied to computation-intensive applications. Major highlights for FY 1996: (1) High performance computing systems enable practical solutions to complex problems with accuracies not possible five years ago; (2) HPCC-funded research in very large scale networking techniques has been instrumental in the evolution of the Internet, which continues exponential growth in size, speed, and availability of information; (3) The combination of hardware capability measured in gigaflop/s, networking technology measured in gigabit/s, and new computational science techniques for modeling phenomena has demonstrated that very large scale accurate scientific calculations can be executed across heterogeneous parallel processing systems located thousands of miles apart; (4) Federal investments in HPCC software R and D support researchers who pioneered the development of parallel languages and compilers, high performance mathematical, engineering, and scientific libraries, and software tools--technologies that allow scientists to use powerful parallel systems to focus on Federal agency mission applications; and (5) HPCC support for virtual environments has enabled the development of immersive technologies, where researchers can explore and manipulate multi-dimensional scientific and engineering problems. Educational programs fostered by the HPCC Program have brought into classrooms new science and engineering curricula designed to teach computational science. This document contains a small sample of the significant HPCC Program accomplishments in FY 1996.

  8. PARA'04, State-of-the-art in scientific computing

    DEFF Research Database (Denmark)

    Madsen, Kaj; Wasniewski, Jerzy

    This meeting in the series, the PARA'04 Workshop with the title ``State of the Art in Scientific Computing'', was held in Lyngby, Denmark, June 20-23, 2004. The PARA'04 Workshop was organized by Jack Dongarra from the University of Tennessee and Oak Ridge National Laboratory, and Kaj Madsen and J...

  9. Elastic Scheduling of Scientific Workflows under Deadline Constraints in Cloud Computing Environments

    Directory of Open Access Journals (Sweden)

    Nazia Anwar

    2018-01-01

    Full Text Available Scientific workflow applications are collections of several structured activities and fine-grained computational tasks. Scientific workflow scheduling in cloud computing is a challenging research topic due to its distinctive features. In cloud environments, it has become critical to perform efficient task scheduling resulting in reduced scheduling overhead, minimized cost and maximized resource utilization while still meeting the user-specified overall deadline. This paper proposes a strategy, Dynamic Scheduling of Bag of Tasks based workflows (DSB, for scheduling scientific workflows with the aim to minimize financial cost of leasing Virtual Machines (VMs under a user-defined deadline constraint. The proposed model groups the workflow into Bag of Tasks (BoTs based on data dependency and priority constraints and thereafter optimizes the allocation and scheduling of BoTs on elastic, heterogeneous and dynamically provisioned cloud resources called VMs in order to attain the proposed method’s objectives. The proposed approach considers pay-as-you-go Infrastructure as a Service (IaaS clouds having inherent features such as elasticity, abundance, heterogeneity and VM provisioning delays. A trace-based simulation using benchmark scientific workflows representing real world applications, demonstrates a significant reduction in workflow computation cost while the workflow deadline is met. The results validate that the proposed model produces better success rates to meet deadlines and cost efficiencies in comparison to adapted state-of-the-art algorithms for similar problems.

  10. Efficient Computation of Sparse Matrix Functions for Large-Scale Electronic Structure Calculations: The CheSS Library.

    Science.gov (United States)

    Mohr, Stephan; Dawson, William; Wagner, Michael; Caliste, Damien; Nakajima, Takahito; Genovese, Luigi

    2017-10-10

    We present CheSS, the "Chebyshev Sparse Solvers" library, which has been designed to solve typical problems arising in large-scale electronic structure calculations using localized basis sets. The library is based on a flexible and efficient expansion in terms of Chebyshev polynomials and presently features the calculation of the density matrix, the calculation of matrix powers for arbitrary powers, and the extraction of eigenvalues in a selected interval. CheSS is able to exploit the sparsity of the matrices and scales linearly with respect to the number of nonzero entries, making it well-suited for large-scale calculations. The approach is particularly adapted for setups leading to small spectral widths of the involved matrices and outperforms alternative methods in this regime. By coupling CheSS to the DFT code BigDFT, we show that such a favorable setup is indeed possible in practice. In addition, the approach based on Chebyshev polynomials can be massively parallelized, and CheSS exhibits excellent scaling up to thousands of cores even for relatively small matrix sizes.

  11. A multiple-scaling method of the computation of threaded structures

    International Nuclear Information System (INIS)

    Andrieux, S.; Leger, A.

    1989-01-01

    The numerical computation of threaded structures usually leads to very large finite elements problems. It was therefore very difficult to carry out some parametric studies, especially in non-linear cases involving plasticity or unilateral contact conditions. Nevertheless, these parametric studies are essential in many industrial problems, for instance for the evaluation of various repairing processes of the closure studs of PWR. It is well known that such repairing generally involves several modifications of the thread geometry, of the number of active threads, of the flange clamping conditions, and so on. This paper is devoted to the description of a two-scale method, which easily allows parametric studies. The main idea of this method consists of dividing the problem into a global part, and a local part. The local problem is solved by F.E.M. on the precise geometry of the thread of some elementary loadings. The global one is formulated on the gudgeon scale and is reduced to a monodimensional one. The resolution of this global problem leads to the unsignificant computational cost. Then, a post-processing gives the stress field at the thread scale anywhere in the assembly. After recalling some principles of the two-scales approach, the method is described. The validation by comparison with a direct F.E. computation and some further applications are presented

  12. Eighth SIAM conference on parallel processing for scientific computing: Final program and abstracts

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    1997-12-31

    This SIAM conference is the premier forum for developments in parallel numerical algorithms, a field that has seen very lively and fruitful developments over the past decade, and whose health is still robust. Themes for this conference were: combinatorial optimization; data-parallel languages; large-scale parallel applications; message-passing; molecular modeling; parallel I/O; parallel libraries; parallel software tools; parallel compilers; particle simulations; problem-solving environments; and sparse matrix computations.

  13. Large-scale perspective as a challenge

    NARCIS (Netherlands)

    Plomp, M.G.A.

    2012-01-01

    1. Scale forms a challenge for chain researchers: when exactly is something ‘large-scale’? What are the underlying factors (e.g. number of parties, data, objects in the chain, complexity) that determine this? It appears to be a continuum between small- and large-scale, where positioning on that

  14. Algorithm 896: LSA: Algorithms for Large-Scale Optimization

    Czech Academy of Sciences Publication Activity Database

    Lukšan, Ladislav; Matonoha, Ctirad; Vlček, Jan

    2009-01-01

    Roč. 36, č. 3 (2009), 16-1-16-29 ISSN 0098-3500 R&D Pro jects: GA AV ČR IAA1030405; GA ČR GP201/06/P397 Institutional research plan: CEZ:AV0Z10300504 Keywords : algorithms * design * large-scale optimization * large-scale nonsmooth optimization * large-scale nonlinear least squares * large-scale nonlinear minimax * large-scale systems of nonlinear equations * sparse pro blems * partially separable pro blems * limited-memory methods * discrete Newton methods * quasi-Newton methods * primal interior-point methods Subject RIV: BB - Applied Statistics, Operational Research Impact factor: 1.904, year: 2009

  15. Scale interactions in a mixing layer – the role of the large-scale gradients

    KAUST Repository

    Fiscaletti, D.

    2016-02-15

    © 2016 Cambridge University Press. The interaction between the large and the small scales of turbulence is investigated in a mixing layer, at a Reynolds number based on the Taylor microscale of , via direct numerical simulations. The analysis is performed in physical space, and the local vorticity root-mean-square (r.m.s.) is taken as a measure of the small-scale activity. It is found that positive large-scale velocity fluctuations correspond to large vorticity r.m.s. on the low-speed side of the mixing layer, whereas, they correspond to low vorticity r.m.s. on the high-speed side. The relationship between large and small scales thus depends on position if the vorticity r.m.s. is correlated with the large-scale velocity fluctuations. On the contrary, the correlation coefficient is nearly constant throughout the mixing layer and close to unity if the vorticity r.m.s. is correlated with the large-scale velocity gradients. Therefore, the small-scale activity appears closely related to large-scale gradients, while the correlation between the small-scale activity and the large-scale velocity fluctuations is shown to reflect a property of the large scales. Furthermore, the vorticity from unfiltered (small scales) and from low pass filtered (large scales) velocity fields tend to be aligned when examined within vortical tubes. These results provide evidence for the so-called \\'scale invariance\\' (Meneveau & Katz, Annu. Rev. Fluid Mech., vol. 32, 2000, pp. 1-32), and suggest that some of the large-scale characteristics are not lost at the small scales, at least at the Reynolds number achieved in the present simulation.

  16. Distributed weighted least-squares estimation with fast convergence for large-scale systems☆

    Science.gov (United States)

    Marelli, Damián Edgardo; Fu, Minyue

    2015-01-01

    In this paper we study a distributed weighted least-squares estimation problem for a large-scale system consisting of a network of interconnected sub-systems. Each sub-system is concerned with a subset of the unknown parameters and has a measurement linear in the unknown parameters with additive noise. The distributed estimation task is for each sub-system to compute the globally optimal estimate of its own parameters using its own measurement and information shared with the network through neighborhood communication. We first provide a fully distributed iterative algorithm to asymptotically compute the global optimal estimate. The convergence rate of the algorithm will be maximized using a scaling parameter and a preconditioning method. This algorithm works for a general network. For a network without loops, we also provide a different iterative algorithm to compute the global optimal estimate which converges in a finite number of steps. We include numerical experiments to illustrate the performances of the proposed methods. PMID:25641976

  17. Distributed weighted least-squares estimation with fast convergence for large-scale systems.

    Science.gov (United States)

    Marelli, Damián Edgardo; Fu, Minyue

    2015-01-01

    In this paper we study a distributed weighted least-squares estimation problem for a large-scale system consisting of a network of interconnected sub-systems. Each sub-system is concerned with a subset of the unknown parameters and has a measurement linear in the unknown parameters with additive noise. The distributed estimation task is for each sub-system to compute the globally optimal estimate of its own parameters using its own measurement and information shared with the network through neighborhood communication. We first provide a fully distributed iterative algorithm to asymptotically compute the global optimal estimate. The convergence rate of the algorithm will be maximized using a scaling parameter and a preconditioning method. This algorithm works for a general network. For a network without loops, we also provide a different iterative algorithm to compute the global optimal estimate which converges in a finite number of steps. We include numerical experiments to illustrate the performances of the proposed methods.

  18. Quantum Monte Carlo for large chemical systems: implementing efficient strategies for peta scale platforms and beyond

    International Nuclear Information System (INIS)

    Scemama, Anthony; Caffarel, Michel; Oseret, Emmanuel; Jalby, William

    2013-01-01

    Various strategies to implement efficiently quantum Monte Carlo (QMC) simulations for large chemical systems are presented. These include: (i) the introduction of an efficient algorithm to calculate the computationally expensive Slater matrices. This novel scheme is based on the use of the highly localized character of atomic Gaussian basis functions (not the molecular orbitals as usually done), (ii) the possibility of keeping the memory footprint minimal, (iii) the important enhancement of single-core performance when efficient optimization tools are used, and (iv) the definition of a universal, dynamic, fault-tolerant, and load-balanced framework adapted to all kinds of computational platforms (massively parallel machines, clusters, or distributed grids). These strategies have been implemented in the QMC-Chem code developed at Toulouse and illustrated with numerical applications on small peptides of increasing sizes (158, 434, 1056, and 1731 electrons). Using 10-80 k computing cores of the Curie machine (GENCI-TGCC-CEA, France), QMC-Chem has been shown to be capable of running at the peta scale level, thus demonstrating that for this machine a large part of the peak performance can be achieved. Implementation of large-scale QMC simulations for future exa scale platforms with a comparable level of efficiency is expected to be feasible. (authors)

  19. Large-scale matrix-handling subroutines 'ATLAS'

    International Nuclear Information System (INIS)

    Tsunematsu, Toshihide; Takeda, Tatsuoki; Fujita, Keiichi; Matsuura, Toshihiko; Tahara, Nobuo

    1978-03-01

    Subroutine package ''ATLAS'' has been developed for handling large-scale matrices. The package is composed of four kinds of subroutines, i.e., basic arithmetic routines, routines for solving linear simultaneous equations and for solving general eigenvalue problems and utility routines. The subroutines are useful in large scale plasma-fluid simulations. (auth.)

  20. Implementation of highly parallel and large scale GW calculations within the OpenAtom software

    Science.gov (United States)

    Ismail-Beigi, Sohrab

    The need to describe electronic excitations with better accuracy than provided by band structures produced by Density Functional Theory (DFT) has been a long-term enterprise for the computational condensed matter and materials theory communities. In some cases, appropriate theoretical frameworks have existed for some time but have been difficult to apply widely due to computational cost. For example, the GW approximation incorporates a great deal of important non-local and dynamical electronic interaction effects but has been too computationally expensive for routine use in large materials simulations. OpenAtom is an open source massively parallel ab initiodensity functional software package based on plane waves and pseudopotentials (http://charm.cs.uiuc.edu/OpenAtom/) that takes advantage of the Charm + + parallel framework. At present, it is developed via a three-way collaboration, funded by an NSF SI2-SSI grant (ACI-1339804), between Yale (Ismail-Beigi), IBM T. J. Watson (Glenn Martyna) and the University of Illinois at Urbana Champaign (Laxmikant Kale). We will describe the project and our current approach towards implementing large scale GW calculations with OpenAtom. Potential applications of large scale parallel GW software for problems involving electronic excitations in semiconductor and/or metal oxide systems will be also be pointed out.